Methods - C# to IL

-8-

Methods

The code of a data type is implemented by a method, which is executed by the Execution Engine. The CLR offers a large number of services to support the execution of code.

Any code that uses these services is called managed code. Managed code allows the CLR to provide a set of features such as handling exceptions. It also makes sure that the code is verifiable. Only managed code has access to managed data.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object {

.method public hidebysig static void vijay() il managed {

.entrypoint

call instance void a1()

ret

}

.method public instance void a1() il managed

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

Output

There is no rule in the IL book that prevents a method from being global. It can certainly be written outside a class.

a.il

.assembly mukhi {}

.method public instance void a1() il managed

{

.entrypoint

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

Output

In fact we can write the smallest IL program without using the class directive. It is mandatory to have a function with the entrypoint directive. Thus, had the designers of C# so desired, they could have provided the facility of global functions, but they chose not to. They decided, in their infinite wisdom, that all functions should be placed within a class. There is no such restriction imposed by IL.

The CLR recognizes three types of methods: static, instance and virtual. There are also some special functions that the runtime calls automatically: static constructors, or type initializers, which are named .cctor, and instance constructors, which are named .ctor.

A method in IL is uniquely identified by its signature. A signature consists of five parts:

• The name of the method

• The type or class that the method resides in

• The calling convention used

• The return type

• The parameter types.

a.il

.assembly mukhi {}

.method public instance void a1() il managed

{

.entrypoint

call instance int32 a2()

pop

call instance void a2()

ret

}

.method public instance void a2() il managed

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

.method public instance int32 a2() il managed

{

ldstr "hi1"

call void System.Console::WriteLine(class System.String)

ldc.i4.2

ret

}

Output

hi1

For people like us, who are familiar with the world of C, C++ and Java, the concept of a method signature depending upon the return type of a function is alien.

Here, we have two functions, both named a2, which differ only in the type of the return value. This is perfectly valid in IL. The reason is that when calling a method in IL, we have to state the return type as part of the call. But what is allowed in IL may be taboo in C#.
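To see why two methods named a2 can coexist, it may help to picture the method table as a dictionary keyed by the full signature. The sketch below is written in Python rather than IL and is only an illustration: the names MethodKey and method_table are made up for this example, and the calling convention is reduced to a plain string.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class MethodKey(NamedTuple):
    """Hypothetical stand-in for the five parts of an IL method signature."""
    owner: str                    # type the method resides in ("" for global)
    name: str
    calling_convention: str
    return_type: str
    param_types: Tuple[str, ...]


method_table: Dict[MethodKey, Callable[[], object]] = {}


def a2_void():
    return None


def a2_int32():
    return 2


# Same name and parameters, different return type: two distinct keys.
key_void = MethodKey("", "a2", "instance", "void", ())
key_int32 = MethodKey("", "a2", "instance", "int32", ())
method_table[key_void] = a2_void
method_table[key_int32] = a2_int32

# A call site names the complete signature, so the lookup is unambiguous.
result_void = method_table[key_void]()
result_int32 = method_table[key_int32]()
print(len(method_table), result_void, result_int32)
```

Because the return type is part of the key, neither entry overwrites the other. A language that builds its key without the return type would see a clash here.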

Method overloading is a concept where the same function name appears in a class more than once. You may not have noticed that, in the above programs, the this pointer is not passed to the global functions. Even then, things worked well.

The reason for this is that generally, global functions are static by default. In fact, static functions are found in classes, value types and interfaces. Static functions always have a body associated with them.

The second type of method, which is very commonly used, is the instance method. These are functions associated with an instance of a class. In this version of the CLR, we cannot declare them in interfaces. Unlike static methods, which are stand-alone methods and behave like global functions, an instance function is always passed a pointer or reference to the data associated with the object. Thus, it can use the this pointer to access a different set of data each time.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object {

.method public hidebysig static void vijay() il managed {

.entrypoint

call void zzz::a1()

ret

}

.method public instance void a1() il managed

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

Output

Exception occurred: System.MissingMethodException: Void .zzz.a1()

at zzz.vijay()

A runtime exception is thrown because the call expects the method to be static, whereas our method is an instance method. To avoid this runtime error, replace the modifier instance with static.

The this pointer is of the same type as the class in which the method resides. We therefore, have to create an instance of a class before we can execute any instance method from the class.

As a rule, all instance functions must have the this pointer as the first parameter. Therefore, it is automatically added as a first hidden parameter. The this pointer can be a null reference too.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj instance void zzz::.ctor()

ldnull

call instance void zzz::a1()

ret

}

.method public instance void a1() il managed

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ldarg.0

ldc.i4.2

stfld int32 zzz::i

ret

}

Output

Exception occurred: System.ExecutionEngineException: An exception of type System.ExecutionEngineException was thrown.

at zzz.vijay()

Whenever we refer to a field in a type, through a function, the this pointer should first be available on the stack. This facilitates access to the instance fields. This explains the above error.

Here, ldnull has placed a null reference on the stack in place of the this pointer, and thus we are unable to access the instance members. On commenting out the ldnull, no error is generated.

The instruction newobj places a reference to the new object on the stack, and this reference becomes the this pointer of the instance method called next. Inside the method, ldarg.0 loads the this pointer, which is checked for NULL before it is used. For a value type, however, the this pointer is a managed pointer to the value type. Unlike static or virtual, the keyword instance is not an attribute of a method. It is part of the calling convention of a method.

There are three ways to call a method in IL. These are: call, callvirt and calli. Two of these, call and callvirt, have already been dealt with, in the past.

There are three other instructions that can be used to call a method in a special way. These are jmp, jmpi and newobj. Every method that we call has its own evaluation stack. The parameters to the function are placed on this stack, and instructions also obtain their arguments from the same stack.

On the execution of an instruction, the result is also placed on the same stack. The runtime creates and maintains this stack. When the method exits, the stack is released.

There is another stack that we do not concern ourselves with. This stack keeps track of the method being called, and hence, is known as the call stack.

The final instruction in any function is the ret instruction. This instruction is responsible for returning control to the calling method. If a function returns a value, it must be placed on the stack before ret is executed. When a method exits, the stack must not contain any value other than the value to be returned.
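The rules about the evaluation stack and ret can be tried out on a toy interpreter. The following Python sketch is not how the runtime is implemented; it handles only a tiny, made-up subset of instructions, and it does not distinguish a void method from one that returns a value. It only shows that each call starts with a fresh stack and that ret refuses to run if anything other than the return value is left over.

```python
def run(instructions):
    """Execute a tiny, made-up subset of IL on a fresh evaluation stack."""
    stack = []                      # each call gets its own evaluation stack
    for op, *args in instructions:
        if op == "ldc.i4":          # push a constant
            stack.append(args[0])
        elif op == "add":           # pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":           # discard the top of the stack
            stack.pop()
        elif op == "ret":
            # Only the return value (if any) may remain on the stack.
            if len(stack) > 1:
                raise ValueError("stack not empty at ret: %r" % stack)
            return stack.pop() if stack else None
        else:
            raise ValueError("unknown instruction: " + op)
    raise ValueError("method fell off the end without ret")


total = run([("ldc.i4", 2), ("ldc.i4", 3), ("add",), ("ret",)])
nothing = run([("ldc.i4", 7), ("pop",), ("ret",)])
print(total, nothing)
```

The first program leaves exactly one value on the stack, which becomes the return value. The second pops its only value and returns nothing, much as the earlier listing used pop to discard the int32 returned by a2.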

We use the call instruction to call static or instance functions; it can also call a virtual function, but without virtual dispatch. Before the call instruction, all the parameters to the method must be placed on the stack. The first argument to the function is placed first. The only difference between calling a static and an instance method is that the modifier instance is used for an instance method, whereas no modifier is required for a static method.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj instance void zzz::.ctor()

pop

ldnull

callvirt instance void zzz::a1()

ret

}

.method public virtual instance void a1() il managed

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

Output

Exception occurred: System.NullReferenceException: Attempted to dereference a null object reference.

at zzz.vijay()

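Virtual functions have to be handled with care as they are runtime entities. With virtual functions, the instruction callvirt is used in place of call. Unlike call, callvirt executes the overriding version of the method. To find that version it must look at the object itself, which is why the null reference in the above program raises an exception.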

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

ldloc.0

call instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig newslot virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

Output

xxx abc

yyy abc

We have pulled out this program from an earlier chapter, where we explained new, override and virtual functions. The callvirt instruction calls the function abc from xxx, as it overrides the one from the class yyy.

The reason is that, in the class xxx, there is no modifier newslot for the function abc; hence it is not a different abc from the one in the base class, but an override of it. With call however, the instruction simply calls abc from the class specified, as it does not understand modifiers like virtual, newslot etc. The keyword instance is used with callvirt as the this pointer, under no circumstances, can be NULL.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig newslot virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ldloc.0

call instance void yyy::abc()

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

Output

xxx abc

yyy abc

In the above example, the super class function abc from the class yyy is called, from the function abc from class xxx. This facilitates reusing code defined in the super class.

A virtual function may want to call the code in the base class. In IL parlance, this is termed a super call. Had we used callvirt in place of call in the above code, we foresee a problem, as it would either call itself over and over again, or give us the following exception:

Output

xxx abc

Exception occurred: System.NullReferenceException: Attempted to dereference a null object reference.

at xxx.abc()

at zzz.vijay()

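The reason is that callvirt dispatches on the run time type of the this pointer, which is the class xxx and not the class yyy, so it can never reach the abc of yyy. It also checks the this pointer for null, which call does not. Thus, for a super call, the instruction call is used and not callvirt.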

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj void zzz::.ctor()

call instance void zzz::abc()

ret

}

.method public instance void abc()

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

jmp instance void zzz::pqr()

ldstr "bye"

call void System.Console::WriteLine(class System.String)

ret

}

.method public instance void pqr()

{

ldstr "pqr"

call void System.Console::WriteLine(class System.String)

ret

}

Output

pqr

We have created an object of type zzz using newobj. It places a reference to a zzz on the stack. This reference serves as the this pointer for the call to the instance function abc.

Here we have displayed "hi" and then an instance method pqr is called using the jmp instruction.

After the method pqr finishes execution, control does not return to the method abc. Instead, control returns to vijay, which is the method that called abc. Thus the string "bye", present in the method abc after the jmp, does not get displayed.

The jmp instruction does not revert the control back to the method from where the program initially branched out.
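The difference between a call and a jmp can be modelled with a small dispatcher. This Python sketch is only an analogy under stated assumptions: the helper invoke and the tuples ("ret", value) and ("jmp", target) are invented for the example and do not correspond to anything in the runtime. A method that asks for a jmp is simply replaced by its target, with the same arguments, so there is nothing left to come back to.

```python
output = []


def invoke(method, *args):
    """Run a method; a 'jmp' result replaces the current method in place."""
    while True:
        action = method(*args)
        if action[0] == "jmp":
            method = action[1]      # same arguments, no way back to the jumper
            continue
        return action[1]            # 'ret': hand the value to the caller


def pqr():
    output.append("pqr")
    return ("ret", None)


def abc():
    output.append("hi")
    return ("jmp", pqr)
    output.append("bye")            # never reached, like the IL after jmp


def vijay():
    invoke(abc)                     # an ordinary call: control comes back here
    output.append("back in vijay")
    return ("ret", None)


invoke(vijay)
print(output)
```

When pqr returns, the dispatcher hands control to whoever invoked abc, which is vijay. The line that appends "bye" plays the part of the unreachable instructions that follow jmp in the IL listing.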

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object {

.method public hidebysig static void vijay() il managed {

.entrypoint

newobj void zzz::.ctor()

call instance void zzz::abc()

ret

}

.method public instance void abc()

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ldftn instance void zzz::pqr()

jmpi

ldstr "bye"

call void System.Console::WriteLine(class System.String)

ret

}

.method public instance void pqr()

{

ldstr "pqr"

call void System.Console::WriteLine(class System.String)

ret

}

Output

pqr

The above program is similar to its predecessor, but it uses the instruction jmpi instead of jmp. This instruction is similar to jmp, but differs in the following aspects:

• In the case of the jmp instruction, we supplied the method signature as an operand of the instruction itself.

• In the case of the jmpi instruction, we first use the instruction ldftn to load the address of the function pqr on the stack, and then call jmpi.

The jmp family of instructions executes a jump or a branch across a method. We can only jump to the beginning of a method, and not to anywhere inside it. The signature of the method that we intend to jump to must be the same as that of the method we are jumping from.

Output

Exception occurred: System.ExecutionEngineException: An exception of type System.ExecutionEngineException was thrown.

at zzz.abc()

at zzz.vijay()

If the signature of the method being jumped to is not the same, the above exception is thrown. The jmp instruction is not verifiable.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj void zzz::.ctor()

ldc.i4.1

ldc.i4.2

call instance void zzz::abc(int32,int32)

ret

}

.method public instance void abc(int32 i, int32 j)

{

ldc.i4.3

starg j

ldarg j

call void System.Console::WriteLine(int32)

jmp instance void zzz::pqr(int32,int32)

ret

}

.method public instance void pqr(int32 p,int32 q)

{

ldarg.1

call void System.Console::WriteLine(int32)

ldarg.2

call void System.Console::WriteLine(int32)

ret

}

Output

The method abc takes two ints as parameters. We have placed the constant 3 on the stack, and then used the instruction starg to change the parameter j. Then, ldarg is used to place the new value on the stack. Thereafter, we have called the WriteLine function to confirm that the new value is 3. The jmp instruction is the next to be executed.

Here we have not placed any parameters on the stack. The jmp instruction first places the numbers 1 and 2 on the stack, and then, calls the function pqr, that simply displays the parameters that have been passed.

Even though we have changed the parameter j, the change is not reflected in the called function pqr. This is contrary to what the documentation states. A call instruction does not pass the current method's parameters on to the next method; the jmp instruction does so.

If function pqr returns a value, it will be passed to the function vijay and not to abc. We cannot place any values on the stack before executing the jump. Jumps can be executed only between methods that have the same signatures.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj void zzz::.ctor()

ldftn instance void zzz::abc()

calli instance void ()

ret

}

.method public instance void abc()

{

ldstr "hi"

call void System.Console::WriteLine(class System.String)

ret

}

Output

We can call a method indirectly by first placing its address on the stack and then using the calli instruction. The instruction ldftn places the address of a non-virtual function on the stack. As with any call to an instance function, the this pointer has to be placed on the stack first, followed by the parameters to the function; the address loaded by ldftn goes on last. When we tried using calli with the address of a virtual function, Windows generated an error.

We use the newobj instruction to create a new instance, and also, call the constructor of a class, which is nothing more than a special instance method.

The only difference between calling a constructor through newobj and making an ordinary instance call is that we do not place the this pointer on the stack for the constructor. newobj first creates the object, and then automatically passes the this pointer to the constructor.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals ( class zzz v)

newobj void zzz::.ctor()

stloc.0

ldloc.0

ldfld int32 zzz::i

call void System.Console::WriteLine(int32)

ldloc.0

ldc.i4.2

stfld int32 zzz::i

ldloc.0

ldfld int32 zzz::i

call void System.Console::WriteLine(int32)

ldloc.0

call instance void zzz::.ctor()

ldloc.0

ldfld int32 zzz::i

call void System.Console::WriteLine(int32)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

ldc.i4.1

stfld int32 zzz::i

ret

}

Output

The newobj instruction places the this pointer on the stack before calling the constructor. If we desire to call the constructor ourselves, we too need to place the this pointer on the stack.

In the above program, we have changed the value of the field i to 1, then again changed it to 2 using stfld and then displayed this value. Thereafter, we have called the constructor, which changes the value back to 1 again. This proves that a constructor is no different from any other function.

A method definition is called a method head in IL. The head also functions as an interface to other methods. The format of the head is as follows:

• It starts with a number of predefined method attributes.

• These are followed by an optional indication, specifying whether the method is an instance method or not.

• Thereafter, the calling convention is specified.

• This is followed by the return type and a few more optional parameters.

• Finally, we state the name and the parameters to the method and the implementation attributes.

Methods are instance by default. To change the default behavior, we use the modifiers static or virtual. As of today, the return type cannot have any attributes, but who knows what changes may take place tomorrow.

The code for the method is written in the method body. It can incorporate a large number of directives.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.emitbyte 0x19

call void System.Console::WriteLine(int32)

ret

}

Output

The code that we write gets converted into numbers. Every IL instruction is represented by a number. The ldc.i4.3 instruction is known by the number 19 hex, so the program above places the constant 3 on the stack and displays it. This information is available in the Instruction Set Reference. The directive .emitbyte emits an unsigned 8-bit number directly into the code section of the method.

Thus, we can use the opcodes of IL instructions directly in IL programs.
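The equivalence between a mnemonic and its opcode byte can be checked with a toy assembler. The Python sketch below is an illustration only: the function assemble and the table OPCODES are invented for this example, and the table covers just a handful of one-byte instructions whose values are listed in the Instruction Set Reference.

```python
# One-byte opcodes as listed in the Instruction Set Reference.
OPCODES = {
    "nop": 0x00,
    "ldc.i4.0": 0x16,
    "ldc.i4.1": 0x17,
    "ldc.i4.2": 0x18,
    "ldc.i4.3": 0x19,
    "pop": 0x26,
    "ret": 0x2A,
}


def assemble(lines):
    """Turn a list of operand-free instructions into raw code bytes."""
    out = bytearray()
    for line in lines:
        parts = line.split()
        if parts[0] == ".emitbyte":
            out.append(int(parts[1], 16))   # raw byte, copied as is
        else:
            out.append(OPCODES[parts[0]])   # mnemonic looked up in the table
    return bytes(out)


with_mnemonic = assemble(["ldc.i4.3", "pop", "ret"])
with_emitbyte = assemble([".emitbyte 0x19", "pop", "ret"])
print(with_mnemonic.hex(), with_mnemonic == with_emitbyte)
```

Both inputs produce the same three bytes, which is all that .emitbyte promises: the byte lands in the code stream exactly where the mnemonic's opcode would have.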

The return value of the entrypoint function can either be void, int32 or unsigned int32. This value is handed over to the Operating System. A value of ZERO normally indicates success and any other value indicates an error. The entrypoint method is unique, meaning, it can have private accessibility, and yet be accessed by the runtime.

The .locals directive is used to create a local variable that can only be accessed from within that method. Thus, it is used to store data that exists only for the duration of a method call. After a method exits, all the memory allocated for a local is reclaimed by the runtime.

It is faster for the system to allocate memory on the stack, where locals get stored, than to allocate memory on the heap for the fields. We cannot specify attributes for local variables, like we do for parameters.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.1

stloc.0

ldloc.0

call void System.Console::WriteLine(int32)

.locals ( int32 i)

ret

}

Output

The .locals directive can be placed at the end of the code and does not have to be placed at the beginning. Thus, in a sense, a forward reference is allowed here.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

//.zeroinit

.locals ( int32 i)

ldloc.0

call void System.Console::WriteLine(int32)

ret

}

Output

51380288

Remove the comment marks from .zeroinit and a value of zero will be displayed.

There is some overlap in IL. If we use the modifier init in the locals directive, then all the variables will be assigned their default values, depending upon their type. We have touched upon this point earlier.

The same effect is seen when we use the directive .zeroinit. This applies to all the locals in the method.

• If .zeroinit stays commented out, the variable i will hold whatever value happens to be present at its location on the stack.

• If we remove the comment marks, the runtime initialises all the value types to ZERO and all the reference types to NULL.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.zeroinit

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

Error

a.il(4) : error : syntax error at token '.zeroinit' in: .zeroinit

***** FAILURE *****

Some of the directives can only be used within certain entities. The directive .zeroinit can only be used within a method and not outside. The assembler checks whether the directive has been used at the right place or not. If not, it generates an error message that is hardly informative.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void System.Object::.ctor()

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

Output

xxx abc

You may accuse us of being repetitive, but there is no harm in refreshing our memory.

Class yyy is a base class and xxx the derived class. We have created a local of type yyy, which is the base class, but initialized it to the class xxx, which is the derived class. A better way to say it is, we are creating an object that looks like xxx, but storing it in a yyy local.

callvirt calls the function abc from the class xxx despite it being called through the yyy class. This is because the instruction callvirt resolves the method at run time. In that environment, the this pointer on the stack is of class xxx, and thus abc from the class xxx is called. The virtual function has its own unique way of deciding on the pointer to be placed on the stack.

If we remove the modifier virtual from the function abc in class xxx, then the function abc will be called from the yyy class. Changing the newobj to create a yyy does not make a difference to this output, as the run time and compile time data types are then the same. Where the two differ and the function is virtual, the run time data type takes precedence over the compile time data type.

We add the modifier newslot to the function abc in class xxx as follows:

.method public hidebysig newslot virtual instance void abc() il managed

Here, from the point of view of the run time, the function abc is treated as a new function. As there is no connection with the abc of class yyy, they are now treated as two distinct functions. The abc of class yyy is called. Placing the modifier newslot on the function abc in class yyy merely makes it distinct from any abc in its base class, Object, if one is present there. Thus, it makes no difference here.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0 , class xxx V_1)

newobj instance void www::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

newobj instance void www::.ctor()

stloc.1

ldloc.1

callvirt instance void xxx::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void System.Object::.ctor()

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

.class private auto ansi www extends xxx

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "www abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void xxx::.ctor()

ret

}

Output

www abc

www abc

The above program is pretty large. The only difference between this program and its predecessor is that, we have added one more class www derived from xxx. We have created two locals, one each of the types xxx and yyy, but the run time data type of both the locals is a www object.

The functions abc are virtual throughout. When we call the functions abc through callvirt, even though we are using the class prefixes xxx and yyy, the function gets called from www.

This is so because the run time data type, i.e. www, of the this pointer has been passed.

Then, we make our first small change: We add a newslot to the function abc in class www.

The output now reads as follows:

xxx abc

xxx abc

This output results because newslot dissociates the function abc of the class www from the earlier abc functions. Thus, since the abc of class xxx is the newest member of the original family, it gets called both times.

Next, we add the modifier newslot to the function abc in class xxx and remove it from the class www. The output now reads as follows:

yyy abc

www abc

Isn't the output fascinating? Now you probably can understand, as to why we are revisiting virtual functions.

By adding the modifier newslot to the function abc in class xxx, we are creating two families of abc:

• One that consists only of a single abc in class yyy

• Another made up of abc functions from classes xxx and www.

Thus, in every instance, the last member of the family gets called and, since the first family has only one member, this single member, i.e. the abc of class yyy, gets called.

In the second case, the abc of class www gets called. Now let us add the newslot modifier to the function abc in class www, without removing the one from class xxx.

The output now reads as follows:

yyy abc

xxx abc

Now, we have three families of abc functions. Each of them has only one function abc that has nothing to do with the abc functions of the other families.

If we add the modifier newslot to the function abc in class yyy, we will not see any change in the output. This is because, we are cutting off abc from its root, from class yyy onwards. There is no function abc in any of the classes that yyy derives from. Hence, there is no change in the output.

If we remove virtual from the function abc in class www, it has the same effect as adding the modifier newslot. The virtual modifier signifies that the address of the function to be called should be read from the vtable. If we remove the virtual modifier from the function abc in class xxx instead, the output will be as follows:

www abc

xxx abc

This output has resulted because of the following:

The object created is a www type.

• In the first case, the vtable has the address of a www abc. The vtable stores a single address of every virtual function. The runtime checks for the compile time data type of the pointer and on examining, it looks like yyy. Within yyy, it discovers that function abc is virtual. Thus it looks into the vtable for the address which turns out to be that of www.

• In the second case, at the compile time the type revealed is xxx. But within the class xxx, the function is not virtual and thus, the vtable does not come into play.

Now we remove virtual from the function abc of class yyy only. Remember, we are making only one change at a time. The output now will be as follows:

yyy abc

www abc

The same explanation as given earlier applies here too. We hope you will remember us and our brilliant explanation of the concept of virtual. At least, this is how we interpret it, and do not mind being the only ones to do so in this manner.
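The experiments above can be replayed on a small model of the vtable. This Python sketch follows our reading of the rules as described in this chapter and is not the runtime's actual algorithm; the functions build and callvirt and the scenario labels are invented for the example. Each class is described by two flags for its abc, virtual and newslot, and the object created is always a www.

```python
def build(hierarchy):
    """hierarchy: list of (class_name, virtual, newslot), base class first.

    Returns {class_name: (declared, vtable)}. `declared` is the slot index
    used by that class's own abc (None if abc is not virtual there) and
    `vtable` lists, per slot, the class whose abc currently fills it.
    """
    tables = {}
    vtable = []                       # inherited slots, grows down the chain
    for name, virtual, newslot in hierarchy:
        vtable = list(vtable)         # each class gets its own copy
        declared = None
        if virtual:
            if newslot or not vtable:
                vtable.append(name)   # start a new family of abc
            else:
                vtable[-1] = name     # override the most recent family
            declared = len(vtable) - 1
        tables[name] = (declared, vtable)
    return tables


def callvirt(tables, static_type, runtime_type):
    """Return the class whose abc runs for `callvirt static_type::abc`."""
    declared, _ = tables[static_type]
    if declared is None:
        return static_type            # not virtual: the vtable is bypassed
    return tables[runtime_type][1][declared]


def outputs(hierarchy):
    tables = build(hierarchy)
    return [callvirt(tables, "yyy", "www"), callvirt(tables, "xxx", "www")]


scenarios = {
    "all virtual": [("yyy", True, False), ("xxx", True, False), ("www", True, False)],
    "newslot on www": [("yyy", True, False), ("xxx", True, False), ("www", True, True)],
    "newslot on xxx": [("yyy", True, False), ("xxx", True, True), ("www", True, False)],
    "newslot on xxx and www": [("yyy", True, False), ("xxx", True, True), ("www", True, True)],
    "xxx not virtual": [("yyy", True, False), ("xxx", False, False), ("www", True, False)],
    "yyy not virtual": [("yyy", False, False), ("xxx", True, False), ("www", True, False)],
}
results = {label: outputs(h) for label, h in scenarios.items()}
for label, pair in results.items():
    print(label, "->", pair)
```

Each scenario yields a pair of class names, one for the call made through yyy and one for the call made through xxx. A newslot starts a fresh slot, an override replaces the newest slot, and a non-virtual abc never consults the table at all.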

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 i)

ldc.i4.1

stloc i

ldloc i

call void System.Console::WriteLine(int32)

{

.locals (int32 i)

ldc.i4.2

stloc i

ldloc i

call void System.Console::WriteLine(int32)

}

ldloc.0

call void System.Console::WriteLine(int32)

ldloc.1

call void System.Console::WriteLine(int32)

ret

}

Output

In IL, scoping levels do not behave like those found in traditional languages like C. Here, a new variable i is created by each .locals directive that follows a { brace, even though all the variables are merged into one large locals directive for the method.

Thus, by name, we refer to the individual variables i in their respective blocks. By index, ldloc.0 stands for the first i, whereas ldloc.1 stands for the inner i, which remains accessible outside its brace.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 i)

ldloca i

call void System.Console::WriteLine(int32)

{

.locals (int32 i)

ldloca i

call void System.Console::WriteLine(int32)

}

ret

}

Output

6552336

6552340

The above program displays different values for the local variable i. The output proves that they are created consecutively in memory.

Whenever you are in doubt, display the value of the variables and clear up the cobwebs in your mind. Thus, scope blocks are also known as syntactic sugar and are only used to increase the readability and to debug code written by others.

Internally, for a variable name, IL begins at the scope we are presently in, and recursively tries to resolve the name of the variable. Thus, even though a declaration hides the name of a variable, we can access it using the index. The scope does not change the lifetime of a variable. All the variables in a method are created when we first enter the method, and die when we exit from it. The variable is always accessible by the zero based index, that is allocated on a "first come first served" basis.
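The name lookup just described can be sketched in a few lines. The Python class below is a hypothetical model, not the assembler's code: Scope, declare and resolve are names chosen for this example. All blocks share one flat list of slots, and a name is searched from the innermost block outwards.

```python
class Scope:
    """One { } block; every block shares the method's single locals table."""

    def __init__(self, slots, parent=None):
        self.slots = slots            # shared list: one flat .locals table
        self.parent = parent
        self.names = {}               # name -> index, for this block only

    def declare(self, name):
        self.slots.append(name)       # first come, first served
        self.names[name] = len(self.slots) - 1
        return self.names[name]

    def resolve(self, name):
        scope = self
        while scope is not None:      # innermost block first, then outwards
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise KeyError(name)


slots = []
outer = Scope(slots)
outer_i = outer.declare("i")          # the i declared at method level
inner = Scope(slots, parent=outer)
inner_i = inner.declare("i")          # the i declared inside the braces
print(outer.resolve("i"), inner.resolve("i"), len(slots))
```

The inner declaration hides the outer name only for lookups by name. Both variables keep their own index in the shared table for the whole method, which is why ldloc with an index can reach either of them from anywhere.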

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4 4

call vararg void zzz::abc(..., int32)

ret

}

.method public static vararg void abc()

{

.locals init (value class System.ArgIterator it,int32 x)

ldloca it

initobj value class System.ArgIterator

ldloca it

arglist

call instance void System.ArgIterator::.ctor(value class System.RuntimeArgumentHandle)

ldloca x

ldloca it

call instance typedref System.ArgIterator::GetNextArg()

call class System.Object System.TypedReference::ToObject(typedref)

castclass System.Int32

unbox int32

cpobj int32

ldloc x

call void System.Console::WriteLine(int32)

ret

}

Output

The above program demonstrates how a function accepts a variable number of parameters.

Vararg is a calling convention that allows a variable number of parameters to be passed to a function. We have created a local variable called it, of type System.ArgIterator. We load its address on the stack using ldloca and then execute arglist. This instruction returns an opaque handle, i.e. an unmanaged pointer, that represents all the arguments passed to the method. This handle can be passed to other methods but is valid only during the lifetime of the current method. The opaque handle is of the type RuntimeArgumentHandle.

The arglist instruction is valid only in methods that take a variable number of arguments. The constructor of the value class ArgIterator is called with this handle as a parameter.

Once the value class is instantiated, we place the address of a local variable x on the stack. This is where the parameter passed to our function will be stored. Subsequently, the address of the variable it is put on the stack too. The function GetNextArg from class ArgIterator is called, which places a typedref on the stack, and this is then passed to the function ToObject.

Then the object is cast to Int32 and unboxed, as we need a value type. This value is copied to the variable x by cpobj. Vararg is a calling convention and thus part of the signature of the method, so we specify it as part of the call instruction as well. The ellipsis denotes the end of the fixed parameters and the beginning of the variable ones, since a function may also want a certain number of fixed parameters.

The other functions of the class ArgIterator can also give us useful information, such as the number of arguments that remain to be read.
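The same sequence of calls can be written in C# with the undocumented keyword __arglist. This is a sketch under assumptions: VarargDemo and Sum are hypothetical names, and ArgIterator is assumed to be supported on the platform in use (the .NET Framework on Windows supports it; other runtimes may throw PlatformNotSupportedException).

```csharp
using System;

class VarargDemo
{
    // __arglist makes the compiler emit a method with the vararg calling convention.
    public static int Sum(__arglist)
    {
        // Same steps as the IL: arglist handle -> ArgIterator constructor.
        ArgIterator it = new ArgIterator(__arglist);
        int total = 0;
        while (it.GetRemainingCount() > 0)
        {
            // GetNextArg yields a typed reference; ToObject boxes the value,
            // and the cast unboxes it to an int.
            TypedReference tr = it.GetNextArg();
            total += (int)TypedReference.ToObject(tr);
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(__arglist(4, 5, 6)));
    }
}
```

GetRemainingCount is what lets the loop run without the callee knowing in advance how many arguments were passed.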

We use method parameters to enable a method to accept data from the caller. Method parameters are checked for type safety. They make it mandatory for a method to be called with the correct parameters. The Execution Engine enforces the contract between the caller and the called methods.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals ( int32 i)

ldc.i4 4

stloc.0

ldloc.0

call void System.Console::WriteLine(int32)

newobj instance void zzz::.ctor()

ldloc.0

call instance void zzz::abc(int32)

ret

}

.method public instance void abc(int32 )

{

ldarg.1

call void System.Console::WriteLine(int32)

ret

}

Output

We are not compelled to assign a name to a parameter. In the above program, we have a local i as well as a parameter of type int32 that has no name or id. IL does not seem to care at all. However, unnamed variables can be referenced only by index. Parameters can also have attributes, as we shall now see, but these attributes have nothing to do with the signature.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj instance void zzz::.ctor()

ldc.i4 2

call instance void zzz::abc(int32)

ret

}

.method public instance void abc([opt] int32 i )

{

ldarg.1

call void System.Console::WriteLine(int32)

ret

}

Output

The first parameter attribute is opt, which marks the parameter as optional. This suggests that it is not compulsory to pass this parameter to our function.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj instance void zzz::.ctor()

call instance void zzz::abc()

ret

}

.method public instance void abc([opt] int32 i )

{

ret

}

Output

Exception occurred: System.MissingMethodException: Void .zzz.abc()

at zzz.vijay()

Always read the fine print. The opt attribute may indicate that the parameter is optional, but it is used for documentation purposes only. The compiler may place the opt attribute on a parameter, so that other tools make sense of it. As far as the runtime is concerned, however, all the parameters are mandatory, and it simply ignores the opt attribute. Thus, opt has no significance for the runtime.
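One tool that does read the flag is reflection. The sketch below assumes that the C# attribute [Optional] from System.Runtime.InteropServices is what sets opt on the parameter. OptDemo and IsFirstParameterOptional are hypothetical names used only for illustration.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

class OptDemo
{
    // [Optional] is recorded in metadata as the opt flag on parameter i.
    public void Abc([Optional] int i)
    {
        Console.WriteLine(i);
    }

    // Reads the flag back, the way a compiler or another tool might.
    public static bool IsFirstParameterOptional()
    {
        ParameterInfo p = typeof(OptDemo).GetMethod("Abc").GetParameters()[0];
        return p.IsOptional;
    }

    static void Main()
    {
        Console.WriteLine(IsFirstParameterOptional());
        // At the IL level the call still passes a value for the parameter.
        new OptDemo().Abc(2);
    }
}
```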

Implementation attributes provide a lot of information about the nature of the method to the runtime. These attributes decide whether the method requires special handling at runtime or not.

The Synchronized Attribute

a.il

.assembly mukhi {}

.class public auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig instance void abc() synchronized

{

.locals (int32 V_0)

ldc.i4.0

stloc.0

br.s IL_0018

IL_0004: ldloc.0

call void [mscorlib]System.Console::WriteLine(int32)

ldc.i4 0x3e8

call void [mscorlib]System.Threading.Thread::Sleep(int32)

ldloc.0

ldc.i4.1

add

stloc.0

IL_0018: ldloc.0

ldc.i4.3

ble.s IL_0004

ret

}

}

.class public auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0,class [mscorlib]System.Threading.Thread V_1,class [mscorlib]System.Threading.Thread V_2)

newobj instance void yyy::.ctor()

stloc.0

ldloc.0

ldftn instance void yyy::abc()

newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(class System.Object,int32)

newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)

stloc.1

ldloc.0

ldftn instance void yyy::abc()

newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(class System.Object,int32)

newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)

stloc.2

ldloc.1

call instance void [mscorlib]System.Threading.Thread::Start()

ldloc.2

call instance void [mscorlib]System.Threading.Thread::Start()

ret

}

Output with synchronized

Output without synchronized

You should run the above program with and without the synchronized attribute to appreciate its significance.

The attribute il managed tells the runtime that the method contains IL code that will run in the managed world. We have created two threads, V_1 and V_2. These execute the same function abc from class yyy.

In the function abc, we display the numbers from 0 to 3 using a loop. After displaying a number, the Sleep function suspends the current thread for 1000 milliseconds. Without synchronized, the first thread enters abc, prints the value 0 and then sleeps. The second thread takes advantage of the fact that the first thread is sleeping, so it also displays 0 and falls asleep. This continues till both threads reach the value 3 and exit from the loop.

With the synchronized attribute, the second thread cannot enter the function until the first thread has left it. Thus, the second thread has no choice but to wait until the first thread finishes executing abc. Try implementing the above in C#.

What we are trying to say is that if C# does not expose a feature of IL, there is no way you can use it in a .cs program.
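For this particular flag, C# has no keyword, but the attribute MethodImpl with MethodImplOptions.Synchronized appears to request the same behaviour. The following is a minimal sketch rather than a translation of the listing above: the Log list, the class SyncDemo and the shortened 100 millisecond sleep are assumptions made to keep the example quick and checkable.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;

class Yyy
{
    // Records which thread executed each iteration, in order.
    public readonly List<int> Log = new List<int>();

    // Only one thread at a time may run this method on a given instance.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Abc()
    {
        for (int i = 0; i <= 3; i++)
        {
            Log.Add(Thread.CurrentThread.ManagedThreadId);
            Console.WriteLine(i);
            Thread.Sleep(100);
        }
    }
}

class SyncDemo
{
    // Starts two threads on the same object and waits for both to finish.
    public static List<int> Run()
    {
        Yyy a = new Yyy();
        Thread t1 = new Thread(new ThreadStart(a.Abc));
        Thread t2 = new Thread(new ThreadStart(a.Abc));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return a.Log;
    }

    static void Main()
    {
        Run();
    }
}
```

Removing the attribute lets the two threads interleave, which is the comparison the text asks you to make.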

If a code implementation attribute is not given, the default value is il managed. The other three options are native, optil and runtime. These are mutually exclusive. The runtime attribute specifies that the implementation of the code will be supplied by the runtime, and not by the programmer. We cannot place any code in this type of a method. It is used for constructors and delegates.

a.il

.assembly mukhi {}

.class public auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() optil

{

.entrypoint

ret

}

On running the ‘a.exe’ executable, three message boxes pop up with the following message.

Unable to load OptJit Compiler (MSCOROJT.DLL). File may be missing or corrupt. Please check your setup or rerun setup.

Failure to compile a method to native code. Most likely it is a corrupt executable file.

Windows Protection Error

The program reported the above errors once the attribute optil was introduced. The first message clearly says that a particular dll could not be loaded. The attribute optil means that the code is optimized IL, which is meant to run faster.

We normally end the attributes of a method with the qualifier managed or unmanaged. The default value is managed. The qualifier signifies who will manage the execution of the method.

• Managed signifies that the CLR will manage it.

• Unmanaged signifies that someone else will manage it.

a.il

.assembly mukhi {}

.class public auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il unmanaged

{

.entrypoint

ret

}

Output

Exception occurred: System.ExecutionEngineException: An exception of type System.ExecutionEngineException was thrown.

at zzz.vijay()

If we use the unmanaged attribute with pure IL code we get the above exception.

a.cs

using System;

using System.Runtime.InteropServices;

class zzz

{

[DllImport("user32.dll")]

public static extern int MessageBoxA(int h, string m, string c, int type);

public static void Main()

{

MessageBoxA(0,"Hell","Bye",0);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz

extends [mscorlib]System.Object

{

.method public hidebysig static pinvokeimpl("user32.dll" winapi)

int32 MessageBoxA(int32 h,class System.String m,class System.String c,int32 type) il managed

{

}

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.0

ldstr "Hell"

ldstr "Bye"

ldc.i4.0

call int32 zzz::MessageBoxA(int32,class System.String,class System.String,int32)

pop

ret

}

An enormous amount of code has already been written in the programming language C for the Windows Operating System. This code resides in files called dll's or Dynamic Link Libraries. To ensure that this code is also available to programs written in IL, C# provides an attribute called DllImport.

To be technically accurate, code in a dll has nothing to do with a programming language. Once we obtain a dll, there is no way to detect which programming language it was originally written in. The C# compiler converts our DllImport attribute into an attribute on the method in IL. This implies that C# understands attributes and, depending upon the attribute, generates the relevant IL code. The method is called MessageBoxA and has the same parameters that we specified in C#. The added attribute is pinvokeimpl, whose first argument is the name of the dll that contains the function.

Next comes the calling convention, which settles three things. The first is the order in which parameters are pushed on the stack before the function gets called. IL places them "first written first placed" i.e. from left to right, whereas the winapi calling convention follows the reverse order i.e. right to left.

The second is name decoration: the name of the function has a number appended that specifies the size of the parameters on the stack. The third is who restores the stack, the caller or the callee.

The function MessageBoxA can be called in the same manner that any other static function of IL gets called.

There are two primary ways of calling unmanaged methods :

• One is using pinvokeimpl,

• The other is using IJW (It Just Works).

In IJW, the runtime stays out of our way, and we have to write the code for handling everything. We stick to pinvokeimpl, the one we can work with. The runtime automatically moves us from managed to unmanaged code, converts data types and handles all the issues of transition management. The documentation recommends the attributes native and unmanaged for such a method. The C# compiler, however, is not familiar with the documentation: as the listing shows, it emits il managed.

Tail Calls

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed {

.entrypoint

ldc.i4 2

ldc.i4 3

call int32 zzz::abc(int32, int32)

call void System.Console::WriteLine(int32)

ret

}

.method static public int32 abc(int32 a, int32 r)

{

ldarg a

ldc.i4 0

bgt c

ldarg r

ret

c: ldarg a

ldc.i4 1

sub

ldarg r

ldarg a

mul

tail.

call int32 zzz::abc(int32, int32)

ret

}

Output

The above example uses recursion to find out the factorial of a number. It uses the prefix tail., which marks the call that follows as a tail call. Languages like Lisp or Prolog use tail calls extensively. In a non-tail call, the current stack frame is kept intact and a new frame is allocated, so the stack grows. In a tail call, the current stack frame is replaced with a frame for the function being called.

When a normal call terminates with a ret, control returns to the calling function. In a tail call, the caller's frame is already gone, so when the called method returns, control passes directly to the caller's caller. Since non-tail calls need to store information about who the caller is, they use up memory on the stack and may limit the depth of recursion that is possible. Thus, tail calls handle recursion more efficiently than non-tail calls.

The above program works even without the tail prefix.
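The function abc can be written in C# as shown below, together with the loop that a tail call effectively turns it into. This is only a sketch: TailDemo and AbcLoop are hypothetical names, and the C# compiler is not assumed to emit the tail. prefix for the recursive version.

```csharp
using System;

class TailDemo
{
    // Recursive form, mirroring the IL: the recursive call is the last
    // thing the method does, so nothing in this frame is needed afterwards.
    public static int Abc(int a, int r)
    {
        if (a > 0)
            return Abc(a - 1, r * a);
        return r;
    }

    // Loop form: reusing one frame for every step is what replacing the
    // stack frame amounts to.
    public static int AbcLoop(int a, int r)
    {
        while (a > 0)
        {
            r = r * a;
            a = a - 1;
        }
        return r;
    }

    static void Main()
    {
        Console.WriteLine(Abc(2, 3));      // prints 6
        Console.WriteLine(AbcLoop(2, 3));  // prints 6
    }
}
```

Both forms compute the same value. Only the recursive one consumes a stack frame per step when no tail call is performed.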