The example above illustrates a simple point: the assembler does not perform extensive error checks. We have two files, p2.il and p3.il, that contain the same class xxx and function xyz. In the file p1.il we declare the same .class extern twice and get no error. The second class extern refers to a file, p3.dll, that has no file directive at the top level.

 

There are no errors at all at assemble time, which means that the assembler ignores our indiscretions. At runtime the xyz method gets called from p2.dll. Change the order of the class externs and watch the assembler complain. The old adage "buy at your own risk" applies here too: the assembler offers no helping hand and no hand-holding.

 

An assembly and a module let us group constructs together, and each plays a different role in the .Net world. A set or group of files is what we call an assembly. An abstract entity that we call the manifest keeps track of all the files present.

 

The Assembly table only stores the version, name, culture and security requirements, which we have not touched upon yet. The assembly manifest must keep track not only of the files but also of the cryptographic hash of each file. The manifest is computed from the metadata, as mentioned earlier.

 

The runtime also needs to know which types are defined by other files and can be exported out of this assembly. This is achieved using the class extern directive.

 

Obviously, if the type is defined in the same file that has the assembly directive, its attributes like public, private etc. decide whether it can be exported from this assembly or not. The manifest can also carry a digital signature computed over the manifest itself, along with the public key used to compute it.

 

The problem with digital signatures is that, for some reason, they did not set the Thames on fire. The major difference between assemblies and modules is that assemblies comprise modules. A module is a single file that adheres to the rules of the .Net world. It can be either a dll or an exe file, and it must contain executable code.

 

There is one file in the list of files that carries the assembly directive and it is this file that gives us a list of other modules that make up the assembly. The file that carries the assembly directive is also a module. If the assembly is a dll, then we do not need a function that has the entrypoint directive.

 

However, if it is an exe file, then the file that contains the entrypoint directive should also contain the assembly directive or manifest. The concept of namespaces may exist in programming languages, but the CLI does not understand it.

 

Type names are always specified using the full name, which is relative to the assembly in which we created them. A file that does not contain metadata, say a text file, a bmp file or a video file, will not have the module directive. It is not common practice for many assemblies to refer to the same module, but nothing stops us from doing so.

 

The advantage is that once a dll is loaded into memory, the next assembly that references the dll will not result in the dll being loaded again in memory.

 

Program13.csc

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

}

 

public void DisplayClassExtern ()

{

if (ExportedTypeStruct == null)

return;

for ( int ii = 1 ; ii < ExportedTypeStruct.Length ; ii++)

{

string ss1 = GetTypeAttributeFlagsForClassExtern(ExportedTypeStruct[ii].flags );

string ss = GetString(ExportedTypeStruct[ii].nspace) ;

if ( ss.Length != 0)

ss = ss + ".";

ss = ss + NameReserved(GetString(ExportedTypeStruct[ii].name));

int table = ExportedTypeStruct[ii].coded & 0x03;

int index = ExportedTypeStruct[ii].coded >> 2;

if ( index != 0)

{

Console.Write(".class extern /*27{0}*/ {1}" , ii.ToString("X6") , ss1 );

Console.WriteLine(ss);

Console.WriteLine("{");

if ( table == 0)

{

Console.WriteLine("  .file {0}/*26{1}*/ " , NameReserved(GetString(FileStruct[index].name)) , index.ToString("X6"));

}

if ( table == 2)

{

Console.WriteLine("  .comtype '{0}' /*27{1}*/ " , NameReserved(GetString(ExportedTypeStruct[index].name)) , index.ToString("X6"));

}

if ( ExportedTypeStruct[ii].typedefindex != 0)

Console.WriteLine("  .class 0x{0}", ExportedTypeStruct[ii].typedefindex.ToString("X8"));

Console.WriteLine("}");

}

}

}

public string GetTypeAttributeFlagsForClassExtern(int typeattributeflags )

{

typeattributeflags = typeattributeflags & 0x07;

string visibiltymaskstring="";

if ( typeattributeflags == 1)

visibiltymaskstring = "public ";

if ( typeattributeflags == 2)

visibiltymaskstring = "nested public ";

return visibiltymaskstring;

}

 

e.il

.file mscorlib.dll

.file jj.dll

.assembly e

{

}

.file aa.dll

.file bb

.class extern o1

{

.file jj.dll

}

.file jj.dll

.class  extern public jjj

{

.file aa.dll

}

.class  extern nested public kkk

{

.file aa.dll

}

.class  extern public lll

{

.class extern kkk

}

.class  extern public ooo

{

.class extern nnn

}

.class  extern public ppp

{

.class extern jjj

}

.class  extern public mmm

{

.class extern System.Object

}

.class  extern public System.Object

{

.file mscorlib.dll

}

.class  extern public nnn

{

.file jj.dll

}

.class extern a1

{

.file jj.dll

.class 56

 

}

.class zzz

{

.method static void abc()

{

.entrypoint

ret

}

}

 

Output

.assembly extern /*23000001*/ mscorlib

{

  .ver 0:0:0:0

}

.assembly /*20000001*/ e

{

  .ver 0:0:0:0

}

.file /*26000001*/ mscorlib.dll

.file /*26000002*/ jj.dll

.file /*26000003*/ aa.dll

.file /*26000004*/ bb

.class extern /*27000001*/ o1

{

  .file jj.dll/*26000002*/

}

.class extern /*27000002*/ public jjj

{

  .file aa.dll/*26000003*/

}

.class extern /*27000003*/ nested public kkk

{

  .file aa.dll/*26000003*/

}

.class extern /*27000004*/ public lll

{

  .comtype 'kkk' /*27000003*/

}

.class extern /*27000006*/ public ppp

{

  .comtype 'jjj' /*27000002*/

}

.class extern /*27000008*/ public System.Object

{

  .file mscorlib.dll/*26000001*/

}

.class extern /*27000009*/ public nnn

{

  .file jj.dll/*26000002*/

}

.class extern /*2700000A*/ a1

{

  .file jj.dll/*26000002*/

  .class 0x00000038

}

 

 

In this program we simply display the class extern directive. As always, we first add a call to a new function, DisplayClassExtern, to the abc function. We then iterate in a loop through all the rows in the ExportedType table. The first thing we need to figure out is the export attribute, which can have one of two values: public or nested public.

 

These specify the visibility, that is, who can see and thus use this type. The function GetTypeAttributeFlagsForClassExtern does the job and returns the string for us. The visibility attributes are stored in the first three bits, so we bitwise AND the parameter with 7, as the other bits specify other attributes.

 

We then check whether the value is 1 or 2 which tells us whether the public or nested public attributes are on. We then need the name or more precisely the dotted name of the extern type. This is stored in two fields name and nspace.

 

We first get the name of the namespace and, if the namespace is present, we add a dot and then the name of the class. The next field, coded, is the most important, as it tells us which directives are used within the braces. If you remember, we could use .file and .class extern.
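The two steps described so far, masking the flags and joining the namespace to the name, can be tried out in isolation. The following is a minimal sketch, not part of our disassembler: the class ExportedTypeName and its method names are made up for this illustration, and the strings are passed in directly instead of being fetched with GetString.

```csharp
using System;

// Hypothetical stand-alone helper; the names are for illustration only.
static class ExportedTypeName
{
    // Same logic as GetTypeAttributeFlagsForClassExtern:
    // only the low three bits carry the visibility.
    public static string Visibility(int flags)
    {
        int mask = flags & 0x07;
        if (mask == 1) return "public ";
        if (mask == 2) return "nested public ";
        return "";
    }

    // A dot is added only when a namespace is present.
    public static string DottedName(string nspace, string name)
    {
        if (nspace.Length != 0)
            return nspace + "." + name;
        return name;
    }

    static void Main()
    {
        // 0x100001 has an unrelated high bit set; masking with 7 leaves 1.
        Console.WriteLine(".class extern " + Visibility(0x100001) + DottedName("System", "Object"));
        // No visibility bits and no namespace, like the class o1.
        Console.WriteLine(".class extern " + Visibility(0) + DottedName("", "o1"));
    }
}
```

The first line printed is ".class extern public System.Object" and the second is ".class extern o1".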

 

The coded field tells us the name of the table and the row in that table. The row in the table must be non zero if the directive class extern has to be displayed. Why this is so will be explained in a short while by using a practical example.

 

Thus the entire directive will not be displayed if the variable index is zero, even though the dotted name has a non-null value. Once we know that the index is non-zero, we check the table that it refers to. A value of 0 refers to the File table, and a value of 2 refers to the Exported Type table.

 

A value of 1 means the Assembly Ref table, and for some reason we have found no way to simulate this value. We now take each individual table that makes up the coded index. The first is the File table with a value of 0. Here we simply use the index variable as an index into the FileStruct table and display the name of the file stored there.
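To make the split of the coded field concrete, here is a small sketch that decodes a value by hand. The class ImplementationIndex and its Encode helper are hypothetical and exist only for this example; the tag values 0, 1 and 2 are the ones described above.

```csharp
using System;

// Hypothetical helper for experimenting with the coded field.
static class ImplementationIndex
{
    // The low two bits select the table.
    public static string TableName(int coded)
    {
        switch (coded & 0x03)
        {
            case 0: return "File";
            case 1: return "AssemblyRef";
            case 2: return "ExportedType";
            default: return "unknown";
        }
    }

    // The remaining bits are the row number in that table.
    public static int Row(int coded)
    {
        return coded >> 2;
    }

    // Builds a coded value so that we have something to decode.
    public static int Encode(int tag, int row)
    {
        return (row << 2) | tag;
    }

    static void Main()
    {
        // The class lll refers to kkk, which is row 3 of the ExportedType table.
        int coded = Encode(2, 3);
        Console.WriteLine("{0} row {1}", TableName(coded), Row(coded));
    }
}
```

This prints "ExportedType row 3", which corresponds to the comtype line with the token 27000003 in the output.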

 

We also need to display the row number in the File table, which is available to us in the index variable. A table number of 2 means that we have used the class extern directive; let us understand it with specific reference to the class lll in the file e.il. Here we had written the class extern directive with the name of the class kkk.

 

The class kkk happens to be another class extern with a file directive. This gets converted to a comtype directive, whose explanation is nowhere to be found. Also, ilasm does not understand such a directive. All that we know is that we have used a class name that is defined somewhere else, as we have used the class extern directive.

 

The class name kkk is stored in the ExportedType table, and we use the index variable to fetch its name. We are not allowed to use a type name like zzz in the class extern, as it is a type we are creating in the current file. The last if statement checks the value of typedefindex, which is always zero unless we add the .class directive. All that this directive does is fill the typedefindex field with the row number of the type in the TypeDef table of the other module. As we will explain much later, this is only a hint, and nobody checks the value we write after the class directive.

 

Looking at the file e.il, the first three class extern classes, o1, jjj and kkk, simply use the file directive. The only difference is the visibility attributes. The class ooo is not present in the output at all, even though it is there in the e.il file.

 

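Let us investigate. For this class we used the class extern directive with the class name nnn, which in turn is stated to be in the file jj.dll. As class ooo does nothing but refer to class nnn, there is no use for its independent existence, and ilasm, being a smart cookie, gives it no usable reference. The row numbers in the output hint at the mechanics: they jump from 27000004 to 27000006, which suggests that row 5 still exists but carries an implementation index of zero, so our index != 0 check skips it. Note also that nnn is declared after ooo in e.il, whereas kkk and jjj are declared before the classes lll and ppp that refer to them.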

 

That makes sense, as the class is simply an alias for another class. The class ppp is similar in concept to the class lll, which also ends up with the undocumented comtype directive. The class mmm is also missing from the output, as it is an alias for the class System.Object, which is declared after mmm in e.il. There is no point cluttering things up with classes that do nothing.

 

Whenever we use the class extern directive we should use the assembly directive also, but we will not get an error if we do not. Obviously there is internally only one manifest module, and all exported types should be present in this manifest.

 

The rationale for this table is that a tool that reads it can figure out all the types that others can use from this assembly. This manifest module will therefore contain all the types exported from all the modules that make up the assembly. Unfortunately, this manifest module is also called the assembly.

 

Thus each time we create a type in any module of an assembly, this table of ours gets one row added. In other words, every type created in another module has an entry in that module's TypeDef table, and this row number is meant to be placed in the Exported Type table. The TypeDefID or typedefindex field in our case is always zero, as mentioned earlier, unless of course we use the class directive.

 

The point we are making is that this is the first time we are referring to a row in another module; such references are called foreign tokens. The fact of the matter is that the assembler is too lazy to go to the other modules and figure out what the TypeDef row indexes are.

 

The type name is stored in the name and namespace fields, and there was a plan to use the typedefid field if a search by name failed. The specs say at one place that the Implementation coded index can refer only to the File and Exported Type tables, and at another place add the Assembly Ref table as well.

 

For the File table the visibility mask has to be public and not nested public, but we do not get an error if we make such a mistake. The specs also say that if the table is the Exported Type table, then the visibility mask has to be nested public. No error again if we break the rule. Why have rules when no one checks for them?
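Since ilasm does not enforce these two rules, we could check them ourselves. The sketch below is one possible reading of the rules as quoted above and is not taken from any tool; the class ExportedTypeRules and its Check method are invented for this example.

```csharp
using System;

// Hypothetical checker for the two visibility rules quoted in the text.
static class ExportedTypeRules
{
    // Returns null when the row obeys the rules, otherwise a short complaint.
    public static string Check(int flags, int coded)
    {
        int visibility = flags & 0x07;   // 1 = public, 2 = nested public
        int table = coded & 0x03;        // 0 = File, 2 = ExportedType
        int row = coded >> 2;
        if (row == 0)
            return null;                 // nothing is referenced, nothing to check
        if (table == 0 && visibility == 2)
            return "File implementation should not be nested public";
        if (table == 2 && visibility != 2)
            return "ExportedType implementation should be nested public";
        return null;
    }

    static void Main()
    {
        // jjj: public, .file aa.dll (File row 3)
        Console.WriteLine(Check(1, (3 << 2) | 0) ?? "ok");
        // kkk: nested public, .file aa.dll (File row 3)
        Console.WriteLine(Check(2, (3 << 2) | 0) ?? "ok");
        // lll: public, refers to ExportedType row 3
        Console.WriteLine(Check(1, (3 << 2) | 2) ?? "ok");
    }
}
```

With the values from e.il, jjj passes while kkk and lll are both flagged, even though ilasm accepted all three without a word.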

 

Program14.csc

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

}

public void DisplayResources()

{

if ( ManifestResourceStruct == null)

return;

for ( int ii = 1 ; ii < ManifestResourceStruct.Length ; ii++)

{

string flags = GetManifestResourceAttributes(ManifestResourceStruct[ii].flags);

Console.WriteLine(".mresource /*28{0}*/ {1}{2}" ,  ii.ToString("X6") , flags, NameReserved(GetString(ManifestResourceStruct[ii].name)) );

 Console.WriteLine("{");

string table = GetManifestResourceTable(ManifestResourceStruct[ii].coded);

int index = GetManifestResourceValue(ManifestResourceStruct[ii].coded);

if ( table == "AssemblyRef")

Console.WriteLine("  .assembly extern {0} /*23{1}*/ " , NameReserved(GetString(AssemblyRefStruct[index].name)) , index.ToString("X6"));

else if ( table == "File" && index > 0)

Console.WriteLine("  .file {0}/*26{1}*/  at 0x{2}" , NameReserved(GetString(FileStruct[index].name)) , index.ToString("X6") ,ManifestResourceStruct[ii].offset.ToString("X8") );

else

Console.WriteLine("  // WARNING: managed resource file {0} created",NameReserved(GetString(ManifestResourceStruct[ii].name) ) );

Console.WriteLine("}");

}

}

public int GetManifestResourceValue(int manifiestvalue)

{

return manifiestvalue>> 2;

}

public string GetManifestResourceTable(int manifiestvalue)

{

string returnstring = "";

short tag = (short)(manifiestvalue & (short)0x03);

if ( tag  == 0)

returnstring = returnstring + "File";

if ( tag  == 1)

returnstring = returnstring + "AssemblyRef";

return returnstring;

}

public string GetManifestResourceAttributes(int manifiestvalue)

{

string returnstring="";

if ( (manifiestvalue & 0x001) == 0x001)

returnstring = returnstring + "public ";

if ( (manifiestvalue & 0x002) == 0x002)

returnstring = returnstring + "private ";

return returnstring;

}

 

e.il

.assembly e

{

}

.assembly extern a1

{

}

.file aa.dll

.mresource r1

{

}

.mresource public r1

{

}

.mresource private r2

{

.assembly extern a1

}

.mresource public r3

{

.file aa.dll at 12

}

.mresource public r4

{

.file aa.dll at 12

.assembly extern a1

}

.class zzz

{

.method static void abc()

{

.entrypoint

ret

}

}

 

Output

.assembly extern /*23000001*/ a1

{

  .ver 0:0:0:0

}

.assembly extern /*23000002*/ mscorlib

{

  .ver 0:0:0:0

}

.assembly /*20000001*/ e

{

  .ver 0:0:0:0

}

.file /*26000001*/ aa.dll

.mresource /*28000001*/ r1

{

  // WARNING: managed resource file r1 created

}

.mresource /*28000002*/ public r1

{

  // WARNING: managed resource file r1 created

}

.mresource /*28000003*/ private r2

{

  .assembly extern a1 /*23000001*/

}

.mresource /*28000004*/ public r3

{

  .file aa.dll/*26000001*/  at 0x0000000C

}

.mresource /*28000005*/ public r4

{

  .assembly extern a1 /*23000001*/

}

 

The above program displays all the resources that we have. We have added a function  DisplayResources that displays all the rows from the Manifest Resource table. This table gets filled up by the mresource directive. The first field is the flags field which gives us the visibility mask that is similar to the class extern directive.

 

The only difference is that the two valid values are public and private. We use the function GetManifestResourceAttributes to return one of these values. None of these visibility attributes is mandatory. We display the visibility attributes along with the mresource directive and the row number in the Manifest Resource table.

 

Nearly every table has a coded index, and the Manifest Resource table does not lag behind. In this case the coded index points either to the File table, if the tag is 0, or to the Assembly Ref table, if the tag is 1. The specs call this the Implementation coded index, which also includes the Exported Type table, but for resources only the above two apply.

 

We use the GetManifestResourceTable function to return the table name and the function GetManifestResourceValue to return the row number, obtained by right shifting the coded index by two bits. Within the mresource directive we can write the assembly extern directive, the file directive, both, or neither.

 

If we use the assembly extern directive, then the index variable is the row number in the Assembly Ref table and we display the name of the assembly. If we have used the file directive and the index is a row number larger than or equal to 1, then we display the file name along with the position in the file where the resource starts.

 

The index variable is the row number in the File table, and the offset within the file is stored in the offset field of the Manifest Resource table. Finally, if the coded index refers to the File table but the index variable is zero, we display a warning within comments stating that a file with the same name as the mresource directive has been created.
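The three branches of DisplayResources can be condensed into one small function that is easy to experiment with. This is only a sketch: the class ResourceLocation and its Describe method are hypothetical, and the descriptive strings it returns are ours, not ildasm's.

```csharp
using System;

// Hypothetical condensed form of the three cases in DisplayResources.
static class ResourceLocation
{
    public static string Describe(int coded, int offset)
    {
        int tag = coded & 0x03;   // 0 = File, 1 = AssemblyRef
        int row = coded >> 2;     // row number in that table
        if (tag == 1)
            return "AssemblyRef row " + row;
        if (tag == 0 && row > 0)
            return "File row " + row + " at 0x" + offset.ToString("X8");
        // A zero row means the resource lives in the current file.
        return "current file";
    }

    static void Main()
    {
        Console.WriteLine(Describe(0, 0));              // r1: no directive at all
        Console.WriteLine(Describe((1 << 2) | 1, 0));   // r2: .assembly extern a1
        Console.WriteLine(Describe((1 << 2) | 0, 12));  // r3: .file aa.dll at 12
    }
}
```

The three lines printed are "current file", "AssemblyRef row 1" and "File row 1 at 0x0000000C", matching the three kinds of output shown for r1, r2 and r3.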

 

If we specify no directive, we get no error and the above warning is displayed. If we use both, the assembly directive is used and not the file directive. As always no error checks for the file directive or the offset. An assembly can have lots of different data items associated with it.

 

If we ever want to name an item of data, we use the manifest resource to do so. If we do not have an assembly directive, it is perfectly legal to use the mresource directive, but we will not be able to execute the assembly.

 

The reason we specify public or private for a manifest resource is so that the assembly knows whether this item can be exported or seen outside this assembly or should remain visible only within the assembly if it is flagged private.

 

If the resource is stored in a file that is not a module, a text file for example, then we need a separate file directive declaring that file. In this case the byte offset will be zero.

 

If the resource is defined in another assembly, we need an assembly extern directive at the top level before we can use the assembly extern directive within the mresource directive. The offset field is normally a valid offset, relative to the resource data directory entry in the COR header.

 

But as said earlier, this error check is not done at present. If the index is zero, it means that the resource is stored in the current file, and hence the warning.

 

Program15.csc

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

}

public void DisplayModuleAndMore()

{

Console.WriteLine(".module {0}" , NameReserved(GetString(ModuleStruct[1].Name)));

Console.Write("// MVID: ");

DisplayGuid(ModuleStruct[1].Mvid);

Console.WriteLine();

Console.WriteLine(".imagebase 0x{0}" , ImageBase.ToString("x8"));

Console.WriteLine(".subsystem 0x{0}" , subsystem.ToString("X8"));

Console.WriteLine(".file alignment {0}" , filea);

Console.WriteLine(".corflags 0x{0}" , corflags.ToString("x8"));

Console.WriteLine("// Image base: 0x03000000");

}

public void DisplayGuid (int guidindex)

{

Console.Write("{");

Console.Write("{0}{1}{2}{3}", guid[guidindex+2].ToString("X2") , guid[guidindex+1].ToString("X2") , guid[guidindex].ToString("X2") , guid[guidindex-1].ToString("X2"));

Console.Write("-{0}{1}-",guid[guidindex+4].ToString("X2") , guid[guidindex+3].ToString("X2"));

Console.Write("{0}{1}-",guid[guidindex+6].ToString("X2") , guid[guidindex+5].ToString("X2"));

Console.Write("{0}{1}-",guid[guidindex+7].ToString("X2") , guid[guidindex+8].ToString("X2"));

Console.Write("{0}{1}{2}{3}{4}{5}",guid[guidindex+9].ToString("X2"),guid[guidindex+10].ToString("X2"),guid[guidindex+11].ToString("X2"),guid[guidindex+12].ToString("X2"),guid[guidindex+13].ToString("X2"),guid[guidindex+14].ToString("X2"));

Console.Write("}");

}

 

e.il

.assembly e

{

}

.module aaaa

.class zzz

{

.method static void abc()

{

.entrypoint

ret

}

}

 

Output

.module aaaa

// MVID: {EDBE9E84-F6DE-468C-B8CB-0CB099FD1EA4}

.imagebase 0x00400000

.subsystem 0x00000003

.file alignment 512

.corflags 0x00000001

// Image base: 0x03000000

 

This is a small program, and all that it does is add one more function call, DisplayModuleAndMore, to the abc function. The .module directive is optional, and it adds one record to the Module table. As mentioned earlier, we can have only one module directive, not two, and if we write none, one gets added for us automatically.

 

We simply display the directive module and follow this with the words MVID. We then call a function DisplayGuid that displays a guid for us. Every application needs to be uniquely identified and the assembler gives it a unique 128 bit number stored in the field Mvid. Each time we regenerate our assembly, this number changes.

 

The reason it is a 128 bit number is that such a number is unique across time and space. The catch with the guid is that it is conventionally displayed in a particular byte order, so the function DisplayGuid picks the bytes one by one, starting from the offset in the guid stream passed as a parameter.

 

For example, we display the fourth byte first, followed by the third, and so on. This guid is calculated using the ISO/IEC standard 11578:1996. GUID stands for Globally Unique IDentifier. It is a concept used by CORBA and OLE in the past.
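The byte shuffling in DisplayGuid is easier to follow when the order is written out as a table. The sketch below does that and compares the result with what the framework's own System.Guid type prints for the same 16 bytes; the class MvidFormat and the sample bytes are made up for the illustration.

```csharp
using System;

// Hypothetical rewrite of the DisplayGuid byte order as an index table.
static class MvidFormat
{
    public static string Format(byte[] g)
    {
        // -1 marks the position of a dash; the other entries are byte offsets.
        int[] order = { 3, 2, 1, 0, -1, 5, 4, -1, 7, 6, -1, 8, 9, -1,
                        10, 11, 12, 13, 14, 15 };
        string s = "{";
        foreach (int i in order)
            s += (i < 0) ? "-" : g[i].ToString("X2");
        return s + "}";
    }

    static void Main()
    {
        // Sample data: the bytes 01, 02, ... 10 (hex).
        byte[] g = new byte[16];
        for (int i = 0; i < 16; i++)
            g[i] = (byte)(i + 1);

        Console.WriteLine(Format(g));
        // System.Guid applies the same ordering to a 16-byte array.
        Console.WriteLine(new Guid(g).ToString("B").ToUpper());
    }
}
```

Both lines print {04030201-0605-0807-090A-0B0C0D0E0F10}: the first three groups are reversed and the last eight bytes are shown in the order in which they are stored.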

 

The VES (Virtual Execution System or Runtime) does not make any use of the Guid but debuggers should use this number to uniquely identify the module. The name of the file is not the physical file name but the logical name stored in the metadata.

 

The Module table is the first one the designers of the metadata thought of, as they gave it the number 0. The generation field is reserved and has a value of zero. EncId and EncBaseId are also reserved; they are indexes into the Guid heap and also have a value of 0. Both the Mvid and the name fields must have a non-null value.

 

If you remember, we had earlier read some instance variables like file alignment and subsystem. We simply display these values here. The ImageBase, subsystem and file alignment variables have already been displayed earlier. The ImageBase is displayed once again within comments, with a constant value.

 

We will now explain some points that we missed about these values. The corflags directive was not written by us, and the CLI expects the value to be 1. For backwards compatibility the 3 least significant bits are reserved. Thus the values from 8 to 65535 will be used by future versions.

 

Those who create experimental and/or non-standard versions have the blessings of the .Net standard to use values larger than 65535. The subsystem directive is used only when we execute the assembly, and thus dlls have no use for it. It may be a 32 bit number, but it can have only two possible values.

 

A value of 2 means that the program should be run using whatever conventions are fit for a GUI application. A value of 3 is for a console application. At this point in time there is no third environment for an application to execute in. The file alignment can only be a multiple of 512 bytes.
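The rules for the subsystem and file alignment values are simple enough to state in code. This is a minimal sketch under the rules as given above; the class HeaderValues and both method names are hypothetical.

```csharp
using System;

// Hypothetical helpers that restate the rules from the text.
static class HeaderValues
{
    // Only the values 2 and 3 are described as valid.
    public static string SubsystemName(int subsystem)
    {
        if (subsystem == 2) return "GUI";
        if (subsystem == 3) return "console";
        return "unknown";
    }

    // The text asks for a multiple of 512 bytes.
    public static bool IsValidFileAlignment(int alignment)
    {
        return alignment > 0 && alignment % 512 == 0;
    }

    static void Main()
    {
        // The values shown in the output of our program.
        Console.WriteLine(SubsystemName(3));
        Console.WriteLine(IsValidFileAlignment(512));
        Console.WriteLine(IsValidFileAlignment(500));
    }
}
```

For the values in our output, subsystem 3 is reported as a console application and a file alignment of 512 passes the check, whereas a value such as 500 does not.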

 

Program16.csc

string [] vtfixuparray;

 

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

DispalyVtFixup();

}

 

public void ReadandDisplayVTableFixup()

{

if ( MethodStruct == null)

return;

if ( vtablerva != 0)

{

long save ;

long position = ConvertRVA(vtablerva) ;

if ( position == -1)

return;

mfilestream.Position = position;

Console.WriteLine("// VTableFixup Directory:");

int count1 = vtablesize/8;

vtfixuparray = new string[count1];

for ( int ii = 0 ; ii < count1 ; ii++)

{

vtfixuparray[ii] = ".vtfixup ";

int fixuprva = mbinaryreader.ReadInt32();

Console.WriteLine("//   IMAGE_COR_VTABLEFIXUP[{0}]:" , ii);

Console.WriteLine("//       RVA:               {0}",fixuprva.ToString("x8"));

short count = mbinaryreader.ReadInt16();

Console.WriteLine("//       Count:             {0}", count.ToString("x4"));

short type = mbinaryreader.ReadInt16();

Console.WriteLine("//       Type:              {0}", type.ToString("x4"));

save = mfilestream.Position;

mfilestream.Position = ConvertRVA(fixuprva) ;

int i1 ;

long [] val = new long[count] ;

for ( i1 = 0 ; i1 < count ; i1++)

{

if ( (type&0x01)  == 0x01)

val[i1] = mbinaryreader.ReadInt32();

if ( (type&0x02)  == 0x02)

val[i1] = mbinaryreader.ReadInt64();

if ( (type&0x01)  == 0x01 )

Console.WriteLine("//         [{0}]            ({1})",i1.ToString("x4") , val[i1].ToString("X8"));

if ( (type&0x02)  == 0x02)

Console.WriteLine("//         [{0}]            (         {1})",i1.ToString("x4") , (val[i1]&0xffffffff).ToString("X"));

}

mfilestream.Position = save;

vtfixuparray[ii] = vtfixuparray[ii] + "[" + (i1).ToString("X") + "] ";

if ( (type&0x01)  == 0x01)

vtfixuparray[ii] = vtfixuparray[ii] + "int32 ";

if ( (type&0x02)  == 0x02)

vtfixuparray[ii] = vtfixuparray[ii] + "int64 ";

if ( (type&0x04)  == 0x04)

vtfixuparray[ii] = vtfixuparray[ii] + "fromunmanaged ";

vtfixuparray[ii] = vtfixuparray[ii] + "at D_" + fixuprva.ToString("X8");

vtfixuparray[ii] = vtfixuparray[ii] + " //";

for ( i1 = 0 ; i1 < count ; i1++)

{

if ( (type&0x01)  == 0x01)

vtfixuparray[ii] = vtfixuparray[ii]  + " " + val[i1].ToString("X8");

if ( (type&0x02)  == 0x02)

vtfixuparray[ii] = vtfixuparray[ii] + " " + val[i1].ToString("X16");

}

}

Console.WriteLine();

}

}

 

public void DispalyVtFixup()

{

if (vtfixuparray == null)

return;

for ( int ii = 0 ; ii < vtfixuparray.Length ; ii++)

Console.WriteLine(vtfixuparray[ii]);

}

 

e.il

.class public a11111

{

.method public static void  adf() cil managed

{

.entrypoint

}

.method  public int64  a1() cil managed

{

}

.method  public int64  a2() cil managed

{

}

.method  public int64  a3() cil managed

{

}

.method  public int64  a4() cil managed

{

}

.method  public int64  a5() cil managed

{

}

.method  public int64  a6() cil managed

{

}

.method  public int64  a7() cil managed

{

}

}

.vtfixup [1] int32 at D_00008010

.vtfixup [1] int32 fromunmanaged at D_00008020

.vtfixup [1] int64 at D_00008030

.vtfixup [1] int64 fromunmanaged at D_00008040

.vtfixup [0] int64 at D_00008050

.vtfixup [2] int64 int64 at D_00008060

.data D_00008010 = bytearray ( 01 00 00 06)

.data D_00008020 = bytearray ( 02 00 00 06)

.data D_00008030 = bytearray ( 03 00 00 06)

.data D_00008040 = bytearray ( 04 00 00 06)

.data D_00008050 = bytearray ( 05 00 00 06)

.data D_00008060 = bytearray ( 06 00 00 06 00 00 00 00 07 00 00 06 00 00 00 00)

 

Output

.vtfixup [1] int32 at D_00004000 // 06000001

.vtfixup [1] int32 fromunmanaged at D_00004004 // 06000002

.vtfixup [1] int64 at D_00004008 // 0600000406000003

.vtfixup [1] int64 fromunmanaged at D_0000400C // 0600000506000004

.vtfixup [0] int64 at D_00004010 //

.vtfixup [2] int64 at D_00004014 // 0000000006000006 0000000006000007

 

If you look at program7.csc carefully, we displayed the directive vtfixup. If you looked at the output of that program very carefully, you would have realized that it was all in comments. We did not display the actual directive vtfixup at that time.

 

We have added an instance array, vtfixuparray, and we also call a function DispalyVtFixup in the function abc to display the directive. We have also added some more code to the ReadandDisplayVTableFixup function that populates the array vtfixuparray, so that it can be displayed later at the end.

 

As explained before, the count1 variable is a count of the entries, and we use it to create an array of the desired size. Creating an array of size zero is not an error; we simply get an empty array. For each entry we use its own count field to create the array val that stores the individual values.

 

Depending upon the type being an int32 or an int64, we read either 4 or 8 bytes into the val array. We then build the entire string in vtfixuparray and use the type variable to determine the width of the table. The fixuprva gives us the data address, which we prefix with D_.

 

Then we place the comments and after this we need the values that we wrote in the bytearray. These depend upon the count value we specified in the square brackets. Thus we use the same loop again and concatenate the bytearray values, reading either 4 or 8 bytes depending as we said on the width of the table.
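The way the directive text is assembled from one 8-byte fixup entry can be tried on a hand-made entry instead of a real file. The sketch below assumes the layout read by ReadandDisplayVTableFixup (a 4-byte RVA, a 2-byte count and a 2-byte type); the class VtFixupSketch and the sample bytes are invented for the example, and the trailing comment with the slot values is left out.

```csharp
using System;
using System.IO;

// Hypothetical stand-alone version of the string building for one entry.
static class VtFixupSketch
{
    public static string Describe(int rva, short count, short type)
    {
        string s = ".vtfixup [" + count.ToString("X") + "] ";
        if ((type & 0x01) != 0) s += "int32 ";
        if ((type & 0x02) != 0) s += "int64 ";
        if ((type & 0x04) != 0) s += "fromunmanaged ";
        return s + "at D_" + rva.ToString("X8");
    }

    static void Main()
    {
        // Hand-made entry: RVA 0x4004, count 1, type 0x05 (int32 + fromunmanaged).
        byte[] entry = { 0x04, 0x40, 0x00, 0x00, 0x01, 0x00, 0x05, 0x00 };
        BinaryReader reader = new BinaryReader(new MemoryStream(entry));
        int rva = reader.ReadInt32();      // little-endian, as in the PE file
        short count = reader.ReadInt16();
        short type = reader.ReadInt16();
        Console.WriteLine(Describe(rva, count, type));
    }
}
```

This prints ".vtfixup [1] int32 fromunmanaged at D_00004004", the same text as the second line of the output above without its comment.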

 

When we look at the DispalyVtFixup function, it simply displays the members of the vtfixuparray.

 

 

Program17.csc

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

DispalyVtFixup();

DisplayTypeDefs();

}

 

public void DisplayTypeDefs ()

{

if ( TypeDefStruct.Length != 2)

{

Console.WriteLine("//");

Console.WriteLine("// ============== CLASS STRUCTURE DECLARATION ==================");

Console.WriteLine("//");

writenamespace = true;

for ( int i = 2 ; i < TypeDefStruct.Length ; i++)

{

if ( GetString(TypeDefStruct[i].name) == "_Deleted"  && streamnames[0] == "#-")

{

continue;

}

if ( ! IsTypeNested(i) )

{

DisplayOneTypePrototype(i);

}

}

}

}

public bool IsTypeNested (int typeindex)

{

if (NestedClassStruct == null)

return false;

for ( int ii = 1 ;  ii< NestedClassStruct.Length ; ii++)

{

if ( NestedClassStruct[ii].nestedclass == typeindex)

return true;

}

return false;

}

public void DisplayOneTypePrototype (int typedefindex)

{

DisplayOneTypeDefStart(typedefindex);

DisplayNestedTypesPrototypes(typedefindex);

DisplayOneTypeDefEnd(typedefindex);

}

public void DisplayOneTypeDefStart (int typerow)

{

string namespacename = NameReserved(GetString(TypeDefStruct[typerow].nspace));

if (  namespacename != "")

{

if ( writenamespace )

{

Console.WriteLine(".namespace {0}" , namespacename );

Console.WriteLine("{"  );

spacefornamespace = 2;

spacesforrest = 4;

}

}

string typestring = "";

if ( IsTypeNested(typerow))

typestring = typestring + CreateSpaces(spacesfornested);

typestring = typestring + CreateSpaces(spacefornamespace);

typestring = typestring + ".class /*02" + typerow.ToString("X6") + "*/ ";

string attributeflags = GetTypeAttributeFlags(TypeDefStruct[typerow].flags , typerow);

Console.WriteLine("{0}{1}{2}" , typestring ,  attributeflags , NameReserved(GetString(TypeDefStruct[typerow].name)));

string tablename = GetTypeDefOrRefTable(TypeDefStruct[typerow].cindex);

int index = GetTypeDefOrRefValue(TypeDefStruct[typerow].cindex);

string typeextends = "";

if ( tablename == "TypeRef" )

{

typeextends = DisplayTypeRefExtends(index);

}

if ( tablename == "TypeDef" )

{

typeextends =  GetNestedTypeAsString(index) + DisplayTypeDefExtends(index);

}

if ( typeextends.Length != 0)

{

typestring = "";

if ( IsTypeNested(typerow))

typestring = typestring + CreateSpaces(spacesfornested);

typestring = typestring + CreateSpaces(spacefornamespace);

typestring = typestring + "       extends " + typeextends;

Console.WriteLine(typestring);

}

string interfacestring = DisplayAllInterfaces(typerow);

if ( interfacestring.Length != 0)

{

typestring = "";

if ( IsTypeNested(typerow))

typestring = typestring + CreateSpaces(spacesfornested);

typestring = typestring + CreateSpaces(spacefornamespace);

typestring = typestring + "       implements " + interfacestring;

Console.Write(typestring);

}

typestring = "";

if ( IsTypeNested(typerow))

typestring = typestring + CreateSpaces(spacesfornested);

typestring = typestring + CreateSpaces(spacefornamespace);

typestring = typestring + "{";

Console.WriteLine(typestring);

}

public string GetTypeAttributeFlags (int typeattributeflags , int typeindex)

{

string returnstring = "";

int visibiltymask = typeattributeflags & 0x07;

string visibiltymaskstring="";

if ( visibiltymask == 0)

visibiltymaskstring = "private ";

if ( visibiltymask == 1)

visibiltymaskstring = "public ";

if ( visibiltymask == 2)

visibiltymaskstring = "nested public ";

if ( visibiltymask == 3)

visibiltymaskstring = "nested private ";

if ( visibiltymask == 4)

visibiltymaskstring = "nested family ";

if ( visibiltymask == 5)

visibiltymaskstring = "nested assembly ";

if ( visibiltymask == 6)

visibiltymaskstring = "nested famandassem ";

if ( visibiltymask == 7)

visibiltymaskstring = "nested famorassem ";

int classlayoutmask = typeattributeflags & 0x18;

string classlayoutstring = "";

if ( classlayoutmask == 0)

classlayoutstring = "auto ";

if ( classlayoutmask == 0x08)

classlayoutstring = "sequential ";

if ( classlayoutmask == 0x10)

classlayoutstring = "explicit ";

string interfacestring = "";

if ( (typeattributeflags & 0x20) == 0x20)

interfacestring  = "interface ";

string abstractstring = "";

if ( (typeattributeflags & 0x80) == 0x80)

abstractstring  = "abstract ";

string sealedstring = "";

if ( (typeattributeflags & 0x100) == 0x100)

sealedstring  = "sealed ";

string specialnamestring = "";

if ( (typeattributeflags & 0x400) == 0x400)

specialnamestring  = "specialname ";

string importstring = "";

if ( (typeattributeflags & 0x1000) == 0x1000)

importstring  = "import ";

string serializablestring = "";

if ( (typeattributeflags & 0x2000) == 0x2000)

serializablestring = "serializable ";

int stringformatmask = typeattributeflags & 0x30000;

string stringformastring = "";

if ( stringformatmask == 0)

stringformastring = "ansi ";

if ( stringformatmask == 0x10000)

stringformastring = "unicode ";

if ( stringformatmask == 0x20000)

stringformastring = "autochar ";

string beforefieldinitstring = "";

if ( (typeattributeflags & 0x00100000) == 0x00100000)

beforefieldinitstring = "beforefieldinit ";

//string rtspecialnamestring = "";

//if ( (typeattributeflags & 0x800) == 0x800)

//rtspecialnamestring = "rtspecialname ";

if ( IsTypeNested(typeindex) )

returnstring = interfacestring + abstractstring + classlayoutstring + stringformastring +  serializablestring + sealedstring + importstring + visibiltymaskstring  + beforefieldinitstring;

else

returnstring = interfacestring + visibiltymaskstring + abstractstring + classlayoutstring + stringformastring + importstring + serializablestring + sealedstring + specialnamestring + beforefieldinitstring ;

return returnstring;

}

public string DisplayTypeDefExtends (int typedefindex)

{

if ( typedefindex == 0)

return "";

string name = NameReserved(GetString(TypeDefStruct[typedefindex].name));

string returnstring = NameReserved(GetString(TypeDefStruct[typedefindex].nspace));

if ( returnstring.Length != 0)

returnstring = returnstring + ".";

returnstring = returnstring + name  + "/* 02" + typedefindex.ToString("X6") + " */";

return returnstring;

}

public string GetNestedTypeAsString(int rowindex)

{

string netsedtypestring = "";

string namespaceandnameparent2 = "";

string namespaceandnameparent3= "";

if ( IsTypeNested(rowindex) )

{

int rowindexparent = GetParentForNestedType(rowindex);

if ( IsTypeNested(rowindexparent) )

{

int rowindexparentparent = GetParentForNestedType(rowindexparent);

if ( IsTypeNested(rowindexparentparent) )

{

int rowindexp3 = GetParentForNestedType(rowindexparentparent);

string nameparent3 = NameReserved(GetString(TypeDefStruct[rowindexp3].name));

namespaceandnameparent3= NameReserved(GetString(TypeDefStruct[rowindexp3].nspace));

if ( namespaceandnameparent3.Length != 0)

namespaceandnameparent3 = namespaceandnameparent3 + ".";

namespaceandnameparent3= namespaceandnameparent3 + nameparent3 + "/* 02" + rowindexp3.ToString("X6") + " *//";

}

string nameparent2 = NameReserved(GetString(TypeDefStruct[rowindexparentparent].name));

namespaceandnameparent2 = NameReserved(GetString(TypeDefStruct[rowindexparentparent].nspace));

if ( namespaceandnameparent2.Length != 0)

namespaceandnameparent2 = namespaceandnameparent2 + ".";

namespaceandnameparent2 = namespaceandnameparent3 + namespaceandnameparent2 + nameparent2 + "/* 02" + rowindexparentparent.ToString("X6") + " *//";

}

string nameparent1 = NameReserved(GetString(TypeDefStruct[rowindexparent].name));

netsedtypestring = NameReserved(GetString(TypeDefStruct[rowindexparent].nspace));

if ( netsedtypestring.Length != 0)

netsedtypestring = netsedtypestring + ".";

netsedtypestring = namespaceandnameparent2 + netsedtypestring + nameparent1 + "/* 02" + rowindexparent.ToString("X6") + " *//";

}

return netsedtypestring;

}

public int GetParentForNestedType (int typeindex)

{

int ii = 0;

if ( NestedClassStruct == null)

return 0;

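for ( ii = 1 ; ii < NestedClassStruct.Length ; ii++)

{

// rows start at 1, as in IsTypeNested; return the enclosing class of the matching row

if ( typeindex == NestedClassStruct[ii].nestedclass )

return NestedClassStruct[ii].enclosingclass;

}

// no matching row means the type is not nested

return 0;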

}

public string DisplayTypeRefExtends (int typerefindex)

{

string returnstring = "";

int resolutionscope = TypeRefStruct[typerefindex].resolutionscope;

string resolutionscopetable = GetResolutionScopeTable(resolutionscope);

int resolutionscopeindex = GetResolutionScopeValue(resolutionscope);

string dummy = "";

if ( resolutionscopetable == "Module")

{

}

if ( resolutionscopetable == "AssemblyRef")

{

returnstring = "[" + NameReserved(GetString(AssemblyRefStruct[resolutionscopeindex].name)) ;

returnstring = returnstring + "/* 23" + resolutionscopeindex.ToString("X6") + " */]";

}

if ( resolutionscopetable == "ModuleRef")

{

returnstring = "[.module " + NameReserved(GetString(ModuleRefStruct[resolutionscopeindex].name))  ;

returnstring = returnstring + "/* 1A" + resolutionscopeindex.ToString("X6") + " */]";

}

if ( resolutionscopetable == "TypeRef")

{

int resolutionscopeindex1 = GetResolutionScopeValue(TypeRefStruct[resolutionscopeindex].resolutionscope );

string resolutionscopetable1 = GetResolutionScopeTable(TypeRefStruct[resolutionscopeindex].resolutionscope );

if ( resolutionscopetable1 == "AssemblyRef")

{

dummy = "[" + NameReserved(GetString(AssemblyRefStruct[resolutionscopeindex1].name)) + "/* 23" + resolutionscopeindex1.ToString("X6") + " */]";

string nspace1 = NameReserved(GetString(TypeRefStruct[resolutionscopeindex].nspace));

if ( nspace1 != "")

nspace1 = nspace1 + ".";

dummy = dummy + nspace1 + NameReserved(GetString(TypeRefStruct[resolutionscopeindex].name)) + "/* 01" + resolutionscopeindex.ToString("X6") + " *//";

}

}

int namespaceindex = TypeRefStruct[typerefindex].nspace;

string nspace = NameReserved(GetString(namespaceindex));

returnstring = returnstring + nspace ;

if ( nspace.Length  != 0)

returnstring = returnstring + ".";

int nameindex = TypeRefStruct[typerefindex].name;

returnstring = dummy + returnstring + NameReserved(GetString(nameindex)) + "/* 01" + typerefindex.ToString("X6") + " */";

return returnstring;

}

public string DisplayAllInterfaces (int typeindex)

{

string returnstring = "";

if ( InterfaceImplStruct == null || InterfaceImplStruct.Length == 1)

return "";

for ( int i = 1 ; i < InterfaceImplStruct.Length ; i++)

{

if ( typeindex == InterfaceImplStruct[i].classindex  )

{

string codedtablename = GetTypeDefOrRefTable(InterfaceImplStruct[i].interfaceindex);

int interfaceindex = GetTypeDefOrRefValue(InterfaceImplStruct[i].interfaceindex);

string interfacename = "";

if ( codedtablename == "TypeRef" )

interfacename  = DisplayTypeRefExtends(interfaceindex);

if ( codedtablename == "TypeDef" )

interfacename  = GetNestedTypeAsString(interfaceindex) + DisplayTypeDefExtends(interfaceindex);

returnstring = returnstring + interfacename;

bool nextclassindex ;

if ( i == (InterfaceImplStruct.Length - 1))

nextclassindex = false;

else if ( typeindex != InterfaceImplStruct[i+1].classindex  )

nextclassindex = false;

else

nextclassindex = true;

if ( nextclassindex )

returnstring = returnstring + ",\r\n                  " + CreateSpaces(spacefornamespace+spacesfornested);

else

returnstring = returnstring + "\r\n";

}

}

return returnstring;

}

public string GetTypeDefOrRefTable (int codedvalue)

{

string returnstring = "";

short tag = (short)(codedvalue & (short)0x03);

if ( tag  == 0)

returnstring = returnstring + "TypeDef";

if ( tag  == 1)

returnstring = returnstring + "TypeRef";

if ( tag  == 2)

returnstring = returnstring + "TypeSpec";

return returnstring;

}

public int GetTypeDefOrRefValue(int codedvalue)

{

return codedvalue >> 2;

}

public void DisplayNestedTypesPrototypes (int typedefindex)

{

if (NestedClassStruct == null)

return ;

for ( int ii = 1 ; ii < NestedClassStruct.Length ; ii++)

{

if (NestedClassStruct[ii].enclosingclass == typedefindex)

{

spacesfornested += 2;

DisplayOneTypePrototype(NestedClassStruct[ii].nestedclass  );

spacesfornested -= 2;

}

}

}

public void DisplayOneTypeDefEnd (int typeindex )

{

string dummy = "";

if ( IsTypeNested(typeindex) )

dummy = dummy + CreateSpaces(spacesfornested);

dummy = dummy + CreateSpaces(spacefornamespace);

dummy = dummy + "} // end of class ";

string classname = NameReserved(GetString(TypeDefStruct[typeindex].name));

dummy = dummy + classname ;

Console.WriteLine(dummy);

string namespacename = NameReserved(GetString(TypeDefStruct[typeindex].nspace));

Console.WriteLine();

if (  namespacename != "")

{

string nspace1 = NameReserved(GetString(TypeDefStruct[typeindex].nspace));

int ii;

for ( ii = typeindex + 1 ; ii < TypeDefStruct.Length - 1 ; ii++)

{

if ( IsTypeNested(ii) )

continue;

break;

}

string nspace2 = "";

if ( ii != TypeDefStruct.Length )

nspace2 = NameReserved(GetString(TypeDefStruct[ii].nspace));

if ( nspace1 != nspace2 )

{

if ( lasttypedisplayed == typeindex && notprototype  )

{

Console.WriteLine();

Console.WriteLine("// =============================================================");

Console.WriteLine();

placedend = true;

}

Console.Write("}");

Console.WriteLine(" // end of namespace {0}", namespacename);

spacefornamespace = 0;

spacesforrest = 2;

writenamespace = true;

Console.WriteLine();

}

else

writenamespace = false;

}

}

 

e.il

.class public abstract beforefieldinit a1

{

.method public static void Main()

{

.entrypoint

}

}

.class private sequential unicode explicit sealed specialname rtspecialname interface _Deleted

{

}

.namespace n1

{

.namespace n2

{

.class public autochar a2

{

.class public a3

{

.class a33

{

.class a333

{

.class a3333

{

}

}

}

}

.class a4

{

}

}

}

}

.class a5

{

}

.class a6 extends a5

{

}

.class a7 extends n1.n2.a2/a3/a33/a333

{

}

.class a71 extends n1.n2.a2/a3

{

}

.class a72 extends n1.n2.a2/a3/a33

{

}

 

.class a8 extends [mscorlib]aaa

{

}

.class a9 extends [mscorlib]ppp.aaa/aa

{

}

.module extern bb

.module e.exe

.class a10 extends [.module bb]a11

{

}

.class a12 extends [.module e.exe]a13

{

}

.assembly ee

{

}

.class a14 extends [ee]a13

{

}

.class interface a15

{

}

.class interface a16

{

}

.class a17 implements a15,a16,[mscorlib]aa

{

}

 

Output

.module extern bb /*1A000001*/

.assembly extern /*23000001*/ mscorlib

{

  .ver 0:0:0:0

}

.assembly extern /*23000002*/ ee

{

  .ver 0:0:0:0

}

.assembly /*20000001*/ ee

{

  .ver 0:0:0:0

}

.module e.exe

// MVID: {DB25BB73-A933-4B68-9124-8F5135AC4D55}

.imagebase 0x00400000

.subsystem 0x00000003

.file alignment 512

.corflags 0x00000001

// Image base: 0x03000000

//

// ============== CLASS STRUCTURE DECLARATION ==================

//

.class /*02000002*/ public abstract auto ansi beforefieldinit a1

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

} // end of class a1

 

.class /*02000003*/ interface private abstract explicit unicode sealed specialname _Deleted

{

} // end of class _Deleted

 

.namespace n1.n2

{

  .class /*02000004*/ public auto autochar a2

         extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

  {

    .class /*02000005*/ auto ansi nested public a3

           extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

    {

      .class /*02000006*/ auto ansi nested private a33

             extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

      {

        .class /*02000007*/ auto ansi nested private a333

               extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

        {

          .class /*02000008*/ auto ansi nested private a3333

                 extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

          {

          } // end of class a3333

 

        } // end of class a333

 

      } // end of class a33

 

    } // end of class a3

 

    .class /*02000009*/ auto ansi nested private a4

           extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

    {

    } // end of class a4

 

  } // end of class a2

 

} // end of namespace n1.n2

 

.class /*0200000A*/ private auto ansi a5

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

} // end of class a5

 

.class /*0200000B*/ private auto ansi a6

       extends a5/* 0200000A */

{

} // end of class a6

 

.class /*0200000C*/ private auto ansi a7

       extends n1.n2.a2/* 02000004 *//a3/* 02000005 *//a33/* 02000006 *//a333/* 02000007 */

{

} // end of class a7

 

.class /*0200000D*/ private auto ansi a71

       extends n1.n2.a2/* 02000004 *//a3/* 02000005 */

{

} // end of class a71

 

.class /*0200000E*/ private auto ansi a72

       extends n1.n2.a2/* 02000004 *//a3/* 02000005 *//a33/* 02000006 */

{

} // end of class a72

 

.class /*0200000F*/ private auto ansi a8

       extends [mscorlib/* 23000001 */]aaa/* 01000002 */

{

} // end of class a8

 

.class /*02000010*/ private auto ansi a9

       extends [mscorlib/* 23000001 */]ppp.aaa/* 01000003 *//aa/* 01000004 */

{

} // end of class a9

 

.class /*02000011*/ private auto ansi a10

       extends [.module bb/* 1A000001 */]a11/* 01000005 */

{

} // end of class a10

 

.class /*02000012*/ private auto ansi a12

       extends a13/* 01000006 */

{

} // end of class a12

 

.class /*02000013*/ private auto ansi a14

       extends [ee/* 23000002 */]a13/* 01000007 */

{

} // end of class a14

 

.class /*02000014*/ interface private abstract auto ansi a15

{

} // end of class a15

 

.class /*02000015*/ interface private abstract auto ansi a16

{

} // end of class a16

 

.class /*02000016*/ private auto ansi a17

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

       implements a15/* 02000014 */,

                  a16/* 02000015 */,

                  [mscorlib/* 23000001 */]aa/* 01000008 */

{

} // end of class a17

 

We now add one more function call, to DisplayTypeDefs, in the abc function. This function simply displays the prototypes of all the classes or types that we have created. In C# we may have types like enums, interfaces or structures, but these are alien concepts in the IL world.

 

These types are all represented by the class directive. An enum in the IL world is a sealed class that extends the System.Enum class. A struct is the same but instead extends the System.ValueType class. Finally an interface carries the interface attribute. Thus in the IL world we use the class directive to represent all types.
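
The same mapping can be observed from C# through reflection. The following is only a small sketch and not part of the disassembler; the class TypeKindSketch and its Kind method are names invented here for illustration, while the reflection properties used (IsInterface, BaseType, IsValueType, IsSealed) are standard members of System.Type.

```csharp
using System;

static class TypeKindSketch
{
    // Classify a runtime type the way the text describes IL types.
    public static string Kind(Type t)
    {
        if (t.IsInterface) return "interface";          // interface attribute is set
        if (t.BaseType == typeof(Enum)) return "enum";  // extends System.Enum
        if (t.IsValueType) return "struct";             // extends System.ValueType
        return "class";
    }

    static void Main()
    {
        Console.WriteLine(Kind(typeof(DayOfWeek)));     // enum
        Console.WriteLine(Kind(typeof(DateTime)));      // struct
        Console.WriteLine(Kind(typeof(IDisposable)));   // interface
        Console.WriteLine(Kind(typeof(string)));        // class
        Console.WriteLine(typeof(DayOfWeek).IsSealed);  // True, enums are sealed
    }
}
```

Whatever the C# keyword used to declare them, all four end up as rows in the same TypeDef table.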

 

In the IL world we are allowed to have methods that are global and thus, unlike in C#, we are permitted to create a valid il file with no type defined. Each time we assemble an il file, a type called <Module> within an empty namespace gets automatically created. Thus our TypeDef table will contain at least one type, come what may.

 

If you remember, our array has one extra member at index 0, and thus we can only display a type if the length of the array that stores the types is larger than 2. There are valid cases where the il file has only global functions and thus only the type called <Module>, which does not have to be displayed.

 

We first display the heading CLASS STRUCTURE DECLARATION and then initialize a variable writenamespace to true. This instance variable will be used to determine whether the namespace directive is to be displayed or not.

 

We then iterate through the for loop, but the loop variable here starts from 2 and not 1, as the dummy class <Module> is disliked by ildasm, which does not display it. If the name of the class is _Deleted and the stream name is #- and not #~, ildasm does not display such a type. If you remember, the older assemblers called the main stream #- and not #~.

 

This is what happens when you stress test the code to catch all exceptions. Yet we are not sure whether we have a working disassembler. A class within a class is a nested class, and the function IsTypeNested checks whether a class is nested. It returns true if it is.

 

Let's first take a peek at what this function is all about. If you take a quick look at the e.il file, you will realize that the class a2 has a class a3 defined within it. This means that class a3 is nested within class a2. It however does not stop there, and class a33 is also created within class a3.

 

This means that class a33 is nested within class a3. To make matters worse, class a333 is nested within class a33, and class a3333 within class a333. The class a2 has one more nested class, a4, which has no classes nested within it. Thus we are allowed to nest classes within each other to our heart's content.

 

Each type we create is stored in the TypeDef table. The NestedClass table has only two fields, the nested class and the enclosing class, both of which are indexes into the TypeDef table. The nested class is defined as a class that is placed inside the text of the enclosing class.

 

Thus the nested class field identifies the nested type and the enclosing class field identifies the class within which the nested class resides. As an example, the class a2 is the enclosing class and the class a3 the nested class.

 

However there will also be a record where the class a3 is the enclosing or parent class and the class a33 the nested class. All that we do in this function is loop through each record of the nested class table, and if the type passed as the parameter matches the nestedclass field we return true.

 

If we fall out of the loop we know that no match has been found, and we return false. The two fields must index valid rows in the TypeDef table. Neither field can index the TypeRef table, where we store all the types that we refer to but that exist somewhere else. Obviously duplicate rows are not allowed.
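
To make the lookup concrete, here is a minimal self-contained sketch of the same search. The row values are taken from the tokens in the output of e.il (a2 is row 4, a3 row 5, a33 row 6, a333 row 7, a3333 row 8 and a4 row 9); the order of the rows in the table, the struct NestedClassRow and the class NestedLookupSketch are assumptions made only for this illustration.

```csharp
using System;

// One row of the NestedClass table: both fields are row numbers in TypeDef.
struct NestedClassRow
{
    public int nestedclass;
    public int enclosingclass;

    public NestedClassRow(int nested, int enclosing)
    {
        nestedclass = nested;
        enclosingclass = enclosing;
    }
}

static class NestedLookupSketch
{
    // Row 0 is an unused dummy so that array indexes match the 1-based row numbers.
    public static readonly NestedClassRow[] Rows =
    {
        new NestedClassRow(0, 0),
        new NestedClassRow(5, 4),   // a3    inside a2
        new NestedClassRow(6, 5),   // a33   inside a3
        new NestedClassRow(7, 6),   // a333  inside a33
        new NestedClassRow(8, 7),   // a3333 inside a333
        new NestedClassRow(9, 4),   // a4    inside a2
    };

    // True if some row names typeindex as its nested class.
    public static bool IsNested(int typeindex)
    {
        for (int i = 1; i < Rows.Length; i++)
            if (Rows[i].nestedclass == typeindex)
                return true;
        return false;
    }

    static void Main()
    {
        Console.WriteLine(IsNested(5));   // True,  a3 is nested
        Console.WriteLine(IsNested(4));   // False, a2 is a top level type
    }
}
```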

 

A nested type can have only one parent or enclosing type. Thus the nestedclass table cannot have two rows with the same value for the nestedclass field but differing enclosing class field values as that would mean that it is nested within two classes which is not possible.
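
Because a nested type has exactly one enclosing type, the full name of a deeply nested type can be built by walking from child to parent until a top level type is reached. The sketch below does this with a loop instead of the fixed three levels that GetNestedTypeAsString handles. It is an illustration only: the dictionaries stand in for the NestedClass and TypeDef tables of e.il, and the class NestedPathSketch and its FullPath method are invented names.

```csharp
using System;
using System.Collections.Generic;

static class NestedPathSketch
{
    // nested row -> enclosing row; a dictionary key can occur only once,
    // which mirrors the rule that a nested type has a single parent.
    static readonly Dictionary<int, int> Parent = new Dictionary<int, int>
    {
        { 5, 4 }, { 6, 5 }, { 7, 6 }, { 8, 7 }, { 9, 4 }
    };

    // TypeDef row -> display name (the namespace is added for the top level type only).
    static readonly Dictionary<int, string> Names = new Dictionary<int, string>
    {
        { 4, "n1.n2.a2" }, { 5, "a3" }, { 6, "a33" },
        { 7, "a333" }, { 8, "a3333" }, { 9, "a4" }
    };

    public static string FullPath(int row)
    {
        string path = Names[row];
        int parent;
        // Keep prefixing the enclosing type until no parent row exists.
        while (Parent.TryGetValue(row, out parent))
        {
            path = Names[parent] + "/" + path;
            row = parent;
        }
        return path;
    }

    static void Main()
    {
        Console.WriteLine(FullPath(7));   // n1.n2.a2/a3/a33/a333
    }
}
```

The printed path is the same form that appears in the extends clause of class a7, minus the token comments.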

 

The class a2 will appear twice in the nested class table as the value of the enclosing class field as it contains two nested classes a3 and a4. The nestedclass fields have different values however. The display of the class prototype is carried out by a function DisplayOneTypePrototype.

 

We will only call this function if the class is not a nested class. Nested classes do not have an independent existence and they are displayed within their enclosing class. Nested classes are displayed by their parent class, but the TypeDef table does not distinguish between nested and non-nested classes; each is given one row.

 

Thus there is no column in the TypeDef table that tells us that we have a nested class on our hands. The only way we know that a class is nested is by looking at the nestedclass field of the nested class table.

 

Function call after function call: are we out of our minds, or are we going to get an award for the maximum use of functions in our code? The answer is, as usual, somewhere in the middle. The point is that we are going to display the type prototypes twice, once now, and once again when we display the types with all sorts of things like fields, methods etc. within them.

 

No point in writing the same code twice. Thus the function DisplayOneTypePrototype calls three other functions to do the job. The first, DisplayOneTypeDefStart, displays the class name and the attributes. Before we display the type we need to figure out whether it is part of a namespace or not.

 

In the il file we have placed the type a2 within two namespace directives, but ilasm very smartly converts them into a single namespace with a dot within it. We use the string namespacename to store the namespace name, or an empty string if one does not exist. We then check that this variable is not empty and then write the namespace directive.

 

As this is a top level directive, it needs no spaces in front of it. We then place the open brace and then initialize two very crucial variables. The spacefornamespace variable is either 2 or its default 0. If the type is within a namespace, we have to add two spaces for indentation, and if there is no namespace, the spaces for a namespace are zero.

 

The spacesforrest will be explained in a short while. The variable writenamespace is true and hence the inner if statement is true. We will explain the use of this variable later.

 

Most of the time we will not use the WriteLine function to write something out piece by piece, but use a string to store what we want to write and then use a single WriteLine to write the entire string out for us. This is where we use typestring to store the entire string for the class directive.

 

We first need to know whether our class is a nested class or not. If it is then we add to the typestring variable the number of spaces stored in the spacesfornested variable. This variable has a value of 2 at present but we will show you later that each time we come across a nested class we increase this variable by 2.

 

We then add zero or two spaces depending on whether the type has a namespace or not. Remember this variable spacefornamespace has a value of zero or two only. Now that we have written out the initial spaces, we write out the word .class and then, within comments, the table number plus the row number, or what is called a token.
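
The indentation rule just described can be sketched on its own. The body of CreateSpaces is not shown in this listing, so the version below is an assumption about what it does; the class IndentSketch and the method Indent are likewise hypothetical names used only for illustration.

```csharp
using System;

static class IndentSketch
{
    // Assumed behaviour of CreateSpaces: a string of 'count' blanks.
    public static string CreateSpaces(int count)
    {
        return new string(' ', count);
    }

    // Leading blanks for a .class line: the nesting depth counts only for
    // nested types, the namespace indent (0 or 2) counts for every type.
    public static string Indent(bool isNested, int spacesForNested, int spaceForNamespace)
    {
        string indent = "";
        if (isNested)
            indent = indent + CreateSpaces(spacesForNested);
        indent = indent + CreateSpaces(spaceForNamespace);
        return indent;
    }

    static void Main()
    {
        // a2 is a top level type in a namespace, a3 is nested one level, a33 two levels.
        Console.WriteLine(Indent(false, 0, 2) + ".class a2");
        Console.WriteLine(Indent(true, 2, 2) + ".class a3");
        Console.WriteLine(Indent(true, 4, 2) + ".class a33");
    }
}
```

The three lines come out indented by 2, 4 and 6 blanks, which matches the indentation of a2, a3 and a33 in the output above.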

 

This token only gets written out if the /ALL option is specified to the disassembler and is very useful as it tells us the table and row number the entity belongs to. Everything in the metadata is stored in tables and we are simply displaying what is there in these tables. Each type has lots of attributes and we use the function GetTypeAttributeFlags to return all the attributes as a string.
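
A token is simply the table number in the top byte and the row number in the lower three bytes. The program builds it by gluing the text "02" to the row formatted as six hex digits; the sketch below shows the equivalent arithmetic. The class TokenSketch and the method Token are names made up for this example.

```csharp
using System;

static class TokenSketch
{
    // Table number in the top byte, row number in the low three bytes,
    // printed as eight hex digits.
    public static string Token(int table, int row)
    {
        int token = (table << 24) | row;
        return token.ToString("X8");
    }

    static void Main()
    {
        // 0x02 is the TypeDef table, 0x23 the AssemblyRef table.
        Console.WriteLine(".class /*" + Token(0x02, 2) + "*/ a1");   // .class /*02000002*/ a1
        Console.WriteLine("/* " + Token(0x23, 1) + " */");           // /* 23000001 */
    }
}
```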

 

The flags member of the TypeDef table is a bit mask of all the attributes, and we also pass the row of the type. Let's move on to this function to figure out how the bit mask is decoded.

 

The designers of the type attributes were very meticulous. Not only did they specify which bits signify what attributes, but they also created logical groups for them. Thus the first three bits specify the visibility attributes. We first bitwise AND with 7 to cull out the first three bits and then check which bits are on or off.
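
The logical groups are also visible from C#, since the System.Reflection.TypeAttributes enumeration exposes one mask per group. The sketch below only prints those masks so they can be compared with the constants 0x07, 0x18 and 0x30000 used in GetTypeAttributeFlags; the class AttributeMaskSketch and its Hex helper are invented for the example.

```csharp
using System;
using System.Reflection;

static class AttributeMaskSketch
{
    // Format an enum value as a hex literal, e.g. 0x18.
    public static string Hex(TypeAttributes value)
    {
        return "0x" + ((int)value).ToString("X");
    }

    static void Main()
    {
        Console.WriteLine("visibility    " + Hex(TypeAttributes.VisibilityMask));      // 0x7
        Console.WriteLine("layout        " + Hex(TypeAttributes.LayoutMask));          // 0x18
        Console.WriteLine("semantics     " + Hex(TypeAttributes.ClassSemanticsMask));  // 0x20
        Console.WriteLine("string format " + Hex(TypeAttributes.StringFormatMask));    // 0x30000
    }
}
```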

 

Thus we have 8 visibility values and each one is mutually exclusive of the others. Thus private and public will never be on together. Strictly speaking this set of bits is called visibility and accessibility. A type that is not nested within another type will be either public or private and it will have no accessibility.

 

A programming language may introduce any visibility attributes it likes, but in IL a top level type can only be private or public. Nested types instead have no visibility but have one of six accessibility attributes. These are nested public, nested private, nested family, nested assembly, nested family and assembly or famandassem, and finally nested family or assembly or famorassem.

 

Family is protected in C#, where the derived classes have access, and assembly restricts access to classes in the same assembly. As always in life there are defaults: private for non nested types and nested private for nested types.
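
Since the eight values are consecutive numbers from 0 to 7, the chain of if statements in GetTypeAttributeFlags can also be written as a table lookup. This is only an alternative sketch with the same strings; the class VisibilitySketch is a hypothetical name and not part of the program.

```csharp
using System;

static class VisibilitySketch
{
    // Index = value of the low three bits of the flags.
    static readonly string[] Names =
    {
        "private ", "public ", "nested public ", "nested private ",
        "nested family ", "nested assembly ", "nested famandassem ", "nested famorassem "
    };

    public static string Visibility(int typeattributeflags)
    {
        return Names[typeattributeflags & 0x07];
    }

    static void Main()
    {
        // 0x00100081 = public (0x01) + abstract (0x80) + beforefieldinit (0x100000), as for class a1.
        Console.WriteLine(Visibility(0x00100081));   // public
        Console.WriteLine(Visibility(0x00000003));   // nested private
        Console.WriteLine(Visibility(0x00000000));   // private, the default
    }
}
```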

 

Along with visibility and accessibility we have one more concept called hiding or better still method name hiding. Hiding controls which method names that we get or inherit from a base type are available to the compiler for compile time name binding.

 

A nested type can have no visibility attributes as this privilege is reserved for top level types only. Here we have only two possibilities: private, visible to types within the same assembly, or public, visible to types anywhere in the world irrespective of the assembly they reside in.

 

Nested types are different, as the accessibility further refines which set of methods can access the type. The visibility is decided by the parent or enclosing or top level type. This means that the nested type cannot be more visible than the enclosing type. This makes sense, as it would be odd for the enclosing type to be private and the nested type public.

 

If we do restrict the enclosing type to the assembly, then even though the nested type is public, the enclosing type decides and the nested type is only available within the assembly. However if the enclosing type is public, to be seen everywhere, but the nested type is private, the wishes of the nested type hold and it is not visible outside the assembly.

 

The same logical model is used to describe both the top level and nested types. The word class is what we have been using all our lives, and we should be using type instead. Interfaces, structures and other value types are not classes, and hence we should use the broader term type instead of class. Old habits die hard.

 

Hiding does not apply to types but to the methods of a type, and it is a compile time phenomenon, not a runtime one. The Common Type System offers us one bit to distinguish between the two mechanisms of hiding. The first is hide by name, where a method of a certain name hides all base class methods with that name.

 

The more complex one is hide by name and signature, where the name together with the data types of the parameters identifies the method. Thus two methods having the same name but different parameters, in type or number, are treated as different methods. There is no runtime support for hiding.

 

The CLI treats all methods as if they used the name and signature method of hiding. The hidebysig method attribute is what marks hide by name and signature; a method without it hides by name only. The next set of attributes are the type layout attributes. We do what we did earlier: mask off the bits that represent this logical family and then check which bits are on.

 

The type layout attributes take any of three values, auto, explicit and sequential. If these bits are zero, then the auto attribute is on. The only way we can determine this is by masking off all the bits other than those that represent the layout attributes. We cannot check the entire flags value for zero, as the private visibility value is also zero.
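
A small sketch makes the point about masking clearer. A public type with auto layout has a flags value of 1, which is not zero, yet its layout bits are zero. The class LayoutSketch below is a hypothetical stand-alone version of the layout part of GetTypeAttributeFlags.

```csharp
using System;

static class LayoutSketch
{
    // Bits 3 and 4 (mask 0x18) hold the layout: 0 auto, 0x08 sequential, 0x10 explicit.
    public static string Layout(int typeattributeflags)
    {
        int layout = typeattributeflags & 0x18;
        if (layout == 0x08) return "sequential ";
        if (layout == 0x10) return "explicit ";
        if (layout == 0x00) return "auto ";
        return "";   // 0x18 is not a defined layout value
    }

    static void Main()
    {
        Console.WriteLine(Layout(0x01));          // auto, although the flags are not zero
        Console.WriteLine(Layout(0x08 | 0x01));   // sequential
        Console.WriteLine(Layout(0x10));          // explicit
    }
}
```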

 

The type layout decides who is in charge of arranging the fields of an instance of a type. A type can have only one of these layout attributes. If we specify no attribute, the default is auto. The auto attribute tells the CLI that the programmer does not take responsibility for laying out the fields in memory.

 

Let the CLI place the fields wherever it wants; the user lays down no conditions at all. The only problem with this way is that we lack flexibility, and at times we would want to decide how things are laid out, at the dinner table or in memory. At these times we use the explicit attribute, where we decide where the fields will be laid out in memory.

 

The last option is sequential, where we let the CLI lay the fields out with one small condition. The fields should be placed one after the other in memory, and the metadata tables decide which fields come in what order. We should normally, in our own interest, let the CLI decide how to lay out fields in memory.

 

We should use sequential for languages like C/C++, as we get the best of both worlds. We get verifiable output and we also follow the rules of these languages. With explicit, as we said, we are the masters of the memory layout. As we told you earlier, a type can be a class, a value type or an interface.

 

The type semantics attribute tells us just that, and it is a single bit which, if on, tells us that we have an interface. If the bit is off and the type is derived from System.ValueType, either directly or indirectly, we have a value type, and if none of the above is true, a simple class. The size of a value type at runtime is limited to 1 MB or 0x100000 bytes; the implementation we are using takes us up to 0x3f0000, and this value may be reduced.

 

Honestly, why people want such large value types is beyond our understanding. A value type is a class that becomes a value type for reasons of efficiency. The basic C# types like int and char are all value types. We now have two inheritance attributes, abstract and sealed.

 

The abstract and sealed attributes are not mutually exclusive, and thus we have two separate strings to hold their values. An abstract class cannot be used directly, i.e. we cannot instantiate it using new. This is because it contains functions without code, and unless we implement these in a derived class we cannot use the abstract class.

 

We use abstract classes when we want the user to implement some methods before using our class. Thus an abstract type normally contains abstract methods that the derived class has to supply code for. Sealed classes are a different kettle of fish. These classes cannot be derived from or have subclasses.

 

One of the reasons for doing this is efficiency, as the compiler knows that no one can derive from these classes and can thus generate more efficient code. Also, at times I do not want my classes to be tampered with, and thus by using the sealed attribute I let no programmer modify my code. Use as is or don’t.

 

Virtual functions are used so that derived classes can override them, and in a sealed class they become common instance methods as there are no derived classes to override them. The big problem with sealed classes is that they stop the user from extending the class hierarchy.

 

One reason to use sealed is when a class that implements different interfaces depends on implementation details that will not be visible to sub classes. A type that is both abstract and sealed should have only static members, as abstract makes it a class we cannot instantiate and sealed does not let us derive from it.
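
C# programmers meet this combination every day without noticing it, because the C# compiler emits a static class as a type that is both abstract and sealed. The sketch below checks this through reflection; Helpers and StaticClassSketch are names invented for the example.

```csharp
using System;

// A C# static class: only static members, no instances, no subclasses.
static class Helpers
{
    public static int Twice(int x) { return 2 * x; }
}

static class StaticClassSketch
{
    public static bool IsAbstractAndSealed(Type t)
    {
        return t.IsAbstract && t.IsSealed;
    }

    static void Main()
    {
        Console.WriteLine(IsAbstractAndSealed(typeof(Helpers)));   // True
        Console.WriteLine(IsAbstractAndSealed(typeof(string)));    // False, sealed but not abstract
    }
}
```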

 

This is also what is called a namespace in some languages. We now come to the three interoperation attributes ansi, autochar or unicode. We are running in what is called managed code and the earlier programs we ran in C/C++ before the advent of .Net were running in unmanaged code.

 

We would like to call unmanaged code from managed code, and thus these attributes tell the system how to deal with strings. Thus if the return type or a parameter is of type string, does any specific conversion need to be done? This conversion is called marshalling. Once again these values are mutually exclusive and the default is ansi.

 

This means that the marshalling will be done from either side as ansi strings. The unicode attribute specifies that strings on both sides will be in unicode. Finally the best is autochar, which will use either unicode or ansi depending upon the platform we are running on. All decent platforms today support unicode, as computer people have realized that the world does not speak English, forget about good English.

 

Finally we come to the special handling attributes that are four in number and can be combined in any manner that we desire. These attributes are also meant for the tools because if the CLI treats an item in a different or special way, why hide it from the tools.

 

The special name attribute means that the item name is special not only to the CLI but also to tools. One example is the name .ctor which stands for a non static constructor. The attribute rtspecialname is like special name but with a small difference. It very clearly means that the CLI understands this item.

 

There are no types as of today that will be marked with this attribute as it is reserved for future use. Any item that will be marked rtspecialname  will also be marked special name. We have placed this code in comments as if we use this attribute, the assembler ignores it.

 

When a static method is called from a type we do not have to create an instance of the type. By using the beforefieldinit attribute we are telling the CLI that it need not initialize the type when a static method is called. The default is that it does initialize the type. Serialization is the art of writing data to disk or a data stream.

 

By specifying the serializable attribute we are allowing the CLI serializer to write the type to a data stream. Finally the import attribute tells the CLI that this type is imported from a COM type library. Before .Net came along, the mantra at Microsoft was ActiveX, which was based on COM or the Component Object Model.

 

Now that we have individual strings from each logical family, we need to display them in a certain order. This order specifies that the interface attribute comes first and the beforefieldinit attribute last. The attributes are not displayed in the order that we wrote them. Also the nested types follow a different order. It took us a long time to figure out this order.
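
The idea of picking one string from each logical family and joining them in a fixed order can be sketched as follows. This is only a minimal illustration and not the code of our disassembler. The class name TypeFlagsSketch and the function Describe are made up for this example, the bit values are the ones the CLI specification gives for type attributes, and the exact position of the middle words (import, serializable, sealed and the nested visibility) is our assumption. Only interface first and beforefieldinit last come from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: turns the TypeDef flags field into the attribute words.
public static class TypeFlagsSketch
{
    // Bit values of the type attributes as given in the CLI specification.
    const int VisibilityMask = 0x07;
    const int LayoutMask = 0x18;
    const int Interface = 0x20;
    const int Abstract = 0x80;
    const int Sealed = 0x100;
    const int SpecialName = 0x400;
    const int RTSpecialName = 0x800;
    const int Import = 0x1000;
    const int Serializable = 0x2000;
    const int StringFormatMask = 0x30000;
    const int BeforeFieldInit = 0x100000;

    // Visibility values 2 to 7 are the nested variants.
    static readonly string[] NestedVisibility =
        { "", "", "nested public", "nested private", "nested family",
          "nested assembly", "nested famandassem", "nested famorassem" };

    public static string Describe(int flags)
    {
        List<string> words = new List<string>();
        int visibility = flags & VisibilityMask;

        if ((flags & Interface) != 0) words.Add("interface");   // always first
        if (visibility == 0) words.Add("private");
        if (visibility == 1) words.Add("public");
        if ((flags & Abstract) != 0) words.Add("abstract");

        switch (flags & LayoutMask)          // mutually exclusive family
        {
            case 0x00: words.Add("auto"); break;
            case 0x08: words.Add("sequential"); break;
            case 0x10: words.Add("explicit"); break;
        }
        switch (flags & StringFormatMask)    // mutually exclusive family
        {
            case 0x00000: words.Add("ansi"); break;
            case 0x10000: words.Add("unicode"); break;
            case 0x20000: words.Add("autochar"); break;
        }

        // The order of the next few words is an assumption of this sketch.
        if ((flags & Import) != 0) words.Add("import");
        if ((flags & Serializable) != 0) words.Add("serializable");
        if ((flags & Sealed) != 0) words.Add("sealed");
        if (visibility >= 2) words.Add(NestedVisibility[visibility]);
        if ((flags & SpecialName) != 0) words.Add("specialname");
        if ((flags & RTSpecialName) != 0) words.Add("rtspecialname");
        if ((flags & BeforeFieldInit) != 0) words.Add("beforefieldinit"); // always last

        return string.Join(" ", words.ToArray());
    }
}
```

Calling TypeFlagsSketch.Describe(0) gives private auto ansi and Describe(3) gives auto ansi nested private, which shows how the nested visibility lands after the layout and string words.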

 

Now that we have all the attributes, we write out the name of the type along with the attributes. Every type derives from a type, and if we do not specify one, the type derives from System.Object in the assembly mscorlib. Also a type can implement a number of interfaces but it can derive from a single type only.

 

We use the extends keyword to specify the base type. In the same vein the implements keyword is used for interfaces. These words are what are also used in Java, the arch enemy of the .net world. If the interface has say five functions, the type implementing this interface has to implement all the five functions before it can be instantiated. 

 

Thus the type has to fulfill a contract of implementing all the methods specified in the interface before the type can be used. The field extends, or what we call cindex, tells us which class this type extends from. An extends specifies only one type whereas the implements clause can specify many interfaces.

 

These two bits of information are kept separately in the metadata. The cindex field specifies a TypeDefOrRef coded index. This coded index specifies one of three tables, TypeDef, TypeRef or the TypeSpec table. In this case it cannot specify the TypeSpec table as that table has a single field and there is no name attached to the type.

 

Thus we cannot use the types found in the TypeSpec table for the extends clause. We use the function GetTypeDefOrRefTable to figure out which table the coded index points to. If the coded index refers to the TypeRef table, it means that the type used is defined in another assembly or module, and had it been the TypeDef table, in the same module.

 

We use the functions DisplayTypeRefExtends and DisplayTypeDefExtends to figure out the name of the class. Over to the DisplayTypeRefExtends function first.

 

If you have not noticed, every coded index function comes in a pair: the name ending with Table gives us the coded index table, the name ending with Value gives us the row number in that table. The function DisplayTypeRefExtends is pretty large and thus we advise you to have a big hot cup of coffee before you start reading.
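
To make the pair idea concrete, here is a small sketch of how such a pair could look. It assumes the layout the CLI specification describes for both the TypeDefOrRef and the ResolutionScope coded indexes: the low 2 bits hold a tag that selects the table and the remaining bits hold the row number. Only the name GetTypeDefOrRefTable comes from our program; the other three function names and the class name are invented here to show the pattern.

```csharp
// Hypothetical sketch of the Table/Value function pairs for two coded indexes.
public static class CodedIndexSketch
{
    // Tag order as given in the CLI specification.
    static readonly string[] TypeDefOrRefTables = { "TypeDef", "TypeRef", "TypeSpec" };
    static readonly string[] ResolutionScopeTables = { "Module", "ModuleRef", "AssemblyRef", "TypeRef" };

    // Which table does a TypeDefOrRef coded index point to?
    public static string GetTypeDefOrRefTable(int codedindex)
    {
        int tag = codedindex & 0x03;                 // low 2 bits select the table
        return tag < TypeDefOrRefTables.Length ? TypeDefOrRefTables[tag] : "Invalid";
    }

    // Which row in that table?
    public static int GetTypeDefOrRefValue(int codedindex)
    {
        return codedindex >> 2;                      // the remaining bits are the row number
    }

    // The same pair for the ResolutionScope coded index of the TypeRef table.
    public static string GetResolutionScopeTable(int codedindex)
    {
        return ResolutionScopeTables[codedindex & 0x03];
    }

    public static int GetResolutionScopeValue(int codedindex)
    {
        return codedindex >> 2;
    }
}
```

A coded index value of 5 is binary 101, so GetTypeDefOrRefTable(5) gives TypeRef and GetTypeDefOrRefValue(5) gives row 1. The concept is the same for every coded index; only the list of tables and the number of tag bits change.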

 

You have been warned. This function is passed the type ref row number as whenever we use a type defined somewhere else, the type ref table gets a row added. This type ref row has a coded index called the resolution scope.

 

This coded index points to four tables, Module, ModuleRef, AssemblyRef and itself, TypeRef. In spite of it pointing to four tables we will explain why we have only checked for three tables, ModuleRef, TypeRef and AssemblyRef. The first value we check is AssemblyRef as this should be the most common.

 

When you look at class a8 we are extending it from the class aaa in the mscorlib assembly. The resolutionscopeindex variable now points to a row number in the AssemblyRef table and we display the name of the assembly ref struct and the row number in square brackets. The class a9 also meets the same fate but this class is a nested class.

 

The classes a10 and a12 are slightly similar. In the case of class a10 we are specifying that the external class a11 is in a module called bb. This module is defined by using the module extern directive. Thus in the square brackets we can specify either an assembly ref or an external module.

 

The class can be present in the same assembly but in a separate module. In this case we write out the words .module and use the variable resolutionscopeindex as an index into the ModuleRef table. We also write out the table number as 1A. Thus the above two coded index tables are similar in concept, only the table changes.

 

We come to class a12 where we specify the module directive again but use the name of the module as the name of the current module e.exe. The assembler is smart enough to know that we are referring to the same module that we are in and ignores our module directive.

 

Thus the resolutionscope index can have a coded index of Module but we have to write no code to handle it. Remember, if something is in the same module we do not have to qualify it. We only need to specify where an entity is if it is somewhere else. This somewhere else can be a different module than the one we are in or another assembly.

 

Now let's look at the fourth table, TypeRef, which means that the coded index is pointing to a row in the TypeRef table itself. This happens in the case of class a9, which extends from a nested class in the assembly ref mscorlib. The class a8 also extends from the class aaa in the assembly mscorlib, but as it is not nested the coded index table is straightforward.

 

Now that we have a nested class the coded index table will be a typeref. When we go to that row in the TypeRef table, we will once again pick up the coded index and in this specific case it will be the Assembly Ref table as our nested class is in the mscorlib assembly.

 

We first write out the assembly name like before and then figure out whether the class aaa belongs to a namespace or not. The field nspace tells us the name of the namespace but we use the earlier coded index row variable resolutionscopeindex and not the second one. Thus variable nspace1 will contain the value ppp.

 

We then add a dot as the class namespace separator only if the class aaa belongs to a namespace. We next display the row number of the class aaa that happens to be 3 in this case. Then we place a / because the next class following is the nested class and separator between nested classes is a single /.

 

The problem with writing code is that there are too many possibilities to take care of. If the first coded index table is TypeRef, then we have to check for the other tables also. If we place a nested class with module, then the second coded index table will have the values of Module or ModuleRef.

 

We have not written code for these cases as we have left it to you to implement. Thus a class statement as .class a99 extends [.module bb]ppp.aaa/aa {} will not work as the second coded index will now be ModuleRef and we have not handled such a case.

 

The mistake we made is that we should have had a function that handles a resolution coded index. Then we could call this function each time. The reason we do not is that it would make the program more difficult to understand.

 

Finally if we write a line like .class a99 extends [mscorlib]ppp.aaa/aa/bb we will also get an error as the second coded index table will be a TypeRef. Thus each nested within nested class will keep giving us a row in the TypeRef table.

 

What we are saying is that each nested class within a nested class is a separate row in the TypeRef table and we would need a concept called recursion to handle this. Finally we come to the end where we need to display the namespace and the name of the class.
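
For those who want to try the recursive route, here is one possible sketch of a single function that handles the resolution scope and calls itself whenever the scope is another TypeRef row. It is not the code of our program. The class TypeRefRow, its fields and the function FullName are hypothetical, the scope is stored as a table name plus a row number instead of a raw coded index to keep things short, and the tokens in comments are left out.

```csharp
// Hypothetical, simplified TypeRef row: the scope is already decoded.
public class TypeRefRow
{
    public string scopetable;   // "AssemblyRef", "ModuleRef", "Module" or "TypeRef"
    public int scoperow;        // row number in that table
    public string nspace;
    public string name;

    public TypeRefRow(string scopetable, int scoperow, string nspace, string name)
    {
        this.scopetable = scopetable;
        this.scoperow = scoperow;
        this.nspace = nspace;
        this.name = name;
    }
}

public static class TypeRefNameSketch
{
    // Row 0 of every array is unused as metadata row numbers start at 1.
    public static string FullName(TypeRefRow[] typerefs, string[] assemblyrefs,
                                  string[] modulerefs, int row)
    {
        TypeRefRow r = typerefs[row];
        string name = string.IsNullOrEmpty(r.nspace) ? r.name : r.nspace + "." + r.name;

        switch (r.scopetable)
        {
            case "TypeRef":
                // Nested class: resolve the enclosing class first, then add /name.
                return FullName(typerefs, assemblyrefs, modulerefs, r.scoperow) + "/" + name;
            case "AssemblyRef":
                return "[" + assemblyrefs[r.scoperow] + "]" + name;
            case "ModuleRef":
                return "[.module " + modulerefs[r.scoperow] + "]" + name;
            default:
                // Module: the type is in the current module, no qualification needed.
                return name;
        }
    }
}
```

With aaa in row 3 (namespace ppp, scope AssemblyRef row 1 for mscorlib), aa in row 4 with a scope of TypeRef row 3 and bb in row 5 with a scope of TypeRef row 4, FullName for row 5 gives [mscorlib]ppp.aaa/aa/bb. The same function would also cope with a nested class that sits in a module ref, as the ModuleRef case is reached through the recursive call.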

 

We use the parameter typerefindex to get at the name of the class and namespace and decide whether to place the dot separator. As this is a common thing we always do, we will not explain it again. The point to understand is that the class name is written in the reverse order for a nested class.

 

The enclosing class is specified first and then the nested classes. Thus for class a9, it extends nested class aa that has a row number of 4 in the TypeRef table. This class is nested within class aaa which is row number 3 and is in a namespace ppp.

 

Our program like all programs is not complete because the following line is in error. .class a99 extends [mscorlib]ppp.aaa/qqq.aa. The assembler passes it in spite of a flaw. The nested class aa cannot have a namespace qqq as a namespace is a top level construct.

 

The disassembler is  smart enough to understand and it removes the namespace. We thus need to write code that says if the first coded index table is TypeRef then the nested classes cannot have a namespace. We did not as the assembler did not.

 

Whew. Now we come back to the function DisplayOneTypeDefStart  and check the second table name of the TypeDefOrRef coded index TypeDef. This happens when we extend from a type that is created in the same module. This type will obviously have a row in the TypeDef table.

 

We simply have to display the contents of the TypeDef table and this is what the function DisplayTypeDefExtends does. If ever the row number of any table is 0, this means that this is not a valid row number. This is a simple function as  the name and nspace fields give us the name and namespace and we add the token within comments and return this string.
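
The token that we place in comments is nothing but the table number in the top byte and the row number in the lower three bytes. A minimal sketch of this arithmetic follows; the class name TokenSketch and the function Token are hypothetical names used only for this illustration.

```csharp
// Hypothetical helper that builds the token shown in comments.
public static class TokenSketch
{
    // table: the metadata table number, e.g. 0x02 for TypeDef, 0x01 for TypeRef,
    //        0x23 for AssemblyRef.
    // row:   the row number within that table.
    public static string Token(int table, int row)
    {
        int token = (table << 24) | row;   // table in the top byte, row below it
        return token.ToString("X8");       // eight hex digits, zero padded
    }
}
```

Token(0x02, 2) gives 02000002 for the second row of the TypeDef table, and Token(0x23, 1) gives 23000001 for the first row of the AssemblyRef table.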

 

The problem is not with simple types but with nested types. Thus we have a function GetNestedTypeAsString that first figures out the nested type. The problem is with class a7, which extends nested class a333 that is nested within classes a33 and a3. If you are not clear about the row numbers, look at the tokens that are displayed immediately after each class name.

 

The class a2 has a row number of 4, which you can verify by looking at the definition of class a2. The parameter rowindex is that of class a333 and its value is row 7. We first check if this class is nested using the function IsTypeNested. Now that we have a nested class, we would want to figure out the parent of this nested class a333.

 

We use a function GetParentForNestedType that returns the enclosing class for us. This function is a no brainer. We told you eons ago that the Nested Class table has two fields, one that is the type index of the nested class and the other that of the enclosing class.

 

Thus we loop through the entire array NestedClassStruct and break when we meet a row where the parameter typeindex equals the field nestedclass, the row number of the nested class. We return the enclosingclass field as this is the row number of the enclosing class.

 

We build in no error check as both fields have to be valid indexes in the TypeDef table. We then check whether row 6 or class a33 is a nested class, and if it is we again determine the parent of the nested class a33, which is a3 or row number 5. Finally we come to the parent of class a3, which is class a2 or row number 4.

 

The strings nameparent3 and namespaceandnameparent3 will display the name and namespace for the topmost class a2, i.e. row 4, as they use the variable rowindexp3. This is responsible for the namespace n1.n2. We then place the nested class separator and then use the variable rowindexparentparent to give us the name of class a3.

 

Then we use the variable rowindexparent to give us the next nested class a33 and the namespace if any, and finally the variable rowindex for the actual name of the class a333. The second namespace is used in the case of class a72. Thus depending upon the number of levels of nested classes we use the relevant namespace field.

 

In the case of class a71, the netsedtypestring variable gives us the namespace n1.n2. This is why we need to place the namespace twice in our code. Obviously all is not right in our program as we have assumed a certain level of nesting.

 

Thus if we introduce a class .class a72 extends n1.n2.a2/a3/a33/a333/a3333 we will get an error as we have one more level of nesting. A better way would be to use recursion as we mentioned earlier. To solve the above problem we could add one more level of nesting, but it would still be an imperfect solution.
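
A recursive version that does not care about the number of levels could look like the sketch below. It is only an outline under some assumptions: the tables are reduced to plain arrays, the class NestedClassRow is a stand in for a row of our NestedClassStruct array, and the functions take the arrays as parameters instead of using instance variables. The names GetParentForNestedType and GetNestedTypeAsString are borrowed from our program but the bodies here are not the original ones.

```csharp
// Stand in for one row of the Nested Class table.
public class NestedClassRow
{
    public int nestedclass;      // TypeDef row of the nested class
    public int enclosingclass;   // TypeDef row of the class that encloses it

    public NestedClassRow(int nestedclass, int enclosingclass)
    {
        this.nestedclass = nestedclass;
        this.enclosingclass = enclosingclass;
    }
}

public static class NestedTypeDefSketch
{
    // Returns the enclosing class row, or 0 (never a valid row) if the type is not nested.
    public static int GetParentForNestedType(NestedClassRow[] nestedclasses, int typeindex)
    {
        for (int ii = 1; ii < nestedclasses.Length; ii++)
        {
            if (nestedclasses[ii].nestedclass == typeindex)
                return nestedclasses[ii].enclosingclass;
        }
        return 0;
    }

    // Builds namespace.outer/inner/... for any depth of nesting.
    public static string GetNestedTypeAsString(string[] nspaces, string[] names,
                                               NestedClassRow[] nestedclasses, int rowindex)
    {
        int parent = GetParentForNestedType(nestedclasses, rowindex);
        if (parent == 0)
        {
            // Top level class: only here can a namespace appear.
            string nspace = nspaces[rowindex];
            return string.IsNullOrEmpty(nspace) ? names[rowindex] : nspace + "." + names[rowindex];
        }
        // Nested class: first the enclosing class, then the separator and our name.
        return GetNestedTypeAsString(nspaces, names, nestedclasses, parent) + "/" + names[rowindex];
    }
}
```

With a2 in row 4 under the namespace n1.n2 and a3, a33, a333 and a3333 in rows 5 to 8, each nested in the previous one, a call with rowindex 8 gives n1.n2.a2/a3/a33/a333/a3333. The enclosing class is written first because the recursive call finishes before our own name is added.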

 

A similar problem arises with the TypeRef class and we have only handled one level of nested classes. Thus the class .class a91 extends [mscorlib]ppp.aaa/aa/bb will give us a problem as we have not handled a case where there are two levels of nesting. The coded index table will have a value of TypeRef in the second case.

 

A point to understand is the difference in how nested classes are handled by a TypeRef and a TypeDef. Each nested class exists in the TypeRef table as a separate row, and a coded index table value of TypeRef tells us that there are more enclosing types to follow. In the case of the TypeDef table, the Nested Class table tells us the levels of nesting.

 

The typeextends variable by now contains the class name that we write after the word extends. If we do not specify a class name, ilasm adds the class name Object automatically. The problem comes in when we have an interface. Interfaces do not automatically derive from the class Object.

 

Thus if and only if the typeextends variable is non null do we write out the keyword extends along with the type. Before we write the word extends on a new line we need to first write spaces for nested classes, followed by the spaces taken if the class is within a namespace.

 

The last part of the class directive is the implements clause, which carries the list of interfaces, and not a single interface, that we implement. The function DisplayAllInterfaces is what we use. Each time a class implements an interface, a row is stored in the InterfaceImpl table. We loop through all the records in this table and check the field called classindex.

 

An interface is also a class and has a valid row number in the TypeDef table. The classindex field holds the TypeDef row number of the class that implements the interface. If a class implements more than one interface, the classindex field will repeat. Thus if a class implements three interfaces, this table will have three rows with the same value of the classindex field.

 

The extends and the implements classes are stored in different tables as the implements shares a one to many relationship. The field interfaceindex is the TypeDefOrRef coded index and from now on the same code we used earlier to retrieve the class name is also used here. The last problem in this function is one of indentation.

 

Each interface needs a line of its own to be displayed. The problem on hand is how do we know that this is the last interface to be displayed. We take a Boolean variable nextclassindex to tell us whether we should display an enter and spaces after the interface name. We first check if it is the last row, there cannot be any more interfaces and hence we set variable nextclassindex to false.

 

Also if the classindex of the next row is not the same as the parameter typeindex we set nextclassindex to false. This assumes that for the same type, all the interfaces are stored one after another. If none of the above are true, we set variable nextclassindex to true. We then use this variable to write out a new line with spaces or not.
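
The look ahead logic can be tried out in isolation with the following sketch. It is a simplified stand in and not our DisplayAllInterfaces function. The class InterfaceImplRow is hypothetical, the interfaceindex field is treated here as a plain index into an array of names instead of a TypeDefOrRef coded index, and the comma that separates the interfaces is our assumption about the output format.

```csharp
using System.Text;

// Hypothetical, simplified row of the InterfaceImpl table.
public class InterfaceImplRow
{
    public int classindex;       // TypeDef row of the implementing class
    public int interfaceindex;   // simplified: index into the names array

    public InterfaceImplRow(int classindex, int interfaceindex)
    {
        this.classindex = classindex;
        this.interfaceindex = interfaceindex;
    }
}

public static class InterfaceListSketch
{
    public static string DisplayAllInterfaces(InterfaceImplRow[] rows, string[] typenames,
                                              int typeindex, int spaces)
    {
        StringBuilder sb = new StringBuilder();
        for (int ii = 1; ii < rows.Length; ii++)       // row 0 is unused
        {
            if (rows[ii].classindex != typeindex)
                continue;
            if (sb.Length == 0)
                sb.Append("implements ");
            sb.Append(typenames[rows[ii].interfaceindex]);

            bool nextclassindex;
            if (ii == rows.Length - 1)
                nextclassindex = false;                // last row, nothing can follow
            else if (rows[ii + 1].classindex != typeindex)
                nextclassindex = false;                // next row belongs to another class
            else
                nextclassindex = true;                 // one more interface for this class

            if (nextclassindex)
                sb.Append(",\n").Append(new string(' ', spaces));
        }
        return sb.ToString();
    }
}
```

As in the text, this relies on all the rows of one class being stored one after another. If class 2 implements i1 and i2 and the indentation is 4, the result is implements i1, a comma, a new line, four spaces and then i2, with nothing after the last interface.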

 

The last thing we do is display the { with the right number of spaces. Within the class prototype we display nothing else but prototypes of nested classes.

 

Coming back to the function DisplayOneTypePrototype, we now call the function DisplayNestedTypesPrototypes. We will use this function to display all the nested types for which the typeindex is an enclosing class.

 

The important thing to understand here is that nested classes are to be displayed in the same way as the enclosing class but only need to be indented by 2 each time. Thus the code that we wrote in the function DisplayOneTypeDefStart can be reused.

 

Also unlike the way we have been writing code so far, we cannot assume that the level of nesting will be 3 or 4 or 5. This is the first time we will use the concept of recursion. The basic premise of recursion is: write the code once, call it again. This makes recursion sound like a function call. True, but with a slight difference.

 

We will call the same function within the same function. Complex? Let's explain with an actual example. In the function DisplayNestedTypesPrototypes we have no idea how many classes are nested within this class, or whether it has any nested classes at all.

 

There is one table, the Nested Class table, that has the details of the nested classes stored. We iterate through this table and check whether the parent or enclosing class field has the typeindex passed as a parameter. If yes, then this class of ours has nested classes.

 

We then call the function DisplayOneTypePrototype that displays the prototype of a type. We pass the nestedclass field as that field contains the nested class to be displayed. A class can have 6 nested classes in it and this if statement will be true six times.

 

The major point is that each nested class in turn may have a million levels of nesting. In the function DisplayOneTypePrototype we are first calling the function DisplayOneTypeDefStart which displays the class directive.

 

Before calling this function we increase the instance variable spacesfornested by 2 and thus this nested class will indent by 2 as its value will be 4 and not 2 as earlier. After the { is displayed, the function DisplayNestedTypesPrototypes will be called again. Remember the first DisplayNestedTypesPrototypes is not yet over. 

 

This is the flavor of recursion, calling the same function again in spite of the fact that the earlier call is not over and lies suspended in memory waiting for the second call to be over. Now in the for loop the typeindex value is of the nested class and now it is the enclosing class.

 

If the nested class has any further nests, then the second DisplayNestedTypesPrototypes is suspended and the variable spacesfornested increases by 2. At some point in time the for loop will not match and the function DisplayOneTypePrototype will not get called. This will result in the function DisplayOneTypeDefEnd being called, which displays the closing brace of the class.

 

We will then move out of the function DisplayOneTypePrototype as there are no more functions left to be called. This will result in the variable spacesfornested having a value of two less. The problem with recursion is that some take a million tries before they get it but no one gets it in one or two.
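
If the flow is still hazy, the following stripped down sketch may help. It is not the code of our disassembler: the class NestedRow is a hypothetical stand in for a row of the Nested Class table, the output is collected in a list instead of being written to the console, and the indentation is passed as a parameter instead of being kept in the instance variable spacesfornested.

```csharp
using System.Collections.Generic;

// Hypothetical stand in for one row of the Nested Class table.
public class NestedRow
{
    public int nestedclass;
    public int enclosingclass;

    public NestedRow(int nestedclass, int enclosingclass)
    {
        this.nestedclass = nestedclass;
        this.enclosingclass = enclosingclass;
    }
}

public static class NestedDisplaySketch
{
    // Writes one type, then every type nested in it, each level indented by 2 more.
    public static void DisplayOneType(string[] names, NestedRow[] nested,
                                      int typeindex, int spaces, List<string> output)
    {
        string pad = new string(' ', spaces);
        output.Add(pad + ".class " + names[typeindex]);
        output.Add(pad + "{");

        for (int ii = 1; ii < nested.Length; ii++)     // row 0 is unused
        {
            if (nested[ii].enclosingclass == typeindex)
            {
                // The same function calls itself; this call waits until the
                // nested type and everything under it has been written out.
                DisplayOneType(names, nested, nested[ii].nestedclass, spaces + 2, output);
            }
        }

        output.Add(pad + "} // end of class " + names[typeindex]);
    }
}
```

For a class a1 in row 2 with a nested class a33 in row 3, a call for row 2 with 0 spaces produces six lines: the two opening lines of a1, the three lines of a33 indented by 2, and finally the closing brace of a1. Passing the indentation as a parameter means nothing has to be decreased by hand when a call returns.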

 

So read again and then come back to the last function DisplayOneTypeDefEnd.

 

This function simply has to display the close brace, end of class and the end of namespace if any. The writenamespace variable starts with a value of true and if and only if its value is true do we display the namespace directive in the function DisplayOneTypeDefStart.

 

We first check whether the namespace variable contains a valid namespace name. If it is null we set the variable writenamespace to false so that the namespace directive is not displayed. We then need to find out the next class that will be displayed after this class.

 

We cannot assume that it will be the next row as that could be a nested type within the current type. Thus we need to scan the next row from the type we have right now displayed and keep going till we reach the end. We check each succeeding type and loop back if it is a nested type under the type we have just displayed.

 

If the type is not nested we break. We are assuming that the enclosing types and nested types are placed one after the other. Now that we have left the for loop we are on a type that will be displayed after this type. We add a check that we are not at the end of the table and find out the namespace of this type. 

 

We write out the closing brace of the namespace and also its name. As this is the end of a namespace we reset the variable spacefornamespace to 0 as there is no indentation for the namespace and the spacesforrest is reset to 2 as this variable gives us the indentation for the succeeding entity to be placed as we shall soon see.

 

As the writenamespace is set to true, the next time we come across a class that has a namespace the namespace directive will be displayed. The point is that if two classes fall under the same namespace, the namespace directive is not repeated at all.

 

We display the first namespace directive and then only display it if we have written out the end of namespace as the namespace directive cannot be nested. Hence the writenamespace variable is set to null in the else of the first if.

 

The TypeRef table contains only three fields, and the ResolutionScope coded index as per the specs can take five possible values. Out of these we could demonstrate four, i.e. the TypeRef, Module, ModuleRef and AssemblyRef tables. The fifth is when the value is null, which means that the table is the ExportedType table. A type in this table is not a valid type after the extends keyword. We cannot have two types in this table where the name and namespace fields are the same.

 

If we did not mention this before, the InterfaceImpl table has only two fields, class, which is the index into the TypeDef table, and interface, which is a TypeDefOrRef coded index. Each row in this table means that the type denoted by the class field implements the interface denoted by the interface field; 10 rows mean 10 such pairs.

 

Obviously the class field must be non null, and if it is by mistake null, we should assume that this row does not exist at all. This happens when a class is deleted and the metadata is not updated and rewritten when the compiler incrementally compiles. A time saver. The interface field indexes into the TypeDef or TypeRef table and not TypeSpec, as TypeSpecs do not have a name.

 

Also the type that the interface field points to must have the interface flag on; it cannot be a normal class or a value type. The class and interface values together cannot be duplicated, but the class by itself can appear more than once as a class can implement lots of interfaces. Vice versa, the interface field can repeat as one interface can be used by multiple classes.

 

Program18.csc

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

DispalyVtFixup();

DisplayTypeDefs();

DisplayTypeDefsAndMethods ();

}

public void DisplayTypeDefsAndMethods ()

{

notprototype = true;

if ( TypeDefStruct.Length != 2)

{

Console.WriteLine();

Console.WriteLine("// =============================================================");

Console.WriteLine();

}

Console.WriteLine();

Console.WriteLine("// =============== GLOBAL FIELDS AND METHODS ===================");

Console.WriteLine();

//DisplayGlobalFields();

//DisplayGlobalMethods();

if ( TypeDefStruct.Length != 2)

{

Console.WriteLine();

Console.WriteLine("// =============================================================");

Console.WriteLine();

Console.WriteLine();

Console.WriteLine("// =============== CLASS MEMBERS DECLARATION ===================");

Console.WriteLine("//   note that class flags, 'extends' and 'implements' clauses");

Console.WriteLine("//          are provided here for information only");

Console.WriteLine();

int kk = TypeDefStruct.Length ;

for ( int i = 2 ; i < kk ; i++)

{

if ( GetString(TypeDefStruct[i].name) == "_Deleted" && streamnames[0] == "#-")

continue;

if ( ! IsTypeNested(i) )

{

DisplayOneType(i);

}

}

}

DisplayEnd ();

}

public void DisplayOneType (int typedefindex)

{

DisplayOneTypeDefStart(typedefindex);

DisplayNestedTypes(typedefindex);

DisplayOneTypeDefEnd(typedefindex );

}

public void DisplayNestedTypes (int typedefindex)

{

if (NestedClassStruct == null)

return ;

for ( int ii = 1 ; ii < NestedClassStruct.Length ; ii++)

{

if (NestedClassStruct[ii].enclosingclass == typedefindex)

{

spacesfornested += 2;

DisplayOneType(NestedClassStruct[ii].nestedclass  );

spacesfornested -= 2;

}

}

}

public void DisplayEnd()

{

string nspace = NameReserved(GetString(TypeDefStruct[TypeDefStruct.Length-1].nspace));

if ( ! placedend)

{

Console.WriteLine();

Console.WriteLine("// =============================================================");

Console.WriteLine();

placedend = true;

}

Console.WriteLine("//*********** DISASSEMBLY COMPLETE ***********************");

if (datadirectoryrva[2] != 0)

Console.WriteLine("// WARNING: Created Win32 resource file a.res");

}

 

public void ReadTablesIntoStructures()

{

 

 

int ii ;

for ( ii = 1 ; ii <= TypeDefStruct.Length - 1 ; ii++)

{

//Console.WriteLine("........{0} {1} {2}" , TypeDefStruct.Length , IsTypeNested(ii) , ii);

if ( ! IsTypeNested(ii) )

lasttypedisplayed = ii;

}

}

 

e.il

.namespace aa

{

.class a1

{

.class a33

{

}

}

}

.class a2

{

}

 

Output

// =============================================================

 

 

// =============== GLOBAL FIELDS AND METHODS ===================

 

 

// =============================================================

 

 

// =============== CLASS MEMBERS DECLARATION ===================

//   note that class flags, 'extends' and 'implements' clauses

//          are provided here for information only

 

.namespace aa

{

  .class /*02000002*/ private auto ansi a1

         extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

  {

    .class /*02000003*/ auto ansi nested private a33

           extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

    {

    } // end of class a33

 

  } // end of class a1

 

} // end of namespace aa

 

.class /*02000004*/ private auto ansi a2

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

} // end of class a2

 

 

// =============================================================

 

//*********** DISASSEMBLY COMPLETE ***********************

 

After a long time program18 gives us an output that matches the output displayed by the original disassembler. We have a namespace aa that has a class a1 which in turn has a nested class a33. The class a2 is not enclosed in a namespace. The above il file has no methods and thus we need to assemble it into a dll using the /DLL option.

 

Now let's look at the code that we have added to get the above magic. The excitement now starts, and step by step we will keep adding more and more to the il file and write code that matches the output of the disassembler. We call the function DisplayTypeDefsAndMethods in the function abc to display the contents of the types for us.

 

In this function we first set the variable notprototype to true which if you remember was set to false earlier and thus some code which we did not explain but will explain now did not get called. We also told you some time back that a dummy class called module always gets created even though you do not have a single class defined.

 

Thus we first check that the user has created at least one class and then only display the = signs in comments. We then write out the comments for global Fields and Methods. These global entities are allowed in languages like IL, C++ but not in C#.

 

We use two functions to display these global fields and methods but comment them out for the moment as we will explain them later. We have more pressing things to do at this moment. Once again, if there are classes to be displayed we write out the words class members declaration with the required comments for extends and implements.

 

A waste of code and space if you ask us. We then place a blank line and come to the core of this function, a for loop that starts at 2 and not 1 and runs up to the number of rows in the array TypeDefStruct. As before we take care of classes called _Deleted and let the function DisplayOneType take care of displaying one type.

 

This is like the earlier type prototype display function and thus we will not explain the if statement checking for nested classes. Finally, when we leave the for loop, the function DisplayEnd will display the last lines of the output. After this it is back to bed.

 

No more function calls. Looking at the DisplayOneType function, all that we do is call two functions we have called before DisplayOneTypeDefStart and DisplayOneTypeDefEnd. This is how we reuse code.

 

We also call the DisplayNestedTypes function to display nested types, the only difference from its prototype cousin being that it calls the DisplayOneType function instead of DisplayOneTypePrototype. Let's explain the role of a variable lasttypedisplayed that we initialize in the function ReadTablesIntoStructures.

 

At the end of this function we have a for loop that loops through the type table. We would like to know the last type we are displaying. We cannot assume that it is the last physical type, as that type could be a nested type that will be displayed within another type.

 

Thus we set the variable lasttypedisplayed to the loop variable ii as long as the type is not nested. When the loop ends, the variable holds the row number of the last type that is not nested, and we do not change its value after that. To understand this better we would like you to flip lots of pages and move over to the DisplayOneTypeDefEnd function of the earlier program. Let's take a different e.il for this case.

 

e.il

.namespace aa

{

.class a1

{

}

}

 

We have to write out a series of = signs with an enter before and after. If the last type to be displayed falls within a namespace, we have to write out the = signs before we close the namespace.

 

Thus if we are displaying the last type which happens when the variables lasttypedisplayed equals typeindex then we write out the many equal to signs. As this function gets called twice once for prototypes if you have not forgotten, the notprototype variable has to be set to true.

 

The placedend variable is set to true as we have written out the equal to signs. The above il file triggers the first if statement as the type to be displayed, a1, is also the last type. We must remember that the two namespaces nspace1 and nspace2 must not be null. Finally we come to the function DisplayEnd which, as said before, is the last function to be called.

 

If we have an il file with no namespace, or better still the last class is not within a namespace, the earlier code will not place the = signs. Thus we first check the value of the placedend variable. If it is false, it means that we have to write out the closing equal to signs.

 

We do not have to set the placedend variable to true here as no code gets called after this, but even if we do it only means that we are lousy programmers. If the second data directory member is non zero, it means that we have had the disassembler create a resource file for us and we need to display the name of this file, which is always called a.res.

 

Thus we have now written a program that matches the output generated by the disassembler, but do not start the celebrations as we have miles and miles to go.

 

Program19.csc

public void DisplayOneType (int typedefindex)

{

DisplayOneTypeDefStart(typedefindex);

DisplaySizeAndPack (typedefindex);

DisplayNestedTypes(typedefindex);

DisplayOneTypeDefEnd(typedefindex );

}

public void DisplaySizeAndPack (int typeindex)

{

if ( ClassLayoutStruct == null)

return;

for ( int ii = 1 ; ii < ClassLayoutStruct.Length ; ii++)

{

if ( ClassLayoutStruct[ii].parent == typeindex )

{

Console.Write(CreateSpaces(spacesfornested + spacesforrest));

Console.WriteLine(".pack {0}" , ClassLayoutStruct[ii].packingsize);

Console.Write(CreateSpaces(spacesfornested + spacesforrest));

Console.WriteLine(".size {0}" , ClassLayoutStruct[ii].classsize);

}

}

}

 

e.il

.class a1

{

.pack 2

}

.class a2

{

.size 2

}

.class a3

{

.size 2

.pack 2

}

 

 

Output

.class /*02000002*/ private auto ansi a1

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .pack 2

  .size 0

} // end of class a1

 

.class /*02000003*/ private auto ansi a2

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .pack 1

  .size 2

} // end of class a2

 

.class /*02000004*/ private auto ansi a3

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .pack 2

  .size 2

} // end of class a3

 

We have added a function DisplaySizeAndPack that displays the size and pack directives that we have added to our classes. A size or pack directive fills up a row in the ClassLayout table. This table has three fields, two to store the size and pack values and the third to tell us which type carries these directives.

 

The reason we do not store the pack and size directives in the TypeDef table itself is that they are optional. The packing size field is a short whereas the class size is larger at four bytes. We use the class layout table to tell the compiler how the fields, what other languages call instance variables, should be laid out or arranged in memory.
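
A minimal sketch of how one row of this table could be read, assuming the bytes of the row are already positioned in a BinaryReader and that the TypeDef table is small enough for a two byte index. The names ClassLayoutRow and ClassLayoutReader are ours, made up for illustration, and are not part of the program in this chapter.

```csharp
using System;
using System.IO;

// Hypothetical holder for one ClassLayout row; the field names mirror
// the ones used by DisplaySizeAndPack.
public struct ClassLayoutRow
{
    public int packingsize; // 2 bytes in the file
    public int classsize;   // 4 bytes in the file
    public int parent;      // row number in the TypeDef table
}

public static class ClassLayoutReader
{
    // Assumption: the TypeDef table has fewer than 65536 rows, so the
    // parent index occupies 2 bytes. Larger tables use a 4 byte index.
    public static ClassLayoutRow ReadRow(BinaryReader reader)
    {
        ClassLayoutRow row = new ClassLayoutRow();
        row.packingsize = reader.ReadUInt16();
        row.classsize = (int)reader.ReadUInt32();
        row.parent = reader.ReadUInt16();
        return row;
    }
}
```

The eight bytes 02 00 10 00 00 00 03 00 would thus give us a pack of 2, a size of 16 and a parent of type row 3.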

 

These directives only apply to a class or value type and not to an interface. Normally the CLI is free to place the fields wherever it wants in memory and leave as many gaps as it likes. If it so pleases it is also allowed to move the fields in memory.

 

The managed world of .net has to give you most of the features available with the unmanaged world of C/C++. In these languages we had the freedom of placing our fields or structures the way we liked. By allowing us the same flexibility, we can now access unmanaged code structures in the same way using managed code.

 

There are, if you have been reading this book sequentially, three layout attributes: auto, sequential and explicit. The default is auto if we do not specify a layout. Coming to our program, the first class a1 has a pack of 2 and no size directive, and the output shows us the default size of 0.

 

A value of 0 does not mean that the class size is zero, it means that the CLI will figure it out. The second class a2 has a size and no pack and the default pack size is 1 and not 0. In our function DisplaySizeAndPack all that we do is scan the class layout table and check for that single record that has the parent field being equal to the parameter typeindex.

 

If a record does not match the type, we do not display these directives. We first check whether the packingsize is non zero as the default is zero when we do not specify a size and only a pack as mentioned before.

 

We write out the indentation; spacesforrest takes care of the indentation other than that for nested classes. We then write out the pack and size directives. Let's now understand what the pack directive is all about. If we have a pack size of say 16, this means that every field in memory at runtime should start at an address which is a multiple of either 16 or the natural alignment of the field type.

 

Unlike life, the CLI chooses whichever is smaller and not larger. Thus if we specify a pack of 2, a 32 bit field will begin at an address that is a multiple of 2, and not of 4 as would happen naturally if there was no pack. The pack can only have values of 0, 1, 2, 4, 8, 16, 32, 64 or 128.
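
The whichever is smaller rule can be sketched in a few lines. This is only an illustration under our own assumptions: every field's natural alignment is taken to be its size, a pack of 0 is treated as no restriction, and the names PackDemo, EffectiveAlignment and ComputeOffsets are hypothetical. The actual placement is decided by the runtime.

```csharp
using System;

public static class PackDemo
{
    // Effective alignment of a field: the smaller of the pack size and
    // the natural alignment of the field. A pack of 0 is treated here as
    // "no restriction", an assumption made only for this sketch.
    public static int EffectiveAlignment(int pack, int naturalAlignment)
    {
        if (pack == 0)
            return naturalAlignment;
        return Math.Min(pack, naturalAlignment);
    }

    // Offsets for fields placed one after another. Each entry of
    // fieldSizes is used as both the size and the natural alignment.
    public static int[] ComputeOffsets(int pack, int[] fieldSizes)
    {
        int[] offsets = new int[fieldSizes.Length];
        int next = 0;
        for (int i = 0; i < fieldSizes.Length; i++)
        {
            int align = EffectiveAlignment(pack, fieldSizes[i]);
            // Round next up to the nearest multiple of align.
            next = (next + align - 1) / align * align;
            offsets[i] = next;
            next += fieldSizes[i];
        }
        return offsets;
    }
}
```

With a pack of 2, a one byte field followed by a four byte field gets the offsets 0 and 2. With a pack of 16 the natural alignment wins and the offsets are 0 and 4.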

 

A value of zero does not mean no pack; it means that the pack size used should be decided by the platform we are running on. Obviously the pack directive and the explicit attribute, where we are being explicit about each offset, cannot be used together.

 

The size directive is easier to understand as it specifies the size of memory allocated to the fields of the class and not to the methods. This value should be larger than or at the very least equal to the calculated size of the class. The size of the class is the sum of the sizes of the individual fields and the extra gaps due to the pack directive.

 

The pack and size directives are not hints and the system better obey our values or else …. The class layout table may be empty and normally is. Obviously the class containing these two directives must not have the auto layout as ours do, but we get a warning from the assembler and not an error.

 

If the class size is larger than the actual size, padding is provided at the end of the class by the compiler. A class size of zero specifies that the system can figure out the size of the class as it normally does. Even though we use the explicit layout attribute, we can still have a verifiable type if our type does not have a union.

 

A union is an entity that allows different fields to start at the same location in memory. For an explicit layout attribute the packing size is 0 as we are explicitly specifying each offset. If you have forgotten, all classes derive from System.Object and value types from System.ValueType.

 

A layout has to start from the first class that derives from class Object and it cannot start from any other point in the inheritance hierarchy. We can stop the layout anywhere in the chain but from then on no class can have layout. We cannot stop the layout and two classes later start again.

 

Thus no holes are allowed in the layout of classes. Thus the two rules we have specified are no holes and also that the layout starts from the highest class.

 

Program20.csc

public void DisplayOneType (int typedefindex)

{

DisplayOneTypeDefStart(typedefindex);

DisplaySizeAndPack(typedefindex);

DisplayNestedTypes(typedefindex);

DisplayAllMethods(typedefindex);

DisplayOneTypeDefEnd(typedefindex );

}

public void DisplayAllMethods (int typerow)

{

if ( TypeDefStruct == null)

return;

if ( MethodStruct == null)

return;

int start , startofnext=0;

start =  TypeDefStruct[typerow].mindex ;

if ( typerow == (TypeDefStruct.Length -1) )

{

startofnext= MethodStruct.Length;

}

else

startofnext = TypeDefStruct[typerow+1].mindex ;

for ( int methodindex = start ; methodindex < startofnext ; methodindex++)

{

string methodstring = CreateSpaces(spacesforrest);

if ( IsTypeNested(typerow))

methodstring = methodstring + CreateSpaces(spacesfornested);

methodstring = methodstring  + ".method ";

methodstring = methodstring + "/*06" + methodindex.ToString("X6") + "*/ " ;

string methodattribute = GetMethodAttribute(MethodStruct[methodindex].flags , methodindex);

Console.WriteLine(methodstring + methodattribute);

}

}

public string GetMethodAttribute (int methodflags , int methodrow)

{

string returnstring = "";

methodaccessattribute="" ;

methodhidebysigattribute= "";

methodpinvokestring = "";

methodunmanagedexpattribute = "";

methodreqsecobjattribute = "";

methodstaticinstanceattr="";

methodnewslotattr = "";

methodspecialnameattr = "";

methodrtspecialnameattr = "";

methodpinvokeimplattr = "";

methodfinalattr = "";

methodvirtualattr = "";

methodabstractattr = "";

if ( (methodflags & 0x0006) == 0x0006)

returnstring = "public ";

else

if ( (methodflags & 0x0005) == 0x0005)

returnstring = "famorassem ";

else

if ( (methodflags & 0x0003) == 0x0003)

returnstring = "assembly ";

else

if ( (methodflags & 0x0004) == 0x0004)

returnstring = "family ";

else

if ( (methodflags & 0x0001) == 0x0001)

returnstring = "private ";

else if ( (methodflags & 0x0002) == 0x0002)

returnstring = "famandassem ";

else

returnstring = "privatescope ";

methodaccessattribute = returnstring;

if ( (methodflags & 0x0080) == 0x0080)

{

methodhidebysigattribute = "hidebysig " + methodstaticinstanceattr;

returnstring = returnstring + "hidebysig ";

}

if ( (methodflags & 0x0100) == 0x0100)

{

methodnewslotattr = "newslot " ;

returnstring = returnstring + "newslot ";

}

if ( (methodflags & 0x0800) == 0x0800 || (methodflags & 0x0200) == 0x0200 )

{

methodspecialnameattr = "specialname ";

returnstring = returnstring + "specialname ";

}

if ( (methodflags & 0x1000) == 0x1000)

{

methodrtspecialnameattr = "rtspecialname " ;

returnstring = returnstring + "rtspecialname ";

}

if ( (methodflags & 0x0010) == 0x0010)

{

methodstaticinstanceattr = "static " + methodstaticinstanceattr ;

returnstring = returnstring + "static ";

}

else

{

methodstaticinstanceattr = "instance " + methodstaticinstanceattr;

returnstring = returnstring + "instance ";

}

if ( (methodflags & 0x0020) == 0x0020)

{

methodfinalattr = "final " ;

returnstring = returnstring + "final ";

}

if ( (methodflags & 0x0040) == 0x0040)

{

methodvirtualattr = "virtual " ;

returnstring = returnstring + "virtual ";

}

if ( (methodflags & 0x0400) == 0x0400)

{

methodabstractattr = "abstract " ;

returnstring = returnstring + "abstract ";

}

if ( (methodflags & 0x2000) == 0x2000)

{

methodpinvokeimplattr = "pinvokeimpl " ;

returnstring = returnstring + "pinvokeimpl(";

int ii;

if ( ImplMapStruct == null)

{

returnstring = returnstring + "/* No map */) ";

return returnstring;

}

else

{

for ( ii=1; ii < ImplMapStruct.Length ; ii++)

{

int index = ImplMapStruct[ii].cindex;

index = index >> 1;

if ( index == methodrow )

break;

}

if ( ii == ImplMapStruct.Length )

{

returnstring = returnstring + "/* No map */) ";

return returnstring;

}

string methodname = NameReserved(GetString(MethodStruct[methodrow].name));

string name = NameReserved(GetString(ImplMapStruct[ii].name));

int scope = ImplMapStruct[ii].scope;

string modulename = NameReserved(GetString(ModuleRefStruct[scope].name));

modulename = modulename.Replace("\\" , "\\\\");

returnstring = returnstring + "\"" + modulename + "\"" ;

if ( String.Compare(methodname , name) != 0)

returnstring = returnstring + " as \"" + name + "\"";

string pinvokeattribute1;

string pinvokeattribute = GetPinvokeAttributes(ImplMapStruct[ii].attr , out pinvokeattribute1);

returnstring = returnstring + pinvokeattribute1;

if (pinvokeattribute.IndexOf("stdcall") == -1)

returnstring = returnstring + " " + pinvokeattribute;

returnstring = returnstring + ") ";

int index1 = returnstring.IndexOf("pinvok") ;

methodpinvokestring  = returnstring.Remove(0, index1);

}

}

if ( (methodflags & 0x08) == 0x08)

{

methodunmanagedexpattribute = "unmanagedexp ";

returnstring = returnstring + "unmanagedexp ";

}

if ( (methodflags & 0xffff8000) == 0xffff8000)

{

methodreqsecobjattribute = "reqsecobj ";

returnstring = returnstring + "reqsecobj ";

}

return returnstring;

}

public string GetPinvokeAttributes (int attribute , out string returnattribute)

{

returnattribute = "";

if ( (attribute & 0x001) == 0x0001)

returnattribute = " nomangle";

if ( (attribute & 0x006) == 0x0006)

returnattribute = returnattribute+ " autochar";

else if ( (attribute & 0x002) == 0x0002)

returnattribute = returnattribute + " ansi";

else if ( (attribute & 0x004) == 0x0004)

returnattribute = returnattribute + " unicode";

if ( (attribute & 0x040) == 0x0040)

returnattribute = returnattribute + " lasterr";

string returnstring = "";

if ( (attribute & 0x0500) == 0x0500)

returnstring = returnstring+ "fastcall";

else if ( (attribute & 0x0300) == 0x0300)

returnstring= returnstring + "stdcall";

else if ( (attribute & 0x0100) == 0x0100)

returnstring = returnstring + "winapi";

else if ( (attribute & 0x0200) == 0x0200)

returnstring = returnstring + "cdecl";

else if ( (attribute & 0x0400) == 0x0400)

returnstring = returnstring + "thiscall";

return returnstring;

}

 

e.il

.class a1

{

.method public pinvokeimpl("Ole32.dll" as "CoCreateInstance" autochar winapi) int32 CoCreateInstance2()

{

}

.method public pinvokeimpl("Ole322.dll" as "CoCreateInstance" autochar winapi) int32 CoCreateInstance3()

{

}

.method public pinvokeimpl("Ole322.dll" as "CoCreateInstance" autochar winapi) int32 CoCreateInstance()

{

}

.method public pinvokeimpl("Ole322.dll" as "CoCreateInstance" autochar stdcall) int32 CoCreateInstance4()

{

}

}

 

 

Output

.module extern Ole322.dll /*1A000001*/

.module extern Ole32.dll /*1A000002*/

 

.class /*02000002*/ private auto ansi a1

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .method /*06000001*/ public instance pinvokeimpl("Ole32.dll" as "CoCreateInstance" autochar winapi)

  .method /*06000002*/ public instance pinvokeimpl("Ole322.dll" as "CoCreateInstance" autochar winapi)

  .method /*06000003*/ public instance pinvokeimpl("Ole322.dll" autochar winapi)

  .method /*06000004*/ public instance pinvokeimpl("Ole322.dll" as "CoCreateInstance" autochar)

} // end of class a1

 

From this program onwards we will display the method directive, by means of which we define a method, or function as we knew them earlier. The problem with the method directive is that it is really complex, and if we did all of the method directive in one go the program would run into 1500 lines.

 

Thus we break up this directive by having smaller programs deal with each individual feature and then like Humpty Dumpty put them together again. Let's start with displaying the attributes that a method can have. First things first.

 

In the method DisplayOneType we add a call to the method DisplayAllMethods, passing it the row number of the type that carries the methods. This function will display all the methods owned by a type and, a little later, all the global methods also, which have a type of 1. Each method directive adds one row to the method table.

 

The question in your mind is how do we represent all the methods a type contains. The simplest way in our mind is to have two fields, one for the starting method number in the method table and the second for the last method number in the method table.

 

The problem with our way is that it is too space consuming, and where we can use one field why use two. The field mindex tells us the first method number in the method table but there is no field for the last method. We infer this by looking at the mindex field of the next type.

 

We thus have the starting method numbers owned by two succeeding types.  We thus have two row numbers in the method table. The first is the starting point and the second minus 1  is the last method that this type owns. Thus the variable start is the first method row number and variable startofnext is the method row of the succeeding type.

 

We need to check whether the type whose methods we are displaying is the last type. If so, the last method owned by this type is the last row of the Method table, and so we set startofnext to the length of the MethodStruct array. In the for loop methodindex is the loop variable; we start at the variable start and stop one short of the value of startofnext.

 

The other reason that we do not have two fields is that a type can have no methods at all. In this case this type and the next type will have the same value for the field mindex. Why use two when one can do the job.
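
The range calculation can be pulled out and tried on its own. What follows is a sketch under our own assumptions: the mindex values are held in a plain int array with position 0 unused so that positions match the 1-based row numbers, and the names MethodRange and GetRange are hypothetical.

```csharp
using System;

public static class MethodRange
{
    // mindex[i] is the first Method row owned by type row i. Position 0
    // is unused so that array positions match 1-based row numbers.
    // methodTableLength is the length of the method array, that is the
    // number of method rows plus one.
    public static void GetRange(int[] mindex, int typerow, int methodTableLength,
                                out int start, out int startofnext)
    {
        start = mindex[typerow];
        if (typerow == mindex.Length - 1)
            startofnext = methodTableLength; // last type owns the rest
        else
            startofnext = mindex[typerow + 1];
    }
}
```

With three types whose mindex values are 1, 1 and 3 and a method table of four rows, the first type gets the empty range 1 to 1, the second owns methods 1 and 2, and the third owns methods 3 and 4.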

 

If there is one variable we have used the most it is the variable spacesforrest. This variable has basically three different values, 0, 2 or 4. The value of 0 is when we are displaying global functions and no indentation is needed. The value of 2 is the most common as there are 2 spaces between a type and a method.

 

Finally if we are in a namespace the indentation is 4. The variable spacesfornested stores the spaces needed if there is a nested class. Thus whenever we write something out these two variables will always be used for the initial spaces. We write out the directive method and then its row number, and we then use a function GetMethodAttribute to give us the method attributes, which are like the class attributes.

 

We write out the method directive and the attributes and nothing else. Thus the output given by the disassembler will not match as we are, as we said before, only displaying parts. These parts will however match the same parts displayed by the disassembler. Let's move on to the method GetMethodAttribute.

 

The Method table has a member called flags that tells us which attributes this method carries. As in the case of the type, this field has to be read bitwise and not bytewise. The problem that we will come across later is whether the abstract attribute gets displayed first or the newslot attribute.

 

Thus we work within this function in two ways, we use a common variable returnstring to store the individual attributes and also another series of variables like methodaccessattribute that store the individual attribute values. We will use these variables to write out the attributes in a certain order.

 

The return string also contains all the attributes but in the wrong order. Some programs later we will tell you what the right order of writing out the attributes is all about. These attributes are divided into families and the first is the accessibility attributes.

 

These are attributes that decide who can access these methods and even though there are seven of them, the seventh, compilercontrolled, gives us an ilasm error. This is because ilasm does not support this attribute in its first release and we are advised by the specs to use privatescope instead as the effect is the same. It is hidden and cannot be referenced.

 

The other six like assembly, famandassem , family, famorassem , private and public have been touched upon earlier. The assembly attribute has the first and second bit on and thus we bitwise and with 3. The trick here is that the first bit on means private, the second bit on means famandassem.

 

Only if both bits are on does it mean assembly. This means that the order of the if statements is important because if we check for private and famandassem first, they will match when the attribute is assembly. Thus we first check for the combination of bits being on and then the single bits.

 

That is why the single bit checks are carried out at the end and the attributes that have more than one bit on are checked at the beginning. Another way of handling it would be like the way we did earlier: extract the bits for accessibility and then check the absolute values and not the bits. To each his own.
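
The other way could look like the sketch below. It assumes that the accessibility occupies the lowest three bits of the flags field, which is what the values tested in GetMethodAttribute suggest, and the names MethodAccess and GetAccess are ours. Unlike the if statements above, an undefined value of 7 falls through to privatescope here.

```csharp
using System;

public static class MethodAccess
{
    // Mask off the three accessibility bits and compare the value as a
    // whole, so the order of the checks no longer matters.
    public static string GetAccess(int methodflags)
    {
        switch (methodflags & 0x0007)
        {
            case 0x0001: return "private ";
            case 0x0002: return "famandassem ";
            case 0x0003: return "assembly ";
            case 0x0004: return "family ";
            case 0x0005: return "famorassem ";
            case 0x0006: return "public ";
            default: return "privatescope ";
        }
    }
}
```

A flags value of 0x0096, for example, has 6 in its lowest three bits and so gives us public.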

 

The above attributes are mutually exclusive and cannot be combined. This is because a method can have only one accessibility attribute. The next attribute is newslot, which has to do with overriding methods and is only used by virtual functions. We will delve into newslot and virtual methods some time later.

 

Abstract classes, we explained earlier, were incomplete and could not be used. This is because they had abstract functions. The abstract attribute can only be used with virtual functions and these must not be marked final, as abstract methods must be overridden; it is in their genes.

 

The abstract attribute says that there is no code or implementation for this method and that a class deriving from the class containing the abstract method will supply the body of the code. Obviously abstract methods must be present only in abstract types.

 

One important attribute that we will do later is the pinvokeimpl attribute, and a major part of this function deals with it. The next family is the method contract attributes, which has four members: final, hidebysig, static and virtual. These attributes can be combined keeping some conditions in mind.

 

A method cannot be both static and virtual, and only if it is virtual can it be final. A final method is a method that a subclass is not allowed to override. It is a way of saying use the function as is; you can derive from the class but not override this method. We explained earlier that hidebysig is ignored by the VES or system but is used by tools to do whatever they like with the name.

 

This attribute specifies that this method will hide all the methods from the inheritance hierarchy that have the same method signature. If this attribute is not present, the hiding still takes place but the signature is not taken into account and all methods having the same name are hidden.

 

A method is identified by its name plus signature and this is what gives the object oriented world a big advantage over plain C, which hides methods only by name. Finally we have the special handling attributes that are rtspecialname and specialname. A constructor is called .ctor and a static constructor .cctor.

 

These names are recognized by the runtime and treated in a special way. This is the meaning of the rtspecialname attribute; specialname is for use by tools and not the runtime. There are two more special attributes brought in by Microsoft, which are unmanagedexp and reqsecobj.

 

The unmanagedexp attribute says that the method is exported to unmanaged code using COM interop that we will explain in a short while. Reqsecobj says that this method calls another method which has security attributes. Finally some final conditions on applying attributes: static cannot be used with final, virtual or newslot.

 

This makes sense as static functions belong to a type or class and not to an instance. Abstract functions have no code and hence cannot be used with pinvokeimpl and final. Compilercontrolled, when implemented, cannot be used with virtual, final, specialname or rtspecialname.

 

A small bit of information, if we use the attribute specialname, the flags field has a value of 0x800 as the specs tell us. In our file isymwrapper.dll, the specialname attribute was written out by ildasm even though the flags field had a value of 0x200.

 

This value of 0x200 the specs are silent on, so we put two and two together and assumed that a value of either 0x200 or 0x800 stands for the attribute specialname. We have also not checked for a value of 0x4000, which is the security attribute, as we did not know the name of the attribute.

 

A little later we will trouble you with what is called the custom attribute and this directive adds the security attribute. The problem is that ildasm ignores it and so do we. Now let's move on to the pinvokeimpl attribute that we have been promising we would do.

 

We first check whether we have the attribute pinvokeimpl, which has a value of 0x2000. We first write out the name of the attribute and then tell ourselves that each time we have such an attribute, an extra row gets added to the ImplMap table. Let's start with a simple error check first.

 

We assume that we have only one method that uses the pinvokeimpl attribute, and this attribute requires some values within brackets. If we do not supply these values, the ImplMap table has no rows and hence we need to return the words No map within brackets. If the ImplMap table has records we iterate through each and every one of them.

 

But first a small note on the use of the pinvokeimpl attribute. For years programmers have written code that runs under Windows and this code has passed the test of time. It makes no sense that this tried and tested code goes to waste just because something new has come about.

 

Thus Microsoft invested lots of time to come up with a method by means of which managed code, i.e. .net code, can call the earlier unmanaged code. This method is called Pinvoke or platform invoke. Thus code written in the past and present in dlls can now be executed in the managed world of .net.

 

Thus Pinvoke switches from managed to unmanaged, makes the function call and then back to managed code. The .net world may have a data type that needs to be transformed to another data type in the unmanaged world and the return value in the unmanaged world may need to be transformed to another data type as we enter the managed world.

 

This transformation is called data marshalling. These functions that really do not exist in the managed world as they carry no code but are a gateway to the unmanaged world are marked as pinvokeimpl. A little later we will talk of another set of attributes called implementation attributes that come after the function parameters.

 

A pinvokeimpl attribute must have the implementation attributes of native and unmanaged. If you look closely at the il file, after the pinvokeimpl attribute name we have a set of parameters in brackets. This is the only attribute that accepts parameters. The first is the name of the dll that carries the code that will get executed.

 

This is a quoted string or what the docs call a QSTRING, a string in double and not single inverted commas. In our case we are specifying that the code of the function CoCreateInstance2 actually lies in this dll. The second string is optional and specifies the real name of the function as it exists on that platform.

 

Let's take you back in time. In the days of programming in C a function name was simply its name. Then came C++ and the name of a function became the name plus the data types of its parameters. A function abc that took two ints as parameters was renamed by the C++ compiler from Borland to abc$qii.

 

In the same vein the same function abc with a single int was renamed to abc$qi. Thus the Borland compiler added a $q and then the data types of each parameter. This changed the name of the function beyond recognition and is called name mangling.

 

The Microsoft compiler Visual C++ would however use a different set of symbols to represent the parameter types. There is no standard that describes name mangling.

 

Thus in our case we are saying that the function name in the .net world is CoCreateInstance2 but in Ole32.dll it will be seen as CoCreateInstance instead due to name mangling. The .net law says that if a method is to be marked by the pinvokeimpl attribute then it should be a global method i.e. outside a type and must be static.

 

Fortunately for us the assembler does not seem to care. Also pinvokeimpl methods, as they are a stand in for methods in unmanaged code, need not contain any code as any code they carry is never called. We now scan our ImplMap table and move our eyes to the cindex field.

 

This field as the name suggests is a coded index field called the MemberForwarded coded index that points to two tables, method or field. For some reason, a field will never have a pinvokeimpl attribute and thus the coded index always represents a row in the Method table.

 

We right shift the coded index by 1 and then check whether it is the row number of our method. Each time a method has the pinvokeimpl attribute, the field cindex will carry the method number. We break out of the for loop if a match is found. We then need to check whether our method row number actually exists in this table.
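
A small sketch of taking such a coded index apart. We assume here that the lowest bit is the tag, with 0 standing for the Field table and 1 for the Method table, and that the remaining bits are the row number. The names MemberForwarded and Decode are ours and do not appear in the program.

```csharp
using System;

public static class MemberForwarded
{
    // The lowest bit is the tag: 0 selects the Field table and 1 the
    // Method table. The remaining bits hold the row number.
    public static void Decode(int cindex, out string table, out int row)
    {
        table = (cindex & 1) == 0 ? "Field" : "Method";
        row = cindex >> 1;
    }
}
```

A cindex of 7 thus stands for row 3 of the Method table. Our program throws the tag away as it is always 1 in practice.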

 

It will not exist if the pinvokeimpl attribute is present on the method but has no parameters in brackets. This is the same condition that we checked for earlier but here the check is a little different. There we had only one method and this method did not have pinvokeimpl parameters.

 

In this case we have at least one method that has the pinvokeimpl parameters, but the current method has the pinvokeimpl attribute but no parameters. We first pick up the name of the method as stored in the method table and store it in the methodname variable.

 

The field name in the ImplMapStruct stores the actual name of the function we wrote in the as clause. The first thing we wrote was the name of the Dll that carries the code of this function and there is no field that gives us this dll name. The pinvokeimpl attribute is a lot more complex.

 

For every dll name that we write, a row gets added to the ModuleRef table. This row number is the value of the scope field. Thus if we have two pinvokeimpl attributes with different dll names, Ole32.dll and Ole322.dll in our case, the output carries two module extern directives.

 

If the module name carries a backslash, we need to replace it with two backslashes. We then compare the name of the method as stored in the method table with the name of the actual unmanaged code method. If they are different, we put the as clause and the original method name.

 

Thus in our il file, the third method has the name in the as clause the same as the method name and thus the as is omitted from the pinvokeimpl attribute. There are some more attributes that we write out after the as clause and we use the function GetPinvokeAttributes to get at these extra attributes stored in the attr field.

 

In this function we return one set of attributes as the return value and another set through the out parameter returnattribute. The out parameter carries any of five attributes: nomangle, autochar, ansi and unicode, which we did earlier, and lasterr. The second set of attributes, the one that we actually return, has to do with the calling convention: fastcall, stdcall, winapi, cdecl and thiscall.

 

We write out the out parameter first and then the calling convention. The only problem with the calling convention stdcall is that even if we write it, as for the function CoCreateInstance4, the disassembler does not spit it out for us. Thus the if statement makes sure that all calling conventions but stdcall get written out.
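
As with the accessibility bits, the character set and the calling convention can also be picked out by masking and then comparing values. The sketch below assumes masks of 0x0006 and 0x0700, which cover the values that GetPinvokeAttributes tests for. The names PinvokeFlags, CharSet and CallingConvention are ours.

```csharp
using System;

public static class PinvokeFlags
{
    // Bits 1 and 2 of the attr field carry the character set.
    public static string CharSet(int attr)
    {
        switch (attr & 0x0006)
        {
            case 0x0002: return "ansi";
            case 0x0004: return "unicode";
            case 0x0006: return "autochar";
            default: return "";
        }
    }

    // Bits 8 to 10 of the attr field carry the calling convention.
    public static string CallingConvention(int attr)
    {
        switch (attr & 0x0700)
        {
            case 0x0100: return "winapi";
            case 0x0200: return "cdecl";
            case 0x0300: return "stdcall";
            case 0x0400: return "thiscall";
            case 0x0500: return "fastcall";
            default: return "";
        }
    }
}
```

An attr value of 0x0106 gives us autochar and winapi, which is what the first method in our il file carries.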

 

We then write out the close bracket and then find out the index of the words pinvok in the returnstring. If you have not forgotten so far, the words pinvokeimpl are preceded with the other attributes. We need to knock off all the other attributes before the pinvokeimpl attribute and thus use the Remove function of the string class.

 

The first parameter is the starting point in the string. We store this pinvokeimpl attribute in the variable methodpinvokestring and this variable will be used later to actually write out the method attributes. We finally check for the last two attributes and then return the value of returnstring that we finally display.
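
The knocking off can be tried out in isolation. This is a minimal sketch and the names PinvokeSlice and FromPinvoke are hypothetical; unlike our function it also guards against the word not being found at all.

```csharp
using System;

public static class PinvokeSlice
{
    // Returns the text from "pinvok" onwards, dropping everything
    // before it. Remove(0, index) deletes index characters starting
    // at position 0.
    public static string FromPinvoke(string attributes)
    {
        int index = attributes.IndexOf("pinvok");
        if (index == -1)
            return "";
        return attributes.Remove(0, index);
    }
}
```

Passing it the string public instance pinvokeimpl("Ole32.dll" autochar winapi) leaves us with only the part that starts at pinvokeimpl.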

 

One of the things that we have explained above is the ability of the CLI to call pre existing native code on a platform, which we call unmanaged code. The platform decides what the rules are and hence they are specific to an operating system.

 

What this entails is deciding a file format so that function pointers to managed code can be called from unmanaged code. What we have seen so far is a way of specifying methods to be implemented in unmanaged code.

 

We also need a way for marking call sites i.e. calling functions through instructions to indicate that the function to be called is actually in unmanaged code. The call and calli instructions will be explained in detail later.

 

What the specification finally specifies is a set of pre defined data types that can be marshaled across irrespective of where the CLI has been implemented. We can however extend this small number of data types using custom attributes and modifiers.

 

These extensions are specific to each platform and are not guaranteed to work across platforms. Let's take up the attributes ansi, autochar and unicode that we have dealt with earlier. First, as before, these attributes are mutually exclusive, and we know that we are repeating ourselves in a manner of speaking.

 

These attributes decide how strings will be passed across or marshaled to the other side. A value of ansi means that the native or unmanaged code will receive or return the string as an ansi string. This normally is the way C/C++ stores strings. The value unicode specifies that the string is in Unicode, the international standard which everyone follows today.

 

The safest is autochar, which chooses whatever is natural for the platform: ansi for Win 95, unicode for Win 2000. The calling conventions specify issues like how the parameters are laid out on the stack, and there are a large number of them.

 

The oldest is cdecl, which is the standard originally followed by the C programming language. Windows programming in C brought in the stdcall calling convention. The this pointer, which will be explained later, introduced thiscall. There are also variations of the C calling conventions like fastcall.

 

To get out of the mess we have platformapi, which says, like autochar, use whatever is appropriate for the platform. Once again we have to use winapi instead of platformapi. These calling conventions are for native code and not for managed code, where we have no choice and take whatever Microsoft gives us.

 

As always, there are two attributes specific to Windows, lasterr and nomangle. In the good old days of C programming we would use lasterr to get at the last error. When Windows first came about, there was a function called MessageBox, amongst others, that gave us what the function name says, a message box.

 

All was well until unicode arrived on the scene. This created a problem as strings had to be passed to this function. The solution was to have two functions, MessageBoxA for the ascii or ansi version and MessageBoxW for the unicode version.

 

I would write my code using the function name MessageBox and, depending upon the platform I ran on, the appropriate A (ascii) or W (widechar) would be added to the function name. Thus the programmer was insulated from knowing anything about ansi or unicode. Remember there is no function called MessageBox in Windows any more.

 

We know all this because we have used computers for a very long time. Thus the attribute nomangle indicates that the name of the function in the dll should be used exactly as we have written it, without adding the ending A or W that would normally happen. We can also call unmanaged functions, as briefly mentioned above, using function pointers.
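
The effect of nomangle on the function name can be pictured with a small sketch. The class EntryPointSketch and its method Resolve are our own hypothetical names and are not part of the disassembler. The sketch is also a simplification, as it assumes the suffix is simply appended, whereas a real runtime may try more than one spelling before giving up.

```csharp
using System;

// Hypothetical helper, not part of the disassembler: it models how the
// final function name follows from nomangle and the character set attribute.
public static class EntryPointSketch
{
    public static string Resolve(string name, string charset, bool nomangle)
    {
        // nomangle: use the name exactly as written, no suffix is added
        if (nomangle)
            return name;
        // ansi selects the A flavour, unicode the W flavour
        if (charset == "ansi")
            return name + "A";
        if (charset == "unicode")
            return name + "W";
        // autochar is decided by the platform, so we leave the name alone here
        return name;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("MessageBox", "ansi", false));    // MessageBoxA
        Console.WriteLine(Resolve("MessageBox", "unicode", false)); // MessageBoxW
        Console.WriteLine(Resolve("MessageBox", "unicode", true));  // MessageBox
    }
}
```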

 

The way we call functions using function pointers is the same for managed and unmanaged functions. All we do is tag the unmanaged function with the pinvokeimpl attribute.

 

There is only one table that stores the information about unmanaged functions that can be called from managed functions using the PInvoke dispatch. To sum up again, each row in the ImplMap table tells us the method row number in the method table and the name of the method in the dll, whose name is specified in the ModuleRef table.

 

Thus each time a call is made to any method, the CLI will first look at this table and if the coded index MemberForwarded matches the method number, it will call the function specified by the field ImportName that resides in the external module specified by the ImportScope field.
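
The lookup described above can be sketched as follows. The struct ImplMapRow and the method FindRow are our own hypothetical names, with the fields named after the columns of the table. We assume that the lowest bit of the MemberForwarded coded index is the table tag, 0 for the field table and 1 for the method table, and that the remaining bits are the row number.

```csharp
using System;

public static class ImplMapSketch
{
    // Assumed shape of one ImplMap row; field names follow the column names
    public struct ImplMapRow
    {
        public int MappingFlags;
        public int MemberForwarded; // coded index: the low bit is the table tag
        public string ImportName;
        public int ImportScope;     // row number in the ModuleRef table
    }

    // Returns the index of the row that forwards the given method row, or -1
    public static int FindRow(ImplMapRow[] rows, int methodrow)
    {
        for (int i = 0; i < rows.Length; i++)
        {
            int tag = rows[i].MemberForwarded & 0x01; // 1 means the method table
            int row = rows[i].MemberForwarded >> 1;   // the rest is the row number
            if (tag == 1 && row == methodrow)
                return i;
        }
        return -1;
    }

    public static void Main()
    {
        ImplMapRow[] rows = new ImplMapRow[1];
        rows[0].MemberForwarded = (3 << 1) | 1; // method row 3
        rows[0].ImportName = "MessageBoxA";
        rows[0].ImportScope = 1;
        int found = FindRow(rows, 3);
        if (found != -1)
            Console.WriteLine(rows[found].ImportName); // MessageBoxA
    }
}
```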

 

Finally there is the MappingFlags field. In the Microsoft world, the calling convention attribute can only have the values winapi, cdecl and stdcall. The values fastcall and thiscall are not allowed. This is for information purposes only, as most of the time ilasm does not like to follow the specs we are reading.
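
As a rough sketch, the MappingFlags field can be broken into the attributes we have been discussing. The class and method names are hypothetical and the bit values are the ones we understand the metadata specification to assign to this field, so do check them against the specification before relying on them. Unlike our real code, this sketch writes out stdcall as well.

```csharp
using System;

public static class MappingFlagsSketch
{
    // Turns a MappingFlags value into the attribute words, in the order
    // nomangle, character set, lasterr and then the calling convention
    public static string Decode(int flags)
    {
        string s = "";
        if ((flags & 0x0001) != 0)
            s = s + "nomangle ";
        int charset = flags & 0x0006;      // two bits hold the character set
        if (charset == 0x0002)
            s = s + "ansi ";
        if (charset == 0x0004)
            s = s + "unicode ";
        if (charset == 0x0006)
            s = s + "autochar ";
        if ((flags & 0x0040) != 0)
            s = s + "lasterr ";
        int conv = flags & 0x0700;         // three bits hold the calling convention
        if (conv == 0x0100)
            s = s + "winapi ";
        if (conv == 0x0200)
            s = s + "cdecl ";
        if (conv == 0x0300)
            s = s + "stdcall ";
        if (conv == 0x0400)
            s = s + "thiscall ";
        if (conv == 0x0500)
            s = s + "fastcall ";
        return s.TrimEnd();
    }

    public static void Main()
    {
        Console.WriteLine(Decode(0x0245)); // nomangle unicode lasterr cdecl
    }
}
```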

 

Program21.csc

public void DisplayAllMethods (int typerow)

{

methodstring = methodstring + "/*06" + methodindex.ToString("X6") + "*/ " ;

string parammarshalstring = "";

parammarshalstring = GetParamAttrforMethodMarshal(methodindex , 0);

Console.WriteLine(methodstring + parammarshalstring);

}

}

 

public string GetParamAttrforMethodMarshal (int methodindex , int seq )

{

string returnstring = "";

if (ParamStruct == null)

return returnstring;

int end;

int start = MethodStruct[methodindex].param;

if ( methodindex == (MethodStruct.Length - 1) )

end = ParamStruct.Length + 1;

else

end =  MethodStruct[methodindex+1].param;

if ( start == ParamStruct.Length)

return returnstring;

if ( start == end)

return returnstring;

if ( seq == 0 && ParamStruct[start].sequence != 0)

return "";

int pattr = ParamStruct[start].pattr;

returnstring = DecodeParamAttributes(pattr , 1 , start , 0x2000);

if ( returnstring != "" && returnstring[0] == 32)

returnstring = returnstring.Remove(0 , 1);

return returnstring;

}

public string DecodeParamAttributes(int pattr , int tabletype , int start , int bytemask)

{

string returnstring = "";

if ( (pattr & bytemask) == bytemask)

{

int ii ;

for ( ii = 1 ; ii <= FieldMarshalStruct.Length ; ii++)

{

int coded = FieldMarshalStruct[ii].coded;

int table = FieldMarshalStruct[ii].coded & 0x01;

coded = coded >> 1;

if ( coded == start && tabletype == table)

break;

}

int blobindex = FieldMarshalStruct[ii].index;

int length , howmanybytes;

howmanybytes = CorSigUncompressData(blob , blobindex, out length);

//Console.WriteLine("{0} {1} {2} {3}" ,blob[blobindex].ToString("X") , blob[blobindex+1].ToString("X"),blob[blobindex+2].ToString("X"),blob[blobindex+3].ToString("X") );

int blobvalue = blob[blobindex+howmanybytes];

string ss1 = GetMarshallType(blob[blobindex+howmanybytes] , howmanybytes , blobindex);

if ( ss1 == "[]" || ss1.IndexOf("[ + ") != -1 || ss1 == "" || ( ss1.Length >= 2 && ss1[0] == '[' && ss1[ss1.Length-1] == ']' ))

returnstring = " marshal(" + ss1;

else

returnstring = " marshal( " + ss1;

 

returnstring = returnstring + ")";

}

if ( returnstring != "")

returnstring = returnstring + " ";

return returnstring ;

}

public string GetMarshallType (byte marshalflags , int howmanybytes , int blobindex)

{

//Console.WriteLine("...{0} {1} {2} {3} {4}" ,blob[blobindex] , blob[blobindex+1].ToString("X"), blob[blobindex+2].ToString("X"), blob[blobindex+3].ToString("X") , blob[blobindex+4].ToString("X") );

if ( blob[blobindex] == 0)

return "";

if ( marshalflags == 0x01)

return "void";

if ( marshalflags == 0x02)

return "bool";

if ( marshalflags == 0x03)

return "int8";

if ( marshalflags == 0x04)

return "unsigned int8";

if ( marshalflags == 0x05)

return "int16";

if ( marshalflags == 0x06)

return "unsigned int16";

if ( marshalflags == 0x07)

return "int32";

if ( marshalflags == 0x08)

return "unsigned int32";

if ( marshalflags == 0x09)

return "int64";

if ( marshalflags == 0x0a)

return "unsigned int64";

if ( marshalflags == 0x0b)

return "float32";

if ( marshalflags == 0x0c)

return "float64";

if ( marshalflags == 0x0D)

return "syschar";

if ( marshalflags == 0x0e)

return "variant";

if ( marshalflags == 0x0f)

return "currency";

if ( marshalflags == 0x10)

return "*";

if ( marshalflags == 0x11)

return "decimal";

if ( marshalflags == 0x12)

return "date";

if ( marshalflags == 0x13)

return "bstr";

if ( marshalflags == 0x14)

return "lpstr";

if ( marshalflags == 0x15)

return "lpwstr";

if ( marshalflags == 0x16)

return "lptstr";

if ( marshalflags == 0x17)

{

int uncompressedbyte;

CorSigUncompressData(blob , blobindex+howmanybytes+1 , out uncompressedbyte);

return "fixed sysstring [" + uncompressedbyte.ToString() + "]";

}

if ( marshalflags == 0x18)

return "objectref";

if ( marshalflags == 0x19)

return "iunknown";

if ( marshalflags == 0x1a)

return "idispatch";

if ( marshalflags == 0x1b)

return "struct";

if ( marshalflags == 0x1c)

return "interface";

if ( marshalflags == 0x1d)

{

string returnstring = "safearray";

if ( blob[blobindex] > 1)

{

string dummy = GetSafeArrayType(blob[blobindex+howmanybytes+1]);

if ( dummy != "")

returnstring = returnstring + " " + dummy;

}

int len = blob[blobindex] - 3;

if ( len > 0)

{

returnstring = returnstring + ", \""  ;

for ( int iii = 0 ; iii < len ; iii++)

returnstring = returnstring + (char)blob[blobindex+iii+howmanybytes+3] ;

returnstring = returnstring + "\""  ;

}

return returnstring;

}

if ( marshalflags == 0x1e)

{

int uncompressedbyte;

CorSigUncompressData(blob , blobindex+howmanybytes+1 , out uncompressedbyte);

return "fixed array [" + uncompressedbyte.ToString() + "]";

}

if ( marshalflags == 0x1f)

return "int";

if ( marshalflags == 0x20)

return "unsigned int";

if ( marshalflags == 0x21)

return "nested struct";

if ( marshalflags == 0x22)

return "byvalstr";

if ( marshalflags == 0x23)

return "ansi bstr";

if ( marshalflags == 0x24)

return "tbstr";

if ( marshalflags == 0x25)

return "variant bool";

if ( marshalflags == 0x26)

return "method";

if ( marshalflags == 0x27)

return "";

if ( marshalflags == 0x28)

return "as any";

if ( marshalflags == 0x29)

return "";

if ( marshalflags == 0x2a)

{

/*

for ( int i = 0 ; i <= blob[blobindex] ; i++)

Console.Write("{0} " , blob[blobindex+i].ToString("X"));

Console.WriteLine();

*/

string returnstring = "";

string arrays = "[]";

string dummy1 = "";

if ( blob[blobindex] == 3)

{

dummy1 = " ";

arrays = "[ + " + blob[blobindex+2+howmanybytes].ToString() + "]";

}

if ( blob[blobindex] == 4)

{

dummy1 = "";

if ( blob[blobindex+2+howmanybytes] != 0)

arrays = "[" + blob[blobindex+3+howmanybytes].ToString() + " + " + blob[blobindex+2+howmanybytes].ToString() +  "]";

else

arrays = "[" + blob[blobindex+3+howmanybytes].ToString() + "]";

}

if ( blob[blobindex] >= 7)

{

int howmanytypes = blob[blobindex]/3;

returnstring = GetMarshallType(blob[blobindex+howmanybytes+howmanytypes] ,howmanybytes , blobindex);

if ( blob[blobindex+1+howmanybytes+howmanytypes] != 0)

arrays = "[" + blob[blobindex+2+howmanybytes+howmanytypes].ToString() + " + " + blob[blobindex+1+howmanybytes+howmanytypes].ToString() +  "]";

else

arrays = "[" + blob[blobindex+2+howmanybytes+howmanytypes].ToString() + "]";

returnstring = returnstring + arrays ;

for ( int i = 1 ; i < howmanytypes ; i++)

{

if ( blob[blobindex+howmanybytes+howmanytypes+i*2+2] == 0)

returnstring = returnstring + " " + GetMarshallType(blob[blobindex+howmanybytes+howmanytypes+i*2+1] ,howmanybytes , blobindex);

else

returnstring = returnstring + " " + GetMarshallType(blob[blobindex+howmanybytes+howmanytypes+i*2+2] ,howmanybytes , blobindex);

}

return returnstring;

}

if ( blob[blobindex+howmanybytes+1] == 0x50)

returnstring = arrays;

else

returnstring = dummy1 + GetMarshallType(blob[blobindex+howmanybytes+1] ,howmanybytes , blobindex) + arrays;

return returnstring;

}

if ( marshalflags == 0x2b)

return "lpstruct";

if ( marshalflags == 0x2c)

{

int len = 0;

int howmanybytes1 = 0;

howmanybytes1 = CorSigUncompressData(blob , blobindex + howmanybytes+3 , out len );

string returnstring = "custom (\"";

for ( int ii1 = 0 ; ii1 < len ; ii1++)

returnstring = returnstring +  (char)blob[blobindex+3+ii1+howmanybytes+howmanybytes1];

returnstring = returnstring + "\"" + "," ;

int len1 = len;

int bytes = 1;

if ( len1 >= 128)

bytes = 2;

howmanybytes1 = CorSigUncompressData(blob , blobindex + howmanybytes+3+len1+bytes , out len );

returnstring = returnstring + "\"" ;

for ( int ii1 = 1 ; ii1 <= len ; ii1++)

returnstring = returnstring +  (char)blob[blobindex+3+len1+ii1+howmanybytes+howmanybytes1];

returnstring = returnstring +  "\")";

return returnstring;

}

if ( marshalflags == 0x2d)

return "error";

return "Unknown";

}

public string GetSafeArrayType (byte safearraytype)

{

string returnstring = "";

if (safearraytype == 0)

returnstring = "";

if (safearraytype == 1)

returnstring = "null";

if (safearraytype == 2)

returnstring = "int16";

if (safearraytype == 3)

returnstring = "int32";

if (safearraytype == 4)

returnstring = "float32";

if (safearraytype == 5)

returnstring = "float64";

if (safearraytype == 6)

returnstring = "currency";

if (safearraytype == 7)

returnstring = "date";

if (safearraytype == 8)

returnstring = "bstr";

if (safearraytype == 9)

returnstring = "idispatch";

if (safearraytype == 0x0a)

returnstring = "error";

if (safearraytype == 0x0b)

returnstring = "bool";

if (safearraytype == 0x0c)

returnstring = "variant";

if (safearraytype == 0x0d)

returnstring = "iunknown";

if (safearraytype == 0x0e)

returnstring = "decimal";

if (safearraytype == 0x0f)

returnstring = "illegal";

if (safearraytype == 0x10)

returnstring = "int8";

if (safearraytype == 0x11)

returnstring = "unsigned int8";

if (safearraytype == 0x12)

returnstring = "unsigned int16";

if (safearraytype == 0x13)

returnstring = "unsigned int32";

if (safearraytype == 0x14)

returnstring = "int64";

if (safearraytype == 0x15)

returnstring = "unsigned int64";

if (safearraytype == 0x16)

returnstring = "int";

if (safearraytype == 0x17)

returnstring = "unsigned int";

if (safearraytype == 0x18)

returnstring = "void";

if (safearraytype == 0x19)

returnstring = "hresult";

if (safearraytype == 0x1a)

returnstring = "*";

if (safearraytype == 0x1b)

returnstring = "safearray";

if (safearraytype == 0x1c)

returnstring = "carray";

if (safearraytype == 0x1d)

returnstring = "userdefined";

if (safearraytype == 0x1e)

returnstring = "lpstr";

if (safearraytype == 0x1f)

returnstring = "lpwstr";

if (safearraytype == 0x20)

returnstring = "illegal";

if (safearraytype == 0x21)

returnstring = "illegal";

if (safearraytype == 0x22)

returnstring = "illegal";

if (safearraytype == 0x23)

returnstring = "illegal";

if (safearraytype == 0x24)

returnstring = "record";

if (safearraytype >= 0x25)

returnstring = "illegal";

return returnstring;

}

 

e.il

.class a1

{

.method  void marshal () a1()

{

}

.method  void marshal ( int8) a2()

{

}

.method  void marshal ( fixed sysstring [12]) a3()

{

}

.method  void marshal ( safearray) a4()

{

}

.method  void marshal ( safearray int8) a5()

{

}

.method  void marshal ( safearray int8 , "hi") a6()

{

}

.method  void marshal ( int16 [+4] ) a7()

{

}

.method  void marshal ( int16 [] ) a8()

{

}

.method  void marshal ( int8 [0]  ) a9()

{

}

.method  void marshal ([7]  ) a10()

{

}

.method  void marshal (custom ("AB" , "CDEF")  ) a11()

{

}

.method  void marshal (custom ("AB" , "")  ) a11()

{

}

.method  void marshal (int8 [4 + 5] ) a12()

{

}

}

 

Output

  .method /*06000001*/ marshal()

  .method /*06000002*/ marshal( int8)

  .method /*06000003*/ marshal( fixed sysstring [12])

  .method /*06000004*/ marshal( safearray)

  .method /*06000005*/ marshal( safearray int8)

  .method /*06000006*/ marshal( safearray int8, "hi")

  .method /*06000007*/ marshal( int16[ + 4])

  .method /*06000008*/ marshal( int16[])

  .method /*06000009*/ marshal( int8[0])

  .method /*0600000A*/ marshal([7])

  .method /*0600000B*/ marshal( custom ("AB","CDEF"))

  .method /*0600000C*/ marshal( custom ("AB",""))

  .method /*0600000D*/ marshal( int8[4 + 5])

 

e.il

.class a1

{

.method  void marshal (int16 [4+5] [2] ) a122()

{

}

.method  void marshal (int32 [4] [2][3] ) a123()

{

}

.method  void marshal (unsigned int8 [4+5] [2][3][4] ) a124()

{

}

.method  void marshal (unsigned int16 [+5] [2][3][4][5] ) a125()

{

}

}

 

Output

  .method /*06000001*/ marshal( int16[4 + 5] bool)

  .method /*06000002*/ marshal( int32[4] bool int8)

  .method /*06000003*/ marshal( unsigned int8[4 + 5] bool int8 unsigned int8)

  .method /*06000004*/ marshal( unsigned int16[0 + 5] bool int8 unsigned int8 int16)

 

 

In this program we display the marshal keyword. As explained before, we have broken up the various parts that comprise a method declaration and are doing each one separately. We are not displaying the entire function DisplayAllMethods as most of the code is repetitive.

 

All that we would like you to do is remove the last two lines at the end, i.e. after the methodstring variable, and replace them with what we have above. We are calling a function GetParamAttrforMethodMarshal that returns the marshal attribute. But wait, we are moving ahead; first let's be clear on what this marshal is all about.

 

It would be ideal if the CLI ran on its own and did not need to be hosted on top of another operating system. We are running .Net or the CLI under an operating system, Windows 2000. Under such operating systems some data types have a specific meaning or perform certain functions.

 

Thus we need a way to convert the built-in or our user-defined data types to the native data types of that operating system. This marshalling information is specified using the keyword marshal. Every function may have a return type, which is the value returned by the native code in that operating system, and we can use the marshal keyword to convert it into a CLI data type.

 

Thus the marshal keyword tells us the original data type in the operating system, which will be converted into the CLI data type that we have specified. If we did not have the marshal keyword, how would our CLI know what the return value is? It would assume the return value of the native code is equivalent to our function's return type.

 

The marshal keyword comes in to tell the CLI that the native function will return a certain data type and that we need to convert this data type to the data type that our IL function requires. The same holds good for parameters to functions, in a slightly different way. Here we need to specify what the CLI data type needs to be converted to, as the marshal keyword specifies the data type that the native function expects.

 

This is just the reverse of what we explained earlier. This means that in our function DisplayAllMethods, the GetParamAttrforMethodMarshal function gets called more than once, hence the different parameters, and the complexity of our code increases. The first parameter is the method row number, as this is what identifies each method uniquely in the metadata.

 

The second parameter is what we call the sequence number. This value is either 0 or 1. At this point in time its value is 0 and we will see its use a little later. A little while ago we explained how a type can have more than one method and how a single field can tell us which method is owned by which type.

 

This is a one-to-many relationship, which also applies to methods and parameters as one method can have many parameters. The param field stores the starting parameter index in the param table for each method. When you have a good idea, it's nice to use it everywhere. Thus we get two variables, start and end, that tell us the first and last param index owned by this method in the param table.

 

We use these values for error checks only. If the value of start points to the end of the param table, it means that some methods do have parameters, as the ParamStruct table is not empty, but this method and the ones following it have no parameters at all and also no marshal keyword. The second error check is for when both the start and end variables have the same value.

 

This means that this method has no parameters at all. The question uppermost in your mind is what parameters have to do with the marshal keyword. At this point we are talking not about the marshalling of parameters but that of the return value. It does not matter. If the return value has a marshal keyword, a row gets added to the param table.

 

But this method does not have a parameter at all and thus the sequence field of the param table, which otherwise tells us the param number, is now zero. We have one last error check, which handles those cases where we have parameters but no marshal keyword for the return type.

 

This check says that if seq is 0, we are checking for the return type being marshaled, and thus the sequence field must also be 0. We now need to decode the marshal keyword and use the method DecodeParamAttributes to do the job for us. As both methods and fields can be marshaled, this function gets called more than once with different parameters.

 

It is this function that actually gives us the marshal keyword, and we first check whether the first character of the return string is a space using the read-only indexer. If it is, we remove this space. We do this because at times we do not need the first space that we write out before the keyword marshal.

 

If you look closely at the function DecodeParamAttributes, the marshal keyword starts with a space, and for methods we remove this space. The return string may be empty, and hence before accessing its first character we need to make sure that it is not empty. Let us now move on to the function DecodeParamAttributes, which does the bulk of the work in figuring out the marshal keyword.

 

The first and last parameters are used together. The Param table has a flags field that tells us all about the attributes on a parameter. The return value is also taken to be a parameter with a sequence number of 0. A bit mask of 0x2000 means that this parameter has a marshal attribute.

 

Therefore the last parameter also has a value of 0x2000. If a field has a marshal attribute associated with it, the bit mask is different. Thus for any parameter that has a marshal attribute, the pattr field of the param table will have the bit 0x2000 set.
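
A minimal sketch of this bit test follows. The names are hypothetical, the table type follows the convention of our code where 1 stands for the param table and 0 for the field table, and the value 0x1000 for fields is what we understand the field flags to use.

```csharp
using System;

public static class MarshalBitSketch
{
    const int ParamHasMarshal = 0x2000; // bit in the flags of the param table
    const int FieldHasMarshal = 0x1000; // assumed bit in the flags of the field table

    // tabletype: 1 for the param table, 0 for the field table
    public static bool HasMarshal(int flags, int tabletype)
    {
        int mask = (tabletype == 1) ? ParamHasMarshal : FieldHasMarshal;
        return (flags & mask) == mask;
    }

    public static void Main()
    {
        Console.WriteLine(HasMarshal(0x2001, 1)); // True
        Console.WriteLine(HasMarshal(0x0001, 1)); // False
        Console.WriteLine(HasMarshal(0x1006, 0)); // True
    }
}
```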

 

This is why we start the function by checking whether the pattr parameter has the bit 0x2000 set, i.e. whether there is a marshal attribute associated with the parameter or, in this case, the return value. If the answer is in the affirmative, we scan the FieldMarshal table, which has one row for every marshal keyword used.

 

This table uses a coded index that has a single bit that points to the field or param table. Looking at our code, the value of 1 means the param table and 0 means the field table. The second parameter to this function is this table type, and the third is the row number in the param table of this parameter or return value.

 

Thus to find a matching row in the FieldMarshal table we need to check the param or field row number as well as the table type at the same time. If we find a match, we then use the index variable ii to get at the index field, which is nothing but the offset of the marshal signature in the blob stream.

 

The variable blobindex now holds an offset into the blob stream, where every entry as always starts with the number of bytes it owns. This is because a blob is something that has no defined or known structure. We first need to know two things, the number of bytes that make up the blob marshal signature and its value.

 

We use our good old trusty method CorSigUncompressData which, in case you have forgotten, returns the number of bytes that the compressed length occupies and, through the last out parameter, the length itself, i.e. the number of bytes that make up the signature. Normally the marshal blob signature will not be larger than 127 bytes, and in the following code at times we have directly read the length and not used the CorSigUncompressData method.
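
For those who have forgotten how such a length is stored, here is a minimal sketch of the decoding. The method Uncompress is our own stand-in and is not the CorSigUncompressData used in the program, though it is meant to behave the same way: the top bits of the first byte tell us whether the number takes 1, 2 or 4 bytes.

```csharp
using System;

public static class CompressedLengthSketch
{
    // Returns how many bytes the compressed number occupies and
    // places the decoded number in value
    public static int Uncompress(byte[] data, int index, out int value)
    {
        if ((data[index] & 0x80) == 0)          // 0xxxxxxx: one byte
        {
            value = data[index];
            return 1;
        }
        if ((data[index] & 0xC0) == 0x80)       // 10xxxxxx: two bytes
        {
            value = ((data[index] & 0x3F) << 8) | data[index + 1];
            return 2;
        }
        // 110xxxxx: four bytes
        value = ((data[index] & 0x1F) << 24) | (data[index + 1] << 16)
              | (data[index + 2] << 8) | data[index + 3];
        return 4;
    }

    public static void Main()
    {
        int value;
        int size = Uncompress(new byte[] { 0x03 }, 0, out value);
        Console.WriteLine("{0} {1}", size, value);   // 1 3
        size = Uncompress(new byte[] { 0x80, 0x80 }, 0, out value);
        Console.WriteLine("{0} {1}", size, value);   // 2 128
    }
}
```

This is why a length of up to 127 can safely be read as a single byte, and anything from 128 onwards cannot.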

 

We can, like you, get sloppy at times. That is why we ask you to also make changes in the code and treat it like a joint venture between you and us. We then have a comment that displays the first four bytes of the marshal signature. Where we use this will become clear in a short while.

 

Stay with us as you will then appreciate how we get things working. We call a function GetMarshallType that takes three parameters. It is actually this function that decodes the blob signature. The marshal signature starts with the length byte and then a byte that describes the actual data type which will be marshaled to at the other end.

 

It is this byte that we pass to the GetMarshallType method, followed by the number of bytes that the length takes and the starting blob index. The marshal string we build from its return value starts with a space which, if you remember, we may remove in the earlier function.

 

If the marshal signature is the empty array braces [] or contains an array and a plus sign, the powers that be decided that the space after the open bracket following the word marshal should not be there. Thus the if statement removes a pesky space. We have spent a lot of if statements removing a space or adding a space. What a life.

 

We then close the bracket and, if there is a valid marshal signature, add a space at the end of it. See, one more if statement. Now let's move on to the GetMarshallType function. The first thing we do in this function is check the length of the blob signature.

 

If it is zero, then we have no signature and bail out. This means that we have the marshal keyword but no data type within brackets like in method a1 in the il file. This for some reason is a valid marshal keyword. Normally we use the marshal keyword as in function a2 where we specify a data type like int8.

 

Thus we now have a series of if statements that check the value of the first parameter, marshalflags, against a predefined set of values. A value of 0x17 stands for the data type fixed sysstring, where we have to supply a number in square brackets, as in method a3. While most of the data types are simple, a value of 0x1d is a little more complex as it stands for safearray.

 

The safearray data type may or may not be followed by a data type, as in methods a4 and a5. We first call a method GetSafeArrayType, to which we pass the byte following the length and the marshal type, i.e. the third byte of the blob signature. This function once again checks the byte against a predefined set of values and returns the safe array data type.

 

There is nothing in this function that will surprise us, as it is a long series of if statements. We only add an extra space if the GetSafeArrayType method returns a data type. The if statement is important as we have a safearray data type only if the length of the marshal signature is larger than 1. We then tell ourselves that the safearray data type can have a comma followed by a string in double inverted commas.

 

This is unique to the safearray data type and will occur only if the length is larger than 3. The 3 bytes are taken by the length, the safearray data type and the data type following the safe array. We thus reduce the length byte by 3 and if this result is larger than 0, we print out the string in double quotes with a comma. The value 0x1e is for a fixed array which is similar to the fixed sysstring.

 

A value of 0x2a is in our opinion the most complex as it deals with an array. An array starts with a data type and then may have a number in brackets. The data type however is optional. Thus an array signature that is three bytes large would mean that the first byte is the length, which we do not count, the second the array data type value 0x2a, the third the data type before the array and the last the number in brackets with the plus sign.

 

This is represented by the method a7.

However for the method a8 the length is 2 bytes, the byte 0x2a followed by the array data type. This is because the brackets have no value. For function a9 we have a length of four, as the first two bytes are the same as above and are followed by two zeroes. The last method a10 is special as we have no data type.

 

Here the first two bytes are the same, followed by a 0x50, which means no data type, then a zero and then the number in the brackets. In our code the variable dummy1 simply contains a space or an empty string depending upon the length of the blob signature. A value of 3 means a space, 4 no space. The string arrays holds the number in array brackets, along with the plus sign where needed, with or without the space before the plus sign.

 

The default is the empty array brackets. This number is either the second or third byte of the blob signature depending upon the length of the signature. The first byte after the marshal data type is always the data type of the array and thus we use the GetMarshallType function to return this value to us.

 

Once again a use of recursion, as we are in the same function and calling the same function. Let's take function a12 to demonstrate all the bytes of a blob signature for an array. We have added two numbers, 4 and 5, together. The blob signature reads as 4 2A 3 5 4. The first as always is the length of the blob signature without considering the length byte itself.

 

This is followed by the array type 0x2a. Then comes the data type of the array, int8 or 3. This is followed by the two numbers in brackets, 5 and 4. Let's now turn our attention to the second il file and understand the complexities here and why we have broken our array examples into two separate il files.
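
The single dimension cases can be put together in a short sketch. The class ArrayBlobSketch is hypothetical and much smaller than our GetMarshallType: it assumes the length fits in one byte, knows only the element types used in e.il and leaves out the extra spaces that the real code juggles. The bytes other than those of a12 are the ones described in the text above.

```csharp
using System;

public static class ArrayBlobSketch
{
    // Only the handful of element types used in e.il; 0x50 means no data type
    static string ElementName(byte b)
    {
        if (b == 0x03) return "int8";
        if (b == 0x05) return "int16";
        if (b == 0x07) return "int32";
        return "";
    }

    // blob[0] is the length, blob[1] is 0x2a and blob[2] is the element type
    public static string Decode(byte[] blob)
    {
        string arrays = "[]";                       // length 2: empty brackets
        if (blob[0] == 3)                           // only the number after the plus
            arrays = "[ + " + blob[3] + "]";
        if (blob[0] == 4)                           // both numbers are present
        {
            if (blob[3] != 0)
                arrays = "[" + blob[4] + " + " + blob[3] + "]";
            else
                arrays = "[" + blob[4] + "]";
        }
        return ElementName(blob[2]) + arrays;
    }

    public static void Main()
    {
        Console.WriteLine(Decode(new byte[] { 4, 0x2A, 3, 5, 4 }));    // int8[4 + 5]
        Console.WriteLine(Decode(new byte[] { 3, 0x2A, 5, 4 }));       // int16[ + 4]
        Console.WriteLine(Decode(new byte[] { 4, 0x2A, 0x50, 0, 7 })); // [7]
    }
}
```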

 

We start our array if statement with a for loop that simply displays the array signature using the first byte as the length. In our program we have commented it out, but this is the way we have seen the blob signature. For the first function a122 the marshal keyword is marshal (int16 [4+5] [2] ).

 

We have added a 2 in square brackets after the first array gets over. This results in a blob signature that looks like 7 2A 2A 5 5 4 0 2. The length, which is normally 4, gets increased by 3. The three bytes are made up of an extra 0x2a as well as a 0 and the number 2 which we wrote.

 

If you now look at function a123 which has the marshal keyword as marshal (int32 [4] [2][3] ), we have added two array dimensions to it. The bytes it creates are A 2A 2A 2A 7 0 4 0 2 0 3. There is a further increase of 3 bytes with a 0x2a in the beginning and a 0 and the number we write 3 at the end.

 

Thus each extra dimension adds a 0x2a at the beginning as well as two bytes representing the dimension at the very end. We now need to handle this special case. As mentioned earlier, the length is normally 3 or 4. If it is 7 or more, we kick off a new if statement. We start by figuring out how many dimensions we have.

 

This number can be obtained by dividing by 3 as each new dimension increases the length by 3. We store this value in the variable howmanytypes. The main data type of the array is stored after all the 0x2a’s get over and thus we now need the offset of where the array data type starts.

 

We start at blobindex and then add the number of bytes this length takes and then the number of 0x2a’s that we have. We do not add 1 as there is always one 0x2a. We use the GetMarshallType method to return this data type as a string and then use the same approach as earlier to get at the first array dimension with or without the plus sign.
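
The arithmetic is easy to check by hand on the two signatures shown earlier. The sketch below uses a hypothetical method ElementTypeOffset that repeats the calculation from our code: the length divided by 3 gives howmanytypes, and adding it to blobindex and howmanybytes lands us on the data type of the array.

```csharp
using System;

public static class ArrayOffsetSketch
{
    // Offset of the element type in a signature with extra dimensions
    public static int ElementTypeOffset(byte[] blob, int blobindex, int howmanybytes)
    {
        int howmanytypes = blob[blobindex] / 3; // each extra dimension adds 3 bytes
        return blobindex + howmanybytes + howmanytypes;
    }

    public static void Main()
    {
        // marshal (int16 [4+5] [2] ): 7 2A 2A 5 5 4 0 2
        byte[] a122 = { 7, 0x2A, 0x2A, 5, 5, 4, 0, 2 };
        int offset = ElementTypeOffset(a122, 0, 1);
        Console.WriteLine("{0} {1}", offset, a122[offset]); // 3 5, i.e. int16

        // marshal (int32 [4] [2][3] ): A 2A 2A 2A 7 0 4 0 2 0 3
        byte[] a123 = { 0x0A, 0x2A, 0x2A, 0x2A, 7, 0, 4, 0, 2, 0, 3 };
        offset = ElementTypeOffset(a123, 0, 1);
        Console.WriteLine("{0} {1}", offset, a123[offset]); // 4 7, i.e. int32
    }
}
```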

 

We then need to add the final data types. The variable howmanytypes gives us one more than the number of these data types and hence the for loop runs one less. We use the variable howmanytypes to get at the end of the normal signature and then we hit a roadblock.

 

The length of the array signature can be 3 or 4 and hence if we hit a 0, we know that the length is 3 and hence we increase by 1 and not 2. We display these dimensions as data types. We now move back to the original file e.il.

 

If the first data type is 0x50, this means that we do not have a data type and hence the returnstring is only the square brackets. The last value is the custom data type, which is nothing but the word custom followed by two strings. The format is the length of the signature, followed by 0x2c and then two zeros.

 

This is followed by the length of the first string, then the actual string. When this string gets over, we then have the second string and its contents. Thus we first figure out the length of the first string that is 3 bytes from the data type and in a for loop print out the string.

 

We then jump to the length of the second string, knowing that the length of the first string is stored in the variable len1 and that we need to go past its length prefix as well, hence we add the value stored in the variable bytes. This can be 1 or 2 depending upon the length of the first string.

 

If that length is 128 or more, its length prefix will take 2 bytes and not 1, and hence if we assumed a value of 1 we would be reading the last byte of the first string as the length byte. The rest of the code remains the same.
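
The layout of the custom signature can be walked with a running position instead of the separate variable bytes that our code uses. This is only a sketch with hypothetical names: ReadLength handles just the 1 and 2 byte forms of a length, and the sample bytes for custom ("AB" , "CDEF") are built by hand from the format described above rather than dumped from a real file.

```csharp
using System;
using System.Text;

public static class CustomMarshalSketch
{
    // Reads a compressed length (1 or 2 byte forms only) and returns its size
    static int ReadLength(byte[] data, int index, out int value)
    {
        if ((data[index] & 0x80) == 0)
        {
            value = data[index];
            return 1;
        }
        value = ((data[index] & 0x3F) << 8) | data[index + 1];
        return 2;
    }

    public static string Decode(byte[] blob)
    {
        int len;
        int pos = ReadLength(blob, 0, out len);  // step over the signature length
        pos = pos + 3;                           // step over 0x2c and the two zeros
        pos = pos + ReadLength(blob, pos, out len);
        string first = Encoding.ASCII.GetString(blob, pos, len);
        pos = pos + len;                         // now at the second length
        pos = pos + ReadLength(blob, pos, out len);
        string second = Encoding.ASCII.GetString(blob, pos, len);
        return "custom (\"" + first + "\",\"" + second + "\")";
    }

    public static void Main()
    {
        byte[] blob = { 0x0B, 0x2C, 0, 0, 2, 0x41, 0x42, 4, 0x43, 0x44, 0x45, 0x46 };
        Console.WriteLine(Decode(blob)); // custom ("AB","CDEF")
    }
}
```

Because the position moves past each length prefix by however many bytes it actually took, the 1 or 2 byte question takes care of itself.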

 

There is a minimum set of data types that have to be supported by the CLI and these are int8, int16, unsigned int8, bool, char and all the native integer data types. Remember under Windows all code is written in the C programming language.

 

The list of data types goes on and this includes enums that are glorified constants and the floating point data types float32 and float64. Even though C and C++ do not support the string data type, it is common enough to be included in the list of mandatory data types.

 

Obviously pointers to the above data types are also included along with one dimensional arrays that start counting from zero. These conversions are from managed to unmanaged and need not be supported from unmanaged to managed i.e. for the return types. Delegates and pointers to functions are not the same and thus a delegate cannot be used in unmanaged code.

 

The marshal keyword is the only interoperable keyword available and lets us work closely with older legacy code. It is platform specific and will not work across the board. This means that the windows implementation will never ever work say on Linux. Once again the marshal keyword specifies the data type that the managed code will be converted to when it goes to unmanaged code.

 

The system however has a lot of default rules that govern what happens when we do not use this keyword. The problem arises with the use of user defined types or classes and the CLI does not require this marshalling from all its conforming implementations. 

 

Each implementation decides how to marshal user defined types and the system imposes no restrictions. This will guarantee that code generated will not be portable but it is a price to pay as user defined types are too generic for anyone to impose rules. The FieldMarshal table has only columns which we have used earlier.

 

It is obvious that this table is only used by code that calls into unmanaged code. Once the code calls unmanaged code, we are lying outside the regime of the CLI and thus we are assuming that the code we call does not break any rules.

 

The question uppermost in your mind is how did we figure out all the data types that we could use with marshal. Elementary, you might say, peek into the specs. We did just that and realized that there were huge gaps in the data types that were specified in the docs. Thus we were in a quandary.

 

How do we figure out all of them? Even though we went through 5000 files, we were yet not sure whether we had it all sewed up. Thus we first displayed the marshal signature bytes and then searched for them in a hex editor like UltraEdit that can be downloaded free from the net.

 

These bytes will always be after the BSJB signature. Now that we have found the bytes, we change the byte that contains, say, the data type and then save the file. We next call the ildasm program which now tells us what data type that value stood for. Simple, is it not? This is how we could figure out what the specs did not contain.
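
The search can also be scripted instead of being done by hand in a hex editor. What follows is only a sketch under our own assumptions: the helper IndexOf, the class name and the sample bytes are made up for illustration and are not part of our disassembler. It finds the BSJB signature in a byte array and then looks for the signature bytes that follow it.

```csharp
using System;

static class PatchSearchSketch
{
    // Returns the offset of the first occurrence of needle in data
    // at or after start, or -1 if it is not found.
    public static int IndexOf(byte[] data, byte[] needle, int start)
    {
        for (int i = start; i <= data.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && data[i + j] == needle[j])
                j++;
            if (j == needle.Length)
                return i;
        }
        return -1;
    }

    static void Main()
    {
        // Made-up stand-in for a file image; a real run would read the dll from disk.
        byte[] image = { 0x00, 0x2A, 0x14, 0x42, 0x53, 0x4A, 0x42, 0x01, 0x00, 0x2A, 0x14, 0x00 };
        byte[] bsjb = { 0x42, 0x53, 0x4A, 0x42 };   // the letters B S J B
        byte[] wanted = { 0x2A, 0x14 };             // hypothetical bytes we displayed earlier

        int metadataStart = IndexOf(image, bsjb, 0);
        int hit = IndexOf(image, wanted, metadataStart);
        Console.WriteLine("BSJB at {0}, signature bytes at {1}", metadataStart, hit);
    }
}
```

Starting the second search at the BSJB offset is what skips the identical bytes that happen to sit earlier in the file.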

 

We are telling you all this as this is the best way to learn. Change the bytes in the table itself and see what the disassembler has to say about the change. This is why everywhere you will find us displaying the bytes.

 

Program22.csc

public void DisplayAllMethods (int typerow)

{

methodstring = methodstring + "/*06" + methodindex.ToString("X6") + "*/ " ;

string paramattrstring = "";

paramattrstring = GetParamAttrforMethodCalling (methodindex);

Console.WriteLine(methodstring + paramattrstring);

}

}

 

public string GetParamAttrforMethodCalling (int methodindex)

{

string returnstring = "";

if (ParamStruct == null)

return returnstring;

int end;

int start = MethodStruct[methodindex].param;

if ( methodindex == (MethodStruct.Length -1) )

end = ParamStruct.Length + 1;

else

end =  MethodStruct[methodindex+1].param;

if ( start == ParamStruct.Length)

return returnstring;

if ( start == end)

return returnstring;

if (ParamStruct[start].sequence != 0)

return "";

int pattr = ParamStruct[start].pattr;

if ( (pattr & 0x01) == 0x01)

returnstring = returnstring + "[in]" ;

if ( (pattr & 0x02) == 0x02)

returnstring = returnstring + "[out]" ;

if ( (pattr & 0x10) == 0x10)

returnstring = returnstring + "[opt]" ;

if ( returnstring != "")

returnstring = returnstring + " ";

return returnstring ;

}

 

 

e.il

.class a1

{

.method  [in] void a1()

{

}

.method  [out][opt] void a2()

{

}

}

 

 

Output

  .method /*06000001*/ [in]

  .method /*06000002*/ [out][opt]

 

Parameters and return values can have a parameter attribute of in, out or optional. These are part of the parameter definition and not part of the method signature. The above attributes are associated with parameters and not really with return values. The in and out attributes apply to pointers of either managed or unmanaged types. 

 

All that they say is whether the parameter supplies a value to the function in or the function fills it up with a value or both. The default is in. However the CLI does not worry about whether this contract is being enforced.

 

This helps the CLI in optimizations for distributed computing. For an in parameter, the value needs to be sent across to another computer if the called function resides there and we do not worry about the return value. For an out parameter it is the reverse, we do not send a value across but the return value is meaningful.

 

The opt value means that from the programmer's point of view, the value is optional. A little later we will deal with the .param keyword that will supply opt parameters with values. The method DisplayAllMethods now calls a method GetParamAttrforMethodCalling to figure out these values.

 

In this method we start with the mandatory error checks and then check whether we have an attribute or not. If the sequence field is not zero we abort with an empty string as this means our return value does not have any param attributes.

 

We then bitwise AND with 1, 2 or 0x10 to check which of the 3 param attributes we have. The only thing to remember is that these attributes are not mutually exclusive. A small program after a long time. Do not expect such small mercies for a long time.
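
To see that the three attributes can be combined, the bit tests can be tried on their own. This is a small sketch; the class name ParamAttrSketch and the sample values are ours and do not come from the disassembler.

```csharp
using System;

static class ParamAttrSketch
{
    // Same three bit tests as in the text: 0x01 = in, 0x02 = out, 0x10 = opt.
    public static string Decode(int pattr)
    {
        string s = "";
        if ((pattr & 0x01) != 0) s += "[in]";
        if ((pattr & 0x02) != 0) s += "[out]";
        if ((pattr & 0x10) != 0) s += "[opt]";
        return s;
    }

    static void Main()
    {
        Console.WriteLine(Decode(0x01));   // [in]
        Console.WriteLine(Decode(0x12));   // [out][opt]
        Console.WriteLine(Decode(0x13));   // [in][out][opt]
    }
}
```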

 

Program23.csc

 

string [] methoddefreturnarray;

string []methoddeftypearray;

int [] methoddefparamcount;

 

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

DisplayModuleRefs();

DisplayAssembleyRefs();

CreateSignatures();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

DispalyVtFixup();

DisplayTypeDefs();

DisplayTypeDefsAndMethods();

}

 

public void CreateSignatures ()

{

if (MethodStruct != null)

{

methoddefreturnarray = new string[MethodStruct.Length];

methoddeftypearray  = new string[MethodStruct.Length];

methoddefparamcount = new int[MethodStruct.Length];

for ( int l = 1 ; l < MethodStruct.Length ; l++)

{

CreateSignatureForEachType (1 , MethodStruct[l].signature, l);

}

}

}

public void CreateSignatureForEachType (byte type , int index , int row)

{

//Console.WriteLine(".......type={0} row={1} index={2} blob.Length={3} {4}" , type , row.ToString("X") , (ushort)index , blob.Length , (uint)index);

int uncompressedbyte , count , howmanybytes;

howmanybytes = CorSigUncompressData (blob , index , out uncompressedbyte);

count = uncompressedbyte;

byte [] blob1 = new byte[count];

Array.Copy(blob , index + howmanybytes , blob1 , 0 , count);

if ( type == 1)

CreateMethodDefSignature (blob1 , row);

}

public void DisplayAllMethods (int typerow)

{

methodstring = methodstring + "/*06" + methodindex.ToString("X6") + "*/ " ;

string s = methodstring + "  " + methoddefreturnarray[methodindex]+  " " + methoddeftypearray[methodindex] ;

Console.WriteLine(s);

}

}

public void CreateMethodDefSignature (byte [] blobarray , int row)

{

//Console.WriteLine("CreateMethodDefSignature Array Length={0} method row={1} name={2}" , blobarray.Length , row , GetString(MethodStruct[row].name));

int aa = -1;

if ( row == aa)

{

Console.WriteLine(GetString(MethodStruct[row].name));

for ( int l = 0 ; l < blobarray.Length ; l++)

Console.Write("{0} " , blobarray[l].ToString("X"));

Console.WriteLine();

Console.WriteLine("Length of array is {0}" , blobarray.Length);

}

int howmanybytes,uncompressedbyte , count , index;

index = 0;

howmanybytes = CorSigUncompressData (blobarray , index , out uncompressedbyte);

methoddeftypearray [row] = DecodeFirstByteofMethodSignature (uncompressedbyte , row);

index = index + howmanybytes;

howmanybytes = CorSigUncompressData (blobarray , index , out uncompressedbyte);

count = uncompressedbyte;

methoddefparamcount[row] = count;

index = index + howmanybytes;

string returntypestring = "";

returntypestring = GetElementType(index , blobarray , out howmanybytes );

methoddefreturnarray [row] = returntypestring;

}

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type >= 0x01 && type <= 0x0e )

{

returnstring = GetType(type);

howmanybytes = 1;

}

return returnstring;

}

public string DecodeFirstByteofMethodSignature (int firstbyte , int methodrow)

{

string returnstring = "";

if ( (firstbyte& 0x20 ) == 0x20 )

returnstring = "instance ";

if ( (firstbyte & 0x40 ) == 0x40 )

returnstring = "explicit instance ";

int firstbits = firstbyte & 0xf;

if ( firstbits  == 0x02 )

returnstring = returnstring + "unmanaged stdcall ";

else if ( firstbits  == 0x03 )

returnstring = returnstring + "unmanaged thiscall ";

else if ( firstbits  == 0x05 )

returnstring = returnstring + "vararg ";

else if ( firstbits  == 0x01 )

returnstring = returnstring + "unmanaged cdecl ";

else if ( firstbits  == 0x04 )

returnstring = returnstring + "unmanaged fastcall ";

return  returnstring;

}

public string GetType (int typebyte)

{

if ( typebyte == 0x01)

return "void";

if ( typebyte == 0x02)

return "bool";

if ( typebyte == 0x03)

return "char";

if ( typebyte == 0x04)

return "int8";

if ( typebyte == 0x05)

return "unsigned int8";

if ( typebyte == 0x06)

return "int16";

if ( typebyte == 0x07)

return "unsigned int16";

if ( typebyte == 0x08)

return "int32";

if ( typebyte == 0x09)

return "unsigned int32";

if ( typebyte == 0x0a)

return "int64";

if ( typebyte == 0x0b)

return "unsigned int64";

if ( typebyte == 0x0c)

return "float32";

if ( typebyte == 0x0d)

return "float64";

if ( typebyte == 0x0e)

return "string";

return "unknown";

}

 

 

e.il

.class a1

{

.method  explicit instance bool  a1()

{

}

.method  instance int16 a2()

{

}

.method  vararg void a3()

{

}

.method  default int8 a4()

{

}

.method  unmanaged stdcall int8 a5()

{

}

.method  unmanaged thiscall int8 a6()

{

}

.method  unmanaged cdecl int8 a7()

{

}

.method  unmanaged fastcall int8 a8()

{

}

}

 

Output

  .method /*06000001*/   bool explicit instance

  .method /*06000002*/   int16 instance

  .method /*06000003*/   void instance vararg

  .method /*06000004*/   int8 instance

  .method /*06000005*/   int8 instance unmanaged stdcall

  .method /*06000006*/   int8 instance unmanaged thiscall

  .method /*06000007*/   int8 instance unmanaged cdecl

  .method /*06000008*/   int8 instance unmanaged fastcall

 

In this program we display some more stuff about a method like its calling convention as well as the data type of the return value. The problem with the data type is that it fills up hundreds of pages and hence we have broken up the data types into dozens of programs.

 

Thus the next couple of programs only focus on the data types that a return value can carry. We have three instance arrays: methoddeftypearray that will carry the calling convention, methoddefreturnarray that will carry the data type of the return value and finally methoddefparamcount that will carry the number of parameters that the method has.

 

These variables are arrays as we will have scores of functions in our il code. If you look at the new abc method, we have added a method CreateSignatures that simply creates all the method signatures in one go and populates our arrays. Thus in our code later on we simply display the relevant array members.

 

In the CreateSignatures method we first make sure that we have at least one method in our code as there is no point in computing signatures if we have no methods to deal with. It is here that we first create the three arrays of the desired sizes using the length of the method table as the size of the array.

 

We then use a for loop to call another method CreateSignatureForEachType that does the actual work. We have broken up our code into different functions as we need to calculate different types of signatures. These include those for local variables, fields etc.

 

Thus the first parameter is 1 as we have used this number to denote a method def signature. A method def is a method that we have defined, a method ref is a method we are calling that is defined somewhere else. The second parameter is the signature field, which is an offset into the blob heap where the signature of the method is located.

 

The third is the row number of each method. We now move on to the function CreateSignatureForEachType where we do some work. All the signatures to start with have a common rule. The first byte is the length of the signature as we are dealing with the blob stream.

 

We use our function CorSigUncompressData to get at the length as the signature may cross 127 bytes. Now that we know the length of the signature, we create an array blob1 that is of the same size. We copy the signature bytes minus the length byte into this array using the static Copy method of the array class.
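
The body of CorSigUncompressData is not repeated in this chapter, so here is a sketch of what such a helper has to do. It is our own illustration, written from the compressed integer layout of the blob heap as we understand it, and the names CompressedLengthSketch and Read are assumptions. It returns the number of bytes the length takes and hands back the length itself through an out parameter.

```csharp
using System;

static class CompressedLengthSketch
{
    // Reads a compressed unsigned integer at data[index].
    // Returns how many bytes were consumed; the decoded value goes into 'value'.
    public static int Read(byte[] data, int index, out int value)
    {
        byte first = data[index];
        if ((first & 0x80) == 0)        // 0xxxxxxx : one byte, values 0 to 127
        {
            value = first;
            return 1;
        }
        if ((first & 0xC0) == 0x80)     // 10xxxxxx : two bytes
        {
            value = ((first & 0x3F) << 8) | data[index + 1];
            return 2;
        }
        // 110xxxxx : four bytes
        value = ((first & 0x1F) << 24) | (data[index + 1] << 16)
              | (data[index + 2] << 8) | data[index + 3];
        return 4;
    }

    static void Main()
    {
        int value;
        int used = Read(new byte[] { 0x03 }, 0, out value);
        Console.WriteLine("{0} byte(s), value {1}", used, value);   // 1 byte(s), value 3
        used = Read(new byte[] { 0x80, 0x80 }, 0, out value);
        Console.WriteLine("{0} byte(s), value {1}", used, value);   // 2 byte(s), value 128
    }
}
```

A length of 127 or less fits in one byte, which is why short signatures show a plain length byte in the hex dump.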

 

This method's first parameter is the source array blob that contains the original data, then we have the starting point in the array blob from where we want to start the copy. The index variable tells us where the signature starts and howmanybytes the number of bytes taken up by the length.

 

The third parameter is the destination array blob1 and the fourth the starting point in the destination array. We  use 0 as we want to start the copy from the beginning. Finally we have the length of the number of bytes to copy which is the count variable. Thus all that we have done is create an array blob1 that contains only the signature.
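
The copy can be tried on a tiny made-up blob. This is a sketch only: the class SliceSketch, the method Slice and the sample bytes are ours, and we assume the length fits in a single byte so that howmanybytes is 1.

```csharp
using System;

static class SliceSketch
{
    // Copies one signature out of the blob, leaving the length byte behind.
    // Assumption: the length is below 128 and therefore takes exactly one byte.
    public static byte[] Slice(byte[] blob, int index)
    {
        int howmanybytes = 1;
        int count = blob[index];
        byte[] blob1 = new byte[count];
        Array.Copy(blob, index + howmanybytes, blob1, 0, count);
        return blob1;
    }

    static void Main()
    {
        // Offset 1 holds the length 3, followed by the three signature bytes.
        byte[] blob = { 0x00, 0x03, 0x20, 0x00, 0x02, 0x01, 0x06 };
        byte[] blob1 = Slice(blob, 1);
        Console.WriteLine(BitConverter.ToString(blob1));   // 20-00-02
    }
}
```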

 

As the type parameter is 1 we now call the method CreateMethodDefSignature with the newly created array and the row number. It is this function that does the actual grunt work.

 

The idea being that it's easier to work with an array that contains only the method signature than an array that contains the same signature but at an offset. You do have to agree with us on this one. Let's take a short detour and first move on to the function DisplayAllMethods.

 

All that we do here is display the contents of two of the three arrays that the CreateMethodDefSignature populates. In the method CreateMethodDefSignature we start with displaying the entire signature bytes of the function. The row number can never be -1 and hence the display code never gets called.

 

We change the aa variable to the row number whose signature we want to look at. The first byte of the method def signature includes two things. In a method we use either the words explicit instance like in method a1 or only instance like in method a2.

 

Then comes the variable number of arguments vararg in method a3 or the default which is default as in method a4. The hasthis and explicitthis bits are ORed with one of the two calling convention values default or vararg. Even though we know that this value will always take up one byte, we still use the CorSigUncompressData to extract its value.

 

To figure out which bits stand for what we use the DecodeFirstByteofMethodSignature to populate the methoddeftypearray array. This method is like what we have done before and we hope you understand why the order of the second round of if statements is very important.
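
The decoding of the first byte can be tried in isolation. The sketch below is a condensed version of the same logic with names of our own choosing (CallConvSketch, Decode); the sample bytes are what we would expect for the combinations named in the comments, not bytes read from a file. Note that the explicit test comes after the instance test because it overwrites the string rather than adding to it.

```csharp
using System;

static class CallConvSketch
{
    public static string Decode(int firstbyte)
    {
        string s = "";
        if ((firstbyte & 0x20) == 0x20)
            s = "instance ";
        if ((firstbyte & 0x40) == 0x40)
            s = "explicit instance ";      // replaces "instance ", so it must come second

        switch (firstbyte & 0x0F)          // low four bits: the calling convention
        {
            case 0x01: s += "unmanaged cdecl "; break;
            case 0x02: s += "unmanaged stdcall "; break;
            case 0x03: s += "unmanaged thiscall "; break;
            case 0x04: s += "unmanaged fastcall "; break;
            case 0x05: s += "vararg "; break;
        }
        return s;
    }

    static void Main()
    {
        Console.WriteLine("[" + Decode(0x20) + "]");   // [instance ]
        Console.WriteLine("[" + Decode(0x60) + "]");   // [explicit instance ]
        Console.WriteLine("[" + Decode(0x25) + "]");   // [instance vararg ]
    }
}
```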

 

Even though the method def rules only use two calling conventions, we have included four more as the other signatures use them. Also if we tag our methods with these calling conventions like unmanaged stdcall, ildasm does not complain. Also using the default keyword is not an error but it does not show up in the disassembled output.

 

The second byte is the number of parameters that we have. We may have more than 127 parameters and hence we use the CorSigUncompressData to get at this value. Before using this function we need to add to the index variable the number of bytes taken up by the earlier field which in this case is 1.

 

We will keep adding the variable howmanybytes to the index variable. We store the number of parameters in the array methoddefparamcount for future use and now call a very important method GetElementType that will give us the data type of the return value of the function.

 

This is the next bit of information stored in the signature. We pass the GetElementType method three parameters. The first is the starting byte of the return data type stored in the index variable, the array blobarray and finally the number of bytes of the signature the return data type takes up.

 

This number could be in the hundreds as you will soon see. The return value of this function, the actual data type is stored in the array methoddefreturnarray. The first byte of the data type signature tells us about the rest of the bytes.

 

If its value is between 1 and 14, then the data type is a very simple data type like bool or string or int8. We thus use an if statement to tell us whether it is a simple data type and use the GetType method to return this type. The GetType method is simply a series of 14 if statements.

 

We return the value stored in the variable returnstring and set the out variable howmanybytes to 1 as our data type being simple takes only one byte in the signature. This is how we take care of the elementary data types. The next example takes on more complex data types and unless we finish all of them we will not proceed to do anything else.
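
As an aside, the 14 if statements can also be written as a table indexed by the type byte. This is only an alternative sketch and not the code we use in the rest of the book; the names ElementTypeTableSketch and Name are ours.

```csharp
using System;

static class ElementTypeTableSketch
{
    // Index 0 is unused; indexes 0x01 to 0x0e hold the simple types in order.
    static readonly string[] Names =
    {
        "", "void", "bool", "char", "int8", "unsigned int8", "int16",
        "unsigned int16", "int32", "unsigned int32", "int64",
        "unsigned int64", "float32", "float64", "string"
    };

    public static string Name(int typebyte)
    {
        if (typebyte >= 0x01 && typebyte <= 0x0e)
            return Names[typebyte];
        return "unknown";
    }

    static void Main()
    {
        Console.WriteLine(Name(0x02));   // bool
        Console.WriteLine(Name(0x0e));   // string
        Console.WriteLine(Name(0x1d));   // unknown
    }
}
```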

 

Program24.csc

public void DisplayAllMethods (int typerow)

{

string s = methodstring + "  " + methoddefreturnarray[methodindex] ;

Console.WriteLine(s);

}

 

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type >= 0x01 && type <= 0x0e )

{

returnstring = GetType(type);

howmanybytes = 1;

}

if ( type == 0x13  )

{

returnstring = "!" + blobarray[index+1].ToString();

howmanybytes = 2;

}

if ( type == 0x15  || type == 0x17 || type == 0x1e || type == 0x21 )

{

returnstring = "/* UNKNOWN TYPE (0x" + type.ToString("X") + ")*/";

howmanybytes = 1;

}

if ( type == 0x16)

{

returnstring = "typedref";

howmanybytes = 1;

}

if ( type == 0x18)

{

returnstring = "native int";

howmanybytes = 1;

}

if ( type == 0x19)

{

returnstring = "native unsigned int";

howmanybytes = 1;

}

if ( type == 0x1a)

{

returnstring = "native float";

howmanybytes = 1;

}

if ( type == 0x1c)

{

returnstring = "object";

howmanybytes = 1;

}

if ( type == 0x45 )

{

int howmanybytes2 ;

returnstring = GetElementType( index + 1 , blobarray , out howmanybytes2) + " pinned";

howmanybytes = howmanybytes2 + 1;

}

return returnstring;

}

 

e.il

.class a1

{

.method  !20 a1()

{

}

.method  typedref  a2()

{

}

.method  native int a3()

{

}

.method  native unsigned int a4()

{

}

.method  native float a5()

{

}

.method  object a6()

{

}

.method  int8 pinned a7()

{

}

}

 

Output

  .method /*06000001*/   !20

  .method /*06000002*/   typedref

  .method /*06000003*/   native int

  .method /*06000004*/   native unsigned int

  .method /*06000005*/   native float

  .method /*06000006*/   object

  .method /*06000007*/   int8 pinned

 

In this example, like in the earlier one, we have simply handled some of the easier types. All the code remains the same but for some if statements that we have added to the GetElementType method. The first function a1 uses a deprecated type ! that stands for the var type.

 

The specifications do not specify this type and we followed the advice we gave you some time ago. We went to the third byte of the signature and put 0x13 there. We then ran the disassembler which came up with the ! type. This type is followed by a number and hence the howmanybytes variable should be 2 as this is the length of the type.

 

The values 0x15, 0x17, 0x1e and 0x21 are unknown. The others but the last are simple types. The last one which is 0x45 is a type followed by the word pinned. This type 0x45 is followed by the type and hence we call the GetElementType method with one added to the index variable. This method returns the type and we simply add the word pinned to it.

 

The howmanybytes variable is what the GetElementType method returns plus 1.

 

Program25.csc

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type == 0x1d)

{

returnstring = GetSzArray(index , blobarray , out howmanybytes);

}

return returnstring;

}

public string GetSzArray (int index , byte [] blobarray , out int howmanybytes)

{

string returnstring = "";

int i = 1;

returnstring = "[]";

while ( true )

{

byte next = blobarray[index+i];

if ( next != 0x1d )

break;

returnstring = returnstring + "[]";

i = i +1 ;

}

int howmanybytes2;

returnstring = GetElementType(index + i , blobarray , out howmanybytes2) + returnstring;

howmanybytes = i + howmanybytes2;

return returnstring;

}

 

 

e.il

.class a1

{

.method  int8 [][][] a1()

{

}

}

 

Output

  .method /*06000001*/   int8[][][]

 

In this example and the next we deal with arrays. In the method GetElementType we add one more if statement that checks if the type byte or the first byte is 0x1d. This type is for a simple array that has no specific dimensions values. We can have as many dimensions as we like.

 

If we see the signature we find the following 1D 1D 1D 04. Thus each dimension we add brings in an extra 0x1d. At the end of the type signature is the actual data type. The fact that GetSzArray gets called simply means that we have at the very least one dimension which is a must but can have more.

 

We initialize the returnstring variable to an empty pair of array brackets and then set out in a loop that is indefinite as we do not know how many dimensions the array will have. At the beginning of the loop we check whether the following byte is a 0x1d. If it is not, we know that all is over and we exit from the loop.

 

If the next byte is a 0x1d, we add an extra pair of array brackets or an extra dimension to the variable returnstring. When we finally leave the indefinite while loop, the returnstring variable contains the right number of array brackets and the variable i tells us how many 0x1d's were there.

 

We now call the method GetElementType to get the type stored after the last 0x1d. This is why we add the value of i to the index variable. The bytes to be returned are the number of bytes taken up by the type itself and the variable i, which as we told you earlier gives us the number of 0x1d's or the array dimensions.
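
The bytes 1D 1D 1D 04 can be walked through with a cut-down version of the same idea. This sketch stands alone, so instead of calling GetElementType it knows only two element types; the names SzArraySketch and Decode are ours.

```csharp
using System;

static class SzArraySketch
{
    // Counts the leading 0x1d bytes, then names the element type that follows.
    // Assumption: the element type is a single byte, and only int8 and int32 are known here.
    public static string Decode(byte[] sig, int index, out int used)
    {
        int i = 0;
        string brackets = "";
        while (sig[index + i] == 0x1d)
        {
            brackets += "[]";
            i++;
        }
        byte element = sig[index + i];
        string name = element == 0x04 ? "int8" : element == 0x08 ? "int32" : "unknown";
        used = i + 1;                       // the 0x1d bytes plus one byte for the type
        return name + brackets;
    }

    static void Main()
    {
        int used;
        string s = Decode(new byte[] { 0x1d, 0x1d, 0x1d, 0x04 }, 0, out used);
        Console.WriteLine("{0} uses {1} bytes", s, used);   // int8[][][] uses 4 bytes
    }
}
```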

 

Program26.csc

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type == 0x14 )

{

int howmanybytes2;

returnstring = GetArrayType( blobarray , index , out howmanybytes2);

howmanybytes = howmanybytes2 + 1;

}

return returnstring;

}

 

public string GetArrayType (byte [] blobarray ,  int index , out int howmanybytes)

{

string returnstring ;

int total = 1;

int uncompressedbyte;

int rank;

int numsizes;

int howmanybytes1;

returnstring = GetElementType(index +1 , blobarray ,  out howmanybytes);

total = total + howmanybytes;

returnstring = returnstring + "[";

howmanybytes1 = CorSigUncompressData(blobarray , index + total, out uncompressedbyte);

total = total + howmanybytes1;

rank = uncompressedbyte;

howmanybytes1 = CorSigUncompressData(blobarray , index + total, out uncompressedbyte);

total = total + howmanybytes1;

numsizes = uncompressedbyte;

int [] sizearray = new int[numsizes];

for ( int l = 1 ; l <= numsizes ; l++)

{

howmanybytes1 = CorSigUncompressData(blobarray , index + total, out uncompressedbyte);

total = total + howmanybytes1;

sizearray[l-1] = uncompressedbyte;

}

howmanybytes1 = CorSigUncompressData(blobarray , index + total, out uncompressedbyte);

total = total + howmanybytes1;

int bounds = uncompressedbyte;

int [] boundsarray = new int[bounds];

//Console.WriteLine(".....rank={0} numsizes={1} bounds={2} " , rank, numsizes,bounds);

if ( rank != 0 && bounds == 0 && numsizes == 0)

{

for ( int i = 1 ; i < rank ; i++)

returnstring = returnstring + ",";

returnstring = returnstring + "]";

return returnstring;

}

int dots = 0;

for ( int l = 1 ; l <= bounds ; l++)

{

howmanybytes1 = CorSigUncompressData(blobarray , index + total, out uncompressedbyte);

total = total + howmanybytes1;

int ulSigned = uncompressedbyte & 0x1;

uncompressedbyte  = uncompressedbyte >> 1;

boundsarray[l-1] = uncompressedbyte ;

}

if ( numsizes == 0)

{

for ( int l = 0 ; l < bounds ; l++)

{

returnstring = returnstring + boundsarray[l] + "..." ;

if ( l != (bounds-1) )

returnstring = returnstring + ",";

}

}

else

{

for ( int l = 0 ; l < bounds ; l++)

{

if ( l < numsizes )

{

 int upper = boundsarray[l] + sizearray[l] - 1 ;

 if ( boundsarray[l] == 0 && sizearray[l] != 0 )

 returnstring = returnstring + sizearray[l] ;

 if (boundsarray[l] == 0 && sizearray[l] == 0)

returnstring = returnstring + "0" ;

 else if (boundsarray[l] != 0 && sizearray[l] != 0)

returnstring = returnstring + boundsarray[l] + "..." + upper.ToString()  ;

else if (boundsarray[l] != 0 && sizearray[l] == 0)

returnstring = returnstring + boundsarray[l] + "..."  ;

 

 }

else

{

dots++;

returnstring = returnstring + boundsarray[l] + "..."  ;

}

if ( l != bounds - 1 )

returnstring = returnstring + ",";

}

}

if ( numsizes != 0) // method a6

{

int leftover = rank - numsizes - dots ;

for ( int l = 1 ; l <= leftover ; l++)

returnstring = returnstring + ",";

}

returnstring = returnstring + "]";

howmanybytes = total-1;

return returnstring;

}

 

e.il

.class a1

{

.method  int8 [4] a1()

{

}

.method  int16 [5] a2()

{

}

.method  int32 [5,7,12] a3()

{

}

.method  int32 [7,,,] a4()

{

}

.method  int32 [0...3, 3...8 , 10...14] a5()

{

}

.method  int32 [3...] a6()

{

}

.method  int32 [6...9,1,13] a7()

{

}

.method  int32 [,,,] a8()

{

}

.method  int32 [6,,13] a9()

{

}

.method  int32 [, 3...8 , 4... , 8...] a10()

{

}

.method  int32 [, 3...8 , 4... , , 8... ,,,] a11()

{

}

.method  int32 [,  , 4... , , 8...] a12()

{

}

.method  int32 [,,6...9,1,13] a13()

{

}

.method  int32 [8... , 4 , 5] a14()

{

}

}

 

Output

  .method /*06000001*/   int8[4]

  .method /*06000002*/   int16[5]

  .method /*06000003*/   int32[5,7,12]

  .method /*06000004*/   int32[7,,,]

  .method /*06000005*/   int32[4,3...8,10...14]

  .method /*06000006*/   int32[3...]

  .method /*06000007*/   int32[6...9,1,13]

  .method /*06000008*/   int32[,,,]

  .method /*06000009*/   int32[6,0,13]

  .method /*0600000A*/   int32[0,3...8,4...,8...]

  .method /*0600000B*/   int32[0,3...8,4...,0...,8...,,,]

  .method /*0600000C*/   int32[0...,0...,4...,0...,8...]

  .method /*0600000D*/   int32[0,0,6...9,1,13]

  .method /*0600000E*/   int32[8...,4,5]

 

In this example we deal with how the array data type is handled. This is different from the earlier one where we did not specify a dimension along with the array. All that we have done is add an if statement that checks for type 0x14 that represents an array.

 

We then call a method GetArrayType that understands arrays. Lets first look at the method a1 that has the simplest array int8 [4]. We will be extremely practical as the arrays can be a pain in the neck. Thus each time we will see the signature also. In this case it is 14 04 01 and this is only part of the signature, the part we are trying to explain to you.

 

A array signature starts with the number 0x14 and then follows the data type of the array. A number of 4 stands for int8. Thus the first thing we do in our GetArrayType function is call the GetElementType method passing index+1 as index points to the array type 0x14.

 

This is how we figure out the data type of the array and we increase the variable total by howmanybytes as the array data type can be as complex as we please. The second function a2 has the type int16 [5]  and its signature is 14 06 01 as the data type for a int16 is 06. The byte following the data type is called the rank.

 

This specifies the number of dimensions which has to be 1 or more. We use the method CorSigUncompressData to pick it up for us as it may be larger than 127. Look at the method a3 whose array type is int32[5,7,12] and whose signature is 14 08 03. The rank here is 3 as we have three dimensions in the array.

 

This is different from a double dimensional array which has multiple []. We will consider them later. The rank is stored in our program in a variable called rank and the returnstring variable will contain our actual array signature that we return and each time we increase total by the return value of the CorSigUncompressData method.

 

After the rank is the number of sizes member. This gives us the number of dimensions that have a size. Let's start with method a8, its type is int32[,,,] and its signature is 14 08 04 00. The numsizes is zero as no dimension has any size at all. Now take method a9, type int32 [6,,13], signature 14 08 03 03.

 

The numsizes is 3 and not 2 as the dimension that has no size is in the middle. Method a4 type int32[7,,,] signature 14 08 04 01 says it better. The numsizes is 1 as only one dimension has a value and it is the first. Thus numsizes counts the dimensions from the first one up to the last one that has a size, including any in between that have none.
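
The counting rule is easy to get wrong, so here it is as a few lines of code. This is an illustration only and not part of the disassembler; the names NumSizesSketch and NumSizes are ours and a null stands for a dimension that has no size.

```csharp
using System;

static class NumSizesSketch
{
    // numsizes is the position of the last dimension that has a size, counted from 1.
    // Dimensions without a size that sit before it are counted too.
    public static int NumSizes(int?[] sizes)
    {
        int n = 0;
        for (int i = 0; i < sizes.Length; i++)
            if (sizes[i].HasValue)
                n = i + 1;
        return n;
    }

    static void Main()
    {
        Console.WriteLine(NumSizes(new int?[] { 6, null, 13 }));             // a9 : 3
        Console.WriteLine(NumSizes(new int?[] { 7, null, null, null }));     // a4 : 1
        Console.WriteLine(NumSizes(new int?[] { null, null, null, null }));  // a8 : 0
    }
}
```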

 

The next series of bytes tell us the size of each dimension. If the numsizes field is 3, the next three bytes tell us the size of each dimension. Lets look at method a11, type int32 [, 3...8 , 4... , , 8... ,,,] and signature 14 08 08 02 00 06. As the numsizes is only 2, the next two bytes tell us the size of each dimension.

 

The rank however is 8. The first dimension is empty and hence it is a zero. The second dimension starts at 3 and ends at 8. Thus its size is 6 as we count both 3 and 8. The ones that have an undefined upper limit like 3... have no size as we have not specified an upper bound.
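
The arithmetic that links a range to its size works both ways, and our GetArrayType uses the second direction when it computes the variable upper. A short sketch with names of our own choosing:

```csharp
using System;

static class SizeSketch
{
    // Both ends of the range are counted, hence the plus one.
    public static int Size(int lower, int upper)
    {
        return upper - lower + 1;
    }

    // The reverse: rebuild the upper bound from what the signature stores.
    public static int Upper(int lower, int size)
    {
        return lower + size - 1;
    }

    static void Main()
    {
        Console.WriteLine(Size(3, 8));     // 6
        Console.WriteLine(Size(6, 9));     // 4
        Console.WriteLine(Upper(10, 5));   // 14
    }
}
```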

 

Take another example, method a7, type int32[6...9,1,13], signature 14 08 03 03 04 01 0D. We have 3 sizes, the first is from 6 to 9 and hence 4. The second is a 1 and the third 13. We do not know in advance how many sizes we have and hence we create an array sizearray that is numsizes large.

 

In a for loop we read each byte and store it into the array sizearray. We also increase total by the number of bytes each dimension takes up. We now handle a special case where we have a rank which is a must but no numsizes or bounds. This means that the array has no dimension that has a size and also no dimension that has a lower bound.

 

This could only happen in case of method a8 type int32[,,,] signature 14 08 04 00 00. Here we have a rank of 4 and the next two numbers are 0. We loop depending upon the number of dimensions and keep adding a comma and then return the string. A special case and now let's move to the last field.

 

When we leave the loop we are at the next field that tells us how many dimensions have a lower bound as the upper bound is optional. Look at method a7 type int32[6...9,1,13] bytes 14 08 03 03 04 01 0D 03 0C 00 00. We have to move to the 8th byte that is a three. We have three lower bounds and the last two are zero as they are a single bound.

 

Only the three dots come into the picture here. The lower bound is 6 but we see the value 0x0C. Why the discrepancy? This is because the lower bound is stored as a compressed signed integer. Here is how it works. We first bitwise AND the byte with one to check whether the lowest bit is set. That bit carries the sign and is set only for a negative lower bound. In all our cases the lower bounds are zero or positive, so the bit is never set.

 

Thus the lowest bit is not part of the value of the lower bound and is always zero for us. We then right shift the byte by 1. Thus 12 becomes 6, as right shifting by one divides by 2. Finally let's take method a5, type int32[4,3...8,10...14], signature 20 00 14 08 03 03 04 06 05 03 00 06 14.

 

We take the fourth last number, which tells us that we have three lower bounds. The first dimension is not a range and hence its value is 0. The second is 6, which divided by 2 gives us 3, the lower bound, and the last is 20, which divided by 2 is 10.
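The halving described above can be sketched in a few lines of C#. This is only a sketch and not the code of our program: the class name LowerBound and the method Decode are made up for illustration, and we assume the bound fits into a single byte. The lowest bit is treated as the sign bit, which is how the ECMA-335 compressed signed integer works for one-byte values.

```csharp
static class LowerBound
{
    // Hypothetical helper: decodes a one-byte compressed signed integer.
    // Assumption: the value lies between -64 and 63, so one byte is enough.
    public static int Decode(byte encoded)
    {
        int value = encoded >> 1;      // drop the sign bit, i.e. divide by 2
        if ((encoded & 1) != 0)        // sign bit set means a negative bound
            value -= 64;
        return value;
    }
}
```

For method a5 the last three bytes 00 06 14 decode to 0, 3 and 10, and the 0x0C of method a7 decodes to 6.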

 

Finally method a11 sums it up, type int32[0,3...8,4...,0...,8...,,,], signature 14 08 08 02 00 06 05 00 06 08 00 10. We start at the beginning as the signature is complex, and we see an array type 0x14. The first 08 is the data type for int32 and the second 08 is the rank as we have 8 dimensions. Count if you do not believe us.

 

Then we have 2 sizes, as only the second dimension has a size and the first, whose size is 0, gets in because of the second. Then we have two bytes for the sizes of the dimensions, 0 and 6. This is followed by the number of lower bounds, which is 5.

 

The first is empty and shows a lower bound of zero, and the second and third have lower bounds of 3 and 4 that show up doubled as 6 and 8. The next has a lower bound of 0 and hence it is zero, and this is followed by a lower bound of 8 that doubles to 16 or 0x10.

 

The last three dimensions have no values and hence, to save on signature space by not specifying an endless number of zeroes, they are ignored. This causes trouble for us as we now have to account for all these optimizations in our code. We store the uncompressed value of each lower bound in an array boundsarray like we did for the sizes.
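To pull the fields together, here is a minimal sketch of reading the rank, the sizes and the lower bounds in one pass. It is not the code of our program: the class ArrayShape and the method Parse are names invented for this illustration, and we assume that the element type after 0x14 is a single byte and that every count, size and bound fits into one byte, which holds for all the signatures shown here.

```csharp
static class ArrayShape
{
    // Hypothetical helper. Assumptions: blob[index] is 0x14, blob[index + 1]
    // is a one-byte element type, and every later value is a single byte.
    public static void Parse(byte[] blob, int index,
        out int rank, out int[] sizeArray, out int[] boundsArray)
    {
        int pos = index + 2;                 // skip 0x14 and the element type
        rank = blob[pos++];

        sizeArray = new int[blob[pos++]];    // numsizes, then that many sizes
        for (int i = 0; i < sizeArray.Length; i++)
            sizeArray[i] = blob[pos++];

        boundsArray = new int[blob[pos++]];  // number of lower bounds
        for (int i = 0; i < boundsArray.Length; i++)
        {
            byte b = blob[pos++];
            int value = b >> 1;              // lowest bit is the sign bit
            if ((b & 1) != 0)
                value -= 64;
            boundsArray[i] = value;
        }
    }
}
```

Fed the bytes of method a11, 14 08 08 02 00 06 05 00 06 08 00 10, this gives a rank of 8, the sizes 0 and 6 and the lower bounds 0, 3, 4, 0 and 8.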

 

Now is the time to actually create the array signature. So far we have simply filled up two arrays. Let's look at function a12, type int32 [,  , 4... , , 8...]; ildasm shows us int32[0...,0...,4...,0...,8...], signature 14 08 05 00 05 00 00 08 00 10.

 

As none of the dimensions has a size, being either empty or without an upper bound, the numsizes field is zero and hence the body of the first if statement runs. Thus we start with a for loop and simply take what is there in the boundsarray and add a ... to it.

 

Here the number of bounds and the rank are the same, and if the dimension is 0 or empty, a zero gets displayed instead. At the end, if it is the last dimension, we do not place the comma and the if statement handles it. Now let's move on to the else statement, which has some complex code.

 

Like in the if, in the else we also iterate in the for loop using the number of bounds as the limit. We do this as the rank is the theoretical number of dimensions. The bounds cover those dimensions that have a lower bound, which is a must. The difference between rank and bounds is the number of empty dimensions at the end.

 

If there are no empty dimensions at the end, both rank and bounds will be the same value. The numsizes has no significance here. We have an if statement as the sizearray can be shorter than the boundsarray, since not every dimension has a size. This happens for two reasons: the dimension is empty or its upper bound is not specified.

 

However as specified before, the empty ones fall into the purview of the numsizes if they are before a sized dimension. Thus the if statement makes sure that we are not accessing a sizearray member that does not exist.

 

When the if statement is false it means that all the valid sizes are over and the dimensions following are either empty or have no upper bound. This happens with method a11, type int32 [, 3...8 , 4... , , 8... ,,,], signature 14 08 08 02 00 06 05 00 06 08 00 10, where the actual answer is int32[0,3...8,4...,0...,8...,,,].

 

Here the rank is 8, numsizes is 02 and the number of bounds is 05. Thus for l equal to 2, 3 and 4, the else gets called. For these dimensions we have no upper bounds at all, and the empty dimension at index 3 also gets displayed as a range starting with 0.

 

We also increase the variable dots by one; it tells us how many times the else gets called. These, remember, come after all the dimensions that have sizes. If the if statement is true, which means we have a size as well as a lower bound, the upper variable stores for us the upper bound.

 

The upper bound is the lower bound plus the size minus 1. We now need to figure out whether we place the three dots or whether it is a single dimension. This is achieved by the next series of three if statements. Let's take method a13, type int32 [,,6...9,1,13], signature 14 08 05 05 00 00 04 01 0D 05 00 00 0C 00 00.

 

The rank, numsizes and bounds are all 5. If the lower bound is zero and the size is non zero, this is a single dimension value. Thus the if statement gets called for the last two dimensions where the size array has values 1 and 13. These are sizes that do not have a range and hence the lower bound is zero.

 

If there is a single value this is stored in the sizearray. We also check whether the comma needs to be placed at the end. We then check if both the bounds and size array are zero. This can happen in the first two dimensions and we need to place a 0.

 

Finally we check  whether both the size and bounds are non zero which means that it is a range like the middle case and here we place the …. We should place the comma only if it is not at the end and that explains the final if statement. We finally need to place the final empty commas if any.

 

We first need to find out if there were any empty dimensions at the very end. This we do by subtracting the number of sizes from the rank. We also need to subtract dots as this variable contains the number of range dimensions at the end that do not have an upper bound. In this for loop we only fill up the returnstring with a certain number of commas.

 

We place this code within an if statement so that a method like a6, which has a rank of 1 and a numsizes of 0, does not activate it. We finally come to method a14, which has the type int32 [8... , 4 , 5], signature 14 08 03 03 00 04 05 03 10 00 00, and the answer given by ildasm is int32[8...7,4,5].

 

Thus we get the right answer and ildasm the wrong one, and yes, we are gloating. This is because we have one more if statement that checks whether the size is zero, which it is in this case, and the bound is not zero, which means that we have a range dimension with no upper bound.
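The rules of the last few pages can be condensed into one small routine. What follows is a sketch written afresh for this summary and not a copy of the code in our program: the names ArrayShapeFormatter and Format are invented, the two arrays are assumed to be already filled with the sizes and the uncompressed lower bounds, and the comma handling is done with string.Join instead of separate if statements.

```csharp
using System;
using System.Collections.Generic;

static class ArrayShapeFormatter
{
    // Hypothetical helper: builds the text between the square brackets.
    public static string Format(int rank, int[] sizes, int[] bounds)
    {
        var parts = new List<string>();
        int described = Math.Max(sizes.Length, bounds.Length);
        for (int i = 0; i < rank; i++)
        {
            if (i >= described) { parts.Add(""); continue; }  // empty at the end

            bool hasSize = i < sizes.Length;
            int size = hasSize ? sizes[i] : 0;
            int lower = i < bounds.Length ? bounds[i] : 0;

            if (!hasSize)
                parts.Add(lower + "...");           // only a bound, so open range
            else if (size == 0 && lower == 0)
                parts.Add("0");                     // empty but before a sized one
            else if (size == 0)
                parts.Add(lower + "...");           // the a14 case, 8...
            else if (lower == 0)
                parts.Add(size.ToString());         // a single size like 13
            else
                parts.Add(lower + "..." + (lower + size - 1));  // range
        }
        return "[" + string.Join(",", parts) + "]";
    }
}
```

With a rank of 3, sizes 0, 4, 5 and bounds 8, 0, 0, the values of method a14, the sketch returns [8...,4,5]. With the values of method a11 it returns [0,3...8,4...,0...,8...,,,].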

 

Finally we have to ask ourselves whether our code can handle arrays of arrays like int32[5][4]. The signature for the above is pretty large: 14 14 08 01 01 05 01 00 01 01 04 01 00. The first 14 is the array type and the data type for the array is another array.

 

We read this inner array using the GetElementType method, and its 14 is followed by the 8, which specifies an int32. The rank and number of sizes are 1 and the 5 is the array size. Thus the first array is the inner one and the array dimension 4 belongs to the outer array.

 

Program27.csc.txt

public void abc(string [] args)

{

ReadPEStructures(args);

DisplayPEStructures();

ReadandDisplayImportAdressTable();

ReadandDisplayCLRHeader();

ReadStreamsData();

FillTableSizes();

ReadTablesIntoStructures();

DisplayTablesForDebugging();

ReadandDisplayVTableFixup();

ReadandDisplayExportAddressTableJumps();

FillArray();

DisplayModuleRefs();

DisplayAssembleyRefs();

CreateSignatures();

DisplayAssembley();

DisplayFileTable();

DisplayClassExtern();

DisplayResources();

DisplayModuleAndMore();

DispalyVtFixup();

DisplayTypeDefs();

DisplayTypeDefsAndMethods();

}

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type == 0x12 )

{

int howmanybytes2;

returnstring = GetTokenType( blobarray , index , out howmanybytes2);

howmanybytes = howmanybytes2 + 1;

}

return returnstring;

}

public string GetTokenType ( byte [] blobarray , int index , out int howmanybytes)

{

string returnstring = "";

int uncompressedbyte;

int howmanybytes1 = 0;

howmanybytes1= howmanybytes1  + CorSigUncompressData (blobarray , index + 1 , out uncompressedbyte);

string dummy1  = DecodeToken(uncompressedbyte , blobarray[index]);

returnstring = "class " + dummy1; 

howmanybytes = howmanybytes1;

return returnstring;

}

public string DecodeToken (int token , int type)

{

byte tabletype = (byte)(token & 0x03);

int tableindex = token >> 2;

string returnstring = "";

if ( tabletype == 0)

returnstring = typedefnames[tableindex];

return returnstring;

}

 

string [] typedefnames;

public void FillArray ()

{

int old = tableoffset;

bool tablehasrows = tablepresent(2);

int offs = tableoffset;

tableoffset = old;

if ( tablehasrows )

{

typedefnames = new string[rows[2]+1];

for ( int k = 1 ; k <= rows[2] ; k++)

{

int name = TypeDefStruct[k].name;

offs += offsetstring;

int nspace = TypeDefStruct[k].nspace;

offs += offsetstring;

string nestedtypestring  = "";

nestedtypestring  = GetNestedTypeAsString(k);

string namestring  = GetString(name);

string namespacestring = NameReserved(GetString(nspace));

if ( namespacestring.Length != 0)

namespacestring = namespacestring + ".";

namestring  = NameReserved(namestring );

typedefnames[k] = nestedtypestring + namespacestring + namestring  + "/* 02" + k.ToString("X6") + " */";

}

}

}

 

e.il

.class yyy

{

}

.class zzz

{

.method  class zzz a1()

{

}

.method  class yyy a2()

{

}

}

 

Output

.class /*02000002*/ private auto ansi yyy

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

} // end of class yyy

 

.class /*02000003*/ private auto ansi zzz

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .method /*06000001*/   class zzz/* 02000003 */

  .method /*06000002*/   class yyy/* 02000002 */

} // end of class zzz

 

In this program we show you how to work with return values that are an instance of a type that we create ourselves in our code. If you take a close look at the il file, method a1 returns the type zzz and method a2 returns the type yyy, both defined in the same file. If we look at the signatures, both start with a value 0x12 and then some number that we will explain soon.

 

If you look closely at the abc function, we have called a new method FillArray. This is the last method we will add to the abc function and we will explain it very soon. In the GetElementType method we have simply added an if statement that checks for the type being 0x12.

 

If it is, we call a function called GetTokenType to figure out the type for us, which we simply return. The GetTokenType method takes an out parameter that tells us how many bytes the numbers that represent the type take up. We add one to this value to account for the 0x12 data type.

 

What we are saying is that the minute we find a 0x12, this signifies some type. The method GetTokenType is passed the blob array as well as the index in the array of the 0x12. We first call the CorSigUncompressData method which signifies that the bytes following may be compressed.

 

In our case the uncompressed value is a single byte, as the value of howmanybytes1 will confirm. We pass this value to the method DecodeToken along with the value 0x12 that is the start of the type signature. We simply prefix the returned string with the word class and return it.

 

The number of bytes taken up is simply the return value of the CorSigUncompressData function. Thus all the action now moves to the method DecodeToken. After the byte 0x12 comes what the specs call a token. A token is an efficient way of storing a table and row number together.

 

Thus the lowest two bits are the table number and the remaining bits the row number. We thus bitwise AND with 0x3 to extract the table number and right shift by 2 to get the row number. If the tabletype is 0, the row number points into the type def table. Thus the value of 0xc denotes table 0 and by dividing by 4, we get a value of 3.

 

The class zzz has a row number of 3. The second token value was 8, divided by 4 gives us 2 and class yyy is row 2 in the type def table.
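The arithmetic on the two tokens can be checked with a short sketch. The class TypeToken and its members are names made up for this illustration and are not part of our program. The three table names follow the order that ECMA-335 lays down for this kind of token, namely TypeDef, TypeRef and TypeSpec.

```csharp
static class TypeToken
{
    // Tables a two-bit tag can select, in the order given by ECMA-335.
    public static readonly string[] Tables = { "TypeDef", "TypeRef", "TypeSpec" };

    // Hypothetical helper: splits an uncompressed token into table and row.
    public static void Split(int token, out int table, out int row)
    {
        table = token & 0x03;   // lowest two bits select the table
        row = token >> 2;       // the remaining bits are the row number
    }
}
```

A token of 0x0C splits into table 0 and row 3, the class zzz, and a token of 0x08 splits into table 0 and row 2, the class yyy.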

 

As we simply have to read the name and namespace from the type def table, why not create an array typedefnames that stores the name of each type as a string, so that we can read a type name by using the appropriate index into the array? This is what the FillArray method does.

 

We start by defining an instance array of strings, typedefnames. We then use the same variables, old and tableoffset, to position us at the starting point of the type def table, which is known as number 2.

 

We will have at least one row, bearing in mind the global type that gets automatically created, and we create an array typedefnames that is one larger than the number of rows as the rows are numbered from 1. We now iterate in a for loop depending upon the number of rows we have. We store the name and namespace fields for later use to get at the name and namespace names as strings.

 

We have to concatenate the name and namespace names and place a dot between them if and only if there exists a non-empty namespace name. This we check by adding a dot to the namespace name only if it has a non-zero length. We could have used the NameReserved function around the GetString function for the name as well but chose to break it up into two lines.

 

A type may also be nested and hence we use our trusted method GetNestedTypeAsString, passing it the type so that it returns the names of the types this type is nested in. Program17 is where we first introduced this method.

 

We now fill up the typedefnames array by first starting with the nested type, then the namespacename with or without the dot, followed by the name of the type and its number in comments.
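The concatenation itself is easy to try out in isolation. The sketch below uses an invented class TypeDefName with a method Build. It takes the three strings directly and leaves out the calls to GetString, NameReserved and GetNestedTypeAsString, which need the rest of our program.

```csharp
static class TypeDefName
{
    // Hypothetical helper: nested prefix, namespace, name and row number
    // are joined the same way the last line of the FillArray loop does.
    public static string Build(string nested, string nspace, string name, int row)
    {
        if (nspace.Length != 0)
            nspace = nspace + ".";          // dot only for a real namespace
        return nested + nspace + name + "/* 02" + row.ToString("X6") + " */";
    }
}
```

For row 3 with an empty namespace and the name zzz we get zzz/* 02000003 */, the same text that appears in the output above.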

 

Program28.csc.txt

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type == 0x12 || type == 0x11 )

{

int howmanybytes2;

returnstring = GetTokenType( blobarray , index , out howmanybytes2);

howmanybytes = howmanybytes2 + 1;

}

return returnstring;

}

public string GetTokenType ( byte [] blobarray , int index , out int howmanybytes)

{

string returnstring = "";

int uncompressedbyte;

int howmanybytes1 = 0;

howmanybytes1 = howmanybytes1  + CorSigUncompressData(blobarray , index + 1 , out uncompressedbyte);

string dummy1  = DecodeToken(uncompressedbyte , blobarray[index]);

if ( blobarray[index] == 0x12)

returnstring = "class " + dummy1; 

else if ( blobarray[index] == 0x11)

returnstring = "valuetype " + dummy1;

howmanybytes = howmanybytes1;

return returnstring;

}

 

e.il

.class yyy

{

}

.class zzz

{

.method  valuetype zzz a1()

{

}

.method  valuetype yyy a2()

{

}

}

 

Output

.class /*02000002*/ private auto ansi yyy

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

} // end of class yyy

 

.class /*02000003*/ private auto ansi zzz

       extends [mscorlib/* 23000001 */]System.Object/* 01000001 */

{

  .method /*06000001*/   valuetype zzz/* 02000003 */

  .method /*06000002*/   valuetype yyy/* 02000002 */

} // end of class zzz

 

This example is a slight variation of the earlier one. In the il file, instead of using the word class we use the word valuetype. There are two basic kinds of data types in the il world. Those that are simple and are created on the stack are called value types. The others are represented by the word class.

 

A class denotes an object whose actual value is not passed but a reference to it. Thus each time we want to access a class or object we have to de-reference the value which is pointed at by the reference. A value type instead stores the actual value and hence is faster. A value type object is extended from the ValueType class.

 

These data types are simpler and faster to access, but as our e.il shows, the assembler makes no check that the data type is derived from the ValueType class; both yyy and zzz extend System.Object. There is only one way to create a data type and that is using the class directive. In the GetElementType method we simply add the check for a type 0x11.

 

Thus both the class and the value type are followed by a type token. In the method GetTokenType we use an if statement to figure out whether we add the word class or valuetype. This is why we pass the first byte or type byte to this function.
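The choice between the two words, followed by the token lookup, can be seen in one place in the sketch below. It is a cut-down illustration and not our GetTokenType: the class TokenTypeDemo and the method Describe are invented names, the token is assumed to be a single byte that needs no uncompressing, and only tokens that point into the type def table are handled.

```csharp
static class TokenTypeDemo
{
    // Hypothetical helper. typeDefNames is indexed by row number, so entry 0
    // is unused, just like the typedefnames array filled by FillArray.
    public static string Describe(byte[] blob, int index, string[] typeDefNames)
    {
        string keyword = "";
        if (blob[index] == 0x12)
            keyword = "class ";
        else if (blob[index] == 0x11)
            keyword = "valuetype ";

        int token = blob[index + 1];        // assumption: one-byte token
        if ((token & 0x03) != 0)            // not the type def table
            return keyword;
        return keyword + typeDefNames[token >> 2];
    }
}
```

With the names yyy at row 2 and zzz at row 3, the bytes 11 0C give valuetype zzz and the bytes 12 08 give class yyy.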

 

Program29.csc

public string GetElementType ( int index , byte [] blobarray , out int howmanybytes)

{

howmanybytes = 0;

string returnstring = "";

byte type = blobarray[index];

if ( type == 0x10)

returnstring = GetByrefToken(index, blobarray , out howmanybytes);

return returnstring;

}

public string GetByrefToken (int index , byte [] blobarray , out int howmanybytes)

{

string returnstring = "";

int howmanybytes2;

returnstring = GetElementType (index+1 , blobarray , out howmanybytes2) + "&";

howmanybytes = howmanybytes2 + 1;

return returnstring;

}

 

e.il

.class zzz

{

.method  int32 & a1()

{

}

.method  class zzz & a2()

{

}

.method   int32  [12][3,5] &  a3()

{

}

}

 

Output

  .method /*06000001*/   int32&

  .method /*06000002*/   class zzz/* 02000002 */&

  .method /*06000003*/   int32[12][3,5]&

 

If you look at the il file we have a & following the data type. This makes the data type a managed pointer, a reference to a value of that type. We will study the difference between managed and unmanaged pointers in greater detail later. All that we would like to say is that all programmers, if they do not work with pointers, need to go back to school.

 

The only difference we see by adding a & is that the first type byte is 0x10. Then we have the same type signature as we have worked with before. In method a3, an array follows and thus we have a 0x14 following the 0x10. In the GetElementType method we call the method  GetByrefToken which does the grunt work.

 

We first call the GetElementType method with the index of the next byte to figure out the type for us and then return the same type followed by a &. We also set the howmanybytes variable to one larger than the value set by the GetElementType method, as that call knows nothing of the extra 0x10 byte that came before it.

 

Everything else remains the same and you can now see how we use recursion to call the same code over and over again.
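The recursion is easiest to see when everything else is stripped away. The sketch below is a toy version written for this purpose, with the invented class name ByRefDemo. It knows only two type bytes, 0x08 for int32 and 0x10 for the &, whereas the real GetElementType handles every type we have covered so far.

```csharp
static class ByRefDemo
{
    // Toy decoder: understands int32 (0x08) and the byref marker (0x10) only.
    public static string GetElementType(int index, byte[] blobarray, out int howmanybytes)
    {
        howmanybytes = 0;
        byte type = blobarray[index];
        if (type == 0x08)
        {
            howmanybytes = 1;
            return "int32";
        }
        if (type == 0x10)
        {
            int inner;
            // Decode whatever follows the 0x10, then tack on the &.
            string name = GetElementType(index + 1, blobarray, out inner);
            howmanybytes = inner + 1;       // one extra byte for the 0x10
            return name + "&";
        }
        return "";                          // unknown to this toy version
    }
}
```

For the two bytes 10 08 the method returns int32& and reports that 2 bytes were used.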

 

 
