3. MetaData Table Types

 

Every table type is assigned a unique number, or to be more precise, a bit in the valid field. Thus, there can be a maximum of 64 different table types, since the valid field has been declared as a long.

 

At the tail-end of the previous chapter, we made a passing mention of the names of the tables that dwell in the file. In this chapter, we will investigate the contents of these tables in detail for the smallest possible exe file.

 

The Module Table

The first table, identified by bit 0, is the Module table. A row is inserted in the Module table as a consequence of the .module directive in IL, or Intermediate Language. The Module table is restricted to a single row, because a module represents exactly one file, i.e. either a dll file or an exe file.

 

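The fields in the table are as follows:

Generation: a 2 byte value, reserved

Name: an index into the String heap

Mvid: an index into the Guid heap

EncId: an index into the Guid heap, reserved

EncBaseId: an index into the Guid heap, reserved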

 

 

Given below is the xyz function, which is utilized to display the contents of each field. Modify the contents of xyz function of the last program in a.cs by adding the following statements:

 

public void xyz()

{

int offs = tableoffset ;

int Generation = BitConverter.ToUInt16 (metadata, offs);

offs += 2;

int Name = BitConverter.ToUInt16 (metadata, offs);

offs += 2;

int Mvid = BitConverter.ToUInt16 (metadata, offs);

offs += 2;

int EncId = BitConverter.ToUInt16 (metadata, offs);

offs += 2;

int EncBaseId = BitConverter.ToUInt16 (metadata, offs);

Console.WriteLine("Generation: {0}" ,Generation );

Console.WriteLine("Name      :{0} {1}" ,  GetString(Name) , Name);

Console.WriteLine("Mvid      :#GUID[{0}]" , Mvid);

DisplayGuid(Mvid);

Console.WriteLine();

Console.WriteLine("EncId     :#GUID[{0}]" , EncId );

Console.WriteLine("EncBaseId :#GUID[{0}]" , EncBaseId);

}

 

Output

Generation: 0

Name      :b.exe 10

Mvid      :#GUID[1]

{A921C043-32F-4C5C-A5D8-C8C3986BD4EA}

EncId     :#GUID[0]

EncBaseId :#GUID[0]

 

Every table commands its own internal structure. The Module table starts with a 2 byte reserved field called the Generation field, whose value is always set to zero. We trust that by this time, you would have mastered the mechanism of employing the functions from the BitConverter class.

 

The second field is the Name field, which contains an index into the string table. Here, it has a value of 10. So, the string that starts at the 10th byte of the data of the Strings stream is the name of the module.

 

The index is set to 2 bytes, because the heaps field member in the #~ stream header contains a value of 0. It is the 7th byte from the commencement of the header. Out of the 8 bits, only 3 bits are active for the moment.

 

 

If the first bit is set, it denotes that all indexes into the Strings stream are 4 bytes wide. Since it is unset in this case, the size of the index is 2 bytes. The second bit does the same for the GUID stream, and the third bit for the Blob stream. The remaining bits are not used.

 

Thus, each time a structure member contains an index value, the heaps field is checked to determine whether that index occupies 2 bytes or 4 bytes. The basic rule is that if the size of a stream reaches 64k, the index is 4 bytes, or else it is 2 bytes.
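
The rule above can be sketched in a few lines of C#. This is only an illustration and not part of a.cs: the class name HeapSizes, its constants and its two functions are names of our own, and the bit masks 0x01, 0x02 and 0x04 are taken from the metadata specification. BitConverter follows the byte order of the machine, so the sketch assumes a little endian machine.

```csharp
using System;

public static class HeapSizes
{
    // Bit masks of the heaps byte in the #~ stream header (names are our own).
    public const byte StringsHeap = 0x01;
    public const byte GuidHeap = 0x02;
    public const byte BlobHeap = 0x04;

    // Returns the width in bytes, 2 or 4, of an index into the given heap.
    public static int IndexSize(byte heapSizes, byte heapMask)
    {
        return (heapSizes & heapMask) != 0 ? 4 : 2;
    }

    // Reads one heap index of the right width and moves offs past it.
    public static int ReadHeapIndex(byte[] metadata, ref int offs, byte heapSizes, byte heapMask)
    {
        int size = IndexSize(heapSizes, heapMask);
        int value = size == 4
            ? BitConverter.ToInt32(metadata, offs)
            : BitConverter.ToUInt16(metadata, offs);
        offs += size;
        return value;
    }
}
```

With the heaps byte of 0 found in b.exe, IndexSize returns 2 for all three heaps, which is why the programs in this chapter can read every heap index with a 2 byte call.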

 

This provides ample testimony to the fact that the metadata world has been created for enhancing efficiency.

 

Using the GetString function, the string that begins at the 10th position in the strings heap area, is obtained. In this case, the extracted bytes reveal the name of b.exe.

 

Thus, the Name field is an index into the string stream. It cannot contain a null value. The length of the name is limited by the constant MAX_PATH_NAME. The format consists of just the file name and the extension. No other information, such as the path name or drive name, is permitted.

The field called Mvid is an index into the Guid heap. So, the Guid is displayed using the function DisplayGuid. The value displayed in our output will probably be very different from the value flashed on your screen. If you compile the b.exe program once again, you will notice that the value of the Guid also changes.

 

Each time a compiler creates an exe file under the .Net framework, it generates a new Guid. This aids in distinguishing between two different versions of the module. Thus, the Mvid column uniquely identifies an instance of the module.

 

The documentation also divulges the fact that the algorithm used is specified in ISO/IEC 11578:1996 ( Annex A). In the Common Object Request Broker Architecture (CORBA) and Remote Procedure Call (RPC) world, it is termed as a UUID, or a Universally Unique Identifier, while the COM world uses a CLSID, a GUID or an IID.

 

The Mvid is not employed by the Virtual Execution System (VES). Other programs like debuggers may exploit the fact that every exe file has a unique number embedded in it. Finally, even though the Mvid is not used, it cannot be a null Guid.

 

The next two fields are indexes into the Guid heap. They are reserved with both possessing values of zero.

 

TypeRef Table

The xyz function now displays the contents of the TypeRef table. This table has three rows, where the size of each row is 6 bytes. Also, this table is at an index position of 1, which signifies that the second bit in the valid table field vector is on.

 

a.cs

public void xyz()

{

bool b = tablepresent(1);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[1] ; k++)

{

int resolutionscope = BitConverter.ToInt16 (metadata , offs);

offs = offs + 2;

int name = BitConverter.ToInt16 (metadata , offs);

offs = offs + 2;

int nspace = BitConverter.ToInt16 (metadata , offs);

offs = offs + 2;

Console.WriteLine("Row[{0}]" , k);

int tag = resolutionscope & 0x03;

if ( tag == 0 )

Console.Write("Module :");

if ( tag == 1 )

Console.Write("ModuleRef :");

if ( tag == 2 )

Console.Write("AssemblyRef");

if ( tag == 3 )

Console.Write("TypeRef:");

int riid = resolutionscope >> 2;

Console.Write("[{0}] token=0x{1}" , riid , resolutionscope.ToString("X") );

Console.WriteLine();

Console.WriteLine("Name      :{0},0x{1}",GetString(name), name.ToString("X"));

Console.WriteLine("Namespace :{0},0x{1}",GetString(nspace), nspace.ToString("X"));

}

}

}

public bool tablepresent(byte i)

{

int p = (int)(valid >> i) & 1;

byte [] sizes = {10,6,14,2,6,2,14,2,6,4,6,6,6,4,6,8,6,2,4,2,6,4,2,6,6,6,2,2,8,6,8,4,22,4,12,20,6,14,8,14,12,4};

for ( int j = 0 ; j < i ; j++)

{

int o = sizes[j] * rows[j];

tableoffset = tableoffset + o;

}

if ( p == 1)

return true;

else

return false;

}

 

Output

Row[1]

AssemblyRef[1] token=0x6

Name         :Object,0x20

Namespace :System,0x19

Row[2]

AssemblyRef[1] token=0x6

Name        :DebuggableAttribute,0x49

Namespace :System.Diagnostics,0x36

Row[3]

AssemblyRef[1] token=0x6

Name         :Console,0x5F

Namespace :System,0x19

 

Besides the xyz function, a new function called tablepresent has been introduced. This function performs two tasks:

     Firstly, it reveals whether a table at the index position supplied as a parameter, exists or not.

     Secondly, and more importantly, it sets the tableoffset variable to the position where the table begins in the array.

 

Now, let us cast a closer look at the working of this function. We have passed a value of 1, which is the index of the TypeRef table, to the function, which receives it in the parameter i:

public bool tablepresent(byte i)

 

The first line in the function right shifts the bits in the valid field i times, and then, it performs a bitwise AND with the value of 1.

 

Thus, only the lowest bit of the shifted value, which is bit i of the valid field, is checked. If the resultant value in the variable p is 1, a boolean value of true is returned; otherwise, a value of false is returned.

 

int p = (int)(valid >> i) & 1;

 

if ( p == 1)

return true;

else

return false;

 

The positioning of the tableoffset variable is handled differently. Initially, the tableoffset variable points to the start of the tabledata section in the metadata array.

 

The total number of tables present in the valid field may vary. Thus, we have an array named sizes, which specifies the size of each row in a table.

 

byte [] sizes = {10,6,14,2,6,2,14,2,6,4,6,6,6,4,6,8,6,2,4,2,6,4,2,6,6,6,2,2,8,6,8,4,22,4,12,20,6,14,8,14,12,4};

 

In the earlier program, we noticed that the row size of the first table, Module, was 10, that of the second, TypeRef, was 6, etc. The size of each table can be well ascertained, though it has not been specifically documented. However, the size can vary when the number of rows in a table exceeds 64k, resulting in a 4 byte index in place of a 2 byte one. Since our programs are too small, such a case will never occur.

 

Thus, the array named sizes is filled up with the sizes of the rows in the table types.

 

Once this is achieved, the 'for' loop iterates over every table that precedes the one requested. In this case, since the value of variable i is 1, the loop repeats only once. Within the loop, the size of the row in the sizes array is multiplied by the corresponding number of rows in the rows array. The variable j in the 'for' loop is the table id, and it also serves as the index into both arrays.

 

for ( int j = 0 ; j < i ; j++)

{

int o = sizes[j] * rows[j];

tableoffset = tableoffset + o;

}

 

As an outcome of the above calculation, the variable o holds the space occupied by one table, and this space is added to the tableoffset variable on every pass of the loop.
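
The two tasks of tablepresent can also be written as two small functions that do not touch any member variable. The sketch below is our own variation and not the code of a.cs: the class name TableLocator and the function names are hypothetical, and we assume, as the loop above does, that the rows array holds 0 for every table that is absent. Since the offset is returned instead of being accumulated in tableoffset, the function may be called any number of times.

```csharp
public static class TableLocator
{
    // Returns true when bit 'index' of the valid field is set.
    public static bool IsPresent(long valid, int index)
    {
        return ((valid >> index) & 1) == 1;
    }

    // Returns the position at which table 'index' begins.
    // tablesStart is the position of the first table in the metadata array.
    public static int OffsetOf(int tablesStart, int index, byte[] sizes, int[] rows)
    {
        int offset = tablesStart;
        for (int j = 0; j < index; j++)
        {
            // Space taken by table j: row size multiplied by row count.
            offset += sizes[j] * rows[j];
        }
        return offset;
    }
}
```

For instance, with one Module row of 10 bytes and three TypeRef rows of 6 bytes each, OffsetOf places the TypeDef table, whose index is 2, a total of 28 bytes after the start of the tables.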

 

In the xyz function, the return value of the function tablepresent is verified. If the value is true, then it is unmistakably evident that the table is present. Thus, the program marches ahead in order to flaunt its contents.

 

A 'for' loop is implemented to read and display the values in each row. Since the TypeRef table has 3 rows, the loop repeats thrice.

 

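The TypeRef table has a row size of six bytes with the following columns:

ResolutionScope: a coded index of 2 bytes

Name: an index into the String heap

Namespace: an index into the String heap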

 

 

Let us examine this backwards. The last two columns are indexes into the string heap. Thus, they occupy two bytes each. The last column refers to the namespace and the second column points to the class within the namespace. The output shows that there are a total of three classes or types that are referred to in the program, viz.

System.Object, System.Console and System.Diagnostics.DebuggableAttribute.

 

We reproduce the file b.cs once again in order to substantiate our handiwork.

 

 

b.cs

public class zzz {

public static void Main ()

{

System.Console.WriteLine("hello");

}

}

Everything in the .Net world is derived from the object class. Thus, internally, the zzz class is derived from the System.Object. Due to this, the object class makes an appearance in the file. The Console class comes into play because we have called it explicitly. However, no reference is ever made to the DebuggableAttribute class. So, from where did it originate?

 

Comment out the WriteLine function in the file and run the a.exe file again. You will notice that the reference to the class Console in the TypeRef table also does a vanishing act. Thus, as and when you call methods from different classes, a row for every unique class name gets added to the TypeRef table.

 

The TypeRef table incorporates all the Types or classes that are referred to in the program. The words class and type may be used interchangeably. This applies not only to the static methods called from the classes, but also to any class that has been referred to, including the classes that have been instantiated using new.

 

There is no distinction between the System class and a user-defined class. Therefore, when the /R option is applied to refer to classes in other dlls, the table carries an entry for it too. Merely defining a variable also results in the addition of the name of the namespace-class to the TypeRef table. Thus, to cut a long story short, every external type or class referred to in the exe file, finds an entry in the TypeRef table.

 

Undoubtedly, the C# compiler too has ushered in some code, which refers to the class DebuggableAttribute. Hence, the class name is incorporated in the table.

 

Now, let us examine the first two bytes in the row. This short is called a 'resolution scope' or a 'resolution scope coded index'. The field can have one of the following four values: Module, ModuleRef, AssemblyRef or TypeRef.

 

The resolution scope value is an index into any of the above tables. The value signifies the scope of the namespace-class. The lowest two bits of the short identify which of the above tables is meant.

The definition of scope merely restricts the use of the entity to a select few. For instance, the scope of a local variable is the method in which it is created.

 

In the program, the first 2 bits are verified. Hence, we use the bitwise AND operator with 3. The output substantiates the fact that all the three classes have a scope of AssemblyRef; the assembly is the largest entity in the .Net world. The remaining 14 bits of the short provide the actual index value in the table that is identified by the first 2 bits.

 

Then, the bits are right-shifted by 2 to furnish an index into the AssemblyRef table. We will elucidate this table in considerable detail at a later stage.
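
The split of the short into a table tag and a row number can be sketched as follows. The class name ResolutionScope and its functions are names of our own, not part of a.cs. The sketch assumes that the value was read as an unsigned number, for instance with ToUInt16, since a negative value would give a wrong row number after the shift.

```csharp
public static class ResolutionScope
{
    // Tables selected by the lowest two bits, in tag order 0 to 3.
    static readonly string[] Tables = { "Module", "ModuleRef", "AssemblyRef", "TypeRef" };

    // The lowest 2 bits choose the table.
    public static string TableName(int coded)
    {
        return Tables[coded & 0x03];
    }

    // The remaining bits hold the row number within that table.
    public static int RowIndex(int coded)
    {
        return coded >> 2;
    }

    // Formats the value in the style of the output shown earlier.
    public static string Describe(int coded)
    {
        return TableName(coded) + "[" + RowIndex(coded) + "]";
    }
}
```

For the token value of 0x6 seen in the output, the lowest two bits are 2 and the remaining bits are 1, so Describe returns AssemblyRef[1].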

 

 

An AssemblyRef token is displayed when the types referred to in the program, originate from another assembly.

 

The name field is an index into the string heap. This string can never be a null string. The length of this string is limited to MAX_CLASS_NAME. In contrast, the namespace field is allowed to be null. Every name ought to be a valid CLS identifier.

 

Finally, it may be observed that two rows with the same Resolution Scope, Name and Namespace, cannot obviously co-exist.

 

The AssemblyRef table, which is referred to by all the three classes, will be expounded in a short while from now, since it is a table in its own right.

 

 

TypeDef Table

 

a.cs

using System.Reflection;

 

public void xyz()

{

bool b = tablepresent(2);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[2] ; k++)

{

 TypeAttributes flags = (TypeAttributes)BitConverter.ToInt32 (metadata, offs);

offs += 4;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int nspace = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int cindex = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int findex = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int mindex = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row:{0}" , k);

Console.WriteLine("Flags     : {0}" , flags);

Console.WriteLine("Name      : {0}" , GetString(name));

Console.WriteLine("NameSpace : {0}" , GetString(nspace));

Console.Write("Extends:");

int u = cindex & 3;

if (u == 0)

Console.Write("TypeDef");

if (u == 1)

Console.Write("TypeRef");

if (u == 2)

Console.Write("TypeSpec");

Console.Write("[{0}]", cindex >> 2);

Console.WriteLine();

Console.WriteLine("FieldList Field[{0}]", findex);

Console.WriteLine("MethodList Method[{0}]", mindex);

}

}

}

 

Output

Row:1

Flags     : Class

Name      : <Module>

NameSpace :

Extends:TypeDef[0]

FieldList Field[1]

MethodList Method[1]

Row:2

Flags     : AutoLayout, AnsiClass, NotPublic, Public, BeforeFieldInit

Name      : zzz

NameSpace :

Extends:TypeRef[1]

FieldList Field[1]

MethodList Method[1]

 

The 'using System.Reflection;' statement must be added at the beginning of the program, since the TypeAttributes enum belongs to this namespace. Without it, the compiler reports errors.

 

The TypeDef table that follows the TypeRef table, is the next in sequence. Hence, the value assigned to it is 2. The tablepresent function has been left unaltered. The xyz function has been modified to cater to the fields of the TypeDef table.

 

In short, the TypeDef table stores every type or class created in our assembly. A type could be a class, an interface, a structure, an enum, and so on.

 

You may add an interface, or a structure, or an enum in the file b.cs. Thereafter, you can verify the fact that a fresh row gets added for every new type that is appended.

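We began by telling you that a type and a class are one and the same. However, we now widen the definition of a type to include the above mentioned entities. The TypeDef structure is a product of the following fields:

Flags: a 4 byte bitmask of type TypeAttributes

Name: an index into the String heap

Namespace: an index into the String heap

Extends: a coded index into the TypeDef, TypeRef or TypeSpec table

FieldList: an index into the Field table

MethodList: an index into the Method table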

 

 

 

The first 4 bytes are the Flags field that we shall explain shortly. They are followed by the name of the type. The first row has the type name as <Module> and the flag of Class. The second row has the type name as zzz, which is the name of the class in the program. Thus, the TypeDef table rows or types depend upon the entities created in the class. However, we never create the first type, i.e. <Module>.

 

The first row symbolises a pseudo class called <Module>. It contains all the functions and variables that are created either globally or at the module level. It comports itself as the parent for all such entities. In the C# language, we are not authorised to create anything outside a class. However, in C++, we are allowed to have global functions and variables, which are enclosed by the class named <Module>.

 

The Name field is followed by the index for the Namespace. The class named <Module> does not belong to any namespace, since such a concept is non-existent in the C# programming language. The namespace for the class zzz also does not exist. Therefore, the index value into the String heap is zero.

 

The next field is called the Extends field. This field is a coded index, which can refer to one of three tables: TypeDef, TypeRef or TypeSpec. The lowest two bits name the table and the remaining bits hold the row number. In the case of the class named <Module>, it is an index into the TypeDef table. However, since the value of the index is zero, it becomes an invalid index. The rationale behind it is that the class called <Module> does not extend from any class.

 

This field is easier to comprehend with the second row. All classes in the .Net world extend from System.Object. Therefore, the zzz class is also derived from it. Thus, the coded index points to the first row of the TypeRef table.

 

The first of the three rows of the TypeRef table in the previous example represents the System.Object class. Thus, the coded index not only reveals the table, but also the specific row within that table. We shall finally write a program that will cross-reference all these disparate tables.

 

The next two fields are indexes into the field table and the method table, respectively. The explanation for these tables will be furnished in due course.

 

It is essential to understand the TypeAttributes enum before we conclude the explanation of the TypeDef table. The Reflection namespace defines an enum called TypeAttributes. The first field of the TypeDef table is simply an int, whose bits record the attributes applied to the class statement. We merely employ the ToString function to display the attributes.

 

To quench your thirst for knowledge, you can take a look at the file named corhdr.h in the folder Program files-Microsoft Visual Studio.Net-FrameWorkSDK-Include. This header file has an enum called CorTypeAttr, which possesses the bits representing the attributes for a class. If the 1st bit is on, it signifies that it is a class with public access. If the 6th bit is on, it signifies that the class is an interface.

 

The most clear-cut solution is to use the ToString function of the enum, just as we have incorporated it here. The output unfurls the fact that the special class called <Module> is tagged with only a single attribute named Class.

 

However, the class zzz has many flags set. Let us understand what the various bits actually signify.

 

The AutoLayout flag has been set. It specifies that the Common Language Runtime or the code supplied by Microsoft, will be responsible for laying out the fields of the class. The AnsiClass does not apply to C# coding, since it basically deals with interpretation of a C++ pointer to a string, or a LPTSTR as per the terminology of ANSI or Unicode.

 

Finally, the BeforeFieldInit flag calls upon the runtime to initialize the members of the class, before the first static field is accessed. This is a very succinct explanation on flags. We shall revert to the flags field a little later, when we come across the flags set by other types, such as a structure, an interface and an enum.

 

Method Table

 

a.cs

public void xyz()

{

bool b = tablepresent(6);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[6] ; k++)

{

int rva = BitConverter.ToInt32 (metadata, offs);

offs += 4;

MethodImplAttributes impflags = (MethodImplAttributes ) BitConverter.ToInt16 (metadata, offs);

offs += 2;

 

MethodAttributes flags = (MethodAttributes) BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int signature = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int param = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row {0}",k );

Console.WriteLine("RVA      :{0}", rva.ToString("X"));

Console.WriteLine("Name     : {0}", GetString(name));

Console.WriteLine("ImpFlags :{0}",impflags );

Console.WriteLine("Flags    :{0}",flags.ToString("X"));

Console.WriteLine("Signature: #Blob[{0}]",signature);

Console.WriteLine("ParamList: Param[{0}]",param);

Console.WriteLine();

}

}

}

 

Output

Row 1

RVA      :2050

Name     : Main

ImpFlags :Managed

Flags    :00000096

Signature: #Blob[10]

ParamList: Param[1]

 

Row 2

RVA      :2068

Name     : .ctor

ImpFlags :Managed

Flags    :00001886

Signature: #Blob[14]

ParamList: Param[1]

 

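The next metadata table that we would be dealing with is the one at the 7th position in the valid field. The table is the Method table, thereby having an index of 6. This table has one row for every method created in the module. The output confirms the presence of two methods. Each row has the following fields:

RVA: a 4 byte value

ImplFlags: a 2 byte bitmask of type MethodImplAttributes

Flags: a 2 byte bitmask of type MethodAttributes

Name: an index into the String heap

Signature: an index into the Blob heap

ParamList: an index into the Param table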

 

 

 

Before absorbing what RVA is all about, let us first explore the Name field, which is an index into the String heap. The first row has a method named Main and the second row has a method called .ctor. You would surely wonder about the wellspring, or the source, from which this method has emanated.

 

All methods whose names begin with a dot are created without any human intervention. Such methods are of special significance from the compiler's point of view.

 

A class that is devoid of a constructor, is always provided with a free constructor that has no parameters. This free constructor is named .ctor.

 

Thus, you would certainly realise and appreciate that a substantial quantum of the C# programming language can be learnt by deciphering the metadata.

 

The RVA field is the Relative Virtual Address, which is a number that points to the starting location of the executable code of the method. The output reveals that the first function Main starts at the relative address 0x2050.

 

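To arrive at the location from where the code begins on disk, we first detect the difference between 0x2050 and 0x2000 (the section alignment, which here is also the address at which the first section begins in memory). The result is 0x50, which is then added to 512 (the file alignment, which here is also the position at which that section begins on disk). The final outcome is 592, which is the location at which the code for the method Main begins on disk.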
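
The arithmetic above can be captured in a tiny function. This is a sketch under the assumptions of this example only: the class name RvaConverter and the parameter names are our own, and the caller is expected to pass the memory address and the disk position of the section that actually contains the RVA.

```csharp
using System;

public static class RvaConverter
{
    // Converts an RVA into a position in the file.
    // sectionVirtualAddress: where the section starts in memory (0x2000 here).
    // sectionRawPointer: where the same section starts on disk (512 here).
    public static int ToFileOffset(int rva, int sectionVirtualAddress, int sectionRawPointer)
    {
        if (rva < sectionVirtualAddress)
            throw new ArgumentException("The RVA lies before the start of the section.");

        // Distance from the start of the section, carried over to the disk image.
        return rva - sectionVirtualAddress + sectionRawPointer;
    }
}
```

Calling ToFileOffset(0x2050, 0x2000, 512) yields 592 for Main, and the same call with 0x2068 yields 616 for the constructor in the second row.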

 

The code, in much the same manner as everything else, begins with a header and is followed by the bytes in IL. These bytes in turn refer to the metadata tables. The next book in the metadata series will elucidate the mechanism of a Dis-assembler as in ILDasm.

 

 

The second field consists of the MethodImplAttributes flags. These flags determine the attributes that are applied on the method.

 

There are two basic types of methods, i.e. managed and unmanaged methods. An unmanaged method is one whose implementation is native code rather than IL. Code that uses pointers in C# evades verification, but it is still compiled to IL and remains managed.

 

The documentation contains details about each bit and its representation.

 

If the first bit is off, the method implementation is CIL and managed.

If it is on, then it signifies a native method.

 

The second bit called OPTIL is reserved and is always assigned a value of zero. This specifies code that is to be employed only by the .Net infrastructure and never by mere mortals like us.

 

A value of 3 signifies that the method is a runtime method, indicating that there is no code present in the file, since it would be supplied by the runtime. Events are handled in much the same manner.

 

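A value of 0x20 acquaints us with the fact that the method is single threaded or synchronized. In C#, this bit is set by applying the MethodImpl attribute with the MethodImplOptions.Synchronized option; the lock statement offers similar protection for a block of code within a method.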

 

A value of 0x08 denotes that the method cannot be inlined. The CodeTypeMask, which has a value of 3, isolates the bits that indicate the type of code. A value of IL or 0 suggests that the method code is in MSIL. The value of 0x1000 stands for an internal call, which is reserved for internal consumption only.

 

The range check value, MaxMethodImplVal, is 0xffff. The PreserveSig flag notifies us that the method signature is exported as advertised, and is not to be mangled for HRESULT conversions. The HRESULT is the return type in the COM world.

 

Let us look at the next program to interpret the second flags field for Method Attributes. This field is widely disparate from the one we just covered for the method implementation attributes.

 

a.cs

public void xyz()

{

bool b = tablepresent(6);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[6] ; k++)

{

int rva = BitConverter.ToInt32 (metadata, offs);

offs += 4;

MethodImplAttributes impflags = (MethodImplAttributes )BitConverter.ToInt16 (metadata, offs);

offs += 2;

short flags = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int signature = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int param = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row {0}",k );

Console.WriteLine("Name     : {0}", GetString(name));

Console.WriteLine("Flags    :{0}",flags.ToString("X"));

Type t = typeof( System.Reflection.MethodAttributes );

FieldInfo[] f = t.GetFields(BindingFlags.Public | BindingFlags.Static);

for ( int i = 0; i < f.Length; i++ )

{

int fv = (int)f[i].GetValue(null);

if ( (fv & flags) == fv)

Console.Write( "  {0} {1}  " , f[i].Name , fv.ToString("X") );

}

Console.WriteLine();

}

}

}

 

 

Output

Row 1

Name     : Main

Flags    :96

PrivateScope 0 FamANDAssem 2 Family 4 Public 6 Static 10 HideBySig 80 ReuseSlot 0

Row 2

Name     : .ctor

Flags    :1886

PrivateScope 0 FamANDAssem 2 Family 4 Public 6 HideBySig 80 ReuseSlot 0 SpecialName 800 RTSpecialName 1000

 

The glitch with regards to the method attributes is that, unlike the implementation attributes, the ToString method of the MethodAttributes enum merely displays the hex value. Every flag variable pursues the same concept. Every bit represents some property or attribute that is set, such as static, private etc.

 

The documentation bequeaths us with no enlightenment with regards to these bits. So, we have abstained from writing a large program that compares every bit. Instead, the implementation of the Reflection API is recommended. This API is merely a window to the metadata, which is proffered to make life less hassled.

 

In the above program, the third field is read into a short variable named flags. Then, employing the keyword typeof, a Type object for the enum MethodAttributes from the System.Reflection namespace is obtained. Every class or enum has a corresponding Type object associated with it.

 

The GetFields function returns an array of FieldInfo objects, which in our case, is an array of the public static fields available in the type, i.e. the members of the enum. The BindingFlags enumeration has a total of 18 members. Thus, the enum simply acts as a filter.

 

By analysing the value held by each FieldInfo object, we can gain an insight into the essence of the metadata. The fv variable holds the value of one member of the enum, whereas the flags variable stores all the bits that are on for this specific method.

 

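Thus, we merely need to use the bitwise AND operator on the two variables. If the outcome of this operation equals fv, it is indicative of the fact that every bit of that member is set on. In such a case, using the FieldInfo object, the English-like name of the member is displayed. Thus, if the enum has 20 members, the 'for' loop runs 20 times. This simple test explains two quirks of the output: members with a value of 0, such as PrivateScope and ReuseSlot, always pass it, and since Public has a value of 6, FamANDAssem (2) and Family (4) pass it along with Public.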
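
The access level of a method is not a set of independent bits. It is a number from 0 to 6 stored in the lowest three bits, which the MethodAttributes enum exposes as MemberAccessMask. The sketch below is an alternative of our own to the loop above and is not the approach of a.cs: it isolates the access level with the mask first and then tests only a handful of true single-bit flags. The class name MethodFlags, the function Describe and the choice of flags covered are our own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class MethodFlags
{
    // Names of the access levels, indexed by (flags & MemberAccessMask).
    static readonly string[] Access =
        { "PrivateScope", "Private", "FamANDAssem", "Assembly", "Family", "FamORAssem", "Public" };

    // A selection of flags that occupy exactly one bit each.
    static readonly MethodAttributes[] Bits =
    {
        MethodAttributes.Static, MethodAttributes.Final, MethodAttributes.Virtual,
        MethodAttributes.HideBySig, MethodAttributes.Abstract,
        MethodAttributes.SpecialName, MethodAttributes.RTSpecialName
    };

    static readonly string[] BitNames =
        { "Static", "Final", "Virtual", "HideBySig", "Abstract", "SpecialName", "RTSpecialName" };

    public static string Describe(short raw)
    {
        int flags = (ushort)raw;
        List<string> names = new List<string>();

        // The access level is a small number, so it is looked up, not bit-tested.
        int access = flags & (int)MethodAttributes.MemberAccessMask;
        names.Add(access < Access.Length ? Access[access] : "Access" + access);

        for (int i = 0; i < Bits.Length; i++)
        {
            if ((flags & (int)Bits[i]) != 0)
                names.Add(BitNames[i]);
        }
        return string.Join(" ", names.ToArray());
    }
}
```

For the value 0x96 of Main, Describe returns "Public Static HideBySig", and for the value 0x1886 of .ctor it returns "Public HideBySig SpecialName RTSpecialName".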

 

The next program applies the same Reflection technique to both flags fields and displays only the names.

 

a.cs

public void xyz()

{

bool b = tablepresent(6);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[6] ; k++)

{

int rva = BitConverter.ToInt32 (metadata, offs);

offs += 4;

short  impflags = BitConverter.ToInt16 (metadata, offs);

offs += 2;

short flags = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int signature = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int param = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row {0}",k );

Console.WriteLine("Name     : {0}", GetString(name));

Type t = typeof( System.Reflection.MethodAttributes );

FieldInfo[] f = t.GetFields(BindingFlags.Public | BindingFlags.Static);

Console.Write("Flags ");

for ( int i = 0; i < f.Length; i++ )

{

int fv = (int)f[i].GetValue(null);

if ( (fv & flags) == fv)

Console.Write( "{0} " , f[i].Name );

}

Console.WriteLine();

t = typeof( System.Reflection.MethodImplAttributes );

f = t.GetFields(BindingFlags.Public | BindingFlags.Static);

Console.Write("Impl Flags ");

for ( int i = 0; i < f.Length; i++ )

{

int fv = (int)f[i].GetValue(null);

if ( (fv & impflags) == fv)

Console.Write( "{0} " , f[i].Name );

}

Console.WriteLine();

}

}

}

 

Output

Row 1

Name     : Main

Flags PrivateScope FamANDAssem Family Public Static HideBySig ReuseSlot

Impl Flags IL Managed

Row 2

Name     : .ctor

Flags PrivateScope FamANDAssem Family Public HideBySig ReuseSlot SpecialName RTSpecialName

Impl Flags IL Managed

 

For the second loop, the MethodAttributes enum is replaced by the MethodImplAttributes enum, since the objective here is to check the bits set from this enum. The functions in the Reflection API are exploited to generate the output for the MethodImplAttributes.

 

The last field is an index into the Param table. Before we proceed any further with our explanation of the Param table, we urge you to modify the b.cs file to contain the following:

 

 

b.cs

public class zzz

{

public static void Main ()

{

System.Console.WriteLine("hell");

}

public int abc(float k)

{

return 0;

}

public long pqr( int i , char j)

{

return 0;

}

public void xyz()

{

}

}

 

a.cs

public void xyz() {

bool b = tablepresent(6);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[6] ; k++)

{

offs += 8;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int signature = BitConverter.ToInt16 (metadata, offs);

offs += 4;

Console.WriteLine("Row {0}",k );

Console.WriteLine("Name :{0}", GetString(name));

byte count = blob[signature];

Console.WriteLine("Blob:{0} Count:{1} ", signature , count);

for ( int l = 1 ; l <= count ; l++)

{

Console.Write("{0} " , blob[signature+l].ToString("X"));

}

Console.WriteLine();

}

}

}

 

Output

Row 1

Name :Main

Blob:10 Count:3

0 0 1

Row 2

Name :abc

Blob:14 Count:4

20 1 8 C

Row 3

Name :pqr

Blob:19 Count:5

20 2 A 8 3

Row 4

Name :xyz

Blob:25 Count:3

20 0 1

Row 5

Name :.ctor

Blob:25 Count:3

20 0 1

 

The above example examines the method signature, which is stored in the Blob heap. The Blob heap has not been addressed so far.

 

A signature divulges all the information related to a function, including details such as the calling convention; the values being pushed on the stack; whether the 'this' pointer is passed to the function or not; and above all, the parameters and the return value.

 

Since this program scrutinizes method signatures, we have augmented our program b.cs with three more functions, viz. abc, pqr and xyz. For the first function, Main, the signature is stored at offset 10 in the Blob heap.

 

 

A function signature in the Blob heap commences with a count of the number of bytes that follow, i.e. 3 for Main, followed by the actual signature. The function Main uses 3 bytes to store its signature, whereas the function abc takes up 4 bytes. If we cease our explorations here, the truth behind the storage of the signature shall remain a mystery.

 

The actual truth is that the metadata world is eager to save every byte that it can. Therefore, the counts and lengths in the Blob heap are stored in a compressed form. However, to achieve this, a specific pattern has to be followed.

 

If the first bit from the left (i.e. bit 7, the high bit) is 0, the remaining 7 bits store the value as it is. Thus, numbers from 0 to 127 occupy a single byte.

 

 

If the first bit from the left hand side is 1 and the second bit is 0, i.e. bits 15 and 14 of a two-byte value, the remaining 14 bits store the value. This endows it with a range from 0x80 to 0x3FFF, i.e. 2^7 to 2^14 - 1.

 

 

Finally, if the first bit is 1, the second is also 1 and the third is 0, i.e. bits 31, 30 and 29 of a four-byte value, the remaining 29 bits are used to store the value. This covers the range from 0x4000 to 0x1FFFFFFF.

 

 

Fortunately, we do not need to plague ourselves about compression at this stage, since the count byte in our program does not exceed 127. Thus, the high bit is always zero and the byte can be read as it is. However, we may subsequently reach a stage where we would be compelled to decompress the bytes first and then read them. When a compressed value occupies two or four bytes, they are stored in big endian format, i.e. the most significant byte comes first. The default on Intel machines is the little endian format, wherein the least significant byte is stored first.
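
The three patterns described above can be sketched as a small decoder. This is an illustration under the stated bit layout, not code from a.cs, and the names BlobLength and Decode are hypothetical.

```csharp
using System;

public static class BlobLength
{
    // Decodes the compressed unsigned integer that starts at data[pos].
    // 'size' receives the number of bytes that the encoded value occupied.
    public static int Decode(byte[] data, int pos, out int size)
    {
        byte first = data[pos];
        if ((first & 0x80) == 0)        // 0xxxxxxx: one byte, 7 bits of value
        {
            size = 1;
            return first;
        }
        if ((first & 0xC0) == 0x80)     // 10xxxxxx: two bytes, 14 bits of value
        {
            size = 2;
            return ((first & 0x3F) << 8) | data[pos + 1];
        }
        if ((first & 0xE0) == 0xC0)     // 110xxxxx: four bytes, 29 bits of value
        {
            size = 4;
            return ((first & 0x1F) << 24) | (data[pos + 1] << 16)
                 | (data[pos + 2] << 8) | data[pos + 3];
        }
        throw new ArgumentException("Not a valid compressed integer");
    }

    public static void Main()
    {
        int size;
        Console.WriteLine(Decode(new byte[] { 0x03 }, 0, out size));                   // 3
        Console.WriteLine(Decode(new byte[] { 0x80, 0x80 }, 0, out size));             // 128
        Console.WriteLine(Decode(new byte[] { 0xBF, 0xFF }, 0, out size));             // 16383
        Console.WriteLine(Decode(new byte[] { 0xC0, 0x00, 0x40, 0x00 }, 0, out size)); // 16384
    }
}
```

The size that is handed back tells the caller how many bytes to skip before the actual data begins. For the signatures in this chapter it is always 1.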

 

After establishing the position in the Blob stream, the count byte is stored in the variable count and its value is displayed. Then, using a 'for' loop, the bytes that follow, up to the count, are displayed from the Blob heap.

 

Row 1

Name :Main

Blob:10 Count:3

0 0 1

Row 2

Name :abc

Blob:14 Count:4

20 1 8 C

Row 4

Name :xyz

Blob:25 Count:3

20 0 1

 

Let us start by probing the simplest function named xyz.

 

The first byte imparts two pieces of information: whether the 'this' pointer is passed, and the calling convention. Count the bits from 0 up to bit 5, which has the value 0x20. If this bit is on, it signifies that the 'this' pointer has been passed to the function.

 

 

Thus, xyz is an instance method, since its 5th bit is on. If you make the function xyz static, you will notice that the value of 0x20 changes to 0x00. The value 0x0 in the remaining bits signifies the calling convention of DEFAULT.

 

 

The function Main is not passed the 'this' pointer. This fact can easily be verified, since the 5th bit of its first byte is off.

 

 

The 2nd byte gives a count of the number of parameters that have been passed.

 

The functions Main, .ctor and xyz have no parameters, the method abc has 1 parameter passed to it and the method pqr has 2 parameters passed to it. The next byte contains the return type.

 

Each of the functions Main, .ctor and xyz has a value of 1 here, which represents the element type void. The value 8 is an int, thus indicating that the method abc returns an int. The pqr function returns a long, which is why the value assigned to it is 0xA.

 

The documentation makes no mention of int or long. Instead, it specifies I4 and I8, which represent the actual number of bytes. Each type is allocated a distinct number, which is documented in the ECMA standards in section 22.1.15.

 

 

This byte is followed by the information about the actual parameters. As three of the functions have no parameters, no more bytes are present. Thus, the minimum size of the signature is 3 bytes.

 

The function abc takes a single float as a parameter, and thus, its signature size is 4 bytes. The 4th byte contains the type of the parameter passed to the function, here 0xC, which stands for R4 or float. Since the function pqr has two parameters, its size is five bytes, where the last two bytes reveal the types of the parameters supplied to the function: 8 for the int and 3 for the char.
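
The byte-by-byte reading described above can be sketched as follows. This is a minimal illustration that assumes simple signatures only, where the parameter count fits in one byte and every type is a single element type byte. The names MethodSig, TypeName and Describe are hypothetical and are not part of a.cs.

```csharp
using System;
using System.Text;

public static class MethodSig
{
    // Names for the few element type codes that appear in this chapter.
    public static string TypeName(byte code)
    {
        switch (code)
        {
            case 0x01: return "void";
            case 0x02: return "bool";
            case 0x03: return "char";
            case 0x08: return "int";
            case 0x0A: return "long";
            case 0x0C: return "float";
            case 0x0E: return "string";
            default:   return "0x" + code.ToString("X");
        }
    }

    // sig holds the signature bytes that follow the count byte.
    public static string Describe(byte[] sig)
    {
        bool hasThis = (sig[0] & 0x20) != 0;   // bit 5: 'this' pointer is passed
        int paramCount = sig[1];               // assumed to fit in one byte
        StringBuilder sb = new StringBuilder();
        sb.Append(hasThis ? "instance " : "static ");
        sb.Append(TypeName(sig[2]));           // return type
        sb.Append(" (");
        for (int i = 0; i < paramCount; i++)
        {
            if (i > 0) sb.Append(", ");
            sb.Append(TypeName(sig[3 + i]));   // one byte per simple parameter
        }
        sb.Append(")");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new byte[] { 0x00, 0x00, 0x01 }));             // static void ()
        Console.WriteLine(Describe(new byte[] { 0x20, 0x01, 0x08, 0x0C }));       // instance int (float)
        Console.WriteLine(Describe(new byte[] { 0x20, 0x02, 0x0A, 0x08, 0x03 })); // instance long (int, char)
    }
}
```

The three calls in Main use the bytes displayed for Main, abc and pqr in the output above.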

 

We shall delve deeper into the concept of method signatures very shortly.

 

a.cs

public void xyz()

{

bool b = tablepresent(6);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[6] ; k++)

{

int rva = BitConverter.ToInt32 (metadata, offs);

offs += 4;

MethodImplAttributes impflags = (MethodImplAttributes) BitConverter.ToInt16 (metadata, offs);

offs += 2;

int flags = (int)BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int signature = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int param = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row {0}",k );

Console.WriteLine("RVA      :{0}", rva.ToString("X"));

Console.WriteLine("Name     : {0}", GetString(name));

Console.WriteLine("ImpFlags :{0}",impflags );

Console.WriteLine("Flags    :{0}",flags.ToString("X"));

Type t = typeof( System.Reflection.MethodAttributes );

FieldInfo[] f = t.GetFields(BindingFlags.Public | BindingFlags.Static);

for ( int i = 0; i < f.Length; i++ )

{

int fv = (int)f[i].GetValue(null);

if ( (fv & flags) == fv)

Console.Write( "{0} " , f[i].Name );

}

Console.WriteLine();

Console.WriteLine("Signature: #Blob[{0}]",signature);

byte count = blob[signature];

Console.Write("Blob:{0} Count:{1} Bytes ", signature , count);

for ( int l = 1 ; l <= count ; l++)

{

Console.Write("{0} " , blob[signature+l].ToString("X"));

}

Console.WriteLine();

Console.WriteLine("ParamList: Param[{0}]",param);

Console.WriteLine();

}

}

}

 

Output

Position of Blob 1240

tableoffset 64

Row 1

RVA      :2050

Name     : Main

ImpFlags :Managed

Flags    :96

PrivateScope FamANDAssem Family Public Static HideBySig ReuseSlot

Signature: #Blob[10]

Blob:10 Count:3 Bytes 0 0 1

ParamList: Param[1]

 

Row 2

RVA      :2068

Name     : abc

ImpFlags :Managed

Flags    :86

PrivateScope FamANDAssem Family Public HideBySig ReuseSlot

Signature: #Blob[14]

Blob:14 Count:4 Bytes 20 1 8 C

ParamList: Param[1]

 

Row 3

RVA      :207C

Name     : pqr

ImpFlags :Managed

Flags    :86

PrivateScope FamANDAssem Family Public HideBySig ReuseSlot

Signature: #Blob[19]

Blob:19 Count:5 Bytes 20 2 A 8 3

ParamList: Param[2]

 

Row 4

RVA      :2090

Name     : xyz

ImpFlags :Managed

Flags    :86

PrivateScope FamANDAssem Family Public HideBySig ReuseSlot

Signature: #Blob[25]

Blob:25 Count:3 Bytes 20 0 1

ParamList: Param[4]

 

Row 5

RVA      :20A0

Name     : .ctor

ImpFlags :Managed

Flags    :1886

PrivateScope FamANDAssem Family Public HideBySig ReuseSlot SpecialName RTSpecialName

Signature: #Blob[25]

Blob:25 Count:3 Bytes 20 0 1

ParamList: Param[4]

 

The above example encompasses all the programs that we have dealt with so far. So, we will not squander away any more time explaining it. Instead, let us progress on to the next program, which displays the MemberRef table.

 

The file b.cs has been restored to its earlier form.

 

MemberRef Table

 

b.cs

public class zzz

{

public static void Main ()

{

System.Console.WriteLine("hello");

}

}

 

a.cs

public void xyz()

{

bool b = tablepresent(10);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[10] ; k++)

{

int clas = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int sig = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("Row {0}",k);

Console.Write("Class:");

int tag = clas & 0x07;

int rid = (int) ((uint) clas >> 3);

if ( tag == 0)

Console.Write("TypeDef");

if ( tag == 1)

Console.Write("TypeRef");

if ( tag == 2)

Console.Write("ModuleRef");

if ( tag == 3)

Console.Write("MethodDef");

if ( tag == 4)

Console.Write("TypeSpec");

Console.WriteLine("[{0}]",rid);

Console.WriteLine("Name:{0}" , GetString(name));

int count = blob[sig];

Console.Write("Signature #BLOB[{0}] Count {1} ", sig , count.ToString("X"));

for ( int l = 1 ; l <= count ; l++)

{

Console.Write("{0} " , blob[sig+l].ToString("X"));

}

Console.WriteLine();

}

}

}

 

Output

Row 1

Class:TypeRef[2]

Name:.ctor

Signature #BLOB[18] Count 5 20 2 1 2 2

Row 2

Class:TypeRef[3]

Name:WriteLine

Signature #BLOB[24] Count 4 0 1 1 E

Row 3

Class:TypeRef[1]

Name:.ctor

Signature #BLOB[14] Count 3 20 0 1

 

 

 

TypeRef Table Output

Row[1]

AssemblyRef[1] token=0x6

Name      :Object,0x20

Namespace :System,0x19

Row[2]

AssemblyRef[1] token=0x6

Name      : DebuggableAttribute,0x49

Namespace : System.Diagnostics,0x36

Row[3]

AssemblyRef[1] token=0x6

Name      :Console,0x5F

Namespace :System,0x19

 

 

The MemberRef table, often called the MethodRef table, occupies bit 10 of the valid field. Each row in the table is just 6 bytes in size.

 

The second member in the table is the name of the method that is referred to in the module. The three methods comprise two constructors and the WriteLine function. The WriteLine function has been called explicitly, but as far as the constructors are concerned, we have not created even a single object.

 

To establish the class and namespace that the methods belong to, the first field called the Class is inspected.

 

 

This field is an index into one of five tables, viz. TypeDef, TypeRef, ModuleRef, Method or TypeSpec. In effect, the low 3 bits of the value hold the tag of a MemberRefParent coded index, and the remaining bits hold the row number.

 

Since the value of the low three bits is 1, the table referred to is the TypeRef table. To acquire the specific row in that table, we right shift the value by 3, which discards the tag bits of the coded index.

 

 

Thus, the first row in the memberref points to the second row in the TypeRef table. In order to ensure comprehensiveness and to cross-reference the entities, we have pasted the rows contained in the TypeRef table.

 

Row[2]

AssemblyRef[1] token=0x6

Name        : DebuggableAttribute,0x49

Namespace : System.Diagnostics,0x36

 

Thus, the first row in the MemberRef table is the constructor of the class DebuggableAttribute from the System.Diagnostics namespace. We have not added this attribute. The C# compiler has added it for reasons of its own.

 

 

The second method, which is the WriteLine function, shall elucidate this concept further.

 

The tag value of 1 in the MemberRefParent coded index points to the TypeRef table, as before. The value stored in the Class field is 25, and right-shifting 25 by 3 yields 3.

 

Thus, the row index in the TypeRef table is 3. The third row in the TypeRef table represents the Console class from the System namespace.

 

Row[3]

AssemblyRef[1] token=0x6

Name      :Console,0x5F

Namespace :System,0x19

 

 

Thus, it now becomes easier to figure out the namespace-class combination to which the method belongs.
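
The decoding of the Class field can be condensed into a table lookup. The sketch below is an illustration only. The class name MemberRefParent and the method Decode are hypothetical, and the table order is the one used by the 'if' statements in a.cs.

```csharp
using System;

public static class MemberRefParent
{
    // Index in this array = tag value held in the low 3 bits.
    public static readonly string[] Tables =
        { "TypeDef", "TypeRef", "ModuleRef", "MethodDef", "TypeSpec" };

    public static string Decode(int coded)
    {
        int tag = coded & 0x07;     // low 3 bits select the table
        int row = coded >> 3;       // remaining bits hold the row number
        string table = tag < Tables.Length ? Tables[tag] : "Unknown";
        return table + "[" + row + "]";
    }

    public static void Main()
    {
        Console.WriteLine(Decode(25));   // TypeRef[3]
        Console.WriteLine(Decode(17));   // TypeRef[2]
        Console.WriteLine(Decode(9));    // TypeRef[1]
    }
}
```

The value 25 is the one discussed above for WriteLine. The values 17 and 9 are what the coded index works out to for TypeRef[2] and TypeRef[1], i.e. (2 << 3) | 1 and (1 << 3) | 1.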

 

The last field named sig in the MemberRef table is the signature of the function being called. This signature is incredibly vital, since it is the only way to verify whether the parameters are being passed in the appropriate order or not.

The first byte is 4, which represents the count for the second record.

 

As the WriteLine function is a static function, the 'this' pointer is not passed. Hence, the next byte is 00. The number of parameters passed to the function is 1 and the return type is also 1, which signifies void. The last byte defines the parameter to be ELEMENT_TYPE_STRING. This will be covered in greater detail later.

 

The third row, representing a constructor, refers to the first row of the TypeRef table. This row represents the Object class from the System namespace. The signature Blob further enhances our knowledge by revealing that it is a non-static function, which has no parameters passed and has a void return value.

 

 

A point to be noted here is that every object that is created, has to call the base class constructor. It is for this reason that while creating zzz, the constructor of the base class Object is called. In due course of time, we will divulge the IL code written for this module. Furthermore, when no constructors are created manually, a free constructor that takes no parameters, is assigned to the class zzz.

 

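The constructor of DebuggableAttribute in the first row takes two parameters, both of type boolean (the value 2). Hence, its signature displays a count of 5, i.e. the three mandatory bytes plus one byte per parameter.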

 

Thus, to conclude, every row in the MemberRef acquaints us with the presence of a particular method in the code. The Class field points to the table that has the type for the member; the name field provides the name; and finally, the signature field describes the actual signature of the method call.

 

We now insert the following code in the file b.cs:

 

b.cs

public class zzz

{

public static void Main ()

{

System.Console.WriteLine("hell");

System.Console.WriteLine(10);

System.Console.WriteLine(true);

}

}

 

Output

Row 1

Class:TypeRef[2]

Name:.ctor

Signature #BLOB[18] Count 5 20 2 1 2 2

Row 2

Class:TypeRef[3]

Name:WriteLine

Signature #BLOB[24] Count 4 0 1 1 E

Row 3

Class:TypeRef[3]

Name:WriteLine

Signature #BLOB[29] Count 4 0 1 1 8

Row 4

Class:TypeRef[3]

Name:WriteLine

Signature #BLOB[34] Count 4 0 1 1 2

Row 5

Class:TypeRef[1]

Name:.ctor

Signature #BLOB[14] Count 3 20 0 1

 

The file b.cs is modified to call the WriteLine function thrice. Thus, in the memberref table, there exist three entries for WriteLine. The index to the TypeRef table remains the same, i.e. 3. The only aspect that varies is the Signature, since the datatypes of the parameters are dissimilar.

 

The parameter type is much more complex and has a greater role than merely signifying the data type. As per the element type table, the value 0xE is a string, 8 is an int and 2 is a boolean.

 

The Custom Attribute table is unveiled in the next program, where the file b.cs is modified to contain only one WriteLine function, as before.

 

Custom Attribute Table

 

a.cs

public void xyz()

{

bool b = tablepresent(12);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[12] ; k++)

{

int parent = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int type = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int value = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int tag = parent & 0x1F;

int rid = (int) ((uint) parent >> 5);

Console.Write("Parent:");

if ( tag == 0)

Console.Write("MethodDef");

if ( tag == 1)

Console.Write("FieldDef");

if ( tag == 2)

Console.Write("TypeRef");

if ( tag == 3)

Console.Write("TypeDef");

if ( tag == 4)

Console.Write("ParamDef");

if ( tag == 5)

Console.Write("InterfaceImpl");

if ( tag == 6)

Console.Write("MemberRef");

if ( tag == 7)

Console.Write("Module");

if ( tag == 8)

Console.Write("Permission");

if ( tag == 9)

Console.Write("Property");

if ( tag == 10)

Console.Write("Event");

if ( tag == 11)

Console.Write("Signature");

if ( tag == 12)

Console.Write("ModuleRef");

if ( tag == 13)

Console.Write("TypeSpec");

if ( tag == 14)

Console.Write("Assembly");

if ( tag == 15)

Console.Write("AssemblyRef");

if ( tag == 16)

Console.Write("File");

if ( tag == 17)

Console.Write("ExportedType");

if ( tag == 18)

Console.Write("ManifestResource");

Console.WriteLine("[{0}]",rid);

tag = type & 0x07;

rid = (int) ((uint) type >> 3);

Console.Write("Type:");

if ( tag == 0)

Console.Write("TypeRef");

if ( tag == 1)

Console.Write("TypeDef");

if ( tag == 2)

Console.Write("MethodDef");

if ( tag == 3)

Console.Write("MemberRef");

if ( tag == 4)

Console.Write("String");

Console.WriteLine("[{0}]",rid);

int count = blob[value];

Console.WriteLine("Value Blob[{0}] Count {1}",value , count );

for ( int l = 1 ; l <= count ; l++)

{

Console.Write("{0} " , blob[value+l].ToString("X"));

}

}

}

}

 

Output

Parent:Assembly[1]

Type:MemberRef[1]

Value Blob[29] Count 6

1 0 0 1 0 0

 

MemberRef Table

Row 1

Class:TypeRef[2]

Name:.ctor

Signature #BLOB[18] Count 5 20 2 1 2 2

 

TypeRef Table

Row[2]

AssemblyRef[1] token=0x6

Name      : DebuggableAttribute,0x49

Namespace : System.Diagnostics,0x36

 

The 12th position in the valid field is assigned to the Custom Attribute table, which deals with Attributes. You may recall that a little while ago, we had disclosed to you that one attribute is added by the C# compiler.

 

The Custom Attribute table has the following columns:

 

 

The first field is called parent, which has a HasCustomAttribute coded index.

 

 

 

This index uses the low five bits to encode the table. The probable values can range from 0 to 18 and have been assigned the following significance:

 

 

Here, the Assembly table is the Parent. The table type is retrieved by executing a bitwise AND operation on the low five bits. Then, to ascertain the row in that table, the value is right shifted by 5 bits, which leaves the remaining 11 bits of the two-byte field. It can be an index into any metadata table, except for the Custom Attribute table.

 

 

The result is 1, signifying that it is the first row in the Assembly table. The second field is called the type. It is a CustomAttributeType coded index into any of the five probable tables.

 

The low 3 bits supply a value of 3, thereby indicating the MemberRef table. The row in the table is 1, after the value is right shifted by 3. Thus, the attribute refers to the first row in the MemberRef table, i.e. the .ctor of the class DebuggableAttribute, belonging to the System.Diagnostics namespace.
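
The reason why one coded index needs 5 bits for its tag, while another needs only 3, is simply the number of tables that it has to choose from. The sketch below is an illustration under that assumption. The names CodedIndex, TagBits and Decode are hypothetical, and the table names follow those printed by a.cs.

```csharp
using System;

public static class CodedIndex
{
    // Index in this array = tag value of the HasCustomAttribute coded index.
    public static readonly string[] HasCustomAttribute =
    {
        "MethodDef", "FieldDef", "TypeRef", "TypeDef", "ParamDef",
        "InterfaceImpl", "MemberRef", "Module", "Permission", "Property",
        "Event", "Signature", "ModuleRef", "TypeSpec", "Assembly",
        "AssemblyRef", "File", "ExportedType", "ManifestResource"
    };

    // Smallest number of bits that can number 'tableCount' tables.
    public static int TagBits(int tableCount)
    {
        int bits = 0;
        while ((1 << bits) < tableCount) bits++;
        return bits;
    }

    public static string Decode(int coded, string[] tables)
    {
        int bits = TagBits(tables.Length);
        int tag = coded & ((1 << bits) - 1);   // low bits select the table
        int row = coded >> bits;               // remaining bits hold the row
        string name = tag < tables.Length ? tables[tag] : "Unknown";
        return name + "[" + row + "]";
    }

    public static void Main()
    {
        Console.WriteLine(TagBits(HasCustomAttribute.Length));   // 5
        Console.WriteLine(TagBits(5));                           // 3
        Console.WriteLine(Decode(46, HasCustomAttribute));       // Assembly[1]
    }
}
```

The value 46 is what Assembly[1] works out to, i.e. (1 << 5) | 14. A lookup array of this kind also removes the risk of testing the same tag value twice in a long chain of 'if' statements.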

 

The last field is a two-byte index into the Blob heap. The CustomAttribute table contains data in the Blob heap. It is used to instantiate an object, which is an instance of the Custom Attribute, at run time.

There is extensive cross-referencing between tables. Hence, we decided to illustrate the individual tables first. The type field is the index to the constructor of the Custom Attribute. There is no rule that stipulates that the presence of a custom attribute is mandatory.

 

The documentation states, in no uncertain terms, that the type must index a valid row in the Method or the MethodRef table. Nevertheless, the .NET code clearly demonstrates that it could be any one of the 5 tables that find a mention in our code. The last column value could be null.

 

 

As always, the entry in the Blob heap begins with the count of the number of bytes. Our custom attribute has 6 bytes.

 

Section 22.3 defines the syntax of the Blob for a custom attribute. It commences with a prolog, which is a short with the value 0x0001. If you are under the notion that the bytes 01 00 look reversed, you couldn't be more right, since fixed-size values in the Blob are stored in little endian format, with the least significant byte first. Only the compressed counts are stored in big endian format.

 

Let us consider a value, such as 258.

 

 

The hex equivalent of this decimal number is 0x0102. In little endian format, its bytes are stored as 02 01, whereas in big endian format, they are stored as 01 02. The two formats merely reverse the order of the bytes.
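
The byte order for the value 258 can be verified with a few lines of code. This is a minimal sketch, in which the class Endian and its two methods are hypothetical helpers and not part of a.cs.

```csharp
using System;

public static class Endian
{
    // Least significant byte first.
    public static byte[] LittleEndian(ushort v)
    {
        return new byte[] { (byte)(v & 0xFF), (byte)(v >> 8) };
    }

    // Most significant byte first.
    public static byte[] BigEndian(ushort v)
    {
        return new byte[] { (byte)(v >> 8), (byte)(v & 0xFF) };
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(LittleEndian(258)));     // 02-01
        Console.WriteLine(BitConverter.ToString(BigEndian(258)));        // 01-02
        Console.WriteLine(BitConverter.ToString(LittleEndian(0x0001)));  // 01-00
    }
}
```

The last line reproduces the first two bytes, 1 0, that our program displayed for the custom attribute.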

 

Assembly Table

 

a.cs

using System.Configuration.Assemblies;

...

...

public void xyz()

{

bool b = tablepresent(32);

int offs = tableoffset;

if ( b )

{

for ( int k = 1 ; k <= rows[32] ; k++)

{

AssemblyHashAlgorithm HashAlgId = (AssemblyHashAlgorithm)BitConverter.ToInt32 (metadata, offs);

offs += 4;

int major = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int minor = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int build= BitConverter.ToInt16 (metadata, offs);

offs += 2;

int revision = BitConverter.ToInt16 (metadata, offs);

offs += 2;

AssemblyFlags flags = (AssemblyFlags)BitConverter.ToInt32 (metadata, offs);

offs += 4;

int publickey = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int culture = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("HashAlgId {0}",HashAlgId );

Console.WriteLine("MajorVersion {0}",major );

Console.WriteLine("MinorVersion {0}",minor);

Console.WriteLine("BuildNumber {0}",build);

Console.WriteLine("RevisionNumber {0}",revision);

Console.WriteLine("Flags {0}",flags.ToString());

Console.WriteLine("Public Key #BLOB[{0}] {1}",publickey , blob[publickey] );

Console.WriteLine("Name:{0}",GetString(name));

Console.WriteLine("Culture:{0}",GetString(culture));

}

}

}

public enum AssemblyFlags

{

PublicKey = 0x0001,

SideBySideCompatible = 0x0000,

NonSideBySideAppDomain = 0x0010,

NonSideBySideProcess = 0x0020,

NonSideBySideMachine = 0x0030,

EnableJITcompileTracking = 0x8000,

DisableJITcompileOptimizer = 0x4000,

}

 

Output

HashAlgId SHA1

MajorVersion 0

MinorVersion 0

BuildNumber 0

RevisionNumber 0

Flags SideBySideCompatible

Public Key #BLOB[0] 0

Name:b

Culture:

 

 

The Assembly table occupies bit 32 of the valid field. Its task is to store the details of an assembly. An assembly, in turn, comprises one or more modules, each of which represents a dll or an exe file. The Assembly table can contain either zero rows or one row.

 

 

The first field is a four byte constant of type enum AssemblyHashAlgorithm that originates from the namespace System.Configuration.Assemblies. Therefore, we have added this namespace to our program. The technique of hashing has many applications including cryptography. This hash value can assume only one of the three probable values.

 

 

Close on the heels of the four bytes of AssemblyHashAlgorithm, there exist 4 sets of two byte constants, which denote the Major Version, Minor Version, Build Number and Revision Number, respectively. The value for each of them happens to be zero.

 

This is followed by a 4-byte flag mask that has been represented using an enum. In our enum, every member is assigned an explicit value. Therefore, while displaying the field, the name of the matching member is displayed instead of the number. This approach spares us the process of bitwise ANDing with different values. However, it succeeds only if the value matches exactly one member of the enum. If the value is a combination of several members, then it is the number, and not the member names, which shall be displayed, because the enum has not been marked with the Flags attribute.
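
The behaviour of ToString on an enum can be observed in isolation. The sketch below uses two hypothetical enums, PlainFlags and MarkedFlags, that are cut-down copies of AssemblyFlags. They differ only in the Flags attribute.

```csharp
using System;

// No Flags attribute: only exact matches are displayed by name.
public enum PlainFlags { PublicKey = 0x0001, NonSideBySideAppDomain = 0x0010 }

// With the Flags attribute: combinations are displayed as a list of names.
[Flags]
public enum MarkedFlags { PublicKey = 0x0001, NonSideBySideAppDomain = 0x0010 }

public static class EnumDemo
{
    public static void Main()
    {
        Console.WriteLine(((PlainFlags)0x0001).ToString());    // PublicKey
        Console.WriteLine(((PlainFlags)0x0011).ToString());    // 17
        Console.WriteLine(((MarkedFlags)0x0011).ToString());   // PublicKey, NonSideBySideAppDomain
    }
}
```

Marking the enum with the Flags attribute is one way to obtain the names for a combined value without resorting to bitwise ANDing by hand.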

 

If the PublicKey flag is set, it denotes that the assembly reference holds the complete unhashed public key. The Side by Side Compatible value is exactly what the name indicates. The last two values are reserved. The PublicKey field is an index into the Blob heap. Since it has an index value of zero, it refers to no public key at all.

This also shows that the PublicKey index is permitted to be zero, i.e. an assembly need not carry a public key.

 

The name of the assembly without the file extension, comes next in sequence. It is followed by the culture field, which is currently a null string.

 

AssemblyRef Table

 

a.cs

public void xyz() {

bool b = tablepresent(35);

int offs = tableoffset;

if ( b ) {

for ( int k = 1 ; k <= rows[35]; k++)  {

int major = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int minor = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int build= BitConverter.ToInt16 (metadata, offs);

offs += 2;

int revision = BitConverter.ToInt16 (metadata, offs);

offs += 2;

AssemblyFlags flags = (AssemblyFlags)BitConverter.ToInt32 (metadata, offs);

offs += 4;

int publickey = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int name = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int culture = BitConverter.ToInt16 (metadata, offs);

offs += 2;

int hashvalue = BitConverter.ToInt16 (metadata, offs);

offs += 2;

Console.WriteLine("MajorVersion {0}",major );

Console.WriteLine("MinorVersion {0}",minor);

Console.WriteLine("BuildNumber {0}",build);

Console.WriteLine("RevisionNumber {0}",revision);

Console.WriteLine("Flags {0}",flags.ToString());

int count = blob[publickey ];

Console.WriteLine("Public Key or Token #BLOB[{0}] {1}",publickey , count);

for ( int l = 1 ; l <= count ; l++)  {

Console.Write("{0} " , blob[publickey+l].ToString("X"));

}

Console.WriteLine();

Console.WriteLine("Name:{0}",GetString(name));

Console.WriteLine("Culture:{0}",GetString(culture));

Console.WriteLine("Hash Value #BLOB[{0}]",hashvalue);

}

}

}

 

Output

MajorVersion 1

MinorVersion 0

BuildNumber 3300

RevisionNumber 0

Flags SideBySideCompatible

Public Key or Token #BLOB[1] 8

B7 7A 5C 56 19 34 E0 89

Name:mscorlib

Culture:

Hash Value #BLOB[0]

 

 

The AssemblyRef table stores all the assemblies that are referenced in the file.

 

The code for the System namespace is located in a file called mscorlib.dll. Since only one assembly is referenced in the program, there exists only one row in the AssemblyRef table. The AssemblyRef table occupies bit 35 of the valid field.

 

The Major Version, Minor Version, Build Number, Revision Number, Flags, Public Key and Culture are all obtained from the Assembly table present in mscorlib.dll. The compiler reads the metadata of each assembly that is referenced, in order to figure out the namespaces and classes that they contain.

 

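Our exe file has no public key, but the reference to mscorlib carries 8 bytes, which are displayed. Since the PublicKey flag is not set in this row, these 8 bytes are the public key token, and not the complete public key.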

 

We will conclude this chapter here, since we have scrutinized the eight metadata tables present in the smallest possible exe file. The next chapter shall investigate the remaining tables that have not been deliberated upon so far.