.NET File Format - Signatures Under the Hood, Part 1 of 2

29 Sep 2009, CPOL, 28 min read
A full description of signatures, which are part of the .NET file format

Contents

  1. Introduction
  2. What are Signatures?
  3. Getting Started
    1. CFF Explorer
    2. Endianness
    3. Compressed Integer
    4. Constants
  4. Signatures
    1. FieldSig
    2. PropertySig
    3. MethodDefSig
    4. MethodRefSig
    5. StandAloneMethodSig
  5. Next Part

1. Introduction

Some time ago, I had to write my own reflection engine for .NET because I needed it to build a .NET documentation generator. After a few weeks, I discovered Mono.Cecil, which (after some modifications) is now ideal for me, so I abandoned my own reflection engine. I thought this was a great opportunity to write an article and thank all CodeProject community members for sharing the hard-earned knowledge they have gathered in recent years.

This two-part article covers signatures, the second most important part of a .NET file after metadata, about which Daniel Pistelli wrote an excellent article that can be found here. I strongly recommend reading that article before progressing. Additionally, you can read An In-Depth Look into the Win32 Portable Executable File Format in MSDN Magazine, which describes the PE file format that forms the foundation for .NET metadata, signatures and Intermediate Language (IL) code. Of course, almost everything can be found in the Partition II Metadata specification, but as usual, specifications sacrifice readability for completeness, which is another reason why I wrote this article.

2. What are Signatures?

In brief, signatures store data that cannot be compactly stored in metadata tables, for instance parameter types, arguments supplied to custom attributes, marshalling descriptors, etc. Storing information such as parameter types in tables would fragment the data excessively, make it hard to read and impose a performance penalty. Hence, the CLI/CLR engineers invented signatures, which allow this sort of data to be stored in a compact and orderly manner. In the next chapters, you will clearly see why they are so important.

3. Getting Started

In this chapter, you will learn a few things needed to understand the rest of the article, so do not skip it: the information contained here is used extensively in the following sections. Terms associated with signatures, but not covered here, will be explained later, along the way.

3.1 CFF Explorer

For viewing .NET metadata and signature bytes, we will be using CFF Explorer, also written by Daniel Pistelli. CFF Explorer is a freeware tool that can view and edit PE headers, resources and some fields and flags of .NET metadata; you can download it at this site. In the picture below, you can see CFF Explorer running with a sample assembly loaded. Usually, signatures are indexed by a Signature column. The red circle marks the location of a MethodDefSig signature, indexed by Method.Signature, in the #Blob heap, which can be explored by clicking on the item in the green circle.

Picture 1. CFF Explorer in action

3.2 Endianness

I think the best description is given by Wikipedia:

"In computing, endianness is the byte (and sometimes bit) ordering used to represent some kind of data. Typical cases are the order in which integer values are stored as bytes in computer memory (relative to a given memory addressing scheme) and the transmission order over a network or other medium. When specifically talking about bytes, endianness is also referred to simply as byte order."

In our case, we consider endianness to be the byte order of data (usually integers) stored in a file. There are two methods (orders) of representing data in a file, big-endian and little-endian. A PE/.NET file uses both methods, so we discuss each of them below.

Big Endian

In this ordering method, the most significant byte is stored at the file location with the lowest offset, the next byte is stored at the following file offset, and so on. In the example below, we want to store the value 0x1B5680DA at offset 100, so the file would look like this:

Offset   100  101  102  103
Value    1B   56   80   DA

Little Endian

Compared to big endian, little endian stores data in reverse order, i.e., the least significant byte is stored at the lowest offset. In this case, our value 0x1B5680DA stored in little endian would look like this:

Offset   100  101  102  103
Value    DA   80   56   1B
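
To make the two byte orders concrete, here is a minimal C# sketch that reads the example value back from both layouts. The class and method names (EndianDemo, ReadUInt32BigEndian, ReadUInt32LittleEndian) are my own, chosen only for this illustration, and the offsets are relative to a small in-memory array rather than a real file.

```csharp
using System;

static class EndianDemo
{
    // Big endian: the byte at the lowest offset is the most significant one.
    public static uint ReadUInt32BigEndian(byte[] data, int offset)
    {
        return ((uint)data[offset] << 24)
             | ((uint)data[offset + 1] << 16)
             | ((uint)data[offset + 2] << 8)
             | data[offset + 3];
    }

    // Little endian: the byte at the lowest offset is the least significant one.
    public static uint ReadUInt32LittleEndian(byte[] data, int offset)
    {
        return data[offset]
             | ((uint)data[offset + 1] << 8)
             | ((uint)data[offset + 2] << 16)
             | ((uint)data[offset + 3] << 24);
    }

    static void Main()
    {
        byte[] big = { 0x1B, 0x56, 0x80, 0xDA };     // layout from the Big Endian table
        byte[] little = { 0xDA, 0x80, 0x56, 0x1B };  // layout from the Little Endian table

        Console.WriteLine(ReadUInt32BigEndian(big, 0).ToString("X8"));       // 1B5680DA
        Console.WriteLine(ReadUInt32LittleEndian(little, 0).ToString("X8")); // 1B5680DA
    }
}
```

Both calls recover the same value, which shows that the two layouts differ only in the order of the bytes, not in the data they hold.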

3.3 Compressed Integer

Signatures are compressed before being stored in the #Blob heap by compressing the integers embedded in them. In contrast to normal integers, which have a fixed size, compressed integers use only as much space as needed, and almost all signatures use integer compression instead of normal, fixed-size integers. Because the vast majority of numbers in signatures lie below 128, the space saving is significant. Below, you can see the encoding algorithm copied from the specification:

If the value lies between 0 (0x00) and 127 (0x7F), inclusive, encode as a one-byte integer (bit 7 is clear, value held in bits 6 through 0).
If the value lies between 2^7 (0x80) and 2^14 - 1 (0x3FFF), inclusive, encode as a 2-byte integer with bit 15 set, bit 14 clear (value held in bits 13 through 0).
Otherwise, encode as a 4-byte integer, with bit 31 set, bit 30 set, bit 29 clear (value held in bits 28 through 0).
A null string should be represented with the reserved single byte 0xFF, and no following data.

Example 1

The value is less than 0x80, so the first case applies and we cut three unnecessary bytes.

         Original value (32-bit)               Compressed value   Saved bytes
Hex      00 00 00 03                           03                 3
Binary   00000000 00000000 00000000 00000011   00000011           -

Example 2

The same as Example 1.

         Original value (32-bit)               Compressed value   Saved bytes
Hex      00 00 00 7F                           7F                 3
Binary   00000000 00000000 00000000 01111111   01111111           -

Example 3

In this example, the original value is equal to 0x80. Although one byte is enough to hold 0x80, the one-byte compressed form requires bit 7 (the most significant bit) to be clear, so to store 0x80 as a compressed integer we need an additional byte.

         Original value (32-bit)               Compressed value    Saved bytes
Hex      00 00 00 80                           80 80               2
Binary   00000000 00000000 00000000 10000000   10000000 10000000   -

Example 4

We cut two unnecessary bytes:

         Original value (32-bit)               Compressed value    Saved bytes
Hex      00 00 2E 57                           AE 57               2
Binary   00000000 00000000 00101110 01010111   10101110 01010111   -

Obviously, compression comes at a cost: some bits must be reserved to indicate how many bytes the compressed integer occupies, so the maximum encodable integer is 29 bits long, with the value 0x1FFFFFFF. Compressed integers are physically encoded using big-endian byte order.
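
The algorithm above can be sketched in C# as follows. This is only a minimal illustration of the three cases, not code taken from any real metadata reader; the names CompressedInt, Encode and Decode are my own, and the special 0xFF null-string marker is deliberately not handled.

```csharp
using System;

static class CompressedInt
{
    // Encodes a value using the three cases of the algorithm quoted above.
    public static byte[] Encode(uint value)
    {
        if (value <= 0x7F)                  // 1 byte:  0xxxxxxx
            return new[] { (byte)value };
        if (value <= 0x3FFF)                // 2 bytes: 10xxxxxx xxxxxxxx
            return new[] { (byte)((value >> 8) | 0x80), (byte)(value & 0xFF) };
        if (value <= 0x1FFFFFFF)            // 4 bytes: 110xxxxx followed by 3 bytes
            return new[]
            {
                (byte)((value >> 24) | 0xC0),
                (byte)((value >> 16) & 0xFF),
                (byte)((value >> 8) & 0xFF),
                (byte)(value & 0xFF)
            };
        throw new ArgumentOutOfRangeException("value", "Does not fit in 29 bits.");
    }

    // Decodes the integer that starts at data[pos] and advances pos past it.
    public static uint Decode(byte[] data, ref int pos)
    {
        byte first = data[pos++];
        if ((first & 0x80) == 0)            // bit 7 clear: one byte
            return first;
        if ((first & 0xC0) == 0x80)         // bits 7..6 are 10: two bytes
            return (uint)(((first & 0x3F) << 8) | data[pos++]);
        if ((first & 0xE0) == 0xC0)         // bits 7..5 are 110: four bytes
        {
            uint value = (uint)(((first & 0x1F) << 24) | (data[pos] << 16)
                                | (data[pos + 1] << 8) | data[pos + 2]);
            pos += 3;
            return value;
        }
        throw new FormatException("Not a valid compressed integer.");
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Encode(0x03)));    // 03
        Console.WriteLine(BitConverter.ToString(Encode(0x7F)));    // 7F
        Console.WriteLine(BitConverter.ToString(Encode(0x80)));    // 80-80
        Console.WriteLine(BitConverter.ToString(Encode(0x2E57)));  // AE-57
    }
}
```

The four calls in Main reproduce the compressed values of Examples 1 to 4. Note that the bytes are written most significant first, in line with the big-endian remark above.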

3.4 Constants

The following list contains common constants that are used in almost all signatures. In the next parts of the article, we will refer to them very often by abbreviations, using only the last part of the name, for example ELEMENT_TYPE_I8 as I8, ELEMENT_TYPE_STRING as STRING, and so on.

Name Value Remarks
ELEMENT_TYPE_END 0x00 Marks end of a list
ELEMENT_TYPE_VOID 0x01 System.Void
ELEMENT_TYPE_BOOLEAN 0x02 System.Boolean
ELEMENT_TYPE_CHAR 0x03 System.Char
ELEMENT_TYPE_I1 0x04 System.SByte
ELEMENT_TYPE_U1 0x05 System.Byte
ELEMENT_TYPE_I2 0x06 System.Int16
ELEMENT_TYPE_U2 0x07 System.UInt16
ELEMENT_TYPE_I4 0x08 System.Int32
ELEMENT_TYPE_U4 0x09 System.UInt32
ELEMENT_TYPE_I8 0x0A System.Int64
ELEMENT_TYPE_U8 0x0B System.UInt64
ELEMENT_TYPE_R4 0x0C System.Single
ELEMENT_TYPE_R8 0x0D System.Double
ELEMENT_TYPE_STRING 0x0E System.String
ELEMENT_TYPE_PTR 0x0F Unmanaged pointer, followed by the Type element.
ELEMENT_TYPE_BYREF 0x10 Managed pointer, followed by the Type element.
ELEMENT_TYPE_VALUETYPE 0x11 A value type modifier, followed by TypeDef or TypeRef token
ELEMENT_TYPE_CLASS 0x12 A class type modifier, followed by TypeDef or TypeRef token
ELEMENT_TYPE_VAR 0x13 Generic parameter in a generic type definition, represented as number
ELEMENT_TYPE_ARRAY 0x14 A multi-dimensional array type modifier.
ELEMENT_TYPE_GENERICINST 0x15 Generic type instantiation. Followed by type type-arg-count type-1 ... type-n
ELEMENT_TYPE_TYPEDBYREF 0x16 A typed reference
ELEMENT_TYPE_I 0x18 System.IntPtr
ELEMENT_TYPE_U 0x19 System.UIntPtr
ELEMENT_TYPE_FNPTR 0x1B A pointer to a function, followed by full method signature
ELEMENT_TYPE_OBJECT 0x1C System.Object
ELEMENT_TYPE_SZARRAY 0x1D A single-dimensional, zero lower-bound array type modifier.
ELEMENT_TYPE_MVAR 0x1E Generic parameter in a generic method definition, represented as number
ELEMENT_TYPE_CMOD_REQD 0x1F Required modifier, followed by a TypeDef or TypeRef token
ELEMENT_TYPE_CMOD_OPT 0x20 Optional modifier, followed by a TypeDef or TypeRef token
ELEMENT_TYPE_INTERNAL 0x21 Implemented within the CLI
ELEMENT_TYPE_MODIFIER 0x40 ORed with following element types
ELEMENT_TYPE_SENTINEL 0x41 Sentinel for vararg method signature
ELEMENT_TYPE_PINNED 0x45 Denotes a local variable that points at a pinned object
(unnamed) 0x50 Indicates an argument of type System.Type
(unnamed) 0x51 Used in custom attributes to specify a boxed object (§23.3 in the ECMA-335 specification).
(unnamed) 0x52 Reserved
(unnamed) 0x53 Used in custom attributes to indicate a FIELD (§22.10, §23.3 in the ECMA-335 specification).
(unnamed) 0x54 Used in custom attributes to indicate a PROPERTY (§22.10, §23.3 in the ECMA-335 specification).
(unnamed) 0x55 Used in custom attributes to specify an enum (§23.3 in the ECMA-335 specification).

4. Signatures

We have almost all the preparation behind us and can now start talking about signatures, but there are still a few things worth mentioning that may not be obvious to everybody. First, almost all integers in signatures are compressed. Second, every signature begins with the size (in bytes) that it occupies on the #Blob heap; of course, this value is stored using integer compression. Last but not least, values that locate a signature on the #Blob heap are absolute, i.e., you do not have to add or subtract anything to the value (such as the one in the red circle in Picture 1) to find the signature on the heap.

Also keep in mind that when you recompile the attached source code (even without modifying it), the offsets of signatures in the resulting assembly may change.
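
Since every signature starts with its compressed size and heap offsets are absolute, fetching a signature from the #Blob heap boils down to two steps, sketched below. The names BlobHeap, ReadCompressed and ReadBlob are hypothetical, the heap is just a byte array that I fill by hand, and the compressed-integer reader assumes well-formed input.

```csharp
using System;

static class BlobHeap
{
    // Reads a compressed integer at data[pos] and advances pos (assumes valid input).
    static int ReadCompressed(byte[] data, ref int pos)
    {
        byte first = data[pos++];
        if ((first & 0x80) == 0)
            return first;
        if ((first & 0xC0) == 0x80)
            return ((first & 0x3F) << 8) | data[pos++];
        int value = ((first & 0x1F) << 24) | (data[pos] << 16)
                    | (data[pos + 1] << 8) | data[pos + 2];
        pos += 3;
        return value;
    }

    // Returns the signature bytes (without the size prefix) stored at an
    // absolute offset in the heap, e.g. the value of a Signature column.
    public static byte[] ReadBlob(byte[] heap, int offset)
    {
        int pos = offset;
        int length = ReadCompressed(heap, ref pos);   // step 1: the size
        byte[] body = new byte[length];
        Array.Copy(heap, pos, body, 0, length);       // step 2: the signature itself
        return body;
    }

    static void Main()
    {
        // A hand-made heap with a 2-byte signature (06 08) stored at offset 0x0A.
        byte[] heap = new byte[0x0D];
        heap[0x0A] = 0x02;
        heap[0x0B] = 0x06;
        heap[0x0C] = 0x08;

        Console.WriteLine(BitConverter.ToString(ReadBlob(heap, 0x0A)));  // 06-08
    }
}
```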

Because this article is meant as a guide, in this chapter we will discuss all signatures byte by byte, beginning with the simplest and ending with the most advanced. Each signature is accompanied by a description, a diagram or syntax copied from the specification, and a set of examples whose complete binaries and sources can be downloaded at the top of this article. Where possible, the examples are written in C#, otherwise in CIL (formerly MSIL).

4.1 FieldSig

As stated above, we begin with the simplest signatures. One of them is the FieldSig signature, which mainly describes a field's type and the custom modifiers attached to the field, and is indexed by the Field.Signature column. Of course, a field's signature starts with the size of the entire signature. Next comes the FIELD prolog, which has the constant value 0x06, followed by zero or more custom modifiers and the field's type. The syntax diagram for FieldSig is shown below, in Picture 2.

NOTE: Please do not confuse custom modifiers with custom attributes! These are completely different things. Because custom modifiers form part of several signatures, they will be subject to discussion in the next chapter. In examples in the current chapter, we will not use any custom modifier.

Picture 2. The FieldSig signature syntax diagram

Example 1

This example is straightforward, we have created a simple field of int32 type, as below:

C#
// Full source: FieldSig\1.cs 
// Binary: FieldSig\1.dll
// (...)

public int TestField;

Now we have to load the binary assembly FieldSig\1.dll into CFF Explorer and go to the Field table in order to find the row associated with our field (there should be only one). The picture below should help you a little bit.

Picture 3. TestField field's row in the Field metadata table

We found it! Let us go to offset 0x000A in the #Blob heap.

Picture 4. FieldSig signature explored by the CFF Explorer program

Now we will dissect the signature byte by byte in the table below:

Offset Value Meaning
0x0A 0x02 Signature size
0x0B 0x06 Prolog
0x0C 0x08 Field's type value is int32, see constants

Example 2

This time, we change the field's type to string, as the following code listing shows:

C#
// Full source: FieldSig\2.cs 
// Binary: FieldSig\2.dll
// (...)

public string TestField;

The FieldSig signature for our TestField field is still located at 0x000A, and as you can see, only the last byte has changed, from 0x08 to 0x0E.

Offset Value Meaning
0x0A 0x02 Signature size
0x0B 0x06 Prolog
0x0C 0x0E Field's type value is string, see constants
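
The two tables above can be reproduced with a few lines of C#. This is a hedged sketch only: FieldSigDemo, TypeName and DescribeField are names I made up, the input is the signature without its leading size byte, and only a handful of simple element types from the constants table are handled (no custom modifiers, classes, arrays or generics).

```csharp
using System;

static class FieldSigDemo
{
    const byte FIELD = 0x06;   // FieldSig prolog

    // Maps a few simple element types from the constants table to CIL-style names.
    static string TypeName(byte elementType)
    {
        switch (elementType)
        {
            case 0x02: return "bool";
            case 0x08: return "int32";
            case 0x0A: return "int64";
            case 0x0D: return "float64";
            case 0x0E: return "string";
            case 0x1C: return "object";
            default:
                throw new NotSupportedException(
                    "Element type 0x" + elementType.ToString("X2") + " is not handled in this sketch.");
        }
    }

    // Expects exactly two bytes: the FIELD prolog and one simple type.
    public static string DescribeField(byte[] sig)
    {
        if (sig.Length != 2 || sig[0] != FIELD)
            throw new FormatException("Expected the FIELD prolog followed by one simple type.");
        return TypeName(sig[1]);
    }

    static void Main()
    {
        Console.WriteLine(DescribeField(new byte[] { 0x06, 0x08 }));  // int32  (Example 1)
        Console.WriteLine(DescribeField(new byte[] { 0x06, 0x0E }));  // string (Example 2)
    }
}
```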

4.2 PropertySig

The PropertySig signature is indexed by the Property.Type column. It stores information about a property: the number of parameters supplied to the property in order to get data, zero or more custom modifiers, the type of the returned value, and the type of each supplied parameter. The signature starts with the PROPERTY prolog (constant value 0x08). There is also one new thing in PropertySig, namely the HASTHIS flag (constant value 0x20), which indicates whether, at run-time, the called method is passed a pointer to the target object as its first argument (the this pointer). As you can deduce, the HASTHIS flag is set when the property (in fact its getter and setter) is instance or virtual, and is not set when the property is static. The flag (if set) is ORed together with the signature's prolog value. Below you can see the full syntax diagram for this signature.

Picture 5. The PropertySig signature syntax diagram

Example 1

The first example is trivial, we have created one instance property of type int32, as shown below:

C#
// Full source: PropertySig\1.cs 
// Binary: PropertySig\1.dll
// (...)

public int TestProperty { get; set; }

The signature begins at 0x001A offset on the #Blob heap.

Offset Value Meaning
0x1A 0x03 Signature size.
0x1B 0x28 Prolog ORed with HASTHIS constant, because 0x20 OR 0x08 = 0x28.
0x1C 0x00 Number of parameters supplied to the property's getter method, see Picture 5 above.
0x1D 0x08 The type of property's return value (int32), see constants.

Example 2

This example is a little more complicated because it uses an indexed property, which returns a different value depending on the parameters supplied to it. As you can see below, such a property does not have a name in C#, but in the Property metadata table it is declared as Item (the name the C# compiler uses by default). You can define only one indexed property per class/structure, but you can overload it.

C#
// Full source: PropertySig\2.cs 
// Binary: PropertySig\2.dll
// (...)

public int this [int Param1, string Param2]
{
    get { return 0; }
    set { }
}

The signature of the above property resides at offset 0x001B on the #Blob heap, and is discussed in the table below:

Offset Value Meaning
0x1B 0x05 Signature size
0x1C 0x28 Property is still of instance type, so again signature's prolog is ORed with HASTHIS constant
0x1D 0x02 Number of parameters supplied to the property's getter method, see Picture 5 above
0x1E 0x08 The type of property's return value (int32), see constants
0x1F 0x08 The type of property's first parameter value (int32), see constants
0x20 0x0E The type of property's second parameter value (string), see constants

Example 3

In this example, we will try to disable HASTHIS flag by declaring property as static.

C#
// Full source: PropertySig\3.cs 
// Binary: PropertySig\3.dll
// (...)

public class TestClas
{
    public static int TestProperty { get; set; }
}

The above property's signature this time starts at 0x001A offset on the #Blob.

Offset Value Meaning
0x1A 0x03 Signature size.
0x1B 0x08 Prolog's constant value (only).
0x1C 0x00 Number of parameters supplied to the property's getter method, see Picture 5 above.
0x1D 0x08 The type of property's return value (int32), see constants.
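
The three PropertySig examples can be decoded with the short sketch below. As before, the names (PropertySigDemo, TypeName, Describe) are hypothetical, the input excludes the leading size byte, custom modifiers are not supported, and the parameter count is assumed to be below 0x80 so that it fits in a single byte.

```csharp
using System;
using System.Collections.Generic;

static class PropertySigDemo
{
    const byte PROPERTY = 0x08;  // PropertySig prolog
    const byte HASTHIS = 0x20;   // ORed into the prolog for instance properties

    static string TypeName(byte elementType)
    {
        switch (elementType)
        {
            case 0x08: return "int32";
            case 0x0E: return "string";
            case 0x1C: return "object";
            default:
                throw new NotSupportedException("Element type not handled in this sketch.");
        }
    }

    public static string Describe(byte[] sig)
    {
        byte prolog = sig[0];
        if ((prolog & ~HASTHIS) != PROPERTY)
            throw new FormatException("Not a PropertySig.");

        bool hasThis = (prolog & HASTHIS) != 0;
        int paramCount = sig[1];              // assumption: single-byte compressed integer
        string type = TypeName(sig[2]);       // the property's type

        var parameters = new List<string>();
        for (int i = 0; i < paramCount; i++)
            parameters.Add(TypeName(sig[3 + i]));

        return (hasThis ? "instance " : "static ") + type
             + " [" + string.Join(", ", parameters) + "]";
    }

    static void Main()
    {
        Console.WriteLine(Describe(new byte[] { 0x28, 0x00, 0x08 }));              // instance int32 []
        Console.WriteLine(Describe(new byte[] { 0x28, 0x02, 0x08, 0x08, 0x0E }));  // instance int32 [int32, string]
        Console.WriteLine(Describe(new byte[] { 0x08, 0x00, 0x08 }));              // static int32 []
    }
}
```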

4.3 MethodDefSig

As the name implies, this signature stores information related to methods defined in the current assembly, such as the calling convention, the number of generic parameters, the number of normal parameters, the return type and the type of each parameter supplied to the method. It is indexed by the MethodDef.Signature column.

Picture 6. The MethodDefSig signature syntax diagram

Additionally, some flags are used (listed in the table below), they are ORed together and placed in the second byte of the signature (first is the size of a signature).

Name Value Meaning
HASTHIS 0x20 First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the previous subsection.
EXPLICITTHIS 0x40 Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set.
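DEFAULT 0x00 The default managed calling convention: the Common Language Runtime determines how the call is made. Because its value is zero, the whole flags byte is 0x00 for a static, non-generic method that does not accept variable arguments.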
VARARG 0x05 Specifies the calling convention for methods with variable arguments.
GENERIC 0x10 Method has one or more generic parameters.
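
Decoding the flags byte is a matter of testing individual bits, as in the sketch below. The names MethodFlagsDemo and Describe are my own. The sketch also relies on my reading of the specification that the calling convention (DEFAULT, VARARG) occupies the low four bits of the byte, while HASTHIS, EXPLICITTHIS and GENERIC are separate bits ORed on top of it.

```csharp
using System;
using System.Collections.Generic;

static class MethodFlagsDemo
{
    const byte VARARG = 0x05, GENERIC = 0x10, HASTHIS = 0x20, EXPLICITTHIS = 0x40;

    // Turns the second byte of a MethodDefSig into a readable list of flags.
    public static string Describe(byte flags)
    {
        var parts = new List<string>();
        if ((flags & HASTHIS) != 0) parts.Add("HASTHIS");
        if ((flags & EXPLICITTHIS) != 0) parts.Add("EXPLICITTHIS");
        if ((flags & GENERIC) != 0) parts.Add("GENERIC");

        // Assumption: the low four bits select the calling convention.
        int convention = flags & 0x0F;
        if (convention == 0)
            parts.Add("DEFAULT");
        else if (convention == VARARG)
            parts.Add("VARARG");
        else
            parts.Add("0x" + convention.ToString("X2"));

        return string.Join(" | ", parts);
    }

    static void Main()
    {
        Console.WriteLine(Describe(0x30));  // HASTHIS | GENERIC | DEFAULT
        Console.WriteLine(Describe(0x00));  // DEFAULT
        Console.WriteLine(Describe(0x60));  // HASTHIS | EXPLICITTHIS | DEFAULT
        Console.WriteLine(Describe(0x25));  // HASTHIS | VARARG
    }
}
```

Because DEFAULT is zero, it never shows up as a set bit; the sketch reports it whenever no other calling convention is selected.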

Example 1

As usual, let us start with a simple example. This time we have created an instance method that has two generic parameters and two normal parameters; for clarity, the method has an empty body.

C#
// Full source: MethodDefSig\1.cs 
// Binary: MethodDefSig\1.dll
// (...)

public void TestMethod<GenArg1, GenArg2>(int Param1, object Param2) { }

The MethodDefSig signature for sample method lies at 0x000A offset, and looks as follows:

Offset Value Meaning
0x0A 0x06 Signature size
0x0B 0x30 Because this is instance and generic method, flags HASTHIS and GENERIC are set, 0x20 OR 0x10 = 0x30
0x0C 0x02 The number of generic parameters
0x0D 0x02 The number of normal parameters
0x0E 0x01 The type of the returned value (void), see constants
0x0F 0x08 First parameter's type (int32), see constants
0x10 0x1C Second parameter's type (object), see constants

Example 2

In this example, we will once again demonstrate the usage of the HASTHIS flag. The method definition under discussion looks as below:

C#
// Full source: MethodDefSig\2.cs 
// Binary: MethodDefSig\2.dll
// (...)

public class TestClas
{
    public static void TestMethod(int Param1, object Param2) { }
}

Signature again lies at 0x000A on the #Blob heap, and looks like:

Offset Value Meaning
0x0A 0x05 Signature size
0x0B 0x00 No flag bits are set, so the calling convention is DEFAULT, which lets the CLR determine the calling convention used. HASTHIS is not set, which means that the method is static. GENERIC is not set either, so the method is not generic and the next byte specifies the number of normal (not generic) parameters supplied to the method.
0x0C 0x02 The number of normal parameters
0x0D 0x01 The type of the returned value (void), see constants
0x0E 0x08 First parameter's type (int32), see constants
0x0F 0x1C Second parameter's type (object), see constants

Example 3

Now let us see how the EXPLICITTHIS flag works. We can turn it on by using the explicit keyword in the method definition, of course in the CIL language.

MSIL
// Full source: MethodDefSig\3.il
// Binary: MethodDefSig\3.dll
// (...)

.method instance explicit void TestMethod () cil managed
{
    .maxstack 2
     ret 
}

MethodDefSig for the above method looks like this:

Offset Value Meaning
0x01 0x03 Signature size
0x02 0x60 HASTHIS and EXPLICITTHIS flags are set, because 0x20 OR 0x40 = 0x60
0x03 0x00 The number of parameters that method takes
0x04 0x01 The type of the returned value (void), see constants

Example 4

In this example, we have created a method that accepts variable arguments, i.e., in addition to the normal parameters that are in its declaration, it accepts a variable number of parameters of varying types. Adding the vararg keyword to the method definition in the CIL language makes the method accept variable arguments, as you can see in the code listing below.

IMPORTANT: Using the params keyword in C# does not set the VARARG flag in the associated method's signature. The result of my investigation is that a method which uses the params keyword in C# is just decorated by the C# compiler with the ParamArray attribute, and the additional parameters are treated as a normal array. You can also make a method truly VARARG in C# by following this instruction, but this is not CLS compliant.

MSIL
// Full source: MethodDefSig\4.il
// Binary: MethodDefSig\4.dll
// (...)

.method instance vararg void TestMethod () cil managed
{
    .maxstack 2
    ret 
}

The method's signature is explored in the following table:

Offset Value Meaning
0x01 0x03 Signature size
0x02 0x25 The method is instance and accepts variable arguments, thus HASTHIS and VARARG flags are set, and so 0x20 OR 0x05 = 0x25
0x03 0x00 The number of parameters that method takes
0x04 0x01 The type of the returned value (void), see constants
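
Putting the pieces of this subsection together, the sketch below walks a MethodDefSig in the order shown in the tables: flags, the generic parameter count (only when GENERIC is set), the parameter count, the return type and the parameter types. The names MethodDefSigDemo, TypeName and Describe are hypothetical, the input excludes the leading size byte, both counts are assumed to fit in one byte, and only a few simple element types are recognised.

```csharp
using System;
using System.Collections.Generic;

static class MethodDefSigDemo
{
    const byte GENERIC = 0x10, HASTHIS = 0x20;

    static string TypeName(byte elementType)
    {
        switch (elementType)
        {
            case 0x01: return "void";
            case 0x08: return "int32";
            case 0x0E: return "string";
            case 0x1C: return "object";
            default:
                throw new NotSupportedException("Element type not handled in this sketch.");
        }
    }

    public static string Describe(byte[] sig)
    {
        int pos = 0;
        byte flags = sig[pos++];

        // The generic parameter count is present only when GENERIC is set.
        int genericCount = 0;
        if ((flags & GENERIC) != 0)
            genericCount = sig[pos++];

        int paramCount = sig[pos++];
        string returnType = TypeName(sig[pos++]);

        var parameters = new List<string>();
        for (int i = 0; i < paramCount; i++)
            parameters.Add(TypeName(sig[pos++]));

        string result = ((flags & HASTHIS) != 0 ? "instance " : "static ") + returnType;
        if (genericCount > 0)
            result += " <" + genericCount + " generic>";
        return result + " (" + string.Join(", ", parameters) + ")";
    }

    static void Main()
    {
        // Example 1: instance void <2 generic> (int32, object)
        Console.WriteLine(Describe(new byte[] { 0x30, 0x02, 0x02, 0x01, 0x08, 0x1C }));
        // Example 2: static void (int32, object)
        Console.WriteLine(Describe(new byte[] { 0x00, 0x02, 0x01, 0x08, 0x1C }));
    }
}
```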

4.4 MethodRefSig

This signature is very similar (if not identical) to the previously mentioned MethodDefSig, but in contrast to it, MethodRefSig describes a method's calling convention, parameters, etc., at the point where the method is called (also known as the call site). The signature is indexed by the MemberRef.Signature column. If the method does not accept variable arguments, it is identical to MethodDefSig and shall exactly match the signature specified in the definition of the target method; otherwise, it looks as below:

Picture 7. The MethodRefSig signature syntax diagram

As you can see, when you call a VARARG method, its associated MethodRefSig contains one additional constant, namely SENTINEL. This value has only one simple aim: it denotes the end of the required parameters supplied to the method and the beginning of the additional (variable) parameters; you can find more information about sentinel values here. Also notice that the ParamCount integer indicates the total number of parameters supplied to the method. The table below lists the abbreviations used in the MethodRefSig signature when it differs from MethodDefSig.

Name Value Meaning
HASTHIS 0x20 First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the subsection 4.2.
EXPLICITTHIS 0x40 Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set.
VARARG 0x05 Specifies the calling convention for methods with variable arguments.
SENTINEL 0x41 Denotes end of required parameters.

Example 1

To convince you that, when calling a non-VARARG method, there is no difference between the MethodDefSig and its associated MethodRefSig signature, I have created the following code:

C#
// Full source: MethodRefSig\1a.cs
// Binary: MethodRefSig\1a.dll
// (...)

public void TestMethod(int Param1, string Param2) { }

C#
// Full source: MethodRefSig\1b.cs
// Binary: MethodRefSig\1b.dll
// (...)

new TestClass().TestMethod(0, "A simple parameter");

Now let us look at the TestMethod's MethodDefSig signature that resides in the MethodRefSig\1a.dll file.

Offset Value Meaning
0x0A 0x05 Signature size
0x0B 0x20 The method is instance, thus HASTHIS flag is set which means that first argument passed to the method is the this pointer.
0x0C 0x02 Method takes exactly two parameters.
0x0D 0x01 The type of the returned value (void), see constants
0x0E 0x08 First parameter's type (int32), see constants
0x0F 0x0E Second parameter's type (string), see constants

Its related MethodRefSig (in the MethodRefSig\1b.dll file) looks exactly the same, but lies at a different offset.

Offset Value Meaning
0x13 0x05 Signature size
0x14 0x20 The method is instance, thus HASTHIS flag is set which means that first argument passed to the method is the this pointer.
0x15 0x02 Method takes exactly two parameters.
0x16 0x01 The type of the returned value (void), see constants
0x17 0x08 First parameter's type (int32), see constants
0x18 0x0E Second parameter's type (string), see constants

Example 2

In this example, we will demonstrate how the MethodRefSig signature deals with calls to VARARG methods. For this purpose, we have created a truly VARARG method that takes one required parameter and other, variable parameters. Remember that using the params keyword in C# does not set the VARARG flag in the associated method's signature, because params just decorates the method with the ParamArray attribute and the additional parameters are treated as an array of objects of some type. In order to set the VARARG flag in the signature, you have to add __arglist to the method definition as the last parameter, but this is not CLS compliant. For more information, go here.

C#
// Full source: MethodRefSig\2a.cs
// Binary: MethodRefSig\2a.dll
// (...)

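[CLSCompliant(false)]
public void TestMethod(string RequiredParam, __arglist)
{
    Console.WriteLine("Required parameter is: " + RequiredParam);

    Console.WriteLine("Additional parameters are: ");
    ArgIterator argumentIterator = new ArgIterator(__arglist);

    // GetRemainingCount() shrinks after every GetNextArg() call, so loop until
    // it reaches zero instead of comparing it against a growing counter.
    while (argumentIterator.GetRemainingCount() > 0)
    {
        // TypedReference.ToObject boxes the argument whatever its type is,
        // so arguments of any type (not only strings) are printed correctly.
        Console.WriteLine(TypedReference.ToObject(argumentIterator.GetNextArg()));
    }
}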

Now it is time to call our method from a separate assembly. The method is called with one required argument of type string and two additional arguments of type int32, as below:

C#
// Full source: MethodRefSig\2b.cs
// Binary: MethodRefSig\2b.dll
// (...)

[CLSCompliant(false)]
public void TestRunMethod()
{
    new TestClass().TestMethod(
        "I am required parameter.", 
        __arglist(0, 1));
}

I have discovered that for the above call, there are two rows in the MemberRef table. I do not know why this is so, but I know that the signature from the first encountered row has the HASTHIS flag set but does not contain any information about the variable arguments that have been supplied to the method; the specification does not say anything about this strange behaviour. The signature indexed by the second row is OK, however, so let us look at it.

Offset Value Meaning
0x23 0x07 Signature size
0x24 0x25 The method is instance and accepts variable parameters, hence HASTHIS OR VARARG = 0x20 OR 0x05 = 0x25
0x25 0x03 Total number of parameters supplied to method is 3, one required and two additional
0x26 0x01 The type of the returned value (void), see constants
0x27 0x0E First required parameter's type (string), see constants
0x28 0x41 SENTINEL constant, all parameters after this value are additional
0x29 0x08 First additional parameter's type (int32), see constants
0x2A 0x08 Second additional parameter's type (int32), see constants
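
The role of SENTINEL is easy to see in code: a reader collects parameter types as usual and simply switches from the "required" list to the "additional" list when it meets 0x41. The sketch below does exactly that for the signature in the table above. All names (MethodRefSigDemo, TypeName, Describe) are hypothetical, the input excludes the size byte, ParamCount is assumed to fit in one byte, and the SENTINEL itself is not counted as a parameter, in line with the ParamCount value of 3 shown above.

```csharp
using System;
using System.Collections.Generic;

static class MethodRefSigDemo
{
    const byte VARARG = 0x05, SENTINEL = 0x41;

    static string TypeName(byte elementType)
    {
        switch (elementType)
        {
            case 0x01: return "void";
            case 0x08: return "int32";
            case 0x0E: return "string";
            case 0x1C: return "object";
            default:
                throw new NotSupportedException("Element type not handled in this sketch.");
        }
    }

    public static string Describe(byte[] sig)
    {
        int pos = 0;
        bool isVarArg = (sig[pos++] & 0x0F) == VARARG;
        int paramCount = sig[pos++];               // required + additional parameters
        string returnType = TypeName(sig[pos++]);

        var required = new List<string>();
        var additional = new List<string>();
        List<string> target = required;

        int read = 0;
        while (read < paramCount)
        {
            byte b = sig[pos++];
            if (b == SENTINEL)
            {
                if (!isVarArg)
                    throw new FormatException("SENTINEL in a non-VARARG signature.");
                target = additional;               // everything after it is additional
                continue;                          // the marker is not a parameter
            }
            target.Add(TypeName(b));
            read++;
        }

        return returnType + " (" + string.Join(", ", required) + ") ... ("
             + string.Join(", ", additional) + ")";
    }

    static void Main()
    {
        byte[] sig = { 0x25, 0x03, 0x01, 0x0E, 0x41, 0x08, 0x08 };
        Console.WriteLine(Describe(sig));  // void (string) ... (int32, int32)
    }
}
```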

4.5 StandAloneMethodSig

This signature type is very similar to MethodRefSig: it provides a call site signature for a method, but has two key differences. The first is that StandAloneMethodSig can specify an unmanaged target method; it is usually created as preparation for executing the calli instruction, which invokes either managed or unmanaged code. The second important difference is that the signature is indexed by the StandAloneSig.Signature column, which is the only column in the StandAloneSig metadata table. What is more, rows in this table are not referenced by any other table (that is why its name is "stand alone"); the table is filled by code generators. The signature at the StandAloneSig.Signature column shall be either a StandAloneMethodSig signature for each execution of the calli instruction, or a LocalVarSig signature that describes the local variables of a method, which will be further clarified in the next subsection. The syntax diagram for the StandAloneMethodSig signature is as follows:

Picture 8. The StandAloneMethodSig signature syntax diagram

Because this signature differs from the MethodRefSig signature only in that StandAloneMethodSig can call unmanaged methods, a few other constants were added that describe the calling conventions used to invoke unmanaged methods.

IMPORTANT: As you will soon see, there are different calling conventions for invoking methods that accept variable parameters in managed and unmanaged code, and the diagram may look different in each case. For example, the VARARG calling convention invokes managed methods accepting variable parameters; in this case, the signature has additional elements, SENTINEL and one or more Param (shaded boxes). The C calling convention also invokes methods accepting variable parameters (in unmanaged code), but the signature for this case ends just before the Param element. From my observations, the compiler generates signatures as stated above. Unfortunately, my sample code compiles but throws an exception, and I do not know where the problem is, so I cannot say with certainty that my observations are correct. Moreover, the specification is not clear:

"Two separate diagrams have been combined into one in this diagram, using shading to distinguish between them. Thus, for the following calling conventions: DEFAULT (managed), STDCALL, THISCALL and FASTCALL (unmanaged), the signature ends just before the SENTINEL item (these are all non vararg signatures). However, for the managed and unmanaged vararg calling conventions: VARARG (managed) and C (unmanaged), the signature can include the SENTINEL and final Param items (they are not required, however). These options are indicated by the shading of boxes in the syntax diagram."

Do you see that? Why is the C box not shaded if using the C calling convention may add SENTINEL and Param elements when calling an unmanaged method which accepts variable arguments? Under what circumstances are the Param elements not required? The calli instruction occurs very rarely in properly working code that calls an unmanaged method (the 392 assemblies in my GAC execute the calli instruction only twice, and only against managed methods!), so I cannot say that my explanations of the sample code in this subsection are absolutely true. If somebody knows how the StandAloneMethodSig signature looks when correctly calling an unmanaged method (either accepting or not accepting variable arguments; in both cases, my code throws an exception), please let me know. I would be very grateful.

Name Value Meaning
HASTHIS 0x20 First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the subsection 4.2.
EXPLICITTHIS 0x40 Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set.
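DEFAULT 0x00 The default managed calling convention: the Common Language Runtime determines how the call is made. Because its value is zero, the whole flags byte is 0x00 for a call to a static method that does not accept variable arguments.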
VARARG 0x05 Specifies the calling convention for managed methods with variable arguments.
C 0x01 Calling convention for unmanaged method target, specifics for this convention are:
Parameters are passed from right to left.
Caller of a method performs stack cleanup.
Only this calling convention allows invoking unmanaged methods that have variable parameters (vararg is for managed methods).
You can use this calling convention by adding the unmanaged cdecl keyword to a method definition in the CIL language.
STDCALL 0x02 Calling convention for unmanaged method target, specifics for this convention are:
Parameters are passed from right to left.
Called method performs stack cleanup.
You can use this calling convention by adding the unmanaged stdcall keyword to a method definition in the CIL language.
THISCALL 0x03 Calling convention for unmanaged method target, specifics for this convention are:
Parameters are passed from right to left.
Called method performs stack cleanup.
The this pointer is placed in the ECX register.
You can use this calling convention by adding the unmanaged thiscall keyword to a method definition in the CIL language.
FASTCALL 0x04 Calling convention for unmanaged method target, specifics for this convention are:
Some parameters are placed in ECX and EDX registers, the rest of the arguments are placed (pushed) onto the stack from right to left.
Called method performs stack cleanup.
You can use this calling convention by adding the unmanaged fastcall keyword to a method definition in the CIL language.
SENTINEL 0x41 Denotes end of required parameters
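
The constants in the table can be combined in a single byte, so it may help to see how such a byte could be taken apart. The following is a minimal Python sketch, not code from the article's download: the helper name describe_first_byte is hypothetical, and it assumes that the calling convention occupies the low four bits of the byte while HASTHIS and EXPLICITTHIS are independent flags.

```python
# Flag and calling-convention values taken from the table above.
HASTHIS = 0x20
EXPLICITTHIS = 0x40
CALLING_CONVENTIONS = {
    0x00: "DEFAULT",
    0x01: "C",
    0x02: "STDCALL",
    0x03: "THISCALL",
    0x04: "FASTCALL",
    0x05: "VARARG",
}


def describe_first_byte(value):
    """Split the first signature byte into its calling convention and flags.

    Hypothetical helper; assumes the convention sits in the low four bits.
    """
    convention = CALLING_CONVENTIONS.get(value & 0x0F, "UNKNOWN")
    flags = []
    if value & HASTHIS:
        flags.append("HASTHIS")
    if value & EXPLICITTHIS:
        flags.append("EXPLICITTHIS")
    return convention, flags


print(describe_first_byte(0x00))  # static method, default convention
print(describe_first_byte(0x25))  # instance method with the VARARG convention
```

SENTINEL (0x41) is deliberately left out of the helper, because it appears inside the parameter list and not in the first byte.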

NOTE: One thing worth mentioning here is that, in contrast to CL (the Microsoft C/C++ compiler), ILASM (the Microsoft CIL compiler) does not add any special characters (such as "@", "_", "?", etc.) to a method name when any of the calling conventions for unmanaged targets is used. The CIL compiler does not decorate method names because it only generates bytecode, which is later compiled into machine code by the CLR's Just-in-time compiler. So when you choose a calling convention while coding in CIL, the ILASM compiler does not determine who (the caller or the called method) cleans the stack, does not determine in what order arguments are passed to a method, and does not change method names; all of this is done during JIT compilation / optimization. If you do not know what I am talking about, you can read Nemanja Trifunovic's article entitled Calling Conventions Demystified, which thoroughly describes the different calling conventions for C and C++, their meaning, how they work, etc.

Example 1

In the sample code listing, we have two managed methods. The first method has one fixed parameter of type int32 and is declared to return int32 as well (in fact, it does not return anything, since no data is pushed onto the evaluation stack). The second method just executes the first one, as you can see below:

MSIL
// Full source: StandAloneMethodSig\1.il
// Binary: StandAloneMethodSig\1.dll
// (...)

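.method public static int32 TestMethod(int32 required)
{
    // Nothing is pushed onto the evaluation stack before ret
    ret
}

.method public static void TestRunMethod()
{
    .maxstack 8
    ldc.i4.1                          // the int32 argument
    ldftn int32 TestMethod(int32)     // pointer to TestMethod
    calli int32(int32)                // call-site signature stored as StandAloneMethodSig
    ret
}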

Before TestRunMethod executes TestMethod, it pushes one int32 value (the argument) onto the evaluation stack using the ldc.i4.1 instruction, then pushes a pointer to the first method onto the evaluation stack with the ldftn instruction. Finally, it calls our "do nothing" managed test method by executing calli. This last instruction generates the StandAloneMethodSig signature, which is explained in the table below:

Offset Value Meaning
0x01 0x04 Signature size
0x02 0x00 DEFAULT calling convention. The method is not an instance method, since the HASTHIS flag is not set.
0x03 0x01 Number of parameters: one fixed parameter and no additional ones.
0x04 0x08 The type of the returned value (int32), see constants
0x05 0x08 First required parameter's type (int32), see constants
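
The byte layout in the table can also be read programmatically. The following Python sketch is only an illustration under stated assumptions: the names read_compressed_uint and parse_simple_sig are hypothetical, the signature is assumed to contain no SENTINEL, and only the two type constants that occur in this example (void and int32) are mapped. The size and the parameter count are read as compressed integers.

```python
ELEMENT_TYPES = {0x01: "void", 0x08: "int32"}  # only the constants used here


def read_compressed_uint(data, pos):
    """Read a compressed unsigned integer; return (value, next position)."""
    first = data[pos]
    if first & 0x80 == 0:       # 0xxxxxxx: the value fits in one byte
        return first, pos + 1
    if first & 0xC0 == 0x80:    # 10xxxxxx: the value takes two bytes
        return ((first & 0x3F) << 8) | data[pos + 1], pos + 2
    # 110xxxxx: the value takes four bytes
    value = ((first & 0x1F) << 24) | (data[pos + 1] << 16)
    value |= (data[pos + 2] << 8) | data[pos + 3]
    return value, pos + 4


def parse_simple_sig(blob):
    """Parse a size-prefixed signature with simple types and no SENTINEL."""
    size, pos = read_compressed_uint(blob, 0)
    body = blob[pos:pos + size]
    convention = body[0]
    param_count, i = read_compressed_uint(body, 1)
    return_type = ELEMENT_TYPES[body[i]]
    params = [ELEMENT_TYPES[b] for b in body[i + 1:i + 1 + param_count]]
    return {"size": size, "convention": convention,
            "return": return_type, "params": params}


# The five bytes from the table above: size, convention, count, return, param.
example_1 = bytes([0x04, 0x00, 0x01, 0x08, 0x08])
print(parse_simple_sig(example_1))
```

A real parser would need to handle every element type, including the ones that take more than one byte; this sketch stops at the simple case shown in the table.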

Example 2

In this example, we will make the sample method accept variable arguments and we will call it with calli, passing one required and one additional parameter. The fixed parameters are separated from the additional parameters with an ellipsis (...), as seen below:

MSIL
// Full source: StandAloneMethodSig\2.il
// Binary: StandAloneMethodSig\2.dll
// (...)

.method public hidebysig static vararg void TestMethod(int32 required)
{
    ret
}

.method public hidebysig static void TestRunMethod()
{
    .maxstack 3
    ldc.i4.1                                         // required argument
    ldc.i4.2                                         // additional argument
    ldftn vararg void TestMethod(int32, ..., int32)
    calli vararg void(int32, ..., int32)             // one type before the ellipsis, one after
    ret
}

In this case, the signature generated by the calli instruction looks the same as the MethodRefSig signature discussed in the previous subsection. Let us take a look:

Offset Value Meaning
0x01 0x06 Signature size
0x02 0x05 VARARG calling convention: the method accepts variable arguments. It is static, since the HASTHIS flag is not set.
0x03 0x02 Total number of parameters supplied to the method is 2, one required and one additional
0x04 0x01 The type of the returned value (void), see constants
0x05 0x08 First required parameter's type (int32), see constants
0x06 0x41 SENTINEL constant, all parameters after this value are additional
0x07 0x08 First additional parameter's type (int32), see constants
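
To make the role of the SENTINEL byte concrete, here is a small Python sketch that splits the parameter types of such a signature into required and additional ones. It is an illustration only: split_params is a hypothetical name, the input is the signature without its leading size byte, and the sketch assumes a single-byte parameter count and only the void and int32 type constants.

```python
SENTINEL = 0x41
ELEMENT_TYPES = {0x01: "void", 0x08: "int32"}  # only the constants used here


def type_names(raw):
    """Translate a run of simple type bytes into readable names."""
    return [ELEMENT_TYPES[b] for b in raw]


def split_params(body):
    """Split a signature body (size byte removed) at the SENTINEL.

    Layout assumed: convention, parameter count, return type, parameters.
    Returns (return type, required parameters, additional parameters).
    """
    return_type = ELEMENT_TYPES[body[2]]
    rest = body[3:]
    if SENTINEL in rest:
        cut = rest.index(SENTINEL)
        required, additional = rest[:cut], rest[cut + 1:]
    else:
        required, additional = rest, b""
    return return_type, type_names(required), type_names(additional)


# Bytes at offsets 0x02 to 0x07 of the table above.
example_2 = bytes([0x05, 0x02, 0x01, 0x08, 0x41, 0x08])
print(split_params(example_2))
```

Searching for the 0x41 byte is safe here only because no type constant in this example shares that value; a complete reader would walk the parameters one by one instead.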

Example 3

This is the most problematic sample of the entire article. The method in the sample code below calls an unmanaged method that accepts variable arguments. The code compiles but throws a TypeLoadException ("The signature is incorrect"), and unfortunately the specification is not clear about this case (see the important note at the beginning of this subsection). As in the second example, the sample code calls a method that accepts variable arguments, but this time the called method is unmanaged.

MSIL
// Full source: StandAloneMethodSig\3.il
// Binary: StandAloneMethodSig\3.dll
// (...)

.method public hidebysig static unmanaged cdecl void TestMethod(int32 required, ...)
{
    ret
}

.method public hidebysig static void TestRunMethod()
{
    .maxstack 3
    ldc.i4.1                                          // required argument
    ldc.i4.2                                          // additional argument
    ldftn unmanaged cdecl void TestMethod(int32, ...)
    calli unmanaged cdecl void(int32, ...)            // no types are listed after the ellipsis
    ret
}

The signature generated by the calli instruction is very strange: it ends just before the first additional Param element that we supplied to the method.

Offset Value Meaning
0x01 0x05 Signature size
0x02 0x01 The method is static and unmanaged. The calling convention is C (set by the unmanaged cdecl keyword), so the method may accept variable arguments.
0x03 0x01 Total number of parameters supplied to the method is 1: the required one is counted and the additional one is omitted (I have not the slightest idea why)
0x04 0x01 The type of the returned value (void), see constants
0x05 0x08 First required parameter's type (int32), see constants
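0x06 0x41 SENTINEL constant, all parameters after this value are additional. Unfortunately, there are no additional parameters after this value. One possible explanation, which I have not verified, is that the calli signature in the listing names no types after the ellipsis, unlike the one in Example 2. If you know why this is so, please contact me.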
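
One way to read the table above is to compare the call-site signatures in the listings: in Example 2, a type follows the ellipsis, while in Example 3 nothing does. The Python sketch below is purely hypothetical and is not a description of how ILASM works. It encodes a call-site signature under the assumed rule that a SENTINEL is written whenever an ellipsis is present and that only the listed types are counted. The function name encode_call_site and the rule itself are assumptions made for illustration.

```python
SENTINEL = 0x41
VOID, INT32 = 0x01, 0x08   # type constants used in the examples
C, VARARG = 0x01, 0x05     # calling conventions used in the examples


def encode_call_site(convention, return_type, required, additional=None):
    """Build a size-prefixed call-site signature from single-byte types.

    additional=None means that no ellipsis was written; a list (even an
    empty one) means that an ellipsis was written. This rule is an
    assumption for illustration, not documented ILASM behaviour.
    """
    extra = [] if additional is None else [SENTINEL, *additional]
    count = len(required) + len(additional or [])
    body = bytes([convention, count, return_type, *required, *extra])
    return bytes([len(body)]) + body


# calli vararg void(int32, ..., int32) from Example 2
print(encode_call_site(VARARG, VOID, [INT32], [INT32]).hex())
# calli unmanaged cdecl void(int32, ...) from Example 3
print(encode_call_site(C, VOID, [INT32], []).hex())
```

Under this assumed rule, both calls produce the byte sequences listed in the tables of Example 2 and Example 3. That is consistent with the signature simply mirroring what was written after calli, but it does not explain why the runtime rejects it.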

5. Next Part

That is it for now; the next part can be found here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)