Contents
- Introduction
- What are Signatures?
- Getting Started
- CFF Explorer
- Endianness
- Compressed Integer
- Constants
- Signatures
  - FieldSig
  - PropertySig
  - MethodDefSig
  - MethodRefSig
  - StandAloneMethodSig
- Next Part
Some time ago, I had to write my own reflection engine for .NET because I needed it to build a .NET documentation generator. After a few weeks, I discovered Mono.Cecil, which (after some modifications) is now ideal for me, so I abandoned my own reflection engine. I thought it was a great opportunity to write an article and thank all CodeProject community members for sharing the hard-earned knowledge they have gathered in recent years.
This two-part article covers signatures, the second most important part of a .NET file after metadata, about which Daniel Pistelli wrote an excellent article that can be found here. I strongly recommend reading that article before progressing. Additionally, you can read An In-Depth Look into the Win32 Portable Executable File Format in MSDN Magazine, which describes the PE file format that forms the foundation for .NET metadata, signatures and Intermediate Language (IL) code. Of course, almost everything can be found in the Partition II Metadata specification, but as usual, specifications sacrifice readability for completeness, which is another reason why I wrote this article.
In brief, signatures store data that cannot be compactly stored in metadata tables, for instance parameter types, arguments supplied to custom attributes, marshalling descriptors, etc. Storing information such as parameter types in tables would fragment the data excessively, make it hard to read and impose a performance penalty, so the CLI/CLR engineers invented signatures, which store this sort of data in a compact and orderly manner. In the next chapters, you will clearly see why they are so important.
In this chapter, you will learn a few things needed to understand the rest of the article, so do not underestimate it: the information contained here is used extensively in the following sections. Terms associated with signatures but not covered here will be explained later, along the way.
For viewing .NET metadata and signatures, we will be using CFF Explorer, also written by Daniel Pistelli. CFF Explorer is a freeware tool that can view and edit PE headers, resources and some fields and flags of .NET metadata; you can download it at this site. In the picture below, you can see CFF Explorer running with a sample assembly loaded. Usually, signatures are indexed by a Signature column. The red circle marks the location of the MethodDefSig signature, indexed by Method.Signature, in the #Blob heap, which can be explored by clicking on the green circle.
Picture 1. CFF Explorer in action
I think the best description of endianness is given by Wikipedia:
"In computing, endianness is the byte (and sometimes bit) ordering used to represent some kind of data. Typical cases are the order in which integer values are stored as bytes in computer memory (relative to a given memory addressing scheme) and the transmission order over a network or other medium. When specifically talking about bytes, endianness is also referred to simply as byte order."
In our case, we treat endianness as the byte order of data (usually integers) stored in a file. There are two ways (orders) of representing data in a file, big-endian and little-endian. A PE/.NET file uses both, so we discuss each of them below.
Big Endian
In this ordering method, the most significant byte is stored at the file location with the lowest offset, the next byte is stored at the following offset, and so on. In the example below, we store the value 0x1B5680DA at offset 100, so the file would look like:
Offset | ... | 100 | 101 | 102 | 103 | ... |
Value | ... | 1B | 56 | 80 | DA | ... |
Little Endian
Compared to big endian, little endian stores data in reverse order, i.e., the least significant byte is stored at the lowest offset. In this case, our value 0x1B5680DA stored in little endian would look like:
Offset | ... | 100 | 101 | 102 | 103 | ... |
Value | ... | DA | 80 | 56 | 1B | ... |
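The two layouts above can be reproduced with a short C# sketch. It is only an illustration, not part of the article's sample sources, and it assumes a compiler that supports top-level statements (C# 9 or later) and the `System.Buffers.Binary.BinaryPrimitives` helpers.

```csharp
using System;
using System.Buffers.Binary;

// Store 0x1B5680DA into two 4-byte buffers, once per byte order.
byte[] big = new byte[4];
byte[] little = new byte[4];
BinaryPrimitives.WriteUInt32BigEndian(big, 0x1B5680DA);
BinaryPrimitives.WriteUInt32LittleEndian(little, 0x1B5680DA);

// Index 0 of each buffer corresponds to offset 100 in the tables above.
Console.WriteLine(BitConverter.ToString(big));    // 1B-56-80-DA
Console.WriteLine(BitConverter.ToString(little)); // DA-80-56-1B
```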
Signatures are compressed before being stored in the #Blob heap by compressing the integers embedded in them. In contrast to normal integers, which have a fixed size, compressed integers use only as much space as needed, and almost all signatures use them instead of fixed-size integers. Because the vast majority of numbers in signatures lie below 128, the space saving is significant. Below, you can see the encoding algorithm taken from the specification:
- If the value lies between 0 (0x00) and 127 (0x7F), inclusive, encode as a one-byte integer (bit 7 is clear, value held in bits 6 through 0).
- If the value lies between 2^7 (0x80) and 2^14 - 1 (0x3FFF), inclusive, encode as a 2-byte integer with bit 15 set, bit 14 clear (value held in bits 13 through 0).
- Otherwise, encode as a 4-byte integer, with bit 31 set, bit 30 set, bit 29 clear (value held in bits 28 through 0).
- A null string should be represented with the reserved single byte 0xFF, and no following data.
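The three size classes can be sketched as a small C# encoder. `EncodeCompressed` is a hypothetical helper name chosen for this illustration (it is not taken from the specification or from any library), and the sketch assumes top-level statements (C# 9 or later).

```csharp
using System;

// Encodes an unsigned value using the three size classes listed above.
static byte[] EncodeCompressed(uint value)
{
    if (value <= 0x7F)                 // 1 byte:  0xxxxxxx
        return new[] { (byte)value };
    if (value <= 0x3FFF)               // 2 bytes: 10xxxxxx xxxxxxxx
        return new[] { (byte)(0x80u | (value >> 8)), (byte)value };
    if (value <= 0x1FFFFFFF)           // 4 bytes: 110xxxxx followed by three more bytes
        return new[]
        {
            (byte)(0xC0u | (value >> 24)), (byte)(value >> 16),
            (byte)(value >> 8), (byte)value
        };
    throw new ArgumentOutOfRangeException(nameof(value), "value needs more than 29 bits");
}

Console.WriteLine(BitConverter.ToString(EncodeCompressed(0x03)));   // 03
Console.WriteLine(BitConverter.ToString(EncodeCompressed(0x80)));   // 80-80
Console.WriteLine(BitConverter.ToString(EncodeCompressed(0x2E57))); // AE-57
```

The most significant byte is written first, which is why the multi-byte forms come out in big-endian order.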
Example 1
The value is less than 0x80, so the first case applies and we cut three unnecessary bytes.
| Original value (32-bit) | Compressed value | Saved bytes |
Hex | 00 00 00 03 | 03 | 3 |
Binary | 00000000 00000000 00000000 00000011 | 00000011 | - |
Example 2
The same as Example 1.
| Original value (32-bit) | Compressed value | Saved bytes |
Hex | 00 00 00 7F | 7F | 3 |
Binary | 00000000 00000000 00000000 01111111 | 01111111 | - |
Example 3
In this example, the original value is equal to 0x80. Although one byte is enough to hold 0x80, a one-byte compressed integer must have its most significant bit (bit 7) clear, so storing 0x80 as a compressed integer requires an additional byte.
| Original value (32-bit) | Compressed value | Saved bytes |
Hex | 00 00 00 80 | 80 80 | 2 |
Binary | 00000000 00000000 00000000 10000000 | 10000000 10000000 | - |
Example 4
We cut two unnecessary bytes:
| Original value (32-bit) | Compressed value | Saved bytes |
Hex | 00 00 2E 57 | AE 57 | 2 |
Binary | 00000000 00000000 00101110 01010111 | 10101110 01010111 | - |
Obviously, compression comes at a cost: some bits must be reserved to indicate how many bytes the compressed integer occupies, so the largest encodable integer is 29 bits long, with the value 0x1FFFFFFF. Compressed integers are physically encoded using big-endian byte order.
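Reading a compressed integer back is the mirror image of the rules above: the top bits of the first byte tell how many bytes follow. The sketch below is an illustration only. `DecodeCompressed` is a hypothetical helper name, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;

// Decodes one compressed integer starting at 'pos' and advances 'pos' past it.
static uint DecodeCompressed(byte[] data, ref int pos)
{
    byte first = data[pos++];
    if ((first & 0x80) == 0)                    // 0xxxxxxx: one byte
        return first;
    if ((first & 0xC0) == 0x80)                 // 10xxxxxx: two bytes
        return (uint)(((first & 0x3F) << 8) | data[pos++]);
    if ((first & 0xE0) == 0xC0)                 // 110xxxxx: four bytes
    {
        uint value = (uint)(first & 0x1F) << 24;
        value |= (uint)data[pos++] << 16;
        value |= (uint)data[pos++] << 8;
        value |= data[pos++];
        return value;
    }
    throw new FormatException("not a compressed integer");
}

// The compressed values from Examples 1 to 4, stored back to back.
byte[] bytes = { 0x03, 0x7F, 0x80, 0x80, 0xAE, 0x57 };
int cursor = 0;
while (cursor < bytes.Length)
    Console.WriteLine($"0x{DecodeCompressed(bytes, ref cursor):X}");
// prints 0x3, 0x7F, 0x80 and 0x2E57, one per line
```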
The following list contains common constants that are used in almost all signatures. In the next parts of the article, we will refer to them very often by abbreviations, using only the last part of the name, for example ELEMENT_TYPE_I8 as I8, ELEMENT_TYPE_STRING as STRING, and so on.
Name | Value | Remarks |
ELEMENT_TYPE_END | 0x00 | Marks end of a list |
ELEMENT_TYPE_VOID | 0x01 | System.Void |
ELEMENT_TYPE_BOOLEAN | 0x02 | System.Boolean |
ELEMENT_TYPE_CHAR | 0x03 | System.Char |
ELEMENT_TYPE_I1 | 0x04 | System.SByte |
ELEMENT_TYPE_U1 | 0x05 | System.Byte |
ELEMENT_TYPE_I2 | 0x06 | System.Int16 |
ELEMENT_TYPE_U2 | 0x07 | System.UInt16 |
ELEMENT_TYPE_I4 | 0x08 | System.Int32 |
ELEMENT_TYPE_U4 | 0x09 | System.UInt32 |
ELEMENT_TYPE_I8 | 0x0A | System.Int64 |
ELEMENT_TYPE_U8 | 0x0B | System.UInt64 |
ELEMENT_TYPE_R4 | 0x0C | System.Single |
ELEMENT_TYPE_R8 | 0x0D | System.Double |
ELEMENT_TYPE_STRING | 0x0E | System.String |
ELEMENT_TYPE_PTR | 0x0F | Unmanaged pointer, followed by the Type element. |
ELEMENT_TYPE_BYREF | 0x10 | Managed pointer, followed by the Type element. |
ELEMENT_TYPE_VALUETYPE | 0x11 | A value type modifier, followed by TypeDef or TypeRef token |
ELEMENT_TYPE_CLASS | 0x12 | A class type modifier, followed by TypeDef or TypeRef token |
ELEMENT_TYPE_VAR | 0x13 | Generic parameter in a generic type definition, represented as number |
ELEMENT_TYPE_ARRAY | 0x14 | A multi-dimensional array type modifier. |
ELEMENT_TYPE_GENERICINST | 0x15 | Generic type instantiation. Followed by type type-arg-count type-1 ... type-n |
ELEMENT_TYPE_TYPEDBYREF | 0x16 | A typed reference |
ELEMENT_TYPE_I | 0x18 | System.IntPtr |
ELEMENT_TYPE_U | 0x19 | System.UIntPtr |
ELEMENT_TYPE_FNPTR | 0x1B | A pointer to a function, followed by full method signature |
ELEMENT_TYPE_OBJECT | 0x1C | System.Object |
ELEMENT_TYPE_SZARRAY | 0x1D | A single-dimensional, zero lower-bound array type modifier. |
ELEMENT_TYPE_MVAR | 0x1E | Generic parameter in a generic method definition, represented as number |
ELEMENT_TYPE_CMOD_REQD | 0x1F | Required modifier, followed by a TypeDef or TypeRef token |
ELEMENT_TYPE_CMOD_OPT | 0x20 | Optional modifier, followed by a TypeDef or TypeRef token |
ELEMENT_TYPE_INTERNAL | 0x21 | Implemented within the CLI |
ELEMENT_TYPE_MODIFIER | 0x40 | ORed with following element types |
ELEMENT_TYPE_SENTINEL | 0x41 | Sentinel for vararg method signature |
ELEMENT_TYPE_PINNED | 0x45 | Denotes a local variable that points at a pinned object |
| 0x50 | Indicates an argument of type System.Type |
| 0x51 | Used in custom attributes to specify a boxed object (§23.3 in the ECMA-335 specification). |
| 0x52 | Reserved |
| 0x53 | Used in custom attributes to indicate a FIELD (§22.10, §23.3 in the ECMA-335 specification). |
| 0x54 | Used in custom attributes to indicate a PROPERTY (§22.10, §23.3 in the ECMA-335 specification). |
| 0x55 | Used in custom attributes to specify an enum (§23.3 in the ECMA-335 specification). |
We have almost all the preparation behind us and can now start talking about signatures, but there are still a few things worth mentioning that may not be obvious to everybody. First, almost all integers in signatures are compressed. Second, every signature begins with the size (in bytes) that it occupies on the #Blob heap; of course, this value is also stored using integer compression. Last but not least, the values that locate a signature on the #Blob heap are absolute, i.e., you do not have to add or subtract anything to the value (such as the one in the red circle in Picture 1) to find the signature on the heap.
Also keep in mind that when you recompile the attached source code (even without modifying it), signatures in the resulting assembly may change their offsets.
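To make the "size prefix plus absolute offset" rule concrete, here is a minimal C# sketch that pulls one signature out of a #Blob heap held in memory. `ReadSignature` and the sample heap contents are hypothetical, and the sketch assumes the size fits into a one-byte compressed integer and that top-level statements (C# 9 or later) are available.

```csharp
using System;

// Returns the signature bytes that follow the size prefix at the given heap offset.
// Assumption: the size is a one-byte compressed integer (below 0x80).
static byte[] ReadSignature(byte[] heap, int offset)
{
    int size = heap[offset];
    if (size > 0x7F)
        throw new NotSupportedException("multi-byte size prefix is not handled in this sketch");
    return heap.AsSpan(offset + 1, size).ToArray();
}

// Hypothetical heap: ten filler bytes, then the blob 02 06 08 at offset 0x0A.
byte[] sampleHeap = new byte[0x0D];
sampleHeap[0x0A] = 0x02;   // size of the signature
sampleHeap[0x0B] = 0x06;
sampleHeap[0x0C] = 0x08;

// The offset taken from a Signature column is used as-is, with no base added.
Console.WriteLine(BitConverter.ToString(ReadSignature(sampleHeap, 0x0A))); // 06-08
```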
Because this article is meant as a guide, in this chapter we will discuss all signatures byte by byte, beginning with the simplest and ending with the most advanced. Each signature is accompanied by a description, a diagram or syntax copied from the specification, and a set of examples whose complete binaries and sources can be downloaded at the top of this article. Where possible, the examples are written in C#, otherwise in CIL (formerly MSIL).
As stated above, we begin with the simplest signatures. One of them is the FieldSig signature, which mainly describes a field's type and the custom modifiers attached to the field; it is indexed by the Field.Signature column. Of course, a FieldSig starts with the size of the entire signature; next come the FIELD prolog, which has the constant value 0x06, zero or more custom modifiers, and the field's type. The syntax diagram for FieldSig is shown below, in Picture 2.
NOTE: Please do not confuse custom modifiers with custom attributes! These are completely different things. Because custom modifiers form part of several signatures, they will be discussed in the next chapter. The examples in the current chapter do not use any custom modifiers.
Picture 2. The FieldSig signature syntax diagram
Example 1
This example is straightforward: we have created a simple field of type int32, as below:
public int TestField;
Now we have to load the binary assembly FieldSig\1.dll into CFF Explorer and go to the Field table to find the row associated with our field (there should be only one); the picture below should help you a little.
Picture 3. TestField field's row in the Field metadata table
We found it! Let us go to 0x000A in #Blob.
Picture 4. FieldSig signature explored by the CFF Explorer program
Now we will dissect the signature byte by byte in the table below:
Offset | Value | Meaning |
0x0A | 0x02 | Signature size |
0x0B | 0x06 | Prolog |
0x0C | 0x08 | Field's type value is int32 , see constants |
Example 2
This time, we change the field's type to string, as the following code listing shows:
public string TestField;
The FieldSig signature for our TestField field is still located at 0x000A, and as you can see, only the last byte has changed, from 0x08 to 0x0E.
Offset | Value | Meaning |
0x0A | 0x02 | Signature size |
0x0B | 0x06 | Prolog |
0x0C | 0x0E | Field's type value is string , see constants |
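The two tables above can be checked mechanically. The following C# sketch decodes exactly the three-byte blobs from Examples 1 and 2; `DescribeFieldSig` is a hypothetical helper, it knows only the two element types used here, and it does not handle custom modifiers or multi-byte sizes. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;

// Only the element types used in the two examples; see the constants table.
var names = new Dictionary<byte, string> { [0x08] = "int32", [0x0E] = "string" };

// Expected layout: [size][FIELD prolog 0x06][type].
string DescribeFieldSig(byte[] blob)
{
    if (blob[0] != blob.Length - 1)
        throw new FormatException("size byte does not match the blob length");
    if (blob[1] != 0x06)
        throw new FormatException("missing FIELD prolog");
    return names.TryGetValue(blob[2], out var name) ? name : $"0x{blob[2]:X2}";
}

Console.WriteLine(DescribeFieldSig(new byte[] { 0x02, 0x06, 0x08 })); // int32
Console.WriteLine(DescribeFieldSig(new byte[] { 0x02, 0x06, 0x0E })); // string
```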
The PropertySig signature is indexed by the Property.Type column. It stores information about a property: the number of parameters supplied to the property in order to get data, zero or more custom modifiers, the type of the returned value and the type of each supplied parameter. There is also one new thing in the PropertySig signature, namely the HASTHIS flag (of constant value 0x20). It indicates whether, at run-time, the called method is passed a pointer to the target object as its first argument (the this pointer). As you can deduce, the HASTHIS flag is set when the property (in fact, its getter and setter) is instance or virtual, and is not set when the property is static. The flag (if set) is ORed together with the signature's PROPERTY prolog value (0x08). Below you can see the full syntax diagram for this signature.
Picture 5. The PropertySig signature syntax diagram
Example 1
The first example is trivial: we have created one instance property of type int32, as shown below:
public int TestProperty { get; set; }
The signature begins at offset 0x001A on the #Blob heap.
Offset | Value | Meaning |
0x1A | 0x03 | Signature size. |
0x1B | 0x28 | Prolog ORed with HASTHIS constant, because 0x20 OR 0x08 = 0x28 . |
0x1C | 0x00 | Number of parameters supplied to the property's getter method, see Picture 5 above. |
0x1D | 0x08 | The type of property's return value (int32 ), see constants. |
Example 2
This example is a little more complicated because it uses an indexed property (an indexer), which returns a different value depending on the parameters supplied to it. As you can see below, such a property does not have a name in C#, but in the Property metadata table it is declared as Item by default. You can define only one indexed property per class/structure, but you can overload it.
public int this [int Param1, string Param2]
{
get { return 0; }
set { }
}
The signature of the above property resides at offset 0x001B on the #Blob heap, and is discussed in the table below:
Offset | Value | Meaning |
0x1B | 0x05 | Signature size |
0x1C | 0x28 | Property is still of instance type, so again signature's prolog is ORed with HASTHIS constant |
0x1D | 0x02 | Number of parameters supplied to the property's getter method, see Picture 5 above |
0x1E | 0x08 | The type of property's return value (int32 ), see constants |
0x1F | 0x08 | The type of property's first parameter value (int32 ), see constants |
0x20 | 0x0E | The type of property's second parameter value (string ), see constants |
Example 3
In this example, we will try to clear the HASTHIS flag by declaring the property as static.
public class TestClas
{
public static int TestProperty { get; set; }
}
This time, the above property's signature starts at offset 0x001A on the #Blob heap.
Offset | Value | Meaning |
0x1A | 0x03 | Signature size. |
0x1B | 0x08 | Prolog's constant value (only). |
0x1C | 0x00 | Number of parameters supplied to the property's getter method, see Picture 5 above. |
0x1D | 0x08 | The type of property's return value (int32 ), see constants. |
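The three PropertySig examples differ only in the prolog byte and the parameter list, which the following C# sketch makes visible. `DescribePropertySig` is a hypothetical helper that assumes one-byte compressed integers, one-byte types and no custom modifiers, which holds for the blobs shown above but not in general. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;

// Expected layout: [size][PROPERTY 0x08, optionally ORed with HASTHIS 0x20][ParamCount][Type][Param...]
static string DescribePropertySig(byte[] blob)
{
    byte prolog = blob[1];
    if ((prolog & ~0x20) != 0x08)
        throw new FormatException("missing PROPERTY prolog");
    string kind = (prolog & 0x20) != 0 ? "instance" : "static";
    int paramCount = blob[2];
    string[] paramTypes = new string[paramCount];
    for (int i = 0; i < paramCount; i++)
        paramTypes[i] = $"0x{blob[4 + i]:X2}";      // parameters follow the return type
    string list = string.Join(", ", paramTypes);
    return $"{kind} 0x{blob[3]:X2} ({list})";
}

Console.WriteLine(DescribePropertySig(new byte[] { 0x03, 0x28, 0x00, 0x08 }));             // instance 0x08 ()
Console.WriteLine(DescribePropertySig(new byte[] { 0x05, 0x28, 0x02, 0x08, 0x08, 0x0E })); // instance 0x08 (0x08, 0x0E)
Console.WriteLine(DescribePropertySig(new byte[] { 0x03, 0x08, 0x00, 0x08 }));             // static 0x08 ()
```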
As the name implies, the MethodDefSig signature stores information related to methods defined in the current assembly, such as the calling convention, the number of generic parameters, the number of normal parameters, the return type and the type of each parameter supplied to the method. It is indexed by the MethodDef.Signature column.
Picture 6. The MethodDefSig signature syntax diagram
Additionally, some flags are used (listed in the table below). They are ORed together and placed in the second byte of the signature (the first is the size of the signature).
Name | Value | Meaning |
HASTHIS | 0x20 | First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the previous subsection. |
EXPLICITTHIS | 0x40 | Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set. |
DEFAULT | 0x00 | The default managed calling convention, which lets the Common Language Runtime determine how the call is made. A flags byte of exactly 0x00 (no HASTHIS) denotes a static method. |
VARARG | 0x05 | Specifies the calling convention for methods with variable arguments. |
GENERIC | 0x10 | Method has one or more generic parameters. |
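Because the flags are ORed into a single byte, they can be taken apart again with bit masks. The C# sketch below is an illustration under one assumption that the table does not spell out: that the calling convention (DEFAULT or VARARG) occupies the low four bits, while HASTHIS, EXPLICITTHIS and GENERIC are individual bits. `DescribeFlags` is a hypothetical helper, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;

static string DescribeFlags(byte flags)
{
    var parts = new List<string>();
    if ((flags & 0x20) != 0) parts.Add("HASTHIS");
    if ((flags & 0x40) != 0) parts.Add("EXPLICITTHIS");
    if ((flags & 0x10) != 0) parts.Add("GENERIC");

    // Assumption: the low four bits hold the calling convention.
    int convention = flags & 0x0F;
    if (convention == 0x00) parts.Add("DEFAULT");
    else if (convention == 0x05) parts.Add("VARARG");
    else parts.Add($"0x{convention:X}");

    return string.Join(" | ", parts);
}

Console.WriteLine(DescribeFlags(0x30)); // HASTHIS | GENERIC | DEFAULT
Console.WriteLine(DescribeFlags(0x00)); // DEFAULT
Console.WriteLine(DescribeFlags(0x60)); // HASTHIS | EXPLICITTHIS | DEFAULT
Console.WriteLine(DescribeFlags(0x25)); // HASTHIS | VARARG
```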
Example 1
As usual, let us start with a simple example. This time we have created an instance method that has two generic parameters and two normal parameters; for clarity, the method has an empty body.
public void TestMethod<GenArg1, GenArg2>(int Param1, object Param2) { }
The MethodDefSig signature for the sample method lies at offset 0x000A, and looks as follows:
Offset | Value | Meaning |
0x0A | 0x06 | Signature size |
0x0B | 0x30 | Because this is instance and generic method, flags HASTHIS and GENERIC are set, 0x20 OR 0x10 = 0x30 |
0x0C | 0x02 | The number of generic parameters |
0x0D | 0x02 | The number of normal parameters |
0x0E | 0x01 | The type of the returned value (void ), see constants |
0x0F | 0x08 | First parameter's type (int32 ), see constants |
0x10 | 0x1C | Second parameter's type (object ), see constants |
Example 2
In this example, we once again demonstrate the HASTHIS flag, this time by leaving it unset. The method definition looks as below:
public class TestClas
{
public static void TestMethod(int Param1, object Param2) { }
}
The signature again lies at 0x000A on the #Blob heap, and looks like:
Offset | Value | Meaning |
0x0A | 0x05 | Signature size |
0x0B | 0x00 | Only the DEFAULT calling convention is present; because HASTHIS is not set, the method is static, and the CLR determines the calling convention used. The method is also not generic, because the GENERIC flag is not set, so the next byte specifies the number of normal (not generic) parameters supplied to the method. |
0x0C | 0x02 | The number of normal parameters |
0x0D | 0x01 | The type of the returned value (void ), see constants |
0x0E | 0x08 | First parameter's type (int32 ), see constants |
0x0F | 0x1C | Second parameter's type (object ), see constants |
Example 3
Now let us see how the EXPLICITTHIS flag works. We can turn it on by using the explicit keyword in the method definition, of course in the CIL language.
.method instance explicit void TestMethod () cil managed
{
.maxstack 2
ret
}
The MethodDefSig for the above method looks like this:
Offset | Value | Meaning |
0x01 | 0x03 | Signature size |
0x02 | 0x60 | HASTHIS and EXPLICITTHIS flags are set, because 0x20 OR 0x40 = 0x60 |
0x03 | 0x00 | The number of parameters that method takes |
0x04 | 0x01 | The type of the returned value (void ), see constants |
Example 4
In this example, we have created a method that accepts variable arguments, i.e., in addition to the normal parameters in its declaration, it accepts a variable number of parameters of variable types. Adding the vararg keyword to the method definition in the CIL language makes the method accept variable arguments, as you can see in the code listing below.
IMPORTANT: Using the params keyword in C# does not set the VARARG flag in the associated method's signature. The result of my investigation is that a method which uses the params keyword in C# is just decorated by the C# compiler with the ParamArray attribute, and the additional parameters are treated as a normal array. You can also make a method truly VARARG in C# by following this instruction, but this is not CLS compliant.
.method instance vararg void TestMethod () cil managed
{
.maxstack 2
ret
}
The method's signature is explored in the following table:
Offset | Value | Meaning |
0x01 | 0x03 | Signature size |
0x02 | 0x25 | The method is instance and accepts variable arguments, thus HASTHIS and VARARG flags are set, and so 0x20 OR 0x05 = 0x25 |
0x03 | 0x00 | The number of parameters that method takes |
0x04 | 0x01 | The type of the returned value (void ), see constants |
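The four MethodDefSig tables share one reading order: flags, an optional generic parameter count, the parameter count, the return type, then the parameters. A rough C# sketch of that walk follows. `DescribeMethodDefSig` is a hypothetical helper, and it assumes one-byte compressed integers and one-byte types, which is true for the examples above but not for arbitrary signatures. Top-level statements (C# 9 or later) are assumed.

```csharp
using System;

static string DescribeMethodDefSig(byte[] blob)
{
    int pos = 1;                                   // skip the size byte
    byte flags = blob[pos++];
    // GenParamCount is present only when the GENERIC flag (0x10) is set.
    int genericCount = (flags & 0x10) != 0 ? blob[pos++] : 0;
    int paramCount = blob[pos++];
    byte returnType = blob[pos++];
    string[] parameters = new string[paramCount];
    for (int i = 0; i < paramCount; i++)
        parameters[i] = $"0x{blob[pos++]:X2}";
    string list = string.Join(", ", parameters);
    return $"flags=0x{flags:X2} generic={genericCount} ret=0x{returnType:X2} params=[{list}]";
}

// Example 1: instance generic method.
Console.WriteLine(DescribeMethodDefSig(new byte[] { 0x06, 0x30, 0x02, 0x02, 0x01, 0x08, 0x1C }));
// flags=0x30 generic=2 ret=0x01 params=[0x08, 0x1C]

// Example 4: instance vararg method without parameters.
Console.WriteLine(DescribeMethodDefSig(new byte[] { 0x03, 0x25, 0x00, 0x01 }));
// flags=0x25 generic=0 ret=0x01 params=[]
```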
The MethodRefSig signature is very similar (if not identical) to the previously mentioned MethodDefSig, but in contrast to it, MethodRefSig describes a method's calling convention, parameters, etc., at the point where the method is called (also known as the call site). The signature is indexed by the MemberRef.Signature column. If the method does not accept variable arguments, it is identical to MethodDefSig and shall match exactly the signature specified in the definition of the target method; otherwise, it is as below:
Picture 7. The MethodRefSig signature syntax diagram
As you can see, when you call a VARARG method, its associated MethodRefSig contains one additional constant, namely SENTINEL. This value has only one simple aim: it denotes the end of the required parameters supplied to the method and the beginning of the additional (variable) parameters; you can find more information about sentinel values here. Also notice that the ParamCount integer indicates the total number of parameters supplied to the method. The table below lists the abbreviations used in the MethodRefSig signature when it differs from MethodDefSig.
Name | Value | Meaning |
HASTHIS | 0x20 | First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the subsection 4.2. |
EXPLICITTHIS | 0x40 | Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set. |
VARARG | 0x05 | Specifies the calling convention for methods with variable arguments. |
SENTINEL | 0x41 | Denotes end of required parameters. |
Example 1
To convince you that, when calling a non-VARARG method, there is no difference between MethodDefSig and its associated MethodRefSig signature, I have created the following code:
public void TestMethod(int Param1, string Param2) { }
new TestClass().TestMethod(0, "A simple parameter");
Now let us look at TestMethod's MethodDefSig signature, which resides in the MethodRefSig\1a.dll file.
Offset | Value | Meaning |
0x0A | 0x05 | Signature size |
0x0B | 0x20 | The method is instance, thus HASTHIS flag is set which means that first argument passed to the method is the this pointer. |
0x0C | 0x02 | Method takes exactly two parameters. |
0x0D | 0x01 | The type of the returned value (void ), see constants |
0x0E | 0x08 | First parameter's type (int32 ), see constants |
0x0F | 0x0E | Second parameter's type (string ), see constants |
Its related MethodRefSig looks exactly the same, but lies at a different offset.
Offset | Value | Meaning |
0x13 | 0x05 | Signature size |
0x14 | 0x20 | The method is instance, thus HASTHIS flag is set which means that first argument passed to the method is the this pointer. |
0x15 | 0x02 | Method takes exactly two parameters. |
0x16 | 0x01 | The type of the returned value (void ), see constants |
0x17 | 0x08 | First parameter's type (int32 ), see constants |
0x18 | 0x0E | Second parameter's type (string ), see constants |
Example 2
In this example, we will demonstrate how the MethodRefSig signature deals with calls to VARARG methods. For this purpose, we have created a truly VARARG method that takes one required parameter followed by variable parameters. Remember that using the params keyword in C# does not set the VARARG flag in the associated method's signature, because params just decorates a method with the ParamArray attribute, and the additional parameters are treated as an array of objects of some type. In order to set the VARARG flag in the signature, you have to add __arglist to the method definition as the last parameter, but this is not CLS compliant. For more information, go here.
[CLSCompliant(false)]
public void TestMethod(string RequiredParam, __arglist)
{
Console.WriteLine("Required parameter is: " + RequiredParam);
Console.WriteLine("Additional parameters are: ");
ArgIterator argumentIterator = new ArgIterator(__arglist);
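// GetRemainingCount() shrinks with every GetNextArg() call, so loop until it reaches zero.
while (argumentIterator.GetRemainingCount() > 0)
{
// The caller below passes int32 values, so read them as int.
Console.WriteLine(__refvalue(argumentIterator.GetNextArg(), int));
}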
}
Now it is time to call our method from a separate assembly. The method is called with one required argument of type string and two additional arguments of type int32, like below:
[CLSCompliant(false)]
public void TestRunMethod()
{
new TestClass().TestMethod(
"I am required parameter.",
__arglist(0, 1));
}
I have discovered that for the above call, there are two rows in the MemberRef table. I do not know why this is so, but I know that the signature from the first encountered row has the HASTHIS flag set yet does not contain any information about the variable arguments that have been supplied to the method; the specification does not say anything about this strange behaviour. The signature indexed by the second row is OK, so let us look at it.
Offset | Value | Meaning |
0x23 | 0x07 | Signature size |
0x24 | 0x25 | The method is instance and accepts variable parameters, hence HASTHIS OR VARARG = 0x20 OR 0x05 = 0x25 |
0x25 | 0x03 | Total number of parameters supplied to method is 3, one required and two additional |
0x26 | 0x01 | The type of the returned value (void ), see constants |
0x27 | 0x0E | First required parameter's type (string ), see constants |
0x28 | 0x41 | SENTINEL constant, all parameters after this value are additional |
0x29 | 0x08 | First additional parameter's type (int32 ), see constants |
0x2A | 0x08 | Second additional parameter's type (int32), see constants |
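The role of SENTINEL becomes clearer when the parameter list from the table above is split programmatically. The C# sketch below walks the same eight bytes and sorts each type into "required" or "additional" depending on whether SENTINEL (0x41) has been seen. It is an illustration only, assuming one-byte compressed integers and one-byte types, and top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// Size, flags (HASTHIS | VARARG), ParamCount, return type, then the parameters.
byte[] blob = { 0x07, 0x25, 0x03, 0x01, 0x0E, 0x41, 0x08, 0x08 };

int pos = 2;                        // skip the size and flags bytes
int paramCount = blob[pos++];       // counts required and additional parameters together
byte returnType = blob[pos++];

var required = new List<byte>();
var additional = new List<byte>();
bool afterSentinel = false;
while (required.Count + additional.Count < paramCount)
{
    byte current = blob[pos++];
    if (current == 0x41)            // SENTINEL is a marker, not a parameter
    {
        afterSentinel = true;
        continue;
    }
    (afterSentinel ? additional : required).Add(current);
}

Console.WriteLine($"return type: 0x{returnType:X2}");                          // return type: 0x01
Console.WriteLine("required: " + BitConverter.ToString(required.ToArray()));     // required: 0E
Console.WriteLine("additional: " + BitConverter.ToString(additional.ToArray())); // additional: 08-08
```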
The StandAloneMethodSig signature is very similar to MethodRefSig: it provides a call site signature for a method, but there are two key differences. The first is that StandAloneMethodSig can specify an unmanaged target method; it is usually created as preparation for executing the calli instruction, which invokes either managed or unmanaged code. The second is that the signature is indexed by the StandAloneSig.Signature column, which is the only column in the StandAloneSig metadata table. What is more, rows in this table are not referenced by any other table (that is why it is called "stand alone"); the table is filled by code generators. The signature at the StandAloneSig.Signature column shall be either a StandAloneMethodSig signature, one for each execution of the calli instruction, or a LocalVarSig signature, which describes the local variables of a method and will be clarified in the next subsection. The syntax diagram for the StandAloneMethodSig signature is as follows:
Picture 8. The StandAloneMethodSig signature syntax diagram
Because this signature differs from the MethodRefSig signature only in that StandAloneMethodSig can call unmanaged methods, a few other constants were added that describe the calling conventions used to invoke unmanaged methods.
IMPORTANT: As you will see soon, managed and unmanaged code use different calling conventions for invoking methods that accept variable parameters, and the diagram for each case may look different. The VARARG calling convention invokes managed methods accepting variable parameters; in this case the signature has additional elements, SENTINEL and one or more Param (shaded boxes). The C calling convention also invokes methods accepting variable parameters (in unmanaged code), but the signature for this case ends just before the Param element. From my observations, the compiler generates signatures as stated above. Unfortunately, my sample code compiles but throws an exception, and I do not know where the problem is, so I cannot say with certainty that my observations are correct. Moreover, the specification is not clear:
"Two separate diagrams have been combined into one in this diagram, using shading to distinguish between them. Thus, for the following calling conventions: DEFAULT (managed), STDCALL, THISCALL and FASTCALL (unmanaged), the signature ends just before the SENTINEL item (these are all non vararg signatures). However, for the managed and unmanaged vararg calling conventions: VARARG (managed) and C (unmanaged), the signature can include the SENTINEL and final Param items (they are not required, however). These options are indicated by the shading of boxes in the syntax diagram."
Do you see that? Why is the C box not shaded if using the C calling convention may add SENTINEL and Param elements when calling an unmanaged method that accepts variable arguments? Under what circumstances are Param elements not required? The calli instruction occurs very rarely in 100% properly working code that calls an unmanaged method (the 392 assemblies in my GAC execute the calli instruction only twice, and only against managed methods!), so I cannot say that my explanations of the sample code in this subsection are absolutely true. If somebody knows what the StandAloneMethodSig signature looks like when correctly calling an unmanaged method (either accepting or not accepting variable arguments; in both cases, my code throws an exception), please let me know; I would be very grateful.
Name | Value | Meaning |
HASTHIS | 0x20 | First argument passed to a method is the this pointer, this flag is set when method is instance or virtual. You can also see explanation of the HASTHIS flag in the subsection 4.2. |
EXPLICITTHIS | 0x40 | Specification says: "Normally, a parameter list (which always follows the calling convention) does not provide information about the type of the this pointer, since this can be deduced from other information. When the combination instance explicit is specified, however, the first type in the subsequent parameter list specifies the type of the this pointer and subsequent entries specify the types of the parameters themselves." Please note that if EXPLICITTHIS is set HASTHIS must also be set. |
DEFAULT | 0x00 | The default calling convention for managed methods, which lets the Common Language Runtime determine how the call is made. A flags byte of exactly 0x00 (no HASTHIS) denotes a static method. |
VARARG | 0x05 | Specifies the calling convention for managed methods with variable arguments. |
C | 0x01 | Calling convention for unmanaged method targets. Parameters are passed from right to left. The caller of a method performs stack cleanup. Only this calling convention allows invoking unmanaged methods that have variable parameters (vararg is for managed methods). You can use this calling convention by adding the unmanaged cdecl keyword to a method definition in the CIL language. |
STDCALL | 0x02 | Calling convention for unmanaged method targets. Parameters are passed from right to left. The called method performs stack cleanup. You can use this calling convention by adding the unmanaged stdcall keyword to a method definition in the CIL language. |
THISCALL | 0x03 | Calling convention for unmanaged method targets. Parameters are passed from right to left. The called method performs stack cleanup. The this pointer is placed in the ECX register. You can use this calling convention by adding the unmanaged thiscall keyword to a method definition in the CIL language. |
FASTCALL | 0x04 | Calling convention for unmanaged method targets. Some parameters are placed in the ECX and EDX registers; the rest of the arguments are pushed onto the stack from right to left. The called method performs stack cleanup. You can use this calling convention by adding the unmanaged fastcall keyword to a method definition in the CIL language. |
SENTINEL | 0x41 | Denotes the end of the required parameters. |
NOTE: In contrast to CL (the Microsoft C/C++ compiler), ILASM (the Microsoft CIL compiler) does not add any special characters (such as "@", "_", "?", etc.) to a method name when any of the calling conventions for unmanaged targets is used. The CIL compiler does not decorate method names because it only generates bytecode, which is later compiled into machine code by the CLR's just-in-time compiler. So when you choose a calling convention while coding in CIL, ILASM does not determine who (the caller or the called method) cleans the stack, does not determine the order in which arguments are passed to a method, and does not change method names; all of this is done during JIT compilation / optimization. If this is unfamiliar territory, you can read Nemanja Trifunovic's article entitled Calling Conventions Demystified, which thoroughly describes the different calling conventions for C and C++, their meaning, how they work, etc.
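The flag and calling convention values from the table above share a single byte, so it may help to see how that byte could be taken apart. The following Python sketch is only an illustration: the function name `describe_calling_convention` and the `KINDS` table are hypothetical, and it assumes the convention kind occupies the low four bits of the byte while HASTHIS and EXPLICITTHIS are independent flag bits.

```python
# Flag bits and calling convention kinds, copied from the table above.
HASTHIS = 0x20
EXPLICITTHIS = 0x40
KINDS = {
    0x00: "DEFAULT",
    0x01: "C",
    0x02: "STDCALL",
    0x03: "THISCALL",
    0x04: "FASTCALL",
    0x05: "VARARG",
}

def describe_calling_convention(byte):
    """Split the first signature byte into (flags, convention kind)."""
    flags = []
    if byte & HASTHIS:
        flags.append("HASTHIS")
    if byte & EXPLICITTHIS:
        flags.append("EXPLICITTHIS")
    # Assumption: the convention kind is stored in the low four bits.
    kind = KINDS.get(byte & 0x0F, "UNKNOWN")
    return flags, kind

print(describe_calling_convention(0x00))  # ([], 'DEFAULT')
print(describe_calling_convention(0x25))  # (['HASTHIS'], 'VARARG')
```

SENTINEL (0x41) is deliberately left out of the lookup, because it appears inside the parameter list rather than in the first byte of the signature.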
Example 1
The sample code listing contains two managed methods. The first method has one fixed parameter of type int32 and also returns int32 (in fact, it does not return anything, since no value is pushed onto the evaluation stack). The second method simply calls the first one, as you can see below:
.method public static int32 TestMethod(int32 required)
{
ret
}
.method public static void TestRunMethod()
{
.maxstack 8
ldc.i4.1
ldftn int32 TestMethod(int32)
calli int32(int32)
ret
}
Before TestRunMethod calls TestMethod, it pushes one int32 value (the argument) onto the evaluation stack using the ldc.i4.1 instruction, then pushes a pointer to the first method onto the evaluation stack with the ldftn instruction. Finally, it calls our "do nothing" managed method by executing calli; this last instruction generates the StandAloneMethodSig signature, which is explained in the table below:
Offset | Value | Meaning |
0x01 | 0x04 | Signature size |
0x02 | 0x00 | The method does not use any specific calling convention and is not an instance method, since the HASTHIS flag is not set. |
0x03 | 0x01 | The method takes one fixed parameter and zero variable parameters. |
0x04 | 0x08 | The type of the returned value (int32), see constants |
0x05 | 0x08 | First required parameter's type (int32), see constants |
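The byte layout in the table above can be read mechanically. The sketch below, in Python, walks the five bytes of this example in the same order as the table. The names `decode_fixed_sig` and `ELEMENT_TYPES` are hypothetical, and the code assumes that the size and the parameter count each fit in a single byte (true for compressed integers below 0x80) and that every type is a one-byte primitive constant.

```python
# Only the type constants that appear in this article's examples.
ELEMENT_TYPES = {0x01: "void", 0x08: "int32"}

def decode_fixed_sig(blob):
    """Decode a size-prefixed method signature that has no variable arguments."""
    size = blob[0]                      # offset 0x01: signature size
    body = blob[1:1 + size]
    if len(body) != size:
        raise ValueError("blob is shorter than its declared size")
    convention = body[0]                # offset 0x02: calling convention byte
    param_count = body[1]               # offset 0x03: number of parameters
    return_type = body[2]               # offset 0x04: return type
    params = body[3:3 + param_count]    # offset 0x05 onwards: parameter types
    name = lambda b: ELEMENT_TYPES.get(b, hex(b))
    return {
        "convention": convention,
        "param_count": param_count,
        "return_type": name(return_type),
        "params": [name(b) for b in params],
    }

example_1 = bytes([0x04, 0x00, 0x01, 0x08, 0x08])
print(decode_fixed_sig(example_1))
# {'convention': 0, 'param_count': 1, 'return_type': 'int32', 'params': ['int32']}
```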
Example 2
In this example, we make the sample method accept variable arguments and call it through calli with one required and one additional parameter. The fixed parameters are separated from the additional parameters by an ellipsis (...), as seen below:
.method public hidebysig static vararg void TestMethod(int32 required)
{
ret
}
.method public hidebysig static void TestRunMethod()
{
.maxstack 3
ldc.i4.1
ldc.i4.2
ldftn vararg void TestMethod(int32, ..., int32)
calli vararg void(int32, ..., int32)
ret
}
In this case, the signature generated by the calli instruction looks the same as the MethodRefSig signature discussed in the previous subsection. Let us take a look:
Offset | Value | Meaning |
0x01 | 0x06 | Signature size |
0x02 | 0x05 | The method is static and accepts variable arguments |
0x03 | 0x02 | The total number of parameters supplied to the method is 2: one required and one additional |
0x04 | 0x01 | The type of the returned value (void), see constants |
0x05 | 0x08 | First required parameter's type (int32), see constants |
0x06 | 0x41 | SENTINEL constant, all parameters after this value are additional |
0x07 | 0x08 | First additional parameter's type (int32), see constants |
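To make the role of SENTINEL in the table above concrete, the following Python sketch splits the parameter types of this blob into required and additional ones. The function name `split_params` is hypothetical, and the code again assumes a single-byte size, a single-byte parameter count and one-byte primitive types, so it is not a general signature parser.

```python
SENTINEL = 0x41  # marks the end of the required parameters

def split_params(blob):
    """Return (required, additional) parameter type bytes of a vararg signature."""
    size = blob[0]
    body = blob[1:1 + size]
    param_count = body[1]
    type_bytes = body[3:]          # everything after the return type
    required, additional = [], []
    target = required
    for b in type_bytes:
        if b == SENTINEL:
            # Every type after the sentinel is an additional (variable) argument.
            target = additional
            continue
        target.append(b)
    if len(required) + len(additional) != param_count:
        raise ValueError("parameter count does not match the listed types")
    return required, additional

example_2 = bytes([0x06, 0x05, 0x02, 0x01, 0x08, 0x41, 0x08])
print(split_params(example_2))  # ([8], [8])
```

Note that the SENTINEL byte itself is not counted as a parameter: the count at offset 0x03 is 2, while three bytes follow the return type.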
Example 3
This is the most problematic sample in the entire article. The method in the sample code below calls an unmanaged method that accepts variable arguments. The code compiles but throws a TypeLoadException ("The signature is incorrect"); unfortunately, the specification is not clear about this case (see the important note at the beginning of this subsection). As in the previous example, the sample code calls a method that accepts variable arguments, but this time the called method is unmanaged.
.method public hidebysig static unmanaged cdecl void TestMethod(int32 required, ...)
{
ret
}
.method public hidebysig static void TestRunMethod()
{
.maxstack 3
ldc.i4.1
ldc.i4.2
ldftn unmanaged cdecl void TestMethod(int32, ...)
calli unmanaged cdecl void(int32, ...)
ret
}
The signature generated by calli is very strange: it ends just before the first additional Param element that we supplied to the method.
Offset | Value | Meaning |
0x01 | 0x05 | Signature size |
0x02 | 0x01 | The method is static and unmanaged; the calling convention is C (set by the unmanaged cdecl keyword), so the method can accept variable arguments. |
0x03 | 0x01 | The total number of parameters supplied to the method is 1: the required one is counted and the additional one is omitted (I do not have the slightest idea why) |
0x04 | 0x01 | The type of the returned value (void), see constants |
0x05 | 0x08 | First required parameter's type (int32), see constants |
0x06 | 0x41 | SENTINEL constant, all parameters after this value are additional; unfortunately, there are no additional arguments after this value. If you know why this is so, please contact me. |
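The oddity in the table above can be detected mechanically: the blob ends with SENTINEL and no type follows it. The Python sketch below flags exactly that condition. The name `has_dangling_sentinel` is hypothetical, the check assumes single-byte size and count fields and one-byte primitive types, and it makes no claim about which check the runtime itself performs before throwing the exception.

```python
SENTINEL = 0x41  # marks the end of the required parameters

def has_dangling_sentinel(blob):
    """True if the signature ends with SENTINEL and lists no additional types."""
    size = blob[0]
    body = blob[1:1 + size]
    type_bytes = body[3:]  # parameter types after the return type
    return len(type_bytes) > 0 and type_bytes[-1] == SENTINEL

example_2 = bytes([0x06, 0x05, 0x02, 0x01, 0x08, 0x41, 0x08])
example_3 = bytes([0x05, 0x01, 0x01, 0x01, 0x08, 0x41])

print(has_dangling_sentinel(example_2))  # False
print(has_dangling_sentinel(example_3))  # True
```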
That is it for now; the next part can be found here.