Abstract
This article explores some of the issues involved in using unmanaged-memory libraries from the managed-memory environment of .NET. We look directly at the memory address (pointer) interfaces provided by these libraries and present techniques in C# for manipulating this memory in .NET.
Introduction
Managed memory and garbage collection are a godsend in .NET, but as development teams move to this environment, a large number of legacy (unmanaged memory) libraries still exist that provide needed functionality right now. There is too large an investment in these libraries, and too large an investment needed to rewrite them, to just discard them. The good news is that .NET provides some tools to access these libraries even in the "keep your hands off" managed memory environment. These tools are not without their caveats though, and this article demonstrates some techniques for accessing and using libraries whose interfaces are unmanaged memory, specifically pointers. This article is not intended to be a complete primer or tutorial on data marshaling, so a basic understanding of what the Marshal class (System.Runtime.InteropServices.Marshal) does is assumed. What we are dealing with here is specifically memory or pointer interfaces.
Emulating Pointers
IntPtr is the type used to represent system pointers in the managed memory environment. So if we were to call a function that passes data back through a pointer to a data structure, we could import the function as shown here:
[DllImport("Legacy.dll")]
public static extern void GetData(IntPtr pDataRecord);
After an invocation of this function, we have a pointer to our data structure, so how do we access this information? In C++, we might have used memcpy to copy this data into our data structures, or we might just have used pointer math (offsets) to copy individual variables. But accessing an offset from an IntPtr is not so straightforward in C#. The compiler does not allow two IntPtrs to be added, as illustrated by the following compiler error message:
Operator '+' cannot be applied to operands of type 'System.IntPtr' and 'System.IntPtr'
Suppose you needed to read a 32-bit integer at an offset of 4 bytes from this pointer. The Marshal class provides a ReadInt32 method that allows this:
Int32 i = Marshal.ReadInt32(pDataRecord, 4);
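To try offset reads without Legacy.dll, the following minimal sketch allocates a small unmanaged buffer itself, writes two integers into it, and reads one back by byte offset. The class name OffsetReadDemo and the stored values are illustrative assumptions and are not part of the legacy library.

```csharp
using System;
using System.Runtime.InteropServices;

public static class OffsetReadDemo
{
    // Writes two 32-bit integers into unmanaged memory and returns the one
    // stored at the requested byte offset (0 or 4).
    public static int ReadAtOffset(int byteOffset)
    {
        IntPtr buffer = Marshal.AllocHGlobal(8);   // room for two Int32 values
        try
        {
            Marshal.WriteInt32(buffer, 0, 1234);   // first value at offset 0
            Marshal.WriteInt32(buffer, 4, 5678);   // second value at offset 4
            return Marshal.ReadInt32(buffer, byteOffset);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);           // always release the block
        }
    }
}
```

Calling OffsetReadDemo.ReadAtOffset(4) returns 5678, the value that was written at byte offset 4.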
This works, and there are many more Marshal methods that read data from memory offsets. You can get around the compiler error and still do your pointer math by doing the following:
// ToInt64 avoids the OverflowException that ToInt32 can throw in a 64-bit process
IntPtr ip = new IntPtr(pDataRecord.ToInt64() + 4);
This is probably not what was intended by the compiler developers, but it does work (.NET Framework 4 and later also offer IntPtr.Add for the same purpose). It does, however, expose your code to the same risks and maintenance issues as C++-style pointers and pointer errors. You would be well served to use the built-in functionality of the Marshal class where possible.
Marshalling Data Structures
Since data is passed to us as data structures, we should set up compatible data structures in our code, allowing the use of the Marshal class and its methods to simplify accessing this data. Suppose the data structure is defined in the unmanaged code like this:
typedef struct DataRecord
{
int RecordBufferSize;
int NumberOfFields;
int NumberOfElements;
int RecordType;
UINT RecordInfoFlags;
char* FieldDescription;
} DataRecord, *PDataRecord;
Our data structure in C# would be defined like this:
[StructLayout(LayoutKind.Sequential,CharSet=CharSet.Ansi,Pack=2)]
public struct DataRecord
{
[MarshalAs(UnmanagedType.I4)] public int RecordBufferSize;
[MarshalAs(UnmanagedType.I4)] public int NumberOfFields;
[MarshalAs(UnmanagedType.I4)] public int NumberOfElements;
[MarshalAs(UnmanagedType.I4)] public int RecordType;
[MarshalAs(UnmanagedType.U4)] public uint RecordInfoFlags;
public string FieldDescription;
}
Notice the StructLayout attribute. There are three options for defining the type of layout to be used:
- Auto – Chosen by the runtime, but objects cannot be exposed outside of managed memory.
- Explicit – The offset of each variable is defined by using the FieldOffsetAttribute.
- Sequential – The variables are laid out sequentially as they appear, with the packing specified by StructLayoutAttribute.Pack.
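The effect of these layout options can be checked with Marshal.SizeOf and Marshal.OffsetOf. The sketch below uses two hypothetical structures (DefaultPacked and BytePacked, neither of which comes from the legacy library) that differ only in their Pack value.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example structures; they differ only in packing.
[StructLayout(LayoutKind.Sequential)]
public struct DefaultPacked
{
    public byte Tag;     // offset 0
    public int Value;    // padded up to offset 4
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct BytePacked
{
    public byte Tag;     // offset 0
    public int Value;    // offset 1, no padding inserted
}

public static class LayoutDemo
{
    // Unmanaged offset of the Value field in each structure.
    public static int DefaultValueOffset()
    {
        return Marshal.OffsetOf(typeof(DefaultPacked), "Value").ToInt32();
    }

    public static int PackedValueOffset()
    {
        return Marshal.OffsetOf(typeof(BytePacked), "Value").ToInt32();
    }

    // Unmanaged sizes in bytes: 8 with default packing, 5 with Pack = 1.
    public static int DefaultSize() { return Marshal.SizeOf(typeof(DefaultPacked)); }
    public static int PackedSize() { return Marshal.SizeOf(typeof(BytePacked)); }
}
```

Comparing these numbers against the offsets the native compiler produces is a quick way to confirm that a managed declaration matches the unmanaged one.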
Here we have used Sequential, as the order in which the variables are specified is the order in which we will receive them. An example will be given later illustrating the need for Explicit. Additionally, the string is marshaled using the ANSI character set, which is the default. Pack=2 tells the runtime to align the fields on 2-byte boundaries. The default is 8. In a 32-bit process, setting it to 2 has no effect on this structure, because every field already starts on a 4-byte boundary; in a 64-bit process it would move the pointer field from offset 24 to offset 20, so the value must match the packing the native library was compiled with. In this example, the MarshalAs attribute is used to specify the type of data that will be marshaled. It is not necessary here, because the marshaler is capable of discerning the marshaling needed from the data type, but it would be useful in the event the data needs to be marshaled as a different data type. There are two ways to create the marshaling for this function. The first is to define the import function as shown earlier:
[DllImport("Legacy.dll")]
public static extern void GetData(IntPtr pDataRecord);
We would invoke this function with a pointer to the unmanaged data structure, and then use the Marshal.PtrToStructure method to allocate a managed structure, copy the data from the memory pointed to by the IntPtr into it, and return a reference to the (boxed) object, as shown here:
// pDataRecord must already point at unmanaged memory the library can fill
// (allocation is covered under Memory Management).
UnManagedLib.GetData(pDataRecord);
UnManagedLib.DataRecord ds = (UnManagedLib.DataRecord) Marshal.PtrToStructure
(pDataRecord, typeof(UnManagedLib.DataRecord));
Alternatively, we can declare the structure itself as the parameter and let the marshaler handle the data transfer, as shown here:
[DllImport("Legacy.dll")]
public static extern void GetData(out DataRecord pDataRecord);
The call then fills in a managed structure directly:
UnManagedLib.DataRecord ds;
UnManagedLib.GetData(out ds);
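The first approach (an IntPtr followed by Marshal.PtrToStructure) can be exercised without Legacy.dll. In the sketch below, FakeGetData is a hypothetical stand-in for the native function, and RecordHeader is an assumed numeric-only subset of DataRecord, chosen so that there is no unmanaged string to clean up. The field values are made up for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical numeric-only subset of DataRecord.
[StructLayout(LayoutKind.Sequential)]
public struct RecordHeader
{
    public int RecordBufferSize;
    public int NumberOfFields;
    public int NumberOfElements;
    public int RecordType;
    public uint RecordInfoFlags;
}

public static class PtrToStructureDemo
{
    // Stand-in for the native GetData: fills the caller's unmanaged buffer.
    private static void FakeGetData(IntPtr pRecord)
    {
        RecordHeader native = new RecordHeader();
        native.RecordBufferSize = 512;
        native.NumberOfFields = 3;
        native.NumberOfElements = 10;
        native.RecordType = 1;
        native.RecordInfoFlags = 0x80000000u;
        Marshal.StructureToPtr(native, pRecord, false);
    }

    // Allocates the buffer, lets the "library" fill it, and copies the
    // result into a managed structure before freeing the buffer.
    public static RecordHeader Fetch()
    {
        IntPtr pRecord = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(RecordHeader)));
        try
        {
            FakeGetData(pRecord);
            return (RecordHeader)Marshal.PtrToStructure(pRecord, typeof(RecordHeader));
        }
        finally
        {
            Marshal.FreeHGlobal(pRecord);
        }
    }
}
```

Because PtrToStructure copies the data, the returned RecordHeader remains valid after the unmanaged buffer has been freed.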
Suppose we have a union inside of the structure. How will that affect our marshaling? A union can be handled by changing the StructLayout mentioned above to Explicit. Assume the following C++ struct definition:
typedef struct DataVariable
{
char InternalName[16];
char Mnemonic[4];
char CurveLabel[26];
USHORT Format;
union
{
USHORT NumberOfBytes;
USHORT NumberOfElements;
};
USHORT UnitType;
USHORT SpecialHandling;
SHORT NumberOfDecimalPlaces;
USHORT OffsetInRecord;
} DataVariable, *PDataVariable;
C# does not support the union definition, but by using the Explicit StructLayout we can create the same effect if we define the structure like this:
[StructLayout(LayoutKind.Explicit,CharSet=CharSet.Ansi)]
public struct DataVariable
{
[FieldOffset(0), MarshalAs( UnmanagedType.ByValTStr , SizeConst=16)]
public string InternalName;
[FieldOffset(16), MarshalAs( UnmanagedType.ByValTStr, SizeConst=4)]
public string Mnemonic;
[FieldOffset(20), MarshalAs( UnmanagedType.ByValTStr, SizeConst=26)]
public string CurveLabel;
[FieldOffset(46), MarshalAs( UnmanagedType.U2)] public ushort Format;
[FieldOffset(48), MarshalAs( UnmanagedType.U2)] public ushort NumberOfBytes;
[FieldOffset(48), MarshalAs( UnmanagedType.U2)]
public ushort NumberOfElements;
[FieldOffset(50), MarshalAs( UnmanagedType.U2)] public ushort UnitType;
[FieldOffset(52), MarshalAs( UnmanagedType.U2)] public ushort SpecialHandling;
[FieldOffset(54), MarshalAs( UnmanagedType.I2)] public short NumberOfDecimalPlaces;
[FieldOffset(56), MarshalAs( UnmanagedType.U2)] public ushort OffsetInRecord;
}
The location or offset of each field in the structure is explicitly defined. Note that the fields NumberOfBytes and NumberOfElements have the same offset, thus emulating the C++ union. Also note the use of the unmanaged type ByValTStr, which is used for traditional C-style fixed-size strings embedded in structures. One caveat: explicit offsets also govern the managed layout, where reference-type fields such as string must be pointer-aligned, so on a 64-bit runtime the string at offset 20 can cause the type to fail to load with a TypeLoadException. The marshaling of this data structure is similar to the above example and won't be repeated here.
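The overlapping-offset technique can be seen in isolation with a cut-down structure. The sketch below is an assumption-laden illustration: SizeOrCount is a hypothetical reduction of DataVariable that keeps only the union and its two neighbouring fields, so that it contains no strings.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical cut-down version of DataVariable: the union and its neighbours.
[StructLayout(LayoutKind.Explicit)]
public struct SizeOrCount
{
    [FieldOffset(0)] public ushort Format;
    [FieldOffset(2)] public ushort NumberOfBytes;     // union member
    [FieldOffset(2)] public ushort NumberOfElements;  // same offset, same storage
    [FieldOffset(4)] public ushort UnitType;
}

public static class UnionDemo
{
    // Writes through one union member and reads back through the other.
    public static ushort WriteBytesReadElements(ushort value)
    {
        SizeOrCount s = new SizeOrCount();
        s.NumberOfBytes = value;
        return s.NumberOfElements;
    }
}
```

UnionDemo.WriteBytesReadElements(42) returns 42, because both fields occupy the same two bytes.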
Memory Management
Now that we've looked at the movement of data back and forth between managed and unmanaged memory, what about the management of that memory? If the unmanaged library you called allocated the memory for the structure it passed you, then that memory will remain in place (assuming the library itself does not move it). Its lifetime and freeing depend on the implementation of the library. In the managed memory environment, by contrast, it is a good assumption that memory will not remain at the same address throughout the life of the program, because the garbage collector is free to move objects. Fortunately, the Marshal class provides some methods for overcoming this. If the unmanaged library expects you to allocate the memory for the structure prior to passing it as a parameter, you will need to use the Marshal.AllocHGlobal function. Using the example from above, see the following:
IntPtr pDataRecord = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(UnManagedLib.DataRecord)));
UnManagedLib.GetData(pDataRecord);
UnManagedLib.DataRecord ds = (UnManagedLib.DataRecord) Marshal.PtrToStructure
(pDataRecord, typeof(UnManagedLib.DataRecord));
Marshal.FreeHGlobal(pDataRecord);
Memory is allocated for the structure (Marshal.SizeOf returns its unmanaged size in bytes), the pointer is passed in and used, and then the memory is freed by passing the same pointer to Marshal.FreeHGlobal. You must always be aware, however, of the expected lifetime of the memory in the library you are calling. FreeHGlobal cannot be called until the library will no longer use the memory. Again, this is determined by the implementation of the library you are interfacing with.
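One way to keep the allocate/use/free sequence together is to wrap the block in a small disposable helper. This is only a sketch under stated assumptions: UnmanagedBuffer and BufferDemo are hypothetical names, the helper has no finalizer (so the caller must dispose it), and the write and read calls merely stand in for whatever the library would do with the memory.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: owns one unmanaged block and frees it exactly once.
public sealed class UnmanagedBuffer : IDisposable
{
    public IntPtr Pointer { get; private set; }

    public UnmanagedBuffer(int size)
    {
        Pointer = Marshal.AllocHGlobal(size);
    }

    public bool IsFreed { get { return Pointer == IntPtr.Zero; } }

    public void Dispose()
    {
        if (Pointer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(Pointer);   // only once the library is done with it
            Pointer = IntPtr.Zero;          // guards against a double free
        }
    }
}

public static class BufferDemo
{
    // Uses the buffer inside a using block, then reports whether it was freed.
    public static bool RoundTrip(int value, out int readBack)
    {
        UnmanagedBuffer buffer = new UnmanagedBuffer(sizeof(int));
        using (buffer)
        {
            Marshal.WriteInt32(buffer.Pointer, value);
            readBack = Marshal.ReadInt32(buffer.Pointer);
        }
        return buffer.IsFreed;
    }
}
```

The using block frees the memory when it exits, so this pattern only fits libraries that do not hold on to the pointer after the call returns.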
Conclusion
The Marshal class provides a rich set of functionality that allows the use of legacy unmanaged-memory libraries in our new managed-memory applications. The use of pointers is obscured as much as possible while still allowing access to and manipulation of specific memory addresses, and the Marshal memory methods must be used to ensure that the runtime's memory management does not conflict with the unmanaged-memory libraries.
History
- 19th December, 2005: Initial post