Introduction
This paper explains some features of the CLR as a basis for understanding interoperation. It refers to several key system components in order to explain how the types in the System.Runtime.InteropServices namespace enable a C# program to call a native function exported from a plain C DLL. This feature is called P/Invoke, short for platform invoke.
The Common Language Runtime (CLR) is actually a classic COM server contained within a DLL. It functions as the core component of the .NET Framework. An instance of the CLR is an implementation of the Common Language Infrastructure (CLI) specification that executes code inside the bounds of a well-defined Common Type System. A .NET language like C# or VB.NET is a language whose compiler targets the CLR and emits metadata and IL code. Compiling several source code files results in a managed module, which is stored as a Portable Executable (PE) file. An assembly is a unit of deployment and is built from one or more managed modules. This module system reflects the strengths of the .NET Framework, as the Portable Executable format is based on the UNIX-founded Common Object File Format (COFF).
This design makes it possible to integrate with other languages on multiple platforms. Similar to the Java Virtual Machine, the CLR is a virtual execution environment. That is, the CLR is a system program, and the .NET Framework is the infrastructure that provides a strictly managed programming platform on top of the CLR's core services. These services include (but are not limited to): automatic memory management using a garbage-collected (GC) heap, built on top of standard Windows memory mechanisms; metadata and module conventions to control the discovery, loading, layout, and analysis of managed libraries and programs; a rich exception subsystem that enables programs to communicate and respond to failures in a structured manner; and type verification with security checks and code access security.
Having said that, the CLR, used together with certain base classes of the .NET Framework, allows managed code to call unmanaged functions defined in native code. Further, the CLR allows interoperability with both native and legacy code. This feature is called platform invoke, or P/Invoke. An understanding of this service can lead the developer to a sharper understanding of .NET and COM Interop. Just-In-Time compiling involves compiling managed code to native code, which, in a sense, defines the physical nature of the CLR: physically, the CLR is a collection of DLLs containing sophisticated algorithms that interact with Windows via calls to various Win32 and COM APIs.
Managed programs are then essentially Windows DLLs whose code bootstraps the CLR as part of the Windows executable load sequence. Inspecting a Windows process that has loaded the CLR reveals several Windows DLLs. Each version of the CLR is published with two DLLs: mscorsvr.dll and mscorwks.dll. Neither of these DLLs is a .NET assembly, and consequently neither contains metadata or IL code. Each process that executes one or more .NET executables will contain one of the two.
Mscorsvr.dll contains the version of the CLR specifically optimized for multiprocessor machines (svr stands for server). Mscorwks.dll contains the version of the CLR specifically optimized for single-processor machines (wks stands for workstation). Mscorlib.dll is an assembly, and is the other main component of the .NET Framework. As an assembly, it contains the implementations of the core base types (classes) of the .NET Framework, which is why it is called a class library.
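One way to see which DLLs a managed process has loaded is to enumerate its modules from managed code. The following is a minimal sketch, not part of the original article; the class name ListModules is illustrative. Which runtime DLL shows up depends on the installed runtime version: mscorwks.dll is expected on version 2.0, while later runtimes use different file names.

```csharp
using System;
using System.Diagnostics;

class ListModules
{
    static void Main()
    {
        // Every DLL mapped into the current process, including the CLR itself,
        // appears in the module list alongside the Win32 system DLLs.
        foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
        {
            Console.WriteLine("{0}  {1}", module.ModuleName, module.FileName);
        }
    }
}
```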
The loading of the CLR must be handled by the process itself, and involves an entity called the runtime host. Therefore, there must be some unmanaged code within the runtime host (an unmanaged application), since the CLR will be handling the managed code. These pieces of unmanaged code take care of loading the CLR, configuring it, and then transferring the current thread within the process into managed code. Once the CLR is loaded, the unmanaged application that is hosting the runtime must take care of other tasks, such as dealing with unhandled exceptions. This is an important feature because an exception can be caught by a runtime host, but there must be some method to handle that exception.
There are several commonly used runtime hosts (referred to in the .NET Framework documentation as unmanaged applications): Internet Explorer, the Console, and WinForms hosts. The point here is that interoperability is required from the very start, simply in order to load the CLR.
Once the CLR has been loaded and configured for runtime execution, the runtime host launches the assembly and loads it into the default application domain (AppDomain). Similar to a lightweight process, an application domain functions as a unit of isolation to prevent collisions with other executables within the process. Starting a Windows process involves pre-fetching and image loading, during which the shared DLLs are loaded into the process's address space. Each process has at least one thread of execution, and the code and data of the application are mapped into memory through a memory-mapped file so that the application can execute. The process, then, is a unit of abstraction that functions as a container for the resources needed to run the application. The thread of execution within the process is the concretely defined stream of code instructions.
Earlier, I wrote that the CLR is a classic COM server. Microsoft designed this component as a COM server contained in a DLL, and it was therefore written with the extra plumbing code needed to adhere to COM's strict identity rules and to register itself in the system registry. It is loaded into a Windows process that must contain unmanaged code in order to load and configure it and to transfer the current thread to managed code. Platform invoke is a service that enables managed code to call unmanaged functions implemented in DLLs. Remember, there are system DLLs and application DLLs. P/Invoke locates an exported function and marshals its arguments (integers, strings, structures, arrays, and so on) across the interoperation boundary as needed.
An Overview of Platform Invoke
Platform invoke is a service that enables managed code to call unmanaged functions implemented in DLLs, such as those in the Win32 API (note Windows internals). It locates an exported function and marshals its arguments (integers, strings, arrays, structures, and so on) across the interoperation boundary as needed. The classes allowing you to use the P/Invoke mechanism are located in the System.Runtime.InteropServices namespace. To call a function in a native DLL from a C# program, we must first declare this function within a C# class:
- Mark the declaration of the function with the DllImport attribute from the System.Runtime.InteropServices namespace, which indicates the name of the DLL.
- Use the static and extern keywords in the method declaration.
- Give the method the same name as the function exported by the DLL.
- Give a name to each argument.
A basic use of P/Invoke is to allow .NET components to interact with the Win32 API. Several commonly used DLLs in the Win32 API are:
DLL | Description of Contents
---|---
GDI32.dll | Graphics Device Interface functions for device output, such as those used for drawing and font management.
Kernel32.dll | Low-level Operating System functions for memory management and resource handling.
User32.dll | Windows management functions for message handling, timers, menus, and communications.
If you use the dumpbin.exe tool contained in the Visual Studio Tools, you can identify and locate the functions contained in the DLL:
C:\Program Files\Visual Studio 8\bin> dumpbin.exe -exports C:\Windows\System32\kernel32.dll
You can redirect the output of this command to a text file by appending the '>' operator: > kernel.txt. You will most likely find around a thousand exported functions (1027 in the listing used here; the exact count depends on the Windows version). One of them is the Beep() function. Here is the code to compile and run from the .NET Framework console:
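using System;
using System.Runtime.InteropServices;

class Program {
    // Managed declaration of the Beep() function exported by kernel32.dll.
    [DllImport("Kernel32.dll")]
    public static extern bool Beep( uint iFreq, uint iDuration );

    static void Main() {
        // Frequency in hertz, duration in milliseconds.
        bool b = Beep(100, 100);
    }
}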
At the prompt, enter the command type con > Beep.cs
The console will then accept input without showing a prompt.
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727> type con > Beep.cs
Copy and paste the above code into the console space, and hit Control-Z, and then the Enter key.
Now compile:
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727> csc.exe /r:System.dll Beep.cs
C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727> Beep.exe
And then listen to the system beep.
When platform invoke calls an unmanaged function, it performs the following sequence of actions:
- It locates the DLL containing the function
- It loads the DLL into memory
- It locates the address of the function (contained in the DLL) in memory
- It pushes the arguments onto stack, marshaling the data as needed
- It transfers control to the unmanaged function
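The sequence above can be approximated by hand from managed code. The sketch below is only an illustration of the same steps using documented Win32 and .NET calls (LoadLibrary, GetProcAddress, Marshal.GetDelegateForFunctionPointer); it is not what the CLR executes internally, and the names ManualInvoke and BeepDelegate are hypothetical. It assumes a Windows machine.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class ManualInvoke
{
    // Steps 1 and 2: locate the DLL and load it into the process.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr LoadLibrary(string fileName);

    // Step 3: look up the address of an exported function by name.
    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true, SetLastError = true)]
    static extern IntPtr GetProcAddress(IntPtr module, string procName);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool FreeLibrary(IntPtr module);

    // Managed signature matching BOOL Beep(DWORD, DWORD) (hypothetical delegate name).
    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    delegate bool BeepDelegate(uint frequency, uint duration);

    static void Main()
    {
        IntPtr module = LoadLibrary("kernel32.dll");
        if (module == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        try
        {
            IntPtr address = GetProcAddress(module, "Beep");
            if (address == IntPtr.Zero)
                throw new Win32Exception(Marshal.GetLastWin32Error());

            // Steps 4 and 5: wrap the address in a delegate; invoking it marshals
            // the arguments and transfers control to the unmanaged function.
            BeepDelegate beep = (BeepDelegate)Marshal.GetDelegateForFunctionPointer(
                address, typeof(BeepDelegate));
            beep(750, 300);
        }
        finally
        {
            FreeLibrary(module);
        }
    }
}
```

With a plain DllImport declaration, all of this bookkeeping is done by the runtime on the first call, which is why the attribute-based form is normally preferred.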
Consider the standard "Hello, World!" in C#:
using System;

public class MainApp {
    public static void Main() {
        Console.WriteLine("Hello, World!");
    }
}
If you follow the hierarchical structure of the System namespace, the code is more accurately written with fully qualified type names:
using System;

public class MainApp {
    static public void Main( System.String[] args ) {
        System.Console.WriteLine("Hello, World!");
    }
}
To pass a string to a native method using P/Invoke, you use the System.String type. Native functions that take strings typically exist in two encoding versions: an ANSI version suffixed with A and a Unicode version (UTF-16, with two-byte code units) suffixed with W. Identifying a function therefore requires both the function name and the name of the DLL that contains it. For example, specifying the MessageBox function contained in User32.dll identifies the function (MessageBox) and its container location (User32.dll). MessageBoxA is the ANSI entry point for the MessageBox function; MessageBoxW is the entry point for the Unicode version. Note: COM uses Unicode strings, so a translation of the encoding is sometimes required. To choose between the one-byte and two-byte encodings, the DllImport attribute offers a parameter named CharSet, which can take the values Auto, Unicode, and Ansi. The example below shows how to pass two strings in a call to the MessageBox() function:
using System;
using System.Runtime.InteropServices;

class Program {
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    public static extern int MessageBox( System.IntPtr hWnd,
        string text, string caption, uint type );

    static void Main() {
        MessageBox( System.IntPtr.Zero, "hello", "caption text", 0 );
    }
}
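As a variation, the Unicode entry point can be selected explicitly instead of relying on CharSet.Auto. The sketch below assumes a Windows machine; the class name NativeMethods and the message text are illustrative. EntryPoint, CharSet, and ExactSpelling are fields of the DllImport attribute.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    // Bind directly to the Unicode export. ExactSpelling stops the runtime
    // from probing for an A- or W-suffixed variant of the name.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW",
               CharSet = CharSet.Unicode, ExactSpelling = true)]
    public static extern int MessageBox(IntPtr hWnd, string text, string caption, uint type);
}

class UnicodeMessageBox
{
    static void Main()
    {
        // The strings are marshaled as UTF-16, so non-ASCII text is passed unchanged.
        NativeMethods.MessageBox(IntPtr.Zero, "Gr\u00FC\u00DFe", "Unicode entry point", 0);
    }
}
```

Naming the entry point keeps the managed method name free of the W suffix while removing any ambiguity about which native function is called.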
To reiterate, the process of calling a C-style DLL begins by declaring the function to call using the static and extern C# keywords. Notice that when you declare a C function prototype, you must express the return type and the arguments of the function in terms of managed data types. This is called type conversion, and it is a premise of interoperability. The prototype for Beep() in kernel32.dll is:
BOOL Beep(DWORD dwFreq, DWORD dwDuration);
To call Beep(), you must convert the Win32 BOOL type into a .NET bool and the Win32 DWORD into a .NET uint: the 32-bit double word (DWORD), the unsigned integer (UINT), and the unsigned long integer (ULONG) are all data types that convert to the System.UInt32 .NET type, whose C# equivalent is uint. The following table lists the type conversions to use with C#:
Win32 type | .NET type / C# equivalent
---|---
LPSTR, LPCSTR, LPWSTR | System.String / string, System.StringBuilder
BYTE | System.Byte / byte
SHORT | System.Int16 / short
WORD | System.UInt16 / ushort
DWORD, UINT, ULONG | System.UInt32 / uint
INT, LONG | System.Int32 / int
BOOL | System.Boolean / bool
CHAR | System.Char / char
FLOAT | System.Single / float
DOUBLE | System.Double / double
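
A quick way to sanity-check a mapping from the table is to ask the marshaler how many bytes it uses for the unmanaged form of a managed type. The following is a minimal sketch using Marshal.SizeOf; the class and method names (TypeSizes, NativeSize) are illustrative. Note that bool is reported with the size of a Win32 BOOL (four bytes), not the one byte it occupies in managed code.

```csharp
using System;
using System.Runtime.InteropServices;

static class TypeSizes
{
    // Size in bytes of the unmanaged representation the marshaler uses for a type.
    public static int NativeSize(Type managedType)
    {
        return Marshal.SizeOf(managedType);
    }

    static void Main()
    {
        Type[] types = {
            typeof(byte), typeof(short), typeof(ushort), typeof(uint),
            typeof(int), typeof(bool), typeof(float), typeof(double)
        };
        foreach (Type t in types)
        {
            Console.WriteLine("{0,-16} {1} byte(s)", t.FullName, NativeSize(t));
        }
    }
}
```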
Reference