The Unsafe way of doing it

Bala Rajesh

8 Jun 2007

Writing code with pointers and memory allocation in C#

Introduction

If you asked me, I would say the power of C/C++ lies in pointers. So with the advent of C#, has the language lost that power? Not at all: C# has a way of doing pointer-related work too. Working with pointers means accessing memory directly, and that is why we call it unsafe.

Why do we want to deal with it when we know it is unsafe? If we want to write code that interfaces with the operating system, access a memory-mapped device, or implement a time-critical algorithm, then pointers are very handy and may be the only way to do it.

The other interesting thing is that unsafe code is one of the key differences between C# and VB .NET: VB .NET has no way of using pointers. So if you are planning to write complex system-level programs such as device drivers, search engines, or editors, you can exploit the flexibility and user-friendly nature of the .NET Framework along with the power of C/C++ by using unsafe code. In this article we will see the basics of writing unsafe code in C#.

What is unsafe code?

When you use the new keyword to create a new instance of a reference type, you are asking the CLR to set aside enough memory to use for the variable. The CLR allocates enough memory for the variable and associates the memory with your variable. Under normal conditions, your code is unaware of the actual location of that memory, as far as a memory address is concerned. After the new operation succeeds, your code is free to use the allocated memory without knowing or caring where the memory is actually located on your system.

In C and C++, developers have direct access to memory. When a piece of C or C++ code requests access to a block of memory, it is given the specific address of the allocated memory, and the code directly reads from and writes to that memory location. The advantage of this approach is that direct access to memory is extremely fast and makes for efficient code. There are problems, however, that can outweigh the benefits: direct memory access is easy to misuse, and misuse of memory causes code to crash. Misbehaving C or C++ code can easily write to memory that has already been deleted, or write to memory belonging to another variable. These types of memory access problems result in numerous hard-to-find bugs and software crashes.

The architecture of the CLR eliminates all of these problems by handling memory management for you. This means that your C# code can work with variables without needing to know details about how and where the variables are stored in memory. Because the CLR shields your C# code from these memory-related details, your C# code is free from bugs related to direct access to memory.

Occasionally, however, you need to work with a specific memory address in your C# code. Your code may need that extra ounce of performance, or your C# code may need to work with legacy code that requires that you provide the address of a specific piece of memory. The C# language supports a special mode, called unsafe mode, which enables you to work directly with memory from within your C# code.

This special C# construct is called unsafe mode because your code is no longer covered by the memory-management protection offered by the CLR. In unsafe mode, your C# code is allowed to access memory directly, and it can suffer from the same class of memory-related bugs found in C and C++ code if you're not extremely careful with the way you manage memory.

Here, you take a look at the unsafe mode of the C# language and how it can be used to enable you to access memory locations directly using C and C++ style pointer constructs.

Understanding Pointer Basics

Memory is accessed in C# using a special data type called a pointer. A pointer is a variable whose value points to a specific memory address. A pointer is declared in C# with an asterisk placed between the pointer's type and its identifier, as shown in the following declaration:

int * MyIntegerPointer;

This statement declares an integer pointer named MyIntegerPointer. The pointer's type signifies the type of variable to which the pointer can point. An integer pointer, for example, can only point to memory used by an integer variable.

Pointers must be assigned to a memory address, and C# makes it easy for you to write an expression that evaluates to the memory address of a variable. Prefixing a unary expression with the C# address-of operator, the ampersand, evaluates to a memory address, as shown in the following code:

int MyInteger = 123; 
int * MyIntegerPointer = &MyInteger;

The preceding code does two things:

• It declares an integer variable called MyInteger and assigns a value of 123 to it.

• It declares an integer pointer called MyIntegerPointer and points it to the address of the MyInteger variable.

Pointers actually have two values:

• The value of the pointer's memory address

• The value of the variable to which the pointer is pointing

C# enables you to write expressions that evaluate to either value. Prefixing the pointer identifier with an asterisk enables you to obtain the value of the variable to which the pointer is pointing, as demonstrated in the following code:

int MyInteger = 123; 
int * MyIntegerPointer = &MyInteger; 
Console.WriteLine(*MyIntegerPointer);

This code writes 123 to the console.
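The asterisk works for writing as well as reading. The following is a minimal sketch rather than part of the original listing; the class and method names are made up for illustration, and it assumes the code is compiled with the /unsafe option and marked with the unsafe modifier, both of which are covered later in this article.

```csharp
using System;

public static class PointerBasicsSketch
{
    // Hypothetical helper: stores a value through a pointer and
    // returns the variable the pointer refers to.
    public static unsafe int WriteThroughPointer()
    {
        int value = 123;
        int* pointer = &value;   // address-of: pointer now holds the address of value
        *pointer = 456;          // dereference on the left side: stores into value
        return value;            // value was changed through the pointer
    }

    public static void Main()
    {
        Console.WriteLine(WriteThroughPointer()); // prints 456
    }
}
```

The variable and the dereferenced pointer are two names for the same piece of memory, which is why assigning through the pointer changes the variable.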

Understanding pointer types

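Pointers can point to any of the following types:

• sbyte

• byte

• short

• ushort

• int

• uint

• long

• ulong

• char

• float

• double

• decimal

• bool

• an enumeration type

• another pointer type

• a user-defined structure that contains only fields of the types in this list (such as the Point2D structure used later in this article)

• void, which is used to specify a pointer to an unknown type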

You cannot declare a pointer to a reference type, such as an object. The memory for objects is managed by the CLR, and the memory may be deleted whenever the garbage collector needs to free the object's memory. If the C# compiler enabled you to maintain a pointer to an object, your code would run the risk of pointing to an object whose memory may be reclaimed at some point by the CLR's garbage collector.

Suppose that the C# compiler enabled you to write code like the following:

MyClass MyObject = new MyClass(); 
MyClass * MyObjectPointer; 
MyObjectPointer = &MyObject;

The memory used by MyObject is automatically managed by the CLR, and its memory is freed when all references to the object are released and the CLR's garbage collector executes.

The problem is that your unsafe code would now maintain a pointer to an object whose memory has been freed. There is no way for the CLR to know that you have a pointer to the object, and the result is a pointer that points to nothing after the garbage collector frees the memory. C# gets around this problem by not enabling you to declare pointers to reference types, whose memory is managed by the CLR.
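Of the allowed pointer types, void is the one that behaves differently: a void pointer carries an address but no element type, so it must be cast to a typed pointer before it can be dereferenced. The sketch below illustrates this under the assumption that it is compiled with /unsafe; the names PointerTypesSketch and ReadThroughVoidPointer are hypothetical.

```csharp
using System;

public static class PointerTypesSketch
{
    // Hypothetical helper: passes an address through a void pointer
    // and reads the value back through a typed pointer.
    public static unsafe long ReadThroughVoidPointer()
    {
        long number = 1234567890123L;
        void* untyped = &number;      // any pointer type converts implicitly to void*

        // A void* cannot be dereferenced directly; cast it back first.
        long* typed = (long*)untyped;
        return *typed;
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughVoidPointer()); // prints 1234567890123
    }
}
```

The cast is not checked, so casting a void pointer to the wrong pointer type compiles without complaint and reads the memory as if it held that type.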

Compiling Unsafe Code

By default, the C# compiler compiles only safe C# code. To force the compiler to compile unsafe C# code, you must use the /unsafe compiler argument:

csc /unsafe file1.cs

This command compiles the file1.cs source file and allows it to contain unsafe C# code. Unsafe code enables you to write code that accesses memory directly, bypassing the objects that manage memory in managed applications. Because memory locations are accessed directly, unsafe code can perform better in certain types of applications.

Note: In C#, unsafe code enables you to declare and use pointers as you would in C++.

Specifying pointers in unsafe mode

The C# compiler doesn't enable you to use pointers in your C# code by default. If you try to work with pointers in your code, the C# compiler issues the following error message:

error CS0214: Pointers may only be used in an unsafe context

Pointers are valid only in C# unsafe mode, and you must explicitly define unsafe code to the compiler. You do so by using the C# keyword unsafe. The unsafe keyword must be applied to a code block that uses pointers.

You can specify that a block of code executes in the C# unsafe mode by applying the unsafe keyword to the declaration of the code body, as shown.

Unsafe Methods

using System;

public class MyClass
{
    // The unsafe modifier allows pointer constructs anywhere in this method.
    public unsafe static void Main()
    {
        int MyInteger = 123;
        int * MyIntegerPointer = &MyInteger;
        Console.WriteLine(*MyIntegerPointer);
    }
}

The Main() method here uses the unsafe modifier in its declaration. This indicates to the C# compiler that all of the code in the method must be considered unsafe. After this keyword is used, the code in the method can use unsafe pointer constructs.

The unsafe keyword applies only to the method in which it appears. If the class were to contain another method, that other method could not use unsafe pointer constructs unless it, too, were declared with the unsafe keyword. The following rules apply to the unsafe modifier:

• Classes, structures, and delegates can include the unsafe modifier, which indicates that the entire body of the type is considered unsafe.

• Fields, methods, properties, events, indexers, operators, constructors, destructors, and static constructors can be defined with the unsafe modifier, which indicates that the specific member declaration is unsafe.

• A code block can be marked with the unsafe modifier, which indicates that the entire block should be considered unsafe.
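The rules above can be sketched in one small program. This is an illustrative example, not code from the original article: the Node structure and the UnsafeScopesSketch class are hypothetical names, and the program assumes compilation with /unsafe. It shows the modifier on a type and on a code block inside an otherwise safe method.

```csharp
using System;

// unsafe on a type: its members may use pointer types.
public unsafe struct Node
{
    public int Value;
    public Node* Next;
}

public static class UnsafeScopesSketch
{
    // The method itself is safe; only the marked block is unsafe.
    public static int SumTwoNodes()
    {
        int total = 0;
        unsafe
        {
            Node second;
            second.Value = 20;
            second.Next = null;

            Node first;
            first.Value = 10;
            first.Next = &second;

            // Walk the two-element chain through the Next pointers.
            for (Node* current = &first; current != null; current = current->Next)
                total += current->Value;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumTwoNodes()); // prints 30
    }
}
```

Marking only the block keeps the unsafe region as small as possible, which makes it easier to see exactly which lines bypass the CLR's protection.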

Accessing members' values through pointers

The unsafe mode of C# enables you to use the -> operator to access members of structures referenced by a pointer. The operator, which is keyed as a hyphen followed by a greater-than symbol, enables you to access members directly, as shown in the following listing.

Accessing Structure Members with a Pointer

using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public class MyClass
{
    public unsafe static void Main()
    {
        Point2D MyPoint;
        Point2D * PointerToMyPoint;

        MyPoint = new Point2D();
        PointerToMyPoint = &MyPoint;

        // Assign the structure's members through the pointer.
        PointerToMyPoint->X = 100;
        PointerToMyPoint->Y = 200;
        Console.WriteLine("({0}, {1})", PointerToMyPoint->X, PointerToMyPoint->Y);
    }
}

The above code contains a declaration for a structure called Point2D. The structure contains two public members. The listing also includes an unsafe Main() method that creates a new variable of the structure type and creates a pointer to the new structure. The method then uses the pointer member access operator to assign values to the structure, which is then written to the console.

This differs from member access in the default C# safe mode, which uses the . operator. The C# compiler issues an error if you use the wrong operator in the wrong mode. If you use the . operator with an unsafe pointer, the C# compiler issues the following error message:

error CS0023: Operator '.' cannot be applied to operand of type 'Point2D*'

If you use the -> operator on an operand that is not a pointer, such as the MyPoint structure variable itself, the C# compiler also issues an error message:

error CS0193: The * or -> operator must be applied to a pointer
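The -> operator is shorthand for dereferencing the pointer and then applying the ordinary . operator to the result. The sketch below shows both spellings side by side; it is an illustration with a hypothetical class name (MemberAccessSketch) that redeclares Point2D so it can be compiled on its own with /unsafe.

```csharp
using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public static class MemberAccessSketch
{
    // Hypothetical helper: sets both members through a pointer,
    // once with -> and once with an explicit dereference.
    public static unsafe int SumCoordinates()
    {
        Point2D point = new Point2D();
        Point2D* pointer = &point;

        pointer->X = 100;      // pointer member access
        (*pointer).Y = 200;    // dereference first, then ordinary member access

        return point.X + point.Y;
    }

    public static void Main()
    {
        Console.WriteLine(SumCoordinates()); // prints 300
    }
}
```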

Using Pointers to Fix Variables to a Specific Address

When memory for a variable is managed by the CLR, your code works with a variable, and management details about the variable's memory are handled by the CLR. During the CLR's garbage collection process, the runtime may move memory around to consolidate the memory heap available at runtime. This means that during the course of an application, the memory address for a variable may change. The CLR might take your variable's data and move it to a different address.

Under normal conditions, your C# code is oblivious to this relocation strategy. Because your code works with a variable identifier, you usually access the variable's memory through the variable identifier, and you can trust that the CLR works with the correct piece of memory as you work with the variable.

The picture is not as straightforward, however, when you work with pointers. Pointers point to a specific memory address. If you assign a pointer to a memory address used by a variable and the CLR later moves that variable's memory location, your pointer is pointing to memory that is no longer used by your variable.

The unsafe mode of C# enables you to specify a variable as exempt from the memory relocation that the CLR offers. This lets you hold a variable at a specific memory address, enabling you to use a pointer with the variable without worrying that the CLR may move the variable's memory address out from under your pointer. The C# keyword fixed is used to specify that a variable's memory address should be fixed. The fixed keyword is followed by a parenthetical expression containing a pointer declaration with an assignment to a variable. A block of code follows the fixed expression, and the fixed variable remains at the same memory address throughout the fixed code block, as shown in Listing 1.

Listing 1: Fixing Managed Data in Memory

using System;

public class MyClass
{
    public unsafe static void Main()
    {
        int ArrayIndex;
        int [] IntegerArray;

        IntegerArray = new int [5];

        // The array cannot be moved by the CLR inside this block.
        fixed(int * IntegerPointer = IntegerArray)
        {
            for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
                IntegerPointer[ArrayIndex] = ArrayIndex;
        }

        for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
            Console.WriteLine(IntegerArray[ArrayIndex]);
    }
}

The fixed keyword in Listing 1 declares an integer pointer that points to an integer array. It is followed by a block of code that writes values to the array using the pointer. Within this block of code, the address of the IntegerArray array is guaranteed to be fixed, and the CLR does not move its location. This enables the code to use a pointer with the array without worrying that the CLR will move the array's physical memory location. After the fixed code block exits, the pointer can no longer be used and the CLR again considers the IntegerArray variable a candidate for relocation in memory.
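The fixed statement is not limited to arrays of integers. As a further sketch, assuming compilation with /unsafe, a string can be pinned in the same way to obtain a char pointer to its first character. The names FixedStringSketch and CountSpaces are hypothetical, and the helper assumes a non-null argument.

```csharp
using System;

public static class FixedStringSketch
{
    // Hypothetical helper: counts spaces by reading the string's
    // characters through a pinned pointer. Assumes text is not null.
    public static unsafe int CountSpaces(string text)
    {
        int count = 0;

        // Pins the string for the duration of the block; start points
        // at its first character.
        fixed (char* start = text)
        {
            for (int i = 0; i < text.Length; i++)
            {
                if (start[i] == ' ')
                    count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountSpaces("the unsafe way of doing it")); // prints 5
    }
}
```

Strings are immutable as far as the rest of the program is concerned, so a pointer obtained this way is best treated as read-only.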

Understanding pointer array element syntax

Listing 1 also illustrates the array element pointer syntax. The following line of code treats an unsafe mode pointer as if it were an array of integers:

IntegerPointer[ArrayIndex] = ArrayIndex;

This line of code treats the pointer as if it were an array. The array element pointer syntax allows your unsafe C# code to view the memory pointed to by the pointer as an array of variables that can be read from or written to.

Comparing pointers

The unsafe mode of C# enables you to compare pointers using the following operators:

• Equality (==)

• Inequality (!=)

• Less-than (<)

• Greater-than (>)

• Less-than-or-equal-to (<=)

• Greater-than-or-equal-to (>=)

As with value types, these operators evaluate to the Boolean values true and false when used with pointer types.
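A common use of pointer comparison is to walk a block of memory until a moving pointer reaches an end pointer. The following sketch illustrates that pattern under the assumption that it is compiled with /unsafe; PointerComparisonSketch and SumWithEndPointer are hypothetical names, and the computation of the end pointer relies on pointer arithmetic, which is described in the next section.

```csharp
using System;

public static class PointerComparisonSketch
{
    // Hypothetical helper: sums an array by advancing a pointer until
    // it is no longer less than the end pointer. Assumes values is not null.
    public static unsafe int SumWithEndPointer(int[] values)
    {
        int sum = 0;
        fixed (int* start = values)
        {
            int* end = start + values.Length;   // one past the last element

            for (int* current = start; current < end; current++)
                sum += *current;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithEndPointer(new int[] { 1, 2, 3, 4 })); // prints 10
    }
}
```

The end pointer is only compared against, never dereferenced, because it refers to the position just past the array's last element.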

Understanding pointer arithmetic

Pointers can be combined with integer values in mathematical expressions to change the location to which the pointer points. The + operator adds a value to the pointer, and the - operator subtracts a value from the pointer.

The fixed statement in Listing 1 could have also been written as follows:

fixed(int * IntegerPointer = IntegerArray)
{
    for(ArrayIndex = 0; ArrayIndex < 5; ArrayIndex++)
        *(IntegerPointer + ArrayIndex) = ArrayIndex;
}

In this code block, the pointer is offset by a value, and the sum is used to point to a memory location.

The pointer arithmetic is performed in the following statement:

*(IntegerPointer + ArrayIndex) = ArrayIndex;

This statement reads as follows: "Take the value of IntegerPointer and increment it by the number of positions specified by ArrayIndex. Place the value of ArrayIndex in that location." Pointer arithmetic moves a pointer in units of the type being pointed to, so the number of bytes moved depends on the size of that type. Listing 1 declares an integer array and an integer pointer. When pointer arithmetic is used on the integer pointer, the value used to offset the pointer specifies the number of integers to move past, not the number of bytes.

The following expression uses pointer arithmetic to offset a pointer location by three integers:

IntegerPointer + 3

The literal value 3 in this expression specifies that the pointer should be incremented by the space taken up by three integers, not by three bytes. Because the pointer points to an integer, the 3 is interpreted as "space for three integers" and not "space for three bytes." Because an integer takes up four bytes of memory, the pointer's address is incremented by twelve bytes (three integers multiplied by four bytes for each integer), not three bytes.
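The twelve-byte figure can be checked by subtracting pointers. Subtracting two int pointers gives the distance in integers, while casting both to byte pointers first gives the distance in bytes. The sketch below is an illustration with hypothetical names (PointerArithmeticSketch, ElementDistance, ByteDistance) and assumes compilation with /unsafe.

```csharp
using System;

public static class PointerArithmeticSketch
{
    // Distance between a pointer and the same pointer plus 3,
    // measured in int-sized steps.
    public static unsafe long ElementDistance()
    {
        int[] values = new int[4];
        fixed (int* start = values)
        {
            int* offset = start + 3;
            return offset - start;
        }
    }

    // The same distance measured in bytes.
    public static unsafe long ByteDistance()
    {
        int[] values = new int[4];
        fixed (int* start = values)
        {
            int* offset = start + 3;
            return (byte*)offset - (byte*)start;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ElementDistance()); // prints 3
        Console.WriteLine(ByteDistance());    // prints 12
    }
}
```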

Using the sizeof operator

You can use the sizeof operator in unsafe mode to calculate the number of bytes needed to hold a specific data type. The operator is followed by an unmanaged type name in parentheses, and the expression evaluates to an integer specifying the number of bytes needed to hold a variable of the specified type.

The following table lists the built-in unmanaged types and the values that are returned by a sizeof operation on each.

Expression	Result
sizeof(sbyte)	1
sizeof(byte)	1
sizeof(short)	2
sizeof(ushort)	2
sizeof(int)	4
sizeof(uint)	4
sizeof(long)	8
sizeof(ulong)	8
sizeof(char)	2
sizeof(float)	4
sizeof(double)	8
sizeof(decimal)	16
sizeof(bool)	1
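The operator also accepts user-defined structures made up of unmanaged types. The sketch below applies it to a two-integer structure like the Point2D type used earlier; the class name SizeofSketch is hypothetical, the code assumes compilation with /unsafe, and the result of 8 assumes the default layout of two 4-byte integers with no padding.

```csharp
using System;

public struct Point2D
{
    public int X;
    public int Y;
}

public static class SizeofSketch
{
    // sizeof on a user-defined structure must appear in an unsafe context.
    public static unsafe int SizeOfPoint()
    {
        return sizeof(Point2D);
    }

    public static void Main()
    {
        Console.WriteLine(SizeOfPoint()); // prints 8 for two 4-byte integers
    }
}
```

This value is what pointer arithmetic on a Point2D pointer uses as its step size.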

Allocating Memory from the Stack

C# provides a simple memory allocation mechanism in unsafe code. You can request memory in unsafe mode using the C# stackalloc keyword, as shown in the following code:

Allocating Memory from the Stack

using System;

public class MyClass
{
    public unsafe static void Main()
    {
        // Reserve room for five integers on the stack.
        int * CharacterBuffer = stackalloc int [5];
        int Index;

        for(Index = 0; Index < 5; Index++)
            CharacterBuffer[Index] = Index;

        for(Index = 0; Index < 5; Index++)
            Console.WriteLine(CharacterBuffer[Index]);
    }
}

A data type and an element count in square brackets follow the stackalloc keyword. The expression returns a pointer to the allocated memory block, and you can use the memory just as you would use memory allocated by the CLR.

There is no explicit operation for freeing the memory allocated by the stackalloc keyword. The memory is freed automatically when the method that allocated the memory exits.
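Stack-allocated memory suits small, short-lived scratch buffers. As an illustrative sketch, assuming compilation with /unsafe, the helper below builds a reversed copy of a string in a stack buffer and then copies the result into a managed string before the method returns. The names StackallocSketch and ReverseText are hypothetical, and the helper assumes a short, non-empty input, because the stack is a limited resource.

```csharp
using System;

public static class StackallocSketch
{
    // Hypothetical helper: reverses a short, non-empty string using
    // a temporary buffer on the stack.
    public static unsafe string ReverseText(string text)
    {
        int length = text.Length;
        char* buffer = stackalloc char[length];   // scratch space, freed when the method exits

        for (int i = 0; i < length; i++)
            buffer[i] = text[length - 1 - i];

        // Copy the characters into a managed string; the buffer itself
        // must not be used after this method returns.
        return new string(buffer, 0, length);
    }

    public static void Main()
    {
        Console.WriteLine(ReverseText("unsafe")); // prints efasnu
    }
}
```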

Summary

The unsafe mode in C# enables your code to work directly with memory. Using it can enhance performance because your code accesses memory directly, without having to navigate through the CLR. However, unsafe mode is potentially dangerous and can cause your code to crash if you do not work with the memory properly.

In general, avoid using the C# unsafe mode. Unless you need that last bit of performance from your code, or you're working with legacy C or C++ code that requires you to supply a specific memory location, you should stick to the default safe mode and let the CLR handle memory allocation details for you. The minor performance degradation that results is far outweighed by lifting the burden of memory management from your code, and by gaining the freedom to write code that is devoid of bugs related to memory management. My final word: if you are not sure about what you are doing, it's always better to play it safe.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
