Introduction
This is an in-depth look at the arrays in the Common Language Runtime and the .NET Framework. This study details the implementation of arrays and describes efficient ways of using them.
This is my second article in the UNDOCUMENTED series, following my earlier article on strings. The intent of writing articles in this series is to help me understand how to use the C# efficiently to develop a serious commercial application.
I am ex-Microsoft developer of Excel, who has started his own software company, developing applications employing artificial intelligence. Since my previous article on String Undocumented last month, I actually started doing some contract work on the side for Microsoft; I am actually working for the Windows group and can't believe the massive amount of great stuff that's gone on for the next generation of the NET framework. But, because I have signed an NDA, I have limited all my information to publicly available information and won't discuss unreleased products.
Background
The Array class is the base class for arrays used by the compiler and the runtime. Along with strings, arrays (including descendent classes) are the only types which are of variable length. The rank of an Array is the number of dimensions in the Array. The lower bound of a dimension of an Array is the starting index of that dimension of the Array; a multidimensional Array can have different bounds for each dimension
Internally, the runtime maintains two separate implementations of arrays -- optimized SZARRAYS
and general arrays, which I will call MDARRAYS
for multi-dimensional arrays (even though an MDARRAY
can be of one dimensional). SZARRAYs
are zero-based and one-dimensional. ARRAYs
, on the otherhand, are multidimensional and/or have non-zero lower-bounds. SZARRAYS
are by far more common than ARRAYs
and therefore are highly optimized. The table below details the differences between the two.
|
SZARRAYS |
MDARRAYS |
Description |
One-dimensional, zero-based |
Multidimensional and/or non-zero-based |
C# Syntax |
object[] -- regular arrays object[][] -- jagged arrays |
object [,] -- multidimensional |
CLS-Compliant |
Yes |
No, if lower bounds are non-zero |
IL-Optimized |
Yes. IL contains specialized instructions such as ldelem and stelem to handle these arrays. |
No, in Version 1.0. Accesses and modifications occur through functions. |
Methods Optimized |
Yes. Arrays of primitives have specialized methods that are perform efficiently without boxing. |
No, in Version 1.0. Arrays of all types are implemented using the same generic code, which casts all elements to objects. Value types there are repeatedly boxed and unboxed within functions such as sort, reverse and so on. |
Base Size (not including 8 bytes of v-table and object header) |
4 for valuetypes 8 for referencetypes |
4-8 bytes + 8 * rank |
JIT Optimized |
Jit performs range-check elimination. |
No special JIT optimization. Additional code is added to perform range-check for each dimension. |
SZARRAYS
, perform so much faster than MDARRAYS
, that jagged arrays, which are SZARRAYS
containing other SZARRAYS
are highly preferred over regular multidimensional arrays for performance. Keep in mind, that jagged arrays are not CLS-compliant and may not be worked cross-language.
In version 1.0 of the CLR, each heap-allocated object consists of a 4 byte objectheader and 4 byte pointer to a method table, so each arrays have this initial overhead; in addition, both SZARRAYS
and MDARRAYs
contain the following internal fields.
Variable |
Type |
Description |
Array Length |
int |
This is the actual number of elements int the array. |
ElementType |
Type |
Optional. According to the source code, this field is only present if the array consists of "pointers." It appears that pointers, in this case, refer to object references and not unmanaged pointers. |
MDARRAYS contain these additional internal internally.
Variable |
Type |
Description |
Bounds[rank] |
int[] |
Optional. The number of elements in the array |
LowerBound[rank] |
int[] |
Optional. Valid indexes are lowerBounds[i] <= index[i] < lowerBounds[i] + bounds[i] |
Thus every access in a regular array must examine several internal members. For high performance, two approaches for multidimensional arrays are possible: the aforementioned, "jagged" C# arrays or manual calculation within SZARRAYs
. For example, (i-lowerbound1) * rowcount + (j-lowerbound2)
would locate the element [i,j].
Non zero-based arrays have the same shortcomings as multidimensional arrays, since they are both MDARRAYS
. Not only are they not optimized, but the suffer from greater quality problems due to both less testing and boundary issues.
For instance, arrays with negative lower bounds produce interesting bugs. array.IndexOf
normally returns -1 for failure; however, for arrays with negative bounds, failure is int32.MaxValue. However, this same array will cause IList.Contain
function to always return true for a missing item, because that function calls IndexOf
and assumes that it will return a failure value less than the lowerbound. Moreover, array.BinarySearch
also exhibits problems with negative bounds. When array.BinarySearch
fails, it returns a negative value, which is the complement of the position to insert an item.
Two arrays are considered the same type if they have the same rank(number of dimensions) and the same element type. In contrast to C/C++, the upper and lower bounds of each dimension are not considered, not even inner dimensions in a multidimensional array. Methods such as Array.Copy
operate on heterogeneous multidimensional arrays of the same type by treating each multidimensional array as one large flat unidimensional array. Jagged arrays of different rank are of different types as well; this simply follows from the fact that such arrays have different element types, despite the fact the elements are all arrays. Interestingly, the Type object representing the Array base class interestingly returns false for Type.IsArray
and null for Type.GetElementType
of course, the Array class is abstract and cannot normally be instantiated without being inherited.
Beyond the base size, both MDARRAYs
and SZARRAYS
will contain consecutive inline structures for value types and consecutive pointers for reference types like object and string. Reference types also have an element type field that precedes the data; it does seem redundant since the array's method table could instead extract the necessary type information. The presence of the element type as a field does allow the elements type information to be extracted quickly without an virtual call indirection and function call, which is important for features like array covariance.
If the data is a value type, then the elements would be of the same size as that value type. Reference types consume IntPtr.Size
bytes. IntPtr.Size
is 4 bytes for Win32 and 8-bytes for Win64. This is suppose to equate with the native size of void * according to the documentation, but in non-Win32 versions of Rotor such as Mac and UNIX, it is always 8 bytes regardless of the CPU.
Type |
Element Size in bytes |
bool |
1 |
byte |
1 |
short |
2 |
int |
4 |
long |
8 |
float |
4 |
double |
8 |
decimal |
16 |
string |
IntPtr.Size |
object |
IntPtr.Size |
interface |
IntPtr.Size |
In contrast to strings, the other variable-sized object, none of the internal array fields are exposed in Reflection. Accessing these internal fields requires the use of unsafe code. Since these fields are already exposed through public methods and properties, proving any source code as I have done in "Strings Undocumented" for such operations would be redundant.
Programmatically, an array is an SZARRAY if and only if array.Rank==1 && array.GetLowerBound(0)==0
. An array contains value type members if (elementType = array.GetType().GetElementType()) && elementType.IsSubclassOf(typeof(ValueType)) && elementType != typeof(Enum) && elementType != typeof(ValueType)
. Ironically, neither Enum[]
or ValueType[]
are value type arrays; they are just arrays containing references to boxed valuetypes elements. Actually, the last condition is redundant since ValueType or any other type cannot be a subclass of itself.
The Dynamic ArrayList
The ArrayList is a very useful class for dealing with dynamic arrays. However, it serves more general purposes as well--encapsulating collection classes and allowing special operations to be performed on these classes.
ArrayList
will construct an array object and modify it directly. By default, ArrayList will create an array of 16 elements. The following members of the ArrayList
class in the follow order.
Variable |
Type |
Description |
_items |
object[] |
The base array. |
_size |
int |
The current size of the array list. |
_version |
int |
Version is incremented after each modification. This is used to signify that any operation on view or enumerator of an earlier version of the list should fail. |
Altogether an ArrayList consumes 20 bytes (8 byte object overhead + 12 bytes for the instance information), not including the space for the underlying array.
If the array needs to grow beyond its capacity, a new array will be constructed with twice the previous capacity or the new desired size, subject to the maxcapacity constraint. This approach takes O(3n) which is linear time. The alternative approach of growing the array by a fixed amount rather than a percentage results in quadratic time performance, O(n^2). For optimal performance, arraylists should be preallocated if the size is known to reduce unnecessary copying.
Compacting the size of the ArrayList when done requires a call to TrimToSize
, which will actually perform another copy operation. If all the elements are already added and no further expansion is necessary, it would be more optimal in both performance and memory to extract a type-safe array using a call to arraylist.ToArray(Type type)
.
Completely releasing the space of an array requires calling Clear
(the only other alternative is using RemoveRange
to free all elements), followed TrimToSize
. Ironically, setting a arraylist with a capacity of less than 16 will result in less space than using a capacity of 0, which automatically uses the default capacity of 16.
ArrayLists
are not a complete replacement for Arrays. (I suspect that they are of higher performance than the multidimensional arrays mentioned earlier, but I will determine that in another article in near future, specifically related to performance.)
|
Arrays |
Array Lists |
Memory requirements |
Compact inline data for value-types. Object references for data. |
Object-based arrays. (Value-types incur additional 12-byte overhead per element--4 bytes for the object reference and 8 bytes for the object header introduced by boxing) |
Performance |
Optimized IL instructions. Range check elimination. |
Indirect references |
|
Fixed-size |
Dynamic |
Random Access |
|
Access to an indexed element is off-limits until elements in all previous indices have been added. An solution to this approach is construct an arraylist using the static ArrayList.Repeat(null, initial length) method. |
It's rather straight-foward to convert back and forth from Arrays to ArrayLists. Arrays can be transformed into ArrayLists using ArrayList.Adapter(array)
. ArrayLists can be converted to compact arrays using ToArray()
or ToArray(type)
.
You can also accessing the underlining array of an ArrayList by calling (object[]) sb.GetType().GetField("_items", BindingFlags.NonPublic|BindingFlags.Instance).GetValue(arrayList).
This is not a perfect replacement for ToArray() because of the fact that the array length is the capacity of the ArrayList rather than its count. By stashing away and reusing the FieldInfo object from the call to GetField, any memory and time overhead can be eliminated.
Array Manipulation Without ArrayLists
Manual Array Resizing
Arrays can be resize manually. Here is a useful function that should have been provided by the array class. To mimic the behavior of ArrayList manually, a call to ensure should be placed before any potentially invalid indexing.
public static Array Resize(Array array, int newSize)
{
Type type = array.Type;
Array newArray = Array.CreateInstance(type.GetElementType(), newSize);
Array.Copy(array, 0, newArray, 0, Math.Min(newArray.Length, newSize))l
return newArray;
}
The Resize method uses Array.CreateInstance to do late-bound construction.
Array Movement
To manual move elements within an array, the Array provides a general Copy function to copy data from one array to another. This function also works withthe same array. Range checks are performed just once. Within the same array, the copying behaves like the C standard library's memmove function rather than memcpy.
The InsertHelper function shifts the contents of array to make room for count elements at the indexth position. Any elements towards the end are right-shifted out of the array and disappear. Similarly, the RemoveHelper method removes elements from the specified position, shift trailing elements to the left.
public static Array InsertHelper(Array array, int index, int count)
{
Array.Copy(array, index, array, index+count, array.Length-(index+count));
array.Clear(index, count);
}
public static Array RemoveHelper(Array array, int index, int count)
{
int copy = ;
Array.Copy(array, index+count, array, index, array.Length - (index+count));
array.Clear(array.Length - count, count);
}
The Buffer class also provides useful functions, GetByte, SetByte, ByteLength and BlockCopy, for manipulating arrays of value-types, that no not contain any internal object references. In fact, the types of the elements are ignored as the Buffer class treats each array as a range of bytes. Arrays of different value types can be copied in one another, so, for example, floating-point values can be overlaid unto integral data and vice-versa. To use this class, it is helpful to use the sizeof keyword or Marshal.SizeOf to obtain the exact size of the valueType.
When copying elements inside a multidimensional array, the array behaves like a long one-dimensional array, where the rows (or columns) are conceptually laid end to end. For example, if an array has three rows (or columns) with four elements each, copying six elements from the beginning of the array would copy all four elements of the first row (or column) and the first two elements of the second row (or column).
Arrays of Bits
One class that should not be forgotten is the Pascal set-like BitArray, which works surprisingly like an array of booleans in code. Booleans in .NET occupy a single byte, while, more compact than the int-size C++ bool, is still quite wasteful in an array.
BitArray is implement as an array of Int32s, each consisting of 32 bits. Using an array of ints instead of bytes allows a single instruction to access and modify 32 bits at a time instead of 8, for a four-fold improvement in performance for many operations.
Additional, operators for BitArray include And, Or, Xor, and Not. There's also a cousin BitVector32, that will use an int for a small set.
Array Casting & Conversion
Array Covariance: Conversions of Arrays
Arrays of references types support a feature called Array covariance, that mimic the ability of C++ to cast an array of pointers of one type to another type. One array can be converted to another array of the different type at compile time, if there is some built-in conversion, either explicit or implicit, between the two. The two arrays must also have the same rank. The array is reinterpreted and no underlying physical changes made during the conversion.
If the conversion is implicit, (that is, the element type of the array before conversion is being converted to an interface that it supports or to a base type) a cast is not required and no runtime check is performed. If the conversion is explicit, (the conversion is from an interface to another type, from a base type to a derived type, or from a base type to an interface not supported directly by that base type) an explicit cast is required and a runtime check is performed.
Each array of references, as mentioned earlier, has an underlying element type that remains fixed throughout the conversions. The runtime check is performed to ensure compatibility between the underlying element type and the new element type that it is being reinterpreted as.
A few examples should make it clear:
public class Animal {}
object [] data = new Animal[2];
Animal [] animals1 = data;
Animal [] animals2 = (Animal[]) data;
string [] strings1 = (string[]) animals2;
string [] strings2 = (string[]) data;
object [] data2 = new object[1];
data2[0] = new Animal();
Animal [] animals3 = (Animal[]) data2;
Being able to reinterpret an array of one type to an array of interfaces, base type or derived achieves efficiencies in time and memory, that would have disappeared if another array of the other type were required to be constructed.
public void Test()
{
string data [] = new string [] { "a", "b", "c", "d", "e" };
SetRange(array, 1, 3, "x" );
}
public void SetRange(object [] array, int start, int count, object value)
{
for (int i=0; i < count; i++)
array[i+start] = value;
}
In the above example, the new string would appear as { "a", "x", "x", "x", "e" }. Though we did not explicit code to handle strings, array covariance allows SetRange to work for string arrays, nevertheless. Passing an integer as the parameter value would have resulted in a runtime exception because all array assignments for reference arrays include runtime type checking.
The following example illustrates issues with using value type arrays and string arrays with param arrays. The call to Write with integer arrays results in Results in "System.Int32 []" being written out because the the array of integers is wrapped up into another newly constructed array of objects. However, the call with string arrays results in "a", "b", "c" being written out; because of covariance, the string array becomes the args parameter.
Write( new int [] { 1, 2, 3 } );
Write( new string[] {"a", "b", "c" } );
void Write(params object [] args)
{
for (int i=0; i<args.Length; i++)
Console.WriteLine(args[i]);
}
Everything has some cost. The downside of array covariance is anytime an element in assigned a new object, that object must be type-checked at runtime.
There's always the choice of using object[]
and Array when using covariance to right general array functions. Object[]
in many cases are faster because it is clear the array is an SZARRAY
and IL has special instructions to call for setting and getting elements. However, Array can always hold arrays of value-types as well as multidimensional arrays.
Coversion of Array Elements (Array.Copy)
When copying elements between arrays of the same types, array.Copy
before a single range check before the transfer followed by a ultrafast memmove byte transfer.
Besides copying elements between arrays of different types, array.Copy
can copy elements between those of different types. When copying elements from a reference array to a value-type array, unboxing is performed; in the other direction, boxing is performed. Between different arrays of different value types, only widening conversions are performed (for example, from int to long, but not from long to int). When a conversion cannot be performed an InvalidCastException
is thrown. Between reference types, elements are type-checked for compatibility and a shallow copy of references is performed; an ArrayTypeMismatchException
occurs for arrays of incompatible arrays.
public Array Convert(Array array, Type type)
{
Array newArray = Array.CreateInstance(type, array.Length);
Array.Copy(array,0, newArray,0, array.Length);
return newArray;
}
ArrayList Views
Some of the versatility of array lists are its ability to construct various types of views of other Arrays, ArrayLists and ILists.
Adapter
ArrayList.Adapter allows any IList (that includes an array and many other collections) to be viewed as a ArrayList. This is useful in several ways: An IList can use the binary search, sorting, reverse, subrange and array conversion capabilities that the arraylist automatically provides. However, this may not be as useful for an array since it already provides all those capabilities accept the subrange views.
|
Syntax |
Converting an IList to an Array |
ArrayList.Adapter(iList).ToArray() |
Reversing an IList |
ArrayList.Adapter(iList).Reverse() |
Getting a Subrange |
ArrayList.Adapter(iList).GetRange(start, count) |
Binary Search |
ArrayList.Adapter(iList).BinarySearch() |
Sort |
ArrayList.Adapter(iList).Sort() |
Array Subranges
To extract a subrange of an array, one can write the following code.
public static Array GetRange(Array range, int start, int count)
{
Type type = array.Type;
Array newArray = Array.CreateInstance(type.GetElementType(), count);
Array.Copy(array, start, newArray, 0, count);
return newArray;
}
Because this produces an actual (shallow) copy of the subrange of the array, this potentially consume a lot of memory. The ArrayList class contains a GetRange(int start, int count) which is potentially a memory saver for large array. GetRange returns a derived ArrayList that presents a view of array. The view can be manipulated as well. Elements can be modified, added and removed from within the view; however, any modification to the arraylist from outside the view will cause an exception to be thrown next time the view is used.
This method does not create copies of the elements. The new ArrayList is only a view window into the source ArrayList. However, all subsequent changes to the source ArrayList must be done through this view window ArrayList. If changes are made directly to the source ArrayList, the view window ArrayList is invalidated and any operations on it will return an InvalidOperationException.
In combination with the Adapter method, a view of an array or another collection can be obtained as well. For example, ArrayList.Adapter(array).GetRange(start, count)
provides a view of the underlying array.
Wrapper Support
ArrayList contains three static methods with two overloads each that take either an IList or an ArrayList and returns a fixed-size, synchronized, or readonly IList and ArrayList respectively. The IList methods returns a lightweight wrapper class inherited from IList and consisting of one instance variable, a reference to the base IList. The ArrayList methods returns a derived ArrayList class, which consists of an additional instance variable referring to the base ArrayList class and ignores the inherited arrays. The ILists returned from the methods overloaded for ILists, are not derived from ArrayList and are somewhat misplaced in the ArrayList class. It would have been more appropriate for them to have been static members of the IList interface.
For example, these are the actual implementations of FixedList.
public static IList FixedSize(IList list) {
if (list==null)
throw new ArgumentNullException("list");
return new FixedSizeList(list);
}
public static ArrayList FixedSize(ArrayList list) {
if (list==null)
throw new ArgumentNullException("list");
return new FixedSizeArrayList(list);
}
|
Description |
FixedSize |
Returns a fixed size IList or ArrayList. All operations that change the size of the array such as Add or Remove result in a NotSupportedException. |
ReadOnly |
Returns a readonly IList or ArrayList. All operations that modify any part of the array results in a NotSupportedException. |
Synchronized |
Returns a synchonized IList or ArrayList |
These operations can be combined for a synchronized fixed-size array or a synchronized read-only array: ArrayList.Synchronized(ArrayList.ReadOnly(list))
. A readonly array is automatically fixed-sized.
Readonly is especially useful for prohibiting modification. Arrays, while fixed-size, are still always passed by reference and are mutable. In marshaling, arrays of greater than ten elements are pinned rather than copied, so potentially they could be modified.
Array Performance
Range Check Elimination: Use for(int i=0; i<a.Length; i++)
The C# compiler performs a special optimization that improves the performance of iterating through an array. First, compare the following three approaches to iterating through an array. Which is fastest?
1) standard iteration
int hash = 0;
for (int i=0; i< a.Length; i++)
{
hash += a[i];
}
2) iterative loop with saved length variable
int hash = 0;
int length = a.length;
for (int i=0; i< length; i++)
{
hash += a[i];
}
3) foreach iteration
foreach (int i in a)
{
hash += i;
}
In the current version of the JIT compiler, you may be surprised to learn the first example produces the fastest code, while the third "foreach" example produces the slowest. In later versions of the compiler, foreach
will be special case for arrays to provide the same performance or better as example 1.
Why is example 1 faster than example 2? This is because the compiler recognizes the pattern for (int i=0; i<a.length; i++) for both arrays. Since array have constant length, the compilers simply stores away the length, so that no function call is made on each iteration. (The JIT compiler may actually inline references to the array length, since the current version automatically inlines non-virtual method calls that consist of simple control flow and < 32 bytes of IL instructions.)
In addition, the compiler eliminates all range-check tests on any instance of s[i] within the loop, because i is guaranteed in the for condition to be within the range 0 <= i < length. Normally, any indexing of an array results in a range-check being performed; this is why attempting to save time by stashing away the length variable in example 2 actually results in slower code than in example 1.
There also the pointer approach.
fixed (int *pfixed = a)
{
for (int *p = pfixed; count-->0; p++)
hash += *p++;
}
Large Arrays
Large arrays can have a significant performance hit. Any large object that consumes 85K is placed in the large object heap. Practically speaking, almost all of these objects are going to be arrays, and possibly some strings, because very few classes will contain enough fields to exceed that amount of memory. Large objects are not compacted, and are only removed in a FULL garbage collection, containing generation 2. If the large object includes any finalizers, then at least two FULL garbage collections will be required. Since FULL garbage collections can be 100 times or more less frequent than the partial collections, it can take a long time for memory to be recovered.
Thus, a very poor allocation scheme, probably the worst, for a .NET application would be one that allocates large amounts very frequently for temporary uses.
Array Initialization
Static arrays of primitive data types are initialized at compiled time, just as in C++. Unfortunately, static arrays of structs are not and must be initialized at initialized at runtime. The developers defer this features because of the complexity introduced by the ability of structures to contain object references.
Presizing ArrayLists and Other Collections
In general, collections in the .NET Framework are dynamically resized by doubling their capacity each time they are full. This is an linear-time approach that takes O(3N), which is superior to the quadratic time approach of appending a fixed size to an array. But, if you know the approximate size of final array beforehand, you can essentially bring down the time to create a collection by two-thirds just by preallocating; instead of 3N copies, the collection will make just 1N copy. Each collection has a capacity setting for doing so.
Use Jagged Arrays Over Multidimensional Arrays
In future versions of C# post-Everett, many of the performance issues from the multidimensional arrays will be address. For now, jagged array which rely on optimized SZARRAYs and require just slightly more memory, are significantly faster.
Use strongly typed arrays whenever possible
Strongly typed arrays are highly optimized, avoiding the costs of boxing, casting, function calls and other indirection. With value-type arrays, a number of functions such as Reverse, IndexOf, LastIndexOf, BinSearch and Sort,
In the future, C# in "Visual Studio for Yukon" is expected to support generics and constraints, which will complete eliminate boxing and allow functions to perform as efficiently as possible. More information is available from Microsoft's public announcement at www.csharp.net .
public class Stack<ItemType>
{
private ItemType[] items;
public void Push(ItemType data)
{
...
}
public ItemType Pop()
{
...
}
}
Generics have several advantages over C# templates such as the elimination of code bloat. Code for specialized classes are generated at run-time on the fly, and all reference types share the same code.
Conclusion
This concludes my discourse on arrays for now. I will continue update this article with new source code and actual benchmarks in the future. Be sure to watch for update versions of this page.
As a result of the enthusiasm that this article has generated, I will continue to develop more UNDOCUMENTED articles. My next article will be a discussion of the implementations of arrays and collections. I hope to publish a couple dozen articles when I am done with the series. I might eventually make this into a book.
My sources include various books, the shared source CLI, MSDN, magazine articles, interviews, inside sources, developer conference presentations, and some decompilers. One reader has suggested that I examined .NET Essentials
by Don Box & Chris Sells and Applied .NET Framework
by Jeffrey Richter provide the some of the same information that I do; but I have tried to come up with new information that in neither of those two books.
All of this behind-the-scenes information takes some amount of work to research and obtain, so, if you enjoyed this article, don't forget to vote.
Version History
Version |
Description |
January 5 |
Original article on arrays |