(untagged)

A Generic, Fast implementation of a 2dimensional dense array in C#

webmaster442

0.00/5 (No votes)

17 Dec 2015

A brief explanation of why 2d array is slow, and why not to use jagged array and howto overcome the problems using the dense array storage

Download 2DArrayTestSuite.zip - 37.2 KB

Introduction

Multi dimensional arrays exist in almost every programming language, but generally they are not the best solution to any problem because of various reasons. In this article I will show the different approaches offered by C# to create a 2D array. The article focuses primarialy on these, because it's the most common use of multidimensional arrays.

The "standard" way

C# rovides two types of arrays. The "standard" multidimensional array which has the following syntax:

var myarray = new int[2, 3];

The First number specifies the number of rows, while the second one specifies the number of columns. It has a neat, and understandable syntax, nothing fancy or complicated. But the biggest catch is the performance. If you have to pass this array to a function, then it will be painfully slow.

The slowness comes from the internals of the CLR. A multidimensional array doesn't have a Length property. Instead it has a GetLength(int dimension) method that returns the specified dimensions length. This requires more arithmetic than a standard Length property call, because length returns a memory segment, that can be compared, and worked with.

Jagged array

Another, better way is to use a Jagged array. A Jagged array is a one dimensional array of one dimensional arrays. Because of this layout the rows can differ in length. An extra overhead of memory consumption is added this way. The jagged array has the following syntax:

var jaggedArray = new int[2][];
for (int i=0; i<2; i++) jaggedArray[i] = new int[3];

The extra memory overhead comes from that we are storing an array of arrays. Each array has some internal properties that needed to be stored. This can be a big overhead, if we are dealing with a large array. Also notice the extra initialization cycle that is needed for the initialization of the sub arrays.

Dense array

Beyond these two approaches exists another implementation, which is called dense array. The dense term comes from matrices. A dense array is actually a one dimensional array indexed in a special way, so you can fit a 2 dimensional array into a one dimensial one.

This has some very good benefits, for example you could easily copy the elements, with an Array.Copy call. The downside of this is that you have to use some fancy indexing, which can lead to a serious performance issue. Typically you would calculate the element index for a given row, and column with the following method:

var index = column * rows + row;
array[index] = 33;

The biggest catch in this formula is the multiplication. Modern compilers and CPUs handle multiplication very well, but at large numbers this takes some time. If you have a 10000x10000 array, then you have to perform 100000000 multiplications on every indexing. This problem can be solved by storing the column and row multiplication value in another array, and looking up the value at indexing, sure it costs some extra memory, but the performance won't be as bad as multiplying every time.

The Implementation

Implementing the dense array method every time is bit of an overkill. Luckily we have Generics in c# since .net 2.0, so I came up with a generic Dense array class. The implementation is currently in a proof of concept stage, but it can easily extended to a full featured implementation.

namespace DenseArray
{
    public class DenseArray<T> : IEnumerable<T>
    {
        private T[] _dataarray;
        private int[] _columnindexes;

        /// <summary>
        /// Calculates the column indexes
        /// </summary>
        private void CalculateColumnIndexes()
        {
            _columnindexes = new int[Columns];
            for (int i=0; i<Columns; i++)
            {
                _columnindexes[i] = i * Rows;
            }
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="rows">Number of rows</param>
        /// <param name="columns">Number of columns</param>
        public DenseArray(int rows, int columns)
        {
            _dataarray = new T[rows * columns];
            Rows = rows;
            Columns = columns;
            CalculateColumnIndexes();
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="source">Source DenseArray</param>
        public DenseArray(DenseArray<T> source)
        {
            _dataarray = new T[source.Rows * source.Columns];
            Rows = source.Rows;
            Columns = source.Columns;
            Array.Copy(source._dataarray, this._dataarray, source._dataarray.LongLength);
            _columnindexes = new int[Columns];
            Array.Copy(source._columnindexes,
                       this._columnindexes,
                       source._columnindexes.LongLength);
        }

        /// <summary>
        /// Creates a new instance of DenseArray
        /// </summary>
        /// <param name="array">source 2d array</param>
        public DenseArray(T[,] array)
        {
            Rows = array.GetLength(0);
            Columns = array.GetLength(1);
            _dataarray = new T[Rows * Columns];
            CalculateColumnIndexes();
            for (int i = 0; i < Rows; i++)
            {
                for (int j = 0; j < Columns; j++)
                {
                    this[i, j] = array[i, j];
                }
            }
        }

        /// <summary>
        /// Gets the number of columns
        /// </summary>
        public int Columns { get; private set; }

        /// <summary>
        /// Gets the number of rows
        /// </summary>
        public int Rows { get; private set; }

        /// <summary>
        /// Gets or sets an element of the array
        /// </summary>
        /// <param name="row">Row index</param>
        /// <param name="column">Column index</param>
        /// <returns>Value at row and column index</returns>
        public T this[int row, int column]
        {
            get { return _dataarray[_columnindexes[column] + row]; }
            set { _dataarray[_columnindexes[column] + row] = value; }
        }

        /// <summary>
        /// Gets or sets an element of the array
        /// </summary>
        /// <param name="i">Index</param>
        /// <returns>Value at index</returns>
        public T this[int i]
        {
            get { return _dataarray[i]; }
            set { _dataarray[i] = value; }
        }

        /// <summary>
        /// IEnumerable implementation.
        /// </summary>
        /// <returns>internal array enumerator</returns>
        public IEnumerator<T> GetEnumerator()
        {
            return (IEnumerator<T>)_dataarray.GetEnumerator();
        }

        /// <summary>
        /// IEnumerable Implementation
        /// </summary>
        /// <returns>internal array enumerator</returns>
        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return _dataarray.GetEnumerator();
        }

        /// <summary>
        /// Gets a row as an Array
        /// </summary>
        /// <param name="rowindex">Row index</param>
        /// <returns>Row values in an array</returns>
        public T[] GetRow(int rowindex)
        {
            T[] ret = new T[this.Columns];
            for (int i = 0; i < this.Columns; i++)
            {
                ret[i] = this[rowindex, i];
            }
            return ret;
        }

        /// <summary>
        /// Gets a column as an Array
        /// </summary>
        /// <param name="columnindex">Column index</param>
        /// <returns>Column values in an array</returns>
        public T[] GetColumn(int columnindex)
        {
            T[] ret = new T[this.Rows];
            for (int i = 0; i < this.Rows; i++)
            {
                ret[i] = this[i, columnindex];
            }
            return ret;
        }

        /// <summary>
        /// Sets a row's values from an array
        /// </summary>
        /// <param name="row">the row index</param>
        /// <param name="items">items in an array</param>
        public void SetRow(int row, T[] items)
        {
            if (row < 0 || row > this.Rows)
               throw new ArgumentOutOfRangeException("row");
            if (items == null)
               throw new ArgumentNullException("items");

            int limit = Math.Min(items.Length, this.Columns);

            for (int i = 0; i < limit; i++)
            {
                this[row, i] = items[i];
            }
        }

        /// <summary>
        /// Sets a column's values from an array
        /// </summary>
        /// <param name="column">the column index</param>
        /// <param name="items">items in an array</param>
        public void SetColumn(int column, T[] items)
        {
            if (column < 0 || column > this.Columns)
               throw new ArgumentOutOfRangeException("column");
            if (items == null)
               throw new ArgumentNullException("items");

            int limit = Math.Min(items.Length, this.Rows);

            for (int i = 0; i < limit; i++)
            {
                this[i, column] = items[i];
            }
        }
    }
}

Using the code

Using the implementation is quite easy, it acts like a 2D array, because the class provides object indexers:

var dense = new DenseArray<int>(3, 3);
dense[0, 2] = 42;

The implementation provides methods for getting and setting whole column data in an easy way. Also it implements the IEnumerable interface, which is at the moment

Performance

I measured the performance of the methods shown in this article with a simple program (source code is in the downloadable zip), that allocates a 10000x10000 element array using the standard 2D array method, the jagged method and my implementation. The results was measured on a machine with an Intel i3 530 CPU running a 64 bit version Windows 10 with .NET framework 4.6 and 4GiB RAM.

		2D array	Jagged array	DenseArray
1st Run	Memory Allocation in bytes:	400527360	403034112	400023552
	Random fill time in ms:	2292	1599	1962
	Memory overhead in %		0,6259	-0,1258
	Speed Gain in %		43,3396	16,8196
2nd Run	Memory Allocation in bytes:	400510976	403079168	400023552
	Random fill time in ms:	2285	1556	1990
	Memory overhead in %		0,6412	-0,1217
	Speed Gain in %		46,8509	14,8241
3rd Run	Memory Allocation in bytes:	400502784	403087360	400039936
	Random fill time in ms:	2271	1373	1958
	Memory overhead in %		0,6453	-0,1156
	Speed Gain in %		65,4042	15,9857
	Average Memory in bytes:	400513706,7	403066880	400029013,3
	Average random fill time in ms:	2282,666667	1509,333333	1970
	Average memory overhead compared to 2d array in %:		0,6375	-0,1210
	Average speed gain compared to 2d array in %:		51,2367	15,8714

Further improvements

The implementation in the future might be extended with proper GetHashCode() and ToString() ovverides, and a decent implementation of an Enumeratior coud be added, because the current enumerator enumerates items as they are in the underlaying array.

History

2015-12-16 Initial release

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here