Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

High performance C# byte array to hex string to byte array

0.00/5 (No votes)
27 Aug 2012 2  
High performance C# byte array to hex string to byte array.

Introduction 

Conversion of bytes to hex string and vice versa is a common task with a variety of implementations. The performance key point for each to/from conversion is the (perpetual) repetition of the same if blocks and calculations that is the standard approach for all implementations I've seen. I'm using unsafe code in an effort to achieve the best possible performance. My concept is based on direct assignments. 

Byte array to hex string

By definition, the result of a byte array to hex string conversion is a combination/repetition of static values (0-255 or "00"-"FF"). In other words we can avoid calculations as well as formatting (leading zero for bytes 0-9). Due to the fact that each byte will be transformed to two characters (2 x 2 bytes), we may say that each byte is assigned to a 32-bit integer. Therefore for n bytes we'll define n integers instead of n*2 characters. 

Hex string to byte array 

Different case, different facts. Valid values within a hex string are the characters ranges '0'-'9', 'A'-'F' and 'a'-'f'. The existence of upper and lower case characters and the valid character ranges ASCII gaps, remove the concept of assignments by "instinct". Needless to mention that "instinct" is wrong in this case. The solution is two byte arrays with 103 values each where only 22 of them are valid (per array). As a result a pair of characters will become a byte using only 4 if conditions and 1 summation, while most implementations need 12 if conditions, 3 subtructions, 1 left shift or multiplication and 1 summation. 

The code

Source code includes comments line by line. These comments will also help you to easily adapt the code to a safe version that will be faster than any other prêt à porter code. 

using System;
namespace DRDigit
{
    // class is sealed and not static in my personal complete version
    public unsafe sealed partial class Fast
    {
        #region from/to hex
        // assigned int values for bytes (0-255)
        static readonly int[] toHexTable = new int[] {
            3145776, 3211312, 3276848, 3342384, 3407920, 3473456, 3538992, 3604528, 3670064, 3735600,
            4259888, 4325424, 4390960, 4456496, 4522032, 4587568, 3145777, 3211313, 3276849, 3342385,
            3407921, 3473457, 3538993, 3604529, 3670065, 3735601, 4259889, 4325425, 4390961, 4456497,
            4522033, 4587569, 3145778, 3211314, 3276850, 3342386, 3407922, 3473458, 3538994, 3604530,
            3670066, 3735602, 4259890, 4325426, 4390962, 4456498, 4522034, 4587570, 3145779, 3211315,
            3276851, 3342387, 3407923, 3473459, 3538995, 3604531, 3670067, 3735603, 4259891, 4325427,
            4390963, 4456499, 4522035, 4587571, 3145780, 3211316, 3276852, 3342388, 3407924, 3473460,
            3538996, 3604532, 3670068, 3735604, 4259892, 4325428, 4390964, 4456500, 4522036, 4587572,
            3145781, 3211317, 3276853, 3342389, 3407925, 3473461, 3538997, 3604533, 3670069, 3735605,
            4259893, 4325429, 4390965, 4456501, 4522037, 4587573, 3145782, 3211318, 3276854, 3342390,
            3407926, 3473462, 3538998, 3604534, 3670070, 3735606, 4259894, 4325430, 4390966, 4456502,
            4522038, 4587574, 3145783, 3211319, 3276855, 3342391, 3407927, 3473463, 3538999, 3604535,
            3670071, 3735607, 4259895, 4325431, 4390967, 4456503, 4522039, 4587575, 3145784, 3211320,
            3276856, 3342392, 3407928, 3473464, 3539000, 3604536, 3670072, 3735608, 4259896, 4325432,
            4390968, 4456504, 4522040, 4587576, 3145785, 3211321, 3276857, 3342393, 3407929, 3473465,
            3539001, 3604537, 3670073, 3735609, 4259897, 4325433, 4390969, 4456505, 4522041, 4587577,
            3145793, 3211329, 3276865, 3342401, 3407937, 3473473, 3539009, 3604545, 3670081, 3735617,
            4259905, 4325441, 4390977, 4456513, 4522049, 4587585, 3145794, 3211330, 3276866, 3342402,
            3407938, 3473474, 3539010, 3604546, 3670082, 3735618, 4259906, 4325442, 4390978, 4456514,
            4522050, 4587586, 3145795, 3211331, 3276867, 3342403, 3407939, 3473475, 3539011, 3604547,
            3670083, 3735619, 4259907, 4325443, 4390979, 4456515, 4522051, 4587587, 3145796, 3211332,
            3276868, 3342404, 3407940, 3473476, 3539012, 3604548, 3670084, 3735620, 4259908, 4325444,
            4390980, 4456516, 4522052, 4587588, 3145797, 3211333, 3276869, 3342405, 3407941, 3473477,
            3539013, 3604549, 3670085, 3735621, 4259909, 4325445, 4390981, 4456517, 4522053, 4587589,
            3145798, 3211334, 3276870, 3342406, 3407942, 3473478, 3539014, 3604550, 3670086, 3735622,
            4259910, 4325446, 4390982, 4456518, 4522054, 4587590
        };

        public static string ToHexString(byte[] source)
        {
            return ToHexString(source, false);
        }
        
        // hexIndicator: use prefix ("0x") or not
        public static string ToHexString(byte[] source, bool hexIndicator)
        {
            // freeze toHexTable position in memory
            fixed (int* hexRef = toHexTable)
            // freeze source position in memory
            fixed (byte* sourceRef = source)
            {
                // take first parsing position of source - allow inline pointer positioning
                byte* s = sourceRef;
                // calculate result length
                int resultLen = (source.Length << 1);
                // use prefix ("Ox")
                if (hexIndicator)
                    // adapt result length
                    resultLen += 2;
                // initialize result string with any character expect '\0'
                string result = new string(' ', resultLen);
                // take the first character address of result
                fixed (char* resultRef = result)
                {
                    // pairs of characters explain the endianess of toHexTable
                    // move on by pairs of characters (2 x 2 bytes) - allow inline pointer positioning
                    int* pair = (int*)resultRef;
                    // use prefix ("Ox") ?
                    if (hexIndicator)
                        // set first pair value
                        *pair++ = 7864368;
                    // more to go
                    while (*pair != 0)
                        // set the value of the current pair and move to next pair and source byte
                        *pair++ = hexRef[*s++];
                    return result;
                }
            }
        }
        
        // values for '\0' to 'f' where 255 indicates invalid input character
        // starting from '\0' and not from '0' costs 48 bytes
        // but results 0 subtructions and less if conditions
        static readonly byte[] fromHexTable = new byte[] {
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 0, 1,
            2, 3, 4, 5, 6, 7, 8, 9, 255, 255,
            255, 255, 255, 255, 255, 10, 11, 12, 13, 14, 
            15, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 10, 11, 12,
            13, 14, 15
        };

        // same as above but valid values are multiplied by 16
        static readonly byte[] fromHexTable16 = new byte[] {
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 0, 16,
            32, 48, 64, 80, 96, 112, 128, 144, 255, 255,
            255, 255, 255, 255, 255, 160, 176, 192, 208, 224, 
            240, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
            255, 255, 255, 255, 255, 255, 255, 160, 176, 192,
            208, 224, 240
        };

        public static byte[] FromHexString(string source)
        {
            // return an empty array in case of null or empty source
            if (string.IsNullOrEmpty(source))
                return new byte[0]; // you may change it to return null
            if (source.Length % 2 == 1) // source length must be even
                throw new ArgumentException();
            int
                index = 0, // start position for parsing source
                len = source.Length >> 1; // initial length of result
            // take the first character address of source
            fixed (char* sourceRef = source)
            {
                if (*(int*)sourceRef == 7864368) // source starts with "0x"
                {
                    if (source.Length == 2) // source must not be just a "0x")
                        throw new ArgumentException();
                    index += 2; // start position (bypass "0x")
                    len -= 1; // result length (exclude "0x")
                }
                byte add = 0; // keeps a fromHexTable value
                byte[] result = new byte[len]; // initialization of result for known length
                // freeze fromHexTable16 position in memory
                fixed (byte* hiRef = fromHexTable16)
                // freeze fromHexTable position in memory
                fixed (byte* lowRef = fromHexTable)
                // take the first byte address of result
                fixed (byte* resultRef = result)
                {
                    // take first parsing position of source - allow inremental memory position
                    char* s = (char*)&sourceRef[index];
                    // take first byte position of result - allow incremental memory position
                    byte* r = resultRef;
                    // source has more characters to parse
                    while (*s != 0)
                    {
                        // check for non valid characters in pairs
                        // you may split it if you don't like its readbility
                        if (
                            // check for character > 'f'
                            *s > 102 ||
                            // assign source value to current result position and increment source position
                            // and check if is a valid character
                            (*r = hiRef[*s++]) == 255 ||
                            // check for character > 'f'
                            *s > 102 ||
                            // assign source value to "add" parameter and increment source position
                            // and check if is a valid character
                            (add = lowRef[*s++]) == 255
                            )
                            throw new ArgumentException();
                        // set final value of current result byte and move pointer to next byte
                        *r++ += add;
                    }
                    return result;
                }
            }
        }
        #endregion
    }
}

Performance  

CLR provides a method for generating a hex string from a byte array that I’ve met in many sources:

string hex = BitConverter.ToString(myByteArray).Replace("-", "");

This is probably the worst choice performance wise. Anyway; my implementation is more than 10 times (10x or 1000%) faster and consumes 5 times less memory.

I was looking for a very recent published implementation of hex string to byte array and I found this one at MSDN blogs. I know that this is not the best implementation ever, but it’s one of the best samples for what I've written above. This time, my implementation is about 5 times (5x or 500%) faster.

The code (console app) I used for performance testing is the following:

// a byte array from a file (downloaded from http://hystad.com/words)
byte[] exp = File.ReadAllBytes(@"F:\words.txt");
System.Diagnostics.Stopwatch clock;
long memory = 0;
 
// repeat the test 10 times (I actually repeat it 1000 times)
for (int n = 0; n < 10; n++)
{
	// test for my implementation of toHexString
	clock = Stopwatch.StartNew(); // syntax compatible with older CLR versions
	memory = GC.GetTotalMemory(true);
	string s1 = DRDigit.Fast.ToHexString(exp, false);
	clock.Stop();
	memory = GC.GetTotalMemory(false) - memory;
	Console.Write("{0} [{1}] vs ", clock.Elapsed.TotalMilliseconds, memory);
 
	// test for bit converter
	clock = Stopwatch.StartNew();
	m = GC.GetTotalMemory(true);
	string s2 = BitConverter.ToString(exp).Replace("-", "");
	clock.Stop();
	memory = GC.GetTotalMemory(false) - memory;
	Console.WriteLine("{0} [{1}] -> {2} ", 
	        clock.Elapsed.TotalMilliseconds, memory, s1 == s2);


	// test for my implementation of fromHexString
	clock = Stopwatch.StartNew();
	byte[] b1 = DRDigit.Fast.FromHexString(s1);
	clock.Stop();
	Console.Write("fromHex: {0} vs ", clock.Elapsed.TotalMilliseconds);

	// test for MSDN blog peeked implementation
	clock = Stopwatch.StartNew();
	byte[] b2 = Utils.ConvertToByteArray(s1);
	clock.Stop(); Console.WriteLine(clock.Elapsed.TotalMilliseconds);

	Console.WriteLine("");	
}
Console.ReadLine();

These are the results of running the above code in release mode:

toHex
memory consumption
toHex
execution time in ms
fromHex
execution time in ms
toHexStringBitConverter toHexStringBitConverter fromHexStringMSDN blog peeked
implementation
9,956,99239,795,168 53.5169266.2852 22.757392.5146
9,948,80039,795,168 14.0968254.9983 18.237586.3444
9,948,80039,795,168 21.0198274.6193 18.1744110.9573
9,948,80039,795,168 21.1050275.9161 18.1831122.8697
9,948,80039,795,168 19.6050244.4094 16.7560 110.7480 
9,948,80039,795,168 21.0470275.6667 18.245290.8027
9,948,80039,795,168 20.9721275.5815 18.2375110.6746
9,948,80039,795,168 20.8356276.2732 18.1651122.4027
9,948,80039,795,168 21.2687273.4108 18.188790.1535
9,948,80039,795,168 15.5049254.4780 18.171886.2561

I'm not so sure that I'm allowed to paste the code of BitConverter.ToString as a result of decompiling using reflector. What I can do is to describe what it does affecting the performance. 

Assume a byte array filled with 100 zeros. This is translated to 100 divisions (0 / 16), 100 modulos (0 % 16) and 200 conversions of 0 to '0' requiring 200 if conditions (0 < 10) and 200 summations (0 + 48 to generate char '0').

Using the code  

This is what you expected, isn't it?

// byte array to hex string with "0x" prefix 
string myHexString = DRDigit.Fast.ToHexString(myByteArray, true);  
 
// byte array to hex string without "0x" prefix 
string myHexString = DRDigit.Fast.ToHexString(myByteArray);  
 
// hex string to byte array
byte[] myByteArray = DRDigit.Fast.FromHexString(myHexString);  

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here