Introduction
StringBuilders are an excellent way of concatenating strings without the memory overhead involved in repeated construction and allocation of a number of intermediate string values:
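string result = "";
foreach (string s in myListOfStrings)
{
    // The brackets matter: + binds tighter than ?:, so without them
    // s would be dropped whenever the test is true.
    result += (s.StartsWith("A") ? "," : ";") + s;
}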
This may look fine, but each time round the loop a new string is created, and the existing content is copied into it before the new data is added: strings are immutable in .NET, remember?
StringBuilder sb = new StringBuilder();
foreach (string s in myListOfStrings)
{
    sb.Append(s.StartsWith("A") ? "," : ";");
    sb.Append(s);
}
string result = sb.ToString();
This looks more complex, but it doesn't allocate any more memory than it needs to: the StringBuilder expands when it is full, and the data is only copied into the new (larger) space at that point. This is a lot more efficient in terms of memory, and a whole load more efficient in processing time.
But there isn't an obvious class that does this for bytes, is there?
Background
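Well, yes, there is a class which does this for bytes, but you might not have thought of it: MemoryStream. In current .NET implementations, a default-constructed MemoryStream starts with no buffer, allocates at least 256 bytes on the first write, and then expands in much the same way as a StringBuilder does. But... it's not quite as obvious to use: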
MemoryStream ms = new MemoryStream();
foreach (byte[] b in myListOfByteArrays)
{
    ms.Write(b, 0, b.Length);
}
byte[] result = new byte[ms.Length];
Array.Copy(ms.GetBuffer(), result, ms.Length);
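If you want to see that expansion happening, the sketch below writes bytes one at a time and records every value the Capacity property takes. It is purely illustrative: MemoryStreamGrowthDemo and ObserveCapacities are made-up names, not part of ByteArrayBuilder. The exact numbers are an implementation detail and may differ between .NET versions, so treat the result as something to observe rather than rely on.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative helper only; the class and method names are assumptions.
public static class MemoryStreamGrowthDemo
{
    // Writes 'count' single bytes and returns each distinct Capacity value seen,
    // starting with the capacity before anything has been written.
    public static List<int> ObserveCapacities(int count)
    {
        List<int> capacities = new List<int>();
        using (MemoryStream ms = new MemoryStream())
        {
            capacities.Add(ms.Capacity);
            for (int i = 0; i < count; i++)
            {
                ms.WriteByte((byte)(i & 0xFF));
                if (ms.Capacity != capacities[capacities.Count - 1])
                {
                    // The internal buffer was reallocated to a larger size.
                    capacities.Add(ms.Capacity);
                }
            }
        }
        return capacities;
    }
}
```

Printing the list returned by MemoryStreamGrowthDemo.ObserveCapacities(1000) should show far fewer reallocations than writes, which is the whole point of letting the stream manage its own buffer.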
What we want is more like this:
ByteArrayBuilder bab = new ByteArrayBuilder();
foreach (byte[] b in myListOfByteArrays)
{
    bab.Append(b);
}
byte[] result = bab.ToArray();
This document presents a class designed to do just that, with some useful features added.
(In fact, that code won't do exactly what you want with the class I present here; you need a slight change to how you use it:
ByteArrayBuilder bab = new ByteArrayBuilder();
foreach (byte[] b in myListOfByteArrays)
{
    bab.Append(b, false);
}
byte[] result = bab.ToArray();
I will explain why later.)
The ByteArrayBuilder class encapsulates a MemoryStream and provides a number of methods to add and extract data from it.
Using the Code
Firstly, do note that unlike StringBuilder, ByteArrayBuilder implements IDisposable (because it contains a MemoryStream, which also implements it, and it is a good idea to dispose of large objects when you are finished with them anyway), so it is a good idea to wrap its use in a using block:
byte[] result;
using (ByteArrayBuilder bab = new ByteArrayBuilder())
{
    foreach (byte[] b in myListOfByteArrays)
    {
        bab.Append(b, false);
    }
    result = bab.ToArray();
}
There are three constructors provided:
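public ByteArrayBuilder()
{
}
public ByteArrayBuilder(byte[] data)
{
    // Replace the default empty store with one wrapping the supplied data.
    store.Close();
    store.Dispose();
    store = new MemoryStream(data);
}
public ByteArrayBuilder(string base64)
{
    // Decode the Base64 text, then wrap the resulting bytes in the same way.
    store.Close();
    store.Dispose();
    store = new MemoryStream(Convert.FromBase64String(base64));
}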
The parameterless version creates a new, empty ByteArrayBuilder ready for you to add fresh data. The other two constructors allow you to recreate an instance from data produced by an earlier one, either in the form of a raw byte array or as a Base64 string (which can be transferred more easily than raw binary data over text-based transfer mechanisms). You can obtain the data to feed these constructors from the ToArray and ToString methods:
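public byte[] ToArray()
{
    // MemoryStream.ToArray copies exactly the bytes written so far. Unlike
    // GetBuffer, it also works when the store wraps an array passed to the
    // constructor, where GetBuffer throws UnauthorizedAccessException.
    return store.ToArray();
}
public override string ToString()
{
    return Convert.ToBase64String(ToArray());
}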
This allows you, for example, to create complex binary save files and load them later to recover the data (yes, I know you can use a BinaryFormatter, but there are times when that isn't appropriate or even possible).
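The Base64 route relies on nothing more than the framework's Convert class, so the round trip can be sketched without ByteArrayBuilder at all. The Base64RoundTrip class and its method names below are mine, for illustration only; they mirror what ToString and the string constructor do with the stored bytes.

```csharp
using System;

// Illustrative helper only; the class and method names are assumptions.
public static class Base64RoundTrip
{
    // Turns raw bytes into text that is safe to send over text-based channels.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recovers the original bytes from the Base64 text.
    public static byte[] Decode(string base64)
    {
        return Convert.FromBase64String(base64);
    }
}
```

Calling Base64RoundTrip.Decode(Base64RoundTrip.Encode(data)) gives back an array with the same content as data. The price is size: Base64 text is roughly a third larger than the bytes it represents.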
The ByteArrayBuilder.Append method has a number of overloads to allow you to add various data types to the store:
public void Append(bool b)
public void Append(byte b)
public void Append(byte[] b, bool addLength = true)
public void Append(char c)
public void Append(char[] c, bool addLength = true)
public void Append(DateTime dt)
public void Append(decimal d)
public void Append(double d)
public void Append(float f)
public void Append(Guid g)
public void Append(int i)
public void Append(long l)
public void Append(short i)
public void Append(string s, bool addLength = true)
public void Append(uint ui)
public void Append(ulong ul)
public void Append(ushort us)
Note that the variable-length versions all have an optional bool parameter. This defaults to true, and inserts the length of the value before the actual data, which allows the value to be extracted in the same form as it went in. Setting it to false writes just the raw bytes, with no length prefix. This allows for easy transfer to other equipment where the fields have fixed lengths and the length information is not necessary.
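To make the effect of that flag concrete, here is a minimal sketch of how a length-prefixed append could be written over a MemoryStream. It is not the class's actual implementation: LengthPrefixSketch and AppendBytes are hypothetical names, and the prefix format (a 4-byte int in the machine's byte order, via BitConverter) is my assumption.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not the ByteArrayBuilder source.
public static class LengthPrefixSketch
{
    // Writes data to the store, optionally preceded by its length.
    public static void AppendBytes(MemoryStream store, byte[] data, bool addLength = true)
    {
        if (addLength)
        {
            // Assumed format: a 4-byte int holding the number of bytes that follow.
            byte[] prefix = BitConverter.GetBytes(data.Length);
            store.Write(prefix, 0, prefix.Length);
        }
        store.Write(data, 0, data.Length);
    }
}
```

With this sketch, appending a three-byte array adds seven bytes to the stream when addLength is true and three when it is false, which is why the raw form suits fixed-layout records.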
The addition of the lengths is to allow the data to be extracted from the store as it went in. Again, there are a number of methods to do this:
public bool GetBool()
public byte GetByte()
public byte[] GetByteArray()
public char GetChar()
public char[] GetCharArray()
public DateTime GetDateTime()
public decimal GetDecimal()
public double GetDouble()
public float GetFloat()
public Guid GetGuid()
public int GetInt()
public long GetLong()
public short GetShort()
public string GetString()
public uint GetUint()
public ulong GetUlong()
public ushort GetUshort()
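Reading a variable-length value back is the mirror image of writing it: read the length first, then read that many bytes. The sketch below shows the idea over a bare MemoryStream. As before, it is an illustration under assumptions rather than the real GetByteArray: LengthPrefixReader, ReadBytes and ReadExactly are hypothetical names, and the 4-byte int prefix is assumed.

```csharp
using System;
using System.IO;

// Hypothetical sketch; not the ByteArrayBuilder source.
public static class LengthPrefixReader
{
    // Reads a 4-byte length (assumed format), then that many bytes of data.
    public static byte[] ReadBytes(MemoryStream store)
    {
        byte[] prefix = ReadExactly(store, sizeof(int));
        int length = BitConverter.ToInt32(prefix, 0);
        if (length < 0)
        {
            throw new InvalidDataException("Negative length prefix.");
        }
        return ReadExactly(store, length);
    }

    // Reads exactly 'count' bytes from the current position, or throws.
    private static byte[] ReadExactly(MemoryStream store, int count)
    {
        byte[] buffer = new byte[count];
        int read = store.Read(buffer, 0, count);
        if (read != count)
        {
            throw new EndOfStreamException("Not enough data left in the store.");
        }
        return buffer;
    }
}
```

Because each read advances the stream position, values must be extracted in the same order they were appended, which is exactly how the Get methods above are meant to be used.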
Example
Suppose you have a card index system, and each Card consists of a hierarchical list of lines. In this case, both the Card and CardLine data have independent version information so that future versions can read older data.
The Card class's ToBytes method may need to save a variety of information:
public byte[] ToBytes()
{
    using (ByteArrayBuilder bab = new ByteArrayBuilder())
    {
        bab.Append(cardDataVersion);
        bab.Append(Guid);
        bab.Append(Text);
        bab.Append(Created);
        bab.Append(Modified);
        bab.Append(_Lines.Count);
        foreach (CardLine child in _Lines)
        {
            bab.Append(child.ToBytes());
        }
        return bab.ToArray();
    }
}
The CardLine.ToBytes method is similar:
public byte[] ToBytes()
{
    using (ByteArrayBuilder bab = new ByteArrayBuilder())
    {
        bab.Append(lineDataVersion);
        bab.Append(Guid);
        bab.Append(HasPrompt);
        bab.Append(Prompt);
        bab.Append(HasValue);
        bab.Append(Value);
        bab.Append(Created);
        bab.Append(Modified);
        bab.Append(_Children.Count);
        foreach (CardLine child in _Children)
        {
            bab.Append(child.ToBytes());
        }
        return bab.ToArray();
    }
}
Then, when you want to load the card back again:
public Card(byte[] cardData)
{
    using (ByteArrayBuilder bab = new ByteArrayBuilder(cardData))
    {
        short dataVersion = bab.GetShort();
        if (dataVersion > cardDataVersion)
        {
            throw new ApplicationException("Cannot open data: it was saved with a more advanced version of this program\nExpected version " +
                cardDataVersion +
                ", but found " +
                dataVersion);
        }
        LoadData(bab, dataVersion);
    }
}
private void LoadData(ByteArrayBuilder bab, short dataVersion)
{
    if (dataVersion == 1) LoadDataVersion1(bab);
    else
    {
        throw new ApplicationException("Cannot open data: it was saved with an unknown version of this program\nExpected version " +
            cardDataVersion +
            ", but found " +
            dataVersion);
    }
}
private void LoadDataVersion1(ByteArrayBuilder bab)
{
    Guid = bab.GetGuid();
    Text = bab.GetString();
    Created = bab.GetDateTime();
    Modified = bab.GetDateTime();
    int childrenCount = bab.GetInt();
    _Lines = new List<CardLine>();
    while (childrenCount-- > 0)
    {
        _Lines.Add(new CardLine(bab.GetByteArray()));
    }
}
The CardLine code looks very similar:
public CardLine(byte[] lineData)
{
    using (ByteArrayBuilder bab = new ByteArrayBuilder(lineData))
    {
        short dataVersion = bab.GetShort();
        if (dataVersion > lineDataVersion)
        {
            throw new ApplicationException("Cannot open data: it was saved with a more advanced version of this program\nExpected version " +
                lineDataVersion +
                ", but found " +
                dataVersion);
        }
        LoadData(bab, dataVersion);
    }
}
private void LoadData(ByteArrayBuilder bab, short dataVersion)
{
    if (dataVersion == 1) LoadDataVersion1(bab);
    else
    {
        throw new ApplicationException("Cannot open data: it was saved with an unknown version of this program\nExpected version " +
            lineDataVersion +
            ", but found " +
            dataVersion);
    }
}
private void LoadDataVersion1(ByteArrayBuilder bab)
{
    Guid = bab.GetGuid();
    HasPrompt = bab.GetBool();
    Prompt = bab.GetString();
    HasValue = bab.GetBool();
    Value = bab.GetString();
    Created = bab.GetDateTime();
    Modified = bab.GetDateTime();
    int childrenCount = bab.GetInt();
    _Children = new List<CardLine>();
    while (childrenCount-- > 0)
    {
        _Children.Add(new CardLine(bab.GetByteArray()));
    }
}
History
- 26th October, 2013: Original version