Introduction
Reading binary data is not something you do every day. However, there are times when you need to read the raw bytes from a file and then parse them.
This article aims to provide the reader with an insight into one way this can be done.
Background
Over the past few years, I have been asked to develop systems that parse binary data. Examples include interfacing with a Rail Signaling System to provide passenger information, and I am now developing traffic volume reporting software that uses historical information from binary flat-files.
Both Rail Signaling Systems and Traffic Management Systems use binary files as a means of passing information to track switching controllers, actuated boom gates and traffic signals. They do this because one byte of data can hold up to eight separate on/off values, one per bit. Packing values into bits and bytes lets the programmer compact the information being sent to the controllers, making the messages much more efficient.
So, for example, if you want to send information to a fictitious controller whose purpose is to switch on for 3 seconds and then switch off, you may have a record format as follows:
STX: 1 byte | Controller ID: 2 bytes | Action: 1 byte | ETX: 1 byte
Within the Action byte, you may have something like this:
Bit 7: Controller On|Off, Bit 6: Seconds|Minutes, Bits 5-0: Time period for action
Note that using bits 0-5 for the time period means the action can last for at most 63 time periods (2^6 - 1). Bit 6 determines whether those periods are seconds or minutes.
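The bit layout above can be unpacked with a few masks. The following is a minimal sketch in standard C++ (not the Managed C++ used later in this article). The names Action, decode_action and the mask constants are hypothetical, and it assumes that a set bit 7 means On and a set bit 6 means Seconds.

```cpp
#include <cstdint>

// Hypothetical names; the bit layout is the one described in the text.
struct Action {
    bool on;              // bit 7: assumed 1 = controller on, 0 = off
    bool seconds;         // bit 6: assumed 1 = seconds, 0 = minutes
    std::uint8_t period;  // bits 5-0: time period, 0..63
};

constexpr std::uint8_t kOnMask      = 0x80;  // 1000 0000
constexpr std::uint8_t kSecondsMask = 0x40;  // 0100 0000
constexpr std::uint8_t kPeriodMask  = 0x3F;  // 0011 1111

// Split one Action byte into its three fields.
inline Action decode_action(std::uint8_t b) {
    return Action{ (b & kOnMask) != 0,
                   (b & kSecondsMask) != 0,
                   static_cast<std::uint8_t>(b & kPeriodMask) };
}
```

Masking with 0x3F keeps only the low six bits, which is why the time period can never exceed 63.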
So, seeing as we want the controller on for 3 seconds, our Action byte will look as follows:
Action = 1100 0011, or 0x0c3 in hex
NOTE: The leading zero after the 'x' is not necessary (0xc3 is the same value), but I like to use it for clarity. It separates the 'x' from the letters within the hex digits, which makes the value easier for me to read.
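Going the other way, the Action byte and the complete five-byte record can be assembled with shifts and ORs. This is a sketch in standard C++ under stated assumptions: the function names are hypothetical, STX and ETX are taken to be the ASCII values 0x02 and 0x03, and the two-byte Controller ID is sent high byte first. The article does not specify any of these, so check them against the real specification.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint8_t STX = 0x02;  // ASCII start-of-text (assumed framing value)
constexpr std::uint8_t ETX = 0x03;  // ASCII end-of-text (assumed framing value)

// Pack the three Action fields into one byte.
// The period is masked to six bits, so values above 63 wrap around.
constexpr std::uint8_t encode_action(bool on, bool seconds, unsigned period) {
    return static_cast<std::uint8_t>((on ? 0x80 : 0x00) |
                                     (seconds ? 0x40 : 0x00) |
                                     (period & 0x3F));
}

// Build the record: STX | Controller ID (2 bytes) | Action | ETX.
inline std::array<std::uint8_t, 5> build_record(std::uint16_t controller_id,
                                                std::uint8_t action) {
    return { STX,
             static_cast<std::uint8_t>(controller_id >> 8),    // high byte first (assumed)
             static_cast<std::uint8_t>(controller_id & 0xFF),  // low byte
             action,
             ETX };
}
```

With these assumptions, encode_action(true, true, 3) yields 0xC3, matching the hand-worked value above.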
We are now ready to start looking at what can be done to read such a file.
Using the code
In this section, we will learn how to open a binary file, read a byte from it, do some processing, and then close the file.
So, let's start with some code, and then we can explain it from there:
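// Assumes: using namespace System; using namespace System::IO;
// proc() is your own function that handles one byte.
void proc_file(String *filename)
{
    // Open the file for reading and wrap the stream in a BinaryReader.
    BinaryReader *br = new BinaryReader(File::OpenRead(filename));
    unsigned int c;
    try
    {
        while(true)
        {
            // ReadByte() throws EndOfStreamException at the end of the file.
            c = (unsigned int)br->ReadByte();
            proc(c);
        }
    }
    catch(EndOfStreamException *e)
    {
        // End of file reached: close the reader and its underlying stream.
        br->Close();
    }
}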
We begin by opening the file using the BinaryReader class. We then set up a try/catch block containing a loop that never ends on its own. Once the loop attempts to read past the end of the file, ReadByte() fails and throws an EndOfStreamException, and the catch block closes the reader. The proc(c) function simply processes the byte as per the specification you are working to.
BinaryReader has many more read methods, including ReadBoolean() and ReadChar(). It is a class worth exploring based on the work you're attempting to complete.
Points of Interest
The most interesting aspect of this is that I was originally using StreamReader to open the file, and then StreamReader::ReadToEnd() to collect all the data into a single String* variable. However, it kept 'skipping' data. I eventually realized that it was the BS (^H/0x08) and LF (^J/0x0a) characters that were missing, and I put that down to the String* variable 'acting' on those characters. StreamReader is built for text and decodes the bytes it reads into characters, so it is not a safe way to read raw binary data.
I then attempted to use good ol' C, but the .NET environment was not happy about me doing this, and told me so in no uncertain terms. However, congratulations to the designers of .NET: they realized that we were going to need this feature, even if only rarely, and included it through the BinaryReader/BinaryWriter classes.
So, my conclusion is that there is normally a .NET way of doing things, although finding it is sometimes a battle. As far as reading and writing binary data is concerned, BinaryReader and BinaryWriter are the classes you need.