Introduction
Recently I worked on a project that required exporting some reports
containing various file statistics. The goal was having one format that would
look good immediately and one easy accessible format for the user to import and
build custom reports. For the first export type, plain HTML was used. But for
importing, HTML is not really easy to use, so Excel file creation seemed a good
choice.
Requirements
Since the Excel exported file is just for the user to have easy access to
data, only a small part of the Excel file format features were needed. Also,
there was no guarantee that Excel would even be installed on the computer where
the reports are generated and since a port to Linux was imminent, it would
be nice for the class to be platform independent. The goal was to be able to
write something as this:
CMiniExcel miniexcel;
miniexcel(0,0) = "FileName";
miniexcel(0,1) = "Size (MB)";
miniexcel(1,0) = "c:\\bigfile.txt";
miniexcel(1,1) = 123.45;
miniexcel.SaveAs("Report.xls")
There are no real memory constraints, since Excel cannot handle over ~65K
rows anyway. (Actually we also needed a single big report with more that 65K
rows, and we had to save it as a "Comma separated values" file). So
the whole excel worksheet could be kept in memory.
Excel file format
After a little bit of searching on the net, I decided on the Excel 2.1
Workbook format (description of the format found
here). Even
though this format is old, it can be imported by Excel or by other programs and
it seemed suitable for our simple needs.
From the documentation of the Excel 2.1 format it turns out that an excel
document is stored using a format called Binary File Format (BIFF). This format
is a sequence of BIFF records, each record containing a certain property/value
of the workgroup. Each BIFF record starts with a 2 byte number containing it's type
and another 2 bytes for the size of the record. After this header,
record-specific data will follow:
XX XX |
XX XX |
.... |
Type |
Length |
Specific Data |
From the several record types needed, only a few were really important for
simple excel exporting. BOF and EOF records were unavoidable and also text
strings (LABEL record) and numbers (NUMBER record) needed to be saved.
Implementation
First of all, to take care of the platform independence and also to make life
easier when saving the BIFF records, I implemented a writer class (CLittleEndianWriter
)
that should be able to write 1 byte, 2 byte and 4 byte values in little endian
byte order. Also, it needed to be able to save double values in IEEE format,
since this is the format used to store real numbers (excel has a BIFF record fr
storing a
16 bit unsigned integer, but it is not enough):
class LittleEndianWriter{
...
public:
...
void Write1 (char v);
void Write2 (int v);
void Write4 (long v);
void WriteFloatIEEE (float v);
void WriteDoubleIEEE (double v);
};
After this, it also seems pretty clear that since every single record is
saved in a biff file format, it would be nice have a abstract BIFFRecord
class and that every single record should be derived from it.
class BIFFRecord{
...
void Write (LittleEndianWriter *pWriter, int nRecNo, int nRecLen);
public:
virtual void Write (LittleEndianWriter *pWriter) = 0;
};
This being done, implementing excel file saving is only adding the required
BIFF records. I added in this demo project BOF, EOF, NUMBER and LABEL (others
should be easy to add). We also need a generic container class, so that Excel
file creation is easy. The container class is needed because BIFF records have
to be in a specific order in the XLS file and it's nice if the container class
does this for us:
class CMiniExcel{
...
public:
...
ExcelCell &operator() (unsigned row, unsigned column);
void Write (FILE *dest);
};
Note that the operator ()
returns an ExcelCell
.
This is just a simple container for either numbers or strings and it makes like
a lot easier when coding by not having to worry about cell types in the Excel
Workbook.
Final notes
This is by no means a complete project. The saving is basic and error
checking is almost none existing. Adding other BIFF records to the generic
format should be pretty easy and if somebody finds it useful, I might add a few
more into this demo source code.