Introduction
CSV is sometimes useful for storing cyclic data, or for easily exchanging data with office applications like Excel or Calc. Creating or parsing CSV is not the hardest job to do, but nobody wants to spend time on it.
With this class you can add CSV features to your project in a few lines of code!
Background
We all agree that if you want to store or share data in text format, CSV is not the best choice compared to named-field formats like XML or JSON. But CSV is not completely dead and still has some points of interest:
Pros
- It remains more "human readable": without field names and markup characters, a CSV file is much easier to figure out than an XML file. In addition, its table-like layout also eases reading: a row is a row.
- It is easy to code: even without this little class, you can do it yourself with the standard C/C++ library, using printf-style formatting and string concatenation.
- It remains light: it will never be as light as binary, but it is still lighter than XML or JSON, which carry field names and markup characters that are often heavier than the data themselves.
- It is understood by all tabular applications: applications like Excel or Calc can read and write this format, which means you can easily exchange data with them.
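To illustrate the "easy to code" point, here is a minimal sketch that builds one CSV row with nothing more than snprintf and string concatenation. The MakeRow function and its parameters are hypothetical names chosen for this example; they are not part of the classes presented in this article.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds one CSV row from a label, a real and an integer.
std::string MakeRow(const std::string& label, double value, int count, char separator)
{
    char number[64];
    std::string row = label;

    // Format the real value with two decimals and append it as the second field.
    std::snprintf(number, sizeof(number), "%.2f", value);
    row += separator;
    row += number;

    // Format the integer zero-padded to five digits and append it as the third field.
    std::snprintf(number, sizeof(number), "%05d", count);
    row += separator;
    row += number;

    row += '\n';
    return row;
}
```

Note that snprintf formats numbers according to the current C locale, which is where the locale issue discussed in this article comes from.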
Cons
- There are no "named fields": this seems obvious, but without named fields you have to lay out your data in the same order at the source and at the target, or you will mess up your data. This is exactly the issue you can meet with binary files.
- It is locale dependent: using a comma as separator was basically a good idea. Unfortunately, the comma is the radix point in some other languages, like mine, meaning you have to change the separator (a semicolon for my language), which creates an incompatibility between languages. For example, CSV files generated on a US computer won't be compatible with a French one.
Using the code
The CSV classes introduced here consist of two major classes and one helper:
- CCSVTable, which contains and manages the whole CSV document.
- CCSVRow, which manages a row in the CSV document.
- CLocalStack, a helper used internally to stack locales; you won't have to use it in your own code.
The class diagram hereafter gives an overview of all the classes and their methods.
All code is written inline. This means that you won't have to add any code files to your project. Just include CSVTable.h at the top of your code.
#include "CSVTable.h"
Here is a simple way to create CSV data:
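CCSVTable table;
CCSVRow* pRow;
double real(1.23);
int integer(123);
TCHAR* string = _T("My string");

// Add a row, keep a pointer to it, then append a formatted field.
pRow = table.AddRow()->AddField(_T("First row"));
pRow->AddField(_T("real data: %f"), real);

// Calls can be chained to fill a whole row in one statement.
table.AddRow()->AddField(_T("Second row"))->AddField(_T("%05d"), integer)->AddField(string);

// Add several fields at once; the list ends with NULL.
pRow = table.AddRow();
pRow->AddFields(_T("Column01"), _T("Column02"), _T("Column03"), _T("Column04"), NULL);

// Rows can be reached by index, with operator[] or GetAt().
table[0]->AddField(_T("integer data: %d"), integer);
pRow = table.GetAt(0);

// Export to a file, to the clipboard, or to a string.
table.SaveToFile(_T("C:\\test.csv"));
table.CopyToClipboard();
TCHAR* szOutput = table.PublishCSV();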
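If you wonder how the chained AddRow()->AddField() calls can work, here is a simplified sketch of the idea: each method returns a pointer to the row so that the next call can be appended. SketchTable and SketchRow are hypothetical stand-ins written with the standard library only; they are not the real CCSVTable and CCSVRow, whose implementation may well differ.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a row class; not the article's CCSVRow.
class SketchRow
{
public:
    // Formats one field printf-style and returns this row so calls can be chained.
    SketchRow* AddField(const char* format, ...)
    {
        char buffer[256];
        va_list args;
        va_start(args, format);
        std::vsnprintf(buffer, sizeof(buffer), format, args);
        va_end(args);
        m_fields.push_back(buffer);
        return this;
    }

    // Joins all fields of the row with the given separator.
    std::string Publish(char separator) const
    {
        std::string line;
        for (std::size_t i = 0; i < m_fields.size(); ++i)
        {
            if (i > 0)
                line += separator;
            line += m_fields[i];
        }
        return line;
    }

private:
    std::vector<std::string> m_fields;
};

// Hypothetical stand-in for a table class; not the article's CCSVTable.
class SketchTable
{
public:
    explicit SketchTable(char separator = ',') : m_separator(separator) {}

    // Appends an empty row and returns it, ready for chained AddField calls.
    SketchRow* AddRow()
    {
        m_rows.push_back(std::make_unique<SketchRow>());
        return m_rows.back().get();
    }

    // Gives access to an existing row by index (no bounds checking in this sketch).
    SketchRow* operator[](std::size_t index) { return m_rows[index].get(); }

    // Publishes every row on its own line.
    std::string Publish() const
    {
        std::string text;
        for (const auto& row : m_rows)
            text += row->Publish(m_separator) + "\n";
        return text;
    }

private:
    char m_separator;
    // Rows live on the heap so pointers returned by AddRow stay valid as the table grows.
    std::vector<std::unique_ptr<SketchRow>> m_rows;
};
```

With this sketch, table.AddRow()->AddField("First row")->AddField("%05d", 123) fills a row in a single statement, in the same spirit as the sample above.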
Of course, this class is bidirectional, meaning that you can also load your data from CSV-formatted text:
CCSVTable tableImport;
tableImport.LoadFromFile(_T("D:\\test.csv"));
tableImport.GetFromClipboard();
tableImport.LoadTableFromText(szOutput);
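To give an idea of what loading involves, here is a minimal sketch of the import direction: the text is split into lines, then each line is split on the separator. SplitLine and ParseText are hypothetical names, and the sketch deliberately ignores quoted fields, so it is probably simpler than what the real class does.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits one line into fields on the given separator.
// Simplification: quoted fields containing the separator are not handled.
std::vector<std::string> SplitLine(const std::string& line, char separator)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true)
    {
        const std::string::size_type pos = line.find(separator, start);
        if (pos == std::string::npos)
        {
            fields.push_back(line.substr(start)); // last field of the line
            break;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}

// Splits a whole CSV text into rows, then each row into fields.
std::vector<std::vector<std::string>> ParseText(const std::string& text, char separator)
{
    std::vector<std::vector<std::string>> rows;
    std::istringstream stream(text);
    std::string line;
    while (std::getline(stream, line))
    {
        if (!line.empty() && line.back() == '\r')
            line.pop_back(); // tolerate Windows line endings
        if (!line.empty())
            rows.push_back(SplitLine(line, separator)); // empty lines are skipped
    }
    return rows;
}
```

Empty fields are kept, so a line like a;b;;d yields four fields, the third one being empty.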
Managing CSV locales
The sample code above creates CSV data according to the computer's locale, which ensures interoperability between your process and other applications on the same computer.
But you may run into incompatibilities with other locales if, like me, you work on a French computer and send your file to a US partner.
If you want to manage CSV locales yourself, you can pass some additional parameters to the class constructor:
CCSVTable tableFR(';', _T("French_France"));
CCSVTable tableUS(',', _T("English_United States"));
tableFR.AddRow()->AddField(_T("%0.2f"), real)->AddField(_T("%0.2f"), real+2.0);
tableUS.AddRow()->AddField(_T("%0.2f"), real)->AddField(_T("%0.2f"), real+2.0);
output << _T("French locales will return this row: ") << tableFR.PublishCSV() << _T("\n");
output << _T("US locales will return this row: ") << tableUS.PublishCSV() << _T("\n");
As you can see, you can specify both the separator character and the locale used for numeric formatting.
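The effect of pairing a separator with a numeric format can be reproduced with the standard library alone. The sketch below is only an illustration: instead of a named locale such as "French_France", which may not be installed everywhere, it assumes a small custom facet (DecimalPoint, a hypothetical name) that overrides the decimal point, and a hypothetical PublishRow function.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Facet that overrides only the decimal point (a portable stand-in for a named locale).
class DecimalPoint : public std::numpunct<char>
{
public:
    explicit DecimalPoint(char point) : m_point(point) {}

protected:
    char do_decimal_point() const override { return m_point; }

private:
    char m_point;
};

// Formats two values as one CSV row using the given field separator and decimal point.
std::string PublishRow(double first, double second, char separator, char decimalPoint)
{
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when it is no longer used.
    out.imbue(std::locale(std::locale::classic(), new DecimalPoint(decimalPoint)));
    out << std::fixed << std::setprecision(2) << first << separator << second;
    return out.str();
}
```

Calling PublishRow(1.23, 3.23, ';', ',') gives 1,23;3,23 while PublishRow(1.23, 3.23, ',', '.') gives 1.23,3.23, which shows why the separator and the numeric locale have to be chosen together.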
A few words about the CLocaleStack class
If you dig into the code, you will notice the CLocalStack class, which stacks locales each time you call the AddField method.
The locale is set by calling setlocale. Once it has been called, the new locale applies to the whole thread, meaning that if you call AddField for French-formatted CSV, every printf-style call in your code will return French-formatted strings, which is certainly not what you want!
That's why the locale is pushed when entering AddField and popped when leaving it, so your application keeps the locale it specified.
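The push/pop idea can be sketched as a small guard object that saves the current locale in its constructor and restores it in its destructor. ScopedLocale and CurrentNumericLocale are hypothetical names used for this illustration only; the helper shipped with the code may be written differently.

```cpp
#include <clocale>
#include <string>

// Hypothetical guard; the article's own helper may be implemented differently.
class ScopedLocale
{
public:
    ScopedLocale(int category, const char* name) : m_category(category)
    {
        // setlocale(category, nullptr) only queries the current locale name.
        const char* current = std::setlocale(category, nullptr);
        m_previous = current ? current : "C";
        // "Push": switch to the requested locale. If the name is unknown,
        // setlocale fails and the current locale is left unchanged.
        std::setlocale(category, name);
    }

    ~ScopedLocale()
    {
        // "Pop": restore the locale that was active before the guard was created.
        std::setlocale(m_category, m_previous.c_str());
    }

    ScopedLocale(const ScopedLocale&) = delete;
    ScopedLocale& operator=(const ScopedLocale&) = delete;

private:
    int m_category;
    // Copied, because later setlocale calls may overwrite the returned buffer.
    std::string m_previous;
};

// Returns the name of the current LC_NUMERIC locale.
std::string CurrentNumericLocale()
{
    const char* name = std::setlocale(LC_NUMERIC, nullptr);
    return name ? name : "";
}
```

A method would create such a guard as a local variable before formatting its field; the previous locale comes back automatically when the method returns, even on an early return.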
Points of Interest
Creating CSV seems trivial, and in fact it is.
But if you don't take care of locales, you can run into serious issues when storing data in this format.
This lightweight class doesn't use MFC classes, meaning it can be used in non-MFC code, but it requires ATL at minimum.
It can be used in both MBCS and UNICODE environments, but file output is in ASCII format. Feel free to change this if you want UTF-8 or UTF-16 output.
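If you want to try UTF-8 output, the conversion itself is short. The sketch below is a portable illustration that assumes the text is available as UTF-32 code points (std::u32string); AppendUtf8 and ToUtf8 are hypothetical names and are not part of the classes. In Windows code working with TCHAR strings, the Win32 function WideCharToMultiByte with CP_UTF8 is the more usual route.

```cpp
#include <string>

// Appends the UTF-8 encoding of one Unicode code point to 'out'.
// Assumes the caller passes a valid code point (no surrogates, at most 0x10FFFF).
void AppendUtf8(std::string& out, char32_t codePoint)
{
    if (codePoint < 0x80)
    {
        // One byte: plain ASCII.
        out += static_cast<char>(codePoint);
    }
    else if (codePoint < 0x800)
    {
        // Two bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (codePoint >> 6));
        out += static_cast<char>(0x80 | (codePoint & 0x3F));
    }
    else if (codePoint < 0x10000)
    {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (codePoint >> 12));
        out += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (codePoint & 0x3F));
    }
    else
    {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (codePoint >> 18));
        out += static_cast<char>(0x80 | ((codePoint >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((codePoint >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (codePoint & 0x3F));
    }
}

// Encodes a UTF-32 string as UTF-8 bytes, ready to be written to a file opened in binary mode.
std::string ToUtf8(const std::u32string& text)
{
    std::string bytes;
    for (char32_t codePoint : text)
        AppendUtf8(bytes, codePoint);
    return bytes;
}
```

ASCII characters, including the comma and semicolon separators, are encoded as themselves, so the CSV structure is unchanged by the conversion.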
About sample projects
There are 3 sample projects to demonstrate the use of these classes:
- consoleSample is a non-MFC application including all the code used in this article.
- CSVClipboardListener is a dialog-based MFC application that listens to the clipboard, expecting CSV data. This application demonstrates CSV import.
- InfoDir is a second dialog-based application that scans the content of a directory and exports all file and folder information in CSV format.