C++14: CSV Stream based on C File API

6 May 2021 · CPOL · 3 min read
The purpose of this library, which is based on the C file API, is to reduce the code bloat brought on by the use of C++ STL streams.

Primary Motivation

The library is built on the C file API in order to reduce the code bloat brought on by the use of C++ STL streams. Its usage is similar to Minimalistic CSV Stream, which is based on C++ file streams and is likewise a header-only library: just change the namespace from mini to capi. Some of its optimizations have been back-ported to Minimalistic CSV Stream version 1.8.3, including passing by reference whenever possible, caching results in data members, and avoiding operations that return a new string object. Readers can diff v1.8.2 against v1.8.3 to see the changes.
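To give a feel for what building on the C file API can look like, here is a minimal sketch that is not taken from the library. It assumes a hypothetical RAII wrapper named CFile around std::fopen/std::fwrite, and a hypothetical format_row helper that fills a caller-supplied buffer instead of returning a new string, in the spirit of the optimizations listed above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical RAII wrapper around a C FILE*; not part of csv_stream.h.
class CFile
{
public:
    CFile(const char* path, const char* mode) : fp_(std::fopen(path, mode)) {}
    ~CFile() { if (fp_) std::fclose(fp_); }
    CFile(const CFile&) = delete;
    CFile& operator=(const CFile&) = delete;

    bool is_open() const { return fp_ != nullptr; }

    // Takes the text by const reference so the string is not copied.
    bool write(const std::string& text)
    {
        return fp_ && std::fwrite(text.data(), 1, text.size(), fp_) == text.size();
    }

private:
    std::FILE* fp_;
};

// Builds "name,qty\n" in a caller-supplied buffer, so the buffer's capacity
// can be reused across rows and no new string object is returned.
inline void format_row(const std::string& name, int qty, std::string& out)
{
    out.clear();
    out += name;
    out += ',';
    out += std::to_string(qty);
    out += '\n';
}
```

A caller would construct CFile with a path and a mode such as "wb", check is_open(), and then pass each formatted row to write(). No iostream header is pulled in anywhere.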

Clearing the Misconception

Using this class alone does not reduce the code bloat in your application. The reduction only comes about when all other fstream, stringstream and cout/cin calls are removed or replaced with non-STL stream equivalents.

Breaking Changes

If you overloaded the STL stream operators for your custom data type instead of the CSV stream operators, this class is not a drop-in replacement for MiniCSV: you have to overload the CSV stream operators.
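The point about operator overloads can be illustrated with a small stand-in. TinyCsvOut below is a hypothetical class invented for this sketch and is not the library's stream type; the real classes in csv_stream.h have a different interface. What matters is that the overload for the custom Product type names the CSV stream type, not std::ostream, as its first parameter.

```cpp
#include <string>

// Hypothetical stand-in for a CSV output stream; csv_stream.h's classes differ.
class TinyCsvOut
{
public:
    // Appends one field, inserting a comma between fields on the same line.
    TinyCsvOut& put(const std::string& field)
    {
        if (!line_start_) text_ += ',';
        text_ += field;
        line_start_ = false;
        return *this;
    }
    const std::string& text() const { return text_; }

private:
    std::string text_;
    bool line_start_ = true;
};

inline TinyCsvOut& operator<<(TinyCsvOut& os, const std::string& s) { return os.put(s); }
inline TinyCsvOut& operator<<(TinyCsvOut& os, int v) { return os.put(std::to_string(v)); }

struct Product
{
    std::string name;
    int qty;
};

// The overload targets the CSV stream type. An overload written for
// std::ostream would not be considered, because TinyCsvOut is unrelated to it.
inline TinyCsvOut& operator<<(TinyCsvOut& os, const Product& p)
{
    return os << p.name << p.qty;
}
```

With this in place, writing a Product to the stand-in stream emits its members as separate CSV fields.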

Optional Dependencies

Boost Spirit Qi v2

To use Boost Spirit Qi for string to data conversion, define USE_BOOST_SPIRIT_QI before the header inclusion.

C++
#define USE_BOOST_SPIRIT_QI
#include "csv_stream.h"

To read a char as an ASCII character rather than an integer, define CHAR_AS_ASCII before including the header.

C++
#define CHAR_AS_ASCII
#include "csv_stream.h"

Warning: Detection of this macro was removed in v0.5.2 because it is a global setting. To read or write a char as a numeric 8-bit integer, use the NChar class instead. For writing, use os << csv::NChar(ch), or simply cast the char to int without NChar. For reading, use is >> csv::NChar(ch) to read an integer in the range -128 to 127 into a char variable.
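The idea behind a char wrapper can be sketched without the library. NumericChar and read_field below are hypothetical names for this illustration, not the library's NChar implementation, and the sketch assumes a signed 8-bit range of -128 to 127. Wrapping the char selects the numeric overload; passing the char directly selects the character overload.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical wrapper, loosely modelled on the idea behind csv::NChar:
// it refers to a char that should be filled from a numeric field.
struct NumericChar
{
    explicit NumericChar(char& target) : ch(target) {}
    char& ch;
};

// Parses a field such as "56" as an integer and stores it in the wrapped char.
// Throws std::runtime_error if the field is not a number in [-128, 127].
inline void read_field(const std::string& field, NumericChar dest)
{
    char* end = nullptr;
    const long value = std::strtol(field.c_str(), &end, 10);
    if (end == field.c_str() || *end != '\0' || value < -128 || value > 127)
        throw std::runtime_error("not an 8-bit integer: " + field);
    dest.ch = static_cast<char>(value);
}

// Plain char overload: stores the first character of the field as-is.
inline void read_field(const std::string& field, char& dest)
{
    if (field.empty())
        throw std::runtime_error("empty field");
    dest = field[0];
}
```

Because the wrapper is chosen at each call site, the behaviour is decided per field instead of by a program-wide macro.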

Benchmark

Note: Benchmark results are based on the latest minicsv, v1.8.2.
Note: The choice of conversion method only affects the input stream benchmark results.

File Stream Benchmark

C++
      // minicsv using std::stringstream
      mini::csv::ofstream:  387ms
      mini::csv::ifstream:  386ms
      // minicsv using Boost lexical_cast
      mini::csv::ofstream:  405ms
      mini::csv::ifstream:  283ms
      // capi csv using to_string
      capi::csv::ofstream:  152ms
      capi::csv::ifstream:  279ms
      // capi csv using Boost Spirit Qi
      capi::csv::ofstream:  163ms
      capi::csv::ifstream:  266ms
      // capi in-memory cached file csv
capi::csv::ocachedfstream:  124ms
capi::csv::icachedfstream:  127ms
      // capi in-memory cached file csv using Boost Spirit Qi
capi::csv::ocachedfstream:  122ms
capi::csv::icachedfstream:  100ms

Note: In-memory input stream means loading the whole file in memory before processing.
Note: In-memory output stream means keeping the contents in memory before saving.
Caution: In-memory streams require enough memory to hold the entire file contents.
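As a rough sketch of what "loading the whole file in memory before processing" involves, the helpers below read an open FILE* into a std::string and then walk that buffer line by line. read_all and next_line are hypothetical names for this illustration; the library's cached streams may load and scan files differently.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Reads everything from the current position of an open FILE* into `out`.
// Hypothetical helper; returns false on a null handle or a read error.
inline bool read_all(std::FILE* fp, std::string& out)
{
    out.clear();
    if (!fp) return false;
    char buf[4096];
    std::size_t n = 0;
    while ((n = std::fread(buf, 1, sizeof buf, fp)) > 0)
        out.append(buf, n);
    return std::ferror(fp) == 0;
}

// Copies the next line of `text` (without the '\n') into `line` and advances
// `pos` past it. Returns false once the whole buffer has been consumed.
inline bool next_line(const std::string& text, std::size_t& pos, std::string& line)
{
    if (pos >= text.size()) return false;
    const std::size_t end = text.find('\n', pos);
    if (end == std::string::npos)
    {
        line.assign(text, pos, std::string::npos);
        pos = text.size();
    }
    else
    {
        line.assign(text, pos, end - pos);
        pos = end + 1;
    }
    return true;
}
```

After the single read, every further step works on memory only. That is the trade-off the caution above describes: fewer I/O calls in exchange for holding the whole file in RAM.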

String Stream Benchmark

C++
// minicsv using std::stringstream
mini::csv::ostringstream:  362ms
mini::csv::istringstream:  377ms
// minicsv using Boost lexical_cast
mini::csv::ostringstream:  383ms
mini::csv::istringstream:  283ms
// capi csv
capi::csv::ostringstream:  113ms
capi::csv::istringstream:  127ms
// capi csv using Boost Spirit Qi
capi::csv::ostringstream:  116ms
capi::csv::istringstream:  106ms

Caveat

Instantiating a stream object can be slow because it has many data members to initialize.

Sample Code for File Stream

C++
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

int main()
{
    // Write two rows to products.txt
    csv::ofstream os("products.txt");
    os.set_delimiter(',', "$$");
    os.enable_surround_quote_on_str(true, '\"');
    if (os.is_open())
    {
        os << "Shampoo" << 200 << 15.0f << NEWLINE;
        os << "Towel" << 300 << 6.0f << NEWLINE;
    }
    os.flush();
    os.close();

    // Read the rows back
    csv::ifstream is("products.txt");
    is.set_delimiter(',', "$$");
    is.enable_trim_quote_on_str(true, '\"');

    if (is.is_open())
    {
        std::string name = "";
        int qty = 0;
        float price = 0.0f;
        while (is.read_line())
        {
            try
            {
                is >> name >> qty >> price;
                // display the read items
                std::cout << name << "," << qty
                          << "," << price << std::endl;
            }
            catch (const std::runtime_error& e)
            {
                std::cerr << e.what() << std::endl;
            }
        }
    }
    return 0;
}

Sample Code for Cached File Stream

C++
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

int main()
{
    // Build the rows in memory, then save them to products.txt in one go
    csv::ocachedfstream os;
    os.set_delimiter(',', "$$");
    os.enable_surround_quote_on_str(true, '\"');
    if (os.is_open())
    {
        os << "Shampoo" << 200 << 15.0f << NEWLINE;
        os << "Towel" << 300 << 6.0f << NEWLINE;
    }
    os.write_to_file("products.txt");

    // Load the whole file into memory, then read the rows back
    csv::icachedfstream is("products.txt");
    is.set_delimiter(',', "$$");
    is.enable_trim_quote_on_str(true, '\"');

    if (is.is_open())
    {
        std::string name = "";
        int qty = 0;
        float price = 0.0f;
        while (is.read_line())
        {
            try
            {
                is >> name >> qty >> price;
                // display the read items
                std::cout << name << "," << qty
                          << "," << price << std::endl;
            }
            catch (const std::runtime_error& e)
            {
                std::cerr << e.what() << std::endl;
            }
        }
    }
    return 0;
}

Sample Code for String Stream

C++
#include <iostream>
#include <stdexcept>
#include <string>
#include "csv_stream.h"

using namespace capi;

int main()
{
    // Write two rows into an in-memory string stream
    csv::ostringstream os;
    os.set_delimiter(',', "$$");
    os.enable_surround_quote_on_str(true, '\"');
    if (os.is_open())
    {
        os << "Shampoo" << 200 << 15.0f << NEWLINE;
        os << "Towel" << 300 << 6.0f << NEWLINE;
    }
    os.write_to_file("products.txt");

    // Parse the text held by the output stream
    csv::istringstream is(os.get_text().c_str());
    is.set_delimiter(',', "$$");
    is.enable_trim_quote_on_str(true, '\"');

    if (is.is_open())
    {
        std::string name = "";
        int qty = 0;
        float price = 0.0f;
        while (is.read_line())
        {
            try
            {
                is >> name >> qty >> price;
                // display the read items
                std::cout << name << "," << qty
                          << "," << price << std::endl;
            }
            catch (const std::runtime_error& e)
            {
                std::cerr << e.what() << std::endl;
            }
        }
    }
    return 0;
}

Output

File content:

"Shampoo",200,15.000000
"Towel",300,6.000000

Display output:

Shampoo,200,15
Towel,300,6

Change Delimiter on the Fly

The delimiter can be changed on the fly on an input or output stream with the sep class. In the example, the text uses both whitespace and a comma as delimiters.

C++
// demo sep class usage
csv::istringstream is("vt 37.8,44.32,75.1");
is.set_delimiter(' ', "$$");
csv::sep space(' ', "<space>");
csv::sep comma(',', "<comma>");
while (is.read_line())
{
    std::string type;
    float r = 0, b = 0, g = 0;
    is >> space >> type >> comma >> r >> b >> g;
    // display the read items
    std::cout << type << "|" << r << "|" << b << "|" << g << std::endl;
}
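To make the mechanism concrete, here is a minimal sketch of a tokenizer whose delimiter can be switched between fields. FieldReader is a hypothetical class written for this illustration and is not how the sep class is implemented in the library. It only shows why changing the delimiter mid-line works: each extraction searches for whatever delimiter is current at that moment.

```cpp
#include <cstddef>
#include <string>

// Hypothetical line tokenizer whose delimiter can be changed between fields.
class FieldReader
{
public:
    FieldReader(const std::string& line, char delim) : line_(line), delim_(delim) {}

    void set_delimiter(char delim) { delim_ = delim; }

    // Extracts the text up to the next occurrence of the current delimiter.
    // Returns false once the line has been fully consumed.
    bool next(std::string& field)
    {
        if (pos_ > line_.size()) return false;
        const std::size_t end = line_.find(delim_, pos_);
        if (end == std::string::npos)
        {
            field.assign(line_, pos_, std::string::npos);
            pos_ = line_.size() + 1; // mark the line as exhausted
        }
        else
        {
            field.assign(line_, pos_, end - pos_);
            pos_ = end + 1;
        }
        return true;
    }

private:
    std::string line_;
    char delim_;
    std::size_t pos_ = 0;
};
```

For the line "vt 37.8,44.32,75.1", reading one field with a space delimiter and then switching to a comma yields vt, 37.8, 44.32 and 75.1 in turn.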

The code is hosted on GitHub.

History

  • 28th January, 2017: Version 0.5.0: First release
  • 19th February, 2017: Version 0.5.1: Fix Input Stream exception while reading char
  • 12th March, 2017: Version 0.5.2: Fixed some char output problems and added the NChar (char wrapper) class to read and write a char variable as a numeric value [-128..127].
    C++
    bool test_nchar(bool enable_quote)
    {
        csv::ostringstream os;
        os.set_delimiter(',', "$$");
        os.enable_surround_quote_on_str(enable_quote, '\"');
    
        os << "Wallet" << 56 << NEWLINE;
    
        csv::istringstream is(os.get_text().c_str());
        is.set_delimiter(',', "$$");
        is.enable_trim_quote_on_str(enable_quote, '\"');
    
        while (is.read_line())
        {
            try
            {
                std::string dest_name = "";
                char dest_char = 0;
    
                is >> dest_name >> csv::NChar(dest_char);
    
                std::cout << dest_name << ", " 
                    << (int)dest_char << std::endl;
            }
            catch (std::runtime_error& e)
            {
                std::cerr << __FUNCTION__ << e.what() << std::endl;
            }
        }
        return true;
    }

    Display output:

    Wallet, 56
  • 18th September, 2017: Version 0.5.3:

    If the escape parameter in set_delimiter() is empty, text containing the delimiter is automatically enclosed in quotes (to comply with Microsoft Excel and general CSV practice).

    Text
    "Hello,World",600

    Microsoft Excel and CSV Stream read this as "Hello,World" and 600.

  • 12th August, 2018: Version 0.5.4: Added overloaded file open functions that take a wide-character file path (only available on Win32)
  • 21st February, 2021: Version 0.5.4e: Fixed infinite loop in quote_unescape.
  • 6th May, 2021: Version 0.5.5: CSV Stream detects the end of a line by the presence of a newline, so a newline inside a string field used to break parsing. This version handles such newlines by escaping them.
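The automatic quoting described for version 0.5.3 can be sketched as follows. quote_if_needed is a hypothetical helper written for this illustration, not the library's code, and it deliberately ignores quote characters that appear inside the field.

```cpp
#include <string>

// Hypothetical helper: wraps a field in quotes only when it contains the
// delimiter, so a reader does not split the field in two.
// Quote characters inside the field are not handled in this sketch.
inline std::string quote_if_needed(const std::string& field, char delim, char quote = '"')
{
    if (field.find(delim) == std::string::npos)
        return field; // no delimiter inside, leave the field untouched

    std::string out;
    out.reserve(field.size() + 2);
    out += quote;
    out += field;
    out += quote;
    return out;
}
```

With a comma delimiter, Hello,World comes back enclosed in double quotes, while 600 is returned unchanged.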

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)