Duck Three Ways: File I/O and String Performance Comparison Between C#, C++, and C

Michael Sydney Balloni

5.00/5 (19 votes)

27 Sep 2022CPOL5 min read

30.2K

193

Different languages and approaches are evaluated for a file and string processing benchmark

This article compares different approaches to solving the problems of loading large amounts of line-based, comma-delimited text from a file into data structures, and writing this data back out to another file. The performance profiles of the approaches are compared and conclusions drawn.

Download source code - 29.2 KB

Introduction

I was interested in comparing the performance of using different programming languages and different approaches to solving a basic programming problem, so I developed applications in C#, C++, and C, and drew conclusions for making decisions about which languages and approaches to use in different circumstances.

The Benchmark

The problem I came up with is for a program to load an input CSV file into an array of data structures, and use the data structures to write the data back out to an output CSV file. Your basic input / output data processing problem, without any data processing. This is a file I/O and data structure serialization benchmark.

I chose to use a Unicode-encoded text file for the CSV data, as C# is a Unicode language and C/C++ can work with Unicode data pretty well at this point. I chose to have seven data fields per line of CSV text, typical demographic stuff: first name, last name, address line 1, address line 2, city, state, and zip code. To keep things simple, each field is limited to 127 characters of Unicode text.

The CSV generation program (the "gen" project in the attached code) outputs 100,000 lines of this random CSV data to a file on the desktop.

// C# script to generate the data used by the test programs
using System.Text;

// These constants are shared with the C program that uses fixed-length fields
const int FIELD_LEN = 128;
const int FIELD_CHAR_LEN = FIELD_LEN - 1;

const int RECORD_COUNT = 100 * 1000;

var rnd = new Random(0); // same output each gen

string rnd_str()
{
    StringBuilder sb = new StringBuilder(FIELD_CHAR_LEN);
    int rnd_len = rnd.Next(1, FIELD_CHAR_LEN);
    for (int r = 1; r <= rnd_len; ++r)
        sb.Append((char)((sbyte)'A' + rnd.Next(0, 25)));
    return sb.ToString();
}

string output_file_path = 
    Path.Combine
    (
        Environment.GetFolderPath(Environment.SpecialFolder.Desktop), 
        "recordio.csv"
    );
if (output_file_path == null)
    throw new NullReferenceException("output_file_path");

using (var output = new StreamWriter(output_file_path, false, Encoding.Unicode))
{
    for (int r = 1; r <= RECORD_COUNT; ++r)
        output.Write($"{rnd_str()},{rnd_str()},{rnd_str()},{rnd_str()},
                       {rnd_str()},{rnd_str()},{rnd_str()}\n");
}

Let's Skip to the Punchline: What are the Results?

Here are the timings of the loading and writing phases of the different programs. The programs run through the load /write cycle four times. I took the best load and write times from all the runs for each program, the best of the best.

Approach / Language	Load (ms)	Write (ms)
Approach 1: C# with loops (net)	317	178
Approach 2: C# in bulk (net2)	223	353
Approach 3: C++ loops (class)	2,489	1,379
Approach 4: C batches (struct)	107	147
Approach 5: C++ batches (class2)	202	136
Approach 6: C/C++ batches (recordio)	108	149

Conclusions and Points of Interest

The C# programs, loops and batch, are clean and easy to read and have good performance. The loops use StreamReader / StreamWriter and are intuitive and easy to develop and maintain. The batch program uses the File class functions ReadAllLines / WriteAllLines and are much faster than the C# loops program for reading, LINQ and all, and slower for writing. Given this, you'd use ReadAllLines / LINQ for loading and StreamWriter for writing.

The big news is that there's something very wrong with Approach 3: C++ loops (class). It comes down to the std::getline call for loading, and the stream output for writing; the other code costs little. I'm interested in folks reproducing these numbers and reporting how to solve these performance problems.

The C batches program handily outruns the others for loading data because it uses fixed-size strings packed together in structs stored sequentially in one array, so the data locality is excellent. We're reading 90 MB of CSV in around 100ms. Wow! For some reason, the C program is slower for writing the data to the output file; the code looks clean, not sure about that.

Inspired by the C program, the "C++ batches" program does not come close to C at reading the data, but bests them all at writing the data out. You would choose C for reading and C++ for writing.

Approach 6, recordio - rhymes with rodeo - the hybrid of the C and C++ batch approaches, in a reusable package, yields 108 / 149, besting C#'s bests of 223 / 178. The difference in write performance is not significant. The 2X speedup for load performance cannot be ignored. The use of fixed-length strings packed into a single vector of structs, that can't be beat by any storage of C++ wstrings or C# strings. The code is clean, in implementation and in test driver illustration of use of the reusable class, have at it.

My overall advice would be to do this sort of thing in C/C++, with structs with fixed-width fields, how you might imagine things are stored in a database, and take advantage of recordio or similar to get great file and string I/O.

If you're in C# code, use File.ReadAllLines and LINQ to load the data, and StreamWriter to write it back out, and get respectable performance with great productivity and security.

Here's more information about the different approaches.

Approach 1: C# with Loops

In the attached code, this is the "net" project. This might be the most intuitive of the languages and approaches. You have StreamReader slurp up the data, and StreamWriter write it all out:

// C# performance test program
using System.Diagnostics;
using System.Text;

var sw = Stopwatch.StartNew();

string input_file_path =
    Path.Combine
    (
        Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
        "recordio.csv"
    );
if (input_file_path == null)
    throw new NullReferenceException("input_file_path");

for (int iter = 1; iter <= 4; ++iter)
{
    // Read lines into a list of objects
    var nets = new List<info>();
    using (var input = new StreamReader(input_file_path, Encoding.Unicode))
    {
        while (true)
        {
            string? line = input.ReadLine();
            if (line == null)
                break;
            else
                nets.Add(new info(line.Split(',')));
        }
    }
    Console.WriteLine($".NET load  took {sw.ElapsedMilliseconds} ms");
    sw.Restart();

    // Write the objects to an output CSV file
    using (var output = new StreamWriter("output.csv", false, Encoding.Unicode))
    {
        foreach (var cur in nets)
            output.Write($"{cur.firstname},{cur.lastname},
            {cur.address1},{cur.address2},{cur.city},{cur.state},{cur.zipcode}\n");
    }
    Console.WriteLine($".NET write took {sw.ElapsedMilliseconds} ms");
    sw.Restart();
}

// NOTE: Using struct did not change performance, probably because the 
//       contents of the strings are not stored consecutively, so
//       any data locality with the array of info objects is irrelevant
class info 
{
    public info(string[] parts)
    {
        firstname = parts[0];
        lastname = parts[1];

        address1 = parts[2];
        address2 = parts[3];

        city = parts[4];
        state = parts[5];
        zipcode = parts[6];
    }

    public string firstname;
    public string lastname;
    public string address1;
    public string address2;

    public string city;
    public string state;
    public string zipcode;
}

Approach 2: C# in Bulk

In the attached source, this is the "net2" project. You might say to yourself, "Loops are tedious. I can use the File class functions like ReadAllLines to do things en masse. In .NET I trust!" It's certainly less code...

// C# performance test program
using System.Diagnostics;
using System.Runtime.ConstrainedExecution;
using System.Text;

var sw = Stopwatch.StartNew();

string input_file_path =
    Path.Combine
    (
        Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
        "recordio.csv"
    );
if (input_file_path == null)
    throw new NullReferenceException("input_file_path");

for (int iter = 1; iter <= 4; ++iter)
{
    // Read CSV file into a list of objects
    var nets =
        File.ReadAllLines(input_file_path, Encoding.Unicode)
        .Select(line => new info(line.Split(',')));
    Console.WriteLine($".NET 2 load  took {sw.ElapsedMilliseconds} ms");
    sw.Restart();

    // Write the objects to an output CSV file
    int count = nets.Count();
    string[] strs = new string[count];
    int idx = 0;
    foreach (var cur in nets)
        strs[idx++] = $"{cur.firstname},{cur.lastname},{cur.address1},
                        {cur.address2},{cur.city},{cur.state},{cur.zipcode}\n";
    File.WriteAllLines("output.csv", strs, Encoding.Unicode);
    Console.WriteLine($".NET 2 write took {sw.ElapsedMilliseconds} ms");
    sw.Restart();
}

Approach 3: C++ Loops

C++ has come a long way when it comes to Unicode file I/O and streams. It is now easy to write C++ code that rivals C#'s Loopy Approach 1 for readability and simplicity:

C++

// C++ loop performance test program
#include <codecvt>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Our record type, just a bunch of wstrings
struct info
{
	info(const std::vector<std::wstring>& parts)
		: firstname(parts[0])
		, lastname(parts[1])
		, address1(parts[2])
		, address2(parts[3])
		, city(parts[4])
		, state(parts[5])
		, zipcode(parts[6])
	{
	}

	std::wstring firstname;
	std::wstring lastname;
	std::wstring address1;
	std::wstring address2;
	std::wstring city;
	std::wstring state;
	std::wstring zipcode;
};

// Split a string by a separator, returning a vector of substrings
std::vector<std::wstring> split(const std::wstring& str, const wchar_t seperator)
{
	std::vector<std::wstring> retVal;
	retVal.reserve(FIELD_COUNT); // cheat...

	std::wstring acc;
	acc.reserve(FIELD_CHAR_LEN); // ...a little

	for (wchar_t c : str)
	{
		if (c == seperator)
		{
			retVal.push_back(acc);
			acc.clear();
		}
		else
			acc.push_back(c);
	}

	if (!acc.empty())
		retVal.push_back(acc);

	return retVal;
}

int main(int argc, char* argv[])
{
	timer t;

	for (int iter = 1; iter <= 4; ++iter)
	{
		// Read the file into a vector of line strings
		std::vector<std::wstring> lines;
		{
			std::wifstream input(argv[1], std::ios::binary);
			input.imbue(std::locale(input.getloc(), new std::codecvt_utf16<wchar_t, 
			0x10ffff, std::codecvt_mode(std::consume_header | std::little_endian)>));
			if (!input)
			{
				std::cout << "Opening input file failed\n";
				return 1;
			}

			std::wstring line;
			while (std::getline(input, line))
				lines.push_back(line);
		}

		// Process the lines into a vector of structs
		std::vector<info> infos;
		infos.reserve(lines.size());
		for (const auto& line : lines)
			infos.emplace_back(split(line, ','));
		t.report("class load ");

		// Write the structs to an output CSV file
		{
			std::wofstream output("output.csv", std::ios::binary);
			output.imbue(std::locale(output.getloc(), new std::codecvt_utf16<wchar_t, 
			0x10ffff, std::codecvt_mode(std::generate_header | std::little_endian)>));
			if (!output)
			{
				std::cout << "Opening output file failed\n";
				return 1;
			}
			for (const auto& record : infos)
			{
				output
					<< record.firstname << ','
					<< record.lastname << ','
					<< record.address1 << ','
					<< record.address2 << ','
					<< record.city << ','
					<< record.state << ','
					<< record.zipcode << '\n';
			}
		}
		t.report("class write");
	}
}

For the file load step, a quick timing check shows that all the time is spent on the std::getline() call. And for the file write step, all the time is spent in the output loop, where else would it be? This mystery is left as an exercise to the reader. What is wrong with this simple code?

Approach 4: C Batches

If we are willing to load the entire text into memory, then we can play tricks with slicing and dicing the data in place, and take advantage of fixed-length string buffers and character-level string operations that take advantage of data locality and unsafe operations. What fun!

#include <stdio.h>
#include <stdlib.h>

const size_t FIELD_LEN = 128;
const size_t FIELD_CHAR_LEN = FIELD_LEN - 1;

const size_t FIELD_COUNT = 7;

const size_t RECORD_LEN = std::max(FIELD_COUNT * FIELD_LEN + 1, size_t(1024));

// Struct with fixed char array fields 
struct info
{
	wchar_t firstname[FIELD_LEN];
	wchar_t lastname[FIELD_LEN];

	wchar_t address1[FIELD_LEN];
	wchar_t address2[FIELD_LEN];

	wchar_t city[FIELD_LEN];
	wchar_t state[FIELD_LEN];

	wchar_t zipcode[FIELD_LEN];
};

// Read a comma-delimited string out of a buffer
void read_str(const wchar_t*& input, wchar_t* output)
{
	size_t copied = 0;
	while (*input && *input != ',')
		*output++ = *input++;

	*output = '\0';

	if (*input == ',')
		++input;
}

// Initialize a record using a buffer of text
void set_record(info& record, const wchar_t* buffer)
{
	read_str(buffer, record.firstname);
	read_str(buffer, record.lastname);
	read_str(buffer, record.address1);
	read_str(buffer, record.address2);
	read_str(buffer, record.city);
	read_str(buffer, record.state);
	read_str(buffer, record.zipcode);
}

// Output a record to a buffer of text
wchar_t* add_to_buffer(const wchar_t* input, wchar_t* output, wchar_t separator)
{
	while (*input)
		*output++ = *input++;
		
	*output++ = separator;
	
	return output;
}
int64_t output_record(const info& record, wchar_t* buffer)
{
	const wchar_t* original = buffer;

	buffer = add_to_buffer(record.firstname, buffer, ',');
	buffer = add_to_buffer(record.lastname, buffer, ',');
	buffer = add_to_buffer(record.address1, buffer, ',');
	buffer = add_to_buffer(record.address2, buffer, ',');
	buffer = add_to_buffer(record.city, buffer, ',');
	buffer = add_to_buffer(record.state, buffer, ',');
	buffer = add_to_buffer(record.zipcode, buffer, '\n');

	return buffer - original;
}

int main(int argc, char* argv[])
{
	timer t;

	for (int iter = 1; iter <= 4; ++iter)
	{
		// Open input file
		FILE* input_file = nullptr;
		if (fopen_s(&input_file, argv[1], "rb") != 0)
		{
			printf("Opening input file failed\n");
			return 1;
		}

		// Compute file length
		fseek(input_file, 0, SEEK_END);
		int file_len = ftell(input_file);
		fseek(input_file, 0, SEEK_SET);

		// Read file into memory
		wchar_t* file_contents = (wchar_t*)malloc(file_len + 2);
		if (file_contents == nullptr)
		{
			printf("Allocating input buffer failed\n");
			return 1;
		}
		if (fread(file_contents, file_len, 1, input_file) != 1)
		{
			printf("Reading input file failed\n");
			return 1;
		}
		size_t char_len = file_len / 2;
		file_contents[char_len] = '\0';
		fclose(input_file);
		input_file = nullptr;

		// Compute record count and delineate the line strings
		size_t record_count = 0;
		for (size_t idx = 0; idx < char_len; ++idx)
		{
			if (file_contents[idx] == '\n')
			{
				++record_count;
				file_contents[idx] = '\0';
			}
		}

		// Allocate record array
		info* records = (info*)malloc(record_count * sizeof(info));
		if (records == nullptr)
		{
			printf("Allocating records list failed\n");
			return 1;
		}

		// Process memory text into records
		wchar_t* cur_str = file_contents;
		wchar_t* end_str = cur_str + file_len / 2;
		size_t record_idx = 0;
		while (cur_str < end_str)
		{
			set_record(records[record_idx++], cur_str);
			cur_str += wcslen(cur_str) + 1;
		}
		if (record_idx != record_count)
		{
			printf("Record counts differ: idx: %d - count: %d\n", 
				   (int)record_idx, (int)record_count);
			return 1;
		}
		t.report("struct load ");

		// Write output file
		wchar_t* file_output = (wchar_t*)malloc(record_count * RECORD_LEN);
		if (file_output == nullptr)
		{
			printf("Allocating file output buffer failed\n");
			return 1;
		}
		size_t output_len = 0;
		for (size_t r = 0; r < record_count; ++r)
		{
			int new_output = output_record(records[r], file_output + output_len);
			if (new_output < 0)
			{
				printf("Writing to output buffer failed\n");
				return 1;
			}
			output_len += new_output;
		}
		FILE* output_file = nullptr;
		if (fopen_s(&output_file, "output.csv", "wb") != 0)
		{
			printf("Opening output file failed\n");
			return 1;
		}
		if (fwrite(file_output, output_len * 2, 1, output_file) != 1)
		{
			printf("Writing output file failed\n");
			return 1;
		}
		fclose(output_file);
		output_file = nullptr;
		t.report("struct write");

		// Clean up
		free(file_contents);
		file_contents = nullptr;

		free(records);
		records = nullptr;
	}

	return 0;
}

Approach 5: C++ Batches

Perhaps that pile of C leaves a bad taste in your mouth. Could we apply the same batch approach in C++?

C++

// C++ batch test program 
#include <codecvt>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Our record type, just a bunch of wstrings
struct info
{
	info(const std::vector<wchar_t*>& parts)
		: firstname(parts[0])
		, lastname(parts[1])
		, address1(parts[2])
		, address2(parts[3])
		, city(parts[4])
		, state(parts[5])
		, zipcode(parts[6])
	{
	}

	std::wstring firstname;
	std::wstring lastname;
	std::wstring address1;
	std::wstring address2;
	std::wstring city;
	std::wstring state;
	std::wstring zipcode;
};

void parse_parts(wchar_t* buffer, wchar_t separator, std::vector<wchar_t*>& ret_val)
{
	ret_val.clear();

	ret_val.push_back(buffer); // start at the beginning

	while (*buffer)
	{
		if (*buffer == separator)
		{
			*buffer = '\0';
			if (*(buffer + 1))
				ret_val.push_back(buffer + 1);
		}
		++buffer;
	}
}

int main(int argc, char* argv[])
{
	timer t;

	for (int iter = 1; iter <= 4; ++iter)
	{
		// Read the file into memory
		std::vector<char> file_contents;
		{
			std::ifstream file(argv[1], std::ios::binary | std::ios::ate);
			std::streamsize size = file.tellg();
			file.seekg(0, std::ios::beg);
			file_contents.resize(size + 2); // save room for null termination
			if (!file.read(file_contents.data(), size))
			{
				std::cout << "Reading file failed\n";
				return 1;
			}

			// null terminate
			file_contents.push_back(0);
			file_contents.push_back(0);
		}

		// Get the lines out of the data
		std::vector<wchar_t*> line_pointers;
		parse_parts(reinterpret_cast<wchar_t*>
                   (file_contents.data()), '\n', line_pointers);

		// Process the lines into data structures
		std::vector<info> infos;
		infos.reserve(line_pointers.size());
		std::vector<wchar_t*> line_parts;
		for (wchar_t* line : line_pointers)
		{
			parse_parts(line, ',', line_parts);
			infos.emplace_back(line_parts);
		}
		t.report("C++ 2 load");

		// Write the structs to an output CSV file
		std::wstring output_str;
		output_str.reserve(file_contents.size() / 2);
		for (const auto& record : infos)
		{
			output_str += record.firstname;
			output_str += ',';
			output_str += record.lastname;
			output_str += ',';
			output_str += record.address1;
			output_str += ',';
			output_str += record.address2;
			output_str += ',';
			output_str += record.city;
			output_str += ',';
			output_str += record.state;
			output_str += ',';
			output_str += record.zipcode;
			output_str += '\n';
		}
		std::ofstream output_file("output.csv", std::ios::binary);
		if (!output_file)
		{
			std::cout << "Opening output file failed\n";
			return 1;
		}
		output_file.write(reinterpret_cast<const char*>
                         (output_str.c_str()), output_str.size() * 2);
		output_file.close();
		t.report("C++ 2 write");
	}
}

Approach 6: C/C++ Hybrid

Let's take the best from approaches 4 and 5, and make a reusable class that has a prayer of real-world usability.

First, the reusable class. It is templated by the record type so the record class can define its field and record length limits:

C++

// C/C++ hybrid file / string I/O class
namespace recordio
{
template<class record>
class lineio
{
public:
static void load(const char* inputFilePath, std::vector<record>& records)
{
	// Initialize our output
	records.clear();

	// Open input file
	FILE* input_file = nullptr;
	if (fopen_s(&input_file, inputFilePath, "rb") != 0)
	{
		throw std::runtime_error("Opening input file failed");
	}

	// Compute file length
	fseek(input_file, 0, SEEK_END);
	int file_len = ftell(input_file);
	fseek(input_file, 0, SEEK_SET);

	// Read file into memory
	size_t char_len = file_len / 2;
	std::unique_ptr<wchar_t[]> file_contents(new wchar_t[file_len / 2 + 1]);
	if (fread(reinterpret_cast<void*>
       (file_contents.get()), file_len, 1, input_file) != 1)
	{
		throw std::runtime_error("Reading input file failed\n");
	}
	file_contents[char_len] = '\0';
	fclose(input_file);
	input_file = nullptr;

	// Compute record count and delineate the line strings
	size_t record_count = 0;
	for (size_t idx = 0; idx < char_len; ++idx)
	{
		if (file_contents[idx] == '\n')
		{
			++record_count;
			file_contents[idx] = '\0';
		}
	}

	records.reserve(record_count);

	// Process memory text into records
	wchar_t* cur_str = file_contents.get();
	wchar_t* end_str = cur_str + file_len / 2;
	while (cur_str < end_str)
	{
		records.emplace_back(cur_str);
		cur_str += wcslen(cur_str) + 1;
	}
}

static void write(const char* outputFilePath, const std::vector<record>& records)
{
	std::wstring output_str;
	output_str.reserve(record::max_record_length * records.size());
	for (const auto& cur : records)
	{
		cur.get_record_str(output_str);
		output_str += '\n';
	}

	// Write output file
	std::ofstream output_file(outputFilePath, std::ios::binary);
	if (!output_file)
	{
		throw std::runtime_error("Opening output file failed");
	}
	output_file.write(reinterpret_cast<const char*>
                     (output_str.c_str()), output_str.size() * 2);
}

The "derived" record types have a little work to do, but it's pretty minimal:

C++

// C/C++ hybrid class and test driver
// Struct with fixed char array fields
// Looks familiar...
struct info
{
	info() {}

	info(const wchar_t* str)
	{
		recordio::lineio<info>::read_str(str, firstname);
		recordio::lineio<info>::read_str(str, lastname);
		recordio::lineio<info>::read_str(str, address1);
		recordio::lineio<info>::read_str(str, address2);
		recordio::lineio<info>::read_str(str, city);
		recordio::lineio<info>::read_str(str, state);
		recordio::lineio<info>::read_str(str, zipcode);
	}

	void get_record_str(std::wstring& str) const
	{
		str += firstname;
		str += ',';
		str += lastname;
		str += ',';
		str += address1;
		str += ',';
		str += address2;
		str += ',';
		str += city;
		str += ',';
		str += state;
		str += ',';
		str += zipcode;
	}

	const static size_t max_field_length = FIELD_LEN;
	const static size_t max_record_length = RECORD_LEN;

	wchar_t firstname[FIELD_LEN];
	wchar_t lastname[FIELD_LEN];

	wchar_t address1[FIELD_LEN];
	wchar_t address2[FIELD_LEN];

	wchar_t city[FIELD_LEN];
	wchar_t state[FIELD_LEN];

	wchar_t zipcode[FIELD_LEN];
};

int main(int argc, char* argv[])
{
	if (argc != 2)
	{
		printf("Usage: recordio <input CSV file path>\n");
		return 0;
	}

	timer t;

	for (int iter = 1; iter <= 4; ++iter)
	{
		std::vector<info> records;
		recordio::lineio<info>::load(argv[1], records);
		t.report("recordio load ");

		recordio::lineio<info>::write("output.csv", records);
		t.report("recordio write ");
	}

	printf("All done.\n");
	return 0;
}

And that's it! Looking forward to the comments!

History

25^th September, 2022: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)