Introduction
A common requirement is to have applications share data with other programs. Although there are interfaces available to work with, for example, Microsoft Excel data files, this approach is generally complex, involves a fair amount of overhead, and requires that support libraries accompany your application.
Comma-Separated Values (CSV) Files
A much simpler way to have your application share data is by reading and writing Comma-Separated Values (CSV) files. CSV files can easily be read and written by many programs, including Microsoft Excel.
For the most part, reading and writing CSV files is trivial. As the name suggestions, a CSV file is simply a plain text file that contains one or more values per line, separated by commas. Each value is a field (or column in a spreadsheet), and each line is a record (or row in a spreadsheet).
However, there is slightly more work involved. Double quotes are used to wrap values that contain commas so that the commas are not interpreted as a value separator. The same is also done for values that contain double quotes. In addition, two double quotes together signify a double quote in the value and not a value separator.
So this seems like a perfect task for a handy little C# class. Listing 1 shows my CsvFileWriter
and CsvFileReader
classes.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
namespace ReadWriteCsv
{
public class CsvRow : List<string>
{
public string LineText { get; set; }
}
public class CsvFileWriter : StreamWriter
{
public CsvFileWriter(Stream stream)
: base(stream)
{
}
public CsvFileWriter(string filename)
: base(filename)
{
}
public void WriteRow(CsvRow row)
{
StringBuilder builder = new StringBuilder();
bool firstColumn = true;
foreach (string value in row)
{
if (!firstColumn)
builder.Append(',');
if (value.IndexOfAny(new char[] { '"', ',' }) != -1)
builder.AppendFormat("\"{0}\"", value.Replace("\"", "\"\""));
else
builder.Append(value);
firstColumn = false;
}
row.LineText = builder.ToString();
WriteLine(row.LineText);
}
}
public class CsvFileReader : StreamReader
{
public CsvFileReader(Stream stream)
: base(stream)
{
}
public CsvFileReader(string filename)
: base(filename)
{
}
public bool ReadRow(CsvRow row)
{
row.LineText = ReadLine();
if (String.IsNullOrEmpty(row.LineText))
return false;
int pos = 0;
int rows = 0;
while (pos < row.LineText.Length)
{
string value;
if (row.LineText[pos] == '"')
{
pos++;
int start = pos;
while (pos < row.LineText.Length)
{
if (row.LineText[pos] == '"')
{
pos++;
if (pos >= row.LineText.Length || row.LineText[pos] != '"')
{
pos--;
break;
}
}
pos++;
}
value = row.LineText.Substring(start, pos - start);
value = value.Replace("\"\"", "\"");
}
else
{
int start = pos;
while (pos < row.LineText.Length && row.LineText[pos] != ',')
pos++;
value = row.LineText.Substring(start, pos - start);
}
if (rows < row.Count)
row[rows] = value;
else
row.Add(value);
rows++;
while (pos < row.LineText.Length && row.LineText[pos] != ',')
pos++;
if (pos < row.LineText.Length)
pos++;
}
while (row.Count > rows)
row.RemoveAt(rows);
return (row.Count > 0);
}
}
}
Listing 1: CsvFileWriter and CsvFileReader Classes
Because the .NET stream classes generally seem to be split into reading and writing, I decided to follow that pattern with my CSV class and split it into CsvFileWriter
and CsvFileReader
. This also simplifies the code because neither class needs to worry about which mode the file is in or protect against the user switching modes.
The writer class performs any encoding necessary, as I described above, and the reader class performs any necessary decoding.
Using the code
Both classes take a CsvRow
argument. The CsvRow
class derives from List<string>
, so it's basically just a list of strings.
When you call CsvFileWriter.WriteRow()
, the row argument specifies the string values to write out. And when you call CsvFileReader.ReadRow()
, the row argument returns the values that were read in.
CsvFileReader.ReadRow()
also places the entire line into the CsvRow.LineText
member, just in case the caller wants to inspect that.
Finally, CsvFileReader.ReadRow()
returns a Boolean value that is false
when no values could be read at the current line. Under normal circumstances, this would indicate the end of the file.
Listing 2 demonstrates using the classes.
void WriteTest()
{
using (CsvFileWriter writer = new CsvFileWriter("WriteTest.csv"))
{
for (int i = 0; i < 100; i++)
{
CsvRow row = new CsvRow();
for (int j = 0; j < 5; j++)
row.Add(String.Format("Column{0}", j));
writer.WriteRow(row);
}
}
}
void ReadTest()
{
using (CsvFileReader reader = new CsvFileReader("ReadTest.csv"))
{
CsvRow row = new CsvRow();
while (reader.ReadRow(row))
{
foreach (string s in row)
{
Console.Write(s);
Console.Write(" ");
}
Console.WriteLine();
}
}
}
Listing 2: Sample Code to Write and Read CSV files
Conclusion
That's about all there is to it. The classes are fairly simple so I didn't include a sample project. All the code for the classes is shown in Listing 1. Note that this code, as presented, does not handle quoted values that span multiple lines.
This code should be helpful for anyone wanting an easy way to share data with Microsoft Excel or any other program that can read or write CSV files.