In this tip, you will learn about fast data access via Row Index and Unique Key, Select, Insert, Update, Delete and Clear interface methods. Events on data manipulation are provided.
Introduction
Have you ever had the situation where you have to deal with structured tables, but a database is not part of the project or would even be some kind of an over-engineered solution? CvsFileProcessor
might be a solution here. You can read the data from a file or you can create the structured table in memory, Select
the data, manipulate the data via the Insert
, Update
and Delete
methods and finally save the actual content into a file.
To be more flexible, it might be a good idea to be able to handle the content of the columns as any flat data type, like: string
, byte
, int
, double
, bool
and so on. The class provides an interface method to define a unique key for the structured table. Please bear in mind that multiple columns can be member of the unique key. Furthermore, the Insert
, Update
, Delete
and Clear
manipulation actions might be in the interest of other threads in a multi-threading application environment. Therefore, the class is raising events for such actions. Interested objects can just subscribe the events of interest and they will be notified.
What is this Article Update About?
This updated version provides some new features for the Select
, Insert
, Update
and Delete
methods. Furthermore, it provides the handling of CSV files where the values are to be handled with or without quote character embedding. Quote character handling is implemented in the new class CsvValue
.
The CsvFileProcessor Class in Detail
The implementation of CsvFileProcessor
is no rocket science. It is just a set of collections to handle the mapping of the Column Names to the Column Indexes, the Row Indexes to the Row Content, the Row Indexes to the Unique Keys and vice versa. The Select
and Update
methods are implemented to handle generic types. The implementation is straight forward and should be easy to understand.
using System;
using System.IO;
using System.Linq;
using System.Collections.Generic;
namespace CommonLibrary
{
public class CsvFileProcessor
{
public event EventHandler<CsvFileProcessorEventArgs> InsertEvent;
public event EventHandler<CsvFileProcessorEventArgs> UpdateEvent;
public event EventHandler<CsvFileProcessorEventArgs> DeleteEvent;
public event EventHandler<CsvFileProcessorEventArgs> ClearEvent;
public char Separator { get; private set; } = ',';
public string FileName { get; private set; } = null;
private string[] ColumnNames = null;
private uint RowCounter = 0;
private CsvValue CsvValue = null;
private readonly Dictionary<string,
int> ColumnNameToIndexCollection = new Dictionary<string, int>();
private readonly Dictionary<uint,
string[]> RowIndexToValueCollection = new Dictionary<uint, string[]>();
private readonly Dictionary<string, uint>
UniqueKeyToRowIndexCollection = new Dictionary<string, uint>();
private readonly Dictionary<uint,
string> RowIndexToUniqueKeyCollection = new Dictionary<uint, string>();
private readonly List<string> UniqueKeyColumnList = new List<string>();
private readonly object Locker = new object();
public CsvFileProcessor(char separator = ',', bool isQuoted = true,
char quoteCharacter = '"') => this.Initialize(null, separator,
isQuoted, quoteCharacter);
public CsvFileProcessor(string fileName, char separator = ',',
bool isQuoted = true, char quoteCharacter = '"')
{
this.Initialize(fileName, separator, isQuoted, quoteCharacter);
this.ReadColumnsFromFile();
}
public CsvFileProcessor(string fileName, string columns, char separator = ',',
bool isQuoted = true, char quoteCharacter = '"')
{
this.Initialize(fileName, separator, isQuoted, quoteCharacter);
this.AddColumns(columns);
}
public CsvFileProcessor(string fileName, string[] columns, char separator = ',',
bool isQuoted = true, char quoteCharacter = '"')
{
this.Initialize(fileName, separator, isQuoted, quoteCharacter);
this.AddColumns(columns);
}
public void Reset(string fileName, char separator = ',',
bool isQuoted = true, char quoteCharacter = '"')
{
lock (this.Locker)
{
this.Initialize(fileName, separator, isQuoted, quoteCharacter);
this.ReadColumnsFromFile();
}
}
public void Reset(string fileName, string[] columns, char separator = ',',
bool isQuoted = true, char quoteCharacter = '"')
{
lock (this.Locker)
{
this.Initialize(fileName, separator, isQuoted, quoteCharacter);
this.AddColumns(columns);
}
}
private void Initialize(string fileName, char separator,
bool isQuoted, char quoteCharacter)
{
string newFileName = fileName ?? this.FileName;
this.Clear();
this.FileName = newFileName;
this.Separator = separator;
this.CsvValue = new CsvValue(this.Separator, isQuoted, quoteCharacter);
}
private void ReadColumnsFromFile()
{
string columns = null;
using (StreamReader streamReader = new StreamReader(this.FileName))
{ columns = streamReader.ReadLine(); }
this.AddColumns(this.CsvValue.Split(this.EraseLastSeparator(columns)));
}
public void AddColumnNameToUniqueKey(string columnName)
{
if (this.ColumnNameToIndexCollection.ContainsKey(columnName))
this.UniqueKeyColumnList.Add(columnName);
else throw new Exception($@"'{columnName}' is an unknown column
and can't be added as a member of the unique key!");
}
public void ReadFile()
{
lock (this.Locker)
{
string row;
uint rowIndex = 0;
using (StreamReader streamReader = new StreamReader(this.FileName))
{
while (streamReader.Peek() >= 0)
{
row = streamReader.ReadLine();
if (rowIndex > 0 && !string.IsNullOrEmpty(row))
this.Insert(row);
rowIndex++;
}
}
}
}
private string GetUniqueKeyOfRowIndex(uint rowIndex)
{
string key = null;
if (this.RowIndexToUniqueKeyCollection.ContainsKey(rowIndex)) key =
this.RowIndexToUniqueKeyCollection[rowIndex];
return key;
}
public void Clear()
{
lock (this.Locker)
{
this.FileName = null;
this.ColumnNames = null;
this.ColumnNameToIndexCollection.Clear();
this.RowIndexToValueCollection.Clear();
this.UniqueKeyToRowIndexCollection.Clear();
this.RowIndexToUniqueKeyCollection.Clear();
this.UniqueKeyColumnList.Clear();
this.RowCounter = 0;
}
this.ClearEvent?.Invoke(this, new CsvFileProcessorEventArgs());
}
private string EraseLastSeparator(string value) => value[value.Length - 1] ==
this.Separator ? value.Substring(0, value.Length - 1) : value;
public bool ContainsKey(uint rowIndex) =>
this.RowIndexToValueCollection.ContainsKey(rowIndex);
public bool ContainsKey(string key) =>
this.UniqueKeyToRowIndexCollection.ContainsKey(key);
public int GetColumnCount() => this.ColumnNames != null ?
this.ColumnNames.Length : 0;
public int GetRowCount() => this.RowIndexToValueCollection.Count;
public string GetColumnName(uint columnIndex) => this.ColumnNames != null &&
columnIndex < this.ColumnNames.Length ? this.ColumnNames[columnIndex] : @"";
public int GetColumnIndex(string columnName) => columnName != null &&
this.ColumnNameToIndexCollection.ContainsKey(columnName) ?
this.ColumnNameToIndexCollection[columnName] : -1;
public bool IsColumnNameUniqueKeyMember(string columnName) =>
this.UniqueKeyColumnList.Contains(columnName);
public long ToQuotedCSV() => this.ToCSV(true);
public long ToPlainCSV() => this.ToCSV(false);
private void AddColumns(string columns)
{
columns = columns.Trim();
if (string.IsNullOrEmpty(columns)) throw new Exception
($@"Missing header definition in/for file: '{this.FileName}'!");
columns = this.EraseLastSeparator(columns);
this.AddColumns(this.CsvValue.Split(columns));
}
private void AddColumns(string[] columns)
{
this.ColumnNames = new string[columns.Length];
for (int i = 0; i < columns.Length; i++)
{
this.ColumnNames[i] = columns[i];
this.ColumnNameToIndexCollection.Add(columns[i], i);
}
}
public List<uint> GetRowIndexList()
{
List<uint> list = new List<uint>();
foreach (var pair in this.RowIndexToValueCollection) list.Add(pair.Key);
list.Sort();
return list;
}
public List<string> GetUniqueKeyList()
{
List<string> list = new List<string>();
foreach (var pair in this.UniqueKeyToRowIndexCollection) list.Add(pair.Key);
return list;
}
private string GetUniqueKeyColumns()
{
string key = @"";
foreach (var column in this.UniqueKeyColumnList) key +=
column != this.UniqueKeyColumnList[this.UniqueKeyColumnList.Count - 1] ?
$@"{column}+" : column;
return key;
}
private void Validate(uint rowIndex, string columnName = null)
{
if (rowIndex == 0) throw new Exception
($@"Invalid row index: '{rowIndex}'. Must be greater than zero!");
if (columnName != null && this.GetColumnIndex(columnName) == -1)
throw new Exception($@"'{columnName}' is not available
as a column in/for file: '{this.FileName}'!");
if (!this.RowIndexToValueCollection.ContainsKey(rowIndex))
throw new Exception($@"No content for row index: '{rowIndex}'
in/for file: '{this.FileName}' available!");
}
private void UpdateUniqueKey(uint rowIndex, string columnName,
string newValue, string oldValue)
{
string newKey = @"";
foreach (var keyColumn in this.UniqueKeyColumnList) newKey +=
this.RowIndexToValueCollection[rowIndex][this.GetColumnIndex(keyColumn)];
if (!this.UniqueKeyToRowIndexCollection.ContainsKey(newKey))
{
string oldKey = this.RowIndexToUniqueKeyCollection[rowIndex];
this.UniqueKeyToRowIndexCollection.Remove(oldKey);
this.UniqueKeyToRowIndexCollection.Add(newKey, rowIndex);
this.RowIndexToUniqueKeyCollection[rowIndex] = newKey;
}
else
{
this.RowIndexToValueCollection[rowIndex]
[this.GetColumnIndex(columnName)] = oldValue;
throw new Exception($@"Duplicate unique key:
'{this.GetUniqueKeyColumns()}' with new value: '{newValue}'
for column: '{columnName}' and row: '{rowIndex}'
in/for file: '{this.FileName}'!. New value denied.");
}
}
private uint Insert(string values)
{
uint rowIndex;
string key = @"";
lock (this.Locker)
{
values = values.Trim();
values = this.EraseLastSeparator(values);
string[] valueArray = this.CsvValue.Split(values);
if (this.GetColumnCount() == 0) throw new Exception
($@"Missing header definition in/for file: '{this.FileName}'!");
if (valueArray.Length != this.ColumnNames.Length)
throw new Exception($@"Column count mismatch.
Row '{values}' has '{valueArray.Length}' columns.
The header has: '{this.ColumnNames.Length}'
columns in/for file: '{this.FileName}'!");
foreach (var keyColumn in this.UniqueKeyColumnList)
key += valueArray[this.GetColumnIndex(keyColumn)];
if (this.UniqueKeyColumnList.Count > 0)
{
if (this.UniqueKeyToRowIndexCollection.ContainsKey(key))
throw new Exception($@"Duplicate unique key:
'{this.GetUniqueKeyColumns()}'
with row: '{values}' in/for file: '{this.FileName}'!");
this.UniqueKeyToRowIndexCollection.Add(key, this.RowCounter + 1);
this.RowIndexToUniqueKeyCollection.Add(this.RowCounter + 1, key);
}
this.RowIndexToValueCollection.Add(this.RowCounter + 1, valueArray);
this.RowCounter++;
rowIndex = this.RowCounter;
key = this.GetUniqueKeyOfRowIndex(rowIndex);
}
this.InsertEvent?.Invoke
(this, new CsvFileProcessorEventArgs(rowIndex, key));
return rowIndex;
}
public uint Insert(string[] values) => this.Insert(this.CsvValue.ToCSV(values));
public uint Insert()
{
uint rowIndex;
lock (this.Locker)
{
string[] values = new string[this.GetColumnCount()];
for (int i = 0; i < this.GetColumnCount(); i++)
values[i] = string.Empty;
rowIndex = this.Insert(this.CsvValue.ToCSV(values));
}
return rowIndex;
}
public bool Delete(uint rowIndex)
{
string key;
lock (this.Locker)
{
if (rowIndex == 0) throw new Exception
($@"Invalid row index: '{rowIndex}'. Must be greater than zero!");
if (!this.RowIndexToValueCollection.ContainsKey(rowIndex))
throw new Exception($@"No content for row index:
'{rowIndex}' in/for file: '{this.FileName}' available!");
key = this.GetUniqueKeyOfRowIndex(rowIndex);
this.UniqueKeyToRowIndexCollection.Remove
(this.RowIndexToUniqueKeyCollection[rowIndex]);
this.RowIndexToUniqueKeyCollection.Remove(rowIndex);
this.RowIndexToValueCollection.Remove(rowIndex);
}
this.DeleteEvent?.Invoke
(this, new CsvFileProcessorEventArgs(rowIndex, key));
return true;
}
public bool Delete(string key)
{
bool success;
lock (this.Locker)
{
uint rowIndex = 0;
success = this.UniqueKeyToRowIndexCollection.ContainsKey(key);
if (success) rowIndex = this.UniqueKeyToRowIndexCollection[key];
if (rowIndex > 0) success = this.Delete(rowIndex);
}
return success;
}
public uint Delete<T>(string columnName, T value)
{
uint counter = 0;
lock (this.Locker)
{
int columnIndex = this.GetColumnIndex(columnName);
if (columnIndex == -1) throw new Exception($@"'{columnName}'
is not available as a column in/for file: '{this.FileName}'!");
string strValue = TypeConverter.ConvertTo<string>(value);
List<uint> list = new List<uint>();
foreach (var pair in this.RowIndexToValueCollection.Where
(pair => pair.Value[columnIndex] == strValue)) list.Add(pair.Key);
foreach (uint rowIndex in list)
{
if (this.Delete(rowIndex)) counter++;
}
}
return counter;
}
public bool Select<T>(uint rowIndex, string columnName, out T value)
{
this.Validate(rowIndex, columnName);
value = TypeConverter.ConvertTo<T>
(this.RowIndexToValueCollection[rowIndex][this.GetColumnIndex(columnName)]);
return true;
}
public bool Select<T>(string key, string columnName, out T value)
{
if (!this.UniqueKeyToRowIndexCollection.ContainsKey(key))
throw new Exception($@"Unknown key: '{key}'
in/for file: '{this.FileName}'!");
return this.Select(this.UniqueKeyToRowIndexCollection[key],
columnName, out value);
}
public string[] Select(uint rowIndex)
{
this.Validate(rowIndex);
return (string[])this.RowIndexToValueCollection[rowIndex].Clone();
}
public string[] Select(string key)
{
if (!this.UniqueKeyToRowIndexCollection.ContainsKey(key))
throw new Exception($@"Unknown key:
'{key}' in/for file: '{this.FileName}'!");
return this.Select(this.UniqueKeyToRowIndexCollection[key]);
}
public bool Update<T>(uint rowIndex, string columnName, T value)
{
string key, newValue, oldValue;
lock (this.Locker)
{
this.Validate(rowIndex, columnName);
newValue = TypeConverter.ConvertTo<string>(value);
oldValue = this.RowIndexToValueCollection[rowIndex]
[this.GetColumnIndex(columnName)];
key = this.GetUniqueKeyOfRowIndex(rowIndex);
this.RowIndexToValueCollection[rowIndex]
[this.GetColumnIndex(columnName)] = newValue;
if (this.IsColumnNameUniqueKeyMember(columnName) &&
newValue != oldValue) this.UpdateUniqueKey
(rowIndex, columnName, newValue, oldValue);
}
if (newValue != oldValue) this.UpdateEvent?.Invoke
(this, new CsvFileProcessorEventArgs
(rowIndex, key, columnName, newValue, oldValue));
return true;
}
public bool Update<T>(string key, string columnName, T value)
{
if (!this.UniqueKeyToRowIndexCollection.ContainsKey(key))
throw new Exception
($@"Unknown key: '{key}' in/for file: '{this.FileName}'!");
return this.Update
(this.UniqueKeyToRowIndexCollection[key], columnName, value);
}
public uint Update<T>(string columnName, T oldValue, T newValue)
{
uint counter = 0;
lock (this.Locker)
{
int columnIndex = this.GetColumnIndex(columnName);
if (columnIndex == -1) throw new Exception($@"'{columnName}'
is not available as a column in/for file: '{this.FileName}'!");
string strOldValue = TypeConverter.ConvertTo<string>(oldValue);
foreach (var pair in this.RowIndexToValueCollection.Where
(pair => pair.Value[columnIndex] == strOldValue))
{
if (this.Update(pair.Key, columnName, newValue)) counter++;
}
}
return counter;
}
private long ToCSV(bool isQuoted)
{
long length;
lock (this.Locker)
{
File.Delete(this.FileName);
CsvValue csv = new CsvValue(this.Separator, isQuoted,
this.CsvValue.QuoteCharacter);
using (StreamWriter streamWriter = File.AppendText(this.FileName))
{
streamWriter.WriteLine(csv.ToCSV(this.ColumnNames));
List<uint> list = this.GetRowIndexList();
foreach (uint rowIndex in list)
streamWriter.WriteLine(csv.ToCSV
(this.RowIndexToValueCollection[rowIndex]));
}
length = new FileInfo(this.FileName).Length;
}
return length;
}
}
}
Using the CsvFileProcessor Class
Please find below the implementation of how to read existing data from a file and an implementation of how to create a structured table in the memory, to do some manipulation stuff and to save the content of the instance into a file.
===================================================================================
Example of an existing file:
===================================================================================
Row 0 = "Column000","Column001","Column002","Column003","Val","Column005","Column006"
Row 1 = "R1C0Value","R1C1Value","R1C2Value","R1C3Value","123","R1C5Value","R1C6Value"
Row 2 = "R2C0Value","R2C1Value","R2C2Value","R2C3Value","555","R2C5Value","R2C6Value"
Row 3 = "R3C0Value","R3C1Value","R3C2Value","R3C3Value","999","R3C5Value","R3C6Value"
===================================================================================
Usage for an existing file:
===================================================================================
private void DoSomethingWithExistingFile()
{
try
{
CsvFileProcessor csv = new CsvFileProcessor("YourFile.csv");
csv.AddColumnNameToUniqueKey("Column000");
csv.AddColumnNameToUniqueKey("Column001");
csv.ReadFile();
string key = "R2C0Value" + "R2C1Value";
bool success = csv.Select(2, "Val", out int value);
success = csv.Select(key, "Val", out value);
success = csv.Update(key, "Val", value + 222);
long length = csv.ToPlainCSV();
csv.Reset(null, csv.Separator, false);
csv.ReadFile();
long length = csv.ToQuotedCSV();
}
catch (Exception e) { Console.WriteLine($@"ERROR: {e.Message}"); }
}
===================================================================================
Usage for non existing files:
===================================================================================
private void DoSomethingWithNonExistingFile()
{
try
{
double pi = 3.1415;
string[] columns = new string[] { "Column000", "Column001",
"Column002", "Column003", "Val", "Column005", "Column006, the last one" };
CsvFileProcessor csv =
new CsvFileProcessor("YourFile.csv", columns, '|');
csv.AddColumnNameToUniqueKey("Column000");
csv.InsertEvent += OnInsert;
csv.UpdateEvent += OnUpdate;
csv.DeleteEvent += OnDelete;
csv.ClearEvent += OnClear;
for (int i = 1; i <= 9; i++) csv.Insert(new string[]
{ $"R{i}C0Value", $"R{i}C1Value",
$"R{i}C2Value", $"R{i}C3Value", $"{i*100}", $"R{i}C5Value", $"R{i}C6Value" });
string[] valueArray = csv.Select("R1C0Value");
bool success = csv.Select(1, "Val", out int value);
success = csv.Select("R1C0Value", "Val", out value);
success = csv.Update("R2C0Value", "Val", value + 222);
success = csv.Update("R3C0Value", "Val", value + 333);
success = csv.Update("R4C0Value", "Val", pi);
success = csv.Update("R5C0Value", "Val", pi * 2);
success = csv.Update("R6C0Value", "Val", true);
success = csv.Update("R7C0Value", "Val", true);
success = csv.Select("R7C0Value", "Val", out bool b);
uint rows = csv.Update("Val", b, !b);
rows = csv.Delete("Val", false);
uint rowIndex = csv.Insert();
success = csv.Update(rowIndex, "Column000", "\"AnyValue\"");
rowIndex = csv.Insert(new string[] { "Dog", "Cat", "Mouse", "Worm", "500", "Fly",
"I saw the movie, but didn't understand!" });
long length = csv.ToQuotedCSV();
}
catch (Exception e) { Console.WriteLine($@"ERROR: {e.Message}"); }
}
private void OnInsert(object sender, CsvFileProcessorEventArgs e) =>
Console.WriteLine($@"Insert row: <{e.RowIndex}>,
key: <{(e.Key == null ? @"null" : e.Key)}>");
private void OnUpdate(object sender, CsvFileProcessorEventArgs e) =>
Console.WriteLine($@"Update row: <{e.RowIndex}>,
key: <{(e.Key == null ? @"null" : e.Key)}>,
column: <{e.ColumnName}>, NewValue: <{e.NewValue}>,
OldValue: <{e.OldValue}>");
private void OnDelete(object sender, CsvFileProcessorEventArgs e) =>
Console.WriteLine($@"Delete row: <{e.RowIndex}>,
key: <{(e.Key == null ? @"null" : e.Key)}>");
private void OnClear(object sender, CsvFileProcessorEventArgs e) =>
Console.WriteLine($@"Clear action");
Using CsvFileProcessor
might be helpful not only if you want to deal with data in Comma Separated Value Files, it might also be helpful if you deal with some central data which can be manipulated by different threads and the threads have to react according the data content.
Conclusion
I hope this small class and the example is kind of useful for somebody or the implementation might help anybody who is searching for a solution in a completely different area.
History
- 31st May, 2022: Initial post
- 10th June, 2022: Article updated