Introduction
In this short article, we'll see how to convert a common CSV file into its XML representation, using Visual Basic .NET and the powerful functionalities of DataTable objects. When working in the field of data management, data manipulation is an everyday task, and it can be useful to quickly convert a given set of data in other forms.
Sample Data
Speaking about a CSV file, lets suppose we have a file named "example.txt", into the C:\Temp folder. It has a content like this:
1;John Doe;40
2;Mark Nosurname;35
3;Jane Doe;32
4;Without name;60
CsvToXML Function
Lets see the converting function, plus some afterwords
Private Sub CsvToXml(_inputFile As String, _dataName As String, _separator As Char, _outputFile As String, Optional _fieldnames() As String = Nothing)
Dim dt As New DataTable(_dataName)
Dim firstRow As Boolean = True
Using sr As New StreamReader(_inputFile)
While Not (sr.EndOfStream)
Dim fields() As String = sr.ReadLine.Split(_separator)
If firstRow Then
For ii As Integer = 0 To fields.Count - 1
Dim _fName As String = ""
If IsNothing(_fieldnames) Then
_fName = "Field" & ii.ToString("000")
Else
_fName = _fieldnames(ii)
End If
dt.Columns.Add(_fName)
Next
firstRow = False
End If
dt.Rows.Add(fields)
End While
dt.WriteXml(_outputFile)
dt.Dispose()
End Using
End Sub
Our function requires a path for our CSV file (_inputFile), a name to be assigned to our main XML entity (_dataName), the character used as field separator (_separator), a path for the output XML file (_outputFile), and an optional array of string, representing specific field names to be assigned (_fieldnames).
We start our routine creating a new DataTable, and opening our CSV file. Reading the first line, the columns of the DataTable are created: if we have passed the argument _fieldnames, the function uses our passed string, otherwise it proceed in creating a column with a more generic name, such as "FieldXYZ", where "XYZ" is replaced by the field's index.
In a while loop, we process every line of the CSV file, splitting each line using _separator as the split character, and assigning each splitted parameter to a local array of strings, which will be used as argument for the standard method Rows.Add of the DataTable object. At the end of the loop, we'll have a full-fledged DataTable, with its columns and rows compiled with CSV splitted data.
Invoking the WriteXML method will produce an XML representation of DataTable data.
Final results
Assuming the above CSV data, we could use our function as follows:
Dim fieldNames() As String = {"Id", "Name", "Age"}
CsvToXml("c:\temp\example.txt", "TempTable", ";", "c:\temp\example.xml", fieldNames)
And the final results will be:
="1.0"="yes"
<DocumentElement>
<TempTable>
<Id>1</Id>
<Name>John Doe</Name>
<Age>40</Age>
</TempTable>
<TempTable>
<Id>2</Id>
<Name>Mark Nosurname</Name>
<Age>35</Age>
</TempTable>
<TempTable>
<Id>3</Id>
<Name>Jane Doe</Name>
<Age>32</Age>
</TempTable>
<TempTable>
<Id>4</Id>
<Name>Without name</Name>
<Age>60</Age>
</TempTable>
</DocumentElement>
Source code
Bibliography
History
- 2015-01-22: First release for CodeProject