Introduction
This article gives you an introduction to using DataSets and how to use them with XML files. Working with
them really makes your life easier when you want to share data from a data source and you are thinking of XML.
System Requirements
To compile the solution you need to have Microsoft Visual Studio .NET installed. To run any of the client
executable you need to have the .NET framework installed.
The sample provided is a simple application written in C#. It displays a form with a DataGrid
. By default,
the application connects to a SQL Server, binds the Northwind database and fetches some parent records from table
Customers and some child records from table Orders. By default, the application assumes that you have
an instance of SQL Server installed on your machine. If this is not the case, then you must manually modify the
connection string and then rebuild the sample.
Then, you can save the dataset to XML file. Schema information is also saved.
What is a DataSet?
A DataSet
object is a very generic object that can store cached data from a database in a very efficient
way. It is member of the System::Data
namespace.
One obvious question is: When to use a dataset? Well, the answer is: it depends. You should consider that a
dataset is a collection of in-memory cached data. So it's good to use datasets when:
- You are working with multiple separated tables or tables from different
data sources.
- You are exchanging data with another application such as a Web Service.
- You perform extensive processing with the records in the database. If you use a SQL query
every time you
need to change something, processing each record may result in connection being held open which may affect
performance.
- You want to perform XML/XSLT operations on the data.
You should not use a dataset if:
- You are using Web Forms in your application because Web Forms and their controls are recreated each time
a page is requested by the client. Thus, creating, filling and destroying a dataset each time will be inefficient
unless you plan to cache it between roundtrips.
A DataSet
has a DataTableCollection
object as a member that is nothing but a collection
of DataTable
objects. You declare a DataSet
object and add tables to it like this
(in Managed C++):
DataSet* MyDataSet = new DataSet ("MyDataSet");
DataTable* Table1 = MyDataSet->Tables->Add ("Table1");
DataTable* Table2 = MyDataSet->Tables->Add ("Table2");
You can refer the tables later either using the pointers returned by the Add
method or like this:
DataTable* table = MyDataSet->Tables->Item[0];
DataTable* table = MyDataSet->Tables->Item["Table1"];
A
DataTable
object has two important members:
Rows
and
Columns
.
Rows
is a
DataRowCollection
object and
Columns
is a
DataColumnCollection
.
DataRowCollection
is a collection of
DataRow
objects
and
DataColumnCollection
is a collection of
DataColumn
objects. I am sure you can easily
figure out what these objects represent. :)
Adding data to a data set is straight-forward:
DataTable* Table1 = MyDataSet->Tables->Item[0];
Table1->Columns->Add ("Column1");
Table2->Columns->Add ("Column2");
DataRowCollection* drc = Table1->Rows;
Object* obj[] = new Object* [2];
obj[0] = new String ("Item 1");
obj[1] = new String ("Item 2");
drc->Add (obj);
If you want to specify the data type of a particular column you should use the DataColumn::DataType
property. You can set any of the following data types: Boolean
, Byte
, Char
,
DateTime
, Decimal
, Double
, Int16
, Int32
,
Int64
, SByte
, Single
, String
, TimeSpan
,
UInt16
, UInt32
, UInt64
.
That's it! Well, that's all you have to do if you want to build up your data set manually. This is not how it is
done if you want to connect to a real data source.
Binding a Database
To connect to a database server (such as SQL Server) and fill a dataset from there you need three additonal
objects: a Connection
, a DataCommand
and a DataAdapter
object.
The Connection
object is used to connect to the database object. You must provide a connection string to
it. Also, if your application is using transactions, you must attach a transaction object to the connection.
The DataCommand
object is used to sending commands to the database server. In includes four
System::String
objects named SelectCommand
, InsertCommand
,
UpdateCommand
and DeleteCommand
. I believe it is obvious what these string objects
represent, nothing else but the four basic SQL operations.
The DataAdapter
object is the object that does all the magic work. It fills the DataSet
with data
from the database and updates data from the DataSet
to the database.
You may now wonder what are the classes that correspond to the objects described above. Well, Microsoft have
prepared two sets of classes - you can't say life is getting much easier in .NET can you? :). The first set is
based on OLE DB and is part of the System::Data::OleDb
namespace. It contains the following
classes: OleDbConnection
, OleDbCommand
and OleDbDataAdapter
. The other set
of classes is optimized for working with Microsoft SQL Server. It is part of the System::Data::SqlClient
namespace and its classes include: SqlConnection
, SqlCommand
and
SqlDataAdapter
.
Here is a complete example of how to connect to the database and fill the dataset. I have used the classes
optimized for SQL Server. If you need to use OLE DB you only have to replace "Sql" with
"OleDb". We try to fetch two tables Table1 and Table2 and set a parent-child
relationship between them.
String* str = new String ("user id=sa;password=;initial catalog=MyDB;"
"data source=(local)");
SqlConnection* sqlcon = new SqlConnection (str);
sqlcon->Open ();
String* strTable1 = String::Format ("SELECT * FROM Table1 "
"WHERE Field1 = {0}", FieldID.ToString ());
String* strTable2 = String::Format ("SELECT T2.* FROM Table2 T2 "
"INNER JOIN Table1 T1 ON T2.ParendID = T1.ID "
"WHERE T1.Field1 = {0}", Field1.ToString ());
SqlCommand* sqlTable1 = new SqlCommand (strTable1, sqlcon);
SqlCommand* sqlTable2 = new SqlCommand (strTable2, sqlcon);
SqlDataAdapter* Table1Adapter = new SqlDataAdapter (sqlTable1);
SqlDataAdapter* Table2Adapter = new SqlDataAdapter (sqlTable2);
DataSet* MyDataSet = new DataSet ("MyDataSet");
DataTable* Table1 = BackupDataSet->Tables->Add ("Table1");
DataTable* Table2 = BackupDataSet->Tables->Add ("Table2");
Table1Adapter->Fill (Table1);
Table2Adapter->Fill (Table2);
MyDataSet->Relations->Add (Table1->Columns->Item[0],
Table2->Columns->Item[1]);
For details about relations and constraints you should read about the Constraint
class and the
two classes derived from it, ForeignKeyConstraint
and UniqueConstraint
.
Working with XML files
DataSet
s can work with XML files very easily. There are two methods to serialize a DataSet
object. These are DataSet::WriteXml
and DataSet::WriteXmlSchema
.
The first one writes data to the XML file and may include also schema information. It is useful when you want to
write an XML file that has schema information embedded. However, if you prefer schema information in a separate
(.xsd) file you should use the DataSet::WriteXmlSchema
method.
There are also plenty of classes in the System::Xml
namespace: XmlWriter
,
XmlReader
, XmlTextWriter
, XmlDataDocument
to name a few. You can use those
with a dataset to perform some advanced XML operations. For instance, if you want to write a dataset to an XML
file you can either use
XmlDataDocument* xmlDoc = new XmlDataDocument(MyDataSet);
XmlTextWriter* Output = new XmlTextWriter ("C:\\Myfile.xml", NULL);
Output->Formatting = Formatting::Indented;
Output->Indentation = 2;
xmlDoc->WriteTo (Output);
Output->Close ();
or use the DataSet::WriteXml
method directly:
MyDataSet->WriteXml ("C:\\MyFile.xml", XmlWriteMode::WriteSchema);
In the latter situation I chose to embed schema information in the file by using one member of the
XmlWriteMode
enumeration. Other fields of the enumeration are XmlWriteMode::IgnoreSchema
if you do not want to include schema information, XmlWriteMode::DiffGram
if you want to include both
original values of the dataset and any changes. For reading an XML file we use another enumeration:
XmlReadMode
.
A DataRelation
has a property named Nested
. When this property is set to
true
then every time the DataSet
will write a record of a parent table, it will also nest all
corresponding records in all child tables.
Formatting data in an XML file is very flexible. By default, a new element is created for every column in every
table. Assuming you have a table named Table1 with two columns named ID and Name the default output of
an XML file will be:
<MyDataSet>
<Table1>
<ID>7</ID>
<Name>name 1</Name>
</Table1>
<Table1>
<ID>8</ID>
<Name>name 2</Name>
</Table1>
</MyDataSet>
If one column is to become an attribute of the node then you will set ColumnMapping
property
of DataColumn
class. For that you must have a look at the MappingType
enumeration. Its
fields are: Attribute
, Element
, Hidden
and SimpleContent
.
Choosing Attribute
will write the corresponding column as an attribute of the parent node. The output
will be like this:
<MyDataSet>
<Table1 ID="7">
<Name>name 1</Name>
</Table1>
<Table1 ID="8">
<Name>name 2</Name>
</Table1>
</MyDataSet>
SimpleContent
means that tags for one column will not be written. If you chose Hidden
then that column will not be written at all.
Of course you can combine them very easily. Doing this:
Table1->Columns->Item[0]->ColumnMapping = MappingType::Attribute;
Table2->Columns->Item[1]->ColumnMapping = MappingType::SimpleContent;
will give you the following results:
<MyDataSet>
<Table1 ID="7">name1</Table1>
<Table1 ID="8">name2</Table1>
</MyDataSet>
Reading from an xml file is just as easy as writing. The simplest options is to use the ReadXml
method like this:
MyDataSet->ReadXml ("C:\\MyFile.xml", XmlReadMode::ReadSchema);
I have read also schema information from the file. This means that the DataSet
will automatically detect
data type for every column in all tables in the dataset and also any constraints or relations between tables. This
is really cool if you want to update a dataset. You change a value in the parent table and all child tables will
have the parent value updated. Also, trying to change a value in a child table that does not have a parent value
will throw an exception.
Updating the Database
Reading data into a dataset and then updating a database is just as easy as reading from a data source and
filling a dataset. Assuming you are reading from an XML file and you are updating a SQL server, you must do the
following:
- Create the
DataSet
object and read the XML file. If your XML file contains any schema
information then the DataSet will detect and create the corresponding tables and enable any constraints
automatically.
- Create a
DataAdapter
object for every table you want to update.
- Call the
Update
method for every DataAdapter
object.
Remember that I mentioned four String
members of the SqlDataAdapter
class? These
are: SelectCommand
, InsertCommand
, UpdateCommand
and
DeleteCommand
. SelectCommand
is the query used to fetch data from the database. You can
define the other three objects if you want to perform custom update/insert/delete operations. If you do not want to
do that you can use either SqlCommandBuilder
or OleDbCommandBuilder
class. This class will
build those strings automatically.
Whenever you are writing data to a database is good practice to use a transaction to prevent concurrent writes to
the database. For this. the .NET framework provides two classes: OleDbTransaction
and
SqlTransaction
respectively.
Here is a sample of reading data from an XML file and the writing to a SQL Server.
SqlTransaction* SqlTrans;
try
{
String* str = new String ("user id=sa;password=;initial catalog=MyDB;"
"data source=(local)");
SqlConnection* sqlcon = new SqlConnection (str);
sqlcon->Open ();
DataSet* MyDataSet = new DataSet ("MyDataSet");
MyDataSet->ReadXml ("C:\\MyXmlFile.xml", XmlReadMode::ReadSchema);
SqlTransaction = sqlcon->BeginTransaction ();
SqlDataAdapter* Table1Adapter = new SqlDataAdapter("SELECT * FROM Table1", sqlcon);
SqlDataAdapter* Table2Adapter = new SqlDataAdapter("SELECT * FROM Table2", sqlcon);
SqlCommandBuilder* Table1Command = new SqlCommandBuilder (Table1Adapter);
SqlCommandBuilder* Table2Command = new SqlCommandBuilder (Table2Adapter);
Table1Adapter->SelectCommand->Transaction = SqlTrans;
Table2Adapter->SelectCommand->Transaction = SqlTrans;
Table1Adapter->Update (MyDataSet, "Table1");
Table2Adapter->Update (MyDataSet, "Table2");
SqlTrans->Commit ();
}
catch (Exception* e)
{
if (SqlTrans != NULL)
SqlTrans->Rollback();
}
Conclusion
DataSets
provide a very easy to use and powerful way to handle large amounts of data coming from different
tables or even data sources. A dataset will cache all data making it useful when you need to read data, perform
intensive operations with it and then update it. It also provides full XML support making your life easier to share
data across applications such as Web Services.