Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Using DataSets to work with (relational) in-memory data

4.44/5 (6 votes)
16 Mar 2015CPOL8 min read 39.6K  
Need some flexible in-memory data cache? DataSets might be what you're looking for...

After a previous blog, Using C# to connect to and query from a SQL database, I got a request to write a bit more about the SqlDataAdapter and DataSets. So here it goes.

Looking at the DataSet

So what exactly is a DataSet? Besides it obviously being a class in the .NET Framework the DataSet represents an in-memory cache of data. This data can be stored in multiple DataTables. DataTables contains DataRows, which are made up of DataColumns. DataTables in a single DataSet can be related to each other by what can best be described as foreign key relations, the DataRelation is used to establish such a relation. Additionally DataTables can contain contraints such as UniqueContraints and ForeignKeyContraints. In case you hadn’t noticed yet, this is the relational model we know from SQL databases!
So yes, you’d think that working with DataSets and SQL kind of go hand-in-hand and it is in fact quite easy to fill a DataSet using the SqlDataAdapter. We’ve seen this in the previous blog post, but let’s quickly take a look at it again.

C#
<code>DataTable table = new DataTable();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(table);
}</code>

Notice that I’ve used a DataTable instead of a DataSet. In this case all I’m getting from the database is a single resultset. If I would’ve used a DataSet the DataAdapter would’ve filled it with one DataTable, the one we have now. So in the next example I’m going to use a DataSet and do something with the data we get.

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
}

foreach (DataTable table in set.Tables)
{
    foreach (DataRow row in table.Rows)
    {
        Console.WriteLine("ID: {0}, Name: {1} {2}.", row["ID"], row["FirstName"], row["LastName"]);
    }
}

Console.ReadKey();</code>

Notice that I loop through the tables of the DataSet and through the Rows of the DataTable. the row["ColumnName"] accesses the value (boxed in an object) of the column in that row. Obviously if I had two tables, the other not being a Person, this code would break.

Accessing multiple tables and schema’s

So in the next example we’re going to load multiple tables into our DataSet. By default, the DataSet creates tables with the name “Table”, “Table1″, “Table2″, etc. If we want to access our tables by name rather than index, like we do with our columns, we’ll have to name them ourselves. You’ll see this in the next example.

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
}

foreach (DataTable table in set.Tables)
{
    if (table.TableName == "Person")
    {
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(String.Format("ID: {0}, Name: {1} {2}.", row["ID"], row["FirstName"], row["LastName"]));
        }
    }
    else if (table.TableName == "BusinessEntity")
    { //... 
    }
}

Console.ReadKey();</code>

And even better would be to not loop through tables at all, but simply access the table you want by name.

C#
<code>DataTable personTable = set.Tables["Person"];
foreach (DataRow row in personTable.Rows)
{ //...
}</code>

As you can see we’re querying for two tables (the first only having the ID column) and the second being our Person. It’s also possible to create your DataSet without having to query for data. This can be done by using the FillSchema method of the DataAdapter. The following will result in the same tables we had in the previous example, but without data.

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.FillSchema(set, SchemaType.Source);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
}</code>

Of course the SqlTableAdapter still has to do a roundtrip to the database to get the schema information. It will send the following query to get the schema (I’m not sure if it’s the same for all versions of .NET and/or SQL Server).

SQL
<code>SET FMTONLY OFF; SET NO_BROWSETABLE ON; SET FMTONLY ON;SELECT BusinessEntityID AS ID FROM Person.BusinessEntity
SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person
SELECT BusinessEntityID AS ID, AddressID, AddressTypeID FROM Person.BusinessEntityAddress
SELECT AddressID, AddressLine1, AddressLine2, City, PostalCode FROM Person.Address SET FMTONLY OFF; SET NO_BROWSETABLE OFF;
</code>

Getting relational

I mentioned that DataSets can also have relations between tables. Unfortunately you’ll have to add those yourselves. Why would you want to do this? Because you can now navigate from a row in one table to child rows in other tables. BusinessEntity and Person are related, one BusinessEntity represents one Person. However, a BusinessEntity can have one or more addresses through the BusinessEntityAddress table which has a relation to Address. Let’s load all those tables into our DataSet, create the appropriate relations and display the names and addresses of our business entities (note that I’m only printing the ID’s and names of people who have at least one address). Remember that in order to create a relation all rows from a child table need a row in their master table (just like in the database). This is especially tricky when you start using filters. For example getting ONLY Persons with FirstName like ‘A%’, but getting ALL addresses will result in addresses without a person (or a child without a master), which prevents you from creating a relation.

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person\r\n" +
    "SELECT BusinessEntityID AS ID, AddressID, AddressTypeID FROM Person.BusinessEntityAddress\r\n" +
    "SELECT AddressID, AddressLine1, AddressLine2, City, PostalCode FROM Person.Address", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Tables[2].TableName = "BusinessEntityAddress";
    set.Tables[3].TableName = "Address";

    set.Relations.Add("BusinessEntity_Person", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
    set.Relations.Add("BusinessEntity_BusinessEntityAddress", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["BusinessEntityAddress"].Columns["ID"]);
    set.Relations.Add("BusinessEntityAddress_Address", set.Tables["BusinessEntityAddress"].Columns["AddressID"], set.Tables["Address"].Columns["AddressID"]);
}

foreach (DataRow entityRow in set.Tables["Businessentity"].Rows)
{
    foreach (DataRow personRow in entityRow.GetChildRows("BusinessEntity_Person"))
    {
        foreach (DataRow entityAddressRow in entityRow.GetChildRows("BusinessEntity_BusinessEntityAddress"))
        {
            foreach (DataRow addressRow in entityAddressRow.GetChildRows("BusinessEntityAddress_Address"))
            {
                Console.WriteLine("The entity with ID {0} has name {1} {2} and address {3} {4} {5}.",
                    entityRow["ID"], personRow["FirstName"], personRow["LastName"],
                    addressRow["AddressLine1"], addressRow["City"], addressRow["PostalCode"]);
            }
        }
    }
}

Console.ReadKey();</code>

Now that’s quite a bit of code… I’m using a lot of (magical) strings to illustrate how it works. I suggest you create some constants or something. Don’t duplicate strings like that.
So take a good look at that example and let it sink in. You’ve just loaded your database, including relations, in code! That’s pretty neat! But it gets even better!

CRUD operations using DataSets

So we’ve looked at reading data, but what about creating, updating and deleting data? To do this you need to set the Insert, Update and DeleteCommands of your DataAdapter and call the Update method. You can do this manually, simply create three SqlCommands, one with an INSERT statement, one with an UPDATE statement and one with a DELETE statement. You can do with less if you know that certain actions are impossible, for example records from a certain table could never be deleted.
You can also use the SqlCommandBuilder, like I will do in the next example. The SqlCommandBuilder can create the insert, update and delete commands for you based on your (single table!) select query. So make sure you use the same select query in both your actual select and your update routine! The SqlCommandBuilder needs to get the schema of your table and does this using your select query. That also means it needs to make a roundtrip to the database (the same we saw with FillSchema). So when you really need that performance, or if you want to do things ‘right’, set your commands manually.
Another caveat when updating records is that you can only update one table at a time and it will do so row by row. If you update a DataSet and don’t specify a table the Update method will use the first table in the DataSet.
Before updating you can call HasChanges (to prevent going to the database if you have no changes) and GetChanges, which returns a new DataSet containing only changed DataTables and DataRows. When you do you’ll need to explicitly call AcceptChanges on your original DataSet to indicate the changes were successfully saved to the data source. Alternatively you can call RejectChanges to restore the DataSet to its original state (since the last Accept- or RejectChanges was called or since it was created). You can also accept or reject changes on DataTable and even DataRow level.
That’s A LOT of information, let’s look at some code!

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person ORDER BY ID", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
}

set.Tables[0].Rows[0]["FirstName"] = "Sander";

if (set.HasChanges())
{
    DataSet changes = set.GetChanges();
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person ORDER BY ID", connection))
    using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
    using (SqlCommandBuilder builder = new SqlCommandBuilder(adapter))
    {
        adapter.Update(changes);
        set.AcceptChanges();
    }
}</code>

You can insert and delete rows in the same manner. Let’s look at inserting a record. In this case we always need to insert a BusinessEntity or we’ll create a Person without a BusinessEntity. This is actually more complicated than it sounds. First, we’ll need all the columns that do not allow NULL values and set values. Second we need to let our DataSet know that there’s a relation between BusinessEntity and Person. We’ll also need to let the DataSet know that the BusinessEntityID is an ID field with an Identity Specification (which means we don’t have to set it ourselves), but our Person needs that same ID (we can use FillSchema for this!). Last we need to issue an update for each table seperately, which means creating seperate SqlCommands, SqlDataAdapters and SqlCommandbuilders…

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.TableMappings.Add("Person.BusinessEntity", "BusinessEntity");
    adapter.TableMappings.Add("Person.Person", "Person");
    adapter.FillSchema(set, SchemaType.Source);
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Relations.Add("Relation", set.Tables["Businessentity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
}

DataTable businessEntityTable = set.Tables["BusinessEntity"];
DataRow newBusinessEntityRow = businessEntityTable.NewRow();
businessEntityTable.Rows.Add(newBusinessEntityRow);

DataTable personTable = set.Tables["Person"];
DataRow newPersonRow = personTable.NewRow();
newPersonRow["PersonType"] = "EM";
newPersonRow["NameStyle"] = 0;
newPersonRow["FirstName"] = "Sander";
newPersonRow["LastName"] = "Rossel";
newPersonRow["EmailPromotion"] = 0;
newPersonRow.SetParentRow(newBusinessEntityRow);
personTable.Rows.Add(newPersonRow);

if (set.HasChanges())
{
    DataSet changes = set.GetChanges();
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand businessEntityCmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity", connection))
    using (SqlDataAdapter businessEntityAdapter = new SqlDataAdapter(businessEntityCmd))
    using (SqlCommandBuilder businessEntityBuilder = new SqlCommandBuilder(businessEntityAdapter))
    using (SqlCommand personCmd = new SqlCommand(
        "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
    using (SqlDataAdapter personAdapter = new SqlDataAdapter(personCmd))
    using (SqlCommandBuilder personBuilder = new SqlCommandBuilder(personAdapter))
    {
        businessEntityAdapter.Update(changes, "BusinessEntity");
        personAdapter.Update(changes, "Person");
        set.AcceptChanges();
    }
}</code>

We can also delete rows. In this database it’s a bit difficult because everything is related and some entities simply cannot be deleted (only made inactive). But we can, of course, delete our just inserted row. Most of the code above remains the same, except the part where we created the DataRows. I’ll use some LINQ to retrieve the last inserted BusinessEntity (assuming it’s the row we just inserted). Once I got the row I’ll first delete the child rows (Persons) and then I’ll delete the BusinessEntity.
Now here comes the tricky part, in the example above we first needed to insert the BusinessEntity to generate an ID so we could insert the Person. For deletion it’s the other way around! That means that if you would do updates, inserts and deletes all in one you’d get an exception. Either your inserts fail because you insert child rows before their parent rows are inserted or you delete parent rows before their child rows are deleted.
You can use an overload on GetChanges that accepts a DataRowState to create two DataSets, one with updates and inserts and one with deletes and execute them in the correct order.
In the next example I’ll use the DataRowState, but I’m only deleting rows.

C#
<code>DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.TableMappings.Add("Person.BusinessEntity", "BusinessEntity");
    adapter.TableMappings.Add("Person.Person", "Person");
    adapter.FillSchema(set, SchemaType.Source);
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Relations.Add("Relation", set.Tables["Businessentity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
}

DataTable table = set.Tables["BusinessEntity"];
List rows = table.Rows.Cast().ToList();
int maxId = table.Rows.Cast().Max(b => b.Field("ID"));
DataRow row = rows.Single(be => be.Field("ID") == maxId);
foreach (DataRow personRow in row.GetChildRows("Relation"))
{
    row.Delete();
}
row.Delete();

if (set.HasChanges())
{
    DataSet changes = set.GetChanges(DataRowState.Deleted);
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand businessEntityCmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity", connection))
    using (SqlDataAdapter businessEntityAdapter = new SqlDataAdapter(businessEntityCmd))
    using (SqlCommandBuilder businessEntityBuilder = new SqlCommandBuilder(businessEntityAdapter))
    using (SqlCommand personCmd = new SqlCommand(
        "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
    using (SqlDataAdapter personAdapter = new SqlDataAdapter(personCmd))
    using (SqlCommandBuilder personBuilder = new SqlCommandBuilder(personAdapter))
    {
        personAdapter.Update(changes, "Person");
        businessEntityAdapter.Update(changes, "BusinessEntity");
        set.AcceptChanges();
    }
}</code>

And there you have it! No one said working with DataSets was easy… In this post we looked at DataSets and SqlDataAdapters, it is possible to use DataSets with other data sources such as XML though! In fact, DataSets and DataTables are unaware of their data source, you can fill them however you like. Their structure also makes them perfect for binding to DataGridViews in WinForms (and probably WPF too, but I wouldn’t really know).
Their weakness is that they are not strong-typed. We are forced to work with strings to refer to table and column names, and all values contained in them are objects (which means they could be anything).
Microsoft saw this and created so-called Typed DataSets. I recommend you skip Typed DataSets and move straight to LINQ-To-SQL or the Entity Framework. That’s a whole different topic though.

Comments are welcome. Happy coding!

The post Using DataSets to work with (relational) in-memory data appeared first on Sander's bits.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)