During the early phase of your career, when you just started learning about .NET technology, you might have come across the sentence "ADO.NET is a disconnected database architecture". Well, it's true, but how? How can ADO be transformed to such an architecture which no other database architecture supports. Every other database programming supports only RecordSet
which you need to use to get data from the database. Have you ever thought of it ? If you didn't, it is time to rethink it now.
In this post, I will uncover a few internals of ADO.NET architecture, and also refer to my favorite tool Reflector to see the code that is written inside the .NET classes. My focus in the post is only on the internals of the architecture, so if you are new to ADO.NET, it would not be a good idea to read it over and confuse yourself more, rather read thorough Overview of ADO.NET and come back later.
Introduction
Well, In fact, I didn't ever think of writing such a post. Strangely, while talking with Devs, I found out people commenting that ADO.NET as a new way of accessing data missing out the point where ADO.NET differs from ADO. It is true, ADO.NET is disconnected. There are large performance gains of an application which release the Connection after fetching data from the database. ADO.NET has inherent capability to release the database connection, after the so called DataSet
/ DataTable
is filled with data which you might be looking for. There are few Adapters provided with .NET library which lets you Fill these DataSet
or DataTable
when you want and later you can use these cached data to your application. Yes, every line here is a fact. But internally there is still a Reader associated with every call. Let me detail the fact a little.
Details of Loading Data using Select Query
Say you invoke a Selection operation on the database using Select Query. Let's jot down the steps what happens internally:
- We create an instance of
DataAdapter
and call Fill
method to fill in the data. Fill
method is defined when DbDataAdapter
is called. - It sends the SQL script to the database process and database loads the data to memory.
- ADO.NET gets the pointer to the first row of data being loaded from the SQL server process.
DbDataAdapter.Fill
internally creates a DataReader
to read the data from the memory location. - The
DataReader
reads data sequentially and creates DataRow
for each Record
. DataRow
are enumerated into a DataTable
and the final Result is returned back to you. - After it returns the
DataTable
(a cached version of RecordSet
), you can eventually release the connection and can work with the data.
Thus from the above steps, you must be clear about the fact that there is nothing magical happening inside the architecture of ADO.NET, rather it uses the same way of loading data into .NET objects which eventually lets you manipulate the data without letting the connection remain active for long. So as you now know about the steps, let us take you deep into Reflector to see how the code looks like:
The code looks simple enough. As we pass the select
Query as SelectCommand
in DbDataAdapter
, it in turn creates a Native Scope to run your code and places the same into the DataTable
. The code Bid.ScopeEnter
actually creates a scope for the Command
API. The FillCommandBehavior
is an internal enumeration which identifies how the data is to be fetched. Now let me move ahead to this.Fill
.
Hmm, the Fill
seems to be a bit fishy. It does nothing, but checks the parameter it receives, does some manual checks and finally calls FillInternal
to get the result. So basically, there is nothing to look at this method, and let me quickly jump to FillInternal
now.
There are quite a number of things that are going on in FillInternal
. First of all, it is an internal method and is not exposed to the outer world, and it actually creates the data in a DataTable
and returns the result. In between, it opens the connection to database and finally closes it.
If you see the code above, it first gets the connection from command
object and tries to open it. QuietOpen
is a private static
method which just set the ConnectionState
to Open. The QuiteClose
on the other hand closes the connection (as marked in blue).
The most interesting part of the code is the DataReader
. Basically, after the connection is opened, ADO.NET actively creates a Reader
object and executes the Reader
to create the DataTable
. The command ExecuteReader
actually invokes the command we pass to the database, creates a DataStream
which points to the base location of the loaded memory (in which the resultant data is stored), creates an object to wrap each and every information and returns the Reader
object. So now after this line, the reader
will load the DataTable
using this.Fill
method. If you look into this Fill
overload, you will find it actually calls FillFromReader
which again calls FillLoadDataChunk
and eventually we come to FillLoadDataRow
. This method is one which we find interesting. Let's look at the code first:
So basically ExecuteReader
loads each DataRow
from the DataStream
for each Record
, and adds the same to the contextual DataTable
.
Hence, coming to the conclusion, ADO.NET is actually a wrapper to existing technology available to allow the developer to get Cached Managed Object(DataTable
) after loading the data loaded from SQL server leaving the Connection closed after DataTable
is totally loaded.
Choice between DataReader and DataTable
There is always a dilemma between the choice of DataTable
and DataReader
amongst developers. Some think DataReader
is faster as you are getting the data immediately, while some argue for DataTable
as being a total managed data object. Well, I must say each of them have its own merits and demerits.
- You should choose DataReader over DataTable when your requirement to a data is somewhat limited to 1 source. Say for instance, while generating an ASP.NET Grid. Thus once the
Reader
reads the whole data, the HTML is produced immediately, hence you eventually removed the loops and checks that are imposed (as shown above) during the creation of DataTable
. - You should choose DataTable over DataReader when you have to do random analysis on the data retrieved from the database and the data representation is not straight forward. Even if you want to feed the same data into multiple objects or want to show aggregation of data, it would be a good idea to use
DataTable
(and we can neglect initial data load).
Conclusion
Even though there are a lot of things left out from the post, the main idea is to let you know how ADO.NET releases connection and the difference between DataReader
and DataTable
, which I think is clear enough. I would also like to notify that internally everything runs the same way as we think of and there is nothing magical happening in between. Even though it is a very old topic, I hope most of you like it.
Thank you for reading.