Introduction
LINQ is a great thing. But when I have all my datatables together and I filled them with data, I had to think, to search on the net, how could I query the desired data from the datatables with a LINQ query. But I have the appropriate SQL query immediately in my head. It would be great - I thought - if I could query the data from the datatables. And when I searched the net, I saw, that I'm with my wish not alone.
Therefore I decided to code a class library, by which I can run SQL query against a dataset, which contains one ore more datatables. Ironically, I learned during the coding a lot about LINQ.
The library contains four classes:
- SqlToDatasetBase: it contains the basic data structures to deal with data in the dataset
- SqlSyntaxChecker: it checks the syntax of the query (more details see below)
- SqlTranslator: this class translates the query and generates a
System.Data.DataTable
, which contains the selected datas - SqlQuery: this class runs the query and returns the above data table
The following SQL functions, operators, etc. are supported:
- SELECT Statement
- multiple JOINS (only INNER JOIN, LEFT JOIN and RIGHT JOIN), based only one condition, f.e. t1.f1 ON t2.f1 AND t1.f3 ON t2.f3 is at this time not supported
- multiple WHERE clauses (AND, OR, NOT)
- GROUP BY with multiple fields
- ORDER BY with multiple fields
- MIN, MAX, SUM, AVG, COUNT, FIRST, LAST functions
- >=, <=, =, <>, >, <, NOT IN, IN, NOT LIKE, LIKE operators
- Datatypes: System.Char, System.String, System.DateTime, System.Boolean, System.Byte, System.Decimal, System.Double, System.Int16, System.Int32, System.Int64, System.SByte, System.Single, System.UInt16, System.UInt32, System.UInt64
- special chars in the field names (such fields need to stand in [ ])
- special chars are: space, ',", !, %, \, =, ?, ;, :, #, &, @, {, } (of course, the list can be extended)
The following are not supported at this time:
- alias names are recognized during the syntax check, but they can't be referred in the other parts of the SQL query
- mathematic and other expressions are recognized during the syntax check, but not evaulated in the SqlTranslator class (f. e. SELECT 2*f1 FROM t1 returns only f1 and not 2*f1)
- "grouping" of WHERE conditions is at this time not supported (f. e. (a=1 AND b=2) OR (c=3 AND d=4))
- subqueries
- star in the SELECT statement (the field names to be selected must be listed explicitly)
But I'm intending to code this too.
Speed:
The speed of a query is composed of three parts (in millisecs):
- SyntaxCheckTime: elapsed time for the syntax checking
- TranslateTime: elapsed time for the translating of the query
- OutputTime: elapsed time for the output of the selected data in a datatable
Checking the syntax:
- general syntax checking of the SQL language
- universal checking of the fields in all the clauses (f. e. whether one field in GROUP BY is present in the selected datatables, etc.)
- field type checking (f . e. if f1 is a string field and it is used like this: WHERE f1>10)
- ambiguous field checking
- and a lot more
Background
How it works? I load in the SqlToDatasetBase class the table structure of all the tables in a dataset. Then I load all the rows in a List(Of DataRow)
. Now I'm able to query this collection with a LINQ query. I parse the SQL query, store everything, what I need and then run a LINQ query against the collection. If there are WHERE, GROUP BY or ORDER BY clauses, I generate the appropriate LINQ query with expression trees (by the way, this was the hardest part).
After all of that I have a IEnumerable(Of Object())
, which I can iterate to add the rows to the output datatable.
Using the code
I added to this article a sample project to show, how to use the class library.
Basically, there are three steps:
- once we read the data in our datatables, an instance of the SqlToDatasetBase class must be declared:
Dim test_sqltodataset = New SqlToDataset.SqlToDatasetBase
- after that we need to add the datatables to this instance:
test_sqltodataset.Tables.Add(conn.Dataset.Tables(0))
test_sqltodataset.Tables.Add(conn.Dataset.Tables(1))
test_sqltodataset.Tables.Add(conn.Dataset.Tables(2))
test_sqltodataset.LoadData()
- then we pass the SQL query in text form to the SqlQuery class; if there are errors, they can be catched:
Dim sqlquery As New SqlToDataset.SqlQuery(querytext)
dgvOutput.DataSource = Nothing
Try
dgvOutput.DataSource = sqlquery.RunQuery(test_sqltodataset)
Catch ex As SqlToDataset.SqlSyntaxChecker.SqlSyntaxException
MessageBox.Show(ex.Message, "SqlToDataset", MessageBoxButtons.OK, MessageBoxIcon.Error)
Exit Sub
End Try
If needed, the speed times can be analysed.
The sample application can be used the following:
- you load the sample data from the database in the tree datatables (Load database)
- you create an instance of the
SqlToDataset
(Create SqlToDataset) - you choose a sample query (they are only for demonstrating purposes of the capabilities of the library) or you enter one
- then you start the query (Run SQL query)
- the result will be displayed in a datagridview.
Points of Interest
The parsing of the query wasn't difficult, but the building of the expression trees have caused some headache to me.
Licensing
The class library uses the CPOL licensing system, with some restrictions:
- the library is closed source
- it is forbidden to extract the source code (refactoring)
- it is forbidden to sell the class library in your name or use it on its own to earning purposes in any way