Background
I’ve been looking into the new Microsoft Synchronization Framework, specifically, providers for databases.
Visual Studio 2008 has a new project item type, “Local Database Cache”, not to be confused with “Local Database” which is for the SQL Server Compact edition database synchronization. The compact database doesn’t allow foreign key relationships; therefore, it doesn’t suit my needs. While Microsoft is working on a designer for an Express edition, I plug on.
The data that is retrieved from the master table needs to be retrieved in an order that doesn’t violate foreign key constraints. The understanding is that we are working down the happy path of software development. The Synchronization Framework will handle the cases where there are errors due to other issues that don’t involve the order of the data retrieved.
The purpose of this article is to provide a way of sorting database tables based on the foreign key relationships that exist within the database.
Sorting Algorithm
The following is part of the AdventureWorks sample database from Microsoft. The diagram shows the basic fields and the foreign key relationships between some of the tables.
After four attempts at trying to sort tables, I came up with an algorithm that I believe works well. It is based on the principle that each table falls into one of the following categories:
- None – The table is not reference by another table and it does not reference another table.
- Referenced By – The table is referenced by another table.
- Both – The table is referenced by tables and references other tables.
- References – The table only references other tables.
The following diagram shows a picture of the database tables above, using a class notation of the tables organized into the four classifications:
Within each category, the tables are ordered by name as they don’t have a relationship with any other table within the category; the exception being category 3 (Both). Category 3 needs to be further sorted by which table references other tables.
The following fictitious database picture represents the final sorted order (D, B, E, C) of the tables within category 3. If you look at the arrows of the class, you will notice that all the arrows are flowing from right to left. This is the flow that we are looking for when performing the sort.
Given the AdventureWorks database table and the class diagrams shown above, the tables should be sorted into the following order:
- dbo.ErrorLog
- Sales.SalesTerritory
- HumanResources.Employee
- Sales.SalesPerson
- Sales.SalesOrderHeader
- Sales.Store
- Sales.SalesPersonQuotaHistory
- Sales.SalesTerritoryHistory
Using the Code
The project is broken into the following files:
- Program.cs – Main driver and prints the results of the sort.
- TableInfo.cs – Retrieves database information, creates the table relationships, and performs the sort.
- TableInfoComparer.cs – Comparison algorithm.
The table information is retrieved from SQL Server using the following two SQL scripts:
Retrieve the fully qualified table names in the current database:
SELECT TABLE_SCHEMA + '.' + TABLE_NAME AS TableName
FROM INFORMATION_SCHEMA.Tables
WHERE TABLE_TYPE='BASE TABLE'
ORDER BY TableName
Retrieve the relationship between tables:
SELECT
OBJECT_SCHEMA_NAME (fkey.referenced_object_id) + '.' +
OBJECT_NAME (fkey.referenced_object_id) AS ReferenceTableName
,COL_NAME(fcol.referenced_object_id,
fcol.referenced_column_id) AS ReferenceColumnName
,OBJECT_SCHEMA_NAME (fkey.parent_object_id) + '.' +
OBJECT_NAME(fkey.parent_object_id) AS TableName
,COL_NAME(fcol.parent_object_id, fcol.parent_column_id) AS ColumnName
FROM sys.foreign_keys AS fkey
INNER JOIN sys.foreign_key_columns AS fcol ON
fkey.OBJECT_ID = fcol.constraint_object_id
ORDER BY ReferenceTableName, ReferenceColumnName, TableName, ColumnName