Database Project Deployment -- An introduction to the DependencyList

jimbobmcgee

2.36/5 (6 votes)

29 Sep 2008CPOL6 min read

167

A means to deploy Visual Studio database projects, ordering SQL files using a List that yields in the order of dependency.

Introduction

Recently, I have been working with database projects within Visual Studio 2005 (yes, I'm set in my ways -- I haven't got around to VS2008 yet!!). I like the ability to source control my creation scripts and Stored Procedures, and I'm happy to use the VS IDE to write my SQL code and running individual scripts against multiple development database references is a breeze.

However, deploying the finished scripts to a production database is a bit of a nightmare. Essentially, if each script file represents a single database object, you have to ensure that you run them all in the correct order -- especially if you have foreign key constraints (you do have foreign key constraints, don't you?!). Ordering all these scripts manually is a royal pain.

After the second or third time of being stuck in this situation, I decided enough is enough, and set about writing a small app to do the job for me. The results of this formed the mighty DependencyList.

The DependencyList

The DependencyList is a class I have written that can be accessed and modified like a regular List<T> but also allows you to determine that certain items are dependent on each other. During the enumeration process, using GetEnumerator or foreach, the dependencies of such items are always yielded first.

To assist with this (and promote possible extension to additional collections), I started out with a basic interface:

public interface IDependencyCollection<T> : ICollection<T>
{
    void AddDependencies(T item, params T[] dependsOnItems);
    void AddDependencies(int index, params int[] dependsOnIndices);
    void RemoveDependencies(T item, params T[] noLongerDependsOnItems);
    void RemoveDependencies(int index, params int[] noLongerDependsOnIndices);
    void ClearDependencies();
}

This determines that I will use the basic ICollection principals so the class is familiar to implementors, but I have added the means to add and remove dependencies, based on value or index.

My basic means to determine dependencies is to declare a child class that provides a Yield method and contains a YieldingDelegate, which is a delegate with the same method signature as the Yield method. I've chosen to make this a private child, with internal members, because it is not necessary for implementors to see/use this child class outside of the DependencyList context. Laziness is responsible for my not wrapping my fields in properties and other such 'good practice' coding, but I figure that, since it is a private class, why add the overhead of a property get/set method call?

It is also important that an item is not yielded more than once per enumeration, so I have also added a private enum which defines the three states of an item: not yielded, yielding, and yielded. The yield process is only really concerned with 'not yielded' and 'yielded'; the 'yielding' state is primarily used in debugging to detect circular dependencies:

public class DependencyList<T> : IDependencyCollection<T>, IList<T>
{
    private delegate IEnumerable<T> YielderDelegate();
    private enum YieldState { None = 0, Yielding, Yielded }

    private class DependencyListNode
    {
        internal readonly T value;
        internal DependencyListNode(T value) { this.value = value; }
        internal YielderDelegate Yielding;
        internal YieldState state = YieldState.None;
        internal void ClearYielding() { Yielding = null; }

        internal IEnumerable<T> Yield() { /*...*/ }
    }

    //...
}

The Yield method is responsible for checking the yield state and, if None, checking the YielderDelegate. If the YielderDelegate is not null, it invokes the YielderDelegate. After the YielderDelegate is finished, the Yield method yields its own value:

public class DependencyList<T> : IDependencyCollection<T>, IList<T>
{
    //...

    private class DependencyListNode
    {
        //..

        internal IEnumerable<T> Yield()
        {
            if (state == YieldState.None)
            {
                state = YieldState.Yielding;

                if (Yielding != null)
                {
                    IEnumerable<T> e = Yielding();
                    foreach (T t in e) yield return t;
                }

                yield return value;

                state = YieldState.Yielded;
            }
        }

        //...


    }

    //...
}

Implementing the interfaces

My DependencyList class can now be fleshed out with the IList and IDependencyCollection interface implementations. As you will see, I've not tried to reinvent the wheel as far as the basic List functionality goes; instead I add a List of DependencyListNodes to the class and expose it using the IList interface:

public class DependencyList<T> : IDependencyCollection<T>, IList<T>
{
    //...

    private readonly List<DependencyListNode> nodes = 
            new List<DependencyListNode>();

    #region IList<T> members
    public void Insert(int index, T item)
    {
        nodes.Insert(index, new DependencyListNode(item));
    }
    public int IndexOf(T item)
    {
        for (int i = 0; i < nodes.Count; i++)
            if (Equals(nodes[i].value, item)) return i;

        return -1;
    }
    public void RemoveAt(int index) { nodes.RemoveAt(index); }
    //...
    #endregion

    #region ICollection<T> members
    public void Add(T item) { nodes.Add(new DependencyListNode(item)); }
    public void Clear() { nodes.Clear(); }
    public int Count { get { return nodes.Count; } }
    //...
    #endregion

    //...
}

(Note that I've not included the entire interface implementation here, but you should be able to get the idea.)

Next up is to add the IDependencyCollection implementation. This is the key method behind how this class works. Adding a dependency to a dependent item is a case of adding the Yield method of the item that must be yielded first, to the YielderDelegate of the dependent item. Since the YielderDelegate is always invoked before the node's value is yielded, the net effect is a recursion of Yield methods, in the order of dependency:

public class DependencyList<T> : IDependencyCollection<T>, IList<T>
{
    //...

    public void AddDependencies(int index, params int[] dependsOnIndexes)
    {
        if (dependsOnIndexes == null)
            throw new ArgumentNullException("dependsOnIndexes");
        if (index < 0 || index >= nodes.Count)
            throw new ArgumentOutOfRangeException("index",
                      index, "Index out of range"
        );
        foreach (int i in dependsOnIndexes)
            if (i < 0 || i >= nodes.Count)
                throw new ArgumentOutOfRangeException("dependsOnIndexes",
                          i, "Index out of range"
        );

        foreach (int i in dependsOnIndexes) nodes[index].Yielding += nodes[i].Yield;
    }

    //...
}

(Again, I've not fleshed out all the methods here -- RemoveDependencies(int, params int[]) is the same; except for the delegate assignment operator += becomes -=, and the ClearDependencies method invokes DependencyListNode.ClearYielding for all nodes, which sets the YielderDelegates back to null. The value-based Add/Remove methods simply use IndexOf to find the first index of a value, and then supplies the said index to the Add/Remove methods shown).

Deploying Database Projects

Now, my DependencyList class is complete, I can move on to the Database Project Deployer.

This consists of a basic form, with input folder and output file dialog boxes. For added functionality, I've also included a multi-select list box which is populated with all the SQL files found in the input folder. The user can choose to include/exclude certain files using this list box:

When the user clicks the OK button, each selected file is added to a new class called the ScriptGenerator. This class is responsible for parsing each SQL file, adding it to a DependencyList, and determining if it has any dependencies on any other SQL file. This class works on the single assumption that each SQL file corresponds to a given database object, and that the name of the file is the name of the database object it creates, e.g., 'table1.sql'. It is not case sensitive.

Generating the Script

The ScriptGenerator consists of a DependencyList<string>, which stores the contents of the SQL files, and a Dictionary<string, int> which maps the object name (taken from the filename) against its indexed position in the DependencyList. It exposes two public methods, AddSqlFile(string) and Generate(string), back to the form.

public class ScriptGenerator
{
    private readonly DependencyList<string> sqlList = 
                     new DependencyList<string>();
    private readonly Dictionary<string, int> nameIndexes = 
                     new Dictionary<string, int>();

    public void AddSqlFile(string sqlFilePath)
    {
        string fileName = 
          Path.GetFileNameWithoutExtension(sqlFilePath).ToLowerInvariant();
        using (StreamReader reader = new StreamReader(sqlFilePath))
        {
            sqlList.Add(reader.ReadToEnd());
            nameIndexes.Add(fileName, sqlList.Count - 1);
        }
    }

    public void Generate(string outputFilePath)
    {
        ResolveDependencies();
        using (StreamWriter writer = new StreamWriter(outputFilePath))
        {
            foreach (string sql in sqlList)
            {
                writer.WriteLine(sql);
                writer.WriteLine();
                writer.Flush();
            }
        }
    }

    //...
}

The observant amongst you might have noticed the ResolveDependencies() method tucked into Generate. This is the key part of the ordering system, as it is responsible for adding the dependencies within the DependencyList. It does this simply by index-looping through DependencyList, using for, and looping through the stored keys, trying to find that key within the SQL at that indexed position in the DependencyList. There is an additional caveat to this process, namely that there are a number of different ways to write SQL files -- the object names may be repeated, and qualified by spaces, square brackets, quote marks, periods, etc. To mitigate this, I have added a basic private struct to store the various qualifiers I have identified (there may be more), and looped through an array of them during each key check. The private Qualifier struct looks much like this:

public class ScriptGenerator
{
    //...

    private struct Qualifier
    {
        internal static readonly Qualifier[] QUALIFIERS = new Qualifier[] {
            new Qualifier('"', '"'),
            new Qualifier('[', ']'),
            new Qualifier(' ', ' '),
            new Qualifier(' ', '('),
            new Qualifier(')', ' '),
            new Qualifier(')', '('),
            new Qualifier('.', '.'),
            new Qualifier('.', '('),
            new Qualifier('.', ' '),
            new Qualifier('(', ')')
        };

        private readonly char start;
        private readonly char end;

        internal Qualifier(char start, char end)
        {
            this.start = start;
            this.end = end;
        }

        internal string Qualify(string str)
        {
            return string.Concat(start, str, end);
        }
    }

    //...
}

Resolving Dependencies

The ResolveDependencies method looks like this:

public class ScriptGenerator
{
    //...
    private void ResolveDependencies()
    {
        for (int s = 0; s < sqlList.Count; s++)
        {
            string sql = sqlList[s]
                .ToLowerInvariant()
                .Replace(Environment.NewLine, "\n")
                .Replace('\t', ' ');

            foreach (string key in nameIndexes.Keys)
                foreach (Qualifier q in Qualifier.QUALIFIERS)
                    if (s != nameIndexes[key] && sql.Contains(q.Qualify(key)))
                    {
                        sqlList.AddDependencies(s, nameIndexes[key]);
                        break;
                    }
        }
    }
    //...
}

The end result is that all the SQL files are ordered appropriately, and spooled to a single output script that can be executed on the target database with far less effort than all those separate scripts. Obviously, your mileage may vary, depending on how you've written your SQL code, although most ordering errors can be resolved by adding suitable qualifiers and remembering that it is just a dumb string.Contains(string) call that determines dependencies. I'm sure there are better ways to parse object names from the SQL code, but this suits my needs in 99% of cases.

Hope this helps with future database projects...

History

v1.0

Initial version.

Originally posted here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)