(untagged)

Continuous Integration for Database Development

Red Gate Software

14 Mar 2011

Continuous integration makes development more efficient – allowing bugs to be caught earlier, and providing rapid validation. But for too long, it hasn’t been standard practice for databases. This is an overview of implementing CI so you can bring database code in line with application development.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Continuous integration is the business of making sure that everything in your software project builds and gets tested automatically and frequently. So in most situations, each code commit kicks off a build, and the build has a suite of unit and integration tests run against it. This means you get quick validation of your changes, and notification of problems.

Martin Fowler wrote the definitive article on continuous integration back in 2000, so it’s not a new idea. In fact, it’s standard practice for a lot of development teams working on application code.

But for the database layer, it’s a little harder.

Why is continuous integration a problem for databases?

Database code just isn’t like application code. It doesn’t live neatly as a set of files you can copy around, and you don’t compile it. This means the first problem is that there’s nothing to put in source control. Quite apart from the benefits of change tracking and change sharing, source control offers you a single location for the “one true version” of your code. It’s the location you deploy from in continuous integration.

Then there’s the fact that SQL is declarative, and DDL statements modify the current state of a database. So every time you make a change, you must account for referential integrity, and ensure data is preserved.

In practice, this means migration scripts – lots of them. You need to determine the state of the source database, the differences with the target, and write a script to correctly migrate between the two. This is time consuming and can result in a lot of errors.

Solving the Problem: Source Control and Deployment

An ideal solution would allow you to get your database into source control, and automate creation of change scripts. It would include both the database schema and any static data required by the application. The database would go into source control alongside the application code, and it would be deployed from there by the build system.

The rest of this article outlines how you can actually implement these processes using the SQL Developer Bundle, particularly SQL Source Control.

The first thing to do is to get the database into source control. SQL Source Control isn’t a source control system itself. It’s an add-in for SQL Server Management Studio that connects it with your existing source control system.

The current version supports Subversion and Team Foundation Server. The forthcoming SQL Source Control 2.1 will include support for any source control system with a command line interface, initially shipping with SourceGear Vault and Mercurial support built in.

So to get set up you link the database to your source control system:

Continuous-Integration/image001.jpg

You enter details of your source control repository, link the database and then commit the objects:

Continuous-Integration/image002.png

With version 2 and onwards, you can also choose to source control your static data. To do this, right-click the database in the Object Explorer, and click Link/Unlink Static Data… A dialog box is displayed letting you choose the tables whose data you want to source control.

When the time comes to deploy, you need to get the database out of source control again. To support this, SQL Compare and SQL Data Compare have command line interfaces you can use on a build server, for example with MS Build, NAnt, or TeamCity.

Here’s an example of the command line script to deploy the AdventureWorks database from Team Foundation server.

cd "C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE" 
tf get "C:\Scripts\AdventureWorks" /version:T 
cd "C:\Program Files\Red Gate\SQL Compare 8" 
sqlcompare /scr1:"C:\Scripts\AdventureWorks" 
/s2:TestingServer\SQL2008 
/UserName2:TestUser 
/db2:AdvWrksTst 
/Report:"C:\SchemaDiffReport.html" 
/ReportType:Interactive 
/ScriptFile:"C:\SchemaSyncScript.sql" 
/sync 
cd "C:\Program Files\Red Gate\SQL Data Compare 8" 
sqldatacompare /scr1:"C:\Scripts\AdventureWorks" 
/s2:TestingServer\SQL2008 
/UserName2:TestUser 
/db2:AdvWrksTst 
/o:Default 
/ScriptFile:"C:\DataSyncScript.sql" 
/sync

The command lines are documented fully online – here for SQL Compare, and here for SQL Data Compare. But briefly, here’s what’s going on:

tf get "C:\Scripts\AdventureWorks" /version:T is the TFS command that updates the local copy with the latest source control version of the database.
C:\Scripts\AdventureWorks is the file path for your local folder
sqlcompare /scr1:"C:\Scripts\AdventureWorks" specifies the local folder as the source for the schema comparison
/s2:TestingServer\SQL2008 specifies the target server for the schema synchronization (deployment)
/UserName2:TestUser is the user name for the target server
/db2:AdvWrksTst specifies the target database on TestingServer\SQL2008
/Report:"C:\SchemaDiffReport.html" generates a report of the schema differences and writes it to the specified file
/ReportType:Interactive specifies the format of the report, in this case a detailed interactive HTML format
/ScriptFile saves a copy of the SQL script used to migrate the changes
/sync synchronizes the data sources, making AdvWrksTst the same as AdventureWorks
sqldatacompare /scr1:"C:\Scripts\AdventureWorks" specifies the local folder as the source for the data comparison
/ScriptFile:"C:\SchemaSyncScript.sql" saves a copy of the SQL script used to migrate the schema changes
/ScriptFile:"C:\DataSyncScript.sql" saves a copy of the SQL script used to migrate the data changes

Essentially, the script gets a copy of the latest database version from source control and deploys it to a testing server, creating detailed reports of the migration.

Conclusion

Continuous integration makes development projects more efficient – allowing bugs to be caught earlier, and providing rapid validation. For too long, databases have been not only excluded from continuous integration, but have had no source control at all.

The tools described here provide the source control and deployment automation that let you bring database code in line with the application development process.

This has been a quick overview of working databases into the continuous integration process. It’s been written about in much more detail in the Red Gate white paper on continuous integration, and by the blogger Troy Hunt, in his post on automated database releases.

If you want to try this out for yourself, you can download a free trial of the SQL Developer Bundle.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here