Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server / SQL-Server-2008

Change Data Capture (SQL Server 2008)

4.00/5 (7 votes)
11 Aug 2008CPOL5 min read 1  
Change Data Capture in SQL Server 2008 (shujaatsiddiqi.blogspot.com)

Introduction

Historically, change management has been difficult in the life of a database administrator. There have been many solutions adopted ranging from having fields like CreateData and ModifiedDate in every table to DML triggers for DML operations. DDL triggers have been the only choice available in SQL Server 2005 for managing DDL operations. With many new features introduced in SQL Server 2008, CDC (Change Data Capture) is one of the features which are worth being discussed.

Configuration

To understand the configuration of CDC in SQL Server 2008, we create a new database called CDC_Practice.

Image 1

When ‘New Database’ is selected from the context menu, the following form appears:

Image 2

We are creating our database with the default options (jut specify database name as CDC_Practice) and press OK button.

Now we should enable CDC for our database. SQL Server 2008 is equipped with a new stored procedure for that, which is named as sys.sp_cdc_enable_db. We just have to execute this stored procedure.

SQL
USE CDC_Practice
Exec sys.sp_cdc_enable_db

We can verify if it is enabled on our database or not by just checking the is_cdc_enable field added in sys.databases table.

Image 3

You also cannot avoid looking at some new system tables created when you have executed this command. There is also a user cdc created along with a schema named cdc.

Image 4

Generally we have a number of tables on which such kind of auditing is required. It is good that we can specify the specific tables for which we want to turn CDC on. We create a table EX_Table.

SQL
Use CDC_Practice
Go
Create table EX_Table
(
ID Int Identity(1,1) Primary Key NOT NULL,
Name varchar(10)
)

As I mentioned earlier, we can enable CDC on specific tables. So, we should try enabling this for our EX_table table.

SQL
Use CDC_Practice
Go
EXEC sp_cdc_enable_table 'dbo', 'Ex_table', @role_name = NULL, @supports_net_changes =1
Go

We can verify if CDC has been enabled with the help of a new attribute added in sys.tables, which is is_tracked_by_cdc. If the result is 1, then it is enabled on the specified table. We check it on our table EX_Table.

SQL
SELECT is_tracked_by_cdc FROM sys.tables
WHERE name = 'EX_Table'

We need to make sure that SQL Server Agent is running before enabling CDC on a table. This is because it also generates two jobs for the same purpose. The two jobs created are as follows:

  1. cdc.CDC_Practice_capture
  2. cdc.CDC_Practice_cleanup

DDL Capture

So far, we have only discussed about capturing data manipulation stuff. What about capturing DDL related tasks? In SQL Server 2005, DDL triggers were introduced. Till now, it was the only option to log DDL statements being executed. In SQL Server 2008, these DDL statements are available in a table, which is ddl_history. This table may also be used to create a database which is exactly the replica of the first database.

This is still more beneficial than DDL triggers because DDL trigger caused a single point of failure in a database application. If there is any issue with this, then there are no changes allowed in the database schema. With CDC, this problem is also reduced.

Stored Procedures Introduced

To support CDC, some stored procedures have also been introduced. They are as follows:

  1. sp_cdc_add_job: To create a capture or clean-up job.
  2. sp_cdc_change_job: To modify capture or cleanup job configuration.
  3. sp_cdc_cleanup_change_table: deletes all records in the change table.
  4. sp_cdc_disable_db: To disable CDC for current database.
  5. sp_cdc_disable_table: Disables CDC for the specified table
  6. sp_cdc_drop_job: To drop a change or capture job.
  7. sp_cdc_enable_db: Enables CDC for the specified database.
  8. sp_cdc_enable_table: enables change tracking of the specified table. It also creates some utility objects like tables etc.
  9. sp_cdc_get_ddl_history: To give DDL history for the specified CDC instance.
  10. sp_cdc_help_change_data_capture: To provide CDC information for each table in current database on which CDC is enabled.
  11. sp_cdc_help_jobs: To give information about change and capture jobs.
  12. sp_cdc_scan: Internally called by sp_cdc_enable_table_change_data_capture. It executes the CDC log scan operation.
  13. sp_cdc_startjob: To start a CDC capture or cleanup job for the current database.
  14. sp_cdc_stopjob: To stop a CDC capture or cleanup job for the current database.

It must be noted that all of the above mentioned stored procedures are defined in sys schema.

Functions Introduced

There are two table valued functions added for each table being tracked by CDC. They are as follows:

  1. fn_cdc_get_all_changes_dbo_TableName: Created for each new table for which CDC is enabled.
  2. fn_cdc_get_net_changes_dbo_TableName

There are three required parameters for both of the above functions. They include the start LSN, End LSN and row filter option. Row filter option has possible values ‘all’, ‘all with mask’ or ‘all with merge’.

Metadata Functions

  1. CHANGE_TRACKING_CURRENT_VERSION
  2. Change_tracking_current_version()
  3. Change_Tracking_Is_Column_In_Mask()

System Tables Introduced

If CDC is not enabled on the database, then we will enable Change Data Capture on the database. When we enable the CDC on the Database the schema named 'cdc' is created and the following six system tables are also automatically created in the schema. It must be remembered that querying the system tables directly is not recommended. Instead the new stored procedures which are introduced in 2008, should be used.

  1. cdc.captured_columns: This holds information about all the captured columns of all the change tables.
  2. cdc.change_tables: This contains all the change tables in the database.
  3. cdc.ddl_history: The history of DDL statements after enabling CDC.
  4. cdc.index_columns: This holds information about each index column associated with change tables.
  5. cdc.lsn_time_mapping: Holds information about each transaction in change tables.
  6. cdc.*_CT: This table is created for every table which is tracked by CDC. The holds all the DML operations with data performed after enabling CDC. There are no more DML triggers required just for auditing changes in data in a table. CT is the abbreviation for Change Table.

Changes to sys.Databases Table

In SQL Server 2008, a new field has been added to sys.databases table. This field is is_cdc_enabled. This field tells whether CDC is enabled for a particular database.

Limitations

  1. Publisher / Subscriber model is not supported. So any operation cannot be triggered based on some changes. We still have to resort to our old approach of triggers for such requirements.
  2. It is an asynchronous operation. Though asynchronous operation minimizes the performance impact, there are some requirements in which only synchronous operation may be decided.
  3. CDC cannot be enabled on heaps, i.e. it cannot be enabled on those tables which do not have any primary key.

Note

There is one more feature released with SQL Server 2008 - Change Tracking. It must be remembered that it is different from CD:

  • It only gives you net changes in a database table.
  • It is synchronous.

History

  • 11th August, 2008: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)