Change tracking for SQL Server is a flexible and easy-to-use technology for monitoring tables for inserts, updates, and deletes. In this post, I’ll discuss getting started with change tracking in SQL Server, and will show an example of how to get started with it.
Change Tracking Overview
Change tracking is a lightweight mechanism for tracking which rows have been inserted, updated, and deleted in tables monitored by change tracking. Change tracking first appeared in SQL Server 2008 and has been in every version since. Even better, change tracking is available in every edition of SQL Server, even the free express edition.
In a nutshell, here’s how it works: In tables that have change tracking enabled, the net change to each row is tracked internally and is accessible through change tracking functions. Each change has a version ID attached to it, and there will be a new version ID for every row that is inserted, updated, or deleted. The change tracking version is an 8-byte integer (BIGINT) that reflects the most recent change ID in that database. It is important to note that the change tracking version is not table specific – a DML operation in any tracked table in each database will increment the version number. The version number will always be sequential, but will not necessarily be contiguous within a single table if there is more than one table enabled for change tracking.
Change tracking is enabled at the database level. After that, each table that will be monitored must be individually enlisted in change tracking. Each table to be monitored by change tracking must have a primary key, as this is the row-level identifier used to report on DML operations within change tracking. When enabling change tracking at the table level, you can opt to track the column(s) changed in the most recent update, which will give you greater visibility into what was changed.
Once set up, using change tracking is a relatively simple process. There are a few functions – most notably, CHANGE_TRACKING_CURRENT_VERSION() and CHANGETABLE(), that can be used to check the current version stamp in change tracking and retrieve the list of recent changes. I’ll demonstrate both of these functions shortly.
Change Tracking is Not Audit Logging
I’m going to be careful not to use the words audit or logging to describe change tracking. Let me be clear: this is not a full logging mechanism. The change history is not tracked at all – change tracking only reports the fact that a change occurred, but does not retain the version history. Consider the case of a row of data with an ID of 1234. That row is inserted, then updated 5 times, and then deleted. Change tracking would not show the history of insert, update, and delete; rather, it would report only the net change, that row ID 1234 was deleted. If your load process requires detailed logging history for each change (rather than just the delta of all changes), you’d have to use something like Change Data Capture.
Setting Up Change Tracking
Enabling table-level change tracking is a two step process. First, it must be enabled on the database. This can be done through the UI in the Database Properties, on the Change Tracking tab.
As shown, there’s not much to configure when enabling change tracking on a database. Simply set the Change Tracking value to True to set up change tracking for that database. Optionally, the Retention Period value can be tweaked as well. The default value is 2 days, which I have overridden in this example to use 14 days instead. As with most UI operations, there is a T-SQL command to do the same thing. The command to set up change tracking on this database is below.
ALTER DATABASE [ChangeTracking] SET change_tracking = ON (change_retention = 14 days)
After this step, change tracking is enabled, but it’s not yet tracking anything. It still has to be enabled for each table to be tracked. The Table Properties UI makes this very easy.
As shown, simply changing the Change Tracking value to True enables change tracking for this table. In this example I also opted to track the columns changed during updates (more on this in a bit). The T-SQL equivalent for this operation is:
ALTER TABLE [dbo].[Divisions] ENABLE change_tracking WITH (track_columns_updated = ON)
The last step above would be repeated for each table to be tracked in change tracking. Once change tracking is enabled, any changes (inserts, updates, or deletes) to that table will be stored in the change tracking cache.
Using Change Tracking
For the above example, I’m going to insert, update, and delete some data to demonstrate how to access the change tracking data generated for those DML operations. For reference, here is the table structure.
CREATE TABLE Emp
ID INT PRIMARY KEY IDENTITY(1, 1),
Recall that I mentioned earlier that change tracking uses a version ID to track the current version of the tracked tables. That version ID is our timeline marker for detecting changes. To retrieve that value, there is a very simple function: CHANGE_TRACKING_CURRENT_VERSION(). It is used as shown below.
On my test system, this value is 470 (since I’ve run several tests prior to this writing). This is the starting point, and any changes made from this point forward would trigger a new version number. I’ll make a note of that value, and will now make some changes to the [Emp] table described above. I’ll insert a handful of rows to show how change tracking displays inserts.
INSERT dbo.Emp VALUES (‘Joey’ ,NULL ,‘Tribbiani’)
INSERT dbo.Emp VALUES (‘Chandler’ ,NULL, ‘Bing’)
INSERT dbo.Emp VALUES (‘Ross’,NULL,‘Geller’)
INSERT dbo.Emp VALUES (‘Monica’,NULL,‘Geller’)
INSERT dbo.Emp VALUES (‘Rachel’,NULL,‘Green’)
INSERT dbo.Emp VALUES (‘Phoebe’,NULL,‘Buffay’)
After inserting these six rows, I check the CHANGE_TRACKING_CURRENT_VERSION() value again and find that the value is now 476. It has been increased by 6 – one per row inserted, which is what I’d expect. Now I’ll put that version number to use, querying change tracking for the metadata for the rows inserted. To do that, I’ll call the CHANGETABLE() function.
FROM CHANGETABLE(CHANGES dbo.Emp, @ver)
To break this down:
- CHANGETABLE is the table-valued system function that will return the list of changes stored in change tracking
- CHANGES indicates that I’m looking for the changes since the specified version
- @ver is the variable I set up to store the version number. CHANGETABLE will return all of the results reflecting changes since this version. Note that you can use a variable as I did, or just pass in a scalar number (using the literal 470 here would have accomplished the same thing)
When I run the code above, I receive the following result set.
This tells me the version of the insert and/or update, the operation (I, U, or D for insert, update, or delete, respectively), the column mask for update operations (more on this momentarily), and the primary key of the row impacted by this change. Because CHANGETABLE() returns a table, I could easily join this result set back to the original table to see the change operation along with the current data in that table.
This will look a little different for an update operation. Next I’ll execute an update statement, but first, I’m going to note the current version of change tracking (which is still 476).
SET @ver = CHANGE_TRACKING_CURRENT_VERSION()
Now the update statement, which will update two rows in the [Emp] table:
SET MiddleName = ‘Muriel’
WHERE FirstName = ‘Chandler’ AND LastName = ‘Bing’
SET LastName = ‘Bing’
WHERE FirstName = ‘Monica’ AND LastName = ‘Geller’
Now when I run the CHANGETABLE() code from above, using the more recent change tracking version (476) as a starting point, I get a different result set:
This is the metadata for all of the changes since version 476, which only includes the two rows updated from the UPDATE statement above. Notice that the creation version is null, because this was an insert and not an update. Also, the SYS_CHANGE_COLUMNS value is now populated, though the value doesn’t really show us what has changed (yet). This is a good time to talk about the change tracking function CHANGE_TRACKING_IS_COLUMN_IN_MASK(). This function will check to see if the specified column has been updated since the most recent version. Its syntax is a bit quirky, but to check whether the MiddleName was updated, the query would look like this:
FROM CHANGETABLE(CHANGES dbo.Emp, @ver) AS ct
Honestly, I don’t know that I’ve ever used the CHANGE_TRACKING_IS_COLUMN_IN_MASK function. It’s a bit of a pain because the syntax is difficult, and you have to run this for each column you want to check. Most of my work is in data warehousing, and I’ve run into few cases where I need to know exactly which columns were updated – I just want to know if the row has been updated. However, for other scenarios (especially in OLTP), I can see the need for this.
I’ve demonstrated inserts and updates. Let’s look at what a delete would look like. Again, I’ll make a note of the current version number – 478 – for the next operation. I’ll now delete one row of data:
WHERE FirstName = ‘Joey’
AND LastName = ‘Tribbiani’
Having deleted one row, I’ll run CHANGETABLE() again to see what change tracking reports for this operation.
FROM CHANGETABLE(CHANGES dbo.Emp, 478)
I find the one row I deleted in the last operation, with the SYS_CHANGE_OPERATION set to D (delete):
Now, remember that the version number makes a difference here! The version number passed into CHANGETABLE() is the starting point for any changes returned by that function. Through this exercise I have been checking the change tracking results after each DML operation; however, I can set the starting version number to any valid version number, or simply use NULL to get all available change tracking results for that table. To demonstrate, I’ll set the value back to version 470 – the starting point before any updates – to show what the full history would look like. When I rerun CHANGETABLE() using our original change tracking version, I get the following:
There are a couple of predictable nuances here. First of all, the row showing record ID of 1 (which was the Phoebe Buffay record I deleted) shows up simply as a delete operation, even though this row was inserted and subsequently deleted since the starting version number. Remember, it’s the delta that will be shown – each operation against that row is not retained in change tracking. For IDs numbered 2 and 4 – which were the two rows that I inserted and subsequently updated – the SYS_CHANGE_OPERATION shows an insert, even though we updated both records after insert. The tell is that the SYS_CHANGE_VERSION and SYS_CHANGE_CREATION_VERSION on these rows do not match, indicating that the most recent change was not the insert.
Change tracking is a simple and lightweight means of change detection in SQL Server. As shown, using change tracking allows relatively easy access to new, changed, and deleted data, eliminating the need for brute-force comparisons or other costly means of change detection. In my next post, I’ll look at this from an ETL perspective, integrating change tracking into an end-to-end load process.