Auditing Data Changes In Microsoft SQL Server

Introduction

Tracking changes in data over time is a common problem, and deciding on an approach depends on answering questions such as “Do I want to track every field or just some fields?”, “Does it need to be ‘live’ or is it okay to detect changes within a period of time?”, and “What audit fields are available to me and what degree of tracking is needed (e.g. deletions vs. just updates)?”

In this article, I’ll examine four different approaches, diving into some implementation details with an emphasis on contrasting the differences - including performance benchmarking. I’ve made my test harness available on GitHub. Executing this T-SQL script will not only create all the objects needed to demonstrate all four solutions but also output the performance numbers I will quote later, so you can check my work!

Motivation and Goals

The type of tracking I’m going to discuss is a general framework that’s largely transparent to applications and, in some cases, supported by the SQL engine itself. For example, you might have a requirement such as: “I’m interested in knowing when any user changes some important data, including who did it, when, and what was the exact change.” The challenge is coming up with a way to apply this to one or more tables, without having your application know or care about the implementation of your auditing.

This is achievable, but some basic requirements apply to all solutions:

You’ll need to track who last changed records.
You’ll need to track when records were changed.

This is useful information even if you never keep a history of changes, and it’s common to see the “who” handled through a text field (e.g. LastUpdatedBy), and “when” through a DateTime (or datetime2) field (e.g. LastUpdatedDate). The names of the fields are less important than their function. LastUpdatedBy might be sourced from SUSER_SNAME() but if you’re using forms authentication, you might prefer to use the application-maintained user ID. LastUpdatedDate might be sourced from GETDATE() or GETUTCDATE(), for example.
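As a minimal sketch of the “who” and “when” columns, the following uses a hypothetical table, dbo.Example_Audited, with hypothetical constraint names; none of these objects come from the test harness. It assumes Windows or SQL authentication, so that SUSER_SNAME() identifies the user; with forms authentication the application would supply its own user ID instead.

```sql
-- Hypothetical table and constraint names, for illustration only.
CREATE TABLE dbo.Example_Audited
(
    ExampleID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SomeValue       nvarchar(100) NOT NULL,
    -- "Who": defaults to the login name of the current connection.
    -- nvarchar(128) matches the return type of SUSER_SNAME().
    LastUpdatedBy   nvarchar(128) NOT NULL
        CONSTRAINT DF_Example_Audited_LastUpdatedBy DEFAULT (SUSER_SNAME()),
    -- "When": defaults to the current UTC time.
    LastUpdatedDate datetime2(7) NOT NULL
        CONSTRAINT DF_Example_Audited_LastUpdatedDate DEFAULT (GETUTCDATE())
);

-- The defaults fill in both audit columns on INSERT.
INSERT dbo.Example_Audited (SomeValue)
VALUES (N'initial');

-- Defaults do not fire on UPDATE, so the audit columns are set explicitly.
UPDATE dbo.Example_Audited
SET SomeValue       = N'changed',
    LastUpdatedBy   = SUSER_SNAME(),
    LastUpdatedDate = GETUTCDATE()
WHERE ExampleID = 1;
```

Because defaults only apply on insert, something (the application, a stored procedure, or a trigger) still has to maintain both columns on every update.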

Several solutions expose history tables that sit behind an application’s audited tables (which I’ll refer to as base tables). The structure of the history tables might be similar to the base tables - with perhaps a few extra attributes to support auditing. Or we could construct a history tracking system that captures changes in a single log table where we record the table name, field name, old value, new value, etc. The single table approach is something I’ve generally steered away from for a few reasons:

1. A “mimic” of the base table means when the business decides they want to track a field that was previously untracked, you may already have it. If you’re only writing out records at a field level, you have no way to “go back in time” to determine what the values were prior to the request to add the field. This may not be a big deal, but it’s a consideration.

2. A “mimic” of the base table supports easy point-in-time queries. In this case, you could use such a point-in-time query to restore individual records, if you need to. Constructing a point-in-time picture of the base record using a single change table isn’t impossible (if you have all the necessary data) but this can be difficult.

3. The act of pivoting data, in this case, would make the penalty for logging potentially significantly higher if, for example, we had 10 of 20 fields we wanted to audit on a single table. We might presumably do this in a trigger which needs to perform up to 10 INSERTs, instead of one that matches the shape of the row. In general, I favor solutions that minimize write penalty to base tables since the reading of history tends to be a rarer need. We can also optimize our history tables by only recording a subset of columns, where that makes sense.
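To illustrate why a “mimic” history table makes point-in-time queries easy, here is a rough sketch. The table History.Person_Example and its columns are hypothetical stand-ins for a history table holding one row per version of each record; the sketch also assumes that LastUpdatedDate is set on every change and that deletions are ignored.

```sql
-- Hypothetical history table: one row per version of each person.
-- @AsOf is the moment we want to reconstruct.
DECLARE @AsOf datetime2(7) = '2020-01-01T00:00:00';

WITH VersionsAsOf AS
(
    SELECT h.PersonID,
           h.FullName,
           h.Username,
           h.LastUpdatedBy,
           h.LastUpdatedDate,
           -- Rank 1 is the newest version written on or before @AsOf.
           ROW_NUMBER() OVER (PARTITION BY h.PersonID
                              ORDER BY h.LastUpdatedDate DESC) AS VersionRank
    FROM History.Person_Example AS h
    WHERE h.LastUpdatedDate <= @AsOf
)
SELECT PersonID, FullName, Username, LastUpdatedBy, LastUpdatedDate
FROM VersionsAsOf
WHERE VersionRank = 1;
```

Each result row already has the shape of the base record, so restoring an individual record is a matter of copying columns back. A single field-level log table would first have to be pivoted back into that shape.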

If we accept that use of history tables with one row per version of all base records is a goal, then all four approaches I’ll look at either do that or a close variation.

It’s worth noting that I’ve also created a “pivoting” job that does turn one row per version into one row per field change of interest, based on a configuration table, making some application screens faster where the format matched exactly what users wanted to see. This job didn’t have to maintain real-time changes and didn’t suffer from problem #1 listed above since all fields were available in history tables - it was effectively populating a materialized view.

Another goal here is to educate through a common example that runs through the various implementation options. The sample base table that I’ll be using has the following attributes,

    [PersonID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [FullName] [nvarchar](100) NOT NULL,
    [Username] [varchar](100) NOT NULL,
    [IsActive] [bit] NOT NULL,
    [Birthday] [date] NULL,
    [Age] AS (DATEDIFF(year, [Birthday], GETDATE())),
    [LastUpdatedDate] [datetime2](7) NOT NULL,
    [LastUpdatedBy] [varchar](50) NOT NULL

Alternative #1 Roll-your-own Snapshots

You might be interested in change tracking, but what if it’s tracking tables in a third-party system? You may not have the freedom to add triggers or change the schema, so are you stuck? No! One option if you’re willing to accept tracking over an interval is to use a set of T-SQL statements that can most easily be packaged in a stored procedure (per table). Such a procedure can be scheduled to run every few minutes (or hours), depending on your requirements. You’ll only pick up the last change in that interval, determined by comparing the current state in your base table versus the most recent state in your history table, based on a chosen natural key. (In our example, PersonID is our natural key.)

The history table we’ll use looks like the base table but with two additional fields,

    [RowExpiryDate] [datetime2](7) NOT NULL,
    [IsDeleted] [bit] NOT NULL

In the T-SQL script that I offer here, the stored procedure [History].[up_Track_Proc_Load] is what populates the history table, [History].[Track_Proc]. There are three basic steps:

1. Expire old records (as would happen on updates).
2. Insert new / changed records (supporting inserts and updates).
3. Flag deleted records.
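The three steps above could be sketched roughly as follows. This is not the body of [History].[up_Track_Proc_Load] from the test harness, only an illustration under stated assumptions: the current version of each record carries a far-future RowExpiryDate of '9999-12-31', the history table's PersonID is a plain int rather than an IDENTITY, the computed Age column is not stored, and a deleted record is flagged on its last known version.

```sql
SET XACT_ABORT ON;

DECLARE @Now       datetime2(7) = SYSDATETIME();
DECLARE @OpenEnded datetime2(7) = '9999-12-31';  -- assumed marker for "current" rows

BEGIN TRANSACTION;

-- Step 1: expire the current history row when the base row differs from it.
-- EXCEPT treats NULLs as equal, so the nullable Birthday compares safely.
UPDATE h
SET h.RowExpiryDate = @Now
FROM History.Track_Proc AS h
JOIN dbo.Track_Proc AS b
    ON b.PersonID = h.PersonID
WHERE h.RowExpiryDate = @OpenEnded
  AND h.IsDeleted = 0
  AND EXISTS (SELECT b.FullName, b.Username, b.IsActive, b.Birthday,
                     b.LastUpdatedDate, b.LastUpdatedBy
              EXCEPT
              SELECT h.FullName, h.Username, h.IsActive, h.Birthday,
                     h.LastUpdatedDate, h.LastUpdatedBy);

-- Step 2: insert a current row for every base row that has none.
-- This covers brand-new records and the records expired in step 1.
INSERT History.Track_Proc
    (PersonID, FullName, Username, IsActive, Birthday,
     LastUpdatedDate, LastUpdatedBy, RowExpiryDate, IsDeleted)
SELECT b.PersonID, b.FullName, b.Username, b.IsActive, b.Birthday,
       b.LastUpdatedDate, b.LastUpdatedBy, @OpenEnded, 0
FROM dbo.Track_Proc AS b
WHERE NOT EXISTS (SELECT 1
                  FROM History.Track_Proc AS h
                  WHERE h.PersonID = b.PersonID
                    AND h.RowExpiryDate = @OpenEnded
                    AND h.IsDeleted = 0);

-- Step 3: flag current history rows whose base row no longer exists.
UPDATE h
SET h.IsDeleted = 1,
    h.RowExpiryDate = @Now
FROM History.Track_Proc AS h
WHERE h.RowExpiryDate = @OpenEnded
  AND h.IsDeleted = 0
  AND NOT EXISTS (SELECT 1
                  FROM dbo.Track_Proc AS b
                  WHERE b.PersonID = h.PersonID);

COMMIT TRANSACTION;
```

Comparing whole rows with EXCEPT means the sketch does not depend on the base table maintaining LastUpdatedDate, which matters for third-party tables. If that column is reliable, comparing it alone would be cheaper.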

Starting from an empty base and history table, if we were to run this script,

    INSERT dbo.Track_Proc (FullName, Username, IsActive, Birthday, LastUpdatedBy, LastUpdatedDate)
    VALUES ('BobbyTables', 'bob', 1, '1/1/2000', 'inserter_guy', GETDATE());
    EXEC History.up_Track_Proc_Load;
    WAITFOR DELAY '00:00:02';
    UPDATE dbo.Track_Proc
    SET FullName = 'RobertTables', UserName = 'rob', LastUpdatedBy = 'updater_guy', LastUpdatedDate = GETDATE()
    WHERE PersonID = 1;
    EXEC History.up_Track_Proc_Load;
    WAITFOR DELAY '00:00:02';
    UPDATE dbo.Track_Proc
    SET UserName = 'robby', LastUpdatedBy = 'updater_guy', LastUpdatedDate = GETDATE()
    WHERE PersonID = 1;
    EXEC History.up_Track_Proc_Load;
    WAITFOR DELAY '00:00:02';
    DELETE dbo.Track_Proc
    WHERE PersonID = 1;
    EXEC History.up_Track_Proc_Load;
    SELECT * FROM History.Track_Proc
    ORDER BY LastUpdatedDate ASC;

We’d see the following contents in the History.Track_Proc history table,

Perso
