Managing Temporal Table History in SQL Server 2016

By: Aaron Bertrand

Problem

SQL Server 2016 introduced a new feature, Temporal Tables, which allow you to keep a historical record of all of the versions of each row in a table. As rows get introduced, changed, and deleted over time, you can always see what the table looked like during a certain time period or at a specific point in time.

You may have heard Temporal Tables referred to as system-versioned tables. What happens is that the historical versions of rows, according to the system time of their last modification, are stored in a separate table, while the current version resides in the original table.

For tables that don't change very often, this works very well - queries against the base table know, based on the filter criteria, whether to get the data from the base table or the history table. For tables with a high volume of insert/delete activity, however, or with rows that get updated frequently, the history table can quickly grow out of control. Imagine a table with 100 rows, and you update those 100 rows 100 times each - you now have 100 rows in the base table, and 10,000 rows in the history table (each update moves the prior version of the row into history, so 100 prior versions of each row).

While there are definitely going to be regulatory/auditing exceptions, in many cases, you won't want or need to keep every single version of every single row for all of time.

Solution

The MSDN article, Manage Retention of Historical Data in System-Versioned Temporal Tables, provides a few options:

Stretch Database
Table Partitioning (sliding window)
Custom cleanup script

The way these solutions are explained, though, leads you to make a blanket choice about retaining historical data based only on a specific point in time (say, archive everything from before January 1, 2017) or fixed windows (once a month, switch out the oldest month). This may be perfectly adequate for your requirements, and that's okay.

When I considered these solutions, I immediately envisioned a scenario that they wouldn't cover: what if I want to keep only the last three versions of a row, regardless of when those modifications took place? Or all previous versions from the past two weeks or the current calendar year, plus one additional version before that? If I archive or delete based only on a point in time, then I might keep too many versions of some rows, and no historical versions of other rows. If I want to keep the three previous versions, I can't possibly enforce that based on a point in time.

Any criteria can be accomplished, of course, if we put a little more thought into the "custom cleanup script" solution. The procedure demonstrated in the documentation accepts a specific datetime value, and deletes all historical data before that point. I'd like to demonstrate how to accomplish a selective delete (or archiving into yet another historical location) using a different set of criteria.

First, we need a base table, and a few rows:

CREATE TABLE dbo.Employees
(
  EmployeeID int PRIMARY KEY,
  FirstName  nvarchar(64),
  LastName   nvarchar(64),
  Salary     int,
  ValidFrom  datetime2(7) GENERATED ALWAYS AS ROW START NOT NULL,
  ValidTo    datetime2(7) GENERATED ALWAYS AS ROW END   NOT NULL,
  PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
) WITH (SYSTEM_VERSIONING = OFF);
INSERT dbo.Employees(EmployeeID, FirstName, LastName, Salary)
VALUES (1, N'Bobby', N'Orr',     25000),
       (2, N'Milt',  N'Schmidt', 25000),
       (3, N'Eddie', N'Shore',   25000);

Now, even though SYSTEM_VERSIONING is OFF, the ValidFrom and ValidTo values are populated with the time of the insert and the end of the day on 9999-12-31, respectively. If you update the data in this table right now, the ValidFrom value will change to the current time, but no historical version of the row will be stored anywhere.
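To see this behavior for yourself, a minimal sketch along these lines may help. It assumes the dbo.Employees table and rows created above; the transaction and the Salary + 1 change are only there for illustration and are rolled back so the sample data stays as it was.

```sql
-- Sketch only: observe the period columns while SYSTEM_VERSIONING is OFF.
BEGIN TRANSACTION;

  -- Before: ValidFrom is the time of the insert, ValidTo is 9999-12-31.
  SELECT EmployeeID, Salary, ValidFrom, ValidTo
    FROM dbo.Employees
    WHERE EmployeeID = 1;

  -- Illustrative change; any update to the row would do.
  UPDATE dbo.Employees
    SET Salary = Salary + 1
    WHERE EmployeeID = 1;

  -- After: ValidFrom should have moved forward, and no prior
  -- version has been stored anywhere.
  SELECT EmployeeID, Salary, ValidFrom, ValidTo
    FROM dbo.Employees
    WHERE EmployeeID = 1;

-- Undo the illustrative change so the rest of the walkthrough is unaffected.
ROLLBACK TRANSACTION;
```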

We can then create a history table. The columns and data types must match, but the history table can't have constraints. So we're going to create a clustered index on EmployeeID, ValidFrom, ValidTo:

CREATE TABLE dbo.Employees_History
(
EmployeeID int NOT NULL,
FirstName nvarchar(64),
LastName nvarchar(64),
Salary int,
ValidFrom datetime2(7) NOT NULL,
ValidTo datetime2(7) NOT NULL
);
CREATE CLUSTERED INDEX EmployeeID_From_To
ON dbo.Employees_History(EmployeeID, ValidFrom, ValidTo);

Next, we'll fictitiously populate it with some historical versions of these rows, just as if I had set this up a couple of years ago (this is *absolutely not* a demonstration of how Temporal Tables should work, nor a recommendation to ever do it this way; we're just trying to set up some dummy data):

-- a historical version representing when we updated salary:
INSERT dbo.Employees_History
(
  EmployeeID, FirstName, LastName, Salary, ValidFrom, ValidTo
)
SELECT EmployeeID, FirstName, LastName, 20000, DATEADD(YEAR, -1, ValidFrom), ValidFrom
  FROM dbo.Employees;
-- then another salary update from a year before:
INSERT dbo.Employees_History
(
  EmployeeID, FirstName, LastName, Salary, ValidFrom, ValidTo
)
SELECT EmployeeID, FirstName, LastName, 15000, DATEADD(YEAR, -1, ValidFrom), ValidFrom
  FROM dbo.Employees_History
-- and a row that has been "deleted" from the primary table
UNION ALL
SELECT 4, N'Phil', N'Esposito', 24500, '20150101', '20161231';
-- then, finally, let's add a new row that doesn't exist in history:
INSERT dbo.Employees(EmployeeID, FirstName, LastName, Salary)
VALUES(5, N'Brad', N'Marchand', 22750);

If we take a quick look, we have 4 rows in the base table, and 7 rows in history:

SELECT * FROM dbo.Employees;
SELECT * FROM dbo.Employees_History;

Figure: Base table and history table rows

Now, we can turn system versioning for the table ON:

ALTER TABLE dbo.Employees SET (SYSTEM_VERSIONING = ON
(
HISTORY_TABLE = dbo.Employees_History,
DATA_CONSISTENCY_CHECK = ON
));
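If you want to confirm that the change took effect, one option is to check the catalog. This is a small sketch, assuming the two table names used above; sys.tables exposes temporal_type_desc and history_table_id for this purpose.

```sql
-- Sketch only: confirm dbo.Employees is now system-versioned
-- and see which table it uses for history.
SELECT table_name    = t.name,
       t.temporal_type_desc,
       history_table = OBJECT_SCHEMA_NAME(t.history_table_id)
                       + N'.' + OBJECT_NAME(t.history_table_id)
  FROM sys.tables AS t
  WHERE t.object_id = OBJECT_ID(N'dbo.Employees');
```

For a system-versioned table, temporal_type_desc should read SYSTEM_VERSIONED_TEMPORAL_TABLE, and history_table should point at dbo.Employees_History.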

And just for kicks, let's update one row:

UPDATE dbo.Employees SET FirstName = N'Milton'
WHERE EmployeeID = 2;

What this does, effectively, is move the existing row from the base table to the history table, update its ValidTo value to the current time, then create a new row in the base table with the updated column and a new ValidFrom value. (That is what happens logically, but not necessarily what happens physically.) Now the two tables look like this - up top, I've highlighted the changed value in the base table, and below, the row that now appears in the history table:

Figure: Highlighting changes after an update
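You don't have to query the two tables separately to see this. As a rough sketch, assuming the tables and sample data above, the FOR SYSTEM_TIME clause lets you ask the base table for every version of a row, or for the table as it looked at a given moment. The @AsOf variable and the six-month offset are arbitrary choices for illustration; note that the period columns are stored in UTC.

```sql
-- Sketch only: all versions (current and historical) of one employee.
SELECT EmployeeID, FirstName, Salary, ValidFrom, ValidTo
  FROM dbo.Employees FOR SYSTEM_TIME ALL
  WHERE EmployeeID = 2
  ORDER BY ValidFrom;

-- The table as it looked at an arbitrary point in the past (UTC).
DECLARE @AsOf datetime2(7) = DATEADD(MONTH, -6, SYSUTCDATETIME());

SELECT EmployeeID, FirstName, Salary, ValidFrom, ValidTo
  FROM dbo.Employees FOR SYSTEM_TIME AS OF @AsOf
  ORDER BY EmployeeID;
```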

This should demonstrate the purpose of this article: As you update more rows in the base table, the history table can very quickly ramp up and take over your disk, especially if the rows are a lot wider than this simple example.
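One way to keep an eye on that growth is to compare row counts and reserved space for the two tables. The following is only a sketch, assuming the table names used in this tip; it reads sys.dm_db_partition_stats, which requires VIEW DATABASE STATE permission, and it only counts the heap or clustered index of each table.

```sql
-- Sketch only: compare the size of the base table and its history table.
SELECT table_name  = OBJECT_SCHEMA_NAME(ps.object_id)
                     + N'.' + OBJECT_NAME(ps.object_id),
       total_rows  = SUM(ps.row_count),
       reserved_kb = SUM(ps.reserved_page_count) * 8  -- pages are 8 KB
  FROM sys.dm_db_partition_stats AS ps
  WHERE ps.object_id IN (OBJECT_ID(N'dbo.Employees'),
                         OBJECT_ID(N'dbo.Employees_History'))
    AND ps.index_id IN (0, 1)  -- heap or clustered index only
  GROUP BY ps.object_id;
```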

Identifying Rows to Clean Up

Depending on the rules you want to use to determine which history rows to keep or throw away, it should be easy in this case to visually identify those rows, and then build the proper query.
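For instance, a rule such as "keep only the three most recent historical versions of each row" could be sketched with ROW_NUMBER(). This is an illustration under my own assumptions rather than a finished procedure: the @KeepVersions variable and the RankedHistory name are hypothetical, and the query only identifies candidate rows without deleting anything.

```sql
-- Sketch only: find history rows beyond the N most recent versions
-- of each employee. @KeepVersions is a hypothetical parameter.
DECLARE @KeepVersions int = 3;

;WITH RankedHistory AS
(
  SELECT EmployeeID, ValidFrom, ValidTo,
         -- 1 = most recent historical version for this employee
         rn = ROW_NUMBER() OVER
              (PARTITION BY EmployeeID ORDER BY ValidTo DESC)
    FROM dbo.Employees_History
)
SELECT EmployeeID, ValidFrom, ValidTo, rn
  FROM RankedHistory
  WHERE rn > @KeepVersions  -- candidates for deletion or archiving
  ORDER BY EmployeeID, rn;
```

Keep in mind that SQL Server does not allow direct modification of the history table while SYSTEM_VERSIONING is ON, so acting on the rows this returns means turning versioning off temporarily, as the cleanup script in the documentation does.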

First, let's look at the total set of rows we have in our base table and the history table together:

SELECT *, src = 'Base table' F
