Be Careful with Key Order in SQL Server Missing Index Recommendations

By: Aaron Bertrand || Related Tips:More >Indexing

Attend this free live mssqlTips webcast

Leveraging Storage Spaces Direct for SQL Server High Availability Thursday, July 19, 2018 - click here to learn more

Problem

I have been working with SQL Server for a long time, but I’m still constantly learning new things. Generally, its missing index recommendations can be useful at a high level, and some systems rely on them heavily. I haven’t spent a ton of time analyzing them because of the limitations in what they offer, such as often suggesting redundant indexes, never offering filtered indexes, and recommending indexes for queries that have run exactly once. And sometimes, these indexes are not created correctly, specifically in terms of the order of key columns. Not too long ago, I learned why.

Solution

It turns out that SQL Server just orders key columns based on their ordinal position in sys.columns. Let’s look at a simple example; we’ll create a table (with, on my system, about 85,000 rows):

CREATE TABLE dbo.DateFirst
(
id int IDENTITY(1,1) NOT NULL,
dt datetime NOT NULL,
ln nvarchar(200) NOT NULL,
CONSTRAINT pk_DateFirst PRIMARY KEY (id)
);
;WITH x AS
(
SELECT
dt = CONVERT(date,DATEADD(DAY,1-(o.[object_id]/10.0 * column_id) % 1000, modify_date)),
ln = LEFT(o.name + c.name, 200)
FROM sys.all_objects AS o
INNER JOIN sys.all_columns AS c
ON o.[object_id] BETWEEN c.[column_id] - 20 AND c.[column_id] + 20
)
INSERT dbo.DateFirst(dt,ln)
SELECT dt = MAX(dt), ln FROM x GROUP BY ln;

If we run a simple query against this table, where we use equality predicates on both the datetime and string columns, we get the following index recommendation:

SELECT id FROM dbo.DateFirst
WHERE dt = '2018-05-11'
AND ln = N'sysallocunitsabort_state'
AND 1 > (SELECT 0); -- to make optimization non-trivial
/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateFirst] ([dt],[ln])-- order of table
*/

You might conclude that it just listed them that way because of the order of the predicates. But if you reverse the order, putting the ln check before dt , the exact same index is suggested:

SELECT id FROM dbo.DateFirst
WHERE ln = N'sysallocunitsabort_state'
AND dt = '2018-05-11'
AND 1 > (SELECT 0); -- to make optimization non-trivial/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateFirst] ([dt],[ln])-- order of table
*/

It can actually get a little more complex than that, because the key columns will be divided into two sets: those that satisfy equality predicates, and those that satisfy inequality predicates. If we change one of the predicates to be inequality, we now get the key columns listed in the opposite order (but really, they are two groups of columns listed in order of equality and then inequality):

SELECT id FROM dbo.DateFirst
WHERE dt <> '2018-05-11'
AND ln = N'sysallocunitsabort_state'
AND 1 > (SELECT 0);
/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateFirst] ([ln],[dt])-- NOT order of table
*/

For fabricated data like this, it may not really matter. The point is that a recommended index in a real world scenario might be much better suited to having a specific leading key column, and it may not reflect the order of the columns in the table.

Let’s create a new version of the table with those columns flipped, and insert the same data:

CREATE TABLE dbo.DateLast
(
id int IDENTITY(1,1) NOT NULL,
ln nvarchar(200) NOT NULL,
dt datetime NOT NULL,
CONSTRAINT pk_DateLast PRIMARY KEY(id)
);
INSERT dbo.DateLast(ln,dt) SELECT ln,dt FROM dbo.DateFirst;

Now, we’ll run similar queries, and see the recommendations we get:

SELECT id FROM dbo.DateLast
WHERE dt = '2018-05-11'
AND ln = N'sysallocunitsabort_state'
AND 1 > (SELECT 0);
/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateLast] ([ln],[dt])-- order of table
*/
SELECT id FROM dbo.DateLast
WHERE ln = N'sysallocunitsabort_state'
AND dt = '2018-05-11'
AND 1 > (SELECT 0);
/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateLast] ([ln],[dt])-- order of table
*/
SELECT id FROM dbo.DateLast
WHERE ln <> N'sysallocunitsabort_state'
AND dt = '2018-05-11'
AND 1 > (SELECT 0);
/*
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[DateFirst] ([dt],[ln])-- NOT order of table
*/

Again, when all predicates use equality against a column, they’re all treated equally, and so SQL Server falls back to sorting by ordinal position in the table itself. Only when we introduce inequality against a column does it treat that column differently. And it is easy to prove that if all columns use inequality, the order would again fall back to ordinal position within the table.

You would think that important factors like density and selectivity would come into play, but they don’t. If you were creating these indexes yourself, and you knew your data, you would look at the predicates and deduce that the leading key column should usually be the one that is most selective, since it eliminates the largest portion of the B-tree the quickest.

Credit goes to Brent Ozar and Bryan Rebok, who recently solved this mystery in this Stack Exchange question a question on Database Administrators Stack Exchange .

Summary

So, what is the fix? I don’t know. Heuristics are tough, and sometimes the correct index structure depends on future information that SQL Server doesn’t even know yet. All I can suggest is that you pay particular attention to the key column order when you are creating indexes that come from individual index recommendations and, more importantly, the Database Engine Tuning Advisor (DTA).

Next Steps

Read on for related tips and other resources:

Be Careful with Key Order in SQL Server Missing Index Recommendations

Trending Articles

《沈冰自述——我和周永康的故事》全本

Moog - Subsequent 25

出售: 林憶蓮•回來愛的身邊 (東芝1A1頭版)

筆記 - 使用 PowerShell 清除停用 AD 帳號與 OU

df-dferh-01 中国区 Android 安装 Google Play Store 后报错的解决办法

「一棒接一棒、棒棒強棒」108學年度家長會長交接典禮

吸烟与MBTI类型判断捷径 (豆瓣 INFJ的奇幻之旅小组)

acermark龍璿國際展出多款包裝設備

枋寮北勢寮隆山宮睽違12年再辦迎王祭典

日本女优有村千佳COS集锦：狂三&黑白岩&亚丝娜&绫波丽

有遇到过这个问题么。/jsb-videoplayer.js not found, possible missing file.

MAS v2.8 magicgenius 汉化版 - 11.11更新

出售: Monster Cable Interlink Reference 2

福建佛教人士望云和尚(林斌)的九仙禅寺被强行收走，望云妈妈被赶出寺庙

R 语言中的OpenBLAS*和英特尔® 数学核心函数库的性能比较

[转载]煞貢、直星、人專吉日\金神七煞歌

HAKERS哈克士戶外 12月8~14日廠拍

OBS Studio 23.2.1 免安裝中文版 - 免費網路實況廣播軟體實況主必備軟體取代Fraps

<請教>行駛中安卓機會重新開機

Udp2raw-tunnel 及其一键安装脚本