Right now, Erik and I are presenting at the 24 Hours of PASS. We’re talking about Last Season’s Performance Tuning Techniques.
You can register and join us now.
Wanna play along with us as we show how your performance skills might be a little out of date? Here are the demos and the slides:
Last Season's Performance Tuning Techniques from Brent Ozar
Fill Factor: Doing the Page Splits

You can do this one in any database, but you’ll want to do it on a server with very low load. If anybody else is doing any deletes/updates/inserts at all, it’s going to skew your numbers.
CREATE TABLE TheHeart
    (ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
     Groove VARCHAR(100));
GO

/* How many page splits has our server had? */
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE '%Splits%';
GO

/* Add just one row */
INSERT INTO TheHeart (Groove)
VALUES ('The chills that you spill up my back keep me filled');
GO

/* See any page splits? */
SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE '%Splits%';
GO

DROP TABLE TheHeart;
GO

Questions to think about:
What does a page split really mean?
Is there such a thing as a good or a bad page split?
How do you know which ones you’re having?
Would setting fill factor have prevented that page split?

Missing Indexes

This one requires the Stack Overflow demo database. If you don’t already have a copy of that, don’t try to download it live during the session; it’s too big (~15GB torrent, which then expands to a ~100GB SQL Server database).
Get the estimated execution plan for this:
SELECT c.CreationDate, c.Score, c.Text, p.Title, p.PostTypeId
FROM dbo.Users me
INNER JOIN dbo.Comments c ON me.Id = c.UserId
INNER JOIN dbo.Posts p ON c.PostId = p.ParentId
WHERE me.DisplayName = 'Brent Ozar';
GO

And then ask yourself:
What index am I told to create?
Does that index make sense?
Is there anything SQL Server isn’t telling me?

Now try the estimated plan for this:
SELECT Id
FROM dbo.Users
WHERE DisplayName = 'Brent Ozar'
ORDER BY Age;
GO

And ask yourself those same questions.
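If you want to compare the green hint in the plan with what the server has recorded, the missing index DMVs are one place to look. This is only a rough sketch, not part of the session demos: it assumes you run it in the Stack Overflow database, and the TOP (20) and the sort expression are arbitrary choices made for illustration. The DMVs only hold requests logged since the last restart, so an empty result does not prove anything.

```sql
/* Sketch: list missing index requests logged for the current database. */
/* The ordering is an arbitrary illustration, not a recommendation.     */
SELECT TOP (20)
       d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```

Notice what the output leaves out: there is no column order advice beyond equality versus inequality, and nothing about the ORDER BY in the second query.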
BEGIN TRAN ERIK

CTEs

Sometimes a CTE won’t change anything at all. This is the case with simple predicates.
/*When they don't matter*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.Id = 4 ) -- Simple predicate inside
SELECT COUNT(*) AS RECORDS
FROM freedom AS f;

WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p )
SELECT COUNT(*) AS RECORDS
FROM freedom AS f
WHERE f.Id = 4; --Simple predicate outside

CTEs don’t materialize results. What do you think this is, Oracle?
If you join a CTE to itself, you’ll run the CTE query again.
/*One joins*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.ParentId = 0
    AND p.CreationDate >= '20160306'
    AND p.Score >= 2 )
SELECT COUNT(*) AS RECORDS
FROM freedom AS f
JOIN freedom AS f2 ON f2.Id = f.Id;

/*Two joins*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.ParentId = 0
    AND p.CreationDate >= '20160306'
    AND p.Score >= 2 )
SELECT COUNT(*) AS RECORDS
FROM freedom AS f
JOIN freedom AS f2 ON f2.Id = f.Id
JOIN freedom AS f3 ON f3.Id = f.Id;

/*Ah hell, ten joins!*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.ParentId = 0
    AND p.CreationDate >= '20160306'
    AND p.Score >= 2 )
SELECT COUNT(*) AS RECORDS
FROM freedom AS f
JOIN freedom AS f2 ON f2.Id = f.Id
JOIN freedom AS f3 ON f3.Id = f.Id
JOIN freedom AS f4 ON f4.Id = f.Id
JOIN freedom AS f5 ON f5.Id = f.Id
JOIN freedom AS f6 ON f6.Id = f.Id
JOIN freedom AS f7 ON f7.Id = f.Id
JOIN freedom AS f8 ON f8.Id = f.Id
JOIN freedom AS f9 ON f9.Id = f.Id
JOIN freedom AS f10 ON f10.Id = f.Id;

Thankfully, nested CTEs don’t exhibit the same problem.
/*What about nested CTEs?*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.ParentId = 0
    AND p.CreationDate >= '20160306'
    AND p.Score >= 2 ),
more_freedom AS (
    SELECT *
    FROM freedom )
SELECT COUNT(*) AS RECORDS
FROM more_freedom AS f;

CTEs and derived tables will behave similarly as far as performance and query plans go.
One difference is that you can’t reference a derived table more than once, whereas you can with a CTE.
/*CTEs vs Derived Tables?*/
WITH freedom AS (
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.Id = 4 )
SELECT COUNT(*) AS RECORDS
FROM freedom AS f;

SELECT COUNT(*) AS Records
FROM (SELECT p.Id
      FROM dbo.Posts AS p
      WHERE p.Id = 4) AS x;

CTEs are cool though. You can filter on things on the outside that you can’t filter on the inside.
/*This will failbot*/
SELECT p.OwnerUserId, p.Score, p.Title,
       DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
AND p.OwnerUserId > 0
AND ScoreRank <= 3;

/*This will work excellently*/
WITH t1 AS (
    SELECT p.OwnerUserId, p.Score, p.Title,
           DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 1
    AND p.OwnerUserId > 0 )
SELECT TOP 100 u.DisplayName, t1.Title, t1.Score
FROM t1
JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId
WHERE t1.ScoreRank <= 3
ORDER BY t1.OwnerUserId;

/*Be careful where you put that TOP*/
WITH t1 AS (
    SELECT TOP 100 p.OwnerUserId, p.Score, p.Title,
           DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank
    FROM dbo.Posts AS p
    WHERE p.PostTypeId = 1
    AND p.OwnerUserId > 0
    ORDER BY p.OwnerUserId )
SELECT u.DisplayName, t1.Title, t1.Score
FROM t1
JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId
WHERE t1.ScoreRank <= 3;

Functions

This query runs without a function and finishes pretty quickly.
/*Normal string aggregation*/
/*This all ends in 2017 with STRING_AGG*/
SELECT b.UserId,
       STUFF((SELECT N', ' + b2.Name
              FROM dbo.Badges AS b2
              WHERE b2.UserId = b.UserId
              GROUP BY b2.Name
              FOR XML PATH(N''), TYPE
             ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges
FROM dbo.Badges AS b
WHERE b.Date >= '20160301';

If we turn that string aggregation expression into a scalar valued function…
/*Let's not repeat ourselves*/
/*Let's MAKE A FUNCTION*/
CREATE OR ALTER FUNCTION dbo.Fake_String_Agg (@UserId INT)
RETURNS NVARCHAR(4000)
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
    DECLARE @WickedBadIdeaDude NVARCHAR(4000)

    SELECT @WickedBadIdeaDude =
           STUFF((SELECT N', ' + b2.Name
                  FROM dbo.Badges AS b2
                  WHERE b2.UserId = @UserId
                  GROUP BY b2.Name
                  FOR XML PATH(N''), TYPE
                 ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'')

    RETURN @WickedBadIdeaDude
END
GO

Now we can crap up all our queries effortlessly.
/*Let's see how things go with our new function*/
SELECT b.UserId,
       dbo.Fake_String_Agg(b.UserId) AS Badges
FROM dbo.Badges AS b
WHERE b.Date >= '20160301';

/*THIS IS SO COOL WE CAN USE IT WITH OTHER TABLES*/
SELECT u.DisplayName,
       dbo.Fake_String_Agg(u.Id) AS Badges
FROM dbo.Users AS u
WHERE u.LastAccessDate >= '20160306';

Checking on query performance with sp_BlitzQueryStore…
DECLARE @ThisIsTheModernWorld DATETIME = GETDATE()
EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld
GO

EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3527
GO

DECLARE @ThisIsTheModernWorld DATETIME = GETDATE()
EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg', @StartDate = @ThisIsTheModernWorld
GO

Computed columns with Scalar Valued Functions in them will be similarly crappy.
/*Let's use a computed column instead*/
CREATE TABLE dbo.LittleBadges
(
    Id INT NOT NULL PRIMARY KEY CLUSTERED,
    UserId INT NOT NULL,
    Badge_Agg AS dbo.Fake_String_Agg(UserId)
);

INSERT dbo.LittleBadges WITH (TABLOCK)
    (Id, UserId )
SELECT b.Id, b.UserId
FROM dbo.Badges AS b
WHERE b.Date >= '20160301';

Crappiness doesn’t depend on whether or not we select the computed column. It’s there no matter what.
SELECT TOP 10 *
FROM dbo.LittleBadges AS lb

SELECT TOP 10 lb.Id, lb.UserId --NOT SELECTING THE COMPUTED COLUMN
FROM dbo.LittleBadges AS lb

Let’s see what happens when we add in a check constraint based on a UDF.
What to look for in XE: executions of function after inserting rows. Executions after selecting data.
/*Can we make this worse?*/
/*You betcha!*/
TRUNCATE TABLE dbo.LittleBadges

ALTER TABLE dbo.LittleBadges
ADD CONSTRAINT [HAHAHAHAHAHAHAHA] CHECK (dbo.Fake_String_Agg(UserId) <> '')

/*Let's Extend Ourselves*/
DROP EVENT SESSION [Crud] ON SERVER
EXEC xp_cmdshell 'DEL /F /Q C:\Temp\crud*.xel'

/*VARCHAR(5) rather than VARCHAR(3): session ids above 999 would not fit in three characters*/
DECLARE @SPID VARCHAR(5) = @@SPID
DECLARE @event_sql NVARCHAR(MAX) = N'
CREATE EVENT SESSION [Crud] ON SERVER
ADD EVENT sqlserver.rpc_completed(SET collect_data_stream=(1)
    ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
    AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
    AND [sqlserver].[session_id]=('+@SPID+'))),
ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name=(1)
    ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
    AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
    AND [sqlserver].[session_id]=('+@SPID+'))),
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text)
    WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4))
    AND [package0].[equal_boolean]([sqlserver].[is_system],(0))
    AND [sqlserver].[session_id]=('+@SPID+')))
ADD TARGET package0.event_file(SET filename=N''c:\temp\crud'')
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=2 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)

ALTER EVENT SESSION [Crud] ON SERVER STATE = START
'
PRINT @event_sql
EXEC sys.sp_executesql @event_sql

/*Insert 100 rows*/
INSERT dbo.LittleBadges WITH (TABLOCK)
    (Id, UserId )
SELECT TOP (100) b.Id, b.UserId
FROM dbo.Badges AS b
WHERE b.Date >= '20160301';

/*TO THE SHREDDER*/
DROP TABLE dbo.LittleBadges

Using an inline TVF makes things faster for the query, but we can’t use it in a computed column. Other downsides: inline TVFs aren’t tracked in DMVs (2016 has a function_stats DMV that doesn’t catch them).
This is the same in Query Store.
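To see what that DMV does track, here is a rough sketch that reads sys.dm_exec_function_stats (SQL Server 2016 and later). It is not part of the session scripts: the database name is taken from the sp_BlitzQueryStore calls in these demos, and the column list and ordering are just one reasonable choice. Run it after the scalar function demos and look at which names show up.

```sql
/* Sketch: execution stats for functions the DMV does track. */
/* Assumes the demo database is named StackOverflow.          */
SELECT OBJECT_NAME(fs.object_id, fs.database_id) AS function_name,
       fs.type_desc,
       fs.execution_count,
       fs.total_worker_time,
       fs.total_elapsed_time,
       fs.total_logical_reads
FROM sys.dm_exec_function_stats AS fs
WHERE fs.database_id = DB_ID(N'StackOverflow')
ORDER BY fs.execution_count DESC;
```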
/*Let's make that inline, y'all*/
CREATE OR ALTER FUNCTION dbo.Fake_String_Agg_Inline (@UserId INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT STUFF((SELECT N', ' + b2.Name
                  FROM dbo.Badges AS b2
                  WHERE b2.UserId = @UserId
                  GROUP BY b2.Name
                  FOR XML PATH(N''), TYPE
                 ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges
GO

/*Testing out this inline voodoo*/
SELECT b.UserId, f.Badges
FROM dbo.Badges AS b
CROSS APPLY dbo.Fake_String_Agg_Inline(b.UserId) AS f
WHERE b.Date >= '20160301';

SELECT u.DisplayName, f.Badges
FROM dbo.Users AS u
CROSS APPLY dbo.Fake_String_Agg_Inline(u.Id) AS f
WHERE u.LastAccessDate >= '20160306';

/*TURN OFF QUERY PLANS OR ALL WILL BURN*/
DECLARE @ThisIsTheModernWorld DATETIME = GETDATE()
EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld
GO

EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3526
GO

DECLARE @ThisIsTheModernWorld DATETIME = GETDATE()
EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg_Inline', @StartDate = @ThisIsTheModernWorld
GO

SELECT OBJECT_NAME(qsq.object_id), *
FROM sys.query_store_query AS qsq
WHERE OBJECT_NAME(qsq.object_id) IS NOT NULL

Temp Tables or Table Variables

Table variable modifications are forced to run serially.
/*No parallel inserts*/
DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16
GO

DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16
OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
GO

They’re also not guaranteed to be in memory. They’re backed by temp objects, which may spill to disk.
SET STATISTICS TIME, IO ON

DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM @Votes AS v
GO

Bad estimates may prevent parallel plans from happening when they should have.
/*Crappy estimates*/
DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM @Votes AS v
GO

DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM @Votes AS v
JOIN dbo.Posts AS p ON p.Id = v.Id
GO

Does recompiling always make things better?
/*Does recompile really even help?*/
DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM @Votes AS v
JOIN dbo.Posts AS p ON p.Id = v.Id
OPTION(RECOMPILE)
GO

Temp tables generally work better!
/*Temp table superiority*/
DROP TABLE IF EXISTS #Votes
CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL)

INSERT #Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM #Votes AS v
JOIN dbo.Posts AS p ON p.Id = v.Id
GO

Do indexes change anything?
/*BUT INDEXES*/
DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id))

INSERT @Votes ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM @Votes AS v
JOIN dbo.Posts AS p ON p.Id = v.Id
GO

DROP TABLE IF EXISTS #Votes
CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id))

INSERT #Votes WITH (TABLOCK) ( Id, VoteTypeId )
SELECT v.PostId, v.VoteTypeId
FROM dbo.Votes AS v
WHERE v.VoteTypeId = 16

SELECT COUNT(*) AS [Vote Type]
FROM #Votes AS v
JOIN dbo.Posts AS p ON p.Id = v.Id
GO

ROLLBACK ERIK

Are your performance skills out of fashion?

If you learned things during the webcast, and you’re starting to question your taste, have no fear: we’re here to help. We’re doing an all-day pre-con class before the PASS Summit called Expert Performance Tuning for SQL Server 2016 & 2017. We specifically designed it to update your performance skills for today, and a lot of the techniques are even useful on currently patched versions of 2012 & 2014, too. Learn more and register for the pre-con.