Quantcast
Channel: CodeSection,代码区,SQL Server(mssql)数据库 技术分享 - CodeSec
Viewing all articles
Browse latest Browse all 3160

Last Season’s Performance Tuning Techniques: Slides & Demos

$
0
0

Right now, Erik and I are presenting at the 24 Hours of PASS. We’re talking about Last Season’s Performance Tuning Techniques.

You can register and join us now .

Wanna play along with us as we show how your performance skills might be a little out of date? Here’s the demos and the slides:

Last Season's Performance Tuning Techniques from Brent Ozar

Fill Factor: Doing the Page Splits

You can do this one in any database, but you’ll want to do it on a server with very low load. If anybody else is doing any deletes/updates/inserts at all, it’s going to skew your numbers.

CREATE TABLE TheHeart (ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, Groove VARCHAR(100)); GO /* How many page splits has our server had? */ SELECT * FROM sys.dm_os_performance_counters WHERE counter_name LIKE '%Splits%'; GO /* Add just one row */ INSERT INTO TheHeart (Groove) VALUES ('The chills that you spill up my back keep me filled'); GO /* See any page splits? */ SELECT * FROM sys.dm_os_performance_counters WHERE counter_name LIKE '%Splits%'; GO DROP TABLE TheHeart; GO

Questions to think about:

What does a page split really mean? Is there such a thing as a good or a bad page split? How do you know which ones you’re having? Would setting fill factor have prevented that page split? Missing Indexes

This one requires the Stack Overflow demo database . If you don’t already have a copy of that, don’t try to download it live during the session it’s too big (~15GB torrent, then expands to a ~100GB SQL Server database.)

Get the estimated execution plan for this:

SELECT c.CreationDate, c.Score, c.Text, p.Title, p.PostTypeId FROM dbo.Users me INNER JOIN dbo.Comments c ON me.Id = c.UserId INNER JOIN dbo.Posts p ON c.PostId = p.ParentId WHERE me.DisplayName = 'Brent Ozar'; GO

And then ask yourself:

What index am I told to create? Does that index make sense? Is there anything SQL Server isn’t telling me?

Now try the estimated plan for this:

SELECT Id FROM dbo.Users WHERE DisplayName = 'Brent Ozar' ORDER BY Age; GO

And ask yourself those same questions.

BEGIN TRAN ERIK CTEs

Sometimes a CTE won’t change anything at all. This is the case with simple predicates.

/*When they don't matter*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4 ) -- Simple predicate inside SELECT COUNT(*) AS RECORDS FROM freedom AS f; WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p ) SELECT COUNT(*) AS RECORDS FROM freedom AS f WHERE f.Id = 4; --Simple predicate outside

CTEs don’t materialize results. What do you think this is, Oracle?

If you join a CTE to itself, you’ll run the CTE query again.

/*One joins*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id; /*Two joins*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id JOIN freedom AS f3 ON f3.Id = f.Id; /*Ah hell, ten joins!*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f JOIN freedom AS f2 ON f2.Id = f.Id JOIN freedom AS f3 ON f3.Id = f.Id JOIN freedom AS f4 ON f4.Id = f.Id JOIN freedom AS f5 ON f5.Id = f.Id JOIN freedom AS f6 ON f6.Id = f.Id JOIN freedom AS f7 ON f7.Id = f.Id JOIN freedom AS f8 ON f8.Id = f.Id JOIN freedom AS f9 ON f9.Id = f.Id JOIN freedom AS f10 ON f10.Id = f.Id;

Thankfully, nested CTEs don’t exhibit the same problem.

/*What about nested CTEs?*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.ParentId = 0 AND p.CreationDate >= '20160306' AND p.Score >= 2 ), more_freedom AS ( SELECT * FROM freedom ) SELECT COUNT(*) AS RECORDS FROM more_freedom AS f;

CTEs and derived tables will behave similarly as far as performance and query plans go.

One difference is that you can’t reference a derived table more than once, where you can do that with CTEs.

/*CTEs vs Derived Tables?*/ WITH freedom AS ( SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4 ) SELECT COUNT(*) AS RECORDS FROM freedom AS f; SELECT COUNT(*) AS Records FROM (SELECT p.Id FROM dbo.Posts AS p WHERE p.Id = 4) AS x

CTEs are cool though. You can filter on things on the outside that you can’t filter on the inside.

/*This will failbot*/ SELECT p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 AND ScoreRank <= 3 /*This will work excellently*/ WITH t1 AS ( SELECT p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 ) SELECT TOP 100 u.DisplayName, t1.Title, t1.Score FROM t1 JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId WHERE t1.ScoreRank <= 3 ORDER BY t1.OwnerUserId; /*Be careful where you put that TOP*/ WITH t1 AS ( SELECT TOP 100 p.OwnerUserId, p.Score, p.Title, DENSE_RANK() OVER ( PARTITION BY p.OwnerUserId ORDER BY p.Score DESC ) AS ScoreRank FROM dbo.Posts AS p WHERE p.PostTypeId = 1 AND p.OwnerUserId > 0 ORDER BY p.OwnerUserId ) SELECT u.DisplayName, t1.Title, t1.Score FROM t1 JOIN dbo.Users AS u ON u.Id = t1.OwnerUserId WHERE t1.ScoreRank <= 3; Functions

This query runs without a function and finishes pretty quickly.

/*Normal string aggregation*/ /*This all ends in 2017 with STRING_AGG*/ SELECT b.UserId, STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = b.UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges FROM dbo.Badges AS b WHERE b.Date >= '20160301';

If we turn that string aggregation expression into a scalar valued function…

/*Let's not repeat ourselves*/ /*Lets MAKE A FUNCTION*/ CREATE OR ALTER FUNCTION dbo.Fake_String_Agg (@UserId INT) RETURNS NVARCHAR(4000) WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING AS BEGIN DECLARE @WickedBadIdeaDude NVARCHAR(4000) SELECT @WickedBadIdeaDude = STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = @UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') RETURN @WickedBadIdeaDude END GO

Now we can crap up all our queries effortlessly.

/*Let's see how things go with our new function*/ SELECT b.UserId, dbo.Fake_String_Agg(b.UserId) AS Badges FROM dbo.Badges AS b WHERE b.Date >= '20160301'; /*THIS IS SO COOL WE CAN USE IT WITH OTHER TABLES*/ SELECT u.DisplayName, dbo.Fake_String_Agg(u.Id) AS Badges FROM dbo.Users AS u WHERE u.LastAccessDate >= '20160306';

Checking on query performance with sp_BlitzQueryStore…

DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld GO EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3527 GO DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg' , @StartDate = @ThisIsTheModernWorld GO

Computed columns with Scalar Valued Functions in them will be similarly crappy.

/*Let's use a computed column instead*/ CREATE TABLE dbo.LittleBadges ( Id INT NOT NULL PRIMARY KEY CLUSTERED, UserId INT NOT NULL, Badge_Agg AS dbo.Fake_String_Agg(UserId) ); INSERT dbo.LittleBadges WITH (TABLOCK) (Id, UserId ) SELECT b.Id, b.UserId FROM dbo.Badges AS b WHERE b.Date >= '20160301';

Crappiness doesn’t depend on whether or not we select the computed column. It’s there no matter what.

SELECT TOP 10 * FROM dbo.LittleBadges AS lb SELECT TOP 10 lb.Id, lb.UserId --NOT SELECTING THE COMPUTED COLUMN FROM dbo.LittleBadges AS lb

Let’s see what happens when we add in a check constraint based on a UDF.

What to look for in XE: executions of function after inserting rows. Executions after selecting data.

/*Can we make this worse?*/ /*You betcha!*/ TRUNCATE TABLE dbo.LittleBadges ALTER TABLE dbo.LittleBadges ADD CONSTRAINT [HAHAHAHAHAHAHAHA] CHECK (dbo.Fake_String_Agg(UserId) <> '') /*Let's Extend Ourselves*/ DROP EVENT SESSION [Crud] ON SERVER EXEC xp_cmdshell 'DEL /F /Q C:\Temp\crud*.xel' DECLARE @SPID VARCHAR(3) = @@SPID DECLARE @event_sql NVARCHAR(MAX) = N' CREATE EVENT SESSION [Crud] ON SERVER ADD EVENT sqlserver.rpc_completed(SET collect_data_stream=(1) ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))), ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name=(1) ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))), ADD EVENT sqlserver.sql_batch_completed( ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.server_principal_name,sqlserver.session_id,sqlserver.sql_text) WHERE ([package0].[greater_than_uint64]([sqlserver].[database_id],(4)) AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) AND [sqlserver].[session_id]=('+@SPID+'))) ADD TARGET package0.event_file(SET filename=N''c:\temp\crud'') WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=2 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF) ALTER EVENT SESSION [Crud] ON SERVER STATE = START ' PRINT @event_sql EXEC sys.sp_executesql @event_sql /*Insert 100 rows*/ INSERT dbo.LittleBadges WITH (TABLOCK) (Id, UserId ) SELECT TOP (100) b.Id, b.UserId FROM dbo.Badges AS b WHERE b.Date >= '20160301'; /*TO THE SHREDDER*/ DROP TABLE dbo.LittleBadges

Using an inline TVF makes things faster for the query, but we can’t use it in a computed column. Other downsides: inline TVFs aren’t tracked in DMVs (2016 has a function_stats DMV that doesn’t catch them).

This is the same in Query Store.

/*Let's make that inline, y'all*/ CREATE OR ALTER FUNCTION dbo.Fake_String_Agg_Inline (@UserId INT) RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT STUFF((SELECT N', ' + b2.Name FROM dbo.Badges AS b2 WHERE b2.UserId = @UserId GROUP BY b2.Name FOR XML PATH(N''), TYPE ).value(N'.[1]', N'NVARCHAR(4000)'), 1, 2, N'') AS Badges GO /*Testing out this inline voodoo*/ SELECT b.UserId, f.Badges FROM dbo.Badges AS b CROSS APPLY dbo.Fake_String_Agg_Inline(b.UserId) AS f WHERE b.Date >= '20160301'; SELECT u.DisplayName, f.Badges FROM dbo.Users AS u CROSS APPLY dbo.Fake_String_Agg_Inline(u.Id) AS f WHERE u.LastAccessDate >= '20160306'; /*TURN OFF QUERY PLANS OR ALL WILL BURN*/ DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StartDate = @ThisIsTheModernWorld GO EXEC sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @PlanIdFilter = 3526 GO DECLARE @ThisIsTheModernWorld DATETIME = GETDATE() EXEC master.dbo.sp_BlitzQueryStore @DatabaseName = 'StackOverflow', @StoredProcName = 'Fake_String_Agg_Inline' , @StartDate = @ThisIsTheModernWorld GO SELECT OBJECT_NAME(qsq.object_id), * FROM sys.query_store_query AS qsq WHERE OBJECT_NAME(qsq.object_id) IS NOT NULL Temp Tables or Table Variables

Table variable modifications are forced to run serially.

/*No parallel inserts*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 GO DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 OPTION(USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')); GO

They’re also not guaranteed to be in memory. Backed by temp objects which may spill to disk.

SET STATISTICS TIME, IO ON DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v GO

Bad estimates may prevent parallel plans from happening when they should have.

/*Crappy estimates*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v GO DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO

Does recompiling always make things better?

/*Does recompile really even help?*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id OPTION(RECOMPILE) GO

Temp tables generally work better!

/*Temp table superiority*/ DROP TABLE IF EXISTS #Votes CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL) INSERT #Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM #Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO

Do indexes change anything?

/*BUT INDEXES*/ DECLARE @Votes TABLE (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id)) INSERT @Votes ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM @Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO DROP TABLE IF EXISTS #Votes CREATE TABLE #Votes (Id INT NOT NULL, VoteTypeId INT NOT NULL, INDEX ix_Id CLUSTERED (Id)) INSERT #Votes WITH (TABLOCK) ( Id, VoteTypeId ) SELECT v.PostId, v.VoteTypeId FROM dbo.Votes AS v WHERE v.VoteTypeId = 16 SELECT COUNT(*) AS [Vote Type] FROM #Votes AS v JOIN dbo.Posts AS p ON p.Id = v.Id GO ROLLBACK ERIK Are your performance skills out of fashion?

If you learned things during the webcast, and you’re starting to question your taste, have no fear: we’re here to help. We’re doing an all-day pre-con class before the PASS Summit called Expert Performance Tuning for SQL Server 2016 & 2017 . We specifically designed it to update your performance skills for today and a lot of the techniques are even useful on currently patched versions of 2012 & 2014, too. Learn more and register for the pre-con .


Viewing all articles
Browse latest Browse all 3160

Trending Articles