Columnstore Indexes part 88 (“Minimal Logging in SQL Server 2016”)


Continuation from the previous 87 parts, the whole series can be found at http://www.nikoport.com/columnstore/.

Getting data into the database is an essential process, no matter if you are working on an OLTP application, as a professional Data Warehousing developer, or as any other data jockey (or data wrangler).

Making this process run at maximum velocity is essential, especially if you are working with a large number of processes or large volumes of data. In the current age of global processes and exploration of the value of data, this skill is essential for any data professional. Knowing how to load data is one thing, but what about the database, what about SQL Server and the Columnstore Indexes? In the blog post Columnstore Indexes part 43 (“Transaction Log Basics”), covering SQL Server 2014, I have shown that Columnstore Indexes were more efficient than Rowstore ones when loading data in small transactions, but once we tried Minimal Logging the situation changed, with the traditional Rowstore being quite a bit more effective than the Columnstore Indexes.

In SQL Server 2016, Microsoft has implemented Minimal Logging for Columnstore Indexes, and I am excited to put it to the test and see if this is another barrier where Columnstore Indexes have successfully tackled the Rowstore indexes.

For the setup, I will use the following script, where the source table dbo.TestSource gets 1.500.000 rows, which will be used to seed the data into our two test tables: dbo.BigDataTestHeap (Heap) & dbo.BigDataTest (Clustered Columnstore):

create table dbo.TestSource(
	id int not null,
	name char(200),
	lastname char(200),
	logDate datetime );

-- Loading 1.500.000 Rows
set nocount on
declare @i as int;
set @i = 1;

begin tran
while @i <= 1500000
begin
	insert into dbo.TestSource
		values (@i, 'First Name', 'LastName', GetDate());

	set @i = @i + 1;
end;
commit;

-- Create a HEAP
create table dbo.BigDataTestHeap(
	id int not null,
	name char(200),
	lastname char(200),
	logDate datetime );

-- Create our test CCI table
create table dbo.BigDataTest(
	id int not null,
	name char(200),
	lastname char(200),
	logDate datetime );

-- Create Columnstore Index
create clustered columnstore index PK_BigDataTest
	on dbo.BigDataTest;

Now let’s do the very same test as before: loading 10 rows into an empty (truncated) Heap table dbo.BigDataTestHeap, while not using Minimal Logging.

truncate table BigDataTestHeap;
checkpoint;

-- Load 10 rows of data from the Source table
declare @i as int;
set @i = 1;

begin tran
while @i <= 10
begin
	insert into dbo.BigDataTestHeap --with (TABLOCK)
		select id, Name, LastName, logDate
			from dbo.TestSource
			where id = @i;

	set @i = @i + 1;
end;
commit;

-- you will see a corresponding number (10) of 512 bytes long records in the T-log
select top 1000 operation, context, [Log Record Fixed Length], [Log Record Length], AllocUnitId, AllocUnitName
	from fn_dblog(null, null)
	where AllocUnitName = 'dbo.BigDataTestHeap'
	order by [Log Record Length] Desc;

-- 5K+ of transaction log length
select sum([Log Record Length]) as LogSize
	from fn_dblog(NULL, NULL)
	where AllocUnitName = 'dbo.BigDataTestHeap';

As you can see in the picture below, we have exactly 10 entries with a log record length of 512 bytes each (10 × 512 = 5.120 bytes), which together with the allocation entries adds up to a total length of 5.576 bytes.


[Figure: transaction log records for the Heap table load]

The Rowstore table functionality has not changed since SQL Server 2014, but let’s check the Columnstore tables. For that purpose, let’s load the same 10 records and see if something has changed since SQL Server 2014:

truncate table dbo.BigDataTest;
checkpoint;

declare @i as int;
set @i = 1;

begin tran
while @i <= 10
begin
	insert into dbo.BigDataTest
		select id, Name, LastName, logDate
			from dbo.TestSource
			where id = @i;

	set @i = @i + 1;
end;
commit;

When starting to analyse the transaction log, you will immediately notice some very important changes that were implemented under the hood: the reference “full name of the table + name of the Clustered Columnstore Index” does not work any more. Microsoft has changed some details: they have added an indication of the destination (the Delta-Store), and so the allocation unit name returned by the fn_dblog function in our case is dbo.BigDataTest.PK_BigDataTest(Delta). Also, instead of the previous 140-byte entries, we now have 524 bytes used for each of the log records:

-- Check on the length
select top 1000 operation, context, [Log Record Fixed Length], [Log Record Length], AllocUnitId, AllocUnitName
	from fn_dblog(null, null)
	where AllocUnitName like 'dbo.BigDataTest.PK_BigDataTest(Delta)'
	order by [Log Record Length] Desc;

select sum([Log Record Length]) as LogSize
	from fn_dblog(NULL, NULL)
	where AllocUnitName = 'dbo.BigDataTest.PK_BigDataTest(Delta)';
[Figure: transaction log records for the Delta-Store load]

This is a very serious change under the hood for those migrating from SQL Server 2014 to SQL Server 2016, and it will have a serious impact if you keep loading data the old way and expect it to function exactly as before. Overall, we are talking about 5.696 bytes allocated in the transaction log for the Columnstore table (10 records × 524 bytes = 5.240 bytes, plus the allocation entries), instead of the 1.756 bytes that the same information occupies in SQL Server 2014.

I will make a serious assumption that these changes have a lot to do with the internal changes in SQL Server 2016 that allow multiple secondary nonclustered Rowstore indexes on tables with Clustered Columnstore Indexes.
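As a reminder of that capability, here is a minimal sketch of adding such a secondary index to our test table (the index name IX_BigDataTest_Id is illustrative and not part of the original setup):

-- SQL Server 2016 allows secondary nonclustered rowstore indexes
-- on a table with a Clustered Columnstore Index
create nonclustered index IX_BigDataTest_Id
	on dbo.BigDataTest (id);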

Another important thing to keep in mind is that in SQL Server 2016 the Delta-Stores are no longer page-compressed, meaning that the impact on the disk will be even bigger than before. To confirm the compression of our Delta-Store, let’s use the following query on the new DMV sys.internal_partitions:

select object_name(object_id) as TableName, internal_object_type, internal_object_type_desc,
		data_compression, data_compression_desc, *
	from sys.internal_partitions
	where object_name(object_id) = 'BigDataTest';
[Figure: sys.internal_partitions output for dbo.BigDataTest]

From the image above you can see that the Delta-Store is not compressed anymore, meaning that it will definitely occupy more space on the disk, while using fewer CPU cycles.
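To see how much space the uncompressed Delta-Store actually takes, one possible check is a sketch like the following against the sys.dm_db_column_store_row_group_physical_stats DMV (new in SQL Server 2016), which exposes the state and size of every Row Group:

-- OPEN row groups are Delta-Stores, COMPRESSED row groups are columnar segments
select object_name(object_id) as TableName, row_group_id, state_desc,
		total_rows, size_in_bytes
	from sys.dm_db_column_store_row_group_physical_stats
	where object_id = object_id('dbo.BigDataTest');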

Microsoft has optimised some of the operations for the data insertion (watch out for an upcoming blog post on some of them), and even though the total length of a Clustered Columnstore Index record insertion is still smaller than its Rowstore counterpart, the difference from SQL Server 2014 can be a serious surprise for the unprepared.

Minimal Logging in Columnstore Indexes 2016

Let’s move on to the Minimal Logging that was implemented in SQL Server 2016, but before that let’s establish the baseline with Rowstore indexes by loading 10.000 rows into our Heap test table:

truncate table BigDataTestHeap;
checkpoint;

insert into dbo.BigDataTestHeap with (TABLOCK)
	select top 10000 id, Name, lastname, logDate
		from dbo.TestSource
		order by id;

select top 1000 operation, context, [Log Record Fixed Length], [Log Record Length], AllocUnitId, AllocUnitName
	from fn_dblog(null, null)
	where AllocUnitName = 'dbo.BigDataTestHeap'
	order by [Log Record Length] Desc;

-- Measure the transaction log length
select sum([Log Record Length]) as LogSize
	from fn_dblog(NULL, NULL)
	where AllocUnitName = 'dbo.BigDataTestHeap';

Without any huge surprises, we still have 92 bytes long records in the transaction log.
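For the Columnstore side of the same Minimal Logging test, the load would look like the sketch below (the 150.000-row batch size is illustrative); keep in mind that in SQL Server 2016 a batch of 102.400 or more rows inserted with TABLOCK bypasses the Delta-Store and is compressed directly:

truncate table dbo.BigDataTest;
checkpoint;

-- Bulk load with TABLOCK; batches of 102.400+ rows go directly
-- into compressed Row Groups instead of the Delta-Store
insert into dbo.BigDataTest with (TABLOCK)
	select top 150000 id, Name, lastname, logDate
		from dbo.TestSource
		order by id;

-- Measure the transaction log length for the Columnstore table
select sum([Log Record Length]) as LogSize
	from fn_dblog(NULL, NULL)
	where AllocUnitName like 'dbo.BigDataTest%';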
