Quantcast
Channel: CodeSection,代码区,SQL Server(mssql)数据库 技术分享 - CodeSec
Viewing all articles
Browse latest Browse all 3160

How to Proactively Gather SQL Server Indexes Fragmentation Information

$
0
0

How to Proactively Gather SQL Server Indexes Fragmentation Information

Introduction to SQL Server Indexes

Microsoft SQL Server is considered as one of the relational database management systems ( RDBMS ), in which the data is logically organized into rows and columnsthat are stored in data containers called tables. Physically, the tables are stored as 8 KB pages that can be organized into Heap or B-Tree Clustered tables. In the Heap table, there is no sorting order that controls the order of the data inside the data pages and the sequence of pages within that table, as there is no Clustered index defined on that table to enforce the sorting mechanism. If a Clustered index is defined on one column of the group of table columns, the data will be sorted inside the data pages based on the values of the Clustered index key columns, and the pages will be linked together based on these index key values. This sorted table is called a Clustered table .

In SQL Server, the index is considered as an important and effective key in the performance tuning process. The purpose of creating an index is to speed up access to the base table and retrieve the requested data without having to scan all the table rows to return the requested data. You can think of the database index as a book index that helps you quickly find the words in the book, without having to read the entire book to find that word. For example, suppose you need to retrieve information about a specific customer using a customer ID. If there is no index defined for the Customer ID column in this table, the SQL Server Engine checks all the table rows, one by one, in order to retrieve the customer with the provided ID. If an index is defined for the Customer ID column in this table, the SQL Server Engine will look for the requested Customer ID values in the sorted index, rather than in the base table, to retrieve information about the customer, reducing the number of scanned rows to retrieve the data.

In SQL Server, the index is structured logically as 8K pages, or index nodes, in the form of a B-tree. The B-Tree structure contains three levels: a Root Level that includes one index page at the top of the B-tree, a Leaf Level that is located at the bottom of the B-tree and contains data pages, and an Intermediate Level that includes all the nodes located between the root and the leaf levels, with index key values and pointers to the following pages. This B-tree shape provides a quick way to navigate the data pages from left to right and from the top to the bottom, based on the index key.

In SQL Server, there are two main types of indexes, a Clustered index, in which the actual data is stored at the leaf level pages of the index, with the ability to create only one clustered index for each table, as the data inside the data pages and the order of the pages will be sorted based on the clustered index key. If you define a primary key constraint in your table, a clustered index will be created automatically if no clustered index was previously defined for that table. The second type of indexes is a Non-clustered index that includes a sorted copy of the index key columns and a pointer to the rest of the columns in the base table or the clustered index, with the ability to create up to 999 non-clustered indexes for each table.

SQL Server provides us with other special types of indexes, such as a Unique index that is created automatically when a unique constraint is defined to enforce the uniqueness of specific column values, a Composite index in which more than one key column will participate in the index key, a Covering index in which all columns requested by a specific query will participate in the index key, a Filtered index that is an optimized non-clustered index with a filter predicate for indexing only a small portion of the table rows, a Spatial index that is created on the columns that store spatial data, an XML index that is created on XML binary large objects (BLOBs) in XMLdata type columns, a Columnstore index in which data is organized in columnar data format, a Full-text index that is created by the SQL Server Full-Text Engine, and a Hash index that is used in Memory-Optimized tables.

As I used to call the SQL Server index, this is a double-edged sword , where the SQL Server Query Optimizer can benefit from the index designed well to improve the performance of your applications by speeding up the data retrieval process. In contrast, an index that is designed in a bad way will not be chosen by the SQL Server Query Optimizer and it will degrade the performance of your applications by slowing down the data modification operations and consume you storage without taking advantage of it in the data retrieval processes. Therefore, it is better to first follow the index creation best practices and guidelines, check the effect if creating an on the development environment, and find a compromise between the speed of data retrieval operations and the overhead of adding that index on the data modification operations and the space requirements of that index, before applying it to the production environment.

Before creating an index, you need to study the different aspects that affect the creation and use of the index. This includes the type of the database workload, Online Transaction Processing (OLTP) or Online Analytical Processing (OLAP), the size of the table , the characteristics of the table columns , the sorting order of the columns in the query, the type of the index that corresponds to the query and the storage properties such as the FILLFACTOR and PAD_INDEX options that control the percentage of space on each leaf-level and the intermediate level pages to be filled with data.

SQL Server Index Fragmentation

Your work as a DBA is not limited to creating the right index. Once the index is created, you should monitor the index usage and statistics, for example, you need to know if this index is used poorly or not used at all. Thus, you can provide the correct solution to maintain these indexes or replace them with more efficient ones. In this way, you will maintain the highest applicable performance for your system. You may ask yourself: Why does the SQL Server Query Optimizer no longer use my index, although it did that before?

The answer is mainly related to the continuous data and schema changes that are performed on the base table that should be reflected in the indexes. Over time, and with all these changes, index pages become unsorted, causing the index to become fragmented. Another reason for the fragmentation is an attempt to insert a new value or update the current value, and the new value does not fit in the currently available free space. In this case, the page will be split into two pages, where the new page will be created physically after the last page. And you can imagine reading from a fragmented index and the number of pages that should be scanned, and, of course, the number of I/O operations performed to retrieve several records due to the distance between these pages. And because of this extra cost of using this fragmented index, the SQL Server Query Optimizer will ignore this index.

Different Ways to Get Index Fragmentation SQL Server provides us with different ways to get the percentage of index fragmentation. The first way is to check the percentage of index fragmentation in the Index Properties win

Viewing all articles
Browse latest Browse all 3160

Trending Articles