Quantcast
Channel: CodeSection,代码区,SQL Server(mssql)数据库 技术分享 - CodeSec
Viewing all articles
Browse latest Browse all 3160

Cleaning up Old Code and Unused objects in SQL Server

$
0
0

In this article, we are going to talk about discovering and archiving SQL references to invalid procedures.

We have a significant amount of database code within our database procedures we no longer use, as we’ve changed objects referenced in procedures. As we’ve moved to source control for all of our databases, we’re seeing the costs of keeping old code in our environment that we no longer use. The trouble is that we don’t know how to identify what we use since we have many teams who develop objects in our databases and use them for the features they create. What are some techniques for cleaning old database code on the database level as well as preventing this problem in the future?

Overview

Outside of the storage costs and repository costs where we must sync with many objects, old database code in the form of stored procedures that we no longer use raises security risks, as procedures provide access to data through CRUD operations. If we don’t use database code, we should practice eliminating it immediately. It is a major security risk to have stored procedures that are no longer used. As for functions, this depends on the function in question, but I recommend the practice of only keeping what’s required. We can apply the same lessons to views, users, roles, etc as well the more database objects we use, the more we need to restrict or eliminate as we change our environment.

For the best practice to reduce effort, if we replace a database object with another object (such as replacing an old procedure with a new procedure), we should remove the old object immediately. We can archive the old object for SQL reference in an archive source control location, but we should never keep the object live. This best practice will prevent a database from possessing objects that are no longer used but could possibly reveal inappropriate information to a malicious actor.

Demonstration with A Stored Procedure

In the below code, we create a table that we’ll use for an experiment. Once we create the table, we create a procedure that inserts a record from a parameter the procedure accepts into our newly created table. We then test the procedure a few times, query the table, then drop the table. Now, when we execute the stored procedure, we get an error the procedure fails. As we see in the below code, we can remove a table that’s referenced in a procedure.

--- Window 1 CREATE TABLE DataEntry( TextId TINYINT IDENTITY(1,1), TextEntry VARCHAR(9) ) --- Window 2 CREATE PROCEDURE addDataEntry @string VARCHAR(9) AS BEGIN INSERT INTO Dataentry (TextEntry) VALUES (@string) END EXEC addDataEntry 'ref-1' EXEC addDataEntry 'ref-2' EXEC addDataEntry 'ref-3' EXEC addDataEntry 'ref-4' EXEC addDataEntry 'ref-5' SELECT * FROM DataEntry DROP TABLE DataEntry EXEC addDataEntry 'ref-6'
Cleaning up Old Code and Unused objects in SQL Server

Our query shows the results of the data we added.


Cleaning up Old Code and Unused objects in SQL Server

We can remove the table, but now when we call the procedure, the procedure fails since the table doesn’t exist.

This may be one of the complexities that we’re solving for: an invalid SQL reference where an object refers to another object that no longer exists. This scenario could also be a procedure calling another procedure with the second procedure not existing.

Identifying SQL References Before a Drop

One clear solution here is to identify all references prior to dropping the referenced object and the same applies if we rename the object. Using the above example of removing a table, we could look for all the references to the table we’re dropping in our procedures, views, etc. This code becomes invalid if they reference the removed object and removing them with it reduces possible errors. There may be cases where we want to update the reference if we’ve renamed the object or created a new object to replace the old one. The flow of removing an object should be to identify references first and include these references as objects to be removed if we need to remove an object. The below diagram shows this.


Cleaning up Old Code and Unused objects in SQL Server

Using the above image as an example, if we remove the referenced table, notice the impact it would cause on all the other references to the table they would be invalid.

The same logic applies to other changes from renaming objects to changing data types to changing primary keys (which should be extremely rare, but unfortunately happen). These changes can have significant effects on code that reference these. We must account for all dependent code which will be affected prior to making the change.

Assuming that we use source control for our database from database objects to application code we can iterate over our files using PowerShell and obtain SQL references. We’ll want to use searches that reflect how we name objects; notice the nuances in the below select statements for one table all of which return the same data set, but query the table differently:

SELECT * FROM etlMidLoader SELECT * FROM dbo.etlMidLoader SELECT * FROM [dbo].etlMidLoader SELECT * FROM [dbo].[etlMidLoader] SELECT etlMidLoader.* FROM [dbo].[etlMidLoader] SELECT [etlMidLoader].* FROM [dbo].[etlMidLoader]

Developers sometimes name tables the same name as column names, so searching for a table name may return results you don’t want to see, such as returning a column from a table you don’t want to remove or update. I tend to develop with the approach of restricting most object names (tables, views, procedures, etc) with a starting identifier either indicating the object type or purpose (tb or etl) or using the action it performs (helpful for views or procedures) and the reason is that searches through source control become fast as I never get a column and table clash. One of the best DBAs I worked with enforced this rule by removing any object that didn’t follow appropriate naming convention immediately without warning it made developers create objects with the right name from the beginning and saved time when tracking down dependencies (this also saved a lot of time for other tracking purposes). However, this assumption would be inappropriate in many development environments because many environments don’t have strict practices like this.

In the below code, we look for the object etload in our source control by filtering on SQL files exclusively (our assumption here being that we’re using SQL for references) and we return the file name plus the line of the file we found it along with the line of text. This script is not procedure-biased either unless we only point it at a directory that holds procedures (relative to design); it will search all SQL files for references. Even comments should be reviewed, as it’s a common practice by developers to use multi-line comments. Relative to how specific we want our filter, we can edit the if statement in the below code to be very specific with regular expressions.

Function Read-ForSpecificText { Param( [Parameter(Mandatory=$true)][string]$file , [Parameter(Mandatory=$true)][string]$objectpattern ) Process { $readfile = New-Object System.IO.StreamReader($file) $identifylineno = 0 while (($fileline = $readfile.ReadLine()) -ne $null) { if ($fileline -match “$objectpattern”) { Write-Output (‘”’ + $file + ‘” ‘ + $identifylineno.ToString() + “: “ + $fileline) } $identifylineno++ } $readfile.Close() $readfile.Dispose() } } $sourcecodesql = Get-ChildItem “C:\Src\Repo One\Reposins\” -Filter *.sql -Recurse foreach ($sqlfile in $sourcecodesql) { Read-ForSpecificTe

Viewing all articles
Browse latest Browse all 3160

Trending Articles