Quantcast
Channel: CodeSection,代码区,SQL Server(mssql)数据库 技术分享 - CodeSec
Viewing all articles
Browse latest Browse all 3160

Bulk upsert into SQL Server

$
0
0

I recently worked through a situation where a bulk upsert (update if a matching record already exists; otherwise insert a new record) to SQL Server implemented using Entity Framework 6 was producing the correct results, but running very slowly.

I was able to improve the performance of the bulk upsert by about 600x (!!!) by replacing the Entity Framework implementation with a an new approach based on a stored procedure taking a table-valued parameter.

Entity Framework implementation very slow

The original code, using Entity Framework 6 (altered to use a table / object type of “Employee” instead of the actual type I was working with):

DateTime now = DateTime.Now;
using (MyCustomContext context = new MyCustomContext())
{
foreach (Employee employee in employeeData)
{
// Get the matching Employee record from the database, if there is one.
Employee employeeRecord = context.Employees
.Where(e => e.EmployeeID == employee.EmployeeID)
.SingleOrDefault();
bool isNewRecord = false;
if (employeeRecord == null)
{
// We don't have a record in the database for this employee yet,
// so we'll add a new one.
isNewRecord = true;
employeeRecord = new Employee()
{
EmployeeID = employee.EmployeeID,
CreationDate = now
};
}
employeeRecord.ModificationDate = now;
employeeRecord.FirstName = employee.FirstName;
// (Set the remaining attributes...)
if (isNewRecord)
{
context.Employees.Add(employeeRecord);
}
context.SaveChanges();
}
}

That approach was taking around 300 seconds per 1000 records inserted. It was bogging down my application in situations where I needed to insert/update tens of thousands of records at once.

Stored Procedure + Table-Valued Parameter implementation fast!

In my testing, the revised approach below, replacing the Entity Framework implementation with a stored procedure taking a table-valued parameter, was able to upsert the same 1000 records in 0.5 second a huge improvement!

I needed to add a custom table type with columns matching the Employees table:

CREATE TYPE EmployeesTableType AS TABLE
(
[EmployeeID] [int] NOT NULL,
[CreationDate] [datetime] NOT NULL,
[FirstName] [nvarchar(100)] NOT NULL,
/* More fields here... */
);

Next, the stored procedure to perform the upsert, taking input of an instance of the custom EmployeesTableType, and using the SQL Server MERGE command:

CREATE PROCEDURE dbo.UpsertEmployees
@UpdateRecords dbo.EmployeesTableType READONLY
AS
BEGIN
MERGE INTO Employees AS Target
USING @UpdateRecords AS Source
ON Target.EmployeeID = Source.EmployeeID
WHEN MATCHED THEN
UPDATE SET Target.EmployeeID = Source.EmployeeID,
Target.FirstName = Source.FirstName,
/* More fields here... */
WHEN NOT MATCHED THEN INSERT (EmployeeID,
FirstName,
/* More field names here... */
)
VALUES (Source.EmployeeID,
Source.FirstName,
/* More field values here... */
);
END

Finally, here’s the C# code I used to execute the stored procedure:

public static void UpsertEmployeeData(List <Employee> employees)
{
string connectionString = MyCustomMethodToGetConnectionString();
using(SqlConnection connection = new SqlConnection(connectionString))
{
connection.Open();
using (SqlCommand command = connection.CreateCommand())
{
command.CommandText = "dbo.UpsertEmployees";
command.CommandType = CommandType.StoredProcedure;
SqlParameter parameter = command.Parameters.AddWithValue("@UpdateRecords",
CreateUpdateRecordsSqlParam(employees)); // See implementation below
parameter.SqlDbType = SqlDbType.Structured;
parameter.TypeName = "dbo.EmployeesTableType";
command.ExecuteNonQuery();
}
}
}
private static DataTable CreateUpdateRecordsSqlParam(IEnumerable employees)
{
DataTable table = new DataTable();
table.Columns.Add("EmployeeID", typeof(Int32));
table.Columns.Add("FirstName", typeof(string));
// More fields here...
foreach (Employee employee in employees)
{
table.Rows.Add(employee.EmployeeID,
employee.FirstName,
// More fields here...
}
return table;
}

Ideally, this post will serve as a useful reference the next time I, or you, need to code up a bulk upsert into SQL Server!

Credit to this StackOverflow answer by Ryan Prechel which was the primary inspiration for this approach.


Viewing all articles
Browse latest Browse all 3160

Trending Articles