In this article, we’re going to discuss the ROW_NUMBER SQL function. This is a continuation of the SQL essential series. In this guide, I’ll explain what a window function is all about, and you’ll see examples that illustrate how the ROW_NUMBER SQL function works.
Introduction
ROW_NUMBER is one of the most commonly used window functions in SQL Server. It assigns a unique, incrementing number to each row in the window. The order in which the row numbers are applied is determined by the ORDER BY expression. Most of the time, one or more columns are specified in the ORDER BY expression, but it’s possible to use more complex expressions or even a sub-query. The result is an ever-increasing integer value that always starts at 1, with each subsequent row getting the next higher value. You can also use it with a PARTITION BY clause; in that case, the counter resets and starts again from 1 whenever it crosses a partition boundary. So, the first partition may have the values 1, 2, 3, and so on, and the second partition starts the counter again from 1, 2, 3, and so on.
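To make the numbering and the partition reset concrete, here is a minimal sketch that runs against a small inline data set instead of a real table. The Orders alias and the Region and Amount columns are hypothetical names chosen only for illustration.

```sql
-- Hypothetical inline data: Orders, Region and Amount are illustrative names only.
SELECT
    Region,
    Amount,
    -- One counter across the whole result set
    ROW_NUMBER() OVER (ORDER BY Amount DESC) AS OverallRowNum,
    -- A separate counter per Region; it restarts at 1 for each partition
    ROW_NUMBER() OVER (PARTITION BY Region ORDER BY Amount DESC) AS RegionRowNum
FROM (VALUES
        ('East', 500),
        ('East', 300),
        ('West', 400),
        ('West', 200),
        ('West', 100)
     ) AS Orders (Region, Amount)
ORDER BY Region, RegionRowNum;

-- Region  Amount  OverallRowNum  RegionRowNum
-- East    500     1              1
-- East    300     3              2
-- West    400     2              1
-- West    200     4              2
-- West    100     5              3
```

The amounts are all distinct here, so the numbering is fully determined by the ORDER BY expressions.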
Guidelines
The ROW_NUMBER SQL function generates a non-persistent sequence of temporary values that is calculated dynamically when the query is executed.
There is no guarantee that the rows returned by a SQL query using the ROW_NUMBER SQL function will be ordered exactly the same with each execution.
The ROW_NUMBER and RANK SQL functions are very similar. ROW_NUMBER produces a sequence of values that starts from 1 and increments by 1 for every row, whereas RANK assigns the same value to tied rows and then skips ahead, leaving a gap after the tie.
If you’ve had any experience with Oracle, then ROWNUM may be more familiar to you. It is a pseudo-column that starts at 1 and increases by one all the way down to the last row.
The ROW_NUMBER function is dynamic in nature, and we are allowed to reset the values using the PARTITION BY clause.
The Order by clause of the query and the Order by clause of the Over clause have nothing to do with each other.
Syntax
ROW_NUMBER ( ) OVER ( [ PARTITION BY value_expression1 , ... [ n ] ] ORDER BY col1, col2, ... )
The syntax is pretty simple. The ROW_NUMBER SQL function is available in SQL Server 2005 and later versions.
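The difference between ROW_NUMBER and RANK on tied values can be sketched with a small inline data set. The Scores alias and its Player and Score columns are hypothetical; Player is added as a tie-breaker in the ROW_NUMBER ordering so that the result is the same on every execution.

```sql
-- Hypothetical inline data: Scores, Player and Score are illustrative names only.
SELECT
    Player,
    Score,
    -- Always unique; Player breaks the tie between equal scores
    ROW_NUMBER() OVER (ORDER BY Score DESC, Player) AS RowNum,
    -- Tied scores share a rank, and the next rank is skipped
    RANK()       OVER (ORDER BY Score DESC)         AS RankNum
FROM (VALUES
        ('Ann', 90),
        ('Bob', 85),
        ('Cid', 85),
        ('Dee', 70)
     ) AS Scores (Player, Score)
ORDER BY RowNum;

-- Player  Score  RowNum  RankNum
-- Ann     90     1       1
-- Bob     85     2       2
-- Cid     85     3       2
-- Dee     70     4       4
```

Without the Player tie-breaker, Bob and Cid would still receive row numbers 2 and 3, but which of them gets which number would not be guaranteed.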
ROW_NUMBER SQL function
ROW_NUMBER is followed by the OVER clause, and the Order by clause goes inside its parentheses. The Order by clause is required because it imposes the order in which row numbers are assigned to the result-set.
Over clause
The Over clause defines the window, or set of rows, that the window function operates on, so it’s really important for you to understand. For ROW_NUMBER, the possible components of the Over clause are ORDER BY and PARTITION BY. The ORDER BY expression of the Over clause is needed when the rows must be lined up in a certain way for the function to work.
Partition by clause
PARTITION BY value_expression1
The Partition by clause is optional. When it is specified, it divides the result set produced by the FROM clause into partitions to which the ROW_NUMBER SQL function is applied. The values specified in the PARTITION BY clause define the boundaries of each partition. If the PARTITION BY clause is not specified, the Over clause treats all rows of the result set as a single partition. This clause may consist of one or more columns, a more complex expression, or even a sub-query.
Order by clause
The Order by clause is mandatory. It determines the sequence in which the temporary values are assigned to the rows of a specified partition. The Order by clause is part of the Over clause, and it determines how the rows are lined up for the function.
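As a hedged illustration of the mandatory Order by clause: SQL Server rejects ROW_NUMBER when the Over clause has no ORDER BY. When the numbering order genuinely does not matter, a commonly seen workaround is ORDER BY (SELECT NULL), which satisfies the syntax without requesting any particular order. The Items alias and Name column below are hypothetical.

```sql
-- Hypothetical inline data: Items and Name are illustrative names only.

-- Not valid in SQL Server: ROW_NUMBER requires ORDER BY inside OVER.
-- SELECT Name, ROW_NUMBER() OVER () AS RowNum
-- FROM (VALUES ('b'), ('c'), ('a')) AS Items (Name);

-- Valid: a meaningful ordering gives predictable numbers (a = 1, b = 2, c = 3).
SELECT Name, ROW_NUMBER() OVER (ORDER BY Name) AS RowNum
FROM (VALUES ('b'), ('c'), ('a')) AS Items (Name);

-- Also valid: the values are still 1 to 3, but which row gets which number is arbitrary.
SELECT Name, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RowNum
FROM (VALUES ('b'), ('c'), ('a')) AS Items (Name);
```

The second form should be used only when any assignment of numbers is acceptable, since the result can change between executions.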
Demo
In this section, we’ll take a look at the ROW_NUMBER SQL function in action. For the entire demo, I’ve used the AdventureWorks2016 database.
How to use ROW_NUMBER in a SQL Query
In the following examples, we’ll see the use of the Over clause. Let us get the list of all customer orders by projecting columns such as CustomerID, SalesOrderID, OrderDate, SalesOrderNumber, SubTotal, TotalDue and RowNum. The ROW_NUMBER SQL function is applied in the order of the CustomerID column. The temporary value starts from 1 and is assigned based on the order of the CustomerID, continuing until the last row of the table. The order of the rows in the output is not guaranteed because we don’t specify the Order by clause in the query.
USE AdventureWorks2016;
GO
SELECT ROW_NUMBER() OVER(ORDER BY CustomerID) AS RowNum,
       CustomerID,
       SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       SubTotal,
       TotalDue
FROM Sales.SalesOrderHeader;
Order by clause
The following example uses the Order by clause in the query. The Order by clause in the query is applied to the SalesOrderID column, so the rows in the output are returned in SalesOrderID order. The ROW_NUMBER SQL function is still applied in the order of CustomerID. The output shows that the ORDER BY of the query and the ORDER BY of the Over clause are independent of each other.
USE AdventureWorks2016;
GO
SELECT ROW_NUMBER() OVER(ORDER BY CustomerID) AS RowNum,
       CustomerID,
       SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       SubTotal,
       TotalDue
FROM Sales.SalesOrderHeader
ORDER BY SalesOrderID;
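The same independence can be seen on a handful of rows without AdventureWorks. This is a minimal sketch using hypothetical inline values; SalesOrderID is added to the Over clause ordering only to make the numbering deterministic for the repeated customer.

```sql
-- Hypothetical inline data that borrows the column names used in the demo.
SELECT
    CustomerID,
    SalesOrderID,
    -- Numbers are assigned in CustomerID order (SalesOrderID breaks ties)
    ROW_NUMBER() OVER (ORDER BY CustomerID, SalesOrderID) AS RowNum
FROM (VALUES
        (300, 1),
        (100, 2),
        (200, 3),
        (100, 4)
     ) AS Orders (CustomerID, SalesOrderID)
-- Rows are presented in SalesOrderID order, regardless of the numbering
ORDER BY SalesOrderID;

-- CustomerID  SalesOrderID  RowNum
-- 300         1             4
-- 100         2             1
-- 200         3             3
-- 100         4             2
```

RowNum looks shuffled in the output because the query-level ORDER BY only controls presentation, while the ORDER BY inside the Over clause controls how the numbers are assigned.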
How to use multiple columns with the Over clause
In the following example, we have listed CustomerID and OrderDate in the Order by clause. This gives the customer details with the most recent order first for each customer, along with the sequence of numbers assigned to the entire result-set.
USE AdventureWorks2016;
GO
SELECT ROW_NUMBER() OVER(ORDER BY CustomerID, OrderDate DESC) AS RowNum,
       CustomerID,
       SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       SubTotal,
       TotalDue
FROM Sales.SalesOrderHeader;
How to use ROW_NUMBER with PARTITION
The following example uses the PARTITION BY clause on the CustomerID and OrderDate fields. In the output, you can see that the customer 11019 has three orders for the month 2014-Jun. In this case, the partition is done on more than one column: it is a combination of CustomerID and the month of OrderDate, which the DATEADD and DATEDIFF expression truncates to the first day of the month. The ROW_NUMBER value starts over for each unique combination of CustomerID and order month. In this way, it’s easy to find the customers who have placed more than one order in the same month.
USE AdventureWorks2016;
GO
SELECT ROW_NUMBER() OVER(PARTITION BY CustomerID,
                                      DATEADD(MONTH, DATEDIFF(MONTH, 0, OrderDate), 0)
                         ORDER BY SubTotal DESC) AS MonthlyOrders,
       CustomerID,
       SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       SubTotal,
       TotalDue
FROM Sales.SalesOrderHeader;
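To see how the month-based partition in the query above behaves, here is a small sketch that applies the same PARTITION BY expression to hypothetical inline rows. The values are made up for illustration and are not taken from AdventureWorks.

```sql
-- Hypothetical inline data that borrows the column names used in the demo.
SELECT
    CustomerID,
    OrderDate,
    SubTotal,
    ROW_NUMBER() OVER (
        -- DATEADD/DATEDIFF truncates OrderDate to the first day of its month,
        -- so all orders from the same customer and month share a partition
        PARTITION BY CustomerID, DATEADD(MONTH, DATEDIFF(MONTH, 0, OrderDate), 0)
        ORDER BY SubTotal DESC
    ) AS MonthlyOrders
FROM (VALUES
        (1, CAST('20140601' AS datetime), 120.00),
        (1, CAST('20140615' AS datetime),  80.00),
        (1, CAST('20140702' AS datetime),  50.00),
        (2, CAST('20140610' AS datetime), 200.00)
     ) AS Orders (CustomerID, OrderDate, SubTotal)
ORDER BY CustomerID, OrderDate;

-- CustomerID  OrderDate   SubTotal  MonthlyOrders
-- 1           2014-06-01  120.00    1
-- 1           2014-06-15   80.00    2
-- 1           2014-07-02   50.00    1
-- 2           2014-06-10  200.00    1
```

Customer 1 has two orders in June, so the counter reaches 2 there and restarts at 1 for the July order. Any MonthlyOrders value greater than 1 therefore marks a customer with more than one order in that month.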
How to return a subset of rows using CTE and ROW_NUMBER
In the following example, we are going to analyze SalesOrderHeader to display the five largest orders placed by each customer every month. Using the Month SQL function, the OrderDate column is manipulated to fetch the month part. In this way, the sales corresponding to a specific month (OrderDate) and customer (CustomerID) are partitioned. To list the five la