Microsoft has decided in SQL Server 2012, that they will modify the ORDER BY clause and do what MySQL has been doing for a long time – providing simple functions for paging result sets.
This comes in the form of OFFSET and FETCH.
Now, I’m not saying that this was previously not possible in SQL Server. There are solutions to this problem in other versions of the product in the form of temp tables, ROW_NUMBER() and TOP but I prefer OFFSET and FETCH to the others – it’s just simple! 🙂
I am using SQL Server 2012 Express and the AdventureWorks 2012 database for this demonstration.
So lets look at some data and I have decided to query some of the fields from the TransactionHistory table under the Production schema.
[sourcecode language=’sql’]SELECT TOP 10
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory;[/sourcecode]
TransactionID ProductID TransactionDate Quantity ActualCost ------------- ----------- ----------------------- ----------- --------------------- 100000 784 2007-09-01 00:00:00.000 2 0.00 100001 794 2007-09-01 00:00:00.000 1 0.00 100002 797 2007-09-01 00:00:00.000 1 0.00 100003 798 2007-09-01 00:00:00.000 1 0.00 100004 799 2007-09-01 00:00:00.000 1 0.00 100005 800 2007-09-01 00:00:00.000 1 0.00 100006 801 2007-09-01 00:00:00.000 1 0.00 100007 954 2007-09-01 00:00:00.000 1 0.00 100008 955 2007-09-01 00:00:00.000 1 0.00 100009 966 2007-09-01 00:00:00.000 1 0.00 (10 row(s) affected)
This table contains approximately 133,500 rows – not a massive amount in today’s world but for the purposes of this article, lets say I wanted to write queries to page through this data sorted by newest transactions and I wanted to display 20 rows per page.
Using OFFSET and FETCH in SQL Server 2012
So here is an example. Note that OFFSET and FETCH are added after the ORDER BY clause.
[sourcecode language=’sql’]SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY;[/sourcecode]
OFFSET provides a starting row from which to display the result set. FETCH instructs the query to display the number of rows you want in your result set from the OFFSET point.
Note that NEXT or FIRST can be supplied for FETCH and are synonyms for ANSI compatibility. You can also type ROW or ROWS, again they are synonyms for ANSI compatibility.
So this is nice and easy, for the next page of results, the OFFSET value would be changed to 20 and then 40 etc.
OFFSET and FETCH can accept variables so for example:
[sourcecode language=’sql’]DECLARE @OffsetRows tinyint = 0
, @FetchRows tinyint = 20;
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET @OffsetRows ROWS
FETCH NEXT @FetchRows ROWS ONLY;[/sourcecode]
You can use expressions in OFFSET and FETCH:
[sourcecode language=’sql’]ORDER BY TransactionDate DESC
OFFSET @OffsetRows – 0 ROWS
FETCH NEXT @FetchRows – @OffsetRows + 1 ROWS ONLY;[/sourcecode]
And I really like this – plugging in a scalar sub queries:
[sourcecode language=’sql’]SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET @OffsetRows ROWS
FETCH NEXT (SELECT 20) ROWS ONLY;[/sourcecode]
So imagine that (SELECT 20) was in fact reading a table somewhere in your system (SELECT PageSize FROM PageSetting WHERE SiteID = 5) which controlled the number of rows to be displayed for each query.
OFFSET and FETCH versus ROW_NUMBER()
I’m not going to go into detail about all the methods of paging which have been employed in previous versions of SQL Server and start drawing comparisons and conclusions over performance (perhaps I will in a future post) but the one which immediately springs to mind as an alternative to OFFSET and FETCH is ROW_NUMBER()
So a quick comparison between the two methods shows the following:
Using OFFSET and FETCH
[sourcecode language=’sql’]SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY[/sourcecode]
Using ROW_NUMBER() with CTE
[sourcecode language=’sql’]WITH Paging_CTE AS
(
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
, ROW_NUMBER() OVER (ORDER BY TransactionDate DESC) AS RowNumber
FROM
Production.TransactionHistory
)
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Paging_CTE
WHERE RowNumber > 0 AND RowNumber <= 20[/sourcecode]
So what do you think? It's certainly easier to write a query using OFFSET and FETCH as there is less involved. There is one less column too because "RowNumber" is not needed for the OFFSET and FETCH version.
If I were a betting man, I would say that the execution plans are different between the two queries. So lets take a look.
Using OFFSET and FETCH...
Using ROW_NUMBER()…
There is certainly more going on in the second one right? As I wrote above, I’m not intending to draw any conclusions on this because I am not doing any thorough testing here.
Finally, if you want to guarantee stable results in your OFFSET and FETCH paging solution there are two things that you must implement.
- You should ensure that the underlying data does not change and that involves running the queries inside a transaction using either snapshot or serializable transaction isolation.
- The ORDER BY clause needs to contain a column or combination of columns that are guaranteed to be unique.
For more information from Microsoft click here – SQL Server 2012 OFFSET and FETCH
Anye says
I actually asked some of the MS Sql folks at DevConnections end of March about performance of this feature vs. CTEs and they said it should perform exactly the same, that behind the scenes it is still creating a CTE. The purpose of the feature wasn’t to enhance performance (though they may try to do that in future) but to make maintenance of your SQL queries easier. Along with this, when we tested it in the class, we got the exact same plans and results as well. So I’m surprised you got two different plans here.
Andy Hayes says
Hi Anye
Yes, I have two different execution plans using the code examples above and the larger execution plan is created where I have used ROW_NUMBER().
Whilst this is not conclusive as I have not done any real performance analysis between the two queries, I might look at it in a future post to check for reasons and performance differences.
Thanks for your comment.
Andy
Raju Dutta says
Very nice article. Thanks
Andy Hayes says
Thanks Raju
Harsha says
Interesting article, thanks for sharing.
Pawan Kumar says
I was wondering when MS will bring something like this. This is very good !
Nice one…
Pawan
Andy Hayes says
Hi Pawan
Thank you for your comment. I’m glad you found this post useful 🙂
All the best.
Andy
Virendra Yaduvanshi says
Hi Andy,
Its really very nice demostration of SQL Server 2012 – OFFSET and FETCH and somehow differences between ROW_NUMBER() & OFFSET / FETCH .
Regards,
Virendra Yaduvanshi
Andy Hayes says
Hi Virendra
Thanks for your comment. I am glad you liked my post 🙂
All the best
Andy
Anil says
Hi Andy,
Very nice post and very useful. Can you share other enhancements in SQL Server 2012.
Thanks & Regards,
Anil
Andy Hayes says
Hi Anil
Thanks for your comment, I’m glad you found this topic useful.
I will try and provide you with more SQL Server 2012 articles soon! 🙂
All the best
Andy
Ramkoti says
What is difference between
OFFSET 0 ROWS (OR ) OFFSET 10000 ROWS ?
Can i use any number after OFFSET ?
Andy Hayes says
Hi Ramkoti
Thanks for your comment.
OFFSET is the starting point of your result set. If you supply 0, you show records from the beginning. As far as I know any number can be used providing that it is 0 and above.
All the best
Andy
Mike Scally says
Hi Ramkoti,
If you OFFSET by a number that is larger than your result set then you will get no results.
Apart from that I don’t think there would be any other issues.
Nice article Andy,
Short and succinct.
Cheers
Mike
Andy Hayes says
Thanks Mike 🙂
Steve S says
Actually, if you look at the detail of the execution plan, it seems that those extra steps don’t add anything to the performance….they are 0{3a76cbff1b5ce5ace3bd34bbdd68f71285d81fb9af60114c7a6db80d3a8688de}. Which would explain that they are equal.
Thank you for this post. Very helpful!
Andy Hayes says
Hi Steve
I’m glad you liked the post. Thanks for your comment.
All the best
Andy
Dave says
Although the plans are different, if you look closer, you will see that both perform a clustered index scan at 8{3a76cbff1b5ce5ace3bd34bbdd68f71285d81fb9af60114c7a6db80d3a8688de} and sort at 92{3a76cbff1b5ce5ace3bd34bbdd68f71285d81fb9af60114c7a6db80d3a8688de} of the overall query cost.
The actual usage of the Row_number() compared to the OFFSET is negligible at the end of the day.
Having said that though, the OFFSET will also be brilliant at fining the Medium value of a rowset.
Andre Guerreiro Neto says
Great article!
Microsoft is surely listening to their users and server-paging features are always welcome
in the web applications world.
But about the ROW_NUMBER() VS OFFSET/FETCH: I’m just guessing here but doesn’t ROW_NUMBER() need to retrieve all the lines from query, then number them in the desired order and finally getting rid of all the data except for the 20 rows we need to show?
Again, thanks for the article.
Best regards,
Andre Guerreiro Neto
Andy Hayes says
Hi Andre
Thanks for your comment. In the example queries inside the article, performance is equal – certainly in terms of IO. On my machine after running DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS, with SET STATISTICS IO set to ON, the figures are the same. Each execution plan contains a clustered index scan which is reading through 113,000+ rows.
–OFFSET and FETCH
Table ‘TransactionHistory’. Scan count 1, logical reads 797, physical reads 3, read-ahead reads 793, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
–ROW_NUMBER()
Table ‘TransactionHistory’. Scan count 1, logical reads 797, physical reads 3, read-ahead reads 793, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
ROW_NUMBER() is calculating the id’s based on the order you specify using OVER (ORDER BY…) and the WHERE clause against the CTE is responsible for returning the first 20 rows. At the core of the CTE query, all rows from the Production.TransactionHistory table are returned.
There’s a very interesting article on MSSQLTips.com written by Ben Snaidero on the performance of the different paging solutions that have been available since SQL 2000. You can read it here
Bob Wolfson says
OFFSET and FETCH offer a nice syntax, but…
The main use of this feature is to facilitate “paging” though a large dataset, no? In most such cases, we’re talking about serving pages to a client, one at a time – i.e. “paging from the server” – as opposed to returning the entire dataset to the client and letting it do the paging locally.
In such cases, the client (or the user of the client) usually wants to know how MANY pages there are. And often the user has specified some sort of “filter” criteria over a primary dataset, so there’s no way to know the page count in his filtered dataset without building it.
So it seems to me (and I don’t profess to be an expert by any means) that the best way to deal with the situation is to filter the primary dataset into a secondary table and then page through the secondary table. The count(*) of the secondary table, divided by the page size, tells you how many pages there are, and if row numbers are inserted into it as part of the filtering it’s easy and efficient to pull any given page-worth of results using the simple syntax in the 2nd half of the cte query.
I’d appreciate any thoughts you might have.
RMcMullan says
That’s what I was thinking too. It sounds great for paging through the results of a SQL Query but when you realize that there’s no count and with the constraint that the data can’t change, this feature is not useful for this sort of data paging. So I don’t see what good it is?
Anonim says
Existe alguna forma de retornar con OFFSET y FETCH el número de registros existente en la consulta “sin haber realizado el pagineo”… Actualmente tengo que hacer la misma consulta 2 veces pero no se ve muy bien “a pesar que sigue siendo mas rapido que usando Row_Number con el CTE:
Declare @page Int = 3
Declare @pageSize Int = 25
Select UsuarioId, UserName, Email From Usuarios
Order By UsuarioId
Offset(@page – 1) * @pageSize Rows
Fetch Next @pageSize Rows Only;
Select Count(*) From Usuarios;
Simon says
I’m so glad they added that feature. I just had a flashback to how badly this was needed in SQL Server 2000 [shudders]….
Critian says
Nice post and very clear, so thanks!
Andy Hayes says
Thanks for your comment!
Sarthak says
Hi there!
I was trying to search for features added to MS SQL Server 2012 over MS SQl Server 2008R2 . Whilst crawling among several webpages and digging into each of the differences that MS brings to this 2012 edition i found this article helpful.
Andy, can i also find a consolidated list for all the newly features added? This article is really helpful.
Cheers
S
Yogesh Khaire says
Very nice and useful post.
Ali says
How do I find the total rows?
For example, I request page 7, and I get 20 results back. but I’m also gonna need to know that there were 230 total results
Miroslav says
Unfortunately with OFFSET FETCH you can’t do that.
Miroslav says
However using OFFSET FETCH is easier and straightforward I still prefer using ROW_NUMBER as it’s more generic. To OFFSET FETCH I can’t pass NULLs or negatives to cancel the pagination and get all the records, while with ROW_NUMBER I can, e.g.:
–// stored procedure’s header with parameters
@ID int,
@Name nvarchar(50),
–// these are the parameters responsible for pagination
@OrderByDirection nvarchar(4),
@OrderByColumn nvarchar(max),
@PageSize int, — if I pass NULL or zero or negative in this parameter the pagination is being ignorred and all records are returned
@PageNumber int
–// and the following is stored procedure’s body
SET @OrderByColumn = ISNULL(@OrderByColumn,’ID’)
SET @OrderByDirection = ISNULL(@OrderByDirection,’DESC’)
SET @PageNumber = ISNULL(@PageNumber,1)
SET @PageSize = ISNULL(@PageSize,-1)
INSERT INTO @tmpTable
SELECT ROW_NUMBER() OVER(ORDER BY
CASE @OrderByDirection WHEN ‘ASC’ THEN CASE @OrderByColumn WHEN ‘ID’ THEN t.ID END END,
CASE @OrderByDirection WHEN ‘DESC’ THEN CASE @OrderByColumn WHEN ‘ID’ THEN t.ID END END DESC,
CASE @OrderByDirection WHEN ‘ASC’ THEN CASE @OrderByColumn WHEN ‘Name’ THEN t.Name END END,
CASE @OrderByDirection WHEN ‘DESC’ THEN CASE @OrderByColumn WHEN ‘Name’ THEN t.Name END END DESC) as RowNum,
ID,
Name
FROM SomeTable AS t
WHERE (@ID IS NULL OR t.ID = @ID) AND (@Name IS NULL OR @Name = ” OR t.Name LIKE ‘%’ + @Name + ‘%’)
//
SELECT RowNum, ID, Name, TotalCount
FROM @tmpTable
cross apply(SELECT TotalCount = count(*) FROM @tmpTable) as TotalCount
WHERE (@PageSize <= 0 OR
RowNum BETWEEN
(CASE WHEN TotalCount < ((@PageNumber – 1) * @PageSize + 1) THEN 1 ELSE ((@PageNumber – 1) * @PageSize + 1) END)
AND
(CASE WHEN TotalCount < ((@PageNumber – 1) * @PageSize + 1) THEN @PageSize ELSE ISNULL(@PageNumber,1) * ISNULL(@PageSize,10) END))
As you can see this way I'm in almost full control of what is going on with the pagination.
And also later, in the UI, I use RowNum and TotalCount to design pagination control with labels like "showing records x to y of total z" etc.