New T-SQL features in SQL Server 2012 – OFFSET and FETCH

Microsoft has decided in SQL Server 2012, that they will modify the ORDER BY clause and do what MySQL has been doing for a long time – providing simple functions for paging result sets.

This comes in the form of OFFSET and FETCH.

Now, I’m not saying that this was previously not possible in SQL Server. There are solutions to this problem in other versions of the product in the form of temp tables, ROW_NUMBER() and TOP but I prefer OFFSET and FETCH to the others – it’s just simple! :)

I am using SQL Server 2012 Express and the AdventureWorks 2012 database for this demonstration.

So lets look at some data and I have decided to query some of the fields from the TransactionHistory table under the Production schema.

SELECT TOP 10
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory;
TransactionID ProductID   TransactionDate         Quantity    ActualCost
------------- ----------- ----------------------- ----------- ---------------------
100000        784         2007-09-01 00:00:00.000 2           0.00
100001        794         2007-09-01 00:00:00.000 1           0.00
100002        797         2007-09-01 00:00:00.000 1           0.00
100003        798         2007-09-01 00:00:00.000 1           0.00
100004        799         2007-09-01 00:00:00.000 1           0.00
100005        800         2007-09-01 00:00:00.000 1           0.00
100006        801         2007-09-01 00:00:00.000 1           0.00
100007        954         2007-09-01 00:00:00.000 1           0.00
100008        955         2007-09-01 00:00:00.000 1           0.00
100009        966         2007-09-01 00:00:00.000 1           0.00

(10 row(s) affected)

This table contains approximately 133,500 rows – not a massive amount in today’s world but for the purposes of this article, lets say I wanted to write queries to page through this data sorted by newest transactions and I wanted to display 20 rows per page.

Using OFFSET and FETCH in SQL Server 2012

So here is an example. Note that OFFSET and FETCH are added after the ORDER BY clause.

SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY;

OFFSET provides a starting row from which to display the result set. FETCH instructs the query to display the number of rows you want in your result set from the OFFSET point.

Note that NEXT or FIRST can be supplied for FETCH and are synonyms for ANSI compatibility. You can also type ROW or ROWS, again they are synonyms for ANSI compatibility.

So this is nice and easy, for the next page of results, the OFFSET value would be changed to 20 and then 40 etc.

OFFSET and FETCH can accept variables so for example:

DECLARE @OffsetRows tinyint = 0
, @FetchRows tinyint = 20;
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET @OffsetRows ROWS
FETCH NEXT @FetchRows ROWS ONLY;

You can use expressions in OFFSET and FETCH:

ORDER BY TransactionDate DESC
OFFSET @OffsetRows - 0 ROWS
FETCH NEXT @FetchRows - @OffsetRows + 1 ROWS ONLY;

And I really like this – plugging in a scalar sub queries:

SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET @OffsetRows ROWS
FETCH NEXT (SELECT 20) ROWS ONLY;

So imagine that (SELECT 20) was in fact reading a table somewhere in your system (SELECT PageSize FROM PageSetting WHERE SiteID = 5) which controlled the number of rows to be displayed for each query.

OFFSET and FETCH versus ROW_NUMBER()

I’m not going to go into detail about all the methods of paging which have been employed in previous versions of SQL Server and start drawing comparisons and conclusions over performance (perhaps I will in a future post) but the one which immediately springs to mind as an alternative to OFFSET and FETCH is ROW_NUMBER()

So a quick comparison between the two methods shows the following:

Using OFFSET and FETCH

SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Production.TransactionHistory
ORDER BY TransactionDate DESC
OFFSET 0 ROWS
FETCH NEXT 20 ROWS ONLY

Using ROW_NUMBER() with CTE

WITH Paging_CTE AS
(
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
, ROW_NUMBER() OVER (ORDER BY TransactionDate DESC) AS RowNumber
FROM
Production.TransactionHistory
)
SELECT
TransactionID
, ProductID
, TransactionDate
, Quantity
, ActualCost
FROM
Paging_CTE
WHERE RowNumber > 0 AND RowNumber <= 20

So what do you think? It’s certainly easier to write a query using OFFSET and FETCH as there is less involved. There is one less column too because “RowNumber” is not needed for the OFFSET and FETCH version.

If I were a betting man, I would say that the execution plans are different between the two queries. So lets take a look.

Using OFFSET and FETCH…

sql server offset and fetch versus row_number()

Using ROW_NUMBER()…

sql server offset and fetch versus row_number()
There is certainly more going on in the second one right? As I wrote above, I’m not intending to draw any conclusions on this because I am not doing any thorough testing here.

Finally, if you want to guarantee stable results in your OFFSET and FETCH paging solution there are two things that you must implement.

  1. You should ensure that the underlying data does not change and that involves running the queries inside a transaction using either snapshot or serializable transaction isolation.
  2. The ORDER BY clause needs to contain a column or combination of columns that are guaranteed to be unique.

For more information from Microsoft click here - SQL Server 2012 OFFSET and FETCH

Subscribe to our mailing list

* indicates required Email Address *
About Andy Hayes

Andy Hayes is a DBA working with SQL Server since version 7.0. He has a wonderful wife and two beautiful children. He loves database technology, playing cricket, and blogging. He is passionate about sharing his experiences as a DBA and learning more to further his understanding and knowledge. You can follow me on Twitter, check out my Facebook page or follow me on

Comments

  1. I actually asked some of the MS Sql folks at DevConnections end of March about performance of this feature vs. CTEs and they said it should perform exactly the same, that behind the scenes it is still creating a CTE. The purpose of the feature wasn’t to enhance performance (though they may try to do that in future) but to make maintenance of your SQL queries easier. Along with this, when we tested it in the class, we got the exact same plans and results as well. So I’m surprised you got two different plans here.

    • Andy Hayes says:

      Hi Anye

      Yes, I have two different execution plans using the code examples above and the larger execution plan is created where I have used ROW_NUMBER().

      Whilst this is not conclusive as I have not done any real performance analysis between the two queries, I might look at it in a future post to check for reasons and performance differences.

      Thanks for your comment.

      Andy

  2. Raju Dutta says:

    Very nice article. Thanks

  3. Interesting article, thanks for sharing.

  4. I was wondering when MS will bring something like this. This is very good !

    Nice one…

    Pawan

  5. Hi Andy,

    Its really very nice demostration of SQL Server 2012 – OFFSET and FETCH and somehow differences between ROW_NUMBER() & OFFSET / FETCH .

    Regards,
    Virendra Yaduvanshi

  6. Hi Andy,

    Very nice post and very useful. Can you share other enhancements in SQL Server 2012.

    Thanks & Regards,

    Anil

    • Andy Hayes says:

      Hi Anil

      Thanks for your comment, I’m glad you found this topic useful.

      I will try and provide you with more SQL Server 2012 articles soon! :)

      All the best

      Andy

  7. What is difference between
    OFFSET 0 ROWS (OR ) OFFSET 10000 ROWS ?

    Can i use any number after OFFSET ?

    • Andy Hayes says:

      Hi Ramkoti

      Thanks for your comment.

      OFFSET is the starting point of your result set. If you supply 0, you show records from the beginning. As far as I know any number can be used providing that it is 0 and above.

      All the best

      Andy

  8. Actually, if you look at the detail of the execution plan, it seems that those extra steps don’t add anything to the performance….they are 0%. Which would explain that they are equal.

    Thank you for this post. Very helpful!

  9. Although the plans are different, if you look closer, you will see that both perform a clustered index scan at 8% and sort at 92% of the overall query cost.

    The actual usage of the Row_number() compared to the OFFSET is negligible at the end of the day.

    Having said that though, the OFFSET will also be brilliant at fining the Medium value of a rowset.

  10. Andre Guerreiro Neto says:

    Great article!

    Microsoft is surely listening to their users and server-paging features are always welcome
    in the web applications world.

    But about the ROW_NUMBER() VS OFFSET/FETCH: I’m just guessing here but doesn’t ROW_NUMBER() need to retrieve all the lines from query, then number them in the desired order and finally getting rid of all the data except for the 20 rows we need to show?

    Again, thanks for the article.

    Best regards,

    Andre Guerreiro Neto

    • Andy Hayes says:

      Hi Andre

      Thanks for your comment. In the example queries inside the article, performance is equal – certainly in terms of IO. On my machine after running DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS, with SET STATISTICS IO set to ON, the figures are the same. Each execution plan contains a clustered index scan which is reading through 113,000+ rows.

      –OFFSET and FETCH
      Table ‘TransactionHistory’. Scan count 1, logical reads 797, physical reads 3, read-ahead reads 793, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
      –ROW_NUMBER()
      Table ‘TransactionHistory’. Scan count 1, logical reads 797, physical reads 3, read-ahead reads 793, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

      ROW_NUMBER() is calculating the id’s based on the order you specify using OVER (ORDER BY…) and the WHERE clause against the CTE is responsible for returning the first 20 rows. At the core of the CTE query, all rows from the Production.TransactionHistory table are returned.

      There’s a very interesting article on MSSQLTips.com written by Ben Snaidero on the performance of the different paging solutions that have been available since SQL 2000. You can read it here

  11. Bob Wolfson says:

    OFFSET and FETCH offer a nice syntax, but…

    The main use of this feature is to facilitate “paging” though a large dataset, no? In most such cases, we’re talking about serving pages to a client, one at a time – i.e. “paging from the server” – as opposed to returning the entire dataset to the client and letting it do the paging locally.

    In such cases, the client (or the user of the client) usually wants to know how MANY pages there are. And often the user has specified some sort of “filter” criteria over a primary dataset, so there’s no way to know the page count in his filtered dataset without building it.

    So it seems to me (and I don’t profess to be an expert by any means) that the best way to deal with the situation is to filter the primary dataset into a secondary table and then page through the secondary table. The count(*) of the secondary table, divided by the page size, tells you how many pages there are, and if row numbers are inserted into it as part of the filtering it’s easy and efficient to pull any given page-worth of results using the simple syntax in the 2nd half of the cte query.

    I’d appreciate any thoughts you might have.

  12. I’m so glad they added that feature. I just had a flashback to how badly this was needed in SQL Server 2000 [shudders]….

  13. Critian says:

    Nice post and very clear, so thanks!

Trackbacks

Speak Your Mind

*


*