
DBA Diaries

Thoughts and experiences of a DBA working with SQL Server and MySQL

How to Delete Millions of Rows using T-SQL with Reduced Impact

October 9, 2012 by Andy Hayes 27 Comments

In this post, I talk about deleting millions of rows in SQL Server whilst keeping impact low.

Deleting millions of rows in one transaction can throttle a SQL Server

TRUNCATE TABLE – we will presume that in this example TRUNCATE TABLE is not available: perhaps permissions do not allow it, foreign key constraints reference the table and prevent it from running, or it is simply unsuitable because we don’t want to remove all of the rows.
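For contrast, where none of those restrictions apply, TRUNCATE TABLE is usually the quickest way to empty a table because it deallocates whole data pages rather than logging each row. A minimal sketch, using the article's example table name:

[sourcecode language='sql']
-- Removes ALL rows; minimally logged because whole pages are deallocated.
-- Requires ALTER permission on the table and fails if the table is
-- referenced by a foreign key constraint.
TRUNCATE TABLE dbo.ExampleTable;
[/sourcecode]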

When you run something like the following to remove all rows from your table in a single transaction,

[sourcecode language=’sql’]DELETE FROM ExampleTable[/sourcecode]

SQL Server sets about writing to the transaction log all of the changes to be applied to the physical data. It also decides how it will lock the data, and it is highly likely that it will decide a complete table lock is the most efficient way to handle the transaction.

There are potentially some big problems here,

  • Your transaction log may grow to accommodate the changes being written to it. If your table is huge, you run the risk of consuming all the space on your transaction log disk.
  • If your application(s) still require access to the table and a table lock has been placed on it, they have to wait until the table becomes available. This could be some time, resulting in application timeouts and frustrated users.
  • Your transaction log disk will be working hard during this period as your transaction log grows. This could decrease performance across all databases which might be sharing that disk for their transaction logs.
  • Depending on how much memory you have allocated to your SQL Server buffer pool, there could be significant drops in page life expectancy, reducing performance for other queries.
  • The realisation that a big performance issue is occurring makes it tempting to kill the query. The trouble with that is it can delay things even more, because the server has to roll back the transaction. Depending on how far along the operation is, this could add even more time to what it was originally going to take.

For example, if you kill the query when it is 90% done, the server has to roll back a 90% completed transaction. This will vary, but the rollback can take as much time as the delete had already been running! (You can check rollback progress using KILL n WITH STATUSONLY.)
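If you do end up in that situation, here is a rough sketch of how you might see what is going on. The session id (56) is just an example; the commands are standard SQL Server tools rather than anything specific to this article:

[sourcecode language='sql']
-- How full is each database's transaction log?
DBCC SQLPERF(LOGSPACE);

-- What locks is the offending session (56 used as an example) holding?
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 56;

-- After killing the session, report how far the rollback has progressed
KILL 56 WITH STATUSONLY;
[/sourcecode]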

Some ways to delete millions of rows using T-SQL loops and TOP

Use a loop combined with TOP and delete rows in smaller transactions. Here are a couple of variations of the same thing. Note that I have arbitrarily chosen 1000 as a figure for demonstration purposes.

[sourcecode language=’sql’]
SELECT 1 -- seeds @@ROWCOUNT with 1 so the WHILE loop is entered on the first pass
WHILE @@ROWCOUNT > 0
BEGIN
    DELETE TOP (1000)
    FROM LargeTable
END
[/sourcecode]

And another way…
[sourcecode language=’sql’]
DoItAgain:
DELETE TOP (1000)
FROM ExampleTable

IF @@ROWCOUNT > 0
GOTO DoItAgain
[/sourcecode]

These are simple examples just to demonstrate the technique. You can add WHERE clauses and JOINs to filter down to the specific rows you want to remove, and you would also add error handling and explicit transactions (COMMIT/ROLLBACK).
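As a rough illustration of those additions, here is a sketch of a batched delete with a filter, an explicit transaction per batch and basic error handling. The CreatedDate column and the cut-off date are made up for the example, so adjust them to suit:

[sourcecode language='sql']
DECLARE @BatchSize INT = 1000;
DECLARE @RowsDeleted INT = 1;

WHILE @RowsDeleted > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        DELETE TOP (@BatchSize)
        FROM dbo.ExampleTable
        WHERE CreatedDate < '20120101';   -- only purge the old rows

        SET @RowsDeleted = @@ROWCOUNT;    -- capture before COMMIT resets it

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;   -- re-raise the error (SQL Server 2012 and later)
    END CATCH

    WAITFOR DELAY '00:00:01';   -- short pause to let other work through
END
[/sourcecode]

Keeping each batch in its own short transaction means the log space used by a batch can be reused after a checkpoint (simple recovery) or a log backup (full recovery), which is what stops the log growing out of control.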

Summary

It’s a bad idea to delete millions of rows in one transaction 🙂 and whilst this might sound like a no-brainer, people do try to do this and then wonder why things start to go bad.

Breaking the delete operation down into smaller transactions is better all round. It will help reduce contention for your table, reduce the probability of your transaction log becoming too large for its disk, and reduce the performance impact in general.

Your transaction log disk will still be working hard as your refined delete routine removes the rows from your table. Try to run this task during a maintenance window, which typically falls inside an off-peak period.

Filed Under: All Articles, SQL Server Administration, SQL Server Performance Tagged With: performance, sql server

About Andy Hayes

Andy Hayes is a DBA who has worked with SQL Server since version 7.0. He has a wonderful wife and two beautiful children. He loves database technology, playing cricket, and blogging. He is passionate about sharing his experiences as a DBA and learning more to further his understanding and knowledge. You can follow him on Twitter, check out his Facebook page or follow him on Google+.

Reader Interactions

Comments

  1. Virendra Yaduvanshi says

    November 1, 2012 at 4:00 am

    Hi Andy,
    Very Nice article… good work.
    regards,
    Virendra

    Reply
    • Andy Hayes says

      November 2, 2012 at 8:04 am

      Hi Virendra

Thanks for your comment, I’m glad you liked the article 🙂

      All the best

      Andy

      Reply
      • Justice says

        December 2, 2015 at 8:13 am

        Hi!,

So what happens if I stop this query when it’s 80% of the way through?

        Reply
  2. TIRUMALESH says

    February 22, 2013 at 7:13 pm

    Very Good Site…Nice articles…Keep it up…

    Reply
    • admin says

      February 23, 2013 at 9:00 am

      Glad you like the site Tirumalesh and thank you for the comment.

      All the best

      Andy

      Reply
  3. Patrick says

    May 3, 2013 at 12:58 am

Mate thanks a bunch, that is so much faster I’m very surprised! 🙂

    Reply
    • Andy Hayes says

      May 3, 2013 at 5:55 pm

      Hi Patrick

      I’m glad this has helped.

      Thanks for your comment.

      All the best.

      Andy

      Reply
  4. Milind says

    June 20, 2013 at 12:31 am

    Exactly what I was looking for. Thanks Andy!

    Reply
    • Andy Hayes says

      June 20, 2013 at 4:13 pm

I’m glad you found this useful Milind 🙂

      Reply
  5. Adnan Khan says

    July 18, 2013 at 11:12 am

    Thanks a lot Andy..This is exactly what I was looking for.. You saved me…
    Once again thank you so much for sharing a nice article…

    Reply
    • Andy Hayes says

      July 26, 2013 at 7:18 pm

My pleasure Adnan 🙂

      Reply
  6. Ash Kumar says

    February 17, 2014 at 1:58 pm

    Similar to your top rows solution, the following works as well:

    set rowcount 50000
    do while (1=1)
    begin
    delete mytable where
    if @@rowcount=0
    break
    end
set rowcount 0 -- Reset the rowcount

    Reply
  7. Johnny says

    June 29, 2014 at 2:26 am

    Nice article. Assuming that disk space for the transaction log growth is not a concern, and assuming that other queries being blocked is not a concern, is a single large operation faster than multiple smaller “group type operations”? I’m not a DBA and I’m trying to get my head around the situation from a theoretical standpoint.

    Reply
    • Andy Hayes says

      June 29, 2014 at 8:02 am

Thanks very much Johnny. I’m glad you liked the article. You have raised an interesting question. Typically SQL Server is far more efficient at dealing with sets of data so in the scenario you refer to where disk space and locking is not an issue, I would expect a single large transaction to be faster. It’s something worthy of a blog post to try and prove this. 🙂

      Reply
  8. Dilon says

    July 21, 2016 at 7:47 am

    Hi Andy
    Great article well explained. But can you explain why is it necessary for the SELECT 1 at the beginning of the code.
    Thanks
    Dilon

    Reply
    • Andy Hayes says

      August 7, 2016 at 7:00 pm

      Hi Dilon

SELECT 1 returns one row, which sets @@ROWCOUNT to 1 and gives the WHILE loop on the next line a starting condition so that it is entered on the first pass.

      Reply
  9. Adam Feather says

    August 12, 2016 at 10:43 am

    I’ve just forwarded this article around my dept. as I suspect we suffered at the hands of a large delete the other day. Hope you’re well!

    Reply
    • Andy Hayes says

      August 12, 2016 at 8:40 pm

Hello sir! I’m very good thanks. Hope all is well with you. Thanks for sharing my post 😉

      Reply
  10. Gavin Stevens says

    October 5, 2016 at 2:51 am

    I started this route. 94,000,000 rows in a logging table.. The data file is 95 GB in size.. I’m nearing capacity on a 128GB drive. Your method is better than a single delete command, but will still take forever to run. I found it more effective to simply drop and re-create the logging table and then shrink the database, filesize went from 95GB to 56 MB!

    Reply
  11. Duncan says

    April 11, 2017 at 10:36 pm

    Hi Andy,
    I learn a lot from this site, keep writing:-)

    Reply
  12. Imran Khan says

    November 2, 2017 at 12:44 pm

    Hi Andy,

    Thanks for your great article. Its help me to delete 30 lacs records in a one loop query.
    Well done and lots of tons thanks.

    Reply
  13. Michal says

    June 11, 2018 at 5:19 am

    Hi,

    In case its not splitted by begin tran and commit tran, its still one transaction (with more steps)
    in case there will be any “order by” clause, different with cluster index, can be a huge problem

    isnt beter to use truncate table instead ?

    Reply
    • Andy Hayes says

      June 22, 2018 at 3:54 pm

      Hi Michal

      Thanks for your question. TRUNCATE TABLE is great when you have permission to use it, if you want to clear the whole table (not a subset of rows) and when the table is not referenced by foreign key constraints.

      Reply
  14. zaib tabs says

    December 14, 2018 at 3:39 pm

    Hi,

    Its quite interest, but in my case i have to display records in report which are around 50,000. is there a way i can use a loop or for loop or cte to reduce the load on my cpu.

    currently my memory and cpu is maxing out, when user fetch the data

    select * from items.

    can i use

    DoItAgain:

    Select * from items TOP (1000)

    FROM ExampleTable

    IF @@ROWCOUNT > 1000

    GOTO DoItAgain

    Reply
  15. John says

    February 13, 2019 at 9:29 pm

    Thanks for the article.

    If you’ve got a script where you’re doing several of this, you can drop the begin/end for a single-statement WHILE:

    WHILE @@ROWCOUNT > 0
    BEGIN
    DELETE TOP (1000)
    FROM LargeTable
    END

    becomes

    WHILE @@ROWCOUNT > 0 DELETE TOP (1000) FROM LargeTable

    I hope my next database restore to development environment goes much faster when I’m purging huge tables.

    Reply
  16. Pavan says

    January 17, 2020 at 3:06 am

    Hi Andy,

    Just to know, can we by changing the recovery model to bulk logged and peform the delete operation in single transaction during a maintenance time.

    Reply

Trackbacks

  1. Deleting millions of rows | My Favorite SQL says:
    April 9, 2013 at 6:25 am

    […] More info […]

    Reply

