SQL Server covering index and key lookup performance

covering index performance increaseIn this post, I wanted to write about the covering index and key lookup and how adding covering indexes can help increase performance by reducing lookups to the table data.

It’s helpful to know what each of these terms mean and then we will look at an example scenario further down the post.

What is a covering index?

A covering index is a non-clustered index which includes all columns referenced in the query and therefore, the optimizer does not have to perform an additional lookup to the table in order to retrieve the data requested. As the data requested is all indexed by the covering index, it is a faster operation.

To be considered as a covering index, all columns referenced by the query must be contained in the index. So this means all columns in the SELECT, JOIN, WHERE, GROUP BY, HAVING etc etc.

What is a key lookup?

A key lookup is an extra read which the query has to make in the absence of a suitable covering index.

In the query, the number of columns involved in the select statement may exceed the number of columns in the non-clustered index and when this happens, for each row in the result set, a key lookup is performed. So 500 rows in the result set equates to 500 extra key lookups.

Key lookup medicine – AKA the covering index!

Lets look at a simple query to help explain what I was talking about above. Imagine that dbo.MyTable consists of just two columns “Col1” and “Col2”.

SELECT Col1 FROM dbo.MyTable WHERE Col2 = 1;

For this query to be considered covered by an index, columns “Col1” and “Col2” need to be part of the same index and you could create different variants of indexes which would act as a covering index.

For example, you could do:

CREATE NONCLUSTERED INDEX IX_MyCoveringIndex
ON dbo.MyTable (Col1, Col2);

CREATE NONCLUSTERED INDEX IX_MyCoveringIndex
ON dbo.MyTable (Col2, Col1);

You could also use included columns:

CREATE NONCLUSTERED INDEX IX_MyCoveringIndex
ON dbo.MyTable (Col1) INCLUDE (Col2);

CREATE NONCLUSTERED INDEX IX_MyCoveringIndex
ON dbo.MyTable (Col2) INCLUDE (Col1);

So the columns need only be found in the index and the column order or whether they are an “included column” or not does not matter. It is important to remember that the execution plan and performance may vary considerably depending on which index is applied however, so choosing the correct covering index is important.

Lets look at how the covering index removes the need for a key lookup.

I am going to use this simple query against a copy of the AdventureWorks2012 database using SQL Server 2012 Express.

I’ll enable the execution plan output and enable STATISTICS IO output.

SET STATISTICS IO ON;

SELECT CustomerID, OrderDate
FROM [AdventureWorks2012].[Sales].[SalesOrderHeader]
WHERE CustomerID = 11000;

There is an non-clustered index on the CustomerID column. The clustered index on this table is on a column not referenced by the query called SalesOrderID.

The output looks like this:

STATISICS IO output….

Table 'SalesOrderHeader'. Scan count 1, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now lets look at the execution plan

covering index key lookup

You can see that 68% of the total cost of this query has been allocated to the key lookup against the clustered index.

Let’s see what happens to it after we modify the index to cover the columns in the query.

DROP INDEX Sales.SalesOrderHeader.IX_SalesOrderHeader_CustomerID;

CREATE NONCLUSTERED INDEX IX_SalesOrderHeader_CustomerID_OrderDate
ON Sales.SalesOrderHeader(CustomerID, OrderDate);

covering index key lookup

The key lookup is gone (well you were expecting that right 🙂 ) and now the query is retrieving the data entirely from the covering index.

Lets look at STATISTICS IO again

Table 'SalesOrderHeader'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

A big reduction can be seen in the number of logical reads compared to the first query. Logical reads are the pages read from the data cache regardless of whether the pages had to be read from disk or not. In both cases, they did not need to be read from disk (physical reads = 0), both queries read their pages direct from the data cache. As the second one read less pages from the data cache, that is still a good indicator that the covering index has improved performance.

If I ran DBCC DROPCLEANBUFFERS (do not do this on a production server as you could see massive performance reduction until the cache is re-populated) before each query and index change in order to force the optimizer to read from disk, then STATISTICS IO would look like:

--BEFORE covering index
Table 'SalesOrderHeader'. Scan count 1, logical reads 11, physical reads 7, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--AFTER covering index
Table 'SalesOrderHeader'. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

You can see the differences in the number of pages read from disk and this helps to show the benefit of having the covering index in place. Once the  data pages are read from disk, they enter the data cache. All queries get their results by reading the data cache.

The second query reads less data pages into the data cache and subsequently, there is more room in the data cache. More room means more pages can be read into cache and the frequency for disk reads becomes less with performance increasing.

Should you apply a covering index in all cases?

Indexing is a complex subject and there is no definitive “one size fits all” solution to indexing a database. You have to know what is possible and then assess your needs. You should always test your index changes properly before putting them into production.

An analysis of server workload will help you figure this out.  A few things to consider:

  1. Is existing query performance acceptable?
  2. How often is the query running?
  3. How selective is the query?
  4. Is the table you are adding the index to write intensive?

Remember, a query may perform well and return results in good time. What if it is running many hundreds or thousands of times a minute? Cumulatively without a covering index, it could be generating many more extra IO’s than necessary.

You need to assess whether the frequency and selectivity of the query warrants adding a covering index. Covering indexes have the greatest benefit to non-selective queries.

In addition, if your system is write intensive, adding an index could have the opposite affect of what you are trying to do by slowing down overall database performance. This is because for each update to the data, the indexes have to be updated as well.

For more information on SQL Server index creation, you can visit this link which talks about the covering index and included columns.

 

 

It's only fair to share...Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedInPin on PinterestBuffer this pageShare on RedditShare on StumbleUpon
About Andy Hayes

Andy Hayes is a DBA working with SQL Server since version 7.0. He has a wonderful wife and two beautiful children. He loves database technology, playing cricket, and blogging. He is passionate about sharing his experiences as a DBA and learning more to further his understanding and knowledge. You can follow me on Twitter, check out my Facebook page or follow me on

Comments

  1. Anandan Kanagarajan says:

    Excellent post and simple and clear definitions about Key Lookup & Covering Index. Thanks dude and please continue to blog…

  2. Anandan Kanagarajan says:

    Andy Hayes,
    It will be great if you continue this topic with Filtered Index (from 2008 on wards) and Column store index (from 2012 on wards)
    Also shed some light on RID Look up.

  3. Excellent Post. I’m Brasilian DBA and I didn’t find any good explanation about this, like i found here. Thank you for this blog. Please, continue the blog.

  4. sunil kumar anna says:

    Thanks Andy.. it’s is useful

Speak Your Mind

*


*