How to Produce CSV Format Using T-SQL

There may be some requirements where you need to produce a csv output from a table. There are some good, efficient ways to do this and there are also some less efficient ways. Let us now look at one optimized approach and a couple of lesser optimized approaches 😉

SQL performs best working with sets

This is true and should always be the first thought when designing a query.

So instead of querying the database row by row by firing multiple select statements, using loops or cursors, always try and think how can you achieve the operation in a set based way.

To help demonstrate this thought process, lets take this example table and its data

T-SQL:

SET NOCOUNT ON;
DECLARE @Guid UNIQUEIDENTIFIER = NEWID()

CREATE TABLE CSVOutputDemo
(
ID INT PRIMARY KEY IDENTITY(1,1),
ReferenceId UNIQUEIDENTIFIER NOT NULL,
ImageName VARCHAR(500)
)

--add some data

INSERT INTO CSVOutputDemo(ReferenceId,ImageName)
VALUES
(@Guid, 'image1.jpg'),
(@Guid, 'image2.jpg'),
(@Guid, 'image3.jpg')

SELECT * FROM CSVOutputDemo
ID          ReferenceId                          ImageName
----------- ------------------------------------ --------------------------------------------------
1           6083FFAF-8C29-489A-9486-07F2BABDC264 image1.jpg
2           6083FFAF-8C29-489A-9486-07F2BABDC264 image2.jpg
3           6083FFAF-8C29-489A-9486-07F2BABDC264 image3.jpg

The desired output is that we want all the values in the ImageURL field to be returned from the table as a comma separated value string, aka CSV output

image1,jpg,image2.jpg,image3.jpg

How to take three strings from three rows and turn them into one string on a single row?

You can take advantage of FOR XML which is used in conjunction with the SELECT statement. It’s really neat 🙂

By placing FOR XML PATH(”) at the end of the query, you can see that already, the rows returned have reduced from 3 to 1

SELECT ImageName FROM CSVOutputDemo FOR XML PATH('')

Returns…

XML_F52E2B61-18A1-11d1-B105-00805F49916B
<ImageName>image1.jpg</ImageName<ImageName>image2.jpg</ImageName><ImageName>image3.jpg</ImageName>

However, the output isn’t quite as we want it just yet so we have to make some adjustments to add the comma’s

SELECT ',' + ImageName FROM CSVOutputDemo FOR XML PATH('')

XML_F52E2B61-18A1-11d1-B105-00805F49916B
----------------------------------------
,image1.jpg,image2.jpg,image3.jpg

We now have the CSV but there is a little bit of tidying up to do on the result to remove the comma at the start of the string and for this we will use STUFF to add an empty string in place of the first instance of the comma.

SELECT STUFF ((SELECT ',' + ImageName FROM CSVOutputDemo FOR XML PATH('')), 1, 1, '') AS ImageCSV

ImageCSV
---------------------------------
image1.jpg,image2.jpg,image3.jpg

FOR XML PATH(”) is an elegant solution but what about the alternative row by row approach?

This example uses a WHILE loop

DECLARE @ID INT
DECLARE @ImageName VARCHAR(50)
DECLARE @ImageCSV VARCHAR(500) = ''

SET @ID = (SELECT MIN(ID) FROM CSVOutputDemo)
WHILE @ID IS NOT NULL
BEGIN
  SET @ImageName = (SELECT ImageName FROM CSVOutputDemo WHERE ID = @ID)
  SET @ImageCSV = @ImageCSV + ',' + @ImageName
  SET @ID = (SELECT MIN(ID) FROM CSVOutputDemo WHERE ID > @ID)
END

SELECT STUFF(@ImageCSV, 1, 1, '') AS ImageCSV

As you can see, this is a lot more code for the same output. There is an alternate row by row solution which uses a T-SQL cursor.

DECLARE @ID INT
DECLARE @ImageName VARCHAR(50)
DECLARE @ImageCSV VARCHAR(500) = ''

DECLARE CursorImage CURSOR FOR
SELECT ImageName FROM CSVOutputDemo
OPEN CursorImage
FETCH NEXT FROM CursorImage INTO @ImageName
WHILE @@FETCH_STATUS <> -1
BEGIN
	SET @ImageCSV = @ImageCSV + ',' + @ImageName
	FETCH NEXT FROM CursorImage INTO @ImageName
END
CLOSE CursorImage
DEALLOCATE CursorImage

SELECT STUFF(@ImageCSV, 1, 1, '') AS ImageCSV

The performance comparison is best seen when STATISTICS IO is enabled.

Here are the results:

FOR XML PATH(”)

Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

T-SQL WHILE LOOP

Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

T-SQL CURSOR

Table 'Worktable'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'CSVOutputDemo'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The set based operation involving the FOR XML PATH(”) solution makes one trip to the database however the other two solutions involve many more requests which will have a negative effect on performance.

It's only fair to share...Share on FacebookShare on Google+Tweet about this on TwitterShare on LinkedInPin on PinterestBuffer this pageShare on RedditShare on StumbleUpon
About Andy Hayes

Andy Hayes is a DBA working with SQL Server since version 7.0. He has a wonderful wife and two beautiful children. He loves database technology, playing cricket, and blogging. He is passionate about sharing his experiences as a DBA and learning more to further his understanding and knowledge. You can follow me on Twitter, check out my Facebook page or follow me on

Speak Your Mind

*


*