February 25, 2009

XML Shredding

Hi
Today I've started working on a new project, which has a requirement to store a list of 300k values in a table (having each value stored in a different record).
Facing this requirement I had two possible solutions in mind, either send an XML value or a CSV string, and since I'm not that familiar with SQL Server's XML abilities I decided to conduct a benchmark.
In order to simulate an input xml of a 300k list, I have created a table with random numbers and used a query in order to generate the XML variable.

create table random_numbers (random_number int)
declare @a xml
select @a = (select random_number as a from random_numbers for xml auto)
It took SQL Server about 2 seconds to generate the xml variable.
The next step was to shred it back into a table.
My first attempt was
select t.a.value('(/random_numbers[1]/@a)','int') id
from @a.nodes('/random_numbers') t(a)
but when I ran this query, all the rows had the value from the first row in the xml
so i modified it to this query
select t.a.query('.').value('(/random_numbers/@a)[1]','int') id
from @a.nodes('/random_numbers') t(a)
this gave me the required results, but it took forever
1000 records - 2 seconds
2000 records - 7 seconds
3000 records - 18 seconds
4000 records - 34 seconds
5000 records - 47 seconds
6000 records - 67 seconds
7000 records - 97 seconds
8000 records - 123 seconds
9000 records - 187 seconds
finally I changed the query to
select t.a.value('@a','int') id
from @a.nodes('/random_numbers') t(a)
This query worked amazingly well. It took SQL Server only 5 seconds to shred the 300k rows xml.
Hope this post helped you.

1 comment:

  1. hi man,
    isn't xml shredding about storing XML (LOB ) Data in a relational DB(/table) , by shredding the xml to table columns.
    i don't get how you can achieve this using a select statment...?

    please xplain.
    yellowg.

    ReplyDelete