Tips for Optimizing XML in SQL Server

Donabel Santos (twitter (@sqlbelle) | blog) – March 29, 2011

I’ve worked on a project that used XML heavily inside SQL Server. We really utilized SQL Server’s XML support almost to the full extent, but with some repercussions. As we did our load testing, performance did degrade and we had to step back and adjust how we were using our XML data.

If you are using XML in your database, you may want to consider these few tips to optimize your overall queries and performance related to your XML data.

1. Promote frequently used elements and attributes into relational columns

If you find yourself always pulling out scalar values from your XML column in order to join to another table, you should consider “promoting” this value into a column.

Pro: This value can now be indexed, therefore can boost performance of your queries.

Con: It’s an additional management overhead. If you need to change the value in the XML document, you will also need to change the value in your relational column. You can consider doing this in the application level (ie change the value in both places at once), or creating a persisted calculated column that uses a UDF that extracts the scalar value for you, or even triggers (be careful! make sure you test before you implement in production).

2. Add a schema to your XML columns

XML is really just a text document. This poses an overhead to SQL Server because everytime you do operations, SQL Server will need to “guess” which data types may be appropriate for your operation before it does an implicit conversion. You can eliminate this step by creating typed XML columns, or XML that is bound to a schema (XSD).

Pro: Processing your XML will be faster than if you’re using untyped XML. This eliminates the data type guess work from SQL Server.

Con: This can create a management overhead. Your XML now suddenly becomes not-so-flexible anymore. Everytime you need to add a new element or attribute or new nested fragment, you will need to ALTER the schema first before you can make changes. This change may need to cascade down to all your stored procedures that use this column.

3. Use XML Indexes

You must first create a Primary XML index, then create secondary index(es).

-- sample primary XML index
CREATE PRIMARY XML INDEX invoiceidx
ON [Sales].[salesxml](xmlcontent)
GO

There are 3 main secondary XML indexes:

-- PATH Secondary XML Index
-- useful if you use the path, and if you do not have wildcards
SELECT
   xmlcontent.value('(sales/order[@ord_num="6871"])[1]', 'varchar(20)') ,
   othercols
FROM
   [Sales].[salesxml]
WHERE
   xmlcontent.exist('(sales/order[@ord_num="6871"])') =1

-- PROPERTY Secondary XML Index
-- useful if searching for multiple properties, but may not have the full path

-- sample
SELECT
   othercols
FROM
   [Sales].[salesxml]
WHERE
   xmlcontent.exist('(//title_id)') = 1

-- VALUE  Secondary XML Index
-- useful if you know exact value looking for, but may not have full path

SELECT
   othercols
FROM
   [Sales].[salesxml]
WHERE
   xmlcontent.exist('/sales/order[@ord_num="6871"]/text()[. = "Something"]') = 1

SELECT
   othercols
FROM
   [Sales].[salesxml]
WHERE
   xmlcontent.exist('(//title_id/@*[. = "special"])') = 1

Pro: As with regular indexes, faster searches.

Con: As with regular indexes, occupies more storage, and needs more resources needed to reprocess your indexes.

4. Recreating XML on Mass Updates

Ever wanted to mass update an element or attribute property in your XML? Tempted to use SQL Server’s support for XQuery? Unfortunately in my case, for some of the updates I needed to do, XQuery just did not cut it. The .modify() method of your XML column or variable is limited to updating just that instance of XML. If you want to use XQuery .modify() to update more than one item, you will have to loop through all XML instances and invoke .modify() for each one.

A possible alternative, depending on how big your XML data is, is recreating the XML. If the data you need for XML is already in other existing columns, you can just do a query with a FOR XML PATH and include the new values. Use with caution. It worked for my purposes, your mileage may vary.

Pro: It could be faster. YMMV.

Con: It requires a little bit more coding, and re-assembling your XML.

References and Additional Readings:
MSDN. Performance Optimizations for the XML Data Type in SQL Server 2005.

MSDN Blogs. Performance Tips of Using XML Data in SQL Server.

Comments

Pingback from Tips for Optimizing XML in SQL Server | SQL Server Performance Blog » Web Coding Unravelled
Time March 29, 2011 at 6:47 pm

[...] Tips f&#959r Optimizing XML &#1110n SQL Server | SQL Server Performance Blog Related Posts:Pro SQL Server 2008 XML Pro SQL Server 2008 XML is your key to unlocking the powerful [...]

Pingback from HTML all you need to know» Blog Archive » Tips for Optimizing XML in SQL Server | SQL Server Performance Blog
Time March 29, 2011 at 6:59 pm

[...] here: Tips f&#959r Optimizing XML &#1110n SQL Ma?tre d’h?tel | SQL Ma?tre d’h?tel Performance … Related Posts:Pro SQL Server 2008 XMLSelect Data From XML in MS SQL Server (T-SQL)All about the SQL [...]

Pingback from HTML Scripts Tips and Secrets » Blog Archive » Tips for Optimizing XML in SQL Server | SQL Server Performance Blog
Time March 29, 2011 at 7:11 pm

[...] See th&#1077 first post: Tips f&#959r Optimizing XML &#1110n SQL Ma?tre d’h?tel | SQL Ma?tre d’h?tel Performance … [...]

Write a comment

You need to login to post comments!