Saturday, June 11, 2011

TVPs Vs XML - An Analysis Through .NET

A common requirement of .NET applications that connect to databases like SQL Server is passing Lists and Arrays to the database server. Since most of these databases are relational, transferring these Lists of Objects isn't really straight-forward.

One way of passing these objects would be to send individual properties of these Objects to the database server, one object at a time. Obviously this is highly inefficient. Understanding this, various database management systems have provided various approaches to achieve this.

There are two ways of passing such content to SQL Server -

  1. Generating an XML from the List and passing it as a parameter
  2. Creating a Data Table from the List and passing it as a Table Valued Parameter

Passing Lists and Arrays as XML to SQL Server has been a common approach since SQL Server 2000. From SQL Server 2008, Microsoft introduced a new way of passing Lists and Arrays - Table Valued Parameters.

In this post, I will be evaluating the performance statistics of XML and TVP though a .NET application. I will be doing this through the RetrieveStatistics method of the SqlConnection class. This method retrieves the statistics of an operation through an IDictionary which contains various performance metrics.

This happens only when the StatisticsEnabled property is set to true before executing the query/stored procedure. These metrics can be reset using the ResetStatistics method of the SqlConnection class. Here's a typical usage of this method -

The dictionary contains several provider statistics. There are 18 values that can be obtained from the Microsoft SQL Server provider. A detailed analysis of the metrics that can be obtained from the dictionary is available here - http://msdn.microsoft.com/en-us/library/7h2ahss8(v=vs.80).aspx.

The .NET application that is being used in this post to compare the performance of TVP and XML captures these 18 metrics into an object. However the final analysis focuses on three of these properties - Execution Time, Bytes Sent and Bytes Received.

The application primarily inserts records into a table. The number of records and the value of the records are decided by the application. To understand the effect of load, the numbers of records are raised from 25 to 2500 in gradual steps of 25. For an accurate measurement each step is repeated for 25 times and an average of the metrics obtained is considered.

Here are the graphs obtained for Execution Time, Bytes Sent and Bytes Received -

Execution Time (Records on X-Axis and Time on Y-Axis)

Bytes Sent (Records on X-Axis and Bytes on Y-Axis)

Note - The Bytes Sent for OpenXML and Nodes are almost the same.

Sent Received (Records on X-Axis and Bytes on Y-Axis)

Note - The Bytes Received for TVP and Nodes are the same.

The observations that I could conclude are as follows -

  • The execution time for TVP is less than XML (Using OpenXML took more time than Nodes). The execution time of OpenXML was far higher than TVP
  • The bytes sent for XML was higher than for TVP. This is probably due to the XML tags that had to be added for the transfer
  • The bytes received in all the three cases were constant irrespective to the bytes sent. Though the number of bytes received for TVP and Nodes was the same, the number of bytes for OpenXML was slightly higher. This might be due to the procedure calls of sp_xml_preparedocument and sp_xml_removedocument

On a whole, TVPs look more promising than XML. However this was a very preliminary test and the actual results might vary in live environments.

The SQL scripts and the Visual Studio project used in this post can be found here and the raw numbers are here.