Processing XML Files With SQL Server Functions: Data Transformation Services OPENROWSET Function
Processing XML Files With SQL Server Functions: Data Transformation Services OPENROWSET Function
Processing XML Files With SQL Server Functions: Data Transformation Services OPENROWSET Function
Since the release of SQL Server 2000, processing XML data has been a topic of discussion among database
administrators. It is often used when a front-end application passes an XML document to a stored procedure within an
input parameter. Sometimes, however, you encounter a folder containing a set of XML files that needs to be loaded into
the database and then processed into SQL Server tables. This is more difficult, and documentation on how to do so is
sparse.
You can use several techniques, most of which are quite complex. I have used SQL Server Integration Services (SSIS),
Data Transformation Services (DTS) and a self-written Windows application to read and load files. My favorite
technique, however, is to use the OPENROWSET function. This function enables a good deal of flexibility, because you
can control the entire process from each end within your T-SQL stored procedure.
The first step of the process is to create a table with a single column that uses the XML data type. A temporary table also
works:
With the OPENROWSET function, you then load the data into a single row of the table:
You can now place the value from the Data column of the #WorkingTable table into a variable, call on the OPENXML
function to parse the XML document and use it as necessary. In this case, the XML document is very basic:
The OPENXML code necessary to read this data from the temporary table should look something like this:
The resulting output from this query will look something like this:
This is the basic syntax of all your OPENXML statements. From here, you can add greater complexity to /root/data,
@parm1 and the other XPath expressions. This allows you to process data from more complex XML documents. Here is
an example of a more complex XML document, with invoice data:
As you can see, we have a three-level document with needed data in the Customer ID, Order ID and Item ID nodes. In
this sample code, you can see that we start with the Item ID node and work our way back to the Customer ID and Order
ID values:
SELECT *
FROM OPENXML(@hDoc, '/root/Customer/Order/Item')
WITH (CustomerId INT '../../@ID',
OrderId INT '../@ID',
ItemId INT '@ID',
Qty INT '@Qty')
If we were to take this same XML document and add an order for a second customer, we would see an output like this:
As I said, we started at the Item node of the XML document. If we started at the Customer node of the document and
worked our way down the document, our output would not be correct. We would instead get only the first item for each
Customer ID in the document.
SELECT *
FROM OPENXML(@hDoc, '/root/Customer')
WITH (CustomerId INT '@ID',
OrderId INT 'Order/@ID',
ItemId INT 'Order/Item/@ID',
Qty INT 'Order/Item/@Qty')
In this article, we have discussed basic examples involving the use of XPath. For more comprehensive examples, you
can refer to Microsoft's XPath syntax options to leverage the power of the XML engine in T-SQL statements.