Introduction
A while ago, I was given a task to generate an XML file from database using stored procedure and was given a stored procedure as reference. The reference stored procedure fetched data into a cursor and then processed each record and created XML elements by appending string literals and then returned the generated string "XML". After looking at that stored procedure, I thought to myself there must be a better way to do this and there was. Using FOR XML
clause, one can return table records as XML data. After learning about it, I decided to write an article about it.
Article Series Roadmap
This is article 3 of 4 part series. Other articles are listed below:
Content
Background
Basic understanding of SQL Joins and SET operators is required to follow along.
Using the Code
The article below will use the following database. You can copy/paste the following code to follow along or create your own database and tweak the queries.
Create database as shown below:
CREATE DATABASE FOR_XML_TUTORIAL;
Execute the below statements to create CUSTOMER
and ORDER
tables and populate it with data.
USE [FOR_XML_TUTORIAL];
CREATE TABLE [CUSTOMER]
(
CUSTOMER_ID INT PRIMARY KEY NOT NULL,
FIRST_NAME VARCHAR(25) NOT NULL,
LAST_NAME VARCHAR(25) NOT NULL,
POSTAL_CODE VARCHAR(2) NOT NULL,
);
CREATE TABLE [ORDER]
(
ORDER_ID INT PRIMARY KEY NOT NULL,
CUSTOMER_ID INT NOT NULL REFERENCES CUSTOMER(CUSTOMER_ID),
TOTAL_ITEMS INT NOT NULL,
TOTAL_AMOUNT NUMERIC(18,2) NOT NULL
);
INSERT INTO CUSTOMER VALUES (1, 'John', 'Michaels', 'TX');
INSERT INTO CUSTOMER VALUES (2, 'Shawn', 'Cena', 'MA');
INSERT INTO CUSTOMER VALUES (3, 'Dwayne', 'Austin', 'TX');
INSERT INTO CUSTOMER VALUES (4, 'Steve', 'Johnson', 'FL');
INSERT INTO [ORDER] VALUES (1, 1, 5, 32.50);
INSERT INTO [ORDER] VALUES (2, 1, 2, 21.36);
INSERT INTO [ORDER] VALUES (3, 2, 7, 59.00);
INSERT INTO [ORDER] VALUES (4, 3, 2, 18.24);
INSERT INTO [ORDER] VALUES (5, 4, 3, 30.00);
INSERT INTO [ORDER] VALUES (6, 4, 6, 66.00);
FOR XML Modes
When using FOR XML
clause, a mode must be specified which returns XML accordingly. Following is a list of available modes:
EXPLICIT Mode
In RAW and AUTO mode we have little control on XML creation. SQL Server provides us with two other modes using which we can have more control in our XML creation, one of which, the EXPLICIT mode we are going to learn in this tutorial and the other one, the PATH mode in next tutorial.
Using EXPLICIT mode we can control XML creation such that we can define hierarchies and define whether the values be visible as attributes or seperate elements and provides us with many more options. But with more control comes more complexity. To use EXPLICIT mode we have to create SQL selects such that it represents our XML hierarchies. In addition to that we have to create a seperate select statement for each level in heirarchy and use UNION ALL to join record sets. The best way to understand EXPLICIT mode is by an example so lets see one.
We will be using above mentioned tables to output following XML.
<Customer CustomerID="1">
<FirstName>John</FirstName>
<LastName>Michaels</LastName>
<PostalCode>TX</PostalCode>
<Order OrderID="1">
<Items>5</Items>
<Amount>32.50</Amount>
</Order>
<Order OrderID="2">
<Items>2</Items>
<Amount>21.36</Amount>
</Order>
</Customer>
<Customer CustomerID="2">
<FirstName>Shawn</FirstName>
<LastName>Cena</LastName>
<PostalCode>MA</PostalCode>
<Order OrderID="3">
<Items>7</Items>
<Amount>59.00</Amount>
</Order>
</Customer>
<Customer CustomerID="3">
<FirstName>Dwayne</FirstName>
<LastName>Austin</LastName>
<PostalCode>TX</PostalCode>
<Order OrderID="4">
<Items>2</Items>
<Amount>18.24</Amount>
</Order>
</Customer>
<Customer CustomerID="4">
<FirstName>Steve</FirstName>
<LastName>Johnson</LastName>
<PostalCode>FL</PostalCode>
<Order OrderID="5">
<Items>3</Items>
<Amount>30.00</Amount>
</Order>
<Order OrderID="6">
<Items>6</Items>
<Amount>66.00</Amount>
</Order>
</Customer>
I find it easy to write select statements to return the record set that somewhat resembles final XML data and then when I am satisfied with the result set, I convert it to XML. We'll tackle this problem the same way.
In above XML, the first level in hierarchy contains customer information. As mentioned before each level in hierarchy will be mapped to a select statement so lets create one for customer information.
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
NULL AS ORDER_ID,
NULL AS TOTAL_ITEMS,
NULL AS TOTAL_AMOUNT
FROM [CUSTOMER] C
Above query is a simple select statement with nothing complex except 3 columns representing ORDER_ID, TOTAL_ITEMS and TOTAL_AMOUNT are selected as null. I will explain why these columns are included shortly.
The record set returned is
Next we will create select statement for next level in hierarchy which is customer order information. Below is the query.
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
Above query returns
Now we will join both record sets using UNION ALL clause. Now you will see the reason why included 3 null columns in first query because to perform set operations on record sets, the number of columns must be equal.
Following is the query.
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
NULL AS ORDER_ID,
NULL AS TOTAL_ITEMS,
NULL AS TOTAL_AMOUNT
FROM [CUSTOMER] C
UNION ALL
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
Above query returns
Now lets add an order by clause to ensure our record set represents the XML structure. Order the records by CUSTOMER_ID and then ORDER_ID so that the records with NULL ORDER_ID comes first.
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
NULL AS ORDER_ID,
NULL AS TOTAL_ITEMS,
NULL AS TOTAL_AMOUNT
FROM [CUSTOMER] C
UNION ALL
SELECT C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
ORDER BY CUSTOMER_ID, ORDER_ID
Above query returns
Now this record set represents our final XML data. Lets convert this into XML using EXPLICIT mode.
To convert this record set into XML we have to include 2 new columns in the first two positions
TAG: The TAG number represents the level of hierarchy or depth.
PARENT: The TAG number of current level's parent in hierarchy.
SELECT 1 AS TAG,
NULL AS PARENT,
C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
NULL AS ORDER_ID,
NULL AS TOTAL_ITEMS,
NULL AS TOTAL_AMOUNT
FROM [CUSTOMER] C
UNION ALL
SELECT 2 AS TAG,
1 AS PARENT,
C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
ORDER BY CUSTOMER_ID, ORDER_ID
As you can see in first select statement we select TAG as 1 and PARENT as NULL since it is the first element in hierarchy and it has no parent. Similarly, in second select statement we select TAG as 2 and PARENT as 1 since it is at second level in hierarchy and its parent is the element with TAG equals to 1 in hierarcy.
Additionaly we have to provide a very complex structured alias to each column in first select statement as following.
<ELEMENT>!<TAG>!<ATTRIBUTE>[!<DIRECTIVE>]
where,
<ELEMENT>: The name of element to which values will be assigned.
<TAG>: The tag number representing the level in hierarchy or depth.
<ATTRIBUTE>: The name of attribute to which a particular columns's value will be assigned.
<DIRECTIVE>: This is optional and used to provide additional information for XML creation. We will look at one of its option "ELEMENT".
Coming to back to our example now our select query looks like
SELECT 1 AS TAG,
NULL AS PARENT,
C.CUSTOMER_ID AS [Customer!1!CustomerID],
C.FIRST_NAME AS [Customer!1!FirstName!ELEMENT],
C.LAST_NAME AS [Customer!1!LastName!ELEMENT],
C.POSTAL_CODE AS [Customer!1!PostalCode!ELEMENT],
NULL AS [Order!2!OrderID],
NULL AS [Order!2!Items!ELEMENT],
NULL AS [Order!2!Amount!ELEMENT]
FROM [CUSTOMER] C
UNION ALL
SELECT 2 AS TAG,
1 AS PARENT,
C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
ORDER BY CUSTOMER_ID, ORDER_ID
Let's take a look at the first alias [Customer!1!CustomerID]
Here,
Customer denotes the element to which our value will be assigned.
1 denotes the hierarchy level or depth.
CustomerID denotes the attribute to which value will be assigned.
If optional directive ELEMENT is specified, a child element namely CustomerID will be created instead of attribute as in the case of C.FIRST_NAME.
Let's modify ORDER BY clause to include new aliases.
SELECT 1 AS TAG,
NULL AS PARENT,
C.CUSTOMER_ID AS [Customer!1!CustomerID],
C.FIRST_NAME AS [Customer!1!FirstName!ELEMENT],
C.LAST_NAME AS [Customer!1!LastName!ELEMENT],
C.POSTAL_CODE AS [Customer!1!PostalCode!ELEMENT],
NULL AS [Order!2!OrderID],
NULL AS [Order!2!Items!ELEMENT],
NULL AS [Order!2!Amount!ELEMENT]
FROM [CUSTOMER] C
UNION ALL
SELECT 2 AS TAG,
1 AS PARENT,
C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
ORDER BY [Customer!1!CustomerID], [Order!2!OrderID]
Lastly, append FOR XML EXPLICIT to above query.
SELECT 1 AS TAG,
NULL AS PARENT,
C.CUSTOMER_ID AS [Customer!1!CustomerID],
C.FIRST_NAME AS [Customer!1!FirstName!ELEMENT],
C.LAST_NAME AS [Customer!1!LastName!ELEMENT],
C.POSTAL_CODE AS [Customer!1!PostalCode!ELEMENT],
NULL AS [Order!2!OrderID],
NULL AS [Order!2!Items!ELEMENT],
NULL AS [Order!2!Amount!ELEMENT]
FROM [CUSTOMER] C
UNION ALL
SELECT 2 AS TAG,
1 AS PARENT,
C.CUSTOMER_ID,
C.FIRST_NAME,
C.LAST_NAME,
C.POSTAL_CODE,
O.ORDER_ID,
O.TOTAL_ITEMS,
O.TOTAL_AMOUNT
FROM [CUSTOMER] C
INNER JOIN [ORDER] O
ON C.CUSTOMER_ID = O.CUSTOMER_ID
ORDER BY [Customer!1!CustomerID], [Order!2!OrderID]
FOR XML EXPLICIT
Simple as that. Following is the result of above query.
<Customer CustomerID="1">
<FirstName>John</FirstName>
<LastName>Michaels</LastName>
<PostalCode>TX</PostalCode>
<Order OrderID="1">
<Items>5</Items>
<Amount>32.50</Amount>
</Order>
<Order OrderID="2">
<Items>2</Items>
<Amount>21.36</Amount>
</Order>
</Customer>
<Customer CustomerID="2">
<FirstName>Shawn</FirstName>
<LastName>Cena</LastName>
<PostalCode>MA</PostalCode>
<Order OrderID="3">
<Items>7</Items>
<Amount>59.00</Amount>
</Order>
</Customer>
<Customer CustomerID="3">
<FirstName>Dwayne</FirstName>
<LastName>Austin</LastName>
<PostalCode>TX</PostalCode>
<Order OrderID="4">
<Items>2</Items>
<Amount>18.24</Amount>
</Order>
</Customer>
<Customer CustomerID="4">
<FirstName>Steve</FirstName>
<LastName>Johnson</LastName>
<PostalCode>FL</PostalCode>
<Order OrderID="5">
<Items>3</Items>
<Amount>30.00</Amount>
</Order>
<Order OrderID="6">
<Items>6</Items>
<Amount>66.00</Amount>
</Order>
</Customer>
Wrapping Up
That is all for EXPLICIT mode in FOR XML
clause. To learn more about FOR XML you can visit MSDN SQL Server books online. In the next article, we will learn about PATH mode.