Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server / SQL-Server-2008

How to export data from database tables to an XML file using SSIS

4.55/5 (8 votes)
18 Aug 2013CPOL6 min read 201.3K  
A simple way to create an XML file from SQL Server tables using SSIS

Introduction

I had a task to export data from tables in a database on SQL Server to an XML file. Here is a simple way to do it using SSIS. No Script components!

Background

No doubt, SSIS is the most natural tool to do such a task. One of the core purposes of SSIS is to transfer data from a database to a file. Usually in SSIS it's done by a "Data Flow Task" building block. This block consists of two sub-blocks: one for a database (source) and another one for a file (destination). The first sub-block is picked from the "Data Flow Sources" in the Toolbox. The second sub-block you might expect to find in the "Data Flow Destinations" section of the Toolbox. But... there is no "XML file" in this section! Thus unfortunately SSIS does not have a "buit-in" way to export data from database tables to a XML file. It still can be done using a "Flat File Destination" as a destination sub-block to create a regular text file. Then some additional manipulation is needed to format the text inside the file as XML. After all XML is nothing more than a formatted text. 

Most of the articles that I found on the internet suggest to use a "Scripting Task" block and to write code that make a use of .NET libraries to build XML output. It's a flexible and powerful way though in my opinion there is no need in such as there is no need to use a gun to kill a mosquito. You can perform such a simple task with you "built-in" bare hands. 

The SSIS presented here uses a standard "Data Flow Task" building block without any .NET code. It uses standard T-SQL (SQL for SQL Server). A nested SELECT command with FOR XML clause is used to represent the hierarchy of tables as XML. It is also used to add custom opening and closing XML tags.

Data Example

The source data is organized in a hierarchy of two database tables: Employees who serve Customers with relations "many-to-many" stored in the third table called EmployeesCustomers. This table describes which employee serves which customer. Usually one employee serves several customers. Some customers can be served by more than one Employee. SSIS exports this data to a single XML file which looks like this:

XML
<Company>
  <Employees>
    <Employee Id="..." Name="...">
      <Customers>
        <Customer Id="..." Name="...">
        ...
      </Customers>
    </Employee>
    ...
  </Employees>
</Company>

Note that customers who are served by more than one employee appear more than once in the XML file each time under a different employee. 

Now let's build the example data. Run these SQL commands in your SQL Server Management Studio to build the example tables. (I used the Bible characters as employees and contemporary celebrities as customers to make it amusing. This is just an example to illustrate how it works. Of course you can use your own data and your own hierarchy):

SQL
CREATE TABLE [dbo].[Employees](
        [EmployeeID] [bigint] NOT NULL,
        [FirstName] [varchar](20) NULL,
        [Title] [varchar](50) NULL,
        [BirthDate] [datetime] NULL,
        [Salary] [money] NULL,
CONSTRAINT [PK_Employees] PRIMARY KEY CLUSTERED
(
        [EmployeeID] ASC
))
 
CREATE TABLE [dbo].[Customers](
        [CustomerID] [int] NOT NULL,
        [CustomerName] [varchar](50) NULL,
CONSTRAINT [PK_Customers] PRIMARY KEY CLUSTERED
(
        [CustomerID] ASC
))
 
CREATE TABLE [dbo].[EmployeesCustomers](
        [EmployeeID] [int] NOT NULL,
        [CustomerID] [int] NOT NULL,
CONSTRAINT [PK_EmployeesCustomers] PRIMARY KEY CLUSTERED
(
        [EmployeeID] ASC,
        [CustomerID] ASC
))
 
INSERT INTO Employees VALUES (1,'Abraham','Team Leader'   ,'1947-11-25', 100.00)
INSERT INTO Employees VALUES (2,'Sarah'  ,'HR Manager'    ,'1967-04-20', 100.00)
INSERT INTO Employees VALUES (3,'Hagar'  ,'Office Cleaner','1978-08-03', 10.00)
INSERT INTO Employees VALUES (4,'Ishmael','Mail Deliverer','1989-12-01', 20.00)
INSERT INTO Employees VALUES (5,'Isaac'  ,'Developer'     ,'1995-01-01', 50.00)
INSERT INTO Employees VALUES (6,'Rebekah','QA Person'     ,'1995-05-13', 50.00)
 
INSERT INTO Customers VALUES (1,'Serena Williams')
INSERT INTO Customers VALUES (2,'Pat Robertson')
INSERT INTO Customers VALUES (3,'Paris Hilton')
INSERT INTO Customers VALUES (4,'Barak Obama')
INSERT INTO Customers VALUES (5,'Arnold Schwarzenegger')
INSERT INTO Customers VALUES (6,'Oprah Winfrey')
INSERT INTO Customers VALUES (7,'Prince El Saudi')
  
INSERT INTO EmployeesCustomers VALUES (1,5)
INSERT INTO EmployeesCustomers VALUES (1,6)
INSERT INTO EmployeesCustomers VALUES (1,7)
INSERT INTO EmployeesCustomers VALUES (2,1)
INSERT INTO EmployeesCustomers VALUES (2,2)
INSERT INTO EmployeesCustomers VALUES (3,3)
INSERT INTO EmployeesCustomers VALUES (3,4)

Using SSIS

Open a new SSIS project. Right click in the empty space in "Connection Managers" window which is usually at the bottom of the screen. We are going to add two connection managers: one for a resulting XML file and another for a SQL Server command as a data source:

Image 1

Select from the menu "New Flat File Connection…". In General section fill the fields "Connection manager name" as FileOut and "File name" with the name of your resulting XML file. Tick "Unicode" to avoid complications with code pages. The field "Locale" is selected automatically depending on your local language and your country. Mine is Hebrew (Israel). Leave the rest of the fields to their default values:

Image 2

Click on "Advanced" under "Description" on the left side. Click on the arrows down inside the "New" button and select "Add column" from the menu:

Image 3

A new column is now added. Fill the field "Name" as COL_XML and the field "DataType" as Unicode text stream [DT_NTEXT]. Click OK:

Image 4

Now to the data source. Right click on the "Connection Managers" and select from the menu "New OLE DB Connection…". Give it the name SQLIn. In the "Connection Manager" window fill the fields "Server name" and "Select or enter a database name". Click on "Test Connection" button to see that everything is fine with the database connection. Click OK.

Image 5

By now you must have two connection managers that look like this:

Image 6

Now we are going to add the one and only building block which is going to do the job. From the Toolbox select "Data Flow Task":

Image 7

Double click on it:

Image 8

From the Toolbox in the section "Data Flow Sources" select "OLE DB Source". Right click on it and select "Edit...". In the "OLE DB Source Editor" select SQLIn for "OLE DB connection manager" and select "SQL command" for "Data access mode". Now comes the most important part of the task: the nested SQL SELECT command with FOR XML clause: 

SQL
SELECT (
  SELECT TOP 1 NULL as N,
    (SELECT Employee.EmployeeId as Id, Employee.FirstName as Name,
      (SELECT TOP 1 NULL as N,
        (SELECT Customer.CustomerID as Id, CustomerName as Name
         FROM EmployeesCustomers INNER JOIN Customers Customer
         ON EmployeesCustomers.CustomerID = Customer.CustomerID
         WHERE EmployeesCustomers.EmployeeId = Employee.EmployeeId
         ORDER BY Customer.CustomerID
         FOR XML AUTO, TYPE)
       FROM Customers
       FOR XML AUTO, TYPE)
     FROM EmployeesCustomers INNER JOIN Employees Employee
     ON EmployeesCustomers.EmployeeID = Employee.EmployeeID
     GROUP BY Employee.EmployeeId, Employee.FirstName
     ORDER BY Employee.EmployeeId
     FOR XML AUTO, TYPE)
  FROM Employees
  FOR XML AUTO, ROOT('Company')
) AS COL_XML   

Basically this SQL statement is a variation of a standard join between the three tables: Customers and EmployeesCustomers and Employees  in order to get the Employee who takes care of certain customers as well as customers who are taken care by several employees in what is called "many-to-many" relationship. Note the two INNER JOIN clauses which connect between the three tables. If your task requires more simple relationship such as "one-to-many" then you will need just one INNER JOIN clause and the whole SQL will be much shorter.  After getting the relationships employees are grouped to get rid of multiple instances of the same employee. For this I used an ORDER BY clause. 

Copy and paste this text into the "SQL command text": 

Image 9

Click "Preview" button. You will see XML text inside a single column COL_XML: 

Image 10

Note that our SELECT statement is enclosed in SELECT (...) AS COL_XML to make the output to be text. Otherwise it will look like this:

Image 11

Click on Close and OK.

From the Toolbox in the section "Data Flow Destinations" select "Flat File Destination". Right click on it and select "Edit...". In the "Flat File Destination Editor" select FileOut for "Flat File connection manager":

Image 12

Click on "Columns" on the left side to see that it recognized the only column named COL_XML:

Image 13

Click OK. Connect the "OLE DB Source" and "Flat File Destination" blocks with a green arrow:

Image 14

If everything was fine the red error mark will disappear from the "Flat File Destination" block.

Now run your project. The end result will be the XML file as you specified earlier in the FileOut connection manager. Go find it and open it in your browser. It will look like this:

Image 15

We are done. It looks like a lot of steps but it actually takes a short time to do this.

Points of Interest

XML is formed by FOR XML clause. AUTO modifier uses table's and column's names as XML tags.

To enclose the whole XML document in the root tag ROOT('root_tag_name') modifier is used like this: 

SQL
FOR XML AUTO, ROOT('Company'

To enclose several child XML nodes in a parent node without XML attributes, for instance to enclose several <Employee> nodes in one <Employees> node I used this statement: 

SQL
SELECT TOP 1 NULL as N, (SELECT...) FROM TableName 

Of course it's not mandatory and could be omitted if your resulting XML does not need it.

To customize XML attributes' names I used an as clause, for example:

SQL
Employee.EmployeeId as Id, Employee.FirstName as Name

History

  • 15/08/2013 - First version submitted. 
  • 18/08/2013 - Revised to explain why this approach is helpful and to compare it to others. SELECT statement is explained in details. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)