Introduction
In this tip we will see how to import a text file with foreign key relation and validation by using SSIS. All the error records will stored in a separate table.
Background
Before starting this tip, I recommend you to read the previous post: SSIS - Import Text File
Using the Tools and Code
- Database : SQL Server 2012 R2
- BI : SQL Server Data Tools 2010
- Create sample file
- Create database tables
- Create SSIS package
- Sample file analysis & expected output
- Final output
Let's Get Started
Part 1: Create Sample File
1. In this example we are going to use a student, details text file.
Student Name, Register No and Department Name are the columns.
Part 2: Create Database Tables
1. Open SQL Server Management Studio
2. Create a database "TestDB"
3. Create a table "SSIS_Department" with the following columns
- DepartmentId
- DepartmentName
Set DepartmentId as a primary key and set Identity Specification is "yes"
CREATE TABLE [dbo].[SSIS_Department](
[DepartmentId] [bigint] IDENTITY(1,1) NOT NULL,
[DepartmentName] [varchar](10) NOT NULL,
CONSTRAINT [PK_SSIS_Department] PRIMARY KEY CLUSTERED
(
[DepartmentId] ASC
)
) ON [PRIMARY]
4. Create a table "SSIS_Student" with the following columns
- StudentId
- StudentName
- RegisterNo
- DepartmentId
Set StudentId as a primary key and set Identity Specification is "yes"
Set SSIS_Deprtment table DepartmentId as a foreign key of SSIS_Student table DepartmentId
CREATE TABLE [dbo].[SSIS_Student](
[StudentId] [bigint] IDENTITY(1,1) NOT NULL,
[StudentName] [varchar](50) NOT NULL,
[RegisterNo] [varchar](5) NOT NULL,
[DepartmentId] [bigint] NOT NULL,
CONSTRAINT [PK_SSIS_Student] PRIMARY KEY CLUSTERED
(
[StudentId] ASC
)
) ON [PRIMARY]
ALTER TABLE [dbo].[SSIS_Student] WITH CHECK ADD CONSTRAINT [FK_SSIS_Student_SSIS_Department] FOREIGN KEY([DepartmentId])
REFERENCES [dbo].[SSIS_Department] ([DepartmentId])
ALTER TABLE [dbo].[SSIS_Student] CHECK CONSTRAINT [FK_SSIS_Student_SSIS_Department]
Part 3: Create SSIS Package
1. Start the SQL Server Data Tools (Microsoft Visual Studio 2010 Shell)
2. Go to File -> New -> Project
3. Select Business Intelligence, from the Installed Templates option and project as Integration Services Project and give the Project Name.
4. Drag two data flow task from tool box to Control Flow Tab. For easy understanding change the first data flow task as "Department" and second data flow task as "Student".
5. Create a Flat File Connection to read text file and OLE DB Connection to insert ext file data to database. Right click on Connection Manager Pane and select New Flat File Connection.
- Flat File Connection Manager General settings: Specify a connection manager name, specify the sample file location earlier we created and leave the remaining settings.
- Flat File Connection Manager Columns settings: Columns setting will display all the available columns in the sample file.
- Flat File Connection Manager Advanced settings: In the advanced settings, we can add validation. Here we are going to add a validation to RegisterNo column. Select RegisterNo column and set the OutputColumnWidth to 5.
- Preview is used to view the sample file. Then click Ok to create the connection manager.
6. Now let's create the OLE DB Connection manager. Right click on Connection Manager Pane, create a new OLE DB Connection.
- In the connection manager window specify the Server name.
- Since I am using my local server I have used ".", otherwise need to specify the server name.
- Select "Use SQL Server Authentication" option and provide user name and password.
- In "Select or enter a database name" select the "TestDB" database. Earlier we have created the tables in TestDB database.
- Verify the connection established successfully or not by Test Connection button.
- Finally click Ok button to create OLE DB Connection manager.
7. In this example we are going to load student data from text file. But in our database design SSIS_Student table using DepartmentId. It means all the distinct DepatmentName available in the text file, insert into SSIS_Department table, then using a lookup function to find the DepartmnetId of the DepartmentName to insert the student details.
8. Earlier we have included a validation, RegisterNo restricted to five characters. If a record's RegisterNo with more than five characters, it will load into the error data table.
9. Ok, let's finish the Department part.
- Right click Department data flow task and select Edit. It will go to Data Flow tab.
- Now place a Flat File Source from the tool box. Right click on Flat File Source and edit. It will show the Flat File Source Editor.
- In the connection manager setting: select the Flat file connection manager name in the dropdown.
- In the Columns setting: It will show all the available External Columns. Now we are going to deal with department name only. So select DepatrmentName
10. Now place a Sort Transformation from the SSIS tool box. Connect flat file source output to Sort transformation input.
- Right click on Sort transformation and edit. It will show the Sort Transformation Editor. Specify any sort type and sort order and select "Remove rows with duplicate sort values" checkbox.
11. Now place an OLE DB Destination to insert the department names.
- Right click on OLE DB destination and select Edit. It will show the OLE DB destination editor.
- In the connection manager setting: Select "OLE DB connection manager" name from the dropdown list, Select "Data access mode" to "Table or view - fast load", Select "Name of the table or the view" to "SSIS_Department".
- In the mapping setting: map the respective input column to destination column.
12. Now we will start student details part.
- Go to Control Flow tab, right click Student data flow task and select Edit. It will go to Student Data Flow tab.
- Now place a Flat File Source from the tool box. Right click on Flat File Source and edit. It will show the Flat File Source Editor.
- In the connection manager setting: select the Flat file connection manager name in the dropdown.
- In the Columns setting: It will show all the available External Columns. Select all the available external columns.
- In the Error Output setting : Set all the columns Error and Truncation to Redirect row
13. This time Flat File Source has two outputs. The Blue color indicates the success records and the Red color indicates the error records.
- Now place an OLE DB Destination to insert the error records. Chnage as "Error Destination". Connect the Red color output of the flat file source to input as the Error Destination.
- Right click the Error Destination and select edit. It will show the OLE DB destination editor.
- In the connection manager setting: Select OLE DB connection manager name from the dropdown list, Select data access mode to "Table or view - fast load". In the "Name of the table or the view" dropdown we need to select the table to insert the error records, but still we didn't create any table to insert the error records.
- To insert the error records, click "New..." button next to the "Name of the table or the view" dropdown. It will show the table structure to insert the error records.
- Copy the code and use SQL management studio to create a table. Also include an primary key filed.
CREATE TABLE [SSIS_Error] (
[ErrorId] [bigint] IDENTITY(1,1) NOT NULL,
[Flat File Source Error Output Column] varchar(max),
[ErrorCode] int,
[ErrorColumn] int,
CONSTRAINT [PK_SSIS_Error] PRIMARY KEY CLUSTERED
(
[ErrorId] ASC
)
)
- After created "SSIS_Error" table, set the "Name of the table or the view" dropdown to "SSIS_Error".
- In the mapping setting: map the respective available input column to available destination column.
14. We have finished the error records capturing part. Now we are going to process the success records.
- To process the success records we need to do a lookup transformation to find the DepartmentId for the DepartmentName.
- Place a lookup transformation from the tool box and connect the blue color output of the flat file source to input as the lookup transformation.
- Right click on lookup transformation and select edit. It will show the lookup transformation editor.
- In general setting: select Cache mode as "Full cache" and Connection type as "OLE DB connection manager"
- In Connection settings: Select OLE DB connection manager name from the dropdwon and set the Use a table or view to SSIS_Department table.
- In the columns setting: Map the available input columns "DepartmentName" to available lookup columns "DepartmentName" and select available lookup columns "DepartmentId" checkbox.
15. Now place an OLE DB destination and change the name as "Student Destination". Connect the lookup match output to Student Destination input.
- In the connection manager setting: Select OLE DB connection manager name from the dropdown list, Select data access mode to "Table or view - fast load". In the "Name of the table or the view" dropdown select "SSIS_Student".
- In the mappings setting: map the respective input column to destination column.
Part 4: Sample File analysis & Expected Output
We have finished our SSIS package design. Before execute this package, take a look on our input sample file.
- In our package Department part will execute first.
- So CSE, ECE, EEE and IT data will insert into the SSIS_Department table. (Four records)
- Then Student part will execute.
- While processing student records, we have placed the RegisterNo length restriction to five characters. But in our sample file last record (Watson) RegisterNo exceeds the limt. So that record should not load into SSIS_Student table. It should be in SSIS_Error (One record) table.
- Remaining student records will insert into the SSIS_Student table with DepartmentId. (Four records)
Part 5: Final Output
Run the SSIS package. If we don't encounter any error in the project, the result will be green color. If we have any issue, the result will indicate in red color.
Error SSIS package execution.
By viewing the progress tab, we can findout the error.
Total SSIS package execution.
Department part execution output.
Output showing that, from Flat file source five departments returned, after apply sort and remove duplicate sort values, four records are passed to the destionation to insert.
Student part execution output.
From the flat file source, out error record details passed to "Error Destination", remaining four student details are passed to lookup transformation to find the DepartmentId and then passed to the destination to insert.
Lets chek in the database.
SSIS_Department records.
SSIS_Error records.
SSIS_Student records.
History
1st Version : 2015-09-06