This Learn MSBI article is now outdated we would request to see our latest article using SQL Server 2014 http://www.codeproject.com/Articles/1064477/Learn-to-create-MSBI-Microsoft-Business-Intelligen
Welcome to the day 3 of Learn Microsoft business intelligence Step by Step. In this article we will make a deep dive into the SSIS world. We will expose some great features and Tasks in the SSIS.
With MSBI step by step articles and videos do watch our Practical MSBI Interview Questions & Answers Tutorial.
- Day 1 - Understand Data warehouse, Business Intelligence, Basics of SSIS, SSAS and SSRS, Simple ETL using SSIS
- Day 2- Dimensions and Facts, Data warehouse design techniques, SSIS - Execute SQL Task, SSIS – Backup database Task, SSIS – Script Task and Variables.
- Day 3
Before we move on and start our Labs, first let’s have some talk on “Precedence constraints”.
- Precedence constraints let us decide order in which tasks should be executed. Example which task should be executed first, which will be next and so on.
- It also let us decide conditions on which task should be executed.
Example,
- Execute if previous task fails
- Execute if previous task succeeds.
- Execute on completion of previous task (either pass or fail)
- Execute when some condition matches.
Still confused!!!
No worries. In further coming labs we are going to use precedent constraints and after that all your confusions will fly away.
Control Flow
Till now in control flow, we had gone through “Data Flow Task”, “Execute Sql Task”, “Backup database task” and “script task”. Let’s explore some more.
“Web service task” let us execute the web servce method and then store the result into an variable or inside a file.
This variable or file may act as an input for some other task.
Note:
- Intent of this article is to explain SSIS not web services. We are assuming that you have a basic understanding of web services. If you are new to web services then in step 1 and step 2 we have shared 2 videos which will make you familiar with web services.
- In real life scenario as a BI developer, mostly you are not required to worry about creation of web services. You will get metadata of a service in the form of wsdl from external sources (may be from some other developers or organization). Your task will be adding “Web Service Task” and that’s what we will learn.
Step 1: Create a Demo WCF Service
Please go through the following video from www.Questpond.com and understand how to create a simple WCF Demo Service.
Step 2: Host the Demo Service in IIS or Console application
Please go through the following article from www.Questpond.com and understand how to host a WCF Demo Service.
Step 3: Create new Integration services project
Open Sql Server data tools (or business intelligence development studio). Click File>> New>> Project. Select integration services from the left section of create project dialog box, Enter name of the project as WebServiceDemo and click OK
Step 4: Create HttpConnection Manager
- Right click the connection manager in the bottom and select new connection.
- Select HTTP from the dialog box and click ok.
- Put the address of your WSDL file in the next dialog box.
Note: How to get the WSDL address is not your concern. The one who created the Service will give you that.
Note: Provide the Username and Password or certificate if service is accepting any. This information will also be available with you.
- Click OK.
Step 5: Add Web Service task and configure it
- Take WebService Task from the SSIS toolbox and add it to the SSIS designer tab.
- Double click the task. It will open up the “Web Service Task Editor Dialog box”.
- Set the HttpConnection property to one set in the above step.
- Set WSDL File property to some absolute file path.
Note: Don’t select the path, rather type it because WSDL file is not available yet.
Path should be “Some_Valid_Folder_Path\AnyFileName.wsdl”
For example set it to “G:\BI Step by Step\3\Source Code\WebServiceTask\Service.wsdl”
- In the bottom you will find a Download WSDL button. Click that. It will download the wsdl exposed by service into the path specified in above step.
| |
- Now change selection from General to input in the left section.
- Set service and method to one of the values in dropdown. Values had been added to both dropdowns when we clicked Download MetaData in the last step.
| |
- Navigate to output section and set Output type to Variable. Next Set variable to “New Variable”. Enter Variable name as “ServiceOutput” and set type to String. Click OK to close the “Add variable” window. Click ok to close the “Web service task editor”.
Step 6: Create script task
Add Script task from the SSIS toolbox and configure it to display the value of “ServiceOutput” variable in message box. Please refer day 2 to learn how to work with script task.
Step 7: Decide the execution flow.
Use the precedence constraint and make Webservice task executed first and then ScriptTask.
- Click the Web service task. You will see a green arrow coming out of it.
- Take that line and connect it to ScriptTask.
| |
Note: If we don’t use precedence constraints here both the tasks will execute in parallel. With Precedent constraints we added a constraint “One will execute only after other executes successfully”
Step 8: Execute and Test
- Make sure service is executing. (For demo you can use the WCF service attached in the article. From the WCF Service project and execute hosting project.)
- Press F5 and confirm that package is working
As you can see, we are getting XML out from Webservice.
In order to extract the value from it we will use “xml task”.
“Xml Task” let us work with xml data. Using this,
- We can merge multiple xml documents,
- We can apply XLST stylsheets to a xml document
- We can evaluate values using XPath expression
- We can Validate XML document
- We can compare Xml documents
Step 1: Prepare the project
Use the same project created in the last lab (web service lab) and remove “script task” from it.
Step 2: Add Xml Task and configure it
Take “Xml task” from the SSIS toolbox and add it to the control flow.
Step 3: Add variable for storing final result
Right click the designer section, select variables and add a variable called “XmlOutput”
Step 4: Configure the XML Task
- Double click the “xml task” it will open the “XML Task Editor window”.
- Set the properties as follows
- I.Source Type to variable
- II.Source to User::ServiceOuput (this variable will contain the webservice return value in xml format and it was created in last demo)
- III.OperationType to XPATH
- IV.XPathOpearation to “Values”
- V.SecondOperationType to DirectInput
- VI.SecondOperand to “/string”
Note: If you want, you can store XPath expression in the file or variable and use it directly. For that set SecondOperationType as “File Connection” or as “Variable” accordingly. - VII.SaveOperationResult to True
- VIII.DestinationType to variable
- IX.Destination to XmlOutput (It was created in last step)
- X.OverwriteDestination to True
- Click OK
Step 5: Add and configure script task
Add script task in the control flow and configure to display the value of XmlOutput variable in message box.
Step 6: Decide the execution flow
Use the precedence constraint and make “Web Service Task” executed first, then “Xml Task” and finally “Script Task”.
| |
|
Step 7: Execute and Test the application
Press F5 and check the output
|
|
Error handling had been an intriguing feature in every technology. How come SSIS stay behind? J
Let’s take the same sample created in above lab for this demo. But this time Service won’t be available hence web service task won’t execute properly and will throw error.
Step 1: Prepare the project
I.Take sample created in above Lab 6.
II.Stop the web service if its executing
Step 2: Create Event Handler
I.Click the event handler tab in the SSIS designer window.
II.You will find two dropdowns in the top named “Executables” and “Event handler”. Select “Web Service Task” as Executable and select “OnError” as Event handler.
III.In the working area of designer window you will find a link “Click here to create an “OnError” event handler for executable “Web Service Task”. Click that
IV.That’s it. It will create the event handler.
Step 3: Add Script task and configure it
- Now in the working area of event handler add a script task.
- Double click the task to open “Script Task Editor”.
- Make sure to select a system variable called “Error description” for ReadOnlyVariable property
Note:
- “Error Description” is a system variable which will contain the latest error description in the SSIS Package.
- Variable will be updated each time new error occurs.
- We can use this variable as input for some other tasks.
- Click the Edit Script and write a code to display the value of "Error Description” variable in message box.
MessageBox.Show(Dts.Variables["ErrorDescription"].Value.ToString());
Step 4: Execute and test the application
Press F5 and confirm that everything is proper.
[Work In progress – will be updated soon]
Containers in SSIS let us group multiple tasks into one. By adding multiple tasks into one group we get following benefits.
- Easy control over the sequence of execution.
Let’s have an example.
Let say we have task1 and we want that task 2 will execute after task1. The challenge is, once the task2 gets complete, task3, task4 and task5 should execute in parallel and finally once all of them completes task6 should start. Without containers it would have been nightmare because in real time scenario we may have huge number of tasks and controlling all of them only with precedence constraints will be difficult.
- Set values to common properties of multiple tasks into one go. Example disable multiple tasks in one click.
- Instead of adding event handler for every task individually we can add event handler for container. In the last demo we had added “OnError” event handler to one task. What if we have many tasks? Simply put all the tasks into one Container and add “OnError” event handler to Container.
- Till now, whatever variable we had created those were package level variables. Life time of those variable is equal to the lifetime of a package. With containers we can restrict the scope of variable to container.
- We can execute multiple tasks in the transactions. If one task fails, operations done by other tasks get roll backed.
- Sequence container – Just group the tasks into one
- For Loop Container – Group the tasks and let them execute “n” number of times
- For Each Loop Container – Group the tasks, let us loop though the collection of items (like file) and let us execute the “group of tasks” in that loop. (Every item in the loop may act as input to one or more tasks inside the container)
In this lab
- We will group multiple tasks into one.
- Then we will make them execute in transaction.
Step 1: Prepare Database
Create a database called ContainerDemo and a table called Customer as follows in the SQL Server
Step 2: Create Integration services project
Create new SSIS project using SQL Server Data tools.
Step 3: Add Connection Managers
Add Ado.net Connection manager and connect it to ContainerDemo database
Step 4: Add Container
Add sequence container in control flow.
Step 5: Add Execute SQL Task
- Create three “Execute SQL Task” inside sequence container.
- Set ConnectionType as ADO.NET for all three of them.
- Set connection to one created in step no 3 for all three of them.
- Set SQLSourceType to DirectInput for all three of them.
- Set SqlStatement to
- insert into Customer values('A',55) for first task
- insert into Customer values('B',60) for second one
- insert into Customer values('C',’CC’) for third one
Note:
- Third query will leads to error because Age is integer and we are passing string.
- Don’t connect three tasks inside sequence container using precedence constraints. Let them execute in parallel.
Step 6: Execute and Test
Press F5 and Test the application.
Step 7: Enable Transaction
- Right click Sequence container and select properties
- In the property window set value of TransactionOption property to “Required”
Step 8: Execute and Test
Press F5 and test the application
Note: Make sure to delete all the exiting records from the table before you execute so that you get a clear picture of record insertion without any confusion.
In this lab we will learn how to limit the scope of the variable to container and how to perform looping
Step 1: Prepare Database
We will use the same database created in above lab. Make sure to empty Customer Table so that there will not be any confusion.
Step 2: Crate new Integration Services Project
Create new SSIS project using SQL Server Data tools.
Step 3: Add Connection Managers
Add Ado.net Connection manager and connect it to ContainerDemo database
Step 4: Add Container
Add “For Loop container” in control flow.
Step 5: Create variable
In order to loop, first thing we need is variable. Create a variable at package level called Counter.
Step 6: Add sequence container
Add one more sequence container in the same control flow.
Step 7: Check current variable scope.
Make sure variable window is open.
- Click the sequence container. You will find Counter variable in the variable window.
- Click for loop container. Counter variable is still available
Step 8: Change variable scope
- Select the variable in variable window, click the Move Variable button.
- A new window called “Select new scope” will popup. Select the for loop container which was added in one of the previous step.
Step 9: Recheck variable scope.
Perform the step 7 again.
Step 9: Remove the sequence container
Select sequence container and press delete key. Purpose of that container in this demo was only testing
Step 10: Configure For Loop container.
- Double click the “For loop container”. It will open up the “For loop editor” window.
- Set the values to the properties as show in the figure.
Note: It’s self-explanatory now. For loop will execute five times
Step 11: Add Execute Sql Task and configure it.
- Add a new Execute Sql Task inside For loop container.
- Configure its properties as follows.
- ConnectionType to ADO.NET.
- Connection to one created in one of the above step.
- SQLSourceType to “DirectInput”
- Set SQLStatement to “insert into Customer values('A'+cast(@Index as varchar),@Index)”
- In the “Execute Sql Task editor” window go to “parameter mapping” section and map sql parameter @index to SSIS parameter @Counter
- Click Ok
Step 12: Execute and Test
Press F5 for executing package and test the output
In this lab we will iterate through CSV files located inside a directory and store all the data inside it into Sql Sever
Step 1: Prepare the database
Same ContainerDemo database and Customer table will be used. Make sure to truncate it.
Step 2: Create SSIS project
Create new SSIS project using SQL Server data tools
Step 3: Create data Folder and Data Files
Create 3 CSV Files with some data and put them in the some folder.
Step 4: Create variable
Create a new variable called “FilePath”
|
| |
Step 5: Add and configure “for each loop container”
- Add For each loop container to control flow
- Double click the container to open “Foreach loop editor”
- Change selection in left section from General to collection.
- Set Enumerator as “ForEach File Enumerator”
- Select Folder Path
- Put “*.csv” in the Files textbox
- Change selection from Collection to variable mapping.
- Set variable to one created in above step and Index to 0
| |
- Click Ok.
Step 6: Create dataflow task
When it comes to data transfer, data flow task is the only choice. Take “data flow task” from toolbox and add it inside the container.
Step 7: Add connection manager
- Add Ado.net connection Manager and configure it to point ContainerDemo database
- Add Flat File Connection Manager and explicitly point it to one of the files in the directory (data directory)
Step 7: Configure source and destination inside dataflow task.
- Double click the “data flow task”. It will take you to dataflow tab
- Add Flat file source and double click it to open Flat File source editor.
Set Connection manager to “flat file connection” created in last step.
- Add Ado.Net destination
- Connect flat file source to this destination using “dataflow path”
- Double click the Ado.Net destination top open “Ado.Net destination editor”
- Set Connection manager to “Ado.Net connection” created in last step.
- Select table as Customer.
- Navigate to mapping section and make sure mapping is proper.
Note: You might be thinking “We are going to perform looping over the files, then why we configured source to single file”. Answer is Relax!! J We have not done yet. We connected our source to one of the file because by doing so, we got the idea about source file structure and mapping was possible.
Step 7: Make connection dynamic
- Right click the Flat file connection and go to properties
- Find the expression property and click the 3 dot button. It will popup “property expression editor”
- Select property as ConnectionManager and click the 3 dot button to set expression
- “Expression builder” window will popup. From the “Variable and parameter” section, take FilePath variable created in one of the previous step and drag it to “Expression” section.
- Click Ok to close the “expression builder”. Click Ok again to close the expression property editor window.
Step 8: Execute and Test the application
Press F5 and check the output
Before going in to deployment let’s understand what elements get’s deployed in a SSIS project. If you look at your SSIS projectstructure it looks something as shown below.
At the top we have the solution and inside the solution we have project files. Further every project has package file with extension DTSX and configuration for those packages are stored in project.params.
To understand SSIS deployment we need to think in terms of “What” and ”Where”. So what are the various ways of deploying and where can we deploy them.
So let’s answer first What?. There are two ways of deploying SSIS projects,1st Project level deployment and2nd package level deployment.
When you do project level deployment you deploy all packages in one go. While in package level deployment, deployment happens at the DTSX file level. So you need to deploy individual DTSX files.
Project level deployment is the new way of doing deployment and was introduced in SQL 2012 while package level is the old wayi.e. till SQL 2008. In this lab we will focus more on project level deployment rather package level.
The second thing we need to understand is “Where can we deploy these packages ?”. In other words what are the different ways of hosting these packages. So there are 3 primary source or hosting in which you can deploy:-
- In SQL Server service itself i.e. in SSISDB database.
- In SSIS service :-
So in order to understand deployment step by step we will do this is 5 steps:-
- Create a simple file copy project which copies file from a source to a destination.
- Create setup of that project.
- Install the package setup.
- Configure the package before running.
- Finally running the package.
So let’s start the journey.
Step 1:- Create the file copy project
So let’s create two folders “Location1” and “Location2” as shown in the below figure and let’s create a simple text file “SimpleText” file in “Location1”.
Now we would like to copy this “SimpleText” file from “Location1” folder to “Location2” folder. So in order to achieve the same we will be using the “File system task” control. So create a simple SSIS project and add a package “FileCopy.dtsx” file. Your project structure should look something as shown in the below figure.
On this package drag and drop the file system task from the control flow tool box and put it on the control flow designer pane as shown in the below figure.
Also we need to provide source and destination files which needs to be copied. So right click on the “File system task” control and click edit and provide the “Destination” and “Source” connection. So in the source connection we will be pointing to location1 and in the destination we will be pointing to location2 folder.
Run this project and once test , if the fileis getting copied from “Location1” folder to “Location2” folder.
We have purposely kept this project simple so that we can concentrate on the deployment more rather than the SSIS project.
Step 2 :- Create a setup
As said in the previous section in SQL Server 2012 the deployment model is now project deployment. So do a full build by clicking on build à rebuild all. Now go to the project folder by right clicking on the solution and click on “open folders in windows explorer”.
Go in to bin/Development folder you will find a full SSIS setup created with a file extension “ISPAC”. Double click and run the same.
Step 3:- Install the setup
For now we will be deploying in SQL Server SSISDB. So connect to SQL Server instance and browse to the “Integration Services catalogs” folder and right click on “SSISDB” and click on create folder. So we will create a folder “MyPackages” and we will deploy the packages in the same.
Once you run the ISPAC file setup it will start a wizard. In the wizard there are two important things to be mentioned one is the source and the other is the destination. Source is from where the things will be loaded and i.e the ISPAC file and destination is where the deployment will happen.
We have already created a folder “Mypackages” so select the same and install in the same.
Step 4 :- Configure the package
Once the package is installed you should see your project inside “SSISDB\Mypackages” folder.
Expand the SSIS project folder and browse to the package and right click on the same. Now you can do three things first validate the package, configure and run it. So let’s first configure and then run.
So once you click on configure the following screen comes up with two tabs. The first tab is for parameters we will talk about that later on. For now the second tab is important. The second tab has connection managers , remember we had two file connection one for source and other for destination.
If you wish to configure the file paths you can click on the “…” and set a different value.
Step 5:- Run the package
Once we have finished the configuration we can run the package again by right clicking and execute. After execution you can also see a report of your success and failure.
Some points to remember before we conclude deployment:-
You can parameterize your package by creating parameter’s using the parameter tab and these parameters can be connected with your connection manager properties of variables. These values can be supplied when you execute the package.
If you want to share configuration data at the project level. Like some values which are common for all packages like server name or some common folder etc you can create environment variables. You can then attach these environmental variables with package configuration or variables.
In this lab we have focused mainly on package deployment but in case you want to use the old way of deployment i.e. package you can always convert it as shown in the below figure.
In case in project deployment you want to individually deploy packages you can do the same by right clicking on project folder and clicking import packages as shown below.
Conclusion
In this article we specially focused about Control Flow. In the next article we focus more on the data flow tasks and on general features like Debugging, Deployment etc.
Hope you enjoyed reading this. Your comments, votes and suggestions motivates us for writing more stuffs like this.
You can also refer the below 1 hour MSBI youtube video :- Learn MSBI in 4 days.
For technical trainings on various topics like WCF, MVC, Business Intelligence, Design Patterns, WPF and UML and many more feel free to visit www.sukesh-marla.com
For more stuff like this click here. Subscribe to article updates or follow at twitter @SukeshMarla
For further reading do watch the below interview preparation videos and step by step video series.