Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Term Lookup Transformation in SSIS

4.33/5 (2 votes)
8 Apr 2013CPOL3 min read 77.7K   2K  
Term Lookup is the one of the powerful Transformations in SQL Server Integration Service. This transformation basically helps to analyze the set of text records.

Introduction

Term Lookup is the one of the powerful Transformation in Sql Server Integration Service. This Transformation is basically helps to analyze the set of text records.

Term Lookup transformation will look for set of user predefined nouns or noun Phrases in transformation input text column and provide the count of occurrences as an output on row by row basis.

Example

A mobile customer support team has collected the product feedback from the end user for a newly released product. This Unstructured Data [Feedback] can be analyze for most frequently encountered problems and that can be resolved in the next release of the product.

How to use Term Lookup Transformation

Table structure - Here I have used three tables

  1. CustFeedback table [Given Input]
  2. Customer Feedback Table

  3. SearchKey Table [User predefined nouns or noun Phrases]
  4. SearchKey Table

  5. Data_Analyze Table [Excepted output]
  6. Data Analyze Table

Steps To Be Carried Out

Step 1: Create new project in BIDS with Integration Service Project Template.

New Project

Step 2: Drag the Dataflow task to Control flow.

Data Flow Task

Step 3: Create a New OLEDB Connection.

Right click on Connection Manager and select New OLEDB Connection and provide the necessary information to connect your SQL database.

New OLEDB Connection

Step 4: Add OLEDB source Task and configure the same.

Double click on Data Flow task and Drag the OLEDB source Task to Data flow.

New OLEDB Source

Double click on OLEDB source Task and select the connection which you have created in previous step and select the name of the Feedback Table.

OLEDB Source Config

Step 5: Now it’s a time to use Term lookup transformation to find out the terms in Feedback source column, but the Term Lookup transformation will except only Unicode String [DT_WSTR] or Unicode text stream [DT_NTXT] as its data type . To convert the data type of the column we use data Conversion transformation.

New Data Conversion

Convert Feedback column Data Type. Double click on Data Conversion transformation and select Feedback column in available Input Columns and select Unicode text stream [DT_NTXT] as Data Type as shown in the below screenshot.

Data Conversion Config

Step 6: Add Term lookup transformation and configure the same.

Drag Term Lookup Transformation to Data Flow.

New Term Lookup

Configure Term Lookup Transformation

When you double click on Term Lookup Transformation you will see three tabs in editor window.

  1. Reference Table: Select the Reference Table name from the dropdown
  2. Ref Table

  3. Term Lookup: Now you will see the Available Input Columns and Available Reference Columns.
  4. Available Input Columns

    Select the check box in Available Input Column list which you wanted to see as a output column.

    Output Columns

    The terms [or Records] which are available in the Keyword column of the Reference table is going to compare with Converted_Feedback column of the Input Column. Drag converted_Feedback column to Keyword column to relate both the column.

    Ref Columns

  5. Advanced: Select the checkbox “Use Case-sensitive Term lookup” for Case-sensitive term lookup.
  6. Advanced Tab

Step 7: Add OLEDB destination Task and configure the same [copy the result data into Data_Analyze ] Table.

New OLEDB Dest

Double click on OLEDB destination Task and provide the necessary information to connect Data_Analyze table.

OLEDB Dest Config

Provide the appropriate column name in Mapping tab

OLEDB Dest Config Map

Step 8: Run the application.

Run

After successful execution of this package we will get following output in Data_Analyze Table.

Data Analyze

  • ID – Auto generated number .
  • Terms – The Term or phrases which are found in the provided input column.
  • Frequency –Count of Term occurrences in output on row by row basis.
  • UserID – Used ID Information for the table CustFeedback.
  • Converted_Feedback- Feedback Information from the table CustFeedback.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)