Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Excel Source as Lookup Transformation Connection

I need to have Lookup Transformation Task, where Connection will be a Excel Source. But SSIS allow to set only OLEDB Connection as Connection.

Is there any possibility to use Excel file as Connection for Lookup Transformation?

like image 553
ceth Avatar asked Oct 31 '11 12:10

ceth


People also ask

How do I find SSIS on Excel?

Click on the mappings node in the cache transform and map the columns from excel to the cache connection. Notice the magnifying glass next to the ID, this indicates the index column. In the next data flow we will have and OLEDB source from the SQL table and a lookup transform using the cache connection.

Which type of join does the lookup transformation performs during lookups?

The Lookup transformation tries to perform an equi-join between values in the transformation input and values in the reference dataset. (An equi-join means that each row in the transformation input must match at least one row from the reference dataset.)

How is lookup different from lookup transformation?

The Term Lookup transformation adds two columns to the transformation output, named by default Term and Frequency. Term contains a term from the lookup table and Frequency contains the number of times the term in the reference table occurs in the input data set.


1 Answers

Merge sort is an option, but if you really want to use Excel as a source for the SSIS Lookup task, you can do it.

As you have correctly discerned, you cannot use the Excel Connection Manager in a Lookup task, it only accepts the OLE DB connection manager. The trick then, is to use an OLE DB connection manager with an Excel spreadsheet.

For this example, I have a spreadsheet with state codes and their full name and my source data only has the abbreviations flowing through. This walk through will wire up a lookup task against Excel to retrieve those values.

Keep these caveats in mind: Lookups are case sensitive, regardless of whether the source database (or file in this case) is case sensitive. Excel strings are always going to be interpreted as unicode/nvarchar.

Source Data

Given a simple file sitting at C:\tmp\LookupReference.xlsx that looks like the following Simple key value pairs in Excel

Connection Manager

We must first establish an OLE DB Connection Manager. Instead of the default "Native OLE DB\SQL Server Native Client 10.0" change that to "Native OLE DB\Microsoft Office 12.0 Access Database Engine OLE DB Provider". In the Server or file name, locate the source file. Clicking Test Connection at this point will result in an error.

OLE DB Connection Manager Connection Tab

Here comes the "magic." Click the All tab and scroll to the top. In Extended Properties, add the indicated string "Excel 12.0;HDR=YES; IMEX=1;" This tells the provider that we are going to use Excel 12.0 (.xlsx format) with a header row and the IMEX 1 tells the driver there will be intermixed data.

OLE DB Connection Manager All Tab

Your package sould now look something like this. A connection manager with extended properties set and assumes a preexisting data flow

Control Flow

Data Flow

To simplify matters, I have a script source that generates 3 rows of data with state codes MO, KS and NE and sends them down the pipeline. Your source will obviously be different but the concept will remain the same. Sample code provided in the annotation.

In your lookup transformation, you will need to write a query against the spreadsheet. It's similar to a normal database query except your table is going to be Sheet1$ unless you have a named range in which your table would be MyRange Note the $ is required when referencing a sheet. Using the sample spreadsheet above, my query would be

SELECT
    S.StateCode
,   S.StateName
FROM 
    `Sheet1$` S

I map my Script task column StateCode to the reference query's StateCode column and check the StateName field as I want to add that to my data flow.

Data Flow

I've put a Data viewer after the Lookup task to verify my lookups worked

Data Viewer

Everything works and we're all happy.

Old Excel

If you are using a .xls file, you need to make the following changes. - In your Connection Manager, instead of the Office 12 provider, select the "Native OLE DB\Microsoft Jet 4.0 OLE DB Provider" - The Extended Properties become "EXCEL 8.0;HDR=Yes; IMEX=1;"

Minor Edit

Blogged this at Using Excel in an SSIS lookup transformation. You can also use a Cached Connection Manager to use any source for lookups in SSIS 2008+

like image 160
billinkc Avatar answered Sep 28 '22 09:09

billinkc