Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Skipping rows when importing Excel into SQL using SSIS 2008

I need to import sheets which look like the following:

    March Orders   
   ***Empty Row    
    Week Order # Date Cust #
    3.1 271356 3/3/10 010572
    3.1 280353 3/5/10 022114
    3.1 290822 3/5/10 010275
    3.1 291436 3/2/10 010155
    3.1 291627 3/5/10 011840

The column headers are actually row 3. I can use an Excel Sourch to import them, but I don't know how to specify that the information starts at row 3.

I Googled the problem, but came up empty.

like image 731
DavidStein Avatar asked Mar 10 '10 14:03

DavidStein


People also ask

How you can skip initial 200 rows while uploading a flat file in SSIS?

For SSIS, in the Flat File Connection Manager, you have an option that mentions Header rows to skip. On that option, you need to set it to 2 and check the Column names in the first row checkbox. That should handle the file if the row delimiters are consistent.


1 Answers

have a look:

the links have more details, but I've included some text from the pages (just in case the links go dead)

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/97144bb2-9bb9-4cb8-b069-45c29690dfeb

Q:

While we are loading the text file to SQL Server via SSIS, we have the provision to skip any number of leading rows from the source and load the data to SQL server. Is there any provision to do the same for Excel file.

The source Excel file for me has some description in the leading 5 rows, I want to skip it and start the data load from the row 6. Please provide your thoughts on this.

A:

Easiest would be to give each row a number (a bit like an identity in SQL Server) and then use a conditional split to filter out everything where the number <=5

http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/947fa27e-e31f-4108-a889-18acebce9217

Q:

  1. Is it possible during import data from Excel to DB table skip first 6 rows for example?

  2. Also Excel data divided by sections with headers. Is it possible for example to skip every 12th row?

A:

  1. YES YOU CAN. Actually, you can do this very easily if you know the number columns that will be imported from your Excel file. In your Data Flow task, you will need to set the "OpenRowset" Custom Property of your Excel Connection (right-click your Excel connection > Properties; in the Properties window, look for OpenRowset under Custom Properties). To ignore the first 5 rows in Sheet1, and import columns A-M, you would enter the following value for OpenRowset: Sheet1$A6:M (notice, I did not specify a row number for column M. You can enter a row number if you like, but in my case the number of rows can vary from one iteration to the next)

  2. AGAIN, YES YOU CAN. You can import the data using a conditional split. You'd configure the conditional split to look for something in each row that uniquely identifies it as a header row; skip the rows that match this 'header logic'. Another option would be to import all the rows and then remove the header rows using a SQL script in the database...like a cursor that deletes every 12th row. Or you could add an identity field with seed/increment of 1/1 and then delete all rows with row numbers that divide perfectly by 12. Something like that...

http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/847c4b9e-b2d7-4cdf-a193-e4ce14986ee2

Q:

I have an SSIS package that imports from an Excel file with data beginning in the 7th row.

Unlike the same operation with a csv file ('Header Rows to Skip' in Connection Manager Editor), I can't seem to find a way to ignore the first 6 rows of an Excel file connection.

I'm guessing the answer might be in one of the Data Flow Transformation objects, but I'm not very familiar with them.

A:

Question Sign in to vote 1 Sign in to vote rbhro, actually there were 2 fields in the upper 5 rows that had some data that I think prevented the importer from ignoring those rows completely.

Anyway, I did find a solution to my problem.

In my Excel source object, I used 'SQL Command' as the 'Data Access Mode' (it's drop down when you double-click the Excel Source object). From there I was able to build a query ('Build Query' button) that only grabbed records I needed. Something like this: SELECT F4, F5, F6 FROM [Spreadsheet$] WHERE (F4 IS NOT NULL) AND (F4 <> 'TheHeaderFieldName')

Note: I initially tried an ISNUMERIC instead of 'IS NOT NULL', but that wasn't supported for some reason.

In my particular case, I was only interested in rows where F4 wasn't NULL (and fortunately F4 didn't containing any junk in the first 5 rows). I could skip the whole header row (row 6) with the 2nd WHERE clause.

So that cleaned up my data source perfectly. All I needed to do now was add a Data Conversion object in between the source and destination (everything needed to be converted from unicode in the spreadsheet), and it worked.

like image 51
KM. Avatar answered Nov 13 '22 15:11

KM.