Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Problem with using OleDbDataAdapter to fetch data from a Excel sheet

First, I want to say that I'm out on deep water here, since I'm just doing some changes to code that is written by someone else in the company, using OleDbDataAdapter to "talk" to Excel and I'm not familiar with that. There is one bug there I just can't follow.

I'm trying to use a OleDbDataAdapter to read in a excel file with around 450 lines.

In the code it's done like this:

connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source='" + path + "';" + "Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1;\"");
connection.Open();
OleDbDataAdapter objAdapter = new OleDbDataAdapter(objCommand.CommandText, connection);
objAdapter.Fill(objDataSet, "Excel");

foreach (DataColumn dataColumn in objTable.Columns) {
  if (dataColumn.Ordinal > objDataSet.Tables[0].Columns.Count - 1) {
    objDataSet.Tables[0].Columns.Add();
  }
  objDataSet.Tables[0].Columns[dataColumn.Ordinal].ColumnName = dataColumn.ColumnName;
  objImport.Columns.Add(dataColumn.ColumnName);
}

foreach (DataRow dataRow in objDataSet.Tables[0].Rows) {
   ...
}

Everything seems to be working fine except for one thing. The second column is filled with mostly four digit numbers like 6739, 3920 and so one, but fice rows have alphanumeric values like 8201NO and 8205NO. Those five cells are reported as having blank contents instead of their alphanumeric content. I have checked in excel, and all the cells in this columns are marked as Text.

This is an xls file by the way, and not xlsx.

Do anyone have any clue as why these cells are shown as blank in the DataRow, but the numeric ones are shown fine? There are other columns with alphanumeric content that are shown just fine.

like image 471
Øyvind Bråthen Avatar asked Sep 28 '10 22:09

Øyvind Bråthen


2 Answers

What's happening is that excel is trying to assign a data type to the spreadsheet column based on the first several values in that column. I suspect that if you look at the properties in that column it will say it is a numerical column.

The problem comes when you start trying to query that spreadsheet using jet. When it thinks it's dealing with a numerical column and it finds a varchar value it quietly returns nothing. Not even a cryptic error message to go off of.

As a possible work around can you move one of the alpha numeric values to the first row of data and then try parsing. I suspect you will start getting values for the alpha numeric rows then...

Take a look at this article. It goes into more detail on this issue. it also talks about a possible work around which is:

However, as per JET documentation, we can override the registry setting thru the Connection String, if we set IMEX=1( as part of Extended Properties), the JET will set the all column type as UNICODE VARCHAR or ADVARWCHAR irrespective of ‘ImportMixedTypes’ key value.hey

like image 161
Abe Miessler Avatar answered Sep 19 '22 00:09

Abe Miessler


IMEX=1 means "Read mixed data as text."

There are some gotchas, however. Jet will only use several rows to determine whether the data is mixed, and if so happens these rows are all numeric, you'll get this behaviour.

See connectionstrings.com for details:

Check out the [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel] located registry REG_DWORD "TypeGuessRows". That's the key to not letting Excel use only the first 8 rows to guess the columns data type. Set this value to 0 to scan all rows. This might hurt performance. Please also note that adding the IMEX=1 option might cause the IMEX feature to set in after just 8 rows. Use IMEX=0 instead to be sure to force the registry TypeGuessRows=0 (scan all rows) to work.

like image 33
GSerg Avatar answered Sep 20 '22 00:09

GSerg