I am writing a C# library to read in Excel files (both xls and xlsx) and I'm coming across an issue.
As described in this question, if a column in my Excel file holds string values but has a numeric value in the first row, the OLEDB provider assumes the column is numeric and returns NULL for the values in that column that are not numeric.
I am aware that, as in the answer provided there, I can make a change in the registry. However, this is a library I plan to use on many machines, and I don't want to change every user's registry values, so I was wondering if there is a better solution.
Maybe a DB provider other than ACE.OLEDB (and it seems JET is no longer supported well enough to be considered)?
Also, since this needs to work on XLS / XLSX, options such as EPPlus / XML readers won't work for the xls version.
Your connection string should look like this:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcelfile.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";
IMEX=1 is the part of the connection string that tells the provider to read columns with mixed data types as text. This should work fine without the need to edit the registry.
HDR=YES simply marks the first row as column headers. It is not needed for your particular problem, but I've included it anyway.
Always using IMEX=1 is a safer way to retrieve data from mixed-data columns.
Source: https://www.connectionstrings.com/excel/
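Since the library has to handle both xls and xlsx, here is a minimal sketch of picking the Extended Properties value from the file extension. The class and method names are hypothetical, not part of any library, and the "Excel 8.0" value for xls files is taken from the same connectionstrings.com page, so treat it as an assumption to verify against your own files.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the original answer.
public static class ExcelConnectionStrings
{
    // Builds an ACE OLEDB connection string with IMEX=1 for .xls or .xlsx files.
    public static string Build(string path, bool hasHeaderRow = true)
    {
        string extension = Path.GetExtension(path).ToLowerInvariant();

        string excelVersion;
        switch (extension)
        {
            case ".xlsx":
                excelVersion = "Excel 12.0 Xml";
                break;
            case ".xls":
                // Assumption: ACE accepts "Excel 8.0" for the older binary format.
                excelVersion = "Excel 8.0";
                break;
            default:
                throw new ArgumentException("Unsupported extension: " + extension, nameof(path));
        }

        string hdr = hasHeaderRow ? "YES" : "NO";

        // The Extended Properties value must be wrapped in double quotes.
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + excelVersion + ";HDR=" + hdr + ";IMEX=1\";";
    }
}
```

Calling ExcelConnectionStrings.Build(@"C:\test.xlsx") would then give a string in the same shape as the one above, and passing an .xls path only changes the Extended Properties part.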
Edit:
Here is the data I'm using:
Here is the output:
This is the exact code I used:
string connString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\test.xlsx;Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1""";
using (DbClass db = new DbClass(connString))
{
    var x = db.dataReader("SELECT * FROM [Sheet1$]");
    while (x.Read())
    {
        for (int i = 0; i < x.FieldCount; i++)
            Console.Write(x[i] + "\t");
        Console.WriteLine("");
    }
}
The DbClass is a simple wrapper I made in order to make life easier. It can be found here:
http://tech.reboot.pro/showthread.php?tid=4713
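In case the wrapper is not available, here is a rough equivalent that uses the standard System.Data.OleDb classes directly. It is a sketch, not the wrapper's actual implementation: the ExcelDump class and its method names are made up for illustration, and it joins fields with tabs rather than writing a trailing tab after each one.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

// Hypothetical stand-in for the DbClass wrapper; names are illustrative only.
public static class ExcelDump
{
    // Joins every field of the current row with tabs. NULL cells become empty strings.
    public static string FormatRow(IDataRecord record)
    {
        var values = new string[record.FieldCount];
        for (int i = 0; i < record.FieldCount; i++)
            values[i] = Convert.ToString(record[i]);
        return string.Join("\t", values);
    }

    // Opens the workbook, runs the query, and prints one line per row.
    public static void PrintSheet(string connString, string query)
    {
        using (var connection = new OleDbConnection(connString))
        using (var command = new OleDbCommand(query, connection))
        {
            connection.Open();
            using (OleDbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(FormatRow(reader));
            }
        }
    }
}
```

It would be called as ExcelDump.PrintSheet(connString, "SELECT * FROM [Sheet1$]"). Note that this needs the ACE provider installed on the machine, and on .NET Core or later System.Data.OleDb comes from a separate NuGet package and only works on Windows.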