
When reading a CSV file using a DataReader and the OLEDB Jet data provider, how can I control column data types?

Tags: c#, .net, csv, oledb

In my C# application I am using the Microsoft Jet OLEDB data provider to read a CSV file. The connection string looks like this:

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Data;Extended Properties="text;HDR=Yes;FMT=Delimited"

I open an ADO.NET OleDbConnection using that connection string and select all the rows from the CSV file with the command:

select * from Data.csv

When I open an OleDbDataReader and examine the data types of the columns it returns, I find that something in the stack has tried to guess at the data types based on the first row of data in the file. For example, suppose the CSV file contains:

House,Street,Town
123,Fake Street,Springfield
12a,Evergreen Terrace,Springfield

Calling the OleDbDataReader.GetDataTypeName method for the House column will reveal that the column has been given the data type "DBTYPE_I4", so all values read from it are interpreted as integers. My problem is that House should be a string - when I try to read the House value from the second row, the OleDbDataReader returns null.
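
A minimal sketch of how the inferred column types described above can be inspected. It assumes the CSV file is c:\Data\Data.csv, as in the connection string, and that the process runs as 32-bit, since the Jet 4.0 provider has no 64-bit version. The class name ColumnTypeDump is made up for illustration.

```csharp
using System;
using System.Data.OleDb;

class ColumnTypeDump
{
    static void Main()
    {
        // Connection string from the question: the folder acts as the
        // "database" and each CSV file in it is addressed like a table.
        const string connectionString =
            @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Data;" +
            @"Extended Properties=""text;HDR=Yes;FMT=Delimited""";

        using (var connection = new OleDbConnection(connectionString))
        using (var command = new OleDbCommand("select * from Data.csv", connection))
        {
            connection.Open();
            using (OleDbDataReader reader = command.ExecuteReader())
            {
                // Print each column name with the provider type inferred for it.
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    Console.WriteLine("{0}: {1}", reader.GetName(i), reader.GetDataTypeName(i));
                }
            }
        }
    }
}
```

With the sample file above, this is where the House column shows up as DBTYPE_I4 rather than a text type.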

How can I tell either the Jet database provider or the OleDbDataReader to interpret a column as strings instead of numbers?

Rory MacLeod asked Sep 22 '08 15:09


3 Answers

To expand on Marc's answer, I need to create a text file called Schema.ini and put it in the same directory as the CSV file. As well as column types, this file can specify the file format, date time format, regional settings, and the column names if they're not included in the file.

To make the example I gave in the question work, the Schema file should look like this:

[Data.csv]
ColNameHeader=True
Col1=House Text
Col2=Street Text
Col3=Town Text

I could also try this to make the data provider examine all the rows in the file before it tries to guess the data types:

[Data.csv]
ColNameHeader=true
MaxScanRows=0

In real life, my application imports data from files with dynamic names, so I have to create a Schema.ini file on the fly and write it to the same directory as the CSV file before I open my connection.
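
Generating the file on the fly might look roughly like the sketch below. The SchemaIniWriter class and its method names are hypothetical, every column is declared as Text, and the column names are assumed to contain no spaces (names with spaces would need quoting in Schema.ini). Note that it overwrites any Schema.ini already present in the directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class SchemaIniWriter
{
    // Builds the Schema.ini section for one CSV file, declaring every column as Text.
    public static string BuildSection(string csvFileName, IList<string> columnNames)
    {
        var sb = new StringBuilder();
        sb.AppendLine("[" + csvFileName + "]");
        sb.AppendLine("ColNameHeader=True");
        for (int i = 0; i < columnNames.Count; i++)
        {
            // Column entries are numbered from 1: ColN=<name> <type>.
            sb.AppendLine("Col" + (i + 1) + "=" + columnNames[i] + " Text");
        }
        return sb.ToString();
    }

    // Writes Schema.ini next to the CSV file, replacing any existing one.
    // Call this before opening the OleDbConnection.
    public static void WriteFor(string csvPath, IList<string> columnNames)
    {
        string directory = Path.GetDirectoryName(Path.GetFullPath(csvPath));
        string section = BuildSection(Path.GetFileName(csvPath), columnNames);
        File.WriteAllText(Path.Combine(directory, "Schema.ini"), section);
    }
}
```

For the example in the question, the call would be SchemaIniWriter.WriteFor(@"c:\Data\Data.csv", new[] { "House", "Street", "Town" }), which produces the same section as the hand-written Schema.ini shown earlier.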

Further details can be found here - http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx - or by searching the MSDN Library for "Schema.ini file".

Rory MacLeod answered Oct 18 '22 06:10


There's a schema file you can create that would tell ADO.NET how to interpret the CSV - in effect giving it a structure.

Try this: http://www.aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx

Or the most recent MS Documentation

MarcE answered Oct 18 '22 06:10


Alternatively, take a look at the KBCsv library at http://kbcsv.codeplex.com/. For example:

using (var reader = new CsvReader("data.csv"))
{
    reader.ReadHeaderRecord();
    foreach (var record in reader.DataRecords)
    {
        var name = record["Name"];
        var age = record["Age"];
    }
}
Akhil answered Oct 18 '22 08:10