Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parse CSV where headers contain spaces with CsvHelper

Tags:

c#

csv

csvhelper

I have a CSV file with field headers and some of them contain two or three words separated by spaces:

Screenshot of the first few rows of a spreadsheet in a spreadsheet application, including headers that contain spaces, as described above.

You can see in the above picture the field headers that contain spaces:
"Time of Day", "Process Name", and "Image Path".

When I tried to read the CSV by calling reader.GetRecords<DataRecord>(); (where DataRecord is a class I have defined), I get the error:

Fields 'TimeOfDay' do not exist in the CSV file."*

This is because my DataRecord class can't contain members with spaces.

How can I use CsvHelper to parse the CSV file?

like image 826
E235 Avatar asked Nov 08 '14 12:11

E235


3 Answers

Based on CsvHelper Documentation, there are several ways that we can achieve our desired results.

1. Ignore White Space from Headers (which I believe should solve your problem easily)

In CsvHelper 3 or later, use PrepareHeaderForMatch (documented at http://joshclose.github.io/CsvHelper/configuration#headers) to remove whitespace from headers:

csv.Configuration.PrepareHeaderForMatch =
    header => Regex.Replace(header, @"\s", string.Empty)

In CsvHelper 2, set the IgnoreHeaderWhiteSpace flag which tells the reader to ignore white space in the headers when matching the columns to the properties by name.

reader.Configuration.IgnoreHeaderWhiteSpace = true;

2. Read Manually

We can read each field manually like:

var reader = new CsvReader(sr);
do
{
    reader.Read();                   
    var record=new DataRecord();

    record.TimeOfDay=reader.GetField<string>("Time of Day");
    record.ProcessName=reader.GetField<string>("Process Name");
    record.PID=reader.GetField<string>("PID");
    record.Operation=reader.GetField<string>("Operation");
    record.Path=reader.GetField<string>("Path");
    record.Result=reader.GetField<string>("Result");
    record.Detail=reader.GetField<string>("Detail");
    record.ImagePath=reader.GetField<string>("Image Path");

} while (!reader.IsRecordEmpty());

3. Class Mapping:

We can manually map between our class's properties and the headings in the CSV file using name class mapping like this:

public sealed class DataRecordMap:CsvClassMap<DataRecord>
{
    public DataRecordMap()
    {
         Map( m => m.TimeOfDay).Name("Time Of Day");
         Map( m => m.ProcessName).Name("Process Name");
         Map( m => m.PID).Name("PID");   
         Map( m => m.Operation).Name("Operation");    
         Map( m => m.Path).Name("Path");
         Map( m => m.Result).Name("Result");
         Map( m => m.Detail).Name("Detail");
         Map( m => m.ImagePath).Name("Image Path");
     }
}

Then we should register it using:

reader.Configuration.RegisterClassMap<DataRecordMap>();
like image 95
user3473830 Avatar answered Nov 18 '22 12:11

user3473830


The library supports attributes now. You would want to use the Name attribute.

using CsvHelper.Configuration.Attributes;

public class DataRecord
{
    [Name("Time of Day")]
    public string TimeOfDay { get; set; }

    [Name("Process Name")]
    public string ProcessName { get; set; }

    public string PID { get; set; }
    public string Operation { get; set; }
    public string Path { get; set; }
    public string Result { get; set; }
    public string Detail { get; set; }

    [Name("Image Path")]
    public string ImagePath { get; set; }

    public static IEnumerable<DataRecord> ParseDataRecords(Stream file)
    {
        using (var sr = new StreamReader(file))
        using (var csv = new CsvReader(sr))
        {
            foreach (var record in csv.GetRecords<DataRecord>())
            {
                yield return record;
            }
        }
    }
}
like image 25
JPK Avatar answered Nov 18 '22 12:11

JPK


After lots of struggling with this, here's what worked for me. I"m using version 3.

        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            PrepareHeaderForMatch = args => args.Header.Replace(" ","")
        };

        using (var reader = new StreamReader("my_csv_file.csv"))
        using (var csv = new CsvReader(reader, config))

I believe I'm creating a configuration (config) that appends or modifies InvariantCulture and uses the Replace function in C# to remove the spaces. So far it's working for me.

like image 2
StephenHenry Avatar answered Nov 18 '22 12:11

StephenHenry