Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reading CSV files using C#

Tags:

c#

csv

People also ask

What is csv file in C?

Writing to a CSV using C code A basic CSV file relies on a comma-separated data format, which makes it straightforward to write to this file type from a c program. First you will want to create a file pointer, which will be used to access and write to a file.


Don't reinvent the wheel. Take advantage of what's already in .NET BCL.

  • add a reference to the Microsoft.VisualBasic (yes, it says VisualBasic but it works in C# just as well - remember that at the end it is all just IL)
  • use the Microsoft.VisualBasic.FileIO.TextFieldParser class to parse CSV file

Here is the sample code:

using (TextFieldParser parser = new TextFieldParser(@"c:\temp\test.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    while (!parser.EndOfData) 
    {
        //Processing row
        string[] fields = parser.ReadFields();
        foreach (string field in fields) 
        {
            //TODO: Process field
        }
    }
}

It works great for me in my C# projects.

Here are some more links/informations:

  • MSDN: Read From Comma-Delimited Text Files in Visual Basic
  • MSDN: TextFieldParser Class

I recommend CsvHelper from Nuget.

PS: Regarding other more upvoted answers, I'm sorry but adding a reference to Microsoft.VisualBasic is:

  • Ugly
  • Not cross-platform, because it's not available in .NETCore/.NET5 (and Mono never had very good support of Visual Basic, so it may be buggy).

My experience is that there are many different csv formats. Specially how they handle escaping of quotes and delimiters within a field.

These are the variants I have ran into:

  • quotes are quoted and doubled (excel) i.e. 15" -> field1,"15""",field3
  • quotes are not changed unless the field is quoted for some other reason. i.e. 15" -> field1,15",fields3
  • quotes are escaped with \. i.e. 15" -> field1,"15\"",field3
  • quotes are not changed at all (this is not always possible to parse correctly)
  • delimiter is quoted (excel). i.e. a,b -> field1,"a,b",field3
  • delimiter is escaped with \. i.e. a,b -> field1,a\,b,field3

I have tried many of the existing csv parsers but there is not a single one that can handle the variants I have ran into. It is also difficult to find out from the documentation which escaping variants the parsers support.

In my projects I now use either the VB TextFieldParser or a custom splitter.


Sometimes using libraries are cool when you do not want to reinvent the wheel, but in this case one can do the same job with fewer lines of code and easier to read compared to using libraries. Here is a different approach which I find very easy to use.

  1. In this example, I use StreamReader to read the file
  2. Regex to detect the delimiter from each line(s).
  3. An array to collect the columns from index 0 to n

using (StreamReader reader = new StreamReader(fileName))
    {
        string line; 

        while ((line = reader.ReadLine()) != null)
        {
            //Define pattern
            Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

            //Separating columns to array
            string[] X = CSVParser.Split(line);

            /* Do something with X */
        }
    }

CSV can get complicated real fast.

Use something robust and well-tested:
FileHelpers: www.filehelpers.net

The FileHelpers are a free and easy to use .NET library to import/export data from fixed length or delimited records in files, strings or streams.

Another one to this list, Cinchoo ETL - an open source library to read and write CSV files

For a sample CSV file below

Id, Name
1, Tom
2, Mark

Quickly you can load them using library as below

using (var reader = new ChoCSVReader("test.csv").WithFirstLineHeader())
{
   foreach (dynamic item in reader)
   {
      Console.WriteLine(item.Id);
      Console.WriteLine(item.Name);
   }
}

If you have POCO class matching the CSV file

public class Employee
{
   public int Id { get; set; }
   public string Name { get; set; }
}

You can use it to load the CSV file as below

using (var reader = new ChoCSVReader<Employee>("test.csv").WithFirstLineHeader())
{
   foreach (var item in reader)
   {
      Console.WriteLine(item.Id);
      Console.WriteLine(item.Name);
   }
}

Please check out articles at CodeProject on how to use it.

Disclaimer: I'm the author of this library