I am working on a CSV parser using the C# TextFieldParser class.
My CSV data is delimited by , and strings are enclosed by a " character.
However, sometimes a data row cell can also contain a ", which appears to make the parser throw an exception.
This is my C# code so far:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.VisualBasic.FileIO;

namespace CSV_Parser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Init
            string CSV_File = "test.csv";

            // Proceed If File Is Found
            if (File.Exists(CSV_File))
            {
                // Test
                Parse_CSV(CSV_File);
            }

            // Finished
            Console.WriteLine("Press any key to exit ...");
            Console.ReadKey();
        }

        static void Parse_CSV(String Filename)
        {
            using (TextFieldParser parser = new TextFieldParser(Filename))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                parser.TrimWhiteSpace = true;
                while (!parser.EndOfData)
                {
                    string[] fieldRow = parser.ReadFields();
                    foreach (string fieldRowCell in fieldRow)
                    {
                        // todo
                    }
                }
            }
        }
    }
}
This is the content of my test.csv file:
" dummy test"s data", b , c
d,e,f
gh,ij
What is the best way to deal with " in my row cell data?
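To see which rows trip the parser, a sketch along these lines may help. It assumes the exception being thrown is MalformedLineException, which TextFieldParser raises for a line it cannot parse, and it reads from a TextReader so it can be tried without a file. The class and method names (MalformedLineProbe, FindMalformedLines) are hypothetical and only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class MalformedLineProbe
{
    // Hypothetical helper: returns the raw text of every line the parser rejects.
    public static List<string> FindMalformedLines(TextReader reader)
    {
        var badLines = new List<string>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.TrimWhiteSpace = true;

            while (!parser.EndOfData)
            {
                try
                {
                    // Parsed fields are ignored here; only failures are of interest.
                    parser.ReadFields();
                }
                catch (MalformedLineException)
                {
                    // ErrorLine holds the text of the line that just failed.
                    badLines.Add(parser.ErrorLine);
                }
            }
        }
        return badLines;
    }
}
```

Passing a StringReader over the test.csv content above should report only the first row, since that is the one with a quote in the middle of a quoted cell.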
UPDATE
Based on Tim Schmelter's answer, I have modified my code to the following:
static void Parse_CSV(String Filename)
{
    using (TextFieldParser parser = new TextFieldParser(Filename))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = false;
        parser.TrimWhiteSpace = true;
        while (parser.PeekChars(1) != null)
        {
            var cleanFieldRowCells = parser.ReadFields().Select(
                f => f.Trim(new[] { ' ', '"' }));
            Console.WriteLine(String.Join(" | ", cleanFieldRowCells));
        }
    }
}
Which appears to produce the following (correctly):
Is this the best way to deal with a string that is enclosed by quotes and also contains quotes?
Could you omit the quoting character by setting HasFieldsEnclosedInQuotes to false?
using (var parser = new TextFieldParser(@"Path"))
{
    parser.HasFieldsEnclosedInQuotes = false;
    parser.Delimiters = new[] { "," };
    while (parser.PeekChars(1) != null)
    {
        string[] fields = parser.ReadFields();
    }
}
You can remove the quotes manually:
var cleanFields = fields.Select(f => f.Trim(new[]{ ' ', '"' }));
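Putting the two steps together, a minimal self-contained sketch might look like the following. It reads from a TextReader instead of a file path so it can be tried on an in-memory string. The names QuoteTolerantCsv and ReadRows are hypothetical. Note the trade-off: with HasFieldsEnclosedInQuotes set to false, a comma inside a quoted cell splits that cell, and Trim also strips any quote that legitimately sits at the very start or end of a cell.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class QuoteTolerantCsv
{
    // Hypothetical helper: splits each line on commas only, then strips
    // surrounding spaces and quotes from every cell.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = false; // treat " as ordinary text
            parser.TrimWhiteSpace = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                // Remove the enclosing quotes by hand; inner quotes are kept.
                rows.Add(fields.Select(f => f.Trim(' ', '"')).ToArray());
            }
        }
        return rows;
    }
}
```

For example, calling QuoteTolerantCsv.ReadRows(new StringReader(File.ReadAllText("test.csv"))) on the question's file should yield three rows, the first cell of the first row keeping its inner quote.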