I am using CsvHelper to read and write data to a CSV file. Now I want to detect the delimiter of the CSV file. How can I get this, please?
My code:
var parser = new CsvParser(txtReader);
delimiter = parser.Configuration.Delimiter;
I always get "," as the delimiter, but the delimiter actually used in the CSV file is "\t".
Since I had to deal with the possibility that, depending on the user's localization settings, the CSV file (saved in MS Excel) could contain a different delimiter, I ended up with the following approach:
public static string DetectDelimiter(StreamReader reader)
{
    // assume one of the following delimiters
    var possibleDelimiters = new List<string> { ",", ";", "\t", "|" };

    var headerLine = reader.ReadLine();

    // reset the reader to its initial position for outside reuse,
    // e.g. CsvHelper won't find the header line if it has already been read here
    reader.BaseStream.Position = 0;
    reader.DiscardBufferedData();

    // an empty stream has no header line; fall back to the default
    if (headerLine == null)
    {
        return possibleDelimiters[0];
    }

    // return the first candidate that appears in the header line
    foreach (var possibleDelimiter in possibleDelimiters)
    {
        if (headerLine.Contains(possibleDelimiter))
        {
            return possibleDelimiter;
        }
    }

    return possibleDelimiters[0];
}
I also needed to reset the reader's read position, since it was the same instance I used in the CsvReader constructor.
The usage was then as follows:
using (var textReader = new StreamReader(memoryStream))
{
    var delimiter = DetectDelimiter(textReader);
    using (var csv = new CsvReader(textReader))
    {
        csv.Configuration.Delimiter = delimiter;
        // ... rest of the csv reader process
    }
}
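One limitation of the approach above is that it returns the first candidate found in the header, so a tab-delimited header that happens to contain a comma would be detected as comma-delimited. A minimal sketch of a count-based variant follows; the class and method names (DelimiterSketch, DetectByCount) are hypothetical, it ignores quoting, and the caller is still responsible for resetting the reader as shown above.

```csharp
using System;
using System.IO;
using System.Linq;

public static class DelimiterSketch
{
    // Hypothetical helper: returns the candidate that occurs most often in the first line.
    // Assumption: quoted fields are not handled; delimiters inside quotes are counted too.
    public static string DetectByCount(TextReader reader, string[] candidates)
    {
        var headerLine = reader.ReadLine();
        if (string.IsNullOrEmpty(headerLine))
        {
            return candidates[0]; // nothing to inspect, fall back to the first candidate
        }

        // Count occurrences of each candidate. OrderByDescending is a stable sort,
        // so ties are resolved in favor of the earlier candidate.
        var best = candidates
            .Select(c => new
            {
                Delimiter = c,
                Count = headerLine.Split(new[] { c }, StringSplitOptions.None).Length - 1
            })
            .OrderByDescending(x => x.Count)
            .First();

        return best.Count > 0 ? best.Delimiter : candidates[0];
    }
}
```

For a header such as "Last, First\tAge\tCity" with the candidates ",", ";", "\t" and "|", this picks the tab (two occurrences) over the comma (one occurrence).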
I found this piece of code on this site:
public static char Detect(TextReader reader, int rowCount, IList<char> separators)
{
    // one counter per candidate separator
    IList<int> separatorsCount = new int[separators.Count];

    int character;

    int row = 0;

    bool quoted = false;
    bool firstChar = true;

    while (row < rowCount)
    {
        character = reader.Read();

        switch (character)
        {
            case '"':
                if (quoted)
                {
                    if (reader.Peek() != '"') // Value is quoted and
                        // current character is " and next character is not ".
                        quoted = false;
                    else
                        reader.Read(); // Value is quoted and current and
                        // next characters are "" - read (skip) peeked quote.
                }
                else
                {
                    if (firstChar) // Set value as quoted only if this quote is the
                        // first char in the value.
                        quoted = true;
                }
                break;
            case '\n':
                if (!quoted)
                {
                    ++row;
                    firstChar = true;
                    continue;
                }
                break;
            case -1:
                // end of input: stop scanning
                row = rowCount;
                break;
            default:
                if (!quoted)
                {
                    int index = separators.IndexOf((char)character);
                    if (index != -1)
                    {
                        ++separatorsCount[index];
                        firstChar = true;
                        continue;
                    }
                }
                break;
        }

        if (firstChar)
            firstChar = false;
    }

    // Max() requires System.Linq
    int maxCount = separatorsCount.Max();

    return maxCount == 0 ? '\0' : separators[separatorsCount.IndexOf(maxCount)];
}
Here, separators is the list of possible separators that you can have.
Hope that helps :)
CSV is Comma Separated Values. I don't think you can reliably detect whether a different character is used as a separator. If there is a header row, then you might be able to count on it.
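If a header row is present, one way to gain some confidence in a guessed separator is to check that it splits the header and the following rows into the same number of fields. The following is only a sketch under that assumption: the names (DelimiterCheck, IsConsistent) are hypothetical, and it does not handle quoted fields that contain the separator or line breaks.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DelimiterCheck
{
    // Hypothetical helper: true if splitting every non-empty sampled line on the
    // delimiter yields the same field count, and that count is greater than one.
    // Assumption: quoting is ignored, so quoted delimiters can skew the result.
    public static bool IsConsistent(IEnumerable<string> lines, char delimiter)
    {
        var counts = lines
            .Where(l => l.Length > 0)
            .Select(l => l.Split(delimiter).Length)
            .Distinct()
            .ToList();

        return counts.Count == 1 && counts[0] > 1;
    }
}
```

For the sample lines "id\tname", "1\tSmith, John" and "2\tDoe, Jane", the tab passes this check while the comma does not, because the header contains no comma.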
You should know the separator that is used. You should be able to see it when opening the file. If the source of the files gives you a different separator each time and is not reliable, then I'm sorry. ;)
If you just want to parse using a different delimiter, then you can set csv.Configuration.Delimiter. http://joshclose.github.io/CsvHelper/#configuration-delimiter