How can I detect the CSV separator from a string in JavaScript/Node.js?
Is there a standard algorithm?
Note that the separator is not always a comma. The most common separators are the semicolon (;), the comma (,), and the tab (\t).
A possible algorithm for getting the likely separator(s) is pretty simple, and assumes the data is well-formed:
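For each candidate delimiter, split every line by that delimiter and check the length of the resulting array. If a line's .length is not equal to the length seen on the earlier lines, this is not a valid delimiter.
Proof of concept (doesn't handle quoted fields):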
function guessDelimiters (text, possibleDelimiters) {
    return possibleDelimiters.filter(weedOut);

    function weedOut (delimiter) {
        var cache = -1; // field count of the first non-empty line
        // Accept both \n and \r\n line endings.
        return text.split(/\r?\n/).every(checkLength);

        function checkLength (line) {
            if (!line) {
                return true; // ignore empty lines
            }
            var length = line.split(delimiter).length;
            if (cache < 0) {
                cache = length;
            }
            // Every line must yield the same count, and more than one field.
            return cache === length && length > 1;
        }
    }
}
The length > 1 check is to make sure the split didn't just return the whole line. Note that this returns an array of possible delimiters - if there's more than one item, you have an ambiguity problem.
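The two limitations mentioned above (quoted fields and ambiguity) can be softened with a small variation. What follows is only a sketch under stated assumptions, not a standard algorithm: countFields and rankDelimiters are hypothetical names, delimiters are assumed to be single characters, quoting is assumed to use double quotes with "" as the escape, and quoted fields are assumed not to contain line breaks. Preferring the delimiter that yields the most fields is a heuristic chosen for illustration.

```javascript
// Hypothetical helper: count the fields in one line, ignoring any
// delimiter that appears inside a double-quoted field.
function countFields(line, delimiter) {
  let fields = 1;
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '"') {
      // An escaped quote ("") toggles twice, leaving the state unchanged.
      inQuotes = !inQuotes;
    } else if (ch === delimiter && !inQuotes) {
      fields++;
    }
  }
  return fields;
}

// Hypothetical helper: keep the delimiters that give the same field count
// (greater than 1) on every non-empty line, ordered by field count,
// highest first (an assumed tie-breaking heuristic).
function rankDelimiters(text, candidates) {
  const lines = text.split(/\r?\n/).filter((line) => line !== '');
  return candidates
    .map((delimiter) => {
      const counts = lines.map((line) => countFields(line, delimiter));
      const first = counts[0];
      const consistent =
        counts.length > 0 && first > 1 && counts.every((c) => c === first);
      return { delimiter, fields: consistent ? first : 0 };
    })
    .filter((result) => result.fields > 0)
    .sort((a, b) => b.fields - a.fields)
    .map((result) => result.delimiter);
}

const quoted = 'name;note\n"Doe, John";"a;b"\n"Roe, Jane";c';
console.log(rankDelimiters(quoted, [',', ';', '\t'])); // → [ ';' ]
```

With the sample above, a plain split on ; would see three pieces on the second line and reject the semicolon, while the quote-aware count sees two fields on every line. When several delimiters survive, the first element of the returned array is only the most likely guess, not a certainty.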