
How to check whether a .csv file has a comma or a semicolon as separator?

I have to read in a lot of CSV files automatically. Some use a comma as the delimiter, in which case I use read.csv().

Others use a semicolon as the delimiter, in which case I use read.csv2().

I want to write a piece of code that recognizes whether the CSV file has a comma or a semicolon as a delimiter (before I read it) so that I don't have to change the code every time.

My approach would be something like this:

try to read.csv("xyz")
if error 
read.csv2("xyz")

Is something like that possible? Has somebody done this before? How can I check if there was an error without actually seeing it?

ValentinDarting asked Dec 14 '22 10:12


1 Answer

Here are a few approaches, assuming that the only difference between the file formats is whether the separator is a semicolon and the decimal mark is a comma, or the separator is a comma and the decimal mark is a point.

1) fread. As mentioned in the comments, fread in the data.table package automatically detects the separator (for common separators) and then reads the file using the separator it detected. It can also handle certain other format differences, such as automatically detecting whether the file has a header.
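A minimal sketch of approach (1), assuming the data.table package is installed. The two temporary files and their contents are made up for illustration; fread is called with its defaults so that it guesses the separator itself.

```r
library(data.table)

# Two throwaway files with the same content but different separators
# (hypothetical example data).
comma_file <- tempfile(fileext = ".csv")
semi_file  <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,a", "2,b"), comma_file)
writeLines(c("id;name", "1;a", "2;b"), semi_file)

# No sep argument is given, so fread() detects the separator on its own.
dt_comma <- fread(comma_file)
dt_semi  <- fread(semi_file)

# Both tables should have the same two columns.
print(names(dt_comma))
print(names(dt_semi))
```

If the files also differ in their decimal mark, it is worth checking the column types after reading, since this sketch only exercises separator detection.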

2) grepl. Read the first line, check whether it contains a semicolon, and then read the whole file with the matching function:

# Read only the first line, then choose the reader based on whether it contains a semicolon.
L <- readLines("myfile", n = 1)
if (grepl(";", L)) read.csv2("myfile") else read.csv("myfile")
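Since the question is about reading many files without editing the code each time, the same check can be wrapped in a function and applied to a whole directory. This is only a sketch: read_any_csv, demo_dir and the two demo files are hypothetical names and data introduced here.

```r
# Hypothetical helper: choose read.csv2() when the first line contains a
# semicolon, otherwise fall back to read.csv().
read_any_csv <- function(path) {
  first_line <- readLines(path, n = 1)
  if (grepl(";", first_line, fixed = TRUE)) read.csv2(path) else read.csv(path)
}

# Demo directory with one file in each format (made-up data).
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
writeLines(c("x,y", "1.5,a", "2.5,b"), file.path(demo_dir, "comma.csv"))
writeLines(c("x;y", "1,5;a", "2,5;b"), file.path(demo_dir, "semicolon.csv"))

# Read every .csv file in the directory with the same call.
files  <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read_any_csv)
names(tables) <- basename(files)

str(tables)
```

Because read.csv2 also treats the comma as the decimal mark, the x column should come out numeric in both tables.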

3) count.fields. Assume the separator is a semicolon and count the fields in the first line. If there is only one field, the file is comma-separated; otherwise it is semicolon-separated.

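# Count the fields in the first line as if the file were semicolon-separated.
L <- readLines("myfile", n = 1)
numfields <- count.fields(textConnection(L), sep = ";")
# A single field means no semicolon separator was found, so treat the file as comma-separated.
if (numfields == 1) read.csv("myfile") else read.csv2("myfile")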
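A possible variation on (3), not part of the original answer: count the fields under both candidate separators and keep the one that yields more. The helper name guess_sep and the demo files are hypothetical, and the sketch assumes the first line is a non-empty header.

```r
# Hypothetical helper: return the candidate separator that splits the first
# line into the most fields (ties go to the first candidate).
guess_sep <- function(path, candidates = c(",", ";")) {
  first_line <- readLines(path, n = 1)
  counts <- vapply(candidates, function(s) {
    con <- textConnection(first_line)
    on.exit(close(con))  # count.fields() does not close an already open connection
    count.fields(con, sep = s)
  }, integer(1))
  candidates[which.max(counts)]
}

# A comma-separated file whose header has a semicolon inside a quoted field.
tricky_file <- tempfile(fileext = ".csv")
writeLines(c('id,"note;extra",value', '1,"a;b",2.5'), tricky_file)

# A plain semicolon-separated file with decimal commas.
semi_file <- tempfile(fileext = ".csv")
writeLines(c("id;value", "1;2,5"), semi_file)

print(guess_sep(tricky_file))
print(guess_sep(semi_file))

# Pick the reader from the guessed separator.
df <- if (guess_sep(semi_file) == ";") read.csv2(semi_file) else read.csv(semi_file)
```

Comparing both counts is meant to cope with headers like the one in tricky_file, where a check for the mere presence of a semicolon would choose the wrong reader, because count.fields respects quoting.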

Update: Added (3) and made improvements to all three.

G. Grothendieck answered Jan 31 '23 00:01