
How to read a semi-colon separated file with double quotes in the columns?

Tags: r, csv

I have a semicolon-separated file that I want to read. The data in the file is given below. In the 4th row, the Comment value contains double quotes and a semicolon, but I still want that row to be read as only 4 columns.

But I'm failing to do that in R.

ID;Comment;Date;Amt
1;Hello;5-06-2003;85.13
2;World;5-06-2013;127.39
3;Airlines;5-06-1999;148.34
4;"Air"l;ine"s";5-09-2013;87.94

data <- read.table(fileName, header = TRUE, sep = ";", quote = "\"",
                   na.strings = c("", ".", "-", "NA"))

The above code does not work. Can anyone help?

G_1991 asked Nov 21 '25 06:11


2 Answers

fread from the data.table package, which can handle such "exceptions" quite nicely, would be one way to solve this.

data.table::fread("file.txt")
   ID     Comment      Date    Amt
1:  1       Hello 5-06-2003  85.13
2:  2       World 5-06-2013 127.39
3:  3    Airlines 5-06-1999 148.34
4:  4 Air"l;ine"s 5-09-2013  87.94
plastikdusche answered Nov 23 '25 20:11


Another way is to use some delicious regex:

path <- tempfile()
writeLines('ID;Comment;Date;Amt
1;Hello;5-06-2003;85.13
2;World;5-06-2013;127.39
3;Airlines;5-06-1999;148.34
4;"Air"l;ine"s";5-09-2013;87.94', path)


(rl <- scan(path, what = ''))

read.table(text = gsub('^(\\w+);(.*?);(Date|[-0-9]+);(Amt|[0-9.]+)$',
                       '\\1 \\2 \\3 \\4', rl),
           quote = '', header = TRUE, stringsAsFactors = FALSE)

#   ID       Comment      Date    Amt
# 1  1         Hello 5-06-2003  85.13
# 2  2         World 5-06-2013 127.39
# 3  3      Airlines 5-06-1999 148.34
# 4  4 "Air"l;ine"s" 5-09-2013  87.94
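To see what the substitution does before read.table parses the result, here is a minimal sketch that applies the same pattern to the problematic fourth line on its own. It passes perl = TRUE, which is an addition here and not part of the call above, so that the lazy `.*?` is handled by PCRE. The names `row4`, `pattern`, and `spaced` are only illustrative.

```r
# The fourth data line from the question, as a plain string.
row4 <- '4;"Air"l;ine"s";5-09-2013;87.94'

# Same pattern as in the read.table() call: ID, lazy Comment, Date, Amt.
pattern <- '^(\\w+);(.*?);(Date|[-0-9]+);(Amt|[0-9.]+)$'

# Replace only the three delimiting semicolons with spaces.
spaced <- sub(pattern, '\\1 \\2 \\3 \\4', row4, perl = TRUE)

writeLines(spaced)
# 4 "Air"l;ine"s" 5-09-2013 87.94
```

The semicolon inside the Comment value survives because the lazy group has to keep growing until a date-like field and a number-like field follow it. Since the fields end up separated by spaces, this only works as long as no field contains whitespace.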

And a simplified version gives the same thing:

read.table(text = gsub('^(.*?);(.*);(.*?);(.*?)$',
                       '\\1 \\2 \\3 \\4', rl),
           quote = '', header = TRUE, stringsAsFactors = FALSE)
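The same idea can also be written without a regular expression: if only the Comment column can contain stray semicolons, split each line on ";", keep the first piece and the last two, and glue everything in between back together. This is a minimal base-R sketch, assuming every line has at least four fields and a non-empty final field. `split_row`, `raw_lines`, `fields`, and `parsed` are hypothetical names, not part of the code above.

```r
# Recreate the sample data from the question in a temporary file.
path <- tempfile()
writeLines(c('ID;Comment;Date;Amt',
             '1;Hello;5-06-2003;85.13',
             '2;World;5-06-2013;127.39',
             '3;Airlines;5-06-1999;148.34',
             '4;"Air"l;ine"s";5-09-2013;87.94'), path)

# Hypothetical helper: split one line on ";" and rejoin any surplus
# middle pieces as the Comment field.
# Assumes at least four pieces and a non-empty last field.
split_row <- function(line) {
  parts <- strsplit(line, ";", fixed = TRUE)[[1]]
  n <- length(parts)
  c(parts[1],
    paste(parts[2:(n - 2)], collapse = ";"),
    parts[n - 1],
    parts[n])
}

raw_lines <- readLines(path)

# One row per input line, four columns; the first row is the header.
fields <- t(vapply(raw_lines, split_row, character(4), USE.NAMES = FALSE))

parsed <- data.frame(
  ID      = as.integer(fields[-1, 1]),
  Comment = fields[-1, 2],
  Date    = fields[-1, 3],
  Amt     = as.numeric(fields[-1, 4]),
  stringsAsFactors = FALSE
)

print(parsed)
```

Unlike the space-substitution approach, this does not depend on the fields being free of whitespace, but it does depend on the Comment column being the only one that can contain extra semicolons.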
rawr answered Nov 23 '25 21:11


