
Read csv file in R with double quotes

Tags: r, csv

Suppose I have a CSV file that looks like this:

Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE
A,3,"","I have comma, ha!",I have open double quotes",A,""

The desired output is:

df <- data.frame(Type='A', ID=3, NAME=NA, CONTENT='I have comma, ha!',
                 RESPONSE='I have open double quotes\"', GRADE='A', SOURCE=NA)
df
  Type ID NAME           CONTENT                   RESPONSE GRADE SOURCE
1    A  3   NA I have comma, ha! I have open double quotes"     A     NA

I tried read.csv. The data provider wraps strings that contain a comma in double quotes, but forgot to escape double quotes inside strings that contain no comma. As a result, I cannot get the desired output whether or not I disable quote in read.csv.

How can I do this in R? Other package solutions are also welcome.

Bamqf asked Aug 19 '15 19:08




2 Answers

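I'm not too sure about the structure of CSV files, but you said the author had escaped the comma in the text under CONTENT.

Disabling quote handling reads the text as is, keeping the " at the end of RESPONSE. Note that with quote="" the quote characters stay in the values and the comma inside CONTENT is also treated as a separator, so check the column alignment afterwards.

read.csv2("Test.csv", header = TRUE, sep = ",", quote = "")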
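One way to see what disabling quotes does to the sample is to count the fields on each line with base R's count.fields. This is only a diagnostic sketch on the two sample lines from the question, held in a character vector instead of Test.csv:

```r
# Sample lines from the question, kept in memory rather than read from a file.
lines <- c(
  'Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE',
  'A,3,"","I have comma, ha!",I have open double quotes",A,""'
)

# Count comma-separated fields per line with quote handling switched off.
con <- textConnection(lines)
n_fields <- count.fields(con, sep = ",", quote = "")
close(con)

print(n_fields)
```

If the data line reports more fields than the header, the comma inside CONTENT has been treated as a separator and the columns will not line up with the header.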
Buzz Lightyear answered Sep 22 '22 15:09


fread from data.table handles this just fine:

library(data.table)

fread('Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE
A,3,"","I have comma, ha!",I have open double quotes",A,""')
#   Type ID NAME           CONTENT                   RESPONSE GRADE SOURCE
#1:    A  3      I have comma, ha! I have open double quotes"     A       
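If data.table is not available, a similar result can be sketched in base R by splitting each line by hand. This is not how fread works internally; it is a minimal sketch under one assumed rule: a double quote opens a quoted field only when it is the first character of a field, the next double quote closes it, and any other double quote is kept as a literal character. The helper name split_csv_line is hypothetical, and the sketch does not handle doubled quotes ("") inside quoted fields or fields spanning several lines.

```r
# Hypothetical helper: split one CSV line, tolerating a stray quote mid-field.
split_csv_line <- function(line) {
  chars <- strsplit(line, "", fixed = TRUE)[[1]]
  fields <- character(0)
  current <- ""
  in_quotes <- FALSE
  at_field_start <- TRUE
  for (ch in chars) {
    if (ch == '"' && at_field_start) {
      in_quotes <- TRUE                 # opening quote of a quoted field
      at_field_start <- FALSE
    } else if (ch == '"' && in_quotes) {
      in_quotes <- FALSE                # closing quote
    } else if (ch == "," && !in_quotes) {
      fields <- c(fields, current)      # separator outside quotes ends the field
      current <- ""
      at_field_start <- TRUE
    } else {
      current <- paste0(current, ch)    # ordinary character, or a stray quote
      at_field_start <- FALSE
    }
  }
  c(fields, current)
}

lines <- c(
  'Type,ID,NAME,CONTENT,RESPONSE,GRADE,SOURCE',
  'A,3,"","I have comma, ha!",I have open double quotes",A,""'
)

rows <- lapply(lines, split_csv_line)
df <- as.data.frame(do.call(rbind, rows[-1]), stringsAsFactors = FALSE)
names(df) <- rows[[1]]
df[df == ""] <- NA                      # empty strings become NA, as in the question
df$ID <- as.integer(df$ID)
print(df)
```

For a real file, lines could come from readLines("Test.csv") instead of the in-memory vector. The character-by-character loop is slow on large inputs, so this is better suited to small files or to checking a few problem rows.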
eddi answered Sep 20 '22 15:09