
Quoting parameter in pandas read_csv()

Tags: python, pandas, csv

I'm using pandas.read_csv() and came across its quotechar and quoting parameters:

pandas.read_csv(filepath_or_buffer, sep=', ', quotechar='"', quoting=0)

What exactly do these parameters do? I checked the documentation, but I couldn't make sense of it.

gaurav1207 asked Apr 11 '17 11:04


1 Answer

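These parameters deal with string fields that contain the separator, such as "string,string". The effect is easiest to see when writing a frame out with to_csv: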

In [39]:
import numpy as np
import pandas as pd

data = {"strings": ["string", "string,string"],
        "int": np.arange(2),
        "float": np.random.randn(2)}

df = pd.DataFrame(data)
df

Out[39]:
      float  int        strings
0  0.116076    0         string
1 -0.316229    1  string,string

In [40]:    
df.to_csv(quotechar="'")

Out[40]:
",float,int,strings\n0,0.11607600924932446,0,string\n1,-0.31622948240636567,1,'string,string'\n"

You can see that the value string,string, which contains the separator, is quoted when written to CSV:

'string,string'

whilst the first value, a plain string with no comma, is left alone.
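Quoting only the fields that need it is what quoting=0 means: 0 is the value of csv.QUOTE_MINIMAL in Python's standard csv module. As a minimal sketch (the small frame below is made up for illustration and is not the one from the session above), it can be compared with csv.QUOTE_ALL, which quotes every field:

```python
import csv

import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({"strings": ["string", "string,string"], "int": [0, 1]})

# quoting=0 is csv.QUOTE_MINIMAL: quote only fields that contain the
# separator, the quote character, or a line break
minimal = df.to_csv(index=False, quoting=csv.QUOTE_MINIMAL)

# csv.QUOTE_ALL quotes every field, including the header and the numbers
quote_all = df.to_csv(index=False, quoting=csv.QUOTE_ALL)

print(minimal)
print(quote_all)
```

The remaining constants (csv.QUOTE_NONNUMERIC and csv.QUOTE_NONE) can be passed the same way.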

The default quotechar is the double quote:

In [41]:
df.to_csv()

Out[41]:
',float,int,strings\n0,0.11607600924932446,0,string\n1,-0.31622948240636567,1,"string,string"\n'

Here the entry containing the comma is written out as:

"string,string"
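The same parameters work in the other direction with read_csv, which is what the question asks about. The following is a rough sketch that assumes small in-memory CSV strings invented for illustration: quotechar tells the parser which character wraps a quoted field, and quoting=csv.QUOTE_NONE (3) tells it to treat quote characters as ordinary text.

```python
import csv
import io

import pandas as pd

# Hypothetical CSV text in which single quotes wrap the field with a comma
raw = "strings,int\nstring,0\n'string,string',1\n"

# quotechar="'" makes the parser keep 'string,string' together as one field
df = pd.read_csv(io.StringIO(raw), quotechar="'")
values = df["strings"].tolist()
print(values)

# Hypothetical CSV text with a double-quoted field and no embedded comma
quoted = 'strings,int\n"string",0\n'

# Default quoting (0, csv.QUOTE_MINIMAL): the surrounding quotes are stripped
stripped = pd.read_csv(io.StringIO(quoted))

# csv.QUOTE_NONE: quotes are not special, so they stay in the value
kept = pd.read_csv(io.StringIO(quoted), quoting=csv.QUOTE_NONE)

print(stripped["strings"].iloc[0])
print(kept["strings"].iloc[0])
```

If the quotechar passed to read_csv does not match the one used in the file, the embedded comma is treated as a separator and the row no longer has the expected number of fields.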
EdChum answered Oct 27 '22 12:10