
Saving and Loading of dataframe to csv results in Unnamed columns

Tags: python, pandas

The problem is in the title. Example:

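from pandas import DataFrame, read_csv

x = [('a', 'a', 'c') for i in range(5)]
df = DataFrame(x, columns=['col1', 'col2', 'col3'])
df.to_csv('test.csv')       # writes the index as an unnamed first column
df1 = read_csv('test.csv')  # reads that column back as "Unnamed: 0"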

   Unnamed: 0 col1 col2 col3
0           0    a    a    c
1           1    a    a    c
2           2    a    a    c
3           3    a    a    c
4           4    a    a    c

The reason seems to be that when a dataframe is saved, the index column is written too, with no name in the header. When the CSV is loaded again, the index column comes back as an unnamed column. Is this a bug? How can I avoid writing the index to the CSV, or drop unnamed columns when reading?

idoda asked Oct 17 '13 14:10



2 Answers

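You can remove the row labels via the index and index_label parameters of to_csv: passing index=False omits them entirely, while index_label gives the index column a header name.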
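
As a minimal sketch of that suggestion, the round trip below passes `index=False` so that no row-label column is written. It uses an in-memory `io.StringIO` buffer in place of `test.csv` (an assumption made only to keep the example self-contained); a file path should behave the same way.

```python
import io

import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])

# Write without the index; the buffer stands in for 'test.csv'.
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()

# Read it back: only the three data columns are present.
df1 = pd.read_csv(io.StringIO(csv_text))
print(df1.columns.tolist())
```

The trade-off is that the index is discarded, so this only suits frames whose index carries no information (such as the default 0..n-1 range here).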

Max answered Nov 08 '22 22:11



The round trip is not symmetric because the CSV format is ambiguous: nothing in the file marks the first column as an index rather than data. You need to specify index_col when reading the file back:

In [1]: x=[('a','a','c') for i in range(5)]

In [2]: df = pd.DataFrame(x,columns=['col1','col2','col3'])

In [3]: df.to_csv('test.csv')

In [4]: !cat test.csv
,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c

In [5]: pd.read_csv('test.csv',index_col=0)
Out[5]: 
  col1 col2 col3
0    a    a    c
1    a    a    c
2    a    a    c
3    a    a    c
4    a    a    c

If the index has a name, the file looks just like one whose first column is ordinary data, so on read-back there is no way to tell whether 'foo' is a column or an index:

In [6]: df.index.name = 'foo'

In [7]: df.to_csv('test.csv')

In [8]: !cat test.csv
foo,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c
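
One way to resolve the ambiguity, shown here as a rough sketch, is to tell `read_csv` explicitly which column holds the index. The in-memory `io.StringIO` buffer stands in for `test.csv` and is an assumption made only to keep the example self-contained.

```python
import io

import pandas as pd

x = [('a', 'a', 'c') for i in range(5)]
df = pd.DataFrame(x, columns=['col1', 'col2', 'col3'])
df.index.name = 'foo'

# The buffer stands in for 'test.csv'; its header line is "foo,col1,col2,col3".
buf = io.StringIO()
df.to_csv(buf)
csv_text = buf.getvalue()

# Default read: 'foo' is treated as an ordinary data column.
as_column = pd.read_csv(io.StringIO(csv_text))

# Explicit read: 'foo' is restored as the index.
as_index = pd.read_csv(io.StringIO(csv_text), index_col='foo')

print(as_column.columns.tolist())
print(as_index.index.name, as_index.columns.tolist())
```

Passing `index_col=0` selects the same column by position, which also covers the unnamed-index case.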

Jeff answered Nov 08 '22 22:11
