The problem is in the title. Example:
from pandas import DataFrame, read_csv
x = [('a', 'a', 'c') for i in range(5)]
df = DataFrame(x, columns=['col1', 'col2', 'col3'])
df.to_csv('test.csv')
df1 = read_csv('test.csv')
Unnamed: 0 col1 col2 col3
0 0 a a c
1 1 a a c
2 2 a a c
3 3 a a c
4 4 a a c
The reason seems to be that when a DataFrame is saved, the index column is written as well, with no name in the header. When the CSV is loaded again, the index comes back as an unnamed column. Is this a bug? How can I avoid writing the index to the CSV, or drop unnamed columns when reading?
You can remove row labels via the index and index_label parameters of to_csv.
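As a minimal sketch of the two options asked about in the question: passing index=False to to_csv skips the row labels entirely, and for a file that already contains them, the auto-generated "Unnamed: N" columns can be filtered out after reading. The in-memory io.StringIO buffer and the variable names are only assumptions for illustration; they stand in for test.csv.

```python
import io

import pandas as pd

df = pd.DataFrame([("a", "a", "c") for _ in range(5)],
                  columns=["col1", "col2", "col3"])

# Option 1: do not write the index at all.
csv_without_index = df.to_csv(index=False)
df_clean = pd.read_csv(io.StringIO(csv_without_index))

# Option 2: the file already contains the unnamed index column,
# so drop every column whose header was filled in as "Unnamed: N".
csv_with_index = df.to_csv()
df_raw = pd.read_csv(io.StringIO(csv_with_index))
df_dropped = df_raw.loc[:, ~df_raw.columns.str.startswith("Unnamed")]

print(list(df_clean.columns))
print(list(df_raw.columns))
print(list(df_dropped.columns))
```

Option 1 is the simpler choice when the index is just the default 0..n-1 range and carries no information.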
Writing and reading are not symmetric, because the CSV format itself is ambiguous: nothing in the file marks the first column as an index rather than as data. You therefore need to specify an index_col on read-back:
In [1]: x=[('a','a','c') for i in range(5)]
In [2]: df = DataFrame(x,columns=['col1','col2','col3'])
In [3]: df.to_csv('test.csv')
In [4]: !cat test.csv
,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c
In [5]: pd.read_csv('test.csv',index_col=0)
Out[5]:
col1 col2 col3
0 a a c
1 a a c
2 a a c
3 a a c
4 a a c
Naming the index shows the ambiguity. The file written below looks very similar to the one above, so is 'foo' a column or an index?
In [6]: df.index.name = 'foo'
In [7]: df.to_csv('test.csv')
In [8]: !cat test.csv
foo,col1,col2,col3
0,a,a,c
1,a,a,c
2,a,a,c
3,a,a,c
4,a,a,c
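To make the ambiguity concrete, here is a small sketch that reads the same CSV text twice, once with default arguments and once with index_col='foo'. The io.StringIO buffer and the variable names are assumptions for illustration, used in place of test.csv.

```python
import io

import pandas as pd

df = pd.DataFrame([("a", "a", "c") for _ in range(5)],
                  columns=["col1", "col2", "col3"])
df.index.name = "foo"
text = df.to_csv()

# Without index_col, 'foo' is read back as an ordinary data column.
as_column = pd.read_csv(io.StringIO(text))

# With index_col, the same text yields 'foo' as the index again.
as_index = pd.read_csv(io.StringIO(text), index_col="foo")

print(list(as_column.columns))
print(as_index.index.name, list(as_index.columns))
```

The file itself is identical in both cases; only the reader's index_col argument decides how the first column is interpreted.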