I have a DataFrame with named rows and columns indexes:
import numpy as np
import pandas as pd
I = pd.Index(["a", "b", "c", "d"], name="rows")
C = pd.Index(["col0", "col1", "col2"], name="cols")
df = pd.DataFrame(data=np.random.rand(4, 3),
                  index=I,
                  columns=C)
I have tried to store it in several formats (Excel, CSV), but when re-reading the file the names are lost (maybe I have missed some options). Msgpack works, but it is marked as experimental, so I would prefer to avoid it for now. I would also prefer to avoid pickle. Is there any way (format and options) to store the names of the two indexes?
EDIT: I know how to write and read CSV with pandas. The problem is to save the name of the column index and of the row index.
You can use HDF5, which stores the axis names.
import numpy as np
import pandas as pd
I = pd.Index(["a", "b", "c", "d"], name="rows")
C = pd.Index(["col0", "col1", "col2"], name="columns")
df = pd.DataFrame(data=np.random.rand(4, 3), index=I, columns=C)
print(df)
columns      col0      col1      col2
rows                                 
a        0.098497  0.918954  0.642800
b        0.168266  0.678434  0.455059
c        0.434939  0.244027  0.599400
d        0.877356  0.053085  0.182661
df.to_hdf('test.hdf', key='test')
print(pd.read_hdf('test.hdf'))
columns      col0      col1      col2
rows                                 
a        0.098497  0.918954  0.642800
b        0.168266  0.678434  0.455059
c        0.434939  0.244027  0.599400
d        0.877356  0.053085  0.182661
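As a quick check (a minimal sketch; it assumes the optional PyTables package, which to_hdf requires, is installed), you can read the file back and confirm that both axis names survived:

# read the frame back and verify both axis names were preserved
df2 = pd.read_hdf('test.hdf', key='test')
assert df2.index.name == 'rows'
assert df2.columns.name == 'columns'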
You can export the DataFrame to a CSV file using .to_csv() and read it back in using pd.read_csv(). I extended the code you already had as follows:
#!/usr/bin/env python3
# coding: utf-8
import numpy as np
import pandas as pd
I = pd.Index(["a", "b", "c", "d"], name="rows")
C = pd.Index(["col0", "col1", "col2"], name="cols")
df = pd.DataFrame(data=np.random.rand(4, 3), index=I, columns=C)
# export DataFrame to csv
df.to_csv('out.csv')
# set index_col in order to read first column as indices
df_in = pd.read_csv('out.csv', index_col=0)
So the DataFrame df looks like this:
cols      col0      col1      col2
rows                              
a     0.590016  0.834033  0.535310
b     0.421589  0.897302  0.029500
c     0.373580  0.109005  0.239181
d     0.473872  0.075918  0.751628
The CSV file out.csv looks like this:
rows,col0,col1,col2
a,0.5900160748408918,0.8340332218911729,0.5353103406507513
b,0.42158899389955884,0.8973015040807538,0.029500416731096046
c,0.37357951184145965,0.10900495955642386,0.2391805787788026
d,0.47387186813644167,0.07591794371425187,0.7516279365972057
Reading the data back in leads to the DataFrame df_in as follows:
          col0      col1      col2
rows                              
a     0.590016  0.834033  0.535310
b     0.421589  0.897302  0.029500
c     0.373580  0.109005  0.239181
d     0.473872  0.075918  0.751628
So df_in contains exactly the same data as df, which shows that the export and the desired import work as expected. Because the index is named, to_csv writes "rows" as the header of the index column, so the row index name survives the round trip; only the columns name ("cols") is lost.
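You can verify this programmatically with pandas' testing helper (a small sketch; check_names=False is needed here because the columns name is not round-tripped):

# values and index are equal; axis names are deliberately ignored
pd.testing.assert_frame_equal(df, df_in, check_names=False)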
EDIT to export column and index names:
df.to_csv('out.csv', index_label=[df.index.name, df.columns.name])
However, this makes re-importing a bit difficult, since the columns name is added as an additional column. Normally this is useful for multi-indexed data, but here it leads to an additional, empty column.
So I would suggest exporting the row index name only (for a named index, passing index_label explicitly is equivalent to the default behaviour):
# export DataFrame to csv
df.to_csv('out.csv', index_label=df.index.name)
# set index_col in order to read first column as indices
df_in = pd.read_csv('out.csv', index_col=0)
which leads to df_in as:
          col0      col1      col2
rows                              
a     0.442467  0.959260  0.626502
b     0.639044  0.989795  0.853002
c     0.576137  0.350260  0.532920
d     0.235698  0.095978  0.194151
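If you also need the columns name back after the re-import, one option is to set it yourself after reading (a sketch; 'cols' is the name from the question, and rename_axis sets it on the columns axis):

# the columns name is not stored in the CSV, so restore it manually
df_in = pd.read_csv('out.csv', index_col=0).rename_axis(columns='cols')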
I do not know why you need to export the names of both index and columns. If you simply want to access the row or column labels, you can get them like this:
column_labels = df.columns.to_numpy()
>>> array(['col0', 'col1', 'col2'], dtype=object)
index_labels = df.index.to_numpy()
>>> array(['a', 'b', 'c', 'd'], dtype=object)