My code is
import numpy as np
import pandas as pd
ser_1 = pd.Series(np.random.randn(6))
ser_2 = pd.Series(np.random.randn(6))
ser_3 = pd.Series(np.random.randn(6))
df = pd.DataFrame(data={'Col1': ser_1, 'Col2': ser_2, 'Col3': ser_3})
df
It gives me a table of randomly generated numbers:
Col1 Col2 Col3
0 -0.594436 -0.014419 0.512523
1 0.208414 0.804857 0.261830
2 1.714547 -0.765586 -0.153386
3 -0.834847 -0.683258 -1.341085
4 2.726621 0.379711 -0.276410
5 0.151987 0.622103 0.966635
However, I would like to have labels for the rows instead of 0, 1, ..., 5. I tried:
df = pd.DataFrame(data={'Col1': ser_1, 'Col2': ser_2, 'Col3': ser_3}, index=['row0', 'row1', 'row2', 'row3', 'row4', 'row5', 'row6'])
But, as expected, it gives me NaNs:
Col1 Col2 Col3
row0 NaN NaN NaN
row1 NaN NaN NaN
row2 NaN NaN NaN
row3 NaN NaN NaN
row4 NaN NaN NaN
row5 NaN NaN NaN
row6 NaN NaN NaN
My question is: how can I label the rows without getting NaNs?
You can set the index directly:
In [11]: df.index = ['row0', 'row1', 'row2', 'row3', 'row4', 'row5']
In [12]: df
Out[12]:
Col1 Col2 Col3
row0 -1.094278 -0.689078 -0.465548
row1 1.555546 -0.388261 1.211150
row2 -0.143557 1.769561 -0.679080
row3 -0.064910 1.959216 0.227133
row4 -0.383729 0.113739 -0.954082
row5 0.434357 -0.646387 0.883319
Note: you can also do this with map (which is a little cleaner):
df.index = df.index.map(lambda x: 'row%s' % x)
...though I should say that this usually isn't something you need to do; keeping an integer index is A Good Thing™.
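For what it's worth, the NaNs in the question come from index alignment: when the dict values are Series and you also pass index=, pandas looks up each new label in the Series' own index (0..5), finds none of them, and fills with NaN. Here is a minimal sketch of that behaviour and of one way around it at construction time, by passing plain arrays instead of Series. The names labels, df_nan and df_ok are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical label list for the illustration: one label per row.
labels = ['row0', 'row1', 'row2', 'row3', 'row4', 'row5']

ser_1 = pd.Series(np.random.randn(6))
ser_2 = pd.Series(np.random.randn(6))
ser_3 = pd.Series(np.random.randn(6))

# Series + index=: each Series is reindexed to the new labels.
# None of 'row0'..'row5' exist in the integer index 0..5, so every cell is NaN.
df_nan = pd.DataFrame({'Col1': ser_1, 'Col2': ser_2, 'Col3': ser_3}, index=labels)

# Plain arrays + index=: there is nothing to align, so the labels are
# attached to the values positionally.
df_ok = pd.DataFrame(
    {'Col1': ser_1.to_numpy(), 'Col2': ser_2.to_numpy(), 'Col3': ser_3.to_numpy()},
    index=labels,
)
```

With arrays, the number of labels has to match the number of rows, otherwise the constructor raises an error instead of silently producing NaNs.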
A list comprehension would also work:
df.index = ['row{0}'.format(n) for n in range(df.index.shape[0])]
>>> df
Col1 Col2 Col3
row0 -1.213463 -1.331086 0.306792
row1 0.334060 -0.127397 -0.107466
row2 -0.893235 0.580098 -0.191778
row3 -0.663146 -1.269988 -1.303429
row4 0.418924 0.316321 -0.940015
row5 -0.082087 -1.893178 -1.809514
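If you would rather not overwrite df.index in place, DataFrame.rename also accepts a function for index= and returns a new frame. A small sketch, assuming a 6x3 frame like the one in the question; labeled is just an illustrative name:

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame from the question (random values, integer index).
df = pd.DataFrame(np.random.randn(6, 3), columns=['Col1', 'Col2', 'Col3'])

# rename applies the function to every index label and returns a new
# DataFrame; the original df keeps its integer index.
labeled = df.rename(index=lambda i: f'row{i}')
```

This keeps the integer-indexed df around for anything that relies on positions, while labeled carries the row names.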