
Pandas dataframe - how to assign index?

Tags:

python

pandas

My code is

import numpy as np
import pandas as pd
ser_1 = pd.Series(np.random.randn(6))
ser_2 = pd.Series(np.random.randn(6))
ser_3 = pd.Series(np.random.randn(6))
df = pd.DataFrame(data= {'Col1': ser_1, 'Col2': ser_2, 'Col3':ser_3 } ,  )
df

It gives me a table of randomly generated numbers:

    Col1    Col2    Col3
0   -0.594436   -0.014419   0.512523
1   0.208414    0.804857    0.261830
2   1.714547    -0.765586   -0.153386
3   -0.834847   -0.683258   -1.341085
4   2.726621    0.379711    -0.276410
5   0.151987    0.622103    0.966635

However, I would like to label the rows instead of using 0, 1, ..., 5. I tried:

df = pd.DataFrame(data= {'Col1': ser_1, 'Col2': ser_2, 'Col3':ser_3 } , index=['row0', 'row1', 'row2', 'row3', 'row4', 'row5', 'row6'] )

But, as expected, it gives me NaNs:

    Col1    Col2    Col3
row0    NaN     NaN     NaN
row1    NaN     NaN     NaN
row2    NaN     NaN     NaN
row3    NaN     NaN     NaN
row4    NaN     NaN     NaN
row5    NaN     NaN     NaN
row6    NaN     NaN     NaN

How can I label the rows without getting NaNs?

user5331677 asked Dec 05 '25 03:12


2 Answers

You can set the index directly:

In [11]: df.index = ['row0', 'row1', 'row2', 'row3', 'row4', 'row5']

In [12]: df
Out[12]:
          Col1      Col2      Col3
row0 -1.094278 -0.689078 -0.465548
row1  1.555546 -0.388261  1.211150
row2 -0.143557  1.769561 -0.679080
row3 -0.064910  1.959216  0.227133
row4 -0.383729  0.113739 -0.954082
row5  0.434357 -0.646387  0.883319
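For context on why the constructor call in the question produces NaNs: when the data are Series, pandas aligns them on their own labels (0 to 5 here), and none of those match 'row0', 'row1', and so on. The following is a minimal sketch, assuming you would rather label the rows at construction time; it passes the underlying arrays instead of the Series, so values are placed by position. The names `labels`, `aligned` and `positional` are illustrative and not from the question.

```python
import numpy as np
import pandas as pd

ser_1 = pd.Series(np.random.randn(6))
ser_2 = pd.Series(np.random.randn(6))
ser_3 = pd.Series(np.random.randn(6))

# Exactly one label per row (six here); the names are illustrative.
labels = ['row%d' % i for i in range(6)]

# Series are aligned on their own index (0..5), which shares nothing
# with the string labels, so every cell becomes NaN.
aligned = pd.DataFrame(
    {'Col1': ser_1, 'Col2': ser_2, 'Col3': ser_3}, index=labels)

# Plain NumPy arrays carry no index, so the values are placed by position.
positional = pd.DataFrame(
    {'Col1': ser_1.to_numpy(), 'Col2': ser_2.to_numpy(), 'Col3': ser_3.to_numpy()},
    index=labels)

print(aligned.isna().all().all())
print(positional)
```

With plain arrays the label list must have the same length as the data; a list with a different number of labels raises a ValueError rather than filling in NaNs.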

Note: you can also do this with map (which is a little cleaner):

df.index = df.index.map(lambda x: 'row%s' % x)

...though I should say that this isn't something you usually need to do; keeping an integer index is A Good Thing™.
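If you do want labels but would like to keep the original integer-indexed frame around, one option is `DataFrame.rename`, which accepts a function for the index and returns a new DataFrame. This is only a sketch of that alternative; the name `labelled` and the way `df` is built here are illustrative.

```python
import numpy as np
import pandas as pd

# Stand-in for the df from the question: 6 rows, 3 columns of random numbers.
df = pd.DataFrame(np.random.randn(6, 3), columns=['Col1', 'Col2', 'Col3'])

# rename applies the function to each index label and returns a copy,
# so df itself keeps its integer index.
labelled = df.rename(index=lambda i: 'row%s' % i)

print(labelled.index.tolist())
print(df.index.tolist())
```

Assigning to `df.index` changes the frame in place, whereas `rename` leaves the original untouched, which may be convenient if the integer index is still needed elsewhere.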

Andy Hayden answered Dec 07 '25 19:12


A list comprehension would also work:

df.index = ['row{0}'.format(n) for n in range(df.index.shape[0])]

>>> df
          Col1      Col2      Col3
row0 -1.213463 -1.331086  0.306792
row1  0.334060 -0.127397 -0.107466
row2 -0.893235  0.580098 -0.191778
row3 -0.663146 -1.269988 -1.303429
row4  0.418924  0.316321 -0.940015
row5 -0.082087 -1.893178 -1.809514

Alexander answered Dec 07 '25 20:12


