
Convert string to integer pandas dataframe index

I have a pandas dataframe with a MultiIndex. Unfortunately, one of the index levels gives the years as strings,

e.g. '2010', '2011'.

How do I convert these to integers?

More concretely:

MultiIndex(levels=[[u'2010', u'2011'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...]],
           names=[u'Year', u'Month'])


df_cbs_prelim_total.index.set_levels(df_cbs_prelim_total.index.get_level_values(0).astype('int'))

seems to do it, but not in place. Is there a proper way of changing them?

Cheers, Mike

Mike asked Oct 19 '22 21:10



1 Answer

It will probably be cleaner to do this before you assign it as the index (as @EdChum points out), but when you already have it as the index, you can indeed use set_levels to replace the values of one level of your multi-index. A bit cleaner than your code (you can use index.levels[..]):

In [165]: idx = pd.MultiIndex.from_product([[1,2,3], ['2011','2012','2013']])

In [166]: idx
Out[166]:
MultiIndex(levels=[[1, 2, 3], [u'2011', u'2012', u'2013']],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])

In [167]: idx.levels[1]
Out[167]: Index([u'2011', u'2012', u'2013'], dtype='object')    

In [168]: idx = idx.set_levels(idx.levels[1].astype(int), level=1)

In [169]: idx
Out[169]:
MultiIndex(levels=[[1, 2, 3], [2011, 2012, 2013]],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])

You have to reassign it to save the changes (as is done above). In your case this would be df_cbs_prelim_total.index = df_cbs_prelim_total.index.set_levels(...).
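
For completeness, here is a rough sketch of both routes on a small DataFrame. The frame `df`, the `value` column and the `flat` table are made up for illustration and only stand in for df_cbs_prelim_total; the Year/Month level names are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for df_cbs_prelim_total: 'Year' is stored as strings.
idx = pd.MultiIndex.from_product(
    [['2010', '2011'], [0, 1, 2]], names=['Year', 'Month']
)
df = pd.DataFrame({'value': list(range(6))}, index=idx)

# Route 1: the index already exists. Cast the unique values of level 0
# and reassign the result, since set_levels returns a new MultiIndex.
df.index = df.index.set_levels(df.index.levels[0].astype(int), level=0)

# Route 2: cast the column before it becomes part of the index.
flat = pd.DataFrame({
    'Year': ['2010', '2010', '2011'],
    'Month': [0, 1, 0],
    'value': [1, 2, 3],
})
flat['Year'] = flat['Year'].astype(int)
df2 = flat.set_index(['Year', 'Month'])
```

After either route, lookups use integer years, e.g. df.loc[(2010, 0)] instead of df.loc[('2010', 0)].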

joris answered Oct 22 '22 23:10
