
Set a value in a pandas MultiIndex DataFrame

I'm a newbie to both Python and Pandas.

I am trying to construct a dataframe, and then later populate it with values.

I have constructed my DataFrame as follows:

from pandas import *

ageMin = 21
ageMax = 31
ageStep = 2

bins_sumins = [0, 10000, 20000]
bins_age = list(range(ageMin, ageMax, ageStep))
indeks_sex = ['M', 'F']
indeks_age  =  ['[{0}-{1})'.format(bins_age[i-1], bins_age[i]) for i in range(1, len(bins_age))]
indeks_sumins = ['[{0}-{1})'.format(bins_sumins[i-1], bins_sumins[i]) for i in range(1, len(bins_sumins))]
indeks = MultiIndex.from_product([indeks_age, indeks_sex, indeks_sumins], names=['Age', 'Sex', 'Sumins'])

cols = ['A', 'B', 'C', 'D']

df = DataFrame(data = 0, index = indeks, columns = cols)

So far all is well. I am able to assign a value to a whole group of rows at once:

>>> df['A']['[21-23)']['M'] = 1
>>> df
                           A  B  C  D
Age     Sex Sumins                   
[21-23) M   [0-10000)      1  0  0  0
            [10000-20000)  1  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[23-25) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[25-27) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[27-29) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0

However, setting the value at a single position does not work; the assignment below leaves the DataFrame unchanged:

>>> df['B']['[21-23)']['M']['[10000-20000)'] = 2
>>> df
                           A  B  C  D
Age     Sex Sumins                   
[21-23) M   [0-10000)      1  0  0  0
            [10000-20000)  1  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[23-25) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[25-27) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[27-29) M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[16 rows x 4 columns]

What is going on here? I am open to the idea that I have completely misunderstood how multi-indexing works. Anyone?

asked Apr 16 '14 12:04 by mortysporty




1 Answer

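First off, have a look at the pandas docs on chained indexing. An expression such as df['B'][...][...][...] = 2 is evaluated as several separate indexing calls, so the final assignment may land on a temporary copy rather than on df itself.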
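To make the chained-indexing point concrete, here is a minimal sketch. It rebuilds a smaller version of the asker's frame (two age bands instead of four), and the explicit `.copy()` is only a stand-in, assumed for illustration, for an intermediate object that happens to be a copy; whether a real chained expression returns a view or a copy depends on the pandas version.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["[21-23)", "[23-25)"], ["M", "F"], ["[0-10000)", "[10000-20000)"]],
    names=["Age", "Sex", "Sumins"],
)
df = pd.DataFrame(0, index=index, columns=["A", "B", "C", "D"]).sort_index()
key = ("[21-23)", "M", "[10000-20000)")

# Stand-in for the temporary object a chained expression may produce.
tmp = df["B"].copy()
tmp.loc[key] = 2
untouched_total = int(df["B"].sum())  # df itself has not been modified

# One .loc call: full row key as a tuple, column label second.
df.loc[key, "B"] = 2
print(untouched_total, df.loc[key, "B"])
```

Because the row key and the column label are handed to a single `.loc` call, pandas performs one assignment on `df` and no intermediate object is involved.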

Second, read the docs section on why a MultiIndex needs to be sorted before you index into it by label.
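A quick way to see whether a MultiIndex is sorted is sketched below, again on a reduced copy of the asker's frame. It uses `is_monotonic_increasing` as the check, which is one option in current pandas releases and not necessarily what was available when this answer was written.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["[21-23)", "[23-25)"], ["M", "F"], ["[0-10000)", "[10000-20000)"]],
    names=["Age", "Sex", "Sumins"],
)
df = pd.DataFrame(0, index=index, columns=["A", "B", "C", "D"])

# from_product keeps the given order, so "M" comes before "F" here.
sorted_before = df.index.is_monotonic_increasing

# sort_index() orders the rows lexicographically across all levels.
df = df.sort_index()
sorted_after = df.index.is_monotonic_increasing

print(sorted_before, sorted_after)
```

After sorting, 'F' rows precede 'M' rows within each age band, which is why the row order in the answer's output differs from the one in the question.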

That will get you to this solution:

In [46]: df = df.sort_index()

In [47]: df.loc['[21-23)', 'M', '[10000-20000)'] = 2

In [48]: df
Out[48]: 
                           A  B  C  D
Age     Sex Sumins                   
[21-23) F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        M   [0-10000)      0  0  0  0
            [10000-20000)  2  2  2  2
[23-25) F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[25-27) F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
[27-29) F   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0
        M   [0-10000)      0  0  0  0
            [10000-20000)  0  0  0  0

[16 rows x 4 columns]
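Note that the row-only `.loc` key above sets all four columns of that row to 2, whereas the question was after column B alone. The sketch below contrasts the two forms on fresh copies of a reduced frame; `make_frame` is a hypothetical helper introduced only for this illustration.

```python
import pandas as pd

def make_frame():
    """Hypothetical helper: a smaller, sorted version of the asker's frame."""
    index = pd.MultiIndex.from_product(
        [["[21-23)", "[23-25)"], ["M", "F"], ["[0-10000)", "[10000-20000)"]],
        names=["Age", "Sex", "Sumins"],
    )
    return pd.DataFrame(0, index=index, columns=["A", "B", "C", "D"]).sort_index()

key = ("[21-23)", "M", "[10000-20000)")

# Row key only (written explicitly with ":" for the columns): whole row is set.
row_wise = make_frame()
row_wise.loc[key, :] = 2

# Row key plus column label: only the single cell in column B is set.
cell_wise = make_frame()
cell_wise.loc[key, "B"] = 2

print(row_wise.loc[[key]])
print(cell_wise.loc[[key]])
```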

pandas 0.14 will add some additional ways of slicing a MultiIndex.
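For readers on pandas 0.14 or later, one of those additions is `pd.IndexSlice`, which lets a single `.loc` call select across levels. The following is a minimal sketch on the asker's frame, with the age bands written out literally; the values 5 and 7 are arbitrary and chosen only for illustration. The index has to be sorted first, as discussed above.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [
        ["[21-23)", "[23-25)", "[25-27)", "[27-29)"],
        ["M", "F"],
        ["[0-10000)", "[10000-20000)"],
    ],
    names=["Age", "Sex", "Sumins"],
)
df = pd.DataFrame(0, index=index, columns=["A", "B", "C", "D"]).sort_index()

idx = pd.IndexSlice

# Every age band, Sex == "M", every sum-insured band; column C only.
df.loc[idx[:, "M", :], "C"] = 5

# Age bands from "[21-23)" through "[23-25)" (label slices are inclusive),
# both sexes, lowest sum-insured band; column D only.
df.loc[idx["[21-23)":"[23-25)", :, "[0-10000)"], "D"] = 7

print(df)
```

Passing the column label as the second argument to `.loc` keeps the assignment confined to one column, so the other columns stay at 0.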

answered Sep 18 '22 15:09 by TomAugspurger