How to get a random (bootstrap) sample from pandas multiindex

I'm trying to create a bootstrapped sample from a multiindex dataframe in Pandas. Below is some code to generate the kind of data I need.

import pandas as pd
import numpy as np

# Six rows; group1 becomes the outer index level, group2 the inner one
df = pd.DataFrame({'group1': [1, 1, 1, 2, 2, 3],
                   'group2': [13, 18, 20, 77, 109, 123],
                   'value1': [1.1, 2, 3, 4, 5, 6],
                   'value2': [7.1, 8, 9, 10, 11, 12]})
df = df.set_index(['group1', 'group2'])

print(df)

The df dataframe looks like:

                   value1  value2
group1 group2                
1      13         1.1     7.1
       18         2.0     8.0
       20         3.0     9.0
2      77         4.0    10.0
       109        5.0    11.0
3      123        6.0    12.0

I want to draw a random sample, with replacement, from the first index level. For example, suppose np.random.randint(1, 4, size=3) produces [3, 2, 2]. I'd like the resulting dataframe to look like:

                   value1  value2
group1 group2                
3      123        6.0    12.0
2      77         4.0    10.0
       109        5.0    11.0
2      77         4.0    10.0
       109        5.0    11.0

I've spent a lot of time researching this, and I've been unable to find a similar example where the multiindex values are integers, the secondary index is of variable length, and the sampled primary index values can repeat. As far as I can tell, this is how bootstrapping should work for this kind of data.

Chris asked Aug 02 '16 23:08



1 Answer

Try:

df.unstack().sample(3, replace=True).stack()
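
If the reshape round trip is not wanted, a more explicit alternative is to draw first-level labels with replacement and concatenate the matching row blocks. The following is a minimal sketch, not part of the original answer: the helper names take_groups and bootstrap_groups and the seed parameter are hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group1': [1, 1, 1, 2, 2, 3],
                   'group2': [13, 18, 20, 77, 109, 123],
                   'value1': [1.1, 2, 3, 4, 5, 6],
                   'value2': [7.1, 8, 9, 10, 11, 12]})
df = df.set_index(['group1', 'group2'])


def take_groups(frame, labels):
    """Stack the row block of each first-level label, in the order given.

    Hypothetical helper: a label that appears twice contributes its rows twice.
    """
    # frame.loc[[label]] selects on the first level and keeps both index levels
    return pd.concat([frame.loc[[label]] for label in labels])


def bootstrap_groups(frame, n=None, seed=None):
    """Draw n first-level labels with replacement and return their rows.

    Hypothetical helper: n defaults to the number of distinct first-level labels.
    """
    rng = np.random.default_rng(seed)
    labels = frame.index.get_level_values(0).unique()
    n = len(labels) if n is None else n
    drawn = rng.choice(labels, size=n, replace=True)
    return take_groups(frame, drawn)


# The draw [3, 2, 2] from the question, passed in explicitly
print(take_groups(df, [3, 2, 2]))

# A random bootstrap sample of three groups
print(bootstrap_groups(df, n=3, seed=0))
```

Keeping the draw separate from the row selection makes it possible to check the selection against a known draw such as [3, 2, 2] before adding randomness.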

piRSquared answered Nov 15 '22 09:11