Hierarchical indexing based on sub level

Tags:

pandas

I have a DataFrame as follows. How can I select the rows whose second index level is in ['two', 'three']?

import numpy as np
from pandas import DataFrame, MultiIndex

# `codes` was named `labels` before pandas 0.24
index = MultiIndex(levels=[['foo', 'bar', 'baz', 'qux'],
                           ['one', 'two', 'three']],
                   codes=[[0, 0, 0, 1, 1, 2, 2, 3, 3, 3],
                          [0, 1, 2, 0, 1, 1, 2, 0, 1, 2]])
hdf = DataFrame(np.random.randn(10, 3), index=index,
                columns=['A', 'B', 'C'])

In [3]: hdf
Out[3]: 
                  A         B         C
foo one   -1.274689  0.946294 -0.149131
    two   -0.015483  1.630099  0.085461
    three  1.396752 -0.272583 -0.760000
bar one   -1.151217  1.269658  2.457231
    two   -1.657258 -1.271384 -2.429598
baz two    1.124609  0.138720 -1.994984
    three  0.124298 -0.127099 -0.409736
qux one    0.535038  1.139026  0.414842
    two    0.287724  0.461041 -0.268918
    three -0.259649  0.226574 -0.558334
asked Nov 03 '22 08:11 by user1907561


1 Answer

One way is to use the DataFrame's select method (deprecated in pandas 0.21 and removed in 1.0):

In [4]: hdf.select(lambda x: x[1] in ['two', 'three'])
Out[4]: 
                  A         B         C
foo two   -0.015483  1.630099  0.085461
    three  1.396752 -0.272583 -0.760000
bar two   -1.657258 -1.271384 -2.429598
baz two    1.124609  0.138720 -1.994984
    three  0.124298 -0.127099 -0.409736
qux two    0.287724  0.461041 -0.268918
    three -0.259649  0.226574 -0.558334
answered Nov 08 '22 06:11 by Andy Hayden
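
Since select is not available in pandas 1.0 and later, the same filter can probably be written as a boolean mask over the second index level. The following is a minimal sketch, not part of the original answer. The integer data and the name `wanted` are assumptions chosen so the result is reproducible; the index has the same entries as the one in the question.

```python
import numpy as np
import pandas as pd

# Same index entries as in the question; the values are arbitrary
# integers (an assumption) so the result is deterministic.
index = pd.MultiIndex.from_tuples([
    ("foo", "one"), ("foo", "two"), ("foo", "three"),
    ("bar", "one"), ("bar", "two"),
    ("baz", "two"), ("baz", "three"),
    ("qux", "one"), ("qux", "two"), ("qux", "three"),
])
hdf = pd.DataFrame(np.arange(30).reshape(10, 3),
                   index=index, columns=["A", "B", "C"])

wanted = ["two", "three"]  # hypothetical name for the labels to keep

# Option 1: pull out the second level's values and test membership.
by_mask = hdf[hdf.index.get_level_values(1).isin(wanted)]

# Option 2: ask the MultiIndex directly, restricting the test to level 1.
by_isin = hdf[hdf.index.isin(wanted, level=1)]

print(by_mask)
```

Both options keep the original row order and drop only the rows whose second-level label is 'one'.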