
Select only one index of multiindex DataFrame

I am trying to create a new DataFrame using only one index from a multi-indexed DataFrame.

                     A         B         C
first second
bar   one     0.895717  0.410835 -1.413681
      two     0.805244  0.813850  1.607920
baz   one    -1.206412  0.132003  1.024180
      two     2.565646 -0.827317  0.569605
foo   one     1.431256 -0.076467  0.875906
      two     1.340309 -1.187678 -2.211372
qux   one    -1.170299  1.130127  0.974466
      two    -0.226169 -1.436737 -2.006747

Ideally, I would like something like this:

In: df.ix[level="first"] 

and:

Out:
              A         B         C
first
bar    0.895717  0.410835 -1.413681
       0.805244  0.813850  1.607920
baz   -1.206412  0.132003  1.024180
       2.565646 -0.827317  0.569605
foo    1.431256 -0.076467  0.875906
       1.340309 -1.187678 -2.211372
qux   -1.170299  1.130127  0.974466
      -0.226169 -1.436737 -2.006747

Essentially, I want to drop every level of the MultiIndex except first. Is there an easy way to do this?

Skorpeo asked Jan 25 '15 19:01





2 Answers

One way could be to simply rebind df.index to the desired level of the MultiIndex. You can do this by specifying the label name you want to keep:

df.index = df.index.get_level_values('first') 

or use the level's integer position:

df.index = df.index.get_level_values(0) 

All other levels of the MultiIndex would disappear here.
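
As a minimal sketch of the approach above, the following builds a frame shaped like the one in the question and keeps only the first level. The random values and the names flat and dropped are illustrative assumptions, not part of the original post. It works on a copy so the original MultiIndex is left untouched. DataFrame.droplevel (available in pandas 0.24 and later) is included as an alternative that should give the same index without reassigning it.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=["A", "B", "C"])

# Work on a copy so the original two-level index is preserved.
flat = df.copy()
flat.index = flat.index.get_level_values("first")

# Alternative: drop the unwanted level by name instead of rebinding the index.
dropped = df.droplevel("second")

print(flat.index.tolist())
```

Note that the resulting index contains duplicate labels (each of bar, baz, foo and qux appears twice), so a lookup such as flat.loc["bar"] returns two rows.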

Alex Riley answered Oct 05 '22 05:10



Another option is the df.xs (cross-section) method. It selects the entries that match a label on a given level and drops that level from the result:

In [88]: df.xs('bar', level='First')
Out[88]:
Second  Third
one     A       -2.315312
        B        0.497769
        C        0.108523
two     A       -0.778303
        B       -1.555389
        C       -2.625022
dtype: float64

You can also select on several levels at once:

In [89]: df.xs(('bar', 'A'), level=('First', 'Third'))
Out[89]:
Second
one   -2.315312
two   -0.778303
dtype: float64

The setup for the examples is below (note that df is a Series after unstack()):

import pandas as pd
import numpy as np

arrays = [
    np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])
]
index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)

# unstack() turns the frame into a Series with a three-level index
# (the two column levels plus the row labels), so name the levels afterwards.
df = df.unstack()
df.index.names = ['First', 'Second', 'Third']
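
For reference, here is a self-contained sketch that runs both xs calls from this answer end to end. The names wide, s, bar and bar_a are illustrative assumptions, and the values are random, so the numbers will differ from the output shown above.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
wide = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=columns)

# unstack() on a frame with a flat row index returns a Series whose index
# combines the two column levels with the row labels (three levels in total).
s = wide.unstack()
s.index.names = ["First", "Second", "Third"]

# Select everything under 'bar'; the First level is dropped from the result.
bar = s.xs("bar", level="First")

# Select on two levels at once; only the Second level remains.
bar_a = s.xs(("bar", "A"), level=("First", "Third"))

print(bar)
print(bar_a)
```

Unlike rebinding the index with get_level_values, xs returns only the entries for the chosen label rather than the whole object with a level removed.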
Alexander McFarlane answered Oct 05 '22 04:10
