pandas multiindex - how to select second level when using columns?

I have a dataframe with this index:

index = pd.MultiIndex.from_product([['stock1','stock2'...],['price','volume'...]]) 

It's a useful structure for being able to do df['stock1'], but how do I select all the price data? I can't make any sense of the documentation.

I've tried the following with no luck: df[:,'price'], df[:]['price'], df.loc(axis=1)[:,'close'] and df['price'].

If this index style is generally agreed to be a bad idea for whatever reason, then what would be a better choice? Should I go for a multi-indexed index for the stocks as labels on the time series instead of at the column level?

Many thanks

EDIT - I am using the multiindex for the columns, not the index (the wording got the better of me). The examples in the documentation focus on multi-level indexes rather than column structures.

Asked Jul 16 '17 at 12:07 by AndyMoore



2 Answers

Also using John's data sample:

Using xs() is another way to slice a MultiIndex:

    df
                   0
    stock1 price   1
           volume  2
    stock2 price   3
           volume  4
    stock3 price   5
           volume  6

    df.xs('price', level=1, drop_level=False)
                  0
    stock1 price  1
    stock2 price  3
    stock3 price  5

Alternatively, if the MultiIndex is on the columns rather than the rows:

    df
      stock1        stock2        stock3
       price volume  price volume  price volume
    0      1      2      3      4      5      6

    df.xs('price', axis=1, level=1, drop_level=False)
      stock1 stock2 stock3
       price  price  price
    0      1      3      5
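For anyone who wants to run the column case end to end, here is a minimal sketch. The sample values are an assumption made to match the frame printed above, and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical sample data mirroring the frame shown above.
columns = pd.MultiIndex.from_product(
    [["stock1", "stock2", "stock3"], ["price", "volume"]]
)
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=columns)

# drop_level=False keeps both column levels in the result.
prices = df.xs("price", axis=1, level=1, drop_level=False)

# With the default drop_level=True the selected level is removed,
# leaving only the stock names as column labels.
prices_flat = df.xs("price", axis=1, level=1)

print(prices)
print(prices_flat)
```

Which form is more convenient probably depends on whether later code still expects two column levels.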
Answered Sep 20 '22 at 17:09 by Andrew L


Using @JohnZwinck's data sample:

    In [132]: df
    Out[132]:
                   0
    stock1 price   1
           volume  2
    stock2 price   3
           volume  4
    stock3 price   5
           volume  6

Option 1:

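    In [133]: df.loc[(slice(None), slice('price')), :]
    Out[133]:
                  0
    stock1 price  1
    stock2 price  3
    stock3 price  5

Note that slice('price') means "every label up to and including 'price'", so it returns only the price rows here because 'price' sorts before 'volume'. Passing the label itself, as in df.loc[(slice(None), 'price'), :], selects exactly that label.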

Option 2:

    In [134]: df.loc[pd.IndexSlice[:, 'price'], :]
    Out[134]:
                  0
    stock1 price  1
    stock2 price  3
    stock3 price  5
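Since the question is about a MultiIndex on the columns, the same slicer can be moved to the column position of .loc. This is a sketch under the assumption that the frame looks like the column-oriented sample in the other answer; the data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with the MultiIndex on the columns.
columns = pd.MultiIndex.from_product(
    [["stock1", "stock2", "stock3"], ["price", "volume"]]
)
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=columns)

idx = pd.IndexSlice

# All rows; any first-level label paired with 'price' on the second level.
by_indexslice = df.loc[:, idx[:, "price"]]

# Same selection with the axis fixed up front, close to what the question tried.
by_axis = df.loc(axis=1)[:, "price"]

print(by_indexslice)
```

The df.loc(axis=1)[:, 'close'] attempt in the question has this shape, so it most likely failed only because 'close' is not one of the second-level labels.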

UPDATE:

But what if, for the 2nd index level, I want to select everything but price, and there are so many values that enumerating them is not an option? Is there something like slice(~'price')?

First, let's name the index levels:

df = df.rename_axis(["lvl0", "lvl1"]) 

Now we can use the df.query() method:

    In [18]: df.query("lvl1 != 'price'")
    Out[18]:
                   0
    lvl0   lvl1
    stock1 volume  2
    stock2 volume  4
    stock3 volume  6
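df.query() filters rows, so it does not apply directly when the MultiIndex is on the columns. One possible equivalent for that layout is a boolean mask built from the second column level; the frame below is an assumed sample, not the asker's real data.

```python
import pandas as pd

# Hypothetical frame with the MultiIndex on the columns.
columns = pd.MultiIndex.from_product(
    [["stock1", "stock2", "stock3"], ["price", "volume"]]
)
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]], columns=columns)

# True for every column whose second-level label is not 'price'.
not_price = df.columns.get_level_values(1) != "price"
without_price = df.loc[:, not_price]

# Dropping the label from that level should give the same frame.
dropped = df.drop(columns="price", level=1)

print(without_price)
```

The mask version extends to other conditions (for example isin with a list of labels), while drop reads more directly when only one label has to go.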
Answered Sep 20 '22 at 17:09 by MaxU - stop WAR against UA