
Filter a pandas DataFrame on one level of a multi-level index

Tags:

python

pandas

If I have a pandas DataFrame with a multi-level index, how can I filter by one of the levels of that index? For example:

import pandas as pd
df = pd.DataFrame({"id": [1, 2, 1, 2], "time": [1, 1, 2, 2], "val": [1, 2, 3, 4]})
df.set_index(keys=["id", "time"], inplace=True)

I would like to do something like:

df[df["time"] > 1]

but time is no longer a column. I could keep it as a column, but I don't want to drag around copies of the data.

Alex asked May 23 '18 19:05



2 Answers

In [17]: df[df.index.get_level_values('time') > 1]
Out[17]:
         val
id time
1  2       3
2  2       4

@piRSquared's solution is more idiomatic though...
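
For anyone who wants to see what the mask looks like on its own, here is a minimal sketch of the same approach, rebuilding the frame from the question. The names time_labels, mask, filtered and both are only illustrative. The second filter, which combines two levels, is my own extension and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 1, 2], "time": [1, 1, 2, 2], "val": [1, 2, 3, 4]})
df = df.set_index(["id", "time"])

# get_level_values returns the labels of one level as a flat Index,
# aligned row by row with the frame, so a comparison gives a boolean mask.
time_labels = df.index.get_level_values("time")
mask = time_labels > 1

# Boolean indexing keeps the rows where the mask is True.
filtered = df[mask]
print(filtered)

# Masks built from different levels can be combined with & and |.
both = df[mask & (df.index.get_level_values("id") == 2)]
print(both)
```

No column copy of time is needed here; the labels are read straight from the index.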

MaxU - stop WAR against UA answered Oct 22 '22 07:10


query

df.query('time > 1')

         val
id time     
1  2       3
2  2       4
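
As a small sketch of how this extends, query can refer to named index levels and ordinary columns in the same expression, and it can read a local variable with the @ prefix. The names threshold, result and combined are illustrative, not from the question.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 1, 2], "time": [1, 1, 2, 2], "val": [1, 2, 3, 4]})
df = df.set_index(["id", "time"])

# A named index level can be used in the expression like a column.
threshold = 1
result = df.query("time > @threshold")
print(result)

# Index levels and regular columns can be mixed in one expression.
combined = df.query("time > 1 and val > 3")
print(combined)
```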

IndexSlice

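The DataFrame index must be lexsorted before it can be sliced, so sort it first. Note that the label slice 2: is inclusive, so it selects time >= 2, which matches time > 1 only for integer labels.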

df.sort_index().loc[pd.IndexSlice[:, 2:], :]

         val
id time     
1  2       3
2  2       4
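
A slightly more spelled-out sketch of the same slice, assuming the frame from the question. The names idx and sliced are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 1, 2], "time": [1, 1, 2, 2], "val": [1, 2, 3, 4]})

# Sort the index so label slices on the inner level are allowed.
df = df.set_index(["id", "time"]).sort_index()

idx = pd.IndexSlice
# First selector ":" keeps every id; "2:" is a label slice on time,
# and label slices include their start label.
sliced = df.loc[idx[:, 2:], :]
print(sliced)
```

The trailing ", :" selects all columns and keeps the row selector from being read as a column selector.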

piRSquared answered Oct 22 '22 05:10