
df.loc[] with multiple callables

I want to do a lookup in a DataFrame using two separate Callables (one provided by the user, one by param). Also acceptable: Index by one Callable and another filter using the explicit syntax.

Is this possible? I'm guessing it could be done with groupby, but that seems a bit cumbersome.

Minimal code sample:

import pandas as pd  # Version: 0.23.4, Python 2.7
df = pd.DataFrame({'C1': [1, 2,1], 'C2': [3, 4, 10]})


# This works
filter = lambda adf: adf['C1']==1
df.loc[filter]

# So does this
df.loc[df['C2']>5]

# Both of them together work
df.loc[(df['C2']>5) & (df['C1']==1)]

# So why don't any of these?
df.loc[(df['C2']>5) & filter] #TypeError: ...
df.loc[(df['C2']>5) & (filter)] # TypeError: ...
df.loc[df['C2']>5 & filter] # TypeError: ...

filter2 = lambda adf: adf['C2']>5
df.loc[(filter) & (filter2)] # TypeError: ...
df.loc[(filter) | (filter2)] # TypeError: ...

# Nesting works, but isn't pretty for multiple callables
df.loc[(df['C2']>5)].loc[filter]
Maggie asked Jun 12 '26 12:06


1 Answer

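When you pass your lambda filter to loc, you are passing the function object itself, not the result of calling it. loc calls the function with the DataFrame only after the whole indexing expression has been evaluated.

For this reason you can't combine functions with logical operators such as & or |. Python evaluates the operator before loc ever sees the expression, and function objects don't support those operators, which is the TypeError you are seeing. Boolean Series do support them, which is why combining explicit criteria works.

In any case, if you want to filter your DataFrame with heterogeneous criteria (a boolean mask and a function), you can use loc twice, as you yourself suggest.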
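To make the distinction concrete, here is a minimal sketch using the question's DataFrame. The names c1_is_one, c2_gt_five and combine_error are made up for illustration, and the code assumes a reasonably recent pandas on Python 3 rather than the 0.23.4 / Python 2.7 setup from the question.

```python
import pandas as pd

df = pd.DataFrame({'C1': [1, 2, 1], 'C2': [3, 4, 10]})

# Illustrative names (not from the question); each returns a boolean Series.
c1_is_one = lambda adf: adf['C1'] == 1
c2_gt_five = lambda adf: adf['C2'] > 5

# The & is evaluated before .loc is involved at all, and plain Python
# functions do not implement &, so this fails without touching pandas.
try:
    c1_is_one & c2_gt_five
    combine_error = None
except TypeError as exc:
    combine_error = exc

# Calling each function yields a boolean Series, and Series do support &.
mask = c1_is_one(df) & c2_gt_five(df)
result = df.loc[mask]
```

The point is that the functions have to be called before their results can be combined; loc only does that call for you when the callable is the entire indexer.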

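# Apply the boolean mask first, then pass the callable to a second loc;
# the callable receives the already-filtered DataFrame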
filter = lambda df: df['C1']==1
df.loc[(df['C2']>5)].loc[filter]
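If chaining loc becomes unwieldy with several callables, one possible alternative is to wrap them in a single callable that calls each one and combines the resulting masks. The sketch below is an assumption on my part, not something pandas provides: all_of is a hypothetical helper, and user_filter / param_filter are illustrative names standing in for the two callables from the question.

```python
from functools import reduce
import operator

import pandas as pd


def all_of(*predicates):
    """Hypothetical helper: AND together callables that each return a boolean Series.

    Expects at least one predicate; reduce raises TypeError on an empty list.
    """
    def combined(frame):
        # Call every predicate on the same frame, then AND the masks together.
        masks = [predicate(frame) for predicate in predicates]
        return reduce(operator.and_, masks)
    return combined


df = pd.DataFrame({'C1': [1, 2, 1], 'C2': [3, 4, 10]})

user_filter = lambda adf: adf['C1'] == 1    # e.g. supplied by the user
param_filter = lambda adf: adf['C2'] > 5    # e.g. supplied as a parameter

# A single loc call with a single callable that applies both criteria.
both = df.loc[all_of(user_filter, param_filter)]

# Mixing an explicit mask with a callable: call the callable inside a wrapper.
mixed = df.loc[lambda adf: (adf['C2'] > 5) & user_filter(adf)]
```

Unlike the chained version, every predicate here sees the original DataFrame rather than an already-filtered one, which only matters if a predicate depends on the full data (for example, comparing against a column mean).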
Pierock answered Jun 15 '26 01:06