I need to calculate some probabilities given certain conditions, so I'm using a function to get the rows that contain given values. For example:
df:
col1 col2 col3
A B C
H B C
A B
A H C
This is the function:
def existence(x):
    return df[df.isin([x]).any(axis=1)]
So if I do:
in:
existence('A')
out:
col1 col2 col3
A B C
A B
A H C
I need to generalize the function so that I can pass it more than one parameter and do the following:
def existence(x, y):
    return df[df.isin([x]).any(axis=1) & df.isin([y]).any(axis=1)]
or, generalized:
def existence(x1, x2, ..., xn):
    return df[df.isin([x1]).any(axis=1) & df.isin([x2]).any(axis=1) & ... & df.isin([xn]).any(axis=1)]
I don't think *args can help me, since I can't chain the operations together with the & operator.
Thank you in advance.
Your function will already work with a variable number of arguments. Check out the description of pd.DataFrame.isin:
DataFrame.isin(values)
Whether each element in the DataFrame is contained in values.
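This works fine with more than one value; you just need to change how you pass parameters into your function. This is what *args is for in Python: it lets you pass a variable number of arguments, which the function receives as a tuple. Note that isin matches any of the values passed, so the resulting row mask is True when a row contains at least one of them.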
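As a rough sketch of the two pieces involved: the helper name show_args is made up for illustration, and the sample frame is rebuilt from the data in the question (with None for the missing cell), so treat both as assumptions.

```python
import pandas as pd

def show_args(*args):
    # Hypothetical helper: inside the function, args is a tuple
    # holding every positional argument that was passed in.
    return args

# Sample frame reconstructed from the question (assumption).
df = pd.DataFrame({
    "col1": ["A", "H", "A", "A"],
    "col2": ["B", "B", "B", "H"],
    "col3": ["C", "C", None, "C"],
})

values = show_args("C", "H")      # a tuple of the two values
mask = df.isin(list(values))      # element-wise True/False frame
row_mask = mask.any(axis=1)       # True where a row has at least one match
print(row_mask)
```

The row mask is what ends up inside df[...] to select the matching rows.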
I also updated your function to take in the DataFrame to apply the mask on, because it isn't good practice to rely on global variable names.
def existence(df, *args):
    return df[df.isin(args).any(axis=1)]
In [13]: existence(df, 'A')
Out[13]:
col1 col2 col3
0 A B C
2 A B None
3 A H C
In [14]: existence(df, 'C', 'H')
Out[14]:
col1 col2 col3
0 A B C
1 H B C
3 A H C
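The version above keeps rows that contain any of the values. If you need the rows that contain every value, as in the & chain from the question, one possible sketch is to build one mask per value and fold them together. The name existence_all is hypothetical, the sample frame is rebuilt from the question, and returning the frame unchanged when no values are given is my own assumption.

```python
from functools import reduce
import operator

import pandas as pd

def existence_all(df, *args):
    # One boolean row mask per value: True where the row contains that value.
    masks = [df.isin([value]).any(axis=1) for value in args]
    if not masks:
        # Assumption: with no values to filter on, return the frame as-is.
        return df
    # Fold the masks together with & so a row must contain every value.
    return df[reduce(operator.and_, masks)]

# Sample frame reconstructed from the question (assumption).
df = pd.DataFrame({
    "col1": ["A", "H", "A", "A"],
    "col2": ["B", "B", "B", "H"],
    "col3": ["C", "C", None, "C"],
})

print(existence_all(df, "C", "H"))
```

Here reduce(operator.and_, masks) does the same job as writing mask1 & mask2 & ... & maskn by hand, so *args and the & operator can be combined after all.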