
Check if a value exists using multiple conditions within group in pandas

This is what my dataframe looks like. Expected_Output is the target column I want to compute.

   Group  Value1  Value2  Expected_Output
0      1       3       9             True
1      1       7       6             True
2      1       9       7             True
3      2       3       8            False
4      2       8       5            False
5      2       7       6            False

If any Value1 == 7 and any Value2 == 9 within a given Group, I want to return True for every row of that Group (the two values do not have to be in the same row).

I tried to no avail:

df['Expected_Output']= df.groupby('Group').Value1.isin(7) &  df.groupby('Group').Value2.isin(9)

N.B.: The output can be either True/False or 1/0.

gibbz00 asked Oct 06 '18 15:10




2 Answers

Group by the Group column, then use transform with a lambda so that each group's result is broadcast back to its rows:

g = df.groupby('Group')
df['Expected_Output'] = (g['Value1'].transform(lambda x: x.eq(7).any())
                         & g['Value2'].transform(lambda x: x.eq(9).any()))

Or use groupby and apply to compute one value per group, then merge it back with how='left':

df.merge(df.groupby('Group').apply(lambda x: x['Value1'].eq(7).any()&x['Value2'].eq(9).any()).reset_index(),how='left').rename(columns={0:'Expected_Output'})

Or use groupby and apply, then map the per-group result onto the Group column:

df['Expected_Output'] = df['Group'].map(df.groupby('Group').apply(lambda x: x['Value1'].eq(7).any()&x['Value2'].eq(9).any()))

print(df)
   Group  Value1  Value2  Expected_Output
0      1       3       9             True
1      1       7       6             True
2      1       9       7             True
3      2       3       8            False
4      2       8       5            False
5      2       7       6            False
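
The transform variant can also be written without a lambda. The following is a minimal sketch, assuming the sample frame from the question (rebuilt here so the snippet runs on its own); the names has_7 and has_9 are just illustrative. It builds the row-level masks first, then groups each mask by Group and broadcasts the per-group any back to the rows with transform("any"). It is a variation on the approach above and should give the same column.

```python
import pandas as pd

# Sample data copied from the question (without the target column).
df = pd.DataFrame({
    "Group":  [1, 1, 1, 2, 2, 2],
    "Value1": [3, 7, 9, 3, 8, 7],
    "Value2": [9, 6, 7, 8, 5, 6],
})

# Row-level masks: does this row satisfy each condition on its own?
has_7 = df["Value1"].eq(7)
has_9 = df["Value2"].eq(9)

# transform("any") computes any() per group and repeats it on every row
# of that group, so the result lines up with df's index.
df["Expected_Output"] = (
    has_7.groupby(df["Group"]).transform("any")
    & has_9.groupby(df["Group"]).transform("any")
)

print(df)
```

Note that Group 2 does contain a 7 in Value1 but has no 9 in Value2, which is why it comes out False.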
Space Impact answered Nov 03 '22 06:11


You can create a dataframe of the expected result by group and then merge it back to the original dataframe.

>>> expected = (
...     df.groupby('Group')
...     .apply(lambda x: x['Value1'].eq(7).any()
...                      & x['Value2'].eq(9).any())
...     .to_frame('Expected_Output'))
>>> expected
       Expected_Output
Group                 
1                 True
2                False

>>> df.merge(expected, left_on='Group', right_index=True)
   Group  Value1  Value2  Expected_Output
0      1       3       9             True
1      1       7       6             True
2      1       9       7             True
3      2       3       8            False
4      2       8       5            False
5      2       7       6            False
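
The per-group table can also be built without apply, which avoids calling a Python lambda once per group. This is only a sketch, again assuming the question's sample data; the helper column names has_7 and has_9 are made up for illustration. The final astype(int) shows the 1/0 form that the question says is also acceptable.

```python
import pandas as pd

# Sample data copied from the question (without the target column).
df = pd.DataFrame({
    "Group":  [1, 1, 1, 2, 2, 2],
    "Value1": [3, 7, 9, 3, 8, 7],
    "Value2": [9, 6, 7, 8, 5, 6],
})

# One row per group: True if any row in the group meets the condition.
per_group = (
    df.assign(has_7=df["Value1"].eq(7), has_9=df["Value2"].eq(9))
      .groupby("Group")[["has_7", "has_9"]]
      .any()
)

# Both conditions must hold somewhere in the group.
expected = per_group.all(axis=1).rename("Expected_Output")

# Map the per-group flag back onto the rows; astype(int) turns it into 1/0.
df["Expected_Output"] = df["Group"].map(expected).astype(int)

print(df)
```

Dropping the astype(int) call keeps the column as True/False instead.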
Alexander answered Nov 03 '22 07:11