
Get rows where at least n of m raters gave the wrong answer

I have a dataframe like this:

right_answer   rater1   rater2   rater3   item
1              1        1        2        S01
1              1        2        2        S02
2              1        2        1        S03
2              2        1        2        S04

I need to get the rows (or the values in 'item') where at least two of the three raters gave the wrong answer. I can already check whether all the raters agree with the right answer using this code:

df.where(df[['rater1', 'rater2', 'rater3']].eq(df.iloc[:, 0], axis=0).all(1) == True)

I don't want to calculate a majority-vote column, because I may need to adjust the number of raters that have to agree or disagree with the right answer.

Thanks for help

asked Jun 26 '20 07:06 by maybeyourneighour



1 Answer

Use DataFrame.filter to select the columns whose names contain 'rater', then use DataFrame.ne with axis=0 to compare those columns against the right_answer column. DataFrame.sum along axis=1 then gives the number of raters who answered incorrectly in each row, and Series.ge(2) turns that count into a boolean mask. Finally, index the dataframe with this mask to keep the matching rows:

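# Count, per row, how many rater columns differ from right_answer,
# then flag rows where at least 2 raters are wrong.
mask = (
    df.filter(like='rater')
    .ne(df['right_answer'], axis=0).sum(axis=1).ge(2)
)

df = df[mask]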

Result:

# print(df)

   right_answer  rater1  rater2  rater3 item
1             1       1       2       2  S02
2             2       1       2       1  S03
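
Since the question mentions that the required number of disagreeing raters may change, the same mask can be wrapped in a small helper with the threshold as a parameter. This is only a sketch: the function name rows_with_wrong_answers and its parameters min_wrong, answer_col and rater_prefix are made up for illustration and are not part of pandas, and it assumes every rater column contains the text "rater" in its name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "right_answer": [1, 1, 2, 2],
    "rater1": [1, 1, 1, 2],
    "rater2": [1, 2, 2, 1],
    "rater3": [2, 2, 1, 2],
    "item": ["S01", "S02", "S03", "S04"],
})


def rows_with_wrong_answers(frame, min_wrong=2,
                            answer_col="right_answer", rater_prefix="rater"):
    """Hypothetical helper: keep rows where at least `min_wrong` raters
    disagree with the value in `answer_col`."""
    # Select the rater columns by name (assumes they all contain `rater_prefix`).
    raters = frame.filter(like=rater_prefix)
    # True where a rater's answer differs from the right answer, summed per row.
    wrong_count = raters.ne(frame[answer_col], axis=0).sum(axis=1)
    # Boolean mask: rows with at least `min_wrong` wrong raters.
    return frame[wrong_count.ge(min_wrong)]


flagged = rows_with_wrong_answers(df, min_wrong=2)
print(flagged["item"].tolist())
```

Changing min_wrong to 3 would keep only rows where every rater is wrong, and swapping .ge for .le on the count would instead select rows with at most that many wrong answers.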
answered Oct 12 '22 11:10 by Shubham Sharma
