
A quick way to write a decision into a column based on the corresponding rows using pandas?

Tags: python, pandas

Suppose I have four columns A, B, C, D in a data frame df:

import pandas as pd

df = pd.read_csv('results.csv')
df 

A     B     C     D
good  good  good  good
good  bad   good  good
good  bad   bad   good
bad   good  good  good

I want to add another column, results, whose value depends on the other values in the same row. In my case, if at least three of the columns A, B, C, D in a row contain good, the value in results should be valid; otherwise it should be notvalid.

Expected output:

A     B     C     D     results
good  good  good  good  valid
good  bad   good  good  valid
good  bad   bad   good  notvalid
bad   good  good  good  valid
Gun asked May 23 '20 05:05


1 Answer

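You can use np.where on a per-row count of good values:

import numpy as np

# columns of interest:
cols = ['A','B','C','D']

# eq('good') marks matching cells, sum(axis=1) counts them per row,
# ge(3) checks for at least three
df['results'] = np.where(df[cols].eq('good').sum(axis=1).ge(3),
                         'valid', 'notvalid')

Output:

      A     B     C     D   results
0  good  good  good  good     valid
1  good   bad  good  good     valid
2  good   bad   bad  good  notvalid
3   bad  good  good  good     valid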
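
If the chained expression is hard to read, it may help to split it into intermediate steps. The following is only a sketch: it rebuilds the sample data inline instead of reading results.csv, uses the notvalid label from the question, and the names is_good, good_count and enough are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt inline (the question reads it from results.csv)
df = pd.DataFrame({
    'A': ['good', 'good', 'good', 'bad'],
    'B': ['good', 'bad', 'bad', 'good'],
    'C': ['good', 'good', 'bad', 'good'],
    'D': ['good', 'good', 'good', 'good'],
})

cols = ['A', 'B', 'C', 'D']

# Step 1: boolean frame, True where the cell equals 'good'
is_good = df[cols].eq('good')

# Step 2: count the True values in each row (True counts as 1)
good_count = is_good.sum(axis=1)

# Step 3: True for rows with at least three 'good' values
enough = good_count.ge(3)

# Step 4: map the boolean mask to the two labels
df['results'] = np.where(enough, 'valid', 'notvalid')
print(df)
```

Inspecting good_count on its own is a quick way to check the threshold before the labels are written.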
Quang Hoang answered Oct 22 '22 07:10