Pandas: Flatten column based on condition?

Question

I am trying to flatten rows and keep the info from the rows I want.

What I have:

id  var1  var2 var3
1      Y     N    Y
1      N          Y
2      Y          N
2      N     Y    N
2      Y     N    Y

What I would like:

id  var1  var2 var3
1      Y     N    Y
2      Y     Y    Y

Essentially, it should check the Y/N values within each id and always give priority to a Y. There are also more columns than var1, var2 and var3, so I would like something general that I can apply to the other columns as well.

Scott Boston · Accepted Answer

You can use groupby with sum to act like a logical OR within each id, which is what gives Y priority:

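# Mark each cell True where it is 'Y'; 'N' and blank cells become False
df1 = df.set_index('id').eq('Y')

# Summing booleans counts the Y's per id; a non-zero count means at least one Y
df_out = (df1.groupby('id').sum()
         .astype(bool)
         .replace({True:'Y',False:'N'})
         .reset_index())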

print(df_out)

Output:

   id var1 var2 var3
0   1    Y    N    Y
1   2    Y    Y    Y
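
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and uses groupby with any() instead of sum, which states the OR directly. The helper name flatten_yes_no is hypothetical, and representing the blank cells as None is an assumption about how the data was loaded. Every column other than the key is treated as a Y/N column, so it should extend to more columns without changes.

```python
import pandas as pd

# Sample data from the question; blank cells are represented as None here
# (an assumption about how the blanks were loaded).
df = pd.DataFrame({
    'id':   [1, 1, 2, 2, 2],
    'var1': ['Y', 'N', 'Y', 'N', 'Y'],
    'var2': ['N', None, None, 'Y', 'N'],
    'var3': ['Y', 'Y', 'N', 'N', 'Y'],
})


def flatten_yes_no(frame, key='id'):
    """Collapse rows per key, giving 'Y' if any row in the group has 'Y'.

    flatten_yes_no is a hypothetical helper name, not part of pandas.
    """
    # Every column except the key is treated as a Y/N flag column.
    flags = frame.set_index(key).eq('Y')      # 'N' and blanks become False
    any_yes = flags.groupby(level=0).any()    # logical OR within each group
    # Map the booleans back to the original labels, column by column.
    labelled = any_yes.apply(lambda col: col.map({True: 'Y', False: 'N'}))
    return labelled.reset_index()


df_out = flatten_yes_no(df)
print(df_out)
```

On the sample data this should reproduce the output shown above. Note that a blank cell counts as "not Y", so a group containing only blanks in a column comes out as N.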

Tags: python, pandas

spitfiredd
