
Python 3: get the column names depending on a condition [duplicate]

I have a pandas DataFrame (Python 3.6) like this:

index   A   B   C  ... 
  A     1   5   0
  B     0   0   1 
  C     1   2   4
 ...

As you can see, the index values are the same as the column names.

What I'm trying to do is add a new column to the DataFrame that holds the names of the columns where the value is greater than 0:

index   A   B   C  ... NewColumn
  A     1   5   0       [A,B]
  B     0   0   1       [C]
  C     1   2   4       [A,B,C]
 ...

I've been trying with iterrows, with no success.

I also know I can melt and pivot, but I think there should be a way with apply and a lambda, maybe?

Thanks in advance

Gera Sanz asked Mar 05 '26 07:03


1 Answer

If the new column should be a string, compare with DataFrame.gt, take the dot product of the resulting boolean mask with the column names (each followed by a separator), and finally strip the trailing separator:

df['NewColumn'] = df.gt(0).dot(df.columns + ', ').str.rstrip(', ')
print(df)
   A  B  C NewColumn
A  1  5  0      A, B
B  0  0  1         C
C  1  2  4   A, B, C
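
To see why the dot product works, here is a minimal self-contained sketch that rebuilds the sample frame from the question and breaks the one-liner into steps. The intermediate names (mask, labels, joined, result) are only for illustration. It relies on Python's rule that True * 'A, ' is 'A, ' and False * 'A, ' is the empty string, so summing the products along each row concatenates the matching labels.

```python
import pandas as pd

# Sample data from the question: index labels equal the column names.
df = pd.DataFrame(
    {"A": [1, 0, 1], "B": [5, 0, 2], "C": [0, 1, 4]},
    index=["A", "B", "C"],
)

# Step 1: boolean mask, True where the value is greater than 0.
mask = df.gt(0)

# Step 2: column labels with the separator appended: 'A, ', 'B, ', 'C, '.
labels = df.columns + ", "

# Step 3: the dot product multiplies each boolean by its label and sums
# along the row, which concatenates the labels of the True cells.
joined = mask.dot(labels)

# Step 4: strip the separator characters left at the end of each string.
result = joined.str.rstrip(", ")
print(result)
```

Note that rstrip(', ') removes any trailing commas and spaces, not just the exact separator, so a column name that itself ends in one of those characters would also be trimmed.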

For lists, use apply with a lambda function:

df['NewColumn'] = df.gt(0).apply(lambda x: x.index[x].tolist(), axis=1)
print(df)
   A  B  C  NewColumn
A  1  5  0     [A, B]
B  0  0  1        [C]
C  1  2  4  [A, B, C]
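
As a runnable sketch of the list variant, the snippet below rebuilds the sample frame and computes the mask once, before any NewColumn exists. That matters if you try both variants in a row: once a string column has been added, df.gt(0) would try to compare strings with 0 and fail. The list-comprehension version at the end is not part of the answer above, only an assumed alternative that avoids apply; the names via_apply and via_comprehension are hypothetical.

```python
import pandas as pd

# Sample data from the question: index labels equal the column names.
df = pd.DataFrame(
    {"A": [1, 0, 1], "B": [5, 0, 2], "C": [0, 1, 4]},
    index=["A", "B", "C"],
)

# Build the mask from the numeric columns only, before adding new columns.
mask = df.gt(0)

# With axis=1, each x is a boolean Series indexed by column name,
# so x.index[x] keeps only the labels where the value is True.
via_apply = mask.apply(lambda x: x.index[x].tolist(), axis=1)

# Assumed alternative (not from the answer): index a NumPy array of
# column names with each boolean row of the mask.
cols = df.columns.to_numpy()
via_comprehension = [cols[row].tolist() for row in mask.to_numpy()]

df["NewColumn"] = via_apply
print(df)
```

Both variants should give the same lists; the comprehension simply skips the per-row Series that apply constructs.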
jezrael answered Mar 07 '26 21:03


