Boolean values to column names in one list, dataframe pandas python

Tags: python, list, pandas, dataframe

I have a DataFrame like this:

     A    B    C    D    E
  0  0.0  1.0  0.0  0.0  1.0
  1  0.0  0.0  1.0  0.0  0.0
  2  0.0  1.0  1.0  1.0  0.0
  3  1.0  0.0  0.0  0.0  1.0
  4  0.0  0.0  0.0  1.0  0.0

The goal is to get, for each row, a comma-separated list of the names of the columns whose value is 1, like this:

0  B,E
1  C
2  B,C,D
3  A,E
4  D

Any ideas? Thanks in advance.

858

asked Oct 15 '17 12:10

amn89

1 Answer

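You can use apply with axis=1 to process the DataFrame row by row. With axis=1, each row is passed to the function as a Series whose index holds the column names, so comparing the row with 1 and using the result to filter x.index gives the names of the matching columns, which are then joined with a comma: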

s1 = df.apply(lambda x: ','.join(x.index[x == 1]), axis=1)
print (s1)
0      B,E
1        C
2    B,C,D
3      A,E
4        D
dtype: object
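For reference, here is a self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the values shown in the question; how the original data was loaded is not stated, so the constructor call is an assumption.

```python
import pandas as pd

# Sample data copied from the question (construction method is assumed).
df = pd.DataFrame(
    [[0.0, 1.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0, 0.0]],
    columns=list("ABCDE"),
)

# With axis=1 each row arrives as a Series indexed by the column names.
# Keep the labels where the value equals 1 and join them with a comma.
s1 = df.apply(lambda row: ",".join(row.index[row == 1]), axis=1)
print(s1.tolist())
```

The result is a Series with the same index as df, so it can be assigned back as a new column if needed.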

Another solution, which is faster for a larger DataFrame.

First, build a list of the column names, each followed by the separator:

print (['{}, '.format(x) for x in df.columns])
['A, ', 'B, ', 'C, ', 'D, ', 'E, ']

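Then compare the DataFrame with 1 and use np.where to pick the formatted column name where the comparison is True and an empty string elsewhere: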

s = np.where(df == 1, ['{}, '.format(x) for x in df.columns], '')

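Because the values here are only 0 and 1, the comparison can be dropped: np.where treats non-zero values as True, so passing df directly gives the same result. (Any other non-zero value, including NaN, would also count as True, so keep the explicit comparison if the data is not strictly 0/1.)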

s = np.where(df, ['{}, '.format(x) for x in df.columns], '')
print (s)
[['' 'B, ' '' '' 'E, ']
 ['' '' 'C, ' '' '']
 ['' 'B, ' 'C, ' 'D, ' '']
 ['A, ' '' '' '' 'E, ']
 ['' '' '' 'D, ' '']]

Finally, join the strings in each row and strip the trailing separator (note that this variant separates the names with ", " rather than ","):

s1 = pd.Series([''.join(x).strip(', ') for x in s], index=df.index)
print (s1)
0       B, E
1          C
2    B, C, D
3       A, E
4          D
dtype: object
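Put together, the np.where steps can be sketched as one runnable script. This is a variation, not the exact code above: it uses "," without a space as the separator so the output matches the format asked for in the question, and the names labels, mask and cells are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (construction method is assumed).
df = pd.DataFrame(
    [[0.0, 1.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0, 1.0, 0.0]],
    columns=list("ABCDE"),
)

# One label per column, each with a trailing comma.
labels = [f"{col}," for col in df.columns]

# Explicit comparison, so values other than 1 are never selected.
mask = df.to_numpy() == 1

# The 1-D list of labels is broadcast across every row of the 2-D mask.
cells = np.where(mask, labels, "")

# Concatenate each row and drop the trailing comma.
s1 = pd.Series(["".join(row).rstrip(",") for row in cells], index=df.index)
print(s1.tolist())
```

The speed advantage comes from doing the selection in one vectorized np.where call instead of calling a Python function per row; only the final join is a Python-level loop.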

EDIT: Another, better solution:

s1 = df.eq(1).dot(df.columns + ',').str.rstrip(',')
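Why a dot product produces joined strings may not be obvious. The sketch below reproduces, in plain Python and for a single row, what the dot product computes: a sum of element-wise products, where multiplying a string by True keeps it, multiplying by False gives an empty string, and "summing" strings concatenates them. The variable names are illustrative, and this is an explanation of the idea rather than pandas' actual implementation.

```python
# Column labels with a trailing comma, as produced by `df.columns + ','`.
labels = ["A,", "B,", "C,", "D,", "E,"]

# Row 2 of the question's frame after `df.eq(1)`.
row = [False, True, True, True, False]

# bool * str: True * "B," == "B,", False * "A," == "".
products = [flag * label for flag, label in zip(row, labels)]

# Adding the products together is string concatenation.
joined = "".join(products)

# The final `.str.rstrip(',')` step removes the trailing comma.
result = joined.rstrip(",")
print(result)
```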

134

answered Sep 29 '22 00:09

jezrael
