
Pandas - combine column values into a list in a new column

I have a Python Pandas dataframe df:

    import numpy as np
    import pandas as pd

    d = [['hello', 1, 'GOOD', 'long.kw'],
         [1.2, 'chipotle', np.nan, 'bingo'],
         ['various', np.nan, 3000, 123.456]]
    t = pd.DataFrame(data=d, columns=['A', 'B', 'C', 'D'])

which looks like this:

    print(t)
             A         B     C        D
    0    hello         1  GOOD  long.kw
    1      1.2  chipotle   NaN    bingo
    2  various       NaN  3000  123.456

I am trying to create a new column which is a list of the values in A, B, C, and D. So it would look like this:

    t['combined']
    Out[125]:
    0        [hello, 1, GOOD, long.kw]
    1      [1.2, chipotle, nan, bingo]
    2    [various, nan, 3000, 123.456]
    Name: combined, dtype: object

I am trying this code:

    t['combined'] = t.apply(lambda x: list([x['A'],
                                            x['B'],
                                            x['C'],
                                            x['D']]), axis=1)

Which returns this error:

ValueError: Wrong number of items passed 4, placement implies 1  

What puzzles me is that if I remove one of the columns that I want to put in the list (or add another column to the dataframe that I DON'T add to the list), my code works.

For instance, run this code:

    t['combined'] = t.apply(lambda x: list([x['A'],
                                            x['B'],
                                            x['D']]), axis=1)

It returns this, which would be perfect if I only wanted the 3 columns:

    print(t)
             A         B     C        D                 combined
    0    hello         1  GOOD  long.kw      [hello, 1, long.kw]
    1      1.2  chipotle   NaN    bingo   [1.2, chipotle, bingo]
    2  various       NaN  3000  123.456  [various, nan, 123.456]

I am at a complete loss as to why building the 'combined' list from all columns in the dataframe raises an error, while building it from all but one column works as expected.

clg4 asked May 10 '17 16:05




1 Answer

Try this:

    t['combined'] = t.values.tolist()

    t
    Out[50]:
             A         B     C        D                       combined
    0    hello         1  GOOD  long.kw      [hello, 1, GOOD, long.kw]
    1     1.20  chipotle   NaN    bingo    [1.2, chipotle, nan, bingo]
    2  various       NaN  3000   123.46  [various, nan, 3000, 123.456]
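As for why the original apply call fails: as far as I can tell, older pandas versions treated a list returned from apply(..., axis=1) as a full row whenever its length matched the number of columns, so the result came back as a 4-column DataFrame rather than a Series of lists, and that cannot be assigned to a single column. With 3 items the lengths no longer match and you get a Series of lists. Newer pandas releases (0.23 and later, if I recall correctly) return a Series of lists in both cases. Below is a minimal sketch of the approach above restricted to an explicit column list. The name cols is my own, not from the question. Selecting the columns explicitly also means that re-running the line does not sweep an existing 'combined' column into the new lists.

```python
import numpy as np
import pandas as pd

d = [['hello', 1, 'GOOD', 'long.kw'],
     [1.2, 'chipotle', np.nan, 'bingo'],
     ['various', np.nan, 3000, 123.456]]
t = pd.DataFrame(data=d, columns=['A', 'B', 'C', 'D'])

# Hypothetical helper name: the columns to pack into each row's list.
cols = ['A', 'B', 'C', 'D']

# Convert only the selected columns to a list of row lists,
# then assign one list per row to the new column.
t['combined'] = t[cols].values.tolist()

# Alternative via apply; this assumes a pandas version where a list
# returned from apply(axis=1) is kept as a single list per row.
t['combined_apply'] = t[cols].apply(list, axis=1)

print(t[['combined', 'combined_apply']])
```

The values.tolist() version does not depend on how apply interprets its return value, so it should behave the same across pandas versions.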
Steven G answered Sep 18 '22 23:09