
Converting a Pandas DF List into a string


I have a pandas data frame. One of the columns contains a list. I want that column to be a single string.

For example, my list ['one','two','three'] should simply become 'one, two, three'.

df['col'] = df['col'].astype(str).apply(lambda x: ', '.join(df['col'].astype(str))) 

gives me ['one, two, three],['four','five','six'], where the second list is from the next row. Needless to say, with millions of rows this concatenation across rows is not only incorrect, it also kills my memory.

Rusty Coder asked May 20 '16 13:05




2 Answers

You should certainly not convert to string before you transform the list. Try:

df['col'].apply(', '.join) 

Also note that apply applies the function to the elements of the series, so using df['col'] in the lambda function is probably not what you want.
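To illustrate the point about apply working element by element, here is a minimal sketch. The two-row frame is made-up sample data based on the lists in the question, and the variable names joined and repeated are only for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the lists in the question.
df = pd.DataFrame({'col': [['one', 'two', 'three'], ['four', 'five', 'six']]})

# apply hands each element (one list per row) to the function,
# so ', '.join only ever sees the list from a single row.
joined = df['col'].apply(', '.join)

# The lambda below ignores its argument x and joins the whole column,
# so every row receives the same string built from all rows.
repeated = df['col'].astype(str).apply(lambda x: ', '.join(df['col'].astype(str)))

print(joined.tolist())
print(repeated.nunique())  # number of distinct values across rows
```

Because the second version rebuilds the full-column string once per row, its cost grows roughly with the square of the row count, which would be consistent with the memory problem described in the question.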


Alternatively, there is a native method, df['col'].str.join(', '), but it is (surprisingly) a bit slower than apply.
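For comparison, a short sketch of the two spellings side by side, again on made-up sample data. It only checks that they produce the same result and makes no claim about timing, which may vary with pandas version and data size.

```python
import pandas as pd

# Hypothetical sample data mirroring the lists in the question.
df = pd.DataFrame({'col': [['one', 'two', 'three'], ['four', 'five', 'six']]})

# Element-wise join through apply.
via_apply = df['col'].apply(', '.join)

# The same operation through the .str accessor.
via_str = df['col'].str.join(', ')

print(via_apply.equals(via_str))
```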

IanS answered Nov 14 '22 13:11


When you cast col to str with astype, you get a string representation of a Python list, brackets and all. You do not need to do that; just apply join directly:

import pandas as pd

df = pd.DataFrame({
    'A': [['a', 'b', 'c'], ['A', 'B', 'C']]
    })

# Out[8]:
#            A
# 0  [a, b, c]
# 1  [A, B, C]

df['Joined'] = df.A.apply(', '.join)

#            A   Joined
# 0  [a, b, c]  a, b, c
# 1  [A, B, C]  A, B, C
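To see what the astype(str) cast does to a cell, here is a small sketch using the same frame as above. The names as_text, first and chars are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})

# After the cast, each cell is one string holding the list's printed form.
as_text = df['A'].astype(str)
first = as_text.iloc[0]
print(first)  # ['a', 'b', 'c']

# Joining a string iterates over its characters, so the brackets and
# quotes become items too; this is why the cast should be skipped.
chars = ', '.join(first)
print(chars)
```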
hilberts_drinking_problem answered Nov 14 '22 11:11