I have a pandas DataFrame. One of its columns contains a list in every row, and I want each list turned into a single string.
For example, the list ['one','two','three']
should simply become 'one, two, three'
df['col'] = df['col'].astype(str).apply(lambda x: ', '.join(df['col'].astype(str)))
gives me ['one', 'two', 'three'], ['four', 'five', 'six']
where the second list is from the next row. Needless to say, with millions of rows this concatenation across rows is not only incorrect, it also kills my memory.
To convert a list to a string, use the str.join() method. join() is called on the separator, concatenates the strings in the iterable it is given, and returns the result as a new string. A list comprehension or generator expression is only needed when some elements are not strings and have to be converted first.
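As a small illustration of the point above, here is a sketch in plain Python. The variable names and the mixed-type list are made up for the example; they do not come from the question.

```python
# A list that already contains only strings can be joined directly.
items = ['one', 'two', 'three']
joined = ', '.join(items)

# join() raises TypeError on non-string elements, so convert them first.
# The generator expression does the conversion element by element.
mixed = ['one', 2, 3.0]
joined_mixed = ', '.join(str(x) for x in mixed)

print(joined)
print(joined_mixed)
```

For a column that is known to hold only lists of strings, the conversion step is unnecessary.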
A pandas Series can be converted to a list with tolist() or by passing it to list(). This is useful when you want to work on plain Python objects instead of a pandas object: you can pull a DataFrame column out into a list and perform the required operations there.
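A minimal sketch of that route, assuming a small Series of string lists invented for the example. It is shown only to illustrate tolist(); for large frames, staying inside pandas is usually the simpler option.

```python
import pandas as pd

# Hypothetical sample data: each element of the Series is a list of strings.
s = pd.Series([['one', 'two', 'three'], ['four', 'five', 'six']])

# tolist() returns an ordinary Python list (here, a list of lists).
rows = s.tolist()

# Join each inner list separately, one output string per row.
joined = [', '.join(row) for row in rows]
print(joined)
```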
You should certainly not convert to string before you transform the list. Try:
df['col'].apply(', '.join)
Also note that apply applies the function to each element of the series, so using df['col'] inside the lambda function is probably not what you want.
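To make the difference concrete, the sketch below reproduces the pattern from the question on a two-row frame (sample data invented for the example) and compares it with the element-wise version.

```python
import pandas as pd

# Hypothetical two-row frame mirroring the question.
df = pd.DataFrame({'col': [['one', 'two', 'three'], ['four', 'five', 'six']]})

# astype(str) turns each list into its text representation, brackets included.
as_text = df['col'].astype(str)

# The lambda ignores its argument x and joins the WHOLE column every time,
# so every row receives the same long string built from all rows.
wrong = as_text.apply(lambda x: ', '.join(as_text))

# Passing the bound method directly joins each row's own list.
right = df['col'].apply(', '.join)

print(wrong.iloc[0])
print(right.tolist())
```

Because the lambda rebuilds a string from the entire column once per row, the work and memory grow roughly with the square of the row count, which matches the memory problem described in the question.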
Or, there is a native .str.join method, but it is (surprisingly) a bit slower than apply.
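Both variants can be sketched side by side as follows. The frame and the new column names (via_apply, via_str) are assumptions for the example. No timing is shown here, since the relative speed may depend on the pandas version and the data; timeit on your own column is the reliable check.

```python
import pandas as pd

# Hypothetical sample frame with a column of string lists.
df = pd.DataFrame({'col': [['one', 'two', 'three'], ['four', 'five', 'six']]})

# Variant 1: apply the bound str.join method to each list.
df['via_apply'] = df['col'].apply(', '.join)

# Variant 2: the vectorized string accessor, which also accepts lists.
df['via_str'] = df['col'].str.join(', ')

print(df[['via_apply', 'via_str']])
```

On lists that contain only strings, the two columns hold the same values.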
When you cast col to str with astype, you get a string representation of a Python list, brackets and all. You do not need to do that; just apply join directly:
import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['A', 'B', 'C']]})
#            A
# 0  [a, b, c]
# 1  [A, B, C]

df['Joined'] = df.A.apply(', '.join)
#            A   Joined
# 0  [a, b, c]  a, b, c
# 1  [A, B, C]  A, B, C