I have a pandas data frame. One of the columns contains a list. I want that column to be a single string.
For example my list ['one','two','three'] should simply be 'one, two, three'
I tried:
    df['col'] = df['col'].astype(str).apply(lambda x: ', '.join(df['col'].astype(str)))
This gives me "['one', 'two', 'three'], ['four', 'five', 'six']" in every row, where the second list is from the next row. Needless to say, with millions of rows this concatenation across rows is not only incorrect, it also kills my memory.
To convert a list to a string, use the join() method, optionally combined with a list comprehension. The comprehension traverses the elements one by one (useful for converting non-string elements with str() first), and join() concatenates the elements into a new string and returns it. If every element is already a string, join() alone is enough.
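As a minimal sketch of the comprehension-plus-join idea in plain Python: the helper name join_items and the sample lists are made up for illustration and are not part of the original question.

```python
def join_items(items, sep=", "):
    """Join a list of arbitrary values into one string (hypothetical helper)."""
    # str() makes non-string elements safe to join.
    return sep.join([str(item) for item in items])


words = ["one", "two", "three"]
mixed = ["one", 2, 3.0]

joined_words = join_items(words)
joined_mixed = join_items(mixed)

print(joined_words)
print(joined_mixed)
```

When every element is already a string, ", ".join(words) gives the same result without the comprehension.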
A pandas Series can be converted to a list with tolist() or by passing it to list(). There are situations where you want to operate on a plain list instead of a pandas object. In such cases, you can store the DataFrame column in a list and perform the required operations on it.
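A rough sketch of that route, assuming a column named col as in the question and a made-up output column joined: pull the column out with tolist(), join each row in plain Python, and assign the result back.

```python
import pandas as pd

# Sample data shaped like the question's column of lists.
df = pd.DataFrame({"col": [["one", "two", "three"], ["four", "five", "six"]]})

# tolist() returns an ordinary Python list with one inner list per row.
rows = df["col"].tolist()

# Join each row's list separately, so nothing leaks across rows.
df["joined"] = [", ".join(row) for row in rows]

print(df)
```

This assumes every row holds a list of strings; rows with missing values would need separate handling.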
You should certainly not convert to string before you transform the list. Try:
    df['col'].apply(', '.join)
Also note that apply applies the function to each element of the series, so using df['col'] in the lambda function is probably not what you want.
Or, there is a native .str.join method, but it is (surprisingly) a bit slower than apply.
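For comparison, here is a small sketch showing that the two approaches produce the same strings on clean data. The sample series are invented for illustration, and no timing is measured here. One difference to be aware of is a row that holds no list at all (None in this sketch): .str.join leaves it missing, while apply with ', '.join raises a TypeError.

```python
import pandas as pd

s = pd.Series([["one", "two", "three"], ["four", "five", "six"]])

via_apply = s.apply(", ".join)   # calls ", ".join on each row's list
via_str = s.str.join(", ")       # pandas' own element-wise join

print(via_apply.tolist() == via_str.tolist())

# A row with no list is where the two approaches behave differently.
with_missing = pd.Series([["one", "two"], None])
safe = with_missing.str.join(", ")   # the missing row stays missing

try:
    with_missing.apply(", ".join)
    apply_failed = False
except TypeError:
    # ", ".join(None) is not valid, so apply propagates the error.
    apply_failed = True

print(safe.tolist()[0], apply_failed)
```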
When you cast col to str with astype, you get the string representation of a Python list, brackets and all. You do not need to do that; just apply join directly:
    import pandas as pd

    df = pd.DataFrame({
        'A': [['a', 'b', 'c'], ['A', 'B', 'C']]
        })

    # Out[8]:
    #            A
    # 0  [a, b, c]
    # 1  [A, B, C]

    df['Joined'] = df.A.apply(', '.join)

    #            A   Joined
    # 0  [a, b, c]  a, b, c
    # 1  [A, B, C]  A, B, C
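To see why casting first goes wrong, this short sketch (with an invented one-row series) shows what astype(str) actually stores, and what happens if join is then applied to that text: join iterates over the characters of the string, not over the original list items.

```python
import pandas as pd

s = pd.Series([["one", "two", "three"]])

# After the cast, each cell is the text of the list, not a list.
as_text = s.astype(str)
first = as_text.iloc[0]
print(repr(first))

# Joining a string splits it into single characters.
char_joined = ", ".join(first)
print(char_joined[:10])

# Joining the original list gives the intended result.
correct = ", ".join(s.iloc[0])
print(correct)
```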