I have a dataframe with two columns containing lists. I want to combine these columns into a single column in which each row holds one merged list. The merged list should contain only the unique values from the two original lists.
I've tried merging them using df['E']=df[['B','C']].values.tolist().
However, this creates a single column in which each value is a list containing the two original lists.
The dataframe looks something like this:
A   B        C        D
a1  [b1,b2]  [c1,b1]  d1
a2  [b1,b1]  [b3]     d2
a3  [b2]     [b2,b2]  d3
The final dataframe should look like this:
A   B        C        D   E
a1  [b1,b2]  [c1,b1]  d1  [b1,b2,c1]
a2  [b1,b1]  [b3]     d2  [b1,b3]
a3  [b2]     [b2,b2]  d3  [b2]
Edit: The values within the lists of the dataframe are strings.
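For anyone who wants to reproduce this, here is a minimal sketch of the example frame, assuming the list entries are plain strings as stated in the edit. The variable name `attempt` is only for illustration. It also shows what the attempted approach returns for the first row.

```python
import pandas as pd

# Reconstruction of the example frame; list entries are assumed to be strings.
df = pd.DataFrame({
    "A": ["a1", "a2", "a3"],
    "B": [["b1", "b2"], ["b1", "b1"], ["b2"]],
    "C": [["c1", "b1"], ["b3"], ["b2", "b2"]],
    "D": ["d1", "d2", "d3"],
})

# The attempted approach: each row becomes a list that holds
# the two original lists side by side, not one flat list.
attempt = df[["B", "C"]].values.tolist()
print(attempt[0])  # [['b1', 'b2'], ['c1', 'b1']]
```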
Using tolist(), you can convert a pandas DataFrame column to a list. df['Courses'] returns the DataFrame column as a Series; calling .values.tolist() on it converts the column values to a list.
To join a list of DataFrames, say dfs, use the pandas.concat(dfs) function, which merges an arbitrary number of DataFrames into a single one.
IIUC
df['E']=(df.B+df.C).map(set).map(list)
df
Out[81]:
    A         B         C   D             E
0  a1  [b1, b2]  [c1, b1]  d1  [b2, b1, c1]
1  a2  [b1, b1]      [b3]  d2      [b3, b1]
2  a3      [b2]  [b2, b2]  d3          [b2]
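Note that a set does not keep the original order, which is why the first row above comes out as [b2, b1, c1] rather than [b1, b2, c1]. If the first-seen order from the question's expected output matters, one possible variant is to deduplicate with dict.fromkeys, which preserves insertion order on Python 3.7+. This is a sketch of an alternative, not the approach used above, and it rebuilds the example frame so it can run on its own.

```python
import pandas as pd

# Example frame from the question; list entries are assumed to be strings.
df = pd.DataFrame({
    "A": ["a1", "a2", "a3"],
    "B": [["b1", "b2"], ["b1", "b1"], ["b2"]],
    "C": [["c1", "b1"], ["b3"], ["b2", "b2"]],
    "D": ["d1", "d2", "d3"],
})

# Concatenate the two lists row by row, then drop duplicates while
# keeping the order in which each value first appears.
df["E"] = [list(dict.fromkeys(b + c)) for b, c in zip(df["B"], df["C"])]

print(df["E"].tolist())
# [['b1', 'b2', 'c1'], ['b1', 'b3'], ['b2']]
```

The list comprehension avoids the two chained map calls, and the result matches the order shown in the question's desired output.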