
How to take the set union of all the values in a column of a pandas DataFrame?

Tags:

python

pandas

First two rows of a dataframe, df:

0|50331648|{1,2,3,4,5}|6  
1|50331649|{3,5,7,8}|2  

After performing the operation on the column of sets, I just need a single set that contains {1,2,3,4,5,7,8}.

How can I achieve this?

Abhishek Niranjan asked Jan 31 '17 10:01



1 Answer

Assuming "B" is the name of the column in question, you can convert it to a list and unpack that list into set.union:

set.union(*df['B'].tolist())
{1, 2, 3, 4, 5, 7, 8}

(Or)

Pass set.union as the function argument to reduce:

from functools import reduce      # required on Python 3; reduce is a builtin on Python 2
reduce(set.union, df['B'].tolist())
{1, 2, 3, 4, 5, 7, 8}
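
Both forms above need at least one row: `set.union(*[])` and `reduce` without an initial value raise `TypeError` on an empty sequence. Below is a minimal sketch of an empty-safe variant, assuming every cell in the column holds a set (or another iterable). The helper name `column_union` is hypothetical and not part of pandas:

```python
import pandas as pd


def column_union(column):
    """Return the union of all sets in a pandas Series (hypothetical helper)."""
    # Calling union on an empty set instance keeps the call valid
    # even when the Series has zero rows.
    return set().union(*column)


df = pd.DataFrame({"B": [{1, 2, 3, 4, 5}, {3, 5, 7, 8}]})

result = column_union(df["B"])                  # union over both rows
empty_result = column_union(df["B"].iloc[0:0])  # zero rows -> empty set
```

If the column may contain missing values, dropping them first with `df["B"].dropna()` would be one way to avoid a `TypeError` from trying to iterate over `NaN`.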

Data:

import pandas as pd

df = pd.DataFrame(dict(A=[50331648, 50331649],
                       B=[{1,2,3,4,5}, {3,5,7,8}],
                       C=[6,2])
                 )
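
For anyone who wants to run the answer end to end, here is a short sketch that rebuilds the sample frame and checks that the unpacking and `reduce` approaches agree. The variable names `sets`, `via_unpacking`, and `via_reduce` are illustrative only:

```python
from functools import reduce

import pandas as pd

# Sample frame from the question: column "B" holds one set per row.
df = pd.DataFrame({
    "A": [50331648, 50331649],
    "B": [{1, 2, 3, 4, 5}, {3, 5, 7, 8}],
    "C": [6, 2],
})

sets = df["B"].tolist()

# Approach 1: unpack every set as an argument to set.union.
via_unpacking = set.union(*sets)

# Approach 2: fold the list pairwise with reduce.
via_reduce = reduce(set.union, sets)

assert via_unpacking == via_reduce
print(sorted(via_unpacking))  # [1, 2, 3, 4, 5, 7, 8]
```

Neither approach modifies the sets stored in the frame, since `set.union` returns a new set each time.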

Nickil Maveli answered Oct 12 '22 11:10