
Get unique values of multiple columns as a new dataframe in pandas

Given a pandas DataFrame df with at least the columns C1, C2, C3, how would you get all the unique (C1, C2, C3) combinations as a new DataFrame?

In other words, something similar to:

SELECT C1,C2,C3
FROM T
GROUP BY C1,C2,C3

I tried this:

print(df.groupby(by=['C1','C2','C3']))

but I'm getting:

<pandas.core.groupby.DataFrameGroupBy object at 0x000000000769A9E8>
Ofek Ron asked Jan 06 '18 20:01




1 Answer

I believe you need drop_duplicates if you want all unique triples:

df = df.drop_duplicates(subset=['C1','C2','C3'])

If you want to use groupby, add first():

df = df.groupby(by=['C1','C2','C3'], as_index=False).first()
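
As a rough illustration of how the two approaches behave, here is a minimal sketch on a made-up frame. The sample data and the extra column val are assumptions for the example, not part of the question. It also shows a third variant that selects only the three key columns before dropping duplicates, which is probably the closest match to the SQL GROUP BY above.

```python
import pandas as pd

# Hypothetical sample data; the extra column "val" is an assumption for illustration.
df = pd.DataFrame({
    "C1": [1, 1, 2, 2],
    "C2": ["a", "a", "b", "b"],
    "C3": [True, True, False, True],
    "val": [10, 20, 30, 40],
})

cols = ["C1", "C2", "C3"]

# Keeps every column and retains the first row seen for each (C1, C2, C3) triple.
first_rows = df.drop_duplicates(subset=cols)

# Keeps only the three key columns, with a fresh 0..n-1 index.
unique_triples = df[cols].drop_duplicates().reset_index(drop=True)

# groupby route: one row per triple, keys sorted, other columns reduced with first().
grouped = df.groupby(cols, as_index=False).first()

print(unique_triples)
```

Two differences are worth keeping in mind: groupby sorts the keys and drops rows whose keys are NaN by default, and first() takes the first non-null value of each column within a group rather than the first row as a whole, so the results can diverge from drop_duplicates when the data contains missing values.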
jezrael answered Oct 18 '22 06:10