
Distinct combinations of values in a Pandas DataFrame

Tags: python, pandas

Is there an easy way to pull out the distinct combinations of values in a dataframe? I've used pd.Series.unique() for single columns, but what about multiple columns?

Example data:

import pandas as pd

df = pd.DataFrame(data=[[1, 'a'], [2, 'a'], [3, 'b'], [3, 'b'], [1, 'b'], [1, 'b']],
                  columns=['number', 'letter'])

Expected output:
(1, a)
(2, a)
(3, b)
(1, b)

Ideally, I'd like a separate Series object of tuples with the distinct values.

AZhao asked Sep 25 '15 15:09



1 Answer

You can zip the columns and create a set:

>>> set(zip(df.number, df.letter))
{(1, 'a'), (1, 'b'), (2, 'a'), (3, 'b')}
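
A set does not keep the original row order, and the question asks for a Series of tuples. One possible alternative, sketched here under the assumption that the example `df` from the question is used, is to drop duplicate rows with `DataFrame.drop_duplicates()` (which keeps the first occurrence of each row) and then convert the remaining rows to tuples. The names `unique_rows` and `unique_tuples` are illustrative only.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame(
    data=[[1, 'a'], [2, 'a'], [3, 'b'], [3, 'b'], [1, 'b'], [1, 'b']],
    columns=['number', 'letter'],
)

# Keep the first occurrence of each (number, letter) pair, in row order
unique_rows = df[['number', 'letter']].drop_duplicates()

# Convert each remaining row to a plain tuple and collect them in a Series
unique_tuples = pd.Series(
    list(unique_rows.itertuples(index=False, name=None))
)

print(unique_tuples.tolist())
# [(1, 'a'), (2, 'a'), (3, 'b'), (1, 'b')]
```

If only the rows are needed, `unique_rows` can be used directly as a DataFrame, and the tuple conversion can be skipped.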
Alexander answered Nov 14 '22 21:11
