I want to extract all unique combinations of values of columns Col1, Col2 and Col3. Let's say there is the following dataframe df:
df =
Col1 Col2 Col3
12 AB 13
11 AB 13
12 AB 13
12 AC 14
The answer is:
unique =
Col1 Col2 Col3
12 AB 13
11 AB 13
12 AC 14
I know how to obtain the unique values of a particular column, e.g. df.Col1.unique(), but I am not sure how to get unique combinations.
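You can get the unique values of one or several columns of a pandas DataFrame with Series.unique() or the top-level pandas.unique() function. Series.unique() returns the unique values of a single column. pandas.unique() accepts one-dimensional input only, so values from multiple columns must be flattened first. Either way the result is a set of individual values, not unique row combinations.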
When more than one expression is provided in the DISTINCT clause, the query will retrieve unique combinations for the expressions listed. In SQL, the DISTINCT clause doesn't ignore NULL values. So when using the DISTINCT clause in your SQL statement, your result set will include NULL as a distinct value.
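For comparison with the SQL behaviour described above, here is a minimal sketch of how pandas handles missing values when deduplicating rows with drop_duplicates(). The frame df_nan and its values are made up for illustration and are not part of the question's data. As far as this sketch shows, NaN is treated like any other value: two rows with NaN in the same column count as duplicates, and one of them is kept.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a missing value repeated in two rows.
df_nan = pd.DataFrame({
    "Col1": [12.0, np.nan, np.nan],
    "Col2": ["AB", "AC", "AC"],
})

# Rows 1 and 2 are identical (NaN, "AC"), so only the first of them survives.
deduped = df_nan.drop_duplicates()
print(deduped)
```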
There is a method for this, pandas.DataFrame.drop_duplicates:
>>> df.drop_duplicates()
Col1 Col2 Col3
0 12 AB 13
1 11 AB 13
3 12 AC 14
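For a version that can be pasted and run as is, the sketch below rebuilds the example frame from the question and applies drop_duplicates(). The reset_index(drop=True) step is an optional extra, not something the question asks for; it only renumbers the rows so the result does not keep the original labels 0, 1, 3.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Col1": [12, 11, 12, 12],
    "Col2": ["AB", "AB", "AB", "AC"],
    "Col3": [13, 13, 13, 14],
})

# Keep the first occurrence of each (Col1, Col2, Col3) combination.
unique = df.drop_duplicates()

# Optional: renumber the rows 0..n-1 instead of keeping the original labels.
unique_reset = unique.reset_index(drop=True)

print(unique)
print(unique_reset)
```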
You can also do it in place with inplace=True, which modifies df itself:
>>> df.drop_duplicates(inplace=True)
>>> df
Col1 Col2 Col3
0 12 AB 13
1 11 AB 13
3 12 AC 14
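One detail worth knowing about the in-place form: the call itself returns None, so its result should not be assigned. A small sketch, using the question's data and the illustrative names df_a and df_b, compares it with plain reassignment:

```python
import pandas as pd

data = {
    "Col1": [12, 11, 12, 12],
    "Col2": ["AB", "AB", "AB", "AC"],
    "Col3": [13, 13, 13, 14],
}

# In-place: df_a is modified and the call returns None.
df_a = pd.DataFrame(data)
result = df_a.drop_duplicates(inplace=True)

# Reassignment: the original object is left alone and a new frame is bound.
df_b = pd.DataFrame(data)
df_b = df_b.drop_duplicates()

print(result)             # None
print(df_a.equals(df_b))  # True
```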
If you need to get unique values of certain columns:
>>> df[['Col2','Col3']].drop_duplicates()
Col2 Col3
0 AB 13
3 AC 14
As @jezrael suggests, you can also use the subset parameter of drop_duplicates(), which compares only the listed columns but keeps every column in the result:
>>> df.drop_duplicates(subset=['Col2','Col3'])
Col1 Col2 Col3
0 12 AB 13
3 12 AC 14
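The two variants above differ in what they return, which the following sketch makes explicit. It reuses the question's data. The keep="last" call is an extra illustration of the documented keep parameter and was not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": [12, 11, 12, 12],
    "Col2": ["AB", "AB", "AB", "AC"],
    "Col3": [13, 13, 13, 14],
})
cols = ["Col2", "Col3"]

# Selecting the columns first returns only those columns.
pairs = df[cols].drop_duplicates()

# subset= compares only those columns but returns whole rows,
# keeping the first row seen for each (Col2, Col3) pair.
first_rows = df.drop_duplicates(subset=cols)

# keep="last" keeps the last row of each pair instead of the first.
last_rows = df.drop_duplicates(subset=cols, keep="last")

print(pairs)
print(first_rows)
print(last_rows)
```

Use the column selection when only the combinations matter, and subset when one representative row per combination is needed with the remaining columns intact.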