Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the unique values in a column and then sort them

People also ask

How do I find unique values in a DataFrame column?

You can get unique values in column (multiple columns) from pandas DataFrame using unique() or Series. unique() functions. unique() from Series is used to get unique values from a single column and the other one is used to get from multiple columns.


sorted(iterable): Return a new sorted list from the items in iterable.

CODE

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print(sorted(a))

OUTPUT

[1, 2, 3, 6, 8]

sort sorts inplace so returns nothing:

In [54]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
a

Out[54]:
array([1, 2, 3, 6, 8], dtype=int64)

So you have to call print a again after the call to sort.

Eg.:

In [55]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
print(a)

[1 2 3 6 8]

You can also use the drop_duplicates() instead of unique()

df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].drop_duplicates()
a.sort()
print a

I prefer the oneliner:

print(sorted(df['Column Name'].unique()))

Came across the question myself today. I think the reason that your code returns 'None' (exactly what I got by using the same method) is that

a.sort()

is calling the sort function to mutate the list a. In my understanding, this is a modification command. To see the result you have to use print(a).

My solution, as I tried to keep everything in pandas:

pd.Series(df['A'].unique()).sort_values()

I would suggest using numpy's sort, as it is anyway what pandas is doing in background:

import numpy as np
np.sort(df.A.unique())

But doing all in pandas is valid as well.


Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!