You can get unique values in column (multiple columns) from pandas DataFrame using unique() or Series. unique() functions. unique() from Series is used to get unique values from a single column and the other one is used to get from multiple columns.
sorted(iterable)
: Return a new sorted list from the items in iterable.
CODE
import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print(sorted(a))
OUTPUT
[1, 2, 3, 6, 8]
sort
sorts inplace so returns nothing:
In [54]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
a
Out[54]:
array([1, 2, 3, 6, 8], dtype=int64)
So you have to call print a
again after the call to sort
.
Eg.:
In [55]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
print(a)
[1 2 3 6 8]
You can also use the drop_duplicates() instead of unique()
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].drop_duplicates()
a.sort()
print a
I prefer the oneliner:
print(sorted(df['Column Name'].unique()))
Came across the question myself today. I think the reason that your code returns 'None' (exactly what I got by using the same method) is that
a.sort()
is calling the sort function to mutate the list a. In my understanding, this is a modification command. To see the result you have to use print(a).
My solution, as I tried to keep everything in pandas:
pd.Series(df['A'].unique()).sort_values()
I would suggest using numpy's sort, as it is anyway what pandas is doing in background:
import numpy as np
np.sort(df.A.unique())
But doing all in pandas is valid as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With