The numpy.unique function can return the counts of unique elements if return_counts is True. The returned tuple then consists of two arrays, one containing the unique elements and the other containing their counts, and both are sorted by the unique elements. Is there a way to have both sorted according to the counts array instead of the unique elements? I know how to do it the hard way, but is there a concise one-liner or lambda for such cases?
Current result:
import numpy as np
my_chr_list = ["a", "a", "a", "b", "c", "b", "d", "d"]
unique_els, counts = np.unique(my_chr_list, return_counts=True)
print(unique_els, counts)
Which returns something along the lines of this:
>>> (array(['a', 'b', 'c', 'd'], dtype='<U1'), array([3, 2, 1, 2], dtype=int64))
However, what I would want to have:
>>> (array(['a', 'b', 'd', 'c'], dtype='<U1'), array([3, 2, 2, 1], dtype=int64))
Using np.unique is generally fast, and it tends to perform especially well on NumPy array data, although the exact speed depends on the input.
This function returns an array of the unique elements in the input array. It can also return a tuple consisting of the array of unique values and one or more arrays of associated indices or counts. What these extra arrays contain depends on which return parameters are set in the function call.
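As a rough illustration of those return parameters, the sketch below calls np.unique with return_index, return_inverse and return_counts all switched on. The variable names (values, first_idx, inverse) are only chosen for this example.

```python
import numpy as np

values = np.array(["a", "a", "a", "b", "c", "b", "d", "d"])

# Each return_* flag adds one more array to the returned tuple.
uniques, first_idx, inverse, counts = np.unique(
    values, return_index=True, return_inverse=True, return_counts=True
)

# uniques:   sorted unique values
# first_idx: index of the first occurrence of each unique value in `values`
# inverse:   for each input element, its position in `uniques`
# counts:    number of occurrences of each unique value
print(uniques)    # ['a' 'b' 'c' 'd']
print(first_idx)  # [0 3 4 6]
print(inverse)    # [0 0 0 1 2 1 3 3]
print(counts)     # [3 2 1 2]
```

Indexing uniques with inverse rebuilds the original array, which is a quick way to check what the inverse indices mean.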
You can't do this directly with the unique function. Instead, as a Numpythonic approach, you can use the return_counts keyword to get the counts of the unique items, then use np.argsort on the negated counts to get the indices that sort them in descending order, and use the result to reorder both arrays by frequency.
In [33]: arr = np.array(my_chr_list)
In [34]: u, count = np.unique(arr, return_counts=True)
In [35]: count_sort_ind = np.argsort(-count)
In [36]: u[count_sort_ind]
Out[36]: array(['a', 'b', 'd', 'c'], dtype='<U1')
In [37]: count[count_sort_ind]
Out[37]: array([3, 2, 2, 1])
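If this comes up often, the steps above could be wrapped in a small helper. The following is only a sketch: unique_by_count is a hypothetical name, not a NumPy function, and kind="stable" is an addition here so that elements with equal counts keep their sorted order.

```python
import numpy as np

def unique_by_count(values):
    """Return unique elements and their counts, most frequent first.

    Hypothetical helper for illustration; not part of NumPy.
    """
    uniques, counts = np.unique(values, return_counts=True)
    # Negate the counts to sort in descending order; a stable sort keeps
    # tied elements in the order np.unique returned them.
    order = np.argsort(-counts, kind="stable")
    return uniques[order], counts[order]

my_chr_list = ["a", "a", "a", "b", "c", "b", "d", "d"]
els, cnts = unique_by_count(my_chr_list)
print(els, cnts)  # ['a' 'b' 'd' 'c'] [3 2 2 1]
```

For plain Python lists, collections.Counter(my_chr_list).most_common() is another option, though it returns a list of (element, count) pairs rather than two arrays.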