Starting from a simple array with duplicate values:
a = np.array([2,3,2,2,3,3,2,1])
I'm trying to keep at most 2 occurrences of each unique value. The resulting array would appear as:
b = np.array([2,3,2,3,1])
no matter the order of the items. So far I tried to find unique values with:
In [20]: c = np.unique(a,return_counts=True)
In [21]: c
Out[21]: (array([1, 2, 3]), array([1, 4, 3]))
which is useful because it returns the frequency of each value as well, but I'm stuck on filtering by frequency.
You could use np.repeat to generate the desired array from the array of uniques and counts:
import numpy as np
a = np.array([2,3,2,2,3,3,2,1])
uniques, count = np.unique(a,return_counts=True)
np.repeat(uniques, np.clip(count, 0, 2))
yields
array([1, 2, 2, 3, 3])
np.clip is used to force all values in count to be between 0 and 2. Thus, you get at most two copies of each unique value. Note that the result is sorted, because np.unique returns the unique values in sorted order.
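For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the function name `limit_occurrences` and the `max_count` parameter are made up for illustration and are not part of NumPy.

```python
import numpy as np

def limit_occurrences(arr, max_count=2):
    """Return a sorted array with at most max_count copies of each value.

    Hypothetical helper, not a NumPy function.
    """
    uniques, counts = np.unique(arr, return_counts=True)
    # Cap every count at max_count; for non-negative counts this is
    # equivalent to np.clip(counts, 0, max_count).
    return np.repeat(uniques, np.minimum(counts, max_count))

a = np.array([2, 3, 2, 2, 3, 3, 2, 1])
result = limit_occurrences(a)
print(result)  # [1 2 2 3 3]
```

Passing `max_count=1` would reduce this to plain `np.unique(a)`.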
You can use a list comprehension within np.concatenate() and limit the number of items by slicing:
>>> np.concatenate([a[a==i][:2] for i in np.unique(a)])
array([1, 2, 2, 3, 3])
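Both approaches above return the values sorted. The question says order doesn't matter, but if you did want to keep the original order (as in the example `b`), one possible sketch is to build a boolean mask from a stable sort. The helper name `limit_occurrences_keep_order` is hypothetical, chosen just for this example.

```python
import numpy as np

def limit_occurrences_keep_order(arr, max_count=2):
    """Keep the first max_count occurrences of each value, in original order.

    Hypothetical helper, not a NumPy function.
    """
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr
    # A stable sort groups equal values while preserving their original order.
    order = np.argsort(arr, kind="stable")
    sorted_arr = arr[order]
    positions = np.arange(arr.size)
    # True wherever a new run of equal values begins in the sorted array.
    is_start = np.r_[True, sorted_arr[1:] != sorted_arr[:-1]]
    # Index at which the current run started, for every sorted position.
    run_start = np.maximum.accumulate(np.where(is_start, positions, 0))
    # 0 for the first occurrence of a value, 1 for the second, and so on.
    occurrence = positions - run_start
    # Scatter the keep/drop decision back to the original positions.
    keep = np.empty(arr.size, dtype=bool)
    keep[order] = occurrence < max_count
    return arr[keep]

a = np.array([2, 3, 2, 2, 3, 3, 2, 1])
ordered = limit_occurrences_keep_order(a)
print(ordered)  # [2 3 2 3 1]
```

This avoids the Python-level loop of the list comprehension, at the cost of being less readable; for small arrays the comprehension is probably the simpler choice.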