
Numpy: Duplicate mask for an array (returning True if we've seen that value before, False otherwise)

Tags: python, list, numpy

I'm looking for a vectorized function that returns a boolean mask: True where the value has already appeared earlier in the array, and False otherwise.

Speed is very important, so I want the fastest solution possible.

For example this is what I would like to see:

array = [1, 2, 1, 2, 3]
mask = [False, False, True, True, False]

So is_duplicate = array[mask] should return [1, 2].

Is there a fast, vectorized way to do this? Thanks!

asked Oct 24 '25 18:10 by narcissa


2 Answers

Approach #1 : With sorting

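import numpy as np

def mask_firstocc(a):
    # A stable sort keeps equal values in their original relative order
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    # In sorted order a duplicate equals its predecessor;
    # sidx.argsort() maps the mask back to the original positions
    out = np.r_[False,b[:-1] == b[1:]][sidx.argsort()]
    return out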

We can avoid the second argsort by writing the mask back through sidx with array assignment, which should improve performance further -

def mask_firstocc_v2(a):
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    mask = np.r_[False,b[:-1] == b[1:]]
    # Scatter the sorted-order mask back to the original positions
    out = np.empty(len(a), dtype=bool)
    out[sidx] = mask
    return out

Sample run -

In [166]: a
Out[166]: array([2, 1, 1, 0, 0, 4, 0, 3])

In [167]: mask_firstocc(a)
Out[167]: array([False, False,  True, False,  True, False,  True, False])
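
To sanity-check the sorting approach, it can be compared against a plain Python loop that tracks seen values in a set. This is only a sketch: mask_seen_before_reference is a hypothetical helper name introduced here, not part of the answer, and it is meant for verification, not speed.

```python
import numpy as np

def mask_seen_before_reference(values):
    """Slow pure-Python reference: True where the value appeared earlier."""
    seen = set()
    mask = []
    for v in values:
        mask.append(v in seen)
        seen.add(v)
    return np.array(mask, dtype=bool)

def mask_firstocc_v2(a):
    # Same logic as the answer's second version
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    mask = np.r_[False, b[:-1] == b[1:]]
    out = np.empty(len(a), dtype=bool)
    out[sidx] = mask
    return out

a = np.array([2, 1, 1, 0, 0, 4, 0, 3])
expected = mask_seen_before_reference(a.tolist())
assert np.array_equal(mask_firstocc_v2(a), expected)
```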

Approach #2 : With np.unique(..., return_index)

We can leverage np.unique with return_index=True, which returns the index of the first occurrence of each unique element. Hence a simple array assignment at those indices is enough -

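def mask_firstocc_with_unique(a):
    # Start with everything marked as a duplicate
    mask = np.ones(len(a), dtype=bool)
    # Clear the flag at the first occurrence of each unique value
    mask[np.unique(a, return_index=True)[1]] = False
    return mask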
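
Since speed is the main concern, it is worth timing both approaches on data shaped like your own. The following is a rough sketch of how that might be done with timeit. The array size, value range, and the names mask_sort and mask_unique are arbitrary assumptions for illustration, and the relative timings will depend on your data and NumPy version.

```python
import timeit
import numpy as np

def mask_sort(a):
    # Approach #1 (array-assignment variant)
    sidx = a.argsort(kind='stable')
    b = a[sidx]
    out = np.empty(len(a), dtype=bool)
    out[sidx] = np.r_[False, b[:-1] == b[1:]]
    return out

def mask_unique(a):
    # Approach #2
    mask = np.ones(len(a), dtype=bool)
    mask[np.unique(a, return_index=True)[1]] = False
    return mask

# Assumed test data: 100k integers drawn from 1000 distinct values
rng = np.random.default_rng(0)
a = rng.integers(0, 1_000, size=100_000)

# Both approaches should agree before comparing their speed
assert np.array_equal(mask_sort(a), mask_unique(a))

for fn in (mask_sort, mask_unique):
    best = min(timeit.repeat(lambda: fn(a), number=10, repeat=3))
    print(f"{fn.__name__}: {best / 10 * 1e3:.3f} ms per call")
```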

answered Oct 26 '25 06:10 by Divakar


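Use np.unique with return_index=True. It returns the index of the first occurrence of each unique value, so every position not in that index array holds a value that has been seen before.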

import numpy as np

a = np.array([1, 2, 1, 2, 3])
_, ix = np.unique(a, return_index=True)
b = np.full(a.shape, True)
b[ix] = False

In [45]: b
Out[45]: array([False, False,  True,  True, False])
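
As a small follow-up, here is how the mask might be used to pull out the duplicated values, as the question asks. The variable names first_idx and is_duplicate are chosen for illustration only.

```python
import numpy as np

a = np.array([1, 2, 1, 2, 3])

# Indices of the first occurrence of each unique value
_, first_idx = np.unique(a, return_index=True)

# Everything is a duplicate except the first occurrences
mask = np.ones(a.shape, dtype=bool)
mask[first_idx] = False

# Boolean indexing keeps only the repeated entries
is_duplicate = a[mask]
print(is_duplicate)  # [1 2]
```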

answered Oct 26 '25 07:10 by Andy L.


