
numpy.unique gives wrong output for list of sets

I have a list of sets given by,

sets1 = [{1},{2},{1}]

When I find the unique elements in this list using numpy's unique, I get

np.unique(sets1)
Out[18]: array([{1}, {2}, {1}], dtype=object)

As can be seen, the result is wrong: {1} is repeated in the output.

When I change the order in the input by making similar elements adjacent, this doesn't happen.

sets2 = [{1},{1},{2}]

np.unique(sets2)
Out[21]: array([{1}, {2}], dtype=object)

Why does this occur? Or is there something wrong with the way I am doing this?

asked Nov 21 '19 14:11 by rashid




1 Answer

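np.unique is based on NumPy's internal _unique1d function, which sorts the array with the .sort() method and then keeps each element that differs from the one before it. Duplicates are therefore only removed when the sort places equal elements next to each other.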
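The sort-then-compare-neighbours idea can be sketched in plain Python. This is a simplified illustration, not NumPy's actual code; the helper name unique_by_sort_sketch is hypothetical.

```python
def unique_by_sort_sketch(items):
    """Hypothetical helper: sort, then keep items that differ from their predecessor."""
    ordered = sorted(items)
    result = []
    for item in ordered:
        # An item is kept only if it is not equal to the previously kept one.
        if not result or item != result[-1]:
            result.append(item)
    return result


ints_result = unique_by_sort_sketch([3, 1, 3, 2])
sets_result = unique_by_sort_sketch([{1}, {2}, {1}])

print(ints_result)  # [1, 2, 3]
print(sets_result)  # [{1}, {2}, {1}]
```

With integers the sort brings the two 3s together, so the duplicate is dropped. With sets the sort does not move anything, so the two {1} entries never become neighbours.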

Sorting a list of sets that each contain a single integer does not order the sets by that integer. For sets, < means "is a proper subset of", so neither {1} < {2} nor {2} < {1} is true, and the sort leaves the elements where they are. So we get (and that is not what we want):

sets = [{1},{2},{1}]
sets.sort()
print(sets)

# [{1}, {2}, {1}]
# i.e. the list has not been "sorted" the way we want it to be
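The reason the sort has no effect can be checked directly: the comparison operators on Python sets test subset relationships rather than ordering by content. A short check, using the illustrative names a and b:

```python
a, b = {1}, {2}

a_lt_b = a < b               # is {1} a proper subset of {2}?
b_lt_a = b < a               # is {2} a proper subset of {1}?
subset_case = {1} < {1, 2}   # a genuine proper subset

print(a_lt_b)       # False
print(b_lt_a)       # False
print(subset_case)  # True
```

Because neither set is "less than" the other, a comparison sort has no reason to swap them.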

As you have pointed out, if equal sets are already adjacent in the list, np.unique works: the duplicates sit next to each other, so the comparison with the previous element removes them.

One specific solution (though, please be aware that it will only work for a list of sets that each contain a single integer) would then be:

np.unique(sorted(sets, key=lambda x: next(iter(x))))
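For sets of any size, one possible alternative is to skip sorting altogether and track what has been seen via frozenset, which is hashable. This is a sketch of a different technique from the one above, not part of NumPy; the function name unique_sets is made up for illustration.

```python
def unique_sets(sets):
    """Hypothetical helper: return the distinct sets, keeping first-seen order."""
    seen = set()
    result = []
    for s in sets:
        key = frozenset(s)  # a hashable, immutable view of the set's contents
        if key not in seen:
            seen.add(key)
            result.append(s)
    return result


single = unique_sets([{1}, {2}, {1}])
multi = unique_sets([{1, 2}, {3}, {2, 1}])

print(single)  # [{1}, {2}]
print(multi)   # [{1, 2}, {3}]
```

Unlike np.unique, this returns a plain list in order of first appearance rather than a sorted array.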
answered Nov 03 '22 00:11 by bglbrt