numpy.unique gives wrong output for list of sets

Tags: python, list, set, numpy

I have a list of sets given by,

sets1 = [{1},{2},{1}]

When I find the unique elements in this list using numpy's unique, I get

np.unique(sets1)
Out[18]: array([{1}, {2}, {1}], dtype=object)

As can be seen, the result is wrong: {1} is repeated in the output.

When I change the order of the input so that equal elements are adjacent, this doesn't happen.

sets2 = [{1},{1},{2}]

np.unique(sets2)
Out[21]: array([{1}, {2}], dtype=object)

Why does this occur? Or is there something wrong with the way I have done it?

asked Nov 21 '19 14:11

rashid

1 Answer

What happens here is that np.unique is built on the internal _unique1d function in NumPy's source, which itself relies on the array's .sort() method.

Sorting a list of sets that each contain a single integer does not order the sets by the value of that integer. Python's comparison operators on sets test subset relations, so neither {1} < {2} nor {2} < {1} is true, and the sort leaves the list as it was. So we get the following, which is not what we want:

sets = [{1},{2},{1}]
sets.sort()
print(sets)

# > [{1}, {2}, {1}]
# i.e. the list has not been "sorted" the way we want it to be
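The behaviour of the comparison operators can be checked directly. The snippet below is a small illustration using only built-in sets; the variable names are made up for the example.

```python
# Comparison operators on sets test subset relations, not numeric order.
a, b = {1}, {2}

a_less_b = a < b        # False: {1} is not a proper subset of {2}
b_less_a = b < a        # False: {2} is not a proper subset of {1}
subset = {1} < {1, 2}   # True: {1} is a proper subset of {1, 2}

# Neither {1} nor {2} is "less than" the other, so sorting leaves them in place.
unchanged = sorted([{1}, {2}, {1}])

print(a_less_b, b_less_a, subset)
print(unchanged)
```

Since every `<` comparison between {1} and {2} is False, the sort has no reason to move anything.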

As you have pointed out, if equal sets are already adjacent in the list, np.unique works: after the sort step it only compares each element with its neighbour, so adjacent duplicates are dropped while duplicates that are further apart are missed.
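To make the sort-then-compare-neighbours idea concrete, here is a simplified pure-Python sketch. It is not NumPy's actual code, and `unique_after_sort` is a hypothetical name used only for this illustration.

```python
def unique_after_sort(items):
    """Simplified illustration: sort, then keep elements that differ from their predecessor."""
    ordered = sorted(items)      # relies on "<" between the elements
    result = ordered[:1]         # the first element is always kept
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev:          # only adjacent elements are compared
            result.append(cur)
    return result


scattered = unique_after_sort([{1}, {2}, {1}])   # duplicates are not adjacent
adjacent = unique_after_sort([{1}, {1}, {2}])    # duplicates are adjacent

print(scattered)
print(adjacent)
```

With the sets from the question, the first call keeps both copies of {1} and the second call removes the duplicate, which matches the two outputs shown above.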

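One specific solution (please be aware that it only works for a list of sets that each contain a single integer, and that it still depends on NumPy's own sort leaving the pre-sorted order untouched) is to sort the list by that integer before calling np.unique: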

np.unique(sorted(sets, key=lambda x: next(iter(x))))
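If the sets can hold more than one element, a possible alternative that avoids sorting altogether is to deduplicate via frozenset, which is hashable. This is only a sketch and not part of the original answer; `unique_sets` is a hypothetical helper name, and it returns a plain list rather than a NumPy array.

```python
def unique_sets(sets):
    """Return the distinct sets in first-seen order (hypothetical helper)."""
    # frozenset is hashable, so it can be used as a dict key;
    # dict preserves insertion order, so the first occurrence wins.
    distinct = dict.fromkeys(frozenset(s) for s in sets)
    return [set(fs) for fs in distinct]


single = unique_sets([{1}, {2}, {1}])
multi = unique_sets([{1, 2}, {3}, {2, 1}])

print(single)
print(multi)
```

This approach compares sets by equality through hashing, so it does not depend on any ordering between them.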

answered Nov 03 '22 00:11

bglbrt
