Why does "numpy.any" have no short-circuit mechanism?

I don't understand why such a basic optimization has not yet been done:

In [1]: one_million_ones = np.ones(10**6)
In [2]: %timeit one_million_ones.any()
100 loops, best of 3: 693µs per loop

In [3]: ten_millions_ones = np.ones(10**7)
In [4]: %timeit ten_millions_ones.any()
10 loops, best of 3: 7.03 ms per loop

The whole array is scanned, even though the result is evident from the first item.

896

asked Aug 19 '17 12:08

B. M.

1 Answer

It's an unfixed performance regression: NumPy issue 3446. There actually is short-circuiting logic, but a change to the ufunc.reduce machinery introduced an unnecessary chunk-based outer loop around the short-circuiting logic, and that outer loop doesn't know how to short-circuit. You can see some explanation of the chunking machinery here.
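To make the description above concrete, here is a toy model in pure Python. It is not NumPy's actual code: the function name chunked_any, the chunk size, and the outer_short_circuits flag are all made up for illustration. It only shows how an inner loop that stops early can still be called once per chunk by an outer loop that never stops.

```python
def chunked_any(values, chunk_size, outer_short_circuits):
    """Toy model of a chunked "any" reduction (hypothetical, not NumPy's code).

    Returns (result, number_of_elements_inspected).
    """
    result = False
    inspected = 0
    for start in range(0, len(values), chunk_size):
        for item in values[start:start + chunk_size]:
            inspected += 1
            if item:
                result = True
                break  # the inner loop stops at the first truthy item
        if result and outer_short_circuits:
            break  # an outer loop aware of short-circuiting would stop here too
    return result, inspected


data = [True] * 1000
# Outer loop unaware of short-circuiting: one element inspected per chunk.
print(chunked_any(data, 100, outer_short_circuits=False))  # (True, 10)
# Outer loop that also stops: a single element inspected in total.
print(chunked_any(data, 100, outer_short_circuits=True))   # (True, 1)
```

In this model the work done without the outer break still grows with the number of chunks, which is consistent with the timings in the question scaling with array size.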

The short-circuiting effects wouldn't have shown up in your test even without the regression, though. First, you're timing the array creation, and second, I don't think they ever put in the short-circuit logic for any input dtype but boolean. From the discussion, it sounds like the details of the ufunc reduction machinery behind numpy.any would have made that difficult.

The discussion does bring up the surprising point that the argmin and argmax methods appear to short-circuit for boolean input. A quick test shows that as of NumPy 1.12 (not quite the most recent version, but the version currently on Ideone), x[x.argmax()] short-circuits, and it outcompetes x.any() and x.max() for 1-dimensional boolean input no matter whether the input is small or large and no matter whether the short-circuiting pays off. Weird!
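A minimal sketch of that comparison, assuming a non-empty 1-dimensional boolean array. The helper name any_via_argmax is made up for this example, and the array is built outside the timed calls so that only the reductions are measured. Timings depend on the NumPy version and the machine, so none are shown.

```python
import timeit

import numpy as np

# Boolean input whose first element already decides the result.
x = np.zeros(10**6, dtype=bool)
x[0] = True


def any_via_argmax(a):
    """Same result as a.any() for a non-empty 1-D boolean array.

    argmax returns the index of the first True, or 0 if there is none,
    so a[a.argmax()] is True exactly when at least one element is True.
    """
    return bool(a[a.argmax()])


# Both approaches agree, with and without a True element present.
assert any_via_argmax(x) == bool(x.any())
assert any_via_argmax(np.zeros(10, dtype=bool)) == False

# Time only the reductions; the array already exists.
for label, fn in [
    ("x.any()", lambda: x.any()),
    ("x.max()", lambda: x.max()),
    ("x[x.argmax()]", lambda: x[x.argmax()]),
]:
    seconds = timeit.timeit(fn, number=100)
    print(f"{label:>14}: {seconds / 100 * 1e6:.1f} µs per call")
```

Note that argmax raises ValueError on an empty array, whereas any() returns False there, so the two are not interchangeable for empty input.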

132

answered Sep 17 '22 00:09

user2357112 supports Monica

Tags: performance, python, numpy
