How to do a cumulative "all"

Tags: python, pandas, numpy

Setup
Consider the NumPy array a:

>>> import numpy as np
>>> np.random.seed([3,1415])
>>> a = np.random.choice([True, False], (4, 8))

>>> a
array([[ True, False,  True, False,  True,  True, False,  True],
       [False, False, False, False,  True, False, False,  True],
       [False,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True, False,  True, False, False, False]], dtype=bool)

Question
For each column, I want to compute the cumulative equivalent of all.

The result should look like this:

array([[ True, False,  True, False,  True,  True, False,  True],
       [False, False, False, False,  True, False, False,  True],
       [False, False, False, False,  True, False, False,  True],
       [False, False, False, False,  True, False, False, False]], dtype=bool)

Take the first column

a[:, 0]

# Original First Column
array([ True, False, False,  True], dtype=bool)
# So far so good
#        \     False from here on
#         |    /---------------\
array([ True, False, False, False], dtype=bool)
# Cumulative all

So basically, cumulative all stays True for as long as every value seen so far is True, and is False from the first False onward.
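To make the definition concrete, here is a minimal pure-Python sketch of a cumulative all on a single column. The helper name cumulative_all is hypothetical; it is meant only as a reference for the expected semantics, not as a fast implementation.

```python
def cumulative_all(values):
    """Return a list that stays True until the first falsy value, then False."""
    result = []
    still_true = True
    for v in values:
        # Once still_true becomes False it can never turn True again.
        still_true = still_true and bool(v)
        result.append(still_true)
    return result


# First column of the example array from the question
first_column = [True, False, False, True]
print(cumulative_all(first_column))  # [True, False, False, False]
```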

What I have tried
I can get the result with

a.cumprod(0).astype(bool)

But I can't help but wonder whether it's necessary to perform each and every multiplication when I know everything will be False from the first False I see.
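A quick sanity check of the cumprod approach, assuming the array a and the expected result are typed out literally from the listings above rather than regenerated from the seed:

```python
import numpy as np

# The 4x8 array from the Setup section, written out literally
a = np.array([
    [True,  False, True,  False, True, True,  False, True],
    [False, False, False, False, True, False, False, True],
    [False, True,  True,  True,  True, True,  True,  True],
    [True,  True,  True,  False, True, False, False, False],
])

# The desired result from the Question section
expected = np.array([
    [True,  False, True,  False, True, True,  False, True],
    [False, False, False, False, True, False, False, True],
    [False, False, False, False, True, False, False, True],
    [False, False, False, False, True, False, False, False],
])

# Cumulative product down each column, then back to booleans
result = a.cumprod(0).astype(bool)
print(np.array_equal(result, expected))  # True
```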

Consider the larger 1-D array

b = np.array(list('111111111110010101010101010101010101010011001010101010101')).astype(int).astype(bool)

I contend that these two produce the same answer

bool(b.prod())

and

b.all()

But b.all() can short-circuit, while b.prod() cannot. If I time them:

%timeit bool(b.prod())
%timeit b.all()

100000 loops, best of 3: 2.05 µs per loop
1000000 loops, best of 3: 1.45 µs per loop
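Outside IPython, roughly the same comparison can be sketched with the standard timeit module. The repetition count below is an arbitrary assumption, and the absolute numbers will vary with the machine and NumPy version, so no timings are shown here.

```python
import timeit

import numpy as np

b = np.array(
    list('111111111110010101010101010101010101010011001010101010101')
).astype(int).astype(bool)

# Both expressions answer the same question: is every element True?
same_answer = bool(b.prod()) == bool(b.all())
print(same_answer)

n = 10_000  # arbitrary number of repetitions (assumption)
t_prod = timeit.timeit(lambda: bool(b.prod()), number=n)
t_all = timeit.timeit(lambda: b.all(), number=n)

print(f"bool(b.prod()): {t_prod / n * 1e6:.2f} us per call")
print(f"b.all():        {t_all / n * 1e6:.2f} us per call")
```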

b.all() is quicker. This implies that there must be a way to compute a cumulative all that is quicker than my a.cumprod(0).astype(bool).

850

asked Jun 14 '17 21:06

piRSquared

1 Answer

All ufuncs have 5 methods: reduce, accumulate, reduceat, outer, and at. In this case, use accumulate since it returns the result of cumulative applications of the ufunc:

In [41]: np.logical_and.accumulate(a, axis=0)
Out[41]: 
array([[ True, False,  True, False,  True,  True, False,  True],
       [False, False, False, False,  True, False, False,  True],
       [False, False, False, False,  True, False, False,  True],
       [False, False, False, False,  True, False, False, False]], dtype=bool)
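For context, accumulate relates to reduce the way a running total relates to a sum: the last row of the accumulated result along axis 0 is what reduce returns, which for logical_and is the same as a.all(axis=0). A small sketch, assuming the 4x8 array from the question typed out literally:

```python
import numpy as np

a = np.array([
    [True,  False, True,  False, True, True,  False, True],
    [False, False, False, False, True, False, False, True],
    [False, True,  True,  True,  True, True,  True,  True],
    [True,  True,  True,  False, True, False, False, False],
])

acc = np.logical_and.accumulate(a, axis=0)  # running "all" down each column
red = np.logical_and.reduce(a, axis=0)      # final "all" of each column

print(np.array_equal(acc[-1], red))          # True
print(np.array_equal(red, a.all(axis=0)))    # True
```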

In [60]: np.random.seed([3,1415])

In [61]: a = np.random.choice([True, False], (400, 80))

In [57]: %timeit np.logical_and.accumulate(a, axis=0)
10000 loops, best of 3: 85.6 µs per loop

In [59]: %timeit a.cumprod(0).astype(bool)
10000 loops, best of 3: 138 µs per loop
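The timings above only matter if both expressions agree, which can be checked directly. This sketch assumes the same legacy seeding call used above; the check itself does not depend on which random values are drawn.

```python
import numpy as np

np.random.seed([3, 1415])
a = np.random.choice([True, False], (400, 80))

via_accumulate = np.logical_and.accumulate(a, axis=0)
via_cumprod = a.cumprod(0).astype(bool)

# Both approaches should yield identical boolean arrays
print(np.array_equal(via_accumulate, via_cumprod))  # True

# accumulate on a boolean array already returns booleans, so no cast is needed
print(via_accumulate.dtype)  # bool
```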

179

answered Sep 28 '22 05:09

unutbu
