How do I identify sequences of values in a boolean array?

I have a long boolean array:

bool_array = [ True, True, True, True, True, False, False, False, False, False, True, True, True, False, False, True, True, True, True, False, False, False, False, False, False, False ]

I need to figure out where the values flip, i.e., the indices where runs of True and False begin. In this particular case, I would want to get

index = [0, 5, 10, 13, 15, 19, 26]

Is there an easy way to do this without manually looping to compare every ith element with the (i+1)th?

894

asked Apr 27 '16 15:04

saud

2 Answers

As a more efficient approach for large datasets, in Python 3.x you can use the accumulate and groupby functions from the itertools module.

>>> from itertools import accumulate, groupby
>>> [0] + list(accumulate(sum(1 for _ in g) for _,g in groupby(bool_array)))
[0, 5, 10, 13, 15, 19, 26]

The logic behind the code:

This code groups consecutive equal items with groupby(), then loops over the iterator that groupby() returns. That iterator yields pairs of a key (discarded here by binding it to an underscore as a throwaway variable) and an iterator over the items in that group.

>>> [list(g) for _, g in groupby(bool_array)]
[[True, True, True, True, True], [False, False, False, False, False], [True, True, True], [False, False], [True, True, True, True], [False, False, False, False, False, False, False]]

So all we need is to compute the length of each group and add it to the running total of the preceding lengths. Each running total is the index of the first item of the next group, which is exactly where the value changes, and that running total is what accumulate() is for.
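To make the two steps visible, here is a minimal sketch that splits the one-liner into the run lengths and their running total. The names run_lengths and index are only illustrative, and bool_array is rebuilt here so the snippet runs on its own.

```python
from itertools import accumulate, groupby

# Same 26 values as the array in the question.
bool_array = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 2 + [True] * 4 + [False] * 7

# Step 1: length of each run of consecutive equal values.
run_lengths = [sum(1 for _ in group) for _, group in groupby(bool_array)]
print(run_lengths)  # [5, 5, 3, 2, 4, 7]

# Step 2: running total of the lengths; each total is where the next run starts.
index = [0] + list(accumulate(run_lengths))
print(index)  # [0, 5, 10, 13, 15, 19, 26]
```

The last entry, 26, is the length of the list, so consecutive pairs in index can be used directly as slice bounds for each run.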

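In NumPy, assuming arr = np.array(bool_array), you can compare each element with its neighbor. Current NumPy releases raise a TypeError when boolean arrays are subtracted with -, so != is used here:

In [19]: np.where(arr[1:] != arr[:-1])[0] + 1
Out[19]: array([ 5, 10, 13, 15, 19])
# With leading and trailing indices
In [22]: np.concatenate(([0], np.where(arr[1:] != arr[:-1])[0] + 1, [arr.size]))
Out[22]: array([ 0,  5, 10, 13, 15, 19, 26])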
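If you want this as a reusable helper, something along these lines should work. It is only a sketch: run_boundaries is a hypothetical name, the comparison is done with != (which is defined for boolean arrays), and the empty-input behavior is my own choice rather than anything from the question.

```python
import numpy as np

def run_boundaries(values):
    """Return the start index of every run, plus the total length at the end."""
    arr = np.asarray(values, dtype=bool)
    if arr.size == 0:
        # Assumption: an empty input has no runs, so only the leading 0 is returned.
        return np.array([0])
    # True wherever an element differs from its predecessor; +1 gives the
    # index at which the new run begins.
    flips = np.flatnonzero(arr[1:] != arr[:-1]) + 1
    return np.concatenate(([0], flips, [arr.size]))

bool_array = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 2 + [True] * 4 + [False] * 7
print(run_boundaries(bool_array))  # [ 0  5 10 13 15 19 26]
```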

197

answered Sep 27 '22 23:09

Mazdak

This will tell you where:

>>> import numpy as np
>>> np.argwhere(np.diff(bool_array)).squeeze()
array([ 4,  9, 12, 14, 18])

np.diff compares each element with the next. For boolean input the dtype is preserved: instead of subtracting, NumPy marks each position True where consecutive elements differ and False where they are equal.

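The np.argwhere function then tells you where the values are True, which are now the positions of the changes. Each index returned is the last element before a flip, so add 1 to get the index at which the new run starts.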
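To get from these positions to the exact list asked for in the question, a sketch like the following should do. It uses np.flatnonzero in place of argwhere(...).squeeze() purely as a convenience, and the variable names are illustrative.

```python
import numpy as np

bool_array = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 2 + [True] * 4 + [False] * 7

# Boolean array, one element shorter than the input: True where neighbors differ.
changes = np.diff(bool_array)

# Position of the last element of each run except the final one.
last_of_run = np.flatnonzero(changes)
print(last_of_run)  # [ 4  9 12 14 18]

# Shift by one to get run starts, then add the leading 0 and the total length.
index = [0, *(last_of_run + 1).tolist(), len(bool_array)]
print(index)  # [0, 5, 10, 13, 15, 19, 26]
```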

answered Sep 28 '22 00:09

DilithiumMatrix

Tags: python, list, python-3.x, boolean
