Counting consecutive 1's in NumPy array

Tags:

[1, 1, 1, 0, 0, 0, 1, 1, 0, 0]

I have a NumPy array consisting of 0's and 1's like above. How can I add all consecutive 1's like below? Any time I encounter a 0, I reset.

[1, 2, 3, 0, 0, 0, 1, 2, 0, 0]

I can do this using a for loop, but is there a vectorized solution using NumPy?

430

asked Feb 09 '17 05:02

user308827

2 Answers

Here's a vectorized approach -

def island_cumsum_vectorized(a):
    a_ext = np.concatenate(( [0], a, [0] ))
    idx = np.flatnonzero(a_ext[1:] != a_ext[:-1])
    a_ext[1:][idx[1::2]] = idx[::2] - idx[1::2]
    return a_ext.cumsum()[1:-1]

Sample run -

In [91]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])

In [92]: island_cumsum_vectorized(a)
Out[92]: array([1, 2, 3, 0, 0, 0, 1, 2, 0, 0])

In [93]: a = np.array([0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1])

In [94]: island_cumsum_vectorized(a)
Out[94]: array([0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 1])

Runtime test

For the timings , I would use OP's sample input array and repeat/tile it and hopefully this should be a less opportunistic benchmark -

Small case :

In [16]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])

In [17]: a = np.tile(a,10)  # Repeat OP's data 10 times

# @Paul Panzer's solution
In [18]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
10000 loops, best of 3: 73.4 µs per loop

In [19]: %timeit island_cumsum_vectorized(a)
100000 loops, best of 3: 8.65 µs per loop

Bigger case :

In [20]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])

In [21]: a = np.tile(a,1000)  # Repeat OP's data 1000 times

# @Paul Panzer's solution
In [22]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
100 loops, best of 3: 6.52 ms per loop

In [23]: %timeit island_cumsum_vectorized(a)
10000 loops, best of 3: 49.7 µs per loop

Nah, I want really huge case :

In [24]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])

In [25]: a = np.tile(a,100000)  # Repeat OP's data 100000 times

# @Paul Panzer's solution
In [26]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
1 loops, best of 3: 725 ms per loop

In [27]: %timeit island_cumsum_vectorized(a)
100 loops, best of 3: 7.28 ms per loop

answered Sep 25 '22 00:09

Divakar

If a list comprehension is acceptable

np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])

answered Sep 24 '22 00:09

Paul Panzer

Related questions
                            
                                Psycopg2: connection not committing when exiting 'with' statement containing a try/except
                            
                                Querying Flask-SQLAlchemy through two table joins
                            
                                How to insert dictionaries as last rows in Pandas DataFrame
                            
                                How do I except specific kinds of pymysql.err.InternalErrors, not just all of them?
                            
                                QImage skews some images but not others
                            
                                Why does this code only work when the input is all capital letters?
                            
                                Better alternative to nested for loops through arrays in numpy?
                            
                                Better way to write a polling function in python
                            
                                Fast (vectorized) way to find points in one DF belonging to equally sized rectangles (given by two points) from the second DF
                            
                                NumPy - How to bitwise and over each element in matrix rows
                            
                                Graph-tool: subgraphs as new Graph objects
                            
                                How can I convert nested dictionary keys to strings?
                            
                                Fastest way to fill numpy array with distances from a point
                            
                                Beam/Dataflow Python: AttributeError: '_UnwindowedValues' object has no attribute 'sort'
                            
                                Python unable to compare bound method to itself
                            
                                mocking global variables on python doesn't work
                            
                                Rotate screen in mac os with terminal
                            
                                Use boto3 to download from public bucket
                            
                                Python regex search: repeated digit n times
                            
                                How to switch two elements in string using Python RegEx?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Counting consecutive 1's in NumPy array

Tags:

python

numpy

cumsum

user308827

People also ask

2 Answers

Divakar

Paul Panzer

Recent Activity

Donate For Us