Count number of clusters of non-zero values in Python?

My data looks something like this:

a=[0,0,0,0,0,0,10,15,16,12,11,9,10,0,0,0,0,0,6,9,3,7,5,4,0,0,0,0,0,0,4,3,9,7,1]

The data consists of runs of non-zero numbers separated by runs of zeros, and I want to count the groups of non-zero numbers. In the example above there are 3 such groups, so the code should return 3.

The number of zeros between groups is variable.

Is there a good way to do this in Python? (I am also using Pandas and NumPy to parse the data.)

asked Dec 31 '16 21:12

Timbo Slice

3 Answers

With a as the input NumPy array, a vectorized solution is:

m = a != 0  # boolean mask of the non-zero entries
out = (m[1:] > m[:-1]).sum() + m[0]  # rising edges, plus 1 if a group starts at index 0

Alternatively, for performance, we can use np.count_nonzero, which is efficient at counting True values in a boolean array such as this one:

import numpy as np
out = np.count_nonzero(m[1:] > m[:-1]) + m[0]

The idea is to build a boolean mask of the non-zero entries and count its rising edges (False followed by True). A group that starts at the first element has no rising edge before it, so m[0] is added to the total to account for that case.

Note that if the input a is a list rather than an array, use m = np.asarray(a) != 0 instead.
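The pieces above (mask, rising edges, first-element check, list input) can be combined into one helper. This is only a sketch: the function name count_nonzero_groups and the empty-input guard are my own additions and are not part of the answer.

```python
import numpy as np


def count_nonzero_groups(values):
    """Count runs of consecutive non-zero entries in a 1-D sequence."""
    # np.asarray accepts lists as well as arrays, so one code path covers both.
    m = np.asarray(values) != 0
    if m.size == 0:
        # Assumption: empty input counts as zero groups (m[0] would raise IndexError).
        return 0
    # A rising edge is a False -> True step. A group that starts at index 0
    # has no such step, so m[0] is added separately.
    return int(np.count_nonzero(m[1:] > m[:-1]) + m[0])


a = [0,0,0,0,0,0,10,15,16,12,11,9,10,0,0,0,0,0,6,9,3,7,5,4,0,0,0,0,0,0,4,3,9,7,1]
print(count_nonzero_groups(a))  # 3
```

Wrapping the result in int() returns a plain Python integer rather than a NumPy scalar, which is usually more convenient for callers.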

Sample runs for three cases:

In [92]: a  # Case 1: given sample
Out[92]: 
array([ 0,  0,  0,  0,  0,  0, 10, 15, 16, 12, 11,  9, 10,  0,  0,  0,  0,
        0,  6,  9,  3,  7,  5,  4,  0,  0,  0,  0,  0,  0,  4,  3,  9,  7,
        1])

In [93]: m = a!=0

In [94]: (m[1:] > m[:-1]).sum() + m[0]
Out[94]: 3

In [95]: a[0] = 7  # Case 2: add a non-zero element/group at the start

In [96]: m = a!=0

In [97]: (m[1:] > m[:-1]).sum() + m[0]
Out[97]: 4

In [99]: a[-2:] = [0,4]  # Case 3: add a non-zero group at the end

In [100]: m = a!=0

In [101]: (m[1:] > m[:-1]).sum() + m[0]
Out[101]: 5

answered Oct 18 '22 08:10

Divakar

You can do this with itertools.groupby() and a list comprehension:

>>> from itertools import groupby

>>> len([is_true for is_true, _ in groupby(a, lambda x: x!=0) if is_true])
3
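To see why this works, it may help to look at what groupby actually yields: one (key, run) pair per stretch of consecutive items that share the same key. The sketch below uses a small made-up list, not the data from the question, and the variable names are illustrative.

```python
from itertools import groupby

a = [0, 0, 5, 6, 0, 7, 0, 0]  # small made-up input for illustration

# Each pair is (key, run): key is True for a run of non-zeros, False for a run of zeros.
runs = [(is_nonzero, list(group))
        for is_nonzero, group in groupby(a, key=lambda x: x != 0)]
for is_nonzero, group in runs:
    print(is_nonzero, group)

# Counting only the runs whose key is True gives the number of non-zero groups.
count = sum(1 for is_nonzero, _ in runs if is_nonzero)
print(count)  # 2
```

Using sum() over a generator avoids building the intermediate list that len([...]) needs, though for inputs of this size the difference is negligible.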

answered Oct 18 '22 08:10

Moinuddin Quadri

A simple pure-Python solution: count the changes from zero to non-zero by keeping track of the previous value (rising-edge detection):

a=[0,0,0,0,0,0,10,15,16,12,11,9,10,0,0,0,0,0,6,9,3,7,5,4,0,0,0,0,0,0,4,3,9,7,1]

previous = 0
count = 0
for c in a:
    if previous==0 and c!=0:
        count+=1
    previous = c

print(count)  # 3
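Because the loop only ever looks at the previous value, the same logic can be wrapped in a function that accepts any iterable, including a generator. This is a sketch with a hypothetical function name (count_rising_edges) and made-up sample data.

```python
def count_rising_edges(values):
    """Count transitions from zero to non-zero in any iterable."""
    previous = 0  # treat the position before the first item as zero
    count = 0
    for current in values:
        if previous == 0 and current != 0:
            count += 1
        previous = current
    return count


# A generator is consumed one item at a time, so the data never has to be
# held in memory all at once.
readings = (x for x in [0, 3, 4, 0, 0, 9, 0])  # made-up sample
print(count_rising_edges(readings))  # 2
```

Starting with previous = 0 means a group at the very beginning of the input is counted without any special case.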

answered Oct 18 '22 10:10

Jean-François Fabre

Tags: python, pandas, numpy