
Count consecutive occurrences of values varying in length in a numpy array

Say I have a bunch of numbers in a numpy array and I test them based on a condition returning a boolean array:

import numpy as np

np.random.seed(3456)
a = np.random.rand(8)
condition = a > 0.5

With this boolean array, I want to count the length of each run of consecutive True values. For example, given [True,True,True,False,False,True,True,False,True], I would want to get back [3,2,1].

I can do that using this code:

length, count = [], 0
for value in condition:
    if value:
        count += 1
    elif count > 0:
        # A run of True values just ended; record its length.
        length.append(count)
        count = 0

# Record a run that extends to the end of the array.
if count > 0:
    length.append(count)

print(length)

But is there anything already implemented for this, such as a Python, numpy, or scipy function that counts the lengths of consecutive occurrences in a list or array for a given input?

pbreach asked Jun 21 '14 13:06





2 Answers

If you already have a numpy array, a vectorized approach like this is probably going to be faster than looping in Python:

>>> condition = np.array([True,True,True,False,False,True,True,False,True])
>>> np.diff(np.where(np.concatenate(([condition[0]],
                                     condition[:-1] != condition[1:],
                                     [True])))[0])[::2]
array([3, 2, 1])

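It marks the index where each chunk begins (wherever a value differs from the one before it), adds a marker at index 0 only if the array starts with True, and always adds a marker one past the end. The differences between consecutive markers are the chunk lengths. Because the first marker always opens a True chunk and chunks alternate, keeping every second length with [::2] discards the lengths corresponding to False chunks.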
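
As a rough sketch, the one-liner can be unpacked into named steps. The function name `true_run_lengths`, the intermediate variable names, and the guard for empty input are additions for illustration and are not part of the original answer (the one-liner itself raises an IndexError on an empty array because of `condition[0]`).

```python
import numpy as np

def true_run_lengths(condition):
    """Return the lengths of consecutive runs of True in a 1-D boolean array."""
    condition = np.asarray(condition, dtype=bool)
    if condition.size == 0:
        # Assumption: an empty input has no runs at all.
        return np.array([], dtype=int)
    # True at position i means condition[i] differs from condition[i + 1].
    changes = condition[:-1] != condition[1:]
    # Mark index 0 only when the array starts with True; always mark the end.
    markers = np.concatenate(([condition[0]], changes, [True]))
    # Indices where a chunk starts, plus the one-past-the-end index.
    edges = np.where(markers)[0]
    # Chunk lengths alternate True, False, True, ...; keep the True ones.
    return np.diff(edges)[::2]

print(true_run_lengths([True, True, True, False, False, True, True, False, True]))
# prints [3 2 1]
print(true_run_lengths([False, True, True, False]))
# prints [2]
```

Note that an array starting with False needs no special handling: no marker is placed at index 0, so the first marker still opens a True chunk.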

Jaime answered Sep 24 '22 08:09



Here's a solution using itertools (it's probably not the fastest solution):

import itertools
condition = [True,True,True,False,False,True,True,False,True]
[sum(1 for _ in group) for key, group in itertools.groupby(condition) if key]

Out:
[3, 2, 1]
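
To see why this works, it may help to look at what groupby produces before the filtering step. The sketch below uses illustrative variable names (`runs`, `true_lengths`) that are not in the original answer: each run of equal values becomes one (key, length) pair, and only the pairs whose key is True are kept.

```python
from itertools import groupby

condition = [True, True, True, False, False, True, True, False, True]

# groupby yields one (key, group) pair per run of consecutive equal values.
runs = [(key, sum(1 for _ in group)) for key, group in groupby(condition)]
print(runs)
# prints [(True, 3), (False, 2), (True, 2), (False, 1), (True, 1)]

# Keep only the lengths of the runs whose value is True.
true_lengths = [length for key, length in runs if key]
print(true_lengths)
# prints [3, 2, 1]
```

The same expression also accepts a numpy boolean array, since groupby only needs an iterable, though it then iterates element by element in Python.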
usual me answered Sep 23 '22 08:09
