I have an array containing chunks of negative and chunks of positive elements. A much simplified example of it would be an array a looking like: array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
(a<0).sum() and (a>0).sum() give me the total number of negative and positive elements, but how do I count these in order? By this I mean I want to know that my array contains first 3 negative elements, then 6 positive and then 2 negative.
This sounds like a topic that has been addressed somewhere, and there may be a duplicate out there, but I can't find one.
One method is to call numpy.roll(a,1) in a loop over the whole array and count the elements of a given sign as they pass through, e.g., the first position of the array, but that looks neither very numpyic (or pythonic) nor very efficient to me.
Here's one vectorized approach -
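import numpy as np

def pos_neg_counts(a):
    mask = a>0
    # Indices where the sign flips between neighbouring elements
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    # Run lengths are the gaps between run boundaries; also works when idx is empty
    bounds = np.concatenate(( [-1], idx, [a.size-1] ))
    count = bounds[1:] - bounds[:-1]
    if not mask[0]: # first run is not positive
        return count[1::2], count[::2] # pos, neg counts
    else:
        return count[::2], count[1::2] # pos, neg counts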
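To see how the run lengths come about, here is a minimal walk-through of the intermediate arrays on the sample array from the question. It is only an illustration of the same steps, and it assumes the array has at least one sign change (otherwise idx would be empty and idx[0] would fail).

```python
import numpy as np

a = np.array([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])

mask = a > 0                                  # True where the element is positive
# Index i is kept when mask[i+1] != mask[i], i.e. i is the last index of a run
idx = np.flatnonzero(mask[1:] != mask[:-1])
# Run lengths: first run, gaps between sign changes, final run
count = np.concatenate(([idx[0] + 1], idx[1:] - idx[:-1], [a.size - 1 - idx[-1]]))

print(idx.tolist())    # [2, 8]
print(count.tolist())  # [3, 6, 2]
```

Since the runs alternate in sign, every other entry of count belongs to the same sign, which is why the function slices it with [::2] and [1::2].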
Sample runs -
In [155]: a
Out[155]: array([-3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
In [156]: pos_neg_counts(a)
Out[156]: (array([6]), array([3, 2]))
In [157]: a[0] = 3
In [158]: a
Out[158]: array([ 3, -2, -1,  1,  2,  3,  4,  5,  6, -5, -4])
In [159]: pos_neg_counts(a)
Out[159]: (array([1, 6]), array([2, 2]))
In [160]: a[-1] = 7
In [161]: a
Out[161]: array([ 3, -2, -1,  1,  2,  3,  4,  5,  6, -5,  7])
In [162]: pos_neg_counts(a)
Out[162]: (array([1, 6, 1]), array([2, 1]))
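If the runs are wanted in their original order (3 negative, 6 positive, 2 negative) rather than grouped by sign, the same idea can be adapted. The following is only a sketch: run_lengths_in_order is a hypothetical helper name, not part of the answer above, and like the function above it treats any non-positive element (including zero) as "not positive".

```python
import numpy as np

def run_lengths_in_order(a):
    # Hypothetical helper: returns (is_positive, length) for each run, in order
    a = np.asarray(a)
    if a.size == 0:
        return np.array([], dtype=bool), np.array([], dtype=int)
    mask = a > 0
    # Start index of every run: position 0 plus each position right after a sign flip
    starts = np.concatenate(([0], np.flatnonzero(mask[1:] != mask[:-1]) + 1))
    # Length of a run is the distance to the next run start (or to the end)
    lengths = np.diff(np.concatenate((starts, [a.size])))
    return mask[starts], lengths

is_pos, lengths = run_lengths_in_order([-3, -2, -1, 1, 2, 3, 4, 5, 6, -5, -4])
print(is_pos.tolist())   # [False, True, False]
print(lengths.tolist())  # [3, 6, 2]
```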
Runtime test
Other approach(es) -
# @Franz's solution
def split_app(my_array):
    negative_index = my_array<0
    splits = np.split(negative_index, np.where(np.diff(negative_index))[0]+1)
    len_list = [len(i) for i in splits]
    return len_list
Timings on bigger dataset -
In [20]: # Setup input array
    ...: reps = np.random.randint(3,10,(100000))
    ...: signs = np.ones(len(reps),dtype=int)
    ...: signs[::2] = -1
    ...: a = np.repeat(signs, reps)*np.random.randint(1,9,reps.sum())
    ...: 
In [21]: %timeit split_app(a)
10 loops, best of 3: 90.4 ms per loop
In [22]: %timeit pos_neg_counts(a)
100 loops, best of 3: 2.21 ms per loop
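Before reading much into the timings, it may be worth checking that both functions recover the same runs. The sketch below repeats the two functions as posted and rebuilds the setup on a smaller input; the seed and the size of 1000 runs are arbitrary choices made here for illustration. By construction the run lengths should equal reps, with the negative runs at even positions.

```python
import numpy as np

def pos_neg_counts(a):
    mask = a > 0
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    count = np.concatenate(([idx[0] + 1], idx[1:] - idx[:-1], [a.size - 1 - idx[-1]]))
    if a[0] < 0:
        return count[1::2], count[::2]  # pos, neg counts
    else:
        return count[::2], count[1::2]  # pos, neg counts

def split_app(my_array):
    negative_index = my_array < 0
    splits = np.split(negative_index, np.where(np.diff(negative_index))[0] + 1)
    return [len(i) for i in splits]

# Same construction as the timing setup, on a smaller scale (seed is arbitrary)
np.random.seed(0)
reps = np.random.randint(3, 10, 1000)
signs = np.ones(len(reps), dtype=int)
signs[::2] = -1                      # runs alternate, starting with a negative run
a = np.repeat(signs, reps) * np.random.randint(1, 9, reps.sum())

pos, neg = pos_neg_counts(a)
print(split_app(a) == reps.tolist())      # ordered run lengths match reps
print(np.array_equal(pos, reps[1::2]))    # positive runs sit at odd positions
print(np.array_equal(neg, reps[::2]))     # negative runs sit at even positions
```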