I want to detect consecutive spans of 1's in a numpy array. Specifically, I first want to identify whether each element of the array belongs to a span of at least three 1's. For example, we have the following array a:
import numpy as np
a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
Then the 1's at positions 0-2, 4-6, 12-14, and 17-21 (0-indexed) are the elements that satisfy the requirement.
[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
Next, if two spans of 1's are separated by at most two 0's, then the two spans, together with the 0's between them, make up a longer span. So the above array is characterized as having two long spans, at positions 0-6 and 12-21:
[1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
In other words, for the original array as input, I want the output as follows:
[True, True, True, True, True, True, True, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, False]
I have been thinking about an algorithm to implement this function, but everything I have come up with seems too complicated. I would love to know better ways to implement this. It would be greatly appreciated if someone could help me out.
Update:
I apologize that I did not make my question clear. I want to identify 3 or more consecutive 1's in the array as a span of 1's, and any two spans of 1's with only one or two 0's in between are identified, along with the separating 0's, as a single long span. My goal can be understood in the following way: if there are only one or two 0's between spans of 1's, I consider those 0's to be errors that should be corrected to 1's.
@ritesht93 provided an answer that almost gives what I want. However, the current answer does not identify the case when there are three spans of 1's that are separated by 0's, which should be identified as one single span. For example, for the array
a2 = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
we should receive the output
[False, True, True, True, True, True, True, True, True,
True, True, True, True, True, False, False, False, False,
False, True, True, True, True, True, False]
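The rule described in this update can be written down as a plain-Python reference, which may help when checking other approaches. This is only a sketch under the assumptions stated in the question (values are 0 or 1, a span needs at least three 1's, and a gap of one or two 0's is bridged); the function name `mark_spans` and its parameters `min_run` and `max_gap` are made up for illustration.

```python
def mark_spans(values, min_run=3, max_gap=2):
    """Return a list of booleans marking the merged spans of 1's."""
    n = len(values)

    # Step 1: collect [start, end) for every run of at least min_run 1's.
    runs = []
    i = 0
    while i < n:
        if values[i] == 1:
            j = i
            while j < n and values[j] == 1:
                j += 1
            if j - i >= min_run:
                runs.append([i, j])
            i = j
        else:
            i += 1

    # Step 2: merge neighbouring runs when the gap between them is short
    # and contains only 0's (a shorter run of 1's in the gap blocks merging).
    merged = []
    for start, end in runs:
        if merged:
            gap = values[merged[-1][1]:start]
            if len(gap) <= max_gap and all(v == 0 for v in gap):
                merged[-1][1] = end
                continue
        merged.append([start, end])

    # Step 3: paint the merged spans into the boolean result.
    out = [False] * n
    for start, end in merged:
        for k in range(start, end):
            out[k] = True
    return out


a2 = [0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
print(mark_spans(a2))
```

It is an explicit loop rather than a vectorized solution, so it is meant as a readable check rather than a fast implementation.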
Update 2:
I was greatly inspired by the answers and found that the algorithm based on regular expressions is the easiest to implement and to understand, though I am not sure about its efficiency compared to other methods. Eventually I used the following method.
import re
lst = np.array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
# Replace every run of three or more 1's with the same number of 'c' characters
lst1 = re.sub(r'1{3,}', lambda x: 'c' * len(x.group()), ''.join(map(str, lst)))
print(lst1)
which identifies the spans of 1's
0ccc0ccc00cccc00100ccccc0
and then connects the spans of 1's
# Replace one or two 0's between two marked spans (and the adjacent 'c' on each side) with 'c'
lst2 = re.sub(r'c0{1,2}c', lambda x: 'c' * len(x.group()), lst1)
print(lst2)
which gives
0ccccccccccccc00100ccccc0
The final result is given by
np.array(list(lst2)) == 'c'
array([False, True, True, True, True, True, True, True, True,
True, True, True, True, True, False, False, False, False,
False, True, True, True, True, True, False])
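The two substitutions above can be wrapped in one helper. This is a minimal sketch, not the exact code from the post: the function name `mark_spans_regex` is hypothetical, the input is assumed to contain only 0's and 1's, and the second pattern uses lookbehind and lookahead so that only the 0's themselves are replaced.

```python
import re


def mark_spans_regex(values):
    """Return a list of booleans marking the merged spans of 1's."""
    # Encode the sequence as a string of '0' and '1' characters.
    s = ''.join(str(int(v)) for v in values)
    # Mark every run of three or more 1's with 'c'.
    s = re.sub(r'1{3,}', lambda m: 'c' * len(m.group()), s)
    # Fill one or two 0's that sit directly between two marked runs.
    s = re.sub(r'(?<=c)0{1,2}(?=c)', lambda m: 'c' * len(m.group()), s)
    # Anything still '0' or '1' is outside every span.
    return [ch == 'c' for ch in s]


lst = [0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0]
print(mark_spans_regex(lst))
```

The helper accepts a list or a NumPy array of integers. Wrapping the result in `np.array(...)` gives a boolean array like the one shown above.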
We could solve it with a combination of binary dilation and erosion to get past the first stage, and then binary closing to get the final output, like so:
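from scipy.ndimage import binary_erosion, binary_dilation, binary_closing

K = np.ones(3, dtype=int)  # Kernel
# Erosion followed by dilation keeps only the runs of at least three 1's
b = binary_dilation(binary_erosion(a, K), K)
# Closing fills gaps of one or two 0's; OR-ing with b keeps elements of b that closing may drop at the array borders
out = binary_closing(b, K) | b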
Sample runs
Case #1 :
In [454]: a
Out[454]: array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
In [456]: out
Out[456]:
array([ True, True, True, True, True, True, True, False, False,
False, False, False, True, True, True, True, True, True,
True, True, True, True, False], dtype=bool)
Case #2 :
In [460]: a
Out[460]:
array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0])
In [461]: out
Out[461]:
array([False, True, True, True, True, True, True, True, True,
True, True, True, True, True, False, False, False, False,
False, True, True, True, True, True, False], dtype=bool)
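For readers without SciPy, the same erosion, dilation and closing idea can be sketched with NumPy alone for this 1-D, size-3 case. This is not the answer's code: the helpers `erode`, `dilate` and `mark_spans_numpy` are hypothetical names, and the sketch assumes that positions outside the array count as 0, which is what `np.convolve` with `mode='same'` effectively does.

```python
import numpy as np

K = 3  # window size, matching the length-3 kernel used above


def erode(x):
    # True where the element and both of its neighbours are True.
    return np.convolve(x.astype(int), np.ones(K, dtype=int), mode='same') == K


def dilate(x):
    # True where the element or either of its neighbours is True.
    return np.convolve(x.astype(int), np.ones(K, dtype=int), mode='same') > 0


def mark_spans_numpy(a):
    x = np.asarray(a).astype(bool)
    # Erosion then dilation keeps only runs of at least three 1's.
    b = dilate(erode(x))
    # Dilation then erosion fills gaps of one or two 0's; OR with b
    # restores elements lost at the array borders.
    return erode(dilate(b)) | b


a = np.array([1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0])
print(mark_spans_numpy(a))
```

The sketch is written for arrays with at least three elements. Shorter inputs would need separate handling, because `np.convolve` sizes its output from the longer of its two arguments.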