I have an array A:
import numpy as np
A = np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0 ,0, 1, 0] )
The length of consecutive '1s' would be:
output: [3, 2, 1]
with the corresponding starting indices:
idx = [2, 6, 10]
The original arrays are huge and I prefer a solution with less for-loop.
Edit (Run time):
import numpy as np
import time
A = np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0 ,0, 1, 0] )
def LoopVersion(A):
l_A = len(A)
size = []
idx = []
temp_idx = []
temp_size = []
for i in range(l_A):
if A[i] == 1:
temp_size.append(1)
if not temp_idx:
temp_idx = i
idx.append(temp_idx)
else:
size.append( len(temp_size) )
size = [i for i in size if i != 0]
temp_size = []
temp_idx = []
return size, idx
Quang's solution:
def UniqueVersion(A):
_, idx, counts = np.unique(np.cumsum(1-A)*A, return_index=True, return_counts=True)
return idx, counts
Jacco's solution:
def ConcatVersion(A):
A = np.concatenate(([0], A, [0])) # get rid of some edge cases
starts = np.argwhere((A[:-1] + A[1:]) == 1).ravel()[::2]
ends = np.argwhere((A[:-1] + A[1:]) == 1).ravel()[1::2]
len_of_repeats = ends - starts
return starts, len_of_repeats
Dan's solution (works with special cases as well):
def structure(A):
ZA = np.concatenate(([0], A, [0]))
indices = np.flatnonzero( ZA[1:] != ZA[:-1] )
counts = indices[1:] - indices[:-1]
return indices[::2], counts[::2]
Run time analysis with 10000 elements:
np.random.seed(1234)
B = np.random.randint(2, size=10000)
start = time.time()
size, idx = LoopVersion(B)
end = time.time()
print ( (end - start) )
# 0.32489800453186035 seconds
start = time.time()
idx, counts = UniqueVersion(B)
end = time.time()
print ( (end - start) )
# 0.008305072784423828 seconds
start = time.time()
idx, counts = ConcatVersion(B)
end = time.time()
print ( (end - start) )
# 0.0009801387786865234 seconds
start = time.time()
idx, counts = structure(B)
end = time.time()
print ( (end - start) )
# 0.000347137451171875 seconds
In the outer loop, pick elements one by one and count the number of occurrences of the picked element in the inner loop. This method doesn’t use the other useful data provided in questions like range of numbers is between 1 to n and there are only two repeating elements. Traverse the array once.
Then to get the last index (if that is what you want) use LastOrDefault (), as apose to ToArrray. Show activity on this post. You can have a class that implements IEnumerable and returns the indices you want:
We traverse array from beginning to find first occurrence. If element is present, then we traverse from end also to find last occurrence. return "Only one key is present "."
Group the array elements by occurrence sequence Find longest subarray with 0 sum Longest consecutive subsequence Count distinct elements in every window of given size Check if two arrays are permutations of each other Find all distinct symmetric pairs in an array efficiently Check pair with given product in an array
Let's try unique
:
_, idx, counts = np.unique(np.cumsum(1-A)*A, return_index=True, return_counts=True)
# your expected output:
idx, counts
Output:
(array([ 2, 6, 10]), array([3, 2, 1]))
Here is a pedestrian try, solving the problem by programming the problem.
We prepend and also append a zero to A
, getting a vector ZA
, then detect the 1
islands, and the 0
islands coming in alternating manner in the ZA
by comparing the shifted versions ZA[1:]
and ZA[-1]
. (In the constructed arrays we take the even places, corresponding to the ones in A
.)
import numpy as np
def structure(A):
ZA = np.concatenate(([0], A, [0]))
indices = np.flatnonzero( ZA[1:] != ZA[:-1] )
counts = indices[1:] - indices[:-1]
return indices[::2], counts[::2]
Some sample runs:
In [71]: structure(np.array( [0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0] ))
Out[71]: (array([ 2, 6, 10]), array([3, 2, 1]))
In [72]: structure(np.array( [1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1] ))
Out[72]: (array([ 0, 5, 9, 13, 15]), array([3, 3, 2, 1, 1]))
In [73]: structure(np.array( [1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0] ))
Out[73]: (array([0, 5, 9]), array([3, 3, 2]))
In [74]: structure(np.array( [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1] ))
Out[74]: (array([ 0, 2, 5, 7, 11, 14]), array([1, 2, 1, 3, 2, 3]))
You can use the fact that the indexes of '1s' provide all information you need. It's enough to find starts and ends of series of '1s'.
A = np.concatenate(([0], A, [0])) # get rid of some edge cases
diff = np.argwhere((A[:-1] + A[1:]) == 1).ravel()
starts = diff[::2]
ends = diff[1::2]
print(starts, ends - starts)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With