 

Python, NumPy - Trying to split an array according to a condition

I am trying to find clusters inside an array, i.e. groups of consecutive elements where the difference between [n+1] and [n] is less than a certain value. I have a NumPy array that is a sequence of time stamps. I can compute the differences between consecutive time stamps with numpy.diff(), but I am having a hard time identifying the clusters without looping through the array. For example:

t = np.array([ 147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
dt = np.diff(t)
# dt is array([5582,   65,   12,  992, 1958,   16,    4, 1200])

If my cluster condition is dt < 100, then t[1], t[2], and t[3] would be one cluster and t[5], t[6], and t[7] would be another. I have tried playing around with numpy.where(), but I have had no success getting the conditions right to separate out the clusters. The result I am after is:

cluster1 = np.array([5729, 5794, 5806])
cluster2 = np.array([8756, 8772, 8776])

or something along those lines.

Any help is appreciated.

asked May 14 '26 05:05 by madtowneast


1 Answer

import numpy as np

t = np.array([ 147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
dt = np.diff(t)
# A new group starts right after every gap that fails the dt < 100 condition.
pos = np.where(dt >= 100)[0] + 1
print(np.split(t, pos))

The output is:

[array([147]), 
array([5729, 5794, 5806]), 
array([6798]), 
array([8756, 8772, 8776]), 
array([9976])]
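
The split above also returns the isolated time stamps as one-element arrays. If only the multi-element groups are wanted (cluster1 and cluster2 in the question), one possible follow-up is to filter the pieces by size. This is a minimal sketch that assumes a single time stamp does not count as a cluster; the names max_gap, breaks, and clusters are made up for illustration.

```python
import numpy as np

t = np.array([147, 5729, 5794, 5806, 6798, 8756, 8772, 8776, 9976])
max_gap = 100  # hypothetical name for the cluster threshold (dt < max_gap)

# Index of the first element of each new group: one past every gap
# that is too large to belong to a cluster.
breaks = np.where(np.diff(t) >= max_gap)[0] + 1

# Split at those indices, then keep only groups with more than one element
# (assumption: a lone time stamp is not a cluster).
clusters = [group for group in np.split(t, breaks) if group.size > 1]

for cluster in clusters:
    print(cluster)
```

With the sample data this should leave two arrays, one holding 5729, 5794, 5806 and the other 8756, 8772, 8776.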
answered May 16 '26 19:05 by HYRY


