Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Choose m evenly spaced elements from a sequence of length n

I have a vector/array of n elements. I want to choose m elements.

The choices must be fair / deterministic -- equally many from each subsection.

With m=10, n=20 it is easy: just take every second element. But how to do it in the general case? Do I have to calculate the LCD?

like image 755
j13r Avatar asked Mar 26 '12 14:03

j13r


4 Answers

You probably need Bresenham's line algorithm. Choosing m elements uniformly from n is equivalent to drawing a line in mxn discrete pixel grid. Assume x coordinate in 0..n-1 and y coordinate 0..m-1, and proceed like if you were drawing a line between (0,0) and (n-1,m-1). Whenever y coordinate changes, pick an element from index x.

UPD: But it seems that this simple function will suffice you:

>>> f = lambda m, n: [i*n//m + n//(2*m) for i in range(m)]
>>> f(1,20)
[10]
>>> f(2,20)
[5, 15]
>>> f(3,20)
[3, 9, 16]
>>> f(5,20)
[2, 6, 10, 14, 18]
like image 175
hamstergene Avatar answered Oct 10 '22 11:10

hamstergene


Here is a quick example:

from math import ceil

def takespread(sequence, num):
    length = float(len(sequence))
    for i in range(num):
        yield sequence[int(ceil(i * length / num))]

math.ceil is used because without it, the chosen indexes will be weighted too much toward the beginning of each implicit subsection, and as a result the list as a whole.

like image 39
agf Avatar answered Oct 10 '22 10:10

agf


Use a loop (int i=0; i < m; i++)

Then to get the indexes you want, Ceil(i*m/n).

like image 37
Francis P Avatar answered Oct 10 '22 11:10

Francis P


This will always select the first and last elements:

which_idxs = lambda m, n: np.rint( np.linspace( 1, n, min(m,n) ) - 1 ).astype(int)

evenly_spaced = np.array( your_list )[which_idxs(m,n)]

This will only select a maximum of n elements, in case m is higher than n.

If you truly want it equally spread throughout the array, even at the ends, then it would be this instead:

which_idxs = lambda m, n: [idx for idx in np.rint( np.linspace( 1-n/(2*min(m,n)), n+n/(2*min(m,n)), min(m,n)+2 ) - 1 ).astype(int) if idx in range(n)]

evenly_spaced = np.array( your_list )[which_idxs(m,n)]

Which gives you something like this:

>>> np.array( [1, 2, 3, 'a', 'b', 'c'] )[which_idxs(m,n)]
Out: array(['2', 'b'])
like image 26
Default picture Avatar answered Oct 10 '22 10:10

Default picture