
How to populate the spacings between elements of an array with constant step, and combine two such arrays with primary and secondary priorities?

Tags: python, numpy
For a given random array

a = np.random.rand(3)
>>> a
array([0.51, 0.19, 0.72])

I would like to populate regions between the elements by a constant step = 0.1, such that I have the resulting array

>>> pop_func(a)
array([0.51, 0.41, 0.31, 0.21, 0.19, 0.29, 0.39, 0.49, 0.59, 0.69, 0.72])

Now I have two such arrays, pri_ara and sec_ara, which are the primary and secondary components of a 2D array: ara = np.array([pri_ara, sec_ara]).T.

I would like to perform the same pop_func on each axis, but with a twist. ara should be populated such that, going down the index of ara, the pri_ara component steps to its next element first while the sec_ara component stays constant, followed by the sec_ara component stepping to its next element while the pri_ara component stays constant, and so on. This is hard to put into words, so here is an explicit example with step = 0.1:

pri_ara = array([0.51, 0.19, 0.32])
sec_ara = array([0.14, 0.44, 0.48])
ara = np.array([pri_ara, sec_ara]).T

>>> twistpop_func(ara)
np.array([[0.51, 0.14], 
          [0.41, 0.14], 
          [0.31, 0.14], 
          [0.21, 0.14], 
          [0.19, 0.14], 
          [0.19, 0.24], 
          [0.19, 0.34], 
          [0.19, 0.44], 
          [0.29, 0.44], 
          [0.32, 0.44], 
          [0.32, 0.48]])

What I have tried doing was to create a np.arange in each element of each component array, i.e.

pri_ara = pri_ara[..., None]
a, b = pri_ara[:-1], pri_ara[1:]
absign = np.nan_to_num((a - b)/np.abs(a - b), nan=1) # nan_to_num necessary to remove nan entries where element of a and b are equal
                                                     # set nan -> 1, so arange will not create any elements inbetween
pri_ara = np.concatenate(
    (a, b, absign * step * np.ones_like(a)), 
    axis = -1
)
pri_ara = np.apply_along_axis(lambda x: np.arange(*x), axis=-1, arr=pri_ara)

The last line does not work because the length of the np.arange result differs for each x in the array, and np.apply_along_axis requires every result to have the same shape.

One solution would be to pad every row to the same length, but that complicates things because I would have to remove the padding again when I combine pri_ara and sec_ara.

I would really love it if there were a more straightforward method!

asked Dec 14 '25 16:12 by Tian


2 Answers

TL;DR at the end

I would start by making an output buffer of the right size using np.repeat, then filling in the ascending/descending portions with a loop.

Let's look at the size of the runs you have and work out the repeat strategy to get them filled in. Given the dataset ara

0.51 0.14
0.19 0.44
0.32 0.48

you want to get

0.51 0.14
0.41 0.14  4 = abs(0.19 - 0.51) // step + 1
0.31 0.14
0.21 0.14
---- ----
0.19 0.14
0.19 0.24  3 = abs(0.44 - 0.14) // step + 1
0.19 0.34
---- ----
0.19 0.44  2 = abs(0.32 - 0.19) // step + 1
0.29 0.44
---- ----
0.32 0.44  1 = abs(0.48 - 0.44) // step + 1
---- ----
0.32 0.48  last section is always size 1

Using the size information shown above, which is clearly based on np.diff(ara, axis=0), we can first construct an array that looks like this:

0.51 0.14
0.51 0.14
0.51 0.14
0.51 0.14
0.19 0.14
0.19 0.14
0.19 0.14
0.19 0.44
0.19 0.44
0.32 0.44
0.32 0.48

The trick is to repeat all the elements the required number of times:

signs = np.diff(ara, axis=0, append=ara[-1, None]).ravel()[:-1]
d = (np.abs(signs) // step).astype(int) + 1
repeats = np.tile(d, 2)
values = np.repeat(ara.ravel(order='F'), 2)[1:-1]

buffer = np.repeat(values, repeats).reshape(-1, 2, order='F')
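
In case the two building blocks used here are unfamiliar, the following is a minimal sketch on toy data (the numbers are made up and are not from the question). It only shows that np.repeat accepts one count per element, and that reshaping with order='F' fills the first column before the second.

```python
import numpy as np

# Toy data (hypothetical): three values and how often each one should appear.
values = np.array([10, 20, 30])
counts = np.array([3, 1, 2])

# np.repeat accepts an array of counts, one per element.
flat = np.repeat(values, counts)
print(flat)  # [10 10 10 20 30 30]

# Fortran-order reshape: the first half of the flat array becomes column 0,
# the second half becomes column 1.
cols = np.arange(6).reshape(-1, 2, order='F')
print(cols)
# [[0 3]
#  [1 4]
#  [2 5]]
```

This is why values and repeats above are laid out as "all of column 0, then all of column 1" before the final reshape.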

The remaining portion is to fill in the ranges of ascending/descending numbers. This can easily be done with a for loop:

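import itertools

ends = np.cumsum(d)
starts = np.zeros_like(ends)
starts[1:] = ends[:-1]
# The final section is a single row with nothing after it, so skip it:
# buffer[ends[-1]] would be out of bounds.
for col, start, end in zip(itertools.cycle((0, 1)), starts[:-1], ends[:-1]):
    s = buffer[start, col]  # first value of this section
    e = buffer[end, col]    # first value of the next section (exclusive endpoint)
    buffer[start:end, col] = np.arange(s, e, np.copysign(step, e - s))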

But this is "no fun" because it uses a for loop, so let's make a truly vectorized solution. First we need an array of ramps that we can add to each ascending/descending section. If we take np.arange(buffer.shape[0]) * step, reset it to zero at every section boundary, and get the sign right, we can simply add the result to the buffer to get the output. So imagine the following operations:

( 0 -  0) * step * sign(0.19 - 0.51)
( 1 -  0) * step * sign(0.19 - 0.51)
( 2 -  0) * step * sign(0.19 - 0.51)
( 3 -  0) * step * sign(0.19 - 0.51)
 --   --
( 4 -  4) * step * sign(0.44 - 0.14)
( 5 -  4) * step * sign(0.44 - 0.14)
( 6 -  4) * step * sign(0.44 - 0.14)
 --   --
( 7 -  7) * step * sign(0.32 - 0.19)
( 8 -  7) * step * sign(0.32 - 0.19)
 --   --
( 9 -  9) * step * sign(0.48 - 0.44)
 --   --
(10 - 10) * step * "Doesn't matter"

The first column is an increasing range. The second column is the offset for each section, which looks like the cumulative sum of the section lengths. The signs are already something we've computed.

The whole operation looks like this:

numbers = np.arange(buffer.shape[0])
offsets = np.zeros(d.size)
offsets[1:] = np.cumsum(d[:-1])
offsets = np.repeat(offsets, d)
signs = np.repeat(signs, d)

ramps = (numbers - offsets) * np.copysign(step, signs)
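
As a quick sanity check of the reset-at-every-boundary idea, here is a small sketch that hard-codes the section lengths d = [4, 3, 2, 1, 1] from the worked example. The name local is only illustrative, and np.cumsum(d) - d is used as a shortcut that should give the same section start indices as the offsets built above.

```python
import numpy as np

# Section lengths from the worked example above.
d = np.array([4, 3, 2, 1, 1])

numbers = np.arange(d.sum())

# Start index of each section, repeated once per row of that section.
offsets = np.repeat(np.cumsum(d) - d, d)

# Row index counted from the start of its own section.
local = numbers - offsets
print(local)  # [0 1 2 3 0 1 2 0 1 0 0]
```

Multiplying these per-section indices by the signed step gives the ramps array.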

Before adding this to the output buffer, we have to split this array into two columns, alternating by section. You can do that by duplicating ramps to two columns, and setting the unwanted elements to zero:

ramps = np.stack((ramps, ramps), axis=1)
mask = np.zeros((d.size, 2))
mask[::2, 0] = mask[1::2, 1] = 1
mask = np.repeat(mask, d, axis=0)

buffer += ramps * mask
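
To make the alternating mask concrete, here is a short sketch using the same section lengths as the worked example. It uses an integer dtype purely so the printed output is compact; the name expanded is illustrative.

```python
import numpy as np

# Section lengths from the worked example above.
d = np.array([4, 3, 2, 1, 1])

# One mask row per section: even sections move column 0, odd sections column 1.
mask = np.zeros((d.size, 2), dtype=int)
mask[::2, 0] = 1
mask[1::2, 1] = 1

# Expand to one mask row per output row.
expanded = np.repeat(mask, d, axis=0)
print(expanded[:, 0])  # [1 1 1 1 0 0 0 1 1 0 1]
print(expanded[:, 1])  # [0 0 0 0 1 1 1 0 0 1 0]
```

Each output row has exactly one active column, so each ramp value is added to only one component.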

TL;DR

Here is a fully vectorized solution:

import numpy as np

def twistpop_func(ara, step=0.1):
    # Differences between consecutive rows, interleaved as pri, sec, pri, sec, ...
    signs = np.diff(ara, axis=0, append=ara[-1, None]).ravel()[:-1]
    # Number of rows in each section
    d = (np.abs(signs) // step).astype(int) + 1

    # Buffer with each starting value repeated over its section
    repeats = np.tile(d, 2)
    values = np.repeat(ara.ravel(order='F'), 2)[1:-1]
    buffer = np.repeat(values, repeats).reshape(-1, 2, order='F')

    # Row index relative to the start of its section
    numbers = np.arange(buffer.shape[0])
    offsets = np.zeros(d.size)
    offsets[1:] = np.cumsum(d[:-1])
    offsets = np.repeat(offsets, d)
    signs = np.repeat(signs, d)

    ramps = (numbers - offsets) * np.copysign(step, signs)
    ramps = np.stack((ramps, ramps), axis=1)

    # Apply each ramp to only one column, alternating by section
    mask = np.zeros((d.size, 2))
    mask[::2, 0] = mask[1::2, 1] = 1
    mask = np.repeat(mask, d, axis=0)

    buffer += ramps * mask
    return buffer
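
When testing a vectorized version, it can help to have a slow loop-based reference to compare against. The following is a minimal sketch of one, not part of the solution above: the name twistpop_reference is hypothetical, and it assumes the same floor-division rule for the number of steps in each section.

```python
import numpy as np

def twistpop_reference(ara, step):
    """Loop-based reference (hypothetical helper): walk the primary column
    to its next value first, then the secondary column, row pair by row pair."""
    ara = np.asarray(ara, dtype=float)
    rows = []
    current = ara[0].copy()
    for target in ara[1:]:
        for col in (0, 1):
            diff = target[col] - current[col]
            n = int(abs(diff) // step)  # full steps that fit before the target
            direction = np.sign(diff)
            base = current[col]
            for k in range(n + 1):
                row = current.copy()
                row[col] = base + k * step * direction
                rows.append(row)
            current[col] = target[col]  # land exactly on the target value
    rows.append(current.copy())  # the final point
    return np.array(rows)

# Example data from the question.
pri_ara = np.array([0.51, 0.19, 0.32])
sec_ara = np.array([0.14, 0.44, 0.48])
ara = np.array([pri_ara, sec_ara]).T

result = twistpop_reference(ara, step=0.1)
print(result.shape)  # (11, 2)
print(result)
```

Comparing this against the vectorized output with np.allclose on a few inputs is a cheap way to catch off-by-one errors in the section sizes. Both versions depend on floating-point floor division, so differences that are near-exact multiples of step may deserve extra test cases.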

answered Dec 17 '25 08:12 by Mad Physicist


It is not nice but it works:

import numpy as np
pri_ara = np.array([0.51, 0.19, 0.72, 0.21])
sec_ara = np.array([0.14, 0.44, 0.48, 0.81])

def pop_func(arr, step):
    diff = np.diff(arr)
    diff_steps = (diff / step).astype(int)
    diff_abs = np.abs(diff_steps) + 1
    diff_sign = np.sign(diff_steps)
    res = np.hstack([arr[i] + step*diff_sign[i]*np.arange(diff_abs[i])
                     for i in range(len(arr) - 1)])
    res = np.hstack([res, arr[-1:]])
    return res, diff_abs

def twistpop_func(arr1, arr2, step):
    n = len(arr1)
    arr1_pop, d1 = pop_func(arr=arr1, step=step)
    arr2_pop, d2 = pop_func(arr=arr2, step=step)

    org_idx1 = np.zeros(n, dtype=int)
    org_idx1[1:] = np.cumsum(d1)
    org_idx1[2:] += np.cumsum(d2[1:])

    org_idx2 = np.zeros(n, dtype=int)
    org_idx2[1:] = np.cumsum(d2)
    org_idx2[1:] += np.cumsum(d1)

    for i in range(n-1):
        arr1_pop = np.insert(arr1_pop, np.full(d2[i], org_idx1[i+1]), arr1[i+1])
        arr2_pop = np.insert(arr2_pop, np.full(d1[i], org_idx2[i]), arr2[i])

    return np.stack((arr1_pop, arr2_pop), axis=1)

res = twistpop_func(arr1=pri_ara, arr2=sec_ara, step=0.1)

answered Dec 17 '25 08:12 by scleronomic


