 

Upsampling: insert extra values between consecutive elements of a vector

Suppose we have a vector V consisting of 20 floating-point numbers. Is it possible to insert values between each pair of these numbers so that vector V becomes a vector of exactly 50 numbers?

The inserted value should be a random number between the upper and lower values; for now I've decided to insert the midpoint of the two.

I have tried the following:

vector<double> upsample(vector<double>& in)
{
    vector<double> temp;
    for (int i = 1; i <= in.size() - 1; i++)
    {
        double sample = (in[i] + in[i - 1]) / 2;
        temp.push_back(in[i - 1]);
        temp.push_back(sample);
    }
    temp.push_back(in.back());
    return temp;
}

With this function the size of the vector becomes 2n - 1 (20 elements become 39). The input vector can have various sizes, all less than 50.

I think it could be done by inserting more than one value between some pairs of elements, chosen randomly, so that the result has size 50 (e.g. insert 3 values between V[0] and V[1], 1 value between V[3] and V[4], etc.). Is this possible?

Could you please guide me on how to do this? Thank you.

student_11 asked Oct 16 '22 12:10
1 Answer

So I did some math myself, because I was curious how to get the weight ratios (as if upsampling linearly to a common multiple and then extracting only the target values from the large array, but without ever creating that large array: the weights alone tell how much the left and right source elements contribute to a particular output value).

The sample code always creates the new value by a simple weighted average (i.e. 40% of 123.4 with 60% of 567.8 gives the "upscaled" value 390.04); no randomness is used to pepper the upscaled values (leaving that part to the OP).

The ratios go like this:

If a vector of size M is being upscaled to size N (M <= N), the upscale always preserves the first and last elements of the input vector; those are "fixed" in this proposal.

Every upscaled element can then be viewed as lying somewhere between two original elements [i, i+1].

If we declare the "distance" between the source elements [i, i+1] to be d = N-1, then the position of an upscaled element between them can always be expressed as some j/d where j:[0,d] (when j equals d, it sits precisely at the "i+1" element, which is the same case as j=0 with source elements [i+1,i+2]).

And the distance between two consecutive upscaled elements is then M-1.

So when the source vector is size 4 and the upscaled vector size should be 5, the ratios for the upscaled elements are [ [4/4,0/4], [1/4,3/4], [2/4,2/4], [3/4,1/4], [0/4,4/4] ], applied to the element pairs (indices into the source vector) [ [0,1], [0,1], [1,2], [2,3], [2,3] ]. (The "distance" between source elements is 5-1=4, hence the "/4" normalizing the weights; the "distance" between upscaled elements is 4-1=3, which is why the ratios shift by [-3,+3] at every step.)

I'm afraid my description is far from "obvious" (however clear it feels in my head after figuring it out), but if you put some of it into a spreadsheet and toy around with it, hopefully it will make sense. Or debug the code to get a better feel for how the mumbling above transforms into real code.

Code example 1: this one "copies" a source element only if the weight falls fully onto it (i.e. with the example data only the first and last elements are copied; the rest of the upscaled elements are weighted averages of the original values).

#include <iostream>
#include <vector>
#include <cassert>

static double get_upscale_value(const size_t total_weight, const size_t right_weight, const double left, const double right) {
    // do the simple weighted average for demonstration purposes
    const size_t left_weight = total_weight - right_weight;
    return (left * left_weight + right * right_weight) / total_weight;
}

std::vector<double> upsample_weighted(const std::vector<double>& in, size_t n)
{
    assert( 2 <= in.size() && in.size() <= n ); // this is really only upscaling (can't downscale)

    // resulting vector variable
    std::vector<double> upscaled;
    upscaled.reserve(n);

    // upscaling factors variables and constants
    size_t index_left = 0;                      // first "left" item is the in[0] element
    size_t weight_right = 0;                    // and "right" has zero weight (i.e. in[0] is copied)
    const size_t in_weight = n - 1;             // total weight of single "in" element
    const size_t weight_add = in.size() - 1;    // shift of weight between "upscaled" elements

    while (upscaled.size() < n) {               // add N upscaled items
        if (0 == weight_right) {
            // full weight of left -> just copy it (never tainted by "upscaling")
            upscaled.push_back(in[index_left]);
        } else {
            // the weight is somewhere between "left" and "right" items of "in" vector
            // i.e. weight = 1..(in_weight-1) ("in_weight" is full "right" value, never happens)
            double upscaled_val = get_upscale_value(in_weight, weight_right, in[index_left], in[index_left+1]);
            upscaled.push_back(upscaled_val);
        }
        weight_right += weight_add;
        if (in_weight <= weight_right) {
            // the weight shifted so much that "right" is new "left"
            ++index_left;
            weight_right -= in_weight;
        }
    }

    return upscaled;
}

int main(int argc, const char *argv[])
{
    std::vector<double> in { 10, 20, 30 };
//    std::vector<double> in { 20, 10, 40 };

    std::vector<double> upscaled = upsample_weighted(in, 14);
    std::cout << "upsample_weighted from " << in.size() << " to " << upscaled.size() << ": ";
    for (const auto i : upscaled) {
        std::cout << i << " ";
    }
    std::cout << std::endl;
    return 0;
}

output:

upsample_weighted from 3 to 14: 10 11.5385 13.0769 14.6154 16.1538 17.6923 19.2308 20.7692 22.3077 23.8462 25.3846 26.9231 28.4615 30

Code example 2: this one copies every source element and uses the weighted average only to fill the gaps between them, so as much of the original data as possible is preserved (at the price of the result not being a linear upscale of the original data set, but "aliased" to the "grid" defined by the target size):

(The code is pretty much identical to the first one, except for one `if` line in the upscaler.)

#include <iostream>
#include <vector>
#include <cassert>

static double get_upscale_value(const size_t total_weight, const size_t right_weight, const double left, const double right) {
    // do the simple weighted average for demonstration purposes
    const size_t left_weight = total_weight - right_weight;
    return (left * left_weight + right * right_weight) / total_weight;
}

// identical to "upsample_weighted", except all source values from "in" are copied into result
// and only extra added values (to make the target size) are generated by "get_upscale_value"
std::vector<double> upsample_copy_preferred(const std::vector<double>& in, size_t n)
{
    assert( 2 <= in.size() && in.size() <= n ); // this is really only upscaling (can't downscale)

    // resulting vector variable
    std::vector<double> upscaled;
    upscaled.reserve(n);

    // upscaling factors variables and constants
    size_t index_left = 0;                      // first "left" item is the in[0] element
    size_t weight_right = 0;                    // and "right" has zero weight (i.e. in[0] is copied)
    const size_t in_weight = n - 1;             // total weight of single "in" element
    const size_t weight_add = in.size() - 1;    // shift of weight between "upscaled" elements

    while (upscaled.size() < n) {               // add N upscaled items
/* ! */ if (weight_right < weight_add) { /* ! this line is modified */
            // most of the weight on left -> copy it (don't taint it by upscaling)
            upscaled.push_back(in[index_left]);
        } else {
            // the weight is somewhere between "left" and "right" items of "in" vector
            // i.e. weight = 1..(in_weight-1) ("in_weight" is full "right" value, never happens)
            double upscaled_val = get_upscale_value(in_weight, weight_right, in[index_left], in[index_left+1]);
            upscaled.push_back(upscaled_val);
        }
        weight_right += weight_add;
        if (in_weight <= weight_right) {
            // the weight shifted so much that "right" is new "left"
            ++index_left;
            weight_right -= in_weight;
        }
    }

    return upscaled;
}

int main(int argc, const char *argv[])
{
    std::vector<double> in { 10, 20, 30 };
//    std::vector<double> in { 20, 10, 40 };

    std::vector<double> upscaled = upsample_copy_preferred(in, 14);
    std::cout << "upsample_copy_preferred from " << in.size() << " to " << upscaled.size() << ": ";
    for (const auto i : upscaled) {
        std::cout << i << " ";
    }
    std::cout << std::endl;
    return 0;
}

output:

upsample_copy_preferred from 3 to 14: 10 11.5385 13.0769 14.6154 16.1538 17.6923 19.2308 20 22.3077 23.8462 25.3846 26.9231 28.4615 30

(Notice how "20.7692" from example 1 is here just "20", a copy of the original sample, even though at that point the "30" would carry some small weight under pure linear interpolation.)

Ped7g answered Oct 19 '22 02:10