Interpolating a numpy array to fit another array

Tags: python, arrays, numpy, interpolation

Say I have some_data of shape (1, n). I have new incoming_data of shape (1, n±x), where x is some positive integer much smaller than n. I would like to squeeze or stretch incoming_data so that it has length n, the same as some_data. How might this be done using the SciPy stack?

Here's an example of what I'm trying to accomplish.

# Stretch arr2 to arr1's shape while "filling in" interpolated value
arr1 = np.array([1, 5, 2, 3, 7, 2, 1])
arr2 = np.array([1, 5, 2, 3, 7, 1])
result
> np.array([1, 5, 2, 3, 6.x, 2.x, 1])  # of shape (arr1.shape)

As another example:

# Squeeze arr2 to arr1's shape while placing interpolated value.
arr1 = np.array([1, 5, 2, 3, 7, 2, 1])
arr2 = np.array([1, 5, 2, 3, 4, 7, 2, 1])
result
> np.array([1, 5, 2, 3.x, 7.x, 2.x, 1])  # of shape (arr1.shape)

asked Jun 27 '16 23:06

ericmjl

2 Answers

You can implement this simple compression or stretching of your data using scipy.interpolate.interp1d. I'm not saying it necessarily makes sense (it makes a huge difference what kind of interpolation you're using, and you'll generally only get a reasonable result if you can correctly guess the behaviour of the underlying function), but you can do it.

The idea is to interpolate your original array over its indices as x values, then evaluate the interpolant on a sparser (or denser) x mesh while keeping its end points the same. In other words, you build a continuum approximation to your discrete data and resample it at the necessary points:

import numpy as np
import scipy.interpolate as interp
import matplotlib.pyplot as plt

arr_ref = np.array([1, 5, 2, 3, 7, 1])  # shape (6,), reference
arr1 = np.array([1, 5, 2, 3, 7, 2, 1])  # shape (7,), to "compress"
arr2 = np.array([1, 5, 2, 7, 1])        # shape (5,), to "stretch"
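# build a linear interpolant over the integer indices of each array,
# then evaluate it on arr_ref.size evenly spaced points over the same index range
arr1_interp = interp.interp1d(np.arange(arr1.size), arr1)
arr1_compress = arr1_interp(np.linspace(0, arr1.size - 1, arr_ref.size))
arr2_interp = interp.interp1d(np.arange(arr2.size), arr2)
arr2_stretch = arr2_interp(np.linspace(0, arr2.size - 1, arr_ref.size))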

# plot the examples, assuming same x_min, x_max for all data
xmin,xmax = 0,1
fig,(ax1,ax2) = plt.subplots(ncols=2)
ax1.plot(np.linspace(xmin,xmax,arr1.size),arr1,'bo-',
         np.linspace(xmin,xmax,arr1_compress.size),arr1_compress,'rs')
ax2.plot(np.linspace(xmin,xmax,arr2.size),arr2,'bo-',
         np.linspace(xmin,xmax,arr2_stretch.size),arr2_stretch,'rs')
ax1.set_title('"compress"')
ax2.set_title('"stretch"')
plt.show()

The resulting plot:

[Figure: two side-by-side panels titled "compress" and "stretch"]

In the plots, blue circles are the original data points, and red squares are the interpolated ones (these overlap at the boundaries). As you can see, what I called compressing and stretching is actually downsampling and upsampling of an underlying function (linear, by default). This is why I said you must be very careful with interpolation: you can get very wrong results if your expectations don't match your data.
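
The recipe above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the function name resample_to_length and its kind parameter are hypothetical names chosen here, and it relies only on scipy.interpolate.interp1d as used above. Comparing kind="linear" with kind="cubic" on the question's array shows how much the result depends on the interpolation method.

```python
import numpy as np
import scipy.interpolate as interp


def resample_to_length(arr, n, kind="linear"):
    """Resample a 1-D array to length n, keeping both end points fixed.

    Hypothetical helper: the data are treated as samples on the index
    grid 0..arr.size-1 and re-evaluated at n evenly spaced positions.
    """
    arr = np.asarray(arr, dtype=float)
    f = interp.interp1d(np.arange(arr.size), arr, kind=kind)
    return f(np.linspace(0, arr.size - 1, n))


arr2 = np.array([1, 5, 2, 3, 7, 1])                          # shape (6,)
stretched_linear = resample_to_length(arr2, 7)               # shape (7,)
stretched_cubic = resample_to_length(arr2, 7, kind="cubic")  # shape (7,)

print(stretched_linear)
print(stretched_cubic)
```

With linear interpolation every interior value is a blend of two neighbouring samples, so only the end points of arr2 are reproduced exactly; the leading [1, 5, 2, 3] from the question's expected output is not preserved.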

answered Oct 03 '22 23:10

Andras Deak -- Слава Україні

There's another package that works very well for upsampling and downsampling: resampy. It has a simpler command than scipy.interpolate.interp1d but only uses a single interpolation function. As @Andras Deak said, you have to be careful in choosing interpolation functions.

MWE:

import numpy as np
import resampy
from matplotlib import pyplot as plt

x_mesh = np.linspace(0,1,10)
short_arr = np.sin(x_mesh*2*np.pi)
plt.plot(short_arr)

[Plot: short_arr, the coarse 10-point sine curve]

interp_arr = resampy.resample(short_arr, 20, 100)
plt.plot(interp_arr)

[Plot: interp_arr, the resampled, finer sine curve]
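
If the goal is an exact output length, a related option within the SciPy stack is scipy.signal.resample, which takes the target number of samples directly. This is a different routine from resampy (it is FFT-based and assumes the signal is periodic), so treat the following as a sketch under that assumption rather than a drop-in replacement; the variable names are illustrative.

```python
import numpy as np
from scipy import signal

# One full period of a sine, sampled at 10 points. The end point is
# excluded so that the samples tile periodically, as the FFT method assumes.
t_coarse = np.linspace(0, 1, 10, endpoint=False)
short_arr = np.sin(2 * np.pi * t_coarse)

target_len = 50
# The second argument is the number of output samples, not a sample rate.
fine_arr = signal.resample(short_arr, target_len)

print(fine_arr.shape)
```

For data that are not periodic, such as the short arrays in the question, this method can produce ringing near the ends, so the earlier caveats about choosing an interpolation method apply here as well.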

Two words of caution:

1. resampy uses "band-limited sinc interpolation"; check its documentation for details. It works best when the array comes from data with local frequency content, e.g. sound, images, and other time-series data. It's used in some of the TensorFlow audio examples, which is how I use it. I'm not sure whether your example array was kept small for demonstration purposes, but if that truly is the size of your array, interpolation may give poor results whatever method you use: linear, spline, or otherwise.
2. Your examples demonstrate more than interpolation. It seems you found a portion of the arrays that matched (e.g. [1,5,2,3]) and then interpolated the rest. Depending on whether you want to match the beginning of the array or an arbitrary number of patches, you may be asking for two methods: one to identify the portions of an array to interpolate, and one to interpolate those portions. If that's the case, look at numpy.isin for a basic method, or Levenshtein distance for more general matching of a set of substrings.
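
As a rough illustration of the "identify the matching portions" step, here is a sketch that uses difflib.SequenceMatcher from the Python standard library. This is a stand-in chosen here for edit-distance-style matching, not something the answer itself names, and it only locates the matching runs; deciding how to interpolate the unmatched gaps is left open.

```python
from difflib import SequenceMatcher

import numpy as np

arr1 = np.array([1, 5, 2, 3, 7, 2, 1])
arr2 = np.array([1, 5, 2, 3, 4, 7, 2, 1])

# SequenceMatcher needs hashable items, so convert to plain Python lists.
matcher = SequenceMatcher(None, arr1.tolist(), arr2.tolist(), autojunk=False)

# Each block means arr1[a:a+size] == arr2[b:b+size]. The final block
# returned by get_matching_blocks() is a zero-length sentinel; drop it.
blocks = [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size]

for a, b, size in blocks:
    print(f"arr1[{a}:{a + size}] matches arr2[{b}:{b + size}]")
```

Everything in arr2 that falls between two matching blocks (here the single value 4) is the part that would need to be squeezed or interpolated away. Note that exact matching like this is only meaningful for discrete or repeated values; noisy floating-point data would need a tolerance-based comparison instead.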

answered Oct 04 '22 00:10

Jake Stevens-Haas
