python numpy - is there a faster way to convolve?

I have a very large numpy array (1 million integers). I'm using np.convolve to find the "densest" area of that array. By "densest" area I mean the fixed-length window whose values have the highest sum. Let me show you in code:

import numpy as np

example = np.array([0,0,0,1,1,1,1,1,1,1,0,1,1,1,1,0,0,0,1,0,0,1,1,0,1,0,0,0,0,0,1,0])
window_size = 10
density = np.convolve(example, np.ones([window_size]), mode='valid')
print(density) 
# [7.0, 7.0, 8.0, 9.0, 9.0, 9.0, 8.0, 7.0, 6.0, 6.0, 5.0, 5.0, 5.0, 5.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0, 3.0, 4.0, 3.0]

I can then use np.argmax(density) to get the starting index of the highest-density area, which is 3 here.

With this example it runs fast, but when convolving over the million-element array with a window size of 10,000 it takes 2 seconds to complete. If I choose a window_size of 500,000, it takes 3 minutes to complete.

Is there a better way to sum over the array with a given window size to speed this up? If I converted this to a pandas Series instead, could I perhaps use something there?

Thanks for your help!

Legit Stack asked Nov 30 '18


3 Answers

Try using scipy.signal.convolve. It has the option to compute the convolution using the fast Fourier transform (FFT), which should be much faster for the array sizes that you mentioned.

Using an array example with length 1000000 and convolving it with an array of length 10000, np.convolve took about 1.45 seconds on my computer, and scipy.signal.convolve took 22.7 milliseconds.
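As a rough sketch (not from the original answer; the random array just stands in for the real data): method='fft' forces SciPy to take the FFT path, while the default 'auto' picks whichever method it estimates to be faster.

import numpy as np
from scipy.signal import convolve

example = np.random.randint(0, 2, size=1_000_000)
window_size = 10_000

# mode='valid' gives one sum per full window, exactly like np.convolve above;
# method='fft' computes the convolution via the fast Fourier transform.
density = convolve(example, np.ones(window_size), mode='valid', method='fft')
start = np.argmax(density)  # starting index of the densest window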

Warren Weckesser answered Nov 02 '22


This is how you can use the built-in NumPy real FFT functions to convolve in 1 dimension:

import numpy, numpy.fft.fftpack_lite

def fftpack_lite_rfftb(buf, s):
    # Backward (inverse) real FFT, applying the 1/m normalization that fftpack omits
    n = len(buf)
    m = (n - 1) * 2
    temp = numpy.empty(m, buf.dtype)
    numpy.divide(buf, m, temp[:n])
    temp[n:m] = 0
    return numpy.fft.fftpack_lite.rfftb(temp[:m], s)

def fftconvolve(x, y):
    xn = x.shape[-1]
    yn = y.shape[-1]
    cn = xn + yn - (xn + yn > 0)  # length of the full linear convolution
    m = 1 << cn.bit_length()  # FFT size: a power of two at least as large as cn
    s = numpy.fft.fftpack_lite.rffti(m)  # Initialization; can be factored out for performance
    xpad = numpy.pad(x, [(0, 0)] * (len(x.shape) - 1) + [(0, m - xn)], 'constant')
    a = numpy.fft.fftpack_lite.rfftf(xpad, s)  # Forward transform
    ypad = numpy.pad(y, [(0, 0)] * (len(y.shape) - 1) + [(0, m - yn)], 'constant')
    b = numpy.fft.fftpack_lite.rfftf(ypad, s)  # Forward transform
    numpy.multiply(a, b, b)  # Spectral multiplication
    c = fftpack_lite_rfftb(b, s)  # Backward transform
    return c[:cn]

# Verify the convolution against numpy.convolve on random inputs
a = numpy.random.randn(numpy.random.randint(1, 32))
b = numpy.random.randn(numpy.random.randint(1, 32))
assert numpy.allclose(fftconvolve(a, b), numpy.convolve(a, b))

Bear in mind that this padding is inefficient when the two vectors have significantly different lengths (more than about 2×), because the shorter one is zero-padded all the way up to the FFT size; in that case you'll want a block-based technique such as overlap-add, which splits the longer input and adds up many smaller convolutions.
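For what it's worth, numpy.fft.fftpack_lite is an internal module that newer NumPy versions no longer ship; the same idea can be expressed with the public real-FFT API. A minimal sketch under that assumption (my rewrite, not the answerer's code):

import numpy as np

def fftconvolve_rfft(x, y):
    # Length of the full linear convolution, and an FFT size that is a
    # power of two at least that large.
    n = x.shape[-1] + y.shape[-1] - 1
    m = 1 << (n - 1).bit_length()
    # Forward real FFTs of the zero-padded inputs, pointwise product,
    # backward transform, then trim away the padding.
    return np.fft.irfft(np.fft.rfft(x, m) * np.fft.rfft(y, m), m)[:n]

a = np.random.randn(1000)
b = np.ones(100)
assert np.allclose(fftconvolve_rfft(a, b), np.convolve(a, b))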

user541686 answered Nov 01 '22


You can get the same windowed sums from a cumulative sum, which costs O(n) no matter how large the window is:

cumsum = np.cumsum(np.insert(example, 0, 0))
density2 = cumsum[window_size:] - cumsum[:-window_size]

np.all(density2 == density)
# True

(Remove the np.insert if you can live without the first value...)
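Since the question also mentions pandas: a rolling sum gives the same result. This is just a sketch for comparison (not part of this answer), and the cumulative-sum trick above is usually at least as fast.

import numpy as np
import pandas as pd

# `example`, `window_size` and `density` as defined in the question
density3 = pd.Series(example).rolling(window_size).sum().dropna().to_numpy()
print(np.allclose(density3, density))  # True
print(np.argmax(density3))             # 3, the start of the densest window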

Mathieu Villion answered Nov 02 '22