Scipy's correlate function is slow

I have compared different methods for convolving/correlating two signals using numpy/scipy. It turns out that there are huge differences in speed. I compared the following methods:

  • correlate from the numpy package (np.correlate in plot)
  • correlate from the scipy.signal package (sps.correlate in plot)
  • fftconvolve from scipy.signal (sps.fftconvolve in plot)

Of course, I understand that there is a considerable difference between fftconvolve and the other two functions. What I do not understand is why sps.correlate is so much slower than np.correlate. Does anybody know why scipy uses an implementation that is so much slower?
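
As a sanity check that the three calls are comparable at all, here is a minimal sketch (not part of the original benchmark) that verifies they return the same autocorrelation. Note that fftconvolve computes a convolution, so the second argument is reversed to turn it into a correlation; the seed and the signal length of 64 are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.signal as sps

# Arbitrary test signal; seed and length are assumptions for this sketch.
rng = np.random.default_rng(0)
sig = rng.random(64)

# Direct correlation from numpy and from scipy.signal.
r_np = np.correlate(sig, sig, 'full')
r_sps = sps.correlate(sig, sig, 'full')

# fftconvolve is a convolution; reversing the second input
# makes it equivalent to a correlation for real-valued signals.
r_fft = sps.fftconvolve(sig, sig[::-1], 'full')

print(np.allclose(r_np, r_sps), np.allclose(r_np, r_fft))
```

For timing purposes the reversal makes no difference, which is presumably why the benchmark below passes sig unreversed.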

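[Figure: Speed comparison. Log-log plot of elapsed time in seconds versus signal length for np.correlate, sps.correlate and sps.fftconvolve]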

For completeness, here is the code that produces the plot:

import time

import numpy as np
import scipy.signal as sps

from matplotlib import pyplot as plt


if __name__ == '__main__':

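    # Signal lengths from 1 to ~31622, cast to int because
    # np.random.rand() does not accept float sizes.
    a = (10**(np.arange(10)/2)).astype(int)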
    print(a)

    results = {}
    results['np.correlate'] = np.zeros(len(a))
    results['sps.correlate'] = np.zeros(len(a))
    results['sps.fftconvolve'] = np.zeros(len(a))

    ii = 0
    for length in a:

        sig = np.random.rand(length)

        # time.clock() was removed in Python 3.8; use perf_counter()
        t0 = time.perf_counter()
        for jj in range(3):
            np.correlate(sig, sig, 'full')
        t1 = time.perf_counter()
        elapsed = (t1-t0)/3

        results['np.correlate'][ii] = elapsed

        # time.clock() was removed in Python 3.8; use perf_counter()
        t0 = time.perf_counter()
        for jj in range(3):
            sps.correlate(sig, sig, 'full')
        t1 = time.perf_counter()
        elapsed = (t1-t0)/3

        results['sps.correlate'][ii] = elapsed

        # time.clock() was removed in Python 3.8; use perf_counter()
        t0 = time.perf_counter()
        for jj in range(3):
            sps.fftconvolve(sig, sig, 'full')
        t1 = time.perf_counter()
        elapsed = (t1-t0)/3

        results['sps.fftconvolve'][ii] = elapsed

        ii += 1

    plt.figure()
    plt.loglog(a, results['np.correlate'], label='np.correlate')
    plt.loglog(a, results['sps.correlate'], label='sps.correlate')
    plt.loglog(a, results['sps.fftconvolve'], label='sps.fftconvolve')
    plt.xlabel('Signal length')
    plt.ylabel('Elapsed time in seconds')

    plt.legend()
    plt.grid()

    plt.show()

Chris asked Jun 03 '15 12:06

1 Answer

According to the documentation, numpy.correlate was designed for 1-D arrays, while scipy.signal.correlate can accept N-D arrays.
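
A small sketch of that difference, assuming a recent numpy/scipy: the 3x4 array and 2x2 kernel are made-up inputs, and the exact error message from numpy may vary between versions.

```python
import numpy as np
import scipy.signal as sps

# Hypothetical 2-D inputs, chosen only for illustration.
img = np.arange(12, dtype=float).reshape(3, 4)
kernel = np.ones((2, 2))

# scipy.signal.correlate handles N-D input directly.
out = sps.correlate(img, kernel, 'full')
print(out.shape)

# numpy.correlate only handles 1-D sequences and rejects 2-D input.
try:
    np.correlate(img, kernel, 'full')
    np_accepts_2d = True
except ValueError as exc:
    np_accepts_2d = False
    print("np.correlate rejected 2-D input:", exc)
```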

The scipy implementation, being more general and therefore more complex, does indeed seem to incur additional computational overhead. You can compare the C code of the numpy and scipy implementations.

Another difference could be, for instance, that the numpy implementation is vectorized better by the compiler on modern processors.
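
As a side note, newer SciPy releases (0.19 and later, as far as I know) let you choose the algorithm through the method argument of scipy.signal.correlate, so the direct and FFT-based paths can be compared from the same function. The sketch below only checks that both paths agree; the signal length and seed are arbitrary, and it makes no claim about which one is faster on your machine.

```python
import numpy as np
import scipy.signal as sps

# Arbitrary test signal; length and seed are assumptions for this sketch.
rng = np.random.default_rng(1)
sig = rng.random(256)

# Same function, two different algorithms (requires SciPy >= 0.19).
direct = sps.correlate(sig, sig, mode='full', method='direct')
via_fft = sps.correlate(sig, sig, mode='full', method='fft')

# Both should agree up to floating-point error.
print(np.allclose(direct, via_fft))
```

Leaving method at its default of 'auto' lets scipy pick whichever of the two it estimates to be faster for the given input sizes.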

rth answered Oct 10 '22 10:10