 

Improve performance of operation on numpy trigonometric functions

I have a fairly large codebase that I need to optimize. After timing it with time.time(), I found that the line taking the most processing time (it is executed thousands of times) is this one:

A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

where all the variables can be randomly defined with:

import numpy as np

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

Is there a way to improve the performance of the calculation of A? As I'm already using numpy, I'm pretty much out of ideas.

Gabriel asked Dec 25 '22 11:12


2 Answers

By using the product-to-sum trigonometric identities, you can reduce the number of trig function calls. In the following, func1 and func2 compute the same value, but func2 makes three trig function calls instead of five.

import numpy as np

def func1(a, b, c, d):
    A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)
    return A

def func2(a, b, c, d):
    s = np.sin(c - d)
    A = 0.5*((1 - s)*np.cos(a - b) + (1 + s)*np.cos(a + b))
    return A
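
For reference, the rewrite follows from cos(a)cos(b) = 0.5*(cos(a - b) + cos(a + b)) and sin(a)sin(b) = 0.5*(cos(a - b) - cos(a + b)); substituting both into the original expression and collecting terms gives the form used in func2. The sketch below is one way to check numerically that the two functions agree. The seed and the use of np.random.default_rng are assumptions made for reproducibility and are not part of the original post.

```python
import numpy as np

def func1(a, b, c, d):
    # Original expression: five trig calls
    return np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

def func2(a, b, c, d):
    # Product-to-sum form: three trig calls
    s = np.sin(c - d)
    return 0.5 * ((1 - s) * np.cos(a - b) + (1 + s) * np.cos(a + b))

# Same ranges as in the question; the seed is an arbitrary choice
rng = np.random.default_rng(0)
N = 5000
a = rng.uniform(0., 10., N)
b = rng.uniform(0., 50., N)
c = rng.uniform(0., 30., N)
d = rng.uniform(0., 25., N)

# True if both forms match within floating-point tolerance
agree = np.allclose(func1(a, b, c, d), func2(a, b, c, d))
print(agree)
```

The two results differ only by rounding error, so compare them with np.allclose rather than exact equality.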

Here's a timing comparison with N = 5000:

In [48]: %timeit func1(a, b, c, d)
1000 loops, best of 3: 374 µs per loop

In [49]: %timeit func2(a, b, c, d)
1000 loops, best of 3: 241 µs per loop
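
Outside IPython, a comparable measurement could be made with the standard timeit module. This is only a sketch: the loop and repeat counts are arbitrary assumptions, and the absolute numbers will depend on the machine and the NumPy build.

```python
import timeit
import numpy as np

def func1(a, b, c, d):
    return np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)

def func2(a, b, c, d):
    s = np.sin(c - d)
    return 0.5 * ((1 - s) * np.cos(a - b) + (1 + s) * np.cos(a + b))

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

loops = 200  # calls per timing run (arbitrary)

# Best of 3 runs, converted to seconds per call
t1 = min(timeit.repeat(lambda: func1(a, b, c, d), number=loops, repeat=3)) / loops
t2 = min(timeit.repeat(lambda: func2(a, b, c, d), number=loops, repeat=3)) / loops

print(f"func1: {t1 * 1e6:.0f} µs per call")
print(f"func2: {t2 * 1e6:.0f} µs per call")
```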
Warren Weckesser answered Feb 16 '23 01:02


Have you tried using a Python accelerator such as Numba, Cython, Pythran, or something similar?
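
As an illustration of what such an accelerator might look like in practice, here is a minimal Numba sketch. It was not benchmarked for this answer and no speedup is claimed. The name func2_loop is hypothetical, and the no-op fallback decorator is an assumption added only so the snippet still runs where Numba is not installed.

```python
import math
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback: plain Python, no compilation
        return func

@njit
def func2_loop(a, b, c, d):
    # Explicit element-wise loop over the product-to-sum form
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        s = math.sin(c[i] - d[i])
        out[i] = 0.5 * ((1.0 - s) * math.cos(a[i] - b[i])
                        + (1.0 + s) * math.cos(a[i] + b[i]))
    return out

N = 5000
a = np.random.uniform(0., 10., N)
b = np.random.uniform(0., 50., N)
c = np.random.uniform(0., 30., N)
d = np.random.uniform(0., 25., N)

A = func2_loop(a, b, c, d)
```

With Numba available, the first call includes compilation time, so any timing should be taken from later calls.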

I ran some tests with Pythran. Here are the results:

Original code:

  • Python + numpy: 1000 loops, best of 3: 1.43 msec per loop
  • Pythran: 1000 loops, best of 3: 777 usec per loop
  • Pythran + SIMD: 1000 loops, best of 3: 488 usec per loop

Code provided by Warren:

  • Python + numpy: 1000 loops, best of 3: 1.05 msec per loop
  • Pythran: 1000 loops, best of 3: 646 usec per loop
  • Pythran + SIMD: 1000 loops, best of 3: 425 usec per loop

All timings were measured with N = 5000.

Update:

Here is the code:

# pythran export func1(float[], float[], float[], float[])
# pythran export func2(float[], float[], float[], float[])
import numpy as np

def func1(a, b, c, d):
    A = np.cos(a) * np.cos(b) - np.sin(a) * np.sin(b) * np.sin(c - d)
    return A

def func2(a, b, c, d):
    s = np.sin(c - d)
    A = 0.5*((1 - s)*np.cos(a - b) + (1 + s)*np.cos(a + b))
    return A

And the command lines used to compile it:

$ pythran test.py  # Default compilation
$ pythran test.py -march=native -DUSE_BOOST_SIMD  # Pythran with code vectorization
P. Brunet answered Feb 16 '23 00:02