
sine calculation orders of magnitude slower than cosine

tl;dr

For the same numpy array, calculating np.cos takes 3.2 seconds, whereas np.sin takes 548 seconds (nine minutes) on Linux Mint.

See this repo for full code.


I've got a pulse signal (see image below) which I need to modulate onto an HF carrier, simulating a Laser Doppler Vibrometer. To do that, the signal and its time base need to be resampled to match the carrier's higher sampling rate.

pulse signal to be modulated onto HF-carrier

In the following demodulation process both the in-phase carrier cos(omega * t) and the phase-shifted carrier sin(omega * t) are needed. Oddly, the time to evaluate these functions depends highly on the way the time vector has been calculated.

The time vector t1 is being calculated using np.linspace directly, t2 uses the method implemented in scipy.signal.resample.

import numpy as np

pulse = np.load('data/pulse.npy')  # 768 samples

pulse_samples = len(pulse)
pulse_samplerate = 960  # 960 Hz
pulse_duration = pulse_samples / pulse_samplerate  # here: 0.8 s
pulse_time = np.linspace(0, pulse_duration, pulse_samples,
                         endpoint=False)

carrier_freq = 40e6  # 40 MHz
carrier_samplerate = 100e6  # 100 MHz
# np.linspace needs an integer sample count, so convert explicitly
carrier_samples = int(round(pulse_duration * carrier_samplerate))  # 80 million

t1 = np.linspace(0, pulse_duration, carrier_samples)

# method used in scipy.signal.resample
# https://github.com/scipy/scipy/blob/v0.17.0/scipy/signal/signaltools.py#L1754
t2 = np.arange(0, carrier_samples) * (pulse_time[1] - pulse_time[0]) \
        * pulse_samples / float(carrier_samples) + pulse_time[0]

As can be seen in the picture below, the time vectors are not identical. At 80 million samples the difference t1 - t2 reaches 1e-8.

difference between time vectors t1 and t2
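The size of that difference is consistent with the two vectors simply using different step sizes: np.linspace includes the endpoint by default, so its step is duration / (n - 1), while the resample-style vector uses duration / n. The following is a minimal sketch of that reading, using a hypothetical small sample count n instead of the 80 million from above; it is an illustration, not the original data.

```python
import numpy as np

duration = 0.8   # seconds, as in the question
n = 8_000        # hypothetical small sample count, chosen for a quick check

# np.linspace includes the endpoint by default: step = duration / (n - 1)
t1 = np.linspace(0, duration, n)

# resample-style vector: step = duration / n, endpoint excluded
t2 = np.arange(n) * (duration / n)

# The gap grows linearly and ends at roughly one resample-style step
end_gap = (t1 - t2)[-1]
print(end_gap, duration / n)
```

With n = 80 million, duration / n works out to 1e-8, which matches the difference reported above.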

Calculating the in-phase and shifted carrier of t1 takes 3.2 seconds each on my machine.
With t2, however, calculating the shifted carrier takes 540 seconds. Nine minutes. For nearly the same 80 million values.

omega_t1 = 2 * np.pi * carrier_freq * t1
np.cos(omega_t1)  # 3.2 seconds
np.sin(omega_t1)  # 3.3 seconds

omega_t2 = 2 * np.pi * carrier_freq * t2
np.cos(omega_t2)  # 3.2 seconds
np.sin(omega_t2)  # 9 minutes

I can reproduce this bug on both my 32-bit laptop and my 64-bit tower, both running Linux Mint 17. On my flat mate's MacBook, however, the "slow sine" takes as little time as the other three calculations.


I run Linux Mint 17.03 on a 64-bit AMD processor and Linux Mint 17.2 on a 32-bit Intel processor.

Finwood asked Sep 09 '25 20:09


1 Answer

I don't think numpy has anything to do with this: I think you're tripping across a performance bug in the C math library on your system, one which affects sin near large multiples of pi. (I'm using "bug" in a pretty broad sense here -- for all I know, since the sine of large floats is poorly defined, the "bug" is actually the library behaving correctly to handle corner cases!)
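As a rough illustration of why t2 from the question might land on such inputs far more often than t1, the sketch below counts how many phase values fall close to a multiple of pi. It rests on several assumptions: the two time vectors are approximated by their step sizes rather than rebuilt from the original data, only the first 100,000 indices are checked, and the threshold for "near" is arbitrary. With a 40 MHz carrier sampled every 1e-8 s, the t2-style phase advances by 0.8 pi per sample, so every fifth sample sits almost exactly on a multiple of pi; the slightly larger linspace step drifts away from that pattern.

```python
import numpy as np

carrier_freq = 40e6
duration = 0.8
n_full = 80_000_000        # sample count from the question
k = np.arange(100_000)     # only the first 100,000 indices, to keep this quick

# Approximations of the two time vectors (assumed, not the original arrays)
t1 = k * (duration / (n_full - 1))   # np.linspace spacing, endpoint included
t2 = k * (duration / n_full)         # resample-style spacing

def distance_to_multiple_of_pi(phase):
    """Absolute distance from each phase value to the nearest multiple of pi."""
    remainder = np.mod(phase, np.pi)
    return np.minimum(remainder, np.pi - remainder)

omega = 2 * np.pi * carrier_freq
threshold = 1e-6   # arbitrary cut-off for "near", chosen for illustration

near_t1 = int((distance_to_multiple_of_pi(omega * t1) < threshold).sum())
near_t2 = int((distance_to_multiple_of_pi(omega * t2) < threshold).sum())
print(near_t1, near_t2)
```

Whether those near-multiples are what actually triggers the slow path depends on the math library, so this only shows that the inputs differ in the way the timings suggest.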

On Linux, I get:

>>> %timeit -n 10000 math.sin(6e7*math.pi)
10000 loops, best of 3: 191 µs per loop
>>> %timeit -n 10000 math.sin(6e7*math.pi+0.12)
10000 loops, best of 3: 428 ns per loop

and other Linux-using types from the Python chatroom report

10000 loops, best of 3: 49.4 µs per loop 
10000 loops, best of 3: 206 ns per loop

and

In [3]: %timeit -n 10000 math.sin(6e7*math.pi)
10000 loops, best of 3: 116 µs per loop

In [4]: %timeit -n 10000 math.sin(6e7*math.pi+0.12)
10000 loops, best of 3: 428 ns per loop

but a Mac user reported

In [3]: timeit -n 10000 math.sin(6e7*math.pi)
10000 loops, best of 3: 300 ns per loop

In [4]: %timeit -n 10000 math.sin(6e7*math.pi+0.12)
10000 loops, best of 3: 361 ns per loop

for no order-of-magnitude difference. As a workaround, you might try taking things mod 2 pi first:

>>> new = np.sin(omega_t2[-1000:] % (2*np.pi))
>>> old = np.sin(omega_t2[-1000:])
>>> abs(new - old).max()
7.83773902468434e-09

which has better performance:

>>> %timeit -n 1000 new = np.sin(omega_t2[-1000:] % (2*np.pi))
1000 loops, best of 3: 63.8 µs per loop
>>> %timeit -n 1000 old = np.sin(omega_t2[-1000:])
1000 loops, best of 3: 6.82 ms per loop
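The workaround can be wrapped in a small helper. This is a minimal sketch: sin_reduced is a hypothetical name, and the input is a stand-in for the tail of omega_t2 built from the question's parameters, not the original array.

```python
import numpy as np

def sin_reduced(phase):
    """Sine of phase after reducing the argument to [0, 2*pi)."""
    return np.sin(np.mod(phase, 2 * np.pi))

# Hypothetical stand-in for omega_t2[-1000:] from the question:
# 40 MHz carrier, samples spaced 1e-8 s apart, last 1000 of 80 million.
k = np.arange(79_999_000, 80_000_000)
omega_t2_tail = 2 * np.pi * 40e6 * (k * 1e-8)

# Largest deviation between the reduced and the direct evaluation
max_diff = np.abs(sin_reduced(omega_t2_tail) - np.sin(omega_t2_tail)).max()
print(max_diff)
```

The reduction itself introduces a small error, because 2*np.pi is only a floating-point approximation of the true period and that error accumulates over tens of millions of periods. For phases of this size it should stay in the range of the 7.8e-09 shown above, which is worth checking against the accuracy the demodulation needs.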

Note that as expected, a similar effect happens for cos, just shifted:

>>> %timeit -n 1000 np.cos(6e7*np.pi + np.pi/2)
1000 loops, best of 3: 37.6 µs per loop
>>> %timeit -n 1000 np.cos(6e7*np.pi + np.pi/2 + 0.12)
1000 loops, best of 3: 2.46 µs per loop
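To check whether a particular machine is affected, the four measurements above can be repeated outside IPython with the standard timeit module. This is only a sketch: time_call is a hypothetical helper, the loop count is arbitrary, and the numbers will depend on the C math library and Python build in use.

```python
import math
import timeit

def time_call(func, x, number=2_000):
    """Average seconds per call of func(x) over `number` calls."""
    return timeit.timeit(lambda: func(x), number=number) / number

base = 6e7 * math.pi
cases = {
    "sin, near multiple of pi": (math.sin, base),
    "sin, offset by 0.12": (math.sin, base + 0.12),
    "cos, near odd multiple of pi/2": (math.cos, base + math.pi / 2),
    "cos, offset by 0.12": (math.cos, base + math.pi / 2 + 0.12),
}

timings = {name: time_call(func, x) for name, (func, x) in cases.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e9:.0f} ns per call")
```

If the first and third lines come out orders of magnitude slower than the other two, the system shows the same behaviour as the Linux timings above; if all four are similar, it behaves like the Mac.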
DSM answered Sep 12 '25 11:09