Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient way to create an array of cos and sin in Numpy

I need to store an array of size n with values of cos(x) and sin(x), lets say

array[[cos(0.9), sin(0.9)],
      [cos(0.35),sin(0.35)],
      ...]

The arguments of each pair of cos and sin is given by random choice. My code as far as I have been improving it is like this:

def randvector():
""" Generates random direction for n junctions in the unitary circle """
    x = np.empty([n,2])
    theta = 2 * np.pi * np.random.random_sample((n))
    x[:,0] = np.cos(theta)
    x[:,1] = np.sin(theta)
    return x

Is there a shorter way or more effective way to achieve this?

like image 834
Alejandro Sazo Avatar asked Apr 09 '14 02:04

Alejandro Sazo


2 Answers

Your code is effective enough. And justhalf's answer is not bad I think.

For effective and short, How about this code?

def randvector(n):
    theta = 2 * np.pi * np.random.random_sample((n))
    return np.vstack((np.cos(theta), np.sin(theta))).T

UPDATE

Append cProfile result.

justhalf's

      5 function calls in 4.707 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.001    0.001    4.707    4.707 <string>:1(<module>)
     1    2.452    2.452    4.706    4.706 test.py:6(randvector1)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.010    0.010    0.010    0.010 {method 'random_sample' of 'mtrand.RandomState' objects}
     1    2.244    2.244    2.244    2.244 {numpy.core.multiarray.array}

OP's

      5 function calls in 0.088 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.088    0.088 <string>:1(<module>)
     1    0.079    0.079    0.088    0.088 test.py:9(randvector2)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.009    0.009    0.009    0.009 {method 'random_sample' of 'mtrand.RandomState' objects}
     1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.empty}

mine

      21 function calls in 0.087 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.087    0.087 <string>:1(<module>)
     2    0.000    0.000    0.000    0.000 numeric.py:322(asanyarray)
     1    0.000    0.000    0.002    0.002 shape_base.py:177(vstack)
     2    0.000    0.000    0.000    0.000 shape_base.py:58(atleast_2d)
     1    0.076    0.076    0.087    0.087 test.py:17(randvector3)
     6    0.000    0.000    0.000    0.000 {len}
     1    0.000    0.000    0.000    0.000 {map}
     2    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.009    0.009    0.009    0.009 {method 'random_sample' of 'mtrand.RandomState' objects}
     2    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
     1    0.002    0.002    0.002    0.002 {numpy.core.multiarray.concatenate}
like image 99
emeth Avatar answered Oct 19 '22 21:10

emeth


Your code already looks fine to me, but here are a few more thoughts.

Here's a one-liner. It is marginally slower than your version.

def randvector2(n):
    return np.exp((2.0j * np.pi) * np.random.rand(n, 1)).view(dtype=np.float64)

I get these timings for n=10000

Yours:

1000 loops, best of 3: 716 µs per loop

my shortened version:

1000 loops, best of 3: 834 µs per loop

Now if speed is a concern, your approach is really very good. Another answer shows how to use hstack. That works well. Here is another version that is just a little different from yours and is marginally faster.

def randvector3(n):
    x = np.empty([n,2])
    theta = (2 * np.pi) * np.random.rand(n)
    np.cos(theta, out=x[:,0])
    np.sin(theta, out=x[:,1])
    return x

This gives me the timing:

1000 loops, best of 3: 698 µs per loop

If you have access to numexpr, the following is faster (at least on my machine).

import numexpr as ne
def randvector3(n):
    sample = np.random.rand(n, 1)
    c = 2.0j * np.pi
    return ne.evaluate('exp(c * sample)').view(dtype=np.float64)

This gives me the timing:

1000 loops, best of 3: 366 µs per loop

Honestly though, if I were writing this for anything that wasn't extremely performance intensive, I'd do pretty much the same thing you did. It makes your intent pretty clear to the reader. The version with hstack works well too.

Another quick note: When I run timings for n=10, my one-line version is fastest. When I do n=10000000, the fast pure-numpy version is fastest.

like image 25
IanH Avatar answered Oct 19 '22 21:10

IanH