I want to generate a vector using NumPy that is k-sparse, i.e. it has n entries of which k are nonzero. The positions of the nonzero entries are chosen randomly, and the entries themselves are drawn from a Gaussian distribution with zero mean and unit variance. The test vector is small (256 entries), so I don't think SciPy's sparse matrix interface is necessary here.
My current approach is to generate a random list of k integers between 0 and 255, initialize a vector full of zeros, and then use a for loop to draw a random value for each of those positions and write it into the vector, like so:
# Construct data vector x
# Entries of x are ~N(0, 1) and are placed in the positions specified by the
# 'nonzeros' vector
x = np.zeros((256, 1))
# Get a random value ~N(0, 1) and place it in x at the position specified by
# 'nonzeros' vector
for i in range(k):
    x[nonzeros[i]] = np.random.normal(mu, sigma)
Performance isn't an issue here (it's research-related) so this works, but it feels unpythonic, and I suspect there is a more elegant solution.
How about this:
In [41]: import numpy as np
In [42]: x = np.zeros(10)
In [43]: positions = np.random.choice(np.arange(10), 3, replace=False)
In [44]: x[positions] = np.random.normal(0,1,3)
In [45]: x
Out[45]:
array([ 0. , 0.11197222, 0. , 0.09540939, -0.04488175,
0. , 0. , 0. , 0. , 0. ])
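If your NumPy is recent enough to have np.random.default_rng (added in 1.17), the same idea can be written with the Generator API, which also makes seeding easy. This is only a minimal sketch: the function name sparse_gaussian and the seed argument are my own choices for illustration, not something from the question.

```python
import numpy as np

def sparse_gaussian(n, k, seed=None):
    """Return a length-n vector with k entries drawn from N(0, 1), zeros elsewhere."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    # Sample k distinct indices from 0..n-1 (replace=False means no repeats)
    positions = rng.choice(n, size=k, replace=False)
    # Fill the chosen positions with standard normal draws
    x[positions] = rng.standard_normal(k)
    return x

x = sparse_gaussian(256, 15, seed=0)
print(x.shape, np.count_nonzero(x))
```

Because the positions are sampled without replacement, no index is written twice, so the vector ends up with k nonzero entries (a Gaussian draw being exactly zero is not a practical concern). If you need the (256, 1) column shape from the question, x.reshape(-1, 1) should do it.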
Here's a recipe: first fill in the first k
elements of a zero vector with Gaussian noise, then shuffle to get those k
at random positions.
>>> n = 10
>>> k = 3
>>> a = np.zeros(n)
>>> a[:k] = np.random.randn(k)
>>> np.random.shuffle(a)
>>> a
array([ 1.26611853, 0. , 0. , 0. , -0.84272405,
0. , 0. , 1.96992445, 0. , 0. ])
This solution can be faster than the accepted one:
>>> def rand_nz_1(n, k):
...     a = np.zeros(n)
...     pos = np.random.choice(np.arange(n), k, replace=False)
...     a[pos] = np.random.randn(k)
...     return a
...
>>> def rand_nz_2(n, k):
...     a = np.zeros(n)
...     a[:k] = np.random.randn(k)
...     np.random.shuffle(a)
...     return a
...
>>> %timeit rand_nz_1(256, 15)
10000 loops, best of 3: 63.8 µs per loop
>>> %timeit rand_nz_2(256, 15)
10000 loops, best of 3: 38.3 µs per loop
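Note that %timeit is an IPython magic, so it won't run in a plain Python script. If you want to repeat the comparison outside IPython, something like the following with the standard-library timeit module should be roughly equivalent. The function names here are renamed for readability and the loop counts are arbitrary choices of mine; the numbers will depend on your machine and NumPy version, so I'm not showing any output.

```python
import timeit
import numpy as np

def rand_nz_choice(n, k):
    # Pick k distinct positions, then fill them with Gaussian noise
    a = np.zeros(n)
    pos = np.random.choice(np.arange(n), k, replace=False)
    a[pos] = np.random.randn(k)
    return a

def rand_nz_shuffle(n, k):
    # Fill the first k entries with Gaussian noise, then shuffle in place
    a = np.zeros(n)
    a[:k] = np.random.randn(k)
    np.random.shuffle(a)
    return a

calls = 2000  # calls per timing run (arbitrary)
results = {}
for fn in (rand_nz_choice, rand_nz_shuffle):
    # Best of 3 runs, converted to microseconds per call
    best = min(timeit.repeat(lambda: fn(256, 15), number=calls, repeat=3))
    results[fn.__name__] = best / calls * 1e6
    print(f"{fn.__name__}: {results[fn.__name__]:.1f} µs per call")
```

Taking the minimum over the repeats mirrors the "best of 3" that %timeit reports in the session above.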