
Create sample numpy array with randomly placed NaNs

For testing purposes, I'd like to create an M by N numpy array with c randomly placed NaNs:

import numpy as np

M = 10
N = 5
c = 15
A = np.random.randn(M,N)

A[mask] = np.nan

I am having trouble creating a mask with exactly c True elements. Or can this be done with indices directly?

Oleg, asked Aug 24 '15 at 12:08



2 Answers

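You can use np.random.choice with replace=False to draw c distinct positions without replacement, and use them to index a flattened view of A (obtained with .ravel()). Note that .ravel() returns a view only when it can, as it does for the contiguous A here; for a non-contiguous array it returns a copy, in which case assigning through A.flat[...] is the safer choice. For the array in the question: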

A.ravel()[np.random.choice(A.size, c, replace=False)] = np.nan

Sample run:

In [100]: A
Out[100]: 
array([[-0.35365726,  0.26754527, -0.44985524, -1.29520237,  2.01505444],
       [ 0.01319146,  0.65150356, -2.32054478,  0.40924753,  0.24761671],
       [ 0.3014714 , -0.80688589, -2.61431163,  0.07787956,  1.23381951],
       [-1.70725777,  0.07856845, -1.04354202, -0.68904925,  1.07161002],
       [-1.08061614,  1.17728247, -1.5913516 , -1.87601976,  1.14655867],
       [ 1.12542853, -0.26290025, -1.0371326 ,  0.53019033, -1.20766258],
       [ 1.00692277,  0.171661  , -0.89646634,  1.87619114, -1.04900026],
       [ 0.22238353, -0.6523747 , -0.38951426,  0.78449948, -1.14698869],
       [ 0.58023183,  1.99987331, -0.85938155,  1.4211672 , -0.43369898],
       [-2.15682219, -0.6872121 , -1.28073816, -0.97523148, -2.27967001]])

In [101]: A.ravel()[np.random.choice(A.size, c, replace=False)] = np.nan

In [102]: A
Out[102]: 
array([[        nan,  0.26754527, -0.44985524,         nan,  2.01505444],
       [ 0.01319146,  0.65150356, -2.32054478,         nan,  0.24761671],
       [        nan, -0.80688589,         nan,         nan,  1.23381951],
       [        nan,         nan, -1.04354202, -0.68904925,  1.07161002],
       [-1.08061614,  1.17728247, -1.5913516 ,         nan,  1.14655867],
       [ 1.12542853,         nan, -1.0371326 ,  0.53019033, -1.20766258],
       [        nan,  0.171661  , -0.89646634,         nan,         nan],
       [ 0.22238353, -0.6523747 , -0.38951426,  0.78449948, -1.14698869],
       [ 0.58023183,  1.99987331, -0.85938155,         nan, -0.43369898],
       [-2.15682219, -0.6872121 , -1.28073816, -0.97523148,         nan]])
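On the question's "with indices directly" point, the flat positions can also be turned into row/column index arrays with np.unravel_index. The following is only a minimal sketch under some assumptions: it uses the newer np.random.default_rng generator rather than the legacy np.random.choice shown above, and add_random_nans is a hypothetical helper name chosen for illustration, not a NumPy function.

```python
import numpy as np

def add_random_nans(a, c, rng=None):
    """Return a float copy of `a` with exactly `c` entries set to NaN.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(a, dtype=float)  # copy, so the input is left untouched
    # Draw c distinct flat positions, then convert them to (row, col) indices.
    flat_idx = rng.choice(out.size, size=c, replace=False)
    rows_cols = np.unravel_index(flat_idx, out.shape)
    out[rows_cols] = np.nan
    return out

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
A = rng.standard_normal((10, 5))
B = add_random_nans(A, 15, rng)
nan_count = int(np.isnan(B).sum())  # number of NaNs actually placed
```

Working on a copy avoids the view-versus-copy question around .ravel() entirely, at the cost of one extra allocation.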
Divakar, answered Oct 08 '22 at 02:10



You could build a flat boolean array with c True entries, shuffle it with np.random.shuffle, and reshape it to create your mask:

import numpy as np

M = 10
N = 5
c = 15
A = np.random.randn(M, N)

# Flat boolean array with exactly c True entries, shuffled in place
mask = np.zeros(M * N, dtype=bool)
mask[:c] = True
np.random.shuffle(mask)
mask = mask.reshape(M, N)

A[mask] = np.nan

This gives output like:

[[ 0.98244168  0.72121195  0.99291217  0.17035834  0.46987918]
 [ 0.76919975  0.53102064         nan  0.78776918         nan]
 [ 0.50931304  0.91826809  0.52717345         nan         nan]
 [ 0.35445471  0.28048106  0.91922292  0.76091783  0.43256409]
 [ 0.69981284  0.0620876   0.92502572         nan         nan]
 [        nan         nan         nan  0.24466688  0.70259211]
 [ 0.4916004          nan         nan  0.94945378  0.73983538]
 [ 0.89057404  0.4542628          nan  0.95547377         nan]
 [ 0.4071912   0.36066797  0.73169132  0.48217226  0.62607888]
 [ 0.30341337         nan  0.75608859  0.31497997         nan]]
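To make a run of the mask approach repeatable and to confirm that exactly c entries were replaced, something like the sketch below may help. It assumes the newer np.random.default_rng generator instead of the legacy np.random.shuffle used above, and the seed value 42 is an arbitrary choice.

```python
import numpy as np

M, N, c = 10, 5, 15
rng = np.random.default_rng(42)  # arbitrary seed, only for repeatability

A = rng.standard_normal((M, N))

# Same idea as above: c True values in a flat array, shuffled, then reshaped.
mask = np.zeros(M * N, dtype=bool)
mask[:c] = True
rng.shuffle(mask)  # shuffles the 1-D array in place
mask = mask.reshape(M, N)

A[mask] = np.nan

# Sanity checks: the mask has c True entries and the NaNs sit exactly there.
n_true = int(mask.sum())
nans_match_mask = bool(np.array_equal(np.isnan(A), mask))
```

Keeping the mask around is handy in tests, since it records where the NaNs were placed.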
tmdavison, answered Oct 08 '22 at 01:10
