 

What is the fastest way to initialise a scipy.sparse matrix with numpy.NaN?

Tags:

python

scipy

I want to initialise a sparse matrix from a NumPy array. The array contains NaN values, which my program should treat as zeros. The code I use to initialise the sparse matrix is as follows:

import numpy as np
from scipy.sparse import lil_matrix

a = np.array([[np.nan, np.nan, 10]])
zero_a = np.array([[0, 0, 10]])
spr_a = lil_matrix(a)
zero_spr_a = lil_matrix(zero_a)
print(repr(spr_a))
print(repr(zero_spr_a))

The output is:

<1x3 sparse matrix of type '<type 'numpy.float64'>'
    with 3 stored elements in LInked List format>
<1x3 sparse matrix of type '<type 'numpy.int64'>'
    with 1 stored elements in LInked List format>

For the array with zeros, only 1 element is stored in the sparse matrix, but for the array with NaN, all 3 elements are stored. How can I make SciPy treat NaN as zero when building the sparse matrix?

user1687717 asked Jan 17 '13 08:01



1 Answer

If all you want to do is create a sparse matrix from your data, treating the NaNs as if they were zeros, you could do the following. First, let's create a random array with several np.nan values in it:

>>> nans = np.random.randint(0, 2, size=(5,5))
>>> a = np.ones((5,5))
>>> a = np.where(nans, np.nan, a)
>>> a
array([[  1.,   1.,   1.,   1.,  nan],
       [ nan,  nan,  nan,   1.,   1.],
       [ nan,  nan,   1.,   1.,  nan],
       [  1.,   1.,   1.,   1.,  nan],
       [  1.,  nan,   1.,  nan,  nan]])

Converting this to a sparse matrix in COO format, storing only the non-NaN entries, is as easy as:

>>> indices = np.nonzero(~np.isnan(a))
>>> sps = scipy.sparse.coo_matrix((a[indices], indices), shape=a.shape)
>>> sps
<5x5 sparse matrix of type '<type 'numpy.float64'>'
    with 14 stored elements in COOrdinate format>

To check that the result matches the original array, with zeros in place of the NaNs:

>>> sps.toarray()
array([[ 1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  1.,  1.],
       [ 0.,  0.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  0.],
       [ 1.,  0.,  1.,  0.,  0.]])
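The steps above can be wrapped in a small helper that also returns the LIL format used in the question. This is only a sketch: the name `nan_as_zero_sparse` and its `fmt` parameter are made up for illustration and are not part of SciPy, and it makes no claim about being the fastest option.

```python
import numpy as np
from scipy import sparse


def nan_as_zero_sparse(a, fmt="lil"):
    """Build a sparse matrix from `a`, leaving NaN positions unstored.

    Hypothetical helper for illustration; not a SciPy function.
    """
    a = np.asarray(a, dtype=float)
    # Row and column indices of every entry that is not NaN.
    rows, cols = np.nonzero(~np.isnan(a))
    coo = sparse.coo_matrix((a[rows, cols], (rows, cols)), shape=a.shape)
    # Convert from COO to the requested format (LIL by default).
    return coo.asformat(fmt)


a = np.array([[np.nan, np.nan, 10.0]])
spr_a = nan_as_zero_sparse(a)
print(repr(spr_a))
print(spr_a.toarray())
```

For the one-row array from the question, `spr_a.nnz` should be 1, since the two NaN positions are never passed to the constructor.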

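Note, though, that your NaNs are now gone: toarray() fills every position that is not stored with 0, so the dense result no longer tells you where the NaNs were.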

Jaime answered Sep 28 '22 05:09
