
Make a numpy upper triangular matrix padded with NaN instead of zero

I generate a matplotlib 3d surface plot. I only need to see the upper-triangular half of the matrix on the plot, as the other half is redundant.

np.triu() sets the redundant half of the matrix to zeros, but I'd prefer to set it to NaNs, so that those cells don't show up at all on the surface plot.

What would be a pythonic way to fill with NaN instead of zeros? I cannot do a search-and-replace of 0 with NaN, as zeros will appear in the legitimate data I want to display.

user3556757 asked Oct 09 '15 02:10



2 Answers

You can use numpy.tril_indices() to assign NaN to the lower triangle, e.g.:

>>> import numpy as np
>>> m = np.triu(np.arange(0, 12, dtype=float).reshape(4,3))
>>> m
array([[ 0.,  1.,  2.],
       [ 0.,  4.,  5.],
       [ 0.,  0.,  8.],
       [ 0.,  0.,  0.]])
>>> m[np.tril_indices(m.shape[0], -1)] = np.nan
>>> m
array([[  0.,   1.,   2.],
       [ nan,   4.,   5.],
       [ nan,  nan,   8.],
       [ nan,  nan,  nan]])
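
For arrays that are not square, or when the input should be left unmodified, the same idea could be wrapped in a small helper. This is only a sketch: the name upper_with_nan is hypothetical and not part of NumPy. It passes the column count as the third argument of np.tril_indices so that the generated indices follow the array's actual shape.

```python
import numpy as np

def upper_with_nan(a):
    """Return a float copy of 2-D `a` with everything strictly below
    the main diagonal set to NaN.

    Hypothetical helper for illustration; not part of NumPy.
    """
    out = np.array(a, dtype=float)  # copy; NaN requires a float dtype
    rows, cols = out.shape
    # k=-1 selects cells strictly below the diagonal;
    # m=cols keeps the indices valid for non-square input.
    out[np.tril_indices(rows, -1, cols)] = np.nan
    return out

m = np.arange(12).reshape(4, 3)   # integer input, left untouched
result = upper_with_nan(m)
print(result)
```

Because the helper works on a copy, legitimate zeros in the upper triangle (such as the top-left element here) are preserved and the original integer array is not changed.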
AChampion answered Sep 20 '22 09:09


tril_indices() is probably the obvious approach here: it generates the lower-triangular indices, which you can then use to set those elements of the input array to NaN.

If you care about performance, you can instead build a boolean mask of the lower-triangular shape and use boolean indexing to set those elements to NaN. The implementation would look like this -

m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan

Here, np.arange(m.shape[0])[:,None] > np.arange(m.shape[1]) is the mask, created using broadcasting: it is True wherever the row index is greater than the column index.
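
To make the broadcasting step concrete, here is a small sketch that builds the mask piece by piece and checks it against np.tri. The shape (5, 4) and all variable names are chosen only for illustration.

```python
import numpy as np

rows, cols = 5, 4  # illustrative shape

row_idx = np.arange(rows)[:, None]  # shape (5, 1): one row number per row
col_idx = np.arange(cols)           # shape (4,): one column number per column

# Broadcasting compares every row number with every column number,
# producing a (5, 4) boolean array that is True where row > column,
# i.e. strictly below the main diagonal.
mask = row_idx > col_idx

# np.tri with k=-1 builds the same pattern directly.
same_as_tri = np.array_equal(mask, np.tri(rows, cols, -1, dtype=bool))

# np.where is a non-mutating alternative to in-place assignment.
m = np.arange(rows * cols, dtype=float).reshape(rows, cols)
upper = np.where(mask, np.nan, m)

print(mask.shape, same_as_tri)
```

Using np.where returns a new array and leaves m as it was, which may be preferable when the full matrix is still needed elsewhere.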

Sample run -

In [51]: m
Out[51]: 
array([[ 11.,  49.,  23.,  30.],
       [ 40.,  41.,  19.,  26.],
       [ 32.,  36.,  30.,  25.],
       [ 15.,  27.,  25.,  40.],
       [ 33.,  18.,  45.,  43.]])

In [52]: np.arange(m.shape[0])[:,None] > np.arange(m.shape[1]) # mask
Out[52]: 
array([[False, False, False, False],
       [ True, False, False, False],
       [ True,  True, False, False],
       [ True,  True,  True, False],
       [ True,  True,  True,  True]], dtype=bool)

In [53]: m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan

In [54]: m
Out[54]: 
array([[ 11.,  49.,  23.,  30.],
       [ nan,  41.,  19.,  26.],
       [ nan,  nan,  30.,  25.],
       [ nan,  nan,  nan,  40.],
       [ nan,  nan,  nan,  nan]])

Runtime tests -

This section compares the performance of the boolean-indexing approach from this answer with the np.tril_indices approach from the other answer.

In [38]: m = np.random.randint(10,50,(1000,1100)).astype(float)

In [39]: %timeit m[np.tril_indices(m.shape[0], -1)] = np.nan
10 loops, best of 3: 62.8 ms per loop

In [40]: m = np.random.randint(10,50,(1000,1100)).astype(float)

In [41]: %timeit m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan
100 loops, best of 3: 8.03 ms per loop
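
The comparison can also be reproduced outside IPython with the standard timeit module. This is a sketch only: the function names are made up for illustration, the index-based variant passes the column count to np.tril_indices explicitly, and each timed call includes the cost of copying the array. Absolute timings depend on the machine and NumPy version, so no expected numbers are given.

```python
import timeit
import numpy as np

def fill_with_indices(a):
    # Index-based approach: builds two integer index arrays first.
    a[np.tril_indices(a.shape[0], -1, a.shape[1])] = np.nan
    return a

def fill_with_mask(a):
    # Boolean-mask approach: mask built by broadcasting.
    a[np.arange(a.shape[0])[:, None] > np.arange(a.shape[1])] = np.nan
    return a

rng = np.random.default_rng(0)
base = rng.integers(10, 50, size=(1000, 1100)).astype(float)

# Both approaches should mark exactly the same cells.
a = fill_with_indices(base.copy())
b = fill_with_mask(base.copy())
same_cells = np.array_equal(np.isnan(a), np.isnan(b))

runs = 10
t_idx = timeit.timeit(lambda: fill_with_indices(base.copy()), number=runs)
t_mask = timeit.timeit(lambda: fill_with_mask(base.copy()), number=runs)
print(f"tril_indices: {t_idx / runs * 1e3:.2f} ms per call")
print(f"boolean mask: {t_mask / runs * 1e3:.2f} ms per call")
```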
Divakar answered Sep 21 '22 09:09