Create numpy matrix filled with NaNs

People also ask

How do I create a Numpy array filled with NaN?

To create an array with nan values we have to use the numpy. empty() and fill() function. It returns an array with the same shape and type as a given array.

Is NaN for Numpy array?

Test element-wise for Not a Number (NaN), return result as a bool array. Input array. For scalar input, the result is a new boolean with value True if the input is NaN; otherwise the value is False.

How do I create a matrix in Numpy?

Code: import numpy as np #creating matrix from array like object B = np. matrix([[1, 2, 3], [4, 5, 6]]) print("Array created using array like object is :\n", B)

You rarely need loops for vector operations in numpy. You can create an uninitialized array and assign to all entries at once:

>>> a = numpy.empty((3,3,))
>>> a[:] = numpy.nan
>>> a
array([[ NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN]])

I have timed the alternatives a[:] = numpy.nan here and a.fill(numpy.nan) as posted by Blaenk:

$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a.fill(np.nan)"
10000 loops, best of 3: 54.3 usec per loop
$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a[:] = np.nan" 
10000 loops, best of 3: 88.8 usec per loop

The timings show a preference for ndarray.fill(..) as the faster alternative. OTOH, I like numpy's convenience implementation where you can assign values to whole slices at the time, the code's intention is very clear.

Note that ndarray.fill performs its operation in-place, so numpy.empty((3,3,)).fill(numpy.nan) will instead return None.

Another option is to use numpy.full, an option available in NumPy 1.8+

a = np.full([height, width, 9], np.nan)

This is pretty flexible and you can fill it with any other number that you want.

I compared the suggested alternatives for speed and found that, for large enough vectors/matrices to fill, all alternatives except val * ones and array(n * [val]) are equally fast.

enter image description here

Code to reproduce the plot:

import numpy
import perfplot

val = 42.0


def fill(n):
    a = numpy.empty(n)
    a.fill(val)
    return a


def colon(n):
    a = numpy.empty(n)
    a[:] = val
    return a


def full(n):
    return numpy.full(n, val)


def ones_times(n):
    return val * numpy.ones(n)


def list(n):
    return numpy.array(n * [val])


b = perfplot.bench(
    setup=lambda n: n,
    kernels=[fill, colon, full, ones_times, list],
    n_range=[2 ** k for k in range(20)],
    xlabel="len(a)",
)
b.save("out.png")

Are you familiar with numpy.nan?

You can create your own method such as:

def nans(shape, dtype=float):
    a = numpy.empty(shape, dtype)
    a.fill(numpy.nan)
    return a

Then

nans([3,4])

would output

array([[ NaN,  NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN,  NaN]])

I found this code in a mailing list thread.

You can always use multiplication if you don't immediately recall the .empty or .full methods:

>>> np.nan * np.ones(shape=(3,2))
array([[ nan,  nan],
       [ nan,  nan],
       [ nan,  nan]])

Of course it works with any other numerical value as well:

>>> 42 * np.ones(shape=(3,2))
array([[ 42,  42],
       [ 42,  42],
       [ 42, 42]])

But the @u0b34a0f6ae's accepted answer is 3x faster (CPU cycles, not brain cycles to remember numpy syntax ;):

$ python -mtimeit "import numpy as np; X = np.empty((100,100));" "X[:] = np.nan;"
100000 loops, best of 3: 8.9 usec per loop
(predict)laneh@predict:~/src/predict/predict/webapp$ master
$ python -mtimeit "import numpy as np; X = np.ones((100,100));" "X *= np.nan;"
10000 loops, best of 3: 24.9 usec per loop

As said, numpy.empty() is the way to go. However, for objects, fill() might not do exactly what you think it does:

In[36]: a = numpy.empty(5,dtype=object)
In[37]: a.fill([])
In[38]: a
Out[38]: array([[], [], [], [], []], dtype=object)
In[39]: a[0].append(4)
In[40]: a
Out[40]: array([[4], [4], [4], [4], [4]], dtype=object)

One way around can be e.g.:

In[41]: a = numpy.empty(5,dtype=object)
In[42]: a[:]= [ [] for x in range(5)]
In[43]: a[0].append(4)
In[44]: a
Out[44]: array([[4], [], [], [], []], dtype=object)

Another alternative is numpy.broadcast_to(val,n) which returns in constant time regardless of the size and is also the most memory efficient (it returns a view of the repeated element). The caveat is that the returned value is read-only.

Below is a comparison of the performances of all the other methods that have been proposed using the same benchmark as in Nico Schlömer's answer.

enter image description here

Related questions
                            
                                How to update an existing Conda environment with a .yml file
                            
                                How do I upgrade to Python 3.6 with conda?
                            
                                Concatenate a NumPy array to another NumPy array
                            
                                How to prevent Google Colab from disconnecting?
                            
                                How to add header row to a pandas DataFrame
                            
                                Good Python modules for fuzzy string comparison? [closed]
                            
                                How to filter rows containing a string pattern from a Pandas dataframe [duplicate]
                            
                                How can I check the extension of a file?
                            
                                Flatten nested dictionaries, compressing keys
                            
                                python max function using 'key' and lambda expression
                            
                                Require either of two arguments using argparse
                            
                                List of lists into numpy array
                            
                                How do I check if a string is valid JSON in Python?
                            
                                Executing multi-line statements in the one-line command-line?
                            
                                How does asyncio actually work?
                            
                                How do I print the key-value pairs of a dictionary in python
                            
                                Finding median of list in Python
                            
                                How to get JSON from webpage into Python script
                            
                                Save classifier to disk in scikit-learn
                            
                                Python3 project remove __pycache__ folders and .pyc files

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Create numpy matrix filled with NaNs

Tags:

python

numpy

People also ask

Recent Activity

Donate For Us