Draw / Create Scatterplots of datasets with NaN

Tags:

matplotlib

I want to draw a scatter plot using pylab. However, some of my data are NaN, like this:

a = [1, 2, 3]
b = [1, 2, None]

pylab.scatter(a,b) doesn't work.

Is there some way to draw the points that have real values while not displaying the NaN values?

330

asked Apr 02 '13 00:04

2 Answers

Things will work perfectly if you use NaNs. None is not the same thing. A NaN is a float.

As an example:

import numpy as np
import matplotlib.pyplot as plt

plt.scatter([1, 2, 3], [1, 2, np.nan])
plt.show()
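If the data already exist as a Python list containing None, as in the question, one possible approach is to substitute NaN for each None before plotting. This is only a minimal sketch in plain Python; the helper name none_to_nan is made up for illustration and is not part of matplotlib or numpy.

```python
import math


def none_to_nan(values):
    """Return a copy of values with every None replaced by float('nan')."""
    return [float("nan") if v is None else v for v in values]


a = [1, 2, 3]
b = none_to_nan([1, 2, None])

# The third y-value is now a float NaN instead of None.
print(b[:2], math.isnan(b[2]))
```

The converted lists can then be passed to plt.scatter(a, b) as in the example above. If numpy is already in use, np.array(b, dtype=float) should perform the same conversion.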


Have a look at pandas or numpy masked arrays (and numpy.genfromtxt to load your data) if you want to handle missing data. Masked arrays are built into numpy, but pandas is an extremely useful library, and has very nice missing value functionality.

As an example:

import matplotlib.pyplot as plt
import pandas

x = pandas.Series([1, 2, 3])
y = pandas.Series([1, 2, None])
plt.scatter(x, y)
plt.show()
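For the numpy.genfromtxt route mentioned above, a small sketch might look like the following. The CSV text is invented for illustration, and it is read from an in-memory buffer rather than a real file; the point is only that an empty field comes back as NaN.

```python
import io

import numpy as np

# Hypothetical CSV content; the last row has no y value.
text = "1,1\n2,2\n3,\n"

# Empty fields are filled with NaN when the data are read as floats.
data = np.genfromtxt(io.StringIO(text), delimiter=",")
x, y = data[:, 0], data[:, 1]

print(x)
print(y)
```

The x and y columns can be handed to plt.scatter directly, since the missing entry is already a NaN.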

pandas uses NaNs to represent masked data, while masked arrays use a separate mask array. This means that masked arrays can potentially preserve the original data, while temporarily flagging it as "missing" or "bad". However, they use more memory, and have some hidden gotchas that can be avoided by using NaNs to represent missing data.
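To make the difference concrete, here is a small sketch with made-up values. It flags the same entry as missing in both ways and then checks what is left of the original number.

```python
import numpy as np

y = np.array([0.2, 0.9, 0.5])

# Masked array: the value stays in .data, and .mask flags it as missing.
masked = np.ma.masked_where(y > 0.7, y)

# NaN approach: the value itself is overwritten.
with_nan = y.copy()
with_nan[y > 0.7] = np.nan

print(masked.data[1], masked.mask[1])  # original value is still stored
print(with_nan[1])                     # original value is gone

# Both give the same statistic, but NaNs need the nan-aware function.
print(masked.mean(), np.nanmean(with_nan))
```

Note that the NaN approach needs a float array, since integer arrays cannot hold NaN.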

As another example, using both masked arrays and NaNs, this time with a line plot:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 6 * np.pi, 300)
y = np.cos(x)

y1 = np.ma.masked_where(y > 0.7, y)

y2 = y.copy()
y2[y > 0.7] = np.nan

fig, axes = plt.subplots(nrows=3, sharex=True, sharey=True)
for ax, ydata in zip(axes, [y, y1, y2]):
    ax.plot(x, ydata)
    ax.axhline(0.7, color='red')

axes[0].set_title('Original')
axes[1].set_title('Masked Arrays')
axes[2].set_title("Using NaN's")

fig.tight_layout()

plt.show()


184

answered Oct 01 '22 20:10

Joe Kington

Because you are drawing in 2D space, your points need to be defined by both an X and a Y value. If one of the values is None, that point cannot exist in 2D space and so cannot be plotted. Hence you should remove both the None and its corresponding value from the other list.

There are many ways to accomplish this. Here is one:

import pylab

a = [1, 2, 3]
b = [1, None, 2]

# Drop every index where either coordinate is missing.
i = 0
while i < len(a):
    if a[i] is None or b[i] is None:
        a = a[:i] + a[i+1:]
        b = b[:i] + b[i+1:]
    else:
        i += 1

# Now a = [1, 3] and b = [1, 2]

pylab.scatter(a, b)
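The same filtering can also be written with zip and a list comprehension. This is an alternative sketch, not the answer's original code; the names pairs, a_clean and b_clean are chosen here for illustration.

```python
a = [1, 2, 3]
b = [1, None, 2]

# Keep only the (x, y) pairs in which neither coordinate is None.
pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]

# Split the surviving pairs back into two lists of equal length.
a_clean = [x for x, _ in pairs]
b_clean = [y for _, y in pairs]

print(a_clean, b_clean)
```

Building new lists leaves a and b unchanged, and the cleaned lists can be passed to pylab.scatter(a_clean, b_clean).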

answered Oct 01 '22 21:10

Ionut Hulub

Tags:

python

matplotlib

yangsuli
