Matplotlib remove interpolation for missing data

Question

I am plotting time-series data using Matplotlib, and some of the data is missing in the sequence. Matplotlib implicitly joins the last contiguous data point to the next one, so wherever data is missing the plot looks ugly. The following is the plot obtained. [image: line plot of the time series]

It can be seen that near the April 30th marker, data is missing and Matplotlib joins the points on either side of the gap. The following image is the scatter plot of the same data. The scatter plot hides this fault, but then contiguous data points are not joined either. Moreover, a scatter plot is very slow given the huge number of data points involved. [image: scatter plot of the time series]

What is the recommended solution for such problems?

tacaswell · Accepted Answer

If you can identify where the break points should be, you can either:

break the data and plot each 'section' by hand
insert np.nan in the data in the gaps

See for example Plot periodic trajectories.
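Both options can be sketched with NumPy. This is only an illustration under assumptions that are not in the question: x is numeric (for example seconds since some epoch) and sorted, and a gap is any step in x larger than a threshold, here called max_gap. The helper name insert_nan_at_gaps is made up for this example.

```python
import numpy as np

def insert_nan_at_gaps(x, y, max_gap):
    """Return copies of x and y with a NaN inserted wherever x jumps by more than max_gap.

    Hypothetical helper: assumes x is numeric and sorted in increasing order.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Position of the first sample after each gap.
    breaks = np.where(np.diff(x) > max_gap)[0] + 1
    # np.insert places a NaN in front of each of those positions.
    return np.insert(x, breaks, np.nan), np.insert(y, breaks, np.nan), breaks

# Toy data with one gap between x=3 and x=10.
x = np.array([0, 1, 2, 3, 10, 11, 12])
y = np.array([5, 6, 7, 8, 9, 10, 11])

# Option 2: insert NaN in the gaps, then a single ax.plot(x_gap, y_gap) call
# draws a line that stops at each NaN.
x_gap, y_gap, breaks = insert_nan_at_gaps(x, y, max_gap=1.5)

# Option 1: break the data into sections and plot each one separately,
# e.g. `for xs, ys in segments: ax.plot(xs, ys, color="C0")`.
segments = list(zip(np.split(x, breaks), np.split(y, breaks)))
```

The NaN route keeps everything in one plot call, which tends to matter with a huge number of points. For a datetime x axis the same idea applies, but the threshold would be a time difference rather than a plain number.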

You can get the same effect as scatter (if you don't need to scale the size or color of each point independently) with

ax.plot(x, y, linestyle='none', marker='o')

Luciano · Answer

As the previous answer says, you should insert NaNs where there is no data. This answer is specific to Pandas and explains how this can be achieved easily. Use either:

Series.resample() or
Series.reindex()

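The simplest method to use is resample(). This is the most concise way for regularly spaced data. So in your example above, if you have e.g. 5-minute data, just do data.resample("5min").mean(). In current pandas versions resample() returns a Resampler object, so it needs a follow-up call such as .mean() or .asfreq() to get a Series back. The result is your data set on a regular 5-minute grid, with NaN in the slots where data is missing, and Matplotlib breaks the line at NaN values.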

The only case where this doesn't work too well is when your samples are not regularly-spaced.
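A minimal sketch of the resample() route, using made-up 5-minute data with a simulated outage (the dates and values are illustrative, not the asker's data):

```python
import numpy as np
import pandas as pd

# Illustrative data: one hour of 5-minute samples.
idx = pd.date_range("2013-04-30 00:00", periods=12, freq="5min")
data = pd.Series(np.arange(12, dtype=float), index=idx)

# Simulate an outage by dropping four consecutive samples.
data = data.drop(idx[4:8])

# Put the series back on a regular 5-minute grid.
# Bins with no samples come out as NaN.
regular = data.resample("5min").mean()

# regular.plot() or ax.plot(regular.index, regular.values)
# would now leave a visible gap instead of a joining line.
```

With one sample per bin, .mean() just passes the value through. If the raw data is denser than the grid, it averages each bin, which may or may not be what you want.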

The alternative is reindex(), which also works for ordered (but non-time-series) data. So for example, if you had a data set indexed with integers from 0 to 100, but with a few missing samples, you could do data.reindex(range(101)). You can also replicate the behaviour of resample() with reindex() by passing the result of pandas.date_range() as the new index.
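Both uses of reindex() can be sketched as follows. The series here are small made-up examples, and the names ints, ts and full_index are only for illustration:

```python
import pandas as pd

# Integer-indexed data with samples 2 and 3 missing.
ints = pd.Series([10.0, 11.0, 14.0, 15.0], index=[0, 1, 4, 5])
ints_full = ints.reindex(range(6))  # labels 2 and 3 become NaN

# Time-indexed data on a 5-minute grid with two samples missing.
ts = pd.Series(
    [1.0, 2.0, 5.0],
    index=pd.to_datetime(
        ["2013-04-30 00:00", "2013-04-30 00:05", "2013-04-30 00:20"]
    ),
)

# Build the complete grid and reindex onto it.
full_index = pd.date_range(ts.index.min(), ts.index.max(), freq="5min")
ts_full = ts.reindex(full_index)  # 00:10 and 00:15 become NaN
```

Note that reindex() only keeps values whose labels match the new index exactly, so timestamps that are slightly off the grid would be dropped rather than moved into the nearest slot.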

Tags:

python

matplotlib

scatter-plot

time-series

Nipun Batra
