How not to plot missing periods

I'm trying to plot time series data in which certain periods have no data. The data is loaded into a DataFrame and plotted with df.plot(). The problem is that the plot connects the points on either side of each missing period, giving the impression that values exist there when they don't.

Here's an example of the problem

[Image: problem]

There is no data between Sep 01 and Sep 08, or between Sep 09 and Sep 25, but the line is drawn as if there were values in those periods.

I would like those periods to show either zero values or nothing at all. How can I do that?

Just to be clear, the periods [Sep 01, Sep 08] and [Sep 09, Sep 29] don't contain NaN values; they contain no data at all (not even entries in the time index).

Kobe-Wan Kenobi asked Mar 11 '23 01:03


2 Answers

Consider the pd.Series s

import numpy as np
import pandas as pd

# Ten daily values, with the values 3 and 6 replaced by NaN
s = pd.Series(
    np.arange(10), pd.date_range('2016-03-31', periods=10)
).replace({3: np.nan, 6: np.nan})

s.plot()

[Image: resulting plot]

You can see that the np.nan values were skipped: the line breaks at those points.
However:

s.fillna(0).plot()

[Image: resulting plot]

Zeros are not skipped; the line is drawn through them.

So if your missing periods are stored as 0, I suggest s.replace(0, np.nan).plot()
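One caveat with replacing zeros: it cannot tell a filled-in 0 from a genuine measurement of 0. A minimal sketch of that effect, rebuilding the series above with float values (the variable names `filled` and `restored` are illustrative, not from the answer):

```python
import numpy as np
import pandas as pd

# Same shape as the series above: ten daily values, two of them missing.
values = np.arange(10, dtype=float)
values[[3, 6]] = np.nan
s = pd.Series(values, index=pd.date_range("2016-03-31", periods=10))

filled = s.fillna(0)                  # gaps become 0 and get plotted
restored = filled.replace(0, np.nan)  # every 0 becomes NaN again

print(s.isna().sum())         # NaN count in the original series
print(restored.isna().sum())  # one more: the genuine 0 on the first day
```

Here `restored` has three NaN values where `s` had two, because the real 0 on 2016-03-31 is blanked as well. If genuine zeros matter, it may be safer to keep the gaps as NaN from the start.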

piRSquared answered Mar 21 '23 06:03


You should add the missing dates to your dataframe, with NaN values. Then, when plotted, those NaNs break the line -- you will get several line segments, with empty periods between them.

This answer explains best how to add the missing dates to your dataframe. To summarize it, this should do the trick:

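# freq='D' assumes daily data; use the frequency that matches your own index
df = df.reindex(pd.date_range(df.index.min(), df.index.max(), freq='D'), fill_value=np.nan)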
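
As a rough illustration of the reindexing step, here is a sketch on made-up daily data (the column name `value`, the dates, and the daily frequency are all assumptions for the example). It builds the complete date range and reindexes onto it, once leaving the gaps as NaN and once filling them with 0, which covers both outcomes the question asks for:

```python
import numpy as np
import pandas as pd

# Hypothetical daily data: Sep 03-04 and Sep 07-09 are absent from the index.
idx = pd.to_datetime(
    ["2016-09-01", "2016-09-02", "2016-09-05", "2016-09-06", "2016-09-10"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Every day from the first to the last observation (daily frequency assumed).
full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")

with_gaps = df.reindex(full_index)                 # missing days become NaN
with_zeros = df.reindex(full_index, fill_value=0)  # missing days become 0

# with_gaps.plot()   # uncomment to draw; requires matplotlib
```

Plotting `with_gaps` should then give separate line segments, while `with_zeros` drops the line to zero across the missing days. For data that is not daily, the `freq` argument would need to match the actual sampling interval.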
shx2 answered Mar 21 '23 06:03