I'm trying to plot time series data where there is no data for certain periods. The data is loaded into a dataframe and I'm plotting it using df.plot(). The problem is that the missing periods get connected in the plot, giving the impression that values exist in those periods when they don't.
Here's an example of the problem
There is no data between Sep 01 and Sep 08 as well as between Sep 09 and Sep 25, but the data is plotted in a way that it seems that there are values in that period.
I would like to have zero values visualized in that period, or no values at all. How can I do that?
Just to be clear, I don't have NaN values for periods [Sep 01, Sep 08], [Sep 09, Sep 29], but no data at all (not even in the time index).
Consider the pd.Series s:
import numpy as np
import pandas as pd

# Ten daily values; 3 and 6 are replaced with NaN
s = pd.Series(
    np.arange(10), pd.date_range('2016-03-31', periods=10)
).replace({3: np.nan, 6: np.nan})
s.plot()
You can see the np.nan values were skipped.
However:
s.fillna(0).plot()
The 0s are not skipped.
I suggest s.replace(0, np.nan).plot()
You should add the missing dates to your dataframe, with NaN values. Then, when plotted, those NaNs break the line -- you will get several line segments, with empty periods between them.
This answer explains best how to add the missing dates to your dataframe. To summarize it, this should do the trick:
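df = df.reindex(pd.date_range(df.index.min(), df.index.max(), freq='D'), fill_value=np.nan)
Note that reindexing against df.index itself would add no rows; the new index has to contain the missing dates. Here freq='D' assumes daily data, so use whatever frequency matches your data.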
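To make the reindexing approach concrete, here is a minimal sketch covering both outcomes the question asks for (gaps, or zeros). The sample dates, the column name "value", and the daily frequency are all assumptions made up for illustration, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows exist only for a few dates, and the
# periods in between are absent from the index entirely (no NaN rows).
idx = pd.to_datetime(
    ["2016-09-01", "2016-09-08", "2016-09-09", "2016-09-25", "2016-09-26"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Complete index spanning the data. freq="D" is an assumption; pick the
# frequency that matches how often your data is sampled.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")

# Option 1: missing dates become NaN, so the plotted line breaks there.
with_gaps = df.reindex(full_index)

# Option 2: missing dates become 0, so the line drops to zero there.
with_zeros = df.reindex(full_index, fill_value=0)
```

Plotting with_gaps.plot(marker="o") then draws separate segments with empty stretches between them, while with_zeros.plot() draws a continuous line that sits at zero in the missing periods. The marker is worth adding for the NaN version because a single point with NaN on both sides has no neighbour to connect to and would otherwise not be drawn.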