I am trying to plot a pandas DataFrame with a Timestamp index that contains a time gap. Using DataFrame.plot() results in linear interpolation between the last Timestamp of the first segment and the first Timestamp of the next. I want neither the linear interpolation nor empty space between the two date segments. Is there a way to do that?
Suppose we have a DataFrame with a Timestamp index:
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
>>> df = df.cumsum()
Now let's take two time chunks of it and plot them:
>>> df = pd.concat([df['Jan 2000':'Aug 2000'], df['Jan 2001':'Aug 2001']])
>>> df.plot()
>>> plt.show()
The resulting plot has an interpolation line connecting the Timestamps on either side of the gap. I cannot figure out how to upload pictures on this machine, but these pictures from Google Groups show my problem (interpolated.jpg, no-interpolation.jpg and no gaps.jpg). I can recreate the first as shown above. The second is achievable by replacing all gap values with NaN (see also this question). How can I achieve the third version, where the time gap is omitted?
Try:
df.plot(x=df.index.astype(str))
You may want to customize ticks and tick labels.
EDIT
That works for me using pandas 0.17.1 and numpy 1.10.4.
All you really need is a way to convert the DatetimeIndex to another type which is not datetime-like. In order to get meaningful labels I chose str. If x=df.index.astype(str) does not work with your combination of pandas/numpy/whatever you can try other options:
df.index.to_series().dt.strftime('%Y-%m-%d')
df.index.to_series().apply(lambda x: x.strftime('%Y-%m-%d'))
...
I realized that resetting the index is not necessary so I removed that part.
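If the string conversion does not behave on your versions, the same idea (an x-axis that is not datetime-like) can be sketched by plotting against plain integer positions and labelling a few ticks with the dates yourself. This is only a minimal sketch, not a tested recipe for every pandas/matplotlib combination: the names positions and tick_pos, the choice of 8 ticks, and the Agg backend line are my own assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the question: a random walk with an index gap.
df = pd.DataFrame(np.random.randn(1000),
                  index=pd.date_range("2000-01-01", periods=1000))
df = df.cumsum()
df = pd.concat([df.loc["2000-01":"2000-08"], df.loc["2001-01":"2001-08"]])

# Plot against integer positions 0..n-1, so the gap takes up no horizontal space.
positions = np.arange(len(df))
fig, ax = plt.subplots()
ax.plot(positions, df[0].to_numpy())

# Label a handful of evenly spaced positions with the dates they correspond to.
tick_pos = np.linspace(0, len(df) - 1, 8).astype(int)
ax.set_xticks(tick_pos)
ax.set_xticklabels(df.index[tick_pos].strftime("%Y-%m-%d"),
                   rotation=45, ha="right")
fig.tight_layout()
```

The two segments are still joined by a line at the seam, but the missing months no longer occupy any width. Call plt.show() or fig.savefig(...) afterwards as needed.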