
Plotting for a large number of time series data points using matplotlib

I've collected sensor data every 5 minutes for a month (30 days). That means I have a time series with 288*30 data points in total.

I'd like to draw a scatter plot of the data (x-axis: time, y-axis: sensor value). The following code is a test.

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np

# generate time series randomly (length: 1 month)
rng=pd.date_range("2015-11-11",periods=288*30,freq="5min")
ts=pd.Series(np.random.randn(len(rng)),rng)

nr=3
nc=1

fig=plt.figure(1)
fig.subplots_adjust(left=0.04,top=1,bottom=0.02,right=0.98,wspace=0.1,hspace=0.1)

for i in range(3):
    ctr=i+1
    ax=fig.add_subplot(nr,nc,ctr)

    ax.scatter(ts.index,ts.values)
    ax.set_xlim(ts.index.min(),ts.index.max())

plt.show()

I've generated a random time series with 288*30 observations and tried to draw it as a scatter plot. However, as you can see, the points are so dense that the figure is impossible to analyze.


I want to redraw it satisfying the following conditions:

  1. I want a zoomed-in version of the figure. In other words, only the data points in some time range (e.g., 2-3 hours) are shown at once, so that there is enough space between adjacent points.

  2. I want to save the figure as a png or pdf file. Then, when I open the file, the image (or pdf) viewer should have a horizontal scroll bar that lets me explore the whole figure.

Is there anyone who can solve it?

I think it will not be hard for a matplotlib expert, but it is quite hard for me, a beginner.

Minsoo Choy asked Oct 18 '22 21:10



1 Answer

Note to readers: this answer changed significantly from v1 after the question was clarified.

  1. I want a zoomed-in version of the figure. In other words, only the data points in some time range (e.g., 2-3 hours) are shown at once, so that there is enough space between adjacent points.

Zooming in matplotlib is implemented with the x and y limits of the axis. So you can simply change the arguments to your call to ax.set_xlim such that the corresponding times differ by 2-3 hours or however long you want. Knowing that you have a sample every 5 minutes, since 2 hours/(5 min/sample) = 24, you could use

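# ts.index[24] is the timestamp 24 samples (2 hours) after the first one;
# adding a plain integer to a Timestamp raises a TypeError in current pandas
ax.set_xlim(ts.index.min(),ts.index[24])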

to get a 2-hour range.
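As a minimal sketch of the same idea, the window can also be expressed as a time offset rather than a sample count, which avoids depending on the 5-minute sampling rate. The names `start`, `stop`, and `window` are illustrative and not from the question, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Same synthetic series as in the question: one sample every 5 minutes for 30 days
rng = pd.date_range("2015-11-11", periods=288 * 30, freq="5min")
ts = pd.Series(np.random.randn(len(rng)), rng)

# Illustrative 2-hour window starting at the first sample
start = ts.index.min()
stop = start + pd.Timedelta(hours=2)

# Label-based slicing on a DatetimeIndex includes both endpoints
window = ts[start:stop]

fig, ax = plt.subplots()
ax.scatter(window.index, window.values)
ax.set_xlim(start, stop)  # zoom: only the 2-hour window is visible
```

Shifting `start` by a further `pd.Timedelta` moves the window along the series, so the same few lines can be reused for any 2-3 hour range.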

  2. I want to save the figure as a png or pdf file. Then, when I open the file, the image (or pdf) viewer should have a horizontal scroll bar that lets me explore the whole figure.

Use savefig to save the figure to a file. Note that if you have set the axis limits using set_xlim or xlim or equivalent, savefig will save only the portion of the data that falls within those limits. So to save a figure with all data points visible, you will need to set the x-axis limits to the minimum and maximum timestamps, respectively.

When you open the image/PDF file in a viewer, whether it displays a scroll bar (and how much of the figure is shown) is entirely up to the viewer. You cannot control this in Python. But you can give it some chance of showing up with a horizontal scroll bar by making the figure very large in the horizontal direction. To do so, you can pass the figsize=(width, height) keyword argument when creating the figure, or use the set_size_inches(width, height) method on an existing Figure object. The measurements are in inches in both cases. Pass a value for width that is much larger than that for height and you will get a very wide figure; for example, 40 for width and 4 for height. You'll have to experiment with these values to find which ones give your figure the proportions you want.
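The following is a rough sketch of the wide-figure approach, using the example sizes from the paragraph above (40 by 4 inches). The output filename `sensor_wide.png`, the marker size `s=4`, and `dpi=100` are assumptions chosen for illustration; the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Same synthetic series as in the question
rng = pd.date_range("2015-11-11", periods=288 * 30, freq="5min")
ts = pd.Series(np.random.randn(len(rng)), rng)

# Much wider than tall, so a viewer is likely to scroll horizontally
fig, ax = plt.subplots(figsize=(40, 4))
ax.scatter(ts.index, ts.values, s=4)  # small markers (illustrative choice)

# Full range, so every data point ends up in the saved file
ax.set_xlim(ts.index.min(), ts.index.max())

# 40 in x 100 dpi gives an image 4000 pixels wide (hypothetical filename)
fig.savefig("sensor_wide.png", dpi=100)
```

Increasing `figsize` width or `dpi` spreads the points further apart in the saved image; saving to a `.pdf` filename instead works the same way, and whether a scroll bar appears still depends on the viewer.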

David Z answered Oct 23 '22 11:10
