How to plot a time range as a value from Pandas or MatPlotLib

I have several DataFrames containing time series data and would like to create a simple visualisation of the span of the time ranges for each of those DataFrames. Since I was unable to generate this with code I have included a sketch to illustrate my goal.

[Image: Time Range Illustration]

Here is some code to create three DataFrames that are essentially simplified, smaller versions of the DataFrames I am working with.

from pandas import DataFrame
from numpy import datetime64, random

# example data recorded by three different sensors
example_data = random.rand(5,2)
example_data2 = random.rand(9,2)
example_data3 = random.rand(9,2)

# timestamps from sensor1
times = ['2000-01-01 09:00:00',
        '2000-01-01 09:15:00',
        '2000-01-01 09:30:00',
        '2000-01-01 09:45:00',
        '2000-01-01 10:00:00']

# timestamps from sensor2
times2 = ['2000-01-01 08:45:00',
        '2000-01-01 09:00:00',
        '2000-01-01 09:15:00',
        '2000-01-01 09:30:00',
        '2000-01-01 09:45:00',
        '2000-01-01 10:00:00',
        '2000-01-01 10:15:00',
        '2000-01-01 10:30:00',
        '2000-01-01 10:45:00']

# timestamps from sensor3
times3 = ['2000-01-01 09:20:00',
        '2000-01-01 09:40:00',
        '2000-01-01 10:00:00',
        '2000-01-01 10:20:00',
        '2000-01-01 10:40:00',
        '2000-01-01 11:00:00',
        '2000-01-01 11:20:00',
        '2000-01-01 11:40:00',
        '2000-01-01 12:00:00']


# create the DataFrame object for sensor1 with the times and data above
sensor1 = DataFrame({'Time': times,
                    'measure1': example_data[:,0],
                    'measure2': example_data[:,1]})

# create the DataFrame object for sensor2 with the times and data above
sensor2 = DataFrame({'Time': times2,
                    'measure1': example_data2[:,0],
                    'measure2': example_data2[:,1]})

# create the DataFrame object for sensor3 with the times and data above
sensor3 = DataFrame({'Time': times3,
                    'measure1': example_data3[:,0],
                    'measure2': example_data3[:,1]})

# coerce the 'Time' column from string to a numpy datetime64 value
# (recent pandas versions require an explicit unit, hence 'datetime64[ns]')
sensor1['Time'] = sensor1['Time'].astype('datetime64[ns]')
sensor2['Time'] = sensor2['Time'].astype('datetime64[ns]')
sensor3['Time'] = sensor3['Time'].astype('datetime64[ns]')

I have tried taking the min and max datetime value from each of the DataFrames and putting them into a new DataFrame, but when I try to plot them I get an error saying that there are no values to plot.

I have also tried taking just the 'Time' column and assigning an integer to a 'value' column (i.e. sensor1 gets the integer 1 broadcast to the 'value' column, sensor2 gets the integer 2, and so on), then merging these DataFrames.

But this results in lots of duplicate values in the 'Time' column and NaN values in the 'value' column.

I have run out of ideas of how to get this to work.

EDIT: Corrected a sneaky '2001' timestamp in the code block ;-)

Philip Lawrence asked Feb 20 '15 11:02



1 Answer

import numpy
import pandas

# create an index containing all time stamps
idx1 = pandas.Index(sensor1.Time)
idx2 = pandas.Index(sensor2.Time)
idx3 = pandas.Index(sensor3.Time)
df = pandas.DataFrame(index=idx1.union(idx2).union(idx3))

# create a (constant) Series for each sensor: a fixed y-level inside the
# sensor's time range and NaN (which is not drawn) outside it
df['Sensor1'] = df.index.to_series().apply(lambda x: 3 if sensor1.Time.min() <= x <= sensor1.Time.max() else numpy.nan)
df['Sensor2'] = df.index.to_series().apply(lambda x: 2 if sensor2.Time.min() <= x <= sensor2.Time.max() else numpy.nan)
df['Sensor3'] = df.index.to_series().apply(lambda x: 1 if sensor3.Time.min() <= x <= sensor3.Time.max() else numpy.nan)

# plot
p = df.plot(ylim=[0, 4], legend=False)
p.set_yticks([1., 2., 3.])
p.set_yticklabels(['Sensor3', 'Sensor2', 'Sensor1'])
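The per-sensor columns can also be built without `apply`, using a boolean mask on the combined index. This is only a sketch: the `times` dictionary is a hypothetical stand-in for the three 'Time' columns from the question (rebuilt with `date_range` so the block runs on its own), and the `levels` mapping simply mirrors the y-values 3, 2, 1 used above.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical stand-ins for sensor1.Time, sensor2.Time and sensor3.Time.
times = {
    "Sensor1": pd.date_range("2000-01-01 09:00", "2000-01-01 10:00", freq="15min"),
    "Sensor2": pd.date_range("2000-01-01 08:45", "2000-01-01 10:45", freq="15min"),
    "Sensor3": pd.date_range("2000-01-01 09:20", "2000-01-01 12:00", freq="20min"),
}
# y-level at which each sensor's line is drawn (same values as above).
levels = {"Sensor1": 3, "Sensor2": 2, "Sensor3": 1}

# Index containing every timestamp seen by any sensor.
all_times = reduce(lambda a, b: a.union(b), times.values())
df = pd.DataFrame(index=all_times)

for name, stamps in times.items():
    # True where the index falls inside this sensor's time range.
    inside = (df.index >= stamps.min()) & (df.index <= stamps.max())
    # Constant level inside the range, NaN (not drawn) outside it.
    df[name] = np.where(inside, levels[name], np.nan)
```

The resulting `df` has the same shape as the one built above and can be passed to the same plotting calls.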

By the way, are you sure the year 2001 belongs in your timestamps? It will make your Sensor1 line invisibly small.
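A different route, closer to the min/max idea mentioned in the question, is to keep only the first and last timestamp of each sensor and draw one horizontal line per sensor with Matplotlib's `hlines`. The sketch below is an alternative, not the approach used in the answer above. The `sensors` dictionary is a hypothetical stand-in for the three 'Time' columns, the `Agg` backend and the output file name `sensor_spans.png` are assumptions, and the timestamps are converted with `matplotlib.dates.date2num` so the plotting call receives plain numbers.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for sensor1.Time, sensor2.Time and sensor3.Time.
sensors = {
    "Sensor1": pd.date_range("2000-01-01 09:00", "2000-01-01 10:00", freq="15min"),
    "Sensor2": pd.date_range("2000-01-01 08:45", "2000-01-01 10:45", freq="15min"),
    "Sensor3": pd.date_range("2000-01-01 09:20", "2000-01-01 12:00", freq="20min"),
}

# One row per sensor holding the first and last timestamp.
spans = pd.DataFrame(
    {
        "start": [t.min() for t in sensors.values()],
        "end": [t.max() for t in sensors.values()],
    },
    index=list(sensors),
)

# Sensor1 is drawn at the top (y=3), Sensor3 at the bottom (y=1).
ypos = list(range(len(spans), 0, -1))

fig, ax = plt.subplots()
lines = ax.hlines(
    y=ypos,
    xmin=mdates.date2num(spans["start"].to_numpy()),
    xmax=mdates.date2num(spans["end"].to_numpy()),
    linewidth=4,
)
ax.xaxis_date()  # interpret the x values as dates
ax.set_yticks(ypos)
ax.set_yticklabels(spans.index)
ax.set_ylim(0, len(spans) + 1)
fig.autofmt_xdate()
fig.savefig("sensor_spans.png")  # assumed output file name
```

Because only two timestamps per sensor are kept, this avoids the duplicate 'Time' values and NaN padding that a merge produces.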

AceRymond answered Oct 05 '22 06:10