
Hours and minutes as labels in Altair plot spanning more than one day

I'm trying to use Altair to create a Vega-Lite specification for a plot of a time series whose time range spans a few days. Since it will be clear in my case which day is which, I want to reduce noise in my axis labels by giving them the form '%H:%M', even if this makes the labels non-distinct.

Here's some example data; my actual data has a five minute resolution, but I imagine that won't matter too much here:

import altair as alt
import numpy as np
import pandas as pd

# Create data spanning 30 hours, or just over one full day
df = pd.DataFrame({'time': pd.date_range('2018-01-01', periods=30, freq='H'),
                   'data': np.arange(30)**.5})

By using the otherwise trivial yearmonthdatehoursminutes transform, I get the following:

alt.Chart(df).mark_line().encode(
    x='yearmonthdatehoursminutes(time):T',
    y='data:Q'
)


Now, my goal is to get rid of the dates in the labels on the horizontal axis, so they become something like ['00:00', '03:00', ..., '21:00', '00:00', '03:00'], or whatever spacing works best.

The naive approach of just using hoursminutes as a transform won't work, as that bins the actual data:

alt.Chart(df).mark_line().encode(x='hoursminutes(time):T', y='data:Q')
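As a rough illustration of why this happens, here is a minimal sketch using only the standard library rather than Altair itself. It assumes that dropping the date part is a fair stand-in for what the hoursminutes time unit does to each value; the variable names are made up for the example.

```python
from datetime import datetime, timedelta

# Same shape as the example data: 30 hourly timestamps from 2018-01-01
start = datetime(2018, 1, 1)
times = [start + timedelta(hours=i) for i in range(30)]

# Stand-in for a time unit: keep only hour and minute, discard the date.
# Timestamps from different days now land on the same value.
collapsed = {(t.hour, t.minute) for t in times}

# Stand-in for an axis format: positions are untouched, only the label
# text is shortened, so labels may repeat while points stay distinct.
labels = [t.strftime('%H:%M') for t in times]

print(len(set(times)), len(collapsed), len(set(labels)))
```

The first count stays at the number of original timestamps, while the second drops because hours 00:00 to 05:00 occur on both days and get merged.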


So, is there a declarative way of doing this? Ultimately, the visualization will be making use of selections to define the horizontal axis limits, so specifying the labels explicitly using Axis does not seem appealing.

fuglede asked Feb 22 '20 10:02


2 Answers

To expand on @fuglede's answer, there are two distinct concepts at play with dates and times in Altair.

Time formats let you specify how times are displayed on an axis; they look like this:

chart.encode(
    x=alt.X('time:T', axis=alt.Axis(format='%H:%M'))
)

Altair uses format codes from d3-time-format.

Time units let you specify how data will be grouped, and they also adjust the default time format to match. They look something like this:

chart.encode(
    x=alt.X('time:T', timeUnit='hoursminutes')
)

or via the shorthand:

chart.encode(
    x='hoursminutes(time):T'
)

Available time units are listed in the Altair documentation on time units.

If you want to adjust axis formats only, use time formats. If you want to group based on timespans (i.e. group data by year, by month, by hour, etc.) then use a time unit. Examples of this appear in the Altair documentation, e.g. the Seattle Weather Heatmap in Altair's example gallery.

jakevdp answered Sep 23 '22 03:09


This can actually easily be achieved by specifying format in Axis:

alt.Chart(df).mark_line().encode(
    x=alt.X('time:T', axis=alt.Axis(format='%H:%M')),
    y='data:Q'
)


fuglede answered Sep 23 '22 03:09