
How can I exclude certain dates (e.g., weekends) from time series plots?

Tags: python, altair

In the following example, I'd like to exclude weekends so that Y plots as a straight line, and to specify a custom frequency for the major tick labels, since this would be a "broken" time series (e.g., a tick every Monday, a la matplotlib's set_major_locator).

How would I do that in Altair?

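import altair as alt
import numpy as np
import pandas as pd

# freq='B' generates business days only (Monday to Friday)
index = pd.date_range('2018-01-01', '2018-01-31', freq='B')
# pd.np was removed from pandas, so NumPy is imported directly
df = pd.DataFrame(np.arange(len(index)), index=index, columns=['Y'])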

alt.Chart(df.reset_index()).mark_line().encode(
    x='index',
    y='Y'
)

capitalistcuttle asked Dec 16 '18 17:12



1 Answer

A quick way to do that is to encode the x axis as an ordinal field. On its own this produces a very ugly axis, with the full timestamp, hours included, on every tick. To fix that, I add a column to the dataframe holding a formatted label. I also turn the grid back on, since it is off by default for an ordinal encoding, and set labelAngle to 0.

df2 = df.assign(label=index.strftime('%b %d %y'))

alt.Chart(df2).mark_line().encode(
    x=alt.X('label:O', axis=alt.Axis(grid=True, labelAngle=0)),
    y='Y:Q'
)


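Beware that an ordinal axis drops any missing point: a date with no row gets no slot on the axis, so gaps in the data are not visible. So you may want to add a tooltip showing the actual date. This is discussed in the documentation here. You can also play with labelOverlap in the axis settings, depending on what you want.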


To customize the axis further, we can build one by hand: mark_text draws the labels and mark_rule brings back the grid lines, both from a custom dataframe that keeps only the Mondays. It does not necessarily scale well, but it can give you some ideas.

df3 = df2.loc[df2.index.dayofweek == 0, :].copy()
df3["Y"] = 0

text_chart = alt.Chart(df3).mark_text(dy = 15).encode(
    x=alt.X('label:O', axis = None),
    y=alt.Y('Y:Q'),
    text=alt.Text('label:O')
)

tick_chart = alt.Chart(df3).mark_rule(color='grey').encode(
    x=alt.X('label:O', axis=None),
)

line_chart = alt.Chart(df2).mark_line().encode(
    x=alt.X('label:O', axis=None),
    y='Y:Q'
)

# Altair 4+ dropped alt.Scale(rangeStep=15); a step-based width is the equivalent
(text_chart + tick_chart + line_chart).properties(width=alt.Step(15))

FlorianGD answered Nov 11 '22 13:11
