
strange year values on X axis

If I use the vega dataset "disasters" and make a straightforward chart, I get some weird values for year.

In Altair the code is:

import altair as alt
from vega_datasets import data

dis=data.disasters()

alt.Chart(dis).mark_bar().encode(
    x=alt.X('Year:T'),
    y=alt.Y('Deaths'),
    color='Entity'
)


(vega editor link)

campo asked Jan 01 '23 04:01


2 Answers

Adding to @kanitw's answer: when you convert an integer to a datetime, the integer is treated as a count of nanoseconds since the Unix epoch (1970-01-01). You can see this in pandas by executing the following:

>>> import pandas as pd
>>> pd.to_datetime(dis.Year)
0   1970-01-01 00:00:00.000001900
1   1970-01-01 00:00:00.000001901
2   1970-01-01 00:00:00.000001902
3   1970-01-01 00:00:00.000001903
4   1970-01-01 00:00:00.000001905
Name: Year, dtype: datetime64[ns]

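Altair/Vega-Lite uses a similar convention: JavaScript interprets a bare number as milliseconds since the epoch, so integer years such as 1900 all land within a couple of seconds of 1 January 1970.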
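To make the effect concrete, here is a minimal standard-library sketch of what happens when a year is read as an offset from the epoch. The helper names `from_epoch_ms` and `from_epoch_ns` are made up for illustration; they are not part of pandas or Altair. Python's `datetime` only stores microseconds, so the nanosecond case is rounded down.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch, the "zero date" that integer timestamps are measured from.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(value):
    """Read an integer as milliseconds since the epoch (the JavaScript convention)."""
    return EPOCH + timedelta(milliseconds=value)


def from_epoch_ns(value):
    """Read an integer as nanoseconds since the epoch (the pandas default).

    datetime has microsecond resolution, so the value is floored to microseconds.
    """
    return EPOCH + timedelta(microseconds=value // 1000)


for year in (1900, 1901, 2017):
    # Either way, every "year" collapses onto the first instants of 1970.
    print(year, from_epoch_ms(year).isoformat(), from_epoch_ns(year).isoformat())
```

Under either reading, the values end up a second or two after midnight on 1 January 1970, which is why the axis shows strange times instead of years.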

If you would like to parse the year as a date when loading the data, and then plot the year with Altair, you can do the following:

import altair as alt
from vega_datasets import data

dis=data.disasters(parse_dates=['Year'])

alt.Chart(dis).mark_bar().encode(
    x=alt.X('year(Year):T'),
    y=alt.Y('Deaths'),
    color='Entity'
)


First we parse the Year column as a date by passing the pandas.read_csv argument parse_dates to the loading function, and then we use the year timeUnit to extract just the year from the full datetime.
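The two steps can be sketched without pandas or Altair to show what each one does. This is only an illustration under stated assumptions: the rows in `SAMPLE_CSV` are made-up placeholders in the shape of the disasters data, and `load_rows` and `year_unit` are hypothetical helpers, not library functions.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows with the same columns as the disasters dataset.
SAMPLE_CSV = """Entity,Year,Deaths
Example A,1900,100
Example A,1901,200
Example B,1900,50
"""


def load_rows(text):
    """Read CSV text, parsing Year as a date instead of a bare integer."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "Entity": row["Entity"],
            # "1900" becomes 1900-01-01 00:00:00, a real point in time.
            "Year": datetime.strptime(row["Year"], "%Y"),
            "Deaths": int(row["Deaths"]),
        })
    return rows


def year_unit(moment):
    """Truncate a datetime to the start of its year, much like a year time unit."""
    return moment.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


rows = load_rows(SAMPLE_CSV)
for row in rows:
    print(row["Entity"], year_unit(row["Year"]).year, row["Deaths"])
```

Once the column holds real dates, truncating to the year gives one bin per calendar year rather than one per nanosecond offset.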

If you are plotting data from a CSV URL rather than a pandas dataframe, Vega-Lite is smart enough to parse the CSV file based on the encoding you specify in the Chart, which means the following will give the same result:

dis=data.disasters.url

alt.Chart(dis).mark_bar().encode(
    x=alt.X('year(Year):T'),
    y=alt.Y('Deaths:Q'),
    color='Entity:N'
)


jakevdp answered Jan 12 '23 03:01


An integer year is not a standard time value.

In Vega-Lite you can add "format": {"parse": {"Year": "date: '%Y'"}} to the data block to specify a custom date-parsing format for the field "Year".
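For context, here is a rough sketch of where that format entry sits in a full spec, built as a plain Python dict and printed as JSON. The URL `data/disasters.csv` is a placeholder and `build_spec` is a hypothetical helper. The parse string is written without a space after the colon, as in the Vega-Lite documentation examples.

```python
import json

# Placeholder location of the CSV file; substitute the real URL.
DATA_URL = "data/disasters.csv"


def build_spec(url):
    """Build a Vega-Lite bar chart spec that parses Year with a '%Y' date format."""
    return {
        "data": {
            "url": url,
            "format": {
                "type": "csv",
                # Tell the loader to read the Year column as a four-digit year.
                "parse": {"Year": "date:'%Y'"},
            },
        },
        "mark": "bar",
        "encoding": {
            "x": {"field": "Year", "type": "temporal", "timeUnit": "year"},
            "y": {"field": "Deaths", "type": "quantitative"},
            "color": {"field": "Entity", "type": "nominal"},
        },
    }


spec = build_spec(DATA_URL)
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the Vega editor after replacing the placeholder URL.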

See a working spec

In Altair, you can similarly specify the format property of a *Data class (e.g., NamedData).

kanitw answered Jan 12 '23 03:01