If I use the vega dataset "disasters" and make a straightforward chart, I get some weird values for year.
import altair as alt
from vega_datasets import data

dis = data.disasters()

alt.Chart(dis).mark_bar().encode(
    x=alt.X('Year:T'),
    y=alt.Y('Deaths'),
    color='Entity'
)
(vega editor link)
Adding to @kanitw's answer: when you convert an integer to a datetime, the integer is treated as a number of nanoseconds since the Unix epoch (1970-01-01 00:00:00), not as a calendar year. You can see this in pandas by executing the following:
>>> import pandas as pd
>>> pd.to_datetime(dis.Year)
0   1970-01-01 00:00:00.000001900
1   1970-01-01 00:00:00.000001901
2   1970-01-01 00:00:00.000001902
3   1970-01-01 00:00:00.000001903
4   1970-01-01 00:00:00.000001905
Name: Year, dtype: datetime64[ns]
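Altair/Vega-Lite uses a similar convention: a bare number in a temporal field is read as a timestamp relative to the epoch rather than as a year, which is why the axis shows values near 1970.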
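To make the difference concrete, here is a minimal pandas sketch. It uses a small hand-made series rather than the real dataset (the values and the names `as_nanoseconds` and `as_years` are just for illustration), and it casts to strings before parsing so that the `%Y` format clearly applies to four-digit text:

```python
import pandas as pd

# Small stand-in for the "Year" column (illustrative values only).
years = pd.Series([1900, 1901, 1902], name="Year")

# Default conversion: each integer is read as nanoseconds after 1970-01-01.
as_nanoseconds = pd.to_datetime(years)

# Explicit format: each value is read as a four-digit calendar year.
as_years = pd.to_datetime(years.astype(str), format="%Y")

print(as_nanoseconds.dt.year.tolist())  # [1970, 1970, 1970]
print(as_years.dt.year.tolist())        # [1900, 1901, 1902]
```

The second conversion is roughly what you want to happen before the column reaches the chart.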
If you would like to parse the year as a date when loading the data, and then plot the year with Altair, you can do the following:
import altair as alt
from vega_datasets import data

dis = data.disasters(parse_dates=['Year'])

alt.Chart(dis).mark_bar().encode(
    x=alt.X('year(Year):T'),
    y=alt.Y('Deaths'),
    color='Entity'
)
First we parse the Year column as a date by passing the appropriate pandas.read_csv argument (parse_dates) to the loading function, and then use the year timeUnit to extract just the year from the full datetime.
If you are plotting data from a CSV URL rather than a pandas dataframe, Vega-Lite is smart enough to parse the CSV file based on the encoding you specify in the Chart, which means the following will give the same result:
dis = data.disasters.url

alt.Chart(dis).mark_bar().encode(
    x=alt.X('year(Year):T'),
    y=alt.Y('Deaths:Q'),
    color='Entity:N'
)
An integer year is not a standard time value.
In Vega-Lite you can add "format": {"parse": {"Year": "date:'%Y'"}} to the data block to specify a custom date parsing format for the field "Year".
See a working spec
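For reference, a rough sketch of what such a spec could look like, written as a Python dict and dumped to JSON so it can be pasted into the Vega editor. The `DATA_URL` value is a placeholder I made up (point it at wherever disasters.csv actually lives), and the encoding simply mirrors the chart from the question:

```python
import json

# Placeholder location (an assumption): replace with the real URL or path
# of the disasters CSV file.
DATA_URL = "disasters.csv"

spec = {
    "data": {
        "url": DATA_URL,
        "format": {
            "type": "csv",
            # Parse the "Year" column as a date using the %Y (four-digit year) pattern.
            "parse": {"Year": "date:'%Y'"},
        },
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "Year", "type": "temporal", "timeUnit": "year"},
        "y": {"field": "Deaths", "type": "quantitative"},
        "color": {"field": "Entity", "type": "nominal"},
    },
}

# Emit the spec as JSON text.
print(json.dumps(spec, indent=2))
```

The only part that matters for the date problem is the "format" entry inside "data"; the rest is an ordinary bar chart.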
In Altair, you can similarly specify the format property of a *Data class (e.g., NamedData).