Overlaying vertical lines onto a plot in Altair

I have a data frame df which looks like this:

+---+------------+-----------+--------+
|   |    date    | violation | pounds |
+---+------------+-----------+--------+
| 0 | 2010-05-13 | N         | NaN    |
| 1 | 2015-04-22 | Y         | NaN    |
| 2 | 2009-08-12 | Y         | NaN    |
| 3 | 2006-06-01 | NaN       | 3732.0 |
| 4 | 2006-08-01 | NaN       | 1340.0 |
| 5 | 2006-10-01 | NaN       | 1310.0 |
+---+------------+-----------+--------+

I want to plot the pounds variable on the vertical axis against the date column on the horizontal axis, and overlay vertical lines onto the plot wherever violation is not NaN. Basically, I want the following chart, except with vertical bars at non-NaN values of df.violation:

example chart

I tried overlaying two Chart() objects following this notebook, but it didn't work. I'd like to be able to do something like this:

from altair import Chart

points = Chart(df).mark_point().encode(y='pounds', x='date')
rules = Chart(df[df['violation'] == 'Y']).mark_rule().encode(x='date')
points + rules

I've checked that the separate Charts points and rules both look fine. Yet the points + rules command results in the following error:

ValueError                                Traceback (most recent call last)
~/anaconda3/lib/python3.5/site-packages/IPython/core/formatters.py in __call__(self, obj)
    907             method = _safe_get_formatter_method(obj, self.print_method)
    908             if method is not None:
--> 909                 method()
    910                 return True
    911 

~/anaconda3/lib/python3.5/site-packages/altair/api.py in _ipython_display_(self)
    186         from IPython.display import display
    187         from vega import VegaLite
--> 188         display(VegaLite(self.to_dict()))
    189 
    190     def display(self):

~/anaconda3/lib/python3.5/site-packages/vega/base.py in __init__(self, spec, data)
     21         """Initialize the visualization object."""
     22         spec = utils.nested_update(copy.deepcopy(self.DEFAULTS), spec)
---> 23         self.spec = self._prepare_spec(spec, data)
     24 
     25     def _prepare_spec(self, spec, data):

~/anaconda3/lib/python3.5/site-packages/vega/vegalite.py in _prepare_spec(self, spec, data)
     22 
     23     def _prepare_spec(self, spec, data):
---> 24         return prepare_spec(spec, data)
     25 
     26 

~/anaconda3/lib/python3.5/site-packages/vega/utils.py in prepare_spec(spec, data)
     91         # Data is either passed in spec or error
     92         if 'data' not in spec:
---> 93             raise ValueError('No data provided')
     94     else:
     95         # As a last resort try to pass the data to a DataFrame and use it

ValueError: No data provided

I know Altair is still in its infancy and is thus lacking in documentation, but does anyone know how to do this easily? This is one of those tasks which is trivial in ggplot2.

gogurt asked Nov 02 '16 13:11

1 Answer

Try

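from altair import Chart, datum

# Both layers share df; the rule layer keeps only the rows where violation is 'Y'
points = Chart(df).mark_point().encode(y='pounds', x='date')
rules = Chart(df).mark_rule().encode(x='date').transform_filter(datum.violation == 'Y')
points + rules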

Take a look at this link which explains it further

Yoni Fihrer answered Sep 30 '22 14:09