
What is a fast and proper way to refresh/update plots in Bokeh (0.11) server app?

Tags: python, bokeh

I have a Bokeh (v0.11) server app that produces a scatter plot using (x, y) coordinates from a data frame. I want to add interactions so that when a user either selects points on the plot or enters a comma-separated list of point names in the text box (e.g. "p55, p1234"), those points turn red on the scatter plot.

I have found one way to accomplish this (Strategy #3, below), but it is terribly slow for large data frames. I would think there is a better method. Can anyone help me out? Am I missing some obvious function call?

  • Strategy 1 (<1 ms per 100 points) drills into the ColumnDataSource data of the existing plot and attempts to change the color of the selected points in place.
  • Strategy 2 (~70 ms per 100 points) overwrites the plot's existing ColumnDataSource with a newly created ColumnDataSource.
  • Strategy 3 (~400 ms per 100 points) does the same as Strategy 2 and then re-creates the figure.

The full code is on Pastebin: http://pastebin.com/JvQ1UpzY The most relevant portion is copied below.

def refresh_graph(self, selected_points=None, old_idxs=None, new_idxs=None):
    # Strategy 1: Cherry pick current plot's source.
    # Compute time for 100 points: < 1ms.
    if self.strategy == 1:
        t1 = datetime.now()
        for idx in old_idxs:
            self.graph_plot.data_source.data['color'][idx] = 'steelblue'
        for idx in new_idxs:
            self.graph_plot.data_source.data['color'][idx] = 'red'
        print('Strategy #1 completed in {}'.format(datetime.now() - t1))
    else:
        t3 = datetime.now()
        self.coords['color'] = 'steelblue'
        self.coords.loc[selected_points, 'color'] = 'red'
        new_source = bkmodels.ColumnDataSource(self.coords)
        self.graph_plot = self.graph_fig.scatter('x', 'y', source=new_source, color='color', alpha=0.6)
        print('Strategy #3 completed in {}'.format(datetime.now() - t3))
    return

Ideally, I would like to use Strategy #1, but the points do not seem to refresh in the client browser when I change them this way.

Thanks for any help!

FYI: I am using RHEL 6.X

user2700854 asked Jan 24 '16 00:01




1 Answer

If you are streaming data, then there is a related answer here: Timeseries streaming in bokeh

If you need to update everything at once, you can do that. My suggestion is your Strategy 1, which is demonstrated here, for example:

https://github.com/bokeh/bokeh/blob/master/examples/app/sliders.py

The particular thing to note is that you really have to update all of source.data in one go. A column data source assumes that all of its columns always have the same length. Updating individual columns risks breaking this assumption, which can cause problems. So you want to update everything at once, with something like:

# Generate the new curve
x = np.linspace(0, 4*np.pi, N)
y = a*np.sin(k*x + w) + b

source.data = dict(x=x, y=y)
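
Applied to the question's color column, the same idea might look like the sketch below. It is only an illustration under stated assumptions: `recolored_data` is a hypothetical helper (not part of Bokeh), the column names 'x', 'y' and 'color' are taken from the question, and the columns are treated as plain Python sequences. The point is to build a complete new dict rather than mutate entries of the existing one in place.

```python
def recolored_data(data, selected_idxs, base="steelblue", highlight="red"):
    """Return a NEW dict of columns in which 'color' has been rebuilt.

    Hypothetical helper for illustration; `data` is a mapping of column
    name to sequence, such as the contents of a ColumnDataSource's .data.
    """
    n = len(data["x"])
    selected = set(selected_idxs)
    # Copy every column so the result is independent of the old dict.
    new_data = {name: list(col) for name, col in data.items()}
    # Rebuild the whole color column; all columns keep the same length n.
    new_data["color"] = [highlight if i in selected else base for i in range(n)]
    return new_data


# Stand-in for source.data, using a plain dict so the sketch runs without Bokeh.
old = {"x": [0.0, 1.0, 2.0], "y": [3.0, 4.0, 5.0], "color": ["steelblue"] * 3}
new = recolored_data(old, [0, 2])
print(new["color"])  # ['red', 'steelblue', 'red']
```

In the app from the question, the result would then be assigned in a single step, for example `self.graph_plot.data_source.data = recolored_data(self.graph_plot.data_source.data, new_idxs)`. It is that one assignment to `data`, rather than the in-place edits of Strategy 1 as originally written, that is expected to notify the browser.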

bigreddot answered Oct 12 '22 07:10