I have a Bokeh (v0.11) serve app that produces a scatter plot using (x, y) coordinates from a data frame. I want to add interactions so that when a user either selects points on the plot or enters comma-separated point names in the text box (e.g. "p55, p1234"), those points turn red on the scatter plot.
I have found one way to accomplish this (Strategy #3, below) but it is terribly slow for large dataframes. I would think there is a better method. Can anyone help me out? Am I missing some obvious function call?
The full code is on pastebin: http://pastebin.com/JvQ1UpzY The most relevant portion is copied below.
def refresh_graph(self, selected_points=None, old_idxs=None, new_idxs=None):
    # Strategy 1: Cherry pick current plot's source.
    # Compute time for 100 points: < 1ms.
    if self.strategy == 1:
        t1 = datetime.now()
        for idx in old_idxs:
            self.graph_plot.data_source.data['color'][idx] = 'steelblue'
        for idx in new_idxs:
            self.graph_plot.data_source.data['color'][idx] = 'red'
        print('Strategy #1 completed in {}'.format(datetime.now() - t1))
    else:
        t3 = datetime.now()
        self.coords['color'] = 'steelblue'
        self.coords.loc[selected_points, 'color'] = 'red'
        new_source = bkmodels.ColumnDataSource(self.coords)
        self.graph_plot = self.graph_fig.scatter('x', 'y', source=new_source, color='color', alpha=0.6)
        print('Strategy #3 completed in {}'.format(datetime.now() - t3))
    return
Ideally, I would like to be able to use Strategy #1, but it does not seem to allow the points to refresh within the client browser.
Thanks for any help!
FYI: I am using RHEL 6.X
If you are streaming data, then there is a related answer here: Timeseries streaming in bokeh
If you need to update everything at once, you can do that. My suggestion is your Strategy 1, which is demonstrated, e.g., here:
https://github.com/bokeh/bokeh/blob/master/examples/app/sliders.py
The particular thing to note is that you really have to update all of source.data in one go. One of the assumptions is that all the columns of a column data source always have the same length. Updating individual columns runs the risk of breaking this assumption, which can cause problems. So you want to update everything at once, with something like:
# Generate the new curve
x = np.linspace(0, 4*np.pi, N)
y = a*np.sin(k*x + w) + b
source.data = dict(x=x, y=y)
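Applied to the colouring problem in the question, the same idea might look like the sketch below. It is only an illustration: `recolored` is a hypothetical helper (not part of Bokeh), the sample values are made up, and a plain dict stands in for `self.graph_plot.data_source.data` so the snippet runs without a server. The point is that the colour list is rebuilt separately and the whole dict is then assigned once, rather than mutating `data['color'][idx]` in place.

```python
def recolored(data, selected_idxs, base="steelblue", highlight="red"):
    """Return a new column dict whose 'color' column marks the selected rows.

    `data` is a plain dict of equal-length columns, shaped like source.data.
    The input dict and its existing columns are left unmodified.
    """
    n = len(data["x"])
    selected = set(selected_idxs)
    colors = [highlight if i in selected else base for i in range(n)]
    new_data = dict(data)       # shallow copy: unchanged columns are reused
    new_data["color"] = colors  # only the colour column is rebuilt
    return new_data


# Hypothetical stand-in for self.graph_plot.data_source.data
data = {"x": [0.0, 1.0, 2.0], "y": [3.0, 4.0, 5.0], "color": ["steelblue"] * 3}
new_data = recolored(data, selected_idxs=[0, 2])
print(new_data["color"])
```

In the app, the final step would then be the single assignment `self.graph_plot.data_source.data = new_data`, in the same way `source.data` is replaced in the sliders example.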