Speeding up matplotlib scatter plots

I'm building an interactive program that primarily uses matplotlib to make scatter plots of rather a lot of points (10k-100k or so). It works right now, but changes take too long to render. Small numbers of points are OK, but once the count rises things get frustrating in a hurry. So I'm looking for ways to speed up scatter, but I'm not having much luck.

There's the obvious way to do things, which is how it's implemented now. (I realize the plot redraws without changing anything. I didn't want large calls to random to skew the fps result.)

import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time


X = np.random.randn(10000)  #x pos
Y = np.random.randn(10000)  #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size

#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows 
#per point alpha later on.
colors[:,3] = 0.1

fig, ax = plt.subplots()

fig.show()
background = fig.canvas.copy_from_bbox(ax.bbox)

#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')

fig.canvas.draw()

sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    ax.cla()
    coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None',marker='D')
    fig.canvas.draw()
#frames divided by elapsed seconds gives frames per second
print('%2.1f FPS'%( 10/(time.time()-sTime) ))

This gives a speedy 0.7 fps.

Alternatively, I can edit the collection returned by scatter. That way I can change color and position, but I don't know how to change the size of each point. I think it would look something like this:

import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import time


X = np.random.randn(10000)  #x pos
Y = np.random.randn(10000)  #y pos
C = np.random.random(10000) #will be color
S = (1+np.random.randn(10000)**2)*3 #size

#build the colors from a color map
colors = mpl.cm.jet(C)
#there are easier ways to do static alpha, but this allows 
#per point alpha later on.
colors[:,3] = 0.1

fig, ax = plt.subplots()

fig.show()
#draw the empty axes once so there is a rendered background to copy
fig.canvas.draw()
background = fig.canvas.copy_from_bbox(ax.bbox)

#this makes the base collection
coll = ax.scatter(X,Y,facecolor=colors, s=S, edgecolor='None', marker='D')

fig.canvas.draw()

sTime = time.time()
for i in range(10):
    print(i)
    #don't change anything, but redraw the plot
    coll.set_facecolors(colors)
    coll.set_offsets( np.array([X,Y]).T )
    #for starters let's not change anything!
    fig.canvas.restore_region(background)
    ax.draw_artist(coll)
    fig.canvas.blit(ax.bbox)
#frames divided by elapsed seconds gives frames per second
print('%2.1f FPS'%( 10/(time.time()-sTime) ))

This results in a slower 0.7 fps. I wanted to try CircleCollection or RegularPolygonCollection, since either would let me change the sizes easily and I don't care about changing the marker. But I can't get either to draw, so I have no idea whether they'd be faster. At this point I'm looking for ideas.
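
On changing the size of each point: the collection returned by scatter appears to accept new sizes through set_sizes, so sizes can be updated in place alongside colors and offsets. Below is a minimal sketch of that idea, not a measured speedup. The Agg backend, the seeded random data, animated=True, and the name new_sizes are assumptions made for illustration and are not part of the original program.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)  # seeded only to make the sketch repeatable
n = 10000
X = rng.randn(n)                 # x pos
Y = rng.randn(n)                 # y pos
S = (1 + rng.randn(n) ** 2) * 3  # size

fig, ax = plt.subplots()

# animated=True keeps the collection out of the normal draw,
# so the copied background holds only the empty axes
coll = ax.scatter(X, Y, s=S, alpha=0.1, edgecolor='none',
                  marker='D', animated=True)

fig.canvas.draw()
background = fig.canvas.copy_from_bbox(ax.bbox)

# hypothetical update: double every marker size in place
new_sizes = S * 2.0
coll.set_sizes(new_sizes)

# redraw just the collection on top of the saved background
fig.canvas.restore_region(background)
ax.draw_artist(coll)
fig.canvas.blit(ax.bbox)
```

Whether this is any faster than re-creating the scatter depends on where the time is actually spent, so it is worth timing on the real data before relying on it.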

asked Aug 12 '13 05:08

george

1 Answer

We are actively working on performance for large matplotlib scatter plots. I'd encourage you to get involved in the conversation (http://matplotlib.1069221.n5.nabble.com/mpl-1-2-1-Speedup-code-by-removing-startswith-calls-and-some-for-loops-td41767.html) and, even better, test out the pull request that has been submitted to make life much better for a similar case (https://github.com/matplotlib/matplotlib/pull/2156).

HTH

answered Oct 05 '22 23:10

pelson

Tags: python, matplotlib, scatter