I am using plot() for over 1 million data points, and it turns out to be very slow.
Is there any way to improve the speed, whether through programming or hardware (more RAM, a better graphics card, ...)?
Where is the data for the plot stored?
(This question is closely related to Scatterplot with too many points, although that question focuses on the difficulty of seeing anything in the big scatterplot rather than on performance issues ...)
A hexbin plot actually shows you something (unlike the scatterplot @Roland proposes in the comments, which is likely to be just one giant, slow blob) and takes about half a second on my machine for your example:
set.seed(101)
a <- rnorm(1e7, 1, 1)
b <- rnorm(1e7, 1, 1)
library(hexbin)
system.time(plot(hexbin(a, b)))  ## 0.5 seconds, modern laptop
Another, slightly slower alternative is the base-R smoothScatter function: it plots a smoothed density estimate plus as many points from the sparsest regions as requested (1000 in this case).
system.time(smoothScatter(a, b, cex = 4, nrpoints = 1000))  ## 3.3 seconds
An easy and fast way is to set pch = '.': each point is then drawn as a tiny rectangle of roughly one pixel, which is much cheaper to render than the default open-circle symbol. The performance comparison is shown below:
> x <- rnorm(10^6)
> system.time(plot(x))
   user  system elapsed
   2.87   15.32   18.74
> system.time(plot(x, pch = 20))
   user  system elapsed
   3.59   22.20   26.16
> system.time(plot(x, pch = '.'))
   user  system elapsed
   1.78    2.26    4.06
Have you looked at the tabplot package? It is designed specifically for large data: http://cran.r-project.org/web/packages/tabplot/ I use it, and it's faster than hexbin (or even the default sunflower plots for overplotting). A minimal call is sketched after this paragraph.
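Here is a minimal sketch of that approach, assuming tabplot's tableplot() entry point as documented on CRAN; the data frame and its column names are made up purely for illustration:

## Hypothetical example: summarise a large data frame with tabplot.
## tableplot() bins the rows and draws one summary bar per bin, so the
## rendering cost depends on the number of bins, not the number of rows.
library(tabplot)

n   <- 1e6
dat <- data.frame(a = rnorm(n), b = rnorm(n))  # illustrative data only
tableplot(dat)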
Also, I think Hadley wrote something on DS's blog about modifying ggplot for big data: http://blog.revolutionanalytics.com/2011/10/ggplot2-for-big-data.html
"I'm currently working with another student, Yue Hu, to turn our research into a robust R package." (October 21, 2011)
Maybe we can ask Hadley if the updated ggplot3 is ready.
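In the meantime, the core idea from that post (summarise the points into bins first, then plot only the summaries) can be sketched with stock ggplot2; geom_bin2d() is a standard geom, so no special package is assumed:

## Sketch of the bin-then-plot idea with plain ggplot2:
## geom_bin2d() counts points per 2-D bin and draws one tile per bin,
## so the device renders a few thousand rectangles instead of 1e7 points.
library(ggplot2)

df <- data.frame(a = rnorm(1e7, 1, 1), b = rnorm(1e7, 1, 1))
ggplot(df, aes(a, b)) + geom_bin2d(bins = 100)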