
Why is GGally::ggpairs significantly slower in RStudio vs. base R?

Per the title, does anyone know why rendering a ggpairs plot from the GGally package takes significantly longer in RStudio vs. base R (or terminal)?

Example:

library(GGally)

start.time <- Sys.time()
print(ggpairs(mtcars))  # print explicitly so the rendering itself is timed
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken

Running this in RStudio on my machine takes on the order of 5 times longer than in base R. I have seen the same slowdown on both Windows and Mac.

Are there any workarounds?

Other packages?

Specifically, how can I render something like GGally::ggpairs(iris, color = "Species") quickly without leaving RStudio?

JasonAizkalns asked Apr 29 '15 14:04


1 Answer

I had similar issues, and spent some time trying to figure out why. I found four significant issues (not an exhaustive list). If your situation is like mine, then 1 and 2 are your main concerns.

  1. The IDE. RStudio renders ggpairs output more slowly than plain R does.
  2. Your computing environment. I don't have the resources to test this extensively, but we're most likely talking about the GPU, since this is a graphics processing issue.
  3. Number of variables. ggpairs draws one panel for every pair of variables, so p variables mean p x p panels; the panel count, and roughly the rendering time, grows quadratically as you add columns.
  4. Sequential operations. On a low-power machine, sending many plotting requests back to back can slow execution further.
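
To make point 3 concrete, here is a minimal sketch of how the panel grid grows and how plotting only the columns you care about shrinks it. The helper panel_count() and the column choice in keep are illustrative assumptions, not part of GGally.

```r
# Hypothetical helper: a pairs-style plot of p variables has a p x p grid of panels.
panel_count <- function(p) p^2

# Panel counts for the built-in data sets mentioned in this thread
n_iris   <- panel_count(ncol(iris))    # 5 columns
n_mtcars <- panel_count(ncol(mtcars))  # 11 columns

# Plotting a subset of columns shrinks the grid considerably.
# The columns below are an arbitrary illustrative choice.
keep  <- c("mpg", "hp", "wt", "qsec")
small <- mtcars[, keep]
n_small <- panel_count(ncol(small))

print(c(iris = n_iris, mtcars = n_mtcars, mtcars_subset = n_small))
```

Passing a reduced data frame such as small to ggpairs is the cheapest way to cut rendering time, whichever IDE you use.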

You can read more on my GitHub here: https://github.com/zstachniak/Elapsed-Time-Pairwise-Functions/blob/master/ggpairs.md
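
On point 1, one workaround that may help without leaving RStudio is to bypass the RStudio plot pane and render to a file device instead. The sketch below assumes the slowdown comes from RStudio's own graphics device, which I have not verified on every setup. time_render() is a hypothetical helper written for this example, and base pairs() stands in for ggpairs so the code runs even without GGally installed.

```r
# Hypothetical helper: draw a plot on an off-screen PNG device and time it.
# plot_fun is a zero-argument function that does the drawing.
time_render <- function(plot_fun, file = tempfile(fileext = ".png"),
                        width = 1200, height = 1200) {
  png(file, width = width, height = height)   # open a file device, not the plot pane
  elapsed <- tryCatch(
    system.time(plot_fun())[["elapsed"]],     # wall-clock seconds spent drawing
    finally = dev.off()                       # always close the device
  )
  list(file = file, seconds = elapsed)
}

# Base-graphics stand-in so the sketch is self-contained
res <- time_render(function() pairs(mtcars))
print(res$seconds)

# With GGally, print the plot object explicitly inside the function:
# res <- time_render(function() print(GGally::ggpairs(mtcars)))
```

The resulting PNG can then be opened from the path in res$file. Another option to try is opening a separate device window with dev.new(noRStudioGD = TRUE) before plotting, which asks R for a device other than RStudio's.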

zstachniak answered Oct 10 '22 07:10