
Implementing in-place modification algorithms in R

Tags: algorithm, r

I am just starting to learn R and I am already confused:

Given that everyone here (on SO) keeps saying that pass-by-value is one of the main R paradigms, is it possible to efficiently implement algorithms that rely on modifying data in place (such as quicksort and the like)? The way I see it, in R I would have to return intermediate results, effectively copying them, whereas in another language I would just modify an array passed by pointer or reference. Am I missing something?

I understand R may be the wrong language for this, but is that really the case?

Zeks asked May 27 '14 16:05



1 Answer

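There are two main approaches. If you have control over the calling convention, you can wrap your objects in environments. Environments are reference objects, so a function that receives one can modify its contents and the caller will see the change.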

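pointer <- new.env()   # environments are not copied when passed to a function
pointer$data <- iris
fn1 <- function(env) {
  # Find the numeric columns, then update them inside the shared environment.
  numcols <- sapply(env$data, is.numeric)
  env$data[, numcols] <- env$data[, numcols] + 1
  invisible(NULL)      # called for its side effect only
}
fn1(pointer) # pointer$data now contains iris with all the numeric columns
             # incremented by 1. Only the environment was passed, not the data set.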
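To connect this back to the quicksort mentioned in the question, here is a minimal sketch of the same environment trick applied to a sorting routine. The names `quicksort_env`, `box`, and `v` are made up for illustration, and the partition scheme (Lomuto) is just one reasonable choice. Whether R avoids an internal copy on each element assignment depends on the R version and on how many references to the vector exist; what the environment guarantees is that the recursive calls and the caller all share one vector without returning intermediate results.

```r
# Quicksort over env$v using the Lomuto partition scheme.
# The vector lives in the environment, so nothing is returned.
quicksort_env <- function(env, lo = 1L, hi = length(env$v)) {
  if (lo >= hi) return(invisible(NULL))   # 0 or 1 element: nothing to do
  pivot <- env$v[hi]
  i <- lo - 1L                            # end of the "<= pivot" region
  for (j in seq(lo, hi - 1L)) {
    if (env$v[j] <= pivot) {
      i <- i + 1L
      tmp <- env$v[i]; env$v[i] <- env$v[j]; env$v[j] <- tmp
    }
  }
  # Move the pivot into its final position, i + 1.
  tmp <- env$v[i + 1L]; env$v[i + 1L] <- env$v[hi]; env$v[hi] <- tmp
  quicksort_env(env, lo, i)               # left of the pivot
  quicksort_env(env, i + 2L, hi)          # right of the pivot
  invisible(NULL)
}

box <- new.env()
box$v <- c(5, 3, 8, 1, 9, 2)
quicksort_env(box)
print(box$v)  # 1 2 3 5 8 9
```

For real work the built-in sort() is the better tool; the sketch only shows that the calling convention does not stand in the way of an in-place style.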

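If you don't have control, you can try something sneakier with non-standard evaluation, but beware: it silently modifies a variable in the caller's environment, and it only works when the argument is something that can be assigned to, such as a plain variable name.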

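fn2 <- function(data) {
  numcols <- sapply(data, is.numeric)
  # substitute() swaps `data` for the caller's expression (here `iris`) and
  # `numcols` for its value; eval.parent() runs the assignment in the caller.
  eval.parent(substitute(data[, numcols] <- data[, numcols] + 1))
}
fn2(iris)  # iris in the calling environment will now have all the numeric
           # columns incremented by 1. Nothing was returned from the function.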
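The warning attached to this approach is easier to see with a small experiment. The sketch below repeats fn2 so that it runs on its own and uses a made-up data frame `df`; it shows the call working on a plain variable and failing when the argument is an expression that cannot be assigned to.

```r
fn2 <- function(data) {
  numcols <- sapply(data, is.numeric)
  eval.parent(substitute(data[, numcols] <- data[, numcols] + 1))
}

# Hypothetical example data: two numeric columns and one character column.
df <- data.frame(x = c(1, 2), y = c(10, 20), label = c("a", "b"))

fn2(df)        # works: `df` is a plain variable name in the caller
print(df)      # x and y are each 1 larger; label is unchanged

# Here the substituted code becomes `head(df)[, ...] <- ...`, which R
# cannot evaluate as an assignment, so the call raises an error.
outcome <- tryCatch(
  { fn2(head(df)); "ok" },
  error = function(e) "error"
)
print(outcome)
```

The function also changes a variable the caller never explicitly reassigned, which is exactly the kind of hidden side effect that makes code hard to reason about.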

As of version 3.1 of R, copy-on-write handles nested structures (such as the columns of a data frame) without duplicating the whole object, so the two approaches above are effectively equivalent to simply

fn3 <- function(data) {
  numcols <- sapply(data, is.numeric)
  data[, numcols] <- data[, numcols] + 1
  data
}
iris <- fn3(iris)

If you have R 3.1 or later installed, you can verify the performance claims yourself by running the microbenchmark package on these three functions.
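A rough way to run that comparison is sketched below. It uses base R's system.time() instead of microbenchmark so that it runs without installing anything; microbenchmark gives more precise timings if it is available. The variables `df2`, `df3`, and `n` are made up for the sketch, and working copies are used so that the built-in iris is not overwritten. Timings vary by machine and R version, so no particular result is claimed.

```r
fn1 <- function(env) {
  numcols <- sapply(env$data, is.numeric)
  env$data[, numcols] <- env$data[, numcols] + 1
}
fn2 <- function(data) {
  numcols <- sapply(data, is.numeric)
  eval.parent(substitute(data[, numcols] <- data[, numcols] + 1))
}
fn3 <- function(data) {
  numcols <- sapply(data, is.numeric)
  data[, numcols] <- data[, numcols] + 1
  data
}

pointer <- new.env()
pointer$data <- iris   # environment approach
df2 <- iris            # non-standard evaluation approach
df3 <- iris            # plain return-value approach

n <- 200               # number of repetitions (arbitrary)
t1 <- system.time(for (i in seq_len(n)) fn1(pointer))
t2 <- system.time(for (i in seq_len(n)) fn2(df2))
t3 <- system.time(for (i in seq_len(n)) df3 <- fn3(df3))

# Elapsed seconds for each approach.
print(c(env = t1[["elapsed"]], nse = t2[["elapsed"]], copy = t3[["elapsed"]]))

# All three should end up with the same data.
print(isTRUE(all.equal(pointer$data, df2)) && isTRUE(all.equal(df2, df3)))
```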

Robert Krzyzanowski answered Nov 17 '22 23:11
