I am just starting to learn R and I am immediately confused.
Given that everyone here (on SO) keeps saying that pass-by-value is one of R's main paradigms, is it possible to efficiently implement algorithms that rely on "modify in place" (e.g. quicksort and the like)? As I see it, if I do this in R I will have to return intermediate results, effectively copying them, where in another language I would just modify an array passed by pointer/reference. Am I missing something?
I understand R may be the wrong language for that, but is it really so?
There are two main approaches. If you have control over the calling convention, you can wrap your objects in environments.
pointer <- new.env()
pointer$data <- iris
fn1 <- function(env) {
  numcols <- sapply(env$data, is.numeric)
  env$data[, numcols] <- env$data[, numcols] + 1
}
fn1(pointer) # pointer$data will now contain iris with all the numeric columns
             # incremented by 1. The full data set was never passed.
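Since the question mentions quicksort, here is a minimal sketch of the same environment trick applied to it. The names quicksort_env, partition_env and the field x are hypothetical, chosen only for illustration, and the partition scheme (Lomuto, last element as pivot) is just one common choice. The recursive calls share one environment, so no intermediate vectors are returned; whether R avoids copying internally on each element assignment depends on the R version and is not guaranteed.

```r
# Lomuto partition on env$x[lo..hi]; returns the final index of the pivot.
partition_env <- function(env, lo, hi) {
  pivot <- env$x[hi]
  i <- lo - 1L
  for (j in seq_len(hi - lo) + lo - 1L) {  # j runs from lo to hi - 1
    if (env$x[j] <= pivot) {
      i <- i + 1L
      tmp <- env$x[i]; env$x[i] <- env$x[j]; env$x[j] <- tmp
    }
  }
  # Move the pivot into its final position.
  tmp <- env$x[i + 1L]; env$x[i + 1L] <- env$x[hi]; env$x[hi] <- tmp
  i + 1L
}

# Sorts the vector stored in env$x; the environment is shared by all
# recursive calls, so nothing has to be returned and reassembled.
quicksort_env <- function(env, lo = 1L, hi = length(env$x)) {
  if (lo < hi) {
    p <- partition_env(env, lo, hi)
    quicksort_env(env, lo, p - 1L)
    quicksort_env(env, p + 1L, hi)
  }
  invisible(env)
}

v <- new.env()
v$x <- c(5, 3, 8, 1, 9, 2)
quicksort_env(v)
print(v$x)
```

For real work the built-in sort() will be much faster; the point here is only that the caller's data changes without being returned.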
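If you don't have control, you can try something sneakier with non-standard evaluation, but beware: the function then modifies a variable in the caller's frame, which callers will generally not expect.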
fn2 <- function(data) {
  numcols <- sapply(data, is.numeric)
  eval.parent(substitute(data[, numcols] <- data[, numcols] + 1))
}
fn2(iris) # iris will now contain iris with all the numeric columns
          # incremented by 1. The full data set was also never passed.
As of version 3.1 of R, copy-on-modify handles nested structures such as data frames with shallow copies, so the two approaches above should perform roughly the same as simply
fn3 <- function(data) {
  numcols <- sapply(data, is.numeric)
  data[, numcols] <- data[, numcols] + 1
  data
}
iris <- fn3(iris)
If you have R 3.1 or later installed, you can verify the performance claims yourself by running microbenchmark on these three functions.
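A minimal sketch of such a comparison, assuming the microbenchmark package from CRAN is installed (it is not part of base R). Only the environment version and the plain copy-on-modify version are timed here; fn2 is left out because its effect depends on the frame it is called from, which a benchmarking harness may change. The labels environment and copy_on_modify are just illustrative names, and no timings are shown because they vary by machine and R version.

```r
# Environment-based version from above (modifies env$data by reference).
fn1 <- function(env) {
  numcols <- sapply(env$data, is.numeric)
  env$data[, numcols] <- env$data[, numcols] + 1
}

# Plain copy-on-modify version from above (returns the modified data).
fn3 <- function(data) {
  numcols <- sapply(data, is.numeric)
  data[, numcols] <- data[, numcols] + 1
  data
}

pointer <- new.env()
pointer$data <- iris
df <- iris

# Run the benchmark only if the package is available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    environment    = fn1(pointer),
    copy_on_modify = fn3(df),
    times = 100L
  ))
}
```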