
Preallocation in R

Tags:

performance

r

In Matlab it is drilled into you: preallocate, preallocate, preallocate. If you fail to do this, gremlins will eat CPU cycles and you will be a bad person. Is it as important to preallocate in R as it is in Matlab?

Carbon asked May 27 '14 06:05



2 Answers

Since in R we tend to avoid explicit loops, it is not as important. Many vectorized functions allocate their result under the hood for us. Of course, if you insist on using for loops, you should preallocate to avoid growing an object in a loop (which is one of the slowest operations you can do). Relevant reading material: The R Inferno.
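To illustrate the point about growing an object, here is a minimal sketch that computes the same numeric vector three ways: grown inside a loop, preallocated and filled, and vectorized. The function names and the size n are made up for this illustration, and no timings are given because they depend on the machine and the R version.

```r
# Hypothetical helper: grows the result one element at a time (the slow pattern).
grow_squares <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) {
    out <- c(out, i^2)  # c() builds a new, longer vector on every iteration
  }
  out
}

# Hypothetical helper: allocates the full result once, then fills it by index.
prealloc_squares <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i^2
  }
  out
}

# Vectorized version: the allocation happens inside R's own arithmetic code.
vectorized_squares <- function(n) {
  seq_len(n)^2
}

# All three approaches should agree on the result.
n <- 1000
stopifnot(isTRUE(all.equal(grow_squares(n), prealloc_squares(n))))
stopifnot(isTRUE(all.equal(prealloc_squares(n), vectorized_squares(n))))
```

Wrapping each call in system.time() for a larger n should show how the three versions compare on a given machine.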

Roland answered Sep 20 '22 19:09



Some examples:

# Grow the list with append(): a new, longer list is built on every iteration.
test1 <- function() {
  l <- list()
  for (i in 1:10000) {
    l <- append(l, "abc")
  }
  return(l)
}
system.time(test1()) # 2.367 sec

# Preallocate a list of 10000 NULL elements, then fill it by index.
test2 <- function() {
  l <- vector("list", 10000)
  for (i in 1:10000) {
    l[i] <- "abc"
  }
  return(l)
}
system.time(test2()) # 0.015 sec

# Start with an empty list and assign past its end; R extends the list as needed.
test3 <- function() {
  l <- list()
  for (i in 1:10000) {
    l[i] <- "abc"
  }
  return(l)
}
system.time(test3()) # 0.309 sec

# No explicit loop: lapply() allocates the output list itself.
test4 <- function() {
  return(lapply(1:10000, function(x) "abc"))
}
system.time(test4()) # 0.003 sec

R for loops suck indeed :), which is problematic, because rewriting a loop as an lapply call does not always leave the code readable.
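As a sketch of a case where a loop reads more naturally than lapply: when each element depends on the previous one, a preallocated for loop keeps the logic plain. The function running_balance and its arguments are invented for this example and are not taken from the answer above.

```r
# Hypothetical example: balance after each deposit, with interest applied
# to the previous balance. Each step depends on the one before it, which
# is awkward to express with lapply().
running_balance <- function(deposits, rate) {
  n <- length(deposits)
  balance <- numeric(n)  # preallocate the full result once
  previous <- 0
  for (i in seq_len(n)) {
    balance[i] <- previous * (1 + rate) + deposits[i]
    previous <- balance[i]
  }
  balance
}

running_balance(c(100, 100, 100), rate = 0.1)
```

Using seq_len(n) rather than 1:n means the loop body is skipped for an empty input instead of running over c(1, 0).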

phonixor answered Sep 22 '22 19:09
