How to assign fixed memory size to a variable in R

I have a variable that grows from a few MB to 3 GB inside a loop, and I get an out-of-memory error. I've tried some solutions, such as increasing the amount of memory R can use and calling rm() and gc(). I thought the problem might be solved if I assigned 3 GB to this variable up front. Now:

  1. Is it possible in R?

  2. If so, will it improve speed?

  3. Is it likely to solve the out-of-memory error?

I have a 64-bit Windows 7 OS. My code is more than a thousand lines, but the key steps are:

1. getting data from an Access file with the odbcConnectAccess2007 and sqlFetch functions and putting the table in a temp variable

2. merging the Master variable with the temp variable, as sketched below
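A minimal sketch of that pattern, assuming the RODBC package; the file name, table names, and join column are placeholders for illustration:

library(RODBC)

channel <- odbcConnectAccess2007("data.accdb")  # hypothetical Access file

Master <- sqlFetch(channel, "MasterTable")      # hypothetical table name
for (tbl in c("Table1", "Table2")) {            # hypothetical table names
    Temp <- sqlFetch(channel, tbl)              # put the table in a temp variable
    Master <- merge(Master, Temp, by = "ID")    # merge into the Master variable
}

odbcClose(channel)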

asked May 30 '15 by John s

People also ask

Does R store data in RAM?

Not necessarily: with a memory-mapped file, at no point does R hold the full data in RAM. The memory-mapped file persists after the session is over, so it can be attached by other R sessions using attach.
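For example, a minimal sketch with the bigmemory package (the file names are assumptions); the matrix is backed by a memory-mapped file on disk rather than held in R's heap:

library(bigmemory)

# Create a file-backed matrix; the data lives in a memory-mapped file
x <- big.matrix(nrow = 1e6, ncol = 3, type = "double",
                backingfile = "x.bin", descriptorfile = "x.desc")
x[1, ] <- c(1, 2, 3)

# The backing file persists after the session ends; another R session
# can re-attach it through the descriptor file
y <- attach.big.matrix("x.desc")
y[1, ]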

Do variables use RAM?

Yes, they are: for the purposes of a programmer, a user, and everything else except the machine itself, all the variables and code of your program are stored in RAM.

Why is my R session using so much memory?

R often uses more memory than expected because of copying of objects. Although these temporary copies get deleted, R may still hold on to the space. To give this memory back to the OS you can call the gc() function; however, when more memory is needed, gc() is called automatically.
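A small sketch of that idea: drop the last reference first, then ask R to return the space:

x <- matrix(runif(1e7), ncol = 100)  # roughly 80 MB of doubles
rm(x)  # remove the last reference to the object
gc()   # run the garbage collector so the memory can go back to the OS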

How do I reduce memory in R?

Garbage collector: gc(). R uses it to release memory it isn't using, but it usually runs automatically, so you shouldn't have to call it explicitly. However, if you want to see when collections happen, use gcinfo(TRUE); you probably won't want to leave this on all the time, as it gets noisy.
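For instance, this minimal sketch prints a message whenever a collection runs during some throwaway allocations:

gcinfo(TRUE)   # report each garbage collection as it happens
for (i in 1:5) tmp <- matrix(runif(1e6), ncol = 10)  # allocations that may trigger collections
gcinfo(FALSE)  # turn the reports back off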


1 Answer

Without seeing specific code, it's hard to know what would help. But if you're calling rbind/cbind/merge inside a for loop, that is extremely inefficient: each call copies all of the accumulated rows. What you can do instead is throw everything into a list and use do.call once at the end. Compare:

# Preallocate a list with 2000 slots, then fill each with a small data frame
data_list <- list()
length(data_list) <- 2000

for (i in 1:2000) {
    data_list[[i]] <- data.frame(matrix(runif(11 * 10), ncol = 11, nrow = 10))
}


# Naive approach: grow the result with rbind on every iteration,
# copying all accumulated rows each time
sequentialRbind <- function() {
    res <- data_list[[1]]
    for (i in 2:length(data_list)) {
        res <- rbind(res, data_list[[i]])
    }
    return(res)
}

> system.time(res1 <- do.call(rbind,data_list))
   user  system elapsed 
   0.78    0.00    0.78 
> 
> system.time(res2 <- Reduce(rbind,data_list))
   user  system elapsed 
   8.24    0.00    8.27 
> 
> system.time(res3 <- sequentialRbind())
   user  system elapsed 
   8.25    0.00    8.27 
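
In this run the single do.call is roughly ten times faster than either incremental approach, since it copies the data once instead of on every iteration.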
answered Sep 20 '22 by thc