I have a variable that grows from a few MB to 3 GB inside a loop, and I get an out-of-memory error. I've tried several fixes, such as increasing the amount of memory R
can use and calling rm() and gc(). I wondered whether it might be solved by allocating 3 GB to this variable up front. So:
Is it possible in R?
If so, will it improve speed?
Is it likely to solve the out-of-memory error?
I have a 64-bit Windows 7 OS. My code is more than a thousand lines, but the key lines are:
1. Getting data from an Access file with the odbcConnectAccess2007 and sqlFetch functions and putting the table in a temp variable
2. Merging the master variable with the temp variable
The implementation is such that at no point does R hold the data in RAM. The memory mapped file will be there after the session is over. It can thus be called by other R sessions using attach.
The relevant part is, yes, they are. For the purposes of a programmer, user, and everything else except the machine itself, all variables and code of your program are stored in RAM.
R probably uses more memory because it makes temporary copies of some objects. Although these copies get deleted, R still occupies the space. To give this memory back to the OS you can call the gc function. However, when the memory is needed, gc is called automatically.
Garbage collector: R uses gc() to release memory it isn't using, but it usually runs automatically, so you shouldn't have to call it explicitly. However, if you want to see when this is happening, use gcinfo(TRUE). You probably won't want to leave this on all the time, as it gets annoying.
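As a rough illustration of the point about gc(), the sketch below allocates a large vector, removes it, and compares the memory R reports as used before and after. The object size (1e7 doubles) and the variable names are arbitrary choices for the demo, and the explicit gc() calls are there only to read the numbers, not because they are normally required.

```r
# Allocate a large numeric vector: 1e7 doubles * 8 bytes, roughly 76 MB.
x <- numeric(1e7)
size_mb <- as.numeric(object.size(x)) / 1024^2

# gc() returns a matrix with rows "Ncells" and "Vcells";
# the second column is the memory currently used, in Mb.
before <- gc()["Vcells", 2]

rm(x)                        # drop the only reference to the vector
after <- gc()["Vcells", 2]   # explicit collection, only so we can measure

freed_mb <- before - after   # should be close to size_mb

# gcinfo(TRUE) makes R report each automatic collection;
# it returns the previous setting, which we restore afterwards.
old <- gcinfo(TRUE)
tmp <- lapply(1:50, function(i) numeric(1e5))  # some allocation churn
gcinfo(old)
```

If freed_mb comes out close to size_mb, the space held by the removed object was reclaimed by the collector, which is the same thing R would have done on its own once it needed the memory.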
Without seeing specific code, it's hard to know what would help. But if you're calling rbind/cbind/merge within a for loop, that is extremely inefficient. What you can do is throw everything into a list and then use do.call at the end. Compare:
# Build 2000 small data frames (10 rows x 11 columns each).
data_list <- vector("list", 2000)
for (i in seq_along(data_list)) {
  data_list[[i]] <- data.frame(matrix(runif(11 * 10), ncol = 11, nrow = 10))
}

# Naive approach: grow the result one rbind at a time.
sequentialRbind <- function() {
  res <- data_list[[1]]
  for (i in 2:length(data_list)) {
    res <- rbind(res, data_list[[i]])
  }
  return(res)
}
> system.time(res1 <- do.call(rbind, data_list))
   user  system elapsed
   0.78    0.00    0.78
> system.time(res2 <- Reduce(rbind, data_list))
   user  system elapsed
   8.24    0.00    8.27
> system.time(res3 <- sequentialRbind())
   user  system elapsed
   8.25    0.00    8.27
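Applied to the loop described in the question, the same idea might look like the sketch below. It assumes each iteration appends rows to the master variable (if "merging" means a key-based join, this does not apply directly). fetch_chunk is a hypothetical stand-in for the odbcConnectAccess2007/sqlFetch step, and the chunk count and column names are made up for illustration.

```r
# Hypothetical stand-in for one sqlFetch() call; returns a small data frame.
fetch_chunk <- function(i) {
  data.frame(id = rep(i, 3), value = seq_len(3) * i)
}

n_chunks <- 5

# Preallocate the list of chunks rather than growing a data frame.
chunks <- vector("list", n_chunks)
for (i in seq_len(n_chunks)) {
  chunks[[i]] <- fetch_chunk(i)   # no rbind/merge inside the loop
}

# Bind everything once, after the loop has finished.
master <- do.call(rbind, chunks)
```

This preallocates only the list slots, not the 3 GB of data, so it mainly avoids the repeated copying inside the loop. The final result still has to fit in memory.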