Tricks to manage the available memory in an R session

What tricks do people use to manage the available memory of an interactive R session? I use the functions below [based on postings by Petr Pikal and David Hinds to the r-help list in 2004] to list (and/or sort) the largest objects and to occasionally rm() some of them. But by far the most effective solution was ... to run under 64-bit Linux with ample memory.

Any other nice tricks folks want to share? One per post, please.

# improved list of objects
.ls.objects <- function (pos = 1, pattern, order.by,
                         decreasing=FALSE, head=FALSE, n=5) {
    napply <- function(names, fn) sapply(names, function(x)
                                         fn(get(x, pos = pos)))
    names <- ls(pos = pos, pattern = pattern)
    obj.class <- napply(names, function(x) as.character(class(x))[1])
    obj.mode <- napply(names, mode)
    obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class)
    obj.size <- napply(names, object.size)
    obj.dim <- t(napply(names, function(x)
                        as.numeric(dim(x))[1:2]))
    vec <- is.na(obj.dim)[, 1] & (obj.type != "function")
    obj.dim[vec, 1] <- napply(names, length)[vec]
    out <- data.frame(obj.type, obj.size, obj.dim)
    names(out) <- c("Type", "Size", "Rows", "Columns")
    if (!missing(order.by))
        out <- out[order(out[[order.by]], decreasing=decreasing), ]
    if (head)
        out <- head(out, n)
    out
}

# shorthand
lsos <- function(..., n=10) {
    .ls.objects(..., order.by="Size", decreasing=TRUE, head=TRUE, n=n)
}
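For example, a session using these helpers might look like the sketch below (the object names x and m are hypothetical):

x <- rnorm(1e6)             # a large numeric vector (~8 MB)
m <- matrix(0, 1000, 1000)  # a large matrix (~8 MB)
lsos()                      # the ten largest objects, sorted by size
lsos(n = 3)                 # just the top three
rm(m); gc()                 # remove an object and return the memory to the OS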
Dirk Eddelbuettel asked Aug 31 '09


People also ask

How do I deal with memory problems in R?

Short of reworking R to be more memory efficient, you can buy more RAM, use a package designed to store objects on hard drives rather than in RAM (ff, filehash, R.huge, or bigmemory), or use a library designed to perform linear regression by using sparse matrices such as t(X)*X rather than X.

Why is my R session using so much memory?

R often uses more memory than you expect because of internal copying of objects. Although these temporary copies get deleted, R may still hold on to the space. To give this memory back to the OS you can call the gc() function; when memory is actually needed, though, gc is called automatically.

How do I limit memory usage in R, or give R/RStudio more memory?

Use memory.limit(). You can raise the default with memory.limit(size=2500), where size is in MB. Note that memory.limit() applies only to R on Windows.
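A minimal illustration of those calls on a Windows build of R (2500 MB is just the example value from the answer above; memory.limit() does not exist on other platforms and was removed in recent R versions):

gc()                        # run a garbage collection and print a memory summary
memory.limit()              # query the current limit in MB (Windows-only)
memory.limit(size = 2500)   # raise the limit to 2500 MB (Windows-only)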


2 Answers

I use the data.table package. With its := operator you can:

  • Add columns by reference
  • Modify subsets of existing columns by reference, and by group by reference
  • Delete columns by reference

None of these operations copies the (potentially large) data.table at all, not even once; a minimal sketch follows after this list.

  • Aggregation is also particularly fast because data.table uses much less working memory.
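Here is a minimal sketch of those by-reference operations; the table DT and its column names are invented for illustration:

library(data.table)

DT <- data.table(id = rep(1:3, each = 4), x = rnorm(12))

DT[, y := x * 2]                    # add a column by reference
DT[id == 1L, y := 0]                # modify a subset of a column by reference
DT[, grp_mean := mean(x), by = id]  # modify by group, by reference
DT[, y := NULL]                     # delete a column by reference

data.table::tables() lists the data.tables in memory along with their sizes, which pairs well with the lsos() helper from the question.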

Related links:

  • News from data.table, London R presentation, 2012
  • When should I use the := operator in data.table?
Matt Dowle answered Oct 16 '22


Ensure you record your work in a reproducible script. From time to time, reopen R, then source() your script. You'll clean out anything you're no longer using and, as an added benefit, will have tested your code.
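A minimal sketch of that workflow, assuming your work lives in a script called analysis.R (the file name is hypothetical):

# after restarting R with a fresh, empty workspace:
source("analysis.R")   # re-creates only the objects your analysis still needs
lsos()                 # optional: inspect the largest surviving objects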

hadley answered Oct 16 '22