
Empty R environment becomes large file when saved

Tags:

r

I'm getting behaviour I don't understand when saving environments. The code below demonstrates the problem. I would have expected the two files (far-too-big.RData, and right-size.RData) to be the same size, and also very small because the environments they contain are empty.

In fact, far-too-big.RData ends up the same size as bigfile.RData.

I get the same results using 2.14.1 and 2.15.2, both on WinXP 5.1 SP3. Can anyone explain why this is happening?

Both far-too-big.RData and right-size.RData, when loaded into a new R session, appear to contain nothing, i.e. ls() returns character(0) for each of the loaded environments. However, if I switch the saves to include ascii=TRUE and open the result in a text editor, I can see that far-too-big.RData contains the data in bigfile.RData.

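a <- matrix(runif(1000000, 0, 1), ncol=1000)    # 1000 x 1000 matrix of random numbers
save(a, file="bigfile.RData")
fn <- function() {
    load("bigfile.RData")                       # loads a into fn()'s own environment
    test <- new.env()                           # parent defaults to fn()'s environment
    save(test, file="far-too-big.RData")
    test1 <- new.env(parent=globalenv())        # parent set explicitly to the global environment
    save(test1, file="right-size.RData")
}
fn()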
SJC asked Dec 17 '12 11:12



1 Answer

This is not my area of expertise, but I believe environments work like this.

  • Any environment can see everything in its parent environment: objects not found in the environment itself are looked up in its parent.
  • All function calls create their own environment.
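A minimal sketch of these two points, using hypothetical names (outer_fn, child, x) that are not part of the original question. It only illustrates that an environment created inside a function is itself empty, yet can still reach objects in the function's environment through its parent:

```r
# Hypothetical names throughout; nothing here comes from the original post.
outer_fn <- function() {
    x <- 42                # lives in outer_fn()'s own environment
    child <- new.env()     # parent defaults to outer_fn()'s environment
    list(
        own_names   = ls(child),                                     # child's own frame
        found_local = exists("x", envir = child, inherits = FALSE),  # look in child only
        found_chain = exists("x", envir = child, inherits = TRUE),   # also look in parents
        value       = get("x", envir = child)                        # get() searches parents by default
    )
}
res <- outer_fn()
```

Here res$own_names is empty and res$found_local is FALSE, while res$found_chain is TRUE, which is the sense in which the child "inherits" from its parent.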

The result of the above in your case is:

  1. When you run fn() it creates its own local environment (green), whose parent by default is globalenv() (grey).
  2. When you create the environment test (red) inside fn(), its parent defaults to fn()'s environment (green). test itself is empty, but it can therefore reach the object a, which load() placed in fn()'s environment.
  3. When you create the environment test1 (blue) and explicitly state that its parent is globalenv(), it is separated from fn()'s environment and does not inherit the object a.

So when saving test you also save a (somewhat hidden) copy of the object a, because save() writes out the enclosing environment along with test. This does not happen when you save test1, as nothing links it to a.
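The explanation above can be checked with a scaled-down variant of the question's code. This is only a sketch: it assumes temporary files and a smaller matrix than the original, and the names big_path, child_path and clean_path are made up for illustration.

```r
# Temporary files stand in for bigfile.RData, far-too-big.RData and right-size.RData
big_path   <- tempfile(fileext = ".RData")
child_path <- tempfile(fileext = ".RData")
clean_path <- tempfile(fileext = ".RData")

a <- matrix(runif(10000), ncol = 100)   # smaller than the original, same idea
save(a, file = big_path)

fn <- function() {
    load(big_path)                          # puts a into fn()'s environment
    test <- new.env()                       # enclosed by fn()'s environment
    save(test, file = child_path)
    test1 <- new.env(parent = globalenv())  # enclosed by the global environment
    save(test1, file = clean_path)
    here <- environment()                   # fn()'s own environment
    # Report which environment encloses each of the two new environments
    c(test_in_fn      = identical(parent.env(test), here),
      test1_in_global = identical(parent.env(test1), globalenv()))
}
parents <- fn()
sizes <- file.info(c(big_path, child_path, clean_path))$size
```

Both elements of parents should be TRUE, and the second entry of sizes (the environment enclosed by fn()) should be of the same order as the first, while the third stays tiny.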

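[Figure: diagram of the environments, colour-coded as in the list above: globalenv() in grey, fn()'s environment in green, test in red, test1 in blue]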

Update

Apparently this is a more complicated topic than I used to believe. Although I might just be repeating @joris-mays' answer now, I'd like to take a final go at it.

To me the most intuitive visualization of environments would be a tree structure, see below, where each node is an environment and the arrows point to its respective enclosing environment (which I would like to believe is the same as its parent, but that has to do with frames and is beyond my corner of the world). A given environment encloses all objects you can reach by moving down the tree and it can access all objects you can reach by moving up the tree. When you save an environment it appears you save all objects and environments that are both enclosed by it and accessible from it (with the exception of globalenv()).
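To make the tree picture concrete, here is a small sketch that walks from an environment up through its enclosing environments with parent.env(). The helper env_chain is hypothetical and written only for this illustration:

```r
# Hypothetical helper: collect an environment and all its enclosing environments
env_chain <- function(env) {
    chain <- list()
    while (!identical(env, emptyenv())) {   # emptyenv() is the root of the tree
        chain[[length(chain) + 1]] <- env
        env <- parent.env(env)
    }
    chain
}

fn <- function() {
    test <- new.env()   # enclosed by fn()'s environment
    env_chain(test)
}
chain <- fn()

# Position of the global environment within the chain
global_pos <- which(vapply(chain, identical, logical(1), globalenv()))
```

The first element of chain is test, the second is fn()'s environment, and the global environment appears further along, followed by the attached packages and finally the base environment.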

However, the take-home message is, as Joris already stated: save your objects as lists and you don't need to worry.
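One way this advice might look in practice is sketched below. The environment e, its contents and the file path are made up for illustration; as.list() and list2env() are the base R functions used to convert in each direction.

```r
# Hypothetical environment with a couple of objects in it
e <- new.env()
assign("x", 1:3, envir = e)
assign("label", "demo", envir = e)

contents <- as.list(e)   # plain named list; the enclosing environment is not part of it
path <- tempfile(fileext = ".RData")
save(contents, file = path)

# Later, possibly in a new session: load the list and rebuild an environment
loaded <- new.env()
load(path, envir = loaded)
restored <- list2env(loaded$contents, envir = new.env(parent = globalenv()))
```

Note that this only helps for ordinary data objects. If the list contains functions, each function still carries its own environment, which would be saved along with it.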

[Figure: tree of environments, with arrows pointing from each environment to its enclosing environment]

If you want to know more, I can recommend Norman Matloff's excellent book The Art of R Programming. It is aimed at software development in R rather than primary data analysis and assumes you have a fair bit of programming experience. I must admit I haven't fully digested the part on environments yet, but as the rest of the book is very well written and pedagogical, I assume this part is too.

Backlin answered Oct 03 '22 18:10