
Here we go again: append an element to a list in R

I am not happy with the accepted answer to Append an object to a list in R in amortized constant time?

> list1 <- list("foo", pi)
> bar <- list("A", "B")

How can I append the new element bar to list1? Clearly, c() does not work; it flattens bar:

> c(list1, bar)
[[1]]
[1] "foo"

[[2]]
[1] 3.141593

[[3]]
[1] "A"

[[4]]
[1] "B"

Assignment to index works:

> list1[[length(list1)+1]] <- bar
> list1
[[1]]
[1] "foo"

[[2]]
[1] 3.141593

[[3]]
[[3]][[1]]
[1] "A"

[[3]][[2]]
[1] "B"

What is the efficiency of this method? Is there a more elegant way?

user443854 asked Jun 11 '13 14:06




1 Answer

Adding elements to a list is very slow when doing it one element at a time. See these two examples:

I'm keeping the Result variable in the global environment to avoid copies to evaluation frames and telling R where to look for it with .GlobalEnv$, to avoid a blind search with <<-:

    Result <- list()

    AddItemNaive <- function(item) {
        .GlobalEnv$Result[[length(.GlobalEnv$Result)+1]] <- item
    }

    system.time(for(i in seq_len(2e4)) AddItemNaive(i))
    #   user  system elapsed
    #  15.60    0.00   15.61

Slow. Now let's try the second approach:

    Result <- list()

    AddItemNaive2 <- function(item) {
        .GlobalEnv$Result <- c(.GlobalEnv$Result, item)
    }

    system.time(for(i in seq_len(2e4)) AddItemNaive2(i))
    #   user  system elapsed
    #  13.85    0.00   13.89

Still slow.
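One caveat about the c() variant, tying back to the question: c() splices a list argument into the result, so a list item would be flattened rather than appended as one element. A minimal sketch, reusing list1 and bar from the question, of wrapping the item in list() first (the names flat, nested and by_index are only for illustration):

```r
list1 <- list("foo", pi)
bar <- list("A", "B")

# c() splices the elements of bar into the result: 4 top-level elements.
flat <- c(list1, bar)

# Wrapping bar in list() appends it as a single third element.
nested <- c(list1, list(bar))

# Assigning to the next index gives the same structure.
by_index <- list1
by_index[[length(by_index) + 1]] <- bar

length(flat)                 # 4
length(nested)               # 3
identical(nested, by_index)  # TRUE
```

This only addresses the flattening. The sketch does not measure speed, so it makes no claim that wrapping in list() changes the timings above.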

Now let's try using an environment, and creating new variables within this environment instead of adding elements to a list. The issue here is that variables must be named, so I'll use the counter as a string to name each item "slot":

    Counter <- 0
    Result <- new.env()

    AddItemEnvir <- function(item) {
        .GlobalEnv$Counter <- .GlobalEnv$Counter + 1

        .GlobalEnv$Result[[as.character(.GlobalEnv$Counter)]] <- item
    }

    system.time(for(i in seq_len(2e4)) AddItemEnvir(i))
    #   user  system elapsed
    #   0.36    0.00    0.38

Whoa, much faster. :-) It may be a little awkward to work with, but it works.
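To make the "awkward to work with" part concrete: once everything is stored, the items usually have to come back out as an ordinary list in insertion order. A rough sketch of one way to do that with mget(), using a small local example rather than the globals above (store, n, keys and as_list are illustrative names, and the integer counter is my choice so that the stored keys and the rebuilt keys are formatted the same way):

```r
store <- new.env()
n <- 0L  # integer counter, so keys match as.character(seq_len(n))

for (x in c(10, 20, 30)) {
    n <- n + 1L
    store[[as.character(n)]] <- x
}

# ls(store) would sort the names as strings ("10" before "2"),
# so rebuild the keys from the counter to keep insertion order.
keys <- as.character(seq_len(n))

# mget() returns a named list in the order of keys; drop the names.
as_list <- unname(mget(keys, envir = store))

identical(as_list, list(10, 20, 30))  # TRUE
```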

A final approach uses a list, but instead of augmenting its size one element at a time, it doubles the size each time the list is full. The list size is also kept in a dedicated variable, to avoid any slowdown using length:

    Counter <- 0
    Result <- list(NULL)
    Size <- 1

    AddItemDoubling <- function(item) {
        if( .GlobalEnv$Counter == .GlobalEnv$Size )
        {
            length(.GlobalEnv$Result) <- .GlobalEnv$Size <- .GlobalEnv$Size * 2
        }

        .GlobalEnv$Counter <- .GlobalEnv$Counter + 1

        .GlobalEnv$Result[[.GlobalEnv$Counter]] <- item
    }

    system.time(for(i in seq_len(2e4)) AddItemDoubling(i))
    #   user  system elapsed
    #   0.22    0.00    0.22

It's even faster, and as easy to work with as any list.
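One detail the doubling approach leaves open is that the list ends up padded with NULL slots beyond the last item, so it needs trimming before use. Below is a sketch of the same doubling idea wrapped in a closure instead of global variables. This is a variation of mine, not the code timed above: make_accumulator, add and values are hypothetical names, and it uses <<- inside the closure rather than .GlobalEnv$.

```r
make_accumulator <- function() {
    items <- vector("list", 1L)  # storage; starts with capacity 1
    count <- 0L                  # number of slots actually filled

    add <- function(item) {
        if (count == length(items)) {
            # Double the capacity; the new slots are filled with NULL.
            length(items) <<- 2L * length(items)
        }
        count <<- count + 1L
        # Single-bracket assignment of list(item) also stores NULL items.
        items[count] <<- list(item)
        invisible(NULL)
    }

    # Return only the filled prefix, dropping the unused padding.
    values <- function() items[seq_len(count)]

    list(add = add, values = values)
}

acc <- make_accumulator()
for (i in 1:5) acc$add(i)
acc$add(list("A", "B"))  # a list is stored as one element, not flattened
result <- acc$values()
length(result)           # 6
```

I have not timed this version, so whether it matches the numbers above is an open question; the point is only the trimming step and keeping the state out of the global environment.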

Let's try these last two solutions with more iterations:

    Counter <- 0
    Result <- new.env()

    system.time(for(i in seq_len(1e5)) AddItemEnvir(i))
    #   user  system elapsed
    #  27.72    0.06   27.83

    Counter <- 0
    Result <- list(NULL)
    Size <- 1

    system.time(for(i in seq_len(1e5)) AddItemDoubling(i))
    #   user  system elapsed
    #   9.26    0.00    9.32

Well, the last one is definitely the way to go.

Ferdinand.kraft answered Sep 22 '22 03:09