Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the internal implementation of lists?

Tags:

list

r

I am curious how an object of type list is implemented. Is it

  1. a dynamic vector that will automatically increase its size when it is full.
  2. a linked list where appending an item is O(1), but accessing an item is O(n).
  3. a tree structure with O(log(n)) item access.
  4. a hashtable with O(1) item access.

I am curious because lists can have key-value pairs that make them look like hash tables, but the elements are in order, which looks like a vector.

Edit: because length(list(runif(1e4))) is 1, so when append element to a list, it looks like that it copy the whole list every time, that makes it very slow:

But the access speed is much slower than a vector:

z1 <- runif(1e4)
system.time({
  for(i in 1:10000) z1[[1 + i]] <- 1
})

outputs:

user  system elapsed 
0.060   0.000   0.062 

but:

z1 <- list(runif(1e4))
system.time({
  for(i in 1:10000) z1[[1 + i]] <- 1
})

outputs:

user  system elapsed 
1.31    0.00    1.31 

init a list with 10000 elements:

z1 <- as.list(runif(1e4))
system.time({
  for(i in 1:10000) z1[[1 + i]] <- 1
})

outputs:

user  system elapsed 
0.060   0.000   0.065 

For the key & value access:

z1 <- list()
for(i in 1:10000){key <- as.character(i); z1[[key]] <- i} 
system.time({
  for(i in 1:10000) x <- z1[["1"]]
})
system.time({
  for(i in 1:10000) x <- z1[["10000"]]
})

The output is:

user  system elapsed 
0.01    0.00    0.01 
user  system elapsed 
1.78    0.00    1.78 

It's not an O(1) access, so it's not a hash table. My conclusion is that it's not a dynamic array since appending items will cause memory accesses every time; it's not a hashtable since access by key is not O(1).

like image 406
HYRY Avatar asked Apr 21 '13 07:04

HYRY


People also ask

How list is implemented in internally?

The STL list requires traversal in both the directions by means of forward and reverse iterator. Hence the list should be implemented as a doubly linked list.

How list is implemented internally in Python?

Python lists are internally represented as arrays. The idea used is similar to implementation of vectors in C++ or ArrayList in Java. The costly operations are inserting and deleting items near the beginning (as everything has to be moved). Insert at the end also becomes costly if preallocated space becomes full.

How are lists implemented?

There are two general-purpose List implementations — ArrayList and LinkedList . Most of the time, you'll probably use ArrayList , which offers constant-time positional access and is just plain fast. It does not have to allocate a node object for each element in the List , and it can take advantage of System.

What do you mean by internal list?

Internal list means an eligibility list of internal candidates seeking promotional positions or reassignments.


1 Answers

Lists are essentially just arrays of R objects (SEXP). Resizing causes copies of the whole data and name lookup is linear.

Alternatively, you can use environments, which use hash tables internally.

like image 102
Romain Francois Avatar answered Oct 03 '22 11:10

Romain Francois