How to understand list(list(object)) in R?

Tags:

r

A list:

list("haha")
[[1]]
[1] "haha"

A list of lists:

list(list("haha"))
[[1]]
[[1]][[1]]
[1] "haha"

I cannot understand the output of list(list("haha")). In my opinion the output should be:

list(list("haha"))
[[1]]
[[1]]
[1] "haha"

What is the magic here?

showkey asked Oct 03 '22 07:10


2 Answers

I agree the format looks odd, but it makes more sense once you start putting multiple objects in your lists. Then it isn't so redundant:

> L<-list(list("haha", "hoho"), list("heehee", "guffaw"))
> L
[[1]]
[[1]][[1]]
[1] "haha"

[[1]][[2]]
[1] "hoho"


[[2]]
[[2]][[1]]
[1] "heehee"

[[2]][[2]]
[1] "guffaw"

EDIT added for @gung: I'm not sure this addresses your point, but what you see is what you get. Like so:

> L[[1]][[2]]
[1] "hoho"
> L[[2]][[1]]
[1] "heehee"

Though I guess it could be less verbose.
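
On the verbosity point, base R's str() gives a more compact view of the same structure. This is only a minimal sketch reusing the L defined above; str_lines and print_lines are illustrative names, not part of the original answer.

```r
# Same nested list as in the answer above
L <- list(list("haha", "hoho"), list("heehee", "guffaw"))

# str() prints one line per component, indented by nesting depth,
# instead of repeating the full [[i]][[j]] path before every value
str(L)

# capture.output() collects the printed lines so they can be counted
str_lines <- capture.output(str(L))
print_lines <- capture.output(print(L))
length(str_lines) < length(print_lines)
```

The default print method stays useful when you want labels you can paste back in as an index; str() is handier for a quick overview.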

EDIT added for @Henrik: or, obviously, with named components:

> L<-list(first=list(x="haha", y="hoho"), second=list(a="heehee", b="guffaw"))
> L
$first
$first$x
[1] "haha"

$first$y
[1] "hoho"


$second
$second$a
[1] "heehee"

$second$b
[1] "guffaw"
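
The printed labels in the named version double as access paths. A short sketch, assuming the same named L as above (key is a hypothetical variable used only for illustration):

```r
L <- list(first = list(x = "haha", y = "hoho"),
          second = list(a = "heehee", b = "guffaw"))

# The printed prefix $first$y is itself a valid way to reach that value
L$first$y

# [[ ]] with a name does the same job and also works when the name
# is stored in a variable
key <- "second"
L[[key]][["b"]]

# Positions still work on a named list
identical(L$second$a, L[[2]][[1]])
```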
Stephen Henderson answered Oct 05 '22 16:10


In R, lists are ordered collections of components. They are indexed differently from other data structures like vectors. Consider how your first example would look if it were a vector:

> c("haha")
[1] "haha"

The [1] is simply enumerating your results for easier reference. In this case, the [1] doesn't help much because it's obvious that you have only one element in your results and that "haha" is the first element, but if your vector were long enough, those notations would be helpful in determining the position of later elements.
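
To see those position labels at work, one option is to print a vector that is too long for a single line. The sketch below narrows the console width temporarily so that the built-in letters vector wraps; the width of 40 and the names old and wrapped are arbitrary choices for illustration.

```r
# Narrow the console width temporarily so a short vector wraps
old <- options(width = 40)

# Each printed line starts with the index of its first element
print(letters)

# The same lines, captured as text for inspection
wrapped <- capture.output(print(letters))
length(wrapped)   # more than one line at this width

options(old)      # restore the previous width
```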

Since you have a list, the output is indexed (and looks) differently:

> list("haha")
[[1]]
[1] "haha"

Now you have two 1's indexing your results. This looks a little ambiguous, but as @Henrik and @Stephen Henderson point out, this is in part because there is only one component in your list. At any rate, the [[1]] is telling you that what follows is the first component, and the [1] is again simply enumerating the results within that component. Since there is only one component, which in turn contains only one element, this is strictly speaking unnecessary, but becomes more helpful when your list has more stuff.
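
The two kinds of bracket in that printout correspond to two different indexing operators. A minimal sketch of the difference, using the same one-component list (inner is just an illustrative name):

```r
x <- list("haha")

is.list(x)        # x is a list...
length(x)         # ...with exactly one component

# [[1]] extracts the component itself, here a character vector
inner <- x[[1]]
is.character(inner)
inner[1]          # the [1] element of that vector

# Single brackets keep the list wrapper: the result is still a list
is.list(x[1])
```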

Note that the components, and contained elements, of lists are always numbered, and are accessible via these numbers. But they can also be named, in which case the output looks a little different:

> list(chuckle="haha")
$chuckle
[1] "haha"

Now the [[1]] is not displayed with the results, but it is still there and you can still use it to access a component of the list:

> l <- list(chuckle="haha")
> l$chuckle
[1] "haha"
> l[[1]]
[1] "haha"

You now have two options available; the [[1]] is still there, should you prefer to use it, although it is not what R prints out by default. Notice further that [1] is still being printed with your results to enumerate them.
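
The equivalence of name and position can be checked directly. This sketch assumes the same l as above; the by_* names are hypothetical and exist only to make the comparison readable.

```r
l <- list(chuckle = "haha")

# Three equivalent ways to reach the same component
by_dollar   <- l$chuckle
by_name     <- l[["chuckle"]]
by_position <- l[[1]]
identical(by_dollar, by_name) && identical(by_name, by_position)

# names() shows the labels R prints in place of [[1]]
names(l)

# Removing the names brings the [[1]] label back in the printout
unname(l)
```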

It may be easier to see some of these features if we have a list whose component contains multiple elements:

> list(letters[1:20])
[[1]]
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m"
[14] "n" "o" "p" "q" "r" "s" "t"

With only one (unnamed) component, we get the standard [[1]]. The [1] marks the first element of the results, and since the results don't all fit on one line, the [14] helpfully tells us that "n" is the fourteenth element. Counting over to "s" as the nineteenth element is now easier than if we had to start from "a".
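
The positions read off those prefixes can be used as indices. A small sketch, assuming the list is stored in a variable x (a name chosen here for illustration):

```r
x <- list(letters[1:20])

length(x)         # one component
length(x[[1]])    # holding twenty elements

# Positions read off the printed prefixes can be used directly
x[[1]][14]        # the element the [14] prefix points at
x[[1]][19]        # five places further along the second line

# which() goes the other way: from a value to its position
which(x[[1]] == "s")
```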

What if we had a list with more than one component? The easiest case would have vectors as components:

> list(c("haha", "hehe"), c("hoho", "hehheh"))
[[1]]
[1] "haha" "hehe"

[[2]]
[1] "hoho"   "hehheh"

The results accurately show that there are two components. The first component is denoted [[1]], and the elements of that component are enumerated starting with [1]. Both the number of components and the elements contained within each are clearly displayed. You can use these indices to access the element you want:

> ll <- list(c("haha", "hehe"), c("hoho", "hehheh"))
> ll[[2]][2]
[1] "hehheh"

The [[2]] indexes the second component (i.e., the vector c("hoho", "hehheh")) and the [2] retrieves the second element of that vector. Notice the [1] is still printed with the results, even though there can only be one element in this particular situation.
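
The same two-step indexing can be applied across all components at once. A sketch under the assumption that ll is the list of vectors defined above; second_each is a hypothetical name.

```r
ll <- list(c("haha", "hehe"), c("hoho", "hehheh"))

# lengths() reports how many elements each component holds
lengths(ll)

# [[2]] picks the second component, [2] its second element
ll[[2]][2]

# Applying `[` to every component pulls the second element of each
second_each <- sapply(ll, `[`, 2)
second_each
```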

For a more complicated case, let's look at a list of lists, instead of a list of vectors:

> list(list("haha", "hehe"), list("hoho", "hehheh"))
[[1]]
[[1]][[1]]
[1] "haha"

[[1]][[2]]
[1] "hehe"


[[2]]
[[2]][[1]]
[1] "hoho"

[[2]][[2]]
[1] "hehheh"

Now this is a mess. What is confusing is that the output appears to denote three components (or at least, there are three [[1]]s at the top), yet it is not clear what the three components could be: there are two components in the outer list and two components in each inner list. When the components were vectors, the results did not look like this. The answer is that the outer list is being denoted twice. The [[1]] on the first row of the results tells you that what follows comes from the first component. On the next row, the [[1]][[1]] tells you that what follows is the first component of the first component (even though the previous row already said these results pertain to the first component). Here's what it looks like if the components are named:

> list(chuckles=list("haha", "hehe"), guffaws=list("hoho", "hehheh"))
$chuckles
$chuckles[[1]]
[1] "haha"

$chuckles[[2]]
[1] "hehe"


$guffaws
$guffaws[[1]]
[1] "hoho"

$guffaws[[2]]
[1] "hehheh"

Now it is easier to see that it is giving you certain information twice in a row. Notice what happens if you ask for only one component:

> lll <- list(list("haha", "hehe"), list("hoho", "hehheh"))
> lll[[1]]
[[1]]
[1] "haha"

[[2]]
[1] "hehe"

Having specified we are only interested in the first (outer) component, it does not tell us this is the first component twice in a row. It does indicate the inner components and enumerate the elements contained, though.
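
One way to read the repeated labels is as full paths from the top of whatever object is being printed. The sketch below, which assumes the lll defined above, uses each label as an index and also shows that [[ ]] accepts a vector of positions and applies them one level at a time (inner is an illustrative name).

```r
lll <- list(list("haha", "hehe"), list("hoho", "hehheh"))

# The printed label [[2]][[1]] is the full path from the top of lll
lll[[2]][[1]]

# [[ ]] also accepts a vector of positions and walks it step by step
lll[[c(2, 1)]]
identical(lll[[2]][[1]], lll[[c(2, 1)]])

# Printing a sub-list restarts the labels from that sub-list's own top
inner <- lll[[1]]
inner[[2]]
```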

It is very hard to see all of this in your example because you have a nested list of only one component containing only one component containing only one element, and none of them are named. Nonetheless, the results denote the first component, and then within the first component (again) the first (contained) component, and then the result(s) are returned with the first (actually only) element enumerated:

> list(list("haha"))
[[1]]
[[1]][[1]]
[1] "haha"
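
The layers in that printout can be peeled off one at a time to confirm the reading. This is only a sketch; x is a name chosen here for illustration.

```r
x <- list(list("haha"))

# Outer list: one component, which is itself a list
length(x)
is.list(x[[1]])

# Inner list: one component, a character vector of length one
is.character(x[[1]][[1]])

# Each level of nesting needs its own [[1]] before the [1] applies
x[[1]][[1]][1]

# list("haha") stops one level earlier, so the two are not the same
identical(x, list("haha"))
```

If the second row were printed as a bare [[1]], as proposed in the question, it could no longer be pasted back as an index that reaches "haha" from the top of x.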
gung - Reinstate Monica answered Oct 05 '22 15:10