
Splitting and manipulating nested lists

Tags: split, r
I'm trying to split a nested list by a group variable. Please consider the following structure:

> str(L1)
List of 2
 $ names:List of 2
  ..$ first: chr [1:5] "john" "lisa" "anna" "mike" ...
  ..$ last : chr [1:5] "johnsson" "larsson" "johnsson" "catell" ...
 $ stats:List of 2
  ..$ physical:List of 2
  .. ..$ age   : num [1:5] 14 22 53 23 31
  .. ..$ height: num [1:5] 165 176 179 182 191
  ..$ mental  :List of 1
  .. ..$ iq: num [1:5] 102 104 99 87 121

Now I need to produce two lists, both split using L1$names$last, resulting in L2 and L3 as shown below:

L2: Result grouped by L1$names$last

> str(L2) 
List of 3
 $ johnsson:List of 2
  ..$ names:List of 1
  .. ..$ first: chr [1:2] "john" "anna"
  ..$ stats:List of 2
  .. ..$ physical:List of 2
  .. .. ..$ age   : num [1:2] 14 53
  .. .. ..$ height: num [1:2] 165 179
  .. ..$ mental  :List of 1
  .. .. ..$ iq: num [1:2] 102 99
 $ larsson :List of 2
  ..$ names:List of 1
  .. ..$ first: chr [1:2] "lisa" "steven"
  ..$ stats:List of 2
  .. ..$ physical:List of 2
  .. .. ..$ age   : num [1:2] 22 31
  .. .. ..$ height: num [1:2] 176 191
  .. ..$ mental  :List of 1
  .. .. ..$ iq: num [1:2] 104 121
 $ catell  :List of 2
  ..$ names:List of 1
  .. ..$ first: chr "mike"
  ..$ stats:List of 2
  .. ..$ physical:List of 2
  .. .. ..$ age   : num 23
  .. .. ..$ height: num 182
  .. ..$ mental  :List of 1
  .. .. ..$ iq: num 87

L3: Each group only permits one occurrence of L1$names$last

> str(L3)
List of 2
 $ 1:List of 2
  ..$ names:List of 2
  .. ..$ first: chr [1:3] "john" "lisa" "mike"
  .. ..$ last : chr [1:3] "johnsson" "larsson" "catell"
  ..$ stats:List of 2
  .. ..$ physical:List of 2
  .. .. ..$ age   : num [1:3] 14 22 23
  .. .. ..$ height: num [1:3] 165 176 182
  .. ..$ mental  :List of 1
  .. .. ..$ iq: num [1:3] 102 104 87
 $ 2:List of 2
  ..$ names:List of 2
  .. ..$ first: chr [1:2] "anna" "steven"
  .. ..$ last : chr [1:2] "johnsson" "larsson"
  ..$ stats:List of 2
  .. ..$ physical:List of 2
  .. .. ..$ age   : num [1:2] 53 31
  .. .. ..$ height: num [1:2] 179 191
  .. ..$ mental  :List of 1
  .. .. ..$ iq: num [1:2] 99 121

I've tried to apply this solution, but it doesn't appear to work for nested lists.

Reproducible code:

L1 <- list("names" = list("first" = c("john","lisa","anna","mike","steven"),"last" = c("johnsson","larsson","johnsson","catell","larsson")),"stats" = list("physical" = list("age" = c(14,22,53,23,31), "height" = c(165,176,179,182,191)), "mental" = list("iq" = c(102,104,99,87,121))))

L2 <- list("johnsson" = list("names" = list("first" = c("john","anna")),"stats" = list("physical" = list("age" = c(14,53), "height" = c(165,179)), "mental" = list("iq" = c(102,99)))), "larsson" = list("names" = list("first" = c("lisa","steven")),"stats" = list("physical" = list("age" = c(22,31), "height" = c(176,191)), "mental" = list("iq" = c(104,121)))), "catell" = list("names" = list("first" = "mike"),"stats" = list("physical" = list("age" = 23, "height" = 182), "mental" = list("iq" = 87))))

L3 <- list("1" = list("names" = list("first" = c("john","lisa","mike"),"last" = c("johnsson","larsson","catell")),"stats" = list("physical" = list("age" = c(14,22,23), "height" = c(165,176,182)), "mental" = list("iq" = c(102,104,87)))), "2" = list("names" = list("first" = c("anna","steven"),"last" = c("johnsson","larsson")),"stats" = list("physical" = list("age" = c(53,31), "height" = c(179,191)), "mental" = list("iq" = c(99,121)))))

EDIT: Please note that the actual dataset is quite large and more deeply nested than the provided example.

asked Jul 17 '17 22:07 by Comfort Eagle



2 Answers

Usually for modifying lists you will want to use recursion. For example, consider this function:

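foo <- function(x, idx) {
    # x is still a list: recurse into each element, passing idx along
    if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
    }
    # x is a plain vector: keep only the requested elements
    return(x[idx])
}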

It takes a list x and a set of indices idx. It checks whether x is a list and, if so, lapplies itself to every sub-element of that list. Once x is no longer a list, it returns the elements of x selected by idx. Throughout the process, the structure of the original list remains intact.
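As a quick check of that behaviour, here is a minimal sketch on a small made-up list (x and its contents are illustrative, not from the question). It also shows base R's rapply() with how = "list", which performs the same kind of recursive traversal and could stand in for the hand-written recursion if preferred.

```r
# Recursive subsetting helper from the answer above
foo <- function(x, idx) {
    if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
    }
    x[idx]
}

# Illustrative toy data (hypothetical, not from the question)
x <- list(a = c(10, 20, 30), b = list(c = c("p", "q", "r")))

# Numeric and logical indices both work, because the leaves are subset with `[`
by_pos  <- foo(x, c(1, 3))
by_mask <- foo(x, c(TRUE, FALSE, TRUE))

# Alternative: rapply() walks the nested list and applies the function to each leaf
by_rapply <- rapply(x, function(v) v[c(1, 3)], how = "list")

str(by_pos)
```

Each result keeps the nesting of x and holds elements 1 and 3 of every leaf vector.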

Here is a full example. Note that this code assumes all vectors in the list have the same length (5 here).

L1 <- list("names" = list("first" = c("john","lisa","anna","mike","steven"),"last" = c("johnsson","larsson","johnsson","catell","larsson")),"stats" = list("physical" = list("age" = c(14,22,53,23,31), "height" = c(165,176,179,182,191)), "mental" = list("iq" = c(102,104,99,87,121))))

L2 <- list("johnsson" = list("names" = list("first" = c("john","anna")),"stats" = list("physical" = list("age" = c(14,53), "height" = c(165,179)), "mental" = list("iq" = c(102,99)))), "larsson" = list("names" = list("first" = c("lisa","steven")),"stats" = list("physical" = list("age" = c(22,31), "height" = c(176,191)), "mental" = list("iq" = c(104,121)))), "catell" = list("names" = list("first" = "mike"),"stats" = list("physical" = list("age" = 23, "height" = 182), "mental" = list("iq" = 87))))

L3 <- list("1" = list("names" = list("first" = c("john","lisa","mike"),"last" = c("johnsson","larsson","catell")),"stats" = list("physical" = list("age" = c(14,22,23), "height" = c(165,176,182)), "mental" = list("iq" = c(102,104,87)))), "2" = list("names" = list("first" = c("anna","steven"),"last" = c("johnsson","larsson")),"stats" = list("physical" = list("age" = c(53,31), "height" = c(179,191)), "mental" = list("iq" = c(99,121)))))

# make L2
foo <- function(x, idx) {

    if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
    }
    return(x[idx])
}

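# one group per distinct surname, in order of first appearance
levels <- unique(L1$names$last)
L2_2 <- vector("list", length(levels))
names(L2_2) <- levels
for (i in seq_along(L2_2)) {
    # logical index of the entries belonging to this surname
    idx <- L1$names$last == names(L2_2)[i]
    # drop names$last (element 2 of names) and subset every remaining vector
    L2_2[[i]] <- list(names = foo(L1$names[-2], idx),
                      stats = foo(L1$stats, idx))
}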
identical(L2, L2_2)

str(L2)
str(L2_2)

# make L3

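# TRUE for every surname that has already occurred earlier in the vector
dups <- duplicated(L1$names$last)
L3_2 <- vector("list", 2)
names(L3_2) <- 1:2
for (i in 1:2) {
    # group 1: first occurrence of each surname; group 2: repeats
    # (this assumes no surname occurs more than twice)
    idx <- if (i == 1) !dups else dups
    L3_2[[i]] <- foo(L1, idx)
}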
identical(L3, L3_2)
str(L3)
str(L3_2)
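
Both loops can also be written without hard-coding the number of groups. The following is a sketch, not part of the original answer: it reuses foo, builds the index vectors with split(), and numbers repeated surnames with ave(), so a surname occurring three or more times would simply produce a third group. The names g, occ, rows_by_last, by_last and by_occ are introduced here for illustration only.

```r
L1 <- list(
    names = list(first = c("john", "lisa", "anna", "mike", "steven"),
                 last  = c("johnsson", "larsson", "johnsson", "catell", "larsson")),
    stats = list(physical = list(age    = c(14, 22, 53, 23, 31),
                                 height = c(165, 176, 179, 182, 191)),
                 mental   = list(iq = c(102, 104, 99, 87, 121)))
)

# Recursive subsetting helper from the answer
foo <- function(x, idx) {
    if (is.list(x)) {
        return(lapply(x, foo, idx = idx))
    }
    x[idx]
}

g <- L1$names$last

# L2-style: one group per surname, kept in order of first appearance
rows_by_last <- split(seq_along(g), factor(g, levels = unique(g)))
by_last <- lapply(rows_by_last, function(i) {
    list(names = foo(L1$names["first"], i),   # keep only names$first
         stats = foo(L1$stats, i))
})

# L3-style: occ is 1 for the first occurrence of a surname, 2 for the second, ...
occ <- ave(seq_along(g), g, FUN = seq_along)
by_occ <- lapply(split(seq_along(g), occ), function(i) foo(L1, i))

str(by_occ)
```

Because foo recurses to whatever depth the list has, the same two lapply calls should carry over to a more deeply nested L1, as long as every leaf vector has the same length as L1$names$last.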
answered Oct 06 '22 18:10 by Vandenman


This isn't a complete answer but I hope it helps.

See if this works for L3:

library(data.table)  # shift() comes from data.table

x = data.frame(L1, stringsAsFactors = FALSE)
y = x[order(x$names.last),]
y$seq = 1
# 2 if the surname equals the previous row's surname, otherwise 1
y$seq = ifelse(y$names.last == shift(y$names.last), shift(y$seq) + 1, 1)
y$seq[1] = 1

# assumed missing step: split the rows by seq so that z[[1]] and z[[2]] exist
z = split(y, y$seq)

z = list(list(names = list(first = z[[1]]$names.first, last = z[[1]]$names.last),
              stats = list(physical = list(age = z[[1]]$stats.physical.age, height = z[[1]]$stats.physical.height),
                           mental = list(iq = z[[1]]$stats.iq))),
         list(names = list(first = z[[2]]$names.first, last = z[[2]]$names.last),
              stats = list(physical = list(age = z[[2]]$stats.physical.age, height = z[[2]]$stats.physical.height),
                           mental = list(iq = z[[2]]$stats.iq))))

The last part (z), which transforms the data back into a list, can be done with a loop. Assuming the same name doesn't come up too often, the loop wouldn't be too slow.

You say the real data is more deeply nested, in which case you will need to add is.null and/or tryCatch calls to deal with errors.
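
The loop mentioned above could look roughly like this. It is a sketch under assumptions: it swaps the shift()-based counter for base R's ave(), so no extra package is needed and the original row order is kept, and rebuild is a hypothetical helper name. The column names are the ones data.frame(L1) produces for the example data; a more deeply nested list would need a different rebuild function.

```r
L1 <- list(
    names = list(first = c("john", "lisa", "anna", "mike", "steven"),
                 last  = c("johnsson", "larsson", "johnsson", "catell", "larsson")),
    stats = list(physical = list(age    = c(14, 22, 53, 23, 31),
                                 height = c(165, 176, 179, 182, 191)),
                 mental   = list(iq = c(102, 104, 99, 87, 121)))
)

# Flatten the nested list into one row per person
x <- data.frame(L1, stringsAsFactors = FALSE)

# Occurrence number of each surname (1 = first time seen, 2 = second, ...)
x$seq <- ave(seq_along(x$names.last), x$names.last, FUN = seq_along)

# Hypothetical helper: turn one chunk of rows back into the nested layout
rebuild <- function(d) {
    list(names = list(first = d$names.first, last = d$names.last),
         stats = list(physical = list(age = d$stats.physical.age,
                                      height = d$stats.physical.height),
                      mental = list(iq = d$stats.iq)))
}

# One nested list per occurrence number, instead of writing z[[1]], z[[2]] by hand
L3_df <- lapply(split(x, x$seq), rebuild)

str(L3_df)
```

Using lapply over split(x, x$seq) means the number of groups follows from the data rather than being typed out.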

answered Oct 06 '22 18:10 by Olivia