I'm trying to create a data structure that the whisker
package expects, and I can't seem to figure out how to
create that structure from my data frame. Let's say I have the following data frame:
library(dplyr)
existing_format <-
  mtcars %>%
  select(carb, gear, cyl) %>%
  arrange(carb, gear, cyl) %>%
  distinct()
...I would like to go from existing_format to the following desired format (only the first two elements of the desired_format list are shown):
desired_format <- list(
  list(
    carb = "1",
    gear = list(
      list(gear = "3", cyl = list(list(cyl = "4"), list(cyl = "6"))),
      list(gear = "4", cyl = list(list(cyl = "4")))
    )
  ),
  list(
    carb = "2",
    gear = list(
      list(gear = "3", cyl = list(list(cyl = "8"))),
      list(gear = "4", cyl = list(list(cyl = "4"))),
      list(gear = "5", cyl = list(list(cyl = "4")))
    )
  )
)
I've tried things like grouping by carb and gear, then using tidyr::nest() to create a nested data frame, but nothing has worked. Something tells me that whisker::iteratelist() or whisker::rowSplit() is the way forward, but I can't figure it out.
Thanks, Chris
Perhaps more flexible than it needs to be in this case, but you can do a recursive split:
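rsplit <- function(dd) {
  # Split on the first column; its name becomes the key at this level.
  col <- names(dd)[1]
  dat <- dd[[1]]
  xx <- lapply(unique(dat), function(x) {
    # One list element per distinct value, e.g. list(carb = 1).
    z <- setNames(list(x), col)
    if (ncol(dd) > 1) {
      # Recurse on the matching rows with the first column dropped,
      # storing the result under the next column's name.
      z[[names(dd)[2]]] <- rsplit(dd[dat == x, -1, drop = FALSE])
    }
    z
  })
  xx
}
rsplit(existing_format)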
This will split on all the columns and use the names from the column headers.
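One detail worth checking: rsplit() keeps each value's original type, so the numeric columns of existing_format give numeric leaves, whereas desired_format uses strings. A minimal sketch of one way to handle that, assuming a character conversion up front is acceptable. The toy data frame is hypothetical and stands in for existing_format, and rsplit() is repeated only so the snippet runs on its own:

```r
# Recursive split, repeated from above so this sketch is self-contained.
rsplit <- function(dd) {
  col <- names(dd)[1]
  dat <- dd[[1]]
  lapply(unique(dat), function(x) {
    z <- setNames(list(x), col)
    if (ncol(dd) > 1) {
      z[[names(dd)[2]]] <- rsplit(dd[dat == x, -1, drop = FALSE])
    }
    z
  })
}

# Hypothetical toy data standing in for existing_format.
toy <- data.frame(
  carb = c(1, 1, 2),
  gear = c(3, 4, 3),
  cyl  = c(4, 4, 8)
)

# Convert every column to character; assigning into toy[] keeps the
# data frame structure instead of replacing it with a plain list.
toy[] <- lapply(toy, as.character)

nested <- rsplit(toy)
str(nested[[1]])
```

The same conversion can be applied to existing_format before calling rsplit() if string values are needed.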
Here's a way, not general for n columns, but it works for 3.
library(purrr)
library(magrittr)
library(dplyr)
output <- existing_format %>%
  map_df(as.character) %>%
  group_by(carb, gear) %>%
  summarize_at("cyl", ~lst(map(., ~lst(cyl = .x)))) %>%
  mutate(gear = map2(.x = gear, .y = cyl, ~lst(gear = .x, cyl = .y))) %>%
  group_by(carb) %>%
  summarize_at("gear", ~lst(gear = .)) %$%
  map2(.x = carb, .y = gear, ~lst(carb = .x, gear = .y))
identical(output[1:2], desired_format) # TRUE