I have created a doubly nested structure for some data. How can I access the data at the 2nd level (or, for that matter, the nth level)?
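library(gapminder)
library(dplyr)   # provides group_by() and mutate()
library(purrr)
library(tidyr)
gapminder
# Note: .key is the tidyr < 1.0 interface; tidyr 1.0 introduced the
# nest(new_col = cols) form, so newer versions may warn here.
nest_data <- gapminder %>% group_by(continent) %>% nest(.key = by_continent)
nest_2 <- nest_data %>% mutate(by_continent = map(by_continent, ~ .x %>% group_by(country) %>% nest(.key = by_country)))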
How can I now get the data for China into a dataframe or tibble from nest_2?
I can get the data for all of Asia, but I'm unable to isolate China.
a<-nest_2[nest_2$continent=="Asia",]$by_continent  ##Any better way of isolating Asia from nest_2?
I thought I could then do
b<-a[a$country=="China",]$by_country 
But I get the following error
Error in a[a$country == "China", ] : incorrect number of dimensions 
> glimpse(a)
List of 1
 $ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   33 obs. of  2 variables:
  ..$ country   : Factor w/ 142 levels "Afghanistan",..: 1 8 9 19 25 56 59 60 61 62 ...
  ..$ by_country:List of 33
So my big error was not recognizing that the result was a list, which can be remedied by adding [[1]] at the end. However, I very much liked the solution by @Floo0. I took the liberty of writing a variant that takes the column names as arguments, in case the order of the columns differs from the one assumed there.
select_unnest <- function(df, listcol, var, var_val){  # listcol, var and var_val must be quoted strings
  df[[listcol]][df[[var]]==var_val][[1]]
}
nest_2 %>% select_unnest(listcol = "by_continent", var = "continent", var_val = "Asia") %>% 
  select_unnest(listcol = "by_country", var = "country", var_val = "China")
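The [[1]] fix described above can also be spelled out step by step. The following is only a sketch on a small toy tibble (the `toy` object and its values are made up for illustration and stand in for nest_2), assuming the tibble package is installed:

```r
library(tibble)

# Toy stand-in for nest_2 (hypothetical values, not the gapminder data)
toy <- tibble(
  continent = c("Asia", "Europe"),
  by_continent = list(
    tibble(
      country = c("China", "Japan"),
      by_country = list(
        tibble(year = c(1952, 1957), pop = c(1, 2)),
        tibble(year = c(1952, 1957), pop = c(3, 4))
      )
    ),
    tibble(
      country = "France",
      by_country = list(tibble(year = 1952, pop = 5))
    )
  )
)

# Row subsetting plus $ keeps the list wrapper: a is a list of length 1,
# which is why a[a$country == "China", ] fails
a <- toy[toy$continent == "Asia", ]$by_continent
class(a)

# [[1]] pulls out the tibble inside, which can be subset by row again
asia <- a[[1]]
china <- asia[asia$country == "China", ]$by_country[[1]]
china
```

The same two steps (subset the rows, then take [[1]] of the list-column) repeat at every level of nesting.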
You can also use negative indices on a list, but in R they exclude elements rather than count backward from the end: L[-1] returns the list without its first item, L[-2] returns it without its second item, and so on. To get the last item, use L[[length(L)]].
The items of a list can be accessed by their index numbers. In R, indexing starts at 1: the first element of a list or vector has index 1, the second has index 2, and so on. Strings are not indexed character by character; to get the first character "H" of the string "Hello", use substr("Hello", 1, 1).
When you want to insert an item at a specific position in a list, use append() with its after argument. You can merge one list into another with c(). R has no pop() method; if you know the index of the item you want, read it with [[i]] and then remove it by assigning NULL to that position.
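In base R, the insert, merge, and remove operations mentioned above can be sketched with append(), c(), and NULL assignment. The list contents here are made up for illustration:

```r
L <- list("a", "b", "c")

# Insert "x" after position 1 (append() returns a new list)
L2 <- append(L, list("x"), after = 1)

# Merge two lists with c()
merged <- c(L, list("d", "e"))

# "Pop" the second item: read it, then drop it by assigning NULL
popped <- L[[2]]
L[[2]] <- NULL

# A negative index drops an element; it does not count from the end
without_first <- merged[-1]
```

Note that none of these calls modify a list in place except the NULL assignment; append() and c() return new lists that have to be assigned.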
To extract only the first element from each element of a list, use sapply with the double-bracket operator. For example, if a list called LIST has 5 elements, each containing 20 elements, then sapply(LIST, "[[", 1) extracts the first sub-element of each.
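A minimal sketch of that sapply pattern, using a made-up LIST of 5 numeric vectors with 20 values each:

```r
# Hypothetical list: element i is the sequence i, i + 10, i + 20, ...
LIST <- lapply(1:5, function(i) seq(from = i, by = 10, length.out = 20))

# "[[" is passed as the function; 1 is the index handed to it
firsts <- sapply(LIST, "[[", 1)
firsts
```

Because every extracted value is a single number, sapply simplifies the result to a plain vector with one entry per element of LIST.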
This is a pipe-able (%>%) base R approach
select_unnest <- function(x, select_val){
  x[[2]][x[[1]]==select_val][[1]]
}
nest_2 %>% select_unnest("Asia") %>% select_unnest("China")
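For the nth level, the chained calls can be folded over a vector of keys with base R's Reduce(). This is a sketch only: select_unnest_path is a hypothetical helper name, the `toy` tibble is made-up data standing in for nest_2, and it assumes every level keeps the key in column 1 and the list-column in column 2.

```r
library(tibble)

# Positional helper as above: column 1 is the key, column 2 the list-column
select_unnest <- function(x, select_val) {
  x[[2]][x[[1]] == select_val][[1]]
}

# Hypothetical helper: apply select_unnest once per key, left to right
select_unnest_path <- function(x, path) {
  Reduce(select_unnest, path, x)
}

# Toy doubly nested tibble (hypothetical values)
toy <- tibble(
  continent = c("Asia", "Europe"),
  by_continent = list(
    tibble(
      country = c("China", "Japan"),
      by_country = list(
        tibble(year = c(1952, 1957), pop = c(1, 2)),
        tibble(year = c(1952, 1957), pop = c(3, 4))
      )
    ),
    tibble(
      country = "France",
      by_country = list(tibble(year = 1952, pop = 5))
    )
  )
)

china <- select_unnest_path(toy, c("Asia", "China"))
china
```

With the real data the call would presumably be select_unnest_path(nest_2, c("Asia", "China")); adding a deeper level only means adding another key to the vector.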
Comparing the timings:
Unit: microseconds
                min        lq      mean   median        uq       max neval
aosmith1   3202.105 3354.0055 4045.9602 3612.126 4179.9610 17119.495   100
aosmith2   5797.744 6191.9380 7327.6619 6716.445 7662.6415 24245.779   100
Floo0       227.169  303.3280  414.3779  346.135  400.6735  4804.500   100
Ben Bolker  622.267  720.6015  852.9727  775.172  875.5985  1942.495   100
Code:
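# Expressions are listed in the same order as the rows of the timing table
microbenchmark::microbenchmark(
  # aosmith1: flatten the Asia list-column, then filter and unnest
  {a<-nest_2[nest_2$continent=="Asia",]$by_continent
  flatten_df(a) %>%
    filter(country == "China") %>%
    unnest},
  # aosmith2: filter and unnest at each level
  {nest_2 %>%
      filter(continent == "Asia") %>%
      select(by_continent) %>%
      unnest%>%
      filter(country == "China") %>%
      unnest},
  # Floo0: positional select_unnest() helper
  {nest_2 %>% select_unnest("Asia") %>% select_unnest("China")},
  # Ben Bolker: base subsetting, then filter
  {n1 <- nest_2$by_continent[nest_2$continent=="Asia"][[1]]
  n2 <- n1 %>% filter(country=="China")
  n2$by_country[[1]]}
)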