accessing nested lists in R

I have created a doubly nested structure for some data. How can I access the data on the 2nd level (or, for that matter, the nth level)?

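library(gapminder)
library(dplyr)   # mutate() and %>%
library(purrr)   # map()
library(tidyr)   # nest()
gapminder

# First level: one row per continent, everything else nested in by_continent.
# (tidyr >= 1.0 syntax; older versions used group_by(continent) %>% nest(.key = by_continent))
nest_data <- gapminder %>% nest(by_continent = -continent)

# Second level: within each continent, one row per country
nest_2 <- nest_data %>% mutate(by_continent = map(by_continent, ~ .x %>% nest(by_country = -country)))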

How can I now get the data for China into a dataframe or tibble from nest_2?

I can get the data for all of Asia, but I'm unable to isolate China.

a<-nest_2[nest_2$continent=="Asia",]$by_continent  ##Any better way of isolating Asia from nest_2?

I thought I could then do

b<-a[a$country=="China",]$by_country

But I get the following error

Error in a[a$country == "China", ] : incorrect number of dimensions 



> glimpse(a)
List of 1
 $ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   33 obs. of  2 variables:
  ..$ country   : Factor w/ 142 levels "Afghanistan",..: 1 8 9 19 25 56 59 60 61 62 ...
  ..$ by_country:List of 33

So my main mistake was not recognizing that the result was a list of length one, which can be fixed by appending [[1]]. However, I really liked the solution by @Floo0. I took the liberty of writing a variant that takes the column names as arguments, in case the column order differs from the one assumed there.

select_unnest <- function(df, listcol, var, var_val) {
  # listcol, var and var_val must be passed as quoted strings
  df[[listcol]][df[[var]] == var_val][[1]]
}

nest_2 %>% select_unnest(listcol = "by_continent", var = "continent", var_val = "Asia") %>% 
  select_unnest(listcol = "by_country", var = "country", var_val = "China")
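
For reference, the [[1]] fix mentioned above can be sketched as follows. This is only a minimal illustration: it rebuilds nest_2 with the tidyr >= 1.0 form of nest() (an assumption about the installed version), and the names a_list, asia and china are introduced here purely for the example.

```r
library(gapminder)
library(dplyr)
library(purrr)
library(tidyr)

# Rebuild the doubly nested tibble (assumes tidyr >= 1.0)
nest_2 <- gapminder %>%
  nest(by_continent = -continent) %>%
  mutate(by_continent = map(by_continent, ~ nest(.x, by_country = -country)))

# Subsetting a list column returns a list, here of length 1 ...
a_list <- nest_2[nest_2$continent == "Asia", ]$by_continent

# ... so [[1]] is needed to reach the tibble stored inside it
asia <- a_list[[1]]

# The same applies one level down
china <- asia[asia$country == "China", ]$by_country[[1]]
china
```

Without the [[1]], a_list is a plain list with no rows or columns, which is why a[a$country == "China", ] fails with "incorrect number of dimensions".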

asked Sep 02 '16 11:09

Misha

1 Answer

This is a pipe-able (%>%) base R approach. It selects by position, so it assumes the key column comes first and the list column second:

select_unnest <- function(x, select_val){
  x[[2]][x[[1]]==select_val][[1]]
}

nest_2 %>% select_unnest("Asia") %>% select_unnest("China")
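
Since the question also asks about the nth level, one possible generalisation is to fold the same function over a vector of keys with base R's Reduce(). The sketch below is not part of the original answer: drill_down is a hypothetical helper name, and the three-level toy data (continents, countries, cities, with made-up values) exists only to make the example self-contained.

```r
# Same positional selector as above: key in column 1, list column in column 2
select_unnest <- function(x, select_val) {
  x[[2]][x[[1]] == select_val][[1]]
}

# Hypothetical helper: apply select_unnest once per key, left to right
drill_down <- function(x, keys) {
  Reduce(select_unnest, keys, x)
}

# Toy three-level structure built with base R only (values are made up)
cities <- data.frame(city = c("Beijing", "Shanghai"), stringsAsFactors = FALSE)
cities$by_city <- list(data.frame(value = 1), data.frame(value = 2))

countries <- data.frame(country = c("China", "India"), stringsAsFactors = FALSE)
countries$by_country <- list(cities, data.frame())

continents <- data.frame(continent = c("Asia", "Europe"), stringsAsFactors = FALSE)
continents$by_continent <- list(countries, data.frame())

# Walk down three levels with one call
drill_down(continents, c("Asia", "China", "Shanghai"))
```

With the gapminder structure from the question, drill_down(nest_2, c("Asia", "China")) should be equivalent to the two chained select_unnest() calls, under the same column-order assumption.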

Comparing the timings:

Unit: microseconds

expr             min        lq      mean   median        uq       max neval
aosmith1    3202.105 3354.0055 4045.9602 3612.126 4179.9610 17119.495   100
aosmith2    5797.744 6191.9380 7327.6619 6716.445 7662.6415 24245.779   100
Floo0        227.169  303.3280  414.3779  346.135  400.6735  4804.500   100
Ben Bolker   622.267  720.6015  852.9727  775.172  875.5985  1942.495   100

Code:

# Expressions are labelled in the same order as the rows of the timing table
microbenchmark::microbenchmark(
  aosmith1 = {
    a <- nest_2[nest_2$continent == "Asia", ]$by_continent
    flatten_df(a) %>%
      filter(country == "China") %>%
      unnest(by_country)
  },
  aosmith2 = {
    nest_2 %>%
      filter(continent == "Asia") %>%
      select(by_continent) %>%
      unnest(by_continent) %>%
      filter(country == "China") %>%
      unnest(by_country)
  },
  Floo0 = {
    nest_2 %>% select_unnest("Asia") %>% select_unnest("China")
  },
  `Ben Bolker` = {
    n1 <- nest_2$by_continent[nest_2$continent == "Asia"][[1]]
    n2 <- n1 %>% filter(country == "China")
    n2$by_country[[1]]
  }
)

answered Sep 26 '22 00:09

Rentrop

Tags: r, purrr, tidyr
