
Unnest a list column directly into several columns

Tags: r, tidyr, tibble

Can I unnest a list column directly into n columns?

The list can be assumed to be regular, with all elements being of equal length.

If I had a character vector instead of a list column, I could use tidyr::separate. I can use tidyr::unnest, but then I need another helper variable to be able to tidyr::spread. Am I missing an obvious method?

Example data:

    library(tibble)

    df1 <- tibble(
      gr = c('a', 'b', 'c'),
      values = list(1:2, 3:4, 5:6)
    )

    # A tibble: 3 x 2
      gr    values
      <chr> <list>
    1 a     <int [2]>
    2 b     <int [2]>
    3 c     <int [2]>

Goal:

    df2 <- tibble(
      gr = c('a', 'b', 'c'),
      V1 = c(1, 3, 5),
      V2 = c(2, 4, 6)
    )

    # A tibble: 3 x 3
      gr       V1    V2
      <chr> <dbl> <dbl>
    1 a        1.    2.
    2 b        3.    4.
    3 c        5.    6.

Current method:

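    library(dplyr)
    library(tidyr)

    # Unnest to long form, number the values within each group, then spread wide
    unnest(df1, values) %>%
      group_by(gr) %>%
      mutate(r = paste0('V', row_number())) %>%
      spread(r, values)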
Axeman asked Apr 06 '18 09:04



1 Answer

With tidyr 1.0.0 you can use unnest_wider():

    library(tidyr)
    df1 <- tibble(
      gr = c('a', 'b', 'c'),
      values = list(1:2, 3:4, 5:6)
    )

    unnest_wider(df1, values)
    #> New names:
    #> * `` -> ...1
    #> * `` -> ...2
    #> New names:
    #> * `` -> ...1
    #> * `` -> ...2
    #> New names:
    #> * `` -> ...1
    #> * `` -> ...2
    #> # A tibble: 3 x 3
    #>   gr     ...1  ...2
    #>   <chr> <int> <int>
    #> 1 a         1     2
    #> 2 b         3     4
    #> 3 c         5     6

Created on 2019-09-14 by the reprex package (v0.3.0)

The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider doesn't want to guess silently.

We can name them beforehand to avoid this:

    df1 %>%
      dplyr::mutate(values = purrr::map(values, setNames, c("V1","V2"))) %>%
      unnest_wider(values)
    #> # A tibble: 3 x 3
    #>   gr       V1    V2
    #>   <chr> <int> <int>
    #> 1 a         1     2
    #> 2 b         3     4
    #> 3 c         5     6
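Since the question asks about n columns, the names do not have to be hard-coded. Here is a minimal sketch that builds them from each element's length. `name_elements` is a hypothetical helper written for this illustration, not part of tidyr, and the resulting columns are integer, not the double columns shown in the question's goal.

```r
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# Hypothetical helper (not part of tidyr): name the elements V1, V2, ..., Vn
name_elements <- function(x) setNames(x, paste0("V", seq_along(x)))

# Name every list element before widening, so unnest_wider() has nothing to guess
df1$values <- lapply(df1$values, name_elements)

df2 <- unnest_wider(df1, values)
df2
```

This relies on the list being regular, as the question states. With elements of unequal length, the shorter rows would presumably be padded with NA.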

Or just wrap the call in suppressMessages() or purrr::quietly().
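Another option, as far as I know, is the `names_sep` argument of unnest_wider(). In recent tidyr releases it generates names for unnamed elements by pasting the outer column name and an integer index together, which should give columns like `values_1` and `values_2` here. My understanding is that newer tidyr versions raise an error when unnamed elements are unnested without `names_sep`, so silencing messages may not be enough there. The sketch below assumes such a recent version, and the renaming step is just one way to arrive at V1, V2.

```r
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# names_sep lets unnest_wider() generate names for the unnamed elements
wide <- unnest_wider(df1, values, names_sep = "_")

# Rename every column after the first (gr) to V1, V2, ..., Vn
names(wide)[-1] <- paste0("V", seq_len(ncol(wide) - 1))
wide
```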

Moody_Mudskipper answered Sep 28 '22 20:09
