I converted a JSON file into a data.frame with a a nested list structure, which I would like to unnest and flatten. Some of the values in the list are NULL, which unnest does not accept. If I replace the NULL values with a data.frame structure that has only NA values, I get the desired result.
Below is a simplified example of my problem. I have tried to replace the NULL values with the NA data.frame but did not manage because of the the nested structure. How can I achieve the desired result?
Example
library(tidyr)
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))
df <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input2))
unnest(df)
gives the error "Error: Each column must either be a list of vectors or a list of data frames [b]"
df2 <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input3))
unnest(df2)
gives the desired output.
As noted by @Frank above, as of tidyr
1.0.0 this can be achieved using the keep_empty
argument
library(tidyr)
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
df <- tibble::tibble(
a = c(1, 2),
b = list(a = input1, c = input2)
)
unnest(df, cols = b, keep_empty = TRUE)
#> # A tibble: 4 x 3
#> a id value
#> <dbl> <chr> <dbl>
#> 1 1 c 7
#> 2 1 d 8
#> 3 1 e 9
#> 4 2 NA NA
Created on 2021-09-24 by the reprex package (v2.0.1)
We can use map_lgl
from purrr
here. If you don't care about those rows with a NULL
, you could simply remove them with filter
and unnest
:
library(tidyverse)
df %>%
filter(!map_lgl(b, is.null)) %>%
unnest()
#> # A tibble: 3 x 3
#> a id value
#> <dbl> <fctr> <dbl>
#> 1 1 c 7
#> 2 1 d 8
#> 3 1 e 9
In case you want to keep those rows, you could bring them back with right_join
after unnesting:
df %>%
filter(!map_lgl(b, is.null)) %>%
unnest() %>%
right_join(select(df, a))
#> Joining, by = "a"
#> # A tibble: 4 x 3
#> a id value
#> <dbl> <fctr> <dbl>
#> 1 1 c 7
#> 2 1 d 8
#> 3 1 e 9
#> 4 2 <NA> NA
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))
df <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input2)
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With