I have the following data frame with a list column:
a <- data.frame(col1=c("a","b","c"))
a$col2 <- list(list(),list(name="Michal", age=28), list(name="Johnny", age=31))
I'd like to merge these columns into a single data frame, so the desired output would look like this:
col1 name age
1 a NA NA
2 b Michal 28
3 c Johnny 31
To transform the list column into a data frame I'm using
plyr::ldply(a$col2, data.frame)
or
lapply(a$col2, data.frame, stringsAsFactors = FALSE)
but unfortunately the empty list in the first position gets dropped:
name age
1 Michal 28
2 Johnny 31
Is there any trick to keep this empty list as a row for a later cbind()?
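The dropped row can be reproduced in base R alone, which may help to see where it goes. This is only a minimal sketch; the names pieces, row_counts and combined are illustrative and not part of the original code.

```r
a <- data.frame(col1 = c("a", "b", "c"))
a$col2 <- list(list(), list(name = "Michal", age = 28), list(name = "Johnny", age = 31))

# Convert each element of the list column to its own data frame
pieces <- lapply(a$col2, data.frame, stringsAsFactors = FALSE)

# The empty list turns into a data frame with zero rows,
# while the other two elements give one row each
row_counts <- vapply(pieces, nrow, integer(1))
print(row_counts)

# rbind() ignores the zero-row piece, so only two rows survive
combined <- do.call(rbind, pieces)
print(nrow(combined))
```

So the empty element is not lost during the conversion itself but when the pieces are bound together, which is why the answers either join back to col1 or fill the empty element with NA first.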
Here is one option with data.table:
library(data.table)
setDT(a)[, unlist(col2, recursive = FALSE), col1][a[, "col1", with = FALSE], on = .(col1)]
# col1 name age
#1: a NA NA
#2: b Michal 28
#3: c Johnny 31
If we need a tidyverse option:
library(tidyverse)
a$col2 %>%
set_names(a$col1) %>%
Filter(length, .) %>%
bind_rows(., .id = "col1") %>%
left_join(a[1], .)
# col1 name age
#1 a <NA> NA
#2 b Michal 28
#3 c Johnny 31
Here's a solution using unnest. It assumes that col1 is a unique index (for the left_join) and that your lists are either empty or contain only name and age, in that order:
library(dplyr)
library(tidyr)
a %>% mutate(col2 = lapply(col2,unlist)) %>%
unnest %>%
cbind(key = c("name","age")) %>%
spread(key,col2) %>%
left_join(a,.) %>%
select("col1","name","age")
# col1 name age
# 1 a <NA> <NA>
# 2 b Michal 28
# 3 c Johnny 31
It would be more general and elegant to replace the empty lists with list(NA,NA) as a first step (the ugly left_join could then be avoided), but I couldn't manage to do it.
EDIT:
Found a way to do it, though I'm sure the first line could be improved:
library(magrittr)
a %>% mutate(col2 = inset(col2,lengths(col2) == 0,list(list(NA,NA)))) %>%
mutate(col2 = lapply(col2,unlist)) %>%
unnest %>%
cbind(key = c("name","age")) %>%
spread(key,col2)
EDIT2:
Another one, much more straightforward (skip the first line if you're fine with NULL instead of NA):
a %>% mutate(col2 = inset(col2,lengths(col2) == 0,list(list(name=NA,age=NA)))) %>%
mutate(name = sapply(col2, "[[", "name"),
age = sapply(col2, "[[", "age")) %>%
select(-col2)
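The same fill-then-extract idea can also be sketched in base R, without dplyr or magrittr. This is a hedged variant rather than the code above: the names filled and result are illustrative, and it assumes every non-empty element has exactly a character name and a numeric age.

```r
a <- data.frame(col1 = c("a", "b", "c"), stringsAsFactors = FALSE)
a$col2 <- list(list(), list(name = "Michal", age = 28), list(name = "Johnny", age = 31))

# Replace every empty element with a typed NA placeholder
filled <- a$col2
filled[lengths(filled) == 0] <- list(list(name = NA_character_, age = NA_real_))

# Pull each field out with vapply(), which checks the type of every value
result <- data.frame(
  col1 = a$col1,
  name = vapply(filled, function(x) x[["name"]], character(1)),
  age  = vapply(filled, function(x) x[["age"]], numeric(1)),
  stringsAsFactors = FALSE
)
print(result)
```

Using vapply() instead of sapply() makes the call fail with an error if an element does not match the assumed types, rather than silently returning a list.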