
Stacking list column in data frame

Tags:

r

I have the following data frame with a list column:

a <- data.frame(col1=c("a","b","c"))
a$col2 <- list(list(),list(name="Michal", age=28), list(name="Johnny", age=31))

I'd like to merge these columns into a single data frame, so the desired output would look like this:

  col1   name     age
1  a     NA       NA
2  b     Michal   28
3  c     Johnny   31

To transform the list column into a data frame I'm using

plyr::ldply(a$col2, data.frame)
or
lapply(a$col2, data.frame, stringsAsFactors = FALSE)

but unfortunately the result skips the empty list in the first position:

   name   age
1 Michal  28
2 Johnny  31
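
A quick check suggests why the row disappears, assuming base R's usual rbind() behaviour: the empty list is turned into a data frame with zero rows, so it contributes nothing when the pieces are stacked. The names `pieces`, `row_counts` and `stacked` below are only for illustration.

```r
# Same list as in a$col2 above
col2 <- list(list(), list(name = "Michal", age = 28), list(name = "Johnny", age = 31))

# One data frame per list element
pieces <- lapply(col2, data.frame, stringsAsFactors = FALSE)

# Number of rows each element contributes; the empty list contributes none
row_counts <- sapply(pieces, nrow)

# Stacking the pieces therefore loses the first element
stacked <- do.call(rbind, pieces)
nrow(stacked)
```

So the empty element has to be replaced by something with one row before stacking, or re-attached afterwards with a join.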

Is there any trick to keep this empty list so that the result can be passed to cbind() afterwards?

martinkabe asked Apr 18 '26 07:04


2 Answers

Here is one option with data.table

library(data.table)
setDT(a)[, unlist(col2, recursive = FALSE), col1][a[, "col1", with = FALSE], on = .(col1)]
#   col1   name age
#1:    a     NA  NA
#2:    b Michal  28
#3:    c Johnny  31

If we need a tidyverse option

library(tidyverse)
a$col2 %>% 
    set_names(a$col1) %>% 
    Filter(length, .) %>% 
    bind_rows(., .id = "col1") %>% 
    left_join(a[1], .)
#   col1   name age
#1    a   <NA>  NA
#2    b Michal  28
#3    c Johnny  31
akrun answered Apr 21 '26 11:04


Here's a solution using unnest. It assumes that col1 is a unique index (for the left_join) and that your lists are either empty or contain only name and age, in that order:

library(dplyr)
library(tidyr)
a  %>% mutate(col2 = lapply(col2,unlist)) %>%
  unnest %>%
  cbind(key = c("name","age")) %>%
  spread(key,col2) %>%
  left_join(a,.) %>%
  select("col1","name","age")

#   col1   name  age
# 1    a   <NA> <NA>
# 2    b Michal   28
# 3    c Johnny   31

It would be more general and elegant to change the empty lists to list(NA,NA) as a first step (the ugly left_join could then be avoided), but I couldn't manage to do it.

EDIT:

Found a way to do it, though I'm sure the first line could be improved:

library(magrittr)
a  %>% mutate(col2 = inset(col2,lengths(col2) == 0,list(list(NA,NA)))) %>%
  mutate(col2 = lapply(col2,unlist)) %>%
  unnest %>%
  cbind(key = c("name","age")) %>%
  spread(key,col2)

EDIT2:

Another, much more straightforward one (skip the first line if you're fine with NULL instead of NA):

a %>% mutate(col2 = inset(col2,lengths(col2) == 0,list(list(name=NA,age=NA)))) %>%
  mutate(name = sapply(col2, "[[", "name"),
         age  = sapply(col2, "[[", "age")) %>%
  select(-col2)
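
For comparison, the same idea (pad the empty lists with NA first, then bind) can be sketched in base R alone, which also gives something that can go straight into cbind(). This is only a sketch under the same assumption as above, namely that every non-empty list holds exactly name and age; `empty_row`, `padded`, `rows`, `stacked` and `result` are names made up for the example.

```r
a <- data.frame(col1 = c("a", "b", "c"), stringsAsFactors = FALSE)
a$col2 <- list(list(), list(name = "Michal", age = 28), list(name = "Johnny", age = 31))

# Placeholder used wherever col2 holds an empty list
# (typed NAs so the columns stay character and numeric)
empty_row <- list(name = NA_character_, age = NA_real_)

# Replace empty elements, leave the others untouched
padded <- lapply(a$col2, function(x) if (length(x) == 0) empty_row else x)

# One single-row data frame per element, then stack them
rows <- lapply(padded, as.data.frame, stringsAsFactors = FALSE)
stacked <- do.call(rbind, rows)

# Every input row now has a counterpart, so cbind() lines up
result <- cbind(a["col1"], stacked)
result
```

Using typed NA values here keeps age numeric, whereas routes that go through unlist() turn it into character.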
Moody_Mudskipper answered Apr 21 '26 10:04