
R: Extract columns from list of data.frames in a tibble

Tags: r, dplyr, purrr

I am wondering how to manipulate a list containing data.frames stored in a tibble.

Specifically, I would like to extract two columns from a data.frame that are stored in a tibble list column.

I would like to go from this tibble c:

library(tibble)

random_data <- list(a = letters[1:10], b = LETTERS[1:10])
x <- as.data.frame(random_data, stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
# y[[2]] is never assigned, so it stays NULL
y[[3]] <- x[2, , drop = FALSE]
c <- tibble(z = c(1, 2, 3), my_data = y)

to this tibble d:

d <- tibble(z = c(1, 2, 3), a = c('a', NA, 'b'), b = c('A', NA, 'B'))

thanks

Iain

Iain asked Jul 07 '17 19:07
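The question's setup leaves a gap in the list: assigning positions 1 and 3 of an empty list fills position 2 with NULL, and that NULL is what the answers have to deal with. A minimal base R sketch of that behaviour, using placeholder strings instead of the question's data frames (the values here are illustrative assumptions):

```r
# Assigning past the end of a list pads the skipped positions with NULL.
y <- list()
y[[1]] <- "first"   # placeholder value, not the question's data
y[[3]] <- "third"   # placeholder value, not the question's data

length(y)                        # 3
is.null(y[[2]])                  # TRUE
vapply(y, is.null, logical(1))   # FALSE TRUE FALSE
```

So the my_data column in c holds a one-row data.frame, then NULL, then another one-row data.frame.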



3 Answers

You could write a helper function f that replaces each NULL entry with a one-row data frame of NAs, apply it to the my_data column, and finish with unnest.

library(dplyr); library(tidyr)

unnest(mutate(c, my_data = lapply(my_data, f)))
# # A tibble: 3 x 3
#       z     a     b
#   <dbl> <chr> <chr>
# 1     1     a     A
# 2     2  <NA>  <NA>
# 3     3     b     B

Here f is the helper that replaces the NULL values. It has to be defined before the call above is run:

f <- function(x) {
    if(is.null(x)) data.frame(a = NA, b = NA) else x
}
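
The same idea can be sketched with purrr::map in a pipeline. This is only a variant under stated assumptions, not the answer's own code: fill_null is a hypothetical helper name, it uses NA_character_ so the placeholder row already has character columns, and the cols argument assumes tidyr 1.0.0 or later.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

# Rebuild the question's input; element 2 of the list is left as NULL.
x <- data.frame(a = letters[1:10], b = LETTERS[1:10], stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
y[[3]] <- x[2, , drop = FALSE]
c <- tibble(z = c(1, 2, 3), my_data = y)

# Hypothetical helper: swap NULL for a one-row data frame of character NAs.
fill_null <- function(df) {
  if (is.null(df)) {
    data.frame(a = NA_character_, b = NA_character_, stringsAsFactors = FALSE)
  } else {
    df
  }
}

res <- c %>%
  mutate(my_data = map(my_data, fill_null)) %>%
  unnest(cols = my_data)  # cols is assumed available (tidyr >= 1.0.0)
```

Here res should have the columns z, a and b, with NA in a and b for the row where z is 2.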
Rich Scriven answered Oct 18 '22 19:10


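Filter out the rows whose my_data is NULL, unnest what remains, then right-join the result back onto c so the dropped row returns with NA in a and b. c2 is the final output.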

library(tidyverse)

c2 <- c %>%
  filter(!map_lgl(my_data, is.null)) %>%
  unnest() %>%
  right_join(c, by = "z") %>%
  select(-my_data)
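
With a newer tidyr the filter and join steps may not be needed. The sketch below assumes tidyr 1.0.0 or later, where unnest() accepts a keep_empty argument that turns NULL or zero-row elements into a single row of NAs; d2 is a name introduced here for illustration only.

```r
library(tibble)
library(tidyr)

# Rebuild the question's input; element 2 of the list is left as NULL.
x <- data.frame(a = letters[1:10], b = LETTERS[1:10], stringsAsFactors = FALSE)
y <- list()
y[[1]] <- x[1, , drop = FALSE]
y[[3]] <- x[2, , drop = FALSE]
c <- tibble(z = c(1, 2, 3), my_data = y)

# keep_empty = TRUE keeps the row whose list element is NULL and fills
# the unnested columns with NA (assumes tidyr >= 1.0.0).
d2 <- unnest(c, cols = my_data, keep_empty = TRUE)
```

Because no join is involved, the rows of d2 stay in the original order of z.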
www answered Oct 18 '22 18:10


I think this does the trick, with d being the requested tibble:

library(dplyr)

new.y <- lapply(y, function(x) if(is.null(x)) data.frame(a = NA, b = NA) else x)
# as_tibble() replaces tbl_df(), which is deprecated in current dplyr
d <- cbind(z = c(1, 2, 3), bind_rows(new.y)) %>% as_tibble()


# # A tibble: 3 x 3
#     z      a      b
#  <dbl> <fctr> <fctr>
# 1   1      a      A
# 2   2     NA     NA
# 3   3      b      B
Constantinos answered Oct 18 '22 20:10