
combine rows in data frame containing NA to make complete row

I know this is a duplicate question, but I can't seem to find the original post again.

Using the following data:

df <- data.frame(A=c(1,1,2,2),B=c(NA,2,NA,4),C=c(3,NA,NA,5),D=c(NA,2,3,NA),E=c(5,NA,NA,4))

  A  B  C  D  E
  1 NA  3 NA  5
  1  2 NA  2 NA
  2 NA NA  3 NA
  2  4  5 NA  4

Grouping by A, I'd like the following output, using a tidyverse solution:

  A  B  C  D  E
  1  2  3  2  5
  2  4  5  3  4

I have many groups in A. I think I saw an answer using coalesce but am unsure how to get it to work. I'd like a solution that works with character columns as well. Thanks!

CPak asked Aug 04 '17 20:08


3 Answers

I haven't figured out how to put the coalesce_by_column function inside the dplyr pipeline, but this works:

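library(dplyr)

# Works when each group has exactly two rows: take the first value,
# falling back to the second if the first is NA.
coalesce_by_column <- function(df) {
  return(coalesce(df[1], df[2]))
}

df %>%
  group_by(A) %>%
  summarise_all(coalesce_by_column)

##       A     B     C     D     E
##   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1     1     2     3     2     5
## 2     2     4     5     3     4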

Edit: include @Jon Harmon's solution for more than 2 members of a group

# Supply lists by splicing them into dots:
coalesce_by_column <- function(df) {
  return(dplyr::coalesce(!!! as.list(df)))
}

df %>%
  group_by(A) %>%
  summarise_all(coalesce_by_column)

#> # A tibble: 2 x 5
#>       A     B     C     D     E
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     1     2     3     2     5
#> 2     2     4     5     3     4
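Since the question also asks about character columns, here is a minimal sketch of the same splicing idea written with summarise(across(...)), which assumes dplyr 1.0.0 or later. The data frame df_chr and its column G are hypothetical and not part of the original question.

```r
library(dplyr)

# Hypothetical example data: same shape as the question,
# plus a character column G.
df_chr <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(NA, 2, NA, 4),
  G = c("x", NA, NA, "y"),
  stringsAsFactors = FALSE
)

# Return the first non-missing value of a vector by splicing its
# elements into coalesce() as separate arguments.
coalesce_by_column <- function(x) {
  coalesce(!!!as.list(x))
}

# across(everything(), ...) applies the function to every
# non-grouping column within each group of A.
result <- df_chr %>%
  group_by(A) %>%
  summarise(across(everything(), coalesce_by_column))
```

Each column is coalesced on its own, so the numeric column B and the character column G keep their types.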
Oriol Mirosa answered Sep 22 '22 03:09


We can use fill to fill all the missing values. And then filter just one row for each group.

library(dplyr)
library(tidyr)

df2 <- df %>%
  group_by(A) %>%
  fill(everything(), .direction = "down") %>%
  fill(everything(), .direction = "up") %>%
  slice(1)

And thanks to @Roger-123, the above code can be further simplified as follows.

df2 <- df %>%
  group_by(A) %>%
  fill(everything(), .direction = "downup") %>%
  slice(1)
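To see what this does when a group has more than two rows or more than one non-missing value in a column, here is a small sketch on hypothetical data (df3 is not from the question). It assumes a tidyr version that supports .direction = "downup". Because values are filled down first and then up, the first row of each group ends up holding the first non-missing value in group order, and a column that is entirely NA within a group stays NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: group 1 has three rows and two competing
# non-missing values in B; group 2 has no value at all in C.
df3 <- data.frame(
  A = c(1, 1, 1, 2),
  B = c(NA, 2, 7, 4),
  C = c("x", NA, "z", NA),
  stringsAsFactors = FALSE
)

# Fill within each group, then keep one row per group.
result <- df3 %>%
  group_by(A) %>%
  fill(everything(), .direction = "downup") %>%
  slice(1) %>%
  ungroup()
```

Here group 1 keeps B = 2 (not 7) and C = "x", so the approach agrees with the coalesce answer whenever values conflict.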
www answered Sep 24 '22 03:09


Not tidyverse but here's one base R solution

df <- data.frame(A=c(1,1),B=c(NA,2),C=c(3,NA),D=c(NA,2),E=c(5,NA))
sapply(df, function(x) x[!is.na(x)][1])
#A B C D E 
#1 2 3 2 5 

With updated data

do.call(rbind, lapply(split(df, df$A), function(a) sapply(a, function(x) x[!is.na(x)][1])))
#  A B C D E
#1 1 2 3 2 5
#2 2 4 5 3 4
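One caveat with the sapply/rbind version is that it returns a matrix, so a mix of numeric and character columns would all be coerced to character. A possible base R variant that keeps a data frame with per-column types uses aggregate() with the by argument. This is a sketch under assumptions: df_mixed and first_non_na are hypothetical names introduced here for illustration.

```r
# Hypothetical example data with one numeric and one character column.
df_mixed <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(NA, 2, NA, 4),
  G = c("x", NA, NA, "y"),
  stringsAsFactors = FALSE
)

# First non-missing value of a vector (NA if every value is missing).
first_non_na <- function(x) x[!is.na(x)][1]

# Aggregate every column except A, grouping by A. Each column is
# processed separately, so B stays numeric and G stays character.
result <- aggregate(df_mixed[-1], by = df_mixed["A"], FUN = first_non_na)
```

The by-argument form is used instead of the formula interface so that rows containing NA are not dropped before first_non_na sees them.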
d.b answered Sep 24 '22 03:09