Replacement of plyr::cbind.fill in dplyr?

Tags: r, dplyr, plyr, cbind

I apologize if this question is elementary, but I've been scouring the internet and I can't seem to find a simple solution.

I currently have a list of R objects (either named vectors or one-column data frames; I can work with either), and I want to join them into one large data frame with one row for each unique name/row name and one column for each element of the original list.

My starting list looks something like:

l1 <- list(df1 = data.frame(c(1,2,3), row.names = c("A", "B", "C")), 
       df2 = data.frame(c(2,6), row.names = c("B", "D")),
       df3 = data.frame(c(3,6,9), row.names = c("C", "D", "A")),
       df4 = data.frame(c(4,12), row.names = c("A", "E")))

And I want the output to look like:

data.frame("df1" = c(1,2,3,NA,NA),
           "df2" = c(NA,2,NA,6,NA),
           "df3" = c(9,NA,3,6,NA),
           "df4" = c(4,NA,NA,NA,12), row.names = c("A", "B", "C", "D", "E"))

  df1 df2 df3 df4
A   1  NA   9   4
B   2   2  NA  NA
C   3  NA   3  NA
D  NA   6   6  NA
E  NA  NA  NA  12

I don't mind if the fill values are NA or 0 (ultimately I want 0 but that's an easy fix).

I'm almost positive that plyr::cbind.fill does exactly this, but I have been using dplyr in the rest of my script and I don't think loading both is a good idea. dplyr::bind_cols does not work with inputs of different lengths. I'm aware that a very similar question has been asked here: "R: Is there a good replacement for plyr::rbind.fill in dplyr?", but the solution given there does not seem to work for my case. Neither does dplyr::full_join, even when wrapped in a do.call. Is there a straightforward solution to this, or is the only option to write a custom function?
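For context, dplyr::full_join joins two tables at a time and matches on columns, not row names, so a list of four data frames has to be folded pairwise after the row names are moved into a key column. A minimal sketch of that idea follows; the key column name "rn" and the object name out are illustrative choices, not part of the original post.

```r
library(dplyr)
library(purrr)
library(tibble)

l1 <- list(df1 = data.frame(c(1,2,3), row.names = c("A", "B", "C")),
           df2 = data.frame(c(2,6), row.names = c("B", "D")),
           df3 = data.frame(c(3,6,9), row.names = c("C", "D", "A")),
           df4 = data.frame(c(4,12), row.names = c("A", "E")))

out <- l1 %>%
  # name each single column after its list element, then expose the
  # row names as an ordinary key column ("rn" is an arbitrary name)
  imap(~ rownames_to_column(setNames(.x, .y), "rn")) %>%
  # fold the list pairwise: full_join(full_join(df1, df2), df3), ...
  reduce(full_join, by = "rn") %>%
  arrange(rn) %>%
  column_to_rownames("rn")

print(out)
```

Rows missing from a given input are filled with NA by the join.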

Asked by Tom, Feb 05 '20 19:02


1 Answer

We can convert the row names to a column with rownames_to_column, rename the second column so every list element uses the same value name, bind the list elements row-wise (map_dfr does this via bind_rows), and reshape to 'wide' format with pivot_wider:

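library(dplyr)
library(tidyr)
library(purrr)
library(tibble)
# For each list element: move row names into column 'rn' and rename the
# value column to 'v1'; .id = 'grp' records the list name (df1, df2, ...)
map_dfr(l1, ~ rownames_to_column(.x, 'rn') %>% 
              rename_at(2, ~'v1'), .id = 'grp') %>%        
   # one column per list element; missing combinations become NA
   pivot_wider(names_from = grp, values_from = v1) %>% 
   # restore 'rn' as the row names of the result
   column_to_rownames('rn')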
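
Since the question ultimately wants 0 rather than NA, one possible variation is to let pivot_wider do the fill through its values_fill argument. This is a sketch under the assumption that all values are numeric; the object name out0 is illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

l1 <- list(df1 = data.frame(c(1,2,3), row.names = c("A", "B", "C")),
           df2 = data.frame(c(2,6), row.names = c("B", "D")),
           df3 = data.frame(c(3,6,9), row.names = c("C", "D", "A")),
           df4 = data.frame(c(4,12), row.names = c("A", "E")))

out0 <- map_dfr(l1, ~ rownames_to_column(.x, 'rn') %>%
                      rename_at(2, ~'v1'), .id = 'grp') %>%
  # values_fill supplies the value used where a row name is absent
  # from a list element (0 here instead of the default NA)
  pivot_wider(names_from = grp, values_from = v1,
              values_fill = list(v1 = 0)) %>%
  column_to_rownames('rn')

print(out0)
```

An alternative with the same effect is to keep the NA result and overwrite it afterwards, for example with result[is.na(result)] <- 0.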

Answered by akrun, Sep 21 '22 12:09