r - 'rbind' dataframes with different prefix in column names

I have two dataframes like the following:

df1 <- data.frame(ID = c(1:4),
       Year = 2001,
       a_Var1 = c("A","B","C","D"),
       a_Var2 = c("T","F","F","T"))

df2 <- data.frame(ID = c(1:4),
       Year = 2002,
       b_Var1 = c("E","F","G","H"))

The desired end product is:

df_combined <- data.frame(ID = c(1,1,2,2,3,3,4,4),
                      Year = c(2001,2002,2001,2002,2001,2002,2001,2002),
                      Var1 = c("A","E","B","F","C","G","D","H"),
                      Var2 = c("T",NA,"F",NA,"F",NA,"T",NA))

The question is how to 'rbind' the two so that the prefix a_ or b_ is removed and Var1, Var2, etc. become the shared columns.

I tried plyr's rbind.fill, but that doesn't solve the problem, because the prefixed names are kept as separate columns.

asked May 08 '26 13:05 by Junran Cao


1 Answer

Here is one option: place the datasets in a list, rename the columns by removing the prefix (everything up to and including the first _), bind the rows, and arrange by 'ID'.

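library(tidyverse)
# For each data frame, strip the prefix up to and including the first "_"
# from every column name; map_df() then row-binds the results, filling
# columns missing from a data frame with NA.
map_df(list(df1, df2), ~ .x %>%
             rename_all(~ str_remove(.x, "^[^_]+_"))) %>%
   arrange(ID)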
#  ID Year Var1 Var2
#1  1 2001    A    T
#2  1 2002    E <NA>
#3  2 2001    B    F
#4  2 2002    F <NA>
#5  3 2001    C    F
#6  3 2002    G <NA>
#7  4 2001    D    T
#8  4 2002    H <NA>
answered May 10 '26 03:05 by akrun
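
For readers who would rather avoid the tidyverse dependency, the same three steps (strip the prefix, pad missing columns with NA, bind and sort) can be sketched in base R. This is not part of the answer above, only a minimal illustration; the helper names `dfs` and `all_cols` are made up for the example, and it assumes every prefix ends at the first underscore, as in the question's data.

```r
# Same example data as in the question.
df1 <- data.frame(ID = 1:4, Year = 2001,
                  a_Var1 = c("A", "B", "C", "D"),
                  a_Var2 = c("T", "F", "F", "T"))
df2 <- data.frame(ID = 1:4, Year = 2002,
                  b_Var1 = c("E", "F", "G", "H"))

dfs <- list(df1, df2)  # hypothetical name for the list of inputs

# Step 1: drop everything up to and including the first underscore.
# Names without an underscore (ID, Year) are left unchanged.
dfs <- lapply(dfs, function(d) {
  names(d) <- sub("^[^_]+_", "", names(d))
  d
})

# Step 2: collect the union of column names, in order of first appearance,
# and add any column a data frame lacks as NA so rbind() sees equal names.
all_cols <- unique(unlist(lapply(dfs, names)))
dfs <- lapply(dfs, function(d) {
  for (nm in setdiff(all_cols, names(d))) d[[nm]] <- NA
  d[all_cols]
})

# Step 3: stack the rows, then sort by ID and Year and reset row names.
df_combined <- do.call(rbind, dfs)
df_combined <- df_combined[order(df_combined$ID, df_combined$Year), ]
rownames(df_combined) <- NULL
df_combined
```

The regular expression is the one used in the answer, so a column name that contains an underscore but no prefix (say, a hypothetical `birth_year`) would also be shortened; in that case a stricter pattern such as `"^[ab]_"` may be a better fit.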