
dplyr::bind_cols (remove the first column while combining several data.frames)

Tags: r, dplyr

I have around 50 data.frames, each holding the results of a different simulation.

Examples of the data.frames are below:

SiteID <- c("Site1", "Site2", "Site3", "Site4", "Site5")
measured_s1 <- c(21:25)
simulated_s1 <- c(22:26)
df <- data.frame(SiteID, measured_s1, simulated_s1)

SiteID <- c("Site1", "Site2", "Site3", "Site4", "Site5")
measured_s2 <- c(21:25)
simulated_s2 <- c(21.5:25.5)
df_s2 <- data.frame(SiteID, measured_s2, simulated_s2)

SiteID <- c("Site1", "Site2", "Site3", "Site4", "Site5")
measured_s3 <- c(21:25)
simulated_s3 <- c(21.2:25.2)
df_s3 <- data.frame(SiteID, measured_s3, simulated_s3)

I want to combine all of them, which I did using bind_cols:

dplyr::bind_cols(df, df_s2, df_s3)
#  SiteID measured_s1 simulated_s1 SiteID measured_s2 simulated_s2 SiteID measured_s3 simulated_s3
#1  Site1          21           22  Site1          21         21.5  Site1          21         21.2
#2  Site2          22           23  Site2          22         22.5  Site2          22         22.2
#3  Site3          23           24  Site3          23         23.5  Site3          23         23.2
#4  Site4          24           25  Site4          24         24.5  Site4          24         24.2
#5  Site5          25           26  Site5          25         25.5  Site5          25         25.2

But the SiteID column is repeated once per input in the data.frame that bind_cols returns.

This can be fixed by removing the repeated SiteID columns manually, or by converting df, df_s2 and df_s3 to long data.frames and then using full_join by SiteID.

Is there a better way to drop the repeated SiteID column while combining the data.frames?

asked Apr 09 '26 16:04 by shiny


1 Answer

You can put your data frames in a list and then use the Reduce function to join them one by one. With no by argument, full_join joins on the columns the inputs share, which here is only SiteID:

Reduce(dplyr::full_join, list(df, df_s2, df_s3))

#  SiteID measured_s1 simulated_s1 measured_s2 simulated_s2 measured_s3 simulated_s3
#1  Site1          21           22          21         21.5          21         21.2
#2  Site2          22           23          22         22.5          22         22.2
#3  Site3          23           24          23         23.5          23         23.2
#4  Site4          24           25          24         24.5          24         24.2
#5  Site5          25           26          25         25.5          25         25.2
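
As a minimal sketch of the same idea with the join key written out, the call can be wrapped in a small function that passes by = "SiteID". This assumes dplyr is installed; join_by_site is a hypothetical helper name, and the shuffled copy of df_s2 is not part of the question, it is only there to illustrate that a join matches on SiteID rather than on row position.

```r
library(dplyr)

# Same toy data as in the question.
SiteID <- c("Site1", "Site2", "Site3", "Site4", "Site5")
df    <- data.frame(SiteID, measured_s1 = 21:25, simulated_s1 = 22:26)
df_s2 <- data.frame(SiteID, measured_s2 = 21:25, simulated_s2 = 21.5:25.5)
df_s3 <- data.frame(SiteID, measured_s3 = 21:25, simulated_s3 = 21.2:25.2)

# Illustration only: reorder the rows of one input.
df_s2_shuffled <- df_s2[c(3, 1, 5, 2, 4), ]

# Hypothetical helper that fixes the join key, so no column is matched by accident.
join_by_site <- function(x, y) full_join(x, y, by = "SiteID")

# Fold the list into one data frame, joining two inputs at a time.
combined <- Reduce(join_by_site, list(df, df_s2_shuffled, df_s3))
```

With around 50 inputs, the same Reduce call should work unchanged as long as they are collected in one list and share no column names other than SiteID.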

Alternatively, if you know that the rows of all the data frames are in the same order, you can skip the join: drop the SiteID column from every data frame except the first with lapply, and then use do.call(bind_cols, ...):

dplyr::bind_cols(df, do.call(dplyr::bind_cols, lapply(list(df_s2, df_s3), `[`, -1)))
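
Because binding columns matches rows purely by position, it may be worth checking the alignment before dropping the key. The following is a sketch under that assumption, not part of the original answer: it stops if any SiteID column differs from the first one, and it drops SiteID by name rather than by position. The names dfs, same_sites and drop_key are hypothetical.

```r
library(dplyr)

# Same toy data as in the question.
SiteID <- c("Site1", "Site2", "Site3", "Site4", "Site5")
df    <- data.frame(SiteID, measured_s1 = 21:25, simulated_s1 = 22:26)
df_s2 <- data.frame(SiteID, measured_s2 = 21:25, simulated_s2 = 21.5:25.5)
df_s3 <- data.frame(SiteID, measured_s3 = 21:25, simulated_s3 = 21.2:25.2)

dfs <- list(df, df_s2, df_s3)

# TRUE for each data frame whose SiteID column matches the first one exactly.
reference <- as.character(dfs[[1]]$SiteID)
same_sites <- vapply(
  dfs[-1],
  function(d) identical(as.character(d$SiteID), reference),
  logical(1)
)
stopifnot(all(same_sites))

# Drop the key column by name from every data frame except the first.
drop_key <- function(d) d[setdiff(names(d), "SiteID")]

# Bind the first data frame with the key-free remainder.
combined <- do.call(bind_cols, c(list(dfs[[1]]), lapply(dfs[-1], drop_key)))
```

If the check fails, the join-based approach is the safer choice, since it does not depend on row order.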
answered Apr 12 '26 08:04 by Psidom


