Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

If a df does not have all the cols in another, add them and set all to 0

Tags:

dataframe

r

dplyr

I have 2 data frames e.g.

# empty df
df1 <- data_frame(
  x = 1,
  y = 1,
  z = 1
) |> 
  filter(x < 1)

df2 <- data.frame(
  x = 1:10,
  a = 1:10,
  c = letters[1:10]
)

df2 does not contain columns y and z yet they exist in df1. I would like to mutate the columns that are non existent in df2 from df1 onto df2 so that I have a df that looks like this:

df_full
    x  a c y z
1   1  1 a 0 0
2   2  2 b 0 0
3   3  3 c 0 0
4   4  4 d 0 0
5   5  5 e 0 0
6   6  6 f 0 0
7   7  7 g 0 0
8   8  8 h 0 0
9   9  9 i 0 0
10 10 10 j 0 0

How can I use r/dplyr to add any columns from df1 that don't exist in df2 and just set them all to 0?

like image 293
Doug Fir Avatar asked Sep 18 '25 20:09

Doug Fir


2 Answers

We can use power_left_join from the powerjoin library, with fill = 0.

library(powerjoin)

power_left_join(df2, df1, fill = 0)

    x  a c y z
1   1  1 a 0 0
2   2  2 b 0 0
3   3  3 c 0 0
4   4  4 d 0 0
5   5  5 e 0 0
6   6  6 f 0 0
7   7  7 g 0 0
8   8  8 h 0 0
9   9  9 i 0 0
10 10 10 j 0 0
like image 101
benson23 Avatar answered Sep 21 '25 11:09

benson23


You could left_join your dataframes and replace_na to 0 across the columnnames of df1 with !! you need like this:

library(dplyr)
library(tidyr)
df2 |>
  left_join(df1) |>
  mutate(across(!!colnames(df1), ~ replace_na(.x, 0)))
#> Joining with `by = join_by(x)`
#>     x  a c y z
#> 1   1  1 a 0 0
#> 2   2  2 b 0 0
#> 3   3  3 c 0 0
#> 4   4  4 d 0 0
#> 5   5  5 e 0 0
#> 6   6  6 f 0 0
#> 7   7  7 g 0 0
#> 8   8  8 h 0 0
#> 9   9  9 i 0 0
#> 10 10 10 j 0 0

Created on 2023-04-26 with reprex v2.0.2

like image 32
Quinten Avatar answered Sep 21 '25 09:09

Quinten