I have a data frame in which every SerialNo has a row of missing values, and some SerialNos also have rows with data. I need to expand each NA row into two new rows, one for each unique value of PartsUsed.
I believe I'm on the right track with tidyr's complete or expand, but I can't seem to get the arguments right.
Below are example datasets: the one I currently have and the target df.
begin_ex_df <- tibble(
SerialNo = c(1234, 5678, 5678, 5678, 9012, 1357, 1357, 1357, 2468, 8080),
PartsUsed = c(NA, "A", "B", NA, NA, "A", "B", NA, NA, NA),
Values = c(NA, 10, 15, NA, NA, 11, 14, NA, NA, NA)
)
ending_ex_df <- tibble(
SerialNo = c(1234, 1234, 5678, 5678, 5678, 5678, 9012, 9012, 1357, 1357, 1357, 1357, 2468, 2468, 8080, 8080),
PartsUsed = c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B"),
Values = c(0, 0, 10, 15, 0, 0, 0, 0, 11, 14, 0, 0, 0, 0, 0, 0)
)
I expect every NA row to be expanded into an "A" row and a "B" row in PartsUsed, with its Values filled with 0. (Filling the value is not necessary; it can remain NA.) Rows that already have an "A" or "B" should be left alone.
Here is an alternative in base R, which keeps duplicated SerialNos:
x <- subset(begin_ex_df, is.na(begin_ex_df$PartsUsed))
x <- expand.grid(SerialNo = x$SerialNo, PartsUsed = c("A", "B"))
x$Values <- 0
y <- subset(begin_ex_df, !is.na(begin_ex_df$PartsUsed))
new_df <- rbind(x, y)
new_df[order(new_df$SerialNo),]
# SerialNo PartsUsed Values
#1 1234 A 0
#2 1234 B 0
#3 1357 A 0
#4 1357 B 0
#5 1357 A 11
#6 1357 B 14
#7 2468 A 0
#8 2468 B 0
#9 5678 A 0
#10 5678 B 0
#11 5678 A 10
#12 5678 B 15
#13 8080 A 0
#14 8080 B 0
#15 9012 A 0
#16 9012 B 0
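The question asks for the unique values of PartsUsed rather than a fixed "A"/"B" pair, so here is a minimal sketch of the same base R idea that derives the parts from the data. It assumes every part of interest appears at least once in a non-missing row; the names parts, is_missing and expanded are only illustrative. It also passes stringsAsFactors = FALSE to expand.grid so PartsUsed stays a character column instead of becoming a factor.

```r
# Rebuild the example input as a plain data frame so the block is self-contained.
begin_ex_df <- data.frame(
  SerialNo  = c(1234, 5678, 5678, 5678, 9012, 1357, 1357, 1357, 2468, 8080),
  PartsUsed = c(NA, "A", "B", NA, NA, "A", "B", NA, NA, NA),
  Values    = c(NA, 10, 15, NA, NA, 11, 14, NA, NA, NA),
  stringsAsFactors = FALSE
)

is_missing <- is.na(begin_ex_df$PartsUsed)

# Assumption: the parts to expand to are the non-missing values seen in the data.
parts <- sort(unique(begin_ex_df$PartsUsed[!is_missing]))

# One row per (serial number of an NA row, part); keep PartsUsed as character.
expanded <- expand.grid(
  SerialNo  = begin_ex_df$SerialNo[is_missing],
  PartsUsed = parts,
  stringsAsFactors = FALSE
)
expanded$Values <- 0

# Stack the expanded rows on top of the rows that already had data, then sort.
new_df <- rbind(expanded, begin_ex_df[!is_missing, ])
new_df <- new_df[order(new_df$SerialNo, new_df$PartsUsed), ]
rownames(new_df) <- NULL
```

If the set of parts is known in advance and might not all appear in the data, hard-coding c("A", "B") as in the answer above is the safer choice.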
If you want to use tidyr's complete, you can group_by SerialNo together with an indicator column (Ind) that marks whether PartsUsed is NA, complete each group, and then drop the leftover NA rows at the end. You can also remove the Ind column afterwards.
library(dplyr)
library(tidyr)
begin_ex_df %>%
group_by(SerialNo, Ind = if_else(is.na(PartsUsed), "no", "yes")) %>%
complete(., SerialNo, PartsUsed = c("A", "B"), fill = list(Values = 0)) %>%
na.omit()
# Ind SerialNo PartsUsed Values
# <chr> <dbl> <chr> <dbl>
# 1 no 1234 A 0
# 2 no 1234 B 0
# 3 no 1357 A 0
# 4 no 1357 B 0
# 5 yes 1357 A 11
# 6 yes 1357 B 14
# 7 no 2468 A 0
# 8 no 2468 B 0
# 9 no 5678 A 0
#10 no 5678 B 0
#11 yes 5678 A 10
#12 yes 5678 B 15
#13 no 8080 A 0
#14 no 8080 B 0
#15 no 9012 A 0
#16 no 9012 B 0