Say I have a data frame, d1, that looks like this:
site code trait
1 1 A 1.0
2 2 B 1.3
3 3 A NA
4 4 B 2.9
5 5 A NA
Here is the dput to generate d1:
structure(list(site = 1:5, code = structure(c(1L, 2L, 1L, 2L,
1L), .Label = c("A", "B"), class = "factor"), trait = c(1, 1.3,
NA, 2.9, NA)), .Names = c("site", "code", "trait"), row.names = c(NA,
-5L), class = "data.frame")
I have a second data frame, d2, that looks like this:
code trait
1 A 1.5
2 B 2.5
Here is the dput to generate d2:
structure(list(code = structure(1:2, .Label = c("A", "B"), class = "factor"),
trait = c(1.5, 2.5)), .Names = c("code", "trait"), row.names = c(NA,
-2L), class = "data.frame")
I would like a piece of code that replaces each NA value of trait in d1 with the trait value from d2 whose code matches the code in that row of d1. The final output of d1 would look like this:
site code trait
1 1 A 1.0
2 2 B 1.3
3 3 A 1.5
4 4 B 2.9
5 5 A 1.5
Things I've tried:
d1$trait<- ifelse(is.na(d1$trait),d2$trait[d2$code == d1$code],d1$trait)
When using this code I'm getting a warning:
Warning messages:
1: In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length
2: In `==.default`(d2$code, d1$code) :
  longer object length is not a multiple of shorter object length
Your ifelse syntax is close, but the problematic bit is:
d2$trait[d2$code == d1$code]
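Here, you are trying to look up the d2$trait value corresponding to each code value in d1, but == compares d2$code and d1$code element by element, recycling the shorter vector (length 2 against length 5), which is what triggers the warnings. The lookup can instead be done with match, which returns, for each element of d1$code, the position of its first match in d2$code: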
d1$trait<- ifelse(is.na(d1$trait),d2$trait[match(d1$code, d2$code)], d1$trait)
d1
# site code trait
# 1 1 A 1.0
# 2 2 B 1.3
# 3 3 A 1.5
# 4 4 B 2.9
# 5 5 A 1.5
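To see why match does the lookup correctly, it can help to inspect the intermediate index vector it produces. The sketch below rebuilds d1 and d2 with data.frame() rather than the dput output (an assumption made for brevity; the contents are the same) and stores the index in a variable named idx, which is not part of the original answer.

```r
# Rebuild the question's data frames (same contents as the dput output)
d1 <- data.frame(site = 1:5,
                 code = factor(c("A", "B", "A", "B", "A")),
                 trait = c(1, 1.3, NA, 2.9, NA))
d2 <- data.frame(code = factor(c("A", "B")), trait = c(1.5, 2.5))

# For each row of d1, the row of d2 that holds the same code
idx <- match(d1$code, d2$code)
idx
# [1] 1 2 1 2 1

# Indexing d2$trait with idx gives one candidate replacement per row of d1
d2$trait[idx]
# [1] 1.5 2.5 1.5 2.5 1.5
```

Because the result has the same length as d1$trait, ifelse can pair the two vectors row by row without any recycling.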
An alternative would be to just replace the missing values, again using match to grab the relevant elements from d2$trait:
d1$trait[is.na(d1$trait)] <- d2$trait[match(d1$code[is.na(d1$trait)], d2$code)]
d1
# site code trait
# 1 1 A 1.0
# 2 2 B 1.3
# 3 3 A 1.5
# 4 4 B 2.9
# 5 5 A 1.5
Although match and merge do very similar things internally, I find the match syntax a bit easier to use, because you don't need to create an intermediate object with merge and then pull the relevant information out of it.
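If the same fill is needed in several places, the match approach could be wrapped in a small function. This is only a sketch: fill_na_from_lookup and its arguments are hypothetical names, and the extra row with code "C" is made up here to show what happens when a code has no entry in the lookup table (match returns NA for it, so the value simply stays NA).

```r
# Hypothetical helper: fill NAs in x[[value]] using lookup[[value]],
# matching rows on the column named by key
fill_na_from_lookup <- function(x, lookup, key, value) {
  na_rows <- is.na(x[[value]])
  idx <- match(x[[key]][na_rows], lookup[[key]])
  x[[value]][na_rows] <- lookup[[value]][idx]
  x
}

# The question's data plus one illustrative row whose code is absent from d2
d1 <- data.frame(site = 1:6,
                 code = c("A", "B", "A", "B", "A", "C"),
                 trait = c(1, 1.3, NA, 2.9, NA, NA))
d2 <- data.frame(code = c("A", "B"), trait = c(1.5, 2.5))

filled <- fill_na_from_lookup(d1, d2, key = "code", value = "trait")
filled$trait
# [1] 1.0 1.3 1.5 2.9 1.5  NA
```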
It is a simple task for merge, using the d1 and d2 data frames from the question:
df12 <- merge(d1, d2, by="code", all.x=TRUE)
df12$trait <- ifelse(is.na(df12$trait.x), df12$trait.y, df12$trait.x)
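A self-contained version of the merge approach might look like the sketch below. It rebuilds the question's data with data.frame() (an assumption for brevity) and adds two steps that are not in the answer above: merge sorts its result by the key column, so the rows are put back in site order, and the helper columns trait.x and trait.y are dropped. The name d12 is illustrative.

```r
# Rebuild the question's data frames (same contents as the dput output)
d1 <- data.frame(site = 1:5,
                 code = factor(c("A", "B", "A", "B", "A")),
                 trait = c(1, 1.3, NA, 2.9, NA))
d2 <- data.frame(code = factor(c("A", "B")), trait = c(1.5, 2.5))

# Left join on code; the clashing trait columns get the default
# suffixes .x (from d1) and .y (from d2)
d12 <- merge(d1, d2, by = "code", all.x = TRUE)

# Keep d1's value where present, otherwise fall back to d2's
d12$trait <- ifelse(is.na(d12$trait.x), d12$trait.y, d12$trait.x)

# merge() sorts by the key, so restore site order and the original columns
d12 <- d12[order(d12$site), c("site", "code", "trait")]
rownames(d12) <- NULL
d12
```

Compared with the match version, this needs the extra reordering and column cleanup, but it extends naturally to lookups keyed on more than one column.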