I have a data frame that looks something like this:
dataDemo <- data.frame(POS = 1:4 , REF = c("A" , "T" , "G" , "C") ,
ind1 = c("A" , "." , "G" , "C") , ind2 = c("A" , "C" , "C" , "."),
stringsAsFactors=FALSE)
dataDemo
  POS REF ind1 ind2
1   1   A    A    A
2   2   T    .    C
3   3   G    G    C
4   4   C    C    .
I'd like to replace every "." with the REF value for that row. Here is how I did it:
for (i in seq_along(dataDemo$REF)) {
  dataDemo[i, ][dataDemo[i, ] == '.'] <- dataDemo$REF[i]
}
I'd like to know if there's a more 'proper' or idiomatic way of doing this in R. I generally try to use *apply whenever possible and this seems like something that could easily be adapted to that approach and made more readable (and run faster), but despite throwing a good bit of time at it I haven't made much progress.
With dplyr:
library(dplyr)
dataDemo %>% mutate_each(funs(ifelse(. == '.', REF, as.character(.))), -POS)
#   POS REF ind1 ind2
# 1   1   A    A    A
# 2   2   T    T    C
# 3   3   G    G    C
# 4   4   C    C    C
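Note that mutate_each() and funs() are deprecated in more recent dplyr releases. A minimal sketch of the same idea, assuming dplyr 1.0.0 or later (where across() is available), might look like the following. Selecting the columns by name (ind1, ind2) is my own choice for illustration; any tidyselect expression that picks out the individual columns should work as well.

```r
library(dplyr)

dataDemo <- data.frame(POS = 1:4, REF = c("A", "T", "G", "C"),
                       ind1 = c("A", ".", "G", "C"),
                       ind2 = c("A", "C", "C", "."),
                       stringsAsFactors = FALSE)

# For each selected column, take REF where the value is ".",
# otherwise keep the original value.
result <- dataDemo %>%
  mutate(across(c(ind1, ind2), ~ ifelse(.x == ".", REF, .x)))

result
```

Because the columns are already character vectors (stringsAsFactors = FALSE), no as.character() conversion should be needed here.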
Here's another base R alternative, where we use the row numbers of the "." occurrences to replace them with the appropriate REF values.
# Get row numbers
rownrs <- which(dataDemo==".", arr.ind = TRUE)[,1]
# Replace values
dataDemo[dataDemo=="."] <- dataDemo$REF[rownrs]
# Result
dataDemo
#  POS REF ind1 ind2
#1   1   A    A    A
#2   2   T    T    C
#3   3   G    G    C
#4   4   C    C    C
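Since the question asks specifically about the *apply family, here is a hedged sketch of a column-wise lapply() version in base R. It is not taken from either answer above. The helper vector cols is a hypothetical name, and it assumes that every column other than POS and REF holds genotype calls.

```r
dataDemo <- data.frame(POS = 1:4, REF = c("A", "T", "G", "C"),
                       ind1 = c("A", ".", "G", "C"),
                       ind2 = c("A", "C", "C", "."),
                       stringsAsFactors = FALSE)

# Columns to fix: everything except POS and REF (an assumption about the layout).
cols <- setdiff(names(dataDemo), c("POS", "REF"))

# ifelse() is vectorised, so each column is compared against "." in one go
# and the matching REF entries are substituted row by row.
dataDemo[cols] <- lapply(dataDemo[cols],
                         function(x) ifelse(x == ".", dataDemo$REF, x))

dataDemo
```

This loops over columns rather than rows, which tends to suit data frames better because each column is stored as a single vector.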