Replace values in a dataframe based on another factor which contains NA's in R

Tags: dataframe, r
I have a dataframe which contains (among other things) a numeric column with a concentration, and a factor column with a status flag. This status flag contains NA's.

Here's an example

df<-structure(list(conc = c(101.769, 1.734, 62.944, 92.697, 25.091, 27.377, 24.343, 55.084, 0.335, 23.280), status = structure(c(NA, NA, NA, NA, NA, NA, 2L, NA, 1L, NA), .Label = c("<LLOQ", "NR"), class = "factor")), .Names = c("conc", "status"), row.names = c(NA, -10L), class = "data.frame")

I want to replace the concentration column with a string for some values of the flag column, or with the concentration value formatted to a certain number of significant digits.

When I try this

ifelse(df$status=="NR","NR",df$conc)

The NA's in the status flag don't trigger either the true or the false condition; they return NA, as the documentation says they will. I could loop over the rows and use if/else on each one, but this seems inefficient.

Am I missing something? I've also tried as.character(df$status), which doesn't work either. My mojo must be getting low...

PaulHurleyuk asked Dec 28 '22 21:12


2 Answers

Use %in% instead of ==. Comparing NA with == gives NA, whereas %in% always returns TRUE or FALSE:

ifelse(df$status %in% "NR","NR", df$conc)

Side-by-side comparison of the two methods:

data.frame(df, ph = ifelse(df$status=="NR","NR",df$conc), mp = ifelse(df$status %in% "NR","NR",df$conc))

Check out ?match for more information - I'm not sure I could explain it well.
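
As a rough illustration of the difference (a minimal sketch, not part of the original answer; the short `status` vector is a made-up stand-in for df$status): `==` propagates NA, while `%in%` is built on match() and only ever yields TRUE or FALSE, so ifelse() never receives an NA test.

```r
# Hypothetical stand-in for df$status: a factor containing NA entries
status <- factor(c(NA, "NR", "<LLOQ", NA), levels = c("<LLOQ", "NR"))

# `==` returns NA wherever status is NA, so ifelse() would return NA there too
eq_test <- status == "NR"
print(eq_test)

# `%in%` treats NA as "no match" and returns FALSE instead
in_test <- status %in% "NR"
print(in_test)
```

Because the second test contains no NA, every row falls into either the "yes" or the "no" branch of ifelse().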

Matt Parker answered Jan 31 '23 09:01


You must explicitly test for NA, so you can use:

ifelse(df$status=="NR" | is.na(df$status),"NR",df$conc) # gives you NR for NA

or

ifelse(df$status=="NR" & !is.na(df$status),"NR",df$conc) # gives you df$conc for NA
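
The same explicit NA test can also cover the formatting part of the question. This is only a sketch under assumptions the question leaves open: it uses 3 significant digits, it replaces the concentration for every non-NA flag (not just "NR"), and the `report` column name and the shortened data frame are made up for illustration.

```r
# Shortened, hypothetical version of the example data
df <- data.frame(
  conc   = c(101.769, 1.734, 24.343, 0.335),
  status = factor(c(NA, NA, "NR", "<LLOQ"), levels = c("<LLOQ", "NR"))
)

# Assumption: 3 significant digits (the question does not say how many)
sig_digits <- 3

# Rows without a flag get the formatted concentration;
# flagged rows get the flag label as a string
df$report <- ifelse(
  is.na(df$status),
  formatC(signif(df$conc, sig_digits), digits = sig_digits, format = "g"),
  as.character(df$status)
)
print(df)
```

Testing is.na() first means the condition itself never contains NA, so no row falls through to an NA result. To replace only some flags, the test could be swapped for something like df$status %in% "NR".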
Marek answered Jan 31 '23 09:01