Please consider the following:
I recently 'discovered' the awesome `plyr` and `dplyr` packages and use them for analysing patient data that is available to me in a data frame. Such a data frame could look like this:
df <- data.frame(id = c(1, 1, 1, 2, 2), # patient ID
diag = c(rep("dia1", 3), rep("dia2", 2)), # diagnosis
age = c(7.8, NA, 7.9, NA, NA)) # patient age
I would like to compute the minimum age of each patient and then summarise those minima across all patients with a median and mean. I did the following:
min.age <- df %>%
group_by(id) %>%
summarise(min.age = min(age, na.rm = T))
Since there are `NA`s in the data frame I receive the warning:
`Warning message: In min(age, na.rm = T) :
no non-missing arguments to min; returning Inf`
With `Inf` I cannot call `summary(df$min.age)` in a meaningful way.
Using `pmin()` instead of `min()` returned the error message:
Error in summarise_impl(.data, dots) :
Column 'min.age' must be length 1 (a summary value), not 3
What can I do to avoid any `Inf` and instead get `NA`, so that I can proceed with `summary(df$min.age)`?
Thanks a lot!
You could use `is.infinite()` to detect the infinities and `ifelse()` to conditionally set them to `NA`.
#using your df and the dplyr package
min.age <-
df %>%
group_by(id) %>%
summarise(min.age = min(age, na.rm = T)) %>%
mutate(min.age = ifelse(is.infinite(min.age), NA, min.age))
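If you would rather not produce `Inf` (and the warning) in the first place, one possible alternative is to check for an all-`NA` group before calling `min()`. The sketch below assumes a small helper named `safe_min()`; that name is hypothetical and not part of base R or dplyr.

```r
library(dplyr)

# Example data from the question
df <- data.frame(id   = c(1, 1, 1, 2, 2),                    # patient ID
                 diag = c(rep("dia1", 3), rep("dia2", 2)),   # diagnosis
                 age  = c(7.8, NA, 7.9, NA, NA))             # patient age

# Hypothetical helper (not from base R or dplyr):
# returns NA when every value is missing, otherwise the minimum ignoring NAs
safe_min <- function(x) {
  if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
}

min.age <- df %>%
  group_by(id) %>%
  summarise(min.age = safe_min(age))   # id 1 gets 7.8, id 2 gets NA

# The summary now reports the all-missing patient under NA's
summary(min.age$min.age)
```

Note that the result lives in the new `min.age` data frame, so the summary is taken on `min.age$min.age` rather than `df$min.age`. As for `pmin()`: it works element-wise and returns one value per row of the group, which is why `summarise()` complained about a length of 3 instead of 1.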