
Make min() in R return NA instead of Inf

Tags: r, dplyr, min, plyr

Please consider the following:

I recently 'discovered' the awesome plyr and dplyr packages and use those for analysing patient data that is available to me in a data frame. Such a data frame could look like this:

df <- data.frame(id = c(1, 1, 1, 2, 2), # patient ID
                 diag = c(rep("dia1", 3), rep("dia2", 2)), # diagnosis
                 age = c(7.8, NA, 7.9, NA, NA)) # patient age

I would like to compute the minimum age of each patient and then summarise those minima with a median and mean. I did the following:

library(dplyr) # provides %>%, group_by() and summarise()

min.age <- df %>%
  group_by(id) %>%
  summarise(min.age = min(age, na.rm = T))

Since there are NAs in the data frame (all ages of patient 2 are missing), I receive the warning:

Warning message:
In min(age, na.rm = T) :
  no non-missing arguments to min; returning Inf

With Inf I cannot call summary(min.age$min.age) in a meaningful way.
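The Inf comes from base R rather than from dplyr: once na.rm drops every value in a group, min() is left with an empty vector and returns Inf. A minimal sketch of that behaviour, using made-up vectors (the variable names are illustrative, not part of the question's data):

```r
# Hypothetical stand-ins for the ages within a single group
all_missing  <- c(NA_real_, NA_real_)
some_missing <- c(7.8, NA, 7.9)

# suppressWarnings() hides the "no non-missing arguments to min" warning
empty_min <- suppressWarnings(min(all_missing, na.rm = TRUE))
usual_min <- min(some_missing, na.rm = TRUE)

# A single infinite group minimum makes the mean infinite as well
mean_with_inf <- mean(c(usual_min, empty_min))
```

This is why the summary statistics stop being meaningful as soon as one group has no recorded age.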

Using pmin() instead of min() returned the error message:

Error in summarise_impl(.data, dots) :
 Column 'min.age' must be length 1 (a summary value), not 3

What can I do to avoid any Inf and instead get NA, so that I can proceed with summary(min.age$min.age)?

Thanks a lot!

Frederick asked Jan 19 '18 14:01


1 Answer

You could use is.infinite() to detect the infinities and ifelse() to conditionally set them to NA. The warning from min() still appears, but the result no longer contains Inf.

# using your df and the dplyr package
library(dplyr)

min.age <-
  df %>%
  group_by(id) %>%
  summarise(min.age = min(age, na.rm = TRUE)) %>%
  # groups with only NA ages yield Inf; turn those into NA
  mutate(min.age = ifelse(is.infinite(min.age), NA, min.age))
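As a possible alternative, the all-NA case can be checked before min() is called, which also avoids the warning. The sketch below assumes the df from the question; safe_min is a hypothetical helper name, not a function from base R or dplyr.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(id = c(1, 1, 1, 2, 2),
                 diag = c(rep("dia1", 3), rep("dia2", 2)),
                 age = c(7.8, NA, 7.9, NA, NA))

# Hypothetical helper: NA if every value is missing, otherwise the minimum
safe_min <- function(x) {
  if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
}

min.age <- df %>%
  group_by(id) %>%
  summarise(min.age = safe_min(age))

# summary() counts the NA separately instead of reporting Inf
summary(min.age$min.age)
```

Returning NA_real_ rather than a plain NA keeps the column numeric in every group, so the result can be passed straight to summary(), median() or mean(..., na.rm = TRUE).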
Andrew Chisholm answered Sep 23 '22 15:09