
R max function ignore NA

Tags:

r

max

I have the working code below. When I replicate the same thing on a different data set, I get errors :(

#max by values
df <- data.frame(age=c(5,NA,9), marks=c(1,2,7), story=c(2,9,NA))
df

df$colMax <- apply(df[,1:3], 1, function(x) max(x[x != 9],na.rm=TRUE))
df

I tried to do the same on a bigger data set and I am getting warnings. Why?

maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, function(x) max(x[x != 9],na.rm=TRUE))


50: In max(x[x != 9], na.rm = TRUE) :
  no non-missing arguments to max; returning -Inf

To understand the problem better, I made the change below, but I am still getting warnings:

maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, function(x) max(x,na.rm=TRUE))
1: In max(x, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
user2543622 asked Jul 01 '14 21:07



3 Answers

You can use hablar::max_, which returns NA if all values are NA:

apply(df, 1, function(x) hablar::max_(x[x!=9]))
#[1]  5 NA  7

data

df <- structure(list(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 
9, NA)), row.names = c(NA, -3L), class = "data.frame")

df
#  age marks story
#1   5    -5     2
#2  NA    NA     9
#3   9     7    NA
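To see which rows are responsible for the warning in data like this, one option is to count how many usable values each row has left once NAs and 9s are excluded. This is only a base-R sketch on the sample data above; the names m, usable, n_usable and empty_rows are made up for illustration.

```r
# Sample data from this answer
df <- data.frame(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 9, NA))

m <- as.matrix(df)

# TRUE where a cell is neither NA nor 9 (FALSE & NA evaluates to FALSE,
# so NA cells are counted as unusable rather than propagating NA)
usable <- !is.na(m) & m != 9

# Number of usable values per row; 0 means max() would receive nothing
n_usable <- rowSums(usable)
empty_rows <- which(n_usable == 0)
empty_rows
```

Rows listed in empty_rows are the ones for which max(x[x != 9], na.rm = TRUE) has nothing to work with and falls back to -Inf.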
Ronak Shah answered Sep 20 '22 01:09


It seems that the problem has been pointed out in the comments already. Since some vectors contain only NAs, -Inf is reported, which, judging from the comments, you don't like. In this answer I would like to point out one possible way to tackle the issue, namely to build in a control statement (instead of overwriting -Inf after the fact, which is equally valid). For instance,

 my.max <- function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)

does the trick. If every element in x is NA (the all() check), then NA is returned, and the max otherwise. If you want any other value returned, just exchange NA for that value. You can also build this easily into your apply call, e.g.

 maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, my.max)
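As a minimal sketch of how this could be combined with the "ignore 9" filter from the question, here is the same idea on a small made-up data frame. The data frame toy and its values are hypothetical stand-ins for maindata, which is not shown in the question.

```r
# Hypothetical stand-in for `maindata` (values invented for illustration)
toy <- data.frame(Q2_1 = c(3, NA, 9, 9),
                  Q2_2 = c(1, NA, 4, 9),
                  Q2_3 = c(9, NA, NA, 9))

# Return NA instead of -Inf when there is nothing to take the max of.
# all(is.na(x)) is also TRUE for a zero-length x, so the empty case is covered.
my.max <- function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)

# Drop the 9s first, then apply the guarded max row by row
cols <- paste("Q2", 1:3, sep = "_")
toy$max_pc_age <- apply(toy[, cols], 1, function(x) my.max(x[x != 9]))
toy$max_pc_age
# [1]  3 NA  4 NA
```

The second row is all NA and the fourth row is all 9s; both end up as NA without triggering the warning.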

I am still sometimes confused by R's NA and empty set treatment. Statements like test <- NA; test==NA will give NA as a result (instead of TRUE, as returned by is.na(test)), which is sometimes rationalized by saying that since the value is missing, how could you know that these two missing values are identical? In this case, however, max returns -Inf since it is given an empty set, which I think is not at all obvious. My experience, though, is that when strange and unexpected results pop up, NAs or empty sets are often involved.
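The behaviours described above can be reproduced in a few lines. This is just an illustrative sketch; the variable names are arbitrary.

```r
test <- NA
cmp <- test == NA          # NA: comparing with a missing value gives a missing result
chk <- is.na(test)         # TRUE: the intended way to test for missingness

x <- c(NA_real_, NA_real_)
kept <- x[!is.na(x)]       # numeric(0): nothing is left once the NAs are removed

# max() of an empty vector is -Inf (with a warning, suppressed here)
empty_max <- suppressWarnings(max(x, na.rm = TRUE))

# -Inf is the neutral element for max: combining it with any number
# gives back that number
combined <- max(empty_max, 3)
```

One way to read the -Inf result: it keeps max(c(a, b)) equal to max(max(a), max(b)) even when one of the pieces is empty.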

coffeinjunky answered Oct 21 '22 23:10


In cases like the one below:

df[2,2] <- NA
df[1,2] <- -5

apply(df, 1, function(x) max(x[x != 9],na.rm=TRUE))
#[1]    5 -Inf    7
#Warning message:
#In max(x[x != 9], na.rm = TRUE) :
#  no non-missing arguments to max; returning -Inf

You could do:

df1 <- df  
minVal <- min(df1[!is.na(df1)])-1

df1[is.na(df1)|df1==9] <- minVal
val <- do.call(`pmax`, df1)
val[val==minVal] <- NA
val
#[1]  5 NA  7
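A possible variation on the same idea, sketched here as an alternative rather than as part of the approach above: turn the 9s into NA and let pmax() drop missing values itself via na.rm = TRUE, which avoids the sentinel value. The names cols and val2 are made up for illustration.

```r
# Sample data as modified above
df <- data.frame(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 9, NA))

# Replace 9 with NA column by column; which() skips NA comparisons
cols <- lapply(df, function(col) replace(col, which(col == 9), NA))

# Row-wise maximum across the columns, ignoring NAs;
# positions where every column is NA stay NA
val2 <- do.call(pmax, c(cols, na.rm = TRUE))
val2
# [1]  5 NA  7
```

Since pmax() is vectorised over whole columns, this also avoids the row-by-row apply() call.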
akrun answered Oct 22 '22 00:10