
R: Why does mean(NA, na.rm = TRUE) return NaN?

Tags: r, nan, na, mean

When taking the mean of a vector that consists entirely of NAs, we get NaN if na.rm = TRUE. Why is this? Is it flawed logic, or is there something I'm missing? Surely it would make more sense to return NA rather than NaN?

A quick example:

mean(NA, na.rm = TRUE)
#[1] NaN

mean(rep(NA, 10), na.rm = TRUE)
#[1] NaN
Glen Moutrie asked Jul 24 '18 16:07





1 Answer

It is a pity that ?mean says nothing about this. My comment only told you that applying mean to an empty "numeric" vector gives NaN, without further reasoning. Rui Barradas's comment tried to explain it but was not accurate: division by 0 is not always NaN; it can also be Inf or -Inf. I once discussed this in "R: element-wise matrix division". However, we are getting close. Although mean(x) is not implemented as sum(x) / length(x), this mathematical identity does explain the NaN.
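To make the point about division by zero concrete, here is a minimal sketch in plain base R. The variable names are only for illustration.

```r
# Division by zero in R depends on the numerator.
pos  <- 1 / 0    # positive numerator
neg  <- -1 / 0   # negative numerator
zero <- 0 / 0    # zero numerator

print(c(pos, neg, zero))

# Only the 0 / 0 case is NaN; the other two are infinite.
print(is.nan(c(pos, neg, zero)))
```

So "division by zero" alone does not account for the NaN; the numerator has to be zero as well.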

From ?sum:

 *NB:* the sum of an empty set is zero, by definition.

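With na.rm = TRUE the NAs are dropped before the computation, which leaves an empty vector. So sum(numeric(0)) is 0. As length(numeric(0)) is also 0, mean(numeric(0)) amounts to 0 / 0, which is NaN.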
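The reasoning above can be traced step by step. This is only a sketch of the arithmetic, not how mean() is coded internally, and mean_or_na is a hypothetical helper (not part of base R) for anyone who would rather get NA back, as the question suggests.

```r
x <- rep(NA, 10)

# na.rm = TRUE drops the NAs first, which leaves an empty vector.
kept   <- x[!is.na(x)]
n_kept <- length(kept)

# The sum of an empty set is zero, so the "mean" amounts to 0 / 0.
total <- sum(kept)
ratio <- total / n_kept

m <- mean(x, na.rm = TRUE)
print(c(ratio, m))

# Hypothetical helper: return NA instead of NaN when nothing is left.
mean_or_na <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0L) NA_real_ else mean(v)
}

print(mean_or_na(x))
print(mean_or_na(c(1, NA, 3)))
```

Note that is.na() is TRUE for NaN as well, so code that only checks is.na() on the result treats both cases the same; use is.nan() if you need to tell them apart.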

Zheyuan Li answered Nov 15 '22 08:11
