
NaN is removed when using na.rm=TRUE

Tags: r, nan, na, na.rm

This reproducible example is a very simplified version of my code:

x <- c(NaN, 2, 3)

# This is fine, as expected
max(x)
# [1] NaN

# Why does na.rm remove NaN?
max(x, na.rm = TRUE)
# [1] 3

To me, NA (missing value) and NaN (not a number) are two completely different entities. Why does na.rm remove NaN? How can I ignore NA but not NaN?

P.S. I am using 64-bit R version 3.0.0 on Windows 7.

Edit: After some more study I found that is.na returns TRUE for NaN too! This is the source of my confusion.

is.na(NaN)
# [1] TRUE
Asked by Nishanth, Apr 16 '13 03:04


2 Answers

It's a language decision:

> is.na(NaN)
[1] TRUE

is.nan differentiates:

> is.nan(NaN)
[1] TRUE
> is.nan(NA)
[1] FALSE

So to tell NA and NaN apart, you may need to call both.
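As a minimal sketch of calling both, the two tests can be combined into a mask that is TRUE only for a "plain" NA. The vector `x` and the name `only_na` here are illustrative, not from the question:

```r
# Illustrative vector containing both kinds of value
x <- c(NA, NaN, 2, 3)

is.na(x)   # TRUE for both NA and NaN:  TRUE  TRUE FALSE FALSE
is.nan(x)  # TRUE only for NaN:        FALSE  TRUE FALSE FALSE

# TRUE only where the value is NA but not NaN
only_na <- is.na(x) & !is.nan(x)
only_na    # TRUE FALSE FALSE FALSE
```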

Answered by Matthew Lundberg, Sep 20 '22 17:09


na.rm arguments in functions generally use is.na() or an analogous test to decide what to drop. Since is.na(NaN) is TRUE, NaN is removed along with NA, which is the behavior you're observing.
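A rough way to picture this, as an illustration rather than the actual internal implementation of max(), is that na.rm=TRUE behaves like filtering with is.na() first:

```r
x <- c(NaN, 2, 3)

# Conceptually what na.rm = TRUE does: drop everything is.na() flags
kept <- x[!is.na(x)]   # NaN is flagged by is.na(), so only 2 and 3 remain

max(kept)              # same result as max(x, na.rm = TRUE)
```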

Now, should NaN also be treated as NA? That is a different question ;)


The best way around this is to explicitly tell R how to handle NaN. One example:

ifelse(any(is.nan(x)), NaN, min(x, na.rm=TRUE))
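Another possible approach, sketched here for the question's max() case, is to drop only the plain NA values before calling max() so that any NaN still propagates. The helper name `max_keep_nan` is hypothetical and not part of base R:

```r
# Hypothetical helper (not part of base R): ignore NA but keep NaN
max_keep_nan <- function(x) {
  is_plain_na <- is.na(x) & !is.nan(x)  # TRUE for NA, FALSE for NaN
  max(x[!is_plain_na])                  # NaN, if present, still reaches max()
}

max_keep_nan(c(NA, 2, 3))       # NA is ignored, result is 3
max_keep_nan(c(NA, NaN, 2, 3))  # NaN is kept, result is NaN
```

Note that if every element is NA, the filtered vector is empty and max() returns -Inf with a warning, so that case may need separate handling.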
Answered by Ricardo Saporta, Sep 23 '22 17:09