mean returns NaN despite na.rm = TRUE

Tags: r, dplyr

Sample data

date        coins   
2013-10-01  NA      
2013-10-01  NA      
2013-10-01  NA      
2013-11-01  10      
2013-11-01  NA      
2013-11-01  20      
2013-11-01  30      
2013-11-01  40      
2013-12-30  NA      
2013-12-30  22      
2013-12-30  24
2013-12-30  25

What I want to do?

I want to calculate mean and median of the coins column, ignoring missing values.

What have I done so far?

  1. Grouped the data by the date variable: by_date <- group_by(df, date)
  2. Summarised the data using: by_date %>% summarise_each_(funs(mean(., na.rm = TRUE), median(., na.rm = TRUE)), names(by_date)[2])

Question: The results returned by summarise_each_ show NaN for date 2013-10-01. Does that mean the function is not ignoring missing values?

Imran Ali asked Feb 15 '16 15:02


1 Answer

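The problem here is that all the values for 2013-10-01 are NA. With na.rm = TRUE those values are removed before the calculation, so nothing is left to average and there can't be a mean. The NaN is R trying to tell you this; the function is ignoring the missing values as requested.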
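A minimal base R sketch of why this happens, assuming the 2013-10-01 values are stored as numeric NAs. The vector names oct_coins, nov_coins and remaining are made up for illustration and do not appear in the question.

```r
# Hypothetical vector standing in for the 2013-10-01 group: every value is missing
oct_coins <- c(NA_real_, NA_real_, NA_real_)

# Dropping the NAs (which is what na.rm = TRUE does) leaves an empty vector
remaining <- oct_coins[!is.na(oct_coins)]
length(remaining)                # 0

# The mean of an empty vector is 0/0, which R reports as NaN
mean(oct_coins, na.rm = TRUE)    # NaN

# Hypothetical vector standing in for the 2013-11-01 group: one NA among real values
nov_coins <- c(10, NA, 20, 30, 40)
mean(nov_coins, na.rm = TRUE)    # 25
median(nov_coins, na.rm = TRUE)  # 25
```

So NaN here signals "no non-missing values in this group", not that na.rm was ignored.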

If you'd rather just not have 2013-10-01 show up in the summary, one option is to get rid of NA values upfront like this:

by_date <- group_by(df[!is.na(df$coins), ], date)
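The same idea can be sketched end to end with current dplyr verbs. This is one possible variant, not the code from the question: it assumes dplyr is installed, rebuilds the sample data by hand as a data frame named df with date kept as a character column, and uses summarise() in place of summarise_each_(). The column names mean_coins and median_coins are illustrative.

```r
library(dplyr)

# Sample data from the question, rebuilt by hand (date kept as character for simplicity)
df <- data.frame(
  date  = rep(c("2013-10-01", "2013-11-01", "2013-12-30"), times = c(3, 5, 4)),
  coins = c(NA, NA, NA, 10, NA, 20, 30, 40, NA, 22, 24, 25)
)

# Drop rows with missing coins first, then summarise each remaining date
result <- df %>%
  filter(!is.na(coins)) %>%
  group_by(date) %>%
  summarise(
    mean_coins   = mean(coins),
    median_coins = median(coins)
  )

result
```

Because none of the 2013-10-01 rows survive the filter, that date does not appear in result at all, rather than appearing with NaN.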
mrip answered Oct 02 '22 14:10