Sample data
date coins
2013-10-01 NA
2013-10-01 NA
2013-10-01 NA
2013-11-01 10
2013-11-01 NA
2013-11-01 20
2013-11-01 30
2013-11-01 40
2013-12-30 NA
2013-12-30 22
2013-12-30 24
2013-12-30 25
What I want to do?
I want to calculate mean and median of the coins column, ignoring missing values.
What I have done so far?
by_date <- group_by(df, date)
by_date %>% summarise_each_(funs(mean(., na.rm = TRUE), median(., na.rm=TRUE)), names(by_date)[2])
Question
The results returned by summarise_each_ show NaN for date 2013-10-01. Does that mean the function is not ignoring missing values?
By default, mean() returns NA if the object passed to it contains any NA values. To calculate the mean of only the non-missing values, set the na.rm argument to TRUE (its default is FALSE).
In R, na.rm is a logical argument that tells a function whether to remove NA values before doing its calculation. The name literally means "NA remove". It is neither a function nor an operation; it is simply an argument accepted by many summary functions, such as mean(), median() and sum().
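As a minimal sketch of the na.rm behaviour described above, the following base R snippet compares the default with na.rm = TRUE. The vector coins is an illustrative stand-in (the 2013-11-01 values from the sample data), not the asker's actual data frame.

```r
# Illustrative vector: the 2013-11-01 values from the sample data
coins <- c(10, NA, 20, 30, 40)

# na.rm defaults to FALSE, so a single NA makes the result NA
mean_default <- mean(coins)

# With na.rm = TRUE the NA is dropped first: (10 + 20 + 30 + 40) / 4
mean_no_na <- mean(coins, na.rm = TRUE)

# median() accepts the same argument
median_no_na <- median(coins, na.rm = TRUE)

print(mean_default)   # NA
print(mean_no_na)     # 25
print(median_no_na)   # 25
```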
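The problem here is that all the values for 2013-10-01 are NA, so there can't be a mean. The missing values are being ignored as requested; once they are removed, nothing is left to average, and the NaN is R trying to tell you this.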
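The all-NA case can be reproduced in base R without any grouping. This is a small sketch with an illustrative vector all_missing standing in for the 2013-10-01 group; it shows that na.rm = TRUE leaves an empty vector, whose mean is NaN.

```r
# Illustrative stand-in for the 2013-10-01 group: every value is missing
all_missing <- c(NA_real_, NA_real_, NA_real_)

# This is what na.rm = TRUE leaves behind: an empty numeric vector
remaining <- all_missing[!is.na(all_missing)]
n_remaining <- length(remaining)

# The mean of zero values is 0 / 0, which R reports as NaN
mean_all_na <- mean(all_missing, na.rm = TRUE)

# median() of an empty vector comes back as NA rather than NaN
median_all_na <- median(all_missing, na.rm = TRUE)

print(n_remaining)     # 0
print(mean_all_na)     # NaN
print(median_all_na)   # NA
```

So a NaN in the mean column is a sign that the group had no non-missing values at all, not that na.rm was ignored.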
If you'd rather just not have 2013-10-01 show up in the summary, one option is to get rid of NA values upfront like this:
by_date <- group_by(df[!is.na(df$coins), ], date)
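Putting the pieces together, here is a hedged end-to-end sketch using the sample data from the question. It assumes dplyr is installed, and it swaps in filter() and summarise() in place of the base subsetting and summarise_each_ used above; the column names mean_coins and median_coins are made up for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  date = as.Date(c(rep("2013-10-01", 3),
                   rep("2013-11-01", 5),
                   rep("2013-12-30", 4))),
  coins = c(NA, NA, NA, 10, NA, 20, 30, 40, NA, 22, 24, 25)
)

# Drop the NA rows first, then summarise per date.
# 2013-10-01 has no rows left after filtering, so it does not appear in the result.
result <- df %>%
  filter(!is.na(coins)) %>%
  group_by(date) %>%
  summarise(mean_coins = mean(coins),
            median_coins = median(coins))

print(result)
```

Because the missing values are removed before grouping, na.rm = TRUE is no longer needed inside summarise(). If you want to keep 2013-10-01 in the output, skip the filter() step and pass na.rm = TRUE instead, accepting NaN for that date.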