For commands like max, the option na.rm is set by default to FALSE. I understand why this is a good idea in general, but I'd like to turn it off reversibly for a while, i.e. during a session.
How can I require R to set na.rm = TRUE whenever it is an option? I found options(na.action = na.omit), but this doesn't work. I know that I can set a na.rm=TRUE option for each and every function I write.
my.max <- function(x) {max(x, na.rm=TRUE)}
But that's not what I am looking for. I'm wondering if there's something I could do more globally/universally instead of doing it for each function.
You can use the argument na.rm = TRUE to exclude missing values when calculating descriptive statistics in R.
The na.rm argument gives a simple way of removing missing values from data if they are coded as NA. In base R its default value is FALSE, meaning NAs are not removed.
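As a small illustration of the default described above, the following sketch compares calls with and without na.rm = TRUE. The vector x is made up for the example.

```r
# Illustrative data with one missing value (hypothetical values).
x <- c(4, NA, 9)

max(x)                 # NA, because na.rm defaults to FALSE
max(x, na.rm = TRUE)   # 9
mean(x, na.rm = TRUE)  # 6.5
```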
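One (dangerous) workaround is to do the following: list all functions that have na.rm as an argument (here I limited my search to the base package), add the line na.rm = TRUE at the start of each function's body, and assign the modified function back into the base namespace.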
So first I store in a character vector (na.rm.f) the names of all functions having na.rm as an argument:
uses_arg <- function(x, arg)
  is.function(fx <- get(x)) && arg %in% names(formals(fx))
basevals <- ls(pos = "package:base")
na.rm.f <- basevals[sapply(basevals, uses_arg, 'na.rm')]
EDIT: a better method to get all functions that have an na.rm argument (thanks to mnel's comment):
Funs <- Filter(is.function, sapply(ls(baseenv()), get, baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x))) %in% 'na.rm'), Funs))
So na.rm.f looks like:
 [1] "all"                     "any"                     "colMeans"                "colSums"
 [5] "is.unsorted"             "max"                     "mean.default"            "min"
 [9] "pmax"                    "pmax.int"                "pmin"                    "pmin.int"
[13] "prod"                    "range"                   "range.default"           "rowMeans"
[17] "rowsum.data.frame"       "rowsum.default"          "rowSums"                 "sum"
[21] "Summary.data.frame"      "Summary.Date"            "Summary.difftime"        "Summary.factor"
[25] "Summary.numeric_version" "Summary.ordered"         "Summary.POSIXct"         "Summary.POSIXlt"
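A likely reason the second lookup is more complete is that several of these functions, max and sum among them, are primitives. formals() returns NULL for a primitive, while args() returns a closure whose formals can be inspected. A minimal sketch of that difference, using only base R:

```r
# max is a primitive, so it has no formals of its own.
is.primitive(max)                       # TRUE
is.null(formals(max))                   # TRUE

# args() returns a closure with the same argument list, which formals() can read.
names(formals(args(max)))               # "..."   "na.rm"
"na.rm" %in% names(formals(args(sum)))  # TRUE
```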
Then for each function I change the body. The code is inspired by the data.table package (FAQ 2.23), which adds one line to the start of rbind.data.frame and cbind.data.frame.
ll <- lapply(na.rm.f, function(x) {
  tt <- get(x)
  ss = body(tt)
  if (class(ss) != "{") ss = as.call(c(as.name("{"), ss))
  if (length(ss) < 2) print(x)
  else {
    if (!length(grep("na.rm = TRUE", ss[[2]], fixed = TRUE))) {
      ss = ss[c(1, NA, 2:length(ss))]
      ss[[2]] = parse(text = "na.rm = TRUE")[[1]]
      body(tt) = ss
      (unlockBinding)(x, baseenv())
      assign(x, tt, envir = asNamespace("base"), inherits = FALSE)
      lockBinding(x, baseenv())
    }
  }
})
Now, if you check the first line of each function in our list:
unique(lapply(na.rm.f, function(x) body(get(x))[[2]]))
[[1]]
na.rm = TRUE
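To try the body-editing step without touching the base namespace, the same idea can be applied to a copy of an ordinary function. This is only a sketch: force_na_rm and my_range_width are hypothetical names introduced here, and the override is written as na.rm <- TRUE rather than na.rm = TRUE so that it can be built with quote(). It only works on closures, since a primitive such as max has no body to edit.

```r
# Hypothetical helper: returns a copy of closure `f` whose body
# starts with `na.rm <- TRUE`; the original function is left unchanged.
force_na_rm <- function(f) {
  stopifnot(is.function(f), !is.primitive(f),
            "na.rm" %in% names(formals(f)))
  b <- body(f)
  # Wrap a single-expression body in braces so a statement can be prepended.
  if (!is.call(b) || !identical(b[[1]], as.name("{"))) {
    b <- as.call(list(as.name("{"), b))
  }
  # New body: the brace, the override, then the original statements.
  body(f) <- as.call(c(list(as.name("{"), quote(na.rm <- TRUE)),
                       as.list(b)[-1]))
  f
}

# Hypothetical example function with an na.rm argument.
my_range_width <- function(x, na.rm = FALSE)
  max(x, na.rm = na.rm) - min(x, na.rm = na.rm)

patched <- force_na_rm(my_range_width)

x <- c(2, NA, 10)
my_range_width(x)  # NA
patched(x)         # 8
```

Because the patched version is a separate object, dropping it with rm(patched) is enough to go back to the default behaviour.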