
Is it possible to set na.rm to TRUE globally?


For functions like max, the na.rm argument defaults to FALSE. I understand why this is a good idea in general, but I'd like to change that default reversibly for a while, i.e. for the duration of a session.

How can I require R to set na.rm = TRUE whenever it is an option? I found

options(na.action = na.omit) 

but this doesn't work. I know that I can write a wrapper that sets na.rm=TRUE for each and every function I use:

my.max <- function(x) {max(x, na.rm=TRUE)} 

But that's not what I am looking for. I'm wondering if there's something I could do more globally/universally instead of doing it for each function.

Hugh asked Jul 02 '13 06:07



1 Answer

One (dangerous) workaround is to do the following:

  1. List all functions that have na.rm as argument. Here I limited my search to the base package.
  2. Fetch each function and add this line at the beginning of its body: na.rm = TRUE
  3. Assign the function back to the base package.

So first I store in a character vector (na.rm.f) the names of all functions that have na.rm as an argument:

uses_arg <- function(x, arg)
  is.function(fx <- get(x)) &&
  arg %in% names(formals(fx))
basevals <- ls(pos = "package:base")
na.rm.f <- basevals[sapply(basevals, uses_arg, 'na.rm')]

EDIT: a better method to get all functions with an na.rm argument (thanks to mnel's comment):

Funs <- Filter(is.function, sapply(ls(baseenv()), get, baseenv()))
na.rm.f <- names(Filter(function(x) any(names(formals(args(x))) %in% 'na.rm'), Funs))

So na.rm.f looks like:

 [1] "all"                     "any"                     "colMeans"                "colSums"
 [5] "is.unsorted"             "max"                     "mean.default"            "min"
 [9] "pmax"                    "pmax.int"                "pmin"                    "pmin.int"
[13] "prod"                    "range"                   "range.default"           "rowMeans"
[17] "rowsum.data.frame"       "rowsum.default"          "rowSums"                 "sum"
[21] "Summary.data.frame"      "Summary.Date"            "Summary.difftime"        "Summary.factor"
[25] "Summary.numeric_version" "Summary.ordered"         "Summary.POSIXct"         "Summary.POSIXlt"
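A likely reason the second lookup is the better one: several of these functions (sum, max, min, ...) are primitives, and formals() returns NULL for a primitive, whereas args() returns a closure that carries the argument list. A minimal sketch of the difference, using sum as the example:

```r
# sum is a primitive, so formals() has nothing to report
is.primitive(sum)                     # TRUE
formals(sum)                          # NULL

# args() wraps the primitive's signature in a closure whose formals can be inspected
arg_names <- names(formals(args(sum)))
"na.rm" %in% arg_names                # TRUE
```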

Then, for each function, I change the body. The code is inspired by the data.table package (FAQ 2.23), which adds one line to the start of rbind.data.frame and cbind.data.frame.

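ll <- lapply(na.rm.f, function(x) {
  tt <- get(x)
  ss = body(tt)
  # wrap a single-expression body in { } so a statement can be prepended
  if (class(ss) != "{") ss = as.call(c(as.name("{"), ss))
  # nothing to patch (e.g. primitives, whose body is NULL): just print the name
  if (length(ss) < 2) print(x)
  else {
    # only patch functions that do not already start with na.rm = TRUE
    if (!length(grep("na.rm = TRUE", ss[[2]], fixed = TRUE))) {
      ss = ss[c(1, NA, 2:length(ss))]              # open a slot right after "{"
      ss[[2]] = parse(text = "na.rm = TRUE")[[1]]  # fill it with the new statement
      body(tt) = ss
      (unlockBinding)(x, baseenv())
      assign(x, tt, envir = asNamespace("base"), inherits = FALSE)
      lockBinding(x, baseenv())
    }
  }
})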
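The body-editing step can be tried in isolation without touching the base package. The sketch below applies the same idea to a toy function; my_mean and patched are hypothetical names used only for illustration, and the statement is built with quote() rather than parse().

```r
# Hypothetical toy function standing in for a base function with an na.rm argument
my_mean <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  sum(x) / length(x)
}

# Work on a copy and prepend `na.rm <- TRUE` as the first statement of its body
patched <- my_mean
b <- body(patched)
body(patched) <- as.call(c(list(as.name("{"), quote(na.rm <- TRUE)),
                           as.list(b)[-1]))

unpatched_result <- my_mean(c(1, NA, 3))                 # NA
patched_result   <- patched(c(1, NA, 3))                 # 2
forced_result    <- patched(c(1, NA, 3), na.rm = FALSE)  # still 2
```

Note the side effect shown by the last call: because the assignment runs on every call, the caller can no longer switch NA removal off, which is part of why the approach is dangerous.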

Now, if you check the first line of each function in our list:

unique(lapply(na.rm.f, function(x) body(get(x))[[2]]))
[[1]]
na.rm = TRUE

agstudy answered Oct 07 '22 00:10