To date when writing R functions I've passed undefined arguments as NULL values and then tested whether they are NULL i.e.
f1 <- function (x = NULL) {
if(is.null(x))
...
}
However I recently discovered the possibility of passing undefined arguments as missing i.e.
f2 <- function (x) {
if(missing(x))
...
}
The R documentation states that
Currently missing can only be used in the immediate body of the function that defines the argument, not in the body of a nested function or a local call. This may change in the future.
Clearly this is one disadvantage of using missing to detect undefined arguments. Are there any others that people are aware of? Or, to phrase the question in a more useful form: "When do you use missing versus NULL values for passing undefined function arguments in R, and why?"
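For what it's worth, the restriction quoted from the documentation can be reproduced with a short sketch. The function names outer_direct and outer_nested are made up for illustration, and since the documentation says this may change, the behaviour shown is only what I would expect from current R versions:

```r
# Hypothetical example: missing() works in the immediate body of the function...
outer_direct <- function(x) {
  missing(x)
}

# ...but not inside a nested function, where x is not one of its own arguments.
outer_nested <- function(x) {
  inner <- function() missing(x)
  inner()
}

direct_result <- outer_direct()
# try() captures the error instead of stopping the script
nested_result <- try(outer_nested(), silent = TRUE)
```

Here direct_result is TRUE, while nested_result holds an error rather than a logical value.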
NULL is just another value you can assign to a variable. It's no different from any other default value you'd assign in your function's declaration. missing, on the other hand, checks whether the user supplied that argument, which you can do before the default assignment; thanks to R's lazy evaluation, that assignment only happens when the variable is first used.
A couple of examples of what you can achieve with this are: arguments with no default value that you can still omit, e.g. file and text in read.table, or arguments with default values where you can only specify one, e.g. n and nmax in scan.
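A minimal sketch of both patterns follows. The functions read_input and take are hypothetical and not part of base R; they only imitate the way read.table and scan treat their arguments, and are not the actual implementations:

```r
# Hypothetical helper: 'file' and 'text' have no defaults, yet either may be omitted.
read_input <- function(file, text) {
  if (missing(file) && !missing(text)) {
    # Treat the supplied character vector as the lines of a file
    con <- textConnection(text)
    on.exit(close(con))
    return(readLines(con))
  }
  if (missing(file)) stop("supply either 'file' or 'text'")
  readLines(file)
}

# Hypothetical helper: 'n' and 'nmax' both have defaults, but only one may be given.
take <- function(x, n = -1L, nmax = -1L) {
  if (!missing(n) && !missing(nmax)) stop("specify only one of 'n' and 'nmax'")
  limit <- if (!missing(n)) n else nmax
  # A negative limit means "no limit", mirroring the -1 defaults
  if (limit < 0L) x else head(x, limit)
}

lines_read <- read_input(text = c("a", "b"))
first_two  <- take(1:5, n = 2L)
```

Neither check could be written with is.null alone: in take, the defaults are real values, so only missing can tell whether the caller actually supplied them.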
You'll find many other use cases by browsing through R code.
missing(x) seems to be a bit faster than giving x a default value of NULL and testing it with is.null(x).
> require('microbenchmark')
> f1 <- function(x=NULL) is.null(x)
> f2 <- function(x) missing(x)
> microbenchmark(f1(1), f2(1))
Unit: nanoseconds
expr min lq median uq max neval
f1(1) 615 631 647.5 800.5 3024 100
f2(1) 497 511 567.0 755.5 7916 100
> microbenchmark(f1(), f2())
Unit: nanoseconds
expr min lq median uq max neval
f1() 589 619 627 745.5 3561 100
f2() 437 448 463 479.0 2869 100
Note that in the f1 case x is still reported as missing if you make a call f1(), but it has a value that may be read within f1.
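A small sketch of that point, using a hypothetical variant f1_probe that returns both pieces of information instead of a single logical:

```r
# Hypothetical variant of f1 that reports both the missing status and the value
f1_probe <- function(x = NULL) {
  # missing() reflects whether the caller supplied x, regardless of the default
  list(was_missing = missing(x), value = x)
}

no_arg   <- f1_probe()
with_arg <- f1_probe(1)
```

With no argument, no_arg$was_missing is TRUE even though no_arg$value can be read and is NULL; with an argument, with_arg$was_missing is FALSE.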
The two tests are not equivalent. missing() is TRUE only when the user did not pass any value, whereas is.null() (with a NULL default arg) is TRUE when the user either did not pass anything or passed NULL explicitly, so the is.null() test covers the more general case.
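The difference shows up when NULL is passed explicitly. A quick sketch, reusing the f1 and f2 definitions from the benchmark:

```r
f1 <- function(x = NULL) is.null(x)
f2 <- function(x) missing(x)

# No argument at all: both tests fire
f1_none <- f1()
f2_none <- f2()

# Explicit NULL: only the is.null() test fires
f1_null <- f1(NULL)
f2_null <- f2(NULL)
```

Here f1_none, f2_none and f1_null are TRUE, while f2_null is FALSE because an argument was supplied.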
By the way, plot.default() and chisq.test() use NULL for their second arguments. On the other hand, getS3method('t.test', 'default') uses NULL for the y argument and missing() for mu (in order to be prepared for many usage scenarios).
I think that some R users will prefer f1-type functions, especially when working with the *apply family:
sapply(list(1, NULL, 2, NULL), f1)
Achieving that in the f2 case is not so straightforward.
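To illustrate, here is a sketch comparing the two. The wrapper passed to sapply in the last call is just one possible workaround that I am assuming for the example, not an established idiom:

```r
f1 <- function(x = NULL) is.null(x)
f2 <- function(x) missing(x)

xs <- list(1, NULL, 2, NULL)

# f1 detects the NULL elements directly
via_f1 <- sapply(xs, f1)

# f2 always receives an argument from sapply, so nothing is ever "missing"
via_f2 <- sapply(xs, f2)

# Assumed workaround: translate NULL into an omitted argument by hand
via_f2_wrapped <- sapply(xs, function(el) if (is.null(el)) f2() else f2(el))
```

via_f1 and via_f2_wrapped both flag the second and fourth elements, whereas via_f2 is FALSE throughout, so the f2 style needs an extra layer to express "no value" inside a list.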