Below are my code and result. The first attempt raises an error, and the second attempt returns the result without raising an error. I'm curious about this process.
tmp_func <- function(x = 1)
{
x <- x+1
if(!is.null(y=NULL))
{
}
return(x)
}
c <- tmp_func()
# Error in is.null(y = NULL) :
# supplied argument name 'y' does not match 'x'
c <- tmp_func()
c
#[1] 2
The second time you run this function it is compiled into byte code by R's JIT compiler. The effect you see is a quirk of how the compiler handles primitive functions such as is.null() with constant arguments that are special values, such as NULL, TRUE or FALSE.
is.null() does not take a y argument
The first thing to understand is that the x being referred to in your error message is not the x in your function but the x that is.null() expects as a parameter. The function signature is:
is.null
# function (x) .Primitive("is.null")
We can make this particular error appear whenever we provide a named argument to a primitive function which it does not expect:
is.null(y = NULL)
# Error in is.null(y = NULL) :
# supplied argument name 'y' does not match 'x'
abs(y = 1)
# Error in abs(y = 1) : supplied argument name 'y' does not match 'x'
So is.null(y=NULL) should raise an error. To assign NULL to y and at the same time evaluate whether y is NULL, you need is.null(y <- NULL).
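As a minimal sketch of the difference, the snippet below uses the assignment form. Inside a call, `<-` assigns in the calling environment and passes the value on as an unnamed argument, so nothing needs to match a parameter name. The variable name result is only illustrative.

```r
# `y <- NULL` is evaluated as an expression: it assigns NULL to y and
# passes that value to is.null() as a positional (unnamed) argument.
result <- is.null(y <- NULL)

print(result)
# [1] TRUE

print(y)
# NULL
```

By contrast, `y = NULL` inside a call is always parsed as a named argument, which is why is.null() tries to match y against its parameter x.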
Let's rename your function's parameter to a to reduce any ambiguity around x, and also remove the unnecessary addition operation:
tmp_func <- function(a = 1) {
if (!is.null(y = NULL)) {}
return(a)
}
tmp_func()
# Error in is.null(y = NULL) :
# supplied argument name 'y' does not match 'x'
tmp_func()
# [1] 1
We can see it is still complaining about x, rather than a.
The question becomes why we do not see this error the second time. This is because R has a just-in-time (JIT) compiler which compiles frequently used functions to byte code. You can check your JIT level by passing a negative value to enableJIT(), which returns the current level without changing it:
compiler::enableJIT(-1)
# [1] 3
The default is 3. Yours must be at least 2, meaning small functions are compiled before their second use.
The second time you run your function, it is compiled. We can see this if we print the function source after the first and second call:
tmp_func <- function(a = 1) {
if (!is.null(y = NULL)) {}
return(a)
}
tmp_func()
# Error in is.null(y = NULL) :
# supplied argument name 'y' does not match 'x'
tmp_func # print source
# function(a = 1) {
# if (!is.null(y = NULL)) {}
# return(a)
# }
tmp_func()
# [1] 1
tmp_func # print source again
# function(a = 1) {
# if (!is.null(y = NULL)) {}
# return(a)
# }
# <bytecode: 0x3f42e28>
Note the final line shows that this function is now compiled into byte code. The compiled code seems not to care about the argument name: it bypasses the R-level call, and the way it interfaces with .Primitive("is.null") in the C code does not check it. We can see in the docs that constant NULL arguments are a special case (p6):
Certain constant values, such as TRUE, FALSE, and NULL appear very often in code. It may be useful to provide and use special instructions for loading these.
You can test this for yourself. If you change it to is.null(y = "a"), for example, the compiled code acts the same way as the interpreted code. Similarly, if you supply an unused argument to a non-primitive function the compiler raises the same error as the interpreter.
However, in this special case of a constant NULL argument and a primitive function, the compiler ignores the argument name. We can see that in the instructions generated (note the LDNULL.OP to load the NULL constant):
compiler::disassemble(tmp_func)
# list(12L, BASEGUARD.OP, 1L, 6L, LDNULL.OP, ISNULL.OP,
#   NOT.OP, 4L, BRIFNOT.OP, 5L, 14L, LDNULL.OP, GOTO.OP, 15L,
#   LDNULL.OP, POP.OP, GETVAR.OP, 7L, RETURN.OP)
This is odd behaviour and I suppose technically it is a compiler bug. We can see that with this example:
f <- \(x) is.null(x = NULL) # valid
g <- \(x) is.null(y = NULL) # invalid
compiler::disassemble(f)
# list(12L, BASEGUARD.OP, 0L, 6L, LDNULL.OP, ISNULL.OP, RETURN.OP)
compiler::disassemble(g)
# list(12L, BASEGUARD.OP, 0L, 6L, LDNULL.OP, ISNULL.OP, RETURN.OP)
The second byte code should not be the same as the first. But it is, because the name of the argument is ignored.
However, is.null(y = NULL) is not really an expression that I would worry about. This issue appears to only affect primitive functions supplied with an incorrectly named constant argument which is TRUE, FALSE or NULL. So as far as compiler bugs go, this doesn't seem to me like a very important one. In any case, if you want to ensure that your code is never JIT compiled, or establish if something is caused by JIT compilation, you can disable JIT compilation:
compiler::enableJIT(0)
JIT is disabled if the argument is 0. If level is 1 then larger closures are compiled before their first use. If level is 2, then some small closures are also compiled before their second use. If level is 3 then in addition all top level loops are compiled before they are executed.
You can read more about JIT compilation in the docs.
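If you want to check whether a difference comes from byte compilation without relying on which call happens to trigger the JIT, one option is to turn the JIT off and compile explicitly with compiler::cmpfun(). The sketch below is only a test harness: raises_error() and the other names are hypothetical helpers introduced here, and the compiled results are deliberately not annotated because they may depend on your R version.

```r
library(compiler)

# Hypothetical helper: TRUE if calling f() raises an error, FALSE otherwise.
raises_error <- function(f) {
  tryCatch({ f(); FALSE }, error = function(e) TRUE)
}

# Turn the JIT off so plain closures stay interpreted.
# enableJIT() returns the previous level, which we restore at the end.
old_level <- enableJIT(0)

null_arg <- function() is.null(y = NULL)  # misnamed constant NULL argument
chr_arg  <- function() is.null(y = "a")   # misnamed non-special constant

results <- c(
  interpreted_null = raises_error(null_arg),
  compiled_null    = raises_error(cmpfun(null_arg)),
  interpreted_chr  = raises_error(chr_arg),
  compiled_chr     = raises_error(cmpfun(chr_arg))
)
print(results)

enableJIT(old_level)  # restore the previous JIT level
```

Any entry where the interpreted and compiled values differ points at the compiler rather than at your code.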