Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When is R's assign() function appropriate?

Tags:

r

I often see questions from novice R programmers where they've used assign to create multiple objects, and then run into trouble trying to manipulate those objects for a subsequent task (a recent example).

assign appeals to novice users because it has dynamic properties (programmatically creating variable names, in addition to the variable's values), and seems to mimic some properties of global assignment. Its straightforward name also makes it likely to show up in searches for a variety of problem types.

Of course, more experienced R programmers come to realize that assign creates code that is hard to read, fragile to maintain, and acts via the type of side effects that are otherwise staunchly avoided in the highly functional R language.

Every question I've seen on SO where the OP initially used assign ultimately has a better alternative in the correct use of named vectors, lists, or data frames. The resulting code is easier to follow, more robust to change, and often more performant.

All this is to say, it's easy to find examples of why assign is bad. My question is: in what situations would the use of assign be the appropriate, preferred, or only solution?

like image 413
jdobres Avatar asked Jan 06 '19 18:01

jdobres


1 Answers

If you were constructing a program that mediated a dialogue with a user wherein the user was asked to input an arbitrary object name (in the specific R sense of an unquoted string that that is listed in a particular namespace), you might consider using assign.

The option to assign to a particular environment may also have value. Notice how it is used in the ecdf function:

ecdf
#----screen output----
function (x) 
{
    x <- sort(x)
    n <- length(x)
    if (n < 1) 
        stop("'x' must have 1 or more non-missing values")
    vals <- unique(x)
    rval <- approxfun(vals, cumsum(tabulate(match(x, vals)))/n, 
        method = "constant", yleft = 0, yright = 1, f = 0, ties = "ordered")
    class(rval) <- c("ecdf", "stepfun", class(rval))
    assign("nobs", n, envir = environment(rval))
    attr(rval, "call") <- sys.call()
    rval
}
<bytecode: 0x7c77cc0>
<environment: namespace:stats>

The ecdf function takes data and returns another function. Most of that function is built with a C call by approxfun, but as a last feature, the ecdf function adds an element to the environment of the returned value (which is yet another function.)

I'm sure you could find other instances where assign is used in the R code of the base and stats packages. Those are arguably "R Core Certified^({TM)}" examples of "proper" uses.

When I followed my own advice I got this from a bash operation:

$ cd '/home/david/Downloads/R-3.5.2/src/library/base/R/' 
$ grep -R "assign" 
# --- results with a recent download of the R sources -----
userhooks.R:        assign(hookName, new, envir = .userHooksEnv, inherits = FALSE)
datetime.R:    cacheIt <- function(tz) assign(".sys.timezone", tz, baseenv())
autoload.R: assign(".Autoloaded", c(package, .Autoloaded), envir =.AutoloadEnv)
lazyload.R:    ## set <- function (x,  value,  env) .Internal(assign(x,  value,  env,  FALSE))
delay.R:    function(x, value, eval.env=parent.frame(1), assign.env=parent.frame(1))
delay.R:    .Internal(delayedAssign(x, substitute(value), eval.env, assign.env))
assign.R:#  File src/library/base/R/assign.R
assign.R:assign <-
assign.R:    .Internal(assign(x, value, envir, inherits))
# stripped out some occurences of "assighnment" 
# stripped out the occurrences of "assign" in the namespace functions
zzz.R:assign("%*%", function(x, y) NULL, envir = .ArgsEnv)
zzz.R:assign("...length", function() NULL, envir = .ArgsEnv)
zzz.R:assign("...elt", function(n) NULL, envir = .ArgsEnv)
zzz.R:assign(".C", function(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE,
zzz.R:assign(".Fortran",
zzz.R:assign(".Call", function(.NAME, ..., PACKAGE) NULL, envir = .ArgsEnv)
zzz.R:assign(".Call.graphics", function(.NAME, ..., PACKAGE) NULL, envir = .ArgsEnv)
zzz.R:assign(".External", function(.NAME, ..., PACKAGE) NULL, envir = .ArgsEnv)
zzz.R:assign(".External2", function(.NAME, ..., PACKAGE) NULL, envir = .ArgsEnv)
zzz.R:assign(".External.graphics", function(.NAME, ..., PACKAGE) NULL,
zzz.R:assign(".Internal", function(call) NULL, envir = .ArgsEnv)
zzz.R:assign(".Primitive", function(name) NULL, envir = .ArgsEnv)
zzz.R:assign(".isMethodsDispatchOn", function(onOff = NULL) NULL, envir = .ArgsEnv)
zzz.R:assign(".primTrace", function(obj) NULL, envir = .ArgsEnv)
zzz.R:assign(".primUntrace", function(obj) NULL, envir = .ArgsEnv)
zzz.R:assign(".subset", function(x, ...) NULL, envir = .ArgsEnv)
zzz.R:assign(".subset2", function(x, ...) NULL, envir = .ArgsEnv)
zzz.R:assign("UseMethod", function(generic, object) NULL, envir = .ArgsEnv)
zzz.R:assign("as.call", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("attr", function(x, which, exact = FALSE) NULL, envir = .ArgsEnv)
zzz.R:assign("attr<-", function(x, which, value) NULL, envir = .ArgsEnv)
zzz.R:assign("attributes", function(obj) NULL, envir = .ArgsEnv)
zzz.R:assign("attributes<-", function(obj, value) NULL, envir = .ArgsEnv)
zzz.R:assign("baseenv", function() NULL, envir = .ArgsEnv)
zzz.R:assign("browser",
zzz.R:assign("call", function(name, ...) NULL, envir = .ArgsEnv)
zzz.R:assign("class", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("class<-", function(x, value) NULL, envir = .ArgsEnv)
zzz.R:assign(".cache_class", function(class, extends) NULL, envir = .ArgsEnv)
zzz.R:assign("emptyenv", function() NULL, envir = .ArgsEnv)
zzz.R:assign("enc2native", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("enc2utf8", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("environment<-", function(fun, value) NULL, envir = .ArgsEnv)
zzz.R:assign("expression", function(...) NULL, envir = .ArgsEnv)
zzz.R:assign("forceAndCall", function(n, FUN, ...) NULL, envir = .ArgsEnv)
zzz.R:assign("gc.time", function(on = TRUE) NULL, envir = .ArgsEnv)
zzz.R:assign("globalenv", function() NULL, envir = .ArgsEnv)
zzz.R:assign("interactive", function() NULL, envir = .ArgsEnv)
zzz.R:assign("invisible", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.atomic", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.call", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.character", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.complex", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.double", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.environment", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.expression", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.function", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.integer", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.language", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.list", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.logical", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.name", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.null", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.object", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.pairlist", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.raw", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.recursive", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.single", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("is.symbol", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("isS4", function(object) NULL, envir = .ArgsEnv)
zzz.R:assign("list", function(...) NULL, envir = .ArgsEnv)
zzz.R:assign("lazyLoadDBfetch", function(key, file, compressed, hook) NULL,
zzz.R:assign("missing", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("nargs", function() NULL, envir = .ArgsEnv)
zzz.R:assign("nzchar", function(x, keepNA=FALSE) NULL, envir = .ArgsEnv)
zzz.R:assign("oldClass", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("oldClass<-", function(x, value) NULL, envir = .ArgsEnv)
zzz.R:assign("on.exit", function(expr = NULL, add = FALSE, after = TRUE) NULL, envir = .ArgsEnv)
zzz.R:assign("pos.to.env", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("proc.time", function() NULL, envir = .ArgsEnv)
zzz.R:assign("quote", function(expr) NULL, envir = .ArgsEnv)
zzz.R:assign("retracemem", function(x, previous = NULL) NULL, envir = .ArgsEnv)
zzz.R:assign("seq_along", function(along.with) NULL, envir = .ArgsEnv)
zzz.R:assign("seq_len", function(length.out) NULL, envir = .ArgsEnv)
zzz.R:assign("standardGeneric", function(f, fdef) NULL, envir = .ArgsEnv)
zzz.R:assign("storage.mode<-", function(x, value) NULL, envir = .ArgsEnv)
zzz.R:assign("substitute", function(expr, env) NULL, envir = .ArgsEnv)
zzz.R:assign("switch", function(EXPR, ...) NULL, envir = .ArgsEnv)
zzz.R:assign("tracemem", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("unclass", function(x) NULL, envir = .ArgsEnv)
zzz.R:assign("untracemem", function(x) NULL, envir = .ArgsEnv)
zzz.R:     assign(f, fx, envir = env)  # grep fails to include the names of these
zzz.R:        assign(f, fx, envir = env)
zzz.R:        assign(f, fx, envir = env)
zzz.R:        assign(f, fx, envir = env)
zzz.R:        assign(f, fx, envir = env)
zzz.R:    assign("anyNA", fx, envir = env)
zzz.R:assign("!", function(x) UseMethod("!"), envir = .GenericArgsEnv)
zzz.R:assign("as.character", function(x, ...) UseMethod("as.character"),
zzz.R:assign("as.complex", function(x, ...) UseMethod("as.complex"),
zzz.R:assign("as.double", function(x, ...) UseMethod("as.double"),
zzz.R:assign("as.integer", function(x, ...) UseMethod("as.integer"),
zzz.R:assign("as.logical", function(x, ...) UseMethod("as.logical"),
zzz.R:#assign("as.raw", function(x) UseMethod("as.raw"), envir = .GenericArgsEnv)
zzz.R:## assign("c", function(..., recursive = FALSE, use.names = TRUE) UseMethod("c"),
zzz.R:assign("c", function(...) UseMethod("c"),
zzz.R:#assign("dimnames", function(x) UseMethod("dimnames"), envir = .GenericArgsEnv)
zzz.R:assign("dim<-", function(x, value) UseMethod("dim<-"), envir = .GenericArgsEnv)
zzz.R:assign("dimnames<-", function(x, value) UseMethod("dimnames<-"),
zzz.R:assign("length<-", function(x, value) UseMethod("length<-"),
zzz.R:assign("levels<-", function(x, value) UseMethod("levels<-"),
zzz.R:assign("log", function(x, base=exp(1)) UseMethod("log"),
zzz.R:assign("names<-", function(x, value) UseMethod("names<-"),
zzz.R:assign("rep", function(x, ...) UseMethod("rep"), envir = .GenericArgsEnv)
zzz.R:assign("round", function(x, digits=0) UseMethod("round"),
zzz.R:assign("seq.int", function(from, to, by, length.out, along.with, ...)
zzz.R:assign("signif", function(x, digits=6) UseMethod("signif"),
zzz.R:assign("trunc", function(x, ...) UseMethod("trunc"), envir = .GenericArgsEnv)
zzz.R:#assign("xtfrm", function(x) UseMethod("xtfrm"), envir = .GenericArgsEnv)
zzz.R:assign("as.numeric", get("as.double", envir = .GenericArgsEnv),
like image 164
IRTFM Avatar answered Oct 20 '22 00:10

IRTFM