
R scoping: disallow global variables in function

Tags: scope, r, global

Is there any way to throw a warning (and fail) if a global variable is used within an R function? I think that would be much safer and would prevent unintended behaviour, e.g.

sUm <- 10
sum <- function(x,y){
sum = x+y
return(sUm)
}

Due to the "typo" in return, the function will always return 10. Instead of returning the value of sUm, it should fail.

asked Mar 10 '15 18:03 by Jonas



2 Answers

My other answer is more about what approach you can take inside your function. Now I'll provide some insight on what to do once your function is defined.

To ensure that your function is not using global variables when it shouldn't be, use the codetools package.

library(codetools)

sUm <- 10
f <- function(x, y) {
    sum = x + y
    return(sUm)
}

checkUsage(f)

This will print the message:

<anonymous> local variable ‘sum’ assigned but may not be used (:1)

To see if any global variables were used in your function, you can compare the output of the findGlobals() function with the variables in the global environment.

> findGlobals(f)
[1] "{"  "+"  "="  "return"  "sUm"

> intersect(findGlobals(f), ls(envir=.GlobalEnv))
[1] "sUm"

That tells you that the global variable sUm was used inside f() when it probably shouldn't have been.
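If you want the check to fail loudly rather than just print names, the comparison above can be wrapped in a small helper. This is only a sketch: assert_no_globals() is a hypothetical name, not part of codetools, and it assumes that checking the function's enclosing environment (the global environment for functions defined at top level) is what you want.

```r
library(codetools)

# Hypothetical helper (not part of codetools): stop if `fun` refers to a
# variable that exists in `envir`. By default `envir` is the environment
# where `fun` was defined, which is the global environment for functions
# defined at top level.
assert_no_globals <- function(fun, envir = environment(fun)) {
  # merge = FALSE returns function names and variable names separately
  vars <- findGlobals(fun, merge = FALSE)$variables
  used <- intersect(vars, ls(envir = envir))
  if (length(used) > 0) {
    stop("global variable(s) used: ", paste(used, collapse = ", "))
  }
  invisible(TRUE)
}

sUm <- 10
f <- function(x, y) {
  sum = x + y
  return(sUm)   # the typo from the question
}
g <- function(x, y) {
  total <- x + y
  return(total)
}

# f should be rejected, g should pass
res_f <- tryCatch(assert_no_globals(f), error = function(e) conditionMessage(e))
res_g <- assert_no_globals(g)
```

Note that this is a static check: it only inspects names in the function body, so it can be run right after defining a function (for example in a test file) without calling the function.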

answered Oct 15 '22 10:10 by Alex A.


There is no way to permanently change how variables are resolved because that would break a lot of functions. The behavior you don't like is actually very useful in many cases.

If a variable is not found inside a function, R looks for it in the environment where the function was defined. You can change this environment with the environment() function. For example:

environment(sum) <- baseenv()
sum(4,5)
# Error in sum(4, 5) : object 'sUm' not found

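This works because baseenv() is the environment of the base package, whose parent is the empty environment, so the lookup never reaches the global environment. Base functions such as + and return are still found, but note that you lose access to your own functions with this method: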

myfun <- function(x, y) {x + y}
sum <- function(x, y) {sum = myfun(x, y); return(sUm)}

environment(sum)<-baseenv()
sum(4,5)
# Error in sum(4, 5) : could not find function "myfun"

because in a functional language such as R, functions are just regular variables that are also scoped in the environment in which they are defined and would not be available in the base environment.
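The lookup chain behind both errors can be inspected directly. The snippet below is a small illustration using only base R functions; the variable names are made up for this example.

```r
sUm <- 10

# The base environment holds the base package's objects ...
plus_in_base <- exists("+", envir = baseenv(), inherits = FALSE)

# ... and its parent is the empty environment, so a lookup that starts
# in baseenv() never reaches the global environment.
base_parent_is_empty <- identical(parent.env(baseenv()), emptyenv())

# Therefore a global such as sUm is not visible from baseenv()
sUm_visible <- exists("sUm", envir = baseenv())
```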

You would manually have to change the environment for each function you write. Again, there is no way to change this default behavior because many of the base R functions and functions defined in packages rely on this behavior.
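To make the manual step less tedious, one option is a small wrapper that gives a function a fresh environment on top of baseenv() and copies in only the helpers you explicitly allow. This is a sketch under assumptions: isolate() is a hypothetical helper name, and the example functions are made up for illustration.

```r
# Hypothetical helper: return a copy of `fun` whose environment contains
# only the named objects passed in `...`, with baseenv() as its parent.
isolate <- function(fun, ...) {
  deps <- list(...)
  env <- new.env(parent = baseenv())
  for (name in names(deps)) {
    assign(name, deps[[name]], envir = env)
  }
  environment(fun) <- env
  fun
}

sUm <- 10
myfun <- function(x, y) x + y

add <- function(x, y) {
  total <- myfun(x, y)
  return(total)
}
typo <- function(x, y) {
  total <- myfun(x, y)
  return(sUm)   # typo: refers to the global sUm
}

# Allow access to myfun, but to nothing else outside base
add_iso  <- isolate(add, myfun = myfun)
typo_iso <- isolate(typo, myfun = myfun)

ok  <- add_iso(4, 5)
err <- tryCatch(typo_iso(4, 5), error = function(e) conditionMessage(e))
```

Here add_iso() still works because myfun was passed in explicitly, while typo_iso() fails when it reaches sUm. Two caveats: the copied myfun keeps its own environment, so it can still see globals itself, and functions from other attached packages (stats, utils, and so on) are not visible unless you pass them in or call them with the pkg::name form.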

answered Oct 15 '22 10:10 by MrFlick