
Safely evaluating arithmetic expressions in R?

Edit

Ok, since there seems to be a lot of confusion, I'm going to simplify the question a little. You can try to answer the original question below, or you can tackle this version instead and ignore everything below the line.

My goal is to take an arbitrary expression and evaluate it in an extremely restricted environment. This environment will contain only variables with the following types of values:

  • Numeric vectors
  • Pure functions that take one or more numeric vectors and return numeric vectors (i.e. arithmetic operators)

In addition, the expression must be able to use any literals, such as numeric and string constants (but not numeric or string vectors, since those would require c). I would like to evaluate the expression in this environment and ensure that there is no way for the expression to access anything outside the environment, so that I can be sure that evaluating the expression is not a security risk. So, in the code below, can you fill in the blank with a string that will do something naughty when evaluated? "Something naughty" is defined as printing something to the screen, accessing the value of the variable secret, executing any shell command (preferably one that produces output), or anything else that seems naughty to you (justify your choice).

a <- 1
b <- 2
x <- 5
y <- 1:10
z <- -1

## Give secret a random value so that you can't just compute it from
## the above variables
secret <- rnorm(5)

allowed.variables <- c(
    ## Numeric variables
    "a", "b", "x", "y", "z",
    ## Arithmetic operators
    "(", "+", "-", "/", "*", "^", "sqrt", "log", "log10", "log2", "exp", "log1p")

restricted.environment <- Map(get, allowed.variables)

## Example naughty expressions that my method successfully guards
## against
expr1 <- "secret"
expr2 <- "cat('Printing something with cat\n')"
expr3 <- "system('echo Printing something via shell command')"

arbitrary.expression <- "?????????" # Your naughty string constant here

eval(parse(text=arbitrary.expression), envir=restricted.environment, enclos=emptyenv())
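For anyone who wants to see what the guard does with the three example expressions, here is a minimal sketch. The helper name try.expression is hypothetical (it is not part of the question), and the numeric variables are written directly into the list so the block runs on its own. It only shows that these particular strings fail; it says nothing about strings nobody has thought of yet.

```r
# Restated setup; numeric values are listed directly so the block is self-contained.
secret <- rnorm(5)
restricted.environment <- c(
    list(a = 1, b = 2, x = 5, y = 1:10, z = -1),
    Map(get, c("(", "+", "-", "/", "*", "^", "sqrt", "log", "log10", "log2", "exp", "log1p")))

# Hypothetical helper: return the value, or the error message if evaluation fails.
try.expression <- function(text) {
    tryCatch(
        eval(parse(text = text), envir = restricted.environment, enclos = emptyenv()),
        error = function(e) paste("error:", conditionMessage(e)))
}

expressions <- c(
    "a + b * x",                                           # allowed arithmetic
    "secret",                                              # variable outside the list
    "cat('Printing something with cat\n')",                # function outside the list
    "system('echo Printing something via shell command')") # shell access
results <- lapply(expressions, try.expression)
names(results) <- expressions
print(results)
```

The first entry should be an ordinary number, while the other three come back as error strings because neither secret, cat, nor system can be found from the restricted list.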

Original question

I am writing some code to take an arithmetic expression as user input and evaluate it. I have a specified set of variables that can be used, and a whitelist of arithmetic functions (+, -, *, /, ^, etc.). Is there any way that I can evaluate an expression so that only these variables and operators are in scope, in order to avoid any possibility of arbitrary code injection? I have something that I think works, but I don't want to actually use it unless I have some certainty that it is really bulletproof:

## Shortcut for parse-then-eval pattern
evalparse <- function(expr, ...) eval(parse(text=expr), ...)

# I control these
arithmetic.operators <- Map(get, c("(", "+", "-", "/", "*", "^", "sqrt", "log", "log10", "log2", "exp", "log1p"))
vars <- list(a=1, b=2)
safe.envir <- c(vars, arithmetic.operators)

# Assume that these expressions are user input, e.g. from a web form.
nice.expr <- "a + b"
naughty.expr <- paste("cat('ARBITRARY R CODE INJECTION\n'); system('echo ARBITRARY SHELL COMMAND INJECTION');", nice.expr)

## NOT SAFE! Lookups outside env still possible.
evalparse(nice.expr, envir=safe.envir)
evalparse(naughty.expr, envir=safe.envir)

## Is this safe?
evalparse(nice.expr, envir=safe.envir, enclos=emptyenv())
evalparse(naughty.expr, envir=safe.envir, enclos=emptyenv())

If you run the above code in R, you'll see that the first time we eval naughty.expr, it successfully executes its payload. However, the second time, with enclos=emptyenv(), the evaluation only has access to the variables a, b, and the specified arithmetic operators, so the payload fails to execute.

So, is this method (i.e. eval(..., envir=safe.envir, enclos=emptyenv()) ) actually OK to use in production accepting actual user input, or am I missing some sneaky way to still execute arbitrary code in the restricted environment?
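One way to probe the method a little further is to try language-level constructs rather than named functions. In R, things like ::, function, if and [ are themselves function calls that must be looked up, so they should fail when they are not in the list. The sketch below assumes the same operator list as above; is.blocked and probes are hypothetical names introduced only for illustration. Passing these probes is not a proof that the approach is bulletproof.

```r
arithmetic.operators <- Map(get, c("(", "+", "-", "/", "*", "^", "sqrt", "log", "log10", "log2", "exp", "log1p"))
safe.envir <- c(list(a = 1, b = 2), arithmetic.operators)

# Hypothetical helper: TRUE if evaluating the string raises an error.
is.blocked <- function(text) {
    inherits(
        try(eval(parse(text = text), envir = safe.envir, enclos = emptyenv()),
            silent = TRUE),
        "try-error")
}

probes <- c(
    "base::system('echo hi')",  # `::` is a function that must be looked up
    "(function() 1)()",         # so is `function`
    "if (a > 0) b",             # and `if` (and `>` is not in the list either)
    "a[1]",                     # and `[`
    "get('a')")                 # ordinary function outside the list
blocked <- vapply(probes, is.blocked, logical(1))
print(blocked)
```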

Ryan C. Thompson asked Aug 22 '13 00:08


1 Answer

I'd take a slightly different approach to defining the safe functions and the environment in which you evaluate arbitrary code, but it's really just some style changes. This technique is provably safe, provided all of the functions in safe_f are safe, i.e. they don't allow you to perform arbitrary code execution. I'd be pretty confident the functions in this list are safe, but you'd need to inspect their individual source code to be sure.

safe_f <- c(
  getGroupMembers("Math"),
  getGroupMembers("Arith"),
  getGroupMembers("Compare"),
  "<-", "{", "("
)

safe_env <- new.env(parent = emptyenv())

for (f in safe_f) {
  safe_env[[f]] <- get(f, "package:base")
}

safe_eval <- function(x) {
  # substitute() captures the unevaluated expression passed as x
  eval(substitute(x), envir = safe_env)
}

# Can't access variables outside of that environment
a <- 1
safe_eval(a)    

# But you can create in that environment
safe_eval(a <- 2)
# And retrieve later
safe_eval(a)
# a in the global environment is not affected
a

# You can't access dangerous functions
safe_eval(cat("Hi!"))

# And because function isn't included in the safe list
# you can't even create functions
safe_eval({
  log <- function() {
    stop("Danger!")
  }
  log()
})

This is a much simpler problem than the rapporter sandbox because you're not trying to create a useful R environment, just a useful calculator environment, and the set of functions to check is much, much smaller.
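Since the question starts from strings (e.g. web form input) rather than quoted R code, the same environment could be wrapped along the following lines. This is only a sketch: safe_eval_text and its vars argument are hypothetical names, not part of the answer above. It evaluates each string in a fresh child of safe_env, on the assumption that you don't want assignments from one user expression to persist into the next.

```r
library(methods)  # provides getGroupMembers()

# Whitelist built the same way as in the answer above.
safe_f <- c(
  getGroupMembers("Math"),
  getGroupMembers("Arith"),
  getGroupMembers("Compare"),
  "<-", "{", "("
)
safe_env <- new.env(parent = emptyenv())
for (f in safe_f) {
  safe_env[[f]] <- get(f, "package:base")
}

# Hypothetical helper: evaluate a string with user variables given as a
# named list. Each call uses a new child environment of safe_env, so
# anything the expression assigns is discarded afterwards.
safe_eval_text <- function(text, vars = list()) {
  env <- new.env(parent = safe_env)
  for (nm in names(vars)) {
    assign(nm, vars[[nm]], envir = env)
  }
  eval(parse(text = text), envir = env)
}

safe_eval_text("a + b", list(a = 1, b = 2))
safe_eval_text("{ tmp <- sqrt(a); tmp * 2 }", list(a = 16))
```

The design choice here is that safe_env holds only the whitelisted functions and stays untouched, while variables live in the throwaway child environment.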

hadley answered Oct 14 '22 01:10