
How do you code an R function so that it 'knows' to look in 'data' for the variables in other arguments?

If you run:

mod <- lm(mpg ~ factor(cyl), data=mtcars)

It runs, because lm knows to look in mtcars to find both mpg and cyl.

Yet mean(mpg) fails because it can't find mpg, so you have to write mean(mtcars$mpg).

How do you code a function so that it knows to look in 'data' for the variables?

myfun <- function (a,b,data){
    return(a+b)
}

This will work with:

myfun(mtcars$mpg, mtcars$hp)

but will fail with:

myfun(mpg, hp, data=mtcars)

Cheers

nzcoops asked Dec 13 '11 05:12




2 Answers

Here's how I would code myfun():

myfun <- function(a, b, data) {
    eval(substitute(a + b), envir=data, enclos=parent.frame())
}

myfun(mpg, hp, mtcars)
#  [1] 131.0 131.0 115.8 131.4 193.7 123.1 259.3  86.4 117.8 142.2 140.8 196.4
# [13] 197.3 195.2 215.4 225.4 244.7  98.4  82.4  98.9 118.5 165.5 165.2 258.3
# [25] 194.2  93.3 117.0 143.4 279.8 194.7 350.0 130.4

If you're familiar with with(), it's interesting to see that it works in almost exactly the same way:

> with.default
# function (data, expr, ...) 
# eval(substitute(expr), data, enclos = parent.frame())
# <bytecode: 0x016c3914>
# <environment: namespace:base>

In both cases, the key idea is to first create an expression from the symbols passed in as arguments and then evaluate that expression using data as the 'environment' of the evaluation.

The first part (turning a + b into the expression mpg + hp) is possible thanks to substitute(). The second part works because eval() accepts a data.frame (or a list) as its evaluation environment; any name it does not find there is looked up in enclos, which here is the caller's frame.
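
To make the two steps visible separately, here is a minimal sketch. The helper capture_sum(), the wrapper add_offset() and the variable offset are made up for illustration; they are not part of the original answer.

```r
# Hypothetical helper: performs only the substitute() step and returns
# the unevaluated call built from whatever was typed for a and b.
capture_sum <- function(a, b) substitute(a + b)

expr <- capture_sum(mpg, hp)
print(expr)    # the call mpg + hp; nothing has been evaluated yet

# Step 2: evaluate the call, looking names up in the data frame first.
res <- eval(expr, envir = mtcars)
identical(res, mtcars$mpg + mtcars$hp)   # TRUE

# Names that are not columns of the data frame are looked up in enclos.
myfun <- function(a, b, data) {
    eval(substitute(a + b), envir = data, enclos = parent.frame())
}

add_offset <- function() {
    offset <- 100                 # exists only inside this function
    myfun(mpg, offset, mtcars)    # mpg comes from mtcars, offset from the caller
}
head(add_offset(), 3)
```

Without enclos = parent.frame(), a local variable such as offset would not necessarily be visible when myfun() is called from inside another function.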

Josh O'Brien answered Oct 07 '22 07:10



lm "knows" to look in its data argument because it actually constructs a call to model.frame using its own call as the base. If you look at the code for lm, you'll see the necessary machinery in the first dozen lines or so.
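
A stripped-down sketch of that mechanism might look like the following. The function name my_frame() is hypothetical, and lm() itself does more than this (it also carries over arguments such as subset, weights and na.action before building the call).

```r
# Hypothetical wrapper illustrating the idea of reusing one's own call.
my_frame <- function(formula, data) {
    mf <- match.call()                      # my_frame(formula = ..., data = ...)
    mf[[1L]] <- quote(stats::model.frame)   # same arguments, different function
    eval(mf, parent.frame())                # run the rebuilt call in the caller's frame
}

mf <- my_frame(mpg ~ factor(cyl), data = mtcars)
names(mf)   # "mpg" "factor(cyl)"
```

Because the rebuilt call still carries data = mtcars, it is model.frame() that resolves mpg and cyl inside the data frame.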

You could replicate this for your own ends, but if your needs are simpler, you don't have to go to the same lengths. For example:

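myfun <- function(..., data) {
    # Evaluate the first unevaluated argument in ..., looking names up in data, then in the caller's frame
    eval(match.call(expand.dots=FALSE)$...[[1]], data, parent.frame())
}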

Or just look at evalq(), which is eval() applied to the quoted form of its first argument.
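
As a usage sketch: the dots-based function takes a whole expression rather than separate variables. The name myfun_dots() is hypothetical; it follows the same idea as the one-liner above, with the caller's frame given explicitly as the enclosure. evalq() and with() are shown alongside for comparison.

```r
# Hypothetical name; evaluates the first expression passed through ...
# with the data frame searched first and the caller's frame second.
myfun_dots <- function(..., data) {
    expr <- match.call(expand.dots = FALSE)$...[[1]]
    eval(expr, data, parent.frame())
}

res1 <- myfun_dots(mpg + hp, data = mtcars)   # pass an expression, not two arguments
res2 <- evalq(mpg + hp, mtcars)               # quotes its first argument, then evaluates it in mtcars
res3 <- with(mtcars, mpg + hp)

identical(res1, res2) && identical(res2, res3)   # TRUE
```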

Hong Ooi answered Oct 07 '22 07:10
