I'm trying to find the names of all the functions used in an arbitrary legal R expression, but I can't find a function that will flag the below example as a function instead of a name.
test <- expression(
  this_is_a_function <- function(var1, var2){
    this_is_a_function(var1-1, var2)
  })
all.vars(test, functions = FALSE)
[1] "this_is_a_function" "var1" "var2"
all.vars(expr, functions = FALSE) seems to return function declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).
Is there any function - in the core libraries or elsewhere - that will flag 'this_is_a_function' as a function, not a name? It needs to work on arbitrary expressions that are syntactically legal but might not evaluate correctly (e.g. '+'(1, 'duck')).
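To make the behaviour described above concrete, here is a small sketch comparing all.vars() with all.names(). The names e1, e2 and called1/called2 are made up for illustration, and it assumes that setdiff(all.names(...), all.vars(...)) is an acceptable first approximation of "names used as functions". That approximation breaks down exactly when a name is both assigned and called:

```r
# A plain expression: no name is used both as a variable and as a function.
e1 <- quote(f(x) + g(y, 1))
vars1   <- all.vars(e1)                   # variables only: "x" "y"
called1 <- setdiff(all.names(e1), vars1)  # "+" "f" "g"

# A recursive definition: `f` is assigned to and also called.
e2 <- quote(f <- function(a) f(a - 1))
called2 <- setdiff(all.names(e2), all.vars(e2))
# `f` is listed among the variables (left-hand side of `<-`), so the set
# difference drops it even though it is called in the function body.
f_flagged <- "f" %in% called2             # FALSE
```

Nothing is evaluated here, so the same comparison also works on expressions that would fail at run time.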
I've found similar questions, but they don't seem to contain the solution.
If clarification is needed, leave a comment below. I'm using the parser package to parse the expressions.
I have expressions which contain entire scripts, which usually consist of a main function containing nested function definitions, with a call to the main function at the end of the script.
Functions are all defined inside the expressions, and I don't mind if I have to include '<-' and '{', since I can easily filter them out myself.
The motivation is to take all my R scripts and gather basic statistics about how my use of functions has changed over time.
A regex-based approach grabs the function definitions, combined with the method in James' comment to grab function calls. It usually works, since I never use right-hand assignment.
function_usage <- function(code_string){
  # Takes a script as a single string and tabulates the names that are
  # assigned a function with `<-` or `=`.
  library(stringr)
  code_string <- str_replace(code_string, 'expression\\(', '')
  # Capture the identifier directly to the left of the assignment operator.
  arrow_assign <- '([[:alnum:]._]+)[ \n]+<-[ \n]+function'
  equal_assign <- '([[:alnum:]._]+)[ \n]+=[ \n]+function'
  # str_match_all() returns every match; column 2 holds the captured name.
  function_names <- c(
    str_match_all(code_string, arrow_assign)[[1]][, 2],
    str_match_all(code_string, equal_assign)[[1]][, 2])
  return(table(function_names))
}
Short answer: is.function checks whether a variable actually holds a function. This does not work on (unevaluated) calls because they are calls. You also need to take care of masking:
mean <- mean (x)
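A small sketch of what that masking does, assuming a fresh session in which nothing else in the workspace is called mean (the variable names are illustrative):

```r
x <- c(1, 2, 3)
mean <- mean(x)   # the workspace now holds a numeric called `mean`

holds_function  <- is.function(mean)                  # FALSE: sees the numeric
function_exists <- exists("mean", mode = "function")  # TRUE: base::mean is still found
via_call        <- mean(c(10, 20))                    # 15: calls skip non-function bindings

rm(mean)          # remove the masking variable again
```

So is.function() on the bare name can say "no" while a call to the same name still succeeds; exists(..., mode = "function") is closer to what a call will actually do.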
Longer answer:
IMHO there is a big difference between the two occurrences of this_is_a_function.
In the first case you'll assign a function to the variable with name this_is_a_function once you evaluate the expression. The difference is the same difference as between 2+2 and 4.
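However, just finding a function () expression on the right-hand side of <- does not guarantee that the result is a function, because the function may be called on the spot:
f <- (function (x) {x + 1}) (2)
Here f is 3. The parentheses matter: without them, the trailing (2) is parsed as part of the function body and f is still a function.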
The second occurrence is syntactically a function call. You can determine from the expression that a variable called this_is_a_function which holds a function needs to exist in order for the call to evaluate properly. But you don't know whether it exists from that statement alone. You can, however, check whether such a variable exists, and whether it is a function.
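A minimal sketch of such a check on an unevaluated call. The helper call_head_is_function() is hypothetical (it is not part of base R) and only handles simple heads such as f(...):

```r
# Hypothetical helper: does the head of an unevaluated call name an
# existing function, as seen from `env`?
call_head_is_function <- function(cl, env = parent.frame()) {
  stopifnot(is.call(cl))
  head <- cl[[1]]
  # Heads such as pkg::f or f() are themselves calls; report NA for those.
  if (!is.name(head)) return(NA)
  exists(as.character(head), envir = env, mode = "function")
}

cl <- quote(this_is_a_function(var1 - 1, var2))
before <- call_head_is_function(cl)   # FALSE while nothing by that name exists

this_is_a_function <- function(var1, var2) NULL
after <- call_head_is_function(cl)    # TRUE once the function is defined
```

The call itself is never evaluated, so this also works for calls that would fail at run time, such as quote(sum(1, "duck")).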
The fact that functions are stored in variables like other types of data, too, means that in the first case you can know that the result of function () will be a function and from that conclude that immediately after this expression is evaluated, the variable with name this_is_a_function will hold a function.
However, R is full of names and functions: "<-" is the name of the assignment function (a variable holding the assignment function) ...
After evaluating the expression, you can verify this by is.function (this_is_a_function).
However, this is by no means the only expression that returns a function: Think of
f <- function () {g <- function (){}}
> body (f)[[2]][[3]]
function() {
}
> class (body (f)[[2]][[3]])
[1] "call"
> class (eval (body (f)[[2]][[3]]))
[1] "function"
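Other right-hand sides also yield functions without the function keyword appearing next to the assignment. A few illustrative cases (all names here are made up for the example):

```r
alias  <- sum                   # copy an existing function under a new name
looked <- match.fun("mean")     # look a function up by its name

make_adder <- function(n) function(x) x + n
add2 <- make_adder(2)           # a closure returned from an ordinary call

results <- vapply(list(alias, looked, add2), is.function, logical(1))
```

Only evaluation reveals that alias, looked and add2 hold functions; a textual search for "<- function" would miss all three.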
all.vars(expr, functions = FALSE) seems to return function declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).
I'd say it is the other way round: in that expression, f is the variable (name) which will be assigned the function (once the call is evaluated). `+`(1, 2) evaluates to a numeric, unless you keep it from doing so:
> e <- expression (1 + 2)
> e [[1]]
1 + 2
> e [[1]][[1]]
`+`
> class (e [[1]][[1]])
[1] "name"
> eval (e [[1]][[1]])
function (e1, e2) .Primitive("+")
> class (eval (e [[1]][[1]]))
[1] "function"
Instead of looking for function definitions, which is going to be effectively impossible to do correctly without actually evaluating the functions, it will be easier to look for function calls.
The following function recursively spiders the expression/call tree returning the names of all objects that are called like a function:
find_calls <- function(x) {
  # Base case
  if (!is.recursive(x)) return()
  recurse <- function(x) {
    sort(unique(as.character(unlist(lapply(x, find_calls)))))
  }
  if (is.call(x)) {
    f_name <- as.character(x[[1]])
    c(f_name, recurse(x[-1]))
  } else {
    recurse(x)
  }
}
It works as expected for a simple test case:
x <- expression({
  f(3, g())
  h <- function(x, y) {
    i()
    j()
    k(l())
  }
})
find_calls(x)
# [1] "{" "<-" "f" "function" "g" "i" "j"
# [8] "k" "l"
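Since the stated goal is usage statistics, a variant that keeps repeats and works on script text may be useful. The following is only a sketch under a few assumptions: collect_calls() is a hypothetical helper (not the function above), the script is available as a character string, and only simple call heads such as f(...) are counted by name.

```r
# Hypothetical helper: collect the head of every call, keeping repeats.
collect_calls <- function(x) {
  if (is.call(x)) {
    head <- x[[1]]
    # Heads like pkg::f are calls themselves; the recursion below picks
    # up the `::` inside them instead of naming the head.
    name <- if (is.name(head)) as.character(head) else character(0)
    c(name, unlist(lapply(as.list(x), collect_calls)))
  } else if (is.expression(x) || is.pairlist(x)) {
    # Expression vectors and formal argument lists can contain calls too.
    unlist(lapply(as.list(x), collect_calls))
  } else {
    character(0)
  }
}

script <- "main <- function(n) {
  helper <- function(v) sqrt(v) + sqrt(v + 1)
  helper(n)
}
main(4)"

# parse() only checks syntax; nothing in the script is evaluated.
calls  <- collect_calls(parse(text = script, keep.source = FALSE))
counts <- table(calls)
# Drop the syntactic "functions" if they are not of interest.
user_counts <- counts[!names(counts) %in% c("<-", "{", "function")]
```

Here counts records how often each name is called, and user_counts is the same table with the assignment, brace and function entries filtered out.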