Finding the names of all functions in an R expression

Tags: function, r, expression, metaprogramming

I'm trying to find the names of all the functions used in an arbitrary legal R expression, but I can't find a function that will flag the below example as a function instead of a name.

test <- expression(
    this_is_a_function <- function(var1, var2){

    this_is_a_function(var1-1, var2)
})

all.vars(test, functions = FALSE)

[1] "this_is_a_function" "var1"              "var2"

all.vars(expr, functions = FALSE) seems to return function declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).
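A minimal sketch of why the name shows up, using only base R. all.names() returns every name in the expression, including names in call position. A common trick is to subtract all.vars() from it to get "functions only", but that drops any name that is also used as a variable, which is exactly the case here. The variable names vars, every and only_called are illustrative.

```r
test <- expression(
  this_is_a_function <- function(var1, var2) {
    this_is_a_function(var1 - 1, var2)
  }
)

vars  <- all.vars(test)    # names that occur outside call position
every <- all.names(test)   # every name, including called functions

# Names that occur only in call position
only_called <- setdiff(every, vars)

# The name is an assignment target, so all.vars() reports it ...
in_vars <- "this_is_a_function" %in% vars
# ... and the set difference therefore removes it again.
in_only_called <- "this_is_a_function" %in% only_called
```

So neither function on its own can say that a name is both defined and called as a function; that needs a walk over the call tree.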

Is there any function - in the core libraries or elsewhere - that will flag 'this_is_a_function' as a function, not a name? It needs to work on arbitrary expressions that are syntactically legal but might not evaluate correctly (e.g. '+'(1, 'duck')).

I've found similar questions, but they don't seem to contain the solution.

If clarification is needed, leave a comment below. I'm using the parser package to parse the expressions.

Edit: @Hadley

I have expressions which contain entire scripts. These usually consist of a main function containing nested function definitions, with a call to the main function at the end of the script.

Functions are all defined inside the expressions, and I don't mind if I have to include '<-' and '{', since I can easily filter them out myself.

The motivation is to take all my R scripts and gather basic statistics about how my use of functions has changed over time.

Edit: Current Solution

A regex-based approach grabs the function definitions, combined with the method in James' comment to grab function calls. It usually works, since I never use right-hand assignment.

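function_usage <- function(code_string){
    # takes a script, extracts function definitions and tabulates their names;
    # only the first match per element of code_string is kept, so this works
    # best when code_string holds one line of the script per element

    require(stringr)

    code_string <- str_replace(code_string, 'expression\\(', '')

    # `name <- function` and `name = function`
    arrow_assign <- '.+[ \n]+<-[ \n]+function'
    equal_assign <- '.+[ \n]+=[ \n]+function'

    # the name is whatever precedes the assignment operator
    function_names <- sapply(
        strsplit(
            str_extract(code_string, arrow_assign), split = '[ \n]+<-'),
        function(x) x[1])

    function_names <- c(function_names, sapply(
        strsplit(
            str_extract(code_string, equal_assign), split = '[ \n]+='),
        function(x) x[1]))

    # strip indentation; table() drops the NAs left by non-matching elements
    function_names <- str_trim(function_names)

    return(table(function_names))
}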
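As an alternative to regular expressions, the parser's own token table can be used. The sketch below assumes the script is available as text; it relies on utils::getParseData(), which labels every name in call position with the token SYMBOL_FUNCTION_CALL. The sample script and the variable names (code, parsed, tokens, calls) are made up for illustration.

```r
# A tiny stand-in for a real script (hypothetical content)
code <- paste(
  "main <- function(a) {",
  "  helper <- function(b) b + 1",
  "  helper(a)",
  "}",
  "main(1)",
  sep = "\n"
)

# keep.source = TRUE is required, otherwise no parse data is attached
parsed <- parse(text = code, keep.source = TRUE)
tokens <- utils::getParseData(parsed)

# Names that appear in call position; operators such as `<-` and `+`
# have their own token types and are not included
calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]
call_counts <- table(calls)
```

Because this works on the parse result only, nothing is evaluated, so scripts that would fail at run time can still be counted.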

asked Jan 11 '13 10:01

Róisín Grannell

2 Answers

Short answer: is.function checks whether a variable actually holds a function. This does not work on (unevaluated) calls because they are calls. You also need to take care of masking:

mean <- mean (x)

Longer answer:

IMHO there is a big difference between the two occurrences of this_is_a_function.

In the first case you'll assign a function to the variable with name this_is_a_function once you evaluate the expression. The difference is the same difference as between 2+2 and 4.
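However, just finding <- function () does not guarantee that the result is a function. If the function literal is wrapped in parentheses and called immediately, f receives the return value, not the function:

f <- (function (x) {x + 1}) (2)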

The second occurrence is syntactically a function call. You can determine from the expression that a variable called this_is_a_function which holds a function needs to exist in order for the call to evaluate properly. But you cannot tell from that statement alone whether it exists. You can, however, check whether such a variable exists, and whether it is a function.
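A minimal sketch of such a check with base R's exists(). The helper name resolves_to_function and the name no_such_function_xyz are hypothetical. With mode = "function" the lookup skips objects of the same name that are not functions, which also covers the masking case mentioned in the short answer.

```r
# Hypothetical helper: TRUE if `name` resolves to a function when looked
# up from `envir` (by default, the caller's environment).
resolves_to_function <- function(name, envir = parent.frame()) {
  force(envir)
  exists(name, envir = envir, mode = "function")
}

known   <- resolves_to_function("sum")                   # a base function
unknown <- resolves_to_function("no_such_function_xyz")  # assumed undefined

# Masking: a numeric `mean` hides base::mean from is.function(), but a
# lookup restricted to mode = "function" continues past it.
masked <- local({
  mean <- 42
  c(is_fun = is.function(mean), found = resolves_to_function("mean"))
})
```

This only answers the question for the environment it is run in; it says nothing about what the script would define once evaluated.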

The fact that functions are stored in variables, like any other type of data, means that in the first case you know that the result of function () will be a function. From that you can conclude that immediately after this expression is evaluated, the variable with name this_is_a_function will hold a function.

However, R is full of names and functions: "<-" is the name of the assignment function (a variable holding the assignment function) ...

After evaluating the expression, you can verify this by is.function (this_is_a_function). However, this is by no means the only expression that returns a function: Think of

> f <- function () {g <- function (){}}
> body (f)[[2]][[3]]
function() {
}
> class (body (f)[[2]][[3]])
[1] "call"
> class (eval (body (f)[[2]][[3]]))
[1] "function"

all.vars(expr, functions = FALSE) seems to return function declarations (f <- function(){}) in the expression, while filtering out function calls ('+'(1,2), ...).

I'd say it is the other way round: in that expression f is the variable (name) which will be assigned the function (once the call is evaluated). + (1, 2) evaluates to a numeric, unless you keep it from doing so:

> e <- expression (1 + 2)
> e [[1]]
1 + 2
> e [[1]][[1]]
`+`
> class (e [[1]][[1]])
[1] "name"
> eval (e [[1]][[1]])
function (e1, e2)  .Primitive("+")
> class (eval (e [[1]][[1]]))
[1] "function"

answered Sep 21 '22 05:09

cbeleites unhappy with SX

Instead of looking for function definitions, which is going to be effectively impossible to do correctly without actually evaluating the functions, it will be easier to look for function calls.

The following function recursively walks the expression/call tree, returning the names of all objects that are called like a function:

find_calls <- function(x) {
  # Base case
  if (!is.recursive(x)) return()

  recurse <- function(x) {
    sort(unique(as.character(unlist(lapply(x, find_calls)))))
  }

  if (is.call(x)) {
    f_name <- as.character(x[[1]])
    c(f_name, recurse(x[-1]))
  } else {
    recurse(x)
  }
}

It works as expected for a simple test case:

x <- expression({
  f(3, g())
  h <- function(x, y) {
    i()
    j()
    k(l())
  }
})
find_calls(x)
# [1] "{"        "<-"       "f"        "function" "g"        "i"        "j"  
# [8] "k"        "l"
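The same kind of tree walk can be turned towards definitions, with the caveat already stated: it only recognises the plain syntactic pattern "name assigned a function literal" and cannot see functions created any other way. The helper find_defs below is a hypothetical sketch, not part of the answer's code, and the sample script is made up.

```r
# Hypothetical helper: names directly assigned a `function(...)` literal
# via `<-`, `<<-` or `=`, found without evaluating anything.
find_defs <- function(x) {
  found <- character()

  if (is.call(x)) {
    op <- x[[1]]
    is_assign <- is.name(op) && as.character(op) %in% c("<-", "<<-", "=")
    if (is_assign && length(x) == 3 && is.name(x[[2]]) &&
        is.call(x[[3]]) && identical(x[[3]][[1]], as.name("function"))) {
      found <- as.character(x[[2]])
    }
  }

  if (is.call(x) || is.expression(x)) {
    for (i in seq_along(x)) {
      # Index directly rather than binding x[[i]] to a variable, so that
      # empty arguments (as in m[, 1]) are skipped without an error.
      if (is.call(x[[i]])) found <- c(found, find_defs(x[[i]]))
    }
  }

  unique(found)
}

script <- expression({
  main <- function(a) {
    helper = function(b) b + 1
    helper(a)
  }
  main(1)
})

defs <- find_defs(script)
```

Intersecting these names with the output of find_calls() gives the functions that are both defined and called inside the script. Default argument values are not searched, since the formals are stored as a pairlist rather than a call.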

answered Sep 18 '22 05:09

hadley
