Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you calculate cyclomatic complexity for R functions?

Cyclomatic complexity measures how many possible branches can be taken through a function. Is there an existing function/tool to calculate it for R functions? If not, suggestions are appreciated for the best way to write one.

A cheap start towards this would be to count up all the occurences of if, ifelse or switch within your function. To get a real answer though, you need to understand when branches start and end, which is much harder. Maybe some R parsing tools would get us started?

like image 911
Richie Cotton Avatar asked Aug 12 '11 13:08

Richie Cotton


People also ask

How many ways can Cyclomatic complexity be calculated?

Cyclomatic complexity can be used in two ways, to: Limit code complexity. Determine the number of test cases required.

What is Cyclomatic complexity example?

The Definition Cyclomatic complexity is a metric that indicates the possible number of paths inside a code artifact, e.g., a function, class, or whole program. Thomas J. McCabe Sr. developed this metric, first describing it in a 1976 paper.

How is Cyclomatic complexity calculated in white box testing?

Cyclomatic Complexity: It is a measure of the logical complexity of the software and is used to define the number of independent paths. For a graph G, V(G) is its cyclomatic complexity. Calculating V(G): V(G) = P + 1, where P is the number of predicate nodes in the flow graph.


2 Answers

You can use codetools::walkCode to walk the code tree. Unfortunately codetools' documentation is pretty sparse. Here's an explanation and sample to get you started.

walkCode takes an expression and a code walker. A code walker is a list that you create, that must contain three callback functions: handler, call, and leaf. (You can use the helper function makeCodeWalker to provide sensible default implementations of each.) walkCode walks over the code tree and makes calls into the code walker as it goes.

call(e, w) is called when a compound expression is encountered. e is the expression and w is the code walker itself. The default implementation simply recurses into the expression's child nodes (for (ee in as.list(e)) if (!missing(ee)) walkCode(ee, w)).

leaf(e, w) is called when a leaf node in the tree is encountered. Again, e is the leaf node expression and w is the code walker. The default implementation is simply print(e).

handler(v, w) is called for each compound expression and can be used to easily provide an alternative behavior to call for certain types of expressions. v is the character string representation of the parent of the compound expression (a little hard to explain--but basically <- if it's an assignment expression, { if it's the start of a block, if if it's an if-statement, etc.). If the handler returns NULL then call is invoked as usual; if you return a function instead, that's what's called instead of the function.

Here's an extremely simplistic example that counts occurrences of if and ifelse of a function. Hopefully this can at least get you started!

library(codetools)

countBranches <- function(func) {
  count <- 0
  walkCode(body(func), 
           makeCodeWalker(
             handler=function(v, w) {
               if (v == 'if' || v == 'ifelse')
                 count <<- count + 1
               NULL  # allow normal recursion
             },
             leaf=function(e, w) NULL))
  count
}
like image 59
Joe Cheng Avatar answered Sep 21 '22 01:09

Joe Cheng


Also, I just found a new package called cyclocomp (released 2016). Check it out!

like image 37
lrthistlethwaite Avatar answered Sep 20 '22 01:09

lrthistlethwaite