Cyclomatic complexity measures how many possible branches can be taken through a function. Is there an existing function/tool to calculate it for R functions? If not, suggestions are appreciated for the best way to write one.
A cheap start towards this would be to count up all the occurrences of `if`, `ifelse` or `switch` within your function. To get a real answer though, you need to understand when branches start and end, which is much harder. Maybe some R parsing tools would get us started?
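For what it's worth, the cheap count can be sketched with base R's `all.names()`, which lists every name that appears in an expression. The helper name `count_branch_calls` and the sample function `f` are made up for illustration, and the count is crude: a variable that happens to be named `switch` would be counted too.

```r
# Hypothetical helper: crude count of branching calls in a function body.
count_branch_calls <- function(func,
                               branch_names = c("if", "ifelse", "switch")) {
  # all.names() returns every name in the expression, function names included;
  # unique = FALSE keeps repeats so each occurrence is counted.
  nms <- all.names(body(func), functions = TRUE, unique = FALSE)
  sum(nms %in% branch_names)
}

# Made-up sample function with two `if` calls (the second is the `else if`).
f <- function(x) {
  if (x > 0) "positive" else if (x < 0) "negative" else "zero"
}

n_branches <- count_branch_calls(f)
print(n_branches)
```

This says nothing about where branches start and end, so treat it as a rough lower bound rather than a real complexity figure.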
Cyclomatic complexity can be used in two ways: to limit code complexity, and to determine the number of test cases required.
Definition: cyclomatic complexity is a metric that indicates the number of linearly independent paths through a code artifact, e.g., a function, class, or whole program. Thomas J. McCabe Sr. developed this metric, first describing it in a 1976 paper.
Cyclomatic complexity is a measure of the logical complexity of a piece of software, defined as the number of independent paths through it. For a flow graph G, the cyclomatic complexity is written V(G) and can be calculated as V(G) = P + 1, where P is the number of predicate (decision) nodes in the flow graph.
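As a small worked example of V(G) = P + 1, here is a made-up R function (`sign_label` is not from any package) with two predicate nodes counted by hand. The sketch assumes each predicate is a simple two-way decision, and it also illustrates the link to test cases: one input per independent path.

```r
# Hypothetical example function with two predicate nodes.
sign_label <- function(x) {
  if (x > 0) {          # predicate 1
    "positive"
  } else if (x < 0) {   # predicate 2
    "negative"
  } else {
    "zero"
  }
}

P <- 2       # predicate nodes, counted by hand
V <- P + 1   # V(G) = P + 1

# One input for each of the V independent paths.
path_results <- vapply(c(5, -5, 0), sign_label, character(1))
print(V)
print(path_results)
```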
You can use `codetools::walkCode` to walk the code tree. Unfortunately codetools' documentation is pretty sparse. Here's an explanation and sample to get you started.
`walkCode` takes an expression and a code walker. A code walker is a list that you create, which must contain three callback functions: `handler`, `call`, and `leaf`. (You can use the helper function `makeCodeWalker` to provide sensible default implementations of each.) `walkCode` walks over the code tree and makes calls into the code walker as it goes.
`call(e, w)` is called when a compound expression is encountered. `e` is the expression and `w` is the code walker itself. The default implementation simply recurses into the expression's child nodes (`for (ee in as.list(e)) if (!missing(ee)) walkCode(ee, w)`).
`leaf(e, w)` is called when a leaf node in the tree is encountered. Again, `e` is the leaf node expression and `w` is the code walker. The default implementation is simply `print(e)`.
`handler(v, w)` is called for each compound expression whose first element is a symbol or string, and can be used to easily provide an alternative behavior to `call` for certain types of expressions. `v` is the name of the function or operator at the head of the compound expression, as a character string: `<-` if it's an assignment expression, `{` if it's a block, `if` if it's an if-statement, etc. If the handler returns `NULL` then `call` is invoked as usual; if it returns a function instead, that function is called with `(e, w)` in place of `call`.
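To see what the handler actually receives, a tiny tracing walker can help. This is only a sketch: `traceHandlers` is a made-up name, and it does nothing but record each `v` value and then return `NULL` so that the default `call` keeps recursing.

```r
library(codetools)

# Hypothetical helper: record the `v` argument of every handler call.
traceHandlers <- function(expr) {
  seen <- character(0)
  walkCode(expr, makeCodeWalker(
    handler = function(v, w) {
      seen <<- c(seen, v)  # remember the head of this compound expression
      NULL                 # fall through to the default `call`
    },
    leaf = function(e, w) NULL  # silence the default print(e)
  ))
  seen
}

trace <- traceHandlers(quote(if (x > 0) y <- 1))
print(trace)
```

For this expression the trace should contain `if`, `>` and `<-`, in that order: the outer if-statement first, then the two compound expressions nested inside it.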
Here's an extremely simplistic example that counts occurrences of `if` and `ifelse` in a function. Hopefully this can at least get you started!

    library(codetools)

    countBranches <- function(func) {
      count <- 0
      walkCode(body(func),
               makeCodeWalker(
                 # handler sees the name of each call: "if", "<-", "{", ...
                 handler = function(v, w) {
                   if (v == "if" || v == "ifelse")
                     count <<- count + 1
                   NULL  # allow normal recursion
                 },
                 # override the default leaf, which would print every leaf
                 leaf = function(e, w) NULL))
      count
    }

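If you want something closer to V(G) = P + 1, one possible extension is to count a few more decision points and add one. This is a rough sketch, not a proper implementation: `approxComplexity` and the `decisions` list are my own assumptions, and treating a whole `switch` as a single decision under-counts when it has many arms.

```r
library(codetools)

# Hypothetical extension of the sample: count more decision points, then add 1.
approxComplexity <- function(func,
                             decisions = c("if", "ifelse", "switch",
                                           "for", "while", "&&", "||")) {
  count <- 0
  walkCode(body(func), makeCodeWalker(
    handler = function(v, w) {
      if (v %in% decisions) count <<- count + 1
      NULL  # keep recursing into child expressions
    },
    leaf = function(e, w) NULL
  ))
  count + 1  # V(G) = P + 1
}

# Made-up sample: one `for`, one `if`, one `&&`.
g <- function(x) {
  total <- 0
  for (i in x) {
    if (i > 0 && i %% 2 == 0) total <- total + i
  }
  total
}

complexity <- approxComplexity(g)
print(complexity)
```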
Also, I just found a new package called cyclocomp (released 2016). Check it out!
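A minimal usage sketch, assuming the package is installed (e.g. with `install.packages("cyclocomp")`). The sample function `h` is made up, and the call is guarded so the snippet does nothing if the package is missing.

```r
# Made-up sample function with a single branch.
h <- function(x) {
  if (x > 0) "positive" else "non-positive"
}

# Only run if cyclocomp is available on this machine.
if (requireNamespace("cyclocomp", quietly = TRUE)) {
  cc <- cyclocomp::cyclocomp(h)
  print(cc)
}
```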