Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explicitly calling return in a function or not

Tags:

r

A while back I got rebuked by Simon Urbanek from the R core team (I believe) for recommending a user to explicitly calling return at the end of a function (his comment was deleted though):

foo = function() {   return(value) } 

instead he recommended:

foo = function() {   value } 

Probably in a situation like this it is required:

foo = function() {  if(a) {    return(a)  } else {    return(b)  } } 

His comment shed some light on why not calling return unless strictly needed is a good thing, but this was deleted.

My question is: Why is not calling return faster or better, and thus preferable?

like image 577
Paul Hiemstra Avatar asked Jul 31 '12 11:07

Paul Hiemstra


1 Answers

Question was: Why is not (explicitly) calling return faster or better, and thus preferable?

There is no statement in R documentation making such an assumption.
The main page ?'function' says:

function( arglist ) expr return(value) 

Is it faster without calling return?

Both function() and return() are primitive functions and the function() itself returns last evaluated value even without including return() function.

Calling return() as .Primitive('return') with that last value as an argument will do the same job but needs one call more. So that this (often) unnecessary .Primitive('return') call can draw additional resources. Simple measurement however shows that the resulting difference is very small and thus can not be the reason for not using explicit return. The following plot is created from data selected this way:

bench_nor2 <- function(x,repeats) { system.time(rep( # without explicit return (function(x) vector(length=x,mode="numeric"))(x) ,repeats)) }  bench_ret2 <- function(x,repeats) { system.time(rep( # with explicit return (function(x) return(vector(length=x,mode="numeric")))(x) ,repeats)) }  maxlen <- 1000 reps <- 10000 along <- seq(from=1,to=maxlen,by=5) ret <- sapply(along,FUN=bench_ret2,repeats=reps) nor <- sapply(along,FUN=bench_nor2,repeats=reps) res <- data.frame(N=along,ELAPSED_RET=ret["elapsed",],ELAPSED_NOR=nor["elapsed",])  # res object is then visualized # R version 2.15 

Function elapsed time comparison

The picture above may slightly difffer on your platform. Based on measured data, the size of returned object is not causing any difference, the number of repeats (even if scaled up) makes just a very small difference, which in real word with real data and real algorithm could not be counted or make your script run faster.

Is it better without calling return?

Return is good tool for clearly designing "leaves" of code where the routine should end, jump out of the function and return value.

# here without calling .Primitive('return') > (function() {10;20;30;40})() [1] 40 # here with .Primitive('return') > (function() {10;20;30;40;return(40)})() [1] 40 # here return terminates flow > (function() {10;20;return();30;40})() NULL > (function() {10;20;return(25);30;40})() [1] 25 >  

It depends on strategy and programming style of the programmer what style he use, he can use no return() as it is not required.

R core programmers uses both approaches ie. with and without explicit return() as it is possible to find in sources of 'base' functions.

Many times only return() is used (no argument) returning NULL in cases to conditially stop the function.

It is not clear if it is better or not as standard user or analyst using R can not see the real difference.

My opinion is that the question should be: Is there any danger in using explicit return coming from R implementation?

Or, maybe better, user writing function code should always ask: What is the effect in not using explicit return (or placing object to be returned as last leaf of code branch) in the function code?

like image 128
Petr Matousu Avatar answered Sep 28 '22 21:09

Petr Matousu