R writing style - require vs. ::


OK, we're all familiar with the double colon operator in R. Whenever I'm about to write a function, I use require(<pkgname>), but I have always thought about using :: instead. Using require in custom functions is better practice than library, since require gives a warning and returns FALSE, whereas library throws an error if you give it the name of a package that isn't installed.

On the other hand, the :: operator gets a single variable from the package, while require loads the whole package (at least I hope so), so speed differences were the first thing that came to mind. :: must be faster than require.

To check that, I wrote two simple functions that call the read.systat function from the foreign package, via require and :: respectively, to import the Iris.syd dataset that ships with foreign. I ran each function 1000 times (a shamelessly arbitrary number), and... crunched some numbers.

Strangely (or not), I found significant differences in user CPU and elapsed time, but no significant difference in system CPU. And an even stranger conclusion: :: is actually slower! The documentation for :: is very terse, and just by looking at the sources it seems obvious that :: should perform better!

require

#!/usr/local/bin/r

## with require
## (as written, this expects Iris.syd in the working directory)
fn1 <- function() {
  require(foreign)
  read.systat("Iris.syd", to.data.frame=TRUE)
}

## times
n <- 1e3

sink("require.txt")
print(t(replicate(n, system.time(fn1()))))
sink()

double colon

#!/usr/local/bin/r

## with ::
## (as written, this expects Iris.syd in the working directory)
fn2 <- function() {
  foreign::read.systat("Iris.syd", to.data.frame=TRUE)
}

## times
n <- 1e3


sink("double_colon.txt")
print(t(replicate(n, system.time(fn2()))))
sink()

Grab CSV data here. Some stats:

user CPU:     W = 475366    p-value = 0.04738  MRr =  975.866    MRc = 1025.134
system CPU:   W = 503312.5  p-value = 0.7305   MRr = 1003.8125   MRc =  997.1875
elapsed time: W = 403299.5  p-value < 2.2e-16  MRr =  903.7995   MRc = 1097.2005
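
The W statistics and the p-value format above look like output from R's wilcox.test, although the post does not say which function produced them. As a rough sketch under that assumption, the hypothetical helper compare_timings below (not part of the original post) computes the same four quantities for any two timing vectors. The second half checks the reported figures against each other, using the identity W = n * MRr - n * (n + 1) / 2 for n = 1000 runs per group.

```r
# Hypothetical helper (not from the original post): summarise two timing
# vectors with a Wilcoxon rank-sum test and the mean joint rank of each group.
compare_timings <- function(x, y) {
  r <- rank(c(x, y))                            # joint ranks; ties get average ranks
  test <- suppressWarnings(wilcox.test(x, y))   # small tied samples warn about exact p-values
  c(W       = unname(test$statistic),
    p.value = test$p.value,
    MRr     = mean(r[seq_along(x)]),
    MRc     = mean(r[length(x) + seq_along(y)]))
}

# Consistency check on the figures reported above, with n = 1000 runs per group:
# W should equal n * MRr - n * (n + 1) / 2.
n   <- 1000
MRr <- c(user = 975.866, system = 1003.8125, elapsed = 903.7995)
W_from_ranks <- n * MRr - n * (n + 1) / 2
print(W_from_ranks)
```

The recomputed values agree with the reported W column, so the table is internally consistent. A higher mean rank means longer times, so by these numbers :: has the larger user CPU and elapsed times.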

MRr is the mean rank for require, and MRc is the same for ::. I must have done something wrong here. It just doesn't make any sense... Execution time for :: ought to be way faster!!! I may have screwed something up, so you shouldn't discard that option...

OK... I've wasted my time only to see that there is some difference, and the analysis I carried out was completely useless, so, back to the question:

"Why should one prefer require over :: when writing a function?"


asked Dec 06 '10 23:12

aL3xa

2 Answers

"Why should one prefer require over :: when writing a function?"

I usually prefer require due to the nice TRUE/FALSE return value that lets me deal with the possibility of the package not being available up front before getting into the code. Crash as early as possible instead of halfway through your analysis.
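
A minimal sketch of that fail-early pattern, assuming a hypothetical wrapper called load_or_stop (the name and the error message are illustrative, not from any package):

```r
# Hypothetical wrapper: use require()'s TRUE/FALSE result to stop right away
# when a package is missing, before any real work starts.
load_or_stop <- function(pkg) {
  ok <- suppressWarnings(
    require(pkg, character.only = TRUE, quietly = TRUE)
  )
  if (!ok) {
    stop("package '", pkg, "' is needed but could not be loaded", call. = FALSE)
  }
  invisible(TRUE)
}

load_or_stop("stats")   # a base package, so this succeeds silently
```

Calling load_or_stop at the top of a function or script turns a missing package into an immediate, readable error instead of a failure somewhere in the middle of the analysis.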

I only use :: when I need to make sure I am using the correct version of a function, not a version from some other package that is masking the name.
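
Masking can be illustrated without installing anything. In this sketch a user-defined sd (a deliberately silly stand-in for a function from another package) shadows stats::sd, and :: still reaches the intended version:

```r
# A user definition that shadows stats::sd on the search path.
sd <- function(x) "masked"

masked_result   <- sd(c(1, 2, 3))         # finds the shadowing definition first
explicit_result <- stats::sd(c(1, 2, 3))  # always the version exported by stats

print(masked_result)
print(explicit_result)

rm(sd)                                    # remove the shadowing definition again
```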

On the other hand, the :: operator gets a single variable from the package, while require loads the whole package (at least I hope so), so speed differences were the first thing that came to mind. :: must be faster than require.

I think you may be overlooking the effects of lazy loading, which the foreign package uses according to the first page of its manual. Essentially, packages that use lazy loading defer the loading of objects, such as functions, until those objects are used for the first time. So your argument that ":: must be faster than require" does not necessarily hold, because foreign does not load all of its contents into memory when you attach it with require. For full details on lazy loading, see Prof. Ripley's article in RNews, Volume 4, Issue 2.
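
A related point is that :: does not fetch one object in isolation either: it loads the package's namespace if it is not loaded yet, and only skips attaching it to the search path. The sketch below makes that visible with the base package splines, chosen here purely for illustration because it ships with R and is not attached by default. The result is stored rather than asserted, since it depends on what the session has already loaded.

```r
pkg <- "splines"                                   # illustrative choice, not from the post
is_attached <- function() paste0("package:", pkg) %in% search()
is_loaded   <- function() pkg %in% loadedNamespaces()

state <- c(attached_at_start = is_attached())

f <- splines::bs                                   # :: loads the namespace if needed
state <- c(state,
           loaded_after_colons   = is_loaded(),
           attached_after_colons = is_attached())

suppressPackageStartupMessages(require(splines))   # require also attaches it
state <- c(state, attached_after_require = is_attached())
print(state)
```

In a fresh session, one would expect the namespace to be loaded but not attached after the :: call, and attached only after require.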

answered Nov 03 '22 07:11

Sharpie

The time to load a package is almost always small compared to the time you spend trying to figure out what the code you wrote six months ago was about, so in this case coding for clarity is the most important thing.

For scripts, having a call to require or library at the start lets you know which packages you need straight away.

Similarly, calling require (or a wrapper like requirePackage in Hmisc or try_require in ggplot2) at the start of a function is the most unambiguous way of showing that you need to use that package.

:: should be reserved for cases when you have naming conflicts between packages – compare, e.g.,

Hmisc::is.discrete

and

plyr::is.discrete

answered Nov 03 '22 06:11

Richie Cotton
