If we look at any function written in C in R base source code we can see that the code is quite simple, e.g.
#include "nmath.h"
#include "dpq.h"
double dexp(double x, double scale, int give_log)
{
#ifdef IEEE_754
/* NaNs propagated correctly */
if (ISNAN(x) || ISNAN(scale)) return x + scale;
#endif
if (scale <= 0.0) ML_ERR_return_NAN;
if (x < 0.)
return R_D__0;
return (give_log ?
(-x / scale) - log(scale) :
exp(-x / scale) / scale);
}
While the function is vectorized if called from R:
dexp(rep(1, 5), 1:2)
## [1] 0.3678794 0.2706706 0.3678794 0.2706706 0.3678794
How is it so? What makes it properly vectorized?
Actually, you're not quoting the code that defines the R function. In R we see
> dexp
function (x, rate = 1, log = FALSE)
.Call(C_dexp, x, 1/rate, log)
<bytecode: 0x3582cf8>
<environment: namespace:stats>
which tells us that the dexp() function is defined in the stats package. With a little digging we see in the stats NAMESPACE
useDynLib(stats, .registration = TRUE, .fixes = "C_")
which tells us that the prefix C_ is added to the function name when exposed to R. So we're looking for a C function called dexp in the stats package. There are two relevant entries
init.c:147: CALLDEF_MATH2_1(dexp),
distn.c:144:DEFMATH2_1(dexp)
the first is a macro making the C function available to R, the second is a macro defining the function. For the later, the macro is defined as
#define DEFMATH2_1(name) \
SEXP do_##name(SEXP sa, SEXP sb, SEXP sI) { \
return math2_1(sa, sb, sI, name); \
}
which tells us to look for the math2_1 function, a little further up the file
static SEXP math2_1(SEXP sa, SEXP sb, SEXP sI, double (*f)(double, double, int))
{
SEXP sy;
R_xlen_t i, ia, ib, n, na, nb;
double ai, bi, *a, *b, *y;
int m_opt;
int naflag;
if (!isNumeric(sa) || !isNumeric(sb))
error(R_MSG_NONNUM_MATH);
SETUP_Math2;
m_opt = asInteger(sI);
mod_iterate(na, nb, ia, ib) {
// if ((i+1) % NINTERRUPT) R_CheckUserInterrupt();
ai = a[ia];
bi = b[ib];
if_NA_Math2_set(y[i], ai, bi)
else {
y[i] = f(ai, bi, m_opt);
if (ISNAN(y[i])) naflag = 1;
}
}
FINISH_Math2;
return sy;
} /* math2_1() */
That mod_iterate() call is actually another macro
#define mod_iterate(n1,n2,i1,i2) for (i=i1=i2=0; i<n; \
i1 = (++i1 == n1) ? 0 : i1,\
i2 = (++i2 == n2) ? 0 : i2,\
++i)
and you can see, at least with the eye of faith, that the C code is implementing a loop, so the magic implied by the original question (that compact C code results in vectorized calculations) is not there -- it is an iteration at the C level.
math2_1 takes as it's final argument a function, which in this case is the dexp quoted in the original question. This function is applied to each element in the iteration.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With