I've been running through Project Euler trying to write programs that are computationally efficient. Consider problem 1: http://projecteuler.net/problem=1. I've upped the range from 1000 to 10,000,000 to highlight inefficiencies.
This is my solution:
system.time({
    x <- 1:1E7
    a <- sum(as.numeric(x[x %% 3 == 0 | x %% 5 == 0]))
})
   user  system elapsed
  0.980   0.041   1.011
Here is some C++ code written by a friend to do the same thing.
#include <iostream>
using namespace std;
int main(int argc, char** argv)
{
    long long x = 0;                      // long long: the sum overflows a 32-bit long
    for (int i = 1; i <= 10000000; i++)   // <= so the range matches R's 1:1E7
    {
        if (i % 3 == 0)
            x += i;
        else if (i % 5 == 0)
            x += i;
    }
    cout << x << endl;
    return 0;
}
cbaden$ time ./a.out
23333341666668
real 0m0.044s
user 0m0.042s
sys 0m0.001s
I know C++ should be faster than R, but is it really this much faster? Rprof indicates that I'm spending almost 60% of my time in the modulo operator and 13% in the "==" operation. Are there any vectorized ways of doing this faster?
A secondary concern is memory: this approach is not very scalable as the range grows. Is there a good way to do this that preserves the vectorization, yet doesn't try to keep the whole subset in memory?
Modulo is faster when it operates on integers rather than numerics:
f1 <- function() {
    x <- 1:1E7
    a <- sum(as.numeric(x[x %% 3 == 0 | x %% 5 == 0]))
}
f2 <- function() {
    x <- 1:1E7
    a <- sum(as.numeric(x[x %% 3L == 0L | x %% 5L == 0L]))
}
library(rbenchmark)
benchmark(f1(), f2(), replications = 5)
# test replications elapsed relative user.self sys.self user.child sys.child
# 1 f1() 5 14.78 4.976431 13.95 0.67 NA NA
# 2 f2() 5 2.97 1.000000 2.37 0.50 NA NA
That's still far from C++ performance, but it's a step in the right direction.
A faster solution uses the closed-form sum of an arithmetic series plus inclusion-exclusion: add the multiples of 3 and the multiples of 5, then subtract the multiples of 15, which were counted twice.

x <- 1E7
a <- x %/% 3    # count of multiples of 3
b <- x %/% 5    # count of multiples of 5
m <- x %/% 15   # count of multiples of 15 (counted twice above)
ans <- 3 * a * (a + 1) / 2 + 5 * b * (b + 1) / 2 - 15 * m * (m + 1) / 2

though it doesn't really help with regard to the modulo question.