Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How come this algorithm in Ruby runs faster than in Parallel'd C#?

The following ruby code runs in ~15s. It barely uses any CPU/Memory (about 25% of one CPU):

def collatz(num)
   num.even? ? num/2 : 3*num + 1
end

start_time = Time.now
max_chain_count = 0
max_starter_num = 0
(1..1000000).each do |i|
    count = 0
    current = i
    current = collatz(current) and count += 1 until (current == 1)
    max_chain_count = count and max_starter_num = i if (count > max_chain_count)
end

puts "Max starter num: #{max_starter_num} -> chain of #{max_chain_count} elements. Found in: #{Time.now - start_time}s"

And the following TPL C# puts all my 4 cores to 100% usage and is orders of magnitude slower than the ruby version:

static void Euler14Test()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int max_chain_count = 0;
    int max_starter_num = 0;
    object locker = new object();
    Parallel.For(1, 1000000, i =>
    {
        int count = 0;
        int current = i;
        while (current != 1)
        {
            current = collatz(current);
            count++;
        }
        if (count > max_chain_count)
        {
            lock (locker)
            {
                max_chain_count = count;
                max_starter_num = i;
            }
        }
        if (i % 1000 == 0)
            Console.WriteLine(i);
    });
    sw.Stop();
    Console.WriteLine("Max starter i: {0} -> chain of {1} elements. Found in: {2}s", max_starter_num, max_chain_count, sw.Elapsed.ToString());
}

static int collatz(int num)
{
    return num % 2 == 0 ? num / 2 : 3 * num + 1;
}

How come ruby runs faster than C#? I've been told that Ruby is slow. Is that not true when it comes to algorithms?


Perf AFTER correction:

  • Ruby (Non parallel): 14.62s
  • C# (Non parallel): 2.22s
  • C# (With TPL): 0.64s
like image 495
Louis Kottmann Avatar asked Dec 06 '12 14:12

Louis Kottmann


2 Answers

Actually, the bug is quite subtle, and has nothing to do with threading. The reason that your C# version takes so long is that the intermediate values computed by the collatz method eventually start to overflow the int type, resulting in negative numbers which may then take ages to converge.

This first happens when i is 134,379, for which the 129th term (assuming one-based counting) is 2,482,111,348. This exceeds the maximum value of 2,147,483,647 and therefore gets stored as -1,812,855,948.

To get good performance (and correct results) on the C# version, just change:

int current = i;

…to:

long current = i;

…and:

static int collatz(int num)

…to:

static long collatz(long num)

That will bring down your performance to a respectable 1.5 seconds.

Edit: CodesInChaos raises a very valid point about enabling overflow checking when debugging math-oriented applications. Doing so would have allowed the bug to be immediately identified, since the runtime would throw an OverflowException.

Overflow checking

OverflowException

like image 172
Douglas Avatar answered Oct 18 '22 16:10

Douglas


Should be:

Parallel.For(1L, 1000000L, i =>
    {

Otherwise, you have integer overfill and start checking negative values. The same collatz method should operate with long values.

like image 36
Alexander Bortnik Avatar answered Oct 18 '22 16:10

Alexander Bortnik