
Built-in mod ('%') vs custom mod function: improve the performance of modulus operation

I recently read that the modulo operator (%) is very slow, so I wrote a function that behaves just like a % b. But is it actually faster than the built-in operator?

Here's my function:

int mod(int a, int b)
{
    int tmp = a/b;
    return a - (b*tmp);
}
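
On correctness (as opposed to speed): since C++11, integer division truncates toward zero and (a / b) * b + a % b equals a whenever a / b is defined, so this function should return the same value as a % b, including for negative operands. A minimal sketch that checks this over a small range; agrees_with_builtin and mod_spot_checks are hypothetical helpers added for illustration, not part of the question:

```cpp
#include <cassert>

// The function from the question, unchanged in behaviour.
int mod(int a, int b)
{
    int tmp = a / b;        // truncates toward zero (C++11 and later)
    return a - (b * tmp);   // what remains after removing b * (a / b)
}

// Hypothetical helper: true when mod() agrees with % for every a in
// [-limit, limit] and every nonzero b in the same range.
bool agrees_with_builtin(int limit)
{
    for (int a = -limit; a <= limit; ++a)
        for (int b = -limit; b <= limit; ++b)
            if (b != 0 && mod(a, b) != a % b)
                return false;
    return true;
}

// Spot checks, including negative operands.
void mod_spot_checks()
{
    assert(mod(7, 3) == 1);
    assert(mod(-7, 3) == -1);   // result takes the sign of the dividend
    assert(mod(7, -3) == 1);
    assert(agrees_with_builtin(50));
}
```

Like %, the function is undefined for b == 0, so that case is skipped in the check.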
madMDT asked Oct 25 '15 18:10



2 Answers

According to Chandler Carruth's benchmarks at CppCon 2015, the fastest modulo operator (on x86, when compiled with Clang) is:

int fast_mod(const int input, const int ceil) {
    // apply the modulo operator only when needed
    // (i.e. when the input is greater than or equal to the ceiling)
    return input >= ceil ? input % ceil : input;
    // NB: this assumes that both numbers are positive
}

I suggest watching the whole talk; he goes into more detail on why this method is faster than applying % unconditionally.
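
To make the positivity assumption concrete, here is a minimal sketch that compares fast_mod with % for non-negative inputs and shows where the two diverge for a negative input. The helpers agrees_for_non_negative and fast_mod_spot_checks are hypothetical names added for illustration; they are not from the talk:

```cpp
#include <cassert>

// The function from the answer above, unchanged.
int fast_mod(const int input, const int ceil)
{
    return input >= ceil ? input % ceil : input;
}

// Hypothetical helper: true when fast_mod() agrees with % for every
// input in [0, limit] and every ceil in [1, limit].
bool agrees_for_non_negative(int limit)
{
    for (int input = 0; input <= limit; ++input)
        for (int ceil = 1; ceil <= limit; ++ceil)
            if (fast_mod(input, ceil) != input % ceil)
                return false;
    return true;
}

// Spot checks: the shortcut holds for non-negative input only.
void fast_mod_spot_checks()
{
    assert(fast_mod(5, 8) == 5);    // below the ceiling: returned as-is, no division
    assert(fast_mod(13, 8) == 5);   // at or above the ceiling: falls back to %
    assert(fast_mod(-7, 3) == -7);  // negative input is returned unchanged...
    assert(-7 % 3 == -1);           // ...whereas % yields -1
}
```

So the function is a drop-in replacement for % only when the input is known to be non-negative and the ceiling is positive.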

maddouri answered Sep 30 '22 01:09


This will likely be compiler and platform dependent.

But I was curious, so I benchmarked it, and on my system you appear to be correct: your mod function is faster than the % operator. However, the method from @865719's answer is the fastest of the three:

#include <chrono>
#include <iostream>

class Timer
{
    using clk = std::chrono::steady_clock;
    using microseconds = std::chrono::microseconds;

    clk::time_point tsb;
    clk::time_point tse;

public:

    void clear() { tsb = tse = clk::now(); }
    void start() { tsb = clk::now(); }
    void stop() { tse = clk::now(); }

    friend std::ostream& operator<<(std::ostream& o, const Timer& timer)
    {
        return o << timer.secs();
    }

    // return time difference in seconds
    double secs() const
    {
        if(tse <= tsb)
            return 0.0;
        auto d = std::chrono::duration_cast<microseconds>(tse - tsb);
        return d.count() / 1000000.0;
    }
};

int mod(int a, int b)
{
    int tmp=a/b;
    return a-(b*tmp);
}

int fast_mod(const int input, const int ceil) {
    // apply the modulo operator only when needed
    // (i.e. when the input is greater than or equal to the ceiling)
    return input < ceil ? input : input % ceil;
    // NB: this assumes that both numbers are positive
}

int main()
{
    auto N = 1000000000U;
    unsigned sum = 0;

    Timer timer;

    for(auto times = 0U; times < 3; ++times)
    {
        std::cout << "     run: " << (times + 1) << '\n';

        sum = 0;
        timer.start();
        for(decltype(N) n = 0; n < N; ++n)
            sum += n % (N - n);
        timer.stop();

        std::cout << "       %: " << sum << " " << timer << "s" << '\n';

        sum = 0;
        timer.start();
        for(decltype(N) n = 0; n < N; ++n)
            sum += mod(n, N - n);
        timer.stop();

        std::cout << "     mod: " << sum << " " << timer << "s" << '\n';

        sum = 0;
        timer.start();
        for(decltype(N) n = 0; n < N; ++n)
            sum += fast_mod(n, N - n);
        timer.stop();

        std::cout << "fast_mod: " << sum << " " << timer << "s" << '\n';
    }
}

Build: GCC 5.1.1 (x86_64)

g++ -std=c++14 -march=native -O3 -g0 ...

Output:

     run: 1
       %: 3081207628 5.49396s
     mod: 3081207628 4.30814s
fast_mod: 3081207628 2.51296s
     run: 2
       %: 3081207628 5.5522s
     mod: 3081207628 4.25427s
fast_mod: 3081207628 2.52364s
     run: 3
       %: 3081207628 5.4947s
     mod: 3081207628 4.29646s
fast_mod: 3081207628 2.56916s
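
One thing to keep in mind when reading these numbers: the benchmark divides n by N - n, and n < N - n holds for the first half of the loop, so fast_mod can return without dividing in roughly half of the iterations. A minimal sketch that counts those iterations for a small N; skipped_divisions is a hypothetical helper added for illustration and was not part of the benchmark:

```cpp
#include <cassert>

// Hypothetical helper: for the benchmark's pattern (n % (N - n), n in [0, N)),
// count how many iterations fast_mod would answer without dividing.
unsigned long long skipped_divisions(unsigned N)
{
    unsigned long long skipped = 0;
    for (unsigned n = 0; n < N; ++n)
        if (n < N - n)   // same test as fast_mod's "input < ceil"
            ++skipped;
    return skipped;
}

// Spot checks for small loop sizes.
void skipped_divisions_spot_checks()
{
    assert(skipped_divisions(1000) == 500);  // n = 0 .. 499
    assert(skipped_divisions(1001) == 501);  // n = 0 .. 500
}
```

With a different mix of inputs (for example, inputs that are almost always larger than the divisor) the advantage of fast_mod may shrink or disappear, so it is worth measuring with data that resembles the real workload.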
Galik answered Sep 30 '22 01:09