Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can counting byte matches between two strings be optimized using SIMD?

Profiling suggests that this function here is a real bottle neck for my application:

static inline int countEqualChars(const char* string1, const char* string2, int size) {
    int r = 0;
    for (int j = 0; j < size; ++j) {
        if (string1[j] == string2[j]) {
            ++r;
        }
    }

    return r;
}

Even with -O3 and -march=native, G++ 4.7.2 does not vectorize this function (I checked the assembler output). Now, I'm not an expert with SSE and friends, but I think that comparing more than one character at once should be faster. Any ideas on how to speed things up? Target architecture is x86-64.

like image 837
Milan Avatar asked Mar 24 '13 13:03

Milan


1 Answers

Of course it can.

pcmpeqb compares two vectors of 16 bytes and produces a vector with zeros where they differed, and -1 where they match. Use this to compare 16 bytes at a time, adding the result to an accumulator vector (make sure to accumulate the results of at most 255 vector compares to avoid overflow). When you're done, there are 16 results in the accumulator. Sum them and negate to get the number of equal elements.

If the lengths are very short, it will be hard to get a significant speedup from this approach. If the lengths are long, then it will be worth pursuing.

like image 127
Stephen Canon Avatar answered Oct 17 '22 06:10

Stephen Canon