Fast inner product of ternary vectors

Question

Consider two vectors, A and B, of size n, 7 <= n <= 23. Both A and B consists of -1s, 0s and 1s only.

I need a fast algorithm which computes the inner product of A and B.

So far I've thought of storing the signs and values in separate uint32_ts using the following encoding:

sign 0, value 0 → 0
sign 0, value 1 → 1
sign 1, value 1 → -1.

The C++ implementation I've thought of looks like the following:

struct ternary_vector {
    uint32_t sign, value;
};

int inner_product(const ternary_vector & a, const ternary_vector & b) {
    uint32_t psign = a.sign ^ b.sign;
    uint32_t pvalue = a.value & b.value;
    psign &= pvalue;
    pvalue ^= psign;
    return __builtin_popcount(pvalue) - __builtin_popcount(psign);
}

This works reasonably well, but I'm not sure whether it is possible to do it better. Any comment on the matter is highly appreciated.

rabensky · Accepted Answer

I like having the 2 uint32_t, but I think your actual calculation is a bit wasteful

Just a few minor points:

I'm not sure about the reference (getting a and b by const &) - this adds a level of indirection compared to putting them on the stack. When the code is this small (a couple of clocks maybe) this is significant. Try passing by value and see what you get
__builtin_popcount can be, unfortunately, very inefficient. I've used it myself, but found that even a very basic implementation I wrote was far faster than this. However - this is dependent on the platform.

Basically, if the platform has a hardware popcount implementation, __builtin_popcount uses it. If not - it uses a very inefficient replacement.

Fast inner product of ternary vectors

Tags:

c++

c

vector

product

user92382

1 Answers

rabensky

Recent Activity

Donate For Us

Fast inner product of ternary vectors

Tags:

c++

c

vector

product

user92382

1 Answers

rabensky

Related questions

Recent Activity

Donate For Us