SSE multiplication of 2 64-bit integers

Question

How to multiply two 64-bit integers by another 2 64-bit integers? I didn't find any instruction which can do it.

EasyasPi · Accepted Answer

Late answer, but this is a better version of what Barabas posted.

If you have ever used GCC or Clang's vector extensions, this is the routine they use.

This uses the same method that long multiplication and grid multiplication use.

    65
  * 73
  ----
    15 //   (5 * 3)
   180 //   (6 * 3) * 10
   350 //   (5 * 7) * 10
+ 4200 // + (6 * 7) * 100
------
  4745

However, instead of doing each unit of 10, it uses each unit of 32 bits, and it leaves out the last multiply because it will always be shifted past the 64th bit, just like how you wouldn't multiply 6*7 if you were truncating values greater than 99.

#include <emmintrin.h>

/*
 * Grid/long multiply two 64-bit SSE lanes.
 * Works for both signed and unsigned.
 *   ----------------.--------------.----------------.
 *  |                |   b >> 32    | a & 0xFFFFFFFF |
 *  |----------------|--------------|----------------|  
 *  | d >> 32        |   b*d << 64  |    a*d << 32   |
 *  |----------------|--------------|----------------|
 *  | c & 0xFFFFFFFF |   b*c << 32  |       a*c      |
 *  '----------------'--------------'----------------'
 *  Add all of them together to get the product.
 *
 *  Because we truncate the value to 64 bits, b*d << 64 will be zero,
 *  so we can leave it out.
 *
 *  We also can add a*d and b*c first and then shift because of the
 *  distributive property: (a << 32) + (b << 32) == (a + b) << 32.
 */

__m128i Multiply64Bit(__m128i ab, __m128i cd)
{
    /* ac = (ab & 0xFFFFFFFF) * (cd & 0xFFFFFFFF); */
    __m128i ac = _mm_mul_epu32(ab, cd);

    /* b = ab >> 32; */
    __m128i b = _mm_srli_epi64(ab, 32);

    /* bc = b * (cd & 0xFFFFFFFF); */
    __m128i bc = _mm_mul_epu32(b, cd);

    /* d = cd >> 32; */
    __m128i d = _mm_srli_epi64(cd, 32);

    /* ad = (ab & 0xFFFFFFFF) * d; */
    __m128i ad = _mm_mul_epu32(ab, d);

    /* high = bc + ad; */
    __m128i high = _mm_add_epi64(bc, ad);

    /* high <<= 32; */
    high = _mm_slli_epi64(high, 32);

    /* return ac + high; */
    return _mm_add_epi64(high, ac);
}

Compiler Explorer Note: The GCC vector extension version was also included below for comparison.

SSE multiplication of 2 64-bit integers

Tags:

x86

multiplication

simd

sse

sse2

Ines Karmani

1 Answers

EasyasPi

Recent Activity

Donate For Us

SSE multiplication of 2 64-bit integers

Tags:

x86

multiplication

simd

sse

sse2

Ines Karmani

1 Answers

EasyasPi

Related questions

Recent Activity

Donate For Us