
OpenCL: 32-bit and 64-bit popcnt instruction on GPU?

Tags:

gpgpu

opencl

I want to write a program for the GPU (preferably in OpenCL), and a large part of the computation consists of counting the number of 1 bits in a bit array (packed as long or int).

On modern CPUs I would obviously just use the native __popcnt instruction. I have read in several places on the internet that this instruction is also present in hardware on modern GPUs, which would be a huge speedup for me (at least for 32-bit; I am not sure about 64-bit).

However, I cannot find anywhere how to use this instruction. So:

1) How can I find out which GPUs have this instruction? (I still need to buy my GPU, so it will be a modern high-end one, probably a Radeon HD7000 series or an NVIDIA Kepler.)

2) How do I call this instruction from OpenCL (or a similar GPU language)?

user1111929 asked Feb 04 '12 12:02


1 Answer

This is available as the extension cl_amd_popcnt. I have a Radeon 6870 card and an Opteron 6128 CPU, and both support the extension.
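To answer the "which devices have it" part in code, one option is to look for the extension name in the device's extension string. Below is a minimal sketch in plain C, assuming you have already fetched that space-separated string with clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...). The helper name has_extension is made up for this example and is not part of the OpenCL API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API call).
 * Returns 1 if `name` occurs as a whole, space-delimited token in
 * `ext_list`, and 0 otherwise. In a real host program `ext_list` would be
 * the string returned by clGetDeviceInfo with CL_DEVICE_EXTENSIONS. */
int has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_list;

    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is bounded by spaces or string ends,
         * so that "cl_amd_popcnt2" does not match "cl_amd_popcnt". */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += 1; /* keep searching after this partial hit */
    }
    return 0;
}
```

You would call it as has_extension(extensions, "cl_amd_popcnt"). The whole-token comparison is deliberate: a bare strstr would also accept longer names that merely start with the same characters.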

Even better news for you is that as of OpenCL 1.2, it is no longer an extension. See the built-in function popcount on the reference card and in the spec. The AMD 7xxx series hardware is OpenCL 1.2 compatible, and I imagine the new NVIDIA stuff is too.

"T is type char, charn, uchar, ucharn, short, shortn, ushort, ushortn, int, intn, uint, uintn, long, longn, ulong, or ulongn, where n is 2, 3, 4, 8, or 16"

T popcount(T x) returns the number of populated (non-zero) bits in x.
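As a rough illustration of how this could be called, here is a sketch in C that holds an OpenCL 1.2 kernel as a source string (the form a host program would pass to clCreateProgramWithSource) together with a portable CPU reference for checking the GPU results. The kernel name count_bits, its argument names, and popcount64_ref are all made up for this example; the host-side setup (context, queue, buffers) is left out.

```c
#include <stdint.h>

/* OpenCL C source for a kernel that writes the bit count of each 64-bit
 * word. Kernel and argument names are hypothetical. The popcount built-in
 * is the OpenCL 1.2 function quoted above; for a ulong argument it returns
 * a ulong, so the result is narrowed to uint explicitly. */
const char *const popcount_kernel_src =
    "__kernel void count_bits(__global const ulong *words,\n"
    "                         __global uint *counts)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    counts[i] = (uint)popcount(words[i]);\n"
    "}\n";

/* Portable CPU reference (hypothetical name), useful for validating what
 * the kernel returns. Each iteration clears the lowest set bit, so the
 * loop runs once per 1 bit in x. */
unsigned popcount64_ref(uint64_t x)
{
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}
```

For 32-bit data the same kernel shape should work with uint in place of ulong. Whether the compiler maps popcount to a single hardware instruction depends on the device and driver, so it is worth benchmarking on the card you end up buying.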

http://www.khronos.org/registry/cl/sdk/1.2/docs/OpenCL-1.2-refcard.pdf

http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf

mfa answered Dec 03 '22 20:12