Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculating mean value in the metal kernel

Anyone knows a proper way to calculate mean value of the buffer with random float numbers in the metal kernel?

Dispatching work on the compute command encoder:

threadsPerGroup = MTLSizeMake(1, 1, inputTexture.arrayLength);
numThreadGroups = MTLSizeMake(1, 1, inputTexture.arrayLength / threadsPerGroup.depth);

[commandEncoder dispatchThreadgroups:numThreadGroups
               threadsPerThreadgroup:threadsPerGroup];

Kernel code:

kernel void mean(texture2d_array<float, access::read> inTex [[ texture(0) ]],
             device float *means                            [[ buffer(1) ]],
             uint3 id                                       [[ thread_position_in_grid ]]) {

    if (id.x == 0 && id.y == 0) {
        float mean = 0.0;
        for (uint i = 0; i < inTex.get_width(); ++i) {
            for (uint j = 0; j < inTex.get_height(); ++j) {
                    mean += inTex.read(uint2(i, j), id.z)[0];
            }
        }

        float textureArea = inTex.get_width() * inTex.get_height();
        mean /= textureArea;
        out[id.z] = mean;
    }
}

The buffer is represented in the texture of texture2d_array type with R32Float pixel format.

like image 474
Vahan Babayan Avatar asked Oct 31 '16 17:10

Vahan Babayan


1 Answers

If you can use an array of uint (instead of float) as your data source, I would suggest using an "Atomic Fetch and Modify functions" (as described in the metal shading language spec) to write atomically to a buffer.

Here's an example of a kernel function which takes an input buffer (data: an array of Float) and writes the sum of the buffer into an atomic buffer (sum, a pointer to a uint):

kernel void sum(device uint *data [[ buffer(0) ]],
                volatile device atomic_uint *sum [[ buffer(1) ]],
                uint gid [[ thread_position_in_grid ]])
{
    atomic_fetch_add_explicit(sum, data[gid], memory_order_relaxed);
}

In your swift file, you would set the buffers:

...
let data: [UInt] = [1, 2, 3, 4]
let dataBuffer = device.makeBuffer(bytes: &data, length: (data.count * MemoryLayout<UInt>.size), options: [])
commandEncoder.setBuffer(dataBuffer, offset: 0, at: 0)

var sum:UInt = 0
let sumBuffer = device!.makeBuffer(bytes: &sum, length: MemoryLayout<UInt>.size, options: [])
commandEncoder.setBuffer(sumBuffer, offset: 0, at: 1)
commandEncoder.endEncoding()

Commit, wait and then fetch the data from the GPU:

commandBuffer.commit()
commandBuffer.waitUntilCompleted()

let nsData = NSData(bytesNoCopy: sumBuffer.contents(),
                        length: sumBuffer.length,
                        freeWhenDone: false)
nsData.getBytes(&sum, length:sumBuffer.length)

let mean = Float(sum/data.count)
print(mean)

Alternatively, if your initial data source has to be an array of float, you could use the vDSP_meanv method of the Accelerate framework which is very fast for such computation.

I Hope that helped, cheers!

like image 89
Steve Begin Avatar answered Oct 14 '22 08:10

Steve Begin