Allocating memory for data used by MTLBuffer in iOS Metal

This is a follow-up question to this answer. I am trying to replace a for-loop running on the CPU with a kernel function in Metal, to parallelize the computation and improve performance.

My function is basically a convolution. Since I repeatedly receive new data for my input array values (the data stems from an AVCaptureSession), it seems that newBufferWithBytesNoCopy:length:options:deallocator: is the sensible option for creating the MTLBuffer objects. Here is the relevant code:

id <MTLBuffer> dataBuffer = [device newBufferWithBytesNoCopy:dataVector length:sizeof(dataVector) options:MTLResourceStorageModeShared deallocator:nil];
id <MTLBuffer> filterBuffer = [device newBufferWithBytesNoCopy:filterVector length:sizeof(filterVector) options:MTLResourceStorageModeShared deallocator:nil];
id <MTLBuffer> outBuffer = [device newBufferWithBytesNoCopy:outVector length:sizeof(outVector) options:MTLResourceStorageModeShared deallocator:nil];

When running this I get the following error:

failed assertion `newBufferWithBytesNoCopy:pointer 0x16fd0bd48 is not 4096 byte aligned.'

Right now, I am not allocating any memory, but (for testing purposes) just creating an empty array of floats of a fixed size and filling it up with random numbers. So my main question is:

How do I allocate these arrays of floats correctly, so that the following requirement is met?

This value must result in a page-aligned region of memory.

Also, some additional questions:

  • Does it even make sense to create the MTLBuffer with the newBufferWithBytesNoCopy method, or is copying the data not really an issue in terms of performance? (My actual data will consist of approximately 43'000 float values per video frame.)
  • Is MTLResourceStorageModeShared the correct choice for MTLResourceOptions?
  • The API reference says:

    The storage allocation of the returned new MTLBuffer object is the same as the pointer input value. The existing memory allocation must be covered by a single VM region, typically allocated with vm_allocate or mmap. Memory allocated by malloc is specifically disallowed.

    Does this apply only to the output buffer, or should the storage allocation for all objects used with MTLBuffer not be done with malloc?

Maxi Mus asked Sep 29 '16 12:09


1 Answer

The easiest way to allocate page-aligned memory is with posix_memalign. Here's a complete example of creating a buffer with page-aligned memory:

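#import <Metal/Metal.h>
#include <stdlib.h>   // posix_memalign, free
#include <unistd.h>   // getpagesize

// `device` is the existing id<MTLDevice> from the question's code.
void *data = NULL;
NSUInteger pageSize = (NSUInteger)getpagesize();
// Required byte count, rounded up to the next multiple of the page size.
NSUInteger allocationSize = pageSize * 10;
// posix_memalign returns 0 on success and an error number on failure.
int result = posix_memalign(&data, pageSize, allocationSize);

if (result == 0 && data) {
    id<MTLBuffer> buffer = [device newBufferWithBytesNoCopy:data
                                                     length:allocationSize
                                                    options:MTLResourceStorageModeShared
                                                deallocator:^(void *pointer, NSUInteger length)
                                                            {
                                                                // Runs when the buffer is deallocated; release the memory.
                                                                free(pointer);
                                                            }];
    NSLog(@"Created buffer of length %lu", (unsigned long)buffer.length);
}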
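
The allocationSize comment in the example asks for the required byte count rounded up to a multiple of the page size. A minimal C sketch of that rounding, assuming the roughly 43,000 floats per frame mentioned in the question; the helper names round_up_to_page and alloc_page_aligned_floats are hypothetical and not part of any Apple API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>   /* posix_memalign, free */
#include <unistd.h>   /* getpagesize */

/* Round byte_count up to the next multiple of page_size. */
static size_t round_up_to_page(size_t byte_count, size_t page_size) {
    assert(page_size > 0);
    return ((byte_count + page_size - 1) / page_size) * page_size;
}

/* Hypothetical helper: allocate a page-aligned region large enough for
 * float_count floats. Returns NULL on failure. On success, *out_size
 * holds the padded size in bytes, which is a multiple of the page size. */
static float *alloc_page_aligned_floats(size_t float_count, size_t *out_size) {
    size_t page_size = (size_t)getpagesize();
    size_t padded = round_up_to_page(float_count * sizeof(float), page_size);
    void *memory = NULL;

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&memory, page_size, padded) != 0) {
        return NULL;
    }
    *out_size = padded;
    return (float *)memory;
}
```

The returned pointer and the padded size would then be passed as the pointer and length arguments of newBufferWithBytesNoCopy:length:options:deallocator:, with free called in the deallocator block as in the example above.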

Since you can't ensure that your data will arrive in a page-aligned pointer, you'll probably be better off just allocating an MTLBuffer of whatever size can accommodate your data, without using the no-copy variant. If you need to do real-time processing of the data, you should create a pool of buffers and cycle among them instead of waiting for each command buffer to complete. The Shared storage mode is correct for these use cases. The caveat about malloc applies only to the no-copy case, since in every other case Metal allocates the memory for you.
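
The idea of cycling among a pool of buffers can be sketched in plain C, independent of Metal. This is only an illustration under stated assumptions: the pool size of three, the BufferPool type, and the pool_* function names are all hypothetical. In a Metal app each slot would presumably wrap an MTLBuffer, and pool_release would be called from the command buffer's completion handler. The sketch is not thread-safe, so real code would need a lock or semaphore around acquire and release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>   /* posix_memalign, free */
#include <unistd.h>   /* getpagesize */

#define POOL_SIZE 3   /* assumption: up to three frames in flight */

typedef struct {
    void *memory;     /* page-aligned backing store for one frame */
    bool  in_flight;  /* true while the GPU may still be using the slot */
} PoolSlot;

typedef struct {
    PoolSlot slots[POOL_SIZE];
    size_t   slot_bytes;  /* padded size of each slot */
    size_t   next;        /* where the round-robin search starts */
} BufferPool;

/* Free all slots; safe to call on a partially initialized pool. */
static void pool_destroy(BufferPool *pool) {
    for (size_t i = 0; i < POOL_SIZE; i++) {
        free(pool->slots[i].memory);
        pool->slots[i].memory = NULL;
        pool->slots[i].in_flight = false;
    }
}

/* Allocate POOL_SIZE page-aligned slots of at least bytes_per_frame each. */
static bool pool_init(BufferPool *pool, size_t bytes_per_frame) {
    size_t page = (size_t)getpagesize();
    pool->slot_bytes = ((bytes_per_frame + page - 1) / page) * page;
    pool->next = 0;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool->slots[i].memory = NULL;
        pool->slots[i].in_flight = false;
    }
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (posix_memalign(&pool->slots[i].memory, page, pool->slot_bytes) != 0) {
            pool->slots[i].memory = NULL;  /* pointer value is unspecified on failure */
            pool_destroy(pool);
            return false;
        }
    }
    return true;
}

/* Return the index of the next free slot in round-robin order and mark it
 * in flight, or -1 if every slot is still in use. */
static int pool_acquire(BufferPool *pool) {
    for (size_t tried = 0; tried < POOL_SIZE; tried++) {
        size_t i = (pool->next + tried) % POOL_SIZE;
        if (!pool->slots[i].in_flight) {
            pool->slots[i].in_flight = true;
            pool->next = (i + 1) % POOL_SIZE;
            return (int)i;
        }
    }
    return -1;
}

/* Mark a slot as reusable, e.g. once the GPU work that used it has finished. */
static void pool_release(BufferPool *pool, int index) {
    assert(index >= 0 && index < POOL_SIZE);
    pool->slots[index].in_flight = false;
}
```

With this layout, each incoming video frame is copied into the slot returned by pool_acquire, so the CPU can fill one slot while the GPU is still working on another. A return value of -1 means every slot is busy, and the caller has to either wait or drop the frame.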

warrenm answered Nov 06 '22 23:11