Why does a complex MATLAB gpuArray take twice as much memory as it should?

Tags:

matlab

gpu

I noticed that a large complex array takes twice as much memory on the GPU as on the CPU.

Here is a minimal example:

%-- First Try: Complex Single
gpu = gpuDevice(1);
m1 = gpu.FreeMemory;
test = complex(single(zeros(600000/8,1000)));  % 600 MByte complex single
whos('test')
test = gpuArray(test);
fprintf(' Used memory on GPU: %e\n', m1-gpu.FreeMemory);

Now I do the same with a real (non-complex) array that has twice as many elements:

%-- Second Try: Real Single
gpu = gpuDevice(1);
m1 = gpu.FreeMemory;
test = single(zeros(600000/4,1000));  % 600 MByte real single
whos('test')
test = gpuArray(test);
fprintf(' Used memory on GPU: %e\n', m1-gpu.FreeMemory);

The output is:

 Name          Size                  Bytes  Class     Attributes    
 test      75000x1000            600000000  single    complex   
 Used memory on GPU: 1.200095e+09

 Name           Size                  Bytes  Class     Attributes   
 test      150000x1000            600000000  single                  
 Used memory on GPU: 6.000476e+08

On the CPU both arrays occupy 600 MB, but on the GPU the complex array uses 1.2 GB. I tested this on two graphics cards (GeForce GTX 680 and Tesla K20) using MATLAB R2013a.

How can I avoid this? Is this a bug in Matlab?

Asked Apr 03 '14 at 16:04 by Stiefel


1 Answer

This was answered on MATLAB Central. To summarize the answer by MathWorks developer Edric Ellis:

  • gpu.FreeMemory may not be an accurate measure of the available GPU memory, because MATLAB does not immediately release memory when it has finished using it. gpu.AvailableMemory is a more accurate measure.

  • Transferring complex data to or from the GPU still requires twice the memory, because a format conversion is performed on the GPU. Specifically, complex arrays in CPU host memory are stored with the real and imaginary parts split into two separate vectors, whereas complex arrays on the GPU device are stored in interleaved format.
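
To make the two layouts concrete, here is a small CPU-only sketch that builds both by hand for a three-element vector. It only illustrates the storage orders described above; the variable names are hypothetical, and it does not show how MATLAB performs the conversion internally.

```matlab
% Illustration only: lay out a small complex vector in both formats by hand.
z = single([1+2i; 3+4i; 5+6i]);

% "Split" layout: one buffer of real parts, one buffer of imaginary parts.
splitRe = real(z);
splitIm = imag(z);

% "Interleaved" layout: re, im, re, im, ... in a single buffer.
% Stacking the parts as two rows and reading column-major interleaves them.
interleaved = reshape([splitRe.'; splitIm.'], [], 1);

disp(interleaved.');                      % 1 2 3 4 5 6

% Each layout holds the same number of bytes (2 singles per element).
bytesPerLayout = 2 * numel(z) * 4;        % 24 bytes here
% If both layouts exist at once during a conversion, peak usage doubles.
peakBytes = 2 * bytesPerLayout;           % 48 bytes here
```

If the source and destination buffers of such a conversion both live on the device at the same time, the peak footprint is twice the array size, which is consistent with the 2x figure above.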

Testing on R2017a, I confirmed that:

  • Switching from gpu.FreeMemory to gpu.AvailableMemory does resolve the discrepancy in reported memory usage that prompted the original question.

  • With 8 GB of GPU memory, copying...

    • 6 GB real array: success
    • 6 GB complex array: "Out of memory on device" error
    • 6 separate 1 GB complex arrays: success
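
Since several smaller complex transfers succeeded where one large transfer did not, one possible workaround is to fill a preallocated gpuArray block by block. The following is a minimal sketch under that assumption: the chunk count and variable names are hypothetical, it requires a supported GPU, and I have not verified that the indexed assignment avoids every extra temporary on the device.

```matlab
% Sketch: copy a complex array to the GPU in column blocks, so that each
% host-to-device transfer only has to convert one block at a time.
gpu = gpuDevice(1);
before = gpu.AvailableMemory;

% 600 MB complex single on the host (nonzero imaginary part).
h = complex(zeros(75000, 1000, 'single'), ones(75000, 1000, 'single'));

nChunks = 6;                                           % hypothetical choice
edges = round(linspace(0, size(h, 2), nChunks + 1));   % block boundaries

% Preallocate the complex destination directly on the device.
g = complex(zeros(size(h), 'single', 'gpuArray'));

for k = 1:nChunks
    cols = edges(k) + 1 : edges(k + 1);
    g(:, cols) = gpuArray(h(:, cols));                 % transfer one block
end

wait(gpu);                                             % let pending GPU work finish
fprintf(' Used memory on GPU: %e\n', before - gpu.AvailableMemory);
```

Smaller blocks reduce the temporary space each transfer needs, at the cost of more transfers.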

Answered Nov 15 '22 at 04:11 by KQS