I'm wondering if there is a way to see if I'm doing things right, when trying to use MATLAB's copy-on-write (lazy copy) mechanism to link the same large matrix from multiple cells in a cell array.
For example:
img = randn(500);
[dx,dy] = gradient(img);
S = cell(2,2);
S{1,1} = dx.^2;
S{2,2} = dy.^2;
S{1,2} = dx.*dy;
S{2,1} = S{1,2}; % should be a reference, as long as not modified
But looking at the output of whos:
>> whos
Name Size Bytes Class Attributes
S 2x2 8000448 cell
dx 500x500 2000000 double
dy 500x500 2000000 double
img 500x500 2000000 double
I would have liked to see S occupy 6 MB, rather than 8 MB.
Is there a way to verify that there are no mistakes in the program and that those two cells still reference the same array at the end?
I know of the function memory, but it sadly only works on Windows platforms (I'm on macOS).
One possible solution to verify that two particular arrays actually share data is using the following MEX-file modified from Yair's Undocumented MATLAB Blog:
#include "mex.h"
#include <cstdint>

// Returns the address of the input array's data buffer as a uint64 scalar.
void mexFunction(int /*nlhs*/, mxArray* plhs[], int nrhs, mxArray const* prhs[]) {
    if (nrhs < 1) mexErrMsgTxt("One input required.");
    plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
    std::uint64_t* out = static_cast<std::uint64_t*>(mxGetData(plhs[0]));
    // Arrays that share data report the same pointer here.
    out[0] = reinterpret_cast<std::uint64_t>(mxGetData(prhs[0]));
}
Saving that as getaddr.cpp and compiling with
mex getaddr.cpp
allows the following test:
img = randn(500);
[dx,dy] = gradient(img);
S = cell(2,2);
S{1,1} = dx.^2;
S{2,2} = dy.^2;
S{1,2} = dx.*dy;
S{2,1} = S{1,2}; % should be a reference, as long as not modified
assert(getaddr(S{1,2}) == getaddr(S{2,1}))
This is not the same as getting a summary of the memory actually used by the cell array S (which I still think would be useful), but it does make it possible to verify that memory is shared.
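To illustrate why comparing data addresses detects sharing, and why the check only holds "as long as not modified", here is a minimal sketch of a copy-on-write array in standalone C++. It is a toy model under stated assumptions, not MATLAB's actual implementation: the class CowArray and its members addr, read and write are hypothetical names, with addr playing the role of getaddr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy copy-on-write array; a hypothetical model, not MATLAB's real mxArray.
class CowArray {
public:
    explicit CowArray(std::size_t n, double value = 0.0)
        : buf_(std::make_shared<std::vector<double>>(n, value)) {}

    // The implicit copy constructor copies only the shared_ptr,
    // so a copy shares the same buffer as the original.

    // Address of the underlying data, the analogue of getaddr().
    const double* addr() const { return buf_->data(); }

    double read(std::size_t i) const { return (*buf_)[i]; }

    // Writing first detaches from the shared buffer, if it is shared.
    void write(std::size_t i, double v) {
        assert(i < buf_->size());
        if (buf_.use_count() > 1) {
            buf_ = std::make_shared<std::vector<double>>(*buf_);
        }
        (*buf_)[i] = v;
    }

private:
    std::shared_ptr<std::vector<double>> buf_;
};
```

In this model, after CowArray b = a; the two objects report the same addr(). Once b.write(0, 2.0) runs, b owns a private buffer, the addresses differ, and a is left unchanged. An address comparison like the assert above would start failing at exactly that point.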
EDIT:
Before this edit, my answer used an undocumented function that behaves unexpectedly and whose signature is not stable across MATLAB versions, so here I provide an extended version of @CrisLuengo's answer instead.
We can use a hash map in the recursive function check_shared to store the unique addresses of the data buffers together with their associated mxArrays, and then use it to compute the size of the data. Note that this only detects sharing within the cell array; it cannot detect arrays outside the cell array that have the same address as one of its elements.*
#include "mex.h"
#include <cstdint>
#include <unordered_map>
#include <utility>

// Maps each distinct data pointer to one mxArray that uses it.
typedef std::unordered_map<void*, const mxArray*> TableType;

// Recursively collects the data pointers of arr and everything nested in it.
TableType check_shared(const mxArray* arr, TableType table = TableType())
{
    // Unassigned cells and struct fields come back as null pointers.
    if (arr == nullptr) return table;
    switch (mxGetClassID(arr)) {
    case mxCELL_CLASS:
        for (mwIndex i = 0; i < mxGetNumberOfElements(arr); i++) {
            table = check_shared(mxGetCell(arr, i), std::move(table));
        }
        break;
    case mxSTRUCT_CLASS:
        for (int i = 0; i < mxGetNumberOfFields(arr); i++) {
            for (mwIndex j = 0; j < mxGetNumberOfElements(arr); j++) {
                table = check_shared(mxGetFieldByNumber(arr, j, i), std::move(table));
            }
        }
        break;
    case mxVOID_CLASS:
    case mxFUNCTION_CLASS:
    case mxUNKNOWN_CLASS:
        return table;
    default:
        break;
    }
    if (!mxIsEmpty(arr)) {
        // Shared arrays have the same data pointer, so they fill one entry.
        void* data = mxGetData(arr);
        table[data] = arr;
    }
    return table;
}

// Sums the data size in bytes over all distinct data pointers.
std::uint64_t actual_size(const TableType& table)
{
    std::uint64_t sz = 0;
    for (const auto& entry : table) {
        const mxArray* arr = entry.second;
        sz += mxGetElementSize(arr) * mxGetNumberOfElements(arr);
    }
    return sz;
}

void mexFunction(int /*nlhs*/, mxArray* plhs[],
                 int nrhs, const mxArray* prhs[])
{
    if (nrhs < 1) mexErrMsgTxt("One input required.");
    TableType table = check_shared(prhs[0]);
    plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
    std::uint64_t* result = static_cast<std::uint64_t*>(mxGetData(plhs[0]));
    result[0] = actual_size(table);
}
(*) Basic data types such as cell, struct and numeric arrays are supported. For unknown data structures and classdef objects the function returns zero.
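The deduplication idea can also be tried without MATLAB. The following is a minimal standalone sketch, assuming each cell is modelled as a std::shared_ptr to a std::vector<double>; the names Buffer and unique_bytes are hypothetical and only mirror what check_shared and actual_size do for a flat cell array.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a numeric array whose buffer may be shared.
using Buffer = std::shared_ptr<std::vector<double>>;

// Sums the bytes of all distinct, non-empty buffers referenced by cells.
std::uint64_t unique_bytes(const std::vector<Buffer>& cells) {
    // Key: data address. Value: size in bytes of the buffer at that address.
    std::unordered_map<const void*, std::uint64_t> seen;
    for (const Buffer& b : cells) {
        if (!b || b->empty()) continue;  // like an unassigned or empty cell
        seen[b->data()] = b->size() * sizeof(double);
    }
    std::uint64_t total = 0;
    for (const auto& entry : seen) total += entry.second;
    return total;
}
```

With three distinct 500x500 double buffers placed in four cells, two of which hold the same pointer, unique_bytes reports 6000000 bytes. If the fourth cell holds a deep copy instead, it reports 8000000, which matches the 6 MB versus 8 MB distinction in the question.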