
Is it possible to return a cell array that contains one instance in several cells?

I am writing a MEX function that has to return a huge array of strings.

I do this as follows:

  mxArray * array = mxCreateCellMatrix(ARRAY_LEN, 1);
  for (size_t k = 0; k < ARRAY_LEN; ++ k) {
      mxArray *str = mxCreateString("Hello");
      mxSetCell(array, k, str);
  }
  plhs[0] = array;

However, since the string always has the same value, I would like to create only one instance of it, like this:

  mxArray * array = mxCreateCellMatrix(ARRAY_LEN, 1);
  mxArray *str = mxCreateString("Hello");

  for (size_t k = 0; k < ARRAY_LEN; ++ k) {
      mxSetCell(array, k, str);
  }
  plhs[0] = array;

Is this possible? How does the garbage collector know when to release it? Thank you.

user1626803 asked Sep 17 '13 10:09




2 Answers

The second snippet you suggested is not safe and should not be used, as it could crash MATLAB. Instead, you should write:

mxArray *arr = mxCreateCellMatrix(len, 1);
mxArray *str = mxCreateString("Hello");
for(mwIndex i=0; i<len; i++) {
    mxSetCell(arr, i, mxDuplicateArray(str));
}
mxDestroyArray(str);
plhs[0] = arr;

This is unfortunately not the most efficient use of memory storage. Imagine that instead of using a tiny string, we were storing a very large matrix (duplicated along the cells).
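To put a number on that cost, here is a rough back-of-the-envelope sketch. The helper names `matrixBytes` and `duplicatedBytes` are made up for illustration, the sizes are just an example (a 5000x5000 double matrix repeated in 10,000 cells), and per-array header overhead is ignored:

```cpp
#include <cassert>
#include <cstdint>

// Payload bytes of one rows-by-cols matrix of doubles.
// Hypothetical helper; header overhead is deliberately ignored.
inline std::uint64_t matrixBytes(std::uint64_t rows, std::uint64_t cols) {
    assert(rows > 0 && cols > 0);
    return rows * cols * sizeof(double);
}

// Total payload when each of `cells` cells holds its own deep copy,
// which is what duplicating the array once per cell amounts to.
inline std::uint64_t duplicatedBytes(std::uint64_t cells,
                                     std::uint64_t rows,
                                     std::uint64_t cols) {
    return cells * matrixBytes(rows, cols);
}
```

With 8-byte doubles, `matrixBytes(5000, 5000)` is 200,000,000 bytes (about 200 MB), so `duplicatedBytes(10000, 5000, 5000)` comes to roughly 2 TB, whereas a shared copy would stay near 200 MB.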


Now, it is possible to do what you initially wanted, but you'll have to resort to undocumented hacks (like creating shared data copies or manually incrementing the reference count in the mxArray_tag structure).

In fact this is what usually happens behind the scenes in MATLAB. Take this for example:

>> c = cell(100,100);
>> c(:) = {rand(5000)};

As you know, a cell array in MATLAB is basically an mxArray whose data pointer points to an array of other mxArray variables.

In the case above, MATLAB first creates an mxArray corresponding to the 5000x5000 matrix. This will be stored in the first cell c{1}.

For the rest of the cells, MATLAB creates "lightweight" mxArrays that share their data with the first cell element, i.e. their data pointers point to the same block of memory holding the huge matrix.

So there is only one copy of the matrix at all times, unless of course you modify one of them (c{2,2}(1)=99), at which point MATLAB has to "unlink" the array and make a separate copy for this cell element.

You see, internally each mxArray structure has a reference counter and a cross-link pointer to make this data sharing possible.
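The share-then-unlink behavior described above can be mimicked in a few lines of standalone C++. This is only a toy model, not MATLAB's actual implementation: the class `SharedArray` and its members are invented for illustration, and `std::shared_ptr` stands in for the internal reference counter.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for an array header; several headers may share one payload.
class SharedArray {
public:
    explicit SharedArray(std::size_t n, double fill = 0.0)
        : data_(std::make_shared<std::vector<double> >(n, fill)) {}

    // The implicit copy constructor copies only the pointer, so a copy is
    // "lightweight": the payload is shared and its use count goes up.

    double get(std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    // A write "unlinks" first: if the payload is shared, take a private copy.
    void set(std::size_t i, double v) {
        assert(i < data_->size());
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<double> >(*data_);
        }
        (*data_)[i] = v;
    }

    bool sharesDataWith(const SharedArray& other) const {
        return data_ == other.data_;
    }

    long owners() const { return data_.use_count(); }

private:
    std::shared_ptr<std::vector<double> > data_;
};
```

Filling a `std::vector<SharedArray>` with copies of one `SharedArray` leaves a single payload in memory; calling `set` on one element gives just that element its own copy and leaves the others shared.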

Hint: You can study this data-sharing behavior by turning on the format debug option and comparing the pr pointer addresses of the various cells.

The same concept holds true for structure fields, so when we write:

x = rand(5000);
s = struct('a',x, 'b',x, 'c',x);

all the fields would point to the same copy of the data in x.


EDIT:

I forgot to show the undocumented solution I mentioned :)

mex_test.cpp

#include "mex.h"

extern "C" mxArray* mxCreateReference(mxArray*);

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize len = 10;
    mxArray *arr = mxCreateCellMatrix(len, 1);
    mxArray *str = mxCreateString("Hello");
    for(mwIndex i=0; i<len; i++) {
        // I simply replaced the call to mxDuplicateArray here
        mxSetCell(arr, i, mxCreateReference(str));
    }
    mxDestroyArray(str);
    plhs[0] = arr;
}

MATLAB

>> %c = repmat({'Hello'}, 10, 1);
>> c = mex_test()
>> c{1} = 'bye'
>> clear c

The mxCreateReference function will increment the internal reference counter of the str array each time it is called, thus letting MATLAB know that there are other copies of it.

So when you clear the resulting cell array, it will in turn decrement this counter once for each cell, until the counter reaches 0, at which point it is safe to destroy the array in question.

Using the array directly (mxSetCell(arr, i, str)) is problematic because the ref-counter immediately reaches zero after destroying the first cell. Thus for subsequent cells, MATLAB will attempt to free arrays that have already been freed, resulting in memory corruption.
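The bookkeeping behind both cases can be traced with a small standalone simulation. It is a sketch under assumptions, not MATLAB's real code: `ToyArray`, `addRef`, `release` and `simulate` are hypothetical names, and the model assumes (as the explanation above implies) that each destroy call drops exactly one reference. Instead of actually freeing memory twice, it counts how many releases hit an already-freed array.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a reference-counted array header (not MATLAB's layout).
struct ToyArray {
    int refCount;      // 1 when the array is created
    bool freed;        // set once the count has dropped to zero
    int invalidFrees;  // releases attempted after the array was freed
};

inline ToyArray makeToyArray() {
    ToyArray a = {1, false, 0};
    return a;
}

// Models taking one extra reference to the array.
inline void addRef(ToyArray& a) {
    assert(!a.freed);
    ++a.refCount;
}

// Models one destroy call: drop a reference, free at zero.
inline void release(ToyArray& a) {
    if (a.freed) {         // in real code this would be a double free
        ++a.invalidFrees;
        return;
    }
    if (--a.refCount == 0) {
        a.freed = true;
    }
}

// Store one array in `cells` cells, optionally bumping the count per cell,
// then clear the cell array. Returns the number of invalid frees.
inline int simulate(std::size_t cells, bool bumpPerCell) {
    ToyArray str = makeToyArray();
    if (bumpPerCell) {
        for (std::size_t i = 0; i < cells; ++i) addRef(str);
        release(str);  // the creator gives up its own reference
    }
    for (std::size_t i = 0; i < cells; ++i) release(str);  // clear each cell
    return str.invalidFrees;
}
```

In this model `simulate(10, true)` reports no invalid frees, while `simulate(10, false)` reports nine: the first cell frees the array and the remaining nine try to free it again.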

Amro answered Sep 18 '22 04:09



Bad news ... as of R2014a (possibly R2013b but I can't check) mxCreateReference is no longer available in the library (either missing or not exported), so the link will fail. Here is a replacement function you can use that hacks into the mxArray and bumps up the reference count manually:

struct mxArray_Tag_Partial {
    void *name_or_CrossLinkReverse;
    mxClassID ClassID;
    int VariableType;
    mxArray *CrossLink;
    size_t ndim;
    unsigned int RefCount; /* Number of sub-elements identical to this one */
};

mxArray *mxCreateReference(const mxArray *mx)
{
    struct mxArray_Tag_Partial *my = (struct mxArray_Tag_Partial *) mx;
    ++my->RefCount;
    return (mxArray *) mx;
}
James Tursa answered Sep 18 '22 04:09
