How to identify places in MATLAB where data is stored outside the bounds of an array?

I am trying to use MATLAB Coder to convert MATLAB code to a MEX file. Suppose I have a code snippet of the following form:

x = zeros(a,1)
x(a+1) = 1

then in MATLAB this resizes the array to accommodate the new element, while the MEX file raises an "Index exceeds matrix dimensions" error. I expect this happens in many places in the code.

What I want to do is run the MATLAB version of the code (without using the coder) but have MATLAB generate an error or warning whenever it resizes an array because I assign to something outside the bounds. (I could just use the MEX file and see where the errors pop up, but this requires rebuilding the whole MEX file using MATLAB Coder every time I change the code at all, which takes a while.)

Is there a way to do this? Is there any kind of setting in MATLAB that will turn off the "automatically resize if you assign to an out-of-bounds index", or give a warning if this happens?

Alex319 asked Mar 19 '15 17:03


1 Answer

EDIT: As of MATLAB R2015b, MATLAB Coder offers run-time error checking as an option for standalone code (from the MATLAB release notes):

In R2015b, generated standalone libraries and executables can detect and report run-time errors such as out-of-bounds array indexing. In previous releases, only generated MEX detected and reported run-time errors.

By default, run-time error detection is enabled for MEX. By default, run-time error detection is disabled for standalone libraries and executables.

To enable run-time error detection for standalone libraries and executables:

At the command line, use the code configuration property RuntimeChecks.

cfg = coder.config('lib'); % or 'dll' or 'exe'

cfg.RuntimeChecks = true;

codegen -config cfg myfunction

Using the MATLAB Coder app, in the project build settings, on the Debugging tab, select the Generate run-time error checks check box.

The generated libraries and executables use fprintf to write error messages to stderr and abort to terminate the application. If fprintf and abort are not available, you must provide them. Error messages are in English.

See Run-Time Error Detection and Reporting in Standalone C/C++ Code and Generate Standalone Code That Detects and Reports Run-Time Errors.

Original answer: The suggestion in the comments, declaring a class subclassed from double whose subsasgn method (the one MATLAB calls for indexed assignment) is overloaded to disallow growth, would be one good way to do it.
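A minimal sketch of that subclass idea, under some assumptions: the class name FixedDouble and the error identifier are made up for illustration, the class lives in its own file FixedDouble.m, and you are willing to wrap the arrays you care about while testing in MATLAB. Indexed assignment such as x(i) = v is routed through subsasgn, so that is the method overloaded here.

```matlab
classdef FixedDouble < double
    % FixedDouble: hypothetical wrapper that refuses to change size on assignment.
    methods
        function obj = FixedDouble(data)
            if nargin == 0
                data = [];
            end
            obj = obj@double(data);   % store the values in the double superclass
        end

        function obj = subsasgn(obj, s, val)
            before = size(obj);
            % Do the assignment on a plain double copy using the built-in rule.
            data = builtin('subsasgn', double(obj), s, double(val));
            % Any size change (growth, or deletion via x(i) = []) is reported.
            if ~isequal(size(data), before)
                error('FixedDouble:resized', ...
                    'Assignment would resize the array from [%s] to [%s].', ...
                    num2str(before), num2str(size(data)));
            end
            obj = FixedDouble(data);
        end
    end
end
```

With this on the path, x = FixedDouble(zeros(a,1)); x(a+1) = 1; should raise FixedDouble:resized at the offending line. Note that arithmetic on a subclass of double generally returns a plain double, so results need to be re-wrapped if you want them checked too.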

Another easy way to do it is to sprinkle assert commands throughout the code (in each loop iteration or at the bottom of a function) to assert that the size has not increased over the allocated size.

For instance, if you had the code in a format of:

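x = zeros(a,1);   % allocate an a-by-1 array
x(a+1) = 1;       % MATLAB silently grows x to (a+1)-by-1 here
% ... lots of other operations

if coder.target('MATLAB')   % true only when running in MATLAB, not in generated code
    assert(isequal(size(x), [a,1]), 'x has been indexed out of bounds')
end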

the assertion fails if any earlier assignment has extended the array.

To make this a bit tidier, you might even make a function that bounds checked all the variables you care about, again wrapping the coder.target if statement around it. Then you could sprinkle this throughout your code.
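One possible shape for such a helper, as a sketch only: the name assertSize, its argument order, and the error identifier are hypothetical, not part of MATLAB or MATLAB Coder. The whole check sits inside the coder.target test so that it runs only in MATLAB.

```matlab
function assertSize(x, expectedSize, name)
% assertSize  Hypothetical helper: error if x no longer has its allocated size.
%   x            the array to check
%   expectedSize the size it was allocated with, e.g. [a 1]
%   name         label used in the error message, e.g. 'x'
if coder.target('MATLAB')   % skip the check entirely in generated code
    assert(isequal(size(x), expectedSize), 'assertSize:sizeChanged', ...
        '%s has size [%s] but was allocated as [%s].', ...
        name, num2str(size(x)), num2str(expectedSize));
end
end
```

A call such as assertSize(x, [a 1], 'x') can then be placed at the bottom of a loop body or function for each array you want to guard.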

It's not as elegant as overloading the double class, but on the other hand it doesn't add any overhead to the compiled code at all. It also won't give you errors exactly when the overrun happens, but it will give you confidence that the code is working well in a variety of situations.

Another way that you can be more confident of assignments is to do your own bounds checking on the assignment in situations where this may be appropriate. A common problem I've seen in assignment goes something like this. We have an allocated array and are copying in data from another array with a vector assignment. For instance, consider the following situation:

t = char(zeros(5,7));              % Allocate a 5 by 7 char array
tempstring = 'hello to anyone';    % Here's a string we want to put into it.
t(1, 1:numel(tempstring)) = tempstring;  % A valid assignment in MATLAB

>> size(t)
ans =
5    15   

Uh oh, exactly what you're concerned about in the question has happened: the t array has been automatically resized during the assignment. That works in MATLAB, but in Coder-generated code it will cause a segfault or a MEX error. An alternative is to use the end keyword inside the index to keep the assignment within bounds (at the cost of truncating the data). If we change the assignment to:

t(1, 1:min(end, numel(tempstring))) = tempstring(1:min(size(t, 2), numel(tempstring)));

The size of t will remain unchanged, but the assignment will be truncated. Inside the index, end evaluates to the current size of that dimension, so the index can never run past the allocated bounds. In some situations this is a nice way of handling the issue and gives you confidence that the bounds will never be exceeded, but in others silent truncation is very undesirable (and it won't give you an error message in MATLAB).
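To make the trade-off concrete, here is a small sketch that reuses the t and tempstring example: first a truncating copy, then a variant that fails loudly instead. The error identifier demo:tooLong is made up for illustration.

```matlab
t = char(zeros(5,7));              % 5-by-7 char array, as in the example above
tempstring = 'hello to anyone';    % 15 characters, wider than one row of t

% Option 1: truncate. Copy only as many characters as fit in a row.
n = min(size(t, 2), numel(tempstring));
t(1, 1:n) = tempstring(1:n);       % t stays 5-by-7; row 1 now holds 'hello t'

% Option 2: refuse the assignment rather than truncate or grow.
try
    assert(numel(tempstring) <= size(t, 2), 'demo:tooLong', ...
        'String has %d characters but t has only %d columns.', ...
        numel(tempstring), size(t, 2));
    t(2, 1:numel(tempstring)) = tempstring;   % reached only if the string fits
catch err
    disp(err.message)              % report the problem; t is left untouched
end
```

Option 2 is closer to what the question asks for, since it surfaces the overrun in MATLAB at the line where it would have happened.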

Another helpful tool MATLAB provides is in the editor itself. If you use the %#codegen tag in your code, it will signal the editor's syntax checker to highlight various code generation problems, including places where you are obviously increasing the size of an array by indexing. This can't catch every situation, but it's a good help.

[Image: editor example with %#codegen]
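For reference, the tag goes on the function line. The function below is a hypothetical example, not code from the question, and whether the editor flags this exact line is an assumption on my part since it may depend on the MATLAB release:

```matlab
function x = growExample(a) %#codegen
% growExample  Hypothetical function used only to illustrate the %#codegen tag.
x = zeros(a, 1);   % allocate an a-by-1 array
x(end+1) = 1;      % grows x by indexing past its end; the kind of line the editor may flag
end
```

Running it in MATLAB still works and returns an (a+1)-by-1 array; the tag only affects what the editor's checker reports.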

One last note. As mentioned in the question, a MEX file generated by Coder will give you an "Index exceeds matrix dimensions" error right at the time of the assignment, exit gracefully, and even tell you the original line of code where the error occurred. A C library generated by Coder without run-time checks (the only option before R2015b, and still the default for standalone targets) has no such bounds checking and will typically segfault with no diagnostics. The intermediate answer is to do exactly what you're doing, which is to run the code as a MEX. That's not very helpful to your question (as you say, rebuilding the MEX can take time), but for those of us who code for the cold, cruel world of external C code, being able to run the MEX as an intermediate test to find these errors is a lifesaver.

The bottom line is that this is a divergence in behavior between MATLAB and Coder-generated C code, and it can be a source of significant problems. In my own code, I am very careful with array access and growth for exactly this reason. It's an area where I'd like to see improvement in the Coder tool itself, but there are ways to be very careful when writing MATLAB code targeted for Coder.

Tony answered Oct 09 '22 10:10