
Efficient indexing of structures in MATLAB

Up until recently, I have been storing time series data in structs in MATLAB by placing the index after the field name, e.g.:

Structure.fieldA(1) = 23423

So, the struct has a set of fields, and each field is a vector.

I've seen a lot of other programs use a different format, where the structure itself is indexed, and each index of the structure contains a set of fields, e.g.:

Structure(1).fieldA

Which method is more efficient? Should I stick with the first layout, or should I switch my programs over to the second?

CaptainProg asked Jan 29 '17 16:01




1 Answer

A struct where each field is an array is more performant because it holds fewer data elements (one array per field), whereas a struct array offers more flexibility at the cost of performance and memory usage (one element per struct per field).

From MATLAB's own documentation:

Structures require a similar amount of overhead per field. Structures with many fields and small contents have a large overhead and should be avoided. A large array of structures with numeric scalar fields requires much more memory than a structure with fields containing large numeric arrays.

We can check the memory usage with a simple example:

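% S is a 1x2 struct array: each element holds one scalar per field
S = struct('field1', {1, 2}, 'field2', {3, 4});
% SArray is a single (1x1) struct whose fields each hold a whole array
SArray = struct('field1', {[1,2]}, 'field2', {[3,4]});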

whos S*
%  Name        Size            Bytes  Class     Attributes
%
%  S           1x2               608  struct
%  SArray      1x1               384  struct
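
The same comparison can be repeated at a larger scale. This is only a rough sketch: the names `soa`, `aos`, `fieldA`, `fieldB` and the sample count `N` are made up for illustration, and the exact byte counts will vary between MATLAB versions, so none are quoted here.

```matlab
% Hypothetical example data: N samples of two scalar channels.
N = 1e4;
a = rand(1, N);
b = rand(1, N);

% Struct of arrays: one 1x1 struct, each field holds a 1xN vector.
soa = struct('fieldA', {a}, 'fieldB', {b});

% Array of structs: a 1xN struct array, each element holds two scalars.
aos = struct('fieldA', num2cell(a), 'fieldB', num2cell(b));

% whos returns a struct describing the variable, including its size in bytes.
soaInfo = whos('soa');
aosInfo = whos('aos');
fprintf('struct of arrays: %d bytes\n', soaInfo.bytes);
fprintf('array of structs: %d bytes\n', aosInfo.bytes);
```

The array of structs should report the larger number, since every scalar stored in it carries its own per-element overhead.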

Some of the flexibility afforded by a struct array includes being able to easily grab a subset of the data:

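% Struct array: index the array itself (here, both elements)
subset = S(1:2);

% Compared to a struct of arrays, where each field is indexed separately
subset2.field1 = SArray.field1(1:2);
subset2.field2 = SArray.field2(1:2);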

A struct array can also store data of different sizes that may not easily fit into a single array:

S(1).field1 = [1,2];
S(2).field1 = 3;
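
When the contents differ in size like this, the values can still be gathered back out of the struct array. A minimal sketch, rebuilding the same two-element example from scratch (the variable names `allValues`, `perElement` and `counts` are only illustrative):

```matlab
% 1x2 struct array whose elements hold differently sized contents.
S = struct('field1', {[1, 2], 3});

% S.field1 expands to a comma-separated list, so square brackets
% concatenate the contents horizontally.
allValues = [S.field1];      % 1x3 row vector containing 1, 2, 3

% Braces keep each element's contents separate instead.
perElement = {S.field1};     % 1x2 cell array: {[1 2], 3}

% Number of values stored in each element.
counts = cellfun(@numel, perElement);
```

Note that the bracket form only works when the contents can be concatenated horizontally; the cell form is the safer choice for arbitrary shapes.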

Which solution is better really depends on the data and how you're using it. If you have a large amount of data, the first option (a struct of arrays) is likely to be preferable due to its smaller memory footprint.

If your code is working for you, I wouldn't worry about converting it just for the sake of using a different convention unless you're having issues with performance (in which case use a struct of arrays) or difficulty accessing/modifying the data (in which case use an array of structs).
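
If a conversion does turn out to be needed, it does not have to be done by hand. The following is one possible sketch, not the only way to do it; it assumes every field is a row vector of the same length, and the field values and the names `StructArray` and `Back` are invented for the example.

```matlab
% Struct of arrays, as in the question (hypothetical values).
Structure.fieldA = [23423, 10, 20];
Structure.fieldB = [1.5, 2.5, 3.5];

% Convert to an array of structs: one element per sample.
names  = fieldnames(Structure);
values = cellfun(@(n) num2cell(Structure.(n)), names, 'UniformOutput', false);
args   = [names, values]';        % 2xK cell: names in row 1, value cells in row 2
StructArray = struct(args{:});    % 1x3 struct array

% Convert back: concatenate each field across the struct array.
Back = struct();
for k = 1:numel(names)
    Back.(names{k}) = [StructArray.(names{k})];
end
```

Here `StructArray(2).fieldA` holds the same value as `Structure.fieldA(2)`, and `Back` ends up equal to the original `Structure`.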

Suever answered Oct 02 '22 08:10