
What is the fastest way to load data in Matlab

I have a large amount of data (>800 MB) that takes an age to load into MATLAB, mainly because it's split up into tiny files, each <20 kB. They are all in a proprietary format which I can read and load into MATLAB; it's just that it takes so long.

I am thinking of reading the data in and writing it out to some sort of binary file which should make it quicker for subsequent reads (of which there may be many, hence me needing a speed-up).

So, my question is, what would be the best format to write them to disk to make reading them back again as quick as possible?

I guess I have the option of writing using fwrite, or just saving the variables from MATLAB. I think I'd prefer the fwrite option so that, if needed, I could read them from another package/language.

mor22 asked Jan 27 '11 09:01



2 Answers

Look into the HDF5 data format, which recent versions of MATLAB use as the underlying format for .mat files saved with the -v7.3 option. You can manually create your own HDF5 files using the hdf5write function, and such a file can be accessed from any language that has HDF5 bindings (most common languages do, or at least offer a way to integrate C code that can call the HDF5 library).
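As a rough sketch of the HDF5 route: the example below uses h5create/h5write/h5read, the newer interface that MATLAB offers alongside hdf5write, so it assumes a release that ships those functions. The file name, the dataset name '/samples', and the random stand-in data are all hypothetical and only there to make the round trip runnable.

```matlab
% Hypothetical file and dataset names, chosen only for illustration.
h5file = fullfile(tempdir, 'combined_example.h5');
if exist(h5file, 'file')
    delete(h5file);                         % h5create errors if the dataset already exists
end

data = rand(1000, 3);                       % stand-in for the parsed proprietary data

h5create(h5file, '/samples', size(data));   % datatype defaults to double
h5write(h5file, '/samples', data);          % write the whole array in one call

restored = h5read(h5file, '/samples');      % read the whole dataset back
```

The same file should then be readable from other environments that have HDF5 bindings, which is the portability the question asks about.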

If your data is numeric (and of the same datatype), you might find it hard to beat the performance of plain binary (fwrite).
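A minimal sketch of the plain-binary approach, assuming the data fits in a single double matrix. The two-value uint32 header (rows, columns) is a layout invented for this example, not any standard format, and the file name is hypothetical.

```matlab
binfile = fullfile(tempdir, 'combined_example.bin');   % hypothetical name
data = rand(1000, 3);                                   % stand-in for the real data

% Write: fix the byte order explicitly so other tools can read the file.
fid = fopen(binfile, 'w', 'ieee-le');
assert(fid ~= -1, 'Could not open %s for writing', binfile);
fwrite(fid, size(data), 'uint32');    % small header: number of rows, number of columns
fwrite(fid, data, 'double');          % payload, stored in column-major order
fclose(fid);

% Read: recover the dimensions first, then the payload in one call.
fid = fopen(binfile, 'r', 'ieee-le');
assert(fid ~= -1, 'Could not open %s for reading', binfile);
dims = fread(fid, [1 2], 'uint32');               % fread returns these as doubles
restored = fread(fid, dims, 'double=>double');    % fills the matrix column by column
fclose(fid);
```

Any reader in another language needs to know this layout: two little-endian 32-bit unsigned integers followed by rows*columns little-endian doubles in column-major order.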

user57368 answered Oct 04 '22 11:10



Binary mat-files are the fastest. Just use

save myfile.mat <var_a> <var_b> ...
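For use inside a script, the same idea can be written with the function form of save and load. This is only a sketch: the file name and the contents of var_a and var_b are made up for illustration.

```matlab
matfile = fullfile(tempdir, 'combined_example.mat');   % hypothetical name
var_a = rand(1000, 3);                                  % stand-in data
var_b = int16(1:10);

save(matfile, 'var_a', 'var_b');     % function form of: save myfile.mat var_a var_b

loaded = load(matfile, 'var_a');     % load only the variable that is needed
restored = loaded.var_a;             % load returns a struct with one field per variable
```

How fast this is may depend on the MAT-file version flag passed to save (for example -v6 writes uncompressed data, while -v7.3 writes an HDF5-based file), so it is worth timing the options on your own data.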
nimrodm answered Oct 04 '22 10:10
