Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MATLAB function is slow the first time, but much faster subsequently. Why?

I have a large MATLAB function file. It first creates a zero matrix, then updates about 70% of the cells by evaluating a number of corresponding (long) algebraic expressions that are hard coded in the function. Once this is complete, a numeric matrix is returned.

The .m file is about 4 MB large (I have 100 of these m. files, but that is not directly relevant). When I evaluate the function the first time, it takes about 9 seconds to evaluate. Subsequent runs, however, only take about 0.1 second, which is more what I was expecting.

Why does the first evaluation take 9 seconds? Anytime I close and reopen MATLAB, I each time have this slow first evaluation, with subsequent runs being much faster. Why is this?

The m. file can be found at the below public link (you can copy the text from the browser): https://dl.dropboxusercontent.com/u/157153767/K_a_12_102x.m

The command window input you should use is: [test]=K_a_12_102x(414000000,1.1095e+09,1.2500e-04,0.0840,0.0840,0.0240,0.0240,0.0020,0.0020,0,0,0,0,3.0397e+08,8.9930e+07,0,3.0397e+08,0,1.0702e+08,0,0,0,0,0,0,497.7389,80.7355,-15.9811,391.1985,-15.9811,103.5248,20440000,0,20440000,0.06)

like image 208
Kobs Avatar asked Oct 21 '13 15:10

Kobs


3 Answers

Slow first runs have been the case long before the JIT compiler was introduced into MATLAB, and true even for the case of MEX files, on which the JIT compiler is not applied. When you run code for the first time, MATLAB has to load it from disk, parse the code (see runtime type analysis details below), and apply JIT compilation if it is a .m file. Then on execution, space for data is allocated, and instructions are loaded into CPU cache, where they might stay with very fast access times for further executions. This is the reason for ubiquitous "cache warming" procedures outside the world of MATLAB, as I understand it (apologies to hardware buffs for my hand-waving). However, with .m fies, disk access is probably a big factor, even with files that are much smaller than "about 4MB large" as in your case. There's also an added step of function disambiguation when multiple functions have the same name.

To see this happening for a MEX file, simply run clear mex and time a function call. MATLAB has to load it from disk and into memory all over again, probably with an invalidated CPU cache.

Run-time Type Analysis

A second aspect of the code acceleration features (JIT code generation is the first), is run-time type analysis. From an old MathWork whitepaper:

Run-time type analysis is based on the following premise: If a line of M-code has been processed before, it is very likely that the variables have the same types and shapes that they had the last time the system saw this line. The first time that a line of code is executed, the system examines the variables and generates specific code for the data types and shapes that were found. Subsequent executions of the line can reuse this code as long as the system verifies that the variable types and sizes have not changed. Since the types rarely change, subsequent executions run as quickly as possible. If the types do change,the code is regenerated.

You might consider this part of the JIT compilation process, and it is. But the point is that this analysis is run on first execution, regardless of whether the accelerator decides to JIT-compile any lines of code. BTW, the whole file doesn't get compiled into machine code. It used to be possible to see which lines got accelerated in the profiler using setpref('profiler','showJitLines',1); but that was unfortunately removed as a feature.

Anyway, after actually looking at your code, there are a staggering number of constants and variables that need to be parsed after loading the file from disk. One line is over 31,000 characters long with several thousand numeric literals! That's a lot to analyze and decide what needs compilation and what can be cached between runs. As if to demonstrate this point, just viewing (not running) your code managed to crash the editor with DirectUI::DUIXmlParser::InitializeParserFromXmlLiteReader on the stack trace. Yikes, that's some nasty code!

Does the JIT compiler generate code for this function?

Let's time the code with MATLAB's acceleration features turned on. We also run control test that we know to run about 8x slower without acceleration.

>> feature accel on
>> clear K_a_12_102x
>> x = rand(1000); tic,for t=1:1e6, x=x; end,toc % control
Elapsed time is 0.083878 seconds.
% do first-run of K_a_12_102x, took 13.280327 sec
>> tic; [test]=K_a_12_102x(414000000,1.1095e+09,1.2500e-04,0.0840,0.0840,0.0240,0.0240,0.0020,0.0020,0,0,0,0,3.0397e+08,8.9930e+07,0,3.0397e+08,0,1.0702e+08,0,0,0,0,0,0,497.7389,80.7355,-15.9811,391.1985,-15.9811,103.5248,20440000,0,20440000,0.06); toc
Elapsed time is 0.151804 seconds.

Now we turn off acceleration and run the same tests:

>> feature accel off
>> clear K_a_12_102x
>> tic,for t=1:1e6, x=x; end,toc % control
Elapsed time is 0.630039 seconds.
% do a first-run of K_a_12_102x, took 15.634775 seconds
>> tic; [test]=K_a_12_102x(414000000,1.1095e+09,1.2500e-04,0.0840,0.0840,0.0240,0.0240,0.0020,0.0020,0,0,0,0,3.0397e+08,8.9930e+07,0,3.0397e+08,0,1.0702e+08,0,0,0,0,0,0,497.7389,80.7355,-15.9811,391.1985,-15.9811,103.5248,20440000,0,20440000,0.06); toc
Elapsed time is 0.159683 seconds.

Findings

The findings are two-fold:

  1. First-run time is NOT improved with acceleration (JIT) disabled (13.28 sec with JIT on vs. 15.63 sec with JIT off).
  2. Subsequent runs show that no machine code is generated when JIT is enabled (0.1518 sec with JIT ON vs. 0.1597 sec with JIT off)

In short, your code does not benefit from JIT acceleration, and JIT execution/analysis does NOT add to the first-run execution time.

The question remains, what causes the slow first-run time? Some possibilities: loading of the code text from disk, parsing (stripping comments and whitespace) the code before saving it in RAM, not reusing variable initialization saved from previous runs, maybe core MATLAB instructions used by the function saved in CPU cache, and any non-JIT related code analysis necessary for MATLAB to do runtime syntax checking. The fact that the file is 4MB and incredibly complex in terms of equation length and the sheer number of numeric literals suggests that its not CPU cache, but initial file loading and code analysis.

like image 93
chappjc Avatar answered Sep 25 '22 08:09

chappjc


I think that's the JIT-compilation. The first time you execute the file, MATLAB has to interpret it (translating text to machine code).

Subsequent runs use the cached machine code and execute much faster. To verify this: After a small change in the code, MATLAB will need to redo this compilation - so it should be slow for the next run again. (I did this with exactly that consequence.) You are doing a significant amount of simple operations, that should execute pretty quick. The conversion to machine code is slowing you down.

To speed the whole thing up: Transfer the code to C, C# or something similar and include it as a DLL file. You will have constant and fast calculations, but you cannot change them as easily.

(Using a C# DLL file, you've got some JIT compilation as well, but it's less often and probably still faster than MATLAB.)


I've done some coding and ported the code to C#. Original timings are 13.4 s and 0.15 s (first/second run)

Simple port, Release configuration:

>> test_1
Elapsed time is 124.7 seconds.
>> test_1
Elapsed time is 0.0297 seconds.

So the first run is much worse than MATLAB - bummer. While I love C# as it is, it might not be the best tool for this job. (BTW: I had to use the command-line compiler because Visual Studio 2010 kept crashing...)

like image 32
DasKrümelmonster Avatar answered Sep 22 '22 08:09

DasKrümelmonster


Additionally to the effect of the JIT compiler mentioned by TheCrumbMonster, there might be various caching effects going on. Either Matlab itself is smart enough to reuse some of its data structures, or it has some of its code already stored in cache memory of the processor instead of in the main memory. In fact even the JIT itself relies on caching the result of the compilation, otherwise you would have to recompile every time you call a function. Also all modern operating systems do various kinds of caching, so instead of reading some data file, MEX file or DLL from disk, you simply get it from memory.

This is one of the reasons that for accurately measuring the execution speed of some function, you should not use simple tic(); toc() statements, but use a function like timeit (use it, it is excellent!). This repeats a measurement several times to 'warm up' the cache and always throws away the first few measurements.

As to the reason why matlab is slow for this particular file, I can totally understand it. It takes my text editor more than a minute to open it, and you have about 100 lines with extremely long statements of the form

K_a_12=zeros(1089,100);
K_a_12(1011,1) = 2*h_a*((x*(250*G_a*L2^20*W + 5250*G_a*L1^2*L2^18*W + ...
K_a_12(1011,3) = 2*h_a*((x^13*(188955000*G_a*L2^8*W*h_1 +  ...

which were (I hope) generated automatically. It seems to me that the calculation can be done much more efficiently. First of all, you seem to be filling only a very small part of your matrix, so you should probably use a sparse matrix. Next, from quick inspection, every term seems of the form h_a * x^n1 * const * L1^n2 * L2^n3 * ...., which might be calculated in a simplified way. I would bet that the whole calculation can be done in a few lines by multiplying and exponentiation of just a few matrices, which should be saved as mat-files, not like a calculation written out in full. Finally, you also do not use about half the input arguments to the function.

like image 42
Bas Swinckels Avatar answered Sep 26 '22 08:09

Bas Swinckels