I was assigned to extend a component of a piece of software written by someone else. It's written for Android, entirely in Java (it has no native/C++ components that I know of).
When getting familiar with the code, I came across a method (a drawing method for a rendering class). The method involves a big loop that updates objects (and then another method will render them later). The creator of the method seemed to cache all/most member variables and arrays and other objects' fields into local variables before the loop. The code looked like this:
float[] coordArr = mCoordArr;
float[] texCoordArr = mTexCoordArr;
float[] cArray = mColArray;
// ... there are further locals too, I didn't copy all here
float[] color = mColor;
float r = color[0];
float g = color[1];
float b = color[2];
float a = color[3];
int texw = mTexW;
int texH = mTexH;
Font font = mFont;
float[] ccords = font.ccords;
float cf = font.cf;
float cu = font.cu;
int len = mCurLength;
// Update the objects
for (int i = 0; i < len; ++i) {
    // A quite big loop body
    // ... all locals are accessed from the loop
}
The rendering component is single-threaded, and all of its member variables are only accessed from that one thread.
I checked the method with a Java/Dalvik disassembler, and the bytecode comment says it uses 41 registers. I assume the author cached the values in locals to help the JIT and to save some time on field/array accesses, but doesn't such a high number of locals hurt performance? I have heard of "register pressure", for example.
I just don't want to rewrite the code if not necessary (i.e. if the current code is OK), and in order to profile it, I would need to rewrite it (otherwise there is only one version -- the current one, so nothing to compare it with...).
If using "too" many locals is discouraged, is there some "optimal" maximum that shouldn't be exceeded? (I know that the system's stack size is a hard limit, of course.) If that's the case, I might need to revise other parts of the software too (if the original author was kind enough to put everything into locals).
While lots of local variables are likely to result in "register pressure", that simply means that the compiler is likely to generate more memory fetches, because locals that don't fit in registers are spilled to the stack. However, the alternative is (for example) to replace references to r with color[0], which in theory involves an index check and an indirect fetch, and that may result in more memory fetches than can be ascribed to register shortage.
In short, there is no simple answer.
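To make the trade-off concrete, here is a minimal sketch of the two styles side by side. The class, the method names and the field values are made up for illustration (only the m-prefixed field names echo the question), and the loop body is far smaller than the real one. It only shows that both variants compute the same result; it says nothing about which is faster on a given device.

```java
import java.util.Arrays;

public class LocalCachingSketch {
    // Hypothetical stand-ins for the renderer's member variables.
    private final float[] mColor = {0.25f, 0.5f, 0.75f, 1.0f};
    private final float[] mColArray = new float[4 * 1024];
    private final int mCurLength = 1024;

    /** Reads the fields and the mColor elements again on every iteration. */
    public float[] fillUsingFields() {
        for (int i = 0; i < mCurLength; ++i) {
            mColArray[4 * i]     = mColor[0]; // field load + bounds check + array load
            mColArray[4 * i + 1] = mColor[1];
            mColArray[4 * i + 2] = mColor[2];
            mColArray[4 * i + 3] = mColor[3];
        }
        return mColArray;
    }

    /** Copies fields and array elements into locals once, as in the question. */
    public float[] fillUsingLocals() {
        final float[] cArray = mColArray;
        final float[] color = mColor;
        final float r = color[0];
        final float g = color[1];
        final float b = color[2];
        final float a = color[3];
        final int len = mCurLength;
        for (int i = 0; i < len; ++i) {
            cArray[4 * i]     = r; // only locals are read inside the loop
            cArray[4 * i + 1] = g;
            cArray[4 * i + 2] = b;
            cArray[4 * i + 3] = a;
        }
        return cArray;
    }

    public static void main(String[] args) {
        float[] viaFields = new LocalCachingSketch().fillUsingFields();
        float[] viaLocals = new LocalCachingSketch().fillUsingLocals();
        System.out.println(Arrays.equals(viaFields, viaLocals)); // prints true
    }
}
```

Whether the second form actually wins depends on what the JIT does with each version, which is why measuring is the only reliable way to decide.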
So, I'd be inclined to leave the code alone, especially if there is evidence that the original / previous author(s) arrived at the current design as a result of profiling, or if the code already runs fast enough.
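If no such evidence exists and a comparison is really needed, a rough timing harness could look like the sketch below. RoughTimer and bestOfNanos are hypothetical names, the two tasks are toy loops rather than the real drawing method, and the numbers it prints on a desktop JVM say little about Dalvik or ART, so treat it as a starting point to run on the target device, not as a proper benchmark.

```java
public class RoughTimer {
    /**
     * Runs the task `warmups` times untimed (to give the JIT a chance to
     * compile it), then returns the fastest of `runs` timed executions in
     * nanoseconds. Returns Long.MAX_VALUE if `runs` is zero.
     */
    public static long bestOfNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; ++i) {
            task.run();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; ++i) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            if (elapsed < best) {
                best = elapsed;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        final float[] out = new float[1 << 16];
        final float[] color = {0.25f, 0.5f, 0.75f, 1.0f};

        // Variant 1: index into the color array on every iteration.
        Runnable indexed = () -> {
            for (int i = 0; i < out.length; ++i) {
                out[i] = color[i & 3];
            }
        };

        // Variant 2: cache the four components in locals before the loop.
        Runnable cached = () -> {
            float r = color[0], g = color[1], b = color[2], a = color[3];
            for (int i = 0; i + 3 < out.length; i += 4) {
                out[i] = r;
                out[i + 1] = g;
                out[i + 2] = b;
                out[i + 3] = a;
            }
        };

        System.out.println("indexed: " + bestOfNanos(indexed, 50, 200) + " ns");
        System.out.println("cached:  " + bestOfNanos(cached, 50, 200) + " ns");
    }
}
```

Taking the minimum over several runs is one simple way to reduce noise from garbage collection and scheduling; it is a design choice here, not the only reasonable one.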