I have recently started reading about benchmarks and writing them for Android (in Java). I am aware of such problems as warmups, garbage collector and compiler optimizations but don't know if the problem I face could be caused by any of those.
In my benchmark app I create an array of 10,000 float variables and initialize it with random values. When running benchmark code:
private void runMinorBenchmarkFloat (float[] array) {
float sum = 0;
long startTime;
long endTime;
/* Fast warm-up */
startTime = System.nanoTime();
for(int i=0; i<SMALL_LOOP_ITERATION_COUNT; i++)
for(int j=0; j<TAB_SIZE; j++)
sum += array[j];
endTime = System.nanoTime() - startTime;
postMessage("Warm-up for FLOAT finished in: " + endTime/1000000 + "ms.\n");
/* Main benchmark loop */
startTime = System.nanoTime();
for(int i=0; i<BIG_LOOP_ITERATION_COUNT; i++)
{
sum = 0;
for(int j=0; j<TAB_SIZE; j++)
sum += array[j];
}
endTime = System.nanoTime() - startTime;
postMessage("Benchmark for FLOAT finished in: " + endTime/1000000 + "ms.\n");
postMessage("Final value: " + sum + "\n\n");
}
on my phone I get around 2 seconds for warmup and 20 seconds for "real" loop.
Now, when I add two more float variables (sum2 and sum3 - never used inside the method):
private void runMinorBenchmarkFloat (float[] array) {
float sum = 0, sum2 = 0, sum3 = 0; // <------- the only code change here!!!
long startTime;
long endTime;
/* Fast warm-up */
startTime = System.nanoTime();
for(int i=0; i<SMALL_LOOP_ITERATION_COUNT; i++)
for(int j=0; j<TAB_SIZE; j++)
sum += array[j];
endTime = System.nanoTime() - startTime;
postMessage("Warm-up for FLOAT finished in: " + endTime/1000000 + "ms.\n");
/* Main benchmark loop */
startTime = System.nanoTime();
for(int i=0; i<BIG_LOOP_ITERATION_COUNT; i++)
{
sum = 0;
for(int j=0; j<TAB_SIZE; j++)
sum += array[j];
}
endTime = System.nanoTime() - startTime;
postMessage("Benchmark for FLOAT finished in: " + endTime/1000000 + "ms.\n");
postMessage("Final value: " + sum + "\n\n");
}
execution time jumps from 2 seconds for warmup to 5 seconds and from 20 seconds for real loop to 50 seconds.
The constants:
SMALL_LOOP_ITERATION_COUNT = 100,000
BIG_LOOP_ITERATION_COUNT = 1,000,000
Do you think that such difference could be caused by alignment problems (just loose idea)?
Thanks in advance for any answers.
EDIT:
It seems like this error does not appear on every device. I can reproduce it on Samsung Galaxy S5. The main goal of the program was to make little benchmark. I did four almost-the-same functions (runMinorBenchmark____ where _ was either: int, short, float, double) which differed only in variable 'sum' type. In the main benchmark function I invoked those functions. Because the mentioned error occured I decided to merge those function into one big one. Now... When running test I have such times: 1. 37640ms. (for int) 2. 46728ms. (for short) 3. 60589ms. (for float) 4. 34467ms. (for double)
I know that short is meant to be slower because of type casting. I also thought that float should be slower in case of casting it to double (maybe FPU does type casting to double every time(?)). But when I change variable type for sumFloat from float to double the time for float is identical to double time. I also did this "benchmark" on one other device which seemed to not suffer from this strange behaviour and the times for every test were almost the same: ~45000ms. (really no visible differences).
Dalvik VM error (?)
I refuse to believe that is the cause of your trouble. Surely the compiler just chucks those unused variables away? Are you sure the input array doesn't change, or your constants, or TAB_SIZE
?
If you're still sure, prove it by running something like this and pasting the output here:
public void proveIt() {
float[] inputArray = new float[10000];
for (int i = 0; i < 10000; i++) {
inputArray[i] = 1;
}
postMessage("Without declaration:");
runMinorBenchmarkFloatA(inputArray);
postMessage("With declaration:");
runMinorBenchmarkFloatB(inputArray);
postMessage("And again just to make sure...");
postMessage("Without declaration:");
runMinorBenchmarkFloatA(inputArray);
postMessage("With declaration:");
runMinorBenchmarkFloatB(inputArray);
}
long TAB_SIZE = 10000;
long SMALL_LOOP_ITERATION_COUNT = 100000;
long BIG_LOOP_ITERATION_COUNT = 1000000;
private void runMinorBenchmarkFloatA(float[] array) {
float sum = 0;
long startTime;
long endTime;
/* Fast warm-up */
startTime = System.nanoTime();
for (int i = 0; i < SMALL_LOOP_ITERATION_COUNT; i++)
for (int j = 0; j < TAB_SIZE; j++)
sum += array[j];
endTime = System.nanoTime() - startTime;
postMessage("Warm-up for FLOAT finished in: " + endTime
/ 1000000 + "ms.\n");
/* Main benchmark loop */
startTime = System.nanoTime();
for (int i = 0; i < BIG_LOOP_ITERATION_COUNT; i++) {
sum = 0;
for (int j = 0; j < TAB_SIZE; j++)
sum += array[j];
}
endTime = System.nanoTime() - startTime;
postMessage("Benchmark for FLOAT finished in: " + endTime
/ 1000000 + "ms.\n");
postMessage("Final value: " + sum + "\n\n");
}
private void runMinorBenchmarkFloatB(float[] array) {
float sum = 0, sum2 = 0, sum3 = 0;
long startTime;
long endTime;
/* Fast warm-up */
startTime = System.nanoTime();
for (int i = 0; i < SMALL_LOOP_ITERATION_COUNT; i++)
for (int j = 0; j < TAB_SIZE; j++)
sum += array[j];
endTime = System.nanoTime() - startTime;
postMessage("Warm-up for FLOAT finished in: " + endTime
/ 1000000 + "ms.\n");
/* Main benchmark loop */
startTime = System.nanoTime();
for (int i = 0; i < BIG_LOOP_ITERATION_COUNT; i++) {
sum = 0;
for (int j = 0; j < TAB_SIZE; j++)
sum += array[j];
}
endTime = System.nanoTime() - startTime;
postMessage("Benchmark for FLOAT finished in: " + endTime
/ 1000000 + "ms.\n");
postMessage("Final value: " + sum + "\n\n");
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With