Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why instanceof and iterating single list is faster than several specialized lists?

I assumed that separating objects that implement different interfaces into several lists and iterating those lists afterwards would be faster than dumping all objects into a single list and then switching via instanceof. E.g. this:

ArrayList<Visible> visibles = new ArrayList<>();
ArrayList<Highlightable> highlightables = new ArrayList<>();
ArrayList<Selectable> selectables = new ArrayList<>();

// populate the lists
// Visible is an interface, Highlightable is also interface (extends Visible),
// Selectable extends Highlightable
// All interfaces have 3 concrete subclasses each,
// to test situations when JVM is not able to optimize too much due to small number of classes

for (Visible e : visibles) {
    vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
    ssum += e.selectableValue();
}

should be faster than

ArrayList<Visible> visibles = new ArrayList<>();

// populate the list

for (Visible e : visibles) {
    if (e instanceof Selectable) {
        vsum += e.visibleValue();
        hsum += ((Selectable) e).highlightableValue();
        ssum += ((Selectable) e).selectableValue();
    } else if (e instanceof Highlightable) {
        vsum += e.visibleValue();
        hsum += ((Highlightable) e).highlightableValue();
    } else {
        vsum += e.visibleValue();
    }
}

However it doesn't seem to be the case:

Main.separateLists             thrpt   30     1546.898 ±     32.312   ops/s
Main.singleListAndInstanceof   thrpt   30     1673.733 ±     29.804   ops/s

I added full source for the benchmark below.

What could be the cause of instanceof version being faster? Even if we assume that isntanceof instruction is free, then both versions should at least be equal in performance (because they still add elements to list and iterate them).

package test;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;

import java.util.ArrayList;
import java.util.Random;

public class Main {

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    @Benchmark
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 15, time = 1)
    @Fork(value = 2)
    public static long separateLists() {
        ArrayList<Visible> visibles = new ArrayList<>(3_500);
        ArrayList<Highlightable> highlightables = new ArrayList<>(3_500);
        ArrayList<Selectable> selectables = new ArrayList<>(3_500);

        Random random = new Random();
        for (int i = 0; i < 10_000; i++) {
            switch (random.nextInt(9)) {
                case 0:
                    visibles.add(new Visible1(i));
                    break;
                case 1:
                    highlightables.add(new Highlightable1(i));
                    break;
                case 2:
                    selectables.add(new Selectable1(i));
                    break;

                case 3:
                    visibles.add(new Visible2(i));
                    break;
                case 4:
                    highlightables.add(new Highlightable2(i));
                    break;
                case 5:
                    selectables.add(new Selectable2(i));
                    break;

                case 6:
                    visibles.add(new Visible3(i));
                    break;
                case 7:
                    highlightables.add(new Highlightable3(i));
                    break;
                case 8:
                    selectables.add(new Selectable3(i));
                    break;
            }
        }

        long listSize = visibles.size() + highlightables.size() + selectables.size();

        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : visibles) {
            vsum += e.visibleValue();
        }
        for (Highlightable e : highlightables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
        }
        for (Selectable e : selectables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
            ssum += e.selectableValue();
        }

        return listSize + vsum * hsum * ssum;
    }

    @Benchmark
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 15, time = 1)
    @Fork(value = 2)
    public static long singleListAndInstanceof() {
        ArrayList<Visible> visibles = new ArrayList<>(10_000);

        Random random = new Random();
        for (int i = 0; i < 10_000; i++) {
            switch (random.nextInt(9)) {
                case 0:
                    visibles.add(new Visible1(i));
                    break;
                case 1:
                    visibles.add(new Highlightable1(i));
                    break;
                case 2:
                    visibles.add(new Selectable1(i));
                    break;

                case 3:
                    visibles.add(new Visible2(i));
                    break;
                case 4:
                    visibles.add(new Highlightable2(i));
                    break;
                case 5:
                    visibles.add(new Selectable2(i));
                    break;

                case 6:
                    visibles.add(new Visible3(i));
                    break;
                case 7:
                    visibles.add(new Highlightable3(i));
                    break;
                case 8:
                    visibles.add(new Selectable3(i));
                    break;
            }
        }

        long listSize = visibles.size();

        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : visibles) {
            if (e instanceof Selectable) {
                vsum += e.visibleValue();
                hsum += ((Selectable) e).highlightableValue();
                ssum += ((Selectable) e).selectableValue();
            } else if (e instanceof Highlightable) {
                vsum += e.visibleValue();
                hsum += ((Highlightable) e).highlightableValue();
            } else {
                vsum += e.visibleValue();
            }
        }

        return listSize + vsum * hsum * ssum;
    }
}

abstract class Visible {
    abstract int visibleValue();
}
abstract class Highlightable extends Visible {
    abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
    abstract int selectableValue();
}

class Visible1 extends Visible {
    private int v;
    Visible1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
    private int v;
    Highlightable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*2; }
    @Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
    private int v;
    Selectable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*4; }
    @Override int highlightableValue() { return v*5; }
    @Override int selectableValue() { return v*6; }
}

class Visible2 extends Visible {
    private int v;
    Visible2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
    private int v;
    Highlightable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*8; }
    @Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
    private int v;
    Selectable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*10; }
    @Override int highlightableValue() { return v*11; }
    @Override int selectableValue() { return v*12; }
}


class Visible3 extends Visible {
    private int v;
    Visible3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
    private int v;
    Highlightable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*14; }
    @Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
    private int v;
    Selectable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*16; }
    @Override int highlightableValue() { return v*17; }
    @Override int selectableValue() { return v*18; }
}

Benchmark output:

Main.separateLists                                             thrpt  600     1690.522 ±    6.570   ops/s
Main.singleListAndInstanceof                                   thrpt  600     1751.375 ±    4.368   ops/s
Main.separateLists:L1-dcache-load-misses:u                     thrpt    2     2298.258               #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u           thrpt    2      627.451               #/op
Main.separateLists:L1-dcache-loads:u                           thrpt    2  1217756.290               #/op
Main.singleListAndInstanceof:L1-dcache-loads:u                 thrpt    2  1135982.650               #/op
Main.separateLists:L1-icache-load-misses:u                     thrpt    2      113.599               #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u           thrpt    2       99.896               #/op
Main.separateLists:L1-icache-loads:u                           thrpt    2   656048.382               #/op
Main.singleListAndInstanceof:L1-icache-loads:u                 thrpt    2   694074.004               #/op
Main.separateLists:LLC-load-misses:u                           thrpt    2      872.681               #/op
Main.singleListAndInstanceof:LLC-load-misses:u                 thrpt    2      355.666               #/op
Main.separateLists:LLC-loads:u                                 thrpt    2    12036.496               #/op
Main.singleListAndInstanceof:LLC-loads:u                       thrpt    2     7445.434               #/op
Main.separateLists:LLC-stores:u                                thrpt    2    15277.223               #/op
Main.singleListAndInstanceof:LLC-stores:u                      thrpt    2    10649.517               #/op
Main.separateLists:branch-misses:u                             thrpt    2    22463.763               #/op
Main.singleListAndInstanceof:branch-misses:u                   thrpt    2    29940.958               #/op
Main.separateLists:branches:u                                  thrpt    2   254018.586               #/op
Main.singleListAndInstanceof:branches:u                        thrpt    2   275450.951               #/op
Main.separateLists:cycles:u                                    thrpt    2  1988517.839               #/op
Main.singleListAndInstanceof:cycles:u                          thrpt    2  1921584.057               #/op
Main.separateLists:dTLB-load-misses:u                          thrpt    2       66.212               #/op
Main.singleListAndInstanceof:dTLB-load-misses:u                thrpt    2       64.442               #/op
Main.separateLists:dTLB-loads:u                                thrpt    2  1217929.340               #/op
Main.singleListAndInstanceof:dTLB-loads:u                      thrpt    2  1135799.981               #/op
Main.separateLists:iTLB-load-misses:u                          thrpt    2        4.179               #/op
Main.singleListAndInstanceof:iTLB-load-misses:u                thrpt    2        3.876               #/op
Main.separateLists:iTLB-loads:u                                thrpt    2   656595.175               #/op
Main.singleListAndInstanceof:iTLB-loads:u                      thrpt    2   693913.010               #/op
Main.separateLists:instructions:u                              thrpt    2  2273646.245               #/op
Main.singleListAndInstanceof:instructions:u                    thrpt    2  2045332.939               #/op
Main.separateLists:stalled-cycles-backend:u                    thrpt    2   773671.154               #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u          thrpt    2   619477.824               #/op
Main.separateLists:stalled-cycles-frontend:u                   thrpt    2   184604.485               #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u         thrpt    2   271938.450               #/op
Main.separateLists:·gc.alloc.rate                              thrpt  600      217.266 ±    0.846  MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate                    thrpt  600      222.747 ±    0.556  MB/sec
Main.separateLists:·gc.alloc.rate.norm                         thrpt  600   202181.035 ±    2.986    B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm               thrpt  600   200083.395 ±    4.720    B/op
Main.separateLists:·gc.churn.PS_Eden_Space                     thrpt  600      217.792 ±    3.841  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space           thrpt  600      223.528 ±    4.973  MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm                thrpt  600   202704.197 ± 3508.997    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm      thrpt  600   200804.794 ± 4414.457    B/op
Main.separateLists:·gc.churn.PS_Survivor_Space                 thrpt  600        0.095 ±    0.008  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space       thrpt  600        0.091 ±    0.008  MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm            thrpt  600       88.896 ±    7.778    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm  thrpt  600       81.693 ±    7.269    B/op
Main.separateLists:·gc.count                                   thrpt  600     2440.000             counts
Main.singleListAndInstanceof:·gc.count                         thrpt  600     2289.000             counts
Main.separateLists:·gc.time                                    thrpt  600     4501.000                 ms
Main.singleListAndInstanceof:·gc.time                          thrpt  600     4236.000                 ms

UPDATE: Below is the benchmark code and results with array setup code extracted into separate methods and removed from the measurement. instanceof is slower for that case, as expected - above differences are probably related to branch prediction issues in the list setup. (while those are interesting too, they probably should go into separate question)

package test;

import org.openjdk.jmh.annotations.*;

import java.util.ArrayList;
import java.util.Random;

public class Main {

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    @State(Scope.Thread)
    public static class SeparateListsState {
        public ArrayList<Visible> visibles;
        public ArrayList<Highlightable> highlightables;
        public ArrayList<Selectable> selectables;

        @Setup(Level.Invocation)
        public void doSetup() {
            visibles = new ArrayList<>();
            highlightables = new ArrayList<>();
            selectables = new ArrayList<>();

            Random random = new Random(9698426994L + 8879);
            for (int i = 0; i < 10_000; i++) {
                switch (random.nextInt(9)) {
                    case 0:
                        visibles.add(new Visible1(i));
                        break;
                    case 1:
                        highlightables.add(new Highlightable1(i));
                        break;
                    case 2:
                        selectables.add(new Selectable1(i));
                        break;

                    case 3:
                        visibles.add(new Visible2(i));
                        break;
                    case 4:
                        highlightables.add(new Highlightable2(i));
                        break;
                    case 5:
                        selectables.add(new Selectable2(i));
                        break;

                    case 6:
                        visibles.add(new Visible3(i));
                        break;
                    case 7:
                        highlightables.add(new Highlightable3(i));
                        break;
                    case 8:
                        selectables.add(new Selectable3(i));
                        break;
                }
            }
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 150, time = 1)
    @Fork(value = 2)
    public static long separateLists(SeparateListsState state) {
        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : state.visibles) {
            vsum += e.visibleValue();
        }
        for (Highlightable e : state.highlightables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
        }
        for (Selectable e : state.selectables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
            ssum += e.selectableValue();
        }

        return vsum * hsum * ssum;
    }

    @State(Scope.Thread)
    public static class SingleListAndInstanceofState {
        public ArrayList<Visible> visibles;

        @Setup(Level.Invocation)
        public void doSetup() {
            visibles = new ArrayList<>();

            Random random = new Random(9698426994L + 8879);
            for (int i = 0; i < 10_000; i++) {
                switch (random.nextInt(9)) {
                    case 0:
                        visibles.add(new Visible1(i));
                        break;
                    case 1:
                        visibles.add(new Highlightable1(i));
                        break;
                    case 2:
                        visibles.add(new Selectable1(i));
                        break;

                    case 3:
                        visibles.add(new Visible2(i));
                        break;
                    case 4:
                        visibles.add(new Highlightable2(i));
                        break;
                    case 5:
                        visibles.add(new Selectable2(i));
                        break;

                    case 6:
                        visibles.add(new Visible3(i));
                        break;
                    case 7:
                        visibles.add(new Highlightable3(i));
                        break;
                    case 8:
                        visibles.add(new Selectable3(i));
                        break;
                }
            }
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 150, time = 1)
    @Fork(value = 2)
    public static long singleListAndInstanceof(SingleListAndInstanceofState state) {
        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : state.visibles) {
            if (e instanceof Selectable) {
                vsum += e.visibleValue();
                hsum += ((Selectable) e).highlightableValue();
                ssum += ((Selectable) e).selectableValue();
            } else if (e instanceof Highlightable) {
                vsum += e.visibleValue();
                hsum += ((Highlightable) e).highlightableValue();
            } else {
                vsum += e.visibleValue();
            }
        }

        return vsum * hsum * ssum;
    }
}

abstract class Visible {
    abstract int visibleValue();
}
abstract class Highlightable extends Visible {
    abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
    abstract int selectableValue();
}

class Visible1 extends Visible {
    private int v;
    Visible1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
    private int v;
    Highlightable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*2; }
    @Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
    private int v;
    Selectable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*4; }
    @Override int highlightableValue() { return v*5; }
    @Override int selectableValue() { return v*6; }
}

class Visible2 extends Visible {
    private int v;
    Visible2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
    private int v;
    Highlightable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*8; }
    @Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
    private int v;
    Selectable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*10; }
    @Override int highlightableValue() { return v*11; }
    @Override int selectableValue() { return v*12; }
}


class Visible3 extends Visible {
    private int v;
    Visible3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
    private int v;
    Highlightable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*14; }
    @Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
    private int v;
    Selectable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*16; }
    @Override int highlightableValue() { return v*17; }
    @Override int selectableValue() { return v*18; }
}

And the results:

Main.separateLists                                             thrpt  300     4211.552 ±    23.791   ops/s
Main.singleListAndInstanceof                                   thrpt  300     3920.251 ±    15.478   ops/s
Main.separateLists:L1-dcache-load-misses:u                     thrpt    2     3046.033                #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u           thrpt    2     1089.122                #/op
Main.separateLists:L1-dcache-loads:u                           thrpt    2  1090745.006                #/op
Main.singleListAndInstanceof:L1-dcache-loads:u                 thrpt    2  1125243.609                #/op
Main.separateLists:L1-icache-load-misses:u                     thrpt    2      150.542                #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u           thrpt    2      143.304                #/op
Main.separateLists:L1-icache-loads:u                           thrpt    2   600852.620                #/op
Main.singleListAndInstanceof:L1-icache-loads:u                 thrpt    2   700771.042                #/op
Main.separateLists:LLC-load-misses:u                           thrpt    2     1299.520                #/op
Main.singleListAndInstanceof:LLC-load-misses:u                 thrpt    2      636.764                #/op
Main.separateLists:LLC-loads:u                                 thrpt    2    14408.815                #/op
Main.singleListAndInstanceof:LLC-loads:u                       thrpt    2    10429.768                #/op
Main.separateLists:LLC-stores:u                                thrpt    2    18999.178                #/op
Main.singleListAndInstanceof:LLC-stores:u                      thrpt    2    15370.582                #/op
Main.separateLists:branch-misses:u                             thrpt    2    22578.062                #/op
Main.singleListAndInstanceof:branch-misses:u                   thrpt    2    29257.959                #/op
Main.separateLists:branches:u                                  thrpt    2   258026.890                #/op
Main.singleListAndInstanceof:branches:u                        thrpt    2   284911.889                #/op
Main.separateLists:cycles:u                                    thrpt    2  1915774.770                #/op
Main.singleListAndInstanceof:cycles:u                          thrpt    2  1974841.023                #/op
Main.separateLists:dTLB-load-misses:u                          thrpt    2      101.573                #/op
Main.singleListAndInstanceof:dTLB-load-misses:u                thrpt    2       99.982                #/op
Main.separateLists:dTLB-loads:u                                thrpt    2  1090174.103                #/op
Main.singleListAndInstanceof:dTLB-loads:u                      thrpt    2  1129185.929                #/op
Main.separateLists:iTLB-load-misses:u                          thrpt    2        4.432                #/op
Main.singleListAndInstanceof:iTLB-load-misses:u                thrpt    2        3.955                #/op
Main.separateLists:iTLB-loads:u                                thrpt    2   600301.665                #/op
Main.singleListAndInstanceof:iTLB-loads:u                      thrpt    2   703339.482                #/op
Main.separateLists:instructions:u                              thrpt    2  1974603.052                #/op
Main.singleListAndInstanceof:instructions:u                    thrpt    2  2040460.093                #/op
Main.separateLists:stalled-cycles-backend:u                    thrpt    2   808914.974                #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u          thrpt    2   685615.056                #/op
Main.separateLists:stalled-cycles-frontend:u                   thrpt    2   186013.216                #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u         thrpt    2   272207.204                #/op
Main.separateLists:·gc.alloc.rate                              thrpt  300      346.891 ±     1.166  MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate                    thrpt  300      358.297 ±     0.614  MB/sec
Main.separateLists:·gc.alloc.rate.norm                         thrpt  300   310744.294 ±     0.107    B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm               thrpt  300   328992.302 ±     0.110    B/op
Main.separateLists:·gc.churn.PS_Eden_Space                     thrpt  300      349.387 ±    14.305  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space           thrpt  300      360.039 ±     9.075  MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm                thrpt  300   313154.953 ± 13018.012    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm      thrpt  300   330629.833 ±  8345.712    B/op
Main.separateLists:·gc.churn.PS_Survivor_Space                 thrpt  300        0.092 ±     0.012  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space       thrpt  300        0.094 ±     0.011  MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm            thrpt  300       82.348 ±    10.661    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm  thrpt  300       86.465 ±    10.417    B/op
Main.separateLists:·gc.count                                   thrpt  300     1196.000              counts
Main.singleListAndInstanceof:·gc.count                         thrpt  300     1235.000              counts
Main.separateLists:·gc.time                                    thrpt  300     2178.000                  ms
Main.singleListAndInstanceof:·gc.time                          thrpt  300     2355.000                  ms
like image 885
Rogach Avatar asked Jun 26 '19 14:06

Rogach


2 Answers

As NoDataFound suggests in the comments, you're not just comparing the performance of iterating through the list, you're also comparing the list population methods. You need to pull this part of your code into a setup method - otherwise you're potentially going to be impacted by resize operations on your three ArrayList instances (amongst other things).

You should also either scrap the use of Random to populate the list, or at least instantiate it with the same seed across both implementations - otherwise you're not creating a repeatable order of elements across both implementations.

like image 191
Catchwa Avatar answered Nov 15 '22 14:11

Catchwa


The core point is: your measuring is flawed. Not only did that first version of the your question measure different "things", but even the updated question has one big problem, the (way too low) sample size of 10_000!

That is simply not a reasonable sample size. At least, not alone. You should rather check out what you see for 10K, 50K, 100K, 1 million, ... loop iterations.

You see, you make a mistake that many people make with Java: they assume that this or that construct on the source code side of things determines performance.

And that is only partially true. You see, the real performance kicks come out of the myriad optimisations that the JIT compiler within the JVM will do at runtime, based on the profiling information it collected.

I think, the default threshold for the JIT to kick in and translate bytecode into highly optimized machine code is like 90K(?) method invocations/loop iterations.

Please let that sink in: unless your code does something on the scale of 100K or more, the JIT doesn't even consider your code "worth optimising". But then it will kick in, and that can have dramatic effects on overall performance. And then, what you put into your source code ... might not matter much any more.

Thus there isn't a single measurement that tells you what the "best" solution is. That rather depends on the number of iterations you go through.

Thus, the real answer is: java performance benchmarking is hard, and you have to be extremely diligent about what you do, and the conclusions you draw from your results.

Beyond that, the real real answer is: performance is a luxury problem. Of course, one should avoid stupid mistakes that burn CPU cycles for nothing. But besides that, your primary goal is always to write simple code that is easy to read and understand. And then, when you notice "this feels sluggish" or "this doesn't meet our SLAs", then you carefully define experiments to measure what is going on, to identify that piece of code that causes the performance problem. And just for the record: you know which code the JIT can optimise the best ... surprise: simple straight forward code, that looks like code 90% of good java coders tend to write.

like image 21
GhostCat Avatar answered Nov 15 '22 16:11

GhostCat