Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange performance drop of JDK8 LocalDate.toEpochDay

I was curious if we finally get a fast datetime library with JDK8. Nearly all LocalDate computations use toEpochDay so I looked at the source and the large number of divisions and branches made me curious if I could do better.

results2

I eliminated some branching and all but one division, however the speedup is worse than expected. So my first question how is it possible that an algorithm using multiple division takes only about 30 cycles (throughput). Holger's comments seem to have answered this: Division by a small constant gets JIT-ed to multiplication. I did it manually and now I'm consistently beating the original implementation by a factor of 2.

The benchmark is pretty straightforward, just iterate though an array of random LocalDates and convert each of them toEpochDay. Despite the randomness, the results are pretty consistent. The size of the array is a parameter and my main question is where the big slowdown between 2000 and 30000 comes from. There's should be some slowdown as the data no more fit in the L1 cache, but the memory accesses of both algorithms are exactly the same (i.e., none but fetching the date from the array).

The still open question is: How it comes that the behaviour of two simple memory-access-free implementations of the same function change when iterating over an array? The original algorithm suffers a much bigger slowdown than mine.

My algorithm is probably not worth copying here, it's undocumented and about as cryptic as the original, and there's only a very rudimentary test.

like image 705
maaartinus Avatar asked Nov 10 '22 13:11

maaartinus


1 Answers

I haven't catched the reason directly, but it is certainly a benchmarking framework shortcoming. Something related to GC and per-invocation costs. I have the same performance degradation with JMH, except bench with 100 dates shows better perf than with 2000 dates, too. I've tried to create the dates array always of maximum size, and iterate just first 100, 2000, 30000 elements. In this case all versions performed equally (15.3 +- 0.3 ns on my machine).

import org.openjdk.jmh.annotations.*;

import java.time.LocalDate;
import java.util.*;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;


@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(LocalDateBenchmark.ITERATIONS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class LocalDateBenchmark {
    public static final int MAX_ITERATIONS = 1000000;
    public static final int ITERATIONS = 30000;

    private static final LocalDate MIN_DATE = LocalDate.of(1900, 1, 1);
    private static final LocalDate MAX_DATE = LocalDate.of(2100, 1, 1);
    private static final int DAYS_BETWEEN = (int) (MAX_DATE.toEpochDay() - MIN_DATE.toEpochDay());

    public LocalDate[] dates = new LocalDate[MAX_ITERATIONS];
    private Random random;

    @Setup(Level.Trial)
    public void setUpAll() {
        Random r = ThreadLocalRandom.current();
        for (int i=0; i< dates.length; ++i) {
            dates[i] = MIN_DATE.plusDays(r.nextInt(DAYS_BETWEEN));
        }
    }

    @Setup(Level.Iteration)
    public void setUpRandom() {
        random = new Random();
    }

    @GenerateMicroBenchmark
    public int timeToEpochDay(LocalDateBenchmark state) {
        int result = 0;
        LocalDate[] dates = state.dates;
        int offset = random.nextInt(MAX_ITERATIONS - ITERATIONS);
        for (int i = offset; i < offset + ITERATIONS; i++) {
            LocalDate date = dates[i];
            result += date.toEpochDay();
        }
        return result;
    }
}
like image 189
leventov Avatar answered Nov 15 '22 13:11

leventov