Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is modulo slow in Java?

Tags:

I've been looking at the implementation of ThreadLocal in the JDK, out of curiosity, and I found this :

/**  * Increment i modulo len.  */  private static int nextIndex(int i, int len) {      return ((i + 1 < len) ? i + 1 : 0);  } 

It looks fairly obvious that this could be implemented with a simple return (i + 1) % len, but I think these guys know their stuff. Any idea why they did this ?

This code is highly oriented towards performance, with a custom map for holding thread-local mappings, weak references to help the GC being clever and so on, so I guess this is a matter of performance. Is modulo slow in Java ?

like image 610
Dici Avatar asked Mar 04 '16 00:03

Dici


People also ask

Is modulo slow Java?

Integer division and modulo are relatively slow because there is no direct hardware support (they compile to multiple instruction sequences). Floating point modulo is fast.

Is modulo operator slower?

So in simple terms, this should give you a feel for why division and hence modulo is slower: computers still have to do long division in the same stepwise fashion tha you did in grade school.

Is modulo faster than if?

A modulo-operation is very slow. An if is most likely to be faster than a modulo and more readable.

Is modulo operation expensive?

Modulo operator is expensive but the division is expensive too. So converting your code from using modulo operator to division is not going to optimize your code. Show activity on this post. Modulo can be done with a single processor instruction on most architectures (ex.


2 Answers

% is avoided for performance reasons in this example.

div/rem operations are slower even on CPU architecture level; not only in Java. For example, minimum latency of idiv instruction on Haswell is about 10 cycles, but only 1 cycle for add.

Let's benchmark using JMH.

import org.openjdk.jmh.annotations.*;  @State(Scope.Benchmark) public class Modulo {     @Param("16")     int len;      int i;      @Benchmark     public int baseline() {         return i;     }      @Benchmark     public int conditional() {         return i = (i + 1 < len) ? i + 1 : 0;     }      @Benchmark     public int mask() {         return i = (i + 1) & (len - 1);     }      @Benchmark     public int mod() {         return i = (i + 1) % len;     } } 

Results:

Benchmark           (len)  Mode  Cnt  Score   Error  Units Modulo.baseline        16  avgt   10  2,951 ± 0,038  ns/op Modulo.conditional     16  avgt   10  3,517 ± 0,051  ns/op Modulo.mask            16  avgt   10  3,765 ± 0,016  ns/op Modulo.mod             16  avgt   10  9,125 ± 0,023  ns/op 

As you can see, using % is ~2.6x slower than a conditional expression. JIT cannot optimize this automatically in the discussed ThreadLocal code, because the divisor (table.length) is variable.

like image 162
apangin Avatar answered Sep 28 '22 06:09

apangin


mod is not that slow in Java. It's implemented as the byte code instructions irem and frem for Integers and Floats respectively. The JIT does a good job of optimizing this.

In my benchmarks (see article), irem calls in JDK 1.8 take about 1 nanosecond. That's pretty quick. frem calls are about 3x slower, so use integers where possible.

If you're using Natural Integers (e.g. array indexing) and a power of 2 Divisor (e.g. 8 thread locals), then you can use a bit twiddling trick to get a 20% performance gain.

like image 21
Joseph Lust Avatar answered Sep 28 '22 04:09

Joseph Lust