In the class P
below, the method test
seems to return identically false
:
import java.util.function.IntPredicate; import java.util.stream.IntStream; public class P implements IntPredicate { private final static int SIZE = 33; @Override public boolean test(int seed) { int[] state = new int[SIZE]; state[0] = seed; for (int i = 1; i < SIZE; i++) { state[i] = state[i - 1]; } return seed != state[SIZE - 1]; } public static void main(String[] args) { long count = IntStream.range(0, 0x0010_0000).filter(new P()).count(); System.out.println(count); } }
Combining class P
with IntStream
, however, the method test
can (wrongly) return true
. The code in the main
method above results in some positive integer, like 716208
. The result changes after every execution.
This unexpected behavior occurs because the int
array state[]
can be set to zero during the execution. If a testing code, such as
if (seed == 0xf_fff0){ System.out.println(Arrays.toString(state)); }
is inserted at the tail of the method test
, then the program will output a line like [1048560, 1048560, 1048560, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
.
Question: Why can the int array state[]
be set to zero?
I already know how to avoid this behavior: just replacing int[]
with ArrayList
.
I examined in:
IntStream of(int… The following of() method returns a sequentially ordered stream whose elements are the specified values static IntStream of(int… values) Here, the parameter values are the elements of the new stream.
IntStream filter(IntPredicate predicate) Returns a stream consisting of the elements of this stream that match the given predicate. This is an intermediate operation.
The rangeClosed() class in the IntStream class returns a sequential ordered IntStream from startInclusive to endInclusive by an incremental step of 1. This includes both the startInclusive and endInclusive values. The syntax is as follows static IntStream rangeClosed(int startInclusive, int endInclusive)
Example: IntStream. range(1,5) generates a stream of ' 1,2,3,4 ' of type int . rangeClosed() method generates a stream of numbers starting from start value and stops after generating the end value, i.e start value is inclusive and end value is also inclusive.
One can reproduce the problem with an even simpler example, namely:
class Main { private final static int SIZE = 33; public static boolean test2(int seed) { int[] state = new int[SIZE]; state[0] = seed; for (int i = 1; i < SIZE; i++) { state[i] = state[i - 1]; } return seed != state[SIZE - 1]; } public static void main(String[] args) { long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count(); System.out.println(count); } }
The problem is caused by the JVM
optimization flag that allows the vectorization (SIMD) of loops (i.e., -XX:+AllowVectorizeOnDemand
). Likely arises from applying vectorization on the same array with intersecting ranges (i.e., state[i] = state[i - 1];
). A similar issue would be possible to reproduce if the JVM
would (for some of the elements of the IntStream.range(0, 0x0010_0000)
), optimize the loop:
for (int i = 1; i < SIZE; i++) state[i] = state[i - 1];
into:
System.arraycopy(state, 0, state, 1, SIZE - 1);
For instance:
class Main { private final static int SIZE = 33; public static boolean test2(int seed) { int[] state = new int[SIZE]; state[0] = seed; System.arraycopy(state, 0, state, 1, SIZE - 1); if(seed == 100) System.out.println(Arrays.toString(state)); return seed != state[SIZE - 1]; } public static void main(String[] args) { long count = IntStream.range(0, 0x0010_0000).filter(Main::test2).count(); System.out.println(count); } }
output:
[100, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
NEW UPDATE: 01/01/2021
I have sent an email to one of the Developers involved in the implementation/integration of that flag -XX:+AllowVectorizeOnDemandand
received the following reply:
It is known that part of AllowVectorizeOnDemand code is broken.
There was fix (it excluded executing broken code which does incorrect vectorization) which was backported into jdk 11.0.11:
https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/rev/69dbdd271e04
If you can, try build and test latest OpenJDK11u from https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/
From the first link, one can read the following:
@bug 8251994 @summary Test vectorization of Streams$RangeIntSpliterator::forEachRemaining @requires vm.compiler2.enabled & vm.compMode != "Xint"
@run main compiler.vectorization.TestForEachRem test1 @run main compiler.vectorization.TestForEachRem test2 @run main compiler.vectorization.TestForEachRem test3 @run main compiler.vectorization.TestForEachRem test4
From the comments on the JIRA story on that bug, one can read:
I found the cause of the issue. To improve a chance to vectorize a loop, superword tries to hoist loads to the beginning of loop by replacing their memory input with corresponding (same memory slice) loop's memory Phi : http://hg.openjdk.java.net/jdk/jdk/file/8f73aeccb27c/src/hotspot/share/opto/superword.cpp#l471
Originally loads are ordered by corresponding stores on the same memory slice. But when they are hoisted they loose that ordering - nothing enforce the order. In test6 case the ordering is preserved (luckily?) after hoisting only when vector size is 32 bytes (avx2) but they become unordered with 16 (avx=0 or avx1) or 64 (avx512) bytes vectors. (...)
I have simple fix (use original loads ordering indexes) but looking on the code which causing the issue I see that it is bogus/incomplete - it does not help cases listed for JDK-8076284 changes:
https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017645.html
Using unrolling and cloning information to vectorize is interesting idea but as I see it is not complete. Even if pack_parallel() method is able created packs they are all removed by filter_packs() method. And additionally the above cases are vectorized without hoisting loads and pack_parallel - I verified it. That code is useless now and I will put it under flag to not run it. It needs more work to be useful. I reluctant to remove the code because may be in a future we will have time to invest into it.
This might explain why when I was comparing the assembly of the versions with and without the flag -XX:+AllowVectorizeOnDemand
, I notice that the version with the flag for the following code:
for (int i = 1; i < SIZE; i++) state[i] = state[i - 1];
(that I extract on a method called hotstop
to facilitate looking for it in the assembly), had:
00000001162bacf5: mov %r8d,0x10(%rsi,%r10,4) 0x00000001162bacfa: mov %r8d,0x14(%rsi,%r10,4) 0x00000001162bacff: mov %r8d,0x18(%rsi,%r10,4) 0x00000001162bad04: mov %r8d,0x1c(%rsi,%r10,4) 0x00000001162bad09: mov %r8d,0x20(%rsi,%r10,4) 0x00000001162bad0e: mov %r8d,0x24(%rsi,%r10,4) 0x00000001162bad13: mov %r8d,0x28(%rsi,%r10,4) 0x00000001162bad18: mov %r8d,0x2c(%rsi,%r10,4) ;*iastore {reexecute=0 rethrow=0 return_oop=0} ; - AAAAAA.Main::hotstop@15 (line 21)
Which looks to me like a loop unrolling
, a side from that, the method java.util.stream.Streams$RangeIntSpliterator::forEachRemaining
showed up only in the assembly of the version with the flag.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With