Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Random ArrayIndexOutOfBoundsException, using stream to order Map elements by value

Tags:

java

java-8

in the last days I'm starting to "play" with some Java 8 features, like stream (I studied a bit of documentation and several examples).

In my application I have a Map and I need to get the three element with highest value (the float part).

I tried different modifications to my code (and some of these solutions also: Sort a Map<Key, Value> by values (Java) ), for example:

Map<Long, Float> great = createMapWith20Elements();
Map<Long, Float> small = great.entrySet().stream()
        .sorted(Map.Entry.<Long, Float>comparingByValue().reversed()) 
        .limit(3) 
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

But the reslt is always the same: sometimes the code works fine, other it gives me a

java.lang.ArrayIndexOutOfBoundsException: 19

In rare cases, the index out of the bounds is 18.

This "random" behaviour (18, 19, or correct elaborations) makes me think to a "parallel threading" problem.

I'm sure that great map has always 20 elements... if I print them I receive:

2,-0.5
3,0.0
4,0.0
5,0.0
6,0.0
7,-0.33333334
8,0.0
9,0.0
10,0.0
11,0.0
12,0.5
13,0.0
14,0.0
15,-0.5
18,0.0
19,0.0
21,0.0
22,0.0
23,0.0
24,0.0

I'm conscious that 17 objects are candidate to be the first 3... but it is not a problem for my algorithm.

Can you help me in some way?

Thanks

EDIT:

The method createMapWith20Elements() has a dummy name for better explaining my situation: I'm sure it returns 20 elements because it makes a DB reading... but it should return any matching record.

By the way it ends with

// myIds is an ArrayList<Long>
myIds.parallelStream().forEach(e -> trust.put(e, 0f));
return trust;

Replacing with myIds.stream() it seems working fine... I'm not able to figure how using parallelStream to write to an object (Collection and not Stream), and returning the object itself (Collection), in the calling function it can lead to this kind of problem.

like image 575
Pierpaolo Cira Avatar asked Jan 08 '23 02:01

Pierpaolo Cira


2 Answers

I think that the problem is the method createMapWith20Elements().

You are inserting elements in the map (probably a HashMap or a TreeMap) concurrently and both HashMap and TreeMap are not synchronized. So concurrent insertions (put method invocations) break the map structure (you get a corrupted map).

As you mention:

// myIds is an ArrayList<Long>
myIds.parallelStream().forEach(e -> trust.put(e, 0f));
return trust;

sometimes produces the errors. But

// myIds is an ArrayList<Long>
myIds.stream().forEach(e -> trust.put(e, 0f));
return trust;

does not produce the error.

If you want to insert concurrently then you must use a synchronized wrapper. So your code should be:

// myIds is an ArrayList<Long>
Map<Long, Float> syncTrust = Collections.synchronizedSortedMap(trust);
myIds.parallelStream().forEach(e -> syncTrust.put(e, 0f));
return trust;
like image 109
acesargl Avatar answered Jan 26 '23 01:01

acesargl


The issue is with the progressive resizing of the underlying not synchronized collection, I guess in your case your Map structure is not synchronized. Due to the gradual re-size, not handled correctly in a multi threads context, it can bring to have a thread trying to insert an element out of the range of the current size of the vector.

From the book "Oracle Certified Professional Java SE 8 Programmer II", Sybex:

For an ArrayList object, the JVM internally manages a primitive array of the same type. As the size of the dynamic ArrayList grows, a new, larger primitive array is periodically required. If two threads both trigger the array to be resized at the same time, a result can be lost, producing the unexpected value shown here. As briefly mentioned earlier, and also discussed later in this chapter, the unexpected result of two tasks executing at the same time is a race condition.

like image 23
Enrico Giurin Avatar answered Jan 25 '23 23:01

Enrico Giurin