
HashSet: slow performance with a big set

I have run into a problem I cannot solve. I am using a HashSet to store values of a custom type, Cycle, in which I have overridden hashCode() and equals() as follows, to make sure the slow performance is not caused by those methods. I have also set the initial capacity of the HashSet to 10,000,000.

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + (int) (cycleId ^ (cycleId >>> 32));
    return result;
}

@Override
public boolean equals(Object obj) {
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    Cycle other = (Cycle) obj;
    if (cycleId != other.cycleId)
        return false;
    return true;
}

After the first 1,500,000 values, adding a new value (with the add method of the HashSet class) becomes very slow. Eventually I get an out-of-memory error (Exception in thread "Thread-0" java.lang.OutOfMemoryError: Java heap space) before the number of stored values reaches 1,600,000.

The IDE I use is Eclipse. So the next step was to increase the JVM heap size from the default value to 1 GB (using the options -Xmx1000M and -Xms1000M). Now Eclipse starts with 10 times more memory available (I can see that in the bottom right, where the total heap size and used memory are shown), but I still get the same slow performance and the same out-of-memory error AT THE SAME POINT as before (after 1,500,000 and before 1,600,000 values), which is very odd.

Does anyone have an idea what the problem might be?

Thank you in advance

C.L.S asked Jul 25 '10 11:07


3 Answers

You don't want to increase the JVM heap for Eclipse; you want to set it for your program.

Go to Run > Run Configurations (or Debug Configurations), open the Arguments tab, and set the VM arguments there.

Devon_C_Miller answered Nov 13 '22 11:11


Not enough heap memory (increase it via -Xmx, e.g. -Xmx512m). When free memory gets very low, the garbage collector spends a great deal of time furiously scanning the heap for unreachable objects.

Your hashCode() is fine, extra points for using all bits of the cycleId long.

Edit: Now I see that you did increase the memory and it didn't help. First of all, are you sure you actually managed to increase the memory? You could check this with jconsole: connect to your app and look at its heap size.
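As a quick cross-check alongside jconsole, the program can print the heap limits that its own JVM actually received. This is only a minimal sketch: the class name HeapCheck and the helper maxHeapMiB are made up for illustration, while Runtime.maxMemory(), totalMemory() and freeMemory() are standard JDK methods.

```java
public class HeapCheck {
    // Hypothetical helper: maximum heap this JVM will try to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        // Roughly reflects -Xmx for the process that runs this code,
        // not for the Eclipse IDE process itself.
        System.out.println("max heap:          " + maxHeapMiB() + " MiB");
        // Heap currently reserved by the JVM (starts near -Xms).
        System.out.println("committed heap:    " + rt.totalMemory() / mib + " MiB");
        // Unused part of the committed heap.
        System.out.println("free in committed: " + rt.freeMemory() / mib + " MiB");
    }
}
```

If the first number is nowhere near 1000 when the code is launched from Eclipse, the -Xmx setting never reached the program.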

As an alternative explanation to verify: is there any particular pattern in your cycleId values that could make this hashCode() implementation bad? For example, the 32 high-order bits being mostly similar to the 32 low-order bits. (Yeah, right.)

But no. Even if that were the case, you would see a gradual degradation of performance, not a sharp drop at a specific point (and you do get an OutOfMemoryError and frenzied GC activity). So my best guess is still a memory issue. Either you didn't increase the heap size as you thought, or some other code is grabbing memory at some point. (You could use a tool like VisualVM to profile this, get a heap dump upon the OOME, and see what objects it contains.)
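To make the hypothetical bad pattern concrete, here is a small sketch that reuses the folding step from the question's hashCode(). The class HashPatternDemo and the helpers cycleHash, mirrored and distinctHashes are invented for this illustration. When the high 32 bits of an id equal its low 32 bits, the XOR cancels out and every such id gets the same hash; ordinary sequential ids do not have this problem.

```java
import java.util.HashSet;
import java.util.Set;

public class HashPatternDemo {
    // Same computation as the hashCode() in the question.
    static int cycleHash(long cycleId) {
        final int prime = 31;
        int result = 1;
        result = prime * result + (int) (cycleId ^ (cycleId >>> 32));
        return result;
    }

    // Hypothetical helper: an id whose high 32 bits equal its low 32 bits.
    static long mirrored(int half) {
        long low = half & 0xFFFFFFFFL;
        return (low << 32) | low;
    }

    // Counts how many different hash values a batch of ids produces.
    static int distinctHashes(long[] ids) {
        Set<Integer> hashes = new HashSet<>();
        for (long id : ids) {
            hashes.add(cycleHash(id));
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        long[] sequential = new long[1000];
        long[] pathological = new long[1000];
        for (int i = 0; i < 1000; i++) {
            sequential[i] = i;
            pathological[i] = mirrored(i);
        }
        System.out.println(distinctHashes(sequential));   // prints 1000
        System.out.println(distinctHashes(pathological)); // prints 1
    }
}
```

With ids like the second batch, every element would land in one bucket and add() would slow down steadily as the set grows, which is different from the sudden slowdown plus OutOfMemoryError described in the question.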

Edit 2: I made the correct part of the above bold.

Dimitris Andreou answered Nov 13 '22 12:11


The memory available to an application you start from Eclipse should be configured from the Run menu. Try:

Run -> Run Configurations -> Arguments -> VM Arguments -> -Xmx1000M

Your program is slow because of the garbage collector: it runs each time memory use is about to hit the limit.

Vitalii Fedorenko answered Nov 13 '22 12:11