
Java's Serial garbage collector performing far better than other garbage collectors?

I'm testing an API, written in Java, that is expected to minimize latency in processing messages received over a network. To that end, I'm playing around with the different garbage collectors that are available.

I'm trying four different techniques, which utilize the following flags to control garbage collection:

1) Serial: -XX:+UseSerialGC

2) Parallel: -XX:+UseParallelOldGC

3) Concurrent: -XX:+UseConcMarkSweepGC

4) Concurrent/incremental: -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing

I ran each technique over the course of five hours. I periodically used the list of GarbageCollectorMXBean instances returned by ManagementFactory.getGarbageCollectorMXBeans() to retrieve the total time spent collecting garbage.
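
For reference, a periodic reading of this kind could be taken roughly as follows. This is a minimal sketch, not the code used in the test: the class name GcStats and its method names are hypothetical, and only the standard java.lang.management calls are assumed. Note that getCollectionTime() reports approximate elapsed collection time, which is not necessarily the same as the time application threads were paused.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; the class and method names are illustrative only.
final class GcStats {

    /** Total number of collections across all collectors; undefined (-1) values are skipped. */
    static long totalCollectionCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds; undefined (-1) values are skipped. */
    static long totalCollectionTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Print one line per collector, then the totals a periodic sampler would log.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("Total: " + totalCollectionCount() + " GC events totaling "
                + totalCollectionTimeMillis() + " ms");
    }
}
```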

My results? Note that "latency" here is "Amount of time that my application+the API spent processing each message plucked off the network."

Serial: 789 GC events totaling 1309 ms; mean latency 47.45 us, median latency 8.704 us, max latency 1197 us

Parallel: 1715 GC events totaling 122518 ms; mean latency 450.8 us, median latency 8.448 us, max latency 8292 us

Concurrent: 4629 GC events totaling 116229 ms; mean latency 707.2 us, median latency 9.216 us, max latency 9151 us

Incremental: 5066 GC events totaling 200213 ms; mean latency 515.9 us, median latency 9.472 us, max latency 14209 us

I find these results to be so improbable that they border on absurd. Does anyone know why I might be having these kinds of results?

Oh, and for the record, I'm using Java HotSpot(TM) 64-Bit Server VM.

asked Mar 16 '12 at 14:03 by user1274193




1 Answer

"I'm working on a Java application that is expected to maximize throughput and minimize latency"

Two problems with that:

  • Those are often contradictory goals, so you need to decide how important each is against the other (would you sacrifice 10% latency to get 20% throughput gain or vice versa? Are you aiming for some specific latency target, beyond which it doesn't matter whether it's any faster? Things like that.)
  • You haven't given any results for either of these

All you've shown is how much time is spent in the garbage collector. If you actually achieve more throughput, you would probably expect to see more time spent in the garbage collector. Or to put it another way, I can make a change in the code to minimize the values you're reporting really easily:

// Avoid generating any garbage
Thread.sleep(10000000);

You need to work out what's actually important to you. Measure everything that's important, then work out where the trade-off lies. So the first thing to do is re-run your tests and measure latency and throughput. You may also care about total CPU usage (which isn't the same as CPU in GC of course) but while you're not measuring your primary aims, your results aren't giving you particularly useful information.
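
As a rough illustration of measuring those two things directly, here is a minimal sketch. Everything in it is an assumption for the sake of example: LatencyRecorder and handleMessage are hypothetical names, the "message" is a stand-in for real work, and samples go into a preallocated array so that recording itself does not add garbage during the run.

```java
import java.util.Arrays;

// Hypothetical recorder; names and structure are illustrative, not from the question.
final class LatencyRecorder {
    private final long[] samplesNanos;
    private int count;

    LatencyRecorder(int capacity) {
        samplesNanos = new long[capacity]; // preallocated: no allocation per sample
    }

    /** Records one latency sample; samples beyond capacity are dropped. */
    void record(long nanos) {
        if (count < samplesNanos.length) {
            samplesNanos[count++] = nanos;
        }
    }

    int count() {
        return count;
    }

    double meanNanos() {
        requireSamples();
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += samplesNanos[i];
        }
        return (double) sum / count;
    }

    /** Median of the recorded samples (the upper middle value for even counts). */
    long medianNanos() {
        requireSamples();
        long[] sorted = Arrays.copyOf(samplesNanos, count);
        Arrays.sort(sorted);
        return sorted[count / 2];
    }

    long maxNanos() {
        requireSamples();
        long max = samplesNanos[0];
        for (int i = 1; i < count; i++) {
            max = Math.max(max, samplesNanos[i]);
        }
        return max;
    }

    private void requireSamples() {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
    }

    // Stand-in for the real per-message work.
    static int handleMessage(int i) {
        return Integer.toString(i).hashCode();
    }

    public static void main(String[] args) {
        int messages = 100_000;
        LatencyRecorder recorder = new LatencyRecorder(messages);
        long checksum = 0; // keeps the JIT from discarding the work
        long runStart = System.nanoTime();
        for (int i = 0; i < messages; i++) {
            long start = System.nanoTime();
            checksum += handleMessage(i);
            recorder.record(System.nanoTime() - start);
        }
        double elapsedSeconds = (System.nanoTime() - runStart) / 1e9;
        System.out.printf("mean %.1f ns, median %d ns, max %d ns, throughput %.0f msg/s (checksum %d)%n",
                recorder.meanNanos(), recorder.medianNanos(), recorder.maxNanos(),
                messages / elapsedSeconds, checksum);
    }
}
```

Running the same harness once per collector flag gives latency and throughput figures that can be compared directly, instead of inferring them from time spent in GC.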

answered Oct 23 '22 at 15:10 by Jon Skeet