
How to Optimize JVM & GC through Load Testing

Edit: From the several extremely generous and helpful responses this question has already received, it is obvious to me that I didn't make an important part of it clear when I asked earlier this morning. The answers I've received so far are more about optimizing applications and removing bottlenecks at the code level. I am aware that this is far more important than trying to squeeze an extra 3% or 5% out of the JVM!

This question assumes we've already done just about everything we could to optimize our application architecture at the code level. Now we want more, and the next place to look is at the JVM level and garbage collection; I've changed the question title accordingly. Thanks again!


We've got a "pipeline" style backend architecture where messages pass from one component to the next, with each component performing different processes at each step of the way.

Components live inside of WAR files deployed on Tomcat servers. Altogether we have about 20 components in the pipeline, living on 5 different Tomcat servers (I didn't choose the architecture or the distribution of WARs for each server). We use Apache Camel to create all the routes between the components, effectively forming the "connective tissue" of the pipeline.

I've been asked to optimize the GC and general performance of each server running a JVM (5 in all). I've spent several days now reading up on GC and performance tuning, and have a pretty good handle on what each of the different JVM options does, how the heap is organized, and how most of the options affect the overall performance of the JVM.

My thinking is that the best way to optimize each JVM is not to optimize it as a standalone. I "feel" (that's about as far as I can justify it!) that trying to optimize each JVM locally without considering how it will interact with the other JVMs on other servers (both upstream and downstream) will not produce a globally-optimized solution.

To me it makes sense to optimize the entire pipeline as a whole. So my first question is: does SO agree, and if not, why?

To do this, I was thinking about creating a LoadTester that would generate input and feed it to the first endpoint in the pipeline. This LoadTester might also have a separate "Monitor Thread" that would check the last endpoint for throughput. I could then do all sorts of processing where we check for average end-to-end travel time for messages, maximum throughput before faulting, etc.
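
A minimal sketch of such a LoadTester, under loud assumptions: the two `BlockingQueue`s are hypothetical stand-ins for the first and last endpoints of the pipeline (a real harness would talk to the actual Camel endpoints), and every class and method name here is made up for illustration. It sends a fixed number of time-stamped messages, while a separate monitor thread watches the last endpoint and records travel time.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: the queues stand in for the real pipeline endpoints.
class LoadTesterSketch {

    // A test message that carries the time at which it was sent.
    static final class Message {
        final long id;
        final long sentNanos;
        Message(long id, long sentNanos) { this.id = id; this.sentNanos = sentNanos; }
    }

    // Measurements from one run under one JVM option configuration.
    static final class Result {
        final long received;
        final double avgLatencyMillis;
        final double throughputPerSecond;
        Result(long received, double avgLatencyMillis, double throughputPerSecond) {
            this.received = received;
            this.avgLatencyMillis = avgLatencyMillis;
            this.throughputPerSecond = throughputPerSecond;
        }
    }

    static Result run(BlockingQueue<Message> firstEndpoint,
                      BlockingQueue<Message> lastEndpoint,
                      int messageCount) {
        AtomicLong totalLatencyNanos = new AtomicLong();
        AtomicLong received = new AtomicLong();

        // Monitor thread: watches the last endpoint and records travel time.
        Thread monitor = new Thread(() -> {
            try {
                while (received.get() < messageCount) {
                    Message m = lastEndpoint.poll(2, TimeUnit.SECONDS);
                    if (m == null) break; // nothing arrived in time: stalled or dropped
                    totalLatencyNanos.addAndGet(System.nanoTime() - m.sentNanos);
                    received.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "monitor");

        long start = System.nanoTime();
        monitor.start();
        try {
            // Generator: feed the same pattern of messages into the first endpoint.
            for (long i = 0; i < messageCount; i++) {
                firstEndpoint.put(new Message(i, System.nanoTime()));
            }
            monitor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load test interrupted", e);
        }
        long elapsedNanos = System.nanoTime() - start;

        long n = received.get();
        double avgMs = n == 0 ? 0.0 : totalLatencyNanos.get() / (double) n / 1_000_000.0;
        double perSecond = n / (elapsedNanos / 1_000_000_000.0);
        return new Result(n, avgMs, perSecond);
    }
}
```

Passing the same queue as both endpoints gives a trivial pass-through "pipeline", which is one way to check the harness itself before pointing it at the real servers.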

The LoadTester would generate the same pattern of input messages over and over again. The variable in this experiment would be the JVM options passed to each Tomcat server at startup. I have a list of about 20 different options I'd like to pass to the JVMs, and figured I could just keep tweaking their values until I found near-optimal performance.
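
To get a feel for how large that search could be, here is a rough sketch that enumerates every combination of candidate option values. The class and method names are hypothetical, and the flag values are placeholders chosen only for illustration, not recommendations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: builds the JVM option strings to try, one per test run.
class JvmOptionGrid {

    // Returns every combination that picks exactly one value per option group,
    // each joined into a single command-line fragment.
    static List<String> combinations(Map<String, List<String>> candidates) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (List<String> values : candidates.values()) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                for (String value : values) {
                    next.add(prefix.isEmpty() ? value : prefix + " " + value);
                }
            }
            result = next;
        }
        return result;
    }

    // Placeholder candidates (assumed values, for illustration only).
    static Map<String, List<String>> exampleCandidates() {
        Map<String, List<String>> m = new LinkedHashMap<>();
        m.put("heap", Arrays.asList("-Xms1g -Xmx1g", "-Xms2g -Xmx2g"));
        m.put("collector", Arrays.asList("-XX:+UseParallelGC", "-XX:+UseG1GC"));
        m.put("newRatio", Arrays.asList("-XX:NewRatio=2", "-XX:NewRatio=3"));
        return m;
    }
}
```

Three option groups with two values each already give 8 runs. The count is the product of the list sizes, so 20 options with even two values each would be 2^20 = 1,048,576 combinations, which suggests varying only a few options at a time rather than sweeping the full grid.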

This may not be the absolute best way to do this, but it's the best way I could design with what time I've been given for this project (about a week).

Second question: what does SO think about this setup? How would SO create an "optimizing solution" any differently?

Last but not least, I'm curious as to what sort of metrics I could use as a basis of measure and comparison. I can really only think of:

  • Find the JVM option config that produces the fastest average end-to-end travel time for messages
  • Find the JVM option config that produces the largest volume throughput without crashing any of the servers

Any others? Any reasons why those 2 are bad?
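
As a sketch of how the two measures above might be compared across runs (the `Run` type and its field names are hypothetical), the following picks the best configuration under each measure. The two winners need not be the same configuration, which is worth knowing before settling on a single "optimal" one.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: compares load-test runs by the two measures above.
class ConfigComparison {

    // One load-test run: the JVM options used and the two measurements taken.
    static final class Run {
        final String jvmOptions;
        final double avgLatencyMillis;
        final double maxThroughputPerSecond;
        Run(String jvmOptions, double avgLatencyMillis, double maxThroughputPerSecond) {
            this.jvmOptions = jvmOptions;
            this.avgLatencyMillis = avgLatencyMillis;
            this.maxThroughputPerSecond = maxThroughputPerSecond;
        }
    }

    // The run with the fastest average end-to-end travel time.
    static Run lowestLatency(List<Run> runs) {
        return runs.stream()
                .min(Comparator.comparingDouble((Run r) -> r.avgLatencyMillis))
                .orElseThrow(() -> new IllegalArgumentException("no runs"));
    }

    // The run with the largest sustained throughput before any server failed.
    static Run highestThroughput(List<Run> runs) {
        return runs.stream()
                .max(Comparator.comparingDouble((Run r) -> r.maxThroughputPerSecond))
                .orElseThrow(() -> new IllegalArgumentException("no runs"));
    }
}
```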

On review, I can see how this might be construed as a monolithic question, but really what I'm asking is how SO would optimize JVMs running along a pipeline. Feel free to slice and dice my solution however you like.

Thanks in advance!

IAmYourFaja asked Jan 05 '12 16:01





1 Answer

Let me go up a level and say I did something similar in a large C app many years ago. It consisted of a number of processes exchanging messages across interconnected hardware. I came up with a two-step approach.

Step 1. Within each process, I used this technique to get rid of any wasteful activities. That took a few days of sampling, revising code, and repeating. The idea is that there is a chain, and the first thing to do is remove inefficiencies from the links.

Step 2. This part is laborious but effective: Generate time-stamped logs of message traffic. Merge them together into a common timeline. Look carefully at specific message sequences. What you're looking for is

  1. Was the message necessary, or was it a retransmission resulting from a timeout or other avoidable reason?
  2. When was the message sent, received, and acted upon? If there is a significant delay between being received and acted upon, what is the reason for that delay? Was it just a matter of being "in line" behind another process that was doing I/O, for example? Could it have been fixed with different process priorities?
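
A rough sketch of the merge-and-inspect step, with several assumptions stated up front: the log entry fields, the `RECEIVED` and `ACTED` event labels, and all names are hypothetical, and the timestamps are assumed to come from clocks that are synchronized across machines. It merges per-process logs into one timeline and lists the messages that waited longer than a chosen threshold between being received and being acted upon, as candidates for a closer manual look.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merges time-stamped logs and flags slow hand-offs.
class TimelineSketch {

    // One time-stamped log entry; real logs will have their own format.
    static final class Entry {
        final long timestampMillis;
        final String process;
        final String messageId;
        final String event; // assumed labels: "RECEIVED" or "ACTED"
        Entry(long timestampMillis, String process, String messageId, String event) {
            this.timestampMillis = timestampMillis;
            this.process = process;
            this.messageId = messageId;
            this.event = event;
        }
    }

    // Merge per-process logs into one timeline ordered by timestamp.
    static List<Entry> merge(List<List<Entry>> logs) {
        List<Entry> all = new ArrayList<>();
        for (List<Entry> log : logs) {
            all.addAll(log);
        }
        all.sort(Comparator.comparingLong((Entry e) -> e.timestampMillis));
        return all;
    }

    // List messages whose gap between RECEIVED and ACTED, within the same
    // process, exceeds the threshold.
    static List<String> slowHandoffs(List<Entry> timeline, long thresholdMillis) {
        Map<String, Long> receivedAt = new HashMap<>();
        List<String> findings = new ArrayList<>();
        for (Entry e : timeline) {
            String key = e.process + "/" + e.messageId;
            if (e.event.equals("RECEIVED")) {
                receivedAt.put(key, e.timestampMillis);
            } else if (e.event.equals("ACTED")) {
                Long received = receivedAt.remove(key);
                if (received != null && e.timestampMillis - received > thresholdMillis) {
                    findings.add(key + " waited " + (e.timestampMillis - received) + " ms");
                }
            }
        }
        return findings;
    }
}
```

The output is only a list of places to look. Working out why a given message waited (for example, being "in line" behind a process doing I/O) still means reading that stretch of the merged timeline by hand.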

Each pass through this activity took me about a day: generate logs, combine them, find a speedup opportunity, and revise code. At this rate, after about 10 working days, I had found and fixed a number of problems, and improved the speed dramatically.

What these two steps have in common is that I'm not measuring or trying to get "statistics". If something is spending too much time, that very fact exposes it to a diligent programmer taking a close, meticulous look at what is happening.

Mike Dunlavey answered Sep 21 '22 18:09
