Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance Cost of Profiling a Web-Application in Production

I am attempting to solve performance issues with a large and complex tomcat java web application. The biggest issue at the moment is that, from time to time, the memory usage spikes and the application becomes unresponsive. I've fixed everything I can fix with log profilers and Bayesian analysis of the log files. I'm considering running a profiler on the production tomcat server.

A Note to the Reader with Gentle Sensitivities:

I understand that some may find the very notion of profiling a production app offensive. Please be assured that I have exhausted most of the other options. The reason I am considering this is that I do not have the resources to completely duplicate our production setup on my test server, and I have been unable to cause the failures of interest on my test server.

Questions:

I am looking for answers which work either for a java web application running on tomcat, or answer this question in a language agnostic way.

  • What are the performance costs of profiling?
  • Any other reasons why it is a bad idea to remotely connect and profile a web application in production (strange failure modes, security issues, etc)?
  • How much does profiling effect the memory foot print?
  • Specifically are there java profiling tools that have very low performance costs?
  • Any java profiling tools designed for profiling web applications?
  • Does anyone have benchmarks on the performance costs of profiling with visualVM?
  • What size applications and datasets can visualVM scale to?
like image 505
Ethan Heilman Avatar asked Jul 30 '09 16:07

Ethan Heilman


People also ask

What is application performance profiling?

Performance profilers are software development tools designed to help you analyze the performance of your applications and improve poorly performing sections of code.

What is profiling in web application?

In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls.

How would you profile a Java application in the production environment?

After the application is started, the agent will start profiling the application automatically with low-overhead. The profiles will be sent to the Dashboard, where a timeline of various profiles can be analyzed at any time. The Java agent currently supports CPU and lock profiling.


2 Answers

OProfile and its ancestor DPCI were developed for profiling production systems. The overhead for these is very low, and they profile your full system, including the kernel, so you can find performance problems in the VM and in the kernel and libraries.

To answer your questions:

  1. Overhead: These are sampled profilers, that is, they generate timer or performance counter interrupts at some regular interval, and they take a look at what code is currently executing. They use that to build a histogram of where you spend your time, and the overhead is very low (1-8% is what they claim) for reasonable sampling intervals.

    Take a look at this graph of sampling frequency vs. overhead for OProfile. You can tune the sampling frequency for lower overhead if the defaults are not to your liking.

  2. Usage in production: The only caveat to using OProfile is that you'll need to install it on your production machine. I believe there's kernel support in Red Hat since RHEL3, and I'm pretty sure other distributions support it.

  3. Memory: I'm not sure what the exact memory footprint of OProfile is, but I believe it keeps relatively small buffers around and dumps them to log files occasionally.

  4. Java: OProfile includes profiling agents that support Java and that are aware of code running in JITs. So you'll be able to see Java calls, not just the C calls in the interpreter and JIT.

  5. Web Apps: OProfile is a system-level profiler, so it's not aware of things like sessions, transactions, etc. that a web app would have.

    That said, it is a full-system profiler, so if your performance problem is caused by bad interactions between the OS and the JIT, or if it's in some third-party library, you'll be able to see that, because OProfile profiles the kernel and libraries. This is an advantage for production systems, as you can catch problems that are due to misconfigurations or particulars of the production environment that might not exist in your test environment.

  6. VisualVM: Not sure about this one, as I have no experience with VisualVM

Here's a tutorial on using OProfile to find performance bottlenecks.

like image 65
Todd Gamblin Avatar answered Sep 22 '22 22:09

Todd Gamblin


I've used YourKit to profile apps in a high-load production environment, and while there was certainly an impact, it was easily an acceptable one. Yourkit makes a big deal of being able to do this in a non-invasive manner, such as selectively turning off certain profiling features that are more expensive (it's a sliding scale, really).

My favourite aspect of it is that you can run the VM with the YourKit agent running, and it has zero performance impact. it's only when you connect the GUI and start profiling that it has an effect.

like image 25
skaffman Avatar answered Sep 23 '22 22:09

skaffman