Can anyone please help me identify the exact area of the problem? Is it the JVM, Log4j, or something else in our application?
We are running a multi-threaded application on JDK 1.6.0_24 on a Solaris 10 (SUNW,Sun-Fire-V240) server. The application uses RMI calls to communicate with its clients.
Our application hung. I found the OutOfMemoryError below in the thread dump. As I understand it, it was thrown because GC was only able to reclaim about 2% of the object memory.
    # java.lang.OutOfMemoryError: GC overhead limit exceeded
    Heap
     PSYoungGen      total 304704K, used 154560K [0xe0400000, 0xfbc00000, 0xfbc00000)
      eden space 154560K, 100% used [0xe0400000,0xe9af0000,0xe9af0000)
      from space 150144K, 0% used [0xf2960000,0xf2960000,0xfbc00000)
      to   space 145856K, 0% used [0xe9af0000,0xe9af0000,0xf2960000)
     PSOldGen        total 897024K, used 897023K [0xa9800000, 0xe0400000, 0xe0400000)
      object space 897024K, 99% used [0xa9800000,0xe03ffff0,0xe0400000)
     PSPermGen       total 28672K, used 27225K [0xa3c00000, 0xa5800000, 0xa9800000)
      object space 28672K, 94% used [0xa3c00000,0xa5696580,0xa5800000)
In my case, I believe the GC cannot reclaim memory because of the large number of waiting threads. Looking at the thread dump, most of the threads are waiting to acquire the lock on an org.apache.log4j.Logger. We are using log4j 1.2.15.
In the first thread's trace below, the thread holds locks on two objects, and roughly 50 other threads are waiting to acquire them. Almost the same trace can be seen for 20 minutes.
Here is the thread dump:
pool-3-thread-51" prio=3 tid=0x00a38000 nid=0xa4 runnable [0xa0d5f000] java.lang.Thread.State: RUNNABLE at java.text.DateFormat.format(DateFormat.java:316) at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:443) at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) at org.apache.log4j.PatternLayout.format(PatternLayout.java:506) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276) at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:296) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) "Timer-3" prio=3 tid=0x0099a800 nid=0x53 waiting for monitor entry [0xa1caf000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.spi.RootLogger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.controlapp.export.AbstractOMDataCollector.run(AbstractOMDataCollector.java:100) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) "TrapHandlerThreadPool:Thread-10" prio=3 tid=0x014dac00 nid=0x4f waiting for monitor entry [0xa1d6f000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.db.ConnectionPool.printDataSourceStats(ConnectionPool.java:146) at com.airvana.faultServer.db.SQLUtil.freeConnection(SQLUtil.java:267) at com.airvana.faultServer.db.DbAPI.addEventOrAlarmOptimized(DbAPI.java:904) at com.airvana.faultServer.eventProcessing.EventProcessor.processEvent(EventProcessor.java:24) at com.airvana.faultServer.filters.BasicTrapFilter.processTrap(BasicTrapFilter.java:80) at com.airvana.faultServer.eventEngine.EventEngine.notifyTrapProcessors(EventEngine.java:314) at com.airvana.faultServer.eventEngine.NodewiseTrapQueue.run(NodewiseTrapQueue.java:94) at com.airvana.common.utils.ThreadPool$PoolThread.run(ThreadPool.java:356) "RMI TCP 
Connection(27927)-10.193.3.41" daemon prio=3 tid=0x0186c800 nid=0x1d53 waiting for monitor entry [0x9f84e000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.processCommunications.ConfigAppCommReceiver.sendEvent(ConfigAppCommReceiver.java:178) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) "pool-3-thread-49" prio=3 tid=0x01257800 nid=0xa1 waiting for monitor entry [0xa0def000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.Category.callAppenders(Category.java:204) - waiting to lock (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.processSeqNumber(NioNotificationHandler.java:548) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:301) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:223) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:149) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipe "pool-3-thread-44" prio=3 tid=0x00927800 nid=0x9b waiting for monitor entry [0xa0f0f000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.Category.callAppenders(Category.java:204) - waiting to lock (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:296) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:223) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:149) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:221) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:76) at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:314) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)
An OutOfMemoryError due to GC overhead limit happens when the JVM decides that too large a percentage of time is spent running the garbage collector. This is a classic sign that the heap is nearly full.
If the heap is too full the JVM spends an increasing percentage of time garbage collecting to reclaim a decreasing amount of memory. The corresponding percentage of time left to do useful work decreases.
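As a rough illustration (not code from your application), a program that keeps every allocation reachable shows exactly this failure mode: once the heap is almost entirely live data, each collection reclaims next to nothing, and HotSpot throws this error rather than spinning in back-to-back GC cycles. The class name is hypothetical, and depending on the collector you may see a plain "Java heap space" instead:

    // Hypothetical demo, valid on JDK 6. Run with a small heap, e.g.
    //   java -Xmx64m GcOverheadDemo
    // and it should eventually die with
    //   java.lang.OutOfMemoryError: GC overhead limit exceeded
    import java.util.ArrayList;
    import java.util.List;

    public class GcOverheadDemo {
        public static void main(String[] args) {
            List<long[]> retained = new ArrayList<long[]>();
            while (true) {
                // Every array stays reachable, so each GC reclaims almost nothing
                // while still consuming nearly all of the CPU time.
                retained.add(new long[1024]);
            }
        }
    }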
My hypothesis is that your loggers are backing up due to not getting enough time between GC runs to cope with the logging rate. The large number of blocked threads is thus a secondary symptom rather than the root cause of your problems.
Assuming that the above is true, the short-term fix is to restart your application with JVM options for a larger heap size. You could also change the GC overhead threshold so that your application dies sooner. (This may seem strange, but it is probably better for your application to die quickly than to grind to a standstill over many minutes or hours.)
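For example (the heap sizes and jar name below are placeholders, not values from your setup), you could raise -Xmx and tighten the HotSpot knobs behind the overhead check: -XX:GCTimeLimit (default 98, the percentage of time in GC that triggers the error) and -XX:GCHeapFreeLimit (default 2, the minimum percentage of heap a collection must free). Lowering the first and raising the second makes the error fire earlier:

    # Placeholder values -- tune for your server; faultServer.jar is assumed.
    java -Xms1024m -Xmx2048m \
         -XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5 \
         -jar faultServer.jar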
The real fix is to track down why you are running out of heap space. You need to enable GC logging and watch the memory usage trends as the application runs ... over hours, days, weeks. If you notice that the memory usage is growing in the long term, the chances are that you have a memory leak of some kind. You'll need to use a memory profiler to track it down.
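On JDK 6 the standard HotSpot flags for this are -verbose:gc, -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps, and -Xloggc; the log path and jar name below are placeholders:

    # Write timestamped GC details to a file for long-term trend analysis.
    java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
         -Xloggc:/var/log/faultServer-gc.log \
         -jar faultServer.jar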