
No hs_err_pid.log file created and core dumped from jvm on Solaris

Problem description

After my Java server application has been running for a while, I see strange behaviour from the Oracle Java virtual machine on Solaris. Normally, when the JVM crashes, an hs_err_pid.log file is created (its location is determined by the -XX:ErrorFile JVM parameter, as explained here: How can I suppress the creation of the hs_err_pid file?).

But in my case the file was not created; the only thing left was the core dump file.

Using the standard Solaris tools pstack and pflags, I was able to gather more information about the crash from the core file (included below).
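Before searching the disk, it can help to confirm from inside the running process whether -XX:ErrorFile was passed at all. The following is only a minimal sketch: the class name ErrorFileCheck and the helper errorFileOption are hypothetical, and it assumes that a later -XX:ErrorFile flag overrides an earlier one. It relies only on the standard RuntimeMXBean.getInputArguments() call.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports whether -XX:ErrorFile was passed to this JVM.
final class ErrorFileCheck {
    static final String PREFIX = "-XX:ErrorFile=";

    // Returns the configured error-file path, or null if the flag is absent.
    static String errorFileOption(List<String> jvmArgs) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                // Assumption: if the flag is repeated, the last one wins.
                found = arg.substring(PREFIX.length());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments the JVM itself was started with (not the application arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String path = errorFileOption(jvmArgs);
        if (path != null) {
            System.out.println("ErrorFile = " + path);
        } else {
            // Without the flag, the working directory is the first place to look.
            System.out.println("ErrorFile not set; working directory = "
                    + System.getProperty("user.dir"));
        }
    }
}
```

With the arguments listed under "Used jvm args" the flag is absent, so the working directory of the process is the expected location.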

Tried solutions

  • I searched the whole file system for hs_err_pid.log files, including outside the application working directory, but found nothing:

    find / -name "hs_err_pid*"

  • I searched for known JVM bugs related to this behaviour, but found nothing similar to this case.

  • The problem looks somewhat similar to: Java VM: reproducable SIGSEGV on both 1.6.0_17 and 1.6.0_18, how to report? but I still cannot confirm this, since the hs_err_pid.log file is missing and, of course, the OS platform is different.
  • (EDIT) As suggested in one of the answers to the Tool for analyzing java core dump question, I extracted a heap dump from the core file using jmap and analysed it with Eclipse MAT. I found a leak (elements added to a HashMap and never removed; 1.4 million elements at the time of the core dump). This, however, explains neither why the hs_err_pid.log file was not generated nor why the JVM crashed.
  • (EDIT2) As suggested by Darryl Miles, the -Xmx limits have been checked (Test contains code that adds objects to a LinkedList indefinitely):
    • java -Xmx1444m Test results in java.lang.OutOfMemoryError: Java heap space,
    • java -Xmx2048m Test results in java.lang.OutOfMemoryError: Java heap space,
    • java -Xmx3600m Test results in a core dump.
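The source of Test is not shown in the question. A minimal sketch of a program of that kind might look like the following; the class name HeapFillTest, the helper fill, and the 1 KB block size are assumptions made for illustration, not the original code.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for the Test class mentioned above.
final class HeapFillTest {

    // Appends `count` blocks of `blockBytes` bytes each and returns the new list size.
    static int fill(List<byte[]> sink, int count, int blockBytes) {
        for (int i = 0; i < count; i++) {
            sink.add(new byte[blockBytes]);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        // The list is never cleared, so every block stays reachable.
        List<byte[]> leak = new LinkedList<byte[]>();
        // Loops until the JVM throws OutOfMemoryError or the process is killed.
        while (true) {
            fill(leak, 1024, 1024); // roughly 1 MB of payload per iteration
        }
    }
}
```

Running it with different -Xmx values, as in the list above, shows whether the JVM reports the exhaustion itself or the process dies first.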

The question

Has anyone experienced a similar problem with the JVM, and how should I proceed in such cases to find out what actually happened (i.e. under what circumstances does the JVM dump core without creating an hs_err_pid.log file)?

Any tip or pointer towards resolving this would be very helpful.

Extracted flags

# pflags core
...
/2139095:      flags = DETACH
    sigmask = 0xfffffeff,0x0000ffff  cursig = SIGSEGV

Extracted stack

# pstack core
...
-----------------  lwp# 2139095 / thread# 2139095  --------------------
 fb208c3e ???????? (f25daee0, f25daec8, 74233960, 776e3caa, 74233998, 776e64f0)
 fb20308d ???????? (0, 1, f25db030, f25daee0, f25daec8, 7423399c)
 fb20308d ???????? (0, 0, 50, f25da798, f25daec8, f25daec8)
 fb20308d ???????? (0, 0, 50, f25da798, 8561cbb8, f25da988)
 fb203403 ???????? (f25da988, 74233a48, 787edef5, 74233a74, 787ee8a0, 0)
 fb20308d ???????? (0, f25da988, 74233a78, 76e2facf, 74233aa0, 76e78f70)
 fb203569 ???????? (f25da9b0, 8b5b400, 8975278, 1f80, fecd6000, 1)
 fb200347 ???????? (74233af0, 74233d48, a, 76e2fae0, fb208f60, 74233c58)
 fe6f4b0b __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (74233d44, 74233bc8, 74233c54, 8b5b400) + 1a3
 fe6f4db3 __1cCosUos_exception_wrapper6FpFpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v2468_v_ (fe6f4968, 74233d44, 74233bc8, 74233c54, 8b5b400) + 27
 fe6f4deb __1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (74233d44, 8975278, 74233c54, 8b5b400) + 2f
 fe76826d __1cJJavaCallsMcall_virtual6FpnJJavaValue_nLKlassHandle_nMsymbolHandle_4pnRJavaCallArguments_pnGThread__v_ (74233d44, 897526c, fed2d464, fed2d6d0, 74233c54, 8b5b400) + c1
 fe76f4fa __1cJJavaCallsMcall_virtual6FpnJJavaValue_nGHandle_nLKlassHandle_nMsymbolHandle_5pnGThread__v_ (74233d44, 8975268, 897526c, fed2d464, fed2d6d0, 8b5b400) + 7e
 fe7805f6 __1cMthread_entry6FpnKJavaThread_pnGThread__v_ (8b5b400, 8b5b400) + d2
 fe77cbe4 __1cKJavaThreadRthread_main_inner6M_v_ (8b5b400) + 4c
 fe77cb8e __1cKJavaThreadDrun6M_v_ (8b5b400) + 182
 feadbd59 java_start (8b5b400) + f9
 feed59a9 _thr_setup (745c5200) + 4e
 feed5c90 _lwp_start (745c5200, 0, 0, 74233ff8, feed5c90, 745c5200)

System information:

# uname -a
SunOS xxxx 5.10 Generic_137138-09 i86pc i386 i86pc
# java -version
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) Server VM (build 11.0-b16, mixed mode)
# ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 10240
coredump(blocks) unlimited
nofiles(descriptors) 256
memory(kbytes) unlimited

Used jvm args:

java -Xms1024M -Xmx2048M -verbose:gc -Xloggc:logs/gc.log -server com.example.MyApplication

Please comment if any information is missing; I'll try to add it.

pkk asked Sep 26 '11 12:09



2 Answers

1.6.0_11 is quite old and I have no recent experience with it; I really recommend upgrading.

However, the crash file may not be written if a stack overflow occurs in native code, i.e. when some native function (like the write of FileOutputStream; sockets use the same implementation) is called with very little stack left. Even though the JVM attempts to write the file, there is not enough stack and the writing code crashes as well. The second stack overflow simply kills the process.

I did have a similar case (no file created) on a production system and it was not pretty to trace, yet the above explains the reason.

bestsss answered Sep 28 '22 10:09


As per my comments above, I believe this issue is caused by running out of usable heap in the 32-bit address space because the -Xmx value was set too high. This forced the kernel to police the limit (by denying requests for new memory) before the JVM could police it (through the controlled OutOfMemoryError mechanism). Unfortunately, I do not know the specifics of Solaris on Intel well enough to say what is to be expected from that platform.

But as a general rule for Windows, the maximum -Xmx might be 1800M, reduced by 16M per additional application thread you create, since each thread needs stack space (both native and Java stack) as well as other per-thread bookkeeping such as thread-local storage. The result of this calculation should approximate the realistic usable heap space of a Java VM in any 32-bit process whose operating system uses a 2G/2G split (user/kernel).
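The rule of thumb above can be written out as a small calculation. This is only a sketch of that arithmetic: the 1800M base and the 16M per-thread figure are the rough numbers from this answer, not measured values, and the class and method names are hypothetical.

```java
// Hypothetical helper that applies the rule of thumb from the answer.
final class HeapEstimate {
    static final int BASE_MB = 1800;       // rough ceiling for a 2G/2G 32-bit process
    static final int PER_THREAD_MB = 16;   // rough per-thread cost (stacks, TLS, ...)

    // Returns an approximate usable -Xmx value in megabytes, never below zero.
    static int estimateMaxHeapMb(int appThreads) {
        if (appThreads < 0) {
            throw new IllegalArgumentException("appThreads must be >= 0");
        }
        return Math.max(0, BASE_MB - PER_THREAD_MB * appThreads);
    }

    public static void main(String[] args) {
        int[] threadCounts = {0, 10, 50};
        for (int n : threadCounts) {
            System.out.println(n + " threads -> about " + estimateMaxHeapMb(n) + "M");
        }
    }
}
```

For example, 50 additional threads would leave roughly 1800 - 50 * 16 = 1000M under this estimate.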

With WinXP and above it is possible to use the /3GB boot switch to get a higher split (3G/1G user/kernel). Linux has a /proc/<pid>/maps file that shows exactly how a given process's address space is laid out. If you were running this application there, you could watch over time as the [heap] grows to meet the shared file mappings used for .text/.rodata/.data/etc. from DSOs, at which point the kernel denies requests to grow the heap.

This problem goes away on 64-bit because there is so much more address space that you will run out of physical and virtual (swap) memory before the heap meets the other mappings.

I believe truss on Solaris would have shown a brk/sbrk system call returning an error code shortly before the core dump. Parts of the standard native libraries never check the return code from requests for new memory, and as a result crashes can be expected.
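On Linux, the address-space layout can be read from inside the Java process itself. The sketch below is Linux-only and illustrative: it assumes the usual "start-end perms offset dev inode path" line format of /proc/self/maps, and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.math.BigInteger;

// Hypothetical helper: prints the size of each mapping of the current process.
final class MapsLine {

    // Parses the "start-end" hex range at the start of a maps line and
    // returns the mapping size in bytes.
    static long mappingSize(String line) {
        String range = line.trim().split("\\s+")[0];
        int dash = range.indexOf('-');
        // BigInteger avoids overflow for addresses in the top half of a 64-bit space.
        BigInteger start = new BigInteger(range.substring(0, dash), 16);
        BigInteger end = new BigInteger(range.substring(dash + 1), 16);
        return end.subtract(start).longValue();
    }

    public static void main(String[] args) throws IOException {
        File maps = new File("/proc/self/maps");
        if (!maps.exists()) {
            System.out.println("/proc/self/maps not available on this platform");
            return;
        }
        BufferedReader in = new BufferedReader(new FileReader(maps));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println((mappingSize(line) / 1024) + " KB  " + line);
            }
        } finally {
            in.close();
        }
    }
}
```

Running this periodically and watching the line tagged [heap] shows how close the native heap is getting to the neighbouring mappings.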

Darryl Miles answered Sep 28 '22 10:09