Problem description
After running my Java server application for a while, I am experiencing strange behaviour of the Oracle Java virtual machine on Solaris. Normally, when the JVM crashes, an hs_err_pid.log file gets created (its location is determined by the -XX:ErrorFile JVM parameter, as explained here: How can I suppress the creation of the hs_err_pid file?).
In my case, however, that file was not created; the only thing left was the core dump file (core).
Using the standard Solaris tools pstack and pflags I was able to gather more information about the crash from the core file (included below).
Tried solutions
I tried to find all hs_err_pid.log files across the file system, but nothing could be found (even outside the application working directory), i.e.:
find / -name "hs_err_pid*"
I tried to find known JVM bugs related to this, but I couldn't find anything interesting that matches this case (the hs_err_pid.log file missing), and of course the OS platform is different in the reports I did find.
I extracted a heap dump from the core file using jmap and analysed it with Eclipse MAT. I found a memory leak (elements added to a HashMap and never removed; about 1.4 million of them at the time of the core dump). This, however, explains neither why the hs_err_pid.log file was not generated nor why the JVM crashed.
I also reproduced heap exhaustion with a simple Test program that keeps adding elements to a LinkedList (see the sketch after these results) and ran it with different -Xmx settings:
java -Xmx1444m Test results in java.lang.OutOfMemoryError: Java heap space,
java -Xmx2048m Test results in java.lang.OutOfMemoryError: Java heap space,
java -Xmx3600m Test results in a core dump.
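The Test source itself is not included in the question; a minimal sketch of that kind of heap-exhaustion test, assuming it simply keeps appending to a LinkedList until the heap runs out, could look like this:

import java.util.LinkedList;

// A reconstruction for illustration only, not the original Test class:
// appends 8 KB arrays to a LinkedList until the heap is exhausted.
public class Test {
    public static void main(String[] args) {
        LinkedList<long[]> list = new LinkedList<long[]>();
        long count = 0;
        while (true) {
            list.add(new long[1024]); // roughly 8 KB per element
            count++;
            if (count % 10000 == 0) {
                System.out.println("elements added: " + count);
            }
        }
    }
}

With an -Xmx that fits comfortably into the 32-bit address space such a test ends with java.lang.OutOfMemoryError: Java heap space; as shown above, with -Xmx3600m the process dumped core instead.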
The question
Has anyone experienced a similar problem with the JVM, and how should one proceed in such cases to find out what actually happened (i.e. in which cases does the JVM dump core without creating an hs_err_pid.log file)?
Any tip or pointer to resolving this would be very helpful.
Extracted flags
# pflags core
...
/2139095: flags = DETACH
sigmask = 0xfffffeff,0x0000ffff cursig = SIGSEGV
Extracted stack
# pstack core
...
----------------- lwp# 2139095 / thread# 2139095 --------------------
fb208c3e ???????? (f25daee0, f25daec8, 74233960, 776e3caa, 74233998, 776e64f0)
fb20308d ???????? (0, 1, f25db030, f25daee0, f25daec8, 7423399c)
fb20308d ???????? (0, 0, 50, f25da798, f25daec8, f25daec8)
fb20308d ???????? (0, 0, 50, f25da798, 8561cbb8, f25da988)
fb203403 ???????? (f25da988, 74233a48, 787edef5, 74233a74, 787ee8a0, 0)
fb20308d ???????? (0, f25da988, 74233a78, 76e2facf, 74233aa0, 76e78f70)
fb203569 ???????? (f25da9b0, 8b5b400, 8975278, 1f80, fecd6000, 1)
fb200347 ???????? (74233af0, 74233d48, a, 76e2fae0, fb208f60, 74233c58)
fe6f4b0b __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (74233d44, 74233bc8, 74233c54, 8b5b400) + 1a3
fe6f4db3 __1cCosUos_exception_wrapper6FpFpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v2468_v_ (fe6f4968, 74233d44, 74233bc8, 74233c54, 8b5b400) + 27
fe6f4deb __1cJJavaCallsEcall6FpnJJavaValue_nMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (74233d44, 8975278, 74233c54, 8b5b400) + 2f
fe76826d __1cJJavaCallsMcall_virtual6FpnJJavaValue_nLKlassHandle_nMsymbolHandle_4pnRJavaCallArguments_pnGThread__v_ (74233d44, 897526c, fed2d464, fed2d6d0, 74233c54, 8b5b400) + c1
fe76f4fa __1cJJavaCallsMcall_virtual6FpnJJavaValue_nGHandle_nLKlassHandle_nMsymbolHandle_5pnGThread__v_ (74233d44, 8975268, 897526c, fed2d464, fed2d6d0, 8b5b400) + 7e
fe7805f6 __1cMthread_entry6FpnKJavaThread_pnGThread__v_ (8b5b400, 8b5b400) + d2
fe77cbe4 __1cKJavaThreadRthread_main_inner6M_v_ (8b5b400) + 4c
fe77cb8e __1cKJavaThreadDrun6M_v_ (8b5b400) + 182
feadbd59 java_start (8b5b400) + f9
feed59a9 _thr_setup (745c5200) + 4e
feed5c90 _lwp_start (745c5200, 0, 0, 74233ff8, feed5c90, 745c5200)
System information:
# uname -a
SunOS xxxx 5.10 Generic_137138-09 i86pc i386 i86pc
# java -version
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) Server VM (build 11.0-b16, mixed mode)
# ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 10240
coredump(blocks) unlimited
nofiles(descriptors) 256
memory(kbytes) unlimited
Used jvm args:
java -Xms1024M -Xmx2048M -verbose:gc -Xloggc:logs/gc.log -server com.example.MyApplication
Please comment if you find any information missing; I'll try to add it.
JVM crash log files are named hs_err_pid<pid>.log, where <pid> is the process id of the JVM that crashed; by default they are written to the working directory of the crashed process.
The product flag -XX:ErrorFile=file can be used to specify where the file will be created, where file represents the full path for the file location. The substring %% in the file variable is converted to %, and the substring %p is converted to the process ID of the process.
1.6.0_11 is quite old and I have no recent experience with it; I really recommend an upgrade there...
However, no crash dump may be written when a stack overflow occurs in native code, i.e. when some native function (like the write of FileOutputStream; sockets use the same implementation) is called with very little stack left. So even though the JVM attempts to write the hs_err file, there is not enough stack and the error-reporting code crashes as well; the second stack overflow just bails out the process.
I did have a similar case (no file created) on a production system and it was not pretty to trace, but the above explains the reason.
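A rough illustration of the pattern described here (deep recursion that finally fails inside a native call such as FileOutputStream.write) might look like the sketch below; this is not from the question, and whether you get a catchable StackOverflowError or a SIGSEGV inside native code (with no hs_err_pid.log) depends on exactly where the stack runs out:

import java.io.FileOutputStream;
import java.io.IOException;

// Illustration only: each frame consumes Java stack and then enters native
// I/O code, so the stack may finally run out inside the native write call.
public class NativeStackOverflow {
    private static FileOutputStream out;

    private static void recurse(int depth) throws IOException {
        out.write(depth & 0xFF); // FileOutputStream.write ends in native code
        recurse(depth + 1);
    }

    public static void main(String[] args) throws IOException {
        out = new FileOutputStream("/dev/null"); // assumes a Unix-like system
        try {
            recurse(0);
        } catch (StackOverflowError e) {
            // The benign outcome: the overflow was detected in Java code.
            System.out.println("Caught " + e);
        } finally {
            out.close();
        }
    }
}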
As per my comments above, I believe this issue is caused by running out of usable heap in the 32-bit address space by setting too high a -Xmx value. This forced the kernel to police the limit (by denying requests for new memory) before the JVM could police it (via its controlled OutOfMemoryError mechanism). Unfortunately I do not know the specifics of Intel Solaris well enough to say what to expect from that platform.
But as a general rule on Windows, a maximum -Xmx might be 1800M, reduced by roughly 16M per additional application thread you create, since each thread needs stack space (both native and Java stack) as well as other per-thread overhead such as thread-local storage. The result of this calculation gives an approximation of the realistic usable heap space of a Java VM in any 32-bit process whose operating system uses a 2G/2G (user/kernel) split.
With WinXP and above it is possible to use the /3GB switch on the kernel to get a higher split (3G/1G user/kernel), and Linux has a /proc/<pid>/maps file that lets you see exactly how the address space of a given process is laid out (Solaris has the pmap utility for the same purpose). If you watched this application over time you would see the [heap] grow until it meets the shared file mappings used for .text/.rodata/.data/etc. from DSOs; at that point the kernel starts denying requests to grow the heap.
This problem goes away on 64-bit, because there is so much more address space that you will run out of physical and virtual (swap) memory before the heap meets the other mappings.
I believe truss on Solaris would have shown a brk/sbrk system call returning an error code shortly before the core dump. Parts of the standard native libraries never check the return code from requests for new memory, so crashes can be expected.
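As a quick sanity check of how much heap a given 32-bit JVM actually thinks it can use for a particular -Xmx, a small diagnostic class (hypothetical, not part of the question or this answer) can be started with the same flags as the server:

// Prints the heap limits the running JVM is actually working with.
public class HeapReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        System.out.println("maxMemory   (heap limit seen by the JVM): " + rt.maxMemory() / mb + " MB");
        System.out.println("totalMemory (currently committed heap)  : " + rt.totalMemory() / mb + " MB");
        System.out.println("freeMemory  (free within committed heap): " + rt.freeMemory() / mb + " MB");
    }
}

java -Xmx3600m HeapReport
If the JVM cannot reserve the requested heap in the 32-bit address space at all, it typically refuses to start with "Error occurred during initialization of VM: Could not reserve enough space for object heap", which is another hint that -Xmx is set too close to the platform limit.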