I'm running a custom 2.6.27 kernel and I just noticed that the core files produced during a segfault can be larger than the hard core file size limit set for the process.
And what makes it weirder is that the core file is only sometimes truncated (and even then, not to the limit set by ulimit).
For example, this is the program I will try to crash below:
#include <stdlib.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    // Take the hard and soft limit from the command line
    struct rlimit new = {atoi(argv[1]), atoi(argv[1])};

    // Allocate some memory so as to beef up the core file size
    void *p = malloc(10 * 1024 * 1024);
    if (!p)
        return 1;

    // Set the hard and soft limit for core files
    // produced by this process
    if (setrlimit(RLIMIT_CORE, &new))
        return 2;

    while (1);  // Spin until killed by a core-dumping signal

    free(p);
    return 0;
}
And here's the execution:
Linux# ./a.out 1446462 & ## Set hard and soft limit to ~1.4 MB
[1] 14802
Linux# ./a.out 1446462 &
[2] 14803
Linux# ./a.out 1446462 &
[3] 14804
Linux# ./a.out 1446462 &
[4] 14807
Linux# cat /proc/14802/limits | grep core
Max core file size 1446462 1446462 bytes
Linux# killall -QUIT a.out
Linux# ls -l
total 15708
-rwxr-xr-x 1 root root 4624 Aug 1 18:28 a.out
-rw------- 1 root root 12013568 Aug 1 18:39 core.14802 <=== truncated core
-rw------- 1 root root 12017664 Aug 1 18:39 core.14803
-rw------- 1 root root 12013568 Aug 1 18:39 core.14804 <=== truncated core
-rw------- 1 root root 12017664 Aug 1 18:39 core.14807
[1] Quit (core dumped) ./a.out 1446462
[2] Quit (core dumped) ./a.out 1446462
[3] Quit (core dumped) ./a.out 1446462
[4] Quit (core dumped) ./a.out 1446462
So multiple things happened here. I set the hard limit for each process to about 1.4 MB.

1. The core files produced far exceed that limit. Why?
2. Two of the four core files were truncated, but by exactly 4096 bytes relative to the others, not down to the configured limit. What's going on there?

I know the core file contains, among other things, the full stack and the heap memory allocated. Shouldn't that be pretty much constant for such a simple program (give or take a few bytes at most), hence producing a consistent core size between multiple instances?
EDITS:
1. The requested output of du:
Linux# du core.*
1428 core.14802
1428 core.14803
1428 core.14804
1428 core.14807
Linux# du -b core.*
12013568 core.14802
12017664 core.14803
12013568 core.14804
12017664 core.14807
2. Adding memset() after malloc() definitely reined things in, in that the core files are now all truncated to 1449984 bytes (still 3522 bytes over the limit).
So why were the cores so big before, and what did they contain? Whatever it was, it wasn't subject to the process's limits.
3. The new program shows some interesting behaviour as well:
Linux# ./a.out 12017664 &
[1] 26586
Linux# ./a.out 12017664 &
[2] 26589
Linux# ./a.out 12017664 &
[3] 26590
Linux# ./a.out 12017663 & ## 1 byte smaller
[4] 26653
Linux# ./a.out 12017663 & ## 1 byte smaller
[5] 26666
Linux# ./a.out 12017663 & ## 1 byte smaller
[6] 26667
Linux# killall -QUIT a.out
Linux# ls -l
total ..
-rwxr-xr-x 1 root root 4742 Aug 1 19:47 a.out
-rw------- 1 root root 12017664 Aug 1 19:47 core.26586
-rw------- 1 root root 12017664 Aug 1 19:47 core.26589
-rw------- 1 root root 12017664 Aug 1 19:47 core.26590
-rw------- 1 root root 1994752 Aug 1 19:47 core.26653 <== ???
-rw------- 1 root root 9875456 Aug 1 19:47 core.26666 <== ???
-rw------- 1 root root 9707520 Aug 1 19:47 core.26667 <== ???
[1] Quit (core dumped) ./a.out 12017664
[2] Quit (core dumped) ./a.out 12017664
[3] Quit (core dumped) ./a.out 12017664
[4] Quit (core dumped) ./a.out 12017663
[5] Quit (core dumped) ./a.out 12017663
[6] Quit (core dumped) ./a.out 12017663
The implementation of core dumping can be found in fs/binfmt_elf.c. I'll follow the code in 3.12 and above (it changed with commit 9b56d5438), but the logic is very similar.
The code initially decides how much of a VMA (virtual memory area) to dump in vma_dump_size. For an anonymous VMA such as the brk heap, it returns the full size of the VMA. During this step, the core limit is not involved.
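Condensed for illustration, the relevant branch looks roughly like this (3.12 naming; the vdso, hugetlb, shared-mapping and file-backed cases are elided):

/* Condensed from vma_dump_size() in fs/binfmt_elf.c (3.12) */
static unsigned long vma_dump_size(struct vm_area_struct *vma,
                                   unsigned long mm_flags)
{
        /* ... vdso, shared-mapping and other filter cases elided ... */

        /* Anonymous private memory (e.g. the brk heap) is dumped whole
           when the ANON_PRIVATE bit is set in the coredump filter. */
        if (vma->anon_vma && (mm_flags & (1UL << MMF_DUMP_ANON_PRIVATE)))
                return vma->vm_end - vma->vm_start;  /* full VMA size */

        /* ... file-backed cases elided ... */
        return 0;  /* otherwise nothing is dumped for this VMA */
}

Note that the core file limit appears nowhere in this decision; the full heap VMA is scheduled for dumping regardless.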
The first phase of writing the core dump then writes a PT_LOAD header for each VMA. This is basically a pointer that says where to find the data in the remainder of the ELF file. The actual data is written by a for loop, which is actually a second phase.
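Condensed, the header-writing pass looks roughly like this (3.12 naming; flags, alignment and the actual header write are elided):

/* Condensed from the PT_LOAD pass in elf_core_dump(), fs/binfmt_elf.c (3.12) */
for (vma = first_vma(current, gate_vma); vma != NULL;
     vma = next_vma(vma, gate_vma)) {
        struct elf_phdr phdr;

        phdr.p_type   = PT_LOAD;
        phdr.p_offset = offset;                  /* where the data will sit */
        phdr.p_vaddr  = vma->vm_start;
        phdr.p_filesz = vma_dump_size(vma, cprm->mm_flags);
        phdr.p_memsz  = vma->vm_end - vma->vm_start;
        offset += phdr.p_filesz;                 /* reserve room for phase two */
        /* ... p_flags, p_align and writing phdr out elided ... */
}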
During the second phase, elf_core_dump repeatedly calls get_dump_page to get a struct page pointer for each page of the program address space that has to be dumped. get_dump_page is a common utility function found in mm/gup.c. The comment above get_dump_page is helpful:
* Returns NULL on any kind of failure - a hole must then be inserted into
* the corefile, to preserve alignment with its headers; and also returns
* NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found -
* allowing a hole to be left in the corefile to save diskspace.
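Condensed, the second-phase loop looks roughly like this (3.12 naming; the page-writing path and error handling are elided, and the skip function is discussed next):

/* Condensed from the dumping loop in elf_core_dump(), fs/binfmt_elf.c (3.12) */
for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) {
        struct page *page = get_dump_page(addr);

        if (page) {
                /* A real page: map it and write its contents to the
                   core file (elided); this path does check the limit. */
        } else {
                /* NULL: zero page, hole or unmapped gap. Skip forward
                   in the file instead of writing zeroes. */
                if (!dump_skip(cprm, PAGE_SIZE))  /* dump_seek() in 2.6.27 */
                        goto end_coredump;
        }
}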
And in fact, elf_core_dump calls a function in fs/coredump.c (dump_seek in your kernel, dump_skip in 3.12+) whenever get_dump_page returns NULL. This function calls lseek to leave a hole in the dump (actually, since this is the kernel, it calls file->f_op->llseek directly on a struct file pointer). The main difference is that dump_seek indeed did not obey the ulimit, while the newer dump_skip does.
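For comparison, here are the two functions, condensed (the fallback paths that write literal zeroes to non-seekable files are elided; exact details vary slightly by version):

/* 2.6.27-era dump_seek(): seeks past the hole with no limit check */
static int dump_seek(struct file *file, loff_t off)
{
        if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
                /* The hole is never charged against the core rlimit. */
                if (file->f_op->llseek(file, off, SEEK_CUR) < 0)
                        return 0;
                return 1;
        }
        /* ... fallback that writes literal zeroes elided ... */
        return 1;
}

/* 3.12 dump_skip(): the hole is charged against cprm->limit */
int dump_skip(struct coredump_params *cprm, size_t nr)
{
        struct file *file = cprm->file;

        if (file->f_op->llseek && file->f_op->llseek != no_llseek) {
                if (cprm->written + nr > cprm->limit)  /* obeys RLIMIT_CORE */
                        return 0;
                if (file->f_op->llseek(file, nr, SEEK_CUR) < 0)
                        return 0;
                cprm->written += nr;
                return 1;
        }
        /* ... fallback that writes literal zeroes elided ... */
        return 0;
}

That unaccounted seek is why your 2.6.27 cores blow past the limit: the skipped pages never count toward it, and, as your du output shows, they don't occupy disk blocks either (the ~12 MB cores use only about 1.4 MB on disk).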
As to why the second program shows the weird behavior, it's probably because of ASLR (address space layout randomization). Which VMA gets truncated depends on the relative order of the VMAs, which is random. You could try disabling ASLR with

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

and see if your results become more homogeneous. To re-enable ASLR, use

echo 2 | sudo tee /proc/sys/kernel/randomize_va_space