 

Minimal stack size for Linux clone call?


I've been fiddling with clone calls, and I noticed three different outcomes depending on the size of the child's stack. The following demo allocates a stack of n bytes, where n is passed as an argument, then attempts to clone.

foo.c:

#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <errno.h>

int child(void *arg)
{
    (void)arg;
    write(STDOUT_FILENO, "carpe momentum\n", 15);
    return 0;
}

int main(int argc, char **argv)
{
    long stacksize;
    pid_t pid;
    void *stack;

    if (argc < 2)
        return 1;

    errno = 0;
    stacksize = strtol(argv[1], NULL, 0);
    if (errno != 0)
        return 1;

    stack = malloc(stacksize);
    if (stack == NULL)
        return 1;

    pid = clone(child, stack + stacksize, 0, NULL);
    if (pid == -1)
        return 1;

    write(STDOUT_FILENO, "success\n", 8);

    return 0;
}

Here are my observations:

$ cc -o foo foo.c
$ ./foo 0
Segmentation fault
$ ./foo 23
Segmentation fault
$ ./foo 24
success
$ ./foo 583
success
$ ./foo 584
success
carpe momentum
$ ./foo 1048576 #1024 * 1024, amount suggested by man-page example
success
carpe momentum

Every sample I tried between 0 and 23 segfaulted, and for every sample between 24 and 583 the parent succeeded but the child was silent. Any reasonable size of 584 or more causes both to succeed.

Disassembly suggests that child only uses 16 bytes of stack space, plus at least 16 more to call write. But that's already more than the 24 bytes needed to stop segfaulting.

$ objdump -d foo
# ...
080484cb <child>:
 80484cb:       55                      push   %ebp
 80484cc:       89 e5                   mov    %esp,%ebp
 80484ce:       83 ec 08                sub    $0x8,%esp
 80484d1:       83 ec 04                sub    $0x4,%esp
 80484d4:       6a 0f                   push   $0xf
 80484d6:       68 50 86 04 08          push   $0x8048650
 80484db:       6a 01                   push   $0x1
 80484dd:       e8 be fe ff ff          call   80483a0 <write@plt>
 80484e2:       83 c4 10                add    $0x10,%esp
 80484e5:       b8 00 00 00 00          mov    $0x0,%eax
 80484ea:       c9                      leave  
 80484eb:       c3                      ret
# ...

This prompts several overlapping questions.

  • Why doesn't clone segfault between 24 and 583 bytes of stack?
  • How does child fail silently with too little stack?
  • What is all that stack space used for?
  • What is the significance of 24 and 584 bytes? How do they vary on different systems and implementations?
  • Can I calculate a minimum stack requirement? Should I?

I am on an i686 Debian system:

$ uname -a
Linux REDACTED 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) i686 GNU/Linux
nebuch asked Aug 14 '16 06:08


1 Answer

  • Why doesn't clone segfault between 24 and 583 bytes of stack?

The child does segfault, but because it is a separate process, you don't see it. Below 24 bytes, it is not the child that segfaults but the parent, while trying to set up the child. Try running the program under strace -ff to see this happen.

  • How does child fail silently with too little stack?

When the child dies, its termination status is made available to its parent. The parent in this case (the one that makes the clone() call) never collects that status, so nothing is reported. Below 24 bytes the failure is not "silent" because it is the parent that dies, and in that case your shell collects the status and prints the message.

  • What is all that stack space used for?
  • What is the significance of 24 and 584 bytes? How do they vary on different systems and implementations?

The first 24 bytes (and a bit) are used to set up the call to child. Because child is a normal function, it returns to its caller on completion. This means clone has to set up a caller for it to return to (one that simply terminates the child cleanly).

The 584 bytes (and a bit) are apparently the amount of memory needed for the local variables of that calling function, your function, write, and whatever write calls.

The reason I write "(and a bit)" is that there may be some memory just before stack that is available and gets abused by clone or child when they run out of room. Try adding a free(stack) after the clone to see the result of that abuse.

  • Can I calculate a minimum stack requirement? Should I?

In general, you probably should not. It requires a pretty deep analysis of your functions and of the external functions they use. Just as with "normal" programs, I would suggest going for the default (which is 8 MB on Linux, if I recall correctly). Only when you have strict memory requirements (or stack overflow problems) should you start to worry about these things.

mweerden answered Sep 25 '22 16:09