
Threads share address space, but do not share stacks: Contradicting?

I know that threads share the address space but do not share their stacks. Isn't that contradictory? Why is it correct to say they share an address space when they do not share their stacks? The stack is part of the address space, isn't it?

I would assume that threads share the heap, data, and code segments but not the stack segment. To me, all of these are part of the process address space.

Can someone clarify please? Thanks!!

user235306 asked Feb 10 '19 08:02


2 Answers

Yes, threads share the same address space but do not share stacks. Anything one thread can see in memory, another thread can see at the same address. Each thread's stack, however, sits in a different region of that address space, so each thread can call functions independently without interfering with the others.

Take the following program as an example:

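#include <stdio.h>
#include <pthread.h>

/* Thread entry point: arg points at main's local variable x. */
void *foo(void *arg)
{
    int *n = arg;   /* n itself lives on this thread's stack */
    printf("in thread, arg=%p, value=%d, &n=%p\n", arg, *n, (void *)&n);
    return NULL;
}

int main(void)
{
    int x = 4;      /* lives on the main thread's stack */
    printf("in main, x=%d, &x=%p\n", x, (void *)&x);
    pthread_t tid;
    if (pthread_create(&tid, NULL, foo, &x) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    /* Wait for the thread to finish; x must stay alive until then. */
    pthread_join(tid, NULL);

    return 0;
}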

The main function passes the address of a local variable, which lives on the stack of the main thread, to another thread. The thread is able to dereference that pointer and read the value of the variable.

On my system it outputs the following:

in main, x=4, &x=0x7fff2142985c
in thread, arg=0x7fff2142985c, value=4, &n=0x7f6abaa90f08

Here you can see that both the main thread and the child thread see the same address and value for x in the main function. You can also see that the address of variable n in foo, which lives on the stack of the child thread, is very far away from the address of x in main (roughly 637 GB apart).

This demonstrates that both threads can read the same memory with the same addresses and that each thread has its own stack.
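Sharing works for writes as well as reads. The following is a minimal sketch, assuming POSIX threads; the names bump and write_through_shared_stack are made up for illustration and are not part of the program above. A worker thread modifies a variable that lives on the calling thread's stack, and the caller observes the change after joining.

```c
#include <assert.h>
#include <pthread.h>

/* Worker: writes through the pointer it was handed. */
static void *bump(void *arg)
{
    int *p = arg;
    *p += 1;
    return NULL;
}

/* Returns the value of a local variable after another thread has
 * modified it, or -1 if the thread could not be created or joined. */
int write_through_shared_stack(void)
{
    int x = 4;                  /* lives on the calling thread's stack */
    pthread_t tid;

    if (pthread_create(&tid, NULL, bump, &x) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* pthread_join orders the worker's write before this read. */
    assert(x == 5);
    return x;
}
```

Calling write_through_shared_stack() from main should yield 5: the worker's stack is elsewhere, but the caller's stack is still ordinary memory in the one address space both threads use.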

dbush answered Sep 18 '22 05:09


It is possible to have threads with separate stack address spaces, but that depends on two factors: how the threads are implemented and which limitations the operating system imposes:

  1. If they are implemented exclusively in user space without any kernel help (like the first thread libraries in old Unix OSes), they will share the stack address space. The only difference is where each thread's stack starts.

  2. If the operating system provides special syscalls for building threads (e.g. cthreads on Mach3 based kernels), or threads are built around special fork-like syscalls such as Linux's clone, or around non-POSIX syscalls (as on Windows), they can share most of the address space while having different anonymous memory for the stack segment.

Note that in the first case (user-space threads), threads share everything, even the same PID, and there is no real separation between threads. If one thread blocks or kills the process, all threads are blocked or killed (there is no truly separate execution). Of course, in this case the stack address space is shared by all the threads in the same PID.

In the other cases (with OS support), the degree of isolation depends on two things: the threads library and the kernel facilities. If the library does not use the available mechanisms for creating processes with different combinations of shared resources (like Linux's clone), the threads will certainly share the stack. If the library is advanced and supports such an exotic feature, it may separate the stacks.

But separating stacks into different address spaces introduces a big problem: you cannot share variables on the stack among threads. At first glance this does not seem like a big problem, and you might even consider it an advantage. It is not: sharing stack variables among several threads is a very common use case (e.g. in scientific code). Here is a parallelized for loop in OpenMP (source: https://www.openmp.org/wp-content/uploads/openmp-examples-4.5.0.pdf):
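The user-space case can be illustrated with the ucontext functions. This is only a rough sketch under stated assumptions: it assumes a Linux/glibc system where getcontext, makecontext and swapcontext are available, it is not how any particular historical thread library worked, and the names task and user_tasks_on_own_stacks are hypothetical. Each task gets its own malloc'ed stack, yet everything lives in one address space and the kernel sees a single thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <ucontext.h>

#define NUM_TASKS  2
#define STACK_SIZE (64 * 1024)           /* arbitrary; enough for this tiny task */

static ucontext_t scheduler_ctx;         /* where a finished task returns to */
static uintptr_t local_addr[NUM_TASKS];  /* address of each task's local */
static int tasks_run;

static void task(void)
{
    int local = 0;
    local_addr[tasks_run++] = (uintptr_t)&local;
    /* Returning resumes scheduler_ctx through uc_link. */
}

/* Runs NUM_TASKS user-space tasks one after another. Returns how many had
 * their local variable inside the stack allocated for them, or -1 on
 * setup failure. */
int user_tasks_on_own_stacks(void)
{
    ucontext_t ctx[NUM_TASKS];
    char *stack[NUM_TASKS] = {0};
    int hits = 0;

    tasks_run = 0;
    for (int i = 0; i < NUM_TASKS; i++) {
        stack[i] = malloc(STACK_SIZE);
        if (stack[i] == NULL || getcontext(&ctx[i]) != 0) {
            hits = -1;
            break;
        }
        ctx[i].uc_stack.ss_sp = stack[i];     /* the only per-task difference */
        ctx[i].uc_stack.ss_size = STACK_SIZE;
        ctx[i].uc_link = &scheduler_ctx;
        makecontext(&ctx[i], task, 0);
        if (swapcontext(&scheduler_ctx, &ctx[i]) != 0) {  /* run task i */
            hits = -1;
            break;
        }
    }

    if (hits == 0) {
        assert(tasks_run == NUM_TASKS);       /* every task ran to completion */
        for (int i = 0; i < NUM_TASKS; i++) {
            uintptr_t lo = (uintptr_t)stack[i];
            if (local_addr[i] >= lo && local_addr[i] < lo + STACK_SIZE)
                hits++;
        }
    }
    for (int i = 0; i < NUM_TASKS; i++)
        free(stack[i]);
    return hits;
}
```

The stacks here are plain heap blocks, so any task could read or overwrite another task's stack; nothing but convention keeps them apart.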

void simple(int n, float *a, float *b)
{
  int i;
#pragma omp parallel for
  for (i=1; i<n; i++) /*i is private by default*/
    b[i] = (a[i] + a[i-1]) / 2.0;
}

As you can see, the a and b vectors are passed as pointers in the call. There is no guarantee that they reside in a "shared address space"; the caller may well have allocated them on its own stack. If this OpenMP library is linked against a thread library whose threads have stacks in different address spaces, this OpenMP parallelization will fail. That is a really bad start for a threading library: it breaks one of the introductory examples of OpenMP.

So, for compatibility, the most common choice is never to place stacks in separate address spaces, even though most modern operating systems make it possible.
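To make the dependency concrete, here is a hand-written pthreads approximation of simple(). It is a sketch only: OpenMP runtimes are free to schedule the loop differently, and simple_pthreads, struct range and NUM_WORKERS are hypothetical names. The workers receive nothing but pointers, both to the arrays and to job descriptors that live on the caller's stack.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 2

struct range {
    int begin, end;        /* iterations [begin, end) */
    const float *a;
    float *b;
};

/* Worker: same loop body as simple(), over one slice of the index range. */
static void *average_range(void *arg)
{
    struct range *r = arg;
    for (int i = r->begin; i < r->end; i++)
        r->b[i] = (r->a[i] + r->a[i - 1]) / 2.0f;
    return NULL;
}

/* Splits iterations 1..n-1 across NUM_WORKERS threads.
 * Returns 0 on success, -1 if a thread could not be created or joined. */
int simple_pthreads(int n, const float *a, float *b)
{
    assert(a != NULL && b != NULL);

    pthread_t tid[NUM_WORKERS];
    struct range job[NUM_WORKERS];   /* on this stack, read by the workers */
    int total = n > 1 ? n - 1 : 0;
    int started = 0;
    int rc = 0;

    for (int w = 0; w < NUM_WORKERS; w++) {
        job[w].begin = 1 + total * w / NUM_WORKERS;
        job[w].end = 1 + total * (w + 1) / NUM_WORKERS;
        job[w].a = a;
        job[w].b = b;
        if (pthread_create(&tid[w], NULL, average_range, &job[w]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Join before returning so job[] and the caller's arrays stay valid. */
    for (int w = 0; w < started; w++)
        if (pthread_join(tid[w], NULL) != 0)
            rc = -1;
    return rc;
}
```

If a caller declares a and b as local arrays and passes them in, every pointer the workers dereference refers to another thread's stack, which only works because that stack is in the same address space.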

killabytenow answered Sep 19 '22 05:09