
Split stacks unnecessary on amd64

There seems to be an opinion out there that using a "split stack" runtime model is unnecessary on 64-bit architectures. I say "seems to be" because I haven't seen anyone actually say that outright, only dance around it:

The memory usage of a typical multi-threaded program can decrease significantly, as each thread does not require a worst-case stack size. It becomes possible to run millions of threads (either full NPTL threads or co-routines) in a 32-bit address space. -- Ian Lance Taylor

...implying that a 64-bit address space can already handle it.

And...

... the constant overhead of split stacks and the narrow use case (spawning enormous numbers of I/O-bound tasks on 32-bit architectures) isn't acceptable... -- bstrie

Two questions: first, is this what they are saying? Second, if so, why are split stacks unnecessary on 64-bit architectures?

brooks94 asked Oct 18 '13 12:10





2 Answers

Yes, that's what they are saying.

Split stacks are (currently) unnecessary on 64-bit architectures because the 64-bit virtual address space is so large that it can contain millions of stack address ranges, each as large as an entire 32-bit address space if needed.
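To put rough numbers on that, here is a small back-of-the-envelope sketch. The 47-bit figure is my assumption, not something stated above: it is the user-space half of the 48-bit virtual addresses that amd64 exposes with 4-level paging, so the practical headroom is smaller than the full 64 bits suggest. The helper name `ranges_that_fit` is made up for illustration.

```python
GIB = 1 << 30
MIB = 1 << 20


def ranges_that_fit(address_bits, range_bytes):
    """How many non-overlapping ranges of range_bytes fit in 2**address_bits bytes."""
    return (1 << address_bits) // range_bytes


# Theoretical 64-bit space, each stack range as big as a whole 32-bit space.
full_64_bit = ranges_that_fit(64, 4 * GIB)   # 2**32 ranges

# Assumed 47-bit user space (4-level paging), same 4 GiB ranges.
user_47_bit = ranges_that_fit(47, 4 * GIB)   # 2**15 ranges

# Same assumed 47-bit space with more modest 1 MiB reservations per stack.
user_47_bit_1mib = ranges_that_fit(47, MIB)  # 2**27 ranges

# For contrast: a 32-bit space with 1 MiB reservations.
addr_32_bit_1mib = ranges_that_fit(32, MIB)  # 2**12 ranges
```

So the "millions of stacks" claim holds comfortably for reservations in the megabyte range, while a 32-bit space runs out of room after a few thousand.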

In the flat memory model in use nowadays, the translation from virtual addresses to physical memory locations is done with the support of the hardware MMU. On amd64 it turns out to be better (meaning faster overall) to reserve a big chunk of the 64-bit virtual address space for each new stack you create, while mapping only the first page (4 kB) to actual RAM. This way the stack can grow and shrink as needed over contiguous virtual addresses, which means less code in each function prologue (a big optimization). The OS re-configures the MMU to map each page of virtual addresses to a free page of RAM whenever the stack grows above, or shrinks below, some configurable thresholds.
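A minimal sketch of the "reserve a lot, touch a little" idea, using Python's `mmap` module only because it is easy to run. A real runtime would call the OS mapping primitive directly. The 64 MiB size and the helper names are hypothetical, and whether untouched pages really consume no RAM depends on the OS, so treat this as an illustration rather than a measurement.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE
RESERVED_BYTES = 64 * 1024 * 1024  # hypothetical per-stack reservation


def reserve_stack(size=RESERVED_BYTES):
    """Reserve a contiguous anonymous region; fileno -1 means no backing file."""
    return mmap.mmap(-1, size)


def touch_page(region, page_index):
    """Write one byte into a page, which makes the OS back that page with memory."""
    region[page_index * PAGE_SIZE] = 1
```

Usage would look like `region = reserve_stack()` followed by `touch_page(region, 0)`: the whole range is addressable, but on a typical demand-paged OS only the pages that have been touched are expected to occupy physical memory.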

By choosing the thresholds smartly (see, for example, the theory of dynamic arrays) you can achieve amortized O(1) complexity for the average stack operation, while retaining the benefit of millions of stacks that can each grow as much as needed and consume only the memory they use.
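Here is a toy model of the threshold idea, borrowed from the usual dynamic-array scheme. The specific thresholds (double the mapped pages when full, halve them when usage drops to a quarter) are my assumption and not something any particular OS or runtime is claimed to use. The class and its counters are hypothetical.

```python
class GrowableStack:
    """Toy model of a contiguous stack living inside a large reserved range."""

    def __init__(self, reserved_pages):
        self.reserved = reserved_pages  # size of the reserved address range, in pages
        self.committed = 1              # only the first page is mapped at the start
        self.used = 0                   # pages currently in use
        self.remapped_pages = 0         # total pages mapped or unmapped so far

    def _resize(self, new_committed):
        self.remapped_pages += abs(new_committed - self.committed)
        self.committed = new_committed

    def push_page(self):
        if self.used == self.committed:
            if self.committed == self.reserved:
                raise MemoryError("reserved range exhausted")
            # Grow threshold: double the mapped region when it is full.
            self._resize(min(self.committed * 2, self.reserved))
        self.used += 1

    def pop_page(self):
        if self.used == 0:
            raise IndexError("stack is empty")
        self.used -= 1
        # Shrink threshold: halve only once usage drops to a quarter,
        # so a stack hovering around one boundary does not thrash.
        if self.committed > 1 and self.used <= self.committed // 4:
            self._resize(max(self.committed // 2, 1))
```

The gap between the grow and shrink thresholds is the design choice that matters: growing and shrinking at the same boundary would let a single push/pop pair trigger a remap every time, whereas with the gap the remapping work stays proportional to the number of operations.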

PS: the current Go implementation is far behind any of this :-)

Tobia answered Oct 14 '22 22:10



The Go core team is currently discussing the possibility of using contiguous stacks in a future Go version.

The split stack approach is useful because stacks can grow more flexibly but it also requires that the runtime allocates a relatively big chunk of memory to distribute these stacks across. There has been a lot of confusion about Go's memory usage, in part because of this.

Making contiguous but growable (relocatable) stacks is an option that would provide the same flexibility and maybe reduce the confusion about Go's memory usage, as well as remedying some ill corner cases on low-memory machines (see linked thread).
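To make "relocatable" concrete, here is a rough sketch of what moving a stack involves: copy the slots to a bigger region and fix up any pointers that pointed into the old region. This is a simplified model of my own, not the Go runtime's actual code. It assumes one slot per address unit and that every slot is tagged as either a plain value or a pointer, and the function name is hypothetical.

```python
def relocate_stack(slots, old_base, new_base, new_capacity):
    """Copy a stack to a larger region, adjusting pointers into the old region.

    slots is a list of (kind, value) pairs where kind is "val" or "ptr".
    """
    if new_capacity < len(slots):
        raise ValueError("new region is too small")
    delta = new_base - old_base
    old_end = old_base + len(slots)
    moved = []
    for kind, value in slots:
        # Only pointers that target the old stack need fixing; pointers to
        # other memory and plain values are copied unchanged.
        if kind == "ptr" and old_base <= value < old_end:
            value += delta
        moved.append((kind, value))
    # Fresh, zeroed slots fill the rest of the larger region.
    moved.extend([("val", 0)] * (new_capacity - len(slots)))
    return moved
```

Note that a plain value which merely looks like an old stack address must be left alone, which is why this approach needs to know precisely which slots hold pointers.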

As to advantages/disadvantages on 32-bit vs. 64-bit architectures, I don't think there are any directly associated solely with the use of segmented stacks.

thwd answered Oct 14 '22 22:10
