I've heard in a lot of places (the musl mailing list, macOS forums, etc.) that brk() and sbrk() are unsafe. Many of these places either give no explanation at all or give only a very vague one. For example, this link states that "these functions are fundamentally broken", and goes on to say that the malloc and sbrk subsystems are utterly broken, that they ruin the heap, etc.
My question is: Why is this so? If malloc is used in such a way that it allocates a block of memory with sbrk large enough to quell or substantially decrease the need for further allocations, then shouldn't sbrk and brk be perfectly safe to use?
Here are my implementations of sbrk and brk:
sbrk:
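#include <unistd.h>
#include <stdint.h>
#include <sys/syscall.h>

void *sbrk(intptr_t inc)
{
    /* Assuming Linux: the raw brk syscall returns the current break
       when the request cannot be honoured, so NULL queries it. */
    intptr_t curbrk = (intptr_t)syscall(SYS_brk, NULL);

    if (inc == 0)
        return (void *)curbrk;
    /* Move the break; brk() reports failure with -1. */
    if (brk((void *)(curbrk + inc)) < 0)
        return (void *)-1;
    /* sbrk returns the previous break, i.e. the start of the new area. */
    return (void *)curbrk;
}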
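brk:
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Returns int to match the declaration in <unistd.h>. */
int brk(void *ptr)
{
    /* Assuming Linux: the raw syscall returns the new break on success
       and the unchanged break on failure. */
    if ((void *)syscall(SYS_brk, ptr) != ptr) {
        errno = ENOMEM;
        return -1;
    }
    return 0;
}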
Because the sbrk function is not thread-safe, we acquire a lock immediately before calling sbrk and release a lock immediately after calling sbrk.
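The locking described above might look like the following minimal sketch. The names locked_sbrk and brk_lock are hypothetical, and the approach assumes that every caller in the process goes through this wrapper; code that calls sbrk directly (including a libc malloc that uses the break internally) is not protected by it.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc under strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical process-wide lock guarding every change to the break. */
static pthread_mutex_t brk_lock = PTHREAD_MUTEX_INITIALIZER;

/* Move the break by inc bytes while holding the lock.
   Returns the previous break, or (void *)-1 on failure, like sbrk(). */
void *locked_sbrk(intptr_t inc)
{
    int rc = pthread_mutex_lock(&brk_lock);
    assert(rc == 0);

    void *old_break = sbrk(inc);

    rc = pthread_mutex_unlock(&brk_lock);
    assert(rc == 0);
    (void)rc;   /* silence the unused-variable warning under NDEBUG */

    return old_break;
}
```

The lock only serializes callers of locked_sbrk with each other, which is why it helps inside a single allocator but not across independent modules.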
brk and sbrk are basic memory management system calls used in Unix and Unix-like operating systems to control the amount of memory allocated to the data segment of the process. These functions are typically called from a higher-level memory management library function such as malloc.
brk identifies the lowest data segment location not used by the caller as addr. This location is rounded up to the next multiple of the system page size. sbrk, the alternate interface, adds incr bytes to the caller data space and returns a pointer to the start of the new data area.
Return values: Upon successful completion, sbrk() returns the prior break value. Otherwise, it returns (void *)-1 and sets errno to indicate the error.
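As an illustration of that return convention, here is a small sketch in C. The helper name grow_data_segment is hypothetical, and the code assumes a single-threaded process in which nothing else moves the break between the two calls.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by n bytes.
   Returns the start of the new area, or NULL on failure. */
void *grow_data_segment(size_t n)
{
    assert(n <= (size_t)INTPTR_MAX);

    /* On success sbrk() returns the PRIOR break, which is exactly
       where the newly added bytes begin. */
    void *old_break = sbrk((intptr_t)n);
    if (old_break == (void *)-1)
        return NULL;   /* errno has been set by sbrk() */

    return old_break;
}
```

After a successful call, sbrk(0) is expected to report an address n bytes past the returned pointer, provided no other code has touched the break in the meantime.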
The reality depends heavily on the implementation, but here are some elements:
brk/sbrk were invented to let a process request more memory from the system, and release it, as a single contiguous segment. As such, they were used by many malloc and free implementations. The problem is that, because there is only one such segment per process, things go wrong when multiple modules of the same process use it directly. It becomes even worse in a multi-threaded process because of race conditions. Suppose two threads want to add new memory. They both look at the current top address with sbrk(0), see the same address, request new memory with either brk or sbrk, and, because of the race condition, both end up using the same memory.
Even in a single-threaded process, some malloc and free implementations assume that they alone are allowed to use the low-level brk/sbrk interface, and that all other code goes through malloc and free. In that case things go wrong if the actual break no longer matches the image of the segment that they maintain internally. They would have to guess that some parts of the segment are "reserved" for other uses, possibly losing the ability to release any memory.
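To make the "internal image" problem concrete, here is a deliberately naive sketch of an allocator that believes it owns the break. The names bump_alloc, bump_heap_is_intact and heap_top are hypothetical; real malloc implementations are far more elaborate, but many keep a similar record of where their segment ends.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* The allocator's private image of the break; NULL until first use. */
static char *heap_top = NULL;

/* Hand out n fresh bytes from the end of the data segment.
   Returns NULL if sbrk() fails. */
void *bump_alloc(size_t n)
{
    assert(n <= (size_t)INTPTR_MAX);

    char *p = sbrk((intptr_t)n);
    if (p == (char *)-1)
        return NULL;

    heap_top = p + n;   /* remember where we think the segment ends */
    return p;
}

/* Returns 1 if the real break still matches our image, 0 if some
   other code has moved it behind our back. */
int bump_heap_is_intact(void)
{
    return heap_top == NULL || (char *)sbrk(0) == heap_top;
}
```

If unrelated code calls sbrk(16) after a bump_alloc, bump_heap_is_intact starts returning 0: the allocator can no longer shrink the segment back to heap_top without freeing memory it does not own.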
For that reason, user code should never call brk/sbrk directly and should rely only on malloc/free. If, and only if, you are writing an implementation of the standard library, including malloc/realloc/calloc/free, can you safely use brk/sbrk.
On modern systems, mmap makes much better use of virtual memory management. You can use as many dynamic memory segments as you need, with no interaction between them. So, on a modern system, unless you have a specific need for memory allocation using brk/sbrk, you should use mmap.
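A minimal sketch of the mmap approach, assuming a POSIX-like system that supports MAP_ANONYMOUS (Linux, the BSDs, recent macOS). The helper names region_alloc and region_free are hypothetical.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes of private, zero-filled memory that is independent
   of the break and of any other mapping. Returns NULL on failure. */
void *region_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give a region obtained from region_alloc back to the system. */
void region_free(void *p, size_t len)
{
    int rc = munmap(p, len);
    assert(rc == 0);
    (void)rc;   /* silence the unused-variable warning under NDEBUG */
}
```

Each region can be released on its own, in any order, which is not possible with a single contiguous break segment.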
The FreeBSD reference for brk and sbrk states this:
The brk() and sbrk() functions are legacy interfaces from before the advent of modern virtual memory management.
and later:
BUGS: Mixing brk() or sbrk() with malloc(3), free(3), or similar functions will result in non-portable program behavior.