munmap() failure with ENOMEM with private anonymous mapping

Q: What is mmap in Linux?

In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not read from disk directly and initially do not use physical RAM at all.

Tags:

I have recently discovered that Linux does not guarantee that memory allocated with mmap can be freed with munmap if this leads to situation when number of VMA (Virtual Memory Area) structures exceed vm.max_map_count. Manpage states this (almost) clearly:

 ENOMEM The process's maximum number of mappings would have been exceeded.
 This error can also occur for munmap(), when unmapping a region
 in the middle of an existing mapping, since this results in two
 smaller mappings on either side of the region being unmapped.

The problem is that Linux kernel always tries to merge VMA structures if possible, making munmap fail even for separately created mappings. I was able to write a small program to confirm this behavior:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#include <sys/mman.h>

// value of vm.max_map_count
#define VM_MAX_MAP_COUNT        (65530)

// number of vma for the empty process linked against libc - /proc/<id>/maps
#define VMA_PREMAPPED           (15)

#define VMA_SIZE                (4096)
#define VMA_COUNT               ((VM_MAX_MAP_COUNT - VMA_PREMAPPED) * 2)

int main(void)
{
    static void *vma[VMA_COUNT];

    for (int i = 0; i < VMA_COUNT; i++) {
        vma[i] = mmap(0, VMA_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

        if (vma[i] == MAP_FAILED) {
            printf("mmap() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_COUNT; i += 2) {
        if (munmap(vma[i], VMA_SIZE) != 0) {
            printf("munmap() failed at %d (%p): %m\n", i, vma[i]);
        }
    }
}

It allocates a large number of pages (twice the default allowed maximum) using mmap, then munmaps every second page to create separate VMA structure for each remaining page. On my machine the last munmap call always fails with ENOMEM.

Initially I thought that munmap never fails if used with the same values for address and size that were used to create mapping. Apparently this is not the case on Linux and I was not able to find information about similar behavior on other systems.

At the same time in my opinion partial unmapping applied to the middle of a mapped region is expected to fail on any OS for every sane implementation, but I haven't found any documentation that says such failure is possible.

I would usually consider this a bug in the kernel, but knowing how Linux deals with memory overcommit and OOM I am almost sure this is a "feature" that exists to improve performance and decrease memory consumption.

Other information I was able to find:

Similar APIs on Windows do not have this "feature" due to their design (see MapViewOfFile, UnmapViewOfFile, VirtualAlloc, VirtualFree) - they simply do not support partial unmapping.
glibc malloc implementation does not create more than 65535 mappings, backing off to sbrk when this limit is reached: https://code.woboq.org/userspace/glibc/malloc/malloc.c.html. This looks like a workaround for this issue, but it is still possible to make free silently leak memory.
jemalloc had trouble with this and tried to avoid using mmap/munmap because of this issue (I don't know how it ended for them).

Do other OS's really guarantee deallocation of memory mappings? I know Windows does this, but what about other Unix-like operating systems? FreeBSD? QNX?

EDIT: I am adding example that shows how glibc's free can leak memory when internal munmap call fails with ENOMEM. Use strace to see that munmap fails:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#include <sys/mman.h>

// value of vm.max_map_count
#define VM_MAX_MAP_COUNT        (65530)

#define VMA_MMAP_SIZE           (4096)
#define VMA_MMAP_COUNT          (VM_MAX_MAP_COUNT)

// glibc's malloc default mmap_threshold is 128 KiB
#define VMA_MALLOC_SIZE         (128 * 1024)
#define VMA_MALLOC_COUNT        (VM_MAX_MAP_COUNT)

int main(void)
{
    static void *mmap_vma[VMA_MMAP_COUNT];

    for (int i = 0; i < VMA_MMAP_COUNT; i++) {
        mmap_vma[i] = mmap(0, VMA_MMAP_SIZE, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);

        if (mmap_vma[i] == MAP_FAILED) {
            printf("mmap() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_MMAP_COUNT; i += 2) {
        if (munmap(mmap_vma[i], VMA_MMAP_SIZE) != 0) {
            printf("munmap() failed at %d (%p): %m\n", i, mmap_vma[i]);
            return 1;
        }
    }

    static void *malloc_vma[VMA_MALLOC_COUNT];

    for (int i = 0; i < VMA_MALLOC_COUNT; i++) {
        malloc_vma[i] = malloc(VMA_MALLOC_SIZE);

        if (malloc_vma[i] == NULL) {
            printf("malloc() failed at %d\n", i);
            return 1;
        }
    }

    for (int i = 0; i < VMA_MALLOC_COUNT; i += 2) {
        free(malloc_vma[i]);
    }
}

589

asked May 02 '17 17:05

StaceyGirl

1 Answers

One way to work around this problem on Linux is to mmap more that 1 page at once (e.g. 1 MB at a time), and also map a separator page after it. So, you actually call mmap on 257 pages of memory, then remap the last page with PROT_NONE, so that it cannot be accessed. This should defeat the VMA merging optimization in the kernel. Since you are allocating many pages at once, you should not run into the max mapping limit. The downside is you have to manually manage how you want to slice the large mmap.

As to your questions:

System calls can fail on any system for a variety of reasons. Documentation is not always complete.
You are allowed to munmap a part of a mmapd region as long as the address passed in lies on a page boundary, and the length argument is rounded up to the next multiple of the page size.

174

answered Sep 29 '22 06:09

jxh

Related questions
                            
                                Apache Arrow Java API Documentation [closed]
                            
                                Cannot get BHO working in 64 bit
                            
                                Exclude @types typings in installed dependencies
                            
                                Would it be legal for std::set to be specialized for (u)int8 and chars using bitset and shared static array
                            
                                Media scanner for secondary storage on Android Q
                            
                                Why does this loop take 1.32 cycles per iteration
                            
                                Simple way to run SwiftLint for our Swift package
                            
                                Calling a method with Provider.of(context) inside of dispose() method cause "Looking up a deactivated widget's ancestor is unsafe."
                            
                                Game development with Unity without Unity editor. Is it possible?
                            
                                HOWTO: Write Python API wrapper? [closed]
                            
                                Is there a way to declare an annotation attribute for *any* enum?
                            
                                iPhone: HTTPS client cert authentication

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

munmap() failure with ENOMEM with private anonymous mapping

Tags:

StaceyGirl

People also ask

1 Answers

jxh

Recent Activity

Donate For Us