
Anonymous mmap zero-filled?

In Linux, the mmap(2) man page explains that an anonymous mapping

. . . is not backed by any file; its contents are initialized to zero.

The FreeBSD mmap(2) man page does not make a similar guarantee about zero-filling, though it does promise that bytes after the end of a file in a non-anonymous mapping are zero-filled.

Which flavors of Unix promise to return zero-initialized memory from anonymous mmaps? Which ones return zero-initialized memory in practice, but make no such promise on their man pages?

It is my impression that zero-filling is partially for security reasons. I wonder if any mmap implementations skip the zero-filling for a page that was mmapped, munmapped, then mmapped again by a single process, or if any implementations fill a newly mapped page with pseudorandom bits, or some non-zero constant.

P.S. Apparently, even brk and sbrk used to guarantee zero-filled pages. My experiments on Linux seem to indicate that, even if full pages are zero-filled upon page fault after an sbrk call allocates them, partial pages are not:

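#include <stdint.h>   /* intptr_t */
#include <stdio.h>
#include <unistd.h>   /* sbrk */

int main(void) {
  const intptr_t many = 100;
  char * start = sbrk(0);          /* current program break */
  sbrk(many);                      /* grow the heap by 100 bytes */
  for (intptr_t i = 0; i < many; ++i) {
    start[i] = 0xff;               /* dirty every byte just allocated */
  }
  printf("%d\n",(int)start[many/2]);
  sbrk(many/-2);                   /* give back the upper 50 bytes... */
  sbrk(many/2);                    /* ...and take them again */
  printf("%d\n",(int)start[many/2]);  /* zero only if the regrown bytes were cleared */
  sbrk(-1 * many);                 /* give back all 100 bytes */
  sbrk(many/2);                    /* regrow the first 50 */
  printf("%d\n",(int)start[0]);    /* zero only if the regrown bytes were cleared */
  return 0;
}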
jbapple asked Jul 09 '13 03:07


1 Answer

It's hard to say which ones promise what without simply exhaustively enumerating all man pages or other release documentation, but the underlying code that handles MAP_ANON is (usually? always?) also used to map in bss space for executables, and bss space needs to be zero-filled. So it's pretty darn likely.

As for "giving you back your old values" (or some non-zero values but most likely, your old ones) if you unmap and re-map, it certainly seems possible, if some system were to be "lazy" about deallocation. I have only used a few systems that support mmap in the first place (BSD and Linux derivatives) and neither one is lazy that way, at least, not in the kernel code handling mmap.
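One way to probe the unmap-and-re-map case on a given system is a small test like the following. This is only a sketch: the helper names (all_zero, map_anon, check_remap_zeroed) are made up for illustration, and a passing result on one machine says nothing about what other systems promise.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* assumption: needed for MAP_ANONYMOUS under strict glibc modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* some BSDs spell the flag MAP_ANON */
#endif

/* Hypothetical helper: returns 1 if all len bytes at p are zero, else 0. */
int all_zero(const unsigned char *p, size_t len) {
    for (size_t i = 0; i < len; ++i) {
        if (p[i] != 0) return 0;
    }
    return 1;
}

/* Hypothetical helper: private anonymous read/write mapping, NULL on failure. */
unsigned char *map_anon(size_t len) {
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Maps len bytes, dirties them, unmaps, then maps again.
 * Returns 1 if both the first and the second mapping read as all zeros,
 * 0 if a non-zero byte was seen, -1 if mmap or munmap failed. */
int check_remap_zeroed(size_t len) {
    unsigned char *p = map_anon(len);
    if (p == NULL) return -1;
    int fresh_ok = all_zero(p, len);
    memset(p, 0xff, len);              /* dirty the pages */
    if (munmap(p, len) != 0) return -1;

    unsigned char *q = map_anon(len);  /* may or may not land at the old address */
    if (q == NULL) return -1;
    int remap_ok = all_zero(q, len);
    munmap(q, len);
    return fresh_ok && remap_ok;
}
```

Calling check_remap_zeroed(4096) from main and printing the result is enough to try it; a return of 0 would mean the system handed back non-zero bytes.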

The reason sbrk might or might not zero-fill a "regrown" page is probably tied to history, or lack thereof. The current FreeBSD code matches what I recall from the old, pre-mmap days: there are two semi-secret variables, minbrk and curbrk, and both brk and sbrk will only invoke SYS_break (the real system call) if they are moving curbrk to a value that is at least minbrk. (Actually, this looks slightly broken: brk has the at-least behavior, but sbrk just adds its argument to curbrk and invokes SYS_break. This seems harmless, since the kernel checks in sys_obreak() in /sys/vm/vm_unix.c, so a too-negative sbrk() will fail with EINVAL.)

I'd have to look at the Linux C library (and then perhaps kernel code too) but it may simply ignore attempts to "lower the break", and merely record a "logical break" value in libc. If you have mmap() and no backwards compatibility requirements, you can implement brk() and sbrk() entirely in libc, using anonymous mappings, and it would be trivial to implement both of them as "grow-only", as it were.
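To make the "logical break" idea concrete, here is a toy sketch of an sbrk look-alike built on one anonymous mapping. It is not how any real libc does it: toy_sbrk, toy_base, toy_break and TOY_ARENA_SIZE are invented names, and the fixed 1 MiB arena is an arbitrary assumption. The mapping itself never shrinks; a negative increment only moves the recorded break.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* assumption: needed for MAP_ANONYMOUS under strict glibc modes */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* some BSDs spell the flag MAP_ANON */
#endif

#define TOY_ARENA_SIZE ((size_t)1 << 20)   /* hypothetical fixed heap limit: 1 MiB */

static unsigned char *toy_base = NULL;     /* start of the anonymous mapping */
static size_t toy_break = 0;               /* logical break, as an offset from toy_base */

/* Hypothetical sbrk() look-alike. Returns the previous break on success,
 * or (void *)-1 with errno set on failure. Lowering the break only updates
 * toy_break; nothing is unmapped, so old contents survive a later regrow. */
void *toy_sbrk(intptr_t incr) {
    if (toy_base == NULL) {
        void *p = mmap(NULL, TOY_ARENA_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            errno = ENOMEM;
            return (void *)-1;
        }
        toy_base = p;
    }

    size_t old = toy_break;
    if (incr < 0) {
        /* Magnitude of incr, computed without overflowing on INTPTR_MIN. */
        size_t dec = (size_t)(-(incr + 1)) + 1;
        if (dec > old) {
            errno = EINVAL;                /* would move below the arena start */
            return (void *)-1;
        }
        toy_break = old - dec;
    } else {
        if ((size_t)incr > TOY_ARENA_SIZE - old) {
            errno = ENOMEM;                /* would move past the arena end */
            return (void *)-1;
        }
        toy_break = old + (size_t)incr;
    }
    return toy_base + old;
}
```

With this design, bytes in a never-touched region read as zero because they come straight from the anonymous mapping, while bytes in a region that was released and regrown keep whatever was last written there, which is the kind of behavior the experiment in the question would observe.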

torek answered Sep 25 '22 00:09