Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to provide extend-on-write functionality for memory mapped files in Linux?

Tags:

linux

posix

aix

I'm working on porting some code from AIX to Linux. Parts of the code use the shmat() system call to create new files. When used with SHM_MAP in a writable mode, one can extend the file beyond its original length (of zero, in my case):

When a file is mapped onto a segment, the file is referenced by accessing the segment. The memory paging system automatically takes care of the physical I/O. References beyond the end of the file cause the file to be extended in page-sized increments. The file cannot be extended beyond the next segment boundary.

(A "segment" in AIX is a 256 MB chunk of address space, and a "page" is usually 4 KB.)

What I would like to do on Linux is the following:

  • Reserve a large-ish chunk of address space (it doesn't have to be as big as 256 MB, these aren't such large files)
  • Set up the page protection bits so that a segfault is generated on the first access to a page that hasn't been touched before
  • On a page fault, clear the "cause a page fault" bit and allocate committed memory for the page, allowing the write (or read) that caused the page fault to proceed
  • Upon closing the shared memory area, write the modified pages to a file

I know I can do this on Windows with the VirtualProtect function, the PAGE_GUARD memory protection bit, and a structured exception handler. What is the corresponding method on Linux to do the same? Is there perhaps a better way to implement this extend-on-write functionality on Linux?

I've already considered:

  • using mmap() with some fixed large-ish size, but I can't tell how much of the file was written to by the application code
  • allocating an anonymous shared memory area of large-ish size, but again I can't tell how much of the area has been written
  • mmap() by itself does not seem to provide any facility to extend the length of the backing file

Naturally I would like to do this with only minimal changes to the application code.

like image 384
Greg Hewgill Avatar asked Aug 04 '11 03:08

Greg Hewgill


People also ask

What is mmap () used for?

The mmap() function establishes a mapping between a process' address space and a stream file. The address space of the process from the address returned to the caller, for a length of len, is mapped onto a stream file starting at offset off.

When you write to a memory-mapped file when is the file actually written to the disk?

4 Answers. Save this answer. Show activity on this post. A memory mapped file is actually partially or wholly mapped in memory (RAM), whereas a file you write to would be written to memory and then flushed to disk.

How does mmap work in Linux?

mmap works by manipulating your process's page table, a data structure your CPU uses to map address spaces. The CPU will translate "virtual" addresses to "physical" ones, and does so according to the page table set up by your kernel. When you access the mapped memory for the first time, your CPU generates a page fault.


2 Answers

This is very similar to a homework I once did. Basically I had a list of "pages" and a list of "frames", with associated information. Using SIGSEGV I would catch faults and alter the memory protection bits as necessary. I'll include parts that you may find useful.

Create mapping. Initially it has no permissions.

int w_create_mapping(size_t size, void **addr)
{

    *addr = mmap(NULL,
            size * w_get_page_size(),
            PROT_NONE,
            MAP_ANONYMOUS | MAP_PRIVATE,
            -1,
            0
    );

    if (*addr == MAP_FAILED) {
        perror("mmap");
        return FALSE;
    }

    return TRUE;
}

Install signal handler

int w_set_exception_handler(w_exception_handler_t handler)
{
    static struct sigaction sa;
    sa.sa_sigaction = handler;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGSEGV);
    sa.sa_flags = SA_SIGINFO;

    if (sigaction(SIGSEGV, &sa, &previous_action) < 0)
        return FALSE;

    return TRUE;
}

Exception handler

static void fault_handler(int signum, siginfo_t *info, void *context)
{
    void *address;      /* the address that faulted */

    /* Memory location which caused fault */
    address = info->si_addr;

    if (FALSE == page_fault(address)) {
        _exit(1);
    }
}

Increasing protection

int w_protect_mapping(void *addr, size_t num_pages, w_prot_t protection)
{
    int prot;

    switch (protection) {
    case PROTECTION_NONE:
        prot = PROT_NONE;
        break;
    case PROTECTION_READ:
        prot = PROT_READ;
        break;
    case PROTECTION_WRITE:
        prot = PROT_READ | PROT_WRITE;
        break;
    }

    if (mprotect(addr, num_pages * w_get_page_size(), prot) < 0)
        return FALSE;

    return TRUE;
}

I can't publicly make it all available since the team is likely to use that same homework again.

like image 112
cnicutar Avatar answered Oct 16 '22 10:10

cnicutar


Allocate a big buffer however you like and then use mprotect()* system call to make the tail of the buffer read only and register a signal handler for SIGSEGV to note where in the before writes have been made and use mprotect() yet again to enable writes.

  • http://linux.die.net/man/2/mprotect
like image 21
gby Avatar answered Oct 16 '22 11:10

gby