How to provide extend-on-write functionality for memory mapped files in Linux?

Tags:

linux

posix

aix

I'm working on porting some code from AIX to Linux. Parts of the code use the shmat() system call to create new files. When used with SHM_MAP in a writable mode, one can extend the file beyond its original length (of zero, in my case):

> When a file is mapped onto a segment, the file is referenced by accessing the segment. The memory paging system automatically takes care of the physical I/O. References beyond the end of the file cause the file to be extended in page-sized increments. The file cannot be extended beyond the next segment boundary.

(A "segment" in AIX is a 256 MB chunk of address space, and a "page" is usually 4 KB.)

What I would like to do on Linux is the following:

- Reserve a large-ish chunk of address space (it doesn't have to be as big as 256 MB, these aren't such large files)
- Set up the page protection bits so that a segfault is generated on the first access to a page that hasn't been touched before
- On a page fault, clear the "cause a page fault" bit and allocate committed memory for the page, allowing the write (or read) that caused the page fault to proceed
- Upon closing the shared memory area, write the modified pages to a file

I know I can do this on Windows with the VirtualProtect function, the PAGE_GUARD memory protection bit, and a structured exception handler. What is the corresponding method on Linux to do the same? Is there perhaps a better way to implement this extend-on-write functionality on Linux?

I've already considered:

- using mmap() with some fixed large-ish size, but I can't tell how much of the file was written to by the application code
- allocating an anonymous shared memory area of large-ish size, but again I can't tell how much of the area has been written
- mmap() by itself does not seem to provide any facility to extend the length of the backing file

Naturally I would like to do this with only minimal changes to the application code.

asked Aug 04 '11 03:08

Greg Hewgill

2 Answers

This is very similar to a homework I once did. Basically I had a list of "pages" and a list of "frames", with associated information. Using SIGSEGV I would catch faults and alter the memory protection bits as necessary. I'll include parts that you may find useful.

Create mapping. Initially it has no permissions.

/* Reserve `size` pages (a page count, not bytes) of address space with no access rights. */
int w_create_mapping(size_t size, void **addr)
{

    *addr = mmap(NULL,
            size * w_get_page_size(),
            PROT_NONE,
            MAP_ANONYMOUS | MAP_PRIVATE,
            -1,
            0
    );

    if (*addr == MAP_FAILED) {
        perror("mmap");
        return FALSE;
    }

    return TRUE;
}

Install signal handler

int w_set_exception_handler(w_exception_handler_t handler)
{
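    static struct sigaction sa;         /* static storage, so unset fields start zeroed */
    sa.sa_sigaction = handler;          /* three-argument handler; requires SA_SIGINFO */
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGSEGV);    /* block nested SIGSEGV while the handler runs */
    sa.sa_flags = SA_SIGINFO;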

    if (sigaction(SIGSEGV, &sa, &previous_action) < 0)
        return FALSE;

    return TRUE;
}

Exception handler

static void fault_handler(int signum, siginfo_t *info, void *context)
{
    void *address;      /* the address that faulted */

    /* Memory location which caused fault */
    address = info->si_addr;

    if (FALSE == page_fault(address)) {
        _exit(1);
    }
}

Increasing protection

int w_protect_mapping(void *addr, size_t num_pages, w_prot_t protection)
{
    int prot;

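    switch (protection) {
    case PROTECTION_NONE:
        prot = PROT_NONE;
        break;
    case PROTECTION_READ:
        prot = PROT_READ;
        break;
    case PROTECTION_WRITE:
        prot = PROT_READ | PROT_WRITE;
        break;
    default:
        /* Unknown value: fail rather than pass an uninitialized prot to mprotect(). */
        return FALSE;
    }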

    if (mprotect(addr, num_pages * w_get_page_size(), prot) < 0)
        return FALSE;

    return TRUE;
}

I can't publicly make it all available since the team is likely to use that same homework again.
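
The fragments above leave out page_fault() and the bookkeeping around it. What follows is a minimal, self-contained sketch of how the pieces could fit together. It is not the original homework code. Every name in it (region_reserve, on_segv, region_bytes_touched and the region_* variables) is made up for illustration. It also assumes that calling mprotect() from a signal handler is acceptable; this works on Linux in practice, but POSIX does not list mprotect() as async-signal-safe.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustrative bookkeeping for a single reserved region (hypothetical names). */
static char  *region_base;                 /* start of the reserved range */
static size_t region_len;                  /* length of the range in bytes */
static size_t region_page;                 /* system page size in bytes */
static volatile size_t region_high_water;  /* end offset of the highest touched page */

/* SIGSEGV handler: commit the faulting page and record how far the region extends. */
static void on_segv(int signum, siginfo_t *info, void *context)
{
    uintptr_t fault = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region_base;
    size_t offset;

    (void)signum;
    (void)context;

    if (fault < base || fault >= base + region_len) {
        /* Not ours: restore the default action so the retried access kills the process. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    /* Round the faulting address down to the start of its page. */
    offset = (size_t)(fault - base) / region_page * region_page;
    if (mprotect(region_base + offset, region_page, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    if (offset + region_page > region_high_water)
        region_high_water = offset + region_page;
    /* Returning restarts the faulting instruction, which now succeeds. */
}

/* Reserve num_pages of address space with no access and install the handler. */
static void *region_reserve(size_t num_pages)
{
    struct sigaction sa;
    long page = sysconf(_SC_PAGESIZE);
    void *addr;

    assert(page > 0);
    region_page = (size_t)page;
    region_len = num_pages * region_page;
    region_high_water = 0;

    addr = mmap(NULL, region_len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;
    region_base = (char *)addr;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return addr;
}

/* Bytes from the start of the region to the end of the highest page touched so far. */
static size_t region_bytes_touched(void)
{
    return region_high_water;
}
```

Because the region starts as PROT_NONE, reads as well as writes count as a first touch, and the mark moves in whole pages. That is roughly the AIX behaviour of extending the file in page-sized increments. After region_reserve(16), a single store to the first byte should leave region_bytes_touched() equal to one page.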

answered Oct 16 '22 10:10

cnicutar

Allocate a big buffer however you like, then use the mprotect() system call to make the tail of the buffer read-only. Register a signal handler for SIGSEGV that notes where in the buffer writes have been made and calls mprotect() again to enable writes.

http://linux.die.net/man/2/mprotect
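
A rough sketch of this idea, including the final write to a file, might look like the following. It is one possible reading of the suggestion, not a tested port of the AIX behaviour. The names buf_open, buf_close, on_write_fault and the buf_* variables are hypothetical. Since the file in the question starts at length zero, the whole buffer is made read-only here rather than just a tail. As with any handler of this kind, it assumes mprotect() is safe to call from a signal handler on Linux.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char  *buf_base;                /* start of the buffer */
static size_t buf_len;                 /* buffer length in bytes */
static size_t buf_page;                /* system page size in bytes */
static volatile size_t buf_dirty_end;  /* end offset of the highest written page */

/* SIGSEGV handler: a write hit a read-only page, so note it and enable writes. */
static void on_write_fault(int signum, siginfo_t *info, void *context)
{
    uintptr_t fault = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)buf_base;
    size_t offset;

    (void)signum;
    (void)context;

    if (fault < base || fault >= base + buf_len) {
        signal(SIGSEGV, SIG_DFL);      /* unrelated fault: let it crash normally */
        return;
    }
    offset = (size_t)(fault - base) / buf_page * buf_page;
    if (mprotect(buf_base + offset, buf_page, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
    if (offset + buf_page > buf_dirty_end)
        buf_dirty_end = offset + buf_page;
}

/* Allocate a readable but not writable buffer of num_pages and install the handler. */
static void *buf_open(size_t num_pages)
{
    struct sigaction sa;
    long page = sysconf(_SC_PAGESIZE);
    void *addr;

    assert(page > 0);
    buf_page = (size_t)page;
    buf_len = num_pages * buf_page;
    buf_dirty_end = 0;

    addr = mmap(NULL, buf_len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;
    buf_base = (char *)addr;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return addr;
}

/* Write everything up to the high-water mark to path, then unmap the buffer.
   Returns the number of bytes written, or -1 on error. */
static long buf_close(const char *path)
{
    size_t done = 0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    while (done < buf_dirty_end) {
        ssize_t n = write(fd, buf_base + done, buf_dirty_end - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }
    if (close(fd) != 0)
        return -1;
    munmap(buf_base, buf_len);
    return (long)done;
}
```

Starting from PROT_READ instead of PROT_NONE means plain reads never fault, so the mark reflects writes only. The resulting file length is still rounded up to a whole page.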

answered Oct 16 '22 11:10

gby
