About pointers after fork()

Tags: c, fork, unix, pointers, process

This is sort of a technical question; maybe you can help me if you know about C and UNIX (or maybe it's a real newbie question!)

A question came up today while analyzing some code in our Operating Systems course. We are learning what it means to "fork" a process in UNIX. We already know it creates a copy of the current process that runs in parallel with it, and that the two have separate data sections.

But then I thought that maybe, if one creates a variable and a pointer pointing at it before calling fork(), because the pointer stores the memory address of the variable, one could try to modify the value of that variable from the child process by using that pointer.

We tried code similar to this in class:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h> //wait()
#include <unistd.h>   //fork()

int main (void){
    int value = 0;
    int * pointer = &value;
    int status;

    pid_t pid;

    printf("Parent: Initial value is %d\n",value);
    fflush(stdout); //Flush before fork() so the child can't reprint buffered output

    pid = fork();

    switch(pid){
    case -1: //fork() failed, no child was created
        printf("Fork error, WTF?\n");
        exit(EXIT_FAILURE);

    case 0: //Child process
        printf("\tChild: I'll try to change the value\n\tChild: The pointer value is %p\n",(void *)pointer);
        (*pointer) = 1;
        printf("\tChild: I've set the value to %d\n",(*pointer));

        exit(EXIT_SUCCESS);
    }

    while(pid != wait(&status)); //Wait for the child process

    printf("Parent: the pointer value is %p\nParent: The value is %d\n",(void *)pointer,value);

    return 0;
}

If you run it, you'll get something like this:

Parent: Initial value is 0

Child: I'll try to change the value

Child: The pointer value is 0x7fff733b0c6c

Child: I've set the value to 1

Parent: the pointer value is 0x7fff733b0c6c

Parent: The value is 0

It's obvious that the child process didn't affect the parent process at all. Frankly, I was expecting some "segmentation fault" error because of accessing a memory address that isn't permitted. But what really happened?

Remember, I'm not looking for a way to communicate between processes; that's not the point. What I want to know is what the code did. Inside the child process, the change is visible, so it DID something.

My main hypothesis is that pointers are not absolute memory addresses but are relative to the process's stack. But I haven't been able to find an answer (no one in class knew, and googling only turned up questions about inter-process communication), so I'd like to ask you; hopefully someone will know.

Thanks for taking the time to read this!

931

asked Oct 23 '14 18:10

javierbg

2 Answers

The key here is the concept of a virtual address space.

Modern processors (say, anything from the 80386 onward) have a memory management unit that, under the kernel's control, maps each process's virtual address space onto physical memory pages.

When the kernel sets up a process, it creates a set of page table entries for that process that define the mapping from its virtual address space to physical memory pages, and it is in this virtual address space that the program executes.

Conceptually, when you fork, the kernel copies the existing process's pages to a new set of physical pages and sets up the new process's page tables so that, as far as the new process is concerned, it appears to be running in the same virtual memory layout as the original one had, while actually addressing entirely different physical memory.
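To make this concrete, here is a minimal sketch (not part of the original answer) in which the child reports back, over a pipe, the address it wrote to and the value it then saw. The names `fork_result`, `report` and `run_fork_experiment` are made up for this illustration, and a POSIX system is assumed. Both processes should see the same virtual address, while the parent's copy of the variable stays untouched.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What the parent learns from one experiment (hypothetical helper type). */
struct fork_result {
    int parent_value;   /* parent's copy after the child has exited */
    int child_value;    /* child's copy after the child wrote to it */
    int same_address;   /* 1 if both processes used the same virtual address */
};

/* Message the child sends back through the pipe (hypothetical helper type). */
struct report {
    uintptr_t address;
    int value;
};

/* Returns 0 on success, -1 if pipe(), fork() or the transfer failed. */
int run_fork_experiment(struct fork_result *out)
{
    int value = 0;
    int *pointer = &value;
    int fds[2];
    struct report msg = {0};

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: write through the inherited pointer, then report back. */
        close(fds[0]);
        *pointer = 1;
        msg.address = (uintptr_t)pointer;
        msg.value = *pointer;
        ssize_t written = write(fds[1], &msg, sizeof msg);
        _exit(written == (ssize_t)sizeof msg ? 0 : 1);
    }

    /* Parent: collect the report, then reap the child. */
    close(fds[1]);
    ssize_t got = read(fds[0], &msg, sizeof msg);
    close(fds[0]);
    if (waitpid(pid, NULL, 0) == -1 || got != (ssize_t)sizeof msg)
        return -1;

    out->parent_value = value;
    out->child_value = msg.value;
    out->same_address = (msg.address == (uintptr_t)pointer);
    return 0;
}
```

Calling `run_fork_experiment(&r)` from `main` should leave `r.parent_value` at 0, `r.child_value` at 1 and `r.same_address` at 1, which matches the output shown in the question: identical pointer values, different contents.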

The detail is more subtle, as nobody wants to waste time copying hundreds of MB of data unless it is necessary. When the process calls fork(), the kernel sets up a second set of page table entries (for the new process) but points them at the same physical pages as the original process. It then sets the flag in both sets of page table entries that makes the MMU treat the pages as read-only.

As soon as either process writes to a page, the memory management unit generates a page fault (because the page table entry has the read-only flag set). The page fault handler then allocates a new page from physical memory, copies the data over, updates the page table entry and sets the page back to read/write. In this way, pages are only actually copied the first time either process tries to change a copy-on-write page, and the sleight of hand goes completely unnoticed by either process.
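The copy-on-write steps above can be sketched as a toy user-space model. This is only an illustration under simplifying assumptions, not how any real kernel is written: the "physical pages" are heap allocations, a write to a read-only entry stands in for the page fault, and every name (`frame`, `pte`, `address_space`, `as_init`, `as_fork`, `as_write`, `as_read`) is invented for the example.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* tiny pages keep the example readable */
#define TOY_NUM_PAGES 4

/* A "physical" page plus a count of the address spaces that map it. */
struct frame {
    unsigned char data[TOY_PAGE_SIZE];
    int refcount;
};

/* One page table entry: which frame, and may it be written directly? */
struct pte {
    struct frame *frame;
    int writable;
};

struct address_space {
    struct pte table[TOY_NUM_PAGES];
};

/* Give a fresh address space its own zeroed, writable pages.
 * Frames allocated before a failure are leaked; fine for a toy. */
int as_init(struct address_space *as)
{
    for (int i = 0; i < TOY_NUM_PAGES; i++) {
        struct frame *f = calloc(1, sizeof *f);
        if (f == NULL)
            return -1;
        f->refcount = 1;
        as->table[i].frame = f;
        as->table[i].writable = 1;
    }
    return 0;
}

/* "fork": share every frame and mark it read-only in both page tables. */
void as_fork(struct address_space *parent, struct address_space *child)
{
    for (int i = 0; i < TOY_NUM_PAGES; i++) {
        parent->table[i].writable = 0;
        child->table[i] = parent->table[i];
        child->table[i].frame->refcount++;
    }
}

/* Write one byte. The caller must keep vaddr below
 * TOY_PAGE_SIZE * TOY_NUM_PAGES; there is no bounds check. */
int as_write(struct address_space *as, size_t vaddr, unsigned char byte)
{
    struct pte *e = &as->table[vaddr / TOY_PAGE_SIZE];

    if (!e->writable) {            /* plays the role of the page fault */
        if (e->frame->refcount > 1) {
            /* Still shared: copy the page and remap this process only. */
            struct frame *copy = malloc(sizeof *copy);
            if (copy == NULL)
                return -1;
            memcpy(copy->data, e->frame->data, TOY_PAGE_SIZE);
            copy->refcount = 1;
            e->frame->refcount--;
            e->frame = copy;
        }
        /* This address space is now the sole owner of its frame. */
        e->writable = 1;
    }
    e->frame->data[vaddr % TOY_PAGE_SIZE] = byte;
    return 0;
}

unsigned char as_read(const struct address_space *as, size_t vaddr)
{
    return as->table[vaddr / TOY_PAGE_SIZE].frame->data[vaddr % TOY_PAGE_SIZE];
}
```

In this model, after `as_fork(&parent, &child)` a call such as `as_write(&child, 5, 1)` copies only page 0 for the child. Reading address 5 then gives 1 in the child and 0 in the parent, while the untouched pages remain shared.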

Regards, Dan.

168

answered Nov 09 '22 17:11

Dan Mills

Logically, the fork()ed process gets its own, independent copy of more or less the whole state of the parent process. That couldn't work if pointers in the child referred to memory belonging to the parent.

The details of how a particular UNIX-like kernel makes that work can vary. Linux implements the child process's memory via copy-on-write pages, which makes fork()ing cheap compared with other possible implementations. In that case, the child's pointers really do refer to the same physical memory as the parent's until either the child or the parent tries to modify that memory, at which point a private copy of the affected page is made so that the two processes no longer share it. That all relies on the underlying virtual memory system. Other UNIX and UNIX-like systems can do, and have done, it differently.
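The isolation works in both directions, which a small sketch can show from the parent's side. Here the parent overwrites the variable after fork(), and only then lets the child look at its own copy. The function name `value_seen_by_child` and the values 7 and 42 are arbitrary choices for this illustration; a POSIX system is assumed, and the child reports its value through its exit status, so it has to fit in 0..254.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the value the child sees after the parent has overwritten the
 * parent's own copy, or -1 on error.
 */
int value_seen_by_child(void)
{
    /* volatile so the compiler really performs the store and the load */
    volatile int value = 7;
    int fds[2];
    char go = 'x';

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: block until the parent says it has finished writing. */
        close(fds[1]);
        if (read(fds[0], &go, 1) != 1)
            _exit(255);
        _exit(value);   /* report the child's copy through the exit status */
    }

    /* Parent: modify the variable first, then release the child. */
    close(fds[0]);
    value = 42;
    int sent = (write(fds[1], &go, 1) == 1);
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !sent || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 255 ? -1 : WEXITSTATUS(status);
}
```

The function should return 7: the parent's store of 42 lands in the parent's own copy of the page, and the child keeps seeing the value that existed at the moment of the fork.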

answered Nov 09 '22 17:11

John Bollinger
