Lighter weight alternatives to fork() in POSIX C?

Tags: c, fork, posix

In the man pages I've been reading, it seems popen, system, etc. tend to call fork(). In turn, fork() copies the process's entire memory state. This seems really heavy, especially when in many situations a child from a call to fork() uses little if any of the memory allocated for the parent.

So, my question is, can I get fork() like behavior without duplicating the whole memory state of the parent process? Or is there something I am missing, such that fork() is not as heavy as it appears (like, maybe calls tend to be optimized to avoid unnecessary memory duplication)?

449

asked Oct 20 '15 21:10

Kyle

2 Answers

fork(2) is, like every syscall, a primitive operation from the point of view of a user-space application (although some C libraries implement it with clone(2)). Entering it is mostly a single machine instruction, SYSCALL or SYSENTER, that switches from user mode to kernel mode; the (recent) Linux kernel then does quite significant processing.

It is in practice quite efficient (e.g. less than a millisecond, and sometimes even less than a tenth of that) because the kernel makes extensive use of lazy copy-on-write techniques to share pages between the parent and child processes. The actual copying happens later, page by page, on the page fault raised when one of the processes writes to a shared page. So fork does not eagerly duplicate the parent's whole memory state.
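A minimal sketch of what this looks like from user space. It only demonstrates the observable semantics (a write in the child does not reach the parent's buffer); the copy-on-write page sharing itself happens inside the kernel and is not visible here. The function name and the buffer size passed to it are hypothetical, chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent's buffer is untouched after
 * the child overwrites its own view of it, 0 if it changed, -1 on error. */
int parent_buffer_survives_child_write(size_t size)
{
    if (size == 0)
        return -1;

    char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', size);            /* parent touches every page */

    pid_t pid = fork();                /* pages are shared, not copied, here */
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: this write hits a shared page; on a copy-on-write kernel
         * only that one page needs a private copy. */
        buf[0] = 'C';
        _exit(buf[0] == 'C' ? 0 : 1);
    }

    /* Parent: wait for the child, then inspect our own copy. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int unchanged = (buf[0] == 'P');
    free(buf);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return unchanged;
}
```

Calling `parent_buffer_survives_child_write(1 << 20)` forks with a 1 MiB buffer allocated; the child modifies a single byte and exits without using the rest of it.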

Forking also has a huge advantage: because starting another program is delegated to execve(2), fork itself stays conceptually simple. Essentially the only difference between the parent and child processes is the value returned by fork (0 in the child, the child's PID in the parent).
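The usual fork-then-exec pattern can be sketched as follows. This is an illustration rather than how popen or system are actually implemented in any particular libc; `run_command` is a hypothetical name, and it uses execvp(3), a libc front end to execve(2) that searches PATH, purely for convenience.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with arguments argv (NULL-terminated)
 * and returns its exit status, or -1 if it could not be run or waited for
 * or did not exit normally. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: fork returned 0. Replace this process image. */
        execvp(argv[0], argv);
        _exit(127);                /* reached only if exec failed */
    }

    /* Parent: fork returned the child's PID. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with the argument vector `{"sh", "-c", "exit 7", NULL}` returns 7 on a system where `sh` is on PATH. The child execs almost immediately, so very few of the pages shared at fork time are ever copied.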

BTW, on POSIX systems such as Linux, fork(2) or a suitable clone(2) equivalent is the only way to create a process, since vfork(2) is obsolete. (There are a few odd exceptions that you should generally ignore: the kernel itself creates some processes, such as /sbin/init.)

answered Nov 07 '22 07:11

Basile Starynkevitch

The problem is that to run the main function of a conventionally linked executable you need to call execve, and exec replaces the whole process image. To keep the caller alive you therefore need a new process with its own address space, which is what fork is for.

You can get around this by having your callee expose its main functionality in a shared library (in which case the function must not be called main). You can then load that function and call it directly, without having to fork (provided there are no symbol conflicts).
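A rough sketch of the shared-library approach, using dlopen(3) and dlsym(3). The helper name `call_library_entry`, the entry point signature `int (*)(int, char **)`, and any library path or symbol name passed to it are assumptions made for illustration; the callee has to be built as a shared object exporting such a function. On older glibc versions this needs to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature of the callee's renamed "main" functionality. */
typedef int (*entry_fn)(int, char **);

/* Hypothetical helper: loads lib_path, looks up symbol, and calls it in the
 * current process. Returns the callee's result, or -1 if loading fails
 * (note that a callee returning -1 is indistinguishable in this sketch). */
int call_library_entry(const char *lib_path, const char *symbol,
                       int argc, char **argv)
{
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen: %s\n", err ? err : "unknown error");
        return -1;
    }

    entry_fn entry;
    /* POSIX-recommended idiom for converting dlsym's void * result. */
    *(void **)(&entry) = dlsym(handle, symbol);
    if (entry == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    int rc = entry(argc, argv);    /* plain function call: no fork, no exec */
    dlclose(handle);
    return rc;
}
```

The trade-off is that the callee now runs inside the caller's address space, so a crash or a call to exit in the callee takes the caller down with it.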

That would be a more efficient alternative to system (basically with the efficiency of a function call). popen, on the other hand, involves pipes, and to use a pipe you need its two ends in different schedulable units. Threads, which share one address space, can be used here as a lighter alternative to separate processes.
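A minimal sketch of the thread-plus-pipe idea, assuming POSIX threads are available (link with `-pthread` where required). The thread stands in for the process that popen would have started; the names `producer` and `read_from_thread` and the message text are hypothetical.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* Producer thread: plays the role of the popen'd child and writes into
 * the pipe's write end. */
static void *producer(void *arg)
{
    int write_fd = *(int *)arg;
    const char *msg = "hello from thread\n";
    if (write(write_fd, msg, strlen(msg)) < 0) {
        /* A real program would report this; the sketch ignores it. */
    }
    close(write_fd);               /* closing signals EOF to the reader */
    return NULL;
}

/* Hypothetical helper: reads what the thread wrote into out (at most
 * cap - 1 bytes, NUL-terminated). Returns bytes read, or -1 on error. */
ssize_t read_from_thread(char *out, size_t cap)
{
    int fds[2];
    pthread_t tid;

    if (cap == 0 || pipe(fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, producer, &fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    size_t total = 0;
    ssize_t n = 0;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';

    close(fds[0]);
    pthread_join(tid, NULL);
    return n < 0 ? -1 : (ssize_t)total;
}
```

The reader and writer are separate schedulable units, as a pipe requires, but no address space is created or copied.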

answered Nov 07 '22 08:11

PSkocik
