Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python multiprocessing copy-on-write behaving differently between OSX and Ubuntu

I'm trying to share objects between the parent and child process in Python. To play around with the idea, I've created a simple Python script:

from multiprocessing import Process
from os import getpid

import psutil

shared = list(range(20000000))

def shared_printer():
    mem = psutil.Process(getpid()).memory_info().rss / (1024 ** 2)
    print(getpid(), len(shared), '{}MB'.format(mem))

if __name__ == '__main__':
    p = Process(target=shared_printer)
    p.start()
    shared_printer()
    p.join()

The code snippet uses the excellent psutil library to print the RSS (Resident Set Size). When I run this on OSX with Python 2.7.15, I get the following output:

(33101, 20000000, '1MB')
(33100, 20000000, '626MB')

When I run the exact same snippet on Ubuntu (Linux 4.15.0-1029-aws #30-Ubuntu SMP x86_64 GNU/Linux), I get the following output:

(4077, 20000000, '632MB')
(4078, 20000000, '629MB')

Notice that the child process' RSS is basicall 0MB on OSX and about the same size as the parent process' RSS in Linux. I had assumed that copy-on-write behavior would work the same way in Linux and allow the child process to refer to the parent process' memory for most pages (perhaps except the one storing the head of the object).

So I'm guessing that there's some difference in the copy-on-write behavior in the 2 systems. My question is: is there anything I can do in Linux to get that OSX-like copy-on-write behavior?

like image 828
chhabrakadabra Avatar asked Dec 18 '18 21:12

chhabrakadabra


1 Answers

So I'm guessing that there's some difference in the copy-on-write behavior >in the 2 systems. My question is: is there anything I can do in Linux to >get that OSX-like copy-on-write behavior?

The answer is NO. Behind the command psutil.Process(getpid()).memory_info().rss / (1024 ** 2) the OS uses the UNIX command $top [PID] and search for the field RES. Which contains the non-swapped physical memory a task has used in kb. i.e. RES = CODE + DATA.

IMHO, these means that both OS uses different memory managers. So that, it's almost impossible to constrain how much memory a process uses/needs. This is a intern issue of the OS. In Linux the child process has the same size of the parent process. Indeed, they copy the same stack, code and data. But different PCB (Process Control Block). Therefore, it is impossible to get close to 0 as OSX does. It smells that OSX does not literally copy the code and data. If they are the same code, it will make pointer to the data of the parent process.

PD: I hope that would help you!

like image 136
Agapito Gallart i Bernat Avatar answered Sep 23 '22 04:09

Agapito Gallart i Bernat