Communication between parent and child process when forking in Python

Tags: python, fork, ipc

I was trying to have a Python program run a processing loop and, at the same time, a service that broadcasts the result, using a call to os.fork(), something like:

pid = os.fork()
if pid == 0:
    time.sleep(3)
    keep_updating_some_value_while_parent_is_running()
else:
    broadcast_value()

Here keep_updating_some_value_while_parent_is_running(), which is executed by the child, computes a value that it keeps updating for as long as the parent is running. It writes the value to disk so that the parent can easily access it. It detects that the parent is running by checking that the parent's web service is available.

broadcast_value() runs a web service. When consulted, it reads the most recent value from disk and serves it.

This implementation works well, but it is unsatisfactory for several reasons:

  1. The time.sleep(3) is necessary because the web service requires some startup time. There is no guarantee that the service will be up and running after 3 seconds, and it may equally be ready much sooner.

  2. Sharing data via disk is not always a good option or not even possible (so this solution doesn't generalize well).

  3. Detecting that the parent is running by checking that the web service is available is far from ideal, and for other kinds of processes (ones that cannot be polled so easily) it wouldn't work at all. The web service may also be running fine while being temporarily unreachable.

  4. The solution is OS dependent.

  5. When the child fails or exits for some reason, the parent will just keep running (this may be the desired behavior, but not always).

What I would like is some way for the child process to know when the parent is up and running and when it has stopped, and for the parent to obtain the most recent value computed by the child on request, preferably in an OS-independent way. Solutions involving non-standard libraries are also welcome.

doetoe asked Oct 29 '22 19:10


1 Answer

I'd recommend using multiprocessing rather than os.fork(), as it handles a lot of details for you and works across operating systems. In particular it provides Manager, which gives you a nice way to share data between processes. You'd start one Process to handle getting the data and another to do the web serving, and pass them both a shared dictionary provided by the Manager. The main process is then just responsible for setting all that up and waiting for the processes to finish. The waiting matters: if the main process exits first, the Manager shuts down with it and the shared dictionary stops working.
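To isolate just the sharing mechanism, here is a minimal sketch with a single child process. The names producer and run_demo and the value 42 are placeholders chosen for illustration; they are not part of the question's program.

```python
from multiprocessing import Manager, Process


def producer(shared):
    """Runs in a child process and writes into the shared proxy dictionary."""
    shared["data"] = 42  # placeholder value, purely illustrative


def run_demo():
    """Start a manager, let one child write a value, and read it back in the parent."""
    with Manager() as manager:
        shared = manager.dict()
        child = Process(target=producer, args=(shared,))
        child.start()
        child.join()  # wait for the child before reading the result
        # Copy into a plain dict while the manager is still alive.
        return dict(shared)


if __name__ == "__main__":
    print(run_demo())
```

The parent sees the child's write because both talk to the same manager process through a proxy object. The full arrangement described above adds a second process and a start-up flag.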

Here's what this might look like:

import time
from multiprocessing import Manager, Process

LOOP_DELAY = 1  # seconds between updates; pick whatever suits your data

def get_data():
    """ Does the actual work of getting the updating value. """

def update_the_data(shared_dict):
    while not shared_dict.get('server_started'):
        time.sleep(.1)
    while True:
        shared_dict['data'] = get_data()
        shared_dict['data_timestamp'] = time.time()
        time.sleep(LOOP_DELAY)


def serve_the_data(shared_dict):
    server = initialize_server() # whatever this looks like
    started_at = time.time()
    shared_dict['server_started'] = True
    while True:
        server.serve_with_timeout()
        # fall back to the start time until the updater has written once
        last_update = shared_dict.get('data_timestamp', started_at)
        if time.time() - last_update > 30:
            # child hasn't updated data for 30 seconds; problem?
            handle_child_problem()


if __name__ == '__main__':
    manager = Manager()
    shared_dict = manager.dict()
    processes = [Process(target=update_the_data, args=(shared_dict,)),
        Process(target=serve_the_data, args=(shared_dict,))]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
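A possible variation, as a sketch only: the start-up flag can be a multiprocessing.Event instead of a polled dictionary key, so the worker blocks until the server is ready rather than sleeping in a loop. A second Event tells the worker that the parent is shutting down, and the parent can use is_alive() and exitcode to notice a dead worker. The names worker and run_demo, the counter standing in for the real data, and the sleep standing in for server start-up are all assumptions made for the example. This only covers an orderly shutdown; a parent that is killed outright never sets the stop event.

```python
import time
from multiprocessing import Event, Process, Value


def worker(server_ready, stop, latest):
    """Child: wait for the ready signal, then update a value until told to stop."""
    if not server_ready.wait(timeout=10):
        return  # the parent never became ready; give up instead of blocking forever
    while not stop.is_set():
        with latest.get_lock():
            latest.value += 1  # stand-in for computing the real value
        stop.wait(0.05)  # pause between updates, waking early if asked to stop


def run_demo():
    server_ready = Event()
    stop = Event()
    latest = Value("i", 0)  # shared integer, starts at 0
    child = Process(target=worker, args=(server_ready, stop, latest))
    child.start()

    time.sleep(0.2)      # stand-in for real server start-up work
    server_ready.set()   # tell the child the service is up

    # Stand-in for serving requests: wait until a few updates have arrived.
    deadline = time.monotonic() + 5
    while latest.value < 3 and time.monotonic() < deadline:
        time.sleep(0.01)

    alive_while_serving = child.is_alive()  # False would mean the child died
    stop.set()           # tell the child the parent is shutting down
    child.join(timeout=5)
    return alive_while_serving, child.exitcode, latest.value


if __name__ == "__main__":
    print(run_demo())
```

Value fits a single number; for richer data, the Manager dictionary from the code above can be combined with the same two events.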
Nathan Vērzemnieks answered Nov 13 '22 05:11