Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding os.fork and Queue.Queue

I wanted to implement a simple python program using parallel execution. It's I/O bound, so I figured threads would be appropriate (as opposed to processes). After reading the documentation for Queue and fork, I thought something like the following might work.

q = Queue.Queue()

if os.fork():            # child
    while True:
        print q.get()
else:                    # parent
    [q.put(x) for x in range(10)]

However, the get() call never returns. I thought it would return once the other thread executes a put() call. Using the threading module, things behave more like I expected:

q = Queue.Queue()

def consume(q):
    while True:
        print q.get()

worker = threading.Thread (target=consume, args=(q,))
worker.start()

[q.put(x) for x in range(10)]

I just don't understand why the fork approach doesn't do the same thing. What am I missing?

like image 229
Brian Hawkins Avatar asked Mar 29 '12 20:03

Brian Hawkins


People also ask

What is an OS fork?

In an operating system, a fork is a Unix or Linux system call to create a new process from an existing running process. The new process is a child process of the calling parent process.

Is OS fork () a system call?

System call fork() is used to create processes. It takes no arguments and returns a process ID. The purpose of fork() is to create a new process, which becomes the child process of the caller. After a new child process is created, both processes will execute the next instruction following the fork() system call.

What happens when OS fork is executed?

Thus executing os. fork() creates two processes: A parent processparent processIn computing, a parent process is a process that has created one or more child processes.https://en.wikipedia.org › wiki › Parent_processParent process - Wikipedia and a child process. The newly created child process is the exact replica of the parent process. The child process will have copies of the descriptors if any used by the parent process.

How does Python multiprocessing queue work?

The multiprocessing. Queue provides a first-in, first-out FIFO queue, which means that the items are retrieved from the queue in the order they were added. The first items added to the queue will be the first items retrieved. This is opposed to other queue types such as last-in, first-out and priority queues.

Does Python have a queue?

Python provides Class queue as a module which has to be generally created in languages such as C/C++ and Java. Initializes a variable to a maximum size of maxsize. A maxsize of zero '0' means a infinite queue. This Queue follows FIFO rule.

What is a fork Python?

fork() : fork() is an operation whereby a process creates a copy of itself. It is usually a system call, implemented in the kernel. getpid() : getpid() returns the process ID (PID) of the calling process. Below is Python program implementing above : # Python code to create child process.

What is stack and queue in Python?

Stacks and Queues are the earliest data structure defined in computer science. A simple Python list can act as a queue and stack as well. A queue follows FIFO rule (First In First Out) and used in programming for sorting. It is common for stacks and queues to be implemented with an array or linked list.

Is Python queue faster than list?

Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for pop(0) and insert(0, v) operations which change both the size and position of the underlying data representation. So using a deque will be much faster.


2 Answers

The POSIX fork system call creates a new process, rather than a new thread inside the same adress space:

The fork() function shall create a new process. The new process (child process) shall be an exact copy of the calling process (parent process) except as detailed below: [...]

So the Queue is duplicated in your first example, rather than shared between the parent and child.

You can use multiprocessing.Queue instead or just use threads like in your second example :)

By the way, using list comprehensions just for side effects isn't good practice for several reasons. You should use a for loop instead:

for x in range(10): q.put(x)
like image 100
Niklas B. Avatar answered Sep 23 '22 16:09

Niklas B.


To share the data between unrelated processes, you can use named pipes. Through the os.open() funcion.. http://docs.python.org/2/library/os.html#os.open. You can simply name a pipe as named_pipe='my_pipe' and in a different python programs use os.open(named_pipe, ), where mode is WRONLY and so on. After that you'll make a FIFO to write into the pipe. Don't forget to close the pipe and catch exceptions..

like image 22
user3238650 Avatar answered Sep 25 '22 16:09

user3238650