
How to use tqdm through multi process in python?

I'm trying to use tqdm across multiple processes, and the behavior is not what I expected. I think the problem is that the value of pbar isn't updated across the processes. How should I deal with this? I have also tried using a Value to update pbar.n manually, but that failed too. It seems tqdm doesn't support updating the value and rendering manually.

from multiprocessing import Process, Lock
from time import sleep
from tqdm import tqdm

def test(lock, pbar):
    for i in range(10000):
        sleep(0.1)
        lock.acquire()
        pbar.update()
        lock.release()

pbar = tqdm(total = 10000)
lock = Lock()
for i in range(5):
    Process(target = test, args = (lock, pbar)).start()
Sraw asked Mar 28 '17 08:03




1 Answer

Generally, each process has its own data, independent of every other process. Starting a new process (which calls os.fork on Unix when the fork start method is in use) creates a copy of the current process. Each process obtains its own copy of all global values (such as pbar). The global variables in one process may share their names with variables in the other processes, but each holds an independent value, so calling pbar.update() in a worker changes only that worker's copy.
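
To illustrate the point about per-process copies, here is a minimal sketch. The names counter, bump and run_demo are hypothetical and not part of the question's code. The child changes its own copy of a global and reports it back through a Queue, while the parent's value stays untouched.

```python
import multiprocessing as mp

counter = 0  # module-level global; every process gets its own copy


def bump(q):
    """Runs in the child: change the child's copy and report its value."""
    global counter
    counter += 100
    q.put(counter)


def run_demo():
    """Return (value seen in the child, value still seen in the parent)."""
    q = mp.Queue()
    p = mp.Process(target=bump, args=(q,))
    p.start()
    child_value = q.get()  # fetch the report before joining the child
    p.join()
    return child_value, counter


if __name__ == "__main__":
    print(run_demo())  # prints (100, 0)
```

The child's increment never reaches the parent, which is consistent with each worker in the question updating only its own pbar.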

In your case it looks like you want just one pbar to exist, with every call to update reaching that one pbar. So create pbar in only one process, and use a Queue to tell that process when to update it:

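import multiprocessing as mp
from time import sleep
from tqdm import tqdm

SENTINEL = 1  # any non-None value works as a tick; None stops the listener

def test(q):
    for i in range(10000):
        sleep(0.1)
        q.put(SENTINEL)  # report one finished step

def listener(q):
    # The only process that creates and draws the progress bar
    pbar = tqdm(total = 5 * 10000)  # 5 workers x 10000 steps each
    for item in iter(q.get, None):  # loop until None arrives
        pbar.update()
    pbar.close()

if __name__ == '__main__':
    q = mp.Queue()
    proc = mp.Process(target=listener, args=(q,))
    proc.start()
    workers = [mp.Process(target=test, args=(q,)) for i in range(5)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    q.put(None)  # all workers are done, so tell the listener to stop
    proc.join()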
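
A variation on the code above, offered as a sketch rather than as part of the answer: if the total number of updates is known in advance, the parent process can own the bar itself and drain the queue directly, so no listener process or None sentinel is needed. The names work, drain and run are made up for illustration, and the step counts are arbitrary.

```python
import multiprocessing as mp


def work(q, n_steps):
    """Runs in a worker: put one tick on the queue per finished step."""
    for _ in range(n_steps):
        q.put(1)


def drain(q, total, on_tick):
    """Runs in the parent: consume exactly `total` ticks, calling on_tick for each."""
    for _ in range(total):
        q.get()
        on_tick()


def run(n_workers, n_steps, on_tick):
    """Start the workers, report their progress via on_tick, then join them."""
    q = mp.Queue()
    workers = [mp.Process(target=work, args=(q, n_steps)) for _ in range(n_workers)]
    for w in workers:
        w.start()
    # Drain before joining so no worker is left blocked on a full queue.
    drain(q, n_workers * n_steps, on_tick)
    for w in workers:
        w.join()


if __name__ == "__main__":
    from tqdm import tqdm

    with tqdm(total=5 * 100) as pbar:
        run(5, 100, pbar.update)  # pbar.update() advances the bar by one
```

Because on_tick is only ever called in the parent, any callable can stand in for pbar.update, which makes the counting logic easy to check without a terminal.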
unutbu answered Sep 23 '22 00:09
