Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

why python multithreading runs like a single thread on macos?

I have a similiar and simple computation task with three different parameters. So I take this chance to test how much time I can save by using multithreading.

Here is my code:

import threading
import time
from Crypto.Hash import MD2

def calc_func(text):
    t1 = time.time()
    h = MD2.new()
    total = 10000000
    old_text =text
    for n in range(total):
        h.update(text)
        text = h.hexdigest()
    print(f"thread done: old_text={old_text} new_text={text}, time={time.time()-t1}sec")

def do_3threads():
    t0 = time.time()
    texts = ["abcd", "abcde", "abcdef"]
    ths = []
    for text in texts:
        th = threading.Thread(target=calc_func, args=(text,))
        th.start()
        ths.append(th)
    for th in ths:
        th.join()
    print(f"main done: {time.time()-t0}sec")

def do_single():
    texts = ["abcd", "abcde", "abcdef"]
    for text in texts:
        calc_func(text)

if __name__ == "__main__":
    print("=== 3 threads ===")
    do_3threads()
    print("=== 1 thread ===")
    do_single()

The result is astonishing, each thread is taking roughly 4x time it takes if single threaded:

=== 3 threads ===
thread done: old_text=abcdef new_text=e8f636b1893f12abe956dc019294e923, time=25.460321187973022sec
thread done: old_text=abcd new_text=0d6cae713809c923475ea50dbfbb2c13, time=25.47859835624695sec
thread done: old_text=abcde new_text=cd028131bc5e161671a1c91c62e80f6a, time=25.4807870388031sec
main done: 25.481309175491333sec
=== 1 thread ===
thread done: old_text=abcd new_text=0d6cae713809c923475ea50dbfbb2c13, time=6.393985033035278sec
thread done: old_text=abcde new_text=cd028131bc5e161671a1c91c62e80f6a, time=6.5472939014434814sec
thread done: old_text=abcdef new_text=e8f636b1893f12abe956dc019294e923, time=6.483690977096558sec

This is totally not what I expected. This task is obviously a CPU intensive task, so I expect that, with multithreading, each thread could just take around 6.5 seconds and the whole process takes slightly over that, instead it took actually ~25.5 seconds, even worse than single threaded mode, which is ~20seconds.

The environment is python 3.7.7, macos 10.15.5, CPU is 8-core Intel i9, 16G memory.

Can someone explain that to me? Any input is appreciated.

like image 883
Cuteufo Avatar asked Jun 04 '26 06:06

Cuteufo


1 Answers

This task is obviously a CPU intensive task

Multithreading is not the proper tool for CPU bound tasks, but rather for something like network requests. This is because each Python process is limited to a single core due to the Global Interpreter Lock (GIL). All threads spawned by a process will run on the same core as the parent process.

Multiprocessing is what you are looking for, as it allows you to spawn multiple processes on, potentially, multiple cores.

like image 122
chrisharrel Avatar answered Jun 07 '26 09:06

chrisharrel