I have decided to learn how multi-threading is done in Python, and I did a comparison to see what kind of performance gain I would get on a dual-core CPU. I found that my simple multi-threaded code actually runs slower than the sequential equivalent, and I can't figure out why.
The test I contrived was to generate a large list of random numbers and then print the maximum:
from random import random
import threading

def ox():
    print max([random() for x in xrange(20000000)])

ox()
takes about 6 seconds to complete on my Intel Core 2 Duo, while ox(); ox() takes about 12 seconds.
I then tried calling ox() from two threads to see how fast that would complete.
def go():
    r = threading.Thread(target=ox)
    r.start()
    ox()

go()
takes about 18 seconds to complete, with the two results printing within 1 second of each other. Why should this be slower?
I suspect ox() is being parallelized automatically, because if I look at the Windows Task Manager performance tab and call ox() in my Python console, both processors jump to about 75% utilization until it completes. Does Python automatically parallelize things like max() when it can?
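You need to use a multi-process framework to parallelize with Python. In CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so CPU-bound threads take turns instead of running in parallel. Luckily, the multiprocessing module, which ships with Python, makes running work in separate processes fairly easy.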
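As a rough illustration of the multiprocessing approach, here is a minimal sketch. It assumes Python 3 (the question's code is Python 2), and the names max_of_randoms and parallel_max are made up for this example. The list size is reduced so that it finishes quickly.

```python
from multiprocessing import Pool
from random import random


def max_of_randoms(n):
    """Return the largest of n random floats in [0.0, 1.0)."""
    return max(random() for _ in range(n))


def parallel_max(n_per_worker, workers=2):
    """Run max_of_randoms in `workers` separate processes and collect the results."""
    with Pool(processes=workers) as pool:
        # Each worker process has its own interpreter and its own GIL,
        # so the calls can run on different cores at the same time.
        return pool.map(max_of_randoms, [n_per_worker] * workers)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module (Windows, for example).
    print(parallel_max(1000000))
```

Starting processes and passing results between them has its own cost, so a gain should only be expected when each task does a substantial amount of work.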
Very few languages can auto-parallelize expressions. If that is the functionality you want, I suggest Haskell (Data Parallel Haskell).
The problem is in the function random(), as you can see if you remove random from your code. Both cores try to access the shared state of the random function. The cores end up working sequentially and spend a lot of time on cache synchronization. Such behavior is known as false sharing. Read this article: False Sharing
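One way to test this claim is to give each thread its own generator, so that no generator state is shared between threads. This is only a sketch, assuming Python 3; max_with_own_rng and run_threads are hypothetical names. In CPython the threads still take turns under the GIL, so this isolates the effect of the shared generator and nothing more.

```python
import threading
from random import Random


def max_with_own_rng(n, results, index):
    """Store the largest of n random floats, drawn from a private generator."""
    rng = Random()  # separate state per call, seeded by the random module's default
    results[index] = max(rng.random() for _ in range(n))


def run_threads(n, thread_count=2):
    """Run max_with_own_rng in several threads and return their results."""
    results = [None] * thread_count
    threads = [
        threading.Thread(target=max_with_own_rng, args=(n, results, i))
        for i in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before reading the results
    return results
```

Timing run_threads(n) against a version that calls the module-level random() would show how much of the slowdown, if any, comes from the shared generator.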