I've written a working Python program that parses a batch of binary files, extracting the data into a data structure. Each file takes around a second to parse, which translates to hours for thousands of files. I've implemented a threaded version of the batch parsing method with an adjustable number of threads, and I tested it on 100 files with a varying number of threads, timing each run. Here are the results (0 threads refers to my original, pre-threading code; 1 thread to the new version run with a single thread spawned):
0 threads: 83.842 seconds
1 thread:  78.777 seconds
2 threads: 105.032 seconds
3 threads: 109.965 seconds
4 threads: 108.956 seconds
5 threads: 109.646 seconds
6 threads: 109.520 seconds
7 threads: 110.457 seconds
8 threads: 111.658 seconds
Though spawning a thread confers a small performance increase over having the main thread do all the work, increasing the number of threads actually decreases performance. I would have expected to see performance increases, at least up to four threads (one for each of my machine's cores). I know threads have associated overhead, but I didn't think this would matter so much with single-digit numbers of threads.
I've heard of the "global interpreter lock", but as I move up to four threads I do see the corresponding number of cores at work: with two threads, two cores show activity during parsing, and so on.
I also tested some different versions of the parsing code to see whether my program is I/O bound. It doesn't seem to be: just reading in the file takes a relatively small proportion of the time, and processing the file accounts for almost all of it. If I skip the I/O and process an already-read version of a file, adding a second thread damages performance and a third improves it slightly. I'm just wondering why I can't take advantage of my computer's multiple cores to speed things up. Please post any questions or ways I could clarify.
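For reference, the threaded version is structured roughly like this (simplified; parse_file is a placeholder for my actual parsing routine):

```python
import queue
import threading

def parse_file(path):
    # Placeholder: the real parser does ~1 s of CPU-heavy work per file
    with open(path, "rb") as f:
        return f.read()

def parse_batch_threaded(paths, num_threads):
    # Queue of files to parse; each worker pulls from it until it's empty
    work = queue.Queue()
    for path in paths:
        work.put(path)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                path = work.get_nowait()
            except queue.Empty:
                return
            parsed = parse_file(path)  # the CPU-heavy part
            with results_lock:
                results.append(parsed)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```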
Multithreading is not always faster than serial execution. In CPython, dispatching a CPU-heavy task across multiple threads won't speed up the execution; on the contrary, it may degrade overall performance. Imagine it like this: if you have 10 tasks and each takes 10 seconds, serial execution takes 100 seconds in total. Running them in 10 threads still takes about 100 seconds, because the GIL lets only one thread execute Python bytecode at a time, and you pay extra for lock contention and thread switching on top.
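You can see this concretely with a minimal sketch using a made-up CPU-bound function (not the asker's parser); on standard CPython, the threaded run takes about as long as the serial one, or longer:

```python
import threading
import time

def cpu_task(n=10_000_000):
    # Pure-Python arithmetic loop: holds the GIL the whole time it runs
    total = 0
    for i in range(n):
        total += i
    return total

def run_serial(count):
    for _ in range(count):
        cpu_task()

def run_threaded(count):
    threads = [threading.Thread(target=cpu_task) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

for label, fn in (("serial", run_serial), ("threaded", run_threaded)):
    start = time.perf_counter()
    fn(4)
    print(f"{label}: {time.perf_counter() - start:.2f} seconds")
```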
Threading is efficient in CPython, but threads cannot run Python bytecode concurrently on different processors/cores. This is probably what was meant. It only affects you if you need to do CPU-bound shared-memory concurrency. Other Python implementations do not have this problem.
It's not that Python doesn't support multi-threading: CPython has a fully functional threading library, and the GIL does not prevent you from creating threads. What CPython does not support is true multi-core execution of Python code via those threads.
This is due to the Python GIL being the bottleneck that prevents threads from running fully concurrently. The best CPU utilisation is achieved with concurrent.futures.ProcessPoolExecutor or the multiprocessing module (e.g. its Process or Pool classes), which circumvent the GIL by using separate processes and let CPU-bound code run truly in parallel.
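A minimal sketch of that approach, assuming a top-level parse_file function that parses one binary file (the function body and file names here are placeholders, not the asker's code):

```python
from concurrent.futures import ProcessPoolExecutor

def parse_file(path):
    # Placeholder for the real CPU-heavy binary parsing
    with open(path, "rb") as f:
        data = f.read()
    return len(data)  # stand-in for the real parsed structure

def parse_batch(paths, workers=4):
    # Each worker is a separate process with its own interpreter and GIL,
    # so CPU-bound parsing can run on all cores in parallel.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(parse_file, paths))

if __name__ == "__main__":  # guard required where processes are spawned
    results = parse_batch(["a.bin", "b.bin", "c.bin"])
```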
This is sadly how things are in CPython, mainly due to the Global Interpreter Lock (GIL). Python code that's CPU-bound simply doesn't scale across threads (I/O-bound code, on the other hand, might scale to some extent).
There is a highly informative presentation by David Beazley where he discusses some of the issues surrounding the GIL. The video can be found here (thanks @Ikke!)
My recommendation would be to use the multiprocessing module instead of multiple threads.
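Applied to the batch-parsing case in the question, that could look something like this (parse_file is illustrative, and must be defined at module level so it can be pickled for the worker processes):

```python
import multiprocessing

def parse_file(path):
    # Illustrative stand-in for the CPU-heavy parser
    with open(path, "rb") as f:
        return path, len(f.read())

if __name__ == "__main__":
    paths = [f"file{i}.bin" for i in range(100)]  # placeholder file names
    # Defaults to one worker process per CPU core
    with multiprocessing.Pool() as pool:
        # imap_unordered yields results as workers finish them
        results = list(pool.imap_unordered(parse_file, paths, chunksize=10))
```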
The threading library does not actually utilize multiple cores simultaneously for computation. You should use the multiprocessing library instead for CPU-bound work.