
Could threading or multiprocessing improve performance when analyzing a single string with multiple regular expressions?

If I want to analyze a string using dozens of regular expressions,
could either the threading or multiprocessing module improve performance?
In other words, would analyzing the string on multiple threads or processes be faster than:

match = re.search(regex1, string)
if match:
    afunction(match)
else:
    match = re.search(regex2, string)
    if match:
        bfunction(match)
    else:
        match = re.search(regex3, string)
        if match:
            cfunction(match)
...

No more than one regular expression would ever match, so that's not a concern.
If the answer is multiprocessing, what technique would you recommend looking into (queues, pipes)?

Honest Abe asked May 23 '26 04:05



1 Answer

Python threading won't improve performance here because of the GIL (global interpreter lock), which prevents more than one thread from executing Python code at a time; in CPython the re module holds the GIL while it matches. On a multicore machine, multiple processes may speed things up, but only if the cost of spawning the subprocesses and passing data between them is less than the cost of performing your RE searches.
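Before reaching for extra processes, it can help to time the sequential version, since that is the cost the process overhead has to beat. The following is only a minimal sketch: the patterns, the handlers, and the names RULES, dispatch and time_dispatch are hypothetical, as the question does not show the real ones.

```python
import re
import time

# Hypothetical patterns and handlers; the question does not show the real ones.
RULES = [
    (re.compile(r"^\d+$"), lambda m: ("number", m.group(0))),
    (re.compile(r"^[A-Za-z]+$"), lambda m: ("word", m.group(0))),
    (re.compile(r"^\s*$"), lambda m: ("blank", m.group(0))),
]


def dispatch(text):
    """Try each pattern in order and call the handler of the first match."""
    for pattern, handler in RULES:
        match = pattern.search(text)
        if match:
            return handler(match)
    return None


def time_dispatch(text, repeats=10_000):
    """Return the average number of seconds per sequential dispatch call."""
    start = time.perf_counter()
    for _ in range(repeats):
        dispatch(text)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(dispatch("hello"))
    print(f"{time_dispatch('hello'):.2e} s per call")
```

If the measured time per string is small compared with the time needed to send one task to another process and get a result back, splitting the work across processes is unlikely to pay off.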

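If you do this often, you might look into a pool of long-lived worker processes (for example multiprocessing.Pool), so the startup cost is paid once rather than for every string. A thread pool would run into the same GIL limit.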
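Since threads are limited by the GIL, a pool in this setting would most plausibly be a pool of worker processes. Here is a rough sketch using multiprocessing.Pool, created once and reused for many strings. The pattern list and the names PATTERNS, search_one and search_in_pool are assumptions made for illustration, not something from the question.

```python
import re
from multiprocessing import Pool

# Hypothetical pattern strings; the question does not show the real ones.
PATTERNS = [r"^\d+$", r"^[A-Za-z]+$", r"^\s*$"]


def search_one(job):
    """Worker: run one pattern against the text.

    Match objects cannot be pickled, so plain data is returned instead.
    """
    index, pattern, text = job
    match = re.search(pattern, text)
    return (index, match.group(0)) if match else None


def search_in_pool(pool, text):
    """Run every pattern in the pool and return the first hit, or None."""
    jobs = [(i, p, text) for i, p in enumerate(PATTERNS)]
    for result in pool.map(search_one, jobs):
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    # Create the pool once; starting processes is the expensive part.
    with Pool(processes=2) as pool:
        for text in ["hello", "123", "a1"]:
            print(text, search_in_pool(pool, text))
```

Each call still pickles the string and the results between processes, so for short strings and cheap patterns this may well be slower than the plain sequential chain. The caller would map the returned index back to afunction, bfunction and so on in the parent process.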

