
Using multi-threading to process an image faster on python?

In a Python + Python Imaging Library (PIL) script, there's a function called processPixel(image, pos) that calculates a mathematical index as a function of an image and a position on it. This index is computed for each pixel using a simple for loop:

for x in range(image.size[0]):
    for y in range(image.size[1]):
        myIndex[x, y] = processPixel(image, [x, y])

This is taking too much time. How could threading be implemented to split the work and speed it up? How much faster could multi-threaded code be? Specifically, is that determined by the number of processor cores?

asked Jan 17 '23 by MaiaVictor


1 Answer

You cannot speed this up using threading, because of the Global Interpreter Lock. Certain internal state of the Python interpreter is protected by that lock, which prevents different threads that need to modify that state from running concurrently. For CPU-bound pure-Python work like yours, only one thread executes bytecode at a time, regardless of how many cores you have.

You could speed it up by spawning actual processes using multiprocessing. Each process will run in its own interpreter, thus circumventing the limitation of threads. With multiprocessing, you can either use shared memory, or give each process its own copy/partition of the data.
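A minimal sketch of the multiprocessing route with a Pool, under some assumptions: the body of processPixel here is a placeholder stand-in (the question doesn't show the real computation), and the image itself is replaced by a dummy value, since whatever you pass to the workers must be picklable. Dispatching one task per row, rather than per pixel, keeps the number of tasks small:

```python
from multiprocessing import Pool

def processPixel(image, pos):
    # Stand-in for the question's function: any pure per-pixel computation.
    x, y = pos
    return (x * 31 + y * 17) % 255

def process_row(args):
    # One task per row limits the pickling/dispatch overhead of the pool.
    image, x, height = args
    return x, [processPixel(image, (x, y)) for y in range(height)]

if __name__ == "__main__":
    width, height = 64, 48       # placeholders for image.size
    image = None                 # placeholder; real code would pass pixel data
    with Pool() as pool:         # defaults to one worker per CPU core
        rows = pool.map(process_row, [(image, x, height) for x in range(width)])
    myIndex = {(x, y): v for x, row in rows for y, v in enumerate(row)}
```

Note that the image argument is pickled once per task here; if the image is large, it can be cheaper to send each worker only the strip of pixel data it actually needs.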

Depending on your task, you can either parallelize the processing of a single image by partitioning it, or parallelize the processing of a list of images (the latter is easily done using a pool). For the former, you might want to store the image in an Array that can be accessed as shared memory, but you would still have to solve the problem of where to write the results (frequent writes to shared memory can hurt performance badly). Also note that certain kinds of communication between processes (Queues, Pipes, or the parameter/return-value passing of functions in the module) require the data to be serialized using Pickle. This imposes certain limitations on the data, and can create significant performance overhead (especially if you have many small tasks).
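The shared-memory variant might look like the sketch below, assuming the results fit in a flat array of ints and again using a stand-in body for processPixel. Each process writes only its own strip of columns, so no two processes ever touch the same index and no lock is needed:

```python
from multiprocessing import Process, Array

WIDTH, HEIGHT = 64, 48  # placeholders for image.size

def processPixel(image, pos):
    # Stand-in: any per-pixel computation.
    x, y = pos
    return (x * 31 + y * 17) % 255

def fill_strip(result, x_start, x_end):
    # Writes a disjoint strip of columns into the shared result array.
    for x in range(x_start, x_end):
        for y in range(HEIGHT):
            result[x * HEIGHT + y] = processPixel(None, (x, y))

if __name__ == "__main__":
    # lock=False is safe here because the strips never overlap.
    result = Array('i', WIDTH * HEIGHT, lock=False)
    n_procs = 4
    step = WIDTH // n_procs
    procs = [Process(target=fill_strip, args=(result, i * step, (i + 1) * step))
             for i in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    myIndex = {(x, y): result[x * HEIGHT + y]
               for x in range(WIDTH) for y in range(HEIGHT)}
```

Because each worker writes to a disjoint region, there is no contention; the main cost left is process startup and the final read-back.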

Another way to improve the performance of such operations is to write them in Cython, which has its own support for parallelization using OpenMP - I have never used that though, so I don't know how much help it can be.

answered Feb 12 '23 by Björn Pollex