Context
I often found myself in the following situation:
The problem is that simply reading the image takes a non-negligible amount of time, sometimes comparable to or even longer than the image processing itself.
Question
So I was thinking that ideally I could read image n + 1 while processing image n. Or, even better, process and read multiple images at once in an automagically determined optimal way?
I have read about multiprocessing, threads, twisted, gevent and the like, but I can't figure out which one to use or how to implement this idea. Does anyone have a solution to this kind of issue?
Minimal example
import scipy.misc
import scipy.ndimage

# generate a list of images
scipy.misc.imsave("lena.png", scipy.misc.lena())
files = ['lena.png'] * 100

# a simple image processing task
def process_image(im, threshold=128):
    label, n = scipy.ndimage.label(im > threshold)
    return n

# my current main loop
for f in files:
    im = scipy.misc.imread(f)
    print(process_image(im))
Philip's answer is good, but it will only create a couple of processes (one reading, one computing), which will hardly max out a modern system with more than two cores. Here's an alternative using multiprocessing.Pool (specifically, its map method). It creates processes that each do both the reading and the computing, and it should make better use of all the cores you have available (assuming there are more files than cores).
#!/usr/bin/env python
import multiprocessing
import scipy
import scipy.misc
import scipy.ndimage

class Processor:
    # A callable object, so the threshold travels with the function given to Pool.map.
    def __init__(self, threshold):
        self._threshold = threshold

    def __call__(self, filename):
        # Each worker process both reads the image and processes it.
        im = scipy.misc.imread(filename)
        label, n = scipy.ndimage.label(im > self._threshold)
        return n

def main():
    scipy.misc.imsave("lena.png", scipy.misc.lena())
    files = ['lena.png'] * 100

    proc = Processor(128)
    pool = multiprocessing.Pool()
    results = pool.map(proc, files)

    print(results)

if __name__ == "__main__":
    main()
If I increase the number of images to 500 and use the processes=N argument to Pool, then I get

Processes  Runtime
1          6.2s
2          3.2s
4          1.8s
8          1.5s

on my quad-core hyperthreaded i7.
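A scaling table like the one above can be produced with a small timing harness. The following is only a sketch: busy_work, time_pool and the task sizes are hypothetical names and values, and the workload is a synthetic CPU-bound loop standing in for the read-and-label step, so that it runs without SciPy or image files. The timings it prints will differ from the figures quoted above.

```python
import multiprocessing
import time


def busy_work(n):
    """Hypothetical stand-in for reading and labelling one image (CPU-bound)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pool(processes, tasks):
    """Map busy_work over tasks with the given pool size.

    Returns (elapsed_seconds, results); the time includes pool start-up.
    """
    start = time.perf_counter()
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.map(busy_work, tasks)
    return time.perf_counter() - start, results


def main():
    tasks = [100000] * 40  # assumed workload: 40 equally sized tasks
    print("Processes  Runtime")
    for processes in (1, 2, 4, 8):
        elapsed, _ = time_pool(processes, tasks)
        print("{:<9d}  {:.2f}s".format(processes, elapsed))


if __name__ == "__main__":
    main()
```

To time the real workload, the body of busy_work would be replaced by the read and label calls, and tasks by the list of filenames.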
If you get into more realistic use cases (i.e. actual different images), your processes might spend more time waiting for the image data to load from storage (in my testing, the files load virtually instantaneously from the disk cache). In that case it might be worth explicitly creating more processes than cores to get some more overlap of compute and load. Only your own scalability testing on a realistic load and hardware can tell you what's actually best for you, though.
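As a rough illustration of creating more processes than cores, here is a sketch under stated assumptions: load_and_process, run and the oversubscribe parameter are hypothetical names, the storage wait is simulated with time.sleep, and the computation is a synthetic loop rather than real image processing. Whether oversubscribing helps at all depends on how long the real reads block.

```python
import multiprocessing
import time


def load_and_process(task):
    """Hypothetical stand-in for one image: wait on 'storage', then compute."""
    io_delay, work = task
    time.sleep(io_delay)                     # simulated read from slow storage
    return sum(i * i for i in range(work))   # simulated image processing


def run(tasks, oversubscribe=2):
    """Process tasks using `oversubscribe` worker processes per core."""
    processes = oversubscribe * multiprocessing.cpu_count()
    with multiprocessing.Pool(processes=processes) as pool:
        # chunksize=1 hands out one task at a time, so a worker blocked on
        # a slow read does not also hold a batch of other tasks.
        return pool.map(load_and_process, tasks, chunksize=1)


def main():
    tasks = [(0.05, 50000)] * 32  # assumed: 50 ms wait, small compute step
    for factor in (1, 2):
        start = time.perf_counter()
        run(tasks, oversubscribe=factor)
        elapsed = time.perf_counter() - start
        print("{}x cores: {:.2f}s".format(factor, elapsed))


if __name__ == "__main__":
    main()
```

While one worker sleeps in the simulated read, the operating system can schedule another worker's compute step on the same core, which is the overlap described above.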