Multiprocessing of video frames in Python

I am new to multiprocessing in Python. I want to extract features from each frame of hour-long video files. Processing each frame takes on the order of 30 ms. I thought multiprocessing was a good idea because each frame is processed independently of all other frames.

I want to store the results of the feature extraction in a custom class.

I read a few examples and ended up using multiprocessing and Queues as suggested here. The result was disappointing, though: each frame now takes about 1000 ms to process. I am guessing I produced a ton of overhead.

Is there a more efficient way to process the frames in parallel and collect the results?

To illustrate, I put together a dummy example.

import multiprocessing as mp
from multiprocessing import Process, Queue
import numpy as np
import cv2

def main():
    #path=r'path\to\some\video.avi' # raw string, so \t and \v are not read as escapes
    coordinates=np.random.random((1000,2))
    #video = cv2.VideoCapture(path)
    listOf_FuncAndArgLists=[]

    for i in range(50):
        #video.set(cv2.CAP_PROP_POS_FRAMES,i)
        #img_frame_original = video.read()[1]
        #img_frame_original=cv2.cvtColor(img_frame_original, cv2.COLOR_BGR2GRAY)
        img_frame_dummy=np.random.random((300,300)) #using dummy image for this example
        frame_coordinates=coordinates[i,:]
        listOf_FuncAndArgLists.append([parallel_function,frame_coordinates,i,img_frame_dummy])

    queues=[Queue() for fff in listOf_FuncAndArgLists] #create a queue object for each function
    jobs = [Process(target=storeOutputFFF,args=[funcArgs[0],funcArgs[1:],queues[iii]]) for iii,funcArgs in enumerate(listOf_FuncAndArgLists)]
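    for job in jobs: job.start() # Launch them all
    # Collect all the outputs before joining: joining first can deadlock if a
    # child is still flushing its result into the queue.
    results = [queue.get() for queue in queues]
    for job in jobs: job.join() # Wait for them all to finish
    return results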

def storeOutputFFF(fff,theArgs,que): #add a argument to function for assigning a queue
    print('MULTIPROCESSING: Launching %s in parallel' % fff.__name__) # works in Python 2 and 3
    que.put(fff(*theArgs)) #we're putting return value into queue

def parallel_function(frame_coordinates,i,img_frame_original):
    #do some image processing that takes about 20-30 ms
    dummyResult=np.argmax(img_frame_original)
    return(resultClass(dummyResult,i))

class resultClass(object):
    def __init__(self,maxIntensity,i):
        self.maxIntensity=maxIntensity
        self.i=i

if __name__ == '__main__':
    mp.freeze_support()
    a=main()
    print([x.maxIntensity for x in a])
jlarsch asked Oct 18 '22 06:10

2 Answers

Parallel processing in (regular) Python is a bit of a pain: in other languages we'd just use threads, but the GIL makes that problematic, and multiprocessing has a big overhead in moving data around. I've found that fine-grained parallelism is (relatively) hard to do, whilst processing 'chunks' of work that each take tens of seconds (or more) in a single process can be much more straightforward.

An easier path to parallel processing for your problem, if you're on a UNIXy system, would be to write a Python program that processes a segment of video specified on the command line (i.e. a frame number to start with and a number of frames to process), and then use the GNU parallel tool to process multiple segments at once. A second Python program can consolidate the results, either from a collection of files or from stdin, piped from parallel. This way the processing code doesn't need to do its own parallelism, but the input file has to be opened by several processes at once, and each of them has to extract frames starting from a mid-point. (This might also be extendable to work across multiple machines without changing the Python...)
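
For illustration, here is a minimal sketch of what such a segment-processing script could look like (Python 3). The script name `extract_segment.py`, the `--start`/`--count` options and the CSV output are assumptions made up for this example, and `np.argmax` is only a placeholder for the real feature extraction, as in the question's dummy code. Seeking with `CAP_PROP_POS_FRAMES` may not be frame-accurate for every codec, so it is worth checking on your own files.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the (hypothetical) command line: VIDEO --start N --count M."""
    parser = argparse.ArgumentParser(description="Process one segment of a video.")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("--start", type=int, default=0, help="index of the first frame")
    parser.add_argument("--count", type=int, required=True, help="number of frames to process")
    return parser.parse_args(argv)


def process_segment(path, start, count, out=sys.stdout):
    """Write one 'frame_index,feature' line per processed frame to `out`."""
    # Imported here so that argument parsing works even without OpenCV installed.
    import cv2
    import numpy as np

    video = cv2.VideoCapture(path)
    video.set(cv2.CAP_PROP_POS_FRAMES, start)  # seek once, then read sequentially
    for i in range(start, start + count):
        ok, frame = video.read()
        if not ok:  # ran past the end of the file
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        feature = int(np.argmax(gray))  # placeholder for the real 20-30 ms of work
        out.write("%d,%d\n" % (i, feature))
    video.release()


# Only run when called as a script with arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    args = parse_args()
    process_segment(args.video, args.start, args.count)
```

Something like `seq 0 1000 9000 | parallel -k python extract_segment.py video.avi --start {} --count 1000 > features.csv` should then run ten 1000-frame segments, with `-k` keeping the output in input order. The segment size is only an example; the consolidating program can be as simple as reading those rows back in.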

multiprocessing.Pool.map can be used in a similar way if you need a pure-Python solution: map over a list of tuples (say, (file, startframe, endframe)), then open the file inside the mapped function and process that segment.
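
A rough sketch of that Pool.map variant, written for Python 3. The helper names (`make_segments`, `process_segment`, `process_video`) and the `FrameResult` container are made up for this example, the total frame count is assumed to be known already, and `np.argmax` again stands in for the real feature extraction. Each worker opens the video itself, so only small tuples and results travel between processes instead of whole frames.

```python
import multiprocessing as mp
from collections import namedtuple

# Picklable result container; a plain class defined at module level works too.
FrameResult = namedtuple("FrameResult", ["i", "maxIntensity"])


def make_segments(path, n_frames, n_chunks):
    """Split frames [0, n_frames) into up to n_chunks contiguous (path, start, end) tuples."""
    step, extra = divmod(n_frames, n_chunks)
    segments = []
    start = 0
    for k in range(n_chunks):
        end = start + step + (1 if k < extra else 0)  # spread the remainder over the first chunks
        if end > start:
            segments.append((path, start, end))
        start = end
    return segments


def process_segment(segment):
    """Worker: open the file, seek to `start`, process frames start..end-1."""
    # Imported in the worker; assumes OpenCV and NumPy are installed.
    import cv2
    import numpy as np

    path, start, end = segment
    video = cv2.VideoCapture(path)
    video.set(cv2.CAP_PROP_POS_FRAMES, start)  # seek once, then read sequentially
    results = []
    for i in range(start, end):
        ok, frame = video.read()
        if not ok:  # ran past the end of the file
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        results.append(FrameResult(i, int(np.argmax(gray))))  # placeholder feature
    video.release()
    return results


def process_video(path, n_frames, n_workers=None):
    """Process the whole video with one segment per worker process."""
    n_workers = n_workers or mp.cpu_count()
    segments = make_segments(path, n_frames, n_workers)
    with mp.Pool(n_workers) as pool:
        per_segment = pool.map(process_segment, segments)  # results come back in segment order
    return [result for chunk in per_segment for result in chunk]
```

Call `process_video(path, n_frames)` from under the `if __name__ == '__main__':` guard, as in the question's code, so that the worker processes can import the module safely on Windows.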

Wuggy answered Oct 22 '22 10:10

Multiprocessing creates some overhead for starting several processes and bringing them all back together.

Your code does that for every frame.

Try splitting your video into N evenly-sized pieces and processing them in parallel.

Set N to the number of cores on your machine or something like that (your mileage may vary, but it's a good number to start experimenting with). There's no point in creating 50 processes if, say, only 4 of them are executing and the rest are simply waiting for their turn.

Daerdemandt answered Oct 22 '22 10:10
