I've been working on a Python application which uses OpenCV to read frames from a video and create a composite of the "activity", i.e. the things that have changed from one frame to the next. To do that, I only really want to check one frame per second or so.
For a long time I've been using the following code (simplified, with some error checking, classes, etc removed for brevity) to get the video object and the first frame:
import cv2

video_capture = cv2.VideoCapture(video_fullpath)

def get_frame(time):
    # Seek to the requested position (in milliseconds) and read a single frame
    video_capture.set(cv2.CAP_PROP_POS_MSEC, time)
    capture_success, this_frame = video_capture.read()
    return this_frame

this_frame = get_frame(0)
The process of getting subsequent frames, using the latter two lines of code above, is really slow. On a 2015 MacBook Pro it takes 0.3-0.4s to get each frame (at 1sec intervals in the video, which is a ~100MB .mp4 video file). By comparison, the rest of my operations, which are comparing each frame to its predecessor, are very quick - typically less than 0.01s.
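To give a sense of what the fast part looks like, the comparison step is essentially diff-and-accumulate, something like this (a simplified sketch using grayscale differencing, not my exact code; the threshold value is illustrative):

import cv2

def update_composite(composite, prev_frame, this_frame, threshold=25):
    # Difference the two frames and keep only the pixels that changed noticeably
    diff = cv2.absdiff(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(this_frame, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
    # Copy the changed pixels from the new frame into the running composite
    composite[mask > 0] = this_frame[mask > 0]
    return composite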
I've therefore been looking at multi-threading, but I'm struggling.
I can get multi-threading working on a "lookahead" basis, i.e. whilst I'm processing one frame I can be getting the next one. And once I'm done processing the previous frame, I'll wait for the "lookahead" operation to finish before continuing. I do that with the following code:
from threading import Thread

frames = {}

def fetch_frame(time):
    # Wrapper used as the thread target: store the result so later calls find it cached
    frames[time] = get_frame(time)

def get_frame_async(time):
    if time not in frames:
        fetch_frame(time)
    # Look ahead: start fetching the following frame while the caller processes this one
    next_frame_thread = Thread(target=fetch_frame, args=(time + time_increment,))
    next_frame_thread.start()
    return frames[time], next_frame_thread

while True:
    this_frame, next_frame_thread = get_frame_async(prev_frame.time + time_increment)
    # << do processing of this_frame ... >>
    next_frame_thread.join()
The above seems to be working, but because the seeking operation is so slow compared to everything else it doesn't actually save much time - in fact it's difficult to see any benefit at all.
I then wondered whether I could be getting multiple frames in parallel. However, whenever I try I get a range of errors, mostly related to async_lock (e.g. "Assertion fctx->async_lock failed at libavcodec/pthread_frame.c:155"). I wonder whether this is simply that an OpenCV VideoCapture object can't seek to multiple places at once... which would seem reasonable. But if that's true, is there any way to speed this operation up significantly?
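To be concrete, this is roughly the kind of parallel fetch I mean. I've sketched it with each worker opening its own VideoCapture, which is an untested assumption on my part about avoiding the shared decoder, and fetch_at is a hypothetical helper rather than the code above:

from concurrent.futures import ThreadPoolExecutor
import cv2

def fetch_at(video_fullpath, time_ms):
    # Hypothetical helper: each call opens its own capture so threads never share a decoder
    cap = cv2.VideoCapture(video_fullpath)
    cap.set(cv2.CAP_PROP_POS_MSEC, time_ms)
    success, frame = cap.read()
    cap.release()
    return time_ms, frame if success else None

with ThreadPoolExecutor(max_workers=4) as pool:
    times = [t * 1000 for t in range(100)]
    frames = dict(pool.map(fetch_at, [video_fullpath] * len(times), times))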
I've been using a few different sources, including this one https://nrsyed.com/2018/07/05/multithreading-with-opencv-python-to-improve-video-processing-performance/ which shows huge speed-ups, but I'm struggling with why I'm getting these errors around async_lock. Is it just the seek operation? I can't find any examples of multithreading whilst seeking around the video - just examples of people reading all frames sequentially.
Any tips or guidance on where / which parts are most likely to benefit from multithreading (or another approach) would be most welcome. This is my first attempt at multithreading, so I completely accept I might have missed something obvious! Looking at this page (https://www.toptal.com/python/beginners-guide-to-concurrency-and-parallelism-in-python), I was a bit overwhelmed by the range of different options available.
Thanks!
Based on the comments on the original question I've done some testing and thought it worth sharing the (interesting) results. There's a big potential saving for anyone using OpenCV's VideoCapture.set(CAP_PROP_POS_MSEC) or VideoCapture.set(CAP_PROP_POS_FRAMES).
I've done some profiling comparing three options:
1. GET FRAMES BY SEEKING TO TIME:
frames = {}

def get_all_frames_by_ms(time):
    while True:
        video_capture.set(cv2.CAP_PROP_POS_MSEC, time)
        capture_success, frames[time] = video_capture.read()
        if not capture_success:
            break
        time += 1000
2. GET FRAMES BY SEEKING TO FRAME NUMBER:
frames = {}

def get_all_frames_by_frame(time):
    while True:
        # Note my test video is 12.333 FPS, and time is in milliseconds
        video_capture.set(cv2.CAP_PROP_POS_FRAMES, int(time / 1000 * 12.333))
        capture_success, frames[time] = video_capture.read()
        if not capture_success:
            break
        time += 1000
3. GET FRAMES BY GRABBING ALL, BUT RETRIEVING ONLY THE ONES I WANT:
frames = {}

def get_all_frames_in_order():
    prev_time = -1
    while True:
        grabbed = video_capture.grab()
        if grabbed:
            time_s = video_capture.get(cv2.CAP_PROP_POS_MSEC) / 1000
            if int(time_s) > int(prev_time):
                # Only retrieve and save the first frame in each new second
                _, frames[int(time_s)] = video_capture.retrieve()
            prev_time = time_s
        else:
            break
I timed those three approaches, with three runs of each. In each case it's saving 100 frames at 1 sec intervals into a dictionary, where each frame is a 3072x1728 image, from a .mp4 video file - all on a 2015 MacBook Pro with a 2.9 GHz Intel Core i5 and 8GB RAM.
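For anyone wanting to reproduce the comparison, a harness along these lines works (a sketch, not my exact test script; the rewind between runs is illustrative - re-creating the capture also works):

from time import perf_counter

def time_it(func, *args, runs=3):
    # Average wall-clock time over a few runs
    timings = []
    for _ in range(runs):
        frames.clear()                                # start each run with an empty cache
        video_capture.set(cv2.CAP_PROP_POS_MSEC, 0)   # rewind to the start of the video
        start = perf_counter()
        func(*args)
        timings.append(perf_counter() - start)
    return sum(timings) / len(timings)

print(time_it(get_all_frames_by_ms, 0))        # option 1
print(time_it(get_all_frames_in_order))        # option 3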
Conclusions so far... if you're only interested in retrieving some of the frames from a video, it's well worth running through the video in order, grabbing every frame but only retrieving the ones you want, rather than seeking and reading (which grabs and retrieves in one go). That gave me an almost 3x speedup.
I've also re-looked at multi-threading on this basis. I've got two test functions - one that gets the frames, and another that processes them once they're available:
frames = {}

def get_all_frames_in_order():
    # Same as option 3 above: grab every frame, retrieve only one per second
    prev_time = -1
    while True:
        grabbed = video_capture.grab()
        if grabbed:
            time_s = video_capture.get(cv2.CAP_PROP_POS_MSEC) / 1000
            if int(time_s) > int(prev_time):
                # Only retrieve and save the first frame in each new second
                _, frames[int(time_s)] = video_capture.retrieve()
            prev_time = time_s
        else:
            break
from time import sleep

def process_all_frames_as_available(processing_time):
    # Frame keys are whole seconds, so step through them one second at a time
    prev_time = 0
    while True:
        this_time = prev_time + 1
        if this_time in frames and prev_time in frames:
            # Dummy processing - just sleeps for the specified time
            sleep(processing_time)
            prev_time = this_time
            if prev_time + 1 > video_duration:  # video_duration in whole seconds
                break
        else:
            # If the frames aren't ready yet, wait a short time before trying again
            sleep(0.02)
For this testing, I then called them either one after the other (sequentially, single-threaded), or with the following multi-threaded code:
get_frames_thread = Thread(target=get_all_frames_in_order)
get_frames_thread.start()
process_frames_thread = Thread(target=process_all_frames_as_available, args=(0.02,))
process_frames_thread.start()
get_frames_thread.join()
process_frames_thread.join()
Based on that, I'm now happy that multi-threading is working effectively and saving a significant amount of time. I generated timings for the two functions above separately, and then together in both single-threaded and multi-threaded modes. The results are below (the 'process time' is the per-frame dummy delay in seconds):
get_all_frames_in_order alone - 2.99s

Process time    process_all_frames_as_available alone    single-threaded    multi-threaded
0.02s/frame     0.97s                                    3.99s              3.28s
0.1s/frame      4.31s                                    7.35s              4.46s
0.2s/frame      8.52s                                    11.58s             8.62s
As you can hopefully see, the multi-threading results are very good. Essentially, it takes just ~0.2s longer to do both functions in parallel than the slower of the two functions running entirely separately.
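Put another way, the parallel run costs only a small overhead on top of the slower of the two stages - a quick check against the numbers in the table above:

# Overhead of running both threads together vs. the slower stage alone
for grab, process, multi in [(2.99, 0.97, 3.28), (2.99, 4.31, 4.46), (2.99, 8.52, 8.62)]:
    print(f"overhead = {multi - max(grab, process):.2f}s")   # 0.29s, 0.15s, 0.10s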
Hope that helps someone!
Coincidentally, I've worked on a similar problem, and I have created a Python library (more of a thin wrapper) for reading videos. The library is called mydia.
The library does not use OpenCV. It uses FFmpeg as the backend for reading and processing videos.
mydia supports custom frame selection, frame resizing, grayscale conversion and much more. The documentation can be viewed here.
So, if you want to select N frames per second (where N = 1 in your case), the following code would do it:
import numpy as np
from mydia import Videos

video_path = "path/to/video"

def select_frames(total_frames, num_frames, fps, *args):
    """This function will return the indices of the frames to be captured"""
    N = 1
    t = np.arange(total_frames)
    f = np.arange(num_frames)
    mask = np.resize(f, total_frames)
    return t[mask < N][:num_frames].tolist()

# Let's assume that the duration of your video is 120 seconds
# and you want 1 frame for each second
# (therefore, setting `num_frames` to 120)
reader = Videos(num_frames=120, mode=select_frames)
video = reader.read(video_path)  # A video tensor/array
The best part is that internally, only those frames that are required are read, and therefore the process is much faster (which is what I believe you are looking for).
The installation of mydia is extremely simple and can be viewed here.
This might have a slight learning curve, but I believe that it is exactly what you are looking for.
Moreover, if you have multiple videos, you could use multiple workers for reading them in parallel. For instance:
from mydia import Videos
path = "path/to/video"
reader = Videos()
video = reader.read(path, workers=4)
Depending on your CPU, this could give you a significant speed-up.
Hope this helps!