I get a lot of jitter when using OpenPose to extract pose data from video, and the resulting motion does not look natural or human.
The data I get from the OpenPose model is what I have to work with, and I can't improve the quality of the model. The entire video clip is processed and the 15 anatomical key points are stored in a database. I'd like to use some signal processing to smooth out this data. How do I get rid of this jitter?
Use a Savitzky-Golay (Savgol) filter to smooth the data.
Gif on Imgur
Video showing different levels of smoothing.
There is jitter in the data because each video frame is processed independently. The OpenPose model is good, but its errors are not consistent from one frame to the next: it tends to be wrong in random ways, which causes the positions of the body parts to bounce around the true value.
Fortunately, in my data these errors appear to be distributed roughly normally around the true value. A Savgol filter fits a low-order polynomial to a sliding window of samples, so it can average out this kind of noise and recover values close to the true positions.
The first step in smoothing is to collect pose data for the entire video and store it in a .csv file (save_pose_data.py).
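As a rough illustration of this step, here is one possible .csv layout: one row per frame, with an x and y column for each of the 15 key points. The function name `write_pose_csv`, the column names, and the `frames` input are assumptions made for this sketch; this is not the contents of save_pose_data.py, and the OpenPose call that produces the key points is left out.

```python
import csv
import io

NUM_KEYPOINTS = 15  # number of key points per frame, as in the question


def write_pose_csv(file_obj, frames):
    """Write one row per frame: frame index, then x0, y0, x1, y1, ...

    `frames` is an iterable of lists holding NUM_KEYPOINTS (x, y) pairs.
    The layout is an assumption for this sketch, not a standard format.
    """
    writer = csv.writer(file_obj)
    header = ["frame"]
    for k in range(NUM_KEYPOINTS):
        header += [f"x{k}", f"y{k}"]
    writer.writerow(header)
    for index, keypoints in enumerate(frames):
        row = [index]
        for x, y in keypoints:
            row += [x, y]
        writer.writerow(row)


# Demo with two made-up frames, written to an in-memory buffer.
buffer = io.StringIO()
write_pose_csv(buffer, [[(1.0, 2.0)] * NUM_KEYPOINTS,
                        [(3.0, 4.0)] * NUM_KEYPOINTS])
rows = buffer.getvalue().splitlines()
```

To write a real file, pass `open("pose_data.csv", "w", newline="")` instead of the in-memory buffer (the file name is hypothetical).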
Sometimes the model confuses the left and right sides of the body. In the next step, the body parts are swapped so that the left is always on the left (swap_body_parts.py).
Body part #9 is the left knee and body part #12 is the right knee. Sometimes the model mixes up the right and left knee locations. I have to ensure:
x_coord for body part #9 < x_coord for body part #12
So, if #9 is to the right of #12 (its x_coord is larger), I swap those positions.
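A minimal sketch of that check, enforcing the stated condition that the x coordinate of #9 is smaller than that of #12. It assumes each frame is a list of 15 (x, y) tuples indexed by body part number. The function name is hypothetical, and missing detections are not handled here.

```python
LEFT_KNEE, RIGHT_KNEE = 9, 12  # body part numbers used in the text above


def fix_left_right(frame, left=LEFT_KNEE, right=RIGHT_KNEE):
    """Return a copy of `frame` in which x[left] <= x[right].

    `frame` is a list of (x, y) tuples, one per body part.
    If the 'left' part lies to the right of the 'right' part,
    the two positions are swapped; otherwise nothing changes.
    """
    fixed = list(frame)
    if fixed[left][0] > fixed[right][0]:
        fixed[left], fixed[right] = fixed[right], fixed[left]
    return fixed


# Demo: a frame where the model has mixed up the knees.
frame = [(0.0, 0.0)] * 15
frame[LEFT_KNEE] = (320.0, 400.0)
frame[RIGHT_KNEE] = (280.0, 405.0)
fixed = fix_left_right(frame)
```

The same function could be applied to other left/right pairs (elbows, ankles) by passing their body part numbers as `left` and `right`.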
Finally, it's time to apply the smoothing. For 60 fps video, I have found that values between 9 and 31 work well for the window_length parameter (lower values smooth less, higher values smooth more) (smooth_with_savgol.py).
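The smoothing step could look roughly like the sketch below, which uses `scipy.signal.savgol_filter` along the time axis. The array shape (frames, key points, 2), the function name `smooth_keypoints`, and `polyorder=3` are assumptions for this example, and the data is synthetic. Note that `window_length` must be larger than `polyorder`, and with the default `mode="interp"` it cannot exceed the number of frames.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_keypoints(coords, window_length=21, polyorder=3):
    """Smooth pose data of shape (frames, keypoints, 2) over time.

    Each x and y track is filtered independently along axis 0.
    polyorder=3 is an assumed value, not taken from the answer.
    """
    coords = np.asarray(coords, dtype=float)
    return savgol_filter(coords, window_length, polyorder, axis=0)


# Synthetic demo: a smooth path shared by 15 key points, plus jitter.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 300)
path = np.stack([200 + 50 * np.sin(t), 300 + 30 * np.cos(t)], axis=-1)
clean = np.repeat(path[:, None, :], 15, axis=1)        # (300, 15, 2)
noisy = clean + rng.normal(scale=4.0, size=clean.shape)

smoothed = smooth_keypoints(noisy, window_length=21)

noisy_error = np.abs(noisy - clean).mean()
smoothed_error = np.abs(smoothed - clean).mean()
```

On this synthetic data the smoothed track ends up closer to the clean path than the noisy one. On real footage the best `window_length` depends on the frame rate and on how fast the subject moves, so it is worth trying a few values in the range given above.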
In this gif, the right elbow (green) has been smoothed, and the left elbow (pink) has not.
Generally, smooth results come from tracking, not detection. Detect the pose in the first frame of the video and then track the key points with optical flow. Tracking is also roughly 100x faster than detection (optical flow vs. OpenPose).
This smoothing method is only suitable for post-production, because the pose data from all the frames must be known before the smoothing algorithm is applied. Extracting the pose data for this .gif took my computer several minutes.