
Human pose estimation - efficient linking of body parts

Problem description: I am working on a project whose goal is to identify people's body parts in images (torso, head, left and right arms, etc.). The approach is based on finding candidate body parts (hypotheses) and then searching for the best pose configuration, that is, the set of parts that together really form a human body. The idea is described in more detail at http://www.di.ens.fr/willow/events/cvml2010/materials/INRIA_summer_school_2010_Andrew_human_pose.pdf.

The hypotheses are obtained by running a detection algorithm (here, a machine-learning classifier) for each body part separately, so the type of each hypothesis is known. Each hypothesis also has a location (x and y coordinates in the image) and an orientation.

To determine the cost of linking two parts, one could treat every hypothesis of type head as a candidate for linking with every hypothesis of type torso (for example). But a head hypothesis in the top-right corner of the image cannot plausibly be linked with a torso hypothesis in the bottom-left corner. I want to avoid evaluating such links, both because they make no sense for a human body and because of the execution time.

Question: I am planning to reduce the search space by setting a maximum distance beyond which a hypothesis is no longer considered a linking candidate. What is the fastest way to solve this search problem?

Raluca Pandaru asked Apr 08 '13 19:04




2 Answers

For similar problems I have resorted to splitting the source image into 16 (or more, depending on the relative size of the parts you're trying to link) smaller images and doing the detection and linking steps in each of these separately. An extra step then does only the linking for each subimage together with its (up to 8) neighbours.

This way you never even try to link a part in the upper-left corner with one in the lower-right corner, and as an added bonus the first part of your problem becomes highly parallel.
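A minimal sketch of a closely related idea, applied to the hypothesis coordinates rather than to the image pixels: bucket the torso hypotheses into a uniform grid whose cell size equals the maximum linking distance, then compare each head only against torsos in its own cell and the 8 neighbouring cells. The function names, the `(x, y)` tuple representation, and the `max_dist` parameter are assumptions made for illustration; they are not from the question.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Map each grid cell (col, row) to the indices of the points inside it."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(int(x // cell), int(y // cell))].append(idx)
    return grid


def candidate_pairs(heads, torsos, max_dist):
    """Return (head_index, torso_index) pairs no farther apart than max_dist.

    With cell size == max_dist, any torso within max_dist of a head must lie
    in the head's cell or one of its 8 neighbours, so only those are scanned.
    """
    grid = build_grid(torsos, max_dist)
    pairs = []
    for hi, (hx, hy) in enumerate(heads):
        cx, cy = int(hx // max_dist), int(hy // max_dist)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for ti in grid.get((cx + dx, cy + dy), ()):
                    tx, ty = torsos[ti]
                    if math.hypot(hx - tx, hy - ty) <= max_dist:
                        pairs.append((hi, ti))
    return pairs


# Hypothetical detections: (x, y) image coordinates.
heads = [(10, 10), (200, 20), (400, 400)]
torsos = [(12, 40), (205, 55), (50, 300)]
pairs = candidate_pairs(heads, torsos, max_dist=60)
```

Building the grid is linear in the number of hypotheses, and each head then looks at a handful of cells instead of at every torso, so far-apart pairs are never evaluated at all.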

Update: You could run edge detection on the image first and never split the image where that would cut an edge in two. Doing this recursively gives you many small images containing body parts, which you can then process separately.

Jens Timmerman answered Nov 07 '22 12:11


This kind of discrete assignment problem can be solved using the Hungarian algorithm.

When computing the cost (= distance) matrix, you can set an entry to infinity or to some very high value when the distance is greater than a predefined threshold. This prevents the algorithm from assigning a head to a torso that is too far away.

In the tracking literature, this technique is called gating.
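A minimal sketch of gating combined with the Hungarian algorithm, in pure Python. It assumes hypotheses are `(x, y)` tuples, that the cost is plain Euclidean distance (the real cost would presumably also use orientation), and that there are at most as many heads as torsos. `BIG`, `gated_costs`, `hungarian`, and `link_heads_to_torsos` are names invented for this example. A large finite value stands in for "infinite" so that the arithmetic inside the solver stays well defined.

```python
import math

BIG = 1e6  # gate cost: finite stand-in for "infinitely expensive"


def gated_costs(heads, torsos, max_dist):
    """Euclidean distance matrix; pairs farther apart than max_dist cost BIG."""
    return [[d if d <= max_dist else BIG
             for d in (math.hypot(hx - tx, hy - ty) for tx, ty in torsos)]
            for hx, hy in heads]


def hungarian(cost):
    """Minimum-cost assignment for an n x m matrix with n <= m.

    Returns a list whose i-th entry is the column assigned to row i.
    Standard O(n^2 * m) potentials formulation, 1-based internally.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j]: row matched to column j (0 = unmatched)
    way = [0] * (m + 1)   # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    col_of_row = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            col_of_row[p[j] - 1] = j - 1
    return col_of_row


def link_heads_to_torsos(heads, torsos, max_dist):
    """Assign heads to torsos, then drop any link that only exists via BIG."""
    costs = gated_costs(heads, torsos, max_dist)
    assignment = hungarian(costs)
    return [(i, j) for i, j in enumerate(assignment) if costs[i][j] < BIG]


# Hypothetical detections: (x, y) image coordinates.
heads = [(10, 10), (200, 20), (400, 400)]
torsos = [(12, 40), (205, 55), (50, 300)]
links = link_heads_to_torsos(heads, torsos, max_dist=60)
```

The solver always produces a complete assignment, so a head with no torso inside the gate still gets paired with something at cost `BIG`; the final filter removes those forced pairs. If SciPy is available, `scipy.optimize.linear_sum_assignment` solves the same assignment problem and could replace the hand-written `hungarian` function.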

sansuiso answered Nov 07 '22 11:11