I'm trying to get a list with landmark coordinates with MediaPipe's Face Mesh. For example: Landmark[6]: (0.36116672, 0.93204623, 0.0019629495)
I can't find a way to do that and would appreciate the help.
MediaPipe has a more complex interface than most publicly available models, but what you're looking for is still easy to achieve.
import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_face_mesh = mp.solutions.face_mesh
file_list = ['test.png']

# For static images:
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
with mp_face_mesh.FaceMesh(
        static_image_mode=True,
        min_detection_confidence=0.5) as face_mesh:
    for idx, file in enumerate(file_list):
        image = cv2.imread(file)
        # Convert the BGR image to RGB before processing.
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

        # Print and draw face mesh landmarks on the image.
        if not results.multi_face_landmarks:
            continue
        annotated_image = image.copy()
        for face_landmarks in results.multi_face_landmarks:
            print('face_landmarks:', face_landmarks)
            # Note: newer MediaPipe releases replace FACE_CONNECTIONS with
            # FACEMESH_TESSELATION / FACEMESH_CONTOURS.
            mp_drawing.draw_landmarks(
                image=annotated_image,
                landmark_list=face_landmarks,
                connections=mp_face_mesh.FACE_CONNECTIONS,
                landmark_drawing_spec=drawing_spec,
                connection_drawing_spec=drawing_spec)
In this example, which is taken from here, you can see that they're iterating through results.multi_face_landmarks:
for face_landmarks in results.multi_face_landmarks:
Each element here holds the landmarks of one face detected in the image, so the length of results.multi_face_landmarks is the number of faces detected.
When you list the attributes of, say, the first face, you'll see 'landmark' as the last attribute.
dir(results.multi_face_landmarks[0])
>> ..., 'landmark']
We need the landmark attribute to get to the coordinates, one step further down.
Its length is 468 with the default settings (478 when FaceMesh is created with refine_landmarks=True), which is the number of [x, y, z] keypoints the model predicts.
If we take first keypoint:
results.multi_face_landmarks[0].landmark[0]
it will give us normalized [x,y,z] values:
x: 0.25341567397117615
y: 0.71121746301651
z: -0.03244325891137123
Finally, x, y and z are attributes of each keypoint. You can check that by calling dir() on a keypoint.
Now you can easily read the normalized coordinates:
results.multi_face_landmarks[0].landmark[0].x -> X coordinate
results.multi_face_landmarks[0].landmark[0].y -> Y coordinate
results.multi_face_landmarks[0].landmark[0].z -> Z coordinate
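To get the list the question asks for, the attribute access above can be wrapped in a small helper. This is a minimal sketch: landmarks_to_list and format_landmark are hypothetical names, not MediaPipe functions, and fake_face is a stand-in for results.multi_face_landmarks[0] so the snippet runs without MediaPipe installed.

```python
from types import SimpleNamespace


def landmarks_to_list(face_landmarks):
    """Return [(x, y, z), ...] for one face; the list index is the landmark index."""
    return [(lm.x, lm.y, lm.z) for lm in face_landmarks.landmark]


def format_landmark(index, coords):
    """Format one entry in the style used in the question."""
    x, y, z = coords
    return f"Landmark[{index}]: ({x}, {y}, {z})"


# Stand-in (assumption) for results.multi_face_landmarks[0]:
# any object with a .landmark sequence whose items have .x, .y and .z.
fake_face = SimpleNamespace(landmark=[
    SimpleNamespace(x=0.25, y=0.71, z=-0.03),
    SimpleNamespace(x=0.36, y=0.93, z=0.002),
])

coords = landmarks_to_list(fake_face)
for i, c in enumerate(coords):
    print(format_landmark(i, c))
```

With real output you would call landmarks_to_list(results.multi_face_landmarks[0]) after checking that results.multi_face_landmarks is not empty.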
To convert the normalized coordinates to pixel coordinates, multiply x by the image width and y by the image height.
Sample code:
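# cv2_imshow is the Google Colab replacement for cv2.imshow (assuming Colab):
# from google.colab.patches import cv2_imshow
height, width = image.shape[:2]
for face in results.multi_face_landmarks:
    for landmark in face.landmark:
        # Scale the normalized coordinates to pixel coordinates.
        relative_x = int(landmark.x * width)
        relative_y = int(landmark.y * height)
        cv2.circle(image, (relative_x, relative_y), radius=1, color=(225, 0, 100), thickness=1)
cv2_imshow(image)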
This draws a small dot at every detected landmark on the image.
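The width/height scaling can also be kept in a standalone helper. The sketch below uses a hypothetical to_pixel function (not part of MediaPipe) and additionally clamps the result to the image bounds, on the assumption that a landmark can land slightly outside the [0, 1] range when part of the face is out of frame.

```python
def to_pixel(x, y, width, height):
    """Map normalized (x, y) to integer pixel (column, row), clamped to the image."""
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


# Example with a 640x480 image (width=640, height=480).
print(to_pixel(0.5, 0.5, 640, 480))
print(to_pixel(1.0, 1.0, 640, 480))
```

With an OpenCV image, the size comes from height, width = image.shape[:2], and the returned tuple can be passed directly as the center argument of cv2.circle.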
Here is a full explanation -
Face Mesh MediaPipe
import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_face_mesh = mp.solutions.face_mesh

# For static images:
file_list = ['test.png']
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
with mp_face_mesh.FaceMesh(
        static_image_mode=True,
        max_num_faces=1,
        min_detection_confidence=0.5) as face_mesh:
    for idx, file in enumerate(file_list):
        image = cv2.imread(file)
        # Convert the BGR image to RGB before processing.
        results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

        # Print and draw face mesh landmarks on the image.
        if not results.multi_face_landmarks:
            continue
        annotated_image = image.copy()
        for face_landmarks in results.multi_face_landmarks:
            print('face_landmarks:', face_landmarks)