
What is the loss function of the Mask RCNN?

The paper clearly states that the classification and regression losses are identical to those of the RPN in Faster R-CNN. Can someone explain the mask loss function? How do they use the FCN to improve it?

Shamane Siriwardhana asked Dec 05 '22 13:12



1 Answer

FCN uses a per-pixel softmax and a multinomial cross-entropy loss. This means that the mask prediction task (the boundaries of the object) and the class prediction task (what object is being masked) are coupled.
Mask R-CNN decouples these tasks: the existing bounding-box prediction head (AKA the localization task) predicts the class, as in Faster R-CNN, and the mask branch generates a binary mask for each class, without competition among classes (e.g. if you have 21 classes, the mask branch predicts 21 independent binary masks instead of FCN's single mask with 21 mutually exclusive channels). The loss used is a per-pixel sigmoid with an average binary cross-entropy loss, computed only on the mask of the ground-truth class.
Bottom line: it's sigmoid in Mask R-CNN vs. softmax in FCN.
(See Table 2b in the ablation section of the Mask R-CNN paper.)
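
To make the difference concrete, here is a minimal pure-Python sketch of the two losses. It is only an illustration under stated assumptions, not the paper's implementation: the function names, the flattened `logits[class][pixel]` layout, and the toy numbers are all made up for this example.

```python
import math

def bce_with_logits(z, y):
    # Numerically stable binary cross-entropy of sigmoid(z) against target y in {0, 1}.
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

def mask_loss_sigmoid(logits, gt_mask, gt_class):
    # Mask R-CNN style (sketch): per-pixel sigmoid + average binary cross-entropy,
    # evaluated only on the channel of the ground-truth class.
    # logits[k][i] is the mask logit of class k at pixel i (hypothetical layout).
    channel = logits[gt_class]
    return sum(bce_with_logits(z, y) for z, y in zip(channel, gt_mask)) / len(gt_mask)

def mask_loss_softmax(logits, gt_labels):
    # FCN style (sketch): per-pixel softmax over classes + multinomial cross-entropy.
    # gt_labels[i] is the class index of pixel i.
    total = 0.0
    for i, label in enumerate(gt_labels):
        column = [row[i] for row in logits]
        m = max(column)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(z - m) for z in column))
        total += log_sum - column[label]
    return total / len(gt_labels)

# Toy example: 3 classes, a 2x2 mask flattened to 4 pixels.
logits = [
    [2.0, -1.0, 0.5, -3.0],   # class 0
    [0.0, 0.0, 0.0, 0.0],     # class 1
    [-2.0, 1.0, 3.0, 0.5],    # class 2
]
gt_mask = [1, 0, 1, 0]        # binary mask for the ground-truth class 0
gt_labels = [0, 1, 0, 1]      # per-pixel class labels for the softmax variant

# Change only the logits of an unrelated class (class 2).
logits_changed = [row[:] for row in logits]
logits_changed[2] = [5.0, 5.0, 5.0, 5.0]

sigmoid_before = mask_loss_sigmoid(logits, gt_mask, gt_class=0)
sigmoid_after = mask_loss_sigmoid(logits_changed, gt_mask, gt_class=0)
softmax_before = mask_loss_softmax(logits, gt_labels)
softmax_after = mask_loss_softmax(logits_changed, gt_labels)

print("sigmoid loss unchanged:", sigmoid_before == sigmoid_after)
print("softmax loss changed:  ", softmax_before != softmax_after)
```

With the sigmoid loss, changing the logits of another class leaves the loss for the ground-truth class untouched, which is what "no competition among classes" means here. With the softmax loss, every class channel enters the normalization at each pixel, so the same change shifts the loss.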

rkellerm answered May 13 '23 05:05
