I am working on a multiclass classification task (4 classes) for a language problem, and I am using a BERT model for the classification. I am following this blog as reference. My fine-tuned BERT model returns nn.LogSoftmax(dim=1) as its output.
My data is pretty imbalanced, so I used sklearn.utils.class_weight.compute_class_weight to compute the class weights and passed them to the loss:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

class_weights = compute_class_weight('balanced', classes=np.unique(train_labels), y=train_labels)
weights = torch.tensor(class_weights, dtype=torch.float)
cross_entropy = nn.NLLLoss(weight=weights)
My results were not so good, so I thought of experimenting with Focal Loss. Here is the Focal Loss code I have:
class FocalLoss(nn.Module):
    def __init__(self, alpha=1, gamma=2, logits=False, reduce=True):
        super(FocalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma
        self.logits = logits
        self.reduce = reduce

    def forward(self, inputs, targets):
        BCE_loss = nn.CrossEntropyLoss()(inputs, targets)
        pt = torch.exp(-BCE_loss)
        F_loss = self.alpha * (1-pt)**self.gamma * BCE_loss

        if self.reduce:
            return torch.mean(F_loss)
        else:
            return F_loss
I have 3 questions now. First and most important: with Focal Loss, should I still use class weights? That is, can I pass the weight parameter to the nn.CrossEntropyLoss() used inside it?
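To make the question concrete, the hypothetical change I am asking about would look roughly like this. It is a sketch only, reusing the weights tensor computed above; I have not verified that this is the right thing to do, and it assumes the inputs are raw logits rather than LogSoftmax outputs:

import torch
import torch.nn as nn

class WeightedFocalLoss(nn.Module):
    """Sketch only: the FocalLoss above, but with class weights passed to the inner CE."""
    def __init__(self, weights, gamma=2):
        super().__init__()
        # 'weights' is the tensor built with compute_class_weight earlier
        self.ce = nn.CrossEntropyLoss(weight=weights, reduction='none')
        self.gamma = gamma

    def forward(self, inputs, targets):
        ce_loss = self.ce(inputs, targets)   # per-sample, class-weighted cross-entropy
        # note: with non-unit weights, exp(-ce_loss) is no longer exactly the
        # true-class probability, which is part of what this question is about
        pt = torch.exp(-ce_loss)
        return ((1 - pt) ** self.gamma * ce_loss).mean()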
For binary classification, Focal Loss can be interpreted as a binary cross-entropy function multiplied by a modulating factor (1 − pₜ)^γ, which reduces the contribution of easy-to-classify samples. The weighting factor αₜ balances the modulating factor.
Focal loss was proposed to address imbalanced classes by reshaping and modifying the standard cross-entropy loss to obtain better classification. Focal loss has been applied to this imbalanced-data issue in computer vision and achieves excellent performance.
In simple words, Focal Loss (FL) is an improved version of Cross-Entropy Loss (CE) that tries to handle the class imbalance problem by assigning more weight to hard or easily misclassified examples (e.g. background with noisy texture, a partial object, or the object of interest) and to down-weight easy examples.
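For reference, the full focal loss from the original paper (Lin et al., 2017) for a single example is FL(pₜ) = −αₜ·(1 − pₜ)^γ·log(pₜ), where pₜ is the predicted probability of the true class; with γ = 0 and αₜ = 1 it reduces to the standard cross-entropy.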
Focal loss automatically handles the class imbalance, so class weights are not required for the focal loss: the alpha and gamma modulating factors in the focal loss equation take care of it, and no extra weights are needed.
The weighted cross-entropy loss for one data point is loss = −w₁·y·log(p) − w₀·(1 − y)·log(1 − p), where y = 1, 0 for positive and negative labels, p is the probability of the positive class, and w₁ and w₀ are the class weights for the positive and negative class. For a minibatch, the PyTorch and TensorFlow implementations differ by a normalization: PyTorch's nn.CrossEntropyLoss with reduction='mean' divides the weighted sum by the sum of the target-class weights in the batch, not by the batch size.
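A minimal sketch to check that normalization behaviour in PyTorch (all numbers here are made up for illustration):

import torch
import torch.nn as nn

logits = torch.randn(6, 4)                 # 6 samples, 4 classes (dummy data)
labels = torch.randint(0, 4, (6,))
w = torch.tensor([0.5, 1.0, 2.0, 4.0])     # per-class weights

mean_loss = nn.CrossEntropyLoss(weight=w)(logits, labels)

# PyTorch divides the weighted sum by the sum of the target-class weights,
# not by the batch size:
per_sample = nn.CrossEntropyLoss(weight=w, reduction='none')(logits, labels)
manual = per_sample.sum() / w[labels].sum()
print(torch.allclose(mean_loss, manual))   # True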
Thanks to @Muppet, we can also use class over-sampling, which is equivalent to using class weights. This is accomplished with WeightedRandomSampler in PyTorch, using the same aforementioned weights.
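A minimal sketch of that over-sampling route, assuming train_labels holds integer class indices 0..3 (so class_weights can be indexed by label) and that train_dataset is the existing Dataset of encoded examples (both assumptions on my part):

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# one weight per sample: the weight of that sample's class
sample_weights = torch.tensor(class_weights[train_labels], dtype=torch.double)
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(sample_weights),
                                replacement=True)
train_loader = DataLoader(train_dataset, batch_size=32, sampler=sampler)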
You may find answers to your questions in the following implementation (this looks like the Python sigmoid focal loss from mmdetection):

import torch.nn.functional as F
# weight_reduce_loss is mmdetection's reduction helper; loss.mean() works as a standalone substitute
from mmdet.models.losses.utils import weight_reduce_loss

def py_sigmoid_focal_loss(pred, target, weight=None, gamma=2.0,
                          alpha=0.25, reduction='mean', avg_factor=None):
    pred_sigmoid = pred.sigmoid()
    target = target.type_as(pred)
    # pt is the probability mass on the wrong side, so pt**gamma down-weights easy examples
    pt = (1 - pred_sigmoid) * target + pred_sigmoid * (1 - target)
    focal_weight = (alpha * target + (1 - alpha) *
                    (1 - target)) * pt.pow(gamma)
    loss = F.binary_cross_entropy_with_logits(
        pred, target, reduction='none') * focal_weight
    loss = weight_reduce_loss(loss, weight, reduction, avg_factor)
    return loss
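A hypothetical call for the 4-class case; this assumes mmdetection is installed for the weight_reduce_loss helper, that the model emits raw logits rather than LogSoftmax outputs, and that the labels are integer class indices (none of this comes from the original post):

import torch
import torch.nn.functional as F

logits = torch.randn(8, 4)                          # batch of 8, 4 classes (dummy data)
labels = torch.randint(0, 4, (8,))                  # dummy integer class labels
one_hot = F.one_hot(labels, num_classes=4).float()  # sigmoid focal loss expects one-hot targets
loss = py_sigmoid_focal_loss(logits, one_hot, gamma=2.0, alpha=0.25)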
You can also experiment with other focal loss implementations that are available online.
I think the OP will have gotten their answer by now; I am writing this for other people who might stumble upon this question.
There is one problem in the OP's implementation of Focal Loss:
F_loss = self.alpha * (1-pt)**self.gamma * BCE_loss
In this line, the same alpha value is multiplied with every class output probability, i.e. pt. Additionally, the code doesn't show how pt should be obtained per class: here pt = torch.exp(-BCE_loss) is computed from the already-reduced loss, so it is a single scalar for the whole batch. A very good implementation of Focal Loss can be found here, but that implementation is only for binary classification, since it only has alpha and 1 - alpha for the two classes in the self.alpha tensor.
In the case of multi-class or multi-label classification, the self.alpha tensor should contain a number of elements equal to the total number of labels. The values could be the inverse label frequency or the inverse normalized label frequency (just be cautious with labels that have a frequency of 0).