 

Efficient metrics evaluation in PyTorch

I am new to PyTorch and want to efficiently evaluate metrics such as F1 during my training and validation loops.

So far, my approach has been to calculate the predictions on the GPU, then push them to the CPU and append them to a vector, for both training and validation. After training and validation, I evaluate both for each epoch using sklearn. However, profiling my code showed that pushing to the CPU is quite a bottleneck.

for epoch in range(n_epochs):
    model.train()
    avg_loss = 0
    avg_val_loss = 0
    train_pred = np.array([])
    val_pred = np.array([])
    # Training loop (transpose X_batch to fit pretrained (features, samples) style)
    for X_batch, y_batch in train_loader:
        scores = model(X_batch)
        y_pred = F.softmax(scores, dim=1)
        train_pred = np.append(train_pred, self.get_vector(y_pred.detach().cpu().numpy()))

        loss = loss_fn(scores, self.get_vector(y_batch))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        avg_loss += loss.item() / len(train_loader)

    model.eval()
    # Validation loop
    for X_batch, y_batch in val_loader:
        with torch.no_grad():
            scores = model(X_batch)
            y_pred = F.softmax(scores, dim=1)
            val_pred = np.append(val_pred, self.get_vector(y_pred.detach().cpu().numpy()))
            loss = loss_fn(scores, self.get_vector(y_batch))
            avg_val_loss += loss.item() / len(val_loader)

    # Model Checkpoint for best validation f1
    val_f1 = self.calculate_metrics(train_targets[val_index], val_pred, f1_only=True)
    if val_f1 > best_val_f1:
        prev_best_val_f1 = best_val_f1
        best_val_f1 = val_f1
        torch.save(model.state_dict(), self.PATHS['xlm'])
        evaluated_epoch = epoch

    # Calc the metrics
    self.save_metrics(train_targets[train_index], train_pred, avg_loss, 'train')
    self.save_metrics(train_targets[val_index], val_pred, avg_val_loss, 'val')

I am certain there is a more efficient way to a) store the predictions without having to push them to the CPU each batch, and b) calculate the metrics on the GPU directly.

As I am new to PyTorch, I am very grateful for any hints and feedback :)

asked Jun 18 '19 by JimmysCheeseSteak



1 Answer

You can compute the F-score yourself in PyTorch. The F1-score is defined for single-class (true/false) classification only, so for a multi-class problem you compute it per class. The only things you need to aggregate are three counts:

  • the number of times the class occurs in the ground-truth targets;
  • the number of times the class occurs in the predictions;
  • the number of times the class was correctly predicted.

Let's assume you want to compute the F1 score for the class with index 0 in your softmax output. In every batch, you can do:

predicted_classes = torch.argmax(y_pred, dim=1) == 0  # boolean mask: predicted as class 0
target_classes = self.get_vector(y_batch)             # integer class labels
target_true += torch.sum(target_classes == 0).float()   # class 0 in the targets
predicted_true += torch.sum(predicted_classes).float()  # class 0 in the predictions
correct_true += torch.sum(
    predicted_classes & (target_classes == 0)).float()  # correctly predicted as class 0
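For this snippet to run, the three counters need to be initialized before the loop. A minimal sketch, assuming you keep them as scalar tensors on the same device as the model (the device variable here is an assumption, not part of the original code):

# Initialize the accumulators once per epoch, on the same device as the model,
# so no CPU transfer happens during the loop (device is assumed to be defined)
target_true = torch.tensor(0.0, device=device)
predicted_true = torch.tensor(0.0, device=device)
correct_true = torch.tensor(0.0, device=device)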

When all batches are processed:

recall = correct_true / target_true
precision = correct_true / predicted_true
f1_score = 2 * precision * recall / (precision + recall)

Don't forget to take care of the cases when precision and recall are zero, and when the desired class was not predicted at all; one way to guard against them is sketched below.
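A sketch of those guards; the eps constant and the final .item() call are illustrative choices, not part of the original answer:

eps = 1e-8  # avoids division by zero when a count is zero (illustrative choice)
recall = correct_true / (target_true + eps)
precision = correct_true / (predicted_true + eps)
f1_score = (2 * precision * recall / (precision + recall + eps)).item()

Since everything stays on the GPU until the final .item() call, this also addresses the original bottleneck: only a single scalar is moved to the CPU per epoch instead of every batch of predictions.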

answered Oct 19 '22 by Jindřich