Save only best weights with huggingface transformers

I'm building a new transformer-based model with huggingface-transformers in which the attention layer differs from the original one. I used run_glue.py to check my model's performance on the GLUE benchmark. However, I found that the Trainer class saves every checkpoint at the interval I set, and the only control I have is the maximum number of checkpoints to keep. What I want is to save only the weights (and other state, such as the optimizer) with the best performance on the validation dataset, and the current Trainer class doesn't seem to provide that. (If we set the maximum number of checkpoints, it removes the oldest checkpoints, not the ones with worse performance.) Someone already asked the same question on GitHub, but I can't figure out how to modify the script to do what I want. For now I'm thinking about writing a custom Trainer class that inherits from the original one and overrides the train() method, but it would be great if there were a simpler way to do this. Thanks in advance.

Seewoo Lee asked Jun 23 '20 00:06



2 Answers

You could try the following TrainingArguments parameters from huggingface-transformers:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='/content/drive/results',  # output directory
    do_predict=True,
    num_train_epochs=3,                   # total number of training epochs
    per_device_train_batch_size=4,        # batch size per device during training
    per_device_eval_batch_size=2,         # batch size for evaluation
    warmup_steps=1000,                    # number of warmup steps for the learning rate
    save_steps=1000,                      # save a checkpoint every 1000 steps
    save_total_limit=10,                  # keep at most 10 checkpoints on disk
    load_best_model_at_end=True,          # reload the best checkpoint when training ends
    weight_decay=0.01,                    # strength of weight decay
    logging_dir='./logs',                 # directory for storing logs
    logging_steps=0,
    evaluate_during_training=True,        # evaluate periodically while training
)

There may be better ways to limit the number of checkpoints and select the best model. As far as I know, you cannot save only the best model, but you can check at each evaluation whether the results are better than the previous ones.
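The "check whether each evaluation is better than the previous ones" idea can be sketched in plain Python. This is only an illustration of the logic, not Trainer code: train_keep_best, the save callback and the loss values are all hypothetical names and numbers made up for the example, and lower loss is assumed to be better.

```python
from typing import Callable, Iterable, Optional


def train_keep_best(
    eval_losses: Iterable[float],
    save: Callable[[int, float], None],
) -> Optional[int]:
    """Call `save` only when the evaluation loss improves.

    `eval_losses` stands in for the loss measured after each evaluation
    round; returns the index of the best round, or None if there were none.
    """
    best_loss = float("inf")
    best_round = None
    for round_idx, loss in enumerate(eval_losses):
        if loss < best_loss:
            best_loss = loss
            best_round = round_idx
            save(round_idx, loss)  # would overwrite the single "best" checkpoint
    return best_round


# Hypothetical evaluation losses; `saved` records when a save would happen.
saved = []
best = train_keep_best(
    [0.9, 0.7, 0.8, 0.6, 0.65],
    lambda round_idx, loss: saved.append((round_idx, loss)),
)
print(best, saved)
```

In a real run the save callback would write the model (and optimizer state, if wanted) to one fixed location, so only the best weights seen so far remain on disk.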

Shaina Raza answered Sep 20 '22 10:09



I have not seen any parameter for that. However, there is a workaround.

Use the following combination:

    evaluation_strategy='steps',
    eval_steps=10,                # evaluation and saving happen every 10 steps
    save_total_limit=5,           # only the last 5 models are kept; older ones are deleted
    load_best_model_at_end=True,

When I tried the above combination, the 5 most recent models were kept in the output directory at any time, and if the best model was not among them, it was kept as well, so there were 1 + 5 models in total. You can set save_total_limit=1 so that it serves your purpose.
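The retention behaviour described above can be modelled with a few lines of plain Python. This is a simplified model of what the answer reports observing, not the library's actual rotation code; surviving_checkpoints, the step numbers and the loss values are hypothetical, and lower loss is assumed to be better by default.

```python
from typing import List, Tuple


def surviving_checkpoints(
    history: List[Tuple[int, float]],
    save_total_limit: int,
    greater_is_better: bool = False,
) -> List[int]:
    """Return the steps kept on disk: the most recent ones plus the best one.

    `history` holds (step, metric) pairs in the order they were saved.
    """
    if save_total_limit < 1:
        raise ValueError("save_total_limit must be at least 1")
    if not history:
        return []
    pick = max if greater_is_better else min
    best_step = pick(history, key=lambda item: item[1])[0]
    recent_steps = [step for step, _ in history[-save_total_limit:]]
    # The best checkpoint is protected even if it is older than the limit allows.
    return sorted(set(recent_steps) | {best_step})


# Hypothetical (step, eval loss) pairs; step 20 has the lowest loss.
history = [(10, 0.9), (20, 0.5), (30, 0.6), (40, 0.7),
           (50, 0.65), (60, 0.8), (70, 0.75), (80, 0.7)]
kept_five = surviving_checkpoints(history, save_total_limit=5)
kept_one = surviving_checkpoints(history, save_total_limit=1)
print(kept_five)
print(kept_one)
```

Under this model, save_total_limit=5 leaves six checkpoints when the best one is older than the last five, and save_total_limit=1 leaves the latest checkpoint plus the best one (or a single checkpoint when they coincide).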

Karthik Sunil answered Sep 19 '22 10:09
