Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Saving and reload huggingface fine-tuned transformer

I am trying to reload a fine-tuned DistilBertForTokenClassification model. I am using transformers 3.4.0 and pytorch version 1.6.0+cu101. After using the Trainer to train the downloaded model, I save the model with trainer.save_model() and in my trouble shooting I save in a different directory via model.save_pretrained(). I am using Google Colab and saving the model to my Google drive. After testing the model I also evaluated the model on my test getting great results, however, when I return to the notebook (or Factory restart the colab notebook) and try to reload the model, the predictions are terrible. Upon checking the directories, the config.json file is there as is the pytorch_mode.bin. Below is the full code.

from transformers import DistilBertForTokenClassification

# load the pretrained model from huggingface
#model = DistilBertForTokenClassification.from_pretrained('distilbert-base-cased', num_labels=len(uniq_labels))
model = DistilBertForTokenClassification.from_pretrained('distilbert-base-uncased', num_labels=len(uniq_labels)) 

model.to('cuda');

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir = model_dir +  'mitmovie_pt_distilbert_uncased/results',          # output directory
    #overwrite_output_dir = True,
    evaluation_strategy='epoch',
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir = model_dir +  'mitmovie_pt_distilbert_uncased/logs',            # directory for storing logs
    logging_steps=10,
    load_best_model_at_end = True
)

trainer = Trainer(
    model = model,                         # the instantiated đŸ¤— Transformers model to be trained
    args = training_args,                  # training arguments, defined above
    train_dataset = train_dataset,         # training dataset
    eval_dataset = test_dataset             # evaluation dataset
)

trainer.train()

trainer.evaluate()

model_dir = '/content/drive/My Drive/Colab Notebooks/models/'
trainer.save_model(model_dir + 'mitmovie_pt_distilbert_uncased/model')

# alternative saving method and folder
model.save_pretrained(model_dir + 'distilbert_testing')

Coming back to the notebook after restarting...

from transformers import DistilBertForTokenClassification, DistilBertConfig, AutoModelForTokenClassification

# retreive the saved model 
model = DistilBertForTokenClassification.from_pretrained(model_dir + 'mitmovie_pt_distilbert_uncased/model', 
                                                        local_files_only=True)

model.to('cuda')

Model predictions are terrible now from either directory, however, the model does work and outputs the number of classes I would expect, it appears that the actual trained weights have not been saved or are somehow not getting loaded.

like image 947
Nate Avatar asked Nov 03 '20 13:11

Nate


People also ask

How do you save trained model Huggingface?

The model is a model provided by the library (loaded with the model id string of a pretrained model). The model was saved using save_pretrained() and is reloaded by supplying the save directory.

How can we save transformer models?

First way is to store a model like you have stored torch. save(model. state_dict(), PATH) and to load the same model on a different machine or some different place then first you have to make the instance of that model and then assign that model to the model parameter like this.

Where is Huggingface model saved?

On Linux, it is at ~/. cache/huggingface/transformers.

How do I load a Huggingface model?

The best way to load the tokenizers and models is to use Huggingface's autoloader class. Meaning that we do not need to import different classes for each architecture (like we did in the previous post), we only need to pass the model's name, and Huggingface takes care of everything for you.


Video Answer


1 Answers

Do you tried loading the by the trainer saved model in the folder:

mitmovie_pt_distilbert_uncased/results

The Huggingface trainer saves the model directly to the defined output_dir.

like image 71
André Soblechero Avatar answered Oct 08 '22 14:10

André Soblechero