I want to save all the files of my trained model after fine-tuning, like this, in one folder:
config.json
added_tokens.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
I could only save pytorch_model.bin; I could not work out how to save the rest. How can I save the config, the tokenizer, and the other files of my model?
I used
tokenizer.save_pretrained('results/tokenizer/')
but an error appears:
AttributeError: 'BertTokenizer' object has no attribute 'save_pretrained'
I saved the binary model file with the following code:
torch.save(model_to_save.state_dict(), output_model_file)
but I could not use it to save the tokenizer or the config file, because I don't know which file extension the tokenizer should be saved with, and I could not access the config. Is there any way to save all the details of my model? Thanks in advance.
I don't know how you defined the tokenizer and what you assigned the "tokenizer" variable to, but this can be a solution to your problem:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(...)
tokenizer.save_pretrained('results/')
This saves everything about the tokenizer. To save the model weights and configuration as well, run:
[... your fine-tuning code]
your_model.save_pretrained('results/')
Saved files in the output are:
config.json
added_tokens.json
special_tokens_map.json
tokenizer_config.json
vocab.txt
pytorch_model.bin
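To check the round trip end to end, here is a minimal sketch that saves a model and a tokenizer into the same folder and loads them back. The tiny BertConfig, the seven-token toy vocabulary, and the temporary directory are assumptions made only so the example runs without downloading a checkpoint; in practice you would call save_pretrained on your own fine-tuned model and tokenizer. Depending on the installed transformers version, the weights may be written as model.safetensors rather than pytorch_model.bin, and added_tokens.json typically appears only if tokens were added.

```python
import os
import tempfile

from transformers import BertConfig, BertForSequenceClassification, BertTokenizer

# Hypothetical toy setup (not from the original post): a tiny vocabulary and
# a tiny randomly initialised model, so nothing has to be downloaded.
work_dir = tempfile.mkdtemp()
vocab_path = os.path.join(work_dir, "toy_vocab.txt")
toy_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]
with open(vocab_path, "w", encoding="utf-8") as f:
    f.write("\n".join(toy_vocab) + "\n")

tokenizer = BertTokenizer(vocab_file=vocab_path)
config = BertConfig(
    vocab_size=len(toy_vocab),
    hidden_size=16,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=32,
    num_labels=2,
)
model = BertForSequenceClassification(config)

# Save the model (weights + config.json) and the tokenizer files together.
save_dir = os.path.join(work_dir, "results")
model.save_pretrained(save_dir)
tokenizer.save_pretrained(save_dir)

saved_files = sorted(os.listdir(save_dir))
print(saved_files)

# Both objects can be restored from the folder alone.
reloaded_model = BertForSequenceClassification.from_pretrained(save_dir)
reloaded_tokenizer = BertTokenizer.from_pretrained(save_dir)
```

Because both calls write into the same directory, that single folder is all you need to copy or upload in order to restore the model later.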
UPDATE following the comment:
If you are using from pytorch_pretrained_bert import BertForSequenceClassification, then save_pretrained is not available (as you can see from the code).
Instead, use transformers, which includes this functionality.
Example:
from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
model.save_pretrained('results/tokenizer/')
Another solution would be to use AutoClasses.
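As a rough illustration of the Auto classes route, the sketch below builds a model through AutoModelForSequenceClassification, saves it, and reloads it from the folder; the concrete class is resolved from the model type recorded in config.json. The tiny BertConfig and the temporary directory are assumptions for the sake of a runnable example, not part of the original setup.

```python
import tempfile

from transformers import AutoModelForSequenceClassification, BertConfig

# Hypothetical tiny configuration so the example runs without a download.
config = BertConfig(
    vocab_size=30,
    hidden_size=16,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=32,
)

# The Auto class picks the matching architecture from the config type.
model = AutoModelForSequenceClassification.from_config(config)

save_dir = tempfile.mkdtemp()
model.save_pretrained(save_dir)

# On reload, the saved config.json tells the Auto class which model to build.
reloaded = AutoModelForSequenceClassification.from_pretrained(save_dir)
print(type(reloaded).__name__)
```

The benefit is that the loading code does not need to name the BERT class explicitly, so the same lines keep working if you later switch to a different architecture.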