I am trying to understand BERT vocab here. It has 1000 [unusedxxx] tokens. I don't follow the usage of these tokens. I understand other special tokens like [SEP], [CLS], but what is [unused] used for?
Thanks!
Input-output format: BERT uses the special tokens [CLS] and [SEP] to structure its input. [CLS] is prepended to every sequence, and a [SEP] token marks the end of each segment, so even a single-sentence input ends with [SEP].
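As a minimal sketch of that format (the function name is mine, not from any library), a single sentence gets one trailing [SEP], and a sentence pair gets a [SEP] after each segment:

```python
def format_input(tokens_a, tokens_b=None):
    """Wrap pre-tokenized text in BERT's [CLS]/[SEP] frame.

    A single segment yields [CLS] a... [SEP]; a pair yields
    [CLS] a... [SEP] b... [SEP].
    """
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"]
    if tokens_b:
        tokens += tokens_b + ["[SEP]"]
    return tokens

# Example:
# format_input(["hello", "world"])
#   -> ["[CLS]", "hello", "world", "[SEP]"]
```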
BERT uses what is called a WordPiece tokenizer. It works by splitting words either into full forms (one word becomes one token) or into word pieces, where one word is broken into multiple tokens. This is useful for handling multiple forms of a word: "play", "playing", and "played" can all share the piece "play".
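The splitting works by greedy longest-match against the vocabulary, with continuation pieces prefixed by "##". Here is a toy sketch of that algorithm; the tiny vocabulary is made up for illustration (real BERT ships a ~30k-entry vocab.txt):

```python
# Illustrative vocabulary; real WordPiece vocabularies are much larger.
VOCAB = {"play", "##ing", "##ed", "##er", "the", "[UNK]"}

def wordpiece(word, vocab=VOCAB):
    """Greedily split one word into the longest pieces found in `vocab`."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        cur = None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry the ## prefix
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return ["[UNK]"]  # nothing matched: fall back to the unknown token
        pieces.append(cur)
        start = end
    return pieces

# wordpiece("playing") -> ["play", "##ing"]
# wordpiece("play")    -> ["play"]
```

Because "playing" is not in the vocabulary but "play" and "##ing" are, the word is split rather than mapped to [UNK].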
Special tokens are called special because they are not derived from your input. They are added for a certain purpose and are independent of the specific input.
A quick search reveals the use of these tokens, specifically in the discussion on the original BERT implementation's issue tracker, and in this HuggingFace thread.
Unused tokens are helpful if you want to introduce specific words into your fine-tuning or further pre-training procedure: they let you handle words that are relevant only in your domain exactly as you want, avoiding the subword splitting that would occur with BERT's original vocabulary. To quote from the first discussion:
Just replace the "[unusedX]" tokens with your vocabulary. Since these were not used they are effectively randomly initialized.
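A minimal sketch of what that replacement looks like on a vocab.txt-style list (the function and the sample entries are illustrative; in the real file the [unused0]..[unused993] slots sit near the top). The key property is that the vocabulary size and all existing token ids stay unchanged, unlike `tokenizer.add_tokens`, which grows the embedding matrix:

```python
def replace_unused(vocab_lines, new_word):
    """Overwrite the first free [unusedX] slot with `new_word`.

    Returns the token id (line index) now assigned to `new_word`.
    """
    for i, tok in enumerate(vocab_lines):
        if tok.startswith("[unused"):
            vocab_lines[i] = new_word
            return i
    raise ValueError("no [unusedX] slot left")

# Example with a toy vocab:
# vocab = ["[PAD]", "[unused0]", "[unused1]", "[CLS]"]
# replace_unused(vocab, "covid19")  -> 1, and vocab[1] == "covid19"
```

After editing vocab.txt this way and reloading the tokenizer, the new word is tokenized as a single piece, and its (effectively randomly initialized) embedding gets trained during fine-tuning.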