I am trying to run this example from the Hugging Face website: https://huggingface.co/transformers/task_summary.html. It seems that the model returns two strings instead of logits, and that leads to an error thrown by torch.argmax():
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=True)
text = r"""🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
TensorFlow 2.0 and PyTorch.
"""
questions = [
    "How many pretrained models are available in 🤗 Transformers?",
    "What does 🤗 Transformers provide?",
    "🤗 Transformers provides interoperability between which frameworks?",
]
for question in questions:
    inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]  # the list of all indices of words in question + context
    text_tokens = tokenizer.convert_ids_to_tokens(input_ids)  # get the tokens for the question + context
    answer_start_scores, answer_end_scores = model(**inputs)
    answer_start = torch.argmax(answer_start_scores)  # get the most likely beginning of answer with the argmax of the score
    answer_end = torch.argmax(answer_end_scores) + 1  # get the most likely end of answer with the argmax of the score
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
    print(f"Question: {question}")
    print(f"Answer: {answer}")
You can use the same tokenizer for all of the various BERT models that Hugging Face provides. Since BERT can only take 512 tokens as input at a time, you should set the truncation parameter to True for longer inputs. The add_special_tokens parameter tells the tokenizer to add BERT's special tokens, such as the classification [CLS] and separator [SEP] tokens.
The base BERT model outputs two things: last_hidden_state contains the hidden representations for each token in each sequence of the batch, so its size is (batch_size, seq_len, hidden_size); pooler_output contains a "representation" of each sequence in the batch and is of size (batch_size, hidden_size). For the BERT family of models, the pooler output is the classification ([CLS]) token's hidden state after processing through a linear layer and a tanh activation; that linear layer's weights are trained on the next sentence prediction (classification) objective during pretraining.
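To make those shapes concrete, here is a minimal sketch (assuming the plain bert-base-uncased checkpoint loaded through AutoModel, so the base model's two outputs are exposed):

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# truncation=True guards against inputs longer than BERT's 512-token limit
inputs = tokenizer("🤗 Transformers is great!", truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, seq_len, hidden_size), e.g. torch.Size([1, 9, 768])
print(outputs.pooler_output.shape)      # (batch_size, hidden_size), e.g. torch.Size([1, 768])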
Since one of the recent updates, the models now return task-specific output objects (which behave like dictionaries) instead of plain tuples. The site you used has not been updated to reflect that change. You can either force the model to return a tuple by specifying return_dict=False:
answer_start_scores, answer_end_scores = model(**inputs, return_dict=False)
or you can extract the values from the QuestionAnsweringModelOutput object by calling its values() method (note that this relies on the field order, so it only works as long as start_logits and end_logits are the first two values in the output):
answer_start_scores, answer_end_scores = model(**inputs).values()
or you can use the QuestionAnsweringModelOutput object directly and access its fields by name:
outputs = model(**inputs)
answer_start_scores = outputs.start_logits
answer_end_scores = outputs.end_logits
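Putting it together, a minimal corrected version of the loop from the question, using the attribute access shown above:

for question in questions:
    inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]
    outputs = model(**inputs)
    # read the named logits instead of tuple-unpacking the output object
    answer_start = torch.argmax(outputs.start_logits)
    answer_end = torch.argmax(outputs.end_logits) + 1
    answer = tokenizer.convert_tokens_to_string(
        tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
    print(f"Question: {question}")
    print(f"Answer: {answer}")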