
Huggingface transformer model returns string instead of logits

I am trying to run this example from the Hugging Face website: https://huggingface.co/transformers/task_summary.html. It seems that the model returns two strings instead of logits, which leads to an error thrown by torch.argmax().

    from transformers import AutoTokenizer, AutoModelForQuestionAnswering
    import torch
    tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
    model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad", return_dict=True)
    text = r"""🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
    architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
    Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
    TensorFlow 2.0 and PyTorch.
    """
    questions = ["How many pretrained models are available in 🤗 Transformers?",
    "What does 🤗 Transformers provide?",
    "🤗 Transformers provides interoperability between which frameworks?"]
    for question in questions:
      inputs = tokenizer(question, text, add_special_tokens=True, return_tensors="pt")
      input_ids = inputs["input_ids"].tolist()[0] # the list of all indices of words in question + context
      text_tokens = tokenizer.convert_ids_to_tokens(input_ids) # Get the tokens for the question + context
      answer_start_scores, answer_end_scores = model(**inputs)
      answer_start = torch.argmax(answer_start_scores)  # Get the most likely beginning of answer with the argmax of the score
      answer_end = torch.argmax(answer_end_scores) + 1  # Get the most likely end of answer with the argmax of the score
      answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))
      print(f"Question: {question}")
      print(f"Answer: {answer}")

Reza Afra asked Nov 18 '20 21:11

1 Answer

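Since one of the recent updates, the models now return task-specific output objects (which behave like dictionaries) instead of plain tuples. Unpacking such an object iterates over its keys, which is why you get the strings "start_logits" and "end_logits" rather than tensors. The site you used has not been updated to reflect that change. You can either force the model to return a tuple by specifying return_dict=False: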

    answer_start_scores, answer_end_scores = model(**inputs, return_dict=False)

or you can extract the values from the QuestionAnsweringModelOutput object by calling the values() method:

    answer_start_scores, answer_end_scores = model(**inputs).values()

or read the attributes of the QuestionAnsweringModelOutput object directly:

    outputs = model(**inputs)
    answer_start_scores = outputs.start_logits
    answer_end_scores = outputs.end_logits
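
To see why the original code ends up with strings, here is a minimal sketch that runs without downloading the model. It assumes the output object behaves like an ordered dictionary when unpacked. `fake_output` is a hypothetical stand-in, not the real QuestionAnsweringModelOutput, and its values are plain lists rather than tensors.

```python
from collections import OrderedDict

# Hypothetical stand-in for the model output: a dict-like container with
# the same two keys. The numbers are made up for illustration.
fake_output = OrderedDict(
    start_logits=[0.1, 2.3, 0.4],
    end_logits=[0.2, 0.1, 3.0],
)

# Unpacking a dict iterates over its keys, so both names are bound to strings.
first, second = fake_output
print(type(first).__name__, first, second)

# Unpacking .values() yields the stored values instead.
start_scores, end_scores = fake_output.values()
print(start_scores, end_scores)
```

The first print shows the two key names, which is what torch.argmax() was being handed in the question. Accessing the fields by name, as in the last option above, does not depend on the order of the entries.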
cronoik answered Oct 09 '22 11:10
