Where is perplexity calculated in the Huggingface gpt2 language model code?

I see some GitHub comments saying that the loss returned by the model() call is in the form of perplexity: https://github.com/huggingface/transformers/issues/473

But when I look at the relevant code... https://huggingface.co/transformers/_modules/transformers/modeling_openai.html#OpenAIGPTLMHeadModel.forward

    if labels is not None:
        # Shift so that tokens < n predict n
        shift_logits = lm_logits[..., :-1, :].contiguous()
        shift_labels = labels[..., 1:].contiguous()
        # Flatten the tokens
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
        outputs = (loss,) + outputs

    return outputs  # (loss), lm_logits, (all hidden states), (all attentions)
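(For reference, here is the same computation on made-up toy tensors, batch size 1, sequence length 5, vocabulary size 10, just to show what the snippet produces:)

    import torch
    from torch.nn import CrossEntropyLoss

    # Toy stand-in tensors: batch size 1, sequence length 5, vocabulary size 10
    lm_logits = torch.randn(1, 5, 10)
    labels = torch.randint(0, 10, (1, 5))

    # Same shift as in the library code: positions 0..3 predict tokens 1..4
    shift_logits = lm_logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()

    loss_fct = CrossEntropyLoss()
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
    print(loss)  # a plain mean cross entropy (in nats), not a perplexity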

I see cross entropy being calculated, but no transformation into perplexity. Where does the loss finally get transformed? Or is there a transformation already there that I'm not understanding?

asked Mar 24 '20 by user947659


1 Answer

Ah, OK, I found the answer. The code is actually returning cross entropy. In the GitHub comment where they say it is perplexity... they are saying that because the OP does

    return math.exp(loss)

which transforms cross entropy into perplexity :)
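For completeness, here is a minimal sketch of going from the GPT-2 loss to a perplexity number. The sample sentence is just for illustration, and it assumes a reasonably recent transformers version where the forward pass returns the loss as its first output when labels are supplied:

    import math
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    text = "The quick brown fox jumps over the lazy dog."  # made-up sample sentence
    input_ids = tokenizer.encode(text, return_tensors="pt")

    with torch.no_grad():
        # Passing labels makes the forward pass compute the shifted cross-entropy loss
        outputs = model(input_ids, labels=input_ids)

    loss = outputs[0]                   # mean cross entropy per predicted token (in nats)
    perplexity = math.exp(loss.item())  # perplexity = exp(cross entropy)
    print(perplexity)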

answered Nov 29 '22 by user947659