Why does TensorFlow's documentation call a softmax's input "logits"?

TensorFlow calls each of the inputs to a softmax a logit, and defines the softmax's inputs/logits as "unscaled log probabilities."

Wikipedia and other sources say that a logit is the log of the odds, and the inverse of the sigmoid/logistic function. I.e., if sigmoid(x) = p(x), then logit( p(x) ) = log( p(x) / (1-p(x)) ) = x.
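The inverse relationship described above can be checked numerically. This is a minimal sketch using only the standard library (not TensorFlow's own implementations):

```python
import math

def sigmoid(x):
    # logistic function: maps a real number to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # log-odds: the inverse of the sigmoid
    return math.log(p / (1.0 - p))

x = 2.0
p = sigmoid(x)          # ~0.8808
print(logit(p))         # recovers x, ~2.0
```

Applying `logit` to the sigmoid's output recovers the original input, which is exactly the inverse property the question describes.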

Is there a mathematical or conventional reason for TensorFlow to call a softmax's inputs "logits"? Shouldn't they just be called "unscaled log probabilities"?

Perhaps TensorFlow just wanted to keep the same variable name for binary logistic regression (where it makes sense to use the term logit) and categorical logistic regression...

This question was covered a little bit here, but no one seemed bothered by the use of the word "logit" to mean "unscaled log probability".

Asked May 26 '17 by Brian Bartoldson

1 Answer

"Logit" is nowadays used in the ML community for any non-normalised probability distribution: basically, anything that gets mapped to a probability distribution by a parameter-less transformation, like the sigmoid function for a binary variable or softmax for a multinomial one. It is not a strict mathematical term, but it gained enough popularity to be included in the TF documentation.
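The "parameter-less transformation" for the multinomial case can be sketched as follows. Note that softmax is shift-invariant, which is one reason its inputs are only defined up to an additive constant ("unscaled"):

```python
import math

def softmax(logits):
    # subtract the max for numerical stability; softmax is shift-invariant,
    # so this does not change the resulting distribution
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
print(probs)                              # a valid distribution: sums to 1
print(softmax([z + 5.0 for z in logits])) # same distribution: shifted logits
```

Because adding a constant to every logit leaves the output unchanged, the logits carry no absolute scale, matching the "unscaled log probabilities" wording in the TF docs.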

Answered Sep 18 '22 by lejlot