 

What is the meaning of the word logits in TensorFlow? [duplicate]

In the following TensorFlow function, we must feed in the activations of the artificial neurons in the final layer. That I understand. But why is it called logits? Isn't logit a mathematical function?

loss_function = tf.nn.softmax_cross_entropy_with_logits(
    logits=last_layer,
    labels=target_output
)
Milad P. asked Jan 04 '17

People also ask

What is CNN logits?

A logit function, also known as the log-odds function, maps probability values in (0, 1) to real values in (−∞, ∞). It is the inverse of the sigmoid function, which squashes any real value to a value between 0 and 1.

How do you interpret logits?

A probability of 0.5 corresponds to a logit of 0. Negative logit values indicate probabilities smaller than 0.5, positive logits indicate probabilities greater than 0.5. The relationship is symmetrical: Logits of −0.2 and 0.2 correspond to probabilities of 0.45 and 0.55, respectively.
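A quick numeric check of these values, as a minimal sketch using only the standard library:

import math

def logit(p):
    # log-odds: maps a probability in (0, 1) to (-inf, inf)
    return math.log(p / (1.0 - p))

for p in (0.45, 0.5, 0.55):
    print(p, round(logit(p), 3))
# 0.45 -0.201
# 0.5   0.0
# 0.55  0.201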

What are logits keras?

The logits are the unnormalized log-probabilities output by the model (the values produced before softmax normalization is applied to them).
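In Keras this shows up as the from_logits flag: you can pass raw logits to a loss function and let it apply the softmax internally. A minimal sketch (the tensor values here are made up for illustration):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # raw, unnormalized model outputs
labels = tf.constant([0])                # true class index

# The loss applies softmax to the logits internally.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(loss_fn(labels, logits).numpy())

# Equivalent: normalize first, then use from_logits=False (the default).
probs = tf.nn.softmax(logits)
print(tf.keras.losses.SparseCategoricalCrossentropy()(labels, probs).numpy())

Both prints give the same loss value.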

What does a logit function do?

The purpose of the logit link is to take a linear combination of the covariate values (which may take any value between ±∞) and convert those values to the scale of a probability, i.e., between 0 and 1. The logit link function is defined as logit(p) = log(p / (1 − p)).
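As a sketch of the link in action (the coefficient values are made up): the linear predictor can be any real number, and the inverse link (the sigmoid) squashes it into a probability.

import math

def sigmoid(x):
    # inverse of the logit link: maps (-inf, inf) back to (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical logistic-regression coefficients.
b0, b1 = -1.5, 0.8
for x in (-2.0, 0.0, 4.0):
    eta = b0 + b1 * x                 # linear predictor, unbounded
    print(x, round(sigmoid(eta), 3))  # probability in (0, 1)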


2 Answers

Logits is an overloaded term which can mean many different things:


In math, logit is a function that maps probabilities ((0, 1)) to the real line ((−∞, ∞)): logit(p) = log(p / (1 − p)).

[plot of the logit function]

A probability of 0.5 corresponds to a logit of 0. Negative logits correspond to probabilities less than 0.5, positive logits to probabilities greater than 0.5.

In ML, it can be

the vector of raw (non-normalized) predictions that a classification model generates, which is ordinarily then passed to a normalization function. If the model is solving a multi-class classification problem, logits typically become an input to the softmax function. The softmax function then generates a vector of (normalized) probabilities with one value for each possible class.

Logits also sometimes refer to the element-wise inverse of the sigmoid function.
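A minimal sketch of that last sense, using NumPy (the function names are my own):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    # element-wise inverse of the sigmoid
    return np.log(p / (1.0 - p))

x = np.array([-3.0, 0.0, 2.5])
p = sigmoid(x)                   # probabilities in (0, 1)
print(np.allclose(logit(p), x))  # True: logit recovers the inputs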

Salvador Dali answered Sep 29 '22


Just adding this clarification so that anyone who scrolls down this far can at least get it right, since there are so many wrong answers upvoted.

Diansheng's answer and JakeJ's answer get it right.
A newer answer posted by Shital Shah is even better and more complete.


Yes, logit is a mathematical function in statistics, but the logit used in the context of neural networks is different. The statistical logit doesn't even make sense here.


I couldn't find a formal definition anywhere, but logit basically means:

The raw predictions which come out of the last layer of the neural network.
1. This is the very tensor on which you apply the argmax function to get the predicted class.
2. This is the very tensor which you feed into the softmax function to get the probabilities for the predicted classes.


Also, from a tutorial on the official TensorFlow website:

Logits Layer

The final layer in our neural network is the logits layer, which will return the raw values for our predictions. We create a dense layer with 10 neurons (one for each target class 0–9), with linear activation (the default):

logits = tf.layers.dense(inputs=dropout, units=10) 
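(Aside: tf.layers.dense is from the TF 1.x API. In TF 2.x the equivalent, assuming the same dropout tensor, would be:)

logits = tf.keras.layers.Dense(units=10)(dropout)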

If you are still confused, the situation is like this:

raw_predictions = neural_net(input_layer)
predicted_class_index_by_raw = argmax(raw_predictions)
probabilities = softmax(raw_predictions)
predicted_class_index_by_prob = argmax(probabilities)

where predicted_class_index_by_raw and predicted_class_index_by_prob will be equal.

Another name for raw_predictions in the above code is logits.
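You can verify the argmax claim with a runnable NumPy version of the pseudocode above (the logits values are made up):

import numpy as np

raw_predictions = np.array([1.2, -0.4, 3.3, 0.7])  # logits from a hypothetical net

probabilities = np.exp(raw_predictions)   # softmax: exponentiate...
probabilities /= probabilities.sum()      # ...then normalize to sum to 1

# softmax is monotonically increasing, so the argmax is unchanged.
print(np.argmax(raw_predictions) == np.argmax(probabilities))  # True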


As for why it's called logit... I have no idea. Sorry.
[Edit: See this answer for the historical motivations behind the term.]


Trivia

If you want to, though, you can apply the statistical logit to the probabilities that come out of the softmax function.

If the probability of a certain class is p,
Then the log-odds of that class is L = logit(p).

Also, the probability of that class can be recovered as p = sigmoid(L), using the sigmoid function.

Calculating the log-odds this way is not very useful, though.
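A quick sketch of that round trip (the class probability is made up):

import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

p = 0.85                     # softmax probability of some class
L = logit(p)                 # log-odds of that class
print(round(sigmoid(L), 2))  # 0.85: the probability is recovered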

AneesAhmed777 answered Sep 29 '22