Sigmoid output - can it be interpreted as probability?

The sigmoid function outputs a number between 0 and 1. Is this a probability, or is it merely a 'yes or no' depending on whether it is above or below 0.5?

Minimal example:

Cats vs dogs binary classification. 0 is cat, 1 is dog.

Can I interpret the sigmoid output values as follows:

0.9 - it's most certainly a dog
0.52 - it's more likely to be a dog than a cat, but still quite unsure
0.5 - completely undecided, could be either a cat or a dog
0.48 - it's more likely to be a cat than a dog, but still quite unsure
0.1 - it's most certainly a cat

Or would this be the right way to interpret the results:

0.9 - it's a dog
0.52 - it's a dog
0.5 - completely undecided, could be either a cat or a dog
0.48 - it's a cat
0.1 - it's a cat

Note how in the first case we use the numeric value to express how likely each class is, while in the second case we ignore the probability interpretation entirely and collapse the answers to binary. Which is correct? Can you explain why?
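To make the two readings concrete, here is a minimal sketch. The function names (`as_probability`, `as_hard_label`) are made up for illustration and do not come from any library; the 0.5 threshold is the one described above.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps any real-valued logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def as_probability(p_dog: float) -> str:
    """First reading: keep the number and treat it as P(dog)."""
    return f"P(dog) = {p_dog:.2f}, P(cat) = {1 - p_dog:.2f}"

def as_hard_label(p_dog: float, threshold: float = 0.5) -> str:
    """Second reading: discard the magnitude and keep only the side of the threshold."""
    if p_dog > threshold:
        return "dog"
    if p_dog < threshold:
        return "cat"
    return "undecided"

# The five example outputs from the question (0 is cat, 1 is dog).
for p in (0.9, 0.52, 0.5, 0.48, 0.1):
    print(f"{p}: {as_probability(p)} | {as_hard_label(p)}")
```

Under the second reading, 0.9 and 0.52 become indistinguishable; under the first, they stay different.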

Background context, feel free to skip this:

I've found a number of sources that suggest that yes, sigmoid output can be interpreted as probability:

Source yes 1 - (...) sigmoid(z) will yield a value (a probability) between 0 and 1.
Source yes 2 - The "output" must come from a function that satisfies the properties of a distribution function in order for us to interpret it as probabilities. (...) The "sigmoid function" satisfies these properties.
Source yes 3 - tf.sigmoid(logits) gives you the probabilities.

And a number of sources that suggest the contrary, that sigmoid output cannot be interpreted as probabilities:

Source no 1 - (...) the raw values cannot necessarily be interpreted as raw probabilities!
Source no 2 - Sigmoid (...) is not a probability distribution function
Source no (and also yes) 3 - the short answer is no, however, depending on the loss you use, it may be closer to truth than you may think.

(bonus questions, answer to win a car!) Why are there so many contradicting answers? What do these answers differ in? I find it unlikely that it's just a lot of people being completely wrong about it - I'm thinking they're just talking about different cases or some different fundamental assumptions. What's the difference that I'm missing?

I know I can just use a softmax. I also know that sigmoid can be used for non-exclusive multi-class classification (Source multi 1, Source multi 2, Source multi 3) - although even then it's unclear whether such multiple sigmoids output probabilities of various classes or again simply a 'yes or no', but for multiple classes. In my case though, I'm interested in exclusive two-class (binary) classification, and whether sigmoid can be used to determine its probabilities, or should two-class softmax be used.
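For reference, a two-class softmax and a single sigmoid are algebraically the same function: the softmax probability of "dog" equals the sigmoid of the difference between the two logits. The sketch below checks this numerically; `softmax2` is a hand-written helper for illustration, not a library function, and the logit values are arbitrary.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid of a single logit."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax2(z_cat: float, z_dog: float) -> tuple[float, float]:
    """Two-class softmax; returns (P(cat), P(dog))."""
    m = max(z_cat, z_dog)          # subtract the max for numerical stability
    e_cat = math.exp(z_cat - m)
    e_dog = math.exp(z_dog - m)
    total = e_cat + e_dog
    return e_cat / total, e_dog / total

# Arbitrary example logits, chosen only for illustration.
for z_cat, z_dog in [(0.0, 2.2), (1.5, 1.5), (3.0, -1.0)]:
    p_cat, p_dog = softmax2(z_cat, z_dog)
    # softmax P(dog) matches sigmoid applied to the logit difference
    assert math.isclose(p_dog, sigmoid(z_dog - z_cat))
    print(f"logits=({z_cat}, {z_dog}) -> P(dog)={p_dog:.4f}")
```

So for exclusive two-class classification, the choice between the two is about parameterisation (one logit versus two), not about whether the output can be read as a probability.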

580

asked Nov 26 '19 20:11

Voy

1 Answer

A sigmoid function is not a probability density function (PDF): its integral over the real line diverges instead of equalling 1. It is, however, the cumulative distribution function (CDF) of the standard logistic distribution.
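Both statements can be checked numerically. The following is a rough sketch using a hand-rolled midpoint-rule integrator (`integrate` is a helper written for this example, and the integration bounds are arbitrary stand-ins for infinity):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_pdf(x: float) -> float:
    """Density of the standard logistic distribution: the derivative of the sigmoid."""
    s = sigmoid(x)
    return s * (1.0 - s)

def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# The logistic density integrates to (approximately) 1 ...
area_pdf = integrate(logistic_pdf, -40.0, 40.0)
# ... and its running integral up to x reproduces sigmoid(x), i.e. the CDF.
cdf_at_1 = integrate(logistic_pdf, -40.0, 1.0)
# The sigmoid itself has no finite total area: widening the range keeps increasing it.
area_sigmoid_20 = integrate(sigmoid, -20.0, 20.0)
area_sigmoid_40 = integrate(sigmoid, -40.0, 40.0)

print(area_pdf, cdf_at_1, sigmoid(1.0), area_sigmoid_20, area_sigmoid_40)
```

The area under the sigmoid over [-L, L] is L, so it grows without bound as the range widens, which is why it cannot be a density.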

Regarding your interpretation of the results: even though the sigmoid is not a PDF, its values lie in the interval (0, 1), so you can still read them as a confidence index. With that in mind, I would say that your first interpretation is the more appropriate one, although you are free to threshold the output at 0.5 (your second interpretation) if a hard label suits your purposes better.
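How close that confidence index is to an actual probability depends on how the model was trained, as one of the sources quoted in the question hints. As an illustration only, assuming training with binary cross-entropy (log loss), which the question does not state: for a group of inputs the model cannot tell apart, the single output value that minimises the loss is the fraction of dogs in that group. The labels below are hypothetical.

```python
import math

def mean_bce(p: float, labels: list[int]) -> float:
    """Mean binary cross-entropy of predicting the same value p for every label."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for y in labels
    ) / len(labels)

# Hypothetical group of look-alike inputs: 7 dogs (1) and 3 cats (0).
labels = [1] * 7 + [0] * 3

# Try every constant prediction on a 0.01 grid and keep the one with the lowest loss.
candidates = [i / 100 for i in range(1, 100)]
best_p = min(candidates, key=lambda p: mean_bce(p, labels))

print(best_p)
```

Under that assumption the loss rewards outputs that match observed frequencies, which supports the first interpretation. A real network may still be over- or under-confident, so this is a tendency rather than a guarantee.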

116

answered Sep 18 '22 18:09

edu_

Tags:

machine-learning

neural-network

probability

classification

sigmoid
