Keras Binary Classification - Sigmoid activation function

Tags: python, neural-network, tensorflow, keras, sigmoid

I've implemented a basic MLP in Keras with TensorFlow and I'm trying to solve a binary classification problem. For binary classification, it seems that sigmoid is the recommended activation function, and I don't quite understand why, or how Keras deals with this.

I understand the sigmoid function produces values between 0 and 1. My understanding is that for classification problems using sigmoid, a threshold (typically 0.5) is used to determine the class of an input. In Keras, I'm not seeing any way to specify this threshold, so I assume it's done implicitly in the back-end? If so, how does Keras distinguish between the use of sigmoid in a binary classification problem and in a regression problem? With binary classification we want a binary value, but with regression a continuous value is needed. The only thing I can see that could indicate this is the loss function. Is that what tells Keras how to handle the data?

Additionally, assuming Keras is implicitly applying a threshold, why does it output continuous values when I use my model to predict on new data?

For example:

y_pred = model.predict(x_test)
print(y_pred)

gives:

[7.4706882e-02]
[8.3481872e-01]
[2.9314638e-04]
[5.2297767e-03]
[2.1608515e-01]
...
[4.4894204e-03]
[5.1120580e-05]
[7.0263929e-04]

I can apply a threshold myself when predicting to get a binary output, but surely Keras must be doing that anyway in order to classify correctly? Perhaps Keras applies a threshold when training the model, but not when I use it to predict new values, since the loss function isn't used in predicting? Or is it not applying a threshold at all, and the continuous values it outputs just happen to work well with my model? I've checked that this also happens with the Keras example for binary classification, so I don't think I've made any errors in my code, especially as it's predicting accurately.

If anyone could explain how this is working, I would greatly appreciate it.

Here's my model as a point of reference:

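# Imports assume the TensorFlow-bundled Keras; the original post did not show them.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD

# x_train, y_train, x_test, y_test, batch_size and epochs are defined elsewhere.
model = Sequential()
model.add(Dense(124, activation='relu', input_shape=(2,)))
model.add(Dropout(0.5))
model.add(Dense(124, activation='relu'))
model.add(Dropout(0.1))
# Single sigmoid unit: one value in (0, 1) per sample.
model.add(Dense(1, activation='sigmoid'))
model.summary()

model.compile(loss='binary_crossentropy',
              # Written as `lr` in the original post; newer Keras versions call it `learning_rate`.
              optimizer=SGD(learning_rate=0.1, momentum=0.003),
              metrics=['acc'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)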

asked Mar 06 '18 16:03

Daniel Whettam

1 Answer

The output of a sigmoid unit in a binary classifier is the predicted probability that a sample belongs to the positive class.
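A minimal sketch of turning such probabilities into class labels, in case it helps. The probabilities are the first five values from the question's output; the 0.5 cut-off and the helper name `to_labels` are assumptions for illustration, not something `model.predict` applies for you.

```python
# Probabilities copied from the first five predictions shown in the question.
probs = [7.4706882e-02, 8.3481872e-01, 2.9314638e-04, 5.2297767e-03, 2.1608515e-01]


def to_labels(probabilities, threshold=0.5):
    """Hypothetical helper: 1 if the probability exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]


labels = to_labels(probs)
print(labels)  # [0, 1, 0, 0, 0]
```

On the NumPy array that `model.predict` returns, the equivalent one-liner is `(y_pred > 0.5).astype(int)`. The threshold is a choice you make after prediction, so you are free to move it away from 0.5 if false positives and false negatives have different costs.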

"How is Keras distinguishing between the use of sigmoid in a binary classification problem, or a regression problem?"

It does not need to. It uses the loss function to calculate the loss, then computes the derivatives and updates the weights.
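A rough plain-Python sketch of that point: binary cross-entropy is computed directly from the continuous sigmoid outputs, with no threshold involved. A 0.5 threshold only appears in a reporting metric such as binary accuracy, which does not drive the weight updates. The labels and probabilities below are made up, and both functions are simplified stand-ins rather than the Keras implementations.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == bool(y))
    return hits / len(y_true)


y_true = [0, 1, 0, 1]                  # made-up labels
hesitant = [0.1, 0.8, 0.4, 0.6]        # made-up sigmoid outputs
confident = [0.01, 0.99, 0.05, 0.95]   # same decisions, more confident

for name, y_prob in (("hesitant", hesitant), ("confident", confident)):
    loss = binary_crossentropy(y_true, y_prob)
    acc = binary_accuracy(y_true, y_prob)
    print(name, round(loss, 4), acc)
```

Both sets of predictions give the same accuracy, but the loss is lower for the more confident set. That difference is what the optimizer sees, which is why training works on the raw probabilities.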

In other words:

During training the framework minimizes the loss. The user must specify a loss function, either one provided by the framework or their own. The framework only cares about the scalar value this function returns; its two arguments are the predicted ŷ and the actual y.
Each activation function implements a forward-propagation and a back-propagation function. The framework is only interested in these two functions and does not care what the activation computes. The only requirement is that the activation function is differentiable (almost everywhere); non-linearity is what lets the network model more than a linear function.
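The forward/backward idea can be sketched in plain Python as follows. The `Sigmoid` class is a hypothetical illustration, not Keras's internal API (Keras with TensorFlow relies on automatic differentiation rather than hand-written backward functions). The finite-difference check at the end only confirms that the two functions agree with each other.

```python
import math


class Sigmoid:
    """Hypothetical activation object exposing only forward and backward."""

    def forward(self, z):
        # Squash the pre-activation z into (0, 1) and cache the result.
        self.out = 1.0 / (1.0 + math.exp(-z))
        return self.out

    def backward(self, grad_out):
        # d(sigmoid)/dz = out * (1 - out); the chain rule multiplies it
        # by the gradient arriving from the layer above (or from the loss).
        return grad_out * self.out * (1.0 - self.out)


act = Sigmoid()
z = 0.3
a = act.forward(z)
analytic = act.backward(1.0)

# Central finite difference as an independent check of the derivative.
h = 1e-6
numeric = (Sigmoid().forward(z + h) - Sigmoid().forward(z - h)) / (2 * h)
print(a, analytic, numeric)
```

Nothing in this interface knows whether the output will later be read as a class probability or as a regression value; that interpretation comes only from the loss you pair it with.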

answered Sep 26 '22 22:09

Maxim Egorushkin
