I am lost in the scikit-learn 0.18 user manual (http://scikit-learn.org/dev/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier):
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,) The ith element represents the number of neurons in the ith hidden layer.
If I want only 1 hidden layer with 7 hidden units in my model, should I write it like this? Thanks!
hidden_layer_sizes=(7, 1)
MLPClassifier stands for Multi-layer Perceptron classifier, a name that itself points to a neural network. Unlike other classification algorithms such as Support Vector Machines or Naive Bayes, MLPClassifier relies on an underlying neural network to perform classification.
epsilon : float, default=1e-8. Value for numerical stability in adam. Only used when solver='adam'.
n_iter_no_change : int, default=10. Maximum number of epochs to not meet tol improvement.
MLPClassifier: the multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers, and each layer is fully connected to the following one.
hidden_layer_sizes=(7,)
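is what you want for only 1 hidden layer with 7 hidden units. By contrast, (7, 1) would create two hidden layers, the first with 7 units and the second with 1 unit.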
length = n_layers - 2
holds because n_layers also counts 1 input layer and 1 output layer, which are not hidden layers.
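To see the difference between (7,) and (7, 1) concretely, here is a minimal sketch that fits both on toy data and inspects the fitted model. The random data, the 4 features, the 3 classes, and the helper name fit_and_describe are assumptions made only for illustration; n_layers_ and coefs_ are attributes of a fitted MLPClassifier.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Toy data (assumed for illustration): 20 samples, 4 features, 3 classes.
rng = np.random.RandomState(0)
X = rng.rand(20, 4)
y = np.arange(20) % 3


def fit_and_describe(hidden_layer_sizes):
    """Fit a small MLP and return (total layer count, weight matrix shapes)."""
    clf = MLPClassifier(hidden_layer_sizes=hidden_layer_sizes,
                        max_iter=50, random_state=0)
    with warnings.catch_warnings():
        # The model is not trained to convergence; only its shape matters here.
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf.fit(X, y)
    return clf.n_layers_, [w.shape for w in clf.coefs_]


one_hidden = fit_and_describe((7,))    # input -> 7 -> output
two_hidden = fit_and_describe((7, 1))  # input -> 7 -> 1 -> output

print(one_hidden)
print(two_hidden)
```

With (7,) the fitted model should report 3 layers in total (input, one hidden, output), while (7, 1) should report 4, which matches length = n_layers - 2.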
In the docs:
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
means: hidden_layer_sizes is a tuple of length (n_layers - 2).
n_layers is the total number of layers in the architecture.
2 is subtracted from n_layers because two layers (input and output) are not hidden layers, so they do not count.
default (100,) means that if no value is provided for hidden_layer_sizes, the default architecture will have one input layer, one hidden layer with 100 units, and one output layer.
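The default can be checked without training anything. This is a small sketch that only constructs the estimator and reads the parameter back; the variable names are illustrative.

```python
from sklearn.neural_network import MLPClassifier

# No hidden_layer_sizes given, so the documented default applies.
default_clf = MLPClassifier()
default_sizes = tuple(default_clf.hidden_layer_sizes)

# One entry in the tuple means one hidden layer.
n_hidden_layers = len(default_sizes)

print(default_sizes, n_hidden_layers)
```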
From the docs again:
The ith element represents the number of neurons in the ith hidden layer.
means each entry in the tuple corresponds to one hidden layer, in order.
Examples:
For architecture 56:25:11:7:5:3:1 (56 inputs, 1 output), the hidden layers are 25:11:7:5:3, so hidden_layer_sizes = (25, 11, 7, 5, 3).
For architecture 3:45:2:11:2 (3 inputs, 2 outputs), the hidden layers are 45:2:11, so hidden_layer_sizes = (45, 2, 11).
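The rule used in these examples (drop the first and last entries) can be sketched as a small helper. The function name hidden_sizes_from_architecture and the colon-separated string format are assumptions for illustration, not part of scikit-learn.

```python
def hidden_sizes_from_architecture(architecture):
    """Return the hidden layer sizes from an 'in:h1:h2:...:out' layer spec.

    The first entry (input layer) and the last entry (output layer) are
    dropped, since they are not hidden layers.
    """
    sizes = [int(part) for part in architecture.split(":")]
    if len(sizes) < 2:
        raise ValueError("need at least an input and an output layer")
    return tuple(sizes[1:-1])


first = hidden_sizes_from_architecture("56:25:11:7:5:3:1")
second = hidden_sizes_from_architecture("3:45:2:11:2")

print(first)   # (25, 11, 7, 5, 3)
print(second)  # (45, 2, 11)
```

The resulting tuple can be passed directly as hidden_layer_sizes; the input and output sizes are not specified there because they are inferred from the training data when fit is called.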