Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Keras Sequential model input layer

When creating a Sequential model in Keras, I understand you provide the input shape in the first layer. Does this input shape then make an implicit input layer?

For example, the model below explicitly specifies 2 Dense layers, but is this actually a model with 3 layers consisting of one input layer implied by the input shape, one hidden dense layer with 32 neurons, and then one output layer with 10 possible outputs?

model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
like image 776
blackHoleDetector Avatar asked Oct 04 '17 19:10

blackHoleDetector


People also ask

How do you add input layer to sequential model Keras?

All layers can be defined as dense layers. However, there are two options to define the input layer. We can use the InputLayer() class to explicitly define the input layer of a Keras sequential model or we can use the Dense() class with the input_shape argument that will add the input layer behind the scene.

What is the input layer in Keras?

In Keras, the input layer itself is not a layer, but a tensor. It's the starting tensor you send to the first hidden layer. This tensor must have the same shape as your training data.

Does Keras need an input layer?

It is generally recommend to use the Keras Functional model via Input , (which creates an InputLayer ) without directly using InputLayer . When using InputLayer with the Keras Sequential model, it can be skipped by moving the input_shape parameter to the first layer after the InputLayer .

Does TF Keras sequential support multiple inputs?

Keras is able to handle multiple inputs (and even multiple outputs) via its functional API. Learn more about 3 ways to create a Keras model with TensorFlow 2.0 (Sequential, Functional, and Model Subclassing).


2 Answers

Well, it actually is an implicit input layer indeed, i.e. your model is an example of a "good old" neural net with three layers - input, hidden, and output. This is more explicitly visible in the Keras Functional API (check the example in the docs), in which your model would be written as:

inputs = Input(shape=(784,))                 # input layer
x = Dense(32, activation='relu')(inputs)     # hidden layer
outputs = Dense(10, activation='softmax')(x) # output layer

model = Model(inputs, outputs)

Actually, this implicit input layer is the reason why you have to include an input_shape argument only in the first (explicit) layer of the model in the Sequential API - in subsequent layers, the input shape is inferred from the output of the previous ones (see the comments in the source code of core.py).

You may also find the documentation on tf.contrib.keras.layers.Input enlightening.

like image 78
desertnaut Avatar answered Oct 18 '22 18:10

desertnaut


It depends on your perspective :-)

Rewriting your code in line with more recent Keras tutorial examples, you would probably use:

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax')

...which makes it much more explicit that you only have 2 Keras layers. And this is exactly what you do have (in Keras, at least) because the "input layer" is not really a (Keras) layer at all: it's only a place to store a tensor, so it may as well be a tensor itself.

Each Keras layer is a transformation that outputs a tensor, possibly of a different size/shape to the input. So while there are 3 identifiable tensors here (input, outputs of the two layers), there are only 2 transformations involved corresponding to the 2 Keras layers.

On the other hand, graphically, you might represent this network with 3 (graphical) layers of nodes, and two sets of lines connecting the layers of nodes. Graphically, it's a 3-layer network. But "layers" in this graphical notation are bunches of circles that sit on a page doing nothing, whereas a layers in Keras transform tensors and do actual work for you. Personally, I would get used to the Keras perspective :-)

Note finally that for fun and/or simplicity, I substituted input_dim=784 for input_shape=(784,) to avoid the syntax that Python uses to both confuse newcomers and create a 1-D tuple: (<value>,).

like image 34
omatai Avatar answered Oct 18 '22 19:10

omatai