
Why is it that `input_shape` does not include the batch dimension when passed as an argument to the `Dense` layer?

In Keras, why is it that input_shape does not include the batch dimension when passed as an argument to layers like Dense but DOES include the batch dimension when input_shape is passed to the build method of a model?

import tensorflow as tf
from tensorflow.keras.layers import Dense

if __name__ == "__main__":
    model1 = tf.keras.Sequential([Dense(1, input_shape=[10])])
    model1.summary()

    model2 = tf.keras.Sequential([Dense(1)])
    model2.build(input_shape=[None, 10])  # why [None, 10] and not [10]?
    model2.summary()

Is this a conscious choice of API design? If it is, why?

asked Nov 04 '20 13:11 by Jensun Ravichandran



1 Answer

You can specify the input shape of your model in several different ways, for example by passing one of the following arguments to the first layer of your model:

  • batch_input_shape: A tuple where the first dimension is the batch size.
  • input_shape: A tuple that does not include the batch size, i.e., the batch size is assumed to be None unless batch_size is also specified.
  • input_dim: A scalar indicating the dimension of the input, equivalent to a one-element input_shape.

In all these cases, Keras internally stores an attribute _batch_input_shape, which it uses to build the model.
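As a rough illustration of how these three arguments relate, the sketch below reduces each of them to one batch-inclusive shape. The helper name to_batch_input_shape is hypothetical and this is not Keras's actual code; it only mirrors the behaviour described in the list above.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


def to_batch_input_shape(
    batch_input_shape: Optional[Sequence[Optional[int]]] = None,
    input_shape: Optional[Sequence[Optional[int]]] = None,
    input_dim: Optional[int] = None,
    batch_size: Optional[int] = None,
) -> Shape:
    """Hypothetical helper: reduce the three arguments to one batch-inclusive shape."""
    if batch_input_shape is not None:
        # Already includes the batch dimension, so use it unchanged.
        return tuple(batch_input_shape)
    if input_shape is None and input_dim is not None:
        # input_dim is shorthand for a one-element input_shape.
        input_shape = (input_dim,)
    if input_shape is None:
        raise ValueError("No input shape information was given.")
    # Prepend the batch dimension: None (any batch size) unless batch_size is set.
    return (batch_size,) + tuple(input_shape)


print(to_batch_input_shape(batch_input_shape=(None, 10)))      # (None, 10)
print(to_batch_input_shape(input_shape=(10,)))                 # (None, 10)
print(to_batch_input_shape(input_dim=10))                      # (None, 10)
print(to_batch_input_shape(input_shape=(10,), batch_size=32))  # (32, 10)
```

All three spellings end up as the same batch-inclusive shape, which is the form that build expects.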

Regarding the build method, my guess is that this is indeed a conscious choice: information about the batch size might be useful for building the model in some (perhaps unthought-of) situations. A framework that passes the batch dimension to build is therefore more general than one that doesn't. Nonetheless, I agree with you that naming the argument batch_input_shape instead of input_shape would make everything more consistent.


It is also worth mentioning that users rarely need to call the build method themselves. Keras calls it internally when it is needed. Nowadays, it is even possible to omit the input_shape argument when creating the model (although methods like summary will then not work until the model is built). In this case, Keras is able to infer the input shape from the argument x of fit.
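A minimal sketch of that deferred building, assuming TensorFlow 2.x and NumPy are installed. It calls the model directly on a batch instead of going through fit, which is an assumption made to keep the example short; the variable names are illustrative only.

```python
import numpy as np
import tensorflow as tf

# No input_shape is given, so the model cannot create its weights yet.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
built_before = model.built

# A batch of 4 samples with 10 features each. Calling the model on it
# triggers the build using the full shape, batch dimension included.
x = np.zeros((4, 10), dtype="float32")
y = model(x)

built_after = model.built
kernel_shape = tuple(model.layers[0].kernel.shape)

print(built_before, built_after)  # False True
print(kernel_shape)               # (10, 1)
model.summary()                   # works now that the model is built
```

The kernel shape depends only on the feature dimension (10), not on the batch size (4), which is why leaving the batch dimension as None is usually enough.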

answered Oct 17 '22 21:10 by rvinas