
Why is the batch size None in the method call of a Keras layer?

I am implementing a custom layer in Keras. If I print the shape of the input passed to the call method, I get None as the first element. Why is that? Shouldn't the first element be the batch size?

def call(self, x):
    print(x.shape)  # (None, ...)

When I call model.fit, I pass the batch size:

batch_size = 50
model.fit(x_train, y_train, ..., batch_size=batch_size)

So, when is the method call actually called? And what is the recommended way of getting the batch size in the method call?

nbro asked Jan 02 '23 01:01


1 Answer

None means the dimension is dynamic: its size is not fixed when the model is built and can take any value, depending on the batch size you use at run time.

By default, a model is defined to support any batch size; this is what None means. In TensorFlow 1.*, the input to your model is an instance of tf.placeholder(), and the call method runs while the graph is being built, before any data (and therefore any concrete batch size) is available.

If you don't use keras.InputLayer() with a specified batch size, the first dimension is None by default:

import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [None, 2]
print(model.inputs[0].op.type == 'Placeholder') # True

When you do use keras.InputLayer() with a specified batch size, the input placeholder is defined with a fixed batch size:

import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=50))
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
print(model.inputs[0].get_shape().as_list()) # [50, 2]
print(model.inputs[0].op.type == 'Placeholder') # True

When you pass a batch size to model.fit(), these input placeholders have already been defined and you cannot modify their shape. The batch_size argument of model.fit() is used only to split the data you provide into batches.

If you define your input layer with a batch size of 2 and then pass a different batch size to model.fit(), you will get a ValueError:
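To illustrate what "splitting the data into batches" amounts to, here is a minimal sketch in plain NumPy. The helper split_into_batches is a hypothetical name used only for illustration; it is not how Keras implements model.fit() internally, it only mimics the slicing idea.

```python
import numpy as np


def split_into_batches(data, batch_size):
    """Hypothetical helper: yield consecutive slices of at most `batch_size` rows."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


x_train = np.random.normal(size=(10, 2))

# 10 samples with batch_size=3 give three full batches and one smaller batch.
batch_shapes = [batch.shape for batch in split_into_batches(x_train, batch_size=3)]
print(batch_shapes)  # [(3, 2), (3, 2), (3, 2), (1, 2)]
```

Note that the last batch is smaller than the others, which is one reason a None (dynamic) batch dimension is convenient: the same model can accept batches of different sizes.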

import tensorflow as tf
import numpy as np

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.InputLayer((2,), batch_size=2)) # <--batch_size==2
model.add(tf.keras.layers.Dense(units=2, input_shape=(2, )))
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy')
x_train = np.random.normal(size=(10, 2))
y_train = np.array([[0, 1] for _ in range(10)])

model.fit(x_train, y_train, batch_size=3) # <--batch_size==3

This will raise: ValueError: The `batch_size` argument value 3 is incompatible with the specified batch size of your Input Layer: 2
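As for getting the batch size inside call: a common approach is to read it at run time with tf.shape(x)[0] instead of the static x.shape[0], which may be None. The sketch below assumes TensorFlow 2.x with eager execution; the layer AppendBatchIndex is a hypothetical example, not something from the question.

```python
import tensorflow as tf


class AppendBatchIndex(tf.keras.layers.Layer):
    """Hypothetical layer: appends each sample's position in the batch as an extra feature."""

    def call(self, x):
        # x.shape[0] is the static batch size and may be None.
        # tf.shape(x)[0] is a scalar tensor holding the actual batch size at run time.
        batch_size = tf.shape(x)[0]
        index = tf.cast(tf.range(batch_size), x.dtype)  # shape: (batch,)
        index = tf.expand_dims(index, axis=-1)          # shape: (batch, 1)
        return tf.concat([x, index], axis=-1)           # shape: (batch, features + 1)


layer = AppendBatchIndex()

# The same layer handles different batch sizes without any changes.
out_small = layer(tf.zeros((3, 2)))
out_large = layer(tf.zeros((7, 2)))
print(out_small.shape, out_large.shape)
```

Because the batch size is obtained as a tensor, the layer does not need to know it when the model is defined.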

Vlad answered Jan 14 '23 13:01