When do you use shape vs batch_shape in Keras Input?

Tags: keras, shape

I can't find API documentation that explains Keras' Input.

When should you use the shape argument vs the batch_shape argument?

asked Jun 28 '17 by bhomass

2 Answers

From the Keras source code:

Arguments

    shape: A shape tuple (integer), not including the batch size.
        For instance, `shape=(32,)` indicates that the expected input
        will be batches of 32-dimensional vectors.
    batch_shape: A shape tuple (integer), including the batch size.
        For instance, `batch_shape=(10, 32)` indicates that
        the expected input will be batches of 10 32-dimensional vectors.
        `batch_shape=(None, 32)` indicates batches of an arbitrary number
        of 32-dimensional vectors.

The batch size is the number of examples in each batch of training data (not the size of the whole training set).

You can use either. Personally, I have never used batch_shape. When you use shape, your batch can be any size; you don't have to care about it.

shape=(32,) means exactly the same as batch_shape=(None, 32).
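To illustrate the equivalence, here is a minimal sketch using tf.keras (the exact import path may differ depending on your Keras version; batch_shape is accepted as a keyword for backward compatibility):

```python
from tensorflow.keras.layers import Input

# shape omits the batch dimension; Keras prepends None for you.
x1 = Input(shape=(32,))

# batch_shape includes the batch dimension explicitly;
# None means "any batch size".
x2 = Input(batch_shape=(None, 32))

# Both tensors end up with the same static shape: (None, 32).
print(tuple(x1.shape))
print(tuple(x2.shape))
```

Since the resulting shapes are identical, shape is the more convenient choice unless a layer downstream actually needs to know the batch size.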

answered Sep 21 '22 by Daniel Möller

To expand on Daniel's answer, one case I've found where it's necessary to specify batch_shape instead of shape in an Input layer is when you are using stateful LSTMs in the functional API. It's described well in Philippe Rémy's blog. In short, stateful mode lets an LSTM carry its hidden state values across batches (they are normally reset after every batch under the default stateful=False), which means the layer needs to know the batch size in order to shape its state tensors properly. If you don't provide it, Keras yells at you:

ValueError: If a RNN is stateful, it needs to know its batch size. Specify the batch size of your input tensors: 
    - If using a Sequential model, specify the batch size by passing a `batch_input_shape` argument to your first layer.
    - If using the functional API, specify the batch size by passing a `batch_shape` argument to your Input layer.

The second point is the relevant one here. If you use an LSTM with stateful=True in the functional API, you need to set batch_shape on your Input layers.
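A minimal sketch of the stateful case (the layer sizes and timestep count here are arbitrary, chosen just for illustration):

```python
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

# A fixed batch size of 10 is required because stateful=True keeps
# one hidden state per sequence slot in the batch across calls.
# batch_shape = (batch_size, timesteps, features)
inputs = Input(batch_shape=(10, 5, 32))
outputs = LSTM(16, stateful=True)(inputs)
model = Model(inputs, outputs)

# The batch dimension is now fixed at 10 rather than None.
print(model.output_shape)
```

Replacing batch_shape=(10, 5, 32) with shape=(5, 32) in this snippet reproduces the ValueError quoted above, since the batch dimension would be left as None.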

answered Sep 20 '22 by adamconkey