What exactly does tf.keras.layers.Dense do?



My question

I'm using Keras to build a convolutional neural network. I ran across the following:

model = tf.keras.Sequential()
model.add(layers.Dense(10*10*256, use_bias=False, input_shape=(100,)))

I'm curious - what exactly mathematically is going on here?

My best guess

My guess is that for input of size [100,N], the network will be evaluated N times, once for each training example. The Dense layer created by layers.Dense contains (10*10*256) * (100) parameters that will be updated during backpropagation.

1 Answer

Dense implements the operation: output = activation(dot(input, kernel) + bias) where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).

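Note: If the input to the layer has a rank greater than 2, it is not flattened. Dense computes the dot product between the input and the kernel along the last axis of the input and axis 0 of the kernel (as tf.tensordot does), so all leading dimensions are preserved.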
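To tie the formula to the layer in the question, here is a minimal NumPy sketch. It is not the Keras implementation: dense_forward is a hypothetical helper, and the batch size of 4 and the random values are arbitrary assumptions. It illustrates that a batch of shape (N, 100) is multiplied by the kernel in a single matrix product (equivalent to applying the layer to each example), and that with use_bias=False the only trainable parameters are the 100 * (10*10*256) kernel entries.

```python
import numpy as np

def dense_forward(inputs, kernel, bias=None, activation=None):
    """Hypothetical helper: output = activation(dot(input, kernel) + bias)."""
    # Contract the last axis of `inputs` with axis 0 of `kernel`.
    outputs = inputs @ kernel
    if bias is not None:          # skipped when use_bias=False
        outputs = outputs + bias
    if activation is not None:    # None means "linear": a(x) = x
        outputs = activation(outputs)
    return outputs

rng = np.random.default_rng(0)
batch_size, input_dim, units = 4, 100, 10 * 10 * 256  # batch_size is an assumption

inputs = rng.normal(size=(batch_size, input_dim))  # shape (N, 100), not (100, N)
kernel = rng.normal(size=(input_dim, units))       # the layer's only weights here

outputs = dense_forward(inputs, kernel)  # use_bias=False, no activation

print(outputs.shape)  # (4, 25600)
print(kernel.size)    # 2560000
```

So the parameter count guessed in the question, (10*10*256) * 100 = 2,560,000, matches the size of the kernel when there is no bias.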


from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# as first layer in a sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)

# after the first layer, you don't need to specify
# the size of the input anymore:
model.add(Dense(32))

Arguments:

> units: Positive integer, dimensionality of the output space.

> activation: Activation function to use. If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

> use_bias: Boolean, whether the layer uses a bias vector.

> kernel_initializer: Initializer for the kernel weights matrix.

> bias_initializer: Initializer for the bias vector. 

> kernel_regularizer: Regularizer function applied to the kernel weights matrix.

> bias_regularizer: Regularizer function applied to the bias vector.

> activity_regularizer: Regularizer function applied to the output of the layer (its "activation").

> kernel_constraint: Constraint function applied to the kernel weights matrix.

> bias_constraint: Constraint function applied to the bias vector.

Input shape:

N-D tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).

Output shape:

N-D tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
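The N-D shape rule above can be checked with a small NumPy sketch. This only mimics the contraction over the last input axis with np.tensordot; it is not the Keras code, and the sizes (2, 5, 16, 32) are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, input_dim, units = 2, 5, 16, 32  # illustrative sizes

x = rng.normal(size=(batch_size, timesteps, input_dim))  # rank-3 input
kernel = rng.normal(size=(input_dim, units))             # shape (input_dim, units)

# Contract the last axis of x with axis 0 of the kernel.
y = np.tensordot(x, kernel, axes=[[x.ndim - 1], [0]])

print(y.shape)  # (2, 5, 32)

# Each slice along the middle axis is an ordinary 2D dense product.
print(np.allclose(y[:, 0, :], x[:, 0, :] @ kernel))  # True
```

Only the last dimension changes from input_dim to units; every other dimension passes through unchanged.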

