
When should we inherit from keras.Model instead of keras.layers.Layer even if we don't use model.fit?

In some TensorFlow tutorials for TF2 (e.g. Neural Machine Translation with Attention and Eager essentials), custom classes are defined as subclasses of tf.keras.Model instead of tf.keras.layers.Layer (e.g. BahdanauAttention(tf.keras.Model):).

Also, the Models: composing layers doc uses tf.keras.Model explicitly. The section says:

The main class used when creating a layer-like thing which contains other layers is tf.keras.Model. Implementing one is done by inheriting from tf.keras.Model.

It sounds like we need to inherit from tf.keras.Model to define a layer that composes child layers.

However, as far as I checked, this code also works if I define ResnetIdentityBlock as a subclass of tf.keras.layers.Layer. The other two tutorials work with Layer too. In addition, another tutorial says:

Model is just like a Layer, but with added training and serialization utilities.
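
To make the observation about ResnetIdentityBlock concrete, here is a minimal sketch of such a block written as a tf.keras.layers.Layer subclass. It is only loosely modelled on the tutorial's block: the exact layer arrangement, the filter sizes and the input shape are assumptions chosen for illustration, not the tutorial's code.

```python
import tensorflow as tf


class ResnetIdentityBlock(tf.keras.layers.Layer):
    """Identity block written as a Layer subclass instead of a Model subclass."""

    def __init__(self, kernel_size, filters):
        super().__init__()
        filters1, filters2, filters3 = filters
        # Child layers assigned as attributes are tracked by the parent Layer.
        self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
        self.bn2a = tf.keras.layers.BatchNormalization()
        self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding="same")
        self.bn2b = tf.keras.layers.BatchNormalization()
        self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
        self.bn2c = tf.keras.layers.BatchNormalization()

    def call(self, input_tensor, training=False):
        x = tf.nn.relu(self.bn2a(self.conv2a(input_tensor), training=training))
        x = tf.nn.relu(self.bn2b(self.conv2b(x), training=training))
        x = self.bn2c(self.conv2c(x), training=training)
        # Residual connection: the last filter count must match the input channels.
        return tf.nn.relu(x + input_tensor)


# Illustrative sizes only (assumed): 3 input channels, so the last filter count is 3.
block = ResnetIdentityBlock(1, [1, 2, 3])
output = block(tf.zeros([1, 2, 3, 3]))
num_trainable = len(block.trainable_variables)
print(output.shape, num_trainable)
```

The block can be called eagerly and exposes the variables of its child layers, without inheriting from tf.keras.Model.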

Thus, I don't understand what the real difference between tf.keras.Model and tf.keras.layers.Layer is, or why those three tutorials with eager execution use tf.keras.Model even though they don't use its training and serialization utilities.

Why do we need to inherit from tf.keras.Model in those tutorials?

Additional comment

The utilities of Model work only with a special subset of Layers (Layers whose call receives only one input). Thus, I think an idea like "Always extend Model because Model has more features" is not correct. It also violates basic programming principles such as SRP (the single-responsibility principle).

asked Sep 26 '19 13:09 by yunabe


1 Answer

Update

So the comment was: Yes, I know training and serialization utilities exist in Model as I wrote in the question. My question is why TF tutorials need to use Model though they don't use these methods.

The best answer can be provided by the authors in this case, because your question asks why they chose one approach over the other when both can do the job equally well. Why can both do the job equally well? Because Model is just like a Layer, but with added training and serialization utilities.

We can argue that using Model when a plain Layer can do the job is overkill, but then it may be a matter of taste.
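
As an illustration of a plain Layer doing the job without model.fit, here is a minimal sketch of a custom training loop with tf.GradientTape. The TinyRegressor class, the synthetic data, the learning rate and the number of steps are all hypothetical choices made for this example, not something taken from the tutorials.

```python
import tensorflow as tf


class TinyRegressor(tf.keras.layers.Layer):
    """Hypothetical composite layer: it wraps a single Dense child layer."""

    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.dense(inputs)


# Synthetic regression data (assumed): the target is the sum of the features.
tf.random.set_seed(0)
x = tf.random.normal((64, 3))
y = tf.reduce_sum(x, axis=1, keepdims=True)

layer = TinyRegressor()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)


def loss_fn():
    # Mean squared error between the layer's prediction and the target.
    return tf.reduce_mean(tf.square(layer(x) - y))


initial_loss = float(loss_fn())
for _ in range(50):
    with tf.GradientTape() as tape:
        loss = loss_fn()
    # trainable_weights is part of the Layer API, so no Model is required here.
    grads = tape.gradient(loss, layer.trainable_weights)
    optimizer.apply_gradients(zip(grads, layer.trainable_weights))
final_loss = float(loss_fn())

print(initial_loss, final_loss)
```

In a loop like this, nothing from the Model API (compile, fit, evaluate) is used, which is the situation the question describes.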

Hope it helps

PS.

In the eager example and the custom layer writing tutorial that you provided, we cannot replace Model with Layer, so these tutorials do not apply to your question.


With Model you can train using its built-in methods such as fit, but with Layer alone you cannot. See the list of their methods below (excluding inner and inherited ones):

tf.keras.layers.Layer

activity_regularizer
add_loss
add_metric
add_update
add_variable
add_weight
apply
build
call
compute_mask
compute_output_shape
count_params
dtype
dynamic
from_config
get_config
get_input_at
get_input_mask_at
get_input_shape_at
get_losses_for
get_output_at
get_output_mask_at
get_output_shape_at
get_updates_for
get_weights
inbound_nodes
input
input_mask
input_shape
losses
metrics
name
non_trainable_variables
non_trainable_weights
outbound_nodes
output
output_mask
output_shape
set_weights
trainable
trainable_variables
trainable_weights
updates
variables
weights

See? There is no fit or evaluate method there.

tf.keras.Model

compile
evaluate
evaluate_generator
fit
fit_generator
get_weights
load_weights
metrics
metrics_names
predict
predict_generator
predict_on_batch
reset_metrics
run_eagerly
sample_weights
test_on_batch
train_on_batch
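
The difference between the two lists can also be checked programmatically. The following is a small sketch that looks up a few of the method names listed above on both classes; the selection of names is just a sample, and the result may depend on the installed TensorFlow version.

```python
import tensorflow as tf

# A sample of the training-related method names from the Model list above.
training_api = ["compile", "fit", "evaluate", "predict"]

layer_has = {name: hasattr(tf.keras.layers.Layer, name) for name in training_api}
model_has = {name: hasattr(tf.keras.Model, name) for name in training_api}

# Model is itself a subclass of Layer, so it inherits the whole Layer API.
model_is_layer = issubclass(tf.keras.Model, tf.keras.layers.Layer)

print("Layer:", layer_has)
print("Model:", model_has)
print("Model is a Layer subclass:", model_is_layer)
```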

answered Oct 21 '22 13:10 by eugen