Reading through the documentation on implementing custom layers with tf.keras, they specify two options to inherit from: tf.keras.Layer and tf.keras.Model.
In the context of creating custom layers, what is the technical difference between these two?
If I were to implement the transformer encoder, for example, which one would be more suitable? (Assuming the transformer is only a "layer" in my full model.)
The difference between tf.keras and keras is the TensorFlow-specific enhancements to the framework. Keras is an API specification that describes how a deep learning framework should implement certain parts related to model definition and training.
tf.keras.layers.Conv2D is a tensorflow-keras layer, while tf.layers.max_pooling2d is a TensorFlow "native layer". You cannot use a native layer directly within a Keras model, as it will be missing certain attributes required by the Keras API. However, it is possible to use a native layer if it is wrapped within a tensorflow-keras Lambda layer.
Defining models and layers in TensorFlow
Most models are made of layers. Layers are functions with a known mathematical structure that can be reused and have trainable variables. In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: tf.Module.
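As a rough illustration of "a function with trainable variables", here is a minimal sketch of a dense layer written directly on tf.Module. The class name SimpleDense and the sizes used are made up for this example.

```python
import tensorflow as tf


class SimpleDense(tf.Module):
    """Hypothetical dense layer built directly on tf.Module."""

    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        # Variables assigned as attributes are tracked by tf.Module.
        self.w = tf.Variable(
            tf.random.normal([in_features, out_features]), name="w")
        self.b = tf.Variable(tf.zeros([out_features]), name="b")

    def __call__(self, x):
        # The "known mathematical structure": relu(x @ w + b).
        return tf.nn.relu(tf.matmul(x, self.w) + self.b)


dense = SimpleDense(in_features=3, out_features=2)
y = dense(tf.ones([4, 3]))               # batch of 4 rows, 3 features each
n_vars = len(dense.trainable_variables)  # w and b
```

Keras layers add conveniences on top of this idea, such as deferred weight creation in build() and integration with the Keras training loops.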
What is Keras? Keras is an open-source neural network library written in Python that runs on top of Theano or TensorFlow. It is designed to be modular, fast, and easy to use. It was developed by François Chollet, a Google engineer.
In the documentation:
The Model class has the same API as Layer, with the following differences:
- It exposes built-in training, evaluation, and prediction loops (model.fit(), model.evaluate(), model.predict()).
- It exposes the list of its inner layers, via the model.layers property.
- It exposes saving and serialization APIs.
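A small sketch of these differences. The names Scale and ScaleModel are hypothetical and not from the documentation. The code uses tf.keras.layers.Layer, the long-standing import path for the Layer base class.

```python
import tensorflow as tf


class Scale(tf.keras.layers.Layer):
    """Toy layer: multiplies its input by one trainable scalar."""

    def build(self, input_shape):
        self.alpha = self.add_weight(
            name="alpha", shape=(), initializer="ones", trainable=True)

    def call(self, inputs):
        return inputs * self.alpha


class ScaleModel(tf.keras.Model):
    """Toy model wrapping the same computation."""

    def __init__(self):
        super().__init__()
        self.scale = Scale()

    def call(self, inputs):
        return self.scale(inputs)


layer = Scale()
model = ScaleModel()

layer_has_fit = hasattr(layer, "fit")   # False: a Layer has no training loop
model_has_fit = hasattr(model, "fit")   # True: Model adds fit/evaluate/predict
model_is_layer = isinstance(model, tf.keras.layers.Layer)  # Model subclasses Layer
inner_layers = model.layers             # the tracked Scale instance
```

Because Model is itself a Layer, a Model can still be nested inside another model. The choice mostly affects which extra methods come along.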
Effectively, the "Layer" class corresponds to what we refer to in the literature as a "layer" (as in "convolution layer" or "recurrent layer") or as a "block" (as in "ResNet block" or "Inception block").
Meanwhile, the "Model" class corresponds to what is referred to in the literature as a "model" (as in "deep learning model") or as a "network" (as in "deep neural network").
So if you want to be able to call .fit(), .evaluate(), or .predict() on those blocks, or you want to save and load those blocks separately, you should use the Model class. The Layer class is leaner, so you won't bloat the layers with unnecessary functionality, though I would guess that this generally wouldn't be a big problem.
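For the transformer encoder case from the question, here is a minimal sketch that treats the encoder block as a Layer and leaves the training loop to the outer Model. The class name TransformerEncoderBlock and all hyperparameters (d_model, num_heads, d_ff, dropout) are illustrative assumptions, and the post-norm layout is just one common variant.

```python
import tensorflow as tf


class TransformerEncoderBlock(tf.keras.layers.Layer):
    """Hypothetical encoder block: self-attention + feed-forward, post-norm."""

    def __init__(self, d_model=32, num_heads=4, d_ff=64, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.attention = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(d_ff, activation="relu"),
            tf.keras.layers.Dense(d_model),
        ])
        self.norm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.norm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = tf.keras.layers.Dropout(dropout)
        self.dropout2 = tf.keras.layers.Dropout(dropout)

    def call(self, inputs, training=False):
        # Self-attention sublayer with a residual connection.
        attn = self.attention(inputs, inputs)
        attn = self.dropout1(attn, training=training)
        x = self.norm1(inputs + attn)
        # Position-wise feed-forward sublayer with a residual connection.
        ffn = self.ffn(x)
        ffn = self.dropout2(ffn, training=training)
        return self.norm2(x + ffn)


# The block is used like any built-in layer inside the full model.
inputs = tf.keras.Input(shape=(10, 32))        # (sequence length, d_model)
x = TransformerEncoderBlock()(inputs)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)        # only the outer object needs fit()
model.compile(optimizer="adam", loss="mse")
```

If the encoder later needs to be trained or saved on its own, changing the base class to tf.keras.Model should require few other changes, since both share the same call-based API.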