What is a multi-headed model? And what exactly is a 'head' in a model?

What is a multi-headed model in deep learning?

The only explanation I have found so far is this: every model can be thought of as a backbone plus a head, and if you pre-train the backbone and attach a randomly initialized head, you can fine-tune the whole model, which is said to be a good idea.
Can someone please provide a more detailed explanation?

asked May 06 '19 11:05

spacer.34

1 Answer

The explanation you found is accurate. Depending on what you want to predict from your data, you need an adequate backbone network and a certain number of prediction heads.

For a basic classification network, for example, you can view ResNet, AlexNet, VGGNet, Inception, ... as the backbone and the fully connected layer as the sole prediction head.

A good example of a problem where you need multiple heads is localization, where you not only want to classify what is in the image but also want to localize the object (find the coordinates of the bounding box around it).

The image below shows the general architecture:

[image: general architecture]

The backbone network ("convolution and pooling") is responsible for extracting a feature map from the image that contains higher-level, summarized information. Each head uses this feature map as its input to predict its desired outcome.
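
To make the idea concrete, here is a minimal, framework-free sketch in plain Python. It is not taken from any particular library: the class name `TwoHeadedModel`, the layer sizes, and the single dense layer standing in for the "convolution and pooling" backbone are all assumptions made for illustration. In practice you would build this with a deep learning framework, but the structure is the same: the features are computed once and every head reads them.

```python
import random

def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

class TwoHeadedModel:
    """Hypothetical model: one shared backbone feeding two heads."""

    def __init__(self, n_inputs, n_features, n_classes, seed=0):
        rng = random.Random(seed)
        # Stand-in for the convolution-and-pooling backbone.
        self.backbone = init_layer(n_inputs, n_features, rng)
        # Head 1: class scores ("what is in the image?").
        self.class_head = init_layer(n_features, n_classes, rng)
        # Head 2: four bounding-box coordinates ("where is it?").
        self.box_head = init_layer(n_features, 4, rng)

    def forward(self, x):
        features = relu(linear(x, *self.backbone))  # computed once, shared
        class_scores = linear(features, *self.class_head)
        box = linear(features, *self.box_head)
        return class_scores, box

model = TwoHeadedModel(n_inputs=8, n_features=16, n_classes=3)
class_scores, box = model.forward([0.1] * 8)
print(len(class_scores), len(box))  # prints: 3 4
```

Adding another task would mean adding another head that reads the same `features`; the backbone itself does not change.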

The loss that you optimize for during training is usually a weighted sum of the individual losses for each prediction head.
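
As a rough sketch of such a weighted sum for the localization example, the snippet below combines a softmax cross-entropy loss for the classification head with a mean squared error for the bounding-box head. The specific loss functions, the function names, and the weights `w_class` and `w_box` are assumptions chosen for illustration; the right losses and weights depend on your task.

```python
import math

def cross_entropy(scores, target_index):
    """Softmax cross-entropy for one example (classification head)."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[target_index]

def mean_squared_error(predicted, target):
    """Mean squared error for one example (box regression head)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def total_loss(class_scores, class_target, box, box_target,
               w_class=1.0, w_box=1.0):
    """Weighted sum of the per-head losses."""
    return (w_class * cross_entropy(class_scores, class_target)
            + w_box * mean_squared_error(box, box_target))

# Made-up head outputs and targets, purely for illustration.
loss = total_loss(
    class_scores=[0.0, 0.0, 0.0], class_target=0,
    box=[0.0, 0.0, 1.0, 1.0], box_target=[0.0, 0.0, 1.0, 3.0],
    w_box=0.5,
)
print(loss)  # ln(3) + 0.5 * 1.0
```

Because both heads share the backbone, the gradient of this single scalar updates the backbone with respect to both tasks, and the weights control how much each task influences it.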

answered Sep 23 '22 16:09

SaiBot

Tags: machine-learning, neural-network, deep-learning
