How to do multi-head learning

Tags: machine-learning, deep-learning, pytorch

I have about 5 models that each work pretty well when trained individually, but I want to fuse them together into one big model. I'm looking into this because one big model is easier to update in production than many small models. This is an image of what I want to achieve: [image]

My questions are: is it OK to do it like this? And, having one dataset per head model, how am I supposed to train the whole model?

asked Nov 20 '19 08:11

PrinceZee

1 Answer

is it OK to do it like this?

Sure, you can do that. This approach is called multi-task learning. Depending on your datasets and what you are trying to do, it may even improve performance. Microsoft used a multi-task model to achieve some good results on the GLUE benchmark for NLP, but they also noted that you can improve performance further by fine-tuning the joint model for each individual task.
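To make the last point concrete, here is a minimal sketch of per-task fine-tuning. It is only one possible reading of "fine-tuning the joint model for each individual task", not necessarily the procedure Microsoft used. The layer sizes, the helper name `finetune_for_task`, and the random tensors standing in for a task dataset are all assumptions made for illustration.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical joint model: a shared trunk plus one head per task.
joint = nn.ModuleDict({
    "shared": nn.Sequential(nn.Linear(8, 16), nn.ReLU()),
    "depth": nn.Linear(16, 1),
    "denseflow": nn.Linear(16, 2),
})


def finetune_for_task(joint_model, tasktag, x, y, steps=5, lr=1e-3):
    """Return a copy of the joint model trained further on one task's data."""
    model = copy.deepcopy(joint_model)  # the joint model itself is left untouched
    # Only the shared trunk and the selected head are handed to the optimizer.
    params = list(model["shared"].parameters()) + list(model[tasktag].parameters())
    optimizer = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model[tasktag](model["shared"](x)), y)
        loss.backward()
        optimizer.step()
    return model


# Random tensors stand in for the real "depth" dataset.
x, y = torch.randn(32, 8), torch.randn(32, 1)
depth_model = finetune_for_task(joint, "depth", x, y)
```

Note the trade-off: this yields one fine-tuned copy per task, which gives up part of the "single model in production" benefit in exchange for possibly better per-task performance.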

having one dataset per head model, how am I supposed to train the whole model?

All you need is a PyTorch module container such as nn.ModuleDict (nn.ModuleList also works, but a dict lets you look up each head by its task tag):

# Please note this is just a sketch and I'm not well versed in computer vision,
# so check that the resnet50 import is correct. Depth and Denseflow are
# placeholders for your task-specific nn.Module heads; define or import them.
from torch import nn
from torchvision.models import resnet50

class MultiTaskModel(nn.Module):
    def __init__(self):
        super().__init__()  # must be called before registering submodules

        # shared part
        self.resnet50 = resnet50()

        # task-specific heads, keyed by task tag
        self.tasks = nn.ModuleDict({
            'depth': Depth(),
            'denseflow': Denseflow(),
            # ...
        })

    def forward(self, tasktag, x):
        # shared part
        resnet_output = self.resnet50(x)

        # task-specific part: route to the head selected by tasktag
        return self.tasks[tasktag](resnet_output)
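As for training with one dataset per head, one common option is to alternate batches between the tasks, so that every step updates the shared part plus the head of the task the batch came from. The sketch below is a minimal illustration under stated assumptions, not a definitive recipe: `TinyMultiTaskModel` is a small stand-in for the model above, `make_loader` is a hypothetical helper, and random tensors and MSE losses stand in for the real datasets and task losses.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class TinyMultiTaskModel(nn.Module):
    """Stand-in model: a shared trunk plus one head per task."""

    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(8, 16), nn.ReLU())  # hypothetical trunk
        self.tasks = nn.ModuleDict({
            "depth": nn.Linear(16, 1),      # hypothetical head
            "denseflow": nn.Linear(16, 2),  # hypothetical head
        })

    def forward(self, tasktag, x):
        return self.tasks[tasktag](self.shared(x))


def make_loader(out_dim, n=32, batch_size=8):
    # Random tensors stand in for one real per-task dataset.
    dataset = TensorDataset(torch.randn(n, 8), torch.randn(n, out_dim))
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)


torch.manual_seed(0)
model = TinyMultiTaskModel()
loaders = {"depth": make_loader(1), "denseflow": make_loader(2)}
loss_fns = {"depth": nn.MSELoss(), "denseflow": nn.MSELoss()}
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

steps_per_task = {tag: 0 for tag in loaders}
for epoch in range(2):
    # zip stops at the shortest loader; here all loaders have the same length.
    for batches in zip(*loaders.values()):
        for tasktag, (x, y) in zip(loaders.keys(), batches):
            optimizer.zero_grad()
            loss = loss_fns[tasktag](model(tasktag, x), y)
            loss.backward()  # gradients reach the trunk and the selected head
            optimizer.step()
            steps_per_task[tasktag] += 1
```

If the datasets differ a lot in size, `zip` will cut every epoch to the smallest one; sampling a task at random for each step, or cycling the shorter loaders, are alternatives worth trying.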

answered Oct 15 '22 22:10

cronoik
