Keras, TensorFlow: Merge two different model outputs into one

I am working on a deep learning model in which I am trying to combine the outputs of two different models.

The overall structure is like this:

[image: diagram of the overall structure]

The first model takes one matrix as input, for example [10 x 30]:

#input 1
input_text          = layers.Input(shape=(1,), dtype="string")
embedding           = ElmoEmbeddingLayer()(input_text)
model_a             = Model(inputs = [input_text] , outputs=embedding)
                      # shape : [10,50]

The second model takes two input matrices:

X_in               = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,32])))
M_in               = layers.Input(tensor=K.variable(np.random.uniform(-1,1,[10,10])))

md_1               = New_model()([X_in, M_in]) #new_model defined somewhere
model_s            = Model(inputs = [X_in, M_in], outputs = md_1)
                     # shape : [10,50]

I want to make these two matrices trainable. In TensorFlow I was able to do this with:

matrix_a = tf.get_variable(name='matrix_a',
                           shape=[10,10],
                           dtype=tf.float32,
                           initializer=tf.constant_initializer(np.array(matrix_a)),
                           trainable=True)

I have no idea how to make matrix_a and matrix_b trainable, or how to merge the outputs of both networks and then feed in the inputs.

I went through this question but couldn't find an answer, because its problem statement is different from mine.

What I have tried so far:

#input 1
input_text          = layers.Input(shape=(1,), dtype="string")
embedding           = ElmoEmbeddingLayer()(input_text)
model_a             = Model(inputs = [input_text] , outputs=embedding)
                      # shape : [10,50]

X_in               = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,10])))
M_in               = layers.Input(tensor=K.variable(np.random.uniform(-1,1,[10,100])))

md_1               = New_model()([X_in, M_in]) #new_model defined somewhere
model_s            = Model(inputs = [X_in, M_in], outputs = md_1)
                    # [10,50]


#transpose second model output

transpose         = Lambda(lambda x: K.transpose(x))
agglayer          = transpose(md_1)

# dot product of first and second model output
dott             = Lambda(lambda x: K.dot(x[0],x[1]))
kmean_layer     = dott([embedding,agglayer])


# input 
final_model = Model(inputs=[input_text, X_in, M_in], outputs=kmean_layer,name='Final_output')
final_model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
final_model.summary()

Overview of the model:

[image: overview of the model]

Update:

Model b

X = np.random.uniform(0,9,[10,32])
M = np.random.uniform(-1,1,[10,10])


X_in = layers.Input(tensor=K.variable(X))
M_in = layers.Input(tensor=K.variable(M))



layer_one       = Model_b()([M_in, X_in])
dropout2       = Dropout(dropout_rate)(layer_one)
layer_two      = Model_b()([layer_one, X_in])

model_b_ = Model([X_in, M_in], layer_two, name='model_b')

Model a

length = 150


dic_size = 100
embed_size = 12

input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)

embedding = LSTM(5)(embedding) 
embedding = Dense(10)(embedding)

model_a = Model(input_text, embedding, name = 'model_a')

I am merging like this:

mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, model_b_.output])



final_model = Model(inputs=[model_b_.input[0],model_b_.input[1],model_a.input], outputs=mult)

Is this the right way to matmul the outputs of two Keras models?

I don't know whether I am merging the outputs correctly or whether the model itself is correct.

I would greatly appreciate any advice on how to make that matrix trainable and how to merge the models' outputs correctly before feeding in the inputs.

Thanks in advance!

asked Nov 17 '19 13:11

Aaditya Ura

1 Answer

Trainable weights

Since you want custom trainable weights, the way to do this in Keras is to create a custom layer.

Because this custom layer has no real inputs, we need a hack, which is explained further down.

This is the layer definition for the custom weights:

from keras.layers import *
from keras.models import Model
from keras.initializers import get as get_init, serialize as serial_init
import keras.backend as K
import tensorflow as tf
import numpy as np  #used later to generate the training data


class TrainableWeights(Layer):

    #you can pass keras initializers when creating this layer
    #kwargs will take base layer arguments, such as name and others if you want
    def __init__(self, shape, initializer='uniform', **kwargs):
        super(TrainableWeights, self).__init__(**kwargs)
        self.shape = shape
        self.initializer = get_init(initializer)
        

    #build is where you define the weights of the layer
    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel', 
                                      shape=self.shape, 
                                      initializer=self.initializer, 
                                      trainable=True)
        self.built = True
        

    #call is the layer operation - due to keras limitation, we need an input
    #warning, I'm supposing the input is a tensor with value 1 and no shape or shape (1,)
    def call(self, x):
        return x * self.kernel
    

    #for keras to build the summary properly
    def compute_output_shape(self, input_shape):
        return self.shape
    

    #only needed for saving/loading this layer in model.save()
    def get_config(self):
        config = {'shape': self.shape, 'initializer': serial_init(self.initializer)}
        base_config = super(TrainableWeights, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Now, this layer should be used like this:

dummyInputs = Input(tensor=K.constant([1]))
trainableWeights = TrainableWeights(shape)(dummyInputs)
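
To see why a dummy input of 1 is enough, here is a minimal NumPy sketch of what call computes (x * self.kernel). It is only an illustration, not Keras code: the kernel values and the (10, 30) shape are made-up stand-ins, and NumPy broadcasting is assumed to mirror what the backend does for this multiplication.

```python
import numpy as np

# Hypothetical stand-in for the layer's kernel; shape (10, 30) is only an example.
rng = np.random.default_rng(0)
kernel = rng.uniform(-0.05, 0.05, size=(10, 30))

# The dummy input: a single constant 1 with shape (1,).
dummy = np.array([1.0])

# Broadcasting stretches the (1,) input across the (10, 30) kernel,
# so the "layer output" is simply the kernel itself.
output = dummy * kernel

print(output.shape)
print(np.array_equal(output, kernel))
```

In other words, the layer just exposes its own weights as a tensor, and the dummy input only exists so that Keras has something to connect the layer to.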

Model A

Having the layer defined, we can start modeling.
First, let's see the model_a side:

#general vars
length = 150
dic_size = 100
embed_size = 12

#for the model_a segment
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)

#the following two lines are just a way to reach the desired shape
embedding = LSTM(5)(embedding) 
embedding = Dense(50)(embedding)

#creating model_a here is optional, only if you want to use model_a independently later
model_a = Model(input_text, embedding, name = 'model_a')

Model B

For this, we are going to use our TrainableWeights layer.
But first, let's simulate a New_model() as mentioned.

#simulates New_model()
#notice the explicit batch_shape for the matrices
newIn1 = Input(batch_shape = (10,10))
newIn2 = Input(batch_shape = (10,30))
newOut1 = Dense(50)(newIn1)
newOut2 = Dense(50)(newIn2)
newOut = Add()([newOut1, newOut2])
new_model = Model([newIn1, newIn2], newOut, name='new_model')
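
As a rough shape check for this simulated new_model, the same computation can be sketched in plain NumPy. The weights below are random placeholders (hypothetical, not what Keras would initialize), and each Dense(50) is assumed to be a plain matrix product plus bias with no activation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two matrices fed to new_model (shapes taken from batch_shape above).
in1 = rng.uniform(size=(10, 10))
in2 = rng.uniform(size=(10, 30))

# Hypothetical Dense(50) parameters: kernel of shape (input_dim, 50), bias of shape (50,).
w1, b1 = rng.normal(size=(10, 50)), np.zeros(50)
w2, b2 = rng.normal(size=(30, 50)), np.zeros(50)

out1 = in1 @ w1 + b1   # (10, 10) @ (10, 50) -> (10, 50)
out2 = in2 @ w2 + b2   # (10, 30) @ (30, 50) -> (10, 50)
new_out = out1 + out2  # Add() is element-wise, so the shape stays (10, 50)

print(new_out.shape)
```

Both branches end up with 50 columns, which is what lets Add() combine them and gives the [10, 50] output mentioned in the question.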

Now the entire branch:

#the matrices    
dummyInput = Input(tensor = K.constant([1]))
X_in = TrainableWeights((10,10), initializer='uniform')(dummyInput)
M_in = TrainableWeights((10,30), initializer='uniform')(dummyInput)

#the output of the branch   
md_1 = new_model([X_in, M_in])

#optional, only if you want to use model_s independently later
model_s = Model(dummyInput, md_1, name='model_s')

The whole model

Finally, we can join the branches into a whole model.
Notice that I didn't have to use model_a or model_s here. Those submodels are only needed if you later want to use them individually. Even if you created them, the code below does not need to change, because they are already part of the same graph.

#I prefer tf.matmul because it's clear and understandable while K.dot has weird behaviors
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, md_1])

#final model
model = Model([input_text, dummyInput], mult, name='full_model')
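
For the shapes involved in this merge, here is a small NumPy sketch of what tf.matmul(x[0], x[1], transpose_b=True) computes. The arrays are random stand-ins for the two branch outputs, and the batch size of 128 is just an assumption for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

batch = 128
embedding = rng.normal(size=(batch, 50))  # stand-in for the model_a branch output
md_1 = rng.normal(size=(10, 50))          # stand-in for the new_model branch output

# NumPy equivalent of tf.matmul(embedding, md_1, transpose_b=True)
mult = embedding @ md_1.T

print(mult.shape)

# Each entry is the dot product of one embedding row with one row of md_1.
assert np.isclose(mult[3, 7], np.dot(embedding[3], md_1[7]))
```

The result has one row per sample and one column per row of md_1, so the training targets need shape (batch, 10).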

Now train it:

model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
model.fit(np.random.randint(0,dic_size, size=(128,length)),
          np.ones((128, 10)))

Since the output is now 2D, there is no problem with using 'categorical_crossentropy'; my earlier comment was due to doubts about the output shape.

answered Nov 02 '22 22:11

Daniel Möller

Tags: python-3.x, machine-learning, tensorflow, deep-learning, keras