In Keras, I want to train an ensemble of models that share some layers. They have the following form:
x ---> f(x) ---> g_1(f(x))
x ---> f(x) ---> g_2(f(x))
...
x ---> f(x) ---> g_n(f(x))
Here f(x) is a stack of nontrivial shared layers, while g_1 through g_n each have their own head-specific parameters.
At each training step, a batch of data x is fed into one of the n networks, say the i-th, and a loss on g_i(f(x)) is minimized via a gradient-based optimizer. How can I define and train such a model?
Thanks in advance!
You can do this easily with the Keras functional API.
Here is a small example you can build on:
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input

# Dummy data: 1000 samples of 100 features, plus one target set per head
X = np.random.random((1000, 100))
Y1 = np.random.random((1000, 1))                  # regression targets for g_1
Y2 = np.random.randint(2, size=(1000, 2))         # binary targets for g_2
Y3 = np.eye(3)[np.random.randint(3, size=1000)]   # one-hot targets for g_3

inp = Input(shape=(100,))

# Shared layers f(x)
dense_f1 = Dense(50, activation='relu')
dense_f2 = Dense(20, activation='relu')
f = dense_f2(dense_f1(inp))

# Head-specific layers g_1, g_2, g_3
g1 = Dense(1)(f)                        # linear output for mse
g2 = Dense(2, activation='sigmoid')(f)  # sigmoid for binary_crossentropy
g3 = Dense(3, activation='softmax')(f)  # softmax for categorical_crossentropy

model = Model(inp, [g1, g2, g3])
model.compile(loss=['mse', 'binary_crossentropy', 'categorical_crossentropy'],
              optimizer='rmsprop')
model.summary()
model.fit(X, [Y1, Y2, Y3], epochs=10)
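Once trained, a single predict call on the combined model returns one array per head; a quick sanity check (the shapes follow from the layer sizes above):
preds = model.predict(X[:5])
print([p.shape for p in preds])  # expect [(5, 1), (5, 2), (5, 3)]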
Edit:
Based on your comments, you can also create separate models that reuse the same layer objects and write the training loop yourself, tailored to your training scheme. As model.summary() shows, all the models share the initial layers. Here is an extension of the example:
# Three single-output models built on the same shared layers
model1 = Model(inp, g1)
model1.compile(loss='mse', optimizer='rmsprop')

model2 = Model(inp, g2)
model2.compile(loss='binary_crossentropy', optimizer='rmsprop')

model3 = Model(inp, g3)
model3.compile(loss='categorical_crossentropy', optimizer='rmsprop')

model1.summary()
model2.summary()
model3.summary()
batch_size = 10
epochs = 10
n_batches = X.shape[0] // batch_size  # integer division so range() gets an int

for iepoch in range(epochs):
    for ibatch in range(n_batches):
        x_batch = X[ibatch * batch_size:(ibatch + 1) * batch_size]
        # Round-robin over the three heads: each batch updates one head
        # plus the shared layers
        if ibatch % 3 == 0:
            y_batch = Y1[ibatch * batch_size:(ibatch + 1) * batch_size]
            model1.train_on_batch(x_batch, y_batch)
        elif ibatch % 3 == 1:
            y_batch = Y2[ibatch * batch_size:(ibatch + 1) * batch_size]
            model2.train_on_batch(x_batch, y_batch)
        else:
            y_batch = Y3[ibatch * batch_size:(ibatch + 1) * batch_size]
            model3.train_on_batch(x_batch, y_batch)
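If you want to convince yourself the layers really are shared, you can verify that training any one of the models updates the same weight tensors. A minimal sketch, reusing the dense_f1 layer object defined above:
# dense_f1 is a single layer object reused by all three models,
# so its weights move no matter which model is trained
w_before = dense_f1.get_weights()[0].copy()
model2.train_on_batch(X[:batch_size], Y2[:batch_size])
w_after = dense_f1.get_weights()[0]
print('shared weights changed:', not np.allclose(w_before, w_after))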