I am attempting to train a GAN to learn the distribution of a number of features in an event. Both the discriminator and the generator reach a low loss, but the generated events have differently shaped distributions from the real ones, and I am not sure why.
I define the GAN as follows:
from keras.models import Sequential
from keras.layers import Dense, LeakyReLU  # tensorflow.keras works equally well

def create_generator():
    generator = Sequential()
    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))
    return generator
def create_descriminator():
    discriminator = Sequential()
    discriminator.add(Dense(4, input_dim=len(variables)))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(1, activation='sigmoid'))
    discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
    return discriminator
discriminator = create_descriminator()
generator = create_generator()
def define_gan(generator, discriminator):
    # make weights in the discriminator not trainable
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    model.compile(loss='binary_crossentropy', optimizer=optimizer)
    return model
gan = define_gan(generator, discriminator)
And I train the GAN using this loop:
for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)
        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]
        x = np.concatenate((real_x, fake_x))

        # Real events have label 1, fake events have label 0
        disc_y = np.zeros(2*batch_size)
        disc_y[:batch_size] = 1

        discriminator.trainable = True
        d_loss = discriminator.train_on_batch(x, disc_y)

        discriminator.trainable = False
        y_gen = np.ones(batch_size)
        g_loss = gan.train_on_batch(noise, y_gen)
My real events are scaled using scikit-learn's StandardScaler:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
Generating events:
X_noise = np.random.normal(0, 1, size=(n_events, GAN_noise_size))
X_generated = generator.predict(X_noise)
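The unscaling mentioned below is just the inverse of the fitted scaler:
# map generated events back to the original feature units
X_generated = scaler.inverse_transform(X_generated)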
When I then use the trained generator (after a few hundred to a few thousand epochs of training) to generate new events and unscale them, I get distributions that look like this:
And plotting two of the features against each other for the real and fake events gives:
This looks similar to mode collapse, but I don't see how that could lead to these extreme values where everything is cut off beyond them.
Mode collapse is one of the hardest problems to solve in GANs. A complete collapse is not common, but a partial collapse happens often: the generator starts producing samples that all look very similar, and some modes of the real distribution are no longer covered.
We can also improve a GAN by paying attention to balancing the losses of the generator and the discriminator. Unfortunately, there is no general solution, but one common approach is to maintain a fixed ratio between the number of gradient-descent updates of the discriminator and of the generator, as sketched below.
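As a rough sketch (reusing the variable names from the training loop in the question), training the discriminator several times for every generator update could look like:
n_disc_updates = 2  # discriminator updates per generator update; a hyperparameter to tune

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        # several discriminator updates ...
        for _ in range(n_disc_updates):
            noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
            fake_x = generator.predict(noise)
            real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]
            x = np.concatenate((real_x, fake_x))
            disc_y = np.zeros(2 * batch_size)
            disc_y[:batch_size] = 1
            d_loss = discriminator.train_on_batch(x, disc_y)

        # ... followed by a single generator update through the combined model
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        g_loss = gan.train_on_batch(noise, np.ones(batch_size))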
Mode collapse results in the generator finding a few values, or a small range of values, that do the best job of fooling the discriminator. Since your range of generated values is fairly narrow, I believe you are experiencing mode collapse. You can train for different durations and plot the results to see when the collapse occurs. Sometimes, if you train long enough, it will fix itself and start learning again.

There are countless recommendations on how to train GANs; I collected a bunch and then brute-forced my way through them for each GAN. You could try only training the discriminator every other cycle, to give the generator a chance to learn. Several people also recommend not training the discriminator on real and fake data in the same batch (I haven't done this, so I can't say what the impact is, if any). You might also want to try adding some batch normalization layers; a sketch of these last two ideas follows. Jason Brownlee has a number of good articles on training GANs, so that may be a good place to start.
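For example, a minimal sketch of those last two suggestions (batch normalization in the generator, and separate real/fake batches for the discriminator), keeping the layer sizes and variable names from the question:
from keras.layers import BatchNormalization  # or tensorflow.keras.layers

def create_generator():
    generator = Sequential()
    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(BatchNormalization())
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(25))
    generator.add(BatchNormalization())
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(BatchNormalization())
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))
    return generator

# Inside the training loop: update the discriminator on real and fake
# batches separately instead of on one concatenated batch.
d_loss_real = discriminator.train_on_batch(real_x, np.ones(batch_size))
d_loss_fake = discriminator.train_on_batch(fake_x, np.zeros(batch_size))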