 

Difficulty in GAN training

I am attempting to train a GAN to learn the distribution of a number of features in an event. The discriminator and generator both end up with a low loss, but the generated events have differently shaped distributions, and I am unsure why.

I define the GAN as follows:

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

# noise_dim, variables and optimizer are defined elsewhere in the full script (not shown here)

def create_generator():

    generator = Sequential()

    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))

    return generator


def create_descriminator():
    discriminator = Sequential()

    discriminator.add(Dense(4, input_dim=len(variables)))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(1, activation='sigmoid'))   
    discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
    return discriminator


discriminator = create_descriminator()
generator = create_generator()

def define_gan(generator, discriminator):
    # make weights in the discriminator not trainable
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    model.compile(loss = 'binary_crossentropy', optimizer=optimizer)
    return model

gan = define_gan(generator, discriminator)

And I train the GAN using this loop:

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)

        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        x = np.concatenate((real_x, fake_x))
        # Real events have label 1, fake events have label 0
        disc_y = np.zeros(2*batch_size)
        disc_y[:batch_size] = 1

        discriminator.trainable = True
        d_loss = discriminator.train_on_batch(x, disc_y)

        discriminator.trainable = False
        y_gen = np.ones(batch_size)
        g_loss = gan.train_on_batch(noise, y_gen)

My real events are scaled using scikit-learn's StandardScaler:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)

Generating events:

X_noise = np.random.normal(0, 1, size=(n_events, noise_dim))
X_generated = generator.predict(X_noise)
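
The unscaling step mentioned below is not shown in the question; presumably it is just the inverse of the StandardScaler fitted above, i.e. something like:

# assumed unscaling step (not shown in the question): invert the StandardScaler
X_generated = scaler.inverse_transform(X_generated)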

When I then use the trained GAN (after training for a few hundred to a few thousand epochs) to generate new events and unscale them, I get distributions that look like this:

[plot of the generated feature distributions]

And plotting two of the features against each other for the real and fake events gives: [scatter plot of two features, real vs. fake events]

This looks similar to mode collapse, but I don't see how that would lead to these extreme values where everything is cut off beyond those points.

asked Mar 12 '20 by pythonthrowaway


1 Answer

Mode collapse results in the generator finding a few values, or a small range of values, that do the best job of fooling the discriminator. Since your range of generated values is fairly narrow, I believe you are experiencing mode collapse. You can train for different durations and plot the results to see when the collapse occurs. Sometimes, if you train long enough, it will fix itself and start learning again.

There are a billion recommendations on how to train GANs; I collected a bunch and then brute-forced my way through them for each GAN. You could try only training the discriminator every other cycle, to give the generator a chance to learn. Several people also recommend not training the discriminator on real and fake data at the same time (I haven't done it, so I can't say what impact, if any, it has). You might also want to try adding in some batch normalization layers. Jason Brownlee has a bunch of good articles on training GANs, so you may want to start there.
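
As a rough illustration, here is a minimal sketch of what the first two suggestions (updating the discriminator only every other step, and training it on real and fake data in separate batches) could look like inside the question's training loop. It assumes the same generator, discriminator, gan, x_train, noise_dim, batch_size, epochs and steps_per_epoch as in the question, and is one possible variant rather than a definitive fix:

import numpy as np

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)
        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        # update the discriminator only on every other step, and on separate
        # real and fake batches instead of one concatenated batch
        if batch % 2 == 0:
            discriminator.trainable = True
            d_loss_real = discriminator.train_on_batch(real_x, np.ones(batch_size))
            d_loss_fake = discriminator.train_on_batch(fake_x, np.zeros(batch_size))
            discriminator.trainable = False

        # generator update unchanged: try to push the discriminator towards predicting 1
        g_loss = gan.train_on_batch(noise, np.ones(batch_size))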
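
And a sketch of the generator with batch normalization added after each hidden activation, keeping the question's layer sizes (the create_generator_bn name and the momentum value are just illustrative choices, not a recommendation of specific settings):

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU, BatchNormalization

def create_generator_bn():
    # same architecture as in the question, with BatchNormalization inserted
    # after the hidden activations; a common GAN heuristic, not a guaranteed fix
    generator = Sequential()

    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization(momentum=0.8))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization(momentum=0.8))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))

    return generator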

answered Sep 29 '22 by csteel