 

Difficulty in GAN training

I am attempting to train a GAN to learn the distribution of a number of features in an event. The discriminator and generator both end up with a low loss, but the generated events have differently shaped distributions, and I am unsure why.

I define the GAN as follows:

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

# noise_dim, variables and optimizer are defined elsewhere in the full script (not shown here)

def create_generator():

    generator = Sequential()

    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))

    return generator


def create_descriminator():
    discriminator = Sequential()

    discriminator.add(Dense(4, input_dim=len(variables)))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(1, activation='sigmoid'))   
    discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
    return discriminator


discriminator = create_descriminator()
generator = create_generator()

def define_gan(generator, discriminator):
    # make weights in the discriminator not trainable
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    model.compile(loss = 'binary_crossentropy', optimizer=optimizer)
    return model

gan = define_gan(generator, discriminator)

And I train the GAN using this loop:

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)

        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        x = np.concatenate((real_x, fake_x))
        # Real events have label 1, fake events have label 0
        disc_y = np.zeros(2*batch_size)
        disc_y[:batch_size] = 1

        discriminator.trainable = True
        d_loss = discriminator.train_on_batch(x, disc_y)

        discriminator.trainable = False
        y_gen = np.ones(batch_size)
        g_loss = gan.train_on_batch(noise, y_gen)

My real events are scaled using scikit-learn's StandardScaler:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)

Generating events:

X_noise = np.random.normal(0, 1, size=(n_events, noise_dim))
X_generated = generator.predict(X_noise)
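
The unscaling step mentioned below is not shown in the question; presumably it is just the inverse of the StandardScaler fitted above, i.e. something like:

# assumed unscaling step (not shown in the question): invert the StandardScaler
X_generated = scaler.inverse_transform(X_generated)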

When I then use the trained GAN (after training for a few hundred to a few thousand epochs) to generate new events and unscale them, I get distributions that look like this:

[plot of the generated feature distributions]

And plotting two of the features against each other for the real and fake events gives: [scatter plot of two features, real vs. fake events]

This looks similar to mode collapse, but I don't see how that would lead to these extreme values where everything is cut off beyond those points.

asked Mar 12 '20 by pythonthrowaway


1 Answer

Mode collapse results in the generator finding a few values, or a small range of values, that do the best job of fooling the discriminator. Since your range of generated values is fairly narrow, I believe you are experiencing mode collapse. You can train for different durations and plot the results to see when the collapse occurs. Sometimes, if you train long enough, it will fix itself and start learning again.

There are a billion recommendations on how to train GANs; I collected a bunch and then brute-forced my way through them for each GAN. You could try only training the discriminator every other cycle, to give the generator a chance to learn. Several people also recommend not training the discriminator on real and fake data at the same time (I haven't done it, so I can't say what impact, if any, it has). You might also want to try adding in some batch normalization layers. Jason Brownlee has a bunch of good articles on training GANs, so you may want to start there.
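
As a rough illustration, here is a minimal sketch of what the first two suggestions (updating the discriminator only every other step, and training it on real and fake data in separate batches) could look like inside the question's training loop. It assumes the same generator, discriminator, gan, x_train, noise_dim, batch_size, epochs and steps_per_epoch as in the question, and is one possible variant rather than a definitive fix:

import numpy as np

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)
        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        # update the discriminator only on every other step, and on separate
        # real and fake batches instead of one concatenated batch
        if batch % 2 == 0:
            discriminator.trainable = True
            d_loss_real = discriminator.train_on_batch(real_x, np.ones(batch_size))
            d_loss_fake = discriminator.train_on_batch(fake_x, np.zeros(batch_size))
            discriminator.trainable = False

        # generator update unchanged: try to push the discriminator towards predicting 1
        g_loss = gan.train_on_batch(noise, np.ones(batch_size))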
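
And a sketch of the generator with batch normalization added after each hidden activation, keeping the question's layer sizes (the create_generator_bn name and the momentum value are just illustrative choices, not a recommendation of specific settings):

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU, BatchNormalization

def create_generator_bn():
    # same architecture as in the question, with BatchNormalization inserted
    # after the hidden activations; a common GAN heuristic, not a guaranteed fix
    generator = Sequential()

    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization(momentum=0.8))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization(momentum=0.8))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))

    return generator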

answered Sep 29 '22 by csteel