 

Always same output for tensorflow autoencoder

At the moment I am trying to build an autoencoder for time series data in TensorFlow. I have nearly 500 days of data, where each day has 24 datapoints. Since this is my first attempt, my architecture is very simple. After the input of size 24, the hidden layers have sizes 10, 3 and 10, followed by an output of size 24 again. I normalized the data (datapoints are in the range [-0.5, 0.5]) and use the sigmoid activation function and the RMSPropOptimizer.

After training (loss curve in the picture below), the output is the same for every time series I feed into the network. Does anyone know the reason for that? Is it possible that my dataset is the issue (code below)?

[Image: loss function during training]

import numpy as np

class TimeDataset:
    def __init__(self, data):
        self._index_in_epoch = 0
        self._epochs_completed = 0
        self._data = data
        self._num_examples = data.shape[0]

    @property
    def data(self):
        return self._data

    def next_batch(self, batch_size, shuffle=True):
        start = self._index_in_epoch

        # first call: shuffle the whole dataset once
        if start == 0 and self._epochs_completed == 0:
            idx = np.arange(0, self._num_examples)  # get all possible indexes
            np.random.shuffle(idx)                  # shuffle indexes
            self._data = self.data[idx]             # reorder the samples

        if start + batch_size > self._num_examples:
            # not enough samples left -> wrap around into the next epoch
            self._epochs_completed += 1
            rest_num_examples = self._num_examples - start
            data_rest_part = self.data[start:self._num_examples]
            idx0 = np.arange(0, self._num_examples)  # get all possible indexes
            np.random.shuffle(idx0)                  # shuffle indexes
            self._data = self.data[idx0]             # reorder the samples

            start = 0
            # avoid the case where #samples is not an integer multiple of batch_size
            self._index_in_epoch = batch_size - rest_num_examples
            end = self._index_in_epoch
            data_new_part = self._data[start:end]
            return np.concatenate((data_rest_part, data_new_part), axis=0)
        else:
            # get next batch
            self._index_in_epoch += batch_size
            end = self._index_in_epoch
            return self._data[start:end]
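
For context, this is roughly how I use the class (the array below is just an illustrative stand-in for my real data, not the actual dataset):

# hypothetical usage: 500 days x 24 values, already normalized to [-0.5, 0.5]
data = np.random.uniform(-0.5, 0.5, size=(500, 24)).astype(np.float32)

dataset = TimeDataset(data)
batch = dataset.next_batch(50)   # shape (50, 24)
print(batch.shape)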

*edit: here are some examples of the output (red: original, blue: reconstructed): [images]

**edit: I just saw an autoencoder example with a more complicated loss function than mine. Does anyone know if the loss function self.loss = tf.reduce_mean(tf.pow(self.X - self.decoded, 2)) is sufficient?
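
For context, that expression is just the element-wise mean squared error; a minimal sketch comparing it against TensorFlow's built-in tf.losses.mean_squared_error (the placeholder names here are made up for the check, assuming TF 1.x as in the rest of the question):

import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 24])
decoded = tf.placeholder(tf.float32, [None, 24])

manual_mse = tf.reduce_mean(tf.pow(X - decoded, 2))
builtin_mse = tf.losses.mean_squared_error(labels=X, predictions=decoded)

with tf.Session() as sess:
    a = np.random.rand(5, 24).astype(np.float32)
    b = np.random.rand(5, 24).astype(np.float32)
    m1, m2 = sess.run([manual_mse, builtin_mse], feed_dict={X: a, decoded: b})
    print(m1, m2)  # should agree up to floating point error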

***edit: some more code to describe my training. This is my autoencoder class:

import tensorflow as tf

class AutoEncoder():
    def __init__(self):
        # Training parameters
        self.learning_rate = 0.005
        self.alpha = 0.5

        # Network parameters
        self.num_input = 24      # one day as input
        self.num_hidden_1 = 10   # 1st hidden layer num features
        self.num_hidden_2 = 3    # 2nd hidden layer num features (the latent dim)

        self.X = tf.placeholder("float", [None, self.num_input])

        self.weights = {
            'encoder_h1': tf.Variable(tf.random_normal([self.num_input, self.num_hidden_1])),
            'encoder_h2': tf.Variable(tf.random_normal([self.num_hidden_1, self.num_hidden_2])),
            'decoder_h1': tf.Variable(tf.random_normal([self.num_hidden_2, self.num_hidden_1])),
            'decoder_h2': tf.Variable(tf.random_normal([self.num_hidden_1, self.num_input])),
        }
        self.biases = {
            'encoder_b1': tf.Variable(tf.random_normal([self.num_hidden_1])),
            'encoder_b2': tf.Variable(tf.random_normal([self.num_hidden_2])),
            'decoder_b1': tf.Variable(tf.random_normal([self.num_hidden_1])),
            'decoder_b2': tf.Variable(tf.random_normal([self.num_input])),
        }

        self.encoded = self.encoder(self.X)
        self.decoded = self.decoder(self.encoded)

        # Define loss and optimizer, minimize the squared error
        self.loss = tf.reduce_mean(tf.pow(self.X - self.decoded, 2))

        self.optimizer = tf.train.RMSPropOptimizer(self.learning_rate).minimize(self.loss)

    def encoder(self, x):
        # sigmoid, tanh, relu
        en_layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, self.weights['encoder_h1']),
                                          self.biases['encoder_b1']))

        en_layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(en_layer_1, self.weights['encoder_h2']),
                                          self.biases['encoder_b2']))

        return en_layer_2

    def decoder(self, x):
        de_layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, self.weights['decoder_h1']),
                                          self.biases['decoder_b1']))

        de_layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(de_layer_1, self.weights['decoder_h2']),
                                          self.biases['decoder_b2']))

        return de_layer_2

and this is how I train my network (the input data has shape (number_days, 24)):

import tensorflow as tf

import autoencoder  # the module containing the AutoEncoder class shown above

model = autoencoder.AutoEncoder()

num_epochs = 3
batch_size = 50
num_batches = 300

display_batch = 50
examples_to_show = 16

loss_values = []

# dataset is a TimeDataset instance built from the (number_days, 24) array
with tf.Session() as sess:

    sess.run(tf.global_variables_initializer())

    # training
    for e in range(1, num_epochs+1):
        print('starting epoch {}'.format(e))
        for b in range(num_batches):
            # get next batch of data
            batch_x = dataset.next_batch(batch_size)

            # Run cost op (to get loss value) and optimization op (backprop)
            l = sess.run([model.loss], feed_dict={model.X: batch_x})
            sess.run(model.optimizer, feed_dict={model.X: batch_x})

            # Display logs
            if b % display_batch == 0:
                print('Epoch {}: Batch ({}) Loss: {}'.format(e, b, l))
                loss_values.append(l)

    # testing
    test_data = dataset.next_batch(batch_size)
    decoded_test_data = sess.run(model.decoded, feed_dict={model.X: test_data})
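
For reference, a rough sketch of how the reconstructions can be plotted against the originals (assuming matplotlib; this is illustrative, not my exact plotting code):

import matplotlib.pyplot as plt

# plot a few test days: original vs. reconstruction
for i in range(4):
    plt.figure()
    plt.plot(test_data[i], color='red', label='original')
    plt.plot(decoded_test_data[i], color='blue', label='reconstructed')
    plt.legend()
plt.show()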
asked Nov 07 '18 by Marvin K


2 Answers

Just a suggestion, I have had some issues with autoencoders using the sigmoid function.

I switched to tanh or relu and those improved the results. An autoencoder is basically learning to recreate its input at the output, by encoding and decoding. If you mean the output is the same as the input, then you are getting what you want: it has learned the data set.
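
As an example, a minimal sketch of what that swap could look like in your decoder (since your data lives in [-0.5, 0.5], tanh's (-1, 1) output range can cover it; this is an untested sketch against your class, not your original code):

def decoder(self, x):
    # tanh instead of sigmoid so the output can take negative values
    de_layer_1 = tf.nn.tanh(tf.add(tf.matmul(x, self.weights['decoder_h1']),
                                   self.biases['decoder_b1']))

    de_layer_2 = tf.nn.tanh(tf.add(tf.matmul(de_layer_1, self.weights['decoder_h2']),
                                   self.biases['decoder_b2']))

    return de_layer_2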

Ultimately you can compare by reviewing the mean squared error between the input and output to see if they are exactly the same. If you mean that the output is exactly the same regardless of the input, that isn't something I've run into. I guess if your input doesn't vary much from day to day, I could imagine that would have some impact. Are you looking for anomalies?

Also, since you have a time series for training, I wouldn't shuffle the data in this particular case. If the temporal order is significant, shuffling can introduce data leakage (basically introducing future data into the training set), depending on what you are trying to achieve.

Ah, I didn't initially see your edit with the graph results. Thanks for adding it.

answered Sep 18 '22 by CJP


The sigmoid output is bounded below by 0, so it cannot reproduce the parts of your data that are below 0.

If you want to use a sigmoid output, then rescale your data to the open interval ]0; 1[ (0 and 1 excluded).
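
For instance, a minimal sketch of such a rescaling (a plain min-max transform squeezed slightly away from 0 and 1; the margin value here is an arbitrary choice):

import numpy as np

def rescale_to_open_unit_interval(data, margin=1e-3):
    # map data from [min, max] to [margin, 1 - margin], i.e. strictly inside ]0; 1[
    d_min, d_max = data.min(), data.max()
    scaled = (data - d_min) / (d_max - d_min)          # now in [0, 1]
    return scaled * (1.0 - 2.0 * margin) + margin      # now in [margin, 1 - margin]

# example: day profiles originally in [-0.5, 0.5]
data = np.random.uniform(-0.5, 0.5, size=(500, 24))
scaled = rescale_to_open_unit_interval(data)
print(scaled.min(), scaled.max())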

answered Sep 17 '22 by Matthieu Brucher