
Using large input values with Auto Encoders

I have created an Auto Encoder Neural Network in MATLAB. I have quite large inputs at the first layer, which I have to reconstruct through the network's output layer. I cannot use the large inputs as they are, so I convert them to the range [0, 1] using MATLAB's sigmf function. It gives me a value of 1.000000 for all the large values. I have tried changing the display format, but it does not help.

Is there a workaround to using large values with my auto encoder?

Sasha asked Jul 14 '14 08:07



2 Answers

The process of converting your inputs to the range [0,1] is called normalization. However, as you noticed, the sigmf function is not adequate for this task: it saturates, so every sufficiently large input maps to a value indistinguishable from 1. This link may be useful to you.
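The saturation can be reproduced with a minimal sketch. It assumes sigmf was called with parameters [1 0], which makes it the plain logistic function; the input values are made up for illustration, and the logistic is written out by hand so no toolbox is needed.

```matlab
% Hypothetical large inputs, chosen only for illustration
x = [50; 500; 5000];

% Plain logistic function; assumed equivalent to sigmf(x, [1 0])
y = 1 ./ (1 + exp(-x));

% In double precision every entry rounds to exactly 1, so the inputs
% can no longer be told apart, whatever the display format is
allSaturated = all(y == 1);
```

Because the stored values are exactly 1, changing the display format cannot recover the lost differences.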

Suppose that your inputs are given by a matrix of N rows and M columns, where each row represents an input pattern and each column is a feature. If your first column is:

vec =

   -0.1941
   -2.1384
   -0.8396
    1.3546
   -1.0722

Then you can convert it to the range [0,1] using:

%# get max and min
maxVec = max(vec);
minVec = min(vec);

%# normalize to 0...1
vecNormalized = ((vec-minVec)./(maxVec-minVec))

vecNormalized =

    0.5566
         0
    0.3718
    1.0000
    0.3052
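For a whole N-by-M matrix, the same scaling can be applied column by column. The sketch below is only an illustration: the matrix X and the names minX, rangeX, Xnorm and Xback are hypothetical. It also keeps the per-column minimum and range, so that reconstructions produced by the network in [0,1] can be mapped back to the original units.

```matlab
% Hypothetical 3x2 input matrix: rows are patterns, columns are features
X = [1000 0.5; 2000 0.1; 4000 0.9];

% Per-column minimum and range; keep them to undo the scaling later
minX = min(X, [], 1);
rangeX = max(X, [], 1) - minX;

% Scale every column to [0,1] (bsxfun applies the row vectors to each row)
Xnorm = bsxfun(@rdivide, bsxfun(@minus, X, minX), rangeX);

% Map values in [0,1] back to the original units
Xback = bsxfun(@plus, bsxfun(@times, Xnorm, rangeX), minX);
```

Note that a constant column has a range of zero and would cause a division by zero, so such columns need separate handling.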

As @Dan indicates in the comments, another option is to standardize the data. The goal of this process is to scale the inputs to have a mean of 0 and a variance of 1. In this case, you need to subtract the mean value of the column and divide by its standard deviation:

meanVec = mean(vec);
stdVec = std(vec);

vecStandarized = (vec-meanVec)./ stdVec

vecStandarized =

    0.2981
   -1.2121
   -0.2032
    1.5011
   -0.3839
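The same idea extends to a full input matrix. The sketch below uses a hypothetical matrix X and a hypothetical new pattern xNew; it standardizes every column and then reuses the stored training mean and standard deviation for the new pattern, which is one common way to keep the scaling consistent.

```matlab
% Hypothetical 3x2 training matrix: rows are patterns, columns are features
X = [1000 0.5; 2000 0.1; 4000 0.9];

% Per-column statistics, computed on the training data only
mu = mean(X, 1);
sigma = std(X, 0, 1);

% Standardize each column to mean 0 and standard deviation 1
Xstd = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% Hypothetical new pattern, scaled with the stored training statistics
xNew = [3000 0.3];
xNewStd = (xNew - mu) ./ sigma;
```

Standardized values are not confined to [0,1], so an output layer with a sigmoid non-linearity cannot reproduce them exactly; that is worth keeping in mind when choosing between the two scalings.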
Pablo EM answered Nov 15 '22 03:11


Before I give you my answer, let's think a bit about the rationale behind an auto-encoder (AE):
The purpose of an auto-encoder is to learn, in an unsupervised manner, something about the underlying structure of the input data. How does an AE achieve this goal? If it manages to reconstruct the input signal from its output signal (which is usually of lower dimension), it means that it did not lose information and effectively managed to learn a more compact representation.

In most examples, it is assumed, for simplicity, that both the input signal and the output signal range in [0..1]. Therefore, the same non-linearity (sigmf) is applied both for obtaining the output signal and for reconstructing the inputs from the outputs.
Something like:

output = sigmf( W*input + b ); % compute output signal
reconstruct = sigmf( W'*output + b_prime ); % notice the different constant b_prime

Then the AE learning stage tries to minimize the reconstruction error || input - reconstruct ||.

However, who said the reconstruction non-linearity must be identical to the one used for computing the output?

In your case, the assumption that the inputs range in [0..1] does not hold. Therefore, it seems that you need to use a different non-linearity for the reconstruction. You should pick one that agrees with the actual range of your inputs.

If, for example, your inputs range over (0..inf), you may consider using exp or ().^2 as the reconstruction non-linearity. You may also use polynomials of various degrees, log, or whatever function you think fits the spread of your input data.
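A forward pass with exp as the reconstruction non-linearity could be sketched as follows. This is only an illustration of the idea, not a tested training setup: the input, the weights W and the biases b and b_prime are made-up values, and the logistic function is written out by hand in place of sigmf.

```matlab
% Hypothetical 3-dimensional positive input and a 2-unit hidden layer
input = [120; 3500; 47];
W = [0.001 0.0002 0.01; -0.002 0.0001 0.02];  % 2x3 weights (made up)
b = [0; 0];
b_prime = [4; 8; 3];

% Logistic non-linearity for the hidden layer
logistic = @(z) 1 ./ (1 + exp(-z));

output = logistic(W*input + b);           % hidden code, each entry in (0,1)
reconstruct = exp(W'*output + b_prime);   % exp keeps the reconstruction positive

% Distance between the input and its reconstruction
err = norm(input - reconstruct);
```

With exp on the reconstruction side the result is always positive and unbounded above, which matches inputs in (0..inf); training would then adjust W, b and b_prime to reduce err.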


Disclaimer: I have never actually encountered such a case and have not seen this type of solution in the literature. However, I believe it makes sense and is at least worth trying.

Shai answered Nov 15 '22 05:11