 

What is the purpose of keras utils normalize?

I'd like to normalize my training set before passing it to my NN so instead of doing it manually (subtract mean and divide by std), I tried keras.utils.normalize() and I am amazed about the results I got.

Running this:

import numpy as np
from keras.utils import normalize

r = np.random.rand(3000) * 1000
nr = normalize(r)
print(np.mean(r))
print(np.mean(nr))
print(np.std(r))
print(np.std(nr))
print(np.min(r))
print(np.min(nr))
print(np.max(r))
print(np.max(nr))

results in:

495.60440066771866
0.015737914577213984
291.4440194021
0.009254802974329002
0.20755517410064872
6.590913227674956e-06
999.7631481267636
0.03174747238214018

Unfortunately, the docs don't explain what's happening under the hood. Can you please explain what it does and if I should use keras.utils.normalize instead of what I would have done manually?

Asked Sep 29 '18 by Yelve Yakut


People also ask

What does keras utils normalize do?

The normalize function performs a regular normalization to improve performance. Normalization rescales the data from its original range so that all values lie between 0 and 1. There is a nice explanation of the axis argument in another post: "What is the meaning of axis=-1 in keras?".
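
As a small sketch of what that axis argument controls (assuming keras is installed and normalize is imported from keras.utils):

import numpy as np
from keras.utils import normalize

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# axis=-1 (the default): each row is divided by its own L2 norm
print(normalize(x, axis=-1))   # [[0.6 0.8] [0.6 0.8]]

# axis=0: each column is divided by its own L2 norm instead
print(normalize(x, axis=0))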

What is normalization in keras?

The Normalization class is a preprocessing layer which normalizes continuous features. This layer shifts and scales inputs into a distribution centered around 0 with standard deviation 1. It accomplishes this by precomputing the mean and variance of the data and calling (input - mean) / sqrt(var) at runtime.
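
A minimal sketch of that layer in use, assuming TensorFlow 2.x where the preprocessing layer is available:

import numpy as np
import tensorflow as tf

data = np.random.rand(1000, 3).astype("float32") * 100

norm_layer = tf.keras.layers.Normalization(axis=-1)
norm_layer.adapt(data)              # precomputes per-feature mean and variance

out = norm_layer(data).numpy()
print(out.mean(axis=0))             # approximately 0 for every feature
print(out.std(axis=0))              # approximately 1 for every feature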

What does Tensorflow normalize do?

TensorFlow normalize refers to the methods in the TensorFlow library that normalize tensors in neural networks. The main purpose is to transform the data so that all features are on the same or a similar scale.
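
One concrete example is tf.math.l2_normalize, which divides a tensor by its L2 norm along an axis (a sketch, assuming TensorFlow 2.x):

import tensorflow as tf

t = tf.constant([[3.0, 4.0],
                 [6.0, 8.0]])

# divide each row by its L2 norm, analogous to keras.utils.normalize(x, axis=-1)
print(tf.math.l2_normalize(t, axis=-1).numpy())   # [[0.6 0.8] [0.6 0.8]]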

What does it mean to normalize an axis?

"Normalizing" a vector most often means dividing it by a norm of the vector. It can also refer to rescaling by the minimum and range of the vector so that all elements lie between 0 and 1, bringing all values of a dataset's numeric columns onto a common scale.
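
Both meanings in a short NumPy sketch:

import numpy as np

v = np.array([2.0, 6.0, 10.0])

# meaning 1: divide by a norm -> a unit-length vector
unit = v / np.linalg.norm(v)

# meaning 2: min-max rescaling -> all values in [0, 1]
rescaled = (v - v.min()) / (v.max() - v.min())

print(unit)        # [0.169 0.507 0.845] (approximately)
print(rescaled)    # [0.  0.5 1. ]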


1 Answer

It is not the kind of normalization you expect. Actually, it uses np.linalg.norm() under the hood to normalize the given data using Lp-norms:

import numpy as np

def normalize(x, axis=-1, order=2):
    """Normalizes a Numpy array.
    # Arguments
        x: Numpy array to normalize.
        axis: axis along which to normalize.
        order: Normalization order (e.g. 2 for L2 norm).
    # Returns
        A normalized copy of the array.
    """
    l2 = np.atleast_1d(np.linalg.norm(x, order, axis))
    l2[l2 == 0] = 1
    return x / np.expand_dims(l2, axis)

For example, in the default case it normalizes the data using L2-normalization (i.e. the sum of the squares of the elements along the last axis equals one).
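
You can verify this on the question's data; a quick sketch, using the same keras.utils.normalize import as above:

import numpy as np
from keras.utils import normalize

r = np.random.rand(3000) * 1000
nr = normalize(r)

# the whole 1-D vector is treated as a single sample and divided by one
# large L2 norm, which is why every statistic in the question shrank
print(np.sum(nr ** 2))                          # ~1.0, i.e. unit L2 norm
print(np.allclose(nr, r / np.linalg.norm(r)))   # True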

You can either use this function or, if you don't want to do mean/std standardization manually, use StandardScaler() from sklearn, or even MinMaxScaler() for min-max scaling to [0, 1].
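
For reference, a minimal sketch of that scikit-learn route (assuming scikit-learn is installed):

import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

X = np.random.rand(3000, 1) * 1000         # scikit-learn expects 2-D (samples, features)

X_std = StandardScaler().fit_transform(X)  # zero mean, unit std per feature
X_mm = MinMaxScaler().fit_transform(X)     # values rescaled to [0, 1] per feature

print(X_std.mean(), X_std.std())           # ~0.0, ~1.0
print(X_mm.min(), X_mm.max())              # 0.0, 1.0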

Answered Oct 04 '22 by today