
Scaling input data to a neural network

Do we have to scale input data for a neural network? How does it affect the final solution of the neural network?

I've tried to find some reliable sources on this. The book "The Elements of Statistical Learning" (page 400) says it helps in choosing reasonable initial random weights to start with.

Aren't the final weights deterministic regardless of the initial random weights we use?

Thank you.

asked Jun 10 '13 by James


People also ask

Do I need to scale data for neural network?

Yes, normalisation/scaling is typically recommended and sometimes very important. Especially for neural networks, normalisation can be very crucial because when you input unnormalised inputs to activation functions, you can get stuck in a very flat region in the domain and may not learn at all.
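As a rough illustration of that flat-region problem, here is a minimal NumPy sketch (the "income" feature and its values are made up for the example): a sigmoid unit fed raw, large-scale values saturates and its gradient is essentially zero, while the same values after scaling still produce usable gradients.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_grad(x):
        s = sigmoid(x)
        return s * (1.0 - s)

    # Hypothetical raw feature, e.g. yearly income in dollars
    raw = np.array([25_000.0, 48_000.0, 90_000.0])
    scaled = (raw - raw.mean()) / raw.std()

    print(sigmoid_grad(raw))     # ~0 everywhere: the unit is saturated, almost nothing to learn from
    print(sigmoid_grad(scaled))  # gradients around 0.2: learning can proceed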

Why do we normalize inputs to neural network?

To summarize, normalization helps because it ensures (a) that there are both positive and negative values used as inputs for the next layer which makes learning more flexible and (b) that the network's learning regards all input features to a similar extent.

How does normalization affect neural network?

Normalization can help training of our neural networks as the different features are on a similar scale, which helps to stabilize the gradient descent step, allowing us to use larger learning rates or help models converge faster for a given learning rate.

Do we need to normalize data for deep learning?

In neural networks, you generally should use data where observations lie in a range between 0 and 1. In the context of deep learning, min-max normalization should therefore be your first choice.
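For reference, min-max normalization itself is just a per-feature rescaling to [0, 1]. A small sketch with made-up toy data (in practice you would compute the min and max on the training set only and reuse them for new data):

    import numpy as np

    def min_max_scale(X):
        """Rescale each column of X to the [0, 1] range (min-max normalization)."""
        x_min = X.min(axis=0)
        x_max = X.max(axis=0)
        return (X - x_min) / (x_max - x_min)

    # Toy data: two features on very different scales (age in years, income in dollars)
    X = np.array([[25.0, 25_000.0],
                  [40.0, 48_000.0],
                  [65.0, 90_000.0]])

    print(min_max_scale(X))  # every value now lies between 0 and 1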


1 Answer

Firstly, there are many types of ANNs; I will assume you are talking about the simplest one: a multilayer perceptron trained with backpropagation.

Secondly, in your question you are mixing up data scaling (normalization) and weight initialization.

You need to initialize the weights randomly to avoid symmetry during learning (if all weights start out identical, they all receive the same update and stay identical). The concrete values don't matter much, but values that are too large can slow convergence. This also answers your question about determinism: since the error surface is non-convex, different initial weights can lead gradient descent to different final weights.
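Here is a minimal sketch of the symmetry problem on a toy 2-2-1 network with sigmoid hidden units and squared error (the data and the one_gradient_step helper are invented for the example, not a standard API):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: 2 inputs -> 2 hidden units (sigmoid) -> 1 linear output
    X = np.array([[0.0, 1.0], [1.0, 0.0]])
    y = np.array([[1.0], [0.0]])

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def one_gradient_step(W1, W2, lr=0.5):
        # forward pass
        H = sigmoid(X @ W1)          # hidden activations
        out = H @ W2                 # linear output
        err = out - y
        # backward pass (plain backprop for this tiny network)
        dW2 = H.T @ err
        dH = (err @ W2.T) * H * (1 - H)
        dW1 = X.T @ dH
        return W1 - lr * dW1, W2 - lr * dW2

    # Case 1: all weights identical -> both hidden units get identical updates
    W1 = np.full((2, 2), 0.5)
    W2 = np.full((2, 1), 0.5)
    W1, W2 = one_gradient_step(W1, W2)
    print(W1)   # both columns (hidden units) are still equal: symmetry was never broken

    # Case 2: small random weights -> hidden units differentiate and can learn different features
    W1 = rng.normal(scale=0.1, size=(2, 2))
    W2 = rng.normal(scale=0.1, size=(2, 1))
    W1, W2 = one_gradient_step(W1, W2)
    print(W1)   # columns now differ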

You are not required to normalize your data, but normalization can make the learning process faster, as the sketch below illustrates. See this question for more details.
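To make the "faster learning" point concrete, here is a rough sketch on synthetic data, using plain gradient descent on a linear model rather than a full MLP (so it only illustrates the conditioning effect of scaling, not the saturation effect): with raw features on very different scales, a learning rate small enough to be stable barely makes progress, while the same budget of steps on standardized features gets close to the optimum.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical data: one feature in "years" (~0-40), one in "dollars" (~0-100000)
    n = 200
    X = np.column_stack([rng.uniform(0, 40, n), rng.uniform(0, 100_000, n)])
    true_w = np.array([2.0, 0.0001])
    y = X @ true_w + rng.normal(scale=0.5, size=n)

    def train(X, y, lr, steps=500):
        """Plain batch gradient descent on mean squared error; returns final MSE."""
        w = np.zeros(X.shape[1])
        for _ in range(steps):
            grad = 2.0 / len(y) * X.T @ (X @ w - y)
            w -= lr * grad
        return np.mean((X @ w - y) ** 2)

    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    y_c = y - y.mean()

    # On the raw features a stable learning rate has to be tiny, so 500 steps achieve little;
    # after standardization the same number of steps converges much further.
    print("raw MSE:         ", train(X, y, lr=1e-10))
    print("standardized MSE:", train(X_std, y_c, lr=0.1))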

answered Oct 10 '22 by ffriend