I am writing my master's thesis on applying LSTM neural networks to time series. In my experiments, I found that scaling the data can have a great impact on the results. For example, when I use a tanh activation function and scale the values to the range [-1, 1], the model seems to converge faster, and the validation error no longer jumps dramatically after each epoch.
Does anyone know of a mathematical explanation for this? Or are there any papers that already explain this situation?
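For concreteness, here is a minimal sketch of the kind of scaling I mean (assuming scikit-learn's MinMaxScaler; the toy series is only a placeholder for my real data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy univariate series standing in for the real data
series = np.arange(100, dtype=float).reshape(-1, 1)

# Fit the scaler on the training split only, to avoid leaking test statistics
train, test = series[:80], series[80:]
scaler = MinMaxScaler(feature_range=(-1, 1))  # match tanh's output range
train_scaled = scaler.fit_transform(train)
test_scaled = scaler.transform(test)
```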
Your question reminds me of a picture used in our class, but you can find a similar one here at 3:02.
In that picture you can clearly see that the gradient-descent path on the left is much longer than the one on the right. Unscaled features give the loss surface elongated, ill-conditioned contours, so gradient descent zig-zags across the narrow valley and takes many small steps; after scaling, the contours become closer to circular, the gradient points more directly at the minimum, and the path shortens. Scaling is what turns the left plot into the right one; a small sketch of this effect follows.
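To make the effect concrete, here is a minimal sketch (my own toy example, not from the video) that runs plain gradient descent on a hypothetical two-parameter quadratic loss. The ill-conditioned case stands in for unscaled inputs, the well-conditioned case for scaled ones; all constants, including the learning rates chosen near each case's stability limit, are illustrative assumptions.

```python
import numpy as np

def gd_steps(hessian_diag, lr, tol=1e-6, max_steps=10000):
    """Count gradient-descent steps on f(w) = 0.5 * sum(h_i * w_i**2)."""
    w = np.array([1.0, 1.0])
    h = np.asarray(hessian_diag)
    for step in range(1, max_steps + 1):
        grad = h * w          # gradient of the diagonal quadratic
        w = w - lr * grad
        if np.linalg.norm(grad) < tol:
            return step
    return max_steps

# Unscaled features -> ill-conditioned surface (elongated contours);
# the stable learning rate is capped by the steep direction, so the
# shallow direction crawls and the path is long.
print("ill-conditioned :", gd_steps([100.0, 1.0], lr=0.019))

# Scaled features -> well-conditioned surface (near-circular contours);
# one learning rate suits both directions and convergence is fast.
print("well-conditioned:", gd_steps([1.0, 1.0], lr=0.5))
```

On my assumptions the ill-conditioned run needs on the order of hundreds of steps while the well-conditioned one finishes in a few dozen, which is the same long-path-versus-short-path contrast the picture shows.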