Why is scaling data so important in neural networks (LSTM)?

I am writing my master's thesis on applying LSTM neural networks to time series. In my experiments, I found that scaling the data can have a great impact on the result. For example, when I use a tanh activation function and the values are scaled to the range between -1 and 1, the model seems to converge faster, and the validation error does not jump dramatically after each epoch.
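To make the setup concrete, here is a minimal sketch of the kind of scaling described above: min-max scaling of a series into [-1, 1]. The helper names (`fit_minmax`, `transform`, `inverse_transform`) and the sample values are made up for illustration, and fitting the scaler on the training split only is an assumption about the experiment, not something stated in the question.

```python
import numpy as np

def fit_minmax(train, lo=-1.0, hi=1.0):
    """Compute scaling parameters from the training split only."""
    return float(np.min(train)), float(np.max(train)), lo, hi

def transform(x, params):
    """Map values linearly so that [data_min, data_max] becomes [lo, hi]."""
    data_min, data_max, lo, hi = params
    x = np.asarray(x, dtype=float)
    return lo + (x - data_min) * (hi - lo) / (data_max - data_min)

def inverse_transform(x_scaled, params):
    """Map scaled values (e.g. model predictions) back to the original units."""
    data_min, data_max, lo, hi = params
    x_scaled = np.asarray(x_scaled, dtype=float)
    return data_min + (x_scaled - lo) * (data_max - data_min) / (hi - lo)

# Hypothetical series, split chronologically into train and validation parts.
series = np.array([120.0, 135.0, 150.0, 180.0, 210.0, 240.0])
train, valid = series[:4], series[4:]

params = fit_minmax(train)
train_scaled = transform(train, params)   # lies inside [-1, 1]
valid_scaled = transform(valid, params)   # may fall outside [-1, 1]
restored = inverse_transform(train_scaled, params)
```

Note that validation values larger than anything seen in training land outside [-1, 1], which is worth keeping in mind for trending series.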

Does anyone know whether there is a mathematical explanation for this? Or are there any papers that already explain this behavior?

Thanh Quang asked Feb 04 '23 04:02


1 Answer

Your question reminds me of a picture used in our class; you can find a similar one here at 3:02.

enter image description here

In the picture above, you can clearly see that the path on the left is much longer than the one on the right. The right-hand plot is what the left-hand one becomes after scaling is applied.
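The effect can be reproduced numerically with a toy example. The sketch below assumes the picture shows gradient descent paths on loss contours (the answer does not say so explicitly) and uses a made-up quadratic loss, f(w) = 0.5 * sum(c_i * w_i^2), where unequal curvatures c_i stand in for unscaled features and equal curvatures stand in for scaled ones. The function `gd_steps` is a hypothetical helper, not part of any library.

```python
import numpy as np

def gd_steps(curvatures, start, tol=1e-3, max_steps=100_000):
    """Run plain gradient descent on f(w) = 0.5 * sum(c_i * w_i**2).

    Returns the number of steps taken until ||w|| < tol.
    """
    c = np.asarray(curvatures, dtype=float)
    w = np.asarray(start, dtype=float)
    # The stable step size is limited by the steepest direction.
    lr = 1.0 / c.max()
    for step in range(1, max_steps + 1):
        w = w - lr * c * w  # the gradient of f at w is c * w
        if np.linalg.norm(w) < tol:
            return step
    return max_steps

# Elongated contours (curvatures 1 and 100) versus circular contours (1 and 1).
steps_unscaled = gd_steps([1.0, 100.0], start=[1.0, 1.0])
steps_scaled = gd_steps([1.0, 1.0], start=[1.0, 1.0])
print("unscaled:", steps_unscaled, "scaled:", steps_scaled)
```

With elongated contours, the step size that keeps the steep direction stable makes progress along the flat direction very slow, so the path is long. With circular contours, a single step size suits every direction.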

Lerner Zhang answered Feb 08 '23 14:02