 

The concept of straight through estimator (STE) [closed]

I have seen the straight-through estimator (STE) in many neural-network papers, e.g. this and this, but I cannot understand the concept. Could anyone explain the STE or point me to a simple resource?

Amir asked Jul 13 '16 20:07



1 Answer

A straight-through estimator is a way of estimating gradients for a threshold operation in a neural network. The threshold could be as simple as the following function:

[image: definition of the threshold function]

As we can see, the derivative of this threshold function is 0 wherever it is defined. During back-propagation the network therefore learns nothing through it: every weight that feeds the threshold receives a zero gradient and is never updated.
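To make the zero-gradient problem concrete, here is a small worked example. The step function below is an assumed, typical choice of threshold and not necessarily the one shown in the image; the symbols L (loss), z (pre-activation), w (weight) and x (input) are introduced here only for illustration.

```latex
% Assumed threshold (a unit step); z = w x is the pre-activation.
f(z) =
\begin{cases}
  1 & \text{if } z > 0 \\
  0 & \text{otherwise}
\end{cases}
\qquad\Longrightarrow\qquad
f'(z) = 0 \quad \text{for all } z \neq 0 .

% Chain rule for the weight gradient: the middle factor kills it.
\frac{\partial L}{\partial w}
  = \frac{\partial L}{\partial f}\, f'(z)\, \frac{\partial z}{\partial w}
  = \frac{\partial L}{\partial f} \cdot 0 \cdot x
  = 0 .
```

Whatever the loss says about the output, the factor f'(z) = 0 removes that signal before it reaches w.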

The idea of a straight-through estimator is to ignore the derivative of the threshold function in the backward pass: the gradient passed back to the threshold's input is simply set equal to the gradient arriving at its output, as if the threshold were the identity function. The forward pass still applies the real threshold. The results (Figure 2) in this paper you have referenced show this performing well.
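The difference can be sketched in a few lines of plain Python with a hand-written backward pass. This is a minimal illustration and not code from the referenced papers: the single-weight "network", the squared-error loss, the starting values, and the names `threshold`, `backward` and `train` are all assumptions made for the example.

```python
def threshold(z):
    """Hard threshold used in the forward pass (assumed form: unit step)."""
    return 1.0 if z > 0 else 0.0


def backward(grad_out, x, use_ste):
    """Return dL/dw for z = w * x, y = threshold(z).

    grad_out is dL/dy, the gradient arriving at the threshold's output.
    """
    if use_ste:
        # Straight-through: pretend the threshold is the identity,
        # so dL/dz is just a copy of dL/dy.
        grad_z = grad_out
    else:
        # True derivative of the step function is 0 (almost everywhere).
        grad_z = 0.0
    return grad_z * x  # dz/dw = x


def train(use_ste, steps=20, lr=0.1):
    # Hypothetical toy problem: one weight, one input, target output 1.
    w, x, target = -0.5, 1.0, 1.0
    for _ in range(steps):
        y = threshold(w * x)            # forward pass uses the real threshold
        grad_out = 2.0 * (y - target)   # derivative of (y - target)**2 w.r.t. y
        w -= lr * backward(grad_out, x, use_ste)
    return w


w_true = train(use_ste=False)  # weight never moves: gradient is always 0
w_ste = train(use_ste=True)    # weight is pushed until the output flips to 1

print("true gradient:", w_true, "-> output", threshold(w_true))
print("STE gradient: ", w_ste, "-> output", threshold(w_ste))
```

With the true derivative the weight stays at its initial value of -0.5 and the output stays 0. With the straight-through gradient the weight moves across zero, the output becomes 1, and the updates then stop because the loss gradient is 0. In an autodiff framework the same effect is usually obtained by defining a custom backward rule for the threshold operation.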

Chinni answered Nov 16 '22 19:11
