Is there any implementation (or straightforward description) of a training algorithm for feed-forward neural networks that doesn't use a sigmoid or linear squashing function, but a non-differentiable one such as the Heaviside function?
I have already found a paper on such an algorithm, but no corresponding implementation, which I find bewildering, as it seems to me there should be something out there.
Any hints?
Backpropagation will not work with the Heaviside function because its derivative is zero over the entire domain except at zero, where it is infinite; in other words, the derivative of the Heaviside function is the Dirac delta.
The consequence is that the gradient, and therefore the weight update, is zero for any input other than zero, so no progress can be made. At zero the derivative is infinite, so the step is not manageable there either.
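To see why training stalls, here is a minimal sketch of a single-neuron gradient step with a Heaviside activation. The neuron, data, and learning rate are made up purely for illustration, not taken from the paper you mention; the point is that the chain-rule factor H'(net) is zero away from the threshold, so the weight never changes:

```java
// Hypothetical single-neuron example: the backprop update contains the factor
// H'(net), which is 0 almost everywhere, so the weight is never adjusted.
public class HeavisideBackprop {

    // Heaviside step activation
    static double heaviside(double x) {
        return x >= 0 ? 1.0 : 0.0;
    }

    // Its derivative: 0 everywhere except at x == 0 (where it is not finite)
    static double heavisideDerivative(double x) {
        return 0.0; // ignoring the single point x == 0
    }

    public static void main(String[] args) {
        double w = 0.5, b = 0.1, learningRate = 0.1;
        double input = 1.0, target = 0.0;

        for (int epoch = 0; epoch < 5; epoch++) {
            double net = w * input + b;
            double out = heaviside(net);
            double error = out - target;

            // Chain rule: dE/dw = error * H'(net) * input  ->  always 0
            double grad = error * heavisideDerivative(net) * input;
            w -= learningRate * grad;   // w never moves

            System.out.printf("epoch %d: out=%.1f grad=%.1f w=%.2f%n",
                              epoch, out, grad, w);
        }
    }
}
```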
You can find an implementation of this function in Java online, but I still don't think it is a good idea to use it. If you increase the gain (steepness) parameter of the sigmoid function, it becomes a very decent approximation of the Heaviside function with the added benefit of differentiability.
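As a rough illustration of that approximation, here is a small sketch comparing a sigmoid with an increasing gain parameter `k` (my naming for the steepness parameter) against step-like behaviour: the output approaches 0/1, while the derivative stays finite and non-zero near the threshold, which is exactly what backpropagation needs:

```java
// Sketch: a sigmoid with a large gain k behaves almost like the Heaviside
// step but keeps a usable, non-zero derivative around x = 0.
public class SteepSigmoid {

    static double sigmoid(double x, double k) {
        return 1.0 / (1.0 + Math.exp(-k * x));
    }

    // Derivative of the gained sigmoid: k * s * (1 - s)
    static double sigmoidDerivative(double x, double k) {
        double s = sigmoid(x, k);
        return k * s * (1.0 - s);
    }

    public static void main(String[] args) {
        double[] xs = {-1.0, -0.1, 0.0, 0.1, 1.0};
        for (double k : new double[]{1.0, 10.0, 50.0}) {
            System.out.println("gain k = " + k);
            for (double x : xs) {
                System.out.printf("  x=%5.2f  sigmoid=%.4f  derivative=%.4f%n",
                                  x, sigmoid(x, k), sigmoidDerivative(x, k));
            }
        }
    }
}
```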
Check this paper and see if it has any information that might be of help to you.