 

Purposely Overfit Neural Network

Technically speaking, given a complex enough network and sufficient amounts of time, is it always possible to overfit any dataset to the point where training error is 0?

tallosan asked Mar 10 '23 14:03


1 Answer

Neural networks are universal approximators, which essentially means that as long as there exists a deterministic mapping f from input to output, there always exists a set of parameters (for a large enough network) that gives you an error arbitrarily close to the minimal possible error, but:

  • if the dataset is infinite (i.e. it is a distribution), then the minimal obtainable error (called the Bayes risk) can be greater than zero: it is some value e (roughly a measure of the "overlap" of the different classes/values).
  • if the mapping f is non-deterministic, then again there is a non-zero Bayes risk e (this is a mathematical way of saying that a given point can have "multiple" values, each with a given probability).
  • arbitrarily close does not mean minimal. So even if the minimal error is zero, it does not mean that you just need a "big enough" network to get to zero; you might always end up with a very small epsilon (although you can keep decreasing it for as long as you want). For example, a network trained on a classification task with a sigmoid/softmax output can never attain the minimal log loss (cross-entropy loss), because you can always move your activations "closer to 1" or "closer to 0", but you can reach neither of them.
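The last point can be checked numerically. Below is a minimal sketch, assuming a single training example with true label 1 and a sigmoid output; the function name `log_loss_positive` and the chosen logit values are illustrative, not part of the original answer. It evaluates the cross-entropy loss for increasingly large logits and shows that it keeps shrinking while staying strictly positive.

```python
import math

def log_loss_positive(z: float) -> float:
    """Cross-entropy loss for one example with true label 1 and logit z."""
    # -log(sigmoid(z)) equals log(1 + exp(-z)); log1p keeps precision for large z.
    return math.log1p(math.exp(-z))

# Illustrative logits: pushing the activation "closer to 1".
logits = [1.0, 5.0, 10.0, 20.0, 30.0]
losses = [log_loss_positive(z) for z in logits]

for z, loss in zip(logits, losses):
    print(f"logit={z:5.1f}  loss={loss:.3e}")
```

Mathematically the loss is positive for every finite logit, so zero is an infimum that is never attained. In floating point, a large enough logit will eventually underflow to a reported loss of exactly 0, but that is a rounding effect rather than a true minimum.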

So from a mathematical perspective the answer is no; from a practical point of view, under the assumption of a finite training set and a deterministic mapping, the answer is yes.

In particular, if you are asking about classification accuracy and you have a finite dataset with a unique label per datapoint, then it is easy to construct by hand a neural network which has 100% accuracy. However, this does not mean it attains the minimal possible loss (as described above). Thus, from the optimization perspective, you are not obtaining "zero error".
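One possible hand construction, as a sketch rather than the only way to do it: project every point onto a random direction (distinct points get distinct projections with probability 1), then use a one-hidden-layer ReLU network to linearly interpolate the labels between consecutive projections. The dataset, its size, and all names below (`project`, `network`, `knots`) are hypothetical and chosen only for illustration; no training is involved.

```python
import random

random.seed(0)

# Hypothetical finite dataset: 50 points in 5-D with arbitrary binary labels.
n, d = 50, 5
X = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
y = [random.randint(0, 1) for _ in range(n)]

# First-layer weights: a single random direction shared by all hidden units.
w = [random.gauss(0.0, 1.0) for _ in range(d)]

def project(x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Sort the projected points; each one becomes a "knot" of a piecewise linear function.
pairs = sorted((project(x), float(label)) for x, label in zip(X, y))
knots = [t for t, _ in pairs]
targets = [v for _, v in pairs]

# Slopes of the interpolant between consecutive knots, and the slope change
# at each knot, which becomes the output weight of the matching ReLU unit.
slopes = [(targets[i + 1] - targets[i]) / (knots[i + 1] - knots[i]) for i in range(n - 1)]
coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n - 1)]
bias = targets[0]

def network(x):
    """One hidden layer: unit i computes relu(w.x - knots[i]); the output is linear."""
    t = project(x)
    # zip stops at len(coeffs) == n - 1, so the last knot gets no hidden unit.
    return bias + sum(c * max(0.0, t - k) for c, k in zip(coeffs, knots))

predictions = [int(network(x) > 0.5) for x in X]
accuracy = sum(p == label for p, label in zip(predictions, y)) / n
print("training accuracy:", accuracy)
```

The network uses n - 1 hidden units for n points, so it simply memorizes the training set. It reaches 100% training accuracy by construction, yet if a sigmoid were placed on top of its output, the log loss would still be strictly positive, which is the distinction drawn above.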

lejlot answered Mar 15 '23 02:03