What is inductive bias in machine learning? Why is it necessary?
The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered. In machine learning, one aims to construct algorithms that are able to learn to predict a certain target output.
Generally, every building block of a model and every assumption we make about the data is a form of inductive bias. Inductive biases play an important role in the ability of machine learning models to generalize to unseen data. A strong, well-chosen inductive bias can lead our model to converge to a good (possibly even globally optimal) solution.
A central factor in the application of machine learning to a given task is the inductive bias, i.e., the choice of the hypothesis space from which learned functions are taken. The restriction posed by the inductive bias is necessary for practical learning, and it reflects prior knowledge about the task at hand.
Every machine learning algorithm with any ability to generalize beyond the training data it sees has some type of inductive bias: the set of assumptions the model makes in order to learn the target function and to generalize beyond the training data.
For example, in linear regression, the model assumes that the output (dependent variable) is related to the independent variables linearly, i.e., linearly in the weights. This is an inductive bias of the model.
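To make this concrete, here is a minimal sketch (my own illustration, not part of the original answer): whatever the data look like, ordinary least squares can only ever produce a hypothesis that is linear in the weights, and that restriction is exactly its inductive bias.

```python
# Illustrative sketch (assumed, not from the original answer): the hypothesis
# space of linear regression is restricted to functions linear in the weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)   # noisy, genuinely linear data

# Ordinary least squares: no matter what the data are, the fitted model is
# always of the form y_hat = X @ w + b; that restriction is the inductive bias.
X_aug = np.hstack([X, np.ones((100, 1))])     # column of ones for the bias term
w_hat, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
print(np.round(w_hat, 2))                     # approximately [ 2. -1.  0.5  0.]
```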
Pretty much every design choice in machine learning signifies some sort of inductive bias. "Relational inductive biases, deep learning, and graph networks" (Battaglia et al., 2018) is an amazing read, which I will be referring to throughout this answer.
An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independent of the observed data. [...] Inductive biases can express assumptions about either the data-generating process or the space of solutions.
Concretely speaking, the very composition of layers in deep learning provides a type of relational inductive bias: hierarchical processing. The type of layer imposes further relational inductive biases (summarized in Table 1 of the paper; a small demonstration of the convolutional case follows after this list):
- Fully connected layers: all-to-all relations between units, only a weak relational inductive bias;
- Convolutional layers: locality and spatial translation equivariance;
- Recurrent layers: sequentiality and invariance to time translation;
- Graph networks: arbitrary relations between entities, invariant to node and edge permutations.
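As a hedged illustration (my own sketch, not code from the paper), the snippet below shows the translation-equivariance bias of a convolutional layer: shifting the input simply shifts the output, away from the boundaries.

```python
# Illustrative sketch (assumption: plain NumPy, not code from Battaglia et al.):
# a 1-D convolution is local and translation-equivariant, which is exactly the
# relational inductive bias of convolutional layers.
import numpy as np

def conv1d(x, k):
    # "valid" cross-correlation, the per-channel operation of a Conv1d layer
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=20)
kernel = np.array([0.25, 0.5, 0.25])

shift = 4
x_shifted = np.roll(x, shift)            # translate the input by 4 positions

y = conv1d(x, kernel)
y_shifted = conv1d(x_shifted, kernel)

# Away from the boundary, convolving a shifted input equals shifting the output.
print(np.allclose(y_shifted[shift:], y[:-shift]))  # True
```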
More generally, non-relational inductive biases used in deep learning include activation non-linearities, weight decay, dropout, batch and layer normalization, data augmentation, training curricula, and the choice of optimization algorithm, all of which impose constraints on the trajectory and outcome of learning.
In a Bayesian model, inductive biases are typically expressed through the choice and parameterization of the prior distribution. Adding a Tikhonov (L2) regularization penalty to your loss function amounts to assuming that simpler, smaller-norm hypotheses are more likely.
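As a small sketch of this (my own NumPy illustration, not from the original answer): the ridge solution equals the MAP estimate under a zero-mean Gaussian prior on the weights, and increasing the penalty strength pulls the learned hypothesis toward smaller norms.

```python
# Illustrative sketch (assumed NumPy implementation): the Tikhonov / L2 penalty
# is an explicit inductive bias; the ridge solution is also the MAP estimate
# under a zero-mean Gaussian prior, i.e. "small weights are more likely a priori".
import numpy as np

def ridge_fit(X, y, alpha):
    # Closed form: w = (X^T X + alpha * I)^(-1) X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 10))                       # few samples, many features
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y = X @ w_true + rng.normal(scale=0.5, size=30)

for alpha in (0.0, 1.0, 10.0):
    w = ridge_fit(X, y, alpha)
    # A larger alpha (a stronger prior, i.e. a stronger bias) shrinks the norm.
    print(f"alpha={alpha:>4}: ||w|| = {np.linalg.norm(w):.3f}")
```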
The stronger the inductive bias, the better the sample efficiency; this can be understood in terms of the bias-variance tradeoff. Many modern deep learning methods follow an "end-to-end" design philosophy that emphasizes minimal a priori representational and computational assumptions, which explains why they tend to be so data-intensive. On the other hand, there is a lot of research into baking stronger relational inductive biases into deep learning architectures, e.g. with graph networks.
In philosophy, inductive reasoning refers to generalizing from specific observations to a broader conclusion. It is the counterpart of deductive reasoning, which proceeds from general premises to a specific conclusion.