What is inductive bias in machine learning? Why is it necessary?
The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered. In machine learning, one aims to construct algorithms that are able to learn to predict a certain target output.
Generally, every building block of a model and every assumption we make about the data is a form of inductive bias. Inductive biases play an important role in the ability of machine learning models to generalize to unseen data. A strong, well-chosen inductive bias can lead our model to converge to a good (possibly even globally optimal) solution.
A central factor in the application of machine learning to a given task is the inductive bias, i.e., the choice of the hypothesis space from which learned functions are taken. The restriction posed by the inductive bias is necessary for practical learning, and it reflects prior knowledge about the task at hand.
Every machine learning algorithm with any ability to generalize beyond the training data it sees has some type of inductive bias: the set of assumptions the model makes in order to learn the target function and to generalize beyond the training data.
For example, in linear regression, the model assumes that the output (dependent variable) is related to the independent variables linearly, i.e., linearly in the weights. This is an inductive bias of the model.
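To make this concrete, here is a minimal sketch (my own illustration, not part of the original answer): whatever the data look like, ordinary least squares can only ever produce a hypothesis that is linear in the weights, and that restriction is exactly its inductive bias.

```python
# Illustrative sketch (assumed, not from the original answer): the hypothesis
# space of linear regression is restricted to functions linear in the weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)   # noisy, genuinely linear data

# Ordinary least squares: no matter what the data are, the fitted model is
# always of the form y_hat = X @ w + b; that restriction is the inductive bias.
X_aug = np.hstack([X, np.ones((100, 1))])     # column of ones for the bias term
w_hat, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
print(np.round(w_hat, 2))                     # approximately [ 2. -1.  0.5  0.]
```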
Pretty much every design choice in machine learning signifies some sort of inductive bias. "Relational inductive biases, deep learning, and graph networks" (Battaglia et al., 2018) is an amazing read, which I will be referring to throughout this answer.
An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independent of the observed data. [...] Inductive biases can express assumptions about either the data-generating process or the space of solutions.
Concretely speaking, the very composition of layers in deep learning provides a type of relational inductive bias: hierarchical processing. The type of layer imposes further relational inductive biases (summarized in Table 1 of the paper; a small demonstration of the convolutional case follows after this list):
- Fully connected layers: all-to-all relations between units, only a weak relational inductive bias;
- Convolutional layers: locality and spatial translation equivariance;
- Recurrent layers: sequentiality and invariance to time translation;
- Graph networks: arbitrary relations between entities, invariant to node and edge permutations.
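As a hedged illustration (my own sketch, not code from the paper), the snippet below shows the translation-equivariance bias of a convolutional layer: shifting the input simply shifts the output, away from the boundaries.

```python
# Illustrative sketch (assumption: plain NumPy, not code from Battaglia et al.):
# a 1-D convolution is local and translation-equivariant, which is exactly the
# relational inductive bias of convolutional layers.
import numpy as np

def conv1d(x, k):
    # "valid" cross-correlation, the per-channel operation of a Conv1d layer
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

rng = np.random.default_rng(0)
x = rng.normal(size=20)
kernel = np.array([0.25, 0.5, 0.25])

shift = 4
x_shifted = np.roll(x, shift)            # translate the input by 4 positions

y = conv1d(x, kernel)
y_shifted = conv1d(x_shifted, kernel)

# Away from the boundary, convolving a shifted input equals shifting the output.
print(np.allclose(y_shifted[shift:], y[:-shift]))  # True
```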
More generally, non-relational inductive biases used in deep learning include activation non-linearities, weight decay, dropout, batch and layer normalization, data augmentation, training curricula, and the choice of optimization algorithm, all of which impose constraints on the trajectory and outcome of learning.
In a Bayesian model, inductive biases are typically expressed through the choice and parameterization of the prior distribution. Adding a Tikhonov (L2) regularization penalty to your loss function amounts to assuming that simpler, smaller-norm hypotheses are more likely.
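As a small sketch of this (my own NumPy illustration, not from the original answer): the ridge solution equals the MAP estimate under a zero-mean Gaussian prior on the weights, and increasing the penalty strength pulls the learned hypothesis toward smaller norms.

```python
# Illustrative sketch (assumed NumPy implementation): the Tikhonov / L2 penalty
# is an explicit inductive bias; the ridge solution is also the MAP estimate
# under a zero-mean Gaussian prior, i.e. "small weights are more likely a priori".
import numpy as np

def ridge_fit(X, y, alpha):
    # Closed form: w = (X^T X + alpha * I)^(-1) X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 10))                       # few samples, many features
w_true = np.zeros(10)
w_true[:2] = [3.0, -2.0]
y = X @ w_true + rng.normal(scale=0.5, size=30)

for alpha in (0.0, 1.0, 10.0):
    w = ridge_fit(X, y, alpha)
    # A larger alpha (a stronger prior, i.e. a stronger bias) shrinks the norm.
    print(f"alpha={alpha:>4}: ||w|| = {np.linalg.norm(w):.3f}")
```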
The stronger the inductive bias, the better the sample efficiency; this can be understood in terms of the bias-variance tradeoff. Many modern deep learning methods follow an "end-to-end" design philosophy that emphasizes minimal a priori representational and computational assumptions, which explains why they tend to be so data-intensive. On the other hand, there is a lot of research into baking stronger relational inductive biases into deep learning architectures, e.g. with graph networks.
In philosophy, inductive reasoning refers to generalizing from specific observations to a broader conclusion. It is the counterpart of deductive reasoning, which proceeds from general premises to a specific conclusion.