 

Why is weight vector orthogonal to decision plane in neural networks

I am a beginner in neural networks and am learning about perceptrons. My question is: why is the weight vector perpendicular to the decision boundary (hyperplane)? I have referred to many books, but all of them state that the weight vector is perpendicular to the decision boundary; none of them explain why.

Can anyone give me an explanation or reference to a book?

asked Apr 16 '12 by 8A52

1 Answer

The weights are simply the coefficients that define a separating plane. For the moment, forget about neurons and just consider the geometric definition of a plane in N dimensions:

w1*x1 + w2*x2 + ... + wN*xN - w0 = 0 

You can also think of this as being a dot product:

w*x - w0 = 0 

where w and x are both length-N vectors. This equation holds for all points on the plane. Recall that we can multiply the above equation by a constant and it still holds, so we can choose the constants such that the vector w has unit length.

Now, take out a piece of paper and draw your x-y axes (x1 and x2 in the above equations). Next, draw a line (a plane in 2D) somewhere near the origin. w0 is simply the perpendicular distance from the origin to the plane, and w is the unit vector that points from the origin along that perpendicular. If you now draw a vector from the origin to any point on the plane, the dot product of that vector with the unit vector w will always be equal to w0, so the equation above holds, right? This is simply the geometric definition of a plane: a unit vector defining the perpendicular to the plane (w) and the distance (w0) from the origin to the plane.
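To make the perpendicularity concrete, here is a small numpy sketch (the particular numbers are just illustrative, not from the question): it picks two points on a 2D line written as w*x - w0 = 0 and checks that their difference, which is a direction lying in the line, has zero dot product with w.

import numpy as np

# A hypothetical 2D boundary w·x - w0 = 0, with w chosen as a unit vector.
w = np.array([0.6, 0.8])        # ||w|| = 1
w0 = 2.0                        # perpendicular distance from the origin

# Two points that satisfy w·x - w0 = 0, i.e. they lie on the line.
x_a = np.array([0.0, 2.50])     # 0.6*0.0 + 0.8*2.50 = 2.0
x_b = np.array([1.0, 1.75])     # 0.6*1.0 + 0.8*1.75 = 2.0

print(np.dot(w, x_a) - w0)      # 0.0 -> on the plane
print(np.dot(w, x_b) - w0)      # 0.0 -> on the plane

# The difference of two points on the plane is a direction *within* the plane;
# its dot product with w is zero, so w is perpendicular to the plane.
print(np.dot(w, x_a - x_b))     # 0.0 (up to floating-point error)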

Now, our neuron simply represents the same plane described above; we just name the variables a little differently. We'll call the components of x our "inputs", the components of w our "weights", and we'll call the distance w0 a bias. That's all there is to it.
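As a minimal sketch of that renaming (the function name and numbers here are mine, purely for illustration): the neuron's net input is exactly the left-hand side of the plane equation.

import numpy as np

def pre_activation(inputs, weights, bias):
    # The neuron's net input: identical to the plane expression w·x - w0.
    return np.dot(weights, inputs) - bias

weights = np.array([0.6, 0.8])   # the "w" vector
bias = 2.0                       # the "w0" distance

print(pre_activation(np.array([0.0, 2.5]), weights, bias))  # 0.0 -> on the boundary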

Getting a little beyond your actual question, we don't really care about points on the plane. We really want to know which side of the plane a point falls on. While w*x - w0 is exactly zero on the plane, it will have positive values for points on one side of the plane and negative values for points on the other side. That's where the neuron's activation function comes in but that's beyond your actual question.
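As a tiny illustration of that last point (a sketch under the same made-up numbers as above, not code from the answer): a step activation just thresholds w*x - w0, mapping the two half-spaces on either side of the plane to the two output classes.

import numpy as np

def perceptron_predict(x, w, w0):
    # Step activation: report which side of the plane w·x - w0 = 0 the point x falls on.
    return 1 if np.dot(w, x) - w0 > 0 else 0

w = np.array([0.6, 0.8])
w0 = 2.0

print(perceptron_predict(np.array([3.0, 3.0]), w, w0))  # 1: 0.6*3 + 0.8*3 - 2 = 2.2 > 0
print(perceptron_predict(np.array([0.0, 0.0]), w, w0))  # 0: 0 - 2 < 0, other side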

answered Sep 23 '22 by bogatron