I have heard a lot about "breaking the symmetry" within the context of neural network programming and initialization. Can somebody please explain what this means? As far as I can tell, it has something to do with neurons performing identically during forward and backward propagation if the weight matrix is filled with identical values during initialization. Asymmetric behavior would instead arise with random initialization, i.e., not using identical values throughout the matrix.
Symmetry breaking refers to a requirement on the initialization of machine learning models such as neural networks. When all of a model's weights are initialized to the same value, it can be difficult or impossible for the weights to diverge as the model is trained. This is the "symmetry".
Now imagine that you initialize all weights to the same value (e.g. zero or one). In this case, each hidden unit receives exactly the same signal. For example, if all weights are initialized to 1, each unit gets a signal equal to the sum of the inputs (and outputs sigmoid(sum(inputs))).
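A small sketch makes this concrete (NumPy; the layer sizes and example input are illustrative assumptions, not from the original answer). With every weight set to 1, all hidden units compute the same activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 2.0])  # arbitrary example input
W = np.ones((4, 3))             # all weights initialized to 1

h = sigmoid(W @ x)              # each unit receives sum(inputs) = 1.3
print(h)                        # all four activations are identical
```

No matter how many hidden units the layer has, they are indistinguishable after initialization, so no unit can specialize.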
Backpropagation is the process used to train a neural network. It takes the error from a forward pass and propagates this loss backward through the network's layers to adjust the weights. Backpropagation is the essence of neural network training.
Initializing all the weights to zero leads the neurons to learn the same features during training; in fact, any constant initialization scheme will perform very poorly.
Your understanding is correct.
When all initial weights are identical, for example when every weight is initialized to 0, backpropagation gives every weight the same gradient and hence the same update. This is what is referred to as the symmetry.
Intuitively, this means all nodes will learn the same thing, and we don't want that, because we want the network to learn different kinds of features. Random initialization achieves this: the gradients then differ across nodes, so each node grows more distinct from the others, enabling diverse feature extraction. This is what is referred to as breaking the symmetry.
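The symmetry is visible directly in the gradients. Below is a minimal one-hidden-layer sketch (NumPy, squared-error loss with a linear output unit; the shapes, input, and target are illustrative assumptions) comparing constant and random initialization:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_grads(W1, w2, x, y):
    """Gradient of 0.5*(pred - y)**2 w.r.t. each hidden unit's weights."""
    h = sigmoid(W1 @ x)                 # hidden activations
    pred = w2 @ h                       # linear output unit
    dh = (pred - y) * w2 * h * (1 - h)  # error signal per hidden unit
    return np.outer(dh, x)              # one gradient row per hidden unit

x, y = np.array([0.5, -1.2, 2.0]), 1.0

# Constant initialization: every gradient row is identical,
# so all hidden units receive the same update forever.
W1_const, w2_const = np.ones((4, 3)), np.ones(4)
g_const = hidden_grads(W1_const, w2_const, x, y)
print(np.allclose(g_const, g_const[0]))  # True

# Random initialization: gradient rows differ, so units diverge.
rng = np.random.default_rng(0)
W1_rand, w2_rand = rng.normal(size=(4, 3)), rng.normal(size=4)
g_rand = hidden_grads(W1_rand, w2_rand, x, y)
print(np.allclose(g_rand, g_rand[0]))  # False
```

With the constant init, the identical gradient rows mean every training step preserves the symmetry, which is why such a network behaves like a single hidden unit no matter how wide it is.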