Are there standard input, weight and output values for neural network nodes? [closed]

Tags:

neural-network

So I've started learning about neural networks, but I'm finding it hard to figure out the basics. Grateful for any help anyone can offer.

1) Are there standard values that should be input to a neuron? For example, if a neuron has 5 incoming connections, should each connection be providing a) a continuous value between 0 and 1? b) Either 0 or 1? c) Something else?

2) If you use an activation function of tanh, that means that the neuron will start outputting 1 if the dot product input reaches about 3 (tanh(3) = .995). If I have a layer of 20 hidden nodes, that means that the weights will need to be small - around the .05 mark - if we are to avoid maxing out the activation function? Then why do we set the starting weights to be between -1 and 1? Better to start them off very small?

3) What should be the output of a neuron? a) a value between 0 and 1? b) Either 0 or 1? c) Something else? Do some ANNs have neurons outputting between -1 and 1 (I think I've seen that?)

4) Seems like the rules change for the input layer and the output layer? For the input layer, I guess you have to encode your input data into a suitable format. Does that always mean encoding into values between 0 and 1? Likewise for the output layer, presumably you have to massage your output values to something useful? So perhaps if your ANN outputs a continuous value between 0 and 1, and you want a YES or NO, then you can just make a rule that <0.5 is NO and >0.5 is YES. Is that how it works?

5) Are there disadvantages to encoding scalar input values into binary? Seems a little strange that a large number might have a 1 as the final bit, yet that number+1 has a 0 as the final bit? Is there a more continuous way of encoding values that works better?

Sorry, lots of questions. Grateful for any answers. Thanks!

asked Sep 12 '13 08:09

Bruce

1 Answer

1) Normalized inputs help training a lot, so keep your inputs in a small range. What the range should be depends on the task: some variables are naturally boolean, but real-valued ones are best centered at zero and scaled, typically to unit variance. Otherwise, the network spends training time learning the mean and variance of the data, which is wasteful because there are very fast, very simple algorithms for that.

2) If you start out with large weights, training behavior is unpredictable. I've never heard anyone say that initial weights should be in [-1, 1]; the common recipe, AFAIK, is to use small random Gaussians with mean 0: draw standard-normal samples (mean 0, variance 1, which is what you get from randn in Matlab or NumPy) and scale them down so that the weighted sums stay out of the saturated region of the activation function.

3) Depends on the activation function. For hidden-layer neurons, tanh is a common activation function, and it has range (-1, 1). For the output layer, the appropriate activation function depends on the task. For regression you'd want a linear (unbounded) activation, while for probability estimation and classification you want logistic or softmax activation with range (0, 1).

4) This is a repetition of questions 1 and 3.

5) I really don't understand why you'd want to do this. Is there anything wrong with floating-point numbers? You can feed the scaled scalar value to the network directly.
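To make the points about input scaling and initial weight size concrete, here is a minimal sketch using only the Python standard library (NumPy's randn would do the same job). The function names `standardize` and `saturated_fraction`, the sample values, and the 0.995 saturation threshold (taken from the tanh(3) figure in the question) are illustrative assumptions, not part of any library. It estimates how often a single tanh unit with 20 standardized inputs ends up saturated for different weight scales.

```python
import math
import random
import statistics


def standardize(column):
    """Center raw feature values at zero and scale them to unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column]


def saturated_fraction(weight_std, n_inputs=20, n_trials=2000, seed=0):
    """Estimate how often one tanh unit saturates (|output| > 0.995).

    Inputs are drawn from a zero-mean, unit-variance Gaussian (as after
    standardization); weights are zero-mean Gaussians with std weight_std.
    """
    rng = random.Random(seed)
    saturated = 0
    for _ in range(n_trials):
        inputs = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
        weights = [rng.gauss(0.0, weight_std) for _ in range(n_inputs)]
        pre_activation = sum(w * x for w, x in zip(weights, inputs))
        if abs(math.tanh(pre_activation)) > 0.995:
            saturated += 1
    return saturated / n_trials


if __name__ == "__main__":
    ages = [18, 25, 31, 47, 62]  # made-up raw feature values
    print([round(z, 2) for z in standardize(ages)])

    # Compare unscaled randn-style weights with two scaled-down variants.
    for std in (1.0, 1 / math.sqrt(20), 0.01):
        frac = saturated_fraction(std)
        print(f"weight std {std:.3f}: saturated in {frac:.1%} of trials")
```

With unit-variance weights the unit should saturate in a large share of trials, while scaling the weights down (here by 1/sqrt(number of inputs), one common heuristic, or by a fixed small factor) should keep it almost entirely in the responsive part of tanh.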

answered Nov 04 '22 08:11

Fred Foo
