 

How to randomly initialize weights in TensorFlow?

Tags:

tensorflow

In TensorFlow, I learned from the tutorial that one initializes the variables with something like `sess.run(tf.global_variables_initializer())`.

However, I found that every time I run this with the same input dataset, the loss starts at the same value.

I presume this is because the initialization always sets up the variables with the same values (probably zero).

I wish to randomize the values of the weights. I've tried searching for this, but the TensorFlow docs don't give a clear answer as to whether initialization uses zero values or random values by default.

How can I specify the initialization to set up random values?
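For illustration, here is a minimal TF 1.x sketch of a variable given an explicit random initializer; the variable name and shape are made up for the example:

```
import tensorflow as tf

# Hypothetical example: a variable with an explicitly random initializer.
w = tf.get_variable("w", shape=[3, 3],
                    initializer=tf.random_normal_initializer(stddev=0.1))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # runs each variable's initializer op
    print(sess.run(w))  # random values; differs across fresh runs unless a seed is set
```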


Update

my network is first a bunch of CNN and pooling layers, like below:

```
conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[3, 3],
                         padding="same", activation=tf.nn.relu, name="conv_chad_1")

pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[3, 3],
                         padding="same", activation=tf.nn.relu, name="conv_chad_2")

pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2, name="pool_chad_2")
```

AFAIK, the weights are defined inside these predefined layers. How do I tell these layers to initialize their weight variables randomly?
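For reference, `tf.layers.conv2d` exposes `kernel_initializer` and `bias_initializer` arguments, so an explicit random initializer can be passed per layer. A sketch using the first layer above (the stddev value is just an example, not a recommendation):

```
conv1 = tf.layers.conv2d(
    inputs=input_layer, filters=32, kernel_size=[3, 3], padding="same",
    activation=tf.nn.relu, name="conv_chad_1",
    kernel_initializer=tf.truncated_normal_initializer(stddev=0.05),
    bias_initializer=tf.zeros_initializer())
```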

asked May 17 '18 by kwagjj


People also ask

Why do we initialize weights randomly?

The weights of artificial neural networks must be initialized to small random numbers because this is an expectation of the stochastic optimization algorithm used to train the model, stochastic gradient descent: identical starting weights would leave the neurons symmetric, so they would all learn the same features.

How do you initialize a TensorFlow variable?

To initialize a new variable from the value of another variable, use the other variable's initialized_value() property. You can use the initialized value directly as the initial value for the new variable, or you can use it like any other tensor to compute a value for the new variable.
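A short sketch of that pattern, mirroring the old TF 1.x documentation (the names and shapes are illustrative):

```
# Create a variable with a random initial value.
weights = tf.Variable(tf.truncated_normal([784, 200], stddev=0.35), name="weights")
# Create another variable with the same initial value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Or use the initialized value like any other tensor, e.g. doubled.
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```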

What is the default initialization in TensorFlow?

From the documentation: if initializer is None (the default), the default initializer passed in the variable scope will be used. If that one is None too, a glorot_uniform_initializer will be used, which initializes values from a uniform distribution.
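In code, the two forms below should be equivalent under that default (assuming no initializer was set on the enclosing variable scope; names and shapes are examples):

```
# Default: no initializer given, so glorot_uniform is used.
w_default = tf.get_variable("w_default", shape=[784, 200])
# Explicit equivalent of the default.
w_glorot = tf.get_variable("w_glorot", shape=[784, 200],
                           initializer=tf.glorot_uniform_initializer())
```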


2 Answers

You should provide more information; for example, how do you initialize the variables in your graph? Weights in a neural network must be initialized randomly (biases are fine to initialize as all zeros). Thus you should use code like the following to define them with proper initialization:

```
import tensorflow as tf

# initialize weights randomly from a Gaussian distribution
# step 1: create the initializer for weights
weight_initer = tf.truncated_normal_initializer(mean=0.0, stddev=0.01)
# step 2: create the weight variable with proper initialization
W = tf.get_variable(name="Weight", dtype=tf.float32, shape=[784, 200], initializer=weight_initer)

# initialize biases as zero
# step 1: create the initial value for the biases (a constant tensor)
bias_initer = tf.constant(0., shape=[200], dtype=tf.float32)
# step 2: create the bias variable with proper initialization
# (no shape argument here, since the initializer is a tensor that already has one)
b = tf.get_variable(name="Bias", dtype=tf.float32, initializer=bias_initer)
```
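A possible usage sketch for the snippet above: run the initializers in a session, then fetch the values to confirm W is random and b is zero:

```
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(W)[0, :5])  # a few random values from the truncated normal
    print(sess.run(b)[:5])     # zeros
```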
answered Oct 18 '22 by Ary


I had the same problem; it sounds like you are executing the line with tf.global_variables_initializer() every time. What you need to do is separate them: for instance, if you're working in a Jupyter notebook, declare the initialization part of the session in one cell and the rest (the training part) in another cell.
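Roughly, the cell split being suggested looks like this (a sketch; `train_op` and the feed dict are hypothetical placeholders):

```
# --- notebook cell 1: build the graph and initialize once ---
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# --- notebook cell 2: training only; re-running this cell
# --- does not re-initialize the weights
# for step in range(100):
#     sess.run(train_op, feed_dict={...})  # train_op is hypothetical here
```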

Also, when you want to continue training the model after a pause, you might want to save the parameters and restore them. For how to do that, you can look here. If that doesn't solve your problem, show me the part of the code you're dealing with; I might be able to help more.
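The save/restore pattern referred to is presumably tf.train.Saver; a minimal sketch (the checkpoint path is just an example):

```
saver = tf.train.Saver()
# ... after some training steps:
save_path = saver.save(sess, "/tmp/model.ckpt")  # example path

# Later, in a fresh session, restore instead of re-initializing:
# saver.restore(sess, "/tmp/model.ckpt")
```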



PS: You can't restore your parameters when you change your optimizer; you have to stick with one, as far as I know. You can't do 100 iterations with one optimizer and then continue with another optimizer using those same parameters. Or maybe you can try some hack that lets you do that; let me know if you find one.

answered Oct 18 '22 by shivam13juna