I'm using TensorFlow to build a deep learning model, and I'm new to TensorFlow.
For certain reasons, my batch size is limited, and this small batch size gives the model high variance.
So I want to use a trick to effectively make the batch size larger. My idea is to store the gradients of each mini-batch, for example 64 mini-batches, then sum the gradients together and use the mean gradient of these 64 mini-batches of training data to update the model's parameters.
This means that for the first 63 mini-batches I do not update the parameters, and after the 64th mini-batch I update the model's parameters only once.
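To illustrate the intent, here is a toy NumPy sketch of the update rule I have in mind (grad_fn and the data are just placeholders, not my actual model):

import numpy as np

params = np.zeros(10)
learning_rate = 0.01
mini_batches = [np.random.randn(8, 10) for _ in range(64)]

def grad_fn(params, batch):
    # placeholder for backprop; returns a fake gradient of the right shape
    return batch.mean(axis=0)

accumulated = np.zeros_like(params)
for batch in mini_batches:       # no parameter update during these 64 batches
    accumulated += grad_fn(params, batch)
# a single update using the mean gradient over the 64 mini-batches
params -= learning_rate * accumulated / len(mini_batches)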
But since TensorFlow is graph-based, does anyone know how to implement this?
Thanks very much.
One solution to this problem is gradient accumulation. The idea is to split up the batch into smaller mini-batches which are run sequentially, while accumulating their results. The accumulated results are used to update the model parameters only at the end of the last mini-batch.
Gradient accumulation is extremely useful when working with large images/volumetric data, using low-end hardware, or training on multiple GPUs. For me, the most important feature is to be able to use larger batch sizes without exhausting memory.
Training speed can also be improved when combining DDP (distributed data parallel training) with gradient accumulation, because optimizer.step() is called only every K steps instead of after every mini-batch.
TensorFlow provides the tf.GradientTape API for automatic differentiation; that is, computing the gradient of a computation with respect to some inputs, usually tf.Variables. TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape", and then uses that tape to compute the gradients.
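For TensorFlow 2.x, the same accumulation idea can be written with tf.GradientTape. Here is a minimal sketch; the toy model, loss_fn, and optimizer below are placeholder assumptions, not part of the original post:

import tensorflow as tf

# Hypothetical toy setup; swap in your own model, loss, and optimizer
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.build(input_shape=(None, 8))
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

accum_steps = 64  # number of mini-batches whose gradients are averaged

# One non-trainable accumulator per trainable variable, initialized to zeros
accum_grads = [tf.Variable(tf.zeros_like(v), trainable=False)
               for v in model.trainable_variables]

@tf.function
def accumulate_step(x, y):
    # Compute this mini-batch's gradients and add them to the accumulators
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    for acc, g in zip(accum_grads, grads):
        acc.assign_add(g / accum_steps)  # running mean over accum_steps batches

def apply_step():
    # Apply the averaged gradients once, then reset the accumulators
    optimizer.apply_gradients(
        [(acc.read_value(), v)
         for acc, v in zip(accum_grads, model.trainable_variables)])
    for acc in accum_grads:
        acc.assign(tf.zeros_like(acc))

A training loop would call accumulate_step on each of the 64 mini-batches and then call apply_step once.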
I found a solution here: https://github.com/tensorflow/tensorflow/issues/3994#event-766328647
opt = tf.train.AdamOptimizer()
tvs = tf.trainable_variables()
# One non-trainable accumulator per trainable variable, initialized to zeros
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
# Ops that reset all accumulators to zero
zero_ops = [av.assign(tf.zeros_like(av)) for av in accum_vars]
# Gradients of the loss (rmse) with respect to the trainable variables
gvs = opt.compute_gradients(rmse, tvs)
# Ops that add each mini-batch gradient to its accumulator
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]
# Op that applies the accumulated gradients to the variables
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])
In the training loop:
while True:
    sess.run(zero_ops)  # reset the gradient accumulators
    for i in range(n_minibatches):
        # accumulate gradients over each mini-batch without updating the weights
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    sess.run(train_step)  # apply the accumulated gradients once
But this code does not look very clean, does anyone know how to tidy it up?
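One possible cleanup, just a sketch along the same lines (it reuses rmse, X, y, Xs, ys, and n_minibatches from above, averages the gradients instead of summing them, and groups the per-variable ops so the loop only runs two ops):

opt = tf.train.AdamOptimizer()
tvs = tf.trainable_variables()

# Pair each trainable variable with its gradient, dropping variables the loss
# does not depend on (their gradient is None)
gvs = [(g, v) for g, v in opt.compute_gradients(rmse, tvs) if g is not None]

# One zero-initialized, non-trainable accumulator per (gradient, variable) pair
accum_vars = [tf.Variable(tf.zeros_like(v.initialized_value()), trainable=False)
              for _, v in gvs]

# Grouped ops: reset the accumulators, add one mini-batch's contribution to the
# running mean, and apply the accumulated (averaged) gradients
zero_op = tf.group(*[av.assign(tf.zeros_like(av)) for av in accum_vars])
accum_op = tf.group(*[av.assign_add(g / n_minibatches)
                      for av, (g, _) in zip(accum_vars, gvs)])
train_step = opt.apply_gradients(
    [(av, v) for av, (_, v) in zip(accum_vars, gvs)])

The training loop stays the same, except that it runs the single zero_op and accum_op instead of the lists of ops.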