I have a question similar to this one.
Because I have limited resources and I work with a deep model (VGG-16) - used to train a triplet network - I want to accumulate the gradients over 128 batches of one training example each, and only then propagate the error and update the weights.
It's not clear to me how to do this. I work with TensorFlow, but any implementation/pseudocode is welcome.
One solution to this problem is gradient accumulation. The idea is to split up the batch into smaller mini-batches which are run sequentially, while accumulating their results. The accumulated results are used to update the model parameters only at the end of the last mini-batch.
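This works because gradients are additive: the gradient of a loss summed over examples equals the sum of the per-example gradients. A quick NumPy sketch (a hypothetical linear model with a summed squared-error loss, all names made up for illustration) shows the equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(128, 4))   # 128 examples, 4 features
y = rng.normal(size=128)
w = rng.normal(size=4)          # current model weights

def grad_sse(X_batch, y_batch, w):
    """Gradient of the summed squared error 0.5 * ||X w - y||^2 w.r.t. w."""
    residual = X_batch @ w - y_batch
    return X_batch.T @ residual

# Gradient computed on the full batch at once...
full_grad = grad_sse(X, y, w)

# ...equals the gradients of 128 size-1 mini-batches accumulated one by one.
accum = np.zeros_like(w)
for i in range(len(X)):
    accum += grad_sse(X[i:i+1], y[i:i+1], w)

assert np.allclose(full_grad, accum)
```

Note that if the loss is a *mean* over the batch rather than a sum, you divide the accumulated gradient by the number of mini-batches before applying it.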
Gradient tapes: TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of the "recorded" computation using reverse-mode differentiation.
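A minimal sketch of the tape mechanism in TensorFlow 2.x - the gradient of y = x² at x = 3 is 2x = 6:

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x                    # recorded on the tape
dy_dx = tape.gradient(y, x)      # reverse-mode differentiation: d(x^2)/dx = 2x
print(float(dy_dx))              # 6.0
```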
Coding the gradient accumulation part is also straightforward in PyTorch: backpropagate the loss at each batch (PyTorch adds new gradients into each parameter's .grad rather than overwriting it) and update the model parameters only after a set number of batches that you choose. We hold off on calling optimizer.step(), which applies the update, for accumulation_steps number of batches.
At its core, TensorFlow is just an optimized library for tensor operations (vectors, matrices, etc.) and the calculus operations used to perform gradient descent on arbitrary sequences of calculations.
Let's walk through the code proposed in one of the answers you linked to:
```python
## Optimizer definition - nothing different from any classical example
opt = tf.train.AdamOptimizer()

## Retrieve all trainable variables you defined in your graph
tvs = tf.trainable_variables()

## Creation of a list of variables with the same shape as the trainable ones,
## initialized with 0s
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False)
              for tv in tvs]
zero_ops = [tv.assign(tf.zeros_like(tv)) for tv in accum_vars]

## Calls the compute_gradients function of the optimizer to obtain the list of gradients
gvs = opt.compute_gradients(rmse, tvs)

## Adds to each element of the list you initialized earlier with zeros its gradient
## (works because accum_vars and gvs are in the same order)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]

## Define the training step (part with variable value update)
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])
```
This first part basically adds new variables and ops to your graph, which will allow you to:

1. Accumulate the gradients with the ops `accum_ops` in (the list of) variables `accum_vars`
2. Update the model weights with the op `train_step`
Then, to use it when training, you have to follow these steps (still from the answer you linked):
```python
## The while loop for training
while ...:
    # Run the zero_ops to initialize the accumulators
    sess.run(zero_ops)
    # Accumulate the gradients 'n_minibatches' times in accum_vars using accum_ops
    for i in xrange(n_minibatches):
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    # Run the train_step ops to update the weights based on your accumulated gradients
    sess.run(train_step)
```
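For completeness, here is one way to express the same accumulate-then-apply loop in TF 2.x eager style with tf.GradientTape - a sketch with placeholder stand-ins for the model, loss, and data (model, loss_fn, Xs, ys, n_minibatches are all made up for illustration):

```python
import tensorflow as tf

# Hypothetical stand-ins for the model and data.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.build((None, 4))
loss_fn = tf.keras.losses.MeanSquaredError()
opt = tf.keras.optimizers.Adam()

n_minibatches = 8
Xs = tf.random.normal((n_minibatches, 1, 4))  # 8 mini-batches of one example
ys = tf.random.normal((n_minibatches, 1, 1))

# Zero-initialized accumulators, one per trainable variable (like accum_vars).
accum = [tf.Variable(tf.zeros_like(v), trainable=False)
         for v in model.trainable_variables]

for i in range(n_minibatches):
    with tf.GradientTape() as tape:
        loss = loss_fn(ys[i], model(Xs[i]))
    grads = tape.gradient(loss, model.trainable_variables)
    for a, g in zip(accum, grads):
        a.assign_add(g)                       # like accum_ops

# Average and apply once (like train_step), then reset (like zero_ops).
opt.apply_gradients([(a / n_minibatches, v)
                     for a, v in zip(accum, model.trainable_variables)])
for a in accum:
    a.assign(tf.zeros_like(a))
```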