I've trained 3 models and am now running code that loads each of the 3 checkpoints in sequence and runs predictions using them. I'm using the GPU.
When the first model is loaded it pre-allocates the entire GPU memory (which I want, for working through the first batch of data), but it doesn't release that memory when it's finished. When the second model is loaded, even using both tf.reset_default_graph() and with tf.Graph().as_default(), the GPU memory is still fully consumed by the first model, and the second model is starved of memory.
Is there a way to resolve this, other than using Python subprocesses or multiprocessing to work around the problem (the only solution I've found via Google searches)?
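Roughly, the loading loop looks like this (the checkpoint paths and the meta-graph restore are illustrative, not my exact code):

import tensorflow as tf

checkpoints = ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]

for ckpt in checkpoints:
    tf.reset_default_graph()
    with tf.Graph().as_default():
        saver = tf.train.import_meta_graph(ckpt + ".meta")
        with tf.Session() as sess:
            saver.restore(sess, ckpt)
            # ... run predictions with this model ...
# Even after each graph and session goes away, the GPU memory
# claimed by the first session is never returned.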
Limiting GPU memory growth: to limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method. In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as needed by the process.
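For example, with the TF 2.x API (the device index is illustrative):

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Make only the first GPU visible to this process
    tf.config.set_visible_devices(gpus[0], 'GPU')
    # Allocate GPU memory on demand instead of grabbing it all upfront;
    # this must be set before any GPU has been initialized
    tf.config.experimental.set_memory_growth(gpus[0], True)

In TF 1.x, the equivalent is to create the session with tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True)). Note that neither setting releases memory that is already allocated; it only stops TensorFlow from reserving everything upfront.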
In PyTorch, a similar "cuda runtime error (2): out of memory" can occur when GPU memory is exhausted. Because PyTorch typically manages large amounts of data, failing to release even small allocations can eventually crash your program once no GPU memory remains available.
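If you hit this in PyTorch, the usual mitigation is to drop the Python references and clear the caching allocator; a minimal sketch (the model here is a placeholder):

import torch

# Placeholder model, just to put some tensors on the GPU
model = torch.nn.Linear(1000, 1000).cuda()
out = model(torch.randn(64, 1000, device="cuda"))

del model, out            # drop the Python references
torch.cuda.empty_cache()  # hand cached blocks back to the driver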
A GitHub issue from June 2016 (https://github.com/tensorflow/tensorflow/issues/1727) describes the underlying problem:
currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down.
Thus the only workaround would be to use processes and shut them down after the computation.
Example Code:
import tensorflow as tf
import multiprocessing
import numpy as np

def run_tensorflow():
    n_input = 10000
    n_classes = 1000

    # Create model: a single linear layer (no activation is applied)
    def multilayer_perceptron(x, weight):
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Store layer weights
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))

    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        for i in range(100):
            batch_x = np.random.rand(10, 10000)
            batch_y = np.random.rand(10, 1000)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print("finished doing stuff with tensorflow!")

if __name__ == "__main__":
    # option 1: execute code in an extra process
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()

    # wait until user presses enter key (check nvidia-smi here)
    input()

    # option 2: just execute the function in this process
    run_tensorflow()

    # wait until user presses enter key
    input()
So if you call run_tensorflow() within a process you created and then shut that process down (option 1), the memory is freed. If you just run run_tensorflow() in the current process (option 2), the memory is not freed after the function call returns.
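Applied to the question above, each checkpoint's prediction run can be wrapped in its own process, so its GPU memory is returned to the system as soon as that process exits. A minimal sketch (the function body and checkpoint paths are placeholders):

import multiprocessing

def predict_with_checkpoint(ckpt_path):
    # Build the graph, restore ckpt_path, and run predictions here;
    # everything allocated on the GPU dies with this process.
    print("predicting with", ckpt_path)

if __name__ == "__main__":
    for ckpt in ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]:
        p = multiprocessing.Process(target=predict_with_checkpoint, args=(ckpt,))
        p.start()
        p.join()  # block until this model's predictions finish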
You can use the numba library to release all of the GPU memory:

pip install numba

from numba import cuda

device = cuda.get_current_device()
device.reset()

This will release all of the memory. Note that device.reset() destroys the current CUDA context, so any framework state in the same process (for example an open TensorFlow session) can no longer use the GPU without reinitializing.