I'm writing a script that sometimes leaks tensors. This can happen in several cases, for example when I'm training a neural network and the training crashes. The interrupted training then never disposes its tensors correctly. This results in a memory leak, which I'm trying to clean up by disposing the unused tensors.
Example
In the snippet below, I'm training two (very simple) models. The first run works and leaks no tensors (the number of tensors before training equals the number after training). The second time, I use an invalid reshape layer to force a crash during training: an error is thrown, and the tensors from the dataset (I guess?) are not disposed correctly. The code is an example of how tensors might be leaked.
async function train(shouldCrash) {
  console.log(`Training, shouldCrash=${shouldCrash}`);
  const dataset = tf.data.zip({ // set up the data
    xs: tf.data.array([[1], [1]]),
    ys: tf.data.array([1]),
  }).batch(1);
  const model = tf.sequential({ // set up the model
    layers: [
      tf.layers.dense({ units: 1, inputShape: [1] }),
      tf.layers.reshape({ targetShape: [shouldCrash ? 2 : 1] }), // use invalid shape when crashing
    ],
  });
  model.compile({ optimizer: 'sgd', loss: 'meanSquaredError' });
  console.log(' Tensors before:', tf.memory().numTensors);
  try {
    await model.fitDataset(dataset, { epochs: 1 });
  } catch (err) {
    console.log(`  Error: ${err.message}`);
  }
  console.log(' Tensors after:', tf.memory().numTensors);
}

(async () => {
  await train(false); // normal training
  await train(true);  // training with error
})();
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]/dist/tf.min.js"></script>
Question
There is tf.tidy, which helps me in some cases to dispose unused tensors, but it can only be used for synchronous functions. Therefore, it cannot wrap a call like await model.fitDataset(...).
Is there a way to dispose any unused tensors? Alternatively, is there a way to dispose all existing tensors on the page (without reloading it)?
The way to clean up any unused tensors in async code is to wrap the code that creates them between a startScope() and an endScope() call:
tf.engine().startScope();
// do your thing
tf.engine().endScope();