TensorFlow uses reverse-mode automatic differentiation (reverse-mode AD), as shown in https://github.com/tensorflow/tensorflow/issues/675.
Reverse-mode AD needs a data structure called a Wengert list; see https://en.wikipedia.org/wiki/Automatic_differentiation#Reverse_accumulation.
However, searching through the TensorFlow repository with the keyword "Wengert List", I get nothing.
Do they use a different name, or did they get rid of the Wengert list? If so, how?
AD terminology is very old. It was coined before Python existed, when things were more complicated. Nowadays you could just use a regular Python list for that purpose.
The implementation of reverse-mode AD is in the gradients function of gradients_impl.py.
The data structure used to store the tape is initialized on line 532, and it is a Python deque (collections.deque) used as a queue:
# Initialize queue with to_ops.
queue = collections.deque()
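To make the role of that queue more concrete, here is a minimal sketch of how a deque can drive a backward pass over a graph. This is not TensorFlow's code: the Node class, the add and mul helpers, and this gradients function are hypothetical names made up for illustration, and the pending-consumer counting is just one simple way to make sure a node is processed only after all of its consumers have contributed their gradients.

```python
import collections


class Node:
    """Hypothetical graph node: a value, its inputs, and d(self)/d(input) per input."""

    def __init__(self, value, inputs=(), local_grads=()):
        self.value = value
        self.inputs = tuple(inputs)
        self.local_grads = tuple(local_grads)


def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))


def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))


def gradients(y):
    """Return {id(node): dy/dnode} for every node reachable from y (illustrative only)."""
    # Count how many edges consume each node, walking backward from y.
    pending = collections.Counter()
    seen = {id(y)}
    stack = [y]
    while stack:
        node = stack.pop()
        for inp in node.inputs:
            pending[id(inp)] += 1
            if id(inp) not in seen:
                seen.add(id(inp))
                stack.append(inp)

    # Initialize the queue with the output node, then propagate gradients.
    grads = {id(y): 1.0}
    queue = collections.deque([y])
    while queue:
        node = queue.popleft()
        g = grads[id(node)]
        for inp, local in zip(node.inputs, node.local_grads):
            grads[id(inp)] = grads.get(id(inp), 0.0) + g * local
            pending[id(inp)] -= 1
            # A node is ready once every consumer has added its contribution.
            if pending[id(inp)] == 0:
                queue.append(inp)
    return grads
```

For y = x * w + x you would build the graph with y = add(mul(x, w), x) and look up gradients(y)[id(x)]. Note that nothing here records execution order; the order of the backward pass comes from the graph structure alone.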
However, searching through the TensorFlow repository with the keyword "Wengert List", I get nothing.
This is because TensorFlow is not a tape-based AD system; it is a graph-based AD system.
A Wengert list would be the tape describing the order in which operations were originally executed.
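For comparison, a tape-based system can be sketched in a few lines, with a regular Python list playing the role of the Wengert list. This is only an illustration under my own assumptions: Var, Tape, and the methods on them are hypothetical names, not taken from any library.

```python
import math


class Var:
    """Hypothetical scalar variable holding a value and an accumulated gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class Tape:
    """Hypothetical Wengert list: operations are appended in execution order."""

    def __init__(self):
        self.entries = []  # a plain Python list is the whole "tape"

    def record(self, value, partials):
        # partials is a list of (input, d(output)/d(input)) pairs.
        out = Var(value)
        self.entries.append((out, partials))
        return out

    def add(self, a, b):
        return self.record(a.value + b.value, [(a, 1.0), (b, 1.0)])

    def mul(self, a, b):
        return self.record(a.value * b.value, [(a, b.value), (b, a.value)])

    def sin(self, a):
        return self.record(math.sin(a.value), [(a, math.cos(a.value))])

    def backward(self, y):
        # Replay the tape in reverse, applying the chain rule at each entry.
        y.grad = 1.0
        for out, partials in reversed(self.entries):
            for inp, local in partials:
                inp.grad += out.grad * local
```

With t = Tape(), x1 = Var(2.0) and x2 = Var(3.0), calling y = t.add(t.mul(x1, x2), t.sin(x1)) followed by t.backward(y) leaves dy/dx1 in x1.grad and dy/dx2 in x2.grad. The backward pass depends only on the recorded execution order, which is exactly what a graph-based system does not need to store.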
There is also AD based on source code transformation; a nice example of such a system is Tangent.
Nowadays almost no one uses a tape (Wengert list) any more. See, for instance, what PyTorch does (page 2).