In several parts of the documentation (e.g. the Dataset Iterators guide) there are references to stateful objects. What exactly are they, and what role do they play in the graph?
To clarify, the Dataset documentation has an example with a one-shot iterator that works because the iterator is stateless:
dataset = tf.data.Dataset.range(100)
iterator = dataset.make_one_shot_iterator()
What makes the iterator stateless?
As others have mentioned, stateful objects are those that hold a state. In TensorFlow terms, a state is some value or data that is preserved between different calls to tf.Session.run. The most common and basic kind of stateful object is the variable. You can call run once to update a model's parameters (which are variables), and they will keep their assigned values for the next call to run. This is different from most operations: for example, an addition operation takes two tensors and outputs a third one, but the output value it computes in one call to run is not saved. Indeed, even if your graph consists only of operations on constant values, those operations will be re-evaluated every time you call run, even though the result is always the same. When you assign a value to a variable, however, it will "stick" (and, by the way, take up the corresponding memory and, if you choose so, be serialized into checkpoints).
Dataset iterators are also stateful. When you get a piece of data in one run it is consumed, and in the next run you get a different piece; the iterator "remembers" where it was between runs. That is why, similarly to how you initialize variables, you can initialize iterators (when they support it) to reset them back to a known state.
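For example, a sketch with an initializable iterator (TensorFlow 1.x API):

import tensorflow as tf  # TensorFlow 1.x API

dataset = tf.data.Dataset.range(100)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    sess.run(iterator.initializer)
    print(sess.run(next_element))   # 0
    print(sess.run(next_element))   # 1: the iterator remembered its position
    sess.run(iterator.initializer)  # reset the iterator's state
    print(sess.run(next_element))   # 0 again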
Technically speaking, another kind of stateful object is the random operation. One usually regards random operations as, well, random, but in reality they hold a random number generator that does have a state, and that state is kept between runs; if you provide a seed, they will be in a well-defined state when you start the session. However, as far as I know, there is no way to reset random operations to their initial state within the same session.
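A small sketch of that behavior, assuming an op-level seed (TensorFlow 1.x API):

import tensorflow as tf  # TensorFlow 1.x API

sample = tf.random_uniform([], seed=42)  # seeded random op

with tf.Session() as sess:
    a = sess.run(sample)  # first draw
    b = sess.run(sample)  # a different draw: the generator state advanced

with tf.Session() as sess:
    c = sess.run(sample)  # equal to `a`: a fresh session restarts the seeded state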
Note that the term "stateful" is frequently used (when one is not referring to TensorFlow in particular) in a slightly different sense, or at a different level of abstraction. For example, recurrent neural networks (RNNs) are generally said to be stateful because, conceptually, they have an internal state that changes with every input they receive. When you build an RNN in TensorFlow, however, that internal state does not necessarily have to live in a stateful object! Like any other kind of neural network, an RNN in TensorFlow will have some parameters or weights, typically stored in trainable variables; so, in TensorFlow terms, all trainable models, RNN or not, have stateful objects for their trained parameters. The internal state of the RNN, however, is represented with an input state value and an output state value that you get on each run (see tf.nn.dynamic_rnn), and you can simply start with a "zero" state on each run and discard the final output state. Of course, if you want, you can also take the input state from a variable and write the output state back to that variable; then your RNN's internal state will be "stateful" in the TensorFlow sense. That is, you would be able to process some data in one run and "pick up where you left off" in the next run (which may or may not make sense depending on the case), as sketched below. I understand this can be a bit confusing, but I hope it makes sense.
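A rough sketch of that variable-backed approach (TensorFlow 1.x API; the shapes and names are illustrative, and this is just one way to wire it up):

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

batch_size, num_steps, input_dim, num_units = 4, 10, 8, 16
inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])
cell = tf.nn.rnn_cell.BasicRNNCell(num_units)

# Keep the RNN's internal state in a non-trainable variable so it
# survives between run() calls, i.e. it becomes "stateful" in TF terms.
state_var = tf.Variable(tf.zeros([batch_size, num_units]), trainable=False)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=state_var)
update_state = tf.assign(state_var, final_state)  # write the output state back

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.zeros([batch_size, num_steps, input_dim], dtype=np.float32)
    # Each run starts from the state left behind by the previous run.
    sess.run([outputs, update_state], feed_dict={inputs: batch})
    sess.run([outputs, update_state], feed_dict={inputs: batch})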