
Should I use @tf.function for all functions?

An official tutorial on @tf.function says:

To get peak performance and to make your model deployable anywhere, use tf.function to make graphs out of your programs. Thanks to AutoGraph, a surprising amount of Python code just works with tf.function, but there are still pitfalls to be wary of.

The main takeaways and recommendations are:

  • Don't rely on Python side effects like object mutation or list appends.
  • tf.function works best with TensorFlow ops, rather than NumPy ops or Python primitives.
  • When in doubt, use the for x in y idiom.

However, it only explains how to implement @tf.function-annotated functions, not when to use the annotation.

Is there a heuristic for deciding whether I should at least try to annotate a function with tf.function? It seems that there is no reason not to do it, unless I am too lazy to remove side effects or change some things like range() -> tf.range(). But if I am willing to do this...
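For example, a minimal sketch of the kind of change I mean (the function here is a made-up toy):

import tensorflow as tf

@tf.function
def cumulative_sum(n):
    total = tf.constant(0)
    for i in tf.range(n):  # tf.range instead of Python's built-in range()
        total += i
    return total

print(cumulative_sum(tf.constant(5)))  # => tf.Tensor(10, shape=(), dtype=int32)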

Is there any reason not to use @tf.function for all functions?

asked Jan 21 '20 by problemofficer - n.f. Monica




3 Answers

TLDR: It depends on your function and whether you are in production or development. Don't use tf.function if you want to be able to debug your function easily, or if it falls under the limitations of AutoGraph or tf.v1 code compatibility. I would highly recommend watching the Inside TensorFlow talks about AutoGraph and Functions, not Sessions.

In the following I'll break down the reasons, which are all taken from information made available online by Google.

In general, the tf.function decorator causes a function to be compiled as a callable that executes a TensorFlow graph. This entails:

  • Conversion of the code through AutoGraph if required (including any functions called from an annotated function)
  • Tracing and executing the generated graph code

There is detailed information available on the design ideas behind this.
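To illustrate the tracing step, here is a minimal sketch (function and values are illustrative): the Python print call is a side effect, so it only fires while a new graph is being traced, not on later calls that reuse an existing graph.

import tensorflow as tf

@tf.function
def double(x):
    print("Tracing with", x)  # Python side effect: runs only during tracing
    return x * 2

print(double(tf.constant(1)))    # first int32 call: traces the graph and prints "Tracing with ..."
print(double(tf.constant(2)))    # same input signature: reuses the graph, no "Tracing" output
print(double(tf.constant(1.5)))  # new signature (float32): triggers a retrace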

Benefits of decorating a function with tf.function

General benefits

  • Faster execution, especially if the function consists of many small ops (Source); see the sketch below
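As a rough illustration (the function and numbers are illustrative, not from the answer), a micro-benchmark along these lines typically shows the graph version ahead when the work consists of many tiny ops:

import timeit
import tensorflow as tf

def many_small_ops(x):
    for _ in range(100):
        x = x + 1  # many tiny ops: graph execution avoids per-op Python overhead
    return x

graph_fn = tf.function(many_small_ops)
x = tf.constant(0)
graph_fn(x)  # warm-up call so tracing time is not measured below

print("eager:", timeit.timeit(lambda: many_small_ops(x), number=100))
print("graph:", timeit.timeit(lambda: graph_fn(x), number=100))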

For functions with Python code / Using AutoGraph via tf.function decoration

If you want to use AutoGraph, using tf.function is highly recommended over calling AutoGraph directly. Reasons for this include automatic control dependencies, the fact that some APIs require it, more caching, and exception helpers (Source).
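As a sketch of what AutoGraph does under tf.function (the example is adapted to be self-contained; tf.autograph.to_code lets you inspect the generated code):

import tensorflow as tf

@tf.function
def sum_even(items):
    s = 0
    for c in items:    # AutoGraph rewrites this loop into graph control flow
        if c % 2 > 0:  # ...and this branch into a graph conditional
            continue
        s += c
    return s

print(sum_even(tf.constant([10, 12, 15, 20])))  # => tf.Tensor(42, ...)
# Inspect the code AutoGraph generated from the original Python source:
print(tf.autograph.to_code(sum_even.python_function))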

Drawbacks of decorating a function with tf.function

General drawbacks

  • If the function only consists of a few expensive ops, there will not be much speedup (Source); see the sketch below
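A sketch of this case (illustrative, not from the answer): one large op dominates, so both modes spend their time in the same kernel and the timings come out close:

import timeit
import tensorflow as tf

def one_big_op(x):
    return tf.matmul(x, x)  # a single expensive op dominates the runtime

graph_fn = tf.function(one_big_op)
x = tf.random.normal([500, 500])
graph_fn(x)  # warm-up call so tracing time is not measured below

print("eager:", timeit.timeit(lambda: one_big_op(x), number=100))
print("graph:", timeit.timeit(lambda: graph_fn(x), number=100))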

For functions with Python code / Using AutoGraph via tf.function decoration

  • No exception catching (should be done in eager mode; outside of the decorated function) (Source)
  • Debugging is much harder
  • Limitations due to hidden side effects and TF control flow

Detailed information on AutoGraph limitations is available.
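For the debugging drawback, one common mitigation (assuming a recent TF 2.x version) is tf.config.run_functions_eagerly, which makes decorated functions run eagerly without removing the decorators; a minimal sketch:

import tensorflow as tf

@tf.function
def f(x):
    # Breakpoints / pdb in here only run while tracing;
    # force eager mode to step through the real computation.
    return x * x + 1

tf.config.run_functions_eagerly(True)   # decorated functions now run eagerly
print(f(tf.constant(3)))                # debuggable like plain Python
tf.config.run_functions_eagerly(False)  # restore graph execution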

For functions with tf.v1 code

  • It is not allowed to create variables more than once in tf.function, but this is subject to change as tf.v1 code is phased out (Source)

For functions with tf.v2 code

  • No specific drawbacks

Examples of limitations

Creating variables more than once

It is not allowed to create variables more than once, such as v in the following example:

@tf.function
def f(x):
    v = tf.Variable(1)
    return tf.add(x, v)

f(tf.constant(2))
# => ValueError: tf.function-decorated function tried to create variables on non-first call.

In the following code, this is mitigated by making sure that self.v is only created once:

class C(object):
    def __init__(self):
        self.v = None

    @tf.function
    def f(self, x):
        if self.v is None:
            self.v = tf.Variable(1)
        return tf.add(x, self.v)

c = C()
print(c.f(tf.constant(2)))
# => tf.Tensor(3, shape=(), dtype=int32)
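Another way to satisfy this constraint (a sketch, not from the original answer) is to create the variable once, outside the decorated function:

import tensorflow as tf

v = tf.Variable(1)  # created exactly once, outside the traced function

@tf.function
def f(x):
    return tf.add(x, v)  # the function only reads the existing variable

print(f(tf.constant(2)))  # => tf.Tensor(3, shape=(), dtype=int32)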

Hidden side effects not captured by AutoGraph

Hidden mutations, such as the change to self.a in this example, lead to an error, since cross-function analysis is not done (yet) (Source):

class C(object):
    def change_state(self):
        self.a += 1

    @tf.function
    def f(self):
        self.a = tf.constant(0)
        if tf.constant(True):
            self.change_state()  # Mutation of self.a is hidden
        tf.print(self.a)

x = C()
x.f()
# => InaccessibleTensorError: The tensor 'Tensor("add:0", shape=(), dtype=int32)' cannot be accessed here: it is defined in another function or code block. Use return values, explicit Python locals or TensorFlow collections to access it. Defined in: FuncGraph(name=cond_true_5, id=5477800528); accessed from: FuncGraph(name=f, id=5476093776).

Changes in plain sight are no problem:

class C(object):
    @tf.function
    def f(self):
        self.a = tf.constant(0)
        if tf.constant(True):
            self.a += 1  # Mutation of self.a is in plain sight
        tf.print(self.a)

x = C()
x.f()
# => 1

Example of limitation due to TF control flow

This if statement leads to an error because, for TF control flow, the else branch must also produce a value:

@tf.function
def f(a, b):
    if tf.greater(a, b):
        return tf.constant(1)
    # if a <= b, None would be returned

x = f(tf.constant(3), tf.constant(2))
# => ValueError: A value must also be returned from the else branch. If a value is returned from one branch of a conditional a value must be returned from all branches.
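A sketch of the fix: give the conditional an explicit else branch so both paths produce a value:

import tensorflow as tf

@tf.function
def f(a, b):
    if tf.greater(a, b):
        return tf.constant(1)
    else:
        return tf.constant(0)  # both branches now return a value

print(f(tf.constant(3), tf.constant(2)))  # => tf.Tensor(1, shape=(), dtype=int32)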
answered Oct 05 '22 by prouast


tf.function is useful for creating and using computational graphs; it should be used in training and in deployment. However, it isn't needed for most of your functions.

Let's say that we are building a special layer that will be a part of a larger model. We would not want to have the tf.function decorator above the function that constructs that layer, because it is merely a definition of what the layer will look like.

On the other hand, let's say that we are going to either make a prediction or continue our training using some function. We would want to have the tf.function decorator, because we are actually using the computational graph to get some value.

A great example would be constructing an encoder-decoder model. DON'T put the decorator around the functions that create the encoder or decoder or any layer; that is only a definition of what they will do. DO put the decorator around the "train" or "predict" method, because those actually use the computational graph for computation.
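A minimal sketch of this split (the model and names here are illustrative, not from the answer): plain Python builds the pieces, and only the hot path that runs the computation gets the decorator:

import tensorflow as tf

# Construction: plain Python, no decorator needed.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

# Computation: this is the method that should be decorated.
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x = tf.random.normal([32, 4])
y = tf.random.normal([32, 1])
print(train_step(x, y))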

answered Oct 05 '22 by Drew


Per my understanding and according to the documentation, using tf.function is highly recommended mainly for speeding up your code, since the code wrapped by tf.function is converted to a graph, and therefore there is room for optimizations (e.g. op pruning, folding, etc.) that may not be performed when the same code is run eagerly.

However, there are also a few cases where using tf.function might incur additional overhead or does not result in noticeable speedups. One notable case is when the wrapped function is small and only used a few times in your code and therefore the overhead of calling the graph might be relatively large. Another case is when most of the computations are already done on an accelerator device (e.g. GPU, TPU), and therefore the speedups gained by graph computation might not be significant.

There is also a section in the documentation where the speedups are discussed in various scenarios, and at the beginning of this section the two cases above have been mentioned:

Just wrapping a tensor-using function in tf.function does not automatically speed up your code. For small functions called a few times on a single machine, the overhead of calling a graph or graph fragment may dominate runtime. Also, if most of the computation was already happening on an accelerator, such as stacks of GPU-heavy convolutions, the graph speedup won't be large.

For complicated computations, graphs can provide a significant speedup. This is because graphs reduce the Python-to-device communication and perform some speedups.

But at the end of the day, if it's applicable to your workflow, I think the best way to determine this for your specific use case and environment is to profile your code when it gets executed in eager mode (i.e. without using tf.function) vs. when it gets executed in graph mode (i.e. using tf.function extensively).
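A minimal sketch of such a comparison (the workload is a stand-in for your own code; tf.config.run_functions_eagerly toggles decorated functions between the two modes without touching the decorators):

import timeit
import tensorflow as tf

@tf.function
def step(x):  # stand-in for your decorated workload
    return tf.reduce_sum(tf.tanh(x) * x)

x = tf.random.normal([256, 256])

step(x)  # warm-up so tracing is not measured
graph_time = timeit.timeit(lambda: step(x), number=1000)

tf.config.run_functions_eagerly(True)   # force eager execution globally
eager_time = timeit.timeit(lambda: step(x), number=1000)
tf.config.run_functions_eagerly(False)  # restore graph execution

print("eager:", eager_time, "graph:", graph_time)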

answered Oct 05 '22 by today