tf.variable_scope has a partitioner parameter, as mentioned in the documentation.
As I understand it, it's used for distributed training. Can anyone explain in more detail what the correct use of it is?
Huge TensorFlow variables can be sharded across several machines (see this discussion). The partitioner is the mechanism through which TensorFlow distributes the shards and assembles them back, so that the rest of the program doesn't know about these implementation details and works with tensors in the usual way.
You can specify a partitioner for a single variable via tf.get_variable:
If a partitioner is provided, a PartitionedVariable is returned. Accessing this object as a Tensor returns the shards concatenated along the partition axis.
Or you can define a default partitioner for a whole scope via tf.variable_scope, which will affect all variables defined in it.
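For example, a scope-level partitioner can be set like this (a minimal sketch; the scope name, shapes, and shard count are made-up illustration values):
import tensorflow as tf

# Every variable created inside this scope inherits the scope's partitioner:
# here each one is split into 3 shards along axis 0.
with tf.variable_scope("layer", partitioner=tf.fixed_size_partitioner(3, axis=0)):
    w = tf.get_variable("w", shape=[30, 10])  # stored as 3 shards of shape [10, 10]
    b = tf.get_variable("b", shape=[30])      # also stored as 3 shards of shape [10]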
See the list of available partitioners in TensorFlow 1.3 on this page. The simplest one is tf.fixed_size_partitioner, which shards the tensor along the specified axis. Here's an example usage (from this question):
import tensorflow as tf

num_shards, weights_shape = 4, [1024, 256]  # example values
w = tf.get_variable("weights",
                    weights_shape,
                    partitioner=tf.fixed_size_partitioner(num_shards, axis=0),
                    initializer=tf.truncated_normal_initializer(stddev=0.1))
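Assuming the snippet above has run (so w, weights_shape and num_shards are defined), the returned PartitionedVariable still behaves like an ordinary tensor; the comments below just restate what the quoted documentation says about concatenation along the partition axis:
print(len(list(w)))                 # num_shards underlying shard variables
print(w.get_shape())                # full logical shape, i.e. weights_shape
x = tf.ones([3, weights_shape[0]])
y = tf.matmul(x, w)                 # shards are concatenated along axis 0 on access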