Is there a way to pad a tensor of variable size to a given shape with a specific pad value? For example, given the tensors:
[[1, 2],
[3, 4]]
and
[[1, 2, 3],
[4, 5, 6]]
Is there a way to have a generic operation which would take either and pad them with a value (say, to shape [2, 4] with value -1) to result in:
[[1, 2, -1, -1],
[3, 4, -1, -1]]
and
[[1, 2, 3, -1],
[4, 5, 6, -1]]
respectively? My reasoning (in case there is a better solution) is that I have examples from a TFRecords file, parts of which have a variable length. For processing, a static length makes them easier to work with.
Yes, there is. Provided you do not need to change the rank of the tensor, it's very simple: tf.pad() accepts regular Python lists containing tensors. The padding is specified as a list of pairs, one per dimension, giving how much to pad before and after that dimension.
e.g.
import tensorflow as tf

t = tf.constant([[1, 2], [3, 4]])
# pad nothing on dimension 0; pad dimension 1 up to a length of 4
paddings = [[0, 0], [0, 4 - tf.shape(t)[1]]]
out = tf.pad(t, paddings, 'CONSTANT', constant_values=-1)

sess = tf.Session()
sess.run(out)
# gives:
# array([[ 1,  2, -1, -1],
#        [ 3,  4, -1, -1]], dtype=int32)
If you want to generalise this to a useful function, you could do something like:
def pad_up_to(t, max_in_dims, constant_values):
    s = tf.shape(t)
    paddings = [[0, m - s[i]] for (i, m) in enumerate(max_in_dims)]
    return tf.pad(t, paddings, 'CONSTANT', constant_values=constant_values)
where max_in_dims is essentially the desired shape of the output. Note: this function will fail if you provide a target shape that is strictly smaller than t in any dimension.
You can use it like:
t = tf.constant([[1, 2], [3, 4]]) # shape = [2, 2]
t_padded = pad_up_to(t, [2, 4], -1) # shape = [2, 4], padded with -1s
or
import numpy as np

t = tf.placeholder(tf.float32, [None, None])  # shape = [?, ?]
t_padded = pad_up_to(t, [5, 5], -1)           # shape = [5, 5], padded with -1s

t_np = np.random.uniform(0, 1, [3, 4])        # shape = [3, 4], raw (unpadded) input
t_padded_out = sess.run(t_padded, {t: t_np})

t_np2 = np.random.uniform(0, 1, [2, 1])       # shape = [2, 1], raw (unpadded) input
t_padded_out2 = sess.run(t_padded, {t: t_np2})
Although the dimension sizes are calculated dynamically, the number of dimensions is not, so make sure that max_in_dims has the same number of elements as t.shape.
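If you want to sidestep the failure mode noted above without a conditional, one possible variant (my own sketch, not part of the original answer) is to clamp each pad amount at zero with tf.maximum, so dimensions that already meet or exceed the target are simply left unpadded:

def pad_up_to_clamped(t, max_in_dims, constant_values):
    s = tf.shape(t)
    # tf.maximum keeps each pad amount non-negative, so oversized dimensions
    # are left untouched instead of causing tf.pad to raise an error
    paddings = [[0, tf.maximum(m - s[i], 0)] for (i, m) in enumerate(max_in_dims)]
    return tf.pad(t, paddings, 'CONSTANT', constant_values=constant_values)

Note that this only skips padding; it does not crop inputs that are larger than max_in_dims.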
An extension of Multihunter's solution so that padding is only performed when necessary and does not yield an error for longer inputs:
Suppose we have a sequential input called inp_seq, which is a tensor of rank 4 that should be padded so it has a minimum length of filter_size in dimension 1.
def dynamic_padding(inp, min_size):
    # pad_size is computed at graph execution time, from the runtime shape
    pad_size = min_size - tf.shape(inp)[1]
    paddings = [[0, 0], [0, pad_size], [0, 0], [0, 0]]
    return tf.pad(inp, paddings)  # pads with zeros by default
# Pad only if necessary
padded = tf.cond(tf.less(tf.shape(inp_seq)[1], filter_size),
                 true_fn=lambda: dynamic_padding(inp_seq, filter_size),
                 false_fn=lambda: inp_seq)
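For completeness, here is one possible way to run the snippet above end to end. The placeholder shape and the filter_size value of 5 are illustrative assumptions on my part, not part of the original answer:

import numpy as np
import tensorflow as tf

filter_size = 5  # assumed minimum length along dimension 1
inp_seq = tf.placeholder(tf.float32, [None, None, None, None])  # rank-4 input
padded = tf.cond(tf.less(tf.shape(inp_seq)[1], filter_size),
                 true_fn=lambda: dynamic_padding(inp_seq, filter_size),
                 false_fn=lambda: inp_seq)

with tf.Session() as sess:
    short_seq = np.zeros([1, 3, 2, 2], np.float32)  # length 3 < 5, gets padded
    long_seq = np.zeros([1, 8, 2, 2], np.float32)   # length 8 >= 5, left as-is
    print(sess.run(padded, {inp_seq: short_seq}).shape)  # (1, 5, 2, 2)
    print(sess.run(padded, {inp_seq: long_seq}).shape)   # (1, 8, 2, 2)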