Apply boolean mask to last two dimensions of tensor in TensorFlow

Question

I'm in the process of porting a bunch of NumPy calculations over to TensorFlow. At one stage in my calculations, I use a boolean mask to extract and flatten a subset of values from a large array. The array can have many dimensions, but the boolean mask acts only on the last two dimensions. In NumPy, it looks something like this:

mask = np.array([
    [False, True , True , True ],
    [True , False, True , True ],
    [True , True , False, False],
    [True , True , False, False]])

large_array_masked = large_array[..., mask]
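
To make the target behaviour concrete, here is a minimal runnable sketch of the NumPy indexing. The shape (2, 3, 4, 4) and the values of large_array are assumptions made up for illustration; only the mask comes from the question.

```python
import numpy as np

# Hypothetical input: two leading dimensions followed by a 4x4 block.
large_array = np.arange(2 * 3 * 4 * 4).reshape(2, 3, 4, 4)

mask = np.array([
    [False, True,  True,  True ],
    [True,  False, True,  True ],
    [True,  True,  False, False],
    [True,  True,  False, False]])

# The two masked axes collapse into a single axis with one entry
# per True value in the mask (10 here), in row-major order.
large_array_masked = large_array[..., mask]
print(large_array_masked.shape)  # (2, 3, 10)
```

The leading dimensions are untouched, so any TensorFlow equivalent should produce the same (..., 10) shape.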

I can't figure out how to do the equivalent of this in TensorFlow. I tried:

tf.boolean_mask(large_array, mask, axis = -2)

That doesn't work because tf.boolean_mask() doesn't seem to take negative axis arguments.

As an ugly hack, I tried forcing mask to broadcast to the same shape as large_array using:

mask_broadcast = tf.logical_and(tf.fill(tf.shape(large_array), True), mask)
large_array_masked = tf.boolean_mask(large_array, mask_broadcast)

It appears that mask_broadcast has the shape and value that I want, but I get the error:

ValueError: Number of mask dimensions must be specified, even if some dimensions are None

Presumably this happens because large_array is calculated from inputs and therefore its shape is not static.

Any suggestions?

Sorin · Accepted Answer

In general, I've found that in TensorFlow you want statically known shapes. This is because most operations are matrix multiplications, and the matrices involved have fixed shapes.

If you really want to do this, you need to convert to a sparse tensor and then apply tf.sparse_retain (named tf.sparse.retain in TensorFlow 2.x).
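
One possible reading of the sparse suggestion, sketched under assumptions: TensorFlow 2.x names (tf.sparse.from_dense, tf.sparse.retain), a made-up input of shape (2, 3, 4, 4), and entries that are all nonzero, since tf.sparse.from_dense drops zeros. This is an illustration, not necessarily what the answer had in mind.

```python
import tensorflow as tf

# Hypothetical input; the +1 keeps every entry nonzero because
# tf.sparse.from_dense() does not store zeros.
large_array = tf.reshape(
    tf.range(2 * 3 * 4 * 4, dtype=tf.float32) + 1.0, (2, 3, 4, 4))

mask = tf.constant([
    [False, True,  True,  True ],
    [True,  False, True,  True ],
    [True,  True,  False, False],
    [True,  True,  False, False]])

sparse = tf.sparse.from_dense(large_array)

# For each stored entry, look up the mask value at its last two indices.
to_retain = tf.gather_nd(mask, sparse.indices[:, -2:])

# Keep only the entries whose mask value is True.
sparse_masked = tf.sparse.retain(sparse, to_retain)
print(sparse_masked.values.shape)  # (60,): 10 kept values per 4x4 block
```

The result keeps the original dense shape; the kept values can be read from sparse_masked.values or converted back with tf.sparse.to_dense.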

The "equivalent" I would normally use in TensorFlow is to multiply large_array by the mask to zero out the False values (large_array_masked = large_array * tf.cast(mask, large_array.dtype); the boolean mask has to be cast to a numeric type first). This keeps the original shape, which makes it easier to pass to dense layers, etc.
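
A minimal sketch of the zero-out approach, assuming TensorFlow 2.x and a made-up float input of shape (2, 3, 4, 4). A boolean tensor generally cannot be multiplied with a float tensor directly, so the mask is cast first; tf.where is shown as an alternative that avoids the cast.

```python
import tensorflow as tf

# Hypothetical input, for illustration only.
large_array = tf.reshape(
    tf.range(2 * 3 * 4 * 4, dtype=tf.float32), (2, 3, 4, 4))

mask = tf.constant([
    [False, True,  True,  True ],
    [True,  False, True,  True ],
    [True,  True,  False, False],
    [True,  True,  False, False]])

# Cast the mask to the array's dtype; it then broadcasts over the
# leading dimensions and zeroes the entries where the mask is False.
large_array_zeroed = large_array * tf.cast(mask, large_array.dtype)

# Alternative: pick values where the mask is True, zeros elsewhere.
large_array_zeroed_alt = tf.where(mask, large_array, tf.zeros_like(large_array))

print(large_array_zeroed.shape)  # (2, 3, 4, 4)
```

Unlike the NumPy indexing in the question, this does not drop or flatten anything, so downstream code has to tolerate the zeros.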

tcquinn · Answer

I came up with a hack that solves my narrow problem, so I'm posting it here, but I'm accepting the answer from @Sorin because it's probably more generally applicable.

To get around the fact that tf.boolean_mask() can only act on the initial indices, I just rolled the indices forward, applied the mask, and then rolled them back. In simplified form, it looks like this:

# Build a permutation that moves the last two axes to the front.
indices = tf.range(tf.rank(large_array))
large_array_rolled_forward = tf.transpose(
    large_array,
    tf.concat([indices[-2:], indices[:-2]], axis=0))
# The 2-D mask now lines up with the two leading axes.
large_array_rolled_forward_masked = tf.boolean_mask(
    large_array_rolled_forward,
    mask)
# The masked axes have collapsed into axis 0; move that axis to the end.
new_indices = tf.range(tf.rank(large_array_rolled_forward_masked))
large_array_masked = tf.transpose(
    large_array_rolled_forward_masked,
    tf.concat([new_indices[1:], [0]], axis=0))
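
As a sanity check, the roll-forward approach can be wrapped in a function and compared against the NumPy indexing from the question. This is a sketch assuming TensorFlow 2.x in eager mode; the helper name mask_last_two_dims and the input shape are hypothetical. The last comparison tries passing a non-negative axis to tf.boolean_mask(), which assumes the rank is known statically and is worth verifying on your TensorFlow version.

```python
import numpy as np
import tensorflow as tf


def mask_last_two_dims(tensor, mask):
    """Apply a 2-D boolean mask to the last two axes (hypothetical helper)."""
    indices = tf.range(tf.rank(tensor))
    # Move the last two axes to the front.
    rolled = tf.transpose(
        tensor, tf.concat([indices[-2:], indices[:-2]], axis=0))
    # Mask the two leading axes; they collapse into axis 0.
    rolled_masked = tf.boolean_mask(rolled, mask)
    new_indices = tf.range(tf.rank(rolled_masked))
    # Move the collapsed axis back to the end.
    return tf.transpose(
        rolled_masked, tf.concat([new_indices[1:], [0]], axis=0))


# Hypothetical input, for illustration only.
large_array = tf.reshape(tf.range(2 * 3 * 4 * 4), (2, 3, 4, 4))
mask = tf.constant([
    [False, True,  True,  True ],
    [True,  False, True,  True ],
    [True,  True,  False, False],
    [True,  True,  False, False]])

rolled_result = mask_last_two_dims(large_array, mask)
numpy_result = large_array.numpy()[..., mask.numpy()]

# Assumption: static rank, so the negative axis can be made non-negative.
axis_result = tf.boolean_mask(
    large_array, mask, axis=large_array.shape.rank - 2)

print(np.array_equal(rolled_result.numpy(), numpy_result))
print(np.array_equal(axis_result.numpy(), numpy_result))
```

If both comparisons print True on your setup, the non-negative axis form may be the simpler option whenever the rank is static; the transpose version remains useful when it is not.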

Tags: python, numpy, tensorflow
