When using Dataset map operations, is it possible to specify that any 'rows' where the map invocation results in an error are quietly filtered out, rather than having the error bubble up and kill the whole session?
I have an input pipeline set up that (more or less) does the following:
tf.image.crop_to_bounding_box
My issue is that there are (very rare) instances where my suggested bounding boxes fall outside the bounds of a given image, so (understandably) tf.image.crop_to_bounding_box throws an error along these lines:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [width must be >= target + offset.]
which kills the session.
I'd prefer it if these errors were simply ignored and that the pipeline moved onto the next combination.
(I understand that the correct fix for this specific issue would be to spend the time checking, in a step beforehand, that each bounding box actually fits within its image's dimensions, and to drop the bad ones with a filter operation before they ever reach the map with the cropping operation; a rough sketch of that is below. I was wondering whether there is an easy way to just ignore an error and move on to the next case, both for ease of implementation in this specific case and for more general cases.)
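A minimal sketch of that filter-based fix, assuming each dataset element is an image plus four box scalars (the helper names and element structure here are just placeholders, not my real pipeline):

import tensorflow as tf

def box_fits(image, offset_height, offset_width, target_height, target_width):
    # Keep only elements whose bounding box lies fully inside the image.
    limit = tf.cast(tf.shape(image), offset_height.dtype)
    return tf.logical_and(offset_height + target_height <= limit[0],
                          offset_width + target_width <= limit[1])

def crop(image, offset_height, offset_width, target_height, target_width):
    return tf.image.crop_to_bounding_box(
        image, offset_height, offset_width, target_height, target_width)

dataset = dataset.filter(box_fits).map(crop)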
For TensorFlow 2:
dataset = dataset.apply(tf.data.experimental.ignore_errors())
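A minimal end-to-end sketch of that (the toy image sizes and crop parameters below are made up purely for illustration):

import tensorflow as tf

# Toy input: one image large enough for the crop, one that is too small.
sizes = tf.data.Dataset.from_tensor_slices([64, 16])
images = sizes.map(lambda s: tf.zeros([s, s, 3]))

def crop(image):
    # Raises InvalidArgumentError on the 16x16 image because the box falls outside it.
    return tf.image.crop_to_bounding_box(image, 0, 0, 32, 32)

dataset = images.map(crop)
dataset = dataset.apply(tf.data.experimental.ignore_errors())

for cropped in dataset:
    print(cropped.shape)  # only the valid crop, (32, 32, 3), comes through

Newer releases also expose this directly as dataset.ignore_errors().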
There is tf.contrib.data.ignore_errors (TensorFlow 1.x). I've never tried it myself, but according to the docs the usage is simply:
dataset = dataset.map(some_map_function)
dataset = dataset.apply(tf.contrib.data.ignore_errors())
It should simply pass through its inputs (i.e. return the same dataset) but drop any elements that throw an error.