I'm having trouble locating some problem images in a dataset.
My model starts training, but I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid PNG data, size 135347
[[{{node case/cond/cond_jpeg/decode_image/cond_jpeg/cond_png/DecodePng}} = DecodePng[channels=3, dtype=DT_UINT8, _device="/device:CPU:0"](case/cond/cond_jpeg/decode_image/cond_jpeg/cond_png/cond_gif/DecodeGif/Switch:1, ^case/Assert/AssertGuard/Merge)]]
[[node IteratorGetNext (defined at object_detection/model_main.py:105) = IteratorGetNext[output_shapes=[[24], [24,300,300,3], [24,2], [24,3], [24,100], [24,100,4], [24,100,2], [24,100,2], [24,100], [24,100], [24,100], [24]], output_types=[DT_INT32, DT_FLOAT, DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_BOOL, DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](IteratorV2)]]
So I've written a small script that runs before I generate my TFRecords to try and catch any problem images. This is basically the tutorial code but with a batch size of 1. This was the simplest way I could think of to try and catch the error.
import tensorflow as tf
from tqdm import tqdm

# `images` is a list of image file paths.

def preprocess_image(image):
    image = tf.image.decode_png(image, channels=3)
    image = tf.image.resize_images(image, [192, 192])
    image /= 255.0  # normalize to [0,1] range
    return image

def load_and_preprocess_image(path):
    image = tf.read_file(path)
    return preprocess_image(image)

mobile_net = tf.keras.applications.MobileNetV2(input_shape=(192, 192, 3), include_top=False)
mobile_net.trainable = False

path_ds = tf.data.Dataset.from_tensor_slices(images)
image_ds = path_ds.map(load_and_preprocess_image, num_parallel_calls=4)

def change_range(image):
    return 2 * image - 1

keras_ds = image_ds.map(change_range)
keras_ds = keras_ds.batch(1)

for i, batch in tqdm(enumerate(iter(keras_ds))):
    try:
        feature_map_batch = mobile_net(batch)
    except KeyboardInterrupt:
        break
    except:
        print(images[i])
This duly crashes on the bad image, but my except block doesn't handle the exception: it propagates and the script dies. So two questions:
I've isolated an image that fails, but OpenCV, SciPy, Matplotlib and Skimage all open it. For example, I've tried this:
import scipy.misc
images = images[1258:]
print(scipy.misc.imread(images[0]))
import matplotlib.pyplot as plt
print(plt.imread(images[0]))
import cv2
print(cv2.imread(images[0]))
import skimage.io
print(skimage.io.imread(images[0]))
# ... then try to run inference in TensorFlow
I get four matrices printed out. I assume these libraries are all using libpng or something similar.
Image 1258 then crashes TensorFlow. Looking at the DecodePng source, it looks like the crash is actually happening inside TF's own PNG library.
I realise I could probably write my own dataloader, but that seems like a faff.
EDIT:
This also works as a snippet:
import tensorflow as tf

tf.enable_eager_execution()

for i, image in enumerate(images):
    try:
        with tf.gfile.GFile(image, 'rb') as fid:
            image_data = fid.read()
        image_tensor = tf.image.decode_png(
            image_data,
            channels=3,
            name=None
        )
    except Exception:
        # image_tensor is not valid when decoding fails, so report the path
        print("Failed: ", i, image)
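A possible explanation for the uncaught exception in the first loop, offered as a sketch and not a confirmed diagnosis: there the decode runs when the iterator produces the next batch, which happens in the `for` statement itself, outside the `try` block. Calling `next()` inside the `try` lets the handler see the error. The helper name `iterate_catching` is hypothetical, and the sketch assumes the iterator can move on to the following element after one fails.

```python
def iterate_catching(iterable, on_error):
    """Yield (index, item) pairs, reporting items whose retrieval raises.

    Hypothetical helper: next() is called inside the try block, so errors
    raised while producing an element (for example while decoding an image)
    reach the handler instead of escaping from the for statement.
    """
    iterator = iter(iterable)
    index = 0
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return
        except Exception as exc:  # KeyboardInterrupt is deliberately not caught
            on_error(index, exc)
        else:
            yield index, item
        index += 1


# Assumed usage with the names from the question:
# for i, batch in iterate_catching(keras_ds, lambda i, e: print(images[i], e)):
#     feature_map_batch = mobile_net(batch)
```

If the underlying iterator keeps raising on the same element without advancing, this loop would not terminate, so a cap on consecutive errors may be worth adding.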
Open a new Python file, copy the code below, set yourDirectory to the directory that holds your pictures, and run it. If a file is corrupt, you will see a message such as "Corrupt JPEG data: premature end of data segment" among the printed file names.
import os
import cv2

# yourDirectory = 'C:/tensorflow/models/research/object_detection/images/train'
for filename in os.listdir(yourDirectory):
    if filename.endswith(".jpg"):
        path = os.path.join(yourDirectory, filename)
        print(path)
        # any decoder warning is printed right after the path
        cv2.imread(path)
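The script above only looks at .jpg files, while the error in the question is about PNG data. A minimal, standard-library-only sketch of a comparable check for PNGs follows: it verifies the 8-byte PNG signature and the CRC of every chunk up to IEND. The function names `check_png_bytes` and `find_bad_pngs` are hypothetical. This is a structural check only; it does not reproduce TensorFlow's decoder, so a file that passes may still fail in `tf.image.decode_png`.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_png_bytes(data):
    """Return None if the chunk structure looks sound, else a reason string."""
    if not data.startswith(PNG_SIGNATURE):
        return "missing PNG signature (different format or truncated file?)"
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        if pos + 8 > len(data):
            return "truncated chunk header"
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            return "truncated chunk %r" % chunk_type
        body = data[pos + 4:pos + 8 + length]  # type + data, covered by the CRC
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(body) & 0xFFFFFFFF != stored_crc:
            return "CRC mismatch in chunk %r" % chunk_type
        if chunk_type == b"IEND":
            return None
        pos = end
    return "no IEND chunk (file ends early)"


def find_bad_pngs(paths):
    """Return a list of (path, reason) for files that fail the check."""
    bad = []
    for path in paths:
        with open(path, "rb") as f:
            reason = check_png_bytes(f.read())
        if reason is not None:
            bad.append((path, reason))
    return bad
```

Running `find_bad_pngs(images)` before generating the TFRecords would list the suspect files together with the reason each one was flagged, including JPEGs that were saved with a .png extension.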
A rather late and unexpected self-answer to this question.
The issue turned out to be (most likely) bad RAM. After having some odd things happening in Linux like the filesystem going read-only and random tab crashes in Firefox I decided to run Memtest. I had 2x8GB DIMMs installed. Turned out there was a bad block somewhere around the 4GB mark (on both sticks) which meant that errors would only pop up (a) when the system was under quite a high load and (b) if it exceeded around 8GB utilisation. I also checked for things like a bad hard drive, but it was a fairly new SSD. I'd previously had very sporadic and random restarts on Windows using the same system, but again I assumed it was just Microsoft forcing updates.
So I post this here for posterity. If you're seeing odd things like images getting corrupted in a non-repeatable way, it takes only a few minutes to run Memtest as a sanity check. Serious errors should pop up within 30 seconds, and it's worth running overnight (several passes) to double-check.
The solutions posted above are still useful and I'm still not convinced by TF rolling their own PNG loader, but always worth checking your hardware!
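As a rough software-side companion to Memtest (not a replacement for it), one could hash a suspect file several times and compare the results. This is only a sketch under the assumption that flaky hardware might show up as reads that differ between attempts; the function name `distinct_digests` is hypothetical. Getting more than one digest for an unchanged file suggests the problem is not in the file itself, while a single digest proves nothing either way.

```python
import hashlib


def distinct_digests(path, attempts=5, chunk_size=1 << 20):
    """Hash `path` several times and return the set of SHA-256 hex digests."""
    digests = set()
    for _ in range(attempts):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in fixed-size blocks until read() returns b"" at end of file.
            for block in iter(lambda: f.read(chunk_size), b""):
                h.update(block)
        digests.add(h.hexdigest())
    return digests
```

A check such as `len(distinct_digests(images[0])) > 1` would flag a file whose contents did not read back identically on every attempt.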