Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Script to detect broken images

I wrote a python script to detect broken images and count them, The problem in my script is it detects all the images and does not detect broken images. How to fix this. I refered :

How to check if a file is a valid image file? for my code

My code

import os
from os import listdir
from PIL import Image
count=0
for filename in os.listdir('/Users/ajinkyabobade/Desktop/2'):
    if filename.endswith('.JPG'):
     try:
      img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)
      img.verify()
     except(IOError,SyntaxError)as e:
         print('Bad file  :  '+filename)
         count=count+1
         print(count)
like image 836
Ajinkya Avatar asked Mar 31 '26 00:03

Ajinkya


1 Answers

I have added another SO answer here that extends the PIL solution to better detect broken images. I also implemented this solution in my Python script here on GitHub.

I also verified that damaged files (jpg) frequently are not 'broken' images i.e, a damaged picture file sometimes remains a legit picture file, the original image is lost or altered but you are still able to load it.

I quote the other answer for completeness:

You can use Python Pillow(PIL) module, with most image formats, to check if a file is a valid and intact image file.

In the case you aim at detecting also broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all the possible image defects, e.g., im.verify does not detect truncated images (that most viewer often load with a greyed area).

Pillow is able to detect these type of defects too, but you have to apply image manipulation or image decode/recode in or to trigger the check. Finally I suggest to use this code:

try:
  im = Image.load(filename)
  im.verify() #I perform also verify, don't know if he sees other types o defects
  im.close() #reload is necessary in my case
  im = Image.load(filename) 
  im.transpose(PIL.Image.FLIP_LEFT_RIGHT)
  im.close()
except: 
  #manage excetions here

In case of image defects this code will raise an exception. Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations). With this code you are going to verify a set of images at about 10 MBytes/sec (modern 2.5Ghz x86_64 CPU).

For the other formats psd,xcf,.. you can use Imagemagick wrapper Wand, the code is as follows:

im = wand.image.Image(filename=filename)
temp = im.flip;
im.close()

But, from my experiments Wand does not detect truncated images, I think it loads lacking parts as greyed area without prompting.

I red that Imagemagick has an external command identify that could make the job, but I have not found a way to invoke that function programmatically and I have not tested this route.

I suggest to always perform a preliminary check, check the filesize to not be zero (or very small), is a very cheap idea:

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
  #manage here the 'faulty image' case
like image 61
Fabiano Tarlao Avatar answered Apr 02 '26 02:04

Fabiano Tarlao