How do I programmatically check whether an image (PNG, JPEG, or GIF) is corrupted?

Okay. So I have about 250,000 high-resolution images. What I want to do is go through all of them and find the ones that are corrupted. If you know what 4scrape is, then you know the nature of the images I have.

By "corrupted" I mean that when the image is loaded into Firefox, it says:

The image “such and such image” cannot be displayed, because it contains errors.

Now, I could select all of my 250,000 images (~150 GB) and drag-and-drop them into Firefox. That would be bad though, because I don't think Mozilla designed Firefox to open 250,000 tabs. No, I need a way to programmatically check whether an image is corrupted.

Does anyone know a PHP or Python library which can do something along these lines? Or an existing piece of software for Windows?

I have already removed obviously corrupted images (such as ones that are 0 bytes) but I'm about 99.9% sure that there are more diseased images floating around in my throng of a collection.

asked Sep 09 '09 19:09 by Joel Verhagen


4 Answers

An easy way would be to try loading and verifying the files with PIL (Python Imaging Library).

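from PIL import Image

path = "example.png"  # hypothetical file name, replace with the file to check

try:
    v_image = Image.open(path)
    v_image.verify()  # raises an exception if the file looks broken
except Exception:
    print("corrupted:", path)

Any exception raised by open() or verify() means the file failed the check.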

From the documentation:

im.verify()

Attempts to determine if the file is broken, without actually decoding the image data. If this method finds any problems, it raises suitable exceptions. This method only works on a newly opened image; if the image has already been loaded, the result is undefined. Also, if you need to load the image after using this method, you must reopen the image file.
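
For a collection of this size, the same check could be run over a whole directory tree. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `pil_check`, `find_corrupted`, and `IMAGE_EXTENSIONS` are made up for illustration, and it assumes Pillow is installed. Because verify() does not decode the image data, the sketch reopens the file and calls load() as a second, slower pass that should also catch truncated files.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")  # assumed extension list


def pil_check(path):
    """Return True if Pillow can both verify and fully decode the file."""
    # Imported here so find_corrupted() can be reused with other checkers.
    from PIL import Image

    try:
        with Image.open(path) as im:
            im.verify()  # structural check, no pixel decoding
        # The image must be reopened after verify() before it can be loaded.
        with Image.open(path) as im:
            im.load()  # full decode of the image data
        return True
    except Exception:
        return False


def find_corrupted(root, check=pil_check):
    """Yield paths of image files under root for which check() returns False."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                path = os.path.join(dirpath, name)
                if not check(path):
                    yield path
```

Iterating over `find_corrupted(some_directory)` and printing each path would give a list of files to inspect by hand before deleting anything.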

answered Sep 27 '22 18:09 by ChristopheD


I suggest you check out ImageMagick for this: http://www.imagemagick.org/

It includes a tool called identify, which you can either call from a script and inspect its output, or replace with the programming interface that ImageMagick provides.
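
Calling identify from a script might look like the sketch below. It assumes that the `identify` executable is on the PATH and that it exits with a non-zero status when it cannot read a file; the function name `passes_identify` and its `command` parameter are hypothetical. A plain identify call may not decode all of the pixel data, so a pass here is probably best treated as a weak signal.

```python
import subprocess


def passes_identify(path, command=("identify",)):
    """Return True if the command exits with status 0 for this file."""
    result = subprocess.run(
        [*command, path],
        stdout=subprocess.DEVNULL,  # only the exit status is used
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

If the executable is not installed, `subprocess.run` raises `FileNotFoundError` rather than returning False, which keeps a missing tool from being mistaken for 250,000 corrupted images.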

answered Sep 27 '22 18:09 by Niko


In PHP, with exif_imagetype(), which reads the first bytes of the file and checks its signature:

if (exif_imagetype($filename) === false)
{
    unlink($filename); // image is corrupted
}

EDIT: Or you can try to fully load the image with imagecreatefromstring():

if (imagecreatefromstring(file_get_contents($filename)) === false)
{
    unlink($filename); // image is corrupted, unsupported, or unrecognized
}

An image resource will be returned on success. FALSE is returned if the image type is unsupported, the data is not in a recognized format, or the image is corrupt and cannot be loaded.
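
The two checks above differ in depth: a signature test looks only at the first few bytes, while a full load has to decode the data. As a rough illustration of what a signature-only test amounts to, here is a Python sketch using just the standard library. The names `SIGNATURES` and `sniff_image_type` are hypothetical, and this is not PHP's actual implementation. A file that is cut off halfway still passes this kind of test, because its header is intact.

```python
# Leading bytes that identify each format.
SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_type(path):
    """Return 'png', 'jpeg', 'gif', or None based on the leading bytes."""
    with open(path, "rb") as f:
        head = f.read(8)  # the longest signature above is 8 bytes
    for kind, prefixes in SIGNATURES.items():
        if head.startswith(prefixes):
            return kind
    return None
```

A check like this is cheap enough to run over the whole collection first, leaving the slower full decode for the files that pass.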

answered Sep 27 '22 19:09 by Alix Axel


If your exact requirement is that it shows correctly in Firefox, you may have a difficult time - the only way to be sure would be to link to the exact same image-loading source code as Firefox.

Basic image corruption (file is incomplete) can be detected simply by trying to open the file using any number of image libraries.

However, many images fail to display simply because they use a part of the file format that your particular viewer can't handle (GIF in particular has a lot of these edge cases, but you can also find JPEGs and the rare PNG that display only in specific viewers). There are also some ugly JPEG edge cases where the file appears to be uncorrupted in viewer X, but in reality the file has been cut short and displays correctly only because very little information has been lost. Firefox can show some cut-off JPEGs correctly (you get a grey bottom), but with others Firefox seems to load them halfway and then displays the error message instead of the partial image.
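
One cheap heuristic for the cut-short case is to check whether a file ends with the terminator its format defines: the EOI marker (FF D9) for JPEG, the IEND chunk for PNG, and the trailer byte (3B) for GIF. The sketch below is an assumption-laden illustration, not a reliable detector: `has_end_marker` is a hypothetical name, files with trailing padding or appended data will be flagged even though viewers may display them, and a truncated file can end with these bytes by coincidence.

```python
import os


def has_end_marker(path):
    """Heuristic: True if the file ends with its format's terminator bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - 12))  # the longest terminator below is 8 bytes
        tail = f.read()
    if head.startswith(b"\xff\xd8"):             # JPEG ends with the EOI marker
        return tail.endswith(b"\xff\xd9")
    if head.startswith(b"\x89PNG\r\n\x1a\n"):    # PNG ends with IEND plus its CRC
        return tail.endswith(b"IEND\xaeB`\x82")
    if head.startswith((b"GIF87a", b"GIF89a")):  # GIF ends with the trailer byte
        return tail.endswith(b"\x3b")
    return False  # unknown format or empty file
```

Files that fail this test are good candidates for a closer look with a full decode in an image library.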

answered Sep 27 '22 20:09 by David