I'm trying to open a JPEG file in Python 2.7:
    from PIL import Image
    im = Image.open(filename)
This didn't work for me:
    >>> im = Image.open(filename)
    Traceback (most recent call last):
      File "<pyshell#810>", line 1, in <module>
        im = Image.open(filename)
      File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1980, in open
        raise IOError("cannot identify image file")
    IOError: cannot identify image file
However, the file opened fine in external viewers. Digging in a bit, it turns out that the JpegImageFile._open method in PIL's JpegImagePlugin.py file raises a SyntaxError exception because of several extraneous 0x00 bytes before the 0xFFDA marker in the JPEG file's header:
Corrupt JPEG data: 5 extraneous bytes before marker 0xda
That is, while the other programs I tried simply ignored the unknown 0x00 marker towards the end of the header, PIL preferred to raise an exception, not allowing me to open the image.
QUESTION: Apart from editing PIL's code directly, is there any workaround for opening JPEGs with problematic headers?
The relevant code from the JpegImageFile class, which raises the exception, appears below for your convenience:
    def _open(self):
        s = self.fp.read(1)
        if ord(s[0]) != 255:
            raise SyntaxError("not a JPEG file")
        # Create attributes
        self.bits = self.layers = 0
        # JPEG specifics (internal)
        self.layer = []
        self.huffman_dc = {}
        self.huffman_ac = {}
        self.quantization = {}
        self.app = {} # compatibility
        self.applist = []
        self.icclist = []
        while 1:
            s = s + self.fp.read(1)
            i = i16(s)
            if MARKER.has_key(i):
                name, description, handler = MARKER[i]
                # print hex(i), name, description
                if handler is not None:
                    handler(self, i)
                if i == 0xFFDA: # start of scan
                    rawmode = self.mode
                    if self.mode == "CMYK":
                        rawmode = "CMYK;I" # assume adobe conventions
                    self.tile = [("jpeg", (0,0) + self.size, 0, (rawmode, ""))]
                    # self.__offset = self.fp.tell()
                    break
                s = self.fp.read(1)
            elif i == 0 or i == 65535:
                # padded marker or junk; move on
                s = "\xff"
            else:
                raise SyntaxError("no marker found")
PIL doesn't like corrupt data in the header and falls over as you've discovered.
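To see why, here is a minimal sketch that mimics only the marker-scanning part of the loop you quoted, run on synthetic bytes. The function name scan_markers, the two-entry KNOWN_MARKERS table, and the length-skipping stand-in for the handlers are my own simplifications for illustration, not PIL code.

```python
import io
import struct

# Hypothetical subset of PIL's MARKER table: DQT (0xFFDB) and SOS (0xFFDA).
KNOWN_MARKERS = (0xFFDB, 0xFFDA)


def scan_markers(data):
    """Mimic the marker loop of the quoted _open on raw bytes.

    Returns the markers seen up to and including SOS, or raises
    SyntaxError in the same situation as the quoted loop.
    """
    fp = io.BytesIO(data)
    seen = []
    s = fp.read(1)
    while True:
        s = s + fp.read(1)
        if len(s) < 2:
            # Guard added for this sketch only; not part of the quoted code.
            raise SyntaxError("unexpected end of data")
        i = struct.unpack(">H", s)[0]  # same role as i16(s)
        if i in KNOWN_MARKERS:
            seen.append(i)
            # Stand-in for the real handlers: read the 2-byte segment
            # length (which counts itself) and skip the payload.
            length = struct.unpack(">H", fp.read(2))[0]
            fp.read(length - 2)
            if i == 0xFFDA:  # start of scan
                break
            s = fp.read(1)
        elif i == 0 or i == 0xFFFF:
            # padded marker or junk; move on
            s = b"\xff"
        else:
            raise SyntaxError("no marker found")
    return seen
```

With a clean header such as FF DB 00 02 FF DA 00 02 the scan reaches 0xFFDA. With five 0x00 bytes inserted before FF DA, the first two zeros are accepted as padding and the loop resets to 0xFF, but the next zero produces 0xFF00, which is not a marker, so SyntaxError is raised.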
I've made a pull request to Pillow (the friendly PIL fork) that should fix this problem.
It's not yet been accepted, but hopefully it'll be there for version 2.5.0 due out in a couple of months. In the meantime, you can try it out here: https://github.com/python-imaging/Pillow/pull/647
As a workaround, you could use something like ImageMagick to first convert the problematic images to another format such as PNG, and then open the converted files in PIL/Pillow.
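A rough sketch of that conversion step, driven from Python, might look like the following. It assumes ImageMagick is installed and that its convert binary is on the PATH (newer ImageMagick releases may name the binary magick instead); the helper names build_convert_command and convert_to_png are hypothetical, and I have not verified that ImageMagick accepts every file that PIL rejects.

```python
import subprocess


def build_convert_command(src_path, dst_path, convert_bin="convert"):
    """Return the argv list for an ImageMagick conversion.

    convert_bin is an assumption: adjust it if your installation
    uses a different binary name or location.
    """
    return [convert_bin, src_path, dst_path]


def convert_to_png(src_path, dst_path, convert_bin="convert"):
    """Re-encode src_path as dst_path by calling ImageMagick.

    Raises subprocess.CalledProcessError if the command exits with
    a non-zero status. Returns dst_path for convenience.
    """
    subprocess.check_call(build_convert_command(src_path, dst_path, convert_bin))
    return dst_path
```

The returned path can then be passed straight to PIL, for example Image.open(convert_to_png("broken.jpg", "broken.png")), where both file names are placeholders.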