
Image sanitization library

I have a website that displays images submitted by users. I am concerned about some wiseguy uploading an image which may exploit some 0-day vulnerability in a browser rendering engine. Moreover, I would like to purge images of metadata (like EXIF data), and attempt to compress them further in a lossless manner (there are several such command line utilities for PNG and JPEG).

With the above in mind, my question is as follows: is there some C/C++ library out there that caters to the above scenario? And even if the full pipeline of parsing -> purging -> sanitizing -> compressing -> writing is not available in any single library, can I at least implement the parsing -> purging -> sanitizing -> writing pipeline (without compressing) in a library that supports JPEG/PNG/GIF?

Jon Smark asked Apr 17 '12 17:04


2 Answers

Your requirement is impossible to fulfill: if there is a 0-day vulnerability in one of the image reading libraries you use, then your code may be exploitable when it tries to parse and sanitize the incoming file. By "presanitizing" as soon as the image is received, you'd just be moving the point of exploitation earlier rather than later.

The only thing that would help is to parse and sanitize incoming images in a sandbox, so that, at least, if there was a vulnerability, it would be contained to the sandbox. The sandbox could be a separate process running as an unprivileged user in a chroot environment (or VM, for the very paranoid), with an interface consisting only of bytestream in, sanitized image out.
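The "bytestream in, sanitized image out" interface described above can be sketched in Python roughly as follows. This is a minimal, POSIX-only sketch and not a complete sandbox: it only caps CPU time and disables core dumps, and it does not perform the chroot or the switch to an unprivileged user, which you would need to add. The names `sanitize_in_child`, `_drop_limits`, and `ECHO_WORKER` are made up for illustration, and the echo worker is just a placeholder for a real decoder process.

```python
import resource
import subprocess
import sys

# Placeholder worker (an assumption of this sketch): copies stdin to stdout.
# A real worker would decode and re-encode the image instead.
ECHO_WORKER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]


def _drop_limits():
    """Runs in the child between fork and exec: tighten its resource limits."""
    for limit, value in ((resource.RLIMIT_CPU, 5), (resource.RLIMIT_CORE, 0)):
        _soft, hard = resource.getrlimit(limit)
        if hard != resource.RLIM_INFINITY:
            value = min(value, hard)  # never try to exceed the existing hard limit
        resource.setrlimit(limit, (value, value))


def sanitize_in_child(data, worker_cmd, timeout=10.0):
    """Send untrusted bytes to a worker process and return the bytes it emits."""
    result = subprocess.run(
        worker_cmd,
        input=data,                  # the only input channel: a byte stream
        stdout=subprocess.PIPE,      # the only output channel: a byte stream
        stderr=subprocess.DEVNULL,
        timeout=timeout,             # wall-clock limit; raises TimeoutExpired
        preexec_fn=_drop_limits,
    )
    if result.returncode != 0:
        raise ValueError(f"worker exited with status {result.returncode}")
    return result.stdout
```

Calling `sanitize_in_child(upload_bytes, ECHO_WORKER)` returns whatever the worker wrote to its standard output. The parent never parses the untrusted input itself, so a crash or hang in the decoder surfaces as an exception rather than as corruption in the web process.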

The sanitization itself could be as simple as opening the image with ImageMagick, decoding it to a raster, and reencoding and emitting it in a standard format (say, PNG or JPEG). Note that if the input and output are both lossy formats (like JPEG) then this transformation will be lossy.
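As a rough illustration of the decode-and-reencode step, here is a sketch that uses the Pillow library in place of ImageMagick (a substitution made for this example, and it assumes Pillow is installed). The function name `reencode_as_png` and the `MAX_PIXELS` cap are assumptions of the sketch. It keeps only the first frame of an animated image, and it should itself run inside the sandbox, since Pillow's decoders are exactly the kind of code that might be exploitable.

```python
import io

from PIL import Image  # third-party: Pillow (assumed to be installed)

MAX_PIXELS = 50_000_000  # arbitrary cap chosen for this sketch


def reencode_as_png(data: bytes) -> bytes:
    """Decode an image to raw pixels and write those pixels out as a new PNG."""
    with Image.open(io.BytesIO(data)) as img:
        width, height = img.size
        if width * height > MAX_PIXELS:
            raise ValueError("image dimensions exceed the configured cap")
        raster = img.convert("RGBA")  # forces a full decode of the first frame

    # Build a fresh image from the raw pixel bytes only, so nothing stored in
    # the original's metadata dictionary (EXIF, text chunks, ...) is carried over.
    clean = Image.frombytes("RGBA", raster.size, raster.tobytes())

    out = io.BytesIO()
    clean.save(out, format="PNG")
    return out.getvalue()
```

Writing PNG keeps the pixel data lossless regardless of the input format. Emitting JPEG instead would reintroduce the generational loss mentioned above.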

Celada answered Nov 20 '22 07:11


I know, I'm 9 years late, but...

You could use an idea similar to the PDF sanitizer in Qubes OS, which copies a PDF to a disposable virtual machine and runs a PDF parser there that converts the PDF to what are basically TIFF images. These are sent back to the originating VM and reassembled into a PDF there. This way you reduce your attack surface to TIFF files, which is tiny.

Qubes PDF Converter diagram (image taken from this article: https://blog.invisiblethings.org/2013/02/21/converting-untrusted-pdfs-into-trusted.html)

If that PDF really contains a 0-day exploit for your specific parser, it compromises the disposable VM, but since the originating VM accepts only valid TIFF and the disposable VM is discarded once the process is done, the exploit achieves nothing. Unless, of course, the attacker also has either a Xen exploit at hand to break out of the disposable VM or a Spectre-type full memory read primitive coupled with a side channel to leak data to their machines. Since the disposable VM is not connected to the internet and has no audio hardware assigned, this boils down to creating EM interference by modulating the CPU power consumption, so the attacker probably needs a big antenna and a location close to your server.

It would be an expensive attack.
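To make the "only a valid simple format is accepted" part concrete, here is a sketch of what the trusted side's check might look like for images. It uses a made-up raster format (an 8-byte width/height header followed by raw RGB bytes), which is neither TIFF nor the actual Qubes protocol. The names `parse_simple_raster`, `HEADER`, and `MAX_SIDE` are assumptions of this example. The point is that the trusted parser is small enough to audit by eye.

```python
import struct

# Hypothetical wire format for this sketch: width and height as big-endian
# 32-bit unsigned integers, followed by exactly width * height * 3 RGB bytes.
HEADER = struct.Struct(">II")
MAX_SIDE = 10_000  # arbitrary limit on each dimension


def parse_simple_raster(blob: bytes):
    """Validate a blob from the untrusted side; return (width, height, pixels)."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    width, height = HEADER.unpack_from(blob)
    if not (0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE):
        raise ValueError("dimensions out of range")
    pixels = blob[HEADER.size:]
    if len(pixels) != width * height * 3:
        raise ValueError("pixel data length does not match the header")
    return width, height, pixels
```

Anything that fails these checks is rejected outright. Anything that passes is plain pixel data, which the trusted side can then encode with its own encoder.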

iblue answered Nov 20 '22 09:11