
Super fast getimagesize in php

I am trying to get the dimensions (width and height) of hundreds of remote images, and getimagesize is far too slow.

From what I have read, the quickest way is to use file_get_contents to read only a certain number of bytes from each image and extract the size from that binary data.

Has anyone attempted this before? How would I handle different formats? Has anyone seen a library for this?

Sir Lojik asked Jan 08 '11 20:01


3 Answers

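function ranger($url){
    // Ask the server for the first 32 KB only (bytes 0 through 32767).
    $headers = array(
        "Range: bytes=0-32767"
    );

    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
    // Return the response body as a string instead of printing it.
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($curl);
    curl_close($curl);
    return $data;
}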

$start = microtime(true);

$url = "http://news.softpedia.com/images/news2/Debian-Turns-15-2.jpeg";

$raw = ranger($url);
$im = imagecreatefromstring($raw);

$width = imagesx($im);
$height = imagesy($im);

$stop = round(microtime(true) - $start, 5);

echo $width." x ".$height." ({$stop}s)";

Test output:

640 x 480 (0.20859s)

Loading 32 KB of data worked for me.

Dejan Marjanović answered Oct 21 '22 12:10


I have created a PHP library for exactly this scenario. It works by downloading the absolute minimum of the remote file needed to determine the image dimensions. That amount is different for every image and, particularly for JPEG, depends on how many embedded thumbnails the file contains.

It is available on GitHub here: https://github.com/tommoor/fastimage

Example usage:

$image = new FastImage($uri);
list($width, $height) = $image->getSize();
echo "dimensions: " . $width . "x" . $height;
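
To illustrate why the number of bytes needed varies for JPEG, here is a minimal sketch of the general technique, not FastImage's actual code. It walks the JPEG segments in a partial download and stops at the first frame header (SOF), which holds the height and width. Segments that come first, such as an EXIF block with an embedded thumbnail, have to be skipped, so a larger block pushes the frame header further into the file. The function name jpegSizeFromBytes is hypothetical, and the sketch assumes PHP 7.1 or later.

```php
<?php
/**
 * Scan a (possibly truncated) JPEG byte string for its frame header.
 * Returns [width, height], or null if the SOF segment is not in $data yet.
 * Hypothetical helper for illustration only.
 */
function jpegSizeFromBytes(string $data): ?array
{
    $len = strlen($data);
    // Every JPEG starts with the SOI marker FF D8.
    if ($len < 4 || substr($data, 0, 2) !== "\xFF\xD8") {
        return null;
    }
    $pos = 2;
    while ($pos + 4 <= $len) {
        if ($data[$pos] !== "\xFF") {
            return null; // lost marker sync; this sketch simply gives up
        }
        $marker = ord($data[$pos + 1]);
        if ($marker === 0xFF) { // fill byte before a marker
            $pos++;
            continue;
        }
        // Segment length is big-endian and includes its own two bytes.
        $segLen = unpack('n', substr($data, $pos + 2, 2))[1];
        // SOF0..SOF15 carry the dimensions; C4, C8 and CC are not frame headers.
        $isSof = $marker >= 0xC0 && $marker <= 0xCF
            && !in_array($marker, [0xC4, 0xC8, 0xCC], true);
        if ($isSof) {
            if ($pos + 9 > $len) {
                return null; // SOF found, but the dimensions are cut off
            }
            // Layout after the length: precision (1), height (2), width (2).
            $dims = unpack('nheight/nwidth', substr($data, $pos + 5, 4));
            return [$dims['width'], $dims['height']];
        }
        // Skip this segment (e.g. APP1/EXIF, which may hold a thumbnail).
        $pos += 2 + $segLen;
    }
    return null; // ran out of data before reaching the SOF segment
}
```

When the function returns null, the caller would fetch more bytes and try again.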
Tom answered Oct 21 '22 10:10


I was looking for a better way to handle this situation, so I tested a few different functions found around the internet.

Overall, when it worked, the fastest tended to be the getjpegsize function that James Relyea posted on the PHP page for getimagesize, beating the ranger function provided by Dejan above. http://php.net/manual/en/function.getimagesize.php#88793

Image #1 (787KB JPG on external older server)
getimagesize: 0.47042 to 0.47627 - 1700x2340 [SLOWEST]
getjpegsize: 0.11988 to 0.14854 - 1700x2340 [FASTEST]
ranger: 0.1917 to 0.22869 - 1700x2340

Image #2 (3MB PNG)
getimagesize: 0.01436 to 0.01451 - 1508x1780 [FASTEST]
getjpegsize: - failed
ranger: - failed

Image #3 (2.7MB JPG)
getimagesize: 0.00855 to 0.04806 - 3264x2448 [FASTEST]
getjpegsize: - failed
ranger: 0.06222 to 0.06297 - 3264x2448 * [SLOWEST]

Image #4 (1MB JPG)
getimagesize: 0.00245 to 0.00261 - 2031x1434
getjpegsize: 0.00135 to 0.00142 - 2031x1434 [FASTEST]
ranger: 0.0168 to 0.01702 - 2031x1434 [SLOWEST]

Image #5 (316KB JPG)
getimagesize: 0.00152 to 0.00162 - 1280x720
getjpegsize: 0.00092 to 0.00106 - 1280x720 [FASTEST]
ranger: 0.00651 to 0.00674 - 1280x720 [SLOWEST]

* ranger failed when grabbing 32768 bytes on Image #3, so I increased it to 65536, after which it retrieved the size successfully.
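
The benchmarking code itself is not shown here. A minimal sketch of how "fastest to slowest" ranges like the ones above could be collected, assuming each candidate is wrapped in a callable and run several times; the helper name timeRange and the run count are hypothetical.

```php
<?php
/**
 * Run $fn $runs times and return [fastest, slowest] wall-clock seconds.
 * Hypothetical helper; not the code used for the numbers above.
 */
function timeRange(callable $fn, int $runs = 5): array
{
    $times = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $fn();
        $times[] = round(microtime(true) - $start, 5);
    }
    return [min($times), max($times)];
}

// Example use (the path is a placeholder):
// list($fastest, $slowest) = timeRange(function () {
//     @getimagesize('/path/to/image.jpg');
// });
// echo "getimagesize: {$fastest} to {$slowest}\n";
```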

There are problems, though: both ranger and getjpegsize are limited in ways that make them not stable enough to use. Both failed on a large JPG image of around 3MB, although ranger works after increasing the number of bytes it grabs. Also, these alternatives only deal with JPG images, so a conditional would be needed to use them only on JPGs and fall back to getimagesize for the other image formats.
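
One way to work around both limits, sketched here under assumptions and not tested against the images above: double the requested range until the header can be parsed, and parse with getimagesizefromstring (built into PHP since 5.4) in place of a JPG-only routine, so no per-format conditional is needed. The function name remoteImageSize and the injected $fetch callable are hypothetical; $fetch($bytes) is assumed to return the first $bytes bytes of the image, for example from a variant of ranger that takes the range as a parameter. The type hints assume PHP 7.1 or later.

```php
<?php
/**
 * Try increasingly large partial downloads until the dimensions can be read.
 * Returns [width, height], or null if they could not be determined.
 * Hypothetical helper; $fetch is supplied by the caller.
 */
function remoteImageSize(callable $fetch, int $startBytes = 32768, int $maxBytes = 1048576): ?array
{
    for ($bytes = $startBytes; $bytes <= $maxBytes; $bytes *= 2) {
        $data = $fetch($bytes);
        if (!is_string($data) || $data === '') {
            return null; // request failed; nothing to parse
        }
        // Header-only parse; handles JPEG, PNG, GIF and other formats.
        $info = @getimagesizefromstring($data);
        if ($info !== false) {
            return [$info[0], $info[1]];
        }
        if (strlen($data) < $bytes) {
            return null; // that was the whole file; more bytes will not help
        }
    }
    return null; // gave up after reaching $maxBytes
}
```

The 1MB ceiling is an arbitrary choice for the sketch; past some point it is cheaper to fall back to a plain getimagesize call.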

Also, note that the first image was on an older server running an old PHP version (5.3.2), whereas the other 4 images came from a modern server (cloud-based cPanel with MultiPHP dialed back to 5.4.45 for compatibility).

It's worth noting that the cloud-based server did far better with getimagesize, which beat ranger; in fact, ranger was the slowest in all 4 tests on the cloud server. Those 4 tests also pulled the images from the same server the code was running on, though from different accounts.

This makes me wonder if the PHP core improved in 5.4 or if the Apache version factors in. Also, it might be down to availability from the server and server load. Let's not forget how networks are getting faster and faster each year, so maybe the speed issue is becoming less of a concern.

So, the end result and my answer is that, for complete support of all web image formats while still getting image sizes quickly, it might be best to suck it up and use getimagesize, then cache the image sizes in a database table (if these images will be checked more than once). In that scenario, only the first check incurs the larger cost; subsequent requests would be minimal and faster than any function that reads the image headers.

As with any caching, it only works well if the content doesn't change and there is a way to check whether it has changed. So, a possible solution is to check only the HTTP headers of an image URL when consulting the cache and, if they differ, dump the cached entry and fetch the size again with getimagesize.
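
A minimal sketch of that idea, with assumptions spelled out: a plain array stands in for the database table, and the "headers" check is reduced to a single validator string such as the ETag or Last-Modified value. The names cachedImageSize, $getValidator and $getSize are hypothetical, and the code assumes PHP 7.1 or later.

```php
<?php
/**
 * Return [width, height] for $url, reusing the cached value while the
 * validator (e.g. ETag or Last-Modified) is unchanged.
 * $cache stands in for a database table; all names are hypothetical.
 */
function cachedImageSize(string $url, array &$cache, callable $getValidator, callable $getSize): ?array
{
    $validator = $getValidator($url);
    if (isset($cache[$url]) && $validator !== null
        && $cache[$url]['validator'] === $validator) {
        return $cache[$url]['size']; // cache hit: no image download needed
    }
    $size = $getSize($url); // miss or changed image: pay the full cost once
    if ($size === null) {
        unset($cache[$url]);
        return null;
    }
    $cache[$url] = ['validator' => $validator, 'size' => $size];
    return $size;
}

// Possible callables for real use (both hit the network):
$headValidator = function (string $url): ?string {
    $ctx = stream_context_create(['http' => ['method' => 'HEAD']]);
    $headers = @get_headers($url, true, $ctx);
    if ($headers === false) {
        return null;
    }
    $headers = array_change_key_case($headers, CASE_LOWER);
    $value = $headers['etag'] ?? $headers['last-modified'] ?? null;
    return is_array($value) ? end($value) : $value; // arrays occur after redirects
};
$fullSize = function (string $url): ?array {
    $info = @getimagesize($url);
    return $info === false ? null : [$info[0], $info[1]];
};
```

If the server sends neither header, the validator is null and every call falls through to getimagesize, so the cache only helps for servers that expose one of them.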

Exit answered Oct 21 '22 10:10