Suppose you have a thumbnail generator script that accepts source images in the form of a URL. Is there a way to detect whether the source URL is "broken", that is, whether it is nonexistent or leads to a non-image file?
Simply brute-forcing it with getimagesize() or another PHP GD function is not a solution, since spoofed stray URLs that might not be images at all (http://example.com/malicious.exe, or the same file renamed as http://example.com/malicious.jpg) could be input. Such cases could easily be detected by PHP before having to invoke GD. I'm looking for a way to pre-sanitize the input before GD attempts to parse the file.
As a first step, the following regular expression checks whether the URL ends in an image extension:
preg_match('@(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)([^\s]+(\.(?i)(jpg|png|gif|bmp))$)@', $txt,$url);
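The same extension check can also be sketched without one large pattern, by letting parse_url() isolate the path first. This is only an illustration: the function name has_image_extension and the list of accepted extensions are assumptions, and passing this check says nothing about what the file actually contains.

```php
<?php
// Hypothetical helper: true if the URL is http(s) and its path ends in a
// known image extension. It never looks at the file contents.
function has_image_extension(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['path'])) {
        return false;
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'https'], true)) {
        return false;
    }
    // Only the path is inspected, so a query string cannot fake an extension.
    $ext = strtolower(pathinfo($parts['path'], PATHINFO_EXTENSION));
    return in_array($ext, ['jpg', 'png', 'gif', 'bmp'], true);
}

var_dump(has_image_extension('http://example.com/photo.JPG'));     // bool(true)
var_dump(has_image_extension('http://example.com/malicious.exe')); // bool(false)
```

Note that http://example.com/malicious.jpg would still pass, which is why a content check is needed as well.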
But sometimes the URL is valid yet still does not point to an image. In that case, we can use file_get_contents to fetch the resource and check its contents before showing it.
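One way to check the contents is to compare the first few bytes against well-known file signatures. The following is a minimal sketch, not a complete validator: the function names image_type_from_bytes and remote_image_type are hypothetical, only four formats are covered, and the remote fetch assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: map the leading bytes of a file to an image type,
// or null if no known signature matches.
function image_type_from_bytes(string $bytes): ?string
{
    $signatures = [
        'jpg' => "\xFF\xD8\xFF",
        'png' => "\x89PNG\r\n\x1A\n",
        'gif' => 'GIF8',
        'bmp' => 'BM', // only two bytes, so this is a weak check
    ];
    foreach ($signatures as $type => $magic) {
        if (strncmp($bytes, $magic, strlen($magic)) === 0) {
            return $type;
        }
    }
    return null;
}

// Hypothetical wrapper: read only the first 16 bytes of the URL.
function remote_image_type(string $url): ?string
{
    $head = @file_get_contents($url, false, null, 0, 16);
    return $head === false ? null : image_type_from_bytes($head);
}

// A Windows executable starts with "MZ", so a renamed .exe is rejected.
var_dump(image_type_from_bytes("MZ\x90\x00")); // NULL
```

A matching signature only shows that the file starts like an image; a decoder such as GD can still fail on a truncated or corrupt file.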
There are plenty of ways to find broken images on your website. The easiest and quickest way is to use a dedicated service such as Broken Link Checker, Free Link Checker, Xenu, or Netpeak Spider. You could also do an extensive SEO site audit with Serpstat's tool or one of its alternatives.
Broken image links typically occur when the link address is no longer valid, i.e., when someone has deleted, moved or renamed the page on which your link relied.
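Look at the user notes on the file_exists page of the PHP manual. file_exists itself does not work on http:// URLs (the HTTP wrapper does not support stat()), but the notes linked below show how to check a remote file or image, which is exactly what you need: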
FILE EXISTS - http://www.php.net/manual/en/function.file-exists.php#93572
URL EXISTS - http://www.php.net/manual/en/function.file-exists.php#85246
Here is alternative code for checking the URL. If you test it in a browser, replace \n with <br/>.
<?php
$urls = array(
    'http://www.google.com/images/logos/ps_logo2.png',
    'http://www.google.com/images/logos/ps_logo2_not_exists.png',
);
foreach ($urls as $url) {
    echo "$url - ";
    echo url_exists($url) ? 'Exists' : 'Not Exists';
    echo "\n\n";
}

// Returns true if the first response for $url has a 2xx status line.
function url_exists($url) {
    $hdrs = @get_headers($url);
    // Debug output: print the second header line (Content-Type in the run below).
    echo @$hdrs[1]."\n";
    return is_array($hdrs) && preg_match('/^HTTP\\/\\d+\\.\\d+\\s+2\\d\\d\\s+.*$/', $hdrs[0]) === 1;
}
?>
Output is as follows:
http://www.google.com/images/logos/ps_logo2.png - Content-Type: image/png
Exists
http://www.google.com/images/logos/ps_logo2_not_exists.png - Content-Type: text/html; charset=UTF-8
Not Exists
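Since the headers already carry a Content-Type, the same request can also be used to reject non-image responses. The sketch below is an assumption-laden illustration: the function name headers_describe_image is hypothetical, and it expects the plain list of header lines that get_headers($url) returns by default. The Content-Type is supplied by the server and can be wrong or spoofed, so this is a cheap pre-filter rather than proof.

```php
<?php
// Hypothetical helper: given raw header lines as returned by get_headers($url),
// report whether the final response was 2xx with an image/* Content-Type.
function headers_describe_image(array $headers): bool
{
    $status = 0;
    $contentType = '';
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            // A new status line starts a new response (e.g. after a redirect).
            $status = (int) $m[1];
            $contentType = '';
        } elseif (stripos($line, 'Content-Type:') === 0) {
            $contentType = strtolower(trim(substr($line, strlen('Content-Type:'))));
        }
    }
    return $status >= 200 && $status < 300
        && strncmp($contentType, 'image/', 6) === 0;
}

// Usage with a live URL (requires allow_url_fopen):
// $hdrs = @get_headers($url);
// $ok = is_array($hdrs) && headers_describe_image($hdrs);
```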
I have used the following to detect attributes of remote images:
$src='http://example.com/image.jpg';
list($width, $height, $type, $attr) = @getimagesize($src);
Example (checking Stack Overflow's "Careers 2.0" image):
$src='http://sstatic.net/ads/img/careers2-ad-header-so.png';
list($width, $height, $type, $attr) = @getimagesize($src);
echo '<pre>';
echo $width.'<br>';
echo $height.'<br>';
echo $type.'<br>';
echo $attr.'<br>';
echo '</pre>';
If $width, $height, etc. are null, the URL is obviously not an image or the file does not exist. Using cURL is overkill and slower (even with CURLOPT_HEADER).
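Rather than inspecting each variable for null, the return value of getimagesize() can be tested directly, since it returns false on failure. A minimal sketch, assuming the thumbnailer only supports JPEG, PNG, GIF and BMP; the function name is_supported_image is hypothetical:

```php
<?php
// Hypothetical helper: accept $src only if getimagesize() can parse it and
// reports one of the formats the thumbnailer is assumed to support.
function is_supported_image(string $src): bool
{
    $info = @getimagesize($src);
    if ($info === false) {
        return false; // unreadable, missing, or not an image
    }
    [$width, $height, $type] = $info;
    $allowed = [IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF, IMAGETYPE_BMP];
    return $width > 0 && $height > 0 && in_array($type, $allowed, true);
}

// Works for local paths and, with allow_url_fopen enabled, for URLs:
// if (!is_supported_image('http://example.com/image.jpg')) { /* reject */ }
```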