
Regex to GENERATE thumbnails!?!?! (but that's crazy!)

Here is my situation and the solution I've come up with. I have an application that includes TinyMCE so that users can create HTML content for publishing. Users can include images in their markup and drag/resize them, which sets the final width/height attributes on the IMG tag. This is all great: users can include images and resize/relocate them until they look the way they want. But one big problem is that I am now sending a (possibly) much larger image to the client, only to have the browser scale it down to the requested width/height. All that wasted bandwidth and load time...

So my solution is to pre-process my users' markup, scanning all of the IMG tags and parsing out the height/width/src attributes, then set each img's src attribute to a phpThumb request with the parsed height/width passed in the thumbnail's URL. This will create my reduced-size image (optimising bandwidth at the expense of CPU and caching). What do you think about this solution? I've seen other posts where people were using mod_rewrite to do something similar, but I want to change the content of the page as it is served rather than manipulate the image requests as they're being received. Any thoughts about this design?

I need some help with the fine details as my regex skills need some work, but I'm very short on time and promise to pay my technical knowledge debt soon. To make the regexes easier, I can be sure of some things: only img tags that need this processing will have existing width="" and height="" attributes (with double quotes and lower-case text, though I suppose matching case-insensitively would be safer in case TinyMCE changes).

So, a regex to match only the necessary img tags, and maybe another three regexes to extract the src, the width, and the height?

Thanks everyone.

CryptoMonkey asked Apr 28 '10 17:04


2 Answers

I think using regexes for this is a bad idea; you'd be better off parsing the markup with something like PHP Simple HTML DOM Parser. Then you can do something like:

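// Load the Simple HTML DOM library (path is an assumption; adjust to your install)
require_once 'simple_html_dom.php';

// Load HTML from a string
$html = new simple_html_dom();
$html->load($your_posted_content);

// Find all images and print each src
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}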
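
If adding a third-party parser is not an option, the same idea can be sketched with PHP's built-in DOMDocument. This is only a rough sketch under some assumptions: the function name rewriteImagesToThumbs and the THUMB_ENDPOINT path are made up for illustration, and the src/w/h query parameters are the usual phpThumb ones, so check them against your phpThumb configuration.

```php
<?php
// Hypothetical location of phpThumb.php; adjust to the real install path.
const THUMB_ENDPOINT = '/phpThumb/phpThumb.php';

// Hypothetical helper: point every <img> that has a src plus numeric
// width and height at a phpThumb URL, leaving all other images alone.
function rewriteImagesToThumbs(string $markup): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy user HTML
    // The meta tag is only an encoding hint so UTF-8 text survives parsing.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $markup
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        $w   = $img->getAttribute('width');
        $h   = $img->getAttribute('height');
        // getAttribute() returns '' for a missing attribute, which ctype_digit() rejects.
        if ($src === '' || !ctype_digit($w) || !ctype_digit($h)) {
            continue;
        }
        $query = http_build_query(['src' => $src, 'w' => $w, 'h' => $h]);
        $img->setAttribute('src', THUMB_ENDPOINT . '?' . $query);
    }

    // loadHTML() wraps the fragment in <html><body>; emit only the body's children.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $markup;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling rewriteImagesToThumbs($your_posted_content) before the page is sent returns the markup with only the sized images redirected through the thumbnailer; the width and height attributes stay in place, so the layout does not change.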
Richard Harrison answered Nov 15 '22 10:11


Try this:

(?i)<img(?>\s+(?>src="([^"]*)"|width="([^"]*)"|height="([^"]*)"|\w+="[^"]*"))+

That will match any image tag, and if the src, width, and height attributes are present, their values will be stored in groups 1, 2, and 3 respectively. But it doesn't require any of those attributes to be there, so you'll want to verify that all three groups contain values before processing.
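
As a rough illustration of that verification step, the pattern can be used from PHP with preg_match_all. The function name extractSizedImages and the ~ delimiters are choices made for this sketch, not part of the pattern itself, and it relies on the attributes being double-quoted as described in the question.

```php
<?php
// The pattern from this answer, wrapped in ~ delimiters so the quotes need no escaping.
const IMG_PATTERN = '~(?i)<img(?>\s+(?>src="([^"]*)"|width="([^"]*)"|height="([^"]*)"|\w+="[^"]*"))+~';

// Hypothetical helper: return src/width/height for every <img> that has all three.
function extractSizedImages(string $markup): array
{
    $found = [];
    preg_match_all(IMG_PATTERN, $markup, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Groups that never matched are either missing or empty, so default them to ''.
        $src    = $m[1] ?? '';
        $width  = $m[2] ?? '';
        $height = $m[3] ?? '';
        // Skip tags that lack any of the three attributes.
        if ($src !== '' && $width !== '' && $height !== '') {
            $found[] = ['src' => $src, 'width' => $width, 'height' => $height];
        }
    }
    return $found;
}
```

Each returned entry holds the three values needed to build the thumbnail request; tags without a width or height are skipped, which matches the rule in the question that only sized images need processing.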

Alan Moore answered Nov 15 '22 08:11