
HTML safe wrapping of long lines

Tags: regex, php, email

I'm having problems sending HTML emails with long lines of text. The WYSIWYG editor (FCKEditor 2.5) used on the site strips all the \n characters in certain browsers, including IE and Chrome. The result is an email with a single huge line of text. This wouldn't be a problem if it weren't for email clients that wrap lines longer than 998 characters by inserting ! \n into them. Of course, these breaks almost always end up in the most unfortunate places, breaking HTML tags and looking nasty in the content itself.

My initial solution was to add a line feed after every HTML tag or every 900 to 990 characters. This is the regex I ended up with:

 return preg_replace("/(<\/[^\>]+>|<[^\>]+\/>|>[^<]{900,990}\s)(\n)*/","$1\n",$str);

However, when there are lines that don't contain any tags at all, the whitespace-matching part is never triggered. But if I remove the > from its beginning, it starts breaking tags.

Is there a better way than regex to do this, or can this regex be fixed?

EDIT: The 1000 character line length limit is defined in RFC 821.
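
To check whether a message body would trip this limit, the longest line can be measured before sending. This is a minimal sketch: `longestLineLength` is a hypothetical helper name, and 998 is the limit once the trailing CRLF is excluded from the 1000 characters.

```php
<?php
// Hypothetical helper: length in bytes of the longest line in $text.
// Lines are split on CRLF, CR, or LF; the line terminator is not counted.
function longestLineLength(string $text): int
{
    $longest = 0;
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $longest = max($longest, strlen($line));
    }
    return $longest;
}

// Example: one 1200-byte line followed by a short one.
$body = str_repeat('a', 1200) . "\n" . "short line";
var_dump(longestLineLength($body) > 998); // bool(true)
```

`strlen` counts bytes rather than characters, which is the relevant unit for the transport limit.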

Kaivosukeltaja asked Mar 31 '11 11:03



1 Answer

Following up on my comment, I'm posting this as an answer now that I have been able to run a test.

tidy::repairString should do the job just fine, and better than any regex solution: Tidy re-serializes the markup and wraps long lines at whitespace as it does so.

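$content = "<html>......</html>";
$oTidy = new tidy();
// repairString() parses the markup and returns it re-serialized with wrapped lines
$content = $oTidy->repairString($content,
    array("show-errors" => 0, "show-warnings" => false), // keep diagnostics quiet
    "utf8" // encoding of the input and output
);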

Adjust the third parameter (the character encoding) to match your content.

The clean option is not needed for this; I was wrong in my comment.
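
If the default wrap margin doesn't suit, Tidy's `wrap` option sets it explicitly. The following is only a sketch, assuming the tidy extension is loaded. The function name `wrapHtmlForEmail` and the 76-column margin are illustrative choices, not something from the test above.

```php
<?php
// Hypothetical helper: re-serialize HTML with Tidy so long lines get wrapped.
// Assumption: if the tidy extension is missing, the input is returned unchanged.
function wrapHtmlForEmail(string $html, int $margin = 76): string
{
    if (!extension_loaded('tidy')) {
        return $html; // nothing we can do without Tidy
    }

    $config = array(
        'wrap'          => $margin, // right margin at which Tidy wraps lines
        'show-warnings' => false,
    );

    // tidy_repair_string() returns the repaired markup, or false on failure.
    $result = tidy_repair_string($html, $config, 'utf8');

    return is_string($result) ? $result : $html;
}

$content = wrapHtmlForEmail('<p>' . str_repeat('word ', 400) . '</p>');
```

Tidy only breaks lines at whitespace, so a single unbroken token longer than the margin would still end up on one line. It also returns a complete document, even when given a fragment.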

Yann Milin answered Oct 21 '22 20:10
