Situation is a string that results in something like this:
<p>This is some text and here is a <strong>bold text then the post stop here....</p>
Because the function returns a teaser (summary) of the text, it stops after certain words. Where in this case the tag strong is not closed. But the whole string is wrapped in a paragraph.
Is it possible to convert the above result/output to the following:
<p>This is some text and here is a <strong>bold text then the post stop here....</strong></p>
I do not know where to begin. The problem is that.. I found a function on the web which does it regex, but it puts the closing tag after the string.. therefore it won't validate because I want all open/close tags within the paragraph tags. The function I found does this which is wrong also:
<p>This is some text and here is a <strong>bold text then the post stop here....</p></strong>
I want to know that the tag can be strong, italic, anything. That's why I cannot append the function and close it manually in the function. Any pattern that can do it for me?
An opening tag begins a section of page content, and a closing tag ends it. For example, to markup a section of text as a paragraph, you would open the paragraph with an opening paragraph tag <p> and close it with a closing paragraph tag </p> (closing tags always proceed the element with a /).
<? php // close opened html tags function closetags ( $html ) { #put all opened tags into an array preg_match_all ( "#<([a-z]+)( .
The strip_tags() function strips a string from HTML, XML, and PHP tags. Note: HTML comments are always stripped. This cannot be changed with the allow parameter. Note: This function is binary-safe.
Also to close an any element tag you must put the slash in front of the elements name. </script>, </html> etc.
Here is a function i've used before, which works pretty well:
function closetags($html) {
preg_match_all('#<(?!meta|img|br|hr|input\b)\b([a-z]+)(?: .*)?(?<![/|/ ])>#iU', $html, $result);
$openedtags = $result[1];
preg_match_all('#</([a-z]+)>#iU', $html, $result);
$closedtags = $result[1];
$len_opened = count($openedtags);
if (count($closedtags) == $len_opened) {
return $html;
}
$openedtags = array_reverse($openedtags);
for ($i=0; $i < $len_opened; $i++) {
if (!in_array($openedtags[$i], $closedtags)) {
$html .= '</'.$openedtags[$i].'>';
} else {
unset($closedtags[array_search($openedtags[$i], $closedtags)]);
}
}
return $html;
}
Personally though, I would not do it using regexp but a library such as Tidy. This would be something like the following:
$str = '<p>This is some text and here is a <strong>bold text then the post stop here....</p>';
$tidy = new Tidy();
$clean = $tidy->repairString($str, array(
'output-xml' => true,
'input-xml' => true
));
echo $clean;
A small modification to the original answer...while the original answer stripped tags correctly. I found that during my truncation, I could end up with chopped up tags. For example:
This text has some <b>in it</b>
Truncating at character 21 results in:
This text has some <
The following code, builds on the next best answer and fixes this.
function truncateHTML($html, $length)
{
$truncatedText = substr($html, $length);
$pos = strpos($truncatedText, ">");
if($pos !== false)
{
$html = substr($html, 0,$length + $pos + 1);
}
else
{
$html = substr($html, 0,$length);
}
preg_match_all('#<(?!meta|img|br|hr|input\b)\b([a-z]+)(?: .*)?(?<![/|/ ])>#iU', $html, $result);
$openedtags = $result[1];
preg_match_all('#</([a-z]+)>#iU', $html, $result);
$closedtags = $result[1];
$len_opened = count($openedtags);
if (count($closedtags) == $len_opened)
{
return $html;
}
$openedtags = array_reverse($openedtags);
for ($i=0; $i < $len_opened; $i++)
{
if (!in_array($openedtags[$i], $closedtags))
{
$html .= '</'.$openedtags[$i].'>';
}
else
{
unset($closedtags[array_search($openedtags[$i], $closedtags)]);
}
}
return $html;
}
$str = "This text has <b>bold</b> in it</b>";
print "Test 1 - Truncate with no tag: " . truncateHTML($str, 5) . "<br>\n";
print "Test 2 - Truncate at start of tag: " . truncateHTML($str, 20) . "<br>\n";
print "Test 3 - Truncate in the middle of a tag: " . truncateHTML($str, 16) . "<br>\n";
print "Test 4: - Truncate with less text: " . truncateHTML($str, 300) . "<br>\n";
Hope it helps someone out there.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With