
PHP: Display the first 500 characters of HTML

I have a large chunk of HTML in a PHP variable, like this:

$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';

I want to display only the first 500 characters of this code. The character count must include the text inside the HTML tags but exclude the tags and attributes themselves. Trimming the code must not break the DOM structure of the HTML.

Are there any tutorials or working examples available?

Vinay Jeurkar asked Apr 19 '11 03:04



2 Answers

If it's only the text you want, you can also do this:

substr(strip_tags($html_code),0,500);
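A slightly more defensive sketch of the same idea, assuming the mbstring extension is loaded and the text is UTF-8. The helper name `plain_excerpt` is made up for illustration. Note that this returns text only: all markup is gone, so it does not preserve the DOM structure.

```php
<?php
// Hypothetical helper: plain-text excerpt of an HTML string.
function plain_excerpt(string $html, int $limit = 500): string
{
    // strip_tags() drops tags and attributes but keeps the text between them.
    $text = strip_tags($html);

    // mb_substr() counts characters rather than bytes, so multibyte
    // text is not cut in the middle of a character.
    return mb_substr($text, 0, $limit, 'UTF-8');
}

$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>';
echo plain_excerpt($html_code, 10); // The Samepl
```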
Starx answered Sep 22 '22 01:09



Ooohh... I know this. I can't get it exactly right off the top of my head, but you want to load the text you've got as a DOMDocument:

http://www.php.net/manual/en/class.domdocument.php

then grab the text from the entire document node (as a DOMNode: http://www.php.net/manual/en/class.domnode.php).

This won't be exactly right, but hopefully this will steer you onto the right track. Try something like:

 $html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';
 $dom = new DOMDocument();
 $dom->loadHTML($html_code);
 $text_to_strip = $dom->textContent;
 $stripped = mb_substr($text_to_strip,0,500);
 echo "$stripped";  // The Sameple text.Another sample text.....
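One difference from the strip_tags() approach is worth illustrating: DOMDocument decodes character entities while parsing, whereas strip_tags() leaves them encoded, so the two can disagree on the character count. A small sketch (the `$snippet` string is an invented example):

```php
<?php
$snippet = '<p>Fish &amp; chips</p>';

// strip_tags() leaves the entity as-is, so "&amp;" counts as five characters.
echo mb_strlen(strip_tags($snippet), 'UTF-8'), "\n"; // 16

// DOMDocument turns "&amp;" into a single "&" while parsing.
$dom = new DOMDocument();
$dom->loadHTML($snippet);
echo mb_strlen($dom->documentElement->textContent, 'UTF-8'), "\n"; // 12
```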

Edit: OK, that should work. I just tested it locally.

Edit 2:

Now that I understand you want to keep the tags but limit the text, let's see. You're going to want to loop over the content until you reach 500 characters. This is probably going to take a few edits and passes for me to get right, but hopefully I can help. (Sorry, I can't give this my undivided attention.)

The first case is when the text is 500 characters or fewer: nothing to worry about. Starting with the above code, we can do the following.

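  // Compare against the full text; $stripped is already cut to 500 characters.
  if (mb_strlen($text_to_strip) > 500) {
       // this is where we do our work.

       $characters_so_far = 0;
       // loadHTML() wraps the fragment in <html><body>, so walk the body's children.
       $body = $dom->getElementsByTagName('body')->item(0);

       // Copy the live node list into an array so removing nodes while looping is safe.
       foreach (iterator_to_array($body->childNodes) as $ChildNode) {

          // should check if $ChildNode->hasChildNodes();
          // probably put some of this stuff into a function
          $characters_in_next_node = mb_strlen($ChildNode->textContent);
          if ($characters_so_far + $characters_in_next_node > 500) {
              // remove the whole node that would take us past the limit
              $ChildNode->parentNode->removeChild($ChildNode);
          }
          // keep counting, so every node after the cut-off is removed as well
          $characters_so_far += $characters_in_next_node;
       }
       // saveHTML() also outputs the doctype/html/body wrapper added by loadHTML()
       $final_out = $dom->saveHTML();
  } else {
        $final_out = $html_code;
  }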
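The "put this into a function" idea from the comments above could be sketched roughly as follows. This is only one possible shape, not a tested library routine: the names `truncate_html` and `truncate_node` are hypothetical, it recurses into child nodes and cuts inside the text node that crosses the limit instead of dropping the whole element, and it does not deal with input encoding (loadHTML() may misread non-ASCII input that carries no charset declaration).

```php
<?php
// Hypothetical helper: trims $node in place so that its subtree keeps
// at most $remaining characters of text. Tags are never cut in half.
function truncate_node(DOMNode $node, int &$remaining): void
{
    // Copy the live child list first, because nodes are removed while looping.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($remaining <= 0) {
            // Budget already spent: everything from here on is dropped.
            $node->removeChild($child);
        } elseif ($child instanceof DOMText) {
            $length = mb_strlen($child->data, 'UTF-8');
            if ($length > $remaining) {
                // Cut inside the text node; the enclosing element stays intact.
                $child->data = mb_substr($child->data, 0, $remaining, 'UTF-8');
                $length = $remaining;
            }
            $remaining -= $length;
        } else {
            // Element (or other node): spend the budget on its children.
            truncate_node($child, $remaining);
        }
    }
}

// Hypothetical wrapper: returns $html cut to $limit text characters.
function truncate_html(string $html, int $limit = 500): string
{
    if ($html === '') {
        return '';
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // loadHTML() wraps fragments in <html><body>; work on the body only.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    truncate_node($body, $limit);

    // Serialize the body's children, leaving out the added wrapper.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html_code = '<div class="contianer" style="text-align:center;">The Sameple text.</div><br><span>Another sample text.</span>....';
echo truncate_html($html_code, 25);
```

With a limit of 25, the div and the br should survive whole and the span should be cut down to "Another ", still properly closed. Whitespace-only text nodes count toward the limit in this sketch, and the serializer may insert line breaks between block elements.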
Alex C answered Sep 23 '22 01:09
