
How to get text from an element, excluding other elements inside it

I'm using the DomCrawler component in the Symfony framework to crawl content from HTML. Now I need to get the text inside an element with a given ID. I can fetch the text with the code below:

$nodeValues = $crawler1->filter('#idOfTheElement')->each(function (Crawler $node, $i) {
    return $node->text();
});

The element (#idOfTheElement) contains spans, buttons, and so on (some of which also have classes). I don't want the content inside those. How can I get the text from the element while excluding the other elements inside it?

Note: the text I want to fetch has no wrapper other than the element #idOfTheElement.

The HTML looks like this:

<li id='idOfTheElement'>Tel :<button data-pjtooltip="{dtanchor:'tooltipOpposeMkt'}" class="noMkt JS_PJ" type="button">text :</button><dl><dt><a name="tooltipOpposeMkt"></a></dt><dd><div class="wrapper"><p><strong>Signification des pictogrammes</strong></p><p>Devant un numéro, le picto <img width="11" height="9" alt="" src="something"> signale une opposition aux opérations de marketing direct.</p><span class="arrow">&nbsp;</span></div></dd></dl>12 23 45 88 99</li>
arun asked May 06 '15 12:05




2 Answers

You can get the element's HTML and then strip the child elements together with their contents:

// Remove every child element and its contents; only the element's own text remains.
$text = preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $node->html());
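
To see what this pattern does, here is a minimal sketch that applies it to a plain string. The $innerHtml value is a hypothetical stand-in for what $node->html() would return, shortened from the markup in the question; it is not actual crawler output.

```php
<?php
// Hypothetical stand-in for $node->html(): a shortened copy of the question's inner HTML.
$innerHtml = 'Tel :<button class="noMkt JS_PJ" type="button">text :</button>'
    . '<dl><dt><a name="tooltipOpposeMkt"></a></dt><dd><p>Signification</p></dd></dl>'
    . '12 23 45 88 99';

// Each match runs from an opening tag to the first closing tag with the same name.
$text = preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $innerHtml);

echo $text, PHP_EOL; // prints "Tel :12 23 45 88 99"
```

Because each match stops at the first closing tag with the same name, markup that nests an element inside another of the same type (for example a div inside a div) would probably leave fragments behind, and a void element such as img sitting directly inside the target element would not be removed.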
Konstantin Pereiaslov answered Nov 18 '22 10:11



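First remove the child elements:

// children() selects the child elements, so only they are removed; the element's own text nodes stay.
$crawler1->filter('#idOfTheElement')->children()->each(function (Crawler $crawler) {
    foreach ($crawler as $node) {
        $node->parentNode->removeChild($node);
    }
});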

Then get text without child nodes:

$cleanContent = $crawler1->filter('#idOfTheElement')->text();
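
The same remove-then-read idea can be tried without Symfony. The sketch below uses PHP's built-in DOM extension instead of the Crawler, so it is an illustration of the technique under that assumption rather than the code from this answer; the sample markup is a hypothetical, shortened version of the HTML in the question.

```php
<?php
// Hypothetical sample markup, shortened from the question.
$html = '<ul><li id="idOfTheElement">Tel :<button type="button">text :</button>'
    . '<dl><dt>x</dt><dd>Signification</dd></dl>12 23 45 88 99</li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Locate the target element by its id attribute.
$xpath = new DOMXPath($doc);
$element = $xpath->query('//*[@id="idOfTheElement"]')->item(0);

// Copy the live node list first, so removing nodes does not skip entries.
foreach (iterator_to_array($element->childNodes) as $child) {
    if ($child->nodeType === XML_ELEMENT_NODE) {
        $element->removeChild($child); // drop <button>, <dl>, ... but keep text nodes
    }
}

$cleanContent = $element->textContent;
echo $cleanContent, PHP_EOL; // prints "Tel :12 23 45 88 99"
```

Note that this modifies the parsed document in place, so anything that needs the removed children has to be read before the loop runs.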
leealex answered Nov 18 '22 09:11
