How to get node text without children?

I use Nokogiri to parse an HTML page with the following content:

<p class="parent">
  Useful text
  <br>
  <span class="child">Useless text</span>
</p>

When I call page.css('p.parent').text, Nokogiri returns 'Useful text Useless text', but I need only 'Useful text'.

asked Aug 27 '13 16:08

Denis Kreshikhin

1 Answer

XPath includes the text() node test for selecting text nodes, so you could do:

page.xpath('//p[@class="parent"]/text()')

Selecting HTML classes with XPath gets tricky when the element can belong to more than one class, because @class="parent" compares the whole attribute string rather than a single class name, so this might not be ideal.

Fortunately Nokogiri adds the text() selector to CSS, so you can use:

page.css('p.parent > text()')

to get the text nodes that are direct children of p.parent. This will also return some nodes that are whitespace only, so you may have to filter them out.

answered Nov 14 '22 17:11

matt
