Get children of an element without the text nodes

Tags: xml, ruby, xpath, nokogiri

I am using Nokogiri with Ruby to interpret the contents of an XML file. I would like to get an array (or similar) of all elements that are direct children of <where> in my example. However, I am getting various text nodes (e.g. "\n\t\t\t"), which I do not want. Is there any way I can remove or ignore them?

require 'nokogiri'

@body = "
<xml>
  <request>
    <where>
      <username compare='e'>Admin</username>
      <rank compare='gt'>5</rank>
    </where>
  </request>
</xml>" #in my code, the XML contains tab-indentation, rather than spaces. It is edited here for display purposes.

@noko = Nokogiri::XML(@body)
xml_request = @noko.xpath("//xml/request")
where = xml_request.xpath("where")
c = where.children
p c

The above Ruby script outputs:

[#<Nokogiri::XML::Text:0x100344c "\n\t\t\t">, #<Nokogiri::XML::Element:0x1003350 name="username" attributes=[#<Nokogiri::XML::Attr:0x10032fc name="compare" value="e">] children=[#<Nokogiri::XML::Text:0x1007580 "Admin">]>, #<Nokogiri::XML::Text:0x100734c "\n\t\t\t">, #<Nokogiri::XML::Element:0x100722c name="rank" attributes=[#<Nokogiri::XML::Attr:0x10071d8 name="compare" value="gt">] children=[#<Nokogiri::XML::Text:0x1006cec "5">]>, #<Nokogiri::XML::Text:0x10068a8 "\n\t\t">]

I would like to somehow obtain the following object:

[#<Nokogiri::XML::Element:0x1003350 name="username" attributes=[#<Nokogiri::XML::Attr:0x10032fc name="compare" value="e">] children=[#<Nokogiri::XML::Text:0x1007580 "Admin">]>, #<Nokogiri::XML::Element:0x100722c name="rank" attributes=[#<Nokogiri::XML::Attr:0x10071d8 name="compare" value="gt">] children=[#<Nokogiri::XML::Text:0x1006cec "5">]>]

Currently I can work around the issue using

c.each{|child|
  if !child.text?
    ...
  end
}

but c.length == 5. It would make my life easier if someone could suggest how to exclude the direct child text nodes from c, so that c.length == 2.

656

asked Feb 14 '12 23:02

SimonMayer

1 Answer

You have (at least) three options from which to choose:

1. Use c = where.element_children instead of c = where.children.
2. Select only the child elements directly: c = xml_request.xpath('./where/*') or c = where.xpath('./*')
3. Filter the list of children to only those that are elements: c = where.children.select(&:element?)

180

answered Nov 04 '22 02:11

Phrogz
