Nokogiri each node do, Ruby

Tags: ruby, nokogiri

I have this XML:

   <kapitel>
      <nummer V="1"/>
      <von_icd_code V="A00"/>
      <bis_icd_code V="B99"/>
      <bezeichnung V="Bestimmte infektiöse und parasitäre Krankheiten"/>
      <gruppen_liste>
        <gruppe>
          <von_icd_code V="A00"/>
          <bis_icd_code V="A09"/>
          <bezeichnung V="Infektiöse Darmkrankheiten"/>
          <diagnosen_liste>
            <diagnose>
              <icd_code V="A00.-"/>
              <bezeichnung V="Cholera"/>
              <abrechenbar V="n"/>
              <krankheit_in_mitteleuropa_sehr_selten V="j"/>
              <schlüsselnummer_mit_inhalt_belegt V="j"/>
              <infektionsschutzgesetz_meldepflicht V="j"/>
              <infektionsschutzgesetz_abrechnungsbesonderheit V="j"/>

As you can see, my first node is kapitel. I would like to do something like kapitel.each do |f| so that Nokogiri extracts the von_icd_code and bis_icd_code nodes in the right order. My code:

    require 'rubygems'
    require 'nokogiri'
    require 'open-uri'

    @doc = Nokogiri::XML(File.open("icd.xml"))

    kapitel = @doc.css('kapitel')
    kapitel.each do |f|
      puts f.css('von_icd_code')
      puts f.css('bis_icd_code')
    end

The problem is that Nokogiri does not extract von_icd_code and bis_icd_code in the right order; instead, it first lists all von_icd_code nodes and then all bis_icd_code nodes. How can I extract the nodes in the right order?

Also, in my output I get:

<von_icd_code V="A00"/>

How can I get only the content of V, in this case A00?

Thanks!

John Smith asked Aug 10 '13 08:08


2 Answers

You can use Nokogiri's traverse method, which, well, traverses all the XML nodes in a recursive fashion.

Your example will then look similar to this:

names = %w(von_icd_code bis_icd_code)
@doc.traverse {|node| p node['V'] if names.include? node.name}

And it prints out:

"A00"
"B99"
"A00"
"A09"

There are a lot of neat things in Nokogiri::XML::Node that let us do really cool things with even the most complex XML files. For a short list of them, you can take a look at this cheat sheet.

Good luck!

Ivan Zarea answered Nov 07 '22 14:11


Since a bis_icd_code follows each von_icd_code, the obvious choice is the CSS + (next adjacent sibling) selector:

doc.css('von_icd_code').each do |icd|
  puts icd['V']
  puts icd.at('+ bis_icd_code')['V']
end
#=> A00
#=> B99
#=> A00
#=> A09
pguardiario answered Nov 07 '22 14:11