 

How can I use Ruby to parse through XML easily to query and find certain tag values?

Tags: parsing, xml, ruby

I am working with an API and want to know how I can easily search and display/format the output based on the tags.

For example, here is the page with the API and examples of the XML output:

http://developer.linkedin.com/docs/DOC-1191

I want to be able to treat each record as an object, with accessors such as User.first-name and User.last-name, so that I can display and store the information and run searches on it.

Is there perhaps a gem that makes this easier to do?

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people-search>
  <people total="108" count="10" start="0">
    <person>
      <id>tePXJ3SX1o</id>
      <first-name>Bill</first-name>
      <last-name>Doe</last-name>
      <headline>Marketing Professional and Matchmaker</headline>
      <picture-url>http://media.linkedin.com:/....</picture-url>
    </person>
    <person>
      <id>pcfBxmL_Vv</id>
      <first-name>Ed</first-name>
      <last-name>Harris</last-name>
      <headline>Chief Executive Officer</headline>
    </person>
     ...
  </people>
  <num-results>108</num-results>
</people-search>
Satchel asked Jan 28 '26 13:01



1 Answer

This might give you a jump start:

#!/usr/bin/env ruby

require 'nokogiri'

XML = %{<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people-search>
  <people total="108" count="10" start="0">
    <person>
      <id>tePXJ3SX1o</id>
      <first-name>Bill</first-name>
      <last-name>Doe</last-name>
      <headline>Marketing Professional and Matchmaker</headline>
      <picture-url>http://media.linkedin.com:/foo.png</picture-url>
    </person>
    <person>
      <id>pcfBxmL_Vv</id>
      <first-name>Ed</first-name>
      <last-name>Harris</last-name>
      <headline>Chief Executive Officer</headline>
    </person>
  </people>
  <num-results>108</num-results>
</people-search>}

doc = Nokogiri::XML(XML)

doc.search('//person').each do |person|
    firstname   = person.at('first-name').text
    puts "firstname: #{firstname}"
end
# >> firstname: Bill
# >> firstname: Ed

The idea is to loop over the element that repeats, "person" in this case, then pick out the child elements you want and extract their text. I'm using Nokogiri's .at() to get the first matching node, but there are other ways to do it.

The Nokogiri site has good examples and well-written documentation, so be sure to spend a bit of time going over it. You should find it easy going.

the Tin Man answered Jan 30 '26 03:01
