Ruby code for quick-and-dirty XML serialization?

Given a moderately complex XML structure (dozens of elements, hundreds of attributes) with no XSD and a desire to create an object model, what's an elegant way to avoid writing boilerplate from_xml() and to_xml() methods?

For instance, given:

<Foo bar="1"><Bat baz="blah"/></Foo>

How do I avoid writing endless sequences of:

require 'rexml/document'

class Foo
  attr_reader :bar, :bat

  def from_xml(el)
    @bar = el.attributes['bar']
    @bat = Bat.new
    @bat.from_xml(REXML::XPath.first(el, './Bat'))
  end
end

I don't mind creating the object structure explicitly; it's the serialization that I'm just sure can be taken care of with some higher-level programming...

I am not trying to save a line or two per class (by moving the from_xml behavior into an initializer or a class method, etc.). I am looking for the "meta" solution that duplicates my mental process:

"I know that every element is going to become a class name. I know that every XML attribute is going to be a field name. I know that the code to assign is just @#{attribute_name} = el.attributes['#{attribute_name}'] and then recurse into sub-elements. And the reverse for to_xml."

I agree with the suggestion that a "builder" class plus XmlSimple seems the right path. XML -> Hash -> ? -> Object model (and Profit!)

Update 2008-09-18 AM: Excellent suggestions from @Roman, @fatgeekuk, and @ScottKoon seem to have broken the problem open. I downloaded the Hpricot source to see how it solved the problem; the key methods are clearly instance_variable_set and class_eval. Work in irb is very encouraging, and I am now moving towards implementation. Very excited.
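
A minimal sketch of how instance_variable_set and class_eval might fit together, assuming REXML for parsing. The XmlModel module and its class_for and from_xml helpers are hypothetical names, not taken from Hpricot or from the suggestions above. It also assumes that attribute names are valid Ruby identifiers and that each child element appears at most once under its parent (a repeated child would overwrite the earlier one).

```ruby
require 'rexml/document'

# Hypothetical namespace that holds the generated classes.
module XmlModel
  # Look up the class for an element name, defining it on first use.
  def self.class_for(name)
    return const_get(name, false) if const_defined?(name, false)
    const_set(name, Class.new)
  end

  # Build an object from a REXML element: attributes become instance
  # variables with readers, child elements become nested objects.
  def self.from_xml(el)
    klass = class_for(el.name)
    obj = klass.new
    el.attributes.each do |attr_name, value|
      klass.class_eval { attr_reader attr_name }
      obj.instance_variable_set("@#{attr_name}", value)
    end
    el.elements.each do |child|
      field = child.name.downcase
      klass.class_eval { attr_reader field }
      obj.instance_variable_set("@#{field}", from_xml(child))
    end
    obj
  end
end

doc = REXML::Document.new('<Foo bar="1"><Bat baz="blah"/></Foo>')
foo = XmlModel.from_xml(doc.root)
puts foo.bar      # attribute value, read back as a String
puts foo.bat.baz  # attribute of the nested Bat object
```

Attribute values stay strings here; any type conversion would have to be layered on top.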

2 Answers

You could use Builder instead of writing your own to_xml method, and you could use XmlSimple to pull your XML file into a Hash instead of using the from_xml method. Unfortunately, I'm not sure you'll really gain all that much from using these techniques.

I suggest using XmlSimple for a start. After you run XmlSimple#xml_in on your input file, you get a hash. Then you can recurse into it (obj.instance_variables) and turn all internal hashes (element.is_a?(Hash)) into objects of the same name, for example:

obj.instance_variables.select { |v| obj.instance_variable_get(v).is_a?(Hash) }.each do |v|
  klass = Object.const_get(v.to_s.sub(/^@(.)/) { $1.upcase })
  # build an instance of klass from the hash here
end

Perhaps a cleaner way can be found to do this. Afterwards, if you want to generate XML from this new object, you'll probably need to change XmlSimple#xml_out to accept another option that distinguishes your object from the usual hash it receives as an argument. You'll then have to write your own version of the XmlSimple#value_to_xml method, so that it calls the accessor method instead of trying to access a hash structure. Another option is to have all your classes support the [] operator by returning the wanted instance variable.
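
As a rough illustration of the hash-to-object step and of the [] option, here is a small sketch in plain Ruby. The HashLike mixin and the hash_to_object helper are hypothetical names. The nested hash is written by hand for the example; XmlSimple's real output depends on its options (it may wrap child elements in arrays), so treat that shape as an assumption.

```ruby
# Hypothetical mixin: lets an object be read like a hash, e.g. obj['bar'].
module HashLike
  def [](key)
    instance_variable_get("@#{key}")
  end
end

class Bat
  include HashLike
  attr_reader :baz
end

class Foo
  include HashLike
  attr_reader :bar, :bat
end

# Recursively copy a hash into an instance of klass. A nested hash under
# the key "Bat" becomes a Bat object stored in @bat.
def hash_to_object(klass, hash)
  obj = klass.new
  hash.each do |key, value|
    if value.is_a?(Hash)
      value = hash_to_object(Object.const_get(key), value)
    end
    obj.instance_variable_set("@#{key.downcase}", value)
  end
  obj
end

# Hand-written stand-in for a parsed <Foo bar="1"><Bat baz="blah"/></Foo>.
data = { 'bar' => '1', 'Bat' => { 'baz' => 'blah' } }
foo = hash_to_object(Foo, data)
puts foo.bat.baz        # via the generated object model
puts foo['bat']['baz']  # via the hash-style [] access
```

With [] in place, code that only reads values by key could treat these objects much like the original hash, though whether XmlSimple's output routine would accept them unchanged is not something this sketch verifies.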

