
Visitor Pattern in Ruby, or just use a Block?

Hey there, I have read the few posts here on when and how to use the visitor pattern, plus some articles and chapters on it. It makes sense if you are traversing a highly structured AST and want to encapsulate the logic in a separate "visitor" object. But in Ruby it seems like overkill, because you could use blocks to do nearly the same thing.

I would like to pretty-print XML using Nokogiri. The author recommended that I use the visitor pattern, which would require me to create a FormatVisitor or something similar, so I could just say "node.accept(FormatVisitor.new)".

The issue is: what if I want to start customizing everything in the FormatVisitor (say it lets you specify how nodes are indented, how attributes are sorted, how attributes are spaced, etc.)?

  • One time I want the nodes to have 1 tab for each nest level, and the attributes to be in any order
  • The next time, I want the nodes to have 2 spaces, and the attributes in alphabetical order
  • The next time, I want them with 3 spaces and with two attributes per line.

I have a few options:

  • Create an options hash in the constructor (FormatVisitor.new({:tabs => 2}))
  • Set values after I have constructed the Visitor
  • Subclass the FormatVisitor for each new implementation
  • Or just use blocks, not the visitor

Instead of having to construct a FormatVisitor, set values, and pass it to the node.accept method, why not just do this:


node.pretty_print do |format|
  format.tabs = 2
  format.sort_attributes_by {...}
end
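
To make the block idea concrete, here is a minimal sketch of how such a block-configured printer could work. It does not use Nokogiri: SketchNode and PrettyFormat are hypothetical stand-ins invented for this example, and tab_string is an extra setting assumed here so that "2 spaces" can be expressed.

```ruby
# Hypothetical settings object yielded to the caller's block (not a Nokogiri class).
PrettyFormat = Struct.new(:tabs, :tab_string, :attribute_sorter) do
  # Store a block that decides the attribute order.
  def sort_attributes_by(&block)
    self.attribute_sorter = block
  end
end

# Hypothetical stand-in for an XML element, so the sketch is self-contained.
SketchNode = Struct.new(:name, :attributes, :children) do
  # Yield a PrettyFormat for configuration, then render the tree with it.
  def pretty_print
    format = PrettyFormat.new(1, "\t", nil)
    yield format if block_given?
    render(format, 0)
  end

  # Render this node and its children as an indented string.
  def render(format, depth)
    indent = format.tab_string * (format.tabs * depth)
    keys = attributes.keys
    keys = keys.sort_by(&format.attribute_sorter) if format.attribute_sorter
    attrs = keys.map { |key| " #{key}=\"#{attributes[key]}\"" }.join
    return "#{indent}<#{name}#{attrs}/>\n" if children.empty?

    body = children.map { |child| child.render(format, depth + 1) }.join
    "#{indent}<#{name}#{attrs}>\n#{body}#{indent}</#{name}>\n"
  end
end

tree = SketchNode.new("root", { "b" => "2", "a" => "1" },
                      [SketchNode.new("leaf", {}, [])])

output = tree.pretty_print do |format|
  format.tabs = 2
  format.tab_string = " "
  format.sort_attributes_by { |key| key }
end
puts output
```

The block only configures the settings object; the traversal still lives inside the node, which is the main difference from handing the node a visitor.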

That's in contrast to what I feel like the visitor pattern would look like:


visitor = Class.new(FormatVisitor) do
  attr_accessor :format
  def pretty_print(node)
    # do something with the text
    @format.tabs = 2 # two tabs per nest level
    @format.sort_attributes_by {...}
  end
end.new
doc.children.each do |child|
  child.accept(visitor)
end

Maybe I've got the visitor pattern all wrong, but from what I've read about it in Ruby, it seems like overkill. What do you think? Either way is fine with me, just wondering how you guys feel about it.

Thanks a lot, Lance

Lance asked Oct 05 '09 06:10



2 Answers

In essence, a Ruby block is the Visitor pattern without the extra boilerplate. For trivial cases, a block is sufficient.

For example, if you want to perform a simple operation on an Array object, you would just call the #each method with a block instead of implementing a separate Visitor class.
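
As a rough illustration of that comparison, the sketch below sums an array both ways. SumVisitor is a hypothetical class written only for this example.

```ruby
# Visitor style: a separate object carries the operation and its state.
class SumVisitor
  attr_reader :total

  def initialize
    @total = 0
  end

  # Called once per element.
  def visit(number)
    @total += number
  end
end

numbers = [1, 2, 3]

visitor = SumVisitor.new
numbers.each { |number| visitor.visit(number) }

# Block style: the operation is written inline where it is used.
block_total = 0
numbers.each { |number| block_total += number }

puts visitor.total
puts block_total
```

Both versions compute the same result; the visitor only adds a named, reusable home for the operation.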

However, there are advantages to implementing a concrete Visitor in certain cases:

  • For multiple similar but complex operations, visitor classes can share behavior through inheritance; blocks can't.
  • It is cleaner to write a separate test suite for a Visitor class.
  • It's always easier to merge small, dumb classes into a larger smart class than to split a complex smart class into smaller dumb classes.
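
The first two points can be sketched with a pair of small visitors. LineVisitor and TabLineVisitor are hypothetical names used only here, and visit takes a name and a depth rather than a real node to keep the example self-contained.

```ruby
# Hypothetical base visitor: shared logic lives here.
class LineVisitor
  attr_reader :lines

  def initialize
    @lines = []
  end

  # Record one indented line per visited entry.
  def visit(name, depth)
    @lines << "#{indent * depth}#{name}"
  end

  private

  # Hook that subclasses override to change the indentation.
  def indent
    "  "
  end
end

# A variant reuses everything except the indentation hook.
class TabLineVisitor < LineVisitor
  private

  def indent
    "\t"
  end
end

entries = [["root", 0], ["child", 1]]
spaces = LineVisitor.new
tabs = TabLineVisitor.new
entries.each do |name, depth|
  spaces.visit(name, depth)
  tabs.visit(name, depth)
end

p spaces.lines
p tabs.lines
```

Because each visitor is a plain object, it can be exercised in a test without building a document first.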

Your implementation seems mildly complex, and Nokogiri expects a Visitor instance that implements a #visit method, so the Visitor pattern is actually a good fit for your use case. Here is a class-based implementation of the visitor pattern:

FormatVisitor implements the #visit method and uses Formatter subclasses to format each node depending on node types and other conditions.

# FormatVisitor implements the #visit method and uses formatters to format
# each node recursively.
class FormatVisitor

  attr_reader :io

  # Set some initial conditions here.
  # Notice that you can specify a class to format attributes here.
  def initialize(io, tab: "  ", depth: 0, attributes_formatter_class: AttributesFormatter)
    @io = io
    @tab = tab
    @depth = depth
    @attributes_formatter_class = attributes_formatter_class
  end

  # Visitor interface. This is called by Nokogiri node when Node#accept
  # is invoked.
  def visit(node)
    NodeFormatter.format(node, @attributes_formatter_class, self)
  end

  # helper method to return a string with tabs calculated according to depth
  def tabs
    @tab * @depth
  end

  # creates and returns another visitor when going deeper in the AST
  def descend
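    # Pass the options as keyword arguments; a braced hash is treated as
    # a second positional argument on Ruby 3 and raises ArgumentError.
    self.class.new(
      @io,
      tab: @tab,
      depth: @depth + 1,
      attributes_formatter_class: @attributes_formatter_class
    )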
  end
end

Here is the implementation of the AttributesFormatter used above.

# This is a very simple attribute formatter that writes all attributes
# in one line in alphabetical order. It's easy to create another formatter
# with the same #initialize and #format interface, and you can then
# change the logic however you want.
class AttributesFormatter
  attr_reader :attributes, :io

  def initialize(attributes, io)
    @attributes, @io = attributes, io
  end

  def format
    return if attributes.empty?

    sorted_attribute_keys.each do |key|
      io << ' ' << key << '="' << attributes[key] << '"'
    end
  end

  private

  def sorted_attribute_keys
    attributes.keys.sort
  end
end

NodeFormatter uses the Factory pattern to instantiate the right formatter for a particular node. In this case I differentiate between text nodes, leaf element nodes, element nodes with text, and regular element nodes. Each type has different formatting requirements. Also note that this is not complete; e.g. comment nodes are not taken into account.

class NodeFormatter
  # convenience method that creates a formatter using the #formatter_for
  # factory method and calls #format to do the formatting.
  def self.format(node, attributes_formatter_class, visitor)
    formatter_for(node, attributes_formatter_class, visitor).format
  end

  # This is the factory that instantiates the right formatter
  # for the node
  def self.formatter_for(node, attributes_formatter_class, visitor)
    formatter_class_for(node).new(node, attributes_formatter_class, visitor)
  end

  def self.formatter_class_for(node)
    case
    when text?(node)
      Text
    when leaf_element?(node)
      LeafElement
    when element_with_text?(node)
      ElementWithText
    else
      Element
    end
  end

  # Is the node a text node? In Nokogiri a text node contains plain text
  def self.text?(node)
    node.class == Nokogiri::XML::Text
  end

  # Is this node an Element node? In Nokogiri an element node is a node
  # with a tag, e.g. <img src="foo.png" /> It can also contain a number
  # of child nodes
  def self.element?(node)
    node.class == Nokogiri::XML::Element
  end

  # Is this node a leaf element node? e.g. <img src="foo.png" />
  # Leaf element nodes should be formatted in one line.
  def self.leaf_element?(node)
    element?(node) && node.children.size == 0
  end

  # Is this node an element node whose only child is a text node?
  # e.g. <p>foobar</p>. We will format this in one line.
  def self.element_with_text?(node)
    element?(node) && node.children.size == 1 && text?(node.children.first)
  end

  attr_reader :node, :attributes_formatter_class, :visitor

  def initialize(node, attributes_formatter_class, visitor)
    @node = node
    @visitor = visitor
    @attributes_formatter_class = attributes_formatter_class
  end

  protected

  def attribute_formatter
    @attribute_formatter ||= @attributes_formatter_class.new(node.attributes, io)
  end

  def tabs
    visitor.tabs
  end

  def io
    visitor.io
  end

  def leaf?
    node.children.empty?
  end

  def write_tabs
    io << tabs
  end

  def write_children
    v = visitor.descend
    node.children.each { |child| child.accept(v) }
  end

  def write_attributes
    attribute_formatter.format
  end

  def write_open_tag
    io << '<' << node.name
    write_attributes
    if leaf?
      io << '/>'
    else
      io << '>'
    end
  end

  def write_close_tag
    return if leaf?
    io << '</' << node.name << '>'
  end

  def write_eol
    io << "\n"
  end

  class Element < self
    def format
      write_tabs
      write_open_tag
      write_eol
      write_children
      write_tabs
      write_close_tag
      write_eol
    end
  end

  class LeafElement < self
    def format
      write_tabs
      write_open_tag
      write_eol
    end
  end

  class ElementWithText < self
    def format
      write_tabs
      write_open_tag
      io << text
      write_close_tag
      write_eol
    end

    private

    def text
      node.children.first.text
    end
  end

  class Text < self
    def format
      write_tabs
      io << node.text
      write_eol
    end
  end
end

To use this class:

require 'nokogiri'

xml = "<root><aliens><alien><name foo=\"bar\">Alf<asdf/></name></alien></aliens></root>"
doc = Nokogiri::XML(xml)

# The FormatVisitor accepts an IO object and writes to it
# as it visits each node; in this case, I pick STDOUT.
# You can also use File IO, Network IO, StringIO, etc.
# As long as it supports the #<< method, it will work.
# I'm using the defaults here (two spaces, with starting depth at 0).
visitor = FormatVisitor.new(STDOUT)

# Start at the root element: it calls visitor.visit with itself, which
# visits its children recursively and writes the result to the IO object
# (in this case, STDOUT). Starting at doc itself would hand the Document
# node to the formatter, which is written for element and text nodes.
doc.root.accept(visitor)

# Prints:
# <root>
#   <aliens>
#     <alien>
#       <name foo="bar">
#         Alf
#         <asdf/>
#       </name>
#     </alien>
#   </aliens>
# </root>

With the above code, you can change node formatting behavior by writing extra subclasses of NodeFormatter and plugging them into the factory method. You can control the formatting of attributes with different implementations of AttributesFormatter. As long as you adhere to its interface, you can pass one as the attributes_formatter_class argument without modifying anything else.
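
As a sketch of that second point, here is one possible alternative attributes formatter. DocumentOrderAttributesFormatter is a hypothetical name; it follows the same #initialize(attributes, io) and #format interface but skips the sorting. To keep the example self-contained it is fed a plain Hash of strings and a StringIO, whereas Nokogiri's node.attributes holds attribute objects, hence the to_s call.

```ruby
require "stringio"

# Hypothetical alternative formatter: same interface as AttributesFormatter,
# but attributes are written in the order the hash yields them.
class DocumentOrderAttributesFormatter
  attr_reader :attributes, :io

  def initialize(attributes, io)
    @attributes = attributes
    @io = io
  end

  # Write each attribute as ' key="value"' without sorting.
  def format
    attributes.each do |key, value|
      io << ' ' << key << '="' << value.to_s << '"'
    end
  end
end

io = StringIO.new
DocumentOrderAttributesFormatter.new({ "src" => "foo.png", "alt" => "Foo" }, io).format
puts io.string
```

With the classes from this answer loaded, it would be plugged in as FormatVisitor.new(STDOUT, attributes_formatter_class: DocumentOrderAttributesFormatter).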

List of design patterns used:

  • Visitor pattern: handles node traversal logic. (It is also the interface Nokogiri requires.)
  • Factory pattern: determines the formatter based on node type and other formatting conditions. If you don't like the class methods on NodeFormatter, you can extract them into a NodeFormatterFactory to be more proper.
  • Dependency injection (DI / IoC): controls the formatting of attributes.

This demonstrates how you can combine a few patterns to achieve the flexibility you desire. Whether you need that flexibility is something you have to decide.

Aaron Qian answered Oct 23 '22 13:10


I would go with what is simple and works. I don't know the details, but what you wrote looks simpler than the Visitor pattern. If it also works for you, I would use that. Personally, I am tired of all these techniques that ask you to create a huge "network" of interrelated classes just to solve one small problem.

Some would say, yeah, but if you do it using patterns then you can cover many future needs and blah blah. I say, do now what works and if the need arises, you can refactor in the future. In my projects, that need almost never arises, but that's a different story.

Petros answered Oct 23 '22 13:10