Hey there, I have read the few posts here on when and how to use the visitor pattern, along with some articles and chapters on it. It makes sense if you are traversing a highly structured AST and want to encapsulate the logic in a separate "visitor" object. But in Ruby it seems like overkill, because you could just use blocks to do nearly the same thing.
I would like to pretty-print XML using Nokogiri. The author recommended that I use the visitor pattern, which would require me to create a FormatVisitor or something similar, so I could just say "node.accept(FormatVisitor.new)".
The issue is, what if I want to start customizing everything in the FormatVisitor (say it lets you specify how nodes are tabbed, how attributes are sorted, how attributes are spaced, etc.)?
I have a few options:
Instead of having to construct a FormatVisitor, set values, and pass it to the node.accept method, why not just do this:
node.pretty_print do |format|
  format.tabs = 2
  format.sort_attributes_by {...}
end
That's in contrast to what I feel like the visitor pattern would look like:
visitor = Class.new(FormatVisitor) do
  attr_accessor :format
  def pretty_print(node)
    # do something with the text
    @format.tabs = 2 # two tabs per nest level
    @format.sort_attributes_by {...}
  end
end.new
doc.children.each do |child|
  child.accept(visitor)
end
Maybe I've got the visitor pattern all wrong, but from what I've read about it in Ruby, it seems like overkill. What do you think? Either way is fine with me, I'm just wondering how you guys feel about it.
Thanks a lot, Lance
The Visitor design pattern is one of the behavioral design patterns. It is used when we have to perform an operation on a group of similar kinds of objects. With the help of the visitor pattern, we can move the operational logic from the objects to another class.
In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure on which it operates. A practical result of this separation is the ability to add new operations to existing object structures without modifying the structures.
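To make that separation concrete, here is a minimal sketch in plain Ruby. The node classes and visitor names (NumberNode, AddNode, Evaluator, Printer) are made up for this illustration and are not taken from any library. The point is that Printer is a second operation added without touching the node classes.

```ruby
# Hypothetical node classes; each one only knows how to accept a visitor.
class NumberNode
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def accept(visitor)
    visitor.visit_number(self)
  end
end

class AddNode
  attr_reader :left, :right

  def initialize(left, right)
    @left = left
    @right = right
  end

  def accept(visitor)
    visitor.visit_add(self)
  end
end

# One operation: evaluate the tree.
class Evaluator
  def visit_number(node)
    node.value
  end

  def visit_add(node)
    node.left.accept(self) + node.right.accept(self)
  end
end

# A second operation, added without modifying the node classes.
class Printer
  def visit_number(node)
    node.value.to_s
  end

  def visit_add(node)
    "(#{node.left.accept(self)} + #{node.right.accept(self)})"
  end
end

tree = AddNode.new(NumberNode.new(1), AddNode.new(NumberNode.new(2), NumberNode.new(3)))
puts tree.accept(Evaluator.new)  # 6
puts tree.accept(Printer.new)    # (1 + (2 + 3))
```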
Pattern matching (experimental when it was introduced in Ruby 2.7, and no longer experimental in current Ruby versions) is a feature allowing deep matching of structured values: checking the structure, and binding the matched parts to local variables. Pattern matching in Ruby is implemented with the case/in expression and with the in operator, which can be used in a standalone expression: <expression> in <pattern>
In essence, a Ruby block is the Visitor pattern without the extra boilerplate. For trivial cases, a block is sufficient.
For example, if you want to perform a simple operation on an Array object, you would just call the #each method with a block instead of implementing a separate Visitor class.
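As a rough illustration of that point, the sketch below sums an array twice: once with an inline block, and once with a small visitor-style class. SumVisitor is a hypothetical name used only for this comparison.

```ruby
prices = [3, 5, 8]

# Block version: the "visiting" logic is written inline.
total = 0
prices.each { |price| total += price }

# Visitor-object version: the same logic lives in a small class.
# SumVisitor is a hypothetical name used only for this illustration.
class SumVisitor
  attr_reader :total

  def initialize
    @total = 0
  end

  def visit(price)
    @total += price
  end
end

visitor = SumVisitor.new
prices.each { |price| visitor.visit(price) }

puts total          # 16
puts visitor.total  # 16
```

For something this small the class buys you nothing, which is why the block is usually enough.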
However, there are advantages to implementing a concrete Visitor in certain cases.
Your implementation seems mildly complex, and Nokogiri expects a Visitor instance that implements a #visit method, so the Visitor pattern would actually be a good fit for your particular use case. Here is a class-based implementation of the visitor pattern:
FormatVisitor implements the #visit method and uses Formatter subclasses to format each node depending on node types and other conditions.
# FormatVisitor implements the #visit method and uses a formatter to format
# each node recursively.
class FormatVisitor
  attr_reader :io
  # Set some initial conditions here.
  # Notice that you can specify a class to format attributes here.
  def initialize(io, tab: "  ", depth: 0, attributes_formatter_class: AttributesFormatter)
    @io = io
    @tab = tab
    @depth = depth
    @attributes_formatter_class = attributes_formatter_class
  end
  # Visitor interface. This is called by the Nokogiri node when Node#accept
  # is invoked.
  def visit(node)
    NodeFormatter.format(node, @attributes_formatter_class, self)
  end
  # Helper method to return a string with tabs calculated according to depth.
  def tabs
    @tab * @depth
  end
  # Creates and returns another visitor when going deeper in the AST.
  # The options are passed as keyword arguments (not as a hash literal),
  # which is required on Ruby 3 and later.
  def descend
    self.class.new(@io,
      tab: @tab,
      depth: @depth + 1,
      attributes_formatter_class: @attributes_formatter_class)
  end
end
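The idea behind #descend, returning a fresh visitor one level deeper instead of mutating the current one, can be tried out without Nokogiri. This is only a sketch under my own assumptions: IndentVisitor is a hypothetical class, and nested arrays stand in for the node tree.

```ruby
require 'stringio'

# IndentVisitor is a hypothetical stand-in for a visitor that tracks depth.
class IndentVisitor
  attr_reader :io

  def initialize(io, tab: "  ", depth: 0)
    @io = io
    @tab = tab
    @depth = depth
  end

  # Strings are treated as leaves, arrays as a list of children.
  def visit(item)
    if item.is_a?(Array)
      deeper = descend
      item.each { |child| deeper.visit(child) }
    else
      io << (@tab * @depth) << item << "\n"
    end
  end

  # Return a new visitor one level deeper, leaving this one untouched.
  def descend
    self.class.new(@io, tab: @tab, depth: @depth + 1)
  end
end

out = StringIO.new
visitor = IndentVisitor.new(out)
["a", ["b", ["c"]], "d"].each { |item| visitor.visit(item) }
puts out.string
```

Because each level gets its own visitor, the indentation for "d" is correct again after the nested items without any bookkeeping to undo.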
Here is the implementation of AttributesFormatter used above.
# This is a very simple attribute formatter that writes all attributes
# in one line in alphabetical order. It's easy to create another formatter
# with the same #initialize and #format interface, and you can then
# change the logic however you want.
class AttributesFormatter
  attr_reader :attributes, :io
  def initialize(attributes, io)
    @attributes, @io = attributes, io
  end
  def format
    return if attributes.empty?
    sorted_attribute_keys.each do |key|
      io << ' ' << key << '="' << attributes[key] << '"'
    end
  end
  private
  def sorted_attribute_keys
    attributes.keys.sort
  end
end
NodeFormatter uses the Factory pattern to instantiate the right formatter for a particular node. In this case I differentiate between text nodes, leaf element nodes, element nodes with text, and regular element nodes. Each type has a different formatting requirement. Also note that this is not complete, e.g. comment nodes are not taken into account.
class NodeFormatter
  # Convenience method that creates a formatter using the #formatter_for
  # factory method and calls #format to do the formatting.
  def self.format(node, attributes_formatter_class, visitor)
    formatter_for(node, attributes_formatter_class, visitor).format
  end
  # This is the factory that creates the right formatter for the node.
  def self.formatter_for(node, attributes_formatter_class, visitor)
    formatter_class_for(node).new(node, attributes_formatter_class, visitor)
  end
  def self.formatter_class_for(node)
    case
    when text?(node)
      Text
    when leaf_element?(node)
      LeafElement
    when element_with_text?(node)
      ElementWithText
    else
      Element
    end
  end
  # Is the node a text node? In Nokogiri a text node contains plain text.
  def self.text?(node)
    node.class == Nokogiri::XML::Text
  end
  # Is this node an Element node? In Nokogiri an element node is a node
  # with a tag, e.g. <img src="foo.png" />. It can also contain a number
  # of child nodes.
  def self.element?(node)
    node.class == Nokogiri::XML::Element
  end
  # Is this node a leaf element node? e.g. <img src="foo.png" />
  # Leaf element nodes should be formatted in one line.
  def self.leaf_element?(node)
    element?(node) && node.children.size == 0
  end
  # Is this node an element node with a single child that is a text node?
  # e.g. <p>foobar</p>. We will format this in one line.
  def self.element_with_text?(node)
    element?(node) && node.children.size == 1 && text?(node.children.first)
  end
  attr_reader :node, :attributes_formatter_class, :visitor
  def initialize(node, attributes_formatter_class, visitor)
    @node = node
    @visitor = visitor
    @attributes_formatter_class = attributes_formatter_class
  end
  protected
  def attribute_formatter
    @attribute_formatter ||= @attributes_formatter_class.new(node.attributes, io)
  end
  def tabs
    visitor.tabs
  end
  def io
    visitor.io
  end
  def leaf?
    node.children.empty?
  end
  def write_tabs
    io << tabs
  end
  def write_children
    v = visitor.descend
    node.children.each { |child| child.accept(v) }
  end
  def write_attributes
    attribute_formatter.format
  end
  def write_open_tag
    io << '<' << node.name
    write_attributes
    if leaf?
      io << '/>'
    else
      io << '>'
    end
  end
  def write_close_tag
    return if leaf?
    io << '</' << node.name << '>'
  end
  def write_eol
    io << "\n"
  end
  class Element < self
    def format
      write_tabs
      write_open_tag
      write_eol
      write_children
      write_tabs
      write_close_tag
      write_eol
    end
  end
  class LeafElement < self
    def format
      write_tabs
      write_open_tag
      write_eol
    end
  end
  class ElementWithText < self
    def format
      write_tabs
      write_open_tag
      io << text
      write_close_tag
      write_eol
    end
    private
    def text
      node.children.first.text
    end
  end
  class Text < self
    def format
      write_tabs
      io << node.text
      write_eol
    end
  end
end
To use this class:
require 'nokogiri'
xml = "<root><aliens><alien><name foo=\"bar\">Alf<asdf/></name></alien></aliens></root>"
doc = Nokogiri::XML(xml)
# The FormatVisitor accepts an IO object and writes to it
# as it visits each node. In this case, I pick STDOUT.
# You can also use File IO, network IO, StringIO, etc.
# As long as it supports the #<< method, it will work.
# I'm using the defaults here (two spaces, with starting depth at 0).
visitor = FormatVisitor.new(STDOUT)
# This makes the root element call visitor.visit with itself,
# which triggers the visiting of each child recursively, with the
# contents written to the IO object (in this case, printed to STDOUT).
# Starting from doc.root means the output begins at <root>.
doc.root.accept(visitor)
# Prints:
# <root>
#   <aliens>
#     <alien>
#       <name foo="bar">
#         Alf
#         <asdf/>
#       </name>
#     </alien>
#   </aliens>
# </root>
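Since the formatters only ever call << on the IO object, anything that responds to << can be swapped in. Here is a small sketch of that idea on its own; write_greeting is a hypothetical helper, not part of the code above.

```ruby
require 'stringio'

# write_greeting is a hypothetical helper; it only relies on `<<`.
def write_greeting(io)
  io << "<greeting>" << "hello" << "</greeting>" << "\n"
end

buffer = StringIO.new
write_greeting(buffer)   # capture the output in memory
write_greeting($stdout)  # print the same output to the terminal

puts buffer.string == "<greeting>hello</greeting>\n"  # true
```

Passing a StringIO to the visitor in the same way is handy if you want the formatted XML as a string, for example in a test.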
With the above code, you can change node formatting behavior by writing extra subclasses of NodeFormatter and plugging them into the factory method. You can control the formatting of attributes with different implementations of AttributesFormatter. As long as you adhere to its interface, you can plug a new one into the attributes_formatter_class argument without modifying anything else.
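For instance, a formatter that writes each attribute on its own line might look roughly like this. OnePerLineAttributesFormatter is a hypothetical name, and a plain Hash stands in for node.attributes so the sketch runs without Nokogiri.

```ruby
require 'stringio'

# OnePerLineAttributesFormatter is a hypothetical alternative that keeps the
# same interface (#initialize(attributes, io) and #format) but writes each
# attribute on its own line. A plain Hash stands in for node.attributes here.
class OnePerLineAttributesFormatter
  attr_reader :attributes, :io

  def initialize(attributes, io)
    @attributes = attributes
    @io = io
  end

  def format
    return if attributes.empty?

    attributes.keys.sort.each do |key|
      io << "\n  " << key << '="' << attributes[key].to_s << '"'
    end
  end
end

out = StringIO.new
OnePerLineAttributesFormatter.new({ "src" => "foo.png", "alt" => "Foo" }, out).format
puts out.string
```

You would then pass it in with FormatVisitor.new(STDOUT, attributes_formatter_class: OnePerLineAttributesFormatter).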
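List of design patterns used:
- Visitor pattern: FormatVisitor handles the node traversal.
- Factory pattern: the class methods on NodeFormatter pick a formatter for each node; you can extract them into NodeFormatterFactory to be more proper.
- Dependency injection: the attributes formatter is passed in through attributes_formatter_class.
This demonstrates how you can combine a few patterns to achieve the flexibility you desire. Whether you need that flexibility is something you have to decide.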
I would go with what is simple and works. I don't know the details, but what you wrote looks simpler than the Visitor pattern. If it also works for you, I would use that. Personally, I am tired of all these techniques that ask you to create a huge "network" of interrelated classes just to solve one small problem.
Some would say, yeah, but if you do it using patterns then you can cover many future needs and blah blah. I say, do now what works, and if the need arises, you can refactor in the future. In my projects, that need almost never arises, but that's a different story.
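For what it's worth, the block-configured style from the question does not take much code either. The sketch below is only an assumption about how it could work: Format and pretty_print_tag are hypothetical names, neither is part of Nokogiri, and a Hash stands in for a node's attributes.

```ruby
# Format and pretty_print_tag are hypothetical names modeled on the
# question's example; neither is part of Nokogiri.
Format = Struct.new(:tabs, :attribute_sorter) do
  def sort_attributes_by(&block)
    self.attribute_sorter = block
  end
end

def pretty_print_tag(tag, attributes, depth: 0)
  format = Format.new(1, ->(name) { name })  # defaults: 1 space, sort by name
  yield format if block_given?               # let the caller override them

  indent = " " * (format.tabs * depth)
  attrs = attributes.keys.sort_by(&format.attribute_sorter)
                    .map { |name| %( #{name}="#{attributes[name]}") }
                    .join
  "#{indent}<#{tag}#{attrs}/>"
end

line = pretty_print_tag("img", { "src" => "foo.png", "alt" => "Foo" }, depth: 1) do |format|
  format.tabs = 2
  format.sort_attributes_by { |name| name.reverse }
end
puts line
```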