In Ruby 2.7 and 3.1 this script does the same thing whether or not the % signs are there:
def count(str)
  state = :start
  tbr = []
  str.each_char do
%  %case state
    when :start
      tbr << 0
%  %state = :symbol
%  when :symbol
      tbr << 1
%  %  state = :start
%  end
  end
  tbr
end
p count("Foobar")
How is this parsed? You can add more % or remove some and it will still work, but not any combination. I found this example through trial and error.
I was teaching someone Ruby and noticed only after their script was working that they had a random % in the margin. I pushed it a little further to see how many it would accept.
This is a Percent String Literal receiving the message %.
A Percent String Literal has the form: a % character, an opening-delimiter, the string content, and a closing-delimiter.
If the opening-delimiter is one of <, [, (, or {, then the closing-delimiter must be the corresponding >, ], ), or }. Otherwise, the opening-delimiter can be any other non-alphanumeric character and the closing-delimiter must be the same character.
So, %   (that is, % SPACE SPACE) is a Percent String Literal with SPACE as the delimiter and no content. I.e. it is equivalent to "".
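As a quick check of the delimiter rule, the following sketch builds empty Percent String Literals with several delimiters and compares them to "". The variable names are only illustrative, and the SPACE-delimited literal is wrapped in parentheses so that an editor trimming trailing whitespace cannot change it.

```ruby
# Empty Percent String Literals with paired delimiters.
paired = [%(), %[], %<>, %{}]

# Empty Percent String Literals where the same character opens and closes.
same = [%||, %!!, %--]

# SPACE as the delimiter: "%" SPACE SPACE, wrapped in parentheses.
space_delimited = (%  )

p paired.all? { |s| s == "" }
p same.all? { |s| s == "" }
p space_delimited == ""
```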
a % b
is equivalent to
a.%(b)
I.e. sending the message % to the result of evaluating the expression a, passing the result of evaluating the expression b as the single argument.
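To see that the operator form is just a message send, here is a minimal sketch. The Recorder class and its % method are made up for this illustration and are not part of the question.

```ruby
# The operator form and the explicit method-call form are the same message send.
a = 17
b = 5
p a % b                  # operator syntax
p a.%(b)                 # explicit method-call syntax
p a.public_send(:%, b)   # sending the message by name

# Any class can respond to % by defining a method with that name.
# Recorder is a hypothetical class used only for this illustration.
class Recorder
  def %(arg)
    "received #{arg.inspect}"
  end
end

recorder = Recorder.new
p recorder % :anything
```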
Which means
%  % b
is (roughly) equivalent to
"".%(b)
So, what's b then? Well, it's the expression following the % operator (not to be confused with the % sigil of the Percent String Literal).
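A small sketch of that reading, assuming a local variable named state as in the question. The outer parentheses only keep the two delimiter SPACEs away from the line edges, and the parentheses around the assignment are added here for clarity (the question's code omits them).

```ruby
state = :start

# "%" SPACE SPACE is the empty string; the next "%" is the operator,
# and its argument is the assignment expression in parentheses.
with_literal = (%  % (state = :symbol))
after_literal = state

state = :start

# The same thing spelled out as an explicit message send.
with_send = "".%(state = :symbol)
after_send = state

p with_literal   # result of formatting with an empty format string
p with_send
p after_literal  # value of state after the argument was evaluated
p after_send
```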
The entire code is (roughly) equivalent to this:
def count(str)
  state = :start
  tbr = []
  str.each_char do
    "".%(case state
    when :start
      tbr << 0
      "".%(state = :symbol)
      ""when :symbol
      tbr << 1
      "".%(state = :start)
      ""end)
  end
  tbr
end
p count("Foobar")
You can figure this out yourself by just asking Ruby:
# ruby --dump=parsetree_with_comment test.rb
###########################################################
## Do NOT use this node dump for any purpose other than ##
## debug and research. Compatibility is not guaranteed. ##
###########################################################
# @ NODE_SCOPE (id: 62, line: 1, location: (1,0)-(17,17))
# | # new scope
# | # format: [nd_tbl]: local table, [nd_args]: arguments, [nd_body]: body
# +- nd_tbl (local table): (empty)
# +- nd_args (arguments):
# | (null node)
[…]
# | | +- nd_body (body):
# | | @ NODE_OPCALL (id: 48, line: 5, location: (5,0)-(12,7))*
# | | | # method invocation
# | | | # format: [nd_recv] [nd_mid] [nd_args]
# | | | # example: foo + bar
# | | +- nd_mid (method id): :%
# | | +- nd_recv (receiver):
# | | | @ NODE_STR (id: 12, line: 5, location: (5,0)-(5,3))
# | | | | # string literal
# | | | | # format: [nd_lit]
# | | | | # example: 'foo'
# | | | +- nd_lit (literal): ""
# | | +- nd_args (arguments):
# | | @ NODE_LIST (id: 47, line: 5, location: (5,4)-(12,7))
# | | | # list constructor
# | | | # format: [ [nd_head], [nd_next].. ] (length: [nd_alen])
# | | | # example: [1, 2, 3]
# | | +- nd_alen (length): 1
# | | +- nd_head (element):
# | | | @ NODE_CASE (id: 46, line: 5, location: (5,4)-(12,7))
# | | | | # case statement
# | | | | # format: case [nd_head]; [nd_body]; end
# | | | | # example: case x; when 1; foo; when 2; bar; else baz; end
# | | | +- nd_head (case expr):
# | | | | @ NODE_DVAR (id: 13, line: 5, location: (5,9)-(5,14))
# | | | | | # dynamic variable reference
# | | | | | # format: [nd_vid](dvar)
# | | | | | # example: 1.times { x = 1; x }
# | | | | +- nd_vid (local variable): :state
[…]
Some of the interesting places here are the node at (id: 12, line: 5, location: (5,0)-(5,3)), which is the first string literal, and (id: 48, line: 5, location: (5,0)-(12,7)), which is the first % message send:
# | | +- nd_body (body):
# | | @ NODE_OPCALL (id: 48, line: 5, location: (5,0)-(12,7))*
# | | | # method invocation
# | | | # format: [nd_recv] [nd_mid] [nd_args]
# | | | # example: foo + bar
# | | +- nd_mid (method id): :%
# | | +- nd_recv (receiver):
# | | | @ NODE_STR (id: 12, line: 5, location: (5,0)-(5,3))
# | | | | # string literal
# | | | | # format: [nd_lit]
# | | | | # example: 'foo'
# | | | +- nd_lit (literal): ""
Note: this is just the simplest possible method of obtaining a parse tree, which unfortunately contains a lot of internal minutiae that are not really relevant to figuring out what is going on. There are other methods such as the parser gem or its companion ast which produce far more readable results:
# ruby-parse count.rb
(begin
(def :count
(args
(arg :str))
(begin
(lvasgn :state
(sym :start))
(lvasgn :tbr
(array))
(block
(send
(lvar :str) :each_char)
(args)
(send
(dstr) :%
(case
(lvar :state)
(when
(sym :start)
(begin
(send
(lvar :tbr) :<<
(int 0))
(send
(dstr) :%
(lvasgn :state
(sym :symbol)))
(dstr)))
(when
(sym :symbol)
(begin
(send
(lvar :tbr) :<<
(int 1))
(send
(dstr) :%
(lvasgn :state
(sym :start)))
(dstr))) nil)))
(lvar :tbr)))
(send nil :p
(send nil :count
(str "Foobar"))))
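If installing a gem is not an option, Ruby's standard library ships Ripper, which can also show how a snippet is parsed. The sketch below is only an illustration: the source string is a reduced, hypothetical stand-in for one line of the script, and the exact shape of the returned arrays may differ between Ruby versions.

```ruby
require "ripper"
require "pp"

# "%" SPACE SPACE followed by the % operator and an argument.
source = "%  % 42"

# Ripper.sexp returns a nested array describing the parse, or nil on a syntax error.
tree = Ripper.sexp(source)
pp tree

# Ripper.lex returns the individual tokens together with their positions.
Ripper.lex(source).each { |token| p token }
```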
So far, all we have talked about is the Syntax, i.e. the grammatical structure of the code. But what does it mean?
The method String#% performs String Formatting a la C's printf family of functions. However, since the format string (the receiver of the % message) is the empty string, the result of the message send is the empty string as well, since there is nothing to format.
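A short illustration of both cases, with made-up values. When run with warnings or debug mode enabled, Ruby may complain about the unused arguments in the second half.

```ruby
# Format strings with directives consume their arguments...
padded = "%05d" % 42
pair   = "%s=%d" % ["answer", 42]

# ...but an empty format string has no directives, so the result is empty
# no matter what is passed.
from_symbol = "" % :symbol
from_array  = "" % [0, 1]
from_string = "" % ""

p padded
p pair
p [from_symbol, from_array, from_string]
```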
If Ruby were a purely functional, lazy, non-strict language, the result would be equivalent to this:
def count(str)
  state = :start
  tbr = []
  str.each_char do
    "".%(case state
    when :start
      tbr << 0
      ""
      ""when :symbol
      tbr << 1
      ""
      ""end)
  end
  tbr
end
p count("Foobar")
which in turn is equivalent to this
def count(str)
  state = :start
  tbr = []
  str.each_char do
    "".%(case state
    when :start
      tbr << 0
      ""
    when :symbol
      tbr << 1
      ""
    end)
  end
  tbr
end
p count("Foobar")
which is equivalent to this
def count(str)
  state = :start
  tbr = []
  str.each_char do
    "".%(case state
    when :start
      ""
    when :symbol
      ""
    end)
  end
  tbr
end
p count("Foobar")
which is equivalent to this
def count(str)
  state = :start
  tbr = []
  str.each_char do
    "".%(case state
    when :start, :symbol
      ""
    end)
  end
  tbr
end
p count("Foobar")
which is equivalent to this
def count(str)
  state = :start
  tbr = []
  str.each_char do
    ""
  end
  tbr
end
p count("Foobar")
which is equivalent to this
def count(str)
  state = :start
  tbr = []
  tbr
end
p count("Foobar")
which is equivalent to this
def count(str)
  []
end
p count("Foobar")
Clearly, that is not what is happening, and the reason is that Ruby isn't a purely functional, lazy, non-strict language. While the arguments which are passed to the % message sends are irrelevant to the result of the message send, they are nevertheless evaluated (because Ruby is strict and eager) and they have side-effects (because Ruby is not purely functional), i.e. their side-effects of re-assigning variables and mutating the tbr result array are still executed.
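The difference can be sketched in a few lines. The log array and the lambda are illustrative names only, and the lambda merely imitates delayed evaluation, since Ruby itself never evaluates arguments lazily.

```ruby
log = []

# Strict and eager: the argument expression runs before % is even called,
# so its side effect is visible although the result ignores it.
eager_result = "" % (log << :evaluated)

# A lambda delays evaluation; because nothing calls it, its side effect never happens.
delayed = -> { log << :never_called }
lazy_result = ""

p eager_result
p lazy_result
p log
```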
If this code were written in a more Ruby-like style with less mutation and fewer side-effects and instead using functional transformations, then arbitrarily replacing results with empty strings would immediately break it. The only reason there is no effect here is the abundant use of side-effects and mutation.
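For comparison, here is one possible side-effect-free version of count. It is an illustrative rewrite, not taken from the question, and both method names are hypothetical. Each element is derived from its position, so the block's return value is the result.

```ruby
# A hypothetical rewrite of count without mutation: every character maps
# to 0 or 1 depending on whether its index is even or odd.
def count_functional(str)
  str.each_char.with_index.map { |_char, index| index.even? ? 0 : 1 }
end

# The same shape with the block result replaced by an empty string,
# as the stray % signs effectively do.
def count_broken(str)
  str.each_char.with_index.map { |_char, _index| "" }
end

p count_functional("Foobar")
p count_broken("Foobar")
```

Here the block's value is the output, so swapping it for "" changes what the method returns.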