I ran the following from a bash shell:
echo 'hello world' | ruby -ne 'puts $_ if /hello/'
I thought it was a typo at first, but surprisingly it printed hello world.
I meant to type:
echo 'hello world' | ruby -ne 'puts $_ if /hello/ === $_'
Can anyone explain, or point to documentation on, why we get this implicit comparison to $_?
I'd also like to note:
echo 'hello world' | ruby -ne 'puts $_ if /test/'
This doesn't output anything.
The Ruby parser has a special case for regular expression literals in conditionals. Normally (i.e. without using the -e, -n or -p command line options) this code:
if /foo/
  puts "TRUE!"
end
produces:
$ ruby regex-in-conditional1.rb
regex-in-conditional1.rb:1: warning: regex literal in condition
Assigning something that matches the regex to $_ first, like this:
$_ = 'foo'
if /foo/
  puts "TRUE!"
end
produces:
$ ruby regex-in-conditional2.rb
regex-in-conditional2.rb:2: warning: regex literal in condition
TRUE!
This is a (poorly documented) exception to the normal rules for Ruby conditionals, where anything that’s not false or nil evaluates as truthy.
This only applies to regex literals; the following behaves as you might expect for a conditional:
regex = /foo/
if regex
  puts "TRUE!"
end
output:
$ ruby regex-in-conditional3.rb
TRUE!
This is handled in the parser. Searching the MRI code for the text of the warning produces a single match in parse.y:
case NODE_DREGX:
case NODE_DREGX_ONCE:
warning_unless_e_option(parser, node, "regex literal in condition");
return NEW_MATCH2(node, NEW_GVAR(rb_intern("$_")));
I don’t know Bison, so I can’t explain exactly what is going on here, but there are some clues you can deduce. The warning_unless_e_option function simply suppresses the warning if the -e option has been set, as this feature is discouraged in normal code but can be useful in expressions from the command line (this explains why you don’t see the warning in your code). The next line seems to be constructing a parse subtree which is a regular expression match between the regex and the $_ global variable, which contains “[t]he last input line of string by gets or readline”. These nodes will then be compiled into the usual regular expression method call.
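To make that rewrite concrete, here is a minimal sketch comparing the implicit form with the explicit match the parser appears to construct. The helper names implicit_match? and explicit_match? are made up for illustration, and Ruby may print the "regex literal in condition" warning when it parses the first method.

```ruby
# Hypothetical helper names; a sketch of the behaviour described above,
# not code taken from MRI.

# Relies on the parser's special case: a bare regex literal used as a
# condition is matched against $_.
def implicit_match?(line)
  $_ = line # $_ is local to the current method call, not truly global
  if /foo/
    true
  else
    false
  end
end

# The explicit form that the NEW_MATCH2 node seems to correspond to.
def explicit_match?(line)
  $_ = line
  if /foo/ =~ $_
    true
  else
    false
  end
end

["foo bar", "baz"].each do |line|
  puts "#{line.inspect}: implicit=#{implicit_match?(line)} explicit=#{explicit_match?(line)}"
end
```

Both helpers should agree for every input, which is consistent with the idea that the literal in the condition is simply shorthand for a match against $_.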
That shows what is happening. I’ll just finish with a quote from the Kernel#gets documentation which may explain why this is such an obscure feature:
The style of programming using $_ as an implicit parameter is gradually losing favor in the Ruby community.
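For comparison, here is a rough sketch of the same filtering written without $_, with the line named explicitly. The helper matching_lines and the sample input are assumptions for illustration only.

```ruby
require "stringio"

# Hypothetical helper: returns the lines of io that match pattern,
# passing each line explicitly instead of relying on $_.
def matching_lines(io, pattern)
  io.each_line.select { |line| line =~ pattern }
end

# StringIO stands in for standard input here.
input = StringIO.new("hello world\ntest\n")
matching_lines(input, /hello/).each { |line| puts line }
```

On the command line the equivalent explicit one-liner is the one from the question, with the match against $_ written out.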