
Ruby Command Line Implicit Conditional Check

Tags: shell, ruby

I ran the following from a bash shell:

echo 'hello world' | ruby -ne 'puts $_ if /hello/'

That was a typo, but to my surprise it printed hello world.

I meant to type:

echo 'hello world' | ruby -ne 'puts $_ if /hello/ === $_'

Can anyone explain, or point to documentation on, why we get this implicit comparison to $_?

I'd also like to note:

echo 'hello world' | ruby -ne 'puts $_ if /test/'

does not output anything.

Shawn asked Jun 03 '15 22:06


1 Answer

The Ruby parser has a special case for regular expression literals in conditionals. Normally (i.e. without using the -e, -n or -p command line options) this code:

if /foo/
  puts "TRUE!"
end

produces only a warning, because $_ is nil at that point and the match fails:

$ ruby regex-in-conditional1.rb
regex-in-conditional1.rb:1: warning: regex literal in condition

Assigning something that matches the regex to $_ first, like this:

$_ = 'foo'
if /foo/
  puts "TRUE!"
end

produces:

$ ruby regex-in-conditional2.rb
regex-in-conditional2.rb:2: warning: regex literal in condition
TRUE!

This is a (poorly documented) exception to the normal rules for Ruby conditionals, where anything that’s not false or nil evaluates as truthy.
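
To make the special case concrete, here is a minimal sketch that puts the implicit form next to an explicit equivalent. The method names implicit_match? and explicit_match? are hypothetical, chosen only for illustration. $_ is assigned inside the method because it is scoped to the current method frame, and Ruby may print the "regex literal in condition" warning when this runs from a file.

```ruby
# Hypothetical helper: relies on the parser's special case.
def implicit_match?(line)
  $_ = line          # $_ is local to this method's frame
  if /foo/           # bare regex literal in a condition: matched against $_
    true
  else
    false
  end
end

# Hypothetical helper: the same check written out explicitly.
def explicit_match?(line)
  !(/foo/ =~ line).nil?   # =~ returns the match index, or nil on no match
end

["foo bar", "baz"].each do |line|
  puts "#{line.inspect}: implicit=#{implicit_match?(line)} explicit=#{explicit_match?(line)}"
end
```

Both methods should agree for every input, which is the point: the bare literal is a match against $_, not a truthiness test of the Regexp object.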

This applies only to regex literals; the following behaves as you might expect for a conditional:

regex = /foo/
if regex
  puts "TRUE!"
end

output:

$ ruby regex-in-conditional3.rb
TRUE!
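
As a further check, the sketch below sets $_ to a string that does not match and still takes the truthy branch, because a Regexp held in a variable is just an ordinary object in a condition. The method name variable_condition is hypothetical.

```ruby
# Hypothetical helper: a Regexp stored in a variable ignores $_ entirely.
def variable_condition(line)
  $_ = line              # deliberately set, but never consulted below
  regex = /foo/
  regex ? "truthy" : "falsy"   # plain truthiness test, no match performed
end

puts variable_condition("bar")   # "bar" does not match /foo/
puts variable_condition("foo")
```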

This is handled in the parser. Searching the MRI code for the text of the warning produces a single match in parse.y:

case NODE_DREGX:
case NODE_DREGX_ONCE:
 warning_unless_e_option(parser, node, "regex literal in condition");
 return NEW_MATCH2(node, NEW_GVAR(rb_intern("$_")));

I don’t know Bison, so I can’t explain exactly what is going on here, but there are some clues you can deduce. The warning_unless_e_option function simply suppresses the warning if the -e option has been set, as this feature is discouraged in normal code but can be useful in expressions from the command line (this explains why you don’t see the warning in your code). The next line seems to construct a parse subtree that is a regular expression match between the regex and the $_ global variable, which contains “[t]he last input line of string by gets or readline”. These nodes are then compiled into the usual regular expression match method call.
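
Put together, the one-liner from the question behaves roughly like the loop below. This is a sketch, not what MRI literally generates: grep_lines is a hypothetical helper, and it takes the input and the pattern as explicit arguments instead of relying on $_. The -n option wraps the script in a while gets ... end loop, and the bare /hello/ becomes a match against each line read.

```ruby
require "stringio"

# Hypothetical helper: an explicit version of
#   ruby -ne 'puts $_ if /hello/'
# that collects matching lines instead of printing them.
def grep_lines(io, pattern)
  matches = []
  while (line = io.gets)                 # -n supplies this loop implicitly
    matches << line if pattern =~ line   # the bare literal expands to a match like this
  end
  matches
end

input = StringIO.new("hello world\ntest\n")
grep_lines(input, /hello/).each { |line| puts line }
```

Passing the line around explicitly costs a few more characters but keeps the data flow visible, which matters more in a script than in a throwaway one-liner.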

That shows what is happening. I’ll finish with a quote from the Kernel#gets documentation, which may explain why this is such an obscure feature:

The style of programming using $_ as an implicit parameter is gradually losing favor in the Ruby community.

matt answered Oct 18 '22 22:10