
What is the difference between =~ and match() when pattern matching?

I am using Ruby 1.9.3. I was playing with some patterns and found something interesting:

Example 1:

irb(main):001:0> /hay/ =~ 'haystack'
=> 0
irb(main):003:0> /st/ =~ 'haystack'
=> 3

Example 2:

irb(main):002:0> /hay/.match('haystack')
=> #<MatchData "hay">
irb(main):004:0> /st/.match('haystack')
=> #<MatchData "st">

=~ returns the index of the first match, whereas match returns a MatchData object describing the match. Other than that, is there any difference between =~ and match()?

Execution time difference (As per @Casper)

irb(main):005:0> quickbm(10000000) { "foobar" =~ /foo/ }
Rehearsal ------------------------------------
   8.530000   0.000000   8.530000 (  8.528367)
--------------------------- total: 8.530000sec

       user     system      total        real
   8.450000   0.000000   8.450000 (  8.451939)
=> nil

irb(main):006:0> quickbm(10000000) { "foobar".match(/foo/) }
Rehearsal ------------------------------------
  15.360000   0.000000  15.360000 ( 15.363360)
-------------------------- total: 15.360000sec

       user     system      total        real
  15.240000   0.010000  15.250000 ( 15.250471)
=> nil
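
quickbm is not part of Ruby's standard library and its definition is not shown here. Judging by the "Rehearsal" output it probably wraps Benchmark.bmbm, so the following is only a minimal sketch of a hypothetical stand-in, written under that assumption, for anyone who wants to repeat the timing:

```ruby
require 'benchmark'

# Hypothetical stand-in for quickbm (an assumption, not the original helper):
# runs the block `repetitions` times under Benchmark.bmbm, which prints a
# rehearsal pass followed by the real measurement.
def quickbm(repetitions, &block)
  Benchmark.bmbm do |bm|
    bm.report { repetitions.times(&block) }
  end
  nil # mirror the "=> nil" seen in the irb output above
end

quickbm(100_000) { "foobar" =~ /foo/ }
quickbm(100_000) { "foobar".match(/foo/) }
```

The absolute numbers depend on the machine and the Ruby version, so only the ratio between the two runs is meaningful.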
DoLoveSky asked Jan 15 '13 18:01


2 Answers

First make sure you're using the correct operator: =~ is correct, ~= is not.

The operator =~ returns the index of the first match (nil if no match) and stores the MatchData in the global variable $~. Named capture groups can be read from $~ by name and, when the Regexp is a literal on the left side of the operator, are also assigned to local variables with those names.

>> str = "Here is a string"
>> re = /(?<vowel>[aeiou])/    # Contains capture group named "vowel"
>> str =~ re
=> 1
>> $~
=> #<MatchData "e" vowel:"e">
>> $~[:vowel]    # Accessible using symbol...
=> "e"
>> $~["vowel"]    # ...or string
=> "e"
>> /(?<s_word>\ss\w*)/ =~ str
=> 9
>> s_word # This was assigned to a local variable
=> " string"
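
To make the "literal on the left side" condition concrete, here is a small sketch. The method names are hypothetical and exist only for illustration. It shows that the local variable appears only in the first form, while $~ is populated in all three:

```ruby
# Regexp literal on the left of =~: the named group becomes a local variable.
def literal_on_left(str)
  /(?<s_word>\ss\w*)/ =~ str
  s_word
end

# Same pattern held in a variable: the match succeeds and $~ is set,
# but no local variable named s_word is created.
def regexp_in_variable(str)
  re = /(?<s_word>\ss\w*)/
  re =~ str
  [$~[:s_word], defined?(s_word)]
end

# Literal on the right of =~ (String#=~): again no local variable.
def literal_on_right(str)
  str =~ /(?<s_word>\ss\w*)/
  [$~[:s_word], defined?(s_word)]
end

p literal_on_left("Here is a string")
p regexp_in_variable("Here is a string")
p literal_on_right("Here is a string")
```

In the last two methods defined?(s_word) evaluates to nil, which is how the sketch checks that no local variable was created.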

The method match returns the MatchData itself (again, nil if no match). Whether you call it on the string or on the Regexp, named capture groups are available by name on the returned MatchData; no local variables are created.

>> m = str.match re
=> #<MatchData "e" vowel:"e">
>> m[:vowel]
=> "e"
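
As a rough illustration of how the two results relate, the sketch below (the compare method is a hypothetical name) runs both forms on the same input. The index that =~ returns is the same value the MatchData reports through begin(0), and the MatchData additionally carries the matched text and its surroundings:

```ruby
# Hypothetical helper: collects what =~ and match report for the same input.
def compare(str, re)
  index = (str =~ re)   # Integer index, or nil
  m = str.match(re)     # MatchData, or nil
  {
    index: index,
    begin: m && m.begin(0),    # start offset of the whole match
    text:  m && m[0],          # the matched text itself
    pre:   m && m.pre_match,   # everything before the match
    post:  m && m.post_match   # everything after the match
  }
end

p compare("haystack", /st/)
p compare("haystack", /xyz/)
```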

See http://www.ruby-doc.org/core-1.9.3/Regexp.html (as well as the sections on MatchData and String) for more details.

Reinstate Monica -- notmaynard answered Oct 24 '22 07:10


When you have a method that does not modify state, all that matters is the return value. So what's the difference between red and blue, besides color? My point is that this is kind of a strange question, one which you seem to already know the answer to. (@sawa set me straight here)

But that said, both methods return nil (a falsy value) when the regex does not match, and both return a truthy value when it does. =~ returns the integer index of the first character of the match, which is truthy even when it is 0, because 0 is truthy in Ruby. match returns an object with very detailed match data, which is handy when you want lots of info about the match.

=~ is typically used in conditionals, when you only care if something matches:

do_stuff if "foobar" =~ /foo/
do_stuff if "foobar".match(/foo/) # same effect, but probably slower and harder to read
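
The point about 0 being truthy is easy to check. In this minimal sketch, matched? is a hypothetical helper that turns the result of =~ into a strict boolean:

```ruby
# Hypothetical helper: converts the result of =~ into true or false.
def matched?(str, re)
  (str =~ re) ? true : false
end

p matched?("haystack", /hay/)  # => true  (=~ returned index 0)
p matched?("haystack", /xyz/)  # => false (=~ returned nil)
```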

match is typically used when you want details about what was matched:

name = "name:bob".match(/^name:(\w+)$/)[1]
puts name #=> 'bob'
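
One caveat with indexing the result directly: match returns nil when nothing matches, so calling [1] on it raises NoMethodError. A minimal sketch of two ways to guard against that follows; the helper names are hypothetical:

```ruby
# Hypothetical helper: returns the captured name, or nil when the line
# does not match.
def extract_name(line)
  m = line.match(/^name:(\w+)$/)
  m ? m[1] : nil
end

# Shorthand with the same behavior, using String#[] with a regexp and a
# capture index.
def extract_name_short(line)
  line[/^name:(\w+)$/, 1]
end

p extract_name("name:bob")        # => "bob"
p extract_name("age:42")          # => nil
p extract_name_short("name:bob")  # => "bob"
```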
Alex Wayne answered Oct 24 '22 08:10