
Wildcard string matching in Ruby

I'd like to write a utility function/module that provides simple wildcard/glob matching for strings. I'm not using regular expressions directly because the patterns will be supplied by users through a configuration file. I could not find a stable gem for this; I tried joker, but had problems setting it up.

The functionality I'm looking for is simple. For example, given the following patterns, here are the matches:

pattern | test-string         | match
========|=====================|====================
*hn     | john, johnny, hanna | true , false, false     # wildcard  , similar to /hn$/i
*hn*    | john, johnny, hanna | true , true , false     # like /hn/i
hn      | john, johnny, hanna | false, false, false     # /^hn$/i
*h*n*   | john, johnny, hanna | true , true , true
etc...

I'd like this to be as efficient as possible. I thought about creating regexes from the pattern strings, but that seemed rather inefficient to do at runtime. Any suggestions on this implementation? Thanks.

EDIT: I'm using Ruby 1.8.7

sa125 asked Jun 23 '11 04:06


1 Answer

I don't see why you think it would be inefficient. Predictions about these sorts of things are notoriously unreliable; you should establish that it is actually too slow before bending over backwards to find a faster way. And then you should profile it to make sure that this is where the problem lies. (By the way, switching to 1.9 gives an average speed boost of 3-4x.)
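One rough way to check the efficiency worry is to time both approaches with Ruby's standard Benchmark module. The sketch below is only an illustration: `glob_to_regex`, the sample names, and the iteration count are assumptions made for the example, not part of the original post, and the timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical helper (not from the original post): turn a glob into a Regexp.
def glob_to_regex(glob)
  escaped = Regexp.escape(glob).gsub('\*', '.*?')
  Regexp.new("\\A#{escaped}\\z", Regexp::IGNORECASE)
end

names = %w[john johnny hanna]
iterations = 20_000 # arbitrary count, chosen only to make the difference visible

# Case 1: rebuild the Regexp for every single match.
rebuild = Benchmark.realtime do
  iterations.times { names.each { |n| glob_to_regex('*h*n*') =~ n } }
end

# Case 2: build the Regexp once and reuse it.
compiled = glob_to_regex('*h*n*')
reuse = Benchmark.realtime do
  iterations.times { names.each { |n| compiled =~ n } }
end

puts format('rebuild each time: %.4fs', rebuild)
puts format('compile once:      %.4fs', reuse)
```

If the second number is acceptable for the real workload, compiling each pattern once when the configuration is loaded is probably all the optimization that is needed.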

Anyway, it should be pretty easy to do this, something like:

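class Globber
  # Build a case-insensitive Regexp in which '*' matches any run of characters.
  def self.parse_to_regex(str)
    escaped = Regexp.escape(str).gsub('\*','.*?')
    # \A and \z anchor to the whole string; ^ and $ would only anchor to a line.
    Regexp.new "\\A#{escaped}\\z", Regexp::IGNORECASE
  end

  def initialize(str)
    @regex = self.class.parse_to_regex str
  end

  # True if str matches the glob, false otherwise.
  def =~(str)
    !!(str =~ @regex)
  end
end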


glob_strs = {
  '*hn'    => [['john', true ], ['johnny', false], ['hanna', false]],
  '*hn*'   => [['john', true ], ['johnny', true ], ['hanna', false]],
  'hn'     => [['john', false], ['johnny', false], ['hanna', false]],
  '*h*n*'  => [['john', true ], ['johnny', true ], ['hanna', true ]],
}

puts glob_strs.all? { |to_glob, examples|
  examples.all? do |to_match, expectation|
    result = Globber.new(to_glob) =~ to_match
    result == expectation
  end
}
# >> true
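Since the patterns come from a configuration file, one option is to compile each of them once at load time and reuse the compiled regexes for every lookup. The following is a minimal sketch under that assumption; `compile_glob`, `COMPILED`, `matches_any?`, and the sample patterns are hypothetical names introduced here for illustration, and the array stands in for whatever the real configuration file provides.

```ruby
# Hypothetical helper: compile one glob into an anchored, case-insensitive Regexp.
def compile_glob(glob)
  escaped = Regexp.escape(glob).gsub('\*', '.*?')
  Regexp.new("\\A#{escaped}\\z", Regexp::IGNORECASE)
end

# Stand-in for patterns read from the user's configuration file.
config_patterns = ['*hn', 'ha*']

# Compile once at load time...
COMPILED = config_patterns.map { |glob| compile_glob(glob) }

# ...then reuse the compiled patterns for every lookup.
def matches_any?(str, regexes = COMPILED)
  regexes.any? { |re| re =~ str }
end

puts matches_any?('John')    # matched by '*hn'
puts matches_any?('hanna')   # matched by 'ha*'
puts matches_any?('johnny')  # matched by neither pattern
```

With this arrangement the cost of building each regex is paid once per pattern rather than once per match.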
Joshua Cheek answered Oct 05 '22 04:10