Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Check if string is a prefix of a Javascript RegExp

In Javascript I have defined a regular expression and now a user is typing in a string. I want to tell him if his string still could match the RegExp if he continues typing or if he's already on the wrong way. For instance:

var re = /a*b/;

"a".isPrefixOf( re ); // true
"x".isPrefixOf( re ); // false

How could an implementation of isPrefixOf look like?

Update: Thanks for your answers, making the regex prefix-proof, as suggested by brad, seems to be a good workaround. But I'm still trying to find a general solution.

Maybe this way: We create a new regex with the user input followed by .*. This regex describes all words that the user still may enter. If the intersection of this created regex and the original regex is empty then the user is already on the wrong way. If it's not, he's doing fine. For instance:

var re = /a*b/;
var sInput = "a";
var reInput = new RegExp( sInput + ".*" );

reIntersection = re.intersect( reInput );
reIntersection.isEmpty(); // false

intersect() returns a new regex that accepts only word which both re and reInput would accept. The function doesn't exist yet but we can implement it using look-ahead:

RegExp.prototype.intersect = function( pattern2 ) { 
    return new RegExp( '(?=' + this.source  + ')' + pattern2.source );
}

What remains open is the isEmpty() function. How could we check, if a Javascript regex matches any word or if it's empty?

like image 827
Georg Jähnig Avatar asked Jan 06 '09 13:01

Georg Jähnig


2 Answers

I think your best bet here is to make your Regex prefix-proof. For the example you gave, /a*b/, I think you could probably use /a*b?/.test(userinput). For more complex patterns this could get increasingly difficult, but I still think it can be done by nesting each subexpression in a series of optional quantifiers (?). For instance:

/a*bcd*e/

The prefix regex could be:

/a*(b(c(d*e?)?)?)?/

Its a little messy, but will solve your problem rather well I think.

like image 179
brad Avatar answered Oct 24 '22 12:10

brad


People seem to be splitting evenly on how they interpret this question, so I'll demonstrate the concept with a Java example.

import java.util.regex.*;

public class Test
{

  public static void main(String[] args) throws Exception
  {
    tryMatch("^a*b+$", "a", "ab", "abc");
  }

  public static void tryMatch(String regex, String... targets)
  {
    Pattern p = Pattern.compile(regex);
    Matcher m = p.matcher("");
    System.out.printf("%nregex: %s%n", regex);
    System.out.printf("target | matches() | hitEnd()%n");
    for (String str : targets)
    {
      m.reset(str);
      System.out.printf("%-6s | %-9B | %-9B%n",
          str, m.matches(), m.hitEnd());
    }
  }
}

output:

regex: ^a*b+$
target | matches() | hitEnd()
a      | FALSE     | TRUE
ab     | TRUE      | TRUE
abc    | FALSE     | FALSE

Target string "a" doesn't match because the regex requires at least one b, but it could be the prefix of a successful match, so hitEnd() returns true. String "ab" has all that's required for a match, but it would also match if we added more b's to the end, so hitEnd() still returns true. With "abc" the match attempt fails before it reaches the end of the target string, so the regex couldn't match any string that starts with "abc".

As far as I know, Javascript doesn't have anything like Java's hitEnd() method, but it might be possible to fake it. If anyone knows how, it'll be that Flagrant Badass, Steven Levithan.

like image 21
Alan Moore Avatar answered Oct 24 '22 13:10

Alan Moore