
Regular Expressions : Find mismatched point (or char index)

Tags: java, regex

I'm a beginner with regular expressions. Is there any way to find the point (character index) at which a string stops matching when validating it against a regular expression? I'm using regular expressions in Java to validate strings.
I only need the index of the first mismatch.
Update
Please consider an example like this:
Regular expression: ^\d{9}[VX]$
Accepted string: 547812375X
Wrong string: 547A12375X

In the wrong string there is an A instead of an 8. What I need is to find the mismatched position, which here is 4 (counting from 1; index 3 when counting from 0). The character at that position is the one that does not match the regex.

asked Oct 16 '11 11:10 by SachiraChin



1 Answer

I think this code might do what you want:

package so7783938;

import static org.junit.Assert.assertEquals;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

public class RegexFailureTest {

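  /**
   * Returns the zero-based index of the first character that prevents
   * {@code str} from matching {@code regex}, -1 if the whole string matches,
   * or {@code str.length()} if the string is only a valid but incomplete prefix.
   */
  public static int firstFailurePoint(Pattern regex, String str) {
    // Probe ever longer prefixes of the input.
    for (int i = 0; i <= str.length(); i++) {
      Matcher m = regex.matcher(str.substring(0, i));
      // hitEnd() is false when the matcher gave up before running out of
      // input, so appending more characters could not turn this prefix
      // into a match: its last character (index i - 1) is the culprit.
      if (!m.matches() && !m.hitEnd()) {
        return i - 1;
      }
    }
    // Every prefix was still viable: either a full match or too short.
    if (regex.matcher(str).matches()) {
      return -1;
    } else {
      return str.length();
    }
  }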

  @Test
  public void testSimple() {
    Pattern abc = Pattern.compile("abc");
    assertEquals(0, firstFailurePoint(abc, ""));
    assertEquals(1, firstFailurePoint(abc, "a"));
    assertEquals(2, firstFailurePoint(abc, "ab"));
    assertEquals(-1, firstFailurePoint(abc, "abc"));
    assertEquals(3, firstFailurePoint(abc, "abcd"));
    assertEquals(3, firstFailurePoint(abc, "abcdefghi"));
    assertEquals(1, firstFailurePoint(abc, "aaa"));
    assertEquals(2, firstFailurePoint(abc, "abb"));
  }

  @Test
  public void testAlternative() {
    Pattern regex = Pattern.compile("hello|world");
    assertEquals(0, firstFailurePoint(regex, "x"));
    assertEquals(-1, firstFailurePoint(regex, "hello"));
    assertEquals(-1, firstFailurePoint(regex, "world"));
    assertEquals(3, firstFailurePoint(regex, "hel"));
    assertEquals(5, firstFailurePoint(regex, "hello kitty"));
    assertEquals(3, firstFailurePoint(regex, "help me"));
    assertEquals(3, firstFailurePoint(regex, "worse is better"));
  }

  @Test
  public void testExample() {
    Pattern regex = Pattern.compile("^\\d{9}[VX]$");
    assertEquals(-1, firstFailurePoint(regex, "547812375X"));
    assertEquals(3, firstFailurePoint(regex, "547A12375X"));
  }

}
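
As a usage sketch outside of JUnit, the same prefix-probing idea can be wrapped in a small standalone program. The class name MismatchDemo and the helper describe are hypothetical names chosen for illustration; only the logic of firstFailurePoint is taken from the code above. It assumes that Matcher.hitEnd() behaves for your pattern as it does in the tests above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the firstFailurePoint logic comes from the answer.
final class MismatchDemo {

  // Same prefix-probing logic as firstFailurePoint in the answer.
  static int firstFailurePoint(Pattern regex, String str) {
    for (int i = 0; i <= str.length(); i++) {
      Matcher m = regex.matcher(str.substring(0, i));
      // Not a match, and more input could not have helped: index i - 1 is wrong.
      if (!m.matches() && !m.hitEnd()) {
        return i - 1;
      }
    }
    return regex.matcher(str).matches() ? -1 : str.length();
  }

  // Hypothetical helper: turns the index into a readable message.
  static String describe(Pattern regex, String input) {
    int index = firstFailurePoint(regex, input);
    if (index < 0) {
      return input + " : valid";
    }
    if (index == input.length()) {
      return input + " : incomplete, more input expected at index " + index;
    }
    return input + " : unexpected '" + input.charAt(index) + "' at index " + index;
  }

  public static void main(String[] args) {
    Pattern regex = Pattern.compile("^\\d{9}[VX]$");
    System.out.println(describe(regex, "547812375X")); // reported as valid
    System.out.println(describe(regex, "547A12375X")); // 'A' at index 3
    System.out.println(describe(regex, "54781"));      // too short, index 5
  }
}
```

Note that the returned index is zero-based, so the A in 547A12375X is reported at index 3, which is the fourth character. A return value equal to the string length means that nothing seen so far was wrong, but the input ended too early.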
answered Sep 20 '22 13:09 by Roland Illig