
Java matcher.matches() returning false when it should be true

I need to check a string for simple opening and closing HTML tags. Finding the opening tag works fine; the problem comes when I try to find the closing tag.

private Pattern pattern;
private Matcher matcher;
private Pattern endPattern;
private Matcher endMatcher;

private static final String HTML_TAG_PATTERN = "<([a-zA-Z]+)>";
public boolean hasCorrectHTML(String checking)
{
    boolean ret=true;
    pattern=Pattern.compile(HTML_TAG_PATTERN);
    matcher=pattern.matcher(checking);

    while(matcher.find() && ret)
    {
        String htmlEndTag="</"+matcher.group(1)+">";

        endPattern=Pattern.compile(htmlEndTag);
        endMatcher=endPattern.matcher(checking.substring(matcher.end()));

        ret=endMatcher.matches();
    }

    return ret;
}

In the code above I find the first opening tag and then look for its closing tag. I know this setup will run into other issues later; it is a work in progress. However, the check for the closing tag does not work. As far as I can see, my logic is sound: I take the tag name from matcher.group(1), build the closing tag from it, compile that into a second pattern, and check for a match with the second matcher. My test string is "<b>this test</b>". It detects <b> just fine, but the check for </b> always returns false. I've asked peers why this might happen, but they are at a loss too. What am I missing here?

Devvy asked May 13 '15 16:05


2 Answers

Okay, so this was answered by JB Nizet: in place of endMatcher.matches() I should use endMatcher.find(), because .matches() checks whether the whole string matches the regex, whereas .find() looks for any portion of the string that matches the regex.
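
To make the difference concrete, here is a minimal sketch using the test string from the question. The class and method names (MatchesVsFind, wholeInputMatches, containsMatch) are made up for illustration; only Pattern, Matcher, matches() and find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original code.
class MatchesVsFind {
    // matches(): true only if the ENTIRE input matches the regex.
    static boolean wholeInputMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches();
    }

    // find(): true if ANY part of the input matches the regex.
    static boolean containsMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        String endTag = "</b>";
        // What checking.substring(matcher.end()) yields after "<b>" is found
        // in "<b>this test</b>".
        String rest = "this test</b>";

        System.out.println(wholeInputMatches(endTag, rest));   // false
        System.out.println(containsMatch(endTag, rest));       // true
        System.out.println(wholeInputMatches(endTag, "</b>")); // true
    }
}
```

The substring passed to the second matcher still contains "this test" in front of the closing tag, which is why matches() fails while find() succeeds.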

Devvy answered Nov 14 '22 12:11


I didn't quite understand your question, so I'm not sure this solves your problem. If it doesn't, please give me some examples so I can understand what you need.

private Pattern pattern;
private Matcher matcher;
private Pattern endPattern;
private Matcher endMatcher;

// [^<]* also consumes the text after the opening tag, up to the next '<'
private static final String HTML_TAG_PATTERN = "<([a-zA-Z]+)>[^<]*";
public boolean hasCorrectHTML(String checking)
{
    boolean ret=true;
    pattern=Pattern.compile(HTML_TAG_PATTERN);
    matcher=pattern.matcher(checking);

    while(matcher.find() && ret)
    {
        String htmlEndTag="</"+matcher.group(1)+">";

        endPattern=Pattern.compile(htmlEndTag);

        String endChecking = checking.substring(matcher.end());
        endMatcher=endPattern.matcher(endChecking);

        ret=endMatcher.matches();
    }

    return ret;
}
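
As a rough illustration of why this variant can pass with matches(), the sketch below prints what is left of the input after the modified pattern has matched. The class and method names (ConsumeTextDemo, remainderAfterOpenTag) are hypothetical, and the nested-tag input is an assumed extra test case that is not in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class ConsumeTextDemo {
    // Same regex as HTML_TAG_PATTERN above: opening tag plus the text up to the next '<'.
    static final Pattern OPEN_TAG_AND_TEXT = Pattern.compile("<([a-zA-Z]+)>[^<]*");

    // Returns the part of the input after the first match, or null if there is no match.
    static String remainderAfterOpenTag(String input) {
        Matcher m = OPEN_TAG_AND_TEXT.matcher(input);
        return m.find() ? input.substring(m.end()) : null;
    }

    public static void main(String[] args) {
        // The remainder is exactly the closing tag, so matches() on "</b>" succeeds.
        System.out.println(remainderAfterOpenTag("<b>this test</b>"));        // </b>
        // With a nested tag the remainder starts at "<i>", so matches() on "</b>" fails.
        System.out.println(remainderAfterOpenTag("<b>this <i>test</i></b>")); // <i>test</i></b>
    }
}
```

So this approach appears to work only when the closing tag is the very next thing after the text; for nested tags or trailing content, find() is probably the safer choice.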

JIE WANG answered Nov 14 '22 12:11