I need to check for simple beginning and ending HTML tags in a string. I haven't had any problems with the beginning tag; it is when I try to find the ending tag that I run into issues.
private Pattern pattern;
private Matcher matcher;
private Pattern endPattern;
private Matcher endMatcher;
private static final String HTML_TAG_PATTERN = "<([a-zA-Z]+)>";
public boolean hasCorrectHTML(String checking)
{
    boolean ret = true;
    pattern = Pattern.compile(HTML_TAG_PATTERN);
    matcher = pattern.matcher(checking);
    while (matcher.find() && ret)
    {
        String htmlEndTag = "</" + matcher.group(1) + ">";
        endPattern = Pattern.compile(htmlEndTag);
        endMatcher = endPattern.matcher(checking.substring(matcher.end()));
        ret = endMatcher.matches();
    }
    return ret;
}
In the above code I find the first beginning tag and then go on to look for its ending tag. I know there are going to be some future issues with this setup; it is a work in progress. However, the check for the ending tag does not work. As far as I can see my logic is sound: I take whatever tag was matched and build its end tag from it, compile that into a second pattern, and then check for a match using the second matcher.
My text string is "<b>this test</b>". It detects <b> just fine, but when I check for a match on </b> it always returns false. I've asked peers why this might be happening, but they are at a loss too. What am I missing here?
Okay, so this was answered by JB Nizet: in place of endMatcher.matches() I should use endMatcher.find(), because matches() checks whether the whole string matches the regex, whereas find() looks for a portion of the string that matches it.
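For anyone who lands here later, this is a minimal sketch of the difference, assuming the same simple tag pattern as in the question. The class name EndTagCheck and the use of Pattern.quote are my own additions for illustration, not part of the original code, and the check still ignores nesting order and attributes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EndTagCheck {
    // Same simple opening-tag pattern as in the question.
    private static final Pattern OPEN_TAG = Pattern.compile("<([a-zA-Z]+)>");

    // Returns true when every simple opening tag has its closing tag
    // somewhere later in the string. Nesting order is not verified.
    static boolean hasCorrectHTML(String checking) {
        Matcher matcher = OPEN_TAG.matcher(checking);
        while (matcher.find()) {
            // Pattern.quote (an addition, not in the question) treats the tag as literal text.
            Pattern endPattern = Pattern.compile(Pattern.quote("</" + matcher.group(1) + ">"));
            String rest = checking.substring(matcher.end());
            // find() succeeds if the closing tag appears anywhere in the remainder.
            if (!endPattern.matcher(rest).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String rest = "this test</b>"; // what is left after the opening <b>
        Pattern end = Pattern.compile("</b>");
        System.out.println(end.matcher(rest).matches()); // false: the whole string is not "</b>"
        System.out.println(end.matcher(rest).find());    // true: "</b>" occurs inside the string
        System.out.println(hasCorrectHTML("<b>this test</b>")); // true
        System.out.println(hasCorrectHTML("<b>this test"));     // false
    }
}
```

The two matcher calls in main use separate Matcher objects so that the result of matches() cannot influence where find() starts searching.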
I didn't quite understand your question, and I don't know whether this solves your problem. If it doesn't, please give me some examples so I can understand your question better.
private Pattern pattern;
private Matcher matcher;
private Pattern endPattern;
private Matcher endMatcher;
private static final String HTML_TAG_PATTERN = "<([a-zA-Z]+)>[^<]*";
public boolean hasCorrectHTML(String checking)
{
    boolean ret = true;
    pattern = Pattern.compile(HTML_TAG_PATTERN);
    matcher = pattern.matcher(checking);
    while (matcher.find() && ret)
    {
        String htmlEndTag = "</" + matcher.group(1) + ">";
        endPattern = Pattern.compile(htmlEndTag);
        String endChecking = checking.substring(matcher.end());
        endMatcher = endPattern.matcher(endChecking);
        ret = endMatcher.matches();
    }
    return ret;
}
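As a rough illustration of why this version can keep matches(): the extra [^<]* in the pattern consumes the text after the opening tag, so the remainder passed to the second matcher starts at the next '<'. The sketch below only prints that remainder; the class TrailingTextDemo and the helper remainderAfterFirstTag are hypothetical names introduced for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TrailingTextDemo {
    // Same pattern as above: an opening tag plus any text up to the next '<'.
    private static final Pattern TAG_AND_TEXT = Pattern.compile("<([a-zA-Z]+)>[^<]*");

    // Hypothetical helper: returns what is left of the input after the
    // first tag-plus-text match, or null when no opening tag is found.
    static String remainderAfterFirstTag(String checking) {
        Matcher matcher = TAG_AND_TEXT.matcher(checking);
        return matcher.find() ? checking.substring(matcher.end()) : null;
    }

    public static void main(String[] args) {
        // The text "this test" is consumed, so only the end tag is left.
        System.out.println(remainderAfterFirstTag("<b>this test</b>")); // </b>
        // With a nested tag the remainder is more than just "</b>".
        System.out.println(remainderAfterFirstTag("<b><i>x</i></b>")); // <i>x</i></b>
    }
}
```

In the first case the remainder is exactly the end tag, so matches() returns true. In the nested case the remainder is longer than the end tag, so matches() on "</b>" returns false and the method would reject the string; that may or may not be what you want.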