I've come across this HackerRank problem in which a regex should match the string between HTML tags. The regex and the string are:
String str="<h1>Hello World!</h1>";
String regex="<(.+)>([^<]+)</\\1>";
Also, what if 'str' has more than one HTML tag, like String str="<h1><h1>Hello World!</h1></h1>"? How does ([^<]+) match this 'str'?
My question is why ([^<]+) matches the 'str' while ([a-zA-Z]+) does not.
Here is the full source code:
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/* Solution assumes we can't have the symbol "<" as text between tags */
public class Solution {
    public static void main(String[] args) {
        // Group 1 is the tag, group 2 the text content, \1 must repeat group 1 in the closing tag
        String regex = "<(.+)>([^<]+)</\\1>";
        Pattern r = Pattern.compile(regex); // compile once, outside the loop
        Scanner scan = new Scanner(System.in);
        int testCases = Integer.parseInt(scan.nextLine());
        while (testCases-- > 0) {
            String line = scan.nextLine();
            boolean matchFound = false;
            Matcher m = r.matcher(line);
            // Print the content of every tag pair found on this line
            while (m.find()) {
                System.out.println(m.group(2));
                matchFound = true;
            }
            if (!matchFound) {
                System.out.println("None");
            }
        }
    }
}
Don't mind if I'm stupid to ask this question and thank you in advance!
This regex guarantees that each match spans exactly one pair of tags, assuming well-formed HTML input.
The initial <(.+)> captures the name of your tag. The capture group will also get any attributes it can. Since + is a greedy quantifier, it will capture multiple tags if it can.
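To see what the greedy group actually grabs, here is a minimal sketch that runs only the opening part, <(.+)>, against a few inputs. The class and method names (GreedyTagDemo, firstCapture) are made up for illustration and are not part of the original solution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyTagDemo {
    // Returns what group 1 of <(.+)> captures on the first match, or null if there is no match.
    static String firstCapture(String input) {
        Matcher m = Pattern.compile("<(.+)>").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstCapture("<h1>"));                // h1
        System.out.println(firstCapture("<h1 attr=\"value\">")); // h1 attr="value"
        System.out.println(firstCapture("<h1><h2>"));            // h1><h2
    }
}
```

In the last case the greedy .+ runs to the final > on the line, so the group swallows both opening tags.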
The trailing </\\1> matches against whatever the first group captured. That's why, if your HTML is well formed, the expression won't capture multiple tags or tags with attributes:
<h1>, closing tag </h1> ✓
<h1 attr="value">, closing tag </h1>, but expecting </h1 attr="value">
<h1><h2>, closing tag </h2></h1>, but expecting </h1><h2>

That's why the tag can be matched with .+ rather safely, while the contents must be matched with [^<]+. You want to make sure you don't grab any stray tags in the content, but any other character at all is allowed. [^<]+ (pronounced "not <, at least once") allows things like ! and spaces, while [a-zA-Z]+ certainly would not.
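A quick way to check this is to run both variants side by side. The sketch below is only an illustration: the class name ContentClassDemo, the helper contents, and the LETTERS pattern (the ([a-zA-Z]+) alternative from the question) are assumptions, not part of the HackerRank solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentClassDemo {
    // Pattern from the question: content is anything except '<'
    static final String NOT_LT = "<(.+)>([^<]+)</\\1>";
    // Hypothetical variant: content restricted to ASCII letters
    static final String LETTERS = "<(.+)>([a-zA-Z]+)</\\1>";

    // Collects group 2 (the tag content) of every match found in the input.
    static List<String> contents(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(contents(NOT_LT, "<h1>Hello World!</h1>"));          // [Hello World!]
        System.out.println(contents(LETTERS, "<h1>Hello World!</h1>"));         // []
        System.out.println(contents(LETTERS, "<h1>Hello</h1>"));                // [Hello]
        System.out.println(contents(NOT_LT, "<h1><h1>Hello World!</h1></h1>")); // [Hello World!]
    }
}
```

The letters-only variant fails on "Hello World!" because of the space and the exclamation mark. For the nested input, no match can start at the outer <h1>, since the backreference never lines up with the closing tags; find() then moves on and matches the inner <h1>Hello World!</h1> pair.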