I have a string with markdown syntax in it, and I want to find the markdown syntax for headings, i.e. h1 = #, h2 = ##, and so on.
I know that a heading always sits at the start of a line, and that there can only be one heading per line. So for example, "###This is a heading" would match my h3 pattern, but not my h2 or h1 patterns. This is my code so far:
h1 = Pattern.compile("(?<!\\#)^\\#(\\b)*");
h2 = Pattern.compile("(?<!\\#)^\\#{2}(\\b)*");
h3 = Pattern.compile("(?<!\\#)^\\#{3}(\\b)*");
h4 = Pattern.compile("(?<!\\#)^\\#{4}(\\b)*");
h5 = Pattern.compile("(?<!\\#)^\\#{5}(\\b)*");
h6 = Pattern.compile("(?<!\\#)^\\#{6}(\\b)*");
Whenever I use \\#, my IDE (IntelliJ) tells me: "Redundant character escape". As far as I know, # should not be a special character in regex, so escaping it with two backslashes should still allow me to use it.
When I find a match, I want to surround the entire match with bold HTML tags, like this: "<b>###Heading</b>", but for some reason it's not working:
//check for heading 6
Matcher match = h6.matcher(tmp);
StringBuffer sb = new StringBuffer();
while (match.find()) {
match.appendReplacement(sb, "<b>" + match.group(0) + "</b>");
}
match.appendTail(sb);
tmp = sb.toString();
EDIT
So I have to look at each heading separately; I can't look at headings 1-6 in the same pattern (this has to do with other parts of my program that use the same pattern). What I know so far:
There is no need to escape # since it is not a special regex metacharacter. Also, the ^ is the string start anchor, so all the lookbehinds in your patterns are redundant as they always return true (since there is no character before the beginning of a string).
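Both points can be checked with a few lines. This is only a minimal sketch: the class name EscapeCheck, the helper found, and the sample strings are made up for illustration, and it assumes the COMMENTS flag is not set.

```java
import java.util.regex.Pattern;

class EscapeCheck {
    // True if the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "intro\n###Heading";
        // Escaped and unescaped # behave the same here.
        System.out.println(found("\\#{3}", text)); // true
        System.out.println(found("#{3}", text));   // true
        // Without (?m), ^ only matches at the very start of the input,
        // so a heading on the second line is not found.
        System.out.println(found("^#{3}", text));     // false
        System.out.println(found("(?m)^#{3}", text)); // true
        // The lookbehind adds nothing at the start of the input.
        System.out.println(found("(?<!#)^#{3}", "###Heading")); // true
    }
}
```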
You seem to want to match a specified number of # before a word char. Use
String s = "###### Heading6 Something here\r\n" +
"###### More text \r\n" +
"###Heading 3 text";
Matcher m = Pattern.compile("(?m)^#{6}(?!#)(.*)").matcher(s);
String result = m.replaceAll("<b>$1</b>");
System.out.println(result);
Result:
<b> Heading6 Something here</b>
<b> More text </b>
###Heading 3 text
Details:
(?m) - now, ^ matches the start of a line
^ - start of a line
#{6}(?!#) - exactly 6 # symbols (not followed by a 7th)
(.*) - Group 1: 0+ chars other than a line break, up to the line end.
Thus, your regex definitions will look like
h1 = Pattern.compile("(?m)^#(?!#)(.*)");
h2 = Pattern.compile("(?m)^#{2}(?!#)(.*)");
h3 = Pattern.compile("(?m)^#{3}(?!#)(.*)");
h4 = Pattern.compile("(?m)^#{4}(?!#)(.*)");
h5 = Pattern.compile("(?m)^#{5}(?!#)(.*)");
h6 = Pattern.compile("(?m)^#{6}(?!#)(.*)");
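Since the six definitions differ only in the repeat count, they could also be generated from the heading level. The following is only a sketch under that assumption: headingPattern and boldHeadings are hypothetical helper names, and it uses $0 (the whole match) instead of $1 so that the # characters stay inside the tags, which is what the question asks for.

```java
import java.util.regex.Pattern;

class HeadingPatterns {
    // Builds the pattern for one heading level (1-6): exactly `level`
    // '#' characters at the start of a line, then the rest of the line.
    static Pattern headingPattern(int level) {
        return Pattern.compile("(?m)^#{" + level + "}(?!#).*");
    }

    // Wraps every heading line of the given level in <b>...</b>.
    // $0 refers to the whole match, so the '#' characters are kept.
    static String boldHeadings(String text, int level) {
        return headingPattern(level).matcher(text).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        String s = "###### Heading6\n###Heading 3 text";
        System.out.println(boldHeadings(s, 3));
        // ###### Heading6
        // <b>###Heading 3 text</b>
    }
}
```

Each level still gets its own Pattern object, so the levels can be handled separately as the EDIT in the question requires.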
You can try this:
^(#{1,6}\s*[\S]+)
Since you mentioned that a heading only appears at the start of a line, you don't need the lookbehind.
UPDATE: If you want to bold the full line that starts with a heading, you can try this:
^(#{1,6}.*)
And replace by:
<b>$1</b>
Sample Java source:
final String regex = "^(#{1,6}\\s*[\\S]+)";
final String string = "#heading 1 \n"
+ "bla bla bla\n"
+ "### heading 3 djdjdj\n"
+ "bla bla bla\n"
+ "## heading 2 bal;kasddfas\n"
+ "fbla bla bla";
final String subst = "<b>$1</b>";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
final String result = matcher.replaceAll(subst);
System.out.println(result);
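To see how the first pattern and the updated one differ, here is a small side-by-side sketch. The class name HeadingBold and the helper bold are made up for illustration, and the input is a shortened version of the sample string.

```java
import java.util.regex.Pattern;

class HeadingBold {
    // Wraps group 1 of each match (applied per line) in <b>...</b>.
    static String bold(String regex, String text) {
        return Pattern.compile(regex, Pattern.MULTILINE)
                      .matcher(text)
                      .replaceAll("<b>$1</b>");
    }

    public static void main(String[] args) {
        String text = "### heading 3 djdjdj\nbla bla bla";

        // First pattern: the # run, optional whitespace, and the first word.
        System.out.println(bold("^(#{1,6}\\s*[\\S]+)", text));
        // <b>### heading</b> 3 djdjdj
        // bla bla bla

        // Updated pattern: the whole heading line.
        System.out.println(bold("^(#{1,6}.*)", text));
        // <b>### heading 3 djdjdj</b>
        // bla bla bla
    }
}
```

Note that #{1,6} accepts any level from 1 to 6, so neither pattern tells the heading levels apart.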