Why doesn't the following change the text for me in Android?
String content = "test\n=test=\ntest";
content = content.replaceAll("^=(.+)=$", "<size:large>$1</size:large>");
It returns the original value with no changes. I would expect it to replace the middle =test= with <size:large>test</size:large>
What am I missing here?
Edit: Okay, I understand why ^ and $ don't work. The point is that I need something that matches text both at the beginning and end of a line, e.g. a line that contains only "=some text=". Most of the answers given aren't sufficient, for the following reasons:
=(.+)=
doesn't have anything to do with line endings, so it matches any line with two = in it that are not side by side.
.*=(.+)=.*
matches the whole line, but has the same problem as the previous one.
\n=(.+)=\n
gets closer, but won't match two lines in a row (e.g. test\n=test=\n=test=\ntest). It also won't match an instance on the first or last line.
(?<=\n)=(.+)=(?=\n)
almost works, but again won't match an instance on the first or last line.
(?<!.)=(.+)=(?!.)
is the only one that seems to actually match every line that starts and ends with =, but $1 contains both the replacement and the original string.
content = content.replaceAll("(?<=(\n|^))=(.+)=(?=(\n|$))", "<size:large>$2</size:large>");
is the only answer that seems to actually do what it should.
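To make the "two lines in a row" problem concrete, here is a minimal sketch comparing the newline-consuming pattern with the lookaround one. The class and method names are made up for illustration.

```java
final class AdjacentLinesDemo {
    // The surrounding newlines are part of each match, so the newline that
    // ends one match is no longer available to start the next one.
    static String consumingNewlines(String input) {
        return input.replaceAll("\n=(.+)=\n", "\n<size:large>$1</size:large>\n");
    }

    // The lookarounds only assert that a newline is adjacent; the newline
    // itself stays outside the match and can serve two neighbouring lines.
    static String withLookarounds(String input) {
        return input.replaceAll("(?<=\n)=(.+)=(?=\n)", "<size:large>$1</size:large>");
    }

    public static void main(String[] args) {
        String input = "test\n=test=\n=test=\ntest";
        System.out.println(consumingNewlines(input));
        System.out.println("---");
        System.out.println(withLookarounds(input));
    }
}
```

With the consuming pattern only the first of the two adjacent lines is rewritten; with the lookarounds both are, but a match on the very first or last line of the input is still missed.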
The metacharacter “^” matches at the beginning of a string, i.e. at the position before its first character, without consuming any characters. For example, the expression “^\d” matches a string/line starting with a digit, and the expression “^[a-z]” matches a string/line starting with a lowercase letter.
The caret ^ and dollar $ characters have special meaning in a regexp. They are called “anchors”. The caret ^ matches at the beginning of the text, and the dollar $ – at the end. The pattern ^Mary means: “string start and then Mary”.
The position anchors ^ and $ match the beginning and the ending of the input string, respectively. That is, a regex wrapped in both anchors must match the entire input string, instead of a part of the input string (substring). \w+ matches 1 or more word characters (same as [a-zA-Z0-9_]+ ).
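As a rough illustration of why the anchored pattern in the question finds nothing, the sketch below counts matches with and without Pattern.MULTILINE. The helper name countMatches is hypothetical; Pattern, Matcher and the MULTILINE flag are the standard java.util.regex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorDefaultsDemo {
    // Counts how many times the regex is found in the input, compiled with
    // the given flags (0 means default behaviour).
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "test\n=test=\ntest";
        // By default ^ only matches at the start of the whole input.
        System.out.println(countMatches("^=", 0, input));
        // With MULTILINE, ^ also matches right after a line terminator.
        System.out.println(countMatches("^=", Pattern.MULTILINE, input));
    }
}
```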
Your original regex works fine if you turn on multiline mode, using (?m):
content = content.replaceAll("(?m)^=(.+)=$", "<size:large>$1</size:large>");
Now ^ and $ do indeed match at line boundaries.
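A small self-contained check of this, using the input from the question, could look like the following sketch. The class and method names are only for illustration.

```java
final class MultilineFlagDemo {
    // The pattern from the question, unchanged: no line in the middle of
    // the input can match because ^ and $ only see the input boundaries.
    static String withoutFlag(String content) {
        return content.replaceAll("^=(.+)=$", "<size:large>$1</size:large>");
    }

    // Same pattern with the embedded (?m) flag, so ^ and $ match per line.
    static String withFlag(String content) {
        return content.replaceAll("(?m)^=(.+)=$", "<size:large>$1</size:large>");
    }

    public static void main(String[] args) {
        String content = "test\n=test=\ntest";
        System.out.println(withoutFlag(content).equals(content)); // unchanged input
        System.out.println(withFlag(content));
    }
}
```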
The best way to deal with this is to set Pattern.MULTILINE. Using MULTILINE, ^ and $ will match at the boundaries of lines that are separated using only \n, and will similarly handle the beginning of input and the end of input.
Using String.replaceAll you need to set these within the pattern using an embedded flag expression, (?m) for MULTILINE:
content = content.replaceAll("(?m)^=(.+)=$", "<size:large>$1</size:large>");
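If you are not limited to String.replaceAll, the same flag can be passed to Pattern.compile instead of being embedded in the pattern. The sketch below assumes a precompiled pattern held in a hypothetical helper class; the names are illustrative.

```java
import java.util.regex.Pattern;

final class CompiledMultilineDemo {
    // Compiled once with the MULTILINE flag instead of the embedded (?m).
    private static final Pattern HEADING =
            Pattern.compile("^=(.+)=$", Pattern.MULTILINE);

    static String enlarge(String content) {
        return HEADING.matcher(content).replaceAll("<size:large>$1</size:large>");
    }

    public static void main(String[] args) {
        // Every line matches, including the first and the last one.
        System.out.println(enlarge("=test=\n=test=\n=test=\n=test="));
    }
}
```

Compiling once avoids recompiling the regex on every call, which String.replaceAll does internally.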
If you don't use MULTILINE, you need to use positive lookahead and lookbehind for the \n, and the regex gets complicated in order to match the first line, and the last line if there's no \n at the end, e.g. if our input is: =test=\n=test=\n=test=\n=test=.
String pattern = "(?<=(^|\n))=(.+)=(?=(\n|$))";
content = content.replaceAll(pattern, "<size:large>$2</size:large>");
In this pattern we're supplying options for the lookbehind: \n or beginning of input, (^|\n); and for the lookahead: \n or end of input, (\n|$). Notice that we need to use $2 as the captured group reference in the replacement, because the alternation in the lookbehind introduces a capturing group of its own, which becomes group 1.
We can make the pattern more complicated by introducing the alternatives in the lookahead/lookbehind in non-capturing groups, which look like (?:...):
String pattern = "(?<=(?:^|\n))=(.+)=(?=(?:\n|$))";
content = content.replaceAll(pattern, "<size:large>$1</size:large>");
Now we're back to using $1 as the captured group in the replacement.
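To check that the variants agree, the following sketch runs the capturing lookaround pattern, the non-capturing one, and the (?m) version on the same inputs. The class and method names are invented for this example.

```java
final class LookaroundEquivalenceDemo {
    private static final String REPLACEMENT_GROUP_1 = "<size:large>$1</size:large>";
    private static final String REPLACEMENT_GROUP_2 = "<size:large>$2</size:large>";

    // Capturing alternations: the text between the = signs is group 2.
    static String capturingLookarounds(String content) {
        return content.replaceAll("(?<=(^|\n))=(.+)=(?=(\n|$))", REPLACEMENT_GROUP_2);
    }

    // Non-capturing alternations: the text between the = signs is group 1.
    static String nonCapturingLookarounds(String content) {
        return content.replaceAll("(?<=(?:^|\n))=(.+)=(?=(?:\n|$))", REPLACEMENT_GROUP_1);
    }

    // Multiline anchors, for comparison.
    static String multilineAnchors(String content) {
        return content.replaceAll("(?m)^=(.+)=$", REPLACEMENT_GROUP_1);
    }

    public static void main(String[] args) {
        String[] inputs = { "=test=\n=test=\n=test=\n=test=", "test\n=test=\ntest" };
        for (String input : inputs) {
            String expected = multilineAnchors(input);
            System.out.println(expected.equals(capturingLookarounds(input))
                    && expected.equals(nonCapturingLookarounds(input)));
        }
    }
}
```

The comparison only covers \n-separated input like the examples above; the lookaround patterns look for \n literally, so other line terminators are not covered by this sketch.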