Suppose I have a regex with a capturing group, e.g. foo(_+f). I match this against a string and want to replace the first capturing group in all matches with baz, so that
foo___f blah foo________f
is converted to:
foobaz blah foobaz
There doesn't appear to be any easy way to do this using the standard libraries. If I use Matcher.replaceAll() this will replace all matches of the entire pattern and convert the string to
baz blah baz
Obviously I can just iterate through the matches, store the start and end index of each capturing group, then go back and replace them, but is there an easier way?
Thanks, Don
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o" and "g".
Non-capturing groups are important constructs within Java Regular Expressions. They create a sub-pattern that functions as a single unit but does not save the matched character sequence.
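To make the difference concrete, here is a small sketch (the class and method names are made up for illustration) showing that a (?:...) group still works as a unit for alternation but is not numbered, so the following capturing group becomes group 1:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonCapturingDemo {
    // Hypothetical helper: number of capturing groups declared in a regex.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Hypothetical helper: text of group 1 if the whole input matches, else null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(groupCount("(foo|bar)(_+f)"));          // prints 2
        System.out.println(groupCount("(?:foo|bar)(_+f)"));        // prints 1
        System.out.println(firstGroup("(?:foo|bar)(_+f)", "bar__f")); // prints __f
    }
}
```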
A group lets several sub-patterns be treated as a whole, and a capturing group additionally records the substring it matched when the pattern is applied to a string. Backreferences refer to a previously captured group in the same regular expression.
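As a rough illustration of both ideas, the sketch below (names are invented for the example) uses \1 inside the pattern as a backreference to find a doubled word character, and $1 in the replacement string to keep a single copy of it:

```java
public class BackreferenceDemo {
    // Hypothetical helper: collapses each doubled word character to one.
    // (\w) captures a character; \1 requires the same character again.
    static String collapseDoubles(String s) {
        return s.replaceAll("(\\w)\\1", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseDoubles("hello aab")); // prints "helo ab"
    }
}
```

Note the two notations: \1 is used inside the regex itself, while $1 is what Java expects in the replacement string.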
In .NET, a substitution is performed with the Replace method of the Regex class rather than the Match method. Replace works like Match, except that it takes an extra string parameter that supplies the replacement value. The Java counterparts are String.replaceAll() and Matcher.replaceAll().
I think you want something like this?
System.out.println(
"foo__f blah foo___f boo___f".replaceAll("(?<=foo)_+f", "baz")
); // prints "foobaz blah foobaz boo___f"
Here you simply replace the entire match with "baz", but the match uses lookbehind to ensure that _+f is preceded by foo.
If lookbehind is not possible (perhaps because the length of the prefix is not bounded), then capture the part you're NOT replacing as well, and refer back to it in the replacement string.
System.out.println(
"fooooo_f boooo_f xxx_f".replaceAll("(fo+|bo+)(_+f)", "$1baz")
); // prints "fooooobaz boooobaz xxx_f"
So here we're effectively only replacing what \2 matches.
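For comparison, the manual approach mentioned in the question (walk the matches and splice by the group's start and end index) could be sketched roughly as below. The class name GroupReplace and the helper replaceGroup are hypothetical, not part of the standard library, and the replacement is inserted literally, with no $ handling:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReplace {
    // Hypothetical helper: replaces only the given group in every match.
    static String replaceGroup(Pattern pattern, String input, int group, String replacement) {
        Matcher m = pattern.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the previously replaced span
        while (m.find()) {
            if (m.start(group) < 0) {
                continue; // the group did not take part in this match
            }
            out.append(input, last, m.start(group)); // text before the group
            out.append(replacement);                 // literal replacement
            last = m.end(group);
        }
        out.append(input, last, input.length());     // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("foo(_+f)");
        System.out.println(replaceGroup(p, "foo___f blah foo________f", 1, "baz"));
        // prints "foobaz blah foobaz"
    }
}
```

This is more code than the lookbehind or "$1baz" versions, but it does not require rewriting the pattern so that everything outside the target group is itself captured.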