
replace capturing group

Tags: java, regex

Suppose I have a regex with a capturing group, e.g. foo(_+f). I want to match it against a string and replace the first capturing group in every match with baz, so that

foo___f blah foo________f

is converted to:

foobaz blah foobaz

There doesn't appear to be an easy way to do this with the standard libraries. Matcher.replaceAll() replaces every match of the entire pattern, which converts the string to

baz blah baz

Obviously I can just iterate through the matches, store the start and end index of each capturing group, then go back and replace them, but is there an easier way?

Thanks, Don

Dónal asked May 27 '10 12:05




1 Answer

I think you want something like this?

    System.out.println(
        "foo__f blah foo___f boo___f".replaceAll("(?<=foo)_+f", "baz")
    ); // prints "foobaz blah foobaz boo___f"

Here you simply replace the entire match with "baz", but the match uses lookbehind to ensure that _+f is preceded by foo.

See also

  • regular-expressions.info/Lookarounds

If lookbehind is not possible (perhaps because the prefix has no bounded length), then simply capture what you're NOT replacing as well, and refer back to it in the replacement string.

    System.out.println(
        "fooooo_f boooo_f xxx_f".replaceAll("(fo+|bo+)(_+f)", "$1baz")
    ); // prints "fooooobaz boooobaz xxx_f"

So here we're effectively replacing only what group 2 (_+f) matches, because group 1 is put back unchanged via $1.
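
If you would rather keep the original pattern foo(_+f) untouched, the index-based approach mentioned in the question can be wrapped in a small helper. The following is only a sketch: the class name GroupReplace and the method replaceGroup are hypothetical names made up for illustration, not part of the standard library, and the replacement is inserted literally (no $1-style references are interpreted). It uses Matcher.start(int) and Matcher.end(int) to copy everything outside the chosen group unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReplace {

    // Hypothetical helper: replaces only the text matched by the given
    // capturing group in every match of regex; the rest of each match
    // and all unmatched text are copied through unchanged.
    public static String replaceGroup(String input, String regex,
                                      int group, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the last region already copied or replaced
        while (m.find()) {
            if (m.start(group) < 0) {
                continue; // the group did not participate in this match
            }
            out.append(input, last, m.start(group)); // text before the group
            out.append(replacement);                 // literal replacement
            last = m.end(group);
        }
        out.append(input, last, input.length());     // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        // The input and pattern from the question.
        System.out.println(
            replaceGroup("foo___f blah foo________f", "foo(_+f)", 1, "baz")
        ); // prints "foobaz blah foobaz"

        // The same result with the capture-and-reinsert trick shown above.
        System.out.println(
            "foo___f blah foo________f".replaceAll("(foo)(_+f)", "$1baz")
        ); // prints "foobaz blah foobaz"
    }
}
```

For a one-off replacement the replaceAll form is shorter; a helper like this is mainly useful when the pattern cannot easily be rewritten to capture its surroundings.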

polygenelubricants answered Nov 15 '22 19:11
