Replacing multiple substrings in Java when replacement text overlaps search text

Say you have the following string:

cat dog fish dog fish cat

You want to replace all cats with dogs, all dogs with fish, and all fish with cats. Intuitively, the expected result is:

dog fish cat fish cat dog

If you try the obvious solution, looping through with replaceAll(), you get:

  1. (original) cat dog fish dog fish cat
  2. (cat -> dog) dog dog fish dog fish dog
  3. (dog -> fish) fish fish fish fish fish fish
  4. (fish -> cat) cat cat cat cat cat cat
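
The steps above can be reproduced with a minimal sketch. The class and method names here are hypothetical, and it uses the literal String.replace() rather than the regex-based replaceAll() to keep quoting out of the picture; the failure mode is the same either way, because each pass re-scans the output of the previous one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class: shows why applying replacements one after another fails.
final class SequentialReplaceDemo {

    // Applies each replacement in turn; later passes see the output of earlier ones.
    static String replaceSequentially(String input, Map<String, String> replacements) {
        String result = input;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            // String.replace performs a literal (non-regex) replacement of every occurrence.
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>(); // preserves insertion order
        replacements.put("cat", "dog");
        replacements.put("dog", "fish");
        replacements.put("fish", "cat");
        // prints "cat cat cat cat cat cat"
        System.out.println(replaceSequentially("cat dog fish dog fish cat", replacements));
    }
}
```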

Clearly, this is not the intended result. So what's the simplest way to do this? I can cobble something together with Pattern and Matcher (and a lot of Pattern.quote() and Matcher.quoteReplacement()), but I refuse to believe I'm the first person to have this problem and there's no library function to solve it.

(FWIW, the actual case is a bit more complicated and doesn't involve straight swaps.)

David Moles asked Sep 23 '11 16:09


2 Answers

It seems StringUtils.replaceEach in Apache Commons Lang does what you want:

StringUtils.replaceEach("abcdeab", new String[]{"ab", "cd"}, new String[]{"cd", "ab"});
// returns "cdabecd"

Note that the documentation at the above links seems to be in error. See comments below for details.

Miserable Variable answered Oct 21 '22 23:10


String rep = str.replace("cat","§1§").replace("dog","§2§")
                .replace("fish","§3§").replace("§1§","dog")
                .replace("§2§","fish").replace("§3§","cat");

Ugly and inefficient as hell, but it works, provided the input never contains the placeholder strings (§1§, §2§, §3§) itself.
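
The placeholder trick can also be written as a loop. What follows is only a sketch under stated assumptions: the class name is hypothetical, the placeholders are single characters from the Unicode Private Use Area (starting at U+E000) instead of §1§-style markers, and the input and the search strings are assumed to be non-null, with non-empty keys that contain no such characters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-pass placeholder approach.
final class PlaceholderReplace {

    // First character of the Unicode Private Use Area; assumed not to occur in real input.
    private static final char BASE = '\uE000';

    static String replace(String input, Map<String, String> replacements) {
        List<String> values = new ArrayList<>(replacements.values());
        char limit = (char) (BASE + values.size());
        // Refuse input that already contains one of the placeholder characters.
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c >= BASE && c < limit) {
                throw new IllegalArgumentException("input contains a placeholder character");
            }
        }
        // Pass 1: swap each search string for its one-character placeholder.
        String marked = input;
        int index = 0;
        for (String key : replacements.keySet()) {
            marked = marked.replace(key, String.valueOf((char) (BASE + index++)));
        }
        // Pass 2: expand the placeholders in a single scan, so inserted text is never re-examined.
        StringBuilder out = new StringBuilder(marked.length());
        for (int i = 0; i < marked.length(); i++) {
            char c = marked.charAt(i);
            if (c >= BASE && c < limit) {
                out.append(values.get(c - BASE));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("cat", "dog");
        replacements.put("dog", "fish");
        replacements.put("fish", "cat");
        // prints "dog fish cat fish cat dog"
        System.out.println(replace("cat dog fish dog fish cat", replacements));
    }
}
```

It still scans the input once per search string, so it shares the inefficiency of the chained version.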


OK, here's a more elaborate and generic version. I prefer using a regular expression rather than a scanner. That way I can replace arbitrary Strings, not just words (which can be better or worse). Anyway, here goes:

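import java.util.Iterator;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Replaces every occurrence of each key in `replacements` with its value,
// in a single pass, so replacement text is never searched again.
public static String replace(
    final String input, final Map<String, String> replacements) {

    if (input == null || "".equals(input) || replacements == null
        || replacements.isEmpty()) {
        return input;
    }
    // Build one alternation of the literal (quoted) search strings.
    StringBuilder regexBuilder = new StringBuilder();
    Iterator<String> it = replacements.keySet().iterator();
    regexBuilder.append(Pattern.quote(it.next()));
    while (it.hasNext()) {
        regexBuilder.append('|').append(Pattern.quote(it.next()));
    }
    Matcher matcher = Pattern.compile(regexBuilder.toString()).matcher(input);
    StringBuffer out = new StringBuffer(input.length() + (input.length() / 10));
    while (matcher.find()) {
        // quoteReplacement keeps '$' and '\' in the replacement text literal.
        matcher.appendReplacement(out,
            Matcher.quoteReplacement(replacements.get(matcher.group())));
    }
    matcher.appendTail(out);
    return out.toString();
}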

Test code (ImmutableMap is from Guava):

System.out.println(replace("cat dog fish dog fish cat",
    ImmutableMap.of("cat", "dog", "dog", "fish", "fish", "cat")));

Output:

dog fish cat fish cat dog

Obviously, this solution only makes sense when there are many replacements; otherwise it's overkill.
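
On Java 9 and later, the same single-pass idea can be written more compactly with Matcher.replaceAll(Function). This is a sketch, not a drop-in library function: the class and method names are hypothetical, and sorting the keys longest-first is an added assumption so that a longer key such as "catfish" wins over its prefix "cat". Search strings are assumed to be non-empty.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: single-pass simultaneous replacement (requires Java 9+).
final class SimultaneousReplace {

    static String replaceAll(String input, Map<String, String> replacements) {
        if (input == null || input.isEmpty()
                || replacements == null || replacements.isEmpty()) {
            return input;
        }
        // Quote each key so it is matched literally; longest keys come first in the alternation.
        String regex = replacements.keySet().stream()
                .sorted((a, b) -> Integer.compare(b.length(), a.length()))
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Each match is looked up in the map; quoteReplacement keeps '$' and '\' literal.
        return Pattern.compile(regex)
                .matcher(input)
                .replaceAll(m -> Matcher.quoteReplacement(replacements.get(m.group())));
    }

    public static void main(String[] args) {
        // prints "dog fish cat fish cat dog"
        System.out.println(replaceAll("cat dog fish dog fish cat",
                Map.of("cat", "dog", "dog", "fish", "fish", "cat")));
    }
}
```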

Sean Patrick Floyd answered Oct 21 '22 23:10