Suppose I want to switch certain pairs of words. Say, for example, I want to switch dogs with cats and mice with rats, so that
This is my opinion about dogs and cats: I like dogs but I don't like cats. This is my opinion about mice and rats: I'm afraid of mice but I'm not afraid of rats.
becomes
This is my opinion about cats and dogs: I like cats but I don't like dogs. This is my opinion about rats and mice: I'm afraid of rats but I'm not afraid of mice.
The naїve approach
text = text.replace("dogs", "cats")
.replace("cats", "dogs")
.replace("mice", "rats")
.replace("rats", "mice")
is problematic since it can perform replacement on the same words multiple times. Either of the above example sentences would become
This is my opinion about dogs and dogs: I like dogs but I don't like dogs. This is my opinion about mice and mice: I'm afraid of mice but I'm not afraid of mice.
What's the simplest algorithm for replacing string pairs, while preventing something from being replaced multiple times?
Use whichever string search algorithm you deem to be appropriate, as long as it is able to search for regular expressions. Search for a regex that matches all the words you want to swap, e.g. dogs|cats|mice|rats. Maintain a separate string (in many languages, this needs to be some kind of StringBuilder in order for repeated appending to be fast) for the result, initially empty. For each match, you append the characters between the end of the previous match (or the beginning of the string) and the current match, and then you append the appropriate replacement (presumably obtained from a hashmap) to the result.
Most standard libraries should allow you to do this easily with built-in methods. For a Java example, see the documentation of Matcher.appendReplacement(StringBuffer, String). I recall doing this in C# as well, using a feature where you can specify a lambda function that decides what to replace each match with.
A naive solution that avoids any unexpected outcomes would be to replace each string with a temporary string, and then replace the temporary strings with the final strings. This assumes however, that you can form a string which is known not to be in the text, e.g.
text = text.replace("dogs", "{]1[}")
.replace("cats", "{]2[}")
.replace("mice", "{]3[}")
.replace("rats", "{]4[}")
.replace("{]2[}", "dogs")
.replace("{]1[}", "cats")
.replace("{]4[}", "mice")
.replace("{]3[}", "rats")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With