Assuming I have a List<String> and an empty List<Pattern>, is this the best way to turn the words in the String list into Pattern objects?
for(String word : stringList) {
patterns.add(Pattern.compile("\\b(" + word + ")\\b"));
}
And then to run this on a string later;
for(Pattern pattern : patterns) {
Matcher matcher = pattern.matcher(myString);
if(matcher.matches()) {
myString = matcher.replaceAll("String[$1]");
}
}
The replaceAll bit is just an example, but $1 would be used most of the time when I use this.
Is there a more efficient way? Because I feel like this is somewhat clunky. I'm using 80 Strings in the list by the way, though the Strings used are configurable, so there won't always be so many.
This is designed to be somewhat of a swearing filter, so I'll let you assume the words in the List.
An example of input would be "You're a <curse>", and the output would be "You're a *****" for this word. This may not always be the case, though: at some point I may be reading from a HashMap<String, String> where the key is the capture group and the value is the replacement.
Example:
if(hashMap.get(matcher.group(1)) == null) {
// '*' has no special meaning in a replacement string, so no backslash is needed.
myString = matcher.replaceAll("****");
} else {
myString = matcher.replaceAll(hashMap.get(matcher.group(1)));
}
You can join these patterns into a single pattern using alternation with |:
Pattern pattern = Pattern.compile("\\b(" + String.join("|",stringList) + ")\\b");
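As a rough sketch of how the joined pattern could be used, the snippet below applies it once with the "String[$1]" replacement from the question. The class name JoinDemo, the method name tag, and the sample words are made up for illustration, and the words are assumed to contain no regex metacharacters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class JoinDemo {
    // Builds one alternation pattern and replaces every match in a single pass.
    static String tag(List<String> words, String input) {
        Pattern pattern = Pattern.compile("\\b(" + String.join("|", words) + ")\\b");
        // $1 refers to whichever alternative matched inside the capture group.
        return pattern.matcher(input).replaceAll("String[$1]");
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar"); // placeholder words
        System.out.println(tag(words, "a foo and bar"));
        // prints: a String[foo] and String[bar]
    }
}
```

One compiled pattern means the input is scanned once rather than once per word.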
If you cannot use Java 8 and so do not have the String.join method, or if you need to escape the words so that characters in them are not interpreted as regex metacharacters, you will need to build the regex with a manual loop:
StringBuilder regex = new StringBuilder("\\b(");
for (String word : stringList) {
regex.append(Pattern.quote(word));
regex.append("|");
}
regex.setLength(regex.length() - 1); // delete last added "|"
regex.append(")\\b");
Pattern pattern = Pattern.compile(regex.toString());
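To illustrate why the quoting matters, here is a small sketch that builds the pattern both with and without Pattern.quote. The class QuoteDemo, the build method, and the sample word "a.b" are assumptions for demonstration only; like the loop above, it assumes the list is not empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class QuoteDemo {
    // Same loop as above; the flag only toggles quoting so the two can be compared.
    static Pattern build(List<String> words, boolean quote) {
        StringBuilder regex = new StringBuilder("\\b(");
        for (String word : words) {
            regex.append(quote ? Pattern.quote(word) : word);
            regex.append("|");
        }
        regex.setLength(regex.length() - 1); // delete last added "|" (assumes a non-empty list)
        regex.append(")\\b");
        return Pattern.compile(regex.toString());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a.b", "foo");
        // Unquoted, the '.' is a wildcard, so "axb" is matched by mistake.
        System.out.println(build(words, false).matcher("axb").find()); // true
        // Quoted, only the literal text "a.b" matches.
        System.out.println(build(words, true).matcher("axb").find());  // false
        System.out.println(build(words, true).matcher("a.b").find());  // true
    }
}
```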
To use different replacements for the different words, you can apply the pattern with this loop:
Matcher m = pattern.matcher(myString);
StringBuilder out = new StringBuilder();
int pos = 0;
while (m.find()) {
out.append(myString, pos, m.start());
String matchedWord = m.group(1);
String replacement = matchedWord.replaceAll(".", "*");
out.append(replacement);
pos = m.end();
}
out.append(myString, pos, myString.length());
myString = out.toString();
You can look up the replacement for the matched word any way you like. The example generates a replacement string of asterisks of the same length as the matched word.
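For the HashMap<String, String> case mentioned in the question, one possible arrangement is sketched below: the matched word is looked up in the map, and the row of asterisks is used as a fallback when there is no entry. The class WordFilter, its method names, and the sample words and replacements are all hypothetical; matching here is case-sensitive.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern and loop shown above.
class WordFilter {
    private final Pattern pattern;
    private final Map<String, String> replacements;

    WordFilter(List<String> words, Map<String, String> replacements) {
        StringBuilder regex = new StringBuilder("\\b(");
        String separator = "";
        for (String word : words) {
            regex.append(separator).append(Pattern.quote(word));
            separator = "|";
        }
        regex.append(")\\b");
        this.pattern = Pattern.compile(regex.toString());
        this.replacements = replacements;
    }

    String filter(String input) {
        Matcher m = pattern.matcher(input);
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (m.find()) {
            out.append(input, pos, m.start());      // copy text before the match
            String word = m.group(1);
            String replacement = replacements.get(word);
            if (replacement == null) {
                // No configured replacement: mask with asterisks of equal length.
                replacement = word.replaceAll(".", "*");
            }
            out.append(replacement);
            pos = m.end();
        }
        out.append(input, pos, input.length());     // copy the remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("darn", "[mild]");         // placeholder entry
        WordFilter filter = new WordFilter(Arrays.asList("darn", "heck"), replacements);
        System.out.println(filter.filter("Well darn, what the heck"));
        // prints: Well [mild], what the ****
    }
}
```

Because the replacement is appended directly to the StringBuilder rather than passed to replaceAll, characters such as $ or \ in the map values are inserted literally.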