
Censoring selected words (replacing them with ****) using a single replaceAll?

Tags: java, regex

I'd like to censor some words in a string by replacing each character in the word with a "*". Basically, I would like to do something like this (pseudocode):

String s = "lorem ipsum dolor sit";
s = s.replaceAll("ipsum|sit", /* $0.length() number of * */);

so that the resulting s equals "lorem ***** dolor ***".

I know how to do this with repeated replaceAll invocations, but I'm wondering: is it possible to do this with a single replaceAll?


Update: It's a part of a research case-study and the reason is basically that I would like to get away with a one-liner as it simplifies the generated bytecode a bit. It's not for a serious webpage or anything.

aioobe asked Mar 22 '26 04:03


1 Answer

Here's a modification to aioobe's answer, using nested assertions instead of a nested loop to generate the assertions:

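// Requires: import java.util.regex.Pattern;
public static void main(String... args) {
    String s = "lorem ipsum dolor sit blah $10 bleh";
    System.out.println(s.replaceAll(censorWords("ipsum", "sit", "$10"), "*"));
    // prints "lorem ***** dolor *** blah *** bleh"
}

// Builds one alternation branch per word; each branch matches a single
// character that lies inside an occurrence of that word.
public static String censorWords(String... words) {
    StringBuilder sb = new StringBuilder();
    for (String w : words) {
        if (sb.length() > 0) sb.append("|");
        sb.append(
           String.format("(?<=(?=%s).{0,%d}).",
              Pattern.quote(w),   // match the word literally
              w.length()-1        // farthest a character can sit from the word's start
           )
        );
    }
    return sb.toString();
}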

Some key points:

  • StringBuilder.append in a loop instead of String +=
  • Pattern.quote to escape any regex metacharacters (such as $ or \) in censored words
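
To illustrate the second point, here is a small sketch of what happens with and without quoting. The class and method names (QuoteDemo, unquoted, quoted) are made up for this example:

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Unquoted: "$10" is parsed as an end-of-input anchor followed by "10",
    // which can never match, so the text comes back unchanged.
    public static String unquoted(String text) {
        return text.replaceAll("$10", "X");
    }

    // Quoted: Pattern.quote wraps the word so it is matched literally.
    public static String quoted(String text) {
        return text.replaceAll(Pattern.quote("$10"), "X");
    }

    public static void main(String[] args) {
        System.out.println(unquoted("cost $10")); // prints "cost $10"
        System.out.println(quoted("cost $10"));   // prints "cost X"
    }
}
```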

That said, this is not the best solution to the problem. It's just a fun regex game to play, really.
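
For comparison, a more conventional approach is an explicit Matcher loop. It is not a single replaceAll, so it does not meet the constraint in the question; it is only a minimal sketch of the kind of alternative one might use otherwise. The names CensorLoop and censor are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CensorLoop {
    // Hypothetical helper: replaces every occurrence of any given word
    // with a run of asterisks of the same length.
    public static String censor(String text, String... words) {
        // Build a plain alternation of the literally quoted words.
        StringBuilder alternation = new StringBuilder();
        for (String w : words) {
            if (alternation.length() > 0) alternation.append("|");
            alternation.append(Pattern.quote(w));
        }

        Matcher m = Pattern.compile(alternation.toString()).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // One '*' per matched character.
            StringBuilder stars = new StringBuilder();
            for (int i = 0; i < m.group().length(); i++) stars.append('*');
            m.appendReplacement(out, stars.toString());
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(censor("lorem ipsum dolor sit blah $10 bleh", "ipsum", "sit", "$10"));
        // prints "lorem ***** dolor *** blah *** bleh"
    }
}
```

Note that a plain alternation consumes matches from left to right, so overlapping censored words are treated differently from the lookaround version: with the words "ab" and "bc", the input "abc" becomes "**c" here, whereas the lookaround regex would censor all three characters.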

Related questions

  • codingBat plusOut using regex

How it works

We want to replace with "*", so we have to match one character at a time. The question is which character.

It's the character where, if you go back far enough and then look forward, you see a censored word.

Here's the regex in more abstract form:

(?<=(?=something).{0,N})

This matches every position from which you can go back up to N characters and then look ahead and see something.
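
A small sketch can make the abstract form concrete for the single word "ipsum" (so N is 4). The class and method names are invented for this illustration: markPositions inserts a "|" at each position the zero-width pattern matches, and censor adds the trailing "." so that the character at each such position is consumed and replaced.

```java
public class LookaroundDemo {
    // Zero-width form: matches each position that lies 0 to 4 characters
    // after a spot where "ipsum" starts, and inserts a marker there.
    public static String markPositions(String text) {
        return text.replaceAll("(?<=(?=ipsum).{0,4})", "|");
    }

    // Same lookbehind plus a trailing '.', which consumes the character
    // at each matched position so it can be replaced by '*'.
    public static String censor(String text) {
        return text.replaceAll("(?<=(?=ipsum).{0,4}).", "*");
    }

    public static void main(String[] args) {
        System.out.println(markPositions("lorem ipsum")); // prints "lorem |i|p|s|u|m"
        System.out.println(censor("lorem ipsum"));        // prints "lorem *****"
    }
}
```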

polygenelubricants answered Mar 24 '26 17:03


