
Regex replace all match with symbol with same length

Tags: java, regex

My application currently logs sensitive information, which I need to mask.

A current log line looks like:

<Unable to fetch user info combination of dob=[20001231] and pan=[ABCD1234Z]>

But should be changed to something like

<Unable to fetch user info combination of dob=******** and pan=*********>

I tried to mask this using

str.replaceAll("\\[.*?\\]", "*")

but it changed it to:

<Unable to fetch user info combination of dob=* and pan=*>

How can I preserve the number of characters when masking the text between square brackets?

Justin Vincent asked Jan 11 '17 04:01



2 Answers

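It can be done in one line, using a lookahead to replace every character from which a closing bracket can be reached without crossing an opening bracket: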

str = str.replaceAll("(?=[^\\[]+]).", "*");


This preserves the square brackets. To omit them from the result, use this:

str = str.replaceAll("\\[?(?=[^\\[]*]).]?", "*");
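To see what the two one-liners produce, here is a small sketch that runs both on the log line from the question. The class and method names are hypothetical; only the two patterns come from the answer. Both patterns assume the brackets are balanced and not nested: a stray `]` with no opening `[` would cause the text before it to be masked as well.

```java
// Hypothetical demo class; the two patterns are the ones quoted in the answer.
class BracketMaskDemo {
    // Masks each character between '[' and ']' and keeps the brackets.
    static String maskKeepBrackets(String s) {
        // The lookahead succeeds only where a ']' is reachable without crossing a '['.
        return s.replaceAll("(?=[^\\[]+]).", "*");
    }

    // Masks each character between '[' and ']' and drops the brackets.
    static String maskDropBrackets(String s) {
        // The optional '[' and ']' are swallowed together with the first and last masked character.
        return s.replaceAll("\\[?(?=[^\\[]*]).]?", "*");
    }

    public static void main(String[] args) {
        String log = "<Unable to fetch user info combination of dob=[20001231] and pan=[ABCD1234Z]>";
        // prints <Unable to fetch user info combination of dob=[********] and pan=[*********]>
        System.out.println(maskKeepBrackets(log));
        // prints <Unable to fetch user info combination of dob=******** and pan=*********>
        System.out.println(maskDropBrackets(log));
    }
}
```

Because each replacement consumes exactly one character between the brackets, the mask has the same length as the original value (8 stars for the date, 9 for the PAN).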


Bohemian answered Sep 21 '22 00:09


You can use Pattern and Matcher to do this. For example:

import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String log = "<Unable to fetch user info combination of dob=[20001231] and pan=[ABCD1234Z]>";
// Group 1 captures the text between the brackets, without the brackets themselves.
Pattern pattern = Pattern.compile("\\[(.*?)\\]");
Matcher matcher = pattern.matcher(log);
while (matcher.find()) {
    // Build a mask with one '*' per captured character.
    char[] symbols = new char[matcher.group(1).length()];
    Arrays.fill(symbols, '*');
    // Replace the whole bracketed match, brackets included, with the mask.
    log = log.replace(matcher.group(), new String(symbols));
}
System.out.println(log);

Output:

<Unable to fetch user info combination of dob=******** and pan=*********>

Each call to log.replace rescans the whole string, so this may be slow on long inputs, but it shows the idea.
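One way to avoid the performance concern is to build the result in a single pass with Matcher.appendReplacement and Matcher.appendTail instead of calling String.replace inside the loop. The following is a minimal sketch, not part of the original answer: the class name SinglePassMasker and the method mask are hypothetical, and it masks only the text between the brackets and drops the brackets.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answers.
class SinglePassMasker {
    // Compiled once and reused; group 1 is the text between the brackets.
    private static final Pattern BRACKETED = Pattern.compile("\\[(.*?)\\]");

    static String mask(String input) {
        Matcher matcher = BRACKETED.matcher(input);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // One '*' per character found between the brackets.
            char[] stars = new char[matcher.group(1).length()];
            Arrays.fill(stars, '*');
            // Copies the text since the previous match, then the mask.
            matcher.appendReplacement(out, new String(stars));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints dob=******** and pan=*********
        System.out.println(mask("dob=[20001231] and pan=[ABCD1234Z]"));
    }
}
```

The mask contains only `*` characters, so it is safe to pass to appendReplacement directly; a replacement containing `$` or `\` would need Matcher.quoteReplacement.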

Baby answered Sep 20 '22 00:09