
Java regex: Replace all characters with `+` except instances of a given string

I have the following problem, which states:

Replace all characters in a string with the + symbol, except instances of the given string.

For example, if the string given is abc123efg and I have to replace every character except every instance of 123, then it would become +++123+++.

I figured a regular expression is probably the best tool for this, and I came up with this:

str.replaceAll("[^str]","+")  

where str is a variable, but it's not letting me use the variable without putting it in quotation marks. If I want the pattern to use the value of the variable str, how can I do that? I ran it with the string typed in manually and it worked, but can I just pass in a variable?

As of right now, I believe it's looking for the literal string "str" and not the value of the variable.

Here is the output; it's right for most cases, except for two :(

List of open test cases:

plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"
fsdff asked Sep 13 '18 04:09


1 Answer

Looks like this is the plusOut problem on CodingBat.

I had 3 solutions to this problem, and wrote a new streaming solution just for fun.

Solution 1: Loop and check

Create a StringBuilder from the input string and check for the word at every position. If the word doesn't match at the current position, replace that character with + and move on by one; if it does match, skip ahead by the length of the word.

    public String plusOut(String str, String word) {
      StringBuilder out = new StringBuilder(str);

      for (int i = 0; i < out.length(); ) {
        if (!str.startsWith(word, i))
          out.setCharAt(i++, '+');
        else
          i += word.length();
      }

      return out.toString();
    }

This is probably the expected answer for a beginner programmer, though it assumes that the string doesn't contain any astral plane characters (code points above U+FFFF), each of which is represented by 2 chars instead of 1.
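If astral plane characters matter, the loop can advance by code point instead of by char. This is only a sketch of that idea, not something CodingBat asks for: the class name PlusOutCodePoints is made up for illustration, the output is built by appending rather than with setCharAt, and the isEmpty() guard is an added assumption so that an empty word cannot stall the loop.

```java
class PlusOutCodePoints {
    // Hypothetical variant of Solution 1 that steps by code point, not by char.
    static String plusOut(String str, String word) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < str.length()) {
            // Added assumption: an empty word is never treated as a match,
            // otherwise i would never advance.
            if (!word.isEmpty() && str.startsWith(word, i)) {
                out.append(word);
                i += word.length();
            } else {
                out.append('+');
                // charCount is 1 for a BMP character and 2 for a surrogate pair.
                i += Character.charCount(str.codePointAt(i));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34", "xy"));          // ++xy++
        // U+1F600 takes two chars but becomes a single +
        System.out.println(plusOut("\uD83D\uDE00xy", "xy"));  // +xy
    }
}
```

With Solution 1 as written, the second call would produce ++xy, because each half of the surrogate pair is replaced separately.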

Solution 2: Replace the word with a marker, replace the rest, then restore the word

    public String plusOut(String str, String word) {
      return str.replaceAll(java.util.regex.Pattern.quote(word), "@")
                .replaceAll("[^@]", "+")
                .replaceAll("@", word);
    }

Not a proper solution, since it assumes that a certain character or sequence of characters (@ here) doesn't appear in the string.

Note the use of Pattern.quote to prevent the word from being interpreted as regex syntax by the replaceAll method.
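To see what Pattern.quote buys us, here is a small sketch comparing a quoted and an unquoted word. The class QuoteDemo and its two helper methods are made up for illustration; only replaceAll and Pattern.quote come from the standard library.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Treats word as regex syntax: "." matches any character.
    static String markUnquoted(String str, String word) {
        return str.replaceAll(word, "@");
    }

    // Treats word as a literal string.
    static String markQuoted(String str, String word) {
        return str.replaceAll(Pattern.quote(word), "@");
    }

    public static void main(String[] args) {
        System.out.println(markUnquoted("abc a.c", "a.c")); // @ @
        System.out.println(markQuoted("abc a.c", "a.c"));   // abc @
        System.out.println(markQuoted("--++ab", "++"));     // --@ab
    }
}
```

The last call matters for the test case plusOut("--++ab", "++"): without quoting, ++ is not even a valid pattern, and replaceAll throws a PatternSyntaxException.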

Solution 3: Regex with \G

    public String plusOut(String str, String word) {
      word = java.util.regex.Pattern.quote(word);
      return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
    }

Construct the regex \G((?:word)*+)., which does more or less what solution 1 is doing:

  • \G makes sure the match starts from where the previous match left off.
  • ((?:word)*+) picks out 0 or more consecutive instances of word, if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep every instance of the word it finds. Otherwise, the regex would not work correctly when the word appears at the end of the string, because the regex would backtrack and give up an instance of the word so that . has something to match.
  • . will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallows backtracking. We replace this character with +.
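The effect of the possessive quantifier is easiest to see side by side with the plain greedy one. The sketch below adds a boolean parameter that switches between *+ and *; that parameter and the class name PossessiveDemo are made up for this comparison and are not part of the solution above.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Same regex as Solution 3, with the quantifier made switchable for comparison.
    static String plusOut(String str, String word, boolean possessive) {
        String quoted = Pattern.quote(word);
        String quantifier = possessive ? "*+" : "*";
        return str.replaceAll("\\G((?:" + quoted + ")" + quantifier + ").", "$1+");
    }

    public static void main(String[] args) {
        // Possessive: the trailing "xy" is kept, no further match is possible.
        System.out.println(plusOut("abxy", "xy", true));   // ++xy
        // Greedy: the regex gives "xy" back so that . can match "x", then "y".
        System.out.println(plusOut("abxy", "xy", false));  // ++++
    }
}
```

When the word is followed by another character, as in aaxxxxbb with xx, both quantifiers give the same result; the difference only shows up when the word sits at the very end of the string.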

Solution 4: Streaming

    import java.util.Arrays;
    import java.util.stream.Collectors;

    public String plusOut(String str, String word) {
      return String.join(word,
        Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
          .map((String s) -> s.replaceAll("(?s:.)", "+"))
          .collect(Collectors.toList()));
    }

The idea is to split the string by word, do the replacement on the rest, and join the pieces back together with word using the String.join method.

  • Same as above, we need Pattern.quote to keep split from interpreting the word as regex. Since split by default removes empty strings at the end of the array, we need to pass -1 as the second parameter to make split leave those empty strings alone.
  • Then we create a stream out of the array and replace each remaining piece with a string of +. In Java 11, we can use s -> "+".repeat(s.length()) instead.
  • The rest is just converting the Stream to an Iterable (a List in this case) and joining the pieces for the result.
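For reference, a sketch of the Java 11 variant, assuming a Java 11 or later runtime. It also swaps String.join over a List for Collectors.joining, which is my own simplification and not required; the class name PlusOutStream is made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PlusOutStream {
    static String plusOut(String str, String word) {
        // The limit -1 keeps trailing empty strings, so a word at the end survives.
        return Arrays.stream(str.split(Pattern.quote(word), -1))
                .map(s -> "+".repeat(s.length()))   // String.repeat needs Java 11+
                .collect(Collectors.joining(word)); // put the word back between pieces
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34xyabcxy", "xy"));  // ++xy++xy+++xy
        // Effect of the limit argument on a trailing match:
        System.out.println("12xy".split("xy").length);       // 1
        System.out.println("12xy".split("xy", -1).length);   // 2
    }
}
```

Without the -1, the last piece of 12xy34xyabcxy would be dropped and the final xy would be lost when joining.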
nhahtdh answered Sep 21 '22 20:09