I'm using "\\b(\\w+)(\\W+\\1\\b)+"
along with input = input.replaceAll(regex, "$1");
to find duplicate words in a string and remove the duplicates. For example the string input = "for for for" would become "for".
However it is failing to turn "Hello hello" into "Hello" even though I have used Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
I can correct it by using "(?i)\\b(\\w+)(\\W+\\1\\b)+"
but I want to know why this is necessary? Why do I have to use the (?i) flag when I have already specified Pattern.CASE_INSENSITIVE?
Here's the full code for clarity:
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {
    public static void main(String[] args) {
        String regex = "\\b(\\w+)(\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());
        while (numSentences-- > 0) {
            String input = in.nextLine();
            Matcher m = p.matcher(input);
            // Check for subsequences of input that match the compiled pattern
            while (m.find()) {
                input = input.replaceAll(regex, "$1");
            }
            // Prints the modified sentence.
            System.out.println(input);
        }
        in.close();
    }
}
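Your problem is that you compile the regex with the CASE_INSENSITIVE flag but never use that compiled pattern for the replacement. input.replaceAll(regex, "$1") is String.replaceAll, which compiles the regex string again without any flags, so the replacement itself is case-sensitive.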
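To see the difference concretely, here is a minimal sketch that runs the same regex through both routes. The class name FlagDemo and the helper names viaString and viaMatcher are made up for this illustration; only the regex comes from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class FlagDemo {
    static final String REGEX = "\\b(\\w+)(\\W+\\1\\b)+";

    // String.replaceAll compiles REGEX on its own, with no flags.
    static String viaString(String input) {
        return input.replaceAll(REGEX, "$1");
    }

    // Matcher.replaceAll uses the Pattern it was created from, flags included.
    static String viaMatcher(String input) {
        Pattern p = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);
        return p.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(viaString("Hello hello"));  // Hello hello
        System.out.println(viaMatcher("Hello hello")); // Hello
    }
}
```

In the question's code the Matcher is only used for find(), so the flag decides whether the loop body runs but not what the replacement matches.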
You can also use (?i) in the middle of the regex, so that only the back-reference \1 is matched case-insensitively, like this:
String repl = "Hello hello".replaceAll("\\b(\\w+)(\\W+(?i:\\1)\\b)+", "$1");
//=> Hello
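As a rough illustration of how far the inline flag reaches, the sketch below compares the scoped form (?i:X) with a leading (?i). The class and method names (InlineFlagScope, matches, dedupe) are assumptions for the example, not part of the original code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class InlineFlagScope {
    // True if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Collapses repeated words, comparing repeats case-insensitively.
    static String dedupe(String input) {
        return input.replaceAll("\\b(\\w+)(\\W+(?i:\\1)\\b)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(matches("a(?i:b)", "aB")); // true: only b ignores case
        System.out.println(matches("a(?i:b)", "AB")); // false: a is still case-sensitive
        System.out.println(matches("(?i)ab", "AB"));  // true: flag covers the whole regex
        System.out.println(dedupe("hello Hello"));    // hello (first occurrence is kept)
    }
}
```

Because the flag is part of the regex string itself, it also takes effect in String.replaceAll, which is why this form works without a separately compiled Pattern.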
Either way, do the replacement with Matcher.replaceAll on the compiled pattern rather than with String.replaceAll.
Working Code:
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateWords {
    public static void main(String[] args) {
        String regex = "\\b(\\w+)(\\W+(?i:\\1)\\b)+";
        Pattern p = Pattern.compile(regex);
        // OR this one also works
        // String regex = "\\b(\\w+)(\\W+\\1\\b)+";
        // Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());
        while (numSentences-- > 0) {
            String input = in.nextLine();
            Matcher m = p.matcher(input);
            // Replace through the Matcher so the compiled pattern (and its flags) is used
            if (m.find()) {
                input = m.replaceAll("$1");
            }
            // Prints the modified sentence.
            System.out.println(input);
        }
        in.close();
    }
}