
How to replace any occurrence of a word between quotes

Tags: java, regex

I need to replace all occurrences of the word "and" ONLY when it occurs between single quotes. For example, replacing "and" with "XXX" in the string:

This and that 'with you and me and others' and not 'her and him'

Results in:

This and that 'with you XXX me XXX others' and not 'her XXX him'

I have been able to come up with a regular expression that handles nearly every case, but it fails on the "and" between the two sets of quoted text.

My code:

String str = "This and that 'with you and me and others' and not 'her and him'";

String patternStr = ".*?\\'.*?(?i:and).*?\\'.*";
Pattern pattern= Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.matches());
while(matcher.matches()) {
    System.out.println("in matcher");
    str = str.replaceAll("(?:\\')(.*?)(?i:and)(.*?)(?:\\')", "'$1XXX$2'");
    matcher = pattern.matcher(str);
}

System.out.println(str);
BlueVoid asked May 04 '11 18:05


2 Answers

Try this code:

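import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "This and that 'with you and me and others' and not 'her and him'";
// Match each single-quoted segment, quotes included.
Matcher matcher = Pattern.compile("'[^']*'").matcher(str);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
   // Replace inside the segment only; quoteReplacement keeps any '$' or '\' in it literal.
   String replaced = matcher.group().replaceAll("and", "XXX");
   matcher.appendReplacement(sb, Matcher.quoteReplacement(replaced));
}
// Copy whatever follows the last quoted segment.
matcher.appendTail(sb);
System.out.println("Output: " + sb);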

OUTPUT

Output: This and that 'with you XXX me XXX others' and not 'her XXX him'
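
Note that the loop above replaces every occurrence of the letters "and", so a quoted word such as "sandy" would also be changed, and it is case-sensitive, unlike the (?i:and) in the question. A minimal sketch of the same segment-by-segment idea with whole-word, case-insensitive matching might look like this. The class name QuotedWordReplacer and the method replaceInsideQuotes are made up for illustration, and the sketch assumes quotes are balanced and never escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class QuotedWordReplacer {
    // Matches one single-quoted segment, quotes included (assumes no escaped quotes).
    private static final Pattern QUOTED = Pattern.compile("'[^']*'");

    static String replaceInsideQuotes(String input, String word, String replacement) {
        // Whole-word, case-insensitive pattern for the target word.
        Pattern target = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE);
        // Treat the replacement as literal text, even if it contains '$' or '\'.
        String literal = Matcher.quoteReplacement(replacement);
        Matcher m = QUOTED.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String segment = target.matcher(m.group()).replaceAll(literal);
            // Quote again so '$' or '\' in the segment is copied literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(segment));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "This and that 'with you and me AND others' and not 'sandy and him'";
        // Prints: This and that 'with you XXX me XXX others' and not 'sandy XXX him'
        System.out.println(replaceInsideQuotes(str, "and", "XXX"));
    }
}
```

Text outside the quoted segments is never passed to the inner replaceAll, which is why the "and" between the two quoted parts is left alone.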
anubhava answered Oct 16 '22 02:10


String str = "This and that 'with you and me and others' and not 'her and him'";

// Match "and" surrounded by whitespace, but only where an odd number of quotes follows.
Pattern p = Pattern.compile("(\\s+)and(\\s+)(?=[^']*'(?:[^']*+'[^']*+')*+[^']*+$)");
System.out.println(p.matcher(str).replaceAll("$1XXX$2"));

The idea is that each time you find the complete word "and", you scan from the current match position to the end of the string, looking for an odd number of single quotes. An odd count means one quote is still waiting to be closed, so if the lookahead succeeds, the matched word must be between a pair of quotes.

Of course, this assumes quotes always come in matched pairs, and that quotes can't be escaped. Quotes escaped with backslashes can be dealt with, but it makes the regex much longer.
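
As one way to relax the escaping assumption, the sketch below does not extend the lookahead regex. It swaps in a different technique: tokenize the input into backslash escapes and quoted segments, where a quoted segment may itself contain escapes such as \', and replace only inside the quoted segments. The class name EscapedQuoteReplacer and the method replaceAnd are hypothetical, and the sketch assumes a backslash always escapes the single character that follows it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating escaped-quote handling; not from the original answer.
class EscapedQuoteReplacer {
    // Either a backslash escape (backslash plus one character), or a quoted
    // segment whose body may itself contain backslash escapes such as \'.
    private static final Pattern TOKEN =
            Pattern.compile("\\\\.|'(?:[^'\\\\]|\\\\.)*'");
    private static final Pattern AND = Pattern.compile("\\band\\b");

    static String replaceAnd(String input) {
        Matcher m = TOKEN.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String token = m.group();
            if (token.startsWith("'")) {
                // Quoted segment: replace the whole word "and" inside it.
                token = AND.matcher(token).replaceAll("XXX");
            }
            // Escapes outside quotes are copied unchanged; quoteReplacement
            // keeps the backslashes literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(token));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Java source "\\'" is the two characters \' in the actual string.
        String str = "rock and roll 'salt \\' and pepper' and more";
        // Prints: rock and roll 'salt \' XXX pepper' and more
        System.out.println(replaceAnd(str));
    }
}
```

Matching the escape alternative first means an escaped quote outside any quoted text is consumed as an escape and cannot open a segment by mistake.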

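I'm also assuming the target word never appears at the beginning or end of a quoted sequence, which seems reasonable for the word "and". If you want to allow target words that are not surrounded by whitespace, you could use something like "\\band\\b" instead (and replace with plain "XXX", since there are then no whitespace groups to restore). Be aware, though, that Java's \b and \w have not always agreed on what a word character is: by default \w is ASCII-only, while \b has historically used a broader, Unicode-aware definition, so results can differ on non-ASCII text.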
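
For illustration, here is a rough sketch of the word-boundary variant, reusing the lookahead from the regex above unchanged. The class name WordBoundaryLookahead and the method replaceAnd are invented for this example, and it still assumes balanced, unescaped quotes and ASCII text.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the lookahead is the one from the answer above.
class WordBoundaryLookahead {
    // \band\b matches the whole word; the lookahead requires an odd number of
    // single quotes between the match and the end of the string.
    private static final Pattern AND_IN_QUOTES = Pattern.compile(
            "\\band\\b(?=[^']*'(?:[^']*+'[^']*+')*+[^']*+$)");

    static String replaceAnd(String input) {
        // No capture groups are needed, so the replacement is just the new word.
        return AND_IN_QUOTES.matcher(input).replaceAll("XXX");
    }

    public static void main(String[] args) {
        // Prints: one and two 'XXX three XXX' and 'sandy XXX'
        System.out.println(replaceAnd("one and two 'and three and' and 'sandy and'"));
    }
}
```

Unlike the whitespace-delimited version, this one also catches "and" directly after an opening quote or directly before a closing one, while leaving "sandy" untouched.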

Alan Moore answered Oct 16 '22 02:10