The shell tr command supports replacing one set of characters with another set.
For example, echo hello | tr [a-z] [A-Z] will translate hello to HELLO.
In Java, however, I must replace each character individually, like the following:
"10 Dogs Are Racing"
    .replaceAll("0", "0")
    .replaceAll("1", "1")
    .replaceAll("2", "2")
    // ...
    .replaceAll("9", "9")
    .replaceAll("A", "A")
    // ...
    ;
The Apache Commons Lang library provides a convenient replaceChars method for this kind of replacement.
// half-width to full-width
System.out.println(
    org.apache.commons.lang.StringUtils.replaceChars(
        "10 Dogs Are Racing",
        "0123456789ABCDEFEGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
        "0123456789ABCDEFEGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    )
);
// Result:
// 10 Dogs Are Racing
But as you can see, the searchChars/replaceChars strings can get very long (and tedious: try finding the duplicated character in the ones above), even though each can be expressed by a simple regular-expression character class: [0-9A-Za-z] / [0-9A-Za-z]. Is there a regular-expression way to achieve this?
While there is no direct way to do this, constructing your own utility function to use in combination with replaceChars is relatively simple. The version below accepts simple character classes, without [ or ]; it does not do class negation ([^a-z]).
For your use case, you could do:
StringUtils.replaceChars(str, charRange("0-9A-Za-z"), charRange("0-9A-Za-z"))
Code:
import java.util.regex.PatternSyntaxException;

// Expands a simple character-class body such as "0-9A-Za-z" into the full list of characters.
public static String charRange(String str) {
    StringBuilder ret = new StringBuilder();
    char ch;
    for (int index = 0; index < str.length(); index++) {
        ch = str.charAt(index);
        if (ch == '\\') {
            if (index + 1 >= str.length()) {
                throw new PatternSyntaxException(
                    "Malformed escape sequence.", str, index
                );
            }
            // special case for escape character, consume next char:
            index++;
            ch = str.charAt(index);
        }
        if (index + 1 >= str.length() || str.charAt(index + 1) != '-') {
            // this was a single char, or the last char in the string
            ret.append(ch);
        } else {
            if (index + 2 >= str.length()) {
                throw new PatternSyntaxException(
                    "Malformed character range.", str, index + 1
                );
            }
            // this char was the beginning of a range; an int counter avoids
            // char overflow (an endless loop) when the range ends at '\uffff'
            for (int r = ch; r <= str.charAt(index + 2); r++) {
                ret.append((char) r);
            }
            index = index + 2;
        }
    }
    return ret.toString();
}
Produces:
0-9A-Za-z : 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
0-9A-Za-z : 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
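If you would rather not depend on Commons Lang, the same idea can be sketched as a small tr-like helper using only the JDK. This is a minimal sketch, not a drop-in replacement for tr: the class name Tr and the methods expand and tr are made up for illustration, escapes and negation are not handled, and when a source character appears more than once the first mapping wins. The full-width ranges are written as Unicode escapes so the example does not depend on source-file encoding.

```java
import java.util.HashMap;
import java.util.Map;

public class Tr {
    // Expands range specs such as "0-9A-F" into "0123456789ABCDEF".
    // Simplified on purpose: no escapes, no negation; a '-' that is the
    // first or last character of the spec is treated as a literal.
    static String expand(String spec) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < spec.length(); i++) {
            char start = spec.charAt(i);
            if (i + 2 < spec.length() && spec.charAt(i + 1) == '-') {
                char end = spec.charAt(i + 2);
                if (end < start) {
                    throw new IllegalArgumentException("Reversed range in: " + spec);
                }
                // int counter so a range ending at '\uffff' cannot overflow
                for (int c = start; c <= end; c++) {
                    out.append((char) c);
                }
                i += 2;
            } else {
                out.append(start);
            }
        }
        return out.toString();
    }

    // Replaces each character of input found in fromSpec with the character
    // at the same position in toSpec; other characters pass through unchanged.
    static String tr(String input, String fromSpec, String toSpec) {
        String from = expand(fromSpec);
        String to = expand(toSpec);
        if (from.length() != to.length()) {
            throw new IllegalArgumentException("Both sets must expand to the same length");
        }
        Map<Character, Character> table = new HashMap<>();
        for (int i = 0; i < from.length(); i++) {
            table.putIfAbsent(from.charAt(i), to.charAt(i)); // first mapping wins
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            char mapped = table.getOrDefault(c, c);
            out.append(mapped);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(tr("hello", "a-z", "A-Z")); // HELLO
        // half-width to full-width: U+FF10..U+FF19, U+FF21..U+FF3A, U+FF41..U+FF5A
        System.out.println(tr("10 Dogs Are Racing", "0-9A-Za-z",
                "\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A"));
    }
}
```

Building the lookup table once keeps each character lookup constant-time, which matters more as the sets grow; for short sets, the indexOf scan that replaceChars relies on is just as reasonable.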