I have the following characters that I would like to be considered "illegal":
~, #, @, *, +, %, {, }, <, >, [, ], |, “, ”, \, _, ^
I'd like to write a method that inspects a string and determines (true/false) if that string contains these illegals:
public boolean containsIllegals(String toExamine) {
return toExamine.matches("^.*[~#@*+%{}<>[]|\"\\_^].*$");
}
However, a simple matches(...) check isn't feasible for this. I need the method to scan every character in the string and make sure it's not one of these characters. Of course, I could do something horrible like:
public boolean containsIllegals(String toExamine) {
for(int i = 0; i < toExamine.length(); i++) {
char c = toExamine.charAt(i);
if(c == '~')
return true;
else if(c == '#')
return true;
// etc...
}
}
Is there a more elegant/efficient way of accomplishing this?
You can make use of the Pattern and Matcher classes here. Put all the filtered characters in a character class, and use the Matcher#find() method to check whether your pattern occurs in the string.
You can do it like this:
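import java.util.regex.Matcher;
import java.util.regex.Pattern;

public boolean containsIllegals(String toExamine) {
    // "\\\\" is a literal backslash in the regex; "\\_" alone would only escape the underscore
    Pattern pattern = Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\\\_^]");
    Matcher matcher = pattern.matcher(toExamine);
    return matcher.find();
}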
The find() method returns true if the given pattern is found anywhere in the string, even once.
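To illustrate why find() fits here while matches() on the bare character class does not, here is a minimal sketch. The class name FindVersusMatches, the helper isSingleIllegal, and the sample strings are illustrative assumptions; the character class follows the question's list, with the backslash written as "\\\\" so that it is matched as a literal character.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // Character class built from the question's list of illegal characters.
    // "\\\\" in Java source is the regex \\, i.e. one literal backslash.
    static final Pattern ILLEGAL = Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\\\_^]");

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean containsIllegals(String toExamine) {
        return ILLEGAL.matcher(toExamine).find();
    }

    // matches() succeeds only if the entire input is exactly one illegal character.
    static boolean isSingleIllegal(String toExamine) {
        return ILLEGAL.matcher(toExamine).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsIllegals("abc#def")); // true
        System.out.println(isSingleIllegal("abc#def"));  // false
        System.out.println(containsIllegals("abcdef"));  // false
    }
}
```

matches() has to consume the whole input, which is why the question's version needed the surrounding ^.* and .*$; find() avoids that wrapping.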
Another way that has not yet been pointed out is using String#split(regex). We can split the string on the given pattern and check the length of the array. If the length is 1, then the pattern was not in the string.
public boolean containsIllegals(String toExamine) {
    // "\\\\" matches a literal backslash; limit = 2 stops after the first split
    String[] arr = toExamine.split("[~#@*+%{}<>\\[\\]|\"\\\\_^]", 2);
    return arr.length > 1;
}
If arr.length > 1, the string contained one of the characters in the pattern, which is why it was split. I have passed limit = 2 as the second parameter to split because a single split is all we need.
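The limit argument also affects correctness, not only the amount of work done. With the default limit of 0, split discards trailing empty strings, so an illegal character at the very end of the input would go unnoticed. The following sketch compares the two calls; the class name SplitLimitDemo and the method containsIllegalsNoLimit are hypothetical names introduced for this comparison.

```java
class SplitLimitDemo {
    // Character class from the question's list; "\\\\" is a literal backslash.
    static final String ILLEGAL_REGEX = "[~#@*+%{}<>\\[\\]|\"\\\\_^]";

    // limit = 2: at most one split, and a trailing empty string is kept.
    static boolean containsIllegals(String toExamine) {
        return toExamine.split(ILLEGAL_REGEX, 2).length > 1;
    }

    // Default limit (0): trailing empty strings are removed from the result.
    static boolean containsIllegalsNoLimit(String toExamine) {
        return toExamine.split(ILLEGAL_REGEX).length > 1;
    }

    public static void main(String[] args) {
        System.out.println(containsIllegals("abc#"));        // true
        System.out.println(containsIllegalsNoLimit("abc#")); // false, the match is missed
        System.out.println(containsIllegals("abc"));         // false
    }
}
```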
I need the method to scan every character in the string
If you must do it character-by-character, a regexp is probably not a good way to go. However, since all characters on your "blacklist" have codes less than 128, you can do it with a small boolean array:
static final boolean[] blacklist = new boolean[128];
static {
    // Unassigned elements of the array are set to false
    blacklist[(int)'~'] = true;
    blacklist[(int)'#'] = true;
    blacklist[(int)'@'] = true;
    blacklist[(int)'*'] = true;
    blacklist[(int)'+'] = true;
    // ...
}
static boolean isBad(char ch) {
    // Characters outside the table (code >= 128) are never blacklisted
    return (ch < 128) && blacklist[(int)ch];
}
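For completeness, a sketch of how the lookup table could be combined with the scanning loop the question asks for. This is one possible arrangement, not the original answer's code: the class name BlacklistScan and the constant ILLEGAL_CHARS are assumptions, and the quote characters from the question's list are taken to be the ASCII double quote so that every entry stays below 128.

```java
class BlacklistScan {
    // Assumed full list, taken from the question; '"' stands in for the quote characters.
    static final String ILLEGAL_CHARS = "~#@*+%{}<>[]|\"\\_^";

    // One flag per ASCII code; elements default to false.
    static final boolean[] BLACKLIST = new boolean[128];
    static {
        for (char c : ILLEGAL_CHARS.toCharArray()) {
            BLACKLIST[c] = true;
        }
    }

    // The range check keeps non-ASCII characters from indexing past the table.
    static boolean isBad(char ch) {
        return ch < 128 && BLACKLIST[ch];
    }

    // Scans each character once and stops at the first illegal one.
    static boolean containsIllegals(String toExamine) {
        for (int i = 0; i < toExamine.length(); i++) {
            if (isBad(toExamine.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsIllegals("hello~")); // true
        System.out.println(containsIllegals("hello"));  // false
    }
}
```

Filling the table from a string keeps the list of characters in one place, at the cost of a small loop during class initialization.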
Use a constant to avoid recompiling the regex on every validation.
// "\\\\" matches a literal backslash; '^' is literal when it is not first in the class
private static final Pattern INVALID_CHARS_PATTERN =
    Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\\\_^].*$");
And change your code to:
public boolean containsIllegals(String toExamine) {
return INVALID_CHARS_PATTERN.matcher(toExamine).matches();
}
Compiling the pattern once avoids the recompilation that String.matches(...) performs on every call.
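One caveat with the anchored form: by default '.' does not match line terminators, so matches() returns false for multi-line input even when an illegal character is present. The sketch below contrasts it with a precompiled bare character class used with find(). The class name PrecompiledPatternDemo and both constant names are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class PrecompiledPatternDemo {
    // Anchored form: the whole input must match, and '.' stops at line terminators.
    static final Pattern ANCHORED =
        Pattern.compile("^.*[~#@*+%{}<>\\[\\]|\"\\\\_^].*$");

    // Bare character class: intended for use with find().
    static final Pattern CHAR_CLASS =
        Pattern.compile("[~#@*+%{}<>\\[\\]|\"\\\\_^]");

    static boolean containsIllegalsAnchored(String toExamine) {
        return ANCHORED.matcher(toExamine).matches();
    }

    static boolean containsIllegals(String toExamine) {
        return CHAR_CLASS.matcher(toExamine).find();
    }

    public static void main(String[] args) {
        String multiLine = "line1\nline#2";
        System.out.println(containsIllegalsAnchored(multiLine)); // false
        System.out.println(containsIllegals(multiLine));         // true
    }
}
```

Both patterns are compiled once as constants, so either variant avoids per-call compilation; they differ only in how they treat input that contains line breaks.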