Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I define custom character class shorthands?

Java provides some useful character classes like \d and \w. Can I define my own character classes? For example, it would be useful to be able to define shorthands for character classes like [A-Za-z_].

like image 804
fredoverflow Avatar asked Jul 21 '11 11:07

fredoverflow


People also ask

How do u specify a set of characters?

You can use a hyphen inside a character class to specify a range of characters. [0-9] matches a single digit between 0 and 9. You can use more than one range. [0-9a-fA-F] matches a single hexadecimal digit, case insensitively.

What do you mean by the \d \w and \s shorthand character classes signify in regular expressions?

What do the \D, \W, and \S shorthand character classes signify in regular expressions? The \D, \W, and \S shorthand character classes match a single character that is not a digit, word, or space character, respectively.

How do you denote special characters in regex?

Special Regex Characters: These characters have special meaning in regex (to be discussed below): . , + , * , ? , ^ , $ , ( , ) , [ , ] , { , } , | , \ . Escape Sequences (\char): To match a character having special meaning in regex, you need to use a escape sequence prefix with a backslash ( \ ).

What is the difference between * and *??

*? is non-greedy. * will match nothing, but then will try to match extra characters until it matches 1 , eventually matching 101 . All quantifiers have a non-greedy mode: . *? , .


1 Answers

Can I define my own character classes?

No, you can't.

Personally, when I have a (slightly) complicated regex, I break the regex up in smaller sub-regexes and then "glue" them together with a String.format(...) like this:

public static boolean isValidIP4(String address) {
    String block_0_255 = "(0|[1-9]\\d|2[0-4]\\d|25[0-5])";
    String regex = String.format(
            "%s(\\.%s){3}", 
            block_0_255, block_0_255
    );
    return address.matches(regex);
}

which is far more readable than a single pattern:

"(0|[1-9]\\d|2[0-4]\\d|25[0-5])(\\.(0|[1-9]\\d|2[0-4]\\d|25[0-5])){3}"

Note that this is just a quick example: validating IP addresses can probably better be done by a class from the java.net package, and if you'd do it like that, the pattern should be placed outside the method and pre-compiled.

Be careful with % signs inside your pattern!

like image 185
Bart Kiers Avatar answered Oct 01 '22 16:10

Bart Kiers