Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular expression matching fully qualified class names

What is the best way to match fully qualified Java class name in a text?

Examples: java.lang.Reflect, java.util.ArrayList, org.hibernate.Hibernate.

like image 876
Chun ping Wang Avatar asked Mar 05 '11 17:03

Chun ping Wang


2 Answers

A Java fully qualified class name (lets say "N") has the structure

N.N.N.N

The "N" part must be a Java identifier. Java identifiers cannot start with a number, but after the initial character they may use any combination of letters and digits, underscores or dollar signs:

([a-zA-Z_$][a-zA-Z\d_$]*\.)*[a-zA-Z_$][a-zA-Z\d_$]*
------------------------    -----------------------
          N                           N

They can also not be a reserved word (like import, true or null). If you want to check plausibility only, the above is enough. If you also want to check validity, you must check against a list of reserved words as well.

Java identifiers may contain any Unicode letter instead of "latin only". If you want to check for this as well, use Unicode character classes:

([\p{Letter}_$][\p{Letter}\p{Number}_$]*\.)*[\p{Letter}_$][\p{Letter}\p{Number}_$]*

or, for short

([\p{L}_$][\p{L}\p{N}_$]*\.)*[\p{L}_$][\p{L}\p{N}_$]*

The Java Language Specification, (section 3.8) has all details about valid identifier names.

Also see the answer to this question: Java Unicode variable names

like image 90
Tomalak Avatar answered Nov 06 '22 21:11

Tomalak


Here is a fully working class with tests, based on the excellent comment from @alan-moore

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.util.regex.Pattern;

import org.junit.Test;

public class ValidateJavaIdentifier {

    private static final String ID_PATTERN = "\\p{javaJavaIdentifierStart}\\p{javaJavaIdentifierPart}*";
    private static final Pattern FQCN = Pattern.compile(ID_PATTERN + "(\\." + ID_PATTERN + ")*");

    public static boolean validateJavaIdentifier(String identifier) {
        return FQCN.matcher(identifier).matches();
    }


    @Test
    public void testJavaIdentifier() throws Exception {
        assertTrue(validateJavaIdentifier("C"));
        assertTrue(validateJavaIdentifier("Cc"));
        assertTrue(validateJavaIdentifier("b.C"));
        assertTrue(validateJavaIdentifier("b.Cc"));
        assertTrue(validateJavaIdentifier("aAa.b.Cc"));
        assertTrue(validateJavaIdentifier("a.b.Cc"));

        // after the initial character identifiers may use any combination of
        // letters and digits, underscores or dollar signs
        assertTrue(validateJavaIdentifier("a.b.C_c"));
        assertTrue(validateJavaIdentifier("a.b.C$c"));
        assertTrue(validateJavaIdentifier("a.b.C9"));

        assertFalse("cannot start with a dot", validateJavaIdentifier(".C"));
        assertFalse("cannot have two dots following each other",
                validateJavaIdentifier("b..C"));
        assertFalse("cannot start with a number ",
                validateJavaIdentifier("b.9C"));
    }
}
like image 8
Renaud Avatar answered Nov 06 '22 22:11

Renaud