Regex to match a variable declaration in Java

Tags: java, regex

I want to parse a variable declaration statement and extract the variable name. This is the input:

String var = "private   String   ipaddress;";

I'm using the regex pattern below to match the string above:

.*private\\s+([a-z]*)\\s+([a-z0-9_]*);

It does not work: it reports that no match was found. Can anyone please help?

Krishnaveni asked Feb 08 '12 07:02



2 Answers

First of all, remove that dot from the beginning of the regex, since it requires a character before the private for a match.

Second, your regex is case sensitive and won't match the capital S in String. Either use [a-zA-Z] or make the expression case insensitive by putting (?i) at the start.

Btw, [a-zA-Z0-9_] would be the same as \w.

Another thing: your expression would catch illegal variable names as well as miss legal ones. Variable names may not start with a digit, but they may contain dollar signs. Thus the name expression should be something like ([a-zA-Z_$][\w$]*), meaning the first character must be a letter, underscore or dollar sign, followed by any number of word characters or dollar signs.
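To make the points above concrete, here is a minimal sketch that applies the identifier expression to both the type and the variable name. The class name DeclarationRegexDemo and the helper variableName are made up for illustration, and using the same expression for the type is an assumption that rules out generics, arrays and qualified type names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DeclarationRegexDemo {

    // Type and name both use the identifier expression from the answer:
    // a letter, underscore or dollar sign, then word characters or dollar signs.
    static final Pattern DECLARATION = Pattern.compile(
            "private\\s+([a-zA-Z_$][\\w$]*)\\s+([a-zA-Z_$][\\w$]*)\\s*;");

    // Hypothetical helper: returns the declared name, or null if nothing matches.
    static String variableName(String statement) {
        Matcher m = DECLARATION.matcher(statement);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(variableName("private   String   ipaddress;")); // ipaddress
        System.out.println(variableName("private int 9lives;"));           // null
    }
}
```

The second call returns null because a name starting with a digit does not satisfy the first character class.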

A last note: depending on what you do with those declarations, keep in mind that you might have to check for reserved words. The adjusted expression would still match "private String private", for example.
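One possible way to do that check, sketched here under the assumption that the JDK's javax.lang.model.SourceVersion class is available, is to reject a captured name when SourceVersion.isKeyword reports it as a keyword or literal. KeywordCheckDemo and variableName are illustrative names, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.lang.model.SourceVersion;

public class KeywordCheckDemo {

    static final Pattern DECLARATION = Pattern.compile(
            "private\\s+([a-zA-Z_$][\\w$]*)\\s+([a-zA-Z_$][\\w$]*)\\s*;");

    // Hypothetical helper: returns the declared name, or null when there is
    // no match or the captured "name" is a reserved word such as "private".
    static String variableName(String statement) {
        Matcher m = DECLARATION.matcher(statement);
        if (!m.find()) {
            return null;
        }
        String name = m.group(2);
        return SourceVersion.isKeyword(name) ? null : name;
    }

    public static void main(String[] args) {
        System.out.println(variableName("private String ipaddress;")); // ipaddress
        System.out.println(variableName("private String private;"));   // null
    }
}
```

Only the name is checked, because a type such as int is itself a keyword and still legal in that position.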

Another last note: keep in mind that a variable might have more modifiers than private, e.g. public, protected, static etc. - or none at all.
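As a rough sketch of how optional modifiers could be handled, the expression below accepts zero or more field modifiers before the type. The names ModifierDemo, MODIFIERS, IDENT and variableName are assumptions for this example, and annotations, generics and arrays are deliberately not covered.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ModifierDemo {

    // Zero or more field modifiers, each followed by whitespace.
    static final String MODIFIERS =
            "(?:(?:public|protected|private|static|final|transient|volatile)\\s+)*";

    // Identifier expression: letter, underscore or dollar sign, then word chars or '$'.
    static final String IDENT = "[a-zA-Z_$][\\w$]*";

    static final Pattern DECLARATION = Pattern.compile(
            "^\\s*" + MODIFIERS + "(" + IDENT + ")\\s+(" + IDENT + ")\\s*;");

    // Hypothetical helper: returns the declared name, or null if nothing matches.
    static String variableName(String statement) {
        Matcher m = DECLARATION.matcher(statement);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(variableName("public static final int value;")); // value
        System.out.println(variableName("int i;"));                         // i
        System.out.println(variableName("final String test ;  "));          // test
    }
}
```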

Edit:

Now that you have the asterisk after the first dot, that shouldn't be a problem for your special case. However, a dot matches almost any character and thus would match fooprivate as well. Depending on what you want to achieve either remove the dot or add a \s+ after the .*.
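The difference between the prefix variants can be checked with a small sketch like the following. PrefixDemo, REST and matches are names invented for the example, and Pattern.matches is used on the assumption that the whole string should match.

```java
import java.util.regex.Pattern;

public class PrefixDemo {

    // Remainder of the expression after "private": a type, a name and a semicolon.
    static final String REST = "\\s+[a-zA-Z]+\\s+\\w+;";

    // Hypothetical helper: true if the whole input matches prefix + "private" + REST.
    static boolean matches(String prefix, String input) {
        return Pattern.matches(prefix + "private" + REST, input);
    }

    public static void main(String[] args) {
        // ".*" accepts any text before "private", including "foo".
        System.out.println(matches(".*", "fooprivate String ipaddress;"));     // true
        // ".*\\s+" requires whitespace directly before "private" ...
        System.out.println(matches(".*\\s+", "fooprivate String ipaddress;")); // false
        System.out.println(matches(".*\\s+", "  private String ipaddress;"));  // true
        // ... so it also rejects a declaration with nothing in front of it.
        System.out.println(matches(".*\\s+", "private String ipaddress;"));    // false
        // With no prefix at all, the plain declaration matches.
        System.out.println(matches("", "private String ipaddress;"));          // true
    }
}
```

Which variant fits depends on whether the declaration is always at the start of the string or may be preceded by other text.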

Thomas answered Oct 13 '22 18:10


Since a variable declaration in Java can have more than 3 words before the variable name, I would suggest that you do not limit your search and use this instead:

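import java.util.regex.Matcher;
import java.util.regex.Pattern;

String var = "private   String   ipaddress;";
//String var2 = "private static final int test=13;";

// The greedy .+ skips modifiers and type; group 1 is the text between
// the last whitespace and the first ';' or '=' that follows it.
Pattern p = Pattern.compile(".+\\s(.+?)(;|=)");
Matcher m = p.matcher(var);

while (m.find()) {
    System.out.println(m.group(1)); // prints "ipaddress"
}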

It looks for any variable name that is preceded by whitespace and followed by either ";" or "=". This is a more general search for the variable name.

EDIT: This one actually got me thinking, since this is also a legal declaration in Java:

private
static
volatile
String
s , t1 = "";

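The code below was written quickly and could probably be improved. For instance, it only picks up comma-separated names that are written without whitespace around the commas.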

import java.util.LinkedList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableNames {

    public static void main(String[] args) {
        String var0 = "private static final int test,test2;";
        String var1 = "private \n static \n final \n int \n testName \n =\n   5 \n";
        String var2 = "private \n static \n final \n String \n testName \n =\n  \" aaa           = bbbb   \" \n";
        String var3 = "private \n static \n final \n String \n testName,testName2 \n =\n  \" aaa           = bbbb   \" \n";

        String var4 = "int i;";
        String var5 = "String s ;";
        String var6 = "final String test ;  ";
        String var7 = "public int go = 23;";
        String var8 = "public static final int value,valu2 ; ";
        String var9 = "public static final String t,t1,t2 = \"23\";";
        String var10 = "public \n static \n final \n String s1,s2,s3 = \" aaa , bbb, fff, = hhh = , kkk \";";
        String var11 = "String myString=\"25\"";

        LinkedList<String> input = new LinkedList<String>();
        input.add(var0); input.add(var1); input.add(var2); input.add(var3);
        input.add(var4); input.add(var5); input.add(var6); input.add(var7);
        input.add(var8); input.add(var9); input.add(var10); input.add(var11);

        // Print every variable name found in the sample declarations.
        LinkedList<String> result = parametersNames(input);
        for (String param : result) {
            System.out.println(param);
        }
    }

    // Returns the variable names declared in each input statement.
    private static LinkedList<String> parametersNames(LinkedList<String> input) {
        // With an initializer: names are whatever follows the last whitespace before '='.
        Pattern beforeAssignment = Pattern.compile(".+\\s(.+)$");
        // Without an initializer: names are the text just before the ';'.
        Pattern beforeTerminator = Pattern.compile(".+\\s(.+?)(;|=)");

        LinkedList<String> result = new LinkedList<String>();
        for (String var : input) {
            // Replace newlines with a space so adjacent words are not glued together.
            var = var.replace("\n", " ").trim();

            Matcher m;
            if (var.contains("=")) {
                // Drop the initializer, which may itself contain '=', ',' or ';'.
                var = var.substring(0, var.indexOf("=")).trim();
                m = beforeAssignment.matcher(var);
            } else {
                m = beforeTerminator.matcher(var);
            }

            if (m.find()) {
                // One statement may declare several comma-separated names.
                for (String token : m.group(1).split(",")) {
                    result.add(token.trim());
                }
            }
        }
        return result;
    }
}
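One way the approach above might be extended to a declaration such as "s , t1" is to strip the whitespace around commas before looking for the last chunk. This is only a sketch: MultiNameDemo and names are hypothetical, and it still reports just the first name when several variables each have their own initializer (for example int a = 1, b = 2;).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultiNameDemo {

    // The names are the last whitespace-separated chunk of the declaration head.
    private static final Pattern LAST_CHUNK = Pattern.compile(".+\\s(.+)$");

    // Hypothetical helper: returns the names declared in one statement.
    static List<String> names(String declaration) {
        // Collapse every whitespace run (including newlines) into one space.
        String s = declaration.replaceAll("\\s+", " ").trim();
        // Keep only the part before the first '=' or ';'.
        s = s.split("[=;]", 2)[0].trim();
        // Remove whitespace around commas so "s , t1" becomes "s,t1".
        s = s.replaceAll("\\s*,\\s*", ",");

        List<String> result = new ArrayList<>();
        Matcher m = LAST_CHUNK.matcher(s);
        if (m.find()) {
            result.addAll(Arrays.asList(m.group(1).split(",")));
        }
        return result;
    }

    public static void main(String[] args) {
        String multiLine = "private\nstatic\nvolatile\nString\ns , t1 = \"\";";
        System.out.println(names(multiLine));                                    // [s, t1]
        System.out.println(names("public static final String t,t1,t2 = \"23\";")); // [t, t1, t2]
        System.out.println(names("int i;"));                                     // [i]
    }
}
```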
Eugene answered Oct 13 '22 18:10