I would like to form a regex to recognize the declaration of a variable name. The user will enter a string that they would like to use as a variable name, and the program has to check whether the name is valid.
I have been trying for the whole day and couldn't get the correct one.
The first character of the variable name must be either a letter or an underscore; it must not be a digit. No commas or blanks are allowed in the variable name, and no special symbols other than the underscore are allowed.
The first thing we do is gather a list of all the valid characters for the first character:
[a-zA-Z_$]
Then the other characters:
[a-zA-Z_$0-9]
We want to match the whole string, and we can have zero or more of the other characters, so the regex becomes:
^[a-zA-Z_$][a-zA-Z_$0-9]*$
I allow capital letters as the first character in the regex (as well as dollar signs), because this is a test for validity, not for conventionally styled names. (Note that constants are conventionally written in all caps, including the first letter.)
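To see the pattern above in use, here is a minimal Java sketch. The class name IdentifierRegex and the method isValidName are made up for illustration and are not from the question. It checks the character rules only, so reserved words such as class or int would still pass.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class IdentifierRegex {
    // First character: a letter, underscore or dollar sign.
    // Remaining characters: the same set plus the digits 0-9.
    private static final Pattern VALID_NAME =
            Pattern.compile("^[a-zA-Z_$][a-zA-Z_$0-9]*$");

    // Returns true when the candidate satisfies the character rules.
    // Assumption: null is treated as invalid rather than throwing.
    static boolean isValidName(String candidate) {
        return candidate != null && VALID_NAME.matcher(candidate).matches();
    }
}
```

With this sketch, IdentifierRegex.isValidName("_count") and IdentifierRegex.isValidName("$total1") return true, while "1abc", "my var" and "a,b" are rejected. Compiling the pattern once into a static field avoids recompiling it on every call.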
You can use this:
"^[_a-z]\\w*$"
How it works:
^ // Match at the beginning
[_a-z] // Match either "_", or "a-z" at the beginning
\\w* // Match zero or more of characters - [a-zA-Z0-9_], after the beginning
$ // Till the end
Note: according to the Java naming conventions, a variable name should not start with an uppercase letter, so I have not included [A-Z] in the first character class.
Also, since Java allows $ in a variable name, even at the start, you should consider adding it to your allowed character set. So, you can modify the above regex as:
"^[_$a-z][\\w$]*$"
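As a rough illustration of how the two patterns from this answer differ, here is a small Java sketch using String.matches. The class name LowercaseStartCheck and its method names are hypothetical and exist only for this example.

```java
// Hypothetical demo class; all names here are illustrative only.
final class LowercaseStartCheck {
    // Convention-based pattern: a lowercase letter or underscore first,
    // followed by any number of word characters [a-zA-Z0-9_].
    static final String STRICT = "^[_a-z]\\w*$";

    // Same idea, but '$' is also accepted at the start and inside the name.
    static final String WITH_DOLLAR = "^[_$a-z][\\w$]*$";

    static boolean matchesStrict(String name) {
        return name.matches(STRICT);
    }

    static boolean matchesWithDollar(String name) {
        return name.matches(WITH_DOLLAR);
    }
}
```

Here matchesStrict("myVar1") is true, while matchesStrict("MyVar") and matchesStrict("$price") are false; matchesWithDollar("$price") is true. Since String.matches already requires the whole string to match, the ^ and $ anchors are redundant in this usage, though they do no harm and keep the pattern correct if it is later used with Matcher.find().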
This will do what you want:
"^[a-z_]\\w*$"
Explanation:
^ : start at the beginning of the string
[a-z_] : match a single lowercase letter or underscore
\\w* : match zero or more word characters (\w is equivalent to [a-zA-Z_0-9])
$ : match until the end of the string.
Edit: Updated to reflect the "dollar sign allowed" and "name shouldn't start with uppercase" pointed out by the others. Thanks for the reminder.
Edit 2: After doing some research I've removed the matching for dollar signs again. While technically permissible, it's certainly bad style in this context and therefore discouraged, just like variables starting with an uppercase letter. See also https://stackoverflow.com/a/4636667/1814922