
Java regexp error: \( is not a valid character

Tags: java, regex

I was using Java regular expressions today and found that the following pattern is not allowed:

String pattern = "[a-zA-Z\\s\\.-\\)\\(]*";

If I use it, it fails and tells me that \( is not a valid character.

But if I change the regexp to

String pattern = "[[a-zA-Z\\s\\.-]|[\\(\\)]]*";

Then it works. Is this a bug in the regexp engine, or am I misunderstanding how to work with the engine?

EDIT: I had an error in my string: there shouldn't be two opening brackets ([[), there should be only one. This is now corrected.

Marthin asked Dec 04 '22 09:12


2 Answers

Your regex has two problems.

  1. You've not closed the character class.

  2. The - is acting as a range operator with . on the left-hand side and ) on the right-hand side. But ) comes before . in Unicode (U+0029 versus U+002E), so this results in an invalid range.
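
The invalid range can be checked directly. This is only a minimal sketch, assuming the standard java.util.regex engine; the class name RangeCheck and the helper compiles are made up for illustration. It prints the code points involved and tries the range in both directions:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class, not part of the original question.
class RangeCheck {
    // Returns true if the regex compiles, false if the engine rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Code points of '(', ')' and '.'
        System.out.println((int) '(' + " " + (int) ')' + " " + (int) '.');

        // "\\.-\\)" is read as the range from '.' to ')', which runs backwards.
        System.out.println(compiles("[\\.-\\)]"));
        // The opposite direction, from ')' to '.', is a valid range.
        System.out.println(compiles("[\\)-\\.]"));
        // The pattern from the question contains the backwards range.
        System.out.println(compiles("[a-zA-Z\\s\\.-\\)\\(]*"));
    }
}
```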

To fix problem 1, close the char class or, if you did not mean to include [ in the allowed characters, delete one of the [.

To fix problem 2, either escape the - as \\- or move the - to the beginning or to the end of the char class.

So you can use:

String pattern = "[a-zA-Z\\s\\.\\-\\)\\(]*";

or

String pattern = "[a-zA-Z\\s\\.\\)\\(-]*";

or

String pattern = "[-a-zA-Z\\s\\.\\)\\(]*";
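
To try the three variants side by side, something like the following should do. It is a rough sketch: the class name DashFixes, the FIXES array and the sample strings are assumptions for illustration, not anything from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the sample inputs are made up.
class DashFixes {
    // The three corrected patterns: dash escaped, dash last, dash first.
    static final String[] FIXES = {
        "[a-zA-Z\\s\\.\\-\\)\\(]*",
        "[a-zA-Z\\s\\.\\)\\(-]*",
        "[-a-zA-Z\\s\\.\\)\\(]*"
    };

    // True if the whole input consists only of characters the class allows.
    static boolean allowed(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        for (String fix : FIXES) {
            // Letters, whitespace, dot, dash and parentheses are all accepted.
            System.out.println(allowed(fix, "J. Smith-Jones (Jr)"));
            // Digits are not in the class, so this input is rejected.
            System.out.println(allowed(fix, "Room 42"));
        }
    }
}
```

All three patterns describe the same set of characters; they differ only in how the dash is kept from being read as a range operator.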

codaddict answered Dec 21 '22 23:12


A literal dash - is safest at the end of the character class, since it is normally used to show a range (as in a-z). Rearrange it:

String pattern = "[a-zA-Z\\s\\.\\)\\(-]*";

Also, the characters (, . and ) do not need to be escaped inside brackets.

Update: As others pointed out, a literal [ inside a Java regex character class must also be escaped (as \\[).
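
Both points can be checked with a short test. This is a sketch under the assumption of the standard java.util.regex engine; the class name BracketEscapes and its helper are hypothetical:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class for escaping rules inside a character class.
class BracketEscapes {
    // Returns true if the regex compiles, false if the engine rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // '(', '.' and ')' are literals inside brackets, no backslashes needed.
        String unescaped = "[a-zA-Z\\s.)(-]*";
        System.out.println(Pattern.matches(unescaped, "J. Smith-Jones (Jr)"));

        // Inside a class the dot matches only a literal dot, not any character.
        System.out.println(Pattern.matches("[.]", "a"));

        // An unescaped second '[' opens a nested class that is never closed.
        System.out.println(compiles("[[a-z]*"));
        // Escaped, it is just another allowed character.
        System.out.println(Pattern.matches("[\\[a-z]*", "[abc"));
    }
}
```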

Tim answered Dec 21 '22 23:12