Java regex pattern: unclosed character class

I need some help. I'm getting:

Caused by: java.util.regex.PatternSyntaxException: Unclosed character class near index 24
^[a-zA-Z└- 0-9£µ /.'-\]*$
                        ^
        at java.util.regex.Pattern.error(Pattern.java:1713)
        at java.util.regex.Pattern.clazz(Pattern.java:2254)
        at java.util.regex.Pattern.sequence(Pattern.java:1818)
        at java.util.regex.Pattern.expr(Pattern.java:1752)
        at java.util.regex.Pattern.compile(Pattern.java:1460)
        at java.util.regex.Pattern.<init>(Pattern.java:1133)
        at java.util.regex.Pattern.compile(Pattern.java:823)

Here is my code:

String testString = value.toString();

Pattern pattern = Pattern.compile("^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'-\\]*$");
Matcher m = pattern.matcher(testString);

I have to use the Unicode escape for some characters because I'm working with XHTML.

Any help would be great!

Joseph Vance asked Jan 18 '13 00:01


2 Answers

Assuming that you want to match \ and - and not ]:

Pattern pattern = Pattern.compile("^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'\\\\-]*$");

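You need to double-escape your backslashes, because \ is an escape character in regex as well as in Java string literals. Your \\] escapes the backslash for Java but not for regex: the regex engine receives \], which is an escaped ] and therefore leaves the character class unclosed. You need to add another Java-escaped \ in order to regex-escape your second Java-escaped \.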

So \\\\ becomes \\ after Java string escaping, which the regex engine then reads as a single literal \.
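To see the two escaping layers separately, here is a minimal sketch. The class name `BackslashLayers` and its helper are made up for illustration and are not part of the question's code. It shows what the regex engine actually receives and checks that it matches exactly one backslash.

```java
import java.util.regex.Pattern;

final class BackslashLayers {
    // Java source "\\\\" is the two-character string \\ at runtime.
    static final String REGEX = "\\\\";

    // Returns true when the input is exactly one backslash character.
    static boolean isSingleBackslash(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX.length());             // 2: the regex engine sees \\
        System.out.println(isSingleBackslash("\\"));    // true: input is one backslash
        System.out.println(isSingleBackslash("\\\\"));  // false: input is two backslashes
    }
}
```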

Moving - to the end of the character class means that it is treated as a literal character rather than as a range operator, as pointed out by Pshemo.
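As a quick check of the corrected pattern, something like the following could be used. The class name `FixedPatternDemo`, the `accepts` helper, and the sample inputs are my own assumptions, chosen only to exercise the hyphen, the backslash, and the excluded ].

```java
import java.util.regex.Pattern;

final class FixedPatternDemo {
    // Same pattern as in the answer: backslash is regex-escaped, hyphen comes last.
    static final Pattern FIXED =
            Pattern.compile("^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'\\\\-]*$");

    // Returns true when every character of s belongs to the character class.
    static boolean accepts(String s) {
        return FIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("O'Neil-Smith 42")); // true: letters, ', -, space, digits
        System.out.println(accepts("a\\b/c"));          // true: literal backslash and slash
        System.out.println(accepts("a]b"));             // false: ] is not in the class
    }
}
```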

Jeff answered Oct 17 '22 01:10


It is hard to say what you are trying to achieve, but I can see a few strange things in your regex:

  1. You opened a character class but never closed it. Instead you wrote \\], which the regex engine sees as \], making ] an ordinary character.
    • If you want to include ] in your character class, you need an additional ] at the end, like "^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'-\\]]*$"
    • If you want to include \ in your character class, you need to write \\\\, because its special meaning has to be escaped twice: once for the regex engine and once for the Java string literal.
  2. You used - in '-\\]. Inside a character class, - specifies a range of characters, as in a-z or A-Z. To escape its special meaning, write \\-
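The two points above can be checked with a small sketch like this one. `CharClassCheck` and `compiles` are hypothetical names, and the second pattern is just one possible rewrite, assuming both - and \ are meant to be literal characters.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CharClassCheck {
    // Returns true if the regex compiles, false on PatternSyntaxException.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Original from the question: \\] reaches the regex engine as \],
        // so the character class is never closed.
        System.out.println(compiles("^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'-\\]*$"));
        // Hyphen and backslash both regex-escaped; the final ] closes the class.
        System.out.println(compiles("^[a-zA-Z\300-\3770-9\u0153\346 \u002F.'\\-\\\\]*$"));
    }
}
```

The first call reproduces the "Unclosed character class" failure as a `false` result, and the second pattern compiles.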
Pshemo answered Oct 16 '22 23:10