
Validating name string with dashes and single quotes

Tags: java, regex

I am trying to validate a string with the following specification:

"Non-empty string that contains only letters, dashes, or single quotes"

I'm using String.matches("[a-zA-Z|-|']*") but it's not catching the - characters correctly. For example:

Test         Result  Should Be
==============================
shouldpass   true    true
fail3        false   false
&fail        false   false
pass-pass    false   true
pass'again   true    true
-'-'-pass    false   true

So "pass-pass" and "-'-'-pass" are failing. What am I doing wrong with my regex?

reesjones asked Dec 30 '25 13:12


2 Answers

You should use the following regex:

[a-zA-Z'-]+

Your regex allows a literal |, and |-| specifies a range from | to |, so the hyphen itself is never matched. To match a literal hyphen, place it at the beginning or end of the character class, or escape it if it sits in the middle. The + quantifier at the end ensures the string is non-empty.
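
As a quick check, the corrected pattern can be run against the test strings from the question. This is a minimal sketch: the class name NameValidator and the method isValidName are made up for illustration, and precompiling the Pattern is simply one common alternative to calling String.matches each time.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the question.
final class NameValidator {
    // Letters, single quote, and a literal hyphen placed last so it cannot form a range.
    private static final Pattern NAME = Pattern.compile("[a-zA-Z'-]+");

    static boolean isValidName(String input) {
        // matches() requires the whole input to match, just like String.matches().
        return input != null && NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Test strings from the question, plus the empty string.
        String[] tests = {"shouldpass", "fail3", "&fail", "pass-pass", "pass'again", "-'-'-pass", ""};
        for (String test : tests) {
            System.out.printf("%-12s %b%n", test, isValidName(test));
        }
    }
}
```

With this pattern the six strings from the question should produce the values in the "Should Be" column, and the empty string is rejected because + requires at least one character.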

Another alternative is to include all Unicode letters:

[\p{L}'-]+

Java string: "[\\p{L}'-]+".
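
A small sketch of the Unicode variant, assuming names such as "Stribiżew" should be accepted. The class name UnicodeNameValidator and the sample inputs are invented for illustration; the non-ASCII letters are written as Unicode escapes so the example does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names and sample inputs are illustrative only.
final class UnicodeNameValidator {
    // \p{L} matches any Unicode letter; the hyphen is last, so it is a literal.
    private static final Pattern NAME = Pattern.compile("[\\p{L}'-]+");

    static boolean isValidName(String input) {
        return input != null && NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "Stribi\u017Cew" contains U+017C (z with dot above), a non-ASCII letter.
        System.out.println(isValidName("Stribi\u017Cew"));
        // "O'Brien-M\u00FCller" mixes a quote, a hyphen, and U+00FC (u with diaeresis).
        System.out.println(isValidName("O'Brien-M\u00FCller"));
        // Digits are not letters, so this one is rejected.
        System.out.println(isValidName("fail3"));
    }
}
```

The same two names would be rejected by [a-zA-Z'-]+, since that class only covers ASCII letters.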

Wiktor Stribiżew answered Jan 01 '26 02:01


Possible solution:

[a-zA-Z-']+

Problems with your regex:

  1. If you don't want to accept empty strings, change * to + to accept one or more characters instead of zero or more.

  2. Characters in a character class are implicitly separated by an OR operator. For instance:

    the regex [abc] is equivalent to the regex a|b|c.

    So the regex engine doesn't need an OR operator there, which means that | is treated as a plain pipe literal:

    [a|b] matches a OR | OR b

  3. You seem to know that - has a special meaning in a character class: it creates a range of characters, like a-z. This means that the regex engine treats |-| as the range of characters between | and | (effectively just one character: |), which looks like the main problem with your regex.

    To create a literal - we need to either

    • escape it: \-
    • place it where it can't be interpreted as a range, that is, where it has no characters available to serve as both the left and right ends of a range l-r:
      • at the start of the character class, [- ...] (no left range character)
      • at the end of the character class, [... -] (no right range character)
      • right after another range, as in A-Z-x: Z has already been used as the end of the range A-Z, so it can't be reused to start a range Z-x.
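
The points above can be checked with a short Java sketch. The class name HyphenPlacementDemo and the helper fullyMatches are hypothetical; the behavior shown is that of java.util.regex, and other regex engines may treat a hyphen right after a range differently.

```java
// Hypothetical demo class; the names are illustrative only.
final class HyphenPlacementDemo {
    // Thin wrapper around String.matches, which requires the whole input to match.
    static boolean fullyMatches(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String original = "[a-zA-Z|-|']*";
        // | is a literal inside the class, so a pipe is accepted...
        System.out.println(fullyMatches(original, "a|b"));        // true
        // ...but |-| is only the one-character range | to |, so a hyphen is rejected.
        System.out.println(fullyMatches(original, "pass-pass"));  // false

        // Four ways to get a literal hyphen into the class.
        String[] fixedClasses = {
            "[a-zA-Z\\-']+",  // escaped
            "[-a-zA-Z']+",    // at the start
            "[a-zA-Z'-]+",    // at the end
            "[a-zA-Z-']+"     // right after the range A-Z
        };
        for (String regex : fixedClasses) {
            System.out.println(regex + " -> " + fullyMatches(regex, "-'-'-pass"));
        }
    }
}
```

Each of the four corrected classes should accept "-'-'-pass", one of the inputs the original pattern wrongly rejected.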

Pshemo answered Jan 01 '26 03:01


