I need to modify a regular expression so that it allows standard letters, French characters, spaces and dashes (hyphens), but never two hyphens in a row.
What I have right now is:
import java.util.regex.Pattern;

public class FrenchRegEx {
    static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z-' ]+";

    public static void main(String[] args) {
        String name;
        //name = "Jean Luc"; // allowed
        //name = "Jean-Luc"; // allowed
        //name = "Jean-Luc-Marie"; // allowed
        name = "Jean--Luc"; // NOT allowed
        if (!Pattern.matches(NAME_PATTERN, name)) {
            System.out.println("ERROR!");
        } else System.out.println("OK!");
    }
}
It accepts 'Jean--Luc' as a name, which should not be allowed.
Any help with this? Thanks.
So, you want a pattern made of runs of one or more letters, separated by single hyphens (or spaces). It's just a matter of writing the pattern that way:
"[\u00C0-\u017Fa-zA-Z']+([- ][\u00C0-\u017Fa-zA-Z']+)*"
This also assumes that you don't want names to start or end with a hyphen or space, that you don't want more than one space in a row, and that a space may not directly follow or precede a hyphen.
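To make those assumptions concrete, here is a minimal sketch that runs the pattern above against the names from the question plus a few edge cases. The class name SeparatedNameCheck and the helper isValid are made up for this illustration; only the regex comes from the answer.

```java
import java.util.regex.Pattern;

final class SeparatedNameCheck {
    // Pattern from the answer: runs of letters/apostrophes joined by a single hyphen or space.
    static final Pattern NAME = Pattern.compile(
            "[\u00C0-\u017Fa-zA-Z']+([- ][\u00C0-\u017Fa-zA-Z']+)*");

    // Hypothetical helper: true when the whole string is an acceptable name.
    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Jean Luc", "Jean-Luc", "Jean-Luc-Marie",    // expected to pass
            "Jean--Luc", "-Jean", "Jean  Luc", "Jean- Luc" // expected to fail
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + (isValid(s) ? "OK!" : "ERROR!"));
        }
    }
}
```

The first three samples should print OK! and the rest ERROR!, because every separator must be followed by at least one letter or apostrophe.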
You need to disallow consecutive hyphens. You may do it with a negative lookahead:
static final String NAME_PATTERN = "(?!.*--)[\u00C0-\u017Fa-zA-Z-' ]+";
                                    ^^^^^^^^
To disallow any of the special chars to be consecutive, use
static final String NAME_PATTERN = "(?!.*([-' ])\\1)[\u00C0-\u017Fa-zA-Z-' ]+";
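The difference between the two lookahead variants is easiest to see side by side. The following is only a sketch: the class and method names are invented for the comparison, and the two regexes are the ones shown above.

```java
import java.util.regex.Pattern;

final class LookaheadNameCheck {
    // Character class shared by both variants (same as in the question).
    static final String ALLOWED = "[\u00C0-\u017Fa-zA-Z-' ]+";

    // Variant 1: reject only two hyphens in a row.
    static final Pattern NO_DOUBLE_HYPHEN = Pattern.compile("(?!.*--)" + ALLOWED);

    // Variant 2: reject any special char (hyphen, apostrophe, space) repeated twice in a row.
    static final Pattern NO_DOUBLED_SPECIAL = Pattern.compile("(?!.*([-' ])\\1)" + ALLOWED);

    static boolean noDoubleHyphen(String s) {
        return NO_DOUBLE_HYPHEN.matcher(s).matches();
    }

    static boolean noDoubledSpecial(String s) {
        return NO_DOUBLED_SPECIAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Jean-Luc", "Jean--Luc", "Jean  Luc", "Jean- Luc", "-Jean"};
        for (String s : samples) {
            System.out.println("\"" + s + "\": variant 1 = " + noDoubleHyphen(s)
                    + ", variant 2 = " + noDoubledSpecial(s));
        }
    }
}
```

Note what the lookaheads do not cover: "Jean- Luc" (a hyphen followed by a space) and "-Jean" (a leading hyphen) pass both variants, since the backreference \\1 only catches the same character repeated and nothing restricts where a special char may appear.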
Another way is to unroll the pattern a bit, so that it matches strings where the special char(s) can appear in between letters but cannot appear consecutively (i.e. if you need to match strings like Abc-def'here):
static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)*";
or to allow only 1 special char, which can only appear in between letters (i.e. if you need to only allow strings like abc-def or abc'def):
static final String NAME_PATTERN = "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)?";
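As a rough illustration of how the two unrolled variants differ, the sketch below builds both from a shared letter class. The names UnrolledNameCheck, anyNumber and atMostOne are placeholders chosen for this example.

```java
import java.util.regex.Pattern;

final class UnrolledNameCheck {
    // One run of plain or accented letters.
    static final String LETTERS = "[\u00C0-\u017Fa-zA-Z]+";

    // Any number of "separator + letters" groups.
    static final Pattern ANY_NUMBER =
            Pattern.compile(LETTERS + "(?:[-' ]" + LETTERS + ")*");

    // At most one "separator + letters" group.
    static final Pattern AT_MOST_ONE =
            Pattern.compile(LETTERS + "(?:[-' ]" + LETTERS + ")?");

    static boolean anyNumber(String s) {
        return ANY_NUMBER.matcher(s).matches();
    }

    static boolean atMostOne(String s) {
        return AT_MOST_ONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"Jean-Luc", "Jean-Luc-Marie", "Abc-def'here", "Jean--Luc", "Jean-"};
        for (String s : samples) {
            System.out.println("\"" + s + "\": * variant = " + anyNumber(s)
                    + ", ? variant = " + atMostOne(s));
        }
    }
}
```

Both variants reject "Jean--Luc" and "Jean-", because a separator must always be followed by letters. Only the * variant accepts "Jean-Luc-Marie" and "Abc-def'here".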
Note that you do not need anchors here because you are using the pattern inside a .matches() method that requires a full string match.
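If the same pattern is ever used with Matcher.find() instead of matches(), the anchors do matter. The sketch below shows the difference, assuming the unrolled pattern from above; the class and method names are invented for the demonstration.

```java
import java.util.regex.Pattern;

final class AnchorDemo {
    static final String NAME_PATTERN =
            "[\u00C0-\u017Fa-zA-Z]+(?:[-' ][\u00C0-\u017Fa-zA-Z]+)*";

    // matches() must consume the entire input, so no anchors are needed.
    static boolean fullMatch(String s) {
        return Pattern.matches(NAME_PATTERN, s);
    }

    // find() succeeds as soon as any substring matches.
    static boolean findUnanchored(String s) {
        return Pattern.compile(NAME_PATTERN).matcher(s).find();
    }

    // With explicit anchors, find() is restricted to the whole input again.
    static boolean findAnchored(String s) {
        return Pattern.compile("^(?:" + NAME_PATTERN + ")$").matcher(s).find();
    }

    public static void main(String[] args) {
        String name = "Jean--Luc";
        System.out.println("matches():        " + fullMatch(name));
        System.out.println("find():           " + findUnanchored(name));
        System.out.println("find() with ^..$: " + findAnchored(name));
    }
}
```

For "Jean--Luc", the unanchored find() reports a hit because the substring "Jean" matches on its own, while matches() and the anchored find() both reject the input.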
NOTE: you may further tune the patterns by moving special chars that may appear anywhere in the string from the [-' ] character class to the [\u00C0-\u017Fa-zA-Z] character classes, like [\u00C0-\u017Fa-zA-Z'], but watch out for -. It should be placed at the very end of the class, right before the closing ], so that it is read as a literal hyphen and not as a range operator.
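The warning about - can be checked with a small experiment. This sketch uses two tiny character classes picked only for illustration (they are not from the answer): one where the hyphen sits between two characters, and one where it sits at the end.

```java
import java.util.regex.Pattern;

final class HyphenPlacementDemo {
    // Hyphen between space and apostrophe: parsed as the range from ' ' (0x20) to '\'' (0x27).
    static boolean rangeClassMatches(String s) {
        return Pattern.matches("[ -']", s);
    }

    // Hyphen right before the closing bracket: parsed as a literal hyphen.
    static boolean literalClassMatches(String s) {
        return Pattern.matches("[ '-]", s);
    }

    public static void main(String[] args) {
        String[] samples = {"-", "#", "'", " "};
        for (String s : samples) {
            System.out.println("\"" + s + "\": range form = " + rangeClassMatches(s)
                    + ", literal form = " + literalClassMatches(s));
        }
    }
}
```

The range form accepts "#" (which lies between space and apostrophe in code point order) and rejects "-", the opposite of what was intended. The literal form accepts "-" and rejects "#".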