 

How do I match accented characters with PHP preg?

I’d like to let my users enter not only letters and numbers, but also “special” letters such as “á” and “é”. However, I do not want them to be able to use symbols such as “!”, “@”, or “%”.

Is there a way to write a regex to accomplish this? (Preferably without specifying each special letter.)

Now I have:

$reg = '/^[\w\-]*$/';
Maurice asked Jan 25 '10 16:01


2 Answers

You could use Unicode character properties to describe the characters:

/^[\p{L}-]*$/u

\p{L} describes the class of Unicode letter characters.
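
As a minimal sketch of how that pattern could be used, the snippet below wraps it in a hypothetical helper, `isAllowedName()` (the name is an assumption, not part of the answer). It also adds `\p{N}` and `_` to the class, on the assumption that digits and underscores should stay allowed as they were in the asker's `[\w\-]`, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical helper, not from the original answer: accepts Unicode letters,
// Unicode digits, underscores, and hyphens (mirroring the asker's [\w\-]).
function isAllowedName(string $input): bool
{
    // \p{L} = any Unicode letter, \p{N} = any Unicode number.
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match() returns false on invalid UTF-8, so compare against 1.
    return preg_match('/^[\p{L}\p{N}_-]*$/u', $input) === 1;
}

var_dump(isAllowedName('José-María')); // bool(true)
var_dump(isAllowedName('user_42'));    // bool(true)
var_dump(isAllowedName('50%'));        // bool(false)
var_dump(isAllowedName("\xE9"));       // bool(false): not valid UTF-8
```

Because the pattern uses `*`, an empty string also passes; swap it for `+` if at least one character is required. Letters typed as a base letter plus a separate combining accent would additionally need `\p{M}` in the class.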

Gumbo answered Oct 21 '22 06:10


Which characters count as "word characters" depends on the locale. Set a locale whose natural alphabet includes those characters, and use the /u modifier on the regex, like this:

$str = 'perché';
setlocale(LC_ALL, 'it_IT@euro');
echo preg_match('#^\w+$#u', $str);
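
For comparison, here is a small sketch of the same check without the `setlocale()` call. It relies on the assumption that the PHP build in use enables PCRE's Unicode character properties whenever `/u` is set, so that `\w` covers accented letters on its own; it also assumes the source file is saved as UTF-8. If either assumption does not hold for your setup, the results may differ.

```php
<?php
// Assumes this file is UTF-8 encoded and that /u turns on Unicode-aware \w.
$str = 'perché';

// No setlocale() call here: the match depends only on the /u modifier.
var_dump(preg_match('/^\w+$/u', $str));

// \w also accepts digits and the underscore, but not punctuation.
var_dump(preg_match('/^\w+$/u', 'perch_9'));
var_dump(preg_match('/^\w+$/u', 'perché!'));
```

Note that `\w` still lets the underscore through, which may or may not be wanted for user-supplied names.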
Matteo Riva answered Oct 21 '22 06:10