I need help with regular expressions. My string contains Unicode characters, and the code below doesn't work.
The first four characters must be digits, followed by a comma and then any alphabetic characters or whitespace. I have read that adding /u to the end of the regular expression should help, but it didn't work for me.
My code works with non-Unicode characters:
$post = '9999,škofja loka';
echo preg_match('/^[0-9]{4},[\s]*[a-zA-Z]+/', $post);
Thanks for your answers!
Updated answer:
This is now tested and working:
$post = '9999, škofja loka';
echo preg_match('/^\\d{4},[\\s\\p{L}]+$/u', $post);
\\w is not suitable here: besides letters it also matches digits and the underscore ([0-9_]), and depending on the PCRE setup it may not cover all Unicode letters.
The u modifier is also important: it switches the pattern and the subject string to Unicode (UTF-8) mode.
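As a rough illustration of what the u modifier changes, the sketch below counts how many times a single dot matches against "š". The variable names are made up for this example, and the character is written as raw UTF-8 bytes so the result does not depend on how the file is saved.

```php
<?php
// "š" (U+0161) spelled out as its two UTF-8 bytes.
$s = "\xC5\xA1";

$withoutU = preg_match_all('/./', $s);   // byte mode: one match per byte
$withU    = preg_match_all('/./u', $s);  // UTF-8 mode: one match per character

var_dump($withoutU); // int(2)
var_dump($withU);    // int(1)
```

Without u the regex engine sees two unrelated bytes instead of one letter, which is why letter classes cannot work reliably on such input.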
If letters and whitespace can both appear after the comma, put them into the same character class. Your regex allows zero or more whitespace characters directly after the comma and then only letters, so a space between two words cannot match.
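The difference can be seen with a plain ASCII string, so that Unicode is not a factor. This is only a sketch: both patterns are anchored with $ here (an addition for the demo) so that a partial match does not hide the difference, and the variable names are invented.

```php
<?php
$post = '9999, skofja loka'; // ASCII only, so the u modifier is not the issue here

// Whitespace first, then letters only: the space between the two words breaks the match.
$separate = preg_match('/^\\d{4},\\s*[a-zA-Z]+$/', $post);

// Whitespace and letters in one character class: they may appear in any order.
$combined = preg_match('/^\\d{4},[\\sa-zA-Z]+$/', $post);

var_dump($separate); // int(0)
var_dump($combined); // int(1)
```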
See http://www.regular-expressions.info/php.html for PHP regex details.
\\p{L} matches any Unicode letter.
The end-of-string anchor $ is also important, because it ensures that the complete string is verified. Without it, the pattern succeeds as soon as a single character after the comma matches (for example just the first whitespace) and ignores whatever follows.
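Putting the pieces together, a small check could look like the sketch below. The function name isValidPost is hypothetical, the pattern is the one from the answer, and "š" is written as \u{161}, which assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: true only if the whole string is
// four digits, a comma, then letters and/or whitespace.
function isValidPost(string $post): bool
{
    return preg_match('/^\\d{4},[\\s\\p{L}]+$/u', $post) === 1;
}

$ok        = isValidPost("9999, \u{161}kofja loka"); // "9999, škofja loka"
$trailing  = isValidPost('9999, loka 123');          // digits after the letters
$tooShort  = isValidPost('999, loka');               // only three leading digits

// The same pattern without $ accepts the string with trailing digits,
// because it stops checking after "9999, loka ".
$unanchored = preg_match('/^\\d{4},[\\s\\p{L}]+/u', '9999, loka 123');

var_dump($ok);         // bool(true)
var_dump($trailing);   // bool(false)
var_dump($tooShort);   // bool(false)
var_dump($unanchored); // int(1)
```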