
Regular expression for validating Cyrillic

Tags:

php

I have a PHP function to validate "City":

function validate_city($field) {
    if ($field == "") return "Enter city.<br />";
    else if (preg_match("/[^а-Яa-zA-z-]/", $field))
        return "City hame only from letters and -.<br />";
    return "";
}

Every time I enter a Cyrillic city name (for example, "Минск"), it returns: City hame only from letters and -. The variable $_POST['city'] looks like: Ð�инÑ�к

In JS this code works correctly, so I think the problem is with the encoding.

Nick_NY asked Apr 01 '11 07:04


3 Answers

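You can use the following pattern to validate input that contains non-Latin characters. It accepts Latin and Cyrillic letters, digits, whitespace and hyphens; the u modifier makes PCRE treat the pattern and the subject as UTF-8: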

preg_match ('/^[a-zA-Z\p{Cyrillic}\d\s\-]+$/u', $str);

See this post for the full explanation
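
As a minimal sketch of how this pattern could be wired into the validate_city() function from the question: the reworded error message and the strict `!== 1` comparison are my own assumptions, not part of the original answer. The comparison also rejects input that is not valid UTF-8, because preg_match() returns false in that case when the u modifier is set.

```php
<?php
// Sketch only: the question's validate_city() rebuilt around the pattern above.
// The second error message is reworded here and is an assumption.
function validate_city(string $field): string
{
    if ($field === '') {
        return 'Enter city.<br />';
    }
    // With /u, PCRE reads pattern and subject as UTF-8, so \p{Cyrillic}
    // matches whole characters instead of individual bytes.
    // preg_match() yields 1 on a match, 0 on no match, false on error.
    if (preg_match('/^[a-zA-Z\p{Cyrillic}\d\s\-]+$/u', $field) !== 1) {
        return 'City may contain only letters, digits, spaces and -.<br />';
    }
    return '';
}
```

Note that this pattern also lets digits and whitespace through, which is wider than the "letters and -" rule in the question; drop `\d` and `\s` from the class if that matters.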

Martin Taleski answered Sep 30 '22 18:09


A better solution to match Cyrillic and Common characters would be:

preg_match ('/^[\p{Cyrillic}\p{Common}]+$/u', $str);
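
One thing worth checking before adopting this: the Common script covers digits, spaces and most punctuation, while Latin letters belong to the Latin script and are therefore rejected. The sketch below tries a few sample strings against the pattern; the helper name is_cyrillic_or_common() is hypothetical and only there to make the checks readable, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function is_cyrillic_or_common(string $str): bool
{
    return preg_match('/^[\p{Cyrillic}\p{Common}]+$/u', $str) === 1;
}

// Sample inputs chosen for illustration.
$samples = ['Минск', 'Минск-2', 'Минск!?', 'Minsk'];
foreach ($samples as $sample) {
    // Digits, '-', '!' and '?' are in the Common script, so the first three
    // match; 'Minsk' consists of Latin-script letters and does not.
    printf("%s => %s\n", $sample, is_cyrillic_or_common($sample) ? 'match' : 'no match');
}
```

So this pattern is more permissive about punctuation than the question's "letters and -" rule, and stricter about Latin names.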

Wiliam answered Sep 30 '22 16:09


This looks like UTF-8. If it is, this tip from cebelab on php.net might be helpful:

I noticed that in order to deal with UTF-8 texts, without having to recompile php with the PCRE UTF-8 flag enabled, you can just add the following sequence at the start of your pattern: (*UTF8)

for instance : '#(*UTF8)[[:alnum:]]#' will return TRUE for 'é' where '#[[:alnum:]]#' will return FALSE

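Use the built-in POSIX character class [:alpha:] for this. You will need to invert your match, because the pattern now describes valid input rather than invalid characters. Depending on the PCRE build, POSIX classes may match only ASCII unless Unicode properties are enabled as well, so (*UCP) is added next to (*UTF8), and the hyphen is added to the class to keep the "letters and -" rule:

function validate_city($field) {
    if ($field == "") return "Enter city.<br />";
    // (*UTF8) reads the subject as UTF-8; (*UCP) lets [:alpha:] match any Unicode letter
    else if (preg_match("/(*UTF8)(*UCP)^[[:alpha:]-]+$/", $field))
        return "";
    return "City hame only from letters and -.<br />";
}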

Edit: ah, forgot UTF-8 in the regex ;)
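
To see why UTF-8 mode matters here at all, the following sketch counts what `.` matches in "Минск" with and without the u modifier. Without it PCRE works on bytes, and each Cyrillic letter is two bytes in UTF-8, which is why a byte-oriented character class cannot recognise the letters as intended. The helper name count_dot_matches() is hypothetical, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: how many times does '.' match in the subject?
function count_dot_matches(string $subject, bool $utf8Mode): int
{
    // Without /u each '.' consumes one byte; with /u it consumes one code point.
    $pattern = $utf8Mode ? '/./u' : '/./';
    return (int) preg_match_all($pattern, $subject);
}

$city = 'Минск';

printf("bytes (strlen):        %d\n", strlen($city));
printf("'.' matches without u: %d\n", count_dot_matches($city, false));
printf("'.' matches with u:    %d\n", count_dot_matches($city, true));
```

The first two numbers should agree with each other (byte count) and be twice the third (character count) for this string.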

Stephan B answered Sep 30 '22 17:09