Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

regex in Vietnamese characters

I have one string and want remove any character not in any case below:

  • not in this list : ÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚĂĐĨŨƠàáâãèéêìíòóôõùúăđĩũơƯĂẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼỀỀỂ ưăạảấầẩẫậắằẳẵặẹẻẽềềểỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪễệỉịọỏốồổỗộớờởỡợụủứừỬỮỰỲỴÝỶỸửữựỳỵỷỹ

  • not in [a-z 0-9 A-Z]

  • not is : _ and white space.

can anyone help me with this regex in php?

like image 856
lamtheo.com Avatar asked Sep 29 '10 08:09

lamtheo.com


1 Answers

Try this regular expression:

/[^a-z0-9A-Z_ÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚĂĐĨŨƠàáâãèéêìíòóôõùúăđĩũơƯĂẠẢẤẦẨẪẬẮẰẲẴẶẸẺẼỀỀỂưăạảấầẩẫậắằẳẵặẹẻẽềềểỄỆỈỊỌỎỐỒỔỖỘỚỜỞỠỢỤỦỨỪễếệỉịọỏốồổỗộớờởỡợụủứừỬỮỰỲỴÝỶỸửữựỳỵỷỹ]/u

The u modifier makes PHP to interpret the pattern string as UTF-8.

If that doesn’t work, try using Unicode character properties like \p{L} for letters or the escape sequence \x{1234} for describing single Unicode characters or custom character ranges:

/[^a-z0-9A-Z_\x{00C0}-\x{00FF}\x{1EA0}-\x{1EFF}]/u
like image 100
Gumbo Avatar answered Sep 30 '22 17:09

Gumbo