
Regular Expression for Japanese characters

I am working on internationalization in a Struts application and want to write JavaScript validation for both Japanese and English users. I know the regular expression for English input, but not for Japanese. Is it possible to write a single regular expression, based on Unicode, that validates input for both?

Please help me.

Nilesh Shukla asked Jul 22 '11 08:07



1 Answer

Here is a regular expression that matches kanji, hiragana, katakana (including the long-vowel mark ー), half-width (hankaku) and full-width (zenkaku) alphanumerics, and the marks 々, 〆, and 〤:

/[一-龠]+|[ぁ-ゔ]+|[ァ-ヴー]+|[a-zA-Z0-9]+|[ａ-ｚＡ-Ｚ０-９]+|[々〆〤]+/u

You can edit it to fit your needs, but note the "u" flag at the end, which makes the engine treat the pattern as Unicode.
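As written, the pattern has no anchors, so it matches any run of allowed characters inside a longer string. For form validation you would usually want the whole value to consist of allowed characters. The following is a minimal sketch of that idea in JavaScript, assuming an ES2015+ engine that supports the "u" flag. The names ALLOWED_TEXT and isValidInput are hypothetical, and merging the alternatives into one anchored character class is an adaptation, not part of the original answer.

```javascript
// Hypothetical constant: the character ranges from the answer above,
// merged into one class and anchored so the whole string must match.
const ALLOWED_TEXT = /^[一-龠ぁ-ゔァ-ヴーa-zA-Z0-9ａ-ｚＡ-Ｚ０-９々〆〤]+$/u;

// Hypothetical helper: returns true only when every character in `value`
// is kanji, hiragana, katakana, a half-width or full-width alphanumeric,
// or one of the marks 々 〆 〤. Empty strings are rejected because of `+`.
function isValidInput(value) {
  return ALLOWED_TEXT.test(value);
}

console.log(isValidInput("Hello123"));    // English alphanumerics
console.log(isValidInput("こんにちは"));    // hiragana
console.log(isValidInput("コーヒー"));      // katakana with the long-vowel mark
console.log(isValidInput("hello world")); // contains a space, which is not in the class
```

Spaces and punctuation are not in the class, so add them explicitly if your fields should accept them.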

I hope this helps!

shawndreck answered Sep 19 '22 07:09