I need client-side validation to treat the following kind of Chinese input as invalid:
Input is invalid when it is any mix of English letters, Chinese characters, and spaces with a total length >= 10.
For example, "你的a你的a你的a你" or "你的 你的 你的 你" (length 10) is invalid, but "你的a你的a你的a" (length 9) is OK.
I am using JavaScript for the client-side validation and Java on the server side, so ideally the same regular expression would work in both.
Can anyone give some hints on how to write this rule as a regular expression?
According to the question "What's the complete range for Chinese characters in Unicode?", the CJK Unicode ranges are:
Block                                    Range        Comment
---------------------------------------  -----------  ----------------------------------------------------
CJK Unified Ideographs                   4E00-9FFF    Common
CJK Unified Ideographs Extension A       3400-4DBF    Rare
CJK Unified Ideographs Extension B       20000-2A6DF  Rare, historic
CJK Unified Ideographs Extension C       2A700-2B73F  Rare, historic
CJK Unified Ideographs Extension D       2B740-2B81F  Uncommon, some in current use
CJK Unified Ideographs Extension E       2B820-2CEAF  Rare, historic
CJK Compatibility Ideographs             F900-FAFF    Duplicates, unifiable variants, corporate characters
CJK Compatibility Ideographs Supplement  2F800-2FA1F  Unifiable variants
CJK Symbols and Punctuation              3000-303F
You probably want to allow code points from the Unicode blocks CJK Unified Ideographs and CJK Unified Ideographs Extension A.
This regex matches a string of 0 to 9 characters, each of which is a space, an ideographic space (U+3000), a letter A-Z or a-z, or a code point in those 2 CJK blocks.
/^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/
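As a quick check, the pattern above can be run against the three sample strings from the question. This is only a sketch; the names `allowed`, `samples`, and `results` are made up for illustration.

```javascript
// Pattern from the answer: at most 9 characters from the allowed set.
const allowed = /^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/;

// The three sample inputs from the question.
const samples = ["你的a你的a你的a你", "你的 你的 你的 你", "你的a你的a你的a"];

// Record each string's length and whether the pattern accepts it.
const results = samples.map((text) => ({
  text: text,
  length: text.length,
  valid: allowed.test(text),
}));

console.log(results);
```

The first two strings have length 10 and are rejected; the third has length 9 and is accepted.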
The ideographs are listed in:
However, you may as well add more blocks.
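One possible way to add more blocks is sketched below. It assumes a JavaScript engine that supports the `u` (Unicode) regex flag, which is needed to write code points above U+FFFF as `\u{...}`. The names `extraRanges`, `extendedAllowed`, and `isValidExtended` are hypothetical, and which blocks to include is a choice you should make for your own data.

```javascript
// Ranges taken from the table above, written as character-class fragments.
const extraRanges = [
  "\\u3400-\\u4DBF",       // Extension A
  "\\u4E00-\\u9FFF",       // CJK Unified Ideographs
  "\\uF900-\\uFAFF",       // CJK Compatibility Ideographs
  "\\u{20000}-\\u{2A6DF}", // Extension B
  "\\u{2A700}-\\u{2B73F}", // Extension C
  "\\u{2B740}-\\u{2B81F}", // Extension D
  "\\u{2B820}-\\u{2CEAF}", // Extension E
  "\\u{2F800}-\\u{2FA1F}", // CJK Compatibility Ideographs Supplement
];

// With the "u" flag the pattern is matched per code point, so an ideograph
// outside the BMP counts as one character toward the {0,9} limit.
const extendedAllowed = new RegExp(
  "^[ A-Za-z\\u3000" + extraRanges.join("") + "]{0,9}$",
  "u"
);

function isValidExtended(text) {
  return extendedAllowed.test(text);
}

// U+20000 is the first code point of Extension B.
const extB = String.fromCodePoint(0x20000);
console.log(isValidExtended(extB.repeat(9)));  // nine ideographs
console.log(isValidExtended(extB.repeat(10))); // ten ideographs
```

Note that `text.length` in JavaScript counts UTF-16 code units, so a string of nine Extension B ideographs reports a length of 18 even though this pattern treats it as nine characters.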
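// Returns true when the text is 0 to 9 allowed characters long.
// Note: despite its name, this accepts at most 9 characters, so length 10 is rejected.
function has10OrLessCJK(text) {
  return /^[ A-Za-z\u3000\u3400-\u4DBF\u4E00-\u9FFF]{0,9}$/.test(text);
}

// Shows "Valid" or "Invalid" in the element with id "valid".
function checkValidation(value) {
  var valid = document.getElementById("valid");
  if (has10OrLessCJK(value)) {
    valid.innerText = "Valid";
  } else {
    valid.innerText = "Invalid";
  }
}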
<input type="text"
       style="width:100%"
       oninput="checkValidation(this.value)"
       value="你的a你的a你的a">
<div id="valid">
  Valid
</div>