So I have a UTF-8 encoded string which can contain full-width kanji, full-width kana, half-width kana, romaji, numbers or kawaii Japanese symbols like ★ or ♥.
If I want the length I use mb_strlen(), which counts each of these as 1 in length. That is fine for most purposes.
But I've been asked (by a Japanese client) to count half-width kana as only 0.5 (for the purpose of the maxlength of a text field), because apparently that's how Japanese websites do it. I do this using mb_strwidth(), which counts full-width as 2 and half-width as 1; then I just divide by 2.
However, this method also counts romaji characters as 1, so something like Chocｱｲｽ would count as 7. Then I'd divide by 2 to account for kanji and get 3.5, but I actually want 5.5 (4 for the romaji + 1.5 for the 3 half-width kana).
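To make the arithmetic above concrete, here is a minimal sketch of the three measurements for the example string. The half-width katakana are written as `\u{...}` escapes (PHP 7+) purely so they survive copy/paste, and the variable names are only for illustration.

```php
<?php
// Half-width katakana ｱｲｽ written as code point escapes (requires PHP 7+).
$halfKana = "\u{FF71}\u{FF72}\u{FF7D}";
$raw = 'Choc' . $halfKana;

$chars  = mb_strlen($raw, 'UTF-8');   // 7: one per character
$width  = mb_strwidth($raw, 'UTF-8'); // 7: romaji and half-width kana are both width 1
$halved = $width / 2;                 // 3.5, rather than the 5.5 that is wanted

printf("%d %d %.1f\n", $chars, $width, $halved);
```

Both functions agree on 7 here, which is why dividing the width by 2 undercounts the romaji.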
// EDIT:
Some more info: any character (even non-kana) which has both a full-width and a half-width form should be 1 for the full-width and 0.5 for the half-width. For example, characters like ￥、３＠（ should all be 1, but characters like ¥,3@( should all be 0.5.
// EXTRA EDIT: symbols like ☆ and ♥ should be 1, but the mb_strwidth/2 method returns them as 0.5.
Is there a standard way that Japanese systems count string length? Or does everyone just loop through their strings and count the characters which don't match the standard width rules?
One way is to convert the half-width katakana to full-width and subtract the difference in width from the original length:
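// The last three characters are half-width katakana.
$raw = 'Chocｱｲｽ';
// 'K' converts half-width katakana to full-width katakana.
$full = mb_convert_kana($raw, 'K');
// Each converted character gains 1 in width, so subtract half of the total gain.
$len = mb_strlen($raw) - (mb_strwidth($full) - mb_strwidth($raw)) / 2;
assert($len === 5.5);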
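The same idea can be wrapped in a function so it is reusable for a maxlength check. This is only a sketch: `kana_adjusted_length` is a hypothetical name, and it assumes UTF-8 input and that only half-width katakana should count as 0.5.

```php
<?php
/**
 * Hypothetical helper: counts every character as 1, then subtracts 0.5
 * for each half-width katakana character. Assumes UTF-8 input.
 */
function kana_adjusted_length(string $s): float
{
    // 'K' widens half-width katakana; other characters are left alone.
    $full = mb_convert_kana($s, 'K', 'UTF-8');
    // Each widened character goes from width 1 to width 2.
    $gain = mb_strwidth($full, 'UTF-8') - mb_strwidth($s, 'UTF-8');
    return mb_strlen($s, 'UTF-8') - $gain / 2.0;
}

// "Choc" followed by half-width ｱｲｽ (escapes require PHP 7+).
var_dump(kana_adjusted_length("Choc\u{FF71}\u{FF72}\u{FF7D}")); // float(5.5)
// Full-width アイス is unaffected and counts as 3.
var_dump(kana_adjusted_length("\u{30A2}\u{30A4}\u{30B9}"));     // float(3)
```

Dividing by `2.0` keeps the return type a float even when the string contains no half-width katakana.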
However, are you sure that you should be considering basic Latin characters as full-width? There do exist full-width varieties of basic Latin characters too. That is, should Ｃｈｏｃ be considered the same as Choc?
Usually, characters like "A" and "ｱ" would have a width of 1, but "Ａ" and "ア" would have a width of 2 (which is what mb_strwidth does). I'd be cautious about having to hack around that.
Given your edit, mb_strwidth (or mb_strwidth/2) does exactly what you want.
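As a quick check of mb_strwidth/2 against the examples from the question's edits, here is a small sketch. The sample strings are written as code point escapes (PHP 7+), and it assumes a PHP build where mb_strwidth treats only East Asian wide and full-width characters as width 2, which matches the behaviour described in the question.

```php
<?php
$samples = [
    'full-width' => "\u{FFE5}\u{3001}\u{FF13}\u{FF20}\u{FF08}", // ￥、３＠（
    'half-width' => "\u{00A5},3@(",                             // ¥,3@(
    'symbols'    => "\u{2606}\u{2665}",                         // ☆♥
];

foreach ($samples as $label => $text) {
    // Width 2 per full-width character, width 1 otherwise, then halved.
    printf("%-10s %.1f\n", $label, mb_strwidth($text, 'UTF-8') / 2);
}
```

Under that assumption the first two rows come out as 5.0 and 2.5, matching the 1 and 0.5 per character asked for, while the symbols row comes out as 1.0 rather than 2, which is the case raised in the extra edit.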