
Count SMS characters in international text

I found the question "Count characters/sms using jQuery", but its solution does not support international characters such as Chinese, Japanese, Thai, etc.

var $remaining = $('#remaining'),
    $messages = $remaining.next();

$('#message').keyup(function(){
    var chars = this.value.length,
        messages = Math.ceil(chars / 160),
        remaining = messages * 160 - (chars % (messages * 160) || messages * 160);

    $remaining.text(remaining + ' characters remaining');
    $messages.text(messages + ' message(s)');
});

Here are some examples of incorrect character counts:

您好,請問你吃飯了嗎? << 11 characters

สวัสดีคุณกินหรือ? << 17 characters

こんにちは、あなたは食べていますか? << 18 characters

안녕하세요, 당신이 먹는 거죠? << 17 characters

हैलो, आप खाते हैं? << 18 characters

Добры дзень, вы ясьце? << 22 characters

How can I make this work with non-ASCII characters?

Ironman asked Jan 20 '23 03:01


1 Answer

You can't really count in "characters" here. According to the SMS article on Wikipedia, one of three different encodings is used for SMS (7-bit GSM, 8-bit data and UTF-16). So first you'll need to know or decide which encoding you'll be using.

If you know you'll always be using UTF-16, then you can count the number of 16-bit code units the string takes up. A standard SMS can hold 70 16-bit code units. But this also limits messages in Latin characters to 70. So if you want to use the full 160 characters (with 7-bit encoding) or 140 characters (with 8-bit encoding) for Latin text, then you'll need to distinguish between the three cases.
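
As a rough illustration of distinguishing the cases, the sketch below checks whether a message fits the GSM 7-bit alphabet and falls back to UTF-16 otherwise (the 8-bit case is left out). The names `gsm7Septets` and `smsInfo` are made up for this example. The per-part limits of 153 septets and 67 code units for concatenated messages are not mentioned above and are an assumption here, as is the simplification that an escape sequence or surrogate pair may be split across parts.

```javascript
// GSM 03.38 basic alphabet: each character costs one septet.
var GSM_BASIC =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";
// Extension table: sent as ESC + character, so each costs two septets.
var GSM_EXTENDED = "^{}\\[~]|€\f";

// Returns the number of septets, or -1 if the text cannot be sent as GSM 7-bit.
function gsm7Septets(text) {
  var septets = 0;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (GSM_BASIC.indexOf(ch) !== -1) {
      septets += 1;
    } else if (GSM_EXTENDED.indexOf(ch) !== -1) {
      septets += 2;
    } else {
      return -1;
    }
  }
  return septets;
}

// Hypothetical helper: picks the encoding and reports units and message count.
function smsInfo(text) {
  var septets = gsm7Septets(text);
  var gsm = septets !== -1;
  var units = gsm ? septets : text.length; // text.length = UTF-16 code units
  var single = gsm ? 160 : 70;             // limit for a single message
  var perPart = gsm ? 153 : 67;            // assumed limit per concatenated part
  var messages = units === 0 ? 0
    : units <= single ? 1
    : Math.ceil(units / perPart);
  return {
    encoding: gsm ? "GSM 7-bit" : "UTF-16",
    units: units,
    messages: messages
  };
}
```

In the `keyup` handler from the question, `smsInfo(this.value)` could then replace the fixed division by 160.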

Example for counting UTF-16 16-bit code units:

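var message = "您好,請問你吃飯了嗎?";

var utf16codeUnits = 0;

// charCodeAt() never returns a value above 0xFFFF, so iterate by code point
// instead: for...of walks the string one code point at a time, and code
// points from 0x10000 upward take two 16-bit code units.
for (var ch of message) {
  utf16codeUnits += ch.codePointAt(0) < 0x10000 ? 1 : 2;
}
// The total is always equal to message.length.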

BTW, this will come up with the same numbers you posted as "incorrect", so you'll need to explain why you consider them incorrect.
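
The numbers match because JavaScript strings are already sequences of UTF-16 code units, so `length` is the code-unit count. The two counts only differ for characters outside the Basic Multilingual Plane. A minimal sketch (the helper names `codeUnits` and `codePoints` are made up for this example, and the emoji is just a convenient non-BMP character):

```javascript
// Number of UTF-16 code units: what counts against the 70-unit SMS limit.
function codeUnits(text) {
  return text.length;
}

// Number of Unicode code points: Array.from splits a string by code point.
function codePoints(text) {
  return Array.from(text).length;
}

var chinese = "您好,請問你吃飯了嗎?";
var belarusian = "Добры дзень, вы ясьце?";
var emoji = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

console.log(codeUnits(chinese), codePoints(chinese));       // 11 11
console.log(codeUnits(belarusian), codePoints(belarusian)); // 22 22
console.log(codeUnits(emoji), codePoints(emoji));           // 2 1
```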


EDIT

Although this answer has already been accepted, I quickly hacked up a function that (as far as I can tell) correctly calculates the GSM 7-bit (where possible) and UTF-16 sizes of an SMS message: http://jsfiddle.net/puKJb/

RoToRa answered Jan 24 '23 02:01