I have a Nokia N900 phone, and when sending an SMS, the widget displays the number of characters left in the message (and the number of actual short messages needed to send the whole message).
I live in France, where I noticed the following odd thing when writing messages with non-ASCII characters:
So I'm wondering how the messages are encoded, because I can't see the above scheme matching the traditional encodings I know (iso-8859-1, UTF-8, UTF-16...).
https://en.wikipedia.org/wiki/SMS#Message_size
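Depending on the encoding, a single SMS can carry 160, 140, or 70 characters. If the message contains any character that the default 7-bit alphabet cannot represent, which is the case for many non-ASCII characters, the entire message has to be encoded in UTF-16, hence the "consumption" you experienced.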
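To make the 160/140/70 figures concrete, here is a rough sketch of the arithmetic, assuming the usual 140-octet payload of a single message. The names `PAYLOAD_OCTETS` and `max_chars` are made up for illustration and do not come from any phone API.

```python
# Assumption: one SMS carries 140 octets (1120 bits) of user data.
PAYLOAD_OCTETS = 140


def max_chars(bits_per_char: int) -> int:
    """Characters that fit in one message at a fixed width per character."""
    return (PAYLOAD_OCTETS * 8) // bits_per_char


for name, bits in [("7-bit alphabet", 7), ("8-bit data", 8), ("16-bit (UCS-2)", 16)]:
    print(f"{name}: {max_chars(bits)} characters")
```

This only covers a single, unsplit message; longer texts are split into several messages, each of which holds slightly fewer characters.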
@Vicky and @timdream are right, except that I believe what the phone sometimes uses is technically UCS-2, not UTF-16. UCS-2 has a fixed size of 16 bits per character, whereas UTF-16 is variable-width and uses two or four bytes per character, depending on the character being encoded. This Wikipedia article explains this in detail. UCS-2 strictly takes the message down to 70 characters at most (140 bytes). Although the Unicode Consortium's description of UCS-2 is a bit confusing, a handful of sites around the web dealing with SMS confirm that Wikipedia is right.
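The difference between the two encodings can be checked with Python's built-in `utf-16-be` codec. This is only an illustration of the byte widths, not of what the N900 does internally, and the helper names `utf16_bytes` and `fits_ucs2` are hypothetical.

```python
def utf16_bytes(text: str) -> int:
    """Size of text in UTF-16; the "-be" variant adds no byte-order mark."""
    return len(text.encode("utf-16-be"))


def fits_ucs2(text: str) -> bool:
    """True if every character is a single 16-bit unit (code point <= U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


e_acute = "\u00e9"      # é, inside the 16-bit range
emoji = "\U0001F600"    # a character outside the 16-bit range

print(utf16_bytes(e_acute), fits_ucs2(e_acute))   # 2 bytes, fits in UCS-2
print(utf16_bytes(emoji), fits_ucs2(emoji))       # 4 bytes, needs a surrogate pair
print(utf16_bytes(e_acute * 70))                  # 70 characters fill 140 bytes
```

For text that stays within the 16-bit range, UTF-16 and UCS-2 produce the same bytes, which is probably why the two names get used interchangeably in this context.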