I've been working with an SMS aggregator's web API to send and receive text messages. Not all characters are valid, and when I attempt to send a message containing, say, a hash mark (#), it fails.
I need to clean the strings before I send them, but I cannot find a definitive list of which characters are allowed. Mr. Google isn't much help; maybe I'm searching for the wrong terms.
I have already scoured the API manual and emailed the company with my question, but I have had no answer.
I expect that different phones can handle different sets of characters, e.g. an iPhone should handle a wide range of characters, but my old Nokia flip phone will probably only handle a couple dozen characters beyond the alphanumerics. I'll need the lowest common denominator.
GSM characters. Standard single SMS messages: for GSM phones with 7-bit character encoding, a standard SMS message can contain a maximum of 160 characters. That is 1120 bits / (7 bits/character) = 160 characters for a single SMS message.
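The arithmetic above can be checked in a couple of lines. This is only a sketch: `PAYLOAD_BITS` and `max_chars` are names made up for illustration, and the 1120-bit payload is taken from the text (it corresponds to 140 octets).

```python
# One SMS carries 1120 bits of user data (140 octets), as stated above.
PAYLOAD_BITS = 140 * 8


def max_chars(bits_per_char: int) -> int:
    """Whole characters that fit in a single, unsegmented SMS payload."""
    return PAYLOAD_BITS // bits_per_char


gsm7_limit = max_chars(7)    # 7-bit GSM alphabet
ucs2_limit = max_chars(16)   # 16-bit encoding
print(gsm7_limit, ucs2_limit)
```

The same division explains why 16-bit messages hold far fewer characters than 7-bit ones.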
The character limit for a single SMS message is technically 160 characters. However, most modern phones and networks support message concatenation: they split large messages into individual SMS messages (called "segments") and then re-create the large message at the receiving end.
SMS messages use either 7-bit or 16-bit encoding. Messages sent with 7-bit encoding use the GSM 03.38 default alphabet (this is not ISO 8859-1 or ISO 8859-15, although some gateways accept input in those encodings and convert it) and are limited to 160 characters per message. Messages sent with 16-bit encoding (UCS-2, not UTF-8) are limited to 70 characters per message. 16-bit encoding allows characters outside the GSM alphabet.
Spaces between words are also counted as characters. When typing your message, be aware that the longer it gets, the more it costs. SMS messages containing more than 160 standard GSM characters are known as long messages.
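To estimate how many segments (and therefore how much cost) a long message incurs, something like the following could be used. It is a sketch under one stated assumption: each segment of a concatenated message carries a 6-octet concatenation header, which leaves room for 153 GSM 7-bit characters or 67 16-bit characters per segment. `LIMITS` and `segment_count` are hypothetical names, and your aggregator may count differently.

```python
import math

# (single-message limit, per-segment limit when concatenated).
# The per-segment figures assume a 6-octet concatenation header per segment.
LIMITS = {
    "gsm7": (160, 153),
    "ucs2": (70, 67),
}


def segment_count(length: int, encoding: str = "gsm7") -> int:
    """Number of SMS segments needed for a message of `length` characters."""
    single, multi = LIMITS[encoding]
    if length <= single:
        return 1
    return math.ceil(length / multi)


print(segment_count(160))          # fits in one message
print(segment_count(161))          # spills into a second segment
print(segment_count(71, "ucs2"))   # 16-bit messages split much sooner
```

Note that `length` here means encoded units (septets for GSM 7-bit, 16-bit code units otherwise), which is not always the same as the number of characters typed.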
This is built entirely off of @vissi's answer, but this is something you should be able to plug in if you want to build a small collection into your application for verification purposes.
// Standard Latin Characters
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
// Numbers
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
// Punctuation
'!', '#', ' ', '"', '%', '&', '\'', '(', ')', '*', ',', '.', '?',
'+', '-', '/', ';', ':', '<', '=', '>', '¡', '¿', '_', '@',
// Currency
'$', '£', '¥', '\u00A4', // [UNTYPED] CURRENCY SIGN
// Accented Characters
'è', 'é', 'ù', 'ì', 'ò', 'Ç', 'Ø', 'ø', 'Æ', 'æ', 'ß', 'É', 'Å',
'å', 'Ä', 'Ö', 'Ñ', 'Ü', '§', 'ä', 'ö', 'ñ', 'ü', 'à',
// Greek Characters
'\u0394', // GREEK CAPITAL LETTER DELTA
'\u03A6', // GREEK CAPITAL LETTER PHI
'\u0393', // GREEK CAPITAL LETTER GAMMA
'\u039B', // GREEK CAPITAL LETTER LAMBDA
'\u03A9', // GREEK CAPITAL LETTER OMEGA
'\u03A0', // GREEK CAPITAL LETTER PI
'\u03A8', // GREEK CAPITAL LETTER PSI
'\u03A3', // GREEK CAPITAL LETTER SIGMA
'\u0398', // GREEK CAPITAL LETTER THETA
'\u039E', // GREEK CAPITAL LETTER XI
// Other Miscellaneous Characters
'\u001B', // ESCAPE
'\n', // NEW LINE or LINE FEED
'\r' // CARRIAGE RETURN
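For the original goal of cleaning strings before sending, a collection like this can back a small sanitizer. The sketch below is in Python rather than the language of the list above, and it assumes the GSM 03.38 basic table (without the escape character) as the allowed set. `GSM7_BASIC` and `sanitize` are made-up names, and an aggregator that rejects `#` is evidently stricter than the standard alphabet, so the set may need trimming to match what it actually accepts.

```python
# GSM 03.38 basic alphabet, minus ESC. An assumption for illustration:
# your aggregator may accept fewer characters than this.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà"
)


def sanitize(text: str, replacement: str = "?") -> str:
    """Replace every character outside the allowed set with `replacement`."""
    return "".join(ch if ch in GSM7_BASIC else replacement for ch in text)


print(sanitize("Total: 5 items ~ ok"))
```

Replacing rather than dropping unsupported characters keeps the message length predictable; pass `replacement=""` to strip them instead.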
This depends on your aggregator. The default SMS character set is limited to Latin letters and some special characters (including the hash mark); anything else is sent as Unicode or via the locking shift table mechanism. But you are using an API to send messages, so all of this is encapsulated. I suggest you keep asking your aggregator for help; they probably block some characters manually.
@ £ $ ¥ è é ù ì ò Ç Ø ø Å å Δ _ Φ Γ Λ Ω Π Ψ Σ Θ Ξ ^ { } \ [ ~ ] | € Æ æ ß É (space) ! " # ¤ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? ¡ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ä Ö Ñ Ü § ¿ a b c d e f g h i j k l m n o p q r s t u v w x y z ä ö ñ ü à
...also special characters like CR, LF, and FF.
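One detail worth knowing about the list above: the characters ^ { } \ [ ~ ] | € and FF belong to the GSM extension table. Each is sent as an escape code followed by the character code, so it occupies two 7-bit slots instead of one. A rough sketch of how that affects message length follows; `GSM7_EXTENSION`, `extension_overhead`, and `septets_used` are hypothetical names, and the count assumes every character in the text is already a valid GSM character.

```python
# Characters from the GSM 03.38 extension table; each costs two septets.
GSM7_EXTENSION = frozenset("^{}\\[~]|\u20ac\f")


def extension_overhead(text: str) -> int:
    """Extra septets needed because of extension-table characters."""
    return sum(1 for ch in text if ch in GSM7_EXTENSION)


def septets_used(text: str) -> int:
    """Total septets, assuming every character is in the GSM alphabet."""
    return len(text) + extension_overhead(text)


message = "price: 5\u20ac [ok]"
print(len(message), septets_used(message))
```

So a message that looks like 160 characters can still spill into a second segment if it contains brackets or a euro sign.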