Does Standard ML support Unicode?
I believe it does not but cannot find any authoritative documentation for SML stating such.
A yes or no is all that is needed, but you must know it for a fact. No guessing or "I believe" answers. An authoritative link would be even better.
Not really. All the standard currently offers is the ability to use \uXXXX escapes in character and string literals, and it does at least allow Unicode as the underlying character encoding for char or the optional WideChar.char. But the standard basis library does not prescribe any support for additional Unicode-aware functionality.
Particular implementations may have additional support, and you may perhaps find some third-party Unicode libraries, but that's about it (unfortunately, I have no pointers at hand).
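To illustrate the \uXXXX escapes mentioned above, here is a minimal sketch that stays within the ASCII range, where every implementation should agree. The value names are made up for illustration. What happens for code points above 127 (or above Char.maxOrd) depends on the implementation, so it is deliberately not shown here.

```sml
(* \u0041 is U+0041, the ASCII letter A, so it fits in any char type. *)
val a : char = #"\u0041"
val same : bool = (a = #"A")             (* true *)

(* The same escape works inside string literals. *)
val greeting : string = "\u0048\u0069"   (* "Hi" *)
```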
It depends a lot on what you mean by "Unicode", which is a collection of many standards for many things. I've not seen any language or system that supports Unicode fully, and I don't even know what that would mean in all details.
You can certainly work with UTF-8 in SML: that encoding was invented to make it easy for ASCII applications to support Unicode. This might result in a better and more efficient representation of Unicode than e.g. the UTF-16 seen in Java, which does "support Unicode" officially, but then there are many practical problems with it (like surrogate pairs).
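As a rough sketch of what "working with UTF-8" in plain SML strings can look like, the following counts Unicode code points using only the Basis Library. The function names isContinuation and utf8Length are hypothetical, and the code assumes the input is already well-formed UTF-8; it does no validation.

```sml
(* A UTF-8 continuation byte has the bit pattern 10xxxxxx. *)
fun isContinuation (c : char) : bool =
  Word8.andb (Word8.fromInt (Char.ord c), 0wxC0) = 0wx80

(* Count code points by counting every byte that is not a continuation byte.
   Assumes well-formed UTF-8 input; nothing is validated. *)
fun utf8Length (s : string) : int =
  CharVector.foldl (fn (c, n) => if isContinuation c then n else n + 1) 0 s

(* "naive" with i-diaeresis (U+00EF), written as the UTF-8 bytes 195 175. *)
val sample = "na\195\175ve"
val bytes = size sample          (* 6 *)
val points = utf8Length sample   (* 5 *)
```

The point is that size still counts bytes, so anything that needs code points (or, harder, user-perceived characters) has to be layered on top by hand or by a library.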
With UTF-8 in SML strings, one question is how to work with string literals. Systems like Poly/ML allow you to redefine the ML toplevel pretty printer for type string, and it is also feasible to wrap up the compiler to process string literals in a Unicode-friendly way. Both of these are done in Isabelle/ML, which is based on Poly/ML. So if you take that big theorem proving environment as your ML development platform, you have some kind of Unicode support built in (via so-called "Isabelle symbols").