Question

Does Standard ML support Unicode?

I believe it does not but cannot find any authoritative documentation for SML stating such.

A yes or no is all that is needed, but you must know it for a fact. No guesses or "I believe" answers. An authoritative link would be better still.


Solution

Not really. For the time being, all the standard provides is \uXXXX escapes in character and string literals, and it at least allows Unicode as the underlying character encoding for char or the optional WideChar.char. Beyond that, the standard Basis Library does not prescribe any Unicode-aware functionality.

Particular implementations may have additional support, and you may find some third-party Unicode libraries, but that's about it (unfortunately, I have no pointers at hand).

OTHER TIPS

It depends a lot on what you mean by "Unicode", which is a collection of many standards for many things. I've not seen any language or system that supports Unicode fully, and I don't even know what that would mean in full detail.

You can certainly work with UTF-8 in SML: that encoding was invented to make it easy for ASCII applications to support Unicode. This might result in a better and more efficient representation of Unicode than, e.g., the UTF-16 seen in Java, which does "support Unicode" officially but has many practical problems (like surrogate pairs).
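To illustrate the point about UTF-8 in ordinary SML strings, here is a minimal sketch using only the standard Basis Library. It assumes an implementation where char is 8 bits wide (Char.maxOrd = 255), so that a string is effectively a byte sequence, and it assumes the input is well-formed UTF-8. The names isContinuation and codePointCount are made up for this example; they are not part of any library.

```sml
(* A UTF-8 continuation byte has the bit pattern 10xxxxxx, i.e. 128..191. *)
fun isContinuation (c : char) : bool =
  let val b = Char.ord c
  in b >= 128 andalso b < 192 end

(* Count code points by counting every byte that is NOT a continuation byte.
   Assumption: s holds well-formed UTF-8; no validation is performed. *)
fun codePointCount (s : string) : int =
  CharVector.foldl (fn (c, n) => if isContinuation c then n else n + 1) 0 s

(* "café", with the é (U+00E9) written as its two UTF-8 bytes
   via decimal escapes, so the source file itself stays pure ASCII. *)
val cafe = "caf\195\169"

val bytes  = size cafe            (* 5: size counts bytes, not characters *)
val points = codePointCount cafe  (* 4: c, a, f, é *)
```

The difference between size and codePointCount is the practical catch: functions such as size, String.sub, and substring operate on bytes, so anything character-oriented has to be layered on top by hand or by a library.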

With UTF-8 in SML strings, one question is how to work with string literals. Systems like Poly/ML allow you to redefine the ML toplevel pretty printer for type string, and it is also feasible to wrap the compiler so that it processes string literals in a Unicode-friendly way. Both are done in Isabelle/ML, which is based on Poly/ML. So if you take that big theorem-proving environment as your ML development platform, you have some kind of Unicode support built in (via so-called "Isabelle symbols").

Licensed under: CC-BY-SA with attribution