Question

I'm new to OCaml and I wonder why this happens (command-line interpreter, OCaml version 4.01.0):

# open Core.Std;;
# String.lowercase "a";;
- : Core.Std.String.t = "a"
# String.lowercase("è");;
- : Core.Std.String.t = "�"
# String.lowercase "ò";;
- : Core.Std.String.t = "�"

Both results look the same! But with plain ASCII characters:

# (=) "a" (String.lowercase "a");;
- : bool = true

and obviously:

# (=) "è" (String.lowercase "è");;
- : bool = false

Can someone explain this behavior?

Thanks

Solution

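This is almost certainly because your terminal is UTF-8 encoded, i.e. the strings you type are UTF-8 encoded. However, the functions from the String module (at least in the official stdlib) only handle Latin-1 (ISO-8859-1) encoded strings, working one byte at a time, so you can't expect them to work on UTF-8 encoded strings. In UTF-8, "è" is two bytes; lowercasing most likely changes the first byte as if it were a Latin-1 uppercase letter, and the result is no longer valid UTF-8, which the terminal displays as "�".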

This is easy to check. Evaluate:

String.length "é" 

If the result is 2 rather than 1, you are inputting UTF-8 encoded strings.
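To see what happens at the byte level, here is a minimal sketch using only the standard library. Note that `latin1_lowercase_char` and `hex_bytes` are hypothetical helpers written for this illustration: the first imitates the Latin-1 lowercasing rule that the old stdlib `Char.lowercase` applied, and it is an assumption that Core's `String.lowercase` behaved the same way in your version. The string is written with byte escapes so that the encoding of the source file or terminal does not matter.

```ocaml
(* "è" encoded as UTF-8 is the two bytes 0xC3 0xA8. *)
let e_grave_utf8 = "\xC3\xA8"

(* Hypothetical helper: lowercase one byte as if it were a Latin-1
   character (A-Z, plus the accented capitals 0xC0-0xD6 and 0xD8-0xDE). *)
let latin1_lowercase_char c =
  match c with
  | 'A' .. 'Z' | '\xC0' .. '\xD6' | '\xD8' .. '\xDE' ->
      Char.chr (Char.code c + 32)
  | _ -> c

(* Hypothetical helper: render a string as space-separated hex bytes. *)
let hex_bytes s =
  List.init (String.length s) (fun i -> Printf.sprintf "%02X" (Char.code s.[i]))
  |> String.concat " "

let () =
  let lowered = String.map latin1_lowercase_char e_grave_utf8 in
  Printf.printf "length: %d\n" (String.length e_grave_utf8);  (* 2 *)
  Printf.printf "before: %s\n" (hex_bytes e_grave_utf8);      (* C3 A8 *)
  Printf.printf "after:  %s\n" (hex_bytes lowered)            (* E3 A8 *)
```

The first byte 0xC3 is "Ã" in Latin-1, so it is lowercased to 0xE3 ("ã"), while 0xA8 is left alone. The sequence E3 A8 is not valid UTF-8, so a UTF-8 terminal shows a replacement character. The same thing happens to "ò" (C3 B2 becomes E3 B2), which would explain why both results looked identical in the toplevel.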

Licensed under: CC-BY-SA with attribution