The following snippet converts ISO-8859-1 encoded text to UTF8. I don't exactly understand what's going on here. Can someone explain why this works?

var utf8Buf bytes.Buffer
for _, b := range iso8859Slice {
  utf8Buf.WriteRune(rune(b))
}
utf8Str := utf8Buf.String()
有帮助吗?

解决方案

The loop takes each byte of the iso8859Str slice assuming it is of type []byte

Because iso-8859-1 is incorperated as the first 256 code points of Unicode, you have no need of actual conversion from iso-8859-1 to Unicode.

However, what you need to do is to UTF-8 encode the Unicode rune. This is done by Buffer.WriteRune()

WriteRune appends the UTF-8 encoding of Unicode code point r to the buffer

其他提示

First: It does not work if iso8859Str is of type string!

But if iso8859Str is of type []byte your range clause iterates over bytes and that is how unicode was designed: Bytes in ISO 8859-1 correspond to the same unicode codepoint.

许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top