ISO-8859-1 to UTF8 conversion

https://stackoverflow.com/questions/18870883

29-06-2022

Question

The following snippet converts ISO-8859-1 encoded text to UTF8. I don't exactly understand what's going on here. Can someone explain why this works?

var utf8Buf bytes.Buffer
for _, b := range iso8859Slice {
  utf8Buf.WriteRune(rune(b))
}
utf8Str := utf8Buf.String()

Solution

The loop takes each byte of iso8859Slice (assuming it is of type []byte) and converts it to a rune with rune(b).

Because ISO-8859-1 is incorporated as the first 256 code points of Unicode, no actual conversion from ISO-8859-1 to Unicode is needed: each byte value is already the correct code point.

What you do need is to UTF-8 encode each Unicode rune. This is done by Buffer.WriteRune(), documented as:

WriteRune appends the UTF-8 encoding of Unicode code point r to the buffer
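
To make the two steps concrete, here is a minimal, self-contained sketch of the same loop wrapped in a function. The helper name latin1ToUTF8 and the sample input are assumptions for illustration, not part of the original question.

```go
package main

import (
	"bytes"
	"fmt"
)

// latin1ToUTF8 is a hypothetical helper name (not from the original post).
// Step 1: rune(b) reinterprets the ISO-8859-1 byte as a Unicode code point,
// which is valid because the first 256 code points match ISO-8859-1.
// Step 2: WriteRune appends the UTF-8 encoding of that code point.
func latin1ToUTF8(in []byte) string {
	var buf bytes.Buffer
	for _, b := range in {
		buf.WriteRune(rune(b))
	}
	return buf.String()
}

func main() {
	// "café" in ISO-8859-1: the é is the single byte 0xE9.
	in := []byte{0x63, 0x61, 0x66, 0xE9}
	out := latin1ToUTF8(in)
	// In UTF-8 the é becomes the two bytes 0xC3 0xA9.
	fmt.Printf("%q % x\n", out, out) // "café" 63 61 66 c3 a9
}
```

Bytes below 0x80 pass through unchanged, while bytes from 0x80 to 0xFF each grow to two bytes in the output.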

Other tips

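First: it does not work if the input is of type string! Ranging over a string decodes it as UTF-8, so ISO-8859-1 bytes at or above 0x80 are typically reported as U+FFFD (or misread as part of a multi-byte sequence) instead of the intended character.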

But if the input is of type []byte, your range clause iterates over the individual bytes, and that matches how Unicode was designed: each byte value in ISO 8859-1 corresponds to the Unicode code point with the same number.
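
The difference between the two input types can be seen in a short sketch. The function names runesFromBytes and runesFromString and the sample data are assumptions for illustration only.

```go
package main

import "fmt"

// runesFromBytes ranges over a []byte, which yields one raw byte per step.
// (Hypothetical helper name, used only for this illustration.)
func runesFromBytes(b []byte) []rune {
	var out []rune
	for _, v := range b {
		out = append(out, rune(v))
	}
	return out
}

// runesFromString ranges over a string, which decodes the bytes as UTF-8.
// Invalid sequences are reported as U+FFFD, one byte at a time.
func runesFromString(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	raw := []byte{0x63, 0x61, 0x66, 0xE9} // "café" in ISO-8859-1

	fmt.Printf("%U\n", runesFromBytes(raw))          // [U+0063 U+0061 U+0066 U+00E9]
	fmt.Printf("%U\n", runesFromString(string(raw))) // [U+0063 U+0061 U+0066 U+FFFD]
}
```

If the data arrives as a string, converting it with []byte(s) before the loop restores the byte-wise iteration.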

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow