Question

Go is designed to handle Unicode/UTF-8 properly.

However, I am having trouble getting UTF-8 characters to print correctly on my terminal's standard output.

Here is the simplest program:

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}

When executed, it prints garbled characters:

$ go run hello.go
Hello, ‰∏ñÁïå

My terminal's locale is set correctly:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

and I am using vim with set encoding=utf-8 and set fileencodings=utf-8 included in my .vimrc file.

This might actually be a vim problem. I used nano to write the same program from scratch, named it hello2.go, and it prints Hello, 世界 correctly. But the original hello.go, created with vim, only gives me the gibberish Hello, ‰∏ñÁïå.

To double-check that my vim-created hello.go is UTF-8 Unicode text, I ran the file command on it:

$ file hello.go
hello.go: C source, UTF-8 Unicode text

So what gives? Why does my vim-created hello.go print gibberish when my nano-created hello2.go (which contains exactly the same lines of code) does not?

$ file hello2.go
hello2.go: C source, UTF-8 Unicode text

In fact, when I open the vim-created hello.go with nano, the source code reads:

package main

import "fmt"

func main() {
        fmt.Println("Hello ‰∏ñÁïå")
}

But if I open the same vim-created hello.go with vim, the source code reads:

package main

import "fmt"

func main() {
    fmt.Println("Hello, 世界")
}

Why is this so?
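
Since file reports both files as UTF-8, one way to tell them apart is to look at the bytes the program actually contains. The following is a minimal sketch, not part of the original question: the helper name describe is hypothetical, and the string is written with \u escapes so that the editor's encoding settings cannot alter it.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports how many runes s decodes to
// and prints its raw bytes in hex, separated by spaces.
func describe(s string) string {
	return fmt.Sprintf("%d runes, bytes: % x", utf8.RuneCountInString(s), s)
}

func main() {
	// "\u4e16\u754c" is 世界 spelled with escapes, so the source file stays
	// pure ASCII regardless of how the editor handles encodings.
	fmt.Println(describe("\u4e16\u754c")) // 2 runes, bytes: e4 b8 96 e7 95 8c
}
```

If the same call on a literal typed in the editor reports more than two runes, the file holds different characters than the ones shown on screen, even though it is still valid UTF-8.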


Solution

These are the offending lines in my .vimrc that were causing the problem:

if has("gui_running")
    set guitablabel=%t%=%m  "Set the label of the tabs
    set nomacatsui anti enc=utf-8 tenc=macroman gfn=Monaco:h11
    " set window size
    set lines=40
    set columns=120
else
    set enc=utf-8 tenc=macroman gfn=Monaco:h11
    set fenc=utf-8
endif

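Specifically, tenc=macroman was the culprit: it makes vim treat the terminal as MacRoman even though the terminal actually speaks UTF-8, so the bytes I typed were misread on input.

I switched it to tenc=utf-8 and all is good.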
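
To illustrate the likely mechanism, here is a small Go sketch that reads the UTF-8 bytes of 世界 one byte at a time as MacRoman, which is presumably what vim did with terminal input under tenc=macroman. The names partialMacRoman and misdecode are hypothetical, and the table deliberately covers only the six bytes that occur in this string, not the full MacRoman code page.

```go
package main

import "fmt"

// partialMacRoman maps only the six non-ASCII bytes found in the UTF-8
// encoding of 世界 (e4 b8 96 e7 95 8c) to their MacRoman characters.
// It is an illustrative subset, not a complete code page.
var partialMacRoman = map[byte]rune{
	0xE4: '\u2030', // ‰
	0xB8: '\u220f', // ∏
	0x96: '\u00f1', // ñ
	0xE7: '\u00c1', // Á
	0x95: '\u00ef', // ï
	0x8C: '\u00e5', // å
}

// misdecode treats every byte of s as a separate MacRoman character.
// ASCII bytes pass through unchanged; bytes missing from the partial
// table become U+FFFD.
func misdecode(s string) string {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		b := s[i]
		switch r, ok := partialMacRoman[b]; {
		case b < 0x80:
			out = append(out, rune(b))
		case ok:
			out = append(out, r)
		default:
			out = append(out, '\ufffd')
		}
	}
	return string(out)
}

func main() {
	fmt.Println(misdecode("Hello, \u4e16\u754c")) // Hello, ‰∏ñÁïå
}
```

The result matches the gibberish in the question, which is consistent with vim having stored the misread characters in the file as perfectly valid UTF-8. That would also explain why file still reported UTF-8 Unicode text.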

Wasted 4 hours of my life on this I-should-have-seen-this-coming problem! Ugh.

OTHER TIPS

On Windows, the issue is most likely related to the font selected in the Command Prompt (console) properties.

Click the menu icon in the top-left corner of the window, select "Properties", then open the "Font" tab. On my Windows 8.1 machine the default was "Raster Fonts", and I was getting question marks (?????) instead of the Unicode characters that my Go program emitted as UTF-8.

The solution is to select one of the TrueType fonts, such as "Consolas" or "Lucida Console".

Licensed under: CC-BY-SA with attribution