How do I convert among printed decimal/octal/hex/UTF-8 representations of a UTF-8 character from the command-line?

StackOverflow https://stackoverflow.com/questions/16597408

  •  29-05-2022

Question

In another question someone suggested echo -e with \0<sequence> for octal, and \x<sequence> for hex. E.g.:

echo -e "\\0302\\0241" --> ¡
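
For the forward direction, printf is a more portable alternative to echo -e, since octal escapes in its format string are specified by POSIX. A minimal sketch (the variable name inverted_bang is illustrative only):

```shell
# Build the two UTF-8 bytes of U+00A1 from octal escapes.
# Octal escapes in the printf format string are POSIX; \x escapes work in
# bash's builtin printf but not in every shell, so they are avoided here.
inverted_bang=$(printf '\302\241')
printf '%s\n' "$inverted_bang"
```

In a UTF-8 terminal this should display the same ¡ character as the echo -e example.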

Is there a simple way to convert in the other direction, from UTF-8 character to printed octal/hex sequence?


Solution

Yep - use hexdump, like this:

$ echo -n i | hexdump

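Which will output something like this (hexdump groups bytes into 16-bit words in host byte order by default, so the single byte 0x69 is shown as 0069; the last line is the total byte count):

0000000 0069
0000001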
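
Because the default hexdump output pads and groups bytes into words (hence 0069 for a single byte), multi-byte characters can appear byte-swapped on little-endian machines. A minimal sketch that prints one byte per field in input order, assuming a POSIX od is available (the function name bytes_hex is hypothetical):

```shell
# Print the bytes of the first argument as space-separated hex, in input order.
bytes_hex() {
    # printf '%s' avoids the trailing newline that echo would add.
    # od -An drops the offset column, -v disables "*" line folding,
    # -tx1 prints one hex byte per field.
    # The substitution is left unquoted on purpose so the shell
    # splits od's output into one word per byte.
    set -- $(printf '%s' "$1" | od -An -v -tx1)
    printf '%s\n' "$*"
}

bytes_hex 'i'
bytes_hex 'ü'
```

With UTF-8 input, each field corresponds to one byte of the encoded character, regardless of the machine's byte order.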

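For something more formatted, you could do this (it relies on hexdump's 16-bit word grouping on a little-endian machine, and only handles the first three bytes):

$ echo ü | hexdump | awk '{print "\\x"toupper(substr($2,3,2)) "\\x"toupper(substr($2,1,2)) "\\x"toupper(substr($3,3,2))}' | head -1

which will print out this (the trailing \x0A is the newline that echo appends):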

\xC3\xBC\x0A
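
The awk pipeline above is tied to hexdump's word layout and to inputs of at most three bytes. As a hedged alternative, od can emit one byte per field in hex, octal, or decimal, which covers all the representations asked about. This is a sketch, not the original answer's method, and dump_bytes is a hypothetical helper name:

```shell
# Print every byte of the text in the chosen od format, each with a prefix.
#   $1 = od type: x1 (hex), o1 (octal) or u1 (decimal)
#   $2 = prefix to print before each byte, e.g. '\x' or '\0'
#   $3 = text to convert
dump_bytes() {
    # tr upper-cases hex digits; octal and decimal digits are unaffected.
    # The unquoted substitution is split by the shell into one word per byte.
    for b in $(printf '%s' "$3" | od -An -v -t"$1" | tr 'a-f' 'A-F'); do
        # %s prints the prefix literally, so '\x' is not treated as an escape.
        printf '%s%s' "$2" "$b"
    done
    printf '\n'
}

dump_bytes x1 '\x' 'ü'   # hex escapes
dump_bytes o1 '\0' 'ü'   # octal escapes, in the form echo -e accepts
dump_bytes u1 ' '  'ü'   # decimal byte values
```

Using printf '%s' rather than echo keeps the trailing newline out of the result, so no \x0A byte appears.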

Code taken from here: How do you echo a 4-digit Unicode character in Bash?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow