Question

This command works as expected

$ awk 'BEGIN {printf "%c", 0x7f}' | od -tx1
0000000 7f
0000001

However, for every value >= 0x80, awk outputs a two-byte UTF-8 sequence instead of a single byte:

$ awk 'BEGIN {printf "%c", 0x80}' | od -tx1
0000000 c2 80
0000002

How can I force single-byte output for values in that range?

Solution

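Override whatever UTF-8 locale you are using, so that awk treats %c as a single byte rather than a Unicode code point. Setting LC_CTYPE=C for the one command is enough, unless LC_ALL is set in your environment; LC_ALL takes precedence over LC_CTYPE, so in that case set LC_ALL=C instead: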

$ LC_CTYPE=C awk 'BEGIN {printf "%c", 0x80}' | hd
00000000  80                                                |.|
00000001
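
To check several values at once, the same idea can be wrapped in a small shell function. This is only a sketch: emit_hex is a hypothetical helper name, the codes are passed in decimal so it does not depend on how a particular awk parses hex constants, and it uses LC_ALL=C rather than LC_CTYPE=C so that an LC_ALL already set in the environment cannot win.

```shell
#!/bin/sh
# emit_hex (hypothetical helper): print, as hex, the byte(s) awk emits
# for the character code given in decimal as $1.
emit_hex() {
    # LC_ALL=C takes precedence over LC_CTYPE and LANG, so %c is
    # treated as a single byte whatever the caller's locale is.
    # "code + 0" forces a numeric value, so %c uses it as a character code.
    LC_ALL=C awk -v code="$1" 'BEGIN { printf "%c", code + 0 }' |
        od -An -tx1 | tr -d ' \n'   # strip od's padding and newline
    echo
}

emit_hex 127   # 0x7f
emit_hex 128   # 0x80
emit_hex 255   # 0xff
```

Each call should print exactly one byte in hex (7f, 80, ff). If any line shows two bytes such as c280, a UTF-8 locale is still in effect for awk.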

OTHER TIPS

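If your awk is GNU awk (gawk), the --characters-as-bytes option (short form -b) has the same effect without touching the locale:

$ awk --characters-as-bytes 'BEGIN {printf "%c", 0x80}' | od -tx1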
0000000 80
0000001
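
If a script has to run on machines where gawk may or may not be installed, the two answers can be combined. The following is a sketch under that assumption; byte_awk is a hypothetical wrapper name, and the fallback relies on the locale override from the accepted answer.

```shell
#!/bin/sh
# byte_awk (hypothetical wrapper): run an awk program with byte-oriented
# character handling, preferring gawk's option when gawk is available.
byte_awk() {
    if command -v gawk >/dev/null 2>&1; then
        # GNU awk: treat all characters as single bytes.
        gawk --characters-as-bytes "$@"
    else
        # Any other awk: fall back to forcing the C locale.
        LC_ALL=C awk "$@"
    fi
}

# Decimal 128 is 0x80; either branch should emit the single byte 80.
byte_awk 'BEGIN { printf "%c", 128 }' | od -An -tx1 | tr -d ' \n'
echo
```

The decimal constant is used so the program text does not depend on hex-literal support in the fallback awk.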
Licensed under: CC-BY-SA with attribution