Question

In whitespace-mode (whitespace.el on EmacsWiki), there's a variable called whitespace-display-mappings. The variable determines how certain special characters should be rendered in different contexts. It can have various default values depending on your environment, but one default value is:

'((space-mark   ?\     [?\u00B7]     [?.]) ; space - centered dot
  (space-mark   ?\xA0  [?\u00A4]     [?_]) ; hard space - currency
  (space-mark   ?\x8A0 [?\x8A4]      [?_]) ; hard space - currency
  (space-mark   ?\x920 [?\x924]      [?_]) ; hard space - currency
  (space-mark   ?\xE20 [?\xE24]      [?_]) ; hard space - currency
  (space-mark   ?\xF20 [?\xF24]      [?_]) ; hard space - currency
  (newline-mark ?\n    [?$ ?\n])    ; eol - dollar sign
  (tab-mark     ?\t    [?\u00BB ?\t] [?\\ ?\t])) ; tab - left quote mark

The mappings for ?\  (a backslash-escaped space), ?\xA0, ?\n and ?\t are straightforward: they map ordinary spaces, no-break spaces, newlines and tabs, respectively.

However, the characters ?\x8A0, ?\x920, ?\xE20 and ?\xF20 are mysterious. They represent Devanagari/<not assigned>, Devanagari/DEVANAGARI LETTER TTHA, Thai/THAI CHARACTER PHO SAMPHAO and Tibetan/TIBETAN DIGIT ZERO, respectively. The characters that they map to are just as mysterious: by default, each one maps to the character 4 code points "later" in the same Unicode block, which makes no sense.

The characters might be sentinels produced by Emacs that represent some other kind of "special" characters, but that makes no sense because it'd be impossible to map e.g. TIBETAN DIGIT ZERO to some other character in that case.

What do these characters and the characters that they map to actually represent?

Solution

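Most likely this code was written for Emacs versions before 23, when the internal encoding of characters in Emacs was not Unicode. Those chars were therefore the NBSP characters of the various charsets (one for latin-1, one for latin-2, ...), and each one maps to the currency sign of the same charset, just as ?\xA0 maps to ?\u00A4.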
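
If that reading is right, the four extra entries do nothing useful on an Emacs with Unicode internals, and the table can be trimmed. A minimal sketch, assuming Emacs 23 or later and that you only care about the plain space, U+00A0, newline and tab (the choice of which entries to keep is mine, not something stated in whitespace.el):

```elisp
;; Sketch only: assumes Emacs 23 or later (Unicode-based internal encoding).
;; The legacy entries (?\x8A0, ?\x920, ?\xE20, ?\xF20) are dropped because,
;; on the reading above, they no longer denote no-break spaces there.
(require 'whitespace)

(setq whitespace-display-mappings
      '((space-mark   ?\     [?\u00B7]     [?.])        ; space - centered dot
        (space-mark   ?\xA0  [?\u00A4]     [?_])        ; hard space - currency
        (newline-mark ?\n    [?$ ?\n])                  ; eol - dollar sign
        (tab-mark     ?\t    [?\u00BB ?\t] [?\\ ?\t]))) ; tab - right-pointing guillemet
```

Buffers that already have whitespace-mode on will probably need the mode toggled off and on again before the new table shows up.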

Licensed under: CC-BY-SA with attribution