Question

ISO 8859-1 contains a few letter-free diacritics: The diaeresis (¨), the acute accent (´), the cedilla (¸) and the macron (¯).¹

Why were they included? As far as I know (please correct me if I am wrong), the ISO 8859 encodings do not support combining diacritical marks like Unicode, so you cannot even use them to create fancy new letters like Ÿ, ś, ŗ and ī; you can just use them stand-alone like this: a¨b. What's the point of that? Surely, the designers of ISO 8859-1 were very smart people and had very good reasons. What were they?


¹ The backtick/grave accent ` and the circumflex ^ should probably be in this list as well, but the reason for including them in the ISO 8859 encodings seems fairly obvious to me: backwards compatibility with 7-bit ASCII.

Solution

Note: when some important missing characters (such as the Euro symbol €) were added to the character set to create ISO 8859-15, some rarely used characters had to go, and these included three of the four letter-free diacritics: the diaeresis (¨), the acute accent (´) and the cedilla (¸). Only the macron (¯) kept its place. So the designers of ISO 8859-1 may have been very smart people and may have had good reasons, but apparently nobody understood them!
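The set of replaced positions can be checked with a few lines of Python. This is only a sketch that relies on the "latin-1" and "iso8859-15" codecs shipped with CPython; the variable names are mine, not part of any standard.

```python
import unicodedata

# Decode every possible byte value under both codecs.
all_bytes = bytes(range(256))
latin1 = all_bytes.decode("latin-1")      # ISO 8859-1
latin9 = all_bytes.decode("iso8859-15")   # ISO 8859-15

# Keep only the byte positions whose meaning changed.
changed = {
    byte: (old, new)
    for byte, (old, new) in enumerate(zip(latin1, latin9))
    if old != new
}

# Print code point names rather than the glyphs, so the output is pure ASCII.
for byte, (old, new) in changed.items():
    print(f"0x{byte:02X}: {unicodedata.name(old)} -> {unicodedata.name(new)}")
```

Eight positions differ. The stand-alone diaeresis, acute accent and cedilla are among them, while the macron at 0xAF is not.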

However, your claim that you can't create combined characters is not exactly true: if you have a terminal and/or printer that supports control characters, you can send Y, BACKSPACE, ¨ to get Ÿ. (That's of course different from how combining characters work in Unicode.)

Unlike what backspace does today, its original meaning was to move the cursor back one space, so that everything printed afterwards landed on top of what was there before. That's how you would get boldface, strikethrough, or underlined text, for example:

  • H E Y BACKSPACE BACKSPACE BACKSPACE H E Y = HEY in boldface
  • H E Y BACKSPACE BACKSPACE BACKSPACE - - - = HEY struck through
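To make the overstrike idea concrete, here is a minimal Python sketch that simulates a print head and then maps the result onto Unicode. The function names `strike` and `to_unicode` and the mapping table are my own illustration, not part of any standard or library, and the sketch assumes the base letter is always sent before the diacritic.

```python
import unicodedata

BACKSPACE = "\b"

# Spacing diacritics from ISO 8859-1 (plus ASCII ` and ^), mapped to the
# Unicode combining marks that do the same job today.
SPACING_TO_COMBINING = {
    "\u00a8": "\u0308",  # diaeresis
    "\u00b4": "\u0301",  # acute accent
    "\u00b8": "\u0327",  # cedilla
    "\u00af": "\u0304",  # macron
    "`": "\u0300",       # grave accent
    "^": "\u0302",       # circumflex
}


def strike(data):
    """Return, for each column, the list of glyphs struck at that position."""
    columns = []
    pos = 0
    for ch in data:
        if ch == BACKSPACE:
            pos = max(pos - 1, 0)   # move back one column, print nothing
            continue
        if pos == len(columns):
            columns.append([])
        columns[pos].append(ch)     # strike on top of whatever is already there
        pos += 1
    return columns


def to_unicode(data):
    """Turn letter + overstruck diacritic into a precomposed Unicode letter."""
    out = []
    for base, *marks in strike(data):
        # Overstrikes that are not diacritics (bold, strikethrough) are ignored.
        combining = "".join(SPACING_TO_COMBINING.get(m, "") for m in marks)
        out.append(unicodedata.normalize("NFC", base + combining))
    return "".join(out)


print("Y\b\u00a8".encode("latin-1"))          # b'Y\x08\xa8'
print(to_unicode("Y\b\u00a8") == "\u0178")    # True
print(strike("HEY\b\b\b---"))
```

On the wire, the Ÿ is just three ISO 8859-1 bytes: the letter, the backspace control character and the stand-alone diaeresis. A real printer simply strikes the same spot twice; the `to_unicode` step only shows which modern code point the result corresponds to.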

OTHER TIPS

ISO based Latin-1 on ECMA-94, which in turn was based on the DEC Multinational Character Set, so that Europeans could use the DEC VT220. The first 128 code points of every 8-bit character set had to be the same as ASCII for backward compatibility. Indeed, back in the bad old days, misconfigured network hardware often treated the high bit as an error-checking (parity) bit and stripped it, turning extended characters into 7-bit ASCII, so character sets had to be able to fall back to ASCII if this happened. This is why Russians adopted KOI8-R, which produced readable fallback transliterations, over the ISO standard for Cyrillic (ISO 8859-5).
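The KOI8-R fallback is easy to reproduce. The following Python sketch clears the high bit of every byte, roughly what a 7-bit channel would do; the helper names `strip_high_bit` and `fallback` are hypothetical, and the codec names are the ones CPython ships.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only channel would."""
    return bytes(b & 0x7F for b in data)


def fallback(text: str, encoding: str) -> str:
    """Encode text, lose the high bit, and read the result as ASCII."""
    return strip_high_bit(text.encode(encoding)).decode("ascii")


word = "Привет"  # Russian for "hello"

print(fallback(word, "koi8_r"))     # pRIWET
print(fallback(word, "iso8859_5"))  # unrelated ASCII characters
```

With KOI8-R the result is a case-swapped but readable Latin transliteration, because the Cyrillic letters sit exactly 128 positions above similar-sounding Latin ones. ISO 8859-5 keeps Cyrillic in alphabetical order instead, so the same accident produces noise.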

ASCII had its stand-alone accents (` and ^) because the keys existed on teletype terminals. The keys existed on teletypes because, as Jörg mentioned, people would write à on an old-fashioned manual typewriter by typing a, backspace, `. (I typed it on my Linux box just now as: a, right-alt, `.) IBM based the keyboard of its PC on its typewriters, so it had those keys too, and since they exist but have no meaning on their own in any natural language, people started using them for markup. Here, for example, they denote code fragments.

Licensed under: CC-BY-SA with attribution