Question

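Historically, a lot of web pages advertising themselves as being in the ISO-8859-1 (Latin-1) encoding actually contained content in the Windows-1252 encoding (which matches Latin-1 except in the byte range 0x80–0x9F, where Latin-1 has C1 control codes and Windows-1252 assigns printable characters).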

This was enough of a problem that browsers updated their behaviour to treat all Latin-1 text as if it were Windows-1252. This behaviour was then rationalised into the HTML5 [draft] standard.

I'm writing a set of pages on which I want to show the difference between the two encodings. However, this seems to be impossible, because my Latin-1 page is never actually treated as Latin-1. Is there any way, in any browser, that I can force the declared page encoding to be respected so that the demo displays correctly?

Solution

I’m afraid there is no direct way. I thought Opera once had an option for this, and its current version has an option (via opera:config) to force a specific encoding, overriding HTTP headers and all, but even there, iso-8859-1 actually means windows-1252. I checked Opera versions 5 and 9 too, no luck.

But using the current version of Opera (12.02), you can set the encoding via View → Encoding, and in the “Western” set (where iso-8859-1 means windows-1252 as usual), selecting iso-8859-15 causes bytes in the range 130–159 (decimal) to be effectively ignored in display rather than shown as per windows-1252. So this more or less means treating the data as genuinely iso-8859-1, except of course that for the few graphic characters where iso-8859-1 and iso-8859-15 differ, the iso-8859-15 interpretation is used.
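To see which graphic characters are affected by that workaround, the two encodings can be compared byte by byte. This is only a sketch using Python's built-in codecs (not anything Opera exposes), and the variable name `differing` is made up for illustration:

```python
# Collect every byte value that decodes to a different character
# under ISO-8859-1 than under ISO-8859-15 (Python's standard codecs).
differing = [
    b for b in range(256)
    if bytes([b]).decode("iso-8859-1") != bytes([b]).decode("iso-8859-15")
]

# Show each differing byte with both interpretations side by side.
for b in differing:
    raw = bytes([b])
    latin1 = raw.decode("iso-8859-1")
    latin9 = raw.decode("iso-8859-15")
    print(f"0x{b:02X}  iso-8859-1: {latin1!r}  iso-8859-15: {latin9!r}")
```

Any page content that uses one of the listed bytes would render with the iso-8859-15 character in that mode, so a demo page would need to avoid them.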

Technically, the bytes in that range (130–159) represent C1 Controls in iso-8859-1, and in the mode described above, Opera actually treats them that way. They are disallowed in HTML but normally just ignored by browsers.
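Since browsers will not show the distinction, one way to illustrate it for a demo is to decode the bytes outside the browser. The following is a minimal sketch using Python's standard codecs; the helper name `compare` and the list `rows` are hypothetical, and `errors="replace"` is used because Python's windows-1252 codec leaves a few bytes in this range undefined:

```python
import unicodedata

def compare(byte_value):
    """Decode one byte as ISO-8859-1 and as Windows-1252."""
    raw = bytes([byte_value])
    latin1 = raw.decode("iso-8859-1")
    # Undefined Windows-1252 bytes become U+FFFD instead of raising.
    cp1252 = raw.decode("windows-1252", errors="replace")
    return latin1, cp1252

rows = []
for b in range(0x80, 0xA0):  # bytes 128-159, the C1 range in ISO-8859-1
    latin1, cp1252 = compare(b)
    rows.append((b, latin1, cp1252))
    category = unicodedata.category(latin1)  # "Cc" means control character
    print(
        f"{b:3d} 0x{b:02X}  "
        f"iso-8859-1: U+{ord(latin1):04X} ({category})  "
        f"windows-1252: U+{ord(cp1252):04X} {cp1252!r}"
    )
```

The output could then be pasted into a UTF-8 page as a static table, which sidesteps the browser's encoding substitution entirely.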

Licensed under: CC-BY-SA with attribution