Why can some HTML documents display special chars written plainly (e.g. as ä) without the need for codes (e.g. &auml;)

StackOverflow https://stackoverflow.com/questions/10530516

  •  07-06-2021

Question

I'm making a little website with German and French content. Some of the documents display text correctly even though all umlauts are written directly as äöü and not as codes. Other documents need the codes, but I can't find the difference between the documents.

When trying to google for an answer, I can only find tons of code references but no explanation why some docs don't need them.

Solution

Every HTML document (or any text document, for that matter) is stored in a particular character encoding: a mapping between characters and the byte values that represent them. The same byte values can stand for different characters in different encodings.
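To make the idea of a mapping concrete, here is a minimal Python sketch (the variable names are only illustrative). It shows that the same character ä becomes different byte values depending on the encoding chosen:

```python
# The same character maps to different byte values in different encodings.
text = "ä"

utf8_bytes = text.encode("utf-8")         # two bytes: 0xC3 0xA4
latin1_bytes = text.encode("iso-8859-1")  # one byte: 0xE4

print(utf8_bytes.hex())    # c3a4
print(latin1_bytes.hex())  # e4

# Decoding with the matching encoding restores the original character.
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("iso-8859-1") == text
```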

Many pages use UTF-8, a Unicode encoding, and declare it either in the HTTP header or in a meta tag (Content-Type) on the page itself. Such pages can use most characters directly.
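As a rough illustration of how a declared charset is used, the following Python sketch reads the charset parameter from a Content-Type header value and decodes the page bytes with it. The helper name `charset_from_content_type`, the sample header, and the UTF-8 fallback are assumptions made for this example; real browsers follow a more involved detection procedure.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or a default.

    The fallback to UTF-8 is an assumption of this sketch.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset() or default


# Bytes as they might arrive from a server that stores the page in UTF-8.
body = "Grüße".encode("utf-8")

charset = charset_from_content_type("text/html; charset=UTF-8")
print(charset)               # utf-8
print(body.decode(charset))  # Grüße
```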

You should read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Other tips

Two things matter: 1) the charset declaration in the HTML code (the meta tag) and 2) the actual encoding of your documents. For example, if you're working with UTF-8 and there is ONE document (for example a JS file) in ISO 8859-1, then some browsers will show you the site in ISO 8859-1, which destroys your äöüß, ...
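The kind of damage described here can be reproduced in a few lines of Python. This is only a sketch of the mismatch itself (bytes written as UTF-8 but read as ISO 8859-1), not of how any particular browser decides which encoding to use:

```python
original = "äöü"

# The file is saved as UTF-8 ...
stored = original.encode("utf-8")

# ... but the reader assumes ISO 8859-1, so each two-byte sequence
# is shown as two separate characters.
misread = stored.decode("iso-8859-1")
print(misread)  # Ã¤Ã¶Ã¼

# Reversing the wrong step recovers the text, which shows that the bytes
# were never damaged; only their interpretation was wrong.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired)  # äöü
```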

Because, per the HTML specification:

Authoring tools (e.g., text editors) may encode HTML documents in the character encoding of their choice

Some documents use an encoding (such as ISO-8859-1, Windows-1252, or UTF-8) that can represent the character ä directly; others use an encoding (such as US-ASCII) that cannot, and therefore need to use the character entity reference &auml;.
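A small Python sketch of that trade-off: US-ASCII has no byte for ä, so the character has to be written as an entity reference if the document is to stay pure ASCII. The helper `to_ascii_html` is hypothetical and written only for this example; it does not escape markup characters such as & or <.

```python
import html
from html.entities import codepoint2name


def to_ascii_html(text):
    """Replace each non-ASCII character with a named entity where one
    exists, otherwise with a numeric character reference."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)
        elif code in codepoint2name:
            parts.append(f"&{codepoint2name[code]};")
        else:
            parts.append(f"&#{code};")
    return "".join(parts)


# US-ASCII cannot represent ä directly.
try:
    "Käse".encode("us-ascii")
except UnicodeEncodeError:
    print("ä has no US-ASCII representation")

encoded = to_ascii_html("Käse")
print(encoded)  # K&auml;se

# The result is pure ASCII, and an HTML parser turns it back into ä.
ascii_bytes = encoded.encode("us-ascii")
assert html.unescape(encoded) == "Käse"
```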

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow