Why can some HTML documents display special chars written plainly (e.g. as ä) without the need for codes (e.g. &auml;)

StackOverflow https://stackoverflow.com/questions/10530516

  •  07-06-2021

Question

I'm making a little website with German and French content. Some of the documents display text correctly, even though all umlauts are written as äöü and not with codes. Other docs need the codes, but I can't find the difference between the documents.

When trying to google for an answer, I can only find tons of code references but no explanation why some docs don't need them.

Solution

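Any HTML document (or any text document, for that matter) is stored in a particular character encoding: a mapping between characters and the byte values that represent them. Different encodings map the same byte values to different characters, so a browser must know which encoding was used in order to display the text correctly.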
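To make the idea of a mapping concrete, here is a minimal Python sketch (the variable names are only illustrative). It shows that the same character becomes different bytes in different encodings, and that the same byte becomes different characters depending on which encoding is assumed when reading it.

```python
# The same character, stored in two different encodings.
text = "ä"
utf8_bytes = text.encode("utf-8")         # two bytes: c3 a4
latin1_bytes = text.encode("iso-8859-1")  # one byte:  e4
print(utf8_bytes.hex(), latin1_bytes.hex())

# US-ASCII has no value for this character at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("representable in ASCII:", ascii_ok)

# The same byte, read with two different encodings.
raw = b"\xe4"
as_latin1 = raw.decode("iso-8859-1")  # Latin small a with diaeresis
as_cyrillic = raw.decode("cp1251")    # Cyrillic small letter de
print(as_latin1, as_cyrillic)
```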

Many pages use UTF-8, a Unicode encoding, and declare it either in the HTTP header (Content-Type) or in a meta tag on the page itself. Such pages can use most characters directly.
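As a rough illustration of how the declared encoding is used, the following Python sketch reads the charset from a Content-Type header value and decodes the page bytes with it. The helper names `charset_from_content_type` and `decode_body` are hypothetical, and the cp1252 fallback is an assumption made for this example. Real browsers also look at the meta tag and at byte order marks, which this sketch ignores.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def decode_body(body: bytes, header_value: str, fallback: str = "cp1252") -> str:
    """Decode page bytes with the declared charset (the fallback is an assumption)."""
    charset = charset_from_content_type(header_value) or fallback
    return body.decode(charset)


page = "<p>ä</p>".encode("utf-8")

# Declared as UTF-8: the character comes out as written.
print(decode_body(page, "text/html; charset=UTF-8"))

# No declaration: the fallback encoding is used and the character is garbled.
print(decode_body(page, "text/html"))
```

With the declaration present, the ä survives. Without it, the two UTF-8 bytes are read as two separate characters.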

You should read Joel Spolsky's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)".

Other tips

Two things matter: 1) the charset declaration in the HTML code (meta), and 2) the actual encoding of your documents. For example, if you're working with UTF-8 and there is ONE document (for example a JS file) in ISO 8859-1, then some browsers will show you the site in ISO 8859-1, which destroys your äöüß, ...
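The kind of damage described here can be reproduced in a few lines of Python. This is only a sketch of the mismatch itself (bytes written in one encoding, read in another), not of any particular browser's behaviour.

```python
original = "äöü"

# Saved as UTF-8 but read as ISO 8859-1: every umlaut turns into two characters.
garbled = original.encode("utf-8").decode("iso-8859-1")
print(garbled)

# Saved as ISO 8859-1 but read as UTF-8: the bytes are not valid UTF-8,
# so a lenient decoder substitutes the replacement character U+FFFD.
replaced = original.encode("iso-8859-1").decode("utf-8", errors="replace")
print(replaced)

# As long as no bytes were lost, reversing the wrong step restores the text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired)
```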

Because, per the HTML specification:

Authoring tools (e.g., text editors) may encode HTML documents in the character encoding of their choice

Some documents use an encoding (such as ISO-8859-1, Windows-1252, or UTF-8) that can represent the character ä directly; others use an encoding (such as US-ASCII) that cannot, and therefore need to use the character entity reference &auml;.
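For a document that has to stay in US-ASCII, the replacement of characters by references can be automated. The sketch below uses Python's standard `html.entities` table; the function name `to_ascii_html` is hypothetical. It only handles non-ASCII characters and does not escape `&`, `<` or `>`.

```python
import html
from html.entities import codepoint2name


def to_ascii_html(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric character reference."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)                          # plain ASCII stays as it is
        elif code in codepoint2name:
            parts.append(f"&{codepoint2name[code]};")  # named, e.g. &auml;
        else:
            parts.append(f"&#{code};")                 # numeric fallback
    return "".join(parts)


encoded = to_ascii_html("Käse")
print(encoded)

# The result can be stored as pure ASCII, and a parser turns it back into the original.
encoded.encode("ascii")
print(html.unescape(encoded))

# The standard library can also produce numeric references directly.
print("Käse".encode("ascii", errors="xmlcharrefreplace"))
```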

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow