Are character set names case-sensitive in HTTP?

https://stackoverflow.com/questions/19391221

30-06-2022

Question

This is a follow-up to "Are HTTP headers case-sensitive?"

In the HTTP Content-Type header, I have seen character set names expressed both in upper- and lower-case form. For example, for the UTF-8 character set:

Content-Type: text/html; charset=UTF-8

Content-Type: text/html; charset=utf-8

Here are some mixed-case variants (the latter two certainly not being likely in the real world):

Content-Type: text/html; charset=Utf-8

Content-Type: text/html; charset=UtF-8

Content-Type: text/html; charset=uTf-8

Are all forms equally valid? Or are the client and server applications that ignore the case of the character set name merely being lenient? Conversely, are applications that recognize only one representation non-compliant?

Solution

[Here is the result of my research.]

RFC 2616 clause 3.4 says the following:

HTTP character sets are identified by case-insensitive tokens. The complete set of tokens is defined by the IANA Character Set registry [19].
charset = token

The IANA Character Set registry is now maintained on the IANA website. At the very top of that document, under Note, the second paragraph reads:

The character set names may be up to 40 characters taken from the printable characters of US-ASCII. However, no distinction is made between use of upper and lower case letters.

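Conclusion: Both references indicate that character set names are case-insensitive, so all of the forms shown in the question are equally valid, and an application that accepts only one casing does not follow the specification.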
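As a rough illustration of what this means for an application, here is a minimal Python sketch that extracts the charset parameter from a Content-Type value and lowercases it before comparing. The helper name `charset_of` is hypothetical, and the parsing is deliberately simplified (it does not handle a quoted value that contains a semicolon), so treat it as a sketch rather than a complete header parser.

```python
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, lowercased.

    Hypothetical helper for illustration only; returns None when no
    charset parameter is present.
    """
    # Everything after the first ';' is a list of name=value parameters.
    for param in content_type.split(";")[1:]:
        name, _, value = param.partition("=")
        # Parameter names are matched case-insensitively as well.
        if name.strip().lower() == "charset":
            # Drop optional surrounding quotes, then normalize the case.
            return value.strip().strip('"').lower()
    return None


# All of the variants from the question normalize to the same name.
variants = ["UTF-8", "utf-8", "Utf-8", "UtF-8", "uTf-8"]
normalized = {charset_of("text/html; charset=" + v) for v in variants}
print(normalized)
```

Comparing the normalized value rather than the raw string means that every casing from the question is treated as the same character set.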

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow