Byte-Order Mark found in UTF-8 File. W3C Validation Error

https://stackoverflow.com/questions/11103500

15-06-2021

Question

I have created a web site which is valid to strict XHTML and passes the validation, but the W3C validator tells me I have a note (error):

Byte-Order Mark found in UTF-8 File.

The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to cause problems for some text editors and older browsers. You may want to consider avoiding its use until it is better supported.

But I have no BOM in my file. It's straight XHTML done in VS.

Is the server adding it? How can I get rid of the error?

This is important as it screws up semantic extraction. http://www.w3.org/2003/12/semantic-extractor.html

Solution

The W3C Markup Validator does not indicate a BOM in UTF-8 as an error; it would itself be in error if it did, since a BOM is allowed at the start of UTF-8 data. It issues a warning.

The warning is seriously outdated. No problems have been observed in relevant browsers for many years. On the contrary, the BOM should be regarded as useful: if, for example, a file is saved locally (and the HTTP headers are thus lost), the BOM at the start of the UTF-8 data lets browsers infer, with practical certainty, that the document is UTF-8 encoded.

The Semantic data extraction tool is not very up to date, and it suffers from an overly theoretical approach, but it does not seem to have any problem with a BOM at the start of UTF-8 data.

It is possible that the server adds the BOM, or that your authoring tool adds it. Either way, it should be considered as useful, rather than a problem.
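To find out where the BOM comes from, one can look at the first three bytes of the data directly. The following is a minimal sketch in Python, assuming the page is available as raw bytes; the function name `has_utf8_bom` and the sample data are illustrative and not part of the original answer.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical sample data, standing in for open(path, "rb").read()
with_bom = codecs.BOM_UTF8 + b'<?xml version="1.0"?>'
without_bom = b'<?xml version="1.0"?>'

print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
```

Running the same check on the file as saved on disk and on the bytes actually returned by the server should show whether the authoring tool or the server introduces the BOM.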

OTHER TIPS

You do have a BOM (EF BB BF) in your resource. Consider removing it, perhaps with a hex editor. See also: How do I remove the BOM character from my xml file
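If a hex editor is not at hand, the three bytes can also be dropped with a few lines of script. This is only a sketch, assuming the file is small enough to read into memory; `strip_utf8_bom` and `remove_bom_from_file` are hypothetical helper names, not an existing tool.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; unchanged if there is none."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(path: str) -> bool:
    """Rewrite the file at path without its BOM. Return True if one was removed."""
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False  # no BOM found, leave the file untouched
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Working on raw bytes keeps the rest of the file exactly as it was. When the content is decoded anyway, Python's built-in `utf-8-sig` codec skips a leading BOM on reading.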

Licensed under: CC-BY-SA with attribution
