I know that a BOM is used for UTF-8 files, but what about text files where every character is 2 bytes? Should I add a byte order mark to them, too?

Solution

BOMs were invented for UCS-2 and UTF-16, and only later appropriated by Microsoft (and then XML) for UTF-8. Think about the name: 'byte order mark'. A 2-byte code unit can be stored big-endian or little-endian, and the BOM tells the reader which, so 2-byte encodings are exactly where it does its original job. UTF-8 has only one possible byte order, so it doesn't need a BOM to reveal the order. The three-byte sequence for U+FEFF in UTF-8 has, instead, become a Unicode signature for file type sniffing.
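To make the byte order point concrete, here is a minimal Python sketch. It shows how U+FEFF serializes under each encoding and how a reader might sniff the leading bytes. The helper name `sniff_bom` is made up for illustration, and it deliberately ignores UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.

```python
import codecs
from typing import Optional

# U+FEFF serialized in each encoding: the two UTF-16 forms differ only in byte order.
assert "\ufeff".encode("utf-16-le") == codecs.BOM_UTF16_LE == b"\xff\xfe"
assert "\ufeff".encode("utf-16-be") == codecs.BOM_UTF16_BE == b"\xfe\xff"
# UTF-8 has a single byte order, so these three bytes act only as a signature.
assert "\ufeff".encode("utf-8") == codecs.BOM_UTF8 == b"\xef\xbb\xbf"


def sniff_bom(data: bytes) -> Optional[str]:
    """Guess the encoding from a leading BOM (hypothetical helper, UTF-8/UTF-16 only)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None  # no BOM: the encoding has to be known some other way


# A 2-byte-per-character file written with an explicit BOM.
payload = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")
print(sniff_bom(payload))
```

Without the BOM, the bytes `00 68 00 69` and `68 00 69 00` are both plausible UTF-16, which is why the mark matters for 2-byte text.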

However, early versions of the XML support in Java did not respond well to a UTF-8 BOM, in spite of the inclusion of the UTF-8 BOM in the XML standard. Further, a file with a BOM can't simply be concatenated onto another file, because U+FEFF isn't a BOM in the middle of a file; there it is ZWNBSP (zero-width no-break space).
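The concatenation problem can be seen in a few lines of Python. This is only a sketch: `concat_utf8` is a hypothetical helper, and it assumes every part is UTF-8 and that only a leading BOM needs to be dropped from the second and later parts.

```python
import codecs


def concat_utf8(parts):
    """Join UTF-8 byte strings, dropping the BOM from every part after the first."""
    out = bytearray()
    for i, part in enumerate(parts):
        if i > 0 and part.startswith(codecs.BOM_UTF8):
            part = part[len(codecs.BOM_UTF8):]
        out += part
    return bytes(out)


# "utf-8-sig" prepends the UTF-8 BOM when encoding.
a = "one\n".encode("utf-8-sig")
b = "two\n".encode("utf-8-sig")

# Naive concatenation leaves the second BOM in the middle, where it decodes as U+FEFF.
naive = (a + b).decode("utf-8-sig")
assert naive == "one\n\ufefftwo\n"

# Stripping the inner BOM first gives clean text.
clean = concat_utf8([a, b]).decode("utf-8-sig")
assert clean == "one\ntwo\n"
```

The stray U+FEFF is invisible in most editors, which is what makes this kind of bug hard to spot.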

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow