Question

I am generating a CSV file through PHP to be downloaded through the browser. Do I need to insert the byte order mark (BOM) bytes at the beginning, considering that the target system could be Mac, Unix, Windows, etc.?


Solution

No, you are not required to.

The byte order mark (BOM) can appear at the start of text in the Unicode encodings UTF-8, UTF-16 and UTF-32, where it acts as a signature identifying the encoding.

In UTF-16 and UTF-32, it also indicates the byte order (big-endian or little-endian) of the text that follows.

It is optional but valid in all three. UTF-8 has no byte-order ambiguity, so there the BOM serves only as a signature, and it can cause compatibility issues with programs that do not expect it. To quote a well-phrased Wikipedia entry:

If compatibility with existing programs is not important, the BOM could be used to identify if a file is in UTF-8 versus a legacy encoding, but this is still problematic, due to many instances where the BOM is added or removed without actually changing the encoding, or various encodings are concatenated together. Checking if the text is valid UTF-8 is more reliable than using BOM.

I would advise against using the BOM in UTF-8 for those reasons.
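As a rough illustration, here is a minimal PHP sketch that builds the CSV in memory and prepends the UTF-8 BOM (the bytes EF BB BF) only when explicitly requested. The function name `build_csv` and the `$withBom` flag are hypothetical and not part of the original question; the sketch assumes PHP 7.4 or later, since it passes an empty escape character to `fputcsv`.

```php
<?php
// Hypothetical helper: returns CSV text for the given rows.
// The BOM is off by default, in line with the advice above.
function build_csv(array $rows, bool $withBom = false): string
{
    // Write to a temporary in-memory stream so fputcsv handles quoting.
    $stream = fopen('php://temp', 'r+');

    if ($withBom) {
        // UTF-8 byte order mark: EF BB BF.
        fwrite($stream, "\xEF\xBB\xBF");
    }

    foreach ($rows as $row) {
        // Explicit delimiter, enclosure and (empty) escape character.
        fputcsv($stream, $row, ',', '"', '');
    }

    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);

    return $csv;
}
```

In a download script this could be combined with `header('Content-Type: text/csv; charset=utf-8');` and a `Content-Disposition: attachment` header before `echo build_csv($rows);`. Whether a particular spreadsheet program honours the declared charset without a BOM is something to test against the programs your users actually have.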

Other tips

Concerning the original question, it really depends on how the file is encoded when written. If it will be UTF-8 encoded, I'd add the BOM. If the file contains only ASCII characters, the BOM can be omitted because there will be no multibyte sequences. If, however, the file contains UTF-8 sequences, it is easier to detect the BOM than to walk through the whole file checking for valid sequences. And even if you find a single valid-looking sequence, its bytes might still be individual characters above 0x7F in a single-byte encoding.
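The two detection strategies discussed here, checking for the BOM and checking that the bytes form valid UTF-8, can be sketched in PHP as follows. The names `UTF8_BOM`, `has_utf8_bom` and `is_valid_utf8` are made up for this example. The validity check uses the `preg_match('//u', ...)` idiom, which fails on malformed UTF-8; `mb_check_encoding($data, 'UTF-8')` is an alternative if the mbstring extension is available.

```php
<?php
// UTF-8 byte order mark: EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// True if the data starts with the UTF-8 BOM (a cheap three-byte check).
function has_utf8_bom(string $data): bool
{
    return strncmp($data, UTF8_BOM, 3) === 0;
}

// True if the whole string is well-formed UTF-8 (scans every byte).
// With the /u modifier, preg_match returns false on invalid UTF-8.
function is_valid_utf8(string $data): bool
{
    return preg_match('//u', $data) === 1;
}
```

Note that passing the validity check is evidence rather than proof: the bytes C3 A9 are valid UTF-8 for "é", but they are also two legitimate characters in a single-byte encoding such as ISO-8859-1.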

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow