ASCII / UTF8 set randomly?

https://stackoverflow.com/questions/22481186

16-06-2023

Question

I have tried a program called UTFCast Professional. It checks the file encoding.

When I write code I use Sublime Text.

Random encoding

What I get is that some files are reported as UTF8 and some as ASCII/UTF8. It appears to be set randomly. All of them are reported as "BOM: No".

Why are some files UTF8 and some ASCII/UTF8?
Is it possible that in some cases it does not know whether it's ASCII or UTF8?
Should I be worried about future encoding problems? I have not had any so far.

(I prefer UTF8)

Solution

A plain text file does not record its own encoding anywhere. Any program that claims to tell you what encoding a file is in is, by definition, only giving you its best guess based on the content of the file. Since a file that contains only characters present in ASCII and is saved as UTF-8 is byte-for-byte indistinguishable from a pure ASCII file, either answer is valid. Even Latin-1 and a large number of other ASCII-compatible encodings would be valid answers.
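To make the point concrete, here is a small Python sketch. The sample strings are made up for illustration. It shows that ASCII-only text produces exactly the same bytes under ASCII, UTF-8, and Latin-1, and that the bytes only differ once a non-ASCII character appears.

```python
# ASCII-only text: every one of these encodings yields the same bytes.
text = "When I write code I use Sublime Text."
as_ascii = text.encode("ascii")
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

identical = as_ascii == as_utf8 == as_latin1
print(identical)  # True

# Python's "utf-8" codec does not write a BOM ("utf-8-sig" would).
has_bom = as_utf8.startswith(b"\xef\xbb\xbf")
print(has_bom)  # False

# With a non-ASCII character, the encodings finally diverge.
accented = "café"
accented_utf8 = accented.encode("utf-8")      # "é" becomes two bytes
accented_latin1 = accented.encode("latin-1")  # "é" becomes one byte
print(accented_utf8 != accented_latin1)  # True
```

Nothing in the first file's bytes says "UTF-8" rather than "ASCII", so a detector has nothing to go on except its own conventions.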

So the program reports one or the other because its detection algorithm is triggered by some characteristic of the file content. A likely trigger is whether the file contains any non-ASCII characters, but only the program's author can tell you exactly why. The file is encoded as UTF-8 without BOM. Whatever an application tells you it thinks the encoding is depends entirely on that application.
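As an illustration of how such a guess might work, here is a minimal Python sketch of a content-based detector. This is not UTFCast's actual algorithm, which is not documented here; the function name `guess_encoding` and the labels it returns are hypothetical.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best-guess label for the encoding of raw file bytes.

    Hypothetical heuristic for illustration only; real detectors differ.
    """
    # A UTF-8 byte order mark is the only explicit hint a file can carry.
    if data.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 (BOM)"
    # Only bytes below 0x80: valid as ASCII, UTF-8, Latin-1, and more.
    try:
        data.decode("ascii")
        return "ASCII/UTF-8"
    except UnicodeDecodeError:
        pass
    # Non-ASCII bytes that form valid UTF-8 sequences.
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "unknown 8-bit encoding"


print(guess_encoding(b"plain text"))             # ASCII/UTF-8
print(guess_encoding("café".encode("utf-8")))    # UTF-8
print(guess_encoding("café".encode("latin-1")))  # unknown 8-bit encoding
```

Under a heuristic like this, two files saved with identical editor settings get different labels purely because one of them happens to contain a non-ASCII character.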

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow