UTF-8 without BOM

https://stackoverflow.com/questions/5406172

29-10-2019
|

Question

I have javascript files that I need them to be saved in UTF-8 (without BOM), every time I convert them to the correct format in Notepad++, they are reverted back to UTF-8 with BOM when I open them in Visual Studio. How can I stop VS2010 from doing that?

Another question, is UTF-8 without signature in Visual Studio the same as UTF-8 without BOM?

Solution

BOM or Byte Order Mark is sometimes quite annoying. Visual Studio does not change the file unless you save it (as Hans said).

And here is the solution to your problem: If you want to save a file with other encodings select save as and extend the save button in file dialog and select "Save with encoding". Or if you you want to get rid of this setting permanently just open File menu and select "Advanced save options" and there you should select "UTF-8 without signature" (and that also answered your last question :). Yes "UTF-8 without signature" is same as without BOM.

OTHER TIPS

I've created the Fix File Encoding extension that prevents Visual Studio 2010+ from adding BOM to UTF-8 files.

Now with pictures.

Go to File -> Save As.

enter image description here

Then on Save button click on triangle and click Save with Encoding...

enter image description here

Click ok to overwrite the file then from list of encodings find UTF-8 Without signature -> Click OK.

enter image description here

Hope this saves you some time.

Unfortunately this does not work with csproj files. There is no "Advanced save option" and even though you have set it to "UTF-8 without signature" for a cs file, csproj files still are saved with BOM. If you use VSS it still complains about project files.

UTF-8 - "Save As" (Without Signature) Default - Request to include Default for UTF Without Signature

VS 2017 natively supports EditorConfig so a recommended solution is to add .editorconfig file to your code base and set charset => utf-8. Then once you save a file it will be saved as UTF-8 without BOM.

For vs2010 c++, there will be problems with UTF8 without BOM, when source files contain multi-byte characters(eg. Chinese).

Those characters will not be recognized correctly without BOM, and result in failed compling.

Recently I found this tiny command-line tool which adds or removes the BOM on arbitary UTF-8 encoded files: UTF BOM Utils (new link at github)

Little drawback, you can download only the plain C++ source code. You have to create the makefile (with CMake, for example) and compile it by yourself, binaries are not provided on this page. However, for a software developer this should not be any issue.

Even with Dave81's solution, Visual Studio 2015 Community was still reverting my file to UTF8-BOM every single time I saved that html file.

When I created that html file, I right-clicked on the project and selected "Add" then added an HTML file.

By default, Visual studio will include a < meta charset="utf-8" / > tag in your HTML file.

Simply removing the tag then applying Dave81's solution made the problem go away for real this time.

It seems the Visual Studio parse your html file and when it sees that tag it converts the file to UTF8-BOM without any consideration to the original file format (UTF8 without BOM).

** I would have made a comment directly under Dave81's solution, but I didn't have enough points to do so...

For Visual Studio Code do the following:

From bottom right, select current encoding
From options, Select Save with enconding
From options, select UTF-8

UTF8Encoding utf8EmitBOM = new UTF8Encoding(false);
StreamWriter  sw = new StreamWriter(Path.Combine(sourcefilePath, fileName), false, utf8EmitBOM);

This code will create file in UTF-8 without BOM

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow