Question

I have a UTF-8 encoded XSD file. No text editor I open it in shows any character at the beginning of the file, but when I inspect it in Visual Studio's debugger, I clearly see an empty box in front of the content.

[Screenshot: box in file]

I also get the error:

Data at the root level is invalid. Line 1, position 1.

Anyone know what this is?

Update: Edited post to qualify type of file. It's an XSD file created by Microsoft's XSD creator.

Solution

It turns out that what I was seeing is a byte order mark (BOM), a marker at the very start of a file that tells whatever loads the document how it is encoded. My file is encoded in UTF-8, so the BOM is the three bytes EF BB BF. To remove it, I opened the file in Notepad++ and chose "Encode in UTF-8 without BOM", as shown below:

[Screenshot: saving in Notepad++]
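
If no editor is handy, the same removal can be scripted. This is only a minimal sketch in Python, not part of the original fix: the function names are made up for illustration, and it assumes the file is UTF-8 and small enough to read into memory. It drops the leading EF BB BF bytes if they are present and leaves the rest of the file untouched.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path: str) -> bool:
    """Rewrite the file at path without its BOM; return True if it changed.

    Hypothetical helper: assumes the whole file fits in memory.
    """
    file_path = Path(path)
    original = file_path.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        file_path.write_bytes(cleaned)
        return True
    return False
```

Calling `strip_bom_in_place("schema.xsd")` (a placeholder file name) would rewrite the file only when a BOM was actually found, so it is safe to run more than once.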

To actually see the BOM, I had to open the file in TextPad in binary mode, and then ran a Google search for "EF BB BF".

[Screenshot: binary mode]
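
A hex view is not the only way to confirm the BOM. As a rough sketch (the function names are illustrative, not from any tool mentioned above), a few lines of Python can read the raw bytes and report whether the file starts with EF BB BF:

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the bytes begin with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def hex_prefix(data: bytes, count: int = 3) -> str:
    """Format the first `count` bytes as space-separated uppercase hex."""
    return " ".join(f"{byte:02X}" for byte in data[:count])


def describe_file(path: str) -> str:
    """Summarize the first bytes of a file; opens it in binary mode."""
    with open(path, "rb") as handle:
        head = handle.read(3)
    status = "UTF-8 BOM found" if has_utf8_bom(head) else "no UTF-8 BOM"
    return f"{hex_prefix(head)} ({status})"
```

The file must be opened in binary mode (`"rb"`); a text-mode read may decode or hide the BOM, which is the same reason ordinary editors did not show it.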

It took me about 8 hours to find out this was what was causing it, so I thought I'd share this with everyone.

Update: If I had read Joel Spolsky's blog post: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), then I might not have had this problem.

OTHER TIPS

Here's how to do it with Vim:

# vim file.xml
:set nobomb
:wq
Licensed under: CC-BY-SA with attribution