Question

I have a folder with XML files and I need to merge them into one file. When I try this:

var allFiles = Directory.GetFiles(path, "*.xml");
String result = Path.Combine( path, "merged.xml" );
using( var stream = new FileStream( result, FileMode.Create, FileAccess.Write ) ) {
    foreach( var file in allFiles ) {
        var fileContents = File.ReadAllBytes( file );
        stream.Write( fileContents , 0, fileContents.Length );
    }
    stream.Close();
}

I see that the sequence 0xEF 0xBB 0xBF (the byte order mark) appears in the result file between the contents of any two files, but not at the start or at the end of the file.

If I use StreamWriter instead:

var allFiles = Directory.GetFiles(path, "*.xml");
String result = Path.Combine( path, "merged.xml" );
using( var stream = new FileStream( result, FileMode.Create, FileAccess.Write ) ) {
    using( var writer = new StreamWriter( stream ) ) {
        foreach( var file in allFiles ) {
            var fileText = File.ReadAllText( file );
            writer.Write( fileText );
        }
    }
    stream.Close();
}

then the sequence does not appear.

How can a byte sequence get injected when I'm reading and writing the files as binary?

Solution

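Your original files have BOMs in them: each one starts with the UTF-8 byte order mark 0xEF 0xBB 0xBF.
File.ReadAllBytes() faithfully returns those bytes, just like any other byte, so every file's BOM is copied into the merged file. Nothing is injected. The first file's BOM is most likely at the start of the result as well, where most editors treat it as an encoding signature and hide it.

File.ReadAllText() decodes the bytes as text (UTF-8 here, detected from the BOM) and strips the BOM. new StreamWriter( stream ) writes UTF-8 without a BOM by default, so the sequence never reaches the output.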
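
If you would rather keep the byte-for-byte copy, one option is to skip the BOM yourself before writing. The following is only a minimal sketch, assuming every input is UTF-8; the class name BomAwareMerge and the helpers BomLength and Merge are made up for illustration and are not part of the framework.

```csharp
using System;
using System.IO;

static class BomAwareMerge
{
    // UTF-8 byte order mark: EF BB BF.
    static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // Hypothetical helper: returns 3 if the buffer starts with the
    // UTF-8 BOM, otherwise 0.
    public static int BomLength(byte[] data)
    {
        bool hasBom = data.Length >= Utf8Bom.Length
            && data[0] == Utf8Bom[0]
            && data[1] == Utf8Bom[1]
            && data[2] == Utf8Bom[2];
        return hasBom ? Utf8Bom.Length : 0;
    }

    // Hypothetical helper: concatenates the files as raw bytes,
    // skipping the leading BOM of each file (assumes UTF-8 input).
    public static void Merge(string[] files, string resultPath)
    {
        using (var stream = new FileStream(resultPath, FileMode.Create, FileAccess.Write))
        {
            foreach (var file in files)
            {
                byte[] contents = File.ReadAllBytes(file);
                int offset = BomLength(contents);
                stream.Write(contents, offset, contents.Length - offset);
            }
        }
    }
}
```

With the variables from the question, this would be called as BomAwareMerge.Merge( allFiles, result ). It only recognizes the UTF-8 BOM; files saved as UTF-16 or UTF-32 carry different marks and would need decoding instead of a raw copy.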

Licensed under: CC-BY-SA with attribution