Question

I'm combining multiple text files into a single text file in C# and I'm doing it in chunks using byte arrays... that part is working for me. But now I want to add a new line right before a new text file gets merged into my target text file. I've found on different threads that using the below code should let me add a new line, but for some reason, the final output in my text file is a small block (਍) rather than a new line.

The problematic bit of code:

byte[] newLine = Encoding.Default.GetBytes(Environment.NewLine);
output.Write(newLine, 0, newLine.Length);

The full section of the code that merges text files:

int chunkSize = 3 * 1024; // 3KB

using (FileStream output = File.Create(outputFolder + @"\TargetFile.txt"))
{
    foreach (string text in textFiles)
    {
        using (FileStream input = File.OpenRead(text))
        {
            byte[] buffer = new byte[chunkSize];
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, bytesRead);
            }

            byte[] newLine = Encoding.Default.GetBytes(Environment.NewLine);
            output.Write(newLine, 0, newLine.Length);
        }
    }
}

I can open TargetFile.txt with no problem in Notepad or other text editors, and all my text renders perfectly except for the additional new line I'm trying to add... What am I doing wrong?

NOTE: In my code example I'm using Encoding.Default, but I've also tried using Encoding.ASCII with exactly the same result.

Solution

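The character in question is U+0A0D, which is not in fact a valid (assigned) character. You want U+000D followed by U+000A. The fact that you are getting U+0A0D suggests that the text files use a 16-bit little-endian Unicode encoding (UTF-16LE): the two single bytes 0D 0A that you wrote are being read back as one 16-bit code unit, 0x0A0D. So Encoding.Default, Encoding.ASCII and Encoding.UTF8 are all wrong here. Encoding.Unicode, which encodes the new line as the four bytes 0D 00 0A 00, is more likely to be correct.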
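To make the byte-level reasoning concrete, here is a minimal sketch. The class and method names are made up for illustration, and it uses the literal "\r\n" instead of Environment.NewLine so the result does not depend on the platform. It decodes the two ASCII new-line bytes as UTF-16LE, then shows the bytes that Encoding.Unicode produces for the same new line.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the original question's code.
public static class NewLineBytesDemo
{
    // Encodes "\r\n" as ASCII (bytes 0D 0A) and then decodes those two
    // bytes as a single UTF-16LE code unit, which is what a text editor
    // does when it treats the merged file as UTF-16LE.
    public static string DecodeAsciiNewLineAsUtf16()
    {
        byte[] asciiNewLine = Encoding.ASCII.GetBytes("\r\n");
        return Encoding.Unicode.GetString(asciiNewLine);
    }

    // Encodes "\r\n" as UTF-16LE: each character takes two bytes,
    // low byte first.
    public static byte[] Utf16NewLine()
    {
        return Encoding.Unicode.GetBytes("\r\n");
    }

    public static void Main()
    {
        string garbled = DecodeAsciiNewLineAsUtf16();
        Console.WriteLine("U+{0:X4}", (int)garbled[0]);            // prints U+0A0D
        Console.WriteLine(BitConverter.ToString(Utf16NewLine())); // prints 0D-00-0A-00
    }
}
```

If the input files really are UTF-16LE, replacing Encoding.Default with Encoding.Unicode in the merge loop should write the four-byte sequence instead of the two-byte one.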

Note that there is no one fixed encoding for text files, so the fact that Encoding.Default is wrong here doesn't make it wrong everywhere. Be prepared to deal with files with different encodings.
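One way to be prepared for different encodings is to look at each file's byte order mark (BOM) before choosing how to encode the new line. The following is only a sketch under assumptions: the helper names are hypothetical, the fallback encoding is a guess you would have to choose yourself, and detection works only when the file actually starts with a BOM. A file without one is simply reported as the fallback.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class; not part of the original question's code.
public static class NewLineForEncoding
{
    // Lets StreamReader inspect the BOM at the start of the stream.
    // If there is no BOM, the supplied fallback encoding is returned.
    public static Encoding DetectEncoding(Stream stream, Encoding fallback)
    {
        // Arguments: detect from BOM = true, buffer size = 1024,
        // leaveOpen = true so the caller keeps ownership of the stream.
        using (var reader = new StreamReader(stream, fallback, true, 1024, true))
        {
            reader.Peek(); // forces the reader to look at the first bytes
            return reader.CurrentEncoding;
        }
    }

    // Encodes a CR LF pair in the given encoding (no BOM is emitted).
    public static byte[] NewLineBytes(Encoding encoding)
    {
        return encoding.GetBytes("\r\n");
    }

    public static void Main()
    {
        // UTF-16LE BOM (FF FE) followed by the letter 'A'.
        byte[] sample = { 0xFF, 0xFE, 0x41, 0x00 };
        using (var stream = new MemoryStream(sample))
        {
            Encoding detected = DetectEncoding(stream, Encoding.UTF8);
            Console.WriteLine(detected.WebName);
            Console.WriteLine(BitConverter.ToString(NewLineBytes(detected)));
        }
    }
}
```

In the merge loop you would detect the encoding of each input file once, then write NewLineBytes(detected) instead of the Encoding.Default bytes. Mixing files with different encodings into one target file would still need real transcoding, which this sketch does not attempt.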

Licensed under: CC-BY-SA with attribution