質問

How can I add a byte order mark to a StringBuilder? (I have to pass a string to another method which will save it as a file, but I can't modify that method).

I tried this:

var sb = new StringBuilder();
sb.Append('\xEF');
sb.Append('\xBB');
sb.Append('\xBF');

But when I view it with hex editor, it adds the following sequence: C3 AF C2 BB C2 BF

The string is huge, so it would be good to do it without back and forth converting to byte array.

Edit: Clarification after questions in comments. I have to pass the string to another method which takes a string and creates a file of it on Azure Blob Storage. I can't modify the other method.

役に立ちましたか?

解決

Two options:

  1. Don't include the byte order mark in your text at all... instead use an encoding which will automatically include it
  2. Include it as a character in your StringBuilder:

    sb.Append('\uFEFF'); // U+FEFF is the byte-order mark character
    

Personally I'd go for the first approach normally, but the "I can't modify that method" suggests it may not be an option in your case.

他のヒント

Byte-order marks are to inform readers of a file that the file is of a particular encoding. As such, you should only need the byte-order marks (BOM) in the actual file. If you want to include BOM in a text file you're writing, simply use StreamWriter to write to the file. For example:

using(var writer = new StreamWriter(stream, System.Text.Encoding.UTF8))
{
    writer.Write(sb.ToString);
}

If you don't want BOM with UTF-8:

using(var writer = new StreamWriter(stream))
{
    writer.Write(sb.ToString());
}

Or if you want different BOM:

using(var writer = new StreamWriter(stream, System.Text.Encoding.UTF16))
{
    writer.Write(sb.ToString);
}

Update:

If you wanted to be coupled from the implementation detail of a BOM or a BOM of a particular encoding (i.e. could change at runtime or after deployment) but still wanted to pass a BOM-marked string, you could do something like this (assumes .NET 4.5):

var stream = new MemoryStream();
var encoding = Encoding.UTF8; // TODO: configurize this, if necessary
using(var writer = new StreamWriter(stream, encoding, 1024, true))
{
    writer.Write(sb.ToString());
}
CantModifyButMustUseThis(encoding.GetString(stream.ToArray());

IIRC (and not certain that I do), BOM gets added when you convert to byte using one of the relevant Unicode Encoders. I believe some of those's constructors take a bool that control if to add BOM.

I used this code in ASP.NET core, and well!! it works

 [HttpGet("GetCsv")]
    public async Task<IActionResult> GetCsv() {
        
        var cc = new CsvConfiguration(new System.Globalization.CultureInfo("en-US"));
        var entity = await _service.AdminPanelList();
        using (var ms = new MemoryStream()) {
            using (var sw = new StreamWriter(stream: ms, encoding: new UTF8Encoding(true))) {
                using (var cw = new CsvWriter(sw, cc)) {

                    var bom = '\uFEFF'.ToString();
                    byte[] bomArray = Encoding.UTF8.GetBytes(bom);
                    
                    ms.Write(bomArray);
                    cw.WriteRecords(entity);
                }

                var finalArray = ms.ToArray();
                



                var result = File(finalArray, "text/csv", $"PersonExport.csv");
                    

                return result;
            }
        }
    }
ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top