Cómo GetBytes () en C # con la codificación UTF-8 con la lista de materiales?

https://stackoverflow.com/questions/4414088

08-10-2019
|

Pregunta

Estoy teniendo un problema con la codificación UTF-8 en mi asp.net mvc 2 aplicación en C #. Estoy tratando de usuario le permiten descargar un archivo de texto simple a partir de una cadena. Estoy tratando de conseguir bytes matriz con la siguiente línea:

var x = Encoding.UTF8.GetBytes(csvString);

pero cuando lo vuelvo para descargar usando:

return File(x, ..., ...);

Me obtener un archivo que es sin lista de materiales por lo que no entiendo caracteres croatas muestran correctamente. Esto se debe a mi matriz bytes no incluye la lista de materiales después de la codificación. Me triend inserción de los bytes de forma manual y luego se muestra correctamente, pero eso no es la mejor manera de hacerlo.

También probé crear instancia de clase UTF8Encoding y pasando un valor booleano (verdadero) a su constructor para incluir la lista de materiales, pero no funciona bien.

Cualquier persona tiene una solución? Gracias!

Solución

Try like this:

public ActionResult Download()
{
    var data = Encoding.UTF8.GetBytes("some data");
    var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
    return File(result, "application/csv", "foo.csv");
}

The reason is that the UTF8Encoding constructor that takes a boolean parameter doesn't do what you would expect:

byte[] bytes = new UTF8Encoding(true).GetBytes("a");

The resulting array would contain a single byte with the value of 97. There's no BOM because UTF8 doesn't require a BOM.

Otros consejos

I created a simple extension to convert any string in any encoding to its representation of byte array when it is written to a file or stream:

public static class StreamExtensions
{
    public static byte[] ToBytes(this string value, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        using (var sw = new StreamWriter(stream, encoding))
        {
            sw.Write(value);
            sw.Flush();
            return stream.ToArray();
        }
    }
}

Usage:

stringValue.ToBytes(Encoding.UTF8)

This will work also for other encodings like UTF-16 which requires the BOM.

UTF-8 does not require a BOM, because it is a sequence of 1-byte words. UTF-8 = UTF-8BE = UTF-8LE.

In contrast, UTF-16 requires a BOM at the beginning of the stream to identify whether the remainder of the stream is UTF-16BE or UTF-16LE, because UTF-16 is a sequence of 2-byte words and the BOM identifies whether the bytes in the words are BE or LE.

The problem does not lie with the Encoding.UTF8 class. The problem lies with whatever program you are using to view the files.

Remember that .NET strings are all unicode while there stay in memory, so if you can see your csvString correctly with the debugger the problem is writing the file.

In my opinion you should return a FileResult with the same encoding that the files. Try setting the returning File encoding,

Licenciado bajo: CC-BY-SA con atribución

No afiliado a StackOverflow