PrinceXML: "Input is not proper UTF-8"

https://stackoverflow.com/questions/4204114

25-09-2019

Question

I'm generating HTML from a database and then sending it to PrinceXML for conversion to PDF. The code I use to do this is:

string _htmlTemplate = @"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd""><html lang=""en-GB"" xml:lang=""en-GB"" xmlns=""http://www.w3.org/1999/xhtml""><head><meta http-equiv=""Content-type"" content=""text/html;charset=UTF-8"" /><title>Generated PDF Contract</title></head><body>{0}</body></html>";

string _pgeContent = string.Format(_htmlTemplate, sb.ToString());
writer.Write(sb.ToString());
Byte[] arrBytes = UTF8Encoding.Default.GetBytes(_pgeContent);
Stream s = new MemoryStream(arrBytes);

Prince princeConverter = new Prince(ConfigurationManager.AppSettings["PrinceXMLInstallLoc"].ToString());
princeConverter.SetLog(ConfigurationManager.AppSettings["PrinceXMLLogLoc"]);
princeConverter.AddStyleSheet(Server.MapPath(ConfigurationManager.AppSettings["FormsDocGenCssLocl"]));
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.BufferOutput = true;

However, the conversion fails with the error:

Input is not proper UTF-8, indicate encoding ! Bytes: 0xA0 0x77 0x65 0x62
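For anyone reading the byte dump: 0x77 0x65 0x62 is ASCII for "web", and 0xA0 is how single-byte code pages such as ISO-8859-1 or Windows-1252 encode a non-breaking space (U+00A0). In UTF-8 that character is the two-byte sequence 0xC2 0xA0, so a bare 0xA0 is invalid. The sketch below reproduces both byte sequences. It assumes the offending text was a non-breaking space followed by "web", which the error message suggests but does not prove, and it uses ISO-8859-1 as a stand-in for whatever ANSI code page the server uses. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class NbspBytesDemo
{
    // Formats bytes in the same style as the error message, e.g. "0xA0 0x77".
    public static string Hex(byte[] bytes)
    {
        var sb = new StringBuilder();
        foreach (byte b in bytes)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("0x").Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumption: the input contained a non-breaking space before "web".
        string text = "\u00A0web";

        // Assumption: ISO-8859-1 (code page 28591) stands in for the
        // server's single-byte ANSI code page.
        byte[] singleByte = Encoding.GetEncoding(28591).GetBytes(text);
        byte[] utf8 = new UTF8Encoding(false).GetBytes(text);

        Console.WriteLine(Hex(singleByte)); // 0xA0 0x77 0x65 0x62
        Console.WriteLine(Hex(utf8));       // 0xC2 0xA0 0x77 0x65 0x62
    }
}
```

The first line of output matches the bytes in the error, which hints that the HTML reached Prince in a single-byte encoding rather than in UTF-8.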

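I've taken the generated HTML and uploaded it to the W3C validator. It validates the markup as UTF-8 encoded XHTML 1.0 Transitional with no errors or warnings.

I've also gone through the file with a fine-tooth comb looking for invalid characters. So far, nothing.

Can anyone suggest something else I could try?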
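Hunting for a bad byte by eye is error-prone, because the characters can look fine in an editor while the bytes sent to the converter are not. A rough way to check the byte array itself is sketched below. It is a simplified structural check only (lead byte plus the right number of continuation bytes), it does not reject every overlong or surrogate sequence, and the helper name `FirstInvalidIndex` is hypothetical.

```csharp
using System;
using System.Text;

public static class Utf8Check
{
    // Returns the index of the first byte that cannot start or continue a
    // UTF-8 sequence, or -1 if none is found. Simplified: overlong forms
    // and surrogates under 0xE0/0xED/0xF0/0xF4 are not all rejected.
    public static int FirstInvalidIndex(byte[] data)
    {
        int i = 0;
        while (i < data.Length)
        {
            byte b = data[i];
            int extra;
            if (b < 0x80) extra = 0;                      // ASCII
            else if (b >= 0xC2 && b <= 0xDF) extra = 1;   // 2-byte lead
            else if (b >= 0xE0 && b <= 0xEF) extra = 2;   // 3-byte lead
            else if (b >= 0xF0 && b <= 0xF4) extra = 3;   // 4-byte lead
            else return i;                                // stray or invalid byte

            if (i + extra >= data.Length) return i;       // truncated sequence
            for (int k = 1; k <= extra; k++)
            {
                if ((data[i + k] & 0xC0) != 0x80) return i; // not a continuation byte
            }
            i += extra + 1;
        }
        return -1;
    }

    public static void Main()
    {
        string html = "<p>\u00A0web</p>";
        // ISO-8859-1 is used here as an example of a single-byte encoding.
        byte[] bad = Encoding.GetEncoding(28591).GetBytes(html);
        byte[] good = new UTF8Encoding(false).GetBytes(html);

        Console.WriteLine(FirstInvalidIndex(bad));  // 3
        Console.WriteLine(FirstInvalidIndex(good)); // -1
    }
}
```

Running a check like this on the same byte array that is wrapped in the stream for Prince, rather than on a saved copy of the HTML, tests what the converter actually receives.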

Solution

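Well, after an afternoon of muttering curses and tearing out what's left of my hair, I figured out a fix for my particular problem.

It would appear that System.Text.UTF8Encoding doesn't output the UTF-8 identifier bytes (the byte order mark) by default. So in my case I needed to use the constructor that takes a boolean parameter controlling whether they are emitted.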

UTF8Encoding u8enc = new UTF8Encoding(true); // Ensures a UTF-8 identifier is emitted.
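A minimal sketch of how that encoder might be wired into the stream from the question follows. Two details are worth knowing, and both are facts about .NET rather than anything stated above. First, `GetBytes` never writes the identifier bytes, even with `new UTF8Encoding(true)`. They come from `GetPreamble()` (or from a `StreamWriter` created with the encoding), so the sketch writes them explicitly. Second, `UTF8Encoding.Default` in the question resolves to the inherited static `Encoding.Default`, which on .NET Framework is the system ANSI code page rather than UTF-8, so switching to a real `UTF8Encoding` instance changes the bytes regardless of the identifier. The helper name `ToUtf8Stream` is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8BomStreamSketch
{
    // Hypothetical helper: encodes html as UTF-8 and prefixes the
    // UTF-8 identifier (BOM, EF BB BF) so the consumer can detect the encoding.
    public static MemoryStream ToUtf8Stream(string html)
    {
        var u8enc = new UTF8Encoding(true);

        byte[] preamble = u8enc.GetPreamble(); // EF BB BF because of "true"
        byte[] body = u8enc.GetBytes(html);    // GetBytes itself adds no BOM

        var ms = new MemoryStream(preamble.Length + body.Length);
        ms.Write(preamble, 0, preamble.Length);
        ms.Write(body, 0, body.Length);
        ms.Position = 0; // rewind so the reader starts at the first byte
        return ms;
    }

    public static void Main()
    {
        using (MemoryStream s = ToUtf8Stream("\u00A0web"))
        {
            Console.WriteLine(BitConverter.ToString(s.ToArray()));
            // EF-BB-BF-C2-A0-77-65-62
        }
    }
}
```

In the question's code, the returned stream would take the place of `new MemoryStream(arrBytes)`. Whether the identifier is strictly required by Prince is not something this sketch establishes. The non-breaking space now arrives as 0xC2 0xA0 either way.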

After this, all was well. Hope this helps someone :-)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow