As Martin says, the evidence you have provided suggests that the file is encoded in Windows-1252, and that unparsed-text('file.md', 'utf-8') is therefore right to reject it.
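To make that diagnosis concrete, here is a minimal Python sketch (not part of the original answer) showing why a strict UTF-8 decoder rejects such a file. It assumes the editor stored "š" as the single byte 0x9A, which is what both Windows-1250 and Windows-1252 use for that character; the helper name decodes_as_utf8 is hypothetical.

```python
# "š" (U+0161) as a single-byte Windows code page would store it.
legacy_bytes = "š".encode("cp1252")  # one byte, 0x9A; cp1250 yields the same byte
utf8_bytes = "š".encode("utf-8")     # two bytes, 0xC5 0xA1


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 under strict decoding."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        # 0x9A is a continuation byte and cannot start a UTF-8 sequence.
        return False


print(decodes_as_utf8(utf8_bytes))                 # True
print(decodes_as_utf8(legacy_bytes))               # False
print(legacy_bytes.decode("cp1250") == "š")        # True
```

A file containing only ASCII characters is byte-for-byte identical in UTF-8 and in these Windows code pages, which would explain why the first file is accepted with 'UTF-8' and the second is not.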
Encoding of input file for XSLT 2.0 function unparsed-text()
Question
Let's say I have this file.md, encoded in UTF-8 (.md means it is in Markdown format):
Hello world
This text is encoded in UTF-8.
Then I read it using the function unparsed-text('file.md', 'UTF-8'). That works like a charm.
The problem shows up when I use a character specific to my native language (Czech), for example in this file2.md:
Hello world
This character "š" is read like "sh" in english.
Using the same encoding parameter in unparsed-text(), I get this error:
XTDE1200: Failed to read input file file:/C:/file2.md (java.nio.charset.MalformedInputException): Input length = 1
file2.md has the same UTF-8 encoding as file.md, and Czech characters are in this charset, yet the XSLT processor doesn't accept it. If I change the encoding parameter to windows-1250, i.e. unparsed-text('file2.md', 'windows-1250'), it works nicely.
So the question is: why do I get this error? Is it related to the fact that the input file has the extension .md (.txt works)? Is there a way around it? I really want to be able to use the same encoding in my XSL stylesheet as the supplied input file has.
Thanks for answers.
Solution
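One possible way around the error, if the goal is to keep 'UTF-8' in the stylesheet, is to re-save the input file so that it really is UTF-8. The following is only a sketch under the assumption that file2.md was saved in Windows-1250 (the encoding that worked in the question); the function name reencode_to_utf8 and the output file name are hypothetical. Most text editors can do the same through a "save with encoding" option.

```python
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1250") -> None:
    """Read src using source_encoding (assumed, not detected) and write dst as UTF-8."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


# Example call (paths are placeholders):
# reencode_to_utf8(Path("file2.md"), Path("file2-utf8.md"))
```

After such a conversion, unparsed-text() with 'UTF-8' should be able to read the new file, because the bytes now match the declared encoding.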