Question

I have an XHTML file containing links with multiple query parameters:

index.jsp?foo=bar&foo2=bar2&foo3=bar3.

Saxon 9.5 tries to interpret &foo2 as an entity reference and, obviously, fails. I cannot change the XML (it is a web page from the internet). I could pre-process it with a regex, but I would like to avoid programming if possible.

java -jar %SAXON_HOME%\saxon9he.jar -xsl:transfo.xsl -s:pageWeb.xml -o:result.html -dtd:off --recognize-uri-query-parameters:false

This command does not work. Is there a way to do this without modifying the XML?

Thank you


Solution

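If you feed an XML parser something that is not well-formed XML, the parser is going to reject it; that is what the specification requires. In well-formed XML, a literal & inside an attribute value has to be written as &amp;, so the parser is right to complain about &foo2. Saxon itself simply relies on an XML parser to read its input documents and stylesheets.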
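To see that the rejection comes from the XML parser and not from Saxon, here is a minimal sketch that uses only the JDK's built-in parser. The class name WellFormedCheck and the helper isWellFormed are made up for this illustration, and the sample link is a shortened version of the one in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class WellFormedCheck {

    // Returns true if the JDK's default XML parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser from printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // Any parse failure (for example, an unterminated entity reference) lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Bare ampersands, as found in the downloaded page.
        String raw = "<a href=\"index.jsp?foo=bar&foo2=bar2&foo3=bar3\">link</a>";
        // The same link with the ampersands escaped as XML requires.
        String escaped = "<a href=\"index.jsp?foo=bar&amp;foo2=bar2&amp;foo3=bar3\">link</a>";

        System.out.println("raw:     " + isWellFormed(raw));
        System.out.println("escaped: " + isWellFormed(escaped));
    }
}
```

The first string is rejected and the second is accepted, so no Saxon option can help as long as a strict XML parser is reading the page.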

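If your input is not well-formed, you can try a more forgiving parser such as TagSoup or the HTML5 parser (htmlparser from validator.nu). Tell Saxon to use it for the source document with the -x option, e.g.

java -jar %SAXON_HOME%\saxon9he.jar -x:org.ccil.cowan.tagsoup.Parser ...

or

java -jar %SAXON_HOME%\saxon9he.jar -x:nu.validator.htmlparser.sax.HtmlParser ....

The parser's jar has to be on the classpath as well. Note that java -jar ignores -cp, so in practice you may need to put both jars on the classpath and run the net.sf.saxon.Transform class directly.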

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow