Question

I have arbitrary XHTML documents that are often not well-formed, since websites are frequently written that way and browsers still render them. How can I apply an XSLT transformation to XHTML that is not well-formed? Is there a way to skip the parts that are not well-formed?

I have this Java code, but as I said, it does not handle XHTML that is not well-formed:

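import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// try-with-resources closes the output file even if the transformation fails
try (OutputStream htmlFile = new FileOutputStream("path3")) {
    TransformerFactory tFactory = TransformerFactory.newInstance();

    Source xslDoc = new StreamSource("path1"); // the XSLT stylesheet
    Source xmlDoc = new StreamSource("path2"); // the input; must be well-formed XML

    Transformer transformer = tFactory.newTransformer(xslDoc);
    transformer.transform(xmlDoc, new StreamResult(htmlFile));
} catch (Exception e) {...}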

Solution

You can use the jsoup library to parse and repair your HTML, and then run the XSLT transformation on the cleaned-up output.
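A minimal sketch of that idea, assuming jsoup is on the classpath. The class name JsoupXslt and the methods tidy and transform are made up for illustration, and the input is passed as strings only to keep the example short. jsoup parses the markup the way a browser would, and switching its output syntax to XML should give markup that the standard XML parser behind the Transformer accepts in most cases.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

// Hypothetical helper, not part of jsoup or the JDK.
final class JsoupXslt {

    // Parse lenient HTML and re-serialize it using XML syntax.
    static String tidy(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings()
           .syntax(Document.OutputSettings.Syntax.xml)   // close void tags such as <br />
           .escapeMode(Entities.EscapeMode.xhtml)        // avoid HTML-only entities like &nbsp;
           .prettyPrint(false);
        return doc.html();
    }

    // Repair the HTML first, then apply the stylesheet to the repaired markup.
    static String transform(String html, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(tidy(html))),
                              new StreamResult(out));
        return out.toString();
    }
}
```

With files instead of strings, Jsoup.parse(File, String) can read the input, and the StreamResult can wrap the FileOutputStream from the question. One caveat: if the page carries an XHTML DOCTYPE with a public identifier, the XML parser may try to fetch the DTD, so it may be worth stripping or resolving it locally.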

OTHER TIPS

You can also try an HTML parser such as the Validator.nu HTML Parser (http://about.validator.nu/htmlparser/) or TagSoup.
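As a rough sketch of the TagSoup route, assuming the TagSoup jar is on the classpath: its Parser class is a SAX XMLReader, so it can be handed to the transformer through a SAXSource and no intermediate cleaned-up file is needed. The class name TagSoupXslt and its transform method are hypothetical names for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.ccil.cowan.tagsoup.Parser;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of TagSoup or the JDK.
final class TagSoupXslt {

    static String transform(String html, String xsl) throws TransformerException {
        // TagSoup reads messy HTML and reports it as a stream of well-formed SAX events.
        XMLReader reader = new Parser();
        SAXSource source = new SAXSource(reader, new InputSource(new StringReader(html)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }
}
```

Note that TagSoup, by default, reports elements in the XHTML namespace (http://www.w3.org/1999/xhtml), so stylesheet patterns generally need a prefix bound to that namespace, for example h:p rather than p.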

Licensed under: CC-BY-SA with attribution