Question

I'm currently using a network software package (IxChariot) that exports its output as HTML. I would like to write a program that transforms the HTML from IxChariot into different HTML (I want to add some tips and remove some useless information).

I need to remove some tables, but the only difference between the tables is the title.

Example of the HTML code:

<TABLE CELLPADDING=3 BORDER=1 style="page-break-inside : avoid">
<H2>Run Options</H2><BR>

Another example:

<TABLE CELLPADDING=3 BORDER=1 style="page-break-inside : avoid">
<H2>Test Setup (Console to Endpoint 1)</H2><BR>

To do that, I'm trying to use a jsoup pseudo-selector like this:

Elements table = doc.select("table:has(h2:contains(Run Options)").remove();

I also tried this:

Elements table = doc.select("table");
table:has(h2:contains(title));

But it's not working (currently, all the code is removed). Could you help me with the pseudo-selector? Or do you have a better idea?

P.S.: I'm not an expert; I just have some basic programming knowledge.


Solution

The problem may be that <H2> is not an element expected inside <TABLE>, so jsoup's default HTML parser does not keep it inside the table and the :has(h2...) test never sees it there. In that case, try the plain XML parser instead of the default HTML parser. You can do that with:

Document doc = Jsoup.connect(yourUrl).parser(Parser.xmlParser()).get();
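
To make the suggestion concrete, here is a minimal sketch of the whole removal step, assuming the report is already available as a string rather than fetched from a URL. The class name RemoveTablesByTitle, the helper removeTables, and the sample markup are hypothetical and only for illustration; the jsoup calls used are Jsoup.parse(html, baseUri, parser), Parser.xmlParser(), select, and remove. Note that the selector needs both closing parentheses, one for :contains and one for :has.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class RemoveTablesByTitle {

    // Hypothetical helper: removes every <table> that contains an <h2>
    // whose text includes the given title.
    static Document removeTables(String html, String title) {
        // The XML parser keeps elements where they appear in the source,
        // so the <H2> stays a descendant of its <TABLE>.
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());

        // :has(...) matches tables with a matching descendant;
        // :contains(...) does a case-insensitive text match.
        doc.select("table:has(h2:contains(" + title + "))").remove();
        return doc;
    }

    public static void main(String[] args) {
        // Sample markup modelled on the question (assumed, not real IxChariot output).
        String html =
            "<TABLE CELLPADDING=3 BORDER=1><H2>Run Options</H2><TR><TD>a</TD></TR></TABLE>"
          + "<TABLE CELLPADDING=3 BORDER=1><H2>Test Setup (Console to Endpoint 1)</H2>"
          + "<TR><TD>b</TD></TR></TABLE>";

        Document cleaned = removeTables(html, "Run Options");
        System.out.println(cleaned.outerHtml());
    }
}
```

With the sample input, only the "Run Options" table should be dropped and the "Test Setup" table should remain. If a title contains unbalanced parentheses, it would need escaping before being placed inside :contains(...).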
Licensed under: CC-BY-SA with attribution