Question

I am new to HtmlUnit and I don't know how to get the text inside the [...]

A part of my html file:

<ul ......somethin....>
<li data-role="list-divider" role="heading" style="font-size:16px;" class="ui-bar-f">
  INFORMATION_LINE_1
</li>

<li data-theme="d" class="ui-li ui-btn-icon-right ui-btn-up-d ui-odd-match-column ">
  <div class="ui-btn-inner ui-li">
    <div class="">
      <div class="ui-btn-text">
        <a href="/x/cxntay/13113/ndzvsssl/g1" class=" ui-link-inherit ui-link-hover">
          <h3 class="ui-li-heading">
            <span class="xheader">INFORMATION_LINE_2</span>
            <span class="label live">INFORMATION_LINE_3</span>
          </h3>
          <div class="ui-live-scores">
            <span class="team1-scores">
              <span class="ui-team-name">INFORMATION_LINE_4</span>
              <span style="font-weight:bold">INFORMATION_LINE_5</span>
            </span>
            <span>INFORMATION_LINE_6</span>
          </div>
        </a>
      </div>
    </div>
  </div>
</li>
</ul>

Now I want to retrieve each "INFORMATION_LINE_X" (X = 1, 2, ..., 6) from between these tags.

This is what I tried:

List<HtmlUnorderedList> ls = (List<HtmlUnorderedList>) page.getByXPath("/ul");
List<DomNode> dls = ls.get(0).getChildNodes();
System.out.println(dls.get(0).getFirstByXPath("//li[@data-role='list-divider']/text()"));

I only tried to get INFORMATION_LINE_1, but it printed null. I need to get all of the INFORMATION_LINEs.

Solution

It is better to use XPath alone rather than mixing it with HtmlUnit's DOM-navigation methods. Something like this should get you the first information line:

HtmlElement e = page.getFirstByXPath("//li[@data-role='list-divider']");
System.out.println(e.asText());

To fetch the other information lines, follow the same approach with a different XPath string.
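As a rough sketch of what those other XPath strings could look like, the example below evaluates one expression per information line against a trimmed copy of the markup from the question. It uses the JDK's built-in javax.xml.xpath instead of HtmlUnit so that it runs without extra dependencies; the class name InfoLines, the SAMPLE string, and the extract helper are made up for this illustration, and the expressions assume the real page has the same structure as the snippet.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InfoLines {
    // One XPath per information line, in order 1..6 (assumes the structure shown in the question).
    static final String[] XPATHS = {
        "//li[@data-role='list-divider']",
        "//span[@class='xheader']",
        "//span[@class='label live']",
        "//span[@class='ui-team-name']",
        "//span[@class='team1-scores']/span[2]",
        "//div[@class='ui-live-scores']/span[2]"
    };

    // Trimmed, well-formed copy of the question's markup (hypothetical test input).
    static final String SAMPLE =
        "<ul>"
        + "<li data-role='list-divider'>\n  INFORMATION_LINE_1\n</li>"
        + "<li><a href='/x/cxntay/13113/ndzvsssl/g1'>"
        + "<h3 class='ui-li-heading'>"
        + "<span class='xheader'>INFORMATION_LINE_2</span>"
        + "<span class='label live'>INFORMATION_LINE_3</span>"
        + "</h3>"
        + "<div class='ui-live-scores'>"
        + "<span class='team1-scores'>"
        + "<span class='ui-team-name'>INFORMATION_LINE_4</span>"
        + "<span style='font-weight:bold'>INFORMATION_LINE_5</span>"
        + "</span>"
        + "<span>INFORMATION_LINE_6</span>"
        + "</div>"
        + "</a></li>"
        + "</ul>";

    // Returns the whitespace-normalized text of the first match of each XPath.
    static List<String> extract(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> lines = new ArrayList<>();
        for (String expr : XPATHS) {
            // normalize-space() takes the first matching node and trims its text.
            lines.add(xpath.evaluate("normalize-space(" + expr + ")", doc));
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        for (String line : extract(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

With HtmlUnit, the same expression strings can be passed to page.getFirstByXPath(...) as in the snippet above. Matching on the full class attribute (for example 'label live') is brittle if the page adds or reorders classes, so adjust the predicates to whatever page.asXml() actually shows.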

Bear in mind that you should always debug against what HtmlUnit actually sees, by printing the output of page.asXml(). A real browser does not necessarily show you exactly the same document that HtmlUnit sees. You can run into differences, particularly if the page executes JavaScript.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow