I'm trying to extract specific data from an HTML page using XPath queries, but I'm having trouble getting at some of the parts I need.

Using //div[@id='main']/h2 as my XPath query, I am able to pull the "View Current" text with the following:

exampleSite.title = [[element firstChild] content];

However I would also like to pull in the following:

1. <b>5/9/2013<nbsp><nbsp> 10:58:45 PM</b>
2. <b>6.32</b>
3. <b>5  Total Points</b>
4. <b>3.72</b>

So far I've got this: //div[@id='main']/table[@class='bodytext']/tr but that's where I get stuck. Any help would be greatly appreciated! Thank you!

Here is the html I'm attempting to scrape:

<div id="main">
<h2>View Current</h2>

      <table width="96%" border="0" cellpadding="4" cellspacing="0" bordercolor="#eeeeee" align="center" height="276" valign="top" class="bodytext">
        <tr valign="top" >
          <td colspan = 2 height="13" valign="top" align="left" width="54%" class="headerblue" >Balances <br>
          </td>
        </tr>
        <tr valign="top" > 
          <td colspan = 2 height="13" valign="top" align="left" width="54%" class="text" >Balances 
            as of: <b>5/9/2013<nbsp><nbsp> 10:58:45 PM</b></td>
        </tr>
        <tr valign="top" > 
          <td colspan = 2 height="13" valign="top" align="left" width="46%" class="text" >Account 
            Number: <b>101010123</b></td>
        </tr>
        <tr valign="top" > 
          <td colspan = 2 height="13" valign="top" align="left" width="46%" class="text" ></td>
        </tr>

        <tr valign="top" > 
          <td height="13" valign="top" align="left" width="46%" class="text" >Example Card Amount: 
            <b>6.32</b></td>
<td height="13" valign="top" align="left" width="46%" class="text" ><a href="balance.asp?">View Details</a></td>
        </tr>

        <tr valign="top" > 
          <td height="13" valign="top" align="left" width="46%" class="text" >Example Dining Plans:<b>5  Total Points</b>

</td>
<td height="13" valign="top" align="left" width="46%" class="text" ><a href="balance2.asp?">View Details</a></td>
        </tr>

        <tr valign="top" > 
          <td height="13" valign="top" align="left" width="46%" class="text" >Credit For Printing: 
            <b>3.72</b></td>
<td height="13" valign="top" align="left" width="46%" class="text" ><a href="balance1.asp?">View Details</a></td>
        </tr>

          <td colspan = 2 height="13" valign="top" align="CENTER"  class="text">For 
            questions contact Cashiers at<BR> (000)000-0011 or <a href="mailto:example@example.com">example@example.com</a></td>
        </tr>
        <tr valign="top"> 
          <td colspan = 2 height="13" valign="top" align="CENTER"  class="text" > 

<a href="balance1.asp">All Plan Usage for last 90 days is available here</a>
            </td>
        </tr>
        <tr valign="top"> 
          <td colspan = 2 height="13" valign="top" align="CENTER"  class="text" > 

<a href="balance.asp?pln=Full">All Usage for last 365 days is available here</a>
            </td>
        </tr>

      </table>



</div>

Solution

//div[@id='main']/table[@class='bodytext']/tr/td/b should give you a list of all the <b> elements in your table cells.

Other tips

Here is an extension to Mennny's answer, which is actually right, so you should accept it. I'll try to answer your additional questions in the comments:

You do your parsing like this: (htmlData is my demo data)

NSData *htmlData = [NSData dataWithContentsOfFile:[@"/Users/dennis/Desktop/demo.html" stringByStandardizingPath]];
TFHpple *parser = [[TFHpple alloc] initWithHTMLData:htmlData];
NSArray *bTags = [parser searchWithXPathQuery:@"//div[@id='main']/table[@class='bodytext']/tr/td/b"];

After that, put the contents of the parsed <b> tags in an NSMutableArray:

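NSMutableArray *stringsInBTag = [[NSMutableArray alloc] initWithCapacity:0];
for (TFHppleElement *element in bTags) {
    // Fall back to the first child's text (as in the question) when the element has none.
    NSString *text = element.content ?: [[element firstChild] content];
    // addObject: raises on nil, so skip elements without any text.
    if (text) {
        [stringsInBTag addObject:text];
    }
}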

What you get there is: (logged output of the array)

( "5/9/2013", 101010123, "6.32", "5 Total Points", "3.72" )
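
Note that the first entry above is only "5/9/2013"; the time is missing. My guess is that the non-standard <nbsp> tags are parsed as nested elements, so " 10:58:45 PM" ends up inside them rather than directly in the <b>. Below is a minimal sketch for collecting the nested text as well, assuming that is what happens. AllTextInElement is a hypothetical helper, not part of TFHpple. TFHpple versions seem to differ in whether an element's text sits on the element itself (element.content, as used here) or on child text nodes ([[element firstChild] content], as used in the question), so the sketch reads both:

```objc
#import <Foundation/Foundation.h>
#import "TFHppleElement.h"

// Hypothetical helper (not part of TFHpple): returns the element's own text
// followed by the text of everything nested below it.
static NSString *AllTextInElement(TFHppleElement *element) {
    // Start with the node's own content, if it has any.
    NSMutableString *text = [NSMutableString stringWithString:(element.content ?: @"")];
    // Descend into children: text nodes and nested elements such as <nbsp>.
    for (TFHppleElement *child in element.children) {
        [text appendString:AllTextInElement(child)];
    }
    return text;
}
```

Using AllTextInElement(element) in place of element.content in the loop should then keep the time, although the whitespace between date and time may need trimming or a separator added.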

Now you want to set your labels:

// Set label 1 to third <b>
self.label1.text = stringsInBTag[2];

// Set label 2 to first <b> 
self.label2.text = stringsInBTag[0];
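
Picking values by position breaks as soon as a row is added or removed. As an alternative, here is a sketch that looks each value up by the label text in its cell. BoldValueForLabel is a hypothetical helper, not part of TFHpple. It assumes the parser object from above, and that the label contains no single quote, since it is spliced straight into the query. normalize-space(.) is there because some labels in the HTML are split across lines:

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"
#import "TFHppleElement.h"

// Hypothetical helper (not part of TFHpple): returns the text of the first <b>
// inside the table cell whose text contains `label`, or nil if nothing matches.
static NSString *BoldValueForLabel(TFHpple *parser, NSString *label) {
    // Assumption: `label` contains no single quote (it is not escaped here).
    NSString *query = [NSString stringWithFormat:
        @"//div[@id='main']/table[@class='bodytext']/tr"
        @"/td[contains(normalize-space(.), '%@')]/b", label];
    NSArray *matches = [parser searchWithXPathQuery:query];
    if (matches.count == 0) {
        return nil;
    }
    TFHppleElement *bold = matches[0];
    // Same fallback as in the question: the text may sit on the first child node.
    return bold.content ?: [[bold firstChild] content];
}
```

You would then set the labels like this:

```objc
self.label1.text = BoldValueForLabel(parser, @"Example Card Amount");
self.label2.text = BoldValueForLabel(parser, @"Balances as of");
```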