I have the following HTML on my webpage:

<p>This is a <a href="http://www.google.com/">hyperlink</a> and this is another <a href="http://www.bing.com/">hyperlink</a>. There are many like it, but <a href="http://en.wikipedia.org/wiki/Full_Metal_Jacket">this one is mine</a>.</p>

Now, I was wondering...

Is there any way, I can use a PHP function to split this block of text up into an array?

$html[0] = "<p>This is a & this is another . There are many like it, but .</p>";
$html[1] = "http://www.google.com/";
$html[2] = "http://www.bing.com/";
$html[3] = "http://en.wikipedia.org/wiki/Full_Metal_Jacket";

So, basically stripping the initial block of text of all hyperlinks and storing them all in their own array element.

Many thanks for any help with this.

有帮助吗?

解决方案

Use this RegEx to get URL's of html:

  $url = "http://www.example.net/somepage.html";
  $input = @file_get_contents($url) or die("Could not access file: $url");
  $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
  if(preg_match_all("/$regexp/siU", $input, $matches)) {
    // $matches[2] = array of link addresses
    // $matches[3] = array of link text - including HTML code
  }
?>
许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top