Question

I'm stuck on a follow-up to a question I was just helped with. It's a new problem, but only slightly different.

I have this preg_match call to get the contents of the href attribute. Please don't tell me not to use regex: I am aware of the other parsers and classes, but this is an old script that just needs to be fixed for now. :) No time for rewrites!

preg_match("~<a target=\'_blank\' rel=\'nofollow\' href=\"(.*?)\">~i", $epilink, $epiurl);

It returns:

http://www.example.com/frame2.php?view=&epi=54673-r

However, it should return:

http://www.example.com/frame2.php?view=168204&epi=54673

This is an example of the HTML it has to work with:

<a target='_blank' rel='nofollow' href="http://www.example.com/frame2.php?view=545903&epi=54683">

Why is the returned URL malformed?

Thanks all for any help.


Solution

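$string = "<a target='_blank' rel='nofollow' href=\"http://www.example.com/frame2.php?view=545903&epi=54683\">";
// Split on the sequence that closes the tag, so each piece ends right after the URL.
$s = explode('">', $string);
foreach ($s as $k) {
    // Only the piece containing "href" holds the URL.
    if (strpos($k, "href") !== false) {
        // Strip everything up to and including href=" and print what is left.
        echo preg_replace('/.*href="/s', "", $k);
        break;
    }
}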

Output:

$ php test.php
http://www.example.com/frame2.php?view=545903&epi=54683
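As for why the original result looked malformed: a capture group can only return text that is already present in the subject string, so a value like view=&epi=54673-r most likely means $epilink itself contained that URL before preg_match ran. A minimal sketch for checking this, assuming a hypothetical $epilink that reproduces the malformed value from the question:

```php
<?php
// Hypothetical input: the anchor tag already carries the malformed URL.
$epilink = "<a target='_blank' rel='nofollow' href=\"http://www.example.com/frame2.php?view=&epi=54673-r\">";

if (preg_match("~<a target='_blank' rel='nofollow' href=\"(.*?)\">~i", $epilink, $epiurl)) {
    // The capture is a verbatim substring of the subject string,
    // so this check confirms the regex did not alter the URL.
    var_dump(strpos($epilink, $epiurl[1]) !== false);
    echo $epiurl[1], "\n";
}
```

If the check confirms the capture is a substring, the place to look is whatever code builds $epilink, not the pattern.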

OTHER TIPS

This should work:

$epilink = "<a target='_blank' rel='nofollow' href=\"http://www.example.com/frame2.php?view=545903&epi=54683\">";
preg_match("/<a target='_blank' rel='nofollow' href=\"(.*?)\">/i", $epilink, $epiurl);

print_r($epiurl);

You can also use preg_match_all to collect every matching link instead of only the first.
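A rough sketch of the preg_match_all variant, assuming a hypothetical $html string that holds two links in the same format as the question (the second URL is made up for illustration). It uses [^"]* instead of .*? so the capture cannot run past the closing quote:

```php
<?php
// Hypothetical input: two anchors in the format shown in the question.
$html = "<a target='_blank' rel='nofollow' href=\"http://www.example.com/frame2.php?view=545903&epi=54683\">"
      . "<a target='_blank' rel='nofollow' href=\"http://www.example.com/frame2.php?view=545904&epi=54684\">";

// [^\"]* matches everything up to the next double quote.
$count = preg_match_all("/<a target='_blank' rel='nofollow' href=\"([^\"]*)\">/i", $html, $matches);

// $matches[1] holds the captured URL of every match, in document order.
print_r($matches[1]);
```

With the default flags, $matches[0] contains the full matched tags and $matches[1] the captured URLs.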

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow