Question

I would like to extract the URLs from the anchor tags in an HTML string. The HTML is fairly well structured, but the URLs I am trying to collect are Google Maps addresses and can vary a lot. I am trying to get all matching URLs using preg_match_all.

<tr><td><a href="http://maps.google.com/maps?q=4165 E LIVE OAK AVE,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=8000 SUNSET BLVD, LOS ANGELES,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=30600 THOUSAND OAKS BLVD, AGOURA,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=9090 19TH ST, ALTA LOMA,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=185 W ALTADENA DR, ALTADENA,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=620 E MOUNT CURVE AVE,">map</a></td></tr>

Solution

Try the following regular expression:

/http:\/\/maps\.google\.com\/maps\?q[^"]+(?=")/
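
As a minimal sketch of how this pattern could be used with preg_match_all, the snippet below runs it over two of the rows from the question. The variable names ($html, $pattern, $urls) are illustrative assumptions, and the dots in the host name are escaped so that they match only literal dots.

```php
<?php
// Two sample rows copied from the question; $html is an illustrative name.
$html = <<<'HTML'
<tr><td><a href="http://maps.google.com/maps?q=4165 E LIVE OAK AVE,">map</a></td></tr>
<tr><td><a href="http://maps.google.com/maps?q=8000 SUNSET BLVD, LOS ANGELES,">map</a></td></tr>
HTML;

// [^"]+ consumes everything up to (but not including) the closing quote,
// and the lookahead (?=") checks that a quote follows without capturing it.
$pattern = '/http:\/\/maps\.google\.com\/maps\?q[^"]+(?=")/';

// preg_match_all returns the number of full matches;
// $matches[0] holds the matched strings themselves.
$count = preg_match_all($pattern, $html, $matches);
$urls = $matches[0];

foreach ($urls as $url) {
    echo $url, PHP_EOL;
}
```

Because the addresses contain spaces and commas, each URL would probably need encoding before being used in a request; that step is outside what the pattern does.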

However, if the page may contain similar URLs outside the HTML structure you've shown, it is better to anchor the match to that structure with a more specific pattern:

/(?<=<tr><td><a href=")http:\/\/maps\.google\.com\/maps\?q[^"]+(?=">map<\/a><\/td><\/tr>)/
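
To illustrate the difference between the two patterns, here is a small sketch that runs both over the same input. The first link in $html (the "1 EXAMPLE ST" address inside a paragraph) is made up for this example and is not from the question; the variable names are likewise assumptions.

```php
<?php
// One invented link outside the table plus one real row from the question.
$html = <<<'HTML'
<p><a href="http://maps.google.com/maps?q=1 EXAMPLE ST,">elsewhere</a></p>
<tr><td><a href="http://maps.google.com/maps?q=9090 19TH ST, ALTA LOMA,">map</a></td></tr>
HTML;

// Loose pattern: matches any Google Maps URL followed by a double quote.
$loose = '/http:\/\/maps\.google\.com\/maps\?q[^"]+(?=")/';

// Strict pattern: the fixed-length lookbehind and the lookahead require the
// surrounding <tr><td><a href="...">map</a></td></tr> markup, but neither
// assertion becomes part of the match.
$strict = '/(?<=<tr><td><a href=")http:\/\/maps\.google\.com\/maps\?q[^"]+(?=">map<\/a><\/td><\/tr>)/';

preg_match_all($loose, $html, $looseMatches);
preg_match_all($strict, $html, $strictMatches);

echo count($looseMatches[0]), PHP_EOL;   // both links match the loose pattern
echo count($strictMatches[0]), PHP_EOL;  // only the table row matches
echo $strictMatches[0][0], PHP_EOL;
```

The strict pattern depends on the markup being byte-for-byte identical (no extra whitespace or attributes), so it is only a reasonable choice when the HTML is as regular as the sample in the question.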
Licensed under: CC-BY-SA with attribution