Question

I'd like a regular expression that takes a block of text and finds the strings matching this format:

<a href="mailto:x@x.com">....</a>

For every string that matches this format, it should extract the email address that follows mailto:. Any thoughts?

This is needed for an internal app and not for any spammer purposes!


Solution

If you want to match the whole anchor element, from the opening tag through the closing tag:

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>(.*?)\<\/a\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);

To make it faster and shorter, match only the opening tag:

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);

The second capture group will contain the email address.

Example:

$html ='<div><a href="mailto:test@live.com">test</a></div>';

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>(.*?)\<\/a\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);
var_dump($matches);

Output:

array(1) {
  [0]=>
  array(5) {
    [0]=>
    string(39) "<a href="mailto:test@live.com">test</a>"
    [1]=>
    string(1) " "
    [2]=>
    string(13) "test@live.com"
    [3]=>
    string(0) ""
    [4]=>
    string(4) "test"
  }
}
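If only the addresses are needed, the second capture group can be pulled out of every match in one step. The following is a minimal sketch using the shortened pattern from above; the sample HTML and the $emails variable are made up for illustration.

```php
<?php
// Hypothetical sample input; the second link carries extra attributes.
$html = '<div><a href="mailto:test@live.com">test</a> '
      . '<a class="c" href="mailto:admin@example.org" title="t">admin</a></div>';

// Same shortened pattern as above: group 2 captures the address.
$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>`ism';
preg_match_all($r, $html, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each element of $matches is one match,
// so column 2 across all matches is the list of addresses.
$emails = array_column($matches, 2);

print_r($emails);
```

Without PREG_SET_ORDER the same list should be available directly as $matches[2], since the default ordering groups results by capture group.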

OTHER TIPS

There are plenty of different options on regexp.info

One example would be:

\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b

The "mailto:" is trivial to prepend to that.
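As a rough sketch of what prepending looks like in PHP: the pattern below wraps the email expression in a capture group and adds the i modifier, which is an assumption on my part, needed because the character classes only list upper-case letters. The sample input is hypothetical. Note that {2,4} rejects top-level domains longer than four letters, so widen it if you expect those.

```php
<?php
// Email pattern from above with "mailto:" prepended and a capture group
// around the address. The i flag makes the A-Z classes case-insensitive.
$pattern = '/mailto:([A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4})\b/i';

// Hypothetical input for illustration.
$html = '<a href="mailto:test@live.com">test</a>';

preg_match_all($pattern, $html, $matches);

// Default PREG_PATTERN_ORDER: $matches[1] holds every capture of group 1.
print_r($matches[1]);
```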

/(mailto:)([^"]+)(")/

The second matching group will be the email address.

You can also use PHP's built-in filter extension: http://us3.php.net/manual/en/book.filter.php

(It has filters specifically for validating and sanitizing email addresses: FILTER_VALIDATE_EMAIL and FILTER_SANITIZE_EMAIL.)
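For example, addresses pulled out by a regex could be checked afterwards with filter_var(). This is only a sketch; the $candidates list is made up for illustration.

```php
<?php
// Hypothetical candidates, e.g. values captured by a mailto: regex.
$candidates = ['test@live.com', 'not-an-email', 'x@x.com'];

$valid = [];
foreach ($candidates as $candidate) {
    // filter_var() returns the filtered value on success, false on failure.
    if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
        $valid[] = $candidate;
    }
}

print_r($valid);
```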

Greets

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow