Question

First of all, I should stress that I'm trying to learn here, not be malicious or spam anyone.

I'm trying to learn about regex by finding email addresses in Google search results with the following code. However, sometimes it finds only some of the email addresses, and other times none at all.

If I try it with a Wikipedia URL, I don't have a problem.

$url = "https://www.google.com/search?q=hello@hotmail.com";
// $url = "http://en.wikipedia.org/wiki/Email_address"; // this works fine
$string = file_get_contents($url);

$matches = array();
$pattern = '/[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,4}\b/i';
preg_match_all($pattern, $string, $matches);

// The pattern has no capture groups, so $matches[0] holds every full match.
foreach ($matches[0] as $email)
{
    echo $email . "<br>";
}

Solution

You're missing uppercase:

'/[A-Za-z\d._%+-]+@[A-Za-z\d.-]+\.[A-Za-z]{2,4}\b/i'

I put it in everywhere in case you want to match HELLO@GMAIL.COM; you can always lowercase the result afterwards.

EDIT: The /i flag already makes the original pattern case-insensitive, so uppercase wasn't the problem. I think I was trying to solve this for a different email address which wasn't being matched.
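
As a quick check of the case handling discussed here, the following is a minimal sketch that runs the question's pattern against an uppercase address and lowercases the result. The sample sentence is a made-up assumption, not data from the question.

```php
<?php
// The pattern from the question; the trailing /i makes it case-insensitive.
$pattern = '/[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,4}\b/i';

// Hypothetical sample text containing an all-uppercase address.
$sample = 'Write to HELLO@GMAIL.COM today';

preg_match_all($pattern, $sample, $m);

// Normalize every full match to lowercase.
$normalized = array_map('strtolower', $m[0]);

print_r($normalized);
```

If the address is found here, the missed matches in the Google results have to come from something other than letter case.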

EDIT 2: Search the HTML. The addresses that aren't found contain emphasis tags, like example<em>@example.com</em>, so the regex can't match across the markup.
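
One possible way around this, sketched under assumptions rather than offered as a definitive fix, is to strip the markup before running the regex, so that an address split by tags becomes contiguous text. The helper name extract_emails and the sample HTML are hypothetical; strip_tags, html_entity_decode and preg_match_all are standard PHP functions.

```php
<?php
// Hypothetical helper: remove tags first, then apply the question's pattern.
function extract_emails(string $html): array
{
    // strip_tags turns example<em>@example.com</em> into example@example.com;
    // html_entity_decode converts entities such as &amp; back to characters.
    $text = html_entity_decode(strip_tags($html));

    $pattern = '/[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,4}\b/i';
    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full matches; drop duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Made-up sample standing in for the fetched search-result HTML.
$html = 'Contact: example<em>@example.com</em> or <b>hello</b>@hotmail.com';

foreach (extract_emails($html) as $email) {
    echo $email, "\n";
}
```

Note that stripping tags also joins text from adjacent elements (for instance <p>foo</p><p>bar@x.com</p> becomes foobar@x.com), so this approach can produce false positives on real pages.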

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow