First of all, I should stress that I'm trying to learn here, not be malicious or spam anyone.

I'm trying to learn about regex by extracting email addresses from Google search results with the following code. However, sometimes it finds only some of the email addresses, and other times none at all.

If I try it with a Wikipedia URL then I don't have a problem.

$url = "https://www.google.com/search?q=hello@hotmail.com";
// $url = "http://en.wikipedia.org/wiki/Email_address"; this works fine
$string = file_get_contents($url);

$matches = array();
$pattern = '/[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,4}\b/i';
preg_match_all($pattern,$string,$matches);

foreach ($matches as $row)
{
    foreach ($row as $row2)
    {
        echo $row2."<br>";
    }
}

Solution

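My first thought was that you're missing uppercase:

'/[A-Za-z\d._%+-]+@[A-Za-z\d.-]+\.[A-Za-z]{2,4}\b/i'

I put it in everywhere in case you want to match HELLO@GMAIL.COM; you can always lowercase the result afterwards. Note, though, that the /i modifier already makes your original pattern case-insensitive, so the explicit A-Z ranges are redundant and do not explain the missing matches.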

EDIT: I think I was trying to solve this for a different email address, one that wasn't being matched.

EDIT 2: Search the HTML source. The addresses that aren't found have emphasis tags inside them, like example<em>@example.com</em>, so the pattern can't match across the tag.
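
One way to work around the emphasis tags is to strip the markup before running the pattern. The following is only a minimal sketch: the function name extract_emails and the sample HTML are made up for illustration, and it assumes the tags sit inside the address as in the example above. Be aware that strip_tags can also glue unrelated neighbouring text together (for example across table cells), so it may produce false matches on real pages.

```php
<?php
// Hypothetical helper (not part of the original question or answer):
// remove markup first, then run the same email pattern on the plain text.
function extract_emails(string $html): array
{
    // Drop tags such as <em>...</em> so an address split by markup is rejoined.
    $text = strip_tags($html);
    // Decode entities such as &#64; (an encoded "@") back to plain characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Same pattern as in the question; /i already makes it case-insensitive.
    $pattern = '/[a-z\d._%+-]+@[a-z\d.-]+\.[a-z]{2,4}\b/i';
    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full matches; remove duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Sample markup imitating the emphasis described above (invented for this sketch).
$html = '<p>Contact example<em>@example.com</em> or <b>HELLO</b>@GMAIL.COM</p>';
print_r(extract_emails($html));
```

With the sample string, both example@example.com and HELLO@GMAIL.COM should be returned, because the tags are gone by the time the pattern runs.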

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow