Question

I need to scrape the contents of the <pre> tag from a web page. I am using the preg_match_all function, but it is not working.

The <pre> tag content of the website I am scraping is given below.

<pre># Mon Jul 22 03:10:03 CDT 2013

99.46.177.18
99.27.119.169
99.254.168.132
99.245.96.210
99.245.29.38
99.240.245.97
99.239.100.211
</pre>

PHP file

Updated

$data = file_get_contents('http://www.infiltrated.net/blacklisted');
preg_match_all ("/<pre>([^`]*?)<\/pre>/", $data, $matches);
print_r($matches);
exit;

My PHP file returns an empty array. I know my preg_match_all call is the problem.

How can I get the <pre> tag contents? Please guide me.

Edit Question

I ran @Pieter's script, but it returns only Array().

My script is given below.

    <?php
    $url = 'http://www.infiltrated.net/blacklisted';
    $data = new DOMDocument();
    $data->loadHTML(file_get_contents($url));
    $xpath = new DomXpath($data);

    $pre_tags = array();
    foreach($xpath->query('//pre') as $node){
        $pre_tags[] = $node->nodeValue;
    }

    print_r($pre_tags);
    exit;
    ?>

Solution 2

Finally I got it. The URL http://www.infiltrated.net/blacklisted is served from a plain text file, so the <pre> tags appear only in the page source shown by the browser. So I am using this method:

$array = explode("\n", file_get_contents('http://www.infiltrated.net/blacklisted'));
print_r($array);

Finally, it's working great.
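
The explode() approach above returns every line, including the leading timestamp comment and any blank lines. A minimal sketch of one way to keep only the IP addresses follows. The helper name extract_ips and the inline sample text are assumptions made for illustration; in practice the text would come from file_get_contents() on the blacklist URL.

```php
<?php
/**
 * Hypothetical helper: return only the valid IP addresses
 * found in a plain-text list with one entry per line.
 */
function extract_ips(string $text): array
{
    $ips = array();
    // Split on any line-ending style (\r\n, \r, or \n).
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim($line);
        // Skip blank lines and comment lines such as the "# Mon Jul 22 ..." header.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        // Keep the line only if it is a syntactically valid IPv4/IPv6 address.
        if (filter_var($line, FILTER_VALIDATE_IP) !== false) {
            $ips[] = $line;
        }
    }
    return $ips;
}

// Sample text mirroring the listing in the question (assumed, not fetched).
$text = "# Mon Jul 22 03:10:03 CDT 2013\n\n99.46.177.18\n99.27.119.169\n99.254.168.132\n";
print_r(extract_ips($text));
```

Validating each line with filter_var() also guards against the remote file changing format later, since anything that is not an IP address is simply dropped.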

OTHER TIPS

Use PHP's DOM functions to loop through the document. Using regex patterns to parse HTML tags is strongly discouraged.

Try this code:

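// URL taken from the question
$url = 'http://www.infiltrated.net/blacklisted';

// Parse the fetched markup into a DOM tree
$data = new DOMDocument();
$data->loadHTML(file_get_contents($url));
$xpath = new DomXpath($data);

// Collect the text content of every <pre> element
$pre_tags = array();
foreach($xpath->query('//pre') as $node){
    $pre_tags[] = $node->nodeValue;
}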

Or try PHP Simple HTML DOM Parser, see: http://simplehtmldom.sourceforge.net/
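
The DOM approach can be tried without network access by feeding it inline strings. The sketch below assumes the DOM extension is enabled; the helper name extract_pre_blocks and both sample strings are made up for illustration. It also shows why the XPath query finds nothing when the response is plain text with no <pre> element in it, which matches the empty Array() reported in the question.

```php
<?php
/**
 * Hypothetical helper: collect the text of every <pre> element
 * in an HTML string using DOMDocument and DOMXPath.
 */
function extract_pre_blocks(string $html): array
{
    // loadHTML() does not accept an empty string, so bail out early.
    if (trim($html) === '') {
        return array();
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $blocks = array();
    foreach ($xpath->query('//pre') as $node) {
        $blocks[] = $node->nodeValue;
    }
    return $blocks;
}

// Assumed sample inputs: one real HTML page, one plain-text response.
$withPre = "<html><body><pre># list\n99.46.177.18\n99.27.119.169\n</pre></body></html>";
$plain   = "# list\n99.46.177.18\n99.27.119.169\n";

print_r(extract_pre_blocks($withPre)); // one entry holding the <pre> text
print_r(extract_pre_blocks($plain));   // no <pre> element, so nothing is collected
```

If the second call is the one that matches your response, the content is not HTML and can be split into lines directly.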

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow