Scraping all images from a website using DOMDocument

https://stackoverflow.com/questions/15895773

02-04-2022

Question

I want to get ALL the images from any website using DOMDocument, but I can't even load my HTML, for reasons I haven't figured out yet.

$url="http://<any_url_here>/";
$dom = new DOMDocument();
@$dom->loadHTML($url); //i have also tried removing @
$dom->preserveWhiteSpace = false;
$dom->saveHTML();
$images = $dom->getElementsByTagName('img');
foreach ($images as $image)
{
    echo $image->getAttribute('src');
}

Nothing gets printed. Did I do something wrong in the code?

Solution

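You don't get a result because $dom->loadHTML() expects a string of HTML, and you are giving it a URL. It parses the URL text itself as markup, which contains no img tags, so the loop has nothing to print. You first need to fetch the HTML of the page you want to parse. You can use file_get_contents() for that.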

I used this in my image grab class. Works fine for me.

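// Fetch the page source first; loadHTML() parses markup, it does not download URLs.
$html = file_get_contents('http://www.google.com/');
if ($html === false) {
    die('Could not fetch the page');
}
libxml_use_internal_errors(true); // real pages are rarely valid HTML; collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // parse-time option, so set it before loading
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    echo $image->getAttribute('src'), PHP_EOL;
}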
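
If you want to check the parsing step on its own, without depending on the network, something like the following might help. It is only a sketch: the function name extractImageSources() and the $sample markup are made up for illustration and are not part of the original answer. It wraps the same loadHTML() / getElementsByTagName() calls in a function that takes an HTML string and returns the src values, skipping img tags that have no src.

```php
<?php
// Hypothetical helper (name chosen for this example): collects the src
// attribute of every <img> element found in an HTML string.
function extractImageSources(string $html): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return [];
    }

    // Collect libxml warnings internally instead of printing them;
    // remember the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        return [];
    }

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $image) {
        // getAttribute() returns an empty string when the attribute is missing.
        $src = $image->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Sample markup standing in for a downloaded page (assumption for the demo).
$sample = '<html><body><img src="/logo.png"><p>text</p>'
        . '<img src="http://example.com/a.jpg" alt="a"><img alt="no src"></body></html>';

print_r(extractImageSources($sample));
```

With a real page you would pass in the string returned by file_get_contents(). Note that src values can be relative (like /logo.png above), so you may still need to resolve them against the page URL before downloading anything.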

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow