HTML DOM: scrape only the largest image
12-11-2019
Question
I have a bookmarklet which looks at a page and extracts all the images for the user to see.
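include('simple_html_dom.php');
function getUrlAddress()
{
    /*** check whether HTTPS is on (the key may be unset on plain HTTP) ***/
    $scheme = (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') ? 'https' : 'http';
    /*** return the full address ***/
    return $scheme . '://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
}
/*** example usage ***/
$url = getUrlAddress(); // keep the result; the function's own variable is local to it
echo $url;
$html = file_get_html($url);
foreach ($html->find('img') as $e) {
    /*** quote and escape the src so unusual URLs cannot break the markup ***/
    echo '<img src="' . htmlspecialchars($e->src) . '"><br>';
}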
Now, most of the time the user will click the bookmarklet on a page showing a particular product, maybe on eBay or Amazon. Ideally, I want to show the actual product image rather than every logo, button and so on. But how?
I understand they don't wrap them in tags like so is there another way to do it?
Maybe by size? Either the image dimensions in px, the file size, or both? (Would this even be indicative? It's a bit of an assumption.)
Two examples so you can see what I mean, if you use the above code (you'll obv have to get simple_html_dom.php)
UPDATE
I've found that Amazon actually does something similar. It can never be perfect, because you're relying on every developer writing their markup the same way, and that isn't going to happen. Still, this is the closest to the functionality I need: it doesn't just scrape the largest image, it seems to scrape only the images relevant to the item. Clever stuff?
Solution
It looks like eBay uses id="i_vv4-35" and Amazon has onclick="openImmersiveView(event)"
Try doing something like:
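// inside the foreach loop; $site is assumed to hold the site name detected from the URL
if ($site == 'eBay' && $e->id == 'i_vv4-35') { $image = $e; }
if ($site == 'Amazon' && $e->onclick == 'openImmersiveView(event)') { $image = $e; }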
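To show how the per-site check could fit together, here is a minimal sketch. It uses PHP's built-in DOMDocument instead of simple_html_dom so it runs without the extra file. The rule table, detectSite() and findProductImage() are hypothetical names made up for illustration, and the two attribute values are just the ones quoted above, which the sites can change at any time.

```php
<?php
// Sketch: pick the product image using a per-site attribute rule.
// DOMDocument stands in for simple_html_dom here (an assumption for portability).

/** Hypothetical rule table: site name => [attribute, expected value]. */
const PRODUCT_IMAGE_RULES = [
    'eBay'   => ['id', 'i_vv4-35'],
    'Amazon' => ['onclick', 'openImmersiveView(event)'],
];

/** Guess the site from the URL's host; returns null for unknown sites. */
function detectSite(string $url): ?string
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));
    if (strpos($host, 'ebay.') !== false) {
        return 'eBay';
    }
    if (strpos($host, 'amazon.') !== false) {
        return 'Amazon';
    }
    return null;
}

/** Return the src of the first <img> matching the site's rule, or null. */
function findProductImage(string $html, ?string $site): ?string
{
    $rule = ($site === null) ? null : (PRODUCT_IMAGE_RULES[$site] ?? null);
    if ($rule === null || trim($html) === '') {
        return null;
    }
    [$attr, $value] = $rule;

    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->getAttribute($attr) === $value) {
            return $img->getAttribute('src');
        }
    }
    return null;
}
```

Usage would be something like findProductImage($pageHtml, detectSite($pageUrl)), falling back to showing every image when it returns null. Adding another site means adding one row to the table.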
OTHER TIPS
foreach ($html->find('img') as $e) {
    // $e converts to its full <img> markup, so this matches the size token anywhere in the tag
    $tag = (string) $e;
    if (strpos($tag, 'SX300') !== false || strpos($tag, 'SY300') !== false) {
        $image = $e;
    }
}
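The "maybe by size?" idea from the question can also be tried directly. Below is a rough sketch that picks the img with the largest declared area (width times height attributes). It uses PHP's built-in DOMDocument rather than simple_html_dom so it is self-contained. findLargestImage() and its $minArea parameter are hypothetical names, and the heuristic assumes the page actually declares width and height on its images, which many pages do not.

```php
<?php
// Sketch: choose the <img> with the largest declared area (width * height).
// Images without numeric width/height attributes get an area of 0 and are skipped.

/** Return the src of the largest declared image, or null if none qualifies. */
function findLargestImage(string $html, int $minArea = 1): ?string
{
    if (trim($html) === '') {
        return null;
    }

    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $bestSrc = null;
    $bestArea = 0;
    foreach ($doc->getElementsByTagName('img') as $img) {
        // The (int) cast reads "300" or "300px" as 300 and a missing attribute as 0.
        $area = (int) $img->getAttribute('width') * (int) $img->getAttribute('height');
        if ($area >= $minArea && $area > $bestArea) {
            $bestArea = $area;
            $bestSrc = $img->getAttribute('src');
        }
    }
    return $bestSrc;
}
```

Passing a threshold such as findLargestImage($pageHtml, 10000) would ignore small icons and buttons. When the attributes are missing, one possible fallback is calling getimagesize() on each image URL, at the cost of one extra request per image.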