Question

I've just started learning PHP and I want to scrape a small page, but I can't get it to work. I tried preg_match_all, but it doesn't return the result I want. Basically, I want to scrape only the YouTube video links from https://gdata.youtube.com/feeds/api/standardfeeds/most_shared, collect all of them, and use them later.

I tried the following code, which failed:

<?php
    $data = file_get_contents('https://gdata.youtube.com/feeds/api/standardfeeds/most_shared');
    preg_match_all("/src='(.+?)'>/", $data, $links);
    $link_out = $links[0][0];
    echo $link_out;
?>

I'm new to PHP, so a little help would be appreciated.

Thanks

Solution

As the feed is XML, you can use PHP's SimpleXMLElement to obtain the data.

<?php
$xml = new SimpleXMLElement(
    'https://gdata.youtube.com/feeds/api/standardfeeds/most_shared',
    null,
    true
);

foreach($xml->entry as $entry) {
    echo $entry->content['src'], PHP_EOL;
}

/*
    https://www.youtube.com/v/IjWc43FCYlg?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/Xw1C5T-fH2Y?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/Kq0_dGKx4Os?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/gbcBYs0ljI0?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/78juOpTM3tE?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/OOiZ-5DqwYI?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/zjz614QVyfQ?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/h15m87WsCHQ?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/SXKOTdyOUBg?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/BRAM8MpqIeA?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/5yB3n9fu-rM?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/NAOo9SnzRH8?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/0KtILkzC-1g?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/kWSIFh8ICaA?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/Mi6AhogZCeg?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/kWuIGAZ1x2I?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/lKY5fmDGVLs?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/C94PaCtqOk4?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/V-fL8zopddI?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/UWlzMIl7E48?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/mcw6j-QWGMo?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/-RSDaRttpzk?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/8_RDx4skTp4?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/7YDWdv9kR0M?version=3&f=standard&app=youtube_gdata
    https://www.youtube.com/v/m96tYpEk1Ao?version=3&f=standard&app=youtube_gdata
*/

Anthony.
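
Since the question asks to collect the links and use them later, here is a minimal sketch of the same SimpleXMLElement approach that stores the URLs in an array instead of echoing them. The inline XML is a hypothetical, trimmed-down stand-in for the feed (only two entries, shortened URLs), and the function name collectVideoLinks is made up for this illustration.

```php
<?php
// Hypothetical sample shaped like the feed: an Atom <feed> whose <entry>
// elements each carry a <content> element with a src attribute.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>First video</title>
    <content type="application/x-shockwave-flash" src="https://www.youtube.com/v/IjWc43FCYlg?version=3"/>
  </entry>
  <entry>
    <title>Second video</title>
    <content type="application/x-shockwave-flash" src="https://www.youtube.com/v/Xw1C5T-fH2Y?version=3"/>
  </entry>
</feed>
XML;

// Illustrative helper (not part of the original answer): returns every
// content src attribute found in the given XML string.
function collectVideoLinks(string $xmlString): array
{
    $xml = new SimpleXMLElement($xmlString);
    $links = [];
    foreach ($xml->entry as $entry) {
        // Attributes come back as SimpleXMLElement objects, so cast
        // them to plain strings before storing them.
        $links[] = (string) $entry->content['src'];
    }
    return $links;
}

$links = collectVideoLinks($sample);
print_r($links);
```

The string cast matters if the array is later serialized, compared, or used as array keys, because the uncast values are objects rather than strings.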

OTHER TIPS

Try this preg_match_all pattern:

preg_match_all("/src='([^']+)'/si", $data, $links);

and print the results:

echo "<pre>";
print_r($links);
echo "</pre>";

<?php
$data = file_get_contents('https://gdata.youtube.com/feeds/api/standardfeeds/most_shared');
preg_match_all("/src='(.+?)'\/>/", $data, $links);
print_r($links[1]);

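You forgot to match the / that closes the self-closing content tags (they are not anchor tags), so the original pattern never matched. Also note that the captured URLs are in $links[1]; $links[0] holds the full matches, including src=' and the closing characters.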
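
To see the difference without fetching anything, here is a small sketch that runs both patterns against a hypothetical excerpt shaped like the feed's content tags. The sample string and the variable names are assumptions made for this illustration.

```php
<?php
// Hypothetical excerpt: two self-closing <content> tags, as in the feed.
$data = "<content type='application/x-shockwave-flash' src='https://www.youtube.com/v/IjWc43FCYlg?version=3'/>"
      . "<content type='application/x-shockwave-flash' src='https://www.youtube.com/v/Xw1C5T-fH2Y?version=3'/>";

// The pattern from the question expects '> right after the URL,
// but the tags end in '/>, so nothing matches.
$originalCount = preg_match_all("/src='(.+?)'>/", $data, $originalLinks);

// Accounting for the slash makes the pattern match each tag.
$fixedCount = preg_match_all("/src='(.+?)'\/>/", $data, $links);

$fullMatches = $links[0]; // whole matches, starting with src='
$urls        = $links[1]; // first capture group: only the URLs

echo "Original pattern matches: $originalCount", PHP_EOL;
echo "Fixed pattern matches: $fixedCount", PHP_EOL;
print_r($urls);
```

This also shows why echoing $links[0][0], as in the question, would print the whole matched fragment rather than just the URL even once the pattern matches.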

Licensed under: CC-BY-SA with attribution