Using DOM to scrape the source code returned by file_get_contents

Question

DOMDocument works well for well-formed documents, but not all HTML pages are well formed. Use Simple HTML DOM (http://sourceforge.net/projects/simplehtmldom/) instead. Here is a working solution that extracts the data you requested.

yelp.php

<?php

  ini_set('display_errors', 1);
  error_reporting(E_ALL ^ E_NOTICE);

  /************************************************
  *                                               *
  *    2014.04.28                                 *
  *    Developed by Ben McFarlin at Qeala Labs    *
  *    www.qeala.com                              *
  *                                               *
  ************************************************/

  include_once('simple_html_dom.php');

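  // Fetches a Yelp business page and returns an object whose ->items array
  // holds one entry per review (comment, rating, date).
  function yelp($url){
    print("$url\n");

    $root = new stdClass();
    $items = array();
    $html = file_get_html($url);

    // Skip parsing if the page could not be loaded.
    if($html){

      // Each review sits in its own div.review-wrapper container.
      $containers = $html->find('div.review-list div.review div.review-wrapper');
      foreach($containers as $container){
        $item = new stdClass();

        // The review text, including any inline markup such as <br>.
        $comments = $container->find('div.review-content p.review_comment');
        foreach($comments as $comment){
          $item->comment = $comment->innertext();
        }

        // Rating and date are stored in <meta itemprop="..." content="..."> tags.
        $metas = $container->find('div.review-content meta');
        foreach($metas as $meta){
          $itemprop = $meta->itemprop;
          $content = $meta->content;
          // Any meta tag other than ratingValue is treated as the review date.
          if($itemprop == 'ratingValue') $key = 'rating';
          else $key = 'date';
          $item->$key = $content;
        }
        $items[] = $item;
      }

      // Release the parsed DOM to free memory.
      $html->clear();
      unset($html);
    }

    $root->items = $items;

    return $root;
  }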

  $url = 'http://www.yelp.com/biz/franchino-san-francisco?start=80';
  $root = yelp($url);
  var_dump($root);


?>
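
To show what comes back, here is a minimal sketch of reading the result. It assumes the object has the same shape that yelp() builds above; the sample values are made up, and format_review() is a hypothetical helper, not part of Simple HTML DOM.

```php
<?php
// Placeholder data in the same shape yelp() returns; the values are made up.
$first = new stdClass();
$first->comment = 'Great pasta.<br>Will come back.';
$first->rating  = '5.0';
$first->date    = '2014-04-20';

$root = new stdClass();
$root->items = array($first);

// Hypothetical helper: formats one review item as a single line.
function format_review($item)
{
    // Any of the three properties may be missing if the page markup changes.
    $rating  = isset($item->rating) ? $item->rating : 'n/a';
    $date    = isset($item->date) ? $item->date : 'n/a';
    // The comment was captured as inner HTML, so drop tags before printing.
    $comment = isset($item->comment)
        ? trim(strip_tags(str_replace('<br>', ' ', $item->comment)))
        : '';
    return sprintf('[%s] %s stars: %s', $date, $rating, $comment);
}

foreach ($root->items as $item) {
    echo format_review($item), "\n";
}
```

Checking each property with isset() keeps the loop from raising notices when a review is missing its rating or date.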

Update

I use Firefox with the Firebug extension installed. While viewing the web page, I right-click on the data I want to capture and choose Inspect Element with Firebug. The debug window opens with the HTML element already selected. I right-click on that element and choose Copy CSS Path, which gives the full CSS selector for the element. Normally that selector is far too specific and can be reduced to just a few elements, so I review the HTML structure (already open in the debug window) to determine what I can eliminate. From there it is just a matter of knowing CSS selectors. Hope that helps. It may take some practice, but you will find this technique invaluable for any type of HTML/CSS work.
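
To illustrate the reduction step, here is a rough sketch that checks a long copied path and a shortened selector against the same markup. It uses PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM, only so that it runs without the library; with Simple HTML DOM you would pass the selector strings straight to find(). The sample HTML, the div#wrap element, and the css_to_xpath() helper are all assumptions made up for this example, and the helper only understands tag, tag.class, and tag#id steps.

```php
<?php
// Hypothetical helper: converts a simple descendant selector made of
// "tag", "tag.class" or "tag#id" steps into an XPath expression.
function css_to_xpath(string $css): string
{
    $xpath = '';
    foreach (preg_split('/\s+/', trim($css)) as $step) {
        preg_match('/^([a-z0-9]*)(?:([.#])([\w-]+))?$/i', $step, $m);
        $tag = ($m[1] ?? '') !== '' ? $m[1] : '*';
        $xpath .= '//' . $tag;
        if (!empty($m[2])) {
            $xpath .= $m[2] === '#'
                ? "[@id='{$m[3]}']"
                : "[contains(concat(' ', normalize-space(@class), ' '), ' {$m[3]} ')]";
        }
    }
    return $xpath;
}

// Made-up markup that mimics the nesting used in yelp.php.
$sample = '<html><body><div id="wrap"><div class="review-list"><div class="review">'
        . '<div class="review-wrapper"><div class="review-content">'
        . '<p class="review_comment">Great pasta.</p>'
        . '</div></div></div></div></div></body></html>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($sample);
libxml_clear_errors();
$finder = new DOMXPath($doc);

// The kind of path a "Copy CSS Path" command produces, and a reduced version.
$copied  = 'html body div#wrap div.review-list div.review div.review-wrapper '
         . 'div.review-content p.review_comment';
$reduced = 'div.review-content p.review_comment';

$long  = $finder->query(css_to_xpath($copied));
$short = $finder->query(css_to_xpath($reduced));

// Both selectors should find the same single paragraph.
printf("%d %d %s\n", $long->length, $short->length, $short->item(0)->textContent);
```

If both counts match on the real page, the shorter selector is the better choice, since it keeps working when outer wrappers such as div#wrap are renamed.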

Firefox Web Browser

Firebug Web Development Tool

Learn CSS at W3Schools