Question

I need to extract data from a web page. It contains review comments, review headlines, the number of people who found each review useful, and a star rating, and I need to extract all of them.

  1. The problem I face now is that I can retrieve only the review comment, and only the first one on the page (it doesn't move on to the next review comment).

  2. I can't retrieve the review headline, because each one has a different object ID in the HTML.

For example (can I use a regex for the object ID in this case?):

<a href="/review/www.currys.co.uk/5370859f00006400028963d9">Customer services what a load of cp</a>

Also, I don't know how to get the number of reviews found useful or the rating (1 to 5), as they are shown as icons.

My code:

$url = "https://www.trustpilot.co.uk/review/www.currys.co.uk";
$html = file_get_contents( $url);
libxml_use_internal_errors( true);
$doc = new DOMDocument; $doc->loadHTML( $html);
$xpath = new DOMXpath( $doc);
$node = $xpath->query( '//div[@itemprop="reviewBody"][@class="review-body"]')->item( 0);
echo $node->textContent;

Solution

It shows only the first review because you select only ->item( 0); you need to loop through all of the matched nodes. To print the text inside an element, read its nodeValue property (textContent is also available on DOM nodes and gives the same text for elements).
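
As a minimal sketch of the looping step, the snippet below iterates over every matched node instead of taking only item( 0). The inline $html string is a hypothetical stand-in for the downloaded page, so the example runs without a network request; the XPath expression is the one from the question.

```php
<?php
// Hypothetical sample markup standing in for the real page (assumption).
$html = '<div itemprop="reviewBody" class="review-body">First review</div>'
      . '<div itemprop="reviewBody" class="review-body">Second review</div>';

libxml_use_internal_errors(true); // suppress warnings from imperfect HTML
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// DOMNodeList is traversable, so foreach visits every match, not just the first.
$bodies = [];
foreach ($xpath->query('//div[@itemprop="reviewBody"][@class="review-body"]') as $node) {
    $bodies[] = trim($node->nodeValue);
}

print_r($bodies);
```
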

The following code prints the first 10 reviews in a table, with the rating (as stars), headline, and content:

$url = "https://www.trustpilot.co.uk/review/www.currys.co.uk";
$html = file_get_contents($url);
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// get all ratings where <meta itemprop="ratingValue">
$ratings = $xpath->query('//meta[@itemprop="ratingValue"]');
// get all headings where <h3 class="review-title en h4">
$headings = $xpath->query('//h3[@class="review-title en h4"]');
// get all review bodies
$node = $xpath->query('//div[@itemprop="reviewBody"][@class="review-body"]');

// show at most 10 reviews and never read past the shortest list
$count = min(10, $ratings->length, $headings->length, $node->length);

$table = "<table border=1>";
for ($i = 0; $i < $count; $i++) {
    // the rating is stored as a string in the content attribute
    $stars = str_repeat("*", (int) $ratings->item($i)->getAttribute('content'));
    $table .= '<tr>
           <td>Star: ' . $stars . '</td>
           <td>' . htmlspecialchars($headings->item($i)->nodeValue) . '</td>
           <td>' . htmlspecialchars($node->item($i)->nodeValue) . '</td>
           </tr>';
}
$table .= '</table>';
echo $table;
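
The code above pairs the three lists by index, which can misalign if a review is missing one of the fields. One possible alternative, sketched here under assumptions, is to query each field relative to its own review element. The wrapper `<div itemprop="review">`, the sample markup, and the second review's ID are hypothetical and not taken from the real page. The headline link is matched with XPath's starts-with() on the href prefix, which avoids needing a regex for the changing object ID.

```php
<?php
// Hypothetical markup: the wrapper element and layout are assumptions, not the real page.
$html = <<<HTML
<div itemprop="review">
  <meta itemprop="ratingValue" content="1">
  <h3><a href="/review/www.currys.co.uk/5370859f00006400028963d9">Customer services what a load of cp</a></h3>
  <div itemprop="reviewBody" class="review-body">Never again.</div>
</div>
<div itemprop="review">
  <meta itemprop="ratingValue" content="4">
  <h3><a href="/review/www.currys.co.uk/aaaa">Quick delivery</a></h3>
  <div itemprop="reviewBody" class="review-body">Arrived on time.</div>
</div>
HTML;

libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$reviews = [];
foreach ($xpath->query('//div[@itemprop="review"]') as $review) {
    // A leading "." makes each query relative to the current review node.
    $rating = $xpath->query('.//meta[@itemprop="ratingValue"]', $review)->item(0);
    $link   = $xpath->query('.//a[starts-with(@href, "/review/www.currys.co.uk/")]', $review)->item(0);
    $body   = $xpath->query('.//div[@itemprop="reviewBody"]', $review)->item(0);

    $reviews[] = [
        'rating'   => $rating ? (int) $rating->getAttribute('content') : null,
        'headline' => $link ? trim($link->nodeValue) : null,
        'body'     => $body ? trim($body->nodeValue) : null,
    ];
}

print_r($reviews);
```

A missing field shows up as null for that review only, rather than shifting every later row.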
Licensed under: CC-BY-SA with attribution