Question

I am experimenting with phpQuery (https://code.google.com/p/phpquery/) to scrape data from my website. I want to extract the meta information from a page.

Here is what I have tried so far:

$html = phpQuery::newDocumentHTML($file, 'utf-8');

$MetaItems = [];
foreach (pq('meta') as $keys) {
    $names = trim(strtolower(pq($keys)->attr('name')));
    if ($names !== null && $names !== '') {
        array_push($MetaItems, $names);
    }
}
            
for ($i=0; $i < count($MetaItems); $i++) {
    $test = 'meta[name="' . $MetaItems[$i] . '"]';
    echo pq($test)->html();
}

In the code above, $MetaItems collects the name attribute of every meta tag. This array is filled correctly.

But selecting each tag and extracting its text does not work. How do I get the above code to work? Thanks.

Solution

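You want an associative array of name => content, correct? A meta element has no inner HTML, so html() returns an empty string; the value you are after is in the content attribute. Try this: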

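$metaItems = array();
foreach (pq('meta') as $meta) {
  $key = pq($meta)->attr('name');
  // Skip tags without a name attribute, such as <meta charset> or http-equiv.
  if ($key === null || $key === '') {
    continue;
  }
  // The value of a meta tag lives in its content attribute.
  $metaItems[$key] = pq($meta)->attr('content');
}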

var_dump($metaItems);
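
For comparison, the same name => content map can be sketched with PHP's bundled DOM extension instead of phpQuery. This is only an illustration under stated assumptions: the helper name extractMetaTags and the sample markup are made up, and the sketch skips meta tags that have no name attribute (such as a charset declaration).

```php
<?php
// Sketch using PHP's bundled DOM extension (not phpQuery).
// extractMetaTags() is a hypothetical helper name used for illustration.
function extractMetaTags(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $metaItems = [];
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $name = trim($meta->getAttribute('name'));
        if ($name === '') {
            continue; // e.g. <meta charset="utf-8"> or http-equiv tags
        }
        // The value is stored in the content attribute, not in inner HTML.
        $metaItems[$name] = $meta->getAttribute('content');
    }
    return $metaItems;
}

// Made-up sample markup.
$sample = '<html><head>'
    . '<meta charset="utf-8">'
    . '<meta name="description" content="Example page">'
    . '<meta name="keywords" content="php, scraping">'
    . '</head><body></body></html>';

print_r(extractMetaTags($sample));
```

If several meta tags share the same name, the last one wins, since array keys are unique.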

Other answers

Assuming the values you collect are meant to match the name attributes you are trying to select exactly: I'm pretty sure the value of the name attribute is case sensitive in the selector. You need to remove the strtolower and the trim from the stored value, since both can change it so that it no longer matches. I would replace the first part with this:

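$html = phpQuery::newDocumentHTML($file, 'utf-8');

$MetaItems = [];
foreach (pq('meta') as $keys) {
    // Keep the name exactly as written so it still matches in a selector.
    $names = pq($keys)->attr('name');
    // trim() is used only to detect blank names; the stored value is unchanged.
    if ($names !== null && trim($names) !== '') {
        array_push($MetaItems, $names);
    }
}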

Hope that helps.
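
The case-sensitivity point can be illustrated without phpQuery, using PHP's bundled DOM extension and an XPath query. This is a sketch on made-up markup; that phpQuery's attribute selectors compare values in the same exact way is an assumption here, not something confirmed above.

```php
<?php
// Sketch with PHP's bundled DOM extension; the markup is a made-up sample.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML(
    '<html><head><meta name="Description" content="Example page"></head>'
    . '<body></body></html>'
);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Looking the tag up with the name exactly as written in the markup.
$exact = $xpath->query('//meta[@name="Description"]');

// Looking it up with a lower-cased name, as strtolower() would produce.
$lowered = $xpath->query('//meta[@name="description"]');

echo "exact-case matches: ", $exact->length, "\n";
echo "lower-cased matches: ", $lowered->length, "\n";
```

XPath compares attribute values as exact strings, so only the first query finds the tag.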

License: CC-BY-SA with attribution
Not affiliated with StackOverflow