CodeIgniter: a class/library to help get meta tags from a web page?

https://stackoverflow.com/questions/2273555

20-09-2019

Question

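I'm using CodeIgniter, although I don't think it matters which PHP framework I use.

Before I write my own class, has anyone already written one that lets you get the page title and meta tags (keywords, description) of any site, if it has them?

Any PHP class that can do this would be great.

Thanks, everyone.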

Solution

You should take a look at this class: PHP Simple HTML DOM. It works like this:

include('simple_html_dom.php');
$html = file_get_html('http://www.codeigniter.com/');

echo $html->find('title', 0)->innertext; // get <title>

echo "<pre>";
foreach ($html->find('meta') as $element) {
    echo $element->name . " : " . $element->content . '<br>'; // prints every META tag
}

echo "</pre>";

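Other answers

You can use get_meta_tags to get all the meta tags from a remote page - http://ca3.php.net/get_meta_tags

This page has a class for getting the page title and description, and it also uses get_meta_tags - http://www.emirplicanic.com/php/get-remote-page-title-with-php.php

You should be able to combine the two to get everything you need.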
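For illustration, here is a minimal sketch of combining the two ideas: get_meta_tags for the keywords and description, plus a small regular expression for the title. The function name fetch_page_summary and the shape of the returned array are my own assumptions, not taken from the linked class. Passing a URL requires allow_url_fopen to be enabled; a local file path also works.

```php
<?php
/**
 * Hypothetical helper (name and return shape are assumptions).
 * $source may be a URL or a local file path.
 */
function fetch_page_summary(string $source): array
{
    // get_meta_tags() returns an array keyed by the lowercased name attribute,
    // or false if the source could not be opened.
    $meta = get_meta_tags($source);
    if ($meta === false) {
        $meta = [];
    }

    // get_meta_tags() does not return the <title>, so read it separately.
    $title = null;
    $html = file_get_contents($source);
    if ($html !== false && preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $title = trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    }

    return [
        'title'       => $title,
        'keywords'    => $meta['keywords'] ?? null,
        'description' => $meta['description'] ?? null,
    ];
}
```

Note that this fetches the page twice (once per function). If that matters, download the page once and parse the string yourself.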

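Use PHP's cURL library. It can pull other pages from the web and return them as strings, which you can then parse with regular expressions to find the page's title and meta tags.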
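A rough sketch of that approach, assuming the cURL extension is installed. The function names fetch_html and parse_title_and_meta are made up for this example, and the regular expressions are deliberately simple: they will miss unquoted attribute values and break on a ">" inside an attribute, so treat this as a starting point rather than a robust parser.

```php
<?php
// Hypothetical helper: download a page as a string with cURL (null on failure).
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 10,    // seconds
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Hypothetical helper: pull the title and named meta tags out of an HTML string.
function parse_title_and_meta(string $html): array
{
    $result = ['title' => null, 'meta' => []];

    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $result['title'] = trim($m[1]);
    }

    // Find each <meta ...> tag, then read name and content in either order.
    preg_match_all('/<meta\b[^>]*>/i', $html, $tags);
    foreach ($tags[0] as $tag) {
        if (preg_match('/\sname\s*=\s*(["\'])(.*?)\1/is', $tag, $n)
            && preg_match('/\scontent\s*=\s*(["\'])(.*?)\1/is', $tag, $c)) {
            $result['meta'][strtolower($n[2])] = $c[2];
        }
    }

    return $result;
}
```

Usage would look something like $info = parse_title_and_meta(fetch_html('http://www.codeigniter.com/') ?? ''); and then reading $info['title'] and $info['meta']['description'].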

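Take a look at this. It is a generic class for getting a web page's meta tags, and it does more besides. See whether you can add it as a CodeIgniter library. Thanks.

Using DOM/XPath: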

libxml_use_internal_errors(true);
$c = file_get_contents("http://url/here");
$d = new DOMDocument();
$d->loadHTML($c);
$xp = new DOMXPath($d);
foreach ($xp->query("//meta[@name='keywords']") as $el) {
    echo $el->getAttribute("content");
}
foreach ($xp->query("//meta[@name='description']") as $el) {
    echo $el->getAttribute("content");
}
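One thing to watch for with the snippet above: XPath string comparison is case-sensitive, so @name='keywords' will not match a page that writes name="Keywords". Below is a sketch of a variant that lowercases the attribute with translate() and also grabs the title. The function name extract_with_xpath is an assumption of mine, and it takes an HTML string so the download step stays separate.

```php
<?php
// Hypothetical helper: extract title, keywords and description from an HTML string.
function extract_with_xpath(string $html): array
{
    // Collect libxml warnings internally; real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html); // note: PHP 8 throws a ValueError on an empty string
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($doc);
    $title = trim($xp->evaluate('string(//title)'));
    $out = ['title' => $title === '' ? null : $title];

    foreach (['keywords', 'description'] as $name) {
        // translate() lowercases @name so "Keywords" and "keywords" both match.
        $expr = sprintf(
            'string(//meta[translate(@name, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", '
            . '"abcdefghijklmnopqrstuvwxyz")="%s"]/@content)',
            $name
        );
        $value = $xp->evaluate($expr);
        $out[$name] = $value === '' ? null : $value;
    }

    return $out;
}
```

A missing tag comes back as null rather than an empty string, which makes the "if it has them" case from the question easy to check.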

Licensed under: CC-BY-SA with attribution
