Searching a website

https://stackoverflow.com/questions/2297787

21-09-2019
Question

import re
import urllib

# Python 2 script: ask for a search term and list the matching exploit links.
search = raw_input('[!]Search: ')
site = "http://www.exploit-db.com/list.php?description=" + search + "&author=&platform=&type=&port=&osvdb=&cve="
print site
source = urllib.urlopen(site).read()
# Each result is linked as href='/exploits/<id>'; a raw string keeps \d literal.
founds = re.findall(r"href='/exploits/\d+", source)
print "\n[+]Search", len(founds), "Results\n"
if len(founds) >= 1:
    for found in founds:
        # Strip the attribute prefix so only the path remains.
        found = found.replace("href='", "")
        print "http://www.exploit-db.com" + found
else:
    print "\nCouldn't find anything with your search\n"

When I search the exploit-db.com site, I only get 25 results back. How can I move on to the other pages, or get past the 25-result limit?

Solution

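The paging query is easy to work out by visiting the site and looking at the URLs as you page through the results by hand: just put page=1& right after the ? in the URL to see the second page of results, or page=2& to see the third page, and so on.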

How is this a Python question? It is a (very basic!) "screen scraping" question.
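The paging parameter described above could be built into the URL along these lines. This is only a sketch: it is written for Python 3 (the question's script is Python 2), `build_search_url` is a hypothetical helper name, and the base URL and parameter names are copied from the question on the assumption that the site still accepts them.

```python
from urllib.parse import urlencode

# Base URL taken from the question; assumed, not verified against the live site.
BASE_URL = "http://www.exploit-db.com/list.php"


def build_search_url(description, page=0):
    """Return the search URL for one result page (hypothetical helper).

    Per the answers, page=1 is the second page, so page=0 is assumed
    to be the first.
    """
    params = [
        ("page", page),                # placed right after the "?"
        ("description", description),  # urlencode escapes spaces etc.
        ("author", ""),
        ("platform", ""),
        ("type", ""),
        ("port", ""),
        ("osvdb", ""),
        ("cve", ""),
    ]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # URL for the second page of results for "apache".
    print(build_search_url("apache", page=1))
```

Using `urlencode` rather than string concatenation also escapes search terms that contain spaces or other reserved characters.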

Other tips

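Apparently the exploit-db.com site does not allow a larger page size. You therefore need to page through the result list "manually", by repeating the urllib.urlopen() call to fetch each subsequent page. The URL is the same as the one used initially, plus a &page=n parameter. Note that n appears to start from 0 (i.e. &page=1 returns the second page).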
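A minimal sketch of that loop, written for Python 3 (where `urllib.urlopen` lives at `urllib.request.urlopen`). The names `collect_links`, `fetch` and `max_pages` are hypothetical, and the stop condition is an assumption: the loop ends at the first page that contains no result links, which may not match how the site actually behaves past the last page.

```python
import re
from urllib.parse import quote_plus
from urllib.request import urlopen

# URL layout from the question, plus the &page=n parameter from the answer.
SEARCH_URL = ("http://www.exploit-db.com/list.php?description={query}"
              "&author=&platform=&type=&port=&osvdb=&cve=&page={page}")
# Same pattern as the question, with the path captured as a group.
LINK_RE = re.compile(r"href='(/exploits/\d+)")


def fetch(url):
    """Download one page and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def collect_links(query, fetch_page=fetch, max_pages=50):
    """Gather exploit links page by page, starting at page 0.

    Stops at the first page without matches (an assumed end-of-results
    signal) or after max_pages requests, whichever comes first.
    """
    links = []
    for page in range(max_pages):
        url = SEARCH_URL.format(query=quote_plus(query), page=page)
        found = LINK_RE.findall(fetch_page(url))
        if not found:
            break
        links.extend("http://www.exploit-db.com" + path for path in found)
    return links
```

Calling `collect_links("apache")` would then return the links from every page rather than only the first 25. The `fetch_page` argument exists so the loop can be tried against canned HTML without touching the network, and `max_pages` guards against looping forever if the site keeps returning results.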

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow