You can get both spans with the metrics-authority class: the first one is the Domain Authority, the second one is the Page Authority. Additionally, you can get Root Domains from the div with id="metrics-page-link-metrics":
import urllib2
from lxml import html

# Fetch and parse the page
tree = html.parse(urllib2.urlopen('http://www.opensiteexplorer.org/links?site=www.google.com'))

# The first span is Domain Authority, the second is Page Authority
spans = tree.xpath('//span[@class="metrics-authority"]')
data = [item.text.strip() for item in spans]
print "Domain Authority: {0}, Page Authority: {1}".format(*data)

# Root Domains is the second "has-tooltip" div inside the link metrics block
div = tree.xpath('//div[@id="metrics-page-link-metrics"]//div[@class="has-tooltip"]')[1]
print "Root Domains: {0}".format(div.text.strip())
prints:
Domain Authority: 100, Page Authority: 97
Root Domains: 680
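The code above is Python 2 (urllib2 and the print statement). If you are on Python 3, a rough equivalent of the same XPath lookups might look like the sketch below. To keep it runnable offline, it parses a small hypothetical HTML snippet (SAMPLE) that only mimics the structure described above; the real page markup may differ or may have changed. The extract_metrics helper and the "1,234" placeholder value are made up for illustration, and lxml is assumed to be installed.

```python
from lxml import html

# Hypothetical stand-in markup that mirrors the structure the answer relies on.
# The live page may look different.
SAMPLE = """
<html><body>
  <span class="metrics-authority"> 100 </span>
  <span class="metrics-authority"> 97 </span>
  <div id="metrics-page-link-metrics">
    <div class="has-tooltip"> 1,234 </div>
    <div class="has-tooltip"> 680 </div>
  </div>
</body></html>
"""


def extract_metrics(markup):
    """Return the three metrics as stripped strings (illustrative helper)."""
    tree = html.fromstring(markup)
    spans = tree.xpath('//span[@class="metrics-authority"]')
    tooltips = tree.xpath(
        '//div[@id="metrics-page-link-metrics"]//div[@class="has-tooltip"]'
    )
    # Fail loudly instead of raising an IndexError if the layout is not as assumed.
    if len(spans) < 2 or len(tooltips) < 2:
        raise ValueError("expected elements not found; the page layout may differ")
    return {
        "domain_authority": spans[0].text_content().strip(),
        "page_authority": spans[1].text_content().strip(),
        # As in the original code, Root Domains is the second has-tooltip div.
        "root_domains": tooltips[1].text_content().strip(),
    }


metrics = extract_metrics(SAMPLE)
print("Domain Authority: {domain_authority}, Page Authority: {page_authority}".format(**metrics))
print("Root Domains: {root_domains}".format(**metrics))
```

To run it against the live page, you could fetch the HTML with urllib.request.urlopen(url).read() and pass the result to extract_metrics instead of SAMPLE.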
Hope that helps.