Question

I have:

try:
    page = requests.get(Scrape.site_to_scrape['git']+gitUser)
    tree = urllib.urlopen(page).read()
    soup = BS(response)
    parse_git_full_name = soup.find("span", {"class":"vcard-fullname"}).get_text()
    return parse_git_full_name

except:
    print "Syntax: python site_scrape.py -g <git user name here>"

But it keeps falling into the except: block.

I'm trying to parse an element like:

<span class="vcard-fullname" itemprop="name">The name</span>

I'm trying to get the text between the <span> tags.

Solution

I resolved this by using XPath with a single selector instead. Hopefully this helps someone else pulling their hair out over BeautifulSoup selectors.

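import requests
from lxml import html

try:
    # Scrape.site_to_scrape and gitUser are defined elsewhere in the script
    page = requests.get(Scrape.site_to_scrape['git'] + gitUser)
    # Parse the response body (a string), not the Response object itself
    tree = html.fromstring(page.text)

    # xpath() returns a list of matching text nodes, not a single string
    full_name = tree.xpath('//span[@class="vcard-fullname"]/text()')

    print 'Full Name: ', full_name

except:
    print "Syntax: python site_scrape.py -g <git user name here>"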
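
A note on why the snippet in the question always ended up in the except: block: a bare except: catches every exception, including ones caused by mistakes in the code itself, so an exception raised before soup.find ever runs (for one, response is never defined) gets reported as a usage problem. Below is a minimal sketch of how to surface the real error instead. It should run on Python 2 or 3, and run_and_report and broken_step are hypothetical names made up for this illustration, not part of the original script.

```python
def run_and_report(step):
    """Run `step` and return (result, error) instead of hiding the error."""
    try:
        return step(), None
    except Exception as exc:
        # Still a broad catch, but the exception type and message are kept
        # so the real cause can be shown to the user.
        return None, "{}: {}".format(type(exc).__name__, exc)


def broken_step():
    # Hypothetical stand-in for the question's bug: `response` is never defined.
    return len(response)  # noqa: F821


result, error = run_and_report(broken_step)
print(error)  # the message starts with "NameError", not a usage hint
```

In the actual script, the same idea would be to catch specific exceptions (or at least print the caught exception) rather than printing the usage message for every failure.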
Licensed under: CC-BY-SA with attribution