Question

I'm having a problem using BeautifulSoup to parse certain information from a web page. After grabbing the HTML with the Mechanize module, I want to extract the data in this snippet:

    <table style="padding-top:10px;">
        <tr><th>ISP:</th><td>Brighthouse Networks</td></tr>
        <tr><th>Services:</th><td><a href="/ip-services">None Detected</a></td></tr>
        <tr><th>City:</th><td>Miami</td></tr>
        <th>Region:</th><td>Florida</td>
        <tr><th>Country:</th><td>United States</td></tr>
    </table>

I want the end result to look like this:

    ISP: Brighthouse Networks
    Services: None Detected
    City: Miami
    Region: Florida
    Country: United States

The parsing section of my script is:

    soup = BeautifulSoup(html)
    table = soup.findall('table',{'style':'padding-top:10px;'})
    for t in table:
        print t.text

However, this does not yield the results listed above. Any help is greatly appreciated. Thanks!


Solution

This works:

    from bs4 import BeautifulSoup

    html = """<table style="padding-top:10px;">
        <tr><th>ISP:</th><td>Brighthouse Networks</td></tr>
        <tr><th>Services:</th><td><a href="/ip-services">None Detected</a></td></tr>
        <tr><th>City:</th><td>Miami</td></tr>
        <th>Region:</th><td>Florida</td>
        <tr><th>Country:</th><td>United States</td></tr>
    </table>"""

    soup = BeautifulSoup(html)
    # findAll returns a list of matches; take the first matching table
    table = soup.findAll('table', {"style":"padding-top:10px;"})[0]

    # Calling a tag is shorthand for find_all, so this collects every <tr>
    trs = table('tr')
    for tr in trs:
        print tr.th.text,
        print tr.td.text

    # The 'Region' cells are not wrapped in a <tr>, so print them by index
    print table("th")[3].text,
    print table("td")[3].text

Output:

    ISP: Brighthouse Networks
    Services: None Detected
    City: Miami
    Country: United States
    Region: Florida
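
Because the Region cells are printed after the loop, they come out last rather than between City and Country. One possible variation, sketched here under a few assumptions, pairs every `<th>` with every `<td>` in document order so the stray Region cells stay in place. It assumes Python 3, the built-in "html.parser" backend (other parsers such as lxml or html5lib may repair the malformed table differently), and exactly one `<td>` per `<th>`. The helper name `table_rows` is made up for this illustration.

```python
from bs4 import BeautifulSoup

HTML = """<table style="padding-top:10px;">
    <tr><th>ISP:</th><td>Brighthouse Networks</td></tr>
    <tr><th>Services:</th><td><a href="/ip-services">None Detected</a></td></tr>
    <tr><th>City:</th><td>Miami</td></tr>
    <th>Region:</th><td>Florida</td>
    <tr><th>Country:</th><td>United States</td></tr>
</table>"""


def table_rows(html):
    """Return (label, value) pairs for the table, in document order."""
    # Assumption: html.parser leaves the orphan <th>/<td> pair where it is.
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"style": "padding-top:10px;"})
    headers = table.find_all("th")
    values = table.find_all("td")
    # zip pairs the n-th header with the n-th value, whether or not
    # the pair is wrapped in a <tr>.
    return [
        (th.get_text(strip=True), td.get_text(strip=True))
        for th, td in zip(headers, values)
    ]


for label, value in table_rows(HTML):
    print(label, value)
```

This relies on the headers and values lining up one to one; a table with merged or missing cells would need a different pairing strategy.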

Other tips

    table = soup.find_all('table', attrs={'style': 'padding-top:10px;'})

should do the trick.

find_all() has the following signature:

    find_all(name, attrs, recursive, text, limit, **kwargs)

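The second argument, attrs, accepts a dictionary of attribute filters, so passing the dictionary positionally or as the attrs keyword argument gives the same result. If you pass a plain string as the second argument, find_all treats it as a CSS class name rather than an attribute filter. The call in the question fails for a different reason: the method is spelled find_all (findAll in older code), not findall.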
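
As a rough illustration of how the second argument is interpreted, the sketch below runs three lookups against a small made-up document. The markup, the `info` class, and the variable names are invented for this example; it assumes Python 3 and the "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one styled table with a class, one plain table.
DOC = (
    '<table style="padding-top:10px;" class="info"><tr><td>x</td></tr></table>'
    '<table><tr><td>y</td></tr></table>'
)
soup = BeautifulSoup(DOC, "html.parser")

style_filter = {"style": "padding-top:10px;"}

# Dictionary passed positionally as the second argument.
by_position = soup.find_all("table", style_filter)

# The same dictionary passed through the attrs keyword.
by_keyword = soup.find_all("table", attrs=style_filter)

# A plain string in the second position is matched against the class attribute.
by_class = soup.find_all("table", "info")

print(len(by_position), len(by_keyword), len(by_class))
```

All three lookups select the first table only, since the second table has neither the style nor the class.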

License: CC-BY-SA with attribution
Not affiliated with StackOverflow