Question

I'm learning Python and trying to parse data using BeautifulSoup. I want it to print the IPv4 address rather than the IPv6 address from a whatsmyip website. I can't figure out why it picks up the IPv6 address instead of the IPv4 one, when the IPv4 address is the first occurrence in the HTML tags. I'd appreciate any help with this.

import urllib2
from bs4 import BeautifulSoup

page = urllib2.urlopen("http://www.whatsmyip.net")
pagehtml = page.read()
page.close()

soup = BeautifulSoup(pagehtml)

data = soup.find_all("input")
for input in data:
    ip = input.get('value')
print ip

Solution

This happens because the IPv6 address is in the last <input> element found. You iterate over all <input> elements, overwriting ip on each pass, and print it only after the loop has finished, so ip holds the value from the last element.
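The overwriting can be seen without the network or BeautifulSoup. This is only a minimal sketch: the list `values` stands in for the 'value' attributes of the <input> tags, and both addresses are made-up documentation examples, not output from the real site.

```python
# Stand-in for the 'value' attributes of the <input> tags, in page order.
# Both addresses are illustrative assumptions, not real scraped data.
values = ["203.0.113.7", "2001:db8::1"]

ip = None
for value in values:
    ip = value          # reassigned on every pass through the loop

last_seen = ip          # what a print placed after the loop shows
first_seen = values[0]  # what indexing the first element shows

print(last_seen)
print(first_seen)
```

The loop itself is fine; the issue is that the print runs once, after the final reassignment.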

To print the value of the first <input> element instead, index it directly:

print data[0].get('value') 
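Indexing the first element relies on the page always listing the IPv4 address first. If that ordering ever changes, one possible alternative is to check each value and keep the first one that parses as IPv4. The sketch below assumes Python 3 and its standard `ipaddress` module; the helper name `first_ipv4` is hypothetical and not part of BeautifulSoup.

```python
import ipaddress


def first_ipv4(values):
    """Return the first string in values that parses as an IPv4 address, else None."""
    for value in values:
        if not value:
            continue  # tag.get('value') gives None when the attribute is missing
        try:
            address = ipaddress.ip_address(value.strip())
        except ValueError:
            continue  # some other form field, not an IP address
        if address.version == 4:
            return str(address)
    return None
```

With the `data` list from the question, it could be called as `first_ipv4(tag.get('value') for tag in data)`.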
Licensed under: CC-BY-SA with attribution