Get the value of the text attribute of each element:
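import urllib2
from bs4 import BeautifulSoup

# Download and parse the page; naming the parser avoids bs4's "no parser specified" warning
soup = BeautifulSoup(urllib2.urlopen("http://www.webmd.com/diet/belly-fat-diet"), "html.parser")

# Collect the text of every <h3> and every <p> element
print([header.text for header in soup.find_all("h3")])
print([p.text for p in soup.find_all("p")])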
Prints:
[u'The Promise', u'Does It Work?', ... ]
[u'Common Conditions', u'Featured Topics', ... ]
Note that the example uses BeautifulSoup 4, which is the version you should use too. BeautifulSoup 3 is no longer developed or maintained.
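If you are on Python 3, where urllib2 no longer exists, the same idea can be tried offline with a rough sketch like the one below. The SAMPLE_HTML string and the extract_texts helper are hypothetical names made up for illustration, and the markup is only a stand-in for a downloaded page, not the real content of the URL above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <h3>The Promise</h3>
  <p>First paragraph.</p>
  <h3>Does It Work?</h3>
  <p>Second <b>paragraph</b>.</p>
</body></html>
"""


def extract_texts(html, tag):
    """Return the text of every `tag` element found in `html` (illustrative helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text() returns the element's text, including text of nested tags
    return [element.get_text() for element in soup.find_all(tag)]


headers = extract_texts(SAMPLE_HTML, "h3")
paragraphs = extract_texts(SAMPLE_HTML, "p")
print(headers)     # ['The Promise', 'Does It Work?']
print(paragraphs)  # ['First paragraph.', 'Second paragraph.']
```

For a live page on Python 3, you could pass the result of urllib.request.urlopen(url).read() to the helper instead of the sample string.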