Question

I am trying to separate position information and movie title information that I pulled from an html document using BeautifulSoup.

I am pulling the information from lines like this:

<div class="filmo-row even" id="writer-tt1308667">

I want to split the id on "-" so that I get "writer" and "tt1308667" separately.

My code is:

i=0
b = soup.find_all('div')
for row in b:
    Position_ttcode=row.get('id')
    print Position_ttcode
    split=Position_ttcode.split('-')

And I am getting the error:

AttributeError: 'NoneType' object has no attribute 'split' 

What am I missing? Please help!


Solution

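The problem is that not all of the div elements on the page have an id attribute. For those elements row.get('id') returns None, and calling split() on None raises the AttributeError.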
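A minimal sketch of guarding against the missing attribute, assuming the value comes from row.get('id') as in the question. The name split_id is a hypothetical helper, not part of BeautifulSoup, and the sample values in the loop are made up for illustration:

```python
def split_id(raw_id):
    """Split an id such as 'writer-tt1308667' into (position, ttcode).

    Returns None when raw_id is None (the tag had no id attribute)
    or when it contains no '-' separator.
    """
    if raw_id is None:
        return None
    # partition() splits on the first '-' only and never raises.
    position, sep, ttcode = raw_id.partition('-')
    if not sep:
        return None
    return position, ttcode


# Illustrative values: a real id, a missing id, and an unrelated id.
for raw_id in ['writer-tt1308667', None, 'sidebar']:
    print(raw_id, '->', split_id(raw_id))
```

This keeps the original loop over every div, but it is usually cleaner to avoid selecting the irrelevant elements in the first place.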

You should narrow down the search by passing either the class name or the id attribute to find_all():

# Only divs carrying the filmo-row class are returned.
for div in soup.find_all("div", {'class': 'filmo-row'}):
    print(div.get('id'))

Alternatively, you can match only the div elements whose id contains the text writer- by using the re module:

import re

for div in soup.find_all("div", {'id': re.compile('writer-')}):
    print(div.get('id'))
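Putting the pieces together, here is a hedged sketch of the whole task: select only the divs whose id looks like a position followed by a title code, then split each id. The sample markup, the producer-tt0000001 id, and the pattern (which assumes every id has the shape position-tt followed by digits) are assumptions for illustration, not taken from the real page:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the line from the question.
html = """
<div id="header">Filmography</div>
<div class="filmo-row even" id="writer-tt1308667">Some title</div>
<div class="filmo-row odd" id="producer-tt0000001">Another title</div>
<div class="spacer"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Match only ids shaped like "<position>-tt<digits>"; divs without an id,
# or with a differently shaped id, are never returned.
id_pattern = re.compile(r"^(\w+)-(tt\d+)$")

pairs = []
for div in soup.find_all("div", id=id_pattern):
    # Every matched div has an id, so split() is safe here.
    position, ttcode = div["id"].split("-", 1)
    pairs.append((position, ttcode))

print(pairs)
```

Because the filter runs inside find_all(), the loop body never sees a None id, so no extra check is needed there.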

Hope that helps.

Licensed under: CC-BY-SA with attribution