Beautiful Soup can't get news titles

https://stackoverflow.com/questions/23457657

15-07-2023

Question

from bs4 import BeautifulSoup
import requests
url = "http://www.basketnews.lt/lygos/59-nacionaline-krepsinio-asociacija/2013/naujienos.html"
r = requests.get(url)
soup = BeautifulSoup(r.text)

naujienos = soup.findAll('a', {'class':'title'})

print naujienos

Here is the important part of the HTML:

<div class="title">

    <a href="/news-73147-rockets-veikiausiai-pasiliks-mchalea.html"></a>
    <span class="feedbacks"></span>

</div>

I get an empty list. Where is my mistake?

EDIT:

Thanks, it worked. Now I want to print the news titles. This is how I am trying to do it:

nba = soup.select('div.title > a')

for i in nba:
   print ""+i.string+"\n"

I get at most 5 titles, and then an error occurs: cannot concatenate 'str' and 'NoneType' objects

Solution

soup.findAll('a', {'class':'title'})

This asks for all a tags that themselves have class="title". That is not what you want: in your HTML the class is on the enclosing div, not on the a tag, so nothing matches.
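To see what that call actually matches, here is a minimal sketch on an inline snippet modelled on the HTML from the question. The link text and the second anchor (the one that carries the class itself) are invented for illustration, and `html.parser` is chosen only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Inline HTML modelled on the question's markup. The link text and the
# second <a> (which carries the class itself) are invented for illustration.
html = """
<div class="title">
    <a href="/news-73147-rockets-veikiausiai-pasiliks-mchalea.html">Example headline</a>
    <span class="feedbacks"></span>
</div>
<a class="title" href="/other.html">Anchor that has the class itself</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Matches only <a> tags whose own class attribute contains "title";
# the <a> nested inside <div class="title"> is not returned.
by_class = soup.find_all("a", {"class": "title"})
hrefs = [a["href"] for a in by_class]
print(hrefs)
```

Only the anchor with its own class is returned; the link nested inside the div is skipped, which is why the original page produced an empty list.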

I think you want the a tags that are direct children of a tag with class="title". You can try using a CSS selector:

soup.select('div.title > a')
Out[58]: 
[<a href="/news-73150-blatcheas-garantuoju-kad-laimesime.html">Blatche'as: „Garantuoju, kad laimėsime“</a>,
 <a href="/news-73147-rockets-veikiausiai-pasiliks-mchalea.html">„Rockets“ veikiausiai pasiliks McHale’ą</a>,
# snip lots of other links
]
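As for the follow-up error in the question's EDIT: `.string` returns None when a tag has no children or more than one child, so concatenating it with a str fails. The failing markup on the real page is not shown, so the snippet below is only a sketch under that assumption; the three sample links are made up. It uses `get_text()`, which always returns a str.

```python
from bs4 import BeautifulSoup

# Hypothetical sample links: plain text, text with nested markup, and empty.
html = """
<div class="title"><a href="/a.html">Plain headline</a></div>
<div class="title"><a href="/b.html">Headline with <b>markup</b></a></div>
<div class="title"><a href="/c.html"></a></div>
"""

soup = BeautifulSoup(html, "html.parser")
links = soup.select("div.title > a")

# .string is None when a tag has no children or more than one child.
strings = [a.string for a in links]

# get_text() always returns a str, so it is safe to concatenate or print.
titles = [a.get_text().strip() for a in links]

for title in titles:
    if title:  # skip links that contain no text at all
        print(title)
```

Here the second and third links have `.string` equal to None, while `get_text()` yields the full text for the second and an empty string for the third.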

Licensed under: CC-BY-SA with attribution
