Question

For example, I have:

<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>

How can I get:

['<a href="http://example2.com" class="banana"><img ... /></a>','<a href="http://google.com">link3</a>']

Solution

You can use the CSS selector a[href] to select only the a tags that have an href attribute:

h = '''
<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>
'''

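from bs4 import BeautifulSoup
# Name the parser explicitly; without it bs4 picks one itself and emits a warning
soup = BeautifulSoup(h, 'html.parser')
# a[href] matches every <a> tag that carries an href attribute
print(soup.select('a[href]'))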

Output:

[<a class="banana" href="http://example.com">link1</a>,
 <a class="banana" href="http://example2.com"><img ...=""/></a>,
 <a href="http://google.com">link3</a>]
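If only the URLs are needed rather than the tags themselves, the same filter can also be written with find_all and href=True. This is a minimal sketch, not part of the original answer; the variable names html, links and urls are just illustrative, and the sample markup is copied from the question.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question
html = '''
<a class="banana" href="http://example.com">link1</a>
<a href="http://example2.com" class="banana"><img ... /></a>
<a class="banana">link2</a>
<a href="http://google.com">link3</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# href=True keeps any <a> tag that has an href attribute, whatever its value
links = soup.find_all('a', href=True)

# Read the attribute value from each matching tag
urls = [a['href'] for a in links]
print(urls)
```

Both select('a[href]') and find_all('a', href=True) skip link2, because that tag has no href attribute at all.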
License: CC-BY-SA with attribution