What is wrong with the bs4 documentation? I can't run unwrap() sample code
14-07-2021
Question
I'm trying to strip some fussy text out of pages like this one. I want to preserve the anchored links but lose the line breaks and the a.intro element. I thought I could use something like unwrap() to strip off layers, but I'm getting an error: TypeError: 'NoneType' object is not callable
For kicks, I tried running the documentation sample code itself, since I couldn't see how my version differed.
markup = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
soup = BeautifulSoup(markup)
a_tag = soup.a
a_tag.i.unwrap()
a_tag
# <a href="http://example.com/">I linked to example.com</a>
I'm getting the exact same error. What am I missing here? I'm working in ScraperWiki, FWIW.
Solution
Seems to be a ScraperWiki issue. It works fine in the IPython console.
Other answers
I get this error too.
In [27]: type(a_tag.i.unwrap)
Out[27]: NoneType
In [28]: 'unwrap' in dir(a_tag.i)
Out[28]: False
FWIW, replace_with_children yields the same results:
In [29]: type(a_tag.i.replace_with_children)
Out[29]: NoneType
Looks like a bug to me.
In [13]: import BeautifulSoup as Bs
In [16]: Bs.__version__
Out[16]: '3.2.1'
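That version number is the giveaway: the import above pulls in the legacy BeautifulSoup 3 module, which has no unwrap() method, so attribute access falls back to child-tag lookup and returns None. As a sketch (assuming Beautiful Soup 4 is installed as the bs4 package), you can confirm which library you actually have:

    # Sketch: confirm which Beautiful Soup you imported.
    # unwrap() exists only in version 4.x, shipped as the bs4 package;
    # the old BS3 module is imported as "BeautifulSoup" and lacks it.
    import bs4

    print(bs4.__version__)                        # e.g. '4.12.3'
    print(hasattr(bs4.BeautifulSoup, "unwrap"))   # True on any 4.x release

On BS3, a_tag.i.unwrap is not an AttributeError but None, because Tag.__getattr__ treats unknown names as a search for a child tag called "unwrap" — which is exactly why calling it raises "'NoneType' object is not callable".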
I had the same error message with soup.select(). The reason was an old version of the BeautifulSoup4 library. Somebody at ScraperWiki fixed it (see this conversation at the ScraperWiki Google Group).
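Once the up-to-date library is in place, the documentation sample runs as written. A minimal sketch, assuming Beautiful Soup 4 (the bs4 package) is installed and naming a parser explicitly to avoid the "no parser specified" warning:

    # The unwrap() sample from the bs4 documentation, run under BS4.
    from bs4 import BeautifulSoup

    markup = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
    soup = BeautifulSoup(markup, "html.parser")  # name the parser explicitly

    a_tag = soup.a
    a_tag.i.unwrap()  # replaces the <i> tag with its contents
    print(a_tag)
    # <a href="http://example.com/">I linked to example.com</a>

unwrap() removes the tag itself but keeps its children in place, which is exactly the "strip off a layer" behaviour the question is after.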