Question

I wrote this script to download the lyrics for my songs and store them in a text file:

>>> lis = os.listdir('D:\Phone\Sounds')
>>> for i in lis:
    print i

    br.open('http://www.azlyrics.com/') # THE PROBLEM

    br.select_form(nr=0)
    track = eyed3.load(i).tag
    if(track.artist != None):
        ft = track.artist.find('ft.')
        if(ft != -1):
            br['q'] = track.title + ' ' + track.artist[:ft]
        else:
            br['q'] = track.title + ' ' + track.artist
    else:
        br['q'] = track.title
    br.submit()
    s = BeautifulSoup(br.response().read())
    a = s.find('div',{'class':'sen'})
    if(a != None):
        s = BeautifulSoup(urllib.urlopen(a.find('a')['href']))
        file = open(i.replace('.mp3','.txt'),'w')
        file.write(str(s.find('div',{'style':'margin-left:10px;margin-right:10px;'})).replace('<br />','\n'))
        file.close()
    else:
        print 'Lyrics not found'

This seems to work for a while: I downloaded the lyrics for some songs, and then it suddenly raised a BadStatusLine error:

Heartbreaker.mp3
<response_seek_wrapper at 0x4af6f08L whose wrapped object = <closeable_response at 0x4cb9288L whose fp = <socket._fileobject object at 0x00000000047A2480>>>
<response_seek_wrapper at 0x4b1b888L whose wrapped object = <closeable_response at 0x4cc0048L whose fp = <socket._fileobject object at 0x00000000047A2570>>>
Heartless (The Fray Cover).mp3
<response_seek_wrapper at 0x4b22d08L whose wrapped object = <closeable_response at 0x4b15988L whose fp = <socket._fileobject object at 0x00000000047B2750>>>
<response_seek_wrapper at 0x4cb9388L whose wrapped object = <closeable_response at 0x4b1b448L whose fp = <socket._fileobject object at 0x000000000362AED0>>>
Lyrics not found
Heartless.mp3
<response_seek_wrapper at 0x4cc0288L whose wrapped object = <closeable_response at 0x4b01108L whose fp = <socket._fileobject object at 0x000000000362AE58>>>
<response_seek_wrapper at 0x4b15808L whose wrapped object = <closeable_response at 0x47a4508L whose fp = <socket._fileobject object at 0x000000000362A6D8>>>
Here Without You.mp3
<response_seek_wrapper at 0x4b1b3c8L whose wrapped object = <closeable_response at 0x4916508L whose fp = <socket._fileobject object at 0x000000000362A480>>>
<response_seek_wrapper at 0x47a4fc8L whose wrapped object = <closeable_response at 0x37830c8L whose fp = <socket._fileobject object at 0x000000000362A0C0>>>
Hero.mp3
<response_seek_wrapper at 0x4930408L whose wrapped object = <closeable_response at 0x4cced48L whose fp = <socket._fileobject object at 0x00000000047A2228>>>
<response_seek_wrapper at 0x453ca48L whose wrapped object = <closeable_response at 0x4b23f88L whose fp = <socket._fileobject object at 0x00000000047A2048>>>
Hey Jude.mp3
<response_seek_wrapper at 0x3783808L whose wrapped object = <closeable_response at 0x4cd71c8L whose fp = <socket._fileobject object at 0x00000000047A2A20>>>
<response_seek_wrapper at 0x4ccee48L whose wrapped object = <closeable_response at 0x4cd7c08L whose fp = <socket._fileobject object at 0x00000000047A2B10>>>
Hey, Soul Sister.mp3

Traceback (most recent call last):
  File "<pyshell#23>", line 3, in <module>
    br.open('http://www.azlyrics.com/')
  File "build\bdist.win-amd64\egg\mechanize\_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "build\bdist.win-amd64\egg\mechanize\_mechanize.py", line 230, in _mech_open
    response = UserAgentBase.open(self, request, data)
  File "build\bdist.win-amd64\egg\mechanize\_opener.py", line 193, in open
    response = urlopen(self, req, data)
  File "build\bdist.win-amd64\egg\mechanize\_urllib2_fork.py", line 344, in _open
    '_open', req)
  File "build\bdist.win-amd64\egg\mechanize\_urllib2_fork.py", line 332, in _call_chain
    result = func(*args)
  File "build\bdist.win-amd64\egg\mechanize\_urllib2_fork.py", line 1142, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "build\bdist.win-amd64\egg\mechanize\_urllib2_fork.py", line 1116, in do_open
    r = h.getresponse()
  File "D:\Programming\Python\lib\httplib.py", line 1027, in getresponse
    response.begin()
  File "D:\Programming\Python\lib\httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "D:\Programming\Python\lib\httplib.py", line 371, in _read_status
    raise BadStatusLine(line)
BadStatusLine: ''

So why does br.open suddenly stop working? Thanks in advance.

Solution

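httplib raises this error when it cannot parse the status line of the response. In your traceback the status line is empty (BadStatusLine: ''), which means the connection was closed before any response was received. Quote from the docs: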

A subclass of HTTPException. Raised if a server responds with a HTTP status code that we don’t understand.
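If the failure is intermittent, one option is to catch the exception and retry after a pause. The following is only a minimal sketch, not something taken from the question or from mechanize: open_with_retry, attempts, delay and sleep are hypothetical names, and whether a retry helps depends on why the connection was dropped.

```python
import time

try:
    from http.client import BadStatusLine   # Python 3
except ImportError:
    from httplib import BadStatusLine        # Python 2, as in the question


def open_with_retry(open_func, url, attempts=3, delay=5.0, sleep=time.sleep):
    """Call open_func(url), retrying when no valid status line comes back.

    All parameter names here are assumptions made for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            return open_func(url)
        except BadStatusLine:
            if attempt == attempts:
                raise                    # give up after the last attempt
            sleep(delay * attempt)       # wait a little longer each time
```

In the loop from the question this could be called as open_with_retry(br.open, 'http://www.azlyrics.com/') in place of the direct br.open call.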

I haven't received any errors while running br.open('http://www.azlyrics.com/'). So, the problem is on your side.

Most likely you are using a proxy; take a look at Python's mechanize proxy support.
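A quick way to check the proxy theory is to ask the standard library which proxy settings it detects on your machine. This is a rough sketch: detected_proxies is a hypothetical helper name, and it is an assumption that mechanize picks up the same settings, so treat the result as a hint rather than proof.

```python
try:
    from urllib.request import getproxies   # Python 3
except ImportError:
    from urllib import getproxies            # Python 2, as in the question


def detected_proxies():
    """Return the proxy settings Python finds in the environment or OS."""
    return dict(getproxies())


if __name__ == '__main__':
    proxies = detected_proxies()
    if proxies:
        # One line per scheme, e.g. the 'http' entry and its proxy address
        for scheme, address in sorted(proxies.items()):
            print('%s -> %s' % (scheme, address))
    else:
        print('no proxy configured')
```

An empty result suggests that no system-wide proxy is configured, in which case the cause probably lies elsewhere.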

UPD: Try this:

import mechanize

br = mechanize.Browser()
# Send a browser-like User-Agent header with every request
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

# Turn on debug output for HTTP traffic, redirects and responses
br.set_debug_http(True)
br.set_debug_redirects(True)
br.set_debug_responses(True)

br.open('http://www.azlyrics.com')

print br.response().read()

Hope that helps.

Licensed under: CC-BY-SA with attribution