Question

I am teaching myself object-oriented programming and web parsing in Python. I want to create a class that will parse a web page. I have a problem with my code and a question about it.

I am trying to download a page and parse it with BeautifulSoup. I created a class with a method to download the page, but the page doesn't seem to download and I'm not sure why. If someone could help me with this, that would be great. Here is the code:

from BeautifulSoup import BeautifulSoup
import urllib2

class parser():

    def __init__(self, url):
                self.url = "http://www.any_url"
                self.contents  = ''
                
        def download_page(self):
            
                page=urllib2.urlopen(self.url)
                soup = BeautifulSoup(page.read())

                page_find=soup.findAll()
                print page_find

if __name__ == '__main__':

    parser.download_page
    

Another issue I had was the indentation. Right now, it appears my function download_page exists inside my constructor. I tried to keep my functions separate, but I kept getting indentation errors, so I basically just kept hitting 'tab' until it all compiled. Could someone explain why this is happening? Is it really a problem?

I ask because whenever I look at object-oriented code in Python, the functions are usually indented more evenly.


Solution

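I think the problem is that you aren't using the class correctly. Both methods should sit at the same indentation level inside the class body, and you need an instance to call them on. Try something like: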

class Parser(object):
 
    def __init__(self, url):
        ...

    def download_page(self):
        ...

Then use:

parser = Parser(url) # create instance of the class
parser.download_page() # call instance method
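
With the elided method bodies filled in, the pattern above might look like the following. This is only a sketch under some assumptions: it targets Python 3, so urllib.request stands in for urllib2, it stops at downloading and leaves the BeautifulSoup parsing out, and the data: URL is a made-up stand-in so the example runs without network access.

```python
import urllib.request  # assumption: Python 3, where urllib2 became urllib.request


class Parser(object):
    """Downloads a page and keeps its raw contents."""

    def __init__(self, url):
        # Store the URL that was passed in rather than a hard-coded one.
        self.url = url
        self.contents = ''

    # Same indentation as __init__, so this is a method of the class,
    # not a function nested inside the constructor.
    def download_page(self):
        with urllib.request.urlopen(self.url) as page:
            self.contents = page.read().decode('utf-8', errors='replace')
        return self.contents


if __name__ == '__main__':
    # Hypothetical data: URL used only so the sketch needs no network.
    parser = Parser('data:text/html,%3Cp%3Ehello%3C%2Fp%3E')
    print(parser.download_page())
```

Once contents holds the page text, it can be handed to BeautifulSoup in the same way the question does with page.read().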

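At the moment, you are trying to call download_page on the class, not an instance. Also, because parser.download_page is written without parentheses, the method is only looked up and never actually called.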
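
To make the class-versus-instance difference concrete, here is a small sketch. The download_page body is a placeholder that returns a string instead of fetching anything, so it is an illustration and not the real method.

```python
class Parser(object):

    def __init__(self, url):
        self.url = url

    def download_page(self):
        # Placeholder body: returns a string instead of downloading.
        return 'downloading ' + self.url


# Without parentheses the method is only looked up, never called.
looked_up = Parser.download_page
print(callable(looked_up))

# Calling it on the class supplies no instance for self.
try:
    Parser.download_page()
except TypeError as exc:
    print('TypeError:', exc)

# Calling it on an instance works, because self is filled in.
parser = Parser('http://www.any_url')
print(parser.download_page())
```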

That said, when you have a class with "two methods, one of which is __init__", you should probably stop writing classes: a plain function that takes the URL does the same job with less code.
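
As a rough illustration of that alternative, the same download step can be written as a single function. This sketch again assumes Python 3 (urllib.request in place of urllib2) and uses a made-up data: URL so it runs offline.

```python
import urllib.request  # assumption: Python 3


def download_page(url):
    """Return the decoded body of the page at url."""
    with urllib.request.urlopen(url) as page:
        return page.read().decode('utf-8', errors='replace')


if __name__ == '__main__':
    # Hypothetical data: URL used only so the sketch needs no network.
    print(download_page('data:text/html,%3Cp%3Ehello%3C%2Fp%3E'))
```

There is no instance to create and no state to keep in sync: the URL goes in and the page text comes out.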

Licensed under: CC-BY-SA with attribution