Question

I want to get the source of a page that is stored on my local system rather than on the internet.

Example: url = urllib.request.urlopen('c://1.html')



>>> import urllib.request
>>> url=urllib.request.urlopen ('http://google.com')
>>> page =url.read()
>>> page=page.decode()
>>> page

What is my problem?

No correct solution

OTHER TIPS

from os.path import abspath

# Open the local file directly; no network library is needed.
with open(abspath('c:/1.html')) as fh:
    print(fh.read())

url.read() just returns the data as-is, and .decode() only converts the bytes from the socket into a regular string, so why not simply open the file and print its contents?
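If you would still rather go through urllib for a local file, one option is to pass it a file:// URL instead of a bare path. The following is a minimal sketch, not a definitive implementation: the helper name read_local_page is hypothetical, and a temporary file stands in for c:/1.html so the example runs anywhere. It assumes the file is UTF-8 encoded.

```python
import tempfile
import urllib.request
from pathlib import Path


def read_local_page(path):
    """Read a local HTML file through urllib using a file:// URL."""
    # as_uri() requires an absolute path, so resolve it first.
    url = Path(path).resolve().as_uri()
    with urllib.request.urlopen(url) as response:
        # read() returns bytes; decode() turns them into a str.
        return response.read().decode("utf-8")


# Demonstration with a throwaway file standing in for c:/1.html.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "1.html"
    sample.write_text("<html><body>hello</body></html>", encoding="utf-8")
    via_urllib = read_local_page(sample)
    via_open = sample.read_text(encoding="utf-8")
```

Both routes yield the same text, which is why plain open() is usually the simpler choice for a file already on disk.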

urllib is mainly (if not only) a transport for receiving HTML data; it does not parse the content. All it does is connect to the source, separate the headers, and give you the content. If you have already stored the page locally in a file, urllib has no further use to you. Consider looking at an HTML parsing library such as BeautifulSoup.
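To illustrate what "parsing" means once the text is in hand, here is a rough sketch that pulls the title out of an HTML string. Note that it uses the standard library's html.parser rather than BeautifulSoup, purely to avoid a third-party dependency; the class name TitleParser and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>.
        if self.in_title:
            self.title += data


# Sample markup; in practice this would be the text read from the local file.
html_text = "<html><head><title>Local page</title></head><body><p>Hi</p></body></html>"
parser = TitleParser()
parser.feed(html_text)
page_title = parser.title
```

A dedicated library such as BeautifulSoup offers a friendlier interface for the same job, especially with messy real-world markup.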

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow