Question

Dear all, I want to get the source of a page, not from the internet but from a file on my local system.

Example: url = urllib.request.urlopen('c://1.html')



>>> import urllib.request
>>> url = urllib.request.urlopen('http://google.com')
>>> page = url.read()
>>> page = page.decode()
>>> page

What am I doing wrong?

No correct solution

Other suggestions

from os.path import abspath

# Open the local file directly; no network call is needed for a file on disk.
with open(abspath('c:/1.html')) as fh:
    print(fh.read())

Since url.read() just returns the data as-is, and .decode() does nothing more than convert the bytes from the socket into a regular string, why not simply print the file contents?
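If the rest of the code has to go through urlopen anyway, a local file can also be addressed with a file:// URL instead of a bare path such as 'c://1.html'. The following is only a minimal sketch: the helper name read_local_page, the UTF-8 default and the throwaway demo file are assumptions for illustration, not part of the original question.

```python
import tempfile
import urllib.request
from pathlib import Path


def read_local_page(path, encoding="utf-8"):
    """Return the text of a local HTML file, opened through a file:// URL.

    Hypothetical helper; the encoding default is an assumption.
    """
    # as_uri() only works on absolute paths, so resolve() first.
    file_url = Path(path).resolve().as_uri()
    with urllib.request.urlopen(file_url) as response:
        # Same two steps as for a remote page: read bytes, then decode.
        return response.read().decode(encoding)


# Demo on a temporary file so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "1.html"
    sample.write_text("<html><body>hello</body></html>", encoding="utf-8")
    print(read_local_page(sample))
```

For a real file, something like read_local_page('c:/1.html') should behave the same way, provided the file exists and is encoded as assumed.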

urllib is mainly (if not only) a transport for retrieving HTML data; it does not parse the content. All it does is connect to the source, separate the headers from the body, and hand you the body. If you have already stored the page locally in a file, urllib has no more use to you. Consider looking at an HTML parsing library such as BeautifulSoup.
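To illustrate what "parsing" adds on top of just reading the text, here is a rough sketch that pulls the link targets out of a page. It uses the standard library's html.parser rather than BeautifulSoup, purely so it runs without installing anything; the names LinkCollector and extract_links and the sample markup are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper class)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Sample markup standing in for the contents of a local file.
page = '<html><body><a href="a.html">A</a> <a href="http://example.com">B</a></body></html>'
print(extract_links(page))
```

The string passed to extract_links could just as well be the result of fh.read() on a local file. A dedicated library such as BeautifulSoup would usually be more forgiving with messy real-world HTML.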

Licensed under: CC-BY-SA with attribution