Question

As the title indicates, I am trying to display and download HTML pages using a script. I've tried several Python (and ActionScript 3) approaches, but none of them captures the full visible content of the website.

Instead, they all return JavaScript code (the pages I'd like to download are generated dynamically by JavaScript).

Is there some way I can capture the visible content? The functionality I want is similar to a "Select All, Copy" operation in Windows.

No correct solution

OTHER TIPS

Since you wrote

The functionality I want is similar to a "Select All, Copy" operation in Windows.

I understand that you want to download the "source code" of the web page. If that is the case, here is what you need to do.

import urllib.request

urls = ["http://google.com", "http://yahoo.com"]

for url in urls:
    # Fetch the page and decode the raw bytes into text
    with urllib.request.urlopen(url) as response:
        htmltext = response.read().decode("utf-8", errors="replace")
    print(htmltext)
    print()

It fetches each URL and prints its source code.
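Note that this gives you the HTML markup, not the rendered text. If what you actually want is something closer to "Select All, Copy" (the text a user would see), one option is to strip the tags from the downloaded HTML. Here is a minimal sketch using only the standard library's `html.parser`; the `TextExtractor` class and `visible_text` function are names I made up for this example, and it will not help with content that only appears after JavaScript runs, since `urllib` does not execute scripts:

```python
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Return the text content of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)


# Works on any HTML string, e.g. the htmltext fetched above:
sample = "<html><head><script>var x = 1;</script></head><body><p>Hello</p></body></html>"
print(visible_text(sample))  # prints: Hello
```

For pages that are built dynamically by JavaScript, no amount of post-processing the fetched source will recover the visible text; you would need a tool that actually runs a browser engine.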

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow