Question

I want to get the entire source code of a web page, but some of the site's content is not loaded at first (this seems to be related to Ajax). How can I get the content that is loaded later, using Java?

I tried Java's URL.openStream(), but it didn't work: I only got the placeholder text "loading...", not the real content that appears once the page has finished loading.
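For reference, the attempt described above probably looks something like the following. This is only a sketch: the class name RawFetch, the helper names, and the UTF-8 encoding are assumptions, not taken from the original code. It returns the HTML exactly as the server sent it, so anything a script would fill in later is still the placeholder.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class RawFetch {
    // Reads the stream to the end and decodes it as UTF-8 (assumed encoding).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the markup as delivered by the server; no JavaScript is executed,
    // so Ajax-loaded content never appears in the result.
    static String fetch(String address) {
        try (InputStream in = URI.create(address).toURL().openStream()) {
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java RawFetch <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```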

Thank you very much.


Solution

You need to either remote-control an existing browser (which is not exactly easy from Java, since most browsers expose automation through other languages, component systems, or interfaces) or use a headless browser that can execute JavaScript. HtmlUnit falls into the latter category.
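A minimal sketch of the headless-browser route with HtmlUnit might look like this. It assumes HtmlUnit 3.x or later on the classpath (package org.htmlunit; the older 2.x releases use com.gargoylesoftware.htmlunit instead). The class name RenderedFetch, the method name renderedSource, and the timeout parameter are illustrative choices, and the right wait time depends on the page.

```java
import java.io.IOException;

import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;

public class RenderedFetch {
    // Loads the page, lets its scripts run, and returns the resulting DOM as XML.
    // timeoutMillis is an upper bound on how long to wait for background JavaScript.
    static String renderedSource(String url, long timeoutMillis) throws IOException {
        try (WebClient webClient = new WebClient()) {
            webClient.getOptions().setJavaScriptEnabled(true);
            // Real-world pages often contain scripts HtmlUnit cannot fully run;
            // keep going instead of failing on the first script error.
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setCssEnabled(false);

            // Assumes the URL serves HTML; other content types are not an HtmlPage.
            HtmlPage page = webClient.getPage(url);

            // Give Ajax calls and timers started by the page a chance to finish.
            webClient.waitForBackgroundJavaScript(timeoutMillis);

            return page.asXml();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RenderedFetch <url>");
            return;
        }
        System.out.println(renderedSource(args[0], 10_000));
    }
}
```

Unlike a plain stream read, the returned markup reflects the DOM after the scripts have run, provided they finish within the timeout and HtmlUnit's JavaScript engine can handle them.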

Other suggestions

Try using an HTML parser for this kind of task. Jericho HTML Parser would be helpful here.
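As a rough sketch of what that could look like, assuming the Jericho HTML Parser library (package net.htmlparser.jericho) is on the classpath; the class name ParsedText and the method textOfElement are made up for illustration. Note that a parser only works on the HTML string it is given and does not run scripts, so the markup has to be obtained first.

```java
import net.htmlparser.jericho.Element;
import net.htmlparser.jericho.Source;

public class ParsedText {
    // Returns the visible text of the element with the given id,
    // or null if the document has no such element.
    static String textOfElement(String html, String id) {
        Source source = new Source(html);
        Element element = source.getElementById(id);
        if (element == null) {
            return null;
        }
        return element.getContent().getTextExtractor().toString();
    }

    public static void main(String[] args) {
        String html = "<html><body><div id=\"content\">loading...</div></body></html>";
        System.out.println(textOfElement(html, "content"));
    }
}
```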

Licensed under: CC-BY-SA with attribution