Question

I want to get the entire source code of a web page, but some of the site's content is not loaded at first (this seems to be related to Ajax). How can I get the content that is loaded later, using Java?

I tried Java's URL.openStream(), but this didn't work. I only got the placeholder text "loading...", not the real content that appears after the page has finished loading.
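For reference, the attempt described above probably looked something like the following sketch. The class name RawFetch, the helper fetchRawHtml, and the example.com address are illustrative assumptions, not the asker's actual code. It returns the document exactly as the server sent it and runs no scripts, which is why only the "loading..." placeholder comes back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class RawFetch {
    // Reads the response body as text. No JavaScript is executed, so content
    // that the page loads later via Ajax is never part of the result.
    // Assumes the page is UTF-8 encoded.
    static String fetchRawHtml(URL url) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical address; substitute the page you are fetching.
        URL url = URI.create("https://example.com/").toURL();
        System.out.println(fetchRawHtml(url));
    }
}
```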

Thank you very much.

Solution

You need to either remote-control an existing browser (which is not exactly easy from Java, since most browsers expose other languages, component systems, or interfaces) or use a headless browser that can execute JavaScript. HtmlUnit falls into the latter category.
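A minimal sketch of the headless-browser route with HtmlUnit might look like this. It assumes HtmlUnit 2.x is on the classpath (in 3.x the packages are named org.htmlunit instead of com.gargoylesoftware.htmlunit); the class name RenderedFetch, the helper fetchRenderedHtml, the address, and the 10-second wait are illustrative choices, not something the answer prescribes.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.IOException;

class RenderedFetch {
    // Loads the page in a headless browser, lets background JavaScript run
    // for up to waitMillis milliseconds, then serializes the resulting DOM.
    static String fetchRenderedHtml(String url, long waitMillis) throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setJavaScriptEnabled(true);
            // CSS is not needed just to read the content.
            webClient.getOptions().setCssEnabled(false);
            // Real-world pages often contain scripts HtmlUnit cannot run;
            // keep going instead of aborting on the first script error.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            // Give Ajax calls and timers started by the page time to finish.
            webClient.waitForBackgroundJavaScript(waitMillis);
            return page.asXml();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical address; substitute the page you are fetching.
        System.out.println(fetchRenderedHtml("https://example.com/", 10_000));
    }
}
```

The wait is a heuristic: if the page keeps polling or loads content only after user interaction, a longer timeout or an explicit check for the expected element may be needed.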

Other tips

Try using an HTML parser for this kind of thing. Jericho HTML Parser would be helpful here.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow