Question

I am trying to scrape some data from a web page for an Android application. The problem is that when I pull in the HTML from the page, I get only a small portion of it, not the whole thing. When I open the actual page in Chrome and press F12, I see far more markup than this Java method returns.

Here is my code to get the HTML string:

        // The "http.agent" system property is read by java.net.HttpURLConnection;
        // the header is also set on the request so this client sends it too.
        System.setProperty("http.agent", USER_AGENT);
        HttpClient client = new DefaultHttpClient();
        HttpGet get = new HttpGet(URL_LOG_MAIN);
        get.setHeader("User-Agent", USER_AGENT);
        String s = "";
        try {
            HttpResponse response = client.execute(get);
            // This is the HTML exactly as the server sent it; no scripts are run.
            s = EntityUtils.toString(response.getEntity(), "UTF-8");
        } catch (IOException ex) {
            ex.printStackTrace();
        }
        return s;

And I have these hard-coded constants:

    private static final String URL_LOG_MAIN = "https://changelog.omnirom.org/";
    private static final String USER_AGENT = "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10 (.NET CLR 3.5.30729)";

Solution

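The portions of the page that I was missing are generated by JavaScript. The page is very dynamic and is updated any time the team's GitHub repo changes. An HTTP client such as DefaultHttpClient downloads only the HTML the server sends and never runs the page's scripts, whereas the Elements view in Chrome's developer tools (F12) shows the DOM after those scripts have run, which is why it contains much more markup.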
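
A quick way to check for this situation is to see whether the text you expect is absent from the raw response even though the response references scripts. The sketch below is only a rough heuristic, not a definitive test. The class name RawHtmlCheck, the method likelyRenderedByScript, the sample markup, and the "Merged change" marker are all hypothetical and made up for illustration; they are not taken from the actual page.

```java
import java.util.Locale;

final class RawHtmlCheck {

    /**
     * Heuristic: returns true when the raw HTML references at least one
     * script but does not contain the text we expected to scrape. That
     * combination suggests the text is added later by JavaScript.
     */
    static boolean likelyRenderedByScript(String rawHtml, String expectedText) {
        // Compare tag names case-insensitively; HTML allows <SCRIPT> too.
        boolean hasScript = rawHtml.toLowerCase(Locale.ROOT).contains("<script");
        boolean hasExpected = rawHtml.contains(expectedText);
        return hasScript && !hasExpected;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the string an HTTP client might return:
        // an empty container plus a script that would fill it in a browser.
        String raw = "<html><body><div id=\"log\"></div>"
                + "<script src=\"app.js\"></script></body></html>";

        // "Merged change" is an assumed marker we would see in the browser.
        System.out.println(likelyRenderedByScript(raw, "Merged change")); // prints "true"
    }
}
```

If the check returns true for the string your method returns, a plain HTTP fetch will probably not be enough. You would need something that actually executes the page's JavaScript, or the underlying data source the script loads from, if one exists.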

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow