The portions of the page that I am missing are generated by JavaScript, since the page is very dynamic and is updated any time the team's GitHub repo changes.
Java not returning entire HTML string, only a portion of it
10-10-2022
Question
I am trying to scrape some data from a web page for an Android application. The problem is that when I pull in the HTML from the page, I only get a small portion of it, not the whole thing. When I open the actual page in Chrome and press F12, I see far more markup than this Java method returns.
Here is my code to get the HTML string:
System.setProperty("http.agent", USER_AGENT);
HttpResponse response = null;
HttpGet get = null;
HttpClient client = null;
String s = "";
try {
    if (client == null) {
        client = new DefaultHttpClient();
    }
    get = new HttpGet(URL_LOG_MAIN);
    response = client.execute(get);
    s = EntityUtils.toString(response.getEntity(), "UTF-8");
} catch (IOException ex) {
    ex.printStackTrace();
}
return s;
And I have these hard-coded constants:
private static final String URL_LOG_MAIN = "https://changelog.omnirom.org/";
private static final String USER_AGENT = "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10 (.NET CLR 3.5.30729)";
Solution
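A plain HTTP GET only returns the HTML the server sends; it never runs the page's scripts, whereas the Elements view in Chrome's F12 tools shows the DOM after those scripts have run. As a rough way to confirm that this is what is happening, the sketch below downloads the raw HTML and counts the script tags in it. The class name ScriptProbe and its two methods are hypothetical helpers made up for this illustration, and it uses HttpURLConnection rather than the Apache client from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question's code.
public class ScriptProbe {
    // Matches the start of an opening <script> tag, in any letter case.
    private static final Pattern SCRIPT_OPEN =
            Pattern.compile("<script\\b", Pattern.CASE_INSENSITIVE);

    /** Counts opening script tags in the given HTML text. */
    public static int countScriptTags(String html) {
        Matcher m = SCRIPT_OPEN.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** Downloads the raw response body as UTF-8 text; no JavaScript is executed. */
    public static String fetchRawHtml(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", "Mozilla/5.0");
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling ScriptProbe.countScriptTags(ScriptProbe.fetchRawHtml(URL_LOG_MAIN)) and comparing the length of the returned string with what Chrome shows gives a quick hint: a short body that contains script tags suggests the rest is built in the browser. If so, two options to consider are finding the request that the page's script makes for its data (visible in the Network tab of the F12 tools) and calling that directly, or loading the page in an Android WebView with JavaScript enabled so the scripts actually run.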
Licensed under: CC-BY-SA with attribution