Question
I'm trying to get some data from a certain website. When the app first starts, it downloads an HTML file from that website and cleans it:
private class cleanHtml extends AsyncTask<Void, Void, Void> {
    @Override
    protected Void doInBackground(Void... arg0) {
        try {
            HtmlCleaner cleaner = new HtmlCleaner();
            String url = "https://www.easistent.com/urniki/263/razredi/16515";
            TagNode node = cleaner.clean(new URL(url));
            CleanerProperties props = cleaner.getProperties();
            String fileName = Environment.getExternalStorageDirectory().getPath() + "/Android/data/com.whizzapps.stpsurniki/cleaned.html";
            new PrettyXmlSerializer(props).writeToFile(node, fileName, "utf-8");
            Log.i("TAG", "AsyncTask done!");
        } catch (MalformedURLException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return null;
    }
}
I know I could parse the HTML with HtmlCleaner and XPath, but I have no experience with XPath. I'm fairly sure it would be easier to parse the file with Jsoup once it has been cleaned. Is that okay?
Solution
It shouldn't be a problem; all you need is valid HTML. You can use something like this:
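String html = getHtml(); // placeholder: obtain the HTML as a String however you like
Document doc = Jsoup.parse(html);
Elements elms = doc.select("cssSelector"); // replace "cssSelector" with a real CSS selector
Elements elms1 = doc.getElementsByClass("class"); // replace "class" with a real class name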
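The getHtml() call above is left undefined. As a rough sketch, and assuming the HTML comes from the cleaned.html file that the AsyncTask in the question writes as UTF-8, it could be implemented by reading that file into a String. The class name CleanedHtmlReader and the method name readHtml are made up for this illustration and are not part of Jsoup or HtmlCleaner.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Jsoup or HtmlCleaner): one possible
// way to implement the undefined getHtml() from the answer.
final class CleanedHtmlReader {

    // Reads the whole file at the given path as UTF-8 text and returns it.
    // UTF-8 is assumed because the question's serializer writes "utf-8".
    static String readHtml(String path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            char[] buffer = new char[4096];
            int read;
            // Copy the file in chunks until the end of the stream is reached.
            while ((read = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, read);
            }
        }
        return sb.toString();
    }
}
```

The returned String can then be passed to Jsoup.parse(html) as shown above. Jsoup can also read a file directly with Jsoup.parse(new File(fileName), "UTF-8"), which would probably make a helper like this unnecessary.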