Question

I'm trying to get some data from a website. When the app is first started, it downloads the site's HTML page and cleans it.

private class cleanHtml extends AsyncTask<Void, Void, Void> {

    @Override
    protected Void doInBackground(Void... arg0) {
        try {
            // Download the page and let HtmlCleaner repair the markup.
            HtmlCleaner cleaner = new HtmlCleaner();
            String url = "https://www.easistent.com/urniki/263/razredi/16515";
            TagNode node = cleaner.clean(new URL(url));
            CleanerProperties props = cleaner.getProperties();
            // Write the cleaned document to external storage as UTF-8.
            String fileName = Environment.getExternalStorageDirectory().getPath() + "/Android/data/com.whizzapps.stpsurniki/cleaned.html";
            new PrettyXmlSerializer(props).writeToFile(node, fileName, "utf-8");
            Log.i("TAG", "AsyncTask done!");
        } catch (IOException e) {
            // MalformedURLException is a subclass of IOException, so one handler covers both.
            Log.e("TAG", "Could not download or save the page", e);
        }
        return null;
    }
}

I know I could parse the HTML with HtmlCleaner using XPath, but I have no knowledge of XPath at all. I'm pretty sure it would be easier to parse the file with Jsoup after it has been cleaned. Is this okay?

Solution

It shouldn't be a problem; all you need is valid HTML. You can use this:

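// getHtml() stands for however you obtain the HTML as a String
String html = getHtml();
Document doc = Jsoup.parse(html);
// Select by CSS selector or by class name (both arguments are placeholders)
Elements elms = doc.select("cssSelector");
Elements elms1 = doc.getElementsByClass("class");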
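
Since the question saves the cleaned page to a file, here is a minimal sketch of reading that file with Jsoup rather than from a String. The class name `CleanedHtmlParser`, the helper `textsOf`, the sample markup, and the `td.lesson` selector are all assumptions made for illustration; the real page will need selectors that match its own structure.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CleanedHtmlParser {

    /** Parses a UTF-8 HTML file and returns the text of every element matching the CSS selector. */
    public static List<String> textsOf(File htmlFile, String cssSelector) throws IOException {
        // The charset should match the one used when the file was written ("utf-8" in the question).
        Document doc = Jsoup.parse(htmlFile, "UTF-8");
        List<String> texts = new ArrayList<>();
        for (Element element : doc.select(cssSelector)) {
            texts.add(element.text());
        }
        return texts;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for cleaned.html; the real page's markup will differ.
        String sample = "<html><body><table><tr>"
                + "<td class=\"lesson\">Math</td><td class=\"lesson\">Physics</td>"
                + "</tr></table></body></html>";
        File file = File.createTempFile("cleaned", ".html");
        Files.write(file.toPath(), sample.getBytes(StandardCharsets.UTF_8));

        // "td.lesson" is an assumed selector, chosen only to match the sample above.
        for (String text : textsOf(file, "td.lesson")) {
            System.out.println(text);
        }
        file.delete();
    }
}
```

In the app you would pass the path of cleaned.html instead of the temporary file. Jsoup's parser is also tolerant of malformed HTML, so running HtmlCleaner first is probably optional.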
License: CC-BY-SA with attribution
Not affiliated with StackOverflow