Question

I'm trying to get some data from a website. When the app first starts, it downloads the site's HTML page, cleans it with HtmlCleaner, and writes the result to a file:

private class cleanHtml extends AsyncTask<Void, Void, Void>{

    @Override
    protected Void doInBackground(Void... arg0) {
        try {
            HtmlCleaner cleaner = new HtmlCleaner();
            String url = "https://www.easistent.com/urniki/263/razredi/16515";
            TagNode node = cleaner.clean(new URL(url));
            CleanerProperties props = cleaner.getProperties();
            String fileName = Environment.getExternalStorageDirectory().getPath() + "/Android/data/com.whizzapps.stpsurniki/cleaned.html";
            new PrettyXmlSerializer(props).writeToFile(node, fileName, "utf-8");
            Log.i("TAG", "AsyncTask done!");
        } catch (MalformedURLException e) {
            // The hard-coded URL string could not be parsed.
            Log.e("TAG", "Invalid URL", e);
        } catch (IOException e) {
            // The download or the write to external storage failed.
            Log.e("TAG", "Could not download or save the page", e);
        }
        return null;
    }
}

I know I could parse the HTML with HtmlCleaner using XPath, but I have no experience with XPath. I'm fairly sure it would be easier to parse the file with Jsoup once it has been cleaned. Is that okay?


Solution

That shouldn't be a problem; all you need is valid HTML. You can use this:

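 // getHtml() is a placeholder for however you obtain the HTML as a String.
 String html = getHtml();
 Document doc = Jsoup.parse(html);
 // Select elements with a CSS selector, or directly by class name.
 Elements elms = doc.select("cssSelector");
 Elements elms1 = doc.getElementsByClass("class");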
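
Since the question saves the cleaned page to a file, here is a slightly fuller sketch of the same idea. Jsoup can also parse a File directly, so the cleaned file does not have to be read into a String first. The class name CleanedHtmlReader, the helper method names, and the "lesson" class in the sample markup are made up for illustration; the real selector depends on the markup of the page being scraped.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CleanedHtmlReader {

    // Collects the text of every element that matches the CSS selector.
    public static List<String> textOf(Document doc, String cssSelector) {
        List<String> result = new ArrayList<>();
        for (Element element : doc.select(cssSelector)) {
            result.add(element.text());
        }
        return result;
    }

    // Parses HTML that is already in memory.
    public static List<String> fromString(String html, String cssSelector) {
        return textOf(Jsoup.parse(html), cssSelector);
    }

    // Parses a file on disk, e.g. the cleaned.html written by the AsyncTask.
    // "UTF-8" matches the encoding passed to the serializer in the question.
    public static List<String> fromFile(File cleanedFile, String cssSelector)
            throws IOException {
        return textOf(Jsoup.parse(cleanedFile, "UTF-8"), cssSelector);
    }

    public static void main(String[] args) {
        // Hypothetical markup and class name, for illustration only.
        String html = "<div class=\"lesson\">Math</div>"
                + "<div class=\"lesson\">Physics</div>";
        System.out.println(fromString(html, "div.lesson")); // prints [Math, Physics]
    }
}
```

In the app, fromFile would be called with the same path that the AsyncTask writes to, and only after the task has finished, otherwise the file may not exist yet.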
License: CC-BY-SA with attribution
Not affiliated with StackOverflow