Manipulate html document
25-06-2021
Question
...<b><a>hello</a></b>...
I'd like to remove the <b></b> tags from the HTML document while keeping their contents. Is it possible using Jsoup?
Solution
If doc is the Document containing your HTML:
doc.select("b").unwrap();
(unwrap() can be called on a single Element or on Elements.)
Example:
// Parse the HTML, then remove every <b> element while keeping its children.
Document document = Jsoup.parse("...<b><a>hello</a></b>...");
document.select("b").unwrap();
Now your document no longer contains any <b> tags; the <a> element they wrapped is still there.
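For anyone who wants to try this end to end, here is a minimal sketch of both variants (Elements and a single Element). The class and method names (UnwrapExample, unwrapBold, unwrapFirstBold) are made up for illustration, and it assumes a jsoup version recent enough to have selectFirst on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of jsoup.
class UnwrapExample {

    // Removes every <b> element from the fragment but keeps its children in place.
    static String unwrapBold(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        doc.select("b").unwrap();        // Elements.unwrap() unwraps each match
        return doc.body().html();
    }

    // Same idea for a single Element: only the first <b> is unwrapped.
    static String unwrapFirstBold(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        Element first = doc.selectFirst("b");
        if (first != null) {
            first.unwrap();              // replaces the element with its child nodes
        }
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(unwrapBold("...<b><a>hello</a></b>..."));
        System.out.println(unwrapFirstBold("<b>one</b> <b>two</b>"));
    }
}
```

Unlike remove(), which would drop the <a> along with the <b>, unwrap() only takes away the wrapper, which is what the question asks for.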
OTHER TIPS
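import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

// Strips every tag except <a>; the text inside removed tags is kept.
public String clean(String unsafe) {
    Whitelist whitelist = Whitelist.none();   // start with no tags allowed
    whitelist.addTags("a");                   // allow <a>; its attributes are still stripped
    String safe = Jsoup.clean(unsafe, whitelist);
    // Jsoup.clean escapes entities; StringEscapeUtils (Apache Commons) converts them back.
    return StringEscapeUtils.unescapeXml(safe);
}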
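On newer jsoup releases the same idea can be written with Safelist, which replaces the deprecated Whitelist. A rough sketch, assuming jsoup 1.14.2 or later; the class and method names (CleanExample, keepOnlyLinks) are invented for this example, and the unescape step is left out so the only dependency is jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical helper class, not part of jsoup.
class CleanExample {

    // Strips every tag except <a>; the text inside removed tags is kept.
    static String keepOnlyLinks(String unsafe) {
        // No attributes are allowed here, so href is removed as well.
        Safelist safelist = Safelist.none().addTags("a");
        return Jsoup.clean(unsafe, safelist);
    }

    public static void main(String[] args) {
        System.out.println(keepOnlyLinks("...<b><a href=\"x\">hello</a></b>..."));
    }
}
```

If the links should keep their targets, the safelist also needs addAttributes("a", "href") and a matching addProtocols call. Note that this route re-serializes the cleaned HTML as a String, whereas unwrap() edits an existing Document in place.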
From "Removing Html tags except few specific ones from String in java"
Licensed under: CC-BY-SA with attribution