JEditorPane and source of HTML element
06-02-2021
Question
I (still) have problems with HTMLEditorKit and HTMLDocument in Java. I can only set the inner HTML of an element, but I cannot get it. Is there some way to get the underlying HTML code of an element?
My problem is that the HTML support is quite poor and badly written. The API does not allow basic, expected operations. I need to change the colspan or rowspan attribute of a <td>. The Java developers have closed off the straightforward way: the attribute set of an element is immutable. A workaround could be to take the code of the element (e.g. <td colspan="2">Hi <u>world</u></td>) and replace it with new content (e.g. <td colspan="3">Hi <u>world</u></td>). This way seems to be closed too. (Bonus question: What is HTMLEditorKit good for?)
Solution 2
Thanks for the hint, Stanislav. Here is my solution:
/**
 * Gets the inner HTML of the given element. If the element is named
 * <code>p-implied</code> or <code>content</code>, it returns null.
 * Note: write(...) below is HTMLEditorKit.write(Writer, Document, int, int), so this
 * method is assumed to live in a subclass of HTMLEditorKit.
 * @param e element
 * @param d document containing the given element
 * @return the inner HTML of an HTML tag, or null if e is not a valid HTML tag
 * @throws IOException if writing the element to the buffer fails
 * @throws BadLocationException if the element's offsets are not valid in d
 */
public String getInnerHtmlOfTag(Element e, Document d) throws IOException, BadLocationException {
    if (e.getName().equals("p-implied") || e.getName().equals("content"))
        return null;
    CharArrayWriter caw = new CharArrayWriter();
    int i;
    final String startTag = "<" + e.getName();
    final String endTag = "</" + e.getName() + ">";
    final int startTagLength = startTag.length();
    final int endTagLength = endTag.length();
    write(caw, d, e.getStartOffset(), e.getEndOffset() - e.getStartOffset());
    // we have the element, but wrapped as full standalone HTML code beginning with the <html> start tag,
    // so we need to unpack our element
    StringBuilder str = new StringBuilder(caw.toString());
    // drop one character at a time until the buffer begins with our start tag
    // (deleting startTagLength characters at once could jump over it after a shorter tag)
    while (str.length() >= startTagLength) {
        if (str.charAt(0) == '<' && str.substring(0, startTagLength).equals(startTag))
            break;
        str.deleteCharAt(0);
    }
    // we've found the beginning of the tag
    for (i = 0; i < str.length(); i++) { // skip it...
        if (str.charAt(i) == '>')
            break; // we've found the end position of our start tag
    }
    str.delete(0, Math.min(i + 1, str.length())); // ...and eat it
    // skip the content (this stops at the first matching end tag, so nested elements
    // with the same name are not handled)
    for (i = 0; i < str.length(); i++) {
        if (str.charAt(i) == '<' && i + endTagLength <= str.length() && str.substring(i, i + endTagLength).equals(endTag))
            break; // we've found the end position of the inner HTML of our tag
    }
    str.delete(i, str.length()); // now just remove everything from position i to the end
    return str.toString().trim();
}
This method can easily be modified to get the outer HTML (that is, the code including the entire tag).
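As a rough illustration of that modification, the sketch below pulls the outer HTML of the first element with a given name out of a string that has already been produced by the kit's write() method. The class name OuterHtml and the helpers extractOuterHtml and isTagAt are hypothetical and not part of the original answer. It counts nesting depth so that an inner element with the same name does not end the match early, and it assumes lowercase tags, no self-closing tags, and no '>' inside attribute values.

```java
public class OuterHtml {

    /** True if s has the given tag prefix at pos, followed by '>', '/', or whitespace. */
    static boolean isTagAt(String s, int pos, String prefix) {
        if (!s.startsWith(prefix, pos)) {
            return false;
        }
        int next = pos + prefix.length();
        if (next >= s.length()) {
            return false;
        }
        char c = s.charAt(next);
        return c == '>' || c == '/' || Character.isWhitespace(c);
    }

    /**
     * Returns the first element named tagName, including its start and end tags,
     * or null if no complete element is found. (Hypothetical helper.)
     */
    static String extractOuterHtml(String html, String tagName) {
        String open = "<" + tagName;
        String close = "</" + tagName;
        int start = -1;
        int depth = 0;
        for (int i = 0; i < html.length(); i++) {
            if (html.charAt(i) != '<') {
                continue;
            }
            if (isTagAt(html, i, close)) {
                if (depth == 0) {
                    continue; // stray end tag before any start tag
                }
                depth--;
                if (depth == 0) {
                    int end = html.indexOf('>', i);
                    return end < 0 ? null : html.substring(start, end + 1);
                }
            } else if (isTagAt(html, i, open)) {
                if (depth == 0) {
                    start = i; // remember where the outermost start tag begins
                }
                depth++;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String wrapped = "<html><body><table><tr>"
                + "<td colspan=\"2\">Hi <u>world</u></td>"
                + "</tr></table></body></html>";
        // prints: <td colspan="2">Hi <u>world</u></td>
        System.out.println(extractOuterHtml(wrapped, "td"));
    }
}
```

The input here is a hand-written string standing in for the writer's output; real output is indented across several lines, which this scan tolerates because it only looks at tag boundaries.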
Other tips
You can get the HTML of the selected Element. Use the kit's write() method, passing it the offsets of the Element. Note that the output will include the surrounding tags (<html>, <body>, etc.).
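A minimal sketch of this tip, assuming the document is parsed from a string rather than taken from a JEditorPane: it walks the element tree with ElementIterator, picks the first element with a given name, and writes only that element's offset range with HTMLEditorKit.write(Writer, Document, int, int). The class name ElementHtmlDump and the helper dumpFirstElement are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class ElementHtmlDump {

    /**
     * Parses html, finds the first element named tagName, and returns what the kit
     * writes for that element's range, or null if there is no such element.
     * (Hypothetical helper; checked exceptions are wrapped to keep the call site short.)
     */
    static String dumpFirstElement(String html, String tagName) {
        try {
            HTMLEditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
            kit.read(new StringReader(html), doc, 0);

            ElementIterator it = new ElementIterator(doc);
            for (Element e = it.first(); e != null; e = it.next()) {
                if (tagName.equals(e.getName())) {
                    StringWriter out = new StringWriter();
                    // write only the range covered by this element
                    kit.write(out, doc, e.getStartOffset(),
                            e.getEndOffset() - e.getStartOffset());
                    return out.toString();
                }
            }
            return null;
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><table><tr>"
                + "<td colspan=\"2\">Hi <u>world</u></td>"
                + "</tr></table></body></html>";
        // The result still contains the enclosing <html>, <body>, <table>, and <tr> tags.
        System.out.println(dumpFirstElement(html, "td"));
    }
}
```

Because the writer emits the enclosing tags as well, the result still has to be trimmed down to the element itself.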