JEditorPane and source of HTML element

https://stackoverflow.com/questions/7635964

06-02-2021

Question

I am (still) having problems with HTMLEditorKit and HTMLDocument in Java. I can set the inner HTML of an element, but I cannot get it. Is there a way to get the underlying HTML code of an element?

My problem is that the HTML support is quite poor and badly written. The API does not offer basic, expected functions. I need to change the colspan or rowspan attribute of a <td>. The Java developers have closed off the straightforward way: the attribute set of an element is immutable. A workaround could be to take the code of the element (e.g. <td colspan="2">Hi <u>world</u></td>) and replace it with new content (e.g. <td colspan="3">Hi <u>world</u></td>). This way seems to be closed too. (Bonus question: What is HTMLEditorKit good for?)

Solution 2

Thanks for the hint, Stanislav. This is my solution:

import java.io.CharArrayWriter;
import java.io.IOException;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Element;

// This method is meant to live in a subclass of HTMLEditorKit, so that write(...)
// below resolves to HTMLEditorKit.write(Writer, Document, int, int).

/**
 * Returns the inner HTML of the given element. If the element is named <code>p-implied</code>
 * or <code>content</code>, it returns null.
 * @param e element
 * @param d document containing the given element
 * @return the inner HTML of an HTML tag, or null if e is not a valid HTML tag
 * @throws IOException
 * @throws BadLocationException
 */
public String getInnerHtmlOfTag(Element e, Document d) throws IOException, BadLocationException {
    final String name = e.getName();
    if (name.equals("p-implied") || name.equals("content"))
        return null;

    final String startTag = "<" + name;
    final String endTag = "</" + name + ">";

    CharArrayWriter caw = new CharArrayWriter();
    write(caw, d, e.getStartOffset(), e.getEndOffset() - e.getStartOffset());
    // The writer emits the element wrapped in a full standalone HTML document
    // (starting with the <html> start tag), so the element has to be unpacked.
    final String html = caw.toString();

    // Find the start tag. The character after the tag name must terminate the name,
    // otherwise "<p" would also match "<pre".
    int start = html.indexOf(startTag);
    while (start >= 0) {
        int after = start + startTag.length();
        if (after < html.length()
                && (html.charAt(after) == '>' || Character.isWhitespace(html.charAt(after))))
            break;
        start = html.indexOf(startTag, start + 1);
    }
    if (start < 0)
        return null; // the tag does not occur in the written output

    // Skip the rest of the start tag (its attributes and the closing '>').
    int contentStart = html.indexOf('>', start);
    if (contentStart < 0)
        return null;
    contentStart++;

    // The inner HTML ends at the first matching end tag; nested elements
    // with the same name are not handled.
    int contentEnd = html.indexOf(endTag, contentStart);
    if (contentEnd < 0)
        contentEnd = html.length();

    return html.substring(contentStart, contentEnd).trim();
}

This method can easily be modified to return the outer HTML (that is, the code including the element's own start and end tags).
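As a rough sketch of that outer-HTML variant, the string-unpacking step can be isolated from Swing entirely. The class name OuterHtml and the helper outerHtml are made up for illustration, and the sketch assumes the input is the already-serialized output of the kit's write() method. Like the method above, it stops at the first matching end tag, so nested elements of the same name are not handled.

```java
public class OuterHtml {

    /**
     * Returns the first "<name ...>...</name>" span found in html,
     * including the tags themselves, or null if there is none.
     * (Hypothetical helper, not part of any Swing API.)
     */
    static String outerHtml(String html, String name) {
        String startTag = "<" + name;
        String endTag = "</" + name + ">";

        // Locate a start tag whose name is exactly `name` (not merely a prefix).
        int start = html.indexOf(startTag);
        while (start >= 0) {
            int after = start + startTag.length();
            if (after < html.length()) {
                char c = html.charAt(after);
                if (c == '>' || Character.isWhitespace(c)) {
                    break;
                }
            }
            start = html.indexOf(startTag, start + 1);
        }
        if (start < 0) {
            return null;
        }

        // Keep everything up to and including the first matching end tag.
        int end = html.indexOf(endTag, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end + endTag.length());
    }

    public static void main(String[] args) {
        String wrapped = "<html><body><table><tr>"
                + "<td colspan=\"2\">Hi <u>world</u></td>"
                + "</tr></table></body></html>";
        System.out.println(outerHtml(wrapped, "td"));
    }
}
```

The returned string could then be edited (for example, changing the colspan value) before being put back into the document.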

Other tips

You can get the HTML of the selected Element. Use the kit's write() method, passing it the offsets of the Element. The output will, however, include the surrounding tags such as <html> and <body>.
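A minimal sketch of this tip, assuming the markup is parsed into an HTMLDocument by the kit itself. The class name ElementHtmlDump and the helper dumpFirst are hypothetical. The helper looks up the first element with a given name via ElementIterator and hands that element's offsets to HTMLEditorKit.write.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class ElementHtmlDump {

    /**
     * Serializes the document range covered by the first element named
     * tagName, or returns null if no such element exists.
     * (Hypothetical helper for illustration.)
     */
    static String dumpFirst(String html, String tagName) {
        try {
            HTMLEditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
            kit.read(new StringReader(html), doc, 0);

            // Walk the element tree until an element with the wanted name turns up.
            ElementIterator it = new ElementIterator(doc);
            for (Element e = it.first(); e != null; e = it.next()) {
                if (e.getName().equals(tagName)) {
                    StringWriter out = new StringWriter();
                    int start = e.getStartOffset();
                    // Write only the range covered by this element.
                    kit.write(out, doc, start, e.getEndOffset() - start);
                    return out.toString();
                }
            }
            return null;
        } catch (IOException | BadLocationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><table><tr>"
                + "<td colspan=\"2\">Hi <u>world</u></td>"
                + "</tr></table></body></html>";
        System.out.println(dumpFirst(html, "td"));
    }
}
```

The printed text is a complete HTML fragment with the ancestor tags around the cell, which is why the solution above still has to strip everything outside the element.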

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow