Question
I have been using SimpleXML for a while now to serialize my Java objects, but I am still learning and sometimes run into trouble. I have the following XML that I want to deserialize:
<messages>
<message>
<text>
A communications error has occurred. Please try again, or contact <a href="someURL">administrator</a>. Alternatively, please <a href = "someURL' />">register</a>.
</text>
</message>
</messages>
I would like to process it so that the contents of the <text> element are treated as a single string and the anchor tags are ignored. I have no control over how this XML is generated; it is, as you can see, an error message from some server. How do I achieve this? Many thanks in advance.
Solution
You might want to try escaping the text by importing:
import static org.apache.commons.lang.StringEscapeUtils.escapeHtml;
And using it as:
a.setWordCloudStringToDisplay(escapeHtml(wordcloud));
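To make the effect of escaping concrete, here is a minimal hand-rolled sketch of what it does to an anchor tag. It is only a stand-in for illustration, not the Commons Lang implementation (escapeHtml also converts many other characters to entities), and the class and method names are made up for this example.

```java
// Hypothetical stand-in for illustration only; not Commons Lang's escapeHtml.
class HtmlEscapeSketch {

    // Replaces the characters that would otherwise be parsed as markup.
    static String escapeMarkup(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;a href=&quot;someURL&quot;&gt;administrator&lt;/a&gt;
        System.out.println(escapeMarkup("<a href=\"someURL\">administrator</a>"));
    }
}
```

Once escaped, the anchor is ordinary character data, so a parser sees one text node instead of a child element. The escaping has to be applied to the raw message before it is parsed for this to help.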
OTHER TIPS
Simple XML does not natively support reading an element that mixes text and child elements; you have to use a Converter. See https://stackoverflow.com/questions/17462970/simpleframwork-xml-element-with-inner-text-and-child-elements, whose answer addresses much the same problem, except that it reads only one piece of text.
Here is a solution that collects multiple pieces of text and the anchors into a single string.
First, I create a class A for the 'a' tag, with a toString method that prints the tag as it appears in the XML:
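import org.simpleframework.xml.Attribute;
import org.simpleframework.xml.Root;
import org.simpleframework.xml.Text;

@Root(name = "a")
public class A {

    @Attribute(required = false)
    private String href;

    @Text
    private String value;

    // Rebuild the tag as markup so it can be appended to the surrounding text
    @Override
    public String toString() {
        return "<a href = \"" + href + "\">" + value + "</a>";
    }
}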
Then the Text class reads the 'text' element; this is where the Converter is necessary:
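import org.simpleframework.xml.Element;
import org.simpleframework.xml.Root;
import org.simpleframework.xml.Serializer;
import org.simpleframework.xml.convert.Convert;
import org.simpleframework.xml.convert.Converter;
import org.simpleframework.xml.core.Persister;
import org.simpleframework.xml.stream.InputNode;
import org.simpleframework.xml.stream.OutputNode;

@Root(name = "Text")
@Convert(Text.Parsing.class)
public class Text {

    @Element
    public String value;

    private static class Parsing implements Converter<Text> {

        // plain serializer used to read each nested <a href...> element
        private final Serializer ser = new Persister();

        @Override
        public Text read(InputNode node) throws Exception {
            Text t = new Text();
            String s;
            InputNode aref;
            // read the beginning of the text (up to the first XML tag);
            // start from "" so that concatenation never prepends "null"
            s = node.getValue();
            t.value = (s != null) ? s : "";
            // read the first tag (returns null if there are no more tags in the text)
            aref = node.getNext();
            while (aref != null) {
                // append the tag using the toString() of the A class
                t.value = t.value + ser.read(A.class, aref);
                // read the next part of the text (after this tag, up to the next tag)
                s = node.getValue();
                // append it to the value
                if (s != null) { t.value = t.value + s; }
                // read the next tag and loop
                aref = node.getNext();
            }
            return t;
        }

        @Override
        public void write(OutputNode node, Text value) throws Exception {
            throw new UnsupportedOperationException("Not supported yet.");
        }
    }
}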
Note that I read the 'a' tag with a standard serializer and added a toString method to the A class to turn it back into an XML string. I have not found a way to read the 'a' tag directly as text.
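For comparison, outside Simple XML the JDK's own DOM and transformer APIs can turn a parsed element back into markup without a hand-written toString. The following is only a sketch of that alternative technique, not something Simple XML provides; the class and method names are hypothetical, and the exact spacing of the output is whatever the JDK serializer produces, which may differ from the source text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK only; not part of Simple XML.
class NodeToMarkup {

    // Serializes a DOM node back to markup, without the <?xml ...?> declaration.
    static String toMarkup(Node node) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<text>Please contact <a href=\"someURL\">administrator</a>.</text>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Take the first <a> element and print it as markup
        Node anchor = doc.getElementsByTagName("a").item(0);
        System.out.println(toMarkup(anchor));
    }
}
```

This trades the Simple XML annotations for plain DOM handling, so it is mainly useful if the message is parsed separately from the rest of the object model.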
And the main class (don't forget the AnnotationStrategy, which maps the @Convert annotation to the deserialization of the text element):
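import java.io.InputStream;
import org.simpleframework.xml.Serializer;
import org.simpleframework.xml.convert.AnnotationStrategy;
import org.simpleframework.xml.core.Persister;

public class parseText {
    public static void main(String[] args) throws Exception {
        // AnnotationStrategy makes the Persister honor @Convert
        Serializer serializer = new Persister(new AnnotationStrategy());
        InputStream in = ClassLoader.getSystemResourceAsStream("file.xml");
        // strict = false: ignore anything in the XML that has no mapping
        Text t = serializer.read(Text.class, in, false);
        System.out.println("Texte : " + t.value);
    }
}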
When I use it with the following XML file:
<text>
A communications error has occurred. Please try again, or contact <a href="someURL">administrator</a>.
Alternatively, please <a href = "someURL' />">register</a>.
</text>
It gives the following result:
Texte :
A communications error has occurred. Please try again, or contact <a href = "someURL">administrator</a>.
Alternatively, please <a href = "someURL' />">register</a>.
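The result above keeps the anchors as markup inside the string. If they should be dropped entirely, as the question asks, one option outside Simple XML is the JDK's DOM parser, whose getTextContent() concatenates the text of an element and all its descendants. This is a sketch of that alternative under the assumption that the message is available as a string; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK only; not part of Simple XML.
class PlainTextOfElement {

    // Returns the text of the first element named 'tag', with nested tags dropped.
    static String textOf(String xml, String tag) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Node element = doc.getElementsByTagName(tag).item(0);
        if (element == null) {
            throw new IllegalArgumentException("No <" + tag + "> element found");
        }
        // getTextContent() joins the text of this node and all its descendants
        return element.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<messages><message><text>"
                + "Please try again, or contact <a href=\"someURL\">administrator</a>."
                + "</text></message></messages>";
        // prints: Please try again, or contact administrator.
        System.out.println(textOf(xml, "text"));
    }
}
```

Only the anchor text survives; the href values are lost, so this fits the case where the links are not needed.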
I hope this will help you to solve your problem.