Question

Assume I have this string:

String s = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";

I want to change the values between the <change> and </change> tags. For example, I might want to encode or decode them; note that to encode a value, I also need the original text between the tags.

What's the best way to do this? I was thinking about using the s.replaceAll() function but I'm not sure how I can use it for this example.

I can't just use an XML parser because the text between the tags might contain some special characters like < and >, that will cause errors when using an XML parser.

I'm using Java.

Solution

Since, as you say, this is not a valid XML document, you can use a regex. To replace each found value with its new version, use appendReplacement and appendTail from the Matcher class.

  • appendReplacement appends the text between the previous match and the current one to the buffer, followed by the replacement you choose for the current match.
  • appendTail appends the remainder of the input after the last match to the buffer.

To match the text between <change> and </change>, you can use the regex <change>(.*?)</change>. If you want the dot to match every character, including line separators such as \n, compile the pattern with the Pattern.DOTALL flag.

Demo:

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String input = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";
StringBuffer sb = new StringBuffer();

Pattern p = Pattern.compile("<change>(.*?)</change>",Pattern.DOTALL);
Matcher m = p.matcher(input);

while(m.find()){
    String valueFromTags = m.group(1);
    String replacement = valueFromTags.toUpperCase();
    //                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    // you decide what to put as replacement of original value
    // toUpperCase is just example
    // quoteReplacement keeps any '$' or '\' in the replacement literal
    m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
}
m.appendTail(sb);

String result = sb.toString();
System.out.println(result);

Output:

random text blah blah HELLO more random text HEY ..
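
Since the question mentions encoding the values, here is a minimal sketch of the same idea that keeps the tags and encodes only the text between them. It assumes Java 9 or later, where Matcher.replaceAll accepts a function. Base64 is only a stand-in for whatever encoding is actually needed, and the class and method names (ChangeTagEncoder, encodeChanges) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChangeTagEncoder {
    // Lazy match of everything between the tags; DOTALL lets '.' span line breaks.
    private static final Pattern CHANGE =
            Pattern.compile("<change>(.*?)</change>", Pattern.DOTALL);

    // Hypothetical helper: Base64 is an assumed placeholder encoding.
    static String encodeChanges(String input) {
        return CHANGE.matcher(input).replaceAll(match -> {
            String original = match.group(1);
            String encoded = Base64.getEncoder()
                    .encodeToString(original.getBytes(StandardCharsets.UTF_8));
            // The function's result is still treated as a replacement string,
            // so quote it to keep any '$' or '\' literal.
            return Matcher.quoteReplacement("<change>" + encoded + "</change>");
        });
    }

    public static void main(String[] args) {
        String input = "random text <change>hello</change> more text <change>a < b</change> ..";
        System.out.println(encodeChanges(input));
    }
}
```

Decoding works the same way with Base64.getDecoder() inside the function. On Java 8, the appendReplacement loop shown above is the equivalent.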

Other tips

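You could use a regex with lookarounds, though it can be a bit slow.

String newString = s.replaceAll("(?<=<change>).+?(?=</change>)", "Your new string");

Because the lookbehind and lookahead only anchor on the tags themselves, extra < or > characters inside the change section do not break the match. Note that .+? needs at least one character and, without the DOTALL flag, does not match across line breaks.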
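
To illustrate the point about stray < and > characters, here is a small sketch. The class and method names (LookaroundDemo, replaceBetweenTags) are hypothetical; the pattern is the one from this answer.

```java
import java.util.regex.Matcher;

class LookaroundDemo {
    // Matches only the text between the tags; the tags are checked by the
    // lookbehind and lookahead but are not part of the match.
    static final String BETWEEN_TAGS = "(?<=<change>).+?(?=</change>)";

    // Hypothetical helper: replaces every tagged value with the same text.
    static String replaceBetweenTags(String input, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal
        return input.replaceAll(BETWEEN_TAGS, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String s = "x <change>a < b > c</change> y <change>hey</change>";
        System.out.println(replaceBetweenTags(s, "NEW"));
    }
}
```

The tags survive because the lookarounds consume no characters; only the inner text is replaced.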

EDIT: if you also want to use the original text, refer to it as $0 in the replacement, so that each match is replaced using its own original text:

    String regexPattern = "(?<=<change>).+?(?=</change>)";
    String originalString = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";

    Pattern pattern = Pattern.compile(regexPattern);
    Matcher matches = pattern.matcher(originalString);

    if (matches.find()){
        // $0 stands for the text of the current match; replaceAll resets the matcher first
        String t = matches.replaceAll("$0 whatever you want to add");
        System.out.println(t);
    }
    else {
        System.out.println("No matches found");
    }

Do you need to use XML notation?

You can use @change too.

Interesting puzzle.

Assuming you WANT to change the tags:

public class Test
{
    public static void main(String[] args)
    {
        String s = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";
        System.out.println("BEFORE:"+s);
        System.out.println("AFTER :"+replace(s, "HI", "HELLO"));
    }

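    private static String replace(String source, String... replace)
    {
        if (source == null)
            return null;
        // ... more checks here
        final String open = "<change>", close = "</change>";
        int index = 0, m = 0;
        // stop when the replacements run out or no complete tag pair is left
        while (m < replace.length)
        {
            index = source.indexOf(open, index);
            if (index < 0)
                break;
            int end = source.indexOf(close, index);
            if (end < 0)
                break;
            source = source.substring(0, index) + replace[m] + source.substring(end + close.length());
            index += replace[m].length(); // continue searching after the inserted text
            m++;
        }
        return source;
    }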

}

The output would be

BEFORE:random text blah blah <change>hello</change> more random text <change>hey</change> ..
AFTER :random text blah blah HI more random text HELLO ..

This may not be a good idea:

Look for occurrences of < and > and replace those, assuming there are no other "<" or ">" characters in the String.

String s = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";
String formatted = s.replaceAll("\\>", "><").replaceAll("\\<","/><");

Here is a solution that works with regular expressions:

    public static void main(String[] args) {
        final String SIMPLE_TAG_REGEX = "<(.+?)>(.+?)</(.+?)>";
        final Pattern PATTERN = Pattern.compile(SIMPLE_TAG_REGEX);

        final String s = "hello <foo>bar</foo> world, <lorem>ipsum</lorem>";
        final Matcher matcher = PATTERN.matcher(s);
        while (matcher.find()) {
            final String startTag = matcher.group(1);
            final String content = matcher.group(2);
            final String endTag = matcher.group(3);
            System.out.println(startTag + ", " + endTag + ": " + content);
        }
    }

Prints out:

    foo, foo: bar
    lorem, lorem: ipsum

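Remember to check that startTag.equals(endTag): this pattern does not enforce it. A strictly regular expression cannot express that check, although Java's regex engine can do it with a backreference such as </\\1>.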
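
As a sketch of folding the tag-equality check into the pattern itself, the variant below uses a backreference so that the closing tag must repeat the opening tag's name. The class and method names (MatchingTagDemo, pairedContents) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchingTagDemo {
    // \1 must repeat exactly what group 1 captured, so <foo>...</bar>
    // is not accepted as a pair.
    private static final Pattern PAIRED_TAG = Pattern.compile("<(.+?)>(.+?)</\\1>");

    // Hypothetical helper: returns "tag: content" for every properly paired tag.
    static List<String> pairedContents(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = PAIRED_TAG.matcher(input);
        while (m.find()) {
            result.add(m.group(1) + ": " + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairedContents("hello <foo>bar</foo> world, <lorem>ipsum</lorem>"));
        System.out.println(pairedContents("<foo>bar</baz>"));
    }
}
```

This only handles simple, non-nested tags; nested or overlapping markup still needs a real parser.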

This is one way to do it, if you already know the values:

    String s = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";
    String formatted = s.replaceAll("hello", "YOUR CHANGE HERE");
    // apply the second replacement to the result of the first, not to s again
    formatted = formatted.replaceAll("hey", "YOUR CHANGE HERE");

Or you could take advantage of the regex support in replaceAll:

EDIT:

    String s = "random text blah blah <change>hello</change> more random text <change>hey</change> ..";
    // \\w+ only matches letters, digits and underscores between the tags
    String formatted = s.replaceAll("<change>\\w+</change>", "YOUR CHANGE HERE");
    System.out.println(formatted);
Licensed under: CC-BY-SA with attribution