Question

I need to find the first string inside double brackets that occurs after a keyword in a text. Here is an example text:

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. === FIRST KEYWORD === veniam, {{ text need to get }} ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit {{in voluptate velit esse cillum}} dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia {{deserunt mollit anim }}id est laborum

So I need to get the text inside the first pair of brackets after the first keyword.

I tried many combinations, but the best result I got was the text from the last pair of brackets, not the first. With the expression (?<==== FIRST KEYWORD ===).(.|\n)* I get the text after the keyword, but I did not succeed in finding the first text in brackets.

UPD: Thank you all, but the answer from Bohemian does not work for my corpus.
This answer:

"(?<==== FIRST KEYWORD ===)[^{]*\\{\\{([^}]*)\\}\\}"

works, but I can no longer see it here, so I cannot thank the person who wrote it; I don't remember who it was.
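
For reference, a minimal sketch of how the expression quoted above could be applied with java.util.regex. The class name LookbehindExtract and the method name firstBracketed are made up for illustration; only the pattern comes from the post. It assumes there is no single { between the keyword and the first {{, and no } inside the bracketed text, because of the two negated character classes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindExtract {
    // Pattern quoted in the update: the lookbehind anchors the match right after
    // the keyword, [^{]* skips ahead to the first "{{", and group 1 is the content.
    private static final Pattern FIRST_BRACKETED = Pattern.compile(
            "(?<==== FIRST KEYWORD ===)[^{]*\\{\\{([^}]*)\\}\\}");

    /** Returns the text inside the first {{...}} after the keyword, or null if there is none. */
    static String firstBracketed(String text) {
        Matcher m = FIRST_BRACKETED.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "Lorem === FIRST KEYWORD === veniam, {{ text need to get }} ullamco "
                + "{{in voluptate}} dolore {{deserunt }}id est laborum";
        // The captured group keeps the spaces next to the braces; trim() removes them.
        System.out.println(firstBracketed(text).trim());
    }
}
```

Returning null for "no match" keeps the sketch short; a caller would need to check for it before calling trim().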


Solution 2

Option 1: If you want the {{text AND the braces}}

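// Needs java.util.regex.Pattern, Matcher and PatternSyntaxException.
// subjectString is the input text; the result stays null if nothing matches.
String resultString = null;
try {
    // DOTALL lets '.' match line breaks, so the bracketed text may span lines.
    Pattern regex = Pattern.compile("=== FIRST KEYWORD ===[^{]*?(\\{\\{(?:.(?!}}))*.}})", Pattern.DOTALL);
    Matcher regexMatcher = regex.matcher(subjectString);
    if (regexMatcher.find()) {
        // Group 1 is the first {{...}} after the keyword, braces included.
        resultString = regexMatcher.group(1);
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}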

Option 2: If you want to match the {{ text but NOT the braces }}

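// Needs java.util.regex.Pattern, Matcher and PatternSyntaxException.
// subjectString is the input text; the result stays null if nothing matches.
String resultString = null;
try {
    // DOTALL lets '.' match line breaks, so the bracketed text may span lines.
    Pattern regex = Pattern.compile("=== FIRST KEYWORD ===[^{]*?\\{\\{((?:.(?!}}))*.)}}", Pattern.DOTALL);
    Matcher regexMatcher = regex.matcher(subjectString);
    if (regexMatcher.find()) {
        // Group 1 is only the text between the first {{ and }} after the keyword.
        resultString = regexMatcher.group(1);
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}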
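
To compare the two options side by side, here is a small self-contained sketch that runs both patterns on an abbreviated input. The class name BraceOptionsDemo and the helper firstGroup are illustrative names, not part of the answer. Note that both patterns end the bracketed part with a mandatory `.`, so they expect at least one character between the braces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BraceOptionsDemo {
    // Option 1: group 1 includes the surrounding braces.
    static final Pattern WITH_BRACES = Pattern.compile(
            "=== FIRST KEYWORD ===[^{]*?(\\{\\{(?:.(?!}}))*.}})", Pattern.DOTALL);

    // Option 2: group 1 is only the text between the braces.
    static final Pattern WITHOUT_BRACES = Pattern.compile(
            "=== FIRST KEYWORD ===[^{]*?\\{\\{((?:.(?!}}))*.)}}", Pattern.DOTALL);

    /** Returns group 1 of the first match, or null if the pattern does not match. */
    static String firstGroup(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "one two === FIRST KEYWORD === three {{xxx}} four {{yyy}} five";
        System.out.println(firstGroup(WITH_BRACES, input));    // {{xxx}}
        System.out.println(firstGroup(WITHOUT_BRACES, input)); // xxx
    }
}
```

The inner `(?:.(?!}}))*` consumes characters one at a time as long as the consumed character is not directly followed by `}}`, which is what stops the match at the first closing pair.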

OTHER TIPS

This code extracts your target:

String target = input.replaceAll("(?s).*?=== FIRST KEYWORD ===.*?\\{\\{(.*?)\\}\\}.*", "$1");

The important part of the regex is the use of a reluctant quantifier .*?, which will stop consuming input at the first available match (not skipping over it to a subsequent match).
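
To make the difference concrete, the sketch below runs the same replacement twice, once with reluctant quantifiers and once with greedy ones. The class name GreedyVsReluctant and the GREEDY variant are written here only for comparison; they are not part of the answer. The greedy version reproduces the problem described in the question, where the text from the last pair of brackets comes back.

```java
final class GreedyVsReluctant {
    // Reluctant quantifiers (.*?) stop at the first possible match.
    static final String RELUCTANT = "(?s).*?=== FIRST KEYWORD ===.*?\\{\\{(.*?)\\}\\}.*";

    // Greedy quantifiers (.*) grab as much as possible and backtrack,
    // so the capture ends up on the last {{...}} in the input.
    static final String GREEDY = "(?s).*=== FIRST KEYWORD ===.*\\{\\{(.*)\\}\\}.*";

    static String extract(String input, String regex) {
        return input.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        String input = "one two === FIRST KEYWORD === three {{xxx}} four {{yyy}} five";
        System.out.println(extract(input, RELUCTANT)); // xxx
        System.out.println(extract(input, GREEDY));    // yyy
    }
}
```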

Edit:

Note (thanks to @guido for pointing this out) that the dotall flag (?s) has been added. It lets the dot match line terminators, which is required when working with multi-line input.
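
A short sketch of what the flag changes, using a made-up multi-line input in which the keyword and the brackets sit on different lines. The class name DotallDemo is illustrative. Without (?s) nothing matches here, and since replaceAll returns the input unchanged when there is no match, a missing flag can easily go unnoticed.

```java
final class DotallDemo {
    static final String WITHOUT_FLAG = ".*?=== FIRST KEYWORD ===.*?\\{\\{(.*?)\\}\\}.*";
    static final String WITH_FLAG = "(?s)" + WITHOUT_FLAG;

    static String extract(String input, String regex) {
        return input.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        String input = "one two\n=== FIRST KEYWORD ===\nthree {{xxx}}\nfour {{yyy}} five";

        // With (?s) the dot crosses line breaks and the whole input is replaced by group 1.
        System.out.println(extract(input, WITH_FLAG)); // xxx

        // Without (?s) the dot cannot get from the keyword line to the "{{" line,
        // so there is no match and the input comes back as it was.
        System.out.println(extract(input, WITHOUT_FLAG).equals(input)); // true
    }
}
```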


Some test code, using an abbreviated form of your example:

String input = "one two === FIRST KEYWORD === three {{xxx}} four {{yyy}} five";
String target = input.replaceAll("(?s).*?=== FIRST KEYWORD ===.*?\\{\\{(.*?)\\}\\}.*", "$1");
System.out.println(target);

Output:

xxx
Licensed under: CC-BY-SA with attribution