Question

I am writing some code to extract the part of a string between ":" (which may be absent) and "@"; the order of the two is guaranteed. For example, from "url:123@my.com" I want "123", and from "123@my.com" I also want "123". I wrote a regular expression for this, but it does not work. Here is the first version:

Pattern pattern = Pattern.compile("(?<=:?).*?(?=@)");
Matcher matcher = pattern.matcher("sip:+8610086@dmcw.com");
if (matcher.find()) {
     Log.d("regex", matcher.group());
} else {
     Log.d("regex", "not match");
}

It does not work: for the first case, "url:123@my.com", the result is "url:123", which is obviously not what I want.

So I wrote a second version:

Pattern pattern = Pattern.compile("(?<=:??).*?(?=@)");

But it throws an error; somebody said Java does not support variable-length lookbehind.

So I tried a third version:

Pattern pattern = Pattern.compile("(?<=:).*?(?=@)|.*?(?=@)");

Its result is the same as the first version's. But shouldn't the first alternative be tried first?

It behaves the same as

Pattern pattern = Pattern.compile(".*?(?=@)|(?<=:).*?(?=@)");

so the alternatives do not seem to be applied left to right! I thought I understood regular expressions, but now I am confused again. Thanks in advance.


Solution

Try this (slightly edited, see comments):

String test = "sip:+8610086@dmcw.com";
String test2 = "8610086@dmcw.com";
Pattern pattern = Pattern.compile("(.+?:)?(.+?)(?=@)");
Matcher matcher = pattern.matcher(test);
if (matcher.find()) {
    System.out.println(matcher.group(2));
}
matcher = pattern.matcher(test2);
if (matcher.find()) {
    System.out.println(matcher.group(2));
}

Output:

+8610086
8610086

Let me know if you need explanations on the pattern.
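
On the question of why the third version behaves like the first: as far as I understand java.util.regex, the alternatives are tried left to right only at a given start position, and the engine reports the leftmost position where any alternative matches. At index 0 the lookbehind branch fails, but the plain branch matches right there, so the engine never gets as far as the ":". A minimal sketch to check this (the class and helper names are only illustrative):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    // Returns the first match of the regex in the input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "url:123@my.com";
        // Third version from the question: at index 0 the lookbehind branch
        // fails, but the second branch matches there, so that match is returned.
        System.out.println(firstMatch("(?<=:).*?(?=@)|.*?(?=@)", input));
        // The lookbehind branch alone: the first start position it accepts is
        // index 4, right after the ':'.
        System.out.println(firstMatch("(?<=:).*?(?=@)", input));
    }
}
```

This prints url:123 and then 123. Swapping the two alternatives changes nothing because the start position, not the branch order, decides which match wins. The lookbehind branch alone finds nothing in "123@my.com", which is why the pattern above uses an optional group instead.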

OTHER TIPS

You really don't need any lookaheads or lookbehinds here. What you need can be accomplished with a greedy quantifier and some alternation:

    .*(?:^|:)([^@]+)

By default, Java regular expression quantifiers (*, +, ?, {n}) are all greedy: they match as many characters as possible while still allowing the overall match to succeed. They can be made lazy by adding a question mark after the quantifier, like so: .*?

You will want to read capture group 1 for this expression; capture group 0 returns the entire match.
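
To make the greedy/lazy difference concrete, here is a small sketch that runs the expression above and a lazy variant of it against the strings from the question. The class name and the group1 helper are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Returns capture group 1 of the first match, or null if nothing matches.
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Greedy .* runs to the end of the input, then backtracks to the last
        // ':' (or to the start of the input when there is no ':').
        System.out.println(group1(".*(?:^|:)([^@]+)", "sip:+8610086@dmcw.com"));
        System.out.println(group1(".*(?:^|:)([^@]+)", "8610086@dmcw.com"));
        // Lazy .*? stops immediately, so ^ matches at index 0 and the prefix
        // ends up inside the group.
        System.out.println(group1(".*?(?:^|:)([^@]+)", "sip:+8610086@dmcw.com"));
    }
}
```

The three lines printed are +8610086, 8610086 and sip:+8610086. Note that this expression does not itself require an "@", so an input without one would still produce a match.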

As you said, Java restricts variable-length lookbehind.

But you don't need lookbehind or any other lookaround here; you can do something like this:

Regex: :?([^@:]*)@

In the regex101 example (ignore the \n, it is only there because of regex101), the first capture group contains what you need, and you don't have to do anything special. Sometimes the easiest solution is the best.
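
In Java, that regex could be used roughly as follows. This is only a sketch, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleExtract {
    // Optional ':', then a run of characters that are neither '@' nor ':',
    // captured in group 1, followed by the '@' itself.
    private static final Pattern USER = Pattern.compile(":?([^@:]*)@");

    // Returns the text between the optional ':' and the '@', or null if no '@' is found.
    static String userPart(String input) {
        Matcher m = USER.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(userPart("url:123@my.com"));
        System.out.println(userPart("123@my.com"));
        System.out.println(userPart("sip:+8610086@dmcw.com"));
    }
}
```

This prints 123, 123 and +8610086. Because the character class excludes ":", the match cannot start before the colon, so no lookbehind is needed.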

Licensed under: CC-BY-SA with attribution