Question
My code is like:
String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345";
String add1="( \\b+[0-9]{3,5}[, ]* (.*)[, ]* (.*)[, ]* [a-zA-Z]{2} [0-9]{5})";
Pattern p = Pattern.compile(add1);
Matcher m = p.matcher(try1);
if (m.find()) {
    System.out.println("Address ======> " + m.group());
} else {
    System.out.println("Address ======>Not found ");
}
I want only US addresses in output:
[(3909 Witmer Road Niagara Falls NY 14305) and (420, Fanboy Lane, NewYark, AS 12345)]
but it's outputting like this:
(3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345)
Solution
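You could try a regex more like this:
"(\\b[0-9]{3,5},? [A-Za-z]+,?(?: [A-Za-z]+,?)* [a-zA-Z]{2} [0-9]{5})"
Each [A-Za-z]+,? part allows only letters (not digits), optionally followed by a comma, so a match cannot run across the digits that begin the next address.
Note that m.find() returns one match per call, so call it in a while loop rather than an if to collect every address.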
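As a rough sketch of how this could be wired up, the class below loops over m.find() to collect every match from the sample string in the question. The class name AddressFinder and the method findAddresses are hypothetical names chosen for illustration. The pattern is a variant of the one above that also allows an optional comma after the first word following the house number, an assumption added here so that "420, Fanboy Lane, ..." matches. It checks only the shape of an address, so street names containing digits (for example "5th Avenue") would not match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressFinder {
    // House number, letter-only words (each optionally followed by a comma),
    // a two-letter state code, and a five-digit ZIP code.
    private static final Pattern US_ADDRESS = Pattern.compile(
        "\\b[0-9]{3,5},? [A-Za-z]+,?(?: [A-Za-z]+,?)* [A-Za-z]{2} [0-9]{5}");

    // Returns every non-overlapping match, in order of appearance.
    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = US_ADDRESS.matcher(text);
        while (m.find()) {          // "while", not "if": one match per call
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305"
            + " and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071"
            + " nad 420, Fanboy Lane, NewYark, AS 12345";
        for (String address : findAddresses(try1)) {
            System.out.println("Address ======> " + address);
        }
    }
}
```

Because the word parts cannot contain digits, the Bangalore address is skipped: "5th" fails right after "120, ".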
OTHER TIPS
The * operator is greedy, so it matches as many characters as it can. In your expression, the [a-zA-Z]{2} [0-9]{5} part, which matches the state and ZIP code, ends up matching the very last state and ZIP code in the input, because the earlier .* patterns expand to cover as many characters as they can.
Try changing each . to [^0-9] so that it matches any character except a digit.
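To make the difference concrete, here is a small sketch that runs two simplified patterns over the sample string from the question. The class name GreedyDemo, the helper matches, and both simplified patterns are illustrative assumptions, not the asker's exact expression; they differ only in using .* versus [^0-9]* for the middle of the address.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Middle part is ".*": greedy, may run across digits and later addresses.
    static final String GREEDY =
        "\\b[0-9]{3,5}[, ]+.* [a-zA-Z]{2} [0-9]{5}";
    // Middle part is "[^0-9]*": still greedy, but stops at the first digit.
    static final String NO_DIGITS =
        "\\b[0-9]{3,5}[, ]+[^0-9]* [a-zA-Z]{2} [0-9]{5}";

    // Collects every non-overlapping match of regex in text.
    static List<String> matches(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305"
            + " and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071"
            + " nad 420, Fanboy Lane, NewYark, AS 12345";
        // One long match running from 3909 to the final ZIP code.
        System.out.println(matches(GREEDY, try1));
        // Two separate matches, one per US-style address.
        System.out.println(matches(NO_DIGITS, try1));
    }
}
```

With .*, the single match starts at 3909 and ends at the last ZIP code in the string. With [^0-9]*, the middle part cannot cross the digits of the next number, so each address is matched on its own.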