Finding Acronyms Using Regex In Python

Question

I'm trying to use regex in Python to match acronyms separated by periods. I have the following code:

import re
test_string = "U.S.A."
pattern = r'([A-Z]\.)+'
print(re.findall(pattern, test_string))

The result of this is:

['A.']

I'm confused as to why this is the result. I know + is greedy, but why are the first occurrences of [A-Z]\. ignored?

Solution 2

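The (...) in a regex creates a capturing group. When the pattern contains a capturing group, re.findall returns the group's text rather than the whole match, and a repeated group keeps only its last iteration, which here is 'A.'. I suggest changing to a non-capturing group: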

pattern = r'(?:[A-Z]\.)+'
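
To make the difference concrete, here is a minimal sketch comparing the two patterns on the string from the question. The variable names are illustrative only, and the finditer variant is an alternative I am assuming may be useful if you want to keep the capturing group; it is not part of the original suggestion.

```python
import re

test_string = "U.S.A."

# Capturing group: findall returns the group's text, and a repeated
# group retains only its final iteration.
capturing = re.findall(r'([A-Z]\.)+', test_string)

# Non-capturing group: findall returns the whole match instead.
non_capturing = re.findall(r'(?:[A-Z]\.)+', test_string)

# Alternative (assumption, not from the answer): keep the capturing group
# and read the whole match from each match object via group(0).
whole_matches = [m.group(0) for m in re.finditer(r'([A-Z]\.)+', test_string)]

print(capturing)      # ['A.']
print(non_capturing)  # ['U.S.A.']
print(whole_matches)  # ['U.S.A.']
```

The match itself is the same in all three cases; only what gets reported back changes.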

Other suggestions

Description

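This regex matches period-separated acronyms such as U.S.A. while skipping unpunctuated all-caps words such as RADAR. Each capital letter must be preceded by a period or whitespace: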

(?:(?<=\.|\s)[A-Z]\.)+

Sample Text

This is the U.S.A. we have RADAR.

Matches

U.S.A.
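
As a rough check, the pattern can be run against the sample text with Python's re module. This is only a sketch: the variable names are made up for illustration, and the second input string is a hypothetical example added to show one caveat. Because the lookbehind requires a preceding period or whitespace character, an acronym at the very start of a string loses its first letter.

```python
import re

# Pattern from the answer: each capital letter must be preceded by a
# period or a whitespace character and followed by a period.
pattern = re.compile(r'(?:(?<=\.|\s)[A-Z]\.)+')

sample = "This is the U.S.A. we have RADAR."
matches = pattern.findall(sample)
print(matches)  # ['U.S.A.']

# Caveat: at the start of the string there is no preceding character,
# so the lookbehind fails for the first letter and the match starts at 'S'.
at_start = pattern.findall("U.S.A. is large")
print(at_start)  # ['S.A.']
```

If acronyms can appear at the start of the input, the lookbehind would need to be adjusted for that case.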

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow