Question

Can anyone explain why this regex (in Python):

pattern = re.compile(r"""
^
([[a-zA-Zàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]+\s{1}]+)
([a-zA-Zàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]+)   # Last word.
\.{1}
$
""", re.VERBOSE + re.UNICODE)

if re.match(pattern, line):

does not match "A sentence."?

I would actually like to get the entire sentence (including the period) back as a captured group, but have been failing miserably.

No accepted solution

Other tips

I think that maybe you meant to do this:

(([a-zA-Zàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]+\s{1})+)
 ^                                             ^

I don't think the nested square brackets you had do what you think they do.
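To see what the nested brackets actually do, here is a minimal sketch. I've cut the character class down to ASCII letters for readability (that's my simplification; the accented characters don't change how the pattern is parsed). As far as I can tell, the first `]` closes the character class, so the class contains a literal `[`, and the later `]+` is read as "one or more literal `]` characters". That's why "A sentence." fails while an odd string containing `]` succeeds:

```python
import re
import warnings

# Simplified to ASCII letters (an assumption made for readability only).
broken = r"^([[a-zA-Z]+\s{1}]+)([a-zA-Z]+)\.{1}$"
fixed = r"^(([a-zA-Z]+\s{1})+)([a-zA-Z]+)\.{1}$"

with warnings.catch_warnings():
    # Recent Python 3 versions may emit a FutureWarning for "[[" in a pattern.
    warnings.simplefilter("ignore", FutureWarning)
    broken_re = re.compile(broken)
fixed_re = re.compile(fixed)

# The broken pattern needs a literal "]" after the whitespace.
broken_plain = broken_re.match("A sentence.")
broken_odd = broken_re.match("A ]sentence.")

# With parentheses for grouping, the ordinary sentence matches.
fixed_plain = fixed_re.match("A sentence.")

print(broken_plain)          # None
print(broken_odd.group(1))   # 'A ]'
print(fixed_plain.groups())  # ('A ', 'A ', 'sentence')
```

Square brackets only ever define a set of characters; grouping and repetition of a sub-pattern need parentheses.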

This regex works:

import re

pattern = re.compile(r"""
^
([a-zA-Zàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]+\s{1})+
([a-zA-Zàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]+)   # Last word.
\.{1}
$
""", re.VERBOSE + re.UNICODE)

line = "A sentence."

match = re.match(pattern, line)

>>> print("'%s'" % match.group(0))
'A sentence.'
>>> print("'%s'" % match.group(1))
'A '
>>> print("'%s'" % match.group(2))
'sentence'

To return the entire match (line in this case), use match.group(0).

Because the first group is repeated (it matches once for each word except the last one), it keeps only its final repetition, so match.group(1) gives you just the next-to-last word plus its trailing space.
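A quick sketch of that behavior on a longer sentence, again with the character class cut down to ASCII letters (my simplification). The sample sentence and the `findall` step are my own additions, shown as one possible way to get every word back once the line has been validated:

```python
import re

# ASCII-only version of the working pattern above (simplified for illustration).
pattern = re.compile(r"^([a-zA-Z]+\s)+([a-zA-Z]+)\.$")

match = pattern.match("This is a sentence.")

# A repeated capturing group keeps only its last repetition.
repeated = match.group(1)
last_word = match.group(2)

# One way to recover every word: search the full match again.
all_words = re.findall(r"[a-zA-Z]+", match.group(0))

print(repr(repeated))   # 'a '
print(repr(last_word))  # 'sentence'
print(all_words)        # ['This', 'is', 'a', 'sentence']
```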

Btw, the {1} notation is not necessary in this case, matching once and only once is the default behavior, so this bit can be removed.
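If you want to convince yourself, here is a small check that the pattern behaves the same with and without {1}. The ASCII-only character class and the sample strings are my own, picked just for illustration:

```python
import re

# Same pattern twice: once with explicit {1} counts, once without.
with_counts = re.compile(r"^([a-zA-Z]+\s{1})+([a-zA-Z]+)\.{1}$")
without_counts = re.compile(r"^([a-zA-Z]+\s)+([a-zA-Z]+)\.$")

samples = ["A sentence.", "Two  spaces.", "No period", "One."]

# Pair up the match/no-match outcome of both patterns for each sample.
results = [
    (bool(with_counts.match(s)), bool(without_counts.match(s)))
    for s in samples
]

for sample, outcome in zip(samples, results):
    print(repr(sample), outcome)
```

Both patterns agree on every sample, and only "A sentence." matches.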

The extra set of square brackets definitely weren't helping you :)

It turns out the following actually works and covers all the extended ASCII characters I wanted:

^
([\w+\s{1}]+\w{1}\.{1})
$
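One caveat on that last pattern, as far as I can tell: inside square brackets, `+`, `{`, `1` and `}` are literal characters, not quantifiers. So the class accepts those symbols as well as word characters and whitespace. Below is a minimal sketch of this, plus a tighter variant using a non-capturing group. The tighter pattern is my guess at what was intended, not something from the original post:

```python
import re

# The pattern from the post, written on one line.
loose = re.compile(r"^([\w+\s{1}]+\w{1}\.{1})$")

# Hypothetical tighter variant: words separated by single whitespace,
# whole sentence (including the period) captured in group 1.
tight = re.compile(r"^((?:\w+\s)+\w+\.)$")

loose_sentence = loose.match("A sentence.")
loose_symbols = loose.match("a+{b}c.")  # also accepted by the loose class

tight_sentence = tight.match("A sentence.")
tight_symbols = tight.match("a+{b}c.")
tight_accented = tight.match("Une phrase française.")

print(loose_sentence.group(1))  # 'A sentence.'
print(loose_symbols is not None)
print(tight_sentence.group(1))  # 'A sentence.'
print(tight_symbols)            # None
```

In Python 3, `\w` matches Unicode letters by default for str patterns, so the accented characters are covered without listing them. It also matches digits and underscores, which the explicit character list did not.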
License: CC-BY-SA with attribution
Not affiliated with StackOverflow