Question

I have the following code, which does what I want: it retrieves the package names from the output of this command.

Command:

dpkg --get-selections | grep amule

String to analyze:

string = 'amule\t\t\t\t\t\tinstall\namule-common\t\t\t\t\tinstall\namule-utils\t\t\t\t\tinstall\n'

Code:

import re

plist = []
# result[0] holds the command output, i.e. the string shown above
pattern = re.compile(r"[a-z](.*)\w*(?=([\\\t]*install))")
matches = re.finditer(pattern, result[0])

for match in matches:
    plist.append(match.group().strip())

Result:

plist = ['amule', 'amule-common', 'amule-utils']

But I would like to optimize the code so that it no longer needs the strip function and gets the same result using the regex alone. So far, though, I haven't been able to get rid of all the '\t' characters, even with '+', '*' or {n} before the 'install' string. Any ideas?

Thank you


Solution

OK, with your help (the backslash was the issue), here's what I came up with:

import re

# Match a run of word characters or hyphens that is followed by "install"
pattern = re.compile(r'([\w\-]+)(?=(\s*install\s*))', re.MULTILINE)
matches = re.finditer(pattern, string_to_analize)

for match in matches:
    print(match.group())

which does exactly what is needed.
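For readers wondering why the tabs disappear, here is a minimal sketch that runs both patterns against the sample string from the question. The variable names (`text`, `greedy`, `lookahead`, `with_tabs`) are only illustrative, and the second pattern is a slightly trimmed version of the one above (no capture groups, no flag). In the original pattern the greedy `(.*)` appears to swallow the tabs before the lookahead is tried, whereas `[\w\-]+` cannot match a tab at all, so the match ends at the package name.

```python
import re

# Sample text copied from the question
# (output of `dpkg --get-selections | grep amule`).
text = ('amule\t\t\t\t\t\tinstall\n'
        'amule-common\t\t\t\t\tinstall\n'
        'amule-utils\t\t\t\t\tinstall\n')

# Original pattern from the question: `.*` is greedy and also matches tabs,
# so each match still ends with the tabs that precede "install".
greedy = re.compile(r"[a-z](.*)\w*(?=([\\\t]*install))")
with_tabs = [m.group() for m in greedy.finditer(text)]

# Trimmed variant of the pattern from this answer: `[\w\-]+` stops at the
# first tab, and the lookahead checks for "install" without consuming it.
lookahead = re.compile(r"[\w\-]+(?=\s*install)")
plist = [m.group() for m in lookahead.finditer(text)]

print(with_tabs)
print(plist)
```

Note that `[\w\-]+` is an assumption about what a package name looks like; names containing characters such as `.`, `+` or `:` would need a wider character class.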

Thanks a lot for your help! ;)

PS: One strange thing: that regex does not work on the website. Do you understand why? http://regex101.com/r/iM2gJ1

Other tips

You should be able to do this easily by using the re.M flag (multiline).

"([\w\-]+)\s*install", re.M

Like so:

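import re

# Pass the text as the second argument and the flag as the third.
# findall returns the captured group (the package name) for every match.
plist = re.findall(r"([\w\-]+)\s*install", string, re.M)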

See a working example here: http://regex101.com/r/jE0dL8
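As a side note, a small sketch that checks how much the flag matters for this particular pattern. `re.M` only changes where `^` and `$` can match, and this pattern contains neither anchor, so the result should be the same with or without it. The names `text`, `with_flag` and `without_flag` are illustrative, and the sample text is copied from the question.

```python
import re

# Sample text copied from the question.
text = ('amule\t\t\t\t\t\tinstall\n'
        'amule-common\t\t\t\t\tinstall\n'
        'amule-utils\t\t\t\t\tinstall\n')

pattern = r"([\w\-]+)\s*install"

# With one capture group, findall returns a list of the captured strings.
with_flag = re.findall(pattern, text, re.M)
without_flag = re.findall(pattern, text)

print(with_flag)
print(with_flag == without_flag)
```

The flag would start to matter if the pattern were anchored per line, for example with a leading `^`.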

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow