Question

I'm trying to use re.search to find the strings that don't contain the letter p. My regex code returns every string in the list, which is not what I want. I wrote a non-regex alternative that gives exactly the results I want, but I'd like to know whether this can be solved with re.search; I'll also accept another regex solution. I also tried re.findall, which didn't work, and re.match won't work because it only looks for the pattern at the beginning of a string.

import re

someList = ['python', 'ppython', 'ython', 'cython', '.python', '.ythop', 'zython', 'cpython', 'www.python.org', 'xyzthon', 'perl', 'javap', 'c++']

# this returns everything from the source list which is what I DON'T want
pattern = re.compile('[^p]')
result = []

for word in someList:
    if pattern.search(word):
        result.append(word)
print '\n', result
''' ['python', 'ppython', 'ython', 'cython', '.python', '.ythop', 'zython', 'cpython', 'www.python.org', 'xyzthon', 'perl', 'javap', 'c++'] '''

# this non regex solution returns the results I want
cnt = 0; no_p = []

for word in someList:
    for letter in word:
        if letter == 'p':
            cnt += 1
            pass
    if cnt == 0:
        no_p.append(word)
    cnt = 0
print '\n', no_p
''' ['ython', 'cython', 'zython', 'xyzthon', 'c++'] '''

Solution 2

Your understanding of character-set negation is flawed. The regex [^p] matches any single character other than p, so search succeeds on every string that contains at least one such character, which is all of your strings. To "negate" a regex, simply negate the condition in the if statement. So:

import re

someList = ['python', 'ppython', 'ython', 'cython', '.python', '.ythop', 'zython', 'cpython', 'www.python.org', 'xyzthon', 'perl', 'javap', 'c++']

pattern = re.compile('p')
result = []
for word in someList:
    if not pattern.search(word):
        result.append(word)
print result
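
To see why the original [^p] pattern accepts every word, it can help to look at what search actually matches. The following is only a small sketch; the sample words are picked for illustration, and print() is called with a single argument so the snippet should run on both Python 2 and Python 3:

```python
import re

pattern = re.compile('[^p]')

# search() succeeds as soon as it finds ONE character that is not 'p'
print(pattern.search('python').group())  # y
print(pattern.search('perl').group())    # e

# only a string with no non-'p' character at all fails to match
print(pattern.search('ppp'))             # None
```

In other words, [^p] answers "is there some character that isn't p?", not "is p absent?".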

It is, of course, rather pointless to use a regex to see if a single specific character is in the string. Your second attempt is more apt for this, but it could be coded better:

result = []
for word in someList:
    if 'p' not in word:
        result.append(word)
print result
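
The same membership test can also be written as a list comprehension. This is just one possible sketch, reusing someList from the question, with print() in its single-argument form so it runs on Python 2 and 3 alike:

```python
someList = ['python', 'ppython', 'ython', 'cython', '.python', '.ythop',
            'zython', 'cpython', 'www.python.org', 'xyzthon', 'perl',
            'javap', 'c++']

# keep only the words in which 'p' does not occur anywhere
result = [word for word in someList if 'p' not in word]
print(result)  # ['ython', 'cython', 'zython', 'xyzthon', 'c++']
```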

OTHER TIPS

You are almost there. The pattern you are using looks for at least one character that is not 'p'. You need a stricter one, which requires every character from the start of the string to the end to be something other than 'p'. Try:

pattern = re.compile('^[^p]*$')
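
As a rough sketch of how that anchored pattern could be dropped into the loop from the question (someList is taken from the question; the list comprehension and the single-argument print() are choices made here so the snippet runs on Python 2 and 3):

```python
import re

someList = ['python', 'ppython', 'ython', 'cython', '.python', '.ythop',
            'zython', 'cpython', 'www.python.org', 'xyzthon', 'perl',
            'javap', 'c++']

# ^ and $ anchor the match to the whole string, so every character
# between them has to satisfy [^p]
pattern = re.compile('^[^p]*$')

result = [word for word in someList if pattern.search(word)]
print(result)  # ['ython', 'cython', 'zython', 'xyzthon', 'c++']
```

Note that because of the *, an empty string would also count as a match with this pattern.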
Licensed under: CC-BY-SA with attribution