(Darn, Jon beat me. Oh well, you can look at the examples anyway)
Like the other guys have said, regex is not the best tool for this job. If you are working with filepaths, take a look at os.path.
As for filtering files you don't want, you can do if 'thumb' not in filename: ... once you have dissected the path (where filename is a str).
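A minimal sketch of that approach, assuming the goal is to drop every path whose last component contains 'thumb'. The helper name without_thumbs and the sample paths are made up for illustration:

```python
import os.path

def without_thumbs(paths):
    """Keep only the paths whose final component does not contain 'thumb'."""
    kept = []
    for path in paths:
        filename = os.path.basename(path)  # last component of the path, a str
        if 'thumb' not in filename:
            kept.append(path)
    return kept

# Hypothetical sample paths, only for demonstration.
paths = [
    '/tmp/somewhere/thumb',
    '/tmp/somewhere/photo.jpg',
    '/tmp/thumbs/photo.jpg',
]
print(without_thumbs(paths))
```

Note that only the file name is tested here, so '/tmp/thumbs/photo.jpg' is kept. If 'thumb' anywhere in the path should disqualify it, test the whole path string instead of the basename.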
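And for posterity, here are my thoughts on those regexes. r".*(?!thumb).*" does not work because .* is greedy and the lookahead only has to succeed at one position: the first .* can swallow the whole string, after which (?!thumb) trivially succeeds at the very end. Take a look at this: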
>>> re.search('(.*)((?!thumb))(.*)', '/tmp/somewhere/thumb').groups()
('/tmp/somewhere/thumb', '', '')
>>> re.search('(.*?)((?!thumb))(.*)', '/tmp/somewhere/thumb').groups()
('', '', '/tmp/somewhere/thumb')
>>> re.search('(.*?)((?!thumb))(.*?)', '/tmp/somewhere/thumb').groups()
('', '', '')
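The last one looks strange at first, but both lazy groups are happy to match the empty string, and re.search does not require the match to reach the end of the string.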
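To see why r".*(?!thumb).*" cannot be used as a filter, here is a small sketch (the sample strings are arbitrary) checking that it matches whatever it is given:

```python
import re

pattern = re.compile(r".*(?!thumb).*")

samples = ['/tmp/somewhere/thumb', 'thumb', '/tmp/somewhere/photo.jpg', '']
for path in samples:
    # The lookahead only needs one position where 'thumb' does not follow.
    # The end of the string always qualifies, so the search never fails.
    matched = pattern.search(path) is not None
    print(repr(path), matched)
```

Every sample reports a match, including the ones that contain 'thumb', so the pattern rejects nothing.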
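The other regex (r"^(?!.*thumb).*") works because .* is inside the lookahead, so you don't have any issues with characters being stolen. You actually don't even need the ^ if you use re.match, which only matches at the start of the string. With re.search you do need it, because otherwise the search slides forward to the first position from which thumb can no longer be found: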
>>> re.search('((?!.*thumb))(.*)', '/tmp/somewhere/thumb').groups()
('', 'humb')
>>> re.search('^((?!.*thumb))(.*)', '/tmp/somewhere/thumb').groups()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'groups'
>>> re.match('((?!.*thumb))(.*)', '/tmp/somewhere/thumb').groups()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'groups'
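Putting that together, here is a hedged sketch of a regex-based filter. It relies on re.match anchoring at the start of the string, so the ^ is left out. The names NO_THUMB and has_no_thumb are invented for this example:

```python
import re

# re.match only tries position 0, so no explicit ^ is needed here.
NO_THUMB = re.compile(r"(?!.*thumb).*")

def has_no_thumb(path):
    """Return True if 'thumb' does not occur anywhere in path."""
    return NO_THUMB.match(path) is not None

print(has_no_thumb('/tmp/somewhere/thumb'))
print(has_no_thumb('/tmp/somewhere/photo.jpg'))
```

The first call prints False and the second prints True. Checking the result against None also avoids the AttributeError shown above, which comes from calling .groups() on a failed match.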