Question

I want to find out if a string contains the word "randomize". This word may exist inside and outside of brackets in the string, but I am only interested if the word exists INSIDE the brackets.

mystring = "You said {single order='randomize'} that P.E is...Why?"

I understand that I have to use regex for this, but my attempts have failed thus far.

Essentially I want to say:

look for "randomize" and check if it's in brackets.

Thanks

Solution

You could use some negated character classes:

>>> import re
>>> mystring = "You said {single order='randomize'} that P.E is...Why?"
>>> if mystring.find("randomize") != -1:
...     if re.search(r'{[^{}]*randomize[^{}]*}', mystring):
...         print("'randomize' present within braces")
...     else:
...         print("'randomize' present but not within braces")
... else:
...     print("'randomize' absent")

# => 'randomize' present within braces
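
The check above can be wrapped in a small helper for reuse. This is only a minimal sketch: the name word_in_braces is made up for illustration, re.escape is used so the word is matched literally, and it assumes the braces are not nested.

```python
import re

def word_in_braces(text, word):
    """Return True if `word` sits between a '{' and a '}' with no other brace in between."""
    # [^{}]* matches any run of characters that contains no brace at all
    pattern = r'\{[^{}]*' + re.escape(word) + r'[^{}]*\}'
    return re.search(pattern, text) is not None

print(word_in_braces("You said {single order='randomize'} that P.E is...Why?", "randomize"))
print(word_in_braces("You said {single} randomize that P.E is...Why?", "randomize"))
```

The first call should report True and the second False, since in the second string the word only appears outside the braces.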

OTHER TIPS

This is the kind of thing that's very difficult for regex to do. If you do something like re.search(r"{.*?randomize.*?}", mystring), it will also match a string like "Hello there, I'm going to {break} your randomize regex {foobar}" and return "{break} your randomize regex {foobar}", even though the word is outside the brackets. You can probably pull this off with lookahead and lookbehind assertions, but not until you tell us whether the brackets can be nested, since a simple pattern will then fail on "I'm going to break you {now with randomize {nested} brackets}".
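
If nesting does turn out to matter, one alternative is to skip regex and track brace depth while scanning the string. The following is only a sketch under assumptions: the helper name inside_braces is hypothetical, the braces are assumed to be balanced, the word is assumed not to contain a brace itself, and any depth greater than zero counts as "inside".

```python
def inside_braces(text, word):
    """Return True if any occurrence of `word` starts at brace depth > 0."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == '{':
            depth += 1
        elif ch == '}':
            depth = max(depth - 1, 0)  # ignore stray closing braces
        elif depth > 0 and text.startswith(word, i):
            return True
    return False

print(inside_braces("Hello there, I'm going to {break} your randomize regex {foobar}", "randomize"))
print(inside_braces("I'm going to break you {now with randomize {nested} brackets}", "randomize"))
```

With the two example strings from this answer, the first call should give False (the word is between two groups) and the second True (the word is inside the outer pair).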

As per your update that the brackets will never be nested, this regex should match:

re.search("{[^}]*?randomize.*?}", mystring)

And you can access the group using .group(0). Put it all together to do something like:

import re

for mystring in group_of_strings_to_test:
    # re.search returns None when nothing matches, so test the result itself
    if re.search("{[^}]*?randomize.*?}", mystring):
        print("it has 'randomize' in a bracket")
    else:
        print("it doesn't")

To ensure you're not inside nested {}'s, it could be:

 {[^{}]*randomize[^{}]*}
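
A quick run on a few inputs shows what this stricter pattern does and does not accept. This is a sketch; the sample strings are chosen for illustration. Because [^{}] excludes both brace characters, the pattern only matches when no brace sits between the word and its enclosing pair.

```python
import re

strict = re.compile(r'\{[^{}]*randomize[^{}]*\}')

samples = [
    "{FOO} randomize {BAR}",                    # word between two groups
    "{FOO {randomize} BAR}",                    # word in the innermost pair
    "{now with randomize {nested} brackets}",   # word in the outer pair only
]
for s in samples:
    m = strict.search(s)
    print(repr(s), '->', m.group(0) if m else None)
```

Only the second sample should produce a match ({randomize}). The third is rejected even though the word is inside the outer braces, which is the price of keeping the pattern simple.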

The naive, simple method:

>>> import re
>>> mystring = "You said {single order='randomize'} that P.E is...Why?"
>>> print(re.search('{.*randomize.*}', mystring).group(0))

Once we have this, we can improve it bit by bit. For instance, this is called a greedy regex, which means:

>>> print(re.search('{.*randomize.*}', "{FOO {randomize} BAR}").group(0))
{FOO {randomize} BAR}

You'll probably want it to be non-greedy, so you should use '.*?' instead:

>>> print(re.search('{.*?randomize.*?}', mystring).group(0))

Also, it still matches when the word only sits between two separate bracketed groups:

>>> print(re.search('{.*?randomize.*?}', "{FOO} randomize {BAR}").group(0))
{FOO} randomize {BAR}

If you want to handle simple nesting, you may want to match all characters except other brackets on both sides of the word.

>>> print(re.search('{[^{}]*randomize[^{}]*}', mystring).group(0))
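
If you need the matching groups themselves rather than a yes/no answer, re.findall with the same kind of pattern returns all of them. A small sketch, with a made-up test string:

```python
import re

text = "a {x randomize} b {y} c {randomize z} randomize"
# With no capture groups in the pattern, findall returns the full matches
groups = re.findall(r'\{[^{}]*randomize[^{}]*\}', text)
print(groups)  # ['{x randomize}', '{randomize z}']
```
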
Licensed under: CC-BY-SA with attribution