Question

In Python/Django I have a string from which I extract the title by matching the characters before the ':' character, like:

some_string = "This is my Title: This is some text"

So I'm using this code to extract the title:

result = regex.search('(.*):', some_string)
result.group(1)
>>> 'This is my Title'

A problem arises when a user enters only a URL in the string, like:

some_string = 'http://vimeo.com/49742318'
result = regex.search('(.*):', some_string)
result.group(1)
>>> 'http'

I would prefer to get an empty string back in that case. I've tried using a negative lookahead, (?!...):

result = regex.search('(.*(?!http)):', some_string)

But it still returns 'http' instead of an empty string. How should the pattern be written?

Solution

The problem is that at the point where you've put the negative lookahead, the next character is also constrained to be a colon, so the negative lookahead succeeds trivially: the text at that position starts with ':' and therefore cannot start with http.
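A minimal sketch of that behaviour, assuming the `regex` name in the question refers to (or behaves like) the standard `re` module. It shows where the first group ends and what text the lookahead is actually tested against:

```python
import re

url = 'http://vimeo.com/49742318'

# The lookahead (?!http) is evaluated at the end of the group,
# i.e. immediately before the colon the pattern requires next.
m = re.search(r'(.*(?!http)):', url)

captured = m.group(1)
# Text the lookahead was tested against: it starts at the colon.
seen_by_lookahead = url[m.end(1):m.end(1) + 4]

print(captured)           # http
print(seen_by_lookahead)  # ://v
```

Because the text at that position begins with `://v` rather than `http`, the lookahead never rejects anything.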

What you probably actually want is to put the negative lookahead after the colon so that the next character is not a /:

(.*):(?!/)
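As a quick check of this pattern against the two strings from the question (again assuming the standard `re` module; the variable names are illustrative only):

```python
import re

pattern = re.compile(r'(.*):(?!/)')

title_match = pattern.search('This is my Title: This is some text')
url_match = pattern.search('http://vimeo.com/49742318')

print(title_match.group(1))  # This is my Title
print(url_match)             # None
```

Note that for the URL the search yields no match object at all rather than an empty string, so the caller still has to handle `None`.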

But at that point you might as well use a positive lookahead and stop using a capturing group at all. You should also not allow colons to be captured or the RE would be able to consume much more than you might expect:

result = regex.search('[^:]*(?=:[^/])', some_string)
result.group()
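One caveat: for a bare URL this pattern does not match at all, so `regex.search` returns `None` and `result.group()` would raise `AttributeError`. A small wrapper can turn that into the empty string the question asks for. The sketch below assumes the standard `re` module; `TITLE_RE` and `extract_title` are hypothetical names introduced for illustration.

```python
import re

# Text up to the first colon, provided the colon is not followed by '/'.
TITLE_RE = re.compile(r'[^:]*(?=:[^/])')

def extract_title(text):
    """Return the title before the first colon, or '' if there is none."""
    match = TITLE_RE.search(text)
    return match.group() if match else ''

print(repr(extract_title('This is my Title: This is some text')))  # 'This is my Title'
print(repr(extract_title('http://vimeo.com/49742318')))            # ''

# With several colons, [^:]* stops at the first one, while the greedy
# (.*) version runs up to the last colon not followed by '/'.
greedy = re.search(r'(.*):(?!/)', 'First: second: third').group(1)
print(repr(extract_title('First: second: third')))  # 'First'
print(repr(greedy))                                 # 'First: second'
```

Since `[^/]` must consume a character, a string that ends with the colon (such as 'Title:') also yields an empty result; if that case should count as a title, `(?=:(?!/))` is one possible variation.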
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow