Question

I'd like to know how to find a string that sits between a slash and a closing bracket such as ')' or '>', for example:

data = "(AVP:SMTP/xx@xx.xx) R:AVP:SMS.0/+44648474 id:24"
data2 = "(AVP:SMTP/<xxx@xx.xx>) R:AVP:FAX.0/<thisword> id:25"

So the idea is to get only xx@xx.xx and +44648474 from data, and xxx@xx.xx and thisword from data2.


I've tried this regex:


k = re.findall(r"/(\S+)",data2)

but it returns <xxx@xx.xx>) and <thisword>


What I'd like to get is xxx@xx.xx and thisword.


Solution

This one works.

import re

data = "(AVP:SMTP/xx@xx.xx) R:AVP:SMS.0/+44648474 id:24"
data2 = "(AVP:SMTP/<xxx@xx.xx>) R:AVP:FAX.0/<thisword> id:25"

regex = re.compile(r"/<?([^>\s\)]+)")

print(regex.findall(data))
print(regex.findall(data2))

>>> 
['xx@xx.xx', '+44648474']
['xxx@xx.xx', 'thisword']

Regex breakdown:

  • / : a literal / character.
  • <? : optionally, a < character.
  • ( : start capture group.
  • [^>\s\)]+ : one or more characters, none of which is >, whitespace (\s), or ).
  • ) : close capture group.
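
The breakdown above can also be written directly into the pattern with re.VERBOSE, which lets each piece carry its own comment. This is only a sketch: the name extract_values is a hypothetical helper introduced here for illustration, not part of the original answer.

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE so that
# whitespace in the pattern is ignored and each part can be commented.
PATTERN = re.compile(r"""
    /            # a literal slash
    <?           # an optional opening angle bracket
    (            # start of the capture group
      [^>\s)]+   # one or more characters that are not '>', whitespace, or ')'
    )            # end of the capture group
""", re.VERBOSE)


def extract_values(text):
    """Return every value that follows a slash (hypothetical helper)."""
    return PATTERN.findall(text)


data = "(AVP:SMTP/xx@xx.xx) R:AVP:SMS.0/+44648474 id:24"
data2 = "(AVP:SMTP/<xxx@xx.xx>) R:AVP:FAX.0/<thisword> id:25"

print(extract_values(data))   # ['xx@xx.xx', '+44648474']
print(extract_values(data2))  # ['xxx@xx.xx', 'thisword']
```

Because the pattern has exactly one capture group, findall returns only the captured text, so the slash and the optional < never appear in the results.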

OTHER TIPS

You can exclude such delimiters by using lookaround assertions:

k = re.findall(r"(?<=/<)[^>]+(?=>)",data2)

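This requires "/<" immediately before the match, then matches one or more characters that are not ">", and succeeds only if a ">" follows the match. Because the delimiters sit inside lookarounds, they are not part of the result. Note that this only finds values wrapped in angle brackets, so it returns an empty list for data.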
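
If both the bracketed and the bare form need to be handled with lookarounds, one possible approach is to add a second alternative for the unbracketed case. The following is a sketch under that assumption; the combined pattern and the name find_values are illustrative and do not come from the original answer. Python's re module requires fixed-width lookbehinds, which is why the two cases are written as separate alternatives.

```python
import re

# First alternative: value wrapped in <...> directly after a slash.
# Second alternative: bare value after a slash, ending at whitespace or ')'.
LOOKAROUND = re.compile(r"(?<=/<)[^>]+(?=>)|(?<=/)[^<\s)]+")


def find_values(text):
    """Return values after a slash, with or without angle brackets (hypothetical helper)."""
    return LOOKAROUND.findall(text)


data = "(AVP:SMTP/xx@xx.xx) R:AVP:SMS.0/+44648474 id:24"
data2 = "(AVP:SMTP/<xxx@xx.xx>) R:AVP:FAX.0/<thisword> id:25"

# The original lookaround pattern alone only handles the bracketed form.
print(re.findall(r"(?<=/<)[^>]+(?=>)", data))   # []
print(re.findall(r"(?<=/<)[^>]+(?=>)", data2))  # ['xxx@xx.xx', 'thisword']

# The combined pattern handles both inputs.
print(find_values(data))   # ['xx@xx.xx', '+44648474']
print(find_values(data2))  # ['xxx@xx.xx', 'thisword']
```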

Licensed under: CC-BY-SA with attribution