Your first question can be answered in this way:
>>> import re
>>> t = "foo <bar e word> f ga <foo b>"
>>> t2 = re.sub(r"(^|\s+)(?![^<>]*?>)", " #", t).lstrip()
>>> t2
'#foo #<bar e word> #f #ga #<foo b>'
I added lstrip() to remove the single space that the pattern leaves at the start of the result. If you want to go with your first option, you could simply replace "#<" with "<".
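As a minimal sketch of that last step, assuming the "first option" means leaving the bracketed groups unmarked, the substitution and the plain string replacement can be chained. The name t3 is only illustrative:

```python
import re

t = "foo <bar e word> f ga <foo b>"

# Mark every whitespace-separated token outside <...> with '#'.
t2 = re.sub(r"(^|\s+)(?![^<>]*?>)", " #", t).lstrip()

# Assumption: the bracketed groups should not carry the marker,
# so drop the '#' that ends up directly in front of each '<'.
t3 = t2.replace("#<", "<")
print(t3)  # '#foo <bar e word> #f #ga <foo b>'
```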
Your second question can be solved in the following manner, although you might need to think about the "," in a list like ['g,', "b'"]: should the comma from your string be there or not? There may be a faster way; the following is merely a proof of concept. A list comprehension might take the place of the final loop, although it would be fairly complicated.
>>> s = "c'4 d8 < e' g' >16 fis'4 a,, <g, b'> c''1"
>>> q2 = re.compile(r"(?:<)\s*[^>]*\s*(?:>)\d*|(?<!<)[^\d\s<>]+\d+|(?<!<)[^\d\s<>]+")
>>> s2 = q2.findall(s)
>>> s3 = [re.sub(r"\s*[><]\s*", '', x) for x in s2]
>>> s4 = [y.split() if ' ' in y else y for y in s3]
>>> s4
["c'4", 'd8', ["e'", "g'16"], "fis'4", 'a,,', ['g,', "b'"], "c''1"]
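The three alternatives in q2 can be hard to pick apart on one line. The sketch below rewrites the same pattern with re.VERBOSE so each branch carries a comment. The name q2_verbose is hypothetical, the redundant non-capturing groups around "<" and ">" are dropped, and it is meant to match the same tokens as q2 on the sample string:

```python
import re

s = "c'4 d8 < e' g' >16 fis'4 a,, <g, b'> c''1"

q2_verbose = re.compile(r"""
    < \s* [^>]* \s* > \d*     # a bracketed group, optional duration after it
  | (?<!<) [^\d\s<>]+ \d+     # a single note followed by a duration
  | (?<!<) [^\d\s<>]+         # a single note with no duration
""", re.VERBOSE)

tokens = q2_verbose.findall(s)
print(tokens)
```

Whitespace outside character classes is ignored in verbose mode, which is why the pattern can be spaced out freely here.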
>>> q3 = re.compile(r"([^\d]+)(\d*)")
>>> s = []
>>> for item in s4:
...     if type(item) == list:  # bracketed group: several pitches, one duration
...         lis = []
...         for elem in item:
...             lis.append(q3.search(elem).group(1))
...             if q3.search(elem).group(2) != '':
...                 num = q3.search(elem).group(2)
...         if q3.search(elem).group(2) != '':
...             s.append((num, lis))
...         else:
...             s.append((0, lis))
...     else:  # single note
...         if q3.search(item).group(2) != '':
...             s.append((q3.search(item).group(2), [q3.search(item).group(1)]))
...         else:
...             s.append((0, [q3.search(item).group(1)]))
...
>>> s
[('4', ["c'"]), ('8', ['d']), ('16', ["e'", "g'"]), ('4', ["fis'"]), (0, ['a,,']), (0, ['g,', "b'"]), ('1', ["c''"])]
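For comparison, here is a more compact sketch of the same idea as a single function. It is not a drop-in replacement: it assumes the duration of a bracketed group always follows the closing ">", and the names TOKEN, CHORD, PITCH and parse_notes are made up for this example. Durations stay as strings, with 0 for "no duration", to mirror the output above.

```python
import re

TOKEN = re.compile(r"<[^>]*>\d*|[^\d\s<>]+\d*")  # one chord or one note
CHORD = re.compile(r"<([^>]*)>(\d*)")            # pitches inside <>, duration after
PITCH = re.compile(r"(\D+)(\d*)")                # pitch text, then optional duration

def parse_notes(text):
    """Return a list of (duration, [pitches]) tuples."""
    result = []
    for tok in TOKEN.findall(text):
        chord = CHORD.fullmatch(tok)
        if chord:
            # Assumption: pitches inside the brackets carry no digits.
            pitches, dur = chord.group(1).split(), chord.group(2)
        else:
            note = PITCH.fullmatch(tok)
            pitches, dur = [note.group(1)], note.group(2)
        result.append((dur or 0, pitches))
    return result

notes = parse_notes("c'4 d8 < e' g' >16 fis'4 a,, <g, b'> c''1")
print(notes)
```

On the sample string this should give the same list as the loop above. Whether the trailing comma in 'g,' belongs in the result is still the open question raised earlier.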