Python interpreting a Regex from a yaml config file
Question
So I have a yaml file that I'm using as a config file. I'm trying to do some string matching with regular expressions, but I'm having trouble interpreting the regex from yaml into python. The regex in question looks like this:
regex:
- [A-Za-z0-9]
When I try to use the re.match function, I get this error:
Traceback (most recent call last):
  File "./dirpylint.py", line 132, in <module>
    sys.exit(main())
  File "./dirpylint.py", line 32, in main
    LevelScan(level)
  File "./dirpylint.py", line 50, in LevelScan
    regex_match(level)
  File "./dirpylint.py", line 65, in regex_match
    if re.match(expression, item) == None:
  File "/usr/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.7/re.py", line 229, in _compile
    p = _cache.get(cachekey)
TypeError: unhashable type: 'list'
I understand that it's interpreting the regex as a list, but how would I use the regex defined in the yaml file to search for a string?
Solution
The problem is the YAML, not the Python.
If you want to store a string value containing literal square brackets in a YAML file, you have to quote it:
regex:
- "[A-Za-z0-9]"
Also, note that in this YAML the value of regex is a list containing one string, not a simple string.
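Because the loaded value may be either a single string or a list of strings depending on how the YAML is written, it can help to normalize it before calling re.match. The following is only a minimal sketch, assuming Python 3; compile_patterns and matches_any are hypothetical helper names, not part of the asker's script or of any library.

```python
import re


def compile_patterns(value):
    """Compile a YAML-loaded value (a string or a list of strings) into patterns."""
    # Treat a bare string as a one-element list so both YAML shapes work.
    items = [value] if isinstance(value, str) else value
    if not isinstance(items, list):
        raise TypeError("expected a string or a list of strings, got %r" % (value,))
    patterns = []
    for item in items:
        # An unquoted [A-Za-z0-9] arrives here as a nested list, not a string.
        if not isinstance(item, str):
            raise TypeError(
                "expected a string pattern, got %r; is it quoted in the YAML?" % (item,)
            )
        patterns.append(re.compile(item))
    return patterns


def matches_any(patterns, text):
    """Return True if any compiled pattern matches at the start of text."""
    return any(p.match(text) for p in patterns)
```

With the quoted YAML above, compile_patterns(config['regex']) would receive ['[A-Za-z0-9]'], where config is whatever name holds the loaded mapping. With the unquoted version it raises a TypeError that points at the quoting problem instead of the less obvious "unhashable type: 'list'".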
Other tips
I did this in my YAML parsing "engine".
In [1]: from StringIO import StringIO
In [2]: import re, yaml
In [3]: yaml.add_constructor('!regexp', lambda l, n: re.compile(l.construct_scalar(n)))
In [4]: yaml.load(StringIO("pattern: !regexp '^(Yes|No)$'"))
Out[4]: {'pattern': re.compile(ur'^(Yes|No)$')}
This also works if you want to use safe_load and !!python/regexp (similar to the Ruby and Node.js implementations):
In [5]: yaml.SafeLoader.add_constructor(u'tag:yaml.org,2002:python/regexp', lambda l, n: re.compile(l.construct_scalar(n)))
In [6]: yaml.safe_load(StringIO("pattern: !!python/regexp '^(Yes|No)$'"))
Out[6]: {'pattern': re.compile(ur'^(Yes|No)$')}
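The session above is Python 2 (note the StringIO import and the ur'' prefix in the output). A rough Python 3 equivalent might look like the sketch below. It assumes PyYAML 5.1 or later, where yaml.load takes an explicit Loader, and it registers the tag on a subclass so the global SafeLoader is left untouched. RegexLoader and construct_regexp are hypothetical names chosen for illustration.

```python
import re

import yaml


class RegexLoader(yaml.SafeLoader):
    """SafeLoader subclass that additionally understands the !regexp tag."""


def construct_regexp(loader, node):
    # Read the tagged scalar as a string and compile it.
    return re.compile(loader.construct_scalar(node))


# add_constructor is a classmethod; calling it on the subclass keeps the
# registration local to RegexLoader.
RegexLoader.add_constructor("!regexp", construct_regexp)

config = yaml.load("pattern: !regexp '^(Yes|No)$'", Loader=RegexLoader)
pattern = config["pattern"]
```

Here pattern is already a compiled regular expression, so pattern.match(some_string) can be called directly without a separate re.compile step.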
You're using two list constructs in your YAML file. When you load the YAML file:
>>> d = yaml.load(open('config.yaml'))
You get this:
>>> d
{'regex': [['A-Za-z0-9']]}
Note that the square brackets in your regular expression are actually disappearing because they are being recognized as list delimiters. You can quote them:
regex:
- "[A-Za-z0-9]"
To get this:
>>> yaml.load(open('config.yaml'))
{'regex': ['[A-Za-z0-9]']}
So the regular expression is d['regex'][0]. But you could also just do this in your YAML file:
regex: "[A-Za-z0-9]"
Which gets you:
>>> d = yaml.load(open('config.yaml'))
>>> d
{'regex': '[A-Za-z0-9]'}
So the regular expression can be retrieved with a similar dictionary lookup:
>>> d['regex']
'[A-Za-z0-9]'
...which is arguably much simpler.
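The three YAML shapes discussed in this answer can be compared side by side. This is a minimal sketch, assuming Python 3 with PyYAML installed; it uses yaml.safe_load on inline strings instead of reading config.yaml, and the variable names are only illustrative.

```python
import re

import yaml

# Unquoted: the brackets are parsed as a YAML flow sequence (a nested list).
unquoted = yaml.safe_load("regex:\n- [A-Za-z0-9]\n")

# Quoted list item: a list containing one string.
quoted_list = yaml.safe_load('regex:\n- "[A-Za-z0-9]"\n')

# Quoted scalar: a plain string, retrieved with a single lookup.
quoted_scalar = yaml.safe_load('regex: "[A-Za-z0-9]"\n')

# Only the quoted forms give re.match a string to work with.
first_match = re.match(quoted_list["regex"][0], "abc123")
second_match = re.match(quoted_scalar["regex"], "abc123")
```

Both first_match and second_match are match objects for the leading "a", whereas passing unquoted["regex"][0] to re.match would fail because that value is a list.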