Question

So I have a yaml file that I'm using as a config file. I'm trying to do some string matching with regular expressions, but I'm having trouble interpreting the regex from yaml into python. The regex in question looks like this:

regex:
    - [A-Za-z0-9]

And when I try to use the re.match function, I get this error:

Traceback (most recent call last):
  File "./dirpylint.py", line 132, in <module>
    sys.exit(main())
  File "./dirpylint.py", line 32, in main
    LevelScan(level)
  File "./dirpylint.py", line 50, in LevelScan
    regex_match(level)
  File "./dirpylint.py", line 65, in regex_match
    if re.match(expression, item) == None:
  File "/usr/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.7/re.py", line 229, in _compile
    p = _cache.get(cachekey)
TypeError: unhashable type: 'list'

I understand that it's interpreting the regex as a list, but how would I use the regex defined in the yaml file to search for a string?

Was it helpful?

Solution

The problem is the YAML, not the Python.
If you want to store a string value containing literal square brackets in a YAML file, you have to quote it:

regex:
  - "[A-Za-z0-9]"

Also, note that in this YAML the value of regex is a list containing one string, not a simple string.

OTHER TIPS

I did this in my YAML parsing "engine".

In [1]: from StringIO import StringIO
In [2]: import re, yaml
In [3]: yaml.add_constructor('!regexp', lambda l, n: re.compile(l.construct_scalar(n)))
In [4]: yaml.load(StringIO("pattern: !regexp '^(Yes|No)$'"))
Out[4]: {'pattern': re.compile(ur'^(Yes|No)$')}

Also this works if you want to use safe_load and !!python/regexp (similar to ruby's and nodejs' implementations):

In [5]: yaml.SafeLoader.add_constructor(u'tag:yaml.org,2002:python/regexp', lambda l, n: re.compile(l.construct_scalar(n)))
In [6]: yaml.safe_load(StringIO("pattern: !!python/regexp '^(Yes|No)$'"))
Out[6]: {'pattern': re.compile(ur'^(Yes|No)$')}

You're using two list constructs in your YAML file. When you load the YAML file:

>>> d = yaml.load(open('config.yaml'))

You get this:

>>> d
{'regex': [['A-Za-z0-9']]}

Note that the square brackets in your regular expression are actually disappearing because they are being recognized as list delimiters. You can quote them:

regex: - "[A-Za-z0-9]"

To get this:

>>> yaml.load(open('config.yaml'))
{'regex': ['[A-Za-z0-9]']}

So the regular expression is d['regex'][0]. But you could also just do this in your yaml file:

regex: "[A-Za-z0-9]"

Which gets you:

>>> d = yaml.load(open('config.yaml'))
>>> d
{'regex': '[A-Za-z0-9]'}

So the regular expression can be retrieved with a similar dictionary lookup:

>>> d['regex']
'[A-Za-z0-9]'

...which is arguably much simpler.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top