Question

I'd like to be able to define a schema in yaml, read it in using pyyaml and then validate with voluptuous (or other schema validator!). However, as stated in the question title, I run into the need to have the builtin class str instantiated for voluptuous rather than the string representation of it.

from voluptuous import Schema
import yaml


y = '''
a: str
b: int
c:
  d: float
  e: str
'''

yaml_schema = yaml.load(y,
                        Loader=yaml.CLoader)

schema1 = Schema(yaml_schema, required=True)

However, this schema now accepts only the literal string 'str' as the value of a. Using PyYAML's Python-specific tags directly (e.g. 'a': !!python/int) fails. Instead, I want the schema below:

schema2 = Schema({'a': str,
                  'b': int,
                  'c': {'d': float,
                        'e': str}},
                 required=True)

I am well aware that eval is not a production solution, but the function evaler below will convert the dict loaded from YAML into the dict that schema2 is built from:

def evaler(d):
    out = {}
    for k, v in d.items():
        if isinstance(v, dict):
            out[k] = evaler(v)
        else:
            out[k] = eval(v)
    return out

## Tests:

## passing
Schema(evaler(yaml_schema),
       required=True)({'a': 'foo',
                       'b': 2,
                       'c': {'d': 2.0,
                             'e': 'bar'}})

## failing
Schema(evaler(yaml_schema),
       required=True)({'a': 3,
                       'b': 2,
                       'c': {'d': 2.0,
                             'e': 1}})

I'm also aware that you can look up a user-defined class by name:

class foo: pass
globals()['foo']

But with builtins this is not possible:

globals()['int']
# KeyError: 'int'

I explored the new and types modules, but didn't have any luck...


Solution

The safest, simplest, clearest solution is a mapping explicitly listing the types you care about:

TYPES = {
    'str': str,
    'int': int,
    ...
}

You can eliminate the repetition (at the cost of some flexibility) by creating this dict from a list of types:

TYPES = {cls.__name__: cls for cls in [str, int, ...]}

Then you can recursively walk the document (as you do in evaler) and replace each string s with TYPES[s].

If you insist on supporting all built-in types by name, without listing them separately, you can use the builtins module (called __builtin__ in Python 2) and look the names up with getattr. You should probably check that the result is actually a type, because many built-in names (functions such as print or len, for example) are not.
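As a rough sketch of the two approaches described above (not a definitive implementation): the names lookup_type, resolve_types and the allow_builtins flag are made up for illustration, the set of whitelisted types is an assumption, and yaml_schema is a hand-written stand-in for the dict that yaml.load returns in the question.

```python
import builtins

# Explicit whitelist of type names the schema file may use (assumed set).
TYPES = {cls.__name__: cls for cls in (str, int, float, bool)}


def lookup_type(name, allow_builtins=False):
    """Return the type called `name`, or raise ValueError (hypothetical helper)."""
    if not isinstance(name, str):
        raise ValueError(f"expected a type name, got {name!r}")
    if name in TYPES:
        return TYPES[name]
    if allow_builtins:
        candidate = getattr(builtins, name, None)
        # Many built-in names are functions or constants, so require a class.
        if isinstance(candidate, type):
            return candidate
    raise ValueError(f"unknown type name: {name!r}")


def resolve_types(node, allow_builtins=False):
    """Recursively replace type-name strings in a nested dict with the types."""
    if isinstance(node, dict):
        return {key: resolve_types(value, allow_builtins)
                for key, value in node.items()}
    return lookup_type(node, allow_builtins)


# Stand-in for the dict produced by yaml.load in the question.
yaml_schema = {'a': 'str', 'b': 'int', 'c': {'d': 'float', 'e': 'str'}}
resolved = resolve_types(yaml_schema)
```

The resulting dict can be passed to Schema(resolved, required=True) in place of evaler(yaml_schema). Unlike eval, an unexpected string in the YAML file raises ValueError instead of being executed.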

You need to walk the document in any case. From PyYAML's perspective, the string "str" used as mapping value has the same tag as the string "a" used as mapping key, so you can't do anything by specifying a different class for that tag. While you could perhaps dive into its guts and introduce a hack that treats scalar mapping values differently, this would be just that: A hack. And a huge amount of additional work to boot. Not worth it.

Licensed under: CC-BY-SA with attribution