Question

I have a YAML settings file which is used to create some records in a db:

setting1:
  name: [item,item]
  name1: text
anothersetting2:
  name: [item,item]
  sub_setting:
      name :[item,item]

When I update this file with setting3 and regenerate the records in the db with:

import yaml
fh = open('setting.txt', 'r')
setting_list = yaml.load(fh)
for i in setting_list:
    add_to_db(i)

it's vital that the order of the settings (their id numbers in the db) stays the same each time I add them to the db, and that setting3 just gets appended to the end of what yaml.load() returns, so that its id doesn't clash with any records which are already in the db. At the moment, each time I add another setting and call yaml.load(), the records get loaded in a different order, which results in different ids. I would welcome any ideas ;)

EDIT: I followed abarnert's tips and used this gist: https://gist.github.com/844388

Works as expected, thanks!


Solution 3

The YAML spec clearly says that the key order within a mapping is a "representation detail" that cannot be relied on. So your settings file is already invalid if it's relying on the order of the mapping's keys, and you'd be much better off using valid YAML, if at all possible.

Of course YAML is extensible, and there's nothing stopping you from adding an "ordered mapping" type to your settings files. For example, tagging each mapping whose order matters, including the top-level one:

--- !omap
setting1: !omap
  name: [item,item]
  name1: text
anothersetting2: !omap
  name: [item,item]
  sub_setting:
      name :[item,item]

You didn't mention which yaml module you're using. There is no such module in the standard library, and there are at least two packages just on PyPI that provide modules with that name. However, I'm going to guess it's PyYAML, because as far as I know that's the most popular.

The extension described above is easy to parse with PyYAML. See http://pyyaml.org/ticket/29:

import yaml

def omap_constructor(loader, node):
    # Return the mapping's entries as (key, value) pairs, in document order.
    return loader.construct_pairs(node)

yaml.add_constructor(u'!omap', omap_constructor)

Now, instead of:

{'anothersetting2': {'name': ['item', 'item'],
  'sub_setting': 'name :[item,item]'},
 'setting1': {'name': ['item', 'item'], 'name1': 'text'}}

You'll get this:

[('setting1', [('name', ['item', 'item']), ('name1', 'text')]),
 ('anothersetting2', [('name', ['item', 'item']),
   ('sub_setting', 'name :[item,item]')])]

Of course this gives you a list of key-value tuples, but you can easily write a construct_ordereddict and get an OrderedDict instead. You can also write a representer that stores OrderedDict objects as !omaps, if you need to output as well as input.
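
A minimal sketch of such a constructor and representer, assuming PyYAML and assuming the !omap tag is written on each mapping value (with a leading --- !omap for the top level). The function names are illustrative, and registering them on SafeLoader and SafeDumper is a choice made here so that safe_load and safe_dump can be used:

```python
from collections import OrderedDict

import yaml  # PyYAML (assumed; the question does not name the module)


def construct_ordereddict(loader, node):
    # construct_pairs yields (key, value) pairs in document order;
    # deep=True builds nested values completely before they are wrapped.
    return OrderedDict(loader.construct_pairs(node, deep=True))


def represent_ordereddict(dumper, data):
    # Passing a list of pairs (not a dict) keeps PyYAML from sorting the keys.
    return dumper.represent_mapping('!omap', list(data.items()))


yaml.add_constructor('!omap', construct_ordereddict, Loader=yaml.SafeLoader)
yaml.add_representer(OrderedDict, represent_ordereddict, Dumper=yaml.SafeDumper)

text = """\
--- !omap
setting1: !omap
  name: [item, item]
  name1: text
anothersetting2: !omap
  name: [item, item]
"""

settings = yaml.safe_load(text)
dumped = yaml.safe_dump(settings)
round_tripped = yaml.safe_load(dumped)
print(list(settings))
```

Because OrderedDict equality is order-sensitive, comparing round_tripped with settings is a quick way to check that a dump followed by a load keeps the order.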

If you really want to hook PyYAML to make it use an OrderedDict instead of a dict for default mappings, it's pretty easy to do if you're already working directly on parser objects, but more difficult if you want to stick with the high-level convenience methods. Fortunately, the above-linked ticket has an implementation you can use. Just remember that you're not using real YAML anymore, but a variant, so any other software that deals with your files can, and likely will, break.
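
One common way to do this is a loader subclass that overrides the constructor for the standard map tag. The sketch below assumes PyYAML; OrderedLoader and construct_ordered_mapping are hypothetical names, the settings text is a tidied copy of the question's file with a made-up setting3, and this is not the implementation from the ticket:

```python
from collections import OrderedDict

import yaml  # PyYAML (assumed)


class OrderedLoader(yaml.SafeLoader):
    """Hypothetical loader that turns every plain mapping into an OrderedDict."""


def construct_ordered_mapping(loader, node):
    # Resolve '<<' merge keys first, as the stock mapping constructor does.
    loader.flatten_mapping(node)
    return OrderedDict(loader.construct_pairs(node, deep=True))


# Registered on the subclass only, so the stock loaders are left untouched.
OrderedLoader.add_constructor(
    yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
    construct_ordered_mapping,
)

text = """\
setting1:
  name: [item, item]
  name1: text
anothersetting2:
  name: [item, item]
  sub_setting:
    name: [item, item]
setting3:
  name: [item]
"""

settings = yaml.load(text, Loader=OrderedLoader)

# Ids assigned by position stay stable as long as new settings are appended.
for new_id, name in enumerate(settings, start=1):
    print(new_id, name)
```

Keeping the override on a subclass means other code that calls yaml.safe_load still gets ordinary dicts.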

OTHER TIPS

My project oyaml is a drop-in replacement for PyYAML, which will load maps into collections.OrderedDict instead of regular dicts. Just pip install it and use as normal - works on both Python 3 and Python 2.

Demo with your example:

>>> import oyaml as yaml  # pip install oyaml
>>> yaml.load('''setting1:
...   name: [item,item]
...   name1: text
... anothersetting2:
...   name: [item,item]
...   sub_setting:
...       name :[item,item]''')
OrderedDict([('setting1',
              OrderedDict([('name', ['item', 'item']), ('name1', 'text')])),
             ('anothersetting2',
              OrderedDict([('name', ['item', 'item']),
                           ('sub_setting', 'name :[item,item]')]))])

Note that if the stdlib dict is order preserving (Python >= 3.7, CPython >= 3.6) then oyaml will use an ordinary dict.
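
On such interpreters, plain PyYAML should already load mappings in file order, since it fills the dict key by key as it reads the document. A small sketch, assuming Python 3.7+ and PyYAML 5.1 or later (for the sort_keys argument); the settings text is a tidied copy of the question's file with a hypothetical setting3 appended:

```python
import yaml  # PyYAML (assumed), on Python 3.7+

text = """\
setting1:
  name: [item, item]
  name1: text
anothersetting2:
  name: [item, item]
setting3:
  name: [item]
"""

settings = yaml.safe_load(text)  # an ordinary dict, filled in document order
print(list(settings))

# By default dump sorts keys; sort_keys=False keeps insertion order on output.
dumped = yaml.safe_dump(settings, sort_keys=False)
reloaded = yaml.safe_load(dumped)
```

The YAML spec still treats key order as insignificant, so this relies on the behaviour of the loader and of the Python version rather than on the file format.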

You can now use ruamel.yaml for this.

From https://pypi.python.org/pypi/ruamel.yaml:

ruamel.yaml is a YAML parser/emitter that supports roundtrip preservation of comments, seq/map flow style, and map key order

For a single item that is known to be an ordered dictionary, just make its entries the items of a list and use collections.OrderedDict:

setting1:
  - name: [item,item]
  - name1: text
anothersetting2:
  - name: [item,item]
  - sub_setting:
      name :[item,item]

import collections
import yaml

with open('setting.txt', 'r') as fh:
    setting_list = yaml.safe_load(fh)

# Each list item is a single-key mapping; take its only (key, value) pair.
setting1 = collections.OrderedDict(list(x.items())[0] for x in setting_list['setting1'])

Last I heard, PyYAML did not support this, though it would probably be easy to modify it to accept a dictionary or dictionary-like object as a starting point.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow