Question

I have a list of dictionaries, which I want to serialize:

list_of_dicts = [ { 'key_1': 'value_a', 'key_2': 'value_b'},
                  { 'key_1': 'value_c', 'key_2': 'value_d'},
                  ...
                  { 'key_1': 'value_x', 'key_2': 'value_y'}  ]

yaml.dump(list_of_dicts, file, default_flow_style = False)

produces the following:

- key_1: value_a
  key_2: value_b
- key_1: value_c
  key_2: value_d
(...)
- key_1: value_x
  key_2: value_y

But I'd like to get this:

- key_1: value_a
  key_2: value_b
                     <-|
- key_1: value_c       | 
  key_2: value_d       |  empty lines between blocks
(...)                  |
                     <-|
- key_1: value_x
  key_2: value_y

PyYAML documentation talks about dump() arguments very briefly and doesn't seem to have anything on this particular subject.

Editing the file manually to add newlines improves readability quite a lot, and the structure still loads just fine afterwards, but I have no idea how to make the dump method generate them.

And in general, is there a way to have more control over output formatting besides simple indentation?


Solution

There's no easy way to do this with the library (Node objects in the YAML dumper's syntax tree are passive and can't emit this info), so I ended up with:

stream = yaml.dump(list_of_dicts, default_flow_style = False)
file.write(stream.replace('\n- ', '\n\n- '))
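The same post-processing idea can be wrapped in a small helper. This is only a sketch: the function name add_blank_lines_between_items is made up for illustration, and the input string is hard-coded to stand in for what yaml.dump(list_of_dicts, default_flow_style=False) returns. It assumes that top-level items start with "- " at column 0 and that no scalar content starts a line that way.

```python
import re


def add_blank_lines_between_items(yaml_text: str) -> str:
    """Insert an empty line before every top-level "- " item except the first.

    Hypothetical helper, not part of PyYAML.
    """
    # A newline directly followed by "- " marks the start of the next
    # top-level item; indented (nested) items do not match and stay as-is.
    return re.sub(r"\n(?=- )", "\n\n", yaml_text)


# Stand-in for the string returned by
# yaml.dump(list_of_dicts, default_flow_style=False)
dumped = (
    "- key_1: value_a\n"
    "  key_2: value_b\n"
    "- key_1: value_c\n"
    "  key_2: value_d\n"
)

spaced = add_blank_lines_between_items(dumped)
print(spaced)
```

Because blank lines between block sequence items are not significant in YAML, the spaced text should load back to the same data as the original dump.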

OTHER TIPS

PyYAML documentation only talks about dump() arguments briefly, because there is not much to say. This kind of control is not provided by PyYAML.

To allow preservation of such empty (and comment) lines in YAML that is loaded, I started the development of the ruamel.yaml library, a superset of the stalled PyYAML, with YAML 1.2 compatibility, many features added and bugs fixed. With ruamel.yaml you can do:

import sys
import ruamel.yaml

yaml_str = """\
- key_1: value_a
  key_2: value_b

- key_1: value_c
  key_2: value_d

- key_1: value_x  # a few before this were ellipsed
  key_2: value_y
"""

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
yaml.dump(data, sys.stdout)

and get output that is exactly the same as the input string (including the comment).

You can also build the output that you want from scratch:

import sys
import ruamel.yaml

yaml = ruamel.yaml.YAML()
list_of_dicts = yaml.seq([ { 'key_1': 'value_a', 'key_2': 'value_b'},
                           { 'key_1': 'value_c', 'key_2': 'value_d'},
                           { 'key_1': 'value_x', 'key_2': 'value_y'}  ])

for idx in range(1, len(list_of_dicts)):
    list_of_dicts.yaml_set_comment_before_after_key(idx, before='\n')

ruamel.yaml.comments.dump_comments(list_of_dicts)
yaml.dump(list_of_dicts, sys.stdout)

The conversion using yaml.seq() is necessary to create an object that allows attachment of the empty lines through special attributes.

The library also allows preservation and easy setting of quotes and literal style on strings, and of the format of ints (hex, octal, binary) and floats. It also supports separate indent specifications for mappings and sequences (although not for individual mappings or sequences).

While it's a little clunky, I had the same goal as the OP. I solved it by subclassing yaml.Dumper:

from yaml import Dumper

class MyDumper(Dumper):

  def write_indent(self):
    indent = self.indent or 0
    if not self.indention or self.column > indent \
        or (self.column == indent and not self.whitespace):
      self.write_line_break()


    ##################################################
    # On the first level of indentation, add an extra
    # newline

    if indent == 2:
      self.write_line_break()

    ##################################################

    if self.column < indent:
      self.whitespace = True
      data = u' '*(indent-self.column)
      self.column = indent
      if self.encoding:
        data = data.encode(self.encoding)
      self.stream.write(data)

You call it like this:

from yaml import dump

print(dump(python_dict, default_flow_style=False, width=79, Dumper=MyDumper))
Licensed under: CC-BY-SA with attribution