Question

Similar to "Is there a query language for JSON?" and the more specific "How can I filter a YAML dataset with an attribute value?", I would like to:

  • hand-edit small amounts of data in YAML files
  • perform arbitrary queries on the complete dataset (probably in Python, open to other ideas)
  • work with the resulting subset in Python

It doesn't appear that PyYAML has a feature like this, and today I can't find the link I had to the YQuery language, which wasn't a mature project anyway (or maybe I dreamt it).

Is there a (Python) library that offers YAML queries? If not, is there a Pythonic way to "query" a set of objects other than just iterating over them?

No correct solution

OTHER TIPS

I don't think there is a direct way to do it. However, PyYAML parses a YAML file into ordinary Python objects (dicts, lists, strings, numbers), so once the file is loaded you can apply any of the usual dict and list operations to it. The question "python query keys in a dictionary based on values" mentions some Pythonic "query" styles.
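As a rough illustration of querying the loaded data with plain Python, here is a minimal sketch. The `data` dict stands in for what `yaml.safe_load()` would return; the records, the `rating` field, and the `query` helper are all hypothetical names made up for the example.

```python
# Stand-in for the result of yaml.safe_load(open("genres.yaml"));
# the file name and the records below are assumptions for illustration.
data = {
    "genres": [
        {"name": "action", "description": "Car chases, guns and violence.", "rating": 3},
        {"name": "drama", "description": "Family conflict.", "rating": 5},
        {"name": "comedy", "description": "Jokes.", "rating": 4},
    ]
}


def query(records, **criteria):
    """Return the records whose fields equal every given keyword value."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# Equality query through the hypothetical helper.
action = query(data["genres"], name="action")

# Arbitrary conditions are easiest to write as a comprehension.
highly_rated = [genre["name"] for genre in data["genres"] if genre["rating"] >= 4]
```

The results are ordinary lists of dicts, so the subset can be passed straight to the rest of your Python code.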

bootalchemy provides a means to do this via SQLAlchemy. First, define your schema in a SQLAlchemy model. Then load your YAML into a SQLAlchemy session using bootalchemy. Finally, perform queries on that session. (You don't have to commit the session to an actual database.)

Example from the PyPI page (assume model is already defined):

from bootalchemy.loader import Loader

# (this simulates loading the data from YAML)
data = [{'Genre':[{'name': "action",
                   'description':'Car chases, guns and violence.'
                  }
                 ]
        }
       ]

# load YAML data into session using pre-defined model
loader = Loader(model)
loader.from_list(session, data)

# query SQLAlchemy session
genres = session.query(Genre).all()

# print results
print([(genre.name, genre.description) for genre in genres])

Output:

[('action', 'Car chases, guns and violence.')]

You could try jsonpath. It is meant for JSON rather than YAML, but the query runs on the parsed data, not on the JSON or YAML text, so it should work as long as your data structures are JSON-compatible. (It seems to work with the Python libraries jsonpath and jsonpath-rw.)
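To show why the source format does not matter once the data is parsed, here is a small hand-rolled path lookup in the spirit of JSONPath. It is not the API of jsonpath or jsonpath-rw; the `find` function, its list-based path syntax, and the sample data are assumptions made up for this sketch.

```python
def find(node, path):
    """Yield every value reached by following `path` from `node`.

    `path` is a list of dict keys or list indexes; "*" matches every child.
    """
    if not path:
        yield node
        return
    head, rest = path[0], path[1:]
    if head == "*":
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            children = []
    elif isinstance(node, dict) and head in node:
        children = [node[head]]
    elif isinstance(node, list) and isinstance(head, int) and -len(node) <= head < len(node):
        children = [node[head]]
    else:
        children = []  # no match at this step
    for child in children:
        yield from find(child, rest)


# Stand-in for parsed YAML (or JSON); the content is hypothetical.
data = {
    "genres": [
        {"name": "action", "tags": ["cars", "guns"]},
        {"name": "drama", "tags": ["family"]},
    ]
}

names = list(find(data, ["genres", "*", "name"]))
first_tags = list(find(data, ["genres", 0, "tags", "*"]))
```

A real JSONPath library adds filters, recursive descent, and a string syntax on top of the same idea, so it is the better choice for anything beyond simple lookups.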

You can check the following tools:

  • yq, for CLI queries in the style of jq,
  • yaml-query, another CLI query tool, written in Python.
Licensed under: CC-BY-SA with attribution