Pergunta

I've noticed a few Python packages that use config files written in Python. Apart from the obvious privilege escalation, what are the pros and cons of this approach?

Is there much of a precedence for this? Are there any guides as to the best way to implement this?

Just to clarify: In my particular use case, this will only be used by programmers or people who know what they're doing. It's not a config file in a piece of software that will be distributed to end users.

Foi útil?

Solução

The best example I can think of for this is the django settings.py file, but I'm sure there are tons of other examples for using a Python file for configuration.

There are a couple of key advantages for using Python as config file over other solutions, for example:

  • There is no need to parse the file: Since the file is already Python, you don't have to write or import a parser to extract the key value pairs from the file.
  • Configuration settings can be more than just key/values: While it would be folly to have settings define their own classes, you can use them to define tuples, lists or dictionaries of settings allowing for more options and configuration than other options. This is especially true with django, where the settings file has to accommodate for all manner of plug-ins that weren't originally known by the framework designers.
  • Writing configuration files is easy: This is spurious, but since the configuration is a Python file it can be edited and debugged within the IDE of the program itself.
  • Implicit error-checking: If your program requires an option called FILE_NAME and that isn't in the settings the program will throw an exception. This means that settings become mandatory and error handling of the settings can be more explicit. This can be a double edged sword, but manually changing config files should be for power editors who should be able to handle the consequences of exceptions.
  • Config options are easily accessed and namespaces: Once you go import settings you can wildly start calling settings.UI_COLOR or settings.TIMEOUT. These are clear, and with the right IDE, tracking where these settings are made becomes easier than with flat files.

But the most powerful reason: Overrides, overrides, overrides. This is quite an advanced situation and can be use-case specific, but one that is encouraged by django in a few places.

Picture that you are building a web application, where there is a development and production server. Each of these need their own settings, but 90% of them are the same. In that case you can do things like define a config file that covers all of development and make it (if its safer) the default settings, and then override if its production, like so:

PORT = 8080
HOSTNAME = "dev.example.com"
COLOR = "0000FF"

if SITE_IS_LIVE:
    import * from production_settings.py

Doing an import * from will cause any settings that have been declared in the production_settings.py file to override the declarations in the settings file.

I've not seen a best practise guideline or PEP document that covers how to do this, but if you wanted some general guidelines, the django settings.py is a good example to follow.

  • Use consistent variable names, preferably UPPER CASE as they are understood to be settings or constants.
  • Expect odd data structures, if you are using Python as the configuration language, then try to handle all basic data types.
  • Don't try and make an interface to change settings, that isn't a simple text editor.

When shouldn't you use this approach? When you are dealing with simple key/value pairs that need to be changed by novice users. Python configs are a power user option only. Novice users will forget to end quotes or lists, not be consistent, will delete options they think don't apply and will commit the unholiest of unholies and will mix tabs and spaces spaces only. Because you are essentially dealing with code not config files, all off these will break your program. On the otherside, writing a tool that would parse through through a python file to find the appropriate options and update them is probably more trouble than it is worth, and you'd be better of reusing an existing module like ConfigParser

Outras dicas

I think Python code gets directly used for configuration mostly because it's just so easy, quick, powerful and flexible way to get it done. There is currently no other tool in the Python ecosystem that provides all these benefits together. The ConfigParserShootout cat give you enough reasons why it may be better to roll Python code as config.

There are some security considerations that can be worked around either by defensive code evaluation or by policies such as properly setting the filesystem permissions in deployment.

I've seen so much struggle with rather complex configuration being done in various formats, using various parsers, but in the end being the easiest when done in code.

The only real downside I came upon is that people managing the configuration have to be somewhat aware of Python, at least the syntax, to be able to do anything and not to brake anything. May or may not matter case by case.

Also the fact that some serious projects, such as Django and Sphinx, are using this very approach should be soothing enough:

There are many options for writing configuration files, with well written parsers:

there's no good reason to have any kind of configuration be parsed as a python script directly. That could led to many kind of problems, from the security aspects to the hard to debug errors, that could be raised late in the run of the program life.

There's even discussions to build an alternative to the setup.py for python packages, which is pretty close to a python source code based configuration from a python coder's point of view.

Otherwise, you may just have seen python objects exported as strings, that looks a bit like json, though a little more flexible… Which is then perfectly fine as long as you don't eval()/exec() them or even import them, but pass it through a parser, like 'ast.literal_eval' or parsing, so you can make sure you only load static data not executable code.

The only few times I'd understand having something close to a config file written in python, is a module included in a library that defines constants used by that library designed to be handled by the user of the library. I'm not even sure that would be a good design decision, but I'd understand such a thing.

edit:

I wouldn't consider django's settings.py an example of good practice, though I consider it's part of what I'm consider a configuration file for coding-literate users that works fine because django is aimed at being used mostly by coders and sysadmins. Also, django offers a way of configuration through a webpage.

To take @lego's arguments:

  • There is no need to parse the file

there's no need to explicitly parse it, though the cost of parsing is anecdotic, even more given the safety and the extra safety and the ability to detect problems early on

  • Configuration settings can be more than just key/values

ini files apart, you can define almost any fundamental python type using json/yaml or xml. And you don't want to define classes, or instanciate complex objects in a configuration file…

  • Writing configuration files is easy:

but using a good editor, json/yaml or even xml syntax can be checked and verified, to have a perfectly parsable file.

  • Implicit error-checking:

not an argument neither, as you say it's double sworded, you can have something that parses fine, but causes an exception after many hours of run.

  • Config options are easily accessed and namespaces:

using json/yaml or xml, options can easily be namespaced, and used as python objects naturally.

  • But the most powerful reason: Overrides, overrides, overrides

It's not a good argument neither in favor of python code. Considering your code is made of several modules that are interdendant and use a common configuration file, and each of them have their own configuration, then it's pretty easy to load first the main configuration file as a good old python dictionary, and the other configuration files just loaded by updating the dictionary.

If you want to track changes, then there are many recipes to organize a hierarchy of dicts that fallbacks to another dict if it does not contain the value.

And finally, configuration values changed at runtime can't be (actually shouldn't be) serialized in python correctly, as doing so would mean changing the currently running program.

I'm not saying you shouldn't use python to store configuration variables, I'm just saying that whatever syntax you choose, you should get it through a parser before getting it as instances in your program. Never, ever load user modifiable content without double checking. Never trust your users!

If the django people are doing it, it's because they've built a framework that only makes sense when gathering many plugins together to build an application. And then, to configure the application, you're using a database (which is a kind of configuration file… on steroids), or actual files.

HTH

I've done this frequently in company internal tools and games. Primary reason being simplicity: you just import the file and don't need to care about formats or parsers. Usually it has been exactly what @zmo said, constants meant for non programmers in the team to modify (say the size of the grid of the game level. or the display resolution).

Sometimes it has been useful to be able to have logic in the configuration. For example alternative functions that populate the initial configuration of the board in the game. I've found this a great advantage actually.

I acknowledge that this could lead to hard to debug problems. Perhaps in these cases those modules have been more like game level init modules than typical config files. Anyhow I've been really happy about the straightforward way to make clear textual config files with the ability to have logic there too and haven't gotten bit by it.

This is yet another config file option. There are several quite adequate config file formats available.

Please take a moment to understand the system administrator's viewpoint or some 3rd party vendor supporting your product. If there is yet another config file format they might drop your product. If you have a product that is of monumental importance then people will go through the hassle of learning the syntax just to read your config file. (like X.org, or apache)

If you plan on another programming language accessing/writing the config file info then a python based config file would be a bad idea.

Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top