Question

I've worked on projects that have very complex XML configuration, and one recurring problem is maintaining the internal consistency of the XML.

In this case I don't mean the strict XML schema consistency, but rather the higher level relation between the nodes used. Most problems were caused by implicit linking between information encoded in the XML, and implicit relation of that information to the codebase. Examples might be:

  • XML node data implicitly linked to enums in code
  • Business objects in the same config that are related (in that they share information that needs to be consistent) without any explicit relation between them
  • Code in XML to be compiled and parsed at runtime

It struck me that a) this might become a practice of increasing frequency and b) that in some cases we're implicitly creating a new coding language that's not compile-time checked -- and in fact has few checks at all until it's run.

Is anyone else out there facing similar scenarios, and are there any tools or approaches that make the problem more tractable? I'd like some general examples that are technology-agnostic -- my own specific experience has been with C# and with config for proprietary systems.

Note: although I have an answer to this below, I've no intent of taking my own as the final answer.


Solution

This will depend greatly on the languages, frameworks, and tools you're using for your project.

Using XML for configuration can be really problematic because it cannot be checked at compile time.

For example, when using Java and the Spring Framework, there is an Eclipse plugin called Spring Tool Suite that can check that the XML configuration and the actual code stay in sync.

But this is just one example for a specific language and a specific framework. You should try to find out if something similar exists for your scenario.
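Where no such tool exists, the same kind of synchronization check can be approximated with a small script. The following is only a minimal sketch in Python, not how Spring Tool Suite works: the `<bean class="...">` layout and the `unresolved_classes` helper are assumptions made up for illustration. It reports every class name in the XML that cannot be resolved in the code.

```python
import importlib
import xml.etree.ElementTree as ET

# Hypothetical config: each <bean> names a class the code is expected to provide.
CONFIG = """
<beans>
  <bean id="cache" class="collections.OrderedDict"/>
  <bean id="queue" class="collections.Dequeue"/>
</beans>
"""

def unresolved_classes(xml_text):
    """Return (bean id, class name) pairs whose class cannot be imported."""
    missing = []
    for bean in ET.fromstring(xml_text).iter("bean"):
        dotted = bean.get("class", "")
        module_name, _, class_name = dotted.rpartition(".")
        try:
            module = importlib.import_module(module_name)
            getattr(module, class_name)
        except (ImportError, AttributeError, ValueError):
            # ValueError covers a class attribute with no module part.
            missing.append((bean.get("id"), dotted))
    return missing

print(unresolved_classes(CONFIG))  # the misspelled "Dequeue" entry is reported
```

Run as part of the build, a check like this turns a runtime failure into a build-time one.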

By the way, if you let us know what technologies you're using, we might be able to assist you more.

Other tips

Did you try Schematron? http://www.schematron.com/

It's a higher level language targeted at validating XML semantically, not just syntactically.

See also Wikipedia: http://en.wikipedia.org/wiki/Schematron
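To give a feel for what a semantic rule looks like, here is a rough sketch of one such check. It is written in plain Python rather than Schematron (where the rule would be an assertion over an XPath expression), and the `customer`/`order` elements and the `customer-ref` attribute are invented for illustration. The rule: every order must refer to a customer defined in the same file.

```python
import xml.etree.ElementTree as ET

# Hypothetical config with an implicit link: order/@customer-ref -> customer/@id.
CONFIG = """
<config>
  <customer id="acme"/>
  <customer id="globex"/>
  <order id="o1" customer-ref="acme"/>
  <order id="o2" customer-ref="initech"/>
</config>
"""

def dangling_references(xml_text):
    """Return (order id, customer-ref) pairs that point at no defined customer."""
    root = ET.fromstring(xml_text)
    known = {customer.get("id") for customer in root.iter("customer")}
    return [
        (order.get("id"), order.get("customer-ref"))
        for order in root.iter("order")
        if order.get("customer-ref") not in known
    ]

print(dangling_references(CONFIG))  # order "o2" refers to an undefined customer
```

A schema can say that `customer-ref` is a string; only a rule like this says it has to match something.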

I use automated functional tests to ensure the data integrity of our default config at work. I don't know that the problem you describe is necessarily dependent on the config format being XML. That being said, I would suggest:

  • Don't use XML as a config format. (Google can give you the reasons.)
  • Tight coupling through the implicit use of enums or any other constructed values is just as bad in a config as it is in code.
  • Code that gets pulled from a config and executed makes me cringe. Is it in a config so that it can be overridden? If so, that's a huge vector for bugs, undefined behavior and potential security issues.
  • The business objects issue also sounds like bad tight coupling.

You may not be able to do anything about XML as the config format, the bad coupling, or the execution of code from config. You can mitigate the risk by creating a suite of functional tests that ensure:

  • the enums are valid
  • the code is compilable and executable, and contains no evals, RPC calls, function calls, or whatever else your analysis checks are meant to rule out.
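As a rough illustration of such a test suite, the sketch below checks both points against a made-up config. Everything in it is an assumption for the sake of the example: the `Priority` enum, the `<rule>`/`<script>` layout, the `FORBIDDEN_CALLS` list, and the use of Python as the embedded language. The scripts are only parsed, never executed.

```python
import ast
import enum
import xml.etree.ElementTree as ET

class Priority(enum.Enum):  # hypothetical enum the config refers to by name
    LOW = 1
    HIGH = 2

CONFIG = """
<rules>
  <rule priority="HIGH"><script>total = price * quantity</script></rule>
  <rule priority="URGENT"><script>eval(user_input)</script></rule>
  <rule priority="LOW"><script>total = price *</script></rule>
</rules>
"""

FORBIDDEN_CALLS = {"eval", "exec"}  # assumed policy; extend as needed

def check_config(xml_text):
    """Return (rule index, message) pairs for every problem found."""
    problems = []
    for index, rule in enumerate(ET.fromstring(xml_text).iter("rule")):
        priority = rule.get("priority")
        if priority not in Priority.__members__:
            problems.append((index, f"unknown priority {priority!r}"))
        source = rule.findtext("script", default="")
        try:
            tree = ast.parse(source)  # syntax check only; nothing is run
        except SyntaxError:
            problems.append((index, "script does not parse"))
            continue
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in FORBIDDEN_CALLS):
                problems.append((index, f"forbidden call {node.func.id}"))
    return problems

for problem in check_config(CONFIG):
    print(problem)
```

In a real suite each check would probably be its own test case, so that a failure names the rule that was violated.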

You can also sometimes help bring about change by providing an alternative (config or config entries).

A good link, following on from @David Peleg's, is Topologi, which offers products to check XML, including "business rules" checks.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow