Question

Given XML objects of many classes (say, types of document images), I need to generate some outputs depending on the class of the object, and a complex set of mathematical rules relating the contents of the XML file.

What is the generic name for this task (parsing?), and what is the easiest way to encode separate rules for each class, bearing in mind that the rules may involve mathematical relationships? I think I should create a file for each class, written in a DSL, to keep things manageable, but I am not sure. Someone suggested embedding a full-blown Lua or JavaScript interpreter. Is this a good idea? I want to keep it lean and simple.

Solution

Parsing means reading a sequence of tokens and matching it against the rules of a grammar. If you can specify your problem this way, you can write the grammar with pyparsing.

If what you want is to extract the structure of an XML document, you can use the standard Python module xml.etree.ElementTree. Also look at Beautiful Soup.
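
As a rough illustration of the ElementTree route, here is a minimal sketch of per-class rules written as plain Python functions and selected by the document's class. The XML layout (a "class" attribute on the root, numeric child elements), the class names, and the rules themselves are all assumptions made up for this example, not something taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the root element carries a "class" attribute and
# every child element holds one numeric field.
SAMPLE = (
    '<document class="invoice">'
    "<width>210</width>"
    "<height>297</height>"
    "</document>"
)


def invoice_rule(fields):
    # Illustrative rule: page area from width and height.
    return {"area": fields["width"] * fields["height"]}


def photo_rule(fields):
    # Illustrative rule: aspect ratio from width and height.
    return {"aspect": fields["width"] / fields["height"]}


# One entry per document class; each rule could equally live in its own module.
RULES = {
    "invoice": invoice_rule,
    "photo": photo_rule,
}


def evaluate(xml_text):
    """Parse the XML, pick the rule for its class, and return the outputs."""
    root = ET.fromstring(xml_text)
    doc_class = root.get("class")
    # Convert each child element's text to a number, keyed by tag name.
    fields = {child.tag: float(child.text) for child in root}
    if doc_class not in RULES:
        raise ValueError(f"no rule registered for class {doc_class!r}")
    return RULES[doc_class](fields)


if __name__ == "__main__":
    print(evaluate(SAMPLE))
```

Keeping the rules as ordinary functions means the mathematical relationships are written in Python itself, so no separate DSL or embedded interpreter is needed unless the rules must be edited by people who cannot touch the code.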

Licensed under: CC-BY-SA with attribution