Question

We are currently exploring deploying Zementis ADAPA or their UPPI plugin on top of a Hadoop cluster. We plan to export our SAS models to PMML and deploy them.

However, in addition to the models extracted from SAS, we need to express much simpler 'models'/classification rules in PMML.

An example is:

input: var1, var2
rule: var1 >= var2
output: 'true' or 'false'

I'm currently thinking of expressing this as a very simple decision tree (TreeModel in PMML) or a very simple rule set (RuleSet in PMML).
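
For concreteness, the RuleSet variant might look roughly like the sketch below. It has not been tested against ADAPA or UPPI. One caveat: a PMML SimplePredicate compares a field with a constant value rather than with another field, so the sketch first derives a hypothetical field 'diff' (var1 - var2) and then tests diff >= 0. The names 'diff', 'result' and 'SimpleComparisonRule', and the choice of PMML 4.1, are assumptions made for illustration. The PMML is wrapped in a few lines of Python only to check that it is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical RuleSet sketch. Field names, model name and PMML version
# are illustrative assumptions, not validated against any scoring engine.
RULESET_PMML = """\
<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <Header description="var1 greater or equal var2, as a rule set"/>
  <DataDictionary numberOfFields="3">
    <DataField name="var1" optype="continuous" dataType="double"/>
    <DataField name="var2" optype="continuous" dataType="double"/>
    <DataField name="result" optype="categorical" dataType="string">
      <Value value="true"/>
      <Value value="false"/>
    </DataField>
  </DataDictionary>
  <RuleSetModel functionName="classification" modelName="SimpleComparisonRule">
    <MiningSchema>
      <MiningField name="var1"/>
      <MiningField name="var2"/>
      <MiningField name="result" usageType="predicted"/>
    </MiningSchema>
    <LocalTransformations>
      <DerivedField name="diff" optype="continuous" dataType="double">
        <Apply function="-">
          <FieldRef field="var1"/>
          <FieldRef field="var2"/>
        </Apply>
      </DerivedField>
    </LocalTransformations>
    <RuleSet defaultScore="false">
      <RuleSelectionMethod criterion="firstHit"/>
      <SimpleRule id="r1" score="true">
        <SimplePredicate field="diff" operator="greaterOrEqual" value="0"/>
      </SimpleRule>
    </RuleSet>
  </RuleSetModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}

# fromstring raises xml.etree.ElementTree.ParseError if the XML is malformed.
root = ET.fromstring(RULESET_PMML)
rule = root.find(".//p:SimpleRule", NS)
predicate = rule.find("p:SimplePredicate", NS)

# Show the single rule: which field it tests, how, and what it scores.
print(predicate.get("field"), predicate.get("operator"),
      predicate.get("value"), "->", rule.get("score"))
```

The check only confirms that the XML parses and that the rule is wired the way the text describes. It does not validate the document against the PMML schema.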

Here are my questions:

  1. Am I using the right models?
  2. Is this even the right approach? Is there another way to express rules in PMML?
  3. Is this even the right thing to ask of PMML? Is anyone else using PMML to express rules like this?

Solution

Since a PMML document always expects some sort of 'model' to be present, you essentially have to trick it by putting in a dummy regression model. You then implement your rule/logic with the PMML if-then-else construct (the built-in 'if' function) in your input preprocessing (TransformationDictionary) to derive your answer field. Finally, you expose this derived field through the 'Output' element.
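
A minimal sketch of that layout follows, assuming PMML 4.1. It is not the file from the original proof-of-concept: the names 'answer', 'result' and 'DummyModel' are hypothetical, and whether a given engine accepts a regression model with no target field, or an OutputField that references a TransformationDictionary field this way, is an assumption to verify against that engine. The surrounding Python is a toy evaluator for just the two functions used here, so the rule can be exercised locally. It is not how ADAPA scores a document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: dummy regression model plus a derived field that
# carries the actual rule. Names and PMML version are assumptions.
DUMMY_MODEL_PMML = """\
<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <Header description="var1 greater or equal var2, via a derived field"/>
  <DataDictionary numberOfFields="2">
    <DataField name="var1" optype="continuous" dataType="double"/>
    <DataField name="var2" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="answer" optype="categorical" dataType="string">
      <Apply function="if">
        <Apply function="greaterOrEqual">
          <FieldRef field="var1"/>
          <FieldRef field="var2"/>
        </Apply>
        <Constant dataType="string">true</Constant>
        <Constant dataType="string">false</Constant>
      </Apply>
    </DerivedField>
  </TransformationDictionary>
  <RegressionModel functionName="regression" modelName="DummyModel">
    <MiningSchema>
      <MiningField name="var1"/>
      <MiningField name="var2"/>
    </MiningSchema>
    <Output>
      <OutputField name="result" optype="categorical" dataType="string"
                   feature="transformedValue">
        <FieldRef field="answer"/>
      </OutputField>
    </Output>
    <RegressionTable intercept="0"/>
  </RegressionModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_1"}
root = ET.fromstring(DUMMY_MODEL_PMML)  # raises ParseError if malformed


def evaluate(node, row):
    """Toy evaluator covering only FieldRef, Constant, 'if', 'greaterOrEqual'."""
    tag = node.tag.split("}")[1]  # strip the "{namespace}" prefix
    if tag == "FieldRef":
        return row[node.get("field")]
    if tag == "Constant":
        return node.text
    if tag == "Apply":
        args = [evaluate(child, row) for child in node]
        function = node.get("function")
        if function == "greaterOrEqual":
            return args[0] >= args[1]
        if function == "if":
            return args[1] if args[0] else args[2]
    raise ValueError("unsupported element or function: " + tag)


def derive(var1, var2):
    """Evaluate the 'answer' derived field for one input record."""
    expression = root.find(".//p:DerivedField[@name='answer']", NS)[0]
    return evaluate(expression, {"var1": var1, "var2": var2})


print(derive(3, 2))  # true
print(derive(1, 2))  # false
```

The regression table does nothing useful here. It exists only so the document contains a model, while the derived field carries the rule.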

I know this is a lot of work for very little benefit. I only did it as a proof of concept, and we decided not to express simple rules in PMML.

Licensed under: CC-BY-SA with attribution