Question

I'm using Augustus as a PMML model consumer. I've modified the add two numbers example to include a DefineFunction element, like this:

<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
    <Header/>
    <DataDictionary>
        <DataField name="x" dataType="double" optype="continuous"/>
        <DataField name="y" dataType="double" optype="continuous"/>
    </DataDictionary>
    <TransformationDictionary>
        <DefineFunction dataType="float" optype="continuous" name="add">
            <ParameterField optype="continuous" name="first"></ParameterField>
            <ParameterField optype="continuous" name="second"></ParameterField>
            <Apply function="+" invalidValueTreatment="returnInvalid">
                <FieldRef field="first"></FieldRef>
                <FieldRef field="second"></FieldRef>
            </Apply>
        </DefineFunction>
        <DerivedField name="z" dataType="double" optype="continuous">
            <Apply function="add">
                <FieldRef field="x"/>
                <FieldRef field="y"/>
            </Apply>
        </DerivedField>
    </TransformationDictionary>
</PMML>

I save this model in a file and try to run it like so:

from resources import add_two_numbers_file # this is just the path to my model file
from augustus.strict import modelLoader

# Load model
with open(add_two_numbers_file, 'r') as model_file:
    model_str = model_file.read()
    model = modelLoader.loadXml(model_str)

# Run model
print model.calc({'x':[1,2,3],'y':[4,5,6]}).look()

However, I get an error:

AttributeError: 'DefineFunction' object has no attribute '_setupCalculate'

I'm using the latest trunk (revision 794) and am able to run the unmodified example (without a DefineFunction) without a problem. Is DefineFunction supported by Augustus?

Solution

jcrudy, you are right: this was a bug. An API changed and DefineFunction was not brought up to date. It is now fixed in the public SVN repository: with Augustus r795 or later, you can run your example as originally intended.

By the way, your PMML comes from an external file, yet you read it into a string and then load that string into a PMML DOM. You can skip the intermediate step by passing loadXml the file name directly:

model = modelLoader.loadXml(add_two_numbers_file)

(This could be relevant for very large PMML files; also note that they can be GZipped.)
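If you ever need to decompress a GZipped PMML file yourself before handing the text to a loader, the standard library is enough. This is only a sketch: the file name is made up, the tiny PMML document is a placeholder, and the final loadXml call is left as a comment because it needs Augustus installed.

```python
import gzip
import os
import shutil
import tempfile

# Placeholder PMML text; a real model would be much larger.
pmml_text = '<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1"><Header/></PMML>'

# Write a small gzipped PMML file so the example is self-contained.
temp_dir = tempfile.mkdtemp()
gz_path = os.path.join(temp_dir, "addTwoNumbers.pmml.gz")  # hypothetical file name
with gzip.open(gz_path, "wb") as gz_file:
    gz_file.write(pmml_text.encode("utf-8"))

# Decompress it back into a string, ready to hand to a PMML loader.
with gzip.open(gz_path, "rb") as gz_file:
    model_str = gz_file.read().decode("utf-8")

print(model_str == pmml_text)

# With Augustus installed, the string could then be loaded as in the question:
# model = modelLoader.loadXml(model_str)

shutil.rmtree(temp_dir)  # clean up the temporary directory
```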

OTHER TIPS

I was able to solve this by making two changes. After looking at the Augustus source and confirming that _setupCalculate is indeed not defined anywhere, I monkey-patched it in. My script now looks like this:

# Monkey-patch augustus
import augustus.pmml.DefineFunction
def _setupCalculate(self, dataTable, functionTable, performanceTable):
    return (dataTable, functionTable, performanceTable)
augustus.pmml.DefineFunction.DefineFunction._setupCalculate = _setupCalculate

# Now the actual script
from augustus.strict import modelLoader

# Load model
add_two_numbers_file = 'addTwoNumbers.pmml'
with open(add_two_numbers_file, 'r') as model_file:
    model_str = model_file.read()
    model = modelLoader.loadXml(model_str)

# Run model
print model.calc({'x':[1,2,3],'y':[4,5,6]}).look()
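For anyone unfamiliar with the technique: the patch works because Python looks methods up on the class at call time, so a function assigned to the class afterwards is visible even to instances that already exist. Here is a minimal sketch using a hypothetical stand-in class (it is not Augustus's real DefineFunction, and the calculate method is invented for illustration):

```python
class DefineFunctionStandIn(object):
    """Hypothetical stand-in for a class that is missing a hook method."""

    def calculate(self, dataTable, functionTable, performanceTable):
        # The caller expects _setupCalculate to exist on the class.
        return self._setupCalculate(dataTable, functionTable, performanceTable)


def _setupCalculate(self, dataTable, functionTable, performanceTable):
    # Pass-through: hand the three arguments back unchanged.
    return (dataTable, functionTable, performanceTable)


instance = DefineFunctionStandIn()  # created before the patch is applied

try:
    instance.calculate("data", "functions", "performance")
    failed_before_patch = False
except AttributeError:
    failed_before_patch = True

# Attach the function to the class; existing instances see it immediately.
DefineFunctionStandIn._setupCalculate = _setupCalculate

result = instance.calculate("data", "functions", "performance")
print(failed_before_patch)
print(result)
```

A pass-through like this only silences the AttributeError. Whether the real method is supposed to do more is something only the library source can tell you.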

I made the naive assumption that _setupCalculate does not need to do anything important. With the patch in place, I got a different and more inscrutable error:

ValueError: assignment destination is read-only

at the line

mask[mask2] = defs.MISSING

in FieldType.py. After a few trips through the debugger, I saw that this line was only executed during type casting, and noticed that I was using both float and double types in my PMML. By removing the unnecessary dataType="float" attribute from the DefineFunction element, I was able to get the following to work:

<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
    <Header/>
    <DataDictionary>
        <DataField name="x" dataType="double" optype="continuous"/>
        <DataField name="y" dataType="double" optype="continuous"/>
    </DataDictionary>
    <TransformationDictionary>
        <DefineFunction optype="continuous" name="add">
            <ParameterField optype="continuous" name="first"></ParameterField>
            <ParameterField optype="continuous" name="second"></ParameterField>
            <Apply function="+" invalidValueTreatment="returnInvalid">
                <FieldRef field="first"></FieldRef>
                <FieldRef field="second"></FieldRef>
            </Apply>
        </DefineFunction>
        <DerivedField name="z" dataType="double" optype="continuous">
            <Apply function="add">
                <FieldRef field="x"/>
                <FieldRef field="y"/>
            </Apply>
        </DerivedField>
    </TransformationDictionary>
</PMML>
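To spot this kind of float/double mix without stepping through a debugger, you can list every dataType declared in a model using only the standard library. This is a rough sketch: list_data_types is a name I made up, and the embedded PMML is the model from the question with the dataType="float" still in place.

```python
import xml.etree.ElementTree as ET

# The model from the question, including dataType="float" on DefineFunction.
PMML_TEXT = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
    <Header/>
    <DataDictionary>
        <DataField name="x" dataType="double" optype="continuous"/>
        <DataField name="y" dataType="double" optype="continuous"/>
    </DataDictionary>
    <TransformationDictionary>
        <DefineFunction dataType="float" optype="continuous" name="add">
            <ParameterField optype="continuous" name="first"/>
            <ParameterField optype="continuous" name="second"/>
            <Apply function="+" invalidValueTreatment="returnInvalid">
                <FieldRef field="first"/>
                <FieldRef field="second"/>
            </Apply>
        </DefineFunction>
        <DerivedField name="z" dataType="double" optype="continuous">
            <Apply function="add">
                <FieldRef field="x"/>
                <FieldRef field="y"/>
            </Apply>
        </DerivedField>
    </TransformationDictionary>
</PMML>"""


def list_data_types(pmml_text):
    """Return (tag, name, dataType) for every element declaring a dataType."""
    root = ET.fromstring(pmml_text)
    found = []
    for element in root.iter():
        data_type = element.get("dataType")
        if data_type is not None:
            tag = element.tag.split("}")[-1]  # drop the "{namespace}" prefix
            found.append((tag, element.get("name"), data_type))
    return found


declared = list_data_types(PMML_TEXT)
distinct_types = sorted(set(data_type for _, _, data_type in declared))

for tag, name, data_type in declared:
    print("%s %s: %s" % (tag, name, data_type))
if len(distinct_types) > 1:
    print("mixed dataTypes: %s" % ", ".join(distinct_types))
```

More than one distinct dataType is not an error in PMML itself. It only points to the places where a consumer has to cast between types, which is where the read-only error showed up for me.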

The trunk version of Augustus I used is equivalent to version 0.6-beta3. The problems I had look like plain bugs, so the workarounds in this answer are likely to become unnecessary in the near future.

Licensed under: CC-BY-SA with attribution