Adding the Output element to PMML files exported from IBM SPSS

IBM SPSS products (Modeler and Statistics) offer a great deal of support for PMML. However, the PMML files they export lack the "Output" element, which instructs the scoring engine (ADAPA or UPPI) to output not only the predicted value but also, for classification models, probabilities and pseudo-probabilities. This element can also be used to output many other values that complement the predicted value. For a complete list of possible outputs, please refer to the PMML Output page.

Because PMML files generated by IBM SPSS products lack the Output element, you will need to add it to your PMML file manually, crafting it to fit your model and requirements.

The Output element below was written for a classification model with three possible outcomes: classes "1", "2", and "3". It reads as follows:

           <Output>
               <OutputField name="predictedJobCat" feature="predictedValue"/>
               <OutputField name="Prob_1" feature="probability" value="1"/>
               <OutputField name="Prob_2" feature="probability" value="2"/>
               <OutputField name="Prob_3" feature="probability" value="3"/>
           </Output>

To output the probability of the winning category (regardless of which category wins), omit the "value" attribute from the OutputField element as follows:

           <Output>
               <OutputField name="predictedJobCat" feature="predictedValue"/>
               <OutputField name="WinningProb" feature="probability"/>
           </Output>

Important Note:

The "Output" element needs to be placed right after the "MiningSchema" element so that it conforms to the PMML schema. That is, it needs to follow the closing MiningSchema XML tag </MiningSchema>.
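
To illustrate the placement rule, here is a minimal sketch of where the Output element sits inside a model. The model type (TreeModel), the model name, and the MiningField names "jobcat" and "educ" are assumptions made for illustration only; your exported file will have its own model element and fields, and everything else in it stays as exported.

```xml
<!-- Placement sketch only: model type, modelName, and field names are hypothetical. -->
<TreeModel modelName="jobcat_tree" functionName="classification">
  <MiningSchema>
    <MiningField name="jobcat" usageType="predicted"/>
    <MiningField name="educ" usageType="active"/>
  </MiningSchema>
  <!-- The Output element goes here, directly after </MiningSchema>. -->
  <Output>
    <OutputField name="predictedJobCat" feature="predictedValue"/>
    <OutputField name="WinningProb" feature="probability"/>
  </Output>
  <!-- The rest of the model, as exported by IBM SPSS, follows unchanged. -->
</TreeModel>
```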


Supported Output Features in ADAPA and UPPI

The "Output" element in PMML offers support for a host of different outputs which vary depending on the modeling technique being used. ADAPA supports most of the output features offered in PMML. These are:

  • predictedValue (your typical score or class)
  • predictedDisplayValue (for classification)
  • transformedValue (used to output derived fields and post-processing)
  • decision (used to output business decisions)
  • probability (for classification)
  • residual (out of scope for scoring, but supported if training data is specified)
  • clusterId (for clustering)
  • clusterAffinity (for clustering)
  • entityId (generic use)
  • entityAffinity (generic use)
  • ruleValue (for association rules)
  • reasonCode (for scorecards)
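
As an example of combining features from this list, the sketch below uses transformedValue to post-process another output field, here turning the probability of class "1" into a percentage. The field names "Prob_1" and "Prob_1_pct" are hypothetical, and the optype and dataType attributes are included on the assumption that your PMML version expects them; adjust or drop them to match your file.

```xml
<!-- Sketch only: field names are hypothetical; optype/dataType may need adjusting. -->
<Output>
  <OutputField name="predictedJobCat" feature="predictedValue"/>
  <OutputField name="Prob_1" optype="continuous" dataType="double"
               feature="probability" value="1"/>
  <!-- Derived output: Prob_1 multiplied by 100. -->
  <OutputField name="Prob_1_pct" optype="continuous" dataType="double"
               feature="transformedValue">
    <Apply function="*">
      <FieldRef field="Prob_1"/>
      <Constant>100</Constant>
    </Apply>
  </OutputField>
</Output>
```

Note that the derived field refers to "Prob_1", so that OutputField is declared before it.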

