PMML, the Predictive Model Markup Language, is the de facto standard for representing predictive models. With PMML, models can be exported from one tool and easily imported by another, without the hassle of dealing with proprietary code and incompatibilities.
Converting from one version to another:
More often than not, though, different tools generate PMML in different versions of the standard. One tool may export PMML 2.1 while another imports PMML 4.2. Moving a model between them therefore requires converting it from one version to the other.
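Before converting a file, it helps to know which version it claims to be. The following is a minimal sketch, not part of any product, that reads the version attribute and the namespace from the root PMML element using only the Python standard library. The function name pmml_version and the sample document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def pmml_version(pmml_text):
    """Return (version attribute, namespace URI or None) of the root PMML element."""
    root = ET.fromstring(pmml_text)
    # ElementTree writes a namespaced tag as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        namespace, _, local = root.tag[1:].partition("}")
    else:
        namespace, local = None, root.tag
    if local != "PMML":
        raise ValueError(f"root element is {local!r}, not 'PMML'")
    return root.get("version"), namespace


# Illustrative sample only; a real file would also carry a DataDictionary and a model.
SAMPLE = '<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2"><Header/></PMML>'

if __name__ == "__main__":
    print(pmml_version(SAMPLE))
```

The check only reports what the file declares. It says nothing about whether the content actually conforms to that version of the schema.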
Validating code against the schema:
PMML is an XML-based language. The Data Mining Group (DMG) publishes a PMML Schema (.xsd file) which specifies how PMML elements should be used. Unfortunately, some tools do not adhere fully to the schema. For true interoperability, PMML needs to validate successfully against the schema, and any problems found need to be pointed out so that they can be fixed.
Correcting files so that they conform to the schema:
Once schema violations are identified, life becomes a lot easier if they are corrected automatically, so that PMML code that fails validation at first validates successfully after correction.
One may wonder why PMML code is not always perfect and in the latest version. That is the ideal scenario, but in reality PMML producers and consumers support the standard to different degrees and tend to lag behind in updating their importers and exporters to keep up with the latest release.
Reading in PMML from all versions and vendors ...
Besides schema validation, our scoring products (ADAPA and UPPI) automatically correct known issues with PMML code from several sources/vendors. The aim is to successfully validate code in older versions of PMML and convert it to the latest PMML version.
If the PMML code cannot be converted, that usually means it could not be corrected automatically. In that case, comments are embedded into the PMML code pinpointing the problems so that they can be fixed manually before the file is uploaded into ADAPA or UPPI a second time.
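When a file comes back with embedded comments, listing them together with the element they sit in can speed up the manual fix. The sketch below assumes only that the notes are ordinary XML comments. The exact wording and placement used by ADAPA and UPPI are not described here, so the sample comment and the function name embedded_comments are hypothetical.

```python
from xml.dom import minidom


def embedded_comments(pmml_text):
    """Return a list of (parent element name, comment text) for every XML comment."""
    doc = minidom.parseString(pmml_text)
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.COMMENT_NODE:
                # Record where the comment sits so the problem is easy to locate.
                found.append((node.nodeName, child.data.strip()))
            else:
                walk(child)

    walk(doc)
    return found


# Hypothetical example of a returned file; the comment text is invented for illustration.
SAMPLE = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary>
    <!-- hypothetical note: numberOfFields does not match the DataField count -->
    <DataField name="x" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""

if __name__ == "__main__":
    for parent, text in embedded_comments(SAMPLE):
        print(f"{parent}: {text}")
```

Because the function collects every comment, any comments the original exporter wrote will appear in the list as well.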