Model deployment used to be a major undertaking. Once built, a predictive model had to be re-coded by hand for the production environment before it could score new data. This process was error-prone and could easily take up to six months. Manual re-coding of predictive models has no place in the big data era we live in: because data changes rapidly, model deployment needs to be near instantaneous and error-free. PMML, the Predictive Model Markup Language, is the standard for representing predictive models.
PMML (Predictive Model Markup Language) is an XML-based standard for the vendor-independent exchange of predictive analytics, data mining and machine learning models (call them what you like) between the design tools where they originate and the operational platform that executes them. Developed by the Data Mining Group in the late 1990s, PMML has matured quietly to the point where it now has extensive vendor support and has become the backbone of big data predictive analytics. In the agile world we live in today, PMML delivers the representational power needed for solutions to be exchanged quickly and easily between systems, allowing predictions to move at the speed of business.
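To make "XML-based" concrete, here is a minimal sketch of how a consuming system might read a PMML file and score a record without any model-specific re-coding. It uses only the Python standard library. The tiny regression document, its field names and the `describe_fields` and `score` helpers are all invented for illustration; a real PMML file and a real scoring engine handle far more than this.

```python
import xml.etree.ElementTree as ET

# Hand-written, minimal PMML 4.2 document (illustrative, not produced by any tool).
PMML_TEXT = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x" usageType="active"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned namespace, so lookups must be qualified.
NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def describe_fields(pmml_text):
    """Return (name, dataType) for every DataField in the DataDictionary."""
    root = ET.fromstring(pmml_text)
    fields = root.findall("pmml:DataDictionary/pmml:DataField", NS)
    return [(f.get("name"), f.get("dataType")) for f in fields]


def score(pmml_text, inputs):
    """Evaluate the (single) RegressionTable: intercept + sum(coefficient * input)."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        result += float(predictor.get("coefficient")) * inputs[predictor.get("name")]
    return result


print(describe_fields(PMML_TEXT))
print(score(PMML_TEXT, {"x": 3.0}))
```

The point of the sketch is that the scoring code knows nothing about the model beyond what the PMML document states, which is what lets the same file move between tools.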
One of the leading statistical modeling platforms today is R. R allows quick exploration of data and extraction of important features, and its large variety of packages gives data scientists easy access to many modeling techniques. The ‘pmml’ package for R was created so that data scientists can export their models, once constructed, to PMML format. The latest version of this package, v1.5, was released in August 2015 and contains several new functions that give the modeler more interactive access to the generated PMML: the PMML can now be modified to a greater degree after it has been constructed.
The next step would, of course, be to upload this PMML into an operational platform.
This series of posts describes some of the new functions and their uses in more detail:
1. Helper functions to modify the MiningField element attributes
2. Helper functions to modify the DataDictionary element attributes
3. Helper functions to modify the DataDictionary element attributes - Part II
4. Helper functions to add OutputField element attributes and child elements
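For readers who want a feel for what these kinds of edits do to the XML itself, the following is a rough, language-neutral sketch using the Python standard library. It is not the 'pmml' package's API: the helper names `set_field_attributes` and `add_output_field`, the sample document and the attribute values are all assumptions chosen for illustration. The element and attribute names (MiningField, DataField, OutputField, missingValueReplacement and so on) come from the PMML schema.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_2"
NS = {"pmml": PMML_NS}
ET.register_namespace("", PMML_NS)  # serialize without an "ns0:" prefix

# Hand-written, minimal PMML document (illustrative only).
PMML_TEXT = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x" usageType="active"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def set_field_attributes(root, element_tag, field_name, attributes):
    """Set attributes on the DataField or MiningField whose name matches."""
    for field in root.iter(f"{{{PMML_NS}}}{element_tag}"):
        if field.get("name") == field_name:
            field.attrib.update(attributes)
            return field
    raise KeyError(f"no {element_tag} named {field_name!r}")


def add_output_field(root, model_tag, attributes):
    """Append an OutputField, creating Output right after MiningSchema if needed."""
    model = root.find(f"pmml:{model_tag}", NS)
    output = model.find("pmml:Output", NS)
    if output is None:
        schema = model.find("pmml:MiningSchema", NS)
        output = ET.Element(f"{{{PMML_NS}}}Output")
        # The PMML schema expects Output to follow MiningSchema inside the model.
        model.insert(list(model).index(schema) + 1, output)
    return ET.SubElement(output, f"{{{PMML_NS}}}OutputField", attributes)


root = ET.fromstring(PMML_TEXT)

# 1. MiningField attributes: how the model should treat missing or invalid input.
set_field_attributes(root, "MiningField", "x",
                     {"missingValueReplacement": "0",
                      "invalidValueTreatment": "asMissing"})

# 2. DataDictionary attributes: descriptive metadata on a DataField.
set_field_attributes(root, "DataField", "x", {"displayName": "Input x"})

# 3. OutputField: name the value the scoring engine should return.
add_output_field(root, "RegressionModel",
                 {"name": "predicted_y", "feature": "predictedValue",
                  "dataType": "double", "optype": "continuous"})

print(ET.tostring(root, encoding="unicode"))
```

Working on the parsed tree rather than on raw text keeps the document well-formed after each change, which is the same reason helper functions are preferable to editing exported PMML by hand.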