Introducing Py2PMML

The Zementis Python to PMML Converter (Py2PMML) provides you with an easy-to-use interface for translating your Python-generated machine learning models into PMML, the Predictive Model Markup Language standard. In particular, it allows models built using scikit-learn to be consumed by the Zementis ADAPA and UPPI scoring engines.

Once translated into PMML, models can be easily deployed and scored against new incoming data. For example, models can be deployed in ADAPA for real-time scoring, or in UPPI for big data scoring in-database or on Hadoop.

How does it work?

Easy! Once you build your model using the scikit-learn library, all you need to do is write out a .txt file containing the model's parameters. The .txt file needs to follow a strict order and contain all the required information. This is the file used by Py2PMML to generate the corresponding PMML file for your model. With the PMML file in hand, you can simply deploy it in ADAPA for real-time scoring or UPPI for big data scoring. 
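As a rough illustration of the "write out a .txt file" step, here is a minimal sketch that dumps the parameters of a linear model to a text file. The file layout, the function name write_params_txt, and the feature names are all assumptions made for this example; they are not the format Py2PMML requires. The actual order and contents are described in the posting for each model type.

```python
import os
import tempfile


def write_params_txt(path, feature_names, coefficients, intercept):
    """Write linear-model parameters to a text file, one item per line.

    NOTE: this layout is hypothetical. It is NOT the format Py2PMML
    expects; see the posting for your model type for the real one.
    """
    if len(feature_names) != len(coefficients):
        raise ValueError("need exactly one coefficient per feature")
    # Intercept first, then one "name,value" line per feature.
    lines = ["intercept," + repr(float(intercept))]
    for name, coef in zip(feature_names, coefficients):
        lines.append(name + "," + repr(float(coef)))
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return lines


# Stand-in values. With scikit-learn these would come from a fitted
# estimator, e.g. the coef_ and intercept_ attributes of LinearRegression.
feature_names = ["age", "income"]
coefficients = [0.5, -1.25]
intercept = 2.0

out_path = os.path.join(tempfile.mkdtemp(), "model_params.txt")
written = write_params_txt(out_path, feature_names, coefficients, intercept)
```

The point of the sketch is only that the file is plain text written in a fixed, predictable order, so that a converter can read the values back without guessing.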


What are the supported model types?

As of now, the supported scikit-learn predictive modeling classes are:

Supported pre-processing classes are (contact us for further details):

  • Class MinMaxScaler - Standardizes features by scaling each feature to a given range
  • Class OneHotEncoder - Creates dummy continuous variables out of categorical variables
  • Missing Value Replacement 
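To make the three pre-processing steps above concrete, here is a small pure-Python sketch of what each one computes. It is only an illustration of the underlying arithmetic. It is not the scikit-learn implementation, and it says nothing about how Py2PMML represents these steps in PMML. The function names and sample data are made up for this example.

```python
def replace_missing(values, fill_value):
    """Missing value replacement: substitute fill_value for each None."""
    return [fill_value if v is None else v for v in values]


def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Scale values linearly so min maps to range start, max to range end."""
    lo, hi = min(values), max(values)
    a, b = feature_range
    if hi == lo:
        # Constant feature: map everything to the start of the range.
        return [a for _ in values]
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


# Hypothetical sample data.
ages = [20.0, None, 40.0, 30.0]
filled = replace_missing(ages, 30.0)      # [20.0, 30.0, 40.0, 30.0]
scaled = min_max_scale(filled)            # [0.0, 0.5, 1.0, 0.5]

colors = ["red", "green", "red"]
categories, encoded = one_hot(colors)     # categories: ["green", "red"]
```

In scikit-learn itself, scaling and encoding are handled by the MinMaxScaler and OneHotEncoder classes listed above, which learn the minimum, maximum, and category list from the training data.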

To learn exactly how each .txt file needs to be generated so that Py2PMML can do its job, please take a look at the specific posting for the particular model type you are interested in converting to PMML. 



Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.

