The main use of predictive models is to generate predictions for new data. This data frequently resides in databases like MySQL, and the ADAPA scoring engine needs a way to easily access it. One way of accomplishing this is by using the Pentaho Data Integration (PDI) tool, and in this post we outline how to score data from relational databases using the ADAPA REST API and PDI.
PDI provides an easy-to-use point-and-click interface to manage the whole workflow: retrieving the data, scoring it through ADAPA, and saving the results elsewhere. PDI can read from and write to many different databases, including MySQL, Microsoft SQL Server, Oracle, PostgreSQL, and others. It can also act as a client to the ADAPA Scoring Engine by leveraging the ADAPA REST API, and it takes care of transforming the data into the necessary formats (JSON and URL in this case).
Prior to starting, we assume that:
- PDI is installed
- Data to be scored is stored in MySQL, Microsoft SQL Server, Oracle, or PostgreSQL
- A PMML model for the data is deployed and available through the ADAPA REST API.
The process is built and executed in PDI. The transformation should consist of the following steps:
- Retrieve data from the database
- Transform to a JSON object
- Encode the JSON object in a URL so that it can be transmitted
- Send the URL to ADAPA through the REST API
- Capture ADAPA output
- Write the scoring results to a flat file
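For readers who prefer to see the data flow in code, the steps above can be sketched in Python. This is only an illustration of what the PDI transformation does, not the PDI workflow itself and not the documented ADAPA API: the endpoint in `BASE_URL`, the `record` query parameter, the JSON response format, and all function names are assumptions made for this example, and an in-memory SQLite database stands in for MySQL or another relational database.

```python
import csv
import json
import sqlite3
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical endpoint; the real ADAPA REST path depends on your deployment.
BASE_URL = "https://adapa.example.com/score/my_model"


def fetch_rows(conn, query):
    """Step 1: retrieve records from the database as a list of dicts."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(query)]


def record_to_url(record, base_url=BASE_URL):
    """Steps 2-3: serialize one record to JSON and percent-encode it into a URL.

    The 'record' parameter name is an assumption, not the documented API.
    """
    payload = json.dumps(record, sort_keys=True)
    return f"{base_url}?record={quote(payload, safe='')}"


def score(url):
    """Steps 4-5: send the URL and capture the response.

    Assumes the service answers with a JSON body; authentication is omitted.
    """
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def write_results(rows, handle):
    """Step 6: write the scored records to a flat (CSV) file handle."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
```

In use, each row returned by `fetch_rows` would be passed through `record_to_url` and `score`, the returned values merged into the row, and the combined rows handed to `write_results`. PDI performs the same sequence with one step per stage, which is why no code is needed there.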
For detailed step-by-step instructions using a neural network model deployed in ADAPA, please review the following videos: