Java library and command-line application for converting R models to PMML.
- Fast and memory-efficient:
  - Can produce a 5 GB Random Forest PMML file in less than 1 minute on a desktop PC
- Supported model and transformation types:
  - ada package:
    - ada - Stochastic Boosting (SB) classification
  - adabag package:
    - bagging - Bagging classification
    - boosting - Boosting classification
  - apollo package:
    - apollo (formerly maxLik) - Discrete Choice Model (DCM) classification
  - caret package:
    - preProcess - Transformation methods "range", "center", "scale" and "medianImpute"
    - train - Selected JPMML-R model types
  - caretEnsemble package:
    - caretEnsemble - Ensemble regression and classification
  - CHAID package:
    - party - CHi-squared Automated Interaction Detection (CHAID) classification
  - earth package:
    - earth - Multivariate Adaptive Regression Spline (MARS) regression
  - elmNNRcpp package:
    - elm - Extreme Learning Machine (ELM) regression
  - evtree package:
    - party - Evolutionary Learning of Trees (EvTree) regression and classification
  - e1071 package:
    - naiveBayes - Naive Bayes (NB) classification
    - svm - Support Vector Machine (SVM) regression, classification and anomaly detection
  - gbm package:
    - gbm - Gradient Boosting Machine (GBM) regression and classification
  - glmnet package:
    - glmnet (elnet, fishnet, lognet and multnet subtypes) - Generalized Linear Model with lasso or elasticnet regularization (GLMNet) regression and classification
    - cv.glmnet - Cross-validated GLMNet regression and calculation
  - IsolationForest package:
    - iForest - Isolation Forest (IF) anomaly detection
  - lightgbm package:
    - lgb.Booster - LightGBM regression and classification
  - MASS package:
    - negbin - Generalized Linear Model (GLM) regression
    - polr - Ordinal regression
  - mlr package:
    - WrappedModel - Selected JPMML-R model types
  - neuralnet package:
    - nn - Neural Network (NN) regression
  - nnet package:
    - multinom - Multinomial log-linear classification
    - nnet.formula - Neural Network (NNet) regression and classification
  - party package:
    - ctree - Conditional Inference Tree (CIT) classification
  - partykit package:
    - party - Recursive Partytioning (Party) regression and classification
  - pls package:
    - mvr - Multivariate Regression (MVR) regression
  - pscl package:
    - hurdle - Hurdle regression
    - zeroinfl - Zero-inflated Count Data regression
  - randomForest package:
    - randomForest - Random Forest (RF) regression and classification
  - ranger package:
    - ranger - Random Forest (RF) regression and classification
  - rms package:
    - lrm - Binary Logistic Regression (LR) classification
    - ols - Ordinary Least Squares (OLS) regression
    - orm - Ordinal regression
  - rpart package:
    - rpart - Recursive Partitioning (RPart) regression and classification
  - r2pmml package:
    - scorecard - Scorecard regression
  - stats package:
    - glm - Generalized Linear Model (GLM) regression and classification:
      - binomial, gaussian, Gamma, inverse.gaussian and poisson families
      - MASS::negative.binomial family
      - statmod::tweedie family
    - kmeans - K-Means clustering
    - lm - Linear Model (LM) regression
  - xgboost package:
    - xgb.Booster - XGBoost (XGB) regression and classification
- Data pre-processing using model formulae:
  - Interaction terms
  - base::I(..) function terms:
    - Logical operators &, | and !
    - Relational operators ==, !=, <, <=, >= and >
    - Arithmetic operators +, -, *, / and %
    - Exponentiation operators ^ and **
    - The is.na function
    - Arithmetic functions abs, ceiling, exp, floor, log, log10, round and sqrt
  - base::cut() and base::ifelse() function terms
  - plyr::revalue() and plyr::mapvalues() function terms
- Production quality:
  - Complete test coverage.
  - Fully compliant with the JPMML-Evaluator library.
JPMML-R requires Java 11 or newer.
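To make the formula terms listed under "Data pre-processing using model formulae" more concrete, the sketch below writes a small R script that combines a base::I(..) arithmetic term, a base::cut() term and a base::ifelse() term in one lm formula. The helper name write_formula_demo, the file names and the choice of the built-in iris dataset are illustrative assumptions; the script has not been checked against the converter here, so treat it as a starting point for your own experiments.

```shell
# Hypothetical helper: write an R script that fits a linear model whose
# formula uses several of the term types listed above.
# $1 is the script path (defaults to formula_demo.R).
write_formula_demo() {
    cat > "${1:-formula_demo.R}" <<'EOF'
# Illustrative only: the built-in iris dataset, with an arithmetic I(..) term,
# a cut() term and an ifelse() term in a single formula.
fit = lm(
    Sepal.Length ~
        I(Petal.Length * Petal.Width) +
        cut(Sepal.Width, breaks = c(1, 3, 5)) +
        ifelse(Petal.Width > 1, "wide", "narrow"),
    data = iris
)
saveRDS(fit, "lm.rds")
EOF
}

# Usage (requires R on the PATH):
#   write_formula_demo formula_demo.R
#   Rscript formula_demo.R
```

Running the generated script with Rscript should leave an lm.rds file in the working directory.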
Enter the project root directory and build using Apache Maven:
mvn clean install
The build produces a library JAR file pmml-rexp/target/pmml-rexp-1.7-SNAPSHOT.jar, and an executable uber-JAR file pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar.
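It can be useful to confirm that both artifacts were actually produced before going further. The following is a minimal sketch of such a check; the function name check_jpmml_r_build is hypothetical (it is not part of JPMML-R), and the default version string 1.7-SNAPSHOT is simply copied from the paths above, so adjust it if your checkout builds a different version.

```shell
# Hypothetical helper: report whether the library JAR and the executable
# uber-JAR exist under a project root.
# $1 = project root (default: current directory)
# $2 = version string (default: 1.7-SNAPSHOT, taken from the paths above)
check_jpmml_r_build() {
    root="${1:-.}"
    version="${2:-1.7-SNAPSHOT}"
    rc=0
    for jar in \
        "pmml-rexp/target/pmml-rexp-${version}.jar" \
        "pmml-rexp-example/target/pmml-rexp-example-executable-${version}.jar"
    do
        if [ -f "${root}/${jar}" ]; then
            echo "found: ${jar}"
        else
            echo "missing: ${jar}"
            rc=1
        fi
    done
    # Non-zero exit status when at least one JAR file is missing.
    return "${rc}"
}

# Usage, from the project root after the build:
#   check_jpmml_r_build . || echo "build incomplete"
```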
A typical workflow can be summarized as follows:
- Use R to train a model.
- Serialize the model in RDS data format to a file on the local filesystem.
- Use the JPMML-R command-line converter application to turn the RDS file into a PMML file.
The following R script trains a Random Forest (RF) model and saves it in RDS data format to a file rf.rds:
library("randomForest")
rf = randomForest(Species ~ ., data = iris)
saveRDS(rf, "rf.rds")
Converting the RDS file rf.rds to a PMML file rf.pmml:
java -jar pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
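When several models need converting, the command above can be wrapped in a small shell function. The sketch below assumes the uber-JAR sits at the path produced by the build; the names JPMML_R_JAR, pmml_name_for and rds_to_pmml are illustrative and not part of JPMML-R. Only the --rds-input and --pmml-output options come from the command above.

```shell
# Assumed location of the executable uber-JAR; override by exporting JPMML_R_JAR.
JPMML_R_JAR="${JPMML_R_JAR:-pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar}"

# Print the PMML file name for an RDS file name by swapping a trailing
# ".rds" for ".pmml" (rf.rds -> rf.pmml). Names without that suffix simply
# get ".pmml" appended.
pmml_name_for() {
    printf '%s.pmml\n' "${1%.rds}"
}

# Convert one RDS file, writing the PMML file next to it.
rds_to_pmml() {
    java -jar "${JPMML_R_JAR}" --rds-input "$1" --pmml-output "$(pmml_name_for "$1")"
}

# Usage: convert every RDS file in the current directory.
#   for f in *.rds; do rds_to_pmml "$f"; done
```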
Getting help:
java -jar pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar --help
The conversion of large files (1 GB and beyond) can be sped up by increasing the JVM heap size using -Xms and -Xmx options:
java -Xms4G -Xmx8G -jar pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar --rds-input rf.rds --pmml-output rf.pmml
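The heap options can also be applied conditionally. The sketch below adds them only when the input file is at least 1 GiB, reusing the -Xms4G and -Xmx8G values from the command above. Using the size of the RDS file as the trigger is an assumption made for illustration (RDS files are typically compressed, so the resulting PMML file may be much larger), and the function names are hypothetical.

```shell
# Assumed location of the executable uber-JAR; override by exporting JPMML_R_JAR.
JPMML_R_JAR="${JPMML_R_JAR:-pmml-rexp-example/target/pmml-rexp-example-executable-1.7-SNAPSHOT.jar}"

# Print JVM heap options for an input of the given size in bytes.
# The 1 GiB threshold and the option values mirror the text above; they are
# starting points, not tuned settings.
heap_opts_for_size() {
    if [ "$1" -ge 1073741824 ]; then
        echo "-Xms4G -Xmx8G"
    fi
}

# Convert $1 (RDS file) to $2 (PMML file), adding heap options for large inputs.
convert_with_heap() {
    size="$(wc -c < "$1" | tr -d ' ')"
    # The substitution is intentionally unquoted: the options split into
    # separate words, and disappear entirely when the output is empty.
    java $(heap_opts_for_size "${size}") -jar "${JPMML_R_JAR}" --rds-input "$1" --pmml-output "$2"
}

# Usage:
#   convert_with_heap rf.rds rf.pmml
```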
Up-to-date:
- Converting logistic regression models to PMML documents
- Deploying R language models on Apache Spark ML
JPMML-R is licensed under the terms and conditions of the GNU Affero General Public License, Version 3.0.
If you would like to use JPMML-R in a proprietary software project, then it is possible to enter into a licensing agreement which makes JPMML-R available under the terms and conditions of the BSD 3-Clause License instead.
JPMML-R is developed and maintained by Openscoring Ltd, Estonia.
Interested in using Java PMML API software in your company? Please contact [email protected]