- multi-class classification (see below for details)
- binary classification
We encountered an issue with ksvm while building the dummy-variable encoding ("dummy-fication") piece. Except for the first categorical variable in a model, every categorical variable loses its first input category. That is, ksvm does not create a dummy variable for the first category of those variables. This is a cosmetic issue and not really a problem for model building or scoring. We have already pointed it out to the author of ksvm. For now, the PMML export code mimics this behavior so that the exported model's scores match those produced by ksvm.
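The pattern described above can be reproduced in base R with `model.matrix` when the intercept is dropped. This is only an illustrative sketch: the data frame and its column names are made up, and it is an assumption, not something confirmed here, that ksvm's formula interface builds its inputs in the same way.

```r
# Toy data with two categorical inputs (names are hypothetical).
df <- data.frame(
  colour = factor(c("red", "green", "blue", "red")),
  size   = factor(c("small", "large", "small", "large"))
)

# Design matrix without an intercept column ("- 1").
mm <- model.matrix(~ colour + size - 1, data = df)

# The first factor keeps all of its levels; the second factor loses
# its first level ("large"), which becomes the implicit baseline.
colnames(mm)
# "colourblue" "colourgreen" "colourred" "sizesmall"
```

Because the dropped level is simply the baseline (all remaining dummies for that variable are zero), no information is lost, which is consistent with the issue being cosmetic.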
The example below shows how to train a support vector machine to perform binary classification using the audit dataset provided by Togaware (thanks to Graham Williams).
library(kernlab)  # provides ksvm
library(pmml)     # provides pmml
library(XML)      # provides saveXML
audit <- read.csv(file("http://rattle.togaware.com/audit.csv"))
myksvm <- ksvm(as.factor(TARGET_Adjusted) ~ ., data=audit[,c(2:10,13)], kernel="rbfdot", prob.model=TRUE)
saveXML(pmml(myksvm, data=audit), "AuditSVM.pmml")
BTW, any models you build in ksvm and export using the PMML package can be uploaded directly into ADAPA for scoring.
The SVM element in PMML allows for multi-class classification
ADAPA fully supports multi-class classification for SVMs using both the one-against-all approach (also known as one-against-rest) and the one-against-one approach.
For multi-class classification with k classes, k > 2, the R ksvm function uses the one-against-one approach, in which k(k-1)/2 binary classifiers are trained, one for each pair of classes; the appropriate class is then found by a voting scheme.
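A minimal base R sketch of one-against-one voting may help make this concrete. It is not ksvm's or ADAPA's implementation: `ovo_vote` and `toy_winner` are hypothetical helpers, and the "machines" are replaced by a fixed preference order purely for illustration.

```r
# Hypothetical helper: combine one-against-one decisions by majority vote.
# `pair_winner` is a function(a, b) that returns whichever of the two
# class labels the binary machine for that pair prefers.
ovo_vote <- function(classes, pair_winner) {
  pairs <- combn(classes, 2)  # one column per binary machine
  winners <- apply(pairs, 2, function(p) pair_winner(p[1], p[2]))
  votes <- table(factor(winners, levels = classes))
  list(
    n_machines = ncol(pairs),
    votes = votes,
    predicted = names(votes)[which.max(votes)]  # ties go to the first class
  )
}

classes <- c("A", "B", "C")

# Stand-in for trained machines: a made-up preference score per class.
pref <- c(A = 1, B = 3, C = 2)
toy_winner <- function(a, b) if (pref[[a]] >= pref[[b]]) a else b

res <- ovo_vote(classes, toy_winner)
res$n_machines  # k(k-1)/2 = 3 machines for k = 3
res$votes       # A: 0, B: 2, C: 1
res$predicted   # "B"
```

Simple majority voting with first-class tie-breaking is only one possible scheme; the reference cited at the end of this section compares alternatives.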
In PMML, the one-against-one approach is supported by defining an additional, alternate target category for each machine. This works because all k(k-1)/2 machines are binary classifiers: each one decides between its target category and its alternate target category.
Voting schemes for multi-class classification problems in SVMs are described in:
C.-W. Hsu and C.-J. Lin,
"A comparison of methods for multiclass support vector machines,"
IEEE Transactions on Neural Networks, 13(2):415-425, 2002.