How to use stacking of algorithms in R

# -------------------------------------------------------------------------------
# How to use stacking of algorithms in R
# -------------------------------------------------------------------------------
# load libraries
library(mlbench)
library(caret)
library(caretEnsemble)

# load data
data(PimaIndiansDiabetes)
# rename dataset to keep code below generic
dataset <- PimaIndiansDiabetes
dim(dataset)
sapply(dataset, class)

# set up cross-validation and control parameters:
# 3-fold CV repeated 3 times gives 9 resamples per model
control <- trainControl(method="repeatedcv", number=3, repeats=3, verboseIter=TRUE, search="grid")
metric <- "Accuracy"
tuneLength <- 2
algorithmList <- c('lda', 'rpart', 'glm', 'knn', 'svmRadial')

# train every algorithm in algorithmList; metric is passed through to caret::train
models <- caretList(diabetes~., data=dataset, trControl=control, methodList=algorithmList,
                    metric=metric, tuneLength=tuneLength)

# collect the results of trained models
results <- resamples(models)

# Summarize the fitted models
summary(results)
# Plot and rank the fitted models
dotplot(results)
bwplot(results)
dim(dataset): 768 rows, 9 columns

sapply(dataset, class):
pregnant   'numeric'
glucose    'numeric'
pressure   'numeric'
triceps    'numeric'
insulin    'numeric'
mass       'numeric'
pedigree   'numeric'
age        'numeric'
diabetes   'factor'
Warning message in trControlCheck(x = trControl, y = target):
“trControl$savePredictions not 'all' or 'final'.  Setting to 'final' so we can ensemble the models.”
Warning message in trControlCheck(x = trControl, y = target):
“indexes not defined in trControl.  Attempting to set them ourselves, so each model in the ensemble will have the same resampling indexes.”
+ Fold1.Rep1: parameter=none 
- Fold1.Rep1: parameter=none 
+ Fold2.Rep1: parameter=none 
- Fold2.Rep1: parameter=none 
+ Fold3.Rep1: parameter=none 
- Fold3.Rep1: parameter=none 
+ Fold1.Rep2: parameter=none 
- Fold1.Rep2: parameter=none 
+ Fold2.Rep2: parameter=none 
- Fold2.Rep2: parameter=none 
+ Fold3.Rep2: parameter=none 
- Fold3.Rep2: parameter=none 
+ Fold1.Rep3: parameter=none 
- Fold1.Rep3: parameter=none 
+ Fold2.Rep3: parameter=none 
- Fold2.Rep3: parameter=none 
+ Fold3.Rep3: parameter=none 
- Fold3.Rep3: parameter=none 
Aggregating results
Fitting final model on full training set
+ Fold1.Rep1: cp=0.1045 
- Fold1.Rep1: cp=0.1045 
+ Fold2.Rep1: cp=0.1045 
- Fold2.Rep1: cp=0.1045 
+ Fold3.Rep1: cp=0.1045 
- Fold3.Rep1: cp=0.1045 
+ Fold1.Rep2: cp=0.1045 
- Fold1.Rep2: cp=0.1045 
+ Fold2.Rep2: cp=0.1045 
- Fold2.Rep2: cp=0.1045 
+ Fold3.Rep2: cp=0.1045 
- Fold3.Rep2: cp=0.1045 
+ Fold1.Rep3: cp=0.1045 
- Fold1.Rep3: cp=0.1045 
+ Fold2.Rep3: cp=0.1045 
- Fold2.Rep3: cp=0.1045 
+ Fold3.Rep3: cp=0.1045 
- Fold3.Rep3: cp=0.1045 
Aggregating results
Selecting tuning parameters
Fitting cp = 0.104 on full training set
+ Fold1.Rep1: parameter=none 
- Fold1.Rep1: parameter=none 
+ Fold2.Rep1: parameter=none 
- Fold2.Rep1: parameter=none 
+ Fold3.Rep1: parameter=none 
- Fold3.Rep1: parameter=none 
+ Fold1.Rep2: parameter=none 
- Fold1.Rep2: parameter=none 
+ Fold2.Rep2: parameter=none 
- Fold2.Rep2: parameter=none 
+ Fold3.Rep2: parameter=none 
- Fold3.Rep2: parameter=none 
+ Fold1.Rep3: parameter=none 
- Fold1.Rep3: parameter=none 
+ Fold2.Rep3: parameter=none 
- Fold2.Rep3: parameter=none 
+ Fold3.Rep3: parameter=none 
- Fold3.Rep3: parameter=none 
Aggregating results
Fitting final model on full training set
+ Fold1.Rep1: k=5 
- Fold1.Rep1: k=5 
+ Fold1.Rep1: k=7 
- Fold1.Rep1: k=7 
+ Fold2.Rep1: k=5 
- Fold2.Rep1: k=5 
+ Fold2.Rep1: k=7 
- Fold2.Rep1: k=7 
+ Fold3.Rep1: k=5 
- Fold3.Rep1: k=5 
+ Fold3.Rep1: k=7 
- Fold3.Rep1: k=7 
+ Fold1.Rep2: k=5 
- Fold1.Rep2: k=5 
+ Fold1.Rep2: k=7 
- Fold1.Rep2: k=7 
+ Fold2.Rep2: k=5 
- Fold2.Rep2: k=5 
+ Fold2.Rep2: k=7 
- Fold2.Rep2: k=7 
+ Fold3.Rep2: k=5 
- Fold3.Rep2: k=5 
+ Fold3.Rep2: k=7 
- Fold3.Rep2: k=7 
+ Fold1.Rep3: k=5 
- Fold1.Rep3: k=5 
+ Fold1.Rep3: k=7 
- Fold1.Rep3: k=7 
+ Fold2.Rep3: k=5 
- Fold2.Rep3: k=5 
+ Fold2.Rep3: k=7 
- Fold2.Rep3: k=7 
+ Fold3.Rep3: k=5 
- Fold3.Rep3: k=5 
+ Fold3.Rep3: k=7 
- Fold3.Rep3: k=7 
Aggregating results
Selecting tuning parameters
Fitting k = 7 on full training set
+ Fold1.Rep1: sigma=0.1292, C=0.25 
- Fold1.Rep1: sigma=0.1292, C=0.25 
+ Fold1.Rep1: sigma=0.1292, C=0.50 
- Fold1.Rep1: sigma=0.1292, C=0.50 
+ Fold2.Rep1: sigma=0.1292, C=0.25 
- Fold2.Rep1: sigma=0.1292, C=0.25 
+ Fold2.Rep1: sigma=0.1292, C=0.50 
- Fold2.Rep1: sigma=0.1292, C=0.50 
+ Fold3.Rep1: sigma=0.1292, C=0.25 
- Fold3.Rep1: sigma=0.1292, C=0.25 
+ Fold3.Rep1: sigma=0.1292, C=0.50 
- Fold3.Rep1: sigma=0.1292, C=0.50 
+ Fold1.Rep2: sigma=0.1292, C=0.25 
- Fold1.Rep2: sigma=0.1292, C=0.25 
+ Fold1.Rep2: sigma=0.1292, C=0.50 
- Fold1.Rep2: sigma=0.1292, C=0.50 
+ Fold2.Rep2: sigma=0.1292, C=0.25 
- Fold2.Rep2: sigma=0.1292, C=0.25 
+ Fold2.Rep2: sigma=0.1292, C=0.50 
- Fold2.Rep2: sigma=0.1292, C=0.50 
+ Fold3.Rep2: sigma=0.1292, C=0.25 
- Fold3.Rep2: sigma=0.1292, C=0.25 
+ Fold3.Rep2: sigma=0.1292, C=0.50 
- Fold3.Rep2: sigma=0.1292, C=0.50 
+ Fold1.Rep3: sigma=0.1292, C=0.25 
- Fold1.Rep3: sigma=0.1292, C=0.25 
+ Fold1.Rep3: sigma=0.1292, C=0.50 
- Fold1.Rep3: sigma=0.1292, C=0.50 
+ Fold2.Rep3: sigma=0.1292, C=0.25 
- Fold2.Rep3: sigma=0.1292, C=0.25 
+ Fold2.Rep3: sigma=0.1292, C=0.50 
- Fold2.Rep3: sigma=0.1292, C=0.50 
+ Fold3.Rep3: sigma=0.1292, C=0.25 
- Fold3.Rep3: sigma=0.1292, C=0.25 
+ Fold3.Rep3: sigma=0.1292, C=0.50 
- Fold3.Rep3: sigma=0.1292, C=0.50 
Aggregating results
Selecting tuning parameters
Fitting sigma = 0.129, C = 0.5 on full training set
Call:
summary.resamples(object = results)

Models: lda, rpart, glm, knn, svmRadial 
Number of resamples: 9 

Accuracy 
               Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
lda       0.7109375 0.7539062 0.7656250 0.7664832 0.7898833 0.8078431    0
rpart     0.6953125 0.7237354 0.7304688 0.7317741 0.7460938 0.7568627    0
glm       0.7226562 0.7421875 0.7725490 0.7690941 0.7859922 0.8117647    0
knn       0.7109375 0.7265625 0.7265625 0.7343716 0.7509728 0.7568627    0
svmRadial 0.7265625 0.7421875 0.7764706 0.7699655 0.7898833 0.8039216    0

Kappa 
               Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
lda       0.3236701 0.4060106 0.4612795 0.4607267 0.5286646 0.5501674    0
rpart     0.1977501 0.3456808 0.3676070 0.3563230 0.3953456 0.4416857    0
glm       0.3456221 0.3986043 0.4719366 0.4657953 0.5211883 0.5605658    0
knn       0.3309466 0.3809162 0.3875179 0.3968227 0.4294436 0.4654848    0
svmRadial 0.3419023 0.4066999 0.4738081 0.4622392 0.5126679 0.5422560    0
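
The run above trains and compares the five base models but stops before combining them. What follows is a minimal sketch of the stacking step itself, not part of the original cell. It assumes a GLM meta-learner, a seed of 7, and the names base_control, stack_control and stack_glm, all of which are illustrative choices. It asks trainControl for savePredictions="final" and classProbs=TRUE up front, since the meta-learner is fitted on the base models' hold-out predictions. The exact behaviour of caretStack may differ between caretEnsemble versions.

```r
# Sketch only: stack the five base learners with a GLM meta-learner.
library(mlbench)
library(caret)
library(caretEnsemble)

data(PimaIndiansDiabetes)
dataset <- PimaIndiansDiabetes

set.seed(7)  # assumed seed, used only to make this sketch repeatable

# Base-learner control: keep the hold-out predictions and class
# probabilities, because the meta-learner is trained on them.
base_control <- trainControl(method="repeatedcv", number=3, repeats=3,
                             savePredictions="final", classProbs=TRUE)
algorithmList <- c('lda', 'rpart', 'glm', 'knn', 'svmRadial')
models <- caretList(diabetes~., data=dataset, trControl=base_control,
                    methodList=algorithmList, tuneLength=2)

# Correlation between the base models' resampled results; stacking tends
# to help more when the base models are not highly correlated.
print(modelCor(resamples(models)))

# Meta-learner: a logistic regression fitted on the base models' predictions.
stack_control <- trainControl(method="repeatedcv", number=3, repeats=3,
                              savePredictions="final", classProbs=TRUE)
stack_glm <- caretStack(models, method="glm", metric="Accuracy",
                        trControl=stack_control)
print(stack_glm)
```

Printing stack_glm reports the resampled accuracy of the combined model, which can be set against the per-model means in the summary above. A different meta-learner can be tried by changing the method argument, for example method="rf" if the randomForest package is installed.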