How to tune parameters in R: Automatic tuning of Random Forest parameters

# -------------------------------------------------------------------------------
# How to tune parameters in R : Automatic tuning of parameters of Random Forest
# -------------------------------------------------------------------------------
# load libraries
library(mlbench)
library(caret)

# load data
data(PimaIndiansDiabetes)
# rename dataset to keep code below generic
dataset <- PimaIndiansDiabetes
dim(dataset)
sapply(dataset, class)

# setup cross validation and control parameters
# verboseIter is the full argument name; "verbose" only worked through partial matching
control <- trainControl(method = "repeatedcv", number = 3, repeats = 3, verboseIter = TRUE, search = "grid")
metric <- "Accuracy"

# ----------------------------------------
# Automatic tuning of hyperparameters:
# tuneLength is the number of candidate values caret generates per tuning parameter
tuneLength <- 2
# ----------------------------------------

# Training process
# Fit / train a Conditional Inference Random Forest (cforest) model to the dataset
fit.RF1 <- caret::train(diabetes~., data=dataset, method="cforest", metric=metric, 
                        preProc=c("center", "scale"), trControl=control, tuneLength = tuneLength)
print(fit.RF1); plot(fit.RF1)

# Fit / train a Regularized Random Forest (RRF) model to the dataset
fit.RF2 <- caret::train(diabetes~., data=dataset, method="RRF", metric=metric, 
                        preProc=c("center", "scale"), trControl=control, tuneLength = tuneLength)
print(fit.RF2); plot(fit.RF2)

# collect the results of trained models
results <- resamples(list(RF1 = fit.RF1, RF2 = fit.RF2))

# Summarize the fitted models
summary(results)

# Plot and rank the fitted models
dotplot(results)
bwplot(results)
Loading required package: lattice
Loading required package: ggplot2
[1] 768   9
 pregnant   glucose  pressure   triceps   insulin      mass  pedigree       age  diabetes 
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"  "factor" 
+ Fold1.Rep1: mtry=2 
- Fold1.Rep1: mtry=2 
+ Fold1.Rep1: mtry=8 
- Fold1.Rep1: mtry=8 
+ Fold2.Rep1: mtry=2 
- Fold2.Rep1: mtry=2 
+ Fold2.Rep1: mtry=8 
- Fold2.Rep1: mtry=8 
+ Fold3.Rep1: mtry=2 
- Fold3.Rep1: mtry=2 
+ Fold3.Rep1: mtry=8 
- Fold3.Rep1: mtry=8 
+ Fold1.Rep2: mtry=2 
- Fold1.Rep2: mtry=2 
+ Fold1.Rep2: mtry=8 
- Fold1.Rep2: mtry=8 
+ Fold2.Rep2: mtry=2 
- Fold2.Rep2: mtry=2 
+ Fold2.Rep2: mtry=8 
- Fold2.Rep2: mtry=8 
+ Fold3.Rep2: mtry=2 
- Fold3.Rep2: mtry=2 
+ Fold3.Rep2: mtry=8 
- Fold3.Rep2: mtry=8 
+ Fold1.Rep3: mtry=2 
- Fold1.Rep3: mtry=2 
+ Fold1.Rep3: mtry=8 
- Fold1.Rep3: mtry=8 
+ Fold2.Rep3: mtry=2 
- Fold2.Rep3: mtry=2 
+ Fold2.Rep3: mtry=8 
- Fold2.Rep3: mtry=8 
+ Fold3.Rep3: mtry=2 
- Fold3.Rep3: mtry=2 
+ Fold3.Rep3: mtry=8 
- Fold3.Rep3: mtry=8 
Aggregating results
Selecting tuning parameters
Fitting mtry = 2 on full training set
Conditional Inference Random Forest 

768 samples
  8 predictor
  2 classes: 'neg', 'pos' 

Pre-processing: centered (8), scaled (8) 
Resampling: Cross-Validated (3 fold, repeated 3 times) 
Summary of sample sizes: 511, 512, 513, 511, 513, 512, ... 
Resampling results across tuning parameters:

  mtry  Accuracy   Kappa    
  2     0.7686600  0.4535750
  8     0.7591129  0.4478574

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
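
The Kappa column printed next to Accuracy is Cohen's kappa: accuracy corrected for the agreement expected by chance from the class frequencies. The sketch below works the formula through in base R. The helper name `accuracy_and_kappa` and the counts in `toy` are hypothetical and chosen only to give round numbers; they are not taken from the Pima data.

```r
# Accuracy and Cohen's kappa from a square confusion matrix
# (rows = predicted class, columns = observed class).
accuracy_and_kappa <- function(tab) {
  n <- sum(tab)
  observed <- sum(diag(tab)) / n                      # plain accuracy
  expected <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  c(Accuracy = observed, Kappa = (observed - expected) / (1 - expected))
}

# Hypothetical counts for illustration only
toy <- matrix(c(40, 15, 10, 35), nrow = 2,
              dimnames = list(predicted = c("neg", "pos"),
                              observed  = c("neg", "pos")))

stats <- accuracy_and_kappa(toy)
print(stats)
```

With these toy counts the accuracy is 0.75 and the chance agreement is 0.5, so kappa is 0.5. On an imbalanced outcome, kappa can be noticeably lower than accuracy, which is why both are reported.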
+ Fold1.Rep1: mtry=2, coefReg=0.01, coefImp=0 
- Fold1.Rep1: mtry=2, coefReg=0.01, coefImp=0 
+ Fold1.Rep1: mtry=8, coefReg=0.01, coefImp=0 
- Fold1.Rep1: mtry=8, coefReg=0.01, coefImp=0 
+ Fold1.Rep1: mtry=2, coefReg=1.00, coefImp=0 
- Fold1.Rep1: mtry=2, coefReg=1.00, coefImp=0 
+ Fold1.Rep1: mtry=8, coefReg=1.00, coefImp=0 
- Fold1.Rep1: mtry=8, coefReg=1.00, coefImp=0 
+ Fold1.Rep1: mtry=2, coefReg=0.01, coefImp=1 
- Fold1.Rep1: mtry=2, coefReg=0.01, coefImp=1 
+ Fold1.Rep1: mtry=8, coefReg=0.01, coefImp=1 
- Fold1.Rep1: mtry=8, coefReg=0.01, coefImp=1 
+ Fold1.Rep1: mtry=2, coefReg=1.00, coefImp=1 
- Fold1.Rep1: mtry=2, coefReg=1.00, coefImp=1 
+ Fold1.Rep1: mtry=8, coefReg=1.00, coefImp=1 
- Fold1.Rep1: mtry=8, coefReg=1.00, coefImp=1 
+ Fold2.Rep1: mtry=2, coefReg=0.01, coefImp=0 
- Fold2.Rep1: mtry=2, coefReg=0.01, coefImp=0 
+ Fold2.Rep1: mtry=8, coefReg=0.01, coefImp=0 
- Fold2.Rep1: mtry=8, coefReg=0.01, coefImp=0 
+ Fold2.Rep1: mtry=2, coefReg=1.00, coefImp=0 
- Fold2.Rep1: mtry=2, coefReg=1.00, coefImp=0 
+ Fold2.Rep1: mtry=8, coefReg=1.00, coefImp=0 
- Fold2.Rep1: mtry=8, coefReg=1.00, coefImp=0 
+ Fold2.Rep1: mtry=2, coefReg=0.01, coefImp=1 
- Fold2.Rep1: mtry=2, coefReg=0.01, coefImp=1 
+ Fold2.Rep1: mtry=8, coefReg=0.01, coefImp=1 
- Fold2.Rep1: mtry=8, coefReg=0.01, coefImp=1 
+ Fold2.Rep1: mtry=2, coefReg=1.00, coefImp=1 
- Fold2.Rep1: mtry=2, coefReg=1.00, coefImp=1 
+ Fold2.Rep1: mtry=8, coefReg=1.00, coefImp=1 
- Fold2.Rep1: mtry=8, coefReg=1.00, coefImp=1 
+ Fold3.Rep1: mtry=2, coefReg=0.01, coefImp=0 
- Fold3.Rep1: mtry=2, coefReg=0.01, coefImp=0 
+ Fold3.Rep1: mtry=8, coefReg=0.01, coefImp=0 
- Fold3.Rep1: mtry=8, coefReg=0.01, coefImp=0 
+ Fold3.Rep1: mtry=2, coefReg=1.00, coefImp=0 
- Fold3.Rep1: mtry=2, coefReg=1.00, coefImp=0 
+ Fold3.Rep1: mtry=8, coefReg=1.00, coefImp=0 
- Fold3.Rep1: mtry=8, coefReg=1.00, coefImp=0 
+ Fold3.Rep1: mtry=2, coefReg=0.01, coefImp=1 
- Fold3.Rep1: mtry=2, coefReg=0.01, coefImp=1 
+ Fold3.Rep1: mtry=8, coefReg=0.01, coefImp=1 
- Fold3.Rep1: mtry=8, coefReg=0.01, coefImp=1 
+ Fold3.Rep1: mtry=2, coefReg=1.00, coefImp=1 
- Fold3.Rep1: mtry=2, coefReg=1.00, coefImp=1 
+ Fold3.Rep1: mtry=8, coefReg=1.00, coefImp=1 
- Fold3.Rep1: mtry=8, coefReg=1.00, coefImp=1 
+ Fold1.Rep2: mtry=2, coefReg=0.01, coefImp=0 
- Fold1.Rep2: mtry=2, coefReg=0.01, coefImp=0 
+ Fold1.Rep2: mtry=8, coefReg=0.01, coefImp=0 
- Fold1.Rep2: mtry=8, coefReg=0.01, coefImp=0 
+ Fold1.Rep2: mtry=2, coefReg=1.00, coefImp=0 
- Fold1.Rep2: mtry=2, coefReg=1.00, coefImp=0 
+ Fold1.Rep2: mtry=8, coefReg=1.00, coefImp=0 
- Fold1.Rep2: mtry=8, coefReg=1.00, coefImp=0 
+ Fold1.Rep2: mtry=2, coefReg=0.01, coefImp=1 
- Fold1.Rep2: mtry=2, coefReg=0.01, coefImp=1 
+ Fold1.Rep2: mtry=8, coefReg=0.01, coefImp=1 
- Fold1.Rep2: mtry=8, coefReg=0.01, coefImp=1 
+ Fold1.Rep2: mtry=2, coefReg=1.00, coefImp=1 
- Fold1.Rep2: mtry=2, coefReg=1.00, coefImp=1 
+ Fold1.Rep2: mtry=8, coefReg=1.00, coefImp=1 
- Fold1.Rep2: mtry=8, coefReg=1.00, coefImp=1 
+ Fold2.Rep2: mtry=2, coefReg=0.01, coefImp=0 
- Fold2.Rep2: mtry=2, coefReg=0.01, coefImp=0 
+ Fold2.Rep2: mtry=8, coefReg=0.01, coefImp=0 
- Fold2.Rep2: mtry=8, coefReg=0.01, coefImp=0 
+ Fold2.Rep2: mtry=2, coefReg=1.00, coefImp=0 
- Fold2.Rep2: mtry=2, coefReg=1.00, coefImp=0 
+ Fold2.Rep2: mtry=8, coefReg=1.00, coefImp=0 
- Fold2.Rep2: mtry=8, coefReg=1.00, coefImp=0 
+ Fold2.Rep2: mtry=2, coefReg=0.01, coefImp=1 
- Fold2.Rep2: mtry=2, coefReg=0.01, coefImp=1 
+ Fold2.Rep2: mtry=8, coefReg=0.01, coefImp=1 
- Fold2.Rep2: mtry=8, coefReg=0.01, coefImp=1 
+ Fold2.Rep2: mtry=2, coefReg=1.00, coefImp=1 
- Fold2.Rep2: mtry=2, coefReg=1.00, coefImp=1 
+ Fold2.Rep2: mtry=8, coefReg=1.00, coefImp=1 
- Fold2.Rep2: mtry=8, coefReg=1.00, coefImp=1 
+ Fold3.Rep2: mtry=2, coefReg=0.01, coefImp=0 
- Fold3.Rep2: mtry=2, coefReg=0.01, coefImp=0 
+ Fold3.Rep2: mtry=8, coefReg=0.01, coefImp=0 
- Fold3.Rep2: mtry=8, coefReg=0.01, coefImp=0 
+ Fold3.Rep2: mtry=2, coefReg=1.00, coefImp=0 
- Fold3.Rep2: mtry=2, coefReg=1.00, coefImp=0 
+ Fold3.Rep2: mtry=8, coefReg=1.00, coefImp=0 
- Fold3.Rep2: mtry=8, coefReg=1.00, coefImp=0 
+ Fold3.Rep2: mtry=2, coefReg=0.01, coefImp=1 
- Fold3.Rep2: mtry=2, coefReg=0.01, coefImp=1 
+ Fold3.Rep2: mtry=8, coefReg=0.01, coefImp=1 
- Fold3.Rep2: mtry=8, coefReg=0.01, coefImp=1 
+ Fold3.Rep2: mtry=2, coefReg=1.00, coefImp=1 
- Fold3.Rep2: mtry=2, coefReg=1.00, coefImp=1 
+ Fold3.Rep2: mtry=8, coefReg=1.00, coefImp=1 
- Fold3.Rep2: mtry=8, coefReg=1.00, coefImp=1 
+ Fold1.Rep3: mtry=2, coefReg=0.01, coefImp=0 
- Fold1.Rep3: mtry=2, coefReg=0.01, coefImp=0 
+ Fold1.Rep3: mtry=8, coefReg=0.01, coefImp=0 
- Fold1.Rep3: mtry=8, coefReg=0.01, coefImp=0 
+ Fold1.Rep3: mtry=2, coefReg=1.00, coefImp=0 
- Fold1.Rep3: mtry=2, coefReg=1.00, coefImp=0 
+ Fold1.Rep3: mtry=8, coefReg=1.00, coefImp=0 
- Fold1.Rep3: mtry=8, coefReg=1.00, coefImp=0 
+ Fold1.Rep3: mtry=2, coefReg=0.01, coefImp=1 
- Fold1.Rep3: mtry=2, coefReg=0.01, coefImp=1 
+ Fold1.Rep3: mtry=8, coefReg=0.01, coefImp=1 
- Fold1.Rep3: mtry=8, coefReg=0.01, coefImp=1 
+ Fold1.Rep3: mtry=2, coefReg=1.00, coefImp=1 
- Fold1.Rep3: mtry=2, coefReg=1.00, coefImp=1 
+ Fold1.Rep3: mtry=8, coefReg=1.00, coefImp=1 
- Fold1.Rep3: mtry=8, coefReg=1.00, coefImp=1 
+ Fold2.Rep3: mtry=2, coefReg=0.01, coefImp=0 
- Fold2.Rep3: mtry=2, coefReg=0.01, coefImp=0 
+ Fold2.Rep3: mtry=8, coefReg=0.01, coefImp=0 
- Fold2.Rep3: mtry=8, coefReg=0.01, coefImp=0 
+ Fold2.Rep3: mtry=2, coefReg=1.00, coefImp=0 
- Fold2.Rep3: mtry=2, coefReg=1.00, coefImp=0 
+ Fold2.Rep3: mtry=8, coefReg=1.00, coefImp=0 
- Fold2.Rep3: mtry=8, coefReg=1.00, coefImp=0 
+ Fold2.Rep3: mtry=2, coefReg=0.01, coefImp=1 
- Fold2.Rep3: mtry=2, coefReg=0.01, coefImp=1 
+ Fold2.Rep3: mtry=8, coefReg=0.01, coefImp=1 
- Fold2.Rep3: mtry=8, coefReg=0.01, coefImp=1 
+ Fold2.Rep3: mtry=2, coefReg=1.00, coefImp=1 
- Fold2.Rep3: mtry=2, coefReg=1.00, coefImp=1 
+ Fold2.Rep3: mtry=8, coefReg=1.00, coefImp=1 
- Fold2.Rep3: mtry=8, coefReg=1.00, coefImp=1 
+ Fold3.Rep3: mtry=2, coefReg=0.01, coefImp=0 
- Fold3.Rep3: mtry=2, coefReg=0.01, coefImp=0 
+ Fold3.Rep3: mtry=8, coefReg=0.01, coefImp=0 
- Fold3.Rep3: mtry=8, coefReg=0.01, coefImp=0 
+ Fold3.Rep3: mtry=2, coefReg=1.00, coefImp=0 
- Fold3.Rep3: mtry=2, coefReg=1.00, coefImp=0 
+ Fold3.Rep3: mtry=8, coefReg=1.00, coefImp=0 
- Fold3.Rep3: mtry=8, coefReg=1.00, coefImp=0 
+ Fold3.Rep3: mtry=2, coefReg=0.01, coefImp=1 
- Fold3.Rep3: mtry=2, coefReg=0.01, coefImp=1 
+ Fold3.Rep3: mtry=8, coefReg=0.01, coefImp=1 
- Fold3.Rep3: mtry=8, coefReg=0.01, coefImp=1 
+ Fold3.Rep3: mtry=2, coefReg=1.00, coefImp=1 
- Fold3.Rep3: mtry=2, coefReg=1.00, coefImp=1 
+ Fold3.Rep3: mtry=8, coefReg=1.00, coefImp=1 
- Fold3.Rep3: mtry=8, coefReg=1.00, coefImp=1 
Aggregating results
Selecting tuning parameters
Fitting mtry = 8, coefReg = 1, coefImp = 0 on full training set
Regularized Random Forest 

768 samples
  8 predictor
  2 classes: 'neg', 'pos' 

Pre-processing: centered (8), scaled (8) 
Resampling: Cross-Validated (3 fold, repeated 3 times) 
Summary of sample sizes: 511, 513, 512, 511, 512, 513, ... 
Resampling results across tuning parameters:

  mtry  coefReg  coefImp  Accuracy   Kappa    
  2     0.01     0        0.6979612  0.3109660
  2     0.01     1        0.7356897  0.4016199
  2     1.00     0        0.7561095  0.4473596
  2     1.00     1        0.7257035  0.3835740
  8     0.01     0        0.7183198  0.3607852
  8     0.01     1        0.7257018  0.3804981
  8     1.00     0        0.7565351  0.4484112
  8     1.00     1        0.7235486  0.3759642

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were mtry = 8, coefReg = 1 and coefImp = 0.
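
The length of the two training logs follows from the grid size. With `tuneLength = 2`, caret tried two values of each tuning parameter, so cforest had 2 candidates and RRF had 2 x 2 x 2 = 8. Each candidate is fitted once per resample (3 folds x 3 repeats = 9). A rough base R sketch of that arithmetic, with the candidate values copied from the logged output (the variable names are illustrative):

```r
# 3-fold CV repeated 3 times gives 9 resamples per candidate
n_resamples <- 3 * 3

# Candidate values as they appear in the training logs
cforest_grid <- expand.grid(mtry = c(2, 8))
rrf_grid     <- expand.grid(mtry = c(2, 8),
                            coefReg = c(0.01, 1),
                            coefImp = c(0, 1))

# Resampled fits per model (excludes the one final fit on the full data)
fits <- c(cforest = nrow(cforest_grid), RRF = nrow(rrf_grid)) * n_resamples
print(fits)
```

This gives 18 resampled fits for cforest and 72 for RRF, matching the number of `+`/`-` line pairs in the logs. Each extra tuning parameter multiplies the cost, so larger `tuneLength` values get expensive quickly for RRF.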
Call:
summary.resamples(object = results)

Models: RF1, RF2 
Number of resamples: 9 

Accuracy 
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
RF1 0.7421875 0.7529412 0.7656250 0.7686600 0.7882353 0.7898833    0
RF2 0.7148438 0.7372549 0.7626459 0.7565351 0.7686275 0.7968750    0

Kappa 
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
RF1 0.3830424 0.4084771 0.4352049 0.4535750 0.5047330 0.5080817    0
RF2 0.3491223 0.3921302 0.4689543 0.4484112 0.4894982 0.5350981    0
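
The two "Summary of sample sizes" lines above differ, so the models were evaluated on different random folds, and the mean accuracies (0.769 vs 0.757) are close enough that fold noise could matter. One possible way to get a paired comparison is sketched below: reset the same seed before each `train()` call so that both models see identical folds, then use caret's `diff()` method on the `resamples` object. The seed value 7 and the names `fit_a`, `fit_b`, `paired`, and `differences` are assumptions made for this sketch, and its numbers will not reproduce the output above.

```r
library(mlbench)
library(caret)

data(PimaIndiansDiabetes)
control <- trainControl(method = "repeatedcv", number = 3, repeats = 3)

# Same seed before each call (7 is arbitrary) so both models get identical folds
set.seed(7)
fit_a <- train(diabetes ~ ., data = PimaIndiansDiabetes, method = "cforest",
               metric = "Accuracy", preProc = c("center", "scale"),
               trControl = control, tuneLength = 2)

set.seed(7)
fit_b <- train(diabetes ~ ., data = PimaIndiansDiabetes, method = "RRF",
               metric = "Accuracy", preProc = c("center", "scale"),
               trControl = control, tuneLength = 2)

# Paired resampling results and pairwise differences between the models
paired <- resamples(list(RF1 = fit_a, RF2 = fit_b))
differences <- diff(paired)
summary(differences)
```

With only 9 resamples the comparison has little power, so a non-significant difference here should be read as "no evidence either way" rather than as proof that the two models are equivalent.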