KNN, PLS, PDA algorithms

The k-nearest neighbors algorithm, or kNN, is one of the simplest machine learning algorithms. Usually, k is a small, odd number, sometimes only 1; an odd k avoids tied votes in two-class problems. A larger k is not automatically more accurate: it averages over more neighbors, which smooths the decision boundary and makes the classifier less sensitive to noise, but too large a k blurs genuine class boundaries and lets the majority class dominate. In practice, k is chosen by cross-validation.
Let’s say you want to classify an object into one of several classes, for example "pictures containing a face" and "pictures not containing a face". You do this by looking at the k elements of the training set that are closest to the one you want to classify, and letting them vote by majority on what that object’s class should be. If k = 3 and two of the closest elements are in class A and only one is in class B, you would conclude that the element you are trying to classify belongs to class A. "Closest" here usually refers to the Euclidean distance in n-dimensional feature space, although other distance metrics can be used. Because distances depend on the scale of each feature, the features are usually centered and scaled first.
There's also weighted kNN, which is like kNN except that closer neighbors count as stronger votes. If there is one example of class A close to the input and two examples of class B that are farther away, the algorithm might still classify the input as class A.
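The voting rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the function name knn_predict, the toy points, and the inverse-distance weighting are all made up for the example, and this is not the implementation that caret uses later on.

```r
# Minimal kNN classifier in base R (illustrative sketch, not the caret implementation).
# train_x: numeric matrix of training points (one row per point)
# train_y: vector of class labels for the rows of train_x
# query:   numeric vector, the point to classify
# weighted = TRUE makes each neighbour vote with weight 1 / distance
knn_predict <- function(train_x, train_y, query, k = 3, weighted = FALSE) {
  # Euclidean distance from the query to every training point
  d <- sqrt(rowSums(sweep(train_x, 2, query)^2))
  nearest <- order(d)[seq_len(k)]
  # Vote weights: 1 per neighbour, or inverse distance
  # (the small constant avoids division by zero)
  w <- if (weighted) 1 / (d[nearest] + 1e-8) else rep(1, k)
  votes <- tapply(w, as.character(train_y[nearest]), sum)
  names(votes)[which.max(votes)]
}

# Toy data: one class A point close to the query, two class B points farther away
train_x <- rbind(c(1.0, 1.0), c(3.0, 3.0), c(3.2, 2.8))
train_y <- c("A", "B", "B")
query   <- c(1.2, 1.1)

plain_vote    <- knn_predict(train_x, train_y, query, k = 3)
weighted_vote <- knn_predict(train_x, train_y, query, k = 3, weighted = TRUE)
print(c(plain = plain_vote, weighted = weighted_vote))
```

With these toy points the plain majority vote goes to class B (two votes against one), while the distance-weighted vote goes to class A, because the single A neighbour is much closer than the two B neighbours.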

Partial least squares regression (PLS regression) is a statistical method related to principal components regression (PCR). PCR builds its components from the directions of maximum variance in the predictors alone, ignoring the response; PLS instead projects both the predictors (X) and the responses (Y) to a new space and chooses components in X that have maximum covariance with Y, then fits a linear regression model on those components. Because both the X and Y data are projected to new spaces, the PLS family of methods are known as bilinear factor models. Partial least squares discriminant analysis (PLS-DA) is a variant used when the Y is categorical.
PLS is used to find the fundamental relations between two matrices (X and Y), i.e. a latent variable approach to modeling the covariance structures in these two spaces. A PLS model will try to find the multidimensional direction in the X space that explains the maximum multidimensional variance direction in the Y space. PLS regression is particularly suited when the matrix of predictors has more variables than observations, and when there is multicollinearity among X values. By contrast, standard regression will fail in these cases (unless it is regularized).
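To make the idea of a latent direction concrete, here is a rough base R sketch of the first PLS component for a single response (often called PLS1). The simulated data and all variable names are assumptions made for illustration; a full PLS fit extracts several components and deflates X between them, which this sketch leaves out.

```r
# First PLS component for a single response (PLS1), in base R.
# Illustrative sketch only; the data and names are made up for the example.
set.seed(1)  # arbitrary seed
n  <- 50
x1 <- rnorm(n)
X  <- cbind(x1 = x1,
            x2 = x1 + rnorm(n, sd = 0.05),  # nearly collinear with x1
            x3 = rnorm(n))
y  <- 2 * x1 + rnorm(n, sd = 0.1)

# Centre predictors and response
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# Weight vector: the unit-length direction in X space whose scores
# have maximal covariance with y (proportional to t(Xc) %*% yc)
w <- crossprod(Xc, yc)
w <- w / sqrt(sum(w^2))

# Scores (the latent variable) and regression of y on the scores
t_scores <- Xc %*% w
b        <- sum(t_scores * yc) / sum(t_scores^2)
y_hat    <- mean(y) + b * t_scores

print(round(drop(w), 3))
print(cor(y, drop(y_hat)))
```

Note that the two nearly collinear predictors x1 and x2 do not cause any trouble here: no matrix has to be inverted to obtain the component, which is why PLS copes with multicollinearity where ordinary least squares does not.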
Partial least squares was introduced by the Swedish statistician Herman O. A. Wold, who then developed it with his son, Svante Wold. An alternative term for PLS (and more correct according to Svante Wold) is projection to latent structures, but the term partial least squares is still dominant in many areas. Although the original applications were in the social sciences, PLS regression is today most widely used in chemometrics and related areas. It is also used in bioinformatics, sensometrics, neuroscience, and anthropology.

Here we are going to implement KNN, PLS and PDA using the Telecom Churn dataset.

0. Loading required libraries

In [3]:
library(DBI)
library(corrgram)
library(caret) 
library(gridExtra)
library(ggpubr)

1. Setting up the code parallelizing

It is good practice to parallelize your code. The usual motivation for parallel computing is that something takes too long; for some people that means any computation that runs for more than 3 minutes. Parallelization in R is often simple to set up, and many time-consuming tasks are embarrassingly parallel, that is, they consist of independent pieces that can run at the same time. Here are a few common tasks that fit the description:

  • Bootstrapping
  • Cross-validation
  • Multivariate Imputation by Chained Equations (MICE)
  • Fitting multiple regression models
You can find out more about parallelizing your computations in R - here.

For Windows users

In [ ]:
# process in parallel on Windows
library(doParallel) 
cl <- makeCluster(detectCores(), type='PSOCK')
registerDoParallel(cl)

For Mac OSX and Unix like systems users

In [6]:
# process in parallel on Mac OSX and UNIX like systems
library(doMC)
registerDoMC(cores = 4)
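As a platform-independent alternative, the following sketch uses only the parallel package that ships with R. The worker count of 2 and the toy function slow_square are assumptions chosen for illustration. The same clean-up step applies to the doParallel setup above: call stopCluster(cl) once the work is done.

```r
# Portable sketch using only the 'parallel' package that ships with R.
# A PSOCK cluster works on Windows, macOS and Linux alike.
library(parallel)

# Use at most 2 workers here; detectCores() may return NA on some systems
n_workers <- max(1, min(2, detectCores(), na.rm = TRUE))
cl <- makeCluster(n_workers, type = "PSOCK")

# An embarrassingly parallel task: each input is processed independently
slow_square <- function(x) {
  Sys.sleep(0.1)  # stand-in for an expensive computation
  x^2
}
parallel_result <- unlist(parLapply(cl, 1:8, slow_square))

# Always release the workers when finished
stopCluster(cl)

print(parallel_result)
```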

2. Importing Data

In [8]:
#Set working directory where CSV is located

#getwd()
#setwd("...YOUR WORKING DIRECTORY WITH A DATASET...")
#getwd()
In [7]:
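# Load the dataset. stringsAsFactors = TRUE keeps text columns as factors (as in the output below);
# since R 4.0 the default is FALSE, which would break the as.numeric() conversions later on.
dataSet <- read.csv("TelcoCustomerChurnDataset.csv", header = TRUE, sep = ',', stringsAsFactors = TRUE)
colnames(dataSet) # Check the data frame column names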
  1. 'Account_Length'
  2. 'Vmail_Message'
  3. 'Day_Mins'
  4. 'Eve_Mins'
  5. 'Night_Mins'
  6. 'Intl_Mins'
  7. 'CustServ_Calls'
  8. 'Churn'
  9. 'Intl_Plan'
  10. 'Vmail_Plan'
  11. 'Day_Calls'
  12. 'Day_Charge'
  13. 'Eve_Calls'
  14. 'Eve_Charge'
  15. 'Night_Calls'
  16. 'Night_Charge'
  17. 'Intl_Calls'
  18. 'Intl_Charge'
  19. 'State'
  20. 'Area_Code'
  21. 'Phone'

3. Exploring the dataset

In [8]:
# Print top 10 rows in the dataSet
head(dataSet, 10)
A data.frame: 10 × 21
      Account_Length  Vmail_Message  Day_Mins  Eve_Mins  Night_Mins  Intl_Mins  CustServ_Calls  Churn  Intl_Plan  Vmail_Plan  ⋯  Day_Charge  Eve_Calls  Eve_Charge  Night_Calls  Night_Charge  Intl_Calls  Intl_Charge  State  Area_Code     Phone
               <int>          <int>     <dbl>     <dbl>       <dbl>      <dbl>           <int>  <fct>      <fct>       <fct>  ⋯       <dbl>      <int>       <dbl>        <int>         <dbl>       <int>        <dbl>  <fct>      <int>     <fct>
   1             128             25     265.1     197.4       244.7       10.0               1     no         no         yes  ⋯       45.07         99       16.78           91         11.01           3         2.70     KS        415  382-4657
   2             107             26     161.6     195.5       254.4       13.7               1     no         no         yes  ⋯       27.47        103       16.62          103         11.45           3         3.70     OH        415  371-7191
   3             137              0     243.4     121.2       162.6       12.2               0     no         no          no  ⋯       41.38        110       10.30          104          7.32           5         3.29     NJ        415  358-1921
   4              84              0     299.4      61.9       196.9        6.6               2     no        yes          no  ⋯       50.90         88        5.26           89          8.86           7         1.78     OH        408  375-9999
   5              75              0     166.7     148.3       186.9       10.1               3     no        yes          no  ⋯       28.34        122       12.61          121          8.41           3         2.73     OK        415  330-6626
   6             118              0     223.4     220.6       203.9        6.3               0     no        yes          no  ⋯       37.98        101       18.75          118          9.18           6         1.70     AL        510  391-8027
   7             121             24     218.2     348.5       212.6        7.5               3     no         no         yes  ⋯       37.09        108       29.62          118          9.57           7         2.03     MA        510  355-9993
   8             147              0     157.0     103.1       211.8        7.1               0     no        yes          no  ⋯       26.69         94        8.76           96          9.53           6         1.92     MO        415  329-9001
   9             117              0     184.5     351.6       215.8        8.7               1     no         no          no  ⋯       31.37         80       29.89           90          9.71           4         2.35     LA        408  335-4719
  10             141             37     258.6     222.0       326.4       11.2               0     no        yes         yes  ⋯       43.96        111       18.87           97         14.69           5         3.02     WV        415  330-8173
In [9]:
# Print last 10 rows in the dataSet
tail(dataSet, 10)
A data.frame: 10 × 21
      Account_Length  Vmail_Message  Day_Mins  Eve_Mins  Night_Mins  Intl_Mins  CustServ_Calls  Churn  Intl_Plan  Vmail_Plan  ⋯  Day_Charge  Eve_Calls  Eve_Charge  Night_Calls  Night_Charge  Intl_Calls  Intl_Charge  State  Area_Code     Phone
               <int>          <int>     <dbl>     <dbl>       <dbl>      <dbl>           <int>  <fct>      <fct>       <fct>  ⋯       <dbl>      <int>       <dbl>        <int>         <dbl>       <int>        <dbl>  <fct>      <int>     <fct>
3324             117              0     118.4     249.3       227.0       13.6               5    yes         no          no  ⋯       20.13         97       21.19           56         10.22           3         3.67     IN        415  362-5899
3325             159              0     169.8     197.7       193.7       11.6               1     no         no          no  ⋯       28.87        105       16.80           82          8.72           4         3.13     WV        415  377-1164
3326              78              0     193.4     116.9       243.3        9.3               2     no         no          no  ⋯       32.88         88        9.94          109         10.95           4         2.51     OH        408  368-8555
3327              96              0     106.6     284.8       178.9       14.9               1     no         no          no  ⋯       18.12         87       24.21           92          8.05           7         4.02     OH        415  347-6812
3328              79              0     134.7     189.7       221.4       11.8               2     no         no          no  ⋯       22.90         68       16.12          128          9.96           5         3.19     SC        415  348-3830
3329             192             36     156.2     215.5       279.1        9.9               2     no         no         yes  ⋯       26.55        126       18.32           83         12.56           6         2.67     AZ        415  414-4276
3330              68              0     231.1     153.4       191.3        9.6               3     no         no          no  ⋯       39.29         55       13.04          123          8.61           4         2.59     WV        415  370-3271
3331              28              0     180.8     288.8       191.9       14.1               2     no         no          no  ⋯       30.74         58       24.55           91          8.64           6         3.81     RI        510  328-8230
3332             184              0     213.8     159.6       139.2        5.0               2     no        yes          no  ⋯       36.35         84       13.57          137          6.26          10         1.35     CT        510  364-6381
3333              74             25     234.4     265.9       241.4       13.7               0     no         no         yes  ⋯       39.85         82       22.60           77         10.86           4         3.70     TN        415  400-4344
In [10]:
# Dimensions of the dataset (rows, columns)
dim(dataSet)
  1. 3333
  2. 21
In [11]:
# Check Data types of each column
table(unlist(lapply(dataSet, class)))
 factor integer numeric 
      5       8       8 
In [12]:
# Check Data types of individual column
data.class(dataSet$Account_Length) 
data.class(dataSet$Vmail_Message) 
data.class(dataSet$Day_Mins)
data.class(dataSet$Eve_Mins)
data.class(dataSet$Night_Mins) 
data.class(dataSet$Intl_Mins)
data.class(dataSet$CustServ_Calls)
data.class(dataSet$Intl_Plan) 
data.class(dataSet$Vmail_Plan)
data.class(dataSet$Day_Calls)
data.class(dataSet$Day_Charge) 
data.class(dataSet$Eve_Calls)
data.class(dataSet$Eve_Charge) 
data.class(dataSet$Night_Calls)
data.class(dataSet$Night_Charge)
data.class(dataSet$Intl_Calls) 
data.class(dataSet$Intl_Charge)
data.class(dataSet$State) 
data.class(dataSet$Phone)
data.class(dataSet$Churn)
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'factor'
'factor'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'numeric'
'factor'
'factor'
'factor'

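Converting the factor variables Intl_Plan, Vmail_Plan and State to a numeric data type. Note that as.numeric() on a factor returns the integer level codes, so "no"/"yes" become 1/2 (not 0/1) and each state is replaced by the position of its abbreviation among the alphabetically sorted levels. This gives State an arbitrary numeric ordering, which distance-based methods such as kNN will treat as meaningful.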

In [13]:
dataSet$Intl_Plan <- as.numeric(dataSet$Intl_Plan)
dataSet$Vmail_Plan <- as.numeric(dataSet$Vmail_Plan)
dataSet$State <- as.numeric(dataSet$State)
In [14]:
# Check Data types of each column
table(unlist(lapply(dataSet, class)))
 factor integer numeric 
      2       8      11 
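The effect of as.numeric() on a factor is easy to check on a small made-up vector (the names plan, codes and indicator are assumptions for this illustration). It also shows one way to get a 0/1 indicator instead, should that coding be preferred.

```r
# A small made-up factor with the same levels as Intl_Plan and Vmail_Plan
plan <- factor(c("no", "yes", "no", "yes", "yes"))
levels(plan)                 # levels are sorted alphabetically: "no" "yes"

codes <- as.numeric(plan)    # level codes: "no" -> 1, "yes" -> 2
print(codes)

# If a 0/1 indicator is preferred, compare against the level explicitly
indicator <- as.numeric(plan == "yes")
print(indicator)
```

This is why the summaries of Intl_Plan and Vmail_Plan further down range from 1 to 2 rather than from 0 to 1.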

4. Exploring or Summarising dataset with descriptive statistics

In [15]:
# Count the missing values in each row
rowSums(is.na(dataSet))
  0 for every displayed row (the notebook prints 400 of the 3333 values and elides the rest with ⋯)
In [16]:
# Count the missing values in each column
colSums(is.na(dataSet))
Account_Length
0
Vmail_Message
0
Day_Mins
0
Eve_Mins
0
Night_Mins
0
Intl_Mins
0
CustServ_Calls
0
Churn
0
Intl_Plan
0
Vmail_Plan
0
Day_Calls
0
Day_Charge
0
Eve_Calls
0
Eve_Charge
0
Night_Calls
0
Night_Charge
0
Intl_Calls
0
Intl_Charge
0
State
0
Area_Code
0
Phone
0

Missing value checking using different packages (mice and VIM)

In [17]:
#Checking missing value with the mice package
library(mice)
md.pattern(dataSet)
 /\     /\
{  `---'  }
{  O   O  }
==>  V <==  No need for mice. This data set is completely observed.
 \  \|/  /
  `-----'

A matrix: 2 × 22 of type dbl
      Account_Length  Vmail_Message  Day_Mins  Eve_Mins  Night_Mins  Intl_Mins  CustServ_Calls  Churn  Intl_Plan  Vmail_Plan  ⋯  Eve_Calls  Eve_Charge  Night_Calls  Night_Charge  Intl_Calls  Intl_Charge  State  Area_Code  Phone
3333               1              1         1         1           1          1               1      1          1           1  ⋯          1           1            1             1           1            1      1          1      1  0
                   0              0         0         0           0          0               0      0          0           0  ⋯          0           0            0             0           0            0      0          0      0  0
In [18]:
#Checking missing value with the VIM package
library(VIM)
mice_plot <- aggr(dataSet, col=c('navyblue','yellow'),
                  numbers=TRUE, sortVars=TRUE,
                  labels=names(dataSet[1:21]), cex.axis=.9,
                  gap=3, ylab=c("Missing data","Pattern"))
 Variables sorted by number of missings: 
       Variable Count
 Account_Length     0
  Vmail_Message     0
       Day_Mins     0
       Eve_Mins     0
     Night_Mins     0
      Intl_Mins     0
 CustServ_Calls     0
          Churn     0
      Intl_Plan     0
     Vmail_Plan     0
      Day_Calls     0
     Day_Charge     0
      Eve_Calls     0
     Eve_Charge     0
    Night_Calls     0
   Night_Charge     0
     Intl_Calls     0
    Intl_Charge     0
          State     0
      Area_Code     0
          Phone     0

All of these checks agree: the dataset contains no missing values.
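For a quick check without extra packages, base R is enough. The sketch below runs on a tiny made-up data frame (toy is not part of the churn data) so that the functions have something to find.

```r
# Toy data frame with two missing entries (made up for illustration)
toy <- data.frame(a = c(1, NA, 3), b = c("x", "y", NA))

any_missing    <- anyNA(toy)                  # TRUE if at least one NA anywhere
missing_by_col <- colSums(is.na(toy))         # NA count per column
rows_with_na   <- sum(!complete.cases(toy))   # number of incomplete rows

print(any_missing)
print(missing_by_col)
print(rows_with_na)
```

On a data frame with thousands of rows, these one-number and one-per-column summaries are easier to read than the full per-row output of rowSums(is.na(...)).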

Summary of dataset

In [19]:
# Selecting just columns with numeric data type
numericalCols <- colnames(dataSet[c(1:7,9:20)])

Difference between the lapply and sapply functions (we will use both in the following cells):
Use lapply to apply a function to each element of a list (or each column of a data frame) in turn and get a list back.
Use sapply to do the same but get the result simplified to a vector (or matrix) where possible, rather than a list.
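The difference in return type is easiest to see on a small made-up list (cols and the result names are assumptions for this illustration). The sketch also shows vapply, a stricter relative of sapply in base R that makes you declare what each result should look like.

```r
cols <- list(a = 1:4, b = c(2.5, 3.5))

as_list   <- lapply(cols, mean)  # list with one element per input
as_vector <- sapply(cols, mean)  # simplified to a named numeric vector

str(as_list)
str(as_vector)

# vapply is a stricter variant: the expected type and length
# of each result is declared up front via FUN.VALUE
checked <- vapply(cols, mean, FUN.VALUE = numeric(1))
print(checked)
```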

Finding statistics metrics with lapply function

In [20]:
#Sum
lapply(dataSet[numericalCols], FUN = sum)
$Account_Length
336849
$Vmail_Message
26994
$Day_Mins
599190.4
$Eve_Mins
669867.5
$Night_Mins
669506.5
$Intl_Mins
34120.9
$CustServ_Calls
5209
$Intl_Plan
3656
$Vmail_Plan
4255
$Day_Calls
334752
$Day_Charge
101864.17
$Eve_Calls
333681
$Eve_Charge
56939.44
$Night_Calls
333659
$Night_Charge
30128.07
$Intl_Calls
14930
$Intl_Charge
9214.35
$State
90189
$Area_Code
1457129
In [21]:
#Mean
lapply(dataSet[numericalCols], FUN = mean)
$Account_Length
101.064806480648
$Vmail_Message
8.0990099009901
$Day_Mins
179.775097509751
$Eve_Mins
200.980348034803
$Night_Mins
200.87203720372
$Intl_Mins
10.2372937293729
$CustServ_Calls
1.56285628562856
$Intl_Plan
1.0969096909691
$Vmail_Plan
1.27662766276628
$Day_Calls
100.435643564356
$Day_Charge
30.5623072307231
$Eve_Calls
100.114311431143
$Eve_Charge
17.0835403540354
$Night_Calls
100.107710771077
$Night_Charge
9.03932493249325
$Intl_Calls
4.47944794479448
$Intl_Charge
2.76458145814581
$State
27.0594059405941
$Area_Code
437.182418241824
In [22]:
#median
lapply(dataSet[numericalCols], FUN = median)
$Account_Length
101
$Vmail_Message
0
$Day_Mins
179.4
$Eve_Mins
201.4
$Night_Mins
201.2
$Intl_Mins
10.3
$CustServ_Calls
1
$Intl_Plan
1
$Vmail_Plan
1
$Day_Calls
101
$Day_Charge
30.5
$Eve_Calls
100
$Eve_Charge
17.12
$Night_Calls
100
$Night_Charge
9.05
$Intl_Calls
4
$Intl_Charge
2.78
$State
27
$Area_Code
415
In [23]:
#Min
lapply(dataSet[numericalCols], FUN = min)
$Account_Length
1
$Vmail_Message
0
$Day_Mins
0
$Eve_Mins
0
$Night_Mins
23.2
$Intl_Mins
0
$CustServ_Calls
0
$Intl_Plan
1
$Vmail_Plan
1
$Day_Calls
0
$Day_Charge
0
$Eve_Calls
0
$Eve_Charge
0
$Night_Calls
33
$Night_Charge
1.04
$Intl_Calls
0
$Intl_Charge
0
$State
1
$Area_Code
408
In [24]:
#Max
lapply(dataSet[numericalCols], FUN = max)
$Account_Length
243
$Vmail_Message
51
$Day_Mins
350.8
$Eve_Mins
363.7
$Night_Mins
395
$Intl_Mins
20
$CustServ_Calls
9
$Intl_Plan
2
$Vmail_Plan
2
$Day_Calls
165
$Day_Charge
59.64
$Eve_Calls
170
$Eve_Charge
30.91
$Night_Calls
175
$Night_Charge
17.77
$Intl_Calls
20
$Intl_Charge
5.4
$State
51
$Area_Code
510
In [25]:
#Length
lapply(dataSet[numericalCols], FUN = length)
$Account_Length
3333
$Vmail_Message
3333
$Day_Mins
3333
$Eve_Mins
3333
$Night_Mins
3333
$Intl_Mins
3333
$CustServ_Calls
3333
$Intl_Plan
3333
$Vmail_Plan
3333
$Day_Calls
3333
$Day_Charge
3333
$Eve_Calls
3333
$Eve_Charge
3333
$Night_Calls
3333
$Night_Charge
3333
$Intl_Calls
3333
$Intl_Charge
3333
$State
3333
$Area_Code
3333

Finding statistics metrics with sapply function

In [26]:
# Sum
sapply(dataSet[numericalCols], FUN = sum)
Account_Length
336849
Vmail_Message
26994
Day_Mins
599190.4
Eve_Mins
669867.5
Night_Mins
669506.5
Intl_Mins
34120.9
CustServ_Calls
5209
Intl_Plan
3656
Vmail_Plan
4255
Day_Calls
334752
Day_Charge
101864.17
Eve_Calls
333681
Eve_Charge
56939.44
Night_Calls
333659
Night_Charge
30128.07
Intl_Calls
14930
Intl_Charge
9214.35
State
90189
Area_Code
1457129
In [27]:
# Mean
sapply(dataSet[numericalCols], FUN = mean)
Account_Length
101.064806480648
Vmail_Message
8.0990099009901
Day_Mins
179.775097509751
Eve_Mins
200.980348034803
Night_Mins
200.87203720372
Intl_Mins
10.2372937293729
CustServ_Calls
1.56285628562856
Intl_Plan
1.0969096909691
Vmail_Plan
1.27662766276628
Day_Calls
100.435643564356
Day_Charge
30.5623072307231
Eve_Calls
100.114311431143
Eve_Charge
17.0835403540354
Night_Calls
100.107710771077
Night_Charge
9.03932493249325
Intl_Calls
4.47944794479448
Intl_Charge
2.76458145814581
State
27.0594059405941
Area_Code
437.182418241824
In [28]:
# Median
sapply(dataSet[numericalCols], FUN = median)
Account_Length
101
Vmail_Message
0
Day_Mins
179.4
Eve_Mins
201.4
Night_Mins
201.2
Intl_Mins
10.3
CustServ_Calls
1
Intl_Plan
1
Vmail_Plan
1
Day_Calls
101
Day_Charge
30.5
Eve_Calls
100
Eve_Charge
17.12
Night_Calls
100
Night_Charge
9.05
Intl_Calls
4
Intl_Charge
2.78
State
27
Area_Code
415
In [29]:
# Min
sapply(dataSet[numericalCols], FUN = min)
Account_Length
1
Vmail_Message
0
Day_Mins
0
Eve_Mins
0
Night_Mins
23.2
Intl_Mins
0
CustServ_Calls
0
Intl_Plan
1
Vmail_Plan
1
Day_Calls
0
Day_Charge
0
Eve_Calls
0
Eve_Charge
0
Night_Calls
33
Night_Charge
1.04
Intl_Calls
0
Intl_Charge
0
State
1
Area_Code
408
In [30]:
# Max
sapply(dataSet[numericalCols], FUN = max)
Account_Length
243
Vmail_Message
51
Day_Mins
350.8
Eve_Mins
363.7
Night_Mins
395
Intl_Mins
20
CustServ_Calls
9
Intl_Plan
2
Vmail_Plan
2
Day_Calls
165
Day_Charge
59.64
Eve_Calls
170
Eve_Charge
30.91
Night_Calls
175
Night_Charge
17.77
Intl_Calls
20
Intl_Charge
5.4
State
51
Area_Code
510
In [31]:
# Length
sapply(dataSet[numericalCols], FUN = length)
Account_Length
3333
Vmail_Message
3333
Day_Mins
3333
Eve_Mins
3333
Night_Mins
3333
Intl_Mins
3333
CustServ_Calls
3333
Intl_Plan
3333
Vmail_Plan
3333
Day_Calls
3333
Day_Charge
3333
Eve_Calls
3333
Eve_Charge
3333
Night_Calls
3333
Night_Charge
3333
Intl_Calls
3333
Intl_Charge
3333
State
3333
Area_Code
3333

In the next few cells, you will find three different options on how to aggregate data.

In [32]:
# OPTION 1: (Using Aggregate FUNCTION - all variables together)
aggregate(dataSet[numericalCols], list(dataSet$Churn), summary)
A data.frame: 2 × 20 (Group.1 <fct>, then one <dbl[,6]> summary per variable; shown here with one variable per row)

Group.1 = no
Variable          Min.   1st Qu.   Median       Mean   3rd Qu.    Max.
Account_Length       1        73      100   100.7937       127     243
Vmail_Message        0         0        0   8.604561        22      51
Day_Mins             0   142.825    177.2   175.1758    210.30   315.6
Eve_Mins           0.0     164.5    199.6   199.0433    233.20   361.8
Night_Mins        23.2    165.90   200.25   200.1332    234.90   395.0
Intl_Mins            0       8.4     10.2   10.15888      12.0    18.9
CustServ_Calls       0         1        1   1.449825         2       8
Intl_Plan            1         1        1   1.065263         1       2
Vmail_Plan           1         1        1   1.295439         2       2
Day_Calls            0      87.0      100   100.2832     114.0     163
Day_Charge           0   24.2825    30.12   29.78042     35.75   53.65
Eve_Calls            0        87      100   100.0386       114     170
Eve_Charge        0.00    13.980    16.97   16.91891    19.820   30.75
Night_Calls         33        87      100   100.0582       113     175
Night_Charge      1.04     7.470     9.01   9.006074    10.570   17.77
Intl_Calls           0         3        4   4.532982         6      19
Intl_Charge       0.00      2.27     2.75   2.743404      3.24     5.1
State                1        14       27   27.01193        40      51
Area_Code          408       408      415   437.0747       510     510

Group.1 = yes
Variable          Min.   1st Qu.   Median       Mean   3rd Qu.    Max.
Account_Length       1        76      103   102.6646       127     225
Vmail_Message        0         0        0   5.115942         0      48
Day_Mins             0   153.250    217.6   206.9141    265.95   350.8
Eve_Mins          70.9     177.1    211.3   212.4101    249.45   363.7
Night_Mins        47.4    171.25   204.80   205.2317    239.85   354.9
Intl_Mins            2       8.8     10.6   10.70000      12.8    20.0
CustServ_Calls       0         1        2   2.229814         4       9
Intl_Plan            1         1        1   1.283644         2       2
Vmail_Plan           1         1        1   1.165631         1       2
Day_Calls            0      87.5      103   101.3354     116.5     165
Day_Charge           0   26.0550    36.99   35.17592     45.21   59.64
Eve_Calls           48        87      101   100.5611       114     168
Eve_Charge        6.03    15.055    17.96   18.05497    21.205   30.91
Night_Calls         49        85      100   100.3996       115     158
Night_Charge      2.13     7.705     9.22   9.235528    10.795   15.97
Intl_Calls           1         2        4   4.163561         5      20
Intl_Charge       0.54      2.38     2.86   2.889545      3.46     5.4
State                1        17       27   27.33954        39      51
Area_Code          408       408      415   437.8178       510     510
In [33]:
# OPTION 2: (Using Aggregate FUNCTION - variables separately)
aggregate(dataSet$Intl_Mins, list(dataSet$Churn), summary)
aggregate(dataSet$Day_Mins, list(dataSet$Churn), summary)
aggregate(dataSet$Night_Mins, list(dataSet$Churn), summary)
A data.frame: 2 × 2 (Intl_Mins; x holds Min., 1st Qu., Median, Mean, 3rd Qu., Max.)
Group.1  x
<fct>    <dbl[,6]>
no       0, 8.4, 10.2, 10.15888, 12.0, 18.9
yes      2, 8.8, 10.6, 10.70000, 12.8, 20.0
A data.frame: 2 × 2 (Day_Mins)
Group.1  x
<fct>    <dbl[,6]>
no       0, 142.825, 177.2, 175.1758, 210.30, 315.6
yes      0, 153.250, 217.6, 206.9141, 265.95, 350.8
A data.frame: 2 × 2 (Night_Mins)
Group.1  x
<fct>    <dbl[,6]>
no       23.2, 165.90, 200.25, 200.1332, 234.90, 395.0
yes      47.4, 171.25, 204.80, 205.2317, 239.85, 354.9
In [34]:
# OPTION 3: (Using "by" FUNCTION instead of "Aggregate" FUNCTION)
by(dataSet$Intl_Mins, dataSet[8], FUN = summary)
by(dataSet$Day_Mins, dataSet[8], FUN = summary)
by(dataSet$Night_Mins, dataSet[8], FUN = summary)
Churn: no
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    8.40   10.20   10.16   12.00   18.90 
------------------------------------------------------------ 
Churn: yes
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    2.0     8.8    10.6    10.7    12.8    20.0 
Churn: no
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0   142.8   177.2   175.2   210.3   315.6 
------------------------------------------------------------ 
Churn: yes
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0   153.2   217.6   206.9   265.9   350.8 
Churn: no
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   23.2   165.9   200.2   200.1   234.9   395.0 
------------------------------------------------------------ 
Churn: yes
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   47.4   171.2   204.8   205.2   239.8   354.9 

Find out correlation

In [35]:
# Correlations/covariances among numeric variables 
library(Hmisc)
cor(dataSet[c(2,5,11,13,16,18)], use="complete.obs", method="kendall") 
cov(dataSet[c(2,5,11,13,16,18)], use="complete.obs")
A matrix: 6 × 6 of type dbl (Kendall correlations)
               Vmail_Message     Night_Mins      Day_Calls      Eve_Calls   Night_Charge    Intl_Charge
Vmail_Message    1.000000000    0.003718463   -0.009573189  -5.382921e-03    0.003710434  -1.263503e-03
Night_Mins       0.003718463    1.000000000    0.012550159   3.291091e-03    0.999625309  -7.103399e-03
Day_Calls       -0.009573189    0.012550159    1.000000000   9.253492e-03    0.012531632   1.038631e-02
Eve_Calls       -0.005382921    0.003291091    0.009253492   1.000000e+00    0.003310838  -9.536135e-05
Night_Charge     0.003710434    0.999625309    0.012531632   3.310838e-03    1.000000000  -7.097366e-03
Intl_Charge     -0.001263503   -0.007103399    0.010386309  -9.536135e-05   -0.007097366   1.000000e+00
A matrix: 6 × 6 of type dbl (covariances)
               Vmail_Message     Night_Mins      Day_Calls      Eve_Calls   Night_Charge    Intl_Charge
Vmail_Message   187.37134656      5.3174453     -2.6229779    -1.59925653     0.23873433     0.02975334
Night_Mins        5.31744529   2557.7140018     23.2812431    -2.10859729   115.09955435    -0.57867377
Day_Calls        -2.62297790     23.2812431    402.7681409     2.58373944     1.04716693     0.32775442
Eve_Calls        -1.59925653     -2.1085973      2.5837394   396.91099860    -0.09322113     0.13025644
Night_Charge      0.23873433    115.0995543      1.0471669    -0.09322113     5.17959717    -0.02605168
Intl_Charge       0.02975334     -0.5786738      0.3277544     0.13025644    -0.02605168     0.56817315
In [36]:
# Correlations with significance levels
rcorr(as.matrix(dataSet[c(2,5,11,13,16,18)]), type="pearson")
              Vmail_Message Night_Mins Day_Calls Eve_Calls Night_Charge
Vmail_Message          1.00       0.01     -0.01     -0.01         0.01
Night_Mins             0.01       1.00      0.02      0.00         1.00
Day_Calls             -0.01       0.02      1.00      0.01         0.02
Eve_Calls             -0.01       0.00      0.01      1.00         0.00
Night_Charge           0.01       1.00      0.02      0.00         1.00
Intl_Charge            0.00      -0.02      0.02      0.01        -0.02
              Intl_Charge
Vmail_Message        0.00
Night_Mins          -0.02
Day_Calls            0.02
Eve_Calls            0.01
Night_Charge        -0.02
Intl_Charge          1.00

n= 3333 


P
              Vmail_Message Night_Mins Day_Calls Eve_Calls Night_Charge
Vmail_Message               0.6576     0.5816    0.7350    0.6583      
Night_Mins    0.6576                   0.1855    0.9039    0.0000      
Day_Calls     0.5816        0.1855               0.7092    0.1857      
Eve_Calls     0.7350        0.9039     0.7092              0.9056      
Night_Charge  0.6583        0.0000     0.1857    0.9056                
Intl_Charge   0.8678        0.3810     0.2111    0.6167    0.3808      
              Intl_Charge
Vmail_Message 0.8678     
Night_Mins    0.3810     
Day_Calls     0.2111     
Eve_Calls     0.6167     
Night_Charge  0.3808     
Intl_Charge              

5. Visualising DataSet

In [37]:
# Pie Chart from data 
mytable <- table(dataSet$Churn)
lbls <- paste(names(mytable), "\n", mytable, sep="")
pie(mytable, labels = lbls, col=rainbow(length(lbls)), 
    main="Pie Chart of Classes\n (with sample sizes)")
In [38]:
# Barplot of categorical data
par(mfrow=c(1,1))
barplot(table(dataSet$Churn), ylab = "Count", 
        col=c("darkblue","red"))
barplot(prop.table(table(dataSet$Churn)), ylab = "Proportion", 
        col=c("darkblue","red"))
barplot(table(dataSet$Churn), xlab = "Count", horiz = TRUE, 
        col=c("darkblue","red"))
barplot(prop.table(table(dataSet$Churn)), xlab = "Proportion", horiz = TRUE, 
        col=c("darkblue","red"))
In [39]:
# Scatterplot matrices from the gclus package
library(gclus)
dta <- dataSet[c(2,5,11,13,16,18)] # get data 
dta.r <- abs(cor(dta)) # get correlations
dta.col <- dmat.color(dta.r) # get colors
# reorder variables so those with highest correlation are closest to the diagonal
dta.o <- order.single(dta.r) 
cpairs(dta, dta.o, panel.colors=dta.col, gap=.5, 
       main="Variables Ordered and Colored by Correlation" )
Loading required package: cluster

Visualise correlations

In [40]:
corrgram(dataSet[c(2,5,11,13,16,18)], order=TRUE, lower.panel=panel.shade,
         upper.panel=panel.pie, text.panel=panel.txt, main=" ")
In [41]:
# More graphs on correlations among the data
# Using "Hmisc"
res2 <- rcorr(as.matrix(dataSet[,c(2,5,11,13,16,18)]))
# Extract the correlation coefficients
res2$r
# Extract p-values
res2$P
A matrix: 6 × 6 of type dbl (correlation coefficients, res2$r)
               Vmail_Message     Night_Mins      Day_Calls      Eve_Calls   Night_Charge    Intl_Charge
Vmail_Message    1.000000000    0.007681136   -0.009548068   -0.005864351    0.007663290    0.002883658
Night_Mins       0.007681136    1.000000000    0.022937845   -0.002092768    0.999999215   -0.015179849
Day_Calls       -0.009548068    0.022937845    1.000000000    0.006462114    0.022926638    0.021666095
Eve_Calls       -0.005864351   -0.002092768    0.006462114    1.000000000   -0.002055984    0.008673858
Night_Charge     0.007663290    0.999999215    0.022926638   -0.002055984    1.000000000   -0.015186139
Intl_Charge      0.002883658   -0.015179849    0.021666095    0.008673858   -0.015186139    1.000000000
A matrix: 6 × 6 of type dbl (p-values, res2$P)
               Vmail_Message     Night_Mins      Day_Calls      Eve_Calls   Night_Charge    Intl_Charge
Vmail_Message             NA      0.6575570      0.5816089      0.7350335      0.6583020      0.8678283
Night_Mins         0.6575570             NA      0.1855268      0.9038694      0.0000000      0.3809828
Day_Calls          0.5816089      0.1855268             NA      0.7091964      0.1857418      0.2111142
Eve_Calls          0.7350335      0.9038694      0.7091964             NA      0.9055511      0.6166654
Night_Charge       0.6583020      0.0000000      0.1857418      0.9055511             NA      0.3807855
Intl_Charge        0.8678283      0.3809828      0.2111142      0.6166654      0.3807855             NA
In [42]:
# Using "corrplot"
library(corrplot)
library(RColorBrewer)
corrplot(res2$r, type = "upper", order = "hclust", col=brewer.pal(n=8, name="RdYlBu"),
         tl.col = "black", tl.srt = 45)
corrplot(res2$r, type = "lower", order = "hclust", col=brewer.pal(n=8, name="RdYlBu"),
         tl.col = "black", tl.srt = 45)
corrplot 0.84 loaded

In [43]:
# Using PerformanceAnalytics
library(PerformanceAnalytics)
data <- dataSet[, c(2,5,11,13,16,18)]
chart.Correlation(data, histogram=TRUE, pch=19)
In [44]:
# Using a colored heatmap
col <- colorRampPalette(c("blue", "white", "red"))(20)
heatmap(x = res2$r, col = col, symm = TRUE)

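Note that Night_Mins and Night_Charge have an almost perfect positive linear relationship (Pearson r ≈ 0.999999 in the res2$r output above): the charge is essentially a fixed multiple of the minutes, so the two columns carry the same information. Redundant predictors like these count twice in the distance calculation of kNN and introduce multicollinearity into linear models.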
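Such redundant pairs can also be listed programmatically instead of being read off a plot. The sketch below works on made-up data (the data frame toy, the rate 0.045 and the cut-off 0.95 are all assumptions for illustration). The caret package offers findCorrelation() for the same purpose, which is not used here so that the example stays in base R.

```r
# Toy data (made up): 'charge' is an exact multiple of 'mins', 'calls' is unrelated
set.seed(42)  # arbitrary seed
mins <- runif(100, 0, 300)
toy  <- data.frame(mins = mins, charge = 0.045 * mins, calls = rpois(100, 100))

r <- cor(toy)

# Keep each pair once (upper triangle) and flag |r| above a chosen cut-off
high <- which(abs(r) > 0.95 & upper.tri(r), arr.ind = TRUE)
redundant_pairs <- data.frame(var1 = rownames(r)[high[, "row"]],
                              var2 = colnames(r)[high[, "col"]],
                              r    = r[high])
print(redundant_pairs)
```

One column of each flagged pair is a candidate for removal before fitting distance-based or linear models.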

6. Pre-Processing of DataSet i.e. train (75%) : test (25%) split

In [45]:
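set.seed(123) # arbitrary seed (not part of the original run), added so that the split is reproducible
train_test_index <- createDataPartition(dataSet$Churn, p=0.75, list=FALSE)
# Keep columns 1 to 20, which drops the Phone identifier (column 21)
training_dataset <- dataSet[, c(1:20)][train_test_index,]
testing_dataset  <- dataSet[, c(1:20)][-train_test_index,]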
In [46]:
dim(training_dataset)
dim(testing_dataset)
  1. 2501
  2. 20
  1. 832
  2. 20
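createDataPartition samples within each class of the outcome, so the churn rate is roughly the same in the training and the test set. The idea can be sketched in base R as follows; the class sizes, the seed and the per-class rounding with ceiling() are assumptions for this illustration and are not taken from caret's source code.

```r
# Stratified split in base R (a sketch; that caret works exactly this way is an assumption)
set.seed(123)  # arbitrary seed so the split is reproducible
labels <- factor(rep(c("no", "yes"), times = c(85, 15)))  # made-up, imbalanced classes
p <- 0.75

# Sample a fraction p of the row indices inside each class separately
train_idx <- unlist(lapply(split(seq_along(labels), labels), function(idx) {
  idx[sample.int(length(idx), size = ceiling(p * length(idx)))]
}))

train_labels <- labels[train_idx]
test_labels  <- labels[-train_idx]

print(prop.table(table(labels)))
print(prop.table(table(train_labels)))
print(prop.table(table(test_labels)))
```

A plain random split could, by chance, put very few churners into the test set; sampling per class avoids that.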

7. Cross Validation and control parameter setup

In [47]:
control <- trainControl(method="repeatedcv", # alternative: "adaptive_cv"
                        number=2, repeats = 2, 
                        verboseIter = TRUE, search = "grid", # verboseIter is the full argument name
                        allowParallel = TRUE)
metric <- "Accuracy"
tuneLength <- 2
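With method "repeatedcv", number = 2 and repeats = 2, every candidate model is fitted on 2 × 2 = 4 resamples. The sketch below shows what that means in terms of row indices. The helper make_repeated_folds is hypothetical (it is not a caret function) and, unlike caret's own fold creation, it does not stratify by class.

```r
# What "repeatedcv" with number = 2 and repeats = 2 means, sketched in base R.
# The helper name make_repeated_folds is hypothetical, not a caret function.
make_repeated_folds <- function(n, number = 2, repeats = 2) {
  folds <- list()
  for (r in seq_len(repeats)) {
    # Shuffle the fold labels so that rows land in 'number' folds of near-equal size
    fold_id <- sample(rep(seq_len(number), length.out = n))
    for (k in seq_len(number)) {
      # Rows held out for evaluation in fold k of repeat r
      folds[[sprintf("Fold%d.Rep%d", k, r)]] <- which(fold_id == k)
    }
  }
  folds
}

set.seed(7)  # arbitrary seed
held_out <- make_repeated_folds(n = 10, number = 2, repeats = 2)
str(held_out)
```

Within one repeat the held-out sets are disjoint and together cover all rows; each repeat reshuffles the rows, which reduces the dependence of the accuracy estimate on one particular split. Two folds and two repeats keep the run time short; larger values such as 10 folds are common when time allows.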

8. Algorithm : LDA & LDA2

In [49]:
names(getModelInfo())
  1. 'ada'
  2. 'AdaBag'
  3. 'AdaBoost.M1'
  4. 'adaboost'
  5. 'amdai'
  6. 'ANFIS'
  7. 'avNNet'
  8. 'awnb'
  9. 'awtan'
  10. 'bag'
  11. 'bagEarth'
  12. 'bagEarthGCV'
  13. 'bagFDA'
  14. 'bagFDAGCV'
  15. 'bam'
  16. 'bartMachine'
  17. 'bayesglm'
  18. 'binda'
  19. 'blackboost'
  20. 'blasso'
  21. 'blassoAveraged'
  22. 'bridge'
  23. 'brnn'
  24. 'BstLm'
  25. 'bstSm'
  26. 'bstTree'
  27. 'C5.0'
  28. 'C5.0Cost'
  29. 'C5.0Rules'
  30. 'C5.0Tree'
  31. 'cforest'
  32. 'chaid'
  33. 'CSimca'
  34. 'ctree'
  35. 'ctree2'
  36. 'cubist'
  37. 'dda'
  38. 'deepboost'
  39. 'DENFIS'
  40. 'dnn'
  41. 'dwdLinear'
  42. 'dwdPoly'
  43. 'dwdRadial'
  44. 'earth'
  45. 'elm'
  46. 'enet'
  47. 'evtree'
  48. 'extraTrees'
  49. 'fda'
  50. 'FH.GBML'
  51. 'FIR.DM'
  52. 'foba'
  53. 'FRBCS.CHI'
  54. 'FRBCS.W'
  55. 'FS.HGD'
  56. 'gam'
  57. 'gamboost'
  58. 'gamLoess'
  59. 'gamSpline'
  60. 'gaussprLinear'
  61. 'gaussprPoly'
  62. 'gaussprRadial'
  63. 'gbm_h2o'
  64. 'gbm'
  65. 'gcvEarth'
  66. 'GFS.FR.MOGUL'
  67. 'GFS.LT.RS'
  68. 'GFS.THRIFT'
  69. 'glm.nb'
  70. 'glm'
  71. 'glmboost'
  72. 'glmnet_h2o'
  73. 'glmnet'
  74. 'glmStepAIC'
  75. 'gpls'
  76. 'hda'
  77. 'hdda'
  78. 'hdrda'
  79. 'HYFIS'
  80. 'icr'
  81. 'J48'
  82. 'JRip'
  83. 'kernelpls'
  84. 'kknn'
  85. 'knn'
  86. 'krlsPoly'
  87. 'krlsRadial'
  88. 'lars'
  89. 'lars2'
  90. 'lasso'
  91. 'lda'
  92. 'lda2'
  93. 'leapBackward'
  94. 'leapForward'
  95. 'leapSeq'
  96. 'Linda'
  97. 'lm'
  98. 'lmStepAIC'
  99. 'LMT'
  100. 'loclda'
  101. 'logicBag'
  102. 'LogitBoost'
  103. 'logreg'
  104. 'lssvmLinear'
  105. 'lssvmPoly'
  106. 'lssvmRadial'
  107. 'lvq'
  108. 'M5'
  109. 'M5Rules'
  110. 'manb'
  111. 'mda'
  112. 'Mlda'
  113. 'mlp'
  114. 'mlpKerasDecay'
  115. 'mlpKerasDecayCost'
  116. 'mlpKerasDropout'
  117. 'mlpKerasDropoutCost'
  118. 'mlpML'
  119. 'mlpSGD'
  120. 'mlpWeightDecay'
  121. 'mlpWeightDecayML'
  122. 'monmlp'
  123. 'msaenet'
  124. 'multinom'
  125. 'mxnet'
  126. 'mxnetAdam'
  127. 'naive_bayes'
  128. 'nb'
  129. 'nbDiscrete'
  130. 'nbSearch'
  131. 'neuralnet'
  132. 'nnet'
  133. 'nnls'
  134. 'nodeHarvest'
  135. 'null'
  136. 'OneR'
  137. 'ordinalNet'
  138. 'ordinalRF'
  139. 'ORFlog'
  140. 'ORFpls'
  141. 'ORFridge'
  142. 'ORFsvm'
  143. 'ownn'
  144. 'pam'
  145. 'parRF'
  146. 'PART'
  147. 'partDSA'
  148. 'pcaNNet'
  149. 'pcr'
  150. 'pda'
  151. 'pda2'
  152. 'penalized'
  153. 'PenalizedLDA'
  154. 'plr'
  155. 'pls'
  156. 'plsRglm'
  157. 'polr'
  158. 'ppr'
  159. 'PRIM'
  160. 'protoclass'
  161. 'qda'
  162. 'QdaCov'
  163. 'qrf'
  164. 'qrnn'
  165. 'randomGLM'
  166. 'ranger'
  167. 'rbf'
  168. 'rbfDDA'
  169. 'Rborist'
  170. 'rda'
  171. 'regLogistic'
  172. 'relaxo'
  173. 'rf'
  174. 'rFerns'
  175. 'RFlda'
  176. 'rfRules'
  177. 'ridge'
  178. 'rlda'
  179. 'rlm'
  180. 'rmda'
  181. 'rocc'
  182. 'rotationForest'
  183. 'rotationForestCp'
  184. 'rpart'
  185. 'rpart1SE'
  186. 'rpart2'
  187. 'rpartCost'
  188. 'rpartScore'
  189. 'rqlasso'
  190. 'rqnc'
  191. 'RRF'
  192. 'RRFglobal'
  193. 'rrlda'
  194. 'RSimca'
  195. 'rvmLinear'
  196. 'rvmPoly'
  197. 'rvmRadial'
  198. 'SBC'
  199. 'sda'
  200. 'sdwd'
  201. 'simpls'
  202. 'SLAVE'
  203. 'slda'
  204. 'smda'
  205. 'snn'
  206. 'sparseLDA'
  207. 'spikeslab'
  208. 'spls'
  209. 'stepLDA'
  210. 'stepQDA'
  211. 'superpc'
  212. 'svmBoundrangeString'
  213. 'svmExpoString'
  214. 'svmLinear'
  215. 'svmLinear2'
  216. 'svmLinear3'
  217. 'svmLinearWeights'
  218. 'svmLinearWeights2'
  219. 'svmPoly'
  220. 'svmRadial'
  221. 'svmRadialCost'
  222. 'svmRadialSigma'
  223. 'svmRadialWeights'
  224. 'svmSpectrumString'
  225. 'tan'
  226. 'tanSearch'
  227. 'treebag'
  228. 'vbmpRadial'
  229. 'vglmAdjCat'
  230. 'vglmContRatio'
  231. 'vglmCumulative'
  232. 'widekernelpls'
  233. 'WM'
  234. 'wsrf'
  235. 'xgbDART'
  236. 'xgbLinear'
  237. 'xgbTree'
  238. 'xyf'
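
getModelInfo() treats its first argument as a regular expression by default (regex = TRUE), so a query such as "pls" returns every model whose code contains that string, not only the model named pls. The base R sketch below reproduces the matching on a handful of the codes listed above; the vector model_names is a hand-picked subset used for illustration.

```r
# A handful of the model codes listed above (not the full list).
model_names <- c("gpls", "kernelpls", "kknn", "knn", "lda", "ORFpls", "pda",
                 "pda2", "pls", "plsRglm", "simpls", "spls", "widekernelpls")

# Regular-expression matching: every name containing the pattern is returned.
grep("pls", model_names, value = TRUE)

# Exact matching: only the model whose code equals the query.
model_names[model_names == "pls"]

# With caret loaded, the exact-match lookup would be (not run here):
# caret::getModelInfo("pls", regex = FALSE)
```

The same rule explains why a query for "pda" also returns pda2.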
In [48]:
getModelInfo("pls"); getModelInfo("kknn"); getModelInfo("pda");
$gpls
$label
'Generalized Partial Least Squares'
$library
'gpls'
$loop
NULL
$type
'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
K.prov     numeric  #Components
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(K.prov = seq(1, len))
    }
    else {
        out <- data.frame(K.prov = unique(sample(1:ncol(x), size = len, 
            replace = TRUE)))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
gpls::gpls(x, y, K.prov = param$K.prov, ...)
$predict
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata)$class
$prob
function (modelFit, newdata, submodels = NULL) 
{
    out <- predict(modelFit, newdata)$predicted
    out <- cbind(out, 1 - out)
    colnames(out) <- modelFit$obsLevels
    out
}
$predictors
function (x, ...) 
{
    out <- if (hasTerms(x)) 
        predictors(x$terms)
    else colnames(x$data$x.order)
    out[!(out %in% "Intercept")]
}
$tags
  1. 'Logistic Regression'
  2. 'Partial Least Squares'
  3. 'Linear Classifier'
$sort
function (x) 
x[order(x[, 1]), ]
$levels
function (x) 
x$obsLevels
$kernelpls
$label
'Partial Least Squares'
$library
'pls'
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
ncomp      numeric  #Components
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(ncomp = seq(1, min(ncol(x) - 1, len), 
            by = 1))
    }
    else {
        out <- data.frame(ncomp = unique(sample(1:ncol(x), size = len, 
            replace = TRUE)))
    }
    out
}
$loop
function (grid) 
{
    grid <- grid[order(grid$ncomp, decreasing = TRUE), , drop = FALSE]
    loop <- grid[1, , drop = FALSE]
    submodels <- list(grid[-1, , drop = FALSE])
    list(loop = loop, submodels = submodels)
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    ncomp <- min(ncol(x), param$ncomp)
    out <- if (is.factor(y)) {
        caret::plsda(x, y, method = "kernelpls", ncomp = ncomp, 
            ...)
    }
    else {
        dat <- if (is.data.frame(x)) 
            x
        else as.data.frame(x, stringsAsFactors = TRUE)
        dat$.outcome <- y
        pls::plsr(.outcome ~ ., data = dat, method = "kernelpls", 
            ncomp = ncomp, ...)
    }
    out
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- if (modelFit$problemType == "Classification") {
        if (!is.matrix(newdata)) 
            newdata <- as.matrix(newdata)
        out <- predict(modelFit, newdata, type = "class")
    }
    else as.vector(pls:::predict.mvr(modelFit, newdata, ncomp = max(modelFit$ncomp)))
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels))
        if (modelFit$problemType == "Classification") {
            if (length(submodels$ncomp) > 1) {
                tmp <- as.list(predict(modelFit, newdata, ncomp = submodels$ncomp))
            }
            else tmp <- list(predict(modelFit, newdata, ncomp = submodels$ncomp))
        }
        else {
            tmp <- as.list(as.data.frame(apply(predict(modelFit, 
                newdata, ncomp = submodels$ncomp), 3, function(x) list(x)), 
                stringsAsFActors = FALSE))
        }
        out <- c(list(out), tmp)
    }
    out
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.matrix(newdata)) 
        newdata <- as.matrix(newdata)
    out <- predict(modelFit, newdata, type = "prob", ncomp = modelFit$tuneValue$ncomp)
    if (length(dim(out)) == 3) {
        if (dim(out)[1] > 1) {
            out <- out[, , 1]
        }
        else {
            out <- as.data.frame(t(out[, , 1]), stringsAsFactors = TRUE)
        }
    }
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels) + 
            1)
        tmp[[1]] <- out
        for (j in seq(along = submodels$ncomp)) {
            tmpProb <- predict(modelFit, newdata, type = "prob", 
                ncomp = submodels$ncomp[j])
            if (length(dim(tmpProb)) == 3) {
                if (dim(tmpProb)[1] > 1) {
                  tmpProb <- tmpProb[, , 1]
                }
                else {
                  tmpProb <- as.data.frame(t(tmpProb[, , 1]), 
                    stringsAsFactors = TRUE)
                }
            }
            tmp[[j + 1]] <- as.data.frame(tmpProb[, modelFit$obsLevels, 
                drop = FALSE], stringsAsFactors = TRUE)
        }
        out <- tmp
    }
    out
}
$varImp
function (object, estimate = NULL, ...) 
{
    library(pls)
    modelCoef <- coef(object, intercept = FALSE, comps = 1:object$ncomp)
    perf <- MSEP(object)$val
    nms <- dimnames(perf)
    if (length(nms$estimate) > 1) {
        pIndex <- if (is.null(estimate)) 
            1
        else which(nms$estimate == estimate)
        perf <- perf[pIndex, , , drop = FALSE]
    }
    numResp <- dim(modelCoef)[2]
    if (numResp <= 2) {
        modelCoef <- modelCoef[, 1, , drop = FALSE]
        perf <- perf[, 1, ]
        delta <- -diff(perf)
        delta <- delta/sum(delta)
        out <- data.frame(Overall = apply(abs(modelCoef), 1, 
            weighted.mean, w = delta))
    }
    else {
        perf <- -t(apply(perf[1, , ], 1, diff))
        perf <- t(apply(perf, 1, function(u) u/sum(u)))
        out <- matrix(NA, ncol = numResp, nrow = dim(modelCoef)[1])
        for (i in 1:numResp) {
            tmp <- abs(modelCoef[, i, , drop = FALSE])
            out[, i] <- apply(tmp, 1, weighted.mean, w = perf[i, 
                ])
        }
        colnames(out) <- dimnames(modelCoef)[[2]]
        rownames(out) <- dimnames(modelCoef)[[1]]
    }
    as.data.frame(out, stringsAsFactors = TRUE)
}
$predictors
function (x, ...) 
rownames(x$projection)
$levels
function (x) 
x$obsLevels
$tags
  1. 'Partial Least Squares'
  2. 'Feature Extraction'
  3. 'Kernel Method'
  4. 'Linear Classifier'
  5. 'Linear Regression'
$sort
function (x) 
x[order(x[, 1]), ]
$ORFpls
$label
'Oblique Random Forest'
$library
'obliqueRF'
$loop
NULL
$type
'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
mtry       numeric  #Randomly Selected Predictors
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(mtry = caret::var_seq(p = ncol(x), 
            classification = is.factor(y), len = len))
    }
    else {
        out <- data.frame(mtry = unique(sample(1:ncol(x), size = len, 
            replace = TRUE)))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    require(obliqueRF)
    obliqueRF::obliqueRF(as.matrix(x), y, training_method = "pls", 
        ...)
}
$predict
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata)
$prob
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata, type = "prob")
$levels
function (x) 
x$obsLevels
$notes
'Unlike other packages used by `train`, the `obliqueRF` package is fully loaded when this model is used.'
$tags
  1. 'Random Forest'
  2. 'Oblique Tree'
  3. 'Partial Least Squares'
  4. 'Implicit Feature Selection'
  5. 'Ensemble Model'
  6. 'Two Class Only'
$sort
function (x) 
x[order(x[, 1]), ]
$pls
$label
'Partial Least Squares'
$library
'pls'
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
ncomp      numeric  #Components
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(ncomp = seq(1, min(ncol(x) - 1, len), 
            by = 1))
    }
    else {
        out <- data.frame(ncomp = unique(sample(1:ncol(x), replace = TRUE)))
    }
    out
}
$loop
function (grid) 
{
    grid <- grid[order(grid$ncomp, decreasing = TRUE), , drop = FALSE]
    loop <- grid[1, , drop = FALSE]
    submodels <- list(grid[-1, , drop = FALSE])
    list(loop = loop, submodels = submodels)
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    ncomp <- min(ncol(x), param$ncomp)
    out <- if (is.factor(y)) {
        plsda(x, y, method = "oscorespls", ncomp = ncomp, ...)
    }
    else {
        dat <- if (is.data.frame(x)) 
            x
        else as.data.frame(x, stringsAsFactors = TRUE)
        dat$.outcome <- y
        pls::plsr(.outcome ~ ., data = dat, method = "oscorespls", 
            ncomp = ncomp, ...)
    }
    out
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- if (modelFit$problemType == "Classification") {
        if (!is.matrix(newdata)) 
            newdata <- as.matrix(newdata)
        out <- predict(modelFit, newdata, type = "class")
    }
    else as.vector(pls:::predict.mvr(modelFit, newdata, ncomp = max(modelFit$ncomp)))
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels))
        if (modelFit$problemType == "Classification") {
            if (length(submodels$ncomp) > 1) {
                tmp <- as.list(predict(modelFit, newdata, ncomp = submodels$ncomp))
            }
            else tmp <- list(predict(modelFit, newdata, ncomp = submodels$ncomp))
        }
        else {
            tmp <- as.list(as.data.frame(apply(predict(modelFit, 
                newdata, ncomp = submodels$ncomp), 3, function(x) list(x))))
        }
        out <- c(list(out), tmp)
    }
    out
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.matrix(newdata)) 
        newdata <- as.matrix(newdata)
    out <- predict(modelFit, newdata, type = "prob", ncomp = modelFit$tuneValue$ncomp)
    if (length(dim(out)) == 3) {
        if (dim(out)[1] > 1) {
            out <- out[, , 1]
        }
        else {
            out <- as.data.frame(t(out[, , 1]), stringsAsFactors = TRUE)
        }
    }
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels) + 
            1)
        tmp[[1]] <- out
        for (j in seq(along = submodels$ncomp)) {
            tmpProb <- predict(modelFit, newdata, type = "prob", 
                ncomp = submodels$ncomp[j])
            if (length(dim(tmpProb)) == 3) {
                if (dim(tmpProb)[1] > 1) {
                  tmpProb <- tmpProb[, , 1]
                }
                else {
                  tmpProb <- as.data.frame(t(tmpProb[, , 1]), 
                    stringsAsFactors = TRUE)
                }
            }
            tmp[[j + 1]] <- as.data.frame(tmpProb[, modelFit$obsLevels])
        }
        out <- tmp
    }
    out
}
$varImp
function (object, estimate = NULL, ...) 
{
    library(pls)
    modelCoef <- coef(object, intercept = FALSE, comps = 1:object$ncomp)
    perf <- pls:::MSEP.mvr(object)$val
    nms <- dimnames(perf)
    if (length(nms$estimate) > 1) {
        pIndex <- if (is.null(estimate)) 
            1
        else which(nms$estimate == estimate)
        perf <- perf[pIndex, , , drop = FALSE]
    }
    numResp <- dim(modelCoef)[2]
    if (numResp <= 2) {
        modelCoef <- modelCoef[, 1, , drop = FALSE]
        perf <- perf[, 1, ]
        delta <- -diff(perf)
        delta <- delta/sum(delta)
        out <- data.frame(Overall = apply(abs(modelCoef), 1, 
            weighted.mean, w = delta))
    }
    else {
        if (dim(perf)[3] <= 2) {
            perf <- -t(t(apply(perf[1, , ], 1, diff)))
            perf <- t(t(apply(perf, 1, function(u) u/sum(u))))
        }
        else {
            perf <- -t(apply(perf[1, , ], 1, diff))
            perf <- t(apply(perf, 1, function(u) u/sum(u)))
        }
        out <- matrix(NA, ncol = numResp, nrow = dim(modelCoef)[1])
        for (i in 1:numResp) {
            tmp <- abs(modelCoef[, i, , drop = FALSE])
            out[, i] <- apply(tmp, 1, weighted.mean, w = perf[i, 
                ])
        }
        colnames(out) <- dimnames(modelCoef)[[2]]
        rownames(out) <- dimnames(modelCoef)[[1]]
    }
    as.data.frame(out, stringsAsFactors = TRUE)
}
$predictors
function (x, ...) 
rownames(x$projection)
$levels
function (x) 
x$obsLevels
$tags
  1. 'Partial Least Squares'
  2. 'Feature Extraction'
  3. 'Linear Classifier'
  4. 'Linear Regression'
$sort
function (x) 
x[order(x[, 1]), ]
$plsRglm
$label
'Partial Least Squares Generalized Linear Models '
$library
'plsRglm'
$loop
NULL
$type
  1. 'Classification'
  2. 'Regression'
$parameters
A data.frame: 2 × 3
parameter          class    label
<chr>              <chr>    <chr>
nt                 numeric  #PLS Components
alpha.pvals.expli  numeric  p-Value threshold
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- expand.grid(nt = 1:len, alpha.pvals.expli = 10^(c(-2:(len - 
            3), 0)))
    }
    else {
        out <- data.frame(nt = sample(1:ncol(x), size = len, 
            replace = TRUE), alpha.pvals.expli = runif(len, min = 0, 
            0.2))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    require(plsRglm)
    if (is.factor(y)) {
        lv <- levels(y)
        y <- as.numeric(y) - 1
        dst <- "pls-glm-logistic"
    }
    else {
        lv <- NULL
        dst <- "pls-glm-gaussian"
    }
    theDots <- list(...)
    if (any(names(theDots) == "modele")) {
        mod <- plsrRglm::plsRglm(y, x, nt = param$nt, pvals.expli = param$alpha.pvals.expli < 
            1, sparse = param$alpha.pvals.expli < 1, alpha.pvals.expli = param$alpha.pvals.expli, 
            ...)
    }
    else {
        mod <- plsRglm::plsRglm(y, x, nt = param$nt, modele = dst, 
            pvals.expli = param$alpha.pvals.expli < 1, sparse = param$alpha.pvals.expli < 
                1, alpha.pvals.expli = param$alpha.pvals.expli, 
            ...)
    }
    mod
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- predict(modelFit, newdata, type = "response")
    if (modelFit$problemType == "Classification") {
        out <- factor(ifelse(out >= 0.5, modelFit$obsLevels[2], 
            modelFit$obsLevels[1]))
    }
    out
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    out <- predict(modelFit, newdata, type = "response")
    out <- cbind(1 - out, out)
    dimnames(out)[[2]] <- rev(modelFit$obsLevels)
    out
}
$varImp
NULL
$predictors
function (x, ...) 
{
    vars <- names(which(coef(x)[[2]][, 1] != 0))
    vars[vars != "Intercept"]
}
$notes
'Unlike other packages used by `train`, the `plsRglm` package is fully loaded when this model is used.'
$tags
  1. 'Generalized Linear Models'
  2. 'Partial Least Squares'
  3. 'Two Class Only'
$levels
function (x) 
x$lev
$sort
function (x) 
x[order(-x$alpha.pvals.expli, x$nt), ]
$simpls
$label
'Partial Least Squares'
$library
'pls'
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
ncomp      numeric  #Components
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(ncomp = seq(1, min(ncol(x) - 1, len), 
            by = 1))
    }
    else {
        out <- data.frame(ncomp = unique(sample(1:(ncol(x) - 
            1), size = len, replace = TRUE)))
    }
    out
}
$loop
function (grid) 
{
    grid <- grid[order(grid$ncomp, decreasing = TRUE), , drop = FALSE]
    loop <- grid[1, , drop = FALSE]
    submodels <- list(grid[-1, , drop = FALSE])
    list(loop = loop, submodels = submodels)
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    ncomp <- min(ncol(x), param$ncomp)
    out <- if (is.factor(y)) {
        plsda(x, y, method = "simpls", ncomp = ncomp, ...)
    }
    else {
        dat <- if (is.data.frame(x)) 
            x
        else as.data.frame(x, stringsAsFactors = TRUE)
        dat$.outcome <- y
        pls::plsr(.outcome ~ ., data = dat, method = "simpls", 
            ncomp = ncomp, ...)
    }
    out
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- if (modelFit$problemType == "Classification") {
        if (!is.matrix(newdata)) 
            newdata <- as.matrix(newdata)
        out <- predict(modelFit, newdata, type = "class")
    }
    else as.vector(pls:::predict.mvr(modelFit, newdata, ncomp = max(modelFit$ncomp)))
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels))
        if (modelFit$problemType == "Classification") {
            if (length(submodels$ncomp) > 1) {
                tmp <- as.list(predict(modelFit, newdata, ncomp = submodels$ncomp))
            }
            else tmp <- list(predict(modelFit, newdata, ncomp = submodels$ncomp))
        }
        else {
            tmp <- as.list(as.data.frame(apply(predict(modelFit, 
                newdata, ncomp = submodels$ncomp), 3, function(x) list(x))))
        }
        out <- c(list(out), tmp)
    }
    out
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.matrix(newdata)) 
        newdata <- as.matrix(newdata)
    out <- predict(modelFit, newdata, type = "prob", ncomp = modelFit$tuneValue$ncomp)
    if (length(dim(out)) == 3) {
        if (dim(out)[1] > 1) {
            out <- out[, , 1]
        }
        else {
            out <- as.data.frame(t(out[, , 1]), stringsAsFactors = TRUE)
        }
    }
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels) + 
            1)
        tmp[[1]] <- out
        for (j in seq(along = submodels$ncomp)) {
            tmpProb <- predict(modelFit, newdata, type = "prob", 
                ncomp = submodels$ncomp[j])
            if (length(dim(tmpProb)) == 3) {
                if (dim(tmpProb)[1] > 1) {
                  tmpProb <- tmpProb[, , 1]
                }
                else {
                  tmpProb <- as.data.frame(t(tmpProb[, , 1]), 
                    stringsAsFactors = TRUE)
                }
            }
            tmp[[j + 1]] <- as.data.frame(tmpProb[, modelFit$obsLevels, 
                drop = FALSE], stringsAsFactors = TRUE)
        }
        out <- tmp
    }
    out
}
$varImp
function (object, estimate = NULL, ...) 
{
    library(pls)
    modelCoef <- coef(object, intercept = FALSE, comps = 1:object$ncomp)
    perf <- pls:::MSEP.mvr(object)$val
    nms <- dimnames(perf)
    if (length(nms$estimate) > 1) {
        pIndex <- if (is.null(estimate)) 
            1
        else which(nms$estimate == estimate)
        perf <- perf[pIndex, , , drop = FALSE]
    }
    numResp <- dim(modelCoef)[2]
    if (numResp <= 2) {
        modelCoef <- modelCoef[, 1, , drop = FALSE]
        perf <- perf[, 1, ]
        delta <- -diff(perf)
        delta <- delta/sum(delta)
        out <- data.frame(Overall = apply(abs(modelCoef), 1, 
            weighted.mean, w = delta))
    }
    else {
        perf <- -t(apply(perf[1, , ], 1, diff))
        perf <- t(apply(perf, 1, function(u) u/sum(u)))
        out <- matrix(NA, ncol = numResp, nrow = dim(modelCoef)[1])
        for (i in 1:numResp) {
            tmp <- abs(modelCoef[, i, , drop = FALSE])
            out[, i] <- apply(tmp, 1, weighted.mean, w = perf[i, 
                ])
        }
        colnames(out) <- dimnames(modelCoef)[[2]]
        rownames(out) <- dimnames(modelCoef)[[1]]
    }
    as.data.frame(out, stringsAsFactors = TRUE)
}
$levels
function (x) 
x$obsLevels
$predictors
function (x, ...) 
rownames(x$projection)
$tags
  1. 'Partial Least Squares'
  2. 'Feature Extraction'
  3. 'Linear Classifier'
  4. 'Linear Regression'
$sort
function (x) 
x[order(x[, 1]), ]
$spls
$label
'Sparse Partial Least Squares'
$library
'spls'
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 3 × 3
parameter  class    label
<chr>      <chr>    <chr>
K          numeric  #Components
eta        numeric  Threshold
kappa      numeric  Kappa
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- expand.grid(K = 1:min(nrow(x), ncol(x)), eta = seq(0.1, 
            0.9, length = len), kappa = 0.5)
    }
    else {
        out <- data.frame(kappa = runif(len, min = 0, max = 0.5), 
            eta = runif(len, min = 0, max = 1), K = sample(1:min(nrow(x), 
                ncol(x)), size = len, replace = TRUE))
    }
    out
}
$loop
NULL
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    param$K <- min(param$K, length(y))
    if (is.factor(y)) {
        caret:::splsda(x, y, K = param$K, eta = param$eta, kappa = param$kappa, 
            ...)
    }
    else {
        spls::spls(x, y, K = param$K, eta = param$eta, kappa = param$kappa, 
            ...)
    }
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    if (length(modelFit$obsLevels) < 2) {
        spls::predict.spls(modelFit, newdata)
    }
    else {
        as.character(caret:::predict.splsda(modelFit, newdata, 
            type = "class"))
    }
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.matrix(newdata)) 
        newdata <- as.matrix(newdata)
    caret:::predict.splsda(modelFit, newdata, type = "prob")
}
$predictors
function (x, ...) 
colnames(x$x)[x$A]
$tags
  1. 'Partial Least Squares'
  2. 'Feature Extraction'
  3. 'Linear Classifier'
  4. 'Linear Regression'
  5. 'L1 Regularization'
$levels
function (x) 
x$obsLevels
$sort
function (x) 
x[order(-x$eta, x$K), ]
$widekernelpls
$label
'Partial Least Squares'
$library
'pls'
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
ncomp      numeric  #Components
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(ncomp = seq(1, min(ncol(x) - 1, len), 
            by = 1))
    }
    else {
        out <- data.frame(ncomp = unique(sample(1:(ncol(x) - 
            1), size = len, replace = TRUE)))
    }
    out
}
$loop
function (grid) 
{
    grid <- grid[order(grid$ncomp, decreasing = TRUE), , drop = FALSE]
    loop <- grid[1, , drop = FALSE]
    submodels <- list(grid[-1, , drop = FALSE])
    list(loop = loop, submodels = submodels)
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    ncomp <- min(ncol(x), param$ncomp)
    out <- if (is.factor(y)) {
        caret::plsda(x, y, method = "widekernelpls", ncomp = ncomp, 
            ...)
    }
    else {
        dat <- if (is.data.frame(x)) 
            x
        else as.data.frame(x, stringsAsFactors = TRUE)
        dat$.outcome <- y
        pls::plsr(.outcome ~ ., data = dat, method = "widekernelpls", 
            ncomp = ncomp, ...)
    }
    out
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    out <- if (modelFit$problemType == "Classification") {
        if (!is.matrix(newdata)) 
            newdata <- as.matrix(newdata)
        out <- predict(modelFit, newdata, type = "class")
    }
    else as.vector(pls:::predict.mvr(modelFit, newdata, ncomp = max(modelFit$ncomp)))
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels))
        if (modelFit$problemType == "Classification") {
            if (length(submodels$ncomp) > 1) {
                tmp <- as.list(predict(modelFit, newdata, ncomp = submodels$ncomp))
            }
            else tmp <- list(predict(modelFit, newdata, ncomp = submodels$ncomp))
        }
        else {
            tmp <- as.list(as.data.frame(apply(predict(modelFit, 
                newdata, ncomp = submodels$ncomp), 3, function(x) list(x))))
        }
        out <- c(list(out), tmp)
    }
    out
}
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.matrix(newdata)) 
        newdata <- as.matrix(newdata)
    out <- predict(modelFit, newdata, type = "prob", ncomp = modelFit$tuneValue$ncomp)
    if (length(dim(out)) == 3) {
        if (dim(out)[1] > 1) {
            out <- out[, , 1]
        }
        else {
            out <- as.data.frame(t(out[, , 1]), stringsAsFactors = TRUE)
        }
    }
    if (!is.null(submodels)) {
        tmp <- vector(mode = "list", length = nrow(submodels) + 
            1)
        tmp[[1]] <- out
        for (j in seq(along = submodels$ncomp)) {
            tmpProb <- predict(modelFit, newdata, type = "prob", 
                ncomp = submodels$ncomp[j])
            if (length(dim(tmpProb)) == 3) {
                if (dim(tmpProb)[1] > 1) {
                  tmpProb <- tmpProb[, , 1]
                }
                else {
                  tmpProb <- as.data.frame(t(tmpProb[, , 1]), 
                    stringsAsFactors = TRUE)
                }
            }
            tmp[[j + 1]] <- as.data.frame(tmpProb[, modelFit$obsLevels, 
                drop = FALSE], stringsAsFactors = TRUE)
        }
        out <- tmp
    }
    out
}
$predictors
function (x, ...) 
rownames(x$projection)
$varImp
function (object, estimate = NULL, ...) 
{
    library(pls)
    modelCoef <- coef(object, intercept = FALSE, comps = 1:object$ncomp)
    perf <- pls:::MSEP.mvr(object)$val
    nms <- dimnames(perf)
    if (length(nms$estimate) > 1) {
        pIndex <- if (is.null(estimate)) 
            1
        else which(nms$estimate == estimate)
        perf <- perf[pIndex, , , drop = FALSE]
    }
    numResp <- dim(modelCoef)[2]
    if (numResp <= 2) {
        modelCoef <- modelCoef[, 1, , drop = FALSE]
        perf <- perf[, 1, ]
        delta <- -diff(perf)
        delta <- delta/sum(delta)
        out <- data.frame(Overall = apply(abs(modelCoef), 1, 
            weighted.mean, w = delta))
    }
    else {
        perf <- -t(apply(perf[1, , ], 1, diff))
        perf <- t(apply(perf, 1, function(u) u/sum(u)))
        out <- matrix(NA, ncol = numResp, nrow = dim(modelCoef)[1])
        for (i in 1:numResp) {
            tmp <- abs(modelCoef[, i, , drop = FALSE])
            out[, i] <- apply(tmp, 1, weighted.mean, w = perf[i, 
                ])
        }
        colnames(out) <- dimnames(modelCoef)[[2]]
        rownames(out) <- dimnames(modelCoef)[[1]]
    }
    as.data.frame(out, stringsAsFactors = TRUE)
}
$levels
function (x) 
x$obsLevels
$tags
  1. 'Partial Least Squares'
  2. 'Feature Extraction'
  3. 'Linear Classifier'
  4. 'Linear Regression'
$sort
function (x) 
x[order(x[, 1]), ]
$kknn
$label
'k-Nearest Neighbors'
$library
'kknn'
$loop
NULL
$type
  1. 'Regression'
  2. 'Classification'
$parameters
A data.frame: 3 × 3
parameter  class      label
<chr>      <chr>      <chr>
kmax       numeric    Max. #Neighbors
distance   numeric    Distance
kernel     character  Kernel
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(kmax = (5:((2 * len) + 4))[(5:((2 * 
            len) + 4))%%2 > 0], distance = 2, kernel = "optimal")
    }
    else {
        by_val <- if (is.factor(y)) 
            length(levels(y))
        else 1
        kerns <- c("rectangular", "triangular", "epanechnikov", 
            "biweight", "triweight", "cos", "inv", "gaussian")
        out <- data.frame(kmax = sample(seq(1, floor(nrow(x)/3), 
            by = by_val), size = len, replace = TRUE), distance = runif(len, 
            min = 0, max = 3), kernel = sample(kerns, size = len, 
            replace = TRUE))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    dat <- if (is.data.frame(x)) 
        x
    else as.data.frame(x, stringsAsFactors = TRUE)
    dat$.outcome <- y
    kknn::train.kknn(.outcome ~ ., data = dat, kmax = param$kmax, 
        distance = param$distance, kernel = as.character(param$kernel), 
        ...)
}
$predict
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.data.frame(newdata)) 
        newdata <- as.data.frame(newdata, stringsAsFactors = TRUE)
    predict(modelFit, newdata)
}
$levels
function (x) 
x$obsLevels
$tags
'Prototype Models'
$prob
function (modelFit, newdata, submodels = NULL) 
{
    if (!is.data.frame(newdata)) 
        newdata <- as.data.frame(newdata, stringsAsFactors = TRUE)
    predict(modelFit, newdata, type = "prob")
}
$sort
function (x) 
x[order(-x[, 1]), ]
$pda
$label
'Penalized Discriminant Analysis'
$library
'mda'
$loop
NULL
$type
'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
lambda     numeric  Shrinkage Penalty Coefficient
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(lambda = c(0, 10^seq(-1, -4, length = len - 
            1)))
    }
    else {
        out <- data.frame(lambda = 10^runif(len, min = -5, 1))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    dat <- if (is.data.frame(x)) 
        x
    else as.data.frame(x, stringsAsFactors = TRUE)
    dat$.outcome <- y
    if (!is.null(wts)) {
        out <- mda::fda(as.formula(".outcome ~ ."), data = dat, 
            method = mda::gen.ridge, weights = wts, lambda = param$lambda, 
            ...)
    }
    else {
        out <- mda::fda(as.formula(".outcome ~ ."), data = dat, 
            method = mda::gen.ridge, lambda = param$lambda, ...)
    }
    out
}
$predict
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata)
$prob
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata, type = "posterior")
$levels
function (x) 
x$obsLevels
$tags
  1. 'Discriminant Analysis'
  2. 'Polynomial Model'
  3. 'Accepts Case Weights'
$sort
function (x) 
x[order(x[, 1]), ]
$pda2
$label
'Penalized Discriminant Analysis'
$library
'mda'
$loop
NULL
$type
'Classification'
$parameters
A data.frame: 1 × 3
parameter  class    label
<chr>      <chr>    <chr>
df         numeric  Degrees of Freedom
$grid
function (x, y, len = NULL, search = "grid") 
{
    if (search == "grid") {
        out <- data.frame(df = 2 * (0:(len - 1) + 1))
    }
    else {
        out <- data.frame(df = runif(len, min = 1, max = 5))
    }
    out
}
$fit
function (x, y, wts, param, lev, last, classProbs, ...) 
{
    dat <- if (is.data.frame(x)) 
        x
    else as.data.frame(x, stringsAsFactors = TRUE)
    dat$.outcome <- y
    if (!is.null(wts)) {
        out <- mda::fda(as.formula(".outcome ~ ."), data = dat, 
            method = mda::gen.ridge, weights = wts, df = param$df, 
            ...)
    }
    else {
        out <- mda::fda(as.formula(".outcome ~ ."), data = dat, 
            method = mda::gen.ridge, df = param$df, ...)
    }
    out
}
$levels
function (x) 
x$obsLevels
$predict
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata)
$prob
function (modelFit, newdata, submodels = NULL) 
predict(modelFit, newdata, type = "posterior")
$tags
  1. 'Discriminant Analysis'
  2. 'Polynomial Model'
  3. 'Accepts Case Weights'
$sort
function (x) 
x[order(x[, 1]), ]
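
To make the $grid entries in the listing above concrete, the following base-R sketch re-creates the default grid logic for pda (lambda) and pda2 (df). The helper names pda_grid and pda2_grid are made up for illustration; only the expressions inside them are taken from the listing.

```r
# Hypothetical helpers wrapping the grid expressions shown in the listing above.
pda_grid <- function(len) {
  # lambda = 0 plus (len - 1) values spaced evenly on a log10 scale from 1e-1 down to 1e-4
  data.frame(lambda = c(0, 10^seq(-1, -4, length.out = len - 1)))
}

pda2_grid <- function(len) {
  # even degrees of freedom: 2, 4, ..., 2 * len
  data.frame(df = 2 * (0:(len - 1) + 1))
}

print(pda_grid(3))   # three lambda candidates
print(pda2_grid(3))  # three df candidates
```

With len = 3 the lambda candidates are 0, 0.1 and 0.0001, which are the three values that appear in the default PDA run further down.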

1) Training - without explicit parameter tuning / using default

In [51]:
# PLS
fit.pls <- caret::train(Churn~., data=training_dataset, method="pls", 
                        metric=metric, 
                        trControl=control,
                        verbose = TRUE
)
print(fit.pls)
Aggregating results
Selecting tuning parameters
Fitting ncomp = 1 on full training set
Partial Least Squares 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

No pre-processing
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1251, 1250 
Resampling results across tuning parameters:

  ncomp  Accuracy   Kappa
  1      0.8548582  0    
  2      0.8548582  0    
  3      0.8548582  0    

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was ncomp = 1.
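
A Kappa of exactly 0 next to an Accuracy of about 0.855 is what a model produces when it predicts the majority class for every sample. The sketch below computes Cohen's kappa from a confusion table in base R. The counts (2138 'no', 363 'yes') are an assumption chosen so that the majority share matches the reported accuracy; they are not figures printed by the run.

```r
# Cohen's kappa from a square confusion table (rows = predicted, columns = actual)
cohen_kappa <- function(tab) {
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                        # observed agreement (accuracy)
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2    # agreement expected by chance
  (po - pe) / (1 - pe)
}

# Assumed counts: every sample is predicted 'no'
tab <- matrix(c(2138, 0, 363, 0), nrow = 2,
              dimnames = list(predicted = c("no", "yes"),
                              actual    = c("no", "yes")))

accuracy <- sum(diag(tab)) / sum(tab)
kappa    <- cohen_kappa(tab)
print(round(c(accuracy = accuracy, kappa = kappa), 4))
```

Accuracy alone therefore says little here; Kappa shows that the unscaled PLS fit is no better than always answering 'no'.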
In [53]:
# KKNN
fit.kknn <- caret::train(Churn~., data=training_dataset, method="kknn", 
                         metric=metric, 
                         trControl=control,
                         verbose = TRUE
)
print(fit.kknn)
Aggregating results
Selecting tuning parameters
Fitting kmax = 9, distance = 2, kernel = optimal on full training set
k-Nearest Neighbors 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

No pre-processing
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1250, 1251 
Resampling results across tuning parameters:

  kmax  Accuracy   Kappa    
  5     0.8762510  0.3732125
  7     0.8766512  0.3661114
  9     0.8774505  0.3507773

Tuning parameter 'distance' was held constant at a value of 2
Tuning
 parameter 'kernel' was held constant at a value of optimal
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were kmax = 9, distance = 2 and kernel
 = optimal.
In [55]:
# PDA
fit.pda <- caret::train(Churn~., data=training_dataset, method="pda", 
                        metric=metric, 
                        trControl=control,
                        verbose = TRUE
)
print(fit.pda)
Aggregating results
Selecting tuning parameters
Fitting lambda = 0.1 on full training set
Penalized Discriminant Analysis 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

No pre-processing
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1250, 1251 
Resampling results across tuning parameters:

  lambda  Accuracy   Kappa    
  0e+00   0.8536588  0.2701071
  1e-04   0.8536588  0.2701071
  1e-01   0.8548588  0.2750280

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was lambda = 0.1.
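
The "Cross-Validated (2 fold, repeated 2 times)" line and the sample sizes 1250/1251 in the outputs above can be reproduced with a small base-R sketch. It only illustrates repeated k-fold splitting with plain random assignment; how the real folds are built is decided by the control object defined earlier in the notebook.

```r
set.seed(42)          # arbitrary seed, for reproducibility of the sketch only
n       <- 2501       # number of training samples reported by caret
k       <- 2          # folds
repeats <- 2          # repetitions

train_sizes <- unlist(lapply(seq_len(repeats), function(r) {
  fold <- sample(rep(seq_len(k), length.out = n))   # random fold label per row
  # size of the analysis set when fold f is held out
  vapply(seq_len(k), function(f) sum(fold != f), integer(1))
}))

print(train_sizes)    # four resamples, each using roughly half of the rows
```

With only four resamples of about 1250 rows each, differences in the third decimal of Accuracy between tuning values should be read with caution.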

2) Training - with pre-processing (preProcess) and the default tuning grid

In [56]:
# PLS
fit.pls_preProc <- caret::train(Churn~., data=training_dataset, method="pls", 
                                metric=metric, 
                                trControl=control,
                                preProc=c("center", "scale"), 
                                verbose = TRUE
)
print(fit.pls_preProc)
Aggregating results
Selecting tuning parameters
Fitting ncomp = 1 on full training set
Partial Least Squares 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1250, 1251 
Resampling results across tuning parameters:

  ncomp  Accuracy   Kappa     
  1      0.8614552  0.09858137
  2      0.8602552  0.15191354
  3      0.8606552  0.15978440

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was ncomp = 1.
In [57]:
# KKNN
fit.kknn_preProc <- caret::train(Churn~., data=training_dataset, method="kknn", 
                                 metric=metric, 
                                 trControl=control,
                                 preProc=c("center", "scale", "pca"), 
                                 verbose = TRUE
)
print(fit.kknn_preProc)
Aggregating results
Selecting tuning parameters
Fitting kmax = 9, distance = 2, kernel = optimal on full training set
k-Nearest Neighbors 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1251, 1250 
Resampling results across tuning parameters:

  kmax  Accuracy   Kappa    
  5     0.8748494  0.3529289
  7     0.8766488  0.3446831
  9     0.8780476  0.3491199

Tuning parameter 'distance' was held constant at a value of 2
Tuning
 parameter 'kernel' was held constant at a value of optimal
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were kmax = 9, distance = 2 and kernel
 = optimal.
In [58]:
# PDA
fit.pda_preProc <- caret::train(Churn~., data=training_dataset, method="pda", 
                                metric=metric, 
                                trControl=control,
                                preProc=c("center", "scale", "pca"), 
                                verbose = TRUE
)
print(fit.pda_preProc)
Aggregating results
Selecting tuning parameters
Fitting lambda = 0 on full training set
Penalized Discriminant Analysis 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1251, 1250 
Resampling results across tuning parameters:

  lambda  Accuracy   Kappa    
  0e+00   0.8576577  0.2271001
  1e-04   0.8576577  0.2271001
  1e-01   0.8576577  0.2271001

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was lambda = 0.
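
What preProc = c("center", "scale", "pca") does to the predictors can be sketched in base R. The example uses the built-in iris measurements as stand-in data and keeps enough components to explain 95% of the variance; that threshold is assumed here to mirror caret's default and should be checked against the caret documentation.

```r
x <- as.matrix(iris[, 1:4])                       # stand-in numeric predictors

x_std <- scale(x, center = TRUE, scale = TRUE)    # "center" + "scale"
pca   <- prcomp(x_std, center = FALSE, scale. = FALSE)

var_share <- pca$sdev^2 / sum(pca$sdev^2)         # variance explained per component
n_keep    <- which(cumsum(var_share) >= 0.95)[1]  # assumed 95% threshold
x_pca     <- pca$x[, seq_len(n_keep), drop = FALSE]

print(round(colMeans(x_std), 10))   # centred columns have mean 0
print(n_keep)                       # number of components kept
```

The model is then trained on the component scores (x_pca) instead of the original columns, which removes correlation between predictors but makes individual coefficients harder to interpret.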

3) Training - with pre-processing (preProcess) and an automatic grid, i.e. tuneLength

In [59]:
# PLS
fit.pls_automaticGrid <- caret::train(Churn~., data=training_dataset, method="pls", 
                                      metric=metric, 
                                      trControl=control,
                                      preProc=c("center", "scale"), 
                                      tuneLength = tuneLength,
                                      verbose = TRUE
)
print(fit.pls_automaticGrid)
Aggregating results
Selecting tuning parameters
Fitting ncomp = 1 on full training set
Partial Least Squares 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1251, 1250 
Resampling results across tuning parameters:

  ncomp  Accuracy   Kappa    
  1      0.8624550  0.1047710
  2      0.8608561  0.1550857

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was ncomp = 1.
In [60]:
# KKNN
fit.kknn_automaticGrid <- caret::train(Churn~., data=training_dataset, method="kknn", 
                                       metric=metric, 
                                       trControl=control,
                                       preProc=c("center", "scale", "pca"), 
                                       tuneLength = tuneLength,
                                       verbose = TRUE
)
print(fit.kknn_automaticGrid)
Aggregating results
Selecting tuning parameters
Fitting kmax = 7, distance = 2, kernel = optimal on full training set
k-Nearest Neighbors 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1251, 1250 
Resampling results across tuning parameters:

  kmax  Accuracy   Kappa    
  5     0.8702539  0.3245917
  7     0.8714528  0.3045254

Tuning parameter 'distance' was held constant at a value of 2
Tuning
 parameter 'kernel' was held constant at a value of optimal
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were kmax = 7, distance = 2 and kernel
 = optimal.
In [61]:
# PDA
fit.pda_automaticGrid <- caret::train(Churn~., data=training_dataset, method="pda", 
                                      metric=metric, 
                                      trControl=control,
                                      preProc=c("center", "scale", "pca"), 
                                      tuneLength = tuneLength,
                                      verbose = TRUE
)
print(fit.pda_automaticGrid)
Aggregating results
Selecting tuning parameters
Fitting lambda = 0 on full training set
Penalized Discriminant Analysis 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1250, 1251 
Resampling results across tuning parameters:

  lambda  Accuracy   Kappa    
  0.0     0.8544596  0.2195928
  0.1     0.8544596  0.2195928

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was lambda = 0.
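
Both lambda values above score the same Accuracy, and the smaller one is reported as final. A minimal sketch of that tie-breaking behaviour, assuming candidates are ordered with the $sort expression from the pda listing (ascending lambda) and the first maximum wins; caret's internal selection code may differ in detail.

```r
# Resampling results typed in from the output above (deliberately unsorted)
res <- data.frame(lambda   = c(0.1, 0.0),
                  Accuracy = c(0.8544596, 0.8544596))

res_sorted <- res[order(res[, 1]), ]                    # same expression as pda's $sort
best <- res_sorted[which.max(res_sorted$Accuracy), ]    # which.max returns the first maximum

print(best$lambda)
```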

4) Training - with pre-processing (preProcess) and a manual grid, i.e. tuneGrid

The grid has to be specified manually for each algorithm, using that algorithm's own tuning parameters.

In [62]:
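# PLS
# Note: ncomp is a count of components; in the output below each fractional
# value (1.5, 2.5, 3.5) gives the same results as the integer just below it.
grid <- expand.grid(ncomp=c(seq(from = 1, to = 4, by = 0.5)))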
fit.pls_manualGrid <- caret::train(Churn~., data=training_dataset, method="pls", 
                                   metric=metric, 
                                   trControl=control,
                                   preProc=c("center", "scale"), 
                                   tuneGrid = grid,
                                   verbose = TRUE
)
print(fit.pls_manualGrid)
plot(fit.pls_manualGrid)
Aggregating results
Selecting tuning parameters
Fitting ncomp = 1 on full training set
Partial Least Squares 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1250, 1251 
Resampling results across tuning parameters:

  ncomp  Accuracy   Kappa    
  1.0    0.8642550  0.1251190
  1.5    0.8642550  0.1251190
  2.0    0.8618568  0.1671640
  2.5    0.8618568  0.1671640
  3.0    0.8624561  0.1716716
  3.5    0.8624561  0.1716716
  4.0    0.8624563  0.1716890

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was ncomp = 1.
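
Since the fractional ncomp rows in the table above only repeat the rows of the integers below them, an integer grid would cover the same models with fewer fits. A possible alternative grid, offered as a suggestion and not what was run above:

```r
grid_fractional <- expand.grid(ncomp = seq(from = 1, to = 4, by = 0.5))
grid_integer    <- expand.grid(ncomp = 1:4)

# Fractional candidates collapse onto the integer ones once rounded down
print(unique(floor(grid_fractional$ncomp)))
print(nrow(grid_fractional))   # candidates in the grid used above
print(nrow(grid_integer))      # candidates in the integer-only grid
```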
In [63]:
# KKNN
grid <- expand.grid(kmax      = c(seq(from = 1, to = 10, by = 1)),
                    distance  = c(seq(from = 1, to = 10, by = 2)),
                    kernel    = c("rectangular", "triangular","epanechnikov")
)
fit.kknn_manualGrid <- caret::train(Churn~., data=training_dataset, method="kknn", 
                                    metric=metric, 
                                    trControl=control,
                                    preProc=c("center", "scale", "pca"), 
                                    tuneGrid = grid,
                                    verbose = TRUE
)
print(fit.kknn_manualGrid)
plot(fit.kknn_manualGrid)
Aggregating results
Selecting tuning parameters
Fitting kmax = 10, distance = 3, kernel = triangular on full training set
k-Nearest Neighbors 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1250, 1251 
Resampling results across tuning parameters:

  kmax  distance  kernel        Accuracy   Kappa    
   1    1         rectangular   0.8420614  0.3014900
   1    1         triangular    0.8420614  0.3014900
   1    1         epanechnikov  0.8420614  0.3014900
   1    3         rectangular   0.8522582  0.3417751
   1    3         triangular    0.8522582  0.3417751
   1    3         epanechnikov  0.8522582  0.3417751
   1    5         rectangular   0.8470596  0.3287266
   1    5         triangular    0.8470596  0.3287266
   1    5         epanechnikov  0.8470596  0.3287266
   1    7         rectangular   0.8422611  0.3051310
   1    7         triangular    0.8422611  0.3051310
   1    7         epanechnikov  0.8422611  0.3051310
   1    9         rectangular   0.8398620  0.2909128
   1    9         triangular    0.8398620  0.2909128
   1    9         epanechnikov  0.8398620  0.2909128
   2    1         rectangular   0.8420614  0.3014900
   2    1         triangular    0.8420614  0.3014900
   2    1         epanechnikov  0.8420614  0.3014900
   2    3         rectangular   0.8522582  0.3417751
   2    3         triangular    0.8522582  0.3417751
   2    3         epanechnikov  0.8522582  0.3417751
   2    5         rectangular   0.8470596  0.3287266
   2    5         triangular    0.8470596  0.3287266
   2    5         epanechnikov  0.8470596  0.3287266
   2    7         rectangular   0.8422611  0.3051310
   2    7         triangular    0.8422611  0.3051310
   2    7         epanechnikov  0.8422611  0.3051310
   2    9         rectangular   0.8398620  0.2909128
   2    9         triangular    0.8398620  0.2909128
   2    9         epanechnikov  0.8398620  0.2909128
   3    1         rectangular   0.8694516  0.3103348
   3    1         triangular    0.8518580  0.3059017
   3    1         epanechnikov  0.8532577  0.3110970
   3    3         rectangular   0.8682526  0.3036131
   3    3         triangular    0.8570563  0.3336962
   3    3         epanechnikov  0.8572563  0.3314725
   3    5         rectangular   0.8588536  0.2881036
   3    5         triangular    0.8542574  0.3329393
   3    5         epanechnikov  0.8542572  0.3304396
   3    7         rectangular   0.8598523  0.2883871
   3    7         triangular    0.8504588  0.3124852
   3    7         epanechnikov  0.8508587  0.3125404
   3    9         rectangular   0.8514580  0.2797816
   3    9         triangular    0.8486604  0.3017994
   3    9         epanechnikov  0.8494601  0.3036769
   4    1         rectangular   0.8694516  0.3103348
   4    1         triangular    0.8602561  0.3121887
   4    1         epanechnikov  0.8626555  0.3181380
   4    3         rectangular   0.8682526  0.3036131
   4    3         triangular    0.8626548  0.3322890
   4    3         epanechnikov  0.8628552  0.3300243
   4    5         rectangular   0.8588536  0.2881036
   4    5         triangular    0.8634540  0.3452524
   4    5         epanechnikov  0.8638537  0.3418055
   4    7         rectangular   0.8598523  0.2883871
   4    7         triangular    0.8558569  0.3071932
   4    7         epanechnikov  0.8566571  0.3072527
   4    9         rectangular   0.8514580  0.2797816
   4    9         triangular    0.8560574  0.3082390
   4    9         epanechnikov  0.8568574  0.3074399
   5    1         rectangular   0.8720508  0.3024885
   5    1         triangular    0.8670529  0.3140192
   5    1         epanechnikov  0.8678526  0.3130305
   5    3         rectangular   0.8724492  0.3013423
   5    3         triangular    0.8686518  0.3372829
   5    3         epanechnikov  0.8680521  0.3346127
   5    5         rectangular   0.8718502  0.2633336
   5    5         triangular    0.8652536  0.3275910
   5    5         epanechnikov  0.8652540  0.3209199
   5    7         rectangular   0.8606534  0.2598372
   5    7         triangular    0.8638536  0.3211675
   5    7         epanechnikov  0.8634532  0.3191451
   5    9         rectangular   0.8520590  0.2494092
   5    9         triangular    0.8618540  0.3150281
   5    9         epanechnikov  0.8624540  0.3175709
   6    1         rectangular   0.8720508  0.3024885
   6    1         triangular    0.8712505  0.3138422
   6    1         epanechnikov  0.8692526  0.3102379
   6    3         rectangular   0.8724492  0.3013423
   6    3         triangular    0.8712515  0.3329888
   6    3         epanechnikov  0.8714515  0.3255296
   6    5         rectangular   0.8718502  0.2633336
   6    5         triangular    0.8684523  0.3155740
   6    5         epanechnikov  0.8704513  0.3178978
   6    7         rectangular   0.8606534  0.2598372
   6    7         triangular    0.8680518  0.3101346
   6    7         epanechnikov  0.8674524  0.3022913
   6    9         rectangular   0.8520590  0.2494092
   6    9         triangular    0.8678518  0.3127939
   6    9         epanechnikov  0.8680515  0.3081330
   7    1         rectangular   0.8722508  0.2764402
   7    1         triangular    0.8714500  0.3011855
   7    1         epanechnikov  0.8720499  0.3026094
   7    3         rectangular   0.8718497  0.2964056
   7    3         triangular    0.8726512  0.3241032
   7    3         epanechnikov  0.8742502  0.3276691
   7    5         rectangular   0.8714505  0.2535004
   7    5         triangular    0.8714505  0.3176812
   7    5         epanechnikov  0.8700515  0.3061824
   7    7         rectangular   0.8638508  0.2623799
   7    7         triangular    0.8702502  0.3049019
   7    7         epanechnikov  0.8712508  0.3025107
   7    9         rectangular   0.8520590  0.2494092
   7    9         triangular    0.8720497  0.3127338
   7    9         epanechnikov  0.8726496  0.3046298
   8    1         rectangular   0.8722508  0.2764402
   8    1         triangular    0.8726491  0.3035201
   8    1         epanechnikov  0.8724496  0.2979223
   8    3         rectangular   0.8718497  0.2964056
   8    3         triangular    0.8740504  0.3186757
   8    3         epanechnikov  0.8740499  0.3000372
   8    5         rectangular   0.8714505  0.2535004
   8    5         triangular    0.8716505  0.3127464
   8    5         epanechnikov  0.8700515  0.3061824
   8    7         rectangular   0.8638508  0.2623799
   8    7         triangular    0.8722497  0.2973342
   8    7         epanechnikov  0.8726500  0.2936926
   8    9         rectangular   0.8520590  0.2494092
   8    9         triangular    0.8724499  0.3014958
   8    9         epanechnikov  0.8726500  0.2885658
   9    1         rectangular   0.8714508  0.2646270
   9    1         triangular    0.8730489  0.2986668
   9    1         epanechnikov  0.8732496  0.3012693
   9    3         rectangular   0.8718497  0.2964056
   9    3         triangular    0.8730497  0.2892004
   9    3         epanechnikov  0.8744496  0.2952054
   9    5         rectangular   0.8714505  0.2535004
   9    5         triangular    0.8736497  0.3053630
   9    5         epanechnikov  0.8734504  0.2951302
   9    7         rectangular   0.8638508  0.2623799
   9    7         triangular    0.8730500  0.2883575
   9    7         epanechnikov  0.8726499  0.2826063
   9    9         rectangular   0.8520590  0.2494092
   9    9         triangular    0.8726502  0.2851742
   9    9         epanechnikov  0.8734499  0.2804414
  10    1         rectangular   0.8714508  0.2646270
  10    1         triangular    0.8732489  0.2979334
  10    1         epanechnikov  0.8732496  0.3012693
  10    3         rectangular   0.8718497  0.2964056
  10    3         triangular    0.8752488  0.2952917
  10    3         epanechnikov  0.8750491  0.2956038
  10    5         rectangular   0.8714505  0.2535004
  10    5         triangular    0.8740497  0.3018323
  10    5         epanechnikov  0.8724504  0.2861394
  10    7         rectangular   0.8638508  0.2623799
  10    7         triangular    0.8734504  0.2825553
  10    7         epanechnikov  0.8728497  0.2801595
  10    9         rectangular   0.8520590  0.2494092
  10    9         triangular    0.8722500  0.2711882
  10    9         epanechnikov  0.8738499  0.2779492

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were kmax = 10, distance = 3 and kernel
 = triangular.
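
In the grid above, kernel decides how strongly each neighbour's vote is weighted by its distance, and distance is the Minkowski exponent. The sketch below is a simplified illustration of a kernel-weighted vote, not the kknn implementation: the kernel formulas are the textbook forms, the normalisation by the (k+1)-th neighbour's distance is an assumption made for this example, and all function names and the toy data are hypothetical.

```r
# Kernel weight functions on normalised distances d in [0, 1]
kernels <- list(
  rectangular  = function(d) rep(1, length(d)),   # every neighbour counts equally
  triangular   = function(d) 1 - d,               # linear decay with distance
  epanechnikov = function(d) 0.75 * (1 - d^2)     # quadratic decay with distance
)

minkowski <- function(a, b, p) sum(abs(a - b)^p)^(1 / p)

weighted_knn <- function(train_x, train_y, query, k, p = 2, kernel = "triangular") {
  d      <- apply(train_x, 1, minkowski, b = query, p = p)
  ord    <- order(d)[seq_len(k + 1)]              # k neighbours plus one for normalisation
  idx    <- ord[seq_len(k)]
  d_norm <- d[idx] / d[ord[k + 1]]
  votes  <- tapply(kernels[[kernel]](d_norm), train_y[idx], sum)
  names(which.max(votes))
}

# Toy data: one close 'A' point, several distant 'B' points
train_x <- rbind(c(0.1, 0), c(1.9, 0), c(0, 1.95), c(2, 0))
train_y <- factor(c("A", "B", "B", "B"))
query   <- c(0, 0)

pred_rect <- weighted_knn(train_x, train_y, query, k = 3, kernel = "rectangular")
pred_tri  <- weighted_knn(train_x, train_y, query, k = 3, kernel = "triangular")
print(c(rectangular = pred_rect, triangular = pred_tri))
```

With the unweighted (rectangular) kernel the two distant 'B' points outvote the single close 'A' point, while the triangular kernel lets the close neighbour win. This is the weighted kNN behaviour described in the introduction.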
In [64]:
# PDA
grid <- expand.grid(lambda = c(seq(from = 0.1, to = 1.0, by = 0.2)))
fit.pda_manualGrid <- caret::train(Churn~., data=training_dataset, method="pda", 
                                   metric=metric, 
                                   trControl=control,
                                   preProc=c("center", "scale", "pca"), 
                                   tuneGrid = grid,
                                   verbose = TRUE
)
print(fit.pda_manualGrid)
plot(fit.pda_manualGrid)
Aggregating results
Selecting tuning parameters
Fitting lambda = 0.1 on full training set
Penalized Discriminant Analysis 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1251, 1250, 1250, 1251 
Resampling results across tuning parameters:

  lambda  Accuracy   Kappa    
  0.1     0.8568579  0.2011181
  0.3     0.8568579  0.2011181
  0.5     0.8568579  0.2011181
  0.7     0.8568579  0.2011181
  0.9     0.8568579  0.2011181

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was lambda = 0.1.

Collect the results of trained models

In [65]:
results <- resamples(list(    trained_Model_1  = fit.pls
                              , trained_Model_2  = fit.kknn
                              , trained_Model_3  = fit.pda
                              
                              , trained_Model_4  = fit.pls_preProc
                              , trained_Model_5  = fit.kknn_preProc
                              , trained_Model_6  = fit.pda_preProc
                              
                              , trained_Model_7  = fit.pls_automaticGrid
                              , trained_Model_8  = fit.kknn_automaticGrid
                              , trained_Model_9  = fit.pda_automaticGrid
                              
                              , trained_Model_10 = fit.pls_manualGrid
                              , trained_Model_11 = fit.kknn_manualGrid
                              , trained_Model_12 = fit.pda_manualGrid
))

Summarize the fitted models

In [66]:
summary(results)
Call:
summary.resamples(object = results)

Models: trained_Model_1, trained_Model_2, trained_Model_3, trained_Model_4, trained_Model_5, trained_Model_6, trained_Model_7, trained_Model_8, trained_Model_9, trained_Model_10, trained_Model_11, trained_Model_12 
Number of resamples: 4 

Accuracy 
                      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
trained_Model_1  0.8545164 0.8545164 0.8548582 0.8548582 0.8552000 0.8552000
trained_Model_2  0.8713030 0.8749001 0.8776496 0.8774505 0.8802000 0.8832000
trained_Model_3  0.8497202 0.8544301 0.8560576 0.8548588 0.8564863 0.8576000
trained_Model_4  0.8600000 0.8606835 0.8612556 0.8614552 0.8620273 0.8633094
trained_Model_5  0.8680000 0.8740743 0.8792496 0.8780476 0.8832229 0.8856914
trained_Model_6  0.8521183 0.8568296 0.8588563 0.8576577 0.8596844 0.8608000
trained_Model_7  0.8608000 0.8620825 0.8625100 0.8624550 0.8628825 0.8640000
trained_Model_8  0.8681055 0.8681055 0.8696528 0.8714528 0.8730000 0.8784000
trained_Model_9  0.8473221 0.8527178 0.8552582 0.8544596 0.8570000 0.8600000
trained_Model_10 0.8617106 0.8628277 0.8632547 0.8642550 0.8646820 0.8688000
trained_Model_11 0.8697042 0.8714261 0.8724000 0.8752488 0.8762227 0.8864908
trained_Model_12 0.8473221 0.8538305 0.8584000 0.8568579 0.8614273 0.8633094
                 NA's
trained_Model_1     0
trained_Model_2     0
trained_Model_3     0
trained_Model_4     0
trained_Model_5     0
trained_Model_6     0
trained_Model_7     0
trained_Model_8     0
trained_Model_9     0
trained_Model_10    0
trained_Model_11    0
trained_Model_12    0

Kappa 
                       Min.    1st Qu.     Median       Mean   3rd Qu.
trained_Model_1  0.00000000 0.00000000 0.00000000 0.00000000 0.0000000
trained_Model_2  0.31125762 0.32440042 0.34310095 0.35077732 0.3694779
trained_Model_3  0.25244113 0.27291322 0.27996529 0.27502797 0.2820800
trained_Model_4  0.07329637 0.07817774 0.09166261 0.09858137 0.1120662
trained_Model_5  0.31187875 0.31378782 0.34817041 0.34911986 0.3835025
trained_Model_6  0.16288382 0.20900009 0.22993344 0.22710010 0.2480334
trained_Model_7  0.08534278 0.09920325 0.10710986 0.10477096 0.1126776
trained_Model_8  0.25814312 0.28514529 0.30242199 0.30452543 0.3218021
trained_Model_9  0.18784733 0.19518163 0.21761766 0.21959284 0.2420289
trained_Model_10 0.09904860 0.10121026 0.11925537 0.12511904 0.1431641
trained_Model_11 0.26345484 0.26869677 0.27111119 0.29529169 0.2977061
trained_Model_12 0.15736186 0.17590486 0.20075719 0.20111811 0.2259704
                      Max. NA's
trained_Model_1  0.0000000    0
trained_Model_2  0.4056497    0
trained_Model_3  0.2877402    0
trained_Model_4  0.1377039    0
trained_Model_5  0.3882599    0
trained_Model_6  0.2856497    0
trained_Model_7  0.1195214    0
trained_Model_8  0.3551146    0
trained_Model_9  0.2552887    0
trained_Model_10 0.1629168    0
trained_Model_11 0.3754895    0
trained_Model_12 0.2455962    0
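
To rank the twelve models programmatically rather than by eye, the mean Accuracy column of the summary can be compared directly. The vector below is typed in from the table above; how to extract the same numbers from the results object is not shown here.

```r
# Mean resampled Accuracy per model, copied from the summary above
mean_accuracy <- c(
  trained_Model_1  = 0.8548582, trained_Model_2  = 0.8774505,
  trained_Model_3  = 0.8548588, trained_Model_4  = 0.8614552,
  trained_Model_5  = 0.8780476, trained_Model_6  = 0.8576577,
  trained_Model_7  = 0.8624550, trained_Model_8  = 0.8714528,
  trained_Model_9  = 0.8544596, trained_Model_10 = 0.8642550,
  trained_Model_11 = 0.8752488, trained_Model_12 = 0.8568579
)

ranked <- sort(mean_accuracy, decreasing = TRUE)
print(head(ranked, 3))      # three highest mean accuracies
print(names(ranked)[1])     # model with the highest mean accuracy
```

By this measure the kknn fits (models 5, 2 and 11) lead, and they also have the highest mean Kappa values in the summary.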

Plot and rank the fitted models

In [67]:
dotplot(results)
In [68]:
bwplot(results)

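Assign the trained model to carry forward

In [69]:
# Note: by mean Accuracy in the summary above, the kknn fits rank highest
# (e.g. trained_Model_5 = fit.kknn_preProc, 0.8780); fit.pda_automaticGrid
# (trained_Model_9, 0.8545) is the model actually used in the remaining steps.
best_trained_model <- fit.pda_automaticGrid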

9. Test skill of the BEST trained model on validation/testing dataset

In [71]:
predictions <- predict(best_trained_model, newdata=testing_dataset)

Evaluate the BEST trained model and print results

In [72]:
res_  <- caret::confusionMatrix(table(predictions, testing_dataset$Churn))
print("Results from the BEST trained model ... ...\n"); 
print(round(res_$overall, digits = 3))
[1] "Results from the BEST trained model ... ...\n"
      Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
         0.864          0.196          0.839          0.887          0.856 
AccuracyPValue  McnemarPValue 
         0.263          0.000 

10. Save the model to disk

In [73]:
#getwd()
saveRDS(best_trained_model, "./best_trained_model.rds")
In [74]:
# load the model
#getwd()
saved_model <- readRDS("./best_trained_model.rds")
print(saved_model)
Penalized Discriminant Analysis 

2501 samples
  19 predictor
   2 classes: 'no', 'yes' 

Pre-processing: centered (19), scaled (19), principal component
 signal extraction (19) 
Resampling: Cross-Validated (2 fold, repeated 2 times) 
Summary of sample sizes: 1250, 1251, 1250, 1251 
Resampling results across tuning parameters:

  lambda  Accuracy   Kappa    
  0.0     0.8544596  0.2195928
  0.1     0.8544596  0.2195928

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was lambda = 0.
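
A quick way to check that a model survives the saveRDS/readRDS round trip is to compare predictions before and after. The sketch uses a small lm fit on the built-in mtcars data and a temporary file as stand-ins for the caret model and ./best_trained_model.rds.

```r
fit  <- lm(mpg ~ wt + hp, data = mtcars)   # stand-in model
path <- tempfile(fileext = ".rds")         # temporary file instead of a fixed path

saveRDS(fit, path)
fit_loaded <- readRDS(path)

# Predictions from the original and the reloaded object should agree
same <- isTRUE(all.equal(predict(fit, mtcars), predict(fit_loaded, mtcars)))
print(same)

unlink(path)                               # clean up the temporary file
```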
In [75]:
# make predictions with the reloaded model on the full data set
# (dataSet[1:20] selects the first 20 columns; dataSet has 3333 rows versus 2501
# training samples, so it appears to include the training rows and is not an independent test)
final_predictions <- predict(saved_model, dataSet[1:20])
res_ <- confusionMatrix(table(final_predictions, dataSet$Churn))
print(res_)
print("Results from the BEST trained model ... ...\n"); 
print(round(res_$overall, digits = 3))
Confusion Matrix and Statistics

                 
final_predictions   no  yes
              no  2796  414
              yes   54   69
                                          
               Accuracy : 0.8596          
                 95% CI : (0.8473, 0.8712)
    No Information Rate : 0.8551          
    P-Value [Acc > NIR] : 0.2387          
                                          
                  Kappa : 0.1795          
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 0.9811          
            Specificity : 0.1429          
         Pos Pred Value : 0.8710          
         Neg Pred Value : 0.5610          
             Prevalence : 0.8551          
         Detection Rate : 0.8389          
   Detection Prevalence : 0.9631          
      Balanced Accuracy : 0.5620          
                                          
       'Positive' Class : no              
                                          
[1] "Results from the BEST trained model ... ...\n"
      Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull 
         0.860          0.179          0.847          0.871          0.855 
AccuracyPValue  McnemarPValue 
         0.239          0.000 
In [76]:
print(res_$table)
fourfoldplot(res_$table, color = c("#CC6666", "#99CC99"),
             conf.level = 0, margin = 1, main = "Confusion Matrix")
                 
final_predictions   no  yes
              no  2796  414
              yes   54   69
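
The statistics reported by confusionMatrix can be re-derived from the four counts in the table, which also makes the weak 'yes' performance explicit. A base-R sketch, with 'no' treated as the positive class as in the output above:

```r
# Counts copied from the confusion matrix above (rows = predicted, columns = actual)
tab <- matrix(c(2796, 54, 414, 69), nrow = 2,
              dimnames = list(predicted = c("no", "yes"),
                              actual    = c("no", "yes")))

accuracy    <- sum(diag(tab)) / sum(tab)
sensitivity <- tab["no", "no"]   / sum(tab[, "no"])    # share of actual 'no' predicted 'no'
specificity <- tab["yes", "yes"] / sum(tab[, "yes"])   # share of actual 'yes' predicted 'yes'
balanced    <- (sensitivity + specificity) / 2

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, balanced = balanced), 4))
```

A specificity of about 0.14 means only 69 of the 483 actual 'yes' cases are caught, even though overall accuracy is close to 0.86.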

REFERENCES