How to do Linear Regression in R Using Principal Component Regression

# ---------------------------------------------------------------------------
# How to do Linear Regression in R Using Principal Component Regression
# ---------------------------------------------------------------------------
# load the pls package and the longley macroeconomic data set
library(pls)
data(longley)

Data <- as.matrix(longley)
dim(Data)

head(Data)

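# split into predictors (columns 1-6) and response (column 7, Employed);
# shown for reference only: pcr() below takes the data frame directly
x <- Data[,1:6]
y <- Data[,7]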

# -----------------------------
# Using Principal Component Regression
# -----------------------------

# fit model
fit <- pcr(Employed~., data=longley, validation="CV")

# summarize the fit
summary(fit)

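# make predictions on the training data using all 6 components
# (with every component retained, PCR reproduces the ordinary least squares fit)
predictions <- predict(fit, longley, ncomp=6)

# summarize accuracy: mean squared error on the training data
mse <- mean((longley$Employed - predictions)^2)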
print(mse)

# visualise the fit: observed vs. predicted Employed
plot(longley$Employed, predictions)
[1] 16  7
     GNP.deflator     GNP Unemployed Armed.Forces Population Year Employed
1947         83.0 234.289      235.6        159.0    107.608 1947   60.323
1948         88.5 259.426      232.5        145.6    108.632 1948   61.122
1949         88.2 258.054      368.2        161.6    109.773 1949   60.171
1950         89.5 284.599      335.1        165.0    110.929 1950   61.187
1951         96.2 328.975      209.9        309.9    112.075 1951   63.221
1952         98.1 346.999      193.2        359.4    113.270 1952   63.639
Data: 	X dimension: 16 6 
	Y dimension: 16 1
Fit method: svdpc
Number of components considered: 6

VALIDATION: RMSEP
Cross-validated using 10 random segments.
       (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
CV           3.627    1.812    1.294   0.4934   0.5823   0.5596   0.4197
adjCV        3.627    1.771    1.280   0.4881   0.5705   0.5470   0.4085

TRAINING: % variance explained
          1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
X           64.96    94.90    99.99   100.00   100.00   100.00
Employed    78.42    89.73    98.51    98.56    98.83    99.55
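
The cross-validated RMSEP rows in the summary are what would normally guide the choice of ncomp. The following is a minimal sketch of reading that choice off programmatically, assuming the pls package is installed. The seed value and the names cv_rmsep and best_ncomp are illustrative and not part of the original session. Because the CV segments are drawn at random, the values will not necessarily match the table above.

```r
library(pls)
data(longley)

set.seed(1)  # illustrative seed so the random CV segments are reproducible
fit <- pcr(Employed ~ ., data = longley, validation = "CV")

# RMSEP() returns an object whose $val array is indexed by
# estimate x response x model size (intercept-only model first)
cv_rmsep <- RMSEP(fit, estimate = "CV")$val[1, 1, ]

# position 1 is the intercept-only model, so subtract 1 to get ncomp
best_ncomp <- unname(which.min(cv_rmsep)) - 1
print(best_ncomp)

# plot cross-validated RMSEP against the number of components
validationplot(fit, val.type = "RMSEP")
```

Taking the minimum is only one possible rule. With 16 observations the CV estimates are noisy, so a smaller model with nearly the same RMSEP (3 components in the table above) may be a reasonable choice as well.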
[1] 0.0522765
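
The MSE above is computed on the training data with all 6 components retained, in which case principal component regression gives the same fitted values as ordinary least squares. The sketch below illustrates this using base R only: it rebuilds the regression by hand from prcomp() scores, assuming centered but unscaled predictors (the pcr() defaults). The helper pcr_mse and the other variable names are hypothetical and introduced only for illustration.

```r
data(longley)

X <- as.matrix(longley[, 1:6])   # predictors
y <- longley$Employed            # response

# principal component scores of the centered, unscaled predictors
pca <- prcomp(X, center = TRUE, scale. = FALSE)
scores <- pca$x

# training MSE of a regression of y on the first k component scores
pcr_mse <- function(k) {
  fit_k <- lm(y ~ scores[, 1:k, drop = FALSE])
  mean(residuals(fit_k)^2)
}

mse_by_ncomp <- sapply(1:6, pcr_mse)
print(mse_by_ncomp)

# with all 6 components the fit coincides with ordinary least squares
mse_ols <- mean(residuals(lm(Employed ~ ., data = longley))^2)
print(all.equal(mse_by_ncomp[6], mse_ols, tolerance = 1e-6))
```

Training MSE can only go down as components are added, so it should not be used to pick ncomp. The cross-validated RMSEP from summary(fit) is the more appropriate guide.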