Time Series Forecasting Using R


Introduction

In this guide, you will learn how to implement the following time series forecasting techniques using the statistical programming language ‘R’:

  1. Naive Method
  2. Simple Exponential Smoothing
  3. Holt’s Trend Method
  4. ARIMA
  5. TBATS

We will begin by exploring the data.

Problem Statement

Unemployment is a major socio-economic and political issue for any country, and managing it is therefore a chief task for any government. In this guide, we will try to forecast unemployment levels over a twelve-month period. The data used in this guide was derived from US economic time series data available from the Federal Reserve Economic Data (FRED). The data contains 564 rows and 7 variables. The seventh variable, ‘Class’, labels each row as belonging to the training or test set and is used later for partitioning; the remaining six variables are described below:

  • date – represents the first date of every month starting from January 1968.
  • psavert – personal savings rate.
  • pce – personal consumption expenditures, in billions of dollars.
  • uempmed – median duration of unemployment, in weeks.
  • pop – total population, in thousands.
  • unemploy – number of unemployed, in thousands (dependent variable).

 

Even though we have several variables, we will concentrate on the ‘date’ and ‘unemploy’ columns, as this guide covers univariate time series forecasting.

Loading the Required Libraries


library(readr)     # reading the data
library(ggplot2)   # plotting
library(forecast)  # forecasting models: naive(), ses(), holt(), auto.arima(), tbats()
library(fpp2)      # companion data sets and utilities for the forecast package
library(TTR)       # smoothing and technical analysis utilities
library(dplyr)     # data manipulation, including glimpse()

Reading the Data

The first line of code below reads in the data, while the second line prints an overview of the data.


dat <- read_csv("timeseriesdata.csv")
glimpse(dat)

Output:


Observations: 564
Variables: 7
$ date 	<chr> "01-01-1968", "01-02-1968", "01-03-1968", "01-04-1968...
$ pce  	<dbl> 531.5, 534.2, 544.9, 544.6, 550.4, 556.8, 563.8, 567....
$ pop  	<int> 199808, 199920, 200056, 200208, 200361, 200536, 20070...
$ psavert  <dbl> 11.7, 12.2, 11.6, 12.2, 12.0, 11.6, 10.6, 10.4, 10.4,...
$ uempmed  <dbl> 5.1, 4.5, 4.1, 4.6, 4.4, 4.4, 4.5, 4.2, 4.6, 4.8, 4.4...
$ unemploy <int> 2878, 3001, 2877, 2709, 2740, 2938, 2883, 2768, 2686,...
$ Class	<chr> "Train", "Train", "Train", "Train", "Train", "Train",...
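
Before moving on, it helps to visualize the series we are forecasting. The snippet below is a minimal sketch (not part of the original code) that plots the unemployment counts over time, assuming the ‘date’ column follows the day-month-year format shown in the output above.


plot_df <- dat
plot_df$date <- as.Date(plot_df$date, format = "%d-%m-%Y")  # parse character dates
ggplot(plot_df, aes(x = date, y = unemploy)) +
  geom_line() +
  labs(x = "Year", y = "Unemployed (thousands)")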

Data Partitioning

We will now create the training and test data sets for model validation. The first two lines of code below perform this task using the ‘Class’ variable, which already labels each row as Train or Test.

The third line of code prints the number of rows in the training and test data. Note that the test data set has 12 rows because we will be building the models to forecast 12 months ahead.


dat_train = subset(dat, Class == 'Train')
dat_test = subset(dat, Class == 'Test')
 
nrow(dat_train); nrow(dat_test)

Output:


[1] 552
[1] 12

Preparing the Time Series Object

To run the forecasting models in ‘R’, we need to convert the data into a time series object, which is done in the first line of code below. The ‘start’ and ‘end’ arguments specify the times of the first and last observations, respectively, while the ‘frequency’ argument specifies the number of observations per unit of time (12 per year for monthly data).

We will also create a utility function for calculating the Mean Absolute Percentage Error (MAPE), which will be used to evaluate the performance of the forecasting models. The lower the MAPE value, the better the forecasting model. This is done in the remaining lines of code below.


# convert the 'unemploy' column (column 6) of the training data
# into a monthly time series object
dat_ts <- ts(dat_train[, 6], start = c(1968, 1), end = c(2013, 12), frequency = 12)
 
# utility function: Mean Absolute Percentage Error (MAPE)
mape <- function(actual, pred){
  mape <- mean(abs((actual - pred)/actual)) * 100
  return (mape)
}
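
As a quick sanity check of the mape function, consider a small hypothetical example: actual values of 100 and 200 with predictions of 110 and 190 give percentage errors of 10 and 5, which average to a MAPE of 7.5.


mape(c(100, 200), c(110, 190))  # returns 7.5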

With the data and the mape function prepared, we move to the forecasting techniques in the subsequent sections.

Naive Forecasting Method

The simplest forecasting method is to use the most recent observation as the forecast for the next observation. This is called a naive forecast and can be implemented using the ‘naive()’ function. This method may not be the best forecasting technique, but it often provides a useful benchmark for other, more advanced forecasting methods.

The first line of code below reads in the time series object ‘dat_ts’ and creates the naive forecasting model. The second argument, ‘h’, specifies the number of periods to forecast, which is set to 12 in our case. The second line prints the summary of the model as well as the forecasted values for the next 12 months.


naive_mod <- naive(dat_ts, h = 12)
summary(naive_mod)

Output:


Forecast method: Naive method
 
Model Information:
Call: naive(y = dat_ts, h = 12)
 
Residual sd: 214.9766
 
Error measures:
                	ME 	RMSE      MAE   	MPE 	MAPE      MASE
 Training set 13.60799 214.9766 161.1361 0.1931667 2.148826 0.1690101
              	ACF1
 Training set 0.201558
 
Forecasts:
     	Point Forecast 	Lo 80    Hi 80	Lo 95	Hi 95
Jan 2014      	10376 10100.496 10651.50 9954.654 10797.35
Feb 2014      	10376  9986.379 10765.62 9780.126 10971.87
Mar 2014      	10376  9898.814 10853.19 9646.206 11105.79
Apr 2014      	10376  9824.993 10927.01 9533.307 11218.69
May 2014      	10376  9759.955 10992.04 9433.841 11318.16
Jun 2014      	10376  9701.157 11050.84 9343.916 11408.08
Jul 2014      	10376  9647.086 11104.91 9261.222 11490.78
Aug 2014      	10376  9596.758 11155.24 9184.252 11567.75
Sep 2014      	10376  9549.489 11202.51 9111.961 11640.04
Oct 2014      	10376  9504.781 11247.22 9043.585 11708.41
Nov 2014      	10376  9462.258 11289.74 8978.552 11773.45
Dec 2014      	10376  9421.627 11330.37 8916.413 11835.59

The output above shows that the naive method predicts the same value for the entire forecasting horizon. Let us now use the forecasted value and evaluate the model performance on the test data.

The first line of code below adds a new variable, naive, to the test data, containing the flat forecast value (10376) obtained from the naive method. The second line uses the mape function to compute the MAPE error on the test data, which comes out to be 8.5 percent.


dat_test$naive = 10376
mape(dat_test$unemploy, dat_test$naive)  ## 8.5%

Output:


[1] 8.486551
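
To see how the flat naive forecast simply extends the last observation, you can optionally plot the model object with the ‘forecast’ package’s autoplot() method; this sketch is supplementary to the original code.


autoplot(naive_mod) + labs(y = "Unemployed (thousands)")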

Simple Exponential Smoothing

Exponential smoothing methods are an extension of the naive method, wherein the forecasts are produced using weighted averages of past observations, with the weights decaying exponentially as the observations get older. In simple words, more recent observations receive higher weights and older observations receive lower ones. The rate of this decay for the level is controlled by the smoothing parameter ‘alpha’.
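
To make the idea of exponentially decaying weights concrete, the short sketch below (illustrative only, using an assumed alpha of 0.5) computes the weights that simple exponential smoothing places on the six most recent observations.


alpha <- 0.5
ses_weights <- alpha * (1 - alpha)^(0:5)  # weights on y_t, y_(t-1), ..., y_(t-5)
round(ses_weights, 4)  # 0.5000 0.2500 0.1250 0.0625 0.0312 0.0156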

The first line of code below reads in the time series object ‘dat_ts’ and creates the simple exponential smoothing model. The second line prints the summary of the model as well as the forecasted value for the next 12 months.


se_model <- ses(dat_ts, h = 12)
summary(se_model)

Output:


Forecast method: Simple exponential smoothing
 
Model Information:
Simple exponential smoothing
 
Call:
ses(y = dat_ts, h = 12)
 
Smoothing parameters: alpha = 0.9999
 
Initial states:
   l = 2849.6943
   sigma:  215.1798
 
  	AIC     AICc  	BIC
 9419.182 9419.226 9432.123
 
Error measures:
                	ME 	RMSE      MAE   	MPE 	MAPE      MASE
 Training set 13.63606 214.7896 160.8971 0.1946177 2.146727 0.1687594
               	ACF1
 Training set 0.2017391
 
Forecasts:
      	Point Forecast 	Lo 80    Hi 80	Lo 95	Hi 95
Jan 2014   	10376.04 10100.280 10651.81 9954.299 10797.79
Feb 2014   	10376.04  9986.074 10766.01 9779.637 10972.45
Mar 2014       10376.04  9898.438 10853.65 9645.609 11106.48
Apr 2014   	10376.04  9824.557 10927.53 9532.618 11219.47
May 2014   	10376.04  9759.466 10992.62 9433.070 11319.02
Jun 2014   	10376.04  9700.619 11051.47 9343.071 11409.02
Jul 2014   	10376.04  9646.504 11105.58 9260.308 11491.78
Aug 2014   	10376.04  9596.134 11155.95 9183.274 11568.81
Sep 2014   	10376.04  9548.826 11203.26 9110.923 11641.17
Oct 2014   	10376.04  9504.080 11248.01 9042.490 11709.60
Nov 2014   	10376.04  9461.521 11290.57 8977.402 11774.69
Dec 2014   	10376.04  9420.857 11331.23 8915.212 11836.88
 

The output above shows that simple exponential smoothing, like the naive method, produces a flat forecast. Because the estimated alpha is close to 1, the forecasts stay close to the most recent observation. Let us now evaluate the model performance on the test data.

The first line of code below stores the output of the model in a data frame. The second line adds a new variable, simplexp, to the test data, containing the forecasted value from the simple exponential smoothing model. The third line uses the mape function to compute the MAPE error on the test data, which comes out to be 8.5 percent.


df_fc = as.data.frame(se_model)
dat_test$simplexp = df_fc$`Point Forecast`
mape(dat_test$unemploy, dat_test$simplexp) 

Output:


[1] 8.486869

Holt’s Trend Method

This is an extension of the simple exponential smoothing method which considers the trend component while generating forecasts. This method involves two smoothing equations, one for the level and one for the trend component.
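
The sketch below expresses one step of these two recursions in plain R, with illustrative smoothing parameters supplied by the caller; it is meant only to clarify the update equations, not to replicate the fitted model.


holt_update <- function(y, level, trend, alpha, beta) {
  new_level <- alpha * y + (1 - alpha) * (level + trend)        # level equation
  new_trend <- beta * (new_level - level) + (1 - beta) * trend  # trend equation
  c(level = new_level, trend = new_trend)
}
# the h-step-ahead point forecast is then new_level + h * new_trend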

The first line of code below creates Holt’s trend model and stores it in an object, ‘holt_model’. The second line prints the summary and the forecasts for the next 12 months.


holt_model <- holt(dat_ts, h = 12)
summary(holt_model)

Output:


Forecast method: Holt's method
 
Model Information:
Holt's method
 
Call: holt(y = dat_ts, h = 12)
 
Smoothing parameters: alpha = 0.8743; beta = 0.2251
 
Initial states:
 	l = 2941.2241
 	b = -0.9146
 
sigma:  200.7121
 	AIC     AICc  	BIC
 9344.330 9344.440 9365.898
 
Error measures:
              	ME 	RMSE      MAE    	MPE 	MAPE    MASE
Training set -1.775468 199.9836 152.9176 0.04058417 2.056684 0.16039
                	ACF1
Training set -0.001011626
 
Forecasts:
    	 Point Forecast	Lo 80     Hi 80	Lo 95	Hi 95
Jan 2014  	10194.656 9937.433 10451.879 9801.267 10588.04
Feb 2014   	9973.102 9590.814 10355.390 9388.443 10557.76
Mar 2014   	9751.549 9239.463 10263.634 8968.381 10534.72
Apr 2014   	9529.995 8881.048 10178.943 8537.515 10522.48
May 2014   	9308.442 8514.997 10101.887 8094.972 10521.91
Jun 2014   	9086.888 8141.265 10032.512 7640.682 10533.10
Jul 2014   	8865.335 7759.990  9970.680 7174.856 10555.81
Aug 2014   	8643.782 7371.378  9916.185 6697.808 10589.76
Sep 2014   	8422.228 6975.652  9868.804 6209.881 10634.58
Oct 2014   	8200.675 6573.036  9828.313 5711.416 10689.93
Nov 2014   	7979.121 6163.744  9794.498 5202.741 10755.50
Dec 2014   	7757.568 5747.979  9767.157 4684.167 10830.97

The output above shows that the MAPE for the training data is 2.1 percent. Let us now evaluate the model performance on the test data, which is done in the lines of code below. The MAPE error on the test data comes out to be 6.6 percent, which is an improvement over the previous models.


df_holt = as.data.frame(holt_model)
dat_test$holt = df_holt$`Point Forecast`
mape(dat_test$unemploy, dat_test$holt) 

Output:


[1] 6.597904

ARIMA

ARIMA modeling is one of the most popular approaches to time series forecasting. While exponential smoothing models are based on a description of the trend and seasonality in the data, ARIMA models aim to describe the autocorrelations in the data.

The ‘auto.arima()’ function in ‘R’ builds ARIMA models using a variation of the Hyndman-Khandakar algorithm, which combines unit root tests, minimization of the AICc, and maximum likelihood estimation (MLE) to select and fit an ARIMA model.
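
Since the algorithm starts with unit root tests to choose the order of differencing, you can preview that step yourself; ndiffs() from the ‘forecast’ package returns the estimated number of first differences. This is a supplementary check, not part of the original code.


ndiffs(dat_ts)  # estimated number of first differences (KPSS test by default)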

The first line of code below creates the ARIMA model and stores it in an object ‘arima_model’. The second line prints the summary and the forecasts for the next 12 months.


arima_model <- auto.arima(dat_ts)
summary(arima_model)

Output:


Series: dat_ts
ARIMA(1,1,2)(0,0,2)[12]
 
Coefficients:
      	ar1  	ma1     ma2 	sma1 	sma2
   	0.9277  -0.8832  0.1526  -0.2287  -0.2401
s.e.  0.0248   0.0495  0.0446   0.0431   0.0392
 
sigma^2 estimated as 35701:  log likelihood=-3668.8
AIC=7349.59   AICc=7349.75   BIC=7375.46
 
Training set error measures:
                	ME 	RMSE      MAE   	MPE 	MAPE     MASE
Training set 6.377816 187.9162 143.4292 0.1084702 1.940315 0.150438
           	      ACF1
Training set 0.005512865

The output above shows that the MAPE for the training data is 1.94 percent. Let us now evaluate the model performance on the test data, which is done in the lines of code below. The MAPE error on the test data comes out to be 2.1 percent, which is an improvement over all the previous models.


fore_arima = forecast::forecast(arima_model, h=12)
df_arima = as.data.frame(fore_arima)
dat_test$arima = df_arima$`Point Forecast`
mape(dat_test$unemploy, dat_test$arima)  ## 2.1%

Output:


[1] 2.063779
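
As an optional diagnostic (not part of the original guide), you can refit the selected specification explicitly with Arima() and inspect the residuals; checkresiduals() plots them and runs a Ljung-Box test for remaining autocorrelation.


fit_manual <- Arima(dat_ts, order = c(1, 1, 2), seasonal = c(0, 0, 2))
checkresiduals(fit_manual)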

TBATS

The TBATS model combines several components of the techniques discussed earlier in this guide, making it a very good choice for forecasting.

Its name is an acronym for the following components:

  • T: Trigonometric terms for seasonality
  • B: Box-Cox transformations for heterogeneity
  • A: ARMA errors for short-term dynamics
  • T: Trend
  • S: Seasonal (including multiple and non-integer periods)

The first line of code below creates the TBATS model and stores it in an object ‘model_tbats’. The second line prints the summary and the forecasts for the next 12 months.


model_tbats <- tbats(dat_ts)
summary(model_tbats)

Output:



                  Length Class  Mode
lambda                 1  -none- numeric
alpha                  1  -none- numeric
beta                   1  -none- numeric
damping.parameter      1  -none- numeric
gamma.values           0  -none- NULL
ar.coefficients        0  -none- NULL
ma.coefficients        0  -none- NULL
likelihood             1  -none- numeric
optim.return.code      1  -none- numeric
variance               1  -none- numeric
AIC                    1  -none- numeric
parameters             2  -none- list
seed.states            2  -none- numeric
fitted.values        552  ts     numeric
errors               552  ts     numeric
x                   1104  -none- numeric
seasonal.periods       0  -none- NULL
y                    552  ts     numeric
call                   2  -none- call
series                 1  -none- character
method                 1  -none- character
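
Note that calling summary() on a ‘tbats’ object only lists its internal components, as seen above. Printing the model itself shows the fitted specification in compact form; the exact label depends on the fit.


print(model_tbats)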
 

Let us now evaluate the model performance on the test data, which is done in the lines of code below. The MAPE error on the test data comes out to be 2.2 percent, which is close to the performance of the ARIMA model.


for_tbats <- forecast::forecast(model_tbats, h = 12)
df_tbats = as.data.frame(for_tbats)
dat_test$tbats = df_tbats$`Point Forecast`
mape(dat_test$unemploy, dat_test$tbats) 

Output:


[1] 2.250596

Conclusion

In this guide, you have learned about several forecasting techniques using ‘R’. The performance of the models on the test data is summarized below:

  • Naive Method: MAPE of 8.5 percent
  • Simple Exponential Smoothing: MAPE of 8.5 percent
  • Holt’s Trend Method: MAPE of 6.6 percent
  • ARIMA: MAPE of 2.1 percent
  • TBATS: MAPE of 2.2 percent

The Naive and Simple Exponential Smoothing models provide a useful baseline with a MAPE of 8.5 percent. Holt’s Trend Method improves on that baseline, while the ARIMA and TBATS models emerge as the winners based on their test-data performance, with MAPE values close to 2 percent.
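
If you want to reproduce this comparison in one pass, the following sketch (assuming the forecast columns added throughout the guide are still present in dat_test) computes all five test-set MAPE values together.


sapply(c("naive", "simplexp", "holt", "arima", "tbats"),
       function(col) mape(dat_test$unemploy, dat_test[[col]]))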

 
