Suresh-Rose Time Series Forecasting Project Report

Rose Wine Sales Time Series Forecasting
Project Report

Suresh Veeraraghavan
3/12/2023

Contents
Rose Wine Sale Time Series Forecasting
Executive Summary
Data Dictionary
1 Read the data as an appropriate time series data and plot the data.
1.1 Dataset Sample:
1.2 Missing data
2 Perform appropriate Exploratory Data Analysis to understand the data & also perform decomposition.
2.1 Five Point Summary
2.2 Dataset Info
2.3 Year wise Box Plot
2.4 Month wise Box Plot
2.5 Month plot with median
2.6 Pivot table view
2.7 Empirical Distribution
2.8 Average and Sale percentage change
2.9 Decomposition of Time Series – Additive
2.10 Decomposition of Time Series – Multiplicative
3 Split the data into training and test. The test data should start in 1991.
3.1 Sample of data split
4 Build all the exponential smoothing models
4.1 Linear Regression Model
4.1.1 Test RMSE – Linear Regression
4.2 Naïve Forecast
4.2.1 Test RMSE – Naïve Model
4.3 Simple Average
4.3.1 Test RMSE – Simple Average Model
4.4 Moving Average (MA)
4.4.1 Test RMSE – Moving Average
4.5 Simple Exponential Smoothing (SES) - ETS(A, N, N)
4.5.1 Smoothing parameters
4.5.2 Test RMSE – SES
4.6 Double Exponential Smoothing - ETS(A, A, N)
4.6.1 DES Smoothing parameters
4.6.2 RMSE Test – DES
4.7 Holt Winter's linear method with additive errors (Triple Exponential Additive Smoothing) - ETS(A, A, A)
4.7.1 Triple Exponential Additive Smoothing parameters
4.7.2 TEST RMSE – Triple Exponential Additive Smoothing
4.8 Holt Winter's linear method – multiplicative (TES) – ETS(A, A, M)
4.8.1 Parameters
4.8.2 TEST RMSE – TES Multiplicative
4.9 Holt Winter's linear method with additive errors - Using Damped Trend - ETS(A, A, A)
4.9.1 TES additive – Damped Trend parameters
4.9.2 TEST RMSE – TES additive Damped Trend
4.10 Holt Winter's linear method - multiplicative - using DAMPED TREND - ETS(A, A, M)
4.10.1 TES multiplicative – Damped Trend parameters
4.10.2 TEST RMSE – TES multiplicative Damped Trend
4.11 Inference/Conclusion based on the model build so far:
5 Check for the stationarity of the data on which the model is being built on using appropriate statistical tests and also mention the hypothesis for the statistical test. If the data is found to be non-stationary, take appropriate steps to make it stationary. Check the new data for stationarity and comment. Note: Stationarity should be checked at alpha = 0.05
5.1 Data Stationarity verification:
5.1.1 Dicky Fuller test - check for stationarity of the time series
5.1.2 One order difference result
5.1.3 Time series plot before and after one order difference
6 Build an automated version of the ARIMA/SARIMA model in which the parameters are selected using the lowest Akaike Information Criteria (AIC) on the training data and evaluate this model on the test data using RMSE.
6.1.1 ACF and PACF before one order difference on full data
6.1.2 ACF and PACF after performing one order difference on full data
6.1.3 ACF and PACF for Train dataset with one order difference
6.2 ARIMA Automated
6.2.2 RMSE – ARIMA Automated
6.2.3 Automated ARIMA Prediction
6.3 SARIMA Automated
6.3.2 Predicted sample test data
6.3.3 RMSE – Automated SARIMA
6.3.4 Automated SARIMA prediction
7 Build ARIMA/SARIMA models based on the cut-off points of ACF and PACF on the training data and evaluate this model on the test data using RMSE.
7.1 ACF & PACF plot with one difference
7.2 ARIMA Manual Model (2,1,2)
7.2.1 RMSE – Manual ARIMA
7.2.2 Manual ARIMA prediction
7.3 SARIMA Manual Model
7.3.1 ACF and PACF with difference of 6 and diff in train dataset to identify the P and Q
7.3.2 Manual SARIMA Model 1: (2,1,2) (3, 0, 1, 12)
7.3.3 Manual SARIMA Model 2: (2,1,2) (0, 0, 1, 12)
7.3.4 Manual SARIMA Model 3: (3,1,2) (2, 0, 1, 12)
7.3.5 RMSE – Manual SARIMA Models
7.3.6 SARIMA Models Prediction
8 Build a table with all the models built along with their corresponding parameters and the respective RMSE values on the test data.
9 Based on the model-building exercise, build the most optimum model(s) on the complete data and predict 12 months into the future with appropriate confidence intervals/bands.
9.1 Best Model fitting
9.2 12 months prediction
10 Comment on the model thus built and report your findings and suggest the measures that the company should be taking for future sales.
10.1 Findings on the dataset
10.2 Comments on the model build
10.3 Suggestion based on the analysis we performed

List of Figures
Figure 1 - First 5 rows
Figure 2 - Last 5 rows
Figure 3 - Time Series of Rose Wine Sale
Figure 4 - Missing data
Figure 5 - Five Point Summary
Figure 6 - Dataset Info
Figure 7 - Year wise Box plot
Figure 8 - Month wise Box plot
Figure 9 - Month wise with median
Figure 10 - Pivot table view
Figure 11 - Empirical distribution
Figure 12 - Average and Sale percentage change
Figure 13 - Decomposition - Additive
Figure 14 - Decomposition - Additive values
Figure 15 - Decomposition Additive Residual
Figure 16 - Decomposition - Multiplicative
Figure 17 - Residuals - Multiplicative decomposition
Figure 18 - Train and Test Split graph
Figure 19 - Train and Test Split sample records
Figure 20 - Linear Regression
Figure 21 - Linear Regression test RMSE
Figure 22 - Naive Forecast
Figure 23 - Naive test RMSE
Figure 24 - Simple Average
Figure 25 - Simple Average RMSE
Figure 26 - Moving Average 2, 4, 6 and 9 point
Figure 27 - Moving Average 2 point
Figure 28 - Moving Average 4 point
Figure 29 - Moving Average 6 point
Figure 30 - Moving Average 9 point
Figure 31 - Moving Average RMSE
Figure 32 - SES Parameters
Figure 33 - data prediction using SES
Figure 34 - SES test data prediction graph
Figure 35 - SES RMSE
Figure 36 - DES Smoothing Parameters
Figure 37 - DES Smoothing graph
Figure 38 - DES RMSE
Figure 39 - Triple Exponential Additive Smoothing parameters
Figure 40 - Smoothing models SES, DES & TES
Figure 41 - Triple Exponential Additive Smoothing
Figure 42 - RMSE TES
Figure 43 - Holt Winter's Parameters – Multiplicative
Figure 44 - Prediction of TES Multiplicative
Figure 45 - RMSE – TES Multiplicative
Figure 46 - TES Damped Trend parameters
Figure 47 - RMSE TES additive Damped Trend
Figure 48 - TES multiplicative – Damped Trend parameters
Figure 49 - RMSE – TES multiplicative Damped Trend
Figure 50 - 2 Point Trailing Moving Average
Figure 51 - Dickey Fuller Test
Figure 52 - Rolling Mean and Standard Deviation
Figure 53 - Dickey Fuller test after one order diff
Figure 54 - Rolling Mean and std dev after 1 order diff
Figure 55 - Time series before and after 1 order diff
Figure 56 - ACF Full data
Figure 57 - PACF Full data
Figure 58 - ACF Full data with one order diff
Figure 59 - PACF Full data with one order diff
Figure 60 - ACF Train data - with one order difference
Figure 61 - ARIMA automated parameters
Figure 62 - top 5 from ARIMA automated model
Figure 63 - Auto ARIMA Plot
Figure 64 - RMSE ARIMA Automated
Figure 65 - Auto ARIMA 2.1.2
Figure 66 - SARIMA Automated parameters
Figure 67 - Top 5 best model - lowest AIC scores
Figure 68 - Automated SARIMA Result
Figure 69 - Automated SARIMA Plot
Figure 70 - SARIMA sample predicted test data
Figure 71 - RMSE Auto SARIMA
Figure 72 - SARIMA test prediction
Figure 73 - Manual ARIMA results
Figure 74 - Manual ARIMA plot
Figure 75 - RMSE Manual ARIMA
Figure 76 - Manual ARIMA prediction
Figure 77 - Full data plot with diff 6 + diff
Figure 78 - Mean and std Dev plot with diff 6 + diff
Figure 79 - Dickey Fuller test for diff 6
Figure 80 - ACF Train set with diff 6 + diff
Figure 81 - ACF Train set with diff 6 + diff
Figure 82 - Manual SARIMA Model 1 results
Figure 83 - SARIMA Model 1 Plot
Figure 84 - Manual SARIMA Model 2 results
Figure 85 - SARIMA Model 2 Plot
Figure 86 - Manual SARIMA Model 3 results
Figure 87 - SARIMA Model 3 Plot
Figure 88 - RMSE Manual SARIMA Models
Figure 89 - SARIMA Prediction Model 1
Figure 90 - SARIMA Prediction Model 2
Figure 91 - SARIMA Prediction Model 3
Figure 94 - Forecast of next 12 months
Figure 95 - 12 Months prediction
Figure 96 - Time series plot
Figure 97 - Month wise plot

List of Tables
Table 1 – All Models
Table 2 – Top 5 Best Models

Rose Wine Sale Time Series Forecasting

Executive Summary
For this assignment, sales data for different types of wine in the 20th century are to be analysed. The
datasets come from the same company but cover different wines. As an analyst at ABC Estate Wines, you
are tasked with analysing and forecasting wine sales in the 20th century. This document presents the
business report for the Rose wine sales time series forecast.

Data Dictionary
• The Rose dataset has two columns, Year-Month and the corresponding sale quantity of Rose wine,
covering the years 1980 to 1995.

1 Read the data as an appropriate time series data and plot the data.
• The Rose dataset is stored in a DataFrame for analysis.
• The Rose dataset has 187 rows.
• There are 2 null/missing values in the dataset.
• Interpolation is used to impute the 2 missing values.
• The YearMonth column is converted to the index so that time series forecasting can be performed.
• The Rose column contains the sale quantity for each month and is of the Int64 datatype.

1.1 Dataset Sample:

Figure 1 - First 5 rows

Figure 2 - Last 5 rows

• The figures above show the first and last 5 records of the dataset.

Figure 3 - Time Series of Rose Wine Sale

• Level, trend and seasonality are visible in the Rose time series graph.

1.2 Missing data

Figure 4 - Missing data

Out of 187 records, 2 are null. Interpolation is used to impute the 2 missing values.
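The imputation step can be sketched as below. The report does not state which interpolation method was used, so linear interpolation is an assumption here, and the sample values are made up rather than taken from the Rose data. The helper `interpolate_linear` is a hypothetical name; in pandas the equivalent call would be `Series.interpolate()`, whose default method is linear.

```python
from typing import List, Optional


def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) along a straight line between the known neighbours.

    Gaps at the very start or end have no neighbour on one side and are left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        if gap > 1:
            step = (filled[right] - filled[left]) / gap
            for k in range(1, gap):
                filled[left + k] = filled[left] + step * k
    return filled


# Made-up monthly values with two missing entries (not the Rose data).
sales = [45.0, None, 55.0, 60.0, None, 40.0]
print(interpolate_linear(sales))  # [45.0, 50.0, 55.0, 60.0, 50.0, 40.0]
```

Each missing month is replaced by the value on the straight line between the months on either side of it, which keeps the 187-row monthly index intact.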

2 Perform appropriate Exploratory Data Analysis to
understand the data & also perform decomposition.

2.1 Five Point Summary

Figure 5 - Five Point Summary

2.2 Dataset Info

Figure 6 - Dataset Info

• The Rose dataset has 187 rows. The missing values in the dataset have been treated.

2.3 Year wise Box Plot

Figure 7 - Year wise Box plot

• The year-wise box plot above shows that most years have outliers.

2.4 Month wise Box Plot

Figure 8- Month wise Box plot

• The month-wise box plot above shows that June, July, August and September have outliers.
• Across the years, December shows the highest sales.
• April shows the lowest sales across the years.
• This box plot indicates that seasonality is present in the Rose dataset.

2.5 Month plot with median

Figure 9 - Month wise with median

• This plot shows the behavior of the time series ('Rose Wine Sales' in this case) across the months.
The red line is the median value.
• As already seen, December has the highest sales.

2.6 Pivot table view

Figure 10 - Pivot table view

• Rose data are grouped month-wise.
• Months are represented by the numbers 1 to 12.
• The highest sale for each month is highlighted in yellow.
• The largest sales of the year occur in December. The best sales month was December 1980,
with 267 units of Rose wine.
• This pivot shows how Rose wine sales decline year by year. Most of the highest sales fall
between 1980 and 1982.

2.7 Empirical Distribution

Figure 11 - Empirical distribution

• This graph shows what percentage of the data points fall at or below a given sales value.
• 85% of the sales are below 115.
• The maximum sale is close to 260.
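A reading such as "85% of the sales are below 115" comes from the empirical cumulative distribution. The sketch below shows how such a share could be computed; the function names `ecdf` and `share_below` are hypothetical and the sample values are made up, so the printed share is not the report's figure.

```python
from typing import List, Tuple


def ecdf(values: List[float]) -> List[Tuple[float, float]]:
    """Return (value, share of observations <= value) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def share_below(values: List[float], threshold: float) -> float:
    """Fraction of observations strictly below the threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


# Made-up sales values (not the Rose data).
sales = [30, 45, 60, 75, 90, 100, 110, 112, 120, 260]
print(share_below(sales, 115))  # 0.8
print(ecdf(sales)[-1])          # (260, 1.0): the maximum always sits at 100%
```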

2.8 Average and Sale percentage change

Figure 12 - Average and Sale percentage change

• The two graphs above show the average 'Rose Wine Sales' and the percentage change of 'Rose
Wine Sales' with respect to time.

2.9 Decomposition of Time Series – Additive

• y_t = Trend + Seasonality + Residual

Figure 13 - Decomposition - Additive

• The decomposition above shows the downward trend present in the dataset.
• Strong seasonality is present in the Rose wine sales dataset.
• A few residuals are large, but most stay close to 0.

Figure 14 - Decomposition - Additive values

• November and December show the strongest seasonal effect.

Figure 15 - Decomposition Additive Residual

• The plot of the residuals from the decomposition shows that they are located around 0.
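For reference, the additive split y_t = Trend + Seasonality + Residual can be sketched with the classical moving-average method. This is an illustration under assumptions, not the report's code: the report most likely used a library routine such as statsmodels' `seasonal_decompose`, and the function name `decompose_additive` and the synthetic series below are hypothetical.

```python
from typing import List, Optional, Tuple


def decompose_additive(
    y: List[float], period: int = 12
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Classical additive decomposition: y_t = trend_t + seasonal_t + residual_t.

    Needs at least two full cycles of data so every seasonal position is observed.
    """
    n, half = len(y), period // 2
    trend: List[Optional[float]] = [None] * n
    for t in range(half, n - half):
        window = y[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on the two end points (a 2 x period average).
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    # Average the detrended values for each position in the cycle.
    buckets: List[List[float]] = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(y[t] - trend[t])
    means = [sum(b) / len(b) for b in buckets]
    centre = sum(means) / period  # shift so the seasonal terms sum to zero
    seasonal = [means[t % period] - centre for t in range(n)]
    residual = [
        None if trend[t] is None else y[t] - trend[t] - seasonal[t] for t in range(n)
    ]
    return trend, seasonal, residual


# Synthetic series: falling trend plus a fixed monthly pattern (not the Rose data).
pattern = [-20, -15, -10, -5, 0, 5, 10, 5, 0, 5, 10, 15]
y = [100 - 0.5 * t + pattern[t % 12] for t in range(48)]
trend, seasonal, residual = decompose_additive(y)
print(round(trend[6], 6), round(seasonal[11], 6))
```

The trend is undefined for the first and last half-cycle because the centred window does not fit there, which is why the residual plot of a decomposition starts and ends later than the series itself.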

2.10 Decomposition of Time Series – Multiplicative
• y_t = Trend * Seasonality * Residual

Figure 16 - Decomposition - Multiplicative

• The decomposition above shows the trend present in the dataset.
• Strong seasonality is present in the Rose wine sales dataset.

Figure 17 - Residuals - Multiplicative decomposition

• For the multiplicative series, many of the residuals are located around 1.
• The multiplicative decomposition fits better than the additive decomposition for the Rose dataset.

3 Split the data into training and test. The test data
should start in 1991.
• The Rose dataset is split into train and test sets at the year 1991.
• Sales counts from 1980 to 1990 are taken as the train dataset.
• Sales counts from 1991 to 1995 are taken as the test dataset.

Train dataset has 132 records.
Test dataset has 55 records.
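The record counts can be checked with a short sketch. It assumes the series starts in January 1980 and, given 187 monthly rows, ends in July 1995; the end month is inferred from the row count rather than stated in the report.

```python
from typing import List, Tuple

# Monthly periods from January 1980 to July 1995 (assumed range, 187 months).
months: List[Tuple[int, int]] = [
    (year, month)
    for year in range(1980, 1996)
    for month in range(1, 13)
    if (year, month) <= (1995, 7)
]

SPLIT_YEAR = 1991  # first year of the test set
train = [m for m in months if m[0] < SPLIT_YEAR]
test = [m for m in months if m[0] >= SPLIT_YEAR]

print(len(months), len(train), len(test))  # 187 132 55
```

Splitting by date rather than at random keeps the test set strictly after the training set, so every model is evaluated on periods it has not seen.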

3.1 Sample of data split

Figure 18 - Train and Test Split graph


• In the graph above, blue represents the train dataset and orange represents the test dataset.

Figure 19 - Train and Test Split sample records

4 Build all the exponential smoothing models

4.1 Linear Regression Model


Linear regression is a commonly used statistical method for modeling the relationship between a
dependent variable and one or more independent variables. While it is often used for cross-sectional data,
it can also be applied to time series data.

In time series data, the dependent variable is a variable that changes over time, and the independent
variable(s) are typically other time-varying variables that may influence the dependent variable. Linear
regression can therefore be a useful tool for modeling time series data.

The linear regression equation for a time series data can be written as:

y(t) = β0 + β1x1(t) + β2x2(t) + ... + βkxk(t) + ε(t)

where y(t) is the dependent variable at time t, x1(t), x2(t), ..., xk(t) are the k independent variables at time
t, β0, β1, β2, ..., βk are the corresponding coefficients or parameters to be estimated, and ε(t) is the error
term at time t.
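
As an illustration of the simplest case, the sketch below fits y(t) = β0 + β1 t with the time index as the only regressor. Using the time index as the regressor is an assumption made for this example, and the function names and sample values are hypothetical.

```python
# Minimal sketch: ordinary least squares of y on the time index t = 1..n.
# Function names and the sample data are illustrative assumptions.

def fit_time_regression(y):
    """Return (intercept, slope) of the least-squares line through (t, y_t)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def predict_time_regression(intercept, slope, start, steps):
    """Extrapolate the fitted line for time indices start..start+steps-1."""
    return [intercept + slope * t for t in range(start, start + steps)]

train_y = [10.0, 8.0, 6.0, 4.0]           # toy, steadily declining series
b0, b1 = fit_time_regression(train_y)
forecast = predict_time_regression(b0, b1, start=len(train_y) + 1, steps=2)
print(b0, b1, forecast)
```

Because the forecast is a straight line, a model of this kind can follow a trend but cannot reproduce seasonal peaks.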

Figure 20- Linear Regression

 The above graph makes it quite evident that linear regression doesn't do well on the test
dataset. The linear regression forecast is shown in green

4.1.1 Test RMSE – Linear Regression

Figure 21 - Linear Regression test RMSE

4.2 Naïve Forecast

For this naive model, the prediction for tomorrow is the same as today's value. The prediction for the
day after tomorrow equals the prediction for tomorrow, which is again today's value, so every future
period is forecast with the last observed value.

\hat{y}_{t+1} = y_t
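
A minimal sketch of the naive forecast and of the test RMSE used to compare models throughout this report is given below. The function names and the sample numbers are illustrative assumptions, not the report's data.

```python
import math

# Minimal sketch: naive forecast plus RMSE. Sample values are made up.

def naive_forecast(train, horizon):
    """Repeat the last training observation for every future period."""
    return [train[-1]] * horizon

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

train = [112.0, 118.0, 129.0]
test = [126.0, 132.0]
pred = naive_forecast(train, len(test))
score = rmse(test, pred)
print(pred, score)
```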

Figure 22 - Naive Forecast

 The above graph makes it quite evident that the Naïve model doesn't do well on the test dataset.
The Naïve forecast is shown in green

4.2.1 Test RMSE – Naïve Model

Figure 23 - Naive test RMSE

 The Naïve model's RMSE is higher than the Linear Regression model's RMSE

4.3 Simple Average


Simple average forecast is a forecasting method in time series analysis that involves using the arithmetic
mean of past observations as a predictor for future values.

To use this method, you would simply calculate the average of the historical data and use it as a forecast
for all future time periods.

F_t+1 = (Y_1 + Y_2 + ... + Y_t) / t

where:

F_t+1 is the forecast for the next time period (t+1)

Y_1, Y_2, ..., Y_t are the historical observations up to time t

t is the number of historical observations
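
The formula above can be sketched in a few lines. The function name and the sample values are illustrative assumptions.

```python
# Minimal sketch of the simple average forecast: the mean of all
# training observations is used for every future period.

def simple_average_forecast(train, horizon):
    mean = sum(train) / len(train)
    return [mean] * horizon

train = [100.0, 120.0, 140.0]   # made-up values
pred = simple_average_forecast(train, horizon=3)
print(pred)
```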

Figure 24 - Simple Average

 The above graph makes it quite evident that the Simple Average model doesn't do well on the
test dataset. The Simple Average forecast is shown in green

4.3.1 Test RMSE – Simple Average Model

Figure 25 - Simple Average RMSE

4.4 Moving Average (MA)
Moving Average (MA) is a time series forecasting method that involves calculating the average of a fixed
number of past observations to forecast future values. The "moving" part of the name refers to the fact
that the window of observations used to calculate the average moves forward in time with each new
forecast.

Rolling means (also known as moving averages) are computed for several window sizes. The window with
the lowest error (highest accuracy) is taken as the best interval.
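
A trailing moving average can be sketched as below. This is an illustration under the assumption that the average is taken over the current and the preceding observations of the series, leaving the first few positions undefined until the window is full. The function name and the values are hypothetical.

```python
# Minimal sketch of a trailing (rolling) moving average.
# Positions without a full window are returned as None.

def trailing_moving_average(values, window):
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)                      # window not yet full
        else:
            chunk = values[i + 1 - window: i + 1]
            out.append(sum(chunk) / window)
    return out

values = [10.0, 20.0, 30.0, 40.0]                 # made-up values
ma2 = trailing_moving_average(values, 2)
ma4 = trailing_moving_average(values, 4)
print(ma2, ma4)
```

Note that a short window tracks the latest observations closely, which is one reason a 2 point average can score a low error when it is computed from values adjacent to the ones being predicted.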

Figure 26 - Moving Average 2, 4, 6 and 9 point

Figure 27 - Moving Average 2 point

Figure 28 - Moving Average 4 point

Figure 29 - Moving Average 6 point

Figure 30 - Moving Average 9 point

 The above graphs make it quite evident that the 4 point, 6 point and 9 point moving average models
don't do well on the test dataset.
 The 2 point moving average performs better than the other 3 moving averages
 Let's check the RMSE to confirm that the 2 point moving average is better than the other moving averages

4.4.1 Test RMSE – Moving Average

Figure 31 - Moving Average RMSE

 The test RMSE for the 2 point moving average is lower than for the other moving averages
 A lower RMSE indicates better performance on the test dataset
 The 4 point moving average is the second best of the moving averages plotted above

Exponential Smoothing methods

Exponential smoothing is a family of time series forecasting methods that involves giving more weight to
recent observations while decreasing the weight of older observations exponentially over time. This
approach is based on the assumption that recent observations are more informative than older ones and
that trends and patterns in the data may change over time.

The following Exponential Smoothing models will be built to check their performance on the test dataset

• Single Exponential Smoothing with Additive Errors – ETS (A, N, N)

• Double Exponential Smoothing with Additive Errors, Additive Trends – ETS (A, A, N)

• Triple Exponential Smoothing with Additive Errors, Additive Trends, Additive Seasonality – ETS (A, A, A)

• Triple Exponential Smoothing with Additive Errors, Additive Trends, Multiplicative Seasonality – ETS (A, A, M)

• Triple Exponential Smoothing with Additive Errors, Additive DAMPED Trends, Additive Seasonality – ETS (A, Ad, A)

• Triple Exponential Smoothing with Additive Errors, Additive DAMPED Trends, Multiplicative Seasonality – ETS (A, Ad, M)

4.5 Simple Exponential Smoothing (SES) - ETS(A, N, N)


Simple exponential smoothing is the most basic form of exponential smoothing and is used to forecast a
time series that does not exhibit any trend or seasonal patterns. The forecast is based on a weighted
average of past observations, with the weights decreasing exponentially as the observations get older.
The formula for simple exponential smoothing is:

F_t+1 = α * Y_t + (1-α) * F_t

Where:

F_t+1 is the forecast for the next time period (t+1)


Y_t is the actual value of the time series at time t
F_t is the forecast for the current time period (t)

α is the smoothing parameter, also known as the smoothing constant, which determines the weight given
to the most recent observation. It ranges from 0 to 1.
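
The recursion above can be sketched as follows. The initial forecast F_1 is set to the first observation, which is an assumption made for this example (fitted models usually estimate the initial level). The function name and values are illustrative.

```python
# Minimal sketch of simple exponential smoothing:
#   F_{t+1} = alpha * Y_t + (1 - alpha) * F_t
# Assumption: F_1 is initialised with the first observation.

def ses_forecast(train, alpha, horizon):
    forecast = train[0]
    for y in train:
        forecast = alpha * y + (1 - alpha) * forecast
    # SES has no trend or seasonality, so the forecast is flat.
    return [forecast] * horizon

pred = ses_forecast([10.0, 20.0], alpha=0.5, horizon=2)
print(pred)
```

A small alpha, such as the fitted 0.098750, means each new observation moves the forecast only slightly.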

4.5.1 Smoothing parameters

Figure 32 - SES Parameters

 For Simple Exponential Smoothing, the value of the alpha parameter is taken as 0.098750

Below is the data prediction using Simple Exponential Smoothing (SES)

Figure 33 - data prediction using SES

Figure 34 - SES test data prediction graph

 The above graph makes it quite evident that the SES model doesn't do well on the test dataset.
The SES forecast is shown in green

4.5.2 Test RMSE – SES

Figure 35 - SES RMSE

 Alpha parameter value is 0.098750 and the test RMSE value is 36.80

4.6 Double Exponential Smoothing - ETS(A, A, N)


 One of the drawbacks of the simple exponential smoothing is that the model does not do well in
the presence of the trend.
 This model is an extension of SES known as Double Exponential model which estimates two
smoothing parameters.
 Applicable when data has Trend but no seasonality.
 Two separate components are considered: Level and Trend.
 Level is the local mean.
 One smoothing parameter α corresponds to the level series
 A second smoothing parameter β corresponds to the trend series.

Double Exponential Smoothing uses two equations to forecast future values of the time series, one for
forecasting the short term average value or level and the other for capturing the trend.

Intercept or Level equation, L_t, is given by:

L_t = α Y_t + (1 - α) F_t

Trend equation is given by:

T_t = β (L_t - L_{t-1}) + (1 - β) T_{t-1}

Here, α and β are the smoothing constants for level and trend, respectively, 0 < α < 1 and 0 < β < 1.
The forecast at time t + 1 is given by
F_{t+1} = L_t + T_t
and the forecast n periods ahead is given by
F_{t+n} = L_t + n T_t
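
The level, trend and forecast equations can be sketched as below. The initial level and trend (first observation and first difference) are assumptions made for this example, and the function name and data are hypothetical.

```python
# Minimal sketch of double exponential smoothing (Holt's method).
# Assumed initial states: level = first value, trend = first difference.

def holt_forecast(train, alpha, beta, horizon):
    level = train[0]
    trend = train[1] - train[0]
    for y in train[1:]:
        prev_level = level
        one_step = prev_level + trend                       # F_t = L_{t-1} + T_{t-1}
        level = alpha * y + (1 - alpha) * one_step          # L_t
        trend = beta * (level - prev_level) + (1 - beta) * trend  # T_t
    # F_{t+n} = L_t + n * T_t
    return [level + n * trend for n in range(1, horizon + 1)]

pred = holt_forecast([1.0, 2.0, 3.0, 4.0], alpha=0.3, beta=0.2, horizon=3)
print(pred)
```

On a perfectly linear series the forecast simply continues the line, whatever values alpha and beta take.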

4.6.1 DES Smoothing parameters

Figure 36 - DES Smoothing Parameters

 Parameters are auto-fitted as shown in the above figure: Alpha=1.490116e-08, Beta=1.661039e-10

Figure 37 - DES Smoothing graph

 The above graph makes it quite evident that the DES model doesn't do well on the test dataset.
The SES forecast is shown in green and the DES forecast is shown in red

4.6.2 RMSE Test – DES

Figure 38 - DES RMSE

4.7 Holt Winter's linear method with additive errors (Triple
Exponential Additive Smoothing) - ETS (A, A, A)
Holt-Winters smoothing is a statistical technique used to forecast time-series data. It is an extension of
simple exponential smoothing and is used to model data that exhibits trends and seasonality.

The Holt-Winters method involves smoothing the data with three separate smoothing factors:

Level smoothing: This factor, alpha, smooths out the random noise in the data and captures the overall
level of the time series.

Trend smoothing: This factor, beta, captures the rate of change of the time series trend over time.

Seasonality smoothing: This factor, gamma, captures the seasonal variations in the data over a fixed
period of time.
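
The three smoothing steps can be sketched as follows. This is one common parameterisation of additive Holt-Winters with simple assumed initial states (first-season mean as level, average season-to-season change as trend), not the optimiser-fitted model from this report. The function name and data are hypothetical.

```python
# Minimal sketch of additive Holt-Winters (triple exponential smoothing).
# Initial states are simple assumptions, not fitted values.

def holt_winters_additive(train, m, alpha, beta, gamma, horizon):
    if len(train) < 2 * m:
        raise ValueError("need at least two full seasons")
    level = sum(train[:m]) / m
    trend = (sum(train[m:2 * m]) - sum(train[:m])) / (m * m)
    seasonals = [train[i] - level for i in range(m)]

    for t in range(m, len(train)):
        y = train[t]
        s_prev = seasonals[t % m]
        prev_level = level
        # Level: deseasonalised observation vs. previous level plus trend
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level vs. previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonality: detrended observation vs. previous seasonal value
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s_prev

    n = len(train)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]

series = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]   # toy series with period 2
forecast = holt_winters_additive(series, m=2, alpha=0.3, beta=0.1,
                                 gamma=0.2, horizon=3)
print(forecast)
```

For monthly data such as the Rose series, m would be 12.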

4.7.1 Triple Exponential Additive Smoothing parameters

Figure 39 - 4.7.1 Triple Exponential Additive Smoothing parameters

 As shown above, the Holt-Winters model also includes many seasonal parameters
 Alpha is 0.089541, beta is 0.000240 and gamma is 0.003467

Figure 40 – Smoothing models SES, DES & TES

 The above graph makes it quite evident that the TES model fits the test dataset well. SES and
DES don't fit well; they are shown in green and red respectively.

Figure 41 - Triple Exponential Additive Smoothing

 The Triple Exponential Smoothing additive model predicts well on the test dataset

4.7.2 TEST RMSE – Triple Exponential Additive Smoothing

Figure 42 - RMSE TES

 TES has a lower RMSE than the models built so far, and this is reflected in the test prediction:
the prediction graph comes closer to the test dataset

Inference

Triple Exponential Additive Smoothing has performed the best on the test as expected since the data had
both trend and seasonality. This model could be the best model.

But we see that triple exponential smoothing is under forecasting. Let us try to tweak some of the
parameters in order to get a better forecast on the test set.

4.8 Holt Winter's linear method – multiplicative (TES) – ETS (A, A, M)
4.8.1 Parameters

Figure 43 - Holt Winter's Parameters – Multiplicative

 As shown above, the Holt-Winters model also includes many seasonal parameters
 Alpha is 0.071511, beta is 0.045292 and gamma is 0.000072

Figure 44 - Prediction of TES Multiplicative

 From the above graph alone we cannot conclude that the TES multiplicative model performs well on the
test data. We need to compare RMSE values to decide which TES model performs better

4.8.2 TEST RMSE – TES Multiplicative

Figure 45 - RMSE – TES Multiplicative

 The RMSE values show that the multiplicative seasonality model has not
done as well as the additive seasonality Triple Exponential Smoothing model.
 The RMSE of the TES multiplicative model, 20.16, is higher than that of the TES additive model, 14.24

4.9 Holt Winter's linear method with additive errors - Using Damped
Trend - ETS(A, Ad, A)

Damped trend additive method is a forecasting technique used to predict time-series data that exhibit a
trend, where the trend is expected to decrease or dampen over time. The method is a variation of the
additive method and involves adding a damping factor to the trend component.

The damped trend additive method can be represented by the following equation:

y(t) = L(t-1) + D * T(t-1) + S(t-m)

Where:

y(t) is the forecasted value at time t.

L(t-1) is the level component at time t-1.

T(t-1) is the trend component at time t-1.

S(t-m) is the seasonal component at time t-m.

D is the damping factor, which is a value between 0 and 1 that reduces the magnitude of the trend over
time.
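
The effect of the damping factor over a longer horizon can be sketched as below. In the standard damped-trend formulation the h-step forecast adds (D + D^2 + ... + D^h) times the trend to the level. The seasonal term is left out for brevity, and the function name and numbers are illustrative assumptions.

```python
# Minimal sketch of how a damping factor flattens the trend contribution.
# Seasonal term omitted; level, trend and phi values are made up.

def damped_trend_path(level, trend, phi, horizon):
    out = []
    damp = 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h                 # phi + phi^2 + ... + phi^h
        out.append(level + damp * trend)
    return out

damped = damped_trend_path(100.0, 10.0, phi=0.5, horizon=3)
undamped = damped_trend_path(100.0, 10.0, phi=1.0, horizon=3)
print(damped, undamped)
```

With a damping factor close to 1, such as the fitted values near 0.98 in this report, the trend flattens only slowly.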

4.9.1 TES additive – Damped Trend parameters

Figure 46 - TES Damped Trend parameters

 The above parameter list shows the various seasonal parameters, along with
Alpha=0.073686, Beta=0.009798, Gamma=0.073301, damping_trend=0.975626

 We couldn't infer from the preceding graph that the TES additive Damped Trend model performed
well in test data. In order to determine which TES model performs best, we must compare RMSE.

4.9.2 TEST RMSE – TES additive Damped Trend

Figure 47 - RMSE TES additive Damped Trend

 TES additive damped trend model performed well on the test prediction.
 Among the various tunings we did in TES models “Additive Damped Trend” has the lowest RMSE
14.25

4.10 Holt Winter's linear method - multiplicative - using DAMPED
TREND - ETS(A, Ad, M)

Damped trend multiplicative method is a forecasting technique used to predict time-series data that
exhibit a trend, where the trend is expected to decrease or dampen over time. The method is a variation
of the multiplicative method and involves adding a damping factor to the trend component.

The damped trend multiplicative method can be represented by the following equation:

y(t) = (L(t-1) + D * T(t-1)) * S(t-m)

Where:

y(t) is the forecasted value at time t.

L(t-1) is the level component at time t-1.

T(t-1) is the trend component at time t-1.

S(t-m) is the seasonal component at time t-m.

D is the damping factor, which is a value between 0 and 1 that reduces the magnitude of the trend over
time.

4.10.1 TES multiplicative – Damped Trend parameters

Figure 48 - 4.10.1 TES multiplicative – Damped Trend parameters

 The above parameter list shows the various seasonal parameters, along with
Alpha=7.339816e-07, Beta=3.874478e-07, Gamma=5.495855e-07, damping_trend=9.795710e-01

 We couldn't infer from the preceding graph that the TES multiplicative Damped Trend model
performed well in test data. In order to determine which TES model performs best, we must
compare RMSE.

4.10.2 TEST RMSE – TES multiplicative Damped Trend

Figure 49 - RMSE – TES multiplicative Damped Trend

 The TES multiplicative damped trend model performed better on the test prediction than the additive
damped trend model

4.11 Inference/Conclusion based on the models built so far:
So far we have seen the performance of 13 models. Among these 13 models, the “2 point Trailing Moving
Average” performed best. It has the lowest RMSE value, 11.53

Best Model:

2 Point Trailing Moving Average:

Figure 50 - 2 Point Trailing Moving Average


As shown in the above figure, the 2 Point Trailing Moving Average predicted well on the test data. Its RMSE
value is also lower than that of the other 12 models built so far.

5 Check for the stationarity of the data on which the
model is being built on using appropriate statistical tests
and also mention the hypothesis for the statistical test. If the
data is found to be non-stationary, take appropriate steps to
make it stationary. Check the new data for stationarity and comment.
Note: Stationarity should be checked at alpha = 0.05

5.1 Data Stationarity verification:

Stationarity, also known as stationarity assumption, is a fundamental concept in time series analysis that
refers to the property of a time series data where the statistical properties of the data remain constant
over time. In other words, a stationary time series has a constant mean, constant variance, and constant
autocorrelation structure over time.

The Augmented Dickey-Fuller test is a unit root test which determines whether there is a unit root and
subsequently whether the series is non-stationary.

The hypothesis in a simple form for the ADF test is:

𝐻0: The Time Series has a unit root and is thus non-stationary.

𝐻1: The Time Series does not have a unit root and is thus stationary.

We would want the series to be stationary for building ARIMA models and thus we would want the p-
value of this test to be less than the 𝛼 value.

Differencing will be applied if the time series is identified as non-stationary.
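
Differencing itself is a simple operation, sketched below. The function name and values are illustrative assumptions; the stationarity test is not reproduced here.

```python
# Minimal sketch of differencing: each value minus the value `lag`
# periods earlier. lag=1 gives the first-order difference.

def difference(values, lag=1):
    return [values[i] - values[i - lag] for i in range(lag, len(values))]

series = [3.0, 5.0, 9.0, 15.0]        # made-up values
first_diff = difference(series)       # first-order difference
lag2_diff = difference(series, lag=2) # difference at lag 2
print(first_diff, lag2_diff)
```

The differenced series is shorter than the original by `lag` observations.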

5.1.1 Dickey-Fuller test - check for stationarity of the time series

Figure 51 - Dickey Fuller Test

 The above Dickey-Fuller test shows that the p-value is greater than alpha = 0.05, so we fail to reject
the null hypothesis: the time series is not stationary.

Figure 52 - Rolling Mean and Standard Deviation

 A stationary time series has a constant mean and constant variance; the mean
and variance of our time series are not constant
 A first-order difference will be applied and the series re-tested to see whether it becomes
stationary.

5.1.2 One order difference result
After applying a first-order difference, the p-value becomes 0, which is less than alpha = 0.05.
We therefore reject the null hypothesis and conclude that the differenced time series is stationary

Figure 53 - Dickey Fuller test after one order diff

Figure 54 - Rolling Mean and std dev after 1 order diff

 The rolling mean and standard deviation become constant after first-order differencing

5.1.3 Time series plot before and after one order difference

Figure 55 - Time series before and after 1 order diff

6 Build an automated version of the ARIMA/SARIMA
model in which the parameters are selected using the lowest
Akaike Information Criteria (AIC) on the training data and
evaluate this model on the test data using RMSE.
ARIMA Model

ARIMA (Autoregressive Integrated Moving Average) is a popular time series forecasting model that
combines autoregression (AR) and moving average (MA) components with differencing to account for
trend in a time series.

Autoregression refers to the use of lagged values of the dependent variable to predict future values.
Moving average refers to the use of the previous forecast errors to predict future values. Differencing
refers to the transformation of a non-stationary time series into a stationary time series by taking the
differences between consecutive observations.

ARIMA models are specified by three parameters: p, d, and q, where:

p: the order of the autoregressive component, which refers to the number of lagged values of the
dependent variable used in the model.

d: the degree of differencing, which refers to the number of times the data is differenced to make the
time series stationary.

q: the order of the moving average component, which refers to the number of lagged forecast errors used
in the model.

SARIMA Model

SARIMA (Seasonal Autoregressive Integrated Moving Average) is an extension of the ARIMA model that
can handle time series data with seasonality. It includes additional seasonal components to account for
repeating patterns in the data, in addition to the autoregressive, integrated, and moving average
components of the ARIMA model.

The parameters of a SARIMA model are denoted as (p, d, q) × (P, D, Q)s, where (p, d, q) are the non-
seasonal ARIMA parameters, (P, D, Q) are the seasonal ARIMA parameters, and s is the seasonal period
(i.e., the number of time periods in a season).

The seasonal AR component (P) models the linear relationship between the series and its seasonal lags,
while the seasonal MA component (Q) models the linear relationship between the forecast errors and
their seasonal lags. The seasonal differencing (D) is used to remove the seasonal trends, similar to the
non-seasonal differencing (d).

Since a first-order difference makes our time series stationary, d = 1 will be used
when generating the automated ARIMA and SARIMA models

6.1.1 ACF and PACF before one order difference on full data

Figure 56 - ACF Full data

 In the above ACF plot the first insignificant component is at lag 13, and after that point there is only
one significant point. It is better to apply a first-order difference

Figure 57 - PACF Full data

 In the above PACF plot there is an insignificant component at lag 4, followed later by a significant
point at lag 13. It is better to apply a first-order difference

6.1.2 ACF and PACF after performing one order difference on full data

Figure 58 - ACF Full data with one order diff

Figure 59 - PACF Full data with one order diff

Following the ACF and PACF graphs, we conclude that p becomes insignificant at level 3 and q becomes
insignificant at level 5

The p value is taken from the PACF chart. There is clearly an insignificant component at lag 5,
so we consider 4 as the maximum for p

The q value is taken from the ACF chart. There is clearly an insignificant component at lag 3,
so we consider the range 0 to 2
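
Reading a cut-off from an ACF chart can be sketched numerically as below. The sample autocorrelation is compared with an approximate 95% band of ±1.96/sqrt(n), which is a common rule of thumb and an assumption here, not necessarily the band drawn in the figures. Function names and data are hypothetical.

```python
import math

# Minimal sketch: sample autocorrelation and the first lag that falls
# inside an approximate 95% white-noise band of +/- 1.96 / sqrt(n).

def acf(values, max_lag):
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i - k] for i in range(k, n)) / denom
            for k in range(max_lag + 1)]

def first_insignificant_lag(acf_values, n):
    bound = 1.96 / math.sqrt(n)
    for k in range(1, len(acf_values)):
        if abs(acf_values[k]) < bound:
            return k
    return None

data = [1.0, 2.0, 3.0, 4.0, 5.0]      # toy series
r = acf(data, max_lag=2)
cutoff = first_insignificant_lag(r, len(data))
print(r, cutoff)
```

If the first insignificant lag is k, the candidate range for the order is 0 to k - 1, which is the reasoning used above.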

6.1.3 ACF and PACF for Train dataset with one order difference

Figure 60 - ACF Train data - with one order difference

 In the ACF of the first-differenced train data, the first insignificant component after lag 0 is
lag 3, so q ranges from 0 to 2
 For automated ARIMA we consider p in the range 0 to 2

6.2 ARIMA Automated

 Based on the ACF and PACF charts shown above, p and q range from 0 to 2
 d is fixed at 1, because one difference of the series is needed to make it
stationary
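
The candidate (p, d, q) combinations implied by these ranges can be enumerated as below. This sketch only builds the grid; fitting each model is not shown, and the variable names are illustrative.

```python
import itertools

# Minimal sketch: every (p, d, q) combination with p, q in 0..2 and d = 1.
p_values = range(0, 3)
d_values = [1]
q_values = range(0, 3)

arima_orders = list(itertools.product(p_values, d_values, q_values))
print(len(arima_orders), arima_orders)
```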

Figure 61 - ARIMA automated parameters


 The automated ARIMA search generates an AIC score for each combination
 The combination with the lowest AIC score is considered the best model

6.2.1.1 Akaike Information Criteria (AIC)


Akaike Information Criteria (AIC) is a statistical measure used to evaluate the relative quality of statistical
models. The AIC was developed by the Japanese statistician Hirotugu Akaike.

The AIC is based on the concept of information entropy and is used to balance the fit of a model to the
data with the complexity of the model. In other words, the AIC attempts to find the simplest model that
best fits the data.

The formula for AIC is: AIC = -2log(L) + 2k

Where L is the likelihood of the data given the model, and k is the number of parameters in the model.

A lower AIC value indicates a better model fit, with the model having the lowest AIC value considered to
be the best fit.
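
The selection rule can be sketched as below. The log-likelihoods and parameter counts are made-up numbers chosen only to illustrate the formula; they are not results from this report.

```python
# Minimal sketch: AIC = -2 * log-likelihood + 2 * k, then pick the minimum.
# The candidate values below are hypothetical.

def aic(log_likelihood, n_params):
    return -2.0 * log_likelihood + 2.0 * n_params

candidates = {
    (0, 1, 2): (-640.0, 3),   # (log-likelihood, number of parameters)
    (2, 1, 2): (-639.5, 5),
}

scores = {order: aic(ll, k) for order, (ll, k) in candidates.items()}
best_order = min(scores, key=scores.get)
print(scores, best_order)
```

In this toy case the larger model fits slightly better but is penalised for its two extra parameters, so the smaller model wins.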

6.2.1.2 Top 5 best performing model (lowest AIC values)

Figure 62 - top 5 from ARIMA automated model

 Model with parameter p=0, d=1, q=2 has the least AIC value, therefore we are fitting this model
to predict the test dataset

Best combination (lowest AIC): (p, d, q) = (0, 1, 2)

Since p = 0 and q = 2, the model has 2 moving average parameters and 1 sigma parameter

0 components of Auto Regression

2 components of Moving Average

ma.L1 – p value 0.00

ma.L2 – p value 0.00

All the components are significant since their p values are below alpha = 0.05

The coefficient of sigma has a high value, which suggests this model will not predict well.

Figure 63 - Auto ARIMA Plot

 The histogram suggests the residuals are approximately normally distributed
 The Q-Q plot supports this: the observed points fall
approximately along a straight line.
 The residuals stay within a small range, between -2 and 4

6.2.2 RMSE – ARIMA Automated

Figure 64 - RMSE ARIMA Automated


 The RMSE value is high, so this model might not predict the test dataset well. The
prediction graph will show whether this model performs well or not

6.2.3 Automated ARIMA Prediction

Figure 65 - Auto ARIMA 2.1.2


 The above graph makes it quite evident that the Auto ARIMA model doesn't do well on the test
dataset. The Auto ARIMA forecast is shown in green

6.3 SARIMA Automated


The SARIMA model incorporates three components - autoregression (AR), differencing (I), and moving
average (MA) - to model the time series data. Additionally, it includes a seasonal component, which takes
into account the seasonality of the data.

The SARIMA model is often written as SARIMA(p, d, q)(P, D, Q)m, where:

p: the order of the autoregressive component

d: the degree of differencing

q: the order of the moving average component

P: the order of the seasonal autoregressive component

D: the degree of seasonal differencing

Q: the order of the seasonal moving average component

m: the number of time steps in each season

To start with, P and Q are assigned the same range as p and q

 p -> 0 to 2
 q -> 0 to 2
 d -> 1
 P -> 0 to 2

 Q -> 0 to 2
 D -> 0
 Seasonality - 12
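
The search space defined by these ranges can be enumerated as below. The sketch only builds the grid of orders; model fitting is not shown, and the variable names are illustrative.

```python
import itertools

# Minimal sketch: all (p, d, q) x (P, D, Q, s) combinations for
# p, q, P, Q in 0..2, d = 1, D = 0 and seasonal period 12.
orders = list(itertools.product(range(3), [1], range(3)))
seasonal_orders = list(itertools.product(range(3), [0], range(3), [12]))

grid = list(itertools.product(orders, seasonal_orders))
print(len(grid))
```

Each of the 81 combinations is then fitted and scored, which is why the seasonal search takes noticeably longer than the 9-model ARIMA search.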

Figure 66 - SARIMA Automated parameters


 Each combination is tested using the automated SARIMA model, which generates an AIC score
 The combination with the lowest AIC score is considered the best model

6.3.1.1 Top 5 best performing model

Figure 67- Top 5 best model - lowest AIC scores

 The model with parameters p=0, d=1, q=1, P=2, D=0, Q=2 and seasonality=12
has the lowest AIC value, so we fit this model to predict the test dataset
 Fewer parameters tend to give a better result. We therefore test with the
parameters in row 26: p=0, d=1, q=1, P=2, D=0, Q=2 and seasonality=12

Figure 68 - Automated SARIMA Result

Best combination (lowest AIC): p=0, d=1, q=1, P=2, D=0, Q=2 and seasonality=12

With p = 0 and q = 1 chosen, the summary shows 2 moving average parameters and 1 sigma
parameter with a high value

0 components of Auto Regression

2 components of Moving Average – not significant since the p values are greater than
alpha = 0.05

ma.L1 – p value 1
ma.L2 – p value 1

Seasonal components

ar.S.L12 – p value 0.00 – significant component
ar.S.L24 – p value 0.00 – significant component
ma.S.L12 – p value 0.56 – not significant since the p value is higher than alpha
ma.S.L24 – p value 0.62 – not significant since the p value is higher than alpha

Figure 69 – Automated SARIMA Plot

 The histogram suggests the residuals are approximately normally distributed
 The Q-Q plot supports this: the observed points fall
approximately along a straight line.
 The residuals stay within a small range, between -2 and 3

6.3.2 Predicted sample test data

Figure 70 - SARIMA sample predicted test data

6.3.3 RMSE – Automated SARIMA

Figure 71 - RMSE Auto SARIMA

 The Automated SARIMA RMSE is lower than that of Auto ARIMA, but the TES models gave much lower
RMSE values. Let's plot the prediction on the test data and see how Auto SARIMA performs.

6.3.4 Automated SARIMA prediction

Figure 72 - SARIMA test prediction

 The above graph makes it quite evident that the Auto SARIMA model performs better than
Auto ARIMA on the test dataset. The Auto SARIMA forecast is shown in red
 The RMSE of SARIMA is 26.93 whereas the RMSE of ARIMA is 37.31, which confirms that
Automated SARIMA performs better than Automated ARIMA

7 Build ARIMA/SARIMA models based on the cut-off
points of ACF and PACF on the training data and
evaluate this model on the test data using RMSE.

7.1 ACF & PACF plot with one difference

Here, we have taken alpha=0.05

The Auto-Regressive parameter 'p' of an ARIMA model is taken as the last significant lag before the
PACF plot cuts off, which here is 2.

The Moving-Average parameter 'q' of an ARIMA model is taken as the last significant lag before the
ACF plot cuts off, which here is 2.

Based on the above plots, we take both p and q as 2

7.2 ARIMA Manual Model (2,1,2)

Figure 73 - Manual ARIMA results

Selected combination: (p, d, q) = (2, 1, 2)

Since p = 2 and q = 2, the model has 2 auto regression parameters, 2 moving average
parameters and 1 sigma parameter

2 components of Auto Regression – not significant since the p values are greater than alpha = 0.05

ar.L1 – p value 0.33

ar.L2 – p value 1

2 components of Moving Average – not significant since the p values are greater than alpha = 0.05

ma.L1 – p value 0.58

ma.L2 – p value 0.16

The coefficient of sigma is high, which suggests this model will not perform well on the test dataset

Figure 74 - Manual ARIMA plot

 The histogram suggests the residuals are approximately normally distributed
 The Q-Q plot supports this: the observed points fall
approximately along a straight line.
 The residuals stay within a small range, between -3 and 4

7.2.1 RMSE – Manual ARIMA

Figure 75 - RMSE Manual ARIMA

 The Manual ARIMA RMSE is lower than that of the Automated ARIMA, but we cannot say this is the best
model since the Automated SARIMA RMSE is much lower than both the Auto and Manual ARIMA

7.2.2 Manual ARIMA prediction

Figure 76 -Manual ARIMA prediction

 The above graph makes it quite evident that the Manual ARIMA model doesn't do well on the test
dataset. It fails to predict the test data. The Manual ARIMA forecast is shown in green

7.3 SARIMA Manual Model


To remove the seasonal component from the ACF and PACF, we take a lag-6 difference (Lag 6) followed by a
first-order difference (diff)

Figure 77 - Full data plot with diff 6 + diff


 Applying the lag-6 difference and diff removes the seasonality in the data

Figure 78 - Mean and std Dev plot with diff 6 + diff

Figure 79 - Dickey Fuller test for diff 6

 The rolling mean and standard deviation become constant after taking a lag-6 difference (Lag 6)
followed by a first-order difference (diff)
 After applying the lag-6 difference and diff, the p-value becomes 0, which is less than
alpha = 0.05. Therefore we conclude that the time series is stationary

7.3.1 ACF and PACF with difference of 6 and diff in train dataset to identify the P and Q

Figure 80 - ACF Train set with diff 6 + diff

Figure 81 - PACF Train set with diff 6 + diff

In the ACF plot the first insignificant lag is at level 2, so we take Q as 1

In the PACF plot the first insignificant lag is at level 4, so we take P as 3

D remains 0

Seasonality is taken as 12

7.3.2 Manual SARIMA Model1: (2,1,2) (3, 0, 1, 12)

Figure 82 - Manual SARIMA Model 1 results

Since p = 2 and q = 2, the model has 2 auto regression parameters, 2 moving average parameters
and 1 sigma parameter with a lower value. P and Q are 3 and 1 respectively

2 components of Auto Regression

ar.L1 – p value 0.00 – significant component

ar.L2 – p value 0.93 – not significant since the p value is greater than alpha = 0.05

2 components of Moving Average

ma.L1 – p value 1 – not significant since the p value is greater than alpha = 0.05
ma.L2 – p value 0.93 – not significant since the p value is greater than alpha = 0.05

P component

ar.S.L12 – p value 0.03 – significant component since the p value is lower than alpha = 0.05
The coefficient of S.L12 has a major impact on the prediction.

Q component

ma.S.L12 – p value 0.71 – not significant since the p value is greater than alpha = 0.05

Sigma has a notable impact on the prediction

Figure 83 - SARIMA Model 1 Plot

 The histogram suggests the residuals are approximately normally distributed
 The Q-Q plot supports this: the observed points fall
approximately along a straight line.
 The residuals stay within a small range, between -2 and 3

7.3.3 Manual SARIMA Model 2: (2,1,2) (0, 0, 1, 12)

Figure 84 - Manual SARIMA Model 2 results

Since p = 2 and q = 2, the model has 2 auto regression parameters, 2 moving average parameters
and 1 sigma parameter with a lower value. P and Q are 0 and 1 respectively

2 components of Auto Regression – not significant since the p values are greater than
alpha = 0.05

ar.L1 – p value 0.49

ar.L2 – p value 0.37

2 components of Moving Average – not significant since the p values are greater than
alpha = 0.05

ma.L1 – p value 0.70

ma.L2 – p value 0.84

P has 0 components

Q has 2 components

ma.S.L1 – p value 0.07 – not significant since the p value is greater than alpha = 0.05
ma.S.L2 – p value 0.84 – not significant since the p value is greater than alpha = 0.05

Sigma has a notable impact on the prediction

Figure 85 - SARIMA Model 2 Plot

 The histogram suggests the residuals are approximately normally distributed
 The Q-Q plot supports this: the observed points fall
approximately along a straight line.
 The residuals stay within a small range, between -3 and 3

7.3.4 Manual SARIMA Model 3: (3,1,2) (2, 0, 1, 12)

Figure 86 - Manual SARIMA Model 3 results

As we chosen p =2 and q=2 therefore we have 2 parameters for auto regression and 2 parameters for
moving average and 1 sigma parameter with lesser value. P and Q has 2 and 1 respectively

2 components of Auto Regression

ar.L1 – p value 0.12 – not a significant component


ar.L2 – p value 0.11 – not a significant component

2 components of Moving Average

ma.L1 – p value 1 – not significant, since the p value is greater than alpha = 0.05
ma.L2 – p value 1 – not significant, since the p value is greater than alpha = 0.05

P component

ar.S.L12 – p value 0.00 – significant, since the p value is less than alpha = 0.05.
The coefficient of ar.S.L12 has a major impact on the prediction.
ar.S.L24 – p value 0.00 – significant, since the p value is less than alpha = 0.05.
The coefficient of ar.S.L24 also has an impact on the prediction.

Q component

ma.S.L12 – p value 1 – not significant, since the p value is greater than alpha = 0.05

Sigma has a notable impact on the prediction
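
The significance check applied to each coefficient above can be sketched in a few lines of Python. This is only an illustration: the p values are the rounded figures quoted in the text for Manual SARIMA Model 3, and the function name `significant_terms` is a hypothetical helper, not part of the original analysis.

```python
# Significance level used throughout the report.
ALPHA = 0.05

# Rounded p values quoted in the text for Manual SARIMA Model 3.
p_values = {
    "ar.L1": 0.12,
    "ar.L2": 0.11,
    "ma.L1": 1.00,
    "ma.L2": 1.00,
    "ar.S.L12": 0.00,
    "ar.S.L24": 0.00,
    "ma.S.L12": 1.00,
}


def significant_terms(p_values, alpha=ALPHA):
    """Return the coefficient names whose p value is below alpha."""
    return [name for name, p in p_values.items() if p < alpha]


significant = significant_terms(p_values)
print(significant)
```

Only the two seasonal autoregressive terms pass the test, which matches the reading of the results above.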

Figure 87 - SARIMA Model 3 Plot

• The histogram shows that the residuals are approximately normally distributed
• The Q-Q plot indicates that the residuals are approximately normally distributed, since the observed
points fall approximately along a straight line.
• The residuals stay within a small range, between -2 and 3

7.3.5 RMSE – Manual SARIMA Models

Figure 88 - RMSE Manual SARIMA Models

From the above chart we can conclude that Model 1 has the lowest RMSE and MAPE. This model should
perform best at predicting the test data. Parameters used for prediction: (2,1,2) (3, 0, 1, 12)
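
For reference, the two error measures used in this comparison can be computed as in the sketch below. It assumes the second measure is MAPE (mean absolute percentage error); the sample numbers are made up purely for illustration and are not taken from the Rose wine data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up illustrative values, not the report's data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(rmse(actual, predicted), 2), round(mape(actual, predicted), 2))
```

A lower value of either measure on the test data indicates a better model.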

7.3.6 SARIMA Models Prediction

Figure 89 - SARIMA Prediction Model 1

Figure 90 - SARIMA Prediction Model 2

Figure 91 - SARIMA Prediction Model 3

The prediction graphs above show that all 3 models perform close to each other. Considering both the
RMSE values and the graphs, we conclude that Model 1 predicts best.
The parameters used for Model 1 are (2,1,2) (3, 0, 1, 12)

By reviewing the ACF and PACF charts with a difference of 1, we concluded that the optimum value for p
and q is 2. The P and Q parameters were identified from the ACF and PACF charts with a difference of 6;
P and Q are 3 and 1 respectively
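
The quantities read off those charts can be sketched without any library. The code below is a minimal illustration, assuming simple lagged differencing and the usual sample autocorrelation estimate; the helper names `difference` and `acf` and the toy series are hypothetical, and the report's own charts were presumably produced with a plotting library.

```python
def difference(series, lag=1):
    """Lagged difference: y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Toy series for illustration only.
toy = [1, 3, 6, 10, 15]
diffed = difference(toy, lag=1)   # [2, 3, 4, 5]
correlations = acf(diffed, max_lag=2)
print(diffed, correlations)
```

The lags at which the autocorrelations stop being large are what guide the choice of q (from the ACF) and p (from the PACF).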

8 Build a table with all the models built along with
their corresponding parameters and the respective
RMSE values on the test data.
Models                                                    Parameters                                                             Test RMSE
RegressionOnTime                                          -                                                                      15.27
NaiveModel                                                -                                                                      79.72
SimpleAverageModel                                        -                                                                      53.46
2pointTrailingMovingAverage                               -                                                                      11.53
4pointTrailingMovingAverage                               -                                                                      14.45
6pointTrailingMovingAverage                               -                                                                      14.57
9pointTrailingMovingAverage                               -                                                                      14.73
Simple Exponential Smoothing                              Alpha=0.098750                                                         36.80
Double Exponential Smoothing                              Alpha=1.490116e-08, Beta=1.661039e-10                                  15.27
Triple Exponential Smoothing Additive                     Alpha=0.089541, Beta=0.000240, Gamma=0.003467                          14.25
Triple Exponential Smoothing Multiplicative               Alpha=0.071511, Beta=0.045292, Gamma=0.000072                          20.16
Triple Exponential Smoothing Additive Damped Trend        Alpha=0.073686, Beta=0.009798, Gamma=0.073301, Damping_Trend=0.975626  26.36
Triple Exponential Smoothing Multiplicative Damped Trend  Alpha=0.111071, Beta=0.037024, Gamma=0.395080, Damping_Trend=0.990000  25.96
Automated ARIMA(0,1,2)                                    p=0, d=1, q=2                                                          37.31
Automated SARIMA(0, 1, 2)(2, 0, 2, 12)                    p=0, d=1, q=2, P=2, D=0, Q=2, Seasonality=12                           26.93
Manual ARIMA(2,1,2)                                       p=2, d=1, q=2                                                          36.87
Manual SARIMA Model 1:(2,1,2)(3, 0, 1, 12)                p=2, d=1, q=2, P=3, D=0, Q=1, Seasonality=12                           22.69
Manual SARIMA Model 2:(2,1,2)(0, 0, 1, 12)                p=2, d=1, q=2, P=0, D=0, Q=1, Seasonality=12                           33.39
Manual SARIMA Model 3:(2,1,2)(2, 0, 1, 12)                p=2, d=1, q=2, P=2, D=0, Q=1, Seasonality=12                           28.22
Table 1 – All Models

The table above lists all the models built so far, along with the parameters used in each model.
Each model was evaluated on the test data with these parameters, and the resulting RMSE is shown

9 Based on the model-building exercise, build the most
optimum model(s) on the complete data and predict
12 months into the future with appropriate
confidence intervals/bands.
The 2-point Trailing Moving Average has the lowest RMSE (11.53) and Triple Exponential Additive Smoothing
has the second lowest RMSE (14.25) on the test data.

A moving average helps to forecast near-future values, but it cannot be used for the actual 12-month
forecast. We therefore consider Triple Exponential Additive Smoothing the best model for predicting the
next 12 months
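
The limitation of the moving average is easy to see in code. The sketch below assumes the trailing moving average is a plain rolling mean over the last `window` observations (the usual definition); the function name and the sample values are hypothetical.

```python
def trailing_moving_average(series, window):
    """Rolling mean of the current and previous (window - 1) observations.

    Positions without enough history are returned as None.
    """
    result = []
    for i in range(len(series)):
        if i + 1 < window:
            result.append(None)
        else:
            chunk = series[i - window + 1 : i + 1]
            result.append(sum(chunk) / window)
    return result


# Made-up values for illustration.
smoothed = trailing_moving_average([10, 20, 30, 40], window=2)
print(smoothed)  # [None, 15.0, 25.0, 35.0]
```

Every smoothed value needs the actual observation for the same period, so on the test data the 2-point average tracks the series closely, but beyond the last observed month there is nothing left to average.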

Among the models that can produce a 12-month forecast, the best model by RMSE is Triple Exponential
Additive Smoothing, with parameters Alpha=0.089541, Beta=0.000240 and Gamma=0.003467

9.1 Best Model fitting


Triple Exponential Additive Smoothing is fitted with the following parameters: alpha=0.089541,
beta=0.000240 and gamma=0.003467

Triple Exponential Additive Smoothing is also known as the Holt-Winters method. The method is called
"triple" exponential because it uses three smoothing parameters, each of which is applied to a different
component of the time series:

Level: the average value of the time series over time.

Trend: the slope or direction of the time series over time.

Seasonality: the recurring patterns or cycles in the time series.

These parameters are typically denoted by alpha (α), beta (β), and gamma (γ), respectively.

alpha (α) - 0.089541

Determines the weight given to the most recent observation versus the historical average

beta (β) - 0.000240

Determines the weight given to the most recent trend versus the historical trend

gamma (γ) - 0.003467

Determines the weight given to the most recent seasonality versus the historical seasonality
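
How the three parameters interact can be sketched with the classic additive Holt-Winters recursions. This is a simplified illustration, not the fitting procedure used for the report: the function name is hypothetical, the initial level, trend and seasonal values are taken naively from the first two seasons (an assumption; library implementations usually estimate them), and the toy series is made up.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")

    # Naive initial values from the first two seasons (assumption).
    first, second = series[:period], series[period : 2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [y - level for y in first]

    for t, y in enumerate(series):
        season = seasonals[t % period]
        prev_level = level
        # alpha: newest deseasonalised observation vs. previous level + trend
        level = alpha * (y - season) + (1 - alpha) * (prev_level + trend)
        # beta: newest change in level vs. previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # gamma: newest seasonal deviation vs. previous seasonal value
        seasonals[t % period] = gamma * (y - level) + (1 - gamma) * season

    n = len(series)
    return [
        level + h * trend + seasonals[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


# Toy series with a repeating pattern of length 4 and no trend.
toy = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(
    toy, period=4, alpha=0.089541, beta=0.000240, gamma=0.003467, horizon=4
)
print([round(f, 6) for f in forecast])
```

With small alpha, beta and gamma, as here, each new observation moves the level, trend and seasonal estimates only slightly, so the forecast leans heavily on the long-run pattern.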

Inference on the model fitting result

The alpha, beta and gamma values above are used on the full data to predict the next 12 months

The RMSE on the full data is 16.13; on the test data the RMSE was 14.25

We have calculated the upper and lower confidence bands at the 95% confidence level
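
The report does not state how the bands were calculated. One common approach, shown here purely as an assumption, is to take the forecast plus or minus 1.96 times the standard deviation of the in-sample residuals; the function name and all numbers below are illustrative.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_band(forecast, residuals, z=Z_95):
    """Return (lower, upper) pairs: forecast +/- z * residual std. deviation."""
    n = len(residuals)
    mean = sum(residuals) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in residuals) / n)
    return [(f - z * std, f + z * std) for f in forecast]


# Made-up forecast values and residuals for illustration.
bands = confidence_band([50.0, 60.0], residuals=[-2.0, 2.0, -2.0, 2.0])
for lower_ci, upper_ci in bands:
    print(round(lower_ci, 2), round(upper_ci, 2))
```

This version gives a band of constant width; intervals that widen with the forecast horizon would need a model-based variance instead.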

9.2 12 months prediction

Figure 92 - Forecast of next 12 months

The next 12 months are predicted with confidence intervals/bands. For each month, the confidence
interval lies between the lower_CI and upper_ci values. With this confidence interval chart, the business
can plan its production to meet changes in demand.

Figure 93 - 12 Months prediction

The graph above shows that the model predicts quite well, and the confidence interval for the 12
forecast months is clearly visible.

10 Comment on the model thus built and report your
findings and suggest the measures that the company
should be taking for future sales.

10.1 Findings on the dataset


• Sales of Rose wine show a downward trend over the years
• The highest sales were recorded in 1981

Figure 94 - Time series plot

• The plot below shows that there is seasonality in Rose wine sales
• A marked increase in sales is observed at the end of every year, in November and December,
which may be related to the holiday season
• December is the year's highest sales peak
• April shows the lowest sales of the year

Figure 95 - Month wise plot

10.2 Comments on the model build
We have built 20 models to determine which model best fits the Rose wine sales data and predicts the
next 12 months.

Below are the top 5 models by test RMSE. The 2-point Trailing Moving Average and Triple
Exponential Smoothing Additive are the 2 models that performed best on the
Rose wine sales data

A moving average helps to forecast near-future values, but it cannot be used for the actual 12-month
forecast. We therefore consider Triple Exponential Additive Smoothing the best model for predicting the
next 12 months

Models                                 Parameters                                     Test RMSE
2pointTrailingMovingAverage            -                                              11.53
Triple Exponential Smoothing Additive  Alpha=0.089541, Beta=0.000240, Gamma=0.003467  14.25
4pointTrailingMovingAverage            -                                              14.45
6pointTrailingMovingAverage            -                                              14.57
9pointTrailingMovingAverage            -                                              14.73
Table 2 – Top 5 Best Models
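
The ranking can be reproduced by sorting the test RMSE values reported in Table 1. The sketch below simply re-enters those reported values in a dictionary; the variable names are illustrative.

```python
# Test RMSE values as reported in Table 1.
test_rmse = {
    "RegressionOnTime": 15.27,
    "NaiveModel": 79.72,
    "SimpleAverageModel": 53.46,
    "2pointTrailingMovingAverage": 11.53,
    "4pointTrailingMovingAverage": 14.45,
    "6pointTrailingMovingAverage": 14.57,
    "9pointTrailingMovingAverage": 14.73,
    "Simple Exponential Smoothing": 36.80,
    "Double Exponential Smoothing": 15.27,
    "Triple Exponential Smoothing Additive": 14.25,
    "Triple Exponential Smoothing Multiplicative": 20.16,
    "Triple Exponential Smoothing Additive Damped Trend": 26.36,
    "Triple Exponential Smoothing Multiplicative Damped Trend": 25.96,
    "Automated ARIMA(0,1,2)": 37.31,
    "Automated SARIMA(0, 1, 2)(2, 0, 2, 12)": 26.93,
    "Manual ARIMA(2,1,2)": 36.87,
    "Manual SARIMA Model 1:(2,1,2)(3, 0, 1, 12)": 22.69,
    "Manual SARIMA Model 2:(2,1,2)(0, 0, 1, 12)": 33.39,
    "Manual SARIMA Model 3:(2,1,2)(2, 0, 1, 12)": 28.22,
}

# Sort ascending by RMSE and keep the five best models.
top5 = sorted(test_rmse.items(), key=lambda item: item[1])[:5]
for name, score in top5:
    print(f"{name}: {score:.2f}")
```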

10.3 Suggestion based on the analysis we performed


Compared with the other quarters, the holiday season sees particularly good sales. Promotions during the
busiest shopping season can increase sales further, and sales continue year-round

Sales decrease significantly year over year. Therefore, the exterior of the Rose wine container can be
redesigned to make it look new and fresh each year

Introducing deals during the slow sales periods will boost the company's performance. Deals can also
increase sales during the busiest season

The analysis suggests that wine is frequently consumed during celebrations. Partnering with an event
management company could therefore raise wine sales
