Lecture 8,9 - STLF - Part 1


Methods for Short-Term Load Forecasting
Generalities

` Forecasting electricity demand is a vital task for planning and investment
purposes

` Short-Term Load Forecasting (STLF) is the starting point of the daily energy
supply routine required for power system operation

` STLF has become increasingly important since the rise of the competitive
energy markets

` STLF is based on statistical procedures that use past load values and
exogenous variables to forecast the short-term power demand and
electricity price

` The use of STLF began around 1920 and grew rapidly, at first employing
classic statistical methods

` The paper “Short-Term Load Forecasting” (Gross and Galiana, 1987)
outlines the state of the art of STLF before the use of methods based on
artificial intelligence
Generalities …
` Artificial Neural Networks (ANN) were first proposed in the sixties, and after
some initial criticism, quickly gained popularity in several fields

` In fact, STLF was the first application of neural networks in the field of
Power Systems (late eighties)

` In the following years, the interest in AI techniques for electricity
forecasting grew rapidly. Researchers employed every new method,
sometimes following passing fashions, so that the literature on this subject
now counts about a thousand papers

` Several power companies and ISOs employ ANN techniques on a regular
basis in support of the human forecasters

` ANNSTLF, a product developed by Pattern Recognition Technologies
(PRT) under the support of EPRI, is now used by most ISOs in the USA and in
Western Europe and represents one of the most successful applications of
neural network technology
Basic concepts of Load Forecasting
Time horizon of the electric load forecasting

` Long-term load forecasting (LTLF)

` Medium-term load forecasting (MTLF)

` Short-term load forecasting (STLF)

Long-term load forecasting

` Goals: long-term system planning, allocation of new generating units

` Time horizon: 1 - 20 years

` Forecast: total energy, peak load

Medium-term load forecasting

` Goals: scheduling maintenance, planning fuel supply, economic
transactions

` Time horizon: several weeks – some months

` Forecast: load profile, total energy, peak load

Methods for Long and Medium term load forecasting

` End use models


− collect statistical information about the electricity demand in the
residential, commercial, and industrial sector
− explain energy demand as a function of the number of
consumers in the market

` Econometric models
− estimate the relationship between the energy consumption and
the economic factors that influence the energy consumption
− combine economic theory and statistical techniques (linear
regression or time series methods)

Short-term load forecasting

` Goals: economic energy dispatch, unit commitment, security
assessment, energy sales in the competitive market

` Time horizon: 1 hour – 1 week, with time step 1 hour

` Forecast: hourly load profile, peak load
Importance of the STLF

` The accuracy of STLF has a significant impact on the operation costs
of the electric utilities

` Underestimation of the short-term load results in failure to provide
the necessary reserve, which leads to higher operation costs

` Overestimation of the short-term load causes the start-up of
unnecessary units, which leads again to higher operation costs

` STLF is also required to estimate the power flows for the on-line
security assessment. It helps to make decisions that prevent
overloading, particularly in the deregulated energy market
Italian daily load profile and forecast (Aug 27, 2012)

The profile of the electric load depends on:

` Social habits

` Work organization

` Mix of customers (residential, commercial, industrial)

` Temperature and other weather-related variables (humidity,
cloud cover, wind speed)

` Foreseeable or unexpected exceptional events (plant failures, sudden
weather changes, strikes, sports matches, etc.)
Electric load as a time series

The electric load can be considered as a time series:

y(t) = f( y(t-1), y(t-2), … ) + g( e(t), e(t-1), e(t-2), … ) + ε(t)
where:

− f () is a function of the past load values

− g () is a function of the present and past values of some exogenous
variables (weather-related variables, clock and calendar variables, etc.)

− ε (t) is a normally distributed random noise
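As a toy illustration of this decomposition (a Python/NumPy sketch, since no code accompanies this slide; all coefficients are hypothetical), a synthetic "load" can be generated that depends on its own past through f(), on an exogenous daily temperature cycle through g(), and on Gaussian noise ε:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 24 * 14                                            # two weeks of hourly samples
temp = 10 + 8 * np.sin(2 * np.pi * np.arange(T) / 24)  # exogenous daily temperature cycle

y = np.zeros(T)
y[0] = 100.0
for t in range(1, T):
    f = 0.9 * y[t - 1]        # f(): dependence on the past load
    g = 0.5 * temp[t]         # g(): dependence on the exogenous variable
    eps = rng.normal(0, 1)    # ε(t): normally distributed random noise
    y[t] = f + g + 10 + eps   # constant term keeps the series near a base level

print(round(float(y.mean()), 1))
```

With these (invented) coefficients the series settles around 150, oscillating with the daily temperature cycle.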

Characteristics of the load time series

` Trend: trend of the yearly energy demand due to macroeconomic effects

` Seasonality: seasonal variation of the demand due to weather/social
effects (summer/winter)

` Cyclicity: repetition of similar load patterns (weekdays, Saturday,
Sunday) with weekly periodicity

` Effect of clock/calendar variables: similarity between loads at the
same hour and/or in the same day of the week

` Anomalous days: holidays, bank holidays and days between close
holidays, where the load shape is most irregular
Available load time series: AEM

Azienda Energetica Municipale TORINO (AEM)
` small electric utility (now IREN) that supplies the area of Torino

` hourly load time series from 1995 to 1997

` values in MW

` peak load around 280 MW

` yearly energy demand around 1.5 TWh

` trend not appreciable

` maximum and minimum daily temperature available (not hourly data)

Available load time series: ISONE

New England Independent System Operator (ISONE)

` hourly load time series from 1980

` values in MW

` peak load around 26 GW

` yearly energy demand around 134 TWh

` trend not appreciable

` hourly weather data available

[figure: ISONE hourly load over one year, ×10^4 MW]

` time series (past load and weather data) available at:
http://www.iso-ne.com/
Some features of the load profile

` Typical daily load curve with double peak

` Different load curves on weekdays, Saturday and Sunday

` Effects of the anomalous days (holidays)

` Seasonal dependence

` Similarity of the corresponding days in different years (possibly after
trend elimination)
Typical daily load curve (ISONE)

[figure: typical ISONE daily load curve with double peak, ×10^4 MW vs hour of day]
Load profiles of weekdays, Saturday and Sunday (AEM)

[figure: AEM load profiles of Mon., Sat. and Sun. — AEM, Jan. 9–15, 1995]
Load profiles of weekdays, Saturday and Sunday (ISONE)
[figure: ISONE load profiles of Mon., Sat. and Sun. over one week, ×10^4 MW]
Effects of the anomalous days (New Year’s Day, Epiphany)

[figure: AEM load profiles of Sun. Jan. 1, 1995 and Fri. Jan. 6, 1995]
Effects of the anomalous days (Thanksgiving)
[figure: ISONE load profiles of Thu. Nov. 22 and Fri. Nov. 23, 2007, ×10^4 MW]
Seasonal dependence (AEM)

Seasonal dependence (ISONE)

[figure: ISONE hourly load over one year, ×10^4 MW, showing the seasonal variation]
Similarity between corresponding weeks of different years (AEM)

[figure: AEM, 2nd week of 1995 and 1996 — Mon.–Sun. load profiles overlaid]
Similarity between corresponding weeks of different years (ISONE)

[figure: ISONE, 4th week of 2006 and 2007 — Mon.–Sun. load profiles overlaid, ×10^4 MW]
Lack of similarity between weeks of different months and years
[figure: ISONE, 4th week of June 2006 vs 4th week of Jan. 2007 — load profiles, ×10^4 MW]
Choice of the independent variables
The forecast requires the use of variables in the past related to the
future load. We can use:
` Autoregressive variables
− past load values (previous hours, previous day, the day a week before,
the corresponding day a year before)
− the choice can be made by linear correlation, but the relationship is
not completely linear

` Exogenous variables
− weather conditions (temperature, humidity, cloudiness, wind speed)
− clock and calendar variables (time of day, day of week)

` Non-periodic and exceptional events
− holidays, holiday periods (dealt with by the forecasting method)
− strikes, unexpected weather conditions, etc. (dealt with by the human
forecaster)

Dependence on the past load values

` The selection of the autoregressive independent variables requires an
understanding of the type of relationship (linear/non-linear) of the
load with its past values

` Darbellay and Slama (“Forecasting the short-term demand for
electricity: do neural networks stand a better chance?”, 2000)
assessed that the total correlation with the past values is the sum of
a linear component and a (small) non-linear component

` They introduced a new complex non-linear correlation coefficient
and assessed the difference with the usual linear correlation
Dependence on temperature and weather variables

` Related to social habits (electric heating, air conditioning)


` In some cases there are strong climatic differences between the
different areas of the same country (e.g. the case of Italy)
` Weather data are not always available or are not available on an
hourly basis
` Temperature and weather-related data require accurate weather
forecasts
` Non-linear relationship between load and temperature
` Short-term effects of the temperature (unless sudden changes occur) are
incorporated into the load time series
Hourly load and hourly temperature time series (ISONE 2006)

[figure: ISONE 2006 — hourly load (×10^4 MW, top) and hourly dry bulb temperature (°F, bottom)]
Load forecasting methods and algorithms

` Similar day approach
` Least squares regression
` Time series models
` Neural networks
` Support vector machines
` Fuzzy logic

Similar day approach

` Days in the previous years with characteristics similar to the day
to forecast are selected

` Similar characteristics include weather, day of the week, and date

` The load of the similar day is assumed as the forecast (naïve
forecast)

` Technique once used by the human forecasters, who improved
the poor results of the naïve forecast with their experience

` Used in some recent applications as a basis for more thorough
approaches
Similar day approach: an example of “naïve forecast”
[figure: actual vs forecast hourly load, ×10^4 MW, hours 0–24]
— actual
— forecast

load profile forecast for Thu Oct 19, 2006 (ISONE)

– Exploits the similarity between corresponding days of consecutive
weeks. The forecast of Thu Oct 19, 2006 is made using the profile of
the same day a week before (Thu Oct 12)
– The forecasting error (MAPE) is 1.72% (max error around 649 MW)
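The naïve forecast above is trivial to reproduce. A hedged Python/NumPy sketch on a synthetic hourly series with daily and weekly cycles (all numbers invented, not the ISONE data): the forecast for each hour is simply the load at the same hour one week earlier.

```python
import numpy as np

rng = np.random.default_rng(1)
hours = np.arange(24 * 7 * 8)                 # eight weeks of hourly samples
# synthetic load: daily + weekly shape plus noise (hypothetical numbers)
load = (1000
        + 200 * np.sin(2 * np.pi * hours / 24)
        + 100 * np.sin(2 * np.pi * hours / (24 * 7))
        + rng.normal(0, 20, hours.size))

week = 24 * 7
actual = load[week:]                          # hours that have a "week before"
naive = load[:-week]                          # same hour, one week earlier
mape = 100 * np.mean(np.abs(naive - actual) / actual)
print(f"naive MAPE = {mape:.2f}%")
```

Because the daily and weekly patterns repeat exactly in this toy series, the naïve error here comes only from the noise term; on real data, weather and calendar effects make it considerably larger.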
Least squares regression

` Generates the load curve as a linear combination of non-linear
functions (past loads)

` This method approximates the linear part of the relationship


between the future load and the past load values

` Stepwise regression is often used to build the model

` Conventional method still used by some electric utilities

` We’ll use linear regression to make some basic examples of load


forecasting

Time series models

` Based on the assumption that the data have an internal structure


(autocorrelation, trend, seasonal variations, etc.)
` Autoregressive Moving Average model (ARMA) and similar methods
(ARIMA, ARMAX, ARIMAX)
` The basic equation is:

X(t) = c + ε(t) + Σ_{i=1..p} φ_i X(t−i) + Σ_{i=1..q} ϑ_i ε(t−i)

where X(t) is the load time series, ε(t) the error time series, φ and ϑ the
regression parameters
` The parameters are estimated by linear regression, but the model
formulation is not a simple task
` Classic method used by many electric utilities
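As the slide notes, the parameters can be estimated by linear regression. A Python/NumPy sketch on synthetic AR(2) data (the MA part, ϑ, would need an iterative method such as maximum likelihood, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
phi = np.array([0.6, 0.3])          # "true" AR parameters (hypothetical)
x = np.zeros(n)
for t in range(2, n):               # simulate a stationary AR(2) process
    x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + rng.normal()

# estimate the AR part by ordinary least squares on lagged values
X = np.column_stack([x[1:-1], x[:-2]])   # columns: x(t-1), x(t-2)
y = x[2:]
phi_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(phi_hat, 2))
```

With 5000 samples the least-squares estimates land very close to the simulated coefficients, which is the regression step the slide refers to; formulating the right model (choosing p and q) remains the hard part.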

Neural networks

` Load forecasting techniques widely applied since 1990

` Neural networks are massively parallel distributed processors
that have a natural propensity to learn experiential knowledge and
make it available for use

` Excellent capabilities in non-linear curve fitting

` Realistic target values for the MAPE error are now 2.5%
(24-hours-ahead) and 1.5% (1-hour-ahead)

` Widely used method, also in assistance to the human


forecasters

Support vector machines

` Support Vector Machines (SVM) are a type of kernel-based neural
networks intended for classification and regression tasks

` Perform a nonlinear mapping of the input data into a high-dimensional
space (kernel trick)

` Then use simple linear functions to create linear decision boundaries
in the new space. The learning requires the solution of a quadratic
optimization problem

` In recent years SVMs have gained great popularity in the field of
STLF

Fuzzy logic

` Generalization, proposed by Lotfi Zadeh in 1965, of the Expert
Systems (Boolean logic)
` Under fuzzy logic, inputs are not exactly known but only defined
within a certain range
` A set of user-defined fuzzy rules transforms fuzzy inputs into outputs,
so fuzzy logic can also be considered a technique for mapping
inputs to outputs
` Unlike neural networks, fuzzy logic makes it possible to explain the
input-output relationship
` However, in many cases fuzzy rules are learned from examples, like
in neural networks, and the explanatory capabilities are lost
(neuro-fuzzy techniques)
Some experiments of STLF using linear regression

` 24-hours/1-hour ahead hourly forecast using the AEM and ISONE
time series
` In some cases we’ll use only the weekdays to simplify the forecast
(weekends present more random behavior)
` Types of forecast:
1) regression parameters estimated on 4 weeks of 1995 and forecast
made on the following week of the same year (24-hours-ahead, AEM –
no hourly temperature available)

2) regression parameters estimated on 4 weeks of 2006 and forecast


made on the following week of the same year (1-hour-ahead, ISONE –
hourly temperature available)

AEM data set
` Each file contains a matrix with 4 columns (features) and 8760
rows (hourly observations)

` The features are:
1. Hour of year (1÷8760)
2. Day of week (1÷7)
3. Hour of day (1÷24)
4. Electric load in MW

` The files (*.mat) are:
– aem95, aem96, aem97
– aem95wd, aem96wd, aem97wd (working days only)
ISONE data set
` Each file contains a matrix with 6 columns (features) and 8760
rows (hourly observations)

` The features are:
1. Hour of year (1÷8760)
2. Hour of day (1÷24)
3. Day of week (1÷7)
4. Dry bulb temperature in °F
5. Dew point temperature in °F
6. Electric load in MW

` The files (*.mat) are: NE2004, NE2005, NE2006, NE2007
Example of records from ISONE 2006 data set

Neural methods for Short-Term Load Forecasting - F. Piglione, 2012
Selection of the independent variables (regressors)

` Autoregressive quantities:
− hourly loads of 1-2 previous days
− hourly loads of the corresponding day a week before
− hourly loads of the corresponding day a year before
` Clock and calendar quantities (hour and day number)

` Weather data (if available)

` Regressors selected by correlation analysis with the hourly
load to forecast

Autocorrelation analysis of a load time series (AEM 1995 – all days)

[figure: linear autocorrelation of the AEM 1995 hourly load, lags 0–180]

The correlation reaches its maximum values every 24 hours
Autocorrelation analysis of a load time series (AEM 1995 – working days only)

[figure: linear autocorrelation of the AEM 1995 hourly load (working days only), lags 0–180]

More correlation between all days
Correlation analysis shows that:

` The most correlated past hourly loads are lags: −1, −24, −48, −72,
−23, −25, with correlation coefficients from 0.98 to 0.91

` Maximum correlation (0.98) occurs every 24 hours (lag −24)

` Hourly loads at lag −1, lag −23 and lag −25 present high
correlation values (about 0.92)

` Nearest hourly loads (e.g. lag −23 and lag −24) show high
mutual correlation

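Lag correlations like these are easy to compute. A Python/NumPy sketch on a synthetic hourly series with daily and weekly cycles (not the AEM data, so the coefficients differ from the values above): the correlation peaks at multiples of 24 hours and at lag 168 (one week), and turns negative at lag 12, half a day out of phase.

```python
import numpy as np

rng = np.random.default_rng(3)
h = np.arange(24 * 7 * 8)                     # eight weeks of hourly samples
# synthetic load with daily and weekly cycles plus noise (hypothetical numbers)
load = (1000
        + 200 * np.sin(2 * np.pi * h / 24)
        + 100 * np.sin(2 * np.pi * h / (24 * 7))
        + rng.normal(0, 30, h.size))

def lag_corr(x, lag):
    """Linear correlation coefficient between x(t) and x(t - lag)."""
    return float(np.corrcoef(x[lag:], x[:-lag])[0, 1])

for lag in (1, 12, 23, 24, 25, 48, 168):
    print(f"lag -{lag:3d}: r = {lag_corr(load, lag):+.2f}")
```

This is the same analysis used to pick the regressors in the slides, only on invented data.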
Choice of the model and assemblage of the data set

` Possible models for 24-hours-ahead forecast:
− L(t) = F [ L(t-24), L(t-25), …, L(t-48) ]
− L(t) = F [ L(t-24), L(t-48) ]
− L(t) = F [ L(t-24), L(t-48), hour of day, day of week ]
− ...

` Possible models for 1-hour-ahead forecast:
− L(t) = F [ L(t-1), L(t-25), …, L(t-48) ]
− L(t) = F [ L(t-1), L(t-24), L(t-48) ]
− L(t) = F [ L(t-1), L(t-24), L(t-48), hour of day, day of week ]
− …
Least squares regression equations

y = X h + ε   →   ĥ = (Xᵀ X)⁻¹ Xᵀ y = X† y

ŷ = X ĥ

where:
– N: number of observations
– M: number of regressors
– y: N × 1 target load values
– X: N × M regressor matrix (past load values)
– h: M × 1 parameters
– ε: N × 1 normally distributed random noise
– ŷ: N × 1 estimated targets
– ĥ: M × 1 estimated parameters
– X†: M × N pseudoinverse of X

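In Python/NumPy the same estimate reads as follows (a sketch in which random data stand in for the regressor matrix of past loads; the dimensions N = 480, M = 27 follow the AEM example developed later in the slides):

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 480, 27                           # observations and regressors
h_true = rng.normal(size=M)              # hypothetical "true" parameters
X = rng.normal(size=(N, M))              # regressor matrix (would be past loads)
y = X @ h_true + rng.normal(0, 0.1, N)   # targets with small noise

h_hat = np.linalg.pinv(X) @ y            # ĥ = (XᵀX)⁻¹ Xᵀ y = X† y
y_hat = X @ h_hat                        # ŷ = X ĥ

print(round(float(np.abs(h_hat - h_true).max()), 3))
```

With N much larger than M the pseudoinverse recovers the parameters almost exactly, which is why the slides insist on N ≫ M.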
AEM 1995 data set (weekdays only): 24-hours-ahead forecast

` Parameter estimation
– Data set composed of 4 weeks of 5 weekdays each, from week 3
to week 6 of 1995 (hourly loads from Mon., Jan. 16 to Fri., Feb. 10)

– In all 480 hours

– Offset of 2 weeks (first 10 weekdays – 120 hours) to avoid the
holiday period after the New Year’s day

` Forecast
– The five weekdays of the following 7th week (120 hourly loads from
Mon. Feb. 13 to Fri. Feb. 17)

Building the regressor matrix

We want to build the 24-hours-ahead forecasting model:

L(t) = F [ L(t-24), L(t-25), …, L(t-48), h(t), d(t) ]

composed of all the load variables from lag –24 to lag –48 and the two
exogenous variables hour of day h(t) and day of week d(t).
In all we have M = 27 regressors. Using N observations in the past, the N x M
regressor matrix X is assembled in this way:

rows   time        matrix columns
1 t’ = t L(t’-24) L(t’-25) … L(t’-48) h(t’) d(t’)
2 t’ = t+1 L(t’-24) L(t’-25) … L(t’-48) h(t’) d(t’)
3 t’ = t+2 L(t’-24) L(t’-25) … L(t’-48) h(t’) d(t’)
… … … … … … … …
N t’ = t+N-1 L(t’-24) L(t’-25) … L(t’-48) h(t’) d(t’)

The corresponding MATLAB code

load aem95wd                        % load data set
data = aem95wd;
first = 240+1;                      % first and last observation
last = 240+480;                     % N = 480 observations
xl = data(:,4);                     % hourly loads
x = [ ];
for lag = 24:48,                    % compose the matrix of the
    x = [x xl(first-lag:last-lag)]; % past load regressors
end
xd = data(first:last, 2);           % day calendar number regressor
xh = data(first:last, 3);           % hour clock number regressor
X = [x xh xd];                      % regressor X (480x27) matrix
Y = xl(first:last);                 % target Y (480x1) vector

Regressor matrix X (first 10 rows)

Estimation of the regression parameters
The next step is to estimate the regression parameters by using the Moore-
Penrose pseudoinverse function pinv of MATLAB. Remember that we must
have N (number of observations) much greater than M (number of parameters),
otherwise the problem is ill-conditioned.
Finally, we plot the results and calculate the MAPE error. Here is the MATLAB
code:

H = pinv(X)*Y;      % regression parameters vector H (27x1)
YE = X*H;           % estimated target vector YE (480x1)

plot(Y)             % plotting actual and estimated target
hold                % vectors
plot(YE,'r')
mape(YE,Y)          % evaluation of the error (MAPE)

Parameter estimation – 24-hours-ahead

— actual
— estimated

MAPE = 3.28%

Hourly load from Mon, Jan 16 to Fri, Feb 10 1995 (weekdays only)

Forecast of the five following weekdays

` Using the previously estimated regression parameters we now
forecast the 5 weekdays of the following 7th week

` The parameters have been estimated on the hourly interval from
# 241 to # 720 (4 weeks, weekdays only)

` The forecast is 24-hours-ahead, using a moving window

` The range of the forecast is now from Mon Feb 13, 1995 to Fri
Feb 17, 1995 (hourly interval from # 721 to # 840)

` The MATLAB code is on the next page

Forecast of the five following weekdays (MATLAB code)
first = 240+480+1;                  % new input set (120 hourly loads)
last = 240+480+120;
x = [ ];
for lag = 24:48,                    % compose the matrix of the
    x = [x xl(first-lag:last-lag)]; % past load regressors
end
xd = data(first:last, 2);           % day calendar number regressor
xh = data(first:last, 3);           % hour clock number regressor
X = [x xh xd];                      % regressor X (120x27) matrix
Y = xl(first:last);                 % target Y (120x1) vector
YE = X*H;                           % forecasted target vector YE (120x1)
plot(Y), hold, plot(YE,'r')         % plotting actual and forecasted target
                                    % vectors
mape(YE,Y)                          % evaluation of the error (MAPE)

Forecast of the five following weekdays (results)

— actual
— forecast

MAPE = 3.59%

Hourly load from Mon, Feb 13 to Fri, Feb 17 1995 (weekdays only)

Forecast error (residual)

Error parameters:
MaxAE = 23 MW
MinAE = 0.3 MW
MAE = 6.4 MW

Error analysis

` The forecast error is generally measured by MAPE (Mean Absolute
Percentage Error):
MAPE = (100 / N) · Σ_{i=1..N} |ŷ_i − y_i| / y_i
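The formula translates directly into code. A small Python/NumPy sketch (the MW values are hypothetical):

```python
import numpy as np

def mape(y_est, y):
    """Mean Absolute Percentage Error, in percent."""
    y_est, y = np.asarray(y_est, float), np.asarray(y, float)
    return 100.0 * np.mean(np.abs(y_est - y) / np.abs(y))

actual   = np.array([250.0, 260.0, 240.0, 255.0])   # MW, invented values
forecast = np.array([245.0, 266.0, 238.0, 251.0])
print(f"MAPE = {mape(forecast, actual):.2f}%")      # prints "MAPE = 1.68%"
```

Note that MAPE weights each hour by its actual load, so errors at low-load hours count relatively more.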

` The quality of the forecast is mainly assessed by the error analysis
(residual analysis):
− the correlation between residual and regressors must be low ⇒ all
regressors have been exploited

− the autocorrelation of the residual for all lags must be low ⇒ no further
regressors are available

− the residual must be a normally distributed random noise ⇒ the residual
is not foreseeable

Tests for assessing the normality of the residual

` Histogram with normal fit


Graphical technique that compares the probability density of the fitted
distribution with a histogram of the data, normalised so that they have
the same area.

` Normal probability plot


Graphical technique for roughly assessing the normal distribution of a
sample. The data are plotted against a theoretical normal distribution in
the axes [probability/data values], so that normal data fall on a straight
line. Departures from this straight line indicate departures from normality

` Lilliefors test
Estimates the mean and the variance of the sample and makes a
comparison with theoretical values from a normal distribution

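The Lilliefors idea can be sketched in plain Python/NumPy: standardize the sample with its own estimated mean and variance, then measure the largest gap between its empirical CDF and the standard normal CDF (a real test would compare this statistic against Lilliefors' critical values, which this sketch omits):

```python
import math
import numpy as np

def ks_normal_stat(x):
    """Largest gap between the empirical CDF of the standardized sample
    and the standard normal CDF. Because the mean and variance are
    estimated from the sample itself, this is the Lilliefors statistic,
    whose critical values differ from those of the plain KS test."""
    z = np.sort((x - x.mean()) / x.std(ddof=1))
    n = z.size
    cdf = np.array([0.5 * (1 + math.erf(v / math.sqrt(2))) for v in z])  # Φ(z)
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)   # ECDF above Φ
    d_minus = np.max(cdf - np.arange(0, n) / n)      # Φ above ECDF
    return float(max(d_plus, d_minus))

rng = np.random.default_rng(5)
d_norm = ks_normal_stat(rng.normal(0, 1, 500))       # normal "residual"
d_skew = ks_normal_stat(rng.exponential(1.0, 500))   # clearly non-normal one
print(round(d_norm, 3), round(d_skew, 3))
```

The statistic stays small for the genuinely normal sample and is much larger for the skewed one, which is the behavior the test exploits.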
Error analysis (1)
Correlation between residual and regressors

⇒ no significant correlation exists

Error analysis (2)
Autocorrelation of the residual

⇒ no significant autocorrelation exists.

Error analysis (3)
Histogram with normal fit

[figure: histogram of the residual with normal fit]

⇒ the error distribution is quite different from the normal one
Error analysis (4)
Normal probability plot

⇒ the skewness at the two extremes of the sample distribution
shows an imperfect fit with the normal distribution
Error analysis (5)

` Lilliefors test accepts the normality hypothesis

` The statistical results are uncertain

` The availability of some regressor inside the time window
between lag –1 and lag –24 could increase the quality of the
forecast

` Of course this is a rough forecast. In fact, the MAPE (3.59%) is
higher than the maximum expected value (2.5% for 24-hours-ahead
forecast)

ISONE 2006 data set (all days): 1-hour-ahead forecast

` Let’s try now an example of 1-hour-ahead forecast using the
ISONE 2006 data set (all days)

` We use the model:


L(t) = F [L(t-1), …, L(t-24), h(t), d(t), dbt(t-1), dpt(t-1)]
composed of the load variables from lag -1 to lag -24, the hour
and day indexes h(t) and d(t), and the dry bulb and dew point
temperatures at lag -1, dbt(t-1) and dpt(t-1)

` The regression parameters are estimated on the first 4 weeks of
2006

` The forecast is made on the following 5th week of 2006

Parameter estimation: MATLAB code
load NE2006                              % load ISONE 2006 data set
xl = NE2006(:,6);                        % hourly loads
xtemp = NE2006(:,4);                     % dry bulb temperature
xdp = NE2006(:,5);                       % dew point temperature
hh = NE2006(:,2);                        % hour of day
dd = NE2006(:,3);                        % day of week

first = 48+1;                            % first and last observation
last = 48+672;                           % first 4 weeks
x = [];
for lag = 1:24,
    x = [x xl(first-lag:last-lag)];      % past load regressors
end
lag=1; xt1 = xtemp(first-lag:last-lag);  % dry bulb temperature at lag -1
lag=1; xp1 = xdp(first-lag:last-lag);    % dew point temperature at lag -1
xh = hh(first:last);                     % hour of the day
xd = dd(first:last);                     % day of the week
X = [x xh xd xt1 xp1];                   % regressor X
Y = xl(first:last);                      % target Y

Parameter estimation: MATLAB code …

H = pinv(X)*Y;      % regression parameters vector H (28x1)
YE = X*H;           % estimated target vector YE (672x1)

plot(Y)             % plotting actual and estimated target
hold                % vectors
plot(YE,'r')
mape(YE,Y)          % evaluation of the error (MAPE)

Parameter estimation – 1-hour-ahead
[figure: actual and estimated hourly load, ×10^4 MW, four weeks]
— actual
— estimated

MAPE = 1.04%

Hourly load from Tue, Jan 3, 2006 to Mon, Jan 30 2006 (ISONE)
Forecast of the following week (MATLAB code)
first = 48+672+1;                        % following 5th week (hourly loads)
last = 48+672+168;
x = [];
for lag = 1:24,
    x = [x xl(first-lag:last-lag)];
end
lag=1; xt1 = xtemp(first-lag:last-lag);
lag=1; xp1 = xdp(first-lag:last-lag);
xh = hh(first:last);
xd = dd(first:last);
X = [x xh xd xt1 xp1];
Y = xl(first:last);
YE = X*H;                                % forecast
mape(YE,Y)                               % evaluation of the error (MAPE)

Forecast of the following week
[figure: actual and forecast hourly load, ×10^4 MW, one week]
— actual
— forecast

MAPE = 1.15%

Hourly load from Tue, Jan 31, 2006 to Mon, Feb 6, 2006 (ISONE)
Error analysis (1)
Correlation between residual and regressors
[figure: correlation coefficients between residual and regressors, all between −0.02 and 0.1]

⇒ no significant correlation exists

Error analysis (2)
Autocorrelation of the residual
[figure: autocorrelation of the residual, lags 0–90]

⇒ no significant autocorrelation exists

Error analysis (3)
Histogram with normal fit

[figure: histogram of the residual with normal fit]

⇒ the error distribution is more similar to the normal one

Error analysis (4)
Normal probability plot
[figure: normal probability plot of the residual]

⇒ the skewness at the two extremes of the sample distribution
shows an imperfect fit with the normal distribution
Error analysis (5)

` Lilliefors test rejects the normality hypothesis

` The statistical results are uncertain

` The MAPE is acceptable (1.15%) and falls within the maximum
expected value (1.50%)

` The availability of regressors from lag -1 improves the forecast

` The 24-hours-ahead forecast made with the same data gives
worse results (5.39%)

Linear regression examples: conclusions

` The 24-hours-ahead forecast made with regression analysis
gives unsatisfactory results in both time series (MAPE from 3%
to 5%)

` Better results are obtained for 1-hour-ahead forecast, since very
short-term factors are taken into account

` The main limit of the method lies in the linear combination of the
time series. A non-linear combination would be more flexible
