A New Procedure For Generalized Star Modeling Using Iacm Approach
A New Procedure For Generalized Star Modeling Using Iacm Approach
A New Procedure For Generalized Star Modeling Using Iacm Approach
net/publication/266007638
CITATIONS READS
10 159
2 authors:
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Utriweni Mukhaiyar on 09 June 2016.
1 Introduction
An observation at a certain spatial location and time may be commonly
influenced by previous-time observations in that location and its neighboring
locations. For example, the oil production of a certain well during this month
can be affected by the production during the previous months at that same
location and at the neighboring wells; the number of today’s criminal actions in
a city may be influenced by the number of criminal actions that happened
during the previous days in that city and the nearest cities; or, then the number
of vehicles at a certain intersection within a certain time range is influenced by
the number of vehicles at that location and related intersections during the
previous time ranges. These examples are the portrait of space-time analysis
problems.
Also, we would like to be able to forecast the oil production of that particular
well during the coming months; how many criminal actions will happen in the
city; and how many vehicles can be found at that particular intersection during
the next range times. Knowing these forecasts, people in charge may take the
necessary actions. We can do forecasting after obtaining the appropriate model.
One way of doing this is space-time modeling.
The ℓ-th spatial order shows the ordered neighbors of a number of spatial
locations, while spatial weight wij shows the influence of location j on location
i in spatial lag ℓ-th. The locations which are closer to the reference location (s0)
will have a smaller spatial lag, which is distinguished by fixed distance d0
(Figure 1). Since the configuration in each spatial lag order is distinct, different
spatial lags may give different weight matrices.
2.2 Stationarity
GSTAR models are invertible since the vector of observations Yt could be
expressed as a weighted linear combination of past observations with weights
that converge to zero. In order for a GSTAR model to represent a stationary
process, one in which the covariance structure of Yt does not change with time,
certain conditions should be met. We obtain the stationary conditions of the first
and second order of GSTAR models using its IAcM by the following
propositions.
Proof. The IAcM is built from the sum square of errors from all past
observations until the first observation. For (N×1)-dimensional vector of
observations, 𝒀1 = 𝑌1,1 , 𝑌2,1 , … , 𝑌𝑁,1 ′ , we can write 𝐘1 ′𝐌1 𝐘1 = 𝟏𝒕=−∞ 𝐞𝑡 ′𝐞𝑡
which has positive and finite values. The 𝐌𝟏 is symmetric and since 𝐘1 ′𝐌1 𝐘1 >
0 then 𝐌𝟏 is positive definite matrix [4]. It implies the determinants of all
leading principal submatrices of 𝐌𝟏 are positive. Furthermore, since 𝐌𝟏 and the
covariance structure is do not change with the time, the GSTAR(1;1) process is
stationary. Q.E.D.
The IAcM is a symmetric positive definite matrix. For GSTAR(1; 1) models
1
with 1 0, we may define A Φ1 W , and the IAcM is,
0
𝐌𝟏 = 𝐈𝑁 − 𝐀′𝐀 (2)
Proposition 2 GSTAR(1; 1) models are stationary if the determinants of all
the leading principal submatrices of M1 are positive.
The absolute eigen value of A must be less than one, which means that the
GSTAR(1; 1) process is stationary. Q.E.D.
0
k s
1 ()
W
Yt s
'
(k )
E W Ys (4)
N
Since equation (4) is a scalar then we write
k s
1
N
1
tr W( ) W( k ) E Yt ' Yt s tr W( ) W( k ) Γ s
N
(5)
T s
ˆ s Yt Yt s ' and by substituting it to (5) then
The estimator for Γ s is Γ
t 1 T s
N T s
ˆ k s L Y
i ,t L Yi ,t s
k
ˆ k s
i 1 i 1 (6)
ˆ 0 ˆ kk 0
1/ 2 1/ 2
N T
L
2 N T
L Yi ,t
2
k
Yi ,t
i 1 t 1 i 1 t 1
with L as the spatial lag operator of spatial order -th, so that
Yi ,t , 0
N
L Yi ,t
wij Y j ,t , 1, 2,..., p
j 1
the STPACF for spatial order λ is the last coefficient of 𝜙ℎℓ (ℓ = 0,1,…, λ and h
= 1,2,…) in space-time Yule Walker equation as defined in [1].
This procedure combines and advances the three-stage iterative procedure for
space-time modeling of Pfeifer & Deutsch [1] and time-series modeling of Box
& Jenkins [6]. The new aspect of this procedure is the process stationarity check
with the IAcM *).
START Yes
Transformation/
differentiation of Model Parameter estimation
observations No Both mean and identification (Least Squares
variance are constant Plot of ST-ACF & Estimation)
ST-PACF
Model is stationary*)
No (Determinants of principal sub-
IAcM are positive)
No
Yes
4 Case Study
For our case study we used the data of the monthly tea production of six
plantation sites (N = 6) in West Java, Indonesia (Purbasari, Santosa, Sedep,
Papandayan, Cisaruni, and Dayeuh Manggung), from January 1992 to
December 2010 (T = 228). We executed space-time modeling by using Matlab
and started by centering the processes in order to obtain the null mean of the
processes. The coordinates of the observed plantations and the distances
between them are displayed in a symmetric (N×N)-dimensional matrix of which
the main diagonal is null (see Figure 3).
From Figure 3 we can draft the numbers of spatial lag order and its members,
and also the weight matrices for each spatial lag. We can build the range of
spatial lag based on the fixed radius system in Figure 1 with a fixed d0 of 0.4
km. The interval distance (0, 0.4] is categorized as spatial lag 1, (0.4, 0.8] as
spatial lag 2, and so on until (0.4(-1), 0.4] as spatial lag ℓ. For this case we got
four spatial lags for eachweight matrix (uniform, euclidean-uniform and
binary). The appropriate weight matrices for the first and second spatial lag are
shown in Table 1.
186 Utriweni Mukhaiyar and Udjianna S. Pasaribu
(a) (b)
Figure 3 (a) The coordinate of six observed plantations, and (b) the distances
(in kilometre, km). PUR is Purbasari, SAN is Santosa, SED is Sedep, PAP is
Papandayan, CIA is Cisaruni, and DAM is Dayeuh Manggung.
Table 1 The weight matrices for spatial lag ℓ-th, ℓ = 1,2,3,4. For ℓ = 0, the
weight matrix 𝐖 (0) = 𝐈𝑁 . The sum of each row must be equal to one with a null
main diagonal.
Uniform Euclidean - Uniform Biner
0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
W (1)
0
0 0.5 0 0.5 0
0 0 0.495 0 0.505 0
0 0 1 0 0 0
0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0
0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 1 0
0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0
0 0 1 0 0 0
0 0 1 0 0 0 0 0 1 0 0 0
1 / 3 1 / 3 0 0 1/3 0 0.311 0.376 0 0 0.313 0 0 0 0 0 1 0
W (2)
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 0 0 0 0 1 0 0 0
0 1 0 0 0 1 0 0 0
0
0
0
0 0 1 0
0 0 0 1 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0 0 0 0.5 0.5 0 0 0 0 0.544 0.456 0 0 0 0 0 1 0
1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
W (3)
1 / 3 1 / 3 0 0 0 1 / 3
0.304 0.355 0 0 0
0.342
0 0 0 0 0 1
0 0.5 0 0 0 0.5 0 0.460 0 0 0 0.540 0 0 0 0 0 1
0 0 1 / 3 1 / 3 1 / 3 0 0 0 0.326 0.334 0.340 0 0 0 1 0 0 0
0 0 0 0 0.5 0.5 0 0 0 0 0.491 0.509 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
W (4)
1 0 0 0 0 0
1 0 0 0 0 0
1 0 0 0 0 0
1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
0.482 0.518 0
0.5 0.5 0 0 0 0 0 0 0 0 1 0 0 0 0
The appropriate weight of the matrices in Table 1 has similar values. This is
acceptable because they have the same criterion for determining the spatial lag
interval. This happens because there are only six observed sites, which should
be categorized into four levels of spatial authority (p = 4). Since there is
subjective judgment involved in building the weight matrices, we should pay
more attention to the rules, especially when the ratio between the number of
locations and the number of spatial lags is close to 1:1. In these cases, it will be
more difficult to obtain appropriate and obedient weight matrices.
A New Procedure of Generalized STAR Modeling 187
Table 2 The STACF and STPACF values for several time and spatial lag.
STACF STPACF
spatial lag 0 1 2 3 4 spatial lag 0 1 2 3 4
time lag time lag
1 0.4697 0.3102 0.3315 0.1989 0.3329 1 0.3578 -0.0086 0.1975 -0.0984 0.0634
2 0.1554 0.006 0.0105 -0.0339 0.0401 2 0.0191 -0.0385 -0.2131 0.1136 -0.0886
3 0.0699 -0.0779 -0.0912 -0.1194 -0.0532 3 0.1688 0.076 -0.088 -0.0345 -0.0845
4 -0.0102 -0.1232 -0.0859 -0.0821 -0.0374 4 -0.1734 -0.0166 -0.0854 0.0221 0.1226
5 0.0651 -0.0633 0.0083 0.0136 0.0473 5 0.0486 -0.0135 0.1478 -0.0569 -0.0015
6 0.0408 -0.0868 -0.004 0.0331 0.0372 6 -0.0401 -0.0529 -0.0472 0.0905 -0.045
7 0.0032 -0.1322 -0.0478 0.0236 -0.0063 7 0.0488 -0.0038 -0.0297 0.1858 -0.1577
8 -0.0343 -0.1714 -0.1305 -0.0862 -0.0808 8 0.036 -0.0311 -0.0982 0.0292 -0.0511
From Table 2, the STACF values for all spatial lags already have cut off after
first time lag. This alert us to consider Moving Average (MA) factors in the
models. But if we plot those values for larger time lags, it seems like sinusoidal
graphs are coming out. This means that we can temporarily neglect the MA
factors and more focus on the STPACF values that represent autoregressive
terms. This is also because this paper discusses about the GSTAR modeling.
The STPACF values are cut off after the first and second lag time for zero,
second, and third spatial lag. From these lags (both spatial and time) we obtain
some possible models, namely: GSTAR(1;0), GSTAR(1;1), GSTAR(1;3),
GSTAR(2;1,1), GSTAR(2;1,3), etc.
The next step is parameter estimation with the least squares method. To execute
this step we should rearrange the GSTAR model to be a linear structure,
Y=X+e, with 𝐘 = 𝐘1′ , 𝐘2′ , … , 𝐘𝑁′ ′, 𝐗 = diag 𝐗1 , 𝐗 2 , … , 𝐗 𝑁 , 𝚽 =
𝚽1′ , 𝚽2′ , … , 𝚽𝑁′ , and 𝐞 = 𝐞1′ , 𝐞′2 , … , 𝐞′𝑁 ′. For each location we have linear
structure 𝐘𝑖 = 𝐗 𝑖 𝚽𝑖 + 𝐞𝑖 , 𝑖 = 1,2, … , 𝑁, with 𝐘𝑖 = 𝑌𝑖,𝑝 , 𝑌𝑖,𝑝+1 , … , 𝑌𝑖,𝑇 ′,
𝐞𝑖 = 𝑒𝑖,𝑝 , 𝑒𝑖,𝑝+1 , … , 𝑒𝑖,𝑇 ′,
(𝑖) (𝑖) (𝑖) (𝑖) (𝑖) (𝑖)
𝚽𝑖 = 𝜙10 , … , 𝜙1𝜆 1 , 𝜙20 , … , 𝜙2𝜆 2 , … , 𝜙𝑝0 , … , 𝜙𝑝 𝜆 𝑝 ′, and
,p 1 i ,p 1 ,0 ,0
( )
V (0) Vi (1) Vi (,p1 ) Vi (0) Vi (1) Vi ,1 p
Xi i ,p ,p ,1 ,1
V (0) V (1) Vi (,T1)1 Vi (0) Vi (1) Vi ,T p p
( )
i ,T 1 i ,T 1 ,T p ,T p
Based on Proposition 1 to 3, the model that produce at least one negative value
of the leading principal submatrix determinant value in the IAcM, will be
eliminated from appropriate model candidates. Looking at Figure 4, we notice
that models of which at least one of its leading principal matrix has negative
determinant are: GSTAR(2;0,2), GSTAR(2;0,3), GSTAR(2;1,3),
GSTAR(2;3,1), GSTAR(2;3,2), and GSTAR(2;3,3). We may leave these
models and move forward to the next stage, diagnostic checking (Model
Validation 2).
The model with the smallest MSR value, GSTAR(2;3,0), can be proposed as the
most appropriate model for this case and be employed to execute the short
forecasting. Since there are some models that have close MSR values, we will
perform an additional step. We let the forecasting stage serve as an advanced
checking diagnostic stage. For efficiency, we choose the five possible models
A New Procedure of Generalized STAR Modeling 189
that produced the lowest MSR from Table 3. Those models are: GSTAR(2;3,0),
GSTAR(1;3), GSTAR(2;2,0), GSTAR(2;1,0), and GSTAR(1;2).
GSTAR(2;0,1) GSTAR(2;0,2)
GSTAR(1;0) GSTAR(1;1) GSTAR(2;0,3) GSTAR(2;1,0)
GSTAR(2;2,0) GSTAR(2;3,0)
GSTAR(1;2) GSTAR(1;3)
1 1
0.8 0.8
0.6 0.6
Determinant
Determinant
0.4 0.4
0.2 0.2
0 0
0 2 4 6 0 1 2 3 4 5 6 7 8 9 10 11 12
-0.2 -0.2
Leading principal sub matrices -th Leading principal sub matrices -th
(b)
(a)
GSTAR(2;1,1) GSTAR(2;1,2) GSTAR(2;2,1)
GSTAR(2;2,2) GSTAR(2;2,3) GSTAR(2;1,3)
GSTAR(2;3,1) GSTAR(2;3,2) GSTAR(2;3,3)
1
0.8
0.6
Determinant
0.4
0.2
0
0 1 2 3 4 5 6 7 8 9 10 11 12
-0.2
Leading principal sub matrices -th
(c)
Figure 4 The determinant values of all the leading principal matrices of the
IAcM tend to be smaller as the dimensions of the leading principle matrices
become larger, (a) possible models with 1st time lag, (b) possible models with 2nd
time lag and one of spatial lag is 0, and (c) others possible models with 2nd time
lag.
190 Utriweni Mukhaiyar and Udjianna S. Pasaribu
For the forecasting stage, we prepared the ten latest data, collected from January
to October 2011. In the advanced checking diagnostic stage, we estimate these
data using the five selected models in order to obtain what we call forecast-
estimated observations. These forecast-estimated observations will be compared
with real observations using (9), so that we have the MSR of the forecast-
estimated observations. The results of the MSR can be seen in Table 4.
200 200
100 100
0 0
-100 -100
-200 -200
-3 -2 -1 0 1 2 3 -3 -2 -1 0 1 2 3
Standard Normal Quantiles Standard Normal Quantiles
Site 3 - Sedep Site 4 - Papandayan
Quantiles of Input Sample
200 200
100 100
0 0
-100 -100
-200 -200
-3 -2 -1 0 1 2 3 -3 -2 -1 0 1 2 3
Standard Normal Quantiles Standard Normal Quantiles
Site 5 - Cisaruni Site 6 - Dayeuh Manggung
Quantiles of Input Sample
200 200
100 100
0 0
-100 -100
-200 -200
-3 -2 -1 0 1 2 3 -3 -2 -1 0 1 2 3
Standard Normal Quantiles Standard Normal Quantiles
Figure 5 QQ Plot of Sample Data versus Standard Normal Data for each site i, i
= 1,2,...,6 (from top left to bottom right). Quantile-quantile plot of the sample
quantiles of residuals versus theoretical quantiles from a normal distribution.
A New Procedure of Generalized STAR Modeling 191
production
200 200
100 100
0 0
200 205 210 215 220 225 230 235 240 200 205 210 215 220 225 230 235 240
time time
Site 3 - Sedep Site 4 - Papandayan
300 300
production
production
200 200
100 100
0 0
200 205 210 215 220 225 230 235 240 200 205 210 215 220 225 230 235 240
time time
Site 5 - Cisaruni Site 6 - Dayeuh Manggung
300 300
production
production
200 200
100 100
0 0
200 205 210 215 220 225 230 235 240 200 205 210 215 220 225 230 235 240
time time
real forecast-real forecast-estimated
The plots of the 28 latest observations and the 10 next forecasted observations,
the total MSR of which is 108.9293, are shown in Figure 6. The forecast-
estimated plots display similar patterns and are close enough to the real
observations.
5 Conclusions
An IAcM is suggested to be used in selecting the approriate space-time models
in completion of STACF and STPACF values. Using an IAcM decreases the
subjectivity of the judgment about which models are appropriate. In the case
study of the monthly tea production of six plantations in West Java, we found
that GSTAR(2;3,0) is the most appropriate model. We also observed that three
types of matrices (uniform, euclidean, and binary) will have the same influence
on the modeling process, since the ratio between the number of spatial lags and
observed locations is close to one. They are supposed to have different results if
the number of observed locationsis much higher than the number of observed
spatial lags.
Acknowledgement
The research of the first author was supported by the Fundamental Research of
DIKTI DIPA ITB program in 2011. The data were supplied by PT. Perkebunan
Nusantara VIII (Persero) in Bandung. The authors are also grateful to Assoc.
Prof Wono Setya Budhi and Khreshna Syuhada, Ph.D for their valuable
discussions.
References
[1] Pfeifer, P.E., & Deutsch, S.J., A Three-Stage Iterative Approach for
Space-Time Modeling, Technometrics, 22(1), pp. 35-47, 1980.
[2] Rahardjo, S. Identifikasi Kestasioneran Proses Autoregresif Secara
Analitis, PhD Dissertation, Mathematics, Inst. Teknol. Bandung, 2005.
[3] Borovkova, S.A., Lopuhaä, H.P., & Nurani, B., Consistency and
Asymptotic Normality of Least Squares Estimators in Generalized Space-
Time Models, Statistica Neerlandica, 62, pp. 482-508, 2008.
[4] Laub, A.J., Matrix Analysis for Scientists & Engineers, SIAM,
Philadelphia, 2005.
[5] Wei, W.W.S., Time Series Analysis: Univariate and Multivariate
Methods 2nd. Ed., Pearson Addison Wesley, Boston, 2006.
[6] Box, G.E.P., Jenkins, G.M. & Reinsel, G., Time Series Analysis,
Forecasting and Control, 3rd ed., Prentice Hall, New Jersey, 1994.