Originally published as:
Fiedler, B., Zöller, G., Holschneider, M., Hainzl, S. (2018): Multiple Change‐Point Detection in
Spatiotemporal Seismicity Data. - Bulletin of the Seismological Society of America, 108, 3A, pp.
1147—1159.
DOI: http://doi.org/10.1785/0120170236
Multiple change-point detection in spatio-temporal
seismicity data
August 28, 2018
Bernhard Fiedler, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-25,
14476 Potsdam, Germany. Email
[email protected]
Gert Zöller, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476
Potsdam, Germany
Matthias Holschneider, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str.
24-25, 14476 Potsdam, Germany
Sebastian Hainzl, GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany
1
Abstract
1
2
Earthquake rates are driven by tectonic stress buildup, earthquake- induced stress
3
changes, and transient aseismic processes. Although the origin of the first two sources
4
is known, transient aseismic processes are more difficult to detect. However, the knowl-
5
edge of the associated changes of the earthquake activity is of great interest, because
6
it might help identify natural aseismic deformation patterns such as slow-slip events, as
7
well as the occurrence of induced seismicity related to human activities. For this goal, we
8
develop a Bayesian approach to identify change- points in seismicity data automatically.
9
Using the Bayes factor, we select a suitable model, estimate possible change-points, and
10
we additionally use a likelihood ratio test to calculate the significance of the change of the
11
intensity. The approach is extended to spatiotemporal data to detect the area in which
12
the changes occur. The method is first applied to synthetic data showing its capability
13
to detect real change-points. Finally, we apply this approach to observational data from
14
Oklahoma and observe statistical significant changes of seismicity in space and time.
15
Introduction
16
Natural seismicity is a nonstationary process with vari- ous kinds of transient behavior on
17
different spatiotemporal scales, for example, aftershocks, foreshocks, swarm activity, and
18
quiescence lasting from hours to decades. Man-made earthquakes, for example, arising
19
from fluid injection in geothermal areas or wastewater disposals (Ellsworth, 2013) have
20
similar statistical features, but on smaller spatial scales with transient boundary conditions.
2
21
For example, the grow- ing amount of industrial projects related to the injection of fluids at
22
depth has led to the question, to which degree the seismic hazard changes at an injection
23
site. Figure 1 shows a clear increase of the earthquake number in Oklahoma at around
24
the year 2010. Several authors including Keranen et al. (2013), Langenbruch and Zoback
25
(2016), Walsh and Zoback (2015) and Weingarten et al. (2015) reported a correlation
26
between the injection volume and the observed increase of the seismicity.
27
In our study, we propose a Bayesian approach to detect transients in seismicity. Using
28
the Poisson assumption for the occurrence of earthquakes, we apply a method which was
29
first introduced by Raftery and Akman (1986) and further applied to earthquake data by
30
Gupta and Baker (2015), Montoya and Wang (2017) and Gupta and Baker (2017). We
31
go beyond these works and present an algorithm that allows the identification of multiple
32
change-points that occur in space and time. Moreover, we note that for observational
33
data, signals for change-points might be weak and difficult to distinguish from random
34
fluctuations. Therefore, we put special emphasis on the development of an appropriate
35
significance test and apply the concept of the Bayes factor for model selection.
36
Our model approach is based on the assumption that the earthquake occurrence follows
37
a piecewise homogeneous Poisson process (HPP) in time. In particular, the system is
38
assumed to suddenly change from one Poisson rate into another. Such transitions are
39
defined as change-points in time. This approach is then extended to space–time in a
40
straightforward way by subdividing the area into smaller segments of a specific size. For
41
every subarea, we obtain a time series contain- ing change-points or not. In both cases,
42
we first address the question which model is statistically preferable, for example, a model
3
43
with or without a change-point. If a specific change- point model is preferred, we then
44
use an extended approach of Raftery and Akman (1986) to estimate the change-points
45
and calculate associated Bayesian credibility intervals at a given significance level (e.g.,
46
95%). This is described in the Estimation of Change-Points section. Additionally, we use
47
a likelihood ratio test to calculate the significance (p-value) of the change of the Poisson
48
intensity (the Likelihood Ratio Test section) and extend the approach to the space–time
49
prob- lem (the Spatiotemporal Change-Point Problem section). By means of synthetic data,
50
we demonstrate the performance of the method (the Illustration for Synthetic Data section)
51
before applying it to the observed data from Oklahoma (the Application to Seismicity in
52
Oklahoma section).
53
Method
54
Estimation of Change-Points
55
First we give a brief overview on the detection of temporal change-points according to
56
Raftery and Akman (1986). In comparison to this work we extend the method to a general
57
case with more than one change-point.
58
An observation period of [a, b] is given with n events at times
a ≤ t1 < t2 < . . . < tn ≤ b.
59
(1)
We assume the existence of k change-points
τ1 , τ2 , . . . , τk ∈ [a, b]
4
(2)
60
with k < n. Moreover in [a, τ1 ] we have N (τ1 ) events which come from a Poisson process
61
with rate λ1 , and N (τi ) − N (τi−1 ) events in (τi−1 , τi ] with rate λi for i = 2, . . . , k.
62
Finally, in (τk , b] the number of events follows a Poisson process with rate λk+1 .
63
Let t = {t1 , . . . , tn } and θ = {λ1 , . . . , λk+1 , τ1 , . . . , τk }. It can easily be shown that
64
the mutual likelihood function is given by
N (τ1 ) −λ1 (τ1 −a) N (τ2 )−N (τ1 ) −λ2 (τ2 −τ1 )
e
λ2
e
p(t | θ) = λ1
N (τ )
N (b)−N (τk ) −λk+1 (b−τk )
λ1 1 e−λ1 (τ1 −a) λk+1
e
=
N (b)−N (τk ) −λk+1 (b−τk )
· . . . · λk+1
k
Y
e
(3)
N (τ )−N (τi−1 ) −λi (τi −τi−1 )
λi i
e
.
i=2
65
Using Bayes’ Theorem
p(θ | t) = R
p(t | θ)p(θ)
∝ p(t | θ)p(θ)
Θ p(t | θ) p(θ) dθ
(4)
66
we can calculate the posterior density p(θ | t) for the parameter θ given the data represented
67
by t = {t1 , . . . , tn }. Here p(t | θ) denotes the likelihood function and p(θ) is the prior
68
density of θ.
69
Let p(τ1 ), . . . , p(τk ) and p(λ1 ), . . . , p(λk+1 ) be the prior densities. Then the posterior
70
density is given by
N (τ1 ) −λ1 (τ1 −a) N (b)−N (τk )
e
λk+1
p(θ | t) ∝ p(τ1 )p(λ1 )p(λk+1 )λ1
×e
−λk+1 (b−τk )
k
Y
(5)
N (τ )−N (τi−1 ) −λi (τi −τi−1 )
p(τi )p(λi )λi i
e
.
i=2
71
Assuming now a flat prior, we calculate the marginal posterior density of τ = {τ1 , . . . , τk }
72
by integrating with respect to λ1 , . . . , λk+1 (see also the Derivation of the Marginal Pos-
73
terior Density section in the Appendix).
5
p(τ | t) = c
Z∞
×
...
0
k
Y
Z∞
N (τ1 ) −λ1 (τ1 −a) N (b)−N (τk ) −λk+1 (b−τk )
e
λk+1
e
λ1
0
N (τi )−N (τi−1 ) −λi (τi −τi−1 )
λi
e
dλ1 . . . dλk+1
i=2
= c(τ1 − a)−[N (τ1 )+1] Γ[N (τ1 ) + 1](b − τk )−[N (b)−N (τk )+1]
× Γ[N (b) − N (τk ) + 1]
k
Y
i=2
(τi − τi−1 )−[N (τi )−N (τi−1 )+1] Γ[N (τi ) − N (τi−1 ) + 1]
(6)
74
We consider two special cases of Eq. (6).
75
Special case: one change-point
p(τ | t) = c(τ − a)−[N (τ )+1] Γ[N (τ ) + 1](b − τ )−[N (b)−N (τ )+1] Γ[N (b) − N (τ ) + 1] (7)
76
Special case: two change-points
p(τ1 , τ2 | t) = c(τ1 − a)−[N (τ1 )+1] Γ[N (τ1 ) + 1](τ2 − τ1 )−[N (τ2 )−N (τ1 )+1]
× Γ[N (τ2 ) − N (τ1 ) + 1](b − τ2 )
−[N (b)−N (τ2 )+1]
(8)
Γ[N (b) − N (τ2 ) + 1]
77
We note that in Eq. (6), (7) and (8) c is a normalizing constant which ensures that the
78
conditions for a probability density function is fulfilled. Alternatively to a flat prior density,
79
a conjugated prior for the parameters λ1 , . . . , λk+1 (e.g. a gamma distribution) and
80
uniformly distributed prior densities for τ1 , . . . , τk can be used (see also Raftery and
81
Akman (1986)). By maximizing p(τ | t) in Eq. (6) with respect to τ = {τ1 , . . . , τk } we
82
obtain the estimation τ̂ = {τ̂1 , . . . , τ̂k } for the change-points.
83
In Akman and Raftery (1986) it was shown that the estimator for a single change-
84
point is consistent and asymptotically normal. Moreover in Ghosal et al. (1999) and other
6
85
related papers (e.g. Ghosh et al. (1994) and Ghosal and Samanta (1995)) it was also
86
demonstrated that in this case the posterior distribution asymptotically behaves like an
87
exponential function on both sides of the detected change-point. The asymptotic behavior
88
for the general case with more than one change-point is shown in Ghosal et al. (1999).
89
For model selection, we use the Bayes factor to get an idea which model should be
90
preferred, that is, whether to prefer a change-point model (M1 ) or a model without a
91
change-point (model M0 ). The Bayes factor is defined by the ratio of the marginal or
92
integrated likelihood for both models, that is
Blm =
p(t | Ml )
.
p(t | Mm )
(9)
93
Here Ml and Mm denote a model with l respectively with m change-points where l, m =
94
0, 1, . . . , k. Apart from the goodness of fit, the complexity of the assumed model has to be
95
taken into account in order to assess the most capable model describing the data and thus
96
performing the estimation. As an example, if we test the hypothesis of no change-point
97
(H0 ) against a change-point model, the value of the Bayes factor quantifies the evidence
98
of the supported model, e.g. B01 < 0.01 can be interpreted as a decisive evidence against
99
H0 , compare Kass and Raftery (1995). Equation (9) strongly depends on the choice of
100
the prior distributions. When an improper prior is used, the Bayes factor is, however, not
101
well-defined and depends on an arbitrary ratio of constants. To handle this problem we
102
use the idea of an imaginary training sample which involves the smallest possible sample
103
size permitting a comparison of M0 and Mm and provides maximum possible support
104
for M0 . In this case the Bayes factor should be approximately one. This approach was
105
introduced in Spiegelhalter and Smith (1982) and was adopted and discussed in several
7
106
other works, e.g. Raftery and Akman (1986), Kass and Raftery (1995) or Berger et al.
107
(2004). Using improper prior densities for the intensities of the shape p(λi ) ∝ λi
108
uniform distributed prior for τi , i.e. p(τi ) =
109
into consideration the approach of Spiegelhalter and Smith (1982), the Bayes factor can
110
be calculated by
− 21
B01
1
b−a
and a
(Raftery and Akman, 1986) and taking
√
4 π(b − a)−n Γ(n + 21 )
.
= Pn
R ti+1
−(n−i+ 12 )
−(i+ 12 )
1
1
(b
−
τ
)
dτ
)Γ(n
−
i
+
)
Γ(i
+
(τ
−
a)
i=0
2
2
ti
(10)
111
This approach can be extended and the derivation for the general case Blm is shown in
112
the Appendix Derivation of the Bayes factor. For example, for the hypothesis of no change
113
point (H0 ) against a model with two change-points we get
n
n
1 X X
1
1
1
2
−n+ 21
B02 = 4π (b − a)
Γ(n + )
Γ(i + )Γ(j − i + )Γ(n − j + )
2
2
2
2
i=0 j=i+1
#−1
Z
Z
ti+1
×
ti
tj+1
tj
1
1
1
(τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(n−j+ 2 ) dτ1 dτ2
(11)
.
114
We note that the computation of Eq. (9) for large l and m is numerically very difficult to
115
handle because of the high- dimensional integrals. We remark that the function evaluations
116
grow exponentially as the number of dimensions increases. If the quadrature rules do not
117
lead to a desirable result, Monte Carlo methods should be used instead. In our work, we
118
apply a likelihood ratio test in addition to the established methods we considered before.
119
As an advantage, we get the significance (p-value) of the change of the Poisson intensity.
120
Needless to say that the Bayes factor is a powerful tool for the model selection, but although
121
we know the preferred model, we cannot yet prove that the estimated change-points are
122
significant. This problem can be solved with the aid of the likelihood ratio test.
8
123
Likelihood Ratio Test
124
We consider two Poisson processes with intensities λ1 and λ2 in the time intervals [s1 , s2 ]
125
and [s3 , s4 ] with s3 > s2 . In the first period we have n1 events, and in the second period
126
the number of events is n2 . We aim at testing whether or not the intensities are equal. In
127
detail we test hypothesis H0 versus H1 with
H 0 : λ1 = λ2 = λ
(12)
H1 : λ1 6= λ2 .
128
The likelihood function is given by
p(t | λ1 , λ2 ) = λn1 1 exp(−λ1 ∆1 )λn2 2 exp(−λ2 ∆2 ),
129
with ∆1 = s2 − s1 and ∆2 = s4 − s3 .
130
For H0 we get
(13)
p(t | λ) = λn1 +n2 exp[−λ(∆1 + ∆2 )].
(14)
131
As shown in the Appendix Derivation of the Likelihood Ratio Test, we can derive the
132
statistic of this test by calculation of the maximum likelihood estimators for λ, λ1 and λ2
133
and by using a general result of Witting and Müller-Funk (1995). It follows that the test
134
statistic of this likelihood ratio test is equal to
Z = 2 n1 log
n1
∆1
135
H0 is rejected, if z > χ21,
136
given significance level and χ21,
137
with one degree of freedom.
1−α
+ n2 log
n2
∆2
− (n1 + n2 ) log
n1 + n2
∆1 + ∆ 2
.
(15)
or if the p-value = P (Z ≥ z) < α. Here α ∈ (0, 1) is a
1−α
is the (1 − α)-quantile of the chi-squared distribution
9
138
To investigate the properties of this test, we perform calculations with artificially generated
139
data resulting in a reasonable resemblance to the error of the first kind α, as summarized in
140
Table 1. As an estimator for the error probability, we use the number of rejected hypotheses
141
divided by the number of generated samples. Moreover, Figure 2 illustrates the behavior
142
of the power for fixed values of λ and n. As expected, the simulations show that the test
143
can distinguish between H0 and H1 in a suitable way.
144
Spatiotemporal Change-Point Problem
145
In this section, we extend our approach for time series in a straightforward way towards
146
spatiotemporal change-point problems. For this aim, we scan an area D to find change-
147
points in space and time. Figure 3 illustrates the algorithm. First, the investigated domain
148
is subdivided into m subareas A1 , . . . , Am with D =
149
equidistantly centered subareas with the same size in the following way: We consider a set
150
of circles, where Ai has the radius r and the center (xi , yi ) for all i = 1, . . . , m. However,
151
any other subdivision is also possible, see Gupta and Baker (2017).
i=1 Ai .
For simplicity, we use
In the next step we investigate the time series of all events that occurred in Ai given
152
153
Sm
by
Si = {ti1 , ti2 , . . . , tini }.
154
Hence the data is a set of triples
m
[
i=1
155
(16)
(Ai ∪ Si ) =
m
[
i=1
{(xij , yij , tij ) | j = 1, . . . , ni }.
(17)
For Si we use our method to detect and evaluate change-points as described before in the
10
156
Estimation of Change-Point and Likelihood Ratio Test sections.
157
In detail, for every time series Si we use the Bayes factor (9) to decide which model fits
158
best to the given data. If a specific change-point model is preferred, we maximize p(τ | t)
159
in Eq. (6) and receive a set of possible change-points. For every estimated change-point
160
in this set we use the likelihood ratio test and define a change-point as significant, if the
161
p-value is smaller than a given significance level α. The result is a set τ̂ i = {τ̂ 1i , . . . , τ̂ ki }
162
of significant change-points in Si . Finally we provide the mathematical definition of a
163
transition event within a global statistical model Mtrans . For this aim, we define a set of
164
165
transition events as triples in the following way
(xi , yi , τ̂ i ), Si has at least one change-point
Mi =
.
Si has no change-point
∅,
Mtrans =
m
[
i=1
Mi .
(18)
(19)
166
Evaluation and Application
167
The derived methodology from the Method section is for test and illustration purposes first
168
applied on synthetic data and in the following part applied to real seismicity data recorded
169
in Oklahoma, United States.
170
Illustration for Synthetic Data
171
We first test our method by applying it to synthetic data under controlled conditions. For
172
this aim we generate synthetic time series with t ∈ [0, 1] with a single change-point at
11
173
τreal = 0.5 and investigate the goodness of the estimator. To test how the method works,
174
we calculate the standard deviation of τ̂ −τreal and Bayes factors depending on the number
175
of events and the ratio of the intensities, see Figure 4. It can be seen e.g. that a change-
176
point can be detected in sequences of 100 events with a high probability and precision, if
177
the intensity ratio exceeds a value of 2.
178
For the spatiotemporal approach, we generate random realizations of a 3D HPP with a
179
given intensity. From these data, we cut out cylinders and replace it by new cylinders with
180
data from HPPs with different intensities, as illustrated in Figure 5. Using our algorithm we
181
calculate the transition events Mtrans . Therefore we scan the whole domain as explained
182
in Figure 3. The ”training” sample is a 3D HPP with rate λ ≈ 0.8 (per unit area) in a
183
cylinder with center (0, 0), radius r1 = 6 and height h1 = 20, corresponding to the time
184
interval t ∈ [0, 20]. The replaced cylinders follows a HPP with rate λcp ≈ 8 (per unit
185
area). One cylinder has center (1, 1), radius r2 = 1 and height h2 = 10. Here the related
186
time interval is t ∈ [5, 15] and the second replaced cylinder has the center (−3, −3), radius
187
r3 = 1 and height h3 = 5. Here the related time interval is t ∈ [15, 20]. In other words,
188
the transitions are given by the sets
189
C1 = {(x, y, t) | (x + 3)2 + (y + 3)2 ≤ 1 ∧ t = 15}.
(20)
C2 = {(x, y, t) | (x − 1)2 + (y − 1)2 ≤ 1 ∧ t ∈ {5, 15}}.
(21)
and
190
The chosen sample size is n = 2000, and approximately 15% of the data is located within
191
the replaced cylinders. For this test case, we set the selection radius to r0 = 0.3. In
192
general, our results presented in Figure 4 can guide the choice for this selection: To be
12
193
able to detect a certain rate change, the event number within the selection radius must
194
exceed a minimum number, e.g. 20 events for a ten-times increased intensity as in our
195
example. For the change-point domain C1 the method yields an average value of τ̂ = 15.173
196
and for C2 we get average values of τ̂1 = 5.094 and τ̂2 = 14.971. The estimated areas
197
are illustrated in Figure 6. It is remarkable that apart from a small number of outliers the
198
complete transition area was detected correctly by the method.
199
Additionally we investigate the sensitivity of the method depending on the selection
200
radius. Therefore, we generated synthetic data from a HPP in the time interval t ∈ [0, 20],
201
where in the circular region with radius r0 = 2 around the center occurs a change at time
202
10 to a five-times increased rate, particularly the change-point domain is given by the set
C = {(x, y, t) | x2 + y 2 ≤ 4 ∧ t ∈ [10, 20]}.
(22)
203
The chosen sample size is n = 2000, where 50 events are within the change-point domain.
204
The intensities are given by λ ≈ 0.08 and λcp ≈ 0.4 (per unit area). For 100 simulations,
205
we calculate the Bayes factors and the standard deviation of τ̂ –τreal for increasing radii of
206
the event selection around the center. The results are illustrated in Figure 7. The test
207
results show that the estimation uncertainty is lowest and the success rate is highest for
208
the case that the selection radius equals the radius of the change-point region. A too
209
small selection radius leads to time series with a non-significant number of events, while a
210
too high value results in a systematical error and the precision of the method decreases.
211
However, the results are found to be almost the same for a rather broad range of selection
212
radii within 0.5r0 and 2r0 . This indicates that the results should be rather robust, if the
213
selection radius is chosen in a reasonable range.
13
214
Application to Seismicity in Oklahoma
215
We now apply the method to real earthquake data. Because of its drastic seismic activity
216
changes, Oklahoma probably counts as one of the most interesting study areas for the
217
application of the above estimations. Therefore, we consider an earthquake catalog from
218
Oklahoma with 18,997 events from 1 January 1980 to 31 December 2015, obtained from the
219
Oklahoma Geological Survey, compare Data and Resources. We declustered the catalog
220
using the method of Reasenberg (1985) with standard parameters (van Stiphout et al.
221
(2012), Tab. 3) and taking into account all events with magnitude m ≥ 3. The declustered
222
catalog contains 1,199 events. Using all m ≥ 3 events, the Bayes factor from Eq. (9) leads
223
to a model with two change-points (see detailed results in Table 2). The estimated 95%
224
credibility intervals for the (significant) change-points τ̂1 and τ̂2 are given by [12/01/2009;
225
28/03/2009] and [14/12/2013; 30/01/2014]. This result is illustrated in Figure 8. We
226
note that the application of the likelihood ratio test leads to p-values ≪ 1 which means
227
that both change-points are considered to be significant and the result strongly supports
228
our model selection. As depicted in Table 2, the calculation of the Bayes factor B01 , B02
229
and B03 always leads to the preference of a change-point model. For comparison, a model
230
with one change-point leads to a 95% credibility interval [24/10/2013; 10/11/2013]. A
231
model with three change-points would detect a further change-point in August 2014. If we
232
use the non-declustered catalog a model with three change-points leads to the detection
233
of the MW = 5.6 earthquake at Prague in November 2011 with a subsequent aftershock
234
sequence in addition to the induced seismic changes in 2009 and 2013 (see Figure 8). Here
235
we observe a natural change-point, caused by the aftershock sequence. In comparison to
14
236
the works of Gupta and Baker (2017) and Montoya and Wang (2017) we note that they
237
have found similar results for the change-points. The study of Montoya and Wang (2017)
238
used another method for multiple change-point detection in time series and included four
239
different areas in Oklahoma according to the towns Jones, Perry, Cherokee and Waynoka.
240
In all of the four areas their method lead to the choice of a model with two change-points.
241
In the Jones area they found two change-points in May, 2008 and August, 2011. For the
242
other areas they calculated change-points in 2013 until 2015. The work of Gupta and Baker
243
(2017) used the method of Raftery and Akman (1986) to detect single change-points in
244
spatiotemporal data. They used a 25 km radius and found changes in seismicity rates
245
between 2008 and the end of 2015.
246
By scanning the spatial domain shown in Figure 1 with a total area of approximately
247
260, 000 square kilometers, our method leads to the results shown in Figure 9 and Figure 10.
248
We used a radius of 5 km leading to 3, 500 evaluations of time series. This choice is
249
a compromise between optimizing the spatial resolution and increasing the detectability
250
which requires that the considered circles contains enough events to get robust results (see
251
Figure 4). In the Appendix Case study Oklahoma: Evaluation with different choices of the
252
radius, we show the results for alternative values of r = 2 km and r = 10 km indicating
253
that the main features are robust with regard to the choice of the selection radius. For a
254
better illustration of the results, we only take into account the models M0 , M1 and M2 .
255
Interestingly, the significant change-point locations show a spatial migration pattern from
256
south to north in both figures and overlap with the injection wells. Moreover a correlation
257
with the injection volume could be a reason for this result as illustrated in Figure 10.
15
258
Furthermore we show the related times of the detected transition events. It is remarkable,
259
that most of the corresponding times of the change-points occur after 2009. This result
260
supports the hypothesis that the detected change-points are correlated with the onset dates
261
of the wastewater injections. Here we have recorded an discernable increase of approval
262
dates after 2010 for wells with an approved injection volume of at least 10, 000 barrels per
263
day.
264
Conclusions
265
The main objective of this article is to present an algorithm for the automatic detection of
266
change-points in seismicity data. We use a Bayesian algorithm to identify rate changes in
267
time and space. Tests with synthetic earthquake data show a good agreement of detected
268
change-points with real change-points in space and time. For the Oklahoma case study, the
269
significant change-points show a correlation with the onset of injection wells and especially
270
with the high-volume wells. The method leads to reasonable findings of significant change-
271
points between 2008 and the end of 2015, which correspond to the results of Gupta and
272
Baker (2017) and Montoya and Wang (2017). This makes us confident that our method
273
is powerful for the automatic detection of change-points, even for cases with less drastic
274
activity changes as in Oklahoma.
275
Nevertheless we only consider a fixed radius for the subdivision of the space. As we have
276
shown the choice of the radius depends on the number of events, and the systematic error
277
should be taken into account. Here the method could be extended for example by using a
16
278
Voronoi partition (Okabe et al., 2008) or by using the approach of Gupta and Baker (2017).
279
Furthermore the likelihood ratio test assumes that we have two fixed intervals. Although
280
our method leads to preferable results, an adaptive test could be useful. Another idea for
281
such a test has been proposed in Csörgő and Horváth (1997). Another issue is the deviation
282
from Poissonian behavior, e.g. due to aftershock sequences. In this respect, it is desirable
283
to consider also cluster models like the Epidemic Type Aftershock Sequences (ETAS) model
284
(Ogata, 1988; Zhuang et al., 2002). The ETAS approach to detect seismic changes within
285
the framework of wastewater injections was presented by Wang et al. (2016). In our work
286
we use the declustering approach of Reasenberg (1985) but also other methods could be
287
used to fulfill the Poisson assumption for the considered catalogs (van Stiphout et al.,
288
2012).
289
Data and Resources
290
The data used in this article are from the websites
291
http://www.ou.edu/ogs/research/earthquakes/catalogs.html, last accessed August 28, 2018 and
292
http://www.occeweb.com/og/ogdatafiles2.htm, last accessed August 28, 2018.
293
294
295
296
Figure 1, Figure 9 and Figure 10 were made using the Generic Mapping Tools version
4.2.1 (www.soest.hawaii.edu/gmt, last accessed March 2018; Wessel and Smith (1998)).
Simulations were made using the open source software packages R version 3.2 and
Python version 2.7.12.
17
297
Acknowledgments
298
We are grateful to Hannelore Liero for fruitful discussions and comments. The manuscript
299
benefitted from constructive comments of two anonymous reviewers. This work was sup-
300
ported by the DFG Research Training Group “Natural hazards and risks in a changing
301
world”(NatRiskChange). GZ also acknowledges support from the DFG (SFB 1294).
302
References
303
Akman, V. E. and Raftery, A. E. (1986). Asymptotic inference for a change-point Poisson
304
305
306
307
308
process, The Annals of Statistics, 14(4), 1,583–1,590.
Berger, J. O., Pericchi, L. R. and others. (2004). Training samples in objective Bayesian
model selection, The Annals of Statistics, 32(3), 841–869.
Csörgő, M. and L. Horváth (1997). Limit theorems in change-point analysis, John Wiley
and Sons Inc, Vol. 18.
309
Ellsworth, William L. (2013). Injection-induced earthquakes, Science, 341(6142), 1225942.
310
Ghosal, S., and Samanta, T. (1995). Asymptotic behaviour of Bayes estimates and posterior
311
distributions in multiparameter nonregular cases, Mathematical Methods of Statistics,
312
4(4), 361–388.
313
Ghosal, S., Ghosh, J. K. and T. Samanta (1999). Approximation of the posterior distri-
314
bution in a change-point problem, Annals of the Institute of Statistical Mathematics,
315
51(3), 479–497.
18
316
Ghosh, J. K., Ghosal, S., and T. Samanta (1994). Stability and convergence of the posterior
317
in non-regular problems, Statistical Decision Theory and Related Topics V , Springer,
318
183–199.
319
Gupta, Abhineet, and Jack W. Baker (2015). A Bayesian change point model to detect
320
changes in event occurrence rates, with application to induced seismicity. In: 12th In-
321
ternational Conference on Applications of Statistics and Probability in Civil Engineering,
322
ICASP12, Vancouver, Canada, 8 pp..
323
Gupta, Abhineet, and Jack W. Baker (2017). Estimating spatially varying event rates with
324
a change point using Bayesian statistics: Application to induced seismicity, Structural
325
Safety , 65, 1–11.
326
327
Kass, Robert E. and Adrian E. Raftery (1995). Bayes factors, Journal of the American
Statistical Association, 90(430), 773–795.
328
Keranen, K. M., Savage, H. M., Abers, G. A. and Cochran, E. S. (2013). Potentially
329
induced earthquakes in Oklahoma, USA: Links between wastewater injection and the
330
2011 Mw 5.7 earthquake sequence, Geology , 41(6), 699–702.
331
Langenbruch, C. and Zoback, M. D. (2016). How will induced seismicity in Oklahoma
332
respond to decreased saltwater injection rates?, Science advances, 2(11), e1601542.
333
Montoya-Noguera, S. and Wang, Y. (2017). Bayesian identification of multiple seismic
334
change points and varying seismic rates caused by induced seismicity, Geophysical Re-
335
search Letters, 44(8), 3,509–3,516.
19
336
337
Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for
point processes, Journal of the American Statistical Association, 83(401), 9–27.
338
Okabe, A., Boots, B., Sugihara, K., Chiu, S. N. and Kendall, D. G. (2008). Definitions and
339
Basic Properties of Voronoi Diagrams. Spatial Tessellations: Concepts and Applications
340
of Voronoi Diagrams, , John Wiley and Sons, Inc. , Vol. 2, 43–228.
341
342
343
344
Raftery, A. E. and Akman, V. E. (1986). Bayesian analysis of a Poisson process with a
change-point, Biometrika, 73(1), 85–89.
Reasenberg, P. (1985). Second-order moment of central California seismicity, 1969-1982,
Journal of Geophysical Research: Solid Earth, 90(B7), 5479–5495.
345
Spiegelhalter, D. J. and Smith, A. F. (1982). Bayes factors for linear and log-linear models
346
with vague prior information, Journal of the Royal Statistical Society. Series B (Method-
347
ological), 44(3), 377–387.
348
349
350
351
van Stiphout, T., Zhuang, J. and Marsan, D. (2012). Seismicity declustering, Community
Online Resource for Statistical Seismicity Analysis, 10(1).
Walsh, F. R. and Zoback, M. D. (2015). Oklahoma’s recent earthquakes and saltwater
disposal, Science advances, 1(5), e1500195.
352
Wang, P., Small, M. J., Harbert, W. and Pozzi, M. (2016). A Bayesian approach for
353
assessing seismic transitions associated with wastewater injections, Bulletin of the Seis-
354
mological Society of America, 106(3), 832–845.
20
355
Weingarten, M., Ge, S., Godt, J. W., Bekins, B. A. and Rubinstein, J. L. (2015). High-
356
rate injection is associated with the increase in US mid-continent seismicity, Science,
357
348(6241), 1336–1340.
358
359
360
361
Wessel, P., and W. H. F. Smith (1998). New, improved version of generic mapping tools
released, EOS Trans. AGU, 79, 579.
Witting, H. and Müller-Funk, U. (1995). Mathematische Statistik: asymptotische Statistik:
parametrische Modelle und nichtparametrische Funktionale, Teubner , Vol. 2, 215–235.
362
Zhuang, J., Ogata, Y. and Vere-Jones, D. (2002). Stochastic declustering of space-time
363
earthquake occurrences, Journal of the American Statistical Association, 97(458), 369–
364
380.
365
Bernhard Fiedler, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str.
366
24-25, 14476 Potsdam, Germany. Email
[email protected]
367
368
Gert Zöller, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-
369
25, 14476 Potsdam, Germany.
370
371
Matthias Holschneider, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-
372
Str. 24-25, 14476 Potsdam, Germany.
373
374
Sebastian Hainzl, GFZ German Research Centre for Geosciences, Telegrafenberg, 14473
375
Potsdam, Germany
21
376
Tables
377
Table 1
Table 1: Estimation of the α-error simulations
λ1 = λ2
theoretical α
estimated α
number of events
1
0.05
0.061
10
1
0.05
0.057
50
1
0.05
0.052
100
1
0.05
0.049
1000
22
378
Table 2
Table 2: Bayes factors for the declustered catalog of M ≥ 3 earthquakes in Oklahoma. The
results indicates that two change-points are preferable.
Bayes factor
Decision
B01 = 3.73 × 10−158
M1
B02 = 1.67 × 10−197
M2
B03 =1.16 × 10−197
M3
B12 = 4.47 × 10−40
M2
B13 = 3.12 × 10−40
M3
B23 = 0.69
M2
23
379
Figure captions
380
Figure 1
381
(A) Magnitude-time plot for all earthquakes in Oklahoma from January 1, 1980 to De-
382
cember 31, 2015. (B) Cumulative number of earthquakes with M ≥ 3 in Oklahoma from
383
January 1, 1980 to December 31, 2015. Inset: Spatial map of all earthquakes with M ≥ 3
384
(time color-coded).
385
Figure 2
386
Estimation of α error and power depending on λ and number of events n for a hypothesis
387
test defined as H0 : λ1 = λ2 versus H1 : λ1 6= λ2 . Plots show the behavior of the
388
empirical cumulative distribution function (ecdf) of the p-values generated under the null
389
hypothesis H0 and its alternative H1 . Here we have n1 + n2 = 400 events and 1000
390
random realizations were generated.
391
Figure 3
392
Schematic diagram presenting the steps for our scan algorithm. (A) A certain area is
393
subdivided into m subareas A1 , . . . , Am . (B) Every subarea Ai is a disk with the same
394
radius r. (C) Events within the subarea Ai occur at ni times tij , so we can project it into
395
a three-dimensional domain Ai ∪ Si . (D) The time series Si is investigated with regard to
396
(i) model selection with Bayes factors, (ii) estimation of change-points, (iii) significance of
397
change-points, and (iv) credibility intervals.
24
398
Figure 4
399
Results based on 100 synthetic sequences for every evaluation: (A) Standard deviation of
400
τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as a
401
function of the number n of data points in the simulation and the ratio λ1 /λ2 of intensities
402
in the first and second half of the simulations.
403
Figure 5
404
Synthetic data: Generation of a 3D homogeneous Poisson process with different intensities.
405
The sample size is 2000. (A) Poisson process with a rate λ ≈ 0.8 (per unit area) and (B)
406
Poisson processes with different rates within the replaced cylinders i.e. the intensity in the
407
change-point domain is given by λcp ≈ 8 (per unit area).
408
Figure 6
409
Synthetic data: (A) Perspective view of the circle C1 and the change-point domain C2 with
410
the estimated significant change-point locations. (B) Example for the marginal posterior
411
p(τ | t) in the change-point domain C1 . (C) Example for the marginal posterior p(τ1 , τ2 | t)
412
in the change-point domain C2 . The logarithmic values of the density are color coded.
413
Figure 7
414
Synthetic data of a Poisson process with an intensity of 0.08 (per unit area) in the time
415
period [0, 20] in which a change-point domain is embedded (intensity λcp ≈ 0.40 within
25
416
a cylinder with radius r0 = 2, center (0, 0) and t ∈ [10, 20]): (A) Standard deviation of
417
τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as
418
a function of the selection radius.
419
Figure 8
420
(A) Magnitude-time plot with the estimated change-points for the whole declustered time
421
series. (B) Cumulative number of earthquakes with M ≥ 3 for the declustered catalog with
422
the estimated change-points (model with one change-point (green line) and two change-
423
points (red lines). Inset: Cumulative number of earthquakes for the non-declustered cat-
424
alog with the estimated change-points (model with three change-points), where the third
425
change-point coincides with the occurrence time of the MW = 5.6 mainshock.
426
Figure 9
427
Maps with transition events and the MW = 5.6 earthquake for the case study Oklahoma.
428
(A) and (B) Color-coded times of the first and second change-points at grid points where
429
the algorithm prefers two change-points: (A) first change-point and (B) second change-
430
points. (C) lllustration of all calculated transition times at grid points where the algorithm
431
preferred a model with one change-point.
432
Figure 10
433
Locations and occurrence times of the first change-points (for models with one and with
434
two change-points) in comparison to approval dates of injection wells from 1.1.2000 to
26
435
31.12.2015 for the Oklahoma case study. The high-volume injection wells (approved volume
436
> 10, 000 barrels per day) are illustrated in black. (A) Map view of the estimated change-
437
points, (B) latitude-time plot, and (C) time-longitude plot with estimated transitions and
438
injection wells.
27
439
Figures
440
Figure 1
Figure 1: (A) Magnitude-time plot for all earthquakes in Oklahoma from January 1, 1980 to
December 31, 2015. (B) Cumulative number of earthquakes with M ≥ 3 in Oklahoma from
January 1, 1980 to December 31, 2015. Inset: Spatial map of all earthquakes with M ≥ 3 (time
color-coded).
28
441
Figure 2
Figure 2: Estimation of α error and power depending on λ and number of events n for a hypothesis
test defined as H0 : λ1 = λ2 versus H1 : λ1 6= λ2 . Plots show the behavior of the empirical
cumulative distribution function (ecdf) of the p-values generated under the null hypothesis H0
and its alternative H1 . Here we have n1 + n2 = 400 events and 1000 random realizations were
generated.
29
442
Figure 3
Figure 3: Schematic diagram presenting the steps for our scan algorithm. (A) A certain area is
subdivided into m subareas A1 , . . . , Am . (B) Every subarea Ai is a disk with the same radius
r. (C) Events within the subarea Ai occur at ni times tij , so we can project it into a threedimensional domain Ai ∪ Si . (D) The time series Si is investigated with regard to (i) model
selection with Bayes factors, (ii) estimation of change-points, (iii) significance of change-points,
and (iv) credibility intervals.
30
443
Figure 4
Figure 4: Results based on 100 synthetic sequences for every evaluation: (A) Standard deviation
of τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as a
function of the number n of data points in the simulation and the ratio λ1 /λ2 of intensities in
the first and second half of the simulations.
31
444
Figure 5
Figure 5: Synthetic data: Generation of a 3D homogeneous Poisson process with different intensities. The sample size is 2000. (A) Poisson process with a rate λ ≈ 0.8 (per unit area) and
(B) Poisson processes with different rates within the replaced cylinders i.e. the intensity in the
change-point domain is given by λcp ≈ 8 (per unit area).
32
445
Figure 6
Figure 6: Synthetic data: (A) Perspective view of the circle C1 and the change-point domain C2
with the estimated significant change-point locations. (B) Example for the marginal posterior
p(τ | t) in the change-point domain C1 . (C) Example for the marginal posterior p(τ1 , τ2 | t) in
the change-point domain C2 . The logarithmic values of the density are color coded.
33
446
Figure 7
Figure 7: Synthetic data of a Poisson process with an intensity of 0.08 (per unit area) in the
time period [0, 20] in which a change-point domain is embedded (intensity λcp ≈ 0.40 within a
cylinder with radius r0 = 2, center (0, 0) and t ∈ [10, 20]): (A) Standard deviation of τ̂ − τreal
and (B) percentage of change-point detections by means of the Bayes factor as a function of the
selection radius.
34
447
Figure 8
Figure 8: (A) Magnitude-time plot with the estimated change-points for the whole declustered
time series. (B) Cumulative number of earthquakes with M ≥ 3 for the declustered catalog
with the estimated change-points (model with one change-point (green line) and two changepoints (red lines). Inset: Cumulative number of earthquakes for the non-declustered catalog with
the estimated change-points (model with three change-points), where the third change-point
coincides with the occurrence time of the MW = 5.6 mainshock.
35
448
Figure 9
Figure 9: Maps with transition events and the MW = 5.6 earthquake for the case study Oklahoma.
(A) and (B) Color-coded times of the first and second change-points at grid points where the
algorithm prefers two change-points: (A) first change-point and (B) second change-points. (C)
lllustration of all calculated transition times at grid points where the algorithm preferred a model
with one change-point.
36
449
Figure 10
Figure 10: Locations and occurrence times of the first change-points (for models with one and
with two change-points) in comparison to approval dates of injection wells from 1.1.2000 to
31.12.2015 for the Oklahoma case study. The high-volume injection wells (approved volume
> 10, 000 barrels per day) are illustrated in black. (A) Map view of the estimated change-points,
(B) latitude-time plot, and (C) time-longitude plot with estimated transitions and injection wells.
37
450
Appendix
451
Derivation of the Marginal Posterior Density
452
With the notation of the Estimation of Change-Points section, we derive the formula for
453
the Bayesian posterior density Eq. (6). Here we use Fubini’s theorem and the definition of
454
the gamma function
Γ(x) =
Z∞
z x−1 e−z dz
(A1)
0
455
or more precisely the following calculation:
Z∞
(N (τ )−N (τi−1 ) −λi (τi −τi−1 )
λi i
e
dλi
0
=
Z∞
0
= (τi − τi−1 )
−(N (τi )−N (τi−1 )+1)
z
τi − τi−1
Z∞
N (τi )−N (τi−1 )
e−z
z N (τi )−N (τi−1 )+1−1 e−z dz
0
= (τi − τi−1 )−(N (τi )−N (τi−1 )+1) Γ(N (τi ) − N (τi−1 ) + 1)
38
dz
τi − τi−1
(A2)
456
Derivation of the Likelihood Ratio Test
457
Based on the test problem
H0 : λ1 = λ2 versus H1 : λ1 6= λ2 .
458
(A3)
the likelihood function for two different rates is given by
p(t | λ1 , λ2 ) = λn1 1 exp(−λ1 ∆1 )λn2 2 exp(−λ2 ∆2 ),
459
where ∆1 = s2 − s1 and ∆2 = s4 − s3 .
460
The log-likelihood function is given by
l (λ1 , λ2 | t) = n1 log λ1 − λ1 ∆1 + n2 log λ2 − λ2 ∆2 .
(A4)
(A5)
461
Under H1 we have to calculate the maximum likelihood estimator (MLE) for λ1 and λ2 .
462
From
∂l (λ1 , λ2 | t)
n1
!
=
− ∆1 = 0
∂λ1
λ1
463
we get
λ̂1 =
464
(A6)
n1
.
∆1
(A7)
Furthermore
n1
∂ 2 l (λ1 , λ2 | t)
= − 2 < 0 for all λ1 ∈ R+ .
2
∂λ1
λ1
(A8)
465
So λ̂1 is the MLE for λ1 . In the same way we can show that λ̂2 =
n2
is the MLE for λ2 .
∆2
466
Under H0 is λ = λ1 = λ2 , so we get the likelihood
l (t | λ) = λn1 +n2 exp[−λ(∆1 + ∆2 )].
39
(A9)
467
468
469
The log-likelihood function is given by
l (λ | t) = (n1 + n2 ) log λ − λ(∆1 + ∆2 ).
(A10)
n1 + n2
∂l (λ | t)
!
=
− (∆1 + ∆2 ) = 0,
∂λ
λ
(A11)
Thus
which leads to
λ̂ =
470
n1 + n2
.
∆1 + ∆ 2
(A12)
Furthermore
∂ 2 l (λ | t)
n1 + n2
=−
< 0 for all λ ∈ R+ .
2
∂λ
λ2
471
So λ̂ is the MLE for λ.
472
In general the test statistic is given by
Z = 2 ln
473
(A13)
p(t | H1 )
.
p(t | H0 )
(A14)
Hence
h
i
Z = 2 l λ̂1 , λ̂2 | t − l λ̂ | t
(A15)
leads to
n1
n2
n2
−
−
(∆1 ) + n2 log
(∆2 )
Z = 2 n1 log
∆1
∆2
∆2
n1 + n2
n1 + n2
−
− (n1 + n2 ) log
(∆1 + ∆2 )
∆1 + ∆ 2
∆1 + ∆ 2
474
n1
∆1
and finally to
Z = 2 n1 log
n1
∆1
+ n2 log
n2
∆2
− (n1 + n2 ) log
40
n1 + n2
∆1 + ∆ 2
.
(A16)
475
Derivation of the Bayes Factors
476
The Bayes factor is defined by the ratio of the marginal or integrated likelihood for the
477
two considered models Ml (model with l change-points) and Mm (model with m change-
478
points), i.e.
p(t | Ml )
,
p(t | Mm )
Blm =
479
with l, m = 0, . . . , k and l 6= m. For M0 and M1 we get
p(t | M0 ) =
480
(A17)
Z
∞
p(λ)λn e−λ(b−a) dλ
(A18)
0
and
p(t | M1 ) =
Z bZ
a
0
∞Z ∞
0
N (τ ) −λ1 (τ −a) N (b)−N (τ ) −λ2 (b−τ )
e
λ2
e
dλ1 dλ2 dτ.
p(τ )p(λ1 )p(λ2 )λ1
(A19)
481
For l ≥ 2 we obtain
Z Z
N (τ )
N (b)−N (τl ) −λl+1 (b−τl )
p(t | Ml ) =
e
p(τ1 )p(λ1 )p(λl+1 )λ1 1 e−λ1 (τ1 −a) λl+1
Λ
×
T
l
Y
N (τ )−N (τi−1 ) −λi (τi −τi−1 )
p(τi )p(λi )λi i
e
dλ1 . . .
(A20)
dλl+1 dτ1 . . . dτl .
i=2
482
Here is Λ = (0, ∞)l+1 and T = (a, b)l .
483
To evaluate Eq. (A18), Eq. (A19) and Eq. (A20) we use improper prior densities for the
484
2
, where ci is a not further
intensities so that p(λ) = c0 λ− 2 and p(λ) = ck λ1 2 . . . λk+1
485
specified constant. Moreover we formulate uniform distributed priors for τi , i.e. p(τi ) =
486
(compare Raftery and Akman (1986)). For this approach Eq. (A18) becomes
−1
1
p(t | M0 ) =
Z
∞
0
−1
1
b−a
1
c0 λ− 2 λn1 e−λ(b−a) dλ
(A21)
= c0 (b − a)
41
−(n+ 21 )
1
Γ(n + ).
2
487
Further Eq. (A19) becomes
Z bZ ∞Z ∞
c1 N (τ )− 12 −λ1 (τ −a) N (b)−N (τ )− 21 −λ2 (b−τ )
λ
e
λ2
e
dλ1 dλ2 dτ
p(t | M1 ) =
b−a 1
0
0
a
Z ti+1
n
1
1
1
1
c1 X
(τ − a)−(i+ 2 ) (b − τ )−(n−i+ 2 ) dτ,
Γ(i + )Γ(n − i + )
=
b−a
2
2 ti
i=0
(A22)
488
with t0 = a and tn+1 = b. The resulting Bayes factor B01 contains an unspecified constant
489
c0 /c1 , which can determined by using the boundary condition B01 ≈ 1, if we consider an
490
observation period of [a, b] consisting only a single event t1 = (a + b)/2. So Eq. (A22)
491
becomes
Z ti+1
1
1
1
c1 X
1
1
p(t | M1 ) =
(τ − a)−(i+ 2 ) (b − τ )−(1−i+ 2 ) dτ
Γ(i + )Γ(n − i + )
b−a
2
2 ti
i=0
"Z
(a+b)/2
1
3
c1
=
(τ − a)− 2 (b − τ )− 2 dτ
Γ(0.5)Γ(1.5)
b−a
a
#
Z b
1
3
+
(τ − a)− 2 (b − τ )− 2 dτ
(a+b)/2
=
√
c1
4 πΓ(1.5).
2
(b − a)
(A23)
492
493
√
1
!
If c0 /c1 =: c01 (a, b), we receive by solving B01 = 1 that c01 (a, b) = 4 π(b − a)− 2 . Finally
we get Eq. (10). In the same way we can evaluate Eq. (A20). Here we have to consider
Z bZ bZ ∞Z ∞Z ∞
1
N (τ1 )− 21 −λ1 (τ1 −a) N (τ2 )−N (τ1 )− 21
p(t | M2 ) = c2
λ1
e
λ2
2
(b − a)
0
0
0
a
a
N (b)−N (τ2 )− 21 −λ3 (b−τ2 )
× e−λ2 (τ2 −τ1 ) λ3
e
dλ1 dλ2 dλ3 dτ1 dτ2
n X
n
X
1
1
1
1
Γ(i + )Γ(j − i + )Γ(n − j + )
2
(b − a)
2
2
2
i=0 j=i+1
Z ti+1 Z tj+1
1
1
1
×
(τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(n−j+ 2 ) dτ1 dτ2 .
= c2
ti
tj
(A24)
42
494
By using the training sample method we obtain
p(t | M2 ) =
1
1 X
X
c2
1
1
1
Γ(i + )Γ(j − i + )Γ(1 − j + )
2
(b − a)
2
2
2
i=0 j=i+1
Z ti+1 Z tj+1
1
1
1
(τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(1−j+ 2 ) dτ1 dτ2
×
tj
ti
[Γ(0.5)]2 Γ(1.5)
= c2
(b − a)2
=
c2
(b − a)
5
2
Z
a
(a+b)/2 Z b
1
(a+b)/2
3
1
(τ1 − a)− 2 (τ2 − τ1 )− 2 (b − τ2 )− 2 dτ1 dτ2
2π 2 Γ(1.5).
(A25)
495
Without loss of generality we assume that τ2 > τ1 , so that we have to multiply the resulting
496
constant with the factor 2. This finally leads to c02 (a, b) = 4π 2 (b − a)−1 and Eq. (11). To
497
compare M1 and M2 we use
B12 =
B02
.
B01
(A26)
498
For the general case Blm , we first calculate the Bayes factors B0l and B0m by using
499
the training sample method to get the occurring constants as shown in Eq. (A23) or in
500
Eq. (A25) and then straightforward
Blm =
B0m
.
B0l
43
(A27)
501
Using the priors as explained before, evaluation of Eq. (A20) leads to
p(t | Ml ) = cl
×
=
Z Z
Λ T
l
Y
N (τ1 )− 12 −λ1 (τ1 −a)
p(τ1 )λ1
N (b)−N (τl )− 12 −λl+1 (b−τl )
p(τl )λl+1
N (τi )−N (τi−1 )− 12 −λi (τi −τi−1 )
e
p(τi )λi
e
dλ1 . . . dλl+1 dτ1 . . . dτl
i=2
n
X
cl
...
(b − a)l
i1 =0
×
e
Z
ti1 +1
...
t i1
Z
n
X
il =il−1 +1
til +1
t il
l
1 Y
1
1
Γ(ij − ij−1 + )
Γ(i1 + )Γ(n − il + )
2
2
2
j=2
1
1
(τ1 − a)−(i1 + 2 ) (b − τl )−(n−il + 2 )
l
Y
j=2
1
(τj − τj−1 )−(ij −ij−1 + 2 )
× dτ1 . . . dτl .
(A28)
502
With the help of the training sample method, the occurring constants can be calculated.
503
√ 5
3
As a further example for B03 we get c03 (a, b) = 4 2π 2 (b − a)− 2 .
504
For model selection we use the following algorithm:
505
506
i) Define the maximum number k of possible change-points in the investigated data.
ii) Set m = 0.
507
iii) Calculate the Bayes factors Bml with l = m + 1, . . . , k.
508
iv) Calculate lnew =
arg min {Bml < 0.3}.
l∈{m+1,...,k}
509
510
v) If lnew exists, set m = lnew and go to step iii). Otherwise, select a model where the
number of change-points is equal to m.
44
511
Case Study Oklahoma: Evaluation with Different Choices of
512
the Radius
513
In comparison to the results illustrated in Fig. 9 where we used a radius r = 5 km, Fig. A1
shows the transition events for the radii r = 2 and r = 10 km.
Figure A1: Maps with transition events and the MW = 5.6 earthquake for the case study
Oklahoma. (A) and (B) Illustration of all calculated change-point locations where the algorithm
prefers two change-points by using a radius of 2 km. (C) and (D) Illustration of all calculated
change-point locations where the algorithm prefers two change-points by using a radius of 10
km. (E) and (F) show all calculated transition events where the algorithm prefers a model with
one change-point, e.g. (E) r = 2 km and (F) r = 10 km.
514
45