Academia.eduAcademia.edu

Multiple Change‐Point Detection in Spatiotemporal Seismicity Data

2018, Bulletin of the Seismological Society of America

Earthquake rates are driven by tectonic stress buildup, earthquake-induced stress changes, and transient aseismic processes. Although the origin of the first two sources is known, transient aseismic processes are more difficult to detect. However, the knowledge of the associated changes of the earthquake activity is of great interest, because it might help identify natural aseismic deformation patterns such as slow-slip events, as well as the occurrence of induced seismicity related to human activities. For this goal, we develop a Bayesian approach to identify change-points in seismicity data automatically. Using the Bayes factor, we select a suitable model, estimate possible change-points, and we additionally use a likelihood ratio test to calculate the significance of the change of the intensity. The approach is extended to spatiotemporal data to detect the area in which the changes occur. The method is first applied to synthetic data showing its capability to detect real change-points. Finally, we apply this approach to observational data from Oklahoma and observe statistical significant changes of seismicity in space and time.

Originally published as: Fiedler, B., Zöller, G., Holschneider, M., Hainzl, S. (2018): Multiple Change‐Point Detection in Spatiotemporal Seismicity Data. - Bulletin of the Seismological Society of America, 108, 3A, pp. 1147—1159. DOI: http://doi.org/10.1785/0120170236 Multiple change-point detection in spatio-temporal seismicity data August 28, 2018 Bernhard Fiedler, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany. Email [email protected] Gert Zöller, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany Matthias Holschneider, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany Sebastian Hainzl, GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 Potsdam, Germany 1 Abstract 1 2 Earthquake rates are driven by tectonic stress buildup, earthquake- induced stress 3 changes, and transient aseismic processes. Although the origin of the first two sources 4 is known, transient aseismic processes are more difficult to detect. However, the knowl- 5 edge of the associated changes of the earthquake activity is of great interest, because 6 it might help identify natural aseismic deformation patterns such as slow-slip events, as 7 well as the occurrence of induced seismicity related to human activities. For this goal, we 8 develop a Bayesian approach to identify change- points in seismicity data automatically. 9 Using the Bayes factor, we select a suitable model, estimate possible change-points, and 10 we additionally use a likelihood ratio test to calculate the significance of the change of the 11 intensity. The approach is extended to spatiotemporal data to detect the area in which 12 the changes occur. The method is first applied to synthetic data showing its capability 13 to detect real change-points. Finally, we apply this approach to observational data from 14 Oklahoma and observe statistical significant changes of seismicity in space and time. 15 Introduction 16 Natural seismicity is a nonstationary process with vari- ous kinds of transient behavior on 17 different spatiotemporal scales, for example, aftershocks, foreshocks, swarm activity, and 18 quiescence lasting from hours to decades. Man-made earthquakes, for example, arising 19 from fluid injection in geothermal areas or wastewater disposals (Ellsworth, 2013) have 20 similar statistical features, but on smaller spatial scales with transient boundary conditions. 2 21 For example, the grow- ing amount of industrial projects related to the injection of fluids at 22 depth has led to the question, to which degree the seismic hazard changes at an injection 23 site. Figure 1 shows a clear increase of the earthquake number in Oklahoma at around 24 the year 2010. Several authors including Keranen et al. (2013), Langenbruch and Zoback 25 (2016), Walsh and Zoback (2015) and Weingarten et al. (2015) reported a correlation 26 between the injection volume and the observed increase of the seismicity. 27 In our study, we propose a Bayesian approach to detect transients in seismicity. Using 28 the Poisson assumption for the occurrence of earthquakes, we apply a method which was 29 first introduced by Raftery and Akman (1986) and further applied to earthquake data by 30 Gupta and Baker (2015), Montoya and Wang (2017) and Gupta and Baker (2017). We 31 go beyond these works and present an algorithm that allows the identification of multiple 32 change-points that occur in space and time. Moreover, we note that for observational 33 data, signals for change-points might be weak and difficult to distinguish from random 34 fluctuations. Therefore, we put special emphasis on the development of an appropriate 35 significance test and apply the concept of the Bayes factor for model selection. 36 Our model approach is based on the assumption that the earthquake occurrence follows 37 a piecewise homogeneous Poisson process (HPP) in time. In particular, the system is 38 assumed to suddenly change from one Poisson rate into another. Such transitions are 39 defined as change-points in time. This approach is then extended to space–time in a 40 straightforward way by subdividing the area into smaller segments of a specific size. For 41 every subarea, we obtain a time series contain- ing change-points or not. In both cases, 42 we first address the question which model is statistically preferable, for example, a model 3 43 with or without a change-point. If a specific change- point model is preferred, we then 44 use an extended approach of Raftery and Akman (1986) to estimate the change-points 45 and calculate associated Bayesian credibility intervals at a given significance level (e.g., 46 95%). This is described in the Estimation of Change-Points section. Additionally, we use 47 a likelihood ratio test to calculate the significance (p-value) of the change of the Poisson 48 intensity (the Likelihood Ratio Test section) and extend the approach to the space–time 49 prob- lem (the Spatiotemporal Change-Point Problem section). By means of synthetic data, 50 we demonstrate the performance of the method (the Illustration for Synthetic Data section) 51 before applying it to the observed data from Oklahoma (the Application to Seismicity in 52 Oklahoma section). 53 Method 54 Estimation of Change-Points 55 First we give a brief overview on the detection of temporal change-points according to 56 Raftery and Akman (1986). In comparison to this work we extend the method to a general 57 case with more than one change-point. 58 An observation period of [a, b] is given with n events at times a ≤ t1 < t2 < . . . < tn ≤ b. 59 (1) We assume the existence of k change-points τ1 , τ2 , . . . , τk ∈ [a, b] 4 (2) 60 with k < n. Moreover in [a, τ1 ] we have N (τ1 ) events which come from a Poisson process 61 with rate λ1 , and N (τi ) − N (τi−1 ) events in (τi−1 , τi ] with rate λi for i = 2, . . . , k. 62 Finally, in (τk , b] the number of events follows a Poisson process with rate λk+1 . 63 Let t = {t1 , . . . , tn } and θ = {λ1 , . . . , λk+1 , τ1 , . . . , τk }. It can easily be shown that 64 the mutual likelihood function is given by N (τ1 ) −λ1 (τ1 −a) N (τ2 )−N (τ1 ) −λ2 (τ2 −τ1 ) e λ2 e p(t | θ) = λ1 N (τ ) N (b)−N (τk ) −λk+1 (b−τk ) λ1 1 e−λ1 (τ1 −a) λk+1 e = N (b)−N (τk ) −λk+1 (b−τk ) · . . . · λk+1 k Y e (3) N (τ )−N (τi−1 ) −λi (τi −τi−1 ) λi i e . i=2 65 Using Bayes’ Theorem p(θ | t) = R p(t | θ)p(θ) ∝ p(t | θ)p(θ) Θ p(t | θ) p(θ) dθ (4) 66 we can calculate the posterior density p(θ | t) for the parameter θ given the data represented 67 by t = {t1 , . . . , tn }. Here p(t | θ) denotes the likelihood function and p(θ) is the prior 68 density of θ. 69 Let p(τ1 ), . . . , p(τk ) and p(λ1 ), . . . , p(λk+1 ) be the prior densities. Then the posterior 70 density is given by N (τ1 ) −λ1 (τ1 −a) N (b)−N (τk ) e λk+1 p(θ | t) ∝ p(τ1 )p(λ1 )p(λk+1 )λ1 ×e −λk+1 (b−τk ) k Y (5) N (τ )−N (τi−1 ) −λi (τi −τi−1 ) p(τi )p(λi )λi i e . i=2 71 Assuming now a flat prior, we calculate the marginal posterior density of τ = {τ1 , . . . , τk } 72 by integrating with respect to λ1 , . . . , λk+1 (see also the Derivation of the Marginal Pos- 73 terior Density section in the Appendix). 5 p(τ | t) = c Z∞ × ... 0 k Y Z∞ N (τ1 ) −λ1 (τ1 −a) N (b)−N (τk ) −λk+1 (b−τk ) e λk+1 e λ1 0 N (τi )−N (τi−1 ) −λi (τi −τi−1 ) λi e dλ1 . . . dλk+1 i=2 = c(τ1 − a)−[N (τ1 )+1] Γ[N (τ1 ) + 1](b − τk )−[N (b)−N (τk )+1] × Γ[N (b) − N (τk ) + 1] k Y i=2 (τi − τi−1 )−[N (τi )−N (τi−1 )+1] Γ[N (τi ) − N (τi−1 ) + 1] (6) 74 We consider two special cases of Eq. (6). 75 Special case: one change-point p(τ | t) = c(τ − a)−[N (τ )+1] Γ[N (τ ) + 1](b − τ )−[N (b)−N (τ )+1] Γ[N (b) − N (τ ) + 1] (7) 76 Special case: two change-points p(τ1 , τ2 | t) = c(τ1 − a)−[N (τ1 )+1] Γ[N (τ1 ) + 1](τ2 − τ1 )−[N (τ2 )−N (τ1 )+1] × Γ[N (τ2 ) − N (τ1 ) + 1](b − τ2 ) −[N (b)−N (τ2 )+1] (8) Γ[N (b) − N (τ2 ) + 1] 77 We note that in Eq. (6), (7) and (8) c is a normalizing constant which ensures that the 78 conditions for a probability density function is fulfilled. Alternatively to a flat prior density, 79 a conjugated prior for the parameters λ1 , . . . , λk+1 (e.g. a gamma distribution) and 80 uniformly distributed prior densities for τ1 , . . . , τk can be used (see also Raftery and 81 Akman (1986)). By maximizing p(τ | t) in Eq. (6) with respect to τ = {τ1 , . . . , τk } we 82 obtain the estimation τ̂ = {τ̂1 , . . . , τ̂k } for the change-points. 83 In Akman and Raftery (1986) it was shown that the estimator for a single change- 84 point is consistent and asymptotically normal. Moreover in Ghosal et al. (1999) and other 6 85 related papers (e.g. Ghosh et al. (1994) and Ghosal and Samanta (1995)) it was also 86 demonstrated that in this case the posterior distribution asymptotically behaves like an 87 exponential function on both sides of the detected change-point. The asymptotic behavior 88 for the general case with more than one change-point is shown in Ghosal et al. (1999). 89 For model selection, we use the Bayes factor to get an idea which model should be 90 preferred, that is, whether to prefer a change-point model (M1 ) or a model without a 91 change-point (model M0 ). The Bayes factor is defined by the ratio of the marginal or 92 integrated likelihood for both models, that is Blm = p(t | Ml ) . p(t | Mm ) (9) 93 Here Ml and Mm denote a model with l respectively with m change-points where l, m = 94 0, 1, . . . , k. Apart from the goodness of fit, the complexity of the assumed model has to be 95 taken into account in order to assess the most capable model describing the data and thus 96 performing the estimation. As an example, if we test the hypothesis of no change-point 97 (H0 ) against a change-point model, the value of the Bayes factor quantifies the evidence 98 of the supported model, e.g. B01 < 0.01 can be interpreted as a decisive evidence against 99 H0 , compare Kass and Raftery (1995). Equation (9) strongly depends on the choice of 100 the prior distributions. When an improper prior is used, the Bayes factor is, however, not 101 well-defined and depends on an arbitrary ratio of constants. To handle this problem we 102 use the idea of an imaginary training sample which involves the smallest possible sample 103 size permitting a comparison of M0 and Mm and provides maximum possible support 104 for M0 . In this case the Bayes factor should be approximately one. This approach was 105 introduced in Spiegelhalter and Smith (1982) and was adopted and discussed in several 7 106 other works, e.g. Raftery and Akman (1986), Kass and Raftery (1995) or Berger et al. 107 (2004). Using improper prior densities for the intensities of the shape p(λi ) ∝ λi 108 uniform distributed prior for τi , i.e. p(τi ) = 109 into consideration the approach of Spiegelhalter and Smith (1982), the Bayes factor can 110 be calculated by − 21 B01 1 b−a and a (Raftery and Akman, 1986) and taking √ 4 π(b − a)−n Γ(n + 21 ) . = Pn R ti+1 −(n−i+ 12 ) −(i+ 12 ) 1 1 (b − τ ) dτ )Γ(n − i + ) Γ(i + (τ − a) i=0 2 2 ti (10) 111 This approach can be extended and the derivation for the general case Blm is shown in 112 the Appendix Derivation of the Bayes factor. For example, for the hypothesis of no change 113 point (H0 ) against a model with two change-points we get  n n 1 X X 1 1 1 2 −n+ 21 B02 = 4π (b − a) Γ(n + ) Γ(i + )Γ(j − i + )Γ(n − j + ) 2 2 2 2 i=0 j=i+1 #−1 Z Z ti+1 × ti tj+1 tj 1 1 1 (τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(n−j+ 2 ) dτ1 dτ2 (11) . 114 We note that the computation of Eq. (9) for large l and m is numerically very difficult to 115 handle because of the high- dimensional integrals. We remark that the function evaluations 116 grow exponentially as the number of dimensions increases. If the quadrature rules do not 117 lead to a desirable result, Monte Carlo methods should be used instead. In our work, we 118 apply a likelihood ratio test in addition to the established methods we considered before. 119 As an advantage, we get the significance (p-value) of the change of the Poisson intensity. 120 Needless to say that the Bayes factor is a powerful tool for the model selection, but although 121 we know the preferred model, we cannot yet prove that the estimated change-points are 122 significant. This problem can be solved with the aid of the likelihood ratio test. 8 123 Likelihood Ratio Test 124 We consider two Poisson processes with intensities λ1 and λ2 in the time intervals [s1 , s2 ] 125 and [s3 , s4 ] with s3 > s2 . In the first period we have n1 events, and in the second period 126 the number of events is n2 . We aim at testing whether or not the intensities are equal. In 127 detail we test hypothesis H0 versus H1 with H 0 : λ1 = λ2 = λ (12) H1 : λ1 6= λ2 . 128 The likelihood function is given by p(t | λ1 , λ2 ) = λn1 1 exp(−λ1 ∆1 )λn2 2 exp(−λ2 ∆2 ), 129 with ∆1 = s2 − s1 and ∆2 = s4 − s3 . 130 For H0 we get (13) p(t | λ) = λn1 +n2 exp[−λ(∆1 + ∆2 )]. (14) 131 As shown in the Appendix Derivation of the Likelihood Ratio Test, we can derive the 132 statistic of this test by calculation of the maximum likelihood estimators for λ, λ1 and λ2 133 and by using a general result of Witting and Müller-Funk (1995). It follows that the test 134 statistic of this likelihood ratio test is equal to  Z = 2 n1 log  n1 ∆1  135 H0 is rejected, if z > χ21, 136 given significance level and χ21, 137 with one degree of freedom. 1−α + n2 log  n2 ∆2  − (n1 + n2 ) log  n1 + n2 ∆1 + ∆ 2  . (15) or if the p-value = P (Z ≥ z) < α. Here α ∈ (0, 1) is a 1−α is the (1 − α)-quantile of the chi-squared distribution 9 138 To investigate the properties of this test, we perform calculations with artificially generated 139 data resulting in a reasonable resemblance to the error of the first kind α, as summarized in 140 Table 1. As an estimator for the error probability, we use the number of rejected hypotheses 141 divided by the number of generated samples. Moreover, Figure 2 illustrates the behavior 142 of the power for fixed values of λ and n. As expected, the simulations show that the test 143 can distinguish between H0 and H1 in a suitable way. 144 Spatiotemporal Change-Point Problem 145 In this section, we extend our approach for time series in a straightforward way towards 146 spatiotemporal change-point problems. For this aim, we scan an area D to find change- 147 points in space and time. Figure 3 illustrates the algorithm. First, the investigated domain 148 is subdivided into m subareas A1 , . . . , Am with D = 149 equidistantly centered subareas with the same size in the following way: We consider a set 150 of circles, where Ai has the radius r and the center (xi , yi ) for all i = 1, . . . , m. However, 151 any other subdivision is also possible, see Gupta and Baker (2017). i=1 Ai . For simplicity, we use In the next step we investigate the time series of all events that occurred in Ai given 152 153 Sm by Si = {ti1 , ti2 , . . . , tini }. 154 Hence the data is a set of triples m [ i=1 155 (16) (Ai ∪ Si ) = m [ i=1 {(xij , yij , tij ) | j = 1, . . . , ni }. (17) For Si we use our method to detect and evaluate change-points as described before in the 10 156 Estimation of Change-Point and Likelihood Ratio Test sections. 157 In detail, for every time series Si we use the Bayes factor (9) to decide which model fits 158 best to the given data. If a specific change-point model is preferred, we maximize p(τ | t) 159 in Eq. (6) and receive a set of possible change-points. For every estimated change-point 160 in this set we use the likelihood ratio test and define a change-point as significant, if the 161 p-value is smaller than a given significance level α. The result is a set τ̂ i = {τ̂ 1i , . . . , τ̂ ki } 162 of significant change-points in Si . Finally we provide the mathematical definition of a 163 transition event within a global statistical model Mtrans . For this aim, we define a set of 164 165 transition events as triples in the following way      (xi , yi , τ̂ i ), Si has at least one change-point Mi = .     Si has no change-point ∅, Mtrans = m [ i=1 Mi . (18) (19) 166 Evaluation and Application 167 The derived methodology from the Method section is for test and illustration purposes first 168 applied on synthetic data and in the following part applied to real seismicity data recorded 169 in Oklahoma, United States. 170 Illustration for Synthetic Data 171 We first test our method by applying it to synthetic data under controlled conditions. For 172 this aim we generate synthetic time series with t ∈ [0, 1] with a single change-point at 11 173 τreal = 0.5 and investigate the goodness of the estimator. To test how the method works, 174 we calculate the standard deviation of τ̂ −τreal and Bayes factors depending on the number 175 of events and the ratio of the intensities, see Figure 4. It can be seen e.g. that a change- 176 point can be detected in sequences of 100 events with a high probability and precision, if 177 the intensity ratio exceeds a value of 2. 178 For the spatiotemporal approach, we generate random realizations of a 3D HPP with a 179 given intensity. From these data, we cut out cylinders and replace it by new cylinders with 180 data from HPPs with different intensities, as illustrated in Figure 5. Using our algorithm we 181 calculate the transition events Mtrans . Therefore we scan the whole domain as explained 182 in Figure 3. The ”training” sample is a 3D HPP with rate λ ≈ 0.8 (per unit area) in a 183 cylinder with center (0, 0), radius r1 = 6 and height h1 = 20, corresponding to the time 184 interval t ∈ [0, 20]. The replaced cylinders follows a HPP with rate λcp ≈ 8 (per unit 185 area). One cylinder has center (1, 1), radius r2 = 1 and height h2 = 10. Here the related 186 time interval is t ∈ [5, 15] and the second replaced cylinder has the center (−3, −3), radius 187 r3 = 1 and height h3 = 5. Here the related time interval is t ∈ [15, 20]. In other words, 188 the transitions are given by the sets 189 C1 = {(x, y, t) | (x + 3)2 + (y + 3)2 ≤ 1 ∧ t = 15}. (20) C2 = {(x, y, t) | (x − 1)2 + (y − 1)2 ≤ 1 ∧ t ∈ {5, 15}}. (21) and 190 The chosen sample size is n = 2000, and approximately 15% of the data is located within 191 the replaced cylinders. For this test case, we set the selection radius to r0 = 0.3. In 192 general, our results presented in Figure 4 can guide the choice for this selection: To be 12 193 able to detect a certain rate change, the event number within the selection radius must 194 exceed a minimum number, e.g. 20 events for a ten-times increased intensity as in our 195 example. For the change-point domain C1 the method yields an average value of τ̂ = 15.173 196 and for C2 we get average values of τ̂1 = 5.094 and τ̂2 = 14.971. The estimated areas 197 are illustrated in Figure 6. It is remarkable that apart from a small number of outliers the 198 complete transition area was detected correctly by the method. 199 Additionally we investigate the sensitivity of the method depending on the selection 200 radius. Therefore, we generated synthetic data from a HPP in the time interval t ∈ [0, 20], 201 where in the circular region with radius r0 = 2 around the center occurs a change at time 202 10 to a five-times increased rate, particularly the change-point domain is given by the set C = {(x, y, t) | x2 + y 2 ≤ 4 ∧ t ∈ [10, 20]}. (22) 203 The chosen sample size is n = 2000, where 50 events are within the change-point domain. 204 The intensities are given by λ ≈ 0.08 and λcp ≈ 0.4 (per unit area). For 100 simulations, 205 we calculate the Bayes factors and the standard deviation of τ̂ –τreal for increasing radii of 206 the event selection around the center. The results are illustrated in Figure 7. The test 207 results show that the estimation uncertainty is lowest and the success rate is highest for 208 the case that the selection radius equals the radius of the change-point region. A too 209 small selection radius leads to time series with a non-significant number of events, while a 210 too high value results in a systematical error and the precision of the method decreases. 211 However, the results are found to be almost the same for a rather broad range of selection 212 radii within 0.5r0 and 2r0 . This indicates that the results should be rather robust, if the 213 selection radius is chosen in a reasonable range. 13 214 Application to Seismicity in Oklahoma 215 We now apply the method to real earthquake data. Because of its drastic seismic activity 216 changes, Oklahoma probably counts as one of the most interesting study areas for the 217 application of the above estimations. Therefore, we consider an earthquake catalog from 218 Oklahoma with 18,997 events from 1 January 1980 to 31 December 2015, obtained from the 219 Oklahoma Geological Survey, compare Data and Resources. We declustered the catalog 220 using the method of Reasenberg (1985) with standard parameters (van Stiphout et al. 221 (2012), Tab. 3) and taking into account all events with magnitude m ≥ 3. The declustered 222 catalog contains 1,199 events. Using all m ≥ 3 events, the Bayes factor from Eq. (9) leads 223 to a model with two change-points (see detailed results in Table 2). The estimated 95% 224 credibility intervals for the (significant) change-points τ̂1 and τ̂2 are given by [12/01/2009; 225 28/03/2009] and [14/12/2013; 30/01/2014]. This result is illustrated in Figure 8. We 226 note that the application of the likelihood ratio test leads to p-values ≪ 1 which means 227 that both change-points are considered to be significant and the result strongly supports 228 our model selection. As depicted in Table 2, the calculation of the Bayes factor B01 , B02 229 and B03 always leads to the preference of a change-point model. For comparison, a model 230 with one change-point leads to a 95% credibility interval [24/10/2013; 10/11/2013]. A 231 model with three change-points would detect a further change-point in August 2014. If we 232 use the non-declustered catalog a model with three change-points leads to the detection 233 of the MW = 5.6 earthquake at Prague in November 2011 with a subsequent aftershock 234 sequence in addition to the induced seismic changes in 2009 and 2013 (see Figure 8). Here 235 we observe a natural change-point, caused by the aftershock sequence. In comparison to 14 236 the works of Gupta and Baker (2017) and Montoya and Wang (2017) we note that they 237 have found similar results for the change-points. The study of Montoya and Wang (2017) 238 used another method for multiple change-point detection in time series and included four 239 different areas in Oklahoma according to the towns Jones, Perry, Cherokee and Waynoka. 240 In all of the four areas their method lead to the choice of a model with two change-points. 241 In the Jones area they found two change-points in May, 2008 and August, 2011. For the 242 other areas they calculated change-points in 2013 until 2015. The work of Gupta and Baker 243 (2017) used the method of Raftery and Akman (1986) to detect single change-points in 244 spatiotemporal data. They used a 25 km radius and found changes in seismicity rates 245 between 2008 and the end of 2015. 246 By scanning the spatial domain shown in Figure 1 with a total area of approximately 247 260, 000 square kilometers, our method leads to the results shown in Figure 9 and Figure 10. 248 We used a radius of 5 km leading to 3, 500 evaluations of time series. This choice is 249 a compromise between optimizing the spatial resolution and increasing the detectability 250 which requires that the considered circles contains enough events to get robust results (see 251 Figure 4). In the Appendix Case study Oklahoma: Evaluation with different choices of the 252 radius, we show the results for alternative values of r = 2 km and r = 10 km indicating 253 that the main features are robust with regard to the choice of the selection radius. For a 254 better illustration of the results, we only take into account the models M0 , M1 and M2 . 255 Interestingly, the significant change-point locations show a spatial migration pattern from 256 south to north in both figures and overlap with the injection wells. Moreover a correlation 257 with the injection volume could be a reason for this result as illustrated in Figure 10. 15 258 Furthermore we show the related times of the detected transition events. It is remarkable, 259 that most of the corresponding times of the change-points occur after 2009. This result 260 supports the hypothesis that the detected change-points are correlated with the onset dates 261 of the wastewater injections. Here we have recorded an discernable increase of approval 262 dates after 2010 for wells with an approved injection volume of at least 10, 000 barrels per 263 day. 264 Conclusions 265 The main objective of this article is to present an algorithm for the automatic detection of 266 change-points in seismicity data. We use a Bayesian algorithm to identify rate changes in 267 time and space. Tests with synthetic earthquake data show a good agreement of detected 268 change-points with real change-points in space and time. For the Oklahoma case study, the 269 significant change-points show a correlation with the onset of injection wells and especially 270 with the high-volume wells. The method leads to reasonable findings of significant change- 271 points between 2008 and the end of 2015, which correspond to the results of Gupta and 272 Baker (2017) and Montoya and Wang (2017). This makes us confident that our method 273 is powerful for the automatic detection of change-points, even for cases with less drastic 274 activity changes as in Oklahoma. 275 Nevertheless we only consider a fixed radius for the subdivision of the space. As we have 276 shown the choice of the radius depends on the number of events, and the systematic error 277 should be taken into account. Here the method could be extended for example by using a 16 278 Voronoi partition (Okabe et al., 2008) or by using the approach of Gupta and Baker (2017). 279 Furthermore the likelihood ratio test assumes that we have two fixed intervals. Although 280 our method leads to preferable results, an adaptive test could be useful. Another idea for 281 such a test has been proposed in Csörgő and Horváth (1997). Another issue is the deviation 282 from Poissonian behavior, e.g. due to aftershock sequences. In this respect, it is desirable 283 to consider also cluster models like the Epidemic Type Aftershock Sequences (ETAS) model 284 (Ogata, 1988; Zhuang et al., 2002). The ETAS approach to detect seismic changes within 285 the framework of wastewater injections was presented by Wang et al. (2016). In our work 286 we use the declustering approach of Reasenberg (1985) but also other methods could be 287 used to fulfill the Poisson assumption for the considered catalogs (van Stiphout et al., 288 2012). 289 Data and Resources 290 The data used in this article are from the websites 291 http://www.ou.edu/ogs/research/earthquakes/catalogs.html, last accessed August 28, 2018 and 292 http://www.occeweb.com/og/ogdatafiles2.htm, last accessed August 28, 2018. 293 294 295 296 Figure 1, Figure 9 and Figure 10 were made using the Generic Mapping Tools version 4.2.1 (www.soest.hawaii.edu/gmt, last accessed March 2018; Wessel and Smith (1998)). Simulations were made using the open source software packages R version 3.2 and Python version 2.7.12. 17 297 Acknowledgments 298 We are grateful to Hannelore Liero for fruitful discussions and comments. The manuscript 299 benefitted from constructive comments of two anonymous reviewers. This work was sup- 300 ported by the DFG Research Training Group “Natural hazards and risks in a changing 301 world”(NatRiskChange). GZ also acknowledges support from the DFG (SFB 1294). 302 References 303 Akman, V. E. and Raftery, A. E. (1986). Asymptotic inference for a change-point Poisson 304 305 306 307 308 process, The Annals of Statistics, 14(4), 1,583–1,590. Berger, J. O., Pericchi, L. R. and others. (2004). Training samples in objective Bayesian model selection, The Annals of Statistics, 32(3), 841–869. Csörgő, M. and L. Horváth (1997). Limit theorems in change-point analysis, John Wiley and Sons Inc, Vol. 18. 309 Ellsworth, William L. (2013). Injection-induced earthquakes, Science, 341(6142), 1225942. 310 Ghosal, S., and Samanta, T. (1995). Asymptotic behaviour of Bayes estimates and posterior 311 distributions in multiparameter nonregular cases, Mathematical Methods of Statistics, 312 4(4), 361–388. 313 Ghosal, S., Ghosh, J. K. and T. Samanta (1999). Approximation of the posterior distri- 314 bution in a change-point problem, Annals of the Institute of Statistical Mathematics, 315 51(3), 479–497. 18 316 Ghosh, J. K., Ghosal, S., and T. Samanta (1994). Stability and convergence of the posterior 317 in non-regular problems, Statistical Decision Theory and Related Topics V , Springer, 318 183–199. 319 Gupta, Abhineet, and Jack W. Baker (2015). A Bayesian change point model to detect 320 changes in event occurrence rates, with application to induced seismicity. In: 12th In- 321 ternational Conference on Applications of Statistics and Probability in Civil Engineering, 322 ICASP12, Vancouver, Canada, 8 pp.. 323 Gupta, Abhineet, and Jack W. Baker (2017). Estimating spatially varying event rates with 324 a change point using Bayesian statistics: Application to induced seismicity, Structural 325 Safety , 65, 1–11. 326 327 Kass, Robert E. and Adrian E. Raftery (1995). Bayes factors, Journal of the American Statistical Association, 90(430), 773–795. 328 Keranen, K. M., Savage, H. M., Abers, G. A. and Cochran, E. S. (2013). Potentially 329 induced earthquakes in Oklahoma, USA: Links between wastewater injection and the 330 2011 Mw 5.7 earthquake sequence, Geology , 41(6), 699–702. 331 Langenbruch, C. and Zoback, M. D. (2016). How will induced seismicity in Oklahoma 332 respond to decreased saltwater injection rates?, Science advances, 2(11), e1601542. 333 Montoya-Noguera, S. and Wang, Y. (2017). Bayesian identification of multiple seismic 334 change points and varying seismic rates caused by induced seismicity, Geophysical Re- 335 search Letters, 44(8), 3,509–3,516. 19 336 337 Ogata, Y. (1988). Statistical models for earthquake occurrences and residual analysis for point processes, Journal of the American Statistical Association, 83(401), 9–27. 338 Okabe, A., Boots, B., Sugihara, K., Chiu, S. N. and Kendall, D. G. (2008). Definitions and 339 Basic Properties of Voronoi Diagrams. Spatial Tessellations: Concepts and Applications 340 of Voronoi Diagrams, , John Wiley and Sons, Inc. , Vol. 2, 43–228. 341 342 343 344 Raftery, A. E. and Akman, V. E. (1986). Bayesian analysis of a Poisson process with a change-point, Biometrika, 73(1), 85–89. Reasenberg, P. (1985). Second-order moment of central California seismicity, 1969-1982, Journal of Geophysical Research: Solid Earth, 90(B7), 5479–5495. 345 Spiegelhalter, D. J. and Smith, A. F. (1982). Bayes factors for linear and log-linear models 346 with vague prior information, Journal of the Royal Statistical Society. Series B (Method- 347 ological), 44(3), 377–387. 348 349 350 351 van Stiphout, T., Zhuang, J. and Marsan, D. (2012). Seismicity declustering, Community Online Resource for Statistical Seismicity Analysis, 10(1). Walsh, F. R. and Zoback, M. D. (2015). Oklahoma’s recent earthquakes and saltwater disposal, Science advances, 1(5), e1500195. 352 Wang, P., Small, M. J., Harbert, W. and Pozzi, M. (2016). A Bayesian approach for 353 assessing seismic transitions associated with wastewater injections, Bulletin of the Seis- 354 mological Society of America, 106(3), 832–845. 20 355 Weingarten, M., Ge, S., Godt, J. W., Bekins, B. A. and Rubinstein, J. L. (2015). High- 356 rate injection is associated with the increase in US mid-continent seismicity, Science, 357 348(6241), 1336–1340. 358 359 360 361 Wessel, P., and W. H. F. Smith (1998). New, improved version of generic mapping tools released, EOS Trans. AGU, 79, 579. Witting, H. and Müller-Funk, U. (1995). Mathematische Statistik: asymptotische Statistik: parametrische Modelle und nichtparametrische Funktionale, Teubner , Vol. 2, 215–235. 362 Zhuang, J., Ogata, Y. and Vere-Jones, D. (2002). Stochastic declustering of space-time 363 earthquake occurrences, Journal of the American Statistical Association, 97(458), 369– 364 380. 365 Bernhard Fiedler, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 366 24-25, 14476 Potsdam, Germany. Email [email protected] 367 368 Gert Zöller, Institute of Mathematics, University of Potsdam, Karl-Liebknecht-Str. 24- 369 25, 14476 Potsdam, Germany. 370 371 Matthias Holschneider, Institute of Mathematics, University of Potsdam, Karl-Liebknecht- 372 Str. 24-25, 14476 Potsdam, Germany. 373 374 Sebastian Hainzl, GFZ German Research Centre for Geosciences, Telegrafenberg, 14473 375 Potsdam, Germany 21 376 Tables 377 Table 1 Table 1: Estimation of the α-error simulations λ1 = λ2 theoretical α estimated α number of events 1 0.05 0.061 10 1 0.05 0.057 50 1 0.05 0.052 100 1 0.05 0.049 1000 22 378 Table 2 Table 2: Bayes factors for the declustered catalog of M ≥ 3 earthquakes in Oklahoma. The results indicates that two change-points are preferable. Bayes factor Decision B01 = 3.73 × 10−158 M1 B02 = 1.67 × 10−197 M2 B03 =1.16 × 10−197 M3 B12 = 4.47 × 10−40 M2 B13 = 3.12 × 10−40 M3 B23 = 0.69 M2 23 379 Figure captions 380 Figure 1 381 (A) Magnitude-time plot for all earthquakes in Oklahoma from January 1, 1980 to De- 382 cember 31, 2015. (B) Cumulative number of earthquakes with M ≥ 3 in Oklahoma from 383 January 1, 1980 to December 31, 2015. Inset: Spatial map of all earthquakes with M ≥ 3 384 (time color-coded). 385 Figure 2 386 Estimation of α error and power depending on λ and number of events n for a hypothesis 387 test defined as H0 : λ1 = λ2 versus H1 : λ1 6= λ2 . Plots show the behavior of the 388 empirical cumulative distribution function (ecdf) of the p-values generated under the null 389 hypothesis H0 and its alternative H1 . Here we have n1 + n2 = 400 events and 1000 390 random realizations were generated. 391 Figure 3 392 Schematic diagram presenting the steps for our scan algorithm. (A) A certain area is 393 subdivided into m subareas A1 , . . . , Am . (B) Every subarea Ai is a disk with the same 394 radius r. (C) Events within the subarea Ai occur at ni times tij , so we can project it into 395 a three-dimensional domain Ai ∪ Si . (D) The time series Si is investigated with regard to 396 (i) model selection with Bayes factors, (ii) estimation of change-points, (iii) significance of 397 change-points, and (iv) credibility intervals. 24 398 Figure 4 399 Results based on 100 synthetic sequences for every evaluation: (A) Standard deviation of 400 τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as a 401 function of the number n of data points in the simulation and the ratio λ1 /λ2 of intensities 402 in the first and second half of the simulations. 403 Figure 5 404 Synthetic data: Generation of a 3D homogeneous Poisson process with different intensities. 405 The sample size is 2000. (A) Poisson process with a rate λ ≈ 0.8 (per unit area) and (B) 406 Poisson processes with different rates within the replaced cylinders i.e. the intensity in the 407 change-point domain is given by λcp ≈ 8 (per unit area). 408 Figure 6 409 Synthetic data: (A) Perspective view of the circle C1 and the change-point domain C2 with 410 the estimated significant change-point locations. (B) Example for the marginal posterior 411 p(τ | t) in the change-point domain C1 . (C) Example for the marginal posterior p(τ1 , τ2 | t) 412 in the change-point domain C2 . The logarithmic values of the density are color coded. 413 Figure 7 414 Synthetic data of a Poisson process with an intensity of 0.08 (per unit area) in the time 415 period [0, 20] in which a change-point domain is embedded (intensity λcp ≈ 0.40 within 25 416 a cylinder with radius r0 = 2, center (0, 0) and t ∈ [10, 20]): (A) Standard deviation of 417 τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as 418 a function of the selection radius. 419 Figure 8 420 (A) Magnitude-time plot with the estimated change-points for the whole declustered time 421 series. (B) Cumulative number of earthquakes with M ≥ 3 for the declustered catalog with 422 the estimated change-points (model with one change-point (green line) and two change- 423 points (red lines). Inset: Cumulative number of earthquakes for the non-declustered cat- 424 alog with the estimated change-points (model with three change-points), where the third 425 change-point coincides with the occurrence time of the MW = 5.6 mainshock. 426 Figure 9 427 Maps with transition events and the MW = 5.6 earthquake for the case study Oklahoma. 428 (A) and (B) Color-coded times of the first and second change-points at grid points where 429 the algorithm prefers two change-points: (A) first change-point and (B) second change- 430 points. (C) lllustration of all calculated transition times at grid points where the algorithm 431 preferred a model with one change-point. 432 Figure 10 433 Locations and occurrence times of the first change-points (for models with one and with 434 two change-points) in comparison to approval dates of injection wells from 1.1.2000 to 26 435 31.12.2015 for the Oklahoma case study. The high-volume injection wells (approved volume 436 > 10, 000 barrels per day) are illustrated in black. (A) Map view of the estimated change- 437 points, (B) latitude-time plot, and (C) time-longitude plot with estimated transitions and 438 injection wells. 27 439 Figures 440 Figure 1 Figure 1: (A) Magnitude-time plot for all earthquakes in Oklahoma from January 1, 1980 to December 31, 2015. (B) Cumulative number of earthquakes with M ≥ 3 in Oklahoma from January 1, 1980 to December 31, 2015. Inset: Spatial map of all earthquakes with M ≥ 3 (time color-coded). 28 441 Figure 2 Figure 2: Estimation of α error and power depending on λ and number of events n for a hypothesis test defined as H0 : λ1 = λ2 versus H1 : λ1 6= λ2 . Plots show the behavior of the empirical cumulative distribution function (ecdf) of the p-values generated under the null hypothesis H0 and its alternative H1 . Here we have n1 + n2 = 400 events and 1000 random realizations were generated. 29 442 Figure 3 Figure 3: Schematic diagram presenting the steps for our scan algorithm. (A) A certain area is subdivided into m subareas A1 , . . . , Am . (B) Every subarea Ai is a disk with the same radius r. (C) Events within the subarea Ai occur at ni times tij , so we can project it into a threedimensional domain Ai ∪ Si . (D) The time series Si is investigated with regard to (i) model selection with Bayes factors, (ii) estimation of change-points, (iii) significance of change-points, and (iv) credibility intervals. 30 443 Figure 4 Figure 4: Results based on 100 synthetic sequences for every evaluation: (A) Standard deviation of τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as a function of the number n of data points in the simulation and the ratio λ1 /λ2 of intensities in the first and second half of the simulations. 31 444 Figure 5 Figure 5: Synthetic data: Generation of a 3D homogeneous Poisson process with different intensities. The sample size is 2000. (A) Poisson process with a rate λ ≈ 0.8 (per unit area) and (B) Poisson processes with different rates within the replaced cylinders i.e. the intensity in the change-point domain is given by λcp ≈ 8 (per unit area). 32 445 Figure 6 Figure 6: Synthetic data: (A) Perspective view of the circle C1 and the change-point domain C2 with the estimated significant change-point locations. (B) Example for the marginal posterior p(τ | t) in the change-point domain C1 . (C) Example for the marginal posterior p(τ1 , τ2 | t) in the change-point domain C2 . The logarithmic values of the density are color coded. 33 446 Figure 7 Figure 7: Synthetic data of a Poisson process with an intensity of 0.08 (per unit area) in the time period [0, 20] in which a change-point domain is embedded (intensity λcp ≈ 0.40 within a cylinder with radius r0 = 2, center (0, 0) and t ∈ [10, 20]): (A) Standard deviation of τ̂ − τreal and (B) percentage of change-point detections by means of the Bayes factor as a function of the selection radius. 34 447 Figure 8 Figure 8: (A) Magnitude-time plot with the estimated change-points for the whole declustered time series. (B) Cumulative number of earthquakes with M ≥ 3 for the declustered catalog with the estimated change-points (model with one change-point (green line) and two changepoints (red lines). Inset: Cumulative number of earthquakes for the non-declustered catalog with the estimated change-points (model with three change-points), where the third change-point coincides with the occurrence time of the MW = 5.6 mainshock. 35 448 Figure 9 Figure 9: Maps with transition events and the MW = 5.6 earthquake for the case study Oklahoma. (A) and (B) Color-coded times of the first and second change-points at grid points where the algorithm prefers two change-points: (A) first change-point and (B) second change-points. (C) lllustration of all calculated transition times at grid points where the algorithm preferred a model with one change-point. 36 449 Figure 10 Figure 10: Locations and occurrence times of the first change-points (for models with one and with two change-points) in comparison to approval dates of injection wells from 1.1.2000 to 31.12.2015 for the Oklahoma case study. The high-volume injection wells (approved volume > 10, 000 barrels per day) are illustrated in black. (A) Map view of the estimated change-points, (B) latitude-time plot, and (C) time-longitude plot with estimated transitions and injection wells. 37 450 Appendix 451 Derivation of the Marginal Posterior Density 452 With the notation of the Estimation of Change-Points section, we derive the formula for 453 the Bayesian posterior density Eq. (6). Here we use Fubini’s theorem and the definition of 454 the gamma function Γ(x) = Z∞ z x−1 e−z dz (A1) 0 455 or more precisely the following calculation: Z∞ (N (τ )−N (τi−1 ) −λi (τi −τi−1 ) λi i e dλi 0 = Z∞  0 = (τi − τi−1 ) −(N (τi )−N (τi−1 )+1) z τi − τi−1 Z∞ N (τi )−N (τi−1 ) e−z z N (τi )−N (τi−1 )+1−1 e−z dz 0 = (τi − τi−1 )−(N (τi )−N (τi−1 )+1) Γ(N (τi ) − N (τi−1 ) + 1) 38 dz τi − τi−1 (A2) 456 Derivation of the Likelihood Ratio Test 457 Based on the test problem H0 : λ1 = λ2 versus H1 : λ1 6= λ2 . 458 (A3) the likelihood function for two different rates is given by p(t | λ1 , λ2 ) = λn1 1 exp(−λ1 ∆1 )λn2 2 exp(−λ2 ∆2 ), 459 where ∆1 = s2 − s1 and ∆2 = s4 − s3 . 460 The log-likelihood function is given by l (λ1 , λ2 | t) = n1 log λ1 − λ1 ∆1 + n2 log λ2 − λ2 ∆2 . (A4) (A5) 461 Under H1 we have to calculate the maximum likelihood estimator (MLE) for λ1 and λ2 . 462 From ∂l (λ1 , λ2 | t) n1 ! = − ∆1 = 0 ∂λ1 λ1 463 we get λ̂1 = 464 (A6) n1 . ∆1 (A7) Furthermore n1 ∂ 2 l (λ1 , λ2 | t) = − 2 < 0 for all λ1 ∈ R+ . 2 ∂λ1 λ1 (A8) 465 So λ̂1 is the MLE for λ1 . In the same way we can show that λ̂2 = n2 is the MLE for λ2 . ∆2 466 Under H0 is λ = λ1 = λ2 , so we get the likelihood l (t | λ) = λn1 +n2 exp[−λ(∆1 + ∆2 )]. 39 (A9) 467 468 469 The log-likelihood function is given by l (λ | t) = (n1 + n2 ) log λ − λ(∆1 + ∆2 ). (A10) n1 + n2 ∂l (λ | t) ! = − (∆1 + ∆2 ) = 0, ∂λ λ (A11) Thus which leads to λ̂ = 470 n1 + n2 . ∆1 + ∆ 2 (A12) Furthermore ∂ 2 l (λ | t) n1 + n2 =− < 0 for all λ ∈ R+ . 2 ∂λ λ2 471 So λ̂ is the MLE for λ. 472 In general the test statistic is given by Z = 2 ln 473  (A13)  p(t | H1 ) . p(t | H0 ) (A14) Hence h    i Z = 2 l λ̂1 , λ̂2 | t − l λ̂ | t (A15) leads to   n1 n2 n2 − − (∆1 ) + n2 log (∆2 ) Z = 2 n1 log ∆1 ∆2 ∆2     n1 + n2 n1 + n2 − − (n1 + n2 ) log (∆1 + ∆2 ) ∆1 + ∆ 2 ∆1 + ∆ 2  474  n1 ∆1  and finally to  Z = 2 n1 log  n1 ∆1  + n2 log  n2 ∆2  − (n1 + n2 ) log 40  n1 + n2 ∆1 + ∆ 2  . (A16) 475 Derivation of the Bayes Factors 476 The Bayes factor is defined by the ratio of the marginal or integrated likelihood for the 477 two considered models Ml (model with l change-points) and Mm (model with m change- 478 points), i.e. p(t | Ml ) , p(t | Mm ) Blm = 479 with l, m = 0, . . . , k and l 6= m. For M0 and M1 we get p(t | M0 ) = 480 (A17) Z ∞ p(λ)λn e−λ(b−a) dλ (A18) 0 and p(t | M1 ) = Z bZ a 0 ∞Z ∞ 0 N (τ ) −λ1 (τ −a) N (b)−N (τ ) −λ2 (b−τ ) e λ2 e dλ1 dλ2 dτ. p(τ )p(λ1 )p(λ2 )λ1 (A19) 481 For l ≥ 2 we obtain Z Z N (τ ) N (b)−N (τl ) −λl+1 (b−τl ) p(t | Ml ) = e p(τ1 )p(λ1 )p(λl+1 )λ1 1 e−λ1 (τ1 −a) λl+1 Λ × T l Y N (τ )−N (τi−1 ) −λi (τi −τi−1 ) p(τi )p(λi )λi i e dλ1 . . . (A20) dλl+1 dτ1 . . . dτl . i=2 482 Here is Λ = (0, ∞)l+1 and T = (a, b)l . 483 To evaluate Eq. (A18), Eq. (A19) and Eq. (A20) we use improper prior densities for the 484 2 , where ci is a not further intensities so that p(λ) = c0 λ− 2 and p(λ) = ck λ1 2 . . . λk+1 485 specified constant. Moreover we formulate uniform distributed priors for τi , i.e. p(τi ) = 486 (compare Raftery and Akman (1986)). For this approach Eq. (A18) becomes −1 1 p(t | M0 ) = Z ∞ 0 −1 1 b−a 1 c0 λ− 2 λn1 e−λ(b−a) dλ (A21) = c0 (b − a) 41 −(n+ 21 ) 1 Γ(n + ). 2 487 Further Eq. (A19) becomes Z bZ ∞Z ∞ c1 N (τ )− 12 −λ1 (τ −a) N (b)−N (τ )− 21 −λ2 (b−τ ) λ e λ2 e dλ1 dλ2 dτ p(t | M1 ) = b−a 1 0 0 a Z ti+1 n 1 1 1 1 c1 X (τ − a)−(i+ 2 ) (b − τ )−(n−i+ 2 ) dτ, Γ(i + )Γ(n − i + ) = b−a 2 2 ti i=0 (A22) 488 with t0 = a and tn+1 = b. The resulting Bayes factor B01 contains an unspecified constant 489 c0 /c1 , which can determined by using the boundary condition B01 ≈ 1, if we consider an 490 observation period of [a, b] consisting only a single event t1 = (a + b)/2. So Eq. (A22) 491 becomes Z ti+1 1 1 1 c1 X 1 1 p(t | M1 ) = (τ − a)−(i+ 2 ) (b − τ )−(1−i+ 2 ) dτ Γ(i + )Γ(n − i + ) b−a 2 2 ti i=0 "Z (a+b)/2 1 3 c1 = (τ − a)− 2 (b − τ )− 2 dτ Γ(0.5)Γ(1.5) b−a a # Z b 1 3 + (τ − a)− 2 (b − τ )− 2 dτ (a+b)/2 = √ c1 4 πΓ(1.5). 2 (b − a) (A23) 492 493 √ 1 ! If c0 /c1 =: c01 (a, b), we receive by solving B01 = 1 that c01 (a, b) = 4 π(b − a)− 2 . Finally we get Eq. (10). In the same way we can evaluate Eq. (A20). Here we have to consider Z bZ bZ ∞Z ∞Z ∞ 1 N (τ1 )− 21 −λ1 (τ1 −a) N (τ2 )−N (τ1 )− 21 p(t | M2 ) = c2 λ1 e λ2 2 (b − a) 0 0 0 a a N (b)−N (τ2 )− 21 −λ3 (b−τ2 ) × e−λ2 (τ2 −τ1 ) λ3 e dλ1 dλ2 dλ3 dτ1 dτ2 n X n X 1 1 1 1 Γ(i + )Γ(j − i + )Γ(n − j + ) 2 (b − a) 2 2 2 i=0 j=i+1 Z ti+1 Z tj+1 1 1 1 × (τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(n−j+ 2 ) dτ1 dτ2 . = c2 ti tj (A24) 42 494 By using the training sample method we obtain p(t | M2 ) = 1 1 X X c2 1 1 1 Γ(i + )Γ(j − i + )Γ(1 − j + ) 2 (b − a) 2 2 2 i=0 j=i+1 Z ti+1 Z tj+1 1 1 1 (τ1 − a)−(i+ 2 ) (τ2 − τ1 )−(j−i+ 2 ) (b − τ2 )−(1−j+ 2 ) dτ1 dτ2 × tj ti [Γ(0.5)]2 Γ(1.5) = c2 (b − a)2 = c2 (b − a) 5 2 Z a (a+b)/2 Z b 1 (a+b)/2 3 1 (τ1 − a)− 2 (τ2 − τ1 )− 2 (b − τ2 )− 2 dτ1 dτ2 2π 2 Γ(1.5). (A25) 495 Without loss of generality we assume that τ2 > τ1 , so that we have to multiply the resulting 496 constant with the factor 2. This finally leads to c02 (a, b) = 4π 2 (b − a)−1 and Eq. (11). To 497 compare M1 and M2 we use B12 = B02 . B01 (A26) 498 For the general case Blm , we first calculate the Bayes factors B0l and B0m by using 499 the training sample method to get the occurring constants as shown in Eq. (A23) or in 500 Eq. (A25) and then straightforward Blm = B0m . B0l 43 (A27) 501 Using the priors as explained before, evaluation of Eq. (A20) leads to p(t | Ml ) = cl × = Z Z Λ T l Y N (τ1 )− 12 −λ1 (τ1 −a) p(τ1 )λ1 N (b)−N (τl )− 12 −λl+1 (b−τl ) p(τl )λl+1 N (τi )−N (τi−1 )− 12 −λi (τi −τi−1 ) e p(τi )λi e dλ1 . . . dλl+1 dτ1 . . . dτl i=2 n X cl ... (b − a)l i1 =0 × e Z ti1 +1 ... t i1 Z n X il =il−1 +1 til +1 t il l 1 Y 1 1 Γ(ij − ij−1 + ) Γ(i1 + )Γ(n − il + ) 2 2 2 j=2 1 1 (τ1 − a)−(i1 + 2 ) (b − τl )−(n−il + 2 ) l Y j=2 1 (τj − τj−1 )−(ij −ij−1 + 2 ) × dτ1 . . . dτl . (A28) 502 With the help of the training sample method, the occurring constants can be calculated. 503 √ 5 3 As a further example for B03 we get c03 (a, b) = 4 2π 2 (b − a)− 2 . 504 For model selection we use the following algorithm: 505 506 i) Define the maximum number k of possible change-points in the investigated data. ii) Set m = 0. 507 iii) Calculate the Bayes factors Bml with l = m + 1, . . . , k. 508 iv) Calculate lnew = arg min {Bml < 0.3}. l∈{m+1,...,k} 509 510 v) If lnew exists, set m = lnew and go to step iii). Otherwise, select a model where the number of change-points is equal to m. 44 511 Case Study Oklahoma: Evaluation with Different Choices of 512 the Radius 513 In comparison to the results illustrated in Fig. 9 where we used a radius r = 5 km, Fig. A1 shows the transition events for the radii r = 2 and r = 10 km. Figure A1: Maps with transition events and the MW = 5.6 earthquake for the case study Oklahoma. (A) and (B) Illustration of all calculated change-point locations where the algorithm prefers two change-points by using a radius of 2 km. (C) and (D) Illustration of all calculated change-point locations where the algorithm prefers two change-points by using a radius of 10 km. (E) and (F) show all calculated transition events where the algorithm prefers a model with one change-point, e.g. (E) r = 2 km and (F) r = 10 km. 514 45