

IASSAR — Safety, Reliability, Risk, Resilience and Sustainability of Structures and Infrastructure. 12th Int. Conf. on Structural Safety and Reliability, Vienna, Austria, 6–10 August 2017. Christian Bucher, Bruce R. Ellingwood, Dan M. Frangopol (Editors). © 2017 TU-Verlag Vienna, ISBN 978-3-903024-28-1

Spatially distributed non-stationary seismic hazard

Pengyun Wang$^a$, Mitchell J. Small$^a$, William Harbert$^b$, and Matteo Pozzi$^a$

$^a$ Department of Civil and Environmental Engineering, Carnegie Mellon University
$^b$ Department of Geology and Planetary Science, University of Pittsburgh

Abstract: Wastewater disposal by underground injection can increase seismic activity and the corresponding hazard. Modeling this impact is challenging because of the scarcity of available data, the imperfect knowledge about natural seismicity, the uncertain dynamics of induced seismicity and the uncertain spatial relationship between injection volumes and the corresponding increases in seismicity. Bayesian modeling allows for a consistent analysis of heterogeneous data and can be the basis for spatio-temporal seismic hazard assessment. This paper adopts a Bayesian framework to focus on seismic productivity, modeling event rate and location as a stochastic dynamic system. The spatial properties of the system are modeled by discretizing the seismic region into a grid of inter-dependent pointwise components, and the knowledge about their activities is progressively updated. We select appropriate distributions for modeling the uncertainty on the initial condition and for the transition, and use the Particle Filter method, a sequential Monte Carlo approach, to numerically approximate the posterior distribution. To overcome potential problems, such as particle impoverishment and particle degeneration, we use Importance Sampling and resampling techniques. The analysis focuses on the Oklahoma region, for which extensive data on seismic activity and injections are available; however, the method is generally applicable.

1 Introduction

Characterization of seismic event rates in time and space is necessary for performing accurate probabilistic hazard assessment. Injection of water at depth, such as that related to the disposal of wastewater in energy production, can increase seismic hazard [1]. However, due to the highly non-stationary nature of induced seismicity, existing methods for characterizing tectonic events [2-3] cannot be directly applied in this analysis. This study presents a general sequential Bayesian inference method for characterizing spatio-temporal seismic activity, broadly relevant for modeling the effects of fluid injections and of other sources of induced seismicity, by recursively updating the event rate based on current observations. We assume that recordings of seismic activity are collected and available to be processed. The method relies on discretizing a region into a spatial grid and a time period into time intervals. The seismic event rate is described at points on that spatio-temporal domain. Due to the lack of analytical solutions for the inference problem in this setting, the method adopts a sequential Monte Carlo approach called the Particle Filter, in which a set of weighted samples (particles) is periodically updated to represent the posterior distribution of the seismic rate.

Figure 1: Graphical representation of the process for the seismic rate evolution and measures.
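To make the structure of Figure 1 concrete, the following minimal sketch simulates the generative process it depicts: a hidden per-cell rate vector evolving in discrete time, observed through Poisson event counts. The sketch is illustrative only: the grid size, interval length and the simple lognormal random-walk transition are placeholder assumptions of ours (the paper's actual transition model is specified in Section 2.1).

```python
import numpy as np

rng = np.random.default_rng(0)

n_cells = 45       # assumed grid size (cf. the 5-by-9 grid of Section 3.2)
n_steps = 24       # assumed number of time intervals
dt_days = 61.0     # assumed interval length in days (~2 months)

# Hidden state: log event rate per cell (events/day). The random-walk
# transition below is a placeholder for the model of Section 2.1.
log_rate = np.full(n_cells, np.log(0.001))
counts = np.empty((n_steps, n_cells), dtype=int)

for k in range(n_steps):
    log_rate = log_rate + 0.1 * rng.standard_normal(n_cells)
    # Observation H_k: Poisson counts per cell over the interval.
    counts[k] = rng.poisson(np.exp(log_rate) * dt_days)
```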
The Bayesian framework provides a flexible structure, because it allows processing instrumental seismic records while incorporating geophysical and model-based knowledge. The performance of the proposed model is investigated in an application to the earthquake catalog for the state of Oklahoma. Oklahoma is well known for a long history of subsurface wastewater disposal, with numerous injection wells across the state. Although injection wells were first deployed decades ago, the state has recently seen a remarkable increase in regional seismic activity (it has been suggested that induced earthquakes also occurred in Oklahoma during those earlier decades, but at lower frequency and magnitude [3]). The seismic event catalog is pre-processed (as described in Section 3.1) and then analyzed by the proposed method to infer how the seismic event rate evolved across Oklahoma. The method is shown to represent well the spatial seismic trend in the study region.

2 Spatio-temporal Point Process Model

We consider earthquake occurrences as a spatio-temporal point process (STPP). An STPP is a collection of points, where each point represents the time and location of an event. Such a process is generated by a non-uniform Poisson process with event rate function $\lambda(t, \mathbf{z})$, depending on time $t$ and spatial coordinate $\mathbf{z} = [x\ y]^\mathrm{T}$, where $x$ and $y$ refer to orthogonal horizontal directions. Conditional on function $\lambda$, we can calculate the probability density for each sequence of events. Conversely, given a sequence of events and uncertain prior knowledge about $\lambda$, we can update our belief on $\lambda$ by applying Bayes' rule. The logarithm of the likelihood function related to an event sequence can be expressed as:

$$\log p(\mathbf{H} \mid \lambda) = \sum_{i=1}^{N} \log[\lambda(t_i, \mathbf{z}_i)] - \int_0^T \int_{\mathbf{z}\in A} \lambda(t, \mathbf{z})\, d\mathbf{z}\, dt \tag{1}$$

where $\mathbf{H} = \{(t_i, \mathbf{z}_i): i = 1, \ldots, N\}$ contains the complete sequence of times and space coordinates of $N$ events occurring in region $A$ during time interval $[0, T]$. In the following, we specify how to define a prior distribution of $\lambda$ and how to infer its parameters.

2.1 Model of seismic rate evolution

To specify the form of function $\lambda$, we discretize the region into a grid, and the period into time intervals of equal duration. Let vector $\boldsymbol{\lambda}_k$ contain the event rates at each point on the grid during time interval $t_k$: this is the system state. As a result, $\lambda$ can be represented by a sequence of vectors $\{\boldsymbol{\lambda}_1, \boldsymbol{\lambda}_2, \ldots\}$. Correspondingly, let $\mathbf{H}_k$ contain the events occurring during interval $(t_{k-1}, t_k]$. We assume that the state evolution follows a first-order Markov process, as graphically displayed in Figure 1. To infer the hidden state sequence, we can use a Bayesian sequential updating scheme, in which we compute the posterior distribution of the current state based on that of the previous one, the transition model and the available current observations.

We assign a distribution to the initial state and specify the state transition function. Let the distribution of $\boldsymbol{\lambda}_0$ be modeled by a multivariate lognormal distribution:

$$\boldsymbol{\lambda}_0 \sim \mathcal{LN}(\boldsymbol{\mu}_0, \boldsymbol{\Sigma}_0) \tag{2}$$

where $\boldsymbol{\mu}_0$ is a vector and $\boldsymbol{\Sigma}_0$ a matrix, representing the mean and covariance of vector $\log(\boldsymbol{\lambda}_0)$, respectively. In that covariance matrix, the correlation $\rho_{sr}$ between values of $\log(\lambda)$ at coordinates $\mathbf{z}_s$ and $\mathbf{z}_r$ at the same time is assigned as:

$$\rho_{sr} = \exp\left[-\frac{1}{2l^2}(\mathbf{z}_s - \mathbf{z}_r)^\mathrm{T}(\mathbf{z}_s - \mathbf{z}_r)\right] \tag{3}$$

where hyper-parameter $l$ controls the decay of the correlation as a function of the distance.
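A minimal numerical sketch of these ingredients follows: the squared-exponential correlation of Eq. 3, sampling from the lognormal prior of Eq. 2, and the discretized log-likelihood obtained from Eq. 1 when $\lambda$ is uniform within each cell and constant over the interval (so each cell's count is Poisson). The function names are ours, and per-cell rates are expressed as expected events per day, with the cell area absorbed into the rate.

```python
import numpy as np
from scipy.stats import poisson

def correlation_matrix(coords, l):
    """Squared-exponential correlation of Eq. 3 between all cell centers.
    coords: (n, 2) array of cell-center coordinates; l: correlation length."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * l ** 2))

def sample_initial_rates(mu0, Sigma0, n_samples, rng):
    """Draw rate vectors from the multivariate lognormal prior of Eq. 2."""
    return np.exp(rng.multivariate_normal(mu0, Sigma0, size=n_samples))

def log_likelihood(counts_k, rates, dt):
    """Discretized Eq. 1: with the rate uniform in each cell and constant
    in time, the count in each cell is Poisson with mean rate * dt."""
    return poisson.logpmf(counts_k, rates * dt).sum(axis=-1)
```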
The state transition during time interval $(t_{k-1}, t_k]$ is assumed to take the following form:

$$\boldsymbol{\lambda}_k = \boldsymbol{\lambda}_{k-1} + \mathbf{a}_k \tag{4}$$

where $\mathbf{a}_k$ is a random vector encoding the uncertain amount of change in seismic rate, possibly due to the effect of induced seismicity, for all grid cells at time $k$. The sequence of $\mathbf{a}$ is assumed to be stationary in time and independently distributed across times. We assume $\mathbf{a}_k$ to be zero with probability $1-P$, where $P = \mathbb{P}[C_k = 1]$ and binary random variable $C_k$ takes value in $\{0, 1\}$, with $C_k = 0$ standing for no increase and $C_k = 1$ for the occurrence of an increase. When $C_k = 1$, $\mathbf{a}_k$ is lognormally distributed. Thus, $\mathbf{a}_k$ is distributed as follows:

$$p(\mathbf{a}_k) = (1-P)\,\delta(\mathbf{a}_k) + P\,\mathcal{LN}(\mathbf{a}_k; \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_a) \tag{5}$$

where $\delta(\cdot)$ is the delta function centered at 0, and $\boldsymbol{\mu}_a$ and $\boldsymbol{\Sigma}_a$ are a vector and a matrix of parameters. The correlation encoded in matrix $\boldsymbol{\Sigma}_a$ for rate changes is calculated according to Eq. 3, using the same parameter $l$. This parameter can be selected by expert knowledge or by optimizing the model likelihood. Combining Eqs. 4 and 5, the transition function $p(\boldsymbol{\lambda}_k|\boldsymbol{\lambda}_{k-1})$ can be represented by:

$$p(\boldsymbol{\lambda}_k \mid \boldsymbol{\lambda}_{k-1}) = (1-P)\,\delta(\boldsymbol{\lambda}_k - \boldsymbol{\lambda}_{k-1}) + P\,\mathcal{LN}(\boldsymbol{\lambda}_k - \boldsymbol{\lambda}_{k-1}; \boldsymbol{\mu}_a, \boldsymbol{\Sigma}_a) \tag{6}$$

With the initial condition model defined in Eq. 2, the transition model of Eq. 6 and the likelihood function of Eq. 1, we can now estimate the sequence of $\boldsymbol{\lambda}_k$ based on the set of all available observation vectors $\mathbf{H}_{1:k} = \{\mathbf{H}_1, \mathbf{H}_2, \ldots, \mathbf{H}_k\}$. Here $\mathbf{H}_k$ is a vector reporting the number of events observed in each cell during the time interval. Specifically, we can integrate Eq. 1 by approximating $\lambda$ as spatially uniform in the sub-region associated with each grid point, and constant in time during the time interval. In the so-called "filtering problem", we estimate the distribution $p(\boldsymbol{\lambda}_k|\mathbf{H}_{1:k})$ that, at time $t_k$, defines the posterior probability of the current seismic rate given all observations collected so far. This can be done following an iterative scheme.

2.2 Updating process via Particle Filter

We use a Particle Filter (PF) scheme, a general sequential Monte Carlo approach for approximating distributions. It is traditionally based on Sequential Importance Sampling [4], which approximately represents the distribution $p(\boldsymbol{\lambda}_{k-1}|\mathbf{H}_{1:k-1})$ at time $t_{k-1}$ by a set of $N_s$ weighted samples $\{\boldsymbol{\lambda}_{k-1}^j, w_{k-1}^j\}_{j=1}^{N_s}$, also known as particles, and recursively updates these particles to obtain a set $\{\boldsymbol{\lambda}_k^j, w_k^j\}_{j=1}^{N_s}$ approximating the posterior distribution $p(\boldsymbol{\lambda}_k|\mathbf{H}_{1:k})$ at the next time step. The particles are propagated so that particle $j$ follows this trajectory:

$$\boldsymbol{\lambda}_k^j \sim q(\boldsymbol{\lambda}_k \mid \boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k) \qquad w_k^j = w_{k-1}^j\, \frac{p(\mathbf{H}_k \mid \boldsymbol{\lambda}_k^j)\, p(\boldsymbol{\lambda}_k^j \mid \boldsymbol{\lambda}_{k-1}^j)}{q(\boldsymbol{\lambda}_k^j \mid \boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k)} \tag{7, 8}$$

where $q(\boldsymbol{\lambda}_k^j|\boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k)$ is an importance distribution for propagating particles from time $t_{k-1}$ to time $t_k$, which depends only on the previous state and the current observation, and $p(\boldsymbol{\lambda}_k^j|\boldsymbol{\lambda}_{k-1}^j)$ is the predictive density of Eq. 6. Weights $\{w_k^j\}_{j=1}^{N_s}$ are normalized at each time step so that their sum is one. If $q(\boldsymbol{\lambda}_k^j|\boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k)$ is chosen equal to $p(\boldsymbol{\lambda}_k^j|\boldsymbol{\lambda}_{k-1}^j)$, then Eq. 8 reduces to $w_k^j = w_{k-1}^j\, p(\mathbf{H}_k|\boldsymbol{\lambda}_k^j)$. This variant of the PF scheme is referred to as the "Bootstrap Particle Filter". However, using that scheme, the particles are often unable to explore well the important region of the posterior space (where the likelihood, and hence the posterior, is high). This problem is particularly severe when the transition density $p(\boldsymbol{\lambda}_k|\boldsymbol{\lambda}_{k-1})$ is broad and the dimension of the state vector $\boldsymbol{\lambda}_k$ is high.
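Before turning to the informative importance distribution proposed next, the bootstrap variant just described can be sketched as follows, combining the mixture transition of Eqs. 4-6 with the simplified weight update $w_k^j = w_{k-1}^j\,p(\mathbf{H}_k|\boldsymbol{\lambda}_k^j)$. This is an illustrative implementation under our own naming, not the authors' code; the resampling step anticipates Eq. 11 below.

```python
import numpy as np
from scipy.stats import poisson

def propagate(lam_prev, P, mu_a, Sigma_a, rng):
    """Sample the mixture transition of Eqs. 4-6: with probability 1-P the
    state is unchanged; otherwise a lognormal increment a_k is added."""
    a = np.zeros_like(lam_prev)
    change = rng.random(lam_prev.shape[0]) < P
    if change.any():
        a[change] = np.exp(
            rng.multivariate_normal(mu_a, Sigma_a, size=int(change.sum())))
    return lam_prev + a

def bootstrap_pf_step(lam_prev, w_prev, counts_k, dt, P, mu_a, Sigma_a, rng):
    """One bootstrap-PF update: q equals the transition density, so Eq. 8
    reduces to w_k = w_{k-1} * p(H_k | lam_k)."""
    lam = propagate(lam_prev, P, mu_a, Sigma_a, rng)
    log_w = np.log(w_prev) + poisson.logpmf(counts_k, lam * dt).sum(axis=1)
    w = np.exp(log_w - log_w.max())    # stabilize before normalizing
    w /= w.sum()
    # Systematic resampling (cf. Eq. 11) when the effective sample size drops.
    if 1.0 / (w ** 2).sum() < 0.5 * len(w):
        u = (rng.random() + np.arange(len(w))) / len(w)
        idx = np.minimum(np.searchsorted(np.cumsum(w), u), len(w) - 1)
        lam, w = lam[idx], np.full(len(w), 1.0 / len(w))
    return lam, w
```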
In order to draw samples effectively, one needs to design an informative importance distribution, considering both the predictive density and the likelihood. We propose $q(\boldsymbol{\lambda}_k^j|\boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k)$ to be of a similar form to the predictive density:

$$q(\boldsymbol{\lambda}_k^j \mid \boldsymbol{\lambda}_{k-1}^j, \mathbf{H}_k) = (1-\hat{P}_k^j)\,\delta(\boldsymbol{\lambda}_k^j - \boldsymbol{\lambda}_{k-1}^j) + \hat{P}_k^j\,\mathcal{LN}(\boldsymbol{\lambda}_k^j - \boldsymbol{\lambda}_{k-1}^j; \hat{\mathbf{x}}_k^j, \beta\boldsymbol{\Sigma}_a) \tag{9}$$

where $\hat{\mathbf{x}}_k^j$ is the point estimate for the posterior mode of the log of $\boldsymbol{\lambda}_k^j$ in the case of change (i.e., when $C_k = 1$), $\boldsymbol{\Sigma}_a$ is the same covariance matrix as in the predictive density, $\beta$ is a scalar controlling the spread of the importance density, and $\hat{P}_k^j$ estimates the posterior probability that $C_k = 1$, given by:

$$\hat{P}_k^j = 1 - \frac{p(\mathbf{H}_k \mid \boldsymbol{\lambda}_k = \boldsymbol{\lambda}_{k-1}^j)(1-P)}{p(\mathbf{H}_k \mid \boldsymbol{\lambda}_k = \boldsymbol{\lambda}_{k-1}^j)(1-P) + p(\mathbf{H}_k \mid \boldsymbol{\lambda}_k = \boldsymbol{\lambda}_{k-1}^j + e^{\boldsymbol{\mu}_a})P} \tag{10}$$

In practice, iterating the update equations of Eqs. 7-8 leads to a degeneracy problem in which a few particles take the majority of the total weight. To overcome this, we can resample a new set of $N_s$ particles according to the discrete approximation to the distribution $p(\boldsymbol{\lambda}_k|\mathbf{H}_{1:k})$ provided by the weighted particles:

$$p(\boldsymbol{\lambda}_k \mid \mathbf{H}_{1:k}) \approx \sum_{j=1}^{N_s} w_k^j\,\delta(\boldsymbol{\lambda}_k - \boldsymbol{\lambda}_k^j) \tag{11}$$

After resampling, the weight of each particle is set to $1/N_s$. Resampling thus effectively deals with the degeneracy problem by discarding particles with small weights.

Figure 2: Spatial illustration of the de-clustered catalog (M ≥ 2.5), with each solid point representing an event; the blue rectangle marks the application region (a). The 5-by-9 grid on the map of Oklahoma; the labeled cells are used to show model performance in detail (b).

3 Application to the Oklahoma earthquake catalogue

The proposed model and method are applied to the earthquake catalog for Oklahoma (Oklahoma Geological Survey). Although injection wells were first deployed decades ago, Oklahoma has recently seen a remarkable increase in regional seismic activity. The largest earthquake in the history of the state (magnitude ML 5.7) struck its central region in 2011 and damaged nearby infrastructure. More recently, an ML 5.0 event struck near Cushing, and building damage has been reported. The authors have established statistical models for the detection and quantification of induced earthquakes in Oklahoma, including models for early detection [5] and for assessing transitions [6] of induced seismicity. The early detection model is able to detect induced seismicity at an early stage, and the transition-assessing model allows for quantifying the extent to which induced seismicity increases in time. However, those methods do not provide the ability to characterize the spatial evolution of induced seismicity.

3.1 Dataset and Pre-processing

The dataset roughly spans the period from January 1975 to March 2016. The total number of recorded seismic events during this period is 20,595, with the largest being ML 5.7 and the smallest ML −1.2. The Entire-Magnitude-Range method [7] indicates that the magnitude of completeness (Mc) of this dataset is ML 2.4 ± 0.06 (uncertainty calculated by bootstrapping). In this study, a more conservative value of ML 2.5 is used for Mc, to mitigate the effect of seismic events that occurred below this threshold but went unrecorded. After selecting events above Mc, a total of 7,070 events, constituting a complete catalog, remain for subsequent analysis.
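As a pre-processing sketch, the following hypothetical helper filters a catalog (once it is de-clustered, as described next) to M ≥ Mc and bins events into the observation vectors $\mathbf{H}_k$, using the 0.5° grid and study-region bounds introduced in Section 3.2 below. The function name and interface are our assumptions.

```python
import numpy as np

def bin_catalog(times, lons, lats, mags, t_edges, mc=2.5):
    """Filter a catalog to M >= Mc and count events per time interval and
    per 0.5-degree cell of the 5-by-9 study grid (Section 3.2), returning
    one flattened 45-component observation vector H_k per interval."""
    lon_edges = np.arange(-99.5, -94.99, 0.5)   # 99.5W to 95W -> 9 columns
    lat_edges = np.arange(34.5, 37.01, 0.5)     # 34.5N to 37N -> 5 rows
    keep = mags >= mc
    counts, _ = np.histogramdd(
        np.column_stack([times[keep], lats[keep], lons[keep]]),
        bins=(t_edges, lat_edges, lon_edges))
    return counts.reshape(len(t_edges) - 1, -1).astype(int)
```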
Since the proposed method assumes independence of event occurrences, while the catalogue contains both independent and dependent events, the dataset is de-clustered using the algorithm proposed by Zhuang et al. [8]. The resulting dataset contains 4,327 events, shown on the map of Oklahoma in Figure 2(a).

3.2 Model parameters for processing the Oklahoma catalogue

The period covered by the dataset is discretized into 2-month intervals, resulting in 247 time steps in total. In order to apply the proposed method in a regular setting, we focus on the events in a rectangular geographical region within the Oklahoma border, as shown by the rectangle in Figure 2. The study region extends from 99.5°W to 95°W in longitude and from 34.5°N to 37°N in latitude. It is divided into cells of 0.5°×0.5°, generating a 5-by-9 grid and thus a state vector with 45 components. The coordinates of each point on the grid are those of the center of the corresponding cell. Figure 2(b) shows the grid on the Oklahoma map. Since some cells experience more events during the study period than others, they are labeled for highlighting purposes; specifically, 16 cells are selected. Although parameter $l$ could be optimized for model performance, in the following it is assumed to be 50 km, representing a moderate decay of correlation. Parameters $\beta$ and $P$ are set to 5 and 0.1, respectively.

3.3 Outcomes of the analysis of the Oklahoma catalogue

The inference results from the proposed model are displayed below. First, Figure 3 shows the estimated event rate as a function of time for the whole study region. It suggests that the event rate experienced a sequence of increases, rising from 0.01 to over 7 events per day and initiating around 2010, which is consistent with results from the existing literature on earthquake activity in Oklahoma [9-10]. It is notable that the uncertainty in the inference changes with time. Over an extended period of stationary earthquake activity, the inference uncertainty reduces as more data become available, as is the case for the period from 1975 to 2010. However, the uncertainty increases right after the changes occurring in 2010, 2011 and 2014. This is caused by the limited availability of data regarding the state after the transition; later on, as more data become available, the uncertainty reduces until the occurrence of new changes. The uptick in the estimated event rate at the end of the dataset seems in conflict with the decline in the overall observed event rate, but it can be explained by the increases in individual cells, including cells 2, 3, 6 and 7, as shown in Figure 4.

Figure 4 shows how the mean estimated event rate evolves in each of the highlighted cells from 2005 to 2016. In these cells, seismic rates increase dramatically during the displayed period, from below 0.001 up to over 0.1, or even over 1, events/day. The first increase occurred around 2010, due to the increased presence of seismic events in cells 7, 9 and 11-16. After 2014 or 2015, almost all of these cells experienced significant increases in seismic rate. In general, the model prediction is consistent with the seismic trend observed across Oklahoma: if the number of observed events in a cell or its surrounding cells increases significantly, the event rate in that cell is predicted to jump accordingly; if there is no increase, or the increase is insignificant, the system state stays relatively stationary.
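The rate maps and interval bands discussed next are posterior summaries extracted from the weighted particles of Eq. 11. A minimal sketch, with hypothetical function names, of how such summaries can be computed:

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Quantile of a 1-D weighted sample (for credible bounds)."""
    order = np.argsort(values)
    cdf = np.cumsum(weights[order])
    i = min(np.searchsorted(cdf, q), len(values) - 1)
    return values[order][i]

def posterior_rate_summary(particles, weights, qlo=0.05, qhi=0.95):
    """Per-cell posterior mean and credible bounds of the seismic rate,
    from particles (N_s, n_cells) and normalized weights (N_s,)."""
    mean = weights @ particles          # posterior-mean rate per cell
    lo = np.array([weighted_quantile(particles[:, s], weights, qlo)
                   for s in range(particles.shape[1])])
    hi = np.array([weighted_quantile(particles[:, s], weights, qhi)
                   for s in range(particles.shape[1])])
    return mean, lo, hi
```

Reshaping the mean vector to the 5-by-9 grid yields a rate map like that of Figure 5, and the mean rate multiplied by the interval length gives the expected event count used for the short-term predictions discussed below.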
The model can also be used to produce an updated seismic rate map. Figure 5 shows a map of the mean seismic rate across Oklahoma, predicted after processing the seismic data for the first two months of 2016. As can be seen, the center and north of Oklahoma are the areas most affected by induced seismicity. Such an updated seismic rate can be combined with ground motion models to produce an updated seismic hazard.

Figure 3: Average estimated event rate as a function of time for the whole study region, along with its confidence interval and the observed event rate.

Figure 4: Average estimated event rate as a function of time for each highlighted cell, along with the observed event rate.

Figure 5: Mean estimated event rate after processing the seismic data for the first two months of 2016.

So far, we have shown the model's ability to sequentially update the event rate for the region based on the previous state and the current observation. It is also of interest to examine its forecasting performance. To do so, we need to specify how to perform predictions of future earthquakes using the model. Unlike traditional earthquake predictions, made for a long-term future period (e.g., 50 years), we focus on short-term predictions, due to the fast-changing process of induced seismicity. Specifically, the prediction is based on the average estimated event rate resulting from the current time step, which is used to predict the number of events occurring in the next time step, 2 months later. This prediction procedure is applied to each cell at each time step. The predicted number of events in each time step for each cell is shown in Figure 6, along with the number of observed events; the display covers only the critical period from 2009 to 2016, for the highlighted cells. An ideal prediction result is such that the number of observed events equals the number of predicted events, as indicated by the curve in each plot. Because of intrinsic randomness and model imperfections, the agreement is not ideal, as shown in Figure 6.

Figure 6: Predicted number of events in each time step for each cell, along with the number of observed events. The timing of the prediction is indicated by the color of the circle, with blue representing the beginning of the period and red the most recent time. The curve in each subplot represents an ideal prediction, on which the predicted number of events equals the observed number.

4 Conclusions

Spatio-temporal maps of seismic rates are critical for assessing the hazard due to induced earthquakes. In this paper, we propose a general method for probabilistic inference appropriate for developing these maps. The method sequentially processes recordings of seismic activity, following a Bayesian approach. Application to the Oklahoma seismic catalogue shows that the proposed method is able to characterize well the spatial evolution of induced seismicity. A considerable part of the region (e.g., the center and north) has seen significant increases in seismic rate. The onset times and magnitudes of these changes vary across sub-areas, with onset times ranging from 2010 to 2014 and magnitudes ranging from 0.1 to 1 events per day. The model provides insight into the spatial distribution of induced seismicity. It can also be used as a tool for monitoring and periodically updating regional seismic hazard influenced by the uncertain effect of fluid injections. Finally, the model can be used to carry out short-term earthquake forecasting, to help the risk management of injection wells.
Acknowledgements

The support for this research was provided by the H. John Heinz Professorship of Environmental Engineering and by the Scott Institute Seed Grants from the Wilton E. Scott Institute for Energy Innovation at Carnegie Mellon University.

References

[1] McGarr, A., Bekins, B., Burkardt, N., Dewey, J., Earle, P., Ellsworth, W., Ge, S., Hickman, S., Holland, A., Majer, E., Rubinstein, J. and Sheehan, A. Coping with earthquakes induced by fluid injection, Science 347, no. 6224, 830-831, 2015.
[2] Lawrence, M., McCann, M., Ostenaa, D., Wong, I., Unruh, J., Hanson, K., Olig, S., Clague, J., LaForge, R., Lettis, W., Swan, B., Zachariasen, J., Youngs, R. and Addo, K. The BC Hydro SSHAC Level 3 seismic source model, Proc. of the 10th Natl. Conf. on Earthquake Eng., Earthquake Engineering Research Institute, Anchorage, AK, 2014.
[3] Hough, S.E. and Page, M. A century of induced earthquakes in Oklahoma?, Bulletin of the Seismological Society of America 105, no. 6, 2863-2870, 2015.
[4] Liu, J.S., Chen, R. and Logvinenko, T. A theoretical framework for sequential importance sampling with resampling, Sequential Monte Carlo Methods in Practice, 225-246, Springer, New York, 2001.
[5] Wang, P., Pozzi, M., Small, M.J. and Harbert, W. Statistical method for early detection of changes in seismic rate associated with wastewater injections, Bulletin of the Seismological Society of America 105, no. 6, doi: 10.1785/0120150038, 2015.
[6] Wang, P., Small, M.J., Harbert, W. and Pozzi, M. A Bayesian approach for assessing seismic transitions associated with wastewater injections, Bulletin of the Seismological Society of America 106, no. 3, 832-845, 2016.
[7] Woessner, J. and Wiemer, S. Assessing the quality of earthquake catalogues: Estimating the magnitude of completeness and its uncertainty, Bulletin of the Seismological Society of America 95, no. 2, 684-698, 2005.
[8] Zhuang, J., Ogata, Y. and Vere-Jones, D. Stochastic declustering of space-time earthquake occurrences, Journal of the American Statistical Association 97, no. 458, 369-380, 2002.