Pham 2024 Gradient Basedjointinversionofpoint Sourcemom

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/381370052

Gradient-based joint inversion of point-source moment tensor and station-


specific time-shifts

Article in Geophysical Journal International · June 2024


DOI: 10.1093/gji/ggae188

CITATIONS READS

0 73

1 author:

Thanh-Son Pham
Australian National University
30 PUBLICATIONS 406 CITATIONS

SEE PROFILE

All content following this page was uploaded by Thanh-Son Pham on 13 June 2024.

The user has requested enhancement of the downloaded file.


Geophysical Journal International Royal
Astronomical
Society

Geophys. J. Int. (2024) 238, 783–793 https://doi.org/10.1093/gji/ggae188


Advance Access publication 2024 May 31
GJI Seismology

Gradient-based joint inversion of point-source moment tensor and


station-specific time-shifts

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Thanh-Son Pha. m
Research School of Earth Sciences, The Australian National University, Canberra, Acton, ACT 0200, Australia. E-mail: [email protected]

Accepted 2024 May 28. Received 2024 May 20; in original form 2023 September 16

SUMMARY
The misalignment of the observation and predicted waveforms in regional moment tensor inver-
sion is mainly due to seismic models’ incomplete representation of the Earth’s heterogeneities.
Current moment tensor inversion techniques, allowing station-specific time-shifts to account
for the model error, are computationally expensive. Here, we propose a gradient-based method
to jointly invert moment-tensor parameters, centroid depth and unknown station-specific time-
shifts utilizing the modern functionalities in deep learning frameworks. A L 22 misfit function
between predicted synthetic and time-shifted observed seismograms is defined in the spectral
domain, which is differentiable to all unknowns. The inverse problem is solved by minimizing
the misfit function with a gradient descent algorithm. The method’s feasibility, robustness and
scalability are demonstrated using synthetic experiments and real earthquake data in the Long
Valley Caldera, California. This work presents an example of fresh opportunities to apply
advanced computational infrastructures developed in deep learning to geophysical problems.
Key words: joint inversion; Computational seismology; Theoretical seismology; Moment
tensor; Optimization.

However, in practice, the numerical predictions and observed


1 I N T RO D U C T I O N
waveforms are often not aligned (Figs 1a and b), and the mis-
A seismic moment tensor (MT) is a mathematical representation alignment could severely affect the MT solution estimate (Zhao &
of a seismic source under the point-source assumption in space Helmberger 1994; Zhu & Helmberger 1996). Let τ be an unknown
and time when wavelengths are several times longer than the fault station-specific time-shift to align the observation to the prediction,
geometry (Aki & Richards 2002). A full MT is a 3 × 3 matrix, but
the net torque is negligible for most underground earthquakes, and and eq. (2) can be rewritten as,
the matrix is symmetric with six unknown elements (e.g. Jost &
Herrmann 1989; Stein & Wysession 2003). In this study, we use the d (t − τ ) = m · G h (t ) . (4)
northeast–down coordinate system for an MT, denoted as,
From eq. (4), for clarity, we drop the source, ξ , and receiver, x,
m = [m k ; k = 1 . . . 6] = [m nn , m ee , m dd , m ne , m nd , m ed ] ∈ R 6 . (1) locations as they are assumingly known in this study, but the cen-
troid depth, h , is superscripted because it is considered an inversion
In MT inversion, a set of Green’s functions (GFs), or Earth’s
parameter.
structure responses, G k (ξ, x, t ), is computed in advance for six
The first contribution to the misalignment, τ , is due to the error
orthogonal bases of the MT space, in which, ξ and x are the source
in the event’s origin time. This leads to a baseline time-shift applied
and receiver locations. Then, a tensor (linear) multiplication predicts
to all stations’ waveforms to best match the synthetic prediction.
the observed waveforms,
 Secondly, the lack of complete knowledge of the Earth’s structures
d (t ) = m · G (ξ, x, t ) = mk Gk . (2) results in path-specific time-shifts (Zhao & Helmberger 1994; Zhu
k & Helmberger 1996; Zhu & Ben-Zion 2013; Silwal & Tape 2016).
nr × n c × 6 × n t Indeed, a 1-D velocity model is often used in regional surface wave
G∈ R is a forth-order tensor and d ∈ R nr ×n c × n t is a
inversion. If the model is faster or slower than the real Earth along
third-order tensor, where n r , n c and n t are number of receivers,
the path toward a receiver, the observed waveform needs to be
number of components in each receiver and number of samples,
shifted backward or forward to account for the structural error.
respectively. The MT solution, m ∗ , can be approximated by its
The presence of unknown station-specific time-shifts makes the
least-square estimate of the linear system,
MT inversion strongly non-linear, whose existing computational
 −1 methods are computationally expensive.
m∗ = d · G T · G · G T . (3)
Indeed, the cut-and-paste (CAP) algorithm (Zhao & Helm-
The superscript T denotes the matrix transposition. berger 1994; Zhu & Helmberger 1996) and its variation (Zhu &


C The Author(s) 2024. Published by Oxford University Press on behalf of The Royal Astronomical Society. This is an Open Access
article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
783
784 T.-S. Pha. m

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Figure 1. Motivating problem for joint inversion of moment tensor (MT) parameters, m , and unknown station-specific time-shifts, τ . (a) The waveforms are
Green’s functions corresponding to elementary MTs, Mi , i = 1, . . . , 6. (b) The typical mismatch between observed and predicted waveforms. The observed
waveform, dobs , needs to be shifted by an unknown time-shift, τ , to match the predicted waveform, dpred . (c) Schematic demonstration of a global minimum
in the joint parameter space of MT parameters and station-specific time-shifts, m and τ . Horizontal and vertical-coloured lines represent two widely used
approaches for MT inversion with unknown time-shifts. The vertical lines sketch the idea behind the cut-and-paste method (Zhao & Helmberger 1994; Zhu &
Helmberger 1996), while the horizontal lines sketch the idea behind the time domain MT method (Dreger et al. 2000). Here we consider full MT solutions, so
m ∈ R 6 , and one time-shift for each station, so τ ∈ R nr , where n r is the number of seismic stations.

Ben-Zion 2013) are widely used for regional full-waveform MT misfit evaluations (Hu et al. 2023; Thurin & Tape 2023) and suf-
inversion, where time-shifts are allowed to account for model er- fer a severe scalability problem when considering station-specific
rors. In this algorithm, a misfit value is computed for an imputed time-shifts. If n r stations are used simultaneously in the inversion
MT solution, assuming the optimal alignment between predicted and a time-shift is assumed for each station, the dimension of the
and observed waveforms can be determined by cross-correlation unknown space, 6 + n r , grows, while the MT parameters occupy
(depicted by vertical blue lines in Fig. 1c). Then, the best MT solu- just a small subspace.
tion is grid-searched as the smallest misfit over the entire parameter This study proposes a gradient-based method in the optimization
MT space, which often results in a costly computational burden. framework to tackle the non-linear inverse problem. Note that the
For example, Silwal & Tape (2016) and Alvizuri & Tape (2018) gradient-based iterative method dates back to the early develop-
recently implemented this approach for full MT inversion using a ment of the global centroid MT inversion (Dziewonski et al. 1981),
uniform parametrization of MT space (Tape & Tape 2015). The grid which searches for nine unknowns, including the deviatoric MT
search was implemented on a cluster due to the computational scale solution, centroid time and hypocentral coordinates. Recent stud-
(Alvizuri & Tape 2018; Thurin et al. 2022) given that 20 million ies have embraced comprehensive but computationally demanding
trials were evaluated for each earthquake. treatments of the parameters, and more including source time func-
Alternatively, an MT solution can be solved efficiently using tion parametrizations, with uncertainty estimates for earthquake
least-square estimation given an imputed set of station-specific sources at local (Kühn et al. 2020), regional (Vasyura-Bathke et al.
time-shifts (depicted by horizontal orange lines in Fig. 1c). After 2020; Petersen et al. 2021) and global scales (Stähler & Sigloch
marching the grid of all possible combinations of station-specific 2014). On the other hand, there is an emerging class of deep-
time-shifts, the optimal MT solution corresponds to the minimum learning-based methods (Steinberg et al. 2021; Nooshiri et al. 2022)
misfit (e.g. Dreger et al. 2000). More recently, Hejrani et al. (2017) for determining the MT and hypocentral parameters with their as-
assumed a single time-shift for all stations to account for possible sociated uncertainties. The approach is computationally demanding
errors in the catalogued centroid time. The grid search for station- in the training phase, often on synthetic training datasets, but fast
specific time-shifts is also costly due to the exponentially larger in evaluating results in the production phase. This paper contributes
number of combinations to explore, especially when the analysis to the wide MT inversion methodological spectrum by providing
involves a larger number (i.e. more than 10) of stations. a new method that carefully treats station-specific time-shifts as a
The ensemble sampling method in the Bayesian framework in proxy to incorporate 3-D structural heterogeneities in the inversion
the joint space of MT parameters and unknown time-shifts (Viltres at a reasonable computational cost.
et al. 2021; Hu et al. 2023) is a relatively independent approach The remainder of this paper is structured as follows. First, we rep-
of the above-mentioned approaches. This approach has recently resent the misfit function between the prediction and time-shifted
become possible thanks to the advance of powerful probabilistic observation in the spectral domain, where the misfit is differen-
sampling methods (Roberts & Rosenthal 2001; Del Moral et al. tiable to all dependent parameters, including MT parameters, cen-
2006; Goodman & Weare 2010). It narrows the targeted sampling troid depth and station-specific time-shifts. Consequently, the misfit
near the optimal solution, which provides a useful estimate of the function can be minimized using a gradient descent algorithm to
parameters’ uncertainty. Most recently, Thurin & Tape (2023) used obtain an optimal solution. The algorithm is implemented in Tensor-
a gradient-free search, that is the covariance matrix adaptation– Flow, a popular deep-learning framework, to utilize its advanced au-
evolution strategy (CMA-ES), for an optimal fit of possible source todifferentiation functionality and computational performance. The
mechanism’s parameters, which are three for a single force or method’s feasibility, robustness and scalability are benchmarked in
six for a full MT, and a time-shift accounting for origin time er- carefully designed synthetic experiments and real data events in the
ror. However, the gradient-free approaches also require millions of Long Valley Caldera region.
Gradient-based moment-tensor inversion 785

2.3 Differentiable misfit function and optimized inversion


Without losing the generality, we consider the inverse problem with
an unknown time-shift, τ , for a single observed seismogram. An
L 22 misfit function between predicted and shifted (or translated)
observed waveforms is defined as,
 2
J (m, τ ) = ∫ m · G h (t ) − d (t − τ ) dt. (6)
t

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
The integration over time is a notional representation of the sum-
mation over discrete digital waveform samples.
A time-shifted function in the time domain is a multiplication in
the frequency domain, f , via a Fourier transform,

F {d (t − τ )} ( f ) = F {d (t )} ( f ) exp (−iωτ ) , (7)

where ω = 2π f . The Fourier transform of the integral kernel in


eq. (6) becomes,
Figure 2. Computational scalability of the inversion method. Blue and or-
ange lines show the computing times after 1000 epochs on a Central Process-
   
F m · G h (t ) − d (t − τ ) ( f ) = F m · G h (t ) ( f )
ing Unit (CPU) and a Graphical Processing Unit (GPU) of the Australian
National Computing Infrastructure’s Gadi cluster as functions of the number −F {d (t )} ( f ) exp (−iωτ ) . (8)
of stations involved in the inversion. The error bars show the 1σ runtime h :− F {G h (t)} ( f ), dˆ :−
For convenience, we denote G
variation in 10 independent repetitions.
F {d(t)} ( f ). Because the L 22 norms in spectral and temporal do-
2 METHODOLOGY mains equal (i.e. Parseval’s theorem–Gradshteyn et al. 2000), the
misfit function, eq. (6), can be rewritten in the spectral domain as
2.1 Data preparation and pre-processing  2
 h − dˆ exp (−iωτ ) d f.
J (m, τ, h ) = ∫ m · G (9)
Seismic waveforms and station metadata were downloaded from the f
North California Earthquake Data Center (NCEDC). The retrieved
In eq. (9), | · | is the absolute value of the complex spectra, so
waveforms were corrected for instrumental responses for velocity
the cost function, J (m, τ, h ), is continuous and differentiable to
seismograms, then filtered between 20 and 50 s, with an acausal, 4-
all dependent variables, m , τ and h , indeed,
corner Butterworth bandpass filter and down-sampled to 1 sample
per second. We visually inspected the pre-processed waveforms ∂J h r · D h i · D
= 2 ∫ G ˆr + G ˆi d f, (10)
and eliminated those having glitches or exhibiting anomalously ∂m f
high noise levels. They were then arranged into a 3-D tensor in ∂J
= 2ω ∫ D
ˆ r · dˆ i cos ωτ − dˆ r sin ωτ + D
ˆ i · dˆ i cos ωτ + dˆ r sin ωτ d f,
R nr × n c × n t , where n r , n c = 3, and n t are the numbers of receivers, ∂τ f

channels and time samples. In this study, all waveforms are 250 s (11)
long (i.e. n t = 250) starting from the origin time reported in the 
h h h
NCEDC catalogue. ∂J G r k+1 − G rh k G k+1 − G i k
= ∫2 · m · Dr + i · m · Di d f,
∂h f h k+1 − h k h k+1 − h k

2.2 Calculation of GFs (12)

Synthetic velocity seismograms of an impulse source corresponding where


to six orthogonal MT bases were generated using the frequency- h r − dˆ r cos ωτ − dˆ i sin ωτ
wave number method in the Computer Program in Seismology Dr :− m·G
(13)
package (Herrmann 2013). We used the 1-D South California model Di :− h i − dˆ i cos ωτ + dˆ r sin ωτ .
m·G
(Dreger & Helmberger 1990), which is later referred to as the SoCal
model, like previous studies of earthquakes in the region (Dreger D denotes the difference of the spectra of the observed and predicted
et al. 2000; Minson & Dreger 2008). The calculated seismograms waveforms in the integral kernel in eq. (9). Subscripts i, r denote the
are filtered and resampled in the same way as the observed data and real and imaginary components of the Fourier transforms of Green’s
arranged into a 4-D tensor G ∈ R n s × n c × n e × n t and n e = 6 is the tensor and observed data. With the differentiable misfit function, a
number of MT orthogonal bases. gradient descent algorithm can derive the optimum corresponding
To rapidly generate GF at an arbitrary depth, h , we first computed to the inverse solution in the joint space of MT parameters, centroid
an array of GFs over regular discrete depths, for example at every depth and time-shift.
2 km. The GF at an arbitrary depth, h , is linearly interpolated from In this work, we assume a single time-shift for all three com-
the pre-computed GFs at adjacent depth grids, h k and h k+1 , ponents of one station to demonstrate the method’s feasibility. By
assuming so, we ignore the Earth’s structure anisotropy, which could
h k+1 − h h k h − hk
G h (t ) = G (t ) + G h k+1 (t ) . (5) result in different time-shifts in surface wave arrivals in radial and
h k+1 − h k h k+1 − h k tangential seismograms. The time-shifts are a proxy for isotropic
As demonstrated later, the linear interpolation is sufficient to heterogeneity in 3-D Earth structures when approximated by 1-D
evaluate GF as a continuous function of the depth. A more sophis- velocity models (Zhao & Helmberger 1994; Zhu & Helmberger
ticated interpolation scheme has been used to interpolate GFs as a 1996; Zhu & Ben-Zion 2013). Thus, our inversion has n r time-
function of the 3-D epicentral coordinates (Simutė et al. 2023). shifts, earthquake depth and 6 MT parameters as unknowns.
786 T.-S. Pha. m

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Figure 3. Earth models are used in synthetic experiments. The back lines show the South California (SoCal) model (Dreger & Helmberger 1990) and the
coloured lines plot perturbed models with κ = ±1 per cent, ±3 per cent and ±5 per cent from the original P- and S-wave speeds in all layers.

2.4 Algorithm implementation and uncertainty estimation not affect the compute time. This starkly contrasts with existing
methods, as discussed in the Introduction, whose efficacy depends
Our implementation utilizes the auto differentiation functional-
strongly on the input sizes.
ity available on TensorFlow (www.tensorflow.org), a popular deep
To estimate the uncertainties of the recovered parameters, we
learning framework. The partial derivatives, eqs (10)–(12), can
repeated the inversion in 96 independent runs with different initial
be automatically calculated by backpropagating the loss func-
solutions, thanks to the inversion’s efficiency. To minimize the risk
tion defined in eq. (9). The deep learning framework also offers
of local minima solutions, we select 50 per cent solutions corre-
well-tested, high-performance gradient descent algorithms, such as
sponding to the smallest losses, which almost always converge to
Adam (Kingma & Ba 2014). In the present implementation, we
the global minimum. We use their mean and one standard deviation
use the Adam optimizer as a default option due to its empirical
to appraise the final estimates of the MT solutions, epicentral depth,
performance. As demonstrated later, the implementation enables
station-specific time-shifts and their uncertainties. On a practical
acceleration on a graphic processing unit (GPU), resulting in excel-
note, despite the better runtime of the GPU version (Fig. 2), we
lent scalability to data size with a negligible increase in computing
prefer to execute the CPU inversion in embarrassing parallel on 48
time. As we utilize the core functions of a modern deep learning
cores of a single node on the Gadi cluster, which results in an overall
framework, the inversion algorithm can also be implemented in
better runtime.
other frameworks with similar functionalities, such as PyTorch, as
the developers prefer.
Because the event’s magnitude is on a logarithmic scale but gen-
erally unknown, the amplitudes of observed seismograms and pre- 3 SYNTHETIC EXPERIMENTS
computed GFs can be in largely different dynamic ranges. This is This section demonstrates the method in a series of controlled
an inherent challenge for iterative algorithms because the initial numerical experiments. The experiments utilize velocity models
solution must be relatively close to the optimum to guarantee the slightly perturbed from the SoCal model (Dreger & Helmberger
algorithm’s convergence. Here, we normalize the observation and 1990) to generate synthetic data, while the original model is used
Green’s tensor with their absolute medians to reduce the logarithmic in the inversion to recover the MT solutions. Fig. 3 shows an array
magnitude scaling difference. The actual magnitude of the MT so- of perturbed earth models, where P- and S-wave speeds in all layers
lution will be restored afterward using the ratio between the median are increased or decreased by κ = ±1 per cent, ±3 per cent and ±5
values. When waveform amplitudes are normalized in this way, we per cent from their original values. The source-station configura-
empirically found a learning rate of 0.2, or similar values in the tion is similar to the Mw 4.9, 1997/11/22 17:20:35 event, featured
same magnitude order, performs well for the examples presented in later in Section 4.1. The input MT, fixed at 10 km depth, is highly
this paper. The learning rate defines the relative step length when non-DC, consisting of 30 per cent negative isotropic, 35 per cent
descending along the gradient direction. double couple and 35 per cent CLVD components. Synthetic data
Another important aspect is the proposed method’s excellent are filtered in 20–50 s and then added with white noise with a stan-
scalability to tens of seismic stations. Fig. 2 features the compute dard deviation equal to 10 per cent of the original signals’ standard
time of the algorithm on a single central processing unit (CPU) and a deviation.
single graphical processing unit (GPU) on the Australian National Fig. 4 presents the exemplary inversion results when the earth
Computational Infrastructure’s Gadi cluster for test runs with an model is perturbed by κ = −3 per cent. Results for the correct
increasing number of seismic stations, n r . To fairly compare the model and other perturbation values are shown in Figs S1–S6.
running time, we terminated all inversions at 1000 epochs without Given that the SoCal model is inaccurate in this example, recovered
considering the actual convergence of the MT solutions. When MT solutions and source depths are sufficiently close, though not
running on a CPU, the compute time increases less than three times exact, to the truth values, which confirms the method’s capacity.
given that the input size was increased 12 times. This indicates Because 1σ errors estimated from 50 per cent best solutions for all
the CPU code’s strong sublinear scalability. The run time is even parameters, the ISO/DC/CLVD decomposition components, event
faster on a GPU, yet, more significantly, the input size increase does depth and station-specific time-shifts are small, the method can be
Gradient-based moment-tensor inversion 787

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Figure 4. Summary of inversion results for a synthetic experiment using data from an incorrect earth model, that is κ = −3 per cent. The centroid depth is fixed
at 10 km. (a) Source-station location map. Black and red beach balls show the deviatoric moment tensors (MT) of the actual input and the recovered solution.
(b) The input (black plus) and recovered MTs (red cross) are marked on a lune diagram (Tape & Tape 2012). (C) Black lines show the evolution of descending
losses (eq. 9) as a function of epochs for 48 solutions having the smallest losses out of the 96 independent runs. Grey lines plot the corresponding centroid-depth
evolutions. The text denotes the forward earth model, model error magnitude, recovered event depth, moment magnitude and standard decomposition of
isotropic (ISO), double couple (DC) and compensated linear vector dipole (CLVD) percentages. (d) Synthetic data and 48 best-predicted waveforms are shown
in black and red, respectively. The right panel shows the output station-specific time-shifts and estimated uncertainties.

considered highly precise. However, the method’s precision should Lastly, Fig. 5(c) shows the remarkable correlation between the
be distinguished by its accuracy, representing the recovered solu- trend of station-specific time-shifts recovered by the inversion and
tion’s closeness to the true input. The inversion accuracy is subjected the imposed earth model errors. When model errors are negative,
to the forward problem’s quality, subject to the imposed model er- the original model, SoCal, is faster than the synthetic Earth struc-
rors, so the small inversion inaccuracy is unavoidable. As summa- tures (refer to Fig. 3). Hence, surface wave trains predicted by
rized in Figs 5(a) and (b), it is encouraging our proposed method, the SoCal model arrive earlier than those in the synthesized data.
accounting for structural errors by station-specific time-shifts, ef- Consequently, negative time-shifts are required to shift the syn-
fectively recovers the input MT at reasonable accuracy. thetic observation backward to match the forward prediction. In
788 T.-S. Pha. m

(a) (b) (c)

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Figure 5. Station-specific time-shifts recovered as functions of Earth model errors in synthetic experiments. The colour codes in the legend of panel (b) denote
the results in all panels. (a) Source-station location map used in the synthetic experiments. The earthquake is fixed at 10 km depth. The black beach ball plots
the deviatoric mechanism of the input MT, and the coloured beachballs plot the recovered MTs inverting synthetic data from perturbed earth models. (b) The
input and recovered MT solutions are plotted on a lune diagram (Tape & Tape 2012). (c) This panel plots the trends of recovered station-specific time-shifts
sorted with increasing epicentral distances.

general, the shift amounts needed to increase with epicentral dis- each subset size. In each subset realization, a seismic station was
tances, so a negative model error corresponds to a decreasing trend selected randomly without repetition until the size was reached.
of recovered time-shifts. On the other hand, a positive model error The probability of picking a station is inversely proportional to its
corresponds to increasing time-shifts relative to distances. This ob- distance to the earthquake source; in other words, a station closer to
servation suggests one can compare the earth model relative to the the source is more likely to be picked. As a result, seven solutions
actual structures if simplistically assuming an azimuthally homoge- correspond to crosses in each colour in all panels of Fig. 7.
nous structure in the region. The source types of 28 recovered MT solutions with the station
subsets are distributed closely around the reference solution, re-
covered with all 13 stations (Fig. 7a). Besides the non-DC source
4 R E S U LT S types, the DC parts of the full MT solutions can also be constrained
reasonably well with the station subsets (Fig. 7b). Indeed, the Ka-
4.1 Mw 4.9, 1997/11/22 17:20:35, Long Valley Caldera gan angles, being the smallest angle to match the subset solutions’
volcanic event DC to the reference’s DC parts (Kagan 1991), are always smaller
than 10◦ . Furthermore, the angles generally decrease when more
We use data from a previously studied, non-double-couple event in stations are included in the inversions because more stations yield
the Long Valley Caldera (Mw 4.9, 1997/11/22 17:20:35) to further better azimuthal coverage, thus better constraining the DC com-
demonstrate the feasibility and robustness of the proposed inversion ponents. Lastly, Fig. 7(c) features the consistency of the recovered
method. This event, often referred to as LV2, was in a sequence of time-shifts among the station subsets, which is discussed in the
large events during the volcanic unrest from 1997 to 1999 (Dreger following section.
et al. 2000; Minson & Dreger 2008; Pha. m & Tkalčić 2021). Broad- The presence of the significant CLVD component in the MT so-
band seismic data were retrieved from 13 broad-band seismic sta- lution of this event (Figs 6b and 7a) confirms the findings by Pha. m
tions of the Berkeley Digital Seismograph Network (Fig. 6a) and & Tkalčić (2021). It is worth noting that both studies considered the
filtered between 20 and 50 s. As noted above, an array of GFs is effects of Earth’s structural error using different approaches. The
computed for every 2 km down to 50 km depth for the same source- previous study considered the influence of the 1-D velocity model’s
station configuration, from which GF at an arbitrary depth can be error by simulating synthetic waveforms and their covariance ma-
linearly interpolated. trices. This study, however, utilizes station-specific time-shifts as a
Fig. 6 summarizes the inversion result from real data for this proxy to tolerate earth model error, which largely results in wave-
event. Like the synthetic data examples, the real data inversions are form misalignment (Hu et al. 2023). The agreement of a significant
also repeated 96 times independently, and only half of the solutions CLVD component for this event, which was overlooked in the past
with the smallest losses are selected to compute the mean solution (Pha. m & Tkalčić 2021), revealed by two different approaches, cor-
and 1σ uncertainties. The recovered MT solution has a dominant roborates the importance of properly treating the earth model’s error
double couple (DC) 57 ± 1 per cent, large isotropic (ISO) 33 ± 1 in MT inversion methods.
per cent and compensated linear dipole (CLVD) 9 ± 1 per cent
components (Fig. 6b). This event’s estimated source depth is highly
confident at 10.1 ± 0.1 km, which is deeper than 5.1 km reported in
4.2 Robustness of recovered station-specific time-shifts
the NCEDC catalogue. Predicted waveforms corresponding to the
selected solutions exhibit good fits with observed waveforms after Given that the previous section features the robustness of resolving
time-shifted (Fig. 6d). MT parameters, this section demonstrates the robustness of the
To further examine the robustness of the recovered solution, we recovered time-shifts. First, we note that the time-shifts for multiple
repeat the complete inversion procedure for multiple subsets of 5, station subsets for event Mw 4.9, 1997/11/22 17:20:35, presented in
7, 9 and 11 stations drawn from the original set of 13. The complete the previous section, are highly consistent except for stations MIN
inversion procedure was repeated for seven different realizations of and WDC. The large variations in time-shifts at the two stations
Gradient-based moment-tensor inversion 789

(b) (c)
(a)

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
(d)

Figure 6. Summary of inversion results for real data computed from 1997/11/22 17:20:35 event catalogued in the North California Earthquake Data Center
(NCEDC). The following colour codes are used for solutions of previous work: orange for MD2008 (Minson & Dreger 2008) and blue for PT2021 (Pha. m &
Tkalčić 2021). Other details are the same as in Fig. 4.

are likely due to their small waveform amplitudes (in comparison related to volcanic activities. The only event in the Nevada site is
with station HOPS of a similar epicentral distance), which results DC-dominant, likely due to its tectonic nature.
in a low signal-to-noise ratio. The same reason also causes the large Despite the diversity in source types, the recovered station-
uncertainties in these stations’ time-shifts inverted for all 13 stations specific time-shifts show a common decreasing trend relative to
(Fig. 6d). epicentral distances among all six events (Fig. 8c). Similar to
Furthermore, we retrieve data from six events in Long Valley event 1997/11/22 17:20:35 noted above (Fig. 6d), time-shifts re-
Caldera’s 1997–1999 unrest sequence (Fig. 8a). A set of 13 stations covered for stations MIN and WDC of event 1999/05/15 13:22:10
recording all six events is used for the inversion to compare relative (Fig. S11D) are also largely uncertain due their characteristic wave-
variation in time-shifts (Fig. 8c). The selected events show a wide form amplitudes are much lower compared to adjacent stations.
variety of source types (Fig. 8b). Five events in the caldera’s closest We note that the absolute station-specific time-shifts are subjected
vicinity exhibit significant non-DC components, which are likely to uncertainties due to possible errors in the origin times, but the
790 T.-S. Pha. m

(a)
(b) (c)

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
Figure 7. Solutions for event 1997/11/22 17:20:35 recovered using 28 random station subsets. The subset size colour codes are shown in the legend in panel
(a). (a) On a lune diagram (Tape & Tape 2012), colour crosses plot the recovered moment tensors (MT) from all subset realizations, and the black plus mark
the reference solution recovered with all 13 stations (shown in Fig. 6). (b) The crosses plot the Kagan (1991) angles, the smallest angle to match the double
couple (DC) parts of two MTs, between the subset realizations’ and the reference MT. (C) The colour crosses and black plus plot station-specific time-shifts
recovered in the inversions of all station subsets.

(a) (b) (c)

Figure 8. Robust trend in recovered station-specific time-shifts for six earthquake in the Long Valley Caldera region. The event colour codes in the legend of
panel (b) apply to all figure panels. (a) Location map of six events analysed in this experiment. The beach balls plot this study’s MT solutions. (b) Crosses plot
the solutions on a lune diagram (Tape & Tape 2012). (c) Thin colour lines plot the time-shift variation of 48 selected inversion runs (out of 96 trials), and the
crosses mark the trend of their median values.

decreasing trend is consistent. Assuming that the crustal and upper- where autodifferentiation functionality is supported, for example
most mantle structures are generally homogenous, the decreasing TensorFlow or PyTorch, could be a promising way to incorporate
trend could suggest that the SoCal model is slightly faster than the the source location into the present inversion framework. However,
local Earth structure (see Fig. 5c). the computation cost is expected to increase when more spatial
variables are introduced into the inversion.
It is worth noting that the pointwise L 22 misfit function (eqs 6
5 DISCUSSION and 9) assumes uncorrelated data noise in the observation. Re-
In summary, we have presented the details of an efficient gradient- cent studies demonstrate the benefit of considering correlated noise
based method and demonstrated its feasibility, robustness and scal- characterized by symmetric covariance matrices, which can be es-
ability to jointly invert MT parameters, centroid depth and station- timated either from ambient noise (Duputel et al. 2012; Mustać &
specific time-shifts through numerical experiments and real data Tkalčić 2016; Vackář et al. 2017) or involve the prior assumptions
examples. In the following, we discuss some inherent limitations of Earth’s structural error (Pha. m & Tkalčić 2021; Vasyura-Bathke
of the proposed algorithm, our measures to mitigate the limitations et al. 2021). Incorporating the covariance matrices into this study’s
and thoughts on the way forward. proposed algorithm is practical, but its robustness is subject to fur-
In the current implementation, we use linear interpolation (eq. 5) ther investigation. Similarly, it is feasible, but yet to be verified, to
of GFs pre-computed on regular depth grids to define a continuous incorporate an unknown source–time function (STF), representing
and differentiable loss function with centroid depth (eq. 9). The the history of realizing energy rather than an impulse source in time,
interpolation approach can be extended to consider unknown 3-D when the STF is represented as a weighted linear combination of
hypocentral coordinates without significantly affecting the design several bases (Stähler & Sigloch 2014).
of the inversion algorithm. Translating the existing forward meth- Because the proposed joint inversion is cast in the optimiza-
ods (e.g. Herrmann 2013) into a modern programming framework tion framework, it does not benefit from non-linear parametrization
Gradient-based moment-tensor inversion 791

methods featuring a homogeneous distribution of MTs on the pa- Computational Infrastructure (NCI Australia), an NCRIS enabled
rameter space (Stähler & Sigloch 2014; Tape & Tape 2015). Instead, capacity supported by the Australian Government. The author
we chose the primitive MT parametrization (eq. 1) in this work so thanks Editor Carl Tape, Jiřı́ Vackář and an anonymous reviewer
the derivatives to MT parameters can be calculated rapidly (eq. 10). for constructive review comments, which significantly improve the
For the same reason, we expect the implementation’s performance quality of the this paper.
to be similar when using other linear MT parametrization methods
(e.g. Kikuchi & Kanamori 1982; Kawakatsu 1996), which are con-
venient when full MT is not required, for example, deviatoric MT AU T H O R C O N T R I B U T I O N S

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
(e.g. Hejrani et al. 2017; Hejrani & Tkalčić 2018). Conceptualization, data curation, formal analysis, funding acquisi-
The use of the L 22 -misfit function between predicted and ob- tion, investigation, methodology, project administration, resources,
served waveforms could pose a concern regarding the presence of software, validation, visualization, writing – original draft and writ-
local minima, most likely due to the waveform cycle-skipping (Sam- ing – review and editing were all done by author Thanh-Son Pha. m.
bridge et al. 2022). Thanks to the proposed inversion’s efficiency,
this study selects solutions converged from diverse initializations to
minimize the risk of a local minimum solution. Using novel wave- S U P P O RT I N G I N F O R M AT I O N
form similarity measurements, which are not subjected to local
Supplementary data are available at GJ I online.
minima, such as the Warsterstein distance (Sambridge et al. 2022),
could be an interesting future research topic for this application. Figure S1. Summary of inversion results for a numerical experiment
The main limitation of the presented method in an optimization using data computed by a correct earth model, that is k = 0 per cent.
framework is the lack of sufficient uncertainty estimates concern- The centroid depth is fixed at 10 km. (a) Location map of the event
ing inverted solutions because it only outputs the optimal solu- and monitoring stations. The black and red beach balls show the
tion. This is of critical importance when applying to highly non- deviatoric moment tensors (MT) of the actual input and the recov-
unique settings such as critically shallow earthquakes or explosions ered solution. (b) The input MT (black plus) and the recovered MT
(Kawakatsu 1996; Ford et al. 2010; Alvizuri & Tape 2018; Hejrani (red cross) are denoted on a lune diagram of source types (Tape &
& Tkalčić 2020; Hu et al. 2023). To circumvent this limitation, Tape 2012). (c) Black lines show the evolution of descending losses
we appraise the solution and its uncertainty from a subset of best- (eq. 9 in the main text) as a function of epochs for 48 solutions
fitting solutions obtained by repeating the inversions multiple times. that produced the smallest losses out of the 96 independent runs.
A holistic approach to obtain the complete posterior distribution The grey lines show the corresponding evolution of the event-depth
of the inverted parameters is to investigate effective gradient-based parameter. The text denotes the forward earth model, model error
Hamiltonian Monte Carlo sampler (Fichtner & Simutė 2018), thanks magnitude, recovered event depth, moment magnitude and standard
to the availability of the loss function’s derivatives. decomposition of isotropic (ISO), double couple (DC) and compen-
sated linear vector dipole (CLVD) components. (d) Synthetic data
and 48 best-predicted waveforms are shown in black and red, re-
6 C O N C LU S I O N spectively. The right-hand panel shows the output station-specific
Gme shifts and estimated uncertainties.
In conclusion, we present a gradient-based inversion method in
Figure S2. Summary of inversion results for synthetic data com-
an optimization framework for a classical problem in regional MT
puted with k = −1 per cent model error. Other details are the same
inversion, the joint inversion of MT parameters, centroid depth and
as in Fig. S1.
station-specific time-shifts. The station-specific time-shifts can be
Figure S3. Summary of inversion results for synthetic data com-
considered a simplified proxy for 3-D Earth structure errors with
puted with k = 1 per cent model error. Other details are the same as
incomplete GFs in MT inversion problems. To incorporate centroid
in Fig. S1.
depth in the inversion, we interpolate GFs at an arbitrary depth
Figure S4. Summary of inversion results for synthetic data com-
from pre-computed GFs at regular depth grids. In doing so, the L 22
puted with k = 3 per cent model error. Other details are the same as
misfit function between the prediction and shifted observation is
in Fig. S1.
cast in the frequency domain thanks to Parseval’s theorem. Tests on
Figure S5. Summary of inversion results for synthetic data com-
controlled synthetic experiments and data from pilot events from the
puted with k = −5 per cent model error. Other details are the same
Long Valley Caldera region demonstrated the proposed method’s
as in Fig. S1.
feasibility, robustness and scalability. This paper hopes to highlight
Figure S6. Summary of inversion results for synthetic data com-
a fresh opportunity to benefit from the computational infrastructures
puted with k = 5 per cent model error. Other details are the same as
thanks to the rapidly growing artificial intelligence communities for
in Fig. S1.
geophysical problems.
Figure S7. Summary of inversion results for real data computed
from 1997-11-02 08:51:52 event catalogued in the North Cali-
fornia Earthquake Data Center. Other details are the same as in
AC K N OW L E D G M E N T S
Fig. S1.
This work greatly benefited from discussion with Jinyin Hu and Figure S8. Summary of inversion results for real data computed
Hrvoje Tkalčić through ongoing research on MT inversion. The Air from 1997-11-22 12:06:55 event catalogued in the North Cali-
Force Research Laboratory’s grant, contract number FA9453-20- fornia Earthquake Data Center. Other details are the same as in
C-0072, supported the author’s post doc at The Australian Na- Fig. S1.
tional University. He also acknowledges financial support from Figure S9. Summary of inversion results for real data computed
the Australian Research Council through a Discovery Early Ca- from 1997-11-22 18:10:59 event catalogued in the North Cali-
reer Researcher Award, project DE230100025. This research was fornia Earthquake Data Center. Other details are the same as in
undertaken with the assistance of resources from the National Fig. S1.
792 T.-S. Pha. m

Figure S10. Summary of inversion results for real data computed Jost, M.L. & Herrmann, R.B., 1989. A student’s guide to and review of
from 1997-11-30 21:17:05 event catalogued in the North California moment tensors, Seismol. Res. Lett., 60, 37–57.
Earthquake Data Center. Other details are the same as in Fig. S1. Kagan, Y.Y., 1991. 3-D rotation of double-couple earthquake sources,
Figure S11. Summary of inversion results for real data computed Geophys. J. Int., 106, 709–716.
Kawakatsu, H., 1996. Observability of the isotropic component of a moment
from 1999-05-15 13:22:10 event catalogued in the North California
tensor, Geophys. J. Int., 126, 525–544.
Earthquake Data Center. Other details are the same as in Fig. S1.
Kikuchi, M. & Kanamori, H., 1982. Inversion of complex body waves, Bull.
Please note: Oxford University Press is not responsible for the con- seism. Soc. Am., 72, 491–506.
Kingma, D.P. & Ba, J., 2014. Adam: a method for stochastic optimization,

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024
tent or functionality of any supporting materials supplied by the
arXiv E-Prints. https://doi.org/10.48550/arXiv.1412.6980.
authors. Any queries (other than missing material) should be di-
Kühn, D., Heimann, S., Isken, M.P., Ruigrok, E. & Dost, B., 2020. Prob-
rected to the corresponding author for the paper.
abilistic moment tensor inversion for hydrocarbon-induced seismicity in
the Groningen Gas Field, the Netherlands, part 1: testing, Bull. seism. Soc.
Am., 110, 2095–2111.
D ATA A N D C O D E AVA I L A B I L I T Y Minson, S.E. & Dreger, D.S., 2008. Stable inversions for complete moment
tensors, Geophys. J. Int., 174, 585–592.
Data for this study come from the Berkeley Digital Seismic Network Mustać, M. & Tkalčić, H., 2016. Point source moment tensor inver-
(BDSN), doi:10.7932/BDSN, operated by the UC Berkeley Seis- sion through a Bayesian hierarchical model, Geophys. J. Int., 204,
mological Laboratory, which is archived at the Northern California 311–323.
Earthquake Data Center (NCEDC), doi:10.7932/NCEDC. Source Nooshiri, N., Bean, C.J., Dahm, T., Grigoli, F., Kristjánsdóttir, S., Obermann,
files to reproduce the data analysis and figures presented in this A. & Wiemer, S., 2022. A multibranch, multitarget neural network for
paper is available at: https://github.com/tsonpham/JointMTS.git. rapid point-source inversion in a microseismic environment: examples
from the Hengill Geothermal Field, Iceland, Geophys. J. Int., 229, 999–
1016.
Petersen, G.M., Cesca, S., Heimann, S., Niemz, P., Dahm, T., Kühn, D.,
REFERENCES Kummerow, J. & Plenefisch, T.& the AlpArray and AlpArray-Swath-D
Aki, K. & Richards, P.G., 2002. Quantitative Seismology, 2nd edn, Univer- working groups, 2021. Regional centroid moment tensor inversion of
sity Science Books. small to moderate earthquakes in the Alps using the dense AlpArray
Alvizuri, C. & Tape, C., 2018. Full moment tensor analysis of nuclear seismic network: challenges and seismotectonic insights, Solid Earth, 12,
explosions in North Korea, Seismol. Res. Lett., 89, 2139–2151. 1233–1257.
Del Moral, P., Doucet, A. & Jasra, A., 2006. Sequential Monte Carlo sam- Pha. m, T.-S. & Tkalčić, H., 2021. Toward improving point-source moment-
plers, J. R. Stat. Soc., B: Stat. Methodol., 68, 411–436. tensor inference by incorporating 1D earth model’s uncertainty: impli-
Dreger, D.S. & Helmberger, D.V., 1990. Broadband modelling of local earth- cations for the Long Valley caldera earthquakes, J. geophys. Res., 126,
quakes, Bull. seism. Soc. Am., 80, 1162–1179. 2021JB022477,.
Dreger, D.S., Tkalčić, H. & Johnston, M., 2000. Dilational processes accom- Roberts, G.O. & Rosenthal, J.S., 2001. Optimal scaling for various
panying earthquakes in the Long Valley Caldera, Science, 288, 122–125. metropolis-Hastings algorithms, Stat. Sci., 16, 351–367.
Duputel, Z., Rivera, L., Fukahata, Y. & Kanamori, H., 2012. Uncertainty Sambridge, M., Jackson, A. & Valentine, A.P., 2022. Geophysical inversion
estimations for seismic source inversions, Geophys. J. Int., 190, 1243– and optimal transport, Geophys. J. Int., 231, doi:10.1093/gji/ggac151.
1256. Silwal, V. & Tape, C., 2016. Seismic moment tensors and estimated uncer-
Dziewonski, A.M., Chou, T.-A. & Woodhouse, J.H., 1981. Determination of tainties in southern Alaska, J. geophys. Res., 121, 2772–2797.
earthquake source parameters from waveform data for studies of global Simutė, S., Boehm, C., Krischer, L., Gokhberg, A., Vallée, M. & Ficht-
and regional seismicity, J. geophys. Res., 86, 2825–2852. ner, A., 2023. Bayesian seismic source inversion with a 3-D earth
Fichtner, A. & Simutė, S., 2018. Hamiltonian Monte Carlo inversion of model of the Japanese islands, J. geophys. Res., 128, e2022JB024231,
seismic sources in complex Media, J. geophys. Res., 123, 2984–2999. doi:10.1029/2022JB024231.
Ford, S.R., Dreger, D.S. & Walter, W.R., 2010. Network sensitivity solutions Stähler, S.C. & Sigloch, K., 2014. Fully probabilistic seismic
for regional moment-tensor InversionsNetwork sensitivity solutions for source inversion—part 1: efficient parameterisation, Solid Earth, 5,
regional moment-tensor inversions, Bull. seism. Soc. Am., 100, 1962– 1055–1069.
1970. Stein, S. & Wysession, M., 2003. An Introduction to Seismology, Earth-
Goodman, J. & Weare, J., 2010. Ensemble samplers with affine invariance, quakes, and Earth Structure, 1 edn, Blackwell Publishing.
Commun. Appl. Math. Comput. Sci., 5, 65–80. Steinberg, A., Vasyura-Bathke, H., Gaebler, P., Ohrnberger, M. & Cer-
Gradshteyn, I.S., Ryzhik, I.M., Jeffrey, A. & Zwillinger, D., 2000. Table of anna, L., 2021. Estimation of seismic moment tensors using varia-
Integrals, Series, and Products, 6th edn, Academic Press. tional inference machine learning, J. geophys. Res., 126, e2021JB022685,
Hejrani, B. & Tkalčić, H., 2018. The 20 May 2016 Petermann Ranges doi:10.1029/2021JB022685.
earthquake: centroid location, magnitude and focal mechanism from full Tape, W. & Tape, C., 2012. A geometric setting for moment tensors,
waveform modelling, Aust. J. Earth Sci., 66(1), 37–45. Geophys. J. Int., 190, 476–498.
Hejrani, B. & Tkalčić, H., 2020. Resolvability of the centroid-moment- Tape, W. & Tape, C., 2015. A uniform parametrization of moment tensors,
tensors for shallow seismic sources and improvements from model- Geophys. J. Int., 202, 2074–2081.
ing high-frequency waveforms, J. geophys. Res., 125, e2020JB019643, Thurin, J. & Tape, C., 2023. Comparison of force and moment tensor esti-
doi:10.1029/2020JB019643. mations of subevents during the 2022 Hunga–Tonga submarine volcanic
Hejrani, B., Tkalčić, H. & Fichtner, A., 2017. Centroid moment tensor eruption, Geophys. J. Int., 235, 1959–1981.
catalogue using a 3-D continental scale earth model: application to earth- Thurin, J., Tape, C. & Modrak, R., 2022. Multi-event explosive seismic
quakes in Papua New Guinea and the Solomon Islands, J. geophys. Res., source for the 2022 Mw 6.3 Hunga Tonga submarine volcanic eruption,
122, 5517–5543. Seism. Rec., 2, 217–226.
Herrmann, R.B., 2013. Computer Programs in seismology: an evolving tool Vackář, J., Burjánek, J., Gallovič, F., Zahradnı́k, J. & Clinton, J., 2017.
for instruction and research, Seismol. Res. Lett., 84, 1081–1088. Bayesian ISOLA: new tool for automated centroid moment tensor inver-
Hu, J., Pha. m, T.-S. & Tkalčić, H., 2023. Seismic moment tensor inversion sion, Geophys. J. Int., 210, 693–705.
with theory errors from 2D earth structure: implications for the 2009-2017 Vasyura-Bathke, H., Dettmer, J., Dutta, R., Mai, P.M. & Jónsson, S., 2021.
DPRK nuclear blasts, Geophys. J. Int., 235, doi:10.1093/gji/ggad348. Accounting for theory errors with empirical bayesian noise models in
Gradient-based moment-tensor inversion 793

nonlinear centroid moment tensor estimation, Geophys. J. Int., 225, 1412– Zhao, L.-S. & Helmberger, D.V., 1994. Source estimation from broadband
1431. regional seismograms, Bull. seism. Soc. Am., 84, 91–104.
Vasyura-Bathke, H. et al., 2020. The Bayesian earthquake analysis tool, Zhu, L. & Ben-Zion, Y., 2013. Parametrization of general seismic po-
Seismol. Res. Lett., 91, 1003–1018. tency and moment tensors for source inversion of seismic waveform data,
Viltres, R., Nobile, A., Vasyura-Bathke, H., Trippanera, D., Xu, W. & Geophys. J. Int., 194(2), 839–843.
Jónsson, S., 2021. Transtensional rupture within a diffuse plate boundary Zhu, L. & Helmberger, D.V., 1996. Advancement in source estimation tech-
zone during the 2020 Mw 6.4 Puerto Rico earthquake, Seismol. Res. Lett., niques using broadband regional seismograms, Bull. seism. Soc. Am., 86,
93, 567–583. 1634–1641.

Downloaded from https://academic.oup.com/gji/article/238/2/783/7686131 by Australian National University Library (INACTIVE) user on 12 June 2024


C The Author(s) 2024. Published by Oxford University Press on behalf of The Royal Astronomical Society. This is an Open Access
article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

View publication stats

You might also like