Article
Subsampling Algorithms for Irregularly Spaced
Autoregressive Models
Jiaqi Liu †, Ziyang Wang †, HaiYing Wang and Nalini Ravishanker *
Abstract: With the exponential growth of data across diverse fields, applying conventional statistical
methods directly to large-scale datasets has become computationally infeasible. To overcome this
challenge, subsampling algorithms are widely used to perform statistical analyses on smaller, more
manageable subsets of the data. The effectiveness of these methods depends on their ability to identify
and select data points that improve the estimation efficiency according to some optimality criteria.
While much of the existing research has focused on subsampling techniques for independent data,
there is considerable potential for developing methods tailored to dependent data, particularly in time-
dependent contexts. In this study, we extend subsampling techniques to irregularly spaced time series
data which are modeled by irregularly spaced autoregressive models. We present frameworks for
various subsampling approaches, including optimal subsampling under A-optimality, information-
based optimal subdata selection, and sequential thinning on streaming data. These methods use
A-optimality or D-optimality criteria to assess the usefulness of each data point and prioritize the
inclusion of the most informative ones. We then assess the performance of these subsampling methods
using numerical simulations, providing insights into their suitability and effectiveness for handling
irregularly spaced long time series. Numerical results show that our algorithms have promising
performance. Their estimation efficiency can be ten times as high as that of the uniform sampling
estimator. They also significantly reduce the computational time and can be up to forty times faster
than the full-data estimator.
Keywords: subsampling; irregularly spaced autoregressive model; time series; big data

1. Introduction
Time series analysis is useful in many application domains for understanding patterns over time and for forecasting into the future. Most time series methods have been developed for regularly spaced time series, where the gaps between two consecutive time points are fixed and equal. Regular time series include hourly, daily, monthly, or annually recorded time series. Recently, there has been increasing interest in developing methods for analyzing and forecasting irregularly spaced time series, where the gaps between subsequent observations are not the same. Such irregular time series are observed in diverse fields such as astronomy [1], climatology [2], finance [3], etc. For instance, intra-day transaction-level data in finance consist of prices of a financial asset recorded at each trade within a trading day, resulting in irregular time intervals (in seconds, say) between consecutive trades. An example in real estate consists of the time series of sale prices of houses, where the discrete time gaps (in days or weeks) between subsequent sale dates are typically nonconstant.
Consider an irregularly spaced time series y_j, j = 0, ..., m, consisting of observations at discrete times t_j, where t_0 < t_1 < ··· < t_m. The gaps between consecutive time points, denoted by d_j = t_j − t_{j−1}, j = 1, ..., m, are not constant.
To model irregularly spaced time series with discrete gaps, Nagaraja et al. [4] proposed
the stationary gap time autoregressive (Gap AR(1)) model:
$$y_0 = \sigma \sqrt{\frac{1}{1-\phi^2}}\, \varepsilon_0, \qquad y_j = \phi^{d_j} y_{j-1} + \sigma \sqrt{\frac{1-\phi^{2 d_j}}{1-\phi^2}}\, \varepsilon_j \quad \text{for } j = 1, \ldots, m, \qquad (1)$$
where ε_j is the error term with the standard normal distribution, i.e., ε_j ∼ N(0, 1), ϕ ∈ (−1, 1) is the autoregressive parameter, and σ > 0 is a scale parameter. They used (1) to
model and forecast house prices and then constructed a house price index. Other models
have also been constructed in the literature for irregularly spaced time series. Erdogan
et al. [5] described a nonstationary irregularly spaced autoregressive (NIS-AR(1)) model
and illustrated its use on astronomy data. Anantharaman et al. [6] described Bayesian
modeling for an irregular stochastic volatility autoregressive conditional duration (IR-SV-
ACD) model and used this for estimating and forecasting inter-transaction gaps and the
volatility of financial log-returns.
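To make the data-generating process in (1) concrete, the following minimal NumPy sketch simulates an IS-AR(1) series for a given ϕ, σ, and sequence of gaps; the function name and the way the gaps are drawn are illustrative choices rather than anything prescribed in the paper.

```python
import numpy as np

def simulate_is_ar1(m, phi, sigma, gaps, rng=None):
    """Simulate y_0, ..., y_m from the IS-AR(1) model in (1).

    gaps: array of length m holding the positive gaps d_1, ..., d_m.
    """
    rng = np.random.default_rng(rng)
    y = np.empty(m + 1)
    # Stationary start: y_0 = sigma * sqrt(1 / (1 - phi^2)) * eps_0.
    y[0] = sigma * np.sqrt(1.0 / (1.0 - phi**2)) * rng.standard_normal()
    for j in range(1, m + 1):
        d = gaps[j - 1]
        scale = sigma * np.sqrt((1.0 - phi**(2 * d)) / (1.0 - phi**2))
        y[j] = phi**d * y[j - 1] + scale * rng.standard_normal()
    return y

# Example: gaps drawn uniformly from {1, ..., 5}, as in the simulations of Section 5.
rng = np.random.default_rng(1)
d = rng.integers(1, 6, size=1000)
y = simulate_is_ar1(1000, phi=0.5, sigma=1.0, gaps=d, rng=rng)
```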
In this paper, we refer to the Gap AR(1) model in (1) as the irregularly spaced AR(1)
or the IS-AR(1) model. Estimation in this model can be handled by the method of maximum
likelihood. While a model for irregularly spaced time series enables handling a wider
range of temporal scenarios, the statistical analysis involves considerable computational
complexity. Specifically, the task of capturing temporal structures by estimating the model
parameters becomes more challenging due to the irregular gaps. One potential solution
to address these computational challenges is to perform the computations of interest on
subsamples that are selected from the full data.
Despite extensive recent research on subsampling techniques, subsampling methods
tailored for irregularly spaced time series data are sparse. In this article, we fill this gap by
proposing novel subsampling methods specifically designed for irregularly spaced time
series data analyzed by the IS-AR(1) model. In comparison to other subsampling methods,
our proposed algorithms are founded upon the optimality criteria under classical statistical
frameworks. It should be noted that the term “subsample” in time series analysis generally
refers to a subset of data that are in the form of multiple series, blocks, or sequences,
and the main objective of their analysis is to provide estimates of the variance of the full-
data estimator; see Carlstein [7], Fukuchi [8], Politis [9]. By contrast, the subsampling
methods discussed in this paper are used to provide estimates of model parameters based
on subsamples from the full data in situations where obtaining such estimates from the full
data is computationally prohibitive.
The remainder of this paper is organized as follows. Section 2 provides background on irregularly spaced time series and on subsampling methods. Section 3 presents full-data maximum likelihood estimation for the parameters of the IS-AR(1) model. Subsampling methods for the IS-AR(1) model are described in Section 4: a random subsampling method based on A-optimality (Section 4.1), information-based optimal subdata selection (Section 4.2), and a sequential thinning method (Section 4.3). Section 5 evaluates the techniques using simulated data. Lastly, in Section 6, we summarize the key findings of our study.
2. Background
Section 2.1 gives a brief review of irregularly spaced time series, while Section 2.2
reviews subsampling methods from the recent literature.
longer durations. Since methods that are used for modeling and forecasting regular time
series [10] are not useful for analyzing irregular time series, modified approaches have
been developed.
A recent approach for analyzing irregularly spaced time series is the gap time au-
toregressive (AR) model in (1), which was discussed in [4] for modeling house prices.
In this model, the higher the value of d j , the less useful y j−1 becomes for explaining and
predicting y j . Erdogan et al. [5] described AR type models for stationary and nonstationary
irregular time series. Similar models were used in [11] and [1] to model and forecast
future values of irregularly spaced astronomical light curves from variable stars, for which
Elorrieta et al. [12] proposed a bivariate model.
Ghysels and Jasiak [13] proposed the autoregressive conditional duration generalized
autoregressive conditionally heteroscedastic (ACD-GARCH) model for analyzing irregu-
larly spaced financial returns, employing the computationally cumbersome Generalized
Method of Moments (GMM) approach for estimating parameters. Meddahi et al. [14] pro-
posed a GARCH-type model for irregularly spaced time series by discretizing a continuous
time stochastic volatility process, thereby combining the advantages of the ACD-GARCH
model [13] and the ACD model [15]. Maller et al. [16] described a continuous version of the
GARCH (i.e., the COGARCH) model for irregularly spaced time series, while [17] proposed
a multivariate local-level model with score-driven covariance matrices for intra-day log
prices, treating the asynchronicity as a missing data problem. Recently, [6] extended the
gap time modeling idea of [4] to construct useful time series models to understand volatil-
ity patterns in irregularly spaced financial time series, considering the gaps as random
variables. An alternate stochastic volatility model treating the gaps as fixed constants was
discussed in [18].
Most methods for analyzing long irregularly spaced time series can be computationally
challenging. Subsampling methods can help us obtain estimates in a computationally
feasible way.
While the aforementioned studies are based on sampling with replacement (in which
the same data point may be included more than once in the subsample), further research
derived optimal subsampling probabilities and a distributed sampling strategy under
Poisson sampling (sampling without replacement). Wang et al. [26] further showed that
Poisson sampling can be superior to sampling with replacement in terms of estimation
efficiency. Poisson sampling for irregular time series analysis is described in Section 4.1.
Deterministic subdata selection uses a particular criterion to determine a subsample
without involving additional randomness. Wang et al. [27] introduced a novel method,
termed information-based optimal subdata selection (IBOSS), designed to select data points with
extreme values to approximate the D-optimality criterion in designed experiments. Pron-
zato and Wang [28] later proposed an online selection strategy that leverages directional
derivatives to decide whether to include a data point in a subsample, aiming to achieve
optimality according to a given criterion. This approach processes only the data points
encountered up to the current time, determining whether the current point should be
included in the subsample. As a result, it is especially well suited for applications involving
streaming data.
Despite the rapid recent developments mentioned above, applications of the stochastic
subsampling and deterministic subdata selection methods remain unexplored for irregu-
larly spaced time series.
The MLE θ̂ = arg max_θ M_m(θ) is the maximizer of (2) and must be computed numerically since no closed-form solution is available. Finding the MLE is challenging due to (a) the domain restriction (i.e., −1 < ϕ < 1) and (b) the nonconcavity of the objective function. These challenges often cause convergence issues for the classical Newton–Raphson algorithm.
Since ϕ is a correlation coefficient constrained to lie between −1 and 1, an uncon-
strained gradient-based optimization method may not work well. To remedy this, we use
Fisher’s z-transformation to map ϕ from (−1, 1) onto R so that an unconstrained optimization algorithm can be applied to the transformed parameter. Although the choice of the transfor-
mation is not unique and other transformations may be used, Fisher’s z-transformation
provides a well-understood estimator of the correlation coefficient, especially when the
error term ε j follows a normal distribution [29]. Specifically, let
$$z = \operatorname{atanh}(\phi) = \frac{1}{2}\log\frac{1+\phi}{1-\phi},$$
so that
$$\phi = \tanh(z) = \frac{\exp(2z)-1}{\exp(2z)+1}. \qquad (3)$$
We have
$$\frac{\partial \phi}{\partial z} = \operatorname{sech}^2(z) = 1-\phi^2,$$
and
$$\frac{\partial^2 \phi}{\partial z^2} = -2\tanh(z)\operatorname{sech}^2(z) = -2\phi(1-\phi^2),$$
where tanh(·) and sech(·) are, respectively, the hyperbolic tangent and secant functions.
Below, we show the use of the coordinate ascent algorithm to maximize (2) in terms of z;
we then insert the estimate ẑ into (3) to obtain the estimate of ϕ.
We denote the gradient vector and Hessian matrix of the per-observation log-likelihood ℓ(θ; y_j, y_{j−1}, d_j) in (2) as follows:
$$\dot\ell(\theta; y_j, y_{j-1}, d_j) = \frac{\partial \ell(\theta; y_j, y_{j-1}, d_j)}{\partial \theta} = \begin{pmatrix} \dot\ell_1(\theta; y_j, y_{j-1}, d_j) \\ \dot\ell_2(\theta; y_j, y_{j-1}, d_j) \end{pmatrix}, \qquad (4)$$
and
$$\ddot\ell(\theta; y_j, y_{j-1}, d_j) = \frac{\partial^2 \ell(\theta; y_j, y_{j-1}, d_j)}{\partial \theta\, \partial \theta'} = \begin{pmatrix} \ddot\ell_{11}(\theta; y_j, y_{j-1}, d_j) & \ddot\ell_{12}(\theta; y_j, y_{j-1}, d_j) \\ \ddot\ell_{12}(\theta; y_j, y_{j-1}, d_j) & \ddot\ell_{22}(\theta; y_j, y_{j-1}, d_j) \end{pmatrix}. \qquad (5)$$
The first and second derivatives of ℓ(θ; y_j, y_{j−1}, d_j) with respect to z are, respectively, denoted by
$$\dot\ell_{z1}(\theta) = (1-\phi^2)\,\dot\ell_1(\theta) \quad \text{and} \quad \ddot\ell_{z11}(\theta) = (1-\phi^2)^2\,\ddot\ell_{11}(\theta) - 2(1-\phi^2)\phi\,\dot\ell_1(\theta).$$
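These expressions are simply the chain rule applied to the reparameterization ϕ = tanh(z), stated explicitly:
$$\dot\ell_{z1}(\theta) = \frac{\partial \phi}{\partial z}\,\dot\ell_1(\theta) = (1-\phi^2)\,\dot\ell_1(\theta), \qquad \ddot\ell_{z11}(\theta) = \left(\frac{\partial \phi}{\partial z}\right)^{2}\ddot\ell_{11}(\theta) + \frac{\partial^2 \phi}{\partial z^2}\,\dot\ell_1(\theta) = (1-\phi^2)^2\,\ddot\ell_{11}(\theta) - 2\phi(1-\phi^2)\,\dot\ell_1(\theta).$$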
Given σ^{(k)}, we find the value of z in the (k + 1)-th step by implementing the Newton–Raphson algorithm, i.e., computing
$$z^{(k+1),\,l+1} = z^{(k+1),\,l} - \frac{\sum_{j=1}^{m} \dot\ell_{z1}(\phi^{(k+1),\,l}, \sigma^{(k)}; y_j, y_{j-1}, d_j)}{\sum_{j=1}^{m} \ddot\ell_{z11}(\phi^{(k+1),\,l}, \sigma^{(k)}; y_j, y_{j-1}, d_j)}.$$
$$\dot\ell_1(\theta; y_j, y_{j-1}, d_j) = \frac{d_j \phi^{2d_j-1}}{1-\phi^{2d_j}} - \frac{\phi}{1-\phi^2} + \frac{\phi e_j^2}{\sigma^2 (1-\phi^{2d_j})} + \frac{1-\phi^2}{\sigma^2 (1-\phi^{2d_j})^2}\, d_j e_j \left(\phi^{d_j-1} y_{j-1} - \phi^{2d_j-1} y_j\right),$$
$$\dot\ell_2(\theta; y_j, y_{j-1}, d_j) = -\frac{1}{\sigma} + \frac{e_j^2 (1-\phi^2)}{\sigma^3 (1-\phi^{2d_j})},$$
$$\ddot\ell_{11}(\theta; y_j, y_{j-1}, d_j) = \frac{2 d_j^2 \phi^{2d_j-2}}{(1-\phi^{2d_j})^2} - \frac{d_j \phi^{2d_j-2}}{1-\phi^{2d_j}} - \frac{1+\phi^2}{(1-\phi^2)^2} + \frac{e_j^2 - 2 d_j e_j \phi^{d_j} y_{j-1}}{\sigma^2 (1-\phi^{2d_j})} + \frac{2 d_j e_j^2 \phi^{2d_j}}{\sigma^2 (1-\phi^{2d_j})^2} + \cdots,$$
where e_j = y_j − ϕ^{d_j} y_{j−1}.
1. z ← atanh(ϕ^{(0)})
2. σ ← sqrt[ (1/m) ∑_{j=1}^{m} (y_j − ϕ^{d_j} y_{j−1})^2 (1 − ϕ^2) / (1 − ϕ^{2 d_j}) ]
3. z_o ← Inf
4. while |z − z_o| > ϵ_CA
   1. z_o ← z
   2. ∆_NR ← Inf
   3. while |∆_NR| > ϵ_NR
      1. ∆_NR ← [∑_{j=1}^{m} (1 − ϕ^2) ℓ̇_1(ϕ, σ; y_j, y_{j−1}, d_j)] / [∑_{j=1}^{m} {(1 − ϕ^2)^2 ℓ̈_{11}(ϕ, σ; y_j, y_{j−1}, d_j) − 2(1 − ϕ^2) ϕ ℓ̇_1(ϕ, σ; y_j, y_{j−1}, d_j)}]
      2. z ← z − ∆_NR
      end
   4. ϕ ← tanh(z)
   5. σ ← sqrt[ (1/m) ∑_{j=1}^{m} (y_j − ϕ^{d_j} y_{j−1})^2 (1 − ϕ^2) / (1 − ϕ^{2 d_j}) ]
   end
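As a complementary illustration (not the authors' implementation), the sketch below fits the IS-AR(1) model by maximizing the conditional Gaussian log-likelihood implied by (1) over the transformed parameters (z, log σ) with a generic quasi-Newton optimizer, instead of the hand-coded coordinate ascent above; whether the objective (2) also includes a marginal term for y_0 is not visible here, so conditioning on y_0 is an assumption of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def is_ar1_negloglik(params, y, gaps):
    """Negative conditional (on y_0) log-likelihood of the IS-AR(1) model.

    params = (z, log_sigma); phi = tanh(z) keeps phi inside (-1, 1).
    """
    z, log_sigma = params
    phi, sigma = np.tanh(z), np.exp(log_sigma)
    e = y[1:] - phi**gaps * y[:-1]                  # e_j = y_j - phi^{d_j} y_{j-1}
    v = (1.0 - phi**(2 * gaps)) / (1.0 - phi**2)    # conditional variance factor
    ll = -0.5 * np.log(2.0 * np.pi * sigma**2 * v) - e**2 / (2.0 * sigma**2 * v)
    return -np.sum(ll)

def fit_is_ar1(y, gaps, phi0=0.0, sigma0=1.0):
    """Full-data estimate via unconstrained optimization on (z, log sigma)."""
    x0 = np.array([np.arctanh(phi0), np.log(sigma0)])
    res = minimize(is_ar1_negloglik, x0, args=(y, gaps), method="BFGS")
    z_hat, log_sigma_hat = res.x
    return np.tanh(z_hat), np.exp(log_sigma_hat)

# phi_hat, sigma_hat = fit_is_ar1(y, d)   # y, d as in the earlier simulation sketch
```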
Poisson sampling cannot sample the same data point more than once. Since repeatedly
sampled data points do not contribute any new information, Poisson sampling preserves
more distinct information from the full data and is therefore more efficient than sampling
with replacement. See Wang et al. [26] for a theoretical justification of the superiority of
Poisson sampling.
Let η_j be the sampling probability and δ_j be the indicator variable for the inclusion of the j-th data point in the subsample. With u_j randomly sampled from the uniform distribution U(0, 1), we set δ_j = 1 when u_j ≤ η_j and δ_j = 0 otherwise. The actual subsample size from Poisson sampling, which is denoted by r^∗ = ∑_{j=1}^{m} δ_j, is random. The expected subsample size is r = ∑_{j=1}^{m} η_j. We obtain the subsample estimator θ̆ by maximizing the following target function:
$$M_r^{(opt)}(\theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{\delta_j\, \ell(\theta; y_j, y_{j-1}, d_j)}{\eta_j}. \qquad (6)$$
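A minimal sketch of the Poisson sampling step and of the inverse-probability weighting in (6); here loglik_terms is a placeholder for the vector of per-observation log-likelihood contributions ℓ(θ; y_j, y_{j−1}, d_j) evaluated at a candidate θ, and eta holds the inclusion probabilities η_j (how they are chosen is discussed next).

```python
import numpy as np

def poisson_subsample(eta, rng=None):
    """Poisson sampling: include point j iff u_j <= eta_j, independently across j."""
    rng = np.random.default_rng(rng)
    return rng.uniform(size=len(eta)) <= eta      # boolean delta_j; r* = delta.sum() is random

def weighted_objective(loglik_terms, delta, eta):
    """Target function (6): inverse-probability-weighted average over the selected points."""
    m = len(loglik_terms)
    return np.sum(loglik_terms[delta] / eta[delta]) / m
```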
where κ is a parameter to ensure that the expected sample size is set around r and to prevent it from being too small, the notation a ∧ b = min(a, b),
$$\ddot{M}_{r_0}^{(plt)}(\theta) = \frac{1}{r_0} \sum_{j=1}^{m} \delta_j^{(plt)}\, \ddot\ell(\theta; y_j, y_{j-1}, d_j) \qquad (8)$$
is the average Hessian matrix with the pilot subsample, and δ_j^{(plt)} is the indicator variable for inclusion of the j-th data point in the pilot sample. While the exact value of κ requires additional computation, it can be approximated by
additional computation, it can be approximated by
r
κa = n o −1
(plt)
∑m
j =1 M̈r0 (θ(plt) ) ℓ̇(θ(plt) ; y j , y j−1 , d j )
when the sampling rate r/m is small. This is usually the case in practice.
The approximated subsampling probabilities η̂_j^{(opt)} in (7) are subject to additional disturbance from the pilot estimation, and small values of η̂_j^{(opt)} may inflate the asymptotic variance of the resulting estimator. To address this problem, we mix η̂_j^{(opt)} with the uniform subsampling probabilities to prevent the sampling probabilities from getting too small. Specifically, we use
$$(1-\alpha_1)\,\hat\eta_j^{(opt)} + \alpha_1 \frac{r}{m}, \quad \text{for } j = 1, \ldots, m,$$
where α_1 ∈ (0, 1) is a tuning parameter. The final subsample estimator θ̃ is obtained by combining the pilot estimator θ^{(plt)} and the optimal subsample estimator θ̆:
$$\tilde\theta = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\theta^{(plt)}) + r^{*} \ddot{M}_r^{(opt)}(\breve\theta) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\theta^{(plt)})\, \theta^{(plt)} + r^{*} \ddot{M}_r^{(opt)}(\breve\theta)\, \breve\theta \right\}, \qquad (9)$$
where
$$\ddot{M}_r^{(opt)}(\breve\theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{\delta_j\, \ddot\ell(\breve\theta; y_j, y_{j-1}, d_j)}{\hat\eta_{\alpha_1, j}^{(opt)}}.$$
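Because display (7) is not reproduced above, the following sketch assumes the usual A-optimality form η_j^{(opt)} = {κ ‖{M̈_{r_0}^{(plt)}(θ^{(plt)})}^{−1} ℓ̇(θ^{(plt)}; y_j, y_{j−1}, d_j)‖} ∧ 1 together with the approximate normalizer κ_a, and then applies the uniform mixing described above; the per-observation scores are coded directly from the analytic expressions given earlier, and the function names are illustrative.

```python
import numpy as np

def score_per_obs(phi, sigma, y, gaps):
    """Per-observation gradients (l1_dot, l2_dot) of the IS-AR(1) log-likelihood,
    following the analytic expressions given earlier; returns an (m, 2) array."""
    yj, yjm1, d = y[1:], y[:-1], gaps
    e = yj - phi**d * yjm1
    c = 1.0 - phi**(2 * d)
    l1 = (d * phi**(2 * d - 1) / c
          - phi / (1.0 - phi**2)
          + phi * e**2 / (sigma**2 * c)
          + (1.0 - phi**2) / (sigma**2 * c**2) * d * e
            * (phi**(d - 1) * yjm1 - phi**(2 * d - 1) * yj))
    l2 = -1.0 / sigma + e**2 * (1.0 - phi**2) / (sigma**3 * c)
    return np.column_stack([l1, l2])

def aopt_mixed_probabilities(phi_plt, sigma_plt, y, gaps, M_plt, r, alpha1=0.1):
    """Approximate A-optimal Poisson inclusion probabilities, mixed with uniform ones.

    M_plt is the 2x2 pilot Hessian from (8).  The A-optimal form
    eta_j = {kappa * ||M_plt^{-1} l_dot_j||} ∧ 1 with the approximation kappa_a
    is an assumption, since display (7) is not shown in this text."""
    g = score_per_obs(phi_plt, sigma_plt, y, gaps)
    norms = np.linalg.norm(np.linalg.solve(M_plt, g.T), axis=0)   # ||M^{-1} l_dot_j||
    kappa_a = r / norms.sum()
    eta_opt = np.minimum(kappa_a * norms, 1.0)
    return (1.0 - alpha1) * eta_opt + alpha1 * r / len(gaps)      # mixed probabilities
```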
where
$$c_j(\theta) = \frac{d_j \phi^{2d_j-1}}{1-\phi^{2d_j}} - \frac{\phi}{1-\phi^2}.$$
The information matrix I_j(θ | y_{j−1}, d_j) has full rank. We select data points with the r largest values of det{I_j(θ | y_{j−1}, d_j)}, j = 1, ..., m. This can be done efficiently with partition-based selection algorithms [30]. We implement the procedure via the following steps:
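As an illustration of the selection step only (not the authors' complete procedure), the r points with the largest determinants can be located in O(m) time with a partition-based selection, in the spirit of the algorithms referenced in [30]; det_I below is assumed to be a precomputed length-m array of det{I_j(θ^{(plt)} | y_{j−1}, d_j)} from (10).

```python
import numpy as np

def select_largest_det(det_I, r):
    """Return the indices of the r points with the largest det{I_j(theta | y_{j-1}, d_j)}.

    np.argpartition performs a partition-based selection in O(m) time;
    the returned indices are sorted only for readability."""
    idx = np.argpartition(det_I, len(det_I) - r)[-r:]
    return np.sort(idx)
```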
$$\tilde\theta^{(iboss)} = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(iboss)}) + r \ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)}) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(iboss)})\, \theta^{(plt)} + r \ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)})\, \breve\theta^{(iboss)} \right\}, \qquad (11)$$
where M̈_{r_0}^{(plt)}(θ) is defined in (8) and
$$\ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)}) = \frac{1}{r} \sum_{j=1}^{m} \delta_j^{(iboss)}\, \ddot\ell(\breve\theta^{(iboss)}; y_j, y_{j-1}, d_j).$$
$$I_S(\theta) = \frac{1}{r_S} \sum_{j=1}^{m} \delta_j^{S}\, I_j(\theta \mid y_{j-1}, d_j), \qquad (12)$$
where r_S = ∑_{j=1}^{m} δ_j^S and I_j(θ | y_{j−1}, d_j) is defined in (10). The contribution of a data point, say (y_{j′}, y_{j′−1}, d_{j′}), to the average information matrix, if included in the subsample, can be measured by the directional derivative, which is defined as
$$\zeta_{I_S(\theta),\, j'} = \operatorname{tr}\!\left\{ I_S(\theta)^{-1} I_{j'}(\theta \mid y_{j'-1}, d_{j'}) \right\} - 2. \qquad (13)$$
$$\tilde\theta^{(thin)} = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(thin)}) + r \ddot{M}_r^{(thin)}(\breve\theta^{(thin)}) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(thin)})\, \theta^{(plt)} + r \ddot{M}_r^{(thin)}(\breve\theta^{(thin)})\, \breve\theta^{(thin)} \right\}, \qquad (14)$$
where M̈_{r_0}^{(plt)}(θ) is defined in (8) and
$$\ddot{M}_r^{(thin)}(\breve\theta) = \frac{1}{r} \sum_{j=1}^{m} \delta_j^{(thin)}\, \ddot\ell(\breve\theta; y_j, y_{j-1}, d_j).$$
• Set Parameters:
  1. Set r_0 ≥ 2, q ∈ (1/2, 1), γ ∈ (0, q − 1/2).
  2. α ← r/(m − r_0)
  3. Set ϵ_1 > 0 such that ϵ_1 ≪ α
• Step 1—Pilot Estimates and Preparations:
  1. θ^{(plt)} ← arg max_θ ∑_{j=1}^{r_0} ℓ(θ; y_j, y_{j−1}, d_j)
  2. r_∗ ← 0
  3. δ_1, ..., δ_{r_0} ← 0
  4. for j = 1 to r_0
       I_j ← I_j(θ^{(plt)} | y_{j−1}, d_j)
     end
  5. Ī_∗ ← r_0^{−1} ∑_{j=1}^{r_0} I_j
  6. for j = 1 to r_0
       ζ_j ← tr(Ī_∗^{−1} I_j) − 2
     end
  7. ζ_{(1)}, ..., ζ_{(r_0)} ← sort(ζ_1, ..., ζ_{r_0}), where ζ_{(1)} ≤ ··· ≤ ζ_{(r_0)}
  8. r_0^+ ← ⌈(1 − α/2) r_0⌉, r_0^− ← ⌊(1 − 3α/2) r_0⌋ ∨ 1
  9. Ĉ_∗ ← ζ_{(⌈(1−α) r_0⌉)}
  10. β_0 ← r_0/(r_0^+ − r_0^−)
  11. h ← ζ_{(r_0^+)} − ζ_{(r_0^−)}
  12. h_∗ ← h/r_0^γ
  13. f̂_∗ ← [∑_{j=1}^{r_0} 1{|ζ_j − Ĉ_∗| ≤ h_∗}] / (2 r_0 h_∗)
• Step 2—Sequential Thinning:
  for k = r_0 + 1 to m
    1. if r − r_∗ > m − k do
         δ_j ← 1 for j = k, ..., m
         break
       else if r_∗ ≥ r do
         δ_j ← 0 for j = k, ..., m
         break
       end
    2. I_∗ ← I_k(θ^{(plt)} | y_{k−1}, d_k)
    3. ζ_∗ ← tr(Ī_∗^{−1} I_∗) − 2
    4. if (r_∗ + r_0)/k ≤ ϵ_1 or ζ_∗ ≥ Ĉ_∗ do
         δ_k ← 1
       else
         δ_k ← 0
       end
    5. Ĉ_o ← Ĉ_∗
    6. Ĉ_∗ ← Ĉ_∗ + k^{−q} {(f̂_∗)^{−1} ∨ β_0 (k − 1)^γ} (1{ζ_∗ ≥ Ĉ_∗} − α)
    7. h_∗ ← h/k^γ
    8. f̂_∗ ← f̂_∗ + k^{−q} [(2 h_∗)^{−1} 1{|ζ_∗ − Ĉ_o| ≤ h_∗} − f̂_∗]
    9. r_∗ ← r_∗ + δ_k
    10. Ī_∗ ← Ī_∗ + [δ_k/(r_∗ + r_0)] (I_∗ − Ī_∗)
  end
• Step 3—Estimate and Combine:
  1. θ̆ ← arg max_θ ∑_{j=r_0+1}^{m} δ_j ℓ(θ; y_j, y_{j−1}, d_j)
  2. Combine θ^{(plt)} and θ̆ by (14)
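A condensed Python sketch of the thinning loop in Step 2 follows, under two assumptions not confirmed by the extracted listing: the running average information matrix is denoted Ī_∗, and ζ_∗ includes the −2 shift from (13). The callable info_matrix, which returns I_k(θ^{(plt)} | y_{k−1}, d_k), and the pilot quantities C0, f0, h, beta0, Ibar0 from Step 1 are supplied by the caller.

```python
import numpy as np

def sequential_thinning(info_matrix, m, r, r0, C0, f0, h, beta0, Ibar0,
                        q=0.7, gamma=0.1, eps1=1e-3):
    """Step 2 of the sequential thinning algorithm (sketch).

    info_matrix(k) returns the 2x2 matrix I_k(theta_plt | y_{k-1}, d_k) for
    k = r0+1, ..., m (1-based, as in the listing).  Returns the boolean
    inclusion indicators for points r0+1, ..., m.
    """
    alpha = r / (m - r0)
    delta = np.zeros(m - r0, dtype=bool)
    C_star, f_star, Ibar, r_star = C0, f0, Ibar0.copy(), 0
    for k in range(r0 + 1, m + 1):
        i = k - r0 - 1
        if r - r_star > m - k:               # not enough points left: take all remaining
            delta[i:] = True
            break
        if r_star >= r:                       # budget reached: skip the rest
            break
        I_k = info_matrix(k)
        zeta = np.trace(np.linalg.solve(Ibar, I_k)) - 2.0    # directional derivative (13)
        take = (r_star + r0) / k <= eps1 or zeta >= C_star
        delta[i] = take
        # Stochastic-approximation updates of the threshold C_star and density f_star.
        C_old = C_star
        step = max(1.0 / f_star, beta0 * (k - 1) ** gamma) / k ** q
        C_star = C_star + step * (float(zeta >= C_star) - alpha)
        h_star = h / k ** gamma
        f_star = f_star + k ** (-q) * (float(abs(zeta - C_old) <= h_star) / (2 * h_star) - f_star)
        if take:                              # running mean update, applied only when delta_k = 1
            r_star += 1
            Ibar = Ibar + (I_k - Ibar) / (r_star + r0)
    return delta
```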
5. Numerical Results
We perform numerical simulations to evaluate the performance of the subsampling
methods. We present the results on estimation efficiency in Section 5.1 and the results on
computational efficiency in Section 5.2.
We compare the performance of the uniform subsample estimator (unif), the A-optimal subsample estimator (opt), the information-based optimal subdata selection estimator (iboss), and the sequential thinning estimator (thin). We set the full-data sample size as m = 10^5 and consider different values of the true parameters; we allow ϕ = 0.1, 0.5, 0.9 and σ = 1, 10, and 20. The time gaps d_j are randomly sampled from the set {1, 2, 3, 4, 5}. We consider different subsample sizes r = 5000, 10,000, 15,000, 20,000, 25,000, and 30,000, corresponding, respectively, to 5%, 10%, 15%, 20%, 25%, and 30% of the full dataset. We
let r0 = 1000. For opt, we set the mixing rate α1 = 0.1. To implement the thin subsam-
pling procedure, we set the tuning parameters q = 0.7, γ = 0.1, and ϵ1 = 0.001. We also
implement the uniform subsampling method with a subsample size r + r0 as a benchmark
for comparisons.
where θ̃^{(s)} = (ϕ̃^{(s)}, σ̃^{(s)}) is the subsample estimate in the s-th repetition. Due to the
nonconvex nature of the model, some methods failed to converge in a few repetitions. We
drop the results in these repetitions when calculating the MSE. This does not affect the
accuracy of the MSE because the nonconvergence rate is very low.
The results in terms of the empirical MSE are presented in Figure 1. It is seen that all
three proposed subsampling methods outperform the uniform sampling in all the cases.
Overall, the opt algorithm displays the best performance, especially when σ or ϕ is large.
The deterministic methods, iboss and thin, have higher estimation efficiency when σ
is small.
The varying performances in empirical MSE among the proposed subsampling meth-
ods are partly due to the different optimality criteria adopted. The A-optimality criterion
used by opt seeks to minimize the trace of the asymptotic variance matrix, whereas the
D-optimality criterion used by both iboss and thin focuses on minimizing the determinant
of the asymptotic variance matrix. As a result, opt places greater emphasis on parameter
components with larger variances, while iboss and thin distribute their focus more evenly
across all parameters. When σ is small (e.g., σ = 1), the variance in estimating σ is low
across all subsampling methods, allowing iboss and thin to express their advantage in
reducing the variance of estimating ϕ, thereby outperforming opt. However, when σ = 20,
the variance in estimating σ becomes the dominant contributor to the empirical MSE. Since
iboss and thin do not prioritize the estimation of σ, their performance weakens in this
case. A similar reasoning applies to the differing performances across various values of ϕ.
To better understand this, we further decompose the empirical MSEs for σ and ϕ to validate
our explanation.
Figure 1. Comparison of MSEs for estimating θ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Figure 2. Comparison of MSE_ϕ values for estimating ϕ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Figure 3. Comparison of MSE_σ values for estimating σ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Table 1. Average estimation times (ms) over 30 repetitions across different estimation methods, with
full-data sizes m and subsample sizes r under the setting ϕ = 0.5 and σ = 1.
The entries in the table show a substantial reduction in computation times by using
the subsampling methods compared to the time taken for full-data estimation. Unlike the
uniform subsampling method, which incurs almost no computational overhead during the
subsampling process, optimal subsampling algorithms require additional computations
to determine the inclusion probability for each data point. The numerical optimization
using subsamples has a time complexity of O(r ), whereas calculating the subsampling
probabilities for nonuniform subsampling methods requires a cost of O(m).
Compared to full-data-based estimation, both opt and iboss demonstrate exceptional
efficiency, reducing the computation time to roughly 1/40 of the full-data time on average when the sample
size m is sufficiently large. Although not as fast as these two methods, thin still achieves
significant computational savings, taking less than one-tenth of the time required for the
full-data estimation. Additionally, an advantage of using thin is that subsample selection
can be performed sequentially. Overall, all three proposed optimal subsampling methods
for the IS-AR(1) model offer reliable strategies for reducing the computational burden in
large-scale data applications.
6. Conclusions
In this paper, we investigated the technique of computationally feasible subsampling
in the context of irregularly spaced time series data. We proposed practical algorithms for
implementing the opt, iboss, and thin methods for the IS-AR(1) model. The numerical results demonstrated that the proposed subsampling methods outperform the naive uniform
subsampling approach with improved estimation efficiency and show significant benefits
in reducing the computation time compared with the full-data estimation.
While our work is currently focused on the IS-AR(1) model, it highlights the potential
of optimal subsampling methods for time series data. In future research, these techniques
can be extended to more complex models, such as IS-AR(p) for p > 1. Typically, as the num-
ber of parameters increases, optimal subsampling methods become even more effective in
reducing computation times. We hope that our work paves the way for further exploration
of subsampling strategies in time series analysis.
Author Contributions: Conceptualization, J.L. and Z.W.; methodology, J.L. and Z.W.; formal analysis,
J.L. and Z.W.; investigation, J.L. and Z.W.; writing—original draft preparation, Z.W.; writing—review
and editing, N.R. and H.W.; supervision, N.R. and H.W.; project administration, N.R. and H.W. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Data used for the paper are computer-generated. Codes for generating
the data are available upon reasonable request.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Elorrieta, F.; Eyheramendy, S.; Palma, W. Discrete-time autoregressive model for unequally spaced time-series observations.
Astron. Astrophys. 2019, 627, A120. [CrossRef]
2. Mudelsee, M. Trend analysis of climate time series: A review of methods. Earth-Sci. Rev. 2019, 190, 310–322. [CrossRef]
3. Dutta, C.; Karpman, K.; Basu, S.; Ravishanker, N. Review of statistical approaches for modeling high-frequency trading data.
Sankhya B 2023, 85, 1–48. [CrossRef]
4. Nagaraja, C.H.; Brown, L.D.; Zhao, L.H. An autoregressive approach to house price modeling. Ann. Appl. Stat. 2011, 5, 124–149.
[CrossRef]
5. Erdogan, E.; Ma, S.; Beygelzimer, A.; Rish, I. Statistical models for unequally spaced time series. In Proceedings of the 2005 SIAM
International Conference on Data Mining. SIAM, Newport Beach, CA, USA, 21–23 April 2005; pp. 626–630.
6. Anantharaman, S.; Ravishanker, N.; Basu, S. Hierarchical modeling of irregularly spaced financial returns. Stat 2024, 13, e692.
[CrossRef]
7. Carlstein, E. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann. Stat.
1986, 14, 1171–1179. [CrossRef]
8. Fukuchi, J.I. Subsampling and model selection in time series analysis. Biometrika 1999, 86, 591–604. [CrossRef]
9. Politis, D.N. Scalable subsampling: Computation, aggregation and inference. Biometrika 2023, 111, 347–354. [CrossRef]
10. Shumway, R. Time Series Analysis and Its Applications; Springer: Berlin/Heidelberg, Germany, 2000.
11. Eyheramendy, S.; Elorrieta, F.; Palma, W. An autoregressive model for irregular time series of variable stars. Proc. Int. Astron.
Union 2016, 12, 259–262. [CrossRef]
12. Elorrieta, F.; Eyheramendy, S.; Palma, W.; Ojeda, C. A novel bivariate autoregressive model for predicting and forecasting
irregularly observed time series. Mon. Not. R. Astron. Soc. 2021, 505, 1105–1116. [CrossRef]
13. Ghysels, E.; Jasiak, J. GARCH for irregularly spaced financial data: The ACD-GARCH model. Stud. Nonlinear Dyn. Econom. 1998,
2, 1–19. [CrossRef]
14. Meddahi, N.; Renault, E.; Werker, B. GARCH and irregularly spaced data. Econ. Lett. 2006, 90, 200–204. [CrossRef]
15. Engle, R.F.; Russell, J.R. Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica
1998, 66, 1127–1162. [CrossRef]
16. Maller, R.A.; Müller, G.; Szimayer, A. GARCH modelling in continuous time for irregularly spaced time series data. Bernoulli
2008, 14, 519–542. [CrossRef]
17. Buccheri, G.; Bormetti, G.; Corsi, F.; Lillo, F. A score-driven conditional correlation model for noisy and asynchronous data: An
application to high-frequency covariance dynamics. J. Bus. Econ. Stat. 2021, 39, 920–936. [CrossRef]
18. Dutta, C. Modeling Multiple Irregularly Spaced High-Frequency Financial Time Series. Ph.D. Thesis, University of Connecticut,
Storrs, CT, USA, 2022.
19. Drineas, P.; Mahoney, M.W.; Muthukrishnan, S. Sampling algorithms for l2 regression and applications. In Proceedings of the
17th Annual ACM-SIAM Symposium on Discrete Algorithms, Miami, FL, USA, 22–24 January 2006; pp. 1127–1136.
20. Yang, T.; Zhang, L.; Jin, R.; Zhu, S. An explicit sampling dependent spectral error bound for column subset selection. In
Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015; Volume 37, pp. 135–143.
21. Ma, P.; Mahoney, M.W.; Yu, B. A statistical perspective on algorithmic leveraging. J. Mach. Learn. Res. 2015, 16, 861–911.
22. Xie, R.; Wang, Z.; Bai, S.; Ma, P.; Zhong, W. Online decentralized leverage score sampling for streaming multidimensional time
series. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Okinawa, Japan, 16–18 April
2019; Volume 89, pp. 2301–2311.
23. Zhu, R. Gradient-based sampling: An adaptive importance sampling for least-squares. Adv. Neural Inf. Process. Syst. 2018,
29, 406–414.
24. Wang, H.; Zhu, R.; Ma, P. Optimal subsampling for large sample logistic regression. J. Am. Stat. Assoc. 2018, 113, 829–844.
[CrossRef]
25. Teng, G.; Tian, B.; Zhang, Y.; Fu, S. Asymptotics of Subsampling for Generalized Linear Regression Models under Unbounded
Design. Entropy 2022, 25, 84. [CrossRef]
26. Wang, J.; Zou, J.; Wang, H. Sampling with replacement vs Poisson sampling: A comparative study in optimal subsampling. IEEE
Trans. Inf. Theory 2022, 68, 6605–6630. [CrossRef]
27. Wang, H.; Yang, M.; Stufken, J. Information-based optimal subdata selection for big data linear regression. J. Am. Stat. Assoc.
2019, 114, 393–405. [CrossRef]
28. Pronzato, L.; Wang, H. Sequential online subsampling for thinning experimental designs. J. Stat. Plan. Inference 2021, 212, 169–193.
[CrossRef]
29. Casella, G.; Berger, R. Statistical Inference; CRC Press: Boca Raton, FL, USA, 2024.
30. Kleinberg, J.; Tardos, E. Algorithm Design; Pearson/Addison-Wesley: Hoboken, NJ, USA, 2006.
31. Wynn, H. Optimum Submeasures with Applications to Finite Population Sampling; Academic Press: Cambridge, MA, USA, 1982.
32. Fedorov, V.V.; Hackl, P. Model-Oriented Design of Experiments; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.