Article
Subsampling Algorithms for Irregularly Spaced
Autoregressive Models
Jiaqi Liu †, Ziyang Wang †, HaiYing Wang and Nalini Ravishanker *
Abstract: With the exponential growth of data across diverse fields, applying conventional statistical
methods directly to large-scale datasets has become computationally infeasible. To overcome this
challenge, subsampling algorithms are widely used to perform statistical analyses on smaller, more
manageable subsets of the data. The effectiveness of these methods depends on their ability to identify
and select data points that improve the estimation efficiency according to some optimality criteria.
While much of the existing research has focused on subsampling techniques for independent data,
there is considerable potential for developing methods tailored to dependent data, particularly in time-
dependent contexts. In this study, we extend subsampling techniques to irregularly spaced time series
data which are modeled by irregularly spaced autoregressive models. We present frameworks for
various subsampling approaches, including optimal subsampling under A-optimality, information-
based optimal subdata selection, and sequential thinning on streaming data. These methods use
A-optimality or D-optimality criteria to assess the usefulness of each data point and prioritize the
inclusion of the most informative ones. We then assess the performance of these subsampling methods
using numerical simulations, providing insights into their suitability and effectiveness for handling
irregularly spaced long time series. Numerical results show that our algorithms have promising
performance. Their estimation efficiency can be ten times as high as that of the uniform sampling
estimator. They also significantly reduce the computational time and can be up to forty times faster
than the full-data estimator.
Keywords: subsampling; irregularly spaced autoregressive model; time series; big data

1. Introduction
Time series analysis is useful in many application domains for understanding patterns over time and for forecasting into the future. Most time series methods have been developed for regularly spaced time series, where the gaps between two consecutive time points are fixed and equal. Regular time series include hourly, daily, monthly, or annually recorded time series. Recently, there has been increasing interest in developing methods for analyzing and forecasting irregularly spaced time series, where the gaps between subsequent observations are not the same. Such irregular time series are observed in diverse fields such as astronomy [1], climatology [2], finance [3], etc. For instance, intra-day transaction-level data in finance consist of prices of a financial asset recorded at each trade within a trading day, resulting in irregular time intervals (in seconds, say) between consecutive trades. An example in real estate consists of the time series of sale prices of houses, where the discrete time gaps (in days or weeks) between subsequent sale dates are typically nonconstant.
Consider an irregularly spaced time series y_j, j = 0, ..., m, consisting of observations at discrete times t_j, where t_0 < t_1 < ··· < t_m. The gaps between consecutive time points, denoted by d_j = t_j − t_{j−1}, j = 1, ..., m, are not constant.
To model irregularly spaced time series with discrete gaps, Nagaraja et al. [4] proposed
the stationary gap time autoregressive (Gap AR(1)) model:
$$y_0 = \sigma \sqrt{\frac{1}{1-\phi^2}}\, \varepsilon_0, \qquad y_j = \phi^{d_j} y_{j-1} + \sigma \sqrt{\frac{1-\phi^{2 d_j}}{1-\phi^2}}\, \varepsilon_j \quad \text{for } j = 1, \ldots, m, \qquad (1)$$
where ε_j is the error term with the standard normal distribution, i.e., ε_j ∼ N(0, 1), ϕ ∈ (−1, 1) is the autoregressive parameter, and σ > 0 is a scale parameter. They used (1) to
model and forecast house prices and then constructed a house price index. Other models
have also been constructed in the literature for irregularly spaced time series. Erdogan
et al. [5] described a nonstationary irregularly spaced autoregressive (NIS-AR(1)) model
and illustrated its use on astronomy data. Anantharaman et al. [6] described Bayesian
modeling for an irregular stochastic volatility autoregressive conditional duration (IR-SV-
ACD) model and used this for estimating and forecasting inter-transaction gaps and the
volatility of financial log-returns.
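To make the data-generating process in (1) concrete, the following minimal NumPy sketch simulates an IS-AR(1) series for a given ϕ, σ, and sequence of gaps; the function name and the way the gaps are drawn are illustrative choices rather than anything prescribed in the paper.

```python
import numpy as np

def simulate_is_ar1(m, phi, sigma, gaps, rng=None):
    """Simulate y_0, ..., y_m from the IS-AR(1) model in (1).

    gaps: array of length m holding the positive gaps d_1, ..., d_m.
    """
    rng = np.random.default_rng(rng)
    y = np.empty(m + 1)
    # Stationary start: y_0 = sigma * sqrt(1 / (1 - phi^2)) * eps_0.
    y[0] = sigma * np.sqrt(1.0 / (1.0 - phi**2)) * rng.standard_normal()
    for j in range(1, m + 1):
        d = gaps[j - 1]
        scale = sigma * np.sqrt((1.0 - phi**(2 * d)) / (1.0 - phi**2))
        y[j] = phi**d * y[j - 1] + scale * rng.standard_normal()
    return y

# Example: gaps drawn uniformly from {1, ..., 5}, as in the simulations of Section 5.
rng = np.random.default_rng(1)
d = rng.integers(1, 6, size=1000)
y = simulate_is_ar1(1000, phi=0.5, sigma=1.0, gaps=d, rng=rng)
```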
In this paper, we refer to the Gap AR(1) model in (1) as the irregularly spaced AR(1)
or the IS-AR(1) model. Estimation in this model can be handled by the method of maximum
likelihood. While a model for irregularly spaced time series enables handling a wider
range of temporal scenarios, the statistical analysis involves considerable computational
complexity. Specifically, the task of capturing temporal structures by estimating the model
parameters becomes more challenging due to the irregular gaps. One potential solution
to address these computational challenges is to perform the computations of interest on
subsamples that are selected from the full data.
Despite extensive recent research on subsampling techniques, subsampling methods
tailored for irregularly spaced time series data are sparse. In this article, we fill this gap by
proposing novel subsampling methods specifically designed for irregularly spaced time
series data analyzed by the IS-AR(1) model. In comparison to other subsampling methods,
our proposed algorithms are founded upon the optimality criteria under classical statistical
frameworks. It should be noted that the term “subsample” in time series analysis generally
refers to a subset of data that are in the form of multiple series, blocks, or sequences,
and the main objective of their analysis is to provide estimates of the variance of the full-
data estimator; see Carlstein [7], Fukuchi [8], Politis [9]. By contrast, the subsampling
methods discussed in this paper are used to provide estimates of model parameters based
on subsamples from the full data in situations where obtaining such estimates from the full
data is computationally prohibitive.
The remainder of this paper is organized as follows. Section 2 provides background on irregularly spaced time series and on subsampling methods. Section 3 presents full-data maximum likelihood estimation for the parameters of the IS-AR(1) model. Subsampling methods for the IS-AR(1) model are described in Section 4: a random subsampling method based on A-optimality (Section 4.1), information-based optimal subdata selection (Section 4.2), and a sequential thinning method (Section 4.3). Section 5 evaluates the techniques using simulated data. Lastly, in Section 6, we summarize the key findings of our study.
2. Background
Section 2.1 gives a brief review of irregularly spaced time series, while Section 2.2
reviews subsampling methods from the recent literature.
longer durations. Since methods that are used for modeling and forecasting regular time
series [10] are not useful for analyzing irregular time series, modified approaches have
been developed.
A recent approach for analyzing irregularly spaced time series is the gap time au-
toregressive (AR) model in (1), which was discussed in [4] for modeling house prices.
In this model, the higher the value of d j , the less useful y j−1 becomes for explaining and
predicting y j . Erdogan et al. [5] described AR type models for stationary and nonstationary
irregular time series. Similar models were used in [11] and [1] to model and forecast
future values of irregularly spaced astronomical light curves from variable stars, for which
Elorrieta et al. [12] proposed a bivariate model.
Ghysels and Jasiak [13] proposed the autoregressive conditional duration generalized
autoregressive conditionally heteroscedastic (ACD-GARCH) model for analyzing irregu-
larly spaced financial returns, employing the computationally cumbersome Generalized
Method of Moments (GMM) approach for estimating parameters. Meddahi et al. [14] pro-
posed a GARCH-type model for irregularly spaced time series by discretizing a continuous
time stochastic volatility process, thereby combining the advantages of the ACD-GARCH
model [13] and the ACD model [15]. Maller et al. [16] described a continuous version of the
GARCH (i.e., the COGARCH) model for irregularly spaced time series, while [17] proposed
a multivariate local-level model with score-driven covariance matrices for intra-day log
prices, treating the asynchronicity as a missing data problem. Recently, [6] extended the
gap time modeling idea of [4] to construct useful time series models to understand volatil-
ity patterns in irregularly spaced financial time series, considering the gaps as random
variables. An alternate stochastic volatility model treating the gaps as fixed constants was
discussed in [18].
Most methods for analyzing long irregularly spaced time series can be computationally
challenging. Subsampling methods can help us obtain estimates in a computationally
feasible way.
While the aforementioned studies are based on sampling with replacement (in which
the same data point may be included more than once in the subsample), further research
derived optimal subsampling probabilities and a distributed sampling strategy under
Poisson sampling (sampling without replacement). Wang et al. [26] further showed that
Poisson sampling can be superior to sampling with replacement in terms of estimation
efficiency. Poisson sampling for irregular time series analysis is described in Section 4.1.
Deterministic subdata selection uses a particular criterion to determine a subsample
without involving additional randomness. Wang et al. [27] introduced a novel method,
termed information-based optimal subdata selection (IBOSS), designed to select data points with
extreme values to approximate the D-optimality criterion in designed experiments. Pron-
zato and Wang [28] later proposed an online selection strategy that leverages directional
derivatives to decide whether to include a data point in a subsample, aiming to achieve
optimality according to a given criterion. This approach processes only the data points
encountered up to the current time, determining whether the current point should be
included in the subsample. As a result, it is especially well suited for applications involving
streaming data.
Despite the rapid recent developments mentioned above, applications of the stochastic
subsampling and deterministic subdata selection methods remain unexplored for irregu-
larly spaced time series.
The MLE θ̂ = arg max_θ M_m(θ) is the maximizer of (2) and must be computed numerically since no closed-form solution is available. Finding the MLE is challenging due to (a) the domain restriction (i.e., −1 < ϕ < 1) and (b) the nonconcavity of the objective function. These challenges often cause convergence issues for the classical Newton–Raphson algorithm.
Since ϕ is a correlation coefficient constrained to lie between −1 and 1, an uncon-
strained gradient-based optimization method may not work well. To remedy this, we use
Fisher’s z-transformation to map ϕ from (−1, 1) onto R so that an unconstrained optimization algorithm can be applied to the transformed parameter. Although the choice of the transfor-
mation is not unique and other transformations may be used, Fisher’s z-transformation
provides a well-understood estimator of the correlation coefficient, especially when the
error term ε j follows a normal distribution [29]. Specifically, let
$$z = \operatorname{atanh}(\phi) = \frac{1}{2}\log\frac{1+\phi}{1-\phi},$$
so that
$$\phi = \tanh(z) = \frac{\exp(2z)-1}{\exp(2z)+1}. \qquad (3)$$
We have
$$\frac{\partial \phi}{\partial z} = \operatorname{sech}^2(z) = 1-\phi^2,$$
and
$$\frac{\partial^2 \phi}{\partial z^2} = -2\tanh(z)\operatorname{sech}^2(z) = -2\phi(1-\phi^2),$$
where tanh(·) and sech(·) are, respectively, the hyperbolic tangent and secant functions.
Below, we show the use of the coordinate ascent algorithm to maximize (2) in terms of z;
we then insert the estimate ẑ into (3) to obtain the estimate of ϕ.
We denote the gradient vector and Hessian matrix of the per-observation log-likelihood ℓ(θ; y_j, y_{j−1}, d_j) in (2) as follows:
$$\dot\ell(\theta; y_j, y_{j-1}, d_j) = \frac{\partial \ell(\theta; y_j, y_{j-1}, d_j)}{\partial \theta} = \begin{pmatrix} \dot\ell_1(\theta; y_j, y_{j-1}, d_j) \\ \dot\ell_2(\theta; y_j, y_{j-1}, d_j) \end{pmatrix}, \qquad (4)$$
and
$$\ddot\ell(\theta; y_j, y_{j-1}, d_j) = \frac{\partial^2 \ell(\theta; y_j, y_{j-1}, d_j)}{\partial \theta\, \partial \theta'} = \begin{pmatrix} \ddot\ell_{11}(\theta; y_j, y_{j-1}, d_j) & \ddot\ell_{12}(\theta; y_j, y_{j-1}, d_j) \\ \ddot\ell_{12}(\theta; y_j, y_{j-1}, d_j) & \ddot\ell_{22}(\theta; y_j, y_{j-1}, d_j) \end{pmatrix}. \qquad (5)$$
The first and second derivatives of ℓ(θ; y_j, y_{j−1}, d_j) with respect to z are, respectively, denoted by
$$\dot\ell_{z1}(\theta) = (1-\phi^2)\,\dot\ell_1(\theta) \quad \text{and} \quad \ddot\ell_{z11}(\theta) = (1-\phi^2)^2\,\ddot\ell_{11}(\theta) - 2(1-\phi^2)\phi\,\dot\ell_1(\theta).$$
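These expressions are simply the chain rule applied to the reparameterization ϕ = tanh(z), stated explicitly:
$$\dot\ell_{z1}(\theta) = \frac{\partial \phi}{\partial z}\,\dot\ell_1(\theta) = (1-\phi^2)\,\dot\ell_1(\theta), \qquad \ddot\ell_{z11}(\theta) = \left(\frac{\partial \phi}{\partial z}\right)^{2}\ddot\ell_{11}(\theta) + \frac{\partial^2 \phi}{\partial z^2}\,\dot\ell_1(\theta) = (1-\phi^2)^2\,\ddot\ell_{11}(\theta) - 2\phi(1-\phi^2)\,\dot\ell_1(\theta).$$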
Given σ^{(k)}, we find the value of z in the (k + 1)-th step by implementing the Newton–Raphson algorithm, i.e., computing
$$z^{(k+1),\,l+1} = z^{(k+1),\,l} - \frac{\sum_{j=1}^{m} \dot\ell_{z1}(\phi^{(k+1),\,l}, \sigma^{(k)}; y_j, y_{j-1}, d_j)}{\sum_{j=1}^{m} \ddot\ell_{z11}(\phi^{(k+1),\,l}, \sigma^{(k)}; y_j, y_{j-1}, d_j)}.$$
$$\dot\ell_1(\theta; y_j, y_{j-1}, d_j) = \frac{d_j \phi^{2d_j-1}}{1-\phi^{2d_j}} - \frac{\phi}{1-\phi^2} + \frac{\phi e_j^2}{\sigma^2 (1-\phi^{2d_j})} + \frac{1-\phi^2}{\sigma^2 (1-\phi^{2d_j})^2}\, d_j e_j \left(\phi^{d_j-1} y_{j-1} - \phi^{2d_j-1} y_j\right),$$
$$\dot\ell_2(\theta; y_j, y_{j-1}, d_j) = -\frac{1}{\sigma} + \frac{e_j^2 (1-\phi^2)}{\sigma^3 (1-\phi^{2d_j})},$$
$$\ddot\ell_{11}(\theta; y_j, y_{j-1}, d_j) = \frac{2 d_j^2 \phi^{2d_j-2}}{(1-\phi^{2d_j})^2} - \frac{d_j \phi^{2d_j-2}}{1-\phi^{2d_j}} - \frac{1+\phi^2}{(1-\phi^2)^2} + \frac{e_j^2 - 2 d_j e_j \phi^{d_j} y_{j-1}}{\sigma^2 (1-\phi^{2d_j})} + \frac{2 d_j e_j^2 \phi^{2d_j}}{\sigma^2 (1-\phi^{2d_j})^2} + \cdots,$$
where e_j = y_j − ϕ^{d_j} y_{j−1}.
1. z ← atanh(ϕ^{(0)})
2. σ ← sqrt[ (1/m) ∑_{j=1}^{m} (y_j − ϕ^{d_j} y_{j−1})^2 (1 − ϕ^2) / (1 − ϕ^{2 d_j}) ]
3. z_o ← Inf
4. while |z − z_o| > ϵ_CA
   1. z_o ← z
   2. ∆_NR ← Inf
   3. while |∆_NR| > ϵ_NR
      1. ∆_NR ← [∑_{j=1}^{m} (1 − ϕ^2) ℓ̇_1(ϕ, σ; y_j, y_{j−1}, d_j)] / [∑_{j=1}^{m} {(1 − ϕ^2)^2 ℓ̈_{11}(ϕ, σ; y_j, y_{j−1}, d_j) − 2(1 − ϕ^2) ϕ ℓ̇_1(ϕ, σ; y_j, y_{j−1}, d_j)}]
      2. z ← z − ∆_NR
      end
   4. ϕ ← tanh(z)
   5. σ ← sqrt[ (1/m) ∑_{j=1}^{m} (y_j − ϕ^{d_j} y_{j−1})^2 (1 − ϕ^2) / (1 − ϕ^{2 d_j}) ]
   end
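As a complementary illustration (not the authors' implementation), the sketch below fits the IS-AR(1) model by maximizing the conditional Gaussian log-likelihood implied by (1) over the transformed parameters (z, log σ) with a generic quasi-Newton optimizer, instead of the hand-coded coordinate ascent above; whether the objective (2) also includes a marginal term for y_0 is not visible here, so conditioning on y_0 is an assumption of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def is_ar1_negloglik(params, y, gaps):
    """Negative conditional (on y_0) log-likelihood of the IS-AR(1) model.

    params = (z, log_sigma); phi = tanh(z) keeps phi inside (-1, 1).
    """
    z, log_sigma = params
    phi, sigma = np.tanh(z), np.exp(log_sigma)
    e = y[1:] - phi**gaps * y[:-1]                  # e_j = y_j - phi^{d_j} y_{j-1}
    v = (1.0 - phi**(2 * gaps)) / (1.0 - phi**2)    # conditional variance factor
    ll = -0.5 * np.log(2.0 * np.pi * sigma**2 * v) - e**2 / (2.0 * sigma**2 * v)
    return -np.sum(ll)

def fit_is_ar1(y, gaps, phi0=0.0, sigma0=1.0):
    """Full-data estimate via unconstrained optimization on (z, log sigma)."""
    x0 = np.array([np.arctanh(phi0), np.log(sigma0)])
    res = minimize(is_ar1_negloglik, x0, args=(y, gaps), method="BFGS")
    z_hat, log_sigma_hat = res.x
    return np.tanh(z_hat), np.exp(log_sigma_hat)

# phi_hat, sigma_hat = fit_is_ar1(y, d)   # y, d as in the earlier simulation sketch
```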
Poisson sampling cannot sample the same data point more than once. Since repeatedly
sampled data points do not contribute any new information, Poisson sampling preserves
more distinct information from the full data and is therefore more efficient than sampling
with replacement. See Wang et al. [26] for a theoretical justification of the superiority of
Poisson sampling.
Let η_j be the sampling probability and δ_j be the indicator variable for the inclusion of the j-th data point in the subsample. With u_j randomly sampled from the uniform distribution U(0, 1), we set δ_j = 1 when u_j ≤ η_j and δ_j = 0 otherwise. The actual subsample size from Poisson sampling, which is denoted by r^∗ = ∑_{j=1}^{m} δ_j, is random. The expected subsample size is r = ∑_{j=1}^{m} η_j. We obtain the subsample estimator θ̆ by maximizing the following target function:
$$M_r^{(opt)}(\theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{\delta_j\, \ell(\theta; y_j, y_{j-1}, d_j)}{\eta_j}. \qquad (6)$$
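A minimal sketch of the Poisson sampling step and of the inverse-probability weighting in (6); here loglik_terms is a placeholder for the vector of per-observation log-likelihood contributions ℓ(θ; y_j, y_{j−1}, d_j) evaluated at a candidate θ, and eta holds the inclusion probabilities η_j (how they are chosen is discussed next).

```python
import numpy as np

def poisson_subsample(eta, rng=None):
    """Poisson sampling: include point j iff u_j <= eta_j, independently across j."""
    rng = np.random.default_rng(rng)
    return rng.uniform(size=len(eta)) <= eta      # boolean delta_j; r* = delta.sum() is random

def weighted_objective(loglik_terms, delta, eta):
    """Target function (6): inverse-probability-weighted average over the selected points."""
    m = len(loglik_terms)
    return np.sum(loglik_terms[delta] / eta[delta]) / m
```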
where κ is a parameter to ensure that the expected sample size is set around r and to prevent it from being too small, the notation a ∧ b = min(a, b),
$$\ddot{M}_{r_0}^{(plt)}(\theta) = \frac{1}{r_0} \sum_{j=1}^{m} \delta_j^{(plt)}\, \ddot\ell(\theta; y_j, y_{j-1}, d_j) \qquad (8)$$
is the average Hessian matrix with the pilot subsample, and δ_j^{(plt)} is the indicator variable for inclusion of the j-th data point in the pilot sample. While the exact value of κ requires additional computation, it can be approximated by
additional computation, it can be approximated by
r
κa = n o −1
(plt)
∑m
j =1 M̈r0 (θ(plt) ) ℓ̇(θ(plt) ; y j , y j−1 , d j )
when the sampling rate r/m is small. This is usually the case in practice.
The approximated subsampling probabilities η̂_j^{(opt)} in (7) are subject to additional disturbance from the pilot estimation, and small values of η̂_j^{(opt)} may inflate the asymptotic variance of the resulting estimator. To address this problem, we mix η̂_j^{(opt)} with the uniform subsampling probabilities to prevent the sampling probabilities from getting too small. Specifically, we use
$$(1-\alpha_1)\,\hat\eta_j^{(opt)} + \alpha_1 \frac{r}{m}, \quad \text{for } j = 1, \ldots, m,$$
where α_1 ∈ (0, 1) is a tuning parameter. The final subsample estimator θ̃ is obtained by combining the pilot estimator θ^{(plt)} and the optimal subsample estimator θ̆:
$$\tilde\theta = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\theta^{(plt)}) + r^{*} \ddot{M}_r^{(opt)}(\breve\theta) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\theta^{(plt)})\, \theta^{(plt)} + r^{*} \ddot{M}_r^{(opt)}(\breve\theta)\, \breve\theta \right\}, \qquad (9)$$
where
$$\ddot{M}_r^{(opt)}(\breve\theta) = \frac{1}{m} \sum_{j=1}^{m} \frac{\delta_j\, \ddot\ell(\breve\theta; y_j, y_{j-1}, d_j)}{\hat\eta_{\alpha_1, j}^{(opt)}}.$$
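Because display (7) is not reproduced above, the following sketch assumes the usual A-optimality form η_j^{(opt)} = {κ ‖{M̈_{r_0}^{(plt)}(θ^{(plt)})}^{−1} ℓ̇(θ^{(plt)}; y_j, y_{j−1}, d_j)‖} ∧ 1 together with the approximate normalizer κ_a, and then applies the uniform mixing described above; the per-observation scores are coded directly from the analytic expressions given earlier, and the function names are illustrative.

```python
import numpy as np

def score_per_obs(phi, sigma, y, gaps):
    """Per-observation gradients (l1_dot, l2_dot) of the IS-AR(1) log-likelihood,
    following the analytic expressions given earlier; returns an (m, 2) array."""
    yj, yjm1, d = y[1:], y[:-1], gaps
    e = yj - phi**d * yjm1
    c = 1.0 - phi**(2 * d)
    l1 = (d * phi**(2 * d - 1) / c
          - phi / (1.0 - phi**2)
          + phi * e**2 / (sigma**2 * c)
          + (1.0 - phi**2) / (sigma**2 * c**2) * d * e
            * (phi**(d - 1) * yjm1 - phi**(2 * d - 1) * yj))
    l2 = -1.0 / sigma + e**2 * (1.0 - phi**2) / (sigma**3 * c)
    return np.column_stack([l1, l2])

def aopt_mixed_probabilities(phi_plt, sigma_plt, y, gaps, M_plt, r, alpha1=0.1):
    """Approximate A-optimal Poisson inclusion probabilities, mixed with uniform ones.

    M_plt is the 2x2 pilot Hessian from (8).  The A-optimal form
    eta_j = {kappa * ||M_plt^{-1} l_dot_j||} ∧ 1 with the approximation kappa_a
    is an assumption, since display (7) is not shown in this text."""
    g = score_per_obs(phi_plt, sigma_plt, y, gaps)
    norms = np.linalg.norm(np.linalg.solve(M_plt, g.T), axis=0)   # ||M^{-1} l_dot_j||
    kappa_a = r / norms.sum()
    eta_opt = np.minimum(kappa_a * norms, 1.0)
    return (1.0 - alpha1) * eta_opt + alpha1 * r / len(gaps)      # mixed probabilities
```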
where
$$c_j(\theta) = \frac{d_j \phi^{2d_j-1}}{1-\phi^{2d_j}} - \frac{\phi}{1-\phi^2}.$$
The information matrix I_j(θ | y_{j−1}, d_j) has full rank. We select data points with the r largest values of det{I_j(θ | y_{j−1}, d_j)}, j = 1, ..., m. This can be done efficiently with partition-based selection algorithms [30]. We implement the procedure via the following steps:
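As an illustration of the selection step only (not the authors' complete procedure), the r points with the largest determinants can be located in O(m) time with a partition-based selection, in the spirit of the algorithms referenced in [30]; det_I below is assumed to be a precomputed length-m array of det{I_j(θ^{(plt)} | y_{j−1}, d_j)} from (10).

```python
import numpy as np

def select_largest_det(det_I, r):
    """Return the indices of the r points with the largest det{I_j(theta | y_{j-1}, d_j)}.

    np.argpartition performs a partition-based selection in O(m) time;
    the returned indices are sorted only for readability."""
    idx = np.argpartition(det_I, len(det_I) - r)[-r:]
    return np.sort(idx)
```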
$$\tilde\theta^{(iboss)} = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(iboss)}) + r \ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)}) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(iboss)})\, \theta^{(plt)} + r \ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)})\, \breve\theta^{(iboss)} \right\}, \qquad (11)$$
where M̈_{r_0}^{(plt)}(θ) is defined in (8) and
$$\ddot{M}_r^{(iboss)}(\breve\theta^{(iboss)}) = \frac{1}{r} \sum_{j=1}^{m} \delta_j^{(iboss)}\, \ddot\ell(\breve\theta^{(iboss)}; y_j, y_{j-1}, d_j).$$
$$I_S(\theta) = \frac{1}{r_S} \sum_{j=1}^{m} \delta_j^{S}\, I_j(\theta \mid y_{j-1}, d_j), \qquad (12)$$
where r_S = ∑_{j=1}^{m} δ_j^S and I_j(θ | y_{j−1}, d_j) is defined in (10). The contribution of a data point, say (y_{j′}, y_{j′−1}, d_{j′}), to the average information matrix, if included in the subsample, can be measured by the directional derivative, which is defined as
$$\zeta_{I_S(\theta),\, j'} = \operatorname{tr}\!\left\{ I_S(\theta)^{-1} I_{j'}(\theta \mid y_{j'-1}, d_{j'}) \right\} - 2. \qquad (13)$$
$$\tilde\theta^{(thin)} = \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(thin)}) + r \ddot{M}_r^{(thin)}(\breve\theta^{(thin)}) \right\}^{-1} \left\{ r_0 \ddot{M}_{r_0}^{(plt)}(\breve\theta^{(thin)})\, \theta^{(plt)} + r \ddot{M}_r^{(thin)}(\breve\theta^{(thin)})\, \breve\theta^{(thin)} \right\}, \qquad (14)$$
where M̈_{r_0}^{(plt)}(θ) is defined in (8) and
$$\ddot{M}_r^{(thin)}(\breve\theta) = \frac{1}{r} \sum_{j=1}^{m} \delta_j^{(thin)}\, \ddot\ell(\breve\theta; y_j, y_{j-1}, d_j).$$
• Set Parameters:
  1. Set r_0 ≥ 2, q ∈ (1/2, 1), γ ∈ (0, q − 1/2).
  2. α ← r/(m − r_0)
  3. Set ϵ_1 > 0 such that ϵ_1 ≪ α
• Step 1—Pilot Estimates and Preparations:
  1. θ^{(plt)} ← arg max_θ ∑_{j=1}^{r_0} ℓ(θ; y_j, y_{j−1}, d_j)
  2. r_∗ ← 0
  3. δ_1, ..., δ_{r_0} ← 0
  4. for j = 1 to r_0
       I_j ← I_j(θ^{(plt)} | y_{j−1}, d_j)
     end
  5. Ī_∗ ← r_0^{−1} ∑_{j=1}^{r_0} I_j
  6. for j = 1 to r_0
       ζ_j ← tr(Ī_∗^{−1} I_j) − 2
     end
  7. ζ_{(1)}, ..., ζ_{(r_0)} ← sort(ζ_1, ..., ζ_{r_0}), where ζ_{(1)} ≤ ··· ≤ ζ_{(r_0)}
  8. r_0^+ ← ⌈(1 − α/2) r_0⌉, r_0^− ← ⌊(1 − 3α/2) r_0⌋ ∨ 1
  9. Ĉ_∗ ← ζ_{(⌈(1−α) r_0⌉)}
  10. β_0 ← r_0/(r_0^+ − r_0^−)
  11. h ← ζ_{(r_0^+)} − ζ_{(r_0^−)}
  12. h_∗ ← h/r_0^γ
  13. f̂_∗ ← [∑_{j=1}^{r_0} 1{|ζ_j − Ĉ_∗| ≤ h_∗}] / (2 r_0 h_∗)
• Step 2—Sequential Thinning:
  for k = r_0 + 1 to m
    1. if r − r_∗ > m − k do
         δ_j ← 1 for j = k, ..., m
         break
       else if r_∗ ≥ r do
         δ_j ← 0 for j = k, ..., m
         break
       end
    2. I_∗ ← I_k(θ^{(plt)} | y_{k−1}, d_k)
    3. ζ_∗ ← tr(Ī_∗^{−1} I_∗) − 2
    4. if (r_∗ + r_0)/k ≤ ϵ_1 or ζ_∗ ≥ Ĉ_∗ do
         δ_k ← 1
       else
         δ_k ← 0
       end
    5. Ĉ_o ← Ĉ_∗
    6. Ĉ_∗ ← Ĉ_∗ + k^{−q} {(f̂_∗)^{−1} ∨ β_0 (k − 1)^γ} (1{ζ_∗ ≥ Ĉ_∗} − α)
    7. h_∗ ← h/k^γ
    8. f̂_∗ ← f̂_∗ + k^{−q} [(2 h_∗)^{−1} 1{|ζ_∗ − Ĉ_o| ≤ h_∗} − f̂_∗]
    9. r_∗ ← r_∗ + δ_k
    10. Ī_∗ ← Ī_∗ + [δ_k/(r_∗ + r_0)] (I_∗ − Ī_∗)
  end
• Step 3—Estimate and Combine:
  1. θ̆ ← arg max_θ ∑_{j=r_0+1}^{m} δ_j ℓ(θ; y_j, y_{j−1}, d_j)
  2. Combine θ^{(plt)} and θ̆ by (14)
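A condensed Python sketch of the thinning loop in Step 2 follows, under two assumptions not confirmed by the extracted listing: the running average information matrix is denoted Ī_∗, and ζ_∗ includes the −2 shift from (13). The callable info_matrix, which returns I_k(θ^{(plt)} | y_{k−1}, d_k), and the pilot quantities C0, f0, h, beta0, Ibar0 from Step 1 are supplied by the caller.

```python
import numpy as np

def sequential_thinning(info_matrix, m, r, r0, C0, f0, h, beta0, Ibar0,
                        q=0.7, gamma=0.1, eps1=1e-3):
    """Step 2 of the sequential thinning algorithm (sketch).

    info_matrix(k) returns the 2x2 matrix I_k(theta_plt | y_{k-1}, d_k) for
    k = r0+1, ..., m (1-based, as in the listing).  Returns the boolean
    inclusion indicators for points r0+1, ..., m.
    """
    alpha = r / (m - r0)
    delta = np.zeros(m - r0, dtype=bool)
    C_star, f_star, Ibar, r_star = C0, f0, Ibar0.copy(), 0
    for k in range(r0 + 1, m + 1):
        i = k - r0 - 1
        if r - r_star > m - k:               # not enough points left: take all remaining
            delta[i:] = True
            break
        if r_star >= r:                       # budget reached: skip the rest
            break
        I_k = info_matrix(k)
        zeta = np.trace(np.linalg.solve(Ibar, I_k)) - 2.0    # directional derivative (13)
        take = (r_star + r0) / k <= eps1 or zeta >= C_star
        delta[i] = take
        # Stochastic-approximation updates of the threshold C_star and density f_star.
        C_old = C_star
        step = max(1.0 / f_star, beta0 * (k - 1) ** gamma) / k ** q
        C_star = C_star + step * (float(zeta >= C_star) - alpha)
        h_star = h / k ** gamma
        f_star = f_star + k ** (-q) * (float(abs(zeta - C_old) <= h_star) / (2 * h_star) - f_star)
        if take:                              # running mean update, applied only when delta_k = 1
            r_star += 1
            Ibar = Ibar + (I_k - Ibar) / (r_star + r0)
    return delta
```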
5. Numerical Results
We perform numerical simulations to evaluate the performance of the subsampling
methods. We present the results on estimation efficiency in Section 5.1 and the results on
computational efficiency in Section 5.2.
We compare the performance of the uniform subsample estimator (unif), the A-optimal subsample estimator (opt), the information-based optimal subdata selection estimator (iboss), and the sequential thinning estimator (thin). We set the full-data sample size as m = 10^5 and consider different values of the true parameters; we allow ϕ = 0.1, 0.5, 0.9 and σ = 1, 10, and 20. The time gaps d_j are randomly sampled from the set {1, 2, 3, 4, 5}. We consider different subsample sizes r = 5000, 10,000, 15,000, 20,000, 25,000, and 30,000, corresponding, respectively, to 5%, 10%, 15%, 20%, 25%, and 30% of the full dataset. We
let r0 = 1000. For opt, we set the mixing rate α1 = 0.1. To implement the thin subsam-
pling procedure, we set the tuning parameters q = 0.7, γ = 0.1, and ϵ1 = 0.001. We also
implement the uniform subsampling method with a subsample size r + r0 as a benchmark
for comparisons.
where θ̃^{(s)} = (ϕ̃^{(s)}, σ̃^{(s)}) is the subsample estimate in the s-th repetition. Due to the
nonconvex nature of the model, some methods failed to converge in a few repetitions. We
drop the results in these repetitions when calculating the MSE. This does not affect the
accuracy of the MSE because the nonconvergence rate is very low.
The results in terms of the empirical MSE are presented in Figure 1. It is seen that all
three proposed subsampling methods outperform the uniform sampling in all the cases.
Overall, the opt algorithm displays the best performance, especially when σ or ϕ is large.
The deterministic methods, iboss and thin, have higher estimation efficiency when σ
is small.
The varying performances in empirical MSE among the proposed subsampling meth-
ods are partly due to the different optimality criteria adopted. The A-optimality criterion
used by opt seeks to minimize the trace of the asymptotic variance matrix, whereas the
D-optimality criterion used by both iboss and thin focuses on minimizing the determinant
of the asymptotic variance matrix. As a result, opt places greater emphasis on parameter
components with larger variances, while iboss and thin distribute their focus more evenly
across all parameters. When σ is small (e.g., σ = 1), the variance in estimating σ is low
across all subsampling methods, allowing iboss and thin to express their advantage in
reducing the variance of estimating ϕ, thereby outperforming opt. However, when σ = 20,
the variance in estimating σ becomes the dominant contributor to the empirical MSE. Since
iboss and thin do not prioritize the estimation of σ, their performance weakens in this
case. A similar reasoning applies to the differing performances across various values of ϕ.
To better understand this, we further decompose the empirical MSEs for σ and ϕ to validate
our explanation.
Figure 1. Comparison of MSEs for estimating θ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Figure 2. Comparison of MSE_ϕ values for estimating ϕ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Figure 3. Comparison of MSE_σ values for estimating σ between different subsampling algorithms across varying subsample sizes r; panels correspond to ϕ = 0.1, 0.5, 0.9 (rows) and σ = 1, 10, 20 (columns).
Table 1. Average estimation times (ms) over 30 repetitions across different estimation methods, with
full-data sizes m and subsample sizes r under the setting ϕ = 0.5 and σ = 1.
The entries in the table show a substantial reduction in computation times by using
the subsampling methods compared to the time taken for full-data estimation. Unlike the
uniform subsampling method, which incurs almost no computational overhead during the
subsampling process, optimal subsampling algorithms require additional computations
to determine the inclusion probability for each data point. The numerical optimization
using subsamples has a time complexity of O(r ), whereas calculating the subsampling
probabilities for nonuniform subsampling methods requires a cost of O(m).
Compared to full-data-based estimation, both opt and iboss demonstrate exceptional
efficiency, reducing the computation time to roughly 1/40 of the full-data time on average when the sample
size m is sufficiently large. Although not as fast as these two methods, thin still achieves
significant computational savings, taking less than one-tenth of the time required for the
full-data estimation. Additionally, an advantage of using thin is that subsample selection
can be performed sequentially. Overall, all three proposed optimal subsampling methods
for the IS-AR(1) model offer reliable strategies for reducing the computational burden in
large-scale data applications.
6. Conclusions
In this paper, we investigated the technique of computationally feasible subsampling
in the context of irregularly spaced time series data. We proposed practical algorithms for
implementing the opt, iboss, and thin methods for the IS-AR(1) model. The numerical results demonstrated that the proposed subsampling methods outperform the naive uniform
subsampling approach with improved estimation efficiency and show significant benefits
in reducing the computation time compared with the full-data estimation.
While our work is currently focused on the IS-AR(1) model, it highlights the potential
of optimal subsampling methods for time series data. In future research, these techniques
can be extended to more complex models, such as IS-AR(p) for p > 1. Typically, as the num-
ber of parameters increases, optimal subsampling methods become even more effective in
reducing computation times. We hope that our work paves the way for further exploration
of subsampling strategies in time series analysis.
Author Contributions: Conceptualization, J.L. and Z.W.; methodology, J.L. and Z.W.; formal analysis,
J.L. and Z.W.; investigation, J.L. and Z.W.; writing—original draft preparation, Z.W.; writing—review
and editing, N.R. and H.W.; supervision, N.R. and H.W.; project administration, N.R. and H.W. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Data used for the paper are computer-generated. Codes for generating
the data are available upon reasonable request.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Elorrieta, F.; Eyheramendy, S.; Palma, W. Discrete-time autoregressive model for unequally spaced time-series observations.
Astron. Astrophys. 2019, 627, A120. [CrossRef]
2. Mudelsee, M. Trend analysis of climate time series: A review of methods. Earth-Sci. Rev. 2019, 190, 310–322. [CrossRef]
3. Dutta, C.; Karpman, K.; Basu, S.; Ravishanker, N. Review of statistical approaches for modeling high-frequency trading data.
Sankhya B 2023, 85, 1–48. [CrossRef]
4. Nagaraja, C.H.; Brown, L.D.; Zhao, L.H. An autoregressive approach to house price modeling. Ann. Appl. Stat. 2011, 5, 124–149.
[CrossRef]
5. Erdogan, E.; Ma, S.; Beygelzimer, A.; Rish, I. Statistical models for unequally spaced time series. In Proceedings of the 2005 SIAM
International Conference on Data Mining. SIAM, Newport Beach, CA, USA, 21–23 April 2005; pp. 626–630.
6. Anantharaman, S.; Ravishanker, N.; Basu, S. Hierarchical modeling of irregularly spaced financial returns. Stat 2024, 13, e692.
[CrossRef]
7. Carlstein, E. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann. Stat.
1986, 14, 1171–1179. [CrossRef]
8. Fukuchi, J.I. Subsampling and model selection in time series analysis. Biometrika 1999, 86, 591–604. [CrossRef]
9. Politis, D.N. Scalable subsampling: Computation, aggregation and inference. Biometrika 2023, 111, 347–354. [CrossRef]
10. Shumway, R. Time Series Analysis and Its Applications; Springer: Berlin/Heidelberg, Germany, 2000.
11. Eyheramendy, S.; Elorrieta, F.; Palma, W. An autoregressive model for irregular time series of variable stars. Proc. Int. Astron.
Union 2016, 12, 259–262. [CrossRef]
12. Elorrieta, F.; Eyheramendy, S.; Palma, W.; Ojeda, C. A novel bivariate autoregressive model for predicting and forecasting
irregularly observed time series. Mon. Not. R. Astron. Soc. 2021, 505, 1105–1116. [CrossRef]
13. Ghysels, E.; Jasiak, J. GARCH for irregularly spaced financial data: The ACD-GARCH model. Stud. Nonlinear Dyn. Econom. 1998,
2, 1–19. [CrossRef]
14. Meddahi, N.; Renault, E.; Werker, B. GARCH and irregularly spaced data. Econ. Lett. 2006, 90, 200–204. [CrossRef]
15. Engle, R.F.; Russell, J.R. Autoregressive conditional duration: A new model for irregularly spaced transaction data. Econometrica
1998, 66, 1127–1162. [CrossRef]
16. Maller, R.A.; Müller, G.; Szimayer, A. GARCH modelling in continuous time for irregularly spaced time series data. Bernoulli
2008, 14, 519–542. [CrossRef]
17. Buccheri, G.; Bormetti, G.; Corsi, F.; Lillo, F. A score-driven conditional correlation model for noisy and asynchronous data: An
application to high-frequency covariance dynamics. J. Bus. Econ. Stat. 2021, 39, 920–936. [CrossRef]
18. Dutta, C. Modeling Multiple Irregularly Spaced High-Frequency Financial Time Series. Ph.D. Thesis, University of Connecticut,
Storrs, CT, USA, 2022.
19. Drineas, P.; Mahoney, M.W.; Muthukrishnan, S. Sampling algorithms for l2 regression and applications. In Proceedings of the
17th Annual ACM-SIAM Symposium on Discrete Algorithms, Miami, FL, USA, 22–24 January 2006; pp. 1127–1136.
20. Yang, T.; Zhang, L.; Jin, R.; Zhu, S. An explicit sampling dependent spectral error bound for column subset selection. In
Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015; Volume 37, pp. 135–143.
21. Ma, P.; Mahoney, M.W.; Yu, B. A statistical perspective on algorithmic leveraging. J. Mach. Learn. Res. 2015, 16, 861–911.
22. Xie, R.; Wang, Z.; Bai, S.; Ma, P.; Zhong, W. Online decentralized leverage score sampling for streaming multidimensional time
series. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, Okinawa, Japan, 16–18 April
2019; Volume 89, pp. 2301–2311.
23. Zhu, R. Gradient-based sampling: An adaptive importance sampling for least-squares. Adv. Neural Inf. Process. Syst. 2018,
29, 406–414.
24. Wang, H.; Zhu, R.; Ma, P. Optimal subsampling for large sample logistic regression. J. Am. Stat. Assoc. 2018, 113, 829–844.
[CrossRef]
25. Teng, G.; Tian, B.; Zhang, Y.; Fu, S. Asymptotics of Subsampling for Generalized Linear Regression Models under Unbounded
Design. Entropy 2022, 25, 84. [CrossRef]
26. Wang, J.; Zou, J.; Wang, H. Sampling with replacement vs Poisson sampling: A comparative study in optimal subsampling. IEEE
Trans. Inf. Theory 2022, 68, 6605–6630. [CrossRef]
27. Wang, H.; Yang, M.; Stufken, J. Information-based optimal subdata selection for big data linear regression. J. Am. Stat. Assoc.
2019, 114, 393–405. [CrossRef]
28. Pronzato, L.; Wang, H. Sequential online subsampling for thinning experimental designs. J. Stat. Plan. Inference 2021, 212, 169–193.
[CrossRef]
29. Casella, G.; Berger, R. Statistical Inference; CRC Press: Boca Raton, FL, USA, 2024.
30. Kleinberg, J.; Tardos, E. Algorithm Design; Pearson/Addison-Wesley: Hoboken, NJ, USA, 2006.
31. Wynn, H. Optimum Submeasures with Applications to Finite Population Sampling; Academic Press: Cambridge, MA, USA, 1982.
32. Fedorov, V.V.; Hackl, P. Model-Oriented Design of Experiments; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.