
One shot schemes for decentralized quickest change detection

2009, IEEE Transactions on Information Theory

This work considers the problem of quickest detection with N distributed sensors that receive sequential observations, either in discrete or in continuous time, from the environment. These sensors employ cumulative sum (CUSUM) strategies and communicate with a central fusion center by one shot schemes. One shot schemes are schemes in which the sensors communicate with the fusion center only once, and this single communication signals a detection. The communication is clearly asynchronous, and the case is considered in which the fusion center employs a minimal strategy, which means that it declares an alarm when the first communication takes place. It is assumed that the observations received at the sensors are independent and that the time points at which the appearance of a signal can take place may differ across sensors. Both the case of the same signal distribution and that of different signal distributions across sensors are considered. It is shown that there is no loss of performance of one shot schemes as compared to the centralized case in an extended Lorden min-max sense, since the minimum of N CUSUMs is asymptotically optimal as the mean time between false alarms increases without bound. In the case of different signal distributions the optimal threshold parameters are explicitly computed.

One shot schemes for decentralized quickest change detection

Olympia Hadjiliadis, Department of Mathematics, Brooklyn College, C.U.N.Y. Email: [email protected]
Hongzhong Zhang, Department of Mathematics, Graduate Center, C.U.N.Y. Email: [email protected]
H. Vincent Poor, Department of Electrical Engineering, Princeton University. Email: [email protected]

Session: Distributed inference and decision-making in multisensor systems. Organizers: Alexander Tartakovsky and Venugopal Veeravalli.

Abstract— This work considers the problem of quickest detection with N distributed sensors that receive continuous sequential observations from the environment. These sensors employ cumulative sum (CUSUM) strategies and communicate with a central fusion center by one shot schemes. One shot schemes are schemes in which the sensors communicate with the fusion center only once, and this single communication signals a detection. The communication is clearly asynchronous, and the case is considered in which the fusion center employs a minimal strategy, which means that it declares an alarm when the first communication takes place. It is assumed that the observations received at the sensors are independent and that the time points at which the appearance of a signal can take place are different. It is shown that there is no loss of performance of one shot schemes as compared to the centralized case in an extended Lorden min-max sense, since the minimum of N CUSUMs is asymptotically optimal as the mean time between false alarms increases without bound.

Keywords: One shot schemes, CUSUM, quickest detection

I. INTRODUCTION

The problem of decentralized sequential detection with data fusion dates back to the 1980s with the works of [1] and [2]. We are interested in the problem of quickest detection in an N-sensor network in which the information available is distributed and decentralized, a problem introduced in [16].
We consider the situation in which the onset of a signal can occur at different times in the N sensors; that is, the change points can be different for each of the N sensors. We assume that each sensor runs a cumulative sum (CUSUM) algorithm, as suggested in [7], [11]–[14], and communicates with a central fusion center only when it is ready to signal an alarm. In other words, each sensor communicates with the central fusion center through a one shot scheme. We assume that the N sensors receive independent observations, an assumption consistent with the fact that the N change points can be different. So far in the literature (see [7], [11]–[14]) it has been assumed that the change points are the same across sensors. In this paper we consider the case in which the central fusion center employs a minimal strategy, that is, it reacts when the first communication from the sensors takes place.¹

Fig. 1: One shot communication in a decentralized system of N sensors.

We demonstrate that, in the situation described above, at least asymptotically, there is no loss of information at the fusion center by employing the minimal one shot scheme. That is, we demonstrate that the minimum of N CUSUMs is asymptotically optimal in detecting the minimum of the N different change points, as the mean time between false alarms tends to ∞, with respect to an appropriately extended Lorden criterion [5] that incorporates the possibility of N different change points. As an observation model we consider a continuous time Brownian motion model, which is a good approximation to reality for measurements taken at a high rate.

¹ This research was supported in part by the U.S. National Science Foundation under Grants ANI-03-38807, CNS-06-25637 and CCF-07-28208.
Moreover, given a high rate of observations from any distribution, the central limit theorem asserts that sums of such observations are approximately normally distributed, and therefore the Brownian motion model is a plausible model for such situations. The communication structure considered in this paper is summarized in Figure 1, in which T_i, i = 1, ..., N, denote the stopping times associated with alarms at the sensors S_i, i = 1, ..., N, respectively.

In the next section we formulate the problem and demonstrate asymptotic optimality (as the mean time between false alarms tends to ∞), in an extended min-max Lorden sense, of the minimum of N CUSUM stopping times in the case of centralized detection. We then argue that this result implies no loss in performance of the one shot minimal strategy employed by the fusion center in the case of decentralized detection. We finally discuss an extension of these results to the case of correlated sensors.

II. THE CENTRALIZED PROBLEM

We sequentially observe the processes {ξ_t^{(i)}; t ≥ 0} for all i = 1, ..., N, with the following dynamics:

  dξ_t^{(i)} = dw_t^{(i)} for t ≤ τ_i, and dξ_t^{(i)} = μ dt + dw_t^{(i)} for t > τ_i,   (1)

where μ > 0 is known², the {w_t^{(i)}} are independent standard Brownian motions, and the τ_i's are unknown constants. An appropriate measurable space is Ω = C[0, ∞) × C[0, ∞) × ... × C[0, ∞) and F = ∪_{t>0} F_t, where {F_t} is the filtration of the observations with F_t = σ{(ξ_s^{(1)}, ..., ξ_s^{(N)}); s ≤ t}. Notice that in the case of centralized detection the filtration consists of the totality of the observations that have been received up until the specific point in time t. On this space, we have the following family of probability measures {P_{τ_1,...,τ_N}}, where P_{τ_1,...,τ_N} corresponds to the measure generated on Ω by the processes (ξ_t^{(1)}, ..., ξ_t^{(N)}) when the change in the N-tuple process occurs at the time points τ_i, i = 1, ..., N.

² Due to the symmetry of Brownian motion, we can assume μ > 0 without loss of generality.
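To make the observation model concrete, the dynamics (1) can be simulated with a simple Euler discretization. The following sketch is an added illustration, not part of the paper; the sensor count, drift μ, change points τ_i and step size are arbitrary choices.

```python
import numpy as np

def simulate_sensors(n_sensors=3, mu=1.0, taus=(5.0, 8.0, 11.0),
                     horizon=20.0, dt=0.01, rng=None):
    """Euler-discretized paths of model (1): each sensor observes pure
    Brownian noise before its own change point tau_i, and Brownian
    motion with drift mu afterwards."""
    rng = np.random.default_rng(rng)
    n_steps = int(round(horizon / dt))
    t = np.arange(1, n_steps + 1) * dt
    # Brownian increments dw ~ N(0, dt) for every sensor and time step
    dw = rng.normal(0.0, np.sqrt(dt), size=(n_sensors, n_steps))
    # the drift mu*dt is switched on only after the sensor's own change point
    drift = mu * dt * (t[None, :] > np.asarray(taus)[:, None])
    return t, np.cumsum(dw + drift, axis=1)

t, xi = simulate_sensors(rng=0)
```

Each row of `xi` is one sensor's observation path; the change points are deliberately distinct, matching the setting of the paper.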
Notice that the measure P_{∞,...,∞} corresponds to the measure generated on Ω by N independent standard Brownian motions. Our objective is to find a stopping rule T that balances the trade-off between a small detection delay and a lower bound on the mean time between false alarms, and that will ultimately detect min{τ_1, ..., τ_N}³. As a performance measure we consider

  J_M^{(N)}(T) = sup_{τ_1,...,τ_N} esssup E_{τ_1,...,τ_N}{(T − τ_1 ∧ ... ∧ τ_N)⁺ | F_{τ_1 ∧ ... ∧ τ_N}},   (2)

where the supremum over τ_1, ..., τ_N is taken over the set in which min{τ_1, ..., τ_N} < ∞. That is, we consider the worst detection delay over all possible realizations of paths of the N-tuple of stochastic processes (ξ_t^{(1)}, ..., ξ_t^{(N)}) up to min{τ_1, ..., τ_N}, and then consider the worst detection delay over all possible N-tuples {τ_1, ..., τ_N} over a set in which at least one of them is forced to take a finite value. This is because T is a stopping rule meant to detect the minimum of the N change points, and therefore, if one of the N processes undergoes a regime change, any unit of time by which T delays in reacting should be counted towards the detection delay.

³ In what follows we use τ_1 ∧ ... ∧ τ_N to denote min{τ_1, ..., τ_N}.

The criterion presented in (2) results in the corresponding stochastic optimization problem

  inf_T J_M^{(N)}(T)  subject to  E_{∞,...,∞}{T} ≥ γ.   (3)

We notice that the expectation in the above constraint is taken under the measure P_{∞,...,∞}. This is the measure generated on the space Ω in the case that none of the N processes (ξ_t^{(1)}, ..., ξ_t^{(N)}) changes regime. Therefore, E_{∞,...,∞}{T} is the mean time between false alarms.

In the presence of only one stochastic process (say {ξ_t^{(1)}}), the problem becomes one of detecting a one-sided change in a sequence of Brownian observations, or in a vector of observations (ξ_t^{(1)}, ..., ξ_t^{(N)}) with the same change points, whose optimal solution was found in [3] and [15]. The optimal solution is the continuous time version of Page's CUSUM stopping rule, namely the first passage time of the process

  y_t^{(1)} = u_t^{(1)} − m_t^{(1)} = sup_{0≤τ_1≤t} log (dP_{τ_1}/dP_∞)|_{F_t},   (4)

where

  u_t^{(1)} = μ ξ_t^{(1)} − (1/2) μ² t,   (5)

and

  m_t^{(1)} = inf_{0≤s≤t} u_s^{(1)}.   (6)

The CUSUM stopping rule is thus

  T_ν = inf{t ≥ 0; y_t^{(1)} ≥ ν},   (7)

where ν is chosen so that E_∞{T_ν} = (2/μ²) f(ν) = γ, with f(ν) = e^ν − ν − 1 (see for example [4]), and

  J_M^{(1)}(T_ν) = E_0{T_ν} = (2/μ²) f(−ν).   (8)

The fact that the worst detection delay is the same as that incurred when the change point is exactly 0 is a consequence of the non-negativity of the CUSUM process, from which it follows that the worst detection delay occurs when the CUSUM process at the time of the change is at 0 [4]. We remark here that if the N change points were the same, then problem (3) would be equivalent to observing only one stochastic process, which is now N-dimensional; thus, in this case, the detection delay and the mean time between false alarms are given by the formulas in the above paragraph.

Returning to problem (3), it is easily seen that in seeking solutions to this problem we can restrict our attention to stopping times that achieve the false alarm constraint with equality [8]. The optimality of the CUSUM stopping rule in the presence of only one observation process suggests that a CUSUM type of stopping rule might display similar optimality properties in the case of multiple observation processes. In particular, an intuitively appealing rule, when the detection of min{τ_1, ..., τ_N} is of interest, is T_h = T_h^1 ∧ ... ∧ T_h^N, where T_h^i is the CUSUM stopping rule for the process {ξ_t^{(i)}; t ≥ 0}, i = 1, ..., N. That is, we use what is known as a multi-chart CUSUM stopping time [10], which can be written as

  T_h = inf{t ≥ 0; max{y_t^{(1)}, ..., y_t^{(N)}} ≥ h},   (9)

where

  y_t^{(i)} = sup_{0≤τ_i≤t} log (dP_{τ_i}/dP_∞)|_{F_t} = μ ξ_t^{(i)} − (1/2) μ² t − inf_{s≤t} (μ ξ_s^{(i)} − (1/2) μ² s),

and the P_{τ_i} are the restrictions of the measure P_{τ_1,...,τ_N} to C[0, ∞).

It is easily seen that

  J_M^{(N)}(T_h) = E_{0,∞,...,∞}{T_h} = E_{∞,0,∞,...,∞}{T_h} = ... = E_{∞,...,∞,0}{T_h}.

This is because the worst detection delay occurs when at least one of the N processes does not change regime. The reason for this lies in the fact that the CUSUM process is a monotone function of μ, resulting in a longer average passage time if μ = 0 [9]. That is, the worst detection delay occurs when none of the other processes changes regime and, due to the non-negativity of the CUSUM process, when the CUSUM statistic of the remaining process is exactly at 0 at the time of the change. Notice that the threshold h is used for the multi-chart CUSUM stopping rule (9) in order to distinguish it from the threshold ν used for the one sided CUSUM stopping rule (7).

In what follows we demonstrate asymptotic optimality of (9) as γ → ∞. In view of the discussion in the previous paragraph, in order to assess the optimality properties of the multi-chart CUSUM rule (9) we need to begin by evaluating E_{0,∞,...,∞}{T_h} and E_{∞,...,∞}{T_h}. Since the processes ξ_t^{(i)}, i = 1, ..., N, are independent, it is possible to obtain a closed form expression through the formula

  E_{0,∞,...,∞}{T_h} = ∫₀^∞ P_{0,∞,...,∞}(T_h > t) dt
      = ∫₀^∞ P_{0,∞,...,∞}({T_h^1 > t} ∩ ... ∩ {T_h^N > t}) dt
      = ∫₀^∞ P_0(T_h^1 > t) [P_∞(T_h^1 > t)]^{N−1} dt.   (10)

Similarly,

  E_{∞,...,∞}{T_h} = ∫₀^∞ [P_∞(T_h^1 > t)]^N dt,   (11)

where {T_h^i > t} = {sup_{0≤s≤t} y_s^{(i)} < h}. In other words, the evaluation of (10) and (11) is possible through the probability density function of the random variable sup_{0≤s≤t} y_s^{(i)} for arbitrary fixed t, which appears in [6].

In order to demonstrate asymptotic optimality of (9), we bound the detection delay J_M^{(N)} of the unknown optimal stopping rule T* from above by

  E_{0,∞,...,∞}{T_h} > J_M^{(N)}(T*),   (12)

where h is chosen so that

  E_{∞,...,∞}{T_h} = γ.   (13)

It is also obvious that J_M^{(N)}(T*) is bounded from below by the detection delay of one CUSUM when there is only one observation process, in view of the fact that

  sup_{τ_1,...,τ_N} esssup E_{τ_1,...,τ_N}{(T − τ_1 ∧ ... ∧ τ_N)⁺ | F_{τ_1∧...∧τ_N}} ≥ sup_{τ_1} esssup E_{τ_1}{(T − τ_1)⁺ | F_{τ_1}}.

The stopping time that minimizes sup_{τ_1} esssup E_{τ_1}{(T − τ_1)⁺ | F_{τ_1}} is the CUSUM stopping rule T_ν of (7), with ν chosen so as to satisfy

  E_∞{T_ν} = γ.   (14)

We will demonstrate that the difference between the upper and lower bounds

  E_{0,∞,...,∞}{T_h} > J_M^{(N)}(T*) > E_0{T_ν},   (15)

is bounded by a constant as γ → ∞, with h and ν satisfying (13) and (14), respectively.

Lemma 1: We have

  E_{0,∞,...,∞}{T_h} = (2/μ²) [log γ + log(N μ²/2) − 1 + o(1)],   (16)

as γ → ∞.

Proof: Please refer to the Appendix for a sketch of the proof.

Moreover, it is easily seen from (8) that

  E_0{T_ν} = (2/μ²) [log γ + log(μ²/2) − 1 + o(1)].   (17)

Thus we have the following result.

Theorem 1: The difference between the detection delay J_M^{(N)} of the unknown optimal stopping rule T* and the detection delay of T_h of (9), with h satisfying (13), is bounded above by (2/μ²) log N, as γ → ∞.

Proof: The proof follows from Lemma 1 and (17).

The upper and lower bounds on the detection delay for the optimal stopping rule, when μ is 1/2, 1 and 2, in the case N = 2, are shown in Figure 2.

The consequence of Theorem 1 is the asymptotic optimality of (9) in the case in which all of the information becomes directly available through the filtration {F_t} at the fusion center. We notice, however, that this asymptotic optimality holds for any finite number of sensors N.
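The calibration E_∞{T_ν} = (2/μ²)f(ν) = γ and the delay formula (8) can be checked numerically. The sketch below is an added illustration, not the paper's: it inverts f by bisection and compares the exact worst-case delay (2/μ²)f(−ν) with the asymptotic expression in (17); the values of γ and μ used are arbitrary.

```python
import math

def f(x):
    # f(x) = e^x - x - 1, as in the text following (7)
    return math.exp(x) - x - 1

def calibrate_nu(gamma, mu, tol=1e-12):
    """Solve (2/mu^2) f(nu) = gamma for the CUSUM threshold nu by bisection."""
    target = mu * mu * gamma / 2.0
    lo, hi = 0.0, 1.0
    while f(hi) < target:      # f is increasing on [0, inf); bracket the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def worst_case_delay(gamma, mu):
    # exact worst-case detection delay, eq. (8)
    return 2.0 / mu**2 * f(-calibrate_nu(gamma, mu))

def delay_asymptote(gamma, mu):
    # asymptotic expansion, eq. (17)
    return 2.0 / mu**2 * (math.log(gamma) + math.log(mu**2 / 2.0) - 1.0)
```

For large γ (say γ = 10⁶ with μ = 1) the exact delay and the asymptote agree to well within 10⁻³, which is the behavior the o(1) term in (17) describes.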
We now discuss the implications of the above result for decentralized detection in the case of one shot schemes.

Fig. 2: Upper and lower bounds on the detection delay (DD) for the optimal stopping rule: (a) μ = 1/2, (b) μ = 1, (c) μ = 2. (Note that the differences between the upper and lower bounds all remain bounded as γ increases.)

III. DECENTRALIZED DETECTION

Let us now suppose that each of the observation processes {ξ_t^{(i)}} becomes sequentially available at its corresponding sensor S_i, which then devises an asynchronous communication scheme to the central fusion center. In particular, sensor S_i communicates with the central fusion center only when it wants to signal an alarm, which is elicited according to a CUSUM rule T_h^{(i)} of (7). Once again the observations received at the N sensors are independent and can change dynamics at distinct unknown points τ_i. The fusion center, whose objective is to detect the first time when there is a change, devises a minimal strategy; that is, it declares that a change has occurred at the first instance when one of the sensors communicates an alarm.

The implication of Theorem 1 is that in fact this strategy is the best that the fusion center can devise, and that there is no loss in performance between the case in which the fusion center receives the raw data {ξ_t^{(1)}, ..., ξ_t^{(N)}} directly and the case in which the communication is limited to the one shown in Figure 1. To see this, note that the detection delay of the stopping rule T_h = T_h^{(1)} ∧ ... ∧ T_h^{(N)} is equal to E_{0,∞,...,∞}{T_h} when S_1 is the first to signal an alarm, E_{∞,0,...,∞}{T_h} when S_2 signals first, and so on, all of which are equal due to the assumed symmetry in the signal strength μ received at each of the sensors S_i when a change occurs. The mean time between false alarms for the fusion center that devises the rule T_h = T_h^{(1)} ∧ ... ∧ T_h^{(N)} is thus E_{∞,...,∞}{T_h}.
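The equivalence between the minimal one shot strategy and the centralized multi-chart rule can be seen pathwise: the first time any local CUSUM statistic reaches h coincides with the first time the maximum of the N statistics reaches h, which is exactly rule (9). The discretized sketch below is an added illustration (not the authors' code; all parameter choices are arbitrary) and checks this identity on simulated paths of model (1):

```python
import numpy as np

def cusum_stat(xi, mu, dt):
    """Discretized CUSUM statistic y_t = u_t - m_t of (4)-(6),
    with u_t = mu*xi_t - 0.5*mu^2*t and m_t the running minimum of u
    (including its initial value 0)."""
    t = np.arange(1, xi.size + 1) * dt
    u = mu * xi - 0.5 * mu**2 * t
    m = np.minimum.accumulate(np.minimum(u, 0.0))
    return u - m

def first_crossing(y, h):
    """Index of the first time y >= h, or None if no alarm on this path."""
    idx = np.flatnonzero(y >= h)
    return int(idx[0]) if idx.size else None

# simulate N = 3 sensors, each with its own change point
rng = np.random.default_rng(7)
mu, dt, h = 1.0, 0.01, 4.0
taus, n_steps = [2.0, 6.0, 4.0], 4000
paths = []
for tau in taus:
    t = np.arange(1, n_steps + 1) * dt
    dw = rng.normal(0.0, np.sqrt(dt), n_steps)
    paths.append(np.cumsum(dw + mu * dt * (t > tau)))

# one shot scheme: each sensor reports only its own alarm time ...
local_alarms = [first_crossing(cusum_stat(xi, mu, dt), h) for xi in paths]
fusion_alarm = min(a for a in local_alarms if a is not None)

# ... and the minimal strategy reproduces exactly the multi-chart rule (9)
y_max = np.max([cusum_stat(xi, mu, dt) for xi in paths], axis=0)
assert first_crossing(y_max, h) == fusion_alarm
```

The final assertion is the pathwise identity min_i T_h^i = T_h on the discretization grid: no information beyond the one shot alarm messages is needed to implement (9).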
But Theorem 1 asserts that this rule, namely T_h, is asymptotically optimal as the mean time between false alarms tends to ∞ in the centralized case for any finite N. In other words, the CUSUM stopping rules T_h^{(1)}, T_h^{(2)}, ..., T_h^{(N)} are sufficient statistics (at least asymptotically) for the problem of quickest detection of (3).

IV. POSSIBLE EXTENSIONS

An interesting extension corresponds to the case in which the signal strengths μ are different in each sensor after the change. That is, after the change the signal in S_i is μ_i, with μ_1 ≠ μ_2 ≠ ... ≠ μ_N. In this case it is not clear what the optimal choice of thresholds is, but it is possible that the thresholds {h_i} should be chosen so that

  E_{0,∞,...,∞}{T_C^N} = E_{∞,0,∞,...,∞}{T_C^N} = ... = E_{∞,...,∞,0}{T_C^N},

where T_C^N = T_{h_1}^1 ∧ ... ∧ T_{h_N}^N.

A further interesting extension corresponds to the case of correlated sensors. To demonstrate this case, let us begin by assuming that N = 2. This case corresponds to (1), but with

  E{w_t^{(1)} w_s^{(2)}} = ρ min{s, t}, for all s, t ≥ 0.   (18)

This case becomes significantly more difficult because of the presence of local time in the dynamics of the process max{y_t^{(1)}, y_t^{(2)}}. Nevertheless, it is possible to derive a formula for the expected delay of T_h under the measure P_{∞,∞}. This expression is given by

  E_{∞,∞}{T_h} = (2/μ²)(e^h − h − 1) − 2(1 − ρ) E{ ∫₀^T (e^{y_s^{(1)}} − 1) δ(y_s^{(1)} − y_s^{(2)}) ds },   (19)

where δ denotes the Dirac delta function and the final term in this expression corresponds to the collision local time of the processes y_t^{(1)} and y_t^{(2)}, weighted by the factor (e^{y_t^{(1)}} − 1). The difficulty in the use of expression (19) is the fact that as ρ changes, the expected value of the collision local time term, which is the last term in (19), also changes. Moreover, the expression for the first moment of T_h becomes significantly more complicated under the measure P_{0,∞}.

REFERENCES

[1] M. M. Al-Ibrahim and P. K. Varshney, A simple multi-sensor sequential detection procedure, Proceedings of the 27th IEEE Conference on Decision and Control, Austin, Texas, pp. 2479-2483, December 1988.
[2] M. M. Al-Ibrahim and P. K. Varshney, A decentralized sequential test with data fusion, Proceedings of the 1989 American Control Conference, Pittsburgh, Pennsylvania, pp. 1321-1325, June 1989.
[3] M. Beibel, A note on Ritov's Bayes approach to the minimax property of the CUSUM procedure, Annals of Statistics, Vol. 24, No. 2, pp. 1804-1812, 1996.
[4] O. Hadjiliadis and G. V. Moustakides, Optimal and asymptotically optimal CUSUM rules for change point detection in the Brownian motion model with multiple alternatives, Theory of Probability and Its Applications, Vol. 50, No. 1, pp. 131-144, 2006.
[5] G. Lorden, Procedures for reacting to a change in distribution, Annals of Mathematical Statistics, Vol. 42, No. 6, pp. 1897-1908, 1971.
[6] M. Magdon-Ismail, A. F. Atiya, A. Pratap and Y. S. Abu-Mostafa, On the maximum drawdown of a Brownian motion, Journal of Applied Probability, Vol. 41, No. 1, pp. 147-161, 2004.
[7] G. V. Moustakides, Decentralized CUSUM change detection, Proceedings of the 9th International Conference on Information Fusion (ICIF), Florence, Italy, pp. 1-6, 2006.
[8] G. V. Moustakides, Optimal stopping times for detecting changes in distributions, Annals of Statistics, Vol. 14, No. 4, pp. 1379-1387, 1986.
[9] H. V. Poor and O. Hadjiliadis, Quickest Detection, Cambridge University Press, Cambridge, UK (to appear).
[10] A. G. Tartakovsky, Asymptotic performance of a multichart CUSUM test under false alarm probability constraint, Proceedings of the 44th IEEE Conference on Decision and Control, Seville, Spain, December 12-15, 2005, pp. 320-325.
[11] A. G. Tartakovsky and H. Kim, Performance of certain decentralized distributed change detection procedures, Proceedings of the 9th International Conference on Information Fusion, Florence, Italy, July 8-10, 2006.
[12] A. G. Tartakovsky and V. V. Veeravalli, Quickest change detection in distributed sensor systems, Proceedings of the 6th Conference on Information Fusion, Cairns, Australia, July 8-11, 2003.
[13] A. G. Tartakovsky and V. V. Veeravalli, Change-point detection in multichannel and distributed systems with applications, in Applications of Sequential Methodologies, pp. 331-363 (N. Mukhopadhyay, S. Datta and S. Chattopadhyay, Eds.), Marcel Dekker, New York, 2004.
[14] A. G. Tartakovsky and V. V. Veeravalli, Asymptotically optimum quickest change detection in distributed sensor systems, Sequential Analysis, to appear.
[15] A. N. Shiryaev, Minimax optimality of the method of cumulative sums (CUSUM) in the continuous case, Russian Mathematical Surveys, Vol. 51, No. 4, pp. 750-751, 1996.
[16] V. V. Veeravalli, Decentralized quickest change detection, IEEE Transactions on Information Theory, Vol. 47, No. 4, pp. 1657-1665, 2001.

V. APPENDIX

As an illustration of the general case, let us prove the result for N = 2. We begin by deriving expressions for E_{0,∞}{T_h} and E_{∞,∞}{T_h} using the results in [6]. For all h > 2 we have

  E_{0,∞}{T_h} = S1(h) + S2(h) + S3(h)  and  E_{∞,∞}{T_h} = S4(h) + S5(h) + S6(h),

where

  S1(h) = (32/μ²) Σ_{i,j≥1} sin³φ_i sin³θ_j cos²φ_i cos²θ_j / [(φ_i − sinφ_i cosφ_i)(θ_j − sinθ_j cosθ_j)(cos²φ_i + cos²θ_j)],

  S2(h) = −(32/μ²) [sinh³η/(sinhη coshη − η)] Σ_{i≥1} [sin³φ_i cos²φ_i/(φ_i − sinφ_i cosφ_i)] · [cos²φ_i/(cos²φ_i + cosh²η)],

  S3(h) = (32/μ²) [sinh³η/(sinhη coshη − η)] Σ_{i≥1} sin³φ_i cos²φ_i/(φ_i − sinφ_i cosφ_i),

  S4(h) = (32 e^{−h}/μ²) Σ_{i,j≥1} sin³θ_i sin³θ_j cos²θ_i cos²θ_j / [(θ_i − sinθ_i cosθ_i)(θ_j − sinθ_j cosθ_j)(cos²θ_i + cos²θ_j)],

  S5(h) = −(64 e^{−h}/μ²) [sinh³η/(sinhη coshη − η)] Σ_{i≥1} [sin³θ_i cos²θ_i/(θ_i − sinθ_i cosθ_i)] · [cosh²η/(cos²θ_i + cosh²η)],

  S6(h) = (16 e^{−h}/μ²) sinh⁶η cosh²η/(sinhη coshη − η)²,

and where, for h > 2, {φ_n}, {θ_n} and η are the positive solutions of

  tan φ_n = −(2/h) φ_n < 0,  tan θ_n = (2/h) θ_n > 0,  tanh η = (2/h) η > 0.   (20)

The idea is then to show that S1(h), S2(h), S4(h) and S5(h) converge to zero, and to examine how S3(h) and S6(h) behave as h → ∞. In the following paragraphs we analyze these terms in the order S6(h) → S3(h) → S2(h) → S5(h) → S1(h) → S4(h).

First notice that for large h, η is large and close to h/2. Moreover,

  e^{2η−h} = 1 − 4η e^{−2η} + o(e^{−3η}),

which helps us to compare η with h. For S6(h),

  S6(h) = (16/μ²) e^{−h} sinh⁴η (1 − η sinh^{−1}η cosh^{−1}η)^{−2}
        = (1/μ²) e^{−h} [e^{4η} + (8η − 4) e^{2η} + o(e^{η})]
        = (1/μ²) [e^h e^{4η−2h} + (8η − 4) e^{2η−h} + o(e^{−η})].

By the expansion of e^{2η−h}, the first term is

  e^h e^{4η−2h} = e^h [e^{2η−h}]² = e^h [1 − 4η e^{−2η} + o(e^{−3η})]² = e^h [1 − 8η e^{−2η} + o(e^{−3η})] = e^h − 8η e^{h−2η} + o(e^{−η}) = e^h − 8η + o(e^{−η}),

and the second term is

  (8η − 4) e^{2η−h} = (8η − 4)[1 + o(e^{−η})] = 8η − 4 + o(e^{−η}),

so

  S6(h) = (1/μ²) [e^h − 4 + o(e^{−h/2})], as h → ∞.   (21)

For S3(h), note that from (8) and [6] we can write

  (2/μ²) f(−h) = ∫₀^∞ P_0(T_h^1 > t) dt = (16/μ²) e^{h/2} Σ_{i≥1} sin³φ_i cos²φ_i/(φ_i − sinφ_i cosφ_i),   (22)

from which we obtain

  S3(h) = (32/μ²) [sinh³η/(sinhη coshη − η)] Σ_{i≥1} sin³φ_i cos²φ_i/(φ_i − sinφ_i cosφ_i)
        = (2/μ²) [1 + o(e^{−η})] (h + e^{−h} − 1)
        = (2/μ²) [h − 1 + o(e^{−h/2})], as h → ∞.   (23)

To bound S2(h) and S5(h) we need the following.

Result 1: Suppose 0 < p < 1. Then, for all positive solutions {α_i}_{i≥1} of the equation tan x = px (tan x = −px, resp.), we have

  lim_{p→0+} Σ_{i≥1} |sin³α_i| cos²α_i/(α_i − sinα_i cosα_i) ≤ 1/π.   (24)

This implies that, asymptotically as h → ∞,

  [sinh³η/(sinhη coshη − η)] Σ_{i≥1} [|sin³φ_i| cos²φ_i/(φ_i − sinφ_i cosφ_i)] · [cos²φ_i/(cos²φ_i + cosh²η)]
    ≤ [1/π + o(1)] sinh³η/[cosh²η (sinhη coshη − η)]
    = [1/π + o(1)] (sinh²η/cosh³η)(1 − η sinh^{−1}η cosh^{−1}η)^{−1}
    = O(e^{−h/2}),

from which we obtain

  |S2(h)| = (1/μ²) O(e^{−h/2}), as h → ∞.   (25)

Similarly,

  e^{−h} [sinh³η/(sinhη coshη − η)] Σ_{i≥1} [|sin³θ_i| cos²θ_i/(θ_i − sinθ_i cosθ_i)] · [cosh²η/(cos²θ_i + cosh²η)]
    ≤ [1/π + o(1)] e^{−h} (sinh²η/coshη)(1 − η sinh^{−1}η cosh^{−1}η)^{−1} = O(e^{−h/2}),

so that

  |S5(h)| = (1/μ²) O(e^{−h/2}), as h → ∞.   (26)

To handle the double sums in S1(h) and S4(h), we need

Result 2: Suppose 0 < p < 1, {α_i}_{i≥1} are all positive solutions of the equation tan x = px, and {β_j}_{j≥1} are all positive solutions of the equation tan x = px (tan x = −px, resp.). Then

  Σ_{i,j≥1} sin³α_i sin³β_j cos²α_i cos²β_j / [(α_i − sinα_i cosα_i)(β_j − sinβ_j cosβ_j)(cos²α_i + cos²β_j)] → 0, as p → 0+.   (27)

Consequently,

  |S1(h)| = (1/μ²) o(1)   (28)

and

  |S4(h)| = (1/μ²) o(e^{−h}),   (29)

as h → ∞. Finally, from (21), (23), (25), (26), (28) and (29) we obtain

  E_{0,∞}{T_h} = (2/μ²) [h − 1 + o(1)],   (30)
  E_{∞,∞}{T_h} = (1/μ²) [e^h − 4 + o(1)],   (31)

as h → ∞. For h and γ satisfying (13), this yields the asymptotic result (16) with N = 2. Now let us prove the two results used above.

Proof of Result 1: For any α_i ∈ ((i − 1/2)π, (i + 1/2)π) such that tan α_i = ±p α_i (0 < p < 1), we have

  |sin³α_i| cos²α_i/(α_i − sinα_i cosα_i) = p³α_i²/{(1 + p²α_i²)^{3/2} [(1 ∓ p) + p²α_i²]} ≤ p/(1 + p²α_i²)^{3/2} ≤ p/[1 + p²(i − 1/2)²π²]^{3/2}.

Thus

  Σ_{i≥1} |sin³α_i| cos²α_i/(α_i − sinα_i cosα_i) ≤ Σ_{i≥1} p/[1 + p²(i − 1/2)²π²]^{3/2} ≤ ∫_{−1/2}^∞ p dx/(1 + p²π²x²)^{3/2} = (1/π) ∫_{−pπ/2}^∞ du/(1 + u²)^{3/2} → 1/π, as p → 0+.

Proof of Result 2: For simplicity let us denote the (i, j)-term in the sum by a_{i,j}(p). As in the last proof, a little computation gives

  |a_{i,j}(p)| = I_p(pα_i, pβ_j) · p²,

where I_p(x, y) (0 < p < 1) is the function

  I_p(x, y) = 1 / { √[(1 + x²)(1 + y²)] (2 + x² + y²) [1 + (1 − p)/x²] [1 + (1 ∓ p)/y²] }.

Clearly, I_p(·,·) is bounded above, uniformly in p, 0 ≤ p < 1, by the L¹(R²) function B(·,·) defined as

  B(x, y) = 1 / { √[(1 + x²)(1 + y²)] (2 + x² + y²) }.

We have two steps to finish our proof:

  (a) lim_{p→0+} Σ_{i,j≥1} |a_{i,j}(p)| = (1/π²) ∫∫_{(R+)²} I_0(x, y) dx dy,
  (b) lim_{p→0+} Σ_{i,j≥1} a_{i,j}(p) · 1_{a_{i,j}(p)>0} = (1/(2π²)) ∫∫_{(R+)²} I_0(x, y) dx dy.

Combining (a) and (b) gives Σ_{i,j≥1} a_{i,j}(p) = 2 Σ_{i,j≥1} a_{i,j}(p) · 1_{a_{i,j}(p)>0} − Σ_{i,j≥1} |a_{i,j}(p)| → 0 as p → 0+, which is (27).

Let us start with (a). Given any ε > 0, we can find a constant M > 0 such that, for R_M = {(x, y) : min(x, y) > M} and all 0 ≤ p < 1, I_p is decreasing in both x and y on R_M, and

  (1/π²) ∫∫_{R_M} I_p(x, y) dx dy ≤ (1/π²) ∫∫_{R_M} B(x, y) dx dy ≤ ε/3.

Because of this, for any 0 ≤ p < 1, the "tail" sum satisfies

  Σ_{min(pα_i, pβ_j) > M + pπ} |a_{i,j}(p)| ≤ (1/π²) ∫∫_{R_M} I_p(x, y) dx dy ≤ ε/3,   (32)

where we define a_{i,j}(0) = 0 for all (i, j). On the other hand, as p goes to zero, the function I_p converges uniformly on [0, M]² to I_0. So all the terms |a_{i,j}(p)| in the "head" sum are uniformly close to I_0(pα_i, pβ_j) · p², whose sum, multiplied by π², is a Riemann sum of the function I_0(x, y) over the region [0, M]² and converges to the Riemann integral of I_0 over [0, M]² as p tends to zero. In other words, for small p,

  | Σ_{max(pα_i, pβ_j) ≤ M + pπ} |a_{i,j}(p)| − (1/π²) ∫₀^M ∫₀^M I_0(x, y) dx dy | ≤ ε/3.   (33)

By (32) and (33), we have

  | lim_{p→0+} Σ_{i,j≥1} |a_{i,j}(p)| − (1/π²) ∫∫_{(R+)²} I_0(x, y) dx dy | ≤ ε.   (34)

Letting ε go to zero, we are done with (a).

The proof of (b) is similar. Note that the signs of the a_{i,j}(p)'s can be represented by (−1)^{i+j} or (−1)^{i+j+1}, and in each rectangle [2(i − 1)pπ, 2ipπ] × [(j − 1)pπ, jpπ], (i, j ≥ 1), either a_{2i−1,j}(p) or a_{2i,j}(p) is positive. With the same constant M chosen as above, for the sum of all positive a_{i,j}(p)'s such that max(pα_i, pβ_j) ≤ M + pπ, we can use the same argument as before to show that, for small p,

  2π² [ Σ_{max(pα_i, pβ_j) ≤ M + pπ} a_{i,j}(p) · 1_{a_{i,j}(p)>0} ] ≈ ∫₀^M ∫₀^M I_0(x, y) dx dy.   (35)

Thus (b) is proven, because both the tail integral and the tail sum are negligible by the choice of M.

In the case of N CUSUMs with N ≥ 2, the calculation is similar: the main terms in E_{0,∞,...,∞}{T_h} and in E_{∞,∞,...,∞}{T_h} are the terms of highest degree in sinh³η/(sinhη coshη − η). With (20) we can show that they are

  (2/μ²) [h − 1 + o(e^{−h/2})]   (36)

and

  (2/(Nμ²)) [e^h + (N − 2)h + (2 − 3N) + o(e^{−h/2})],   (37)

respectively. We can prove that all other terms converge to zero as h goes to infinity. With a generalization of Result 1 to n-dimensional trigonometric sums and integrals for all n ≥ 1, we are able to deal with most terms in the expansion of the expectations, because those bounded trigonometric sums are multiplied by expressions of negative exponential order in h. There is only one term (in E_{0,∞,...,∞}{T_h}) which cannot be shown to converge to zero in this manner. We need to prove that the sum involved there, namely

  Σ_{i,j≥1} sin³φ_i sin³θ_j cos²φ_i cos²θ_j / { (φ_i − sinφ_i cosφ_i)(θ_j − sinθ_j cosθ_j) [(N − 2)(1 − 4η²/h²) cos²φ_i cos²θ_j + cos²φ_i + cos²θ_j] },   (38)

converges to zero as h goes to infinity. We can follow the proof of Result 2 to get the result. To be more precise, denote p = 2/h and the (i, j)-term in the above sum by a_{i,j}^{(N)}(p); then obviously |a_{i,j}^{(N)}(p)| ≤ |a_{i,j}(p)|, which helps us control the "tail" sum

  Σ_{min(pφ_i, pθ_j) > M + pπ} |a_{i,j}^{(N)}(p)|,   (40)

where M is chosen as in the proof of Result 2. On the other hand,

  |a_{i,j}^{(N)}(p)| = I_p^{(N)}(pφ_i, pθ_j) · p²,   (41)

where I_p^{(N)} is the function

  I_p^{(N)}(x, y) = 1 / { √[(1 + x²)(1 + y²)] [(N − 2)(1 − p²η²) + 2 + x² + y²] [1 + (1 − p)/x²] [1 + (1 + p)/y²] }.   (39)

The function I_p^{(N)} converges uniformly to I_0 on the domain [0, M]² as p goes to zero, since pη → 1 as p → 0+. As a result, the "head" sum converges to the same double integral as the one in (33) or (35), so we are done. Finally, by (36) and (37), we can derive the asymptotic formula (16) with h and γ satisfying (13).
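Result 1 lends itself to a direct numerical check; the sketch below is an added illustration, not part of the paper. For small p it evaluates the sum in (24) over the positive solutions of tan x = px (found by bisection on each branch, where tan x − px is strictly increasing) and verifies that it stays below the bound 1/π. The solver settings and truncation level are arbitrary choices.

```python
import math

def branch_root(i, p, sign=1.0, iters=60):
    """Unique solution of tan x = sign*p*x in ((i-1/2)pi, (i+1/2)pi),
    found by bisection on g(x) = tan x - sign*p*x, which is strictly
    increasing (g' = sec^2 x - sign*p > 0 for 0 < p < 1)."""
    lo = (i - 0.5) * math.pi + 1e-9
    hi = (i + 0.5) * math.pi - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.tan(mid) - sign * p * mid < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def result1_sum(p, n_terms=3000, sign=1.0):
    """Truncated sum in (24) over the first n_terms positive solutions
    of tan x = sign*p*x; the terms decay like 1/(p^2 x^3), so the
    truncation error is negligible for this n_terms."""
    total = 0.0
    for i in range(1, n_terms + 1):
        a = branch_root(i, p, sign)
        s, c = math.sin(a), math.cos(a)
        total += abs(s) ** 3 * c * c / (a - s * c)
    return total

for p in (0.2, 0.1, 0.05):
    assert result1_sum(p) < 1.0 / math.pi  # bound in (24)
```

Numerically the sum settles well below 1/π for small p, consistent with the inequality in (24) being a (non-tight) upper bound obtained by comparison with an integral.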