Academia.eduAcademia.edu

Data-driven constrained optimal model reduction

2019, European Journal of Control

Model reduction by moment matching can be interpreted as the problem of finding a reduced-order model which possesses the same steady-state output response of a given full-order system for a prescribed class of input signals. Little information regarding the transient behavior of the system is systematically preserved, limiting the use of reduced-order models in control applications. In this paper we formulate and solve the problem of constrained optimal model reduction. Using a data-driven approach we determine an estimate of the moments and of the transient response of a possibly unknown system. Consequently we determine a reduced-order model which matches the estimated moments at the prescribed interpolation signals and, simultaneously, possesses the estimated transient. We show that the resulting system is a solution of the constrained optimal model reduction problem. Detailed results are obtained when the optimality criterion is formulated with the time-domain 1 , 2 , ∞ norms and with the frequency-domain H 2 norm. The paper is illustrated by two examples: the reduction of the model of the vibrations of a building and the reduction of the Eady model (an atmospheric storm track model).

Data-driven constrained optimal model reduction Giordano Scarciottia , Zhong-Ping Jiangb , Alessandro Astolfia,c a Department of Electrical and Electronic Engineering, Imperial College London, SW7 2AZ London, UK. of Electrical and Computer Engineering, Tandon School of Engineering, New York University, Brooklyn, NY, 11201 USA c Dipartimento di Ingegneria Civile e Ingegneria Informatica, “Tor Vergata”, Via del Politecnico 1, 00133, Rome, Italy. b Department Abstract Model reduction by moment matching can be interpreted as the problem of finding a reduced-order model which possesses the same steady-state output response of a given full-order system for a prescribed class of input signals. Little information regarding the transient behavior of the system is systematically preserved, limiting the use of reduced-order models in control applications. In this paper we formulate and solve the problem of constrained optimal model reduction. Using a data-driven approach we determine an estimate of the moments and of the transient response of a possibly unknown system. Consequently we determine a reduced-order model which matches the estimated moments at the prescribed interpolation signals and, simultaneously, possesses the estimated transient. We show that the resulting system is a solution of the constrained optimal model reduction problem. Detailed results are obtained when the optimality criterion is formulated with the time-domain ℓ1 , ℓ2 , ℓ∞ norms and with the frequency-domain H2 norm. The paper is illustrated by two examples: the reduction of the model of the vibrations of a building and the reduction of the Eady model (an atmospheric storm track model). 1. Introduction Given a full-order system and some specific properties of interest of this system, the problem of model reduction consists in the determination of a reduced-order model that, under some particular operating conditions, possesses the chosen properties. The problem is of fundamental importance in the modern control and systems field because the majority of the theory and methods developed in this field relies on the availability of an accurate but simple model [1]. Depending on the properties that the model reduction technique preserves we obtain families of different methods, such as balanced truncation [2, 3, 4, 5], Hankel-norm approximations [6, 7, 8] and the interpolation approach [9, 10, 11, 12, 13, 14, 15, 16]. Further details on these and other techniques are given in [17]. In the moment matching approach, the property that the reduced-order model systematically preserves is the steady-state output response to Email addresses: [email protected] (Giordano Scarciotti), [email protected] (Zhong-Ping Jiang), [email protected] (Alessandro Astolfi) Preprint submitted to Elsevier October 21, 2019 a prescribed class of input signals, i.e. the reduced-order model possesses the same steady-state output response of the full-order model for some prescribed “interpolating” input signals. This interpretation of the moment matching approach has been introduced in [18] (see also the monograph [19]), where it has also been shown in which measure this point of view is related to the classical interpolation theory. From this interpretation, it is clear that reduced-order models by moment matching do not preserve, in a systematic way, information about the transient behavior of the system. To mitigate this issue in this paper we address the problem of determining constrained optimal reduced-order models by moment matching (with respect to a norm to be specified). In particular, we want to determine a model which is constrained to possess a pre-assigned steady-state output response and that, at the same time, minimizes the error between its transient output response and the transient output response of the full-order system. This problem is somewhat related to a classical optimization problem in model reduction, i.e. the determination of optimal H2 (or other norms) reduced-order models, see e.g. [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 16, 30]. The main difference between these works and the problem that we are studying is that the reduced-order model to be optimized is constrained to interpolate a given class of input signals or, equivalently, interpolate a given set of points. On the contrary, in the classical unconstrained optimal H2 model reduction also the interpolation points are parameters of the minimization, i.e. the system is required to minimize the frequency response error for all frequencies but it is not constrained to have zero steady-state error for pre-assigned input signals. The reason that justifies the interest in the constrained problem is that in many engineering applications, such as modeling of power systems and power converters [31, 32], optimal control of wave energy converters [33] and data-driven controller tuning [34], the system is excited by a specific or desired class of input signals. It is then of primary importance that the steady-state error for this class of signals is identically equal to zero and just of secondary importance that the error with respect to other input signals is minimized. The paper extends and complements [35], in which the authors, inspired by the learning algorithms given in [36, 37, 38] to solve a model-free adaptive dynamic programming problem, use time-snapshots and data matrices to determine an estimate of the moments of the system to be reduced. In particular, the present paper focuses on the preservation of information describing the transient behavior of the system to be reduced. In particular, the main contributions of the paper is to pose and solve a constrained optimal model reduction problem from time-domain data. To this end, after recalling the results in [35], we describe how to extract information regarding the transient behaviour of the system to be reduced and then how to determine a reduced order model that possesses the specified moments and minimizes the transient error. It is useful to point out from the onset that data can be used with two different scenarios in mind. One is the determination of reduced-order models from real measurements (which can be affected by noise). The other is the determination of reduced-order models from simulated data merely to avoid the use of the high-order system matrices, thus obtaining a computational advantage (sometimes this approach is referred as non-intrusive model reduction, see 2 e.g. [39]). While we do not exclude the use of the proposed technique in the case of real measurements, in the following we focus on the second application. The research objective of this paper is clearly at the intersection of large and active research areas, such as model reduction and system identification. For instance, the use of time-snapshots and data matrices is reminiscent of the methods based on proper orthogonal decompositions, see e.g. [40, 17, 41, 42] and [43], and on subspace identification, see e.g. [44] and [45]. In model reduction, a data-driven moment matching approach for linear systems has been introduced under the name of Loewner framework in [46], see also [47]. This method constructs reduced order models by using matrices composed of frequency-domain measurements. This (with the exception of systems in special form such as bilinear systems [48]) makes intrinsically difficult to extend the method to nonlinear systems. An extension to time-domain measurement has been proposed in [49], in which, however, the time-domain measurements are used to produce frequency-domain measurements (thus again the technique cannot be extended to nonlinear systems). Another time-domain model reduction technique has been proposed in [39], in which, however, the moments of the system are not matched. In system identification, various time-domain data-driven techniques have been presented, such as the eigensystem realization algorithm, see e.g. [50, 51, 52] and [53], and the dynamic mode decomposition, see [54]. In summary, while recognizing the large amount of work in this area, this paper proposes a technique for the determination of reduced order models from time-domain data which are constrained to match a given set of interpolation points. Thus, differently from system identification, the method has a deep connection with the moments, and differently from the Loewner framework (and its generalizations) there is no need to use frequency-domain tools at any stage, opening up the potential extension of the technique to nonlinear systems. The rest of the paper is organized as follows. After briefly recalling some definitions, in Section 2 we formulate the problem of constrained optimal model reduction. In Section 3 we present the first set of results: a method for the estimation of the transient response (Section 3.2), an analysis of the choice of the norm in the optimization problem (Section 3.3) and a constrained optimal reduced-order model (Section 3.4). In Section 4, we discuss how to select the initial condition of the reduced-order model (Sections 4.1 and 4.2) and we provide an algorithmic overview of the results of the paper (Section 4.3). In Section 5 we present two examples: the reduction of the model of a building and the reduction of the model of an atmospheric storm tracking system. Finally, Section 6 contains some concluding remarks. A preliminary version of this paper has been published in [55]. The additional contributions of the present paper are as follows: first, while [55] considers only the ℓ2 norm, in this paper we discuss also other ℓp norms and, in addition, we establish a result for the H2 norm. Second, we compute error bounds and we provide results to determine the optimal initial state of the reduced-order model in various cases. Third, we illustrate the results using two new physically motivated examples. In summary, Sections 3.3, 4.1, 4.2, 4.3, 5.1 and 5.2 are original contributions. Sections 3.2 and 3.4 are extended versions of the material described in [55]. 3 Notation. We use standard notation. R≥0 denotes the set of non-negative real numbers; C<0 (C0 ) denotes the set of complex numbers with negative (zero) real part. The symbol I denotes the identity matrix and σ(A) denotes the spectrum of the matrix A ∈ Rn×n . The symbol tr(A) indicates the trace, i.e. the sum of the elements on the main diagonal of A. The symbol ⊗ indicates the Kronecker product. The vectorization of a matrix A ∈ Rn×m , denoted by vec(A), is the nm × 1 vector obtained by stacking ⊤ ⊤ ⊤ the columns of the matrix A one on top of the other, namely vec(A) = [a⊤ 1 , a2 , . . . , am ] , where ai ∈ Rn denotes the i-th column of A and the superscript ⊤ denotes the transpose. The superscript ∗ denotes the conjugate transpose. A matrix is said to be non-derogatory if its characteristic and minimal polynomial coincide. The symbol ǫk indicates a vector with the k-th element equal to 1 and with all the other elements equal to 0. The symbol ι denotes the imaginary unit. Given some finite set of data points X = {xi ∈ R : i = 1, . . . , m}, the discrete ℓp norm of a function f is defined as 1 Pm ||f ||ℓp = ( i=0 |f (xi )|p ) p , with 1 ≤ p < ∞, and the ℓ∞ norm is defined as ||f ||ℓ∞ = supxi ∈X |f (xi )|. When the set of data points is countably infinite, we denote the norm as the ℓp norm. The H2 norm of q R π 1 a complex valued function Φ(s) is defined as ||Φ||H2 = 2π |Φ(eιω )|2 dω. −π 2. Problem formulation Consider a linear, single-input, single-output, continuous-time, system described by the equations ẋ = Ax + Bu, y = Cx, (1) with x(t) ∈ Rn , u(t) ∈ R, y(t) ∈ R, A ∈ Rn×n , B ∈ Rn×1 and C ∈ R1×n , and assume that (1) is minimal, i.e. controllable and observable. Consider the signal generator described by the equations ω̇ = Sω, u = Lω, (2) with ω(t) ∈ Rν , S ∈ Rν×ν and L ∈ R1×ν . Let S be a non-derogatory matrix such that σ(S) ∩ σ(A) = ∅ and let the pair (S, L) be observable. Definition 1. The moments of system (1) at (S, L) are the elements of the matrix CΠ, where Π is the unique solution of the Sylvester equation1 AΠ + BL = ΠS. (3) In addition, we call the eigenvalues of S interpolation points. In this framework, introduced in [18], a family of reduced-order models is characterized as follows. 1 The Sylvester equation is an invariance equation which is related to the problem of output regulation. The relation between moment matching and output regulation is explored in more detail in [19] and [56]. 4 Definition 2. Consider system (1) and the signal generator (2). Then the system ξ˙ = F ξ + Gu, ψ = Hξ, (4) with ξ(t) ∈ Rν , F ∈ Rν×ν , G ∈ Rν×1 and H ∈ R1×ν , with ν < n, is a reduced-order model of system (1) at (S, L) if there exists a unique solution P of the equation F P + GL = P S, (5) HP = CΠ, (6) such that where Π is the unique solution of (3). If σ(A) ⊂ C<0 and σ(S) ⊂ C0 , then the interconnection of system (1) and the signal generator (2) possesses the output steady-state response yss (t) = CΠω(t). If moreover σ(F ) ⊂ C<0 , then the interconnection of model (4) and the signal generator (2) possesses the output steady-state response ψss (t) = HP ω(t) = CΠω(t). Hence, the moment matching method preserves, by construction, the steady-state response of the system. However, the transient response of system (1) is not preserved by the model (4). This is a notable drawback of this approach that limits the use of reduced-order models by moment matching in control applications. In this paper we investigate the possibility of embedding some type of information about the transient response of system (1) in (4). Formally, this problem can be formulated as follows. Problem 1. Consider system (1) with a fixed initial condition x(0), model (4) and the input u described by the signal generator (2) with a given matrix S and a given initial condition ω(0). Suppose that A, B, C and x(0) are unknown, but that we have access to y(t) and ω(t). Determine the initial condition ξ(0) and the matrices F , G and H such that system (4) is a constrained optimal model of system (1), i.e. solves the minimization problem min F,G,H,ξ(0) ||y − ψ||ℓ , (7) with ℓ a discrete norm to be specified, subject to the constraint that equations (5) and (6) have a unique solution P for the given matrix S. Since the focus of the paper is on data-driven model reduction based on time-domain measurements (as presented in [35]), it is natural to formulate the problem in terms of a discrete ℓ norm. Special interest is given to the ℓ2 norm, but throughout the paper connections with other ℓp norms and with Hp norms are discussed. Note also that the use of discrete ℓ norms allows formulating and solving the problem also for unstable systems (this is not possible with Hp norms). We postpone the discussion of the unstable case to Section 4.3. 5 Remark 1. The optimality conditions of the unconstrained H2 problem, i.e. the problem in which also the interpolation points are parameters of the minimization, are known since the ‘60s and are given in [20]. The first numerically effective approach for the determination of optimal H2 reduced-order models, the IRKA algorithm, is given in [16]. The output of the algorithm converges to an optimal H2 model. While stability of the reduced-order model is not a priori guaranteed, it can be achieved with a “good” initialization step. In [30] a variation of the IRKA algorithm that guarantees stability is given. Remark 2. The constrained ℓ optimal model reduction problem offers the possibility of obtaining the best approximation in terms of a specific ℓ norm and in addition to preserve, with zero error, a prescribed steady-state response. The choice of solving the constrained or unconstrained problem depends, as always in engineering, on the specific application. If the interest of the designer is to have the best approximation along all the frequencies, then the unconstrained H2 problem should be solved. On the other hand if the designer knows that the system is driven by a specific class of input signals, for instance like in the case of power systems [31], wave energy converters [33] or data-driven controller tuning [34], then the constrained problem should be solved. Remark 3. One could wonder if the reduced-order model solving the unconstrained H2 problem is also a solution of our problem. Assume that the prescribed matrix S is such that σ(S) ⊂ C0 . This assumption makes sense because the aim of the paper is to construct a reduced-order model from input/output data and, thus, it is expected that the input of the unknown system is periodic. A necessary optimality condition for the unconstrained H2 problem is that σ(F ) = σ(−S ∗ ). This makes the H2 norm of the reduced-order model not well-defined because the model would not be stable. Hence, the optimal solution of the unconstrained H2 problem cannot be an optimal model of the resulting constrained ℓ problem. 3. A reduced-order model preserving the transient response In this section we provide a constructive solution to the constrained ℓ optimal model reduction problem. To this end, we first recall an algorithm published in [35] for the estimation of the moments. The second step is to extract information regarding the transient response from data. Then, before solving the problem, we show that different error norms can be considered in the optimisation problem. Finally, we gather all these steps and provide a solution to Problem 1. Notice that if Π is the unique solution of the Sylvester equation (3), which is the case if A and S do not have common eigenvalues, then the output of the interconnection of system (1) with (2) can be written, see [18], as y(t) = CΠω(t) + CeAt (x(0) − Πω(0)). 6 (8) If the system is asymptotically stable and σ(S) ⊂ C0 , then yss (t) = CΠω(t) is the steady-state response of (1), whereas ytr (t) = CeAt (x(0) − Πω(0)) is the transient response of (1). For the time being we assume that the system is asymptotically stable and that σ(S) ⊂ C0 , and we postpone the discussion of the unstable case to Section 4.3. Moreover, consistently with the formulation of Problem 1 we assume that A, B, C and x(0) are not known, but that we have access to the input and the output of the system. 3.1. Estimation of the moments (see [35]) Consider a sequence of sampling times {tk } and the sequences of sampled output {y(tk )} at {tk } and of sampled states {ω(tk )} at {tk }. In [35], an algorithm has been devised to determine an approximation g k of CΠ at the sampling time tk using {y(tk )} and {ω(tk )} with an arbitrary small error, i.e. given CΠ a small strictly positive number η we can iterate the algorithm until the inequality   ⊤ η g k − CΠ g k−1 CΠ g k − CΠ g k−1 , CΠ ≤ tk − tk−1 (9) g k satisfies (9), we indicate such CΠ g k , with abuse of notation, as holds. When the approximated CΠ g η and we call the elements of CΠ g η the estimated moments of system (1) at (S, L). CΠ 3.2. Estimation of the “transient response” Consider the sequences {y(ti )} and {ω(ti )} that have been used to obtain the estimated moments g CΠη . From these sequences, we can construct another sequence {e ytr (ti )} given by g η ω(ti ), yetr (ti ) = y(ti ) − CΠ (10) that, by comparison with equation (8), represents an approximation of the term CeAt (x(0) − Πω(0)), which is the transient output response of system (1). Note that if the sequence {e ytr (ti )} is identically equal to zero, then x(0) = Πω(0). If this is the case we select a new random initial condition ω(0), generate new {ω(tk )} and {y(tk )}, and determine a new sequence {e ytr (ti )}. We reiterate this procedure until we obtain a non-identically zero sequence {e ytr (ti )}. Let χ(t) = x(t) − Πω(t) and consider the system χ̇ = Aχ zf = Cχ. (11) The transient output response ytr (t) of system (1) coincides to the free output response zf (t) of ytr (ti )} is equal to the “estimated” sequence {e zf (ti )}. system (11). Thus, the estimated sequence {e We now briefly recall the prediction-error identification problem as presented in [57]. Consider the sequence {e zf (ti )} (note that the input is identically zero since this is a free response) and the problem of determining the parameters e χ θe = [Fe , H, e(0)], e ∈ R1×en and χ where Fe ∈ Rne×en , H e(0) ∈ Rne , with n e ≤ n, are approximations of the matrices A, C and of e = zef (t) − ẑf (t, θ), e the initial state χ(0), respectively, of system (11). Define the prediction-error as e(t, θ) 7 e The prediction-error identification e is the predicted zef (t) using the estimated parameters θ. where ẑf (t, θ) problem consists in determining θe = arg min θ with JlN (θ) = tN X 1 J N (θ), N +1 l l(e zf (t) − ẑf (t, θ)), (12) (13) t=t0 =0 where l : R → R≥0 and tN > 0. Many different algorithms to solve this problem have been presented (see [57] for an overview) and some of them are readily available in several programming languages. For instance the command “ssest” of MATLABTM (extensively described in [58]) solves the problem (12) for a squared or linear error norm, i.e. v u tN u X N |e(t, θ)|2 = ||e||ℓ2 , Jℓ2 (θ) = t (14) t=t0 =0 and JℓN1 (θ) = tN X |e(t, θ)| = ||e||ℓ1 . (15) t=t0 =0 In the following, we assume that we have solved the prediction-error identification problem (with respect e and χ to some norm) and that we have determined Fe , H e(0). zf (ti )} and provides an Remark 4. The command ssest computes the Hankel singular values from {e estimate of the order n e of the identified system. If it is possible to generate new data, this information can be used to redesign the signal generator and change the order of the reduced-order model matching the order n e. If this is not possible, the information on the singular values is ignored and the order n e = ν is selected. 3.3. Selection of the error norm e and χ Depending on the selection of the function l in (13) we can identify optimal Fe , H e(0) with respect to different norms. One of the most common and widely used norms is the Euclidean norm given in (14), i.e. the ℓ2 norm. This norm is differentiable and if the approximating function is linear, as in the case under consideration, then the problem becomes a linear least-squares problem. Moreover, since the norm is strictly convex, there exists a global minimum. Another norm of interest is the ℓ∞ norm. In this case the maximal absolute error is minimized. The resulting problem is not strictly convex, so the optimal solution is not unique, but it can be easily solved rewriting the minimization problem as a linear programming problem (with the definition of an auxiliary variable). Another common choice is the ℓ1 norm, namely (15). The characteristic of this norm is that the minimization does not consider points of zero measure, i.e. single outliers points. Also this problem is 8 not strictly convex, so the optimal solution is not unique, but it can be solved recasting it as a linear programming problem (with the definition of N auxiliary variables). Since the problem we are considering is the identification of a linear system (without input), other more general costs can be considered with a reformulation of the minimization problem. For instance, a maximum likelihood method can be considered instead of a prediction-error algorithm. In this case the cost of the minimization problem would be N Jlog (θ) = tN X − log fe (e(t, θ)) t=t0 =0 where fe is the probability density function of e [57]. More interesting is to link the ℓp minimization problem with the Hp minimization problem. To this end, we restrict our analysis to the H2 norm. The reason for this last restriction is clarified below. We assume that the signal e is defined for all t ∈ R≥0 (and recall that the system is asymptotically stable). Thus, there exist sampling times tj , with limj→∞ tj = ∞, such that the norm of square-summable sequences, i.e. the ℓ2 norm, is defined for the signal e. e and χ e(0) can be determined solving (12) with the cost function Theorem 1. The H2 optimal Fe , H JℓN replaced by JH2 (θ) = 1 2 = 1 2 Pn k=1 Pn k=1 h  limj→∞ P tj t=t0 =0 |e(t, θ)|2 : e(0, θ) = ǫk  i (16) ||e||2ℓ2 : e(0, θ) = ǫk . Proof. The proof is based on the observation that the Laplace transform of the free output response can be interpreted as the transfer function of a multi-input system with n inputs equal to the components of the initial condition. Then the result follows from the multi-input multi-output signal interpretation of the continuous-time H2 norm given in [59, Chapter 2]. In fact, consider the ℓ2 norm of e(t, θ), i.e. v u X u ∞ |e(t, θ)|2 , ||e||ℓ2 = t t=t0 =0 and the discrete L2 norm for the Laplace transform E(s) of the signal e(t, θ) on the unity circle, i.e. s Z π 1 ||E||L2 = |E(eιω )|2 dω. 2π −π By Parseval’s theorem, the time-domain and frequency-domain norms are equal, namely ||e||ℓ2 = ||E||L2 . Let Φ(s) be such that E(s) = Φ(s)e(0, θ) and consider 9 (17) Pn k=1 ||Φ(eιω )ǫk ||2L2 , i.e. the sum of n terms ||E||2L2 when the initial condition e(0, θ) is selected as ǫk , with k = 1, . . . , n, namely Pn Pn  2 ιω 2 k=1 ||Φ(e )ǫk ||L2 k=1 ||E||L2 : e(0, θ) = ǫk ] = Rπ Pn 1 = k=1 2π (Φ(eιω )ǫk )(Φ(eιω )ǫk )⊤ dω −π Rπ Pn 1 ιω ⊤ = k=1 2π tr[Φ(eιω )ǫk ǫ⊤ k Φ(e ) ]dω −π  Rπ Pn 1 ιω ⊤ tr[Φ(eιω ) k=1 ǫk ǫ⊤ = 2π k Φ(e ) ]dω −π Rπ 1 Φ(eιω )Φ(eιω )⊤ dω = 2π −π = ||Φ||2H2 . By (17) we have 1 2 which proves the claim. Pn k=1   ||e||2ℓ2 : e(0, θ) = ǫk = 21 ||Φ||2H2 , (18)  The meaning of this result is that if instead of minimizing the error between the output response of system (11) and the predictor for an arbitrary initial condition, we minimize the sum of n of these errors with initial conditions selected in each term of the sum as a different vector of the natural basis, then the solution is (for t → ∞) optimal with respect to the discrete H2 norm. Note that the cost (18) is quadratic and thus we can apply a gradient algorithm to determine the global optimal solution. Similar relations with Hp norms for p = 6 2 cannot be trivially inferred. In fact, our derivation is based on Parseval’s theorem and on the properties of the trace operator that hold only in the quadratic case. 3.4. Constrained optimal reduced-order model Now that estimates of the steady-state response and of the transient response are available, we want to determine a reduced-order model by moment matching that possesses the estimated characteristics. e obtained solving the prediction-error minimization probProposition 1. Suppose that the pair (Fe , H) e and Pe lem (12) is observable, σ(Fe ) ⊂ C<0 and σ(Fe ) ∩ σ(S) = ∅. Then there exist unique matrices G solving the equations e Fe Pe − PeS = −GL, In addition, the system described by the equations e ξ˙ = Fe ξ + Gu, is a model of system (1) at S. e Pe = CΠ gη . H (19) e ψ = Hξ, (20) Proof. Using the vectorization operator and the properties of the Kronecker product, equations (19) can be rewritten as   I ⊗ Fe − S ⊤ ⊗ I e I ⊗H L⊤ ⊗ I 0   10 vec(Pe ) e vec(G)   = 0 gη ) vec(CΠ  . e and the fact that σ(Fe ) ∩ σ(S) = ∅, the solution By observability of the pairs (S, L) and (Fe , H)    −1  vec(Pe) I ⊗ Fe − S ⊤ ⊗ I L⊤ ⊗ I 0  =    e e g vec(G) I ⊗H 0 vec(CΠη ) (21) is unique. The claim that the resulting system is a model of (1) at S follows from the fact that this g η ). system belongs to the family of models (4) given in Definition 2 (if we allow the relaxation CΠ = CΠ  Remark 5. Proposition 1 provides the unique reduced-order model which satisfies equations (19) for e S, L and CΠ g η . In particular, the proof of Proposition 1 gives a constructive fixed matrices Fe , H, e and Pe. method to determine the only two available variables, i.e. G Corollary 1. The model (20) belongs to the family of models given in [18], i.e. ˙ ξˆ = (S − ĜL)ξˆ + Ĝu, ˆ g η ξ, ψ = CΠ (22) Proof. Consider the change of coordinates ξˆ = Pe−1 ξ. By multiplying on the left-hand side the first e The system in the new coordinates is described equation in (19) by Pe−1 yields Pe−1 Fe Pe = S − Pe−1 GL. e = Ĝ and H e Pe = CΠ gη . by the matrices Pe−1 Fe Pe = S − ĜL, Pe−1 G  e selected as in (21), is a constrained ℓ optimal reduced-order model Corollary 2. System (20), with G which interpolates the moments of system (1) at S. e and H). e ν parameters Proof. The state space model (20) has ν 2 + 2ν parameters (the matrices Fe , G e are used to satisfy the ν constraints which guarantee that the model matches the (the matrix G) moments of system (1) at the prescribed interpolation points. All the remaining parameters, namely e are obtained with a prediction-error identification algorithm that, as shown the matrices Fe and H, in [57], solves the optimization problem (12). Finally, note that as a consequence of Corollary 1, the e is satisfying the moment matching condition instead of Fe or H e or a combination thereof is fact that G without loss of generality.  e Since these matrices Remark 6. Proposition 1 includes assumptions about the matrices Fe and H. are the result of a constrained optimization procedure, one may wonder if these assumptions hold. In general, in fact, if the estimation procedure does not fail, then the function ssest generates (or can e be forced to generate) matrices satisfying the assumptions about the observability of the pair (Fe , H) and the eigenvalues of Fe . In particular, the function ssest has an option which can be used to enforce stability (this is needed only when the transient is diverging but we still need to obtain a stable model, e for some reason). In this last case the option guarantees that the estimated model described by (Fe , H) is stable, although the obtained model may be sub-optimal. The third condition follows from the marginal stability of the signal generator. More details on these identification issues can be found in [57, 58]. 11 4. Error bounds and selection of ξ(0) In this section we focus on the determination of an “optimal” initial condition ξ(0) of the reducedorder model. First of all note that the selection of the initial condition is included in the minimization of (12). In fact, also the function ssest of MATLABTM returns the optimal initial condition. However, once an optimal model has been obtained, we may want to test it thoroughly, i.e. we use the same e comparing the responses of the system and of the reduced order model for different matrices Fe and H initial conditions. In this case, we update from input/output data the optimal initial condition of the e In the reduced order model without the need of repeating the identification procedure for Fe and H. following we consider two cases (u = Lω and u 6= Lω) and we provide bounds on the approximation error. 4.1. The input of the system is u = Lω g η with In this case, considering the approximation introduced by using the estimated moments CΠ the ℓ2 optimal reduced-order model (20), it is easy to show that ||y(t) − ψ(t)||ℓ2 ≤ εeη + where e e F t (ξ(0) − Pe ω(0)) , + CeAt (x(0) − Πω(0)) − He (23) ℓ2   ⊤ gη g η CΠ − CΠ εeη = CΠ − CΠ ||ω(t)||ℓ∞ . Remark 7. εeη can be made arbitrarily small decreasing the threshold η in (9), see [35] for more detail. Thus, we can minimize the error between the two output responses selecting ξ(0) as the minimizer ξe of the function VLω (ξ(0)) = CeAt (x(0) − Πω(0))− e Fet (ξ(0) − Pe ω(0)) −He i.e. (24) ℓ2 ξe = arg minξ(0) VLω (ξ(0)). (25) e p ∈ Rp×ν as Theorem 2. Define the time-snapshots Yep ∈ Rp×1 and Σ and  g η ω(0) Yep = y(0) − CΠ Σp = h e H ··· e Fet1 He 12 ⊤ g η ω(tp ) , y(tp ) − CΠ ··· e Fetp He , i⊤ . If the matrix Σp is full-rank, then ξep = zef,p (0) + Peω(0), with zef,p (0) = Σ⊤ p Σp −1 (26) e Σ⊤ p Yp , (27) e i.e. is an asymptotically converging approximation of the optimal initial condition ξ, ξe = lim ξep . p→∞ (28) e = J N (θ). e Thus, recalling that zef (t) = y(t) − CΠ g η ω(t), to compute ξep we first Proof. Note that VLω (ξ) ℓ2 determine an approximation zef,p (0) of zef (0) from g η ω(t) = zef (t) = He e Fet zef (0), y(t) − CΠ which yields (27). Equation (26) follows by comparison with equation (8) written for the reduced-order e is observable model, i.e. ξ(t) = Pe ω(t) + ξtr (t) = Pe ω(t) + zef (t). Note that assuming that the pair (Fe , H) (see Proposition 1) implies that the matrix Σp is full rank for p = ν. The resulting ξeν is such that y(0) − ψ(0) = 0. However, this does not minimize (24). To this end we should select p as large as e yielding (28). possible to obtain the least-squares estimation of ξ,  Corollary 3. Consider system (1), model (20), the signal generator (2) and the optimal initial condition ξe computed as the solution of (25). Then e + εeη . ||y(t) − ψ(t)||ℓ2 ≤ JℓN2 (θ) (29) Proof. The claim follows directly from equation (24) and the inequality (23).  Remark 8. If the moments CΠ are known, then εeη = 0. Hence, the norm of the error between the output responses of the two systems is, as expected, the optimal cost of the prediction-error problem. 4.2. The input of the system is u 6= Lω Assuming that we have obtained a constrained optimal reduced-order model as described in Section 3.4, we want now to determine the optimal initial state of the reduced-order model when the input is not an interpolating signal. The objective is once again the minimization of ||y(t) − ψ(t)||ℓ2 . e and H e are fixed and However, in this case ψ is the output of (20) determined in Section 3.4, i.e. Fe , G e Note that equation (8) does not hold and we we have to determine only the optimal initial condition ξ. have ||y(t) − ψ(t)||ℓ2 ≤ e e F t ξ(0) − H e ≤ y(t) − He 13 Rt 0 e e eF (t−τ ) Gu(τ )dτ . ℓ2 However, the problem can be easily solved at this point and the optimal initial condition can be computed as ξe = lim Σ⊤ p Σp p→∞ redefining the matrix Yep ∈ Rp×1 as  e Yp = y(0) ··· e y(tp ) − H −1 R tp 0 e Σ⊤ p Yp , (30) ⊤ e e eF (tp −τ ) Gu(τ )dτ . Hence, we have constructively solved Problem 1, i.e. we have computed the unique, up to a change of coordinates, reduced-order model which possesses the estimated transient and, simultaneously, achieves moment matching at the prescribed interpolation points. Moreover, we have determined the optimal initial condition to minimize the error between the output of the system and the output of the reduced-order model. Remark 9. The relation (30) holds in general and can be used also when u = Lω. In fact, it is easy to show that (30) gives the same result as (28) for u = Lω. Remark 10. In practice p cannot be infinity, so it is fixed “large enough”. The complexity of the routine computing the optimal initial condition depends upon the magnitude of p. Consequently, one can trade off accuracy (high p) and computational speed (low p). 4.3. Procedural overview and unstable systems We can now summarize the results of the paper in a procedure to determine reduced-order models achieving moment matching at prescribed interpolating signals that are also optimal with respect to a selected norm. The method can be applied to input/output measurements or to data generated from simulations. The procedure is summarized in the following steps. g η [35, Algorithm 1]. 1: Compute the estimated moments CΠ 2: Compute the sequence {e ytr (ti )} from equation (10). 3: Compute the minimizer of (12) using a prediction-error algorithm. e from equation (21). 4: Compute G 5: Compute the optimal initial condition ξe from equation (30). If the system is asymptotically stable and the input is bounded, the procedure that we have outlined is completely data-driven and no knowledge of the matrices of the system is required. From a computational point of view, since the matrices of the original system are never used, the complexity of the proposed method depends only on the dimension ν of the reduced order model (with ν << n). The most critical operation, from the point of view of obtaining a good reduced order model, is therefore the solution of the prediction error problem using system identification tools. 14 At last, we consider the case in which the system is not asymptotically stable. In this case we cannot distinguish between a steady-state and a transient response, however, the output response still satisfies equation (8). Thus, we may proceed in two ways. 1. We assume that the matrices A, B and C are known and, thus, that we can compute the moments CΠ solving the Sylvester equation (3). The rest of the procedure, i.e. the use of the prediction-error algorithm and the determination of the reduced-order model, can be applied to this case without modification (and without requiring A, B and C). In fact, note that the sampled sequence used in the prediction algorithm is truncated and, as a consequence, the discrete ℓ2 norm is well-defined. 2. We determine a stabilizing control law using, for instance, the results in [60, 61]. In fact, in [60] it is shown that it is always possible to find a data-driven stabilizing state feedback control law for controllable linear systems, whereas in [61] it is shown that the data-driven stabilizing control can be obtained also with output-feedback (for a large class of linear systems). Then, after applying this stabilizing feedback, we can use the results of the paper for the asymptotically stable case. Finally, to obtain a reduced-order model of the original open-loop system we can exploit the open-loop model reduction technique given in [62]. Note that in this case, the procedure is entirely data-driven even for the unstable case. 5. Simulations In this section we illustrate the results of the paper by means of two examples. In the first example we reduce a model describing a building with 8 floors. This first example entails a relatively low dimensional system to be reduced which, however, has a frequency response with several peaks and consequently is of difficult reduction. In the second example we reduce the Eady model, an atmospheric storm track model. This second example has large dimensionality and it is characterized by dense matrices. Both systems are benchmark models described in [63]. The system matrices can be downloaded from [64]. In both cases the matrices are used only to generate a single time-domain sequence {y(ti )} (for a randomly generated initial condition). This time-domain sequence is then used for the computation of the reduced order models using the procedure outlined in Section 4.3. No operation has been performed on the matrices of the two systems to be reduced. 5.1. Los Angeles University Hospital building model (n = 48) Mechanical models can be described by the second-order differential system M q̈ + Cd q̇ + Kq = Bd u 15 Bode Diagram Magnitude (dB) −40 −50 −60 −70 −80 −90 Phase (deg) 90 45 0 −45 −90 0 10 1 10 Frequency (rad/s) 2 10 Figure 1: Bode plot of the building model (solid/blue line) and of the optimal reduced-order model (dotted/red line). Magnitude (dB) -50 -100 -150 100 101 102 100 101 102 Phase (deg) 20 0 -20 Figure 2: Bode plot of the error system. The top graph is in decibel whereas the bottom graph is in degrees. 16 −7 x 10 6 4 2 0 −2 −4 −6 0 5 10 15 t 20 25 30 5 10 15 t 20 25 30 −8 x 10 2.5 2 1.5 1 0.5 0 0 Figure 3: Case u = Lω. Time histories of the output response of the building model (solid/blue line) and of the output response of the reduced-order model (dotted/red line). The bottom graph shows the absolute error. where q ∈ Rκ , M ∈ Rκ×κ , Cd ∈ Rκ×κ and Bd ∈ Rκ×1 , with M positive definite. This system can be written in the form (1) with  A= 0 I −M −1 K −M −1 Cd  ,  B= 0 −M −1 Bd  , and C taken according to the application. In particular, we consider the model of a building (Los Angeles University Hospital) with 8 floors, each having three degrees of freedom, i.e. displacements in x− and y−directions, and rotation [65, 63]. The resulting system (1) has a state of dimension n = 48. The output of the system is the 25-th component of the state which corresponds to displacement in the x−direction of the first floor. Note that this model has been reduced with various methods in [17], obtaining a reduced-order model of order ν = 31. In this paper we reduce the system with a model of order ν = 19, interpolating at the points 0, ±5.22ι, ±10.3ι, ±13.5ι, ±22.2ι, ±24.5ι, ±36ι, ±42.4ι, ±55.9ι and ±70ι (corresponding to the main frequency peaks). Using the MATLAB R2014a function ssest to implement the prediction-error algorithm, we determine a constrained optimal reduced-order model as defined in (20). Fig. 1 shows the Bode plot of the system in solid/blue line and of the optimal reduced-order model in dotted/red line. The vertical dashed lines indicate the interpolation points. Fig. 2 shows the Bode plot of the error system. We note that the main frequency peaks are approximated with negligible error and that the approximation is uniformly good2 . However, some 2 The two Bode plots in Fig. 2 should be overlapped for low frequencies because 0 is an eigenvalue of S. The mismatch noticeable a low frequencies is a numerical artifact: by direct computation, the moment at 0 of the system is 0, whereas 17 Magnitude (dB) Bode Diagram 60 50 Phase (deg) 40 −30 −60 −90 −1 10 0 10 Frequency (rad/s) 1 10 Figure 4: Bode plot of the Eady model (solid/blue line) and of the optimal reduced-order model (dotted/red line). minor frequency peaks are not maintained in the reduced-order model. This fact suggests that a reduced-order model with higher order could be used to improve the approximation of the minor peaks. Fig. 3, top graph, shows the output response of system (1) in solid/blue line for a randomly generated initial state x(0) and the output response of the reduced-order model in dotted/red line for an initial state ξ(0) computed as described in Section 4.1 (the input satisfies u = Lω). The bottom graph shows the corresponding absolute error. We note that the qualitative behavior of the reduced-order model can be considered satisfactory and that the lack of the minor peaks in the Bode plot of the reduced-order model does not impair significantly the time-domain behavior. 5.2. Atmospheric storm track model (n = 598) In the second example we consider the idealized model of the mid-latitude storm track, the so-called Eady model. The model is discussed in [66, 67] and the numerical values of the parameters used in the paper are taken from [63, 64]. The model consists of a Boussinesq atmosphere with constant stratification and constant shear in thermal wind balance on a channel with periodic boundary conditions in the zonal x−direction, 0 < x < 12π; solid walls located at two latitudes y = ± π2 in the meridional y−direction and a solid lid at height z = 1, simulating the tropopause (ground at z = 0). The mean velocity is varying only with height and it is equal to z. Zonal and meridional lengths are non-dimensionalized by L = 1000 km, vertical scales by H = 10 km, velocity by U0 = 30 m/s and time by T = L U0 , so that a time unit is about 9 h. The channel has a linear damping at the entry and exit region to model the the moment at 0 of the reduced order model is -1.47e-17. In semilogarithmic scale this results in a visible mismatch. 18 300 250 200 150 0 5 10 15 t 20 25 30 0 5 10 15 t 20 25 30 20 15 10 5 0 Figure 5: Case u 6= Lω. Time histories of the output response of the Eady model (solid/blue line) and of the output response of the reduced-order model (dotted/red line). The bottom graph shows the absolute error. lack of coherence of the cyclone waves around the atmosphere of Earth. The perturbation equations for single harmonic perturbations in the meridional y-direction of the form φ(x, z, t)eιly are where ∇2 is the Laplacian ∂2 ∂x2   ∂φ = ∇−2 −z∇2 Dφ − r(x)∇2 φ , ∂t 2 ∂ 2 + ∂z 2 −l , D = ∂ ∂x and l is the meridional wave number. The linear damp      ing rate r(x) is taken, consistently with [67], to be r(x) = h 2 − tanh x − π4 /δ + tanh x − 7π 2 /δ , with h = 2.5 and δ = 1.5. The boundary conditions express the conservation of potential temperature along the solid surfaces at the ground and at the tropopause and are obtained evaluating the equation ∂2φ ∂t∂z ∂φ = −zD ∂φ ∂z + Dφ − r(x) ∂z , 1 at z = 0 and z = 1. The dynamical system in the generalized velocity variable ψ = (−∇2 ) 2 φ is 1 1 governed by the operator A = (−∇2 ) 2 ∇−2 (−z∇2 D − r(x)∇2 )(−∇2 )− 2 yielding dψ = Aψ. dt The dynamical operator is approximated spectrally in the zonal direction and with finite differences in the vertical direction yielding a linear system (1) of order n = 598 [64]. In [67], this model (however, with a discretization of order n = 400 instead of n = 598) has been reduced using balanced truncation to order ν = 60. In this paper we reduce the system with a model of order ν = 17, interpolating at the points 0, ±0.7ι, ±0.721ι, ±0.8ι, ±0.9ι, ±1.2ι, ±1.3ι, ±2ι and ±7ι (to interpolate the steady-state for constant inputs, on the central features of the Bode plot and at a few high frequency points.) . 19 Fig. 4 shows the Bode plot of the system in solid/blue line and of the optimal reduced-order model in dotted/red line. The vertical dashed lines indicate the interpolation points. We note that the approximation is good in the range of frequencies under examination but that, at high frequencies, the two responses start to diverge. Thus, we check the qualitative behavior of the reduced-order model when the input is not Lω. To this end, we select the input as u= Pρ i=1 wi cos(fi t), (31) where f1 = 3, f2 = 15, f3 = 23, f4 = 37, f5 = 0.1, f6 = 50, f7 = 0.001 and the weights wi are randomly generated (in this simulation w1 = 0.0406, w2 = 0.0877, w3 = 0.0177, w4 = 0.0950, w5 = 0.0676, w6 = 0.0510 and w7 = 0.0724). Note that the input (31) has sinusoidal terms which are not part of the interpolated signals. Fig. 5 top graph, shows the output response of system (1) in solid/blue line and the output response of the reduced-order model in dotted/red line when the input to the two systems is selected as in (31). The initial condition of the reduced-order model has been computed from equation (30). The bottom graph shows the corresponding absolute error. We see that the error amplitude is small compared to the amplitude of the signal. However, the error does not decay to zero (since the steady-state of the two systems is different). 6. Conclusion In this paper we have solved the problem of constrained optimal model reduction by moment matching. Using a data-driven approach we have determined the unique reduced-order model which possesses the estimated transient and, simultaneously, achieves moment matching at the prescribed interpolation signals. The discrepancy between the output of the system and the output of the reducedorder model has been characterized and a method to compute the optimal initial condition of the reduced-order model has been given. We have also discussed connections with some special ℓp norm and with the H2 norm. The results of the paper have been illustrated by means of the reduction of two benchmark models based on physical systems. Further research will focus on the extension of the technique to nonlinear systems and to measurements affected by noise. Acknowledgement The work of G. Scarciotti has been supported in part by Imperial College London under the Junior Research Fellowship Scheme. The work of A. Astolfi has been supported in part by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No 739551 (KIOS CoE). The work of Z.P. Jiang has been supported in part by the National Science Foundation under Grant ECCS-1501044. 20 [1] K. J. Åström and P. R. Kumar, “Control: A perspective,” Automatica, vol. 50, no. 1, pp. 3–43, 2014. [2] B. C. Moore, “Principal component analysis in linear systems: controllability, observability, and model reduction,” IEEE Transactions on Automatic Control, vol. 26, no. 1, pp. 17–32, 1981. [3] D. G. Meyer, “Fractional balanced reduction: model reduction via a fractional representation,” IEEE Transactions on Automatic Control, vol. 35, no. 12, pp. 1341–1345, 1990. [4] W. S. Gray and J. Mesko, “General input balancing and model reduction for linear and nonlinear systems,” in Proceedings of the 1997 European Control Conference, Brussels, Belgium, 1997, pp. 2862–2867. [5] S. Lall and C. Beck, “Error bounds for balanced model reduction of linear time-varying systems,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 946–956, 2003. [6] V. M. Adamjan, D. Z. Arov, and M. G. Krein, “Analytic properties of Schmidt pairs for a Hankel operator and the generalized Schur-Takagi problem,” Mathematics of the USSR Sbornik, vol. 15, pp. 31–73, 1971. [7] K. Glover, “All optimal Hankel-norm approximations of linear multivariable systems and their L∞ -error bounds,” International Journal of Control, vol. 39, no. 6, pp. 1115–1193, 1984. [8] M. G. Safonov, R. Y. Chiang, and D. J. N. Limebeer, “Optimal Hankel model reduction for nonminimal systems,” IEEE Transactions on Automatic Control, vol. 35, no. 4, pp. 496–502, 1990. [9] H. Kimura, “Positive partial realization of covariance sequences,” Modeling, Identification and Robust Control, pp. 499–513, 1986. [10] C. I. Byrnes, A. Lindquist, S. V. Gusev, and A. S. Matveev, “A complete parameterization of all positive rational extensions of a covariance sequence,” IEEE Transactions on Automatic Control, vol. 40, pp. 1841–1857, 1995. [11] T. T. Georgiou, “The interpolation problem with a degree constraint,” IEEE Transactions on Automatic Control, vol. 44, pp. 631–635, 1999. [12] A. C. Antoulas, J. A. Ball, J. Kang, and J. C. Willems, “On the solution of the minimal rational interpolation problem,” Linear Algebra and Its Applications, Special Issue on Matrix Problems, vol. 137-138, pp. 511–573, 1990. [13] C. I. Byrnes, A. Lindquist, and T. T. Georgiou, “A generalized entropy criterion for NevanlinnaPick interpolation with degree constraint,” IEEE Transactions on Automatic Control, vol. 46, pp. 822–839, 2001. 21 [14] K. Gallivan, A. Vandendorpe, and P. Van Dooren, “Sylvester equations and projection-based model reduction,” Journal of Computational and Applied Mathematics, vol. 162, no. 1, pp. 213–229, 2004. [15] K. A. Gallivan, A. Vandendorpe, and P. Van Dooren, “Model reduction and the solution of Sylvester equations,” in 17th International Symposium on Mathematical Theory of Networks and Systems, Kyoto, Japan, 2006. [16] S. Gugercin, A. C. Antoulas, and C. Beattie, “H2 model reduction for large-scale linear dynamical systems,” SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 2, pp. 609–638, 2008. [17] A. Antoulas, Approximation of Large-Scale Dynamical Systems. Philadelphia, PA: SIAM Advances in Design and Control, 2005. [18] A. Astolfi, “Model reduction by moment matching for linear and nonlinear systems,” IEEE Transactions on Automatic Control, vol. 55, no. 10, pp. 2321–2336, 2010. [19] G. Scarciotti and A. Astolfi, “Nonlinear model reduction by moment matching,” Foundations and Trends in Systems and Control, vol. 4, no. 3-4, pp. 224–409, 2017. [20] L. Meier and D. G. Luenberger, “Approximation of linear constant systems,” IEEE Transactions on Automatic Control, vol. 12, no. 5, pp. 585–588, 1967. [21] D. A. Wilson, “Optimum solution of model-reduction problem,” Proceedings of the Institution of Electrical Engineers, vol. 117, no. 6, pp. 1161–1165, 1970. [22] D. C. Hyland and D. Bernstein, “The optimal projection equations for model reduction and the relationships among the methods of Wilson, Skelton, and Moore,” IEEE Transactions on Automatic Control, vol. 30, no. 12, pp. 1201–1211, 1985. [23] A. E. Bryson and A. Carrier, “Second-order algorithm for optimal model order reduction,” Journal of Guidance, Control, and Dynamics, vol. 13, no. 5, pp. 887–892, 1990. [24] Y. Halevi, “Frequency weighted model reduction via optimal projection,” in Proceedings of the 29th IEEE Conference on Decision and Control, vol. 5, 1990, pp. 2906–2911. [25] A. Lepschy, G. A. Mian, G. Pinato, and U. Viaro, “Rational l2 approximation: a non-gradient algorithm,” in Proceedings of the 30th IEEE Conference on Decision and Control, vol. 3, 1991, pp. 2321–2323. [26] L. Baratchart, M. Cardelli, and M. Olivi, “Identification and rational ℓ2 approximation a gradient algorithm,” Automatica, vol. 27, no. 2, pp. 413–417, 1991. [27] J. T. Spanos, M. H. Milman, and D. L. Mingori, “A new algorithm for l2 optimal model reduction,” Automatica, vol. 28, no. 5, pp. 897–909, 1992. 22 [28] P. Fulcheri and M. Olivi, “Matrix rational h2 approximation: A gradient algorithm based on schur analysis,” SIAM Journal on Control and Optimization, vol. 36, no. 6, pp. 2103–2127, 1998. [29] W. Yan and J. Lam, “An approximate approach to h2 optimal model reduction,” IEEE Transactions on Automatic Control, vol. 44, no. 7, pp. 1341–1358, 1999. [30] T. Zeng, J. Chen, and C. Lu, “A tangential interpolation algorithm for optimal H2 model reduction with stability guarantee,” in 2013 Fifth International Conference on Computational and Information Sciences, 2013, pp. 955–958. [31] G. Scarciotti, “Low computational complexity model reduction of power systems with preservation of physical characteristics,” IEEE Transactions on Power Systems, vol. 32, no. 1, pp. 743–752, 2017. [32] G. Scarciotti and A. Astolfi, “Moment-based discontinuous phasor transform and its application to the steady-state analysis of inverters and wireless power transfer systems,” IEEE Transactions on Power Electronics, vol. 31, no. 12, pp. 8448–8460, 2016. [33] N. Faedo, G. Scarciotti, A. Astolfi, and J. V. Ringwood, “Energy-maximising control of wave energy converters using a moment-domain representation,” Control Engineering Practice, vol. 81, pp. 85–96, 2018. [34] V. Breschi, S. Formentin, G. Scarciotti, and A. Astolfi, “Simulation-driven fixed-order controller tuning via moment matching,” in Proceedings of the 2019 European Control Conference, 2019, pp. 2307–2312. [35] G. Scarciotti and A. Astolfi, “Data-driven model reduction by moment matching for linear and nonlinear systems,” Automatica, vol. 79, pp. 340–351. [36] T. Bian, Y. Jiang, and Z. P. Jiang, “Adaptive dynamic programming and optimal control of nonlinear nonaffine systems,” Automatica, vol. 50, no. 10, pp. 2624–2632, 2014. [37] Y. Jiang and Z. P. Jiang, “Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics,” Automatica, vol. 48, no. 10, pp. 2699–2704, 2012. [38] ——, “Robust adaptive dynamic programming and feedback stabilization of nonlinear systems,” IEEE Transactions on Neural Networks and Learning Systems, 2014. [39] B. Peherstorfer and K. Willcox, “Data-driven operator inference for nonintrusive projection-based model reduction,” Computer Methods in Applied Mechanics and Engineering, vol. 306, pp. 196–215, 2016. [40] E. N. Lorenz, Empirical Orthogonal Functions and Statistical Weather Prediction, ser. Scientific report 1, Statistical Forecasting Project. MIT, Department of Meteorology, 1956. 23 [41] C. W. Rowley, T. Colonius, and R. M. Murray, “Model reduction for compressible flows using POD and Galerkin projection,” Physica D: Nonlinear Phenomena, vol. 189, no. 1-2, pp. 115–129, 2004. [42] B. R. Noack, K. Afanasiev, M. Morzynski, G. Tadmor, and F. Thiele, “A hierarchy of lowdimensional models for the transient and post-transient cylinder wake,” Journal of Fluid Mechanics, vol. 497. [43] K. Willcox and J. Peraire, “Balanced model reduction via the proper orthogonal decomposition,” AIAA Journal, vol. 40, no. 11, pp. 2323–2330, 2002. [44] P. van Overschee and L. R. de Moor, Subspace Identification for Linear Systems: Theory – Implementation – Applications. Kluwer Academic Publishers, 1996. [45] M. Verhaegen and V. Verdult, Filtering and System Identification: A Least Squares Approach. Cambridge University Press, 2007. [46] A. J. Mayo and A. C. Antoulas, “A framework for the solution of the generalized realization problem,” Linear Algebra and its Applications, vol. 425, no. 2-3, pp. 634–662, 2007. [47] A. Ionita and A. Antoulas, “Data-driven parametrized model reduction in the Loewner framework,” SIAM Journal on Scientific Computing, vol. 36, no. 3, pp. A984–A1007, 2014. [48] A. C. Antoulas, I. V. Gosea, and A. C. Ionita, “Model reduction of bilinear systems in the Loewner framework,” SIAM Journal on Scientific Computing, vol. 38, no. 5, pp. B889–B916, 2016. [49] B. Peherstorfer, S. Gugercin, and K. Willcox, “Data-driven reduced model construction with timedomain Loewner models,” SIAM Journal on Scientific Computing, vol. 39, no. 5, pp. A2152–A2178, 2017. [50] J. E. Cooper, “On-line version of the eigensystem realization algorithm using data correlations,” Journal of Guidance, Control, and Dynamics, vol. 20, no. 1, pp. 137–142, 1999. [51] I. Houtzager, J. W. van Wingerden, and M. Verhaegen, “Recursive predictor-based subspace identification with application to the real-time closed-loop tracking of flutter,” IEEE Transactions on Control Systems Technology, vol. 20, no. 4, pp. 934–949, 2012. [52] M. Majji, J. Juang, and J. L. Junkins, “Observer/Kalman-filter time-varying system identification,” Journal of Guidance, Control, and Dynamics, vol. 33, no. 3, pp. 887–900, 2010. [53] D. C. Rebolho, E. M. Belo, and F. D. Marques, “Aeroelastic parameter identification in wind tunnel testing via the extended eigensystem realization algorithm,” Journal of Vibration and Control, vol. 20, no. 11, pp. 1607–1621, 2014. 24 [54] M. S. Hemati, M. O. Williams, and C. W. Rowley, “Dynamic mode decomposition for large and streaming datasets,” Physics of Fluids, vol. 26, no. 11, pp. 111 701–1–111 701–6, 2014. [55] G. Scarciotti, Z. P. Jiang, and A. Astolfi, “Constrained optimal reduced-order models from input/output data,” in Proceedings of the 55th IEEE Conference on Decision and Control, Las Vegas, NV, USA, December 12-14, 2016, pp. 7453–7458. [56] S. Galeani and M. Sassano, “Model reduction by moment matching at discontinuous signals via hybrid output regulation,” in 2015 European Control Conference, 2015, pp. 1189–1194. [57] L. Ljung, System Identification: Theory for the User. Pearson Education, 1998. [58] ——, System identification toolbox: User’s guide, 9th ed. MathWorks Incorporated, 2011. [59] J. C. Doyle, B. A. Francis, and A. R. Tannenbaum, Feedback Control Theory. New York: Macmillan, 1992. [60] T. Bian and Z. P. Jiang, “Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design,” Automatica, vol. 71. [61] W. Gao, Y. Jiang, Z. P. Jiang, and T. Chai, “Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming,” Automatica, vol. 72. [62] G. Scarciotti and A. Astolfi, “Model reduction of neutral linear and nonlinear time-invariant timedelay systems with discrete and distributed delays,” IEEE Transactions on Automatic Control, vol. 61, no. 6, pp. 1438–1451, 2016. [63] Y. Chahlaoui and P. Van Dooren, Dimension Reduction of Large-Scale Systems: Proceedings of a Workshop held in Oberwolfach, Germany, October 19-25, 2003. Berlin, Heidelberg: Springer, 2005, ch. Benchmark Examples for Model Reduction of Linear Time-Invariant Dynamical Systems. [64] SLICOT, “Benchmark examples for model reduction,” http://slicot.org/20-site/ 126-benchmark-examples-for-model-reduction, accessed: 2018-06-07. [65] A. C. Antoulas, D. C. Sorensen, and S. Gugercin, “A survey of model reduction methods for large-scale systems,” Contemporary Mathematics, vol. 280, pp. 193–219, 2001. [66] B. F. Farrell and P. J. Ioannou, “Stochastic dynamics of the mid-latitude atmospheric jet,” Journal of the Atmospheric Sciences, vol. 52, pp. 1642–1656, 1995. [67] ——, “State estimation using a reduced-order Kalman filter,” Journal of the Atmospheric Sciences, vol. 58, no. 23, pp. 3666–3680, 2001. 25