A Robust Unscented Transformation for Uncertain
Moments
arXiv:1902.09293v1 [math.ST] 25 Feb 2019
Hugo T. M. Kussabaa,∗, João Y. Ishiharaa , Leonardo R. A. X. Menezesa
a
Department of Electrical Engineering, University of Brasília – UnB, 70910-900,
Brasília, DF, Brazil
Abstract
This paper proposes a robust version of the unscented transform (UT) for
one-dimensional random variables. It is assumed that the moments are not
exactly known, but are known to lie in intervals. In this scenario, the moment
matching equations are reformulated as a system of polynomial equations and
inequalities, and it is proposed to use the Chebychev center of the solution set
as a robust UT. This method yields a parametrized polynomial optimization
problem, which in spite of being NP-Hard, can be relaxed by some algorithms
that are proposed in this paper.
Keywords: Unscented Transform, Polynomial Optimization, Lasserre’s
hierarchy, Statistics, Filtering.
c 2019. This manuscript version is made available under the CC BY-NC-ND 4.0
license http://creativecommons.org/licenses/by-nc-nd/4.0/. This article has been
accepted for publication in a future issue of the Journal of Franklin Institute, but has not
been fully edited. Content may change prior to final publication. The final version of
record is available at https://doi.org/10.1016/j.jfranklin.2019.02.018.
∗
Corresponding author
Email address:
[email protected] (Hugo T. M. Kussaba)
✩
Preprint submitted to Elsevier
February 26, 2019
1. Introduction
In numerous problems of statistics and stochastic filtering, one is often
interested in calculating the posterior expectation of a continuous random
variable X that undergoes a nonlinear transform f , viz.:
Z
f (ξ)pX (ξ)dξ.
E {f (X)} =
(1)
R
It is not always possible to have a closed-form expression for this integral in terms of elementary functions: if this integral does not satisfy the
hypothesis of Liouville’s theorem (see for instance, Section 12.4 of [1]), then
the antiderivative of this integral cannot be expressed in terms of elementary functions. Thus, instead of using analytical methods to calculate (1), in
many situations numerical methods must be employed.
A common way to numerically calculate (1) is by using the Monte Carlo
integration method [2], which is a stochastic sampling technique: by taking
a sufficiently large number of samples of the random variable X, one can
approximate the probability density function pX and obtain an estimate for
(1). However, this method can be very demanding computationally, since it
frequently employs several thousands of simulations to obtain the statistics
of the final result.
Another way to calculate (1) with less computational burden than Monte
Carlo integration method is the technique of Unscented Transform (UT).
Originally proposed for the problem of extending the Kalman filter for nonlinear dynamical systems [3], this method has also been applied in several
problems of engineering, such as in the analysis of the sensitivity of antennas
[4] and in circuit optimization [5].
2
Different from the Monte Carlo integration method, the UT is a deterministic sampling technique: in place of choosing random points to approximate
pX , points known as sigma points are deterministically selected to capture the
statistics of pX . This is accomplished by generating a discrete distribution p′X
having the same first and second (and possibly higher) moments of pX . The
mean, covariance and higher moments of the transformed ensemble of sigma
points can then be computed as the estimate of the nonlinear transformation
f of the original distribution [3], [6]. Thus, at least the first moments of pX
must exist and be exactly known. In several circumstances, however, this is
not valid: for instance, it may be the case that the distribution does not even
have the first moment (one example is the Cauchy distribution [7]). In the
case that the moments can be assumed to exist, it is usual that they are not
precisely known, but it is still possible to have upper and lower bounds for
the exact value of moments from statistical experiments or by the use, for
instance, of Chebychev’s inequality [2].
In this paper it is designed a technique for generating sigma points when
the exact value of the moments are not known, but upper and lower bounds
for the unknown moments are known. In this case, the moment matching
equations of UT are no longer just a system of polynomial equations but a
system of polynomial equations with polynomial inequalities. Furthermore,
since this system can have more than one solution, it is possible to choose a
set of sigma points which minimizes a given cost function by formulating the
problem as a polynomial optimization problem (POP). Although the solution
to this problem is in general computationally infeasible [8], by using Lasserre’s
hierarchy of semidefinite programming relaxations one can approximate the
3
solution of the original POP problem by the solution of a computationally
feasible convex optimization problem[9, 10].
The main contribution of this paper is the introduction of the concept
of UT robustness in the sense of exploiting the upper and lower bounds
for moments. A robust UT is proposed in [11] but robustness has different
meaning and it is achieved by matching precisely known high order moments.
It is important to note that Lasserre’s hierarchy of relaxations was applied in
earlier investigations of the moment matching problem [12], but its use was
limited to polynomial equations while here polynomial inequalities are also
taken into account.
This paper is organized as follow. Some preliminaries for the robust UT
are presented in Section 2. The robust UT itself is detailed in Section 3.
Details about the computation of robust sigma points is presented in Section 4. The computation of UT transform is presented in Section 5. Finally,
the conclusion is presented in Section 7.
2. Unscented Transform
The rationale behind the unscented transformation is that it is easier to
calculate a moment for a discrete distribution than a continuous one. In
fact, to compute the posterior distribution of a random scalar variable X
with distribution pX by a function f , one must calculate the integral (1)
while, in other hand, if Z is a discrete random variable with m + 1 atoms zi
and distribution pZ , one only needs to calculate the sum
E {f (Z)} =
m+1
X
i=1
4
f (zi )pZ (zi ).
(2)
Henceforth, if it is possible to choose the atoms and their weights of an
adequate pZ to approximate pX , then the value E{f (Z)} would be a good
approximation to E{f (X)}, but with (2) being easier to compute than (1).
Thus, the principle behind the UT is to approximate the continuous distribution pX by the discrete distribution pZ by equating the first m moments of
these distributions. In other words, the following equations must be satisfied:
E{X k } = E{Z k }
=
m+1
X
(3)
zik pZ (zi ), k = 1, . . . , m,
(4)
i=1
where is supposed the access to E{X k } for k = 1, . . . , m.
Given E{X k } it is always possible to find zi and pZ (zi ), i = 1, . . . , m + 1
(which will be called sigma point and its weight, respectively) satisfying (4).
In fact, as it will be stated next, there is at least one solution for the equations
E {gk (Z)} = E {gk (X)} , k = 1, . . . , m,
(5)
where gk : R → R are continuous functions, gk 6≡ gj for k 6= j and from
which (4) is a particular case with gk (x) = xk . That (5) has at a least one
solution is stated next in Theorem 1.
Theorem 1. Consider that the m moments E{gk (X)}, k = 1, . . . , m, are
given. The system (5) in terms of variables zi and pZ (zi ) has at least one
solution with at most m + 1 sigma points.1
1
The proof of the theorem is known in the mathematical literature in the context of
the Caratheodory’s theorem. Since it is less known in the context of UT literature, the
proof is presented here for easy reference. See, e.g., [13].
5
Proof. Since the point P = (E {g1 (X)} , . . . , E {gm (X)}) belongs to the convex hull of the set G = {(g1 (x), . . . , gm (x)) | x ∈ R}, then Caratheodory’s
theorem [14, Theorem 1.3.6] gives that P can be written as a convex combination of at most m + 1 points in G. Thus, one has that there exist θi ≥ 0
and zi ∈ R such that
E{gk (X)} =
m+1
X
θi gk (zi ), k = 1, . . . , m
(6)
i=1
and
m+1
X
θi = 1.
(7)
i=1
Taking Z to be the discrete random variable with probability distribution
given by
pZ (k) =
θi , if k = zi ,
0,
otherwise,
one has that equations (6) and (7) are exactly the equations given by (5).
It is important to note that Theorem 1 only states the existence of sigma
points satisfying (5), but it does not give any conclusion about uniqueness.
In fact, many solutions are possible, and the choice of an adequate set of
sigma points has been investigated thoroughly in the literature [6, 15].
Remark 1. Specifically for the first moments of X focused in this paper, i.e.
the case that gk (x) = xk , k = 0, . . . , m + 1, one can also see (4) as a Gaussian
quadrature integration scheme with pX being the weighting function and
zi and ωi := pZ (zi ), i = 1, . . . , m + 1, being respectively the nodes and
their weights in the quadrature formula. Thus, depending on the probability
density function pX , the sigma points can be readily calculated as the roots
6
of an orthogonal polynomial [16, Section 4.6]. For instance, if pX is a normal
distribution, then the sigma points are the roots of a Hermite polynomial.
One can note that if the only intention were to find some discrete random
variable such that E{f (X)} = E{f (Z)}, it is always possible to find a Z
variable with two sigma points. In fact, in Theorem 1, consider m = 1 and
f = g1 . However, this would imply the knowledge of the function f . In the
UT reasoning, it is sought a greater number of sigma points in order that the
approximation E{f (X)} = E{f (Z)} be valid for a greater number of functions f . In [17] it is shown that the approximation is good for any function
f which can be well approximated by its m-order Taylor representation.
On top of that, the larger is m, the more precise is the approximation for
Rb
P
E{f (X)}: an estimate for the error a pX (x)f (x) dx − m
i=1 ωi f (zi ) is given
by
f (2m) (ξ)
(2m)!
Z
b
pX (x)h2m (x) dx,
a
where ξ ∈ (a, b) ⊂ R and hm is the associated monic orthogonal polynomial
of degree m associated to pX [18, Theorem 3.6.24]. As a matter of fact, the
Gauss quadrature computed integral for E{f (X)} is exact for all f (x) that
are polynomials of degree less or equal than 2m − 1 [18, pp. 172-175].
Since Theorem 1 takes in account generic functions gk , one can analyze
very general moment settings (for instance, the case of fractional moments
is worked in [19]).
Summing up, one can note that the basic assumption for the UT theory
until now is that the moments are precisely known. However, this assumption
can be too strong in practical situations where the moments are estimated
from experiments and then, their values are only known to be in some in7
tervals. In this work, we deal with the scenario of unknown moments and
propose to calculate a robust set of sigma-points in the sense of minimizing
the worst possible error between this robust choice of sigma-points and the
sigma-points computed by using the real values of the moments.
3. Robust UT
To motivate the proposed robust UT consider a normally distributed
random variable X with mean µ and variance V . In the case where µ and V
are given and m = 2, it is known that a set of sigma points are given by [6]
z1 = µ −
z2 =
z3 = µ +
√
µ,
√
3V ,
(8)
3V ,
with its weights given respectively by ω1 = 61 , ω2 =
2
3
and ω3 = 16 .
Suppose now that the value of V is not exactly known, but an upper
bound V and a lower bound V for V is known, i.e. V ∈ V , V . A naive
approach for choosing the sigma points would be to use the mean value
between V and V for V in (8). In this case, the sigma points z1 and z3 would
be given by
z1 =z1M
z3 =z3M
q
:= µ − 3(V + V )/2,
q
:= µ + 3(V + V )/2,
and z2 , and the weights ω1 , ω2 and ω3 unchanged.
As Figure 1 illustrates, this can be a pessimistic choice for the sigma
points, since if, for example, the real value of V is in fact nearer of V̄ ,
8
a better choice of sigma points would be nearer to the point (z1 , z3 ) =
√
√
µ − 3V , µ + 3V . In fact, as can be seen in Figure 1, a preferable
choice for the sigma points would be at the center of the region:
p
p
√
C
z1 =z1 := µ − ( 3/2)( V + V ),
p
p
√
z3 =z3C := µ + ( 3/2)( V + V ).
This choice is precisely the Chebychev center2 of the set of possibles sigma
points given that V ∈ V , V . It has the property of having the minimum
worst possible error between the chosen sigma point set and the sigma point
set corresponding to the true value of V .
Figure 1: Possible choices of sigma points.
Consider now a general scenario where some of the moments are not
known, but upper and lower bounds for them are. In this case, the set of
2
There are two non-equivalent definitions of Chebychev center of a bounded set with
non-empty interior: the first definition is the center of the minimal-radius ball enclosing
this set, and the second is the center of the largest incribed ball in this set [20]. In this
paper, only the first definition will be used.
9
possible sigma points and their weights would be in the semialgebraic set
S ⊂ R2m+2 of elements x = (z1 , . . . , zm+1 , ω1 , . . . , ωm+1 ), defined as solution
of the system
ωj
Pm+1
i=1
S:
Pm+1
i=1
Pm+1
i=1
Pm+1
i=1
ωi
≥
0,
=
1,
j ∈ {1, . . . , m},
zik1 ωi = E{X k1 },
zik2 ωi ≤
zik2 ωi ≥
k1 ∈ K 1 ,
u k2 ,
k2 ∈ K 2 ,
ℓ k2 ,
k2 ∈ K 2 ,
(9)
where K1 (resp. K2 ) is the set of indexes k for which the k-moment is known
(resp. unknown), K1 ∪ K2 = {1, . . . , m} and uk2 and ℓk2 are respectively
known upper and lower bounds for the k2 -moment.
Although Theorem 1 guarantees a solution for (9), this solution may not
be unique. On that account, there is more than one set of sigma points and
corresponding weights, and it is natural to ask which is a good choice of
sigma points and weights. Based on the previous example, a so-called robust
choice would be the Chebychev center of the semialgebraic set defined by
(9) since this choice minimizes the worst possible error between the chosen
sigma point set and the sigma point set corresponding to the possible true
values of the moments of X.
Different from the previous example, however, in this generic scenario it
may be the case that an analytic formula to express the sigma points as the
function of the moments of the random variable X is not available, and the
following optimization problem must be solved to find the Chebychev center:
C
C
(z1C , . . . , zm+1
, ω1C , . . . , ωm+1
) = arg min maxkx − x̂k2 ,
x̂∈R2m+2
10
x∈S
(10)
where S is the solution set for (9) and x = (z1 , . . . , zm+1 , ω1 , . . . , ωm+1 ).
It is important to note that the optimization problem in (10) is not always
guaranteed to have a solution, since the set S may not be bounded (if, for
example, ω1 is zero, then z1 can take any value). Nevertheless, it still makes
sense to use the Chebychev center of a large bounded subset of S to try to
choose a set of sigma points minimizing the worst possible estimation error.
In fact, for any (bounded or unbounded) set S, one can take an ε > 0 and
consider the Chebychev center of the semi-algebraic set Sε ⊆ S defined as
the set of points
x = (z1 , . . . , zm+1 , ω1 , . . . , ωm+1 )
satisfying
ωj
Pm+1
i=1
Sε :
Pm+1
i=1
Pm+1
i=1
Pm+1
i=1
ωi
≥
ε,
=
1,
zik1 ωi = E{X k1 },
zik2 ωi ≤
zik2 ωi ≥
j ∈ {1, . . . , m},
k1 ∈ K 1 ,
u k2 ,
k2 ∈ K 2 ,
ℓ k2 ,
k2 ∈ K 2 .
(11)
As ε decreases, Sε covers a larger part of S. Fig. 2 illustrates how the set Sε
approximates the set S by choosing a sufficiently small ε > 0.
While the Chebychev center of S may not exist, the next theorem assures
the existence of the Chebychev center of Sε for any ε > 0.
Theorem 2. If m ≥ 2 and ε > 0, then the optimization problem
C
C
(z1C , . . . , zm+1
, ω1C , . . . , ωm+1
) = arg min maxkx − x̂k2 ,
x̂∈R2m+2 x∈Sε
has a solution.
11
(12)
Proof. It suffices to prove that the set Sε is bounded. The variables ωi ,
i = 1, . . . , m + 1 are all inside the m + 1 dimensional simplex and thus
bounded. Since
m+1
X
i=1
zi2 ωi ≤ u2 ,
and ωi > 0, it is impossible for any variable zi to grow without bound.
Remark 2. It is interesting to note that alternatively, instead of the compact
(closed and bounded) set Sε one could consider the set Ŝ defined as the set
S up to the first inequalities replaced by a strict inequality (that is, ωj ≥ 0
is replaced by ωj > 0, j ∈ {1, . . . , m}). This set is indeed bounded as
Sε . However, strict inequalities are in general not well-handled by numeric
solvers.
The inner maximization problems of (10) and (12) are polynomial optimization problems (POP). Such problems are ubiquitous and are encountered
in several fields [21], such as: finance [22, 23, 24], robust and nonlinear control
[25, 26], signal processing [27, 28], quantum physics [29] and materials science
[30]. It is known that this problem is NP-Hard in general [8], but despite
this, it is possible to approximate a POP by convex optimization problems
which are computationally feasible using Lasserre’s hierarchy of semidefinite
programming relaxations [9, 10]. In fact, by using the GloptiPoly 3 package3
[31] or the SparsePOP package4 [32] for MATLAB, or ncpol2sdpa library5
[33] for Python, these relaxations can be easily implemented and enables
3
Available
for
download
at
http://homepages.laas.fr/henrion/software/
gloptipoly3/.
4
Available for download at http://sourceforge.net/projects/sparsepop/.
5
Available for download at https://gitlab.com/peterwittek/ncpol2sdpa.
12
10
5
z1
10
0
-5
- 10
10
5
5
z1
z2
0
0
-5
-5
- 10
-5
z2 0
- 10
0.0
5
- 10
0.5
0.0
ω1
10
1.0
0.5
ω1
1.0
10
0.0 10
ω1 0.5
1.0
5
5
0
0
z2
z2
-5
-5
0.0
ω1 0.5
- 1.0
10
- 10
-5
0
5
10
10
z1
5
-5
0
- 10
- 10
z1
Figure 2: The regions S \ Sε (red) and Sε (blue) are plotted in various viewpoints to
illustrate the approximation of the semi-algebraic set S by Sε with ε = 0.01, m = 1,
K1 = ∅, K2 = {1, 2}, ℓ1 = ℓ2 = 0 and u1 = u2 = 1. Only ω1 , z1 and z2 are exhibited in
the graphic, since ω2 = 1 − ω1 .
13
the user to transparently construct an increasing sequence of convex LMI
relaxations whose optima are guaranteed to converge monotonically to the
global optimum of the original non-convex global optimization problem [26].
Moreover, it is possible to numerically certify the global optimality of the
problem.
A direct approach to solve (10) would be to use the mentioned Lasserre’s
hierarchy to solve the inner maximization problem for a fixed x̂ = x̂0 and a
local optimization algorithm to search which x̂0 minimizes (10). This however, besides being a hard numeric problem, would not guarantee a good
approximation to the real value of the Chebychev center. In the next section
alternative ways to compute (10) with different trade-off between precision
of the result and speed of algorithm are proposed.
4. Computation of robust sigma points
4.1. Computation of an outer box to approximate the Chebychev center
The first proposed approach to find an approximate solution to (12) is
to compute an outer-bounding box B of Sε and approximate the Chebychev
center of Sε by the Chebychev center of B. By Theorem 2, for m ≥ 2 the
constraint set Sε is bounded. Thus, each one of the following polynomial
optimization problems on x = (x1 , . . . , x2m ) ∈ Sε has solution
z i = arg min
xi s.t. x ∈ Sε , i = 1, . . . , m + 1,
2m+2
x∈R
ω i = arg min
xi+m+1 s.t. x ∈ Sε , i = 1, . . . , m + 1,
2m+2
x∈R
z i = arg max
xi s.t. x ∈ Sε , i = 1, . . . , m + 1,
2m+2
x∈R
ω i = arg max
xi+m+1 s.t. x ∈ Sε , i = 1, . . . , m + 1,
2m+2
x∈R
14
(13)
and can be used to construct an outer box
B = {(z1 , . . . , zm+1 , ω1 , . . . , ωm+1 ) ∈ R2m+2 : z i ≤ zi ≤ z i ,
ω i ≤ ωi ≤ ω i , i = 1, . . . , m + 1}
such that Sε ⊂ B. The next theorem gives an estimate of the error of the
outer-bounding box approximation.
Theorem 3. Let cB be the center of Chebychev of B and let cS be the center
of Chebychev of Sε . If d := kcB − cS k is the defined as the distance between
the centers of Chebychev, then
d≤
diam(B)
,
2
where diam(B) is the diameter of B, that is, the least upper bound of the set
of all distances between pairs of points in B.
Proof. Suppose that d > diam(B)/2. Thus cS 6∈ B, and since cS is the
center of Chebychev of Sε , one has that cS is in the convex hull of Sε . But
this implies that cS is in B, and the result follows by contradiction.
4.2. Polynomial optimization program to compute minimum enclosing ball
Another approximate solution to problem (12) can be found by using the
two-stage approach of [34]. Since the Chebychev center of Sε is always inside
of the outer box B computed in Section 4.1, problem (10) is equivalent to
C
C
(z1C , . . . , zm+1
, ω1C , . . . , ωm+1
) = arg min maxkx − x̂k2 .
x̂∈B
x∈Sε
In the first stage, the inner optimization problem of (14), namely,
J(x̂) := max kx − x̂k2 ,
x∈Sε
15
(14)
is approximated by a polynomial function J˜τ (x̂) of degree 2τ using the
semidefinite optimization program outlined in [34]. Then, in the second
stage, the outer minimization problem is replaced with the polynomial optimization problem
C
C
(z1C , . . . , zm+1
, ω1C , . . . , ωm+1
) = arg min J˜τ (x̂),
(15)
x̂∈B
which can be solved using the Lasserre’s hierarchy. As the degree 2τ of the
approximation polynomial increases, the solution of problem (15) converges
to the solution of problem (14) in the sense of [34]. Alternatively, problem
(12) can be shown to be equivalent to the following polynomial optimization
problem:
minx̂,r r
s.t. kx − x̂k2 ≤ r,
x ∈ Sε , x̂ ∈ B,
(16)
where x̂ is the Chebychev center of Sε . In other words, the Chebychev center
is the center of the radius of the minimum volume ball that encloses Sε .
Remark 3. It is interesting to note that if the gap between ℓi and ui is not
large enough, the resulting semi-definite programs from relaxing the polynomial optimization problems (13) and (16) can be numerically unstable.
In this case, however, one may use the naive approach of computing sigma
points by using the arithmetic mean of the moments without loss, since the
difference between the real Chebychev center and the point computed by this
method would be not so great due to the small difference between the upper
and lower bounds of the moments.
16
5. Computation of UT transform
C
C
Suppose that xCB := (z1C , . . . , zm+1
, ω1C , . . . , ωm+1
) is the Chebychev cen-
ter of S computed by one of the methods of Section 4. Based on (2), define
the function
U Tf (z1 , . . . , zm+1 , w1 , . . . , wm+1 ) :=
m+1
X
wi f (zi ).
i=1
As discussed above, U Tf is a good approximation for E{f (X)} for a sufficiently large m. While it is true that xCB approximates of the center of
Chebychev of S, one also wish to know if U Tf (xCB ) is near to the Chebychev
center of U Tf (S) (that is, the image of the set S by the function U Tf ). A class
of functions f such that U Tf (xCB ) is near the Chebychev center of U Tf (S)
is given by the functions such that the solution of the following optimization
problem
min D s.t. (1 − D)kx − yk ≤ kU Tf (x) − U Tf (y)k
(17)
≤ (1 + D)kx − yk,
x, y ∈ S
is sufficiently small. If these values are sufficiently small, the function U Tf is
a low-distortion geometric embedding [35], and this implies that U Tf (xCB )
is near the Chebychev center of U Tf (S). Finally, to compute (17), one can
estimate a solution for (17) by uniformly sampling random values (xi , yi ) of
S and computing max Di , where
Di := min D s.t. (1 − D)kxi − yi k ≤ kU Tf (xi ) − U Tf (yi )k
≤ (1 + D)kxi − yi k.
17
(18)
6. Numerical experiments
In this example, the computation of Chebychev center using the methods
proposed in this paper will be illustrated. Consider that one desires to compute 2 sigma points in a scenario where the values of the first and second
moment are not precisely known, but it is known that E{X} ∈ [0, 1] and that
E{X 2 } ∈ [0, 1]. In this case, one has (9) with m = 1, K1 = ∅, K2 = {1, 2},
and ℓ1 = −3 and u1 = 4, and ℓ2 = 0 and u2 = 5.
Naive method: For fixed values of E{X} = µ and E{X 2 } = V , one
can compute a point inside the set S by using the canonical formula
ω1 = 0.5, ω2 = 0.5,
p
p
z 1 = µ + V − µ2 , z 2 = µ − V − µ2 .
(19)
A naive choice of sigma points would be the use of the arithmetic mean
of the lower and upper bounds for the moments in (19). Using (19) with
µ = 21 (ℓ1 + u1 ) = 0.5 and V = 12 (ℓ2 + u2 ) = 2.5 results in the sigma points
z1 = 2, z2 = −1 with weights given by w1 = 0.5 and w2 = 0.5.
Method 1: By using the method proposed in Section 4.1 it is possible to
compute an outer-bounding box approximation to the semi-algebraic set Sε .
Using ε = 0.01, the computation of an approximate point of the Chebychev
center using outer-bounding box approximation results in the sigma points
z1 = 0, z2 = 0 with weights respectively given by w1 = 0.5 and w2 = 0.5.
Method 2: Finally, by computing a point using the method proposed
in Section 4.2 with ε = 0.01 results in the sigma points z1 = −0.0001,
z2 = 0.0001 with weights respectively given by w1 = 0.1 and w2 = 0.9.
In both Method 1 and 2, the MATLAB toolbox Gloptipoly 3 was used
to relax the polynomial optimization problems (13) and (16). A relaxation
18
order of 2 was used and the global optimality of the problems was numerically
certified by Gloptipoly3. Moreover, higher relaxation orders could not be
used, since using bigger relaxation orders results in semi-definite programs
with a large number of variables and constraints, and thus yields an unstable
numeric problem.
Figure 3 illustrates the semi-algebraic variety Sε and the coordinates of
the sigma points calculated by the three methods. It can be seen by Figure 3
that the point computed by using the outer-bounding box approximation is
the best approximation to the real Chebychev center of the variety. Moreover,
it is important to note that the point calculated by Method 2 is very far
from the real Chebychev center of the set Sε . This is due to the semi-definite
relaxation of (16) being far from the exact solution. In principle, one can
increase the relaxation, but this can lead to an unstable numeric problem,
which global optimality cannot be more certified.
Finally, to illustrate the robustness of the computed point, 100 random
samples from E{X} and E{X 2 } are respectively draw from the uniform
distributions U (ℓ1 , u1 ) and U (ℓ2 , u2 ). Solving (18) for f (x) = sin(x) with 500
samples gives an estimate for D of 0.9956, and the posterior expectation in
(1) is approximated as
E {sin(X)} ≈ ω1 sin(z1 ) + ω2 sin(z2 ),
where ω1 , ω2 , z1 and z2 are computed according to the above methods. The
P
mean error E{sin(X)} − 2i=1 ωi sin(zi ) by the naive method is 0.0339,
while by Method 1 is 3.6932 · 10−6 and by Method 2 is 1.2085 · 10−4 .
19
Figure 3: Semi-algebraic variety Sε with ε = 0.01, m = 1, K1 = ∅, K2 = {1, 2}, ℓ1 = −3,
u1 = 4, ℓ2 = 0 and u2 = 5. The blue point is the coordinates of the sigma-point obtained
by the naive method, the red point is the coordinates of the sigma-point obtained by
the outer-bounding box approximation, and the green point is obtained by solving the
polynomial optimization problem (16) directly. Only ω1 , z1 and z2 are illustrated in the
graphic, since ω2 = 1 − ω1 .
20
7. Conclusion
In this paper, it was proposed a way to devise a robust unscented transform by the computation of the Chebychev center of the semialgebraic set
defined by the possible choices of sigma points and its weights. Although, in
general, this problem is NP-Hard, some methods are proposed in this paper
to approximate the solution of the original problem by convex optimization
problems. Further works intend to generalize the present work to higher dimensions, as this could enable novel filter designs for multivariate dynamical
systems. Another possible extension to this work is to consider adjustments
of the proposed algorithm to work with confidence bounds for the moments
instead of absolute upper and lower bounds, since the former is more common to be used on interval estimation techniques, such as the bootstrapping
method.
Finally, one limitation of the method is that a great number of sigma
points can lead to a numerically unstable relaxation of the polynomial optimization problems. Nevertheless, recent advances in polynomial optimization
techniques such as novel relaxation hierarchies based on linear programming
[36] could help to scale the presented approaches to a higher number of sigma
points.
Acknowledgments
The authors would like to thanks Dario Piga and Henrique M. Menegaz
for the worthy suggestions and discussions relevant to this paper. We would
also like to thanks the Brazilian agencies CNPq and CAPES which partially
supported this work.
21
References
[1] K. O. Geddes, S. R. Czapor, G. Labahn, Algorithms for Computer Algebra, Kluwer Academic Publishers, 1992.
[2] A. Papoulis, S. U. Pillai, Probability, Random Variables, and Stochastic
Processes, McGraw-Hill, New York, 2002.
[3] S. J. Julier, J. K. Uhlmann, H. F. Durrant-Whyte, A new approach
for filtering nonlinear systems, in: Proceedings of the 1995 American
Control Conference, Vol. 3, 1995, pp. 1628–1632.
[4] L. R. A. X. de Menezes, A. J. M. Soares, F. C. Silva, M. A. B. Terada,
D. Correia, A new procedure for assessing the sensitivity of antennas
using the unscented transform, IEEE Transactions on Antennas and
Propagation 58 (3) (2010) 988–993.
[5] M. L. Carneiro, P. H. P. de Carvalho, N. Deltimple, L. d. C. Brito,
L. R. A. X. de Menezes, E. Kerherve, S. G. de Araujo, A. S. Rocira,
Doherty amplifier optimization using robust genetic algorithm and unscented transform, in: Proceedings of the 9th IEEE International conference on New Circuits and Systems Conference, IEEE, 2011, pp. 77–80.
[6] H. M. T. Menegaz, J. Y. Ishihara, G. A. Borges, A. N. Vargas, A systematization of the unscented Kalman filter theory, IEEE Transactions
on Automatic Control 60 (10) (2015) 2583–2598.
[7] K. Krishnamoorthy, Handbook of Statistical Distributions with Applications, CRC Press, 2010.
22
[8] K. G. Murty, S. N. Kabadi, Some NP-complete problems in quadratic
and nonlinear programming, Mathematical programming 39 (2) (1987)
117–129.
[9] J. B. Lasserre, Global optimization with polynomials and the problem
of moments, SIAM Journal on Optimization 11 (3) (2001) 796–817.
[10] J. B. Lasserre, Moments, Positive Polynomials and Their Applications,
Imperial College Press optimization series, Imperial College Press, Singapore, 2009.
[11] J. R. Van Zandt, A more robust unscented transform, in: International
symposium on optical science and technology, International Society for
Optics and Photonics, 2001, pp. 371–380.
[12] S. Mehrotra, D. Papp, Generating moment matching scenarios using
optimization techniques, SIAM Journal on Optimization 23 (2) (2013)
963–999.
[13] H. Dette, W. J. Studden, The Theory of Canonical Moments with Applications in Statistics, Probability, and Analysis, Vol. 338, John Wiley
& Sons, 1997.
[14] J.-B. Hiriart-Urruty, C. Lemaréchal, Fundamentals of Convex Analysis,
Grundlehren Text Editions, Springer, Germany, 2001.
[15] R. Radhakrishnan, A. Yadav, P. Date, S. Bhaumik, A new method
for generating sigma points and weights for nonlinear filtering, IEEE
Control Systems Letters 2 (3) (2018) 519–524.
23
[16] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 3rd Edition, Cambridge
University Press, New York, 2007.
[17] S. J. Julier, J. K. Uhlmann, Unscented filtering and nonlinear estimation, Proceedings of the IEEE 92 (3) (2004) 401–422.
[18] J. Stoer, R. Bulirsch, Introduction to Numerical Analysis, Texts in Applied Mathematics, Springer, New York, 2002.
[19] R. Caballero-Aguila, A. Hermoso-Carazo, J. Linares-Pérez, Extended
and unscented filtering algorithms in nonlinear fractional order systems with uncertain observations, Applied Mathematical Sciences 6 (30)
(2012) 1471–1486.
[20] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University
Press, Cambridge, 2004.
[21] Z. Li, S. He, S. Zhang, Approximation Methods for Polynomial Optimization: Models, Algorithms, and Applications, Springer Briefs in
Optimization, Springer, New York, 2012.
[22] H. Markowitz, Portfolio selection, The journal of finance 7 (1) (1952)
77–91.
[23] E. Jondeau, M. Rockinger, Optimal portfolio allocation under higher
moments, European Financial Management 12 (1) (2006) 29–55.
[24] P. M. Kleniati, P. Parpas, B. Rustem, Partitioning procedure for polynomial optimization: Application to portfolio decisions with higher or24
der moments, Tech. Rep. WPS-023, COMISEF Working Papers Series
(2009).
[25] A. P. Roberts, M. M. Newmann, Polynomial optimization of stochastic
feedback control for stable plants, IMA Journal of Mathematical Control
and Information 5 (3) (1988) 243–257.
[26] D. Henrion, J.-B. Lasserre, Solving nonconvex optimization problems,
Control Systems, IEEE 24 (3) (2004) 72–83.
[27] B. Mariere, Z.-Q. Luo, T. N. Davidson, Blind constant modulus equalization via convex optimization, Signal Processing, IEEE Transactions
on 51 (3) (2003) 805–818.
[28] L. Qi, K. L. Teo, Multivariate polynomial minimization and its application in signal processing, Journal of Global Optimization 26 (4) (2003)
419–433.
[29] G. Dahl, J. M. Leinaas, J. Myrheim, E. Ovrum, A tensor product matrix approximation problem in quantum physics, Linear algebra and its
applications 420 (2) (2007) 711–725.
[30] S. Soare, J. W. Yoon, O. Cazacu, On the use of homogeneous polynomials to develop anisotropic yield functions with applications to sheet
forming, International Journal of Plasticity 24 (6) (2008) 915–944.
[31] D. Henrion, J.-B. Lasserre, J. Löfberg, GloptiPoly 3: moments, optimization and semidefinite programming, Optimization Methods & Software 24 (4-5) (2009) 761–779.
25
[32] H. Waki, S. Kim, M. Kojima, M. Muramatsu, H. Sugimoto, Algorithm 883: SparsePop—a sparse semidefinite programming relaxation
of polynomial optimization problems, ACM Transactions on Mathematical Software 35 (2) (2008) 15.
[33] P. Wittek, Algorithm 950: Ncpol2sdpa–sparse semidefinite programming relaxations for polynomial optimization problems of noncommuting variables, ACM Transactions on Mathematical Software (TOMS)
41 (3) (2015) 21.
[34] V. Cerone, J. B. Lasserre, D. Piga, D. Regruto, A unified framework
for solving a general class of conditional and robust set-membership
estimation problems, IEEE Transactions on Automatic Control 59 (11)
(2014) 2897–2909.
[35] P. Indyk, Algorithmic applications of low-distortion geometric embeddings, in: Proceedings of 42nd IEEE Symposium on Foundations of
Computer Science, 2001, pp. 10–33.
[36] A. A. Ahmadi, A. Majumdar, DSOS and SDSOS optimization: more
tractable alternatives to sum of squares and semidefinite optimization,
arXiv preprint arXiv:1706.02586.
26