Arroyo 2006

Download as pdf or txt
Download as pdf or txt
You are on page 1of 36

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING

Int. J. Numer. Meth. Engng 2006; 65:2167–2202


Published online 12 December 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nme.1534

Local maximum-entropy approximation schemes: a seamless


bridge between finite elements and meshfree methods

M. Arroyo‡, § and M. Ortiz∗, †


Graduate Aeronautical Laboratories, California Institute of Technology, Pasadena, CA 91125, U.S.A.

SUMMARY
We present a one-parameter family of approximation schemes, which we refer to as local maximum-
entropy approximation schemes, that bridges continuously two important limits: Delaunay triangulation
and maximum-entropy (max-ent) statistical inference. Local max-ent approximation schemes represent
a compromise—in the sense of Pareto optimality—between the competing objectives of unbiased
statistical inference from the nodal data and the definition of local shape functions of least width.
Local max-ent approximation schemes are entirely defined by the node set and the domain of anal-
ysis, and the shape functions are positive, interpolate affine functions exactly, and have a weak
Kronecker-delta property at the boundary. Local max-ent approximation may be regarded as a regu-
larization, or thermalization, of Delaunay triangulation which effectively resolves the degenerate cases
resulting from the lack or uniqueness of the triangulation. Local max-ent approximation schemes
can be taken as a convenient basis for the numerical solution of PDEs in the style of meshfree
Galerkin methods. In test cases characterized by smooth solutions we find that the accuracy of local
max-ent approximation schemes is vastly superior to that of finite elements. Copyright 䉷 2005 John
Wiley & Sons, Ltd.

KEY WORDS: maximum entropy; information theory; approximation theory; meshfree methods;
Delaunay triangulation

1. INTRODUCTION

This paper is concerned with the formulation of approximation schemes that bridge continuously
two important limits: Delaunay triangulation and maximum-entropy statistical inference. The
resulting basis functions bear similarities with those obtained from the moving least squares

∗ Correspondence to: M. Ortiz, Graduate Aeronautical Laboratories, California Institute of Technology, Pasadena,
CA 91125, U.S.A.
† E-mail: [email protected]
‡ E-mail: [email protected]
§ Current address: LaCàN, Universitat Politècnica de Catalunya, C/Jordi Girona 1-3, Barcelona 08034, Spain.

Contract/grant sponsor: Department of Energy


Contract/grant sponsor: NSF
Received 17 December 2004
Revised 27 April 2005
Copyright 䉷 2005 John Wiley & Sons, Ltd. Accepted 17 August 2005
2168 M. ARROYO AND M. ORTIZ

(MLS) method, which are the basis of a number of meshfree methods for the numerical solution
of partial differential equations (e.g. References [1–3]; cf. also, Reference [4] for a review).
Despite some similarities, the approach presented in this paper is distinct from MLS methods
and presents a number of important advantages over them.
The approximation schemes are entirely defined by the node set and fall into the general
class of convex approximation schemes. These are schemes based on shape functions that are
positive and interpolate affine functions exactly. An important property of convex approximation
schemes is that they have a weak Kronecker-delta property at the boundary. This property greatly
facilitates the imposition of essential boundary conditions and can also be exploited in order to
glue together domain patches in a fully conforming way. The positivity of the shape functions
endows the approximation schemes with useful properties and structure derived from convex
geometry, but makes the construction of high order approximants more involved. Extensions
beyond first-order methods will be pursued in subsequent work. Another avenue for defining
higher-order approximation schemes is to combine the present approach with the partition of
unity method [5, 6].
The specific convex approximation schemes that we investigate represent a compromise—in
the sense of Pareto optimality—between two competing objectives:
(i) Unbiased statistical inference based on the nodal data;
(ii) the definition of local shape functions of least width.
Objective (i) is classical in information theory and leads to Jaynes’ principle of maximum
entropy [7]. In the present context the least biased shape functions, which we call global
max-ent approximants, are those that maximize a suitably defined entropy of the approximation
scheme. By way of contrast, the most local shape functions, in a sense to be made mathemat-
ically precise, are found to be affine shape functions supported on a Delaunay triangulation
of the node set. Specifically, we define a one-parameter family of smooth convex approxima-
tion schemes, which we refer to as local max-ent schemes, which have global max-ent and
Delaunay schemes as limiting cases. In particular, local max-ent approximation schemes sub-
sume simplicial finite elements and the Delaunay triangulation as a special case. Conversely,
local max-ent approximation may be regarded as a regularization, or—in analogy to statistical
mechanics—a thermalization, of Delaunay interpolation. The level of thermalization is smoothly
controlled by a non-negative parameter that can be a function of position. This spatial depen-
dence enables a seamless transition from meshfree-type approximants to finite elements. An
important feature of this regularization is that it effectively resolves the degenerate cases result-
ing from the lack or uniqueness of Delaunay triangulation. Thus, for node sets for which the
Delaunay triangulation is not unique our regularization selects a unique generalized Delaunay
approximant in the limit, namely, that which maximizes the approximation entropy.
The local max-ent shape functions follow from an unconstrained convex optimization problem
at each evaluation point. The size of this problem equals the spatial dimension. In addition,
this problem is guaranteed to be solvable on the convex hull of the node set, and its solution
is very robust and efficient. Approximants derived from minimization problems have a long
tradition and include cubic splines, thin plate splines, MLS approximants and natural neighbour
approximants, to name a few.
Local max-ent approximation schemes can be taken as a convenient basis for the numerical
solution of PDEs in the style of meshfree Galerkin methods (cf. e.g. Reference [4] for a recent
review of Galerkin meshfree methods) or, in problems governed by a minimum principle,

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2169

by constrained, or Rayleigh–Ritz, minimization. We illustrate the performance of local max-


ent approximation schemes in this type of applications by means of a patch test and two test
cases: the standard benchmark problem of a linear elastic built-in cantilever beam loaded at the
tip; and the upsetting and extension of a block of compressible neo-Hookean rubber. In both
examples we find that the accuracy of local max-ent approximation schemes is vastly superior
to that of finite elements, even when the solution cost is carefully factored in.
The structure of the paper is as follows. In Section 2 we begin by establishing the properties
of general convex approximation schemes, including a weak Kronecker-delta property at the
boundary. In Section 3 we adopt an information-theoretical viewpoint and introduce the no-
tions of entropy of an approximation scheme and global max-ent approximation. The resulting
shape functions are of global support and non-interpolating in general. In order to bring these
properties under control, in Section 4 we introduce the concept of width of a shape function.
The sum of the widths of the shape functions supplies a measure of the degree of locality
of the approximation. We then proceed to introduce the local max-ent approximation schemes
by recourse to Pareto optimality. A method for the calculation of the shape functions and
some properties of the approximation scheme are presented in this section. Applications to the
numerical solution of PDEs are presented in Section 5. Some concluding remarks are finally
collected in Section 6.

2. CONVEX APPROXIMATION SCHEMES

All approximation schemes considered in this paper fall within a class that we shall term the
class of convex approximation schemes. These convex approximation schemes are characterized
by the positivity of the shape functions and by being exact on affine functions. These conditions
alone do not determine a unique convex approximation scheme, and most of this paper is
devoted to the selection of convex approximation schemes that are optimal according to certain
ancillary criteria. However, there are a number of desirable properties that are shared by all
convex approximation schemes. In this section, we proceed to enumerate these properties.

2.1. Approximants as coefficients of convex combinations


Consider a set of distinct nodes X = {x a , a = 1, . . . , N} ⊂ Rd , to be referred to as the node set.
Recall [8] that the convex hull of X is the set

conv X = {x ∈ Rd |x = X,  ∈ RN
+ , 1 ·  = 1} (1)

where RN + is the non-negative orthant, 1 denotes the vector of R whose entries are one, and
N

X is the d × N matrix whose columns are the co-ordinates of the position vectors of the nodes
in the node set X. Since X is finite, it follows that convX is a compact convex polyhedron,
or polytope. Let u : convX → R be a function whose values {ua ; a = 1, . . . , N} are known
on the node set. We wish to construct approximations to u of the form


N
uh (x) = pa (x)ua (2)
a=1

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2170 M. ARROYO AND M. ORTIZ

where the functions pa : convX → R will be referred to as shape functions. A particular choice
of shape functions defines an approximation scheme. We shall require the shape functions to
satisfy the zeroth and first-order consistency conditions:


N
pa (x) = 1 ∀x ∈ convX (3a)
a=1


N
pa (x)x a = x ∀x ∈ convX (3b)
a=1

These conditions guarantee that affine functions are exactly reproduced by the approximation
scheme. We note that if N = d + 1 and the point set is affinely independent, the consistency
conditions uniquely determine the shape functions over the corresponding d-simplex. By way
of contrast, the shape functions are not uniquely determined by the consistency conditions in
general when N >d + 1. In addition, we shall require the shape functions be non-negative, i.e.
pa (x)  0 ∀x ∈ convX, a = 1, . . . , N (4)
The positivity of the shape functions, together with the partition of unity property, allow us
to interpret the shape functions as the coefficients of convex combinations. This viewpoint is
common in geometric modelling, e.g. in Bézier and B-Spline techniques [9]. Positive linearly
consistent approximants have long been studied in the literature [10]. Recent examples include
the natural element method shape functions [11] and subdivision schemes [12]. These methods
often present a number of attractive features, such as the related properties of monotonicity,
the variation diminishing property (the approximation is not more ‘wiggly’ than the data), or
smoothness preservation [13], of particular interest in the presence of shocks. Furthermore,
they lead to well behaved mass matrices. The positivity restriction is natural in problems
where a maximum principle is in force, such as in the heat conduction problem. In the present
context, the non-negativity requirement is introduced primarily to enable the interpretation of
shape functions as probability distributions. It follows from (3a), (3b) and (4) that the shape
functions at x ∈ convX define a convex combination of vertices which evaluates to x. In view
of this property we shall refer to non-negative and first-order consistent approximation schemes
as convex approximation schemes.
Let p(x) denote the vector of RN whose components are {p1 (x), . . . , pN (x)}. Then, by virtue
of the consistency and non-negativity constraints the domain of p(x), or feasible set, is
Px (X) = {p ∈ RN
+ |Xp = x, 1 · p = 1} (5)
Evidently, this set is convex. A first question of interest is whether Px (X) is non-empty, i.e.
whether there exist shape functions consistent with the constraints. The following proposition
follows directly by comparison of (1) and (5).

Proposition 2.1
The feasible set Px (X) is non-empty if and only if x ∈ convX.

It follows from the preceding observations that non-negative and linearly consistent approxi-
mation schemes can only be defined on convX. If the node set is large enough, Carathéodory’s
theorem states that at least N − d − 1 points in X are not necessary in order to express

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2171

x ∈ convX as a convex combination of points in X. Thus, as expected, convex approximation


schemes are not uniquely determined by the node set in general. It is possible to consider
domains  which are subsets of convX. However, for simplicity in the present work we will
assume that  = convX throughout.

2.2. Behaviour at the boundary


In interpolating schemes such as Lagrangian finite elements the shape functions satisfy the
so-called Kronecker-delta property, i.e. pa (xb ) = ab . This property is particularly useful when
solving partial-differential equations numerically, since it renders the imposition of essential
boundary conditions straightforward. Most meshfree methods, in particular those based on the
MLS approximation, lack the Kronecker-delta property, and, consequently, the approximation
on the boundary of the domain may depend on the nodal data of interior nodes. These methods
experience difficulty in enforcing essential boundary conditions (cf. e.g. Reference [14]). In this
section we study the behaviour of general convex approximation schemes at the relative bound-
ary of convX, rbd(convX), i.e. the boundary of convX regarded as a subset of its affine hull.
The relative boundary of convX coincides with the boundary of convX when aff(convX) = Rd .
Here aff denotes the affine hull. In particular, we show that all convex approximation schemes
possess a weak Kronecker-delta property at the boundary. This Kronecker-delta property greatly
facilitates the imposition of essential boundary conditions, which confers convex approximation
schemes a distinct advantage over MLS and other meshfree approximation schemes.
We begin by reviewing a few elementary facts concerning the boundary of polytopes. The
faces of the polytope P = convX can be characterized as the intersections of P with its sup-
porting hyperplanes, in addition to P itself and ∅, and are themselves polytopes. An equivalent
definition of a face of P is a convex subset F of P such that every closed line segment
in P with a relative interior point in F has both endpoints (and hence the entire segment)
in F [8]. A proper face of P is one that is neither P nor ∅. The dimension of a face is the
dimension of its affine hull. In particular, the zero-dimensional faces of P are called vertices,
coincide with its extreme points, and belong to X. We shall denote by vertP the collection of
vertices of the polytope. In addition, P = conv(vertP ) and, if F is a face of P , it follows that
vertF = vertP ∩ F . The relative interiors of the proper faces of P are a partition of rbdP , i.e.
they are disjoint and their union is rbdP . The smallest face of P to which x belongs is the
contact set of x, C(x), and is formally defined as the intersection of P with the intersection
of all supporting hyperplanes to P at x. Its affine dimension is the facial dimension of x. The
facial dimension of points interior to convX is d, while the facial dimension of extreme points
is 0. If x ∈ rbdP , then C(x) is a proper face of P .

Proposition 2.2
Let p(x) define a convex approximation scheme with node set X. Let F be a face of convX
and x a ∈
/ F . Then pa = 0 on F .

Proof
Suppose otherwise, i.e. suppose that there is a point x ∈ F and a convex approximation scheme
p(x) such that pa (x)  = 0. Since
 
x = pb (x)xb = pb (x)xb + pa (x)x a (6)
b b =a

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2172 M. ARROYO AND M. ORTIZ


and x  = x a , it follows that b=a pb (x)  = 0. Consider the closed line segment

[0, 1] t −→ ty + (1 − t)x a ∈ convX (7)

where

1 
y=  pb (x)xb (8)
b=a pb (x) b =a

Then x ∈ F is a relative interior point of the segment, corresponding to t = 1 − pa (x), and


hence the entire segment, including x a , must be contained in F , which contradicts the
assumption. 

Remarks
1. If E is the union of an arbitrary collection of faces of convX and x a ∈/ E, then it follows
that pa = 0 on E.
2. The shape functions corresponding to nodes that belong to relint(convX) vanish in
rbd(convX).
3. When approximating a function as in Equation (2), the value of uh at a face F depends
only on the nodal values corresponding to nodes in X ∩ F . Let X and Y be two node sets
such that convX ∩ convY is a face of both convX and convY . Then, given a method to
select convex approximants, the approximation schemes based on X and Y are conforming
(conforming patches).
4. Suppose that a function u defined over convX is affine on a face F . Then u = uh over
F provided ua = u(x a ) ∀x a ∈ F ∩ X (exact interpolation of affine functions on faces).
5. If x a is an extreme point or vertex of convX, then pb (x a ) = ba , and consequently,
ua = uh (x a ) (interpolation at extreme points).
6. Let x ∈ rbd(convX) with contact set C(x). If x a ∈/ C(x), then pa (x) = 0. Thus, choosing a
convex approximation scheme in Px (X) is equivalent to choosing a convex approximation
scheme in Px (X ∩ C(x)). Note that the latter problem can be formulated in affC(x) − x,
the subspace of Rd parallel to C(x), whose dimension is the facial dimension of x, and
involves a reduced node set (reduced face problem).
7. If a n-dimensional face contains exactly n + 1 nodes, then the shape functions on that
face are the affine shape functions of the simplex defined by those nodes.

Some of these observations are known in different contexts. For instance, the fact that Bézier
curves pass through the end control points is a direct consequence of Proposition 2.2.

2.3. Higher-order consistency


A seemingly natural extension of the convex approximation schemes described in the foregoing
would be to impose second and higher-order consistency conditions on the shape functions.
However, these extensions are not straightforward. In order to demonstrate the source of the
difficulty we may simply consider the one-dimensional case. The second-order consistency

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2173

Figure 1. Illustration of the second-order moment space in 1D.

condition then takes the form


N
pa (x)xa2 = x 2 (9)
a=1

Defining an extended point set Y = {(xa , xa2 ); a = 1, . . . , N} ⊂ R2 , it follows that finding


non-negative and second-order consistent approximation schemes amounts to defining a convex
approximation scheme on the set P(x,x 2 ) (Y ). We have seen that this set is non-empty
iff (x, x 2 ) belongs to the set conv{(xa , xa2 ); a = 1, . . . , N}. In the context of the classical problem
of moments, namely, the problem of finding a probability distribution given its first moments
[15, 16], that set is known as the moment space. However, due to the strict convexity of the
function f (x) = x 2 , the condition that (x, x 2 ) be in the set conv{(xa , xa2 ); a = 1, . . . , N} cannot
be satisfied in general, as illustrated in Figure 1. Consequently, non-negative convex approxima-
tion schemes cannot satisfy Equation (9). A similar argument applies to higher-order consistency
conditions and higher spatial dimensions. This observation notwithstanding, it is nevertheless
possible to extend the methods presented here to construct high-order convex approximants, as
will be detailed in forthcoming work.

3. GLOBAL MAX-ENT APPROXIMANTS

In this section we begin by adopting an information-theoretical viewpoint that naturally leads


to a canonical choice of convex approximants, namely, those that maximize the entropy of
the approximation scheme. In this framework, the problem of approximating a function from
nodal data is regarded strictly as a problem of statistical inference, with no regard given to
the physical nature of the data or the mathematical character of the governing field equations.
From a strict information-theoretical viewpoint, the overriding concern is to ensure an unbiased
inference of the function from the data, i.e. one that is free of systematic errors or artifacts.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2174 M. ARROYO AND M. ORTIZ

3.1. Entropy of a convex approximation scheme


Though the principle of maximum entropy is well-known in information theory and statistical
physics, in the present context it may stand a brief review. Consider a random variable which
can take values in a set of events {A1 , A2 , . . . , An } with probabilities {p1 , p2 , . . . , pn }. The set
of events and the associated probabilities
 
A1 A2 . . . An
A=
p1 p2 . . . pn

are jointly called a finite scheme. We now introduce the concept of entropy—uncertainty—of
a given finite scheme, following the introductory text by Khinchin [17]. Consider two finite
schemes
   
A1 A2 A1 A2
and
0.5 0.5 0.99 0.01

Evidently, the first scheme carries more uncertainty than the second, for which the outcome is
almost certainly A1 . The uncertainty associated with a finite scheme can also be interpreted as
the amount of information gained by realizing the random variable, thus eliminating completely
the uncertainty. Shannon [18] introduced the following measure of uncertainty, or information
entropy


N
H (A) = H (p1 , . . . , pn ) = − pa log pa (10)
a=1

with the extension by continuity: 0 log 0 = 0. The function H (A) is non-negative, symmetric,
continuous, and strictly concave, and possesses a number of properties that are expected
of a measure of uncertainty. In particular, H (p) = 0 iff one of the probabilities is one
and all the others are zero, and attains its maximum for the probabilities {1/n, . . . , 1/n},
which may intuitively be regarded as the most uncertain or random distribution. Furthermore,
H (1/n, . . . , 1/n) = log n, which is an increasing function of n. Consequently, adding events
adds uncertainty to this most uncertain distribution. However, adding an impossible event does
not alter the level of uncertainty, i.e. H (p1 , . . . , pn , 0) = H (p1 , . . . , pn ). Suppose that we are
given two finite schemes,
   
A1 A2 . . . An B1 B2 . . . Am
A= and B =
p1 p2 . . . pn q1 q2 . . . qm

The set of events Ai Bj , i = 1, . . . , n, j = 1, . . . , m defines a new finite scheme, called product


scheme AB. If A and B are independent, we have that H (AB) = H (A) + H (B), whereas
if the schemes are dependent, the H (AB) = H (A) + HA (B)  H (A) + H (B), where HA (B)
denotes the expectation of H (B) in scheme A (cf. Reference [17] for details). The inequality
HA (B)  H (B) can be interpreted by saying that the realization of the scheme A can only
decrease the uncertainty of another scheme B. The axiomatic basis of Shannon’s information
entropy is well-established in information theory (cf. e.g. Reference [17]).

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2175

Within the framework just outlined, the entropy of a convex approximation scheme may be
defined as follows. Let X be a node set with N nodes, let x ∈ convX, and let p(x) define a
convex approximation scheme. Regard the index set I = {1, . . . , N} as a complete system of
events. Since the approximation scheme is non-negative and the shape functions add to one, we
may regard {p1 (x), . . . , pN (x)} as the corresponding probabilities and H (p1 (x), . . . , pN (x)) as
the entropy of the corresponding finite scheme.

3.2. Least-biased approximation scheme


An information-theoretical approach to approximation theory can be devised as follows. Equa-
tion (3b) is regarded as additional information on the discrete probability distribution p(x),
namely that the statistical expectation or average of the random variable X : I → Rd , which
assigns to each index the position vector of the corresponding node X(a) = x a , is x. Consistent
with this constraint, there are in general multiple probability distributions {p1 (x), . . . , pN (x)}.
The problem of approximating a function from scattered data may now be regarded as a prob-
lem of statistical inference. From this standpoint, Equation (2) expresses the expected value
uh (x) of a random variable U : I → R defined by U(a) = ua as determined by the probabilities
{p1 (x), . . . , pN (x)}.
Suppose that we require that this process of inference be unbiased, i.e. that it be based
solely on the a priori knowledge of the function and free of artifacts or hidden assumptions.
According to Jaynes’ principle of maximum entropy [7], the least biased probability distri-
bution is that which maximizes entropy subject to all known constraints. Thus, Jaynes states
that the maximum entropy distribution is ‘. . . uniquely determined as the one which is maxi-
mally noncommittal with regard to missing information, in that it agrees with what is known,
but expresses maximum uncertainty with respect to all other matters’. Thus, from a purely
information-theoretical viewpoint, the optimal, or least biased, convex approximation schemes
are solutions of the program:


N
(ME) maximize H (p) = − pa log pa
a=1

subject to pa  0, a = 1, . . . , N

N
pa = 1
a=1


N
pa x a = x
a=1

It is interesting to note that in the one-dimensional case this problem gives the max-ent solution
of the classical problem of moments [19]. Since the information entropy function is strictly
concave in its domain RN + , the non-negative orthant, and the constraints are affine, (ME) defines
a convex optimization problem. The existence and uniqueness of the solution of this program
are established by the following proposition.

Proposition 3.1
The program (ME) has a solution iff x ∈ convX, in which case the solution is unique.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2176 M. ARROYO AND M. ORTIZ

Proof
If x ∈ convX, then by Proposition 2.1 Px (X) = ∅. In addition, Px (X) is a closed and bounded
subset of RN and, therefore, compact. Hence, by the Weierstrass extreme value theorem −H
attains its minimum in Px (X). Since −H is strictly convex in Px (X) (the restriction of a
strictly convex function to a convex subset) the minimum is unique. 

Since program (ME) depends parametrically on x, its unique solution p0 (x) is also a function
of x. We shall refer to the convex approximation scheme defined by p0 (x) as the max-ent
approximation scheme. The smoothness of p0 (x) follows as a corollary to Proposition 4.2.

3.3. Examples
Given a point set X, the construction of a shape function p0a (x) requires solving the problem
(ME) for every point x ∈ convX. Examples of max-ent schemes in the plane are shown in
Figure 2. Figure 2(a) shows a max-ent shape function for a point set consisting of the vertices
of a convex pentagon. This example illustrates the delta Kroneker property of the max-ent shape
functions, and the property that the restriction of the max-ent shape functions to the edges of
the pentagon is linear. Thus, max-ent approximation schemes provide a basis for constructing
conforming elements in the shape of arbitrary convex polyhedra (cf. References [20, 21] for
recent alternative methods to construct generalized barycentric co-ordinates for polyhedra). In
recent independent work, maximum entropy methods have been used to construct barycentric
co-ordinates for convex polyhedra, thus defining C 0 approximants on polygonal tesselations [22].
The max-ent shape function of an interior node for a larger node set is shown in Figure 2(b).
As expected, the shape function vanishes at the boundary. The support of the shape function is
highly non-local and extends to the entire convex hull of the node set. In addition, the value
of the shape function at its corresponding node differs greatly from unity. Consequently, the
max-ent approximation is far from interpolating in the interior, and results in a very poor fit
to the data as illustrated in Figure 2(c).
This example serves to illustrate some of the limitations of global max-ent as a candidate ap-
proximation scheme for partial differential equations, namely, its non-local and non-interpolating
character. An extension of the max-ent concept that provides control over the degree of locality
of the shape functions is developed next.

Figure 2. Examples of max-ent approximation schemes in the plane: (a) shape function for the
vertex of a pentagon; (b) shape function for an interior node, illustrating the global character of
max-ent approximation schemes; and (c) max-ent approximation, or inference, of a function from
scattered data, illustrating the non-interpolating character of max-ent approximation schemes.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2177

4. LOCAL MAX-ENT APPROXIMANTS

As observed in the preceding section, global max-ent approximation schemes, while optimal
in an information-theoretical sense, are non-local and non-interpolating, which limits their
usefulness as approximation schemes for partial differential equations. Control over the degree
of locality of max-ent approximation schemes can be achieved by adding spatial correlation
information in the (ME) program (11). In particular, we wish to control the degree to which
the value of a function at x is correlated to nearby nodal values. Correspondingly, we wish
to control the width of the shape functions and their decay with distance away from their
corresponding nodes. In this section, we extend the max-ent framework introduced in the
foregoing and build into the approximation scheme these notions of locality.
Define the width of shape function pa as

w[pa ] = pa (x)|x − x a |2 dx (11)

where we write  = convX. Thus, w[pa ] is simply the second moment of pa about x a .
Evidently, other measures of the width of a function can be used instead in order to de-
fine alternative approximation schemes. Some alternative measures are briefly discussed in
Section 4.6. The most local approximation scheme is now that which minimizes the total width
 N
N 
W [p] = w[pa ] = pa (x)|x − x a |2 dx (12)
a=1  a=1

subject to constraints (3a), (3b) and (4). Since functional (12) does not involve shape function
derivatives its minimization can be performed pointwise. This results in the linear program:


N
(RAJ) For fixed x minimize U (x, p) ≡ pa |x − x a |2
a=1

subject to pa  0, a = 1, . . . , N

N
pa = 1
a=1


N
pa x a = x
a=1

An argument identical to that in the proof of Proposition 3.1 shows that the program (RAJ)
has solutions if and only if x ∈ convX. However, the function U (x, ·) is not strictly convex
(it is linear) and the solution is not unique in general.
Rajan [23] showed that if the nodes are in general positions (no (d + 1) nodes in X are
cospherical), then (RAJ) has a unique solution, corresponding to the piecewise affine shape
functions supported by the unique Delaunay triangulation associated with the node set X
(a Delaunay triangulation verifies that the circumsphere of every simplex contains no point
from X in its interior). We shall refer to the convex approximation schemes defined by the
solutions p∞ (x) of (RAJ) as Rajan convex approximation schemes, and to the approximants
corresponding to the piecewise affine shape functions supported by a Delaunay triangulation

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2178 M. ARROYO AND M. ORTIZ

as Delaunay convex approximants. Thus, Rajan’s result states that for nodes in general po-
sitions, the Delaunay convex approximation scheme coincides with the unique Rajan convex
approximation scheme, that is optimal in the sense of the width (11).
When the nodes are not in general positions, the Delaunay triangulation is not unique and
the Delaunay approximation schemes are likewise not unique. Since every Delaunay approx-
imation scheme is a Rajan approximation scheme [23], it follows that the latter are likewise
not unique. Furthermore, it is readily verified that a convex combination of solutions of (RAJ)
is also a solution. Therefore, the set of solutions SxRAJ (X) of (RAJ), i.e. the set of Rajan
convex interpolation schemes, is a convex subset of Px (X). Thus, there are Rajan approxi-
mation schemes that are not Delaunay approximation schemes and are not associated with a
triangulation of the node set. We shall see in Example 4.1 that some Rajan approximation
schemes are not even convex combinations of Delaunay approximation schemes. A simple
example of non-uniqueness is provided by a node set consisting of four nodes at the corners
of a square. Then, the two triangulations corresponding to the two diagonals of the square are
Delaunay triangulations and supply solutions of (RAJ). In addition all convex combinations of
these solutions are in SxRAJ (X). This example is further analysed in Section 4.4. It should be
carefully noted that since the approximants are characterized pointwise, their continuity does
not follow automatically in the case of non-uniqueness.

4.1. Local max-ent approximation schemes as a Pareto set


Thus far we have defined two criteria for selecting convex approximation schemes: maximum
entropy and maximum locality, which result in max-ent and Delaunay (or Rajan for degenerate
node sets) convex approximation schemes, respectively. In general, it is not possible to find
convex approximation schemes that maximize both entropy and locality simultaneously, i.e.
unbiased estimation and locality are competing objective functions. A standard device for
harmonizing such competing objectives is to seek Pareto optima, i.e. convex approximation
schemes such that there is none better. Specifically, a convex approximation scheme q is
better than, or dominates, p iff −H (q)  −H (p), U (x, q)  U (x, p) and at least one of the
inequalities is strict. The set of Pareto optima is called the Pareto set. For convex multicriterion
optimization, the scalarization of the problem provides a means to find the Pareto set [24]; on
the one hand, each solution of the problem:

(LME) For fixed x minimize f (x, p) ≡ U (x, p) − H (p)

subject to pa  0, a = 1, . . . , N


N
pa = 1
a=1


N
pa x a = x
a=1

for  ∈ (0, +∞) is Pareto optimal. Conversely, each element of the Pareto set is a solution of
(LME) for some  ∈ (0, +∞) ∪ {0, +∞}. Note carefully that in general not all the solutions
for the values {0, +∞} of  are Pareto optimal. For  = 0, the uniqueness of the solutions of

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2179

(ME) guarantees that p0 is Pareto optimal. For  = +∞, the linear program (RAJ) is recovered.
As already noted, this problem admits multiple solutions in general. From the definition of
Pareto optimality, only the element of SxRAJ (X) with maximum entropy is in the Pareto set.
Since H (p) is strictly concave in SxRAJ (X), the Pareto-optimal element for  = + ∞ is unique
and given by

pPareto
∞ (x) = arg max H (p) (13)
p∈SxRAJ (X)

Since for  ∈ [0, +∞) the function f (x, ·) is continuous and strictly convex in Px (X), an
argument identical to that in the proof of Proposition 3.1 shows that (LME) has a unique
solution p (x) if and only if x ∈ convX. We shall refer the convex approximation schemes
defined by p (x) as local max-ent convex approximation schemes. These convex approximation
schemes can be viewed as optimal trade-offs or compromises between information-theoretical
optimality and locality.

4.2. Dual problem and exponential form of the shape functions


The problem (LME) defining the local max-ent convex approximations is amenable to analysis
by standard duality methods, which also provide a method for the practical calculation of the
local max-ent approximants. These methods have been extensively applied to max-ent problems
(cf. e.g. Reference [24]). For simplicity and without loss of generality, throughout this section
we shall assume that affX = Rd . Under these conditions, the relative interior relint(convX) of
convX coincides with its interior int(convX). Using the zeroth-order consistency condition (3a)
it is possible to re-write the first-order consistency condition (3b) as

pa (x a − x) ≡ Yp = 0 (14)
a

where Y is the N × d matrix whose columns are x a − x. Form (14) of the first-order condition
is preferable to (3b) as it results in better conditioning of the calculations.
Dropping the parametric dependence on x for notational simplicity, the Lagrangian associated
with the (LME) is

L(p, 0 , ) = f (p) + 0 (1 · p − 1) +  · Yp (15)

where 0 ∈ R and  ∈ Rd are Lagrange multipliers. Therefore, the domain of definition of the
Lagrangian is RN + × R × R and its range is R. Since ∀x ∈ convX,  ∈ [0, ∞), (LME) has
d

a unique solution p , by the Kuhn–Tucker theorem there exist Lagrange multipliers ∗0 and
∗ such that {p , ∗0 , ∗ } is a saddle point of the Lagrangian and verifies the Kuhn–Tucker
conditions [8].
In order to analyse the saddle-point problem associated with the Lagrangian (15) it proves
convenient to differentiate between interior points, x ∈ int(convX), and points on the boundary,
x ∈ bd(convX). The interior point case corresponds to Slater’s condition that there exist p ∈ RN
++
such that x = Xp (cf. Theorem 6.9 in Reference [8]), and, consequently, in this case strong
duality holds. We proceed to analyse these two cases in turn.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2180 M. ARROYO AND M. ORTIZ

4.2.1. Interior points. Let x ∈ int(convX) be an interior point. Define the partition function
Z : Rd × Rd → R associated with the node set X as

N
Z(x, ) ≡ exp[−|x − x a |2 +  · (x − x a )] (16)
a=1

Proposition 4.1
Suppose affX = Rd and x ∈ int(convX) and let  ∈ [0, ∞). Then, the unique solution of the
local max-ent problem (LME) is
1
pa (x) = exp[−|x − x a |2 + ∗ (x) · (x − x a )], a = 1, . . . , N (17)
Z(x, ∗ (x))
where
∗ (x) = arg min log Z(x, ) (18)
 ∈ Rd

Furthermore, the minimizer ∗ (x) is unique.

Proof
The first N Kuhn–Tucker conditions are
0 = |x − x a |2 + log pa + 1 + ∗0 + ∗ · (x a − x) a = 1, . . . , N
whence we obtain
exp[−|x − x a |2 + ∗ · (x − x a )]
pa =
exp(∗0 + 1)
for all x ∈ int(convX). The optimal Lagrange multipliers ∗0 and ∗ are the maximizers of the
Lagrange dual function
g(0 , ) = inf L(p, 0 , ) = − 0 − f∗ (−0 1 − YT )
p∈RN
+

where

N
f∗ (q) = exp(qa − |x − x a |2 − 1)
a=1

is the function conjugate to f (p) [24]. The Lagrange dual function can be maximized explicitly
with respect to 0 , with the result
 

N
∗0 = arg max −0 − exp(−0 − 1) exp[−|x − x a |2 +  · (x − x a )]
0 ∈R a=1

which in turn yields the identity


Z(x, ) = exp(∗0 + 1)
Inserting this expression into the Lagrange dual function, the reduced Lagrange dual function
ĝ() = − log Z(x, )

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2181

is obtained. Thus it follows that if x ∈ int(convX), then the local max-ent shape functions are
given by Equations (17) and (18). The existence of the minimizer of −ĝ, ∗ (x), is guaranteed
by the Kuhn–Tucker theorem. Next we show that this minimizer is unique. The gradient of
the objective function is

N
r(x, ) ≡ * log Z(x, ) = pa (x, )(x − x a ) (19)
a=1

where pa (x, ) denotes the evaluation of (17) at an arbitrary value  of the Lagrange multiplier.
As expected, the first-order optimality condition results in the first-order consistency condition.
The Hessian of the objective function is

N
J(x, ) ≡ * * log Z(x, ) = pa (x, )(x − x a ) ⊗ (x − x a ) − r(x, ) ⊗ r(x, ) (20)
a=1

Consider now a non-zero vector u ∈ Rd and let ua = u·(x−x a ). Since by assumption affX = Rd
it follows that not all ua are identical. In addition, it follows from (17) that pa (x, )>0. Hence,
by the strict convexity of the square function, we have
 2
 
u · J(x, )u = pa (x, )u2a − pa (x, )ua >0
a a

Consequently, J(x, ) is positive definite for all  ∈ Rd and therefore log Z is strictly convex
and the minimizer of (18) is unique. 

Note that, in the absence of the first-order consistency conditions, the resulting approximants
are Shepard’s functions with Gaussian weight.

4.2.2. Boundary points. The treatment of boundary points x ∈ bd(convX) can be reduced to the
problem analysed in the preceding section by exploiting the reduced face problem property of
convex approximation schemes (cf. Section 2.2). Recall that the contact set C(x) of the point x
with respect to convX is the smallest face of convX that contains x. The dimension of C(x) is
the face dimension of x. The local max-ent shape functions then follow from the translation and
restriction of problem (LME) to the node set X  = X ∩ C(x) − x and the subspace L = affX ,
whose dimension is the face dimension of x. If the face dimension is zero, then the problem is
trivial. Otherwise, x − x belongs to the interior of convX = C(x) − x ⊂ L, and Proposition 4.1
applies.

4.3. Spacial smoothness of the shape functions


The smoothness of the local max-ent approximants p (x) is not guaranteed a priori, since they
are characterized pointwise by a convex program. We next establish the differentiability of the
shape functions with respect to x, which ultimately depends on the smoothness of ∗ (x). As
before, we assume affX = Rd for simplicity and without loss of generality.

Proposition 4.2
Suppose affX = Rd and let  : convX → [0, ∞) be C r in int(convX). Then the local max-ent
shape functions are of class C r in int(convX).

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2182 M. ARROYO AND M. ORTIZ

Proof
Consider the function from Rd × Rd to Rd given by

F (x, ) = pa (x, )(x − x a )
a

If  ∈ C r , then it follows from (16) and (17) that F is likewise is C r . In the proof of
Proposition 4.1 we have verified that F (x, ∗ ) = 0 and det * F (x, ∗ )  = 0 for all x ∈ int(convX).
Consequently, by the implicit function theorem, ∗ (x) is a C r function in int(convX), and the
theorem follows from (16) and (17). 

Computing the derivatives of the shape functions with respect to x is not altogether straight-
forward because the derivatives of the Lagrange multiplier ∗ (x) are involved in the calculation.
In Appendix A, the explicit form of the derivatives of the shape functions with respect to x is
given for a general function (x). If  is constant, the remarkably simple expression

∇pa (x) = −pa (x)J(x, ∗ (x))−1 (x − x a ) (21)

is obtained as a special case.

4.4. Smoothness and limits with respect to the thermalization


We next study the dependence of the local max-ent approximants on the thermalization
parameter . We first establish that p is a continuous function of  in [0, +∞) and smooth
in (0, +∞). We next analyse the more interesting athermal limit, as  → +∞.

4.4.1. Continuity and differentiability properties

Proposition 4.3
Let x ∈ convX. Let p (x) be the unique minimizer of the problem (LME) . Then, p (x) is a
continuous function of  in [0, +∞).

Proof
Let 0 ∈ [0, +∞). We first show that

f (x, ·)−→f0 (x, ·) as  −→ 0 uniformly on Px (X) (22)

We fix x and omit it from all expressions in the proof. Equation (22) simply follows from the
fact that for any given >0, |f0 (p) − f (p)|  , ∀ ∈ [0, +∞) ∩ [0 − , 0 + ] with

 = /(N diam2 X) (23)

and ∀p ∈ P(X). Here diam X denotes the diameter of convX. Then, by uniform convergence
and recalling that (LME) and (LME)0 have unique minimizers, we have

lim f (p ) = lim min f (p) = min f0 (p) = f0 (p0 )
→0 →0 p∈P(X) p∈P(X)

We now consider a sequence {k }k∈N ⊂ [0, +∞) converging to 0 , and the associated sequence
of minimizers of (LME)k , {pk }k∈N ⊂ P(X). Since P(X) is compact, this sequence has at

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2183

least a convergent subsequence {pk }j ∈N converging to q ∈ P(X). By uniform convergence


j
and the continuity of f0 in P(X), it easily follows that
lim fk (pk ) = f0 (q)
j →∞ j j

From the arguments above, it follows that f0 (p0 ) = f0 (q), and consequently q = p0 , by the
uniqueness of the minimizer of f0 . Thus, any convergent subsequence of {pk }k∈N converges
to p0 , and, invoking the compactness of P(X), we conclude that
lim pk = p0
k→∞

Since the sequence {k }k∈N converging to 0 is arbitrary, the continuity of the minimizer as a
function of  at any point 0 ∈ [0, +∞) immediately follows. 

It follows as an immediate corollary that for x ∈ convX,


lim p (x) = p0 (x)
→0

and thus the max-ent convex approximation schemes are recovered from their local counterparts
in the limit  → 0. Furthermore, since  in Equation (23) is independent of x, it follows that if
 → 0 uniformly on convX (for instance if the thermalization parameter is uniform in convX),
then p → p0 uniformly on convX.
Finally, we note the following smoothness property p :

Proposition 4.4
Let x ∈ convX. Then, p (x) is a C ∞ function of  in (0, +∞).

Proof
The proof is identical to that of Proposition 4.2 with F regarded as a function of  and . 

4.4.2. Athermal limit. The limit  → + ∞ is better analysed by considering the problem equiv-
alent to (LME) of minimizing the function

fˆ (x, p) = U (x, p) − −1 H (p)


in Px (X). The case  = + ∞ coincides with Rajan’s program (RAJ). The methods used in
Proposition 4.3 only provide a definite answer about the limit of p as  → +∞ when the
points are in general positions:

Proposition 4.5
Let x ∈ convX. Consider a sequence of non-negative reals {k }k∈N diverging to +∞ as
k → +∞. Then, every convergent subsequence of {pk (x)}k∈N converges to a solution of
(RAJ). Furthermore, if the nodes of X are in general positions, then p (x) converges to the
unique solution of (RAJ), the Delaunay convex approximants, as  → +∞.

Proof
The proof is analogous to that of Proposition 4.3. The last argument requires the uniqueness
of the limit problem. One simply needs to verify that fˆ → U uniformly as  → +∞.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2184 M. ARROYO AND M. ORTIZ


a pa log pa  log N for all p ∈ R+ such that 1 · p = 1, it immediately follows
N
Indeed, since
that |fˆ (x, p) − U (x, p)|  (log N )/, ∀x ∈ convX, ∀p ∈ Px (X). 

We next show that regardless of the uniqueness of the minimizers of the athermal problem
(RAJ), the limit of the minimizers of (LME) exists as  → +∞. Thus, the max-ent regu-
larization of Rajan’s program selects a distinguished element of the set of solutions of (RAJ),
that which is Pareto optimal, which is unique by virtue of the strict concavity of the entropy
in the convex set SxRAJ (X).

Proposition 4.6
Let x ∈ convX, and consider the solution of (RAJ) with maximum entropy:
pPareto
∞ (x) = arg max H (p)
p∈SxRAJ (X)

Then,
lim p (x) = pPareto
∞ (x)
→+∞

Proof
We fix x and omit it from the proof. Let p̄∞ be an arbitrary element of SRAJ (X) so that
U (p̄∞ ) = m∞ , the minimum value of U in P(X). Let {k }k∈N be a sequence of non-negative
reals diverging to +∞ as k → +∞, and {pk }k∈N the associated sequence of solutions of
(LME)k . Let {pk }j ∈N be a convergent subsequence converging to p̂∞ ∈ SRAJ (X) according
j
to the preceding proposition. For j large enough, kj >1, and consequently, by the convexity
of SRAJ (X), we have
 
1 1
p∞j ≡ 1 − p̄∞ + p̂∞ ∈ SRAJ (X)
kj kj

Consider also the sequence in P(X) defined by


 
1 1
pj ≡ 1 − p∞j + p
kj kj kj

It is clear that limj →∞ pj = p̄∞ . By optimality, we have


1 1
U (pk ) − H (pk )  U (pj ) − H (pj )
j kj j kj

and since
 
1 1
U (pj ) = 1− m∞ + U (pk )
kj kj j

it follows that
kj (U (pk ) − m∞ ) − H (pk )  (U (pk ) − m∞ ) − H (pj ) (24)
j j j

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2185

By the continuity of U , limj →∞ (U (pk ) − m∞ ) = 0. On the other hand, noting that fˆk  U ,
j j
we have
H (p̂∞ )
0  fˆk (pk ) − U (p̂∞ )  fˆk (p̂∞ ) − U (p̂∞ ) = −
j j j kj

and consequently

0  kj (U (pk ) − m∞ )  H (pk ) − H (p̂∞ )


j j

which by virtue of the continuity of H implies that limj →∞ kj (U (pk ) − m∞ ) = 0. Thus,
j
taking limits at both sides of Equation (24), we conclude that

−H (p̂∞ )  − H (p̄∞ )

Since p̄∞ is an arbitrary element of SRAJ (X), we conclude that p̂∞ = pPareto
∞ . Thus, every
convergent subsequence of {pk }k∈N has pPareto
∞ as its limit and, by the compactness of P(X),
we conclude that

lim pk = pPareto



k→∞

Since the sequence {k }k∈N diverging to +∞ is arbitrary, the proposition follows. 

Example 4.1
Consider a node set consisting of the four corners of the square  = [0, 1] × [0, 1] ⊂ R2

X = {(0, 0), (1, 0), (1, 1), (0, 1)}

For this node set, the Delaunay triangulation is not unique and (RAJ) has multiple solutions.
Let us define the bilinear shape functions

p1bil (x, y) = (1 − x)(1 − y), p2bil (x, y) = x(1 − y), p3bil (x, y) = xy, p4bil (x, y) = (1 − x)y

and note that pbil (x, y) ∈ P(x,y) (X), ∀(x, y) ∈ . It is readily verified that in this particular
case (actually for any rectangular configuration) pbil = p irrespective of the value of . Indeed,
from the first three Kuhn–Tucker conditions

0 = [(x − xa )2 + (y − ya )2 ] + log pabil + 1 + ∗0 + ∗x (xa − x) + ∗y (ya − y), a = 1, 2, 3

we can obtain explicitly


1−x 1−y
∗0 = 1 + x(x − 1) + y(y − 1) + x log + y log
x y
1−x
∗x = (2x − 1) + log
x

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2186 M. ARROYO AND M. ORTIZ

1−y
∗y = (2y − 1) + log
y

With these Lagrange multiplier values, the fourth Kuhn–Tucker condition is identically satisfied.
Thus, for the square the Pareto set collapses to a single point and, consequently, its
pPareto
∞ = pbil , which, according to the propositions above, is the solution of (RAJ) of maximum
entropy. This simple examples thus shows that, in addition to the Rajan convex approximants
corresponding to the two Delaunay triangulations and their convex combinations, the bilinear
shape functions are also a Rajan convex scheme, and in fact the distinguished one. A simple
calculation shows that indeed U = x(1 − x) + y(1 − y) for the bilinear shape functions and for
both sets of shape functions associated with the Delaunay triangulations. It is also easy to verify
explicitly that the entropy of the bilinear shape functions is greater. This example illustrates
how the local max-ent approximants eliminate the degeneracy of the Delaunay triangulation.
This example also illustrates the lower semi-continuity of the minimum negative entropy of
all the Rajan approximants when seen as a function of the node positions, i.e. of −Hm (X, x) ≡
− H (pPareto
∞ (x)). Here, pPareto
∞ corresponds to the node set X. Thus, at degenerate configurations
Hm (X, x) fails to be continuous with respect to the first argument in general, but is lower-
semicontinuous. Consider for instance the following one-parameter family of node sets:

Xs = {(−s, −s), (1, 0), (1 + s, 1 + s), (0, 1)}

and consider the average value of the negative entropy of the distinguished Rajan approximant
(for s = 0 there is only one Rajan approximant, the Delaunay approximant) in convXs :

convXs −Hm (Xs , x) dx


h(s) =
volume(convXs )
It is readily verified that

−1 for s = 0
h(s) =
−5/6 otherwise

which demonstrates the lower-semicontinuity of −Hm (Xs , x) at s = 0.

Example 4.2
To illustrate the selection of a distinguished Rajan convex approximant by the max-ent regular-
ization in degenerate cases, we consider a node set in R2 containing four co-spherical nodes.
The local max-ent shape function corresponding to one of the co-spherical nodes is depicted
in Figure 3 for a value of  = 10h−2 close to the athermal limit, where h is a representative
nodal spacing. It is verified from the figure that the regularization connects the co-spherical
nodes with bilinear shape functions, whereas the nodes in general positions are connected by
means of ostensibly affine shape functions.

4.5. Alternative interpretations of the local max-ent program


The following interpretations of the local max-ent program provide useful insights into the
nature of the resulting approximation schemes.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2187

Figure 3. Illustration of the thermal regularization in degenerate cases for  = 10h−2 ,


where h is the nodal spacing. The nodal set contains four co-spherical nodes that are
connected by bilinear shape functions. The remaining nodes are in general positions
and are connected by ostensibly affine shape functions.

4.5.1. Regularization of the Delaunay program. The local max-ent program may be regarded as
a perturbation of Rajan’s program (RAJ). The perturbation regularizes that linear program, pos-
sibly having multiple optima, into a one-parameter family of better-behaved smooth and strictly
convex programs. The last proposition shows that local max-ent approximation effectively re-
moves the degeneracy of the Delaunay triangulation. Thus, when the Delaunay triangulation
is not unique the optimal path of local max-ent approximants converges to a unique distin-
guished set of Rajan shape functions in the limit. This distinguished shape functions are those
of maximum entropy and are Pareto optimal. Thus, the proposed regularization is in analogy
to barrier and penalty methods in linear and convex programming, and viscosity solutions of
variational problems [25, 26].

4.5.2. A dual regularization of the Delaunay program. Alternatively, one can start from the
dual (RAJ) problem, an unconstrained non-smooth piecewise-linear convex program (the min-
imization of a polyhedral convex function), and approximate it by a one-parameter family of
smooth, strictly convex programs. The dual (RAJ) program is
min max ( · (x − x a ) − |x − x a |2 ) (25)
∈Rd a=1,...,N

Indeed, equivalent forms of this problem are [24]

minimize t
subject to  · (x − x a ) − |x − x a |2  t, a = 1, . . . , N

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2188 M. ARROYO AND M. ORTIZ

and

maximize −b · 
subject to AT  + c  0

where we write
t 1T 1
= , A= , b= , c = {|x − x1 |2 , . . . , |x − xN |2 }T (26)
 Y 0

The dual is this program is

minimize c·p
subject to Ap = b
p0

which is a restatement of (RAJ), as expected. Now consider the log-sum-exp function


N 

lse(z1 , . . . , zN ) = log exp za (27)
a=1

and the one-parameter family of analytic functions


1
h (z1 , . . . , zN ) = lse(z1 , . . . , zN ) (28)

which, in view of the estimates,
log N
0  h (z1 , . . . , zN ) − max za  (29)
a=1,...,N 
approximate uniformly the max function as  → ∞ [24]. Replacing the max function in
Equation (25) by this approximation gives
N 
1 
min log exp[ · (x − x a ) − |x − x a | ]
2
(30)
∈Rd  a=1

which is a statement of the dual program of (LME) .

4.5.3. Statistical mechanics interpretation. The parallel between the local max-ent program and
statistical mechanics is apparent in the preceding developments and can be further formalized
as follows. Consider a system whose configuration space is the index set I = {1, . . . , N}, and
we wish to describe the statistics of this system, i.e. its probability distribution p. The energy
of configuration a ∈ I is

Ea = |x a − x|2 (31)

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2189

where x is a controllable parameter. Its statistical expectation is the internal energy of the

system U (x, p) = N a=1 pa Ea . The canonical distribution of the system then follows as the
solution to the variational problem

inf {U (x, p) − −1 H (p)} (32)


p∈Px (X)

where  = 1/kBT , kB is Boltzmann’s constant and T is the absolute temperature. Evidently, (32)
is just a re-statement of the problem (LME) . The value of the infimum F (x, ) in (32) is the
free energy of the system. From this perspective, the local max-ent problem (LME) may be
regarded as a thermalization of the Rajan problem (RAJ). Conversely, the Rajan problem, may
be regarded as the zero temperature limit of the thermalized problem. Thus, thermalization
replaces a problem of energy minimization at zero temperature, namely the Rajan problem
(RAJ), by the problem of computing the partition function (16) at finite temperature. Since
the configuration space is the index set I , the evaluation of the partition function reduces
to the computation of a finite sum. In this manner, the solution of the linear programming
problem (RAJ) is replaced by an explicit calculation. This connection between the computation
of partition functions and energy minimization in the limit of zero temperature indeed was the
original insight that led to the present work.

4.6. Examples
Figure 4 shows the local max-ent shape function and its partial derivatives for a node in a
two-dimensional node set as a function of the dimensionless parameter  = h2 , where h is a
measure of the nodal spacing and  is constant over the domain. It can be seen from this figure
that the shape functions are smooth and their degree of locality is controlled by the parameter .
For the maximum value of  = 6.8 shown in the figure the shape function ostensibly coincides
with the Delaunay shape function.
The parameter  can be allowed to depend on position and that dependence can be adjusted
adaptively in order to achieve varying degrees of locality. Figure 5 shows an illustration of this
type of adaptivity. In this example, the function (x) is chosen such that the finite element
limit is attained at the left and centre nodes of the node set, with increasing thermalization
away from these nodes. In applications, the question of the optimal distribution of  over the
domain of analysis arises naturally. In problems whose solutions minimize a certain functional,
a natural strategy is to select (x) variationally, i.e. to let (x) be such function as minimizes
the functional. However, this enhancement of the method will not be pursued in this paper.
Figure 6(a) illustrates the behaviour of the local max-ent shape functions at the boundary of
the domain. In particular it should be noted that the shape functions of interior nodes vanish at
the boundary, and that the shape functions of extreme nodes equal one at their corresponding
node.
Finally, we briefly discuss measures of locality other than (11). For instance, suppose that
we adopt a general distance d in place of the standard Euclidean distance employed in (11).
The width of shape function pa is then

w[pa ] = pa (x)d 2 (x, x a ) dx (33)


Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2190 M. ARROYO AND M. ORTIZ

Figure 4. Local max-ent shape functions for a two-dimensional arrangement of nodes, and spacial
derivatives (arbitrary scale) for several values of  = h2 .

0.5

0
0 5 10 15 20

Figure 5. One-dimensional example of seamless transition to finite elements


achieved by tuning the function (x).

In this case, the local max-ent shape functions become


1
pa (x) = exp[−(x)d 2 (x, x a ) + ∗ (x) · (x − x a )]
Z(x, ∗ (x))
with the partition function appropriately modified. In particular, this extension can be useful for
purposes of defining anisotropic shape functions, e.g. in problems involving localization. For

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2191

Figure 6. (a) Illustration of the behaviour of the local max-ent shape functions at the boundary of
the domain; and (b) example of anisotropic local max-ent shape function.

instance, 
Figure 6(b) shows a shape function defined using an Euclidian distance of the form
d(x, y) = (x − y) · G(x − y), for a constant metric tensor G. The localization of the shape
function to a plane is noteworthy in the figure.
Another variation of the standard local max-ent program is to consider measures of locality
that are not quadratic in the distance. In this manner, shape functions with different decay
behaviour may be obtained. For instance, consider width of the form

w[pa ] = pa (x)|x − x a |p dx (34)


where 1  p  ∞. Figure 7(a)–(c) shows the resulting shape functions for p = 1, 3, 6 and
 = 2.8, 1.4, 1.0, respectively. It is interesting to note that the l 1 -shape functions are not dif-
ferentiable at their corresponding nodes. Figure 7(d) shows the function resulting from the
function log |x − x a | with  = 2, which leads to a decay of the form ∼r −2 with distance r.
Interestingly, shape functions constructed from this locality measure verify the Kronecker-delta
property.

4.7. Practical evaluation of the shape functions


In practice, the evaluation of the local max-ent approximants at a given point x ∈ convX does
not require the solution of the (LME) as a constrained convex program involving N unknowns.
Instead, as shown in Section 4.2, it is sufficient to solve an unconstrained minimization problem,
Equation (18), involving at most d unknowns, namely, the face dimension of x. This problem
is smooth, strictly convex and, by virtue of the Kuhn–Tucker theorem, is guaranteed to have
a unique solution. This fact confers efficiency and robustness to the calculation of the local

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2192 M. ARROYO AND M. ORTIZ

Figure 7. Local max-ent shape functions for l p -distances: (a) p = 1 and  = 2.8; (b) p = 3 and  = 1.4;
(c) p = 6 and  = 1.0; and (d) log function and  = 2.0.

max-ent approximants. The resulting set of non-linear equations


r(x, ) = 0 (35)
for ∗ , where r(x, ) is given by Equation (19), can be solved numerically, e.g. by means of
a few Newton–Raphson interaction. The requisite Hessian matrix is given by Equation (20).
To illustrate the effectiveness of the Newton–Raphson method to solve Equation (35), the
average number of iterations required for convergence with the criterion |r(x, k )|  TolNR
is presented in Table I for different values of  and different tolerances. The node set of
Figure 4 is used in this example, and the fine grid of sample points used to generate these
plots is used to compute the average. It is observed that, for a tolerance of 10−5 , between 2
and 3 iterations are sufficient over a broad range of values of the thermalization parameter. It is

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2193

Table I. Average number of Newton–Raphson


iterations per sample point.
TolNR
 10−5 10−10 Machine precision

0.8 2.7 3.6 4.2


1.8 2.4 3.5 4.3
2.8 3.0 4.0 5.0
6.8 4.0 5.0 6.0

also observed that, as the athermal limit is reached (for large values of ), the Newton–Raphson
method needs more iterations for convergence. This slowdown in convergence is expected
since the objective function tends to a faceted polyhedral convex function as  → +∞ (cf.
Section 4.5.2).
It is evident from Equation (17) that the support of the shape functions is the entire domain
convX, i.e. the local max-ent shape functions have global support. However, the shape functions
decay as exp(−r 2 ) with distance r to the corresponding node, and can thus be truncated at
distances greater than a small multiple of −1/2 . This decay property establishes a connection
with Gaussian radial basis functions (cf. e.g. Reference [4]) and has the important consequence
that only a small number of nodes contribute ostensibly to the partition function, which greatly
reduces the computational cost of the solution of problem (18). For practical purposes, a
tolerance Tol0 is set below which we set the shape functions to zero. Therefore, owing to the
Gaussian decay, the numerically effective support  of the shape function
 corresponding to node
x a is the ball centred at this point of radius Ra = − log(Tol0 )/ = h − log(Tol0 )/. By way
of illustration, if the tolerance is the double precision machine precision (∼10−16 ) and  = 1.8,
then the radius is below 4.5 times the nodal spacing. The fast decay of the shape functions
can be observed in the first three columns of Figure 4, where the support has been determined
for a tolerance of Tol0 = 10−6 .
Once the shape functions have been evaluated, the calculation of the derivatives of the shape
functions is explicit by means of Equation (21).

5. APPLICATION TO LINEAR AND NON-LINEAR ELASTICITY

Local max-ent approximation schemes provide a convenient choice of basis to use in conjunction
with the Rayleigh–Ritz—or constrained energy minimization—approach to elasticity problems.
In this section, we first present a displacement patch-test, and then proceed to demonstrate
the accuracy and convergence characteristics of the local max-ent solutions by means of two
examples of application: the standard benchmark problem of a linear elastic built-in cantilever
beam loaded at the tip; and the upsetting and extension of a block of compressible neo-Hookean
rubber. In all calculations we confine our attention to uniformly distributed node sets and the
parameter  is taken to be uniform. In addition, all numerical integrals are carried out using
standard quadrature rules based on the limiting Delaunay triangulation. For large values of ,
the local max-ent approximation scheme differs little from simplicial interpolation and a one-
point quadrature rule is found to suffice. For lower values of  a three point rule for triangles

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2194 M. ARROYO AND M. ORTIZ

and a four point rule for tetrahedra are used. The accuracy of the local max-ent solutions
is compared to that of the Delaunay linear finite elements. This comparison is natural in the
present context since linear finite elements arise in the athermal limit.

5.1. Patch test


In the displacement patch test, the boundary of the computational domain is subjected to an
affine transformation. For the numerical method to pass the test, the numerical solution in
the interior of the domain must reproduce this affine transformation exactly. Since the local
max-ent approximants satisfy the first-order reproducing condition, the patch test is passed to
machine precision if exact integration is used. Thus, the test assesses the numerical quadrature
and the accuracy in the calculation of the shape functions.
Consider the square [0, 1] × [0, 1] of a linear isotropic elastic material characterized by Young
modulus E = 1 and Poisson’s ratio  = 0.3. The boundary of the square is subjected to a linear
transformation characterized by the matrix
 √ 
1 − 3/2

3 1/2

The two node sets depicted in Figure 8 are considered. The figure also shows a typical
quadrature rule, based on the underlying Delaunay triangulation. Symmetric rules ranging from
1 to 175 points per triangle are considered in the calculations [27].
Table II reports the relative L2 errors for two different node sets, for two values of , and for
nine different quadrature rules. To specifically investigate the influence of numerical quadrature,
we have assigned TolNR and Tol0 to machine precision in this table. It can be observed that for
the structured node set, it is possible to pass the patch test ostensibly within machine precision
with 175 integration points per triangle. However, for an unstructured node set the errors reduce
more slowly as the number of quadrature points is increased, and a maximum precision of
3 × 10−6 is achieved. These results conform to previous experience for other meshfree methods,
special quadrature rules notwithstanding [28]. It should also be mentioned that in the range of
one to six quadrature points per triangle there is no significant difference between the structured
and unstructured node sets. It is also noteworthy that, for the node set (b), a lower value of
 requires a more accurate quadrature for a given error in the patch test. The relative errors

(a) (b)

Figure 8. Node sets (a) and (b) used for the patch test, and typical layout of the quadrature points
based on the underlying Delaunay triangulation.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2195

Table II. Relative errors in the L2 norm for the patch test.
Quadrature (a)  = 1.8 (a)  = 0.8 (b)  = 1.8 (b)  = 0.8

1 3 × 10−3 1 × 10−2 2 × 10−3 1 × 10−2


3 5 × 10−4 3 × 10−4 4 × 10−4 1 × 10−3
6 4 × 10−4 3 × 10−4 3 × 10−4 1 × 10−3
7 3 × 10−5 6 × 10−6 2 × 10−4 2 × 10−4
12 8 × 10−6 7 × 10−8 8 × 10−5 2 × 10−4
25 4 × 10−8 2 × 10−11 3 × 10−5 8 × 10−5
54 3 × 10−10 7 × 10−16 1 × 10−5 3 × 10−5
126 7 × 10−14 4 × 10−16 5 × 10−6 1 × 10−5
175 3 × 10−16 6 × 10−16 3 × 10−6 7 × 10−6

reported in Table II for the first four rules are not affected if TolNR = Tol0 = 10−5 . These are
the tolerances used in subsequent examples for computational efficiency.

5.2. Linear elasticity: cantilever beam


We begin by considering the standard benchmark problem of a linear elastic cantilever beam
acted upon by a parabolic distribution of tractions at one end and built-in boundary conditions at
the other end. The exact solution to this problem is known √ and may be looked up in Reference
[29]. We consider beams whose length-to-height ratio is 8/ 3. The problem is symmetric with
respect to the neutral axis of the beam, and therefore only the upper half of the beam is
analysed.
It should be noted that, owing to the weak Kronecker-delta property of the local
max-ent approximants, the imposition of Dirichlet boundary conditions is straightforward with-
out degrading the optimal rate of convergence: it is sufficient to set the nodal displacement
to the prescribed values. Indeed, owing to the properties outlined in Section 2.2, the basis
functions corresponding to the nodes on a face alone retain the full approximation properties
on that face.
Figure 9 compares the exact deformed configuration of the beam and the numerical deformed
configurations obtained by means of linear finite elements and local max-ent shape functions
with  = 1.8. In this example, Poisson’s ratio is  = 0.3 and thus volumetric locking is not an
issue. The superior accuracy of the local max-ent approximation scheme is clearly apparent
from this figure.
Convergence plots are shown Figure 10. In this plots, the L2 norm of the error is defined
as
 1/2
eL2 = |u − uh |2 dx (36)


whereas the energy semi-norm is


  1/2
1
eE = ( −  ) : ( −  ) dx
h h
(37)
2 

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2196 M. ARROYO AND M. ORTIZ

(a) (b)

Figure 9. Linear elastic built-in cantilever loaded at the tip, deformed configurations:
(a) finite element solution; and (b) local max-ent solution. Also shown for reference
as a solid line is the exact deformed configuration.

100 100 100 100

10-1

1.8
10-1
0.9

Energy error
Energy error
L2 error

L2 error
10-2 10-1 10-1

10-2 γ=4.8
γ=3.8 γ=1.8
10-3 1.0
γ=2.8 FEM
γ=1.8
2.0 γ=0.8
FEM
10-4 10-3 10-2 10-2
100 100 100 100
(a) nodal spacing nodal spacing (b) nodal spacing nodal spacing

Figure 10. Linear elastic built-in cantilever loaded at the tip,


convergence plots: (a)  = 0.3; and (b)  = 0.499.

In these expressions u, , and  denote the exact displacement, strain and stress fields, and uh ,
h , and h denote the corresponding numerical approximations. The convergence plots display
the numerical errors normalized by the norm of the exact solution as a function of a normalized
nodal spacing.
Figure 10(a) shows the convergence plots for the case of  = 0.3. It is evident from these plots
that there is an optimal value for the parameter  of around 1.8 for which accuracy is maximized.
A similar behaviour is observed in meshfree methods in terms of the dilation parameter. For
very low values of  convergence is degraded. This is due to the numerical quadrature: with
the 12 point rule and  = 0.8, the optimal rate of convergence is recovered. It should be
noted that for the other values of  considered, improving the numerical quadrature does not
affect noticeably the numerical errors, which are mostly due to the approximation properties
of the local max-ent schemes. As expected, the behaviour of the local max-ent solutions
approaches that of the Delaunay solutions as  increases. The local max-ent solutions exhibit
optimal convergence rates, whereas the Delaunay solutions display slightly suboptimal rates.
Furthermore, the accuracy of the local max-ent solutions is vastly superior to that of the
Delaunay solutions, as evidenced by the large shift in the convergence curves. The distinct
advantage of the local max-ent approximations over Delaunay approximations persists when
the cost of the solutions is carefully accounted for.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2197

Figure 10(b) shows the convergence plots for a nearly incompressible material. It is well
known that increasing the support size in MLS meshfree methods can alleviate the problem
of volumetric locking. However, locking is not completely eliminated (cf. Reference [4] and
references therein) since the discretization schemes do not commute exactly with the Helmoltz–
Hodge decomposition of the displacement field. This type of behaviour is also exhibited by local
max-ent approximations; for nearly incompressible materials, the asymptotic rate of convergence
is reached at much coarser discretizations for the local max-ent approximants than for simplicial
finite elements (cf. Figure 10(b)). Nevertheless, as the incompressible limit is reached locking
occurs for both methods.

5.3. Non-linear elasticity: hyper-elastic block


Next we demonstrate the performance of local max-ent approximations in strongly non-linear
problems. The test case under consideration concerns the compression and tension of a hyper-
elastic block. The material is compressible-neo Hookean with energy density
1 
W (F) =  log2 (J ) −  log(J ) + tr(FT F) (38)
2 2
where F is the deformation gradient, J = det(F) and  and  are Lamè constants. Two sets
of material constants are considered: / = 2, corresponding to an initial Poisson’s ratio of
0 = 0.333; and / = 100, corresponding to an initial Poisson’s ratio of 0 = 0.495. By the
symmetry of the problem only an eighth of the sample needs to be analysed. Quasi-static
loading conditions are considered. The prescribed displacements are applied incrementally and
the total potential energy is minimized by the conjugate gradient method.
Figure 11 shows the deformation of the block at 75% compression for the second parameter
set. The robustness of the method for non-linear problems involving severe deformations is

Figure 11. Compression of a hyper-elastic block.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2198 M. ARROYO AND M. ORTIZ

Figure 12. Final deformation for the finest FE mesh and second-coarsest
local max-ent discretization (0 = 0.495).

0.02 0.5
γ=1.6 γ =1.6
FEM
Signed relative energy error

Signed relative energy error

FEM 0.4
0.015

0.3
0.01
0.2

0.005
0.1

0 0
(a) nodal spacing 100 (b) nodal spacing 100

Figure 13. Signed relative error in strain energy with respect to a reference numerical
solution for: (a) 0 = 0.333; and (b) 0 = 0.495.

evident from this figure. It is interesting to note that some nodes lie outside the deformed
body, which is a consequence of the non-interpolatory character of the shape functions.
Figure 13 shows the deformed configurations of the block at 100% tensile deformation. The
calculation is performed with seven uniform node sets of variable resolution and for both sets
of material constants. Figure 13 shows the dependence of a normalized signed relative error in
strain energy (relative to an overkill numerical solution) on the nodal spacing. It is evident from
this figure that the numerically computed potential energy decreases monotonically with mesh

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2199

refinement, as expected. Figure 13(a) depicts convergence curves for the local max-ent and
finite element solutions in the compressible case (0 = 0.333). It is observed from that figure
that the accuracy of the local max-ent solution is vastly superior to that of the finite element
solution. The finest finite element solution has a comparable—albeit slightly larger—error than
the second-coarsest local max-ent solution, Figure 12. By contrast, the CPU time incurred
by the local max-ent solution is over a hundred times shorter than that of the finite element
solution. This difference in performance is more pronounced in the nearly incompressible case,
Figure 13(b). In this case, the finite element solution converges very slowly and the coarse
meshes result in very large errors. Indeed, the coarsest local max-ent solution suffices to achieve
an accuracy comparable to that of the finest finite element mesh, at a CPU-time knockdown
factor of one thousand.

6. SUMMARY AND CONCLUDING REMARKS

We have developed a new type of approximation scheme, which we term local max-ent approx-
imation scheme, which represents a compromise between unbiased statistical inference, in the
sense of information theory, and the desire to define shape functions of the least width possible.
The resulting shape functions are non-negative, possess a weak Kronecker-delta property at the
boundary, and reduce in the limit to piecewise affine interpolation over a Delaunay triangu-
lation, whereas away from this limit the shape functions are smooth and analogous to those
employed in MLS schemes. In this sense, local max-ent approximation supplies an efficient
bridge between simplicial finite elements and meshfree methods. In particular, by adjusting
the spatial variation (x), it is possible to select regions of the domain of analysis which
are treated by finite elements and regions that are treated in the style of meshfree methods,
with seamless smooth transitions between those regions. This is in contrast to other approaches
coupling meshfree schemes with finite elements (cf. Reference [4] and references therein).
Although we have not developed here the blending of the local max-ent approximants with
other convex approximants such as subdivision approximants, NURBS, or linear finite elements,
the variational nature of our method makes this coupling straightforward. The calculation of the
shape functions can be carried out simply, robustly and efficiently in any spatial dimension. The
numerical tests presented in this paper suggest that, for problems possessing smooth solutions,
local max-ent approximations supply high accuracy at low cost.
We conclude by pointing out some of the limitations of the method and opportunities for
further development. As defined in this paper, local max-ent shape functions are necessarily
positive, which precludes the formulation of high-order schemes satisfying the second-order
consistency condition in Equation (9). The positivity constraint is introduced mainly in or-
der to enable the interpretation of the shape functions as probability densities, which in turn
facilitates the conceptual connections with information theory. Higher-order schemes may be
obtained by relaxing the requirement expressed in Equation (9). These developments, as well
as the formulation of strictly compactly supported max-ent approximants, will be reported in a
subsequent paper.
The local max-ent approximation schemes have been defined in the convex hull of the
node set. If non-convex domains are treated, the local max-ent approximants lose the weak
Kronecker-delta property in the non-convex parts of the boundary, and thus behave similarly to
MLS approximants. The effective treatment of non-convex domains is of considerable interest

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2200 M. ARROYO AND M. ORTIZ

and has been extensively studied in the context of MLS-based meshfree methods [30–32].
Visibility, diffraction, and constrained path criteria have been proposed that modify how nearby
nodes interact in the vicinity of a non-convex part of the boundary. These methods are directly
applicable to local max-ent approximation. For instance, it is possible to replace the Euclidean
distance |x − x a | in the definition of the shape functions by the length of the shortest path
contained within the domain connecting x and x a . Alternatively, the non-convex domain can
be decomposed into convex sub-domains and approximation schemes can then be constructed
separately in each of the sub-domains. The schemes in each sub-domain are guaranteed to be
conforming by the conforming patches property of local max-ent approximation schemes.
Finally we remark briefly on the possibility of adapting the function (x). In problems having
a variational structure and where the solutions obey a minimum principle, the natural approach
is to let the minimum principle itself select the optimal function (x). In this view, the energy
function is minimized with respect to the displacement field and with respect to (x). This
program is facilitated by the ability to compute explicitly derivatives of shape functions with
respect to , and by the guaranteed solvability of Equation (18) for any non-negative value of .
This is in contrast to the problem of choosing an optimal value of the dilation parameter in
MLS approximations. In this case, it is often not straightforward to obtain analytical derivatives
of the shape functions with respect to the dilation parameter. In addition, the dilation parameter
is subject to lower-bound solvability constraints which are difficult to verify a priori.

APPENDIX A: SPATIAL DERIVATIVES OF THE SHAPE FUNCTIONS

In this appendix we detail the procedure for the calculation of the spatial derivative of the
shape functions. For the sake of generality we consider the case in which the parameter (x) is
spatially non-uniform. We denote spatial gradients of real functions by ∇, whereas for vector-
valued functions we denote by Dy(x) the matrix partial derivatives. The symbol * denotes
partial differentiation. We define the following functions

fa (x, , ) = −|x − x a |2 +  · (x − x a ) (A1)


exp[fa (x, , )]
pa (x, , ) =  (A2)
b exp[fb (x, , )]

r(x, , ) = pa (x, , )(x − x a ) (A3)
a

*r 
J(x, , ) = = pa (x, , )(x − x a ) ⊗ (x − x a ) − r(x, , ) ⊗ r(x, , ) (A4)
* a

Given a function h(x, , ), we define the function h∗ that depends only on x as
h∗ (x) = h(x, ∗ (x), (x))
where ∗ (x) is the unique maximizer of
 

g() = − log exp[fa (x, , (x)]
a

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
LOCAL MAXIMUM-ENTROPY APPROXIMATION SCHEMES 2201

Our goal is to compute ∇pa∗ . It is readily verified that


 
∗ ∗ ∗  ∗ ∗
∇pa = pa ∇fa − pb ∇fb
b
By the chain rule, we have
 ∗  ∗  ∗
*fa *fa *fa
∇fa∗ = + D∗ + ∇ (A5)
*x * *
where
 ∗  ∗  ∗
*fa *fa *fa
= − 2(x − x a ) + ∗ (x), = (x − x a ), = − |x − x a |2
*x * *

The only term that is not available explicitly in Equation (A5) is D∗ . In order to compute
this term we note that, since r∗ is identically zero,
 ∗  ∗  ∗
∗ *r *r ∗ *r
0 = Dr = + D + ⊗ ∇
*x * *

Simple calculations show that


 ∗  ∗  ∗
*r ∗ *r *r 
=J , = − 2 J∗ + id, =− pa∗ |x − x a |2 (x − x a )
* *x * a

whence it follows that


 
∗ ∗ −1 ∗ −1 
D = 2 id − (J ) + (J ) pa∗ |x − x a | (x − x a ) ⊗ ∇
2
a

Rearranging terms, and noting that pa∗ verifies the linear consistency condition, we finally obtain

∇pa∗ = − pa∗ (J∗ )−1 (x − x a ) + pa∗ Ka ∇ (A6)


where we write
 

Ka = pb∗ |x − xb | (x − xb ) · (J∗ )−1 (x − x a ) − |x − x a |2 + U (x, p∗ )
2
b

ACKNOWLEDGEMENTS
The authors gratefully acknowledge the support of the Department of Energy through Caltech’s ASCI
ASAP Center for the Simulation of the Dynamic Response of Materials, and the support received
from NSF through an ITR grant on Multiscale Modelling and Simulation and Caltech’s Center for
Integrative Multiscale Modelling and Simulation.

REFERENCES
1. Nayroles B, Touzot G, Villon P. Generalizing the finite element method: diffuse approximation and diffuse
elements. Computational Mechanics 1992; 10(5):307–318.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202
2202 M. ARROYO AND M. ORTIZ

2. Belytschko T, Lu YY, Gu L. Element-free Galerkin methods. International Journal for Numerical Methods
in Engineering 1994; 37(2):229 –256.
3. Liu WK, Li S, Belytschko T. Moving least square reproducing kernel methods. Part I: methodology and
convergence. Computer Methods in Applied Mechanics and Engineering 1997; 143(1–2):113 –154.
4. Huerta A, Belytscko T, Fernández-Méndez S, Rabczuk T. Chapter meshfree methods. Encyclopedia of
Computational Mechanics, vol. 1. Wiley: Chichester, 2004; 279 –309.
5. Duarte CA, Oden JT. An h-p adaptive method using clouds. Computer Methods in Applied Mechanics and
Engineering 1996; 139(1– 4):237–262.
6. Babuška I, Melenk JM. The partition of unity method. International Journal for Numerical Methods in
Engineering 1997; 40(4):727–758.
7. Jaynes ET. Information theory and statistical mechanics. Physical Review 1957; 106(4):620 – 630.
8. Rockafellar RT. Convex Analysis. Princeton University Press: Princeton, NJ, 1970.
9. Prautzch H, Boehm W, Paluszny M. Bézier and B-spline Techniques. Springer: Berlin, 2002.
10. DeVore RA. The Approximation of Continuous Functions by Positive Linear Operators. Springer: Berlin,
1972.
11. Sukumar N, Moran B, Belytschko T. The natural element method in solid mechanics. International Journal
for Numerical Methods in Engineering 1998; 43(5):839 – 887.
12. Cirak F, Ortiz M, Schröder P. Subdivision surfaces: a new paradigm for thin-shell finite-element analysis.
International Journal for Numerical Methods in Engineering 2000; 47(12):2039 –2072.
13. Cottin C, Gavrea I, Gonska HH, Kacsó DP, Zhou DX. Global smoothness preservation and variation-
diminishing property. Journal of Inequalities and Applications 1999; 4(2):91–114.
14. Fernández-Méndez S, Huerta A. Imposing essential boundary conditions in mesh-free methods. Computer
Methods in Applied Mechanics and Engineering 2004; 193(12–14):1257–1275.
15. Karlin S, Shapley LS. Geometry of moment spaces. Memoirs of the American Mathematical Society 1953;
12.
16. Tagliani A. Existence and stability of a discrete probability distribution by maximum entropy approach.
Applied Mathematics and Computation 2000; 110:105 –114.
17. Khinchin AI. Mathematical Foundations of Information Theory. Dover: New York, 1957.
18. Shannon CE. A mathematical theory of communication. The Bell System Technical Journal 1948;
27(3):379 – 423.
19. Mead LR, Papanicolaou N. Maximum entropy in the problem of moments. Journal of Mathematical Physics
1984; 25(8):2404 –2417.
20. Warren J, Schaefer S, Hirani A, Desbrun M. Barycentric coordinates for convex sets, 2005, submitted.
21. Floater MS, Hormann K, Kós G. A general construction of barycentric coordinates over convex polygons.
Advances in Computational Mathematics 2005, in press.
22. Sukumar N. Construction of polygonal interpolants: a maximum entropy approach. International Journal for
Numerical Methods in Engineering 2004; 61(12):2159–2181.
23. Rajan VT. Optimality of the Delaunay triangulation in R d . Discrete and Computational Geometry 1994;
12(2):189 –202.
24. Boyd S, Vandenberghe L. Convex Optimization. Cambridge University Press: Cambridge, U.K., 2004.
25. Attouch H. Viscosity solutions of minimization problems. SIAM Journal on Optimization 1996; 6(3):
769 – 806.
26. Attouch H, Cominetti R. Lp approximation of variational problems in L1 and L∞ . Nonlinear Analysis 1999;
36(3):373 –399.
27. Wandzura S, Xiao H. Symmetric quadrature rules on a triangle. Computers and Mathematics with Applications
2003; 45(12):1829 –1840.
28. Breitkopf P, Rassineux A, Savignat JM, Villon P. Integration constraint in diffuse element method. Computer
Methods in Applied Mechanics and Engineering 2004; 193(12–14):1203 –1220.
29. Timoshenko S, Goodier JN. Theory of Elasticity. McGraw-Hill: New York, 1951.
30. Organ D, Fleming M, Terry T, Belytschko T. Continuous meshless approximations for nonconvex bodies by
diffraction and transparency. Computational Mechanics 1996; 18(3):225 –235.
31. Belytschko T, Krongauz Y, Organ D, Fleming M, Krysl P. Meshless methods: an overview and recent
developments. Computer Methods in Applied Mechanics and Engineering 1996; 139(1– 4):3 – 47.
32. Krysl P, Belytschko T. Element-free Galerkin method: convergence of the continuous and discontinuous shape
functions. Computer Methods in Applied Mechanics and Engineering 1997; 148(3 – 4):257–277.

Copyright 䉷 2005 John Wiley & Sons, Ltd. Int. J. Numer. Meth. Engng 2006; 65:2167–2202

You might also like