
Chapter 1

1 Introduction and Background to the FEM

The Finite Element Method (FEM) is a way of obtaining approximate (numerical) solutions to differential equations. Broadly speaking, the FEM is used to reduce differential equations to systems of algebraic equations which can be more easily solved. There are two usual ways to derive this system of equations: using

(1) Galerkin’s weighted residual method

or

(2) a variational method together with the Rayleigh-Ritz scheme

Both of these approaches are discussed below but the former, being the more general of the
two, is the one which will be followed in most of this text. The Galerkin method is
described in §1.1 and §1.2 and the Variational approach is briefly discussed in §1.3.

1.1 Weighted Residual Methods

The FEM using the Galerkin method is more specifically called the Galerkin Finite
Element Method (GFEM). Before discussing the GFEM, which is done in the next
Chapter, it is worthwhile discussing Galerkin’s Method, from which it derives. (In fact,
many of the important concepts of the FEM are touched upon in this chapter.) Galerkin’s
Method is what one might use to obtain a solution to a differential equation if one did not
have a computer. It was only with the development of the computer in the 1950s that the
Galerkin Method was generalised to the Galerkin FEM.

Galerkin’s method¹ is one of a number of numerical techniques known as Weighted Residual Methods. These various weighted residual methods are often as effective as each other, but it is the Galerkin method which leads naturally into the Finite Element

¹ Boris Grigoryevich Galerkin was a Russian engineer who taught in the St. Petersburg Polytechnic. His method, which he originally devised to solve some structural mechanics problems, and which he published in 1915, now forms the basis of the Galerkin Finite Element Method. I.G. Bubnov independently devised a similar method around the same time, and Galerkin's method is also known as the Bubnov-Galerkin method.

The Finite Element Method 1 Kelly



Method². The two other most commonly encountered weighted residual methods are the Collocation Method and the Method of Least Squares; these are special cases of the most general Petrov-Galerkin Method, which is described in §1.1.4.

The Collocation, Least Squares and Galerkin methods will be illustrated here through the
following simple one dimensional example problem: solve the linear ordinary differential
equation (ODE)

d²u/dx² − u + x = 0,   u(0) = 0,   u(1) = 0                              (1.1)

[the exact solution is u(x) = x − sinh x / sinh 1]

Begin by assuming some form to u, usually a polynomial3. For example, take as a trial
function

u~ x   a  bx  cx 2 (1.2)

This trial function has three unknowns, two of which can immediately be obtained from
the boundary conditions (BC’s), leading to a trial function which automatically satisfies
these BC’s:


ũ(x) = b(x − x²)                                                         (1.3)

It now remains to determine b.

1.1.1 The Collocation Method

The most direct method is to satisfy the differential equation at some point in the interval x ∈ [0,1]; this is the Collocation Method. Which point one chooses is arbitrary, but it makes sense to choose the midpoint, which usually yields the best results. Substituting (1.3) into (1.1) and setting x = 1/2, one finds that b = 2/9 and the approximate solution is

² Rather, the most commonly encountered FEM is that based on the Galerkin method, but it is possible to derive FEM equations using other weighted residual methods, most importantly the Petrov-Galerkin Method (see later).
³ It is not necessary to use polynomials; e.g. one could use sinusoids, Σᵢ aᵢ sin(iπx).


ũ(x) = (2/9)(x − x²)                                                     (1.4)

Slightly different results will be obtained by choosing to enforce the differential equation
at different points.

More accuracy can be achieved by choosing higher order polynomials. For example, one
could begin with a cubic and so have the trial function which satisfies the BC’s

ũ(x) = bx + cx² − (b + c)x³                                              (1.5)

With two unknowns, one needs two equations. For example, enforcing (1.1) at the equi-spaced points x = 1/3 and x = 2/3 leads to the system of equations {▲Problem 2}

62b + 2c = 9
59b + 29c = 9                                                            (1.6)

and solving these leads to the approximate solution

ũ(x) = (1/560)(81x + 9x² − 90x³)                                         (1.7)

The “1-term” (Eqn. 1.4) and “2-term” (Eqn. 1.7) approximate solutions are graphed in Fig.
1.1, the latter being virtually indistinguishable from the exact solution at this scale. Using
the methods described here, one would expect the 1-term solution to be within, perhaps,
10-20% of the exact solution. Ever more accurate solutions can be obtained by increasing
the order of the polynomial and solving systems with ever greater numbers of equations.
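The small linear systems arising here are easy to check by machine. The following sketch (Python with exact rational arithmetic; an illustration added here, not part of the original text) extracts the collocation equations (1.6) from the residual of the cubic trial function (1.5) and reproduces the coefficients of Eqn. 1.7:

```python
from fractions import Fraction as F

def residual(x, b, c):
    # R = u'' - u + x for the cubic trial u = b*x + c*x^2 - (b + c)*x^3
    u  = b*x + c*x**2 - (b + c)*x**3
    u2 = 2*c - 6*(b + c)*x           # second derivative of the trial
    return u2 - u + x

# R is linear in (b, c); enforcing R = 0 at x = 1/3 and x = 2/3 gives a
# 2x2 linear system, extracted here by probing with unit coefficients.
pts = [F(1, 3), F(2, 3)]
rows, rhs = [], []
for x in pts:
    r0 = residual(x, F(0), F(0))
    rows.append([residual(x, F(1), F(0)) - r0,
                 residual(x, F(0), F(1)) - r0])
    rhs.append(-r0)

# solve the 2x2 system by Cramer's rule (exact rational arithmetic)
det = rows[0][0]*rows[1][1] - rows[0][1]*rows[1][0]
b = (rhs[0]*rows[1][1] - rows[0][1]*rhs[1]) / det
c = (rows[0][0]*rhs[1] - rhs[0]*rows[1][0]) / det
print(b, c)   # 81/560 and 9/560, the coefficients in Eqn. 1.7
```

Because the residual is linear in b and c, probing it with unit coefficients recovers the system (1.6) exactly; no floating-point arithmetic is involved.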


Figure 1.1: Collocation Method solution to Eqn. 1.1


1.1.2 The Method of Least Squares

Consider now an alternative solution procedure, wherein the differential equation is multiplied across by some weight function ω(x) and the complete equation is integrated over the domain:

∫₀¹ [d²u/dx² − u + x] ω(x) dx = 0                                        (1.8)

Again, choose a trial function which satisfies the boundary conditions, for example the
quadratic (1.3), leading to

∫₀¹ [d²ũ/dx² − ũ + x] ω(x) dx = ∫₀¹ [bx² + (1 − b)x − 2b] ω(x) dx = 0    (1.9)

The term inside the square brackets is the residual R, and it is this which one wants to drive to zero. The idea here is that if this integral is zero for any arbitrary weight function ω, then the residual itself must be zero also.

There is one unknown in (1.9) and the question now is: what function ω(x) does one choose? Again, the choice here is somewhat arbitrary (but see below). In the Least Squares method, one chooses ω(x) = ∂R/∂b = x² − x − 2, leading to a quartic integrand in Eqn. 1.9; integration then leads to the equation {▲Problem 3}

(47/10)b − 13/12 = 0   →   b = 65/282                                    (1.10)

which is close to the Collocation Method solution b = 2/9.

Note the primary difference between the Collocation Method and the Least Squares
Method. In the former, the differential equation is satisfied at one or more particular
points. In the latter, (1.9), the differential equation is forced to zero in some average way,
determined by the weight function, over the complete domain.

As before, choose now a higher order polynomial to improve the solution, say the cubic
trial function of (1.5), rewritten as


ũ(x) = a₁x + a₂x² − (a₁ + a₂)x³                                          (1.11)

Again, this is substituted into (1.8). This time, with two unknown coefficients, one
requires two equations, which are obtained by choosing two different weight functions,
namely

ω₁(x) = ∂R/∂a₁ = −7x + x³
ω₂(x) = ∂R/∂a₂ = 2 − 6x − x² + x³                                        (1.12)

leading to two integral equations which can be evaluated to obtain

 1436 2783
  a1   15
32

105
 2783
420

    21   a1  0.1485, a 2  0.0154 (1.13)
 a 2   10 
449
 420 105

Substituting (1.13) back into (1.11) gives the approximate solution, which is close to the
exact solution to the problem.

The reason why the Least Squares Method works is as follows: if the integral (1.8) is zero
for the complete set of functions4

1  1,  2  x,  3  x 2 ,   n  x n ,  (1.14)

then the residual will be identically zero. As one takes higher order polynomial trial functions, the weight functions ωᵢ = ∂R/∂aᵢ contain higher order terms in x and more and more terms from the complete set of functions (1.14) are included, and the solution is obtained with ever increasing accuracy.

The Collocation Method as a Weighted Residual Method

Re-visiting now the Collocation Method, it can be seen that this is also a weighted residual
method, with the weights chosen to be

ωᵢ(x) = δ(x − xᵢ)                                                        (1.15)

⁴ By which is meant that any function can be represented as a linear combination of these functions.


where δ(x − xᵢ) is the Dirac delta function, which is zero for x ≠ xᵢ and satisfies ∫ f(x) δ(x − xᵢ) dx = f(xᵢ). For example, the solution (1.4) is derived by considering the integral/equation

I = ∫₀¹ [d²u/dx² − u + x] δ(x − 1/2) dx = [d²u/dx² − u + x]_{x=1/2} = 0  (1.16)

Symmetry of the Least Squares Method

The coefficient matrix of the Least Squares system of equations (1.13) is symmetric. This is always desirable, particularly for large systems of equations, since special rapid equation-solver algorithms are available for symmetric coefficient matrices. The Least Squares coefficient matrix is always symmetric provided the differential equation is linear.
This can be shown as follows: write the differential equation in operator form

L[u] = f(x)                                                              (1.17)

Substituting in the approximation ũ = Σᵢ aᵢxⁱ leads to, provided L is a linear operator,

L[ ai x i ]   ai L[ x i ]  f ( x) , (1.18)

with the weight functions

ωⱼ(x) = ∂R/∂aⱼ = ∂/∂aⱼ [ Σᵢ aᵢ L[xⁱ] − f(x) ] = L[xʲ]                    (1.19)

leading to the system of integral equations, one for each weight,

 a  Lx Lx dx   f ( x) Lx dx,


i
i j j
j  1, 2,  (1.20)

For symmetry one requires that the coefficient of ai in equation j be equal to the
coefficient of a j in equation i, and (1.20) clearly results in a symmetric coefficient matrix.


1.1.3 Galerkin’s Method

In Galerkin’s Method, the weight functions are chosen through

ωᵢ = ∂ũ/∂aᵢ                                                              (1.21)

As with the Method of Least Squares, the higher the order of the approximating polynomial ũ, the higher the order of the terms xⁱ included in the weight functions, so that the more weight functions are chosen, the more of the complete set of functions (1.14) will be included, and the closer the residual will be to zero.

Again considering problem (1.1) and using the trial function which satisfies the boundary
conditions, Eqn. 1.3, one has the single weight function

ω = ∂ũ/∂b = x − x²                                                       (1.22)

which, when substituted into (1.9) and the integral is evaluated, leads to

 11
30 b  12  0
1
 b 5
22 (1.23)

The cubic polynomial satisfying the boundary conditions, Eqn. 1.5, together with the weight functions (with a₁ = b, a₂ = c)

u~ u~
1   x  x 3 , 2   x2  x3 (1.24)
a1 a 2

leads to the system of integral equations

I₁ = ∫₀¹ [2a₂ + (1 − 7a₁ − 6a₂)x − a₂x² + (a₁ + a₂)x³] (x − x³) dx = 0
I₂ = ∫₀¹ [2a₂ + (1 − 7a₁ − 6a₂)x − a₂x² + (a₁ + a₂)x³] (x² − x³) dx = 0  (1.25)

Evaluating the integrals leads to the system of equations


  105
92
 137
420   a1   152  0
 137     a1  69
, a2  8
(1.26)
 420  17  a 2   201  0 473 473

and hence to the approximate solution

ũ(x) = (1/473)(69x + 8x² − 77x³)                                         (1.27)

From the definition of the Galerkin weight function (1.21), the trial function can be written
in the alternative form

ũ(x) = ω₁(x)a₁ + ω₂(x)a₂ + ⋯ ,    ωᵢ(x) = ∂ũ/∂aᵢ                         (1.28)

This form of the trial function will be used in most of what follows.
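As a check on the system (1.26), the sketch below (an added illustration, not part of the original text) assembles the Galerkin equations ∫ L[φᵢ]φⱼ dx = −∫ xφⱼ dx directly from the operator L[u] = u″ − u of the model problem:

```python
from fractions import Fraction as F

def pmul(p, q):
    # product of two polynomials given as coefficient lists [c0, c1, ...]
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def int01(p):
    # exact integral of a polynomial over [0, 1]
    return sum(c / (k + 1) for k, c in enumerate(p))

def d2(p):
    # second derivative of a polynomial coefficient list
    return [k*(k - 1)*c for k, c in enumerate(p)][2:] or [F(0)]

def Lop(p):
    # L[u] = u'' - u for the model problem u'' - u + x = 0 (Eqn. 1.1)
    dd = d2(p)
    n = max(len(dd), len(p))
    dd = dd + [F(0)]*(n - len(dd))
    pp = list(p) + [F(0)]*(n - len(p))
    return [a - b for a, b in zip(dd, pp)]

# Galerkin basis = weights for the cubic trial (Eqn. 1.24)
phi1 = [F(0), F(1), F(0), F(-1)]   # x   - x^3
phi2 = [F(0), F(0), F(1), F(-1)]   # x^2 - x^3
x    = [F(0), F(1)]

M   = [[int01(pmul(Lop(pi), pj)) for pi in (phi1, phi2)] for pj in (phi1, phi2)]
rhs = [-int01(pmul(x, pj)) for pj in (phi1, phi2)]

det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
a1  = (rhs[0]*M[1][1] - M[0][1]*rhs[1]) / det
a2  = (M[0][0]*rhs[1] - rhs[0]*M[1][0]) / det
print(a1, a2)   # 69/473 and 8/473, reproducing Eqn. 1.26
```

Here the weights equal the basis functions, the defining feature of the Galerkin choice (1.21); swapping in different weight polynomials would give a Petrov-Galerkin scheme instead.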

Integration by Parts & the Weak Form

The Galerkin method as presented gives reasonably accurate numerical solutions to differential equations. A modified version of the method involves integrating by parts the term inside the weighted integral which contains the highest, second order, derivative from the differential equation. For the example considered above, this means re-writing (1.8) as

∫₀¹ [d²u/dx² − u + x] ω dx = 0                                           (1.29a)

∫₀¹ [du/dx · dω/dx + (u − x)ω] dx − [ω du/dx]₀¹ = 0                      (1.29b)

There are two advantages to integrating by parts:

i) a linear trial function can be used
ii) the Galerkin coefficient matrix is symmetric for certain equations

Before discussing these points, introduce the following terminology. The original
differential equation and BC’s, Eqn. 1.1, is referred to as the strong statement of the
problem. The weighted residual equation of (1.29b) is referred to as the weak statement of
the problem. The terminology weak statement (or weak problem or weak formulation or
weak form) is used to mean two different things: it is most often used to mean that the


problem is stated in integral form, contrasting with the strong form of the differential
equation, which must be satisfied at all points on the interval of interest; in that sense Eqns.
1.29a,b are weak forms. However, the weak form more correctly means that the required
differentiability of the solution is of an order less than that in the original differential
equation; in that sense Eqn. 1.29a is a strong form whereas Eqn. 1.29b is a weak form. To
avoid ambiguity, we will here maintain the terminology “weak form” to mean the form of
Eqn. 1.29b.

Regarding (i), it is clear that the second derivative of a linear trial function ũ(x) = a + bx is zero, so that the first term in Eqn. 1.29a vanishes and the information it carries is lost. This is not the case for the weak form, Eqn. 1.29b, which retains this information through the first-derivative term. Of course the two coefficients in a linear trial function can be immediately found from the boundary conditions and so the weighted residual (1.29) is not necessary to obtain what is a trivial solution; however, linear trial functions can be used in the Galerkin FEM, as outlined in the next Chapter, and the integration by parts is then essential.

Regarding (ii), the Galerkin coefficient matrix in (1.26) is symmetric, but this is fortuitous
– in general, it is not. However, the weak form of (1.29) is symmetric {▲Problem 4}. To
generalise this, consider the arbitrary linear ODE

d 2u du
p( x) 2
 q ( x)  r ( x)u  f ( x) (1.30)
dx dx

Multiplying by ω and integrating over the complete domain leads to

∫ [pωu″ + qωu′ + rωu] dx = ∫ fω dx                                       (1.31)

To integrate the first term by parts, first note that an integration by parts without the ω gives ∫ pu″ dx = [pu′] − ∫ p′u′ dx. This suggests that one adds and subtracts a term to/from Eqn. 1.31:

∫ [pωu″ + p′ωu′ − p′ωu′ + qωu′ + rωu] dx = ∫ fω dx                       (1.32)

so that an integration by parts of (1.32) gives the weak form

∫ [−pω′u′ − p′ωu′ + qωu′ + rωu] dx = ∫ fω dx − [pωu′]                    (1.33)


The terms on the left, involving u, contribute to the coefficient matrix. Writing ũ = Σᵢ aᵢωᵢ leads to a system of equations, and the relevant terms are:

∫ [ −p (Σᵢ aᵢωᵢ′) ωⱼ′ − p′ (Σᵢ aᵢωᵢ′) ωⱼ + q (Σᵢ aᵢωᵢ′) ωⱼ + r (Σᵢ aᵢωᵢ) ωⱼ ] dx   (1.34)

For symmetry, this integral should be unchanged if i and j are interchanged. This will be so if p′ = q, so a second-order equation leading to symmetry is the equation

d  du 
 p( x)   r ( x)u  f ( x) (1.35)
dx  dx 

This is known as the self-adjoint ODE. When p′ = q = 0, one has the equation

p₀ d²u/dx² + r(x)u = f(x)                                                (1.36)

where p₀ is a constant, and an equation of the form b(x)u″ + c(x)u = d(x) can always be put in the form (1.36) by dividing through by b(x). It can be seen that Eqn. 1.1 is of this form, hence the symmetric matrix of Eqn. 1.26.

Consider again now the example problem (1.1), only this time using the weak form of
(1.29):

∫₀¹ [dũ/dx · dω/dx + (ũ − x)ω] dx − [ω dũ/dx]₀¹ = 0                      (1.37)

The quadratic trial function satisfying the BC’s, Eqn. 1.3, leads to

1
  dx  1  2 x x  x   1

 1  2 x   x  x
2 2
 2 2 1

0 b   x x  x dx
2
 (1.38)
0  0

which gives b  10 / 44 . Moving to the cubic polynomial u~  a11 ( x)  a 2 2 ( x) with the


weights as in (1.24) leads to


I₁ = ∫₀¹ [dũ/dx · dω₁/dx + (ũ − x)ω₁] dx − [ω₁ dũ/dx]₀¹ = 0
I₂ = ∫₀¹ [dũ/dx · dω₂/dx + (ũ − x)ω₂] dx − [ω₂ dũ/dx]₀¹ = 0              (1.39)

and the (symmetric) system of equations and solution

[ 92/105   137/420 ] [a₁]   [ 2/15 ]
[ 137/420    1/7   ] [a₂] = [ 1/20 ]   →   a₁ = 69/473,  a₂ = 8/473,   ũ = (1/473)(69x + 8x² − 77x³)   (1.40)

This is actually the same system as was obtained without the integration by parts (see
Eqns. 1.26, 1.27); in general though, this will not be the case.

A further approximation would be the quartic polynomial

u~  a  bx  cx 2  dx 3  ex 4 (1.41)

giving the trial function satisfying the BC’s

u~  a11 ( x)  a 2 2 ( x)  a3 3 ( x) (1.42)

with

ω₁ = x − x⁴,    ω₂ = x² − x⁴,    ω₃ = x³ − x⁴                            (1.43)

leading to the integrals, symmetric system and solution

 du~ d i ~  du~ 
1 1

Ii     u  i  x i dx    i   0
0
dx dx   dx  0

 8863
929
1260
769
2520   a1   16  96406 
 a1   14427
 929 4 493  a    1 , a    539  (1.44)
 1260 9 2520   2   12   2   96406 
 2520
769 493
2520
113
1260
  a3   301  6041 
 a3   48203 

 u~  0.1497 x  0.0056 x 2  0.1253 x 3  0.0187 x 4


Differential Equations with Non-constant Coefficients

Differential equations with non-constant coefficients can be handled as above. As an illustration, consider the problem

x² d²u/dx² − 2u − 1 = 0,   u(1) = 0,   u(2) = 0                          (1.45)

[the exact solution is u(x) = (1/14)(6/x + x² − 7)]

The weighted residual is

I = ∫₁² [x² d²u/dx² − 2u − 1] ω(x) dx = 0                                (1.46)

To integrate this type of term by parts, one needs to add and subtract terms (as was done above in Eqns. 1.31-32). First note that

∫ x² (d²u/dx²) dx = x² (du/dx) − ∫ 2x (du/dx) dx                         (1.47)

Adding and subtracting the term 2x(du/dx)ω then leads to

I = ∫₁² { [x² d²u/dx² + 2x du/dx] ω − 2x (du/dx) ω − 2uω − ω } dx
  = ∫₁² { −x² (du/dx)(dω/dx) − 2x (du/dx) ω − 2uω − ω } dx + [x² (du/dx) ω]₁² = 0   (1.48)

Using a quadratic trial function which satisfies the BC’s,

u~  a ( x  1)( x  2) ,  ( x)  ( x  1)( x  2) (1.49)

one has

I = ∫₁² { −x²a(2x − 3)² − 2ax(2x − 3)(x − 1)(x − 2) − 2a(x − 1)²(x − 2)² − (x − 1)(x − 2) } dx
  = −(5/6)a + 1/6 = 0                                                    (1.50)


and the solution

u~  0.2( x  1)( x  2)  0.4  0.6 x  0.2 x 2 (1.51)

A higher order trial solution would be ũ = a + bx + cx² + dx³. Application of the BC's gives ũ = a₁(x − 1)(x − 2) + a₂(x − 1)(x − 2)(x − 3). Actually, one can just as well use the simpler trial function ũ = a₁(x − 1)(x − 2) + a₂x(x − 1)(x − 2), which satisfies the BC's and is cubic. This latter function results in the system of equations

 56 7
  a1   16  a1  0.351027
    1  
5
 13 (1.52)
 10
241
105  a 2   4  a 2  0.0898973

and the solution

u~  2a1  (2a 2  3a1 ) x  (a1  3a 2 ) x 2  a 2 x 3


(1.53)
 0.702054  1.232877 x  0.620719 x 2  0.0898973x 3

Note that the coefficient matrix here is not symmetric; this is as expected since (1.45) is
not of the form (1.35).
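The same assembly works on the interval [1, 2] with the variable coefficient x². The sketch below (an added illustration; helper names are mine) builds the weak-form terms of Eqn. 1.48 for the one- and two-term trial functions, reproducing a = 1/5 from (1.50) and the system (1.52):

```python
from fractions import Fraction as F

def pmul(p, q):
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def d1(p):
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

def int_ab(p, a, b):
    # exact integral of a polynomial over [a, b]
    return sum(c * (b**(k + 1) - a**(k + 1)) / (k + 1) for k, c in enumerate(p))

# weight/basis functions: (x-1)(x-2) and x(x-1)(x-2)
w1 = [F(2), F(-3), F(1)]           # x^2 - 3x + 2
w2 = [F(0), F(2), F(-3), F(1)]     # x^3 - 3x^2 + 2x
X2 = [F(0), F(0), F(1)]            # the coefficient x^2
X  = [F(0), F(1)]                  # x

def coeff(phi, w):
    # weak-form coefficient of a_phi in the equation weighted by w (Eqn. 1.48):
    # int( -x^2 phi' w' - 2x phi' w - 2 phi w ) dx over [1, 2]
    return (-int_ab(pmul(X2, pmul(d1(phi), d1(w))), 1, 2)
            - 2*int_ab(pmul(X, pmul(d1(phi), w)), 1, 2)
            - 2*int_ab(pmul(phi, w), 1, 2))

M   = [[coeff(p, w) for p in (w1, w2)] for w in (w1, w2)]
rhs = [int_ab(w, 1, 2) for w in (w1, w2)]

a_one = rhs[0] / M[0][0]           # one-term solution: a = 1/5 (Eqn. 1.50)

det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
a1  = (rhs[0]*M[1][1] - M[0][1]*rhs[1]) / det
a2  = (M[0][0]*rhs[1] - rhs[0]*M[1][0]) / det
print(float(a_one), float(a1), float(a2))
```

Printing M also confirms the remark above: the off-diagonal entries (7/5 and 13/10 up to sign) differ, so the matrix is not symmetric.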

The 1-term and 2-term solutions are plotted below.

Figure 1.2: Galerkin Method solution to Eqn. 1.45


1.1.4 The Petrov-Galerkin Method

The Petrov-Galerkin method is the most general type of weighted residual method.
Basically it is a catch-all term for all weighted residual methods, more specifically for
those in which the weight functions are not as chosen in the Least Squares or Galerkin
methods. As long as the weight functions embody the complete set of functions (1.14), the
method will converge as higher order trial functions are chosen.

1.1.5 Limitations of Weighted Residual Methods

The Weighted Residual Methods involve selecting an appropriate trial function to represent the solution. Greater accuracy is achieved by increasing the degree of the approximating trial function. Unfortunately, the resulting coefficient matrix might well become ill-conditioned for polynomials of high degree. For example, shown in Fig. 1.3 is a plot of xⁿ over [0,1]; it can be seen that the curves become very close to each other for larger n, and computer rounding will inevitably mean that these curves become numerically indistinguishable.
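This ill-conditioning can be quantified: the Gram matrix of the monomials 1, x, …, xⁿ⁻¹ on [0,1] is the Hilbert matrix Hᵢⱼ = 1/(i+j+1), whose determinant collapses toward zero as the degree grows. The following added sketch (not part of the original text) computes it exactly:

```python
from fractions import Fraction as F

def gram_det(n):
    # Gram matrix of the monomials 1, x, ..., x^(n-1) over [0,1]:
    # H[i][j] = int x^i x^j dx = 1/(i+j+1), the Hilbert matrix
    A = [[F(1, i + j + 1) for j in range(n)] for i in range(n)]
    det = F(1)
    for col in range(n):                 # exact Gaussian elimination
        det *= A[col][col]
        for r in range(col + 1, n):
            fac = A[r][col] / A[col][col]
            A[r] = [a - fac*b for a, b in zip(A[r], A[col])]
    return det

for n in (2, 4, 6, 8, 10):
    print(n, float(gram_det(n)))
```

The determinant drops from about 8e-2 at n = 2 to about 3e-33 at n = 8: in double-precision arithmetic such a matrix is numerically singular, which is exactly the difficulty described above.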


Figure 1.3: A plot of xⁿ over [0,1]

Further, it might be the case that the actual solution is a highly complex function, perhaps a
two or three dimensional function, and that the boundary conditions themselves are
complex. In these cases, instead of selecting a trial function to encompass the complete
domain, it is better (necessary) to use some other numerical method, the natural extension
to Galerkin’s method being the Galerkin Finite Element Method, described in the next
Chapter.


1.2 Galerkin’s Method: Further Applications

So far, Galerkin’s method has been used to solve the second order differential equation
(1.1). Here, it is shown how the method can be used to solve a wide variety of problems:
linear problems with various types of boundary condition, non-linear ODEs and partial
differential equations.

Many of the problems considered here are trivial, in that an exact solution can easily be
obtained; they are used to illustrate the method, which can be used to tackle more complex
problems.

1.2.1 Essential and Natural Boundary Conditions

An essential (or Dirichlet) boundary condition for a second-order differential equation is one on the unknown u. A natural (or Neumann) boundary condition is one on the first derivative, u′. These boundary conditions can be homogeneous (their value at the boundary is zero) or non-homogeneous. For example, the problem of Eqn. 1.1 involves homogeneous essential BC's.

Homogeneous & Non-Homogeneous Essential BC’s

Consider a general problem involving homogeneous essential BC's, u(0) = 0 and u(l) = 0. A trial function ũ = Σᵢ₌₀ⁿ aᵢxⁱ satisfying these BC's is {▲Problem 8}

ũ = Σᵢ₌₁ⁿ⁻¹ aᵢωᵢ ,    ωᵢ = xⁱ − xⁿ/lⁿ⁻ⁱ                                  (1.54)

Note that the weight functions satisfy the essential boundary conditions, i.e. ωᵢ(0) = ωᵢ(l) = 0. This is an important property of the weight functions in the Galerkin method; it has to be the case since the coefficients aᵢ in ũ = Σᵢ₌₁ⁿ⁻¹ aᵢωᵢ are arbitrary and ũ satisfies the essential BC's.

Consider now the case of non-homogeneous BC's, u(0) = u₀ and u(l) = u_l. In this case the trial function is written as ũ = Σᵢ₌₁ⁿ⁻¹ aᵢωᵢ + φ(x). The first part, the sum, again satisfies the essential BC's in homogeneous form, and the extra term involving φ ensures that the non-homogeneous BC's


are satisfied; for example one might let φ = (1 − x/l)u₀ + (x/l)u_l. The weight functions are the same as for the homogeneous BC case, since φ(x) is independent of the aᵢ.

An important consequence of the Galerkin formulation is that, when one integrates by parts the second-order term, one obtains a boundary term of the form (see, for example, Eqn. 1.37)

[ωⱼ dũ/dx]₀^l                                                            (1.55)

and the weight functions are zero at each end, so the boundary term is zero.

To illustrate these points, consider the following example problem with non-homogeneous
BC’s:

d²u/dx² = 1,   u(0) = 0,   u(1) = 3/2                                    (1.56)

[the exact solution is (1/2)x² + x]

Forming the weighted residual and integrating by parts to obtain the weak form, one has

I = −∫₀¹ [du/dx · dω/dx + ω] dx + [ω du/dx]₀¹ = 0                        (1.57)

Choose a quadratic polynomial trial function of the form ũ(x) = Σᵢ₌₀² aᵢxⁱ + φ(x). Applying the BC's gives

u~  a( x  x 2 )  32 x (1.58)

leading to

I = −∫₀¹ { [a(1 − 2x) + 3/2](1 − 2x) + (x − x²) } dx + [ (x − x²) dũ/dx ]₀¹
  = −(1/3)a − 1/6 = 0                                                    (1.59)

→   a = −1/2,   ũ = x + 0.5x²


Note that this is actually the exact solution; a quadratic trial function was used and the
exact solution is quadratic.

Again, re-stating the two important points made: (i) the weight function x − x² satisfies the essential BC's and (ii) the boundary term in (1.59) is zero.

Natural Boundary Conditions

It is not necessary to have the trial function satisfy the natural boundary conditions – they
only have to satisfy the essential BC’s (hence the name essential). For example, consider
the following problem:

d²u/dx² = 1,   u(0) = 1,   du/dx|ₓ₌₁ = 2                                 (1.60)

[the exact solution is (1/2)x² + x + 1]

Choosing a quadratic trial function, u = a₀ + a₁x + a₂x², which satisfies the essential BC, one has

ũ = a₁x + a₂x² + 1,    ω₁ = x,   ω₂ = x²                                 (1.61)

There is an extra unknown coefficient and a second weight function, since the natural
boundary condition has not yet been applied. As usual, the weight functions satisfy the
homogeneous essential BC. The name natural is used since these BC’s arise naturally in
the weak statement of the problem:

Iⱼ = −∫₀¹ [du/dx · dωⱼ/dx + ωⱼ] dx + ωⱼ(1) du/dx|ₓ₌₁ = 0,    j = 1, 2
   = −∫₀¹ [du/dx · dωⱼ/dx + ωⱼ] dx + u′(1) = 0,    j = 1, 2              (1.62)

since ωⱼ(1) = 1 for both weight functions.

Substituting in the trial function and evaluating the integrals leads to

I₁ = −[a₁ + a₂ + 1/2] + u′(1) = 0
I₂ = −[a₁ + (4/3)a₂ + 1/3] + u′(1) = 0                                   (1.63)

[ 1   1  ] [a₁]   [ 1/2 − u′(1) ]   [0]
[ 1  4/3 ] [a₂] + [ 1/3 − u′(1) ] = [0]


Applying the natural boundary condition finally leads to

1 1  a1   32  1
1 4  a     5   0, a1  1, a 2 
 3  2   3  2 (1.64)
~
 u  1 a x  a x  1 x  x 1
2 2
1 2 2

which is the exact solution.

Unlike the case of two essential BC’s, the boundary term here is non-zero.
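The little system (1.63)-(1.64) can be checked directly. In the sketch below (an added illustration; the integrals of Eqn. 1.63 are entered as exact fractions) the natural BC value u′(1) = 2 enters only through the right-hand side:

```python
from fractions import Fraction as F

# weak form (1.62) for u = 1 + a1*x + a2*x^2, weights w1 = x, w2 = x^2:
#   sum_i a_i * int_0^1(phi_i' w_j') dx = u'(1)*w_j(1) - int_0^1 w_j dx
M = [[F(1), F(1)],          # int(1*1) dx,  int(2x*1) dx
     [F(1), F(4, 3)]]       # int(1*2x) dx, int(2x*2x) dx
du1 = F(2)                  # natural BC du/dx = 2 at x = 1; w_j(1) = 1
rhs = [du1 - F(1, 2),       # int_0^1 x dx   = 1/2
       du1 - F(1, 3)]       # int_0^1 x^2 dx = 1/3

det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
a1 = (rhs[0]*M[1][1] - M[0][1]*rhs[1]) / det
a2 = (M[0][0]*rhs[1] - rhs[0]*M[1][0]) / det
print(a1, a2)   # 1 and 1/2, i.e. u = x^2/2 + x + 1, the exact solution
```

Changing du1 changes only the right-hand side, illustrating how a natural BC "arises naturally" in the weak statement rather than constraining the trial function.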

A 4th-order ODE

The Galerkin method can be used to deal with equations of higher order. For example,
here the above ideas are generalized to solve a fourth-order differential equation:

d⁴u/dx⁴ = 0,   u(0) = 0,   u(1) = 1,   u′(0) = 0,   u′(1) = 2            (1.65)

[the exact solution is u(x) = x²]

The weighted residual is

I = ∫₀¹ (d⁴u/dx⁴) ω dx = 0                                               (1.66)

and integrating twice by parts gives

I = ∫₀¹ (d²u/dx²)(d²ω/dx²) dx + [ω d³u/dx³]₀¹ − [(d²u/dx²)(dω/dx)]₀¹ = 0   (1.67)

In this problem, the essential boundary conditions are (see the ω in the two boundary terms)

u and du/dx specified at the end-points                                  (1.68)

and the natural boundary conditions are evidently (again, see the boundary terms)


d²u/dx² and d³u/dx³ specified at the end-points                          (1.69)

Thus in this example there are four essential boundary conditions and no natural boundary
conditions.

Using a quartic trial function then leads to the 1-term approximate solution ũ = 2x³ − x⁴ {▲Problem 13}, which is plotted in Fig. 1.4.


Figure 1.4: Exact and 1-term Galerkin solutions to the 4th-order ODE (1.65)

1.2.2 Non-Linear Ordinary Differential Equations

The solution method for non-linear equations is essentially the same as for linear
equations. The new feature here is that one ends up having to solve a system of non-linear
equations for the unknown coefficients. For example, consider the equation

2 (du/dx)(d²u/dx²) + 1 = 0,   u(0) = 0,   du/dx|ₓ₌₁ = 1                  (1.70)

[the exact solution is u(x) = (2/3)[2^(3/2) − (2 − x)^(3/2)]]
The weak statement of the problem, after an integration by parts (see the method encompassed in Eqns. 1.46-48), is {▲Problem 15}

I = ∫₀¹ [ω − (du/dx)² (dω/dx)] dx + [(du/dx)² ω]ₓ₌₁ = 0                  (1.71)


Using a quadratic trial function satisfying the essential BC, ũ = a₁x + a₂x², leads to the non-linear system of equations {▲Problem 15}

a₁² + 2a₁a₂ + (4/3)a₂² − 3/2 = 0
a₁² + (8/3)a₁a₂ + 2a₂² − 4/3 = 0                                         (1.72)

These equations are fairly simple and can actually be solved using elementary elimination methods; there are two possible solutions

(a₁, a₂) = (2.2298, −2.1114),  (1.4241, −0.2051)                         (1.73)

Note, however, that, in general, a system of non-linear equations cannot be solved by elementary elimination methods; numerical methods such as the Newton-Raphson technique (see Chapter 5) need to be employed.

The solution is not unique. Usually, the physics of the problem lets one know which solution to choose. The second solution, (1.4241, −0.2051), is plotted in Fig. 1.5. It is very accurate since the exact solution can be well approximated by a quadratic polynomial.
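As an illustration of the Newton-Raphson approach mentioned above (the method itself is only covered in Chapter 5), the following added sketch solves the system (1.72) from two different starting points and recovers both roots of (1.73):

```python
# Newton-Raphson iteration, hand-coded for the 2x2 system of Eqns. 1.72
def f(a1, a2):
    return (a1**2 + 2*a1*a2 + 4*a2**2/3 - 3/2,
            a1**2 + 8*a1*a2/3 + 2*a2**2 - 4/3)

def jac(a1, a2):
    # analytic Jacobian of f
    return ((2*a1 + 2*a2,    2*a1 + 8*a2/3),
            (2*a1 + 8*a2/3,  8*a1/3 + 4*a2))

def newton(a1, a2, steps=50):
    for _ in range(steps):
        (f1, f2), ((j11, j12), (j21, j22)) = f(a1, a2), jac(a1, a2)
        det = j11*j22 - j12*j21
        a1 -= (f1*j22 - j12*f2) / det     # solve J * delta = f by Cramer
        a2 -= (j11*f2 - f1*j21) / det
    return a1, a2

root_a = newton(2.2, -2.1)    # converges to ( 2.2298, -2.1114)
root_b = newton(1.4, -0.2)    # converges to ( 1.4241, -0.2051)
print(root_a, root_b)
```

Which root the iteration finds depends on the starting point, mirroring the remark that the physics of the problem usually tells one which solution to choose.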


Figure 1.5: Exact and quadratic Galerkin solutions to the non-linear ODE (1.70)

1.2.3 Partial Differential Equations

Galerkin’s method can also be used to solve partial differential equations, in particular
those containing both time and spatial derivatives. The Galerkin approach is used to


reduce the spatial terms, leaving a system of ordinary differential equations in time. For
example, consider the equation

u  2 u
 0 (1.74)
t x 2

subject to

initial condition:    u(x, 0) = sin πx + x
boundary conditions:  u(0, t) = 0,   u(1, t) = 1

[the exact solution is u(x, t) = sin(πx) e^(−π²t) + x]

Introduce the trial function

ũ(x, t) = a(t)(x − x²) + sin πx + x,    a(0) = 0                         (1.75)

As usual, this trial function explicitly satisfies the essential boundary conditions and the
weight function satisfies the two homogeneous essential boundary conditions.

Note that the coefficient a is now a function of time whereas the weight function is a
function of x only. Following the Galerkin procedure,

∫₀¹ [∂u/∂t − ∂²u/∂x²] ω(x) dx = ∫₀¹ (∂u/∂t) ω dx + ∫₀¹ (∂u/∂x)(dω/dx) dx − [ω ∂u/∂x]₀¹ = 0

→   (∂a/∂t) ∫₀¹ (x − x²)² dx + ∫₀¹ (∂ũ/∂x)(1 − 2x) dx = 0

→   (1/30)(∂a/∂t) + (1/3)a + 4/π = 0                                     (1.76)

This now yields an ordinary differential equation in time, which must be integrated to obtain the solution:

da/dt + 10a + 120/π = 0,    a(0) = 0                                     (1.77)

giving


a(t) = (12/π)(e^(−10t) − 1)                                              (1.78)

The solution is thus

u(x, t) = (12/π)(e^(−10t) − 1)(x − x²) + sin πx + x                      (1.79)

The solution is plotted in Fig. 1.6. Note how the perturbation dies away over time, leaving the linear distribution u = x, the solution of ∂²u/∂x² = 0, as the steady state solution.
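The quality of the one-term solution (1.79) is easy to gauge against the exact solution. In the sketch below (an added illustration) the Galerkin decay rate e^(−10t) stands in for the exact rate e^(−π²t) ≈ e^(−9.87t):

```python
import math

def u_galerkin(x, t):
    # one-term Galerkin solution, Eqn. 1.79
    a = (12/math.pi) * (math.exp(-10*t) - 1)
    return a * (x - x*x) + math.sin(math.pi*x) + x

def u_exact(x, t):
    return math.sin(math.pi*x) * math.exp(-math.pi**2 * t) + x

# at t = 0 the two coincide exactly, since a(0) = 0
print(u_galerkin(0.5, 0.0), u_exact(0.5, 0.0))
# afterwards the approximate decay rate takes over; the error stays small
err = max(abs(u_galerkin(0.5, t) - u_exact(0.5, t)) for t in (0.05, 0.1, 0.2))
print(err)
```

The largest of the sampled errors is a few percent of the solution value, consistent with Fig. 1.6, where the Galerkin and exact curves nearly coincide.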

[curves shown at t = 0, t = 0.1 and larger t, for both the exact and Galerkin solutions]

Figure 1.6: Exact and Galerkin solutions to the PDE (1.74)

1.3 The Variational Approach

It has been shown above how one can use the Weighted Residual Methods to reduce
differential equations to systems of equations which may be solved to obtain approximate
solutions. An alternative approach is to use a variational method. There are many
similarities between this and the weighted residual methods, as will be seen.

What follows here is a brief first step into the branch of mathematics known as the
Calculus of Variations, which is primarily concerned with the minimisation of the values
of certain integrals.

First, introduce the concept of the variation: consider the function u ( x) and a second
function


û(x) = u(x) + εη(x)                                                      (1.80)

where η(x) is some arbitrary function and ε is an infinitesimal scalar parameter. Thus û(x) is everywhere at most infinitesimally close to u(x). The variation of u is the difference between these two, εη(x), also denoted by δu(x), as illustrated in Fig. 1.7:

δu(x) = εη(x) = û(x) − u(x)                                              (1.81)

The variation varies over the domain but it is everywhere infinitesimal, and it is zero where
essential BC’s are applied.


Figure 1.7: The variation of a function u

Note how the variation δu(x), which is a small change to u at a fixed point x, differs from the increment Δu used in calculus, which is a small change in u due to a small change Δx in x.

Considering again the problem of Eqn. 1.60, multiplying the equation across by the
variation of u, integrating over the domain, integrating by parts and using the fact that
δu(0) = 0,

∫₀¹ (d²u/dx² + 1) δu dx = −∫₀¹ (du/dx · d(δu)/dx − δu) dx + u′(1) δu(1) = 0   (1.82)

This is none other than the weak statement (1.62), with the weight function here being the
variation. Now, from Eqns. 1.81 and 1.80,


δ(du/dx) = dû/dx − du/dx = ε dη(x)/dx,   d(δu)/dx = d(εη(x))/dx = ε dη(x)/dx   (1.83)

and one has the important identity

δ(du/dx) = d(δu)/dx   (1.84)
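Identity (1.84), that variation and differentiation commute, can be confirmed symbolically; the particular u(x) and η(x) below are arbitrary illustrative choices, not taken from the text:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')

u = sp.sin(x) + x**2      # any smooth function serves here
eta = x * (1 - x)         # arbitrary variation shape
u_hat = u + eps * eta     # the varied function, Eqn. (1.80)
du = eps * eta            # the variation delta u, Eqn. (1.81)

# delta(du/dx) = d(u_hat)/dx - du/dx, compared with d(delta u)/dx
lhs = sp.diff(u_hat, x) - sp.diff(u, x)
rhs = sp.diff(du, x)
print(sp.simplify(lhs - rhs))  # 0: the two operations commute
```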

To continue, consider a functional F(x, u), that is, a function of another function u(x).
When u undergoes a variation δu and changes to û, F undergoes a consequent change

δF = F(x, u + δu) − F(x, u)
   = (∂F/∂u) δu + O((δu)²)
   = (∂F/∂u) εη + O(ε²)   (1.85)

and the first variation of F, that is the change in F for small ε, is

δF = (∂F/∂u) δu   (1.86)

Similarly, with

δ(F²) = F²(x, u + δu) − F²(x, u)
      = [F(x, u + δu) + F(x, u)][F(x, u + δu) − F(x, u)]   (1.87)

it follows from Eqns. 1.85-86 that

δ(F²) = 2F δF   (1.88)

(as in the formula for ordinary differentiation). Using (1.84) and (1.88), (1.82)
becomes

−∫₀¹ [δ(½(du/dx)²) − δu] dx + u′(1) δu(1) = 0   (1.89)
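Both the first-variation formula (1.86) and the product rule (1.88) can be checked with a computer algebra system; the concrete choice F = u³ below is purely illustrative:

```python
import sympy as sp

u, du = sp.symbols('u delta_u')
F = sp.Function('F')

# (1.88) for a generic F: the first variation of F^2 is 2 F dF
dF2 = sp.diff(F(u)**2, u) * du               # delta(F^2) via (1.86)
twoFdF = 2 * F(u) * sp.diff(F(u), u) * du    # 2 F dF
print(sp.simplify(dF2 - twoFdF))  # 0

# (1.87)-(1.88) to first order, for the illustrative choice F = u**3
Fc = u**3
exact_change = (u + du)**6 - u**6          # F^2(u + du) - F^2(u)
first_var = 2 * Fc * sp.diff(Fc, u) * du   # 2 F dF = 6 u**5 du
remainder = sp.expand(exact_change - first_var)
print(remainder.coeff(du, 1))  # 0: agreement to first order in delta u
```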


Finally, since

δ∫F(x, u) dx = ∫F(x, u + δu) dx − ∫F(x, u) dx = ∫[F(x, u + δu) − F(x, u)] dx = ∫δF dx   (1.90)

i.e. δ∫F dx = ∫δF dx, and u′(1) is constant,

δW(u) = δ{ −∫₀¹ [½(du/dx)² − u] dx + u′(1) u(1) } = 0   (1.91)

This is an alternative weak statement to the problem: the variation of the functional W(u),
i.e. the function of the function u(x), inside the curly brackets, is zero. The problem is
therefore now: find the function u(x) which causes W to be stationary.
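The stationarity claim can be illustrated numerically. As a sketch, assuming the underlying problem (1.60) is u″ + 1 = 0 with u(0) = 1 and prescribed u′(1) = −2 (values inferred from the worked numbers, so an assumption), the exact solution is u = 1 − x − ½x², and perturbing it changes W only at second order in ε:

```python
import sympy as sp

x, eps = sp.symbols('x epsilon')

def W(u):
    # the functional inside the curly brackets of (1.91); the natural BC
    # value u'(1) = -2 is an inferred assumption about problem (1.60)
    return (-sp.integrate(sp.Rational(1, 2) * sp.diff(u, x)**2 - u, (x, 0, 1))
            + (-2) * u.subs(x, 1))

u_exact = 1 - x - x**2 / 2     # solves u'' + 1 = 0, u(0) = 1, u'(1) = -2
eta = x**2 * (1 - x)           # arbitrary variation shape with eta(0) = 0

change = sp.expand(W(u_exact + eps * eta) - W(u_exact))
print(change)  # only an eps**2 term survives: W is stationary at u_exact
```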

Just as the Galerkin method was used to reduce the weighted residual weak statement to a
system of equations to be solved for an approximate solution, the variational weak
statement can be reduced to a system of equations using the Rayleigh-Ritz method, which
is discussed next.

The Rayleigh-Ritz Method

In the Rayleigh-Ritz Method, a trial function satisfying the essential BC's is chosen, say
ũ = a₁x + a₂x² + 1, as in (1.61). The functional in Eqn. 1.91 can then be written as a
function of the unknown coefficients, W(u(x)) → W(a₁, a₂):

W(a₁, a₂) = −∫₀¹ [½(a₁ + 2a₂x)² − (a₁x + a₂x² + 1)] dx − 2(a₁ + a₂ + 1)
          = −½a₁² − a₁a₂ − ⅔a₂² − (3/2)a₁ − (5/3)a₂ − 1   (1.92)

We require that δW(a₁, a₂) = 0. With δW = (∂W/∂a₁)δa₁ + (∂W/∂a₂)δa₂, we require
that ∂W/∂aᵢ = 0. Evaluating these partial derivatives leads to the system of equations


1 1  a1    2   0
3
(1.93)
1 3  a 2    3 
4 5

which is the exact same system as obtained using the Galerkin method (this will always be
the case for differential equations of the self-adjoint form, i.e. as in (1.35)).
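The Rayleigh-Ritz calculation can be reproduced symbolically. This is a sketch, assuming (as inferred from the worked numbers) that problem (1.60) carries the natural BC u′(1) = −2:

```python
import sympy as sp

x, a1, a2 = sp.symbols('x a1 a2')

# quadratic trial function satisfying the essential BC u(0) = 1
u = a1 * x + a2 * x**2 + 1

# functional of (1.91)-(1.92); u'(1) = -2 is an inferred assumption
W = (-sp.integrate(sp.Rational(1, 2) * sp.diff(u, x)**2 - u, (x, 0, 1))
     + (-2) * u.subs(x, 1))

# stationarity dW/da1 = dW/da2 = 0 gives the linear system (1.93)
sol = sp.solve([sp.diff(W, a1), sp.diff(W, a2)], [a1, a2])
print(sol)  # a1 = -1, a2 = -1/2
```

With these coefficients the trial function becomes ũ = 1 − x − ½x², which, under the stated assumption about the problem, is also the exact solution.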

Here, the variational weak statement (1.91) was derived from the strong differential
equation statement of the problem (1.60). It should be noted that in many applications the
variational statement appears quite naturally, for example by using the principle of virtual
displacements in certain mechanics problems, and that the variational statement can be
converted back into the strong statement using the Calculus of Variations.

1.4 Summary

In summary, two broadly different ways of tackling a boundary value problem have been
discussed in this Chapter: the Weighted Residual approach and the Variational
approach. The former involves converting the strong differential equation statement of the
problem into the weak weighted residual statement of the problem, and solving using the
Galerkin or other similar numerical method. The Variational approach involves converting
the strong statement of the problem into a stationary functional problem (or perhaps
beginning with the stationary problem), and solving numerically using the Rayleigh-Ritz
approach. This is summarized in Fig. 1.8. In the figure, the bold arrows show the
method which will be primarily used in this text.


[Flowchart: "Differential Equations & Boundary Conditions" leads, via Integration by parts, to the "Weak Statement", and, via the Calculus of Variations, to the "Variational Statement of the Problem (Weak Statement)". Under "Methods of Approximate Solution", the Method of Weighted Residuals (Petrov-Galerkin), comprising Galerkin's Method, Least Squares and Collocation, acts on the weak statement, while the Rayleigh-Ritz Method acts on the variational statement; all lead to a "System of Equations".]

Fig. 1.8. Weighted Residual and Variational Solution of Differential Equations

1.5 Problems

1. The approximate solution (1.4), ũ(x) = (2/9)(x − x²), to the differential equation (1.1)
   was obtained by using the collocation method, with (1.1) enforced at x = 1/2. What
   is the solution when the point chosen is x = 1/4? Which is the better approximation?
2. Derive the system of equations (1.6), resulting from a cubic trial function for the
differential equation (1.1) and the collocation method.
3. From the weighted residual integral (1.9) and the Least Squares relation
   w(x) = ∂R/∂b, derive the equation (1.10) for the unknown coefficient b.


4. Show that the weak form of (1.29) leads to a symmetric coefficient matrix when the
weight functions are chosen according to the Galerkin Method, Eqn. 1.21 (Do not
consider the boundary term).
5. Use the weighted residual methods, (i) Collocation, (ii) Least Squares, (iii) Galerkin
   (strong form), (iv) Galerkin (weak form), with a quadratic trial function, to solve the
   following differential equations:
   a) u″ = 1, u(0) = u(1) = 0 [exact sln. u(x) = ½x(x − 1)]
   b) u″ + u = 2x, u(0) = u(2) = 0 [exact sln. u(x) = 2x − (4/sin 2) sin x]
   Which is the most accurate at the mid-point?


6. Use the Galerkin method (weak form) to solve the following ODE with non-constant
   coefficients:
   2xu″ + x − 1 = 0, u(1) = u(2) = 0 [exact sln. u(x) = −½ + (1 − x) ln 2 + ¾x − ¼x² + ½x ln x]
   Use a quadratic trial function. Would the coefficient matrix resulting from a higher
   order trial function be symmetric?
7. Consider again the problem (1.45) with non-constant coefficients leading to the
unsymmetric matrix (1.52). Is it possible to rewrite the equation (1.45) so as to
obtain a symmetric coefficient matrix? Re-solve the problem using this equation,
with a quadratic trial function.
8. Derive the weight functions in (1.54) for the general one dimensional second-order
   problem involving the trial function ũ = Σᵢ₌₀ⁿ aᵢxⁱ and the homogeneous essential
   BC's, u(0) = 0, u(l) = 0.
9. Use the Galerkin method (weak form) to solve the following ODEs with two non-homogeneous BC's:
   a) u″ = x, u(1) = 1, u(2) = 4 [exact sln. u(x) = (1/6)x³ + (11/6)x − 1]
   b) u″ + u′ = 0, u(1) = 1, u(3) = 2 [exact sln. u(x) = (1/(e² − 1))(2e² − 1 − e^(3−x))]
   Are the boundary terms zero?
10. In the problem (1.56), the exact solution (1.59) was obtained using a quadratic trial
function. What happens if one uses a cubic trial function?
11. In the solution of the ODE (1.60), the quadratic trial function which satisfies only the
    essential BC, ũ = a₁x + a₂x² + 1, was used. Re-solve the problem using a trial
    function which explicitly satisfies both the essential and the natural BC's. Again, use
    a quadratic trial function (which will yield the exact solution).
12. Use the Galerkin method (weak form) to solve the following ODE:
    u″ + x = 0, u(1) = 1, du/dx|ₓ₌₂ = 1 [exact sln. u(x) = −11/6 + 3x − (1/6)x³]
13. Using the weak statement (1.67) and a quartic trial function, obtain the 1-term
solution for the 4th-order ODE of (1.65). (You should get the exact solution.)

The Finite Element Method 28 Kelly


Chapter 1

14. Derive the weak statement (1.71) from the non-linear differential equation (1.70).
Hence derive the non-linear system of equations (1.72).
15. Solve the following non-linear equation using Galerkin's method (weak form), first
    with one unknown coefficient, then with two unknown coefficients:
    2uu″ + x = 0, u(0) = 0, u(1) = 0 [no exact sln.]
16. Re-solve the non-linear equation (1.70) by using a trial function which explicitly
includes the natural BC.
17. Use the Rayleigh-Ritz Method to find an approximation to the function u(x),
    satisfying the essential boundary conditions u(0) = u(1) = 0, which renders stationary
    the functional

    I(u) = ∫₀¹ [½(du/dx)² + ½u² − u] dx   [exact sln. u(x) = 1 − (e^x + e^(1−x))/(1 + e)]

    Use a quadratic trial function.
