Vector Space
VectorSpace C++ Library
acy. We explain the reasons for using the VectorSpace C++ Library from the perspective of programming languages for numerical computation in the following.
Proprietary Mathematical Oriented Symbolic Languages
On the other hand, many symbolic languages arrived not long after the advent of Fortran more than three decades ago. Symbolic languages, many of them specialized in mathematics, are well suited for expressing mathematics. However, symbolic languages failed miserably. After all these years, Fortran is still the number one language for numerical computation. The symbolic languages, by the nature of their construction, are just too slow for large-scale numerical computation. Moreover, a practical user of symbolic languages faces a dilemma: if he does not know much mathematics, he will not use a symbolic language to do symbolic computation; if he knows mathematics well, he would most likely rather derive solutions by hand than be distracted by the nitty-gritty of the symbolic language.
C
The C language grew with the dominance of the UNIX operating system in the computer industry. By the 1980s, C had become the industrial flagship language, and a professional programmer had to be very fluent in it. C is not very different from Fortran. Both are statically typed, compiled languages, and in principle they should run equally fast. Fortran programmers, however, were never given enough incentive to migrate from Fortran to C. The major reason is that many legacy numerical packages and libraries were already written in Fortran. Why go through the hassle of translating them all into C? Oddly enough, many C enthusiasts did rewrite these legacy numerical packages in C in the last decade. However, even in the heyday of C, scientists and engineers in the most prestigious universities and institutes did their numerical analysis in Fortran. The reason is simple: C does not provide improvements dramatic enough to convince the numerical computation community to change. Despite all that, C was the most popular language in the 1980s because it was the industrial flagship general-purpose language, used by the majority, namely professional software engineers.
C++
At the turn of the 1980s to the 1990s, C++ grew out of the nutshell of C as the new industrial flagship general-purpose language. We have yet to see whether C++ meets the criterion of offering dramatic improvements over Fortran, enough to convince the numerical computation community to move from Fortran to C++. Be conservative and stay with Fortran! If C++ does not meet that criterion, it deserves to be ignored by the community, as C was. Experience has shown that adopting a premature standard may only lead to a waste of time and energy.
VectorSpace C++ Library vs. Symbolic Languages
The gap between the mathematical expression and the computer code is narrowed by the use of VectorSpace C++ Library objects, while the underlying modeling philosophy of VectorSpace C++ objects, unlike that of symbolic languages, stays close to the computational algorithms used in Fortran and C. This offers a tremendous advantage over symbolic languages: the library can serve as a rapid-prototyping tool for numerical programming. If further optimization is necessary, the rapid prototype can be reverse-engineered into plain C, step by step, under a consistent modeling philosophy. The optimization down to plain C is done seamlessly within one language, C++, which is powerful and reliable, and whose compilers are inexpensive and well supported by the major vendors in the software industry.
Other C++ Libraries
Most of the other C++ mathematical libraries in the current market are provided as a “wrapped package” of conventional-style Fortran/C libraries, which are based on the procedural programming method. In contrast, the object-oriented design of the VectorSpace C++ Library is a complete overhaul based on a theoretically sound conceptual framework of integrable-differentiable mathematical objects. The capability of differentiation and integration is essential to many numerical computations. Many advanced numerical subjects, such as (1) computational linear algebra (matrix computation), (2) unconstrained and constrained optimization, (3) variational methods, and (4) the finite element method, are all shown in this application workbook to be dramatically easier to implement with the VectorSpace C++ Library than with Fortran or C.
Rapid Prototyping using VectorSpace C++ Library
The C++ program written with VectorSpace C++ Library can be used to produce an end-product of your program development. For most of the numerical subjects demonstrated in this application workbook, even if you are using a laptop PC, you do not need further optimization at all to make a practical computation. These example programs run reasonably fast, mostly in a matter of a few seconds. However, speed can be an important factor in numerical computations. For the problem you are facing, you might want to increase the number of variables (denoted as “n”) tens to hundreds of times. The computation time and memory space might grow on the order of O(n^2) to O(n^3), respectively.1 This means a possible increase of more than a million times in computing time and memory space for a large-scale problem (e.g., with n > 10^4). Similarly, for iterative matrix solution methods, transient problems and nonlinear problems, the same program segment may need to be repeated tens or hundreds of times. A program for such problems, which originally ran within an acceptable few seconds, could easily run for a few hours or days and become increasingly unacceptable. When that is the case, the VectorSpace C++ Library can be used to generate a rapid prototype (or intermediate code) for further optimization, or reverse-engineering. Experience shows that, for a non-trivial problem, program development with VectorSpace C++ Library prototyping takes much less time than coding in plain C directly. The prototype not only helps you grasp the mathematical idea quickly, but also provides intermediate results to help you debug the final optimized code. At the end of program development, the intermediate code can be commented out of the final code or enclosed in conditional compilation segments. These disabled segments in the source files may serve to improve the readability and maintainability of the final code. Therefore, after you have reverse-engineered the prototype program written with VectorSpace C++ objects, your final code can be completely free of VectorSpace C++ Library objects. The final product can be as pure as crystal, containing only ANSI/ISO C and C++.
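The idea of keeping the prototype beside the optimized code in a conditional compilation segment can be sketched as follows. This is only an illustration under stated assumptions: a standard-library expression stands in for the VectorSpace C++ Library prototype, and the macro name USE_PROTOTYPE and the function name norm2 are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical switch: compile with -DUSE_PROTOTYPE to get the readable version.
double norm2(const std::vector<double>& v) {
#ifdef USE_PROTOTYPE
    // Prototype: one high-level expression, kept for readability and debugging.
    return std::sqrt(std::inner_product(v.begin(), v.end(), v.begin(), 0.0));
#else
    // Optimized: plain C-style loop over the raw array.
    const double* p = v.data();
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += p[i] * p[i];
    return std::sqrt(sum);
#endif
}
```

Both branches compute the same 2-norm, so the prototype branch can be re-enabled at any time to check intermediate results of the optimized branch.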
The computer world is moving fast and everything is quite ephemeral. According to Moore’s Law, named after Intel co-founder Gordon Moore, computer chips double in speed roughly every eighteen months. This basically defines the time scale for the evolution of the computer industry. We doubt that VectorSpace C++ Library may have a life span as long as that of
Part. I. Numerical Methods
C^n spaces, in the usual mathematical definition, are continuous (linear) vector spaces with derivatives up to order n. This chapter deals with objects in C^0 space.
1. The include file enables the C++ compiler to understand the extended definitions in the VectorSpace C++ Library. You also need to link against the VectorSpace C++ Library, “vs.lib” under directory “vs\lib”, so that the linker can resolve the external references of your program to the library and construct an executable file.
#include "include\vs.h"
int main() {
   C0 a(0.0);                               // dedicated constructor; value = 0.0
   cout << a << endl;
   C0 b = SCALAR("const double&", 1.0),     // virtual constructors; the string serves as a
      c = C_0("const double&", 2.0);        // mnemonic for the parameter that is supplied to it
   cout << b << endl
        << c << endl;
   try {
      C0 d = SCALAR("wrong string", 0.0);
   } catch(const xmsg& e) {                 // exception handling
      cout << "Exception: " << e.what() << " at " << e.where() << " line " << e.line() << endl;
   }
}
1.1.1 Scalar
The concept of data abstraction in C++ organizes data and the operations on the data in a coherent unit, the class. The class of a Scalar defines the simplest data abstraction in VectorSpace C++ Library. A Scalar is a class with a number s ∈ ℝ as its private member data, represented as a double type in C++, together with its constructors, operators and member functions (see Figure 1•1). The private member data, in this case a double, is shielded by the operators and the member functions, which provide the only access to it from the outside world. In mathematics, a group with a set G and operator ◊ is denoted as (G, ◊). A class in C++ defined with the concept of data abstraction closely resembles the concept of a group in mathematics.
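A minimal sketch of such a data abstraction is given below. It is not the library's Scalar class; the class layout and the member name value() are assumptions made only to show a private double that is reachable solely through operators and member functions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a Scalar: the double is private member data.
class Scalar {
    double s_;   // shielded from the outside world
public:
    explicit Scalar(double s = 0.0) : s_(s) {}
    Scalar& operator=(double s) { s_ = s; return *this; }
    Scalar operator+(const Scalar& rhs) const { return Scalar(s_ + rhs.s_); }
    Scalar operator/(const Scalar& rhs) const {
        assert(rhs.s_ != 0.0);               // precondition: no division by zero
        return Scalar(s_ / rhs.s_);
    }
    Scalar sin() const { return Scalar(std::sin(s_)); }
    double value() const { return s_; }      // read-only access
};
```

Writing `a.s_` outside the class does not compile, which is the shielding described above.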
Figure 1•1 Data abstraction organizes data and its operators and member functions in a coherent unit. (Diagram: a private double, with access from the outside world only possible through member functions and operators such as +, operator=() and sin().)
Constructors
A scalar object in a C++ program is declared as a C0 type with either a dedicated constructor or a virtual constructor, as in Program Listing 1•1. The advantage of using a dedicated constructor is that it is very concise. It only specifies a double value as the argument for the C0 constructor. The C0 constructor knows the result is a Scalar
object by identifying that there is only one argument of type double supplied to the C0 constructor. In the Pro-
Supplying arguments of different types to the dedicated constructor of C0 type may result in other kinds of objects in the C0 type family, e.g., a Vector or a Matrix. The philosophy of the dedicated constructor is that it follows the style of the C language. Its syntax saves programmers a few keystrokes, which is especially beneficial to professional programmers, who work at the computer so often that tedious acts become very annoying. However, the flexibility of the dedicated constructor is compromised. What would happen if we want to supply a pointer to a double to initialize the Scalar object? How do we instantiate from a reference to a double value, “by-value” from a Scalar object, “by-reference” from a Scalar object, or from a pointer to a Scalar object?
To provide such flexibility, a virtual constructor1 is used to tell the C0 type constructor exactly what kind of object to generate: a Scalar, a Vector, a Matrix, etc. This kind of constructor is called virtual because it mimics the function dispatching behavior of the virtual function, a salient object-oriented programming feature in C++. Through the virtual function mechanism in C++, a call on the virtual function (“foo()” in Figure 1•2) in the base class can be dispatched to the same function in the derived class. Critics of C++ often say that C++ is not an orthodox object-oriented language. The virtual constructor, which C++ itself does not support, is implemented in VectorSpace C++ Library as a step toward a more orthodox object-oriented language.
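The virtual constructor idea can be sketched with a string-dispatching factory. The class names mirror the text, but everything here is hypothetical, including the function name make and the two signature strings it accepts; the library's own mechanism is not shown in this workbook excerpt.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

struct Object {
    virtual ~Object() = default;
    virtual std::string kind() const = 0;   // dispatched to the derived class
};
struct Scalar : Object {
    double s;
    explicit Scalar(double v) : s(v) {}
    std::string kind() const override { return "Scalar"; }
};
struct Vector : Object {
    int length;
    explicit Vector(int n) : length(n) {}
    std::string kind() const override { return "Vector"; }
};

// The string, not the static type, selects which concrete object is built.
std::unique_ptr<Object> make(const std::string& signature, double arg) {
    if (signature == "const double&")
        return std::unique_ptr<Object>(new Scalar(arg));
    if (signature == "int")
        return std::unique_ptr<Object>(new Vector(static_cast<int>(arg)));
    throw std::invalid_argument("no constructor matches: " + signature);
}
```

Because the string can be chosen by program logic, the kind of object is decided at run-time, while calls such as kind() still dispatch through the ordinary virtual function mechanism.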
Figure 1•2 Object-oriented class inheritance relationship and virtual function dispatching mechanism. (Diagram: a call to foo() on the base class C0, which declares virtual foo(), is dispatched through inheritance to foo() in the derived classes Scalar, Vector, Matrix, etc.)
A concrete object, for example a Scalar, is generated at run-time by the virtual constructor of its base class C0
as (see Program Listing 1•1)
C0 b = SCALAR(“const double&”, 1.0);
1. J.O. Coplien, 1992, “Advanced C++ — Programming Styles and Idioms”, Addison-Wesley, p. 143.
string               argument                             priority
by reference
"C0&"                C0 type Scalar object                1
"C0*"                pointer to C0 type Scalar object     2
"double&"            double                               3
"double*"            double pointer                       4
by value
"const double&"      double                               5
"const double*"      double pointer                       6
"const C0&"          C0 type Scalar object                7
"const C0*"          pointer to C0 type Scalar object     8
Strings in C0 virtual constructor for Scalar object.
Some of the strings contain more than one word. In VectorSpace C++ Library, the strings for the constructors are parsed in free format, in which words are separated by one or more spaces or commas. For example, writing “^” for a space, two words separated by one space, “const^double&”, and two words separated by two spaces, “const^^double&”, are recognized as the same by the virtual constructors.
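One way to obtain this free-format behavior is to normalize every string before comparing it. The sketch below is an assumption about how such matching could work, not the library's parser; the function name normalize is hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Collapse any run of spaces or commas into a single space, so that
// equivalent free-format strings compare equal.
std::string normalize(const std::string& s) {
    std::string spaced = s;
    for (char& ch : spaced)
        if (ch == ',') ch = ' ';          // commas separate words, like spaces
    std::istringstream in(spaced);
    std::string word, out;
    while (in >> word) {                  // extraction skips repeated spaces
        if (!out.empty()) out += ' ';
        out += word;
    }
    return out;
}
```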
Instead of explicitly specifying “SCALAR” for the C0 type constructor, one can also use “C_0” in place of
the “SCALAR” to construct a Scalar object. For example (see also Program Listing 1•1),
C0 b = C_0(“const double&”, 2.0);
In such a case, the C0 type constructor searches for a matching string according to the priority ranking of the strings (see Figure 1•2; the search runs from left to right through the branches of the object-hierarchy tree). When the constructor finds the first match, it generates the corresponding object: a Scalar, a Vector, a Matrix, etc. Specifying “C_0”, instead of committing to a specific type of object when the program is written, is a late-binding technique used in the VectorSpace C++ Library. The determination of the actual kind of object to generate can be delayed until run-time. You can code logic control statements in your program to determine what kind of object to generate, then pass an appropriate string to the constructor. The object and its type are therefore both created on the fly. In VectorSpace C++ Library, this particular kind of virtual constructor is called an autonomous virtual constructor1. However, the power of flexibility comes as a frontal assault on the security of
1. see “autonomous generic exemplar idiom” in J.O. Coplien, 1992, “Advanced C++—Programming Styles and Idioms”,
Addison-Wesley, p. 291.
The try-catch statement is standard in C++ language. The xmsg object simulates the standard C++ library
exception handling1. The details and location (function name and line) of an error, if any, caused by the C0 con-
structor enclosed in try clause will be reported to the standard output in the above code segment.
#include "include/vs.h"
int main() {
   C0 a(0.0), b,            // a = 0.0; b not initialized
      c(1.0), d;            // c = 1.0; d not initialized
   cout << a << endl;       // 0.0
   b &= a;                  // assignment by reference
   c = a;                   // assignment by value
   a = 10.0;                // reset to 10.0
   cout << b << endl        // 10.0
        << c << endl;       // 0.0
   a &= c;                  // garbage collection behind the scene
   d = a;                   // make a new Scalar object
}
1. P.J. Plauger, 1995, “The draft standard C++ library”, Prentice Hall, Inc., p.53.
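Figure 1•3 Assignment by reference (upper part) and by value (lower part). (Diagram: upper part, after “b &= a;” the labels “a” and “b” share one Scalar object, whose double changes from 0.0 to 10.0 on “a = 10.0;”; lower part, “c = a;” copies the value into the separate Scalar object labeled “c”, which initially holds 1.0.)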
In Program Listing 1•2 (see also Figure 1•3 ), variable “b” is declared as an object of type C0. The variable
“b” actually acts more like a label (or a symbol), because it has no concrete data type, e.g., a Scalar, a Vector or a
Matrix, associated with it. The label “b” is then assigned to share the same concrete data, the Scalar object with
“a” as its label, and “0.0” as its content. This is done by the assignment by reference operator as (illustrated in
the upper part of Figure 1•3)
b &= a;
We can think of this expression as an operation that attaches the label “b” to the Scalar object already labeled “a”. Therefore, changing the content of “a” to 10.0 later, by “a = 10.0;”, will also change the content of “b”, because “a” and “b” refer to the same memory location. The assignment by value (see the lower part of Figure 1•3) in Program Listing 1•2 is
c = a;
In this case, “c” already has its own Scalar object, with its content initialized to “1.0”. The assignment by value then resets that value to the value of the “a” object, “0.0”. Changing “a” later by “a = 10.0;” does not change the value of “c”, which remains “0.0”.
What then will happen if we write (see also upper part of the Figure 1•3)
a &= c
The result is that the label “a” will be peeled off from the Scalar object that it is attached to. Then some housecleaning needs to be done. The VectorSpace C++ Library checks whether any other label still refers to the detached Scalar object. If one does (in this case the label “b” still refers to it), the Scalar object survives. If none does, the Scalar object is destroyed. This is done by reference-counting1, a popular garbage collection technique for memory management in C++. The next step, after garbage collection, is that the label “a” is reassigned to the Scalar object that label “c” is pointing to. In short, “a” is stripped of its original Scalar object and re-assigned to point to the Scalar object that “c” points to.
1. J.O. Coplien, 1992, “Advanced C++—Programming Styles and Idioms”, Addison-Wesley, p. 58.
Figure 1•4 Assignment by reference (upper part) and by value (lower part). (Diagram: label “b” remains on the Scalar object holding 10.0; “d = a;” creates a new Scalar object for label “d”.)
A contrary scenario is the case of assigning a label “d” by value as
d = a;
In this case (see the lower part of Figure 1•3), “d” is only a label, with no concrete object associated with it. Upon the assignment by value, label “d” is instantiated with a concrete object of the same type as that of “a”, a Scalar object in this case. This newly instantiated Scalar object is then assigned the value “0.0”. In short, a new Scalar object is constructed for label “d” (since “c” pointed to a Scalar object), and its value is set to that of the Scalar object associated with “a”. That is, both the object type and the content of label “d” are determined by what label “a” refers to.
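The label semantics of Program Listing 1•2 can be imitated with a reference-counted handle. In this sketch std::shared_ptr stands in for the library's own reference counting, and the type Label with its members bind, assign, set, get and labels are hypothetical names: bind plays the role of “&=” and assign the role of “=”.

```cpp
#include <cassert>
#include <memory>

class Label {
    std::shared_ptr<double> obj_;   // reference-counted concrete object
public:
    Label() = default;              // a label only, no object yet
    explicit Label(double v) : obj_(std::make_shared<double>(v)) {}

    // Assignment by reference: share the other label's object. The object
    // previously attached is freed once its last label lets go of it.
    void bind(const Label& other) { obj_ = other.obj_; }

    // Assignment by value: copy the content, creating an object first
    // if this label has none.
    void assign(const Label& other) {
        assert(other.obj_ != nullptr);   // source must be attached to an object
        if (obj_) *obj_ = *other.obj_;
        else obj_ = std::make_shared<double>(*other.obj_);
    }
    void set(double v) {
        if (obj_) *obj_ = v;
        else obj_ = std::make_shared<double>(v);
    }
    double get() const { assert(obj_ != nullptr); return *obj_; }
    long labels() const { return obj_.use_count(); }   // labels sharing the object
};
```

Replaying the listing with these calls gives the same values as the annotations: after `b.bind(a)` and `a.set(10.0)`, `b.get()` is 10.0, and after `a.bind(c)` the old object survives because “b” still labels it.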
Two symbolic operators “&” and “&&” are column-wise-concatenation operators. Their return values are
Vector objects. The operator “|” and “||” are row-wise-concatenation operators. They return row-vectors which
are represented as Matrix of row-length = 1 in VectorSpace C++ Library. We defer discussion of these operators
until Vector and Matrix are introduced.
The use of member arithmetic operators, logic operators and functions for the Scalar object is straightforward. They are defined to be consistent with C++ and need little explanation (see box in page 8). The actual set of operators and functions is many times greater than the partial listing here. Many operators and functions not listed here arise from promotion among the different operand types of a binary operator. For example (in project: “scalar_examples”),
C0 a(1.0);
C0 b = a + 2.0; // “a” a Scalar object of type C0 plus a constant double “2.0”
C0 c = 2.0 + a; // “2.0” a constant double plus a Scalar object of type C0
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
C0 operator & ( ) const column concatenation
C0 operator && () const one-by-one column concatenation
C0 operator | ( ) const row concatenation
C0 operator || () const one-by-one row concatenation
arithmetic operators
C0 operator + ( ) const positive unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication
C0 operator / (const C0&) const division
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication
C0& operator /= (const C0&) replacement division
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
C0 pow(int) const power
C0 sqrt(const C0&) const square root
C0 exp(const C0&) const exponent
C0 log(const C0&) const log
C0 sin(const C0&) const sin
C0 cos(const C0&) const cos
Partial listing of scalar object arithmetic operators, logic operators and functions.
The return value of “c” is a Vector object of C0 type with its value as {0.0, 1.0}T, column-vector is denoted with
“transpose” superscript. Using “&&” makes no difference in the case with two Scalar objects as operands. How-
ever, if Vector is used as either of the two operands of the binary operators “&” and “&&”, we will see different
results from these two operators (see Figure 1•6 in page 14).
Constructors
Now let’s get on with “Vector” in VectorSpace C++ Library. The dedicated constructor for the Vector can be
written as (see Program Listing 1•2)
#include "include\vs.h"
int main() {
   double a[3] = {0.0, 1.0, 2.0};
   C0 b(3, a);                // array name "a" is treated as double*
   cout << b << endl;         // {0.0, 1.0, 2.0}T by reference
   C0 c(3, (double*)0);       // null pointer "0" is cast as double*
   cout << c << endl;         // {0.0, 0.0, 0.0}T, with default value and its own memory
   return 0;
}
1. Chapters 4 and 9 in P. V. Linden, 1994, “Expert C programming: deep C secrets”, Prentice-Hall Inc.
In this case, a Vector “c” will have its own copy of memory space and its values are all initialized to a default
value—“0.0”. This is the most concise way of initializing a Vector object. In VectorSpace C++ we call this spe-
cific style as default dedicated constructor. The default value can be easily reset to other values with a statement
as c = 2.0. In this case all three elements of the Vector “c” will be set to have the value of “2.0”. This says that
the assignment by value operator takes a double value as its argument. Such implicit type conversion happens, in
VectorSpace C++ Library, only when it makes unambiguous intuitive sense.
The dedicated constructor can be used to make a new kind of object that was not introduced in Section 1.1.1 on Scalar. This new kind of object is a subvector, which refers to an existing Vector object. The subvector can start from and end at any index within the index range of the referenced Vector object “a”, provided that Vector “a” is contiguous in its physical memory space. For example,
double d[8] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0};
C0 a(8, d),
b(4, a, 3);
cout << (+b) << endl;
In Figure 1•5, Vector “a” is constructed as a vector of length 8 by “C0 a(8, d);” where double array “d” is
declared as “double d[8] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0};”. A reference Vector “b” is initialized to have its
length “4”, the referenced object “a”, and the index to start from “a” being “3” by writing the following state-
ment.
C0 b(4, a, 3);
We consider the reference Vector a special kind of Vector object rather than a Subvector. The word Subvector (with capital “S”) in VectorSpace C++ Library is reserved for subvectors whose referenced vector is equal-partitioned, as will be discussed in Section 1.1.5.
A reference Vector, as its name implies, always refers to another Vector object with contiguous physical memory space. It does not own the memory space of its data. A common practice in VectorSpace C++ Library is to use the unary positive operator “+” to cast a reference Vector into a new independent Vector object, such as “+b”. A temporary Vector object is generated in this case with its own copy of memory, the same length (= 4), and contents set to the same values as “b”. The use of the unary positive operator “+” to convert a specialized object, in this case a reference Vector, into a more primitive object will be encountered many times in VectorSpace C++ Library. In VectorSpace C++ Library, this operation by the operator “+” is defined as primary casting.
Figure 1•5 Referenced Vector “b”, and “primary casting” by unary operator +(). (Diagram: “C0 a(8, d);” refers to double d[8] = {0.0, ..., 7.0} at indices 0 to 7; “C0 b(4, a, 3);” refers to the values 3.0 to 6.0 at indices 0 to 3 of “b”; “+(b)” is an independent copy of those four values.)
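A reference Vector and its primary casting can be sketched as a non-owning view plus a copying unary “+”. The class RefVector below is a hypothetical stand-in built on std::vector; only the constructor argument order (length, referenced vector, starting index) follows the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning view onto a contiguous run of another vector's data.
class RefVector {
    double* first_;          // points into memory owned by the target
    std::size_t length_;
public:
    RefVector(std::size_t length, std::vector<double>& target, std::size_t start)
        : first_(target.data() + start), length_(length) {
        assert(start + length <= target.size());   // view must stay inside target
    }
    double& operator[](std::size_t i) { assert(i < length_); return first_[i]; }
    std::size_t length() const { return length_; }

    // Counterpart of primary casting "+b": an independent copy of the values.
    std::vector<double> operator+() const {
        return std::vector<double>(first_, first_ + length_);
    }
};
```

Writing through the view changes the referenced vector, while the result of `+b` keeps its own values.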
Similar to the case for Scalar, the constant strings used for the plain virtual constructors (using the macro definition “VECTOR” in place of “SCALAR”) and the autonomous virtual constructor are shown in the following box. Some of the constant strings do not have a priority number, because those strings clash with strings in the Scalar. The C0 type autonomous virtual constructor will first find a match in the Scalar object; the call will never be dispatched on to the Vector object virtual constructor. This means these strings in Vector are hidden (or masked) by the same strings in Scalar. Therefore, although these strings are useful for the plain virtual constructor, the autonomous virtual constructor cannot use them to make a Vector object of C0 type.
string                      argument                                      priority
by reference
"C0&"                       C0 type Vector                                -
"C0*"                       pointer to C0 type Vector                     -
"int, double*"              length, double* != 0                          10
"int, double*, int, int"    length, double*, m_row_size, m_col_size       11
by value
"int"                       length                                        9
"int, double*"              length, double* = 0                           10
"int, const double*"        length, double*                               12
"int, const C0*"            length, C0* of a Scalar                       13
"const C0*"                 C0*                                           -
"int, C0&, int"             length, C0, starting index                    14
                            (the only one for reference Vector)
Strings in C0 virtual constructor for Vector object.
#include "include/vs.h"
int main() {
   C0 a(1.0),                 // a = 1.0; a Scalar
      b(3, (double*)0);       // b = {0.0, 0.0, 0.0}T; a Vector
   b = a;                     // assignment by value with a Scalar
   cout << b << endl;         // {1.0, 1.0, 1.0}T
   b = 2.0;                   // assignment by value with a double
   cout << b << endl;         // {2.0, 2.0, 2.0}T
   b &= a;                    // assignment by reference with a Scalar
   cout << b << endl;         // 1.0; "b" is now a Scalar!
   return 0;
}
Assigning Vector “b” by value with either a Scalar object of C0 type or a double type in C++ is defined as
setting all of the components of “b” to have the value of the Scalar or the double. Assignment by reference for a
variable “b” of type C0, in this case a Vector, by a Scalar results in a Scalar (as in b &= a). The type of object
associated with label “b” has been changed from a Vector to a Scalar. The garbage collection mechanism
explained in page 6 will be activated to check if the memory space for the Vector object needs to be released.
The one new symbolic operator is the selector, operator [](int). For example, in Program Listing 1•2, “a” is a Vector of length 8 whose values refer to a double array “d”, and “b” is a reference Vector of length 4, with the first index (“0”) of “b” pointing to the fourth position of “a” (i.e., an offset of 3 from the first position). Therefore, the selectors a[3] and b[0] both return a Scalar with the value “3.0”, and both returned Scalar objects point to the same memory position (see Figure 1•5 in page 11).
The remaining symbolic operators for the Vector objects are the column-wise concatenation operators “&”
and “&&”, and row-wise concatenation operators “|” and “||”. The row-wise concatenations will return a Matrix
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
C0& operator [] (int) selector return a Scalar
C0 operator & ( ) const column concatenation
C0 operator && () const one-by-one column concatenation
C0 operator | ( ) const row concatenation return a Matrix
C0 operator || ( ) const one-by-one row concatenation return a Matrix
arithmetic operators
C0 operator ~ ( ) const transposed (into a row vector) return a Matrix
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication by a scalar; scalar product of two Vectors
C0 operator %(const C0&) const tensor product of two Vectors return a Matrix
C0 operator / (const C0&) const division (by a Scalar or a Matrix only) return a Vector
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int length() const length of the Vector
double norm(int = 2) const 1-norm or 2-norm
double norm(const char*) const infinite-norm takes strings “infinity”, or “maximum”
C0 pow(int) const power (applied to each element of the Vector)
C0 sqrt(const C0&) const square root (applied to each element of the Vector)
C0 exp(const C0&) const exponent (applied to each element of the Vector)
C0 log(const C0&) const log (applied to each element of the Vector)
C0 sin(const C0&) const sin (applied to each element of the Vector)
C0 cos(const C0&) const cos (applied to each element of the Vector)
Partial listing of Vector object arithmetic operators, logic operators and functions.
#include "include/vs.h"
int main() {
   double d[8] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0};
   C0 a(8, d),                // a = {0.0, 1.0, ..., 7.0}T; a Vector
      b(4, a, 3);             // b; a reference Vector
   cout << a[3] << endl;      // a[3] = 3.0
   cout << b[0] << endl;      // b[0] = a[3] = 3.0
   return 0;
}
Listing 1•5 Selector for Vector and reference Vector objects (project: “vector_examples”).
object, and we will discuss them in Section 1.1.3. The column-wise concatenation operations for two Scalar objects have been introduced in page 9 of this section. For simplicity we first focus on column-wise concatenation of two Vector objects (see Figure 1•6). The column-wise concatenation of two Vectors, by “operator &(const C0&)”, appends the second vector after the first, as in the left-hand side of Figure 1•6. The lengths of the two Vectors (say len1 and len2) do not have to be the same. The returned Vector has the length (len1 + len2). The one-by-one column-wise concatenation of two Vectors, by “operator &&(const C0&)”, requires the two Vectors to have the same length (= len), as shown in the right-hand side of Figure 1•6. The values of the two Vectors are interlaced to form a new Vector of length 2 × len. If the two Vectors do not have the same length, a C++ exception will be thrown. If it is not handled by a catch clause, the default behavior of the C++ exception handling mechanism is to terminate your program.
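The two concatenations of two Vectors can be sketched on std::vector as follows. The function names concat and interlace are hypothetical counterparts of “&” and “&&”, and std::length_error is an assumed choice of exception type; the text does not say which exception the library throws.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Vec = std::vector<double>;

// Counterpart of "&": append w after v; the lengths may differ.
Vec concat(const Vec& v, const Vec& w) {
    Vec r(v);
    r.insert(r.end(), w.begin(), w.end());
    return r;                              // length len1 + len2
}

// Counterpart of "&&": interlace v and w element by element.
Vec interlace(const Vec& v, const Vec& w) {
    if (v.size() != w.size())
        throw std::length_error("interlace: vectors differ in length");
    Vec r;
    r.reserve(2 * v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        r.push_back(v[i]);
        r.push_back(w[i]);
    }
    return r;                              // length 2 * len
}
```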
What if one of the operands of the “&” or “&&” operators is a double or a Scalar object (see left-hand side of Figure 1•7)? For the column-concatenation operator “&”, the double or Scalar is added in front of or appended after the Vector object (with length = len), according to the order of the operands. In Figure 1•7, “a” and “c” are either a double or a Scalar. “a & b” adds the value 1.0 in front of the values of “b” to form a new Vector. “b & c” appends the value “6.0” after the values of “b” as shown. Both return objects are new Vectors with length = len + 1.
Figure 1•6 Column-concatenation and one-by-one column concatenation of two Vectors.
Figure 1•7 Column-concatenation “&” and one-by-one column concatenation “&&” of one Scalar and one Vector.
For the one-by-one column-wise concatenation operator “&&” with mixed-type operands, VectorSpace C++ Library defines the returned Vector object to have length 2 × len, with the value of the double or the Scalar interlaced before or after each value of the original Vector object. In the right-hand side of Figure 1•7, “d” and “f” are either a double or a Scalar. “d && e” interlaces the value of “d” in front of the values of the Vector “e”, and “e && f” interlaces the value of “f” after them, according to the order of operands for the “&&” operator.
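The mixed-operand cases can be sketched in the same style. The overloads below are hypothetical, and the interlacing rule they implement (the scalar repeated before or after every element, giving length 2 × len) is our reading of the description above rather than a confirmed specification.

```cpp
#include <cassert>
#include <vector>

using Vec = std::vector<double>;

// "s & v": scalar in front; "v & s": scalar appended. Length len + 1.
Vec concat(double s, const Vec& v) {
    Vec r(1, s);
    r.insert(r.end(), v.begin(), v.end());
    return r;
}
Vec concat(const Vec& v, double s) {
    Vec r(v);
    r.push_back(s);
    return r;
}

// "s && v": s before every element; "v && s": s after every element.
Vec interlace(double s, const Vec& v) {
    Vec r;
    for (double x : v) { r.push_back(s); r.push_back(x); }
    return r;
}
Vec interlace(const Vec& v, double s) {
    Vec r;
    for (double x : v) { r.push_back(x); r.push_back(s); }
    return r;
}
```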
The explanations on two operators, the transpose by “~” and the tensor product by “%”, will be delayed until
the next section on the subject of Matrix, because they both return a Matrix object.
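The member operator “*” of a Vector object may take a Scalar or a double as its argument, in which case the
multiplication is distributed over every component of the Vector. When the argument is another Vector, “*” is the
scalar inner product of the two Vectors. If the two Vectors do not have the same length, an exception will be
thrown. Three different notations in mathematics for the scalar inner product are
$$v_i w_i, \qquad v^T w, \qquad v \cdot w$$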
In the indicial notation the scalar inner product is indicated by the repeated index “i”. A repeated index is
defined to imply summation (the summation convention). In the matrix algebra notation, the scalar inner product
is achieved by re-orienting the column-vector v into a row-vector vT; the row-vector multiplied by the
column-vector w of the same length gives a scalar result. This is consistent with a matrix of row-length = 1
multiplied by a column-vector giving a scalar. In these two notations the scalar inner product is defined
implicitly through the multiplication operation. The operands need to be tampered with, by either adding
repeated indices or imposing a transpose, to define the scalar inner product. In tensor algebra, the two vectors
do not need to be manipulated. The “•” operator is defined as the scalar inner product per se. The operator, in this
case, is defined to have the knowledge of how each component of the two vectors should be multiplied and
summed together. To accommodate all three conventions, VectorSpace C++ Library defines the
“Vector::operator *(const C0&)” as
(~v) * w = v * w
The left-hand-side fits in matrix algebra, and the right-hand-side fits in tensor algebra. The notations in tensor
algebra give the most uncluttered expressions, in which the physical meaning of an expression is usually less
obscured by trivia.
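As a minimal sketch of what the scalar inner product computes, the summation over the repeated index can be written directly in standard C++. The function name dot is hypothetical, and std::invalid_argument merely stands in for whatever exception type the library throws.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Scalar inner product: sum over i of v[i] * w[i] (the summation convention).
double dot(const std::vector<double>& v, const std::vector<double>& w) {
    if (v.size() != w.size())
        throw std::invalid_argument("dot: the two vectors differ in length");
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += v[i] * w[i];
    return sum;
}
```

The operands are passed unchanged, which mirrors the tensor algebra reading in which the operator itself knows how the components are multiplied and summed.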
The Vector::operator / (const C0&) accepts either a Scalar or a Matrix as its argument. When the argument is
a Scalar (or, equally, a double), the operator obeys the distribution rule as in the multiplication operator; i.e., the
division by the Scalar value is applied to every component of the Vector object. For the division operator to
accept a Matrix argument, let’s first look at a set of simultaneous equations in Matrix form:
M v = w Eq. 1•2
where M is a matrix of size m × n, v is a vector of size n, and w is a vector of size m. In this case, the division of
a vector w by a Matrix M is naturally defined as the solution v of the simultaneous equations M v = w:
v = w / M Eq. 1•3
Replacement multiplication (*=) and division (/=) operators have semantic issues to be clarified. A
replacement operator means that the l-valued object (the object on the left-hand-side) is operated on and then
reassigned to itself. However, if the argument taken by “*=” is a Vector, the replacement multiplication would
be a scalar inner product of the two Vectors and would have to return a Scalar object instead of a Vector.
This is inconsistent with the semantics of a replacement operator. You might also want “/=” to
accept a Matrix object, since it returns a Vector object. However, the original Vector object that calls the “/=”
operator is the right-hand-side vector (w) of Eq. 1•2, and the returned Vector is the vector (v) of Eq. 1•3. Although
they are both Vector objects, they are not the same vector, as the semantics of a replacement operation require.
Exactly the same situation occurs when “*=” takes a Matrix object as its argument. Consequently, in VectorSpace
C++ library, both “*=” and “/=” can only take a Scalar or a double argument. If other types are
used, an exception will be thrown.
The logic operator “Vector::operator ==(const C0&)” defines two Vector objects as equal when they
have the same length and the same value for every element.
The function Vector::length() returns the size of the Vector. In the box for Scalar on page 8, we did not
mention that a Scalar::length() function is available, since it doesn’t make much sense to ask the length of a Scalar
object. Actually it exists, and its return value is always 1. Another example is the transpose operator “~”. It is also
applicable to a Scalar object, and always returns a Scalar of the same value. In VectorSpace C++ library we call
functions of this kind backward compatible. The existence of the backward compatible functions enlarges the
function set for an object type. It is useful for a more dynamic programming method (see the example discussed
on page 41).
Functions “Vector::norm(int)” and “Vector::norm(const char*)” are defined to take either the values of 1 or 2
with the integer argument version, and to take either “infinity”, “Infinity”, “maximum” or “Maximum” with the
constant string version. In mathematics a p-norm or the Hölder norm of a vector v of length n is defined by1
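$$\|v\|_p = \left( |v_1|^p + |v_2|^p + \dots + |v_n|^p \right)^{1/p}$$ Eq. 1•4
In particular, for p = 1 and p = 2,
$$\|v\|_1 = |v_1| + |v_2| + \dots + |v_n|$$ Eq. 1•5
$$\|v\|_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}$$ Eq. 1•6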
1. B.N. Datta, 1995, “Numerical Linear Algebra, and applications”, Brooks/Cole Publishing Company, p. 25.
Free functions of the forms “C0 norm(const C0&, int = 2)” and “C0 norm(const C0&, const char*)” can be
used for retrieving the norms of a vector “v”. A 2-norm is written as
norm(v),
The omission of the second argument in “norm(v)” implies that “2” is the default value for the second argument of
the function norm( ). A 1-norm is
norm(v, 1),
The norm functions are also backward compatible with a Scalar object, which simply returns the absolute value
of the Scalar.
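For readers who want to check the definitions numerically, the p-norm of Eq. 1•4 and the maximum norm can be sketched in standard C++ as follows. The names norm_p and norm_inf are hypothetical and only mirror the int and string versions of norm( ); the maximum norm uses the usual definition, the largest absolute component.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hoelder p-norm of Eq. 1•4; p = 2 is the default, as in norm(v).
double norm_p(const std::vector<double>& v, int p = 2) {
    double sum = 0.0;
    for (double x : v)
        sum += std::pow(std::fabs(x), p);
    return std::pow(sum, 1.0 / p);
}

// Infinity (maximum) norm: the largest absolute component.
double norm_inf(const std::vector<double>& v) {
    double largest = 0.0;
    for (double x : v)
        largest = std::max(largest, std::fabs(x));
    return largest;
}
```

For a vector of length 1 all of these reduce to the absolute value, which is the backward compatible behavior described for a Scalar.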
For the remaining transcendental functions it suffices to say that these functions obey the distribution
rule discussed on page 16; i.e., applying one of these functions to a Vector returns a new Vector whose
values are obtained by applying the function to every element of the Vector. For example, applying the
trigonometric function “sine” to a Vector “v” of length n gives a Vector in which every element is the sine of
the corresponding element of “v”.
1.1.3 Matrix
The data abstraction for a Matrix is represented as two integer numbers, “row-length” ∈ I and “column-
length” ∈ I, and a value-array mij ∈ ℝ. Since the “row-length” and “column-length” are variables, the memory
space for mij has to be managed dynamically (see Figure 1•8).
The “value-array” of mij is a contiguous array of double of size = “row-length” × “column-length”, addressed
through a “pointer to double” (“m[0]” of type double*), while the “index-array” is an array of “pointer to double”
of size = “row-length”, addressed through a “pointer to pointer to double” (“m” of type double**), to simulate
the “array of array of double” syntax m[i][j]. The index-array is related to the value-array representing mij as follows:
[Figure 1•8 Value-array m[0] (double*) holding m00, m01, …, m34 row by row for “row-length” = 4 and “column-length” = 5, and index-array m (double**) whose entries m[0], …, m[3] point to the beginning of each row.]
double** m;
m = new double* [row_length]; // index-array instantiation
m[0] = new double [row_length * column_length]; // value-array instantiation
for(int i = 1; i < row_length; i++) // set up index-array to point to the
   m[i] = m[i-1] + column_length; // beginning of each row
Notice that, according to pointer arithmetic in the C language, the semantics of “m[i]” can be explained in two
steps: (1) the index “i” in the brackets indicates the off-set from the first position of the index-array, m + i, and
(2) the brackets dereference that position, m[i] ≡ *(m + i), which yields the double* pointing to the beginning of
row i; hence m[i][j] ≡ *(m[i] + j). At the time of destruction, the value-array is released first and then the
index-array (“delete [] m[0];” followed by “delete [] m;”).
The purpose of this introduction to memory management is not to ask you to master the internal workings of the
VectorSpace C++ Library. The data abstraction has wrapped all the details of how to maintain the “value-
array” and the “index-array” in the constructors and destructors of each class. However, from the user’s perspective
we need to understand that if we pass an array by reference to an object, it will always be passed as a
double* (the value-array), and the value-array should always be thought of as a one-dimensional array, exactly as
it is organized in memory (see the double* array m[0] in Figure 1•8). This general concept remains valid for
even more complicated classes in VectorSpace C++ Library.
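The value-array and index-array layout can be tried out in isolation with the following sketch. The helper names allocate and release are hypothetical and are not part of the library; the value-array is zero-initialized here for convenience.

```cpp
#include <cstddef>

// Allocate a row_length x column_length block: one contiguous value-array
// plus an index-array whose entries point to the beginning of each row.
double** allocate(int row_length, int column_length) {
    double** m = new double*[row_length];                       // index-array
    m[0] = new double[static_cast<std::size_t>(row_length) *
                      static_cast<std::size_t>(column_length)](); // value-array
    for (int i = 1; i < row_length; ++i)
        m[i] = m[i - 1] + column_length;
    return m;
}

// Release in the reverse order: value-array first, then index-array.
void release(double** m) {
    delete[] m[0];
    delete[] m;
}
```

With this layout m[i][j] and m[0][i * column_length + j] refer to the same double, which is why the value-array can always be treated as a one-dimensional array. In new code a std::vector would normally own the storage; raw new and delete are used here only to mirror the text.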
We have defined that a “row-vector”, vT, where v is a (column-) vector, is represented as a Matrix of “row-
length” = 1, in VectorSpace C++ Library. Now we recall the two row-wise concatenation operators left undefined
on page 7. The row-wise-concatenation operator “|” is used for two Scalars as (project: “matrix_examples”)
{ {0.0, 10.0},
{1.0, 11.0},
{2.0, 12.0},
{3.0, 13.0} }
Again, using “||” instead of “|” for two Vectors makes no difference. For mixed type operands with one Scalar
and one Vector the row-wise concatenation operator “|” and the one-by-one row-wise concatenation operator “||”
are defined completely parallel to “&” and “&&” as illustrated in Figure 1•7. The only difference is the source
and the return (column) Vectors are now a vector represented as a Matrix of row-length = 1.
For operator “%” taking two Vectors, the tensor product is written in VectorSpace C++ Library as
v % w
The result of the tensor product is called a dyad, and is represented as a Matrix object of C0 type.
The two Vector objects do not have to have the same length. Three different popular notations
for the tensor product are
$$v_i w_j, \qquad v\,w^T, \qquad v \otimes w$$
VectorSpace C++ Library accommodates them as
v * (~w) = v % w
The left-hand-side is consistent with “v wT” in the matrix algebra, and the right-hand-side is consistent with
“v ⊗ w” in the tensor algebra.
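A dyad can be illustrated without the library as a nested std::vector in which element (i, j) is the product of the i-th component of v and the j-th component of w. The function name outer is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Tensor (outer) product: d[i][j] = v[i] * w[j].
// The result has v.size() rows and w.size() columns; the lengths may differ.
std::vector<std::vector<double>> outer(const std::vector<double>& v,
                                       const std::vector<double>& w) {
    std::vector<std::vector<double>> d(v.size(),
                                       std::vector<double>(w.size(), 0.0));
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = 0; j < w.size(); ++j)
            d[i][j] = v[i] * w[j];
    return d;
}
```

For v of length 2 and w of length 3 the result is a 2 × 3 array, matching the dimension of a column-vector multiplied by a row-vector.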
Constructors
The examples of dedicated constructors for the Matrix can be written as in Program Listing 1•2. The Matrix
object of C0 type is constructed by defining its “row-length” = 4 and “column-length” = 8. The double* array
m1[0] is the value array passed as a reference to the Matrix “a”. Notice the semantics of m1[0] = ((double*)
m1)+0. Just as in the dedicated constructor for Vector object, the Matrix can be constructed to have its own mem-
ory by passing an argument of “(double*)0”, a null pointer cast to a pointer of double, to the third argument of the
dedicated constructor as
C0 a(4, 8, (double*)0);
However, in doing so all the elements in the Matrix object will be initialized to have the value “0.0”.
The reference Matrix can be constructed by the dedicated constructor of Matrix class as (see Figure 1•9)
C0 b(2, 3, a, 1, 2);
where the first two arguments say that the reference Matrix “b” has “row-length” = 2 and “column-length” = 3.
The third argument is the referenced Matrix “a”, and the last two arguments are the starting indices, which are the
second row-index “1” and the third column-index “2” in “a”. Again, the unary positive operator of Matrix “+”
serves the function of primary casting. For example, “+b” constructs an independent temporary Matrix object of
C0 type. The temporary object “+b” will be instantiated with the size of “b”, and its value is initialized to that of
“b”.
#include "include/vs.h"
int main() {
   double m1[4][8] = { { 0.0,  1.0,  2.0,  3.0,  4.0,  5.0,  6.0,  7.0},
                       {10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0},
                       {20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0},
                       {30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0} };
   C0 a(4, 8, m1[0]);
   cout << a << endl;     // { { 0.0,  1.0,  2.0,  3.0,  4.0,  5.0,  6.0,  7.0},
                          //   {10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0},
                          //   {20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0},
                          //   {30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0} }
   C0 b(2, 3, a, 1, 2);
   cout << b << endl;     // { {12.0, 13.0, 14.0},
                          //   {22.0, 23.0, 24.0} }
   return 0;
}
[Figure 1•9 Reference Matrix “b” and referenced matrix “a” (a: 4 x 8). a12 is the starting element.]
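The idea behind a reference Matrix can be sketched in plain C++ as a view that keeps a pointer to the starting element together with the column-length of the referenced storage. The names MatrixView, make_view and copy_values are hypothetical; copy_values only imitates the effect of the primary casting “+b” and is not the library's implementation.

```cpp
#include <vector>

// A non-owning view into a row-major value-array.
struct MatrixView {
    double* start;  // first referenced element (a12 in Figure 1•9)
    int rows;       // row-length of the view
    int cols;       // column-length of the view
    int stride;     // column-length of the referenced storage
    double& operator()(int i, int j) const { return start[i * stride + j]; }
};

// Build a rows x cols view starting at (row0, col0) of a matrix whose
// value-array has mem_cols columns.
MatrixView make_view(double* values, int mem_cols,
                     int rows, int cols, int row0, int col0) {
    return MatrixView{values + row0 * mem_cols + col0, rows, cols, mem_cols};
}

// Independent copy of the viewed values, analogous to "+b".
std::vector<double> copy_values(const MatrixView& v) {
    std::vector<double> out;
    out.reserve(static_cast<std::size_t>(v.rows) * v.cols);
    for (int i = 0; i < v.rows; ++i)
        for (int j = 0; j < v.cols; ++j)
            out.push_back(v(i, j));
    return out;
}
```

Writing through the view changes the referenced storage, while the copy is unaffected by later changes; that is the distinction between a reference Matrix and one that owns its memory.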
The constant strings for Matrix virtual constructors (use macro definition “MATRIX”) and autonomous
virtual constructors are shown in the following box.
by reference
  “C0&”                            C0 type Matrix                                    -
  “C0*”                            a pointer to C0 type Matrix                       -
  “int, int, double*”              row-length, column-length, double* != 0           16
  “int, int, double*, int, int”    row-length, column-length, double* != 0,
                                   memory-row-length, memory-column-length           17
by value
  “int, int”                       row-length, column-length                         15
  “int, int, double*”              row-length, column-length, double* = 0            16
  “int, int, const double*”        row-length, column-length, double*                18
  “int, const C0*”                 length, C0* of a Vector                           19
  “const C0*”                      C0* type of a Matrix*                             -
  “int, int, C0&, int, int”        row-length, column-length, C0&,
                                   starting row-index, starting column-index         20
                                   (the only one for reference Matrix)
Strings in C0 virtual constructor for Matrix object.
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
C0& operator [ ] (int) row selector return a Vector
C0& operator( )(int) column selector return a Vector
C0& operator ( )(int, int) element selector return a Scalar
C0 operator & ( ) const column concatenation
C0 operator && () const one-by-one column concatenation
C0 operator | ( ) const row concatenation
C0 operator || ( ) const one-by-one row concatenation
arithmetic operators
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication
C0 operator / (const C0&) const division (by a Scalar or a Matrix only)
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int row_length() const row-length of the Matrix
int col_length() const column-length of the Matrix
double norm(int) const 1 (maximum column-sum)-norm or 2 (spectral)-norm
double norm(const char*) const “infinity” (max row-sum), “Frobenius” (Frobenius-norm)
C0 pow(int) const power (applied to each element of the Matrix)
C0 sqrt(const C0&) const square root (applied to each element of the Matrix)
C0 exp(const C0&) const exponent (applied to each element of the Matrix)
C0 log(const C0&) const log (applied to each element of the Matrix)
C0 sin(const C0&) const sin (applied to each element of the Matrix)
C0 cos(const C0&) const cos (applied to each element of the Matrix)
Partial listing of Matrix object arithmetic operators, logic operators and functions.
For the two assignment operators, which are applied with two Matrices, the results are intuitively straight-
forward and the behavior is similar to that for two Scalars or two Vectors. Again, complexity increases when dif-
ferent types are taken as their arguments (see Program Listing 1•2). If “Matrix::operator = (const C0&)” takes a
Scalar, it is defined as to have all the elements in the Matrix set to the value of the Scalar. For example,
C0 a(2, 3, (double*)0),
b(3.0);
a = b;
#include "include/vs.h"
int main() {
   double d[2] = {0.0, 1.0};
   C0 a(2, 3, (double*)0),
      b(3.0),                 // b = 3.0
      c(2, d);                // c = {0.0, 1.0}T
   a = b;
   cout << a << endl;         // a = { { 3.0, 3.0, 3.0 },
                              //       { 3.0, 3.0, 3.0 } }
   a = c;
   cout << a << endl;         // a = { { 0.0, 0.0, 0.0 },
                              //       { 1.0, 1.0, 1.0 } }
   a &= b;
   cout << a << endl;         // a = 3.0; a Scalar
   a &= c;
   cout << a << endl;         // a = {0.0, 1.0}T; a Vector
   return 0;
}
Listing 1•7 Assignment operators with different types of operand (project: “matrix_examples”).
The Vectors returned by the row and column selectors can in turn be applied with “Vector::operator [](int)”, the
selector of the Vector, to access a single element. However, for single element access, we can by-pass the row and
column selectors and use the element selector, “Matrix::operator ( )(int, int)”, directly with two int arguments for
higher efficiency. The element selector has semantics close to those of the Fortran language, so it is also called the
Fortran-style selector.
The results of the row-wise-concatenation operators “&” and “&&”, when two Matrix objects are taken as
arguments, are intuitively comprehensible (see Figure 1•11). Their definitions are consistent with those for
two Scalars or two Vectors. Note that, for the regular row-wise-concatenation operator, the “column-
lengths” of the two concatenated Matrices must be the same. For the one-by-one row-wise-concatenation
operator, both “row-length” and “column-length” must be the same. We also define the row-wise-concatenation of a
Matrix with a Scalar or a Vector consistently with the previous ones (see Figure 1•12). Row-wise-concatenation
with a Scalar concatenates every column of the Matrix row-wise with the Scalar. One-by-one
row-wise-concatenation concatenates every element of the Matrix row-wise with the Scalar. Row-wise-concatenation
of the Matrix with a Vector concatenates every column of the Matrix row-wise with the Vector,
and one-by-one row-wise-concatenation with a Vector concatenates every element of the Matrix row-wise
with the Vector. In the last case the “row-length” of the Matrix and the “length” of the Vector should be
the same.
[Figure 1•11 Row concatenation and one-by-one row concatenation of two Matrices.]
[Figure 1•12 Row-wise concatenation and one-by-one row concatenation with Scalar or Vector.]
The behaviors of the column-wise-concatenation operators “|” and “||” for two Matrices are also
straightforward (see Figure 1•13). For the column-wise-concatenation operator the “row-lengths” must be the same.
For the one-by-one column-wise-concatenation operator both the “row-length” and “column-length” of the two
Matrices must be the same. Column-wise-concatenation of a Matrix with a Scalar or a Vector can be defined
accordingly (see Figure 1•14). The column-wise-concatenation with a Scalar concatenates every row of
the Matrix column-wise with the Scalar. The one-by-one column-wise-concatenation concatenates every
element of the Matrix column-wise with the Scalar. The column-wise-concatenation
and one-by-one column-wise-concatenation with a Vector are also intuitive, but the “length” of the Vector
and the “row-length” of the Matrix must be the same.
1 2     1 2 3     1 2 1 2 3          1 2      1 2     1 1 2 2
3 4  |  4 5 6  =  3 4 4 5 6          3 4  ||  3 4  =  3 3 4 4
5 6     7 8 9     5 6 7 8 9          5 6      5 6     5 5 6 6
Figure 1•13 Column concatenation and one-by-one column concatenation of two Matrices.
1 2            1 2 1                 1 2             1 1 2 1
3 4  |  1   =  3 4 1                 3 4  ||  1   =  3 1 4 1
5 6            5 6 1                 5 6             5 1 6 1

1 2     1      1 2 1                 1 2      1      1 1 2 1
3 4  |  2   =  3 4 2                 3 4  ||  2   =  3 2 4 2
5 6     3      5 6 3                 5 6      3      5 3 6 3
Figure 1•14 Column concatenation and one-by-one column concatenation with Scalar or Vector.
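The two-Matrix cases of Figure 1•13 can be reproduced with a short sketch in standard C++. The alias Matrix and the function names concat_columns and interlace_columns are hypothetical stand-ins for “|” and “||”, and std::invalid_argument stands in for the library's exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// "|": the columns of b are appended after the columns of a;
// both must have the same number of rows.
Matrix concat_columns(const Matrix& a, const Matrix& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("concat_columns: row counts differ");
    Matrix r(a);
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i].insert(r[i].end(), b[i].begin(), b[i].end());
    return r;
}

// "||": the columns of a and b are interlaced one by one;
// both dimensions must be the same.
Matrix interlace_columns(const Matrix& a, const Matrix& b) {
    if (a.size() != b.size())
        throw std::invalid_argument("interlace_columns: row counts differ");
    Matrix r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].size() != b[i].size())
            throw std::invalid_argument("interlace_columns: column counts differ");
        for (std::size_t j = 0; j < a[i].size(); ++j) {
            r[i].push_back(a[i][j]);
            r[i].push_back(b[i][j]);
        }
    }
    return r;
}
```

Applied to the 3 × 2 and 3 × 3 matrices of Figure 1•13, the first function gives a 3 × 5 result and the second, applied to the 3 × 2 matrix with itself, gives a 3 × 4 result.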
The positive unary operator for the Matrix can be used to perform the primary casting of a reference Matrix
to a Matrix that owns its memory space. The addition and subtraction operators, “Matrix::operator +(const
C0&)” and “Matrix::operator -(const C0&)”, need some explanation. If they take another Matrix as their
argument, this Matrix must have the same row-length and column-length as the original Matrix. If the argument
taken is a Scalar (or equivalently a double), the addition or subtraction is defined, with the distribution rule, to
add the Scalar to, or subtract it from, every element of the Matrix. If the argument is a Vector, the distribution
rule defines that the Vector is added to, or subtracted from, every column of the Matrix. In this case, the “row-length”
of the Matrix and the “length” of the Vector should be the same; otherwise, an exception will be thrown.
“Matrix::operator *(const C0&)” can take a Scalar, a Vector or another Matrix. For a Scalar (or a double)
argument the multiplication obeys the distribution rule. As defined earlier, this means the Scalar value will be
multiplied with every element of the Matrix. For a Vector argument, according to the usual mathematical definition,
the Matrix and its argument Vector should have compatible lengths as
M * v = w
where dim M = m × n, dim v = n ( × 1), and dim w = m ( × 1) as required in matrix algebra. In Eq. 1•1 of
page 16, the scalar inner product of two vectors (v and w of the same length n) is written with VectorSpace C++
library as
(~v) * w = v * w
On the left-hand-side, the transpose of a Vector v is a Matrix of row-length = 1. The dimensions in
this expression are dim vT = 1 × n and dim w = n × 1. Therefore it is consistent with the definition of a
Matrix multiplied by a Vector. For the “*” operator to take another Matrix object, the length compatibility
requirement is
M * N = L
where dim M = m × l, dim N = l × n, and therefore dim L = m × n. In Eq. 1•8 of page 21, the dyad obtained
from the tensor product of two Vectors is written as
v * (~w) = v % w
where the transpose of the Vector w has dim (~w) = 1 × m and the Vector v has dim v = n × 1. The tensor product
definition is therefore consistent with multiplication in matrix algebra, where the dyad has dim (v*(~w)) = n × m.
The division operator “/” takes either a Scalar or a Matrix. For the case of a Matrix it is defined similarly to the
process of finding the solution of simultaneous equations with multiple right-hand-side vectors. Therefore the
row-length and column-length of both Matrices must be the same. The replacement division, “/=”, and
replacement multiplication, “*=”, again only take a Scalar object in order to be consistent with the semantics of the
replacement operators.
The logic operator “==” is defined to require the same “row-length”, “column-length” and values for every element.
Four norms of a Matrix are defined in VectorSpace C++ library. The maximum column sum matrix norm is
denoted with subscript “1” as1
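$$\|A\|_1 = \max_{0 \le j < n} \sum_{i=0}^{m-1} |a_{ij}|$$ Eq. 1•9
The maximum row sum matrix norm is denoted with subscript “∞” as
$$\|A\|_\infty = \max_{0 \le i < m} \sum_{j=0}^{n-1} |a_{ij}|$$ Eq. 1•10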
1. B.N. Datta, 1995, “Numerical Linear Algebra, and applications” Brooks/Cole Publishing Company, p. 26-27.
The Frobenius norm is the one that is most consistent with the 2-norm (Euclidean norm) of a Vector. It is denoted
with a subscript “F” as
$$\|A\|_F = \sqrt{\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} |a_{ij}|^2}$$ Eq. 1•12
The maximum column norm subscript “1” and spectral norm subscript “2” for a Matrix “m” are called by func-
tions “norm(m, 1)” and “norm(m, 2)”, respectively. The maximum row sum norm subscript “ ∞ ” and Frobenius
norm subscript “F” are called by functions “norm(m, “infinity”)” (= “norm(m, “Infinity”)”), and “norm(m,
“frobenius”)” (= “norm(m, “Frobenius”)”), respectively.
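Three of the four Matrix norms are simple enough to verify by hand with a few lines of standard C++. The alias Matrix and the function names below are hypothetical. The spectral norm is left out of this sketch because it requires singular values rather than plain sums.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// 1-norm: maximum absolute column sum.
double norm_1(const Matrix& a) {
    const std::size_t n = a.empty() ? 0 : a[0].size();
    double best = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        double sum = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i)
            sum += std::fabs(a[i][j]);
        best = std::max(best, sum);
    }
    return best;
}

// Infinity-norm: maximum absolute row sum.
double norm_inf(const Matrix& a) {
    double best = 0.0;
    for (const auto& row : a) {
        double sum = 0.0;
        for (double x : row)
            sum += std::fabs(x);
        best = std::max(best, sum);
    }
    return best;
}

// Frobenius norm: square root of the sum of squares of all elements.
double norm_frobenius(const Matrix& a) {
    double sum = 0.0;
    for (const auto& row : a)
        for (double x : row)
            sum += x * x;
    return std::sqrt(sum);
}
```

For the 2 × 2 matrix with rows {1, -2} and {3, 4}, the column sums are 4 and 6 and the row sums are 3 and 7, so the 1-norm is 6 and the infinity-norm is 7.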
The transcendental functions for the Matrix obey the distribution rule similar to those for the Vector. For
example, for a Matrix “m” of row-length “m” and column-length “n”,
The function, “sqrt()”, has been applied to every element of the Matrix “m”.
LU Decomposition
For the solutions of simultaneous equations A x = b, a square matrix, A, can be decomposed first into an
upper triangular matrix (U) and a lower triangular matrix (L), by the LU decomposition, as
A x = (L U) x = b Eq. 1•14
We define
U x = y Eq. 1•15
so that
L y = b Eq. 1•16
Solving triangular systems in the form of Eq. 1•15 and Eq. 1•16 is known to be particularly easy; the
solution steps begin from either the first or the last equation, which has only one unknown. Then the equation next
to the one just solved has only one new unknown to be solved for. This step is repeated until all the
unknowns are solved. Therefore the solution of the original system can be performed in three steps: (1) perform
the LU decomposition; (2) solve the triangular system for y in Eq. 1•16 (this step is known as forward elimination
or forward substitution); (3) substitute y into Eq. 1•15 and solve for x from the triangular system (this last step is
known as back substitution).
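The three steps can be sketched in standard C++ as below. This is a textbook LU decomposition with partial (row) pivoting, written only to illustrate the steps; it is not the library's code, and the names LUFactors, lu_decompose and lu_solve are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

struct LUFactors {
    Matrix lu;                      // unit-lower L below the diagonal, U on and above it
    std::vector<std::size_t> perm;  // row i of lu came from row perm[i] of A
};

// Step 1: LU decomposition with partial pivoting.
LUFactors lu_decompose(Matrix a) {
    const std::size_t n = a.size();
    std::vector<std::size_t> perm(n);
    for (std::size_t i = 0; i < n; ++i) perm[i] = i;
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;  // row with the largest pivot candidate
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i][k]) > std::fabs(a[p][k])) p = i;
        if (a[p][k] == 0.0) throw std::runtime_error("lu_decompose: singular matrix");
        std::swap(a[k], a[p]);
        std::swap(perm[k], perm[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            a[i][k] /= a[k][k];  // multiplier, stored in place as L
            for (std::size_t j = k + 1; j < n; ++j)
                a[i][j] -= a[i][k] * a[k][j];
        }
    }
    return LUFactors{std::move(a), std::move(perm)};
}

// Steps 2 and 3: forward substitution for y, then back substitution for x.
Vector lu_solve(const LUFactors& f, const Vector& b) {
    const std::size_t n = f.lu.size();
    Vector y(n), x(n);
    for (std::size_t i = 0; i < n; ++i) {  // L y = P b
        y[i] = b[f.perm[i]];
        for (std::size_t j = 0; j < i; ++j) y[i] -= f.lu[i][j] * y[j];
    }
    for (std::size_t i = n; i-- > 0;) {    // U x = y
        x[i] = y[i];
        for (std::size_t j = i + 1; j < n; ++j) x[i] -= f.lu[i][j] * x[j];
        x[i] /= f.lu[i][i];
    }
    return x;
}
```

Keeping the factors in a separate object means the decomposition is done once and can be reused for several right-hand-side vectors, which is the same motivation as forming an LU object explicitly.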
In VectorSpace C++ Library, the LU decomposition can be called by using matrix decomposition operator
“Matrix::operator !()”. And, the forward elimination and back substitution are performed by multiplying the
decomposed matrix with the right-hand-side vector with the “LU::operator *(const C0&)”. For example,
You can explicitly form an LU decomposed matrix object by calling the LU constructor as
LU a(A); // LU decomposition
C0 x = a * b; // “*” performs forward and back substitutions
or in short just
C0 x = LU(A)*b;
The LU decomposition is the default matrix solver in VectorSpace C++ Library. We can achieve a neater expres-
sion by considering the mathematical expressions as in
$$A x = b \;\Rightarrow\; x = A^{-1} b = b / A$$
C0 x = b / A;
The “Vector::operator /(const C0&)”, taking a Matrix object of C0 type as its argument, is invoked with the def-
inition as in Eq. 1•3 of page 17. Or equivalently,
C0 x = A.inverse() * b; // x = A^(-1) b
or even
The determinant is computed by the member function call “det()”, the rank of the matrix by “rank()”, and the
condition number by “cond()”.
LU-decomposition without pivoting is unstable. Some elements in the
reduced matrices can grow so large that the information in the original data is corrupted. The default
pivoting method for the LU-decomposition is partial-pivoting (or row-pivoting). The algorithm can be made even
more stable if complete-pivoting is used. The default behavior can be overridden as
Matrix::Pivoting_Method = Matrix::Complete_Pivoting;
C0 x = b / A;
Matrix::Pivoting_Method = Matrix::Partial_Pivoting; // reset to default pivoting
The change made to the pivoting method in the above will affect not only the division operator “/”, but also all
invoking methods of the matrix solver used in the above examples.
Cholesky Decomposition
The idiosyncrasy of matrix computation is that there are different methods for specific kinds of matrices. For
a (square) symmetric matrix, the Cholesky decomposition is often used. The Cholesky decomposition can be
written as
$$A = L D L^T$$ Eq. 1•17
where D is a diagonal matrix. The Cholesky decomposition is two times faster than the LU decomposition.
However, the plain Cholesky decomposition works only for a positive definite symmetric matrix (∀ λi > 0,
where λi are the eigenvalues). For a positive semi-definite symmetric matrix (∀ λi ≥ 0), the Cholesky decomposition
with diagonal-pivoting is necessary to keep the algorithm stable. For an indefinite symmetric matrix, one can simply
ignore the symmetry, in which case the LU-decomposition with complete-pivoting is a must. In the context of
the modified Newton method in optimization problems (see page 129 in Chapter 2), the negative curvatures of an
indefinite symmetric matrix (the Hessian matrix; the second partial derivatives of the objective function) can be
modified to positive curvatures by using the modified Cholesky decomposition.
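A plain L D LT factorization, without pivoting and without the modification for indefinite matrices, can be sketched as follows. The names LDLT and ldlt_decompose are hypothetical, and the positive-definiteness check simply throws std::runtime_error; the library's own Cholesky class may behave differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

struct LDLT {
    Matrix l;  // unit lower triangular factor L
    Vector d;  // diagonal of D
};

// Plain decomposition A = L D L^T of a symmetric positive definite matrix.
// Only the lower triangle of a is read.
LDLT ldlt_decompose(const Matrix& a) {
    const std::size_t n = a.size();
    Matrix l(n, Vector(n, 0.0));
    Vector d(n, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
        double dj = a[j][j];
        for (std::size_t k = 0; k < j; ++k)
            dj -= l[j][k] * l[j][k] * d[k];
        if (dj <= 0.0)
            throw std::runtime_error("ldlt_decompose: matrix is not positive definite");
        d[j] = dj;
        l[j][j] = 1.0;
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i][j];
            for (std::size_t k = 0; k < j; ++k)
                s -= l[i][k] * l[j][k] * d[k];
            l[i][j] = s / dj;
        }
    }
    return LDLT{l, d};
}
```

For the symmetric matrix with rows {4, 2} and {2, 3}, this gives D = diag(4, 2) and the single sub-diagonal entry 0.5 in L. An indefinite matrix triggers the exception, which is the situation the modified Cholesky decomposition is meant to handle.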
The examples of using the Cholesky decomposition in VectorSpace C++ Library are1
1. example data from A. Jennings and J.J. McKeown, 1992, “Matrix computation”, 2nd ed., John Wiley & Sons, New York,
p.100-101.
“Cholesky a(A);” calls the Cholesky constructor explicitly. Operators “!” or “/” or function “inverse()” can also be
used after switching the matrix solver from the default LU decomposition to the Cholesky decomposition. For example,
Matrix::Decomposition_Method = Matrix::Cholesky_Decomposition;
C0 x = b/A;
Matrix::Decomposition_Method = Matrix::LU_Decomposition; // reset back to default
If you are dealing with a positive semi-definite symmetric matrix, the diagonal pivoting can be invoked by
setting
Matrix::Pivoting_Method = Matrix::Diagonal_Pivoting;
The modified Cholesky decomposition, for the symmetric indefinite Hessian in optimization, can be invoked
with a second argument indicating a small tolerance value, δ. A critical example is shown in the following1
In line 9, the second parameter δ = 1.e-20 in the constructor “Cholesky a_bar(A, 1.e-20);” specifies the lower
bound (from zero) for the modified eigenvalues. The original matrix has the eigenvalues of 5.1131, -2.2019, and
0.0888, which is clearly indefinite. The Cholesky decomposition of the matrix gives
1. data from P.E. Gill, W. Murray, and M.H. Wright, 1981, “Practical optimization”, Academic Press Limited, San Diego,
pp. 109-111.
Now all the diagonal elements have been modified to positive numbers. The difference of the original and the
modified matrix has the Frobenius norm of only 6.154. The amount being increased on the diagonals by the mod-
ified Cholesky decomposition can be obtained by calling “Cholesky::diagonal_increase(int i)”, the argument
specifies the off-set from the first element of the diagonal. In this case, the increased amount of the three diago-
nals are {2.771, 5.016, 2.243}.
QR Decomposition
Any matrix A can be written as
A= QR Eq. 1•20
where R is an upper triangular matrix and Q is an orthogonal matrix. An orthogonal matrix Q satisfies the
necessary and sufficient condition $Q^T Q = I$, which implies det Q = ±1. Eq. 1•20 is called the QR decomposition.
For a square matrix A, the simultaneous equations A x = b can be solved by the QR decomposition as
A x = (Q R) x = b
Multiplying both sides by $Q^T$, we set
$$y = Q^T b$$ Eq. 1•21
and solve
$$R x = y$$ Eq. 1•22
The QR decomposition for a square matrix, if carried out by Householder transformation, is two times more
expensive than the LU decomposition. The QR decomposition is always stable. Recall that the LU decomposi-
tion is stable only with complete pivoting.
Using QR decomposition with VectorSpace C++ Library is simple. For example,
As in the case of the Cholesky decomposition, instead of explicitly calling the QR constructor, you can use oper-
ators “!” and “/” or function “inverse()” by setting the default matrix solver to the QR decomposition as
Matrix::Decomposition_Method = Matrix::QR_Decomposition;
C0 x = b / A; // implicitly call QR decomposition
The member functions “QR::Q()” and “QR::R()” of the QR class return the factors Q and R, respectively.
For a rectangular matrix A of size m × n (m ≥ n) with full rank, the QR decomposition produces
$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}, \quad R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$$ Eq. 1•23
Q is an m × m matrix and R is an m × n matrix, where the n column vectors of Q1 form an orthonormal basis of the
range space of A, and the (m-n) column vectors of Q2 form an orthonormal basis of the null space of AT. R1 is an
n × n matrix and the lower part of the R matrix is a null matrix of size (m-n) × n.
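In the overdetermined full rank least squares problem, the residual of a rectangular matrix A with right-hand-
side vector, b, and the solution, x, is written as
$$\|r\|_2^2 = r^T r = (A x - b)^T (A x - b) = \sum_{i}^{m} \Big( \sum_{j}^{n} A_{ij} x_j - b_i \Big) \Big( \sum_{k}^{n} A_{ik} x_k - b_i \Big)$$
$$= \sum_{j}^{n} \sum_{k}^{n} x_j x_k \sum_{i}^{m} A_{ij} A_{ik} \;-\; 2 \sum_{j}^{n} x_j \sum_{i}^{m} A_{ij} b_i \;+\; \sum_{i}^{m} b_i b_i$$
Here i runs over the m rows of A, and j, k run over its n columns. The least squares means to “minimize the sum of
squares (the residual norm)”. Taking derivatives with respect to x, for the three terms in the last line, gives
first term:
$$\frac{\partial}{\partial x_q} \sum_{j}^{n} \sum_{k}^{n} x_j x_k \sum_{i}^{m} A_{ij} A_{ik} = \sum_{j}^{n} \sum_{k}^{n} ( \delta_{jq} x_k + \delta_{kq} x_j ) \sum_{i}^{m} A_{ij} A_{ik} = 2 \sum_{k}^{n} x_k \sum_{i}^{m} A_{iq} A_{ik}$$
second term:
$$\frac{\partial}{\partial x_q} \Big( -2 \sum_{j}^{n} x_j \sum_{i}^{m} A_{ij} b_i \Big) = -2 \sum_{j}^{n} \delta_{jq} \sum_{i}^{m} A_{ij} b_i = -2 \sum_{i}^{m} A_{iq} b_i$$
third term:
$$\frac{\partial}{\partial x_q} \sum_{i}^{m} b_i b_i = 0$$
Adding the three terms together and setting the derivatives to zero for the purpose of minimization, we get
$$\frac{\partial \|r\|_2^2}{\partial x_q} = 2 \sum_{k}^{n} x_k \sum_{i}^{m} A_{iq} A_{ik} - 2 \sum_{i}^{m} A_{iq} b_i = 0$$ Eq. 1•26
or, in matrix form,
$$A^T A x - A^T b = 0$$ Eq. 1•27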
This equation is known as the normal equations. The solution can be obtained from Eq. 1•27 as
$$x = A^{-g} b = [A^T A]^{-1} A^T b$$
where $A^{-g} = [A^T A]^{-1} A^T$ is called the generalized inverse. On the other hand, the projection of vector b (of
size m) into the lower dimensional range space of A (of dimension n, with m ≥ n) gives the minimum Euclidean
norm of r (= A x - b, see Figure 1•15).
[Figure 1•15 Projection into range space of A gives the minimum length of r (shown for m = 3, n = 2, with b, r, A x and Range(A)).]
Since r and the range of A are perpendicular to each other, every column of A is orthogonal to r. Therefore,
AT r = 0 (orthogonal property)
AT (A x - b) = AT A x - AT b = 0
r 2 = Ax – b 2 Eq. 1•29
2 2
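An orthogonal transformation of Eq. 1•29 with QT does not change the length of the residual:
$$\|r\|_2^2 = \|Q^T (Ax - b)\|_2^2 = \|Rx - b'\|_2^2$$ Eq. 1•30
where
$$Q^T A = R = \begin{bmatrix} R_1 \\ 0 \end{bmatrix} \quad \text{and} \quad Q^T b = b' = \begin{bmatrix} b'_1 \\ b'_2 \end{bmatrix}$$ Eq. 1•31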
The submatrix R1 and subvector b’1 have sizes of n × n and n, respectively, and the null matrix and the subvector
b’2 have sizes of (m-n) × n and (m-n), respectively. Therefore Eq. 1•30 becomes
$$\|r\|_2^2 = \|R_1 x - b'_1\|_2^2 + \|b'_2\|_2^2$$ Eq. 1•32
In Eq. 1•32, the squared residual norm is minimized with respect to x if we set
$$R_1 x = b'_1$$ Eq. 1•33
Therefore, after we have done the QR decomposition A = Q R, the least squares solution can be found by first
obtaining b’1 as the first n components of QT b, then solving Eq. 1•33 for x. For example,2
1. D.G. Luenberger, 1969, “Optimization by Vector Space Methods”, John Wiley & Sons, Inc., p. 55.
2. B.N. Datta, 1995, “Numerical Linear Algebra, and applications” Brooks/Cole Publishing Company, pp. 333-4, and
pp.337-8.
The solution can be checked mentally, since the right-hand-side vector, b, is just the sum of
each row of the left-hand-side matrix, A. The QR decomposition therefore yields an exact solution in this numerical
case. On the other hand, the normal equation method for this ill-conditioned matrix shows some discrepancies
in the following code (see also project: “matrix_algebra”)
C0 gram_matrix = (~A)*(A); // AT A
Cholesky c(gram_matrix); // Cholesky decomposition
C0 x = c * ((~A)*b); // “*” is forward/back substitutions
cout.precision(12);
cout << x << endl; // {0.99999999875, 1.00000000125}T
For this example the solution has only been mildly corrupted. You may argue that the Cholesky decomposition
for normal equation is actually acceptable in practice, if not theoretically sound. However, the flop-count of
(ATA) followed by Cholesky decomposition is about two times more expensive than that of QR decomposition
alone.
QR decomposition can also be used for rank deficient problems to reveal the column rank, provided that column
pivoting is used. In VectorSpace C++ Library we override the default of no pivoting for QR decomposition by
“Matrix::Pivoting_Method = Matrix::Column_Pivoting;”
before the QR decomposition is called. Then, member function “QR::rank()” can be called to reveal the column
rank. However, the rank revealing QR decomposition is not as reliable as the singular value decomposition
(SVD), although the SVD is about one order of magnitude more expensive than the QR decomposition.
Eigenvalue Problem
Before we get to the singular value decomposition, let’s first look at the symmetric eigenvalue problem. A
symmetric (square) matrix A of size n × n has eigenvalue λ and corresponding eigenvector x if
A x = λ x
The computation of eigenvalues and eigenvectors of a symmetric matrix can be written in VectorSpace C++
Library as (in project: “matrix_algebra”)
The singular value decomposition of a matrix A of size m × n is written as
A = U Σ VT
The singular values of matrix A are the diagonal entries of the diagonal matrix Σ. Assuming “r” (r ≤ n) is the rank of
the matrix, the first “r” singular values are non-zero and form the diagonal submatrix Σ1. U = [U1, U2] and V = [V1, V2]
are subdivided so that the submatrices U1 and V1 hold the first “r” column vectors of U and V, respectively.
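$$\underset{m \times n}{A} = \underset{m \times m}{U} \; \underset{m \times n}{\Sigma} \; \underset{n \times n}{V^T} = \begin{bmatrix} \underset{m \times r}{U_1} & \underset{m \times (m-r)}{U_2} \end{bmatrix} \begin{bmatrix} \underset{r \times r}{\Sigma_1} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \underset{r \times n}{V_1^T} \\ \underset{(n-r) \times n}{V_2^T} \end{bmatrix}$$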
The singular values, σi, are the non-negative square root of the eigenvalues, λi, of the ATA. The column vectors
of U and V are the orthonormal bases that span the range and null spaces of A and AT (see TABLE 1•3.)
$$\|A\|_2 = \sigma_1, \qquad \|A\|_F = \sqrt{\sigma_1^2 + \sigma_2^2 + \dots + \sigma_n^2}, \qquad \text{cond} = \sigma_1 / \sigma_n$$
The singular value decomposition in VectorSpace C++ library can be called by writing
The least squares problem can be solved by singular value decomposition. Sometimes a problem is overdetermined,
in that you make more measurements than there are unknowns, yet it is still rank deficient because some
measurements basically repeat the information of the others while, at the same time, some vital information has
never been obtained. The use of singular value decomposition for the solution of least squares or simultaneous
equations is parallel to the use of LU, Cholesky or QR decomposition. For example, we have
SVD a(A);
C0 x = a * b;
or set default matrix solver by
Matrix::Decomposition_Method = Matrix::Singular_Value_Decomposition;
C0 x = b / A;
In summary, we have introduced all the primary objects (Scalar, Vector, and Matrix objects of C0 type) in
Sections 1.1.1 to 1.1.4. For a complicated object like Matrix, data abstraction begins to play a more important
role compared to simple objects like Scalar or Vector. Data abstraction helps us encapsulate the complexity of
memory management of the data array for the Matrix class. The details of how these data are actually stored in
the Matrix object are hidden from users. The various constructors and the destructor of the Matrix object help
users, behind the scenes, handle the resources of the Matrix class. On top of this, because the Matrix class needs
an extensive “bag of tricks” on the subject of matrix algebra, data abstraction helps to organize different kinds
of decomposition methods (LU, Cholesky, QR, SVD, and the eigen-system solver) together with the associated matrix data
into a coherent module. This integrated module, under the concept of data abstraction, acts like an intelligent
entity that has the knowledge to deal with problems of its own.
In the section on Scalar, we introduced object-oriented programming that achieves flexibility by the use
of virtual constructors. In the definition of the two assignment operators “=” and “&=”, we introduced the C0 type
object as acting more like a label that can be peeled off or attached to a concrete object. This symbolic flavor
is also a feature enabled by the object-oriented programming method implemented in VectorSpace C++
library. The C0 class is the base class for the derived concrete classes, Scalar, Vector, Matrix, etc. The base
class C0 also serves as a flat-interface1 for all of its derived concrete classes. The C0 class contains all member
operators and functions of all its derived concrete classes. In the user’s program, C0 is the only generic type
used. C0 serves as the delegate for all of its constituents. This generic type feature of VectorSpace C++ library
pushes the programming environment of C++ more towards the side of a full-fledged object-oriented
language for programming versatility, which complements the strong type-compiled language that C++ was originally
designed to be for safety. We use one simple example to illustrate the advantage of this dynamic, late-binding
feature supported by VectorSpace C++ Library.
For example, root-finding for a function f(x) is stated as
f(x) = 0
Approximation of function f(x) by the Taylor expansion to the first order about x0 gives
f(x0 + dx) ≈ f(x0) + f’(x0) dx = 0
Therefore, the increment of solution dx can be found, from the above equation, by using dx = -f(x0) / f’ (x0)
(known as Newton’s formula), and the solution is updated with increment dx by xi+1 = xi + dx, where “i” is the
iterative index. A converged solution is obtained when dx becomes negligible. We can easily extend this prob-
1. Bjarne Stroustrup, 1991, “The C++ programming language”, 2nd ed., Addison-Wesley Publishing Company, Massachu-
setts, p. 452.
At the time of writing this subroutine, we do not need to distinguish whether it is for a one-dimensional problem
or a multi-dimensional problem. When this subroutine is called by the user, if the problem is one-dimensional,
the arguments f, df and the return value are all Scalar objects of C0 type. If the problem is multi-dimensional, the
user passes a Vector object of C0 type through argument f, and a square Matrix object of C0 type through argu-
ment df, and the return value will be a Vector object—the increment dx from Newton’s formula. The division
operator “/”, in the multi-dimensional case, implicitly calls the default matrix solver—the LU decomposition to
solve the problem.
We discuss one more example of object-oriented function dispatching mechanism. Assuming in the subrou-
tine “newton_formula()”, we want to display the value of “f” column-wise and add a line as
The member function “C0::length()” makes sense if the variable “ f ” is a Vector. We mentioned on page 17 the
concept of backward compatibility of a member function. For a Scalar object, a call to “length()” exists and will
always return the default value of “ 1 ”. This compatibility of member functions is the result of using the flat-
interface provided by the C0 for all its derived concrete classes. Furthermore, whatever “ f ” is, a scalar or a vec-
tor, we can always write
cout in VectorSpace C++ Library knows how to handle the output whatever “ f ” is! We see that as the result of
using C0 instead of concrete types a general form of syntax can be defined.
We note in passing that for Newton’s method it is better to use the C1 type instead, which will be introduced in the
next chapter. With the C1 type, a differentiable object can be easily defined. The above hypothetical example is
only used to illustrate, from a programming method perspective, the versatility of the more dynamic, late-binding
technique. It is not meant to explain the implementation of a real-world numerical problem.
Figure 1•16 Referenced source Vector and referenced source Matrix are equal-
partitioned to Subvectors and Submatrices.
of arbitrarily higher dimensional array. In many engineering applications, for example in the finite element
method, subvectors and submatrices are used as a convention to represent objects that would otherwise be writ-
ten in a higher dimensional array.
The mathematical “subvector” and “submatrix”, of course, should include the “reference Vector” and “refer-
ence Matrix” defined on page 10 and page 21, respectively. The reference Vector or reference Matrix can have
arbitrary sizes and starting/ending indices as long as they are within the bounds of the Vector or the Matrix they
are referring to. In VectorSpace C++ Library the terms Subvector and Submatrix are reserved for a special kind of
subvector and submatrix, one whose referenced source vector or referenced source matrix can be partitioned
into equal-sized blocks. For example, in Figure 1•16 the
referenced source Vector has “length” = 9. The Vector can be subdivided into 3 equal-sized sub-blocks of
“length” = 3. Similarly, the referenced source Matrix has “row-length” = 4 and “column-length” = 9. This Matrix
can be subdivided into 6 equal-sized blocks, each with “row-length” = 2 and “column-length” = 3. We will show
you later that the Subvector and Submatrix not only serve the purpose of representing higher dimensional tensors
with lower dimensional ones. With the VectorSpace Subvector/Submatrix index
scheme, the need for the “for” control statement in C++ is also substantially reduced, which leads to much less cluttered
C++ code that stays close to the mathematical expressions.
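double v[9] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0},
d[4][9] = {{ 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0},
{10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0},
{20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0},
{30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0}};
C0 a(v, 9, 3), b(d[0], 4, 9, 2, 3);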
The double array “v” and “d” are the physical memory space to be referenced. The declaration by the dedicated
constructors of the last line actually gives a Nominal_Subvector and a Nominal_Submatrix. By the word “Nomi-
nal”, we mean that it is the name for a Subvector or a Submatrix. Therefore, “a” and “b” are the symbols that will
be used to generate Subvector and Submatrix per se. The declaration above by the dedicated constructors of the
C0 type only provides information on how to partition a continuous memory space into equal sized sub-blocks.
For example, a(v, 9, 3) means that the double array “v” has length = 9, and it is partitioned into three equal sized
sub-blocks, each of which has “length” = 3, and b(d[0], 4, 9, 2, 3) means that the double array “d” has “row-
length” = 4, “column-length” = 9, and it is partitioned into six equal sized sub-blocks, each of which has “row-
length” = 2, and “column-length” = 3. The Subvector and Submatrix are generated by using selectors. For exam-
ple, (in project: “subvector_submatrix_examples”, and see also Figure 1•16)
The operator “( )” of the Nominal_Subvector and Nominal_Submatrix selects a continuous block of memory
space to form a Subvector or a Submatrix (see Figure 1•16). For example, the index of “a(1)” is “1”, which is
the offset from the first continuous block. The contents of the Subvector “a(1)” are, therefore, {3.0, 4.0,
5.0}T, referring to the second continuous block in the equal-partitioned double array v. The indices of b(1, 2)
mean that the continuous block Submatrix has a row offset of “1” and a column offset of “2” from the first continuous block.
The contents of “b(1, 2)” are then {{26.0, 27.0, 28.0}, {36.0, 37.0, 38.0}}.
The operator “[ ]” of the Nominal_Subvector and Nominal_Submatrix selects memory locations at regular increments
to form a Subvector or a Submatrix. For example, in VectorSpace C++ Library, the Subvectors and Submatrices
are generated as (in project: “subvector_submatrix_examples”)
Figure 1•17 Subvectors and Submatrices are generated by calling a continuous block
selector with the selector “( )”. (Panels: a(1) = {3.0, 4.0, 5.0}T selected from “v”; b(1, 2) = {{26., 27., 28.}, {36., 37., 38.}} selected from “d”.)
// subvectors
cout << a[0] << endl; // {0.0, 3.0, 6.0}T
cout << a[1] << endl; // {1.0, 4.0, 7.0}T
cout << a[2] << endl; // {2.0, 5.0, 8.0}T
// submatrices
// operator “[ ]” applied twice
cout << b[0][0] << endl; // {{ 0.0, 3.0, 6.0},
// {20.0, 23.0, 26.0}}
cout << b[0][1] << endl; // {{ 1.0, 4.0, 7.0},
// {21.0, 24.0, 27.0}}
cout << b[0][2] << endl; // {{ 2.0, 5.0, 8.0},
// {22.0, 25.0, 28.0}}
cout << b[1][0] << endl; // {{10.0, 13.0, 16.0},
// {30.0, 33.0, 36.0}}
cout << b[1][1] << endl; // {{11.0, 14.0, 17.0},
// {31.0, 34.0, 37.0}}
cout << b[1][2] << endl; // {{12.0, 15.0, 18.0},
// {32.0, 35.0, 38.0}}
For example, see Figure 1•16, “a[1]” means every element with offset of “1” from every equal-partitioned sub-
block is selected to form a Subvector. The contents of the Subvector “a[1]” is therefore {1.0, 4.0, 7.0}T, which is
referring to every second element of the sub-blocks in the double array “v”. The indices of b[0][1] means every
element with “0” row offset and “1” column offset from the first elements of the equal-partitioned sub-blocks is
selected to form a Submatrix. The contents of the Submatrix “b[0][1]” is therefore {{1.0, 4.0, 7.0}, {21.0, 24.0,
27.0}}, which is referring to every first row, second column element of each sub-block in the double array “d”.
Figure 1•18 Subvectors and Submatrices are generated by calling the regular increment
selector “[ ]”. “[ ]” needs to be applied twice for the regular increment Submatrix. (Panels: a[1] = {1.0, 4.0, 7.0}T selected from “v”; b[0][1] = {{1., 4., 7.}, {21., 24., 27.}} selected from “d”.)
Access to a part or an element of a Subvector or a Submatrix is exactly like access to that of a Vector
or a Matrix. We can use “operator[ ](int)” to select an element of a Subvector. For the Submatrix,
“operator[ ](int)” is the row selector and “operator( )(int)” is the column selector; both return a Vector object.
“operator( )(int, int)” is the element selector. For example, (project: “subvector_submatrix_examples”)
double v[12] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0},
d[8][12] = { { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0},
{10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0},
{20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0},
{30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0},
{40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0},
{50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0, 61.0},
{60.0, 61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0, 71.0},
{70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0, 81.0} };
C0 a(v, 12, 6), b(d[0], 8, 12, 2, 2),
c(a(0), 3), // “c” is a Nominal_Subvector of a Subvector
[Figure: the arrays “v” and “d” with the selections a(0) = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0}T, c[0] = {0.0, 3.0}T, and b[0][1] = {{1.0, 3.0, 5.0, 7.0, 9.0, 11.}, {21., 23., 25., 27., 29., 31.}, {41., 43., 45., 47., 49., 51.}, {61., 63., 65., 67., 69., 71.}}.]
We emphasize that the C0 constructors play the role of constructing Nominal_Subvectors and
Nominal_Submatrices, while the selectors of the Nominal_Subvector and Nominal_Submatrix play the role of
“constructors” to generate Subvectors and Submatrices.
C0 constructor → Nominal_Subvector & Nominal_Submatrix → selectors (of Nominal_Subvector and Nominal_Submatrix) → Subvector & Submatrix
cout << a(0) << endl; // {0.0, 1.0, 2.0, 3.0, 4.0, 5.0}T
cout << a(1) << endl; // {6.0, 7.0, 8.0, 9.0, 10.0, 11.0}T
cout << (a(0)+a(1)) << endl; // {6.0, 8.0, 10.0, 12.0, 14.0, 16.0}T
In the last line, when binary “C0::operator +(const C0&)” is called, both Subvectors “a(0)” and “a(1)” are turned
into Vectors, and then the two Vectors are added together. In fact, in the first two lines, upon “<<” being called,
the Subvectors are also converted into Vectors for output.
1.1.6 Basis
The usage of Basis is a lot like that of Subvector and Submatrix. The constructor and selector work
together to create Basis objects of C0 type. To understand the semantics of constructing the Subvector
and Submatrix, we need to understand how temporary objects like Nominal_Subvector and
Nominal_Submatrix are generated in the intermediate step. The semantics of generating a Basis object are just the
same. The names of those temporary objects get quite wordy because of the combinatorial explosion of
objects. We can learn how to generate those complicated basis expressions from examples and illustrations in
figures, without regard to those meticulous temporary object names. Only on one or two occasions, when we need to
get into the semantics of their usage, do we need to spell their names out. In VectorSpace C++ Library, a Basis is
created by calling the C0 dedicated constructor and then applying the selector as (in project: “basis_examples”)
$$v = r \, e_3,$$ Eq. 1•35
where “r * e[3]” means that a Scalar “r” is projected, by operator “*”, to the basis e[3]. The result of such a
projection is a Vector whose fourth component, with index “3” as its offset number, has the value of “r”. If a
term in the expression has no Scalar projected to its basis (unlike “r” or “6” in the following), the
scalar constant is assumed to be “1”. For example,
where w = {0.0, 1.0, 0.0, 5.0, 6.0}T. The second component of “w” is “1.0”, which has assumed a constant “1.0”
projected at “e 1”. The projection by “*” can be further visualized as in Figure 1•20, in which “r” is projected by
using Basis e[3] as its projection map (or filter). In the C++ code, “w” is a Vector that has the value of “5.0” from
projecting the Scalar “r” into its fourth component. Similarly, the fifth component of “w” has double “6.0”.
Figure 1•20 Projection using “*”. Basis “e[3]” serves as the projection map.
What if we have mathematical expressions, using bases, that generate a second order tensor, such as
$$T = t_{00} \, e_0 \otimes e_0 + t_{01} \, e_0 \otimes e_1 + t_{02} \, e_0 \otimes e_2 + t_{10} \, e_1 \otimes e_0 + t_{11} \, e_1 \otimes e_1 + t_{12} \, e_1 \otimes e_2 + t_{20} \, e_2 \otimes e_0 + t_{21} \, e_2 \otimes e_1 + t_{22} \, e_2 \otimes e_2$$ Eq. 1•37
[Figure: Matrix “T” with the Basis_Matrix “e[2]%e[1]” marking the entry 21.]
where “A” is a Nominal_Subvector generated by projecting two Vectors “V” and “W” into two
Basis_Subvectors e*E[0] and e*E[1]. Let’s first look at the Basis_Subvector e*E[0] (see Figure 1•22). The
Basis “e” is projected into a Basis “E[0]” to form a Basis_Subvector “e*E[0]”. Then, a Vector “V” is projected
using “e*E[0]” as its projection map to form a Nominal_Subvector “V*(e*E[0])”. In the above code segment,
a similar step is done by projecting a Vector “W” into the lower part of a Nominal_Subvector
“W*(e*E[1])”. Then, these two Nominal_Subvectors are added together and assigned to a C0 variable “A”,
which serves as a label for it. “A” can be used just as we used a Nominal_Subvector to generate all kinds of Sub-
vectors that we want in the previous section. For the output of A, (+A) is the primary casting using unary posi-
tive operator “+” to transform the Nominal_Subvector into a Vector. A different path to generate the same thing
can be done (see Figure 1•23). The Nominal_Basis “e” which has length = 6 is projected into another
Nominal_Basis “E” which has length = 2. The result is a Nominal_Basis_Subvector “e*E”. We then assign the
C0 variable “U” to this Nominal_Basis_Subvector. A Vector “V” is projected to a Basis_Subvector “U(0)” to
produce a Nominal_Subvector “V*U(0)”.
In the above example the Nominal_Basis is projected to a Basis. We now discuss a case with a reverse order
of the object occurrences; i.e., with a Basis (“E[0]”) projecting to a Nominal_Basis (“e”), such as “E[0]*e” (in
project: “basis_examples”).
In the above code (see also Figure 1•22), a Basis “E[0]” is projected to a Nominal_Basis “e” to form a
Basis_Subvector “E[0]*e”. A Vector “V” is then projected to “E[0]*e” to generate a Nominal_Basis_Subvector
“V*(E[0]*e)”.
Figure 1•24 Projection of a Basis “E[0]” to a nominal Basis “e” to form a Basis Subvector “E[0]*e”.
A Vector “V” is projected to “E[0]*e” to generate a nominal Basis Subvector “V*(E[0]*e)”. (Panels: E[0], e, E[0]*e, V, and V*(E[0]*e).)
There are cases in which the Nominal_Basis_Subvector can be used (see Figure 1•22). The Nominal_Basis “E”,
which has length = 2, is projected into another Nominal_Basis “e”, which has length = 6. The result is a
Nominal_Basis_Subvector “E*e”, which is then assigned to a C0 variable “Z” (by the copy constructor, which is always by
reference in VectorSpace C++ Library). A Vector “V” is projected to a
Basis_Subvector “Z[0]” to produce a Nominal_Subvector “V*Z[0]”.
If you have mastered the Subvector, Submatrix and the Basis_Subvector in the above, the Basis_Submatrix
should follow naturally from the rules that you have just learned. You will notice, though, that a two
dimensional Basis_Submatrix is certainly more complicated than the one dimensional Basis_Subvector. We
have warned you that the names of the temporary objects generated along the way can get quite wordy. Those
names are important for understanding the semantics and therefore the internal working of VectorSpace C++
Library. However, in most cases you can get a practical understanding by just reading the example code and the
figures that are associated with it. For example, (in project: “basis_examples”)
Figure 1•25 Projection of nominal Basis “E” on nominal Basis “e” to form a nominal Basis
Subvector “Z = E*e”, and projection of Vector “V” on a nominal Basis Subvector “Z[0]” to form a
nominal Subvector “V*Z[0]”.
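double d1[4][6] = { { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0},
{10.0, 11.0, 12.0, 13.0, 14.0, 15.0},
{20.0, 21.0, 22.0, 23.0, 24.0, 25.0},
{30.0, 31.0, 32.0, 33.0, 34.0, 35.0} },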
d2[4][6] = { { 6.0, 7.0, 8.0, 9.0, 10.0, 11.0},
{16.0, 17.0, 18.0, 19.0, 20.0, 21.0},
{26.0, 27.0, 28.0, 29.0, 30.0, 31.0},
{36.0, 37.0, 38.0, 39.0, 40.0, 41.0} },
d3[4][6] = { {40.0, 41.0, 42.0, 43.0, 44.0, 45.0},
{50.0, 51.0, 52.0, 53.0, 54.0, 55.0},
{60.0, 61.0, 62.0, 63.0, 64.0, 65.0},
{70.0, 71.0, 72.0, 73.0, 74.0, 75.0} },
d4[4][6] = { {46.0, 47.0, 48.0, 49.0, 50.0, 51.0},
{56.0, 57.0, 58.0, 59.0, 60.0, 61.0},
{66.0, 67.0, 68.0, 69.0, 70.0, 71.0},
{76.0, 77.0, 78.0, 79.0, 80.0, 81.0} };
C0 a1(4, 6, d1[0]), a2(4, 6, d2[0]), // four matrices
a3(4, 6, d3[0]), a4(4, 6, d4[0]),
e1(4), e2(6), E(2); // three Bases
C0 A = a1 * ((e1%e2)*(E[0]%E[0])) + a2 * ((e1%e2)*(E[0]%E[1]))
+ a3 * ((e1%e2)*(E[1]%E[0])) + a4 * ((e1%e2)*(E[1]%E[1]));
cout << A(0, 0) << endl; // { { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0},
// {10.0, 11.0, 12.0, 13.0, 14.0, 15.0},
// {20.0, 21.0, 22.0, 23.0, 24.0, 25.0},
// {30.0, 31.0, 32.0, 33.0, 34.0, 35.0} }
In the above code segment (see also Figure 1•26), a Matrix “a1” of “row-length” = 4 and “column-length” = 6 is
projected into a Basis_Submatrix “((e1%e2)*(E[0]%E[0]))”. A Nominal_Basis_Matrix “e1%e2” is projected
into a Basis_Matrix “E[0]%E[0]” to form a Basis_Submatrix “((e1%e2)*(E[0]%E[0]))”. This Basis_Submatrix
is then used to project a Matrix “a1” to form a Nominal_Submatrix “a1*((e1%e2)*(E[0]%E[0]))”.
An equivalent expression is obtained by defining X = (e1%e2)*(E%E) (see also Figure 1•26). A
Nominal_Basis_Submatrix “X” is defined by projecting a Nominal_Basis_Matrix “e1%e2” into another
Nominal_Basis_Matrix “E%E”. Then, a Matrix “a1” is projected to a Basis_Submatrix “X(0)” to form a
Nominal_Submatrix “a1*X(0)”.
Just as in the case of Basis_Subvector, we can switch from projecting a Nominal_Basis_Matrix into a
Basis_Matrix to projecting a Basis_Matrix into a Nominal_Basis_Matrix. For example, (in project:
“basis_examples”)
C0 B = a1 * ((E[0]%E[0])*(e1%e2)) + a2 * ((E[0]%E[1])*(e1%e2))
+ a3 * ((E[1]%E[0])*(e1%e2)) + a4 * ((E[1]%E[1])*(e1%e2));
cout << B[0][0]<< endl; // { { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0},
// {10.0, 11.0, 12.0, 13.0, 14.0, 15.0},
// {20.0, 21.0, 22.0, 23.0, 24.0, 25.0},
// {30.0, 31.0, 32.0, 33.0, 34.0, 35.0} }
C0 Y = (E%E)*(e1%e2); // nominal Basis Submatrix
C0 B1 = a1 * Y[0][0] + a2 * Y[0][1] // B1 == B
+ a3 * Y[1][0] + a4 * Y[1][1];
In the above (see also Figure 1•26), a Basis_Matrix “E[0]%E[0]” is projected into a Nominal_Basis_Matrix
“e1%e2” to form a Basis_Submatrix “(E[0]%E[0])*(e1%e2)”. A Matrix “a1” is then projected by using the
Basis_Submatrix as a map to form a Nominal_Submatrix “a1*((E[0]%E[0])*(e1%e2))”.
The Nominal_Basis_Submatrix can be used in this case to generate an equivalent expression (see Figure
1•26). A Nominal_Basis_Matrix “E%E” is projected into another Nominal_Basis_Matrix “e1%e2” to form a
Nominal_Basis_Submatrix “(E%E)*(e1%e2)”. The Nominal_Basis_Submatrix is assigned to C0 variable Y. A
Matrix “a1” is projected into a Basis_Submatrix “Y[0][0]” to form a Nominal_Submatrix “a1*Y[0][0]”.
[Figure panels: (1) “a1*(e1%e2)*(E[0]%E[0])” and (2) “a1*X(0)” with X = (e1%e2)*(E%E): the 4 × 6 Matrix “a1” fills the upper-left block of an 8 × 12 matrix whose other entries are 0.0. (3) “a1*(E[0]%E[0])*(e1%e2)” and (4) “a1*Y[0][0]” with Y = (E%E)*(e1%e2): the entries of “a1” fill every other row and every other column of an 8 × 12 matrix, starting at the first row and column, with 0.0 elsewhere.]
Figure 1•30 Parallel circuit with three resistances and a voltage source. (Labels: voltage source V; resistances R1, R2, R3 carrying currents I1, I2, I3; total current I1 + I2 + I3.)
Therefore,
1. Examples taken from H.V. Malmstadt, C.G. Enke, and E.C. Toren, Jr., 1963, “Electronics for Scientists, Principles and
Experiments for Those who Use Instruments”, W.A. Benjamin, Inc., New York, pp. 536-540.
[Figure: step-by-step reduction of a resistor network (Steps 2 to 4 are labeled). One step combines the parallel resistances 90 and 100 as (90 × 100) / (90 + 100) = 47.4.]
This last formula is applicable to two or more parallel resistances. A reduction procedure can be applied to a
more complicated circuit network (see Figure 1•31). In four steps, we can reduce the resistances of the circuit
to a total resistance. The total current can then be calculated, and we can work backwards to resolve the currents in
each part of the circuit. However, we may not always be as fortunate as in the above case. Some circuits
cannot be reduced merely by applying the series resistance (simple addition) and parallel resistance (Eq.
1•38) formulas. A Wheatstone bridge (Figure 1•32) is a good example that can only be solved by a
linear algebraic approach.
The most important law regarding the calculation of the electric current in this circuit is Kirchhoff’s
law. This law says: (1) the sum of all the currents flowing toward and out of a node is zero; (2) the sum of
all the voltages from the source equals the sum of all the voltage drops in any loop. For solving this problem the
direction of current I4 is temporarily assumed to go from node 2 to node 3. Applying the first part of Kirchhoff’s
law to node 1, node 2, and node 3, respectively, gives
I1 = I2 + I3
I2 = I4 + I5
I6 = I3 + I4 Eq. 1•39
From the second part of Kirchhoff’s law, noting that V = IR, loops L1, L2, and L3 have the following relations.
3 = 30 I3 + 100 I6
0 = -30 I3 + 10 I4 + 50 I2
0 = -200 I5 + 10 I4 + 100 I6 Eq. 1•40
Loop L1 has a source voltage of 3 volts. L2 and L3 are both “0”. Eq. 1•39 and Eq. 1•40 are implemented in C++ as
double v[6] = { 0.0, 0.0, 0.0, 3.0, 0.0, 0.0}; // (project: “electric_circuit”)
C0 M(6, 6, (double*)0), V(6, v);
C0 Eqn0 = M[0], Eqn1 = M[1], Eqn2 = M[2], // aliases; copy constructor is always
Eqn3 = M[3], Eqn4 = M[4], Eqn5 = M[5]; // by reference in VectorSpace
Eqn0[0] = 1.0; Eqn0[1] = -1.0; Eqn0[2] = -1.0; // I1 = I2 + I3
Eqn1[1] = 1.0; Eqn1[3] = -1.0; Eqn1[4] = -1.0; // I2 = I4 + I5
Eqn2[5] = 1.0; Eqn2[2] = -1.0; Eqn2[3] = -1.0; // I6 = I3 + I4
Eqn3[2] = 30.0; Eqn3[5] = 100.0; // 3 = 30 I3 + 100 I6
Eqn4[2] = -30.0; Eqn4[3] = 10.0; Eqn4[1] = 50.0; // 0 = -30 I3 +10 I4 +50 I2
Eqn5[4] = -200.0; Eqn5[3] = 10.0; Eqn5[5] = 100.0; // 0 = -200 I5 + 10 I4 + 100 I6
C0 I = V / M; // default solver is the LU decomposition
cout << I << endl; // {0.0351158, 0.0130105, 0.0221053, 1.26316e-3,
// 0.0117474, 2.33684e-2}T
Notice that Eqn“n” are aliases to the rows of Matrix “M”. This is in accordance with the implementation of
VectorSpace C++ Library, where the copy constructor, invoked for example by the statement “C0 Eqn0 = M[0];”, is
always performed as copy by reference. If the copy constructor were performed by value, Eqn“n”
would not be aliases to the rows of Matrix “M”. However, assigning the coefficients of the matrix as above
is not only mathematically unsound, but also aesthetically unpleasant. This can be improved by using the basis
expression as (project: “electric_circuit”)
Figure 1•32 Calculation of electric current in Wheatstone bridge. (Labels: 3 Volts source; resistances of 30Ω, 50Ω, 10Ω, 100Ω, and 200Ω; currents I1 to I6; nodes 1 to 3; loops L1 to L3.)
These C++ statements can be compared more readily with Eq. 1•39 and Eq. 1•40. The programming task therefore becomes much more transparent, and the finished code is much more readable and maintainable. The linear algebraic approach to the Wheatstone bridge is certainly also applicable to the previous, less complicated circuit network. It is a more systematic approach to the general circuit network problem. We note in passing that the idea of partially eliminating a part of a circuit network can be studied systematically with graph theory. Graph theory is also used in developing efficient algorithms for the solution of sparse matrices1.
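The six equations of the Wheatstone bridge can also be checked without the library. The following is a minimal sketch in plain C++, assuming a dense Gaussian elimination with partial pivoting as a stand-in for the statement “V / M”; the names solve_dense and wheatstone_currents are hypothetical and are not part of the VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Solve M x = v by Gaussian elimination with partial pivoting.
// Plain C++ stand-in for "V / M"; this is not the VectorSpace solver.
std::vector<double> solve_dense(std::vector<std::vector<double>> M,
                                std::vector<double> v) {
    const std::size_t n = v.size();
    assert(M.size() == n);
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;                       // row with the largest pivot in column k
        for (std::size_t r = k + 1; r < n; ++r)
            if (std::fabs(M[r][k]) > std::fabs(M[p][k])) p = r;
        assert(std::fabs(M[p][k]) > 1e-14);      // singular matrices are not handled
        std::swap(M[k], M[p]);
        std::swap(v[k], v[p]);
        for (std::size_t r = k + 1; r < n; ++r) {
            const double f = M[r][k] / M[k][k];
            for (std::size_t c = k; c < n; ++c) M[r][c] -= f * M[k][c];
            v[r] -= f * v[k];
        }
    }
    std::vector<double> x(n);
    for (std::size_t k = n; k-- > 0;) {          // back substitution
        double s = v[k];
        for (std::size_t c = k + 1; c < n; ++c) s -= M[k][c] * x[c];
        x[k] = s / M[k][k];
    }
    return x;
}

// Currents I1..I6; the rows follow Eq. 1•39 (nodes) and Eq. 1•40 (loops).
std::vector<double> wheatstone_currents() {
    const std::vector<std::vector<double>> M = {
        {1.0, -1.0,  -1.0,  0.0,    0.0,   0.0},   // I1 = I2 + I3
        {0.0,  1.0,   0.0, -1.0,   -1.0,   0.0},   // I2 = I4 + I5
        {0.0,  0.0,  -1.0, -1.0,    0.0,   1.0},   // I6 = I3 + I4
        {0.0,  0.0,  30.0,  0.0,    0.0, 100.0},   // 3 = 30 I3 + 100 I6
        {0.0, 50.0, -30.0, 10.0,    0.0,   0.0},   // 0 = -30 I3 + 10 I4 + 50 I2
        {0.0,  0.0,   0.0, 10.0, -200.0, 100.0}};  // 0 = -200 I5 + 10 I4 + 100 I6
    const std::vector<double> v = {0.0, 0.0, 0.0, 3.0, 0.0, 0.0};
    return solve_dense(M, v);
}
```

Eliminating the unknowns by hand gives I4 = 3/2375, about 1.26316e-3, which is a convenient value to compare against the fourth entry returned by wheatstone_currents().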
$$\frac{\partial^2 T}{\partial x^2} + \lambda \frac{\partial^2 T}{\partial y^2} = -\frac{q}{K}$$ Eq. 1•41
where λ is the ratio of anisotropic heat conduction in the two coordinate directions, q is the intensity of the internal heat source, and K is the thermal diffusivity. The boundary conditions of this problem are shown in Figure 1•33. The domain is a square of 2 × 2 units, with material properties λ = 3 and q / K = 16. The temperature on both the right edge and the left edge is fixed at zero. This kind of boundary condition is the so-called essential (or Dirichlet) boundary condition, also known as the boundary condition of the first kind. In particular, an essential condition that vanishes (has zero value) is also known as a homogeneous boundary condition. The upper and lower edges have boundary conditions of ∂T / ∂y = -T. Due to the symmetry of the problem, only a quarter of the area needs to be modeled. The right-hand side of Figure 1•33 shows that the X-axis and the Y-axis are also lines of symmetry. Therefore the lower edge and the left edge of this reduced model have the boundary conditions ∂T / ∂y = 0 and ∂T / ∂x = 0, respectively. This kind of boundary condition is the so-called natural (or Neumann) boundary condition, also known as the boundary condition of the second kind. In the present case, zero flux on these boundaries physically corresponds to insulated walls. The boundary condition on the upper edge is ∂T / ∂y = -T, which mixes the dependent variable, “T”, and its derivative, “∂T / ∂y”, in a linear combination. This kind of boundary condition is known as the mixed boundary condition, or the boundary condition of the third kind.
The grid size for the above problem is h = 1/4. The finite difference approximation to Eq. 1•41, with the central difference scheme, is
$$\frac{1}{h^2}\left(T_{i,j+1} - 2T_{i,j} + T_{i,j-1}\right) + \frac{\lambda}{h^2}\left(T_{i+1,j} - 2T_{i,j} + T_{i-1,j}\right) = -\frac{q}{K}$$ Eq. 1•42
1. For example, Chapter 4 and Chapter 5 in S. Pissanetsky, 1984, “Sparse Matrix Technology”, Academic Press, New York.
2. Example from G.D. Smith, 1985, “Numerical Solution of Partial Differential Equations: Finite Difference Methods”, p.
242-245.
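Figure 1•33 Boundary value problem of heat conduction with all three kinds of boundary conditions. (Left: the full square domain, with T = 0 on the left and right edges and ∂T/∂y = -T on the upper and lower edges. Right: the reduced quarter model, with nodes numbered 0 to 19 in five rows of four, T = 0 (B.C. of the 1st kind) on the right edge, ∂T/∂x = 0 on the left edge, ∂T/∂y = 0 on the lower edge, and ∂T/∂y = -T on the upper edge; negative numbers mark the mirror nodes outside the boundaries.)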
where “i” and “j” are row and column indices to the nodal positions. With the grid size (h = 1/4) and material properties (λ = 3 and q / K = 16) given above, we obtain
$$\begin{bmatrix} 0 & 3 & 0 \\ 1 & -8 & 1 \\ 0 & 3 & 0 \end{bmatrix} \hat{T} = -1 \;\Rightarrow\; \begin{bmatrix} 0 & 3T_{i-1,j} & 0 \\ T_{i,j-1} & -8T_{i,j} & T_{i,j+1} \\ 0 & 3T_{i+1,j} & 0 \end{bmatrix} = -1$$ Eq. 1•44
For any node with temperature Tij, the finite difference stencil on the left-hand side maps the corresponding coefficients to the nodes around it. The stencil itself is extremely simple and involves no sophisticated mathematics. The treatment of boundary conditions in the finite difference method, however, is notoriously unsystematic compared with that in the finite element method. The specification of boundary conditions in the finite difference method is full of small details particular to each problem. We simply cannot encapsulate all of these complexities away from users.
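The coefficients in the stencil of Eq. 1•44 follow from multiplying the central difference equation by h². The short derivation below spells out that step as a sketch, assuming, as in Eq. 1•44, that “i” counts rows (the direction that carries the factor λ) and “j” counts columns.

```latex
\begin{aligned}
\left(T_{i,j+1} - 2T_{i,j} + T_{i,j-1}\right)
  + \lambda \left(T_{i+1,j} - 2T_{i,j} + T_{i-1,j}\right)
  &= -h^2\,\frac{q}{K} \\
T_{i,j-1} + T_{i,j+1} + 3\,T_{i-1,j} + 3\,T_{i+1,j} - 2\,(1 + 3)\,T_{i,j}
  &= -\left(\tfrac{1}{4}\right)^2 (16) \\
T_{i,j-1} + T_{i,j+1} + 3\,T_{i-1,j} + 3\,T_{i+1,j} - 8\,T_{i,j}
  &= -1
\end{aligned}
```

The center coefficient “-8” is therefore -2(1 + λ), and the right-hand side “-1” is -h² q / K.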
V[3] -= T
where the essential boundary condition lies to the right of the “center-node” (3). The “right-node” has the coefficient “1” in the finite difference stencil, so the boundary value T is multiplied by “1”, giving 1 × T = T. This known term is then shifted to the right-hand side of equation number “3”, which changes its sign. We use replacement subtraction, “-=”, to subtract this amount from the corresponding element of the right-hand-side vector, “V[3]”.
Natural Boundary Conditions : The left edge and the lower edge of the reduced model have zero natural boundary conditions. Treating these boundaries as lines of symmetry, a node on the boundary can be considered to have the same neighboring node on both sides of the boundary. For example, for node number “8”, we can specify both its “left-node” and its “right-node” as node number “9”. The two corresponding coefficients in the finite difference stencil are then both added to the entry of the left-hand-side matrix that corresponds to node number “9”. In a more general case, with non-zero flux, the natural boundary condition is - ∂T / ∂x = q, where “q” is the outward heat flux. In this more general case
(T-9 - T9) / 2h = 2 (T-9 - T9) = -q
This shows that “T-9” can be represented by “T9 - q / 2”. For the first term, “T9”, we can specify the “left-node” (-9) as node number “9”. For the second term, “-q / 2”, the right-hand-side vector corresponding to the “center-node” (8) needs to be modified as V[8] += q / 2. The quantity “-q / 2” has been shifted to the right-hand-side vector “V”, and therefore a change of sign is necessary.
Mixed Boundary Conditions : The boundary condition of the third kind on the top gives, for example at the current node number “0”,
(T-4 - T4) / 2h = 2 (T-4 - T4) = -T0
Therefore,
T-4 = T4 - 0.5 T0
with the “upper-node” specified as node number “4”. The left-hand-side matrix entry corresponding to the “center-node” (0) needs to be modified with M[0][0] -= 3.0*0.5. Notice that the coefficient in the finite difference stencil corresponding to the “upper-node” (-4) is “3.0”.
Program Listing 1•8 is the implementation of the above finite difference problem. The global left-hand-side matrix “M” and the global right-hand-side vector “V” are declared static, so that only one copy of each exists in the program, to save memory space. At the heart of this code, we use an object-oriented model to map the finite difference stencil in Eq. 1•44 to the global matrix and global vector. The class name is “FD”. Two private
#include "c0_init.h"
#define ndf 20
static C0 M(ndf, ndf, (double*)0);                 // global left-hand-side matrix
static C0 V(ndf, (double*)0);                      // global right-hand-side vector
class FD {                                         // finite difference stencil mapping class
    C0 *the_M_ptr, *the_V_ptr;                     // pointers to M and V
public:
    FD(C0* M_ptr, C0* V_ptr) : the_M_ptr(M_ptr), the_V_ptr(V_ptr) {}   // constructor
    void add(int, int, int, int, int);             // map the finite difference stencil
};
void FD::add(int i0, int i1, int i2, int i3, int i4) {
    (*the_M_ptr)[i0][i0] -= 8.0; (*the_V_ptr)[i0] = -1.0;                           // center node
    if(i1>=0) (*the_M_ptr)[i0][i1] += 1.0; if(i2>=0) (*the_M_ptr)[i0][i2] += 3.0;   // left and bottom nodes
    if(i3>=0) (*the_M_ptr)[i0][i3] += 1.0; if(i4>=0) (*the_M_ptr)[i0][i4] += 3.0;   // right and top nodes
}
int main() {
    FD fd(&M, &V);                                 // I. Problem Definition Phase
    fd.add(0, 1, 4, 1, 4); fd.add(1, 0, 5, 2, 5); fd.add(2, 1, 6, 3, 6); fd.add(3, 2, 7, -1, 7);             // first row
    fd.add(4, 5, 8, 5, 0); fd.add(5, 4, 9, 6, 1); fd.add(6, 5, 10, 7, 2); fd.add(7, 6, 11, -1, 3);           // second row
    fd.add(8, 9, 12, 9, 4); fd.add(9, 8, 13, 10, 5); fd.add(10, 9, 14, 11, 6); fd.add(11, 10, 15, -1, 7);    // third row
    fd.add(12,13,16,13,8); fd.add(13,12,17,14,9); fd.add(14,13,18,15,10); fd.add(15,14,19,-1,11);            // fourth row
    fd.add(16,17,12,17,12); fd.add(17,16,13,18,13); fd.add(18,17,14,19,14); fd.add(19,18,15,-1,15);          // fifth row
    for(int i = 0; i < 4; i++) M[i][i] -= 1.5;     // B.C. of the third kind modification
    C0 T = V / M;                                  // II. Solution Phase (matrix solver)
    for(int i = 0; i < 5; i++) {                   // III. Output Phase: (node#, T)
        for(int j = 0; j < 4; j++)
            cout << "(" << (i*4+j) << ", " << T[i*4+j] << ") ";
        cout << endl;
    }
    // output:
    // (0, 3.06), (1, 2.91), (2, 2.42), (3, 1.50),
    // (4, 3.72), (5, 3.53), (6, 2.92), (7, 1.80),
    // (8, 4.17), (9, 3.95), (10, 3.26), (11, 2.0),
    // (12, 4.43), (13, 4.20), (14, 3.46), (15, 2.11),
    // (16, 4.52), (17, 4.28), (18, 3.52), (19, 2.15)
    return 0;
}
Listing 1•8 Finite difference method for 2-d heat conduction problem (project: “finite_difference”).
member data are C0 pointers to the global matrix and the global vector. The constructor of “FD” establishes the link between the class and the global objects through “FD fd(&M, &V);”. The only public member function of “FD” is “FD::add(int, int, int, int, int)”, which assembles the coefficients of the finite difference stencil into the global matrix and the global vector. The five integer arguments of the function interface represent the node numbers of the center, left, lower, right, and upper nodes. A typical interior node such as node “5”, as shown in Figure 1•33, can be written as
fd.add(5, 4, 9, 6, 1);
The node numbers “4”, “9”, “6”, and “1” are the neighbors of node “5” in those relative positions. For a node next to the homogeneous boundary condition, such as node “7”,
fd.add(7, 6, 11, -1, 3);
We note that “-1” is in place of the right-node relative to node “7”. Any negative integer value suppresses the addition of the corresponding finite difference stencil coefficient to the left-hand-side matrix. For the natural boundary conditions on the left edge and lower edge, considering the symmetry at node “8” for example, we can specify
fd.add(8, 9, 12, 9, 4);
The “right-node” (9) on the right-hand side of the “center-node” (8) is also used as the “left-node”. In the more general case, the natural boundary condition is - ∂T / ∂x = q, where “q” is the outward heat flux, and (T-9 - T9) / 2h = 2 (T-9 - T9) = -q; i.e., T-9 = T9 - q / 2. We may write this in C++ together with “FD::add()” as
fd.add(8, 9, 12, 9, 4); V[8] += q / 2;
Similarly, for the boundary condition of the third kind, using node “1” as an example,
fd.add(1, 0, 5, 2, 5); M[1][1] -= 1.5; Eq. 1•46
To represent the top node, node “5” is used as the “upper-node”, which accounts for the first term in Eq. 1•47. The finite difference stencil coefficient “3” at the top node position is thereby added to the corresponding element of the left-hand-side matrix. The second term in Eq. 1•47 must be multiplied by the same stencil coefficient “3”, giving “3 × (- 0.5 T1) = -1.5 T1”, which is not accounted for by the class “FD”. The second part of Eq. 1•46, “M[1][1] -= 1.5;”, performs this modification.
The result of this problem is shown in Figure 1•34. The finite difference grid is drawn on the top of the box, and the height of the shaded surface represents the value of the temperature. The basic feature of the solution is a surface that bulges upward along the center because of the internal heat generation “q” in Eq. 1•42. The homogeneous boundary conditions on the two sides of the problem pin the surface down to zero. The surface has a concave shape along the Y-axis, which means that heat is lost from both the top and the bottom, where the boundary condition of the third kind applies.
From this simple example of implementing the finite difference method, we can recognize that the VectorSpace C++ Library makes programming easier only in the mathematically related aspects. The finite difference method, however, is NOT intensively mathematical. The greatest advantage comes from the data abstraction offered by C++, which let us model the mapping of the finite difference stencil to the global matrix and vector as a class, “FD”. The experience gained from this example tells us to stay with the current industrial flagship general-purpose language. We do not want to invest our effort in a specialized environment or, even worse, a mixed environment with more than one language to worry about. The applicability of a special-purpose language across wide-ranging application areas is likely to break down somewhere. Most of the strength of applications using the VectorSpace C++ Library comes from the advantages of object-oriented programming. In the examples on constrained optimization in Chapter 3, and on the finite element method in Chapters 4 and 5, we will show you that when an application problem is not trivial, the power of object-oriented programming becomes increasingly critical!
Figure 1•34 Solution of finite difference heat conduction problem, with internal heating, homogeneous boundary conditions on the two sides and boundary conditions of the third kind on the top and bottom. (Surface plot of the temperature T over the X-Y plane.)
Some remarks need to be made about this finite difference problem. Even on a laptop computer we get the result instantly, and there is no need to do anything more. But what if you want a 1000 × 1000 grid, or an even greater one, to increase the resolution? Computing time and memory space then need to be considered. Firstly, we can use a sparse matrix instead of a full matrix to represent the global matrix “M” to save memory space. As the size (n × n) of the global matrix grows, the memory required by a full matrix grows in the order of O(n²), and the computing time of a direct matrix solver grows in the order of O(n³). Secondly, the global matrix in this case is not only sparse but also diagonally dominant. An iterative method such as successive over-relaxation (SOR) can be used to search for the solution.1
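As an illustration of the iterative alternative, the sketch below assembles the same 20 × 20 system as Listing 1•8 in plain C++ and solves it with successive over-relaxation. It is only a sketch under stated assumptions: the helper names build_heat_system, sor_solve, and max_residual are hypothetical, the relaxation factor ω = 1.5 is an arbitrary choice, and the matrix is kept dense for brevity, whereas a production code would store only the five stencil entries per row.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Hypothetical helper: assemble the system of Listing 1•8
// (5 rows x 4 columns of nodes, node 0 at the upper-left corner).
void build_heat_system(Matrix& M, Vector& V) {
    const int rows = 5, cols = 4, n = rows * cols;
    M.assign(n, Vector(n, 0.0));
    V.assign(n, -1.0);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            const int k = r * cols + c;
            const int left  = (c == 0) ? k + 1 : k - 1;              // mirror node on the left symmetry line
            const int right = (c == cols - 1) ? -1 : k + 1;          // T = 0 beyond the right edge
            const int lower = (r == rows - 1) ? k - cols : k + cols; // mirror node on the lower symmetry line
            const int upper = (r == 0) ? k + cols : k - cols;        // first term of the mixed B.C.
            M[k][k] -= 8.0;
            M[k][left] += 1.0;
            M[k][lower] += 3.0;
            if (right >= 0) M[k][right] += 1.0;
            M[k][upper] += 3.0;
            if (r == 0) M[k][k] -= 1.5;                              // B.C. of the third kind
        }
}

// Successive over-relaxation for M x = V, started from x = 0.
Vector sor_solve(const Matrix& M, const Vector& V,
                 double omega, double tol, int max_sweeps) {
    const std::size_t n = V.size();
    assert(M.size() == n && omega > 0.0 && omega < 2.0);
    Vector x(n, 0.0);
    for (int sweep = 0; sweep < max_sweeps; ++sweep) {
        double max_change = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            double sigma = 0.0;                                      // off-diagonal part of row i
            for (std::size_t j = 0; j < n; ++j)
                if (j != i) sigma += M[i][j] * x[j];
            const double x_new =
                (1.0 - omega) * x[i] + omega * (V[i] - sigma) / M[i][i];
            max_change = std::max(max_change, std::fabs(x_new - x[i]));
            x[i] = x_new;
        }
        if (max_change < tol) break;                                 // stop when a sweep changes nothing
    }
    return x;
}

// Largest absolute entry of the residual M x - V.
double max_residual(const Matrix& M, const Vector& V, const Vector& x) {
    double worst = 0.0;
    for (std::size_t i = 0; i < V.size(); ++i) {
        double s = -V[i];
        for (std::size_t j = 0; j < x.size(); ++j) s += M[i][j] * x[j];
        worst = std::max(worst, std::fabs(s));
    }
    return worst;
}
```

A call such as sor_solve(M, V, 1.5, 1e-13, 10000) after build_heat_system(M, V) returns the nodal temperatures in the node order of Figure 1•33, and max_residual() gives a quick check that the iteration has converged.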
How do we tackle a large-size problem in terms of programming method? The idea is simple. Get rid of everything offered by the VectorSpace C++ Library! Simply go “back to the basics” by using plain C exclusively for ultimate speed. However, we can start by using the VectorSpace C++ Library as the first step, with a smaller number of variables as in the above example. Then, in the second step, we can re-engineer the program segment by segment for efficiency. This two-step programming method is known as rapid prototyping. In the first step, programming in the VectorSpace C++ Library is much more mathematically friendly compared with that in Fortran or
1. An introduction to the technical details of sparse matrix representation and successive over-relaxation (SOR) can be found in W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, 1992, “Numerical Recipes in C, the Art of Scientific Computing”, 2nd ed., Cambridge University Press, UK.
As a first cut, the assumption that more people would buy more cars seems to be a good one, if no better a priori information is available. We model the relationship between the number of cars sold and the population as a straight line
y = a + b x
where “y” is the number of cars sold, “x” is the population, and “a” and “b” are the two coefficients (intercept and slope) of the straight line. This problem is clearly over-determined: ten data are available to resolve only two model parameters, m = {a, b}T, the intercept “a” and the slope “b”. The full-rank over-determined least squares problem is explained as a projection method in Figure 1•15. The normal equations of Eq. 1•27 on page 35 can be used to solve this problem. Program Listing 1•9 solves the least squares problem with the normal equations method using Cholesky decomposition. This problem is not too ill-conditioned. In fact, using QR decomposition as discussed on page 36 produces the same result, with intercept “a” = 268.92 and slope “b” = 0.0078214. The fitted line is shown in Figure 1•35. The simple-minded assumption that areas with more population demand more
1. The abstract form of the problem in terms of “fitting a straight line” is presented in William Menke, 1984, “Geophysical
Data Analysis: Discrete Inverse Theory”, Academic Press, Inc., p.10-11.
#include "include/vs.h"
int main() {
    double y[10] = {421, 569, 514, 1139, 287, 543, 615, 934, 327, 918},               // y obsv.
           a[10][2] = { {1.0, 34549.0}, {1.0, 53943.0}, {1.0, 42983.0}, {1.0, 102832.0},
                        {1.0, 12034.0}, {1.0, 9023.0}, {1.0, 20934.0}, {1.0, 73023.0}, // 10 rows of {1, x}
                        {1.0, 23294.0}, {1.0, 83543.0}};
    C0 A(10, 2, a[0]), Y(10, y);
    Matrix::Decomposition_Method = Matrix::Cholesky_Decomposition;                     // Cholesky decomposition
    C0 AtA = ((~A)*A),                             // normal equations
       x = ((~A)*Y) / AtA;                         // m est = [A^T A]^{-1} A^T y obsv.
    cout << x << endl;                             // = {268.92, 0.0078214}T
    C0 cov_m = AtA.inverse(),                      // [covu m] = [A^T A]^{-1} = unit covariance matrix
       N = A * cov_m *(~A);                        // N = data resolution matrix
    cout << cov_m << endl;
    cout << N << endl;
    return 0;
}
Listing 1•9 Least squares solution using normal equations method for fitting a straight line (project: “least_squares_line_fitting”).
sports cars in general seems to be a good prediction model. However, there are two outliers with small popula-
tion (Locations “F” and “G”) having stronger demand relative to the other areas. Some factors other than the
population may have been overlooked. We will come back to this later.
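For a straight line the normal equations reduce to a 2 × 2 system that can be solved in closed form. The following is a minimal sketch in plain C++, independent of the VectorSpace C++ Library; the names LineFit, fit_line, kPopulation, and kCarsSold are hypothetical, and the data are copied from Listing 1•9 as printed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct LineFit { double a; double b; };   // intercept and slope

// Data as printed in Listing 1•9 (Locations A to J).
const std::vector<double> kPopulation = {34549, 53943, 42983, 102832, 12034,
                                         9023, 20934, 73023, 23294, 83543};
const std::vector<double> kCarsSold = {421, 569, 514, 1139, 287,
                                       543, 615, 934, 327, 918};

// Fit y = a + b x by solving the normal equations [A^T A] m = A^T y,
// where every row of A is {1, x_i}.
LineFit fit_line(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sxx = 0.0, sy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i];
        sxx += x[i] * x[i];
        sy += y[i];
        sxy += x[i] * y[i];
    }
    // A^T A = {{n, sx}, {sx, sxx}} and A^T y = {sy, sxy}
    const double det = n * sxx - sx * sx;
    assert(det > 0.0);                    // fails only if all x are identical
    return {(sxx * sy - sx * sxy) / det, (n * sxy - sx * sy) / det};
}
```

A property that is easy to check by hand is that the fitted line passes through the centroid of the data, so a + b·mean(x) equals mean(y) for any data set.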
Figure 1•35 Least squares solution of fitting a straight line. (Axes: x, population, 0 to 100000; y, number of cars, 0 to 1200. Location “D” and Locations “F” & “G” are labeled.)
The purpose of the present analysis is market prediction. We may ask how well the observational data y obsv. contribute to the model we built. We have built a model to predict the sale of cars y pred. according to
y pred. = A m est = A [A^T A]^{-1} A^T y obsv. = N y obsv.
where N = A [A^T A]^{-1} A^T is called the data resolution matrix. “N” maps the observations, y obsv., which we collected for the analysis, to the market prediction, y pred., from the model we built. In Figure 1•36, a plot of the data resolution matrix, the diagonal elements correspond to the data collected from each area of the local dealers and show the importance of each datum to its own prediction. Each row of the data resolution matrix forms the prediction as a linear combination of all the data. Therefore, if the diagonal element is dominant (closer to 1), the prediction around this particular datum can be resolved more independently. The data resolution matrix for the above problem shows that the most important datum is at Location “D”. This is the location with the largest population and the greatest sports car demand. This particular datum is not only consistent with the model assumption in the first place, but also helps spread out the data to cover a wider range. Location “D” is also expected to be resolved relatively independently, because along row and column number 4 its diagonal element is much greater than the other elements. However, along row number 4, the elements for Locations “H” and “J” are also relatively large. From Figure 1•35, Locations “H” and “J” are those with a large population and relatively large car sales. This information more or less duplicates what Location “D” stands for. We note that for an overdetermined problem we have more data than model parameters. Therefore the model parameters can always be perfectly resolved, while the data resolution matrix shows how well each datum is resolved.
Figure 1•36 Data resolution matrix. (Surface plot over row number and column number 1 to 10, labeled with Locations “A” to “J”; vertical scale 0 to 0.4; the peak at Location “D” and the entries for “H” and “J” are marked.)
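The data resolution matrix for the straight-line model can be formed explicitly, since the inverse of the 2 × 2 matrix A^T A is available in closed form. The sketch below is plain C++ and only illustrative; the names data_resolution and kPopulation are hypothetical, and the population values are those printed in Listing 1•9.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Populations of Locations A to J, as printed in Listing 1•9.
const std::vector<double> kPopulation = {34549, 53943, 42983, 102832, 12034,
                                         9023, 20934, 73023, 23294, 83543};

// Data resolution matrix N = A [A^T A]^{-1} A^T for rows A_i = {1, x_i}.
Matrix data_resolution(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sxx = 0.0;
    for (double xi : x) { sx += xi; sxx += xi * xi; }
    const double det = n * sxx - sx * sx;      // determinant of A^T A
    assert(det > 0.0);
    // entries of [A^T A]^{-1} = (1/det) {{sxx, -sx}, {-sx, n}}
    const double c00 = sxx / det, c01 = -sx / det, c11 = n / det;
    Matrix N(x.size(), std::vector<double>(x.size(), 0.0));
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            // {1, x_i} [A^T A]^{-1} {1, x_j}^T
            N[i][j] = c00 + c01 * (x[i] + x[j]) + c11 * x[i] * x[j];
    return N;
}
```

Two checks follow from the definition: N is symmetric, and its trace equals the number of model parameters (two here). With the populations above, the largest diagonal element belongs to the fourth datum, Location “D”, in agreement with the discussion of Figure 1•36.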
Recall probability and distributions from statistics. The peak width of a distribution can be measured by using the quadratic form (y - y Exp)², where the superscript “Exp” denotes expectation. The variance σ² of a distribution is defined as
$$\sigma^2 = \int_{-\infty}^{\infty} \left(y - y^{Exp}\right)^2 P(y)\, dy$$
where P(y) is the probability function of the distribution. For the correlation of any two data, the covariance is defined similarly as
$$\mathrm{cov}\, y = \mathrm{cov}(y_i, y_j) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left[y_i - y_i^{Exp}\right]\left[y_j - y_j^{Exp}\right] P(y)\, dy_1 \cdots dy_N$$ Eq. 1•50
Assuming all data are uncorrelated (the off-diagonals are all zero) and have the same variance σ², the unit covariance matrix of the model parameters, [covu m], is defined by mapping the uncorrelated data covariance [cov y] to the model covariance [cov m], with a normalization factor included
$$[\mathrm{cov}_u\, m] = \sigma^{-2} A^{-g}\, [\mathrm{cov}\, y]\, A^{-gT} = [A^T A]^{-1}$$ Eq. 1•51
Recall from Eq. 1•28 that A^{-g} = [A^T A]^{-1} A^T is the generalized inverse for the over-determined problem, where A^{-g} is used to map “y - y Exp” in Eq. 1•50 to m. Then [cov y] is divided by the variance, so that σ^{-2} [cov y] = I. The unit covariance matrix [covu m] indicates how much the error in the data, “y - y Exp”, is amplified into the model parameters, m. The diagonal of the unit covariance matrix for this problem is diag [covu m] = {0.325285, 1.08269e-10}T. Assume that all car sales carry a money-back guarantee against a lemon product, and that at the time of the survey some sales are pending, contingent on loan approval by banks. A lot of uncertainty may then be contained in the collected data. We test with an extremely large uncertainty of the data, σd = ± 10 (σd² = 100), for this problem, and also assume for simplicity that it is the same for every location. After the mapping by Eq. 1•51, the uncertainty of the model parameters remains small, with intercept “a” = 268.92 ± 5.7 and slope “b” = 78.214e-4 ± 1.04e-4.
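The error bounds quoted above can be reproduced with a few lines of arithmetic: the diagonal of [A^T A]^{-1} is scaled by the data variance and the square root is taken. The sketch below is plain C++ with hypothetical names (UnitVariance, unit_covariance_diag, model_sigma, kPopulation); it assumes the straight-line model, whose 2 × 2 matrix A^T A can be inverted in closed form.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Diagonal of the unit covariance matrix [A^T A]^{-1} for rows {1, x_i}.
struct UnitVariance { double a; double b; };   // intercept, slope

// Populations of Locations A to J, as printed in Listing 1•9.
const std::vector<double> kPopulation = {34549, 53943, 42983, 102832, 12034,
                                         9023, 20934, 73023, 23294, 83543};

UnitVariance unit_covariance_diag(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sxx = 0.0;
    for (double xi : x) { sx += xi; sxx += xi * xi; }
    const double det = n * sxx - sx * sx;      // determinant of A^T A
    assert(det > 0.0);
    return {sxx / det, n / det};               // diagonal of the 2 x 2 inverse
}

// Standard deviation of a model parameter for data with standard deviation sigma_d.
double model_sigma(double unit_variance, double sigma_d) {
    return sigma_d * std::sqrt(unit_variance);
}
```

With σd = 10, model_sigma() applied to the two diagonal entries gives about 5.70 for the intercept and about 1.04e-4 for the slope, matching the bounds in the text.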
It is not uncommon to think that two-seaters are probably most popular among wealthy people. The regional car dealer collected the average incomes of all locations, as shown in TABLE 1•5. We assume that car sales are also proportional to the average income, and build a modified model as
y = a + b x1 + c x2 Eq. 1•52
Listing 1•10 Least squares solution using normal equations method for fitting a plane.
The solution of the problem is m est = {-183.61, 9.9271e-3, 6.695e-3}T. The fitted plane is shown in Figure 1•37. Compared with Figure 1•35, Locations “G” and “F” no longer look much like outliers. With the added parameter of “average income”, all the data seem to fit a plane much more nicely. We did distill an important factor of this problem. On the other hand, it has been said, “Give me five parameters, I can fit an elephant!” The more parameters we use to build a model, the more nicely our data will fit the model. The limited information is inherent in the data we collected. If we try to make too many inferences from the data, we may simply beat the data to death. The more statements we make, the weaker they are.
The data resolution matrix of the plane-fitting problem is shown in Figure 1•38. Now, besides Location “D”, the importance of Locations “F” and “G” increases. They both extend the range of the data to cover locations that have a small population but a strong demand for sports cars. However, “F” and “G” are plagued by the same problem that “D” has, and the problem is even worse. When we examine row 6 and row 7, the entries for “F” and “G” always have about the same height. Both the off-diagonals and the diagonals in the data resolution matrix N are strongly coupled. This means that “F” and “G” cannot be resolved independently. This result is actually not surprising. After all, these two locations are small towns populated predominantly by high-income residents. The information from these two locations duplicates each other.
[3-D plot with axes Population (0 to 100000), Income (0 to 100000), and Number of Cars (0 to 1000).]
The diagonal of the unit covariance matrix for this case is {1.77863, 1.39737e-10, 3.18123e-10}T. Assuming the same large uncertainty for every datum collected as in the straight-line fitting problem (σd = ± 10; i.e., σd² = 100), the mapping of the variance from the data to the model parameters gives “a” = -183.606 ± 13.33, “b” = 0.00992712 ± 0.0001182, and “c” = 0.0066951 ± 0.0001784. The model we built appears to be quite robust even when it is subject to large errors or uncertainty in the collected data.
Computer Tomography
Computer Tomography (CT) is commonly used in medical practice today. In this section we study a much simplified hypothetical acoustic tomographic1 problem to show the essence of the mathematical treatment that CT uses. Sound devices are used instead of X-rays, and the source and receiver arrangement is very primitive compared with that of medical CT.
Define the slowness “s” as the inverse of the velocity “v”
s = 1 / v
1. problem after William Menke, 1984, “Geophysical Data Analysis: Discrete Inverse Theory”, Academic Press, Inc., p.11-
14 and p. 182-186.
Figure 1•38 Data resolution matrix for fitting a plane. (Surface plot over row number and column number 1 to 10; vertical scale 0 to 0.4.)
The problem illustrated in Figure 1•39 has 10 × 10 blocks of size h (= 1.0). The 91 unshaded blocks have low velocity (v = 10), and the 3 × 3 shaded blocks have high velocity (v = 12). The measurements are carried out either row-wise or column-wise. The travel time for the i-th measurement is
$$t_i = \sum_{j=0}^{9} h\, s_j$$ Eq. 1•54
Indices 0 to 9 denote the consecutive blocks along a row or a column, and “h” is the size of the blocks. We will therefore solve for the slowness “s” of each block, given the travel times “t” from the measurements.
There are 100 slowness variables, one for each block, and we have only twenty measurements (10 row-wise + 10 column-wise measurements). The problem is underdetermined. The 10 × 10 grid is dominated by low velocity blocks (v = 10), which completely surround the high velocity region. We work on a solution that is a perturbation to an entirely low velocity solution
$$s^{est} = s^{a\ priori} + A^{-g}\, \Delta t$$ Eq. 1•55
where ∆t = t - A s a priori. We define ∆s = s est - s a priori, where s a priori is a vector of length 100 with the slowness “0.1 = 1/10” in every entry (low velocity is the a priori information for all blocks), and t is a vector of length 20 holding the travel time measurements. The A^{-g} in Eq. 1•55 is defined as
$$A^{-g} = A^T [A A^T]^{-1}$$ Eq. 1•56
That is, Eq. 1•56 is the generalized inverse for the underdetermined problem. The derivation of Eq. 1•56 is discussed in the following. Since the problem is underdetermined, a common a priori assumption is that the solution is “simple”, in the sense that its squared Euclidean norm, ||∆s||² = ∆s^T ∆s = Σ ∆si², is minimized. The problem solved is a standard constrained optimization problem (detailed in Chapter 2):
minimize ||∆s||² subject to c = ∆t - A ∆s = 0
Figure 1•39 10x10 blocks with 91 unshaded blocks as low velocity (10) and 3x3 cross-hatched blocks as high velocity (12). Twenty travel time data are collected from either row-wise or column-wise arrangement of devices of source and receiver. (The blocks are numbered 0 to 99 row by row, ten per row; a source and a receiver are marked.)
An objective functional can be constructed using the Lagrange multiplier method (with “m” the number of solution variables and “n” the number of observations)
$$f(\Delta s) = \|\Delta s\|^2 + \lambda^T c = \sum_{i=1}^{m} \Delta s_i^2 + \sum_{i=1}^{n} \lambda_i \left( \Delta t_i - \sum_{j=1}^{m} A_{ij}\, \Delta s_j \right)$$ Eq. 1•57
where λ is the Lagrange multiplier. Take the derivative of the objective functional “f” with respect to ∆s and set this derivative to zero for minimization
$$\frac{\partial f}{\partial \Delta s_k} = \sum_{i=1}^{m} 2\, \frac{\partial \Delta s_i}{\partial \Delta s_k}\, \Delta s_i - \sum_{i=1}^{n} \lambda_i \sum_{j=1}^{m} A_{ij}\, \frac{\partial \Delta s_j}{\partial \Delta s_k} = 2\, \Delta s_k - \sum_{i=1}^{n} \lambda_i A_{ik} = 0$$ Eq. 1•58
As in any constrained optimization problem, the Lagrange multiplier λ can be eliminated from the equations. In matrix form, Eq. 1•58 reads 2 ∆s = A^T λ, and the constraint reads A ∆s = ∆t. Substituting ∆s from the first equation into the second equation yields
A A^T λ / 2 = ∆t
Hence, λ = 2 [A A^T]^{-1} ∆t, and λ is eliminated by substituting this back into the first equation
∆s = A^T [A A^T]^{-1} ∆t = A^{-g} ∆t
This is the proof of Eq. 1•56, which defines the generalized inverse of an underdetermined problem as A^{-g} = A^T [A A^T]^{-1}.
Program Listing 1•11 implements the tomographic problem. The kernel of this program is the class “Ray”. This class maps the blocks that the sound passes through to a global matrix according to Eq. 1•54. The first ten arguments are the block numbers. The last argument is the size of the block. The computation of the solution in terms of slowness is straightforward according to Eq. 1•55.
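The minimum-norm computation can also be sketched in plain C++. One caveat applies to this particular ray geometry: the ten row rays and the ten column rays each cover every block exactly once, so the sum of the row rays equals the sum of the column rays and the 20 × 20 matrix A A^T has rank 19 in exact arithmetic. The sketch therefore does not form an explicit inverse. As a substitute technique (not the solver used by the VectorSpace C++ Library), it solves [A A^T] λ = b by conjugate gradients started from zero and returns A^T λ, which is the minimum-norm solution for a consistent right-hand side. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

double dot(const Vector& a, const Vector& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// y = A^T p (one entry per block)
Vector apply_At(const Matrix& A, const Vector& p) {
    Vector y(A[0].size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < y.size(); ++j) y[j] += A[i][j] * p[i];
    return y;
}

// y = A x (one entry per ray)
Vector apply_A(const Matrix& A, const Vector& x) {
    Vector y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// 20 x 100 ray matrix: rays 0..9 run along rows, rays 10..19 along columns.
Matrix build_rays(double h) {
    Matrix A(20, Vector(100, 0.0));
    for (int i = 0; i < 10; ++i)
        for (int j = 0; j < 10; ++j) {
            A[i][i * 10 + j] = h;          // row ray i crosses block (i, j)
            A[10 + j][i * 10 + j] = h;     // column ray j crosses block (i, j)
        }
    return A;
}

// Minimum-norm solution of A x = b via conjugate gradients on [A A^T] lambda = b.
Vector min_norm_solution(const Matrix& A, const Vector& b) {
    assert(!A.empty() && A.size() == b.size());
    Vector lambda(b.size(), 0.0), r = b, p = b;
    double rr = dot(r, r);
    // absolute tolerance on the squared residual; adequate for data of order 0.01 to 1
    for (std::size_t it = 0; it < b.size() && rr > 1e-24; ++it) {
        const Vector Kp = apply_A(A, apply_At(A, p));   // [A A^T] p
        const double alpha = rr / dot(p, Kp);
        for (std::size_t i = 0; i < b.size(); ++i) {
            lambda[i] += alpha * p[i];
            r[i] -= alpha * Kp[i];
        }
        const double rr_new = dot(r, r);
        for (std::size_t i = 0; i < b.size(); ++i) p[i] = r[i] + (rr_new / rr) * p[i];
        rr = rr_new;
    }
    return apply_At(A, lambda);
}

// Row k of the model resolution matrix R = A^{-g} A (R is symmetric).
Vector resolution_row(const Matrix& A, std::size_t k) {
    Vector column(A.size());
    for (std::size_t i = 0; i < A.size(); ++i) column[i] = A[i][k];
    return min_norm_solution(A, column);
}
```

With the measurements of Listing 1•11 (∆t = -0.05 for rays 2, 3, 4, 12, 13, and 14), working the minimum-norm solution out by hand gives ∆s = -0.0085 inside the 3 × 3 region, that is, a slowness estimate of 0.0915 against the true value 1/12, and the diagonal entry of row 33 of the model resolution matrix is 0.19.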
    C0 ds = (~A)*(dT / (A*(~A)));                  // minimum Euclidean norm solution (slowness perturbation)
    cout.precision(6);                             // output slowness solution
    for(int i = 0; i < 10; i++) {
        for(int j = 0; j < 10; j++) cout << (0.1+ds[i*10+j]) << ", ";
        cout << endl;
    }
    C0 R = (~A)*(A*(~A)).inverse()*A;              // model resolution matrix
    // output row # 33 of the model resolution matrix,
    // with the values in the row rearranged into a matrix form (10x10)
    cout << "{ " << endl;
    for(int i = 0; i < 10; i++) {
        cout << "{";
        for(int j = 0; j < 10; j++) {
            if(j != 9) cout << (R[33][i*10+j]) << endl << ", ";
            else cout << (R[33][i*10+j]);
        }
        if(i != 9) cout << "}, " << endl;
        else cout << "}" << endl;
    }
    cout << "}" << endl;
    return 0;
}
Listing 1•11 Acoustic tomography with row and column measurements (project: “acoustic_tomography_1”).
where
R = A^{-g} A = A^T [A A^T]^{-1} A
is the model resolution matrix, which indicates the closeness of our solution ∆s est to the true solution ∆s true. As with the data resolution matrix, the diagonals of the model resolution matrix show the importance of a particular model parameter. If the model resolution matrix is close to I, the model parameters are nearly perfectly resolved. In Program Listing 1•11, we chose to print out the information corresponding to the center block of the high velocity region (block number 33 in Figure 1•39), that is, row number 33 of the model resolution matrix.
#include "include/vs.h"
#define ndf 100                                    // 100 blocks
#define neqn 20                                    // 20 measurements
static C0 A(neqn, ndf, (double*)0);                // global matrix
static int eqn = 0;                                // global matrix row index
class Ray {                                        // map blocks in the sound path to the global matrix
    C0 *the_A;
public:
    Ray(C0* A) : the_A(A) {}
    void add(int, int, int, int, int, int, int, int, int, int, double = 1.0);   // 10 block numbers, and grid size
};
void Ray::add(int i0, int i1, int i2, int i3, int i4, int i5, int i6, int i7, int i8, int i9, double h) {
    // map one ray to row "eqn" of the global matrix
    (*the_A)[eqn][i0] = (*the_A)[eqn][i1] = (*the_A)[eqn][i2] = (*the_A)[eqn][i3] =
    (*the_A)[eqn][i4] = (*the_A)[eqn][i5] = (*the_A)[eqn][i6] = (*the_A)[eqn][i7] =
    (*the_A)[eqn][i8] = (*the_A)[eqn][i9] = h;
    eqn++;                                         // increase global matrix row index
}
int main() {
    C0 dT(neqn, (double*)0);                       // measurements: ∆t
    Ray ray(&A);
    for(int i = 0; i < 10; i++)                    // add row paths
        ray.add(i*10, i*10+1, i*10+2, i*10+3, i*10+4,
                i*10+5, i*10+6, i*10+7, i*10+8, i*10+9);
    for(int i = 0; i < 10; i++)                    // add column paths
        ray.add(i, i+10, i+20, i+30, i+40,
                i+50, i+60, i+70, i+80, i+90);
    for(int i = 0; i < neqn; i++) dT[i] = 0.0;     // assign measurement values
    dT[2] = dT[3] = dT[4] = dT[12] = dT[13] = dT[14] = -0.05;
(*ray).add(0, sqrt2);
where sqrt2 has been defined by a macro definition to have the value of √2. The ray that passes through the NE-SW diagonal has 10 blocks on the way. It can be written as
with ten int values, all given the length of √2. This function is very versatile. In fact, if you begin to consider ray paths at angles other than those we have just taken (0°, 45°, 90°, and 135°), the distance that a ray travels through each block can differ from block to block, and a different distance needs to be specified for each block.
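The bookkeeping for the slanted rays can be sketched as follows. This is an alternative formulation in plain C++, not the interface of the class “Ray”: each ray is stored as a list of (block number, path length) pairs, so that rays crossing one block or ten blocks are handled by the same code. The names RayPath, diagonal_rays, and add_ray are hypothetical, and blocks are assumed to be numbered row by row as in Figure 1•39.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One ray: (block number, path length inside that block) for every block crossed.
using RayPath = std::vector<std::pair<int, double>>;

// All 19 diagonals of one direction on a 10 x 10 grid of blocks of size h.
// step = +1: NW-SE diagonals (column - row is constant along a ray);
// step = -1: NE-SW diagonals (column + row is constant along a ray).
std::vector<RayPath> diagonal_rays(int step, double h) {
    assert(step == 1 || step == -1);
    const int n = 10;
    const double len = h * std::sqrt(2.0);     // diagonal of one square block
    std::vector<RayPath> rays;
    for (int d = -(n - 1); d <= n - 1; ++d) {
        RayPath path;
        for (int row = 0; row < n; ++row) {
            // column crossed by this diagonal in the given row
            const int col = (step == 1) ? row + d : (n - 1) - row - d;
            if (col >= 0 && col < n) path.push_back({row * n + col, len});
        }
        rays.push_back(path);
    }
    return rays;
}

// Write one ray into row "eqn" of the global matrix.
void add_ray(std::vector<std::vector<double>>& A, std::size_t eqn,
             const RayPath& path) {
    for (const auto& segment : path) A[eqn][segment.first] = segment.second;
}
```

Each direction contributes 19 diagonals, so together with the 10 row rays and 10 column rays the global matrix has 10 + 10 + 19 + 19 = 58 rows. The middle entry of diagonal_rays(-1, h) is the NE-SW diagonal through blocks 9, 18, ..., 90, and its last entry is the corner ray that crosses block 0 only.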
The solution is shown on the left-hand side of Figure 1•41. Compared with the solution in Figure 1•40, the model has improved, at least incrementally (even better resolution can be obtained by adding more data to the solution procedure). The model resolution matrix shown on the right-hand side of Figure 1•41 shows that the overall importance has improved significantly. The value of the importance increases from “0.19” to “0.44”, more than a two-fold growth. However, the model resolution matrix also shows that the number of dependent blocks increases. The new pattern looks like the British national flag, with 45° and 135° stripes, corresponding to the newly added slanted measurement paths, superposed on the previous vertical/horizontal cross pattern.
Buckling of a Rod
Consider a beam1 distorted from its natural state with the application of a load value “P” as in Figure 1•42.
Figure 1•40 Solution of acoustic tomographic problem using minimum solution Euclidean norm. The high velocity area is smeared out to have two pairs of big side-lobes. The left-hand side is from the 33rd row of model resolution matrix. (Panels: true model of slowness; solution of slowness; row #33 from the model resolution matrix.)
1. For an introduction on Bernoulli-Euler beam theory see J.M. Gere and S.P. Timoshenko, 1984, “Mechanics of Materials”,
2nd ed., Wadsworth, Inc., California, p. 553-8, and p. 680 for a brief historical account.
#include "include/vs.h"
#define ndf 100                              // 100 blocks
#define neqn 58                              // 58 measurements
static C0 A(neqn, ndf, (double*)0);          // global matrix
static int eqn = 0;                          // global matrix row index
class Ray {                                  // class "Ray" maps blocks in the sound path to the global matrix
   C0 *the_A;
public:
   Ray(C0* A) : the_A(A) {}
   // 10 block numbers, and grid size
   void add(int, int, int, int, int, int, int, int, int, int, double = 1.0);
   // variable length of block numbers and grid size
   void add(int, double, int = -1, double = 0.0, int = -1, double = 0.0,
            int = -1, double = 0.0, int = -1, double = 0.0, int = -1, double = 0.0,
            int = -1, double = 0.0, int = -1, double = 0.0, int = -1, double = 0.0,
            int = -1, double = 0.0);
};
// add() used to map to global matrix
void Ray::add(int i0, int i1, int i2, int i3, int i4, int i5, int i6, int i7, int i8, int i9, double h) {
   (*the_A)[eqn][i0] = (*the_A)[eqn][i1] = (*the_A)[eqn][i2] = (*the_A)[eqn][i3] =
   (*the_A)[eqn][i4] = (*the_A)[eqn][i5] = (*the_A)[eqn][i6] = (*the_A)[eqn][i7] =
   (*the_A)[eqn][i8] = (*the_A)[eqn][i9] = h;
   eqn++;                                    // increase global matrix row index
}
// variable length of arguments version
void Ray::add(int i0, double h0, int i1, double h1, int i2, double h2, int i3, double h3,
              int i4, double h4, int i5, double h5, int i6, double h6, int i7, double h7,
              int i8, double h8, int i9, double h9) {
   (*the_A)[eqn][i0] = h0;
   if(i1 >= 0) (*the_A)[eqn][i1] = h1; if(i2 >= 0) (*the_A)[eqn][i2] = h2;
   if(i3 >= 0) (*the_A)[eqn][i3] = h3; if(i4 >= 0) (*the_A)[eqn][i4] = h4;
   if(i5 >= 0) (*the_A)[eqn][i5] = h5; if(i6 >= 0) (*the_A)[eqn][i6] = h6;
   if(i7 >= 0) (*the_A)[eqn][i7] = h7; if(i8 >= 0) (*the_A)[eqn][i8] = h8;
   if(i9 >= 0) (*the_A)[eqn][i9] = h9;
   eqn++;                                    // increase global matrix row index
}
void add_rowcol(Ray *ray, C0 *dT) {          // add blocks along rows and columns; 20 measurements
   // add row paths
   for(int i = 0; i < 10; i++)
      (*ray).add(i*10, i*10+1, i*10+2, i*10+3, i*10+4,
                 i*10+5, i*10+6, i*10+7, i*10+8, i*10+9);
   // add column paths
   for(int i = 0; i < 10; i++)
      (*ray).add(i, i+10, i+20, i+30, i+40,
                 i+50, i+60, i+70, i+80, i+90);
   // assign measurement values ∆t
   for(int i = 0; i < neqn; i++) (*dT)[i] = 0.0;
   (*dT)[2] = (*dT)[3] = (*dT)[4] = (*dT)[12] = (*dT)[13] = (*dT)[14] = -0.05;
}
#define sqrt2 1.414213562
void add_diagonals(Ray *ray, C0 *dT) {       // add 45° blocks; 38 measurements
   // add NE-SW ray paths (45° measurements)
   (*dT)[eqn] = 0.0; (*ray).add(0, sqrt2);
Listing 1•12 Acoustic tomography with row-wise and column-wise plus 45° and 135° measurements (project:
“acoustic_tomography_2”).
Let φ be the angle between the tangent of the rod and the horizontal axis, and v the lateral deflection. The
curvature of the rod is related to the bending moment “M” and the flexural rigidity of the beam “EI” as
$$\frac{d^2 v}{dx^2} = -\frac{M}{EI} \qquad \text{Eq. 1•61}$$
where “E” is Young’s modulus and “I” is the moment of inertia. For static equilibrium, M = Pv. We have
$$\frac{d^2 v}{dx^2} = -\frac{P}{EI}\, v \qquad \text{Eq. 1•62}$$
The length of the rod is L, and the boundary conditions at the ends are v(0) = v(L) = 0. We use finite differences
to approximate numerically the second-order derivative on the left-hand side of Eq. 1•62. The rod is subdivided
into “n” segments, and the size “h” of each segment is “h = L / n”. The central difference stencil for the
second-order derivative of v evaluated at node xi is
$$\left.\frac{d^2 v}{dx^2}\right|_{x = x_i} = \frac{v_{i+1} - 2v_i + v_{i-1}}{h^2} \qquad \text{Eq. 1•63}$$
Substituting Eq. 1•63 into Eq. 1•62, we have the standard matrix form of an eigenvalue problem
$$A v = \lambda v$$
This equation has the same form as Eq. 1•34 on page 37.
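The discrete eigenvalue problem can be checked independently of the library. The sketch below is a standalone illustration under stated assumptions, not the VectorSpace implementation: it takes P/(EI) as the eigenvalue of the matrix (1/h²)·tridiag(-1, 2, -1) on the n - 1 interior nodes, and it finds the smallest eigenvalue by inverse power iteration with a tridiagonal (Thomas) solve. The function name "smallest_buckling_eigenvalue" is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Smallest eigenvalue of A = (1/h^2) * tridiag(-1, 2, -1) on the n-1 interior nodes,
// estimated by inverse power iteration (each step solves A y = v).
double smallest_buckling_eigenvalue(int n, double length = 1.0, int iterations = 200) {
    assert(n >= 3);
    const int m = n - 1;                        // number of interior nodes
    const double h = length / n;
    const double diag = 2.0 / (h * h), off = -1.0 / (h * h);
    std::vector<double> v(m, 1.0), y(m), c(m), d(m);
    double lambda = 0.0;
    for (int it = 0; it < iterations; ++it) {
        // Thomas algorithm: forward elimination ...
        c[0] = off / diag;
        d[0] = v[0] / diag;
        for (int i = 1; i < m; ++i) {
            const double denom = diag - off * c[i - 1];
            c[i] = off / denom;
            d[i] = (v[i] - off * d[i - 1]) / denom;
        }
        // ... and back substitution.
        y[m - 1] = d[m - 1];
        for (int i = m - 2; i >= 0; --i) y[i] = d[i] - c[i] * y[i + 1];
        // Since y = A^{-1} v, the ratio (v.v)/(v.y) estimates the smallest eigenvalue.
        double vv = 0.0, vy = 0.0, yy = 0.0;
        for (int i = 0; i < m; ++i) { vv += v[i] * v[i]; vy += v[i] * y[i]; yy += y[i] * y[i]; }
        lambda = vv / vy;
        const double norm = std::sqrt(yy);
        for (int i = 0; i < m; ++i) v[i] = y[i] / norm;
    }
    return lambda;
}
```

For L = 1 and n = 40 the estimate is roughly 9.86, slightly below the continuous value π², and it agrees with the closed form (4/h²) sin²(πh/2) for this tridiagonal matrix.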
Program Listing 1•13 is the implementation of this problem with VectorSpace C++ Library. Class “FD” is the
finite difference stencil for Eq. 1•63, which is very similar to class “FD” in Program Listing 1•8. The
computation of eigenvalues and eigenvectors is straightforward, as has been discussed on page 37. Only the first three
eigenvalues and eigenvectors are reported to a file “rod.out”. The results of the eigen computation are shown in
Figure 1•43. These are actually eigenvalues ({λ} = {i²π²}, i = 1, 2, ...) and eigenfunctions ({sin iπx}, i = 1, 2, ...)
of the operator “-d²/dx²”. In the present case, the smallest eigenvalue and its eigenvector are the most stable. The
second and third modes on the right-hand side of Figure 1•43 are only attainable if the inflection points are
supported at the beginning of an incremental loading procedure. The branching diagram in the upper left-hand side
of Figure 1•43 is actually quite unrealistic, in that the eigenvalues exist only at some discrete values and the value
of v’ cannot be determined. This is caused by using Eq. 1•62, which is a linearization of a fully non-linear
problem¹ (Eq. 1•64).
The fully non-linear version of the branching diagram shows a pitchfork bifurcation². The eigenvalues of the
linearized problem (Eq. 1•62)³ are only the branching points of the fully non-linear problem (Eq. 1•64).
If the resultant matrix is symmetrical, the so-called generalized eigenvalue problem and quadratic eigenvalue
problem can be solved by the symmetrical eigenvalue solver provided by VectorSpace C++ Library. In many
practical engineering applications, the symmetrical solver is sufficient. An example of how a generalized eigen-
value problem can be solved using the symmetrical eigenvalue solver is shown on page 234.
1. I. Stakgold, 1979, “Green’s Functions and Boundary Value Problems”, John Wiley & Sons, New York, p. 572-576.
2. J.E. Marsden and T. J.R. Hughes, 1983, “Mathematical Foundations of Elasticity”, Prentice-Hall, Inc., Englewood Cliffs,
N.J., p.427-429.
3. I. Stakgold, 1979, “Green’s Functions and Boundary Value Problems”, John Wiley & Sons, New York, p. 584.
#include "include/vs.h"
#include "assert.h"
#include <iostream.h>
#include <fstream.h>
#include <stdlib.h>
#include <iomanip.h>
static ofstream ofs("rod.out", ios::out | ios::trunc);
#define n 40                                 // number of segments
#define P 1.0                                // loading
#define L 1.0                                // length of the rod
#define E 1.0                                // Young's modulus
#define I 1.0                                // moment of inertia
static C0 M(n-2, n-2, (double*)0);           // global matrix
static const double h = L / n;               // finite difference grid size
static const double k = - E * I / (P*h*h);
class FD {                                   // class of finite difference stencil
   C0 *the_M;
public:
   FD(C0 *M) : the_M(M) {}
   void add(int, int, int);
};
void FD::add(int i_center, int i_left, int i_right) {
   // (-1, 2, -1) finite difference stencil
   if(i_center >= 0) (*the_M)[i_center][i_center] += 2.0*k;
   if(i_left >= 0) (*the_M)[i_center][i_left] += -1.0*k;
   if(i_right >= 0) (*the_M)[i_center][i_right] += -1.0*k;
}
int main() {
   FD fd(&M);
   fd.add(0, -1, 1);                                    // first node
   for(int i = 1; i < n-3; i++) fd.add(i, i-1, i+1);    // middle nodes
   fd.add(n-3, n-4, -1);                                // last node
   Eigen a(M);
   C0 lambda = a.Eigenvalues(),                         // eigenvalues
      x = a.Eigenvectors();                             // eigenvectors
   // output first three eigenvalues and eigenvectors
   ofs << "lambda:" << lambda[0] << ", " << lambda[1] << ", " << lambda[2] << endl;
   for(int i = 0; i < 3; i++) ofs << x(i) << endl;
   ofs.close();
   return 0;
}
Listing 1•13 Finite difference implementation for buckling of rod eigenvalue analysis (project: “rod_buckling”)
[Figure 1•43 panels: branching diagram of v’ versus |λ| with λ1, λ2, and λ3 marked; plots of the first three eigenvectors over the nodes, with the inflection points marked.]
Figure 1•43 First three eigenvalues and eigenvectors of the rod buckling problem using
finite difference method.
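Scholastic Achievements Analysis:¹ Scores of 20 students over 6 subjects are given in TABLE 1•6. The covariance
matrix defined in Eq. 1•50 on page 72 is
$$\operatorname{cov}\, y = \operatorname{cov}(y_i, y_j) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left[ y_i - y_i^{Exp} \right] \left[ y_j - y_j^{Exp} \right] P(y)\, dy_1 \cdots dy_N$$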
1. A. Jennings and J.J. McKeown, 1992, “Matrix Computation”, 2nd ed., John Wiley & Sons, Inc., New York, p.181-184.
#include "include/vs.h"
#define N 20                                 // number of students
#define sub_no 6                             // number of subjects
static double s[N][sub_no] =
{ {50.0, 45.0, 41.0, 45.0, 46.0, 30.0}, {17.0, 29.0, 40.0, 43.0, 40.0, 22.0},
{60.0, 51.0, 49.0, 69.0, 58.0, 51.0}, {49.0, 70.0, 46.0, 63.0, 71.0, 44.0},
{64.0, 63.0, 52.0, 53.0, 68.0, 61.0}, {35.0, 55.0, 49.0, 44.0, 66.0, 47.0},
{51.0, 67.0, 48.0, 64.0, 64.0, 57.0}, {57.0, 69.0, 60.0, 52.0, 73.0, 60.0},
{99.0, 77.0, 58.0, 86.0, 81.0, 67.0}, {36.0, 60.0, 81.0, 74.0, 68.0, 52.0},
{43.0, 33.0, 40.0, 52.0, 45.0, 43.0}, {36.0, 39.0, 34.0, 55.0, 49.0, 44.0},
{37.0, 30.0, 47.0, 40.0, 52.0, 35.0}, {70.0, 48.0, 58.0, 55.0, 72.0, 60.0},
{78.0, 48.0, 60.0, 70.0, 61.0, 41.0}, {66.0, 72.0, 66.0, 73.0, 83.0, 66.0},
{49.0, 63.0, 55.0, 69.0, 55.0, 57.0}, {68.0, 50.0, 68.0, 62.0, 55.0, 47.0},
{43.0, 71.0, 68.0, 71.0, 71.0, 59.0}, {44.0, 34.0, 33.0, 51.0, 42.0, 53.0}};
int main() {
   C0 X(N, sub_no, s[0]), a(sub_no, (double*)0);      // matrix X, and average a
   for(int i = 0; i < sub_no; i++) {
      for(int j = 0; j < N; j++) a[i] += X[j][i];     // compute average
      a[i] /= (double)N;
      for(int j = 0; j < N; j++) X[j][i] -= a[i];     // Xj = Xj - a
   }
   C0 C = (~X)*X / (N-1.0);                           // C = (XT X) / (N-1)
   cout << C << endl;
   SVD svd(C);                                        // singular value decomposition
   C0 sigma = svd.Singularvalues(),
      x = svd.U();
   cout << sigma << endl;                             // {799.2, 178.4, 87.4, 58.9, 41.6, 16.0}T
   cout << x(0) << endl;                              // {.518, .474, .305, .360, .401, .351}T
   cout << x(1) << endl;                              // {-.815, .389, .350, .027, .235, .077}T
   return 0;
}
Listing 1•14 Principal components analysis of scholastic achievement of 20 students on 6 subjects (project:
“scholastic_achievements”).
of the total variance, respectively. The all-positive components of the first eigenvector reconfirm the tendency that a
student who does well in one subject will also do well in the others. The second eigenvector indicates, again, that
mathematics is the subject that has the largest variation of student performance.
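The centering and covariance steps of Listing 1•14 can be sketched without the library as follows. This is a minimal illustration with standard containers; the name "sample_covariance" and the "Matrix" alias are hypothetical and are not part of VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Sample covariance C = Xc^T Xc / (N - 1), where Xc is X with its column means removed.
// X is taken by value so that the caller's data is left untouched.
Matrix sample_covariance(Matrix X) {
    const std::size_t N = X.size();
    assert(N > 1);
    const std::size_t m = X[0].size();
    for (std::size_t j = 0; j < m; ++j) {
        double mean = 0.0;
        for (std::size_t i = 0; i < N; ++i) mean += X[i][j];
        mean /= static_cast<double>(N);
        for (std::size_t i = 0; i < N; ++i) X[i][j] -= mean;   // center column j
    }
    Matrix C(m, std::vector<double>(m, 0.0));
    for (std::size_t j = 0; j < m; ++j)
        for (std::size_t k = 0; k < m; ++k) {
            for (std::size_t i = 0; i < N; ++i) C[j][k] += X[i][j] * X[i][k];
            C[j][k] /= static_cast<double>(N - 1);
        }
    return C;
}
```

For the three samples {1, 2}, {3, 6}, {5, 10}, the column means are 3 and 6 and the covariance matrix is {{4, 8}, {8, 16}}. The second column is exactly twice the first, so the matrix has rank one and a single principal component carries all of the variance.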
Stock market¹ weekly return rates from five companies give a covariance matrix as
Program Listing 1•15 is the implementation of the singular value decomposition of this problem. The proportions of
the total variance in the first two eigenvalues are 2.857 / 5.084 = 56 %, and 0.809 / 5.084 = 16 %. The rest of the
eigenvalues sum to only 28% of the total variance. The first two eigenvectors are
x1 = {0.464, 0.457, 0.470, 0.421, 0.421}T, and x2 = {-0.240, -0.509, -0.260, 0.526, 0.582}T.
The first eigenvector is called the market component which is an equal weighted vector of the five stocks. This
component reflects the general condition of the market. The second eigenvector is called the industry component
which reflects the contrast between the chemical and the oil industries. Although this problem can be analyzed
using eigenvalue solution as well, in general the singular value decomposition is recommended for the principal
components analysis in statistics.
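Because the covariance matrix is symmetric with positive entries, its dominant component can also be approximated by plain power iteration. The sketch below is a standalone illustration and not the SVD class of VectorSpace C++ Library: the names "kStockCov", "Dominant", and "power_iteration" are hypothetical, and the matrix entries are the ones used in Program Listing 1•15.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Covariance matrix of the five stocks, copied from Program Listing 1•15.
const double kStockCov[5][5] = {{1.000, 0.577, 0.509, 0.387, 0.462},
                                {0.577, 1.000, 0.599, 0.389, 0.322},
                                {0.509, 0.599, 1.000, 0.436, 0.426},
                                {0.387, 0.389, 0.436, 1.000, 0.523},
                                {0.462, 0.322, 0.426, 0.523, 1.000}};

struct Dominant {
    double value;                   // estimate of the largest eigenvalue
    std::array<double, 5> vector;   // unit-length estimate of its eigenvector
};

// Power iteration: repeatedly apply A and renormalize, starting from a vector of ones.
Dominant power_iteration(const double (&A)[5][5], int iterations = 200) {
    assert(iterations > 0);
    std::array<double, 5> x;
    x.fill(1.0);
    double lambda = 0.0;
    for (int it = 0; it < iterations; ++it) {
        std::array<double, 5> y{};
        for (int i = 0; i < 5; ++i)
            for (int j = 0; j < 5; ++j) y[i] += A[i][j] * x[j];
        double xy = 0.0, xx = 0.0, yy = 0.0;
        for (int i = 0; i < 5; ++i) { xy += x[i] * y[i]; xx += x[i] * x[i]; yy += y[i] * y[i]; }
        lambda = xy / xx;                       // Rayleigh quotient of the current x
        const double norm = std::sqrt(yy);
        for (int i = 0; i < 5; ++i) x[i] = y[i] / norm;
    }
    return {lambda, x};
}
```

Calling "power_iteration(kStockCov)" should reproduce the market component reported above, an eigenvalue near 2.857 with nearly equal positive weights on the five stocks. The remaining components need deflation or a full decomposition, which is what the library solver provides.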
Mountain Morphology:² can be analyzed using singular value decomposition. An array of 14 mountain profiles
is digitized with 11 points on each profile, and the profiles are shown in Figure 1•44. The matrix S that contains this
digitized data is of size 14 × 11. S can be decomposed using singular value decomposition as
$$S = (U \Lambda) V^T = C F \qquad \text{Eq. 1•66}$$
1. example taken from B.N. Datta, 1995, “Numerical Linear Algebra and Applications”, Brooks/Cole Publishing Company,
p.399-400.
2. example data digitized from William Menke, 1984, “ Geophysical Data Analysis: Discrete Inverse Theory”, Academic
Press Inc., Orlando, p.167-170.
#include "include/vs.h"
int main() {
   double cv[5][5] = {{1.000, 0.577, 0.509, 0.387, 0.462},     // covariance matrix
                      {0.577, 1.000, 0.599, 0.389, 0.322},
                      {0.509, 0.599, 1.000, 0.436, 0.426},
                      {0.387, 0.389, 0.436, 1.000, 0.523},
                      {0.462, 0.322, 0.426, 0.523, 1.000}};
   C0 cm(5, 5, cv[0]);
   SVD a(cm);                                                  // singular value decomposition
   C0 sigma = a.Singularvalues(),
      U = a.U();
   cout << sigma << endl;                  // {2.857, 0.809, 0.540, 0.452, 0.343}T
   cout << U(0) << endl;                   // {0.464, 0.457, 0.470, 0.421, 0.421}T
   cout << U(1) << endl;                   // {-0.240, -0.509, -0.260, 0.526, 0.582}T
   return 0;
}
Figure 1•44 14 mountain profiles.
where C = U Λ is called factor loadings, and rows of F are the factors. Program Listing 1•14 implements
this problem using VectorSpace C++ Library.
The first three singular values in the percentage of the total are
8.56 / 16.4 = 52.2%, 2.41 / 16.4 = 14.7%, and 1.91 / 16.4 = 11.6%
These three components together account for 78.5 % of the total. The three dominant “factors” are shown in Fig-
ure 1•45. The first factor (52% strong) is called the “average mountain”, while the second and third factors are
called “skewness” and “sharpness”, respectively, for obvious reasons.
The first three components of “factor loadings” C = U Λ for each row represent the proportion of the first
three factors in each mountain profile. These three numbers of each mountain profile are plotted in a 3-D graph
as shown in Figure 1•46. This figure shows a quantitative method to analyze geometrical objects. For example,
Figure 1•45 Three dominant principal components from the 14 mountain profiles.
[3-D plot of the 14 mountain profiles, labeled 1 to 14; axes: F1 (average mountain), F2 (skewness), F3 (sharpness).]
Figure 1•46 The “factor loadings” of first three factors (F1, F2, and F3) of the 14 mountain profiles.
you can use this analysis to discriminate different mountain shapes caused by different erosional agents (ice,
water, or wind). Alternatively, you may want to differentiate shapes that are caused by the same agent but at dif-
ferent stages (from juvenile to mature stages) of geomorphological evolution.
On the other hand, since the first three factors account for most (78.5%) of the information in the 14 mountain
profiles, we can use the first three columns of the factor loadings and the first three factors to reproduce the
14 mountain profiles as
where superscript “3” denotes that the profile is composed of only the three most significant factors. The
reconstructed mountain profiles from the three dominant factors are plotted against the digitized profiles in Figure
1•47. A close fit is more likely to happen if the original mountain is close to the “average mountain”.
Figure 1•47 Mountain profiles shown in gray lines are reconstructed from three
dominant factors. They are plotted against the solid lines that represent the original
mountain profiles.
C1 and C2 spaces are continuous vector spaces which are differentiable up to order 1 and order 2, respec-
tively. Classes in C1 and C2 types enable the users of VectorSpace C++ Library to deal with numerically differ-
entiable objects. The applications in the subject of numerical optimization, constrained or unconstrained, can be
easily expressed with C1 and C2 types. C++ programs using VectorSpace C++ Library in this chapter are
projects in project workspace file “Cn.dsw” under directory “vs\ex\Cn”.
1. e.g., p.90 in W.L. Burke, 1985, “Applied differential geometry”, Cambridge University Press, Cambridge, U.K.
In the following we define the algebra of the C1 type. For “x”, “y”, and “z”, Tangent_Bundle objects of C1
type, with an abstract binary operator “◊”, such as in z = x ◊ y, TABLE 2•1 summarizes the algebra of four
concrete basic operators (in place of the abstract operator ◊).
The derivative for the multiplication operator in the third row simply follows the Leibniz rule in calculus: dz =
d(x y) = y dx + x dy. The Leibniz rule can also be applied to the division operator: d(x / y) = d(x * (1/y)) = (y dx - x
dy) / y². Other operators and transcendental functions can be defined accordingly as in calculus.
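The four rules can be illustrated with a small standalone structure that carries a base point and a one-dimensional tangent through each operation. This is only a sketch of the idea under simplifying assumptions (spatial dimension 1, plain doubles); "Dual", "variable", and "constant" are hypothetical names and are not the C1 class of VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a one-dimensional Tangent_Bundle.
struct Dual {
    double p;   // base point (value)
    double w;   // tangent (derivative with respect to the independent variable)
};

Dual variable(double value) { return {value, 1.0}; }   // dx/dx = 1
Dual constant(double value) { return {value, 0.0}; }   // derivative of a constant is 0

Dual operator+(Dual x, Dual y) { return {x.p + y.p, x.w + y.w}; }
Dual operator-(Dual x, Dual y) { return {x.p - y.p, x.w - y.w}; }
// Leibniz rule: d(x y) = y dx + x dy
Dual operator*(Dual x, Dual y) { return {x.p * y.p, y.p * x.w + x.p * y.w}; }
// Quotient rule: d(x / y) = (y dx - x dy) / y^2
Dual operator/(Dual x, Dual y) {
    assert(y.p != 0.0);
    return {x.p / y.p, (y.p * x.w - x.p * y.w) / (y.p * y.p)};
}
// A transcendental function follows the chain rule in the same way.
Dual sin(Dual x) { return {std::sin(x.p), std::cos(x.p) * x.w}; }
```

With "Dual x = variable(3.0)", the expression "x * x + constant(2.0) * x" evaluates to the base point 15 with tangent 8, that is, f(3) and f'(3) for f(x) = x² + 2x. Every intermediate result carries its own numerical derivative, which is the bookkeeping described in the text.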
It is obvious how a C1 type object works as a differentiable object. Since we keep track of the numerical values of
the base point, p, and the tangent vector, w, of a C1 type object throughout all kinds of operations, the resultant
numerical values of the base point and tangent vector of intermediate (temporary) objects are always available.
Inquiring on the numerical values of the tangent vector, w, of the C1 type object gives the derivative information we need.
From a reverse engineering point of view, when we want to do away with the C1 type object, this model has the
advantage that the computing algorithm is quite compatible with traditional FORTRAN or C programming. On
the other hand, the symbolic languages process the intermediate analytical expression by looking up their
“dictionary”, while deferring the evaluation of actual numerical values until it is explicitly requested by users. Therefore
the computing algorithm is completely different from that of FORTRAN or C. In retrospect, we note that the
deferred evaluation approach is known to have the advantage of fast response time in an interactive environment,
since unnecessary evaluations are sometimes avoided.
Constructors
A dedicated constructor for a C1 type Tangent_Bundle object can be written as (project: “c1_examples”)
A double constant “0.0” in line 1 is the argument passed to the dedicated constructor to be assigned as the value
of the base point. The Tangent_Bundle so constructed has a default spatial dimension of 1, and its derivative
(tangent vector) has the default value du = 1.0. The C0 converter “C1::operator C0()” in line 2, cast on “x”, is used
to retrieve the value of the base point of “x”. The free function “d(const C1&)” in line 3 can be used to retrieve
the value of the derivative (tangent vector). Both the casting operator and the derivative function can be used as
an “l-value”, put on the left-hand side to have a value assigned to it. The reason for the default value du = 1.0 is evident
when we consider using “x” as a variable. For example, use “x” as a variable to define a function “f”¹ (project:
“c1_examples”)
1. example taken from K.E. Gorlen, S.M.Orlow, and P.S. Plexico, 1991, “Data Abstraction and Object-Oriented Program-
ming in C++”, John Wiley & Sons Ltd, p.92-93.
The first argument of this dedicated constructor is a “const double&”, which specifies the value of the base point,
and the second argument is an “int”, which gives the number of spatial dimensions, that is, the dimension of the tangent
vector, w. By default, all components of the tangent vector are set to “0.0”.
The constant strings for Tangent_Bundle virtual constructor (use macro definition “TANGENT_BUNDLE”)
and autonomous virtual constructor are shown in the following box.
by reference
“C1&” C1 type Tangent_Bundle 1
“C1*” pointer to C1 type Tangent_Bundle 2
“double*, double*” base point, tangent vector, (spatial dim. = 1) 3
“double*, double*, int” base point, tangent vector, spatial dim. 4
by value
“int” spatial dim. 5
“const double&, const double&” base point, tangent vector, (spatial dim. = 1) 6
“const C0&, const C0&” base point, tangent vector, (spatial dim. = 1) 7
“const double*, const double*” base point, tangent vector, (spatial dim. = 1) 8
“const C0*, const C0*” base point, tangent vector, (spatial dim. = 1) 9
“const double*, const double*, int” base point, tangent vector, spatial dim. 10
“const C0*, const C0*, int” base point, tangent vector, spatial dim. 11
“const C1&” C1 type Tangent_Bundle 12
“const C1*” pointer to C1 type Tangent_Bundle 13
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
operator C0() casting operator; retrieve base point
arithmetic operators
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication by a scalar; scalar product of two Vectors
C0 operator / (const C0&) const division (by a Scalar or a Matrix only) return a Vector
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int col_length() const spatial dimension
C0& d() the first derivative; retrieve tangent vector
C0 pow(int) const power (applied to each element of the Vector)
C0 exp(const C0&) const exponent (applied to each element of the Vector)
C0 log(const C0&) const log (applied to each element of the Vector)
C0 sin(const C0&) const sin (applied to each element of the Vector)
C0 cos(const C0&) const cos (applied to each element of the Vector)
Partial listing of C1 type Tangent_Bundle class arithmetic operators, logic operators and functions.
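$$\frac{d\mathbf{f}}{d\mathbf{x}} = \frac{df_i}{dx_j} = \begin{bmatrix} \partial f_1 / \partial x_1 & \partial f_1 / \partial x_2 & \partial f_1 / \partial x_3 \\ \partial f_2 / \partial x_1 & \partial f_2 / \partial x_2 & \partial f_2 / \partial x_3 \\ \partial f_3 / \partial x_1 & \partial f_3 / \partial x_2 & \partial f_3 / \partial x_3 \end{bmatrix} \qquad \text{Eq. 2•1}$$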
Constructors
The dedicated constructor for the C1 type Vector_of_Tangent_Bundle class can be written as (project:
“c1_examples”)
“C1::C1(int, const double*)” is the variable dedicated constructor for the C1 type Vector_of_Tangent_Bundle
class. The example for using this variable dedicated constructor is a vector function f = {f1, f2, f3}T, which
depends on three independent variables x = {x1, x2, x3}T as¹
$$\begin{aligned} f_1 &= 16x_1^4 + 16x_2^4 + x_3^4 - 16 \\ f_2 &= x_1^2 + x_2^2 + x_3^2 - 3 \\ f_3 &= x_1^3 - x_2 \end{aligned} \qquad \text{Eq. 2•2}$$
Root finding of “f(x) = 0” for this non-linear problem can be done with an iterative algorithm. Consider the
approximation of the vector function f by Taylor expansion in the neighborhood of an initial value xi with its
increment dx as
where O(dx²) denotes the error of second order in dx or above. Neglecting higher-order errors in Eq. 2•3 for
small dx, we have
1. example taken from K.E. Gorlen, S.M. Orlow, and P.S. Plexico, 1991, “Data Abstraction and Object-Oriented Program-
ming in C++”, John Wiley & Sons Ltd, p.93-97.
#include "include/vs.h"
#define EPSILON 1.e-12
#define MAX_ITER_NO 10
int main() {
   double v[3] = {1.0, 1.0, 1.0};                   // initial values, x0 = {1.0, 1.0, 1.0}T
   C1 x(3, v), f(3, (double*)0);
   int count = 0;
   do {
      // f1 = 16 x1^4 + 16 x2^4 + x3^4 - 16
      f[0] = 16.0*x[0].pow(4) + 16.0*x[1].pow(4) + x[2].pow(4) - 16.0;
      // f2 = x1^2 + x2^2 + x3^2 - 3
      f[1] = x[0].pow(2) + x[1].pow(2) + x[2].pow(2) - 3.0;
      // f3 = x1^3 - x2
      f[2] = x[0].pow(3) - x[1];
      C0 dx = - ((C0)f) / d(f);                     // Eq. 2•4, dx = - f(xi) / f,x(xi)
      (C0) x += dx;                                 // update xi+1 = xi + dx
   } while(++count < MAX_ITER_NO &&
           (double)norm((C0)f) > EPSILON);          // check convergence condition
   if(count == MAX_ITER_NO)                         // if convergence failed, output residual norm
      cout << "Warning: convergence failed, residual norm: "
           << ((double)norm((C0)f)) << endl;
   else
      cout << "solution (" << count << "): " << ((C0)x) << endl;   // x = {0.877966, 0.676757, 1.33086}T
   return 0;
}
Listing 2•1 Solving a nonlinear vector of function using C1 type Vector_of_Tangent_Bundle class
(project: “newton_method_nd_root_finding”).
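For comparison, the same Newton iteration can be written without differentiable types by coding the Jacobian by hand. The sketch below is a standalone illustration, not the library implementation. It takes the third component as f3 = x1³ - x2, the form that the reported solution {0.877966, 0.676757, 1.33086}T satisfies, and the helper names ("residual", "jacobian", "solve3", "newton") are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

Vec3 residual(const Vec3& x) {
    return {16.0 * std::pow(x[0], 4) + 16.0 * std::pow(x[1], 4) + std::pow(x[2], 4) - 16.0,
            x[0] * x[0] + x[1] * x[1] + x[2] * x[2] - 3.0,
            x[0] * x[0] * x[0] - x[1]};
}

// Hand-coded Jacobian df_i/dx_j; this is what the tangent vectors supply automatically.
Mat3 jacobian(const Vec3& x) {
    Mat3 J;
    J[0] = {64.0 * std::pow(x[0], 3), 64.0 * std::pow(x[1], 3), 4.0 * std::pow(x[2], 3)};
    J[1] = {2.0 * x[0], 2.0 * x[1], 2.0 * x[2]};
    J[2] = {3.0 * x[0] * x[0], -1.0, 0.0};
    return J;
}

double norm2(const Vec3& v) { return std::sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2]); }

// Solve A d = b by Gaussian elimination with partial pivoting (A and b are copies).
Vec3 solve3(Mat3 A, Vec3 b) {
    for (int k = 0; k < 3; ++k) {
        int pivot = k;
        for (int r = k + 1; r < 3; ++r)
            if (std::fabs(A[r][k]) > std::fabs(A[pivot][k])) pivot = r;
        assert(A[pivot][k] != 0.0);
        std::swap(A[k], A[pivot]);
        std::swap(b[k], b[pivot]);
        for (int r = k + 1; r < 3; ++r) {
            const double m = A[r][k] / A[k][k];
            for (int c = k; c < 3; ++c) A[r][c] -= m * A[k][c];
            b[r] -= m * b[k];
        }
    }
    Vec3 d{};
    for (int r = 2; r >= 0; --r) {
        double s = b[r];
        for (int c = r + 1; c < 3; ++c) s -= A[r][c] * d[c];
        d[r] = s / A[r][r];
    }
    return d;
}

// Newton iteration: x <- x + dx with J(x) dx = -f(x).
Vec3 newton(Vec3 x, double eps = 1e-12, int max_iter = 50) {
    for (int it = 0; it < max_iter; ++it) {
        const Vec3 f = residual(x);
        if (norm2(f) <= eps) break;
        const Vec3 dx = solve3(jacobian(x), {-f[0], -f[1], -f[2]});
        for (int i = 0; i < 3; ++i) x[i] += dx[i];
    }
    return x;
}
```

Starting from {1.0, 1.0, 1.0}T, "newton" should approach the solution quoted in Listing 2•1. The hand-written "jacobian" function is exactly the part that the C1 type makes unnecessary, since "d(f)" returns the same matrix from the tangent vectors.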
[Figure: layout of “C1 x(3, v)” constructed from “double v[3] = {1.0, 1.0, 1.0}”. Each row i of x holds one base point entry (i) and three tangent vector entries (i0, i1, i2); “x[0]” selects row 0.]
by reference
“C1&” C1 type Vector_of_Tangent_Bundle —
“C1*” pointer to C1 type Vector_of_Tangent_Bundle —
“int, int, double*, double*” manifold dim, spatial dim, base point, tangent vector 14
by value
“int, int” manifold dim., spatial dim. 15
“int, int, const double*, const double*” base point, tangent vector, (spatial dim. = 1) 16
“const C0&, const C0&” base point (Vector), tangent vector (Matrix) —
“const C0*, const C0*” base point (Vector), tangent vector (Matrix) —
“const C1&” C1 type Vector_of_Tangent_Bundle —
“const C1*” pointer to C1 type Vector_of_Tangent_Bundle —
Strings in C1 virtual constructor for C1 type Vector_of_Tangent_Bundle class.
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
C0& operator [] (int) selector return Tangent_Bundle
operator C0() casting operator; retrieve base point
arithmetic operators
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication by a scalar or scalar product
C0 operator / (const C0&) const division (by a Scalar only)
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int row_length() const manifold dimension
int col_length() const spatial dimension
C0& d() the first derivative; retrieve tangent vector
C0 pow(int) const power (applied to each element of the Vector)
C0 exp(const C0&) const exponent (applied to each element of the Vector)
C0 log(const C0&) const log (applied to each element of the Vector)
C0 sin(const C0&) const sin (applied to each element of the Vector)
C0 cos(const C0&) const cos (applied to each element of the Vector)
Partial listing of Vector_of_Tangent_Bundle object arithmetic operators, logic operators and functions.
where the operator “ ⊗ ” for the tangent of tangent vector denotes tensor product.
For the purpose of a variable dedicated constructor, the default value of “ddu”(=0.0) is just the derivative of “du”
(= 1.0). For access to the second derivative information, free function “dd(const C2&)” (or “d2(const C2&)”) can
be used to retrieve the value of “ddu” (project: “c2_examples”).
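The way a second derivative is carried along can be sketched for spatial dimension 1 with a three-component structure. This is a hypothetical illustration and not the C2 class itself: "Jet2", "variable2", and "constant2" are assumed names. A variable starts with du = 1.0 and ddu = 0.0, as described above.

```cpp
#include <cassert>

// Hypothetical one-dimensional stand-in for a Tangent_of_Tangent_Bundle.
struct Jet2 {
    double v;    // base point
    double d;    // first derivative (tangent)
    double dd;   // second derivative (tangent of tangent)
};

Jet2 variable2(double value) { return {value, 1.0, 0.0}; }   // du = 1, ddu = 0
Jet2 constant2(double value) { return {value, 0.0, 0.0}; }

Jet2 operator+(Jet2 x, Jet2 y) { return {x.v + y.v, x.d + y.d, x.dd + y.dd}; }
Jet2 operator-(Jet2 x, Jet2 y) { return {x.v - y.v, x.d - y.d, x.dd - y.dd}; }
// (x y)' = x' y + x y'  and  (x y)'' = x'' y + 2 x' y' + x y''
Jet2 operator*(Jet2 x, Jet2 y) {
    return {x.v * y.v, x.d * y.v + x.v * y.d, x.dd * y.v + 2.0 * x.d * y.d + x.v * y.dd};
}
// For z = x / y, differentiate x = z y once and twice and solve for z' and z''.
Jet2 operator/(Jet2 x, Jet2 y) {
    assert(y.v != 0.0);
    const double z = x.v / y.v;
    const double dz = (x.d - z * y.d) / y.v;
    const double ddz = (x.dd - 2.0 * dz * y.d - z * y.dd) / y.v;
    return {z, dz, ddz};
}
```

With "Jet2 x = variable2(2.0)", the product "x * x * x" carries the value 8, the first derivative 12, and the second derivative 12, which are x³, 3x², and 6x at x = 2.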
For a spatial dimension greater than 1, we can write the dedicated constructor in the same way as that of the C1 type
Tangent_Bundle (project: “c2_examples”)
C2 y(3.0, 3);
cout << ((C0) y) << endl; // 3.0
cout << d(y) << endl; // {0.0, 0.0, 0.0}T
cout << dd(y) << endl; // {{0.0, 0.0, 0.0}, {0.0, 0.0, 0.0}, {0.0, 0.0, 0.0}}
The constant strings for C2 type Tangent_of_Tangent_Bundle virtual constructors (use macro definition
“TANGENT_OF_TANGENT_BUNDLE”) and autonomous virtual constructors are shown in the following box.
by reference
“C2&” C2 type Tangent_of_Tangent_Bundle 1
“C2*” pointer to C2 type Tangent_of_Tangent_Bundle 2
“double*, double*, double*” base point, tangent vector, tangent of tangent 3
vector, (spatial dim. = 1)
“double*, double*, double*, int” base point, tangent vector, tangent of tangent 4
vector, spatial dim.
by value
“int” spatial dim. 5
“const double&, const double&, base point, tangent vector, tangent of tangent 6
const double&” vector, (spatial dim. = 1)
“const C0&, const C0&, base point, tangent vector, tangent of tangent 7
const C0&” vector, (spatial dim. = 1)
“const double*, const double*, base point, tangent vector, tangent of tangent 8
const double*” vector, (spatial dim. = 1)
“const C0*, const C0*, base point, tangent vector, tangent of tangent 9
const C0*” vector, (spatial dim. = 1)
“const double*, const double*, base point, tangent vector, tangent of tangent 10
const double*, int” vector, spatial dim.
“const C0*, const C0*, base point, tangent vector, tangent of tangent 11
const C0*, int” vector, spatial dim.
“const C2&” C2 type Tangent_of_Tangent_Bundle 12
“const C2*” pointer to C2 type Tangent_of_Tangent_Bundle 13
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
operator C0() casting operator; retrieve base point
arithmetic operators
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication by a scalar; scalar product of two Vectors
C0 operator / (const C0&) const division (by a Scalar or a Matrix only) return a Vector
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int col_length() const spatial dimension
C0& d() the first derivative
C0& dd() or C0& d2() the second derivative
C0 pow(int) const power (applied to each element of the Vector)
C0 exp(const C0&) const exponent (applied to each element of the Vector)
C0 log(const C0&) const log (applied to each element of the Vector)
C0 sin(const C0&) const sin (applied to each element of the Vector)
C0 cos(const C0&) const cos (applied to each element of the Vector)
Partial listing of C2 type Tangent_of_Tangent_Bundle object arithmetic operators, logic operators and functions.
Constructors
A variable dedicated constructor for C2 type Vector_of_Tangent_of_Tangent_Bundle can be written as
(project: “c2_examples”)
Note that dd(x) returns a “Nominal_Submatrix” which can not be directed to iostreams. We must use primary
casting “+” to convert it into a Matrix. Without this conversion the program will throw an exception and stop.
This elliptic objective functional can be approximated by Taylor expansion to the second order as²
$$f(x) \cong f(x_i) + f_{,x}(x_i)\, dx + \frac{1}{2}\, dx^T H(x_i)\, dx \qquad \text{Eq. 2•6}$$
where f,x(xi) is the so-called Jacobian matrix, and H(xi) = f,xx(xi) is the so-called Hessian matrix. f(x) is
minimized if its first derivatives with respect to dx vanish. Therefore, if we take the derivatives of f(x) with respect to
dx, set them to zero, and then solve for dx, we obtain
The elliptic nature of the objective functional guarantees that the Hessian matrix can be inverted. Eq. 2•7 is
known as Newton’s formula, and xi+1 = xi+dx is the update for the algorithm. For an elliptic objective
functional such as Eq. 2•5, the approximation by Eq. 2•6 in quadratic form is exact, so one iteration gives the exact
answer. Program Listing 2•2 in the following implements the “classic Newton-Raphson method”, which can also be
used for less ideal cases in which the objective functionals are not exactly quadratic.
The minimum solution of this elliptic objective functional “f” is {2, 4}T, which is the center of the ellipse “f =
constant”. We can verify this immediately with analytical geometry. The Newton-Raphson iterative procedure in
this case achieves convergence in just one iteration from the initial point (0, 0) to the final solution point (2, 4)
(see Figure 2.5).
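The single Newton step can be verified by hand for the objective functional used in Listing 2•2, f(x1, x2) = 2x1² + x1x2 + x2² - 12x1 - 10x2. Its gradient is {4x1 + x2 - 12, x1 + 2x2 - 10}T and its Hessian is the constant matrix {{4, 1}, {1, 2}}. The sketch below is a standalone illustration with hand-coded derivatives; "gradient" and "newton_step" are hypothetical names, not library functions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Gradient of f(x1, x2) = 2 x1^2 + x1 x2 + x2^2 - 12 x1 - 10 x2.
Vec2 gradient(const Vec2& x) {
    return {4.0 * x[0] + x[1] - 12.0, x[0] + 2.0 * x[1] - 10.0};
}

// One Newton step x + dx with dx = -H^{-1} g, using the constant Hessian H = {{4, 1}, {1, 2}}.
Vec2 newton_step(const Vec2& x) {
    const double h00 = 4.0, h01 = 1.0, h11 = 2.0;
    const double det = h00 * h11 - h01 * h01;   // 7 > 0: H is positive definite (elliptic case)
    assert(det > 0.0);
    const Vec2 g = gradient(x);
    return {x[0] - (h11 * g[0] - h01 * g[1]) / det,
            x[1] - (-h01 * g[0] + h00 * g[1]) / det};
}
```

From the initial point (0, 0) the gradient is {-12, -10}T, and the step lands on (2, 4), where the gradient vanishes. A second step from (2, 4) leaves the point unchanged, as expected for an exactly quadratic functional.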
The constant strings for Vector_of_Tangent_of_Tangent_Bundle virtual constructors (use macro definition
“VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE”) and autonomous virtual constructors are shown in
the following box.
1. function without constrained conditions from D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-
Wesley Publishing Company, Inc., Reading, MA., p.426.
2. A similar equation is in p.225, Eq. 43 of D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley
Publishing Company, Inc., Reading, MA.
#include "include/vs.h"
#define EPSILON 1.e-12
#define MAX_ITER_NO 10
int main() {
   C2 x(2, (double*)0), f;                       // initial values, x0 = {0.0, 0.0}T
   C0 dx;                                        // increment; also reported after the loop
   int count = 0;
   do {
      f &= 2.0*x[0].pow(2) + x[0]*x[1] + x[1].pow(2) - 12.0*x[0]
           - 10.0*x[1];                          // f(x1, x2) = 2x1^2 + x1x2 + x2^2 - 12x1 - 10x2
      dx &= - d(f) / dd(f);                      // Eq. 2•7, dx = - f,x(xi) / f,xx(xi)
      (C0) x += dx;                              // update xi+1 = xi+dx
   } while(++count < MAX_ITER_NO &&
           (double)norm(d(f)) > EPSILON);        // check convergence condition
   if(count == MAX_ITER_NO)                      // if convergence failed, output dx norm
      cout << "Warning: convergence failed, increment norm: "
           << ((double)norm(dx)) << endl;
   else
      cout << "solution (" << count << "): " << ((C0)x) << endl;   // x = {2, 4}T
   return 0;
}
Listing 2•2 Minimization of an elliptic objective functional using Vector_of_Tangent_of_Tangent_Bundle
(project: “newton_method_optimization”).
[Figure 2.5: Newton-Raphson search path from the initial point (0, 0) to the solution (2, 4).]
by reference
   “C2&” | C2 type Vector_of_Tangent_of_Tangent_Bundle | -
   “C2*” | C2* type Vector_of_Tangent_of_Tangent_Bundle | -
   “int, int, double*, double*, double*” | manifold dim, spatial dim, base point, tangent vector, tangent of tangent vector | 14
by value
   “int, int” | manifold dim., spatial dim. | 15
   “int, int, const double*, const double*, const double*” | base point, tangent vector, tangent of tangent vector, (spatial dim. = 1) | 16
   “const C0&, const C0&, const C0&” | base point (Vector), tangent vector (Matrix), tangent of tangent vector (Submatrix) | -
   “const C0*, const C0*, const C0*” | base point (Vector), tangent vector (Matrix), tangent of tangent vector (Submatrix) | -
   “const C2&” | C1 type Vector_of_Tangent_of_Tangent_Bundle | -
   “const C2*” | C1* type Vector_of_Tangent_of_Tangent_Bundle | -
symbolic operators
C0& operator &= ( ) assignment by reference
C0& operator = ( ) assignment by value
C0& operator [] (int) selector; returns Tangent_of_Tangent_Bundle
operator C0() casting operator; retrieve base point
arithmetic operators
C0 operator + ( ) const positive (primary casting) unary
C0 operator - ( ) const negative unary
C0 operator + (const C0&) const addition
C0 operator - (const C0&) const subtraction
C0 operator * (const C0&) const multiplication by a scalar or scalar product
C0 operator / (const C0&) const division (by a Scalar only)
C0& operator += (const C0&) replacement addition
C0& operator -= (const C0&) replacement subtraction
C0& operator *= (const C0&) replacement multiplication (by a Scalar only)
C0& operator /= (const C0&) replacement division (by a Scalar only)
logic operators
int operator == (const C0&) const equal TRUE == 1
int operator != (const C0&) const not equal FALSE == 0
int operator >= (const C0&) const greater or equal
int operator <= (const C0&) const less or equal
int operator > (const C0&) const greater
int operator < (const C0&) const less
functions
int row_length() const manifold dimension
int col_length() const spatial dimension
C0& d() the first derivative; retrieve tangent vector
C0& dd() the second derivative; retrieve tangent of tangent vector
C0 pow(int) const power (applied to each element of the Vector)
C0 exp(const C0&) const exponent (applied to each element of the Vector)
C0 log(const C0&) const log (applied to each element of the Vector)
C0 sin(const C0&) const sin (applied to each element of the Vector)
C0 cos(const C0&) const cos (applied to each element of the Vector)
Partial listing of C2 type Vector_of_Tangent_of_Tangent_Bundle object arithmetic operators, logic operators and functions.
$$\begin{aligned}
2x_1 + x_2 + x_3 &\le 2 \\
x_1 + 2x_2 + 3x_3 &\le 5 \\
2x_1 + 2x_2 + x_3 &\le 6 \\
x_1 \ge 0,\ x_2 \ge 0,\ x_3 &\ge 0
\end{aligned}$$
The pre-processing step of linear programming is to (1) multiply the objective functional by “-1” to convert
the maximization problem into a minimization problem, and (2) transform the first three inequality constraints
into equality constraints by adding non-negative slack variables x4, x5, and x6. That is
$$\begin{aligned}
2x_1 + x_2 + x_3 + x_4 &= 2 \\
x_1 + 2x_2 + 3x_3 + x_5 &= 5 \\
2x_1 + 2x_2 + x_3 + x_6 &= 6 \\
x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0,\ x_4 \ge 0,\ x_5 \ge 0,\ x_6 &\ge 0
\end{aligned}$$
1. example taken from D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company,
Inc., Reading, MA., p.46.
$$\begin{aligned}
\text{minimize} \quad & f(x) = c_D^T x_D + c_B^T x_B \\
\text{subject to} \quad & D\, x_D + B\, x_B = b \\
& x_D \ge 0,\ x_B \ge 0
\end{aligned}$$
Here xD = {x1, x2, x3} are the non-basic variables and xB = {x4, x5, x6} are the basic variables. We choose the
slack variables as the initial basic variables for the obvious reason that the initial basic feasible solution
is clearly xB = {2, 5, 6}, without having to solve any set of equations. We can solve for xB from the equality
constraints as
$$x_B = B^{-1} b - B^{-1} D\, x_D$$
Substituting this into the objective functional gives
$$f(x) = (c_D^T - c_B^T B^{-1} D)\, x_D + c_B^T B^{-1} b \qquad \text{(Eq. 2•9)}$$
In Eq. 2•9 the “coefficient” of the non-basic variables xD is $r_D^T \equiv c_D^T - c_B^T B^{-1} D$. rD is known as
the relative cost, which measures the cost of a non-basic variable relative to the current basic variables. Negative
components of the relative cost in Eq. 2•9 decrease the value of the objective functional. The non-basic variable
corresponding to the most negative component of rD is brought into the basic set, so that the objective functional
decreases the most. Denote the column of D corresponding to the non-basic variable selected to enter the basic set
as “d”. Bringing this non-basic variable into the current basic set means that we move with p = - B-1d as the
search direction; that is, we move away from xB along x = xB + α p. The smallest non-negative component of the
vector α satisfying xB + α p = 0 (component by component) identifies the first basic variable in the current basic
set to reach zero (an adjacent extreme point along d). Hence, this variable is selected to leave the basic set. The
above process is repeated until no component of the relative cost rD is negative.
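The selection of the leaving variable described above is a ratio test. The following is a minimal sketch in plain C++ under stated assumptions: it does not use the VectorSpace library, `RatioTest` and `ratio_test` are hypothetical names, and the caller is assumed to have already formed p = - B-1d.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Result of the ratio test: which basic variable leaves, and the step length.
struct RatioTest {
    int leaving;    // -1 if no basic variable decreases along p
    double alpha;   // step at which x_B[leaving] reaches zero
};

// Given x_B >= 0 and the direction p = -B^{-1} d, find the smallest
// alpha >= 0 for which some component x_B[i] + alpha * p[i] becomes zero.
RatioTest ratio_test(const std::vector<double>& xB, const std::vector<double>& p) {
    assert(xB.size() == p.size());
    RatioTest best{-1, std::numeric_limits<double>::infinity()};
    for (std::size_t i = 0; i < xB.size(); ++i) {
        if (p[i] < 0.0) {   // only decreasing components can reach zero
            const double alpha = -xB[i] / p[i];
            if (alpha < best.alpha) best = {static_cast<int>(i), alpha};
        }
    }
    return best;
}
```

As a usage illustration, with the initial basis B = I and xB = {2, 5, 6}, suppose the column of x1, d = {2, 1, 2}, were chosen to enter. Then p = {-2, -1, -2}, the candidate steps are {1, 5, 3}, and the first basic variable (the slack x4) leaves at α = 1.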
The implementation of any non-trivial problem, such as the finite difference method discussed on page 67,
contains many logic steps that are not highly mathematical. In the finite difference method we created a class
“FD” to handle the mapping of the finite difference stencil to the global matrix using the concept of data
abstraction. Program Listing 2•3 (project: “linear_programming_basic_set”) implements the basic set method. In the
current basic set method we create a class “Basic_Set” to represent the basic and non-basic columns in the
constraint equations and the coefficients of the objective functional, as shown in Figure 2.6.
For the example problem, we write
C1 X(6, (double*)0), C(3, 6, (double*)0), f; // 6 variables, 3 constraints
C[0] = 2*X[0]+ X[1]+ X[2]+ X[3] ;
C[1] = X[0]+2*X[1]+3*X[2] + X[4] ;
Notice that both the constraint equations and the objective functional are declared as objects of C1 type. The
tangent vectors of the C1 type objects give the coefficients we need; i.e., the elements of A = d(C) and the
elements of cT = d(f) in Eq. 2•8. The class “Basic_Set” is initialized by calling the constructor
#include "include/vs.h"
class Basic_Set {                                 // class basic set
   C0 *_A, *_c;                                   // constraint, and coefficients of the objective functional
   int row_size, col_size, *_basic_order;
 public:
   Basic_Set(C0&, C0&);
   ~Basic_Set() { delete [] _basic_order; }
   C0& A() { return *_A; }
   C0& c() { return *_c; }
   int basic_order(int i) { return _basic_order[i]; }
   void swap(int, int);
};
Basic_Set::Basic_Set(C0& dC, C0& df) {
   row_size = dC.row_length(); col_size = dC.col_length();
   _basic_order = new int[col_size];              // initialize original variable order array
   for(int i = 0; i < col_size; i++) _basic_order[i] = i;
   _A = &dC; _c = &df;
}
void Basic_Set::swap(int i, int j) {
   int old_basic_order = _basic_order[i];         // swap order
   _basic_order[i] = _basic_order[j]; _basic_order[j] = old_basic_order;
   C0 old_Ai(row_size, (double*)0); old_Ai = (*_A)(i);   // swap columns
   (*_A)(i) = (*_A)(j); (*_A)(j) = old_Ai;
   C0 old_ci(0.0); old_ci = (*_c)[i];             // swap objective functional coefficients
   (*_c)[i] = (*_c)[j]; (*_c)[j] = old_ci;
}
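[Figure 2.6: the constraint matrix A = [D, B] partitioned into non-basic and basic columns, with columns swapped between the two sets.]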
The submatrices (D, B) and subvectors (cTD, cTB) in Eq. 2•8 can be written using referenced Matrix and refer-
enced Vector in VectorSpace C++ Library as
where the public member functions “Basic_Set::A()” and “Basic_Set::c()” provide access to the matrix “A” and
the vector “cT” referenced by the Basic_Set object. The most important function that the class “Basic_Set”
performs is to swap columns between the basic and non-basic sets. This is provided by the public member function
“Basic_Set::swap(int, int)”, whose two integer arguments indicate which two columns are to be swapped. A
private member integer array “_basic_order” keeps track of the original variable order. This original
variable order can be retrieved by using the public member function “Basic_Set::basic_order(int)”. Its integer
argument is the current column number.
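The bookkeeping performed by the swap can be illustrated independently of the VectorSpace library. The sketch below is an assumption-laden stand-in, not the class “Basic_Set” itself: `ColumnOrder` and `swap_columns` are hypothetical names, and a dense `std::vector` matrix replaces the referenced C0 objects.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the bookkeeping of Basic_Set: a dense matrix A, the cost
// vector c, and the original variable index of each current column.
struct ColumnOrder {
    std::vector<std::vector<double>> A;   // A[row][col]
    std::vector<double> c;                // objective coefficients, one per column
    std::vector<int> order;               // order[col] = original variable index

    ColumnOrder(std::vector<std::vector<double>> A_, std::vector<double> c_)
        : A(std::move(A_)), c(std::move(c_)), order(c.size()) {
        for (std::size_t j = 0; j < order.size(); ++j) order[j] = static_cast<int>(j);
    }

    // Exchange columns i and j of A, the matching entries of c,
    // and the record of the original variable order.
    void swap_columns(std::size_t i, std::size_t j) {
        assert(i < order.size() && j < order.size());
        for (auto& row : A) std::swap(row[i], row[j]);
        std::swap(c[i], c[j]);
        std::swap(order[i], order[j]);
    }
};
```

After any sequence of swaps, `order[col]` plays the role of “Basic_Set::basic_order(int)”: it maps a current column number back to the original variable.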
Program Listing 2•4 implemented the steps of the so-called revised simplex method in linear programming. 1
These steps are
The solution is x = {0.2, 0, 1.6, 0, 0, 4}T in the “standard form”, with the maximum value of the objective
functional being “5.4”; in the original form the solution is x = {0.2, 0, 1.6}T, neglecting the artificial
“slack variables”. The basic set method is the traditional simplex-tableau updating procedure, which can be
explained step by step with minimum mathematics.2 Next is the active set method for linear programming, which is
more readily modified for inequality constrained nonlinear programming.
1. p.60 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc., Read-
ing, MA.
2. see Chapter 3 in D.G. Luenberger, 1989, same as the above.
Listing 2•4 Basic set method for linear programming (project: “linear_programming_basic_set”).
Figure 2.7 An extremum point occurs where ∇f is a linear combination of ∇C.
With this relation between ∇f and ∇C at an extremum point, we introduce the Lagrange multiplier λ (an “m”
dimensional vector) as the coefficients of the linear combination of the components of ∇C that forms ∇f
$$\nabla f + \lambda^T \nabla C = 0 \qquad \text{(Eq. 2•10)}$$
In view of this, the Lagrangian functional, l(x, λ), can be introduced to represent the constrained optimization
problem as
$$l(x, \lambda) = f(x) + \lambda^T C(x) \qquad \text{(Eq. 2•11)}$$
For an extremum condition, setting the first-order derivatives of the Lagrangian functional Eq. 2•11 to zero gives
the Euler-Lagrange equations
$$l_{,x} = \nabla f + \lambda^T \nabla C = 0, \qquad l_{,\lambda} = C = 0 \qquad \text{(Eq. 2•12)}$$
This states that the first-order condition of the Lagrangian functional (1) is exactly Eq. 2•10, and (2) requires x to
stay on the constraint surface (C = 0). The second-order condition of the Lagrangian functional in the case of
the linear objective functional is just the Hessian H = f,xx being positive definite.
In the case of inequality constraints C ( x ) ≤ 0 , one can state that
In the first part of Eq. 2•13, the constraint is satisfied in the interior of a feasible region (Ci < 0). The constraint
is inactive by setting the corresponding Lagrange multiplier λi = 0. In the second part of Eq. 2•13, the constraint
is on the edge of the feasible region. The constraint is active (Ci = 0), and the corresponding Lagrange multiplier
This is the so-called Kuhn-Tucker condition. Although Eq. 2•14 is aesthetically more satisfying, we use Eq. 2•13
for practical coding of class “Active_Set”. The kernel of the problem is to compute the Lagrange multiplier
according to Eq. 2•10 as
where A denotes the active subset of C. When λi ≥ 0 for every “i” in the active set, the Kuhn-Tucker condition is
upheld, and the solution is optimal. Otherwise, select the constraint corresponding to the most negative λi, and
drop this constraint from the active set. The search direction p due to the deletion of this constraint satisfies1
where ei is the basis vector (with only the “i”-th component = 1 and 0 elsewhere). Eq. 2•16 means that the active
constraints other than the “i”-th one are to be strictly satisfied (= 0). We solve for p = - ei / A, and the next
solution is along the path x + α p. Therefore, the first constraint to be encountered (Ci = 0) along this search
path is the one with the minimum positive α satisfying Ci (x + α p) = 0. We have
min {αi = - Ci (xcurrent) / (d(C)i p), “i”-th constraint in the inactive set} Eq. 2•17
The constraint corresponding to this smallest positive αi (in the inactive set) will be added to the active set, A.
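Eq. 2•17 can be sketched in plain C++ for linear constraints. This is an illustration under stated assumptions rather than the “Active_Set” implementation: the constraints are taken to be Ci(x) = ai · x - bi ≤ 0, so that d(C)i = ai, the current point is assumed feasible, and `StepToConstraint` and `nearest_constraint` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct StepToConstraint {
    int index;      // inactive constraint met first along p, or -1 if none
    double alpha;   // step length at which it becomes active
};

// Linear constraints C_i(x) = a_i . x - b_i <= 0. For every inactive constraint
// that increases along p, compute alpha_i = -C_i(x) / (a_i . p) and keep the smallest.
StepToConstraint nearest_constraint(const std::vector<std::vector<double>>& a,
                                    const std::vector<double>& b,
                                    const std::vector<bool>& active,
                                    const std::vector<double>& x,
                                    const std::vector<double>& p) {
    assert(a.size() == b.size() && a.size() == active.size() && x.size() == p.size());
    StepToConstraint best{-1, std::numeric_limits<double>::infinity()};
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (active[i]) continue;
        double Ci = -b[i], slope = 0.0;
        for (std::size_t k = 0; k < x.size(); ++k) {
            Ci += a[i][k] * x[k];
            slope += a[i][k] * p[k];
        }
        if (slope > 0.0) {   // the constraint value grows toward zero along p
            const double alpha = -Ci / slope;
            if (alpha < best.alpha) best = {static_cast<int>(i), alpha};
        }
    }
    return best;
}
```

For example, with the constraints x1 + x2 - 4 ≤ 0, x1 - 3 ≤ 0, and -x1 ≤ 0, the point (1, 1), and the direction p = (1, 1), the first constraint is met at α = 1 and would be added to the active set.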
Program Listing 2•5 (project: “linear_programming_active_set”) implements the class “Active_Set”. The
criterion to determine whether a constraint is active or not is replaced by
where ε is a small positive number. The “Active_Set” keeps track of the “active state” of each constraint in the
original constraint equations. Upon calling the public member function “Active_Set::activate()”, the current
active set is assembled and a coefficient matrix is formed. The public member function
“Active_Set::active_state(int)” takes an integer argument, the order of a constraint in the original constraint
equations, and returns the order of that constraint in the current active set. The coefficient matrix can be
retrieved using the free function “d( )”, such as
1. p. 176 in P.E. Gill, W. Murray, and M.H. Wright, 1981, “Practical Optimization”, Academic Press, Inc., San Diego.
The problem has been recast to the standard form for the active set method. For problems with equality con-
straints the Active_Set constructor can be called as
A second integer argument indicates the number of equality constraints. These equality constraints are
always kept in the active set.
A minor technical detail of Active_Set is that the private data member “_active_state” is initialized to “-1”.
When a constraint is included in the active set, the value of its “_active_state” is set to the order of the
constraint in the current active set. When a constraint is dropped from the active set, the value
of its “_active_state” is set to “-2”, which means this particular constraint can never be activated again. This
treatment may avoid possible “zigzagging or jamming”, in which the search path is caught in an infinite loop1.
Program Listing 2•6 implements the active set algorithm. The core steps are
1. see p. 330, Chapter 11 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing
Company, Inc., Reading, MA.
Listing 2•6 Active set method for linear programming (project: “linear_programming_active_set”).
$$f(x_1, x_2) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2 \qquad \text{(Eq. 2•19)}$$
The unique minimum point (1, 1) is at a “banana-shaped valley”. For all problems in this section, the initial point
is selected at (-1.2, 1) such that an “intelligent” search path will have to make a turn along the banana-shaped val-
ley to arrive at the minimum point. This objective functional could be used to test the robustness of an algorithm.
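For experimenting with the algorithms of this section outside the VectorSpace library, Rosenbrock's function and its derivatives can be written out by hand. The following is a minimal sketch; `rosenbrock`, `rosenbrock_gradient`, and `rosenbrock_hessian` are hypothetical helper names, and the derivatives are obtained by direct differentiation of Eq. 2•19.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// f(x1, x2) = 100 (x2 - x1^2)^2 + (1 - x1)^2
double rosenbrock(const Vec2& x) {
    const double a = x[1] - x[0] * x[0], b = 1.0 - x[0];
    return 100.0 * a * a + b * b;
}

// Gradient {df/dx1, df/dx2}.
Vec2 rosenbrock_gradient(const Vec2& x) {
    const double a = x[1] - x[0] * x[0];
    return {-400.0 * x[0] * a - 2.0 * (1.0 - x[0]), 200.0 * a};
}

// Symmetric Hessian matrix of second derivatives.
Mat2 rosenbrock_hessian(const Vec2& x) {
    return {{{1200.0 * x[0] * x[0] - 400.0 * x[1] + 2.0, -400.0 * x[0]},
             {-400.0 * x[0], 200.0}}};
}
```

At the minimum point (1, 1) the function value and the gradient are both zero and the Hessian is positive definite, while at the initial point (-1.2, 1) the function value is 24.2.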
Figure 2.8 Rosenbrock’s function with minimum point at (1, 1); the initial point (-1.2, 1) is also marked.
1. from p. 96 in P.E. Gill, W. Murray, and M.H. Wright, 1981, “Practical Optimization”, Academic Press, Inc., San Diego.
#include "include/vs.h"
#define EPSILON 1.e-12
#define MAX_ITER_NO 20
int main() {
   double v[2] = {-1.2, 1.0}, energy_norm;       // initial values, x0 = {-1.2, 1}T
   C2 x(2, v), f;
   int count = 0;
   do {
      f &= 100.0*(x[1]-x[0].pow(2)).pow(2)+(1.0-x[0]).pow(2);   // f(x1, x2) = 100 (x2-x1^2)^2 + (1-x1)^2
      C0 dx = - d(f) / dd(f);                    // Eq. 2•7, dx = - f,x(xi) / f,xx(xi)
      (C0) x += dx;                              // update xi+1 = xi+dx
      energy_norm = norm(dx*(C0)f);              // energy norm = || dx f ||
   } while(++count < MAX_ITER_NO && energy_norm > EPSILON);
   if(count == MAX_ITER_NO)                      // if convergence failed, output energy norm
      cout << "Warning: convergence failed, energy norm: " << energy_norm << endl;
   else
      cout << "solution (" << count << "): " << ((C0)x) << endl;   // x = {1, 1}T
   return 0;
}
Listing 2•7 Minimization of Rosenbrock’s function using classic Newton’s method (project:
“newton_rosenbrock”).
Along p the solution is updated according to xi+1 = xi + α p, where α is a scalar parameter whose optimal value is
determined by a line search, or even by a scalar version of the classical Newton’s method (see the next section on
the steepest descent method). We consider bisection and golden section here. In a line search algorithm, the
minimum of a function is sought by evaluating the function and comparing its values at selected bracketing
points. The basic idea is to have the bracketing interval contain the point with the minimum function value and,
at the same time, to make the bracketing interval smaller and smaller in an iterative algorithm.
Given a bracketing interval [a, c] for the bisection method, the interval contains the point corresponding to the
minimum function value. At the middle of the interval is the point b = (a+c)/2. The next bracketing point x is
taken as the middle of [b, c]; i.e., x = (b+c)/2. If f(x) > f(b), the next bracketing interval is [a, x]; otherwise,
the next bracketing interval is [b, c]. Repeating this process, the bracketing interval becomes smaller and smaller.
In the worst scenario, the selected intervals always lie on the larger segments, and the bracketing interval
reduces at the rate of $0.75^{2n} = 0.5625^n$, where 2n is the number of repeated iterations. In the best
case it reduces at the rate of $0.25^{2n} = 0.0625^n$. In the average case, there is a 50% chance of selecting either
the larger or the smaller segment, so the reduction rate is $0.25^n \cdot 0.75^n = 0.1875^n$.
Golden section finds an optimal ratio that avoids the worst-case scenario of the bisection method. Consider
a triplet of points [a, b, c] with the ratio of interval [a, b] to interval [a, c] as α; the ratio of interval [b, c] to
interval [a, c] is then 1-α. The next bracketing point x lies to the right of b, with the ratio of interval [b, x] to
interval [a, c] as β. First, since after comparison of function values the selected bracketing point can be either b
or x, we demand symmetry of the two points by requiring that [a, x] (normalized length = α + β) and [b, c]
(normalized length = 1-α) be equal. Therefore,
$$\alpha + \beta = 1 - \alpha \qquad \text{(Eq. 2•21)}$$
Secondly, if x is to be selected as the next bracketing point, the ratio of interval [b, x] to interval [b, c] (= β/(1-α))
should be self-similar to the original ratio of interval [a, b] to interval [a, c] (= α). Therefore,
$$\frac{\beta}{1 - \alpha} = \alpha \qquad \text{(Eq. 2•22)}$$
Eliminating β from Eq. 2•21 and Eq. 2•22 gives
$$\alpha^2 - 3\alpha + 1 = 0 \qquad \text{(Eq. 2•23)}$$
The root of Eq. 2•23 that lies in [0, 1] is the ratio α = 0.381966, with the ratio of selecting the larger
segment as 1-α = 0.618034. Now, for the worst scenario the convergence rate improves from $0.75^{2n}$ to $0.618034^{2n}$.
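The golden section construction can be sketched as a short stand-alone routine. This is an illustrative sketch, not the code of the “newton_rosenbrock” project: `golden_section` is a hypothetical name, the function is assumed to be unimodal on the initial bracket [a, c], and the two interior points are kept at the ratios α and 1-α so that one of them is reused at every step.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Golden section search for the minimum of a unimodal function phi on [a, c].
double golden_section(const std::function<double(double)>& phi,
                      double a, double c, double tol = 1.e-8) {
    assert(a < c && tol > 0.0);
    const double alpha = (3.0 - std::sqrt(5.0)) / 2.0;   // root of alpha^2 - 3 alpha + 1 = 0 in [0, 1]
    double b = a + alpha * (c - a);   // interior point at ratio alpha
    double x = c - alpha * (c - a);   // symmetric interior point at ratio 1 - alpha
    double fb = phi(b), fx = phi(x);
    while (c - a > tol) {
        if (fx > fb) {                // keep [a, x]; the old b becomes the new x
            c = x; x = b; fx = fb;
            b = a + alpha * (c - a); fb = phi(b);
        } else {                      // keep [b, c]; the old x becomes the new b
            a = b; b = x; fb = fx;
            x = c - alpha * (c - a); fx = phi(x);
        }
    }
    return 0.5 * (a + c);
}
```

Each pass evaluates the function only once and shrinks the bracket by the factor 1-α, regardless of which side is kept. In a line search, `phi` would be the objective functional restricted to the search direction, φ(α) = f(x + α p).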
1. p. 350 for “bisection”, and p. 399 for “golden section”, in W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery,
1992, “Numerical Recipes”, Cambridge University Press, Cambridge, UK.
Listing 2•8 Minimization of Rosenbrock’s function using classic Newton’s method (project:
“newton_rosenbrock” with macro definition “__TEST_GOLDEN_SECTION” defined at compile time).
Figure 2.10 Golden section line search with Newton’s method for search direction.
p = - ∇f Eq. 2•24
That is, the search direction is taken along the negative gradient. This makes intuitive sense, since the
objective functional decreases in the direction of the negative gradient. The solution is updated
through xi+1 = xi + α p, where the scalar α is the line search parameter. We seek the value of α that gives the
minimum value of f along the search direction p. For this one-variable (α) optimization problem, the scalar version
of Newton’s method can be used to solve for the optimal value of α. We may replace it with a more primitive
method, such as the golden section line search1 described in the previous section.
#include "include/vs.h"
#define EPSILON 1.e-12
#define MAX_ITER_NO 20
int main() {
   double v[2] = {-1.2, 1.0};                    // initial values, x0 = {-1.2, 1}T
   C1 X(2, v);
   C0 dx;
   int count = 0;
   do {
      C1 f = 100.0*(X[1]-X[0].pow(2)).pow(2)+(1-X[0]).pow(2);   // f(x1, x2) = 100 (x2-x1^2)^2 + (1-x1)^2
      C0 g = d(f);                               // p = - g = - f,x(xi); the gradient
      C2 alpha(0.0);
      C0 d_alpha;
      do {                                       // line search along negative gradient
         C2 x0 = ((C0)X)[0] + alpha * -g[0],     // xi+1 = xi + α p
            x1 = ((C0)X)[1] + alpha * -g[1];
         C2 phi = 100*(x1-x0.pow(2)).pow(2)+(1-x0).pow(2);      // φ(xi+1)
         d_alpha &= - d(phi) / dd(phi);          // dα = -dφ / d2φ; Newton's formula
         ((C0)alpha) += d_alpha;                 // α += dα
      } while(((double)norm(d_alpha)) > EPSILON);
      dx &= ((C0)alpha)*(-g);                    // dx = α p
      ((C0)X) += dx;                             // update xi+1 = xi+dx
      cout << "solution " << (++count) << ": " << ((C0)X) << endl;
   } while(((double)norm(dx)) > EPSILON && count < MAX_ITER_NO);   // norm = || dx ||
   cout << "solution: " << ((C0)X) << endl;      // x = {1, 1}T
}
Listing 2•9 Minimization of Rosenbrock’s function using steepest descent method (project:
“steepest_descent”).
1. see p. 353 for bisection and p. 397 for golden section search in W.H. Press, S.A. Teukolsky, W.T. Vetterling, and
B.P. Flannery, 1992, “Numerical Recipes in C”, Cambridge University Press, Cambridge, U.K.
where “f” depends on 2 variables, or in the more general case is a multi-variable functional, while φ depends only
on the one variable α. We note the difference in cost between the multi-variable Newton’s method described on
page 110 and the one-parameter Newton’s method used for the line search here. The resultant search path of the
steepest descent method is shown in Figure 2.11. The path shows the typical zigzag pattern consisting
of alternating orthogonal search directions. The convergence rate is extremely slow. After 100 iterations the
solution is still at (0.6, 0.36). The convergence becomes ever slower as the true solution (1, 1) is approached.
At 9660 iterations, the solution is still at (0.999941, 0.999883). Although each iteration of the steepest descent
method is much cheaper than one of Newton’s method, Newton’s method takes only 6 iterations to get to (1, 1).
Figure 2.11 The search path of the steepest descent method up to 100 iterations.
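The contrast with Newton's method already appears on a well-conditioned quadratic. The sketch below applies steepest descent to the elliptic functional of Program Listing 2•2, written as f(x) = ½ xT H x - bT x with H = [[4, 1], [1, 2]] and b = {12, 10}. It is a plain C++ illustration under stated assumptions, not library code: `steepest_descent` is a hypothetical name, and the line search uses the closed-form minimizer α = (g · g) / (g · H g), which is valid only for a quadratic functional.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Steepest descent with exact line search on f(x) = 1/2 x^T H x - b^T x.
// Returns the number of iterations taken; x is updated in place.
int steepest_descent(Vec2& x, double tol = 1.e-10, int max_iter = 200) {
    const double H[2][2] = {{4.0, 1.0}, {1.0, 2.0}};
    const double b[2] = {12.0, 10.0};
    for (int k = 0; k < max_iter; ++k) {
        // gradient g = H x - b
        const Vec2 g = {H[0][0] * x[0] + H[0][1] * x[1] - b[0],
                        H[1][0] * x[0] + H[1][1] * x[1] - b[1]};
        const double gg = g[0] * g[0] + g[1] * g[1];
        if (std::sqrt(gg) < tol) return k;
        const Vec2 Hg = {H[0][0] * g[0] + H[0][1] * g[1],
                         H[1][0] * g[0] + H[1][1] * g[1]};
        // exact minimizer of phi(alpha) = f(x - alpha g) for a quadratic f
        const double alpha = gg / (g[0] * Hg[0] + g[1] * Hg[1]);
        x[0] -= alpha * g[0];
        x[1] -= alpha * g[1];
    }
    return max_iter;
}
```

Starting from (0, 0), the iterates approach the solution (2, 4) over many steps, whereas the Newton step reaches it in a single iteration because the Hessian captures the shape of the ellipse.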
dx = - f,x(xi) / f,xx(xi) = - g / H
where g (= f,x) is the gradient vector and H (= f,xx) is the Hessian matrix. Now, taking H = I, the above
equation becomes
dx = - g
that is, Newton’s method has degenerated into the steepest descent method. A weighted approach to combine these
two methods together is1
α is the line search parameter in the steepest descent method. However, the selection of the weighting parameter ε
is a matter of art. A more systematic, and probably more intelligent, way of implementing Eq. 2•25 is to use the
modified Cholesky decomposition2 introduced on page 32. The basic idea is to set M = H-1. Since the Hessian
matrix is symmetric, we can apply Cholesky decomposition. The problem with Newton’s method is that the
objective functional may not be quadratic and the Hessian matrix may not be positive definite. When we apply
the Cholesky decomposition, we modify small or negative diagonals according to
$$\bar{d} = \max\{d, \delta\}$$
where $\bar{d}$ is the modified diagonal and δ is a small positive number supplied to the modified Cholesky
decomposition on page 32. That is, the degeneration of Newton’s method to the steepest descent method occurs only
when the positive definiteness of the Hessian matrix is in question.
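The effect of clamping the diagonals can be seen on a 2 x 2 matrix. The following sketch is a simplified illustration and not the “Cholesky” class of the library: it factors the symmetric matrix as L D LT, replaces each pivot d by max{d, δ}, and solves for the search direction p = - M-1 g. The name `modified_newton_direction` is hypothetical, and the additional bound on off-diagonal growth used in the full Gill-Murray algorithm is omitted.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

using Vec2 = std::array<double, 2>;

// Search direction p = -M^{-1} g, where M = L D L^T is the factorization of the
// symmetric matrix [[h11, h12], [h12, h22]] with every pivot clamped to at least delta.
Vec2 modified_newton_direction(double h11, double h12, double h22,
                               const Vec2& g, double delta) {
    assert(delta > 0.0);
    const double d1 = std::max(h11, delta);                      // first pivot, clamped
    const double l21 = h12 / d1;                                 // unit lower-triangular factor
    const double d2 = std::max(h22 - l21 * l21 * d1, delta);     // second pivot, clamped
    // forward substitution L y = -g
    const double y1 = -g[0];
    const double y2 = -g[1] - l21 * y1;
    // diagonal scaling and back substitution L^T p = D^{-1} y
    const double p2 = y2 / d2;
    const double p1 = y1 / d1 - l21 * p2;
    return {p1, p2};
}
```

For a positive definite Hessian no pivot is modified and the Newton direction is returned unchanged. For an indefinite Hessian the clamped factorization is positive definite, so the returned p still satisfies g · p < 0 and remains a descent direction.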
Program Listing 2•10 implements the combined steepest descent and Newton method using modified Cholesky
decomposition. The search path is shown in Figure 2.12. It takes 13 iterations to get to the point (1, 1), about
twice as many iterations as the classic Newton’s method. However, the wild search path in Figure 2.9 has been
tamed successfully. The combined Newton and steepest descent method seems to be more robust than either the
classic Newton method or the steepest descent method.
1. p. 226-227 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
2. p. 108-111 in P.E. Gill, W. Murray, and M.H. Wright, 1981, “Practical Optimization”, Academic Press, Inc., San Diego.
#include "include/vs.h"
#define EPSILON 1.e-12
#define MAX_ITER_NO 20
int main() {
   double v[2] = {-1.2, 1.0};                    // initial values, x0 = {-1.2, 1}T
   C2 X(2, v);
   C0 dx;
   int k = 0;
   do {
      C2 f = 100.0*(X[1]-X[0].pow(2)).pow(2)+(1.0-X[0]).pow(2);   // f(x1, x2) = 100 (x2-x1^2)^2 + (1-x1)^2
      C0 g = d(f);
      Cholesky mcd(dd(f), EPSILON);              // modified Cholesky decomposition on M
      C0 p = mcd * (-g);                         // p = - g / M
      C2 alpha(0.0), x0, x1;
      C0 d_alpha;
      do {                                       // line search along p
         x0 &= ((C0)X)[0] + alpha * p[0]; x1 &= ((C0)X)[1] + alpha * p[1];   // xi+1 = xi + α p
         C2 phi = 100.0*(x1-x0.pow(2)).pow(2)+(1.0-x0).pow(2);   // φ(xi+1)
         if(fabs((double)dd(phi)) > EPSILON) d_alpha &= -d(phi) / dd(phi);   // dα = -dφ / d2φ; Newton's formula
         else break;
         ((C0)alpha) += d_alpha;                 // α += dα
      } while(((double)norm(d_alpha)) > 1.e-8);
      dx = ((C0)alpha)*p;                        // dx = α p
      ((C0)X) += dx;                             // update xi+1 = xi+dx
   } while(((double)norm(dx)) > EPSILON);        // norm = || dx ||
   cout << "final solution: " << ((C0)X) << endl;   // x = {1, 1}T
}
Listing 2•10 Minimization of Rosenbrock’s function using modified Cholesky decomposition (project:
“combined_newton_and_steepest_descent”).
Figure 2.12 Search path using modified Cholesky decomposition.
$$f(x) \cong \frac{1}{2}\, x^T H x - b^T x \qquad \text{(Eq. 2•26)}$$
The solution is sought along xi+1 = xi + α pi, where dx = α pi. Minimizing f(xi+1) with respect to dx (where xi+1
= xi + dx), we get
Since search directions pi are orthogonal to each other, we have (pi)TH pj = 0, for i ≠ j. From Eq. 2•28, we have
The conjugate direction pi+1 that is orthogonal to all its previous directions is taken as
Pre-multiplying Eq. 2•30 with (gi+1 - gi)T we have left-hand-side of Eq. 2•30 equal zero. In view of Eq. 2•29, βi
can be solved for as
Applying the orthogonal relations to Eq. 2•31 we have the Fletcher-Reeves formula
or Polak-Ribiere formula
Listing 2•11 Minimization of Rosenbrock’s function using conjugate gradient method (project:
“conjugate_gardient_method”).
1. p. 253 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
Figure 2.13 Conjugate gradient method using only first derivative information, and 30 iterations.
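On the quadratic model of Eq. 2•26 the conjugate gradient iteration can be sketched in a few lines of plain C++. This is an illustration under stated assumptions and not the code of Listing 2•11: it treats only the quadratic case with an exact line search, it uses the standard Fletcher-Reeves ratio β = (gi+1 · gi+1) / (gi · gi), and the names `dot`, `multiply`, and `conjugate_gradient` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

double dot(const Vec2& a, const Vec2& b) { return a[0] * b[0] + a[1] * b[1]; }

Vec2 multiply(const Mat2& M, const Vec2& v) {
    return {M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]};
}

// Conjugate gradient for f(x) = 1/2 x^T H x - b^T x with H symmetric positive
// definite. Returns the number of iterations taken; x is updated in place.
int conjugate_gradient(const Mat2& H, const Vec2& b, Vec2& x,
                       double tol = 1.e-10, int max_iter = 10) {
    const Vec2 Hx = multiply(H, x);
    Vec2 g = {Hx[0] - b[0], Hx[1] - b[1]};   // gradient g = H x - b
    Vec2 p = {-g[0], -g[1]};                 // first direction: steepest descent
    for (int k = 0; k < max_iter; ++k) {
        const double gg = dot(g, g);
        if (std::sqrt(gg) < tol) return k;
        const Vec2 Hp = multiply(H, p);
        const double pHp = dot(p, Hp);
        assert(pHp > 0.0);                   // holds when H is positive definite
        const double alpha = gg / pHp;       // exact line search along p
        x = {x[0] + alpha * p[0], x[1] + alpha * p[1]};
        const Vec2 g_new = {g[0] + alpha * Hp[0], g[1] + alpha * Hp[1]};
        const double beta = dot(g_new, g_new) / gg;   // Fletcher-Reeves formula
        p = {-g_new[0] + beta * p[0], -g_new[1] + beta * p[1]};
        g = g_new;
    }
    return max_iter;
}
```

For a quadratic functional in n variables the conjugate directions span the space after n steps, so with H = [[4, 1], [1, 2]] and b = {12, 10} the iteration reaches (2, 4) in two steps up to rounding. For a non-quadratic functional such as Rosenbrock's function, the gradient and the line search must be re-evaluated numerically at every step.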
Quasi-Newton Method
The advantage of both the steepest descent method and the conjugate gradient method is that they require only
first derivative information. This is important especially for problems with a large number of variables. However,
we saw that the classic Newton’s method and its modification, which use second derivative information, enjoy a
faster convergence rate. The strategy is to stay with a first derivative method for reasons of economy.
Since along the iterative steps we have two sequences of first derivative information, {x0, x1, ... , xi} and {g0, g1,
Eq. 2•34 is Hi pi = qi, which is also known as the quasi-Newton condition. Then pi = qi / Hi = Bi qi, where
Bi = (Hi)-1 is the inverse of the Hessian. We seek a rank-one update formula of the form
$$B^{i+1} = B^i + u \otimes v \qquad \text{(Eq. 2•35)}$$
where u and v are vectors, which need further constraints. Enforcing the quasi-Newton condition first, it can be
shown that the second term is
$$u \otimes v = \frac{(p^i - B^i q^i) \otimes v}{v \cdot q^i} \qquad \text{(Eq. 2•36)}$$
Bi+1 in Eq. 2•35 satisfies the quasi-Newton condition but is not symmetric. We can symmetrize it as (denoted with
superscript “s”)
However, (Bi+1)s may not satisfy the quasi-Newton condition. We can apply the above two steps, (1) the quasi-Newton
condition and (2) symmetrization, repeatedly to yield a sequence of Bi+1. The limit of the sequence gives the
updating formula known as the Davidon-Fletcher-Powell (DFP) method
$$B^{i+1}_{DFP} = B^i + \frac{p^i \otimes p^i}{p^i \cdot q^i} - \frac{(B^i q^i) \otimes (B^i q^i)}{q^i \cdot (B^i q^i)} \qquad \text{(Eq. 2•38)}$$
This is a rank-two update formula. If the update is performed on the Hessian H itself instead of its inverse B, we
have a complementary formula by substituting B for H, and p for q and vice versa. Then, taking inverse of this
expression gives the alternative Broyden-Fletcher-Goldfarb-Shanno (BFGS) updating formula
$$B^{i+1}_{BFGS} = B^i + \left(1 + \frac{q^i \cdot (B^i q^i)}{q^i \cdot p^i}\right) \frac{p^i \otimes p^i}{p^i \cdot q^i} - \frac{p^i \otimes (B^i q^i) + (B^i q^i) \otimes p^i}{q^i \cdot p^i} \qquad \text{(Eq. 2•39)}$$
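The BFGS update of the inverse Hessian can be written out for the two-variable case and checked against the quasi-Newton condition. The following plain C++ sketch is an illustration, not the library code: `dot`, `multiply`, and `bfgs_update` are hypothetical names, B is assumed symmetric, and the curvature condition q · p > 0 is assumed to hold.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

double dot(const Vec2& a, const Vec2& b) { return a[0] * b[0] + a[1] * b[1]; }

Vec2 multiply(const Mat2& M, const Vec2& v) {
    return {M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]};
}

// BFGS update of the inverse Hessian approximation B, given the step
// p = x_{i+1} - x_i and the gradient change q = g_{i+1} - g_i.
Mat2 bfgs_update(const Mat2& B, const Vec2& p, const Vec2& q) {
    const double qp = dot(q, p);
    assert(qp > 0.0);   // curvature condition; keeps the update well defined
    const Vec2 Bq = multiply(B, q);
    const double scale = (1.0 + dot(q, Bq) / qp) / qp;
    Mat2 B_new = B;
    for (int r = 0; r < 2; ++r)
        for (int c = 0; c < 2; ++c)
            B_new[r][c] += scale * p[r] * p[c] - (p[r] * Bq[c] + Bq[r] * p[c]) / qp;
    return B_new;
}
```

Multiplying the updated matrix by q returns p, which is the quasi-Newton condition, and the result stays symmetric whenever B is symmetric.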
Program Listing 2•12 implemented the quasi-Newton method. The basic steps are1
1. p. 265-267 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
Figure 2.14 Searching path of the BFGS method. The solution point (1, 1) is arrived at after 34 iterations.
#include "include/vs.h"
int main() {
   double v[2] = {-1.2, 1.0};                    // initial values, x0 = {-1.2, 1}T
   const double EPSILON = 1.e-12;
   const int MAX_NO_OF_ITERATION = 100;
   int k = 0;
   C0 dx;
   C1 X(2, v);
   C2 x(2, v);
   do {
      ((C0)x) = ((C0)X);
      C2 F = 100.0*(x[1]-x[0].pow(2)).pow(2)+(1.0-x[0]).pow(2);   // f(x1, x2) = 100 (x2-x1^2)^2 + (1-x1)^2
      C0 B = dd(F).inverse();                    // B = H-1, initial inverse of Hessian
      for(int i = 0; i < 2; i++) {
         C1 f = 100.0*(X[1]-X[0].pow(2)).pow(2)+(1.0-X[0]).pow(2);
         C0 g = d(f), dir = B*(-g);              // search direction d = - B g
         C2 alpha(1.0), x0, x1;
         C0 d_alpha;
         do {                                    // line search along d
            x0 &= ((C0)X)[0] + alpha * dir[0];   // xi+1 = xi + α d
            x1 &= ((C0)X)[1] + alpha * dir[1];
            C2 phi = 100.0*(x1-x0.pow(2)).pow(2)+(1.0-x0).pow(2);   // φ(xi+1)
            if(fabs((double)dd(phi)) > EPSILON) d_alpha &= -d(phi) / dd(phi);   // dα = -dφ / d2φ; Newton's formula
            else break;
            ((C0)alpha) += d_alpha;              // α += dα
         } while(((double)norm(d_alpha)) > 1.e-8);
         dx &= ((C0)alpha)*dir;                  // dx = α d
         ((C0)X) += dx;                          // update xi+1 = xi+dx
         if(((double)norm(dx)) > EPSILON) {      // norm = || dx ||
            C1 f1 = 100*(X[1]-X[0].pow(2)).pow(2)+(1.0-X[0]).pow(2);
            C0 g1 = d(f1), p = dx, q = g1 - g, Bq = B * q;   // pi = dx, qi = gi+1 - gi
            B += (1.0+(q*Bq)/(q*p))*((p%p)/(p*q)) - (p%Bq+Bq%p)/(q*p);   // BFGS updating formula Eq. 2•39
         }
         cout << "solution(" << (++k) << "): " << ((C0)X) << endl;
      }
   } while(k < MAX_NO_OF_ITERATION && ((double)norm(dx)) > EPSILON);
   cout << "Final solution: " << ((C0)X) << endl;   // x = {1, 1}T
}
Listing 2•12 Minimization of Rosenbrock’s function using BFGS method (project: “quasi_newton_bfgs”).
where “B” and “D” split the basic and non-basic sets of the constraint coefficients and variables. From the con-
straint equations we have
xB = B-1 b - B-1 D xD
The reduced gradient rT is the derivative of the objective functional f with respect to xD as
$$r^T = \nabla_{x_B} f(x_B, x_D)\, \frac{d x_B}{d x_D} + \nabla_{x_D} f(x_B, x_D) = \nabla_{x_D} f(x_B, x_D) - \nabla_{x_B} f(x_B, x_D)\, B^{-1} D = c_D - c_B B^{-1} D \qquad \text{(Eq. 2•40)}$$
where $c_B = \nabla_{x_B} f(x_B, x_D)$ and $c_D = \nabla_{x_D} f(x_B, x_D)$. The “Basic_Set” class needs some
modification for nonlinear expressions, as shown in Program Listing 2•13. The coefficient vector of the objective
functional, cT = ∇f, and the coefficient matrix of the constraint equations, A = [B, D] = dC, are now accessed
through the free functions “C0& df(Basic_Set&)” and “C0& dC(Basic_Set&)”, which are declared as friends of class
“Basic_Set”. The pointer to the variables is also declared as a private data member of “Basic_Set”, and the
variables can be accessed by the member function “C0& X()”.
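Eq. 2•40 can be evaluated by hand for a small problem. The sketch below does so in plain C++ for two basic and two non-basic variables; it is an illustration rather than library code, `reduced_gradient` is a hypothetical name, and the 2 x 2 system is solved in closed form instead of calling a general inverse.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Reduced gradient r^T = c_D - c_B B^{-1} D (Eq. 2•40) for a 2 x 2 basis B.
Vec2 reduced_gradient(const Mat2& B, const Mat2& D, const Vec2& cB, const Vec2& cD) {
    const double det = B[0][0] * B[1][1] - B[0][1] * B[1][0];
    assert(det != 0.0);   // the basis matrix must be invertible
    // row vector y^T = c_B^T B^{-1}, using the closed-form 2 x 2 inverse
    const double y0 = (cB[0] * B[1][1] - cB[1] * B[1][0]) / det;
    const double y1 = (cB[1] * B[0][0] - cB[0] * B[0][1]) / det;
    return {cD[0] - (y0 * D[0][0] + y1 * D[1][0]),
            cD[1] - (y0 * D[0][1] + y1 * D[1][1])};
}
```

As a usage illustration, take the data of Program Listing 2•14: constraints 2x1 + x2 + x3 + 4x4 = 7 and x1 + x2 + 2x3 + x4 = 6, objective f = x1^2 + x2^2 + x3^2 + x4^2 - 2x1 - 3x4, and the initial feasible point {2, 2, 1, 0}. With x1 and x2 as basic variables, B = [[2, 1], [1, 1]], D = [[1, 4], [2, 1]], and ∇f = {2, 4, 2, -3}, so cB = {2, 4} and cD = {2, -3}. The function then returns r = {-8, -1}; both components are negative, so increasing either non-basic variable decreases f.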
We consider a specific example1 already written in standard form as
1. p. 347 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
#include "include/vs.h"
class Basic_Set {                                 // class basic set
   C1 *_A, *_c, *_X;                              // constraint, coefficients of the objective functional, and variables
   int row_size, col_size, *_basic_order;
 public:
   Basic_Set(C1&, C1&, C1&);
   ~Basic_Set() { delete [] _basic_order; }
   int basic_order(int i) { return _basic_order[i]; }
   void swap(int i, int j);
   C0& X();
   friend C0& dC(Basic_Set&);
   friend C0& df(Basic_Set&);
};
Listing 2•13 class Basic_Set data abstraction for nonlinear problem (project: “reduced_gradient”).
Program Listing 2•14 implements the reduced gradient method using class “Basic_Set” in Program Listing
2•13. The basic steps are
The difference from the linear programming version is evident now that the fundamental theorem of linear pro-
gramming is not applicable any more; i.e., the extremum value may occur in the middle of an edge or even in the
#include "include/vs.h"
int main() {
   // initial feasible point, x0 = {2, 2, 1, 0}T
   double rhs[2] = {7.0, 6.0}, v[4] = {2.0, 2.0, 1.0, 0.0}, norm_dxd,
          EPSILON = 1.e-12, HUGE = 1.e20, RELAXED = 1.e3;
   int k = 0, MAX_NO_OF_ITER = 10;
   // constraints and objective functional
   C1 X(4, v), C = VECTOR_OF_TANGENT_BUNDLE("int, int", 2, 4);
   C[0] = 2*X[0]+ X[1]+ X[2]+4*X[3];                 // C[0] = 2x1 + x2 + x3 + 4x4
   C[1] = X[0]+ X[1]+2*X[2] + X[3];                  // C[1] = x1 + x2 + 2x3 + x4
   // f(x) = x1^2 + x2^2 + x3^2 + x4^2 - 2x1 - 3x4
   C1 f = X[0].pow(2)+X[1].pow(2)+X[2].pow(2)+X[3].pow(2)-2*X[0]-3*X[3];
   Basic_Set BS(C, f, X);
   // A = [B, D], cT = [cBT, cDT]
   C0 B(2, 2, dC(BS), 0, 0), D(2, 2, dC(BS), 0, 2), c_B(2, df(BS), 0), c_D(2, df(BS), 2);
   // xT = [xBT, xDT], bT = {7, 6}T
   C0 X_B(2, BS.X(), 0), X_D(2, BS.X(), 2), b(2, rhs), x(4, (double*)0);
   do {
      C0 d_X_B(2, (double*)0),d_X_D(2, (double*)0),   // ∆xB, ∆xD
         B_inv = B.inverse(), r_D = c_D - c_B * B_inv * D;   // Step 1: rT = cD - cB B^-1 D
      for(int i = 0; i < 2; i++)                      // Step 2: ∆xDi = -ri, otherwise ∆xDi = 0
         if((double) r_D[i] < -EPSILON || (double) X_D[i] > EPSILON) d_X_D[i] = - r_D[i];
         else d_X_D[i] = 0.0;
      if((norm_dxd = norm(d_X_D)) > RELAXED*EPSILON) {   // Step 3: if not ∀ ∆xDi = 0
         d_X_B = - B_inv * D * d_X_D;                 // ∆xB = −B^-1 D ∆xD
         // Step 4: max { αB: xB+αB ∆xB ≥ 0 }, max { αD: xD+αD ∆xD ≥ 0 }
         double alpha_B=HUGE,alpha_D=HUGE,ratio_B,ratio_D;int min_B=-1,min_D=-1;
         for(int i = 0; i < 2; i++) if((double)d_X_B[i] < EPSILON) {
            ratio_B = (double) - X_B[i]/d_X_B[i];
            if(ratio_B < alpha_B) { alpha_B = ratio_B; min_B = i; }
         }
         for(int i = 0; i < 2; i++) if((double)d_X_D[i] < EPSILON) {
            ratio_D = (double) - X_D[i]/d_X_D[i];
            if(ratio_D < alpha_D) { alpha_D = ratio_D; min_D = i; }
         }
         // Step 5: min{f(x + α ∆x): 0 ≤ α ≤ αB, 0 ≤ α ≤ αD}
         C0 d_X = d_X_B & d_X_D, d_alpha;             // ∆xT = [∆xBT, ∆xDT]
         C2 alpha(0.0), x[4];
         do {                                         // line search along ∆x
            for(int i = 0; i < 4; i++)                // xi+1 = xi + α ∆x
               x[i] = ((C0)X)[BS.basic_order(i)] + alpha * d_X[BS.basic_order(i)];
            C2 phi = x[0].pow(2)+x[1].pow(2)+x[2].pow(2)+x[3].pow(2)-2*x[0]-3*x[3];   // φ(xi+1)
            if(fabs((double)dd(phi)) > EPSILON) d_alpha &= -d(phi) / dd(phi);   // dα = -dφ / d2φ; Newton's formula
            else break;
            ((C0)alpha) += d_alpha;                   // α += dα
         } while((double)norm(d_alpha) > RELAXED*EPSILON);
         if((double)(C0) alpha >= alpha_B) {
            (C0) alpha = alpha_B;
            BS.swap(min_B, min_D+2);                  // Step 6: update B and D
         }
         if((double)(C0) alpha > alpha_D) (C0) alpha = alpha_D;
         d_X *= ((C0)alpha);                          // dx = α ∆x
         ((C0)X) += d_X;                              // update xi+1 = xi + dx
      }
      f = X[BS.basic_order(0)].pow(2)+X[BS.basic_order(1)].pow(2)+
          X[BS.basic_order(2)].pow(2)+X[BS.basic_order(3)].pow(2)   // update f
          -2*X[BS.basic_order(0)]-3*X[BS.basic_order(3)];
      df(BS) = d(f);                                  // update c
   } while(++k < MAX_NO_OF_ITER && norm_dxd > RELAXED*EPSILON);
   for(int i = 0; i < 4; i++) x[i] = ((C0)X)[BS.basic_order(i)];   // update x
   C0 fp = x[0].pow(2)+x[1].pow(2)+x[2].pow(2)+x[3].pow(2)-2*x[0]-3*x[3];
   cout << "The final solution: " << x << endl << "f: " << fp << endl;
   // reported result: x = {1.148, 0.683, 1.806, 0.553}T, f = 1.40348
}
∇f + λT ∇A = 0
This optimality condition states that the two gradients are parallel to each other (in a 2-D representation). In general,
during the course of optimization, before convergence has been achieved, these two gradients are not parallel
to each other, and we can project −∇f onto the tangent plane, for example at the point x’ (as in Figure 2.7). The
projected negative gradient on the tangent plane is a vector p, to be used as the search direction, which is expressible as
(see also Figure 2.15)
$$ p = -\nabla f - \nabla A^T \lambda $$   Eq. 2•41
where the second term, −∇AT λ, is the component orthogonal to the tangent plane. In view of Eq. 2•41, p vanishes
when the left-hand side of the first-order condition, ∇f + ∇AT λ, equals zero; i.e., the first-order condition of an
extremum point is satisfied. Since the search direction p on the tangent plane is orthogonal to the gradient of the
constraint equations ∇A, we have the orthogonality relationship ∇A p = 0. Pre-multiply Eq. 2•41 with ∇A and solve for λ,1
$$ \lambda = -\left( \nabla A \, \nabla A^T \right)^{-1} \nabla A \, \nabla f $$   Eq. 2•42
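[Figure 2.15: the negative gradient −g = −∇f (hypotenuse) at the point x is decomposed into the component ∇AT λ normal to the tangent plane and the component p = −∇f − ∇AT λ lying in the tangent plane.]
Figure 2.15 Project negative gradient to the tangent plane as p (search direction).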
1. p. 330-331 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
2. modified from p. 332-333 in D.G. Luenberger, 1989, same as the above.
The objective functional is the same one that was solved in Eq. 2•5 without constraints and implemented in
Program Listing 2•2. The result of the present constrained optimization of Eq. 2•5 is shown in Figure 2.16. The
search path taken by this program is as follows. (1) The initial feasible point is set at (0, 0). The first active set
consists of the second and the third constraints (-x1 ≤ 0 and -x2 ≤ 0). This initial active set produces no search
direction (p = 0). (2) The Lagrange multiplier of this active set is {-12, -10}T. Therefore, the second constraint
(-x1 ≤ 0 with λ = -12) is dropped from the active set. (3) The search direction, p, is along the x1-axis. A line search
is activated to find the minimum objective functional value of f(x1, x2) along this direction, at (3, 0). (4) The third
constraint (-x2 ≤ 0 with λ = -10) is dropped from the active set, leaving no constraint in the set. A steepest descent
step on the objective functional f(x1, x2) produces a search direction parallel to the x2-axis. The first constraint
(x1 + x2 ≤ 4) is encountered along this search direction at (3, 1). (5) With only one constraint left, a line search is
performed, which gives (1.5, 2.5) as the minimum point. Since the only Lagrange multiplier is positive, (1.5, 2.5) is
taken as the final optimal solution.
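Steps (1) to (3) and the final multiplier check in (5) can be reproduced directly from Eq. 2•41 and Eq. 2•42. The following is a minimal sketch in plain C++ (standard library only, not the VectorSpace API). The names `solve`, `grad_f`, `Projection` and `project_gradient` are hypothetical helpers introduced here for illustration, and the active constraint gradients are assumed to be linearly independent.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;   // row-major: M[i][j]

// Solve M y = r by Gaussian elimination with partial pivoting (small dense M).
Vec solve(Mat M, Vec r) {
    const std::size_t n = r.size();
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t piv = c;
        for (std::size_t i = c + 1; i < n; ++i)
            if (std::fabs(M[i][c]) > std::fabs(M[piv][c])) piv = i;
        std::swap(M[c], M[piv]);
        std::swap(r[c], r[piv]);
        for (std::size_t i = c + 1; i < n; ++i) {
            const double m = M[i][c] / M[c][c];
            for (std::size_t j = c; j < n; ++j) M[i][j] -= m * M[c][j];
            r[i] -= m * r[c];
        }
    }
    Vec y(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = r[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= M[i][j] * y[j];
        y[i] = s / M[i][i];
    }
    return y;
}

// Gradient of f(x1, x2) = 2 x1^2 + x1 x2 + x2^2 - 12 x1 - 10 x2.
Vec grad_f(const Vec& x) {
    return {4.0 * x[0] + x[1] - 12.0, x[0] + 2.0 * x[1] - 10.0};
}

struct Projection { Vec lambda, p; };

// dA holds the gradients of the active constraints, one per row (m x n).
Projection project_gradient(const Mat& dA, const Vec& g) {
    const std::size_t m = dA.size(), n = g.size();
    Mat AAt(m, Vec(m, 0.0));
    Vec rhs(m, 0.0);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t k = 0; k < n; ++k) rhs[i] -= dA[i][k] * g[k];
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t k = 0; k < n; ++k) AAt[i][j] += dA[i][k] * dA[j][k];
    }
    // Eq. 2•42: lambda = -(dA dA^T)^-1 dA grad f (empty when nothing is active).
    Vec lambda = (m > 0) ? solve(AAt, rhs) : Vec();
    // Eq. 2•41: p = -grad f - dA^T lambda.
    Vec p(n);
    for (std::size_t k = 0; k < n; ++k) {
        p[k] = -g[k];
        for (std::size_t i = 0; i < m; ++i) p[k] -= dA[i][k] * lambda[i];
    }
    return {lambda, p};
}
```

With both bound constraints active at (0, 0), i.e. rows {-1, 0} and {0, -1}, the sketch gives λ = {-12, -10}T and p = 0. With only -x2 ≤ 0 active it gives p = {12, 0}, a direction along the x1-axis, and at (1.5, 2.5) with the row {1, 1} active it gives λ = 3.5 and p = 0.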
In the gradient projection method, when there is no constraint, the method reduces to the steepest descent
method. Figure 2.11 on page 128 shows an example of how slowly the steepest descent method can converge in
practice. Steepest descent is really the backbone of the gradient projection method. We need a
second-order method to improve the convergence rate.
1. p. 426-427 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
#include "include/vs.h"
int main() {
   double v[2] = {0.0, 0.0}, EPSILON = 1.e-12;        // initial feasible point (0, 0)
   int ALL_POSITIVE = TRUE, lambda_flag, active_flag = FALSE;
   C1 X(2,v), C = VECTOR_OF_TANGENT_BUNDLE("int, int", 3, 2);
   C1 f = TANGENT_BUNDLE("int", 2);
   // f(x1, x2) = 2x1^2 + x1x2 + x2^2 - 12x1 - 10x2
   f = 2*X[0].pow(2)+X[0]*X[1]+X[1].pow(2)-12*X[0]-10*X[1];
   C[0] = X[0] + X[1] -4;                             // subject to x1 + x2 ≤ 4
   C[1] = -X[0] ; C[2] = - X[1];                      // -x1 ≤ 0, -x2 ≤ 0
   Active_Set A(C);                                   // initialize active set
   for(;;) {
      A.activate();                                   // Step 1: form active set
      lambda_flag = !ALL_POSITIVE;
      C0 lambda, p;                                   // Step 2: projection
      if(A.active_no() == 0) p &= -d(f);              // no constraint: degenerates to steepest descent, p = -∇f
      else {
         lambda &= - (d(A)*d(f)) / (d(A)*~d(A));      // λ = -(∇A ∇AT)^-1 ∇A ∇f, Eq. 2•42
         p &= -d(f)- d(A)*lambda;                     // p = -∇f - ∇AT λ, Eq. 2•41
      }
      if(fabs((double)norm(p)) > EPSILON) {           // Step 3: if p ≠ 0
         double min_alpha = 1.e20;                    // min{αi: αi = -Ci / (d(C)i p)}
         for(int i = 0; i < 3; i++)
            if(A.active_state(i) <= -1) {
               // alpha starts at min_alpha so that it is never read uninitialized
               double alpha = min_alpha, temp = (double)(d(C)[i]*p);
               if(fabs(temp) > EPSILON) alpha = -(double)(((C0)C)[i]/temp);
               if(alpha < min_alpha && alpha > 0.0) { min_alpha = alpha; active_flag = TRUE; }
            }
         C0 d_alpha(0.0); C2 alpha(0.0), x0, x1, F;   // line search
         do {                                         // min{f(x + α p): 0 ≤ α ≤ αi}
            x0 = ((C0)X[0]) + alpha * p[0]; x1 = ((C0)X[1]) + alpha * p[1];
            F &= 2*x0.pow(2)+x0*x1+x1.pow(2)-12*x0-10*x1;
            d_alpha = - d(F)/dd(F);
            ((C0)alpha) += d_alpha;
         } while((double)norm(d_alpha) > EPSILON);
         if((double)((C0)alpha) < min_alpha) min_alpha = (double)((C0)alpha);
         C0 dx = min_alpha * p; ((C0)X) += dx;        // update xk+1 = xk + α p
         ((C0)C[0]) = ((C0)X[0]) + ((C0)X[1]) -4;     // update C(xk+1)
         ((C0)C[1]) = -((C0)X[0]) ; ((C0)C[2]) = -((C0)X[1]);   // C[2] = -x2, as in its definition above
         f = 2*X[0].pow(2)+X[0]*X[1]+X[1].pow(2)-12*X[0]-10*X[1];   // update f(xk+1)
      } else {                                        // Step 4: if p = 0
         int i_cache = -1; double min_lambda = -EPSILON;
         lambda_flag = ALL_POSITIVE;
         for(int i = 0; i < A.active_no(); i++)       // find the most negative λ
            if((double)lambda[i] < -EPSILON) {        // Kuhn-Tucker condition, Eq. 2•14: λi ≥ 0, Ai ≤ 0, and λi Ai = 0
               lambda_flag = !ALL_POSITIVE;
               if((double)lambda[i] < min_lambda) { min_lambda = lambda[i]; i_cache = i; }
            }
         if (lambda_flag) break;                      // until for Ai = 0 ⇒ ∀ λi ≥ 0
         else A.deactivate(i_cache);                  // drop constraint corresponding to the most negative λ
      }
      cout << ((C0)X) << endl;
   }
   cout << "solution: " << ((C0)X) << endl;
   return 0;
}
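[Figure 2.16: plot in the x1-x2 plane (both axes 0 to 8) of f(x1, x2) = 2x1^2 + x1x2 + x2^2 - 12x1 - 10x2, showing the feasible region, the constraint line x1 + x2 = 4, and the marked points (0, 0), (3, 0), (3, 1), (1.5, 2.5) and (2, 4).]
Figure 2.16 Active set method on a constrained quadratic functional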
subject to A(x) = Ax - b = 0
where H = f,xx is the Hessian matrix, and g = f,x is the gradient vector. The Lagrangian functional using the
Lagrange multiplier method, such as Eq. 2•11 on page 118, is
$$ l(x, \lambda) = f(x) + \lambda^T A(x) = \frac{1}{2} x^T H x + g^T x + \lambda^T (A x - b) $$   Eq. 2•43
The Euler-Lagrange equations give the first-order optimality conditions of the Lagrangian functional with respect
to x and λ.
$$ l_{,x}(x, \lambda) = H x + A^T \lambda + g = 0 $$
$$ l_{,\lambda}(x, \lambda) = A x - b = 0 $$   Eq. 2•44
The second-order optimality condition requires the Hessian matrix H to be positive definite. This holds for a
strictly convex quadratic functional. An incremental version with xi+1 = xi + ∆x can be substituted in f(x) and A(x). One can view
the expression
$$ f(x_i + \Delta x) \cong f(x_i) + g^T \Delta x + \frac{1}{2} (\Delta x)^T H \, \Delta x $$
as a second-order Taylor expansion approximating the objective functional, with the current active constraint
equations as A ∆x = −A(xi).
With these relations, the Euler-Lagrange equations can be re-written in matrix form with the incremental
solution, ∆x, as
$$ \begin{bmatrix} H & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \lambda \end{bmatrix} = \begin{bmatrix} -\nabla f(x_i) \\ -A(x_i) \end{bmatrix} $$   Eq. 2•45
Using the first equation in Eq. 2•45, we get ∆x = H-1 (−∇f(xi) - ATλ). Notice that we have relied on the symmetric
positive definiteness of H for its inverse to exist. Substituting this back to eliminate ∆x in the second equation
gives the equation AH-1 (−∇f(xi) - ATλ) = -A(xi). Solving this equation for λ gives
in the Lagrange method, we use the gradient projection formula Eq. 2•42 in place of Eq. 2•46. Using such an
approximated Lagrange multiplier in the Lagrange method is an example of what is known as the multiplier update method.
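For a quadratic functional with linear constraints, the block system of Eq. 2•45 can also be solved directly in one step. The following is a minimal sketch in plain C++ (standard library only, not the VectorSpace API). The names `solve`, `Step` and `lagrange_newton_step` are hypothetical helpers introduced here for illustration. Because the block matrix is indefinite, the dense solver uses partial pivoting.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;   // row-major: M[i][j]

// Solve M y = r by Gaussian elimination with partial pivoting (small dense M).
Vec solve(Mat M, Vec r) {
    const std::size_t n = r.size();
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t piv = c;
        for (std::size_t i = c + 1; i < n; ++i)
            if (std::fabs(M[i][c]) > std::fabs(M[piv][c])) piv = i;
        std::swap(M[c], M[piv]);
        std::swap(r[c], r[piv]);
        for (std::size_t i = c + 1; i < n; ++i) {
            const double m = M[i][c] / M[c][c];
            for (std::size_t j = c; j < n; ++j) M[i][j] -= m * M[c][j];
            r[i] -= m * r[c];
        }
    }
    Vec y(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = r[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= M[i][j] * y[j];
        y[i] = s / M[i][i];
    }
    return y;
}

struct Step { Vec dx, lambda; };

// Assemble and solve [H A^T; A 0] [dx; lambda] = [-grad f(x_i); -A(x_i)].
// H is n x n, A is m x n, grad is grad f(x_i), Ax is the residual A(x_i).
Step lagrange_newton_step(const Mat& H, const Mat& A,
                          const Vec& grad, const Vec& Ax) {
    const std::size_t n = H.size(), m = A.size();
    Mat K(n + m, Vec(n + m, 0.0));
    Vec r(n + m, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) K[i][j] = H[i][j];
        r[i] = -grad[i];
    }
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            K[n + i][j] = A[i][j];   // A block
            K[j][n + i] = A[i][j];   // A^T block
        }
        r[n + i] = -Ax[i];
    }
    const Vec s = solve(K, r);
    Step out;
    out.dx.assign(s.begin(), s.begin() + static_cast<std::ptrdiff_t>(n));
    out.lambda.assign(s.begin() + static_cast<std::ptrdiff_t>(n), s.end());
    return out;
}
```

For f(x1, x2) = 2x1^2 + x1x2 + x2^2 - 12x1 - 10x2 with the single constraint x1 + x2 - 4 = 0, starting from (0, 0) gives ∇f = {-12, -10} and A(x) = -4, and the step returns ∆x = {1.5, 2.5} with λ = 3.5, which agrees with the optimal point (1.5, 2.5) found by the active set method.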
The class “Active_Set” needs some modification (see Program Listing 2•16). In the previous examples, in
linear programming and the gradient projection method, class “Active_Set” only needed to store and update the tangent
plane (of the constraint surface) information. In the Lagrange method, besides the tangent plane information, the
information on the constraint surface itself, specifically A(xi), also needs to be stored and updated; i.e., “A ∆x =
- A(xi)” instead of “A ∆x = 0”.
#include "include/vs.h"
class Active_Set {                                   // class Active_Set
   C1 _A, &Constraint;                               // A (active constraints) and C
   int n_equality, size_c, n_active, *_active_state;
public:
   Active_Set(C1& C, int n = 0);
   ~Active_Set() { delete [] _active_state; }
   int active_state(int i) { return _active_state[i];}
   int active_no() { return n_active; }
   operator C0() { return ((C0)_A); }                // A(xi)
   void activate();
   void deactivate(int i, int k = -2);
   friend C0& d(Active_Set&);                        // ∇A
};
Active_Set::Active_Set(C1& C, int n) : Constraint(C) {   // initialize active set
   n_equality = n;
   size_c = Constraint.row_length();
   _active_state = new int[size_c];
   for(int i = 0; i < size_c; i++) _active_state[i] = -1;
}
void Active_Set::activate() {                        // activate active set
   n_active = 0;
   for(int i = 0; i < n_equality; i++) _active_state[i] = n_active++;
   for(int i = n_equality; i < size_c; i++)
      if((double) ((C0)Constraint)[i] > -1.e-10 &&_active_state[i] >= -1)
         _active_state[i] = n_active++;
   if(n_active > 0) {
      _A &=VECTOR_OF_TANGENT_BUNDLE("int, int", n_active, Constraint.col_length());
      for(int i = 0; i < size_c; i++)                // update A from C
         if(_active_state[i] >= 0) _A[_active_state[i]] = Constraint[i];
   }
}
void Active_Set::deactivate(int i, int k) {          // deactivate a constraint
   for(int j = 0; j < size_c; j++) if(_active_state[j] == i) { _active_state[j] = k; break; }
}
C0& d(Active_Set& a) { return d(a._A); }
Listing 2•16 class Active_Set data abstraction for both Lagrange method and gradient projection method
(project: “lagrangian_and_gradient_projection”).
[Figure 2.17: plot in the x1-x2 plane (x1 axis 0 to 8) showing the constraint line x1 + x2 = 4 and the marked points (0, 0), (3, 0), (2.6667, 1.3333) and (2, 4).]
Figure 2.17 Lagrange method with active set method on a constrained quadratic
functional
Listing 2•17 Lagrange method and gradient projection method (project: “lagrangian_and_gradient_projection”).
A p = 0   Eq. 2•48
In other words, Eq. 2•48 expresses that the search direction p is in the null space of A. At the optimal condition, p
= 0, the negative gradient of the objective functional, “- ∇f”, is a linear combination of the rows of A (= ∇A); i.e.,
it lies in the range space of AT, “- ∇f = ATλ”.
Range Space Method: Recall Eq. 2•46 and Eq. 2•47 from the Lagrange method. Considering the projection of the negative
gradient “−∇f” on the tangent plane M = {y | ∇A y = 0} and on the gradient of the constraint surface ∇A, we have
$$ \lambda = -\left( A H^{-1} A^T \right)^{-1} A H^{-1} \nabla f $$
$$ p = -H^{-1} \left( \nabla f + A^T \lambda \right) $$   Eq. 2•49
A is of size m × n with 0 ≤ m ≤ n. We can perform the QR decomposition on AT, that is, AT = QR. Assuming no
degeneracy, the first “m” columns of Q span the range space of AT. Let Y (a matrix of size n × m)
consist of these “m” columns of the range space. Substituting AT ≡ YR into the first equation of Eq. 2•49, and writing
λ̄ = Rλ for the transformed multiplier, we have the range space method1
$$ \bar{\lambda} = -\left( Y^T H^{-1} Y \right)^{-1} Y^T H^{-1} \nabla f $$
$$ p = -H^{-1} \left( \nabla f + Y \bar{\lambda} \right) $$   Eq. 2•50
On page 36 we discussed that round-off error can accumulate in the multiplication of the normal form AT A
in the least squares problem, where the QR decomposition is used to control the condition number of the problem.
In the first equation of Eq. 2•49, the condition number can increase through the multiplication operations in A H-1 AT,
and we may run into trouble when its inverse is taken. Since the columns of Y are orthonormal, in the first equation
of Eq. 2•50 the condition number of YT H-1 Y is as good as that of H-1. Therefore, the range space method with
Eq. 2•50 is numerically superior to Eq. 2•49.
The range space method is implemented in Program Listing 2•18 for solving the same problem that the
reduced gradient method solved in Program Listing 2•14. The core steps are
1. see p. 183-184 in P.E. Gill, W. Murray, and M.H. Wright, 1981, “Practical Optimization”, Academic Press, Inc., San
Diego.
#include "include/vs.h"
int main() {
   const double EPSILON = 1.e-12;
   const double RELAXED = 1.e3;
   const int MAX_NO_OF_ITERATION = 10;
   double v[4] = {2.0, 2.0, 1.0, 0.0};
   C2 X(4, v);
   C2 C = VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE("int, int", 2, 4);
   C[0] = 2*X[0]+ X[1]+ X[2]+4*X[3] - 7.0;
   C[1] = X[0]+ X[1]+2*X[2]+ X[3] - 6.0;
   C2 f = X[0].pow(2)+X[1].pow(2)+X[2].pow(2)+X[3].pow(2)-2*X[0]-3*X[3];
   C0 p = VECTOR("int", 4);
   C0 A = d(C), Q = QR(~A).Q();                       // AT = QR
   C0 Y = MATRIX("int, int", 4, 2);
   for(int i = 0; i < 2; i++) Y(i) = Q(i);            // Y has columns in the range space of Q
   int k = 0;
   do {
      C0 H_inv = dd(f).inverse();                     // H^-1
      // λ = -(YT H^-1 Y)^-1 YT H^-1 ∇f
      C0 lambda_bar = - ((~Y)*H_inv*Y).inverse() * (~Y) *H_inv* d(f);
      p = - H_inv*(Y*lambda_bar+d(f));                // p = -H^-1 (∇f + Yλ)
      ((C0)X) += p;
      f = X[0].pow(2)+X[1].pow(2)+X[2].pow(2)+X[3].pow(2)-2*X[0]-3*X[3];
      ++k;
      cout << "solution(" << k << "): " << ((C0)X) << endl << "f: " << ((C0)f) << endl;
   } while(k < MAX_NO_OF_ITERATION && (double)norm(p) > RELAXED*EPSILON);
   cout << "The final solution: " << ((C0)X) << endl << "f: " << ((C0)f) << endl;
   // reported result: x = {1.148, 0.683, 1.806, 0.553}T, f = 1.40348
}
For inequality constrained problems, these core steps need to be embedded in a program with the aid of class
“Active_Set” such as what is implemented in Program Listing 2•17.
Null Space Method: Consider a search direction p on the tangent plane M = {y | ∇A y = 0}. Any such direction
satisfies
A p = 0
Let Z denote the matrix whose “n - m” columns span the null space of A. The search direction p is a linear
combination of the columns of Z,
p = Z pz   Eq. 2•51
where pz is a vector of size “n-m”. The Taylor expansion to second order, such as Eq. 2•6 on page 109, with xi+1 = xi +
p is
$$ f(x_i + p) \cong f(x_i) + f_{,x}(x_i)\, p + \frac{1}{2} p^T H(x_i)\, p $$
Substituting Eq. 2•51 for the increment p in the last equation yields
$$ f(x_i + p) = f(x_i + Z p_z) \cong f(x_i) + f_{,x}(x_i)\, Z p_z + \frac{1}{2} p_z^T Z^T H(x_i)\, Z p_z $$   Eq. 2•52
Denoting the projected Hessian as Hz = ZTHZ, and the projected gradient as (∇f)z = ZT∇f, we recover the classic
Newton method on the null space as
pz = - (∇f)z / Hz
Therefore, the meaning of the “n-m” vector pz is clear. From the first equation of Eq. 2•45 we have Hp + ATλ + ∇f =
0. Substituting Eq. 2•53 into this equation, we can solve for the Lagrange multiplier if necessary (such as for
inequality constrained problems, where the values of λ are needed for the active set method)
In retrospect, we should discuss the counterpart (dual) of the projected Hessian and projected gradient of the null
space, Hz = ZTHZ and (∇f)z = ZT∇f, respectively. Assume x* is a local solution of the primal problem: minimize
f(x) subject to A(x) = 0. The Lagrangian functional from Eq. 2•11 can be re-written for the dual problem as
$$ \nabla_\lambda \phi(\lambda) = \left[ \nabla f(x^*(\lambda)) + \lambda^T \nabla A(x^*(\lambda)) \right] \nabla_\lambda x^*(\lambda) + A(x^*(\lambda)) $$   Eq. 2•55
$$ \nabla_\lambda \phi(\lambda) = A(x^*(\lambda)) $$   Eq. 2•56
$$ \nabla_\lambda^2 \phi(\lambda) = \nabla_\lambda A(x^*(\lambda)) \, \nabla_\lambda x^*(\lambda) $$   Eq. 2•57
#include "include/vs.h"
int main() {
   const double EPSILON = 1.e-12;
   const double RELAXED = 1.e3;
   const int MAX_NO_OF_ITERATION = 10;
   double v[4] = {2.0, 2.0, 1.0, 0.0};
   C2 X(4, v);
   C2 C = VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE("int, int", 2, 4);
   C[0] = 2*X[0]+ X[1]+ X[2]+4*X[3] - 7.0;
   C[1] = X[0]+ X[1]+2*X[2]+ X[3] - 6.0;
   C2 f = X[0].pow(2)+X[1].pow(2)+X[2].pow(2)+X[3].pow(2)-2*X[0]-3*X[3];
   C0 p = VECTOR("int", 4);
   C0 A = d(C);
   C0 Q = QR(~A).Q();                                 // AT = QR
   C0 Z = MATRIX("int, int", 4, 2);
   for(int i = 0; i < 2; i++)Z(i) = Q(i+2);           // Z has columns in the null space of Q
   int k = 0;
   do {
      p = Z * ((~Z)*dd(f)*Z).inverse() * (~Z) * -d(f);   // p = -Z (ZT H Z)^-1 ZT ∇f
      ((C0)X) += p;
      f = X[0].pow(2)+X[1].pow(2)+X[2].pow(2)+X[3].pow(2)-2*X[0]-3*X[3];
      ++k;
      cout<< "solution(" << k << "): " << ((C0)X) << endl << "f: " << ((C0)f) << endl;
   } while(k < MAX_NO_OF_ITERATION && (double)norm(p) > RELAXED*EPSILON);
   cout << "The final solution: " << ((C0)X) << endl << "f: " << ((C0)f) << endl;
   // reported result: x = {1.148, 0.683, 1.806, 0.553}T, f = 1.40348
}
Listing 2•19 Null space method (project: “null_space”).
$$ H(x^*(\lambda), \lambda) \, \nabla_\lambda x^*(\lambda) + \nabla_\lambda A(x^*(\lambda))^T = 0 $$   Eq. 2•58
Therefore, we get
$$ \nabla_\lambda x^*(\lambda) = -H^{-1}(x^*(\lambda), \lambda) \, \nabla_\lambda A(x^*(\lambda))^T $$   Eq. 2•59
$$ \nabla_\lambda^2 \phi(\lambda) = -\nabla_\lambda A(x^*(\lambda)) \, H^{-1}(x^*(\lambda), \lambda) \, \nabla_\lambda A(x^*(\lambda))^T $$   Eq. 2•60
That is, ∇φ = A(x) and ∇2φ = - A H-1 AT. The classic Newton method for the dual problem coincides with the
first term in Eq. 2•46 of the Lagrange method. The Hessian of the dual, “- A H-1 AT”, governs the convergence rate
of the dual problem.
It is helpful to point out that the existence and uniqueness of the solution of the constrained problem is known to be
associated with the abstract form of a saddle-point problem1 (see Figure 2.18). A saddle function is shown as the
Lagrangian functional l(x, λ), which is minimized with respect to x and maximized with respect to λ. Requiring
both positive definiteness of l,xx and negative definiteness of l,λλ gives a strong (sufficient) condition for a
unique solution.
1. p. 30 in M.M. Sewell, 1987, “Maximum and minimum principles”, Cambridge University Press, Cambridge, UK.
[Figure 2.18: surface plot of a saddle function over the x and λ axes, with the saddle point marked.]
Figure 2.18 An example saddle function l(x, λ)= x2 - λ2, where x is the
minimizer and λ is the maximizer.
In summary, the range space method works on an “m” dimensional space, where the most important step is to
estimate the Lagrange multiplier, a vector of length “m”. The null space method works on an “n-m” dimensional
space, where we compute the projected search direction pz (of length “n-m”) from the projected gradient and projected
Hessian as pz = - (∇f)z / Hz. For small problems, the null space method is numerically more reliable, because
in the range space method the inverse of the dual Hessian is a doubly inverted factor, (YTH-1Y)-1, in the estimation
of the Lagrange multiplier, which may cause significant cancellation errors. For a large problem, the size of
(YTH-1Y)-1 in Eq. 2•50 of the range space method (m × m) and the size of (ZTHZ)-1 in Eq. 2•53 of the null space
method ((n-m) × (n-m)) may significantly affect the memory requirement and, consequently, the cost of computing the
inverse. When the number of constraints is less than half the number of variables, the range space method is less
demanding; when the number of constraints is greater than half the number of variables, the null space method has the advantage.
Penalty Methods
The penalty method transforms a constrained problem into an unconstrained problem by defining a penalty
objective functional, for example, with a quadratic penalty term of the active constraints as
minimize q(x) = f(x) + P(x) = f(x) + (ρ /2) A(x)T A(x)   Eq. 2•61
where ρ is the penalty parameter and the second term is designed to penalize the objective functional when the
constraints are violated. The steps of the penalty method are
The final solution is the limiting point x as ρ → ∞, although when ρ is too big the problem becomes
ill-conditioned. The advantage of the penalty method lies in the simplicity of Eq. 2•61. No advanced concept needs to be
introduced. The disadvantage is that we are left with an incremental procedure in which the problem needs to be
solved many times with an empirical sequence of ρ. We emphasize that it is necessary to start with a smaller ρ
and then increase it subsequently. If we ignore the need for an incremental procedure and compute with only one
big ρ, the solution can be completely different. Starting with too big a penalty parameter, the solution satisfies the
constraints overwhelmingly, with little concern for the minimization of the objective functional. However, in many
engineering applications some magic ρ is often recommended for a particular application domain. The use of this
magic ρ is an art rather than a science.
In view of the shape of the saddle function shown in Figure 2.18, it is natural that the quadratic form of the
penalty function, “P(x) = (ρ/2) || A(x) ||2”, is the most popular one: P,xx is positive semi-definite; i.e., the
penalty term convexifies the primal (x-variables). Comparing Eq. 2•10, “∇f + λT ∇A = 0” (the first-order
condition), with the first-order condition of the penalty objective functional of Eq. 2•61,
$$ \nabla q = \nabla f + \rho \, A(x)^T \nabla A = 0 $$
we see
$$ \lambda = \rho \, A(x) $$   Eq. 2•62
This can be used as the updating formula λi+1 = λi + ρ A(x) for the simplest form of the multiplier update method
discussed on page 146. This updating formula can also be used for the augmented Lagrangian method introduced
later.
Now consider the specific example we have been solving in previous sections
f(x1, x2) = 2x12 + x1x2 + x22 -12x1 -10 x2
subject to x1 + x2 ≤ 4
-x1 ≤ 0
-x2 ≤ 0
For simplicity, we drop the inequality constrained part of the problem and consider only the equality
constraint
x1 + x2 = 4
Assume that we are at the final constraint set of the active set method. We start from x = (2, 2), which is clearly on the
constraint line, and use the penalty method to search for the final solution (1.5, 2.5). The constraint and the penalty
objective functional q(x) are defined as
A(x1, x2) = x1 + x2 - 4
q(x1, x2) = 2x12 + x1x2 + x22 -12x1 -10 x2 + (ρ /2) A(x1, x2)T A(x1, x2)
These two equations are the kernel of the penalty method, and the original constrained problem has been
transformed completely into a new unconstrained problem. Program Listing 2•20 implements this simplified
problem. This coding should be embedded into the active set method for a more general inequality constrained
problem.
#include "include/vs.h"
int main() {
   const int DOF = 2; const int MAX_NO_OF_ITERATION = 20;
   const double EPSILON = 1.e-12; double x[DOF] = {2.0, 2.0};
   C2 q, A = TANGENT_OF_TANGENT_BUNDLE("int", DOF), X(DOF,x);
   double rho = 1.0, delta_X;
   C0 d_x, X_cache = VECTOR("int", DOF);
   int k0 = 0;
   do {
      rho *= 10.0;
      int k1 = 0;
      do {
         A = X[0] + X[1] -4;                          // A(x1, x2) = x1 + x2 - 4
         // q(x1, x2) = 2x1^2 + x1x2 + x2^2 - 12x1 - 10x2 + (ρ/2) A(x1, x2)T A(x1, x2)
         q &= 2*X[0].pow(2)+X[0]*X[1]+X[1].pow(2)-12*X[0]-10*X[1]
              + (0.5*rho)*A.pow(2);
         d_x &= -d(q) / dd(q);                        // dx = - q,x(xi) / q,xx(xi)
         (C0)X += d_x;                                // x += dx
      } while ((double)norm(d_x) > EPSILON && ++k1 < 10);
      cout << "solution(rho=" << rho << ", " << k1 << "): " << ((C0)X) << endl;
      delta_X = norm(X_cache - ((C0)X));
      X_cache = ((C0)X);
   } while(++k0 < MAX_NO_OF_ITERATION && delta_X > 1.e6*EPSILON);
   cout << "The Final solution: " << ((C0)X) << endl;   // reported result: x = {1.5, 2.5}T
}
Listing 2•20 Penalty method with a single equality constraint (project: “penalty_one_constraint”).
Since the penalty problem has transformed an equality constrained problem into an unconstrained problem,
various unconstrained optimization methods in Section 2.3.2 are applicable to the penalty method. We apply
classic Newton method, conjugate gradient method, and combined Newton and steepest descent method to the
penalty method in the following. Consider a less trivial problem with 10 variables and 4 equality constraints such
as1
1. from p. 381 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing Company, Inc.,
Reading, MA.
#include "include/vs.h"
int main() {
   const int DOF = 10; const int MAX_NO_OF_ITERATION = 10;
   const double EPSILON = 1.e-12; const double RELAXED = 1.e6;
   double rho = 1.0, delta_X, x[DOF] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
   C2 f, q, X(DOF,x),
      A = VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE("int, int", 4, DOF);
   C0 d_x, X_cache = VECTOR("int", DOF);
   int k0 = 0;
   do {
      rho *= 10.0;
      int k1 = 0;
      do {
         A[0]=1.5*X[0]+X[1]+X[2]+0.5*X[3]+0.5*X[4] -5.5;   // A0 = 1.5x1 + x2 + x3 + 0.5x4 + 0.5x5 - 5.5
         A[1]=2.0*X[5]-0.5*X[6]-0.5*X[7]+X[8]-X[9]-2.0;    // A1 = 2.0x6 - 0.5x7 - 0.5x8 + x9 - x10 - 2.0
         A[2]= X[0] +X[2] + X[4] + X[6] + X[8] -10.0;      // A2 = x1 + x3 + x5 + x7 + x9 - 10
         A[3]= X[1] + X[3] + X[5] + X[7] + X[9]-15.0;      // A3 = x2 + x4 + x6 + x8 + x10 - 15
         // f(x) = Σ(i = 1..10) i xi^2
         f &= X[0].pow(2)+2*X[1].pow(2)+3*X[2].pow(2)+4*X[3].pow(2)+5*X[4].pow(2)+
              6*X[5].pow(2)+7*X[6].pow(2)+8*X[7].pow(2)+9*X[8].pow(2)+10*X[9].pow(2);
         q &= f + (0.5 * rho) * A.pow(2);             // q(x) = f(x) + (ρ/2) (A(x)T A(x))
         d_x &= -d(q) / dd(q);                        // dx = - q,x(xi) / q,xx(xi)
         ((C0)X) += d_x;                              // x += dx
      } while ((double)norm(d_x) > RELAXED*EPSILON
               && ++k1 < MAX_NO_OF_ITERATION);
      cout << "solution(rho=" << rho << ", " << k1 << "): " << ((C0)X) <<
              " f: " << ((C0)f) << " q: " << ((C0)q) << endl;
      delta_X = norm(X_cache - ((C0)X));
      X_cache = ((C0)X);
   } while(++k0 < MAX_NO_OF_ITERATION && delta_X > RELAXED*EPSILON);
   cout << "The Final solution: " << ((C0)X) << endl;
   // reported result: x = {-2.00, 2.66, 2.39, 3.62, 3.27, 2.87, 3.87, 3.16, 2.47, 2.69}T, f = 502.4
}
Listing 2•21 The classic Newton version for penalty method (project: “penalty_newton”).
Program Listing 2•22 implements the conjugate gradient version of the penalty method for the same
equality constrained problem as above. Conjugate gradient uses a line search along the search direction p, xi+1 =
xi + α p, and its objective functional is redefined as a one-parameter function of α, ψ(xi+1(α)) = ψ(α). The
one-parameter line search uses Newton’s formula dα = -dψ(α)/d 2ψ(α) to find the minimum of ψ(α). The conjugate
gradient is computed using the Fletcher-Reeves formula with gi+1=∇q(xi+1)T, and βi= (gi+1)Tgi+1 / [(gi)Tgi] as Eq.
For the Newton method on M⊥, any movement on this subspace can be expressed as xi+1 = xi + ∇AT u. Define
Then, denote Q(x) as the Hessian of q(u) at u = 0, and substitute Eq. 2•63 in place of Q(x), we have
$$ Q(x_i) = \nabla A(x_i) \left( H(x_i) + \rho \, \nabla A(x_i)^T \nabla A(x_i) \right) \nabla A(x_i)^T $$   Eq. 2•64
$$ Q(x_i) \cong \rho \left( \nabla A(x_i) \, \nabla A(x_i)^T \right)^2 $$   Eq. 2•65
1. p. 282-284, and p. 384-387 in D.G. Luenberger, 1989, “Linear and Nonlinear Programming”, Addison-Wesley Publishing
Company, Inc., Reading, MA.
Listing 2•22 The conjugate gradient method for penalty formulation (project: “penalty_conjugate_gradient”).
The kernel steps of the combined Newton and steepest descent method customized for the penalty method are
We implement these two steps with three C++ functions in Program Listing 2•23: (1) the line search is
performed in both steps, so we factor out this procedure and code it as a “line_search()” function, (2) the Newton method
applied to M⊥ is implemented as “newton_on_orthogonal_complement_of_tangent()”, and (3) steepest descent
on M is implemented as “steepest_descent_on_tangent()”. Program Listing 2•24 implements the main program
of the combined Newton and steepest descent method using the above three functions.
#include "include/vs.h"
void line_search(C2& X, C0& p, C2& alpha, double rho) {
   const int DOF = 10; const double EPSILON = 1.e-12;
   const double RELAXED = 1.e6; const int MAX_NO_OF_ITERATION = 10;
   ((C0)alpha) = 0.0;
   C2 x[DOF], A[4]; C0 d_alpha; int k2 = 0;
   do {                                               // line search along p
      C2 phi;
      for(int j = 0; j < 10; j++) x[j] &= ((C0)X)[j] + alpha * p[j];   // xi+1 = xi + α p
      A[0]=1.5*x[0]+x[1]+x[2]+0.5*x[3]+0.5*x[4] -5.5;      // A0 = 1.5x1 + x2 + x3 + 0.5x4 + 0.5x5 - 5.5
      A[1]= 2.0*x[5]-0.5*x[6]-0.5*x[7]+x[8]-x[9]-2.0;      // A1 = 2.0x6 - 0.5x7 - 0.5x8 + x9 - x10 - 2.0
      A[2]=x[0] +x[2] +x[4] +x[6] +x[8] -10.0;             // A2 = x1 + x3 + x5 + x7 + x9 - 10
      A[3]=x[1] +x[3] +x[5] +x[7] +x[9]-15.0;              // A3 = x2 + x4 + x6 + x8 + x10 - 15
      // f(x) = Σ(i = 1..10) i xi^2, φ(x) = f(x) + (ρ/2) (AT A)
      phi &= x[0].pow(2)+2*x[1].pow(2)+3*x[2].pow(2)+4*x[3].pow(2)+5*x[4].pow(2)+
             6*x[5].pow(2)+7*x[6].pow(2)+8*x[7].pow(2)+9*x[8].pow(2)+10*x[9].pow(2)+
             (0.5 * rho) *A.pow(2);
      if((double)dd(phi) > EPSILON) d_alpha &= -d(phi) / dd(phi); else break;   // dα = -dφ/d2φ; Newton's formula
      ((C0)alpha) += d_alpha;
   } while(++k2 < MAX_NO_OF_ITERATION &&
           (double)norm((double)d_alpha) > RELAXED*EPSILON);
}
void newton_on_orthogonal_complement_of_tangent(
   C2& X, C2& f, C2& A, C2& q, C0& p, double rho) {   // Newton method on M⊥
   A[0]=1.5*X[0]+X[1]+X[2]+0.5*X[3]+0.5*X[4] -5.5;
   A[1]= 2.0*X[5]-0.5*X[6]-0.5*X[7]+X[8]-X[9]-2.0;
   A[2]=X[0] +X[2] +X[4] +X[6] +X[8] -10.0;
   A[3]=X[1] +X[3] +X[5] +X[7] +X[9]-15.0;
   f &= X[0].pow(2)+2*X[1].pow(2)+3*X[2].pow(2)+4*X[3].pow(2)+5*X[4].pow(2)+
        6*X[5].pow(2)+7*X[6].pow(2)+8*X[7].pow(2)+9*X[8].pow(2)+10*X[9].pow(2);
   q &= f + (0.5 * rho) * A.pow(2);
   C0 grad_A = d(A), approx_Q_bar_inv;
   // Q(xi)^-1 ≅ (1/ρ) (∇A(xi) ∇A(xi)T)^-2
   approx_Q_bar_inv &= 1.0/rho * (grad_A*(~grad_A)).pow(2).inverse();
   // p = -∇A(xi)T Q(xi)^-1 ∇A(xi) ∇q(xi)T
   p &= (~grad_A) * approx_Q_bar_inv * grad_A * -d(q);
}
void steepest_descent_on_tangent(C2& X, C2& f, C2& A, C2& q, C0& p, double rho) {   // steepest descent on M
   A[0]=1.5*X[0]+X[1]+X[2]+0.5*X[3]+0.5*X[4]-5.5;
   A[1]=2.0*X[5]-0.5*X[6]-0.5*X[7]+X[8]-X[9]-2.0;
   A[2]=X[0] +X[2] +X[4] +X[6] +X[8]-10.0;
   A[3]= X[1] + X[3] + X[5] + X[7] + X[9]-15.0;
   f &= X[0].pow(2)+2*X[1].pow(2)+3*X[2].pow(2)+4*X[3].pow(2)+5*X[4].pow(2)+
        6*X[5].pow(2)+7*X[6].pow(2)+8*X[7].pow(2)+9*X[8].pow(2)+10*X[9].pow(2);
   q &= f + (0.5 * rho) * A.pow(2);
   p &= -d(q);                                        // p = -∇q(xi)T
}
Listing 2•23 Three functions of the combined Newton and steepest descent version of penalty method (project:
“penalty_combined_newton_and_steepest_descent”).
#include "include/vs.h"
int main() {
   const int DOF = 10;
   const int MAX_NO_OF_ITERATION = 12; int MAX_NO_OF_INNER_ITERATION = 6;
   const double EPSILON = 1.e-12; const double RELAXED = 1.e6;
   double rho = 0.1, delta_X, x[DOF] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
   C2 f, q, X(DOF,x),
      A = VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE("int, int", 4, DOF);
   C0 d_x, X_cache = VECTOR("int", DOF);
   int k0 = 0;
   double q_cache, d_q;
   do {
      rho *= 10.0;
      int k1 = 0;
      q_cache = d_q = 0.0;
      MAX_NO_OF_INNER_ITERATION = ((rho < 1.e6) ? 1: 6);
      do {
         C0 p;
         newton_on_orthogonal_complement_of_tangent(X, f, A, q, p, rho);   // call Newton method on M⊥
         C2 alpha(0.0);
         line_search(X, p, alpha, rho);               // line search
         d_x &= ((C0)alpha)*p;
         ((C0)X) += d_x;
         steepest_descent_on_tangent(X, f, A, q, p, rho);   // call steepest descent on M
         ((C0)alpha) = 0.0;
         line_search(X, p, alpha, rho);               // line search
         d_x &= ((C0)alpha)*p;
         ((C0)X) += d_x;
         d_q = fabs( ((double)(C0)q)-q_cache );
         q_cache = (double)(C0)q;
         // the penalty parameter is the variable rho (the original printed an undeclared c)
         cout << "(rho=" << rho << ", k1=" << k1 << ")--" << " f: " << ((C0)f) << " q: " << ((C0)q)
              << " d_q: " << d_q << endl;
         if((double)norm(d_x) < RELAXED*EPSILON && d_q < RELAXED*EPSILON)
            break;
      } while (++k1 < MAX_NO_OF_INNER_ITERATION && (double)norm(d_x) >
               RELAXED*EPSILON && d_q > RELAXED*EPSILON);
      cout << "solution(rho=" << rho << ", " << k1 << "): ";
      cout << ((C0)X) << " f: " << ((C0)f) << " q: " << ((C0)q) << " d_q: " << d_q << endl;
      delta_X = norm(X_cache - ((C0)X));
      X_cache = ((C0)X);
   } while(++k0 < MAX_NO_OF_ITERATION && delta_X > RELAXED*EPSILON);
   cout << "The Final solution: " << ((C0)X) << endl;
   // reported result: x = {-2.00, 2.66, 2.39, 3.62, 3.27, 2.87, 3.87, 3.16, 2.47, 2.69}T, f = 502.4
   return 0;
}
Listing 2•24 The main program of the combined Newton and steepest descent version of penalty method
(project: “penalty_combined_newton_and_steepest_descent”).
First, the obvious feature is that we are now working on an augmented space of {x, λ} with the dimension “n+m”.
Second, from the “dual view point”, comparing Eq. 2•67 with the Lagrangian functional in the Lagrange method on
page 145, Eq. 2•67 has an extra penalty term (ρ /2) A(x)T A(x). In view of the existence and uniqueness (strong)
condition of the saddle function shown in Figure 2.18, this quadratic penalty term in “x” convexifies the primal
(x-variables) of the Lagrangian functional l(x, λ). Third, from the “penalty view point”, Eq. 2•67 without the middle
term is the penalty objective functional (q(x) in Eq. 2•61). The penalty method is always plagued by not being
consistent with the minimum of the Lagrangian functional l(x, λ). We can easily show this by considering that
the first-order conditions of the constrained problem (l,x) and the unconstrained problem (∇q) are not equal
unless the special condition λ = ρA(x), Eq. 2•62, is met. On the other hand, the first-order conditions of the augmented
Lagrangian functional (lA) reduce, once A(x) = 0 is satisfied, to
∇f(x) + λT ∇A(x) = 0
A(x) = 0
Therefore, the middle term λT A(x) in Eq. 2•67 makes the penalty method consistent with the first-order
condition of the Lagrange method. The algorithm of the augmented Lagrangian method can be implemented with a nested
double loop. The outer loop is the penalty method, in which we need to increase the penalty parameter ρ from a
smaller number to a considerably greater number until the solution converges. The inner loop updates the
Lagrange multiplier, in view of λ = ρA(x), as
λi+1 = λi + ρ A(xi)
in the hope that A(xi+1) = 0 can be achieved by this update. Program Listing 2•25 implements the augmented
Lagrangian method.
does not necessarily have a negative definite Hessian. Given the Lagrangian functional l(x, λ), a perturbed Lagrangian functional lp(x, λ) can be defined as
$$ l_p(x, \lambda) = l(x, \lambda) - \frac{\varepsilon}{2}\,\|\lambda\|^2 = f(x) + \lambda^T A(x) - \frac{\varepsilon}{2}\,\|\lambda\|^2 $$   Eq. 2•69
The solution is achieved at the limit ε → 0. The gradient of the perturbed Lagrangian functional (the Euler-Lagrange equations) is
$$ \nabla f(x) + \lambda^T \nabla A(x) = 0, \qquad A(x) - \varepsilon \lambda = 0 $$   Eq. 2•70
It is consistent with the first-order condition of the Lagrange method (at the limit ε → 0). For a quadratic programming problem
$$ \text{minimize } f(x) = \tfrac{1}{2}\, x^T H x + g^T x \qquad \text{subject to } A(x) = Ax - b = 0 $$
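the linearized equations at xi are
$$ \begin{bmatrix} H & A^T \\ A & -\varepsilon I \end{bmatrix} \begin{bmatrix} \Delta x \\ \lambda \end{bmatrix} = \begin{bmatrix} -\nabla f(x_i) \\ -A(x_i) \end{bmatrix} $$   Eq. 2•71
which, at the limit ε → 0, become
$$ \begin{bmatrix} H & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} \Delta x \\ \lambda \end{bmatrix} = \begin{bmatrix} -\nabla f(x_i) \\ -A(x_i) \end{bmatrix} $$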
Since the left-hand-side matrix is symmetric, we can use modified Cholesky decomposition to solve Eq. 2•71. Or, recall from Eq. 2•45 that λ = (A H⁻¹ Aᵀ)⁻¹ [A(xi) − A H⁻¹ ∇f(xi)]. Set B⁻¹ = (A H⁻¹ Aᵀ)⁻¹ and apply modified Cholesky decomposition to B. The implementation in Program Listing 2•26 is not only extremely simple, but also very fast.
#include "include/vs.h"
int main() {
const int DOF = 10; const int MAX_NO_OF_ITERATION = 20;
const double EPSILON = 1.e-12; const double RELAXED = 1.e6;
double x[DOF] = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
int k = 0;
C2 f, X(DOF,x),
A = VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE("int, int", 4, DOF);
C0 d_x;
do {
A[0]=1.5*X[0]+X[1]+X[2]+0.5*X[3]+0.5*X[4] -5.5;
A[1]= 2.0*X[5]-0.5*X[6]-0.5*X[7]+X[8]-X[9]-2.0;
A[2]= X[0] + X[2] + X[4] + X[6] + X[8] -10.0;
A[3]= X[1] + X[3] + X[5] + X[7] + X[9]-15.0;
f &= X[0].pow(2)+2*X[1].pow(2)+3*X[2].pow(2)+4*X[3].pow(2)+5*X[4].pow(2)+
6*X[5].pow(2)+7*X[6].pow(2)+8*X[7].pow(2)+9*X[8].pow(2)+10*X[9].pow(2);
C0 H_inv = dd(f).inverse();
// modified Cholesky on (A H^-1 A^T)^-1
Cholesky AHAt_inv( (d(A)*H_inv*(~d(A)) ), EPSILON);
// lambda = (A H^-1 A^T)^-1 [A(x_i) - A H^-1 grad f(x_i)]
C0 lambda = AHAt_inv * ( ((C0)A) - d(A)*(H_inv*d(f)) );
// delta x = H^-1 (-grad f(x_i) - A^T lambda); i.e., Eq. 2•47
d_x &= H_inv*-( (~d(A))*lambda + d(f) );
((C0)X) += d_x;
cout << "solution(" << (++k) << "): "<<((C0)X) << ", objective functional: " << ((C0)f)
<< endl;
} while( k < MAX_NO_OF_ITERATION && (double)norm(d_x) > RELAXED*EPSILON );
cout << "The final solution: " << ((C0)X) << endl;
// result: x = {-2.00, 2.66, 2.39, 3.62, 3.27, 2.87, 3.87, 3.16, 2.47, 2.69}^T, f = 502.4
}
Listing 2•26 Perturbed Lagrangian method (project: “perturbed_lagrangian”).
Chapter Three: Variational Methods Using H0, H1, and H2 Type Objects
In functional analysis, H0 (= L² = W^{0,2}), H1 (= W^{1,2}), and H2 (= W^{2,2}) are Sobolev spaces. They are also Hilbert spaces, which have an inner product defined. For the users of VectorSpace C++ Library, it is sufficient to know that H0, H1, and H2 type objects are integrable objects (among them, H1 and H2 type objects are also differentiable objects). Applications such as those in variational methods can be easily implemented with H0, H1, and H2 type objects. The C++ programs using VectorSpace C++ Library in this chapter are projects contained in the project workspace file "Hn.dsw" under directory "vs\ex\Hn".
3.1.1 Quadrature
Numerical integration is also known as quadrature, which means the process of finding a square equal in area to a given area. The trapezoidal rule and Simpson's rule are the most fundamental ones introduced in calculus1.
1. For example, p. 602-609, in T.M. Apostol, 1969, "Calculus", 2nd ed., vol. 2, Blaisdell Publishing Company, Waltham, Mass.
Figure 3•1 Approximation by two linear interpolation functions (left-hand-side), and by one quadratic interpolation function (right-hand-side), where h = xi+1 - xi is the size of one segment.
For a one-dimensional function f(x), the trapezoidal rule and Simpson's rule evaluate the areas under the approximating linear and quadratic interpolation functions, respectively. The corresponding formulas are
$$ \text{Trapezoidal rule:}\quad \int_{x_0}^{x_1} f(x)\,dx = h\left[\tfrac{1}{2} f(x_0) + \tfrac{1}{2} f(x_1)\right] + O(h^3 f'') $$
$$ \text{Simpson's rule:}\quad \int_{x_0}^{x_2} f(x)\,dx = h\left[\tfrac{1}{3} f(x_0) + \tfrac{4}{3} f(x_1) + \tfrac{1}{3} f(x_2)\right] + O(h^5 f^{(4)}) $$   Eq. 3•1
where O( ) indicates the order of the error as a function of the size "h" and the derivatives of f(x). The two methods are illustrated in Figure 3•1. Formulas that do not require evaluating the function at the end-points ("open-type" formulas) also exist; e.g.,
$$ \int_{x_0}^{x_5} f(x)\,dx = h\left[\tfrac{55}{24} f(x_1) + \tfrac{5}{24} f(x_2) + \tfrac{5}{24} f(x_3) + \tfrac{55}{24} f(x_4)\right] + O(h^5 f^{(4)}) $$   Eq. 3•2
We notice that f(x0) and f(x5) are not in Eq. 3•2. So, if a singularity of f(x) is present at either end-point, it can be avoided. Repeatedly using the trapezoidal rule or Simpson's rule of Eq. 3•1 on many smaller segments yields the "extended-type" formula
$$ \sum_{i=0}^{N} h\,c_i\,f(x_i) $$   Eq. 3•4
where ci is an array of coefficients in Eq. 3•1, Eq. 3•2, and Eq. 3•3. They all use equally spaced intervals (h = xi+1 - xi), and f(xi) are evaluated at positions xi = x0 + i × h. For example, an integration problem
$$ \int_1^2 (x^2 - 2x + 1)\,dx $$   Eq. 3•5
using extended Simpson’s rule can be written in C++ with VectorSpace C++ Library as (see project:
“extended_simpson”)
The class "Quadrature" takes arguments of (1) an array of coefficients, (2) the starting and end points, and (3) the number of integration points. VectorSpace C++ Library uses the integration operator "H0::operator | (const J&)", where another new class "J", the Jacobian, may take the size of an integration segment as its argument. The reason for using class "J" is to be compatible with more complicated integration domains, and will become more evident later. For now, in view of Eq. 3•4, we define a referential coordinate ξ (with domain [0, 1]) for every integration segment. Hence, the length of each referential integration segment is |ξ| = 1. The actual integration segment domain [xi, xi+1] is mapped with a linear coordinate transformation rule x = x(ξ) = (1-ξ) xi + ξ xi+1. The Jacobian for the segment is J = dx/dξ = (xi+1-xi) / |ξ| = h, the length of the segment per se. For the various quadrature rules discussed above, only the values of the coefficient array need to be defined differently.
Higher order approximation can, in general, be obtained with a greater number of function evaluations, which, unfortunately, increases the computational cost. A significant computational improvement can be achieved if we reduce the number of function evaluations while maintaining a higher order of approximation. The idea of Gaussian quadrature is to choose not only the values of ci but also the locations of the function evaluations (i.e., the integration point coordinates). Therefore, we have twice the degrees of freedom to improve the order of approximation. For example, we seek a two-point integration rule that exactly integrates a general cubic polynomial. Let
$$ f(\xi) = \alpha_0 + \alpha_1 \xi + \alpha_2 \xi^2 + \alpha_3 \xi^3 $$   Eq. 3•6
where αi are the coefficients of the cubic function f(ξ). First we normalize the integration domain to [-1, 1], and assume that the integration points are spaced symmetrically and weighted equally. Therefore, W0 = W1 and ξ0 = -ξ1. Integration of Eq. 3•6 over the interval [-1, 1] gives
$$ \int_{-1}^{1} f(\xi)\,d\xi = 2\alpha_0 + \tfrac{2}{3}\alpha_2 $$   Eq. 3•7
The two-point Gaussian quadrature with weighting coefficients Wi at integration points ξi (i = 0, 1) gives
$$ \int_{-1}^{1} f(\xi)\,d\xi \cong W_0 f(\xi_0) + W_1 f(\xi_1) = W_0\,\big(f(-\xi_1) + f(\xi_1)\big) = 2 W_0\,(\alpha_0 + \alpha_2 \xi_1^2) $$   Eq. 3•8
From the right-hand sides of Eq. 3•7 and Eq. 3•8, we have
$$ 2\alpha_0 + \tfrac{2}{3}\alpha_2 = 2 W_0\,(\alpha_0 + \alpha_2 \xi_1^2) $$   Eq. 3•9
Eq. 3•9 must hold for arbitrary values of α0 and α2. Therefore, we obtain weightings W0 = W1 = 1 and integration point coordinates ξ0 = -ξ1 = -1/√3. Tabulated values of Wi and ξi with more integration points can be found in Stroud and Secrest1. Gaussian integration can be applied to integration domains other than the normalized interval [-1, 1]. For example, for an actual integration domain of [1, 2], we can define a linear interpolation function f(x) = f(ξ), with the natural coordinate ξ (Gaussian integration domain [-1, 1]), as
$$ f(\xi) = \tfrac{1}{2}(1-\xi)\,f(x_0) + \tfrac{1}{2}(1+\xi)\,f(x_1) $$   Eq. 3•10
where we can check f(−1) = f(x0) and f(1) = f(x1). We can also define a similar linear coordinate transformation
$$ x(\xi) = \tfrac{1}{2}(1-\xi)\,x_0 + \tfrac{1}{2}(1+\xi)\,x_1 $$   Eq. 3•11
We note that the forms of the interpolation function and the coordinate transformation do not have to be the same, as they are in this example. Again, we can check that x(−1) = x0 at the starting point and x(1) = x1 at the end point. Then, without loss of generality for a multi-dimensional case, we write
$$ \int_\Omega f(x)\,dx \equiv \int_\Omega f(x)\,\det\!\left(\frac{\partial x}{\partial \xi}\right) d\xi = \int_\Omega f(x)\,J\,d\xi \cong \sum_{i=0}^{N} W_i\,f(x_i)\,J_i $$   Eq. 3•12
1. A.H. Stroud and D. Secrest, 1966, “Gaussian quadrature formulas”, Prentice-Hall, Englewood Cliffs, N.J.
$$ \int_1^2 (x-1)^2\,dx $$
using Gauss quadrature, can be re-written in C++ with VectorSpace C++ Library as (project:
“linear_coordinate_transformation”)
Gaussian quadrature is the default integration method in VectorSpace C++ Library, because of its higher order of approximation. The database for both the weighting coefficients and the integration points of the Gaussian quadrature is hidden from end-users. The users, however, need to input the number of dimensions and the number of integration points in the constructor of class "Quadrature". In the present example, since the coordinate transformation is linear, the Jacobian is constant everywhere along the integration line; i.e., J = (x1-x0)/2, where "2" is the length of the integration domain, since the natural coordinate "ξ" is defined in the interval [-1, 1]. The result of the integration is 0.333334. A quadratic coordinate transformation case is presented later. We note that the restriction of constant values for the Jacobian can be relaxed when we introduce the H1 type later.
We have seen various integration methods, and we have inferred the basic elements in object-oriented modeling of an integrable class. Now we are ready to deduce the data abstraction for constructing classes of the H0 type. H0 type objects are a generalization of C0 type objects with integration capability. As with C0 type objects, H0 type objects are subdivided into primary objects and utility objects. The primary objects include Integrable_Scalar, Integrable_Vector, and Integrable_Matrix. These objects deserve detailed description. For the utility objects, Integrable_Subvector and Integrable_Submatrix, it suffices to say that they mostly conform to their C0 counterparts.
3.1.2 Integrable_Scalar
An Integrable_Scalar, “H0::H0(const Quadrature&)”, contains a reference to a Quadrature class instance,
and a pointer array of “C0 *u”. The dual abstraction, discussed on page 97, is also used to model the high-level
Constructors
Two examples of using the variable dedicated constructor for H0 type Integrable_Scalar have been shown on page 167 and page 169 for Simpson's rule and Gaussian quadrature, respectively. We show a few more non-trivial examples in the following. Consider a diffusion problem, e.g., heat conduction or chemical diffusion, in the form of a differential equation1
$$ -\frac{d^2 u}{dx^2} = f(x), \qquad 0 < x < 1, \qquad u(0) = \alpha, \quad u(1) = \beta $$   Eq. 3•13
where f(x) is the source term. The solution to Eq. 3•13 can be expressed in integral form as
$$ u(x) = \int_0^1 g(x, \xi)\,f(\xi)\,d\xi + (1 - x)\,\alpha + x\,\beta $$   Eq. 3•14
where g(x, ξ) is Green's function. The physical interpretation of the Green's function is that g(x, ξ) is the temperature (or concentration) sampled at x when a unit concentrated point source is located at ξ. Therefore, g(x, ξ) satisfies Eq. 3•13; i.e., with g(x, ξ) in place of u(x) in the differential equation. We also require g(x, ξ) to be continuous at the location x = ξ. The net flux of the infinitesimal control line segment at x = ξ equals the source intensity; i.e., g'(ξ+, ξ) - g'(ξ-, ξ) = -1, which is also known as the jump condition. From the above conditions, the Green's function can be solved as
$$ g(x, \xi) = \begin{cases} x\,(1 - \xi), & 0 < x < \xi \\ \xi\,(1 - x), & \xi < x < 1 \end{cases} $$   Eq. 3•15
Substituting Eq. 3•15 into Eq. 3•14 gives
$$ u(x) = (1 - x) \int_0^x \xi\,f(\xi)\,d\xi + x \int_x^1 (1 - \xi)\,f(\xi)\,d\xi + (1 - x)\,\alpha + x\,\beta $$   Eq. 3•16
1. p. 42 in I. Stakgold, 1979, “Green’s function and boundary value problems”, John Wiley & Sons, New York.
For a specific case with source distribution f(x) = sin(πx) and homogeneous boundary conditions α = β = 0, we can compute point values of the solution u(x) at an interval of "h = 0.1". Program Listing 3•1 implements the solution using Eq. 3•16. The analytical solution corresponding to this source distribution is u = (1/π²) sin(πx), which is compared with the computed result of the integral equation. The two differ only after the sixth digit after the decimal point (see TABLE 3.2).
#include "include/vs.h"
int main() {
double const PI = 3.141592654;
double const alpha = 0.0; double const beta = 0.0; // α = β = 0
double x = 0.0,
// extended Simpson's rule
w[11] = {1.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
4.0/3.0, 1.0/3.0};
for(int i = 0; i < 11; i++) {
Quadrature q1(w, 0.0, x, 11), q2(w, x, 1.0, 11);
H0 z1(q1), z2(q2),
f_1 = sin(PI*z1), f_2 = sin(PI*z2); // f(x) = sin πx
C0 integ_1, integ_2;
// (1 – x) ∫[0,x] ξ f(ξ) dξ + x ∫[x,1] (1 – ξ) f(ξ) dξ
if(i !=0) integ_1 &= (1-x)*( (z1*f_1) | J(x/10.0) ); else integ_1 &= C0(0.0);
if(i !=10) integ_2 &= x*( ((1-z2)*f_2) | J((1-x)/10.0) ); else integ_2 &= C0(0.0);
// with boundary terms (1-x) α + x β
double u = (double)(integ_1+integ_2) + (1-x)*alpha+x*beta;
cout << "u(" << x << "): " << u << endl;
if(i != 10) x += 0.1;
}
return 0;
}
Listing 3•1 Solving diffusion problem using integral expression (project: “green_diffusion_equation”).
$$ u(x) = \int_0^1 g(x, \xi)\,f(\xi)\,d\xi $$   Eq. 3•18
The Green's function should satisfy Eq. 3•17 and should be continuous at x = ξ. The jump condition is k(ξ)[g'(ξ+, ξ) - g'(ξ-, ξ)] = -1. These conditions lead to the Green's function for this problem as
$$ g(x, \xi) = \begin{cases} \displaystyle\int_0^{x} \frac{1}{k(y)}\,dy, & 0 < x < \xi \\[1ex] \displaystyle\int_0^{\xi} \frac{1}{k(y)}\,dy, & \xi < x < 1 \end{cases} $$   Eq. 3•19
Substituting Eq. 3•19 into Eq. 3•18, for the case of k(x) = (1+x) and f(x) = x, gives
$$ u(x) = \int_0^x \ln(1 + \xi)\,f(\xi)\,d\xi + \ln(1 + x) \int_x^1 f(\xi)\,d\xi $$   Eq. 3•20
Program Listing 3•2 implements Eq. 3•20; it is the code of Program Listing 3•1 with only very slight modifications. The results of Program Listing 3•2 are listed in TABLE 3.2 for comparison with the analytical solution. Only the last three points, in the interval [0.8, 1.0], have errors of 1.e-6. We emphasize that Eq. 3•16 and Eq. 3•20 are much more complicated than the corresponding analytical solutions. However, analytical solutions are only possible when the given f(x) and k(x) happen to give an analytically solvable differential equation, while the Green's function method is quite general for less restricted forms of f(x) and k(x). This concludes the example for the Integrable_Scalar object of H0 type.
#include "include/vs.h"
int main() {
double x = 0.0,
// extended Simpson's rule
w[11] = {1.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
4.0/3.0, 1.0/3.0};
for(int i = 0; i < 11; i++) {
Quadrature q1(w, 0.0, x, 11), q2(w, x, 1.0, 11);
H0 z1(q1), z2(q2);
C0 integ_1, integ_2; // f(x) = x, k(x) = 1+x
// ∫[0,x] ln(1 + ξ) f(ξ) dξ + ln(1 + x) ∫[x,1] f(ξ) dξ
if(i !=0) integ_1 &= ( (z1*log(1+z1)) | J(x/10.0) ); else integ_1 &= C0(0.0);
if(i !=10) integ_2 &= log(1+x)*( z2 | J((1-x)/10.0) ); else integ_2 &= C0(0.0);
double u = (double)(integ_1+integ_2) ;
cout << "u(" << x << "): " << u << endl;
if(i != 10) x += 0.1;
}
return 0;
}
Listing 3•2 Solving diffusion problem with k(x) = 1+x and f(x) = x using integral expression (target:
"green_diffusion_variable_conductivity").
by reference
  "H0&"                                     H0 type Integrable_Scalar object                      1
  "H0*"                                     a pointer to H0 type Integrable_Scalar object         2
  "double*, const Quadrature&, int, int"    double pointer, Quadrature, m_row_size, m_col_size    3
by value
  "const Quadrature&"                       Quadrature                                            4
  "const H0&"                               H0 type Integrable_Scalar object                      5
  "const H0*"                               pointer to H0 type Integrable_Scalar object           6
Strings in H0 virtual constructor for Integrable_Scalar object.
The rest of the operators and functions are listed in the following box. They mostly conform to the operators and functions of the Scalar object. Promotion of a C0 type object to an H0 type object by binary operators is common practice, just as in the standard C++ language. For example,
H0 x;
C0 y;
H0 z1 = x+y; // invoke H0::operator +(const C0&);
H0 z2 = y+x; // invoke H0's friend operator +(const C0&, const H0&)
The second operator invoked, "operator +(const C0&, const H0&)", is a binary operator which is declared a friend function (operator) of the H0 class. These operators, needed for the promotion operation, are not listed in the box for simplicity.
symbolic operators
H0& operator &= ( ) assignment by reference
H0& operator = ( ) assignment by value
H0 operator & ( ) const column concatenation
H0 operator && () const one-by-one column concatenation
H0 operator | ( ) const row concatenation
H0 operator || () const one-by-one row concatenation
arithmetic operators
H0 operator + ( ) const positive unary
H0 operator - ( ) const negative unary
H0 operator + (const H0&) const addition
H0 operator - (const H0&) const subtraction
H0 operator * (const H0&) const multiplication
H0 operator / (const H0&) const division
H0& operator += (const H0&) replacement addition
H0& operator -= (const H0&) replacement subtraction
H0& operator *= (const H0&) replacement multiplication
H0& operator /= (const H0&) replacement division
logic operators
int operator == (const H0&) const equal TRUE == 1
int operator != (const H0&) const not equal FALSE == 0
int operator >= (const H0&) const greater or equal
int operator <= (const H0&) const less or equal
int operator > (const H0&) const greater
int operator < (const H0&) const less
functions
H0 pow(int) const power
H0 sqrt(const C0&) const square root
H0 exp(const C0&) const exponent
H0 log(const C0&) const log
H0 sin(const C0&) const sin
H0 cos(const C0&) const cos
Partial listing of H0 type Integrable_Scalar class arithmetic operators, logic operators and functions.
Constructors
A dedicated constructor of an Integrable_Vector, “H0::H0(int, double*, const Quadrature&)”, contains a ref-
erence to a Quadrature instance, and a pointer array of “C0 *u”. The dual abstraction is used. The pointer array
of C0, “u”, is referring to a “double *v”.
$$ x(\xi) = \tfrac{1}{2}(\xi - 1)\,\xi\,x_0 + (1 - \xi^2)\,x_1 + \tfrac{1}{2}(\xi + 1)\,\xi\,x_2 $$   Eq. 3•22
We mentioned in the previous section on Integrable_Scalar that, since we have not yet introduced H1 type objects, we need a linear coordinate transformation to have a constant Jacobian throughout the integration domain. We can make Eq. 3•22 consistent with a linear coordinate transformation rule by enforcing
x1 = (x0+x2)/2
i.e., x1 has to be the midpoint of the segment. Substituting this relation into Eq. 3•22 to eliminate x1 yields
$$ x(\xi) = \tfrac{1}{2}(1 - \xi)\,x_0 + \tfrac{1}{2}(1 + \xi)\,x_2 $$
which is exactly the linear coordinate transformation rule of Eq. 3•11. Therefore, we have J = (2-1)/2 = 0.5 as used in the above example.
We show an example of a 2-D problem using Gaussian quadrature. Again, without H1 type objects for the differentiation operation, we restrict the integration domain to be a square or a rhombic region in order to have a constant Jacobian everywhere. In this example, a set of bilinear interpolation functions is used
$$ f(\xi, \eta) = \tfrac{1}{4}(1-\xi)(1-\eta)\,f(x_0) + \tfrac{1}{4}(1+\xi)(1-\eta)\,f(x_1) + \tfrac{1}{4}(1+\xi)(1+\eta)\,f(x_2) + \tfrac{1}{4}(1-\xi)(1+\eta)\,f(x_3) $$   Eq. 3•23
Assume a plane as shown in Figure 3•2, with f(x0) = 1, f(x1) = 2, f(x2) = 3, and f(x3) = 2, where x0 = (0, 0), x1 = (1, 0), x2 = (1, 1), and x3 = (0, 1). The constant Jacobian of the problem is J = 1/4, since the area of the referential domain, –1 ≤ ξ ≤ 1 and –1 ≤ η ≤ 1, is "4" (project: "integration_2d").
Figure 3•2 Integration of a plane surface with a square integration domain Ω; nodes x0 = (0,0), x1 = (1,0), x2 = (1,1), x3 = (0,1) with nodal values f(x0) = 1, f(x1) = 2, f(x2) = 3, f(x3) = 2.
by reference
  "H0&"                                          H0 type Integrable_Vector                                      -
  "H0*"                                          a pointer to H0 type Integrable_Vector                         -
  "int, double*, const Quadrature&, int, int"    length, double* != 0, Quadrature, m_row_size, m_col_size       7
by value
  "int, const Quadrature&"                       length, Quadrature                                             8
  "const H0*"                                    H0*                                                            -
  "int, H0&, int, const Quadrature"              length, H0, starting index, Quadrature                         9
                                                 (the only one for reference Integrable_Vector)
Strings in H0 virtual constructor for H0 type Integrable_Vector class.
symbolic operators
H0& operator &= ( ) assignment by reference
H0& operator = ( ) assignment by value
H0& operator [] (int) selector return scalar
H0 operator & ( ) const column concatenation
H0 operator && () const one-by-one column concatenation
H0 operator | ( ) const row concatenation return matrix
H0 operator || ( ) const one-by-one row concatenation return matrix
arithmetic operators
H0 operator ~ ( ) const transposed (into a “row vector”) return matrix
H0 operator + ( ) const positive (primary casting) unary
H0 operator - ( ) const negative unary
H0 operator + (const H0&) const addition
H0 operator - (const H0&) const subtraction
H0 operator * (const H0&) const multiplication by a scalar; scalar product of two vectors
H0 operator %(const H0&) const tensor product of two vectors
H0 operator / (const H0&) const division (by a scalar or a matrix only)
H0& operator += (const H0&) replacement addition
H0& operator -= (const H0&) replacement subtraction
H0& operator *= (const H0&) replacement multiplication (by a scalar only)
H0& operator /= (const H0&) replacement division (by a scalar only)
logic operators
int operator == (const H0&) const equal TRUE == 1
int operator != (const H0&) const not equal FALSE == 0
int operator >= (const H0&) const greater or equal
int operator <= (const H0&) const less or equal
int operator > (const H0&) const greater
int operator < (const H0&) const less
functions
int length() const length of the Integrable_Vector
double norm(int = 2) const 1-norm or 2-norm
double norm(const char*) const infinite-norm takes strings “infinity”, or “maximum”
H0 pow(int) const power (applied to each element of the Integrable_Vector)
H0 sqrt(const H0&) const square root(applied to each element of the Integrable_Vector)
H0 exp(const H0&) const exponent (applied to each element of the Integrable_Vector)
H0 log(const H0&) const log (applied to each element of the Integrable_Vector)
H0 sin(const H0&) const sin (applied to each element of the Integrable_Vector)
H0 cos(const H0&) const cos (applied to each element of the Integrable_Vector)
Partial listing of Integrable_Vector object arithmetic operators, logic operators and functions.
Constructors
A dedicated constructor of an Integrable_Matrix, “H0::H0(int, int, double*, const Quadrature&)”, contains a
reference to a Quadrature instance, and a pointer array of “C0 *u” (pointer to C0 type Matrix object). The dual
abstraction is also used. The pointer array of C0, “u”, is referring to a “double *v”.
by reference
  "H0&"                                                    H0 type Matrix                                                                        -
  "H0*"                                                    a pointer to H0 type Matrix                                                           -
  "int, int, double*, const Quadrature&, int, int"         row-length, column-length, double* != 0, memory-row-length, memory-column-length      10
by value
  "int, int, double*, const Quadrature&, int, int"         row-length, column-length, double* = 0, memory-row-length, memory-column-length       11
  "int, int, const Quadrature&"                            row-length, column-length                                                             12
  "int, int, const double*, const Quadrature&, int, int"   row-length, column-length, double*, memory-row-length, memory-column-length           13
  "int, int, H0&, int, int, const Quadrature&"             row-length, column-length, H0&, starting row-index, starting column-index             14
                                                           (the only one for reference Integrable_Matrix)
Strings in H0 virtual constructor for H0 type Integrable_Matrix class.
functions
H0 pow(int) const power (applied to each element of the Matrix)
H0 sqrt(const H0&) const square root (applied to each element of the Matrix)
H0 exp(const H0&) const exponent (applied to each element of the Matrix)
H0 log(const H0&) const log (applied to each element of the Matrix)
H0 sin(const H0&) const sin (applied to each element of the Matrix)
H0 cos(const H0&) const cos (applied to each element of the Matrix)
symbolic operators
H0& operator &= ( ) assignment by reference
H0& operator = ( ) assignment by value
H0& operator [ ] (int) row selector
H0& operator( )(int) column selector
H0& operator ( )(int, int) element selector
H0 operator & ( ) const column concatenation
H0 operator && () const one-by-one column concatenation
H0 operator | ( ) const row concatenation
H0 operator || ( ) const one-by-one row concatenation
arithmetic operators
H0 operator + ( ) const positive (primary casting) unary
H0 operator - ( ) const negative unary
H0 operator + (const C0&) const addition
H0 operator - (const C0&) const subtraction
H0 operator * (const C0&) const multiplication
H0 operator / (const C0&) const division (by Integrable_Scalar or Integrable_Matrix only)
H0& operator += (const C0&) replacement addition
H0& operator -= (const C0&) replacement subtraction
H0& operator *= (const C0&) replacement multiplication (by an Integrable_Scalar only)
H0& operator /= (const C0&) replacement division (by an Integrable_Scalar only)
logic operators
int operator == (const H0&) const equal TRUE == 1
int operator != (const H0&) const not equal FALSE == 0
int operator >= (const H0&) const greater or equal
int operator <= (const H0&) const less or equal
int operator > (const H0&) const greater
int operator < (const H0&) const less
functions
int row_length() const row-length of the Integrable_Matrix
int col_length() const column-length of the Integrable_Matrix
H0 norm(int) const 1 (maximum column-sum)-norm or 2 (spectral)-norm
H0 norm(const char*) const “infinity” (max row-sum), “Forbenisu” (Frobenius norm)
Partial listing of H0 type Integrable_Matrix class arithmetic operators, logic operators and functions.
Constructors
The dedicated constructors for an Integrable_Tangent_Bundle are
For the dedicated constructor, the size of the vector equals the number of the spatial dimension.
The constant strings used for the virtual constructors and autonomous virtual constructor are listed in the following two boxes. The macro definitions used for the virtual constructors of the Integrable_Tangent_Bundle and the Integrable_Vector_of_Tangent_Bundle are
respectively. And, H1 x = H1(const char*, ...) is used for autonomous virtual constructors.
by reference
“H1&” H1 type Matrix 1
“H1*” a pointer to H1 type Matrix 2
“double*, double*, Quadrature” v, dv, 7
“double*, double*, int, Quadrature”v, dv, spatial dimension, 8
by value
“Quadrature”, 3
“int, Quadrature” memory-row-length, memory-column-length 4
“const double&, const double&, v, dv, (for spatial dimension = 1 only) 5
const Quadrature&”
“const double*, const double*, v, dv, 6
int, Quadrature&” spatial dimension
“const H0&, const H0&” base point, tangent 9
“const H0*, const H0*” base point, tangent 10
“const H1&” Integrable_Tangent_Bundle 11
“const H1*” Integrable_Tangent_Bundle* 12
Strings in H1 virtual constructor for H1 type Integrable_Tangent_Bundle class.
by reference
“H1&” H1 type Matrix —
“H1*” a pointer to H1 type Matrix —
“int, int, double*, double*, vector size, spatial dimension, v, dv, 14
Quadrature, int, int” quadrature, memory row size, and column size
by value
“int, int, Quadrature”, vector size, spatial dimension 13
“const H0&, const H0&” base point , tangent —
“const H0*, const H0*” base point, tangent —
“int, const H1*” vector size, Integrable_Tangent_Bundle* 15
“const H1&” Integrable_Vector_of_Tangent_Bundle —
“const H1*” Integrable_Vector_of_Tangent_Bundle* —
Strings in H1 virtual constructor for H1 type Integrable_Vector_of_Tangent_Bundle class.
1-D Integration: We seek the integration of the example shown on page 176. That is
$$ \int_{x_0}^{x_2} (x - 1)^2\,dx = \left[\, x - x^2 + \frac{x^3}{3} \,\right]_{x_0}^{x_2} $$   Eq. 3•24
$$ J = \frac{dx(\xi)}{d\xi} $$   Eq. 3•25
This numerical integration problem can be coded with VectorSpace C++ Library as (project: “integration_1d”)
double x[3] = {1.0, 1.5, 3.0}; // x=[1, 3], analytical value from Eq. 3•24 is 8/3
Quadrature qp(1,3);
Now the restriction of requiring the three nodes to be equally spaced is lifted. The Jacobian is simply J(d(X)), as in the above program.
Line Integration in 2-D: Consider an example of line integration in a two-dimensional space. We first look at the arc length method of calculus. We seek the length of the circular arc in the first quadrant (see the right-hand-side of Figure 3•4)
$$ x^2 + y^2 = 1, \qquad f(x) \equiv y = \sqrt{1 - x^2} $$   Eq. 3•26
The arc length, "s", of a function f(x) is (see the middle of Figure 3•4)
$$ s \cong \sum_i \sqrt{(\Delta x_i)^2 + \left[ f(x_{i+1}) - f(x_i) \right]^2} $$   Eq. 3•27
by applying the Pythagorean theorem. For a differentiable function f(x), at the limit ∆x → 0 we have the arc-length formula
$$ s = \int \sqrt{1 + f'(x)^2}\,dx $$   Eq. 3•28
Figure 3•4 Use arcs to approximate the length of a curve (left: curve y = f(x) with arc length s; middle: segment with run ∆x and rise f(xi+1)-f(xi); right: circle x² + y² = r² of radius r).
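$$ x(\xi) = \tfrac{1}{2}(\xi - 1)\,\xi\,x_0 + (1 - \xi^2)\,x_1 + \tfrac{1}{2}(\xi + 1)\,\xi\,x_2 $$
$$ y(\xi) = \tfrac{1}{2}(\xi - 1)\,\xi\,y_0 + (1 - \xi^2)\,y_1 + \tfrac{1}{2}(\xi + 1)\,\xi\,y_2 $$   Eq. 3•29
where ξ is the parameter for the coordinate transformation of the x and y coordinates. An infinitesimal length of the curve dr can be obtained as (see Figure 3•5)
$$ dr = \sqrt{\left(\frac{dx}{d\xi}\right)^2 + \left(\frac{dy}{d\xi}\right)^2}\; d\xi $$   Eq. 3•30
where
$$ \sqrt{\left(\frac{dx}{d\xi}\right)^2 + \left(\frac{dy}{d\xi}\right)^2} = \left\| \frac{d\mathbf{r}}{d\xi} \right\|_2 \equiv \| J \|_2 $$   Eq. 3•31
That is, the integrand is the Euclidean norm of the Jacobian of the coordinate transformation rule. Therefore, we have
$$ dr = \| J \|_2\, d\xi $$   Eq. 3•32
This integration formula is written in a form consistent with the coordinate transformation method.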
Program Listing 3•3 implements the arc length method with Eq. 3•26 and Eq. 3•28, and the coordinate trans-
formation method with Eq. 3•29 and Eq. 3•30. The program computes the line integration using arc length
method, if the macro definition “__ARC_LENGTH” is defined at the compile time. Otherwise, the default
H1 sqrt(const H1& a) {
return INTEGRABLE_TANGENT_BUNDLE(
"const H0&, const H0&", sqrt((H0)a), d(a)/2.0/sqrt((H0)a) );
}
#include "include\vs.h"
#define NODE_NO 5
#if defined(__ARC_LENGTH)
H1 sqrt(const H1& a) {
return INTEGRABLE_TANGENT_BUNDLE("const H0&, const H0&",
sqrt((H0)a), d(a)/2.0/sqrt((H0)a));
}
#endif
int main() {
const double r = 1.0; const double PI = 3.141592654;
double X[NODE_NO][2]; // nodal coordinates (xi, yi)
for(int i = 0; i < NODE_NO; i++) {
X[i][0] = r*cos(((double)(i+1))*PI/(2.0*(NODE_NO-1)));
#ifndef __ARC_LENGTH
X[i][1] = r*sin(((double)(i+1))*PI/(2.0*(NODE_NO-1)));
#endif
}
Quadrature qp(1, 4);
// N0(ξ) = (ξ-1)ξ/2, N1(ξ) = (1-ξ^2), N2(ξ) = (ξ+1)ξ/2
H1 zai(qp),
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
"int, int, Quadrature", 3, 1, qp),
x, y;
N[0] = -zai*(1-zai)/2; N[1] = (1+zai)*(1-zai); N[2] = zai*(1+zai)/2;
C0 length = 0.0;
for(int i = 0; i < (NODE_NO-1)/2; i++) {
// coordinate transformation rule: x(ξ) = Ni(ξ) xi, y(ξ) = Ni(ξ) yi
x = N[0]*X[i+2][0]+N[1]*X[i+1][0]+N[2]*X[i][0];
#if defined(__ARC_LENGTH)
// arc length method: s = ∫ sqrt(1 + f'(x)^2) dx
y = sqrt(pow(r,2)-x.pow(2));
length += (sqrt(1.0+(d(y)/d(x)).pow(2)) | J(d(x)));
#else
// integration of the Jacobian of the transformation rule:
// dr = sqrt((dx/dξ)^2 + (dy/dξ)^2) dξ = ||J||_2 dξ
y = N[0]*X[i+2][1]+N[1]*X[i+1][1]+N[2]*X[i][1];
length += (sqrt(d(x).pow(2)+d(y).pow(2))) | J(1.0);
#endif
}
cout << length << endl;
return 0;
}
Listing 3•3 Line integration using the arc length method or the coordinate transformation method (project:
“line_integration_2d”).
Number of Nodes    Arc Length    Coordinate Transformation
3                  1.57077       1.56245
5                  1.57079       1.57020
7                  1.57079       1.57068
9                  1.57079       1.57076
TABLE 3.6. Line integration using 3 to 9 nodes.
Surface and Volume Integration in 3-D: Next we consider an example of numerically integrating the volume and the
surface area of a sphere with unit radius. The analytical value of the volume of a sphere is (4/3)πr³ ≅
4.188790203 (with r = 1), and the surface area is 4πr² ≅ 12.56637061 (with r = 1). The nodal coordinates
of any point on the sphere can be computed using spherical (polar) coordinates as
$x_0 = r \sin\phi \cos\theta, \quad x_1 = r \sin\phi \sin\theta, \quad z = r \cos\phi$
where r is the radius of the sphere, φ (latitude) is the angle between the point and the Z-axis, and θ (colatitude) is
the angle obtained by first projecting the point onto the x0-x1 plane and then measuring the angle between the
projected point and the polar axis on the x0-x1 plane.
We use the 9-node Lagrangian element in Figure 3•7, which is common in the finite element method. The case
presented here, with the interpolation functions the same as the functions used for the coordinate transformation, is
known as an isoparametric element. The algorithm presented here, however, is applicable to elements with 4 to 9 nodes.
With this algorithm, any of the fifth to the ninth nodes can be added or omitted.
Figure 3•6 Spherical (polar) coordinates and formula for computing the volume and
surface area of one-eighth of a sphere with unit radius. (Formulas shown in the figure: $V = \int_x z(\hat{\xi})\,dx$ and $A = \int_x \sqrt{1 + (\partial z/\partial x_0)^2 + (\partial z/\partial x_1)^2}\,dx$.)
Figure 3•7 Coordinate transformation rule x(ξ, η) of a 9-node Lagrangian element.
Then, (2) modify four corner nodes due to the presence of the center node
Step 3: For each of the four edge nodes (i = 4, 5, 6, 7), do the following. If the edge node is present,
(1) add the edge node according to the corresponding interpolation function
Then, (2) modify the four corner nodes according to the presence of the four edge nodes
Figure 3•8 Finite element discretization of a quarter of a circle.
The order of the node numbers (left-hand side of Figure 3•7) and the interpolation functions defined above
are both accepted standards in the finite element method. A quarter of a circle is taken as a whole element or
subdivided into three quadrilateral elements as in Figure 3•8.
The coordinate transformation rule for a nodal point x = (x0, x1) is now
$x(\xi, \eta) = N \cdot x$    Eq. 3•38
where the components of the vector N are the shape functions defined in Eq. 3•33 to Eq. 3•37, and x = (x0, x1) are
the nodal coordinates. The interpolation function for the z-coordinate of a point on the surface of a sphere can be
expressed similarly as
$z(\xi, \eta) = N \cdot z$    Eq. 3•39
where z is the height of the spherical surface above a node. For geometrical entities that are simple enough, as in
this case with the unit sphere, their algebraic expressions can be found in analytical geometry. Instead of Eq. 3•39,
the height of the unit sphere can be written as
$z(\xi, \eta) = \sqrt{1 - (x_0)^2 - (x_1)^2}$    Eq. 3•40
The volume of one-eighth of a sphere can be obtained by
$V = \int_x z(\hat{\xi})\,dx$    Eq. 3•41
where ξ̂ is the vector {ξ, η}T. Either the isoparametric interpolation of Eq. 3•39 or the analytical expression of Eq.
3•40 can be used in Eq. 3•41. The three-dimensional surface area over a two-dimensional integration domain has a
formula that is a multi-dimensional generalization of the arc length method shown in Eq. 3•28.
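$\text{surface area} = \int_x \sqrt{1 + \left(\frac{\partial z}{\partial x_0}\right)^2 + \left(\frac{\partial z}{\partial x_1}\right)^2}\, dx$    Eq. 3•42
$\frac{dz}{dx} \equiv \begin{bmatrix} \partial z/\partial x_0 \\ \partial z/\partial x_1 \end{bmatrix} = \frac{dz}{d\hat{\xi}} \left( \frac{dx}{d\hat{\xi}} \right)^{-1}$    Eq. 3•43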
Program Listing 3•4 defines the element geometry as shown in Figure 3•8. Program Listing 3•5 is
the C++ main() program that implements the integration formulas for computing the volume and the surface area
using Eq. 3•41 or Eq. 3•42. If the macro “__THREE_ELEMENTS” is defined, the program discretizes one-quarter of
a circular domain into three nine-node quadrilateral elements; otherwise, only one element is used. If the macro
“__ANALYTICAL_GEOMETRY” is defined, Eq. 3•40 is used in place of Eq. 3•39. If the macro
“__SURFACE_AREA” is defined, the program computes the surface area instead of the volume. (Program Listing 3•4
and Program Listing 3•5 are in project: “volume_and_surface_area”.)
The results of this computation are shown in TABLE 3.7. The three-element discretization does improve the
accuracy of the integration compared to the one-element discretization.
Automatic Mesh Generation: In the above, we used the example of evaluating the volume and the surface area
of a unit sphere because analytical solutions are available for checking the accuracy
of the numerical integration. For simplicity, we keep using this example in the following. In a practical
application we might face problems with more complicated geometrical domains. Therefore, we need a more
sophisticated tool to deal with the geometrical complexity. We have used the coordinate transformation method
of the finite element method to evaluate the integration. The finite element method is a powerful tool in part because it
deals with complicated geometry. Therefore, we use some of the most primitive finite element objects from Chapter
4, which provide a systematic treatment to accomplish the refinement of the meshes. We use a function
“block” to automatically generate a mesh, with the interface as
The first argument Ωh refers to the discretized global domain, where Ω, by mathematical convention, is the domain of
integration, and the superscript “h” denotes the discretization of the actual domain Ω with a characteristic mesh
size of “h”. The “block” function generates nodes and elements for a set of control points (4 to 9 nodes). The
#include "include\vs.h"
double PI = 3.141592654, deg = PI/180.0, theta = 90.0, phi = 90.0;
void sphere(double* X, double& Z, double phi, double theta) {
X[0] = sin(phi*deg)*cos(theta*deg); X[1] = sin(phi*deg)*sin(theta*deg); Z = cos(phi*deg); }
void define_element() {
#if defined(__THREE_ELEMENTS)                       // three-element discretization
const int ELEMENT_NO = 3;
double X[ELEMENT_NO][9][2], Z[ELEMENT_NO][9];
// first element
sphere(X[0][0], Z[0][0], 0.0, 0.0);
sphere(X[0][1], Z[0][1], 1.0/2.0*phi, 0.0);
sphere(X[0][2], Z[0][2], 1.0/2.0*phi, 1.0/2.0*theta);
sphere(X[0][3], Z[0][3], 1.0/2.0*phi, theta);
sphere(X[0][4], Z[0][4], 1.0/4.0*phi, 0.0);
sphere(X[0][5], Z[0][5], 1.0/2.0*phi, 1.0/4.0*theta);
sphere(X[0][6], Z[0][6], 1.0/2.0*phi, 3.0/4.0*theta);
sphere(X[0][7], Z[0][7], 1.0/4.0*phi, theta);
sphere(X[0][8], Z[0][8], 1.0/4.0*phi, 1.0/2.0*theta);
// second element
sphere(X[1][0], Z[1][0], 1.0/2.0*phi, 0.0);
sphere(X[1][1], Z[1][1], phi, 0.0);
sphere(X[1][2], Z[1][2], phi, 1.0/2.0*theta);
sphere(X[1][3], Z[1][3], 1.0/2.0*phi, 1.0/2.0*theta);
sphere(X[1][4], Z[1][4], 3.0/4.0*phi, 0.0);
sphere(X[1][5], Z[1][5], phi, 1.0/4.0*theta);
sphere(X[1][6],Z[1][6],3.0/4.0*phi,1.0/2.0*theta);
sphere(X[1][7],Z[1][7],1.0/2.0*phi,1.0/4.0*theta);
sphere(X[1][8],Z[1][8],3.0/4.0*phi,1.0/4.0*theta);
// third element
sphere(X[2][0], Z[2][0], 1.0/2.0*phi, theta);
sphere(X[2][1], Z[2][1], 1.0/2.0*phi, 1.0/2.0*theta);
sphere(X[2][2], Z[2][2], phi, 1.0/2.0*theta);
sphere(X[2][3], Z[2][3], phi, theta);
sphere(X[2][4], Z[2][4], 1.0/2.0*phi, 3.0/4.0*theta);
sphere(X[2][5], Z[2][5], 3.0/4.0*phi, 1.0/2.0*theta);
sphere(X[2][6], Z[2][6], phi, 3.0/4.0*theta);
sphere(X[2][7], Z[2][7], 3.0/4.0*phi, theta);
sphere(X[2][8], Z[2][8], 3.0/4.0*phi, 3.0/4.0*theta);
#else
const int ELEMENT_NO = 1;                           // one-element discretization
double X[ELEMENT_NO][9][2], Z[ELEMENT_NO][9];
sphere(X[0][0], Z[0][0], 0.0, 0.0); sphere(X[0][1], Z[0][1], phi, 0.0);
sphere(X[0][2], Z[0][2], phi, 1.0/2.0*theta); sphere(X[0][3], Z[0][3], phi, theta);
sphere(X[0][4], Z[0][4], 1.0/2.0*phi, 0.0); sphere(X[0][5], Z[0][5], phi, 1.0/4.0*theta);
sphere(X[0][6], Z[0][6], phi, 3.0/4.0*theta); sphere(X[0][7], Z[0][7], 1.0/2.0*phi, theta);
sphere(X[0][8], Z[0][8], 1.0/2.0*phi, 1.0/2.0*theta);
#endif
}
Listing 3•4 Element discretization of two dimensional integration domain for a sphere with unit radius.
control points use linear or quadratic interpolation functions to map a referential domain to a physical domain
(see Figure 3•7). The second and third arguments, “ξ-node-no” and “η-node-no”, indicate the number of nodes
generated in each direction. The fourth argument, “control-node-no”, has a value of 4 to 9. The following
“control-node-flag” uses either “TRUE” (=1) or “FALSE” (=0) to indicate whether a node is to be used as a control node. The
pointer to a double array gives the nodal coordinates of the control nodes. This is followed by two integers that indicate
the first node number and the first element number to be generated. For example (see also Figure 3•9),
int main() {
   define_element();
   Quadrature qp(2, 9);
   H1 ZAI(2, (double*)0, qp),
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 9, 2, qp), Zai, Eta;
   Zai &= ZAI[0]; Eta &= ZAI[1];
   // 4 to 9 nodes algorithm
   // Step 1: four corner nodes
   N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
   N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
   // Step 2: center node and modification to four corner nodes
   N[8] = (1-Zai.pow(2))*(1-Eta.pow(2));
   N[0] -= N[8]/4; N[1] -= N[8]/4; N[2] -= N[8]/4; N[3] -= N[8]/4;
   // Step 3: four edge nodes and modification to other nodes
   N[4] = ((1-Zai.pow(2))*(1-Eta)-N[8])/2; N[5] = ((1-Eta.pow(2))*(1+Zai)-N[8])/2;
   N[6] = ((1-Zai.pow(2))*(1+Eta)-N[8])/2; N[7] = ((1-Eta.pow(2))*(1-Zai)-N[8])/2;
   N[0] -= (N[4]+N[7])/2; N[1] -= (N[4]+N[5])/2;
   N[2] -= (N[5]+N[6])/2; N[3] -= (N[6]+N[7])/2;
   C0 vol(0.0), area(0.0);
   for(int i = 0; i < ELEMENT_NO; i++) {
      C0 X_(9, 2, X[i][0]);
      H1 x = N*X_;                               // coordinate transformation x(ξ, η) = N • x
      C0 Z_(9, Z[i]);
#if defined(__SURFACE_AREA)                      // 1: surface area
      H1 z = N*Z_;                               // interpolation z(ξ, η) = N • z
      H0 dz_dx = d(z) * d(x).inverse(),
         da = sqrt((dz_dx[0]).pow(2)+(dz_dx[1]).pow(2)+1.0);
      area += da | J(d(x).det());                // area = ∫ sqrt(1 + (∂z/∂x0)^2 + (∂z/∂x1)^2) dx
   }
   cout << (8.0*90.0/theta*area) << endl;
#else                                            // 2: volume
#if defined(__ANALYTICAL_GEOMETRY)
      H0 z = sqrt(1-((H0)x[0]).pow(2)-((H0)x[1]).pow(2));   // z(ξ, η) = sqrt(1 - (x0)^2 - (x1)^2)
#else
      H0 z = ((H0)N)*Z_;                         // interpolation z(ξ, η) = N • z
#endif
      vol += z | J(d(x).det());                  // volume = ∫ z(ξ̂) dx
   }
   cout << (8.0*90.0/theta*vol) << endl;
#endif
   return 0;
}
Listing 3•5 Two dimensional integration domain for volume and surface area of a sphere with unit radius.
Figure 3•9 Use of “block()” function to generate elements and nodes
automatically. The control nodes are shown as open squares.
The function “sphere()” generates Cartesian coordinates on a circular domain (on the x0-x1 plane, see Figure 3•6)
given the latitude φ and colatitude θ of a unit sphere. This generates a 9-node Lagrangian element with four
corner nodes as control nodes (see element number “0” in Figure 3•9). The other two 9-node Lagrangian elements,
labeled as element number “1” and element number “2”, can be constructed as
The control node ordering is the same as that shown in Figure 3•7. For element number “1”, the total number
of control nodes is 6, and node number “4” is skipped by setting “control_node_flag[4] = FALSE;”. For
element number “2”, the total number of control nodes is 7, and both node numbers “4” and “5” are skipped by
setting “control_node_flag[4] = control_node_flag[5] = FALSE;”. In the main() program, the discretized domain
is declared as
Omega_h oh;
Program Listing 3•6 is the program that generates three 9-node Lagrangian elements. The geometrical information
on the nodes and elements is coded in the constructor of the discretized global domain,
“Omega_h::Omega_h()”. The actual instance of the discretized global domain is declared inside the main()
program. The project “block_volume_integral” can be used to generate either 9-node Lagrangian elements or
4-node quadrilateral elements. The 9-node Lagrangian elements are generated when the macro definition
“__LAGRANGAIN_9_NODES” is defined at compile time. Each side of the block can be divided into 2, 4, or 8
segments; the macro definitions “__TWO_SEGMENTS” and “__FOUR_SEGMENTS” select the first
two kinds of subdivision. The nodes and elements for these options are shown in Figure 3•10. The results of the
refined computation are shown in TABLE 3.8.
Figure 3•10 (panel labels: 3, 12, and 48 elements; 12, 48, and 192 elements).
#include "include\vs.h"
#include "include\dynamic_array.h"
#include "include\omega_h.h"
#include "include\block.h"
static const double PI = 3.14159265359; static const double deg = PI/180.0;
void sphere(double* X, double phi, double theta) {
   X[0] = sin(phi*deg)*cos(theta*deg); X[1] = sin(phi*deg)*sin(theta*deg); }
EP::element_pattern EP::ep = EP::LAGRANGIAN_9_NODES;          // 9-node Lagrangian element
Omega_h::Omega_h() {
   double X[9][2]; const double theta = 90.0; const double phi = 90.0;
   int control_node_flag[9] = {1, 1, 1, 1, 1, 1, 1, 1, 1};
   // block # 0; 4 control nodes
   sphere(X[0], 0.0, 0.0); sphere(X[1], 1.0/2.0*phi, 0.0);
   sphere(X[2], 1.0/2.0*phi, 1.0/2.0*theta); sphere(X[3], 1.0/2.0*phi, theta);
   block(this, 3, 3, 4, control_node_flag, X[0], 0, 0);
   // block # 1; 5 control nodes
   sphere(X[0], 1.0/2.0*phi, 0.0); sphere(X[1], phi, 0.0); sphere(X[2], phi, 1.0/2.0*theta);
   sphere(X[3], 1.0/2.0*phi, 1.0/2.0*theta); sphere(X[4], 3.0/4.0*phi, 0.0);
   sphere(X[5], phi, 1.0/4.0*theta); control_node_flag[4] = 0;
   block(this, 3, 3, 6, control_node_flag, X[0], 9, 1);
   // block # 2; 5 control nodes
   sphere(X[0], 1.0/2.0*phi, theta); sphere(X[1], 1.0/2.0*phi, 1.0/2.0*theta);
   sphere(X[2], phi, 1.0/2.0*theta); sphere(X[3], phi, theta);
   sphere(X[4], 1.0/2.0*phi, 3.0/4.0*theta); sphere(X[5], 3.0/4.0*phi, 1.0/2.0*theta);
   sphere(X[6], phi, 3.0/4.0*theta); control_node_flag[4] = control_node_flag[5] = 0;
   block(this, 3, 3, 7, control_node_flag, X[0], 18, 2);
}
int main() {
   Quadrature qp(2, 9);
   H1 ZAI(2, (double*)0, qp),
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 9/*nen*/, 2/*nsd*/, qp), Zai, Eta;
   Zai &= ZAI[0]; Eta &= ZAI[1];
   // 4 to 9 nodes shape functions
   // Step 1: four corner nodes
   N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
   N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
   // Step 2: center node and modification to four corner nodes
   N[8] = (1-Zai.pow(2))*(1-Eta.pow(2));
   N[0] -= N[8]/4; N[1] -= N[8]/4; N[2] -= N[8]/4; N[3] -= N[8]/4;
   // Step 3: four edge nodes and modification to other nodes
   N[4] = ((1-Zai.pow(2))*(1-Eta)-N[8])/2; N[5] = ((1-Eta.pow(2))*(1+Zai)-N[8])/2;
   N[6] = ((1-Zai.pow(2))*(1+Eta)-N[8])/2; N[7] = ((1-Eta.pow(2))*(1-Zai)-N[8])/2;
   N[0] -= (N[4]+N[7])/2; N[1] -= (N[4]+N[5])/2;
   N[2] -= (N[5]+N[6])/2; N[3] -= (N[6]+N[7])/2;
   Omega_h oh; C0 vol(0.0);                                   // discretized global domain Ωh
   for(int i = 0; i < oh.total_element_no(); i++) {
      Omega_eh& elem = oh(i);                                 // element Ωhe
      double X[9][2];
      for(int j = 0; j < 9; j++) {
         int nn = elem[j];
         Node& nd = oh[nn];
         for(int k = 0; k < 2; k++) X[j][k] = nd[k];          // x- and y-coordinates
      }
      C0 X_(9, 2, X[0]); H1 x = N*X_;                         // x(ξ, η) = N • x
      H0 z = sqrt(1-((H0)x[0]).pow(2)-((H0)x[1]).pow(2));     // z(ξ, η) = sqrt(1 - (x0)^2 - (x1)^2)
      vol += z | J(d(x).det());                               // volume = ∫ z(ξ̂) dx
   }
   cout << (8.0*vol) << endl;
   return 0;
}
Listing 3•6 Volume integration using “block()” function to generate nodes and elements automatically.
Constructors
The dedicated constructors for an Integrable_Tangent_of_Tangent_Bundle are
H2::H2(const Quadrature&)
H2::H2(double* v, const Quadrature&)
H2::H2(double* v, int spatial_dimension, const Quadrature&)
For the dedicated constructors, the size of the vector is equal to the number of spatial dimensions.
The constant strings used for the virtual constructors and autonomous virtual constructors are listed in the
following two boxes. The macro definitions used for the virtual constructors of the
Integrable_Tangent_of_Tangent_Bundle and Integrable_Vector_of_Tangent_of_Tangent_Bundle are
respectively. “H2 x = H2(const char*, ...)” is used for autonomous virtual constructors. There is not much
difference in the use of H2 type objects compared to that of H1 type objects. The major difference is that H2
type objects allow the twice-differentiable operations needed in some applications. In the next section on
variational methods, many examples with the H2 type are shown.
by reference
"H2&"                                                              | H2 type Matrix                                   | 1
"H2*"                                                              | a pointer to H2 type Matrix                      | 2
"double*, double*, double*, Quadrature"                            | v, dv, ddv                                       | 7
"double*, double*, double*, int, Quadrature"                       | v, dv, ddv, spatial dimension                    | 8
by value
"Quadrature"                                                       |                                                  | 3
"int, Quadrature"                                                  | memory-row-length, memory-column-length          | 4
"const double&, const double&, const double&, Quadrature"          | v, dv, ddv (for spatial dimension = 1 only)      | 5
"const double*, const double*, const double*, int, Quadrature&"    | v, dv, ddv, spatial dimension                    | 6
"const H0&, const H0&, const H0&"                                  | base point, tangent, tangent of tangent          | 9
"const H0*, const H0*, const H0*"                                  | base point, tangent, tangent of tangent          | 10
"const H2&"                                                        | Integrable_Tangent_of_Tangent_Bundle             | 11
"const H2*"                                                        | Integrable_Tangent_of_Tangent_Bundle*            | 12
Strings in H2 virtual constructor for Integrable_Tangent_of_Tangent_Bundle object.
by reference
"H2&"                                                              | H2 type Matrix                                                                              | -
"H2*"                                                              | a pointer to H2 type Matrix                                                                 | -
"int, int, double*, double*, double*, Quadrature, int, int"        | vector size, spatial dimension, v, dv, ddv, quadrature, memory row size, and column size    | 14
by value
"int, int, Quadrature"                                             | vector size, spatial dimension                                                              | 13
"const H0&, const H0&, const H0&"                                  | base point, tangent, tangent of tangent                                                     | -
"const H0*, const H0*, const H0*"                                  | base point, tangent, tangent of tangent                                                     | -
"int, const H2*"                                                   | vector size, Integrable_Tangent_of_Tangent_Bundle*                                          | 15
"const H2&"                                                        | Integrable_Vector_of_Tangent_of_Tangent_Bundle                                              | -
"const H2*"                                                        | Integrable_Vector_of_Tangent_of_Tangent_Bundle*                                             | -
$Au = f$    Eq. 3•44
$J(u) = \frac{1}{2} a(u, u) - (u, f)$    Eq. 3•45
where (u, f) is a linear functional and a(u, u) is a bilinear functional with a(u, u) ≡ (u, Au). Let u0 be the solution
corresponding to the minimum value of J; i.e., J(u0) ≤ J(u). We define u = u0 + εη, where ε is a small real
number and η is called the variation. The variation of J, denoted as δJ, can be defined with the directional derivative
as
$\delta J(u) = \varepsilon \left[ \frac{d}{d\varepsilon} J(u_0 + \varepsilon\eta) \right]_{\varepsilon = 0}$    Eq. 3•46
Set δJ(u) = 0 for the minimization of J. Since Eq. 3•46 must equal zero for an arbitrary small
real number ε, the term in brackets must equal zero for u0 to give a stationary value of J. So, we have
$\left[ \frac{d}{d\varepsilon} \left( \frac{1}{2} a(u_0, u_0) + \varepsilon\, a(\eta, u_0) + \frac{\varepsilon^2}{2} a(\eta, \eta) - (u_0, f) - \varepsilon (\eta, f) \right) \right]_{\varepsilon = 0} = a(\eta, u_0) - (\eta, f) = 0$    Eq. 3•47
That is,
$(\eta, Au_0) = (\eta, f)$    Eq. 3•48
which is just Eq. 3•44 multiplied by the variation η, so it is equivalent to Eq. 3•44. Therefore, the
solution of the differential equation (strong form) of Eq. 3•44 is the solution of the minimization problem of Eq.
3•45, and is also the solution of the variational formulation (weak form) of Eq. 3•48.
Re-defining “v” = u0 + εη and “u” = u0 in Eq. 3•48, we have (v, Au) = (v, f). For the adjoint homogeneous
equation A*v = 0, we also have (A*v, u) = (0, u).
The left-hand side of Eq. 3•49 is (Au, v) - (u, A*v) = 0 for a self-adjoint operator A. So, we have
Eq. 3•50 is the solvability condition. From linear algebra, the orthogonal complement of the range space of A is the
null space of its adjoint A*, usually denoted as R(A)⊥ = N(A*). Since f ∈ R(A) and v ∈ N(A*), this leads directly
to the orthogonality relation (v, f) = 0.
$u_N = v_N = \sum_{i=0}^{N-1} c_i \phi_i$    Eq. 3•52
The approximation basis functions φi should give finite energy with respect to the bilinear form a(. , .), such that
the derivatives corresponding to the operator A are integrable. Substituting Eq. 3•52 into Eq. 3•51 gives a system of
simultaneous equations
$c_i \sum_{j=0}^{N-1} a(\phi_i, \phi_j)\, c_j = c_i\, (f, \phi_i)$, where i = 0, 1, …, N – 1    Eq. 3•53
The coefficients ci are present on both sides of the equation, so they can be dropped. Eq. 3•53 can be re-written in matrix
form as
$M c = b$    Eq. 3•54
where Mij = a(φi, φj), and bi = (φi, f). The solution vector c of Eq. 3•54 is the vector consisting of the coefficients ci in
Eq. 3•52. If the φi are chosen to be orthogonal basis functions, M is diagonal; i.e., Mij = 0 for i ≠ j.
$-\frac{d^2 u}{dx^2} = \cos \pi x$, where 0 < x < 1    Eq. 3•55
with different kinds of boundary conditions for each of the two problems:
1. p. 265-270 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
$u_{exact}(x) = \frac{1}{\pi^2} (\cos \pi x + 2x - 1)$    Eq. 3•56
We have applied integration by parts to the term with the second-order derivative and required homogeneous
conditions, vN(0) = vN(1) = 0, on the boundary integral term
$\left[ v_N \frac{du_N}{dx} \right]_0^1 = 0$    Eq. 3•58
Therefore, we choose φi in Eq. 3•52 as the set of orthogonal functions {sin [(i+1)πx] | 0 ≤ i ≤ N – 1 }. Notice
that this set of functions satisfies the requirement of vN(0) = vN(1) = 0. Since the set is orthogonal, all off-diagonal
entries of the M matrix are zero. We only consider the diagonal elements of M as
$\mathrm{diag}(M)_i = a(\phi_i, \phi_i) = \int_0^1 \frac{d\phi_i}{dx} \frac{d\phi_i}{dx}\, dx$, and $b_i = (\phi_i, \cos \pi x) = \int_0^1 \phi_i \cos \pi x\, dx$    Eq. 3•59
The solution for the coefficient vector is simply ci = bi / diag(M)i, component by component, without having to
solve a system of simultaneous equations. Program Listing 3•7 implements Eq. 3•59. After the ci are
solved for, the solution of the Dirichlet boundary value problem is obtained by plugging the ci into Eq. 3•52.
In Figure 3•11, the exact solution in Eq. 3•56 is compared to the solution constructed using the ci computed from
Program Listing 3•7. The approximate variational solutions, in the form of Eq. 3•52, uN = ciφi, with N = 2 to N =
16 (even numbers only), are plotted on the right-hand side. Only the case with N = 2 shows clear differences from the
exact solution; the other solutions (N = 4 to 16) converge rapidly towards the exact solution. Figure 3•12 shows the
alternative approach discussed on page 170, with an integral equation using the Green function as the kernel for the solution.
The results are computed by Program Listing 3•1, which is almost unchanged, with the simple modification that “f(x) = cos
πx” instead of “f(x) = sin(πx)”.
#include "include\vs.h"
#define PI 3.141592654
int main() {
   const int n = 16;
   // weights of the extended Simpson's rule (35 points)
   double w[35] = {1.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 1.0/3.0};
   Quadrature qp(w, 0.0, 1.0, 35);
   H1 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", n, 1, qp);
   for(int i = 0; i < n; i++) phi[i] = sin(((double)i+1.0)*PI*x);   // φi = sin[(i+1)πx], i = 0, ..., N-1
   H0 d_phi_2(n, (double*)0, qp);
   for(int i = 0; i < n; i++) d_phi_2[i] = d(phi[i]).pow(2);
   C0 M = d_phi_2 | J(1.0/34.0);                                    // diag(M)i = a(φi, φi) = ∫ (dφi/dx)(dφi/dx) dx on [0, 1]
   C0 b = ( ((H0)phi) * cos(PI*((H0)x)) ) | J(1.0/34.0);            // bi = ∫ φi cos πx dx on [0, 1]
   C0 c(n, (double*)0);
   for(int i = 0; i < n; i++) c[i] = b[i] / M[i];                   // ci = bi / diag(M)i
   for(int i = 0; i < n; i++) cout << c[i] << ", "; cout << endl;
   return 0;
}
Listing 3•7 Dirichlet boundary condition u(0) = u(1) = 0, for the differential equation - u” = f (project:
“dirichlet”).
Figure 3•11 The left-hand-side is the exact solution, and the right-hand-side is the solution
with N=2 to N = 16 Rayleigh-Ritz method. (Panel titles: Exact = (cos πx + 2x - 1)/π²; uN = ciφi, i = 0, N (= 2 to 16), with the curve N = 2 labeled.)
Mixed Boundary Conditions: For the mixed boundary conditions u(0) = u’(1) = 0, the exact solution is
$u_{exact}(x) = \frac{1}{\pi^2} (\cos \pi x - 1)$    Eq. 3•60
With this different set of boundary conditions, Eq. 3•57 to Eq. 3•59 still hold. However, the set φi = {sin[(i+1)πx] |
0 ≤ i ≤ N – 1 } is now not complete with respect to x = 1, where it always gives u(1) = 0. We can add a new algebraic
function φ0 = x (the simplest one) to fix this situation. That is, the new set of φ becomes {x, sin(πx), sin(2πx),
sin(3πx), ... }. Therefore, Program Listing 3•7 needs only the slightest modification. Program Listing 3•8
is a modified version for the mixed boundary conditions problem. Alternatively, since the solution for ci in Program
Listing 3•7 is worked out component by component for this problem (since we select an orthogonal set of
φi), we can just compute one additional term c0 with φ0 = x and add it to the solution of problem 1. Figure 3•13
Figure 3•12 Point-values are solution of integral equation using Green function computed by Program
Listing 3•1 with a simple modification that f(x) = cos πx. (Curve label: uexact = (cos πx + 2x - 1)/π².)
shows that the solution up to N= 2 is already a good enough approximation to the exact solution for this
problem.
#include "include\vs.h"
#define PI 3.141592654
int main() {
   const int n = 16;
   // weights of the extended Simpson's rule (35 points)
   double w[35] = {1.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0,
      4.0/3.0, 2.0/3.0, 4.0/3.0, 2.0/3.0, 4.0/3.0, 1.0/3.0};
   Quadrature qp(w, 0.0, 1.0, 35);
   H1 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", n, 1, qp);
   // φ0 = x, φi = sin[iπx], i = 1, ..., N-1
   phi[0] = x;
   for(int i = 1; i < n; i++) phi[i] = sin(((double)i)*PI*x);
   H0 d_phi_2(n, (double*)0, qp);
   for(int i = 0; i < n; i++) d_phi_2[i] = d(phi[i]).pow(2);
   C0 M = d_phi_2 | J(1.0/34.0);
   C0 b = ( ((H0)phi) * cos(PI*((H0)x)) ) | J(1.0/34.0);
   C0 c(n, (double*)0);
   for(int i = 0; i < n; i++) c[i] = b[i] / M[i];
   for(int i = 0; i < n; i++) cout << c[i] << ", "; cout << endl;
   return 0;
}
Listing 3•8 Mixed boundary condition u(0) = u’(1) = 0, for the differential equation - u” = f (project: “mixed”).
Dirichlet Boundary Conditions Revisited: We can use polynomials instead of trigonometric functions as the
basis functions, such as
$\phi_i = x^{i+1} (1 - x)$, where i = 0, 1, …, N – 1
Notice that this choice of φi satisfies the Dirichlet boundary conditions u(0) = u(1) = 0. However, the φi are not
orthogonal to each other, and therefore Mij will not be a diagonal matrix. The solution of a system of
simultaneous equations is required. We choose 0 ≤ i ≤ 3; i.e., N = 4. Hence the highest-order polynomial is fifth-order.
One modification to Program Listing 3•7 is that we use Bode’s integration rule, which is a fifth-order
approximation. The weights for Bode’s rule are
Figure 3•13 Exact solution (cos πx -1) /π2 comparing to solutions of N= 2-16 Rayleigh-Ritz method. (Panel titles: Exact = (cos πx - 1)/π²; uN = ciφi, i = 0, N (= 2 to 16).)
$\frac{14}{45}, \frac{64}{45}, \frac{24}{45}, \frac{64}{45}, \frac{28}{45}, \frac{64}{45}, \frac{24}{45}, \frac{64}{45}, \frac{28}{45}, \ldots, \frac{28}{45}, \frac{64}{45}, \frac{24}{45}, \frac{64}{45}, \frac{14}{45}$
Redefine φi simply as “phi[0] = x*(1-x); for(int i = 1; i < 4; i++) phi[i] = phi[i-1]*x;”. Program Listing 3•9
implements the polynomial approximation. Figure 3•14 shows the solution of the polynomial approximation, which
is visually almost the same as the exact solution.
#include "include\vs.h"
#define PI 3.141592654
int main() {
   // weights of the extended Bode's rule (17 points)
   double w[17] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0,
      64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0,
      64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0,
      64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};
   Quadrature qp(w, 0.0, 1.0, 17);
   H1 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 1, qp);
   // φi = x^(i+1) (1-x), i = 0, ..., N-1, with N = 4
   phi[0] = x*(1-x); for(int i = 1; i < 4; i++) phi[i] = phi[i-1]*x;
   C0 M = ( d(phi)*(~d(phi)) ) | J(1.0/16.0);               // Mij = a(φi, φj) = ∫ (dφi/dx)(dφj/dx) dx on [0, 1]
   C0 b = ( ((H0)phi) * cos(PI*((H0)x)) ) | J(1.0/16.0);    // bi = ∫ φi cos πx dx on [0, 1]
   C0 c = b / M;                                            // solve M c = b
   for(int i = 0; i < 4; i++) cout << c[i] << ", ";
   return 0;
}
Listing 3•9 Polynomial basis φi = x(i+1) (1-x), i = 0, N-1 for the Dirichlet boundary condition u(0) = u(1) = 0,
for the differential equation - u” = f (project: “polynomial”).
1. p.188-190, and p. 351-353 in J.M.Gere, and S.P. Timoshenko, 1984, “Mechanics of materials”, 2nd ed., Wadsworth, Inc.,
Belmont, California.
Figure 3•14 Solution of Dirichlet boundary problem using polynomial basis functions.
the beam as “w” (see Figure 3•15). From the balance of forces, the transverse loading (f) is equal to the derivative of
the shear force (V) as
$\frac{dV}{dx} = f$
and the shear force is equal to the derivative of the bending moment (M) as
$\frac{dM}{dx} = V$
Therefore,
$\frac{d^2 M}{dx^2} = f$    Eq. 3•64
The curvature (d2w/dx2) of the beam is related to the bending moment and the flexural rigidity of the beam as
$M = EI \frac{d^2 w}{dx^2}$    Eq. 3•65
where EI is the flexural rigidity. Substituting M in Eq. 3•65 into Eq. 3•64 gives
$\frac{d^2}{dx^2} \left( EI \frac{d^2 w}{dx^2} \right) = f$, where 0 < x < L    Eq. 3•66
$w(0) = \frac{dw}{dx}(0) = 0, \quad EI \frac{d^2 w}{dx^2}(L) = M, \quad V(L) = \frac{d}{dx} \left( EI \frac{d^2 w}{dx^2} \right)(L) = 0$    Eq. 3•67
$w(x) = \frac{2M + fL^2}{4EI} x^2 - \frac{fL}{6EI} x^3 + \frac{f}{24EI} x^4$    Eq. 3•68
1. Irreducible formulation,
2. Lagrange multiplier formulation,
3. Penalty function formulation, and
4. Mixed formulation
1. Irreducible Formulation: The irreducible formulation has the highest order of differential equation. We first
need to transform the end bending moment boundary condition into a homogeneous one by a change of variable as
$\frac{d^2}{dx^2} \left( EI \frac{d^2 u}{dx^2} \right) = f$, where 0 < x < L    Eq. 3•69
1. From examples in p. 156-158, and p. 275-280 in J.N. Reddy, 1986, “Applied functional analysis and variational methods
in engineering”, McGraw-Hill, Inc.
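$J(u) = \int_0^L \left[ \frac{EI}{2} \left( \frac{d^2 u}{dx^2} \right)^2 - f u \right] dx$    Eq. 3•71
$\delta J(u) = \int_0^L \left[ EI \frac{d^2 (\delta u)}{dx^2} \frac{d^2 u}{dx^2} - \delta u\, f \right] dx = \varepsilon \int_0^L \left[ EI \frac{d^2 v}{dx^2} \frac{d^2 u}{dx^2} - v f \right] dx = 0$    Eq. 3•72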
The second identity is obtained by applying integration by parts twice and considering that all boundary
conditions are homogeneous. Dropping ε, which is arbitrary, we can define an approximated system of equations
with left-hand side and right-hand side as
$a(v_N, u_N) = \int_0^L EI \frac{d^2 v_N}{dx^2} \frac{d^2 u_N}{dx^2}\, dx$, and $(v_N, f) = \int_0^L [\, v_N f \,]\, dx$    Eq. 3•73
respectively. The approximation basis functions are taken as φi = {x^(i+2)}, i = 0, 1, ..., N-1. Program Listing 3•10
implements Eq. 3•73, with N = 2, 3. The case with N = 3 actually produces the coefficients of the exact solution
in Eq. 3•68. Figure 3•16 shows that N = 2 is visually almost identical to the exact solution.
Figure 3•16 Irreducible formulation for transverse deflection of a beam with N=2, 3. (Curve label: Exact = (N=3) ~ (N=2).)
2. Lagrange Multiplier Formulation: We refer to Chapter 2, Eq. 2•11 on page 118 for an introduction to the
Lagrangian functional in the Lagrange multiplier method for the constrained optimization problem. We now
define the constraint equation through the variable ψ, the negative slope,
$\psi = -\frac{dw}{dx}$    Eq. 3•74
Listing 3•10 Transverse deflection of a beam using irreducible formulation. φi = x(i+2) , i = 0, N-1 (project:
“irreducible_formulation”).
Substituting Eq. 3•74 into Eq. 3•71 and considering the boundary term yields
\[ J(\psi, w) = \int_0^L \left[ \frac{EI}{2}\left( \frac{d\psi}{dx} \right)^2 - f w \right] dx + (\psi M)\big|_{x=L} \]  Eq. 3•75
Using the Lagrangian multiplier λ with the constraint equation (Eq. 3•74), we can define the Lagrangian func-
tional l(ψ, w, λ) as
\[ l(\psi, w, \lambda) = \int_0^L \left[ \frac{EI}{2}\left( \frac{d\psi}{dx} \right)^2 - f w + \lambda\left( \psi + \frac{dw}{dx} \right) \right] dx + (\psi M)\big|_{x=L} \]  Eq. 3•76
subject to boundary condition that ψ(0) = w(0) = 0. The Euler-Lagrange equations can be obtained by setting
δl(ψ, w, λ) = 0 as
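\[ \int_0^L \left[ EI\,\frac{d\,\delta\psi}{dx}\frac{d\psi}{dx} + \delta\psi\,\lambda \right] dx + (\delta\psi\, M)\big|_{x=L} + \int_0^L \left[ \frac{d\,\delta w}{dx}\,\lambda - \delta w\, f \right] dx + \int_0^L \delta\lambda\left( \psi + \frac{dw}{dx} \right) dx = 0 \]  Eq. 3•77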
The Lagrange multiplier in this case has physical interpretation of the shear force.
\[ \lambda = \frac{d}{dx}\left( EI\,\frac{d^2 w}{dx^2} \right) \]  Eq. 3•78
Therefore, the exact solution for ψ and λ can be obtained by differentiating the exact solution for w(x) in Eq.
3•68 as
\[ \psi(x) = -\frac{2M + fL^2}{2EI}\, x + \frac{fL}{2EI}\, x^2 - \frac{f}{6EI}\, x^3, \qquad \lambda = f(L - x) \]  Eq. 3•79
The approximation basis functions for each approximated variable are taken as
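\[ [\psi_N,\; w_N,\; \lambda_N] = [c_i^\psi \phi_i^\psi,\; c_i^w \phi_i^w,\; c_i^\lambda \phi_i^\lambda] \]
\[ \begin{bmatrix} \int_0^L EI\,\frac{d\phi_i^\psi}{dx}\frac{d\phi_j^\psi}{dx}\,dx & 0 & \int_0^L \phi_i^\psi \phi_j^\lambda\,dx \\ 0 & 0 & \int_0^L \frac{d\phi_i^w}{dx}\,\phi_j^\lambda\,dx \\ \int_0^L \phi_i^\lambda \phi_j^\psi\,dx & \int_0^L \phi_i^\lambda\,\frac{d\phi_j^w}{dx}\,dx & 0 \end{bmatrix} \begin{bmatrix} c_j^\psi \\ c_j^w \\ c_j^\lambda \end{bmatrix} = \begin{bmatrix} -(\phi_i^\psi M)\big|_{x=L} \\ \int_0^L \phi_i^w f\,dx \\ 0 \end{bmatrix} \]  Eq. 3•80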
Therefore, the matrix and vectors in Eq. 3•80 can be labeled to conform with “M c = b” (Eq. 3•54). We first consider a one-parameter approximation with
Eq. 3•80 with Eq. 3•81 or Eq. 3•82 offers the first non-trivial matrix formulation in this workbook, and it is worth looking into the various ways that VectorSpace C++ Library supports to implement these equations. Program Listing 3•11 implements the one-parameter approximation.
   // M12 = M21 = int_0^L (d(phi0_w)/dx) phi0_lambda dx
   // b = [ -ML, int_0^L phi0_w f dx, 0 ]^T
   for(int i = 0; i < 3; i++) cout << c[i] << ", "; cout << endl;
   return 0;
}
Listing 3•11 One-parameter approximation in Lagrange multiplier method for beam bending problem (project: “lagrange_multiplier”).
For the one-parameter approximation, matrix M is only 3 × 3, which is not too complicated. It can be accessed with plain C or Fortran semantics using the selector “[]”. Two alternative approaches are available in VectorSpace C++ Library. First, we can build M just as it is written in matrix form using the concatenation operators “|” and “&”, such as
C0 e(3),
M = ( (E_*I_*(d(psi)*d(psi))) *(e[0]%e[0])+
0.0 *(e[0]%e[1])+
(((H0)psi)*lambda) *(e[0]%e[2])+
0.0*(e[1]%e[0])+ 0.0 *(e[1]%e[1])+
(d(w)*lambda) *(e[1]%e[2])+
(((H0)psi)*lambda) *(e[2]%e[0])+
(d(w)*lambda) *(e[2]%e[1])+
0.0 *(e[2]%e[2])
) | J(L_),
b = (-L_*M_) *e[0] +
((((H0)w)*f_) | J(L_)) *e[1] +
0.0 *e[2];
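\[ \begin{aligned} M = {} & \left[ \int_0^L EI\,\frac{d\phi_0^\psi}{dx}\frac{d\phi_0^\psi}{dx}\,dx \right](e_0 \otimes e_0) + 0\,(e_0 \otimes e_1) + \left[ \int_0^L \phi_0^\psi \phi_0^\lambda\,dx \right](e_0 \otimes e_2) \\ & + 0\,(e_1 \otimes e_0) + 0\,(e_1 \otimes e_1) + \left[ \int_0^L \frac{d\phi_0^w}{dx}\,\phi_0^\lambda\,dx \right](e_1 \otimes e_2) \\ & + \left[ \int_0^L \phi_0^\lambda \phi_0^\psi\,dx \right](e_2 \otimes e_0) + \left[ \int_0^L \phi_0^\lambda\,\frac{d\phi_0^w}{dx}\,dx \right](e_2 \otimes e_1) + 0\,(e_2 \otimes e_2) \end{aligned} \]
and,
\[ b = -(\phi_0^\psi M)\big|_{x=L}\, e_0 + \left[ \int_0^L \phi_0^w f\,dx \right] e_1 + 0\, e_2 \]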
The results of the one-parameter approximation are shown in Figure 3•17. There is obviously room for improvement.
Program Listing 3•12 implements the two-parameter approximation.
[Figure: three panels (ψ, w, λ) over 0 ≤ x ≤ 1, each comparing the one-parameter approximation with the exact solution.]
Figure 3•17 Comparison of one-parameter solution to the exact solution.
For approximations with more than one parameter, the complexity of the formulation increases dramatically. Submatrices and subvectors become important tools for dealing with this complexity. In Program Listing 3•12 we use the reference Integrable_Matrix objects “mij”, where i, j = 0, 1, 2. Each “mij” is a matrix of size 2x2.
#include "include\vs.h"
int main() {
   double L_ = 1.0, E_ = 1.0, I_ = 1.0, f_ = 1.0, M_ = 1.0,
      weight[3] = {1.0/3.0, 4.0/3.0, 1.0/3.0};            // Simpson's rule {1/3, 4/3, 1/3}^T
   Quadrature qp(weight, 0.0, L_, 3);
   H1 x(qp),
      psi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp),
      w = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp),
      lambda = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp);
   // [phi0_psi, phi1_psi, phi0_w, phi1_w, phi0_lambda, phi1_lambda] = [x, x^2, x, x^2, 1, x]
   psi[0] = x; psi[1] = x.pow(2);
   w[0] = x; w[1] = x.pow(2);
   lambda[0] = 1.0; lambda[1] = x;
   H0 m(6, 6, (double*)0, qp),
      m00(2, 2, m, 0, 0, qp), m01(2, 2, m, 0, 2, qp), m02(2, 2, m, 0, 4, qp),
      m10(2, 2, m, 2, 0, qp), m11(2, 2, m, 2, 2, qp), m12(2, 2, m, 2, 4, qp),
      m20(2, 2, m, 4, 0, qp), m21(2, 2, m, 4, 2, qp), m22(2, 2, m, 4, 4, qp);
   // M00 = int_0^L EI (d(phi_psi)/dx) (x) (d(phi_psi)/dx) dx,  M02 = int_0^L phi_psi (x) phi_lambda dx
   m00 = (E_*I_)*d(psi)*(~d(psi)); m01 = 0.0; m02 = ((H0)psi)%((H0)lambda);
   // M12 = int_0^L (d(phi_w)/dx) (x) phi_lambda dx
   m10 = 0.0; m11 = 0.0; m12 = d(w)(0)%((H0)lambda);
   // M20 = int_0^L phi_lambda (x) phi_psi dx,  M21 = int_0^L phi_lambda (x) (d(phi_w)/dx) dx
   m20 = ((H0)lambda)%((H0)psi); m21 = ((H0)lambda)%d(w)(0); m22 = 0.0;
   C0 M = m | J(L_/2);
   // b = [ -M phi_psi(L), int_0^L phi_w f dx, 0 ]^T
   H0 f(6, (double*)0, qp), f0(2, f, 0, qp), f1(2, f, 2, qp), f2(2, f, 4, qp);
   f1 = (((H0)w)*f_); f0 = f2 = 0.0;
   C0 b = f | J(L_/2);
   b[0] = -M_*L_; b[1] = -M_*pow(L_, 2);
   C0 c = b / M;
   for(int i = 0; i < 6; i++) cout << c[i] << ", "; cout << endl;
   return 0;
}
Listing 3•12 Two-parameter approximation in Lagrange multiplier method for beam bending problem (project: “lagrange_multiplier”).
Three alternative implementations of the two-parameter approximation in VectorSpace C++ Library are possible. First, concatenation operators can be used to patch smaller matrices and vectors into a larger one, such as
C0 e(2), E(3),
M =+( ((E_*I_)*(d(psi)*(~d(psi)))) *((e%e)*(E[0]%E[0]))+
0.0 *((e%e)*(E[0]%E[1]))+
(((H0)psi)%((H0)lambda)) *((e%e)*(E[0]%E[2]))+
0.0 *((e%e)*(E[1]%E[0]))+
0.0 *((e%e)*(E[1]%E[1]))+
(d(w)(0)%((H0)lambda)) *((e%e)*(E[1]%E[2]))+
(((H0)lambda)%((H0)psi)) *((e%e)*(E[2]%E[0]))+
(((H0)lambda)%d(w)(0)) *((e%e)*(E[2]%E[1]))+
0.0 *((e%e)*(E[2]%E[2]))
) | J(L_/2);
C0 M_delta_psi(2, (double*)0); M_delta_psi[0] = -M_*L_; M_delta_psi[1] = -M_*L_*L_;
C0 b = +( M_delta_psi *(e*E[0]) +
( (((H0)w)*f_) | J(L_/2) ) *(e*E[1]) +
0.0 *(e*E[2]) );
The expressions using Integrable_Submatrix or Integrable_Subvector in the two-parameter case are very close to those of the one-parameter case. The fact that the entries are now 2x2 matrices and 2x1 vectors instead of scalar components (compare with the one-parameter case discussed on page 212) is handled by projecting these matrices and vectors onto “((e%e)*(E[i]%E[j]))” or “(e*E[i])” according to the basis expression introduced in Section 1.1.6. We call your attention to the last operation in the construction of both “M” and “b”. They use the unary “+” operator. This is the primary casting, which downcasts Subvectors and Submatrices into Vectors and Matrices, respectively.
Basis expression generates Integrable_Subvector and Integrable_Submatrix. Third, a complementary expression using the objects of Integrable_Subvector and Integrable_Submatrix directly is
The results of two-parameter approximation are shown in Figure 3•18. Compared with Figure 3•17 the approxi-
mations for ψ and w are almost equal to the exact solution, and the approximation for λ is identical to the exact
solution.
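As a cross-check that does not depend on the library, the two-parameter system of Eq. 3•80 can be assembled in plain C++. This is only a sketch under stated assumptions: it takes E = I = f = M = L = 1 as in Program Listing 3•12, evaluates every integral in closed form (each integrand is a monomial on [0, 1]) instead of by quadrature, and uses hypothetical helper names (`mono`, `solve`, `lagrange_two_parameter`) that are not part of VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Integral of x^p over [0, 1]; every entry of the system is built from such terms.
double mono(int p) { return 1.0 / (p + 1.0); }

// Solve A x = b by Gaussian elimination with partial pivoting.
// Pivoting is needed here because the matrix has zero diagonal blocks.
std::vector<double> solve(std::vector<std::vector<double>> A, std::vector<double> b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        assert(std::fabs(A[p][k]) > 1e-14);  // otherwise the system is singular
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            const double m = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    std::vector<double> x(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= A[k][j] * x[j];
        x[k] = s / A[k][k];
    }
    return x;
}

// Returns {c_psi0, c_psi1, c_w0, c_w1, c_lambda0, c_lambda1}.
std::vector<double> lagrange_two_parameter() {
    const double EI = 1.0, f = 1.0, Mend = 1.0;  // assumed unit constants, L = 1
    // Monomial exponents of the bases: psi ~ {x, x^2}, w ~ {x, x^2}, lambda ~ {1, x}.
    const int ps[2] = {1, 2}, ws[2] = {1, 2}, ls[2] = {0, 1};
    std::vector<std::vector<double>> A(6, std::vector<double>(6, 0.0));
    std::vector<double> b(6, 0.0);
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            // M00: EI * int psi_i' psi_j' dx
            A[i][j] = EI * ps[i] * ps[j] * mono(ps[i] + ps[j] - 2);
            // M02 and its transpose M20: int psi_i lambda_j dx
            A[i][4 + j] = A[4 + j][i] = mono(ps[i] + ls[j]);
            // M12 and its transpose M21: int w_i' lambda_j dx
            A[2 + i][4 + j] = A[4 + j][2 + i] = ws[i] * mono(ws[i] - 1 + ls[j]);
        }
        b[i] = -Mend;                // -(psi_i M) at x = L = 1, where psi_i(1) = 1
        b[2 + i] = f * mono(ws[i]);  // int w_i f dx
    }
    return solve(A, b);
}
```

With these unit constants the multiplier coefficients come out as λ = 1 - x, which agrees with λ = f(L - x) in Eq. 3•79.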
[Figure: three panels (ψ, w, λ) over 0 ≤ x ≤ 1; the approximations for ψ and w nearly coincide with the exact curves, and the approximation for λ equals the exact curve.]
Figure 3•18 Two-parameter approximation using Lagrange multiplier formulation for the beam bending problem.
3. Penalty Function Formulation: We refer to Chapter 2, Eq. 2•61 on page 153 for the basics of the penalty
method. Taking J(ψ, w) in Eq. 3•75 with a quadratic penalty term, we have
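\[ J_p(\psi, w) = \int_0^L \left[ \frac{EI}{2}\left( \frac{d\psi}{dx} \right)^2 - f w \right] dx + (\psi M)\big|_{x=L} + \frac{\rho}{2}\int_0^L \left( \psi + \frac{dw}{dx} \right)^2 dx \]  Eq. 3•84
\[ \int_0^L \left[ EI\,\frac{d\,\delta\psi}{dx}\frac{d\psi}{dx} - \delta w\, f \right] dx + (M\,\delta\psi)\big|_{x=L} + \rho\int_0^L \left( \delta\psi + \frac{d\,\delta w}{dx} \right)\left( \psi + \frac{dw}{dx} \right) dx = 0 \]  Eq. 3•85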
Using approximation basis functions as in the Rayleigh-Ritz method this equation can be written in matrix form
as
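\[ \begin{bmatrix} \int_0^L \left( EI\,\frac{d\phi_i^\psi}{dx}\frac{d\phi_j^\psi}{dx} + \rho\,\phi_i^\psi\phi_j^\psi \right) dx & \rho\int_0^L \phi_i^\psi\,\frac{d\phi_j^w}{dx}\,dx \\ \rho\int_0^L \frac{d\phi_i^w}{dx}\,\phi_j^\psi\,dx & \rho\int_0^L \frac{d\phi_i^w}{dx}\frac{d\phi_j^w}{dx}\,dx \end{bmatrix} \begin{bmatrix} c_j^\psi \\ c_j^w \end{bmatrix} = \begin{bmatrix} -(M\phi_i^\psi)\big|_{x=L} \\ \int_0^L \phi_i^w f\,dx \end{bmatrix} \]  Eq. 3•86
\[ \phi = [\phi_i^\psi,\; \phi_i^w]^T, \quad \text{where } \phi_i^\psi = x, \text{ and } \phi_i^w = [x,\; x^2]^T \]  Eq. 3•87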
Program Listing 3•13 implements the penalty function formulation as in Eq. 3•86 and approximation basis
functions as in Eq. 3•87.
#include "include\vs.h"
int main() {
   double L_ = 1.0, E_ = 1.0, I_ = 1.0, f_ = 1.0, M_ = 1.0,
      rho = 1.0, weight[3] = {1.0/3.0, 4.0/3.0, 1.0/3.0};
   Quadrature qp(weight, 0.0, L_, 3);
   C0 c, delta_c, c_cache(3, (double*)0);
   do {
      // phi_psi = x, and phi_w = [x, x^2]^T
      H1 x(qp), psi = x,
         w = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp);
      w[0] = x; w[1] = x.pow(2);
      H0 m = INTEGRABLE_MATRIX("int, int, Quadrature", 3, 3, qp);
      // M00 = int_0^L ( EI (d(phi0_psi)/dx)^2 + rho (phi0_psi)^2 ) dx
      m[0][0] = E_*I_*d(psi)*d(psi)+rho*((H0)psi)*((H0)psi);
      // [M01, M02] = [M10, M20]^T = rho int_0^L phi0_psi [d(phi0_w)/dx, d(phi1_w)/dx] dx
      m[0][1] = m[1][0] = rho*((H0)psi)*d(w[0]);
      m[0][2] = m[2][0] = rho*((H0)psi)*d(w[1]);
      // [M11 M12; M21 M22] = rho int_0^L (d(phi_w)/dx) (x) (d(phi_w)/dx) dx
      m[1][1] = rho*d(w[0])*d(w[0]); m[1][2] = rho*d(w[0])*d(w[1]);
      m[2][1] = rho*d(w[1])*d(w[0]); m[2][2] = rho*d(w[1])*d(w[1]);
      C0 M = m | J(L_/2.0);
      // b = [ -M phi0_psi(L), int_0^L [phi0_w, phi1_w] f dx ]^T
      C0 b(3, (double*)0);
      b[0] = -M_*L_; b[1] = (((H0)w[0])*f_) | J(L_/2.0); b[2] = (((H0)w[1])*f_) | J(L_/2.0);
      c &= b / M;
      delta_c &= c_cache - c;
      c_cache = c;
      rho *= 5.0;
   } while((double)norm(delta_c) > 1.e-6);
   cout << c << endl;
   return 0;
}
Listing 3•13 Penalty function formulation for beam bending problem (project: “penalty_function”).
Notice that the penalty functional Jp can be written in the quadratic form Jp(c) = (1/2) c^T M c - c^T b, where M is a symmetric positive definite matrix. δJp = 0 gives “Mc = b”. Therefore, the solution is c = b / M. Alternatively, we consider that Newton’s method for the minimization of Jp(x) should converge in just one iteration for a quadratic functional
Applying Newton’s formula gives the same result. Therefore, Eq. 3•86 can be used to replace the inner loop of the Newton’s iteration in Program Listing 2•20 for the penalty method in Chapter 2.
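The outer penalty loop can also be sketched without the library. The code below is a minimal illustration, assuming E = I = f = M = L = 1 and the basis of Eq. 3•87 (ψ = c0 x, w = c1 x + c2 x^2); the entries of Eq. 3•86 are written in closed form, and the names `solve_penalty` and `penalty_iteration` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Solve M(rho) c = b for one fixed penalty parameter rho.
// Rows are the closed-form integrals of Eq. 3.86 with unit constants.
Vec3 solve_penalty(double rho) {
    assert(rho > 0.0);
    double A[3][4] = {
        {1.0 + rho / 3.0, rho / 2.0, 2.0 * rho / 3.0, -1.0},
        {rho / 2.0,       rho,       rho,              0.5},
        {2.0 * rho / 3.0, rho,       4.0 * rho / 3.0,  1.0 / 3.0}};
    // Gaussian elimination without pivoting; M is symmetric positive definite.
    for (int k = 0; k < 3; ++k)
        for (int i = k + 1; i < 3; ++i) {
            const double m = A[i][k] / A[k][k];
            for (int j = k; j < 4; ++j) A[i][j] -= m * A[k][j];
        }
    Vec3 c{};
    for (int k = 2; k >= 0; --k) {
        double s = A[k][3];
        for (int j = k + 1; j < 3; ++j) s -= A[k][j] * c[j];
        c[k] = s / A[k][k];
    }
    return c;
}

// Raise rho by a factor of 5 until successive solutions differ by less than tol.
Vec3 penalty_iteration(double rho = 1.0, double tol = 1e-6) {
    Vec3 c = solve_penalty(rho);
    double change = 0.0;
    do {
        rho *= 5.0;
        const Vec3 next = solve_penalty(rho);
        change = 0.0;
        for (int k = 0; k < 3; ++k) change += (next[k] - c[k]) * (next[k] - c[k]);
        change = std::sqrt(change);
        c = next;
    } while (change > tol);
    return c;
}
```

For this particular basis the system can be solved by hand, giving c = (-7/6, 1/ρ, 7/12 - 1/(2ρ)), so the violation of the constraint ψ + dw/dx = 0 decays like 1/ρ, which is the behavior the loop relies on.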
4. Mixed Formulation: The irreducible formulation is based on a fourth-order differential equation (Eq. 3•66),
which is obtained from substituting M in Eq. 3•65 into Eq. 3•64. The mixed formulation is based on the two sep-
arated equations, Eq. 3•64 and Eq. 3•65, directly. That is
\[ \frac{d^2 w}{dx^2} = \frac{M}{EI}, \quad \text{and} \quad \frac{d^2 M}{dx^2} = f \]  Eq. 3•89
These equations are subject to the boundary conditions of w(0) = 0, and M(L) = M. The Lagrangian functional
corresponding to the above equations is
\[ J_M(w, M) = \int_0^L \left[ \frac{dw}{dx}\frac{dM}{dx} + \frac{M^2}{2EI} + f w \right] dx \]  Eq. 3•90
\[ \int_0^L \left[ \frac{d\,\delta w}{dx}\frac{dM}{dx} + \delta w\, f \right] dx = 0, \qquad \int_0^L \left[ \frac{d\,\delta M}{dx}\frac{dw}{dx} + \frac{(\delta M)\, M}{EI} \right] dx = 0 \]  Eq. 3•91
The approximated solutions for w and M, which satisfy the boundary conditions w(0) = 0 and M(L) = M, are taken as
\[ w \approx c_0^w x + c_1^w x^2, \quad \text{and} \quad M \approx M + c_0^M (L - x) + c_1^M (L - x)^2 \]
Substituting w and M in Eq. 3•91, we can write Eq. 3•91 in matrix form as
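\[ \begin{bmatrix} 0 & \int_0^L \frac{d\phi^w}{dx} \otimes \frac{d\phi^M}{dx}\,dx \\ \int_0^L \frac{d\phi^M}{dx} \otimes \frac{d\phi^w}{dx}\,dx & \int_0^L \frac{\phi^M \otimes \phi^M}{EI}\,dx \end{bmatrix} \begin{bmatrix} c^w \\ c^M \end{bmatrix} = \begin{bmatrix} -\int_0^L \phi^w f\,dx \\ -\int_0^L \frac{\phi^M M}{EI}\,dx \end{bmatrix} \]  Eq. 3•92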
where φw = [x, x^2], and φM = [(L-x), (L-x)^2]. The second term in the right-hand-side vector has the bending moment boundary condition M(L) = M, which is shifted to the right-hand-side of Eq. 3•92 from the second term in the second equation of the left-hand-side of Eq. 3•91. Program Listing 3•14 implements Eq. 3•92. The results of the mixed formulation are shown in Figure 3•20.
#include "include\vs.h"
int main() {
   double L_ = 1.0, E_ = 1.0, I_ = 1.0, f_ = 1.0, M_ = 1.0,
      weight[5] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};
   Quadrature qp(weight, 0.0, L_, 5);
   H1 x(qp),
      w = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp),
      M = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp);
   w[0] = x; w[1] = x.pow(2);
   M[0] = x-L_; M[1] = (x-L_).pow(2);
   // M01 = int_0^L (d(phi_w)/dx) (x) (d(phi_M)/dx) dx
   // M10 = int_0^L (d(phi_M)/dx) (x) (d(phi_w)/dx) dx
   // M11 = int_0^L (phi_M (x) phi_M) / EI dx
   // b   = [ -int_0^L phi_w f dx, -int_0^L (phi_M M) / EI dx ]^T
   C0 e(2), E(2),
      m = +( 0.0 *((e%e)*(E[0]%E[0])) +
             (d(w)*(~d(M))) *((e%e)*(E[0]%E[1])) +
             (d(M)*(~d(w))) *((e%e)*(E[1]%E[0])) +
             ( ((H0)M)*(~((H0)M))/(E_*I_) ) *((e%e)*(E[1]%E[1])) ) | J(L_/4),
      b = +( ( (-(H0)w)*f_ ) *(e*E[0]) +
             ( (-(H0)M)*(M_/(E_*I_)) ) *(e*E[1]) ) | J(L_/4);
   C0 c = b / m;       // c must be declared before use
   cout << c << endl;
   return 0;
}
Listing 3•14 Mixed formulation for beam bending problem (project: “mixed_formulation”).
Green Function Solution: We now re-visit the solution by integral equation using Green function to solve the
same fourth-order differential equation and boundary conditions posed in Eq. 3•65 and Eq. 3•66. For simplicity
we set all material constant as 1 and this simplifies the problem to have a Green function that satisfies
\[ \frac{d^4 g}{dx^4} = \delta(x - \xi), \quad 0 < x, \xi < 1; \qquad g(0, \xi) = g'(0, \xi) = g'''(1, \xi) = 0, \quad g''(1, \xi) = 1 \]  Eq. 3•93
where δ( . ) is the Dirac delta function. The jump condition of the shear force (V) due to the concentrated load at
ξ is
V(ξ+)-V(ξ-) = -1
That results in the jump condition g’’’(ξ+, ξ) – g’’’(ξ–, ξ) = 1. Therefore, the equation and conditions for solving the Green function are
1. d^4g/dx^4 = 0, for 0 < x < ξ and ξ < x < 1
2. g(0, ξ) = g’(0, ξ) = g’’’(1, ξ) = 0, g’’(1, ξ) = 1
3. g, g’, g’’ continuous at x = ξ; g’’’(ξ+, ξ) – g’’’(ξ–, ξ) = 1
With g1 = A x^2 + B x^3 for x < ξ and g2 = a + b (1-x) + c (1-x)^2 for x > ξ (the forms annotated in Program Listing 3•15), we can determine c = 1/2 by using g’’(1, ξ) = 1. Applying the jump condition g’’’(ξ+, ξ) – g’’’(ξ–, ξ) = 1 at x = ξ we have
That is B = -1/6. Consequently, the continuity of g’’ at ξ gives A = (1+ξ)/2, and the continuity of g’ gives b = -1 − ξ^2/2. Then, the continuity of g gives a = 1/2 + (1/2) ξ^2 − (1/6) ξ^3. Therefore,
Program Listing 3•15 implements the Green function defined in Eq. 3•94; the transverse loading f(x) is taken as the constant “1.0”. The values at eleven points are shown in Figure 3•21 for comparison with the exact solution.
#include "include/vs.h"
int main() {
   double x = 0.0,
      wt[17] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0,
                64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                24.0/45.0, 64.0/45.0, 14.0/45.0};
   for(int i = 0; i < 11; i++) {                                     // f(x) = 1, a constant
      Quadrature q1(wt, 0.0, x, 17), q2(wt, x, 1.0, 17);
      double c = 1.0/2.0, B = -1.0/6.0;                              // c = 1/2, B = -1/6
      H0 z1(q1), z2(q2);
      H0 A = (1.0 + z2) / 2.0,                                       // A = (1+xi)/2
         a = 1.0/2.0+1.0/2.0*z1.pow(2)-1.0/6.0*z1.pow(3),            // a = 1/2 + (1/2) xi^2 - (1/6) xi^3
         b = -1.0 - 1.0/2.0*z1.pow(2),                               // b = -1 - xi^2/2
         integrand1 = a + b*(1.0-x)+ c*pow((1.0-x), 2),              // g2 = a + b (1-x) + c (1-x)^2 (where xi < x)
         integrand2 = A*pow(x, 2) + B*pow(x, 3);                     // g1 = A x^2 + B x^3 (where xi > x)
      C0 integal_1, integal_2;
      // w(x) = int_0^x g2 d(xi) + int_x^1 g1 d(xi)
      if(i != 0) integal_1 &= integrand1 | J(x / 16.0); else integal_1 &= C0(0.0);
      if(i != 10) integal_2 &= integrand2 | J((1.0-x) / 16.0); else integal_2 &= C0(0.0);
      double w = (double) (integal_1 + integal_2);
      cout << "w(" << x << "): " << w << endl;
      if(i != 10) x += 0.1;
   }
   return 0;
}
Listing 3•15 Integral equation solution using Green function for beam bending problem (project: “green_function”).
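The same integral can be checked with a few lines of standard C++. This sketch assumes unit constants and f = 1, uses a composite Simpson rule in place of the extended Bode rule of the listing, and compares against the closed-form deflection obtained by integrating -ψ(x) from Eq. 3•79 with E = I = f = M = L = 1; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Composite Simpson rule on [a, b] with n (even) subintervals.
double simpson(const std::function<double(double)>& g, double a, double b, int n = 16) {
    assert(n > 0 && n % 2 == 0);
    if (b <= a) return 0.0;  // empty interval, as for x = 0 or x = 1
    const double h = (b - a) / n;
    double s = g(a) + g(b);
    for (int k = 1; k < n; ++k) s += (k % 2 ? 4.0 : 2.0) * g(a + k * h);
    return s * h / 3.0;
}

// Green function with unit constants: g1 for x < xi, g2 otherwise.
double green(double x, double xi) {
    if (x < xi) {
        const double A = (1.0 + xi) / 2.0, B = -1.0 / 6.0;
        return A * x * x + B * x * x * x;
    }
    const double a = 0.5 + 0.5 * xi * xi - xi * xi * xi / 6.0;
    const double b = -1.0 - 0.5 * xi * xi;
    const double c = 0.5;
    return a + b * (1.0 - x) + c * (1.0 - x) * (1.0 - x);
}

// w(x) = int_0^x g2 d(xi) + int_x^1 g1 d(xi) for the constant load f = 1.
double deflection(double x) {
    auto g = [x](double xi) { return green(x, xi); };
    return simpson(g, 0.0, x) + simpson(g, x, 1.0);
}

// Closed-form deflection for unit constants, used only as a reference.
double exact_deflection(double x) {
    return x * x * x * x / 24.0 - x * x * x / 6.0 + 0.75 * x * x;
}
```

Because both pieces of the Green function are cubic polynomials in ξ, Simpson's rule integrates them without truncation error, so `deflection(x)` and `exact_deflection(x)` should agree to rounding at every x in [0, 1].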
[Figure: w plotted over 0 ≤ x ≤ 1; point values from the integral equation solution lie on the exact curve.]
Figure 3•21 Point values of the integral equation solution using the Green function, compared with the curve of the exact solution.
\[ -\nabla^2 u = f \]  Eq. 3•95
Denote the square region as Ω and its boundary as Γ. Multiply the left-hand-side with the variation “v” and inte-
grate over the square region Ω.
The first identity uses the integration by parts and the second identity uses divergence theorem of Gauss, where
n is the surface unit normal vector. For homogeneous boundary conditions the boundary integral term vanishes.
We investigate three sets of boundary conditions
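The matrix form of Eq. 3•96 in this particular 2-D setting can be re-written as
\[ M_{ij} = \int_0^1\!\!\int_0^1 \left( \frac{\partial\phi_i}{\partial x_0}\frac{\partial\phi_j}{\partial x_0} + \frac{\partial\phi_i}{\partial x_1}\frac{\partial\phi_j}{\partial x_1} \right) dx_0\, dx_1, \qquad b_i = \int_0^1\!\!\int_0^1 \phi_i f\, dx_0\, dx_1 \]  Eq. 3•97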
Dirichlet Boundary Conditions : the basis functions to satisfy these conditions are
Program Listing 3•16 implements Eq. 3•97 with the basis functions in Eq. 3•98, where N = 3 and f = 1. Notice that the 2-D integration is obtained by forming a 2-D array of weightings. The Jacobian of this problem is constant throughout the whole integration domain. Figure 3•22 shows the results obtained from Program Listing 3•16.
#include "include\vs.h"
#define PI 3.141592654
int main() {
   double f_0 = 1.0, weight[5][5],
      bode[5] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};
   for(int i = 0; i < 5; i++)
      for(int j = 0; j < 5; j++) weight[i][j] = bode[i] * bode[j];     // 2-D weightings
   Quadrature qp(weight[0], 0.0, 1.0, 5, 0.0, 1.0, 5);
   J d_a(pow( (1.0/4.0), 2.0));                                         // 2-D Jacobian
   H1 x(2, (double*)0, qp), phi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
      "int, int, Quadrature", 9, 2, qp);
   // phi_i = sin((m+1) pi x0) sin((n+1) pi x1), m, n = 0, 1, ..., N-1, and i = m * N + n
   for(int m = 0; m < 3; m++)
      for(int n = 0; n < 3; n++) phi[m*3+n] = sin((m+1.0)*PI*x[0])*sin((n+1.0)*PI*x[1]);
   // M_ii = int int ( (d(phi_i)/dx0)^2 + (d(phi_i)/dx1)^2 ) dx0 dx1
   H0 M_diag(9, (double*)0, qp);
   for(int i = 0; i < 9; i++)
      M_diag[i] = d(phi[i]).pow(2);
   // b_i = int int phi_i f dx0 dx1
   C0 M = M_diag | d_a,
      b = ( ((H0)phi) * f_0 ) | d_a,
      c(9, (double*)0);
   for(int i = 0; i < 9; i++) c[i] = b[i] / M[i];
   cout << c << endl;
   return 0;
}
Listing 3•16 The Poisson equation with the Dirichlet boundary conditions (project: “poisson_dirichlet”).
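The idea of a 2-D array of weightings formed from a 1-D rule can be illustrated in standard C++. The sketch below is not the library's implementation: it assumes f is constant, swaps the 5-point Bode rule for a composite Simpson rule with 40 subintervals per direction, and uses hypothetical names (`simpson_weights`, `ritz_coefficients`, `evaluate`). Like the listing, it relies on the sine basis being orthogonal so that only the diagonal of M is needed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

const double kPi = 3.14159265358979323846;

// Composite Simpson weights for n (even) subintervals of [0, 1].
std::vector<double> simpson_weights(int n) {
    assert(n > 0 && n % 2 == 0);
    std::vector<double> w(n + 1);
    for (int k = 0; k <= n; ++k)
        w[k] = ((k == 0 || k == n) ? 1.0 : (k % 2 ? 4.0 : 2.0)) / (3.0 * n);
    return w;
}

// Coefficients c_i = b_i / M_ii for phi_mq = sin(m pi x0) sin(q pi x1),
// m, q = 1..N, stored at index (m-1)*N + (q-1).
std::vector<double> ritz_coefficients(int N, double f, int n = 40) {
    const std::vector<double> w1 = simpson_weights(n);
    std::vector<double> c(N * N);
    for (int m = 1; m <= N; ++m)
        for (int q = 1; q <= N; ++q) {
            double Mii = 0.0, bi = 0.0;
            for (int a = 0; a <= n; ++a)
                for (int b = 0; b <= n; ++b) {
                    const double x0 = double(a) / n, x1 = double(b) / n;
                    const double w2 = w1[a] * w1[b];  // 2-D weight = product of 1-D weights
                    const double d0 = m * kPi * std::cos(m * kPi * x0) * std::sin(q * kPi * x1);
                    const double d1 = q * kPi * std::sin(m * kPi * x0) * std::cos(q * kPi * x1);
                    Mii += w2 * (d0 * d0 + d1 * d1);
                    bi += w2 * f * std::sin(m * kPi * x0) * std::sin(q * kPi * x1);
                }
            c[(m - 1) * N + (q - 1)] = bi / Mii;
        }
    return c;
}

// Evaluate u_N(x0, x1) = sum_i c_i phi_i(x0, x1).
double evaluate(const std::vector<double>& c, int N, double x0, double x1) {
    double u = 0.0;
    for (int m = 1; m <= N; ++m)
        for (int q = 1; q <= N; ++q)
            u += c[(m - 1) * N + (q - 1)] * std::sin(m * kPi * x0) * std::sin(q * kPi * x1);
    return u;
}
```

For f = 1 the integrals are also available in closed form, c_mq = 16 / (m q π^4 (m^2 + q^2)) for odd m and q and zero otherwise, which gives a convenient check on the quadrature.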
[Figure: surface plot of the solution over the unit square.]
Figure 3•22 Solution of Poisson equation with homogeneous Dirichlet boundary conditions.
For this Neumann boundary condition to be solvable, f cannot be a non-zero constant. On physical grounds, if we take the Poisson equation to describe the heat conduction problem, the homogeneous Neumann boundary conditions mean that the square region is insulated from its surroundings. The temperature in the region will
\[ \int_\Omega f\, dV = 0 \]  Eq. 3•100
In this case, we choose f = cos πx which satisfies Eq. 3•100. For the computation we only need to replace the
definition for φi (N = 2) and the source term f = cos πx.
Mixed Boundary Conditions : We choose approximation basis functions to be algebraic polynomials of the
form
We choose N = 2 and f = 1. The results of the Neumann boundary conditions and the mixed boundary conditions
are shown in Figure 3•23.
[Figure: two surface plots over the unit square. Left: Neumann boundary conditions, with f = cos πx. Right: mixed boundary conditions, with f = 1.0.]
Figure 3•23 Solutions of the Poisson equation with the Neumann and mixed boundary conditions.
Green Function Method : The Green function for Eq. 3•95 with the Dirichlet boundary conditions is¹
\[ g(x, y; \xi, \eta) = \sum_{n=1}^{\infty} \frac{1}{n\pi \sinh(n\pi)} \sin(n\pi\xi)\sin(n\pi x) \left\{ \cosh\big( n\pi(1 - (y + \eta)) \big) - \cosh\big( n\pi(1 - |y - \eta|) \big) \right\} \]  Eq. 3•102
1. Problem 9.3.9 in p. 147 of G.F. Carrier, and C.E. Pearson, 1988, “ Partial Differential Equations: Theory and technique”
2nd eds., Academic Press Inc., San Diego, CA.
Listing 3•17 The Green function method on the Poisson equation with the Dirichlet boundary conditions.
[Figure: surface plot of u over x and y.]
Figure 3•24 Eight-term (n=8) Green function method for the Poisson equation with the Dirichlet boundary conditions.
\[ u \approx u_N = \hat{\phi} + \sum_{N} c_i \phi_i \]  Eq. 3•103
where φ̂ is set to the essential boundary conditions of uΓ, and φi is homogeneous on the boundaries. Define the residual RN of the approximated solution as
R N = Au N – f Eq. 3•104
The residual can be distributed over the whole domain in an overall manner, and its integrated value is then set to zero, such as
where w is the weighting function. Different ways of defining the weighting function lead to different types of
approximation methods. The general class of methods in the form of Eq. 3•105 is known as the weighted-resid-
ual methods.
Point-Collocation Method
The weighting function of the point-collocation method can be expressed using the Dirac delta function that
is
Therefore, substituting Eq. 3•106 into Eq. 3•105 gives RN(ξ) = 0. What we have to do is simply pick a number of
collocation points ξi , evaluate their corresponding residuals, and obtain a system of equations by setting these
residual equations to zero.
Considering the example1
2
du
+ u + x = 0, 0<x<1 Eq. 3•108
dx2
1. p. 14 and on in C.A. Brebbia, J.C.F. Telles, and L.C. Wrobel, 1984, “Boundary element techniques: Theory and applica-
tions in engineering”, Springer-Verlag, Berlin, Germany.
sin ( x )
u exact = --------------- – x Eq. 3•109
sin ( 1 )
Two-term approximation basis functions which satisfy this homogeneous boundary condition can be taken from Eq. 3•61 on page 204,
That is u2 = c0 φ0 + c1 φ1. Two collocation points are necessary for solving the two unknown coefficients, and they are taken at ξ0 = 1/4 and ξ1 = 3/4. These two points generate two residual equations in matrix form as
\[ \begin{bmatrix} \frac{d^2\phi_0(\xi_0)}{dx^2} + \phi_0(\xi_0) & \frac{d^2\phi_1(\xi_0)}{dx^2} + \phi_1(\xi_0) \\ \frac{d^2\phi_0(\xi_1)}{dx^2} + \phi_0(\xi_1) & \frac{d^2\phi_1(\xi_1)}{dx^2} + \phi_1(\xi_1) \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} -\xi_0 \\ -\xi_1 \end{bmatrix} \]  Eq. 3•110
We can also write Eq. 3•110 as “M c = - b”. This two-point collocation problem is simple enough to be solved by hand. Alternatively, you can code it with VectorSpace C++ Library, which has the advantage that it extends to a higher number of basis functions and collocation points, where a matrix solution procedure becomes unavoidable (see project: “point_collocation”).
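The hand calculation can be sketched in standard C++ as follows. The sketch assumes the basis functions φ0 = x(1-x) and φ1 = x^2(1-x) that this chapter uses for the same example in the method of moment, and homogeneous boundary conditions u(0) = u(1) = 0 (which the exact solution of Eq. 3•109 satisfies); the function names are hypothetical and the 2x2 system is solved by Cramer's rule.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Basis functions phi0 = x(1-x), phi1 = x^2(1-x) and their second derivatives.
double phi(int i, double x)   { return i == 0 ? x * (1.0 - x) : x * x * (1.0 - x); }
double d2phi(int i, double x) { return i == 0 ? -2.0 : 2.0 - 6.0 * x; }

// Two-point collocation for u'' + u + x = 0: rows of Eq. 3.110 at xi0 and xi1.
std::array<double, 2> collocate(double xi0, double xi1) {
    const double pts[2] = {xi0, xi1};
    double M[2][2], rhs[2];
    for (int r = 0; r < 2; ++r) {
        for (int j = 0; j < 2; ++j) M[r][j] = d2phi(j, pts[r]) + phi(j, pts[r]);
        rhs[r] = -pts[r];
    }
    const double det = M[0][0] * M[1][1] - M[0][1] * M[1][0];
    assert(std::fabs(det) > 1e-14);  // the two points must give independent rows
    return {(rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det,
            (M[0][0] * rhs[1] - rhs[0] * M[1][0]) / det};
}

// Approximate solution u_2 = c0 phi0 + c1 phi1.
double approximate(const std::array<double, 2>& c, double x) {
    return c[0] * phi(0, x) + c[1] * phi(1, x);
}

// Exact solution of Eq. 3.109, for comparison.
double exact_solution(double x) { return std::sin(x) / std::sin(1.0) - x; }
```

Calling `collocate(0.25, 0.75)` and comparing `approximate(c, x)` with `exact_solution(x)` gives an error of the order 10^-3 or smaller across the interval, which is the scale of the error plot in Figure 3•25.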
[Figure: two panels over 0 ≤ x ≤ 1: the two-point collocation solution, and the error (Solution - Exact).]
Figure 3•25 Two-point collocation solution and its error compared to the exact solution.
We can investigate further the relationship between the point-collocation method and the finite difference method. Consider a domain (“cell”) with three equally spaced points xi-1, xi, and xi+1; the values of the approximating function u at these three points are ui-1, ui, and ui+1. The value of u between two consecutive points can be interpolated with a set of quadratic interpolation functions as
where
φ1 = ξ (ξ-1)/2, φ2 = (1-ξ) (1+ξ), and φ3 = ξ (1+ξ)/2, where -1 < ξ < 1 Eq. 3•112
Considering the cell length = 2h for the three equally spaced points, we have a constant Jacobian for the coordi-
nate transformation rule through-out the whole cell as dξ/dx = 1/h. With a point-collocation on ξ = 0, the first
derivative of u with respect to x is
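\[ \frac{du}{dx} = \frac{du}{d\xi}\frac{d\xi}{dx} = \left[ \left( \xi - \tfrac{1}{2} \right) u_{i-1} - 2\xi\, u_i + \left( \xi + \tfrac{1}{2} \right) u_{i+1} \right]_{\xi=0} \frac{1}{h} = \frac{1}{2h}\,(u_{i+1} - u_{i-1}) \]  Eq. 3•113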
Notice that the point-collocation is taken at ξ = 0. Eq. 3•113 is the central difference formula. We can check that
point-collocation at ξ = -1/2 and ξ = 1/2 will yield the backward difference and the forward difference formula,
respectively. The second derivative of u can be derived accordingly as
\[ \frac{d^2 u}{dx^2} = \frac{d^2 u}{d\xi^2}\left( \frac{d\xi}{dx} \right)^2 = \frac{1}{h^2}\,(u_{i-1} - 2u_i + u_{i+1}) \]  Eq. 3•114
This finite difference formula for the second derivative is independent of ξ; i.e., the position of collocation point.
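A short sketch makes the correspondence concrete. It differentiates the quadratic interpolation functions of Eq. 3•112 and applies the constant Jacobian dξ/dx = 1/h; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// du/dx of the quadratic interpolant at local coordinate xi in (-1, 1).
// um, u0, up are the nodal values u_{i-1}, u_i, u_{i+1}; the cell length is 2h.
double first_derivative(double um, double u0, double up, double h, double xi) {
    assert(h > 0.0);
    // Derivatives of phi1 = xi(xi-1)/2, phi2 = (1-xi)(1+xi), phi3 = xi(1+xi)/2.
    const double du_dxi = (xi - 0.5) * um - 2.0 * xi * u0 + (xi + 0.5) * up;
    return du_dxi / h;  // chain rule with d(xi)/dx = 1/h
}

// d2u/dx2 of the quadratic interpolant; independent of the collocation point.
double second_derivative(double um, double u0, double up, double h) {
    assert(h > 0.0);
    return (um - 2.0 * u0 + up) / (h * h);
}
```

With nodal values 1, 4, 9 (u = x^2 sampled at x = 1, 2, 3, so h = 1), collocation at ξ = 0 returns the central difference 4, at ξ = -1/2 the backward difference 3, and at ξ = 1/2 the forward difference 5, while the second derivative is 2 in all cases.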
\[ w = \begin{cases} 1, & \xi_1 < x < \xi_2 \\ 0, & -1 < x < \xi_1 \ \text{or}\ \xi_2 < x < 1 \end{cases} \]  Eq. 3•115
where x, ξ1, and ξ2 can be all defined in the interval of (-1, 1). For a function f(x), we have
\[ \int_{-1}^{1} w f\, dx = \int_{\xi_1}^{\xi_2} f\, dx \]
That is the domain of integration is now restricted to the subdomain bounded by [ξ1, ξ2]. Substituting the weight-
ing function of Eq. 3•115 in the weighted-residual statement of Eq. 3•105 gives
\[ \int_{\xi_1}^{\xi_2} R_N\, dx = 0 \]  Eq. 3•117
The subdomain-collocation method evaluates a number of subdomains bounded by different sets of [ξ1, ξ2] in
Eq. 3•117, for solving the coefficient vector c.
Considering the same example in the point-collocation case with the same basis functions and two subdo-
mains Ω1 = (-1, 0), and Ω2 = (0, 1), we then have
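\[ \begin{bmatrix} \int_{\Omega_1} \left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_{\Omega_1} \left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \\ \int_{\Omega_2} \left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_{\Omega_2} \left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} -\int_{\Omega_1} x\, dx \\ -\int_{\Omega_2} x\, dx \end{bmatrix} \]  Eq. 3•118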
The code which implements Eq. 3•118 with VectorSpace C++ Library is (project: “subdomain_collocation”)
The coefficient vector is not substantially different from that of the point-collocation case. The results are shown
in Figure 3•26.
[Figure: two panels over 0 ≤ x ≤ 1: the subdomain collocation solution, and its error.]
Figure 3•26 Solution of subdomain collocation and error compared with the exact solution.
We note that the first derivative of u with respect to x collocated in the interval of (-1/2, 1/2) is
\[ \int_{-1/2}^{1/2} \frac{du}{dx}\, d\xi = \int_{-1/2}^{1/2} \frac{du}{d\xi}\frac{d\xi}{dx}\, d\xi = \frac{1}{h}\left[ \tfrac{1}{2}\xi(\xi - 1)\,u_{i-1} + (1 - \xi)(1 + \xi)\,u_i + \tfrac{1}{2}\xi(1 + \xi)\,u_{i+1} \right]_{-1/2}^{1/2} = \frac{1}{2h}\,(u_{i+1} - u_{i-1}) \]  Eq. 3•119
This gives the central difference formula. Similarly, subdomains [-1, 0] and [0, 1] give the backward difference and forward difference formulas, respectively.
Method of Moment
In statistics, the idea of using moments to describe a distribution is taken from mechanics. The expectation of a distribution can be considered as the center of gravity. The variance is the second power of the distances of the data with respect to this “center of gravity”. Skewness and kurtosis are defined through the third and fourth powers of the distances, respectively.
For the weighted-residual method, we can choose the weighting function in Eq. 3•105 to be the various
power of coordinate variable x, such as
That is the residuals (or errors) are distributed through-out the domain with various moments of x. We notice that
the weighting function for the weighted residual method in general is not necessarily required to satisfy the
homogeneous boundary conditions.
Consider the example used for point-collocation and subdomain collocation, with the approximation basis functions φ0 = x (1-x) and φ1 = x^2 (1-x). We use the first two weighting moments w0 = 1 and w1 = x. This gives
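\[ \begin{bmatrix} \int_\Omega \left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_\Omega \left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \\ \int_\Omega x\left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_\Omega x\left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} -\int_\Omega x\, dx \\ -\int_\Omega x^2\, dx \end{bmatrix} \]  Eq. 3•121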
The code which implements Eq. 3•121 with VectorSpace C++ Library is (project: “method_of_moment”)
double w[5] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};   // Bode’s integration rule
C0 M(2, 2, (double*)0), b(2, (double*)0);
Quadrature qp(w, 0.0, 1.0, 5);
J d_l(1.0/4.0);
for(int i = 0; i < 2; i++) {
   H2 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 2, 1, qp);
   phi[0] = x*(1-x); phi[1] = x.pow(2)*(1-x);                            // φ0 = x (1-x), φ1 = x^2 (1-x)
   M[i] = (((H0)x).pow(i)*(+dd(phi))(0)+((H0)phi)) | d_l;                // d2φ/dx2 + φ
   b[i] = - ((H0)x).pow(i+1) | d_l;
}
C0 c = b / M;
cout << c << endl;                                                        // c = {0.166667, 0.212121}T
Consider the same example we solved starting from the point-collocation method in the above. We have
\[ \left[ \int_\Omega \phi_i \left( \frac{d^2\phi_j}{dx^2} + \phi_j \right) dx \right] c_j = -\int_\Omega \phi_i\, x\, dx \]  Eq. 3•123
For two term approximation (i = 0, 1), the matrix form can be written as
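\[ \begin{bmatrix} \int_\Omega \phi_0 \left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_\Omega \phi_0 \left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \\ \int_\Omega \phi_1 \left( \frac{d^2\phi_0}{dx^2} + \phi_0 \right) dx & \int_\Omega \phi_1 \left( \frac{d^2\phi_1}{dx^2} + \phi_1 \right) dx \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \end{bmatrix} = \begin{bmatrix} -\int_\Omega \phi_0\, x\, dx \\ -\int_\Omega \phi_1\, x\, dx \end{bmatrix} \]  Eq. 3•124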
The code implements Eq. 3•124 with VectorSpace C++ Library as (project: “galerkin_method”)
double w[5] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};   // Bode’s integration rule
C0 M(2, 2, (double*)0), b(2, (double*)0);
Quadrature qp(w, 0.0, 1.0, 5);
J d_l(1.0/4.0);
for(int i = 0; i < 2; i++) {
   H2 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 2, 1, qp);
   phi[0] = x*(1-x); phi[1] = x.pow(2)*(1-x);                            // φ0 = x (1-x), φ1 = x^2 (1-x)
   M[i] = (((H0)phi)[i]*(+dd(phi))(0)+((H0)phi)) | d_l;                  // d2φ/dx2 + φ
   b[i] = - (((H0)phi)[i]*((H0)x)) | d_l;
}
C0 c = b / M;
cout << c << endl;                                                        // c = {0.166667, 0.212121}T
The results of the above implementation are shown in Figure 3•27. Notice that although we choose the weight-
ing functions equal to the approximation basis functions; i.e., wi = φi , the left-hand-side matrix M is still not
symmetrical. This is because the operator A(u) = (d2 u/ dx2 + u) is not self-adjoint. .
For a positive definite operator A = (T*) T, where T* is the self-adjoint of T (i.e., T*=T), the Galerkin method
with homogeneous boundary conditions gives
= ( φ i, A φj ) Eq. 3•126
where a( . , . ) is a bilinear form which is symmetrical. Therefore, A is self-adjoint (Eq. 3•126), and the resultant weak formulation with the Galerkin weighting is symmetrical (Eq. 3•125). In the case of the self-adjoint operator, the Galerkin method with wi = φi is also known as the Bubnov-Galerkin method. When wi ≠ φi, it is known as the Petrov-Galerkin method.
A second example is the eigenvalue problem of a circular membrane of radius a given by¹
with homogeneous essential boundary conditions. This axisymmetric problem can be reduced to a 1-D problem
in polar coordinate r as
\[ -\frac{1}{r}\frac{d}{dr}\left( r\,\frac{du}{dr} \right) = \lambda u \]  Eq. 3•128
The approximation basis functions which satisfy the homogeneous boundary conditions are
\[ \phi_i = \cos\left( (2i + 1)\,\frac{\pi r}{2a} \right), \quad i = 0, 1, 2, \ldots, N - 1 \]  Eq. 3•129
1. p. 292 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
Therefore,
\[ \left[ -\int_0^a \phi_i \left( \frac{d\phi_j}{dr} + r\,\frac{d^2\phi_j}{dr^2} \right) dr \right] c_j = \lambda \left[ \int_0^a \phi_i \phi_j\, r\, dr \right] c_j \]  Eq. 3•131
Ax = λ B x Eq. 3•132
We solve Eq. 3•131 for N = 2 and a = 1. First, we notice that the off-diagonals of A are analytically identical since the Laplace operator is self-adjoint. However, numerically, the off-diagonals of A can be slightly different due to errors from numerical integration. We can either ignore such differences or symmetrize A numerically by defining As = (A+A^T)/2. Now we can solve the generalized eigenvalue problem by using the inverse of B as
However, (B^-1 As) will not be symmetrical, so we cannot use a symmetrical eigenvalue solver for this problem. Therefore, we first use Cholesky decomposition for the symmetrical matrix B = LL^T. The symmetry of the left-hand side can now be preserved by pre-multiplying by L^-1 and post-multiplying by (L^-1)^T in Eq. 3•132 to give
Program Listing 3•18 implements the Galerkin method with Laplace operator in Eq. 3•131, and using Eq. 3•134
to solve the symmetric eigenvalue problem.
The results are shown in Figure 3•29. The surface graphs represent the modes from eigenvectors as a function of “vi j φ j”, where vi is the i-th eigenvector. The frequencies ωi are computed from eigenvalues λi as
\[ \omega_i = \sqrt{\lambda_i} \]
#include "include\vs.h"
#define PI 3.141592654
int main() {
   double a_ = 1.0,                                       // radius a = 1
      // extended Bode's integration rule
      weight[25] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0,
                    64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                    24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0,
                    64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0};
   Quadrature qp(weight, 0.0, a_, 25);
   J d_r(a_/24.0);
   H2 r((double*)0, qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
               "int, int, Quadrature", 3, 1, qp);
   // phi_i = cos((2i+1) pi r / (2a)), i = 0, 1, 2
   phi[0] = cos((PI/2.0/a_)*r); phi[1] = cos((3.0*PI/2.0/a_)*r); phi[2] = cos(5.0*PI/2.0/a_*r);
   H0 d2_phi = INTEGRABLE_VECTOR("int, Quadrature", 3, qp);
   for(int i = 0; i < 3; i++) d2_phi[i] = dd(phi)(i)[0][0];
   // A = int_0^a ( -phi_i dphi_j/dr - r phi_i d2phi_j/dr2 ) dr
   C0 A = -( ( d(phi)(0)%((H0)phi) + d2_phi%((H0)phi)*((H0)r) ) | d_r ),
      // B = int_0^a phi_i phi_j r dr
      B = ( ((H0)phi)%((H0)phi) * ((H0)r) ) | d_r;
   C0 L = MATRIX("int, int", 3, 3);
   Cholesky Ch(B);                                        // Cholesky decomposition
   for(int i = 0; i < 3; i++)                             // lower triangular matrix
      for(int j = 0; j < 3; j++)
         if(i >= j) L[i][j] = Ch.rep_ptr()[0][i*(1+i)/2+j];
         else L[i][j] = 0.0;
   C0 L_inv = L.inverse(),
      C = L_inv*((A+~A)/2.0*(~L_inv)),                    // (L^-1) A_s ((L^-1)^T) x = lambda x
      lambda = Eigen(C).Eigenvalues(),                    // lambda = {5.78529, 30.4889, 75.0492}^T
      v = Eigen(C).Eigenvectors();                        // v0 = {0.999174, 0.398374, -0.008019}^T
   cout << lambda << endl;                                // v1 = {0.0396508, -0.998966, -0.022225}^T
   cout << v << endl;                                     // v2 = {0.00889577, -0.0218887, 0.99972}^T
   return 0;
}
Listing 3•18 Galerkin method for the eigenvalue problem of a circular membrane (project: “circular_membrance”).
Figure 3•29 Three frequency and mode pairs of the circular membrane eigenvalue problem.
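$$u \frac{d^2 u}{dx^2} + \left(\frac{du}{dx}\right)^2 = 1, \quad 0 < x < 1, \quad \text{with } u'(0) = 0, \text{ and } u(1) = \sqrt{2} \qquad \text{Eq. 3•135}$$
This problem can be transformed to one that has homogeneous boundary conditions by setting $u = v + \sqrt{2}$ such that
$$(v + \sqrt{2}) \frac{d^2 v}{dx^2} + \left(\frac{dv}{dx}\right)^2 = 1, \quad 0 < x < 1, \quad \text{with } v'(0) = 0,\ v(1) = 0 \qquad \text{Eq. 3•137}$$
This is equivalent to
$$\frac{d}{dx}\left[(v + \sqrt{2}) \frac{dv}{dx}\right] = 1, \quad 0 < x < 1, \quad \text{with } v'(0) = 0,\ v(1) = 0 \qquad \text{Eq. 3•138}$$
$$R(v_N) = \frac{d}{dx}\left[(v_N + \sqrt{2}) \frac{dv_N}{dx}\right] - 1 \qquad \text{Eq. 3•139}$$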
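The weighted residuals method with Galerkin weightings, $w_N = c_i \psi_i$, and approximation basis functions, $v_N = c_j \phi_j$, gives
$$I(c) = c_i \int_0^1 \psi_i R(v_N)\, dx = c_i \int_0^1 \psi_i \left\{ \frac{d}{dx}\left[(v_N + \sqrt{2}) \frac{dv_N}{dx}\right] - 1 \right\} dx = 0 \qquad \text{Eq. 3•140}$$
where $c_i$ are arbitrary constants and will eventually be dropped from Eq. 3•140, and $c_j$ is the only unknown vector of the non-linear problem. Integrating the first term by parts (the boundary term drops out because $\psi_i(1) = 0$ and because of the natural boundary condition $v'(0) = 0$) and changing the overall sign gives the weak formulation
$$I(c) = c_i \int_0^1 \left[ \frac{d\psi_i}{dx} (v_N + \sqrt{2}) \frac{dv_N}{dx} + \psi_i \right] dx = 0 \qquad \text{Eq. 3•141}$$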
1. p. 294 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
where $c_j^{k+1} = c_j^k + \delta c_j^k$. The approximation in this equation is the Taylor expansion to the first-order derivatives. That is, the increment of the solution $\delta c_j^k$ can be solved by
$$\delta c_j^k = -\left[\left.\frac{\partial I}{\partial c}\right|_{c^k}\right]^{-1} I(c_j^k) = -[I_T]^{-1} I(c_j^k) \qquad \text{Eq. 3•143}$$
Program Listing 3•19 implements the three-parameter approximation of the weak formulation I(c) (Eq. 3•141) and its tangent $I_T$ (Eq. 3•144), and then uses the iterative algorithm (Eq. 3•143; i.e., Newton’s method) to solve for the increment of the Ritz coefficients δc. If the preprocessing macro “__PETROV_GALERKIN” is defined at compile time, the corresponding code segment implements the Petrov-Galerkin method with the weighting functions
$$\psi_i = \cos\left(\frac{(2i + 1)\pi x}{2}\right), \quad i = 0, 1, 2 \qquad \text{Eq. 3•145}$$
Otherwise, the Bubnov-Galerkin method is assumed and $\psi_i = \phi_i$. The approximated solution is
$$u = v_N + \sqrt{2} = c_0 (1 - x) + c_1 (1 - x^2) + c_2 (1 - x^3) + \sqrt{2} \qquad \text{Eq. 3•146}$$
The results of the computation are shown in Figure 3•30. The data speak for themselves without the need for any elaboration.
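To make the Newton iteration concrete, here is a minimal one-parameter sketch in standard C++. It assumes a single Bubnov-Galerkin basis function φ = ψ = 1 - x² (an illustrative choice, not the three-parameter basis of Listing 3•19), for which the weak form I(c) = ∫₀¹ [ψ′(v + √2)v′ + ψ] dx with v = cφ integrates in closed form to (8/15)c² + (4√2/3)c + 2/3. The function names are hypothetical.

```cpp
#include <cmath>

// Weak form I(c) for the one-parameter trial function v = c (1 - x^2),
// integrated analytically over 0 < x < 1.
double weakForm(double c) {
    return (8.0 / 15.0) * c * c + (4.0 * std::sqrt(2.0) / 3.0) * c + 2.0 / 3.0;
}

// Tangent I_T = dI/dc.
double tangent(double c) {
    return (16.0 / 15.0) * c + 4.0 * std::sqrt(2.0) / 3.0;
}

// Newton's method: delta_c = -I / I_T, c <- c + delta_c, until |delta_c| < eps.
double newtonSolve(double c, double eps = 1e-12, int maxIter = 50) {
    for (int k = 0; k < maxIter; ++k) {
        const double dc = -weakForm(c) / tangent(c);
        c += dc;
        if (std::fabs(dc) < eps) break;
    }
    return c;
}
```

Starting from c = 0, `newtonSolve(0.0)` converges in a handful of iterations to c ≈ -0.39846, so that u(0) = c + √2 ≈ 1.016, compared with the exact value u(0) = 1. The one-parameter answer is deliberately crude; it only shows the structure of the iteration.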
#include "include\vs.h"
#define EPSILON 1.e-12
int main() {
   // extended Bode's integration rule
   double weight[13] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                        24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0,
                        14.0/45.0};
   Quadrature qp(weight, 0.0, 1.0, 13);
   J d_l(1.0/12.0);
   H1 x(qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
               "int, int, Quadrature", 3, 1, qp);
   // phi_i = 1 - x^(i+1), i = 0, 1, 2
   phi[0] = 1.0-x; phi[1] = 1.0-x.pow(2); phi[2] = 1.0-x.pow(3);
#if defined(__PETROV_GALERKIN)                            // Petrov-Galerkin method
   #define PI 3.141592654
   H1 psi = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
               "int, int, Quadrature", 3, 1, qp);
   // psi_i = cos((2i+1) pi x / 2), i = 0, 1, 2
   psi[0] = cos(PI/2.0*x); psi[1] = cos(3.0*PI/2.0*x); psi[2] = cos(5.0*PI/2.0*x);
#else                                                     // Bubnov-Galerkin method
   H1 psi;
   psi &= phi;                                            // psi_i = phi_i
#endif
   C0 c(3,(double*)0), delta_c(3,(double*)0);
   do {
      H1 v = c * phi;
      // I = int_0^1 ( dpsi_i/dx (v_N + sqrt(2)) dv_N/dx + psi_i ) dx
      C0 I = ( d(psi)(0) * ( ((H0)v)+sqrt(2.0) ) * d(v)[0] + ((H0)psi) ) | d_l,
         // I_T = int_0^1 dpsi_i/dx ( phi_j dv_N/dx + dphi_j/dx (v_N + sqrt(2)) ) dx
         I_t = ( d(psi)(0) % (((H0)phi) * d(v)[0] + d(phi)(0) * ( ((H0)v)+sqrt(2.0) ) ) ) | d_l;
      delta_c = - I / I_t;                                // delta_c = -I / I_T
      c += delta_c;                                       // c^(k+1) = c^k + delta_c^k
      cout << c << ", " << norm(delta_c) << endl;
   } while((double)norm(delta_c ) > EPSILON);
   cout << c << endl;          // Bubnov: c = {0.00539687, -0.547092, 0.127479}^T
   return 0;                   // Petrov: c = {0.00884522, -0.554076, 0.131396}^T
}
Listing 3•19 Galerkin method for a non-linear differential equation (project: “nonlinear_galerkin”).
Figure 3•30 [left panel: u versus x, with the exact solution $u_{exact}(x) = \sqrt{1 + x^2}$; right panel: errors of the Petrov-Galerkin and Bubnov-Galerkin solutions versus x]
$$\frac{\partial \| R_N \|_2^2}{\partial c} = 2 \left( R_N, \frac{\partial R_N}{\partial c} \right) = 0 \qquad \text{Eq. 3•147}$$
The factor “2” can be dropped since the equation equals zero. Comparing Eq. 3•147 with the weighted-residual statement $(R_N, w) = 0$ shows that the least squares method is a special case of the weighted-residual method with the weighting function “w” as
$$w = \frac{\partial R_N}{\partial c} \qquad \text{Eq. 3•148}$$
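$$I(c) = \left( R_N, \frac{\partial R_N}{\partial c} \right), \text{ and } I_T = \left.\frac{\partial I}{\partial c}\right|_{c_i} = \left( \frac{\partial R_N}{\partial c}, \frac{\partial R_N}{\partial c} \right) + \left( R_N, \frac{\partial^2 R_N}{\partial c^2} \right) \qquad \text{Eq. 3•149}$$
$$R_N = (v_N + \sqrt{2}) \frac{d^2 v_N}{dx^2} + \left(\frac{dv_N}{dx}\right)^2 - 1 \qquad \text{Eq. 3•150}$$
$$\frac{\partial^2 R_N}{\partial c^2} = \phi \otimes \frac{d^2\phi}{dx^2} + \frac{d^2\phi}{dx^2} \otimes \phi + 2 \frac{d\phi}{dx} \otimes \frac{d\phi}{dx} \qquad \text{Eq. 3•152}$$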
Program Listing 3•20 implements Eq. 3•149 to Eq. 3•152, with a three-parameter approximation. Considering that the exact solution of the problem is an even function, we assume the approximation basis functions are $\phi_0 = 1 - x^2$, $\phi_1 = 1 - x^4$, and $\phi_2 = 1 - x^6$. This set of approximation basis functions is as accurate as if we had used a six-parameter approximation with consecutive powers of algebraic functions. The errors of this computation are shown in Figure 3•31. Before we proceed any further, we notice that the combination of the concept of the subdomain collocation method and the weak formulation with the Galerkin method provides the foundations of the finite element method, which we will discuss in detail in Chapter 4 and Chapter 5.
#include "include\vs.h"
#define EPSILON 1.e-12
int main() {
   // extended Bode's integration rule
   double weight[13] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                        24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0,
                        14.0/45.0};
   Quadrature qp(weight, 0.0, 1.0, 13);
   J d_l(1.0/12.0);
   H2 x((double*)0, qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
               "int, int, Quadrature", 3, 1, qp);
   // phi_0 = 1 - x^2, phi_1 = 1 - x^4, and phi_2 = 1 - x^6
   phi[0] = 1.0-x.pow(2); phi[1] = 1.0-x.pow(4); phi[2] = 1.0-x.pow(6);
   H0 d2_phi = INTEGRABLE_VECTOR("int, Quadrature", 3, qp);
   for(int i = 0; i < 3; i++) d2_phi[i] = dd(phi)(i)[0][0];
   C0 c(3, (double*)0), delta_c(3, (double*)0);
   do {
      H2 w = c * phi;
      // R_N = (v_N + sqrt(2)) d2v_N/dx2 + (dv_N/dx)^2 - 1
      H0 R = d(w)[0].pow(2)+(((H0)w)+sqrt(2.0))*dd(w)[0][0]-1.0,
         // dR_N/dc = phi d2v_N/dx2 + (v_N + sqrt(2)) d2phi/dx2 + 2 dphi/dx dv_N/dx
         dR = 2.0*d(phi)(0)*d(w)[0]+((H0)phi)*dd(w)[0][0]+d2_phi*(((H0)w)+sqrt(2.0)),
         // d2R_N/dc2 = phi (x) d2phi/dx2 + d2phi/dx2 (x) phi + 2 dphi/dx (x) dphi/dx
         ddR = 2.0*(d(phi)(0)%d(phi)(0))+((H0)phi)%d2_phi+d2_phi%((H0)phi);
      // I = (R_N, dR_N/dc)
      C0 I = ( dR*R ) | d_l,
         // I_T = (dR_N/dc, dR_N/dc) + (R_N, d2R_N/dc2)
         I_t = ( dR%dR+ddR*R ) | d_l;
      delta_c = -I /I_t;                                  // delta_c = -I / I_T
      c += delta_c;                                       // c^(k+1) = c^k + delta_c^k
      cout << c << ", " << norm(delta_c) << endl;
   } while((double)norm(delta_c ) > EPSILON);
   cout << c << endl;
   return 0;                   // c = {-0.491866, 0.0970614, -0.018541}^T
}
Listing 3•20 Least squares method for a non-linear differential equation (project:
“nonlinear_least_squares”).
Figure 3•31 [error of the least squares solution]
$$(R_N, w) = \int_\Omega R_N w \, d\Omega = \int_\Omega (\nabla^2 u + f) w \, d\Omega = 0 \qquad \text{Eq. 3•153}$$
If we integrate by parts once, and apply Green’s theorem to transform the volume integral to the surface integral, we have
$$\int_\Omega (\nabla u \cdot \nabla w - f w) \, d\Omega = \int_\Gamma (n \cdot \nabla u) w \, d\Gamma \qquad \text{Eq. 3•154}$$
When the approximation basis functions for u and w are the same, the first term in the left-hand-side of Eq. 3•154 is equivalent to the corresponding term in the Bubnov-Galerkin method for a self-adjoint operator (weak formulation) discussed in Eq. 3•125. When the right-hand-side of Eq. 3•154 is included in the variational statement, Eq. 3•154 is also equivalent to the corresponding term in the Rayleigh-Ritz method in Eq. 3•53. Taking Eq. 3•154, integrating by parts once more, and again applying Green’s theorem to transform the volume integral to a surface integral, we have
$$\int_\Omega \left( (\nabla^2 w) u + f w \right) d\Omega = \int_\Gamma (n \cdot \nabla w) u \, d\Gamma - \int_\Gamma (n \cdot \nabla u) w \, d\Gamma \qquad \text{Eq. 3•155}$$
An alternative view to the weighted-residual derivation from Eq. 3•153 is possible. By setting $f = -\nabla^2 u$, we identify, directly, Eq. 3•154 as Green’s first identity, and Eq. 3•155 as Green’s second identity.1
Trefftz Method
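In the Trefftz method, both u and w in Eq. 3•155 are taken as harmonic functions. By definition, harmonic functions satisfy the Laplace equation, $\nabla^2 u = 0$ and $\nabla^2 w = 0$, so that, with f = 0, Eq. 3•155 reduces to
$$\int_\Gamma (n \cdot \nabla w) u \, d\Gamma = \int_\Gamma (n \cdot \nabla u) w \, d\Gamma \qquad \text{Eq. 3•157}$$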
1. p. 450 in L.E. Malvern, 1969, “Introduction to the mechanics of a continuous medium”, Prentice-Hall, Englewood Cliffs,
N.J.
2. p. 38 in C.A. Brebbia, J.C.F. Telles, and L.C. Wrobel, 1984, “Boundary element Techniques: Theory and applications in
engineering”, Springer-Verlag, Berlin, Germany.
$$-\nabla^2 u = f \qquad \text{Eq. 3•158}$$
$$u = -\frac{f}{4} (x^2 + y^2) + v \qquad \text{Eq. 3•159}$$
$H_0 = 1$,
$H_1 = x^2 - y^2$,
$H_2 = x^4 - 6x^2y^2 + y^4$,
$H_3 = x^6 - 15x^4y^2 + 15x^2y^4 - y^6$,
$H_4 = x^8 - 28x^6y^2 + 70x^4y^4 - 28x^2y^6 + y^8$,
... etc.,
with $v = c_i \phi_i$ and $v_\Gamma \equiv \bar{v} = (x^2 + y^2) f / 4$ at the boundaries x = ±1 and y = ±1. Then, v is taken in place of u and w in Eq. 3•157. For this problem ∀q ≡ ( n • ∇u ) = 0; i.e., the total flux vanishes on the boundaries, and the right-hand-side of Eq. 3•157 equals zero. Considering the symmetry of the problem, we only need to compute the boundary at x = 1, and 0 < y < 1.
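The algebraic harmonic functions listed above are the real parts of even powers of the complex variable z = x + iy. The following minimal sketch in standard C++ evaluates them that way and checks harmonicity with a five-point finite-difference Laplacian; the function names `harmonicEven` and `laplacianFD`, and the step size, are illustrative assumptions.

```cpp
#include <complex>

// H_n(x, y) = Re((x + i y)^(2n)); n = 0 gives 1, n = 1 gives x^2 - y^2,
// n = 2 gives x^4 - 6 x^2 y^2 + y^4, and so on.
double harmonicEven(int n, double x, double y) {
    const std::complex<double> z(x, y);
    std::complex<double> p(1.0, 0.0);
    for (int k = 0; k < 2 * n; ++k) p *= z;   // repeated multiplication, no pow()
    return p.real();
}

// Five-point finite-difference approximation of the Laplacian of H_n at (x, y).
// For a harmonic function the result should be close to zero.
double laplacianFD(int n, double x, double y, double h = 1e-4) {
    return (harmonicEven(n, x + h, y) + harmonicEven(n, x - h, y) +
            harmonicEven(n, x, y + h) + harmonicEven(n, x, y - h) -
            4.0 * harmonicEven(n, x, y)) / (h * h);
}
```

For example, `harmonicEven(2, 1.0, 1.0)` evaluates 1 - 6 + 1 = -4, and `laplacianFD(3, 0.3, 0.7)` is zero up to discretization and rounding error.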
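$$\int_0^1 \left. \frac{\partial v}{\partial x} v \right|_{x=1} dy = \int_0^1 \left. \frac{\partial v}{\partial x} \bar{v} \right|_{x=1} dy \qquad \text{Eq. 3•160}$$
$$M = \int_0^1 \left. \left( \frac{\partial \phi_i}{\partial x} \otimes \phi_i \right) \right|_{x=1} dy, \text{ and } b = \int_0^1 \left. \frac{\partial \phi_i}{\partial x} \bar{v} \right|_{x=1} dy \qquad \text{Eq. 3•162}$$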
Due to the symmetry of the problem with respect to x and y, only $H_0$, $H_2$ and $H_4$ out of the list of algebraic harmonic functions will be taken. In view of the problem at hand, $H_0 = 1.0$ cannot be in $\phi_i$, because its x-derivatives in the matrix M and vector b are both zero. We can remedy this by taking only
$$\phi_1 = H_2 = x^4 - 6x^2y^2 + y^4 \text{ and } \phi_2 = H_4 = x^8 - 28x^6y^2 + 70x^4y^4 - 28x^2y^6 + y^8 \qquad \text{Eq. 3•163}$$
$$\int_0^1 u \, dy = 0 = \int_0^1 \left. ( -\bar{v} + c_1 \phi_1 + c_2 \phi_2 + c_0 ) \right|_{x=1} dy \qquad \text{Eq. 3•164}$$
$$c_0 = -\int_0^1 \left. ( -\bar{v} + c_1 \phi_1 + c_2 \phi_2 ) \right|_{x=1} dy \qquad \text{Eq. 3•165}$$
Program Listing 3•21 implements Eq. 3•162 with the basis functions of Eq. 3•163 and the integration constant $c_0$ computed from Eq. 3•165. Alternatively, we can use transcendental harmonic functions symmetrized with respect to x and y as
If the macro “__TRANSCENDENTAL” is defined at compile time, the corresponding code segments are turned on.
The first quadrant solution of this problem is shown in Figure 3•32, which is directly comparable to the solution of the same problem shown in the right-hand side of Figure 3•23. Transcendental harmonic functions give almost identical results.
Figure 3•32 The first quadrant solution of the Poisson equation using the Trefftz method.
Listing 3•21 Solution of Poisson equation with Trefftz method (project: “trefftz_method”).
In the case of heat conduction, we use the temperature T in place of u; therefore, we have the heat flow q = n • ∇u = ∂u ⁄ ∂n. In the boundary element method, the weighting function w is taken as the fundamental solution (Green’s function) T*, which satisfies
$$\nabla^2 T^*(x, \xi) = \frac{-\delta(x, \xi)}{k} \qquad \text{Eq. 3•167}$$
where δ(x, ξ) is the Dirac delta function with x as the sampling point, ξ as the point source location, and k is the conductivity. In the two-dimensional case, the solution is
$$T^*(x, \xi) = \frac{-1}{2\pi k} \ln(r) \qquad \text{Eq. 3•168}$$
$$q^* = -k \nabla T^* = \frac{1}{2\pi r^2} \left[ (x_0 - \xi_0) e_0 + (x_1 - \xi_1) e_1 \right] \qquad \text{Eq. 3•169}$$
For simplicity, we deal with problems without an internal heat source; i.e., f = 0. When f ≠ 0, a domain integral is required, and from the programming point of view the boundary element method begins to lose its advantage over the finite element method. Substituting Eq. 3•168 and Eq. 3•169 into Eq. 3•166, we have
$$\int_\Omega \frac{-\delta(x, \xi)}{k} T \, d\Omega - \int_\Omega (0) T^* \, d\Omega = \int_\Gamma \left( -\frac{n \cdot q^*}{k} \right) T \, d\Gamma - \int_\Gamma (n \cdot q) T^* \, d\Gamma \qquad \text{Eq. 3•170}$$
where c = 1 if ξ is inside Ω, c = 0 if ξ is outside Ω, and c = 1/2 if ξ is on a smooth boundary Γ. The boundary inte-
gral equation for the boundary element method is obtained by discretizing the boundary Γ to boundary elements
as
where T and q can be either a variable or a specified boundary condition. For the matrix form representation of Eq. 3•172, we reserve T and q as the unknowns and $\bar{T}$ and $\bar{q}$ as the corresponding boundary conditions, and denote
$$G_{ij} = \int_{\Gamma_j} T^*(x, \xi_i) \, d\Gamma, \text{ and } \hat{H}_{ij} = \int_{\Gamma_j} (n \cdot q^*(x, \xi_i)) \, d\Gamma, \text{ with } H_{ij} = \hat{H}_{ij} - \frac{\delta_{ij}}{2} \qquad \text{Eq. 3•173}$$
$$\begin{bmatrix} H_{ij} & -G_{ij} \end{bmatrix} \begin{Bmatrix} T_j \\ q_j \end{Bmatrix} = -H_{ij} \bar{T}_j + G_{ij} \bar{q}_j \qquad \text{Eq. 3•174}$$
This can be re-written as “A x = b”. Two singular integrations occur in Eq. 3•174 when the source location ξ is right on the element under consideration; i.e., all diagonal terms on the left-hand-side. For an element with a constant shape function, it can be proved that
$$\hat{H}_{ii} = \int_{\Gamma_e} (n \cdot q^*) \, d\Gamma = 0, \text{ and } G_{ii} = \int_{\Gamma_e} T^* \, d\Gamma = \frac{h}{2\pi k} \left[ \ln\left(\frac{2}{h}\right) + 1 \right] \qquad \text{Eq. 3•175}$$
where h is the size of the constant element with one variable node at the middle of the element. After the variables T and q are solved from Eq. 3•174, the interior temperature at any location ξ can be recovered using Eq. 3•172, by setting c = 1. The interior heat flux can also be recovered by applying Fourier’s law of conduction and Eq. 3•172 (c = 1). We have a response gradient boundary integral equation as
$$q(\xi) = -k \frac{\partial T(\xi)}{\partial \xi} = -k \left[ \sum_e (n \cdot q)_e \int_{\Gamma_e} \frac{\partial T^*(x, \xi)}{\partial \xi} \, d\Gamma - \sum_e T_e \int_{\Gamma_e} \frac{\partial (n \cdot q^*(x, \xi))}{\partial \xi} \, d\Gamma \right] \qquad \text{Eq. 3•176}$$
where
$$\frac{\partial T^*(x, \xi)}{\partial \xi_i} = \frac{-1}{2\pi k} r^{-1} \frac{\partial r}{\partial \xi_i} = \frac{(x_i - \xi_i)}{2\pi k r^2}, \text{ since } \frac{\partial r}{\partial \xi_i} = \frac{-(x_i - \xi_i)}{r} \qquad \text{Eq. 3•177}$$
and
1. p. 69 in C.A. Brebbia, J.C.F. Telles, and L.C. Wrobel, 1984, “Boundary element Techniques: Theory and applications in
engineering”, Springer-Verlag, Berlin, Germany.
$$\frac{\partial (n \cdot q^*(x, \xi))}{\partial \xi_i} = \frac{1}{2\pi k} \left[ -r^{-2} (n \cdot e_i) - 2 r^{-3} \frac{\partial r}{\partial \xi_i} ((x - \xi) \cdot n) \right] = \frac{1}{2\pi k r^2} \left[ -n_i + \frac{2 (x_i - \xi_i) ((x - \xi) \cdot n)}{r^2} \right] \qquad \text{Eq. 3•178}$$
We consider a trivial example whose solution is self-evident, for checking our implementation of this method. We investigate the conduction of heat on a square region -1 < x < 1, and -1 < y < 1. The upper and lower boundaries are insulated from their surroundings by setting $q_y = 0$, the left boundary has T = 0, and the right boundary has T = 100. Because of the steady state condition $\nabla^2 T = 0$, the gradient in the x direction is constant. This leads to a linear temperature distribution and a constant heat flux in the x direction. In the y direction, the temperature distribution is uniform and the heat flux is zero. Program Listing 3•22 to Program Listing 3•26 implement Eq. 3•174 to solve for the variables T and q on the boundaries, and then use Eq. 3•172 and Eq. 3•176 to recover the interior temperature and heat flux. We discretize each side of the square region into eight equal-length constant elements. Program Listing 3•22 is the “main()” program of the boundary element method.
#include "include\vs.h"
static const double PI = 3.141592654;
static C0 A(32, 32, (double*)0);
static C0 f(32, (double*)0);
static C0 H(32, 16, A, 0, 0);
static C0 mG(32, 16, A, 0, 16);
// extended Bode's integration rule
double w[9] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0,
               64.0/45.0, 24.0/45.0, 64.0/45.0, 14.0/45.0},
   // element end points coordinates
   x[8][2] = {{-1.0, -0.75}, {-0.75, -0.5}, {-0.5, -0.25}, {-0.25, -0.0}, {0.0, 0.25}, {0.25, 0.5},
              {0.5, 0.75}, {0.75, 1.0}},
   y[8][2] = {{-1.0, -0.75}, {-0.75, -0.5}, {-0.5, -0.25}, {-0.25, -0.0}, {0.0, 0.25}, {0.25, 0.5},
              {0.5, 0.75}, {0.75, 1.0}},
   // element variable node coordinates
   zai[8] = {-0.875, -0.625, -0.375, -0.125, 0.125, 0.375, 0.625, 0.875},
   eta[8] = {-0.875, -0.625, -0.375, -0.125, 0.125, 0.375, 0.625, 0.875};
void LHS_0_15();
void LHS_16_31();
void RHS();
void T_recovery(const C0&, const C0&, const C0&);
void q_recovery(const C0&, const C0&, const C0&, const C0&);
int main() {
   LHS_0_15();                                            // rows 0-15 of A
   LHS_16_31();                                           // rows 16-31 of A
   RHS();                                                 // b
   C0 Y = f / A,                                          // solve boundary integral equations
      T_gamma(16, Y, 0), q_gamma(16, Y, 16),
      T(8, 8, (double*)0), q_x(8, 8, (double*)0), q_y(8, 8, (double*)0);
   cout << T_gamma << endl;
   cout << q_gamma << endl;
   T_recovery(T_gamma, q_gamma, T);                       // interior temperature recovery
   cout << T << endl;
   q_recovery(T_gamma, q_gamma, q_x, q_y);                // interior heat flux recovery
   cout << q_x << endl;
   cout << q_y << endl;
   return 0;
}
Listing 3•22 The main program of the boundary element method (project: “boundary_element_method”).
Program Listing 3•23 implements the first 16 rows of matrix A, corresponding to source positions $\xi_i$ on the 8 upper boundary elements and 8 lower boundary elements. From Eq. 3•173, the diagonal elements of $H_{ij}$ are first assigned “-1/2”. For this particular case, if the source position $\xi_i$ is located at any element of the upper boundary, the integrals of $H_{ij}$ over all upper boundary elements (with second index “j”) become singular and their values are zero according to Eq. 3•175. This is also true for source positions at lower boundary elements, which makes all lower boundary element integrals zero. For a source position $\xi_i$ at the upper boundary and integrals on the lower boundary elements (with index “j”), and vice versa, we have
$$\hat{H}_{ij} = \int_{\Gamma_j} (n \cdot q^*(x, \xi_i)) \, d\Gamma = \int_{\Gamma_j} \frac{(x_1 - \xi_1)}{2\pi r^2} \, d\Gamma = \int_{\Gamma_j} \frac{2}{2\pi r^2} \, d\Gamma = \int_{\Gamma_j} \frac{1}{\pi r^2} \, d\Gamma \qquad \text{Eq. 3•179}$$
The $-G_{ij}$ terms in the first 16 rows of matrix A, corresponding to the 8 upper boundary elements and 8 lower boundary elements, are simply
$$-G_{ij} = -\int_{\Gamma_j} T^*(x, \xi_i) \, d\Gamma = \int_{\Gamma_j} \frac{\ln(r(x, \xi_i))}{2\pi k} \, d\Gamma \qquad \text{Eq. 3•180}$$
void LHS_16_31() {
   // -G_ii = -(h/(2 pi k)) (ln(2/h) + 1)
   double mG_diag = 0.125 / PI * (log(0.125)-1.0);
   for(int i = 0; i < 16; i++) mG[i+16][i] = mG_diag;
   for(int i = 0; i < 8; i++) {
      for(int j = 0; j < 8; j++) {
         Quadrature qp(w, y[j][0], y[j][1], 9);
         J d_l(0.25/8.0);
         H0 Y(qp);
         if(i != j) {
            // -G_ij = int_Gamma_j ln(r)/(2 pi k) dGamma
            if(i<j) mG[i+16][j] = mG[i+24][j+8] = 1.0/(2.0*PI) * log(Y-eta[i]) | d_l;
            else mG[i+16][j] = mG[i+24][j+8] = 1.0/(2.0*PI) * log(eta[i]-Y) | d_l;
         }
      }
   }
   for(int i = 0; i < 8; i++)
      for(int j = 0; j < 8; j++) {
         Quadrature qp(w, y[j][0], y[j][1], 9);
         J d_l(0.25/8.0);
         H0 Y(qp),
            R = sqrt(4+(Y-eta[i]).pow(2));
         mG[i+16][j+8] = mG[i+24][j] = 1.0/(2.0*PI) * log(R) | d_l;
      }
   for(int i = 0; i < 8; i++)
      for(int j = 0; j < 8; j++) {
         Quadrature qp(w, x[j][0], x[j][1], 9);
         J d_l(0.25/8.0);
         H0 X(qp),
            R0_2 = pow(1-eta[i], 2) +(X-1).pow(2),
            R1_2 = pow(1+eta[i], 2) +(X-1).pow(2),
            R2_2 = pow(1-eta[i], 2) +(X+1).pow(2),
            R3_2 = pow(1+eta[i], 2) +(X+1).pow(2);
         // H^_ij = int_Gamma_j (n.e_1)(x_1 - xi_1)/(2 pi r^2) dGamma
         H[i+16][j] = (1.0-eta[i])/(2.0*PI*R0_2) | d_l;
         H[i+16][j+8] = (1.0+eta[i])/(2.0*PI*R1_2) | d_l;
         H[i+24][j] = (1.0-eta[i])/(2.0*PI*R2_2) | d_l;
         H[i+24][j+8] = (1.0+eta[i])/(2.0*PI*R3_2) | d_l;
      }
}
Listing 3•24 Function LHS_16_31() (project: “boundary_element_method”).
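The diagonal value `mG_diag` in the listing above comes from the closed form in Eq. 3•175. As a hedged cross-check, the sketch below (standard C++, independent of the VectorSpace C++ Library) compares that closed form with a plain midpoint-rule estimate of the singular integral of T* = -ln|s|/(2πk) over one constant element of size h. The function names and the number of cells are illustrative assumptions.

```cpp
#include <cmath>

// Closed form of G_ii = int_{-h/2}^{h/2} -ln|s| / (2 pi k) ds = h/(2 pi k) (ln(2/h) + 1).
double gDiagClosedForm(double h, double k) {
    const double pi = std::acos(-1.0);
    return h / (2.0 * pi * k) * (std::log(2.0 / h) + 1.0);
}

// Midpoint-rule estimate of the same integral. With an even number of cells
// no midpoint coincides with the singular point s = 0.
double gDiagMidpoint(double h, double k, int cells) {
    const double pi = std::acos(-1.0);
    const double ds = h / cells;
    double sum = 0.0;
    for (int i = 0; i < cells; ++i) {
        const double s = -0.5 * h + (i + 0.5) * ds;
        sum += -std::log(std::fabs(s)) / (2.0 * pi * k) * ds;
    }
    return sum;
}
```

With h = 0.25 and k = 1, `-gDiagClosedForm(0.25, 1.0)` reproduces the expression 0.125/π · (ln 0.125 - 1) used for `mG_diag`. The midpoint estimate converges slowly because of the logarithmic singularity, which is one reason special integration rules are preferred in practice.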
Similarly, Program Listing 3•24 implements the last 16 rows of matrix A, corresponding to source positions $\xi_i$ on the 8 right boundary elements and 8 left boundary elements. Now the singular integration corresponds to the diagonals of the $-G_{ij}$ terms in the last 16 rows of matrix A. According to Eq. 3•175, that is
$$-G_{ii} = -\int_{\Gamma_i} T^* \, d\Gamma = -\frac{h}{2\pi k} \left[ \ln\left(\frac{2}{h}\right) + 1 \right] \qquad \text{Eq. 3•181}$$
Otherwise, $-G_{ij}$ uses the definition in Eq. 3•180. For the terms of $H_{ij}$
$$\hat{H}_{ij} = \int_{\Gamma_j} (n \cdot q^*(x, \xi_i)) \, d\Gamma = \int_{\Gamma_j} \frac{(n \cdot e_1)(x_1 - \xi_1)}{2\pi r^2} \, d\Gamma \qquad \text{Eq. 3•182}$$
void RHS() {
   for(int i = 0; i < 8; i++) {
      for(int j = 0; j < 8; j++) {
         Quadrature qp(w, y[j][0], y[j][1], 9);
         J d_l(0.25/8.0);
         H0 Y(qp),
            R0_2 = pow(1-zai[i], 2)+(Y-1).pow(2),
            R1_2 = pow(1-zai[i], 2)+(Y+1).pow(2);
         // b_i = -(100/(2 pi)) int_Gamma_j (1 - xi_0)/r^2 dGamma
         f[i] += (-100.0*(1-zai[i]))/(2.0*PI*R0_2) | d_l;
         f[i+8] += (-100.0*(1-zai[i]))/(2.0*PI*R1_2) | d_l;
      }
   }
   for(int i = 16; i < 24; i++) f[i] = 100.0 / 2.0;       // b_i = 100/2
   for(int i = 24; i < 32; i++) {
      for(int j = 0; j < 8; j++) {
         Quadrature qp(w, y[j][0], y[j][1], 9);
         J d_l(0.25/8.0);
         H0 Y(qp), R_pow_2 = 4+(Y-eta[i-24]).pow(2);
         // b_i = -(100/pi) int_Gamma_j 1/r^2 dGamma
         f[i] += - 100.0 / (PI*R_pow_2) | d_l;
      }
   }
}
$$b_i = -\hat{H}_{ij} \bar{T}_j = -100 \int_{\Gamma_j} (n \cdot q^*(x, \xi_i)) \, d\Gamma = -100 \int_{\Gamma_j} \frac{(n \cdot e_0)(x_0 - \xi_0)}{2\pi r^2} \, d\Gamma = -\frac{100}{2\pi} \int_{\Gamma_j} \frac{(1 - \xi_0)}{r^2} \, d\Gamma \qquad \text{Eq. 3•183}$$
For source position ξ on left boundary elements, singular integrals occur. Therefore, $b_i = -c\,T_j$, where c = -1/2; i.e., $b_i$ = 100/2 = 50, for i = 16 to 23. For source position ξ on right boundary elements (i = 24 to 31), we have
$$b_i = -\hat{H}_{ij} \bar{T}_j = -100 \int_{\Gamma_j} (n \cdot q^*(x, \xi_i)) \, d\Gamma = -100 \int_{\Gamma_j} \frac{(n \cdot e_0)(x_0 - \xi_0)}{2\pi r^2} \, d\Gamma = -\frac{100}{\pi} \int_{\Gamma_j} \frac{1}{r^2} \, d\Gamma \qquad \text{Eq. 3•184}$$
Program Listing 3•26 for interior temperature and interior heat flux recovery is a straightforward implementation of Eq. 3•172 and Eq. 3•176.
The results of this simple problem are trivial. Suffice it to say that close to the boundaries the accuracy deteriorates fast, since we have many ln(r), 1/r, 1/r² ... etc. terms in the equations. This phenomenon is known as hypersingularity of the boundary integral equation. These complex functions are very challenging for numerical integration. Special integration rules for these functions are common practice in the boundary element method.
void T_recovery(const C0& T_gamma, const C0& q_gamma, const C0& T) {
   // T(xi) = sum_e (n.q)_e int_Gamma_e T*(x, xi) dGamma - sum_e T_e int_Gamma_e (n.q*(x, xi)) dGamma,
   // where q* = 1/(2 pi r^2) [ (x_0 - xi_0) e_0 + (x_1 - xi_1) e_1 ]
   // and T*(x, xi) = -1/(2 pi k) ln(r)
   for(int i = 0; i < 8; i++)
      for(int j = 0; j < 8; j++) {
         for(int k = 0; k < 8; k++) {
            Quadrature qpx(w, x[k][0], x[k][1], 9);
            J d_l(0.25/8.0);
            H0 X(qpx),
               R0_2 = pow((1-eta[i]), 2) + (X-zai[j]).pow(2),
               R1_2 = pow((1+eta[i]), 2) + (X-zai[j]).pow(2);
            T[i][j] += T_gamma[k] /(2.0*PI)*(1-eta[i])/R0_2 | d_l;
            T[i][j] += T_gamma[k+8] /(2.0*PI)*(1+eta[i])/R1_2 | d_l;
            Quadrature qpy(w, y[k][0], y[k][1], 9);
            H0 Y(qpy),
               R2_2 = pow((1-zai[j]), 2) + (Y-eta[i]).pow(2),
               R3_2 = pow((zai[j]+1), 2) + (Y-eta[i]).pow(2);
            T[i][j] += 100.0 / (2.0*PI)*(1-zai[j])/R2_2 | d_l;
            T[i][j] += q_gamma[k] / (2.0*PI)*log(sqrt(R2_2)) | d_l;
            T[i][j] += q_gamma[k+8] / (2.0*PI)*log(sqrt(R3_2)) | d_l;
         }
      }
}
void q_recovery(const C0& T_gamma, const C0& q_gamma,
   const C0& q_x, const C0& q_y) {
   // q(xi) = -k sum_e (n.q)_e int_Gamma_e dT*(x, xi)/dxi dGamma
   //         + k sum_e T_e int_Gamma_e d(n.q*(x, xi))/dxi dGamma,
   // where dT*(x, xi)/dxi_i = (x_i - xi_i)/(2 pi k r^2) and
   // d(n.q*(x, xi))/dxi_i = 1/(2 pi k r^2) [ -n_i + 2 (x_i - xi_i)((x - xi).n)/r^2 ]
   double nx, ny;
   nx = 1.0;
   for(int i = 0; i < 8; i++)
      for(int j = 0; j < 8; j++) {
         for(int k = 0; k < 8; k++) {
            Quadrature qpx(w, x[k][0], x[k][1], 9);
            J d_l(0.25/8.0);
            H0 X(qpx),
               R0_2 = pow((1-eta[i]), 2) + (X-zai[j]).pow(2),
               R1_2 = pow((1+eta[i]), 2) + (X-zai[j]).pow(2);
            ny = 1.0;
            q_x[i][j] -= (ny*T_gamma[k] /(2.0*PI*R0_2)*(2.0*(X-zai[j])*(1-eta[i])/R0_2)) | d_l;
            q_y[i][j] -= (ny*T_gamma[k] /(2.0*PI*R0_2)*
                          (2.0*(1-eta[i])*(1-eta[i])/R0_2-1.0)) | d_l;
            ny = -1.0;
            q_x[i][j] -= (ny*T_gamma[k+8] /(2.0*PI*R1_2)*
                          (2.0*(X-zai[j])*(-1-eta[i])/R1_2)) | d_l;
            q_y[i][j] -= (ny*T_gamma[k+8] /(2.0*PI*R1_2)*
                          (2.0*(-1-eta[i])*(-1-eta[i])/R1_2-1.0)) | d_l;
            Quadrature qpy(w, y[k][0], y[k][1], 9);
            H0 Y(qpy),
               R2_2 = pow((1-zai[j]), 2) + (Y-eta[i]).pow(2),
               R3_2 = pow((zai[j]+1), 2) + (Y-eta[i]).pow(2);
            q_x[i][j] -= (nx*100.0 / (2.0*PI*R2_2)*(2.0*(1-zai[j])*(1-zai[j])/R2_2 - 1.0)) | d_l;
            q_y[i][j] -= (nx*100.0 / (2.0*PI*R2_2)*(2.0*(Y-eta[i])*(1-zai[j])/R2_2 )) | d_l;
            q_x[i][j] += (q_gamma[k] / (2.0*PI*R2_2)*(1-zai[j])) | d_l;
            q_y[i][j] += (q_gamma[k] / (2.0*PI*R2_2)*(Y-eta[i])) | d_l;
            q_x[i][j] += (q_gamma[k+8] / (2.0*PI*R3_2)*(-1-zai[j])) | d_l;
            q_y[i][j] += (q_gamma[k+8] / (2.0*PI*R3_2)*(Y-eta[i])) | d_l;
         }
      }
}
where C is the heat capacity matrix, K is the conductivity matrix, and f is the heat source vector. The variable a is the temperature and $\dot{a}$ is the time derivative of temperature. Structural dynamics (a hyperbolic equation) is of the form
$$M \ddot{a} + K a + f = 0 \qquad \text{Eq. 3•186}$$
where M is the consistent mass matrix, K is the stiffness matrix, and f is the force vector. The variable a is the displacement, and the second time derivative of the displacement, $\ddot{a}$, gives the acceleration.
Parabolic Equation
Consider a time interval $t_n$ to $t_{n+1}$, with $\Delta t = t_{n+1} - t_n$. In this time interval, τ (= t - $t_n$) can be normalized by ∆t as a referential coordinate ξ = τ / ∆t, with 0 < ξ < 1. The variable a at times $t_n$ and $t_{n+1}$ is denoted as $a_n$ and $a_{n+1}$, respectively. A linear interpolation function, the trapezoidal rule for time integration, is used to approximate a in the time interval $t_n$ to $t_{n+1}$ as¹
$$a \cong \hat{a}(\tau) = (1 - \xi) a_n + \xi a_{n+1} = a_n + \xi (a_{n+1} - a_n) = a_n + \frac{\tau}{\Delta t} (a_{n+1} - a_n) \qquad \text{Eq. 3•187}$$
Now, the residual of Eq. 3•185 is distributed throughout the time domain ($t_n$, $t_{n+1}$) by a weighted-residual statement similar to Eq. 3•105 as
$$\int_0^{\Delta t} W R \, d\tau = \int_0^{\Delta t} W \left( C \dot{\hat{a}}(\tau) + K \hat{a}(\tau) + f \right) d\tau = 0 \qquad \text{Eq. 3•188}$$
A parameter θ is defined as
$$\theta = \int_0^{\Delta t} W \tau \, d\tau \Big/ \left( \Delta t \int_0^{\Delta t} W \, d\tau \right) \qquad \text{Eq. 3•189}$$
1. We follow Chapter 10 in Zienkiewicz and Taylor, 1991, “The finite element method”, 4th eds. vol. 2, McGraw-Hill Book
Company, UK.
Note that a similar interpolation of the force vector f is taken as that for the variable a. Re-arranging in order to solve for the unknown $a_{n+1}$, we have
where
$$\hat{f} = f_n + \theta (f_{n+1} - f_n) \qquad \text{Eq. 3•192}$$
Eq. 3•191 is a time recurrence formula with which the solutions of consecutive time steps can be calculated. From the results of Eq. 3•113 and Eq. 3•114, we have, in the context of the point-collocation method, the values of θ = 0, 1, and 0.5 corresponding to the forward difference, backward difference, and central difference, respectively.
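Consider an initial value problem¹
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0, \quad 0 < x < 1 \qquad \text{Eq. 3•193}$$
$$u(0, t) = 0, \quad \frac{\partial u}{\partial x}(1, t) = 0, \text{ and } u(x, 0) = 1 \qquad \text{Eq. 3•194}$$
$$u_{exact}(x, t) = 2 \sum_{n=0}^{\infty} \frac{e^{-\lambda_n^2 t} \sin \lambda_n x}{\lambda_n}, \quad \lambda_n = \frac{(2n + 1)\pi}{2} \qquad \text{Eq. 3•195}$$
Making (1) a weighted-residual statement, and (2) integration by parts on the spatial derivative term of Eq. 3•193, we have the weak form with Bubnov-Galerkin weighting v ≡ w = u as
$$0 = \int_0^1 v \left( \frac{\partial v}{\partial t} - \frac{\partial^2 v}{\partial x^2} \right) dx = \int_0^1 \left( v \frac{\partial v}{\partial t} + \frac{\partial v}{\partial x} \frac{\partial v}{\partial x} \right) dx - \left[ v \frac{\partial v}{\partial x} \right]_0^1 = \int_0^1 \left( v \frac{\partial v}{\partial t} + \frac{\partial v}{\partial x} \frac{\partial v}{\partial x} \right) dx \qquad \text{Eq. 3•196}$$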
1. p. 323-324 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
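where we take $\phi_0(x) = x$, and $\phi_1(x) = x^2$. Now the matrices in Eq. 3•185 can be identified as
$$C = \begin{bmatrix} \int_0^1 \phi_0 \phi_0 \, dx & \int_0^1 \phi_0 \phi_1 \, dx \\ \int_0^1 \phi_1 \phi_0 \, dx & \int_0^1 \phi_1 \phi_1 \, dx \end{bmatrix}, \text{ and } K = \begin{bmatrix} \int_0^1 \frac{\partial \phi_0}{\partial x} \frac{\partial \phi_0}{\partial x} \, dx & \int_0^1 \frac{\partial \phi_0}{\partial x} \frac{\partial \phi_1}{\partial x} \, dx \\ \int_0^1 \frac{\partial \phi_1}{\partial x} \frac{\partial \phi_0}{\partial x} \, dx & \int_0^1 \frac{\partial \phi_1}{\partial x} \frac{\partial \phi_1}{\partial x} \, dx \end{bmatrix} \qquad \text{Eq. 3•198}$$
$$\int_0^1 v \left[ v(x, 0) - 1 \right] dx = 0 = \int_0^1 \left[ \phi \otimes (\phi \cdot c(0)) - \phi \right] dx \qquad \text{Eq. 3•199}$$
or in matrix form,
$$\begin{bmatrix} \int_0^1 \phi_0 \phi_0 \, dx & \int_0^1 \phi_0 \phi_1 \, dx \\ \int_0^1 \phi_1 \phi_0 \, dx & \int_0^1 \phi_1 \phi_1 \, dx \end{bmatrix} \begin{Bmatrix} c_0(0) \\ c_1(0) \end{Bmatrix} = \begin{Bmatrix} \int_0^1 \phi_0 \, dx \\ \int_0^1 \phi_1 \, dx \end{Bmatrix} \qquad \text{Eq. 3•200}$$
Therefore, the results are $c_0(0) = 4$ and $c_1(0) = -10/3$. These serve as the initial condition, in place of u(x, 0) = 1 in Eq. 3•194, for the basis function approximation in Eq. 3•197. Although Eq. 3•200 is simple enough to be solved by hand, we can easily implement Eq. 3•200 with VectorSpace C++ Library.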
Program Listing 3•27 first implements the approximated initial condition with Eq. 3•200, and then, uses the
definitions of C and K in Eq. 3•198 to solve for the time recurrence formula provided by Eq. 3•191. In this
implementation, the time step ∆t = 0.05, and θ = 0.5 is used for the central difference method. Note that for the
initial condition approximation the highest order polynomial is the fifth-order. We use Bode’s integration rule for
the accuracy of the initial condition approximation. For the time recurrence formula the highest order polyno-
mial for integration is only up to second order, so Simpson’s rule will be sufficient.
The results of the computation are shown in Figure 3•33. A large error occurs at time zero, since the initial
condition is only approximated through Eq. 3•199.
Hyperbolic Equation
We consider a more general second order differential equation (hyperbolic-parabolic equation) such as
$$M \ddot{a} + C \dot{a} + K a + f = 0 \qquad \text{Eq. 3•201}$$
where M is the mass matrix, C is the viscous damping, K is the stiffness, and f is the forcing term. In structural dynamics, a is the displacement, $\dot{a}$ is the velocity, and $\ddot{a}$ is the acceleration. When C = 0, Eq. 3•201 reduces to the hyperbolic case. For Eq. 3•201, we use a time recurrence formula in which, given $a_n$, $\dot{a}_n$, and $\ddot{a}_n$ at time $t_n$, we seek solutions of $a_{n+1}$, $\dot{a}_{n+1}$, and $\ddot{a}_{n+1}$ at time $t_{n+1}$. Similar to Eq. 3•187, we use a linear interpolation function with a natural coordinate ξ = τ / ∆t, where 0 < ξ < 1, to approximate the acceleration as
Figure 3•33 The left-hand-side is the exact solution computed from Eq. 3•195, for t = 0.0 to 1.4. The right-hand-side shows the numerical solution at x = 0.5 and x = 1.0.
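$$\ddot{a} \cong \hat{\ddot{a}}(\tau) = (1 - \xi) \ddot{a}_n + \xi \ddot{a}_{n+1} = \ddot{a}_n + \xi (\ddot{a}_{n+1} - \ddot{a}_n) = \ddot{a}_n + \frac{\tau}{\Delta t} (\ddot{a}_{n+1} - \ddot{a}_n) \qquad \text{Eq. 3•202}$$
For the displacement a and the velocity $\dot{a}$, we can approximate them with a Taylor series approximation to the second order of ∆t for $a_{n+1}$ as
$$a_{n+1} \cong a_n + \Delta t \, \dot{a}_n + \frac{\Delta t^2}{2} \left[ (1 - 2\beta) \ddot{a}_n + 2\beta \ddot{a}_{n+1} \right] \qquad \text{Eq. 3•203}$$
and for $\dot{a}_{n+1}$ as
$$\dot{a}_{n+1} \cong \dot{a}_n + \Delta t \left[ (1 - \gamma) \ddot{a}_n + \gamma \ddot{a}_{n+1} \right] \qquad \text{Eq. 3•204}$$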
The two parameters β and γ are chosen independently. Notice that Eq. 3•204 is an expression only up to the first order of ∆t. Therefore, we restrict our choice to γ = 1/2 to keep the order of accuracy up to O(∆t²). Two choices of β are popular. For the so-called constant average acceleration, γ = 2β = 1/2, the weighting functions W(τ) for the acceleration terms in Eq. 3•203 and Eq. 3•204 are the same. In the context of point-collocation, the value 1/2 corresponds to the central difference method. This value is also equivalent to collocation with a constant weighting function throughout the time interval $t_n$ to $t_{n+1}$; i.e., W(τ) = 1, such that from the definition in Eq. 3•189, we get
$$\gamma = 2\beta = \int_0^{\Delta t} W \tau \, d\tau \Big/ \left( \Delta t \int_0^{\Delta t} W \, d\tau \right) = \left. \frac{\tau^2}{2} \Big/ (\Delta t \, \tau) \right|_0^{\Delta t} = \frac{1}{2} \qquad \text{Eq. 3•205}$$
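$$2\beta = \int_0^{\Delta t} W \tau \, d\tau \Big/ \left( \Delta t \int_0^{\Delta t} W \, d\tau \right) = \left. \left( \frac{\xi^2}{2} - \frac{\xi^3}{3} \right) \Big/ \frac{\xi^2}{2} \right|_0^1 = \frac{1}{3} \qquad \text{Eq. 3•206}$$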
Now, we work out the formula for the Newmark method for the hyperbolic-parablic equation. From Eq.
3•203, we have
1 ∆t 2
a··n + 1 = --------- a n + 1 – a n – ∆ta· n – -------- ( 1 – 2β )a··n Eq. 3•207
β∆t 2
γ ∆t 2
a· n + 1 = a· n + ∆t ( 1 – γ )a··n + --------- a n + 1 – a n – ∆ta· n – -------- ( 1 – 2β )a··n Eq. 3•208
β∆t 2
Since $a_n$, $\dot{a}_n$, and $\ddot{a}_n$ at time $t_n$ are all given values, the three unknowns $a_{n+1}$, $\dot{a}_{n+1}$, and $\ddot{a}_{n+1}$ at time $t_{n+1}$ can be substituted into Eq. 3•201 as
where $\ddot{a}_{n+1}$ and $\dot{a}_{n+1}$ are expressed in Eq. 3•207 and Eq. 3•208, and the only unknown is $a_{n+1}$ in Eq. 3•209. Re-arranging Eq. 3•209, we have¹
$$ \hat{K}\,a_{n+1} = \hat{R}_{n+1} $$ Eq. 3•210
where,
$$ \hat{K} = K + a_0 M + a_1 C, \quad \text{and} \quad \hat{R}_{n+1} = -f_{n+1} + (a_0 a_n + a_2\dot{a}_n + a_3\ddot{a}_n)\,M + (a_1 a_n + a_4\dot{a}_n + a_5\ddot{a}_n)\,C $$ Eq. 3•211
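$a_{n+1}$ is solved from Eq. 3•210; then $\ddot{a}_{n+1}$ and $\dot{a}_{n+1}$ are calculated from the re-arranged Eq. 3•207 and Eq. 3•208 as
$$ \ddot{a}_{n+1} = a_0\,(a_{n+1} - a_n) - a_2\,\dot{a}_n - a_3\,\ddot{a}_n, \qquad \dot{a}_{n+1} = \dot{a}_n + a_6\,\ddot{a}_n + a_7\,\ddot{a}_{n+1} $$ Eq. 3•212
The so-called Newmark coefficients $a_i$ in Eq. 3•211 and Eq. 3•212 are
$$ a_0 = \frac{1}{\beta\Delta t^2},\ a_1 = \frac{\gamma}{\beta\Delta t},\ a_2 = \frac{1}{\beta\Delta t},\ a_3 = \frac{1}{2\beta} - 1,\ a_4 = \frac{\gamma}{\beta} - 1,\ a_5 = \frac{\Delta t}{2}\left(\frac{\gamma}{\beta} - 2\right),\ a_6 = \Delta t\,(1-\gamma),\ a_7 = \gamma\,\Delta t $$ Eq. 3•213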
1. p. 323 in K Bathe, and E.L. Wilson, 1976, “Numerical method in finite element analysis”, Prentice-Hall, New Jersey.
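The update in Eq. 3•210 to Eq. 3•212 can be tried out on a scalar problem before moving to matrices. The following is a minimal, library-free sketch, assuming a single-degree-of-freedom system m a'' + c a' + k a = 0 (the scalar analogue of Eq. 3•201 with f = 0) and the constant average acceleration choice γ = 1/2, β = 1/4. The names `State`, `newmark_step`, and `integrate` are hypothetical and are not part of the VectorSpace C++ Library.

```cpp
#include <cmath>

// Displacement, velocity and acceleration of a single degree of freedom.
struct State { double a, da, dda; };

// One Newmark step for m*a'' + c*a' + k*a = 0 (scalar analogue of Eq. 3•201, f = 0).
State newmark_step(const State& s, double m, double c, double k,
                   double beta, double gamma, double dt) {
    // Newmark coefficients, the same expressions as a[0]..a[7] in Listing 3•28.
    const double a0 = 1.0 / (beta * dt * dt);
    const double a1 = gamma / (beta * dt);
    const double a2 = 1.0 / (beta * dt);
    const double a3 = 1.0 / (2.0 * beta) - 1.0;
    const double a4 = gamma / beta - 1.0;
    const double a5 = dt / 2.0 * (gamma / beta - 2.0);
    const double a6 = dt * (1.0 - gamma);
    const double a7 = gamma * dt;

    // Effective stiffness and effective load (Eq. 3•211 with f = 0).
    const double k_hat = k + a0 * m + a1 * c;
    const double r_hat = m * (a0 * s.a + a2 * s.da + a3 * s.dda)
                       + c * (a1 * s.a + a4 * s.da + a5 * s.dda);

    State n;
    n.a   = r_hat / k_hat;                               // Eq. 3•210
    n.dda = a0 * (n.a - s.a) - a2 * s.da - a3 * s.dda;   // Eq. 3•212, acceleration
    n.da  = s.da + a6 * s.dda + a7 * n.dda;              // Eq. 3•212, velocity
    return n;
}

// Integrate from a(0) = a_init, a'(0) = 0 for n_steps steps of size dt.
State integrate(double m, double c, double k, double a_init, double dt, int n_steps) {
    // The initial acceleration follows from the equation of motion at t = 0.
    State s = {a_init, 0.0, -k * a_init / m};
    for (int i = 0; i < n_steps; ++i)
        s = newmark_step(s, m, c, k, 0.25, 0.5, dt);
    return s;
}
```

For the undamped case m = k = 1, c = 0, a call such as `integrate(1.0, 0.0, 1.0, 1.0, 0.01, 100)` should track cos(t) closely at t = 1, and the quantity a² + (a')² stays at its initial value because this parameter choice does not add numerical damping to a linear undamped system.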
$$ \frac{\partial^2 w}{\partial t^2} = -\frac{\partial^4 w}{\partial x^4}, \quad 0 < x < 1,\ t > 0 $$ Eq. 3•214
$$ w(0,t) = w(1,t) = \frac{\partial w(0,t)}{\partial x} = \frac{\partial w(1,t)}{\partial x} = 0 $$ Eq. 3•215
$$ w(x,0) = \sin(\pi x) - \pi x(1-x), \quad \text{and} \quad \frac{\partial w(x,0)}{\partial t} = 0 $$ Eq. 3•216
φ0 = 1 − cos(2πx), φ1 = 1 − cos(4πx)
for w2 = c0(t) φ0(x) + c1(t) φ1(x), with C = 0 and f = 0 in Eq. 3•201, M and K can be identified as
$$ M = \int_0^1 (\phi \otimes \phi)\,dx, \quad \text{and} \quad K = \int_0^1 \left(\frac{d^2\phi}{dx^2} \otimes \frac{d^2\phi}{dx^2}\right) dx $$ Eq. 3•217
The coefficients for the initial condition in Eq. 3•216 are approximated by the weighted-residual method
With Bubnov-Galerkin weighting W(x, t) = w2(x, t) = c0(t) φ0(x) + c1(t) φ1(x), Eq. 3•218 becomes
#include "include/vs.h"
#define PI 3.141592654
int main()
{
   // extended Bode's integration rule
   double weight[33] = {14.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                        24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0,
                        28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                        24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0,
                        28.0/45.0, 64.0/45.0, 24.0/45.0, 64.0/45.0, 28.0/45.0, 64.0/45.0,
                        24.0/45.0, 64.0/45.0, 14.0/45.0};
   Quadrature qp(weight, 0.0, 1.0, 33);
   J d_l(1.0/32.0);
   H2 x((double*)0, qp),
      phi = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
               "int, int, Quadrature", 2, 1, qp);
   // basis functions: phi[0] = 1-cos(2 pi x), phi[1] = 1-cos(4 pi x)
   phi[0] = 1.0-cos(2.0*PI*x); phi[1] = 1.0-cos(4.0*PI*x);
   H0 d2_phi = INTEGRABLE_VECTOR("int, Quadrature", 2, qp);
   for(int i = 0; i < 2; i++) d2_phi[i] = dd(phi)(i)[0][0];
   C0 mass = (((H0)phi)%((H0)phi))|d_l,     // M = integral over [0,1] of (phi (x) phi) dx
      stiff = (d2_phi%d2_phi)|d_l;          // K = integral over [0,1] of (d2phi/dx2 (x) d2phi/dx2) dx
   double gamma_ = 0.5, beta_ = 0.25, dt_ = 0.01, a[8];
   C0 c_old(2, (double*)0), c_new(2, (double*)0), dc_old(2, (double*)0),
      dc_new(2, (double*)0), ddc_old(2, (double*)0), ddc_new(2, (double*)0);
   // Newmark coefficients
   a[0] = 1.0/(beta_*pow(dt_,2)); a[1] = gamma_/(beta_*dt_); a[2] = 1.0/(beta_*dt_);
   a[3] = 1.0/(2.0*beta_)-1.0; a[4] = gamma_/beta_-1.0; a[5] = dt_/2.0*(gamma_/beta_-2.0);
   a[6] = dt_*(1.0-gamma_); a[7] = gamma_*dt_;
   // initial condition approximation by Eq. 3•219
   H0 w_0 = sin(PI*((H0)x))-PI*((H0)x)*(1.0-((H0)x));
   c_old = ( ( ((H0)phi)*w_0 )|d_l ) / ( ( ((H0)phi)%((H0)phi) )|d_l );
   C0 LHS = stiff + a[0]*mass,              // K^ = K + a0 M
      d_LHS = !LHS;
   for(int i = 0; i < 28; i++) {
      // R^(n+1) = (a0 a(n) + a2 a'(n) + a3 a''(n)) M
      C0 RHS = mass * (a[0]*c_old + a[2] * dc_old + a[3] * ddc_old);
      c_new = d_LHS*RHS;
      double iptr;
      if(modf( ((double)(i+1))/2.0, &iptr)==0) cout << "t= " << ((i+1)*dt_) << ", u(0.5) = " <<
         (c_new[0]*(1.0-cos(PI))+c_new[1]*(1.0-cos(2.0*PI))) << endl;
      // a''(n+1) = a0 (a(n+1) - a(n)) - a2 a'(n) - a3 a''(n)
      ddc_new = a[0]*(c_new - c_old)-a[2]*dc_old-a[3]*ddc_old;
      // a'(n+1) = a'(n) + a6 a''(n) + a7 a''(n+1)
      dc_new = dc_old + a[6]*ddc_old + a[7]*ddc_new;
      // update ( . )n = ( . )n+1
      c_old = c_new; dc_old = dc_new; ddc_old = ddc_new;
   }
   return 0;
}
Listing 3•28 Hyperbolic equation using Newmark method with constant average acceleration (project:
“hyperbolic_equation”).
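The 33 weights typed out in Listing 3•28 follow a repeating pattern, so they can also be generated. The sketch below is a library-free illustration, assuming the composite (extended) Boole rule over panels of four intervals with the weights expressed in units of the spacing h, which is how the listing pairs them with `J d_l(1.0/32.0)`. The names `boole_weights`, `integrate`, and `mass_00` are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Composite (extended) Boole's rule weights in units of the spacing h.
// n_points is assumed to be 4*m + 1 for some m >= 1.
std::vector<double> boole_weights(int n_points) {
    std::vector<double> w(n_points);
    for (int i = 0; i < n_points; ++i) {
        if (i == 0 || i == n_points - 1) w[i] = 14.0 / 45.0;  // two end points
        else if (i % 2 == 1)             w[i] = 64.0 / 45.0;  // odd points
        else if (i % 4 == 2)             w[i] = 24.0 / 45.0;  // panel mid points
        else                             w[i] = 28.0 / 45.0;  // points shared by two panels
    }
    return w;
}

// Integrate f over [a, b] using n_points equally spaced samples.
template <class F>
double integrate(F f, double a, double b, int n_points) {
    const std::vector<double> w = boole_weights(n_points);
    const double h = (b - a) / (n_points - 1);
    double sum = 0.0;
    for (int i = 0; i < n_points; ++i) sum += w[i] * f(a + i * h);
    return sum * h;
}

// Mass-matrix entry M_00 = integral over [0,1] of (1 - cos(2 pi x))^2 dx,
// i.e. the (0,0) component of Eq. 3•217 for the basis used in Listing 3•28.
double mass_00() {
    const double pi = 3.14159265358979323846;
    return integrate([pi](double x) {
        const double p = 1.0 - std::cos(2.0 * pi * x);
        return p * p;
    }, 0.0, 1.0, 33);
}
```

The analytical value of this mass-matrix entry is 3/2, which gives a quick check on both the weights and the spacing factor.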
[Figure: w plotted against x on 0 ≤ x ≤ 1 at successive times between t = 0.02 and t = 0.28, in two panels.]
The various methods in the last chapter are mostly applicable to small-size problems. We have demonstrated that the VectorSpace C++ Library helps to ease the programming task significantly. However, if the problem size is down to one or two variables, the problem might be solved by hand as well. For a better approximation of the solution, we often need to increase the number of variables substantially. The finite difference method, the finite element method, and the boundary element method are three widely accepted methods for large-size problems. We have introduced the finite difference method in Chapter 1 and the boundary element method in Chapter 3. Yet another deficiency of the variational method in the last chapter is that it is very simplistic in terms of the geometry of the problem domains. The geometry of the problem domains is, in most cases, very simple: a line segment, a square (or rectangle), or a circle. In real-world applications, the geometry of the problem domains is always much more complicated. We devote the following two chapters to the finite element method in considerable depth. The finite element method is the most well-established of the three methods for large-scale problems. It is also the most capable of handling arbitrarily complicated geometry.
Moreover, we would also like to demonstrate to the users of the VectorSpace C++ Library that a numerical method is often not just about mathematical expressions, which are already made easy by using the VectorSpace C++ Library. The programming complexities caused by complicated geometry (and its large number of variables) in the finite element method serve as an excellent test bed to show that object-oriented programming can make a significant difference. The source code of “fe.lib” is used to demonstrate the power of object-oriented programming in numerical applications.
Object-oriented programming is the present-day programming paradigm, supported by the industrial flagship general-purpose language, C++. Alternative approaches for programming highly mathematical problems are symbolic languages, special-purpose packages, or program generators written specifically for dedicated application domains. These alternative approaches may have specialized capability in solving mathe-
where “nen” is the number of nodes in an element. The space of $u_e$ is infinite dimensional, in which every point x on the element has a variable $u_e(x)$ associated with it. In Figure 4•1, this infinite dimensional variable $u_e(x)$ is approximated by a finite dimensional space of approximated function $u_e^h(x)$ which in turn only depends on finite
[Figure labels: element $\Omega_e^h$, global domain $\Omega$, boundary $\Gamma$.]
$$ N_a(\xi, \eta) = \frac{1}{4}\,(1 + \xi_a\xi)\,(1 + \eta_a\eta) $$ Eq. 4•2
where index “a” ( = 0, 1, 2, 3) is the element node number. The coordinates $(\xi_a, \eta_a)$ = {{-1, -1}, {1, -1}, {1, 1}, {-1, 1}} are the natural (or referential) coordinates of the four nodes. Therefore, the explicit forms for the interpolation functions are
$$ N_0 = \frac{1}{4}(1-\xi)(1-\eta),\quad N_1 = \frac{1}{4}(1+\xi)(1-\eta),\quad N_2 = \frac{1}{4}(1+\xi)(1+\eta),\quad N_3 = \frac{1}{4}(1-\xi)(1+\eta) $$ Eq. 4•3
The interpolation function formula for the linear triangular element can be degenerated from Eq. 4•3 by setting $N_0^{Tri} = N_0$, $N_1^{Tri} = N_1$, and
$$ N_2^{Tri} = N_2 + N_3 = \frac{1}{4}(1+\xi)(1+\eta) + \frac{1}{4}(1-\xi)(1+\eta) = \frac{1}{2}(1+\eta) $$ Eq. 4•4
(or using “triangular area coordinates” as in page 454 of Chapter 5). That is
$$ N_0^{Tri} = \frac{1}{4}(1-\xi)(1-\eta),\quad N_1^{Tri} = \frac{1}{4}(1+\xi)(1-\eta),\quad N_2^{Tri} = \frac{1}{2}(1+\eta) $$ Eq. 4•5
Coordinate transformations using Eq. 4•3 for quadrilateral and Eq. 4•5 for triangular elements are shown in the middle column of Figure 4•2. From those integration examples, we note that a reference element¹, $\Omega_e$, can be defined in a normalized region with a coordinate transformation rule x ≡ x($\Omega_e$) which maps the reference element, $\Omega_e$, to a physical element, $\Omega_e^h$; i.e., a normalized domain in natural coordinates ξ is transformed to a physical domain in coordinate x. The interpolation functions for the coordinate transformation rule can be chosen to be the same as the interpolation for the approximated function $u_e^h(x)$ as in Eq. 4•1. That is
where $\bar{x}_e^a$ are the nodal coordinates (“over-bar” indicates fixed nodal values). A finite element with the same set of interpolation functions for (1) approximation functions and (2) coordinate transformation rule is called an isoparametric element. The interpolation functions in the finite element method are further subject to continuity and completeness requirements. The continuity requirement demands that the approximated function be continuous both in the interior and on the boundaries of the element. The completeness requirement demands that arbitrary variation, up to a certain order, in the approximated function can be accurately represented. When these requirements are relaxed, we have the so-called non-conforming elements.
1. P. G. Ciarlet, 1978, “The finite element method for elliptic problems”, North-Holland, Amsterdam.
Figure 4•2 (1) 1-D linear and quadratic line elements, and (2) 2-D curve, linear quadrilateral and triangular elements, and quadratic quadrilateral and triangular elements.
$$ W = \begin{cases} 0, & \text{for } \hat{u}_e^a = g \equiv \phi_{\Gamma_e}^a\,\bar{u}_e^a \text{ on } \Gamma_g \\ \phi_e^a, & \text{otherwise} \end{cases} $$ Eq. 4•7
g is the essential boundary condition on the boundary $\Gamma_g$, and $\bar{u}_e^a$ (“over-bar” indicates fixed nodal values) is the nodal value of the essential boundary condition with a boundary interpolation function $\phi_{\Gamma_e}^a$ on the boundary associated with the element. Since $\phi_e^a$ is defined in the element domain only, this particular choice of weighting function resembles the subdomain collocation method (see page 229) in the weighted residual method, where W = 1 on each subdomain and W = 0 elsewhere.
or in matrix forms
where,
$$ k_e^{ab} = a(\phi_e^a, \phi_e^b) $$
The difference between Eq. 4•8 and Eq. 3•125 in Chapter 3 is that we now have the second and third terms on the right-hand side. The second term is the non-homogeneous natural boundary condition
$$ q \cdot n = h \equiv \phi_{\Gamma_e}^a\,h_e^a \text{ on } \Gamma_h $$ Eq. 4•11
where q • n is the flux q projected on the outward unit surface normal n. This term occurs when we take integration by parts on the weighted-residual statement and then apply Green’s theorem to change the resultant right-hand-side domain integral into a boundary integral. The third term is due to the non-homogeneous essential boundary conditions, which, according to the first line of Eq. 4•7, are rewritten with a new index “b” as $g \equiv \phi_{\Gamma_e}^b\,\bar{u}_e^b$. In Eq. 4•10 the index “a” is the element equation number, and the index “b” is the element variable (degree of freedom) number. Since W has been taken according to Eq. 4•7, the rows (or equations) corresponding to the fixed degrees of freedom (essential boundary conditions) have a vanishing weighting function (W = 0) multiplying every term of Eq. 4•8. Therefore, the rows (or equations) corresponding to the fixed degrees of freedom will be eliminated at the global level. We also note that the element tensor $k_e^{ab}$ is the element stiffness matrix, and the element tensor $f_e^a$ is the element force vector.
In summary, for a differential equation problem, we first discretize its domain into elements (as in Figure 4•1) and approximate its variables (Eq. 4•1) and weighting functions (Eq. 4•7) corresponding to a variational principle. These steps are known as the finite element approximation¹. A finite element approximation depends on the choice of (1) the variational principle, and (2) a corresponding set of variables approximated by a selected set of interpolation functions. The variety of variational principles makes the finite element method an open area for improvements. It also brings the challenge that a finite element program should be able to endure a dramatic impact of changes in its design structure and enable the reuse of existing code in its evolutionary course. Object-oriented programming has a lot to offer in this regard.
1. p. 3 in F. Brezzi and M. Fortin, 1991, “ Mixed and hybrid finite element methods”, Springer-Verlag, New York, Inc.
$$ K_{ij}\,\hat{u}_j = F_i $$
Node(int node_number,
int number_of_spatial_dimension,
double* array_of_coordinates);
Using the terminology of relational databases, the “node_number” is the key to this abstract data type, “Node”. One can consider the “node_number” as the identifier for an instance of the class Node. The following example defines a 2-D case with the node number “5” and coordinates x = {1.0, 2.0}^T
double *v;
v = new double[2];
v[0] = 1.0; v[1] = 2.0;
Node *nd = new Node(5, 2, v);
This instantiates an object of type “Node” pointed to by a pointer “nd”. Data abstraction is applied to model the “Node” as an object. The states of the “Node” consist of private data members, including the node number, the spatial dimension, and the values of its coordinates. The behaviors of the “Node” are public member functions that let the user query the states of the “Node”, including its node number, spatial dimension, etc. The “operator[](int)” is used to extract the components of the coordinates, and the logical operators “operator==(Node&)” and “operator!=(Node&)” are used to compare the values of two nodes. The data and the functions that operate on them are now organized together into a coherent unit, the class. The private members of the object are encapsulated from users so that access is only possible through its public members. The encapsulation mechanism provides a method to hide complexities from users (see Figure 4•3).
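To make the encapsulation idea concrete, here is a minimal stand-alone sketch of a node-like class. It is not the fe.lib implementation of “Node”: the class name `SketchNode`, the use of `std::vector` for storage, and the choice to compare coordinates only in `operator==` are assumptions made for illustration.

```cpp
#include <vector>

// A node-like abstract data type: private state, public query interface.
class SketchNode {
    int node_no_;                  // key, comparable to a record identifier
    std::vector<double> coords_;   // one coordinate per spatial dimension
public:
    SketchNode(int node_number, int spatial_dim, const double* coords)
        : node_no_(node_number), coords_(coords, coords + spatial_dim) {}

    // Queries on the state.
    int node_no() const { return node_no_; }
    int spatial_dim() const { return static_cast<int>(coords_.size()); }

    // Component access to the coordinates.
    double& operator[](int i) { return coords_[i]; }
    double operator[](int i) const { return coords_[i]; }

    // Two nodes compare equal when their coordinate values are equal.
    bool operator==(const SketchNode& other) const { return coords_ == other.coords_; }
    bool operator!=(const SketchNode& other) const { return !(*this == other); }
};
```

Because the coordinates are copied into the object, the caller keeps ownership of the array passed to the constructor, and nothing outside the class can change the node number after construction.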
An element, $\Omega_e^h$, is constructed by
Omega_eh(int element_number,
         int element_type_number,
         int material_type_number,
         int element_node_number,
         int *node_number_array);
1. see p. 1 in J. Soukup, 1994, “Taming C++”, Addison-Wesley, Reading, Massachusetts, and preface in J. Lakos, 1996,
“Large-scale C++ software design”, Addison-Wesley, Reading, Massachusetts.
Figure 4•3 The class Node consists of private data members (int node_no, int spatial_dim, double* value) that describe its states, and public member functions (such as node_no() and operator==) that provide controlled access to query its states. The private members are encapsulated and can be reached only through the public members.
The “element_number” plays the role of the key for the element class “Omega_eh”. The “element_type_number” and the “material_type_number” are integers greater than or equal to “0”. The default value for both numbers is “0”. The “element_node_number” is, for example, “3” for a triangular element and “4” for a four-node element. The “node_number_array” points to an int array of global node numbers for the element. An example is
int *ena;                            //  10            11
ena = new int[4];                    //   +-------------+
ena[0] = 0; ena[1] = 1;              //   |             |
ena[2] = 11; ena[3] = 10;            //   |             |
Omega_eh *elem =                     //   +-------------+
   new Omega_eh(0, 0, 0, 4, ena);    //   0             1
The order of global node numbers in the “node_number_array” is counter-clockwise from the lower-left corner,
as illustrated in the comment area after each statement, which is conventional in finite element method.
A discretized global domain, $\Omega^h$, basically consists of a collection of all nodes and elements as
Figure 4•4 Nine elements in a rectangular area, with 16 nodes.
The data structure Dynamic_Array<T>, declared and defined in “dynamic_array.h”, is what its name suggests. It is a simplified version of <dynarray> in the standard C++ library¹. The two protected data members are “the_node_array” and “the_omega_eh_array” (the element array). The default constructor “Omega_h::Omega_h()” is declared in the header file; the users of “fe.lib” are responsible for its definition. The following code segment shows an example of a user-defined discretized global domain as illustrated in Figure 4•4.
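As a library-free sketch of the same idea, the 16 nodes and 9 elements of Figure 4•4 can be generated with two nested loops. The structs `MeshNode`, `MeshElement`, `Mesh` and the function `make_grid` are hypothetical stand-ins rather than fe.lib classes, and the physical width and height are parameters because the figure gives no dimensions.

```cpp
#include <array>
#include <vector>

struct MeshNode    { int node_no; double x, y; };
struct MeshElement { int element_no; std::array<int, 4> nodes; };
struct Mesh {
    std::vector<MeshNode> nodes;
    std::vector<MeshElement> elements;
};

// Build nx-by-ny four-node elements on [0, width] x [0, height].
// Nodes and elements are numbered row by row from the lower-left corner.
Mesh make_grid(int nx, int ny, double width, double height) {
    Mesh m;
    const int row_nodes = nx + 1;
    for (int j = 0; j <= ny; ++j)
        for (int i = 0; i <= nx; ++i)
            m.nodes.push_back({j * row_nodes + i, width * i / nx, height * j / ny});
    for (int j = 0; j < ny; ++j)
        for (int i = 0; i < nx; ++i) {
            const int n0 = j * row_nodes + i;   // lower-left node of this element
            MeshElement e;
            e.element_no = j * nx + i;
            // Counter-clockwise from the lower-left corner.
            e.nodes = {{n0, n0 + 1, n0 + row_nodes + 1, n0 + row_nodes}};
            m.elements.push_back(e);
        }
    return m;
}
```

With `make_grid(3, 3, 3.0, 3.0)`, the centre element (number 4) connects nodes 5, 6, 10, and 9, matching the numbering pattern of the figure.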
1. P.J. Plauger, 1995, “The draft standard C++ library”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
Then, we can make an instance of the discretized global domain “Omega_h” by declaring in main() function
Omega_h oh;
The instance “oh” calls the default constructor “Omega_h::Omega_h()” that is custom made by the user.
Remark: For users who are familiar with database languages¹, the class definitions of Node, Omega_eh, and Omega_h per se define the database schema, i.e., the format of the data, which serves the function of the data definition language (DDL). The function “Dynamic_Array<T>::add(T*)” is an example of the data manipulation language (DML) that helps the user modify the database. The two most important features of the data query language provided by “fe.lib” are the node selector “Node& Omega_h::operator [ ](int)” and the element selector “Omega_eh& Omega_h::operator ( )(int)”.
$\hat{u}^h$ on $\Omega^h$.
The essential boundary conditions (fixed degrees of freedom) and natural boundary conditions are
$\bar{g}^h$ on $\Gamma_g^h$, and $\bar{h}^h$ on $\Gamma_h^h$
respectively, where the “over-bar” indicates a fixed value. The global variables $\hat{u}^h$ are modeled as class “U_h”. The global boundary conditions $\bar{g}^h$ and $\bar{h}^h$ are modeled as class “gh_on_Gamma_h”. A constraint flag is used to switch between “Dirichlet” and “Neumann” to indicate whether the stored values are essential or natural boundary conditions, respectively.
All three kinds of values, $\hat{u}^h$, $\bar{g}^h$, and $\bar{h}^h$, are nodal quantities, which are somewhat similar to the coordinates of a node, i.e., $\bar{x}$. Therefore, we can factor out the code segment on coordinates in the class Node and create a more abstract class Nodal_Value for all of them.
class Nodal_Value {
   protected:
      int the_node_no,
          nd; // number of dimension
1. e.g., Al Stevens, 1994, “ C++ database development”, 2nd eds., Henry Holt and Company, Inc., New York, New York.
Now the three classes are publicly derived from the base class “Nodal_Value” as
All three derived classes inherit the public interfaces (member functions) of the class Nodal_Value. For example,
now all three derived classes can use the operator[](int) to access the nodal values. If “nd” is an instance of the
class Node and “uh” is an instance of the class U_h, and “gh” is an instance of the class gh_on_Gamma_h, then,
the access is performed by
int main() {
   ...
   int ndf = 1;
   U_h uh(ndf, oh);
   gh_on_Gamma_h gh(ndf, oh);
   ...
}
Figure 4•5 Heat conduction problem with two sides insulated (h = 0); the bottom and top temperature boundary conditions are set to g = 0 °C and g = 30 °C, respectively.
The constructor of class U_h is defined in “fe.lib”. The users do not need to worry about it. However, the essential and natural boundary conditions in the class gh_on_Gamma_h are part of every differential equation problem. Therefore, defining the constructor of class gh_on_Gamma_h is the users’ responsibility. This constructor needs to be defined before it is instantiated as above. For the problem at hand, we have
The first line in the constructor (line 2) calls a private member function of class gh_on_Gamma_h. This function initiates a private data member “Dynamic_Array<Nodal_Constraint> the_gh_array” for the class gh_on_Gamma_h. This is a mandatory first statement for every constructor of this class, for orchestrating the internal data structure. The first line in the for loop uses a constraint type selector “operator ( )(int degree_of_freedom)”. For each degree of freedom, it can be assigned either “gh_on_Gamma_h::Dirichlet” to indicate an essential boundary condition or “gh_on_Gamma_h::Neumann” to indicate a natural boundary condition. Line 7 uses a constraint value selector “operator [ ](int degree_of_freedom)” to assign 30 °C to the nodes on the upper boundary. The default condition and default value, following the finite element method convention, are the natural boundary condition and “0”, respectively. Therefore, for the present problem, the natural boundary condition with “0” on the two sides can be neglected. For the bottom boundary, we only need to specify the constraint type as an essential boundary condition; the assignment of the value “0.0” (the default value) can be skipped too.
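The default-and-override logic described above can be sketched without the library. The code below assumes the 4 × 4 node grid of Figure 4•5 with one degree of freedom per node; `ConstraintType`, `NodalConstraint`, and `heat_boundary_conditions` are hypothetical names and do not reproduce the gh_on_Gamma_h interface.

```cpp
#include <vector>

// Natural (Neumann) conditions are the default, as in the finite element convention.
enum ConstraintType { Neumann, Dirichlet };

struct NodalConstraint {
    ConstraintType type = Neumann;
    double value = 0.0;
};

// Boundary conditions for a grid of 4 x 4 nodes numbered row by row from the bottom.
std::vector<NodalConstraint> heat_boundary_conditions() {
    const int row_nodes = 4, total_nodes = 16;
    std::vector<NodalConstraint> gh(total_nodes);      // every entry: Neumann, value 0
    for (int i = 0; i < row_nodes; ++i) {
        gh[i].type = Dirichlet;                        // bottom edge, value stays at the default 0
        const int top = total_nodes - row_nodes + i;   // nodes 12..15
        gh[top].type  = Dirichlet;                     // top edge
        gh[top].value = 30.0;                          // 30 degrees Celsius
    }
    return gh;
}
```

Only eight of the sixteen entries are touched; the insulated side nodes and the interior nodes keep the default natural condition with value zero.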
dx = - f / df
For a one-dimensional problem, f, df, and dx are all Scalar objects of C0 type. For an n-dimensional problem, n > 1, f and dx are Vector objects of C0 type with length “n” and df is a Matrix object of C0 type with size “n × n”. The “C0::operator / (const C0&)” then no longer implies a “divide” operation. It actually invokes a matrix solver that uses df as the left-hand-side matrix and “-f” as the right-hand-side vector. The default behavior of the VectorSpace C++ Library is LU decomposition, although you have the freedom to change the default setting to Cholesky decomposition (for the symmetric case only), QR decomposition (for the ill-conditioned case), or even singular value decomposition (for the rank-deficient case). This single expression is sufficient for very different arguments, with the appropriate operation dispatched according to the argument types.
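To see what “dx = - f / df” amounts to in the n-dimensional case, the following library-free sketch spells out the linear solve explicitly. It assumes Gaussian elimination with partial pivoting in place of the library's LU solver and uses a made-up test system f(x, y) = (x² + y² − 4, x − y); `Vec`, `Mat`, `solve`, and `newton_2d` are hypothetical names.

```cpp
#include <cmath>
#include <utility>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;

// Solve A x = b by Gaussian elimination with partial pivoting (A and b are copies).
Vec solve(Mat A, Vec b) {
    const int n = static_cast<int>(b.size());
    for (int k = 0; k < n; ++k) {
        int p = k;                                   // row with the largest pivot
        for (int i = k + 1; i < n; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (int i = k + 1; i < n; ++i) {
            const double m = A[i][k] / A[k][k];
            for (int j = k; j < n; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    Vec x(n);
    for (int i = n - 1; i >= 0; --i) {               // back substitution
        double s = b[i];
        for (int j = i + 1; j < n; ++j) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
    return x;
}

// Newton iteration for f(x, y) = (x^2 + y^2 - 4, x - y) = 0.
Vec newton_2d(Vec u, int max_iter = 20, double tol = 1e-12) {
    for (int it = 0; it < max_iter; ++it) {
        Vec minus_f = {-(u[0] * u[0] + u[1] * u[1] - 4.0), -(u[0] - u[1])};
        Mat df = {{2.0 * u[0], 2.0 * u[1]}, {1.0, -1.0}};   // Jacobian of f
        Vec du = solve(df, minus_f);                        // du = -f / df
        u[0] += du[0];
        u[1] += du[1];
        if (std::sqrt(du[0] * du[0] + du[1] * du[1]) < tol) break;
    }
    return u;
}
```

Starting from (1, 0.5), the iteration converges to the root (√2, √2). The scalar case is the same loop with n = 1, where the solve reduces to an ordinary division.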
In Chapter 3, we introduced non-linear and transient problems in the context of variational methods, which are now the kernel of the element formulation. We consider the impact of change that these two types of problems bring to the element formulation. We note that an even greater impact arises in the mixed formulation, introduced in Chapter 3 on page 217, if we use the global matrix substructuring solution method (or “static condensation”). We defer the more complicated matrix substructuring until Section 4.2.5.
First, from the “fe.lib” user’s perspective, the design of the “element formulation definition language”, if you will, is for (1) the definition of an element formulation and (2) the registration of an element type. The user code segment for the declaration and instantiation of a class HeatQ4 is
In this code, line 5 is the declaration of the constructor of the heat conduction element formulation, “HeatQ4(int, Global_Discretization&)”. The definition of this constructor is customized by the user; its contents are the variational formulation of the differential equation problem at hand. We will get to the details of the definition of the constructor (line 8) at the end of this section.
Polymorphism: First, let’s look at the fe.lib implementation of polymorphism in this code segment, which is enhanced by emulating a symbolic language in C++¹. The class Element_Formulation and the user-defined class HeatQ4 are used hand-in-hand. The Element_Formulation is like a symbol class for its actual content class, HeatQ4. The symbol class Element_Formulation is responsible for all the chores, including memory management and the default behaviors of the element formulation. The content class HeatQ4 does what the application domain actually requires, i.e., the variational formulation. The class Element_Formulation has a private data member “rep_ptr” (representing pointer), which is a pointer to an Element_Formulation type, as
class Element_Formulation {
      ...
      Element_Formulation *rep_ptr;
      C0 stiff, force, ...;
   protected:
      virtual C0& __lhs() { return stiff; }
      virtual C0& __rhs() { return force; }
      ...
   public:
      ...
      C0& lhs() { return rep_ptr->__lhs(); }
      C0& rhs() { return rep_ptr->__rhs(); }
      ...
};
Since the derived class HeatQ4 is publicly derived from the base class Element_Formulation, an instance of
HeatQ4 has its own copy of Element_Formulation as its “header”. Therefore, the rep_ptr can point to an instance
1. see (1) p. 58 “handle / body idiom”, (2) p. 70 “envelope / letter” idiom, and (3) p. 315 “symbolic canonical form” in J.O.
Coplien, 1992, “ Advanced C++: Programming styles and idioms”, Addison-Wesley, Reading, Massachusetts.
[Figure: the symbol object of class Element_Formulation holds “rep_ptr”, which points to the content object, a HeatQ4 instance with its own Element_Formulation part.]
$$ R_{i+1} \equiv R(u_{i+1}) = R(u_i + \delta u_i) \cong R(u_i) + \left.\frac{\partial R}{\partial u}\right|_{u_i} \delta u_i = 0 $$ Eq. 4•13
From this approximated equation, we have the incremental solution $\delta u_i$ as the solution of the simultaneous linear algebraic equations
$$ \delta u_i = -\left[\left.\frac{\partial R}{\partial u}\right|_{u_i}\right]^{-1} R(u_i) \equiv K_T^{-1}(u_i)\,R(u_i) $$ Eq. 4•14
where both the tangent stiffness matrix $K_T^{-1}(u_i)$ and the residual vector $R(u_i)$ are functions of $u_i$. That is, at the element level, the nodal values $\hat{u}_i$ must be available. Therefore, a new class derived from class Element_Formulation is
The class “Nonlinear” inherits all the public interfaces of the class Element_Formulation. On top of that, we have declared a private data member “ul”, the element nodal variables, for this nonlinear element. When the class “Nonlinear” is defined, it is imperative to invoke its private member function “Nonlinear::__initialization(int, Global_Discretization&)” to set up the element nodal variables. In this case, the use of inheritance for programming by specification is very straightforward. An example of a simple nonlinear problem is shown in Section 4.2.3. In Chapter 5, we investigate state-of-the-art material nonlinear (elastoplasticity) and geometrical nonlinear (finite deformation) problems.
For a transient problem, the polymorphic technique is much more complicated. We show the parabolic case here. From Eq. 3•191 in Chapter 3 (page 253) we have
In this case, the nodal values from the last time step, $\hat{u}_n$, are also needed. In addition, we also need to compute the mass (heat capacitance) matrix “M”.
Note that in the definition of class Element_Formulation, the default behaviors of the last two protected member functions are provided through two virtual member functions that return the element “stiff” matrix and the element “force” vector as
This is standard for static, linear finite element problems. When an instance of Element_Formulation calls its public member functions “Element_Formulation::lhs()” and “Element_Formulation::rhs()”, the requests are forwarded to its delegate’s virtual member functions. If these two protected virtual member functions have been overridden (lines 15-23), the default behaviors in the base class will be taken over by the derived class. An example of a transient program is shown in Section 4.2.4.
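The forwarding scheme described above, in which a symbol object delegates to a content object whose virtual functions may be overridden, can be sketched in a few lines. This is not the fe.lib code: scalars stand in for the C0 matrices, `std::shared_ptr` replaces the raw “rep_ptr”, the class names `Formulation`, `StaticHeat`, and `TransientHeat` are hypothetical, and the transient override assumes a backward Euler step purely for illustration rather than the scheme of Eq. 3•191.

```cpp
#include <memory>

class Formulation {
    std::shared_ptr<Formulation> rep_;   // content object; empty inside content objects
protected:
    double stiff_, force_;
    // Constructor used by content classes.
    Formulation(double stiff, double force) : stiff_(stiff), force_(force) {}
    // Default behavior: static, linear problem.
    virtual double lhs_impl() const { return stiff_; }
    virtual double rhs_impl() const { return force_; }
public:
    // Symbol (handle) constructor: every request is forwarded to the content object.
    explicit Formulation(std::shared_ptr<Formulation> content)
        : rep_(content), stiff_(0.0), force_(0.0) {}
    virtual ~Formulation() {}
    double lhs() const { return rep_->lhs_impl(); }
    double rhs() const { return rep_->rhs_impl(); }
};

// Content class that keeps the default behavior.
class StaticHeat : public Formulation {
public:
    StaticHeat(double k, double f) : Formulation(k, f) {}
};

// Content class that overrides both virtual functions.
class TransientHeat : public Formulation {
    double mass_, dt_, u_old_;
public:
    TransientHeat(double k, double f, double mass, double dt, double u_old)
        : Formulation(k, f), mass_(mass), dt_(dt), u_old_(u_old) {}
protected:
    // Assumed backward Euler: (M/dt + K) u_new = f + (M/dt) u_old.
    double lhs_impl() const override { return mass_ / dt_ + stiff_; }
    double rhs_impl() const override { return force_ + mass_ / dt_ * u_old_; }
};
```

Client code only ever sees `Formulation`: wrapping a `StaticHeat` returns the plain stiffness and force, while wrapping a `TransientHeat` returns the effective quantities, with no change at the call site.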
Element Type Register: A differential equation problem solved by a finite element method may apply different elements to different regions. For example, we can choose triangular elements to cover some of the areas and quadrilateral elements to cover the rest. We can have a “truss” element on certain parts of “planar” elements to simulate a strengthened structure. From the user’s perspective, he needs to register multiple elements as
The element type register uses a list data structure. We number the last registered element’s element type number as “0”. This number increases backwards to the first registered element in the “type_list”. When we define an element as introduced on page 271, the second argument is supplied with this number, such as
Omega_eh *elem;
elem = new Omega_eh(0, element_type_number, 0, 4, ena);
The C++ idiom to implement the element type register is discussed in Section 4.1.3.
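The numbering rule of the register, where the most recently registered type gets number 0, falls out naturally from a list that grows at the front. The sketch below stores only type names to show the numbering; `TypeRegister` and its members are hypothetical and do not reproduce the fe.lib idiom of Section 4.1.3.

```cpp
#include <iterator>
#include <list>
#include <string>

class TypeRegister {
    std::list<std::string> type_list_;   // newest entry is kept at the front
public:
    // Register a new element type; it becomes element type number 0.
    void add(const std::string& name) { type_list_.push_front(name); }

    // Look up a type by its element type number (0 = last registered).
    const std::string& lookup(int element_type_number) const {
        std::list<std::string>::const_iterator it = type_list_.begin();
        std::advance(it, element_type_number);
        return *it;
    }

    int size() const { return static_cast<int>(type_list_.size()); }
};
```

A consequence of this convention is that registering an additional type shifts the numbers of all earlier types up by one, so element definitions have to be written against the final registration order.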
Element Formulation Definition: Now we finally get to the core of the Element_Formulation, that is, the definition of its constructor. We show an example of a heat conduction four-node quadrilateral element
The “xl” is the element nodal coordinates, which is a C0 type Matrix object of size nen × nsd (number of element nodes × number of spatial dimensions). The “stiff” is the element stiffness matrix, a square matrix of size (nen × ndf) × (nen × ndf) (“ndf” is the number of degrees of freedom). The “force” is the element force vector of size (nen × ndf). The VectorSpace C++ Library is most heavily used in this code segment, since it concerns the subject of variational methods the most. If you have mastered Chapter 3 already, these lines should be completely transparent to you.
The treatment of the terms for the natural boundary conditions $(\phi_e^i, h)_\Gamma$ and the essential boundary conditions $-a(\phi_e^i, \phi_e^j)\,\bar{u}_e^j$ in Eq. 4•8 on page 269 requires some explanation. “fe.lib” adopts the conventional treatment that the natural boundary conditions are taken care of at the global level in Matrix_Representation::assembly(), where the user-input equivalent nodal forces of the natural boundary conditions are directly added to the global force vector. The treatment of the third term is also conventional: when Element_Formulation::__rhs() is called, it automatically calls Element_Formulation::__reaction(), which is defined as
C0 & Element_Formulation::__reaction() {
the_reaction &= -stiff *gl; // “gl” is the element fixed boundary conditions
return the_reaction;
}
C0 & Element_Formulation::__rhs() {
the_rhs &= __reaction();
if(force.rep_ptr()) the_rhs += force;
return the_rhs;
}
These two default behaviors can be overridden as in the class “Transient” above. Another example is that we might want to have a different interpolation function to approximate the boundary conditions. In such a case, first we need to call “Matrix_Representation::assembly()” in the main() program as
1 int main() {
... // instantiation of Global_Discretization object
2 Matrix_Representation mr(gd);
3 mr.assembly();
4 C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
5 gd.u_h() = u;
6 gd.u_h() = gd.gh_on_gamma_h();
7 cout << gd.u_h();
8 return 0;
9 }
We show an example illustrated in Figure 4•7. Step 4a. The global stiffness matrix is a square matrix with tnn × ndf = 7 rows and columns (7 × 7), and the global force vector is of size tnn × ndf = 7 (where “tnn” is the total number of nodes, and “ndf” is the number of degrees of freedom, assumed to be “1” for simplicity). The fixed degrees of freedom are then removed from the global stiffness matrix (with remaining size = 5 × 5) and the global force vector (with remaining size = 5). This is done at line 2, when an instance of Matrix_Representation “mr” is initialized with an instance of Global_Discretization “gd”. Step 4b. The mapping of the element stiffness matrix to the global stiffness matrix, and of the element force vector to the global force vector, can be constructed element by element. This global-element relation is also established in line 2. Step 4c. The maps in Step 4b are used to add the element stiffness matrices and element force vectors to the global stiffness matrix and global force vector as in line 3, where the public member function “Matrix_Representation::assembly()” is called. Then, the global stiffness matrix and global force vector are used for the linear algebraic solution of the finite element problem as in line 4. Step 4d. The solution is in the order of the free degree of freedom numbers, which is then mapped back to the global degree of freedom numbers for output of the solution. This is done in line 5, where the global solution vector gd.u_h() is updated with the solution “u”. The values for the fixed degrees of freedom can be retrieved from the program input of the problem. This is line 6, where the same global solution vector gd.u_h() is updated with the fixed degrees of freedom “gd.gh_on_gamma_h()”.
In between the Step 4c and Step 4d, the variational problem has been reduced to a matrix solution problem. A
regular matrix solver provided in C0 type Matrix in Chapter 1 can be applied to solve this problem, although
[Figure panels: Step 4b: map element stiffness matrix and element force vector to global matrix and global vector, respectively (Elements 0 to 4). Step 4c: assembly of all elements. Step 4d: map equation number to global degree of freedom number.]
Figure 4•7 Element connectivity example. Step 1. elimination of fixed degrees of freedom, Step 2. element to global mapping, Step 3. assembly of all elements, and Step 4. equation number to global degree of freedom number.
The global stiffness matrix and global force vector can be replaced by a corresponding special matrix and vector,
provided you have coded all the interfaces needed for retrieving the components in the special forms of the global
matrix and vector.
Just as in the Element_Formulation, object-oriented programming provides mechanisms to deal with the impact
of change for a swift evolution of “fe.lib”. Examples of these changes are mixed and hybrid methods and contact
mechanics. In abstract mathematical form, they all belong to the category of constrained optimization problems.
Dependency Graph
The four major components in the modeling of the finite element method are (1) the discretized global
domain Ω_h, (2) variables u_h, (3) element formulation (EF), and (4) matrix representation (MR). We can draw a
tetrahedron whose four vertices represent the four components and whose six edges represent their mutual
relations (see Figure 4•8). The first thing we can do is to reduce this tetrahedron to a planar graph, meaning
that no two edges cross each other; i.e., to reduce it to a lower dimension. This step cannot always be
done. If there is any such difficulty, we need to apply dependency breakers (to be discussed later) to the graph
to reduce it to a lower dimension. In a planar graph, we represent a component as a node and the relations as
arrows. For a component, the number of arrows pointing towards the node is called the degree of entrance. In the
convention of the object-oriented method, an arrow stands for a dependency relation: the node at the starting
point of an arrow depends on the node at the ending point of the arrow.
We briefly explain these dependency relations. Entrance number “0” says that the global discretized variables
u_h depend on the global discretization Ω_h. u_h is defined as the interpolation of nodal variables as in Eq. 4•1; i.e.,
conceptually u_h(φ, û), and the nodal variables û depend on how the nodes and elements, Ω_h, are defined.
Entrance number “1” says that the element formulation depends on the global variables u_h, since the element
stiffness matrix and element force vector are calculated from the interpolated values of the element
nodal variables û_e. Entrance number “2” says that the matrix representation depends on the element
formulation, since the element formulation supplies the element stiffness matrices and element force vectors to be mapped
to the global matrix and global vector. Entrance number “3” is a redundant dependency relation: since u_h
depends on Ω_h and EF depends on u_h, EF must depend on Ω_h. Entrance number “4” is
a similar redundant relation, with one more step of MR depending on EF. Entrance numbers “5” and “7” show a
mutual dependency relation. MR depends on u_h because MR is just the lhs and rhs to be solved for u_h. After we
get the solution from MR, we need to map the solution vector from MR back to u_h; since the fixed degrees of
freedom are excluded from MR, the variable numbering in MR differs from the global degree of
freedom numbering. Therefore, u_h depends on the knowledge of MR. Entrance number “6” has Ω_h depending on EF:
when we define elements, we need to specify the element type number.
Figure 4•8 Tetrahedron showing the four components on the vertices with six edges. This can be
transformed to a planar graph with arrows to show dependency relations. The numbers
marked are the entrance numbers.
1. Johnson, C., 1987, “Numerical solution of partial differential equations by the finite element method”, Press Syndicate of
the University of Cambridge, UK.
Ω_h has the highest degree of entrance, which means it should be at the root of the class hierarchy. However, u_h
and EF have the same degree of entrance. Since EF explicitly depends on u_h, u_h is escalated and EF is
demoted. The order in the class hierarchy is, therefore, Ω_h, u_h, EF, MR, as shown in TABLE 4•1.
The pseudo-level structure is shown in the right-hand-side of Figure 4•9. The redundant relations, entrance
numbers 3, 4, and 5, are drawn as light arrows. These redundant dependencies are the first to be eliminated. Next, there
are still two unresolved entrances (entrances 6 and 7, pointing downwards) in the left-hand-side of Figure 4•9,
which prevent the graph from being a level structure. Therefore, in the rest of this section we explore C++
levelization idioms1 that help us break these two dependency relations. A level structure is not only simpler for
the human mind to understand; it also has a profound impact on the organization of the software components.
Firstly, with a simplified dependency hierarchy, the interfaces of the software components are much simpler,
and the interaction among the components is easier to understand. For example, one can just bear in mind that
only components lower in the hierarchy depend on those above. If there are exceptions,
such as entrances 6 and 7, we just mark them as such. On the other hand, a complicated network of software
components, such as the one in the left-hand-side of Figure 4•8, is extremely difficult to follow. There are
many cliques among the components. One node can lead to another and then back to itself. The dynamic interaction
patterns among the components seem to have a life of their own, and the sequence of events can be acted out
differently every time. Therefore, a model based on the graph level structure is less error-prone. Secondly, the
complicated network demands that all modules be developed, tested, and maintained together. Divide and
conquer is the principal strategy that we always need to deploy in the development, testing, and maintenance of a
program. The graph level structure in the right-hand-side of Figure 4•9 means that these processes can now be done
in a more modular fashion, from level 0 down to level 3 incrementally. We discuss two dependency
breakers in the following.
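The idea of a level structure can be made concrete with a small sketch in plain C++. The function levelize below is a hypothetical helper written for this illustration only (it is not part of “fe.lib”). A pair (a, b) means that component a depends on component b; a component that depends on nothing gets level 0, and every other component sits one level below the deepest component it depends on. When the graph contains a cycle, no level structure exists and an empty vector is returned.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assign a level to each of n components, or return an empty vector
// when the dependency graph is cyclic.
std::vector<int> levelize(int n,
                          const std::vector<std::pair<int, int>>& depends_on) {
    std::vector<int> level(n, -1);   // -1 means "not yet assigned"
    int assigned = 0;
    bool progress = true;
    while (assigned < n && progress) {
        progress = false;
        for (int a = 0; a < n; ++a) {
            if (level[a] >= 0) continue;
            int lvl = 0;
            bool ready = true;
            for (const auto& e : depends_on) {
                if (e.first != a) continue;
                if (level[e.second] < 0) { ready = false; break; }
                if (level[e.second] + 1 > lvl) lvl = level[e.second] + 1;
            }
            if (ready) { level[a] = lvl; ++assigned; progress = true; }
        }
    }
    if (assigned < n) level.clear();  // a cycle blocks levelization
    return level;
}
```

With the components numbered 0 = Ω_h, 1 = u_h, 2 = EF, 3 = MR, entrances 0 to 5 are the pairs (1,0), (2,1), (3,2), (2,0), (3,0), (3,1), and the function returns the levels 0, 1, 2, 3. Adding entrance 7 as the pair (1,3) creates a cycle between u_h and MR, so the function returns an empty vector, which is why a dependency breaker is needed.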
Pointer to a Forward Declaration Class: We can apply a traditional C technique to break the dependency
relation caused by entrance number 7; that is, the output of the solution u_h needs the knowledge of class
Matrix_Representation. The order of the solution vector “u”, in main(), corresponds to the order of the
variable numbers in Matrix_Representation. For output of the solution, we need to map this internal order of
Matrix_Representation back to the order of the global nodal degrees of freedom u_h according to the specification
of the problem. This dependency relation can be broken with the forward declaration in traditional C:
Ia. “u_h.h”
1 class Matrix_Representation;
2 class U_h {
3 Matrix_Representation *mr;
4 ...
5 public:
6 ...
7 Matrix_Representation* &matrix_representation() { return mr; }
8 U_h& operator=(C0&);
9 U_h& operator+=(C0&);
10 U_h& operator-=(C0&);
11 };
Ib. “u_h.cpp”
12 #include "u_h.h"
13 ...
IIa. “matrix_representation.h”
14 class Matrix_Representation {
15 ...
16 protected:
17 Global_Discretization &the_global_discretization;
18 ...
19 public:
IIb. “matrix_representation.cpp”
23 #include "u_h.h"
24 #include "matrix_representation.h"
25 void Matrix_Representation::__initialization(char *s) {
26 if(!(the_global_discretization.u_h().matrix_representation()) )
27 the_global_discretization.u_h().matrix_representation() = this;
28 ...
29 }
30 U_h& U_h::operator=(C0& a) { ... }
31 U_h& U_h::operator+=(C0& a) { ... }
32 U_h& U_h::operator-=(C0& a) { ... }
Class U_h and class Matrix_Representation actually depend on each other. Therefore, their implementations
in the “.cpp” files require the knowledge of both definitions; that is, they include both
“.h” files. Traditional C (note that a class can be viewed as a special case of struct) provides a
mechanism to break this mutual dependency: the forward declaration, such as in line 1, where the class name
Matrix_Representation is introduced into the name scope of “u_h.h”, on the condition that only
the name of class Matrix_Representation, not its member data or member functions, is used in the
definition of class U_h. In class U_h, we at most refer to a pointer to class Matrix_Representation, which is only an
address in computer memory, not an actual instance of class Matrix_Representation, because the
translation unit has no knowledge yet of what class Matrix_Representation really is. Now a programmer in the
development team can compile and test “u_h.cpp” separately, without having to define class Matrix_Representation at all.
One scenario for using a forward declaration of a class with a member pointer to it arises after the entire
product has been completed and sold to the customer: if we want to change the definition and implementation of
class Matrix_Representation, we do not need to recompile the file “u_h.cpp”. Changes in the “.h” and “.cpp” files
of class Matrix_Representation do not affect the object code of the class U_h module. A less dramatic scenario
is that the development process is iterative and the files need to be compiled many
times. During development cycles, the class U_h module does not need to be recompiled every time class
Matrix_Representation is changed. We have therefore seen the most primitive form of a compilation firewall,
set up to separate the compile-time dependencies among source files. A huge project, such as the one developed at
Mentor Graphics that we mentioned earlier, may have thousands of files. It would be ridiculous if an
unimportant change to a tiny file high in the dependency hierarchy caused the “make” command to spend
tens of hours recompiling all the modules that depend on it. Before long you would refuse to make
any change at all. In yet another scenario, when class Matrix_Representation is intended to be hidden from
end-users, this same technique insulates end-users from accessing class Matrix_Representation directly.
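The forward-declaration technique can be tried in isolation with the following minimal sketch. The classes Field and Solver are hypothetical stand-ins for U_h and Matrix_Representation; they are reduced to a single translation unit for brevity, whereas in practice the two class definitions would live in separate header files.

```cpp
#include <cassert>

class Solver;   // forward declaration: only the name is known at this point

// Stand-in for U_h. It stores a pointer to Solver, which requires no
// knowledge of Solver's members or size.
class Field {
    Solver* solver_ = nullptr;
public:
    Solver*& solver() { return solver_; }
    int equation_count() const;   // defined after Solver is complete
};

// Stand-in for Matrix_Representation. Its constructor re-connects the two
// classes at run time by registering itself with the Field instance.
class Solver {
    Field& field_;
    int equations_;
public:
    Solver(Field& field, int equations)
        : field_(field), equations_(equations) {
        if (!field_.solver()) field_.solver() = this;
    }
    int equations() const { return equations_; }
};

// Only this definition needs the complete type Solver.
int Field::equation_count() const {
    return solver_ ? solver_->equations() : 0;
}
```

A Field created on its own reports zero equations; once a Solver is constructed from it, the Field can reach the Solver through the stored pointer, although the definition of Field never depended on the definition of Solver.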
Certainly, the dependency relation of entrance number 7 exists; it is demanded by the problem domain,
and we can only find a way to get around it. We have successfully broken this particular dependency and made class U_h an
independent software module, but how do we re-connect the two classes as the problem domain requires? When we define
the constructor of class Matrix_Representation, the first line of the constructor calls its private member
function “__initialization(char*)”. This private member function sets up the current instance of
Element Type Register: On page 281, we discussed the element type register from the user's code segment, with
registration by
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static Truss truss_instance(element_type_register_instance); // element type number “2”
static T3 t3_instance(element_type_register_instance); // element type number “1”
static Q4 q4_instance(element_type_register_instance); // element type number “0”
The element types are registered in a list data structure. The last registered element type has number “0”, and
the number increases backwards to the first registered element in the “type_list”. These element type numbers are
referred to when we define the element as
This user interface design itself breaks the dependency of the definition of an element on element types. The
C++ technique to implement this design is the autonomous virtual constructor1. Let’s first look at the definitions
of the class Element_Formulation
1. see autonomous generic constructor in J. O. Coplien, 1992, “ Advanced C++: Programming styles and idioms”, Addison-
Wesley, Reading, Massachusetts.
The class Element_Type_Register, in line 1, is a dummy class that is used like a signature in line 8 to indicate that
the instance of class Element_Formulation being generated is an element type identifier, so that the static member
type_list embedded in Element_Formulation is maintained automatically. This element_type_number
information is used in “Matrix_Representation::assembly()” as
Line 3 computes the Element_Formulation and forms an instance of Element_Formulation, say “ef”, which can be
queried as “ef.lhs()” and “ef.rhs()”. The task of “create()” is to forward the call to “make()” through its
delegate “rep_ptr->make()”. Since “make()” is virtual and redefined in the derived class, the request in line
3 is dispatched to a user-defined element class. The virtual function mechanism is usually referred to as
late binding at run-time. In this case, the cyclic dependence of an element on the element formulation,
deliberately broken for software modularization, is re-connected at run-time by the late-binding technique
supported by C++.
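The registration and late-binding mechanism described above can be sketched in a few lines of plain C++. This is only an illustration of the idiom under simplifying assumptions: the names Formulation, Register_Tag, and the two element classes below are hypothetical, and the real Element_Formulation in “fe.lib” has a richer interface (for example the “rep_ptr” delegate) that is not reproduced here.

```cpp
#include <cassert>
#include <string>

struct Register_Tag {};   // dummy "signature" type used only for registration

class Formulation {
    static Formulation* type_list;   // head of the list = last registered
    Formulation* next_ = nullptr;
    int element_no_ = -1;
protected:
    explicit Formulation(int element_no) : element_no_(element_no) {}
    // Registration constructor: push this exemplar onto the front of the list.
    explicit Formulation(Register_Tag) : next_(type_list) { type_list = this; }
public:
    virtual ~Formulation() {}
    virtual Formulation* make(int element_no) const = 0;
    virtual std::string name() const = 0;
    int element_no() const { return element_no_; }
    // Type number 0 is the last registered exemplar; numbers grow backwards.
    static Formulation* create(int type_no, int element_no) {
        Formulation* p = type_list;
        for (int i = 0; p && i < type_no; ++i) p = p->next_;
        return p ? p->make(element_no) : nullptr;   // late binding on make()
    }
};
Formulation* Formulation::type_list = nullptr;

class Truss_Sketch : public Formulation {
    explicit Truss_Sketch(int en) : Formulation(en) {}
public:
    explicit Truss_Sketch(Register_Tag t) : Formulation(t) {}
    Formulation* make(int en) const override { return new Truss_Sketch(en); }
    std::string name() const override { return "Truss"; }
};

class Q4_Sketch : public Formulation {
    explicit Q4_Sketch(int en) : Formulation(en) {}
public:
    explicit Q4_Sketch(Register_Tag t) : Formulation(t) {}
    Formulation* make(int en) const override { return new Q4_Sketch(en); }
    std::string name() const override { return "Q4"; }
};

static Register_Tag register_tag;
static Truss_Sketch truss_exemplar(register_tag);   // type number 1
static Q4_Sketch q4_exemplar(register_tag);         // type number 0
```

Calling Formulation::create(0, 7) walks the list and returns a new Q4_Sketch for element 7, while type number 1 yields a Truss_Sketch. The code that requests an element never names a concrete element class, which is the point of the design.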
[Figure: level structure of the components. Labels recoverable from the figure: Level 0: Node, Ω_eh, Ω_h; Level 1: u_h, g ∈ Γ_g, h ∈ Γ_h, Global_Discretization; Level 2: EF, User Defined Elements, Finite_Element_Approximation; Level 3: MR, Global Tensors, Element Tensors.]
//==========================================================================
// Step 1: Global_Discretization
//==========================================================================
//==========================================================================
// Step 2: Element_Formulation
//==========================================================================
//==========================================================================
// Step 3: Matrix_Representation and Solution Phase
//==========================================================================
24 int main() {
25 int ndf = 1; // instantiation of Global_Discretization
26 Omega_h oh;
27 gh_on_Gamma_h gh(ndf, oh);
28 U_h uh(ndf, oh);
29 Global_Discretization gd(oh, gh, uh);
30 Matrix_Representation mr(gd);
31 mr.assembly(); // assemble the global matrix
32 C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs())); // solution phase
33 gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h(); // update solution
34 cout << gd.u_h(); // output solution
35 return 0;
36 }
Many segments and their variations of this template have been discussed in 4.1.2. The rest of this Chapter con-
sists of concrete examples of writing user programs using this template.
$$ -\frac{d^2 u}{dx^2} = \cos \pi x, \qquad 0 < x < 1 $$ Eq. 4•16
1. Dirichlet boundary conditions: From Eq. 4•9 and Eq. 4•10 we have the element stiffness matrix as
$$ k_e^{ij} = a(\phi_e^i, \phi_e^j) = \int_0^1 \frac{d\phi_e^i}{dx} \frac{d\phi_e^j}{dx} \, dx $$ Eq. 4•18
The last identity holds because the essential and natural boundary conditions are all homogeneous, so the second
term $(\phi_e^i, h)_\Gamma$ and the third term $-a(\phi_e^i, \phi_e^j)\,u_e^j$ always vanish. In more general cases that they are not homoge-
1. p. 367-371 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
$$ \phi_e^0 = \frac{1}{2}(1 - \xi), \quad \text{and} \quad \phi_e^1 = \frac{1}{2}(1 + \xi) $$ Eq. 4•20
These are the linear interpolation functions we used for integration over a line segment in Chapter 3 (Eq. 3•10
and Eq. 3•11 of Chapter 3).
The finite element program using VectorSpace C++ Library and “fe.lib” to implement the linear element is
shown in Program Listing 4•1. We use the program template in the previous section. First, we define nodes and
elements in “Omega_h::Omega_h()”. This constructor for the discretized global domain defines nodes with their
node numbers and nodal coordinates as
The elements are defined with the global node numbers associated with each element as
The three sets of boundary conditions are (1) Dirichlet, (2) Neumann, and (3) Mixed. The corresponding code
segments can be turned on or off with a set of macro definitions, at compile time, as
1 #if defined(__TEST_MIXED_BOUNDARY_CONDITION)
2 gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
3 __initialization(df, omega_h);
4 the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
5 the_gh_array[node_order(0)][0] = 0.0;
6 the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Neumann;
7 the_gh_array[node_order(node_no-1)][0] = 0.0;
#include "include\fe.h"
static const int node_no = 9; static const int element_no = 8; static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h() {
   for(int i = 0; i < node_no; i++) {                          // define nodes
      double v; v = ((double)i)/((double)element_no);
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node);
   }
   int ena[2];                                                 // define elements
   for(int i = 0; i < element_no; i++) {
      ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem);
   }
}
// define boundary conditions: u(0) = u(1) = 0
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
}
// define user element "ODE_2nd_Order"
class ODE_2nd_Order : public Element_Formulation {
   public:
      ODE_2nd_Order(Element_Type_Register a) : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      ODE_2nd_Order(int, Global_Discretization&);
};
Element_Formulation* ODE_2nd_Order::make(int en, Global_Discretization& gd) {
   return new ODE_2nd_Order(en,gd);
}
static const double PI = 3.14159265359;
ODE_2nd_Order::ODE_2nd_Order(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);                           // 1d Gauss quadrature
   H1 Z(qp),
      N=INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 2, 1, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2;                             // N0 = (1-ξ)/2, N1 = (1+ξ)/2
   H1 X = N*xl;                                                // coordinate transformation rule
   H0 Nx = d(N)(0)/d(X);                                       // N,x
   J dv(d(X));                                                 // the Jacobian
   stiff &= (Nx * (~Nx)) | dv;                                 // k_e^ij = ∫ (dφ_e^i/dx)(dφ_e^j/dx) dx over [0, 1]
   force &= ( ((H0)N)*cos(PI*((H0)X)) )| dv;                   // f_e^i = ∫ φ_e^i cos(πx) dx over [0, 1]
}
// register element
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static ODE_2nd_Order ode_2nd_order_instance(element_type_register_instance);
int main() {
   const int ndf = 1;
   // instantiate fixed and free variables and Global_Discretization
   Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   // matrix form
   Matrix_Representation mr(gd);
   mr.assembly();                                              // assembly all elements
   C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));                 // solve linear algebraic equations
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();                // update solution and B.C.
   cout << gd.u_h();                                           // output
   return 0;
}
Listing 4•1 Dirichlet boundary condition u(0) = u(1) = 0, for the differential equation -u'' = f (project:
“2nd_order_ode” in project workspace file “fe.dsw” (in case of MSVC) under directory “vs\ex\fe”).
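For readers without access to “fe.lib”, the same Dirichlet problem can be reproduced with a short stand-alone sketch in plain C++. The functions solve_poisson_1d and exact_u are hypothetical names for this illustration. The sketch assumes a uniform mesh, two-point Gauss quadrature for the force vector, and a tridiagonal (Thomas) solver in place of the library's matrix solver. The exact solution used for comparison, u = (cos πx + 2x − 1)/π², follows from integrating Eq. 4•16 twice and applying u(0) = u(1) = 0.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Solve -u'' = cos(pi x) on (0, 1) with u(0) = u(1) = 0 using n linear
// elements. Returns the nodal values u_0 .. u_n.
std::vector<double> solve_poisson_1d(int n) {
    assert(n >= 2);
    const double pi = std::acos(-1.0);
    const double h = 1.0 / n;
    const double gp[2] = {-1.0 / std::sqrt(3.0), 1.0 / std::sqrt(3.0)};
    const int m = n - 1;   // free nodes 1 .. n-1 get equations 0 .. m-1
    std::vector<double> lower(m, 0.0), diag(m, 0.0), upper(m, 0.0), rhs(m, 0.0);
    for (int e = 0; e < n; ++e) {
        const double ke[2][2] = {{1.0 / h, -1.0 / h}, {-1.0 / h, 1.0 / h}};
        double fe[2] = {0.0, 0.0};
        for (double xi : gp) {                       // Gauss weights are 1
            const double N[2] = {(1.0 - xi) / 2.0, (1.0 + xi) / 2.0};
            const double x = e * h + N[1] * h;       // x = N0*x0 + N1*x1
            for (int a = 0; a < 2; ++a)
                fe[a] += N[a] * std::cos(pi * x) * (h / 2.0);  // Jacobian h/2
        }
        for (int a = 0; a < 2; ++a) {
            const int row = e + a - 1;
            if (row < 0 || row >= m) continue;       // fixed node
            rhs[row] += fe[a];
            for (int b = 0; b < 2; ++b) {
                const int col = e + b - 1;
                if (col < 0 || col >= m) continue;   // homogeneous value
                if (col == row) diag[row] += ke[a][b];
                else if (col == row + 1) upper[row] += ke[a][b];
                else lower[row] += ke[a][b];
            }
        }
    }
    for (int i = 1; i < m; ++i) {                    // forward elimination
        const double w = lower[i] / diag[i - 1];
        diag[i] -= w * upper[i - 1];
        rhs[i] -= w * rhs[i - 1];
    }
    std::vector<double> u(n + 1, 0.0);               // u[0] = u[n] = 0
    for (int i = m - 1; i >= 0; --i)                 // back substitution
        u[i + 1] = (rhs[i] - upper[i] * u[i + 2]) / diag[i];
    return u;
}

// Exact solution of the same boundary value problem.
double exact_u(double x) {
    const double pi = std::acos(-1.0);
    return (std::cos(pi * x) + 2.0 * x - 1.0) / (pi * pi);
}
```

Calling solve_poisson_1d(8) uses the same eight-element mesh as the listing; the nodal values can then be compared with exact_u at the node coordinates i/8.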
The Dirichlet boundary conditions are taken as the default macro definition. The constraint type selector is the
“operator () (int dof)”. We can assign the type of constraint to the corresponding degree of freedom as
“gh_on_Gamma_h::Neumann” or “gh_on_Gamma_h::Dirichlet”. The default constraint type is the Neumann
condition. The constraint value selector is the “operator [ ](int dof)”. The default constraint value is “0.0”. In other
words, you can eliminate lines 5-7, lines 12-15, and lines 17, 23, and 25, and the results should be the same.
The added essential boundary conditions on the middle point of the problem domain (lines 16 and 17) are
necessary for the Neumann boundary conditions for this problem, because the solution is not unique under
Neumann boundary conditions alone.
“fe.lib” requires the following code to orchestrate the polymorphic mechanism of the Element_Formulation
that sets up the element type register. For a user-defined class “ODE_2nd_Order” derived from class
Element_Formulation we have
Lines 10 and 11 set up the data for registration and line 12 registers the element formulation “ODE_2nd_Order”.
Line 5 is the constructor for class ODE_2nd_Order, where we define the user-customized element formulation as
For the element stiffness matrix, instead of “stiff &= (Nx* (~Nx)) | dv;”, the tensor product operator “H0&
H0::operator%(const H0&)” in VectorSpace C++ can be used for expressing
$$ k_e = \int_0^1 \frac{d\phi_e}{dx} \otimes \frac{d\phi_e}{dx} \, dx $$ Eq. 4•21
as
1 int main() {
2 const int ndf = 1;
3 Omega_h oh; // global discretized domain: Ω_h
4 gh_on_Gamma_h gh(ndf, oh); // fixed variables — g ∈ Γ g, h ∈ Γ h
5 U_h uh(ndf, oh); // free variables— u h
6 Global_Discretization gd(oh, gh, uh); // the class Global_Discretization
7 Matrix_Representation mr(gd);
8 mr.assembly(); // assembly all elements
9 C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs())); // matrix solver
The instances of the global discretization, “oh”, and of the fixed and free variables, “gh” and “uh”, respectively, are then
all used to instantiate an instance of class Global_Discretization, “gd”. The results of using the linear line element
for the second-order differential equation in the finite element method are shown in Figure 4•11.
Figure 4•11 The results from eight linear elements for (1) Dirichlet, (2) Neumann, and (3) Mixed
boundary conditions for the second-order ordinary differential equation. Line segments with open
squares are finite element solutions, and continuous curves are analytical solutions.
$$ \phi_e^0 = -\frac{\xi}{2}(1 - \xi), \quad \phi_e^1 = (1 - \xi)(1 + \xi), \quad \text{and} \quad \phi_e^2 = \frac{\xi}{2}(1 + \xi) $$ Eq. 4•22
These are the same quadratic interpolation functions as in Chapter 3 (Eq. 3•22).
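The properties of the quadratic interpolation functions in Eq. 4•22 can be checked with a small stand-alone sketch. The function names below are hypothetical and independent of “fe.lib”. The sketch evaluates the three functions and their derivatives with respect to ξ, and integrates the element stiffness matrix of Eq. 4•18 for an element of length h with two-point Gauss quadrature, which is exact here because the integrand is quadratic in ξ.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Quadratic shape functions of Eq. 4•22 at xi in [-1, 1].
std::array<double, 3> quadratic_shape(double xi) {
    return {-xi * (1.0 - xi) / 2.0, (1.0 - xi) * (1.0 + xi), xi * (1.0 + xi) / 2.0};
}

// Their derivatives with respect to xi.
std::array<double, 3> quadratic_shape_dxi(double xi) {
    return {xi - 0.5, -2.0 * xi, xi + 0.5};
}

// Element stiffness k_ij = integral of (dN_i/dx)(dN_j/dx) dx over an element
// of length h, with dx = (h/2) d(xi) and dN/dx = (2/h) dN/d(xi).
std::array<std::array<double, 3>, 3> quadratic_stiffness(double h) {
    std::array<std::array<double, 3>, 3> k{};          // zero-initialized
    const double g = 1.0 / std::sqrt(3.0);
    const double gp[2] = {-g, g};                      // Gauss weights are 1
    for (double xi : gp) {
        const std::array<double, 3> dN = quadratic_shape_dxi(xi);
        for (int a = 0; a < 3; ++a)
            for (int b = 0; b < 3; ++b)
                k[a][b] += (dN[a] * 2.0 / h) * (dN[b] * 2.0 / h) * (h / 2.0);
    }
    return k;
}
```

Each shape function equals one at its own node (ξ = −1, 0, 1) and zero at the other two, and the three functions sum to one everywhere. The resulting stiffness matrix is 1/(3h) times [[7, −8, 1], [−8, 16, −8], [1, −8, 7]].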
The finite element program using VectorSpace C++ Library and “fe.lib” to implement the quadratic line ele-
ment is shown in Program Listing 4•2. The definitions of 5 nodes and 2 quadratic elements are
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 2; static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h() {
   for(int i = 0; i < node_no; i++) {                          // define nodes on [0, 1]
      double v; v = ((double)i)/((double)(node_no-1));
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node);
   }
   int ena[3];                                                 // define elements
   for(int i = 0; i < element_no; i++) {
      ena[0] = i*2; ena[1] = ena[0]+1; ena[2] = ena[0]+2;      // adjacent elements share one end node
      Omega_eh* elem = new Omega_eh(i, 0, 0, 3, ena); the_omega_eh_array.add(elem);
   }
}
// define boundary conditions: u(0) = u(1) = 0
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
}
// define user element "ODE_2nd_Order_Quadratic"
class ODE_2nd_Order_Quadratic : public Element_Formulation {
   public:
      ODE_2nd_Order_Quadratic(Element_Type_Register a) : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      ODE_2nd_Order_Quadratic(int, Global_Discretization&);
};
Element_Formulation* ODE_2nd_Order_Quadratic::make(int en, Global_Discretization& gd) {
   return new ODE_2nd_Order_Quadratic(en,gd);
}
static const double PI = 3.14159265359;
ODE_2nd_Order_Quadratic::ODE_2nd_Order_Quadratic(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);                           // 1d Gauss quadrature
   H1 Z(qp),
      N=INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 3, 1, qp);
   N[0] = -Z*(1-Z)/2; N[1] = (1-Z)*(1+Z); N[2] = Z*(1+Z)/2;    // N0 = -ξ(1-ξ)/2, N1 = (1-ξ)(1+ξ), N2 = ξ(1+ξ)/2
   H1 X = N*xl;                                                // coordinate transformation rule
   H0 Nx = d(N)(0)/d(X);                                       // N,x
   J dv(d(X));                                                 // the Jacobian
   stiff &= (Nx * (~Nx)) | dv;                                 // k_e^ij = ∫ (dφ_e^i/dx)(dφ_e^j/dx) dx over [0, 1]
   force &= ( ((H0)N)*cos(PI*((H0)X)) )| dv;                   // f_e^i = ∫ φ_e^i cos(πx) dx over [0, 1]
}
// register element
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static ODE_2nd_Order_Quadratic
   ode_2nd_order_quadratic_instance(element_type_register_instance);
int main() {
   // instantiate fixed and free variables and Global_Discretization
   const int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   // matrix form
   Matrix_Representation mr(gd);
   mr.assembly();                                              // assembly all elements
   C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));                 // solve linear algebraic equations
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();                // update solution and B.C.
   cout << gd.u_h();                                           // output
   return 0;
}
Listing 4•2 Quadratic element for Dirichlet boundary condition u(0) = u(1) = 0 of the differential
equation -u'' = f (project: “quadratic_ode” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
The interpolation functions of Eq. 4•22 in the constructor of the user-defined element are
H1 Z(qp),
   N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
      "int, int, Quadrature", 3/*nen*/, 1/*nsd*/, qp);
N[0] = -Z*(1-Z)/2; N[1]=(1-Z)*(1+Z); N[2]=Z*(1+Z)/2; // φ_e0 = -ξ(1-ξ)/2, φ_e1 = (1-ξ)(1+ξ), φ_e2 = ξ(1+ξ)/2
The results of using only two quadratic elements are shown in Figure 4•12.
Figure 4•12 The results from two quadratic elements for (1) Dirichlet, (2) Neumann, and (3)
Mixed boundary conditions for the second-order ordinary differential equation. Dashed curves
with open squares are finite element solutions, and continuous curves are analytical solutions.
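$$ \nabla^2 u = \frac{1}{r} \frac{\partial}{\partial r}\left( r \frac{\partial u}{\partial r} \right) + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} + \frac{\partial^2 u}{\partial z^2} $$ Eq. 4•23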
1. see for example p. 667, Eq (II.4.C4) in L.E. Malvern, 1969, “Introduction to the mechanics of a continuous medium”,
Prentice-Hall, Inc., Englewood Cliffs, N.J.
2. example in p. 364-367 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”,
McGraw-Hill, Inc.
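[Figure: two concentric hollow cylinders along the radial coordinate r, with radii 20 mm, 31.6 mm, and 50 mm, conductivities κ = 5 and κ = 1, and boundary temperatures 100 °C and 0 °C.]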
Replacing dΩ by 2πr dr in the volume integral, the element stiffness matrix in Eq. 4•9 and Eq. 4•10 is obtained by
integration by parts of the weighted-residual statement with Eq. 4•24
$$ k_e = \int \kappa \, \frac{d\phi_e}{dr} \otimes \frac{d\phi_e}{dr} \, 2\pi r \, dr $$ Eq. 4•25
where “Nr” is the derivative of shape functions “N” with respect to “r”. This is implemented in Program Listing
4•3. The results are shown in Figure 4•14.
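The element stiffness matrix of Eq. 4•25 can also be assembled and solved without “fe.lib”, which gives an independent check on the axisymmetric formulation. The sketch below is a simplified stand-alone version under stated assumptions: the function solve_radial_conduction is a hypothetical name, linear elements are used, the conductivity is constant within each element, and a tridiagonal solver replaces the library's matrix solver. For a linear element of length h the derivative dφ/dr is ±1/h, so the integral in Eq. 4•25 reduces to κ 2π r_mid / h, where r_mid is the element midpoint radius.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Radial heat conduction with fixed temperatures at both ends.
// r holds the nodal radii, kappa one conductivity per element.
// Returns the nodal temperatures.
std::vector<double> solve_radial_conduction(const std::vector<double>& r,
                                            const std::vector<double>& kappa,
                                            double T_inner, double T_outer) {
    const double pi = std::acos(-1.0);
    const int n = static_cast<int>(r.size()) - 1;      // number of elements
    assert(n >= 2 && static_cast<int>(kappa.size()) == n);
    const int m = n - 1;                               // number of free nodes
    std::vector<double> lower(m, 0.0), diag(m, 0.0), upper(m, 0.0), rhs(m, 0.0);
    std::vector<double> T(n + 1, 0.0);
    T[0] = T_inner;
    T[n] = T_outer;
    for (int e = 0; e < n; ++e) {
        const double h = r[e + 1] - r[e];
        const double c = kappa[e] * 2.0 * pi * 0.5 * (r[e] + r[e + 1]) / h;
        const double ke[2][2] = {{c, -c}, {-c, c}};
        for (int a = 0; a < 2; ++a) {
            const int row = e + a - 1;
            if (row < 0 || row >= m) continue;         // fixed node
            for (int b = 0; b < 2; ++b) {
                const int node = e + b;
                const int col = node - 1;
                if (node == 0 || node == n) rhs[row] -= ke[a][b] * T[node];
                else if (col == row) diag[row] += ke[a][b];
                else if (col == row + 1) upper[row] += ke[a][b];
                else lower[row] += ke[a][b];
            }
        }
    }
    for (int i = 1; i < m; ++i) {                      // forward elimination
        const double w = lower[i] / diag[i - 1];
        diag[i] -= w * upper[i - 1];
        rhs[i] -= w * rhs[i - 1];
    }
    for (int i = m - 1; i >= 0; --i)                   // back substitution
        T[i + 1] = (rhs[i] - upper[i] * T[i + 2]) / diag[i];
    return T;
}
```

With the nine radii and two conductivities of Program Listing 4•3, the computed temperature at the interface r = 31.6 can be compared with the analytical value 100 a/(a + b), where a = κ₁/ln(31.6/20) and b = κ₂/ln(50/31.6). This value follows from the logarithmic temperature profile in each cylinder and from continuity of the heat flux at the interface.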
Figure 4•14 The solution of heat conduction of an axisymmetrical problem with two
hollow cylinders (temperature T in °C versus radius r).
#include "include\fe.h"
static const int node_no = 9; static const int element_no = 8; static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h() {
   double r[9] = {20.0, 22.6, 25.1, 28.4, 31.6, 35.7, 39.8, 44.9, 50.0};
   for(int i = 0; i < node_no; i++) {                          // define 9 nodes
      Node* node = new Node(i, spatial_dim_no, r+i); the_node_array.add(node); }
   int ena[2], material_type_no;
   for(int i = 0; i < element_no; i++) {                       // define 8 elements
      ena[0] = i; ena[1] = ena[0]+1;
      if(i < element_no / 2) material_type_no = 0; else material_type_no = 1;
      Omega_eh* elem = new Omega_eh(i, 0, material_type_no, 2, ena);
      the_omega_eh_array.add(elem);
   }
}
// define boundary conditions: u(20) = 100, u(50) = 0
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(0)][0] = 100.0;
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)][0] = 0.0;
}
// define user element "ODE_Cylindrical_Coordinates"
class ODE_Cylindrical_Coordinates : public Element_Formulation {
   public:
      ODE_Cylindrical_Coordinates(Element_Type_Register a) : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      ODE_Cylindrical_Coordinates(int, Global_Discretization&);
};
Element_Formulation* ODE_Cylindrical_Coordinates::make(int en, Global_Discretization& gd) {
   return new ODE_Cylindrical_Coordinates(en,gd); }
static const double PI = 3.14159265359; static const double kapa[2] = {5.0, 1.0};
ODE_Cylindrical_Coordinates::ODE_Cylindrical_Coordinates(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);                           // 1d Gauss quadrature
   H1 Z(qp),
      N=INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 2, 1, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2;                             // N0 = (1-ξ)/2, N1 = (1+ξ)/2
   H1 r = N*xl;                                                // coordinate transformation rule
   H0 Nr = d(N)(0)/d(r);                                       // N,r
   J dr(d(r));                                                 // the Jacobian
   stiff &= ( ( kapa[material_type_no]*2.0*PI*((H0)r) ) * (Nr%Nr) ) | dr;  // k_e = ∫ κ (dφ_e/dr ⊗ dφ_e/dr) 2πr dr
}
// register element
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static ODE_Cylindrical_Coordinates ode_cylindrical_instance(element_type_register_instance);
int main() {
   // instantiate fixed and free variables and Global_Discretization
   const int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   // matrix form
   Matrix_Representation mr(gd);
   mr.assembly();                                              // assembly all elements
   C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));                 // solve linear algebraic equations
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();                // update solution and B.C.
   cout << gd.u_h();                                           // output
   return 0;
}
Listing 4•3 Axisymmetrical problem using cylindrical coordinates for the differential equation -u'' = 0
(project: “cylindrical_ode” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
and the shear force V is equal to the derivative of the bending moment (M) as
Therefore,
$$ \frac{d^2 M}{dx^2} = f $$ Eq. 4•28
The transverse deflection of the beam is denoted as w, and the curvature (d²w/dx²) of the beam is related to the
bending moment “M” and the flexural rigidity “EI” as
$$ \frac{d^2 w}{dx^2} = \frac{M}{EI} $$ Eq. 4•29
Substituting “M” in Eq. 4•29 into Eq. 4•28 gives the fourth-order ordinary differential equation
$$ \frac{d^2}{dx^2}\left( EI \frac{d^2 w}{dx^2} \right) = f, \qquad 0 < x < L $$ Eq. 4•30
We consider a boundary value problem in which Eq. 4•30 is subject to the boundary conditions1
$$ w(0) = \frac{dw}{dx}(0) = 0, \quad EI \frac{d^2 w}{dx^2}(L) = M, \quad -\frac{d}{dx}\left( EI \frac{d^2 w}{dx^2} \right)(L) = V(L) = 0 $$ Eq. 4•31
In the previous chapter, we solved this boundary value problem using the Rayleigh-Ritz method with four weak
formulations: (1) irreducible formulation, (2) mixed formulation, (3) Lagrange multiplier formulation, and (4)
penalty function formulation. We use the finite element method in this section to implement these four weak
formulations.
1. J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.
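$$ J(w) = \int_\Omega \left[ \frac{EI}{2} \left( \frac{d^2 w}{dx^2} \right)^2 - f w \right] dx - w V \Big|_{\Gamma_h} - \frac{dw}{dx} M \Big|_{\Gamma_h} $$ Eq. 4•32
The last two terms are natural boundary conditions generated from integration by parts. Let δw = εv, where ε
is a small real number. Taking the variation of J and setting δJ(w) = 0 gives
$$ \begin{aligned} \delta J(w) &= \int_\Omega \left[ EI \frac{d^2 \delta w}{dx^2} \frac{d^2 w}{dx^2} - \delta w \, f \right] dx - \delta w \, V \Big|_{\Gamma_h} + \left( -\frac{d \delta w}{dx} \right) M \Big|_{\Gamma_h} \\ &= \varepsilon \left\{ \int_\Omega \left[ EI \frac{d^2 v}{dx^2} \frac{d^2 w}{dx^2} - v f \right] dx - v V \Big|_{\Gamma_h} + \left( -\frac{dv}{dx} \right) M \Big|_{\Gamma_h} \right\} = 0 \end{aligned} $$
Dropping ε, we have
$$ \int_\Omega \left[ EI \frac{d^2 v}{dx^2} \frac{d^2 w}{dx^2} - v f \right] dx - v V \Big|_{\Gamma_h} + \left( -\frac{dv}{dx} \right) M \Big|_{\Gamma_h} = 0 $$ Eq. 4•33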
The integrand of Eq. 4•33 contains derivatives of the variables up to second order. For this equation to be integrable
throughout Ω, we have to require that the first derivative of the variable be continuous throughout the
integration domain. If the first derivative of the variable is not continuous at some point of the integration domain or its
boundaries, the second derivative of the variable at that point is infinite, and Eq. 4•33 is therefore not
integrable. In other words, the first derivative of the variable at nodal points is required to be continuous. This
satisfies the so-called continuity requirement. For example, we consider a two-node line element with two
degrees of freedom associated with each node. That is, the nodal degrees of freedom are set to be û_e = [w0, -dw0/dx,
w1, -dw1/dx] on the two nodes. The node numbers are indicated by subscripts “0” and “1”. The variables,
defined in an element domain, are defined as
1. see derivation in p. 383 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”,
McGraw-Hill, Inc.
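$$ \phi_e^1 = -\xi \left( 1 - \frac{\xi}{h_e} \right)^2 $$
$$ \phi_e^2 = 3 \left( \frac{\xi}{h_e} \right)^2 - 2 \left( \frac{\xi}{h_e} \right)^3 $$
$$ \phi_e^3 = -\xi \left[ \left( \frac{\xi}{h_e} \right)^2 - \frac{\xi}{h_e} \right] $$ Eq. 4•35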
$$k_e = a(\phi_e, \phi_e) = \int_{\Omega_e} EI\, \frac{d^2 \phi_e}{dx^2} \otimes \frac{d^2 \phi_e}{dx^2}\, dx$$
Eq. 4•36
where the essential boundary conditions are $\hat{u}_e = \left[ w_0,\ -\frac{dw}{dx}\big|_0,\ w_1,\ -\frac{dw}{dx}\big|_1 \right]$, and
where P = {V0, -M0, VL, -ML}T is the vector of natural boundary conditions on the boundary shear forces and boundary bending moments. Notice that in the previous chapter we took the counter-clockwise direction as positive for the bending moment. The sign convention taken here for the bending moment is just the opposite. The natural boundary conditions are programmed to be taken care of automatically in “Matrix_Representation::assembly()”, where the left-hand-side is assumed to be a positive term, in contrast to the left-hand-side of Eq. 4•43. This is the reason for taking a minus sign in front of M in the definition of the vector P. Program Listing 4•4 implements the irreducible formulation for the beam bending problem.
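The element stiffness of Eq. 4•36 can be checked independently of the library. The following is a minimal sketch in plain standard C++, not the VectorSpace implementation: the names hermite_shape, hermite_d2 and beam_element_stiffness are hypothetical, and the three-point Simpson rule is written out by hand under the assumption that it is the same rule selected by the weights {1/3, 4/3, 1/3} and the Jacobian h_e/2 in the listing.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vector4 = std::array<double, 4>;
using Matrix4 = std::array<Vector4, 4>;

// Hermite cubics of Eq. 4•35 on 0 <= xi <= h; the slope unknowns are -dw/dx.
// (hypothetical helper, not part of the VectorSpace library)
Vector4 hermite_shape(double xi, double h) {
    assert(h > 0.0);
    const double z = xi / h;
    return { 1.0 - 3.0*z*z + 2.0*z*z*z,
             -xi * (1.0 - z) * (1.0 - z),
             3.0*z*z - 2.0*z*z*z,
             -xi * (z*z - z) };
}

// Second derivatives with respect to xi of the same four functions.
Vector4 hermite_d2(double xi, double h) {
    assert(h > 0.0);
    const double z = xi / h;
    return { (-6.0 + 12.0*z) / (h*h),
             (4.0 - 6.0*z) / h,
             (6.0 - 12.0*z) / (h*h),
             (2.0 - 6.0*z) / h };
}

// k_e = integral over the element of EI * phi'' (x) phi'' (Eq. 4•36),
// integrated with the three-point Simpson rule on [0, h].
Matrix4 beam_element_stiffness(double EI, double h) {
    const double point[3]  = { 0.0, 0.5*h, h };
    const double weight[3] = { h/6.0, 4.0*h/6.0, h/6.0 };
    Matrix4 k{};                                   // zero-initialized
    for (int q = 0; q < 3; ++q) {
        const Vector4 b = hermite_d2(point[q], h);
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                k[i][j] += weight[q] * EI * b[i] * b[j];
    }
    return k;
}
```

Because the second derivatives are linear in ξ, the integrand is quadratic and Simpson's rule integrates it exactly. Calling beam_element_stiffness(1.0, 0.25) gives k[0][0] = 12EI/h³ = 768 and k[1][1] = 4EI/h = 16, and hermite_shape returns {1, 0, 0, 0} at ξ = 0.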
The solutions of the transverse deflection w and slope -dw/dx can be calculated from nodal values according to Eq. 4•34. They are almost identical to the exact solutions in Figure 3•16 and Figure 3•17 of the last chapter on page 208 and page 212, respectively. Therefore, the errors are shown instead in Figure 4•15. Note that the exact
2. or alternative form from p. 49 in T.J.R. Hughes, 1987,”The finite element method: Linear static and dynamic finite element
analysis”, Prentice-Hall, Inc.
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 1.0; static const double h_e = L_/((double)(element_no));
static const double E_ = 1.0; static const double I_ = 1.0; static const double f_0 = 1.0;
static const double M_ = -1.0;
Omega_h::Omega_h() {                                       // define discretized global domain
   for(int i = 0; i < node_no; i++) {                      // define nodes
      double v = ((double)i)*h_e;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   int ena[2];
   for(int i = 0; i < element_no; i++) {                   // define elements
      ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem); }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {   // define boundary conditions
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)][1] = M_;            // M(L) = -1 (positive clockwise)
}
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 2; static Omega_h oh; static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh); static Global_Discretization gd(oh, gh, uh);
class Beam_Irreducible_Formulation : public Element_Formulation {
   public:
      Beam_Irreducible_Formulation(Element_Type_Register a) : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      Beam_Irreducible_Formulation(int, Global_Discretization&);
};
Element_Formulation* Beam_Irreducible_Formulation::make(int en,Global_Discretization& gd) {
   return new Beam_Irreducible_Formulation(en,gd); }
// "Beam_Irreducible_Formulation"
Beam_Irreducible_Formulation::Beam_Irreducible_Formulation(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   double weight[3] = {1.0/3.0, 4.0/3.0, 1.0/3.0},         // Simpson's rule
          h_e = fabs( ((double)(xl[0] - xl[1])) );
   Quadrature qp(weight, 0.0, h_e, 3);
   J d_l(h_e/2.0);
   H2 Z((double*)0, qp), z = Z/h_e,
      N = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
             "int, int, Quadrature", 4/*nen x ndf*/, 1/*nsd*/, qp);
   // Hermite cubics phi_e0 ... phi_e3 (Eq. 4•35)
   N[0] = 1.0-3.0*z.pow(2)+2.0*z.pow(3); N[1] = -Z*(1.0-z).pow(2);
   N[2] = 3.0*z.pow(2)-2.0*z.pow(3); N[3] = -Z*(z.pow(2)-z);
   H0 Nxx = INTEGRABLE_VECTOR("int, Quadrature", 4, qp);
   for(int i = 0; i < 4; i++) Nxx[i] = dd(N)(i)[0][0];
   stiff &= ( (E_*I_)* (Nxx*(~Nxx)) ) | d_l;               // k_e (Eq. 4•36)
   force &= ( ((H0)N) * f_0) | d_l;                        // f_ei = integral of phi_ei f dx
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Beam_Irreducible_Formulation beam_irreducible_instance(element_type_register_instance);
static Matrix_Representation mr(gd);
int main() {
   mr.assembly(); C0 u= ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h(); cout << gd.u_h(); return 0;
}
Listing 4•4 Beam-bending problem irreducible formulation using Hermite cubics (project:
“beam_irreducible_formulation” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
Figure 4•15 The error (= exact solution - finite element solution) of the irreducible
formulation for beam bending problem.
solution of the transverse deflection w is a polynomial in x up to fourth order (see Eq. 3•68 on page 207). The cubic approximation will therefore not give a solution identical to the exact solution.
We consider two more examples with different types of boundary conditions and loads.¹ The first example has a unit downward nodal load on a simply supported beam at x = 120 in. (Figure 4•16). The flexural rigidity of the beam is EI = 3.456 × 10^9 lb in.^2 (the product 144.0*24.0e6 used in the code below). The length of the beam is 360 in. We divide the beam into two cubic Hermite elements. The definitions of the problem are now
static const int node_no = 3; static const int element_no = 2; static const int spatial_dim_no = 1;
static const double L_ = 360.0; static const double E_I_ = 144.0*24.0e6;
Omega_h::Omega_h() {                                       // discretized global domain
   double v = 0.0; Node* node = new Node(0, spatial_dim_no, &v);
   the_node_array.add(node);
   v = 120.0; node = new Node(1, spatial_dim_no, &v);
   the_node_array.add(node);
   v = 360.0; node = new Node(2, spatial_dim_no, &v);
   the_node_array.add(node);
[Figure: simply supported beam with nodes 0, 1, 2 and elements 0, 1; nodal load P = -1.0 lb; flexural rigidity EI = 3.456 × 10^9 lb in.^2]
Figure 4•16 Unit downward nodal loading at position x = 120. The flexural rigidity of the beam is 3.456 × 10^9. Two cubic Hermite elements are used.
1. Example problems from p. 390 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”,
McGraw-Hill, Inc.
Now, in the computation of the element force vector, you can either set f_0 = 0.0 or use conditional compilation, with a macro definition, to leave that line out. The results of this problem are shown in Figure 4•17.
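As a cross-check on the magnitude of the deflection in this nodal load problem, the classical closed-form solution for a simply supported beam under a point load can be evaluated directly. This is a sketch under stated assumptions: the formula is the standard Euler-Bernoulli result rather than something taken from this text, the function name point_load_deflection is hypothetical, and the sign convention (a negative, downward P gives a negative w) is chosen to match the problem definition above.

```cpp
#include <cassert>
#include <cmath>

// Deflection w(x) of a simply supported beam of length L carrying a point load P at x = a.
// A negative P (downward) gives a negative deflection.
// (hypothetical helper based on the classical closed-form result)
double point_load_deflection(double P, double a, double L, double EI, double x) {
    assert(L > 0.0 && EI > 0.0);
    assert(0.0 <= a && a <= L && 0.0 <= x && x <= L);
    const double b = L - a;                        // distance from the load to the right support
    if (x <= a)
        return P * b * x * (L*L - b*b - x*x) / (6.0 * L * EI);
    const double s = L - x;                        // distance from x to the right support
    return P * a * s * (L*L - a*a - s*s) / (6.0 * L * EI);
}
```

With P = -1.0, a = 120, L = 360 and EI = 144.0*24.0e6, the deflection under the load is point_load_deflection(-1.0, 120.0, 360.0, 144.0*24.0e6, 120.0), which evaluates to about -2.22 × 10^-4 in.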
The second example has a distributed load
$$f(x) = f_0 \frac{x}{L}$$
Eq. 4•39
where L = 180 in. and f0 = -1.0. This distributed load is a linear downward loading that increases from zero at the left to unity at the right. The moment of inertia is I = 723 in.^4, and Young’s modulus is E = 29 × 10^6 psi, with boundary conditions w(0) = w(L) = dw/dx (L) = 0. We divide the beam into four equal-size cubic Hermite elements. The problem definitions for nodes, elements, and boundary conditions are
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 180.0; static const double element_size = L_/((double)(element_no));
static const double E_ = 29.0e6; static const double I_ = 723.0; static const double f_0 = -1.0;
Omega_h::Omega_h() {                                       // discretized global domain
[Figure: w and dw/dx plotted against x]
Figure 4•17 Finite element solution for the nodal load problem for irreducible formulation of beam bending problem.
In the constructor of the class Beam_Irreducible_Formulation the element force vector is computed as
The results of this distributed load problem using the irreducible formulation are shown in Figure 4•18. These
two extra problems are actually coded in the same project “beam_irreducible_formulation” in project workspace
file “fe.dsw” (in case of MSVC) under directory “vs\ex\fe”. They can be activated by setting corresponding
macro definitions at compile time.
[Figure: w and dw/dx plotted against x]
Figure 4•18 Finite element solution of the distributed load problem for the irreducible formulation of beam bending problem. The distributed load is a linear downward loading that increases from zero at the left to unit load at the right.
$$\frac{d^2 w}{dx^2} = \frac{M}{EI}, \quad \text{and} \quad \frac{d^2 M}{dx^2} = f$$
Eq. 4•40
$$J_M(w, M) = \int_0^L \left[ \frac{dw}{dx} \frac{dM}{dx} + \frac{M^2}{2EI} + f w \right] dx - M \frac{dw}{dx} \bigg|_{\Gamma_h} - w \frac{dM}{dx} \bigg|_{\Gamma_h}$$
Eq. 4•41
where the boundary conditions on the shear force and slope are $V = -\frac{dM}{dx}$ and $\psi = -\frac{dw}{dx}$.
The Euler-Lagrange equations are obtained by setting δJ(w, M)= 0 (where δw = εw vw and δM = εM vM)
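$$\begin{aligned}
\delta_w J_M &= \varepsilon_w \left[ \int_0^L \left( \frac{dv_w}{dx} \frac{dM}{dx} + v_w f \right) dx - v_w \frac{dM}{dx} \bigg|_{\Gamma_h} \right] = 0 \\
\delta_M J_M &= \varepsilon_M \left[ \int_0^L \left( \frac{dv_M}{dx} \frac{dw}{dx} + v_M \frac{M}{EI} \right) dx - v_M \frac{dw}{dx} \bigg|_{\Gamma_h} \right] = 0
\end{aligned}$$
Eq. 4•42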
For the Bubnov-Galerkin method we use interpolation functions φ ew for both w and vw, and interpolation functions φ eM for both M and vM. In matrix form, the finite element formulation of Eq. 4•42 is (dropping εw and εM)
1. p. 310 in T.J.R. Hughes, 1987, “The finite element method: Linear static and dynamic finite element analysis”, Prentice-
Hall, inc., Englewood cliffs, New Jersey.
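$$\begin{bmatrix}
0 & \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \frac{d\phi_e^M}{dx}\, dx \\
\int_{\Omega_e} \frac{d\phi_e^M}{dx} \otimes \frac{d\phi_e^w}{dx}\, dx & \int_{\Omega_e} \frac{\phi_e^M \otimes \phi_e^M}{EI}\, dx
\end{bmatrix}
\begin{bmatrix} \hat{w}_e \\ \hat{M}_e \end{bmatrix}
=
\begin{bmatrix}
-\int_{\Omega_e} \phi_e^w f\, dx - \phi_e^w V \big|_{\Gamma_h} \\
-\phi_e^M \psi \big|_{\Gamma_h}
\end{bmatrix}$$
Eq. 4•43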
The natural boundary condition specified through V is hard-wired in “fe.lib” to be taken care of automatically in “Matrix_Representation::assembly()”, where the left-hand-side is assumed to be a positive term, in contrast to the left-hand-side of Eq. 4•43. We could choose the opposite sign convention for the boundary condition, as we did for the bending moment boundary condition in the irreducible formulation. The disadvantage is that this puts the burden on the user to specify the program correctly, which may often cause serious confusion. Therefore, we prefer to make the sign of Eq. 4•43 consistent with what is done in “assembly()” by changing sign as
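$$\begin{bmatrix}
0 & -\int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \frac{d\phi_e^M}{dx}\, dx \\
-\int_{\Omega_e} \frac{d\phi_e^M}{dx} \otimes \frac{d\phi_e^w}{dx}\, dx & -\int_{\Omega_e} \frac{\phi_e^M \otimes \phi_e^M}{EI}\, dx
\end{bmatrix}
\begin{bmatrix} \hat{w}_e \\ \hat{M}_e \end{bmatrix}
=
\begin{bmatrix}
\int_{\Omega_e} \phi_e^w f\, dx + \phi_e^w V \big|_{\Gamma_h} \\
\phi_e^M \psi \big|_{\Gamma_h}
\end{bmatrix}$$
Eq. 4•44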
Program Listing 4•5 implements the beam bending problem subject to the boundary conditions in Eq. 4•31, using Eq. 4•44. In finite element convention, the degrees of freedom for a node are packed together. We can re-arrange the degrees of freedom, for every node, so that the essential boundary conditions are {w, M}T and the natural boundary conditions are {V, ψ}T. Eq. 4•44 becomes
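$$\begin{bmatrix}
0 & a_{00} & 0 & a_{01} \\
a_{00}^T & b_{00} & a_{01}^T & b_{01} \\
0 & a_{10} & 0 & a_{11} \\
a_{10}^T & b_{10} & a_{11}^T & b_{11}
\end{bmatrix}
\begin{bmatrix} \hat{w}_0 \\ \hat{M}_0 \\ \hat{w}_1 \\ \hat{M}_1 \end{bmatrix}
=
\begin{bmatrix} f_0 \\ r_0 \\ f_1 \\ r_1 \end{bmatrix}$$
Eq. 4•45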
where subscripts indicate the element nodal number and each component in the matrix or vectors is defined as
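$$a_{ij} = -\int_{\Omega_e} \frac{d\phi_i^w}{dx} \frac{d\phi_j^M}{dx}\, dx, \quad
b_{ij} = -\int_{\Omega_e} \frac{\phi_i^M \phi_j^M}{EI}\, dx, \quad
f_i = \int_{\Omega_e} \phi_i^w f\, dx + \phi_i^w V \big|_{\Gamma_h}, \quad \text{and} \quad
r_i = \phi_i^M \psi \big|_{\Gamma_h}$$
Eq. 4•46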
The submatrix/subvector component access through either the continuous block selector “operator ()(int, int)” or the regular increment selector “operator[](int)” in the VectorSpace C++ Library makes coding in the form of either Eq. 4•44 or Eq. 4•45 equally convenient.
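The node-wise packing of Eq. 4•45 can be illustrated without the library. The sketch below, in plain standard C++, fills the 4 × 4 element matrix entry by entry from the definitions of a_ij and b_ij in Eq. 4•46 for a two-node element with linear shape functions. The function name mixed_element_matrix is hypothetical, and the two-point Gauss rule is an assumption about what a two-point quadrature amounts to here.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Matrix4 = std::array<std::array<double, 4>, 4>;

// Element matrix of Eq. 4•45 for a two-node element of length h, using the same
// linear shape functions for w and M. Unknowns are ordered {w0, M0, w1, M1}.
// (hypothetical helper, not part of the VectorSpace library)
Matrix4 mixed_element_matrix(double EI, double h) {
    assert(EI > 0.0 && h > 0.0);
    const double g = 1.0 / std::sqrt(3.0);
    const double zeta[2] = { -g, g };              // Gauss points, unit weights
    const double jac = 0.5 * h;                    // dx/dzeta
    const double dN[2] = { -1.0 / h, 1.0 / h };    // d(phi)/dx, constant per element
    Matrix4 k{};                                   // zero-initialized
    for (int q = 0; q < 2; ++q) {
        const double N[2] = { 0.5 * (1.0 - zeta[q]), 0.5 * (1.0 + zeta[q]) };
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) {
                const double a = -dN[i] * dN[j] * jac;       // contribution to a_ij
                const double b = -N[i] * N[j] / EI * jac;    // contribution to b_ij
                k[2*i][2*j + 1]     += a;   // row w_i, column M_j
                k[2*j + 1][2*i]     += a;   // row M_j, column w_i (transposed block)
                k[2*i + 1][2*j + 1] += b;   // row M_i, column M_j
            }
    }
    return k;
}
```

For EI = 1 and h = 0.25 this gives a_00 = -1/h = -4, a_01 = 4, b_00 = -h/(3EI) and b_01 = -h/(6EI), while every (w_i, w_j) entry stays zero, as in the first and third rows of Eq. 4•45.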
#include "include\fe.h"
static const int node_no = 5; static const int element_no = node_no-1;
static const int spatial_dim_no = 1; static const double L_ = 1.0;
static const double h_e = L_/((double)(element_no)); static const double E_ = 1.0;
static const double I_ = 1.0; static const double f_0 = 1.0; static const double M_ = 1.0;
Omega_h::Omega_h() {                                       // define discretized global domain
   for(int i = 0; i < node_no; i++) {                      // define nodes
      double v = ((double)i)*h_e;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   int ena[2];
   for(int i = 0; i < element_no; i++) {                   // define elements
      ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem); }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {   // define boundary conditions
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;              // w(0) = 0
   the_gh_array[node_order(node_no-1)](1) = gh_on_Gamma_h::Dirichlet;      // M(L) = M_
   the_gh_array[node_order(node_no-1)][1] = M_;                            // M(L) = 1
}
class Beam_Mixed_Formulation : public Element_Formulation {
   public:
      Beam_Mixed_Formulation(Element_Type_Register a) : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      Beam_Mixed_Formulation(int, Global_Discretization&);
};
Element_Formulation* Beam_Mixed_Formulation::make( int en, Global_Discretization& gd) {
   return new Beam_Mixed_Formulation(en,gd); }
// "Beam_Mixed_Formulation"
Beam_Mixed_Formulation::Beam_Mixed_Formulation(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);
   H1 Z(qp),
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
             "int, int, Quadrature", 2/*nen*/, 1/*nsd*/, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2;          // phi_e^w = phi_e^M = {(1-xi)/2, (1+xi)/2}^T
   H1 X = N*xl;
   H0 Nx = d(N)(0)/d(X); J d_l(d(X));
   stiff &= C0(4, 4, (double*)0); C0 stiff_sub = SUBMATRIX("int, int, C0&", 2, 2, stiff);
   // k_e10 = k_e01 = -integral of d(phi_e^w)/dx (x) d(phi_e^M)/dx dx
   stiff_sub[0][1] = -(Nx * (~Nx)) | d_l; stiff_sub[1][0] = stiff_sub[0][1];
   // k_e11 = -integral of (phi_e^M (x) phi_e^M)/EI dx
   stiff_sub[1][1] = -(1.0/E_/I_)* ( (((H0)N)*(~(H0)N)) | d_l );
   force &= C0(4, (double*)0); C0 force_sub = SUBVECTOR("int, C0&", 2, force);
   force_sub[0] = ( (((H0)N)*f_0) | d_l );  // f_e0 = integral of phi_e^w f dx
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Beam_Mixed_Formulation beam_mixed_instance(element_type_register_instance);
int main() {
   const int ndf = 2;
   // instantiate fixed and free variables and Global_Discretization
   Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   Matrix_Representation mr(gd);
   mr.assembly(); C0 u = ((C0)(mr.rhs()))/((C0)(mr.lhs()));
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h(); cout << gd.u_h();
   return 0;
}
Listing 4•5 Beam-bending problem mixed formulation using linear line element (project:
“beam_mixed_formulation” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
[Figure: w and M plotted against x]
Figure 4•19 Transverse deflection “w” and bending moment “M” from mixed formulation. The dashed line segments with open squares are finite element solutions, and the solid curves are the exact solutions.
The results are shown in Figure 4•19. The solutions at the nodal points match the exact solutions of the trans-
verse deflection and the bending moment. That is,
static const int node_no = 3; static const int element_no = 2; static const int spatial_dim_no = 1;
static const double L_ = 360.0; static const double E_ = 24.0e6; static const double I_ = 144.0;
Omega_h::Omega_h() {
   double v = 0.0; Node* node = new Node(0, spatial_dim_no, &v); the_node_array.add(node);
   v = 120.0; node = new Node(1, spatial_dim_no, &v); the_node_array.add(node);
   v = 360.0; node = new Node(2, spatial_dim_no, &v); the_node_array.add(node);
   int ena[2];
   for(int i = 0; i < element_no; i++) {
      ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena);
      the_omega_eh_array.add(elem);
   }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;   // w(0) = 0
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;   // M(0) = 0
   the_gh_array[node_order(1)](0) = gh_on_Gamma_h::Neumann;     // V(120) = -1.0; shear force
   the_gh_array[node_order(1)][0] = -1.0;
   the_gh_array[node_order(2)](0) = gh_on_Gamma_h::Dirichlet;   // w(360) = 0
   the_gh_array[node_order(2)](1) = gh_on_Gamma_h::Dirichlet;   // M(360) = 0
}
[Figure: w and M plotted against x]
Figure 4•20 Transverse deflection w and bending moment M for the nodal loading problem using linear interpolation functions for both w and M.
For the element force vector we can either set f_0 = 0 or just comment out the corresponding statement for effi-
ciency. The result of the nodal loading case is shown in Figure 4•20. The bending moment solution is exact for
this case.
The problem definition in C++ code for the distributed loading case is
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 180.0; static const double element_size = L_/((double)(element_no));
static const double E_ = 29.0e6; static const double I_ = 723.0; static const double f_0 = -1.0;
Omega_h::Omega_h() {
   for(int i = 0; i < node_no; i++) {
      double v = ((double)i)*element_size;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node);
   }
   int ena[2];
   for(int i = 0; i < element_no; i++) {
      ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem);
   }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;           // w(0) = 0
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;           // M(0) = 0
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;   // w(L) = 0
   the_gh_array[node_order(node_no-1)](1) = gh_on_Gamma_h::Neumann;     // dw/dx(L) = 0
}

// element force vector for the linearly distributed load f(x) = f_0 x/L (Eq. 4•39)
H0 f = (f_0/L_)*((H0)X);
force &= C0(4, (double*)0);
C0 force_sub = SUBVECTOR("int, C0&", 2, force);
force_sub[0] = ( (((H0)N)*f) | d_l );
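The force vector computed in the last four lines above can be reproduced by hand for one element. The following is a minimal sketch in plain standard C++, with hypothetical names load and element_force, that integrates φ_i f over a two-node element with two-point Gauss quadrature and the linearly varying load of Eq. 4•39. It assumes that the isoparametric map x = N·xl and the Jacobian dx/dζ = h/2 correspond to what the listing builds with X = N*xl and d_l.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Linearly increasing load of Eq. 4•39 (hypothetical helper).
double load(double f0, double L, double x) { return f0 * x / L; }

// f_e[i] = integral over [x0, x1] of phi_i(x) f(x) dx for the linear shape functions
// phi = {(1 - zeta)/2, (1 + zeta)/2}, using two-point Gauss quadrature.
std::array<double, 2> element_force(double x0, double x1, double f0, double L) {
    assert(x1 > x0 && L > 0.0);
    const double g = 1.0 / std::sqrt(3.0);
    const double zeta[2] = { -g, g };              // Gauss points, unit weights
    const double jac = 0.5 * (x1 - x0);            // dx/dzeta
    std::array<double, 2> fe{};                    // zero-initialized
    for (int q = 0; q < 2; ++q) {
        const double n0 = 0.5 * (1.0 - zeta[q]);
        const double n1 = 0.5 * (1.0 + zeta[q]);
        const double x  = n0 * x0 + n1 * x1;       // isoparametric map
        fe[0] += n0 * load(f0, L, x) * jac;
        fe[1] += n1 * load(f0, L, x) * jac;
    }
    return fe;
}
```

For the second element of the four-element mesh, element_force(45.0, 90.0, -1.0, 180.0) returns {-7.5, -9.375}; the two entries sum to -16.875, the integral of the load over that element. The integrand is quadratic, so the two-point rule is exact.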
[Figure: w and M plotted against x]
In the irreducible formulation, the higher-order derivatives require interpolation with the abstruse cubic Hermite functions. In the mixed formulation this requirement is relaxed. However, both the irreducible and the mixed formulation require one more variable (-dw/dx and M, respectively) to be solved together with w. This increases the number of degrees of freedom in the matrix solution process, which can be disadvantageous for a large-size problem.
$$J(w) = \int_{\Omega} \left[ \frac{EI}{2} \left( \frac{d^2 w}{dx^2} \right)^2 - f w \right] dx - w V \big|_{\Gamma_h} - \frac{dw}{dx} M \big|_{\Gamma_h}$$
Eq. 4•48
Now, in the context of the constrained optimization discussed in Chapter 2, we define the constraint equation for the negative slope ψ as
$$C(\psi, w) \equiv \psi + \frac{dw}{dx} = 0$$
Eq. 4•49
Substituting $\psi = -\frac{dw}{dx}$ into Eq. 4•48, we have
$$J(\psi, w) = \int_{\Omega} \left[ \frac{EI}{2} \left( \frac{d\psi}{dx} \right)^2 - f w \right] dx - w V \big|_{\Gamma_h} + \psi M \big|_{\Gamma_h}$$
Eq. 4•50
The minimization of Eq. 4•50 subject to the constraint of Eq. 4•49, using the Lagrange multiplier method (with the Lagrange multiplier λ), leads to the Lagrangian functional in the form of Eq. 2•11 of Chapter 2 on page 118 as
$$\mathcal{L}(\psi, w, \lambda) \equiv J(\psi, w) + \lambda\, C(\psi, w) = \int_{\Omega} \left[ \frac{EI}{2} \left( \frac{d\psi}{dx} \right)^2 - f w \right] dx + \int_{\Omega} \lambda \left( \psi + \frac{dw}{dx} \right) dx - w V \big|_{\Gamma_h} + \psi M \big|_{\Gamma_h}$$
Eq. 4•51
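The Euler-Lagrange equations are obtained from δL = 0 as (where δψ = εψ vψ, δw = εw vw, and δλ = ελ vλ)
$$\begin{aligned}
\delta_\psi \mathcal{L} &= \varepsilon_\psi \left[ \int_{\Omega} EI \frac{dv_\psi}{dx} \frac{d\psi}{dx}\, dx + \int_{\Omega} v_\psi \lambda\, dx + v_\psi M \big|_{\Gamma_h} \right] = 0 \\
\delta_w \mathcal{L} &= \varepsilon_w \left[ -\int_{\Omega} v_w f\, dx + \int_{\Omega} \frac{dv_w}{dx} \lambda\, dx - v_w V \big|_{\Gamma_h} \right] = 0 \\
\delta_\lambda \mathcal{L} &= \varepsilon_\lambda \int_{\Omega} v_\lambda \left( \psi + \frac{dw}{dx} \right) dx = 0
\end{aligned}$$
Eq. 4•52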
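Dropping the arbitrary constants εψ, εw, and ελ and using interpolation functions for each of the variables {ψ, w, λ}T, we have, in matrix form, the finite element formulation
$$\begin{bmatrix}
EI \int_{\Omega_e} \frac{d\phi_e^\psi}{dx} \otimes \frac{d\phi_e^\psi}{dx}\, dx & 0 & \int_{\Omega_e} \phi_e^\psi \otimes \phi_e^\lambda\, dx \\
0 & 0 & \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \phi_e^\lambda\, dx \\
\int_{\Omega_e} \phi_e^\lambda \otimes \phi_e^\psi\, dx & \int_{\Omega_e} \phi_e^\lambda \otimes \frac{d\phi_e^w}{dx}\, dx & 0
\end{bmatrix}
\begin{bmatrix} \hat{\psi}_e \\ \hat{w}_e \\ \hat{\lambda}_e \end{bmatrix}
=
\begin{bmatrix}
-\phi_e^\psi M \big|_{\Gamma_h} \\
\int_{\Omega_e} \phi_e^w f\, dx + \phi_e^w V \big|_{\Gamma_h} \\
0
\end{bmatrix}$$
Eq. 4•53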
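Again, the bending moment boundary condition that appears on the right-hand-side of the first equation is negative. This conflicts with the positive nodal loading input on the right-hand-side assumed in the implementation of “Matrix_Representation::assembly()”. In order to keep the convention of counter-clockwise rotation as positive, we can change the sign of the first row of Eq. 4•53 as
$$\begin{bmatrix}
-EI \int_{\Omega_e} \frac{d\phi_e^\psi}{dx} \otimes \frac{d\phi_e^\psi}{dx}\, dx & 0 & -\int_{\Omega_e} \phi_e^\psi \otimes \phi_e^\lambda\, dx \\
0 & 0 & \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \phi_e^\lambda\, dx \\
\int_{\Omega_e} \phi_e^\lambda \otimes \phi_e^\psi\, dx & \int_{\Omega_e} \phi_e^\lambda \otimes \frac{d\phi_e^w}{dx}\, dx & 0
\end{bmatrix}
\begin{bmatrix} \hat{\psi}_e \\ \hat{w}_e \\ \hat{\lambda}_e \end{bmatrix}
=
\begin{bmatrix}
\phi_e^\psi M \big|_{\Gamma_h} \\
\int_{\Omega_e} \phi_e^w f\, dx + \phi_e^w V \big|_{\Gamma_h} \\
0
\end{bmatrix}$$
Eq. 4•54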
Again, the degrees of freedom for each node can be packed together just as in Eq. 4•45. With the aid of the regular increment selector “operator[](int)”, Eq. 4•54 is sufficiently clear without rewriting it in the form of Eq. 4•45. Program Listing 4•6 implements Eq. 4•54 with linear interpolation functions $\{\phi_e^\psi, \phi_e^w, \phi_e^\lambda\}^T$ for all three variables. The essential boundary conditions are {ψ, w, λ}T, and the natural boundary conditions are {M, V, 0}T. The results are shown in Figure 4•22, where they are compared with the exact solutions
$$\begin{aligned}
w(x) &= \frac{2M + fL^2}{4EI} x^2 - \frac{fL}{6EI} x^3 + \frac{f}{24EI} x^4 \\
\psi(x) &= -\frac{2M + fL^2}{2EI} x + \frac{fL}{2EI} x^2 - \frac{f}{6EI} x^3 \\
\lambda(x) &= f (L - x)
\end{aligned}$$
Eq. 4•55
ψ and λ are obtained from their corresponding definitions by differentiating the exact solution of w(x) in the first line. The shear force solution, the Lagrange multiplier λ per se, coincides with the exact solution.
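The exact solutions of Eq. 4•55 are convenient to have as functions when comparing against nodal output. The following is a minimal sketch in plain standard C++; the struct BeamCase and the three function names are hypothetical and only transcribe Eq. 4•55 for given M, f, L and EI.

```cpp
#include <cassert>
#include <cmath>

// Data of the end-moment problem: end moment M, uniform load f, length L, rigidity EI.
// (hypothetical helper type)
struct BeamCase { double M, f, L, EI; };

// Transverse deflection w(x), first line of Eq. 4•55.
double exact_w(const BeamCase& c, double x) {
    assert(c.EI > 0.0);
    return (2.0*c.M + c.f*c.L*c.L) / (4.0*c.EI) * x*x
         - c.f*c.L / (6.0*c.EI) * x*x*x
         + c.f / (24.0*c.EI) * x*x*x*x;
}

// Negative slope psi(x) = -dw/dx, second line of Eq. 4•55.
double exact_psi(const BeamCase& c, double x) {
    assert(c.EI > 0.0);
    return -(2.0*c.M + c.f*c.L*c.L) / (2.0*c.EI) * x
         + c.f*c.L / (2.0*c.EI) * x*x
         - c.f / (6.0*c.EI) * x*x*x;
}

// Lagrange multiplier lambda(x) = f (L - x), third line of Eq. 4•55.
double exact_lambda(const BeamCase& c, double x) {
    return c.f * (c.L - x);
}
```

With the data of Listing 4•6 (M = f = L = EI = 1), exact_w gives 0.625 at x = 1, exact_psi gives -7/6 at x = 1, and exact_lambda falls linearly from 1 at x = 0 to 0 at x = 1. A central difference of exact_w reproduces -exact_psi, which confirms that the second line is the derivative of the first.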
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 1.0; static const double h_e = L_/((double)(element_no));
static const double E_ = 1.0; static const double I_ = 1.0; static const double f_0 = 1.0;
static const double M_ = 1.0;
Omega_h::Omega_h() {                                       // define discretized global domain
   for( int i = 0; i < node_no; i++) {                     // define nodes
      double v = ((double)i)*h_e;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   for( int i = 0; i < element_no; i++) {                  // define elements
      int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem); }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {   // define boundary conditions
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;           // psi(0) = -dw/dx(0) = 0
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;           // w(0) = 0
   the_gh_array[node_order(node_no-1)](2) = gh_on_Gamma_h::Dirichlet;   // lambda(L) = 0;
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Neumann;     // M(L) = M_
   the_gh_array[node_order(node_no-1)][0] = M_;            // end bending moment M(L) = 1
}
class Beam_Lagrange_Multiplier_Formulation : public Element_Formulation {
   public:
      Beam_Lagrange_Multiplier_Formulation(Element_Type_Register a)
         : Element_Formulation(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      Beam_Lagrange_Multiplier_Formulation(int, Global_Discretization&);
};
Element_Formulation* Beam_Lagrange_Multiplier_Formulation::make(int en,
   Global_Discretization& gd) { return new Beam_Lagrange_Multiplier_Formulation(en,gd); }
// "Beam_Lagrange_Multiplier_Formulation"
Beam_Lagrange_Multiplier_Formulation::Beam_Lagrange_Multiplier_Formulation(int en,
   Global_Discretization& gd) : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);
   H1 Z(qp), N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
      "int, int, Quadrature", 2/*nen*/, 1/*nsd*/, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2;   // phi_e^psi = phi_e^w = phi_e^lambda = {(1-xi)/2, (1+xi)/2}^T
   H1 X = N*xl; H0 Nx = d(N)(0)/d(X); J d_l(d(X));
   stiff &= C0(6, 6, (double*)0); C0 stiff_sub = SUBMATRIX("int, int, C0&", 3, 3, stiff);
   // k_e00 = -integral of EI d(phi_e^psi)/dx (x) d(phi_e^psi)/dx dx
   stiff_sub[0][0] = -((E_*I_) * Nx * (~Nx)) | d_l;
   // k_e02 = -(k_e20)^T = -integral of phi_e^psi (x) phi_e^lambda dx
   stiff_sub[0][2] = -(((H0)N) % ((H0)N)) | d_l;
   stiff_sub[2][0] = -( ~stiff_sub[0][2] );
   // k_e12 = (k_e21)^T = integral of d(phi_e^w)/dx (x) phi_e^lambda dx
   stiff_sub[1][2] = (Nx % ((H0)N)) | d_l;
   stiff_sub[2][1] = ~stiff_sub[1][2];
   force &= C0(6, (double*)0); C0 force_sub = SUBVECTOR("int, C0&", 3, force);
   force_sub[1] = (((H0)N)*f_0) | d_l;                     // f_e1 = integral of phi_e^w f dx
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Beam_Lagrange_Multiplier_Formulation lagrange(element_type_register_instance);
int main() {
   // instantiate fixed and free variables and Global_Discretization
   const int ndf = 3; Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   Matrix_Representation mr(gd);
   mr.assembly(); C0 u = ((C0)(mr.rhs()))/((C0)(mr.lhs()));
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h(); cout << gd.u_h(); return 0;
}
Listing 4•6 Beam-bending problem Lagrange multiplier formulation using linear line element (project:
“beam_lagrange_multiplier” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
[Figure: ψ, w, and λ plotted against x]
Figure 4•22 Lagrange multiplier formulation for beam bending problem using linear interpolation function for all three variables.
The problem definitions for the nodal load case can be coded as follows
static const int node_no = 4; static const int element_no = node_no-1; static const int spatial_dim_no = 1;
static const double L_ = 360.0; static const double E_ = 24.0e6; static const double I_ = 144.0;
static const double P_ = 1.0;
Omega_h::Omega_h() {
   double v = 0.0; Node* node = new Node(0, spatial_dim_no, &v); the_node_array.add(node);
   v = 120.0; node = new Node(1, spatial_dim_no, &v); the_node_array.add(node);
   v = 240.0; node = new Node(2, spatial_dim_no, &v); the_node_array.add(node);
   v = 360.0; node = new Node(3, spatial_dim_no, &v); the_node_array.add(node);
   for(int i = 0; i < element_no; i++) {
      int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem);
   }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;           // w(0) = 0
   the_gh_array[node_order(1)](1) = gh_on_Gamma_h::Neumann;             // f(120) = - P; shear force
   the_gh_array[node_order(1)][1] = -P_;
   the_gh_array[node_order(node_no-1)](1) = gh_on_Gamma_h::Dirichlet;   // w(360) = 0
}
Again, we can just comment out the element force vector computation in the constructor of class Beam_Lagrange_Multiplier_Formulation for efficiency. The results are shown in Figure 4•23. The solution for this boundary condition case is not acceptable. The exact shear force is constant within each element, while we use linear interpolation functions for the shear force; the problem is overly constrained. On the other hand, the slope and transverse deflection require interpolation functions of higher order than linear. The choice of the order of the interpolation functions and of the number of nodes per variable per element needed to obtain a meaningful result depends on the so-called LBB-condition in the finite element method, which we will discuss in detail in Section 4.4.
[Figure: ψ, w, and λ plotted against x, together with the exact solution]
Figure 4•23 The Lagrange multiplier method with all three variables interpolated using linear elements for the nodal load problem does not produce a satisfactory result.
The distributed load case is defined as
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 180.0; static const double element_size = L_/((double)(element_no));
static const double E_ = 29.0e6; static const double I_ = 723.0; static const double f_0 = -1.0;
Omega_h::Omega_h() {
   for(int i = 0; i < node_no; i++) {
      double v = ((double)i)*element_size;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node);
   }
   for(int i = 0; i < element_no; i++) {
      int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem);
   }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;             // M(0) = 0
   the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;           // w(0) = 0
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;   // psi(L) = -dw/dx(L) = 0
   the_gh_array[node_order(node_no-1)](1) = gh_on_Gamma_h::Dirichlet;   // w(L) = 0
}

// element force vector for the linearly distributed load f(x) = f_0 x/L (Eq. 4•39)
H0 f = (f_0/L_)*((H0)X);
force &= C0(6, (double*)0); C0 force_sub = SUBVECTOR("int, C0&", 3, force);
force_sub[1] = (((H0)N)*f) | d_l;
Figure 4•24 The results of the distributed loading case using Lagrange multiplier formulation for the beam bending problem.
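$$\mathcal{L}_p(\psi, w; \rho) \equiv J(\psi, w) + \frac{\rho}{2} C^2(\psi, w) = \int_{\Omega} \left[ \frac{EI}{2} \left( \frac{d\psi}{dx} \right)^2 - f w \right] dx + \frac{\rho}{2} \int_{\Omega} \left( \psi + \frac{dw}{dx} \right)^2 dx - w V \big|_{\Gamma_h} + \psi M \big|_{\Gamma_h}$$
Eq. 4•56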
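where the popular quadratic form of the penalty function is taken. The Euler-Lagrange equations obtained from setting $\delta \mathcal{L}_p = 0$ are (where δψ = εψ vψ, δw = εw vw)
$$\begin{aligned}
\delta_\psi \mathcal{L}_p &= \varepsilon_\psi \left[ \int_{\Omega} EI \frac{dv_\psi}{dx} \frac{d\psi}{dx}\, dx + \rho \int_{\Omega} v_\psi \left( \psi + \frac{dw}{dx} \right) dx + v_\psi M \big|_{\Gamma_h} \right] = 0 \\
\delta_w \mathcal{L}_p &= \varepsilon_w \left[ -\int_{\Omega} v_w f\, dx + \rho \int_{\Omega} \frac{dv_w}{dx} \left( \psi + \frac{dw}{dx} \right) dx - v_w V \big|_{\Gamma_h} \right] = 0
\end{aligned}$$
Eq. 4•57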
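Dropping the arbitrary constants εψ and εw and substituting interpolation functions for {ψ, w} and {vψ, vw}, the Euler-Lagrange equations, Eq. 4•57, are re-written for the element formulation in matrix form as
$$\begin{bmatrix}
EI \int_{\Omega_e} \frac{d\phi_e^\psi}{dx} \otimes \frac{d\phi_e^\psi}{dx}\, dx + \rho \int_{\Omega_e} \phi_e^\psi \otimes \phi_e^\psi\, dx & \rho \int_{\Omega_e} \phi_e^\psi \otimes \frac{d\phi_e^w}{dx}\, dx \\
\rho \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \phi_e^\psi\, dx & \rho \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \frac{d\phi_e^w}{dx}\, dx
\end{bmatrix}
\begin{bmatrix} \hat{\psi}_e \\ \hat{w}_e \end{bmatrix}
=
\begin{bmatrix}
-\phi_e^\psi M \big|_{\Gamma_h} \\
\int_{\Omega_e} \phi_e^w f\, dx + \phi_e^w V \big|_{\Gamma_h}
\end{bmatrix}$$
Eq. 4•58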
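Changing the sign of the first equation to keep the right-hand-side positive, we have
$$\begin{bmatrix}
-EI \int_{\Omega_e} \frac{d\phi_e^\psi}{dx} \otimes \frac{d\phi_e^\psi}{dx}\, dx - \rho \int_{\Omega_e} \phi_e^\psi \otimes \phi_e^\psi\, dx & -\rho \int_{\Omega_e} \phi_e^\psi \otimes \frac{d\phi_e^w}{dx}\, dx \\
\rho \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \phi_e^\psi\, dx & \rho \int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \frac{d\phi_e^w}{dx}\, dx
\end{bmatrix}
\begin{bmatrix} \hat{\psi}_e \\ \hat{w}_e \end{bmatrix}
=
\begin{bmatrix}
\phi_e^\psi M \big|_{\Gamma_h} \\
\int_{\Omega_e} \phi_e^w f\, dx + \phi_e^w V \big|_{\Gamma_h}
\end{bmatrix}$$
Eq. 4•59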
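The effect of the sign change in Eq. 4•59 is easiest to see on a single element. The sketch below, in plain standard C++, fills the 4 × 4 element matrix for linear shape functions with the unknowns packed node by node as {ψ0, w0, ψ1, w1}. The function name penalty_element_matrix is hypothetical, and the two-point Gauss rule is an assumption rather than a statement about the library's quadrature.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Matrix4 = std::array<std::array<double, 4>, 4>;

// Element matrix of Eq. 4•59 for a two-node element of length h with linear shape
// functions for both psi and w. Unknowns are ordered {psi0, w0, psi1, w1}.
// (hypothetical helper, not part of the VectorSpace library)
Matrix4 penalty_element_matrix(double EI, double rho, double h) {
    assert(EI > 0.0 && rho > 0.0 && h > 0.0);
    const double g = 1.0 / std::sqrt(3.0);
    const double zeta[2] = { -g, g };              // Gauss points, unit weights
    const double jac = 0.5 * h;                    // dx/dzeta
    const double dN[2] = { -1.0 / h, 1.0 / h };    // d(phi)/dx
    Matrix4 k{};                                   // zero-initialized
    for (int q = 0; q < 2; ++q) {
        const double N[2] = { 0.5 * (1.0 - zeta[q]), 0.5 * (1.0 + zeta[q]) };
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) {
                // first (psi) row of Eq. 4•59, with its sign changed
                k[2*i][2*j]     += -(EI * dN[i] * dN[j] + rho * N[i] * N[j]) * jac;
                k[2*i][2*j + 1] += -rho * N[i] * dN[j] * jac;
                // second (w) row of Eq. 4•59, unchanged
                k[2*i + 1][2*j]     += rho * dN[i] * N[j] * jac;
                k[2*i + 1][2*j + 1] += rho * dN[i] * dN[j] * jac;
            }
    }
    return k;
}
```

For EI = 1, ρ = 10 and h = 0.25 the (ψ0, w0) entry is 5 while the (w0, ψ0) entry is -5. Negating only the first row therefore makes the off-diagonal blocks skew rather than symmetric, so a solver that relies on symmetry cannot be applied to this form.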
As discussed in the sub-section “Penalty Methods” on page 153 in Chapter 2, the penalty parameter ρ should initially be set to a small number and then gradually increased in subsequent iterations. Starting out with a small ρ means that we put more weight on the minimization of the objective functional (for this problem, the minimum energy principle in mechanics). Subsequently increasing the penalty parameter enforces the constraint
[Figure: ψ and w plotted against x]
Figure 4•25 The solutions of end-bending moment case with penalty formulation.
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
static const double L_ = 1.0; static const double h_e = L_/((double)(element_no));
static const double E_ = 1.0; static const double I_ = 1.0; static const double f_0 = 1.0;
static const double M_ = 1.0; static double k_ = 1.0;
// define discretized global domain
Omega_h::Omega_h() {
   // define nodes
   for(int i = 0; i < node_no; i++) { double v = ((double)i)*h_e;
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   // define elements
   for(int i = 0; i < element_no; i++) { int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem); } }
// define boundary conditions
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {__initialization(df, omega_h);
   the_gh_array[node_order(0)](0)=the_gh_array[node_order(0)](1)=gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)][0] = M_; }                           // M(L) = 1
class Beam_Penalty_Function_Formulation : public Element_Formulation { public:
   Beam_Penalty_Function_Formulation(Element_Type_Register a) : Element_Formulation(a) {}
   Element_Formulation *make( int, Global_Discretization&);
   Beam_Penalty_Function_Formulation( int, Global_Discretization&); };
Element_Formulation* Beam_Penalty_Function_Formulation::make(int en,
   Global_Discretization& gd) { return new Beam_Penalty_Function_Formulation(en,gd); }
Beam_Penalty_Function_Formulation::Beam_Penalty_Function_Formulation( int en,
   Global_Discretization& gd) : Element_Formulation(en, gd) {
   Quadrature qp(spatial_dim_no, 2);
   // φ_e^ψ = φ_e^w = {(1-ξ)/2, (1+ξ)/2}^T
   H1 Z(qp), N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 2/*nen*/, 1/*nsd*/, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2; H1 X = N*xl; H0 Nx = d(N)(0)/d(X); J d_l(d(X));
   stiff &= C0(4, 4, (double*)0); C0 stiff_sub = SUBMATRIX("int, int, C0&", 2, 2, stiff);
   // k_e^00 = -∫ EI (dφ_e^ψ/dx ⊗ dφ_e^ψ/dx) dx - ρ ∫ (φ_e^ψ ⊗ φ_e^ψ) dx
   stiff_sub[0][0] = -( (E_*I_) * Nx * (~Nx) + k_ * (((H0)N)*(~(H0)N)) ) | d_l;
   // k_e^01 = -(k_e^10)^T = -ρ ∫ (φ_e^ψ ⊗ dφ_e^w/dx) dx
   stiff_sub[0][1] = -k_* ( (((H0)N) * (~Nx)) | d_l ); stiff_sub[1][0] = -(~stiff_sub[0][1]);
   // k_e^11 = ρ ∫ (dφ_e^w/dx ⊗ dφ_e^w/dx) dx
   stiff_sub[1][1] = k_* ( (Nx * (~Nx)) | d_l );
   force &= C0(4, (double*)0); C0 force_sub = SUBVECTOR("int, C0&", 2, force);
   // f_e^1 = ∫ φ_e^w f dx
   force_sub[1] = (((H0)N)*f_0) | d_l;
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Beam_Penalty_Function_Formulation beam_penalty_function_formulation_instance(
   element_type_register_instance);
int main() {
   // instantiate fixed and free variables and Global_Discretization
   const int ndf = 2; Omega_h oh; gh_on_Gamma_h gh(ndf, oh);
   U_h uh(ndf, oh); Global_Discretization gd(oh, gh, uh);
   Matrix_Representation mr(gd);
   C0 w(node_no, (double*)0), w_old(node_no, (double*)0),
      delta_w(node_no, (double*)0), u_optimal;
   double min_energy_norm = 1.e20, k_optimal;
   for( int i = 0; i < 10; i++) {
      mr.assembly(); C0 u = ((C0)(mr.rhs()))/((C0)(mr.lhs()));
      gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();
      for( int j = 0; j < node_no; j++) w[j] = gd.u_h()[j][1];
      delta_w = ((i) ? w-w_old : w); w_old = w;
      // monitor convergence with norm(∆w)
      if((double)norm(delta_w) < min_energy_norm) {
         min_energy_norm = norm(delta_w); u_optimal = u; k_optimal = k_; }
      cout << "penalty parameter: " << k_ << " energy norm: " << norm(delta_w) << endl
           << gd.u_h() << endl; k_ *= 2.0; }
   gd.u_h() = u_optimal; gd.u_h() = gd.gh_on_gamma_h();
   cout << "penalty parameter: " << k_optimal << endl << gd.u_h() << endl; return 0;
}
Listing 4•7 Beam-bending problem with penalty function formulation using linear line element (project:
“beam_penalty_function_formulation” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
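$$ u \frac{d^2 u}{dx^2} + \left( \frac{du}{dx} \right)^2 = 1, \quad 0 < x < 1, \quad \text{with } u'(0) = 0, \text{ and } u(1) = \sqrt{2} $$   Eq. 4•60
$$ \frac{d}{dx} \left( u \frac{du}{dx} \right) = 1, \quad 0 < x < 1, \quad \text{with } u'(0) = 0, \; u(1) = \sqrt{2} $$   Eq. 4•62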
Parallel to the development in Chapter 3, we solve this problem in finite element with (1) Galerkin formulation,
and (2) least squares formulation.
Galerkin Formulation
$$ R(u^h) \equiv \frac{d}{dx} \left( u^h \frac{du^h}{dx} \right) - 1 $$   Eq. 4•63
With the Galerkin weighting vh, which is homogeneous at the boundaries, and uh = vh + uΓg, where uΓg is the
essential boundary condition, the weighted residual statement gives
$$ I(u^h) \equiv \int_0^1 v^h R(u^h) \, dx = \int_0^1 v^h \left[ \frac{d}{dx} \left( u^h \frac{du^h}{dx} \right) - 1 \right] dx = 0 $$   Eq. 4•64
$$ I(u^h) = \int_0^1 \left[ -u^h \frac{dv^h}{dx} \frac{du^h}{dx} - v^h \right] dx = 0 $$   Eq. 4•65
An iterative algorithm is employed for this non-linear problem, with uh interpolated at the element level as
u_e^h ≡ φ_e^i û_e^i, where the “hat” denotes the nodal values.
$$ I(\hat{u}^{k+1}) = I(\hat{u}^k + \delta\hat{u}^k) \cong I(\hat{u}^k) + \left. \frac{\partial I}{\partial \hat{u}} \right|_{\hat{u}^k} \delta\hat{u}^k = 0 $$   Eq. 4•66
where ûᵏ⁺¹ ≡ ûᵏ + δûᵏ. The approximation in this equation is the Taylor expansion truncated after the first-order
derivatives. That is, the increment of the solution δûᵏ can be solved by
$$ \delta\hat{u}^k = \left[ \left. \frac{\partial I}{\partial \hat{u}} \right|_{\hat{u}^k} \right]^{-1} \left[ -I(\hat{u}^k) \right] = \frac{-I(\hat{u}^k)}{I_T} $$   Eq. 4•67
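$$ I_T \equiv \left. \frac{\partial I}{\partial \hat{u}} \right|_{\hat{u}^k} = \mathop{\mathrm{A}}_{\forall e} \left[ -\int_{\Omega_e} \frac{d(\hat{v}^i \phi_e^i)}{dx} \left( \phi_e^j \frac{du_e^k}{dx} + u_e^k \frac{d\phi_e^j}{dx} \right) dx \right] = \hat{v} \, \mathop{\mathrm{A}}_{\forall e} \left[ -\int_{\Omega_e} \frac{d\phi_e}{dx} \otimes \left( \phi_e \frac{du_e^k}{dx} + u_e^k \frac{d\phi_e}{dx} \right) dx \right] $$   Eq. 4•68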
and,
$$ I(\hat{u}^k) = \hat{v} \, \mathop{\mathrm{A}}_{\forall e} \int_{\Omega_e} \left( -u_e^k \frac{d\phi_e}{dx} \frac{du_e^k}{dx} - \phi_e \right) dx $$   Eq. 4•69
v̂ is an arbitrary constant global nodal vector that appears in both the numerator and the denominator of Eq.
4•67. Therefore, it can be dropped. We define the element tangent stiffness matrix and element residual vector as
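$$ k_e^T \equiv \int_{\Omega_e} -\frac{d\phi_e}{dx} \otimes \left( \phi_e \frac{du_e^k}{dx} + \frac{d\phi_e}{dx} u_e^k \right) dx, \quad \text{and} \quad r_e \equiv \int_{\Omega_e} \left( u_e^k \frac{d\phi_e}{dx} \frac{du_e^k}{dx} + \phi_e \right) dx $$   Eq. 4•70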
The Program Listing 4•8 implements the element formulation in Eq. 4•70, then uses an iterative algorithm to solve
for the increment of the solution (δuh)k with Eq. 4•67. Initial values of zero, u0 = 0, would lead to a singular
left-hand-side matrix; therefore, the initial values are set to unity, u0 = 1.0. At the element level the nodal values
of ue are supplied by a private member function __initialization(int) of class Non_Linear_ODE_Quadratic as “ul”
Line 3 in the above assigns the nodal free degree of freedom values plus the nodal fixed degree of freedom values to
“ul”. The values of ue itself can be computed at the element level as
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 2; static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h() {
   // define nodes
   for(int i = 0; i < node_no; i++) {
      double v; v = ((double)i)/((double)(node_no-1));
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   // define elements
   for(int i = 0; i < element_no; i++) {
      int ena[3]; ena[0] = i*2; ena[1] = ena[0]+1; ena[2] = ena[0]+2;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 3, ena); the_omega_eh_array.add(elem); }
}
// define boundary conditions: du/dx(0) = 0, u(1) = √2
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
   __initialization(df, omega_h);
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)][0] = sqrt(2.0); }
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 1; static Omega_h oh; static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh); static Global_Discretization gd(oh, gh, uh);
class Non_Linear_ODE_Quadratic : public Element_Formulation {
   C0 ul; void __initialization(int);
 public:
   Non_Linear_ODE_Quadratic(Element_Type_Register a) : Element_Formulation(a) {}
   Element_Formulation *make(int, Global_Discretization&);
   Non_Linear_ODE_Quadratic(int, Global_Discretization&); };
static int initial_newton_flag;
void Non_Linear_ODE_Quadratic::__initialization(int en) {
   ul &= gd.element_free_variable(en) + gd.element_fixed_variable(en);
   if(!initial_newton_flag) gl = 0.0; }
Element_Formulation* Non_Linear_ODE_Quadratic::make(int en, Global_Discretization& gd) {
   return new Non_Linear_ODE_Quadratic(en,gd); }
Non_Linear_ODE_Quadratic::Non_Linear_ODE_Quadratic(int en, Global_Discretization& gd)
   : Element_Formulation(en, gd) {
   __initialization(en); Quadrature qp(spatial_dim_no, 3);
   H1 Z(qp), N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
         "int, int, Quadrature", 3/*nen*/, 1/*nsd*/, qp);
   N[0] = -Z*(1-Z)/2; N[1] = (1-Z)*(1+Z); N[2] = Z*(1+Z)/2;
   H1 X = N*xl; J d_l(d(X)); H0 Nx = d(N)(0)/d(X); H1 U = N*ul; H0 Ux = d(U)/d(X);
   // k_e^T ≡ ∫ -dφ_e/dx ⊗ (φ_e du_e^k/dx + dφ_e/dx u_e^k) dx
   stiff &= -(Nx * ~( ((H0)N)*Ux + Nx * ((H0)U) ) ) | d_l;
   // r_e ≡ ∫ (u_e^k dφ_e/dx du_e^k/dx + φ_e) dx
   force &= ( ((H0)U) * Nx * Ux + ((H0)N) ) | d_l; }
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Non_Linear_ODE_Quadratic non_linear_ode_quadratic_instance
   (element_type_register_instance);
static Matrix_Representation mr(gd); static const double EPSILON = 1.e-12;
int main() {
   C0 u, du, unit(gd.u_h().total_node_no(), (double*)0); unit = 1.0; gd.u_h() = unit;
   gd.u_h() = gd.gh_on_gamma_h(); initial_newton_flag = TRUE;
   do {
      // δû^k = [∂I/∂û at û^k]^(-1) [-I(û^k)] = I_T^(-1) [-I(û^k)]
      mr.assembly(); initial_newton_flag = FALSE; du = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
      if(!(u.rep_ptr())) { u = du; u = 1.0; }
      u += du; gd.u_h() = u;                              // û^(k+1) ≡ û^k + δû^k
      cout << norm((C0)(mr.rhs())) << " , " << norm(du) << endl << gd.u_h();
      (C0)(mr.lhs()) = 0.0; (C0)(mr.rhs()) = 0.0;         // reset left-hand-side and right-hand-side
   } while((double)norm(du) > EPSILON);
   cout << gd.u_h();
   return 0; }
Listing 4•8 Solution of nonlinear ordinary differential equation using Galerkin formulation for finite ele-
ment (project: “nonlinear_ode” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
The default behavior of the class Element_Formulation is that the essential boundary conditions “gl” are
included in the computation of the reaction, “stiff * gl”, which is subtracted from the right-hand-side
vector. For the iterative algorithm, which solves for the increment of the solution δû k, the reaction needs to be
subtracted from the right-hand-side only once, in the initial loop (k = 0) when we compute δû 0. For k > 0, “gl”
is set to zero, as in line 4, to prevent the reaction from being subtracted from the right-hand-side at every iteration.
This ad hoc mechanism is implemented with an “initial_newton_flag” in the main() function as
1 int main() {
...
2 initial_newton_flag = TRUE;
3 do { // Newton iteration loop
4 mr.assembly();
5 initial_newton_flag = FALSE;
...
6 } while (... );
...
7 }
The “initial_newton_flag” is set to TRUE initially (line 2). After the global matrix and global vector have been
assembled for the first time (line 4), the initial_newton_flag is set to FALSE (line 5). Therefore, at the element
level the reaction is prevented from being subtracted from the right-hand-side again. The error of this computation,
defined as the difference between the exact solution, u_ex(x) = √(1 + x²), and the finite element solution, is shown in
Figure 4•26. The nodal solutions are almost identical to the exact solution.
[Figure 4•26: error (exact − f.e. solution) versus x, 0 ≤ x ≤ 1]
$$ \frac{\partial \| R(u^h) \|_2^2}{\partial u^h} = 2 \left( \frac{\partial R(u^h)}{\partial u^h}, R(u^h) \right) = 0 $$   Eq. 4•71
$$ w = \frac{\partial R(u^h)}{\partial u^h} $$   Eq. 4•72
$$ I(u^h) \equiv \left( \frac{\partial R(u^h)}{\partial u^h}, R(u^h) \right), \quad \text{and} \quad I_T \equiv \left. \frac{\partial I}{\partial u^h} \right|_{u^h} = \left( \frac{\partial^2 R(u^h)}{\partial (u^h)^2}, R(u^h) \right) + \left( \frac{\partial R(u^h)}{\partial u^h}, \frac{\partial R(u^h)}{\partial u^h} \right) $$   Eq. 4•73
For the non-linear problem in the previous section, the residual, at the element level, is
$$ R(u_e) \equiv u_e \frac{d^2 u_e}{dx^2} + \left( \frac{du_e}{dx} \right)^2 - 1 $$   Eq. 4•74
and the first derivative of the residual, with respect to the nodal variables (u_e ≡ φ_e^i û_e^i), is
$$ \frac{\partial R}{\partial \hat{u}_e^i} = \phi_e^i \frac{d^2 u_e}{dx^2} + u_e \frac{d^2 \phi_e^i}{dx^2} + 2 \frac{d\phi_e^i}{dx} \frac{du_e}{dx} $$   Eq. 4•75
$$ \frac{\partial^2 R}{\partial \hat{u}_e^2} = \phi_e \otimes \frac{d^2 \phi_e}{dx^2} + \frac{d^2 \phi_e}{dx^2} \otimes \phi_e + 2 \frac{d\phi_e}{dx} \otimes \frac{d\phi_e}{dx} $$   Eq. 4•76
From Eq. 4•73, the element tangent stiffness matrix and the element residual vector are
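$$ k_e^T \equiv \int_{\Omega_e} \left[ \frac{\partial R(\hat{u}_e)}{\partial \hat{u}_e} \otimes \frac{\partial R(\hat{u}_e)}{\partial \hat{u}_e} + \frac{\partial^2 R(\hat{u}_e)}{\partial \hat{u}_e^2} R(u_e) \right] dx, \quad \text{and} \quad r_e \equiv -\int_{\Omega_e} \frac{\partial R(\hat{u}_e)}{\partial \hat{u}_e} R(u_e) \, dx $$   Eq. 4•77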
The Program Listing 4•9 implements Eq. 4•77. An immediate difficulty associated with the least squares
formulation is the presence of the second derivatives. As we have discussed in the irreducible formulation for the
beam bending problem on page 306, C1-continuity at the nodes is required for the entire problem domain to be
integrable. Otherwise, if the first derivative is not continuous at a node, the second derivative at that node will be
infinite, and the entire problem domain is not integrable. This means that we need to have du/dx in the set of nodal
variables to ensure the first derivative is continuous at the nodes. As in the irreducible formulation for the beam
bending problem, a 2-node element can be used with the Hermite cubics discussed previously. At the element level, we
have
The Hermite cubics (lines 9-12) are the same as those in the irreducible formulation, except that we have positive signs for
both the du0/dx and du1/dx variables (which are conventionally taken as negative in the bending problem to improve the
symmetry of the formulation).
#include "include\fe.h"
static const int node_no = 5;
static const int element_no = 4;
static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h() {
// define nodes
for( int i = 0; i < node_no; i++) {
double v; v = ((double)i)/((double)(node_no-1));
Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node);
}
// define elements
for( int i = 0; i < element_no; i++) {
int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem);
}
}
// define boundary conditions: du/dx(0) = 0, u(1) = √2
gh_on_Gamma_h::gh_on_Gamma_h( int df, Omega_h& omega_h) {
__initialization(df, omega_h);
the_gh_array[node_order(0)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(node_no-1)][0] = sqrt(2.0);
}
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 2; static Omega_h oh;
static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh);
static Global_Discretization gd(oh, gh, uh);
class Non_Linear_Least_Squares : public Element_Formulation {
C0 ul; void __initialization(int);
public:
Non_Linear_Least_Squares(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
Non_Linear_Least_Squares(int, Global_Discretization&);
};
static int initial_newton_flag;
void Non_Linear_Least_Squares::__initialization(int en) {
ul &= gd.element_free_variable(en) + gd.element_fixed_variable(en);
if(!initial_newton_flag) gl = 0.0;
}
Element_Formulation* Non_Linear_Least_Squares::make(int en,
Global_Discretization& gd) { return new Non_Linear_Least_Squares(en,gd); }
Non_Linear_Least_Squares::Non_Linear_Least_Squares(int en,
Global_Discretization& gd) : Element_Formulation(en, gd) {
__initialization(en);
double weight[3] = {1.0/3.0, 4.0/3.0, 1.0/3.0},
h_e = fabs( ((double)(xl[0] - xl[1])) );
Quadrature qp(weight, 0.0, h_e, 3);
J d_l(h_e/2.0);
H2 Z((double*)0, qp),
z = Z/h_e,
N = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE(
"int, int, Quadrature", 4/*nen x ndf*/, 1/*nsd*/, qp);
// Hermite cubics
N[0] = 1.0-3.0*z.pow(2)+2.0*z.pow(3);
N[1] = Z*(1.0-z).pow(2);
N[2] = 3.0*z.pow(2)-2.0*z.pow(3);
N[3] = Z*(z.pow(2)-z);
H0 Nx = INTEGRABLE_VECTOR("int, Quadrature", 4, qp),
Nxx = INTEGRABLE_VECTOR("int, Quadrature", 4, qp);
Nx = d(N)(0);
for(int i = 0; i < 4; i++) { Nxx[i] = dd(N)(i)[0][0]; }
H2 U = N*ul;
H0 Ux, Uxx;
Ux = d(U)(0);
Uxx = dd(U)[0][0];
// R(u_e) ≡ u_e d²u_e/dx² + (du_e/dx)² - 1
H0 uR = ((H0)U)*Uxx + Ux.pow(2) - 1.0,
// ∂R/∂û_e^i = φ_e^i d²u_e/dx² + u_e d²φ_e^i/dx² + 2 (dφ_e^i/dx)(du_e/dx)
Ru = ((H0)N)*Uxx + ((H0)U)*Nxx + 2.0*Nx*Ux,
// ∂²R/∂û_e² = φ_e ⊗ d²φ_e/dx² + d²φ_e/dx² ⊗ φ_e + 2 dφ_e/dx ⊗ dφ_e/dx
Ruu = (((H0)N)%Nxx) + (Nxx%((H0)N)) + 2.0*(Nx%Nx);
// k_e^T ≡ ∫ [ ∂R/∂û_e ⊗ ∂R/∂û_e + (∂²R/∂û_e²) R(u_e) ] dx
stiff &= ( (Ru%Ru + Ruu*uR) ) | d_l;
// r_e ≡ -∫ (∂R/∂û_e) R(u_e) dx
force &= -(Ru*uR) | d_l ;
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Non_Linear_Least_Squares
non_linear_least_squares_instance(element_type_register_instance);
static Matrix_Representation mr(gd);
static const double EPSILON = 1.e-12;
int main() {
C0 p, u, du;
gd.u_h() = gd.gh_on_gamma_h();
C0 unit(gd.u_h().total_node_no()*ndf, (double*)0);
unit = 1.0;
gd.u_h() = unit;
do {
mr.assembly();
p = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
if(!(u.rep_ptr())) { u = p; u = 1.0; }
double left = 0.0, right = 1.0, length = right-left;
// golden section line search
do {
Matrix_Representation::Assembly_Switch = Matrix_Representation::RHS;
du = (left + 0.618 * length) * p;
gd.u_h() = u + du;
(C0)(mr.rhs()) = 0.0;
mr.assembly();
double residual_golden_right = norm((C0)(mr.rhs()));
du = (left + 0.382 * length)* p;
gd.u_h() = u + du;
(C0)(mr.rhs())=0.0;
mr.assembly();
double residual_golden_left = norm((C0)(mr.rhs()));
if(residual_golden_right < residual_golden_left) left = left + 0.382 * length;
else right = left+0.618*length;
length = right - left;
} while(length > 1.e-2);
cout << "bracket: (" << left << ", " << right << ")" << endl;
u += du; // û^(k+1) ≡ û^k + δû^k
cout << "residual norm: " << norm((C0)(mr.rhs())) <<
" search direction norm: " << norm(p) << endl << "solution: " << gd.u_h() << endl;
Matrix_Representation::Assembly_Switch = Matrix_Representation::ALL;
(C0)(mr.lhs()) = 0.0;
(C0)(mr.rhs()) = 0.0;
} while((double)norm(p) > EPSILON);
cout << gd.u_h();
return 0;
}
Listing 4•9 Solution of nonlinear ordinary differential equation using least squares formulation for finite
element (project: “nonlinear_least_squares_ode” in project workspace file “fe.dsw” under directory
Instead of evaluating the objective functional value as in Chapter 2, the finite element method minimizes the
residuals of the problem. In the loop for the golden section line search, the assembly flag is set to assemble only
the right-hand-side vector (line 3). The norm of the right-hand-side vector is used as the criterion for the line
search minimization. In the outer loop, where Newton’s formula is used to compute the next search direction p, the
assembly flag is reset to assemble both the left-hand-side matrix and the right-hand-side vector (line 19).
The results are shown in Figure 4•27.
[Plots: u versus x and du/dx versus x]
Figure 4•27 Nodal solutions (open squares) compared to the exact solutions (solid
curves) for the nonlinear least squares formulation.
where C is the heat capacity matrix, K is the conductivity matrix, and f is the heat source vector. The variable u is
the temperature and u̇ is the time derivative of temperature. And, the hyperbolic equation for structural dynamics
where M is the consistent mass matrix, K the stiffness matrix, and f the force vector. The variable u is the
displacement and ü, the second time derivative of the displacement, gives the acceleration.
Parabolic Equation
From Eq. 3•191 of Chapter 3 (in page 253),
$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0, \quad 0 < x < 1, \quad \text{subject to } u(0, t) = 0, \; \frac{\partial u}{\partial x}(1, t) = 0, \text{ and } u(x, 0) = 1 $$   Eq. 4•81
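$$ c_e = \int_{\Omega_e} \phi_e \otimes \phi_e \, dx, \quad \text{and} \quad k_e = \int_{\Omega_e} \frac{\partial \phi_e}{\partial x} \otimes \frac{\partial \phi_e}{\partial x} \, dx $$   Eq. 4•82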
θ is a scalar parameter and ∆t is the time step length. The Program Listing 4•10 implements Eq. 4•80 and Eq.
4•82. At the element level, the heat capacity matrix ce is the additional term to the static case as
C0& Parabolic_Equation::__lhs() {
   the_lhs &= mass + theta_* dt_*stiff; // C + ∆tθK
   return the_lhs;
}
#include "include\fe.h"
static const int node_no = 5; static const int element_no = 4; static const int spatial_dim_no = 1;
// define discretized global domain
Omega_h::Omega_h(){
   // define nodes
   for(int i = 0; i < node_no; i++) { double v;v=((double)i)/((double)element_no);
      Node* node = new Node(i, spatial_dim_no, &v); the_node_array.add(node); }
   // define elements
   for(int i = 0; i < element_no; i++) { int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena); the_omega_eh_array.add(elem); } }
// define boundary conditions
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) { __initialization(df, omega_h);
   the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Neumann; }
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 1; static Omega_h oh; static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh); static Global_Discretization gd(oh, gh, uh);
class Parabolic_Equation : public Element_Formulation { C0 mass, ul;
   void __initialization(int, Global_Discretization&);
 public:
   Parabolic_Equation(Element_Type_Register a) : Element_Formulation(a) {}
   Element_Formulation *make(int, Global_Discretization&);
   Parabolic_Equation(int, Global_Discretization&);
   C0& __lhs(); C0& __rhs(); };                    // overwrite protected member functions
void Parabolic_Equation::__initialization(int en, Global_Discretization& gd) {
   ul &= gd.element_free_variable(en); }
Element_Formulation* Parabolic_Equation::make(int en, Global_Discretization& gd) {
   return new Parabolic_Equation(en,gd); }
Parabolic_Equation::Parabolic_Equation(int en, Global_Discretization& gd) :
   Element_Formulation(en, gd) { __initialization(en, gd);
   Quadrature qp(spatial_dim_no, 2);
   H1 Z(qp),
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 2, 1, qp);
   N[0] = (1-Z)/2; N[1] = (1+Z)/2; H1 X = N*xl; H0 Nx = d(N)(0)/d(X); J dv(d(X));
   // conductivity k_e = ∫ ∂φ_e/∂x ⊗ ∂φ_e/∂x dx, heat capacitance c_e = ∫ φ_e ⊗ φ_e dx
   stiff &= (Nx % Nx) | dv; mass &= ( ((H0)N)%((H0)N) ) | dv; }
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Parabolic_Equation parabolic_equation_instance(element_type_register_instance);
static Matrix_Representation mr(gd);
static double theta_ = 0.5; static double dt_ = 0.05;
C0& Parabolic_Equation::__lhs() { the_lhs &= mass + theta_* dt_*stiff; return the_lhs; }  // C + ∆tθK
C0& Parabolic_Equation::__rhs() {
   Element_Formulation::__rhs();
   the_rhs += (mass - (1.0-theta_)*dt_*stiff)*ul;  // (C - ∆t(1-θ)K) u_n - f̂
   return the_rhs; }
int main() {
   // initial conditions
   for(int i = 0; i < node_no; i++) uh[i][0] = 1.0;
   gd.u_h() = gd.gh_on_gamma_h();
   mr.assembly();
   C0 decomposed_LHS = !((C0)(mr.lhs()));
   for(int i = 0; i < 28; i++) {
      // u_(n+1) = [ (C - ∆t(1-θ)K) u_n - f̂ ] / (C + ∆tθK)
      C0 u = decomposed_LHS*((C0)(mr.rhs())); gd.u_h() = u;
      double iptr;
      if(modf( ((double)(i+1))/4.0, &iptr)==0) {
         cout << "time: " << (((double)(i+1))*dt_) << ", at (0.5, 1.0), u = (" <<
            gd.u_h()[(node_no-1)/2][0] << ", " << gd.u_h()[node_no-1][0] << ")" << endl; }
      if(i < 27) { (C0)(mr.rhs()) = 0.0; (C0)(mr.lhs()) = 0.0; mr.assembly(); }
   }
   return 0;
}
Listing 4•10 Solution of parabolic equation using central difference scheme in time dimension (project:
“hyperbolic_equation” in project workspace file “fe.dsw” under directory “vs\ex\fe”).
In the main() function the decomposition of the left-hand-side matrix is done only once, outside of the
time integration loop. The results of this program are shown in Figure 4•28.
[Plot: u versus x at t = 0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4]
Figure 4•28 Finite element solutions for the parabolic equation for heat conduction.
$$ \hat{K} = K + a_0 M + a_1 C, \quad \text{and} \quad \hat{R}_{n+1} = -f_{n+1} + (a_0 u_n + a_2 \dot{u}_n + a_3 \ddot{u}_n) M + (a_1 u_n + a_4 \dot{u}_n + a_5 \ddot{u}_n) C $$   Eq. 4•83
where $\ddot{u}_{n+1} = a_0 (u_{n+1} - u_n) - a_2 \dot{u}_n - a_3 \ddot{u}_n$ and $\dot{u}_{n+1} = \dot{u}_n + a_6 \ddot{u}_n + a_7 \ddot{u}_{n+1}$, and the Newmark coefficients $a_i$ are
$$ a_0 = \frac{1}{\beta \Delta t^2}, \; a_1 = \frac{\gamma}{\beta \Delta t}, \; a_2 = \frac{1}{\beta \Delta t}, \; a_3 = \frac{1}{2\beta} - 1, \; a_4 = \frac{\gamma}{\beta} - 1, \; a_5 = \frac{\Delta t}{2} \left( \frac{\gamma}{\beta} - 2 \right), \; a_6 = \Delta t (1 - \gamma), \; a_7 = \gamma \Delta t $$   Eq. 4•84
$$ \frac{\partial^2 u}{\partial t^2} = -\frac{\partial^4 u}{\partial x^4}, \quad 0 < x < 1, \; t > 0 $$
boundary conditions $$ u(0, t) = u(1, t) = \frac{\partial u(0, t)}{\partial x} = \frac{\partial u(1, t)}{\partial x} = 0 $$
and initial conditions $$ u(x, 0) = \sin(\pi x) - \pi x (1 - x), \quad \text{and} \quad \frac{\partial u(x, 0)}{\partial t} = 0 $$   Eq. 4•85
The finite element formulation for consistent mass matrix and stiffness matrix is
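$$ m_e = \int_{\Omega_e} \phi_e \otimes \phi_e \, dx, \quad \text{and} \quad k_e = \int_{\Omega_e} \frac{\partial^2 \phi_e}{\partial x^2} \otimes \frac{\partial^2 \phi_e}{\partial x^2} \, dx $$   Eq. 4•86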
The damping matrix ce is either in the form of me times a damping parameter or in the form of Rayleigh damping, as
a linear combination of me and ke (see footnote 1). Again, for a two-node element the Hermite cubics are required for
the stiffness matrix, as in the irreducible formulation of the beam bending problem. The Program Listing 4•11
implements the hyperbolic equation. Now the variables uₙ, u̇ₙ, üₙ at tₙ and uₙ₊₁, u̇ₙ₊₁, üₙ₊₁ at tₙ₊₁ need to be registered as
static U_h u_old(ndf, oh); static U_h du_old(ndf, oh); static U_h ddu_old(ndf, oh);
static U_h u_new(ndf, oh); static U_h du_new(ndf, oh); static U_h ddu_new(ndf, oh);
These variables are supplied to the element constructor by a private member function
Hyperbolic_Equation::__initialization(int, Global_Discretization&) as
1. p. 93 and p. 339 in K-J Bathe and E.L.Wilson, 1976, “Numerical methods in finite element analysis”, Prentice-Hall, inc.,
Englewood Cliffs, New Jersey.
#include "include\fe.h"
static const int node_no = 5;
static const int element_no = node_no-1;
static const int spatial_dim_no = 1;
static const double L_ = 1.0;
static const double h_e = L_/((double)(element_no));
// define discretized global domain
Omega_h::Omega_h() {
for(int i = 0; i < node_no; i++) { // define nodes
double v = ((double)i)*h_e;
Node* node = new Node(i, spatial_dim_no, &v);
the_node_array.add(node);
}
for(int i = 0; i < element_no; i++) { // define elements
int ena[2]; ena[0] = i; ena[1] = ena[0]+1;
Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena);
the_omega_eh_array.add(elem);
}
}
// define boundary conditions
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
the_gh_array[node_order(0)](0) =
the_gh_array[node_order(0)](1) =
the_gh_array[node_order(node_no-1)](0) =
the_gh_array[node_order(node_no-1)](1) =
gh_on_Gamma_h::Dirichlet;
}
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 2;
static Omega_h oh;
static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh);
static Global_Discretization gd(oh, gh, uh);
static U_h u_old(ndf, oh); static U_h du_old(ndf, oh); static U_h ddu_old(ndf, oh);
static U_h u_new(ndf, oh); static U_h du_new(ndf, oh); static U_h ddu_new(ndf, oh);
class Hyperbolic_Equation : public Element_Formulation {
C0 mass, ul, dul, ddul;
void __initialization(int, Global_Discretization&);
public:
Hyperbolic_Equation(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
Hyperbolic_Equation(int, Global_Discretization&);
C0& __lhs(); // overwrite protected member functions
C0& __rhs();
};
void Hyperbolic_Equation::__initialization(int en, Global_Discretization& gd) {
Omega_h& oh = gd.omega_h();
gh_on_Gamma_h& gh = gd.gh_on_gamma_h();
Global_Discretization gd_u_old(oh, gh, u_old);
ul &= gd_u_old.element_free_variable(en); // uₙ
Global_Discretization gd_du_old(oh, gh, du_old);
dul &= gd_du_old.element_free_variable(en); // u̇ₙ
Global_Discretization gd_ddu_old(oh,gh,ddu_old);
ddul &=gd_ddu_old.element_free_variable(en); // üₙ
}
Element_Formulation* Hyperbolic_Equation::make(int en, Global_Discretization& gd) {
return new Hyperbolic_Equation(en,gd);
}
Listing 4•11 Newmark scheme for hyperbolic equation using finite element method.
Basically, the time integration algorithm updates the variables uₙ, u̇ₙ, üₙ at time tₙ to uₙ₊₁, u̇ₙ₊₁, üₙ₊₁ at
time tₙ₊₁. At the beginning of time tₙ₊₁, uₙ, u̇ₙ, üₙ are given, and uₙ₊₁ is solved by back-substitution with the
global stiffness matrix and global residual vector. The acceleration üₙ₊₁ and velocity u̇ₙ₊₁ at time tₙ₊₁ are
computed at the global level in the main() program, once the variable “u_new”, uₙ₊₁, is available, such as
1 ddu_new = a[0]*(((C0)u_new)-((C0)u_old))-a[2]*((C0)du_old)-a[3]*((C0)ddu_old);
2 du_new = ((C0)du_old) + a[6]*((C0)ddu_old)+a[7]*((C0)ddu_new);
This is implemented according to the formula for the acceleration üₙ₊₁ = a₀(uₙ₊₁ − uₙ) − a₂u̇ₙ − a₃üₙ (line 1) and
the velocity u̇ₙ₊₁ = u̇ₙ + a₆üₙ + a₇üₙ₊₁ (line 2), respectively. The results of this computation are shown in Figure
4•29.
[Plots: u versus x; left panel t = 0.0 to 0.14 (starting from the initial condition at t = 0), right panel t = 0.16 to 0.28, with curves labeled every 0.02]
Figure 4•29 Beam vibration using finite element method with Newmark scheme to solve the
hyperbolic equation. The finite element solutions of downward deflection are piece-wise cubic
functions of nodal deflection “û” and nodal negative slope “ψ̂ ≡ -du/dx” (i.e., u = f(û, ψ̂) for
two-node Hermite cubic element). Solutions of every four time steps are shown.
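$$ \begin{bmatrix} 0 & -\int_{\Omega_e} \frac{d\phi_e^w}{dx} \otimes \frac{d\phi_e^M}{dx} \, dx \\ -\int_{\Omega_e} \frac{d\phi_e^M}{dx} \otimes \frac{d\phi_e^w}{dx} \, dx & -\int_{\Omega_e} \frac{\phi_e^M \otimes \phi_e^M}{EI} \, dx \end{bmatrix} \begin{bmatrix} \hat{w}_e \\ \hat{M}_e \end{bmatrix} = \begin{bmatrix} \int_{\Omega_e} \phi_e^w f \, dx + \phi_e^w V_{\Gamma_h} \\ \phi_e^M \psi_{\Gamma_h} \end{bmatrix} $$   Eq. 4•87
$$ \begin{bmatrix} 0 & a_e \\ a_e^T & b_e \end{bmatrix} \begin{bmatrix} \hat{w}_e \\ \hat{M}_e \end{bmatrix} = \begin{bmatrix} f_e \\ r_e \end{bmatrix} $$   Eq. 4•88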
where the stiffness matrix has the size of 4x4 and the solution and force vectors each have the size of 4. Following the
finite element convention, we collect the degrees of freedom w and M together for each node. At the global level, the
matrix form is
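$$ \begin{bmatrix}
0 & a_{00} & 0 & a_{01} & \cdots & 0 & a_{0(n-1)} \\
a_{00}^T & b_{00} & a_{01}^T & b_{01} & \cdots & a_{0(n-1)}^T & b_{0(n-1)} \\
0 & a_{10} & 0 & a_{11} & \cdots & 0 & a_{1(n-1)} \\
a_{10}^T & b_{10} & a_{11}^T & b_{11} & \cdots & a_{1(n-1)}^T & b_{1(n-1)} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & a_{(n-1)0} & 0 & a_{(n-1)1} & \cdots & 0 & a_{(n-1)(n-1)} \\
a_{(n-1)0}^T & b_{(n-1)0} & a_{(n-1)1}^T & b_{(n-1)1} & \cdots & a_{(n-1)(n-1)}^T & b_{(n-1)(n-1)}
\end{bmatrix}
\begin{bmatrix} \hat{w}_0 \\ \hat{M}_0 \\ \hat{w}_1 \\ \hat{M}_1 \\ \vdots \\ \hat{w}_{(n-1)} \\ \hat{M}_{(n-1)} \end{bmatrix}
=
\begin{bmatrix} f_0 \\ r_0 \\ f_1 \\ r_1 \\ \vdots \\ f_{(n-1)} \\ r_{(n-1)} \end{bmatrix} $$   Eq. 4•89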
For large-size problems, the stiffness matrix size can be critical given limited computer memory space, and even
more so for the computation time. One approach to reduce the number of degrees of freedom in the global
matrix solution process is to separate the variables ŵ and M̂ in Eq. 4•89 at the global level. Rewriting the
element formulation of Eq. 4•89 in global submatrix form as
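$$ \begin{bmatrix} 0 & A \\ A^T & B \end{bmatrix} \begin{bmatrix} \hat{w} \\ \hat{M} \end{bmatrix} = \begin{bmatrix} f \\ r \end{bmatrix} $$   Eq. 4•90
$$ A^T \hat{w} + B \hat{M} = r $$   Eq. 4•91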
Therefore,
$$ \hat{M} = B^{-1} (r - A^T \hat{w}) $$   Eq. 4•92
Note that we have used the property that B is invertible. Observing that the first equation of Eq. 4•90 is A M̂ = f, we
pre-multiply Eq. 4•92 with A, such that
$$ f = A \hat{M} = A B^{-1} (r - A^T \hat{w}) = A B^{-1} r - A B^{-1} A^T \hat{w} $$   Eq. 4•93
$$ \hat{w} = (A B^{-1} A^T)^{-1} (A B^{-1} r - f) $$   Eq. 4•94
We have relied on the property that AB⁻¹Aᵀ is invertible. With ŵ solved, M̂ can be recovered according to Eq.
4•92, if necessary. The solution using the substructuring technique has two major advantages. Firstly, only A and
B need to be stored in memory, which is only half of the memory space required for the entire left-hand-side
matrix in Eq. 4•90. Secondly, the matrix solver in substructuring deals with B⁻¹ and AB⁻¹Aᵀ, which are smaller
matrices than the left-hand-side matrix in Eq. 4•90. The cost of a matrix solver can be a function of the cubic power
of the matrix size. For the present case, each of the inverse of B and the inverse of AB⁻¹Aᵀ requires about one-eighth
of the computation time needed for the solution of the left-hand-side matrix in Eq. 4•90. That is, only a quarter
of the computation time is needed for the matrix solver using substructuring.1
1. Note that the term f includes (1) the distributed load term, the term containing “f” in Eq. 4•87, (2) the shear force
(V), the “nodal loading boundary condition” VΓ (treated as a natural boundary condition specified corresponding to the
“w”-dof), and (3) the essential boundary condition of MΓ, by subtracting “AMΓ” out of f. The term r includes (1) the
negative slope (ψ), the “nodal loading boundary condition” ψΓ (treated as a natural boundary condition specified
corresponding to the “M”-dof), and (2) the essential boundary conditions of {wΓ, MΓ}, by subtracting “AT wΓ+BMΓ”
out of r.
1. Moreover, Eq. 4•89 has a lot of zero diagonals, which is not without trouble for the matrix solver. We either need to use
modified Cholesky decomposition with the diagonal pivoting or we need to ignore the symmetry and use LU decomposition
with complete pivoting.
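$\begin{bmatrix} 0 & A \\ A^{T} & B \end{bmatrix} \begin{Bmatrix} \hat{w} \\ \hat{M} \end{Bmatrix} = \begin{Bmatrix} f \\ r \end{Bmatrix}$    Eq. 4•95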
The matrix representation, “MR”, for the diagonal submatrix B and its corresponding right-hand-side r is declared as an instance of the standard class “Matrix_Representation”
      Matrix_Representation mr(mgd);
This matrix representation instance “mr” can be called to assemble and instantiate the submatrix B and the subvector r. They can be retrieved by
      mr.assembly();
      C0 B = ((C0)(mr.lhs())),        // diagonal submatrix B
           r = ((C0)(mr.rhs()));        // r
The rows of submatrix A correspond to the “w”-dof, the principal discretization, and the columns of submatrix A correspond to the “M”-dof, the subordinate discretization. The class “Matrix_Representation_Couple” is declared instead as
The second argument of this constructor is reserved for instantiating a sparse matrix; the third and the fourth arguments of this constructor refer to the right-hand-side vectors corresponding to the principal and the subordinate discretizations of submatrix A. In the above example, the principal right-hand-side is supplied with a “0”, the null pointer. In this case, the principal right-hand-side vector f will be instantiated. The argument is not null when, as for the subordinate right-hand-side in this case, it refers to “mr.rhs()”. The subordinate right-hand-
      int main() {
         mrc.assembly();
         C0 f = ((C0)(mrc.rhs())),
              A = ((C0)(mrc.lhs()));
         mr.assembly();
         C0 B = ((C0)(mr.lhs())),
              r = ((C0)(mr.rhs()));
         C0 B_inv = B.inverse(),
              w = (A*B_inv*r - f)/(A*B_inv*(~A)),      // w = (A B^-1 A^T)^-1 (A B^-1 r - f)
              M = B_inv*(r-(~A)*w);                    // M = B^-1 (r - A^T w)
         wh = w; wh = wgd.gh_on_gamma_h();
         mh = M; mh = mgd.gh_on_gamma_h();
         cout << "deflection:" << endl << wh << endl << "bending moment:" << endl << mh;
         return 0;
      }
The complete listing of the substructure mixed formulation is in Program Listing 4•12. The cases for nodal loading and distributed loading, discussed in the mixed formulation of Section 4.2.2, can be turned on by setting the macro definitions “__TEST_NODAL_LOAD” and “__TEST_DISTRIBUTED_LOAD”. The results of the present computation are identical to those of the previous section on mixed formulation.
The nonlinear and transient problems bring only marginal changes to “fe.lib”. We can certainly create new classes “Nonlinear_Element_Formulation” and “Transient_Element_Formulation” for a user-defined element to derive from. This is similar to the class “Element_Formulation_Couple” in the present example, which is created for the user to derive a user-defined element formulation from. We can even use multiple inheritance (an advanced but controversial C++ feature) from class Nonlinear_Element_Formulation and class Transient_Element_Formulation to capture both the nonlinear and the transient capabilities. Object-oriented programming provides the basic mechanisms for a smooth code evolution of “fe.lib” into vastly different problem areas. However, the problem of “mixed formulation with separate variables” brings the greatest change to “fe.lib”. We need to change all four strong components of “fe.lib” to implement this problem. With the mechanisms of object-oriented programming, we are not only able to reuse the code in “fe.lib” by deriving from it, but are also able to keep the simplicity of “fe.lib” intact. After “fe.lib” has been modified to deal with the new problem, a beginner still only needs to learn the uncluttered basic set of “fe.lib”, without having to confront all kinds of more advanced finite element problems at once. For Fortran/C programmers who are already familiar with a couple of existing full-fledged finite element programs, this advantage of using object-oriented programming to accommodate vastly different problems should be immediately apparent.
#include "include\fe.h"
#include "include\omega_h_n.h"
// initialize static member of class “Matrix_Representation_Couple”
Matrix_Representation_Couple::assembly_switch
   Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
static const int node_no = 5;
static const int element_no = 4;
static const int spatial_dim_no = 1;
static const double L_ = 1.0;
static const double h_e = L_/((double)(element_no));
static const double E_ = 1.0;
static const double I_ = 1.0;
static const double f_0 = 1.0;
static const double M_ = 1.0;
Omega_h::Omega_h() {                                      // define discretized global domain
   for(int i = 0; i < node_no; i++) {                     // define nodes
      double v = ((double)i)*h_e;
      Node* node = new Node(i, spatial_dim_no, &v);
      the_node_array.add(node);
   }
   for(int i = 0; i < element_no; i++) {                  // define elements
      int ena[2];
      ena[0] = i;
      ena[1] = ena[0]+1;
      Omega_eh* elem = new Omega_eh(i, 0, 0, 2, ena);
      the_omega_eh_array.add(elem);
   }
}
// define boundary conditions
gh_on_Gamma_h_i::gh_on_Gamma_h_i(int i, int df, Omega_h& omega_h) : gh_on_Gamma_h() {
   gh_on_Gamma_h::__initialization(df, omega_h);
   if(i == 0) {
      the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Dirichlet;
   } else if(i == 1) {
      the_gh_array[node_order(node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
      the_gh_array[node_order(node_no-1)][0] = M_;
   }
}
// instantiate fixed and free variables and Global_Discretization
static const int ndf = 1;
static Omega_h oh;
static gh_on_Gamma_h_i wgh(0, ndf, oh);                   // {Ωh, ŵh}
static U_h wh(ndf, oh);
static Global_Discretization wgd(oh, wgh, wh);
static gh_on_Gamma_h_i mgh(1, ndf, oh);                   // {Ωh, M̂h}
static U_h mh(ndf, oh);
static Global_Discretization mgd(oh, mgh, mh);
static Global_Discretization_Couple gdc(wgd, mgd);        // Global_Discretization_Couple
class Beam_Mixed_Formulation : public Element_Formulation_Couple {
   public:
      Beam_Mixed_Formulation(Element_Type_Register a) : Element_Formulation_Couple(a) {}
      Element_Formulation *make(int, Global_Discretization&);
      Beam_Mixed_Formulation(int, Global_Discretization&);
      Element_Formulation_Couple *make(int, Global_Discretization_Couple&);
      Beam_Mixed_Formulation(int, Global_Discretization_Couple&);
};
Element_Formulation* Beam_Mixed_Formulation::make(int en, Global_Discretization& gd) {
   return new Beam_Mixed_Formulation(en,gd);
}
Listing 4•12 Substructure solution for the mixed formulation of the beam bending problem.
where q is the heat flux and f is the heat source. This is subject to Dirichlet and Neumann boundary conditions, respectively. We use “u” for temperature and n for the outward unit surface normal. The Fourier law assumes that the heat flux is related to the temperature gradient as
$q = -\kappa \nabla u$    Eq. 4•98
where κ is the thermal conductivity. The weighted residual statement of Eq. 4•96 with the Fourier law gives
Integrating by parts and applying the divergence theorem of Gauss to transform the volume integral into a boundary integral gives
$\int_{\Omega} \nabla w \cdot (\kappa \nabla u)\, d\Omega + \int_{\Gamma} w\,(-\kappa \nabla u) \cdot n\, d\Gamma - \int_{\Omega} w f\, d\Omega = 0$    Eq. 4•100
Since “w” is homogeneous on Γg, the boundary with Dirichlet boundary conditions, the second term, the boundary integral, becomes
$\int_{\Gamma} w\, q \cdot n\, d\Gamma = -\int_{\Gamma_h} w h\, d\Gamma$    Eq. 4•101
The second term $(\phi_e^a, h)_{\Gamma_h}$ represents the Neumann boundary conditions, which are most easily specified in the problem definition as equivalent nodal loads, and the third term $-a(\phi_e^a, \phi_e^b)\,u_e^b$ accounts for the Dirichlet boundary conditions. Again, the default behaviors of “fe.lib” will deal with these two terms automatically.
For an isoparametric bilinear 4-node element, the bilinear shape functions are used for both the variable interpolation, $u_e^h \equiv \phi_e^a \hat{u}_e^a$, and the coordinate transformation rule, $x \equiv \phi_e^a x_e^a$, that is
$\phi_e^a \equiv N_a(\xi, \eta) = \frac{1}{4}(1 + \xi_a \xi)(1 + \eta_a \eta)$    Eq. 4•104
where the index “a” indicates the element node number, and (ξa, ηa), for a = 0, 1, 2, 3, are the four nodal coordinates {(-1, -1), (1, -1), (1, 1), (-1, 1)} of the referential element. The variable interpolation becomes
$u_e^h \equiv N_a(\xi, \eta)\, \hat{u}_e^a$    Eq. 4•105
where $\hat{u}_e^a$ are the nodal variables, and the coordinate transformation rule becomes
$x_e^h \equiv N_a(\xi, \eta)\, x_e^a$    Eq. 4•106
where $x_e^a$ are the element nodal coordinates. The integration in Eq. 4•102 and the first term of Eq. 4•103 gives
$k_e = \int_{\Omega_e} (\nabla N \otimes \kappa \nabla N)\, dx = \int_{\hat{\Omega}_e} (\nabla N \otimes \kappa \nabla N)\, \det\!\left(\frac{\partial x}{\partial \xi}\right) d\xi$, and $f_e = \int_{\Omega_e} (N f)\, dx = \int_{\hat{\Omega}_e} (N f)\, \det\!\left(\frac{\partial x}{\partial \xi}\right) d\xi$    Eq. 4•107
The Gaussian quadrature requires the integration domain to be transformed from the physical element domain “$\Omega_e$” to the referential element domain “$\hat{\Omega}_e$”, with the Jacobian of the coordinate transformation “$J \equiv \det(\partial x / \partial \xi)$” (i.e., the determinant of the Jacobian matrix). The Jacobian matrix of the coordinate transformation rule, “$\partial x / \partial \xi$”, is computed from the definition of the coordinate transformation rule in Eq. 4•106.
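The derivatives of the variables are taken from Eq. 4•105 as
$\nabla u_e^h = \begin{bmatrix} \frac{\partial \xi}{\partial x} & \frac{\partial \eta}{\partial x} \\ \frac{\partial \xi}{\partial y} & \frac{\partial \eta}{\partial y} \end{bmatrix} \begin{bmatrix} \frac{\partial N_0}{\partial \xi} & \frac{\partial N_1}{\partial \xi} & \frac{\partial N_2}{\partial \xi} & \frac{\partial N_3}{\partial \xi} \\ \frac{\partial N_0}{\partial \eta} & \frac{\partial N_1}{\partial \eta} & \frac{\partial N_2}{\partial \eta} & \frac{\partial N_3}{\partial \eta} \end{bmatrix} \begin{Bmatrix} \hat{u}_e^0 \\ \hat{u}_e^1 \\ \hat{u}_e^2 \\ \hat{u}_e^3 \end{Bmatrix} = \left( \frac{\partial N_a}{\partial \xi} \frac{\partial \xi}{\partial x} \right)^{T} \hat{u}_e^a$    Eq. 4•108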
The derivative of the shape functions with respect to the natural coordinates, $\partial N / \partial \xi$, is computed from the definition of the shape functions in Eq. 4•104. The term $\partial \xi / \partial x$ is computed as the inverse of the derivative of the coordinate transformation rule in Eq. 4•106,
$\partial \xi / \partial x = (\partial x / \partial \xi)^{-1}$    Eq. 4•109
That is, Eq. 4•108 gives the formula to compute the matrix of shape function derivatives (nen × dof = 4 × 2) for the element stiffness matrix in Eq. 4•107
$\nabla N = \frac{\partial N}{\partial \xi} \left( \frac{\partial x}{\partial \xi} \right)^{-1}$    Eq. 4•110
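As a cross-check of Eq. 4•104, Eq. 4•109 and Eq. 4•110, the chain of computations can be written out by hand for a single point (ξ, η). The following is a minimal sketch in plain C++, not the VectorSpace implementation; the names `Q4Coords`, `Q4Gradient` and `shape_gradient` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Q4Coords = std::array<std::array<double, 2>, 4>;  // x_e^a, a = 0..3

struct Q4Gradient {
    Q4Coords dNdx;   // row a holds (dN_a/dx, dN_a/dy)
    double detJ;     // det(dx/dxi)
};

// Nodal coordinates (xi_a, eta_a) of the referential element.
const double XI_A[4]  = {-1.0,  1.0, 1.0, -1.0};
const double ETA_A[4] = {-1.0, -1.0, 1.0,  1.0};

Q4Gradient shape_gradient(const Q4Coords& x, double xi, double eta) {
    // dN_a/dxi and dN_a/deta from N_a = (1 + xi_a xi)(1 + eta_a eta)/4
    double dNdxi[4][2];
    for (int a = 0; a < 4; ++a) {
        dNdxi[a][0] = 0.25 * XI_A[a] * (1.0 + ETA_A[a] * eta);
        dNdxi[a][1] = 0.25 * ETA_A[a] * (1.0 + XI_A[a] * xi);
    }
    // Jacobian matrix J[i][j] = d x_i / d xi_j, from x = N_a x_e^a
    double J[2][2] = {{0.0, 0.0}, {0.0, 0.0}};
    for (int a = 0; a < 4; ++a)
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) J[i][j] += x[a][i] * dNdxi[a][j];
    const double det = J[0][0] * J[1][1] - J[0][1] * J[1][0];
    assert(det > 0.0);   // element must not be inverted or degenerate
    // Jinv[j][k] = d xi_j / d x_k, the inverse of the Jacobian matrix
    const double Jinv[2][2] = {{ J[1][1] / det, -J[0][1] / det},
                               {-J[1][0] / det,  J[0][0] / det}};
    Q4Gradient g;
    g.detJ = det;
    for (int a = 0; a < 4; ++a)
        for (int k = 0; k < 2; ++k)
            g.dNdx[a][k] = dNdxi[a][0] * Jinv[0][k] + dNdxi[a][1] * Jinv[1][k];
    return g;
}
```

For a unit square element the Jacobian determinant is 0.25 everywhere, and the gradient of any linear nodal field is reproduced exactly, which makes a convenient sanity test.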
Since there is no heat source in the square area (“f = 0”) and, due to the symmetry of the boundary conditions, no temperature gradient can be generated in the x-direction, Eq. 4•111 reduces to
Figure 4•30 Conduction in a square insulated from two sides (top boundary: u_top = 30 °C; bottom boundary: u_bottom = 0 °C; insulated sides: q•n = 0).
That is, the temperature gradient in the y-direction is 10 (°C per unit length). In other words, the nodal solution in the row next to the bottom is u = 10 °C, and in the row next to the top is u = 20 °C. Program Listing 4•13 implements the element formulation for the stiffness matrix and force vector in Eq. 4•107 for this simple problem. The nodes and elements can be generated as
      int row_node_no = 4,
          row_element_no = row_node_no - 1;
      double v[2];
      for(int i = 0; i < row_node_no; i++)
         for(int j = 0; j < row_node_no; j++) {
            int nn = i*row_node_no+j;
            v[0] = (double)j; v[1] = (double)i;
            Node* node = new Node(nn, 2, v);
            the_node_array.add(node);
         }
      int ena[4];
      for(int i = 0; i < row_element_no; i++)
         for(int j = 0; j < row_element_no; j++) {
            int nn = i*row_node_no+j;
            ena[0] = nn; ena[1] = ena[0]+1; ena[3] = nn + row_node_no; ena[2] = ena[3]+1;
            int en = i*row_element_no+j;
            Omega_eh* elem = new Omega_eh(en, 0, 0, 4, ena);
            the_omega_eh_array.add(elem);
         }
#include "include\fe.h"
Omega_h::Omega_h() {
   int row_node_no = 4, row_element_no = row_node_no - 1;
   for(int i = 0; i < row_node_no; i++)                              // define nodes
      for(int j = 0; j < row_node_no; j++) {
         int nn = i*row_node_no+j; double v[2]; v[0] = (double)j; v[1] = (double)i;
         Node* node = new Node(nn, 2, v); the_node_array.add(node);
      }
   for(int i = 0; i < row_element_no; i++)                           // define elements
      for(int j = 0; j < row_element_no; j++) {
         int nn = i*row_node_no+j, en = i*row_element_no+j;
         int ena[4]; ena[0] = nn; ena[1] = ena[0]+1; ena[3] = nn + row_node_no; ena[2] = ena[3]+1;
         Omega_eh* elem = new Omega_eh(en, 0, 0, 4, ena); the_omega_eh_array.add(elem);
      }
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {             // define B.C.
   __initialization(df, omega_h);
   int row_node_no = 4, first_top_node_no = row_node_no*(row_node_no-1);
   for(int i = 0; i < row_node_no; i++) {
      the_gh_array[node_order(i)](0) = gh_on_Gamma_h::Dirichlet;                     // bottom boundary u = 0 °C
      the_gh_array[node_order(first_top_node_no+i)](0) = gh_on_Gamma_h::Dirichlet;   // top boundary u = 30 °C
      the_gh_array[node_order(first_top_node_no+i)][0] = 30.0;
   }
}
class HeatQ4 : public Element_Formulation { public:                  // define element
   HeatQ4(Element_Type_Register a) : Element_Formulation(a) {}
   Element_Formulation *make(int, Global_Discretization&);
   HeatQ4(int, Global_Discretization&);
};
Element_Formulation* HeatQ4::make(int en, Global_Discretization& gd) {
   return new HeatQ4(en,gd);
}
HeatQ4::HeatQ4(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
   Quadrature qp(2, 4);
   H1 Z(2, (double*)0, qp), Zai, Eta,
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 4, 2, qp);
   Zai &= Z[0]; Eta &= Z[1];
   N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;              // N_a(ξ,η) = (1/4)(1+ξ_a ξ)(1+η_a η)
   N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
   H1 X = N*xl; H0 Nx = d(N) * d(X).inverse(); J dv(d(X).det()); double k = 1.0;   // ∇N = (∂N/∂ξ)(∂x/∂ξ)^-1
   stiff &= (Nx * k * (~Nx)) | dv;                                   // ke = ∫ (∇N ⊗ κ∇N) det(∂x/∂ξ) dξ
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static HeatQ4 heatq4_instance(element_type_register_instance);
void output(Global_Discretization&);
int main() {
   int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh); U_h uh(ndf, oh);
   Global_Discretization gd(oh, gh, uh); Matrix_Representation mr(gd);
   mr.assembly();                                                    // assembly
   C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));                       // matrix solver
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();                      // update free and fixed dof
   cout << gd.u_h();                                                 // output
   return 0;
}
Listing 4•13 Two-dimensional heat conduction problem (project workspace file “fe.dsw”, project “2d_heat_conduction”).
We use a 2-D 2 × 2 Gaussian quadrature for all integrable objects (line 2). In lines 6 and 7, the shape functions “N” are defined according to Eq. 4•104. The coordinate transformation rule in line 8 is from Eq. 4•106. The derivatives of the shape functions are calculated according to Eq. 4•109 and Eq. 4•110. Line 10 on “the Jacobian” and line 12 on the stiffness matrix implement the first part of Eq. 4•107. The rest of the code is not very different from that of a 1-D problem.
      double coord[4][2] = {{0.0, 0.0}, {3.0, 0.0}, {3.0, 3.0}, {0.0, 3.0}};
      int control_node_flag[4] = {TRUE, TRUE, TRUE, TRUE};
      block(this, 4, 4, 4, control_node_flag, coord[0]);
The first integer argument of “block()” specifies the number of nodes generated row-wise, which is “4”. The second integer argument specifies the number of nodes generated column-wise. The following integer is the number of control nodes. In this example, the four control nodes are located at node numbers “0”, “3”, “15”, and “12”, ordered counter-clockwise starting from the lower-left corner. The components of the int array “control_node_flag” are all set to TRUE (=1). This is followed by the pointer to double, “coord[0]”. Notice that in the semantics of the C language (“pointer arithmetic”), “coord” is an array of four rows of two doubles each; the expression “coord[0]” selects the first row, at an offset of zero from the first memory address, and this row decays to a double* that points to the first coordinate.
Figure 4•31 16 nodes and 9 elements generated by a single “block()” function call.
An example with two “block()” function calls shows the potential for adapting to more complicated geometry (see Figure 4•32)
1 double coord1[4][2] = {{0.0, 0.0}, {3.0, 0.0}, {3.0, 3.0}, {0.0, 3.0}},
2 coord2[4][2] = {{3.0, 0.0}, {6.0, 0.0}, {6.0, 3.0}, {3.0, 3.0}};
3 int control_node_flag[4] = {1, 1, 1, 1};
4 block(this, 4, 4, 4, control_node_flag, coord1[0], 0, 0, 3, 3);
5 block(this, 4, 4, 4, control_node_flag, coord2[0], 3, 3, 3, 3);
In this example, the coordinates of the control nodes are given as rectangles for simplicity. The first int argument after the coordinates of type double* is the first node number generated, and the next int argument is the first element number generated. The last two int arguments are the “row-wise node number skip” and the “row-wise element number skip”. For example, in line 5 the second block definition has both its first node and first element numbered as “3”. The row-wise node number and element number both skip “3”. Therefore, the first node number of the second row is “10” and the first element number of the second row is “9”. When we define the first block in line 4, the nodes numbered “3”, “10”, “17” and “24” have already been defined. On line 5, when the “block()” function is called again, these four nodes will be defined again. In “fe.lib”, the “block()” function uses “Omega_h::set()” instead of “Omega_h::add()”; with “set()”, database integrity is maintained by checking the uniqueness of the node number. In the terminology of relational databases, the node number is the key of the database table in this case. If a node number already exists, it will not be added to the database again.
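The keyed-table behaviour described above can be illustrated with a standard container. This is a sketch under assumptions, not the “fe.lib” implementation: `NodeTable`, `set_node` and `add_block` are hypothetical names, and the grid generator only mimics the numbering rule of the two-block example (consecutive numbers along a row, then a row-wise skip).

```cpp
#include <array>
#include <cassert>
#include <map>

using Coord = std::array<double, 2>;
using NodeTable = std::map<int, Coord>;   // node number is the key

// Adds the node only if its number is not present yet; returns true when added.
bool set_node(NodeTable& table, int node_no, const Coord& x) {
    return table.emplace(node_no, x).second;
}

// Generates rows x cols nodes on the rectangle [x0, x0+width] x [0, height].
// Rows are numbered consecutively and `skip` numbers are left out before the
// next row starts. Returns how many nodes were newly added to the table.
int add_block(NodeTable& table, int rows, int cols, double x0, double width,
              double height, int first_node, int skip) {
    assert(rows > 1 && cols > 1);
    int added = 0;
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) {
            const int node_no = first_node + i * (cols + skip) + j;
            const Coord x = {x0 + width * j / (cols - 1), height * i / (rows - 1)};
            if (set_node(table, node_no, x)) ++added;
        }
    return added;
}
```

Calling `add_block` once with first node 0 and once with first node 3 (both with a skip of 3) adds 16 and then 12 nodes: the four numbers 3, 10, 17 and 24 are met a second time and are not added again.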
A third example, a cylinder consisting of eight blocks (see Figure 4•33), is even more challenging. The code for generating these eight blocks is
[Figure: node and element numbering of the two-block example; the nodes shared by both blocks are marked “common nodes”.]
Figure 4•33 A cylinder consisting of eight blocks. Open circles on the left-hand-side are control nodes. Tie nodes 164-132, 131-99, 98-66, 65-33, and 32-0 are shown on the right-hand-side.
Five tie nodes “164-132”, “131-99”, “98-66”, “65-33”, and “32-0” (see the right-hand-side of Figure 4•33) are generated when the “tail” of the eighth block comes back to meet the “head” of the first block. Tie nodes are generated when different node numbers with the same coordinates occur. In fe.lib the nodes that are generated later are “tied” to the nodes that are generated earlier. In this example nodes “0”, “33”, “66”, “99”, and “132” are generated when the first “block()” function call is made. When the eighth “block()” function call is made later, nodes “32”, “65”, “98”, “131”, and “164” will be generated. The tie nodes are formed when the coordinates are found to be the same as those of any node generated previously.
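The tie-node rule (a later node is tied to an earlier node with the same coordinates but a different number) can be sketched as a simple search. The struct `GeneratedNode`, the function `find_tie_nodes` and the tolerance are assumptions for illustration only; how “fe.lib” compares coordinates internally is not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct GeneratedNode { int node_no; double x, y; };   // in order of generation

// Returns (later node, earlier node) pairs for nodes that coincide in space.
// A quadratic search is used for clarity; it is not meant for large meshes.
std::vector<std::pair<int, int>> find_tie_nodes(
        const std::vector<GeneratedNode>& nodes, double tol = 1e-9) {
    assert(tol >= 0.0);
    std::vector<std::pair<int, int>> ties;
    for (std::size_t i = 0; i < nodes.size(); ++i)
        for (std::size_t j = 0; j < i; ++j) {
            const bool same_place = std::fabs(nodes[i].x - nodes[j].x) <= tol &&
                                    std::fabs(nodes[i].y - nodes[j].y) <= tol;
            if (same_place && nodes[i].node_no != nodes[j].node_no) {
                ties.emplace_back(nodes[i].node_no, nodes[j].node_no);
                break;   // tie to the earliest coinciding node only
            }
        }
    return ties;
}
```

With nodes 0 and 33 generated first and nodes 32 and 65 generated later at the same two positions, the function reports the pairs 32-0 and 65-33, in the same “later-earlier” order used in the text.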
For a heat conduction problem, if the boundary condition is symmetrical with respect to the center axis, the problem can be written in an axisymmetrical formulation and solved as a one-dimensional problem, such as in the subsection titled “Cylindrical Coordinates For Axisymmetrical Problem” on page 302. For the present case of a hollow cylinder made of one material, Eq. 4•111 expressed in cylindrical coordinates is1
1. p. 189 in Carslaw, H.S., and J.C. Jaeger, 1959, “Conduction of heat in solids”, 2nd ed. Oxford University Press, Oxford,
UK.
$\frac{d}{dr}\left( r\, \frac{du}{dr} \right) = 0$    Eq. 4•113
The general solution is u = A + B ln r. The constants A and B are determined by imposing the boundary conditions. For example, if the temperature is kept at ui on the inner side of the cylinder, at radius ri, and at uo on the outer side of the cylinder, at radius ro, we have the solution
$u_{exact} = \frac{u_i \ln\frac{r_o}{r} + u_o \ln\frac{r}{r_i}}{\ln\frac{r_o}{r_i}}$    Eq. 4•114
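For comparing nodal results against Eq. 4•114, the analytical solution is easy to evaluate directly. The function name `u_exact` and its argument order are hypothetical; any radii and temperatures used with it are illustrative values, not the data of the example in this section.

```cpp
#include <cassert>
#include <cmath>

// Steady radial temperature in a hollow cylinder (Eq. 4.114):
// u(ri) = ui on the inner surface and u(ro) = uo on the outer surface.
double u_exact(double r, double ri, double ro, double ui, double uo) {
    assert(ri > 0.0 && ro > ri && r >= ri && r <= ro);
    return (ui * std::log(ro / r) + uo * std::log(r / ri)) / std::log(ro / ri);
}
```

The expression returns ui at r = ri and uo at r = ro, and at the geometric mean radius, r = sqrt(ri ro), it gives the average of the two boundary temperatures.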
The finite element computation can be turned on using the same project “2d_heat_conduction” in project workspace “fe.dsw” by setting the macro definition “__TEST_CYLINDER” at compile time. The finite element solution in the radial direction is compared to the analytical solution of Eq. 4•114 in Figure 4•34.
As an additional exercise for the function “block()”, we proceed with a fourth example that uses three blocks to approximate a quarter of a circle. In Chapter 3 on page 195, we approximated a quarter of a circle with three “block()” function calls. In that case we had no provision for repeated definitions of nodes. In the present case, we try to minimize the number of tie nodes with the following code
Figure 4•35 Three block function calls to approximate a quarter of a circle. The right-hand-side shows the element numbering scheme and the left-hand-side shows the node numbering scheme.
The numbering of the elements and nodes for the first two blocks is similar to that of the second example. After the third block has been generated, 9 tie nodes will be generated: “45-36”, “46-37”, “47-38”, “48-39”, “49-40”, “54-41”, “59-42”, “64-43”, and “69-44”.
Lines 17-24 are the shape function definitions for the Lagrangian 4-to-9-node element that we have already used in Chapter 3. Lines 33 and 34 register the element formulations. The last element formulation registered has the element type number “0”. This number increases backwards through the element(s) registered earlier. We can also use the “block()” function call to define Lagrangian 9-node elements as (see Figure 4•36)
Line 1 specifies that the elements generated are Lagrangian 9-node elements. The last integer argument in lines 3 to 10 indicates that the element type number is 1, which corresponds to the “HeatQ9” element that we just registered. The computation with the Lagrangian 9-node elements can be activated by setting the macro definition “__TEST_QUADRATIC_CYLINDER” for the same project “2d_heat_conduction” in the project workspace file “fe.dsw”.
Figure 4•36 9-node Lagrangian quadrilateral elements generated by eight “block()” function
calls.
$q = -\kappa \nabla u$
This step is often referred to as post-processing in the finite element method. The derivatives of the shape functions, $\nabla N_a(\xi, \eta)$, at the Gaussian integration points are available in the constructor of class “Element_Formulation”. The gradients of the temperature distribution are approximated by
Therefore,
Therefore, after the nodal values, $\hat{u}_e^a$, are obtained, we can loop over each element to calculate the heat flux at its Gaussian integration points, such as,
Substituting Eq. 4•117 and Eq. 4•116 into Eq. 4•118, we have
$\int_{\Omega} N_a N_b\, d\Omega\; \hat{q}_e^b = \int_{\Omega} N_a\,(-\kappa \nabla N_b\, \hat{u}_e^b)\, d\Omega$    Eq. 4•119
We identify, in Eq. 4•119, the consistent mass matrix (with unit density), M, as
$M \equiv \int_{\Omega} N_a N_b\, d\Omega$    Eq. 4•120
The nodal heat flux, $\hat{q}_e^a$, can be solved from Eq. 4•119. This nodal solution procedure is described as smoothing or projection in finite element.1 An approximation of Eq. 4•120 that removes the need for a matrix solver is to define the lumped mass matrix as
$M_L \equiv \begin{cases} \sum_b \int_{\Omega} N_a N_b\, d\Omega, & a = b \\ 0, & a \neq b \end{cases}$    Eq. 4•121
This is the row-sum method, one among many ways of defining a lumped mass matrix.2
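The row-sum rule of Eq. 4•121 amounts to adding up each row of the consistent mass matrix and keeping the sum on the diagonal. A minimal sketch in plain C++ follows; the function name `lump_row_sum` and the dense matrix type are hypothetical and unrelated to the VectorSpace classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-sum lumping: returns the diagonal of M_L for a square consistent mass
// matrix M. The off-diagonal entries of M_L are zero by definition.
std::vector<double> lump_row_sum(const std::vector<std::vector<double>>& M) {
    std::vector<double> diag(M.size(), 0.0);
    for (std::size_t a = 0; a < M.size(); ++a) {
        assert(M[a].size() == M.size());   // M must be square
        for (double m_ab : M[a]) diag[a] += m_ab;
    }
    return diag;
}
```

For a two-node linear line element of unit length, whose consistent mass matrix is (1/6)[[2, 1], [1, 2]], the lumped diagonal is {0.5, 0.5}: the total mass is preserved and split equally between the two nodes.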
An alternative view of the Galerkin weighting of the weighted-residual statement in Eq. 4•118 is that we can write a least-squares approximation of the error as
1. p. 346 in Zienkiewicz, O.C., and R.L. Taylor, 1989, “The finite element method: basic formulation and linear problems”, 4th ed., vol. 1, McGraw-Hill, London, UK;
see also p. 226 in Hughes, T.J.R., “The finite element method: linear static and dynamic finite element analysis”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2. see appendix 8 in Zienkiewicz, O.C., and R.L. Taylor, 1989, “The finite element method: basic formulation and linear problems”, 4th ed., vol. 1, McGraw-Hill, London, UK.
      int main() {
         ...
         Matrix_Representation::Assembly_Switch = Matrix_Representation::NODAL_FLUX;
         mr.assembly(FALSE);
         cout << "nodal heat flux:" << endl;
         for(int i = 0; i < oh.total_node_no(); i++) {
            int node_no = oh.node_array()[i].node_no();
            cout << "{ " << node_no << "| "
                 << (mr.global_nodal_value()[i][0]) << ", "
                 << (mr.global_nodal_value()[i][1]) << "}" << endl;
         }
         ...
      }
$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0$,    Eq. 4•124
and an equation with zero vorticity component perpendicular to the x-y plane
$\frac{\partial u}{\partial y} - \frac{\partial v}{\partial x} = 0$    Eq. 4•125
From the continuity equation Eq. 4•124, it follows that u dy - v dx is a total derivative, defined as
$d\psi = u\, dy - v\, dx$    Eq. 4•126
so that
$u = \frac{\partial \psi}{\partial y}$, and $v = -\frac{\partial \psi}{\partial x}$    Eq. 4•127
Substituting Eq. 4•127 back into Eq. 4•124 gives the identity that the cross derivatives of ψ are equal. This is the condition in calculus for ψ to be a potential function. Integration of Eq. 4•126 along an arbitrary path, as shown in Figure 4•38a, gives the volume flux across the path. Along a streamline, the volume flux across it is zero by definition; that is, ψ is constant along a streamline. Therefore, the scalar function ψ is known as the stream function.
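Substituting Eq. 4•127 into the condition of irrotationality, Eq. 4•125, gives
$\operatorname{div}(\operatorname{grad} \psi) \equiv \nabla \cdot (\nabla \psi) \equiv \nabla^2 \psi = \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = 0$    Eq. 4•128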
Figure 4•38 (a) The volume flux across an arbitrary integration path is equal to u dy - v dx. If the integration path coincides with the streamline, the volume flux across the integration path should become zero by definition. (b) The circulation of a loop is zero for irrotational flow. Therefore, a potential function φ can be defined which only depends on position.
$\oint_{C} u \cdot dx = 0$    Eq. 4•129
From Figure 4•38b, we have two different integration paths, C1 and C2, between any two points, which together form a closed circuit.
$\int_{C_1} u \cdot dx + \int_{C_2} u \cdot dx = 0$, or $\int_{C_1} u \cdot dx = -\int_{C_2} u \cdot dx$    Eq. 4•130
Therefore, any two paths of integration give the same result; i.e., the integration depends only on the end-points. Therefore, we can define a potential function φ, i.e.,
$u = -\frac{\partial \phi}{\partial x}$, and $v = -\frac{\partial \phi}{\partial y}$    Eq. 4•132
Again, substituting Eq. 4•132 back into the condition of irrotationality, Eq. 4•125, we obtain the equality of the cross derivatives of φ, which is equivalent to asserting that dφ is an exact differential. Substituting Eq. 4•132 into the continuity equation, Eq. 4•124, we have another Laplace equation that
This relation ensures that the gradients of the stream function and the velocity potential are orthogonal to each other, since
$\nabla\phi \cdot \nabla\psi = \frac{\partial \phi}{\partial x} \frac{\partial \psi}{\partial x} + \frac{\partial \phi}{\partial y} \frac{\partial \psi}{\partial y} = 0$    Eq. 4•135
The gradients are the normals to the equipotential lines of φ and the streamlines of ψ. Therefore, the “contours” of φ and ψ are orthogonal to each other.
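The vanishing of the dot product in Eq. 4•135 can be checked term by term. The short derivation below assumes only the definitions in Eq. 4•127 and Eq. 4•132; equating the two expressions for each velocity component is the step taken here for illustration.

```latex
\begin{align*}
% Eq. 4.127 and Eq. 4.132 give two expressions for each velocity component
u &= \frac{\partial\psi}{\partial y} = -\frac{\partial\phi}{\partial x}, &
v &= -\frac{\partial\psi}{\partial x} = -\frac{\partial\phi}{\partial y} \\
% hence the derivatives of phi in terms of psi
\frac{\partial\phi}{\partial x} &= -\frac{\partial\psi}{\partial y}, &
\frac{\partial\phi}{\partial y} &= \frac{\partial\psi}{\partial x} \\
% substitute into the dot product of Eq. 4.135
\nabla\phi\cdot\nabla\psi
  &= \left(-\frac{\partial\psi}{\partial y}\right)\frac{\partial\psi}{\partial x}
   + \frac{\partial\psi}{\partial x}\,\frac{\partial\psi}{\partial y} = 0
\end{align*}
```

The two products cancel identically, so the orthogonality holds at every point where both functions are differentiable, independent of the particular flow.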
An example of a finite element problem1 (a confined flow around a cylinder, shown in Figure 4•39), in both the stream function (ψ) formulation and the velocity potential (φ) formulation, is solved using VectorSpace C++ Library and “fe.lib” in the following.
At the bottom-boundary ΓAB we choose the arbitrary reference value ψ0 = 0. Therefore, along the left-boundary ΓAE, Eq. 4•136 simplifies to ψ(y) = U0 y. The streamline at boundary ΓBC continues from the boundary ΓAB, which has ψ = ψ0 (= 0). On the top-boundary ΓED, y = 2, we have ψ(2) = 2U0. Notice that the corner E is shared by the boundaries ΓAE and ΓED. At the right-boundary ΓDC the horizontal velocity, u, is unknown, but the vertical velocity v = 0; i.e., v = −∂ψ/∂x = 0.
The Program Listing 4•14 implements Eq. 4•128 with the above boundary conditions. The only difference from the 2-D heat conduction problem is the post-processing of the derivative information.
1 if(Matrix_Representation::Assembly_Switch == Matrix_Representation::NODAL_FLUX) {
2 int velocity_no = 2;
3 the_element_nodal_value &= C0(nen*velocity_no, (double*)0);
4 C0 projected_nodal_velocity = SUBVECTOR("int, C0&", velocity_no, the_element_nodal_value);
5 H0 Velocity = INTEGRABLE_VECTOR("int, Quadrature", velocity_no, qp);
6 Velocity = 0.0;
7 for(int i = 0; i < nen; i++) {
1. p. 360-365 in Reddy, J.N., “An introduction to the finite element method”, 2nd ed., McGraw-Hill, Inc., New York.
From Eq. 4•127, the velocity is interpolated at the element formulation level as
Figure 4•39 (a) A confined flow around a circular cylinder (inflow velocity U0). Only the upper left quadrant is modeled due to symmetries of geometry, boundary conditions, and PDE. (b) Stream function B.C.: ψ = y U0, ψ = 2U0, ∂ψ/∂x = 0, and ψ = 0. (c) Velocity potential B.C.: −∂φ/∂x = U0, ∂φ/∂y = 0, φ = 0, and ∂φ/∂n = 0.
#include "include\fe.h"
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES;
Omega_h::Omega_h() {                                                 // define nodes and elements
   const double PI = 3.141592653589793, c = cos(PI/4.0), s = sin(PI/4.0),
      c1 = cos(PI/8.0), s1 = sin(PI/8.0), c2 = cos(3.0*PI/8.0), s2 = sin(3.0*PI/8.0);
   double coord0[4][2] = {{0.0, 0.0}, {3.0, 0.0}, {1.0, 2.0}, {0.0, 2.0}},
      coord1[5][2] = {{3.0, 0.0}, {4.0-c, s}, {3.0, 2.0}, {1.0, 2.0}, {4.0-c1, s1}},
      coord2[5][2] = {{4.0-c, s}, {4.0, 1.0}, {4.0, 2.0}, {3.0, 2.0}, {4.0-c2, s2}};
   int control_node_flag[5] = {TRUE, TRUE, TRUE, TRUE, TRUE};
   block(this, 5, 5, 4, control_node_flag, coord0[0], 0, 0, 8, 8);
   block(this, 5, 5, 5, control_node_flag, coord1[0], 4, 4, 8, 8);
   block(this, 5, 5, 5, control_node_flag, coord2[0], 8, 8, 8, 8);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {             // define B.C.
   __initialization(df, omega_h); const double U0 = 1.0; const double h_y = 0.5;
   for(int i = 0; i <= 12; i++) the_gh_array[node_order(i)](0) = gh_on_Gamma_h::Dirichlet;
   for(int i = 52; i <= 64; i++) {
      the_gh_array[node_order(i)](0) = gh_on_Gamma_h::Dirichlet;
      the_gh_array[node_order(i)][0] = 2.0*U0;
   }
   for(int i = 1; i <= 4; i++) {
      the_gh_array[node_order(i*13)](0) = gh_on_Gamma_h::Dirichlet;
      the_gh_array[node_order(i*13)][0] = (((double)i)*h_y)*U0;
   }
}
class Irrotational_Flow_Q4 : public Element_Formulation { public:    // define element formulation
   Irrotational_Flow_Q4(Element_Type_Register a) : Element_Formulation(a) {}
   Element_Formulation *make(int, Global_Discretization&);
   Irrotational_Flow_Q4(int, Global_Discretization&);
};
Element_Formulation* Irrotational_Flow_Q4::make(int en, Global_Discretization& gd) {
   return new Irrotational_Flow_Q4(en,gd);
}
Irrotational_Flow_Q4::Irrotational_Flow_Q4(int en, Global_Discretization& gd) :
   Element_Formulation(en, gd) {
   Quadrature qp(2, 4);
   H1 Z(2, (double*)0, qp), Zai, Eta,
      N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
   Zai &= Z[0]; Eta &= Z[1];
   N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;              // N_a(ξ,η) = (1/4)(1+ξ_a ξ)(1+η_a η)
   N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
   H1 X = N*xl; H0 Nx = d(N) * d(X).inverse(); J dv(d(X).det());
   if(Matrix_Representation::Assembly_Switch == Matrix_Representation::NODAL_FLUX) {
      int v_no = 2; the_element_nodal_value &= C0(nen*v_no, (double*)0);
      C0 projected_nodal_velocity = SUBVECTOR("int, C0&", v_no, the_element_nodal_value);
      H0 Velocity = INTEGRABLE_VECTOR("int, Quadrature", v_no, qp); Velocity = 0.0;
      for(int i = 0; i < nen; i++) {                                 // u_e^h = [∂N_a/∂y ψ_a, -∂N_a/∂x ψ_a]^T
         Velocity[0] += Nx[i][1]*(ul[i]+gl[i]);
         Velocity[1] += - Nx[i][0]*(ul[i]+gl[i]);
      }
      for(int i = 0; i < nen; i++) {                                 // û_e = (M_L)^-1 ∫ (N u_e^h) dΩ
         C0 lumped_mass(0.0);
         for(int k = 0; k < nen; k++) lumped_mass += (((H0)N[i])*((H0)N[k])) | dv;
         projected_nodal_velocity(i) = ( ((H0)N[i])*Velocity | dv ) / lumped_mass;
      }
   } else stiff &= (Nx * (~Nx)) | dv;                                // ke = ∫ (∇N ⊗ ∇N) det(∂x/∂ξ) dξ
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static Irrotational_Flow_Q4 flowq4_instance(element_type_register_instance);
int main() {
   int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh); U_h uh(ndf, oh);
   Global_Discretization gd(oh, gh, uh); Matrix_Representation mr(gd);
   mr.assembly(); C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));        // assembly and matrix solver
   gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h(); cout << gd.u_h();    // update free and fixed dof
   Matrix_Representation::Assembly_Switch = Matrix_Representation::NODAL_FLUX;   // post-processing for nodal velocity
   mr.assembly(FALSE); cout << "nodal velocity:" << endl;
   for(int i = 0; i < uh.total_node_no(); i++)
      cout << "{ " << oh.node_array()[i].node_no() << "| " <<
         (mr.global_nodal_value()[i]) << "}" << endl;
   return 0;
}
Listing 4•14 Stream function formulation of the potential flow problem (project “fe.ide”, project “potential_flow” with macro definition “__TEST_STREAM_FUNCTION” set).
$u_e^h = \left[ \frac{\partial N_a}{\partial y} \hat{\psi}_a,\; -\frac{\partial N_a}{\partial x} \hat{\psi}_a \right]^{T}$    Eq. 4•137
$\hat{u}_e = (M_L)^{-1} \int_{\Omega} (N u_e^h)\, d\Omega$    Eq. 4•138
where ML is the lumped mass matrix. The results of this computation with element discretization, streamlines,
and nodal velocity vectors are shown in Figure 4•40.
Figure 4•40 Finite element discretization (open circles are nodes), streamlines (ψ = 0-2.0 at 0.25 intervals), and nodal velocity vectors shown as arrows.
\[ u = -\frac{\partial \phi}{\partial x}, \text{ and } v = -\frac{\partial \phi}{\partial y} \]   Eq. 4•139
At the left boundary ΓAE of Figure 4•39c, from u = - ∂φ/∂x, we have ∂φ/∂x = - U0. At the top and bottom boundaries ΓAB and ΓED we have ∂φ/∂y = 0. On the cylinder surface ΓBC, ∂φ/∂n = 0, where n is its outward normal. At the right boundary ΓCD a reference value of φ is set to zero.
The code is implemented in the same project file without the macro definition “__TEST_STREAM_FUNCTION” set at compile time. The results of this computation with element discretization, velocity equipotential lines, and nodal velocity vectors are shown in Figure 4•41.
Inspecting Figure 4•40 and Figure 4•41, we see that the contour lines of the stream function ψ and the velocity potential φ are orthogonal to each other at every point. This is consistent with the orthogonality condition proved in Eq. 4•135. The contours of the stream function ψ and the velocity potential φ make a smooth mesh. In fact, this is a popular method for generating a finite element mesh automatically.1
Figure 4•41 Finite element discretization (open circles are nodes), velocity equipotential lines (φ = 0-5.0 at 0.5 intervals), and nodal velocity vectors shown as arrows.
1. p. 99-106 in George, P.L., 1991, “Automatic mesh generation: application to finite element methods”, John Wiley & Sons, Masson, Paris, France.
where t is the traction and n is the outward unit surface normal. The weighted-residual statement of the Eq. 4•140
is
Integrating by parts and then applying the divergence theorem of Gauss, we have
where the gradient operator, “grad”, and its relation to divergence operator, “div”, are
respectively. The trace operator, “tr”, is the summation of all diagonal entries. The operator “:”, in Eq. 4•143, is the double contraction. Since the variation “w” is chosen to be homogeneous on Γg, the second term, the boundary integral in Eq. 4•143, can be restricted to Γh as
We first develop the formulation in tensorial notation for its clarity of physical meaning. The Cauchy stress tensor, σ, in Eq. 4•145 can be decomposed as
where λ and µ are the Lamé constants. µ is often denoted as G for the shear modulus. The operator def u is
defined as the symmetric part of grad u; i.e.,
where the superscript “s” denotes the symmetric part of grad ( ≡ ∇ ), and ε is the (infinitesimal) strain tensor, and the skew-symmetric part of grad u is defined as
\[ \operatorname{rot} \mathbf{u} \equiv \frac{1}{2} \left( \operatorname{grad} \mathbf{u} - ( \operatorname{grad} \mathbf{u} )^T \right) \]   Eq. 4•149
def u and rot u are orthogonal to each other. From Eq. 4•148 and Eq. 4•149, we have the additive decomposition of grad u as
Recalling the first term in Eq. 4•145, and substituting the constitutive equations Eq. 4•146 and Eq. 4•147
Note that,
The last identity is from the second part of the Eq. 4•144. With the Eq. 4•150 and the orthogonal relation of def
u and rot u, we can verify that
grad w : (2µ def u) = (def w + rot w) : (2µ def u) = 2 µ (def w : def u) Eq. 4•153
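The decomposition and the orthogonality used above can be checked numerically on small tensors. The following is a minimal sketch in plain C++ (it does not use the VectorSpace C++ Library); the type `Tensor2` and the helper names `def_part`, `rot_part`, and `ddot` are hypothetical names introduced only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// 2x2 second-order tensor, row-major (hypothetical helper type for this sketch).
using Tensor2 = std::array<std::array<double, 2>, 2>;

// Symmetric part: def u = (grad u + (grad u)^T) / 2.
Tensor2 def_part(const Tensor2& g) {
    Tensor2 d{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            d[i][j] = 0.5 * (g[i][j] + g[j][i]);
    return d;
}

// Skew-symmetric part: rot u = (grad u - (grad u)^T) / 2, as in Eq. 4•149.
Tensor2 rot_part(const Tensor2& g) {
    Tensor2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r[i][j] = 0.5 * (g[i][j] - g[j][i]);
    return r;
}

// Double contraction a : b = a_ij b_ij.
double ddot(const Tensor2& a, const Tensor2& b) {
    double s = 0.0;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            s += a[i][j] * b[i][j];
    return s;
}
```

For any two gradients `gw` and `gu`, `ddot(rot_part(gw), def_part(gu))` vanishes up to round-off, so `ddot(gw, def_part(gu))` equals `ddot(def_part(gw), def_part(gu))`; this is the step used in Eq. 4•153.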
With Eq. 4•152 and Eq. 4•153, the Eq. 4•151 becomes
With the element shape function defined, e.g., as Eq. 4•104, the element stiffness matrix is
where indices {a, b} in superscripts and subscripts are the element node numbers.
In the indicial notation, we have the infinitesimal strain tensor εij(u) = def u = u(i,j) (the parentheses in the subscript denote the symmetric part), and the generalized Hooke’s law as
where δij is the Kronecker delta (δij = 1 if i = j, otherwise δij =0). The equivalence of Eq. 4•155 is
The last identity is due to the minor symmetry of cijkl. The element stiffness matrix for the indicial notation formulation is
where the indices {i, j} are the degree of freedom numbers (0 ≤ i, j < ndf, where ndf is the “number of degrees of freedom”, which equals nsd, the “number of spatial dimensions”, in the present case; i.e., 0 ≤ k < nsd), and the indices {a, b} are element node numbers (0 ≤ a, b < nen, where nen is the “element node number”). The relation of the indices {p, q} and {i, a, j, b} is defined as
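\[ \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial u}{\partial x} \\ \dfrac{\partial v}{\partial y} \\ \dfrac{\partial u}{\partial y} + \dfrac{\partial v}{\partial x} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial}{\partial x} & 0 \\ 0 & \dfrac{\partial}{\partial y} \\ \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial x} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}, \text{ and } \boldsymbol{\sigma} = \begin{bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{bmatrix} \]   Eq. 4•164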
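In the plane strain case, we can show that the fourth-order tensor D becomes a matrix as
\[ D = \begin{bmatrix} \lambda + 2\mu & \lambda & 0 \\ \lambda & \lambda + 2\mu & 0 \\ 0 & 0 & \mu \end{bmatrix} \]   Eq. 4•166
\[ \bar{\lambda} = \frac{2\lambda\mu}{\lambda + 2\mu} \]   Eq. 4•167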
In engineering applications, Young’s modulus, E, and Poisson’s ratio, ν, are often given instead of the Lamé constants. They are related as
\[ \lambda = \frac{\nu E}{(1 + \nu)(1 - 2\nu)}, \text{ and } \mu = \frac{E}{2(1 + \nu)} \]   Eq. 4•168
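As a quick numerical check of Eq. 4•167 and Eq. 4•168, the conversion can be sketched in plain C++. The struct `Lame` and the function names below are hypothetical and are not part of the VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>

// Pair of Lame constants (hypothetical helper type for this sketch).
struct Lame { double lambda; double mu; };

// Eq. 4•168: Lame constants from Young's modulus E and Poisson's ratio nu.
Lame lame_from_E_nu(double E, double nu) {
    assert(nu > -1.0 && nu < 0.5);   // the formula is singular at nu = 0.5
    return { nu * E / ((1.0 + nu) * (1.0 - 2.0 * nu)),
             E / (2.0 * (1.0 + nu)) };
}

// Eq. 4•167: modified lambda that replaces lambda in the plane stress case.
double plane_stress_lambda(const Lame& c) {
    return 2.0 * c.lambda * c.mu / (c.lambda + 2.0 * c.mu);
}
```

With E = 30.0e6 and ν = 0.25 (the values used in the beam example), this gives λ = µ = 1.2e7 and a plane stress value of 8.0e6, so that the modified λ plus 2µ equals E/(1 − ν²) = 3.2e7.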
rewritten Eq. 4•105 for a = 0, 1, ..., (nen - 1), and i = 0, ..., (ndf - 1)
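\[ B_a = \begin{bmatrix} \dfrac{\partial N_a}{\partial x} & 0 \\ 0 & \dfrac{\partial N_a}{\partial y} \\ \dfrac{\partial N_a}{\partial y} & \dfrac{\partial N_a}{\partial x} \end{bmatrix}, \text{ and } B = \begin{bmatrix} B_0 & B_1 & B_2 & \dots & B_{n-1} \end{bmatrix} \]   Eq. 4•171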
\[ v = -\frac{PL^3}{3EI} \left[ 1 + \frac{3(1 + \nu)}{L^2} \right] \]   Eq. 4•175
[Figure: beam 10 in. long and 2 in. deep under an end shear τy = 150 psi. First mesh (nodes 0-14, elements 0-7): nodal loads fy,0 = fy,10 = -75 and fy,5 = -150; constraints ux,4 = ux,14 = 0, and ux,9 = uy,9 = 0. Second mesh (nodes 0-14, elements 0-1): nodal loads fy,0 = fy,10 = -50 and fy,5 = -200, with the same constraints.]
1. p. 473 in Reddy, J.N. 1993, “ An introduction to the finite element method”, 2nd ed., McGraw-Hill, Inc., New York.
Line 17 is the computation of the derivatives of the shape function “Nx” (see Figure 4•43). “Nx” is then partitioned into the submatrix “w_x”. The regular increment submatrices wx &= w_x[0][0] and wy &= w_x[0][1] are
#include "include\fe.h"
static const double L_ = 10.0; static const double c_ = 1.0; static const double h_e_ = L_/2.0;
static const double E_ = 30.0e6; static const double v_ = 0.25;      // Young’s modulus and Poisson ratio
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
static const double lambda_bar = 2*lambda_*mu_/(lambda_+2*mu_);      // plane stress λ modification
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES;
Omega_h::Omega_h() {                                                 // generate nodes and elements
  double x[4][2] = {{0.0, 0.0}, {10.0, 0.0}, {10.0, 2.0}, {0.0, 2.0}}; int flag[4] = {1, 1, 1, 1};
  block(this, 3, 5, 4, flag, x[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) { __initialization(df, omega_h);
  // B.C.: u4 = u9 = v9 = u14 = 0
  the_gh_array[node_order(4)](0) = the_gh_array[node_order(9)](0) =
  the_gh_array[node_order(9)](1) = the_gh_array[node_order(14)](0) = gh_on_Gamma_h::Dirichlet;
  the_gh_array[node_order(0)](1) = the_gh_array[node_order(5)](1) =
  the_gh_array[node_order(10)](1) = gh_on_Gamma_h::Neumann;
  the_gh_array[node_order(0)][1] = the_gh_array[node_order(10)][1] = -75.0;  // τy0 = τy10 = -75, τy5 = -150
  the_gh_array[node_order(5)][1] = -150.0;
}
class ElasticQ4 : public Element_Formulation { public:
  ElasticQ4(Element_Type_Register a) : Element_Formulation(a) {}
  Element_Formulation *make(int, Global_Discretization&);
  ElasticQ4(int, Global_Discretization&);
};
Element_Formulation* ElasticQ4::make(int en, Global_Discretization& gd) {
  return new ElasticQ4(en,gd);
}
static const double a_ = E_ / (1-pow(v_,2));
static const double Dv[3][3] = {{a_, a_*v_, 0.0}, {a_*v_, a_, 0.0 }, {0.0, 0.0, a_*(1-v_)/2.0} };
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
ElasticQ4::ElasticQ4(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
  Quadrature qp(2, 4);
  H1 Z(2, (double*)0, qp), Zai, Eta,
     N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
  Zai &= Z[0]; Eta &= Z[1];
  // N_a(xi, eta) = (1 + xi_a xi)(1 + eta_a eta)/4
  N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
  N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
  H1 X = N*xl;
  // grad N = (dN/dxi)(dx/dxi)^-1 and the Jacobian determinant, as in Listing 4•14
  H0 Nx = d(N) * d(X).inverse(); J dv(d(X).det());
  H0 w_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, nsd, Nx), wx, wy, B;
  wx &= w_x[0][0]; wy &= w_x[0][1];
  // B_a = [dN_a/dx, 0; 0, dN_a/dy; dN_a/dy, dN_a/dx]
  B &= (~wx || C0(0.0)) &
       (C0(0.0) || ~wy ) &
       (~wy || ~wx );
  stiff &= ((~B) * (D * B)) | dv;    // ke = integral of B_a^T D B_b over the element
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static ElasticQ4 elasticq4_instance(element_type_register_instance);
int main() { int ndf = 2; Omega_h oh; gh_on_Gamma_h gh(ndf, oh); U_h uh(ndf, oh);
  Global_Discretization gd(oh, gh, uh); Matrix_Representation mr(gd); mr.assembly();
  C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
  gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();
  cout << gd.u_h();
  return 0;
}
Listing 4•15 Plane elasticity (project workspace file “fe.dsw”, project “2d_beam” with macro definition “__TEST_B_MATRIX_CONCATENATE_EXPRESSION_SUBMATRIX” set at compile time).
Line 4 takes the size and type of the transpose of “wx”, then re-assigns its values to zero. Line 7 uses the unary positive operator “+” to convert an Integrable_Nominal_Submatrix (of object type H0) into a plain Integrable_Matrix (also of object type H0). We note that the expression “U[2][1]” can be written as “(e3[2] % e[1]) * (~E)” without having to define the additional symbol “U = (e3%e)*(~E)”. One needs to set both macro definitions “__TEST_B_MATRIX_CONCATENATE_EXPRESSION_SUBMATRIX” and “__TEST_BASIS” for this implementation at compile time.
The semantics of the construction of the B-matrix above is a bottom-up process. We first define the components of the B-matrix, then build the B-matrix from these pre-constructed components. The semantics of the program code can also be constructed in the reverse order; i.e., as a top-down process. We may want to construct the B-matrix first, giving its size and initializing it with default values (“0.0”). Then, we can assign each component of the B-matrix its intended values.
The B-matrix is constructed first, then its components {εx, εy, γxy}T are assigned according to the definition in the first part of Eq. 4•164 and Eq. 4•170, where the strain “epsilon” is a submatrix referring to the “B” matrix. For this implementation the same project “2d_beam” in project workspace file “fe.dsw” can be used with only the macro definition “__TEST_B_MATRIX_CONCATENATE_EXPRESSION_TOP_DOWN” set.
We see that this implementation is a direct image of the right-hand-side block in Figure 4•43. In the above code, no submatrix facility is used; only the concatenate operator “|” is used to build the B-matrix from the ground up.
Comparing the bottom-up with the top-down algorithms, the only difference is the semantics. In the last algorithm, we have flattened the submatrices into a simple matrix. In doing so, we avoid the requirement for the submatrix features supported by the VectorSpace C++ Library. We may want to optimize the rapid-prototyping code by eliminating the features supported by the VectorSpace C++ Library step by step, so that the overhead caused by the use of the VectorSpace C++ Library can be alleviated.
An even more Fortran-like equivalent implementation is as follows1 (set the macro definition to nothing)
1. p. 153 in Thomas J.R. Hughes, 1987, “ The finite element method: Linear and dynamic finite element analysis.”, Prentice-
Hall, Englewood Cliffs, New Jersey.
In lines 2-14, provision is made to eliminate the multiplications with the “0” components in BTDB. Only the “nodal submatrices” keab in the diagonal and upper triangular part of ke are computed. The lower triangular part of the matrix is then determined by symmetry with keab = (keba)T (lines 15-21). We recognize that this is the idiom of a low-level language, with indices used to access the submatrices of the matrix ke as “k[ndf a + i][ndf b + j]”. In this way, we may avoid using the submatrix facility in the VectorSpace C++ Library entirely. Certainly the optimized low-level code is much longer, less readable, and harder for programmers to maintain. Nonetheless, this last version can easily be optimized even more aggressively in plain C without using the VectorSpace C++ Library at all. The last step is to perform the numerical integration in the outermost loop, where we evaluate all values at the Gaussian quadrature points and multiply these values by their corresponding weights.
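The low-level idiom described above can be sketched in plain C++ without the library. The sketch below is not the listing from the text; it assumes, for brevity, an axis-aligned rectangular element with half-widths `hx` and `hy` (so the Jacobian is constant), and the names `rectangle_stiffness`, `Stiffness`, `xi_a`, and `eta_a` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int nen = 4, ndf = 2, neq = nen * ndf;
using Stiffness = std::array<std::array<double, neq>, neq>;

// Natural coordinates of the four corner nodes.
const double xi_a[nen]  = {-1.0,  1.0, 1.0, -1.0};
const double eta_a[nen] = {-1.0, -1.0, 1.0,  1.0};

// Element stiffness of an axis-aligned rectangle (assumption of this sketch).
// lam and mu are the Lame constants; entries are addressed as k[ndf*a+i][ndf*b+j].
Stiffness rectangle_stiffness(double hx, double hy, double lam, double mu) {
    assert(hx > 0.0 && hy > 0.0);
    const double g = 1.0 / std::sqrt(3.0);       // 2x2 Gauss points at +-g, unit weights
    Stiffness k{};
    for (int p = 0; p < 4; ++p) {                // outermost loop: quadrature points
        const double xi = xi_a[p] * g, eta = eta_a[p] * g;
        const double w = 1.0 * hx * hy;          // weight times det(dx/dxi)
        double Nx[nen], Ny[nen];                 // shape function derivatives
        for (int a = 0; a < nen; ++a) {
            Nx[a] = xi_a[a] * (1.0 + eta_a[a] * eta) / (4.0 * hx);
            Ny[a] = eta_a[a] * (1.0 + xi_a[a] * xi) / (4.0 * hy);
        }
        for (int a = 0; a < nen; ++a)
            for (int b = a; b < nen; ++b) {      // diagonal and upper nodal submatrices only
                k[ndf*a][ndf*b]     += w * ((lam + 2*mu) * Nx[a]*Nx[b] + mu * Ny[a]*Ny[b]);
                k[ndf*a+1][ndf*b+1] += w * ((lam + 2*mu) * Ny[a]*Ny[b] + mu * Nx[a]*Nx[b]);
                k[ndf*a][ndf*b+1]   += w * (lam * Nx[a]*Ny[b] + mu * Ny[a]*Nx[b]);
                k[ndf*a+1][ndf*b]   += w * (lam * Ny[a]*Nx[b] + mu * Nx[a]*Ny[b]);
            }
    }
    for (int p = 0; p < neq; ++p)                // lower triangle by symmetry
        for (int q = 0; q < p; ++q) k[p][q] = k[q][p];
    return k;
}
```

For a 2 × 2 square element with λ = µ = 1 the first diagonal entry is (λ + 3µ)/3 = 4/3, and rigid translations and rotations of the nodes produce no nodal forces, which gives a simple sanity check of the indexing.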
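\[ \lambda ( N_{a,i} N_{b,j} ) = \lambda \begin{bmatrix} \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} \\ \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} \]   Eq. 4•176
Note that \bar{λ} of Eq. 4•167 may replace λ for the plane stress case. The rest of the integrand of Eq. 4•161 is its deviatoric part
\[ \mu ( \delta_{ij} ( N_{a,k} N_{b,k} ) + ( N_{a,j} N_{b,i} ) ) = \mu \begin{bmatrix} \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} + \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} & 0 \\ 0 & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} + \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} + \mu \begin{bmatrix} \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} \\ \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} \]
\[ = \mu \begin{bmatrix} 2\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} + \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} \\ \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} + 2\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} \]   Eq. 4•177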
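The index expressions λ N_{a,i} N_{b,j} and µ(δ_ij N_{a,k} N_{b,k} + N_{a,j} N_{b,i}) translate almost literally into two nested loops. The following plain C++ sketch evaluates the 2 × 2 nodal integrand at a single quadrature point; it is independent of the library listing discussed in the text, and the names `Vec2`, `Mat2`, and `nodal_integrand` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;                 // {dN/dx, dN/dy} of one node
using Mat2 = std::array<std::array<double, 2>, 2>;  // 2x2 nodal submatrix

// Integrand of the nodal submatrix k_ab at one quadrature point:
//   lam * N_{a,i} N_{b,j} + mu * (delta_ij N_{a,k} N_{b,k} + N_{a,j} N_{b,i})
Mat2 nodal_integrand(const Vec2& dNa, const Vec2& dNb, double lam, double mu) {
    const double dot = dNa[0] * dNb[0] + dNa[1] * dNb[1];   // N_{a,k} N_{b,k}
    Mat2 k{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            k[i][j] = lam * dNa[i] * dNb[j]
                    + mu * ((i == j ? dot : 0.0) + dNa[j] * dNb[i]);
    return k;
}
```

For example, with dNa = {1, 2}, dNb = {3, 5}, λ = 2 and µ = 3, the result is {{54, 28}, {27, 89}}, which agrees entry by entry with B_a^T D B_b for the plane strain D of Eq. 4•166.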
Eq. 4•176 and Eq. 4•177 are implemented as (by setting, at compile time, the macro definition “__TEST_INDICIAL_NOTATION_FORMULATION”)
Lines 4-8 implement the integrand of the volumetric element stiffness by Eq. 4•176 and lines 11-18 implement the integrand of the deviatoric element stiffness by Eq. 4•177. Note that the unary positive operators in front of both line 6 and line 13 are conversion operations that convert an Integrable_Nominal_Submatrix (of object type H0) into an Integrable_Matrix (of type H0). An Integrable_Submatrix version of this implementation will be
The same implementation with one-by-one concatenation operations “||” and “&&” will be
1. p. 155 in Thomas J.R. Hughes, 1987, “ The finite element method: Linear and dynamic finite element analysis.”, Prentice-
Hall, Englewood Cliffs, New Jersey.
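\[ k_e^{ab}\ (\text{temporary}) = \int_\Omega \begin{bmatrix} \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} \\ \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} d\Omega \]   Eq. 4•178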
Then, ke is overwritten by the rest of the code. Lines 13-34 have no integrable objects involved. In both of these two parts, symmetry is taken into account, and only the components of the diagonal nodal submatrices and the upper-triangular nodal submatrices belonging to the upper triangular part of ke are calculated, to reduce the number of calculations. First, line 17 calculates the following quantity and stores it in the variable “temp”
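\[ \int_\Omega \left( \frac{\partial N_a}{\partial x}\frac{\partial N_b}{\partial x} + \frac{\partial N_a}{\partial y}\frac{\partial N_b}{\partial y} \right) d\Omega \]   Eq. 4•179
\[ \int_\Omega \begin{bmatrix} (\lambda + 2\mu)\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} + \mu\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} & \emptyset \\ \emptyset & (\lambda + 2\mu)\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} + \mu\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} \end{bmatrix} d\Omega \]   Eq. 4•180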
where the null symbol “ ∅ ” denotes the corresponding components in the matrix are not calculated. Lines 22-24,
and 25-30 get the off-diagonal components of nodal submatrices
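\[ \int_\Omega \begin{bmatrix} \emptyset & \lambda\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} + \mu\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} \\ \lambda\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} + \mu\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} & \emptyset \end{bmatrix} d\Omega \]   Eq. 4•181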
Special care is taken in lines 22-23, when the nodal submatrices are diagonal nodal submatrices. In this case both node number indices are “a”, and we have Na,x Na,y = Na,y Na,x. That is, the off-diagonal components of the diagonal nodal submatrices in Eq. 4•181 are reduced to
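\[ (\lambda + \mu) \int_\Omega \begin{bmatrix} \emptyset & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_a}{\partial y} \\ \emptyset & \emptyset \end{bmatrix} d\Omega \]   Eq. 4•182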
For these diagonal nodal submatrices the off-diagonal components calculation is therefore further simplified to
lines 22-23. Notice that components in lower-left corner of Eq. 4•182 are not calculated, because these compo-
The inner product gives a scalar. The implementation for the coordinate free tensorial formulation will be based
on Eq. 4•156 which is
where N_a ∈ V^h, and superscripts and subscripts {a, b} are the element node numbers. The element variables, e.g., in 2-D elasticity for the bilinear 4-node element, are arranged in the order of u = {u0, v0, u1, v1, u2, v2, u3, v3}T. The variable vector u has the size of (ndf × nen) = 2 × 4 = 8. Therefore, we identify that the finite element space Vh(Ωe) has its inner product operation producing an element stiffness matrix, ke, of size (ndf × nen) × (ndf × nen) = 8 × 8. We also observe that the differential operators “div” and “def”, and the double contraction “:”, on the finite element space Vh(Ωe) all need to be defined. The closest thing to the finite element space Vh(Ωe) in the VectorSpace C++ Library is the type H1, which is an integrable type differentiable up to the first order. However, H1 is certainly not a finite element space. The inner product of objects defined by H1 will not generate a (ndf × nen) × (ndf × nen) element stiffness matrix, nor does it have any knowledge of the “div”, “def”, or “:” operators. We may implement a customized class “H1_h”, not intended for code reuse, in an ad hoc manner for the finite element space Vh(Ωe) as
The differential operators “div” and “def” are applied to the finite element space—Vh(Ωe) which can be imple-
mented as an abstract data type “H1_h”. The return values of these differential operators are of yet another
abstract data type “H0_h”. In the terminology of object-oriented analysis, H0_h “IS-A” H0 type. The “IS-A”
relationship between H0_h and H0 is manifested by the definition of class H0_h as publicly derived from class H0
(line 11). We can view class “H0_h” as an extension of class H0 to define the double contraction operation “:”.
The double contraction operator is defined as a public member binary operator “H0_h::operator ^ (const
H0_h&)” (line 14). We emphasize that with the public derived relationship, class H0_h inherits all the public
interfaces and implementations of class H0. Moreover, we design to have H1_h used in the element formulation
as close to the mathematical expression as possible. Lines 16-20 are auxiliary free functions defined to provide
better expressiveness, such that, we may write in element formulation as simple as
which is almost an exact translation of the mathematical expression of Eq. 4•184. The constructor of class H1_h takes two arguments of type H1. The first argument is the shape functions “N”, and the second argument is the physical coordinates “X”. The derivatives of the shape functions can be computed from these two objects as
H0 Nx = d(N) * d(X).inverse();
These two objects have been defined earlier in the element formulation. Now we get to the definition of the
divergence operator “div” according to Eq. 4•144
Or in the form of the nodal subvector (row-wise) for the finite element space Vh(Ωe) as
\[ \begin{bmatrix} \dfrac{\partial N_a}{\partial x} & \dfrac{\partial N_a}{\partial y} \end{bmatrix} \]   Eq. 4•186
H0_h H1_h::div_() {
  H0 Nx = n.d() * x.d().inverse();
  H0 w_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, 2, Nx);
  H0 wx = (+w_x[0][0]), wy = (+w_x[0][1]);
  C0 u = BASIS("int", 2), E = BASIS("int", 4);
  H0 ret_val = wx(0)*(u[0]*E) + wy(0)*(u[1]*E); // Eq. 4•186
  return ~(+ret_val);
}
This divergence operation will return an Integrable_Matrix of size 1 × 8. Therefore, the inner product,
“ div • div ”, not with respect to node number, will return an element stiffness matrix object (an
Integrable_Matrix of type H0) of size 8 × 8. The gradient operator “grad” is defined (also in Eq. 4•144)
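\[ \operatorname{grad} \mathbf{u} = \nabla \otimes \mathbf{u} = u_{i,j} = \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial v}{\partial x} \\ \dfrac{\partial u}{\partial y} & \dfrac{\partial v}{\partial y} \end{bmatrix} \]   Eq. 4•187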
Notice that we arrange “u”, “v” in row-wise order to be compatible with the order of the variable vector in the element formulation. This special ordering makes the gradient tensor in Eq. 4•187 the transpose of the ordinary mathematical definition of grad u. The nodal submatrices of the return value of the “grad” operator are
\[ \begin{bmatrix} \dfrac{\partial N_a}{\partial x} & \dfrac{\partial N_a}{\partial x} \\ \dfrac{\partial N_a}{\partial y} & \dfrac{\partial N_a}{\partial y} \end{bmatrix} \]   Eq. 4•188
Eq. 4•188, for the “grad” operator on Vh, should return a 2 × 8 Integrable_Matrix, and it is implemented as
H0_h H1_h::grad_() {
  H0 Nx = n.d() * x.d().inverse();
  H0 w_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, 2, Nx), wx, wy;
The operator “gradT” is defined independently from “grad” for the finite element space Vh(Ωe); it cannot be obtained by transposing the resulting matrix of “grad”. This is because the transpose operation on grad is with respect to its spatial derivatives only, not with respect to the element node number index a. Both differential operators “grad” and “gradT” have a return value of size 2 × 8 and of type H0_h, which is derived from the Integrable_Matrix of type H0. The operator gradT has its nodal submatrices
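\[ \begin{bmatrix} \dfrac{\partial N_a}{\partial x} & \dfrac{\partial N_a}{\partial y} \\ \dfrac{\partial N_a}{\partial x} & \dfrac{\partial N_a}{\partial y} \end{bmatrix} \]   Eq. 4•189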
which is implemented as
H0_h H1_h::grad_t_() {
  H0 Nx = n.d() * x.d().inverse();
  H0 w_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, 2, Nx), wx, wy;
  wx &= ~(+w_x[0][0]); wy &= ~(+w_x[0][1]);
  C0 eu = BASIS("int", 4),
     e = BASIS("int", 2),
     E1 = BASIS("int", 1),
     E2 = BASIS("int", 4),
     a = (e%eu)*(E1%E2);
  H0 ret_val = wx*a[0][0] + wy*a[0][2] + // Eq. 4•189
               wx*a[1][1] + wy*a[1][3];
  return ret_val;
}
The operator “def”, for the finite element space Vh(Ωe), is defined according to Eq. 4•148
\[ \operatorname{def} \mathbf{u} \equiv \frac{1}{2} \left( \operatorname{grad} \mathbf{u} + ( \operatorname{grad} \mathbf{u} )^T \right) \]   Eq. 4•190
With both “grad” and “gradT” already defined, “def” can be implemented simply as
The differential operator def also returns a 2 × 8 H0_h type object. The double contraction is defined in Eq. 4•154
The implementation of the binary operator “^” as the double contraction operator is completely ad hoc. At the discretion of the programmer, it is assumed that the two operands of the binary operator are the return values of the def operator. The return value has the size of 8 × 8. This is evident from the left-hand-side of Eq. 4•191.
1 H0 H0_h::operator^(const H0_h& a) {
2 H0 ret_val(8, 8, (double*)0, a.quadrature_point());
3 H0 ret_sub = INTEGRABLE_SUBMATRIX("int, int, H0&", 2, 2, ret_val);
4 H0 def_w = INTEGRABLE_SUBMATRIX("int, int, H0&", 2, 4, a);
5 for(int a = 0; a < 4; a++)
6 for(int b = 0; b < 4; b++) {
7 H0 def_wa = +def_w(0,a), def_wb = +def_w(0,b);
8 H0 def_def = (~def_wa)*def_wb; // (def u)Ta (def u)b
9 H0 dds = INTEGRABLE_SUBMATRIX("int, int, H0&", 2, 2, def_def);
10 ret_sub(a,b) = +(dds(0,0)+dds(1,1)); // trace of “(def u)Ta (def u)b”
11 }
12 return ret_val;
13 }
Line 3 is the nodal submatrices that we calculated according to Eq. 4•191, and upon which we loop over all
nodes. This implementation can be activated by setting, at compile time, the macro definition
“__TEST_COORDINATE_FREE_TENSORIAL_FORMULATION” for the same project “2d_beam” in project
workspace file “fe.dsw”.
The extension of the H1 class in the VectorSpace C++ Library to the finite element space Vh(Ωe) as the H1_h class above is an example of so-called programming by specification in the object-oriented method.
The class “Matrix_Representation” has the member function “assembly()” which maps “the_element_nodal_value” to the “mr.global_nodal_value()” used in the “main()” function. The reaction is not computed in the present example of project “2d_beam”. The project “patch_test”, in the next section, will compute this quantity.
After the nodal displacements, û_e^a, are obtained, we can loop over each element to calculate the stresses at each Gaussian integration point as,
\[ \sigma_e^h \equiv N_a ( \xi, \eta )\, \hat{\sigma}_e^a \]   Eq. 4•193
\[ \int_\Omega N_a ( \sigma_e - \sigma_e^h )\, d\Omega = 0 \]   Eq. 4•194
Substituting Eq. 4•192 and Eq. 4•117 into Eq. 4•118, we have
\[ \left( \int_\Omega N_a N_b\, d\Omega \right) \hat{\sigma}_e^b = \int_\Omega \left( N_a ( D B \hat{u}_e^a ) \right) d\Omega \]   Eq. 4•195
The nodal stresses σ̂_e^a can be solved for from Eq. 4•195. Following the same procedure as for the heat flux projection on nodes in the previous section, Eq. 4•195 can be approximated similarly for the nodal stress projection by implementing the following codes.
The computation of strains at Gaussian integration points and nodes is similar to the computation of stresses. In place of Eq. 4•192 for stresses, we have strains computed according to ε_e^h = B û_e^a. The flag “Matrix_Representation::Assembly_Switch” is now set to “Matrix_Representation::STRAIN” and “Matrix_Representation::NODAL_STRAIN” for Gauss point strains and nodal strains, respectively. The relative magnitudes of the displacements, nodal stresses, and nodal strains of the 4-node quadrilateral element are shown in Figure 4•44.
We introduce the notorious pathology of the finite element method by demonstrating (1) shear locking and
(2) dilatational locking for the bilinear four-node element in plane elasticity.
Figure 4•44 Displacement (arrows), nodal stresses (cross-hairs, solid line for compression, dashed line for tension), and nodal strains (ellipses) of the beam bending problem. The magnitudes of these three quantities have all been re-scaled.
\[ N_a ( \xi, \eta ) = \frac{1}{4} ( 1 + \xi_a \xi ) ( 1 + \eta_a \eta ) \]   Eq. 4•196
We consider the special case of a rectangle (Figure 4•45a), for simplicity, under an applied bending moment as shown in Figure 4•45. The finite element space is spanned by the bases P = {1, ξ, η, ξη}. Since the referential coordinates ξ- and η-axes of the rectangle are assumed to coincide with the physical coordinates x- and y-axes, the finite element space is also spanned by {1, x, y, xy}. The solution for the displacement field u = [u, v]T of the bending problem, in plane stress, is1
\[ \mathbf{u} = \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} xy \\ -\dfrac{1}{2} x^2 - \dfrac{\nu}{2} y^2 \end{bmatrix} \]   Eq. 4•197
This analytical solution is shown in Figure 4•45b with ν = 0 for simplicity. The horizontal displacement component, u = xy, will be represented correctly by the bilinear four-node element, since the basis “xy” is included. The quadratic terms, x² and y², in the solution of the vertical displacement “v” will not be captured by the element. These quadratic terms of the solution will be “substituted” by, or “aliased” to, a linear combination of the bases in P. For the bilinear four-node element the shape functions in Eq. 4•196 can be expressed in the generic form “Na = PC⁻¹”.2 Therefore, from Eq. 4•196, we have
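\[ u_e^h ( \xi, \eta ) \equiv N_a ( \xi, \eta ) \hat{u}_e^a = P ( \xi, \eta ) C^{-1} \hat{u}_e^a, \text{ where } C^{-1} = \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \]   Eq. 4•198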
1. p.218 in MacNeal, R.H., 1994, “Finite elements: their design and performance”, Marcel Dekker, Inc., New York.
2. p. 116 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method: basic formulation and linear problems”,
vol. 1, McGraw-Hill book company, UK.
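\[ \hat{u}_e^a = ( \xi_a )^2 = \begin{bmatrix} \xi_0^2 \\ \xi_1^2 \\ \xi_2^2 \\ \xi_3^2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \]   Eq. 4•199
and,
\[ u_e^h = P C^{-1} \hat{u}_e^a = \begin{bmatrix} 1 & \xi & \eta & \xi\eta \end{bmatrix} \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = 1 \]   Eq. 4•200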
That is, we have the alias ξ² ⇒ 1. By the symmetry of the element we can also obtain the alias η² ⇒ 1. The vertical displacement solution of the bending problem in Eq. 4•197 will then be aliased, considering the aspect ratio “Λ” in the transformation from natural to physical coordinates in a rectangular element, into
\[ u = xy, \text{ and } v = -\frac{\Lambda^2}{2} - \frac{\nu}{2} = \text{constant} \]   Eq. 4•201
With the vertical displacement “v” constant throughout the element domain, the deformation becomes a “keystoning” or “x-hourglass” mode (see Figure 4•45c, where the constant “v” is set to zero for comparison with the original configuration). That is, a lower-order element, such as the bilinear 4-node element, exhibits the locking phenomenon when a boundary value problem corresponding to a higher-order solution is imposed.
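The aliasing of ξ² to 1 can be reproduced with a few lines of plain C++: sample a function at the four corner nodes and interpolate it with the bilinear shape functions of Eq. 4•196. This is only an illustrative sketch; the function names `shape`, `interpolate`, `xi_squared`, and `xi_eta` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Natural coordinates of the four corner nodes, in the order used by Eq. 4•198.
const double xi_a[4]  = {-1.0,  1.0, 1.0, -1.0};
const double eta_a[4] = {-1.0, -1.0, 1.0,  1.0};

// Bilinear shape function N_a(xi, eta) of Eq. 4•196.
double shape(int a, double xi, double eta) {
    assert(a >= 0 && a < 4);
    return 0.25 * (1.0 + xi_a[a] * xi) * (1.0 + eta_a[a] * eta);
}

// Interpolate f from its nodal samples: sum over a of N_a(xi, eta) * f(xi_a, eta_a).
double interpolate(double (*f)(double, double), double xi, double eta) {
    double value = 0.0;
    for (int a = 0; a < 4; ++a)
        value += shape(a, xi, eta) * f(xi_a[a], eta_a[a]);
    return value;
}

double xi_squared(double xi, double) { return xi * xi; }     // not in span{1, xi, eta, xi*eta}
double xi_eta(double xi, double eta) { return xi * eta; }    // contained in the basis
```

At an interior point such as (0.3, -0.7), `interpolate(xi_squared, 0.3, -0.7)` returns 1 rather than 0.09, while `interpolate(xi_eta, 0.3, -0.7)` reproduces -0.21, in line with the statement that “xy” is represented correctly and the quadratic terms are aliased to constants.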
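The analytical strain, derived from Eq. 4•197, corresponding to the bending problem is
\[ \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial u}{\partial x} \\ \dfrac{\partial v}{\partial y} \\ \dfrac{\partial v}{\partial x} + \dfrac{\partial u}{\partial y} \end{bmatrix} = \begin{bmatrix} y \\ -\nu y \\ 0 \end{bmatrix} \]   Eq. 4•202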
where u and v are solutions in Eq. 4•197. The bilinear 4-node element under the same bending condition
responds with the solution in Eq. 4•201, and we have the corresponding strains as
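\[ \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix} = \begin{bmatrix} y \\ 0 \\ x \end{bmatrix} \]   Eq. 4•203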
Comparing Eq. 4•202 and Eq. 4•203, both εy and γxy are in error. With Poisson’s ratio in the range of ν = [0, 0.5], the error in γxy will be more serious than that in εy. The source of the error is the interpolation failure of the bilinear four-node element, which leads to the aliasing of the x² and y² terms in Eq. 4•197 into constants in Eq. 4•201. A partial solution to this locking problem is to evaluate γxy at ξ = 0 and η = 0. That is, one-point Gauss integration of the in-plane shear strain at the center of the element, and 2 × 2 integration for the remaining direct strain components εx and εy. A more satisfactory treatment is to add both x² and y² back to the set of shape functions, which is the subject of the “non-conforming element” on page 502 of Chapter 5. We introduce the treatment by selective reduced integration of the in-plane shear strain γxy (at ξ = 0, η = 0) in the following.
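Eq. 4•176 and Eq. 4•177 are re-written as
\[ \lambda ( N_{a,i} N_{b,j} ) = \lambda \begin{bmatrix} \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} \\ \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} \]   Eq. 4•204
and
\[ \mu ( \delta_{ij} ( N_{a,k} N_{b,k} ) + ( N_{a,j} N_{b,i} ) ) = \mu \begin{bmatrix} 2\dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} & 0 \\ 0 & 2\dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} \end{bmatrix} + \mu \begin{bmatrix} \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial y}\dfrac{\partial N_b}{\partial x} \\ \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial y} & \dfrac{\partial N_a}{\partial x}\dfrac{\partial N_b}{\partial x} \end{bmatrix} \]   Eq. 4•205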
Notice that the positions in the stiffness matrix correspond to the variables u and v and their variations u’ and v’ as
\[ \begin{bmatrix} ( u'u ) & ( u'v ) \\ ( v'u ) & ( v'v ) \end{bmatrix} \]   Eq. 4•206
The components in Eq. 4•204 and the first term in Eq. 4•205 involve only the direct strains εx(=u,x) and εy(=v,y). These terms are evaluated with 2 × 2 point Gauss integration (the full integration). The components in the second term of Eq. 4•205 involve the in-plane shear strain γxy(=u,y+v,x), and these are to be evaluated at the center of the element where ξ = 0, η = 0. This term is integrated with 1-point Gauss integration (the reduced integration).
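The split between full and reduced integration can be illustrated with a plain C++ sketch. It is not the library implementation: it assumes an axis-aligned rectangular element with half-widths `hx`, `hy` (constant Jacobian), and the names `accumulate`, `selective_stiffness`, and `quadratic_form` are hypothetical. The direct-strain terms are summed over the 2 × 2 Gauss points, and the shear-strain term is evaluated either at the same points or once at the element center with weight 4.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int nen = 4, ndf = 2, neq = nen * ndf;
using Stiffness = std::array<std::array<double, neq>, neq>;
const double xi_a[nen]  = {-1.0,  1.0, 1.0, -1.0};
const double eta_a[nen] = {-1.0, -1.0, 1.0,  1.0};

// Add weight * integrand at (xi, eta). direct == true adds the direct-strain
// terms (eps_x, eps_y); direct == false adds the in-plane shear (gamma_xy) term.
void accumulate(Stiffness& k, double xi, double eta, double weight, bool direct,
                double hx, double hy, double lam, double mu) {
    double Nx[nen], Ny[nen];
    for (int a = 0; a < nen; ++a) {
        Nx[a] = xi_a[a] * (1.0 + eta_a[a] * eta) / (4.0 * hx);
        Ny[a] = eta_a[a] * (1.0 + xi_a[a] * xi) / (4.0 * hy);
    }
    const double w = weight * hx * hy;             // weight times det(dx/dxi)
    for (int a = 0; a < nen; ++a)
        for (int b = 0; b < nen; ++b) {
            if (direct) {
                k[2*a][2*b]     += w * (lam + 2*mu) * Nx[a]*Nx[b];
                k[2*a][2*b+1]   += w * lam * Nx[a]*Ny[b];
                k[2*a+1][2*b]   += w * lam * Ny[a]*Nx[b];
                k[2*a+1][2*b+1] += w * (lam + 2*mu) * Ny[a]*Ny[b];
            } else {
                k[2*a][2*b]     += w * mu * Ny[a]*Ny[b];
                k[2*a][2*b+1]   += w * mu * Ny[a]*Nx[b];
                k[2*a+1][2*b]   += w * mu * Nx[a]*Ny[b];
                k[2*a+1][2*b+1] += w * mu * Nx[a]*Nx[b];
            }
        }
}

// reduce_shear == true: 2x2 rule for direct terms, 1-point rule for the shear term.
Stiffness selective_stiffness(double hx, double hy, double lam, double mu, bool reduce_shear) {
    assert(hx > 0.0 && hy > 0.0);
    Stiffness k{};
    const double g = 1.0 / std::sqrt(3.0);
    for (int p = 0; p < 4; ++p) {
        accumulate(k, xi_a[p]*g, eta_a[p]*g, 1.0, true, hx, hy, lam, mu);
        if (!reduce_shear) accumulate(k, xi_a[p]*g, eta_a[p]*g, 1.0, false, hx, hy, lam, mu);
    }
    if (reduce_shear) accumulate(k, 0.0, 0.0, 4.0, false, hx, hy, lam, mu);
    return k;
}

// d^T K d, i.e. twice the strain energy of the nodal displacement vector d.
double quadratic_form(const Stiffness& k, const std::array<double, neq>& d) {
    double e = 0.0;
    for (int p = 0; p < neq; ++p)
        for (int q = 0; q < neq; ++q) e += d[p] * k[p][q] * d[q];
    return e;
}
```

For the bending mode u = ξη, v = 0 on a 2 × 2 square with λ = µ = 1, the fully integrated form gives d^T K d = 16/3, whereas the selectively integrated form gives 4: the parasitic shear contribution 4/3 is removed because γxy vanishes at the element center.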
In retrospect, had we applied 1-point integration to all terms, spurious modes (x-hourglass and y-hourglass) would arise. That is, the two hourglass modes become zero-energy eigenvectors of the stiffness matrix that is evaluated at the center of the element. This is evident from Figure 4•45c. The cross-hairs which are parallel to the ξ, η axes are dis-
\[ \frac{\partial \xi}{\partial y}\, dy = \frac{\partial \eta}{\partial x}\, dx \]   Eq. 4•207
This approximation can make the shear term nearly invariant only if we deal with element shapes that are very close to a square element. In the limit of an infinitesimal coordinate transformation, Eq. 4•207 assumes that the “spin” at the centroid vanishes, which is adopted in the “co-rotational” formulation of the finite element method. The invariance formulation discussed above can be activated by setting the macro definition “__TEST_HUGHES” at compile time.
Unfortunately, for an arbitrary element shape, the mapping from the reference element (in ξ, η) to the physical element (in x, y) is unlikely to be infinitesimal, as is assumed in Eq. 4•207. For an arbitrary element shape, we can decompose the shape distortion into eigenvectors as rectangular, parallelogram, and trapezoid shapes (see Figure 4•48b). There is no practical invariance formulation that can remove the shape sensitivity if the trapezoid component of a particular element shape is strong.2 In a finite element program, which is often implemented with sparse matrix techniques, the node ordering can be changed, for example, in order to minimize the bandwidth of the global stiffness matrix. A sudden change of the node ordering can therefore inadvertently change the value of the global stiffness matrix dramatically. A practical fix to remedy the frame dependent in-
1. see project in p. 261-262 from Hughes, T.J.R., 1987, “ The finite element method: linear static and dynamic finite element
analysis”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2. see p.241-248 in MacNeal, R.H., 1994, “Finite elements: their design and performance”, Marcel Dekker, Inc., New York.
#include "include\fe.h"
static const double L_ = 10.0; static const double c_ = 1.0; static const double h_e_ = L_/4.0;
static const double E_ = 30.0e6; static const double v_ = 0.25;        // Young’s modulus and Poisson ratio
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
static const double lambda_bar = 2*lambda_*mu_/(lambda_+2*mu_);        // plane stress λ modification
static const double K_ = lambda_bar+2.0/3.0*mu_;
static const double e_ = 0.0;
Omega_h::Omega_h() {
Node *node; double v[2]; int ena[4]; Omega_eh *elem;        // define nodes
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_-e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_-2.0*e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 3.0*h_e_-e_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 1.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(8, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(9, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(10, 2, v); node_array().add(node);
v[0] = h_e_+e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 2.0*h_e_+2.0*e_; node = new Node(12, 2, v); node_array().add(node);
v[0] = 3.0*h_e_+e_; node = new Node(13, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(14, 2, v); node_array().add(node);
ena[0] = 0; ena[1] = 1; ena[2] = 6; ena[3] = 5;        // define elements
elem = new Omega_eh(0, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 7; ena[3] = 6;
elem = new Omega_eh(1, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 3; ena[2] = 8; ena[3] = 7;
elem = new Omega_eh(2, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 3; ena[1] = 4; ena[2] = 9; ena[3] = 8;
elem = new Omega_eh(3, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 5; ena[1] = 6; ena[2] = 11; ena[3] = 10;
elem = new Omega_eh(4, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 6; ena[1] = 7; ena[2] = 12; ena[3] = 11;
elem = new Omega_eh(5, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 7; ena[1] = 8; ena[2] = 13; ena[3] = 12;
elem = new Omega_eh(6, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 8; ena[1] = 9; ena[2] = 14; ena[3] = 13;
elem = new Omega_eh(7, 0, 0, 4, ena); omega_eh_array().add(elem);
}
// B.C.: u4 = u9 = v9 = u14 = 0
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h); int row_node_no = 5, col_node_no = 3;
the_gh_array[node_order(4)](0) = the_gh_array[node_order(14)](0) =
the_gh_array[node_order(4)](1) = the_gh_array[node_order(9)](1) =
Listing 4•16 Selective reduced integration on the offending shear term (project workspace file “fe.dsw”,
project “invariance_formulation”).
Figure 4•47 MacNeal’s local preferred coordinate system for selective reduced
integration on shear term (global axes x, y; local axes x’, y’; diagonal angles θ1, θ2).
plane shear (after reduced integration) is to implement an algorithm that selects, for example, the longest edge
of the element to begin the element node numbering.1 The global coordinate system is then transformed to a
preferred local coordinate system for computing the stiffness matrix. After the stiffness is computed at the
element level, it is transformed back to the global coordinate system and then assembled into the global
stiffness matrix. The origin of the local coordinate system is chosen at the intersection of the two diagonals of
the quadrilateral. The x’-axis is chosen to be the bisector of the diagonal angle, as shown in Figure 4•47. This
implementation can be activated by setting the macro definition “__TEST_MACNEAL”. Note that, for simplicity,
we do not implement the part of the algorithm that chooses the longest edge. We implement only the more
mathematical part of the algorithm, which demonstrates how to translate to the intersection of the two diagonals
and then rotate to the local x’-axis, which is the bisector of the diagonals.
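The translate-then-rotate step just described can be sketched in plain C++, independently of “fe.h”. This is only an illustration under stated assumptions: the names Point2, diagonal_intersection, bisector_angle, and to_local are hypothetical and do not appear in the library, the corner nodes are assumed to be numbered counter-clockwise, and the bisector is taken between the unit vectors along node 0 to node 2 and node 3 to node 1.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { double x, y; };   // hypothetical helper type, not part of fe.h

// Intersection of the diagonals 0-2 and 1-3 of a convex quadrilateral
// whose corners q[0..3] are listed counter-clockwise.
Point2 diagonal_intersection(const Point2 q[4]) {
    const double d1x = q[2].x - q[0].x, d1y = q[2].y - q[0].y;   // diagonal 0 -> 2
    const double d2x = q[3].x - q[1].x, d2y = q[3].y - q[1].y;   // diagonal 1 -> 3
    const double denom = d1x * d2y - d1y * d2x;                  // cross(d1, d2)
    assert(std::fabs(denom) > 1e-14 && "diagonals must not be parallel");
    // Solve q0 + s*d1 = q1 + t*d2 for s by taking the cross product with d2.
    const double s = ((q[1].x - q[0].x) * d2y - (q[1].y - q[0].y) * d2x) / denom;
    return Point2{q[0].x + s * d1x, q[0].y + s * d1y};
}

// Angle (radians) from the global x-axis to the local x'-axis, taken here as the
// bisector of the unit vectors along 0 -> 2 and 3 -> 1 (an assumed convention).
double bisector_angle(const Point2 q[4]) {
    const double ax = q[2].x - q[0].x, ay = q[2].y - q[0].y;
    const double bx = q[1].x - q[3].x, by = q[1].y - q[3].y;
    const double la = std::hypot(ax, ay), lb = std::hypot(bx, by);
    assert(la > 0.0 && lb > 0.0);
    return std::atan2(ay / la + by / lb, ax / la + bx / lb);
}

// Translate to the origin o, then rotate by -theta: global (x, y) -> local (x', y').
Point2 to_local(const Point2& p, const Point2& o, double theta) {
    const double c = std::cos(theta), s = std::sin(theta);
    const double dx = p.x - o.x, dy = p.y - o.y;
    return Point2{c * dx + s * dy, -s * dx + c * dy};
}
```

For a unit square the intersection is the centroid and the bisector coincides with the global x-axis, so the local frame differs from the global one only by a translation.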
Figure 4•46 Hughes’s local preferred coordinate system for the invariance formulation of the
shear term under selective reduced integration, with θ1 = ξ,y and θ2 = η,x. θ1 ||dy|| = θ2 ||dx||, or simply
θ1 = θ2 if ||dy|| ~ ||dx||, which is consistent with the infinitesimal mapping assumption.
Full Integration    Selective Reduced    Hughes’ local coord.    MacNeal’s local coord.    Analytical
-0.00311871         -0.0061448           -0.00535423             -0.00565686               -0.00518750
TABLE 4•2. Tip-deflections for selective reduced integration to prevent shear locking and
choices of local preferred coordinate system for invariance of the formulation.
1 δ δ
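$$ u = xy, \quad \text{and} \quad v = -\frac{1}{2}\,x^2 - \frac{\nu}{2(1-\nu)}\,y^2 $$  Eq. 4•208
$$ \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix} = \begin{bmatrix} y \\ -\dfrac{\nu}{1-\nu}\,y \\ 0 \end{bmatrix} $$  Eq. 4•209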
where the bulk modulus K and Young’s modulus E, Poisson’s ratio ν are related as
$$ K = \frac{E}{3(1-2\nu)} $$  Eq. 4•211
Notice that when ν → 0.5, we have K → ∞ (Eq. 4•211) and ε_v → 0 (Eq. 4•210), while the pressure “p”
(Eq. 4•210) remains finite. For a 4-node rectangular element, the aliasing of the solution in Eq. 4•208 leads to
$$ u = xy, \quad \text{and} \quad v = -\frac{1}{2}\,\Lambda^2 - \frac{\nu}{2(1-\nu)} $$  Eq. 4•212
$$ \begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix} = \begin{bmatrix} y \\ 0 \\ x \end{bmatrix} $$  Eq. 4•213
1. p.216-217 in MacNeal, R.H., 1994 “Finite elements: their design and performance”, Marcel Dekker, Inc., New York.
A volumetric-deviatoric split1 separates the stiffness of Eq. 4•214 into a volumetric part and a deviatoric part.
Define the volumetric strain ε_v as
$$ \varepsilon_v = \varepsilon_x + \varepsilon_y = \mathbf{m} \cdot \boldsymbol{\varepsilon} $$  Eq. 4•215
In the vector form of plane elasticity, m = [1, 1, 0]^T and ε = [εx, εy, γxy]^T. The mean stress or pressure is
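$$ p \equiv \frac{1}{3}(\sigma_x + \sigma_y + \sigma_z) = K \varepsilon_v = K\, \mathbf{m} \cdot \boldsymbol{\varepsilon} $$  Eq. 4•216
$$ \boldsymbol{\varepsilon}_d \equiv \boldsymbol{\varepsilon} - \frac{\mathbf{m}\,\varepsilon_v}{3} = \left( \mathbf{I} - \frac{\mathbf{m} \otimes \mathbf{m}}{3} \right) \boldsymbol{\varepsilon} $$  Eq. 4•217
$$ \boldsymbol{\sigma}_d = \mu\, \mathbf{D}_0\, \boldsymbol{\varepsilon}_d = \mu \left( \mathbf{D}_0 - \frac{2}{3}\, \mathbf{m} \otimes \mathbf{m} \right) \boldsymbol{\varepsilon} $$  Eq. 4•218
where
$$ \mathbf{D}_0 = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$  Eq. 4•219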
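The split of Eq. 4•215 to Eq. 4•219 can be checked numerically on a single strain vector. The following is a minimal sketch in plain C++, not the library implementation; the type Vec3 and all function names are hypothetical. It evaluates the deviatoric stress both as µ D0 ε_d and from the closed form µ (D0 − (2/3) m ⊗ m) ε, which should agree.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;   // [eps_x, eps_y, gamma_xy], hypothetical alias

// Volumetric strain eps_v = m . eps with m = [1, 1, 0]^T (Eq. 4•215).
double volumetric_strain(const Vec3& eps) { return eps[0] + eps[1]; }

// Pressure p = K * eps_v (Eq. 4•216).
double pressure(double K, const Vec3& eps) { return K * volumetric_strain(eps); }

// Deviatoric strain eps_d = eps - m * eps_v / 3 (Eq. 4•217).
Vec3 deviatoric_strain(const Vec3& eps) {
    const double ev = volumetric_strain(eps);
    return Vec3{{eps[0] - ev / 3.0, eps[1] - ev / 3.0, eps[2]}};
}

// Deviatoric stress sigma_d = mu * D0 * eps_d with D0 = diag(2, 2, 1) (Eq. 4•218).
Vec3 deviatoric_stress(double mu, const Vec3& eps) {
    assert(mu > 0.0);
    const Vec3 ed = deviatoric_strain(eps);
    return Vec3{{2.0 * mu * ed[0], 2.0 * mu * ed[1], mu * ed[2]}};
}

// The same stress from the closed form mu * (D0 - (2/3) m (x) m) * eps.
Vec3 deviatoric_stress_closed_form(double mu, const Vec3& eps) {
    assert(mu > 0.0);
    const double ev = volumetric_strain(eps);
    return Vec3{{mu * (2.0 * eps[0] - 2.0 / 3.0 * ev),
                 mu * (2.0 * eps[1] - 2.0 / 3.0 * ev),
                 mu * eps[2]}};
}
```

The two stress routines agree because D0 m = 2 m for m = [1, 1, 0]^T, which is the step that turns D0 (I − m ⊗ m / 3) into D0 − (2/3) m ⊗ m.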
1. p.334-352 in Zienkiewicz, O.C., and R.L. Taylor, 1989, “The finite element method: basic formulation and linear prob-
lems”, 4th ed., vol. 1, McGraw-Hill, London, UK.
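$$ {} = e_i^T \left[ \int_\Omega B_a^T\, \mu \left( \mathbf{D}_0 - \frac{2}{3}\, \mathbf{m} \otimes \mathbf{m} \right) B_b \, d\Omega + \int_\Omega B_a^T\, K\, (\mathbf{m} \otimes \mathbf{m})\, B_b \, d\Omega \right] e_j $$  Eq. 4•220
$$ k_{vol} = e_i^T \int_\Omega B_a^T\, K\, (\mathbf{m} \otimes \mathbf{m})\, B_b \, d\Omega \; e_j, \qquad k_{dev} = e_i^T \int_\Omega B_a^T\, \mu \left( \mathbf{D}_0 - \frac{2}{3}\, \mathbf{m} \otimes \mathbf{m} \right) B_b \, d\Omega \; e_j $$  Eq. 4•221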
Therefore, the selective reduced integration can be applied to these two terms separately. The following
code implements Eq. 4•221 as
Lines 1-10 define the 2 × 2 point integration, and lines 11-20 define the 1-point integration. The deviatoric
stiffness is implemented in line 33, and the volumetric stiffness in line 40. This computation can be done with
the macros “__TEST_PLAIN_STRAIN”, “__NEARLY_INCOMPRESSIBLE”,
“__TEST_B_MATRIX_VOLUMETRIC_DEVIATORIC_SPLIT”, and “__TEST_SELECTIVE_REDUCED_INTEGRATION”
defined at compile time. The tip deflection with the standard integration scheme is “-0.000149628” (i.e., severe
locking compared to the tip deflection of the ElasticQ4 element with ν = 0.25 in TABLE 4•2). With the selective
reduced integration on the volumetric term, under the B-matrix formulation, the tip deflection is “-0.00305825”.
For the coordinate-free tensorial formulation of Eq. 4•156,
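$$ k_{eiajb} = \lambda \int_\Omega N_{a,i} N_{b,j} \, d\Omega + \mu \left[ \delta_{ij} \int_\Omega N_{a,k} N_{b,k} \, d\Omega + \int_\Omega N_{a,j} N_{b,i} \, d\Omega \right] $$  Eq. 4•223
$$ K = \lambda + \frac{2}{3}\,\mu $$  Eq. 4•224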
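As a quick numerical check of Eq. 4•224 against Eq. 4•211, the Lamé constants and the bulk modulus can be computed from E and ν with the same expressions used for “lambda_”, “mu_”, and “K_” in the listings. This is a minimal sketch; the struct Moduli and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Moduli { double lambda, mu, K; };   // hypothetical helper type

// Lamé constants and bulk modulus from Young's modulus E and Poisson's ratio nu.
Moduli moduli_from_E_nu(double E, double nu) {
    assert(E > 0.0 && nu > -1.0 && nu < 0.5);   // nu = 0.5 makes lambda and K infinite
    Moduli m;
    m.lambda = nu * E / ((1.0 + nu) * (1.0 - 2.0 * nu));
    m.mu     = E / (2.0 * (1.0 + nu));
    m.K      = m.lambda + 2.0 / 3.0 * m.mu;     // Eq. 4•224
    return m;
}

// Bulk modulus computed directly from Eq. 4•211.
double bulk_modulus(double E, double nu) {
    assert(nu < 0.5);
    return E / (3.0 * (1.0 - 2.0 * nu));
}
```

With E = 30.0e6 and ν = 0.25, both routes give the same K, and K grows without bound as ν approaches 0.5, which is the source of the volumetric locking discussed in this section.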
1. see p.129-130 in Fung, C.Y., 1965, “ Foundations of solid mechanics”, Prentice-Hall, Inc., Englewood Cliffs, N.J.
$$ \varepsilon_x = \frac{\sigma_x}{E} - \frac{\nu \sigma_y}{E} = \frac{\sigma_x}{E} = 0.002 \;\Rightarrow\; u = 0.002\,x $$
$$ \varepsilon_y = -\frac{\nu \sigma_x}{E} + \frac{\sigma_y}{E} = -\frac{\nu \sigma_x}{E} = -0.0006 \;\Rightarrow\; v = -0.0006\,y $$
$$ \gamma_{xy} = \frac{2(1+\nu)\,\tau_{xy}}{E} = 0 $$  Eq. 4•225
We observe that the displacement field imposed for the patch test is therefore linear. This gives a simple exact
solution for the nodal displacements, nodal stresses, and nodal reactions shown in TABLE 4•4.
Node # u v σx σy τxy rx ry
0 0.0000 0.0000 2 0 0 2 0
1 0.0040 0.0000 2 0 0 -3 0
2 0.0040 -0.00180 2 0 0 -2 0
3 0.0000 -0.00120 2 0 0 3 0
4 0.0008 -0.00024 2 0 0 0 0
5 0.0028 -0.00036 2 0 0 0 0
6 0.0030 -0.00120 2 0 0 0 0
7 0.0006 -0.00096 2 0 0 0 0
TABLE 4•4. Nodal displacement, nodal stresses and nodal reactions of the element patch.
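The displacement columns of TABLE 4•4 follow directly from the linear field of Eq. 4•225 evaluated at the nodal coordinates. A minimal sketch in plain C++ is given below; the struct Disp and the function patch_displacement are hypothetical names, and the uniaxial state σx = 2 with E = 1.0e3 and ν = 0.3 is taken from the patch test setup.

```cpp
#include <cassert>
#include <cmath>

struct Disp { double u, v; };   // hypothetical helper type

// Linear displacement field of Eq. 4•225 for a uniaxial stress sigma_x
// (sigma_y = tau_xy = 0): u = (sigma_x / E) x, v = -(nu sigma_x / E) y.
Disp patch_displacement(double x, double y, double E, double nu, double sigma_x) {
    assert(E > 0.0);
    const double eps_x = sigma_x / E;
    const double eps_y = -nu * sigma_x / E;
    return Disp{eps_x * x, eps_y * y};
}
```

For example, node 2 at (2, 3) gives u = 0.0040 and v = -0.00180, and node 5 at (1.4, 0.6) gives u = 0.0028 and v = -0.00036, matching the table.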
1. Taylor, R.L., O.C. Zienkiewicz, J.C. Simo, and A.H.C. Chan, 1986, “The patch test--a condition for assessing f.e.m. con-
vergence”, International Journal of Numerical Methods in Engineering, vol., 22, pp. 39-62, or, for more availability, an abbre-
viated representation as Chapter 11 in Zienkiewicz, O.C., and R.L. Taylor, 1989, “The finite element method: basic
formulation and linear problems”, McGraw-Hill, London., UK.
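[Figure 4•49: element patch with E = 1×10^3, ν = 0.3 and σx = 2, σy = τxy = 0. Panels: Consistency (Test A),
(Test B), and Stability (Test C). Node coordinates: 0 (0, 0), 1 (2, 0), 2 (2, 3), 3 (0, 2), 4 (0.4, 0.4),
5 (1.4, 0.6), 6 (1.5, 2.0), 7 (0.3, 1.6). Prescribed field u = 0.002x, v = -0.0006y; nodal forces fx = 2 and
fx = 3; legend: free d.o.f.s, fixed d.o.f.s.]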
Consistency Requirement demands that the governing partial differential equation be satisfied exactly. The
matrix form of the weak statement derived from the governing partial differential equation is
$$ K_{ij}\, u_j = f_i $$  Eq. 4•226
where Kij is the global stiffness matrix and fi is the global nodal force vector. We first specify all nodes with
the linear displacement calculated from u = 0.002x and v = -0.0006y, where uj = [uj, vj]T is the solution vector
and x = [x, y]T is the nodal coordinates. Since no loading, fi in Eq. 4•226, is specified for the internal nodes
(# 4, 5, 6, 7), the “reaction” calculated according to “-Kijuj” should be identically zero if the governing partial
differential equation is to be satisfied. This is “Test A” in Figure 4•49. Test A is useful for checking the
correctness of the program statements that implement the stiffness matrix. Program Listing 4•17 implements the
test suite for Test A described above. The standard (full) integration (2 × 2) for Test A is the default setting
of this program. The uniform reduced integration (1-point Gauss integration) can be performed by setting the
macro definition “__TEST_UNIFORM_REDUCED_INTEGRATION” at compile time. Both the standard integration and
the uniform reduced integration produce the exact reactions, up to machine accuracy, as listed in TABLE 4•4.
In “Test B” in Figure 4•49, a second step for checking the consistency requirement, we specify only the
nodes on the boundaries. Then, the unknown uj on the internal nodes (# 4, 5, 6, 7) can be calculated according to
This step requires the matrix solver to “invert” the stiffness matrix Kij. The matrix solver is a fixture in “fe.lib”.
Assuming the matrix solver chosen is appropriate for the problem at hand, “Test B” checks whether the accuracy
of the stiffness matrix is maintained through the matrix solution step. A problematic stiffness matrix, or an
improper matrix solver, will lose accuracy significantly and may give an erroneous solution. The Test B can be
#include "include\fe.h"
static const double E_ = 1.0e3; static const double v_ = 0.3;
static const double lambda_=v_*E_/((1+v_)*(1-2*v_)); static const double mu_=E_/(2*(1+v_));
static const double lambda_bar = 2*lambda_*mu_/(lambda_+2*mu_);
Omega_h::Omega_h() { double v[2]; Node* node; int ena[4]; Omega_eh* elem;
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); the_node_array.add(node);
v[0] = 2.0; v[1] = 0.0; node = new Node(1, 2, v); the_node_array.add(node);
// define nodes
v[0] = 2.0; v[1] = 3.0; node = new Node(2, 2, v); the_node_array.add(node);
v[0] = 0.0; v[1] = 2.0; node = new Node(3, 2, v); the_node_array.add(node);
v[0] = 0.4; v[1] = 0.4; node = new Node(4, 2, v); the_node_array.add(node);
v[0] = 1.4; v[1] = 0.6; node = new Node(5, 2, v); the_node_array.add(node);
v[0] = 1.5; v[1] = 2.0; node = new Node(6, 2, v); the_node_array.add(node);
v[0] = 0.3; v[1] = 1.6; node = new Node(7, 2, v); the_node_array.add(node);
ena[0] = 0; ena[1] = 1; ena[2] = 5; ena[3] = 4;
elem = new Omega_eh(0, 0, 0, 4, ena); the_omega_eh_array.add(elem);
// define elements
ena[0] = 5; ena[1] = 1; ena[2] = 2; ena[3] = 6;
elem = new Omega_eh(1, 0, 0, 4, ena); the_omega_eh_array.add(elem);
ena[0] = 7; ena[1] = 6; ena[2] = 2; ena[3] = 3;
elem = new Omega_eh(2, 0, 0, 4, ena); the_omega_eh_array.add(elem);
ena[0] = 0; ena[1] = 4; ena[2] = 7; ena[3] = 3;
elem = new Omega_eh(3, 0, 0, 4, ena); the_omega_eh_array.add(elem);
ena[0] = 4; ena[1] = 5; ena[2] = 6; ena[3] = 7;
elem = new Omega_eh(4, 0, 0, 4, ena); the_omega_eh_array.add(elem); }
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) { __initialization(df, omega_h);
for(int i = 0; i < 8; i++)        // define boundary conditions
for(int j = 0; j < 2; j++) the_gh_array[node_order(i)](j) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(1)][0] = 0.004; the_gh_array[node_order(2)][0] = 0.004;
the_gh_array[node_order(2)][1] = -0.0018; the_gh_array[node_order(3)][1] = -0.0012;
the_gh_array[node_order(4)][0] = 0.0008; the_gh_array[node_order(4)][1] = -0.00024;
the_gh_array[node_order(5)][0] = 0.0028; the_gh_array[node_order(5)][1] = -0.00036;
the_gh_array[node_order(6)][0] = 0.003; the_gh_array[node_order(6)][1] = -0.0012;
the_gh_array[node_order(7)][0] = 0.0006; the_gh_array[node_order(7)][1] = -0.00096; }
class ElasticQ4 : public Element_Formulation { public:        // define element “ElasticQ4”
ElasticQ4(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
ElasticQ4(int, Global_Discretization&); };
Element_Formulation* ElasticQ4::make(int en, Global_Discretization& gd) {
return new ElasticQ4(en,gd); }
static const double a_ = E_ / (1-pow(v_,2));
static const double Dv[3][3] = { {a_, a_*v_, 0.0}, {a_*v_, a_, 0.0}, {0.0, 0.0, a_*(1-v_)/2.0} };
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
ElasticQ4::ElasticQ4(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
Quadrature qp(2, 4); H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 4, 2, qp);
Zai &= Z[0]; Eta &= Z[1]; N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4; H1 X = N*xl; J dv(d(X).det());
for(int b = 0; b < nen; b++) { B1 &= Nx[b][0]; B2 &= Nx[b][1];
DB[0][0] = Dv[0][0]*B1; DB[0][1] = Dv[0][1]*B2; DB[1][0] = Dv[0][1]*B1;
DB[1][1] = Dv[1][1]*B2; DB[2][0] = Dv[2][2]*B2; DB[2][1] = Dv[2][2]*B1;
for(int a = 0; a <= b; a++) { B1 &= Nx[a][0]; B2 &= Nx[a][1];
K[2*a ][2*b] = B1*DB[0][0] + B2*DB[2][0];
K[2*a ][2*b+1] = B1*DB[0][1] + B2*DB[2][1];
K[2*a+1][2*b] = B2*DB[1][0] + B1*DB[2][0];
K[2*a+1][2*b+1] = B2*DB[1][1] + B1*DB[2][1]; } }
for(int b = 0; b < nen; b++) for(int a = b+1; a < nen; a++) {
K[2*a ][2*b] = K[2*b ][2*a ]; K[2*a ][2*b+1] = K[2*b+1][2*a ];
K[2*a+1][2*b] = K[2*b ][2*a+1]; K[2*a+1][2*b+1] = K[2*b+1][2*a+1];
}
Listing 4•17 Patch test A (project workspace file “fe.dsw”, project “patch_test” with macro definition
“__PATCH_TEST_A” set at compile time).
(d) y-hourglass mode (for a square) (e) x-hourglass mode (for a square)
Figure 4•50 Deformation of the element patch, magnified 50 times: (a) solution with
standard 2 × 2 integration points, (b) solution with uniform reduced (1 × 1) integration, (c)
solution with uniform reduced integration using the pseudo (Moore-Penrose) inverse for the matrix
solver, i.e., with singular value decomposition; (d) and (e) show, for comparison, the two zero-energy
hourglass modes of a square bilinear element (the eigenvectors designated as the x-hourglass
and y-hourglass modes), associated with the singular values of the uniform
reduced (1 × 1) integration stiffness matrix.
1. see p.242 in Hughes, T. J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”, Pren-
tice-Hall Inc., Englewood Cliffs, New Jersey.
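[Figure: patch of two axisymmetric elements of size h × h, with nodes 0, 1, 2 on the bottom row and
3, 4, 5 on the top row along the r-axis; r = 1.]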
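$$ \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_z \\ \varepsilon_r \\ \varepsilon_\theta \\ \gamma_{rz} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial w}{\partial z} \\ \dfrac{\partial u}{\partial r} \\ \dfrac{u}{r} \\ \dfrac{\partial u}{\partial z} + \dfrac{\partial w}{\partial r} \end{bmatrix} $$  Eq. 4•229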
Therefore, ε_e^h = B_a û_e^a, where the B-matrix for the axisymmetrical case becomes
1. Taylor, R.L., O.C. Zienkiewicz, J.C. Simo, and A.H.C. Chan, 1986, “The patch test--a condition for assessing f.e.m. con-
vergence”, International Journal of Numerical Methods in Engineering, vol., 22, pp. 39-62.
2. Chapter 12 in Timoshenko, S.P., and J.N. Goodier, 1970, “ Theory of elasticity”, McGraw-Hill, Inc., London, U.K.
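$$ B_a = \begin{bmatrix} 0 & \dfrac{\partial N_a}{\partial z} \\ \dfrac{\partial N_a}{\partial r} & 0 \\ \dfrac{N_a}{r} & 0 \\ \dfrac{\partial N_a}{\partial z} & \dfrac{\partial N_a}{\partial r} \end{bmatrix}, \quad \text{and} \quad \hat{u}_e^a = \begin{bmatrix} \hat{u}_e^a \\ \hat{v}_e^a \end{bmatrix} $$  Eq. 4•230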
$$ D = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)} \begin{bmatrix} 1 & \dfrac{\nu}{1-\nu} & \dfrac{\nu}{1-\nu} & 0 \\ \dfrac{\nu}{1-\nu} & 1 & \dfrac{\nu}{1-\nu} & 0 \\ \dfrac{\nu}{1-\nu} & \dfrac{\nu}{1-\nu} & 1 & 0 \\ 0 & 0 & 0 & \dfrac{1-2\nu}{2(1-\nu)} \end{bmatrix} $$  Eq. 4•231
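The elasticity matrix of Eq. 4•231 is easy to tabulate and sanity-check outside the library. The sketch below is plain C++ with hypothetical names (Mat4, axisymmetric_D); it assumes the strain ordering [εz, εr, εθ, γrz] of Eq. 4•229.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<double, 4>, 4>;   // hypothetical alias

// Axisymmetric elasticity matrix of Eq. 4•231 for strains ordered
// [eps_z, eps_r, eps_theta, gamma_rz].
Mat4 axisymmetric_D(double E, double nu) {
    assert(E > 0.0 && nu > -1.0 && nu < 0.5);
    const double c = E * (1.0 - nu) / ((1.0 + nu) * (1.0 - 2.0 * nu));   // common factor
    const double a = nu / (1.0 - nu);                                    // normal-normal coupling
    const double g = (1.0 - 2.0 * nu) / (2.0 * (1.0 - nu));              // shear entry
    Mat4 D{};                                                            // zero-initialized
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            D[i][j] = c * (i == j ? 1.0 : a);
    D[3][3] = c * g;
    return D;
}
```

Two checks follow from the formula itself: the shear entry reduces to E / (2(1 + ν)), and for E = 1, ν = 0 the matrix reduces to diag(1, 1, 1, 1/2).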
The infinitesimal volume is taken over the whole ring of material as dV = 2πr dr dz. For the selective reduced
integration, the volumetric and deviatoric split of the stiffness matrix as in Eq. 4•221 is still valid
with two simple modifications for axisymmetrical consideration that m = [1, 1, 1, 0]T, and
$$ D_0 = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$  Eq. 4•234
For the current problem, the material constants are given as E = 1 and ν = 0, for simplicity. This gives
Note that the B-matrix for the axisymmetrical problem is implemented according to
Eq. 4•230 as
Lines 5-8 use the matrix concatenation operation to capture the semantics of the B-matrix directly.
#include "include\fe.h"
static const double E_ = 1.0;
static const double v_ = 0.0;
static const double lambda_=v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
static const double K_ = lambda_ + 2.0/3.0 * mu_;
static const double h_=0.8;
static const double r_=1.0;
static const double PI_=3.14159265359;
Omega_h::Omega_h() {
double v[2];
Node* node;
int ena[4];
Omega_eh* elem;
v[0] = r_-h_; v[1] = 0.0;
node = new Node(0, 2, v);
the_node_array.add(node);
v[0] = r_;
node = new Node(1, 2, v);
the_node_array.add(node);
v[0] = r_+h_;
node = new Node(2, 2, v);
the_node_array.add(node);
v[0] = r_-h_; v[1] = h_;
node = new Node(3, 2, v);
the_node_array.add(node);
v[0] = r_;
node = new Node(4, 2, v);
the_node_array.add(node);
v[0] = r_+h_;
node = new Node(5, 2, v); the_node_array.add(node);
ena[0] = 0; ena[1] = 1; ena[2] = 4; ena[3] = 3;
elem = new Omega_eh(0, 0, 0, 4, ena);
the_omega_eh_array.add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 5; ena[3] = 4;
elem = new Omega_eh(1, 0, 0, 4, ena);
the_omega_eh_array.add(elem); }
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
the_gh_array[node_order(1)](1) = gh_on_Gamma_h::Dirichlet;
double sigma_r, r, f_r;
sigma_r = 2.0; r = 1.0-h_;
f_r = -2.0*PI_*r*h_*sigma_r;
the_gh_array[node_order(0)][0] = f_r / 2.0;
the_gh_array[node_order(3)][0] = f_r / 2.0;
r = 1.0+h_;
f_r = 2.0*PI_*r*h_*sigma_r;
the_gh_array[node_order(2)][0] = f_r / 2.0;
the_gh_array[node_order(5)][0] = f_r / 2.0;
}
class ElasticAxisymmetricQ4 : public Element_Formulation {
public:
ElasticAxisymmetricQ4(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
ElasticAxisymmetricQ4(int, Global_Discretization&);
};
Element_Formulation* ElasticAxisymmetricQ4::make(int en, Global_Discretization& gd) {
return new ElasticAxisymmetricQ4(en,gd);
}
Listing 4•18 Axisymmetrical patch test (project workspace file “fe.dsw”, project
“axisymmetrical_patch_test” with Macro definition
Shape Sensitivity: Consider two quadratic elements, either eight-node or nine-node, as shown in
Figure 4•52. The common edge of the two elements is slanted away from the axes of the Cartesian
coordinates by a distortion denoted as “d”, also shown in Figure 4•52.
Figure 4•52 Beam subject to bending moment on the left. Three amounts of element
distortion away from the rectangular shape: d = 0, d = 1, and d = 2. (Figure labels: E = 10^3, ν = 0.3;
values 15, -15, 2, and 10.)
No new program implementation is needed for the higher-order patch test. The project
“higher_order_patch_test” implements the program for the present test. The eight-node and nine-node elements
are activated by setting the macro definitions “__TEST_Q8” and “__TEST_Q9”, respectively. The distortion factor
is a static constant “d_” at the very beginning of the program. The uniform reduced integration can be achieved
by setting all quadrature points to 2 × 2 in the program. The tip deflection at the middle point of the left edge
is listed in TABLE 4•6.
Convergence of bilinear 4-node element: We show the convergence of the bilinear 4-node element at (1) Poisson
ratio ν = 0.3 in plane stress and (2) ν = 0.4999 in plane strain (with the same boundary value problem as in
Figure 4•52). The options of selective reduced integration on (a) the shear term of the deviatoric stiffness and
(b) the volumetric stiffness are also tested. The same problem is divided into successively finer meshes, as
shown in Figure 4•53. The test suite is implemented in project “higher_order_q4” in project workspace file
“fe.dsw”. For total element numbers greater than 8, the macro definitions “__TEST_Q4_32”, “__TEST_Q4_128”, and
“__TEST_Q4_512”, in which the last number indicates the total element number, can be set at compile time. For
the selective reduced integration on the offending shear terms and on the dilatational term in incompressible
materials, the corresponding macro definitions are “__SHEAR_SELECTIVE_REDUCED_INTEGRATION” and
“__INCOMPRESSIBLE_SELECTIVE_REDUCED_INTEGRATION”.
The results with various combinations of the options are shown in TABLE 4•7. For Poisson ratio ν = 0.3, in
plane stress, the convergence is clear as the number of elements used in the computation increases. The
successive results agree on more digits after the decimal point. This convergence is guaranteed by the patch
test for the 4-node bilinear element, since it passes the consistency and stability parts of the patch test. Both
the full integration and the selective reduced integration on the offending shear term converge to the exact
solution of 0.75. For ν = 0.4999, the nearly incompressible condition, in the plane strain case, the solution
shows significant locking without signs of convergence when the full integration is applied. The solution and its
convergence are obtainable with the selective reduced integration schemes, as shown in the last two columns,
which both converge to a value of ~0.56, compared to “0.5625” in the mixed u-p formulation (ν = 0.5 in Chapter 5).
1. p. 167-169 in Zienkiewicz, O.C., and R.L. Taylor, 1989, “The finite element method: basic formulation and linear prob-
lems”, McGraw-Hill, London., UK.
2. p. 164-165 in Bathe, K.-J. and W.L. Wilson, 1976, “ Numerical method in finite element analysis”, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey.
[Figure 4•53: successively refined meshes with 8, 32, 128, and 512 elements.]
$$ \rho\, \frac{D\mathbf{u}}{Dt} = \operatorname{div} \boldsymbol{\sigma} + \mathbf{f} $$  Eq. 4•236
where the divergence of the internal stresses, div σ, equals the external surface force, and f is the body force.
The Du/Dt on the left-hand side is the acceleration of the fluid particle in the Lagrangian (material)
description, in which u(x, t) can be differentiated with respect to time “t” (by first applying the Leibniz rule,
i.e., d(xy) = x dy + y dx, and then the chain rule, d f(x) / dt = (df / dx) (dx / dt), on the second term of the
Leibniz rule)
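$$ \frac{D\mathbf{u}(\mathbf{x}, t)}{Dt} = \frac{\partial \mathbf{u}}{\partial t} + \frac{\partial \mathbf{u}}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial t} = \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \operatorname{grad} \mathbf{u} $$  Eq. 4•237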
where we have applied the definitions of the velocity, u ≡ ∂x/∂t, and the velocity gradient, grad u ≡ ∂u/∂x. The
stress in the first term on the right-hand side of Eq. 4•236 can be expressed, as in Eq. 4•146, as
σ = –p I + τ
where p is the pressure, I is the unit tensor, and τ is the viscous stress. The constitutive equation is
where µ is the fluid viscosity, and λ' is the second viscosity (this term gives the deviatoric stress caused by the
volumetric deformation, a process attributed to molecular relaxation). For a monatomic gas λ' = -2µ/3,
which can be proved thermodynamically to be the lower bound for λ'. In most applications, λ' div u is almost
completely negligible compared to the pressure, “p”.
A popular treatment for the incompressible condition is the penalty method, where the pressure variable is
eliminated by taking
Now λ and µ are equivalent to the Lamé constants in elasticity. As discussed earlier (see page 409), near the
incompressible condition K ≈ λ >> µ. In the penalty method for the Stokes problem, the penalty parameter, λ, is
usually taken as
to approximate the nearly incompressible condition.1 Substituting Eq. 4•237 and the viscous stress of Eq. 4•238
into Eq. 4•236, we have the Navier-Stokes equation
1. p.520 in Zienkiewicz and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK.
We have dropped the second viscosity λ' and used the identity “div(p I) = grad p”. For steady incompressible
viscous fluid, the Navier-Stokes equation simplifies to
From Eq. 4•242, the Reynolds number (denoted as Re) measures, by dynamic similarity, the ratio of the inertia
force “ρu • grad u” to the viscous force “div(2µ def u)” as1
$$ \frac{\rho\, \mathbf{u} \cdot \operatorname{grad} \mathbf{u}}{\operatorname{div}(2\mu \operatorname{def} \mathbf{u})} \approx \frac{\rho U L}{\mu} \equiv Re $$  Eq. 4•243
At very low Reynolds number (Re << 1), the inertia force is negligible compared to the viscous force. Eq.
4•242 can then be simplified to
Therefore, the resultant equation is identical to Eq. 4•140 with the constitutive equations of Eq. 4•146
and Eq. 4•147 for elasticity. The physical interpretation is different: instead of the displacement, u is the
velocity in the Stokes flow, and µ now plays the role of the fluid viscosity instead of the shear modulus G
in elasticity. λ is now the penalty parameter; we take λ = 10^8 µ, together with the selective reduced
integration for the volumetric term, in the computation. The finite element formulation in the last section for
plane elasticity can be applied to the Stokes flow problem without modification. Considering the B-matrix
formulation for plane elasticity
Since at the incompressible limit λ ≈ K, and λ = 10^8 µ for the penalty method, Eq. 4•245 becomes2
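$$ k_{dev} \cong e_i^T \int_\Omega B_a^T D_\mu B_b \, d\Omega \; e_j, \quad \text{and} \quad k_{vol} \cong e_i^T \int_\Omega B_a^T D_\lambda B_b \, d\Omega \; e_j, \quad \text{where} \quad D_\lambda = \begin{bmatrix} \lambda & \lambda & 0 \\ \lambda & \lambda & 0 \\ 0 & 0 & 0 \end{bmatrix}, \; D_\mu = \begin{bmatrix} 2\mu & 0 & 0 \\ 0 & 2\mu & 0 \\ 0 & 0 & \mu \end{bmatrix} $$  Eq. 4•246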
1. p. 97 in Tritton, D.J., 1988, “ Physical fluid dynamics”, 2nd ed., Oxford University Press, Oxford, UK.
2. see Hughes, T.J.R., W.K. Liu, and A. Brooks, 1979, “Review of finite element analysis of incompressible viscous flows
by the penalty function formulation”, Journal, of Computational Physics, vol. 30, no. 1, p. 1-60.
$$ u(y) = \frac{G}{2\mu}\, y\,(d - y) + \frac{U y}{d} $$  Eq. 4•247
This solution can be derived from Eq. 4•244 by superposing the solutions of the viscous flow induced
by the pressure gradient and by the bounding plates separately. That is, the first term corresponds to the
Poiseuille flow caused by the applied horizontal pressure gradient, and the second term corresponds to the
Couette flow induced by the relative motion of the two bounding plates. In these test cases, the Couette flow
provides an assumed linear solution, and the Poiseuille flow provides an assumed higher-order (quadratic) solution.
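The exact profile of Eq. 4•247, against which the finite element velocities are compared, can be evaluated with a few lines of plain C++. This is a minimal sketch; the function name channel_velocity is hypothetical, G is taken to be the applied pressure gradient, d the distance between the plates, and U the relative plate velocity.

```cpp
#include <cassert>
#include <cmath>

// Exact velocity profile of Eq. 4•247:
//   u(y) = G / (2 mu) * y * (d - y)  +  U * y / d
// The first term is the Poiseuille part, the second the Couette part.
double channel_velocity(double y, double d, double G, double mu, double U) {
    assert(d > 0.0 && mu > 0.0 && y >= 0.0 && y <= d);
    const double poiseuille = G / (2.0 * mu) * y * (d - y);
    const double couette = U * y / d;
    return poiseuille + couette;
}
```

Setting G = 0 recovers the linear Couette profile, and setting U = 0 recovers the parabolic Poiseuille profile with its maximum G d^2 / (8 µ) at mid-channel.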
Program Listing 4•19, in the project “plane_couette_poiseuille_flow” in project workspace file “fe.dsw”, is
implemented for these tests. To emphasize its relation to plane elasticity, we use “elasticq9.cpp” as a separate
compilation unit, a dependent source file for this project. The “elasticq9.cpp” implementation is very close
to that of the Lagrangian 9-node element for plane elasticity.
The plane Couette flow can be activated by setting the macro definition “__TEST_PLANE_COUETTE_FLOW”,
and the plane Poiseuille flow can be activated by setting the macro definition
“__TEST_PLANE_POISEUILLE_FLOW”. The default is a combined flow with both a pressure gradient applied at the
entrance and relative motion of the bounding plates. The results of the computation are shown in Figure 4•55.
The finite element solutions are shown as dashed curves with arrows indicating the velocity profiles in the
middle of the channel, to avoid the entrance and exit effects. The exact solutions are shown as solid curves. We
notice that the solution for the plane Poiseuille flow, quadratic in nature, is less accurate than the solution
for the plane Couette flow, which is linear.
U=1
L = 10
1. p. 182 in Batchelor, G.K., 1967, “An introduction to fluid dynamics”, Cambridge University Press, UK.
// k_vol ≅ e_i^T ∫_Ω B_a^T D_λ B_b dΩ e_j
wx &= w_x[0][0]; wy &= w_x[0][1]; b &= (~wx|| C0(0.0)) & (C0(0.0) || ~wy ) & (~wy || ~wx);
C0 stiff_vol = ((~b) * (D_lambda * b)) | dv;
// k_dev ≅ e_i^T ∫_Ω B_a^T D_µ B_b dΩ e_j
double d_mu[3][3] = { {2*mu_, 0.0, 0.0}, {0.0, 2*mu_, 0.0}, {0.0, 0.0, mu_} };
C0 D_mu = MATRIX("int, int, const double*", 3, 3, d_mu[0]);
H0 W_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, nsd, Nx), Wx, Wy, B;
Wx &=W_x[0][0]; Wy &=W_x[0][1]; B &= (~Wx|| C0(0.0)) & (C0(0.0) || ~Wy) & (~Wy|| ~Wx );
C0 stiff_dev = ((~B) * (D_mu * B)) | dV;
#else        // standard λ−µ formulation
C0 e = BASIS("int", ndf), E = BASIS("int", nen), U = (e%e)*(E%E);
H0 w_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, nsd, nx), wx, wy;
wx &= w_x[0][0]; wy &= w_x[0][1];
C0 stiff_vol = lambda_* (
+( wx*~wx*U[0][0]+wx*~wy*U[0][1] +wy*~wx*U[1][0]+wy*~wy*U[1][1] ) | dv);
H0 W_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, nsd, Nx), Wx, Wy;
Wx &= W_x[0][0]; Wy &= W_x[0][1];
C0 stiff_dev = mu_* (
+( ((2*Wx*~Wx)+(Wy*~Wy))*((e[0]%e[0])*(E%E))+(Wy*~Wx) *((e[0]%e[1])*(E%E))
+(Wx*~Wy) *((e[1]%e[0])*(E%E))+((2*Wy*~Wy)+(Wx*~Wx))*((e[1]%e[1])*(E%E)) )
| dV);
#endif
stiff &= stiff_vol + stiff_dev;
}
Figure 4•56 (a) Flow in a square cavity, from (0, 0) to (1, 0) along x, with sixteen 9-node Lagrangian
elements. (b) Velocity vectors.
1. p. 462-465 in J.N. Reddy, 1986, “Applied functional analysis and variational methods in engineering”, McGraw-Hill, Inc.,
New York.
2. such as corner node treatments described in p.231 in Hughes, T.J.R., “The finite element method: linear static and dynamic
finite element analysis”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
#include "include\fe.h"
static const int row_node_no = 9; static const int col_node_no = 9;
static const int row_element_no = (row_node_no-1)/2;
static const int col_element_no = (col_node_no-1)/2;
static const double h_e_ = 1.0/((double)row_element_no*2);
static const double v_e_ = 1.0/((double)col_element_no*2);
static const double mu_ = 1.0; static const double lambda_ = 1.e8 * mu_;
EP::element_pattern EP::ep = EP::LAGRANGIAN_9_NODES;
Omega_h::Omega_h() {
double x[4][2] = {{0.0, 0.0}, {1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}};
int control_node_flag[4] = {1, 1, 1, 1};
block(this, row_node_no, col_node_no, 4, control_node_flag, x[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
for(int i = 0; i < col_node_no; i++) {        // right; u = v = 0
the_gh_array[node_order((i+1)*row_node_no-1)](0) =
the_gh_array[node_order((i+1)*row_node_no-1)](1) = gh_on_Gamma_h::Dirichlet;
}
for(int i = 0; i < col_node_no; i++) {        // left; u = v = 0
the_gh_array[node_order(i*row_node_no)](0) =
the_gh_array[node_order(i*row_node_no)](1) = gh_on_Gamma_h::Dirichlet;
}
for(int i = 1; i < row_node_no-1; i++) {        // bottom; u = v = 0
the_gh_array[node_order(i)](0) =
the_gh_array[node_order(i)](1) = gh_on_Gamma_h::Dirichlet;
}
for(int i = 1; i < row_node_no-1; i++) {        // top, forced B.C.; u = 4(1-x)x, v = 0
int nn = (col_node_no-1)*row_node_no+i;
double x = ((double)i)*h_e_, u = 4.0 * (1.0-x) * x;
the_gh_array[node_order(nn)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(nn)][0] = u;
the_gh_array[node_order(nn)](1) = gh_on_Gamma_h::Dirichlet;
}
}
class ElasticQ9 : public Element_Formulation {        // declare “ElasticQ9” class
public:
ElasticQ9(Element_Type_Register);
Element_Formulation *make(int, Global_Discretization&);
ElasticQ9(int, Global_Discretization&);
};
Element_Formulation* Element_Formulation::type_list = 0;
// register “ElasticQ9” as element # 0
Element_Type_Register element_type_register_instance;
static ElasticQ9 stokesq9_instance(element_type_register_instance);
int main() {
int ndf = 2; Omega_h oh;
gh_on_Gamma_h gh(ndf, oh);
U_h uh(ndf, oh); // solution phase
Global_Discretization gd(oh, gh, uh);
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
gd.u_h() = u; gd.u_h() = gd.gh_on_gamma_h();
cout << gd.u_h() << endl;
return 0;
}
Listing 4•20 Driven cavity flow (in project: “square_cavity_flow” in project workspace file “fe.dsw”.).
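$$ \boldsymbol{\varepsilon} = \begin{Bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{Bmatrix} = -z \begin{bmatrix} \partial/\partial x & 0 \\ 0 & \partial/\partial y \\ \partial/\partial y & \partial/\partial x \end{bmatrix} \begin{Bmatrix} \theta_x \\ \theta_y \end{Bmatrix} = -z\,\mathbf{L}\boldsymbol{\theta} $$ Eq. 4•249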
Figure 4•57 (a) The displacements of a plate under deformation (w = w0, uα = uα0 - θα z, with the midsurface and a fiber indicated), (b) the shear forces (Sx, Sy), the normal moments (Mx, My), and the twisting moments (Mxy, Myx) of a plate.
1. p.8 in Zienkiewicz and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK.
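$$ \boldsymbol{\gamma} = \begin{Bmatrix} \gamma_x \\ \gamma_y \end{Bmatrix} = -\begin{Bmatrix} \theta_x \\ \theta_y \end{Bmatrix} + \begin{Bmatrix} \partial w/\partial x \\ \partial w/\partial y \end{Bmatrix} = -\boldsymbol{\theta} + \nabla w $$ Eq. 4•250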
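From Figure 4•57b, the normal moments (Mx, My) and the twisting moment (Mxy) are
$$ \mathbf{M} = \begin{Bmatrix} M_x \\ M_y \\ M_{xy} \end{Bmatrix} = -\int_{-t/2}^{t/2} \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix} z\,dz = \mathbf{D}\,\mathbf{L}\,\boldsymbol{\theta} $$ Eq. 4•251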
$$ \mathbf{D} = \frac{E t^3}{12(1-\nu^2)} \begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \dfrac{1-\nu}{2} \end{bmatrix} $$ Eq. 4•252
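As a library-independent illustration of Eq. 4•252, the bending rigidity matrix can be filled directly from E, ν, and t. The sketch below uses plain C++ containers rather than the “C0” objects of the listings; the names “Matrix3” and “plate_rigidity” are assumptions introduced only for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Matrix3 = std::array<std::array<double, 3>, 3>;

// Bending rigidity matrix D of Eq. 4•252 for Young's modulus E,
// Poisson's ratio nu, and plate thickness t.
// (plate_rigidity is a hypothetical helper, not part of fe.lib.)
Matrix3 plate_rigidity(double E, double nu, double t) {
    assert(t > 0.0 && nu * nu < 1.0);  // keeps the denominator positive
    const double d = E * std::pow(t, 3) / (12.0 * (1.0 - nu * nu));
    return {{{d, d * nu, 0.0},
             {d * nu, d, 0.0},
             {0.0, 0.0, d * (1.0 - nu) / 2.0}}};
}
```

With E = 1, ν = 0.25, and t = 0.01, the values used in the plate bending listings, the leading entry is 10⁻⁶/11.25.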
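$$ \mathbf{S} = \begin{Bmatrix} S_x \\ S_y \end{Bmatrix} = \beta G t\,(-\boldsymbol{\theta} + \nabla w) \equiv \alpha\,(-\boldsymbol{\theta} + \nabla w) $$ Eq. 4•253
where α = βGt, and the correction factor β = 5/6 is for a rectangular homogeneous section with a parabolic shear stress distribution.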
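A minimal sketch of Eq. 4•253 follows, assuming that the shear modulus G is supplied by the caller (for an isotropic material G = E/(2(1+ν))). The names “Vector2” and “shear_forces” are hypothetical and serve only this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vector2 = std::array<double, 2>;

// Shear forces S = alpha (-theta + grad w) of Eq. 4•253, with alpha = beta G t
// and beta = 5/6 for a rectangular homogeneous section.
// (shear_forces is a hypothetical helper, not part of fe.lib.)
Vector2 shear_forces(double G, double t, const Vector2& theta, const Vector2& grad_w) {
    assert(G > 0.0 && t > 0.0);
    const double beta = 5.0 / 6.0;
    const double alpha = beta * G * t;
    return {alpha * (grad_w[0] - theta[0]), alpha * (grad_w[1] - theta[1])};
}
```

When θ equals ∇w, the shear strain of Eq. 4•250 vanishes and so do both components returned by this function.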
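Parallel to the equilibrium equations, Eq. 4•26 and Eq. 4•27, for the 1-D beam bending problem, we have in the plate bending problem
$$ \begin{bmatrix} \partial/\partial x & 0 & \partial/\partial y \\ 0 & \partial/\partial y & \partial/\partial x \end{bmatrix} \begin{Bmatrix} M_x \\ M_y \\ M_{xy} \end{Bmatrix} + \begin{Bmatrix} S_x \\ S_y \end{Bmatrix} = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix}, \quad \text{and} \quad \begin{bmatrix} \partial/\partial x & \partial/\partial y \end{bmatrix} \begin{Bmatrix} S_x \\ S_y \end{Bmatrix} + q = 0 $$ Eq. 4•255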
θ = ∇w Eq. 4•256
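Substituting the first part of Eq. 4•254 into the second part, we get
$$ -\nabla^T \mathbf{L}^T \mathbf{M} + q = 0 $$ Eq. 4•257
Then, using Eq. 4•251 to substitute M in Eq. 4•257, and substituting θ with the thin-plate assumption θ = ∇w of Eq. 4•256, we get
$$ -(\mathbf{L}\nabla)^T \mathbf{D}\,(\mathbf{L}\nabla)\,w + q = 0 $$ Eq. 4•258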
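From the definition of the operators L and ∇, we have the combined operator “L∇” as
$$ \mathbf{L}\nabla = \begin{bmatrix} \partial/\partial x & 0 \\ 0 & \partial/\partial y \\ \partial/\partial y & \partial/\partial x \end{bmatrix} \begin{Bmatrix} \partial/\partial x \\ \partial/\partial y \end{Bmatrix} = \begin{Bmatrix} \partial^2/\partial x^2 \\ \partial^2/\partial y^2 \\ 2\,\partial^2/\partial x \partial y \end{Bmatrix} $$ Eq. 4•259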
For constant D, the Eq. 4•258 becomes the well-known classical biharmonic equation1
The homogeneous solution for a simply supported rectangular plate with lengths of “a” and “b” has the simple
form of
1. e.g., Airy’s stress function satisfies the biharmonic equation as described in p.32, and p.538 in Timoshenko, S.P., and J.N.
Goodier, 1970, “ Theory of elasticity”, 3rd ed., McGraw-Hill Book Company.
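$$ \mathbf{M}_e^h = \mathbf{D}\,\mathbf{B}_a\,\hat{w}_e^a $$ Eq. 4•263
where $\hat{w}_e^a$ is the nodal deflection vector. The element stiffness matrix is no different from Eq. 4•173; i.e.,
$$ k_e^{pq} = k_e^{iajb} = \mathbf{e}_i^T \int_{\Omega} \mathbf{B}_a^T\,\mathbf{D}\,\mathbf{B}_b\,d\Omega\;\mathbf{e}_j, \quad \text{with } p = ndf\,(a-1) + i, \text{ and } q = ndf\,(b-1) + j $$ Eq. 4•264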
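The index mapping of Eq. 4•264 can be sketched in plain C++ as below. The sketch assumes zero-based node and degree-of-freedom indices, so that the mapping reads p = ndf·a + i instead of p = ndf(a-1) + i; “element_dof” and “add_nodal_block” are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Zero-based counterpart of p = ndf (a-1) + i in Eq. 4•264.
std::size_t element_dof(std::size_t ndf, std::size_t a, std::size_t i) {
    assert(i < ndf);
    return ndf * a + i;
}

// Add the ndf x ndf nodal block k^{ab} into the element stiffness matrix ke.
void add_nodal_block(Mat& ke, std::size_t ndf, std::size_t a, std::size_t b,
                     const Mat& kab) {
    for (std::size_t i = 0; i < ndf; ++i)
        for (std::size_t j = 0; j < ndf; ++j)
            ke[element_dof(ndf, a, i)][element_dof(ndf, b, j)] += kab[i][j];
}
```

For the four-node plate element with ndf = 3, the block coupling nodes 1 and 3 lands in rows 3 to 5 and columns 9 to 11 of the 12 × 12 element matrix.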
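$$ \hat{\mathbf{u}}_e^a \equiv \begin{Bmatrix} w_a \\ \hat{\theta}_{xa} \\ \hat{\theta}_{ya} \end{Bmatrix} $$ Eq. 4•265
where
$$ \hat{\theta}_{xa} = -\left(\frac{\partial w}{\partial y}\right)_a, \quad \text{and} \quad \hat{\theta}_{ya} = \left(\frac{\partial w}{\partial x}\right)_a $$ Eq. 4•266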
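The nonconforming element defines a 12-term polynomial for the deflection “w” as
$$ w = \alpha_0 + \alpha_1 x + \alpha_2 y + \alpha_3 x^2 + \alpha_4 xy + \alpha_5 y^2 + \alpha_6 x^3 + \alpha_7 x^2 y + \alpha_8 x y^2 + \alpha_9 y^3 + \alpha_{10} x^3 y + \alpha_{11} x y^3 \equiv \mathbf{P}\boldsymbol{\alpha} $$ Eq. 4•267
where
$$ \mathbf{P} = \begin{bmatrix} 1 & x & y & x^2 & xy & y^2 & x^3 & x^2 y & x y^2 & y^3 & x^3 y & x y^3 \end{bmatrix} $$ Eq. 4•268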
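Notice that the polynomial is complete only up to the third order; of the fourth-order terms only x³y and xy³ are retained. For the four corner nodes of the rectangle (a = 0, 1, 2, 3), we have twelve equations
$$ \hat{\mathbf{u}}_e^a = \begin{Bmatrix} w_a \\ \hat{\theta}_{xa} \\ \hat{\theta}_{ya} \end{Bmatrix} = \begin{Bmatrix} \alpha_0 + \alpha_1 x_a + \alpha_2 y_a + \alpha_3 x_a^2 + \alpha_4 x_a y_a + \alpha_5 y_a^2 + \alpha_6 x_a^3 + \alpha_7 x_a^2 y_a + \alpha_8 x_a y_a^2 + \alpha_9 y_a^3 + \alpha_{10} x_a^3 y_a + \alpha_{11} x_a y_a^3 \\ -\alpha_2 - \alpha_4 x_a - 2\alpha_5 y_a - \alpha_7 x_a^2 - 2\alpha_8 x_a y_a - 3\alpha_9 y_a^2 - \alpha_{10} x_a^3 - 3\alpha_{11} x_a y_a^2 \\ \alpha_1 + 2\alpha_3 x_a + \alpha_4 y_a + 3\alpha_6 x_a^2 + 2\alpha_7 x_a y_a + \alpha_8 y_a^2 + 3\alpha_{10} x_a^2 y_a + \alpha_{11} y_a^3 \end{Bmatrix} \equiv \mathbf{C}_a \boldsymbol{\alpha} $$ Eq. 4•269
$$ \mathbf{C}_a \equiv \begin{bmatrix} 1 & x_a & y_a & x_a^2 & x_a y_a & y_a^2 & x_a^3 & x_a^2 y_a & x_a y_a^2 & y_a^3 & x_a^3 y_a & x_a y_a^3 \\ 0 & 0 & -1 & 0 & -x_a & -2y_a & 0 & -x_a^2 & -2x_a y_a & -3y_a^2 & -x_a^3 & -3x_a y_a^2 \\ 0 & 1 & 0 & 2x_a & y_a & 0 & 3x_a^2 & 2x_a y_a & y_a^2 & 0 & 3x_a^2 y_a & y_a^3 \end{bmatrix} $$ Eq. 4•270
for a = 0, 1, 2, 3. Therefore, C is a 12 × 12 matrix. The vector α can be obtained by inverting Eq. 4•269 as
$$ \boldsymbol{\alpha} = \mathbf{C}^{-1}\,\hat{\mathbf{u}}_e^a $$ Eq. 4•271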
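The generic procedure of Eq. 4•267 to Eq. 4•271 can be sketched without the library as follows: stack the three rows of Eq. 4•270 for the four corner nodes into the 12 × 12 matrix C, then obtain the shape functions N = P C⁻¹ at a point by solving Cᵀ Nᵀ = Pᵀ. This is only an illustration under stated assumptions; the names “basis_rows”, “solve”, and “shape_functions” are hypothetical, and plain Gaussian elimination stands in for whatever inversion the library uses.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Rows of C_a (Eq. 4•270) at (x, y): w, theta_x = -dw/dy, theta_y = dw/dx.
// The first row is the basis P of Eq. 4•268.
Mat basis_rows(double x, double y) {
    return {
        {1.0, x, y, x*x, x*y, y*y, x*x*x, x*x*y, x*y*y, y*y*y, x*x*x*y, x*y*y*y},
        {0.0, 0.0, -1.0, 0.0, -x, -2*y, 0.0, -x*x, -2*x*y, -3*y*y, -x*x*x, -3*x*y*y},
        {0.0, 1.0, 0.0, 2*x, y, 0.0, 3*x*x, 2*x*y, y*y, 0.0, 3*x*x*y, y*y*y}};
}

// Gaussian elimination with partial pivoting; returns x with a x = b.
Vec solve(Mat a, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t piv = k;
        for (std::size_t r = k + 1; r < n; ++r)
            if (std::fabs(a[r][k]) > std::fabs(a[piv][k])) piv = r;
        assert(std::fabs(a[piv][k]) > 1e-12);  // matrix must be non-singular
        std::swap(a[k], a[piv]);
        std::swap(b[k], b[piv]);
        for (std::size_t r = k + 1; r < n; ++r) {
            const double f = a[r][k] / a[k][k];
            for (std::size_t c = k; c < n; ++c) a[r][c] -= f * a[k][c];
            b[r] -= f * b[k];
        }
    }
    Vec x(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t c = k + 1; c < n; ++c) s -= a[k][c] * x[c];
        x[k] = s / a[k][k];
    }
    return x;
}

// N = P C^{-1} at (x, y); nodes[a] = (x_a, y_a) are the rectangle corners.
Vec shape_functions(const std::array<std::array<double, 2>, 4>& nodes,
                    double x, double y) {
    Mat C;
    for (const auto& node : nodes)
        for (const Vec& row : basis_rows(node[0], node[1])) C.push_back(row);
    Mat Ct(12, Vec(12));
    for (std::size_t i = 0; i < 12; ++i)
        for (std::size_t j = 0; j < 12; ++j) Ct[j][i] = C[i][j];
    return solve(Ct, basis_rows(x, y)[0]);  // N^T = C^{-T} P^T
}
```

Evaluated at a corner node, the deflection shape function of that node is one and all other entries vanish; at any interior point the four deflection shape functions sum to one, since a constant deflection is reproduced by the basis.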
Program Listing 4•21 implements the generic procedure above to derive the nonconforming shape function (Eq. 4•273) for the thin-plate bending rectangular element. Eq. 4•262 and Eq. 4•264 are then used to define the B-matrix and the stiffness matrix, respectively. The plate is clamped on all four sides and subjected to a uniform unit load. Only the upper-right quarter of the plate is modeled, owing to the symmetry of the geometry and the boundary conditions. 4 × 4 (= 16) elements are used in the computation. At the right and the top edges of the model the boundary conditions are w = ∂w/∂x = ∂w/∂y = 0 (clamped). At the bottom and the left edges the symmetry conditions ∂w/∂y = 0 and ∂w/∂x = 0 are imposed, respectively (see Figure 4•58a). The solution for the vertical deflection is shown in Figure 4•58b.
The maximum deflection is at the center of the plate, that is, at the lower-left corner of the finite element model. The exact solution is 226800.¹ The results are shown in TABLE 4•8, which shows the convergence toward the exact solution as the mesh is refined.
1. The exact solution is computed from formula provided in p. 31 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite
element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK, and reference therein.
#include "include\fe.h"
static const int row_node_no = 5;
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES;
Omega_h::Omega_h() {
double coord[4][2] = {{0.0, 0.0}, {1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}};
int control_node_flag[4] = {TRUE, TRUE, TRUE, TRUE};
block(this, row_node_no, row_node_no, 4, control_node_flag, coord[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
for(int i = 0; i < row_node_no-1; i++) // bottom B.C. ∂w/∂y = 0
the_gh_array[node_order(i)](1) = gh_on_Gamma_h::Dirichlet;
for(int i = 0; i < row_node_no-1; i++) // left B.C. ∂w/∂x = 0
the_gh_array[node_order(i*row_node_no)](2) = gh_on_Gamma_h::Dirichlet;
for(int i = 1; i <= row_node_no; i++) { // right B.C. (last node of each row) w = ∂w/∂x = ∂w/∂y = 0
the_gh_array[node_order(i*row_node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(i*row_node_no-1)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(i*row_node_no-1)](2) = gh_on_Gamma_h::Dirichlet; }
for(int i = 0; i < row_node_no-1; i++) { // top B.C. (last row of nodes) w = ∂w/∂x = ∂w/∂y = 0
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](0) =
gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](1) =
gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](2) =
gh_on_Gamma_h::Dirichlet;
}
}
class PlateR4 : public Element_Formulation {
public:
PlateR4(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
PlateR4(int, Global_Discretization&);
};
Element_Formulation* PlateR4::make(int en, Global_Discretization& gd) {
return new PlateR4(en,gd);
}
static const double E_ = 1.0; static const double v_ = 0.25; static const double t_ = 0.01;
static const double D_ = E_ * pow(t_,3) / (12.0*(1-pow(v_,2)));
// D = E t^3 / (12 (1 - ν^2)) * [ 1 ν 0; ν 1 0; 0 0 (1-ν)/2 ]
static const double Dv[3][3] = { {D_, D_*v_, 0.0 },
{D_*v_, D_, 0.0 },
{0.0, 0.0, D_*(1-v_)/2.0} };
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
PlateR4::PlateR4(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
int ndf = 3;
Quadrature qp(2, 16);
H0 dx_inv;
H2 X;
{
H2 z(2, (double*)0, qp),
n = INTEGRABLE_VECTOR_OF_TANGENT_OF_TANGENT_BUNDLE( // coordinate transformation rule
"int, int, Quadrature", 4/*nen*/, 2/*nsd*/, qp), zai, eta;
zai &= z[0]; eta &= z[1];
n[0] = (1-zai)*(1-eta)/4; n[1] = (1+zai)*(1-eta)/4;
n[2] = (1+zai)*(1+eta)/4; n[3] = (1-zai)*(1+eta)/4;
X &= n*xl;
}
dx_inv &= d(X).inverse();
J dv(d(X).det());
Listing 4•21 Plate bending using nonconforming rectangular element (project workspace file “fe.dsw”, project “rectangular_plate_bending” with macro definition “__GENERIC_NONCONFORMING_SHAPE_FUNCTION” set at compile time).
Figure 4•58 (a) Clamped boundary conditions (w = ∂w/∂x = ∂w/∂y = 0 on the clamped edges, ∂w/∂x = 0 and ∂w/∂y = 0 on the symmetry edges of the 1.0 × 1.0 model) and (b) nodal deflections for rectangular plate bending elements (4 × 4 mesh shown) using the nonconforming shape function.
Alternatively, we can substitute the explicit shape functions¹ in Eq. 4•262 with
$$ \mathbf{N}_a \equiv \frac{1}{8} \begin{Bmatrix} (\xi\xi_a + 1)(\eta\eta_a + 1)(2 + \xi\xi_a + \eta\eta_a - \xi^2 - \eta^2) \\ -b\,\eta_a\,(\xi\xi_a + 1)(\eta\eta_a + 1)^2(\eta\eta_a - 1) \\ a\,\xi_a\,(\xi\xi_a + 1)^2(\xi\xi_a - 1)(\eta\eta_a + 1) \end{Bmatrix} $$ Eq. 4•274
where “2a” and “2b” are the side lengths of a rectangular element, and the nodal normalized coordinates are [ξa, ηa] = {(-1, -1), (1, -1), (1, 1), (-1, 1)}. The implementation of Eq. 4•274, to be substituted into Eq. 4•262, is straightforward.
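As a standalone illustration, Eq. 4•274 can be evaluated at a point (ξ, η) in plain C++ as sketched below. The function name “explicit_nonconforming_N” and the arguments “half_a” and “half_b” (the “a” and “b” of the equation) are assumptions made for this example and are not taken from the project source.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Nodal normalized coordinates (xi_a, eta_a), a = 0..3, as listed with Eq. 4•274.
constexpr double XI[4]  = {-1.0, 1.0, 1.0, -1.0};
constexpr double ETA[4] = {-1.0, -1.0, 1.0, 1.0};

// The three shape functions of node a in Eq. 4•274 for a rectangle with
// side lengths 2*half_a and 2*half_b (hypothetical helper).
std::array<double, 3> explicit_nonconforming_N(int a, double xi, double eta,
                                               double half_a, double half_b) {
    assert(a >= 0 && a < 4);
    const double p = xi * XI[a] + 1.0;    // (xi xi_a + 1)
    const double q = eta * ETA[a] + 1.0;  // (eta eta_a + 1)
    return {p * q * (2.0 + xi * XI[a] + eta * ETA[a] - xi * xi - eta * eta) / 8.0,
            -half_b * ETA[a] * p * q * q * (eta * ETA[a] - 1.0) / 8.0,
            half_a * XI[a] * p * p * (xi * XI[a] - 1.0) * q / 8.0};
}
```

At node a the first shape function equals one and it vanishes at the other three nodes, while the second and third shape functions vanish at all four nodes.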
1. see p. 17 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc.,
UK, and reference therein.
On the other hand, Eq. 4•272 is quite generic, which is especially useful when no explicit formula such as Eq. 4•274 has been derived for us. The computation is done in the same project (“rectangular_plate_bending” in project workspace file “fe.dsw”) with the macro definition “__EXPLICIT_NONCONFORMING_SHAPE_FUNCTION” set at compile time. The solution is identical to the one obtained with the generic procedure for computing the shape function.
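$$ \hat{\mathbf{u}}_e^a \equiv \begin{Bmatrix} w_a \\ \left(\dfrac{\partial w}{\partial y}\right)_a \\ \left(\dfrac{\partial w}{\partial x}\right)_a \\ \left(\dfrac{\partial^2 w}{\partial x \partial y}\right)_a \end{Bmatrix} $$ Eq. 4•275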
With four nodes at the corners of the rectangle we have 16 degrees of freedom in total. Therefore, a complete third-order polynomial (in each of x and y) can be used to represent the deflection w, with P defined as
$$ \mathbf{P} = \begin{bmatrix} 1 & x & y & x^2 & xy & y^2 & x^3 & x^2 y & x y^2 & y^3 & x^3 y & x^2 y^2 & x y^3 & x^3 y^2 & x^2 y^3 & x^3 y^3 \end{bmatrix} $$ Eq. 4•276
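$$ \mathbf{C}_a \equiv \begin{bmatrix} 1 & x_a & y_a & x_a^2 & x_a y_a & y_a^2 & x_a^3 & x_a^2 y_a & x_a y_a^2 & y_a^3 & x_a^3 y_a & x_a^2 y_a^2 & x_a y_a^3 & x_a^3 y_a^2 & x_a^2 y_a^3 & x_a^3 y_a^3 \\ 0 & 0 & 1 & 0 & x_a & 2y_a & 0 & x_a^2 & 2x_a y_a & 3y_a^2 & x_a^3 & 2x_a^2 y_a & 3x_a y_a^2 & 2x_a^3 y_a & 3x_a^2 y_a^2 & 3x_a^3 y_a^2 \\ 0 & 1 & 0 & 2x_a & y_a & 0 & 3x_a^2 & 2x_a y_a & y_a^2 & 0 & 3x_a^2 y_a & 2x_a y_a^2 & y_a^3 & 3x_a^2 y_a^2 & 2x_a y_a^3 & 3x_a^2 y_a^3 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 2x_a & 2y_a & 0 & 3x_a^2 & 4x_a y_a & 3y_a^2 & 6x_a^2 y_a & 6x_a y_a^2 & 9x_a^2 y_a^2 \end{bmatrix} $$ Eq. 4•277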
Eq. 4•276 and the inverse of Eq. 4•277 can be substituted into Eq. 4•272 to define the B-matrix. The explicit shape functions for the conforming rectangular element are defined as
$$ \mathbf{N}_a \equiv \frac{1}{16} \begin{Bmatrix} (\xi + \xi_a)^2(\xi\xi_a - 2)\,(\eta + \eta_a)^2(\eta\eta_a - 2) \\ -a\,\xi_a\,(\xi + \xi_a)^2(\xi\xi_a - 1)\,(\eta + \eta_a)^2(\eta\eta_a - 2) \\ -b\,(\xi + \xi_a)^2(\xi\xi_a - 2)\,\eta_a\,(\eta + \eta_a)^2(\eta\eta_a - 1) \\ ab\,\xi_a\,(\xi + \xi_a)^2(\xi\xi_a - 1)\,\eta_a\,(\eta + \eta_a)^2(\eta\eta_a - 1) \end{Bmatrix} $$ Eq. 4•278
where “2a” and “2b” are the side lengths of the rectangular element, and the subscripts a = 0, 1, 2, 3 are the nodal numbers (developed by Bogner et al.¹,²). The same project “rectangular_plate_bending” can be used with the macro definition “__EXPLICIT_CONFORMING_SHAPE_FUNCTION” set at compile time to use Eq. 4•278, or with no macro definition set at compile time to use its generic counterpart via Eq. 4•277. The results for the center deflection of the conforming rectangular plate are shown in TABLE 4•9.
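Eq. 4•278 can be checked numerically with a short standalone sketch such as the following. It is written in plain C++ and is not the project source; “explicit_conforming_N”, “half_a”, and “half_b” are hypothetical names, with “half_a” and “half_b” standing for the “a” and “b” of the equation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Nodal normalized coordinates (xi_a, eta_a), a = 0..3.
constexpr double XI[4]  = {-1.0, 1.0, 1.0, -1.0};
constexpr double ETA[4] = {-1.0, -1.0, 1.0, 1.0};

// The four shape functions of node a in Eq. 4•278 for a rectangle with
// side lengths 2*half_a and 2*half_b (hypothetical helper).
std::array<double, 4> explicit_conforming_N(int a, double xi, double eta,
                                            double half_a, double half_b) {
    assert(a >= 0 && a < 4);
    const double sx = (xi + XI[a]) * (xi + XI[a]);      // (xi + xi_a)^2
    const double sy = (eta + ETA[a]) * (eta + ETA[a]);  // (eta + eta_a)^2
    const double fx = xi * XI[a] - 2.0, gx = xi * XI[a] - 1.0;
    const double fy = eta * ETA[a] - 2.0, gy = eta * ETA[a] - 1.0;
    return {sx * fx * sy * fy / 16.0,
            -half_a * XI[a] * sx * gx * sy * fy / 16.0,
            -half_b * sx * fx * ETA[a] * sy * gy / 16.0,
            half_a * half_b * XI[a] * sx * gx * ETA[a] * sy * gy / 16.0};
}
```

The first shape function of each node is one at its own node and zero at the others, and the four first shape functions sum to one everywhere in the element.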
$$ \mathbf{P} = \begin{bmatrix} L_0 & L_1 & L_2 & L_0 L_1 & L_1 L_2 & L_2 L_0 & L_0^2 L_1 & L_1^2 L_2 & L_2^2 L_0 \end{bmatrix} $$ Eq. 4•279
Three third-order terms are chosen in addition to the first six terms, which form a complete second-order polynomial. The explicit shape function for the first node is (with cyclic permutation of 0, 1, 2 for the other two nodes)
1. see p. 49 in Zienkiewicz and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK, and
reference therein.
2. see also p. 419, Table 9.1 for the “Hermite cubic element” in Reddy, J.N., 1993, “An introduction to the finite element
method”, 2nd ed., McGraw-Hill, Inc., New York.
3. see p. 244 in Zienkiewicz, O.C., 1977, “The finite element method”, 3rd ed., McGraw-Hill, Inc., UK.
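$$ \mathbf{N}_0 \equiv \begin{Bmatrix} L_0 + L_0^2 L_1 + L_0^2 L_2 - L_0 L_1^2 - L_0 L_2^2 \\ b_2\left(L_0^2 L_1 + \tfrac{1}{2} L_0 L_1 L_2\right) - b_1\left(L_2 L_0^2 + \tfrac{1}{2} L_0 L_1 L_2\right) \\ c_2\left(L_0^2 L_1 + \tfrac{1}{2} L_0 L_1 L_2\right) - c_1\left(L_2 L_0^2 + \tfrac{1}{2} L_0 L_1 L_2\right) \end{Bmatrix} $$ Eq. 4•280
where b0 = y1 - y2, and c0 = x2 - x1. The explicit shape function for the triangular element can be implemented as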
#include "include\fe.h"
static const int row_node_no = 5;
EP::element_pattern EP::ep = EP::SLASH_TRIANGLES;
Omega_h::Omega_h() {
double coord[4][2] = {{0.0, 0.0}, {1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}};
int control_node_flag[4] = {TRUE, TRUE, TRUE, TRUE};
block(this, row_node_no, row_node_no, 4, control_node_flag, coord[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
for(int i = 0; i < row_node_no-1; i++) // bottom B.C. ∂w/∂y = 0
the_gh_array[node_order(i)](1) = gh_on_Gamma_h::Dirichlet;
for(int i = 0; i < row_node_no-1; i++) // left B.C. ∂w/∂x = 0
the_gh_array[node_order(i*row_node_no)](2) = gh_on_Gamma_h::Dirichlet;
for(int i = 1; i <= row_node_no; i++) { // right B.C. w = ∂w/∂x = ∂w/∂y = 0
the_gh_array[node_order(i*row_node_no-1)](0) =
the_gh_array[node_order(i*row_node_no-1)](1) =
the_gh_array[node_order(i*row_node_no-1)](2) = gh_on_Gamma_h::Dirichlet;
}
for(int i = 0; i < row_node_no-1; i++) { // top B.C. w = ∂w/∂x = ∂w/∂y = 0
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](0) =
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](1) =
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](2) =
gh_on_Gamma_h::Dirichlet;
}
}
class PlateT3 : public Element_Formulation {
public:
PlateT3(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
PlateT3(int, Global_Discretization&);
};
Element_Formulation* PlateT3::make(int en, Global_Discretization& gd) {
return new PlateT3(en,gd);
}
static const double E_ = 1.0; static const double v_ = 0.25; static const double t_ = 0.01;
static const double D_ = E_ * pow(t_,3) / (12.0*(1-pow(v_,2)));
// D = E t^3 / (12 (1 - ν^2)) * [ 1 ν 0; ν 1 0; 0 0 (1-ν)/2 ]
static const double Dv[3][3]={
{D_, D_*v_, 0.0 },
{D_*v_, D_, 0.0 },
{0.0, 0.0, D_*(1-v_)/2.0 }
};
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
PlateT3::PlateT3(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
int ndf = 3;
Quadrature qp(2, 16);
H0 dx_inv;
H1 X;
{
H1 l(2, (double*)0, qp), // coordinate transformation rule
n = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 3, 2, qp),
l0 = l[0], l1 = l[1], l2 = 1.0 - l0 - l1;
n[0] = l0; n[1] = l1; n[2] = l2;
X &= n*xl;
}
dx_inv &= d(X).inverse();
J dv(d(X).det());
Listing 4•22 9-dof triangular plate bending using nonconforming triangular element (project workspace file “fe.dsw”, project “triangular_plate_bending”).
$$ \mathbf{P} = \begin{bmatrix} L_0 & L_1 & L_2 & L_0 L_1 & L_1 L_2 & L_2 L_0 \end{bmatrix} $$ Eq. 4•281
A triangular element can be conceived with six degrees of freedom: three deflection variables “w” at the corner nodes and three normal derivatives “∂w/∂n” at the midpoints of the three sides of the triangle, as depicted in Figure 4•59.
[Figure 4•59: corner nodes carry w0, w1, w2; mid-side nodes carry (∂w/∂n)3, (∂w/∂n)4, (∂w/∂n)5.]
Parallel to the derivation of Eq. 4•269 for a generic shape function, we have
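The normal derivative at node number “3” can be obtained according to the formula¹
$$ \left(\frac{\partial}{\partial n}\right)_3 = \frac{l_0}{4\Delta} \left[ \frac{\partial}{\partial L_1} + \frac{\partial}{\partial L_2} - 2\frac{\partial}{\partial L_0} + \mu_0 \left( \frac{\partial}{\partial L_2} - \frac{\partial}{\partial L_1} \right) \right] $$ Eq. 4•283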
where l0 is the length of the edge opposite node number “0”, ∆ is the area of the triangle, and µi is defined as
$$ \mu_0 = \frac{l_2^2 - l_1^2}{l_0^2}, \quad \mu_1 = \frac{l_0^2 - l_2^2}{l_1^2}, \quad \text{and} \quad \mu_2 = \frac{l_1^2 - l_0^2}{l_2^2} $$ Eq. 4•284
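The geometric quantities entering Eq. 4•283 and Eq. 4•284 can be computed from the corner coordinates as sketched below, assuming that l_i denotes the length of the edge opposite node i for all three nodes. The struct “TriangleGeometry” and the function “triangle_geometry” are hypothetical names for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct TriangleGeometry {
    std::array<double, 3> l;   // l[i]: length of the edge opposite node i
    std::array<double, 3> mu;  // mu[i] of Eq. 4•284
    double area;               // triangle area, Delta
};

// x[i] = (x_i, y_i) are the corner coordinates of the triangle.
TriangleGeometry triangle_geometry(const std::array<std::array<double, 2>, 3>& x) {
    TriangleGeometry g{};
    for (int i = 0; i < 3; ++i) {
        const auto& p = x[(i + 1) % 3];
        const auto& q = x[(i + 2) % 3];
        g.l[i] = std::hypot(q[0] - p[0], q[1] - p[1]);
    }
    g.area = 0.5 * std::fabs((x[1][0] - x[0][0]) * (x[2][1] - x[0][1]) -
                             (x[2][0] - x[0][0]) * (x[1][1] - x[0][1]));
    assert(g.area > 0.0);  // degenerate triangles are not supported
    for (int i = 0; i < 3; ++i) {
        const double lj = g.l[(i + 1) % 3], lk = g.l[(i + 2) % 3];
        g.mu[i] = (lk * lk - lj * lj) / (g.l[i] * g.l[i]);  // cyclic form of Eq. 4•284
    }
    return g;
}
```

For the right triangle with corners (0, 0), (1, 0), and (0, 1), the edge lengths are √2, 1, and 1, which gives µ0 = 0, µ1 = 1, and µ2 = -1.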
Similarly we can define for the other two normal derivatives ( ∂ ⁄ ∂n ) 4 and ( ∂ ⁄ ∂n ) 5 . The derivatives of “Pα”
with respect to L0, L1, and L2 are
1. p.27 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK.
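$$ \begin{Bmatrix} \left(\dfrac{\partial w}{\partial n}\right)_3 \\ \left(\dfrac{\partial w}{\partial n}\right)_4 \\ \left(\dfrac{\partial w}{\partial n}\right)_5 \end{Bmatrix} = \begin{Bmatrix} \dfrac{l_0}{4\Delta} \left[ -2\alpha_0 + (1 - \mu_0)\alpha_1 + (1 + \mu_0)\alpha_2 - \alpha_3 + \alpha_4 - \alpha_5 \right] \\ \dfrac{l_1}{4\Delta} \left[ -2\alpha_1 + (1 - \mu_1)\alpha_2 + (1 + \mu_1)\alpha_0 - \alpha_4 + \alpha_5 - \alpha_3 \right] \\ \dfrac{l_2}{4\Delta} \left[ -2\alpha_2 + (1 - \mu_2)\alpha_0 + (1 + \mu_2)\alpha_1 - \alpha_5 + \alpha_3 - \alpha_4 \right] \end{Bmatrix} $$ Eq. 4•286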
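$$ \mathbf{C} \equiv \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ \dfrac{-2 l_0}{4\Delta} & \dfrac{l_0 (1 - \mu_0)}{4\Delta} & \dfrac{l_0 (1 + \mu_0)}{4\Delta} & \dfrac{-l_0}{4\Delta} & \dfrac{l_0}{4\Delta} & \dfrac{-l_0}{4\Delta} \\ \dfrac{l_1 (1 + \mu_1)}{4\Delta} & \dfrac{-2 l_1}{4\Delta} & \dfrac{l_1 (1 - \mu_1)}{4\Delta} & \dfrac{-l_1}{4\Delta} & \dfrac{-l_1}{4\Delta} & \dfrac{l_1}{4\Delta} \\ \dfrac{l_2 (1 - \mu_2)}{4\Delta} & \dfrac{l_2 (1 + \mu_2)}{4\Delta} & \dfrac{-2 l_2}{4\Delta} & \dfrac{l_2}{4\Delta} & \dfrac{-l_2}{4\Delta} & \dfrac{-l_2}{4\Delta} \end{bmatrix} $$ Eq. 4•287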
The shape function is defined as N = P C⁻¹. We can still use the definition of the stiffness matrix from Eq. 4•264,
$$ \hat{\theta}_x = -\frac{\partial w}{\partial y}, \quad \text{and} \quad \hat{\theta}_y = \frac{\partial w}{\partial x} $$ Eq. 4•289
that improves the symmetry of the plate theory equations. The relation of θn to θ̂x and θ̂y can be expressed as
$$ \theta_n = \frac{\partial w}{\partial n} = n_x \frac{\partial w}{\partial x} + n_y \frac{\partial w}{\partial y} = (-n_y)\,\hat{\theta}_x + n_x\,\hat{\theta}_y $$ Eq. 4•290
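Eq. 4•289 and Eq. 4•290 amount to a small change of variables, which the following sketch makes explicit. The helper names “rotations” and “normal_rotation” are hypothetical, and the outward normal (nx, ny) is assumed to be of unit length.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Eq. 4•289: rotations (theta_x_hat, theta_y_hat) from the deflection gradient.
std::array<double, 2> rotations(double dw_dx, double dw_dy) {
    return {-dw_dy, dw_dx};
}

// Eq. 4•290: normal derivative expressed through the rotations.
double normal_rotation(double nx, double ny, const std::array<double, 2>& theta_hat) {
    assert(std::fabs(nx * nx + ny * ny - 1.0) < 1e-12);  // unit normal assumed
    return -ny * theta_hat[0] + nx * theta_hat[1];
}
```

For a gradient (∂w/∂x, ∂w/∂y) = (0.3, -0.7) and a normal (0.6, 0.8), both the direct product n·∇w and the rotation form give θn = -0.38.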
#include "include\fe.h"
static const int row_node_no = 9;
Omega_h::Omega_h() {
int row_segment_no = (row_node_no - 1)/2;
double v[2]; int ena[6];
for(int i = 0; i < row_node_no; i++)
for(int j = 0; j < row_node_no; j++) {
int nn = i*row_node_no+j;
v[0] = (double)j/(double)(row_node_no-1); v[1] = (double)i/(double)(row_node_no-1);
Node* node = new Node(nn, 2, v); the_node_array.add(node);
}
for(int i = 0; i < row_segment_no; i++)
for(int j = 0; j < row_segment_no; j++) {
int nn = i*row_node_no*2+j*2;
ena[0] = nn; ena[1] = ena[0]+row_node_no*2+2; ena[2] = ena[1]-2;
ena[3] = ena[2] + 1; ena[4] = ena[0]+row_node_no; ena[5] = ena[4]+1;
int en = i*row_segment_no*2+j*2;
Omega_eh* elem = new Omega_eh(en, 0, 0, 6, ena); the_omega_eh_array.add(elem);
ena[0] = nn; ena[1] = nn+2; ena[2] = ena[1] + row_node_no*2;
ena[3] = ena[1] + row_node_no; ena[4] = ena[3] -1; ena[5] = ena[0] +1;
elem = new Omega_eh(en+1, 0, 0, 6, ena); the_omega_eh_array.add(elem);
}
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) { __initialization(df, omega_h);
for(int i = 1; i < row_node_no-1; i+=2) // bottom B.C. ∂w/∂n = 0
the_gh_array[node_order(i)](0) = gh_on_Gamma_h::Dirichlet;
for(int i = 1; i < row_node_no-1; i+=2) // left B.C. ∂w/∂n = 0
the_gh_array[node_order(i*row_node_no)](0) = gh_on_Gamma_h::Dirichlet;
for(int i = 0; i < row_node_no; i+=2) { // right B.C. w = ∂w/∂n = 0
if(i > 0) // for i = 0 the expression i*row_node_no-1 would be the invalid node number -1
the_gh_array[node_order(i*row_node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order((i+1)*row_node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
}
for(int i = 0; i < row_node_no-1; i+=2) { // top B.C. w = ∂w/∂n = 0
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](0) =
the_gh_array[node_order(row_node_no*(row_node_no-1)+i+1)](0) =
gh_on_Gamma_h::Dirichlet;
}
the_gh_array[node_order(row_node_no*row_node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
}
class PlateMorley6 : public Element_Formulation {
public:
PlateMorley6(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
PlateMorley6(int, Global_Discretization&);
};
Element_Formulation* PlateMorley6::make(int en, Global_Discretization& gd) {
return new PlateMorley6(en,gd);
}
static const double E_ = 1.0;
static const double v_ = 0.25;
static const double t_ = 0.01;
static const double D_ = E_ * pow(t_,3) / (12.0*(1-pow(v_,2)));
// D = E t^3 / (12 (1 - ν^2)) * [ 1 ν 0; ν 1 0; 0 0 (1-ν)/2 ]
static const double Dv[3][3] = {
{D_, D_*v_, 0.0 },
{D_*v_, D_, 0.0 },
{0.0, 0.0, D_*(1-v_)/2.0}
};
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
Listing 4•23 Morley’s 6-dof triangular plate bending (project workspace file “fe.dsw”, project “morley_plate_bending”).
Mixed Formulation
Eq. 4•96 to Eq. 4•98 from Chapter 4 are rewritten for convenience as
$$ \nabla \cdot \mathbf{q} = f $$ Eq. 5•1
$$ \mathbf{q} = -\kappa\,\nabla T $$ Eq. 5•2
Eq. 5•1 states that the divergence of the heat flux, “q”, is equal to the internal heat source, f. Eq. 5•2 is the Fourier
law of heat conduction which assumed that the heat flux is linearly related to the negative gradient of the temper-
Integrating by parts and applying divergence theorem on the first term yields
$$ \int_{\Omega} w_q \left( \kappa^{-1} \mathbf{q} + \nabla T \right) d\Omega = 0 $$ Eq. 5•6
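Finite element approximations of (1) the temperature field, “T”, and (2) the heat flux, “q”, are defined as
where $\hat{T}^a$ and $\hat{\mathbf{q}}^a$ are nodal variables (bases), and “a” is the nodal index for the temperature and heat flux nodes. $\phi_T^a$ and $\phi_q^a$ are two shape functions which can be different from each other. Substituting Eq. 5•7 into Eq. 5•5 and Eq. 5•6 we have the mixed finite element formulation for the heat conduction problem as¹
$$ \begin{bmatrix} \mathbf{A} & \mathbf{C}^T \\ \mathbf{C} & \mathbf{0} \end{bmatrix} \begin{Bmatrix} \hat{\mathbf{q}} \\ \hat{\mathbf{T}} \end{Bmatrix} = \begin{Bmatrix} \mathbf{f}_1 \\ \mathbf{f}_2 \end{Bmatrix} $$ Eq. 5•8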
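where
$$ \mathbf{f}_1 = -\mathbf{A}\,\hat{\bar{\mathbf{h}}}\Big|_{\Gamma_h^e} - \mathbf{C}^T \hat{\bar{\mathbf{g}}}\Big|_{\Gamma_g^e} $$ Eq. 5•11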
1. p.319-324 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc.,
UK.
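$$ \mathbf{f}_2 = -\int_{\Omega_e} \phi_T f\, d\Omega + \int_{\Gamma_h^e} \phi_T h\, d\Gamma - \mathbf{C}\,\hat{\bar{\mathbf{h}}}\Big|_{\Gamma_h^e} $$ Eq. 5•12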
where h and g with “over-bar” are the fixed nodal flux boundary conditions and the fixed nodal temperature boundary conditions, respectively, while “h” in Eq. 5•12 can be specified as a function on an element boundary. The fixed nodal boundary conditions in Eq. 5•11 and Eq. 5•12 are encountered frequently in the finite element method and are taken care of by “fe.lib” as default behavior behind the scenes. This is consistent with the treatment of displacement boundary conditions, whose effect is treated as a reaction and subtracted out of the nodal force term as “f -= K u”.
Inspecting Eq. 5•9 and Eq. 5•10, we see that derivatives of the temperature field appear. C0-continuity of “T” on the element interior and boundaries is therefore required; this guarantees that Eq. 5•10 is integrable over the entire problem domain. Otherwise, the integral becomes unbounded at the discontinuities. Two triangular elements are considered in the following computations. Figure 5•1a and Figure 5•1b show two triangular elements, each with a linear temperature field with three corner nodes, and with either a constant heat flux with one node at the center of the element or a linear heat flux with three nodes at the Gaussian integration points. We notice that the temperature nodes at the three corners are necessary to ensure C0-continuity on the element boundaries, while the heat flux, since no derivatives of the heat flux appear in Eq. 5•9 and Eq. 5•10, can be discontinuous at the element boundaries. Therefore, the constant heat flux element has one node at the center of the element (Figure 5•1a), and the linear heat flux element has three nodes at the Gaussian integration points (Figure 5•1b). Actually, requiring the heat flux to be C0-continuous on the element boundaries, had we used three corner nodes for the heat flux, may cause physically incorrect conditions.¹
Recall that in Section 4.2.5, for the mixed formulation of the one-dimensional beam bending problem, we extended “fe.lib” with object-oriented modeling of matrix substructuring to solve systems of equations in submatrices similar to Eq. 5•8.
[Figure: boundary conditions T = 30°C, q = 0, q = 0.]
1. e.g. as discussed in p.327 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 1.
McGraw-Hill, Inc., UK.
The area coordinates for a triangle, L0, L1, and L2, are used for the coordinate transformation and integration. The first three nodes are geometrical nodes at the three corners and the fourth node is the q-node at the center. A reference matrix “x” (line 12) is constructed to refer to the first three coordinates of “xl”, which leaves out the q-node. To compute the Jacobian, we notice that there is a factor of 1/2 for a triangular element compared to a quadrilateral element, where the factor is 1 (note that a factor of 1/6 is to be used for a 3-D tetrahedral element). The q-shape function, “N” (line 16), is defined such that the first three shape functions, corresponding to the three geometrical nodes, are zero; i.e., “N[0] = N[1] = N[2] = 0”, and the fourth shape function is one; i.e., “N[3] = 1”. The element stiffness matrix so constructed has the size of 8 × 8, in which only the 2 × 2 submatrix at the lower-right corner, corresponding to the center q-node, contains nonzero components. All the geometrical nodes are to be specified with Dirichlet-type boundary conditions. The dummy variables corresponding to these geometrical nodes will all be eliminated from the global matrix and the global vector. Therefore, only the nontrivial 2 × 2 element submatrix will enter the global stiffness matrix.
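The structure of this 8 × 8 element matrix can be reproduced with a library-independent sketch. It assumes that the two flux components of each node are stored next to each other (qx, qy per node) and that the conductivity is diagonal with entries k_x and k_y; “triangle_area” and “constant_flux_A” are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat8 = std::array<std::array<double, 8>, 8>;
using Triangle = std::array<std::array<double, 2>, 3>;

// Area = |det J| / 2, where J maps the area coordinates (L0, L1) to (x, y)
// and L2 = 1 - L0 - L1; the factor 1/2 is the one noted in the text.
double triangle_area(const Triangle& x) {
    const double det = (x[0][0] - x[2][0]) * (x[1][1] - x[2][1]) -
                       (x[1][0] - x[2][0]) * (x[0][1] - x[2][1]);
    return std::fabs(det) / 2.0;
}

// A-submatrix of one constant-heat-flux element: the integrand
// N_a (1/k_i) N_b is constant, so the integral is the area times that value.
Mat8 constant_flux_A(const Triangle& x, double k_x, double k_y) {
    assert(k_x > 0.0 && k_y > 0.0);
    const double N[4] = {0.0, 0.0, 0.0, 1.0};  // only the center q-node is active
    const double k_inv[2] = {1.0 / k_x, 1.0 / k_y};
    const double area = triangle_area(x);
    Mat8 A{};  // zero-initialized
    for (int a = 0; a < 4; ++a)
        for (int b = 0; b < 4; ++b)
            for (int i = 0; i < 2; ++i)
                A[2 * a + i][2 * b + i] = area * N[a] * k_inv[i] * N[b];
    return A;
}
```

For a triangle of area 1/2 with k_x = 2 and k_y = 4, the only nonzero entries are A[6][6] = 0.25 and A[7][7] = 0.125, the lower-right 2 × 2 block described above.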
The C-submatrix can be defined as
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
static const int row_t_node_no = 4; static const double h_e = 1.0;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) {
if(i == 0) {
double xl[3][2], v[2]; Node *node;
// Ωq: make center q-nodes
for(int j = 0; j < row_t_node_no-1; j++)
for(int k = 0; k < row_t_node_no-1; k++) {
int node_no = (j*(row_t_node_no-1) + k) *2;
// corner coordinates
xl[0][0] = (double)k; xl[0][1] = (double)j;
xl[1][0] = (double)(k+1); xl[1][1] = (double)j;
xl[2][0] = (double)(k+1); xl[2][1] = (double)(j+1);
// geometrical center coordinates
for(int l = 0; l < 2; l++) v[l] = (xl[0][l] + xl[1][l] + xl[2][l])/3.0;
node = new Node(node_no, 2, v); node_array().add(node);
xl[0][0] = (double)k; xl[0][1] = (double)j;
xl[1][0] = (double)(k+1); xl[1][1] = (double)(j+1);
xl[2][0] = (double)k; xl[2][1] = (double)(j+1);
for(int l = 0; l < 2; l++) v[l] = (xl[0][l] + xl[1][l] + xl[2][l])/3.0;
node = new Node(node_no+1, 2, v); node_array().add(node);
}
// corner geometrical nodes
for(int j = 0; j < row_t_node_no; j++)
for(int k = 0; k < row_t_node_no; k++) {
int nn = j*row_t_node_no+k+(row_t_node_no-1)*2*(row_t_node_no-1);
v[0] = ((double)k)*h_e; v[1] = ((double)j)*h_e;
node = new Node(nn, 2, v); node_array().add(node);
}
for(int j = 0; j < row_t_node_no-1; j++) // make heat flux elements, Ωqe
for(int k = 0; k < row_t_node_no-1; k++) {
int element_no = ((row_t_node_no-1)*j + k)*2,
center_node_no = element_no,
first_corner_node_no = j*row_t_node_no+k +(row_t_node_no-1)*2*(row_t_node_no-1);
int ena[4];
ena[0] = first_corner_node_no+1+row_t_node_no; ena[1] = first_corner_node_no;
ena[2] = first_corner_node_no+1; ena[3] = center_node_no;
Omega_eh* elem = new Omega_eh(element_no, 0, 0, 4, ena);
omega_eh_array().add(elem);
ena[0] = first_corner_node_no; ena[1] =
first_corner_node_no+1+row_t_node_no;
ena[2] = first_corner_node_no+row_t_node_no; ena[3] = center_node_no+1;
elem = new Omega_eh(element_no+1, 0, 0, 4, ena); omega_eh_array().add(elem);
}
} else if(i == 1) { // ΩT
double v[2]; // make T-corner nodes
for(int j = 0; j < row_t_node_no; j++)
for(int k = 0; k < row_t_node_no; k++) {
int nn = j*row_t_node_no+k; v[0] = ((double)k)*h_e; v[1] = ((double)j)*h_e;
Node* node = new Node(nn, 2, v); node_array().add(node);
}
// make temperature elements, ΩTe
for(int j = 0; j < row_t_node_no-1; j++)
for(int k = 0; k < row_t_node_no-1; k++) {
int element_no = ((row_t_node_no-1)*j + k)*2,
first_corner_node_no = j*row_t_node_no+k, ena[3];
ena[0] = first_corner_node_no+1+row_t_node_no;
ena[1] = first_corner_node_no; ena[2] = first_corner_node_no+1;
Omega_eh* elem = new Omega_eh(element_no, 0, 0, 3, ena);
omega_eh_array().add(elem);
// A = ∫_Ωe φq ⊗ (κ^-1 φq) dΩ
H0 N_q = ((~N) || C0(0.0)) &
(C0(0.0)|| (~N));
stiff &= ((~N_q) * (K_inv * N_q)) | dv;
}
Element_Formulation_Couple* Heat_Mixed_Formulation::make(
int en, Global_Discretization_Couple& gdc) {
return new Heat_Mixed_Formulation(en,gdc);
}
// C-submatrix element formulation
Heat_Mixed_Formulation::Heat_Mixed_Formulation(int en, Global_Discretization_Couple& gdc)
: Element_Formulation_Couple(en, gdc) {
Quadrature qp(2, 4);
H1 L(2, (double*)0, qp),
// n ≡ φT: linear triangular shape function for both the coordinate
// transformation and the temperature field
n = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE(
"int, int, Quadrature", 3/*nen*/, 2/*nsd*/, qp),
L0 = L[0], L1 = L[1], L2 = 1.0 - L0 - L1; n[0] = L0; n[1] = L1; n[2] = L2;
H1 X = n*xl;
H0 nx = d(n) * d(X).inverse();
J dv(d(X).det()/2.0);
// N ≡ φq = 1.0 (constant over element): heat flux field shape function
H0 N = INTEGRABLE_VECTOR("int, Quadrature", 4/*nen*/, qp);
N[0] = N[1] = N[2] = 0.0; N[3] = 1.0;
H0 N_q = ((~N) || C0(0.0)) &
(C0(0.0)|| (~N));
// C = ∫_Ωe ∇φT ⊗ φq dΩ
stiff &= (nx * N_q) | dv;
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Heat_Mixed_Formulation
heat_mixed_formulation_instance(element_type_register_instance);
Listing 5•1 Substructure method for the mixed formulation of the heat conduction problem with constant
heat flux (project: “mixed_heat_conduction” in project workspace file “fe.dsw”).
The element stiffness matrix so generated has the size of 3 × 8. The three rows (number of equations) correspond to the three temperature nodes at the corners. After the element-to-global mapping, the first 6 columns (corresponding to the dummy variables) for the geometrical nodes will not enter the global stiffness matrix. Only the last 2 columns (corresponding to the actual variables) for the center q-node will survive.
After the submatrices have been formed, the solution of the system of equations in Eq. 5•8 is the subject of the constrained optimization problem (see Section 2.3.3 in Chapter 2). In the context of the finite element problem, the objective functional for optimization is quadratic. The problem is therefore a quadratic programming problem, in which only one step along the search path is needed to reach the exact solution (see the introduction and its example on page 145). We discuss the range space method and the null space method (see page 149) to solve Eq. 5•8 in the following.
From the first equation of Eq. 5•8 we have
$$ \mathbf{A}\,\hat{\mathbf{q}} + \mathbf{C}^T \hat{\mathbf{T}} = \mathbf{f}_1 $$ Eq. 5•13
Since A is symmetric positive definite, it can be inverted, and we can solve for $\hat{\mathbf{q}}$ by
Substituting Eq. 5•14 into the second equation of Eq. 5•8, $\mathbf{C}\hat{\mathbf{q}} = \mathbf{f}_2$, we have
Therefore, “$\hat{\mathbf{T}}$” can also be solved, considering that the transformation of A⁻¹ as “C A⁻¹ Cᵀ” preserves the symmetric positive definiteness of A⁻¹ (provided C has full row rank). An alternative view is that “C A⁻¹ Cᵀ” is the projection of inverse
After we obtain the nodal temperature solution “$\hat{\mathbf{T}}$”, the nodal heat flux solution “$\hat{\mathbf{q}}$” can be solved from Eq. 5•13 as
Eq. 5•16 and Eq. 5•17 are the formulas for the range space method. We notice that the size of the square A-submatrix is the number of $\hat{\mathbf{q}}$-free-variables, denoted as “nq”. The row size of the C-submatrix is the number of $\hat{\mathbf{T}}$-free-variables, denoted as “nT” (the C-submatrix has the size nT × nq). Assume the full-rank condition for both the A and C submatrices: the rank of A⁻¹ is “nq” and the column rank of Cᵀ is “nT”. For “C A⁻¹ Cᵀ” to be non-singular, so that its inversion is possible, we must have
$$ n_q \ge n_T $$ Eq. 5•18
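The range space steps discussed above can be sketched for small dense matrices as follows: solve (C A⁻¹ Cᵀ) T̂ = C A⁻¹ f1 - f2 for the temperatures, then recover the fluxes from Eq. 5•13. This is a plain C++ illustration, not the library implementation; it substitutes Gaussian elimination for the Cholesky decomposition, and the names “solve”, “multiply”, “MixedSolution”, and “range_space_solve” are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Gaussian elimination with partial pivoting; returns x with a x = b.
Vec solve(Mat a, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t piv = k;
        for (std::size_t r = k + 1; r < n; ++r)
            if (std::fabs(a[r][k]) > std::fabs(a[piv][k])) piv = r;
        assert(std::fabs(a[piv][k]) > 1e-12);  // matrix must be non-singular
        std::swap(a[k], a[piv]);
        std::swap(b[k], b[piv]);
        for (std::size_t r = k + 1; r < n; ++r) {
            const double f = a[r][k] / a[k][k];
            for (std::size_t c = k; c < n; ++c) a[r][c] -= f * a[k][c];
            b[r] -= f * b[k];
        }
    }
    Vec x(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t c = k + 1; c < n; ++c) s -= a[k][c] * x[c];
        x[k] = s / a[k][k];
    }
    return x;
}

Vec multiply(const Mat& m, const Vec& v) {
    Vec r(m.size(), 0.0);
    for (std::size_t i = 0; i < m.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j) r[i] += m[i][j] * v[j];
    return r;
}

struct MixedSolution { Vec q, T; };

// Solves A q + C^T T = f1, C q = f2, with A of size nq x nq and C of size nT x nq.
MixedSolution range_space_solve(const Mat& A, const Mat& C, const Vec& f1, const Vec& f2) {
    const std::size_t nq = A.size(), nT = C.size();
    assert(nq >= nT);  // Eq. 5•18
    Mat S(nT, Vec(nT, 0.0));  // S = C A^{-1} C^T
    for (std::size_t j = 0; j < nT; ++j) {
        const Vec col = solve(A, C[j]);  // A^{-1} times column j of C^T (row j of C)
        for (std::size_t i = 0; i < nT; ++i)
            for (std::size_t k = 0; k < nq; ++k) S[i][j] += C[i][k] * col[k];
    }
    Vec rhs = multiply(C, solve(A, f1));  // C A^{-1} f1
    for (std::size_t i = 0; i < nT; ++i) rhs[i] -= f2[i];
    MixedSolution s;
    s.T = solve(S, rhs);
    Vec r = f1;  // f1 - C^T T
    for (std::size_t k = 0; k < nq; ++k)
        for (std::size_t i = 0; i < nT; ++i) r[k] -= C[i][k] * s.T[i];
    s.q = solve(A, r);
    return s;
}
```

For A = diag(2, 1), C = [1 1], f1 = (1, 2), and f2 = (3), the sketch gives T̂ = -1/3 and q̂ = (2/3, 7/3), which satisfies both block equations of Eq. 5•8.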
Based on the above simplistic linear algebraic discussion, Eq. 5•18 can be used to conceptually perform a “patch test” in the design of a “q-T” two-field mixed finite element.¹ However, a mathematically more rigorous theorem on the existence and uniqueness of the solution of the constrained optimization problem (in the abstract form of the saddle-point problem) is known as the LBB-condition (named after Ladyzhenskaya, Babuska, and Brezzi, and also known as the inf-sup condition).² The implementation of the range space method is
int main() {
Matrix_Representation mr(q_gd);
Matrix_Representation_Couple mrc(gdc, 0, 0, &(mr.rhs()), &mr);
mrc.assembly();
mr.assembly();
C0 A = ((C0)(mr.lhs())), f_1 = ((C0)(mr.rhs())),
C = ((C0)(mrc.lhs())), f_2 = ((C0)(mrc.rhs()));
Cholesky dA(A); // Cholesky decomposition on Hessian A
C0 Ainv = dA.inverse(),
CAinvCt = C*Ainv*(~C);
Cholesky dCAinvCt(CAinvCt); // Cholesky decomposition on C A^-1 C^T
C0 T = dCAinvCt*((C*Ainv*f_1)-f_2),
q = dA*(f_1-(~C)*T);
q_h = q; // update free degree of freedom
q_h = q_gd.gh_on_gamma_h(); // update fixed degree of freedom
cout << "heat flow:" << endl;
1. p.324-327 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc.,
UK.
2. see Brezzi, F., and M. Fortin, 1991, “Mixed and hybrid finite element methods”, Springer-Verlag New York, Inc.
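respectively, with
$$ \mathbf{f}_2^e = -\int_{\Omega_e} \phi_T f\, d\Omega $$ Eq. 5•20
and,
$$ \int_{\Gamma_h^e} \phi_T h\, d\Gamma - \mathbf{C}_e\,\hat{\bar{\mathbf{h}}}\Big|_{\Gamma_h^e} = 0 $$ Eq. 5•21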
1. see p.206 in Hughes, T. J.R., “The finite element method: linear static and dynamic finite element analysis”, Prentice-Hall,
Inc., Englewood Cliffs, New Jersey.
$$ f_1 = -C^T \hat{g}\big|_{\Gamma_g} $$ Eq. 5•22
Substituting the element level vector of Eq. 5•22 into the second equation in Eq. 5•19, we have
$$ f_e = -C_e A_e^{-1} C_e^T \, \hat{g}_e\big|_{\Gamma_g^e} - f_2^e = -k_e \, \hat{g}_e\big|_{\Gamma_g^e} - f_2^e $$ Eq. 5•23
The first term on the right-hand side is consistent with the standard finite element implementation of the essential boundary conditions. The mixed form of Eq. 5•19 can be implemented easily, and it does not require the mechanism provided in “fe.lib” to express matrix substructuring. Program Listing 5•2 implements Eq. 5•19 (project “mixed_T_heat_conduction” in project workspace file “fe.dsw”). The temperature solution of the computation is identical to that of the full-fledged mixed formulation. The program is significantly simplified compared to the full-scale mixed formulation.
#include "include\fe.h"
static int row_node_no = 4;
EP::element_pattern EP::ep = EP::SLASH_TRIANGLES;
Omega_h::Omega_h() {
double coord[4][2] = {{0.0, 0.0}, {3.0, 0.0}, {3.0, 3.0}, {0.0, 3.0}};
int control_node_flag[4] = {TRUE, TRUE, TRUE, TRUE};
block(this, row_node_no, row_node_no, 4, control_node_flag, coord[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) { __initialization(df, omega_h);
for(int j = 0; j < row_node_no; j++) {
the_gh_array[node_order(row_node_no*(row_node_no-1)+j)](0) =
the_gh_array[node_order(j)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(row_node_no*(row_node_no-1)+j)][0] =
((double)(row_node_no-1))*10.0;
the_gh_array[node_order(j)][0] = 0.0;
}
}
class HeatMixedT3 : public Element_Formulation {
public:
HeatMixedT3(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
HeatMixedT3(int, Global_Discretization&);
};
Element_Formulation* HeatMixedT3::make(int en, Global_Discretization& gd) {
return new HeatMixedT3(en,gd); }
double k_x = 1.0, k_y = 1.0, k_inv[2][2] = { {1.0/k_x, 0.0}, { 0.0, 1.0/k_y}};
C0 K_inv = MATRIX("int, int, const double*", 2, 2, k_inv[0]);
HeatMixedT3::HeatMixedT3(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
Quadrature qp(2, 4);
H1 L(2, (double*)0, qp),
n = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 3, 2, qp),
L0 = L[0], L1 = L[1], L2 = 1.0 - L0 - L1;
n[0] = L0; n[1] = L1; n[2] = L2;
H1 X = n*xl; H0 nx = d(n) * d(X).inverse(); J dv(d(X).det()/2.0);
H0 N = INTEGRABLE_SCALAR("Quadrature", qp); N = 1.0;
H0 N_q = ((~N) || C0(0.0)) &
(C0(0.0)|| (~N));
C0 C = (nx * N_q) | dv, // C = ∫_Ωe ∇φT ⊗ φq dΩ
A = ((~N_q) * (K_inv * N_q)) | dv, // A = ∫_Ωe φq ⊗ (κ^-1 φq) dΩ
A_inv = A.inverse();
stiff &= C*A_inv*(~C); // k = C A^-1 C^T
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static HeatMixedT3 heatmixedt3_instance(element_type_register_instance);
int main() {
int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh); U_h uh(ndf, oh);
Global_Discretization gd(oh, gh, uh);
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
uh = u; uh = gh;
cout << uh << endl;
return 0;
}
Listing 5•2 Mixed formulation with discontinuous temperature field (C0 continuity dropped) reduces to
temperature field only formulation (project: “mixed_t_field_heat_conduction”).
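where
$$ B_a \equiv \begin{bmatrix} b_{xa} \\ b_{ya} \end{bmatrix} \equiv \begin{bmatrix} \partial N_a / \partial x \\ \partial N_a / \partial y \end{bmatrix} $$ Eq. 5•25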
and a = 0, 1, 2, 3. A reduced integration (1 Gauss point at the center of the element) will result in rank deficiency of the element stiffness matrix. The rank of the element stiffness matrix is the number of integration points times the number of independent relations. In the case of heat conduction, the number of independent relations is the number of equations relating the heat flux [qx, qy] and the temperature gradients [∂T/∂x, ∂T/∂y] by the Fourier law of heat conduction in 2-D. Therefore, the rank of the stiffness matrix for the 1-point Gauss integration rule is 2 (= 2 × 1). For the element stiffness matrix of size 4 × 4 in Eq. 5•24, the 1-point Gauss integration leads to a rank deficiency of 2 (= 4 - 2). We can consider the 1-point integration stiffness matrix to be spanned by bx = {bxa, bya} in Eq. 5•25, which are bases in ℝ^4. The two spurious zero energy modes in the null space are the constant solution mode with nodal solution sa = [1, 1, 1, 1] and the hourglass mode with nodal solution ha = [-1, 1, -1, 1]. The hourglass mode is illustrated in Figure 5•2. The two zero energy modes are orthogonal to bx
where we also denote x0 = x, and x1 = y. We notice that for isoparametric coordinate transformation xi = Na xia ,
we have the relation
[Figure: plot over the reference element; horizontal axes ξ and η from -1 to 1, vertical axis T.]
The constant nodal solution sa is considered “proper”, since a constant temperature field is expected to produce no heat flux, while the hourglass mode ha has a gradient far from zero (see Figure 5•2) yet produces no heat flux, which is considered “improper”. Therefore, we expect the 4-node element stiffness to be a rank 3 matrix. The strategy is to use a so-called trial hourglass mode, Ψa, to construct a correctly ranked stiffness ke in a way that is very economical to compute as
The first term on the right-hand side, ke(1-point), is the standard stiffness matrix computed with only one Gaussian integration point. The second term, ke(hourglass), is to be constructed with the trial hourglass mode Ψa. The general form of the trial hourglass mode, Ψa, in ℝ^4 is spanned by the four bases [bxa, bya, sa, ha] as
From the last part of Eq. 5•31, no component of Ψa is in sa, so in Eq. 5•29 a2 = 0, and this equation can be rewritten as
Substituting Eq. 5•32 into Ψa xa = 0 in the first part of Eq. 5•31, and using bxa xa = 1 and bya xa = 0 as indicated in Eq. 5•27, we have
a0 = - ha xa Eq. 5•33
Similarly, substituting Eq. 5•32 into Ψa ya = 0 in the second part of Eq. 5•31, and using bya ya = 1 and bxa ya = 0 as in Eq. 5•27, we have
a1 = - ha ya Eq. 5•34
where Ψ̃a is Ψa normalized to ||Ψ̃a|| = 2, b = [bx, by]^T, and J ≡ det(∂x ⁄ ∂ξ). Program Listing 5•3 implements the hourglass element for heat conduction (project “hourglass_heat” in project workspace file “fe.dsw”).
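The construction of the trial hourglass mode can be checked numerically without the library. The sketch below, which assumes a 4-node bilinear quadrilateral with nodes numbered counterclockwise as in Listing 5•3, evaluates bxa and bya at the element center, forms Ψa = ha + a0 bxa + a1 bya with a0 and a1 from Eq. 5•33 and Eq. 5•34, and lets one verify that Ψa is orthogonal to xa, ya, and sa. The function names `dot4`, `center_gradients`, and `trial_hourglass_mode` are hypothetical and are not part of “fe.lib”.

```cpp
#include <cassert>
#include <cmath>

// Dot product of two nodal vectors of a 4-node element.
double dot4(const double u[4], const double v[4]) {
    double s = 0.0;
    for (int a = 0; a < 4; ++a) s += u[a] * v[a];
    return s;
}

// Gradients bx[a] = dN_a/dx and by[a] = dN_a/dy of the bilinear shape
// functions at the element center (xi = eta = 0).
void center_gradients(const double x[4], const double y[4],
                      double bx[4], double by[4]) {
    const double dN_dxi[4]  = {-0.25,  0.25, 0.25, -0.25};
    const double dN_deta[4] = {-0.25, -0.25, 0.25,  0.25};
    const double x_xi = dot4(dN_dxi, x), x_eta = dot4(dN_deta, x);
    const double y_xi = dot4(dN_dxi, y), y_eta = dot4(dN_deta, y);
    const double det = x_xi * y_eta - x_eta * y_xi;  // Jacobian determinant
    assert(det > 0.0);  // element must not be inverted
    for (int a = 0; a < 4; ++a) {
        bx[a] = ( y_eta * dN_dxi[a] - y_xi * dN_deta[a]) / det;
        by[a] = (-x_eta * dN_dxi[a] + x_xi * dN_deta[a]) / det;
    }
}

// Trial hourglass mode psi_a = h_a - (h.x) bx_a - (h.y) by_a (not normalized).
void trial_hourglass_mode(const double x[4], const double y[4], double psi[4]) {
    const double h[4] = {-1.0, 1.0, -1.0, 1.0};  // hourglass mode
    double bx[4], by[4];
    center_gradients(x, y, bx, by);
    const double a0 = -dot4(h, x);  // Eq. 5.33
    const double a1 = -dot4(h, y);  // Eq. 5.34
    for (int a = 0; a < 4; ++a) psi[a] = h[a] + a0 * bx[a] + a1 * by[a];
}
```

On a rectangular element ha xa = ha ya = 0, so Ψa reduces to ha itself; the correction terms matter only for distorted elements.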
We measure the time spent computing the stiffness matrix for the heat conduction element with the irreducible formulation in Chapter 4. It takes 3.5 seconds to assemble the stiffness on an obsolete 166 MHz PC. For the mixed formulation in the last section, although we gain significant freedom in terms of formulation, the assembly of the diagonal and off-diagonal stiffness matrices (A and C) takes 6.2 seconds on the same computer. The hourglass element takes only 0.5 second to assemble the stiffness.
1. see similar derivation for elasticity in p.251-254 in Hughes, T.J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, and references therein.
#include "include\fe.h"
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES;
Omega_h::Omega_h() {
double coord[4][2] = {{0.0, 0.0}, {3.0, 0.0}, {3.0, 3.0}, {0.0, 3.0}};
int control_node_flag[4] = {1, 1, 1, 1};
block(this, 4, 4, 4, control_node_flag, coord[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
int row_node_no = 4;
for(int i = 0; i < row_node_no; i++) {
the_gh_array[node_order(i)](0) = gh_on_Gamma_h::Dirichlet; // bottom B.C. = 0°C
// top B.C. = 30°C
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)](0) =
gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(row_node_no*(row_node_no-1)+i)][0] = 30.0;
}
}
class HourGlassHeatQ4 : public Element_Formulation { public:
HourGlassHeatQ4(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
HourGlassHeatQ4(int, Global_Discretization&);
};
Element_Formulation* HourGlassHeatQ4::make(int en, Global_Discretization& gd) {
return new HourGlassHeatQ4(en,gd);
}
HourGlassHeatQ4::HourGlassHeatQ4(int en, Global_Discretization& gd) :
Element_Formulation(en, gd) {
Quadrature qp(2, 1); // 1-point Gaussian integration for 2-D
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
H1 X = N*xl; H0 Nx = d(N) * d(X).inverse(); J dv(d(X).det());
double k_ = 1.0; C0 K_standard = (Nx * k_ * (~Nx)) | dv; // ke(1-point) from irreducible formulation
C0 h(4, (double*)0); h[0] = -1.0; h[1] = 1.0; h[2] = -1.0; h[3] = 1.0; // hourglass mode ha = [-1, 1, -1, 1]
C0 phi = h - Nx.quadrature_point_value(0)*((~xl)*h); // Ψa = ha - (ha xa) bxa - (ha ya) bya
double factor = 2.0/norm(phi); phi *= factor; // Ψ̃a is Ψa normalized to ||Ψ̃a|| = 2
H0 b = Nx(0) & Nx(1); // b = [bx, by]^T
double j = (double)(d(X).det().quadrature_point_value(0)); // J ≡ det(∂x/∂ξ)
C0 K_hourglass = (k_*j*((~b)*b) | dv) /12.0 * (phi%phi); // ke(hourglass) ≡ (k J (b • b) / 12) (Ψ̃ ⊗ Ψ̃)
stiff &= K_standard + K_hourglass; // ke = ke(1-point) + ke(hourglass)
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static HourGlassHeatQ4 hourglassheatq4_instance(element_type_register_instance);
int main() {
int ndf = 1; Omega_h oh; gh_on_Gamma_h gh(ndf, oh); U_h uh(ndf, oh);
Global_Discretization gd(oh, gh, uh); Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
uh = u; uh = gh;
cout << uh << endl;
return 0;
}
Listing 5•3 One point Gauss integration stiffness matrix with hourglass stabilizer element (project:
“hourglass_heat” in project workspace file “fe.dsw”).
$$ \delta\mathcal{L}(u) = \int_\Omega (L\,\delta u)^T \sigma \, d\Omega - \int_\Omega \delta u^T b \, d\Omega - \int_{\Gamma_h} \delta u^T h \, d\Gamma $$ Eq. 5•37
where the body force is denoted as b, the strain is defined as ε = Lu, and the differential operator L in matrix form is
$$ L \equiv \begin{bmatrix} \partial/\partial x & 0 \\ 0 & \partial/\partial y \\ \partial/\partial y & \partial/\partial x \end{bmatrix} $$ Eq. 5•38
The traction boundary condition is t = h on Γh. In the mixed formulation, in addition to the variational approximation of the equilibrium equations, we will also use variational approximations of the constitutive equations and the strain-displacement relations, separately. The interpolation functions Φ(x) in the finite element approximation of the displacement field are taken as
where Φea(x) are the interpolation functions, and ûea are the nodal displacements. The subscript “e” denotes the element level.
σ = D ε = D Lu Eq. 5•40
We also add stress field, σ, as additional variable in the Lagrangian functional such that
$$ \int_\Omega \delta\sigma^T \left( L u - D^{-1}\sigma \right) d\Omega = 0 $$ Eq. 5•41
Eq. 5•37 and Eq. 5•41 are the Euler-Lagrange equations corresponding to the Lagrangian functional.
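$$ \mathcal{L}(\sigma, u) = \frac{1}{2}\int_\Omega \sigma^T D^{-1}\sigma \, d\Omega + \int_\Omega u^T \left( L^T\sigma + b \right) d\Omega - \int_{\Gamma_h} u^T \left( n\sigma - h \right) d\Gamma $$ Eq. 5•42
$$ \sigma_e \equiv \Psi_e^a(x) \, \hat\sigma_e^a $$ Eq. 5•43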
Inspecting Eq. 5•37 and Eq. 5•41 shows that no derivatives are taken on the stress field. Therefore, the C0-continuity requirement on the element boundaries, which ensures that the integral equation does not give infinity, can be dropped. The interpolation functions Ψea(x), in contrast to Φea(x) in Eq. 5•39, can be taken as piece-wise continuous functions across the entire problem domain. For example, for four stress nodes taken at Gauss integration points with the natural coordinates
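$$ [\xi_a, \eta_a] = \left[-\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right], \left[\tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right], \left[\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right], \left[-\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right] $$ Eq. 5•44
$$ \Psi_e^a(x) \equiv \frac{1}{4} \left( 1 + 3\xi_a \xi \right) \left( 1 + 3\eta_a \eta \right) $$ Eq. 5•45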
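A quick way to see why the factor 3 appears in Eq. 5•45 is to evaluate the stress interpolation functions at the stress nodes themselves. The minimal sketch below (the function name `stress_shape` is hypothetical, not part of “fe.lib”) implements Eq. 5•45 with the node coordinates of Eq. 5•44, so that one can check that each function equals 1 at its own Gauss point and 0 at the other three, and that the four functions sum to 1 everywhere.

```cpp
#include <cassert>
#include <cmath>

// Stress interpolation function of node a (a = 0..3), Eq. 5.45, with the
// stress nodes placed at the 2x2 Gauss points of Eq. 5.44.
double stress_shape(int a, double xi, double eta) {
    assert(0 <= a && a < 4);
    const double g = 1.0 / std::sqrt(3.0);
    const double xi_a[4]  = {-g,  g, g, -g};
    const double eta_a[4] = {-g, -g, g,  g};
    return 0.25 * (1.0 + 3.0 * xi_a[a] * xi) * (1.0 + 3.0 * eta_a[a] * eta);
}
```

At node a itself, 3 ξa ξa = 1, so each bracket equals 2 and the product divided by 4 is 1; at a node with the opposite sign in either coordinate, one bracket vanishes.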
If such an interpolation, discontinuous at the element boundaries, is taken, the stress field can be approximated globally because there is no inter-element dependency, and the subscript “e” on Ψ can be dropped. The matrix form of Eq. 5•37 and Eq. 5•41, at element level, is
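$$ \begin{bmatrix} A & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} \hat\sigma \\ \hat u \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} $$ Eq. 5•46
where
$$ A = -\int_{\Omega_e} \Psi \otimes \left( D^{-1} \Psi \right) d\Omega $$ Eq. 5•47
$$ C = \int_{\Omega_e} B \otimes \Psi \, d\Omega $$ Eq. 5•48
$$ f_1 = -A \, (n\sigma)\big|_{\Gamma_\sigma^e} - C^T u\big|_{\Gamma_u^e} $$ Eq. 5•49
$$ f_2 = \int_{\Omega_e} \Phi \, b \, d\Omega + \int_{\Gamma_\sigma^e} \Phi \, (n\sigma) \, d\Gamma - C \, (n\sigma)\big|_{\Gamma_\sigma^e} $$ Eq. 5•50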
nσ ≥ nu Eq. 5•51
Two quadrilateral elements with four σ-nodes and eight u-nodes (Q 4/8)1 are used to compute the beam bending problem in the higher-order patch test in Chapter 4. The numbers of degrees of freedom for the two fields are nσ = 8 × 3 = 24 and nu = 13 × 2 - 4 = 22, which satisfy the conceptual patch test criterion in Eq. 5•51. The implementation of Eq. 5•8 to Eq. 5•12 is shown in Program Listing 5•4 (project “hellinger_reissner_variational_principle” in project workspace file “fe.dsw”). It presents no new difficulty beyond Program Listing 5•1. The tip-deflection solution is 0.75, which is exact.
1. p.331 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc., UK.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
static const int row_node_no = 5; static const int row_segment_no = row_node_no-1;
static const double L_ = 10.0; static const double c_ = 1.0;
static const double h_e_ = L_/((double)row_segment_no);
static const double E_ = 1.e3; static const double v_ = 0.3;
Omega_h_i::Omega_h_i( int i) : Omega_h(0) {
if(i == 0) { // Ωσ
double inv_sqrt3 = 1.0/sqrt(3.0), v[2], xl[4][2], zai, eta; Node *node;
// 1st element: coordinates of four corner nodes
xl[0][0] = 0.0; xl[0][1] = 0.0; xl[1][0] = 2.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 2.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 0.0; xl[3][1] = 2.0*c_;
zai = - inv_sqrt3; eta = - inv_sqrt3; // physical coordinates at (-1/√3, -1/√3)
for(int j = 0; j < 2; j++) v[j] =(1.0-zai)*(1.0-eta)/4.0*xl[0][j]+(1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(0, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = - inv_sqrt3; // physical coordinates at (1/√3, -1/√3)
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(1, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = inv_sqrt3; // physical coordinates at (1/√3, 1/√3)
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(2, 2, v); node_array().add(node);
zai = - inv_sqrt3; eta = inv_sqrt3; // physical coordinates at (-1/√3, 1/√3)
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(3, 2, v); node_array().add(node);
// 2nd element
xl[0][0] = 2.0*h_e_; xl[0][1] = 0.0; xl[1][0] = 4.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 4.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 2.0*h_e_; xl[3][1] = 2.0*c_;
zai = - inv_sqrt3; eta = - inv_sqrt3;
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(4, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = - inv_sqrt3;
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(5, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = inv_sqrt3;
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(6, 2, v); node_array().add(node);
zai = - inv_sqrt3; eta = inv_sqrt3;
for(int j = 0; j < 2; j++) v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(7, 2, v); node_array().add(node);
// node # 8-20 are geometrical nodes (serendipity)
v[0] = 0.0; v[1] = 0.0; node = new Node(8, 2, v); node_array().add(node);
v[0] = 1.0*h_e_; node = new Node(9, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(10, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(12, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(13, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(14, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(15, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(16, 2, v); node_array().add(node);
v[0] = 1.0*h_e_; node = new Node(17, 2, v); node_array().add(node);
// off-diagonal C-matrix definition
Element_Formulation_Couple* ElasticQ84_Mixed_Formulation::make(
int en, Global_Discretization_Couple& gdc) {
return new ElasticQ84_Mixed_Formulation(en,gdc);
}
ElasticQ84_Mixed_Formulation::ElasticQ84_Mixed_Formulation(
int en, Global_Discretization_Couple& gdc) : Element_Formulation_Couple(en, gdc) {
Quadrature qp(2, 9);
H1 Z(2, (double*)0, qp),
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 8, 2, qp),
Zai, Eta;
Zai &= Z[0]; Eta &= Z[1];
N[0] = (1.0-Zai)*(1.0-Eta)/4.0; N[1] = (1.0+Zai)*(1.0-Eta)/4.0;
N[2] = (1.0+Zai)*(1.0+Eta)/4.0; N[3] = (1.0-Zai)*(1.0+Eta)/4.0;
Listing 5•4 Substructure solution for the Hellinger-Reissner variational principle for plane elasticity
(project: “hellinger_reissner_variational_formulation” in project workspace file “fe.dsw”).
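Eq. 5•53 and Eq. 5•37 are the Euler-Lagrange equations of the Lagrangian functional
$$ \mathcal{L}(\varepsilon, \sigma, u) = \frac{1}{2}\int_\Omega \varepsilon^T D \, \varepsilon \, d\Omega - \int_\Omega \sigma^T \left( \varepsilon - L u \right) d\Omega - \int_\Omega u^T b \, d\Omega - \int_{\Gamma_h} u^T h \, d\Gamma $$ Eq. 5•54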
The Lagrangian functional in Eq. 5•54 is known as the Hu-Washizu variational principle. The finite element
approximation to the strain field uses the interpolation functions Ξ(x) as
The matrix form, at element level, of Eq. 5•37 and Eq. 5•53 is
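$$ \begin{bmatrix} A & C^T & 0 \\ C & 0 & E^T \\ 0 & E & 0 \end{bmatrix} \begin{bmatrix} \hat\varepsilon \\ \hat\sigma \\ \hat u \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} $$ Eq. 5•56
where
$$ A = \int_{\Omega_e} \Xi \otimes \left( D \, \Xi \right) d\Omega $$ Eq. 5•57
$$ E = \int_{\Omega_e} B \otimes \Psi \, d\Omega $$ Eq. 5•58
$$ C = -\int_{\Omega_e} \Xi \otimes \Psi \, d\Omega $$ Eq. 5•59
$$ f_3 = \int_{\Omega_e} \Phi \, b \, d\Omega + \int_{\Gamma_\sigma^e} \Phi \, (n\sigma) \, d\Gamma - E \, (n\sigma)\big|_{\Gamma_\sigma^e} $$ Eq. 5•62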
Program Listing 5•5 implements Eq. 5•56 to Eq. 5•62 (project “hu_washizu_variational_principle” in project workspace file “fe.dsw”), using the same patch test problem as for the Hellinger-Reissner variational principle in the previous section. We use shape functions with four nodes at the Gaussian integration points for both the stress (σ) and strain (ε) fields; i.e., Ξ = Ψ. This makes the C-matrix in Eq. 5•59 symmetric negative definite. Care should be taken if Cholesky decomposition is used, since it is applicable only to a symmetric positive definite matrix. The displacement (u) shape function Φ is that of an eight-node serendipity element. These choices satisfy the condition in Eq. 5•63. The coding of the matrix substructuring technique supported by “fe.lib” becomes a little more elaborate with three fields (ε, σ, u) instead of two fields (σ, u).
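The caution about Cholesky decomposition can be illustrated with a small dense example. The sketch below is a plain textbook Cholesky factorization, not the “Cholesky” class of the library; the names `cholesky` and `inverse_of_negative_definite` are hypothetical. It shows that factoring a symmetric negative definite C directly fails, while factoring -C succeeds and gives C^-1 = -((-C)^-1).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Mat;

// Cholesky factorization M = L L^T (L lower triangular).
// Returns false if M is not symmetric positive definite.
bool cholesky(const Mat& M, Mat& L) {
    const std::size_t n = M.size();
    L.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double s = M[i][j];
            for (std::size_t k = 0; k < j; ++k) s -= L[i][k] * L[j][k];
            if (i == j) {
                if (s <= 0.0) return false;  // not positive definite
                L[i][i] = std::sqrt(s);
            } else {
                L[i][j] = s / L[j][j];
            }
        }
    }
    return true;
}

// Inverse of a symmetric negative definite C, computed from the Cholesky
// factor of -C and negated at the end.
Mat inverse_of_negative_definite(const Mat& C) {
    const std::size_t n = C.size();
    Mat negC(C), L;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) negC[i][j] = -C[i][j];
    const bool ok = cholesky(negC, L);
    assert(ok);
    (void)ok;
    Mat inv(n, std::vector<double>(n, 0.0));
    for (std::size_t c = 0; c < n; ++c) {
        std::vector<double> y(n, 0.0), x(n, 0.0);
        for (std::size_t i = 0; i < n; ++i) {  // forward solve: L y = e_c
            double s = (i == c) ? 1.0 : 0.0;
            for (std::size_t k = 0; k < i; ++k) s -= L[i][k] * y[k];
            y[i] = s / L[i][i];
        }
        for (std::size_t i = n; i-- > 0;) {  // backward solve: L^T x = y
            double s = y[i];
            for (std::size_t k = i + 1; k < n; ++k) s -= L[k][i] * x[k];
            x[i] = s / L[i][i];
        }
        for (std::size_t i = 0; i < n; ++i) inv[i][c] = -x[i];  // C^-1 = -(-C)^-1
    }
    return inv;
}
```

For C = [-2 1; 1 -2], `cholesky(C, L)` reports failure, while `inverse_of_negative_definite(C)` returns -(1/3)[2 1; 1 2].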
The modification from the two-field problem is minor, however. The definitions of the discretized global domain and boundary have three index entries as
1. see discussion in p.333-334 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1.
McGraw-Hill, Inc., UK.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch Matrix_Representation_Couple::
Assembly_Switch = Matrix_Representation_Couple::ALL;
static const int row_node_no = 5; static const int row_segment_no = row_node_no-1;
static const double L_ = 10.0; static const double c_ = 1.0;
static const double h_e_ = L_/((double)row_segment_no);
static const double E_ = 1.e3; static const double v_ = 0.3;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) {
if(i == 0 || i == 1) { // define Ωε and Ωσ
double inv_sqrt3 = 1.0/sqrt(3.0), v[2], xl[4][2], zai, eta;
Node *node;
// elem # 0 nodal coordinates
xl[0][0] = 0.0; xl[0][1] = 0.0; xl[1][0] = 2.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 2.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 0.0; xl[3][1] = 2.0*c_;
zai = - inv_sqrt3; eta = - inv_sqrt3; // 1st Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 1st Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(0, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = - inv_sqrt3; // 2nd Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 2nd Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+(1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+(1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(1, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = inv_sqrt3; // 3rd Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 3rd Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(2, 2, v); node_array().add(node);
zai = - inv_sqrt3; eta = inv_sqrt3; // 4th Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 4th Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(3, 2, v); node_array().add(node);
// elem # 1 nodal coordinates
xl[0][0] = 2.0*h_e_; xl[0][1] = 0.0; xl[1][0] = 4.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 4.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 2.0*h_e_; xl[3][1] = 2.0*c_;
zai = - inv_sqrt3; eta = - inv_sqrt3; // 1st Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 1st Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(4, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = - inv_sqrt3; // 2nd Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 2nd Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(5, 2, v); node_array().add(node);
zai = inv_sqrt3; eta = inv_sqrt3; // 3rd Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 3rd Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(6, 2, v); node_array().add(node);
zai = - inv_sqrt3; eta = inv_sqrt3; // 4th Gauss point natural coordinates
for(int j = 0; j < 2; j++) // 4th Gauss point physical coordinates
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(7, 2, v); node_array().add(node);
stiff &= MATRIX("int, int", 36, 36);
C0 stiff_sub = MATRIX("int, int, C0&, int, int", 12, 12, stiff, 24, 24);
stiff_sub = ((~N_epsilon) * (D * N_epsilon)) | dv; // A = ∫_Ωe Ξ ⊗ (D Ξ) dΩ
} else { // K matrix; for iterative method to estimate initial u values
Quadrature qp(2, 9);
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE( "int, int, Quadrature", 8, 2, qp);
Listing 5•5 Substructure method for the three-field Hu-Washizu variational principle for plane elasticity
(project: “hu_wahsizu_variational_formulation” in project workspace file “fe.dsw”).
Line 16 constructs the “convergence acceleration matrix”, which is needed if the iterative method is used.1 The global solution proceeds with û solved first as
û = (E C^-1 A C^-1 E^T)^-1 (f3 - E C^-1 f1 + E C^-1 A C^-1 f2) Eq. 5•64
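For readers who want the intermediate steps, Eq. 5•64 can be recovered by block elimination of Eq. 5•56, assuming C is symmetric and invertible (as it is for the choice Ξ = Ψ). This is a sketch of the algebra only; the second row is solved for the strain, the first row for the stress, and the result is substituted into the third row:

```latex
\begin{aligned}
C\hat\varepsilon + E^T\hat u = f_2
  &\;\Rightarrow\; \hat\varepsilon = C^{-1}\left(f_2 - E^T\hat u\right), \\
A\hat\varepsilon + C^T\hat\sigma = f_1
  &\;\Rightarrow\; \hat\sigma = C^{-1}\left(f_1 - A C^{-1} f_2 + A C^{-1} E^T \hat u\right), \\
E\hat\sigma = f_3
  &\;\Rightarrow\; \left(E C^{-1} A C^{-1} E^T\right)\hat u
     = f_3 - E C^{-1} f_1 + E C^{-1} A C^{-1} f_2 .
\end{aligned}
```

Once û is known, the first two lines give ε̂ and σ̂ by back-substitution.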
The Matrix_Representations are instantiated as follows, together with the global substructuring solution
int main() {
Matrix_Representation mr(epsilon_gd);
Matrix_Representation_Couple mrcC(gdc_sigma_epsilon, 0, 0, &(mr.rhs()), &mr);
Matrix_Representation_Couple mrcE(gdc_u_sigma, 0, 0, &(mrcC.rhs()), &mrcC);
mrcC.assembly();
mr.assembly();
mrcE.assembly();
C0 A = ((C0)(mr.lhs())), f_1 = ((C0)(mr.rhs())),
C = ((C0)(mrcC.lhs())), f_2 = ((C0)(mrcC.rhs())),
E = ((C0)(mrcE.lhs())), f_3 = ((C0)(mrcE.rhs()));
Cholesky dnC(-C); // decomposition; C is symmetric negative definite
C0 Cinv = -(dnC.inverse()); // û = (E C^-1 A C^-1 E^T)^-1 (f3 - E C^-1 f1 + E C^-1 A C^-1 f2)
C0 CinvA = Cinv*A;
C0 CinvACinv = CinvA*Cinv;
1. see next section and p. 361 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1.
McGraw-Hill, Inc., UK.
The tip-deflection solution for the bending problem is the same as the one solved in the higher-order patch test
case. They are both exact.
$$ \hat\sigma^0 = \hat\varepsilon^0 = \hat u^0 = 0 $$ Eq. 5•67
The iterative procedure starts by using the stiffness matrix from the irreducible formulation as the convergence acceleration matrix
where δû = û^(n+1) - û^n; therefore, the next iteration solution û^(n+1) can be obtained. The computation of û^1 from û^0 (= 0) is the same as the solution from the standard irreducible formulation. r^n is defined as the residual of the third equation in Eq. 5•56 as
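$$ r^n \equiv E^T \hat\sigma^n - f_3 $$ Eq. 5•70
The next iterative solutions for the strain ε̂^(n+1) and the stress σ̂^(n+1) are computed from
$$ \hat\varepsilon^{n+1} = C^T E \, \hat u^{n+1}, \quad \text{then} \quad \hat\sigma^{n+1} = D \, \hat\varepsilon^{n+1} $$ Eq. 5•71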
We test this iterative procedure by setting the Poisson ratio ν = 0.5 - 10^-12, which is nearly incompressible. In addition, plane strain is assumed. The same project “hu_washizu_variational_principle” in project workspace file “fe.dsw” is used, with the macro definitions “__TEST_NEARLY_INCOMPRESSIBLE_PLANE_STRAIN” and “__TEST_AUGMENTED_LAGRANGIAN_ITERATIVE_METHOD” set at compile time. The initial tip-deflection solution is the same as the one from the irreducible formulation, which is not expected to be the final answer because of the near incompressibility under the plane strain condition. After several iterations the solution converges, and the iteration terminates with an energy norm smaller than 10^-15 times its initial value.
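To give a feeling for how severe this test case is, the following small sketch evaluates the bulk and shear moduli with the standard isotropic relations K = E / (3 (1 - 2ν)) and μ = E / (2 (1 + ν)), the same expressions used for the constants in the listings of this chapter. The function names are hypothetical. As ν approaches 0.5 the ratio K / μ grows without bound, which is the source of the ill-conditioning.

```cpp
#include <cassert>
#include <cmath>

// Bulk modulus K = E / (3 (1 - 2 nu)) for isotropic linear elasticity.
double bulk_modulus(double E, double nu) {
    assert(nu < 0.5);  // K is unbounded at nu = 0.5
    return E / (3.0 * (1.0 - 2.0 * nu));
}

// Shear modulus mu = E / (2 (1 + nu)).
double shear_modulus(double E, double nu) {
    return E / (2.0 * (1.0 + nu));
}

// Ratio K / mu, a simple indicator of near incompressibility.
double stiffness_ratio(double E, double nu) {
    return bulk_modulus(E, nu) / shear_modulus(E, nu);
}
```

With E = 1000, the ratio is of order one for ν = 0.3, but it exceeds 10^11 for ν = 0.5 - 10^-12.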
$$ \int_\Omega (\delta p)^T \left[ m \cdot \varepsilon - \frac{p}{K} \right] d\Omega $$ Eq. 5•73
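where δp is the variation of p. From Eq. 4•220 in Chapter 4, we have the element stiffness matrix
$$ k = \int_\Omega \varepsilon(\delta u)^T \sigma(u) \, d\Omega = \int_\Omega \varepsilon(\delta u)^T \left[ \sigma_d(u) + m \, p(u) \right] d\Omega = \int_\Omega \varepsilon(\delta u)^T \left[ \mu \left( D_0 - \tfrac{2}{3} \, m \otimes m \right) \varepsilon(u) + m \, p(u) \right] d\Omega $$ Eq. 5•74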
where the variation of u is denoted as δu. Notice that the equilibrium equation is the equilibrium of internal
forces and external forces. The weak statement of the equilibrium equation gives
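$$ \int_\Omega \varepsilon(\delta u)^T \left[ \mu \left( D_0 - \tfrac{2}{3} \, m \otimes m \right) \varepsilon(u) + m \, p(u) \right] d\Omega = \int_\Omega (\delta u)^T b \, d\Omega + \int_\Gamma (\delta u)^T t \, d\Gamma $$ Eq. 5•75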
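Substituting into Eq. 5•75 and Eq. 5•72 gives, in matrix form at element level,
$$ \begin{bmatrix} A & C^T \\ C & V \end{bmatrix} \begin{bmatrix} \hat u \\ \hat p \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} $$ Eq. 5•77
where
$$ A = \int_{\Omega_e} B^T \mu \left( D_0 - \tfrac{2}{3} \, m \otimes m \right) B \, d\Omega $$ Eq. 5•78
$$ V = -\int_{\Omega_e} \frac{N_p \otimes N_p}{K} \, d\Omega $$ Eq. 5•80
$$ f_1 = \int_{\Omega_e} N_u^T b \, d\Omega + \int_{\Gamma_e} N_u^T t \, d\Gamma - A \, u\big|_{\Gamma_e} - C^T p\big|_{\Gamma_e} $$ Eq. 5•81
$$ f_2 = -C \, u\big|_{\Gamma_e} - V \, p\big|_{\Gamma_e} $$ Eq. 5•82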
The substructuring solution of Eq. 5•77 proceeds from its second equation, using the symmetric negative definiteness of V,
That is,
$$ \hat u = \left[ A - C^T V^{-1} C \right]^{-1} \left[ f_1 - C^T V^{-1} f_2 \right] $$ Eq. 5•85
Therefore, we first solve for û, and then substitute û back into Eq. 5•83 to recover p̂. Program Listing 5•6 (project “incompressible_u_p_formulation” in project workspace file “fe.dsw”) implements the Q 4/9 element (with an additional center u-node) and the test problem from the higher-order patch test.
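The elimination leading to Eq. 5•85 can be written out in two steps. The following is a sketch of the algebra only, assuming V is invertible (which holds in the nearly incompressible case because V is symmetric negative definite): the second row of Eq. 5•77 is solved for the pressure, which is then substituted into the first row.

```latex
\begin{aligned}
C\hat u + V\hat p = f_2
  &\;\Rightarrow\; \hat p = V^{-1}\left(f_2 - C\hat u\right), \\
A\hat u + C^T\hat p = f_1
  &\;\Rightarrow\; \left(A - C^T V^{-1} C\right)\hat u = f_1 - C^T V^{-1} f_2 .
\end{aligned}
```

Since V is negative definite, the term -C^T V^-1 C is positive semi-definite, so it adds stiffness to A rather than removing it.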
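$$ \begin{bmatrix} A & C^T \\ C & 0 \end{bmatrix} \begin{bmatrix} \hat u \\ \hat p \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} $$ Eq. 5•86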
Due to the zero diagonal block, the solution procedure is different from the nearly incompressible case; i.e., the inversion of V in Eq. 5•83 to Eq. 5•85 is no longer permissible. From the first equation of Eq. 5•86, we have
$$ C A^{-1} \left[ f_1 - C^T \hat p \right] = f_2 \;\Rightarrow\; \hat p = \left[ C A^{-1} C^T \right]^{-1} \left[ C A^{-1} f_1 - f_2 \right] $$ Eq. 5•88
After p̂ is solved by the second part of Eq. 5•88, û can be recovered from the second part of Eq. 5•87. The tip-deflection is “-0.5625”. This incompressible formulation is computed with the macro definition “__TEST_INCOMPRESSIBLE_PLANE_STRAIN” set at compile time for the same project “incompressible_u_p_formulation” in project workspace file “fe.dsw”.
Displacement-Only Mixed Formulation: If the pressure field is taken as a discontinuous field, the pressure can be eliminated at the element level for both the nearly incompressible and incompressible cases, since no element shares a pressure node with another element and there is therefore no inter-element dependency. For example, in the nearly incompressible case Eq. 5•85 can be written as
k e û e = f e Eq. 5•89
where the re-defined element stiffness matrix and element force vector are
Program Listing 5•7 implements the present u-only simplified mixed formulation (project “mixed_u_only” in project workspace file “fe.dsw”). The element stiffness matrix of the Q 4/9 mixed formulation is equivalent to that of the selective reduced integration Lagrangian 9-node element (on page 406 of Chapter 4, where the “ElasticQ9” element with selective reduced integration is used; see Figure 5•3). This is known as the equivalence theorem in a more general context.2 Notice that, due to the ill-conditioning in the nearly incompressible plane strain case, the full-scale mixed solution and the displacement-only solution may differ numerically.
1. see p. 202-203 in Hughes, T.J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2. see p. 221-223 in Hughes, T.J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
Figure 5•3 The equivalence of the selective reduced integration and the mixed formulation.
#include "include\fe.h"
static const double E_ = 1.e3;
static const double v_ = 0.5-1.e-12;
static const double K_ = E_/3.0/(1-2.0*v_);
static const double a_ = E_*(1-v_)/(1+v_)/(1-2*v_);
static const double Dv[3][3]= {{a_, a_*v_/(1-v_), 0.0 },
{a_*v_/(1-v_), a_, 0.0 },
{0.0, 0.0, a_*(1-2*v_)/2.0/(1-v_)} };
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
EP::element_pattern EP::ep = EP::LAGRANGIAN_9_NODES;
Omega_h::Omega_h() {
int control_node_flag[4] = {1, 1, 1, 1};
double x[4][2] = {{0.0, 0.0}, {10.0, 0.0}, {10.0, 2.0}, {0.0, 2.0}};
block(this, 3, 5, 4, control_node_flag, x[0]);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
double f_ = 15.0;
const int row_node_no = 3; // assumed: 3 rows of 5 nodes, matching block(this, 3, 5, ...) above
for(int i = 0; i < row_node_no; i++) {
the_gh_array[node_order((i+1)*5-1)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order((i+1)*5-1)][0] = 0.0;
}
the_gh_array[node_order(5)](1)= gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(5)][1]= 0.0;
double h_ = 1.0;
the_gh_array[node_order(10)](0) = gh_on_Gamma_h::Nuemann;
the_gh_array[node_order(10)][0] = f_*(1.0/3.0)*h_;
the_gh_array[node_order(0)](0)= gh_on_Gamma_h::Nuemann;
the_gh_array[node_order(0)][0] = -f_*(1.0/3.0)*h_;
}
class ElasticQ9 : public Element_Formulation {
public:
ElasticQ9(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
ElasticQ9(int, Global_Discretization&);
};
Element_Formulation* ElasticQ9::make(int en, Global_Discretization& gd) {
return new ElasticQ9(en,gd);
}
ElasticQ9::ElasticQ9(int en, Global_Discretization& gd) : Element_Formulation(en, gd) {
Quadrature qp(2, 9);
H1 Z(2, (double*)0, qp),
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 9, 2, qp),
Zai, Eta;
// 9-node Lagrangian shape functions
Zai &= Z[0]; Eta &= Z[1];
// Step 1: initial four corner nodes
N[0]=(1-Zai)*(1-Eta)/4;
N[1]=(1+Zai)*(1-Eta)/4;
N[2]=(1+Zai)*(1+Eta)/4;
N[3] =(1-Zai)*(1+Eta)/4;
// Step 2: add center node
N[8] = (1-Zai.pow(2))*(1-Eta.pow(2));
// Step 3: modification of four corner nodes due to the presence of the center node
N[0] -= N[8]/4; N[1] -= N[8]/4;
N[2] -= N[8]/4; N[3] -= N[8]/4;
// Step 4: add four edge nodes and correct for the presence of the center node
N[4] = ((1-Zai.pow(2))*(1-Eta)-N[8])/2;
N[5] = ((1-Eta.pow(2))*(1+Zai)-N[8])/2;
N[6] = ((1-Zai.pow(2))*(1+Eta)-N[8])/2;
N[7] = ((1-Eta.pow(2))*(1-Zai)-N[8])/2;
(n[0] | n[1] | n[2] | n[3] ) &
(Zero | Zero | Zero | Zero ) );
C0 c = ((~mN_p)*B) | dV; // ce = ∫_Ωe (m Np)^T B dΩ
stiff &= a-(~c)*v_inv*c; // ke ≡ ae - (ce)^T (ve)^-1 ce
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static ElasticQ9 elasticq9_instance(element_type_register_instance);
int main() {
int ndf = 2;
Omega_h oh;
gh_on_Gamma_h gh(ndf, oh);
U_h uh(ndf, oh);
Global_Discretization gd(oh, gh, uh);
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
gd.u_h() = u;
gd.u_h() = gd.gh_on_gamma_h();
cout << gd.u_h() << endl;
return 0;
}
Listing 5•7 Displacement only mixed formulation with discontinuous pressure field (project:
“mixed_u_only” in project workspace file “fe.dsw”).
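The hierarchical construction of the 9-node shape functions in the listing above can be checked outside the library with plain doubles. The following is a minimal sketch, not part of “fe.lib”: the function names lagrange_q9 and sum_q9 are hypothetical, and the last step (subtracting half of the two adjacent edge functions from each corner function) is an assumption added here, since that part of the listing is not visible in this excerpt. It is the usual correction needed for the corner functions to vanish at the mid-edge nodes.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Evaluate the nine Lagrangian shape functions at natural coordinates (zai, eta).
// Node order: corners 0-3, mid-edges 4-7 (bottom, right, top, left), center 8.
std::array<double, 9> lagrange_q9(double zai, double eta) {
    assert(std::fabs(zai) <= 1.0 && std::fabs(eta) <= 1.0);
    std::array<double, 9> N{};
    // Step 1: bilinear functions of the four corner nodes
    N[0] = (1 - zai) * (1 - eta) / 4;
    N[1] = (1 + zai) * (1 - eta) / 4;
    N[2] = (1 + zai) * (1 + eta) / 4;
    N[3] = (1 - zai) * (1 + eta) / 4;
    // Step 2: center node
    N[8] = (1 - zai * zai) * (1 - eta * eta);
    // Step 3: correct the corner nodes for the presence of the center node
    for (int a = 0; a < 4; ++a) N[a] -= N[8] / 4;
    // Step 4: edge nodes, corrected for the presence of the center node
    N[4] = ((1 - zai * zai) * (1 - eta) - N[8]) / 2;
    N[5] = ((1 - eta * eta) * (1 + zai) - N[8]) / 2;
    N[6] = ((1 - zai * zai) * (1 + eta) - N[8]) / 2;
    N[7] = ((1 - eta * eta) * (1 - zai) - N[8]) / 2;
    // Assumed final step (not shown in the excerpt): correct the corner
    // nodes for the presence of their two adjacent edge nodes
    N[0] -= (N[7] + N[4]) / 2;
    N[1] -= (N[4] + N[5]) / 2;
    N[2] -= (N[5] + N[6]) / 2;
    N[3] -= (N[6] + N[7]) / 2;
    return N;
}

// Sum of all nine shape functions; should be one (partition of unity).
double sum_q9(double zai, double eta) {
    const std::array<double, 9> N = lagrange_q9(zai, eta);
    double s = 0.0;
    for (double n : N) s += n;
    return s;
}
```

With this construction each function is one at its own node and zero at the other eight, and the nine values sum to one anywhere in the reference element.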
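\[ \varepsilon_v = m^T \varepsilon = m^T L u \] Eq. 5•91
\[ L = \begin{bmatrix} \dfrac{\partial}{\partial x} & 0 \\ 0 & \dfrac{\partial}{\partial y} \\ \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial x} \end{bmatrix} \] Eq. 5•92
\[ K \varepsilon_v - p = 0 \] Eq. 5•93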
Weighted residual statements are obtained by applying the pressure variation δp to Eq. 5•91 and the volumetric strain
variation δεv to Eq. 5•93. Together with the weak statement of the equilibrium equation (Eq. 5•75), weighted by the
displacement variation δu, we have
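\[ \int_{\Omega} \varepsilon(\delta u)^T \left[ \mu \left( D_0 - \tfrac{2}{3}\, m \otimes m \right) \varepsilon(u) + m p \right] d\Omega = \int_{\Omega} \delta u^T b \, d\Omega + \int_{\Gamma} \delta u^T t \, d\Gamma \] Eq. 5•94
\[ \int_{\Omega} \delta p \, [ \varepsilon_v - m^T L u ] \, d\Omega = 0 \] Eq. 5•95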
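\[ \begin{bmatrix} A & C^T & 0 \\ C & 0 & E^T \\ 0 & E & H \end{bmatrix} \begin{bmatrix} \hat{u} \\ \hat{p} \\ \hat{\varepsilon}_v \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix} \] Eq. 5•98
where
\[ A = \int_{\Omega_e} B^T \mu \left( D_0 - \tfrac{2}{3}\, m \otimes m \right) B \, d\Omega \] Eq. 5•99
\[ f_1 = \int_{\Omega_e} N_u^T b \, d\Omega + \int_{\Gamma_e} N_u^T t \, d\Gamma - A u_{\Gamma_e} - C^T p_{\Gamma_e} \] Eq. 5•103
\[ f_2 = - C u_{\Gamma_e} - E^T \varepsilon_{v\,\Gamma_e} \] Eq. 5•104
\[ f_3 = - E p_{\Gamma_e} - H \varepsilon_{v\,\Gamma_e} \] Eq. 5•105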
In the solution process, the E-matrix needs to be invertible. If we take Np = Nv = Ψe(x), “-E” becomes
which is symmetric positive definite, so the inverse of “-E” can be computed by Cholesky decomposition.
With this choice of Np = Nv, from the third equation of Eq. 5•98, we have
That is
\[ \hat{u} = [ A + C^T ( E^T H^{-1} E )^{-1} C ]^{-1} [ f_1 + C^T ( E^T H^{-1} E )^{-1} ( E^T H^{-1} f_3 - f_2 ) ] \] Eq. 5•110
After û is obtained, p̂ can be recovered from the second part of Eq. 5•108, and ε̂v is then computed directly from the
second equation of Eq. 5•98. Program Listing 5•8 implements the u-p-εv three-field incompressible formulation
(project “incompressible_u_p_epsilon_v_formulation” in project workspace file “fe.dsw”).
If Ψe(x) is taken as a discontinuous field, the variables p and εv can be eliminated at the element level, and f3 =
0 because no boundary conditions can be imposed on p and εv. If we also withhold the subtraction of the essential
displacement boundary conditions from the second equation in Eq. 5•98, then f2 = 0. From the second equation
we have
or
where
\[ \bar{A} \equiv A + C^T E^{-1} H ( E^T )^{-1} C = A + W^T H W \] Eq. 5•115
Therefore a modified stiffness matrix Ā, based on the Hu-Washizu variational principle, can be used to compute
the displacement field directly. The implementation of the Ā solution procedure is
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
static const int row_node_no = 5; static const int row_segment_no = row_node_no-1;
static const double L_ = 10.0; static const double c_ = 1.0; // half element size
static const double h_e_ = L_/((double)row_segment_no);
static const double E_ = 1.e3; static const double v_ = 0.3;
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
// plane stress modification
static const double lambda_bar = 2*lambda_*mu_/(lambda_+2*mu_);
static const double K_ = lambda_bar+2.0/3.0*mu_;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) {
if(i == 0) { double v[2]; Node *node; // define Ωu
// 1st row
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
// 2nd row
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
// 3rd row
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(8, 2, v); node_array().add(node);
v[0] = h_e_; node = new Node(9, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(10, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(12, 2, v); node_array().add(node);
// Serendipity 8-node element
int ena[8]; Omega_eh *elem;
ena[0] = 0; ena[1] = 2; ena[2] = 10; ena[3] = 8; ena[4] = 1; ena[5] = 6; ena[6] = 9; ena[7] = 5;
elem = new Omega_eh(0, 0, 0, 8, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 4; ena[2] = 12; ena[3] = 10; ena[4] = 3; ena[5] = 7; ena[6] = 11; ena[7] = 6;
elem = new Omega_eh(1, 0, 0, 8, ena); omega_eh_array().add(elem);
} else if(i == 1 || i == 2) { // define Ωp and Ωεv
double inv_sqrt3 = 1.0/sqrt(3.0), v[2], xl[4][2], zai, eta; Node *node;
// elem # 0 nodal coordinates
xl[0][0] = 0.0; xl[0][1] = 0.0; xl[1][0] = 2.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 2.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 0.0; xl[3][1] = 2.0*c_;
// 1st Gauss point natural coordinates
zai = - inv_sqrt3; eta = - inv_sqrt3;
// 1st Gauss point physical coordinates
for(int j = 0; j < 2; j++)
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(0, 2, v); node_array().add(node);
// 2nd Gauss point natural coordinates
zai = inv_sqrt3; eta = - inv_sqrt3;
// 2nd Gauss point physical coordinates
for(int j = 0; j < 2; j++)
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(1, 2, v); node_array().add(node);
// 3rd Gauss point natural coordinates
zai = inv_sqrt3; eta = inv_sqrt3;
// 3rd Gauss point physical coordinates
for(int j = 0; j < 2; j++)
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(2, 2, v); node_array().add(node);
// 4th Gauss point natural coordinates
zai = - inv_sqrt3; eta = inv_sqrt3;
// 4th Gauss point physical coordinates
for(int j = 0; j < 2; j++)
v[j] = (1.0-zai)*(1.0-eta)/4.0*xl[0][j]+ (1.0+zai)*(1.0-eta)/4.0*xl[1][j]+
(1.0+zai)*(1.0+eta)/4.0*xl[2][j]+ (1.0-zai)*(1.0+eta)/4.0*xl[3][j];
node = new Node(3, 2, v); node_array().add(node);
// elem # 1 nodal coordinates
xl[0][0] = 2.0*h_e_; xl[0][1] = 0.0; xl[1][0] = 4.0*h_e_; xl[1][1] = 0.0;
xl[2][0] = 4.0*h_e_; xl[2][1] = 2.0*c_; xl[3][0] = 2.0*h_e_; xl[3][1] = 2.0*c_;
// 1st Gauss point natural coordinates
zai = - inv_sqrt3; eta = - inv_sqrt3;
This computation can be invoked by setting the macro definition “__TEST_A_BAR_FORMULATION” at compile
time for the project “incompressible_u_p_epsilon_v_formulation” in project workspace file “fe.dsw”.
B̄ Method
Mixed formulations lead to a complicated matrix substructuring problem. With the support of “fe.lib” this does
not seem too difficult. However, most finite element programs are not capable of dealing with the
matrix substructuring problem. Engineering simplifications need to be made, particularly if such simplifications
lead to highly efficient programs. For example, the Ā formulation above simplifies the program to be
compatible with the standard irreducible formulation, if only the displacement field is to be solved. Furthermore,
we may wish to define, in place of the strain-displacement matrix (B-matrix), the B̄-matrix such that Ā is
expressed in a form conformable to the standard irreducible formulation1
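\[ \bar{A} \equiv \int_{\Omega_e} \bar{B}^T D \bar{B} \, d\Omega \] Eq. 5•116
\[ D \equiv \mu \left( D_0 - \tfrac{2}{3}\, m \otimes m \right) + K ( m \otimes m ), \quad \text{where} \quad D_0 = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \] Eq. 5•117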
This expression can be easily implemented with the aid of VectorSpace C++ Library. A more graphic expression
suitable for a lower-level implementation is
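\[ D \equiv \begin{bmatrix} K + \left( 2 - \tfrac{2}{3} \right) \mu & K - \tfrac{2}{3} \mu & 0 \\ K - \tfrac{2}{3} \mu & K + \left( 2 - \tfrac{2}{3} \right) \mu & 0 \\ 0 & 0 & \mu \end{bmatrix} \] Eq. 5•118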
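As a lower-level illustration of Eq. 5•117 and Eq. 5•118, the following minimal sketch fills the 3 × 3 matrix D with plain arrays instead of VectorSpace C++ Library objects. The function name build_D is hypothetical, and the strain ordering (xx, yy, xy) with m = [1, 1, 0]T is assumed from the form of D0.

```cpp
#include <cassert>
#include <cmath>

// Fill D = mu*(D0 - (2/3) m (x) m) + K (m (x) m) for a 2-D problem,
// with m = [1, 1, 0]^T and D0 = diag(2, 2, 1) (Eq. 5.117).
void build_D(double K, double mu, double D[3][3]) {
    assert(std::isfinite(K) && mu > 0.0);
    const double m[3]  = {1.0, 1.0, 0.0};
    const double D0[3] = {2.0, 2.0, 1.0};   // diagonal of D0
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            // volumetric coupling: (K - (2/3) mu) * m_i * m_j
            D[i][j] = (K - 2.0 / 3.0 * mu) * m[i] * m[j];
            // deviatoric diagonal: mu * D0_ii
            if (i == j) D[i][j] += mu * D0[i];
        }
    }
}
```

The diagonal normal entries come out as K + (2 - 2/3)µ, the off-diagonal normal entries as K - (2/3)µ, and the shear entry as µ, in agreement with Eq. 5•118.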
1. p. 345 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc., UK.
That is, the deviatoric part of B̄ is just the same as that of B, but its volumetric part, B̄vol, needs to be consistent
with the u-p-εv mixed formulation. Bvol and Bdev are defined as
\[ B_{vol} \equiv \tfrac{1}{3} ( m \otimes m ) B = \tfrac{1}{3} m ( m \cdot B ) \quad \text{and} \quad B_{dev} \equiv B - B_{vol} = \left[ I - \tfrac{1}{3} ( m \otimes m ) \right] B \] Eq. 5•120
We now distinguish the discrete approximation to the volumetric strain, \( \varepsilon_v^h \equiv \bar{\varepsilon}_v \), from the infinite-dimensional
εv of continuum mechanics. The approximation in Eq. 5•97 can be written more precisely (the over-bar
indicates an “average” for a certain simplest approximation, which will become evident later) as
and,
\[ m \cdot \bar{B} = \Psi^e W \] Eq. 5•123
From the first part of Eq. 5•120, B̄vol can be defined similarly to Bvol as
\[ \bar{B}_{vol} = \tfrac{1}{3} m ( m \cdot \bar{B} ) = \tfrac{1}{3} m \Psi^e W \] Eq. 5•124
\[ \bar{B} \equiv B_{dev} + \bar{B}_{vol} = \left[ I - \tfrac{1}{3} ( m \otimes m ) \right] B + \tfrac{1}{3} m \Psi^e W \] Eq. 5•125
Eq. 5•125 can look quite formidable. A step-by-step algorithm for the B̄ formula can be given as1
\[ B_1 = \frac{\partial N}{\partial x}, \quad \text{and} \quad B_2 = \frac{\partial N}{\partial y} \] Eq. 5•126
and define B̃ as
1. see p.233-236 in Hughes, T.J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”,
Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
and
\[ \bar{B} = \begin{bmatrix} B_5 & B_6 \\ B_4 & B_7 \\ B_2 & B_1 \end{bmatrix} \] Eq. 5•128
where
\[ B_4 \equiv \frac{\tilde{B}_1 - B_1}{3} \] Eq. 5•129
\[ B_5 \equiv B_1 + B_4 \] Eq. 5•130
\[ B_6 \equiv \frac{\tilde{B}_2 - B_2}{3} \] Eq. 5•131
\[ B_7 \equiv B_2 + B_6 \] Eq. 5•132
It is easily verified that Eq. 5•126 to Eq. 5•132 are equivalent to Eq. 5•125. The algorithm given above, Eq.
5•128 to Eq. 5•132, involves only simple arithmetic. Program Listing 5•9 implements the B̄ formulation in
this section. Due to the importance of this formulation, the post-processing to compute reactions, stresses, and strains
is also included. The details of the post-processing have been discussed in Chapter 4.
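The step-by-step algorithm of Eq. 5•128 to Eq. 5•132 can be sketched for a single node with plain doubles. This is only an illustration, not the code of Program Listing 5•9: the function name b_bar_block is hypothetical, and the modified derivatives B̃1 and B̃2 are assumed to be supplied by the caller, since their definition is not repeated here.

```cpp
#include <cassert>
#include <cmath>

// Build the 3x2 nodal block of B-bar (Eq. 5.128) from
//   B1 = dN/dx, B2 = dN/dy at the integration point (Eq. 5.126), and
//   B1t, B2t = the corresponding "B tilde" values (assumed given).
void b_bar_block(double B1, double B2, double B1t, double B2t,
                 double Bbar[3][2]) {
    assert(std::isfinite(B1) && std::isfinite(B2));
    assert(std::isfinite(B1t) && std::isfinite(B2t));
    const double B4 = (B1t - B1) / 3.0;   // Eq. 5.129
    const double B5 = B1 + B4;            // Eq. 5.130
    const double B6 = (B2t - B2) / 3.0;   // Eq. 5.131
    const double B7 = B2 + B6;            // Eq. 5.132
    Bbar[0][0] = B5; Bbar[0][1] = B6;
    Bbar[1][0] = B4; Bbar[1][1] = B7;
    Bbar[2][0] = B2; Bbar[2][1] = B1;     // shear row is left unchanged
}
```

When B̃1 = B1 and B̃2 = B2, the corrections B4 and B6 vanish and the block reduces to the standard B-matrix block, which is a convenient sanity check.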
We note in passing that, from Eq. 5•124 and for the case of a bilinear element with piecewise constant pressure
and volumetric strain, i.e., Ψe(x) = 1, we have
\[ \bar{B}_{vol} = \tfrac{1}{3} m \Psi^e W = \tfrac{1}{3} m \Psi^e \, \frac{\int_{\Omega_e} ( m \Psi^e ) \otimes B \, d\Omega}{\int_{\Omega_e} [ \Psi^e \otimes \Psi^e ] \, d\Omega} = \frac{\int_{\Omega_e} B_{vol} \, d\Omega}{\int_{\Omega_e} d\Omega} \] Eq. 5•133
The last term shows that B̄vol is the “mean dilatation of B” over the element domain Ωe.1 This special case
adds another perspective to our understanding of Eq. 5•119: the definition B̄ ≡ Bdev + B̄vol modifies only the
volumetric part, replacing it with B̄vol, which is the mean of Bvol over the element domain according to Eq.
5•133. By inspecting this equation, B̄vol can also be interpreted as the least-squares smoothing of Bvol over the
element domain. Therefore, B̄ is considered an assumed-strain method as opposed to the assumed-displacement
1. see p. 235 in Hughes, T.J.R., 1987, “The finite element method: linear static and dynamic finite element analysis”, Pren-
tice-Hall, Inc., Englewood Cliffs, New Jersey, and the reference therein by Nagtegaal, Parks and Rice.
#include "include\fe.h"
static const int row_node_no = 5; static const int col_node_no = 3;
static const int row_segment_no = row_node_no-1;
static const double L_ = 10.0; static const double c_ = 1.0;
static const double h_e_ = L_/((double)row_segment_no);
static const double E_ = 1.e3; static const double v_ = 0.5-1.e-12;
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_)); static const double K_ = E_/3.0/(1-2.0*v_);
Omega_h::Omega_h() {
double v[2]; Node *node; // define Ωu
// 1st row
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
// 2nd row
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
// 3rd row
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(8, 2, v); node_array().add(node);
v[0] = h_e_; node = new Node(9, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(10, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(12, 2, v); node_array().add(node);
int ena[8]; Omega_eh *elem;
// Serendipity 8-node element
ena[0] = 0; ena[1] = 2; ena[2] = 10; ena[3] = 8; ena[4] = 1; ena[5] = 6; ena[6] = 9; ena[7] = 5;
elem = new Omega_eh(0, 0, 0, 8, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 4; ena[2] = 12; ena[3] = 10; ena[4] = 3; ena[5] = 7; ena[6] = 11; ena[7] = 6;
elem = new Omega_eh(1, 0, 0, 8, ena); omega_eh_array().add(elem);
}
gh_on_Gamma_h::gh_on_Gamma_h( int df, Omega_h& omega_h) { // boundary conditions
__initialization(df, omega_h);
the_gh_array[node_order(4)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(4)][0] = 0.0;
the_gh_array[node_order(7)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(7)][0] = 0.0;
the_gh_array[node_order(12)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(12)][0] = 0.0;
the_gh_array[node_order(7)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(7)][1] = 0.0;
the_gh_array[node_order(8)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(8)][0] = -5.0;
the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(0)][0] = 5.0;
}
static int ndf = 2;
static Omega_h oh;
static gh_on_Gamma_h gh(ndf, oh);
static U_h uh(ndf, oh);
static Global_Discretization gd(oh, gh, uh);
class Elastic_B_bar_Q84 : public Element_Formulation {
public:
Elastic_B_bar_Q84(Element_Type_Register a) : Element_Formulation(a) {}
Element_Formulation *make(int, Global_Discretization&);
Elastic_B_bar_Q84(int, Global_Discretization&);
};
Element_Formulation* Elastic_B_bar_Q84::make(int en, Global_Discretization& gd) {
return new Elastic_B_bar_Q84(en,gd);
}
// K \equiv \int_{\Omega_e} \bar{B}^T D \bar{B} \, d\Omega, with
// D \equiv \mu ( D_0 - (2/3) m \otimes m ) + K ( m \otimes m ), where D_0 = diag(2, 2, 1)
} else stiff &= ((~B_bar)*D*B_bar)|dv;
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static Elastic_B_bar_Q84
elastic_B_bar_Q84_instance(element_type_register_instance);
int main() {
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
gd.u_h() = u;
gd.u_h() = gd.gh_on_gamma_h();
cout << gd.u_h() << endl;
#if defined(__TEST_POST_PROCESSING)
// post-processing
Matrix_Representation::Assembly_Switch = Matrix_Representation::REACTION;
mr.assembly(FALSE);
cout << "reaction:" << endl << (mr.global_nodal_value()) << endl;
Matrix_Representation::Assembly_Switch = Matrix_Representation::STRAIN;
mr.assembly(FALSE);
Matrix_Representation::Assembly_Switch = Matrix_Representation::NODAL_STRAIN;
mr.assembly(FALSE);
cout << "nodal strains:" << endl << (mr.global_nodal_value()) << endl;
Matrix_Representation::Assembly_Switch = Matrix_Representation::STRESS;
mr.assembly(FALSE);
Matrix_Representation::Assembly_Switch = Matrix_Representation::NODAL_STRESS;
mr.assembly(FALSE);
cout << "nodal stresses:" << endl << (mr.global_nodal_value()) << endl;
#endif
return 0;
}
Listing 5•9 B̄ matrix formulation for plane elasticity (project: “b_bar_formulation” in project workspace
file “fe.dsw”).
Non-conforming Element
In Chapter 4 (Figure 4•42), eight bilinear 4-node elements and two Lagrangian 9-node elements are used to
compute a beam bending problem. The tip-deflection solution (see TABLE 4•3 on page 405) with bilinear 4-node
elements (2 × 2 integration) is only 60% of the exact solution, while it is 98% accurate for the Lagrangian 9-node
elements. The bilinear 4-node element exhibits both shear locking and dilatation locking, because its interpolation
fails to represent x² and y² (see the aliasing analysis discussed on page 397 and page 406). Therefore, these two
quadratic displacement modes are added back (1) to improve the bending behavior, and (2) to overcome the
incompressible limit of the 4-node element. This is Wilson’s nonconforming element.
where node numbers “0, 1, 2, 3” correspond to the four nodes of the element. \( \hat{\alpha}_a^e \), with a = 4, 5, are known as
“nodeless variables” or “generalized displacements”. Since the \( \hat{\alpha}_a^e \) are independent of the other elements,
they can be eliminated at the element level. The element stiffness matrix corresponding to the variables
a = [u0, u1, u2, u3, α4, α5]T is of the form
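\[ \begin{bmatrix} k_{uu} & k_{\alpha u}^T \\ k_{\alpha u} & k_{\alpha\alpha} \end{bmatrix} \begin{bmatrix} \hat{u} \\ \hat{\alpha} \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix} \] Eq. 5•136
\[ \hat{\alpha} = - k_{\alpha\alpha}^{-1} k_{\alpha u} \hat{u} \] Eq. 5•137
\[ k \hat{u} \equiv [ k_{uu} - k_{\alpha u}^T k_{\alpha\alpha}^{-1} k_{\alpha u} ] \hat{u} = f \] Eq. 5•138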
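The static condensation of Eq. 5•136 to Eq. 5•138 can be sketched with small fixed-size arrays. This is an illustration under simplifying assumptions, not the library implementation: the block sizes NU = NA = 2 and the names invert2, condense and recover_alpha are hypothetical, and the 2 × 2 block k_αα is inverted in closed form rather than by a general solver.

```cpp
#include <cassert>
#include <cmath>

const int NU = 2;   // number of retained (nodal) unknowns, assumed
const int NA = 2;   // number of nodeless unknowns, assumed

// Closed-form inverse of a 2x2 matrix.
void invert2(const double a[NA][NA], double inv[NA][NA]) {
    const double det = a[0][0] * a[1][1] - a[0][1] * a[1][0];
    assert(std::fabs(det) > 0.0);
    inv[0][0] =  a[1][1] / det; inv[0][1] = -a[0][1] / det;
    inv[1][0] = -a[1][0] / det; inv[1][1] =  a[0][0] / det;
}

// Condensed stiffness k = kuu - kau^T kaa^{-1} kau (Eq. 5.138).
void condense(const double kuu[NU][NU], const double kau[NA][NU],
              const double kaa[NA][NA], double k[NU][NU]) {
    double inv[NA][NA];
    invert2(kaa, inv);
    for (int i = 0; i < NU; ++i)
        for (int j = 0; j < NU; ++j) {
            k[i][j] = kuu[i][j];
            for (int a = 0; a < NA; ++a)
                for (int b = 0; b < NA; ++b)
                    k[i][j] -= kau[a][i] * inv[a][b] * kau[b][j];
        }
}

// Recover the nodeless variables alpha = -kaa^{-1} kau u (Eq. 5.137).
void recover_alpha(const double kau[NA][NU], const double kaa[NA][NA],
                   const double u[NU], double alpha[NA]) {
    double inv[NA][NA];
    invert2(kaa, inv);
    for (int a = 0; a < NA; ++a) {
        alpha[a] = 0.0;
        for (int b = 0; b < NA; ++b)
            for (int j = 0; j < NU; ++j)
                alpha[a] -= inv[a][b] * kau[b][j] * u[j];
    }
}
```

After the global solution for û is available, recover_alpha would be called element by element, since the nodeless variables do not couple neighboring elements.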
However, it is found that Wilson’s non-conforming element only works for rectangular or parallelogram elements.
The addition of the two quadratic shape functions in Eq. 5•134 leaves the interpolation polynomial incomplete at
second order. Therefore, spatial isotropy is lost; the element is not invariant with respect to an arbitrary
rotation of the coordinate axes. The integrals of the B-matrix components corresponding to the nodeless variables are
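\[ \int_{\Omega_e} B_4 \, d\Omega = \int_{-1}^{1} \int_{-1}^{1} \begin{bmatrix} N_{4,x} & 0 \\ 0 & N_{4,y} \\ N_{4,y} & N_{4,x} \end{bmatrix} J \, d\xi \, d\eta = 2 \int_{-1}^{1} \int_{-1}^{1} \begin{bmatrix} -\xi \dfrac{\partial y}{\partial \eta} & 0 \\ 0 & \xi \dfrac{\partial x}{\partial \eta} \\ \xi \dfrac{\partial x}{\partial \eta} & -\xi \dfrac{\partial y}{\partial \eta} \end{bmatrix} d\xi \, d\eta \] Eq. 5•139
\[ \int_{\Omega_e} B_5 \, d\Omega = \int_{-1}^{1} \int_{-1}^{1} \begin{bmatrix} N_{5,x} & 0 \\ 0 & N_{5,y} \\ N_{5,y} & N_{5,x} \end{bmatrix} J \, d\xi \, d\eta = 2 \int_{-1}^{1} \int_{-1}^{1} \begin{bmatrix} \eta \dfrac{\partial y}{\partial \xi} & 0 \\ 0 & -\eta \dfrac{\partial x}{\partial \xi} \\ -\eta \dfrac{\partial x}{\partial \xi} & \eta \dfrac{\partial y}{\partial \xi} \end{bmatrix} d\xi \, d\eta \] Eq. 5•140
where we have used the relation N,x = N,ξ ξ,x, and the inversion of a 2 × 2 matrix
\[ \begin{bmatrix} \dfrac{\partial \xi}{\partial x} & \dfrac{\partial \xi}{\partial y} \\ \dfrac{\partial \eta}{\partial x} & \dfrac{\partial \eta}{\partial y} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial x}{\partial \xi} & \dfrac{\partial x}{\partial \eta} \\ \dfrac{\partial y}{\partial \xi} & \dfrac{\partial y}{\partial \eta} \end{bmatrix}^{-1} = \frac{1}{J} \begin{bmatrix} \dfrac{\partial y}{\partial \eta} & -\dfrac{\partial x}{\partial \eta} \\ -\dfrac{\partial y}{\partial \xi} & \dfrac{\partial x}{\partial \xi} \end{bmatrix} \] Eq. 5•141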
For rectangular and parallelogram elements, the derivatives in Eq. 5•139 and Eq. 5•140 are constant throughout
the element domain. In that case the integrals of B4 and B5 over ξ, η ∈ [-1, 1] are zero, since the integrands
in Eq. 5•139 and Eq. 5•140 become odd functions of ξ and η. For element geometries other than rectangles or
parallelograms, the behavior of the non-conforming element can be improved by evaluating the derivatives and
the Jacobian for B4 and B5 only at the center [ξ0, η0] of the element (Taylor’s non-conforming element).
Recall that the 8 bilinear 4-node elements in the higher-order patch test (page 422 in Chapter 4) produce a tip
deflection of “-0.656467”, which is significantly less than the exact solution of “-0.75”. Program Listing 5•10
implements the non-conforming element discussed in this section. Taylor’s non-conforming element can be
invoked by setting the macro definition “__TEST_TAYLOR” at compile time. The distortion of the vertical element
boundaries can be set by the macro definition “__TEST_DISTORTION”. The results of eight non-conforming
elements for the same problem as in the higher-order patch test are listed in TABLE 5•1.
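The effect of element distortion on Eq. 5•139 can be seen from a small numerical experiment. The sketch below integrates only the first entry of the integrand, -2ξ ∂y/∂η, with 2 × 2 Gauss quadrature on a bilinear quadrilateral. The function names dy_deta and integrate_B4_xx and the flag at_center are hypothetical; at_center = true mimics the evaluation of the derivative at the element center only.

```cpp
#include <cassert>
#include <cmath>

// dy/d(eta) of the bilinear map at natural coordinate xi, given the nodal
// y-coordinates in the order (-1,-1), (1,-1), (1,1), (-1,1).
// dN_a/d(eta) = eta_a * (1 + xi_a * xi) / 4 does not depend on eta.
double dy_deta(const double y[4], double xi) {
    static const double xi_a[4]  = {-1.0,  1.0, 1.0, -1.0};
    static const double eta_a[4] = {-1.0, -1.0, 1.0,  1.0};
    double s = 0.0;
    for (int a = 0; a < 4; ++a)
        s += eta_a[a] * (1.0 + xi_a[a] * xi) / 4.0 * y[a];
    return s;
}

// 2x2 Gauss integration of J * N4,x = -2 * xi * dy/d(eta) over the
// reference square (first entry of Eq. 5.139, with N4 = 1 - xi^2).
double integrate_B4_xx(const double y[4], bool at_center) {
    assert(y != nullptr);
    const double g = 1.0 / std::sqrt(3.0);
    const double pts[2] = {-g, g};          // Gauss points, unit weights
    double sum = 0.0;
    for (int i = 0; i < 2; ++i)             // loop over xi
        for (int j = 0; j < 2; ++j) {       // loop over eta
            const double xi = pts[i];
            const double d = at_center ? dy_deta(y, 0.0) : dy_deta(y, xi);
            sum += -2.0 * xi * d;
        }
    return sum;
}
```

For a rectangle with nodal y = {0, 0, 2, 2} the integral vanishes. For a distorted element with nodal y = {0, 0, 2, 1} it is -2/3 unless the derivative is frozen at the center, in which case it vanishes again.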
Without geometrical distortion, both Wilson’s and Taylor’s non-conforming elements produce the exact solution.
That is, they both provide a solution superior to that of the standard bilinear element. When the geometrical distortion
#include "include\fe.h"
static const double L_ = 10.0; static const double c_ = 1.0; static const double h_e_ = L_/4.0;
static const double E_ = 1.e3; static const double v_ = 0.3;
#if defined(__TEST_DISTORTION)
static const double e_ = h_e_/2.0;
#else
static const double e_ = 0.0;
#endif
Omega_h::Omega_h() {
double v[2]; Node *node; // define Ωu
// 1st row
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_-e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_-2.0*e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 3.0*h_e_-e_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
// 2nd row
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 1.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(8, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(9, 2, v); node_array().add(node);
// 3rd row
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(10, 2, v); node_array().add(node);
v[0] = h_e_+e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 2.0*h_e_+2.0*e_; node = new Node(12, 2, v); node_array().add(node);
v[0] = 3.0*h_e_+e_; node = new Node(13, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(14, 2, v); node_array().add(node);
// 4-node elements
int ena[4]; Omega_eh *elem;
ena[0] = 0; ena[1] = 1; ena[2] = 6; ena[3] = 5;
elem = new Omega_eh(0, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 7; ena[3] = 6;
elem = new Omega_eh(1, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 3; ena[2] = 8; ena[3] = 7;
elem = new Omega_eh(2, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 3; ena[1] = 4; ena[2] = 9; ena[3] = 8;
elem = new Omega_eh(3, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 5; ena[1] = 6; ena[2] = 11; ena[3] = 10;
elem = new Omega_eh(4, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 6; ena[1] = 7; ena[2] = 12; ena[3] = 11;
elem = new Omega_eh(5, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 7; ena[1] = 8; ena[2] = 13; ena[3] = 12;
elem = new Omega_eh(6, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 8; ena[1] = 9; ena[2] = 14; ena[3] = 13;
elem = new Omega_eh(7, 0, 0, 4, ena); omega_eh_array().add(elem);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h); // boundary conditions
the_gh_array[node_order(4)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(4)][0] = 0.0;
the_gh_array[node_order(9)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(9)][0] = 0.0;
the_gh_array[node_order(14)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(14)][0] = 0.0;
the_gh_array[node_order(9)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(9)][1] = 0.0;
the_gh_array[node_order(10)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(10)][0] = -5.0;
the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(0)][0] = 5.0;
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static Elastic_Nonconforming_Q4
elastic_nonconforming_q4_instance(element_type_register_instance);
int main() {
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
gd.u_h() = u;
gd.u_h() = gd.gh_on_gamma_h();
cout << gd.u_h() << endl;
return 0;
}
Listing 5•10 Nonconforming element for plane elasticity (project: “nonconforming_element” in project
workspace file “fe.dsw”).
The element stiffness matrix is of size 8 × 8 for the bilinear 4-node element (i.e., {ndf × nen} × {ndf × nen}). The
three equations of the stress-strain relations provide three independent relations. Therefore, one-point Gauss
integration produces ke(1-point) of rank 3, which is clearly rank deficient. The correct rank of ke should
be 5, which is the full rank minus the three rigid-body motions (8-3=5). Therefore, two trial hourglass
modes (expanded by Ψ to both u and v), corresponding to the x-hourglass and y-hourglass modes (see Figure
4•50d & e in Chapter 4), are used to define ke(hourglass) as
\[ k_e(\text{hourglass}) \equiv \frac{E J ( b \otimes b )}{12} \otimes ( \Psi \otimes \Psi ) \] Eq. 5•143
where E is the Young’s modulus. We can view Eq. 5•36 for heat conduction as the 1-ndf degenerate version of
Eq. 5•143 (by using κ ( b • b ), a scalar, in place of E ( b ⊗ b ), which is a 2 × 2 matrix). The hourglass element is
implemented in project “hourglass_element” in project workspace file “fe.dsw” and is shown in Program
Listing 5•11.
We test the performance of the hourglass element by considering the same problem solved by project
“higher_order_patch_test”. The formulation of the 1-point integration stiffness is the same as the stiffness matrix
computed with the macro definition “__TEST_B_MATRIX_FORMULATION” set at compile time for project
“higher_order_patch_test”. The hourglass element takes only 0.5 seconds to assemble the global stiffness matrix,
compared to 4.5 seconds for the standard 2 × 2 integration with project “higher_order_patch_test” (on an obsolete
166 MHz PC). However, the tip deflection is -0.82 instead of the exact -0.75, an error of about 9%.
#include "include\fe.h"
static const double L_ = 10.0; static const double c_ = 1.0; static const double h_e_ = L_/4.0;
static const double E_ = 1.e3; static const double v_ = 0.3;
#if defined(__TEST_DISTORTION)
static const double e_ = h_e_/2.0;
#else
static const double e_ = 0.0;
#endif
Omega_h::Omega_h() {
double v[2]; Node *node; // define Ωu
// 1st row
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_-e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_-2.0*e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 3.0*h_e_-e_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
// 2nd row
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 1.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(8, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(9, 2, v); node_array().add(node);
// 3rd row
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(10, 2, v); node_array().add(node);
v[0] = h_e_+e_; node = new Node(11, 2, v); node_array().add(node);
v[0] = 2.0*h_e_+2.0*e_; node = new Node(12, 2, v); node_array().add(node);
v[0] = 3.0*h_e_+e_; node = new Node(13, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(14, 2, v); node_array().add(node);
// 4-node elements
int ena[4]; Omega_eh *elem;
ena[0] = 0; ena[1] = 1; ena[2] = 6; ena[3] = 5;
elem = new Omega_eh(0, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 7; ena[3] = 6;
elem = new Omega_eh(1, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 3; ena[2] = 8; ena[3] = 7;
elem = new Omega_eh(2, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 3; ena[1] = 4; ena[2] = 9; ena[3] = 8;
elem = new Omega_eh(3, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 5; ena[1] = 6; ena[2] = 11; ena[3] = 10;
elem = new Omega_eh(4, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 6; ena[1] = 7; ena[2] = 12; ena[3] = 11;
elem = new Omega_eh(5, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 7; ena[1] = 8; ena[2] = 13; ena[3] = 12;
elem = new Omega_eh(6, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 8; ena[1] = 9; ena[2] = 14; ena[3] = 13;
elem = new Omega_eh(7, 0, 0, 4, ena); omega_eh_array().add(elem);
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h); // boundary conditions
the_gh_array[node_order(4)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(4)][0] = 0.0;
the_gh_array[node_order(9)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(9)][0] = 0.0;
the_gh_array[node_order(14)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(14)][0] = 0.0;
the_gh_array[node_order(9)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(9)][1] = 0.0;
the_gh_array[node_order(10)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(10)][0] = -5.0;
the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(0)][0] = 5.0;
}
Listing 5•11 Hourglass element for plane elasticity (project: “hourglass_element” in project workspace file
“fe.dsw”).
t0 = - t1 = λ Eq. 5•144
Irreducible Subdomains
The Euler-Lagrange equations applied on each of the two subdomains are
[Figure 5•4: two subdomains Ω0 and Ω1 separated by the interface ΓI. Labels: Γ0, Ω0, σ0, u0, n, t1, x0, ΓI, t0, Ω1,
σ1, u1, Γ1. Annotations: t = nσ (Cauchy’s formula), tn = t • n, tt = t - tn.]
Figure 5•4 Traction contact condition in the internal discontinuous surface ΓI. tn
is the normal component of t, and tt is its tangential component.
1. see p. 242 in Malvern, L.E., 1969, “Introduction to the mechanics of a continuous medium”, Prentice-Hall, Inc., Engle-
wood Cliffs, New Jersey.
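\[ \int_{\Gamma_I} \delta \lambda^T ( u_1 - u_0 ) \, d\Gamma \] Eq. 5•146
The matrix form of Eq. 5•145 and Eq. 5•146 after finite element approximation is1
\[ \begin{bmatrix} K_0 & Q_0^T & 0 \\ Q_0 & 0 & Q_1 \\ 0 & Q_1^T & K_1 \end{bmatrix} \begin{bmatrix} \hat{u}_0 \\ \hat{\lambda} \\ \hat{u}_1 \end{bmatrix} = \begin{bmatrix} f_0 \\ f_I \\ f_1 \end{bmatrix} \] Eq. 5•147
where for i = 0, 1
\[ Q_i = ( -1 )^i \int_{\Gamma_I} N_\lambda^T N_u^i \, d\Gamma \] Eq. 5•149
\[ f_i = \int_{\Omega_i} ( N_u^i )^T b \, d\Omega + \int_{\Gamma_i} ( N_u^i )^T h_i \, d\Gamma \] Eq. 5•150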
In the spirit of the B̄-method, all internal fields are eliminated at the element level through static condensation
to make a displacement-only formulation that resembles the standard irreducible formulation. In the present case,
the displacements û in Eq. 5•147 can be eliminated, leaving only the boundary forces λ̂, provided that the
stiffness matrices Ki are all invertible. However, that would require each subdomain to be constrained so that
rigid-body motions are precluded and the singularity of the stiffness matrices is avoided. The difficulty in
removing rigid-body motions for each subdomain limits the practical use of the hybrid method in the present
form.
Program Listing 5•12 implements the hybrid irreducible domains formulation. The test problem for the
higher-order patch test is now illustrated in Figure 5•5. As stated earlier, a special difficulty arises because
subdomain “0” is not sufficiently constrained to prevent rigid-body modes. The solution procedure is as
follows. First, static condensation can still be applied to subdomain “1”, since it is
constrained sufficiently to suppress the rigid-body motions. From the third equation of Eq. 5•147, since K1 is not
singular, we have
1. see p.375 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc.,
UK.
\[ Q_0 \hat{u}_0 + Q_1 K_1^{-1} [ f_1 - Q_1^T \hat{\lambda} ] = f_I \] Eq. 5•152
Therefore,
After û0 is obtained, λ̂ is computed from Eq. 5•153, and then û1 is computed from Eq. 5•151. The solution of this
computation shows that the tip deflection is 0.75 (exact), and the horizontal traction λx on the top and bottom of the
interface element has a magnitude of “15”, which is also exact.
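The order of elimination described above can be illustrated with a scalar analogue of Eq. 5•147, in which every block is a single number. This is a sketch under that assumption, not the code of Program Listing 5•12: the names HybridSolution and solve_hybrid_scalar are hypothetical, and the expression used for λ̂ is obtained here by solving Eq. 5•152 for λ̂. Note that K0 may be zero (an unconstrained subdomain “0”), as long as K1 and the interface coupling are not.

```cpp
#include <cassert>
#include <cmath>

struct HybridSolution { double u0, lambda, u1; };

// Scalar analogue of Eq. 5.147:
//   K0*u0 + Q0*lambda          = f0
//   Q0*u0          + Q1*u1     = fI
//           Q1*lambda + K1*u1  = f1
HybridSolution solve_hybrid_scalar(double K0, double K1, double Q0, double Q1,
                                   double f0, double fI, double f1) {
    assert(std::fabs(K1) > 0.0 && std::fabs(Q1) > 0.0);
    // Condense subdomain "1": u1 = K1^{-1} (f1 - Q1*lambda), substituted into
    // the second equation gives Eq. 5.152; solving it for lambda yields
    //   lambda = S^{-1} (Q0*u0 + g), S = Q1 K1^{-1} Q1, g = Q1 K1^{-1} f1 - fI
    const double S = Q1 * Q1 / K1;
    const double g = Q1 * f1 / K1 - fI;
    // Substitute lambda into the first equation and solve for u0.
    const double lhs = K0 + Q0 * Q0 / S;
    assert(std::fabs(lhs) > 0.0);
    HybridSolution s;
    s.u0 = (f0 - Q0 * g / S) / lhs;
    s.lambda = (Q0 * s.u0 + g) / S;          // interface traction
    s.u1 = (f1 - Q1 * s.lambda) / K1;        // back-substitution in subdomain "1"
    return s;
}
```

For example, with K0 = 0, K1 = 2, Q0 = 1, Q1 = -1, f0 = 3 and fI = f1 = 0, the procedure gives u0 = 1.5, λ = 3 and u1 = 1.5, which satisfies all three equations even though K0 alone is singular.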
[Figure 5•5: subdomains Ωh0 and Ωh1 with boundaries Γ0 and Γ1, joined along the interface ΓI; loads of 15 and -15;
E = 10^3, ν = 0.3.]
Figure 5•5 Beam bending problem for the hybrid irreducible domains formulation.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
#include "include\global_discretization_gamma_h_n.h"
static const double L_ = 10.0;
static const double c_ = 1.0;
static const double h_e_ = L_/((double)4.0);
static const double E_ = 1.e3;
static const double v_ = 0.3;
static const double lambda_ = v_*E_/((1+v_)*(1-2*v_));
static const double mu_ = E_/(2*(1+v_));
static const double lambda_bar = 2*lambda_*mu_/(lambda_+2*mu_);
static const double K_ = lambda_bar+2.0/3.0*mu_;
Omega_h_i::Omega_h_i(int i) : Omega_h(0){
if(i == 0) {
double v[2]; define Ωh0
Node *node;
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
1st row
v[0] = h_e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(2, 2, v); node_array().add(node); 2nd row
v[0] = 0.0; v[1] = 1.0*c_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = 2.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = h_e_; node = new Node(6, 2, v); node_array().add(node); 3rd row
v[0] = 2.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
int ena[8]; Omega_eh *elem;
ena[0] = 0; ena[1] = 2; ena[2] = 7; ena[3] = 5; 4-nodes element
ena[4] = 1; ena[5] = 4; ena[6] = 6; ena[7] = 3;
elem = new Omega_eh(0, 0, 0, 8, ena);
omega_eh_array().add(elem);
} else if(i == 1) { // define Ωh1
double v[2];
Node *node;
v[0] = 2.0*h_e_; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; v[1] = 1.0*c_; node = new Node(3, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(4, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; v[1] = 2.0*c_; node = new Node(5, 2, v); node_array().add(node);
v[0] = 3.0*h_e_; node = new Node(6, 2, v); node_array().add(node);
v[0] = 4.0*h_e_; node = new Node(7, 2, v); node_array().add(node);
int ena[8]; Omega_eh *elem;
ena[0] = 0; ena[1] = 2; ena[2] = 7; ena[3] = 5;
ena[4] = 1; ena[5] = 4; ena[6] = 6; ena[7] = 3;
elem = new Omega_eh(0, 0, 0, 8, ena);
omega_eh_array().add(elem);
} else if(i == 2) { // define ΓI; interface line elements
double v[2];
Node *node;
v[0] = 2.0*h_e_; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[1] = 1.0*c_; node = new Node(1, 2, v); node_array().add(node);
v[1] = 2.0*c_; node = new Node(2, 2, v); node_array().add(node);
int ena[3]; Omega_eh *elem; ena[0] = 0; ena[1] = 1; ena[2] = 2;
elem = new Omega_eh(0, 0, 0, 3, ena);
omega_eh_array().add(elem);
}
}
wy*~wy*U[1][1] )
| dv2 );
// Ki = ∫_Ωi Bi^T Di Bi dΩ
stiff &= stiff_vol + stiff_dev;
}
// û_1 = K_1^-1 [ f_1 - Q_1^T λ̂ ]
gd_interface.u_h() = gd_interface.gh_on_gamma_h();
cout << "interface traction:" << endl << gd_interface.u_h();
gd_0.u_h() = u_0;
gd_0.u_h() = gd_0.gh_on_gamma_h();
cout << "first domain displacement:" << endl << gd_0.u_h();
gd_1.u_h() = u_1;
gd_1.u_h() = gd_1.gh_on_gamma_h();
cout << "second domain displacement:" << endl << gd_1.u_h();
return 0;
}
Listing 5•12 Hybrid irreducible subdomains for plane elasticity (project: “hybrid_irreducible_subdomain”
in project workspace file “fe.dsw”).
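The elastic constants at the top of Listing 5•12 can be checked independently of the library. The sketch below, in plain C++, recomputes lambda_, mu_ and lambda_bar from E_ and v_ with the same expressions. The struct Lame and the function lame_parameters are hypothetical names introduced here, and reading lambda_bar as the plane-stress value Eν/(1 - ν^2) is an interpretation, not something the listing states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: elastic constants as used in the listing.
struct Lame {
    double lambda;      // first Lame parameter
    double mu;          // shear modulus
    double lambda_bar;  // modified lambda, 2*lambda*mu/(lambda + 2*mu)
};

// Same formulas as the static constants lambda_, mu_, lambda_bar.
inline Lame lame_parameters(double E, double v) {
    assert(v > -1.0 && v < 0.5);  // lambda is singular at v = 0.5
    Lame p;
    p.lambda = v * E / ((1.0 + v) * (1.0 - 2.0 * v));
    p.mu = E / (2.0 * (1.0 + v));
    p.lambda_bar = 2.0 * p.lambda * p.mu / (p.lambda + 2.0 * p.mu);
    return p;
}
```

With E = 1.e3 and ν = 0.3, as in the listing, lambda_bar agrees algebraically with Eν/(1 - ν^2), which is one way to confirm the constants before running the full problem.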
where i = 0, 1. An alternative form of the Hellinger-Reissner variational principle can be used to develop highly
efficient and accurate elements. The Pian-Sumihara element can be easily implemented with the assumed-stress
field as¹
$$ N_\sigma \equiv \begin{bmatrix} a_2^2 (\xi - \xi_0) & a_0^2 (\eta - \eta_0) \\ b_2^2 (\xi - \xi_0) & b_0^2 (\eta - \eta_0) \\ a_2 b_2 (\xi - \xi_0) & a_0 b_0 (\eta - \eta_0) \end{bmatrix} $$ Eq. 5•157
where
$$ a_0 \equiv \sum_{a=0}^{3} x_a \xi_a, \quad a_1 \equiv \sum_{a=0}^{3} x_a \xi_a \eta_a, \quad a_2 \equiv \sum_{a=0}^{3} x_a \eta_a, \quad b_0 \equiv \sum_{a=0}^{3} y_a \xi_a, \quad b_1 \equiv \sum_{a=0}^{3} y_a \xi_a \eta_a, \quad b_2 \equiv \sum_{a=0}^{3} y_a \eta_a $$ Eq. 5•158
{x_a, y_a}^T (a = 0, ..., 3) are the nodal coordinates, {ξ_a, η_a}^T = {{-1,-1}, {1,-1}, {1,1}, {-1,1}}, and
$$ \xi_0 \equiv \frac{J_1}{3 J_0}, \quad \eta_0 \equiv \frac{J_2}{3 J_0}, \quad \text{with} \quad J_0 \equiv \frac{a_0 b_2 - a_2 b_0}{16}, \quad J_1 \equiv \frac{a_0 b_1 - a_1 b_0}{16}, \quad J_2 \equiv \frac{a_1 b_2 - a_2 b_1}{16} $$ Eq. 5•159
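The geometric constants of Eq. 5•158 and Eq. 5•159 depend only on the four nodal coordinates, so they can be evaluated on their own. The following is a minimal sketch in plain C++; the struct PianSumiharaGeometry and the function pian_sumihara_geometry are hypothetical names and are not taken from the VectorSpace C++ Library.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical container for the constants of Eq. 5-158 and Eq. 5-159.
struct PianSumiharaGeometry {
    double a0, a1, a2, b0, b1, b2;
    double J0, J1, J2;
    double xi0, eta0;
};

// x, y: nodal coordinates of a 4-node quadrilateral, counter-clockwise,
// matching the reference corners (-1,-1), (1,-1), (1,1), (-1,1).
inline PianSumiharaGeometry pian_sumihara_geometry(
        const std::array<double, 4>& x, const std::array<double, 4>& y) {
    static const double xi[4]  = {-1.0,  1.0, 1.0, -1.0};
    static const double eta[4] = {-1.0, -1.0, 1.0,  1.0};
    PianSumiharaGeometry g = {};  // all sums start at zero
    for (int a = 0; a < 4; ++a) {
        g.a0 += x[a] * xi[a];
        g.a1 += x[a] * xi[a] * eta[a];
        g.a2 += x[a] * eta[a];
        g.b0 += y[a] * xi[a];
        g.b1 += y[a] * xi[a] * eta[a];
        g.b2 += y[a] * eta[a];
    }
    g.J0 = (g.a0 * g.b2 - g.a2 * g.b0) / 16.0;
    g.J1 = (g.a0 * g.b1 - g.a1 * g.b0) / 16.0;
    g.J2 = (g.a1 * g.b2 - g.a2 * g.b1) / 16.0;
    assert(g.J0 != 0.0);  // degenerate element otherwise
    g.xi0 = g.J1 / (3.0 * g.J0);
    g.eta0 = g.J2 / (3.0 * g.J0);
    return g;
}
```

For an undistorted rectangle, a1 and b1 vanish, so J1 = J2 = 0 and the offsets ξ0 and η0 are zero; J0 then equals one quarter of the element area.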
where ke(1-point) is again the 1-point Gaussian integration on stiffness matrix from standard irreducible formula-
tion, and ke(stabilizer) is defined as
1. see p.282-285 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill,
Inc., UK.
$$ A = \int_{\Omega_e} N_\sigma^T D^{-1} N_\sigma \, d\Omega, \quad \text{and} \quad C = \int_{\Omega_e} N_\sigma^T B \, d\Omega $$ Eq. 5•162
Program Listing 5•13 implements the Pian-Sumihara element. For the “higher-order patch test” in the nearly
incompressible plane strain condition (with ν = 0.5 - 10^-12), the tip-deflection of the Pian-Sumihara
element is “-0.566027”, which is comparable to the tip-deflection “-0.5625” in project
“incompressible_u_p_formulation” (with ν = 0.5 and plane strain). For the element distortion test¹, as shown in
Figure 5•6, the Pian-Sumihara element produces a tip-deflection that is 80% of that of the element without distortion,
which is far better than the bilinear 4-node element. The Pian-Sumihara element is praised as the most efficient and
accurate four-node element to date.
1. see p.387 in Zienkiewicz, O.C. and R.L. Taylor, 1989, “The finite element method”, 4th ed., vol. 1. McGraw-Hill, Inc.,
UK.
#include "include\fe.h"
static const double L_ = 10.0; static const double c_ = 2.0; static const double h_e_ = L_/2.0;
#if defined(__TEST_HIGHER_ORDER_PATCH_TEST)
static const double E_ = 1000.0;
static const double v_ = 0.5-1.e-12;
#else
static const double E_ = 1500.0; static const double v_ = 0.25;
#endif
#if defined(__TEST_DISTORTION)
static const double e_ = h_e_/10.0;
#else
static const double e_ = 0.0;
#endif
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES;
Omega_h::Omega_h() {
#if defined(__TEST_HIGHER_ORDER_PATCH_TEST) // eight 4-nodes quadrilaterals
double x[4][2] = {{0.0, 0.0}, {10.0, 0.0}, {10.0, 2.0}, {0.0, 2.0}};
int control_node_flag[4] = {1, 1, 1, 1}, col_node_no = 5, row_node_no = 3;
block(this, row_node_no, col_node_no, 4, control_node_flag, x[0]);
#else
double v[2]; Node *node; // two 4-nodes quadrilaterals
v[0] = 0.0; v[1] = 0.0; node = new Node(0, 2, v); node_array().add(node);
v[0] = h_e_-e_; node = new Node(1, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(2, 2, v); node_array().add(node);
v[0] = 0.0; v[1] = c_; node = new Node(3, 2, v); node_array().add(node);
v[0] = h_e_+e_; node = new Node(4, 2, v); node_array().add(node);
v[0] = 2.0*h_e_; node = new Node(5, 2, v); node_array().add(node);
int ena[4]; Omega_eh *elem; ena[0] = 0; ena[1] = 1; ena[2] = 4; ena[3] = 3;
elem = new Omega_eh(0, 0, 0, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 5; ena[3] = 4;
elem = new Omega_eh(1, 0, 0, 4, ena); omega_eh_array().add(elem);
#endif
}
gh_on_Gamma_h::gh_on_Gamma_h(int df, Omega_h& omega_h) {
__initialization(df, omega_h);
#if defined(__TEST_HIGHER_ORDER_PATCH_TEST) // test case from p. 301 “Load 2”, Zienkiewicz & Taylor vol.1
int col_node_no = 5, row_node_no = 3;
for(int i = 0; i < row_node_no; i++) {
the_gh_array[node_order((i+1)*col_node_no-1)](0) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order((i+1)*col_node_no-1)][0] = 0.0;
}
the_gh_array[node_order(col_node_no*((row_node_no+1)/2)-1)](1) =
gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(col_node_no*((row_node_no+1)/2)-1)][1] = 0.0;
double h_ = 1.0, f_ = 15.0;
the_gh_array[node_order(2*col_node_no)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(2*col_node_no)][0] = -f_*(1.0/3.0)*h_;
the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(0)][0] = f_*(1.0/3.0)*h_;
#else
// test case B.C. #3 from p. 386, distorted element configuration in p. 387 of Zienkiewicz & Taylor vol.1
the_gh_array[node_order(2)](0) = the_gh_array[node_order(5)](0) =
the_gh_array[node_order(2)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(3)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(3)][0] = 1000.0;
the_gh_array[node_order(0)](0) = gh_on_Gamma_h::Neumann;
the_gh_array[node_order(0)][0] = -1000.0;
#endif
}
H1 x = n*xl;
H0 nx = d(n) * d(x).inverse();
J dv2(d(x).det());
int main() {
int ndf = 2;
Omega_h oh;
gh_on_Gamma_h gh(ndf, oh);
U_h uh(ndf, oh);
Global_Discretization gd(oh, gh, uh);
Matrix_Representation mr(gd);
mr.assembly();
C0 u = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
gd.u_h() = u;
gd.u_h() = gd.gh_on_gamma_h();
cout << gd.u_h();
return 0;
}
Listing 5•13 Hybrid Pian-Sumihara element for plane elasticity (project: “hybrid_pian_sumihara” in
project workspace file “fe.dsw”).
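$$ \gamma = \begin{Bmatrix} \gamma_x \\ \gamma_y \end{Bmatrix} = - \begin{Bmatrix} \theta_x \\ \theta_y \end{Bmatrix} + \begin{Bmatrix} \partial w / \partial x \\ \partial w / \partial y \end{Bmatrix} = ( -\theta ) + \nabla w $$ Eq. 5•163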
In Reissner-Mindlin plate theory, the “fiber” is assumed to remain in plane, but it is not assumed to stay
perpendicular to the mid-surface as in thin plate theory. That is, the transverse shear, γ, is not assumed to be zero. The
bending moment constitutive equation from Eq. 4•251 in Chapter 4 is
M = DL θ Eq. 5•164
The shear force relations to the bending moments and vertical loads from Eq. 4•254 in Chapter 4 are
First, M can be eliminated by substituting Eq. 5•164 into the first part of Eq. 5•165 as
$$ L^T D L \theta + S = 0 $$ Eq. 5•166
Then, the constitutive equation for shear force and transverse shear strain is S = αγ (Eq. 4•253 in Chapter 4),
where α is the shear rigidity. Therefore,
$$ \frac{S}{\alpha} + \theta - \nabla w = 0 $$ Eq. 5•167
from the definition of the transverse shear strain γ ≡ (-θ) + ∇w in Eq. 5•163. Eq. 5•167 can be rearranged as
S = α(-θ + ∇w). This is then used to eliminate the shear force S in Eq. 5•166 as
$$ L^T D L \theta - \alpha ( \theta - \nabla w ) = 0 $$ Eq. 5•168
$$ \nabla^T [ \alpha ( \theta - \nabla w ) ] = q $$ Eq. 5•169
and
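Integrating the first term of Eq. 5•171 and the left-hand-side terms of Eq. 5•172 by parts and applying Green’s
theorem gives
$$ \int_\Omega \left( ( L N_\theta )^T D L N_\theta + N_\theta^T \alpha N_\theta \right) d\Omega \, \hat{\theta} - \int_\Omega N_\theta^T \alpha \nabla N_w \, d\Omega \, \hat{w} = \int_\Gamma N_\theta^T M_\Gamma \, d\Gamma $$ Eq. 5•173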
and
or in matrix form
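$$ \begin{bmatrix} K_s & K_{bs}^T \\ K_{bs} & K_b \end{bmatrix} \begin{Bmatrix} \hat{w} \\ \hat{\theta} \end{Bmatrix} = \begin{Bmatrix} f_w \\ f_\theta \end{Bmatrix} $$ Eq. 5•175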
where
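$$ \begin{bmatrix} K_s & K_{bs}^T \\ K_{bs} & K_b \end{bmatrix} = \begin{bmatrix} K_s & K_{bs}^T \\ K_{bs} & K_b(\alpha) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & K_b(D) \end{bmatrix} \equiv K_S + K_B $$ Eq. 5•181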
The matrix KS involves components that enforce the shear constraints, and KB is the part that has only to do
with the bending energy. This stiffness splitting is useful for selective reduced integration, in which the reduced
integration is applied to all the submatrices in the shear constraint part, KS, to avoid shear locking.
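What “reduced” means for a quadrature rule can be seen on a single polynomial. The sketch below, in plain C++, compares the 2 × 2 and 3 × 3 Gauss-Legendre rules on the reference square; gauss_1d and gauss_2d are hypothetical helper names, not functions of the VectorSpace C++ Library, and the integrand ξ^4 is chosen only because it is the lowest even power that the 2-point rule no longer integrates exactly.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// n-point Gauss-Legendre rule on [-1, 1], for n = 2 or n = 3 only.
inline double gauss_1d(const std::function<double(double)>& f, int n) {
    assert(n == 2 || n == 3);
    if (n == 2) {
        const double p = 1.0 / std::sqrt(3.0);  // points +-1/sqrt(3), weights 1
        return f(-p) + f(p);
    }
    const double p = std::sqrt(0.6);            // points 0, +-sqrt(3/5)
    return (5.0 * f(-p) + 8.0 * f(0.0) + 5.0 * f(p)) / 9.0;
}

// Tensor-product n x n rule on the reference square [-1, 1]^2.
inline double gauss_2d(const std::function<double(double, double)>& f, int n) {
    return gauss_1d([&f, n](double xi) {
        return gauss_1d([&f, xi](double eta) { return f(xi, eta); }, n);
    }, n);
}
```

The exact integral of ξ^4 over the square is 0.8. The 3 × 3 rule reproduces it, while the 2 × 2 rule returns 4/9, so terms of this order are deliberately under-integrated when the reduced rule is applied to the shear constraint part.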
Program Listing 5•14 implements the “heterosis element” illustrated in Figure 5•7. The deflection
degree of freedom, w, uses the eight-node serendipity shape functions, and the rotation degrees of freedom, θ = [θx ,
θy]T, use the Lagrangian 9-node shape functions. All terms except the bending stiffness submatrix Kb(D) (that is, the KB
stiffness matrix) use the reduced (2 × 2) integration. We solve the same test problem as in the thin plate section. The
maximum deflection is 222358 at the center, which compares with the exact thin-plate solution of
226800.
[Figure legend: one marker denotes w nodes, the other denotes θ nodes.]
Figure 5•7 Heterosis element for irreducible thick plate formulation with serendipity shape
functions for deflection (w) and Lagrangian shape functions for rotation (θ). The bending stiffness
submatrix Kb(D) is fully integrated with 3 × 3 integration. All other terms, which involve shear
constraints, are selectively reduced integrated (2 × 2).
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) { // Heterosis element
if(i == 0) { int row_segment_no = 2, count=0;
Node *node; double v[2], h = 1.0/((double)(row_segment_no)); // Ωw; 8-node serendipity element
for(int j = 0; j < row_segment_no; j++) {
v[1] = ((double)j)*h;
for(int k = 0; k < (2*row_segment_no+1); k++) {
v[0] = ((double)k)*h/2; node = new Node(count++, 2, v); the_node_array.add(node);
}
v[1] += h/2.0;
for(int k = 0; k < row_segment_no+1; k++) {
v[0] = ((double)k)*h; node = new Node(count++, 2, v); the_node_array.add(node);
}
}
v[1] = 1.0;
for(int j = 0; j < (2*row_segment_no+1); j++) {
v[0] = ((double)j)*h/2.0; node = new Node(count++, 2, v); the_node_array.add(node);
}
int ena[8]; Omega_eh *elem; count = 0;
for(int j = 0; j < row_segment_no; j++)
for(int k = 0; k < row_segment_no; k++) {
int first_node = j*(3*row_segment_no+2)+k*2;
ena[0] = first_node; ena[1] = ena[0]+2; ena[2] = ena[1]+(3*row_segment_no+2);
ena[3] = ena[2]-2; ena[4] = ena[0]+1;
ena[5] = (j+1)*(2*row_segment_no+1)+j*(row_segment_no+1)+k+1;
ena[6] = ena[3]+1; ena[7] = ena[5]-1;
elem = new Omega_eh(count++, 0, 0, 8, ena); the_omega_eh_array.add(elem);
}
} else if(i == 1) { // Ωθ; Lagrangian 9-node element
int row_segment_no = 2;
int count = 0; Node *node; double v[2], h = 1.0/((double)(row_segment_no));
for(int j = 0; j < row_segment_no; j++) {
v[1] = ((double)j)*h;
for(int k = 0; k < (2*row_segment_no+1); k++) {
v[0] = ((double)k)*h/2; node = new Node(count++, 2, v); the_node_array.add(node);
}
v[1] += h/2.0;
for(int k = 0; k < (2*row_segment_no+1); k++) {
v[0] = ((double)k)*h/2; node = new Node(count++, 2, v); the_node_array.add(node);
}
}
v[1] = 1.0;
for(int j = 0; j < (2*row_segment_no+1); j++) {
v[0] = ((double)j)*h/2.0; node = new Node(count++, 2, v); the_node_array.add(node);
}
int ena[9]; Omega_eh *elem; count = 0;
for(int j = 0; j < row_segment_no; j++) for(int k = 0; k < row_segment_no; k++) {
int row_node_no = 2*row_segment_no+1, first_node = j*2*row_node_no+k*2;
ena[0] = first_node; ena[1] = ena[0]+2; ena[2] = ena[1]+2*row_node_no;
ena[3] = ena[2]-2; ena[4] = ena[0]+1; ena[5] = ena[1]+row_node_no;
ena[6] = ena[2]-1; ena[7] = ena[0]+row_node_no; ena[8] = ena[7] +1;
elem = new Omega_eh(count++, 0, 0, 9, ena); the_omega_eh_array.add(elem);
}
}
}
Global_Discretization w_gd(oh_w, w_gh, w_h, w_type);
const int theta_ndf = 2; // θ
Omega_h_i oh_theta(1);
gh_on_Gamma_h_i theta_gh(1, theta_ndf, oh_theta);
U_h theta_h(theta_ndf, oh_theta);
Global_Discretization theta_gd(oh_theta, theta_gh, theta_h, theta_type);
Global_Discretization_Couple gdc(theta_gd, w_gd); // θ-w
Matrix_Representation mr_w(w_gd);
Matrix_Representation mr_theta(theta_gd);
Matrix_Representation_Couple mrc(gdc, 0, &(mr_theta.rhs()), &(mr_w.rhs()), &mr_w);
mr_w.assembly();
mr_theta.assembly();
mrc.assembly();
C0 Ks = ((C0)(mr_w.lhs())), f_w = ((C0)(mr_w.rhs())), // Ks and fw
Kbs = ((C0)(mrc.lhs())), // Kbs
Kb = ((C0)(mr_theta.lhs())), f_theta = ((C0)(mr_theta.rhs())); // Kb and fθ
Cholesky dKs(Ks);
C0 Ks_inv = dKs.inverse(),
KbsKs_invKbst = Kbs*Ks_inv*(~Kbs),
K = Kb-KbsKs_invKbst,
f = f_theta-Kbs*(dKs*f_w);
// θ̂ = (Kb - Kbs Ks^-1 Kbs^T)^-1 (fθ - Kbs Ks^-1 fw)
LU dK(K);
C0 theta = dK*f,
w = dKs*(f_w-(~Kbs)*theta); // ŵ = Ks^-1 (fw - Kbs^T θ̂)
w_h = w; w_h = w_gd.gh_on_gamma_h();
cout << "deflection w:" << endl << w_h << endl;
theta_h = theta; theta_h = theta_gd.gh_on_gamma_h();
cout << "rotation (theta_x, theta_y):" << endl;
for(int i = 0; i < theta_h.total_node_no(); i++) cout << theta_h[i] << endl;
return 0;
}
Listing 5•14 Heterosis element for θ-w irreducible thick plate formulation (project:
“irreducible_thick_plate” in project workspace file “fe.dsw”).
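The condensation carried out at the end of the listing, which eliminates ŵ to solve for θ̂ first and then recovers ŵ, can be illustrated outside the library. The following is a minimal sketch in plain C++ that assumes small dense blocks stored in std::vector. The names Vec, Mat, solve, and condensed_solve are hypothetical, and Gaussian elimination with partial pivoting stands in for the Cholesky and LU decompositions used in the listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;

// Solve A x = b by Gaussian elimination with partial pivoting.
// A and b are taken by value, so the caller's data is not modified.
inline Vec solve(Mat A, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        assert(A[p][k] != 0.0);  // singular matrix otherwise
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < n; ++i) {
            const double m = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }
    Vec x(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= A[i][j] * x[j];
        x[i] = s / A[i][i];
    }
    return x;
}

// Solve [Ks Kbs^T; Kbs Kb] [w; theta] = [fw; ftheta] by condensing out w.
// Kbs has one row per theta dof and one column per w dof.
inline void condensed_solve(const Mat& Ks, const Mat& Kbs, const Mat& Kb,
                            const Vec& fw, const Vec& ftheta,
                            Vec& w, Vec& theta) {
    const std::size_t nw = fw.size(), nt = ftheta.size();
    Mat X(nt);  // X[j] = Ks^-1 * (row j of Kbs), i.e. column j of Ks^-1 Kbs^T
    for (std::size_t j = 0; j < nt; ++j) X[j] = solve(Ks, Kbs[j]);
    const Vec y = solve(Ks, fw);  // Ks^-1 fw
    Mat K = Kb;                   // becomes Kb - Kbs Ks^-1 Kbs^T
    Vec f = ftheta;               // becomes ftheta - Kbs Ks^-1 fw
    for (std::size_t i = 0; i < nt; ++i) {
        for (std::size_t j = 0; j < nt; ++j)
            for (std::size_t k = 0; k < nw; ++k) K[i][j] -= Kbs[i][k] * X[j][k];
        for (std::size_t k = 0; k < nw; ++k) f[i] -= Kbs[i][k] * y[k];
    }
    theta = solve(K, f);
    Vec r = fw;                   // becomes fw - Kbs^T theta
    for (std::size_t k = 0; k < nw; ++k)
        for (std::size_t i = 0; i < nt; ++i) r[k] -= Kbs[i][k] * theta[i];
    w = solve(Ks, r);
}
```

The sketch never forms Ks^-1 explicitly; it solves with Ks once per right-hand side. The listing instead calls dKs.inverse(), which is equivalent in exact arithmetic.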
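$$ L^T D L \theta + S = 0 $$ Eq. 5•183
$$ \frac{S}{\alpha} + \theta - \nabla w = 0 $$ Eq. 5•184
The three variables [θ, S, w]T in Eq. 5•183 to Eq. 5•185 are approximated as
$$ \theta = N_\theta \hat{\theta}, \quad S = N_S \hat{S}, \quad \text{and} \quad w = N_w \hat{w} $$ Eq. 5•186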
Integrating the first term of Eq. 5•187 and the first term of Eq. 5•189 by parts, applying Green’s theorem,
and simply changing the sign of Eq. 5•188 yields
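$$ \int_\Omega ( \nabla N_w )^T N_S \, d\Omega \, \hat{S} = - \int_\Omega N_w^T q \, d\Omega + \int_\Gamma N_w^T S_\Gamma \, d\Gamma $$ Eq. 5•192
$$ \begin{bmatrix} K_b & C^T & 0 \\ C & H & E^T \\ 0 & E & 0 \end{bmatrix} \begin{Bmatrix} \hat{\theta} \\ \hat{S} \\ \hat{w} \end{Bmatrix} = \begin{Bmatrix} f_\theta \\ 0 \\ f_w \end{Bmatrix} $$ Eq. 5•193
$$ H = - \int_\Omega N_S^T \frac{1}{\alpha} N_S \, d\Omega $$ Eq. 5•196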
The condition for non-singular matrices parallel to that for the three-fields Hu-Washizu variational principle in
Eq. 5•63 is
By inspecting Eq. 5•194 to Eq. 5•199, the first derivatives of θ and w appear. Therefore, C0-continuity needs to be
satisfied for these two fields. The S-field is taken as discontinuous, which has the potential advantage of being
eliminated at the element level.¹ Program Listing 5•15 implements the heterosis element with the θ-S-w mixed
formulation. The S-field is a 4-node element with its four nodes at the 2 × 2 integration points. The result of the
mixed formulation is the same as the one with selective reduced integration on the shear constraint terms. In
plane elasticity, the equivalence theorem, in the previous chapter, was applied to (1) the selective reduced
integration on the pressure constraint for the irreducible formulation, and (2) the pressure nodes taken at the Gauss
integration points for the mixed u-p formulation. Now, the equivalence theorem is applied, in thick plate
theory, to (1) the θ-w irreducible formulation with selective reduced integration on the shear constraint terms,
and (2) the θ-S-w mixed formulation where the S-field is a 4-node element with four nodes at the 2 × 2 integration
points. This equivalence theorem is illustrated in Figure 5•8.
The deflection computed by Program Listing 5•15 is exactly the same as that obtained from the irreducible
formulation.
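Placing the S-field nodes at the 2 × 2 integration points means that each S shape function equals one at its own Gauss point and zero at the other three. The sketch below, in plain C++, evaluates the bilinear functions (1 ± √3 ξ)(1 ± √3 η)/4 that the listing assigns to n[0] to n[3]; the function name gauss_point_shape is hypothetical and the code does not use the VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>

// Bilinear shape functions whose nodes sit at the 2 x 2 Gauss points
// (+-1/sqrt(3), +-1/sqrt(3)), in the node order used for n[0..3].
inline void gauss_point_shape(double zai, double eta, double n[4]) {
    const double s = std::sqrt(3.0);
    n[0] = (1.0 - s * zai) * (1.0 - s * eta) / 4.0;
    n[1] = (1.0 + s * zai) * (1.0 - s * eta) / 4.0;
    n[2] = (1.0 + s * zai) * (1.0 + s * eta) / 4.0;
    n[3] = (1.0 - s * zai) * (1.0 + s * eta) / 4.0;
}
```

Evaluated at the Gauss point (-1/√3, -1/√3), the functions return (1, 0, 0, 0), and at any point they sum to one. This nodal property is what ties the mixed formulation to 2 × 2 sampling of the shear terms.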
1. p. 75 in Zienkiewicz, O.C. and R.L. Taylor, 1991, “The finite element method”, 4th ed., vol. 2. McGraw-Hill, Inc., UK.
Figure 5•8 The equivalence of selective reduced integration of the shear constraint
terms of the irreducible formulation and the mixed formulation with shear force nodes at
Gauss integration points. Both formulations use the heterosis element, which uses the
serendipity element for w and the Lagrangian element for θ.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) { // Heterosis element; see Hughes[1987]
// Ωθ; 9-node Lagrangian element
if(i == 0) {
int row_segment_no = 2;
int count = 0; Node *node; double v[2], h = 1.0/((double)(row_segment_no));
for(int j = 0; j < row_segment_no; j++) {
v[1] = ((double)j)*h;
for(int k = 0; k < (2*row_segment_no+1); k++) {
v[0] = ((double)k)*h/2; node = new Node(count++, 2, v); the_node_array.add(node);
}
v[1] += h/2.0;
for(int k = 0; k < (2*row_segment_no+1); k++) {
v[0] = ((double)k)*h/2; node = new Node(count++, 2, v); the_node_array.add(node);
}
}
v[1] = 1.0;
for(int j = 0; j < (2*row_segment_no+1); j++) {
v[0] = ((double)j)*h/2.0; node = new Node(count++, 2, v); the_node_array.add(node);
}
int ena[9]; Omega_eh *elem; count = 0;
for(int j = 0; j < row_segment_no; j++) for(int k = 0; k < row_segment_no; k++) {
int row_node_no = 2*row_segment_no+1, first_node = j*2*row_node_no+k*2;
ena[0] = first_node; ena[1] = ena[0]+2; ena[2] = ena[1]+2*row_node_no;
ena[3] = ena[2]-2; ena[4]=ena[0]+1;ena[5]=ena[1]+row_node_no;
ena[6]=ena[2]-1;ena[7]=ena[0]+row_node_no; ena[8] = ena[7] +1;
elem = new Omega_eh(count++, 0, 0, 9, ena); the_omega_eh_array.add(elem);
// Kb matrix; θ-field
if(gd.type() == theta_type) {
Quadrature qp(2, 9); // 3x3 Gauss integration for bending
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 9, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
// Lagrangian shape functions for θ dof
N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
N[8] = (1-Zai.pow(2))*(1-Eta.pow(2));
N[0] -= N[8]/4; N[1] -= N[8]/4; N[2] -= N[8]/4; N[3] -= N[8]/4;
N[4] = ((1-Zai.pow(2))*(1-Eta)-N[8])/2; N[5] = ((1-Eta.pow(2))*(1+Zai)-N[8])/2;
N[6] = ((1-Zai.pow(2))*(1+Eta)-N[8])/2; N[7] = ((1-Eta.pow(2))*(1-Zai)-N[8])/2;
N[0] -= (N[4]+N[7])/2; N[1] -= (N[4]+N[5])/2;
N[2] -= (N[5]+N[6])/2; N[3] -= (N[6]+N[7])/2;
H1 X = N*xl; H0 Nx = d(N) * d(X).inverse(); J dV(d(X).det());
H0 W_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, nsd, Nx), Wx, Wy, B;
Wx &= W_x[0][0]; Wy &= W_x[0][1];
B &= (~Wx || C0(0.0)) &
(C0(0.0) || ~Wy ) &
(~Wy || ~Wx );
// Kb = ∫_Ω (L Nθ)^T D L Nθ dΩ
stiff &= (~B)*(D*B) | dV;
} else if(gd.type() == S_type) {
// H matrix; S-field
Quadrature qp(2, 4);
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
// bilinear corner node shape function for coordinate transformation
N[0] = (1.0-Zai)*(1.0-Eta)/4.0; N[1] = (1.0+Zai)*(1.0-Eta)/4.0;
N[2] = (1.0+Zai)*(1.0+Eta)/4.0; N[3] = (1.0-Zai)*(1.0+Eta)/4.0;
C0 x = MATRIX("int, int, C0&, int, int", 4, 2, xl, 4, 0);
H1 X = N*x; J dv(d(X).det());
double sqrt3 = sqrt(3.0); C0 zero(0.0);
H0 n = INTEGRABLE_VECTOR("int, Quadrature", 4, qp), n0, n1, n2, n3, zai, eta;
zai &= ((H0)Z[0]); eta &= ((H0)Z[1]);
// S-shape functions: bilinear four Gauss point nodes
n[0] = (1.0-sqrt3*zai)*(1.0-sqrt3*eta)/4.0; n[1] = (1.0+sqrt3*zai)*(1.0-sqrt3*eta)/4.0;
n[2] = (1.0+sqrt3*zai)*(1.0+sqrt3*eta)/4.0; n[3] = (1.0-sqrt3*zai)*(1.0+sqrt3*eta)/4.0;
n0 &= n[0]; n1 &= n[1]; n2 &= n[2]; n3 &= n[3];
H0 N_S = (( n0 | zero | n1 | zero | n2 | zero | n3 | zero ) &
(zero | n0 | zero | n1 | zero | n2 | zero | n3 ));
stiff &= MATRIX("int, int", 16, 16);
// H = -∫_Ω NS^T (1/α) NS dΩ
C0 stiff_sub = MATRIX("int, int, C0&, int, int", 8, 8, stiff, 0, 0);
stiff_sub = -( ((~N_S) * N_S) | dv)/alpha_;
}
}
Element_Formulation_Couple* Plate_Heterosis::make(
int en, Global_Discretization_Couple& gdc) { return new Plate_Heterosis(en,gdc); }
Plate_Heterosis::Plate_Heterosis(int en, Global_Discretization_Couple& gdc) :
Element_Formulation_Couple(en, gdc) {
if(gdc.type() == S_theta_type) { // C matrix; S-θ couple
Quadrature qp(2, 9);
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
// bilinear corner node shape function for coordinate transformation
N[0] = (1.0-Zai)*(1.0-Eta)/4.0; N[1] = (1.0+Zai)*(1.0-Eta)/4.0;
N[2] = (1.0+Zai)*(1.0+Eta)/4.0; N[3] = (1.0-Zai)*(1.0+Eta)/4.0;
C0 x = MATRIX("int, int, C0&, int, int", 4, 2, xl, 4, 0);
H1 X = N*x;
J dv(d(X).det());
// θ-shape functions: Lagrangian shape function
nt0, nt1, nt2, nt3, nt4, nt5, nt6, nt7, nt8;
nt[0] = (1-zai)*(1-eta)/4; nt[1] = (1+zai)*(1-eta)/4;
nt[2] = (1+zai)*(1+eta)/4; nt[3] = (1-zai)*(1+eta)/4;
nt[8] = (1-zai.pow(2))*(1-eta.pow(2));
nt[0] -= nt[8]/4; nt[1] -= nt[8]/4; nt[2] -= nt[8]/4; nt[3] -= nt[8]/4;
nt[4] = ((1-zai.pow(2))*(1-eta)-nt[8])/2; nt[5] = ((1-eta.pow(2))*(1+zai)-nt[8])/2;
nt[6] = ((1-zai.pow(2))*(1+eta)-nt[8])/2; nt[7] = ((1-eta.pow(2))*(1-zai)-nt[8])/2;
nt[0] -= (nt[4]+nt[7])/2; nt[1] -= (nt[4]+nt[5])/2;
nt[2] -= (nt[5]+nt[6])/2; nt[3] -= (nt[6]+nt[7])/2;
nt0 &= nt[0]; nt1 &= nt[1]; nt2 &= nt[2]; nt3 &= nt[3]; nt4 &= nt[4];
nt5 &= nt[5]; nt6 &= nt[6]; nt7 &= nt[7]; nt8 &= nt[8];
H0 N_theta=((nt0|zero|nt1|zero|nt2|zero|nt3|zero|nt4|zero|nt5|zero|nt6|zero|nt7|zero|nt8|zero)& // Nθ
(zero|nt0|zero|nt1|zero|nt2|zero|nt3|zero|nt4|zero|nt5|zero|nt6|zero|nt7|zero|nt8));
stiff &= MATRIX("int, int", 16, 18);
// C = -∫_Ω NS^T Nθ dΩ
C0 stiff_sub = MATRIX("int, int, C0&, int, int", 8, 18, stiff, 0, 0);
stiff_sub = -( ((~N_S) * N_theta) | dv);
} else if(gdc.type() == w_S_type) {
Quadrature qp(2, 9);
// E matrix; w-S couple
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 8, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
// w shape function: serendipity shape functions
N[0] = (1-Zai)*(1-Eta)/4; N[1] = (1+Zai)*(1-Eta)/4;
N[2] = (1+Zai)*(1+Eta)/4; N[3] = (1-Zai)*(1+Eta)/4;
N[4] = (1-Zai.pow(2))*(1-Eta)/2; N[5] = (1-Eta.pow(2))*(1+Zai)/2;
N[6] = (1-Zai.pow(2))*(1+Eta)/2; N[7] = (1-Eta.pow(2))*(1-Zai)/2;
N[0] -= (N[4]+N[7])/2; N[1] -= (N[4]+N[5])/2; N[2] -= (N[5]+N[6])/2; N[3] -= (N[6]+N[7])/2;
H1 X = N*xl; H0 Nx = d(N) * d(X).inverse(); J dv(d(X).det());
H0 W_x = INTEGRABLE_SUBMATRIX("int, int, H0&", 1, 2/*nsd*/, Nx), Wx, Wy, grad_W;
Wx &= W_x[0][0]; Wy &= W_x[0][1];
grad_W &= (~Wx) & // ∇w
(~Wy);
double sqrt3 = sqrt(3.0);
C0 zero(0.0);
H0 n = INTEGRABLE_VECTOR("int, Quadrature", 4, qp), n0, n1, n2, n3, zai, eta;
zai &= ((H0)Z)[0]; eta &= ((H0)Z)[1];
// S-shape functions: bilinear four Gauss point nodes
n[0] = (1.0-sqrt3*zai)*(1.0-sqrt3*eta)/4.0; n[1] = (1.0+sqrt3*zai)*(1.0-sqrt3*eta)/4.0;
n[2] = (1.0+sqrt3*zai)*(1.0+sqrt3*eta)/4.0; n[3] = (1.0-sqrt3*zai)*(1.0+sqrt3*eta)/4.0;
n0 &= ((H0)n)[0]; n1 &= ((H0)n)[1]; n2 &= ((H0)n)[2]; n3 &= ((H0)n)[3];
H0 n_S = (n0|zero|n1|zero|n2|zero|n3|zero) & // NS
(zero|n0|zero|n1|zero|n2|zero|n3);
stiff &=MATRIX("int, int", 8, 16);
// E = ∫_Ω (∇Nw)^T NS dΩ
C0 stiff_sub=MATRIX("int, int, C0&, int, int", 8, 8, stiff, 0, 0);
stiff_sub = ( (~grad_W) * n_S) | dv;
double f_0 = 1.0;
// fw = ∫_Ω Nw^T q dΩ
force &= (((H0)N)*f_0) | dv;
}
}
Element_Formulation* Element_Formulation::type_list = 0;
static Element_Type_Register element_type_register_instance;
static Plate_Heterosis plate_heterosis_instance(element_type_register_instance);
const int theta_ndf = 2; // θ
Omega_h_i oh_theta(0);
gh_on_Gamma_h_i theta_gh(0, theta_ndf, oh_theta);
U_h theta_h(theta_ndf, oh_theta);
Global_Discretization theta_gd(oh_theta, theta_gh, theta_h, theta_type);
const int S_ndf = 2; // S
Omega_h_i oh_S(1);
gh_on_Gamma_h_i S_gh(1, S_ndf, oh_S);
U_h S_h(S_ndf, oh_S);
Global_Discretization S_gd(oh_S, S_gh, S_h, S_type);
const int w_ndf = 1; // w
Omega_h_i oh_w(2);
gh_on_Gamma_h_i w_gh(2, w_ndf, oh_w);
U_h w_h(w_ndf, oh_w);
Global_Discretization w_gd(oh_w, w_gh, w_h);
Global_Discretization_Couple gdc_S_theta(S_gd, theta_gd, S_theta_type); // S-θ
Global_Discretization_Couple gdc_w_S(w_gd, S_gd, w_S_type); // w-S
Matrix_Representation mr_theta(theta_gd); Matrix_Representation mr_S(S_gd);
Matrix_Representation_Couple mrcE(gdc_w_S,0, 0,&(mr_S.rhs()),&mr_S);
Matrix_Representation_Couple mrcC(gdc_S_theta,0,
&(mr_S.rhs()),&(mr_theta.rhs()), &mr_theta);
mr_theta.assembly();
mr_S.assembly();
mrcC.assembly();
mrcE.assembly();
C0 K = ((C0)(mr_theta.lhs())), // Kb
H = ((C0)(mr_S.lhs())), // H
C = ((C0)(mrcC.lhs())), // C
f_theta = ((C0)(mr_theta.rhs())), // fθ
E = ((C0)(mrcE.lhs())), // E
f_S = ((C0)(mr_S.rhs())), // fS
f_w = ((C0)(mrcE.rhs())); // fw
Cholesky dK(K);
C0 K_inv = dK.inverse(),
CK_inv = C*K_inv,
A = H-CK_inv*(~C);
Cholesky dnA(-A);
C0 A_inv = -(dnA.inverse()),
EA_inv = E*A_inv,
EA_invEt = EA_inv*(~E);
Cholesky dnEA_invEt(-EA_invEt);
// ŵ = (E A^-1 E^T)^-1 (E A^-1 fS - E A^-1 C Kb^-1 fθ - fw)
C0 w = -(dnEA_invEt*( E*-(dnA*f_S) - E*-(dnA*(C*(dK*f_theta))) -f_w) ),
// Ŝ = A^-1 (fS - E^T ŵ - C Kb^-1 fθ)
S = -(dnA*(f_S-(~E)*w-C*(dK*f_theta))),
// θ̂ = Kb^-1 (fθ - C^T Ŝ)
theta = dK*(f_theta-(~C)*S);
theta_h = theta;
theta_h = theta_gd.gh_on_gamma_h();
cout << "rotation (theta_x, theta_y):" << endl;
for(int i = 0; i < theta_h.total_node_no(); i++) cout << theta_h[i] << endl;
S_h = S;
S_h = S_gd.gh_on_gamma_h();
cout <<"shear force S:" << endl << S_h << endl;
w_h = w;
w_h = w_gd.gh_on_gamma_h();
cout << "deflection w:" << endl << w_h << endl;
return 0; }
Listing 5•15 Heterosis element for θ-S-w mixed thick plate formulation (project: “mixed_thick_plate” in
project workspace file “fe.dsw”).
S = α [ ∇w – θ ] Eq. 5•201
For a rectangular plate with edges parallel to the x and y axes (Figure 5•9a), these constraints, at the mid-side nodes
of the element boundaries, can be easily computed from bilinear four-node shape functions for both θ and w as¹
$$ \begin{aligned} \hat{S}_x^4 &= \alpha \left[ \frac{w_1 - w_0}{a} - \frac{\theta_x^0 + \theta_x^1}{2} \right] \\ \hat{S}_y^5 &= \alpha \left[ \frac{w_2 - w_1}{b} - \frac{\theta_y^1 + \theta_y^2}{2} \right] \\ \hat{S}_x^6 &= \alpha \left[ \frac{w_2 - w_3}{a} - \frac{\theta_x^2 + \theta_x^3}{2} \right] \\ \hat{S}_y^7 &= \alpha \left[ \frac{w_3 - w_0}{b} - \frac{\theta_y^3 + \theta_y^0}{2} \right] \end{aligned} $$ Eq. 5•202
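The four mid-side shear values of Eq. 5•202 are simple differences and averages of the corner values, so they can be evaluated directly. The following is a minimal sketch in plain C++ for one a × b rectangle with corner nodes numbered 0 to 3 counter-clockwise from the lower-left corner; the function name discrete_shear and the std::array layout are hypothetical and are not taken from the VectorSpace C++ Library.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Mid-side shear values for an a x b rectangle.
// w, theta_x, theta_y hold the four corner values (nodes 0..3).
// Returns {Sx at node 4, Sy at node 5, Sx at node 6, Sy at node 7}.
inline std::array<double, 4> discrete_shear(
        const std::array<double, 4>& w,
        const std::array<double, 4>& theta_x,
        const std::array<double, 4>& theta_y,
        double a, double b, double alpha) {
    assert(a > 0.0 && b > 0.0);
    std::array<double, 4> S;
    S[0] = alpha * ((w[1] - w[0]) / a - (theta_x[0] + theta_x[1]) / 2.0);
    S[1] = alpha * ((w[2] - w[1]) / b - (theta_y[1] + theta_y[2]) / 2.0);
    S[2] = alpha * ((w[2] - w[3]) / a - (theta_x[2] + theta_x[3]) / 2.0);
    S[3] = alpha * ((w[3] - w[0]) / b - (theta_y[3] + theta_y[0]) / 2.0);
    return S;
}
```

For a deflection that is linear in x with zero rotations, the two Sx values equal α times the slope and the two Sy values vanish. Setting θx equal to the slope at every corner drives all four values to zero, which is the thin-plate limit γ = 0.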
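[Figure: (a) a rectangular element of size a × b with corner nodes 0 to 3 carrying θ and w, mid-side nodes 4 and 6 carrying Sx, and mid-side nodes 5 and 7 carrying Sy; (b) the mid-side shear shape functions: NS for Sx at node 4 = (1−η)/2, NS for Sy at node 5 = (1+ξ)/2, NS for Sx at node 6 = (1+η)/2, NS for Sy at node 7 = (1−ξ)/2.]
Figure 5•9 Discrete Reissner-Mindlin method for rectangular element with bilinear four-node
shape functions for both θ and w, and four shear constraints enforced on the element boundaries.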
1. p.87-88 in Zienkiewicz, O.C., and R.L. Taylor, 1991, “ The finite element method: solid and fluid mechanics, dynamics,
and non-linearity”, 4-th eds., McGraw-Hill Inc., UK.
$$ \hat{S} = \alpha [ Q_w \hat{w} - Q_\theta \hat{\theta} ] $$ Eq. 5•203
where
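$$ Q_w = \begin{bmatrix} -\frac{1}{a} & \frac{1}{a} & 0 & 0 \\ 0 & -\frac{1}{b} & \frac{1}{b} & 0 \\ 0 & 0 & \frac{1}{a} & -\frac{1}{a} \\ -\frac{1}{b} & 0 & 0 & \frac{1}{b} \end{bmatrix}, \quad Q_\theta = \begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} \end{bmatrix} $$ Eq. 5•204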
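The shear field is interpolated as S = N_S Ŝ, with the N_S shape functions for these four mid-side nodes shown in
Figure 5•9(b). The matrix form of the problem becomes
$$ \begin{bmatrix} K_{ww} & K_{w\theta}^T \\ K_{w\theta} & K_{\theta\theta} \end{bmatrix} \begin{Bmatrix} \hat{w} \\ \hat{\theta} \end{Bmatrix} = \begin{Bmatrix} f_w \\ f_\theta \end{Bmatrix} $$ Eq. 5•205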
where
fw and fθ are the same as those defined in the irreducible formulation. Program Listing 5•16 implements the
discrete Reissner-Mindlin formulation above. The center deflection for this bilinear four-node element
with discrete shear constraints is 325634, which is greater than the analytical thin-plate solution of
226800.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
EP::element_pattern EP::ep = EP::QUADRILATERALS_4_NODES; // Ωθ & Ωw; bilinear 4-node element
static int row_node_no = 9; static int col_node_no = row_node_no;
Omega_h_i::Omega_h_i(int i) : Omega_h(0) {
if(i == 0 || i == 1) {
double x[4][2] = {{0.0, 0.0}, {1.0, 0.0}, {1.0, 1.0}, {0.0, 1.0}};
int control_node_flag[4] = {1, 1, 1, 1};
block(this, row_node_no, col_node_no, 4, control_node_flag, x[0]);
}
}
gh_on_Gamma_h_i::gh_on_Gamma_h_i(int i, int df, Omega_h& omega_h) : gh_on_Gamma_h() {
gh_on_Gamma_h::__initialization(df, omega_h);
// boundary conditions
if(i == 0) {
for(int j = 0; j < row_node_no; j++) {
for(int k = 0; k < df; k++) {
the_gh_array[node_order((j+1)*row_node_no-1)](k) =
the_gh_array[node_order(row_node_no*(row_node_no-1)+j)](k) =
gh_on_Gamma_h::Dirichlet;
}
the_gh_array[node_order(j)](1) = gh_on_Gamma_h::Dirichlet;
the_gh_array[node_order(j*row_node_no)](0) = gh_on_Gamma_h::Dirichlet;
}
} else if(i == 1) {
for(int j = 0; j < 9; j++) {
the_gh_array[node_order((j+1)*row_node_no-1)](0) =
the_gh_array[node_order(row_node_no*(row_node_no-1)+j)](0) =
gh_on_Gamma_h::Dirichlet;
}
}
}
static Global_Discretization *w_type = new Global_Discretization;
static Global_Discretization *theta_type = new Global_Discretization;
class Plate_Discrete_Reissner_Mindlin : public Element_Formulation_Couple {
public:
Plate_Discrete_Reissner_Mindlin(Element_Type_Register a) :
Element_Formulation_Couple(a) {}
Element_Formulation *make(int, Global_Discretization&);
Plate_Discrete_Reissner_Mindlin(int, Global_Discretization&);
Element_Formulation_Couple *make(int, Global_Discretization_Couple&);
Plate_Discrete_Reissner_Mindlin(int, Global_Discretization_Couple&);
};
Element_Formulation* Plate_Discrete_Reissner_Mindlin::make( int en,
Global_Discretization& gd) { return new Plate_Discrete_Reissner_Mindlin(en,gd); }
static const double E_ = 1.0; static const double v_ = 0.25;
static const double t_ = 0.01;
static const double D_ = E_ * pow(t_,3) / (12.0*(1-pow(v_,2)));
// D = E t^3 / (12 (1 - ν^2)) * [1 ν 0; ν 1 0; 0 0 (1-ν)/2]
static const double Dv[3][3] = {
{D_, D_*v_, 0.0 },
{D_*v_, D_, 0.0 },
{0.0, 0.0, D_*(1-v_)/2.0}
};
C0 D = MATRIX("int, int, const double*", 3, 3, Dv[0]);
static const double mu_ = E_/(2*(1+v_));
static const double alpha_ = (5.0/6.0)*mu_*t_;
}
// K_ww = ∫_Ω (NS Qw)^T α NS Qw dΩ
stiff &= ( ( ((~Q_w)*n)*((~n)*Q_w) ) | dv )/alpha_;
}
Element_Formulation_Couple* Plate_Discrete_Reissner_Mindlin::make(int en,
Global_Discretization_Couple& gdc) {
return new Plate_Discrete_Reissner_Mindlin(en,gdc);
}
Plate_Discrete_Reissner_Mindlin::Plate_Discrete_Reissner_Mindlin(int en,
Global_Discretization_Couple& gdc) : Element_Formulation_Couple(en, gdc) {
Quadrature qp(2, 4);
H1 Z(2, (double*)0, qp), Zai, Eta,
N = INTEGRABLE_VECTOR_OF_TANGENT_BUNDLE("int, int, Quadrature", 4, 2, qp);
Zai &= Z[0]; Eta &= Z[1];
N[0] = (1.0-Zai)*(1.0-Eta)/4.0; N[1] = (1.0+Zai)*(1.0-Eta)/4.0;
N[2] = (1.0+Zai)*(1.0+Eta)/4.0; N[3] = (1.0-Zai)*(1.0+Eta)/4.0;
H1 X = N*xl; J dv(d(X).det());
// S-shape functions: NS
H0 n = INTEGRABLE_VECTOR("int, Quadrature", 4, qp), zai, eta;
zai &= ((H0)Z)[0]; eta &= ((H0)Z)[1];
n[0] = (1-eta)/2.0; n[1] = (1+zai)/2.0; n[2] = (1+eta)/2.0; n[3] = (1-zai)/2.0;
t0 = n σ0 Eq. 5•209
where n is the outward unit surface normal at the point x0 on subdomain Ω0. Since the outward unit surface
normal at the point x1 on subdomain Ω1 is opposite in direction to n, we also have
t1 = (- n) σ1 Eq. 5•210
Because we assumed the contact is frictionless, only the normal components of the tractions, t0 • n = – t1 • n ≡ λ,
on the contact surface are required to be in equilibrium according to Newton’s third law of motion. The contact
conditions on the contact surface Γc are²
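$$ \begin{aligned} g \equiv [ u_1 - u_0 ] \cdot n + g^0 = 0 \;&\Rightarrow\; t_0 \cdot n = - t_1 \cdot n < 0 \\ g \equiv [ u_1 - u_0 ] \cdot n + g^0 > 0 \;&\Rightarrow\; t_0 \cdot n = - t_1 \cdot n = 0 \end{aligned} $$ Eq. 5•211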
where “g” is the “gap” between the two bodies, and “g0” is the initial gap. In constrained optimization (see Eq.
2•14 in Chapter 2), Eq. 5•211 can be concisely written as the Kuhn-Tucker condition
g λ = 0, λ ≤ 0, g ≥ 0 Eq. 5•212
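The Kuhn-Tucker condition separates three cases: an open gap with no traction, a closed gap with compressive traction, and combinations that are not admissible. The sketch below, in plain C++, classifies a single (g, λ) pair; the enum ContactState, the function classify_contact, and the tolerance argument are hypothetical names and choices introduced for illustration, not part of the VectorSpace C++ Library.

```cpp
#include <cassert>
#include <cmath>

enum ContactState { OPEN, CLOSED, INADMISSIBLE };

// Check one (gap, lambda) pair against g*lambda = 0, lambda <= 0, g >= 0.
// tol is an assumed absolute tolerance for the floating-point comparisons.
inline ContactState classify_contact(double g, double lambda,
                                     double tol = 1e-12) {
    assert(tol >= 0.0);
    if (g < -tol || lambda > tol) return INADMISSIBLE;   // penetration or tension
    if (std::fabs(g * lambda) > tol) return INADMISSIBLE; // complementarity fails
    return (g > tol) ? OPEN : CLOSED;
}
```

A pair with g = 0 and λ = 0 is reported as CLOSED here; treating that grazing case as closed is a choice made by this sketch.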
We note that λ, the normal component of the tractions, also plays the role of the Lagrange multiplier in the
context of the constrained optimization problem. The geometry of the contact element is shown in Figure 5•10. Node
number “8”, with coordinates x11, is projected onto the opposite boundary at x0. The proportional segment-length
parameter α is computed from
1. see p. 63 in Fung Y.C., 1965, “Foundations of solid mechanics”, Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2. Simo, J.C., Wriggers, P., and Taylor, R.L., 1985, “A perturbed Lagrangian formulation for the finite element solution of contact problems”, Computer Methods in Applied Mechanics and Engineering, vol. 50, p. 163-180, Elsevier Science Publishers, North-Holland.
Figure 5•10 Contact element geometry. Six contact line elements are formed through projection of real nodes onto the opposite boundaries. The target segment on Ω0 has length l = ||x01-x00||, and the projected point lies at a distance αl from x00.
$$ \alpha = \frac{(x_{10} - x_{00}) \cdot (x_{01} - x_{00})}{\| x_{01} - x_{00} \|^2} $$ Eq. 5•213
and the coordinates of the projected point x0, and the displacement on the projected node u0, can be linearly interpolated from the lever rule as
$$ x_0 = (1 - \alpha)\,x_{00} + \alpha\,x_{01}, \quad \text{and} \quad u_0 = (1 - \alpha)\,u_{00} + \alpha\,u_{01} $$ Eq. 5•214
For the contact element number “5” in Figure 5•10 we recognize that, on Ω0, the node numbers “3”, “4”, in glo-
bal node numbering, are involved in the element. The contact element node on Ω0 associated with the current
element number “5” has the coordinates of {x00, x01}. The displacement associated with these two nodes is
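denoted as u = [u00, u01]T, which is related to the displacement of the “real” nodes (global node numbers “3” and “4”, redefined as local element node numbers “0” and “1”) û = [û00, û01]T by the same lever rule as
$$ u = \begin{bmatrix} u_{00} \\ u_{01} \end{bmatrix} = \begin{bmatrix} (1 - \alpha_0)\,n_0 & \alpha_0\,n_0 \\ \alpha_1\,n_1 & (1 - \alpha_1)\,n_1 \end{bmatrix} \begin{bmatrix} \hat{u}_{00} \\ \hat{u}_{01} \end{bmatrix} = c\,\hat{u} $$ Eq. 5•215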
where α0 for node number “3” is computed from Eq. 5•213, and α1 = 0 (consistent with Eq. 5•213) for node number “4”, which is not a projected node. The initial gap, g0, and the tangent vector, s, along boundary Γ34 are
$$ g^0 = \| x_{10} - x_0 \|, \quad s = \frac{x_{01} - x_{00}}{\| x_{01} - x_{00} \|} $$ Eq. 5•216
The outward unit surface normal vector, n0, is defined by rotating the tangent vector 90° clockwise
$$ n_0 = R\,s = \begin{bmatrix} \cos\frac{\pi}{2} & \sin\frac{\pi}{2} \\ -\sin\frac{\pi}{2} & \cos\frac{\pi}{2} \end{bmatrix} \begin{bmatrix} s_x \\ s_y \end{bmatrix} = [s_y, -s_x]^T $$ Eq. 5•217
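The geometric quantities of Eq. 5•213 to Eq. 5•217 can be sketched in plain C++ as follows. This is only an illustration and not part of the VectorSpace C++ Library: the names Vec2, Projection, and project_node are hypothetical, and α is normalized by the squared segment length on the assumption that it is the fraction of the segment length l measured from x00.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;

// Hypothetical container for the results of one node-to-segment projection.
struct Projection {
    double alpha;  // segment length proportional parameter (Eq. 5•213)
    Vec2 x0;       // projected point on the target segment (Eq. 5•214)
    double gap0;   // initial gap: distance from the node to x0 (Eq. 5•216)
    Vec2 s;        // unit tangent along the target segment (Eq. 5•216)
    Vec2 n0;       // normal [s_y, -s_x] (Eq. 5•217)
};

// Project node x10 onto the target segment running from x00 to x01.
Projection project_node(const Vec2& x10, const Vec2& x00, const Vec2& x01) {
    const Vec2 d = {x01[0] - x00[0], x01[1] - x00[1]};
    const double len2 = d[0] * d[0] + d[1] * d[1];
    assert(len2 > 0.0 && "target segment must have non-zero length");
    const double len = std::sqrt(len2);

    Projection p;
    // Assumption: alpha is a length fraction, hence the squared norm.
    p.alpha = ((x10[0] - x00[0]) * d[0] + (x10[1] - x00[1]) * d[1]) / len2;
    // Lever-rule interpolation of the projected point.
    p.x0 = {(1.0 - p.alpha) * x00[0] + p.alpha * x01[0],
            (1.0 - p.alpha) * x00[1] + p.alpha * x01[1]};
    p.gap0 = std::hypot(x10[0] - p.x0[0], x10[1] - p.x0[1]);
    p.s = {d[0] / len, d[1] / len};
    p.n0 = {p.s[1], -p.s[0]};
    return p;
}
```

For a node at (4, 1) above the segment from (0, 0) to (8, 0), the sketch places the projected point at mid-segment, in the same way that the projection parameter 0.5 arises for a node projected onto the middle of a segment.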
The Lagrangian functional to the contact problem is (with domain index i = 0, 1)
The Euler-Lagrange equations are obtained by taking the directional derivatives of the Lagrangian functional with respect to ui and λ and setting them to zero.
$$ \int_{\Gamma_c} \delta\lambda^T \left\{ [u_1 - u_0] \cdot n + g^0 \right\} d\Gamma = 0 $$ Eq. 5•220
The first three terms in Eq. 5•219 are the same as in the standard irreducible formulation. The last term in Eq. 5•219 and Eq. 5•220 are needed for the contact formulation. The Lagrange multiplier and the displacement (on the contact elements) are approximated as linear line elements from their nodal values (with “hat”)
where N ua ≡ N a c ; i.e., the projected displacement u of contact nodes is related to u of global nodes by Eq. 5•215.
The off-diagonal element stiffness submatrices in the mixed formulation of Eq. 5•219 and Eq. 5•220 are
Eq. 5•222 adds nothing new from the mixed formulation point of view. What is new to the program is that we need to feed in the information, at the element level, on the corresponding λ and u on domains Ω0 and Ω1 that are associated with the particular contact segment.
1 int ndf = 2;
2 Omega_h_i elastic_foundation_oh(0); // elastic foundation; Ω elastic foundation
3 gh_on_Gamma_h_i elastic_foundation_gh(0, ndf, elastic_foundation_oh);
4 U_h u_0(ndf, elastic_foundation_oh); // uelastic foundation
5 Global_Discretization *elastic_foundation_type = new Global_Discretization();
6 Global_Discretization
7 elastic_foundation_gd(elastic_foundation_oh, elastic_foundation_gh, u_0,
8 elastic_foundation_type);
9 Gamma_h_i // potential contact segment; Γelastic foundation
10 elastic_foundation_gamma_h(0, elastic_foundation_oh);
11 Global_Discretization_Gamma_h_i
12 elastic_foundation_gamma_h_gd(0, elastic_foundation_gd, elastic_foundation_type,
13 elastic_foundation_gamma_h);
where line 9 defines the potential contact surface for the elastic foundation with a new class Gamma_h_i for Γi .
The Gamma_h_i class does not have variables of its own. The Global_Discretization of this class,
Global_Discretization_Gamma_h_i, uses the variable u_0 instantiated for the Ωi. The following lines define the
rigid punch (the upper part) in a similar manner.
Figure 5•11 Frictionless contact problem with rigid punch (or sled) on top of an elastic foundation. The elastic foundation is 60 wide and 20 high, with E = 1.e5 and ν = 0.5; the rigid punch has E = 1.e10 and ν = 0.5. The boundary condition for the rigid punch problem is v = -2; for the rigid sled problem of the next section it is u = 0.2 t.
Global_Discretization_Couple *interface_elastic_foundation_gamma_h_type
= new Global_Discretization_Couple();
Global_Discretization_Couple // {Γc-Γelastic foundation}
interface_elastic_foundation_gdc(interface_gd, elastic_foundation_gamma_h_gd,
interface_elastic_foundation_gamma_h_type);
Global_Discretization_Couple *interface_rigid_punch_gamma_h_type
= new Global_Discretization_Couple();
Global_Discretization_Couple // {Γc-Γrigid punch}
interface_rigid_punch_gdc(interface_gd, rigid_punch_gamma_h_gd,
interface_rigid_punch_gamma_h_type);
The contact element domain (Ωeh)c is defined as a new C++ class “Contact_Omega_eh” derived from class of
element domain Ωeh as
The public inheritance relationship lets the class of contact element domain inherit what is available in the class of element domain. In addition, the derived class is equipped with new private data which define what boundary elements on the surrounding bodies are associated with the contact element domain. For contact element number 3 in Figure 5•11b, the associated element from the lower body is boundary element number 2, and from the upper body is boundary element number 1. A contact element domain for contact element number 3 is defined inside the constructor of class Omega_h_i as (see also Figure 5•12)
Figure 5•12 Details of the contact element number “3”. The projection parameters are α00 = 0 and α01 = 1/2 on the lower body Ω0 (normal n0), and α10 = 1/2 and α11 = 0 on the upper body Ω1 (normal n1).
1 int ena[2];
2 ena[0] = 3; ena[1] = 4;
3 Omega_eh *elem = new Omega_eh(3, 0, 0, 2, ena); // store standard element information
4 int aen[2]; // associated element number array
5 aen[0] = 2; aen[1] = 1;
6 double pp[4], nv[4]; // projection parameters and unit normals
7 pp[0] = 0.0; pp[1] = 0.5; pp[2] = 0.5; pp[3] = 0.0; // α00, α01 (lower), and α10, α11 (upper)
8 nv[0] = 0.0; nv[1] = -1.0; // n0 = [n0x, n0y]T
9 nv[2] = 0.0; nv[3] = 1.0; // n1 = [n1x, n1y]T
10 Contact_Omega_eh *c_elem = new Contact_Omega_eh(aen, pp, nv, *elem); // contact element
11 omega_eh_array().add(c_elem); // element array
The associated element number array supplies the element numbers of the lower and the upper boundary associated with the contact element number “3”. The element number for the lower boundary is “2”, and the element number for the upper boundary is “1”. That is, in line 5,
aen[0] = 2; aen[1] = 1;
This information is needed so that the contact element knows which real nodes are associated with it. The projection parameters, α00 and α01, for the lower boundary nodes are stored in pp[0] and pp[1]. The first node on the lower boundary associated with contact element number 3 is node number 17, which is a real node, so α00 = 0. The second node is a projection node in the middle of node numbers 17 and 18. Therefore, the projection parameter α01 = 0.5. Alternatively, we can use Eq. 5•213 to compute the value of α when the geometrical condition becomes more complicated. α10 and α11 for the upper boundary can be computed similarly. The first node lies in the middle of node number “1” and node number “2” of the upper boundary, so the projection parameter α10 = 0.5. The second node on the upper boundary is the real node number “2” of the upper boundary, and we have α11 = 0. We write in line 7
Figure 5•13 Rigid punch on elastic foundation. The punch has been pushed down 2 units.
#include "include\fe.h"
#include "include\omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
#include "include\global_discretization_gamma_h_n.h"
class Contact_Omega_eh : public Omega_eh { // defining class of a contact element domain
int the_associate_element[2];
double the_projection_parameter[4];
double the_normal_vector[4];
public:
Contact_Omega_eh(int* ae, double* pp, double* nv, Omega_eh& oh) : Omega_eh(oh) {
for(int i = 0; i < 2; i++) the_associate_element[i] = ae[i];
for(int i = 0; i < 4; i++) the_projection_parameter[i] = pp[i];
for(int i = 0; i < 4; i++) the_normal_vector[i] = nv[i];
}
int* associate_element() { return the_associate_element; }
double* projection_parameter() { return the_projection_parameter; }
double* normal_vector() { return the_normal_vector; }
};
static const double E_[2] = {1.e5, 1.e10}; // Young’s moduli
static const double v_ = (0.5-1.e-12); // Poisson ratio
static const double lambda_[2] = { v_*E_[0]/((1+v_)*(1-2*v_)),
v_*E_[1]/((1+v_)*(1-2*v_))};
static const double mu_[2] = {E_[0]/(2*(1+v_)), // shear moduli
E_[1]/(2*(1+v_))};
static const double lambda_bar[2] = {2*lambda_[0]*mu_[0]/(lambda_[0]+2*mu_[0]), // λ-bar
2*lambda_[1]*mu_[1]/(lambda_[1]+2*mu_[1])};
Omega_h_i::Omega_h_i(int i) : Omega_h(0){
if(i == 0) { // elastic foundation; Ωelastic foundation
double v[2]; Node *node;
v[0] = -30.0; v[1] = -20.0; node = new Node(0, 2, v); node_array().add(node); // define nodes
v[0] = -16.0; node = new Node(1, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(2, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(3, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(4, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(5, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(6, 2, v); node_array().add(node);
v[0] = -30.0; v[1] = -8.0; node = new Node(7, 2, v); node_array().add(node);
v[0] = -16.0; node = new Node(8, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(9, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(10, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(11, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(12, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(13, 2, v); node_array().add(node);
v[0] = -30.0; v[1] = 0.0; node = new Node(14, 2, v); node_array().add(node);
v[0] = -16.0; node = new Node(15, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(16, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(17, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(18, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(19, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(20, 2, v); node_array().add(node);
int ena[4]; Omega_eh *elem;
ena[0] = 0; ena[1] = 1; ena[2] = 8; ena[3] = 7; // define elements
elem = new Omega_eh(0, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 9; ena[3] = 8;
elem = new Omega_eh(1, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 3; ena[2] = 10; ena[3] = 9;
elem = new Omega_eh(2, 0, 1, 4, ena); omega_eh_array().add(elem);
Contact node#          0    1    2    3    4    5    6
Γelastic foundation    0   -1    1   -1    2   -1    3
Γrigid sled           -1   16   -1   17   -1   18   -1
TABLE 5•2. Contact node number table.
What is not shown in the present example is the case where, for a particular contact node number, both cells contain the default value “-1”; this means that there is a detachment segment along the contact surface. This situation occurs either when no node projects onto certain segments or when the projection forms a gap that is beyond a pre-defined tolerance value. If neither cell contains the default value, the node on one side is projected onto the node on the other side within a tolerance distance. The contact searching algorithm is to (1) construct TABLE 5•2 automatically, then (2) use this table to define contact nodes and contact elements. The second part is straightforward, since we have the manual input experience gained from the rigid punch problem in the previous section. We focus on the first part. The basic idea is to scan through the contact surface Γc to find a leading node from either side of the potential contact boundaries Γi. The leading node is the first node encountered that is successfully projected to the opposite boundary segment. For example, if we are now in the middle of the process at the contact node number 2, the next leading node that can be successfully projected to the opposing Γi is node number 17 on the Γelastic foundation. Successive leading nodes are filled into the rows that they correspond to, leaving the opposite row with the default value (-1), which indicates a projected node. Inspecting yet another example in Figure 5•10, we find that the leading nodes do not necessarily alternate between rows in the contact searching number table as they do in TABLE 5•2. In general, they may appear consec-
Then the projection of a node onto its opposite target segment is successful on condition that the gap, as computed from Eq. 5•211, between the current node and the projected node is within a tolerance value. In the special condition when α is very close to 0 or 1 and the distance between the projected node and the ends of the target segment is small, we have a node-on-node projection situation. These functions are implemented in the same file “contact_searching.cpp” under its “geometrical utilities” section.
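The projection test described above can be sketched as a small classification routine. This is a hedged illustration only, not the code in “contact_searching.cpp”: the enum ProjectionStatus, the function classify_projection, and the use of a single tolerance for both the gap and the distance to the segment ends are assumptions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical outcome of projecting a node onto its opposite target segment.
enum class ProjectionStatus { Failed, NodeOnSegment, NodeOnNode };

// alpha   : projection parameter along the target segment
// gap     : distance between the current node and its projected point
// seg_len : length of the target segment
// tol     : tolerance used for both the gap and the distance to a segment end
ProjectionStatus classify_projection(double alpha, double gap,
                                     double seg_len, double tol) {
    assert(seg_len > 0.0 && tol >= 0.0);
    // A gap beyond the tolerance means the projection is not successful.
    if (std::fabs(gap) > tol) return ProjectionStatus::Failed;
    // Distance from the projected point to the nearer end of the segment.
    const double end_dist =
        std::fmin(std::fabs(alpha), std::fabs(1.0 - alpha)) * seg_len;
    // alpha close to 0 or 1: the node sits on a node of the opposite side.
    if (end_dist <= tol) return ProjectionStatus::NodeOnNode;
    // Otherwise the projected point must fall inside the segment.
    if (alpha < 0.0 || alpha > 1.0) return ProjectionStatus::Failed;
    return ProjectionStatus::NodeOnSegment;
}
```

With a tolerance of 0.05 (the value passed to the contact searching call in the listing of this section), a node projected to the middle of a segment with a small gap would be classified as NodeOnSegment, while α = 0.001 on a segment of length 8 would be treated as NodeOnNode.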
The contact searching algorithm should be invoked upon every iteration. The Q matrix will be changed accordingly. The rigid sled is pushed from left to right for 8 steps with an incremental displacement of ∆v = 0.2; i.e., vt = ∆v t (where t = 0, ..., 8). The Program implements the frictionless sled on elastic foundation problem. The Program Listing 5•18 implements the frictionless rigid punch problem. The solution of this problem is shown in Figure 5•14.
[Figure 5•14: solution panels at t = 2, t = 5, and t = 8]
#include "vs.h"
#include "fe.h"
#include "omega_h_n.h"
Matrix_Representation_Couple::assembly_switch
Matrix_Representation_Couple::Assembly_Switch = Matrix_Representation_Couple::ALL;
#include "global_discretization_gamma_h_n.h"
#include "contact_searching.h"
static const double E_[2] = {1.e5, 1.e10}; static const double v_ = (0.5-1.e-12); // Young’s moduli, Poisson ratio
static const double lambda_[2] = {v_*E_[0]/((1+v_)*(1-2*v_)), v_*E_[1]/((1+v_)*(1-2*v_))};
static const double mu_[2] = {E_[0]/(2*(1+v_)), E_[1]/(2*(1+v_))}; // shear moduli
static const double lambda_bar[2] =
{2*lambda_[0]*mu_[0]/(lambda_[0]+2*mu_[0]),2*lambda_[1]*mu_[1]/(lambda_[1]+2*mu_[1])}; // λ-bar
Omega_h_i::Omega_h_i(int i) : Omega_h(0){
the_index = i;
if(i == 0) {
double v[2]; Node *node; // elastic foundation; Ωelastic foundation
v[0] = -30.0; v[1] = -20.0; node = new Node(0, 2, v); node_array().add(node); // define nodes
v[0] = -16.0; node = new Node(1, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(2, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(3, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(4, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(5, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(6, 2, v); node_array().add(node);
v[0] = -30.0; v[1] = -8.0; node = new Node(7, 2, v); node_array().add(node);
v[0] = -16.0; node = new Node(8, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(9, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(10, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(11, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(12, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(13, 2, v); node_array().add(node);
v[0] = -30.0; v[1] = 0.0; node = new Node(14, 2, v); node_array().add(node);
v[0] = -16.0; node = new Node(15, 2, v); node_array().add(node);
v[0] = -8.0; node = new Node(16, 2, v); node_array().add(node);
v[0] = 0.0; node = new Node(17, 2, v); node_array().add(node);
v[0] = 8.0; node = new Node(18, 2, v); node_array().add(node);
v[0] = 16.0; node = new Node(19, 2, v); node_array().add(node);
v[0] = 30.0; node = new Node(20, 2, v); node_array().add(node);
int ena[4]; Omega_eh *elem; // define elements
ena[0] = 0; ena[1] = 1; ena[2] = 8; ena[3] = 7;
elem = new Omega_eh(0, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 1; ena[1] = 2; ena[2] = 9; ena[3] = 8;
elem = new Omega_eh(1, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 2; ena[1] = 3; ena[2] = 10; ena[3] = 9;
elem = new Omega_eh(2, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 3; ena[1] = 4; ena[2] = 11; ena[3] = 10;
elem = new Omega_eh(3, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 4; ena[1] = 5; ena[2] = 12; ena[3] = 11;
elem = new Omega_eh(4, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 5; ena[1] = 6; ena[2] = 13; ena[3] = 12;
elem = new Omega_eh(5, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 7; ena[1] = 8; ena[2] = 15; ena[3] = 14;
elem = new Omega_eh(6, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 8; ena[1] = 9; ena[2] = 16; ena[3] = 15;
elem = new Omega_eh(7, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 9; ena[1] = 10; ena[2] = 17; ena[3] = 16;
elem = new Omega_eh(8, 0, 1, 4, ena); omega_eh_array().add(elem);
ena[0] = 10; ena[1] = 11; ena[2] = 18; ena[3] = 17;
elem = new Omega_eh(9, 0, 1, 4, ena); omega_eh_array().add(elem);
( zero | N_bar_0[1] | zero | N_bar_1[1] );
// Q = ∫Γc NλT N̄u dΓ, where N̄u ≡ N c
if(gdc.type() == interface_elastic_foundation_gamma_h_type)
stiff &= -(((~N_lambda)*N_u_bar)|d_l); // Q0 {Γc-Γelastic foundation}
else stiff &= (((~N_lambda)*N_u_bar)|d_l); // Q1 {Γc-Γrigid punch}
}
}
Element_Formulation* Element_Formulation::type_list = 0;
Element_Type_Register element_type_register_instance;
static Elastic_Contact_Q4 elastic_contact_q4_instance(element_type_register_instance);
#include "\vs\ex\fe\contact_frictionless_sled_on_elastic_foundation\omega_h_n.cpp"
int main() {
C0 K_0, Q_0, K_1, Q_1, f_0, f_i, f_1, lambda, u_0, u_1;
interface_oh.contact_searching_algorithm(interface_oh, elastic_foundation_gamma_h_gd,
rigid_sled_gamma_h_gd, lambda_h, interface_gh, 0.05);
Matrix_Representation mr_K_0(elastic_foundation_gd);
Matrix_Representation mr_K_1(rigid_sled_gd);
Matrix_Representation_Couple mrc_Q_0(interface_elastic_foundation_gdc, 0, 0,
&(mr_K_0.rhs()) );
Matrix_Representation_Couple mrc_Q_1(interface_rigid_sled_gdc, 0, &(mrc_Q_0.rhs()),
&(mr_K_1.rhs()) );
for(int i = 0; i < 9; i++) {
mr_K_0.assembly(); K_0 &= (C0)(mr_K_0.lhs()); // K0
mr_K_1.assembly(); K_1 &= (C0)(mr_K_1.lhs()); // K1
mrc_Q_0.assembly(); Q_0 &= (C0)(mrc_Q_0.lhs()); // Q0
mrc_Q_1.assembly(); Q_1 &= (C0)(mrc_Q_1.lhs()); // Q1
f_0 &= (C0)(mr_K_0.rhs()); f_1 &= (C0)(mr_K_1.rhs()); f_i &= (C0)(mrc_Q_0.rhs()); // f0, f1, fi
Cholesky dK0(K_0); Cholesky dK1(K_1);
C0 K0_inv = dK0.inverse(); C0 QK_inv_0 = Q_0*K0_inv; C0 QKQ_0 = QK_inv_0*(~Q_0); // K0^-1, Q0 K0^-1 Q0^T
C0 K1_inv = dK1.inverse(); C0 QK_inv_1 = Q_1*K1_inv; C0 QKQ_1 = QK_inv_1*(~Q_1); // K1^-1, Q1 K1^-1 Q1^T
Cholesky mdQKQ(QKQ_0+QKQ_1, 1.e-12); // (Q0 K0^-1 Q0^T + Q1 K1^-1 Q1^T)^-1
// λ̂ = (Q0 K0^-1 Q0^T + Q1 K1^-1 Q1^T)^-1 (Q0 K0^-1 f0 + Q1 K1^-1 f1 - fi)
lambda &= mdQKQ * (QK_inv_0*f_0+QK_inv_1*f_1-f_i);
// û0 = K0^-1 (f0 - Q0^T λ̂), û1 = K1^-1 (f1 - Q1^T λ̂)
u_0 &= dK0*(f_0-(~Q_0)*lambda); u_1 &= dK1*(f_1-(~Q_1)*lambda);
lambda_h = lambda; lambda_h = interface_gd.gh_on_gamma_h();
u_0_h = u_0; u_0_h = elastic_foundation_gd.gh_on_gamma_h();
u_1_h = u_1; u_1_h = rigid_sled_gd.gh_on_gamma_h();
cout << "contact forces:" << endl << lambda_h << "elastic foundation displacement:" <<
endl << u_0_h << "rigid sled displacement:" << endl << u_1_h;
if(i < 8) {
for(int j = 8; j < 12; j++) {
int node_order = rigid_sled_gd.gh_on_gamma_h().node_order(j);
rigid_sled_gd.gh_on_gamma_h().gh_array()[node_order][0] = ((double)(i+1));
}
cout << "step " << i << " :" << endl;
rigid_sled_gd.u_h() = rigid_sled_gd.gh_on_gamma_h();
interface_oh.contact_searching_algorithm(interface_oh, elastic_foundation_gamma_h_gd,
rigid_sled_gamma_h_gd, lambda_h, interface_gh,
mr_K_0, mr_K_1, mrc_Q_0, mrc_Q_1, 0.05);
}
}
return 0;
}
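The elimination performed inside the loop of main() can be illustrated with a scalar analogue, in which each body has a single degree of freedom and there is one Lagrange multiplier. This is a minimal sketch in plain C++ rather than VectorSpace code; the struct ContactSolution and the function solve_two_body are hypothetical names, and the scalars K0, K1, Q0, Q1 stand in for the assembled matrices.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result of the scalar two-body contact solve.
struct ContactSolution {
    double lambda;  // contact force (Lagrange multiplier)
    double u0;      // displacement of body 0
    double u1;      // displacement of body 1
};

// Solves the coupled system
//   K0 u0 + Q0 lambda = f0,  K1 u1 + Q1 lambda = f1,  Q0 u0 + Q1 u1 = fi
// by eliminating u0 and u1, mirroring the steps in the listing above.
ContactSolution solve_two_body(double K0, double K1, double Q0, double Q1,
                               double f0, double f1, double fi) {
    assert(K0 > 0.0 && K1 > 0.0);
    const double QK0 = Q0 / K0;              // Q0 K0^-1
    const double QK1 = Q1 / K1;              // Q1 K1^-1
    const double S = QK0 * Q0 + QK1 * Q1;    // Q0 K0^-1 Q0^T + Q1 K1^-1 Q1^T
    assert(S > 0.0);
    ContactSolution r;
    r.lambda = (QK0 * f0 + QK1 * f1 - fi) / S;
    r.u0 = (f0 - Q0 * r.lambda) / K0;        // u0 = K0^-1 (f0 - Q0^T lambda)
    r.u1 = (f1 - Q1 * r.lambda) / K1;        // u1 = K1^-1 (f1 - Q1^T lambda)
    return r;
}
```

Calling solve_two_body(2, 4, -1, 1, 0, 0, -1) returns a solution that satisfies both equilibrium equations and the constraint Q0 u0 + Q1 u1 = fi; the opposite signs of Q0 and Q1 follow the sign convention used for the two coupling matrices in the element formulation.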
$$ R(\hat{u}_{i+1}) = R(\hat{u}_i + \delta\hat{u}_i) \cong R(\hat{u}_i) + \left.\frac{\partial R}{\partial \hat{u}}\right|_{\hat{u}_i} \delta\hat{u}_i = 0 $$ Eq. 5•224
where ûi+1 = ûi + δûi. The solution of the non-linear problem is obtained by requiring the residual vector to vanish; i.e., find the root of the equation R(ûi+1) = 0. Notice that the approximation step in Eq. 5•224 is the Taylor expansion up to first order. From this approximated equation, the incremental displacement δûi is the solution of the simultaneous linear algebraic equations
$$ \delta\hat{u}_i = -\left[ \left.\frac{\partial R}{\partial \hat{u}}\right|_{\hat{u}_i} \right]^{-1} R(\hat{u}_i) \equiv K_T^{-1} R(\hat{u}_i) $$ Eq. 5•225
where $K_T \equiv -\left.\partial R / \partial \hat{u}\right|_{\hat{u}_i}$ is the tangent stiffness matrix.
1. Simo and Taylor, 1985, “Consistent tangent operators for rate-independent elastoplasticity” in Computer
Methods in Applied Mechanics and Engineering, v. 48, p.101-118
2. Chapter 7 in Zienkiewicz and Taylor, 1991, “The finite element method”, 4th ed. v. 2, McGraw-Hill, UK.
Hence, the tangent stiffness matrix KT in Eq. 5•225 is obtained by taking the derivatives of R in Eq. 5•226 with respect to the solution û as
$$ \left.\frac{\partial R}{\partial \hat{u}}\right|_{\hat{u}_i} = -\sum_e \int_{\Omega_e} B^T \left.\frac{\partial \sigma}{\partial \varepsilon}\right|_{\hat{u}_i} B \, d\Omega = -\sum_e \int_{\Omega_e} B^T D_T B \, d\Omega \equiv -K_T $$ Eq. 5•227
where $D_T \equiv \partial^2 \Psi / \partial \varepsilon^2$ is the tangent moduli, with the Cauchy stress $\sigma \equiv \partial \Psi / \partial \varepsilon$, and where Ψ is the free energy function. With KT defined, the incremental displacement δûi is readily obtained by solving $\delta\hat{u}_i = K_T^{-1} R(\hat{u}_i)$. Note that the tangent moduli are the second derivatives of the free energy function with respect to the strain. The minimization is with respect to the free energy Ψ, and the equation $\delta\hat{u}_i = K_T^{-1} R(\hat{u}_i)$ is equivalent to Newton’s formula for optimization of a function f(x), δx = – f′(x) / f″(x) (obtained from the second-order Taylor expansion), with x += δx for the solution update.
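As a one-dimensional illustration of Newton’s formula for optimization quoted above, the following sketch minimizes a scalar function from its first and second derivatives. The function name newton_minimize and its parameters are hypothetical, and plain C++ with std::function is used in place of the VectorSpace C1 and C2 types.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Newton iteration for a stationary point of f(x), given callables for the
// first derivative df and the second derivative d2f.
double newton_minimize(const std::function<double(double)>& df,
                       const std::function<double(double)>& d2f,
                       double x, double tol = 1e-12, int max_iter = 50) {
    for (int it = 0; it < max_iter; ++it) {
        const double curvature = d2f(x);
        assert(curvature != 0.0 && "second derivative must not vanish");
        const double dx = -df(x) / curvature;  // delta x = - f'(x) / f''(x)
        x += dx;                               // solution update
        if (std::fabs(dx) < tol) break;        // converged
    }
    return x;
}
```

For f(x) = x⁴/4 – x, with f′(x) = x³ – 1 and f″(x) = 3x², the iteration started at x = 2 converges to the minimizer x = 1. In the finite element setting the second derivative is replaced by the tangent stiffness matrix KT and the division by a linear solve.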
Incremental Loading Loop: At the outer incremental loop, enclosed in the “for” statement, the incremental loading is added to the boundary at the beginning of each time step; i.e., at the first iteration (i = 0), δu0 is set to the incremental boundary condition of the time step. This quantity is then given to the element level for the computation of the residual (R). The residual is modified by the reaction caused by the incremental boundary condition as R -= KTδu0.
For the stress-strain relation in plasticity, which is path-dependent, we also need to introduce the total incremental displacement $\Delta\hat{u}_i = \sum_k \delta\hat{u}_k$ (with k = 0, 1, 2, ..., i). ∆ûi is the “total increment”: the summation of all incremental displacements from the first iteration up to the current iteration within the current time step. This quantity is updated at the global level in the “main()” function, and is referred to at the element level.
Global Newton-Raphson Iteration: The inner loop is the global Newton-Raphson iteration inside the do-while control statement. The heart of the algorithm is the member function “Matrix_Representation::assembly()” which performs three heavy-duty steps to be coded in the Element_Formulation. These three steps are (1) apply external load (Fext in Eq. 5•226) and subtract the internal force divergence term as in Eq. 5•226. The details of these three steps at the element level are in Section 5.3.2.
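The two nested loops can be sketched on a scalar model problem. The sketch below is not Listing 5•19: it assumes a hypothetical non-linear spring with internal force k u + c u³, and it applies the load in force increments, whereas the text above increments a prescribed boundary displacement. The names Spring and incremental_newton are illustrative only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar model: internal force F_int(u) = k u + c u^3.
struct Spring {
    double k, c;
    double internal_force(double u) const { return k * u + c * u * u * u; }
    double tangent(double u) const { return k + 3.0 * c * u * u; }  // K_T
};

// Outer "for" loop: incremental loading. Inner do-while: Newton-Raphson.
double incremental_newton(const Spring& sp, double total_load, int n_steps,
                          double tol = 1e-12, int max_iter = 30) {
    assert(n_steps > 0);
    double u = 0.0;  // converged solution of the previous load step
    for (int step = 1; step <= n_steps; ++step) {
        const double f_ext = total_load * step / n_steps;
        double du_total = 0.0;  // total increment: sum of all delta u so far
        double R = 0.0;
        int iter = 0;
        do {
            R = f_ext - sp.internal_force(u + du_total);     // residual
            const double du = R / sp.tangent(u + du_total);  // K_T^-1 R
            du_total += du;
            ++iter;
        } while (std::fabs(R) > tol && iter < max_iter);
        u += du_total;  // accept the step
    }
    return u;
}
```

With k = 1, c = 1 and a total load of 10 applied in 5 steps, the final displacement satisfies u + u³ = 10, i.e. u = 2. The variable du_total plays the role of the total increment ∆ûi within the current step.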
Listing 5•19 Incremental loading algorithm and the global Newton-Raphson iteration of the non-linear
problem
Theory of Elastoplasticity
Following the development in Simo and Taylor[1985]3 for the infinitesimal case, the deviatoric stress (s) and
deviatoric strain (e) tensors are defined as
with second-order unit tensor, 1, and the trace operator, tr( ). The von Mises yield criterion is
$$ f(\xi, \alpha, \kappa) \equiv \|\xi\| - \sqrt{\tfrac{2}{3}}\,\kappa(e^p) \le 0, $$ Eq. 5•229
$$ e^p = \int_0^t \sqrt{\tfrac{2}{3}}\,\|d^p(\tau)\|\, d\tau, $$ Eq. 5•230
with ξ = s – α, where α is the back stress for the kinematic hardening. The hardening function κ(ep), which depends on the plastic strain ep, accounts for the isotropic hardening, and dp is the plastic strain rate. The consistency condition, or the normality rule, in the theory of plasticity requires that the stress remain on the yield surface during the plastic deformation. This is equivalent to assuming maximum plastic dissipation for a quasi-equilibrium process in the context of irreversible thermodynamics. From the consistency condition, we get the deviatoric stress-strain rate constitutive equation as
1
γ ≡ d p = -----------------------------
κ' + H’α
Eq. 5•232
1 + 3µ -------------------
µ is the shear modulus, and $\hat{n} = \xi^{TR}_{n+1} / \|\xi^{TR}_{n+1}\|$, where superscript “TR” denotes the trial elastic state. The primes in κ and Hα denote the first derivative with respect to ep. The so-called continuum elastoplastic tangent moduli $a^{ep}$ for the constitutive law $\dot{\sigma} = a^{ep} : \dot{\varepsilon}$ in rate form is thus obtained as
[Figure: radial return mapping. The elastic predictor takes sn to the trial state sTRn+1, and the plastic corrector returns it to sn+1 on the yield surface.]
where sn is obtained from stored data of the last time step, and superscript “TR” denotes the trial elastic state. We
also have
$$ \xi^{TR}_{n+1} = s^{TR}_{n+1} - \alpha_n, \quad \text{and} \quad \hat{n} = \frac{\xi^{TR}_{n+1}}{\|\xi^{TR}_{n+1}\|} $$ Eq. 5•235
where back stress αn is also obtained from stored data of the last time step.
Plastic-Corrector: The consistency condition requires that the state of stress remain on the yield surface during plastic deformation. This leads to
$$ \phi(\gamma \Delta t) \equiv -\sqrt{\tfrac{2}{3}}\,\kappa(e^p_{n+1}) + \|\xi^{TR}_{n+1}\| - \left( 2\mu\,\gamma\Delta t + \sqrt{\tfrac{2}{3}}\,\Delta H_\alpha \right) = 0, $$ Eq. 5•236
where, in the last two terms, 2µγ∆t is the magnitude of the plastic relaxation stress and √(2/3) ∆Hα is the magnitude of the translation on the “π-plane” of the yield surface due to kinematic hardening, with the “π-plane” proportional length-factor √(2/3) considered. ep is integrated as
Defining λ = γ∆t, the local Newton-Raphson iteration to enforce the consistency condition provides Newton’s formula for numerical root-finding on Eq. 5•236 as
the back stress and deviatoric stress are integrated for updating history data according to
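$$ \alpha_{n+1} = \alpha_n + \sqrt{\tfrac{2}{3}}\,\Delta H_\alpha(e^p_{n+1})\,\hat{n}, \quad \text{and} \quad s_{n+1} = \alpha_{n+1} + \sqrt{\tfrac{2}{3}}\,\kappa(e^p_{n+1})\,\hat{n} $$ Eq. 5•239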
where,
$$ \beta \equiv \sqrt{\tfrac{2}{3}}\,\frac{\kappa_{n+1} + \Delta H_\alpha}{\|\xi^{TR}_{n+1}\|}, \quad \text{and} \quad \bar{\gamma} \equiv \frac{1}{1 + \dfrac{[\kappa' + H'_\alpha]_{n+1}}{3\mu}} - (1 - \beta) = \gamma - (1 - \beta) $$ Eq. 5•242
We can check this formula by verifying that, in the limit of an infinitesimal incremental loading step, the consistent tangent moduli degenerate to the continuum tangent moduli; i.e., β = 1, and thus γ̄ = γ – (1 – β) = γ. This follows immediately from the definitions of β and γ̄ in Eq. 5•242.
where sn is obtained from history data. In order to get the trial deviatoric stress, we first need to compute the ele-
ment total incremental displacement (∆ û e = ∆ û +∆u, free plus fixed degree of freedom), and then total incre-
mental deviatoric strain (∆e = Bdev ∆ û e). Then, we can get to the 2µ∆e term in the elastic predictor (Eq. 5•243).
Note that Bdev is the deviatoric part of the B matrix. This trial state of elastic deviatoric stress (sTRn+1) may or
may not shoot out of the yield surface. This is inspected by checking the yield radius (R) against the norm of
ξTRn+1, where ξTRn+1 = sTRn+1 - αn. The back stress αn is obtained from the history data. If the norm of ξTRn+1 is
smaller than or equal to R, the state of stress remains inside the yield surface in the elastic state, we have sn+1 =
sTRn+1. Otherwise, the constraint is violated. The state of the stress is outside the yield surface and the plastic-cor-
rector follows.
The plastic-corrector returns the state of stress back to the yield surface (the consistency condition); i.e., it satisfies $\phi(\lambda) \equiv -\sqrt{2/3}\,\kappa(e^p_{n+1}) + \|\xi^{TR}_{n+1}\| - (2\mu\lambda + \sqrt{2/3}\,\Delta H_\alpha) = 0$ (with λ = γ∆t) in Eq. 5•236. With the VectorSpace C++ Library, root-finding is very simple. Just declare C1 type variables and a function for this equation. The Newton iteration is furnished by the formula δλ = – φ(λ) ⁄ dφ(λ), with the solution updated by λ += δλ. The C++ code parallels the mathematical expression exactly. After we obtain the converged result from the local Newton iteration, we can update s, α, and ep according to Eq. 5•237 and Eq. 5•239.
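The local Newton iteration for the plastic-corrector can be sketched in plain C++ as below. Several assumptions are made and should be read as such: kinematic hardening is omitted (∆Hα = 0), the plastic strain is updated as ep(n+1) = ep(n) + √(2/3) λ, and the derivative of the hardening function is supplied by the caller instead of being obtained from VectorSpace C1 variables. The names ReturnResult and radial_return are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical result of the local plastic-corrector iteration.
struct ReturnResult {
    double lambda;   // lambda = gamma * dt
    double ep;       // updated equivalent plastic strain
    int iterations;  // number of Newton updates performed
};

// Solves phi(lambda) = -sqrt(2/3) kappa(ep) + |xi_TR| - 2 mu lambda = 0,
// assuming ep = ep_n + sqrt(2/3) lambda and no kinematic hardening.
ReturnResult radial_return(double xi_tr_norm, double mu, double ep_n,
                           const std::function<double(double)>& kappa,
                           const std::function<double(double)>& dkappa,
                           double tol = 1e-12, int max_iter = 50) {
    assert(mu > 0.0);
    const double c = std::sqrt(2.0 / 3.0);
    ReturnResult r{0.0, ep_n, 0};
    for (; r.iterations < max_iter; ++r.iterations) {
        const double phi = -c * kappa(r.ep) + xi_tr_norm - 2.0 * mu * r.lambda;
        if (std::fabs(phi) < tol) break;  // consistency condition satisfied
        // d(phi)/d(lambda) by the chain rule through ep(lambda)
        const double dphi = -(2.0 / 3.0) * dkappa(r.ep) - 2.0 * mu;
        r.lambda += -phi / dphi;          // delta lambda = -phi / dphi
        r.ep = ep_n + c * r.lambda;
    }
    return r;
}
```

For a linear hardening law κ(ep) = Y0 + H ep the function φ is linear in λ, so a single Newton update reproduces the closed-form value λ = (‖ξTR‖ – √(2/3)(Y0 + H ep(n))) / (2µ + 2H/3); a saturating hardening law would need several updates.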
The consistent tangent moduli (Eq. 5•241) are readily obtained with all these parameters available. The Cauchy stress is computed with the averaged volumetric strain εv = Bvol ûe = tr(ε)1, which is then plugged into the second term on the right-hand side of Eq. 5•240. The rest of Program Listing 5•20 is the (tangent) stiffness and the residual. The stiffness is computed as defined in
$$ \int_{\Omega_e} B^T a^{ep} B \, d\Omega $$
with B̄ in place of B for the B̄-formulation. The residual only needs to be computed for the non-linear problem, where we need to subtract the internal stress divergence term from the “force” vector as
$$ \int_{\Omega_e} B^T \sigma \, d\Omega $$
The implementation of the radial return mapping method is shown in Program Listing 5•20.
// εv = Bvol ûe = tr(ε)1
for(int i = 0; i < nen; i++)
eps_v += B_bar_vol_q(i*ndf)*ul[i*ndf] + B_bar_vol_q(i*ndf+1)*ul[i*ndf+1];
eps_v *= 3.0;
sigma_q = s_q + K_*eps_v; // σn+1 = sn+1 + K tr(ε) 1
}
stiff &= ((~B_bar)*Dt*B_bar)|dv; // stiffness += ∫Ωe B^T a^ep B dΩ
force &= -((~B_bar)*sigma)|dv; // force -= ∫Ωe B^T σ dΩ
}
Listing 5•21 Radial return mapping algorithm for the elastoplastic element.
Figure 5•16 Perforated strip under uniaxial extension. The labeled dimensions are 18 m, 5 m, and 5 m, with a prescribed displacement δu = 0.01 m.
5.3.4 Perforated Strip under Uni-axial Extension
A benchmark test problem for elastoplasticity is shown in Figure 5•16. For the experimental results see Theocaris and Marketos[1964]1. The geometry and its boundary conditions are described below. Only a quadrant of the problem is modeled due to the symmetry of the problem. Plane strain is assumed. The material properties are Young’s Modulus E = 70 MPa, Poisson ratio ν = 0.2. The isotropic hardening law is given specifically as
where, Y0 = Y∞ = 0.243 MPa, ι = 0.1. No kinematic hardening is assumed, for simplicity. This test problem will
also be used in the two implementations of the finite deformation elastoplasticity in Section 5.4 for comparison.
The project “elastoplasticity” in project workspace file “fe.dsw” implements the example problem in this section and the algorithm discussed in the last section. The yield ratio contours of the perforated strip after 10 incremental loading steps are shown in Figure 5•17. The plastic zone develops in the area where the highest maximum shear values are expected. This zone runs about 45° upwards from point A, forming a plastic enclave. This direction is mostly parallel to the point-wise maximum shear directions inside this zone.
Figure 5•17 Contours of yield ratio (normalized to initial yield value) after accumulating 10 incremental steps (δu = 0.01 m; total stretching ~0.56%). Contour levels range from 0.00 to 3.50.
For the non-linear problem at hand, the iterative process requires the “matrix assembly” and “solution phase” to be processed many times, compared to a linear problem where this process is gone through only once. Therefore, the computing time becomes a concern for us. However, the computing time is very sensitive to the size of the problem. We use “n” to represent the size of a problem, which may stand for the total node/element number of a problem. The element formulation in the assembly process is linearly proportional to “n”. We denote this linear dependency as O(n). The memory space required for storing the global stiffness matrix is proportional to n2, denoted as O(n2), while the matrix solution problem is known to be a process of O(n3). As the size of the problem increases, the matrix solution process becomes very costly. Therefore, optimization methods, introduced in Chapter 2, that alleviate the matrix solution process are most desirable when the problem size becomes large. We discuss (1) the conjugate gradient method, which completely avoids the need for inverting a matrix, and (2) the quasi-Newton BFGS method, which inverts a matrix only once in a while. A word of caution: as you have seen in Chapter 2, the classical Newton method is the most powerful one, with a quadratic convergence rate. Use of these other methods may slow down the convergence rate substantially. The trade-off is that many more iterations are needed to reach the same level of convergence when the second-order information (the inverse of the Hessian, which in this case is the decomposition of the global stiffness matrix) is avoided either partially or completely.
$$\Pi(u) = \frac{1}{2}\, u^T K u - u^T f$$ Eq. 5•245
where K is the global stiffness matrix, f is the global force vector, and we drop the “hat” sign and write the nodal solution as u for simplicity. For the conjugate gradient method the initial search direction is taken as the negative gradient as
where Du is the directional derivative, and r0 is the residual. The solution u, for the next iteration, is updated along the search direction p as
The step length α that minimizes the objective functional for this quadratic programming case can be shown to be¹
$$\alpha_k = -\frac{g_k^T p_k}{p_k^T K p_k}$$ Eq. 5•248
The next search direction is chosen to be K-orthogonal (conjugate) to the set of previously chosen search directions as
$$p_{k+1} = -g_{k+1} + \beta_k\, p_k, \quad \text{where} \quad \beta_k = \frac{g_{k+1}^T K p_k}{p_k^T K p_k}$$ Eq. 5•249
and $g_{k+1}$ is g evaluated at $u_{k+1}$. In Chapter 2, the conjugate gradient method is introduced with a line search method so that no Hessian (global stiffness matrix) is required. In that case, the convergence will be extremely slow. For the finite element method, not assembling the global stiffness matrix has the advantage of avoiding the memory space that grows as O(n²), as discussed earlier. However, if memory is not much of a concern, the vital second-order information (the Hessian) helps convergence a lot. The decision to assemble the global stiffness matrix also rests on the fact that the computation to form the element stiffness matrix is unavoidable. In order to compute the residual according to Eq. 5•246, the element stiffness ke must be computed anyway. Since the element stiffness matrices are computed, assembly of the global stiffness matrix provides vital second-order information which facilitates the convergence of the conjugate gradient method. Eq. 5•246 to Eq. 5•249 are implemented for each incremental loading step, with its index “i”, as
1. See p. 244 in Luenberger, D.G., 1984, “Linear and Nonlinear Programming”, 2nd ed., Addison-Wesley Publishing Company, Inc., Reading, Massachusetts.
The global stiffness matrix is assembled in this algorithm, but no matrix solution by a direct method is performed. The convergence rate is still very slow compared to the classical Newton-Raphson method. The implementation of the conjugate gradient method for the quadratic programming case can be activated by setting the macro definition “__TEST_CONJUGATE_GRADIENT_METHOD” in project “elastoplasticity” of project workspace file “fe.dsw”.
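The iteration of Eq. 5•248 and Eq. 5•249 can be sketched in a few lines of standard C++. This is only an illustration under our own assumptions, not the VectorSpace listing of the workbook: K is stored as a small dense matrix, the gradient is taken as g = K u - f, and the names conjugateGradient, matVec and dot are hypothetical helpers introduced here.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense stand-in for the assembled global stiffness K

// y = K x
Vec matVec(const Mat& K, const Vec& x) {
    Vec y(K.size(), 0.0);
    for (std::size_t i = 0; i < K.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += K[i][j] * x[j];
    return y;
}

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Minimize Pi(u) = 1/2 u^T K u - u^T f for a symmetric positive definite K.
Vec conjugateGradient(const Mat& K, const Vec& f, Vec u,
                      int maxIter = 1000, double tol = 1e-12) {
    const std::size_t n = u.size();
    Vec g = matVec(K, u);  // gradient g = K u - f (assumed form)
    for (std::size_t i = 0; i < n; ++i) g[i] -= f[i];
    Vec p(n);
    for (std::size_t i = 0; i < n; ++i) p[i] = -g[i];  // initial direction p_0 = -g_0

    for (int k = 0; k < maxIter && std::sqrt(dot(g, g)) > tol; ++k) {
        const Vec Kp = matVec(K, p);
        const double pKp = dot(p, Kp);
        const double alpha = -dot(g, p) / pKp;  // step length, Eq. 5•248
        for (std::size_t i = 0; i < n; ++i) {
            u[i] += alpha * p[i];   // u_{k+1} = u_k + alpha_k p_k
            g[i] += alpha * Kp[i];  // g_{k+1} = K u_{k+1} - f
        }
        const double beta = dot(g, Kp) / pKp;  // Eq. 5•249
        for (std::size_t i = 0; i < n; ++i) p[i] = -g[i] + beta * p[i];
    }
    return u;
}
```

Note that only matrix-vector products with K appear; no factorization or inverse of K is ever formed, which is the point made in the text.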
$$B_{i+1}^{BFGS} = B_i + \left( 1 + \frac{q_i \cdot (B_i q_i)}{q_i \cdot p_i} \right) \frac{p_i \otimes p_i}{p_i \cdot q_i} - \frac{p_i \otimes (B_i q_i) + (B_i q_i) \otimes p_i}{q_i \cdot p_i}$$ Eq. 5•250
For B of size “n”, step 1 to step 2 can be repeated in cycles of “m” iterations (m < n). This m-iteration loop can be repeated many times until all the convergence criteria are met. This additional outer loop is known as the partial quasi-Newton method, which aids the convergence rate of a numerical iterative process, as opposed to the theoretical one in which convergence is guaranteed in “n” steps. The implementation of the BFGS method is as follows
With the aid of the inverse of the Hessian, B, the convergence rate shows dramatic improvement. As the size of the matrix increases, saving time in inverting the global stiffness matrix, an O(n³) process, becomes more and more critical. This implementation can be activated by setting the macro definition “__TEST_QUASI_NEWTON_BFGS_METHOD” in project “elastoplasticity” of workspace file “fe.dsw”.
Strain Measures
We first check the effect of finite rotation on the infinitesimal strain. A rigid body rotation can be expressed
as an orthogonal transformation, with θ as the rotation angle,
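$$\varepsilon \equiv \frac{1}{2}(\nabla u + \nabla^T u) = \frac{1}{2}(R + R^T - 2I) = \begin{bmatrix} \cos\theta - 1 & 0 \\ 0 & \cos\theta - 1 \end{bmatrix} \cong \begin{bmatrix} -\frac{1}{2}\theta^2 + O(\theta^4) & 0 \\ 0 & -\frac{1}{2}\theta^2 + O(\theta^4) \end{bmatrix}$$ Eq. 5•252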
The last approximation step is the Taylor expansion of cos θ. Therefore, a finite rotation, which is a rigid body motion, would excite unwanted strain of the order of θ² (θ in radians). In many practical engineering applications we are
Stress Rates
In small deformation problems, the Cauchy stress is used in the formulation. Unfortunately, the Cauchy
stress rate is not objective. Consider again applying a rigid body rotation to a body, where no stress should be
excited. Since the stress is a second order tensor, it transforms with the rotation tensor according to the transfor-
mation rule
$$\sigma' = R\, \sigma\, R^T$$ Eq. 5•253
$$W \equiv \dot{R} = \frac{1}{2}(\nabla \dot{u} - \nabla^T \dot{u})$$ Eq. 5•254
Note that W is skew-symmetric, so $W^T = -W$. For two coordinate systems with x = X initially, and therefore R = I initially, taking the time derivative of Eq. 5•253 (then evaluating at x = X and R = I) gives
$$\dot{\sigma}' = W \sigma + \sigma W^T = W \sigma - \sigma W$$ Eq. 5•255
If σ ≠ 0, $\dot{\sigma}'$ will not vanish. However, no stress rate is expected to be excited. Therefore, $\dot{\sigma}'$ is the spurious stress rate that needs to be corrected. The Jaumann stress rate takes out this spurious stress rate and is defined as
$$\tilde{\sigma} = \dot{\sigma} - W \sigma + \sigma W$$ Eq. 5•256
$\tilde{\sigma}$ is also known as the co-rotational stress rate. For a more formal treatment of the objective stress rate see a standard text.1
The Jaumann stress rate is said to be objective only with respect to isomorphism. Isomorphism means that the transformation is a one-to-one mapping with the inverse of the transformation defined. We are also interested in diffeomorphism. A diffeomorphism is a transformation that further requires the derivatives of the transformation to be one-to-one mappings with inverses. Historically, there are many other objective stress rates. They are objective with respect to either isomorphism or diffeomorphism.2 Rates which are objective with respect to diffeomorphism are called covariant. Therefore, the covariance requirement demands that a physical quantity be invariant under an arbitrary Cⁿ transformation. Covariance lets us leap forward to the essence of modern Einstein’s rela-
Material Moduli
We restrict our discussion of the constitutive law to what is relevant to the strain measure and stress rate issues developed above. First of all, let us note that the material moduli form a fourth-order tensor. The transformation applied to a fourth-order tensor is similar to that of Eq. 5•253 for the second-order tensor, only the spurious terms generated will be twice as ugly as those in Eq. 5•255. Disregarding what those ugly spurious terms really are, it suffices to say that the material moduli so derived are not invariant with respect to a rigid body motion. This has a certain impact on the material moduli of anisotropic materials. For isotropic materials, the constitutive laws are required to be written, by the choice of their independent parameters, to be invariant with respect to rotation. We are not so fortunate with anisotropic materials. It becomes our responsibility to make the moduli objective by subtracting the corresponding spurious terms, similar to what was done in Eq. 5•256. The argument above for anisotropic material moduli is therefore completely parallel to that for the stress rate. Secondly, we already know that the Cauchy stress rate is not objective (the spurious stress rate $\dot{\sigma}' \neq 0$ in Eq. 5•255 occurs). When we construct the constitutive equation in stress-strain rate form, we should expect the constitutive equation to be defined using the objective stress rates only; i.e., not the Cauchy stress rate. Then, if it is more convenient for computation, the objective stress rates can be expressed in terms of the Cauchy stress rate with its corresponding spurious terms. When the objective stress rate is used, the problem with the material moduli resolves automatically.
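[Figure 5•18: a configuration φ maps the point X to x = φ(X), and the tangent map Tφ maps the tangent vector W_X at X to Tφ • W_X.]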
Kinematics
A simple body is an open set $\mathcal{B} \subset \mathbb{R}^n$. Define its containing space as $\mathcal{S} = \mathbb{R}^n$. A C¹ configuration is represented by a mapping $\varphi: \mathcal{B} \rightarrow \mathcal{S}$, see Figure 5•18. The tangent space to the set $\mathcal{B}$ at point X is denoted as $T_X\mathcal{B}$. The tangent space is the vector space $\mathbb{R}^n$ considered as vectors with base point X. Similarly, the tangent space to $\mathcal{S}$ is $T_x\mathcal{S}$. A tangent vector is $W_X = (X, W) \in T_X\mathcal{B}$, with X denoting its base point. The tangent map of φ is defined as $T\varphi: T_X\mathcal{B} \rightarrow T_x\mathcal{S}$, with Tφ(X, W) = (φ(X), Dφ(X) • W), where “Dφ(X) • W” denotes the derivative of φ at X in the direction of W. Then, Tφ • W_X is called the “push-forward” of W by φ. The push-forward is also denoted as $\varphi_*$.
Deformation Gradient
The tangent of φ is denoted F, the deformation gradient of φ; i.e., F = Tφ. Let $\{X^A\}$ and $\{x^a\}$ denote coordinate systems on the body $\mathcal{B}$ and the containing space $\mathcal{S}$, respectively. The matrix F with respect to the coordinate bases $E_A(X)$ and $e_a(x)$ is given by $F = F^a{}_A\, e_a \otimes E^A$ (= GRAD φ(X) ≡ ∇ ⊗ φ(X), where the operator ⊗ denotes the tensor product)
$$F^a{}_A(X) = \frac{\partial \varphi^a}{\partial X^A}(X), \quad \text{note that} \quad dx^a = \frac{\partial x^a}{\partial X^A}\, dX^A$$ Eq. 5•257
where $dx^a$ and $dX^A$ are the infinitesimal position vectors, which are the tangent vectors in $\mathcal{S}$ and $\mathcal{B}$. From Eq. 5•257, $dx^a = F \cdot dX^A$ ($= T\varphi \cdot dX^A = \varphi_*\, dX^A$) is the “push-forward” of $dX^A$ by φ. The Jacobian is J = det F. The restriction that J > 0 expresses the impenetrability of matter, in that the local invertibility of F is required. Therefore, the relative orientation of the line elements is preserved under deformation.
The metric tensor on $\mathcal{S}$ is defined as $g_{ab}(x) = \langle e_a, e_b \rangle_x$, and the metric tensor on $\mathcal{B}$ is defined as $G_{AB}(X) = \langle E_A, E_B \rangle_X$, where ⟨ , ⟩ is the inner product in $\mathbb{R}^n$. The transpose of F is defined as
$$(F^T(x))^A{}_a = g_{ab}(x)\, F^b{}_B(X)\, G^{AB}(X)$$ Eq. 5•258
Note that the deformation gradient F is a two-point tensor which can be considered as a tensor with “two legs”; one on $\mathcal{B}$ and the other on $\mathcal{S}$. Therefore, the question of whether F is symmetric is irrelevant!
Since F with two different “legs” cannot be symmetric, we want to find ways to define “symmetrized” quantities from it; i.e., somehow get the off-diagonal components to have the same value and “leg”. Therefore, the
$$dx^2 - dX^2 = dX \cdot (F^T F - G) \cdot dX = dX \cdot \underbrace{(C - G)}_{2E} \cdot dX$$ Eq. 5•259
The Lagrangian (material, or Green) strain tensor E is defined by 2E = (C − G) (where G = I in Cartesian coordinates, with I the identity on the tangent space at X). Similarly,
$$dx^2 - dX^2 = dx \cdot (g - (F F^T)^{-1}) \cdot dx = dx \cdot \underbrace{(g - b^{-1})}_{2e} \cdot dx$$ Eq. 5•260
The Eulerian (spatial) strain tensor e is defined by 2e = (g − b⁻¹) (where g = i, with i the identity on the tangent space at x).
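To make the difference between the strain measures concrete, the sketch below (standard C++, 2 × 2 matrices in Cartesian coordinates so that G = I; the function names are our own) evaluates both the infinitesimal strain and the Green strain of Eq. 5•259 for a rigid rotation. The Green strain vanishes, while the infinitesimal strain reports the spurious value cos θ - 1 on the diagonal.

```cpp
#include <array>
#include <cmath>

using Mat2 = std::array<std::array<double, 2>, 2>;

// Infinitesimal strain eps = 1/2 (H + H^T) with H = F - I (displacement gradient).
Mat2 smallStrain(const Mat2& F) {
    Mat2 e{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            const double id = (i == j) ? 1.0 : 0.0;
            e[i][j] = 0.5 * ((F[i][j] - id) + (F[j][i] - id));
        }
    return e;
}

// Green (Lagrangian) strain E = 1/2 (F^T F - I), i.e. 2E = C - G with G = I.
Mat2 greenStrain(const Mat2& F) {
    Mat2 E{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            double Cij = 0.0;
            for (int k = 0; k < 2; ++k) Cij += F[k][i] * F[k][j];
            E[i][j] = 0.5 * (Cij - ((i == j) ? 1.0 : 0.0));
        }
    return E;
}

// Deformation gradient of a rigid rotation by the angle theta (radians).
Mat2 rotation(double theta) {
    const double c = std::cos(theta), s = std::sin(theta);
    return Mat2{{ {c, -s}, {s, c} }};
}
```

For θ = 0.3 rad the infinitesimal strain already shows about -4.5% on the diagonal, which is far from negligible compared to typical yield strains.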
$$S = \sum_{i=1}^{3} \lambda_i\, (p_i \otimes p_i)$$ Eq. 5•261
This leads to det(Q - I) = 0, indicating there exists a unit vector p such that
$$Q\, p = p = Q^T p$$ Eq. 5•263
The orthogonal tensor Q can be expressed as the following equation (for derivation see P. Chadwick[1976]1)
1. see p. 37 in Chadwick, P., 1976, “Continuum mechanics, concise theory and problems”, John Wiley & Sons,
New York.
Figure 5•19 Stretches and Rotation for the Polar Decomposition
One recognizes that, in Eq. 5•264, (q cosθ-r sinθ)q+(q sinθ+r cosθ)r is a rotation about the p-axis on the (q, r)-plane. Figure 5•19 illustrates both the interpretation of S as stretches λ1, λ2, λ3, and of Q as a rotation around the p-axis.
Recall that C and b are symmetric and positive definite. Define $U = \sqrt{C}$ and $V = \sqrt{b}$, where U and V are the right stretch tensor and the left stretch tensor, respectively. U and V have common eigenvalues λi. The eigenvectors pi of U and the eigenvectors qi of V are called the referential stretch axes and the current stretch axes, respectively.
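For a 2 × 2 deformation gradient the square root U = √C has a closed form that follows from the Cayley-Hamilton theorem, U = (C + det(U) I) / tr(U) with det(U) = √(det C) and tr(U) = √(tr C + 2 det U). The sketch below uses this identity to compute the right polar decomposition F = R U in standard C++. It is an illustration under the assumption det F > 0 and is restricted to two dimensions; the struct and function names are our own.

```cpp
#include <array>
#include <cmath>

using Mat2 = std::array<std::array<double, 2>, 2>;

Mat2 mul(const Mat2& A, const Mat2& B) {
    Mat2 C{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k) C[i][j] += A[i][k] * B[k][j];
    return C;
}

Mat2 transpose(const Mat2& A) {
    return Mat2{{ {A[0][0], A[1][0]}, {A[0][1], A[1][1]} }};
}

struct Polar {
    Mat2 R;  // rotation
    Mat2 U;  // right stretch tensor
};

// Right polar decomposition F = R U for a 2x2 F with det F > 0.
Polar polarDecomposition(const Mat2& F) {
    const Mat2 C = mul(transpose(F), F);  // right Cauchy-Green tensor
    const double detU = std::sqrt(C[0][0] * C[1][1] - C[0][1] * C[1][0]);
    const double trU = std::sqrt(C[0][0] + C[1][1] + 2.0 * detU);
    Mat2 U = C;
    U[0][0] += detU;
    U[1][1] += detU;
    for (auto& row : U)
        for (double& x : row) x /= trU;  // U = (C + det(U) I) / tr(U)
    const Mat2 Uinv = {{ {U[1][1] / detU, -U[0][1] / detU},
                         {-U[1][0] / detU, U[0][0] / detU} }};
    return {mul(F, Uinv), U};  // R = F U^{-1}
}
```

The left stretch tensor then follows, if needed, as V = F Rᵀ.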
[Figure: right polar decomposition (stretch by U along pi with stretches λi, then rotate by R) and left polar decomposition (rotate by R, then stretch by V along qi with stretches λi).]
$$\varepsilon = e + \frac{1}{3}\, \Theta\, g$$ Eq. 5•265
where the infinitesimal strain tensor is ε ≡ (∇u + ∇ᵀu) ⁄ 2, and the mean dilatation is Θ ≅ J ≡ ∇•u. This kinematic split is important for the kinematic constraint of incompressible materials, and is useful in the context of J2 plasticity. In finite deformation this additive split is replaced by a multiplicative split
$$F = J^{1/3}\, \hat{F}$$ Eq. 5•266
where “^” denotes the volume-preserving part. Therefore, the corresponding right Cauchy-Green tensor is
$$\hat{C} = J^{-2/3}\, C = \hat{F}^T \hat{F}$$ Eq. 5•267
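The multiplicative split is straightforward to evaluate. The following sketch in standard C++ (3 × 3 matrices, assuming J = det F > 0; the function names are ours) scales F by J^(-1/3) and forms the volume-preserving right Cauchy-Green tensor, so that one can check numerically that the scaled deformation gradient has unit determinant.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

double det(const Mat3& A) {
    return A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
         - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
         + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]);
}

// Volume-preserving part F_hat = J^{-1/3} F (Eq. 5•266), assuming J = det F > 0.
Mat3 isochoricPart(const Mat3& F) {
    const double scale = 1.0 / std::cbrt(det(F));
    Mat3 Fhat = F;
    for (auto& row : Fhat)
        for (double& x : row) x *= scale;
    return Fhat;
}

// C_hat = F_hat^T F_hat = J^{-2/3} C (Eq. 5•267).
Mat3 isochoricRightCauchyGreen(const Mat3& F) {
    const Mat3 Fh = isochoricPart(F);
    Mat3 C{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k) C[i][j] += Fh[k][i] * Fh[k][j];
    return C;
}
```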
The infinitesimal version of the additive split (Eq. 5•265) can be understood as the linearization of the multiplicative split (Eq. 5•267) about φ = I as follows. Let $\mathcal{C}$ denote the set of all configurations φ. A tangent vector to $\mathcal{C}$ at $\varphi_0 \in \mathcal{C}$ is a vector field u covering φ0. The variation εu, where $\varepsilon \in \mathbb{R}$, is the infinitesimal deformation imposed on the finite deformation φ0, written as εu = δφ0, the variation of the configuration. Considering an incremental motion $\varphi_0^\varepsilon = \varphi_0 + \varepsilon u \circ \varphi_0$, the deformation gradient with infinitesimal increment, $F_\varepsilon$, is
The linearization of the volume-preserving right Cauchy-Green tensor Ĉ in Eq. 5•267 gives “2e”, where e, unfortunately an overloaded symbol, is the infinitesimal deviatoric strain tensor. The derivation is explained in the following. The derivatives of the Jacobian and of the right Cauchy-Green tensor evaluated at ε = 0 are:
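$$\left. \frac{d}{d\varepsilon} \right|_{\varepsilon=0} \det(F_\varepsilon) = \det(F)\, \mathrm{tr}(\nabla u), \quad \text{and} \quad \left. \frac{d}{d\varepsilon} \right|_{\varepsilon=0} C_\varepsilon = F^T (\nabla u + \nabla^T u)\, F$$ Eq. 5•269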
Figure 5•21 Infinitesimal deformation imposed on finite deformation
$$L_v t = \varphi_*\, \frac{\partial}{\partial t}\, (\varphi^* t)$$ Eq. 5•270
where $\varphi^*$ is the “pull-back” by φ, defined as the action taken by the tangent map Tφ⁻¹. In words, the procedure of taking the Lie derivative is:
(3) “push-forward” the result of the time derivative to the current configuration using F.
This objective spatial rate is sometimes called a convected time derivative. It is the change of a spatial object relative to the flow of the spatial velocity field. Therefore, it is naturally objective. As we have mentioned earlier, the objective flux (rate) defined by the Lie derivative is objective with respect to diffeomorphisms and is called covariant. A covariant rate transforms tensorially, independent of any preferred coordinate system.
Strain rates are essential in defining constitutive laws in rate form, for example, for viscosity and plasticity. The covariant nature derived from the Lie derivative plays an important role. The following results of the Lie derivatives of the spatial tensors g and b are important.
We discuss the derivation in words, with two additional pull-back identities $\varphi^*(g) = C$ and $\varphi^*(b) = G$ [see Marsden and Hughes]1. In the first part of this equation, the Lie derivative of the spatial metric tensor g gives two times the spatial rate of deformation tensor d ( $\equiv (\nabla \dot{u} + \nabla^T \dot{u}) / 2$ ). The result follows immediately from the definition of the Lie derivative per se. The first step is: the “pull-back” of g is C, the right Cauchy-Green tensor; i.e., $\varphi^*(g) = C$. The second step is: taking the time derivative of C yields 2D by definition; i.e., 2D ≡ ∂C ⁄ ∂t, where D is the material rate of deformation tensor. The third step is: the “push-forward” of D is the spatial rate of deformation tensor d; i.e., $\varphi_*(D) = d$. The second part of Eq. 5•271 shows that the left Cauchy-Green tensor b is “dragged” by the flow v. Again, following the procedure defined for the Lie derivative, in the first step, the “pull-back” of b is the material metric tensor G; i.e., $\varphi^*(b) = G$. In the second step, the material metric tensor G is constant with respect to time, so its time derivative is zero. In the third step, the “push-forward” of zero gives zero. Eq. 5•271 is useful for developing the constitutive law in rate form.
Piola Transformation: Let y be a vector field on the space $\mathcal{S}$, and Y a vector field on the body $\mathcal{B}$. The Piola transform of y to Y is given by
This stress definition gives a simple form of the equations of motion. However, the first Piola-Kirchhoff stress tensor is not symmetric. Recall that the mixed upper- and lower-case indices mean that the first Piola-Kirchhoff stress is a two-point tensor with “two legs”. This disadvantage of the first Piola-Kirchhoff stress tensor leads to the definition of the second Piola-Kirchhoff stress tensor. The second Piola-Kirchhoff stress tensor S is defined by pulling the “first leg” of P back by φ as
Both τ and σ are defined only in the spatial configuration. We observe that they differ by a scalar factor of J, the Jacobian. The reason for this proliferation of stresses (τ and σ) in the same configuration is whether one is performing the Piola transformation, with the J factor considered, or merely pulling back and pushing forward the tensor objects between the spatial and material configurations.
In this section, we mentioned one of the disadvantages of the Cauchy stress in describing the Green elastic material. The other drawback of the Cauchy stress is that its rate is not objective (while the rate of another symmetric stress tensor, the second Piola-Kirchhoff stress tensor S, is objective). Therefore, the Cauchy stress rate is not suitable for defining constitutive laws formulated in rate form, e.g., for viscosity and plasticity. The subject of objective rates has been very controversial. All the “objective rates” of second-order tensors, for example the Oldroyd rate, Truesdell rate, and Jaumann rate, are in fact either (1) the Lie derivative of the Cauchy stress tensor $\sigma^{ab}$, or (2) the Lie derivatives of the Cauchy stress’s associated tensors ($\sigma_{ab}$, $\sigma^a{}_b$, $\sigma_a{}^b$), or (3) certain linear combinations of the Lie derivatives of the associated Cauchy stresses.
Constitutive Equations
In the context of elastoplasticity for this workbook, we restrict ourselves to isotropic materials in hyperelasticity and the associative flow law in classical J2 plasticity (see e.g., Fung [1965]1, or Malvern [1969]2 for an introduction). We focus on the consequences of finite deformation for such a constitutive law.
For pure elasticity (isothermal), the free energy Ψ and the internal energy E coincide (Ψ = E). We also postulate that the constitutive equation can be expressed by a differential operator defined at “local” points to represent the whole material; i.e., the axiom of locality. Taking covariance of the energy balance as another axiom, one can deduce
$$\rho_{Ref}\, \frac{\partial E}{\partial t} = P : \frac{\partial F}{\partial t}$$ Eq. 5•276
That is, the change in internal energy (the left-hand side) equals the “mechanical power” (the right-hand side). Therefore, the constitutive equation, in the material configuration expressed by the first Piola-Kirchhoff stress tensor, should have the form
$$P = \rho_{Ref}\; g^\sharp\, \frac{\partial \Psi}{\partial F}$$ Eq. 5•277
1. Fung, Y.C., 1965, “Foundations of solid mechanics” Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
2. Malvern, L.E., 1969, “Introduction to the mechanics of a continuous medium” Prentice-Hall, Inc., Englewood
Cliffs, New Jersey.
$$S = 2\, \rho_{Ref}\, \frac{\partial \Psi}{\partial C}$$ Eq. 5•278
Recall that the “push-forward” of C is g. Having the Piola transformation in mind, Eq. 5•278 is equivalent to the constitutive equations written in terms of the Cauchy stress σ and the Kirchhoff stress τ, defined in the spatial configuration, as follows
$$\sigma\; (= J^{-1} \tau) = 2\, \rho\, \frac{\partial \Psi}{\partial g} \quad \Rightarrow \quad \tau = 2\, \rho_{Ref}\, \frac{\partial \Psi}{\partial g}$$ Eq. 5•279
This is the Doyle-Ericksen formula. For the dependence of the free energy Ψ on g, one can feel more comfortable by directly considering the spatial setting as follows. Changes of the spatial metric tensor on the space $\mathcal{S}$ from g to, say, g' affect the accelerations of particles. Thus, the internal energy E must depend on the metric tensor g; i.e., the free energy Ψ depends on g.
The elasticity tensor, or the elasticities, A is a fourth-order tensor on the body $\mathcal{B}$, and is defined as
$$A = \frac{\partial S}{\partial C}, \quad \text{i.e.,} \quad A^{ABCD} = \frac{\partial S^{AB}}{\partial C_{CD}} = 2\, \rho_{Ref}\, \frac{\partial^2 \Psi}{\partial C_{AB}\, \partial C_{CD}}$$ Eq. 5•280
The symmetry of the elasticity tensor, $A^{ABCD} = A^{CDAB}$, implies that the cross partial derivatives of Ψ are equal
$$\frac{\partial^2 \Psi}{\partial C_{AB}\, \partial C_{CD}} = \frac{\partial^2 \Psi}{\partial C_{CD}\, \partial C_{AB}}$$ Eq. 5•281
This is the condition from calculus for the existence of the free energy Ψ as a potential function; i.e., it is also the condition that defines hyperelasticity.
We have concluded that the free energy Ψ depends on φ and F only through C. Recall that C is a symmetric tensor, which can be brought to diagonal form by an orthogonal transformation. For regular φ, as we have seen, a positive definite symmetric tensor, C in this case, admits a spectral representation (Eq. 5•261). The free energy Ψ must be a function only of the eigenvalues of C; that is, Ψ depends only on the principal stretches of U. However, instead of the eigenvalues, the three invariants of C, I(C) = tr(C), II(C) = det C tr C⁻¹, and III(C) = det C = J², are also convenient to use. The same is true for the free energy Ψ expressed in the spatial configuration with the left Cauchy-Green tensor b, used in the next section.
Elasticity
Recall the kinematic split into deviatoric and spherical parts. The volume-preserving right Cauchy-Green tensor is, therefore, defined as $\hat{C} = J^{-2/3} C = \hat{F}^T \hat{F}$. The volume-preserving left Cauchy-Green tensor is defined similarly as $\hat{b} = J^{-2/3} b = \hat{F} \hat{F}^T$. The idea is to express the free energy Ψ as a function of the three invariants of b; i.e., Ib, IIb, and IIIb. For an isotropic material with uncoupled volumetric and deviatoric responses, the stored energy function has the form $\Psi = \hat{\Psi}(\hat{b}) + U(J)$. A special case in terms of the invariants of b is
$$\Psi = \frac{1}{2}\, \mu\, (\hat{I}_b - 3) + U(J)$$ Eq. 5•282
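where $\hat{I}_b = J^{-2/3} I_b$ (with $I_b \equiv b : g$), and $J = III_b^{1/2}$. Two identities are useful for the following derivations: $L_v(III_b) = L_v(J^2) = 2\, III_b\, g^\sharp : d$, and $L_v(I_b) = L_v(b^\sharp : g) = 2\, b^\sharp : d$. Recall the Doyle-Ericksen formula (Eq. 5•279); the Kirchhoff stress in relation to the free energy Ψ is
$$\tau = 2\, \rho_{Ref}\, \frac{\partial \Psi}{\partial g} \quad \Rightarrow \quad \tau = J\, p\, g^\sharp + \mu\, \mathrm{dev}\, \hat{b},$$ Eq. 5•283
where p = dU/dJ. The spatial elasticity tensor $a \equiv 2\, \partial \tau / \partial g$ becomes
$$a = J^2\, U''(J)\, (g \otimes g) + J p\, (g \otimes g - 2I) + \frac{2}{3}\, \mu \left[ \hat{I}_b \left( I - \frac{1}{3}\, (g \otimes g) \right) - (\mathrm{dev}\, \hat{b} \otimes g + g \otimes \mathrm{dev}\, \hat{b}) \right]$$ Eq. 5•284
$$a_{dev} \equiv \frac{2}{3}\, \mu \left[ \hat{I}_b \left( I - \frac{1}{3}\, (g \otimes g) \right) - (\mathrm{dev}\, \hat{b} \otimes g + g \otimes \mathrm{dev}\, \hat{b}) \right]$$ Eq. 5•285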
For example, a possible choice of U satisfying the polyconvexity condition is U(J) = K/2 (J² − 1). Then U’ = dU/dJ = p = KJ = K (1 + ∫ dV/V) = K (1 + log V), and U’’ = d²U/dJ² = K = V (∂p ⁄ ∂V)_T, with the last identity derived from p = K (1 + log V), where K is then identified as the bulk modulus. Eq. 5•284 is symmetric, as required in hyperelasticity by Eq. 5•281.
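The stress formula of Eq. 5•283 reduces to a few lines of arithmetic in Cartesian coordinates, where g is the identity. The sketch below is written in standard C++ rather than with the VectorSpace classes; the function name kirchhoffStress is ours, and the pressure-like term p = dU/dJ is passed in by the caller so that no particular choice of U(J) is assumed.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

double det(const Mat3& A) {
    return A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
         - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
         + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]);
}

// tau = J p I + mu dev(b_hat), Eq. 5•283 with g = I (Cartesian coordinates).
// mu is the shear modulus; p is the value of dU/dJ supplied by the caller.
Mat3 kirchhoffStress(const Mat3& F, double mu, double p) {
    const double J = det(F);  // assumed positive
    const double scale = std::pow(J, -2.0 / 3.0);
    Mat3 bhat{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                bhat[i][j] += scale * F[i][k] * F[j][k];  // b_hat = J^{-2/3} F F^T
    const double mean = (bhat[0][0] + bhat[1][1] + bhat[2][2]) / 3.0;
    Mat3 tau{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            const double id = (i == j) ? 1.0 : 0.0;
            tau[i][j] = J * p * id + mu * (bhat[i][j] - mean * id);
        }
    return tau;
}
```

For a simple shear of amount γ with p = 0 this gives the shear stress µγ together with normal deviatoric stresses of order γ², a second-order effect that the infinitesimal theory does not show.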
We observe that in the above extremely simplified case of finite elasticity, the elasticity tensor a is a strongly non-linear function of the deformation (expressed in J and b) and the metric tensor (g), even though the bulk modulus K and the shear modulus µ have been naively assumed to be constant with respect to deformation and thermal effects. This non-linearity is required if the constitutive equation is to be covariant, and is to be consistent with the sim-
1. J.C. Simo, 1988, “A Framework for Finite Strain Elastoplasticity Based on Maximum Plastic Dissipation and the Multi-
plicative Decompositions: Part I. Continuum Formulation.” Computer Methods in Applied Mechanics and Engineering, vol.
66, p. 199-219. “Part II. Computational Aspects.”, vol. 68, p.1-31.
We have come a long way to define elastoplasticity in the finite deformation range. However, the complicated mathematical expression in Eq. 5•284 only implements the simplest ideas of “objectivity” and “restorable elastic energy”. The algebraic structures of infinitesimal deformation simply do not carry over to finite deformation. The idiosyncrasies of the finite deformation range, if not respected, will lead only to absurdity!
where E is the finite Lagrangian strain tensor, and D is the strain rate tensor. The constitutive equation for such an additive decomposition is usually cast in rate form. However, the notion of hyperelasticity is not consistent with a rate expression. The incremental objective algorithm was historically developed to ensure the frame-indifferent nature of the path-dependent stress-strain integration scheme, and hyperelasticity is achieved only through algorithmic approximation. The semantics of Eq. 5•286 imply that the elastic and plastic deformations occur simultaneously. $D^e$ and $D^p$ are unknowns to be solved for simultaneously. An additional (stronger) assumption, on top of the formal additive decomposition, helps us resolve the elastic and plastic parts of the deformation separately. This additional assumption is that the elastic and plastic deformations occur in succession. The deformation gradient, occurring in two consecutive steps, is naturally expressed as a multiplicative decomposition (Lee decomposition).
$$F = F^e F^p$$ Eq. 5•287
The superscript “e” indicates the elastic part, and “p” indicates the plastic part. The tensor $(F^e)^{-1}$, therefore, is the deformation gradient which corresponds to elastically releasing the stress. The released intermediate configuration is thus introduced. With the multiplicative decomposition, one is able to compute the elastic response exactly (by mere function evaluation, as opposed to the algorithmic approximation in the incremental objective algorithm). This additional assumption of multiplicative decomposition is also justifiable from a scientific point of view. It has been argued from two perspectives: firstly, from the theoretical development of how the elastic deformation of a crystalline material leads to dislocation (i.e., plasticity as the macroscopic manifestation of dislocations in the crystal structure); secondly, from electron-microscope studies that demonstrate how this micro-mechanism can occur.
rian strain ($e^p$), we observe that $(b^e)^{-1}$ plays the role of the plastic metric tensor $g^p$. That is, for the elastic Eulerian strain $e^e$, the elastic Finger deformation tensor $(b^e)^{-1}$ is the deformation from the “identity” g, while for the
$$L_v \tau^p = 2\, \rho_{Ref}\, L_v\!\left( \frac{\partial \Psi(g, (b^e)^{-1}, F)}{\partial g} \right) = -2\, \rho_{Ref} \left[ \frac{\partial^2 \Psi}{\partial g\, \partial (b^e)^{-1}} : L_v((b^e)^{-1}) + \frac{\partial^2 \Psi}{\partial g\, \partial F} : L_v(F) \right]$$ Eq. 5•288
such that
$$L_v \tau^p = -4\, \rho_{Ref}\, \frac{\partial^2 \Psi(g, (b^e)^{-1}, F)}{\partial g\, \partial (b^e)^{-1}} : d^p$$ Eq. 5•289
Recall that $L_v(g) = 2d$, and that $L_v(F)$ is zero, since $\varphi^* F = I$. Note that $L_v((b^e)^{-1}) = L_v(g^p) = 2 d^p$, the spatial rate of plastic deformation tensor, where d is the symmetric part of the velocity gradient.
$$L_v \tau^p = -J^{-2/3}\, \mathrm{dev}\!\left[ 2\, \rho_{Ref}\, \frac{\partial^2 \Psi(g, (b^e)^{-1}, F)}{\partial g\, \partial (b^e)^{-1}} : L_v((b^e)^{-1}) \right],$$ Eq. 5•290
$$\phi(g, (b^e)^{-1}, q, F) \le 0$$ Eq. 5•291
$$\mathcal{L}^p \equiv \frac{\partial \Psi(g, (b^e)^{-1}, F)}{\partial (b^e)^{-1}} : L_v((b^e)^{-1}) + \dot{\gamma}\, \phi(g, (b^e)^{-1}, q, F)$$ Eq. 5•292
where $\dot{\gamma}$ is the plastic consistency parameter. $\dot{\gamma}$ is also the Lagrange multiplier in constrained optimization. The minimization condition of the Lagrangian functional is given by the Euler-Lagrange equations (with virtual variation of g)
$$\frac{\partial \mathcal{L}^p}{\partial g} = 0,$$ Eq. 5•293
and the Kuhn-Tucker conditions, the celebrated trio in inequality constrained optimization, are
$$\dot{\gamma} \ge 0, \quad \phi(g, (b^e)^{-1}, q, F) \le 0, \quad \text{and} \quad \dot{\gamma}\, \phi(g, (b^e)^{-1}, q, F) = 0$$ Eq. 5•294
The Euler-Lagrange equation (Eq. 5•293) gives the definition of the plastic relaxation stress as
$$L_v \tau^p = \dot{\gamma}\; 2\, J^{-2/3}\, \mathrm{dev}\!\left[ \frac{\partial \phi}{\partial g} \right]$$ Eq. 5•295
This is the plastic flow law based on Hill’s principle of maximum plastic dissipation as derived in Simo [1988]1.
For the specific case of isotropic-kinematic hardening J2 plasticity with uncoupled hyperelasticity, the problem is defined as
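$$\mu\, J^{-2/3}\, \mathrm{dev}[L_v b^e] = -2\, \bar{\mu}\, \dot{\gamma}\, \hat{n}, \quad \text{where} \quad \hat{n} \equiv \frac{\xi}{\|\xi\|}, \quad \bar{\mu} \equiv \frac{1}{3}\, \mu\, J^{-2/3}\, \mathrm{tr}[b^e], \quad \text{and} \quad \mathrm{tr}[L_v b^e] = 0$$ Eq. 5•299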
1. J.C. Simo, 1988, “A Framework for Finite Strain Elastoplasticity Based on Maximum Plastic Dissipation and the Multipli-
cative Decompositions: Part I. Continuum Formulation.” Computer Methods in Applied Mechanics and Engineering, vol. 66,
p. 199-219. “Part II. Computational Aspects.”, vol. 68, p.1-31.
$$\kappa(e^p) = \kappa_0 + \beta\, h'\, e^p, \quad \text{where} \quad \dot{e}^p = \sqrt{\frac{2}{3}}\; \dot{\gamma}$$ Eq. 5•300
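$$J^{-2/3}\, \mathrm{dev}[L_v \alpha] \equiv 2\, \bar{\bar{\mu}}\, \dot{\gamma}\, \frac{h'}{3\mu}\, (1 - \beta)\, \hat{n}, \quad \text{where} \quad \bar{\bar{\mu}} \equiv \bar{\mu} - J^{-2/3}\, \mathrm{tr}[\alpha], \quad \text{and} \quad \mathrm{tr}[L_v \alpha] = 0$$ Eq. 5•301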
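$$\bar{\mu} \equiv \frac{1}{3}\, \mu\, \mathrm{tr}[\hat{b}^{e\,TR}_{n+1}], \quad \bar{\bar{\mu}} \equiv \bar{\mu} - \frac{1}{3}\, \mathrm{tr}[\hat{F}_u\, \alpha\, (\hat{F}_u)^T]$$ Eq. 5•302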
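$$a^{TR}_{dev\, n+1} \equiv 2\, \bar{\mu} \left[ I - \frac{1}{3}\, (g \otimes g) \right] - \frac{2}{3}\, (s^{TR} \otimes g + g \otimes s^{TR})$$ Eq. 5•303
$$h^{TR}_{dev\, n+1} \equiv 2\, \bar{\bar{\mu}} \left[ I - \frac{1}{3}\, (g \otimes g) \right] - \frac{2}{3}\, (\xi^{TR} \otimes g + g \otimes \xi^{TR})$$ Eq. 5•304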
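$$f_0 \equiv \frac{2\, \mu\, \gamma_{n+1}}{\|\xi^{TR}_{n+1}\|}, \quad \delta_0 \equiv 1 + \frac{h'}{3\mu} + \frac{\kappa'}{3\mu}, \quad f_1 \equiv \frac{1}{\delta_0} - f_0, \quad \delta_1 \equiv f_1\, 2\mu - \left[ \frac{1}{\delta_0} \left( 1 + \frac{h'}{3\mu} \right) - 1 \right] \frac{4}{3}\, \gamma_{n+1}, \quad \delta_2 \equiv 2\, \|\xi^{TR}_{n+1}\|\, f_1$$ Eq. 5•305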
The deviatoric part of the consistent tangent moduli (Eq. 5•285) is modified accordingly, with the superscript “s” denoting symmetrization
$$a^{ep}_{dev\, n+1} \equiv a^{TR}_{dev\, n+1} - f_0\, h^{TR}_{dev\, n+1} - \delta_1\, (\hat{n} \otimes \hat{n}) - \delta_2\, (\hat{n} \otimes \mathrm{dev}[\hat{n}^2])^s$$ Eq. 5•306
2. J.C. Simo, R.L. Taylor, and K.S. Pister, 1985, “Variational and Projection Methods for the Volume Constraint in Finite
Deformation Elasto-Plasticity”, Computer Methods in Applied Mechanics and Engineering, vol. 51, p. 177-208.
Proceeding as in the linear case for the Hu-Washizu variational principle, the Euler-Lagrange equations are a set of simultaneous equations, which is reduced to a displacement-only formulation. Linearizing with respect to variations at $t_{n+1}$, we obtain the tangent stiffness as (η is the variation about $\varphi_{n+1}$)
$$\frac{1}{J}\, \nabla^T \eta : a : \nabla u = \nabla^T \eta : [\sigma\, \nabla u] + \nabla^T \eta : [\, p\, (1 \otimes 1 - 2I) + a^{ep}_{dev}\, ] : \nabla u$$ Eq. 5•308
The first term on the right-hand side is the geometrical stiffness. The second term, which depends on the material moduli, can be cast in the B-matrix formulation with an averaged volumetric term as
$$\int_{\Omega_e} B^T [\, a^{ep}_{dev\, n+1} - 2 p I\, ]\, B\; dV + [\, U''(\Theta)\, N'^T N'\, ]\; \mathrm{vol}(\varphi_{n+1})$$ Eq. 5•309
U(Θ) is the volumetric part of the free energy function, where Θ is the element mean dilatation, and N’ is the averaged derivative of the shape functions over the element.
[Figure (two panels): elastic predictor from sn to the trial stress sTRn+1, plastic corrector from sTRn+1 to sn+1, and the yield surface.]
In the closest-point projection method, the integration is done at $\varphi_n$ and the result is then pushed forward by $F_u$ (the incremental deformation gradient) to the current configuration $\varphi_{n+1}$. The push-forward of the intermediate configuration and its internal plastic variables in the finite deformation range makes the algorithm not only incrementally objective, but also covariant.
Computationally, the stress inversion step of the cutting-plane method is the trade-off for the push-forward step of the closest-point projection method. The algorithm for the closest-point projection is aesthetically more satisfying but requires the introduction of the advanced covariant concept.
Deformation Gradient from Shape Derivatives at $\varphi_{n+1}$ and the Mean Dilatational Approximation: The shape derivative code (see Program Listing 5•22) differs from the infinitesimal case in that it is evaluated at $x = \varphi_{n+1}$, in place of $X = \varphi_0$. The spatial metric tensor g = i (spatial identity) instantaneously coincides with a Cartesian coordinate system. Recall x = X + u,
The lower-case “grad”, by convention, denotes derivatives with respect to the spatial coordinates x. Post-multiplying by F⁻¹, we have
We first compute F⁻¹ from grad u. The inverse of F⁻¹ gives the deformation gradient F. We then evaluate the Jacobian from the determinant of the deformation gradient as J = det(F). Remember that for the following computation all the available quantities are centered around $x = \varphi_{n+1}$.
The mean dilatation Θ is
where the current volume is $v = \int_{\varphi(\Omega_e)} dv$, and the initial volume is $V = \int_{\varphi(\Omega_e)} \frac{dv}{J}$; both integrations are done over the current configuration $\varphi_{n+1}$.
Listing 5•22 Deformation gradient and mean dilatation approximation evaluated from configuration at x
= φn+1.
$$ U(\Theta) = \frac{K}{2} (\log \Theta)^2 $$  Eq. 5•313
where K is the bulk modulus. The pressure is p = U' = dU / dΘ, and U'' will be used in the calculation of the tangent stiffness. Again, with the VectorSpace C++ Library the evaluation of these derivatives is quite simple: we just need to declare U and Θ as a function and an independent variable, respectively, of the C2 class. Alternatively, for such a simple equation, you might want to do the differentiation by hand and code the result directly. In addition, because the pow(int) function only takes an integer exponent, J^{2/3} is expressed through exp() and log() by the equation $J^{2/3} = e^{\frac{2}{3}\log J}$.
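If the differentiation is done by hand, Eq. 5•313 gives U' = K log Θ / Θ and U'' = K (1 - log Θ) / Θ². A minimal plain C++ sketch follows; the struct and function names are hypothetical and do not belong to the VectorSpace C++ Library.

```cpp
#include <cmath>

// Value and first two derivatives of U(Theta) = (K/2) (log Theta)^2.
struct VolumetricResponse {
    double U;    // stored energy
    double dU;   // pressure p = U'
    double ddU;  // U'', used in the tangent stiffness
};

VolumetricResponse volumetric_energy(double K, double theta) {
    const double L = std::log(theta);
    return { 0.5 * K * L * L,
             K * L / theta,
             K * (1.0 - L) / (theta * theta) };
}

// J^(2/3) written through exp() and log(), as in the text.
double J_two_third(double J) {
    return std::exp((2.0 / 3.0) * std::log(J));
}
```

At Θ = 1 the energy and the pressure vanish and U'' equals the bulk modulus K, as expected for the undeformed state.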
The update procedure for implementation 1 maps the history data {Θn, pn, Jn, Fn, σn, (Fp)-1n, epn, αn} at tn to {Θn+1, pn+1, Jn+1, Fn+1, σn+1, (Fp)-1n+1, epn+1, αn+1} at tn+1. We omit the programming details, since the construction of the object-oriented private data members is completely parallel to that of the infinitesimal case.
Elastic-Predictor and Plastic-Corrector: The elastic predictor (see Program Listing 5•23) assumes that the trial plastic deformation gradient at tn+1 is $F^{p\,TR}_{n+1} \equiv F^p_n$; i.e., no new plastic flow occurs. Therefore, the internal plastic variables remain unchanged. Although the configuration has changed from φn to φn+1, the push-forward of zero (no change of the internal plastic variables) is zero. Then, the trial elastic deformation gradient at the current configuration is $F^{e\,TR}_{n+1} \equiv F_{n+1} (F^p_n)^{-1}$. The deviatoric part of the Kirchhoff stress is calculated from the constitutive equation (cf. Eq. 5•296), which has already been chosen to be covariant, as

$$ s^{TR}_{n+1} = \mu\, \mathrm{dev}(\hat{b}^{e\,TR}_{n+1}) $$
Listing 5•23 Elastic predictor, assuming that no plastic flow occurs. The deviatoric Kirchhoff stress s, in engineering convention, is evaluated from the constitutive law.
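The elastic predictor can be sketched in plain C++ with full 3x3 matrices. This is not the code of Listing 5•23; all names are hypothetical, the stress is kept as a matrix rather than in engineering (vector) convention, and the volume-preserving scaling is assumed to use the determinant of the trial elastic deformation gradient.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};  // value-initialized to zero
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k) c[i][j] += a[i][k] * b[k][j];
    return c;
}

Mat3 transpose(const Mat3& a) {
    Mat3 t{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) t[i][j] = a[j][i];
    return t;
}

double determinant(const Mat3& a) {
    return a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
         - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
         + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);
}

// Trial deviatoric Kirchhoff stress s = mu * dev(b_hat), assuming no new plastic flow:
// F_e_TR = F_{n+1} (F_p_n)^{-1}, b_hat = det(F_e_TR)^(-2/3) F_e_TR F_e_TR^T.
Mat3 trial_deviatoric_kirchhoff(const Mat3& F_np1, const Mat3& Fp_inv_n, double mu) {
    const Mat3 Fe = multiply(F_np1, Fp_inv_n);
    const double scale = std::pow(determinant(Fe), -2.0 / 3.0);
    Mat3 b_hat = multiply(Fe, transpose(Fe));
    double tr = 0.0;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) b_hat[i][j] *= scale;
        tr += b_hat[i][i];
    }
    Mat3 s{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            s[i][j] = mu * (b_hat[i][j] - (i == j ? tr / 3.0 : 0.0));
    return s;
}
```

A purely volumetric deformation such as F = 2 I with no plastic history gives b_hat = I and therefore a vanishing trial deviatoric stress.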
Intermediate Configuration: We mentioned that the covariance of this stress-strain path integration algorithm follows directly from the covariance of the constitutive law. We did not mention how to obtain the (Fpn)-1 that the above algorithm needs in order to satisfy the covariance requirement.
We first describe a tactical change to the stress inversion algorithm of Simo et al. [1985]1, adopted in the implementation of this workbook. At the first incremental loading step, (Fp0)-1 can be assumed to be an identity matrix, for no deformation has occurred. The tensor (Fp0)-1 remains unchanged during the same time
1. J.C. Simo, R.L. Taylor, and K.S. Pister, 1985, “Variational and Projection Methods for the Volume Constraint in Finite
Deformation Elasto-Plasticity”, Computer Methods in Applied Mechanics and Engineering, vol. 51, p. 177-208.
(a) get the Kirchhoff stress from the Jacobian and the Cauchy stress: $\tau_n = J \sigma_n$, from Eq. 5•275.
(b) compute $\mathrm{dev}(\hat{b}^e_n)$ from the deviatoric part of the constitutive law: $\mathrm{dev}(\hat{b}^e_n) = \frac{1}{\mu} \mathrm{dev}(\tau_n)$, from Eq. 5•296.
(c) compute $\hat{b}^e_n$ from its deviatoric part $\mathrm{dev}(\hat{b}^e_n)$, requiring $\det(\hat{b}^e_n) = 1$.
(d) compute the elastic left-Cauchy-Green tensor from the elastic volume-preserving left-Cauchy-Green tensor: $b^e_n = J^{2/3} \hat{b}^e_n$, which follows from the definition $\hat{b}^e_n = J^{-2/3} b^e_n$.
(e) compute the left-stretch tensor $V^e_n$ from the left-Cauchy-Green tensor $b^e_n$: $V^e_n = \sqrt{b^e_n}$.
(f) set $F^e_n = V^e_n$, which is valid for the isotropic case, and compute $F^p_n = F_n (F^e_n)^{-1}$ from the multiplicative decomposition (Eq. 5•287). Then $(F^p_n)^{-1}$ can be obtained.
Except for sub-steps (c) and (e), the sub-steps are straightforward and need no explanation. The essence of sub-step (c), for those who are familiar with solid mechanics, is similar to the calculation of the principal deviatoric stresses from a deviatoric stress. See C.Y. Fung [1965, p. 80]1, or L.E. Malvern [1969, p. 91]2.
The kinematic split of the elastic volume-preserving left-Cauchy-Green tensor $\hat{b}^e_n$ is

$$ \hat{b}^e_n = \mathrm{dev}(\hat{b}^e_n) + \Xi, \quad \text{where } \Xi = \frac{1}{3} \mathrm{tr}(\hat{b}^e_n). $$  Eq. 5•315

The objective is to compute the spherical part Ξ and add it to the given deviatoric part, requiring that $\det(\hat{b}^e_n) = 1$; that is, to add a trace to the deviatoric tensor so that the result is unimodular. Assuming plane strain, we define
if(new_time_flag) {
   H0 dev_b_hat_n_e = INTEGRABLE_MATRIX("int, int, Quadrature", 3, 3, qp),
      dev_tau_n = INTEGRABLE_MATRIX("int, int, Quadrature", 3, 3, qp),
      tau_n = J_n*sigma_n,                                  // tau_n = J sigma_n
      tr_tau_n = tau_n[0]+tau_n[1]+tau_n[2];
   // dev(tau_n) = tau_n - (1/3) tr(tau_n)
   dev_tau_n[0][0] = tau_n[0] - (1.0/3.0)*tr_tau_n; dev_tau_n[1][1] = tau_n[1] - (1.0/3.0)*tr_tau_n;
   dev_tau_n[2][2] = tau_n[2] - (1.0/3.0)*tr_tau_n; dev_tau_n[0][1] = dev_tau_n[1][0] = tau_n[3];
   dev_b_hat_n_e = dev_tau_n / mu_;                         // dev(b_hat_n_e) = (1/mu) dev(tau_n)
   // J2 = [(b'11)^2 + (b'22)^2 + (b'33)^2]/2 + (b'12)^2 + (b'13)^2 + (b'23)^2
   H0 bp = dev_b_hat_n_e, theta_0(qp),
      J2 = (bp[0][0].pow(2)+bp[1][1].pow(2)+bp[2][2].pow(2))/2.0 + bp[0][1].pow(2);
#if defined(__NUMERICAL_ROOT_FINDING)
   for(int i = 0; i < qp.no_of_quadrature_point(); i++) {
      C0 J2_q = J2.quadrature_point_value(i), theta_0_q = theta_0.quadrature_point_value(i),
         bp_q = bp.quadrature_point_value(i),
         J3_prime = (J2_q-bp_q[2][2].pow(2))*bp_q[2][2] + 1.0;   // J'3 for plane strain (b'13 = b'23 = 0)
      int MAX_ITERATION_NO = 50, count = 0; double EPSILON = 1.e-6;
      C1 X(1.0), f; C0 d_X(0.0);
      do {   // Newton iteration: f(Xi) = Xi^3 - J2 Xi - J'3, d Xi = -f/df, Xi += d Xi
         f &= X.pow(3) - J2_q*X - J3_prime; d_X = -((C0)f)/d(f); ((C0)X) += d_X;
      } while((double)norm(d_X) > EPSILON && ++count < MAX_ITERATION_NO);
      if(count == MAX_ITERATION_NO) ofs <<
         "Warning: No convergence achieved for solution of a cubic algebraic equation!" << endl;
      theta_0_q = ((C0)X);
   }
#else
   // Closed-form solution
   for(int i = 0; i < qp.no_of_quadrature_point(); i++) {
      C0 J2_q = J2.quadrature_point_value(i), theta_0_q = theta_0.quadrature_point_value(i);
      if((double)J2_q < 1.e-6) theta_0_q = 1.0;
      else {
         C0 bp_q = bp.quadrature_point_value(i),
            J3_prime = (J2_q-bp_q[2][2].pow(2))*bp_q[2][2]+1.0,
            bp_oct = sqrt((2.0/3.0)*J2_q),                  // octahedral deviatoric tensor: b_oct = sqrt(2 J2 / 3)
            temp = (-J2_q/3.0).pow(3)+(J3_prime/2.0).pow(2);
         if((double)temp > 0.0) {                           // marginal conditions
            double a1, a2,
               arg1 = (double)(J3_prime/2.0+sqrt(temp)), arg2 = (double)(J3_prime/2.0-sqrt(temp));
            if(fabs(arg1) < 1.e-6) a1 = 0.0; else a1 = ((arg1>=0.0) ? 1.0 : -1.0) * exp(logl(fabs(arg1))/3.0);
            if(fabs(arg2) < 1.e-6) a2 = 0.0; else a2 = ((arg2>=0.0) ? 1.0 : -1.0) * exp(logl(fabs(arg2))/3.0);
            theta_0_q = a1 + a2;
         } else {
            // cos(3 alpha) = (J'3 / 2) (3 / J2)^(3/2) = sqrt(2) J'3 / b_oct^3
            C0 cos_3alpha = J3_prime * sqrt(2.0) / bp_oct.pow(3), alpha = acos(cos_3alpha)/3.0;
            theta_0_q = sqrt(2.0) * bp_oct * cos(alpha);    // Xi = 2 sqrt(J2 / 3) cos(alpha)
         }
      }
   }
#endif
   {
      H0 b_hat_n_e = dev_b_hat_n_e + theta_0 * I_33;        // b_hat_n_e = dev(b_hat_n_e) + Xi
      H0 b_n_e = J_two_third * b_hat_n_e;
      H0 V_n_e = INTEGRABLE_MATRIX("int, int, Quadrature", 2, 2, qp);
      H0 I_b = b_n_e[0][0]+b_n_e[1][1],
         II_b = b_n_e[0][0]*b_n_e[1][1]-b_n_e[0][1]*b_n_e[1][0];
      // V = (b + sqrt(II_b) I) / sqrt(I_b + 2 sqrt(II_b))
      V_n_e[0][0] = (b_n_e[0][0]+sqrt(II_b)) / sqrt(I_b+2.0*sqrt(II_b));
      V_n_e[1][1] = (b_n_e[1][1]+sqrt(II_b)) / sqrt(I_b+2.0*sqrt(II_b));
      V_n_e[0][1] = b_n_e[0][1] / sqrt(I_b+2.0*sqrt(II_b));
      V_n_e[1][0] = b_n_e[1][0] / sqrt(I_b+2.0*sqrt(II_b));
      H0 F_n_e = V_n_e;                                     // F_n_e = V_n_e
      F_p_inv_n = F_n_e * F_n.inverse();                    // F_n_p = F_n (F_n_e)^-1, then (F_n_p)^-1
   }
}
Following the standard procedure for finding principal values, it can be shown that the principal deviations b’ of Eq. 5•316 satisfy a cubic algebraic equation, and that Ξ satisfies a cubic of the same form,

$$ b'^3 - J_2 b' - J_3 = 0, \quad \text{and} \quad \Xi^3 - J_2 \Xi - J'_3 = 0 $$  Eq. 5•317

A numerical root-finding procedure can be applied to the second part of Eq. 5•317. This is turned on by the macro definition “__NUMERICAL_ROOT_FINDING”. With the aid of the VectorSpace C++ Library, this numerical procedure is straightforward and needs no explanation (see Chapter 2 for an introduction). However, a closed-form expression for the root of this cubic algebraic equation is even more desirable. The last two equations are almost identical, so the solution for b’ can be used for the solution of Ξ if one defines J'3 = J3 – I3; i.e., J'3 is used in place of J3 in the formula for solving b’. According to Malvern [p. 92]2, the solution of the cubic algebraic equation is obtained by substituting
$$ \Xi = 2 \sqrt{\frac{J_2}{3}} \cos \alpha $$  Eq. 5•318

$$ 2 \left( \frac{J_2}{3} \right)^{3/2} [ 4 \cos^3 \alpha - 3 \cos \alpha ] = J'_3 , $$  Eq. 5•319
The octahedral b (b_oct, the counterpart of the octahedral shear stress τo in C.Y. Fung1) does not really need to be introduced if one uses the first identity only (as in L.E. Malvern, p. 92)2. We can first invert for α from Eq. 5•321, then obtain Ξ from Eq. 5•318. After solving for the value of Ξ, $\hat{b}^e_n$ is obtained from Eq. 5•315.
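The closed-form route can be sketched in plain C++ as below. The function name is hypothetical, and J'3 is taken as an input evaluated the way the listing does, i.e. `(J2 - b'33*b'33)*b'33 + 1` for plane strain. The sketch uses Cardano's formula when the cubic has a single real root and the trigonometric form of Eq. 5•318 otherwise; unlike the listing, it falls back to a cube root when J2 is negligible.

```cpp
#include <algorithm>
#include <cmath>

// Solves Xi^3 - J2*Xi - J3p = 0 for the spherical part Xi.
double solve_spherical_part(double J2, double J3p) {
    if (J2 < 1e-12) return std::cbrt(J3p);  // cubic degenerates to Xi^3 = J'3
    const double disc = std::pow(J3p / 2.0, 2) - std::pow(J2 / 3.0, 3);
    if (disc > 0.0) {
        // One real root: Cardano's formula.
        const double r = std::sqrt(disc);
        return std::cbrt(J3p / 2.0 + r) + std::cbrt(J3p / 2.0 - r);
    }
    // Three real roots: cos(3 alpha) = (J'3 / 2) (3 / J2)^(3/2); take the largest root.
    double c = (J3p / 2.0) * std::pow(3.0 / J2, 1.5);
    c = std::max(-1.0, std::min(1.0, c));  // guard acos against round-off
    const double alpha = std::acos(c) / 3.0;
    return 2.0 * std::sqrt(J2 / 3.0) * std::cos(alpha);
}
```

As a check, the unimodular tensor diag(2, 1/2, 1) has Ξ = 7/6, J2 = 7/12 and J'3 = 49/54, which exercises the Cardano branch; diag(4, 1/4, 1) has Ξ = 1.75, J2 = 3.9375 and J'3 = -1.53125, which exercises the trigonometric branch.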
Sub-step (e) consists of the steps that one usually uses to compute the numerical values of the polar decomposition of a deformation gradient tensor. We only need the middle step, for b = FF^T is already available. In two dimensions, V (with V² = b) satisfies the Cayley-Hamilton equation

$$ V^2 - I_V V + II_V I = 0 $$

where I_V = tr V, and II_V = det V. The solution of this equation can be shown to be

$$ V = \frac{1}{\sqrt{I_b + 2 \sqrt{II_b}}} ( b + \sqrt{II_b}\, I ) $$  Eq. 5•323
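Eq. 5•323 translates almost directly into code. The sketch below is plain C++ for a symmetric 2x2 tensor; the type `Sym2` and the function name are hypothetical, and b is assumed to be positive definite so that both square roots are real.

```cpp
#include <array>
#include <cmath>

// Symmetric 2x2 tensor stored as {t11, t22, t12}.
using Sym2 = std::array<double, 3>;

// V = (b + sqrt(II_b) I) / sqrt(I_b + 2 sqrt(II_b)), so that V * V = b.
Sym2 left_stretch(const Sym2& b) {
    const double I_b  = b[0] + b[1];               // first invariant, tr b
    const double II_b = b[0] * b[1] - b[2] * b[2]; // second invariant, det b
    const double s = std::sqrt(II_b);
    const double d = std::sqrt(I_b + 2.0 * s);
    return { (b[0] + s) / d, (b[1] + s) / d, b[2] / d };
}
```

For example, V with components V11 = V22 = 2 and V12 = 1 gives b11 = b22 = 5 and b12 = 4, and the function recovers V from that b.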
Consistent Tangent Moduli, Tangent Stiffness and Residual: The consistent tangent moduli (Eq. 5•284 with the deviatoric part Eq. 5•285 modified by Eq. 5•302-Eq. 5•306) are used without the push-forward and with g = i (since the current configuration is described in an instantaneous Cartesian coordinate system). The implementation is transparent (see Program Listing 5•25). Some nuisance in the code is caused by the engineering convention of storing the deviatoric Kirchhoff stress (s) as a vector instead of its counterpart dev τ as a matrix. Naturally, the VectorSpace C++ Library, designed for general purpose numerical computation, does not have specific operators defined for s. Therefore, we set the VectorSpace C++ Library aside here: s needs to be computed component by component, just as you would do in C or Fortran. A better implementation would use the object-oriented method to derive a new class for this “engineering 1-D matrix”.
. . .
      // n^2 in engineering convention, tr(n^2), and dev(n^2) = n^2 - (1/3) tr(n^2)
      C0 n2(4, (double*)0);
      n2[0] = n[0]*n[0]+n[3]*n[3]; n2[1] = n[3]*n[3]+n[1]*n[1];
      n2[2] = n[2]*n[2]; n2[3] = n[3]*(n[0]+n[1]);
      C0 trn2 = n2[0]+n2[1]+n2[2];
      C0 dev_n2(4, (double*)0); dev_n2 = n2;
      for(int i = 0; i < 3; i++) dev_n2[i] -= trn2/3.0;
      // mu_2bar = mu_bar - (1/3) tr(alpha)
      // f0 = 2 mu_bar gamma_{n+1} / ||xi^TR_{n+1}||
      // delta0 = 1 + h'/(3 mu) + kappa'/(3 mu_2bar),  f1 = 1/delta0 - f0
      // delta1 = f1 2 mu_2bar - [(1/delta0)(1 + h'/(3 mu)) - 1] (4/3) gamma_{n+1}
      // delta2 = 2 ||xi^TR_{n+1}|| f1
      C0 mu_2bar = mu_bar_q - (alpha_n_q[0]+alpha_n_q[1]+alpha_n_q[2])/3.0,
         f_0 = 2.0*mu_bar_q*lambda_q / ZAI_norm_q,
         delta_0 = (1.0+d_H_alpha/3.0/mu_+d(KAPA)/3.0/mu_2bar),
         f_1 = (1.0/delta_0-f_0),
         delta_1 = f_1*2.0*mu_2bar -
                   (1.0/delta_0*(1.0+d_H_alpha/3.0/mu_)-1.0)*(4.0/3.0)*lambda_q,
         delta_2 = 2.0*ZAI_norm_q*f_1;
      // a^TR_dev = 2 mu_bar [I - (1/3)(g (x) g)] - (2/3)(s^TR (x) g + g (x) s^TR)
      // h^TR_dev = 2 mu_2bar [I - (1/3)(g (x) g)] - (2/3)(xi^TR (x) g + g (x) xi^TR)
      C0 TR_h_dev = 2.0 * mu_2bar * (I_mu - (One%One)/3.0) -
                    (2.0/3.0)*(ZAI_q%One+One%ZAI_q);
      // a^ep_dev = a^TR_dev - f0 h^TR_dev - delta1 (n (x) n) - delta2 (n (x) dev[n^2])^s
      a_dev_q = TR_a_dev_q - f_0 * TR_h_dev - delta_1*(n%n) -
                (delta_2/2.0)*(n%dev_n2+dev_n2%n);
   } else {
      a_dev_q = TR_a_dev_q;
      ofs << " Elastic state: element # " << en << ", quadrature point # " << i
          << ", yield ratio: " << yield_ratio_q << endl;
   }
   sigma_q = s_q/J_q + p*One;
}
. . .
   // geometrical stiffness: grad^T(eta) : [sigma grad(u)]
   C0 e = BASIS("int", ndf),
      E = BASIS("int", nen),
      U = (e%e)*(E%E);
   H0 fact = wx*(sigma[0]*~wx+sigma[3]*~wy) + wy*(sigma[3]*~wx+sigma[1]*~wy);
   stiff_geometrical &= (+(fact*U[0][0]+ fact*U[1][1])) |dv;
}
{
   // material stiffness, including the U''(Theta) mean dilatation term
   C0 dN_bar = (( (~wx) || (~wy) ) | dv) / vol;
   stiff_material &= ( ((~B) * (a_dev/j + p*((One%One) - 2.0*I_mu)) * B) | dv )
                     + dd(U) * theta * ((~dN_bar)*dN_bar) * vol;
}
stiff &= stiff_geometrical + stiff_material;
force &= -((~B)*sigma)|dv;                                  // residual: - int B^T sigma dv
Incremental Covariant Algorithm: Step 2-Step 5. These steps include computing the elastic predictor, the plastic corrector, the radial return mapping algorithm, and the consistent tangent moduli. They are the same as in implementation 1, except that we need to push forward $\hat{b}^e_n$ and αn by $\hat{F}_u$ in the elastic predictor and the radial return mapping:

$$ \alpha^{TR}_{n+1} \equiv \mathrm{dev} [ \hat{F}_u \alpha_n ( \hat{F}_u )^T ], \quad \text{and} \quad \hat{b}^{e\,TR}_{n+1} \equiv \hat{F}_u \hat{b}^e_n ( \hat{F}_u )^T $$  Eq. 5•324

$$ \bar{\bar{\mu}} \equiv \bar{\mu} - \frac{1}{3} \mathrm{tr} [ \hat{F}_u \alpha_n ( \hat{F}_u )^T ] $$  Eq. 5•325
1. J.C. Simo, 1988, “A Framework for Finite Strain Elastoplasticity Based on Maximum Plastic Dissipation and the Multiplicative Decomposition: Part I. Continuum Formulation”, Computer Methods in Applied Mechanics and Engineering, vol. 66, p. 199-219; “Part II. Computational Aspects”, vol. 68, p. 1-31.
This definition of $\bar{\bar{\mu}}$ is also used in the consistent tangent moduli. The radial return mapping also includes updating the history data

$$ e^p_{n+1} = e^p_n + \sqrt{\frac{2}{3}}\, \gamma_{n+1} $$  Eq. 5•326

$$ \alpha_{n+1} = \alpha^{TR}_{n+1} + \frac{h'}{3\mu}\, 2 \bar{\bar{\mu}} \gamma_{n+1} \hat{n}_{n+1} $$  Eq. 5•327
The spatial metric tensor g now takes the place of the identity 1 used in implementation 1. The present implementation is straightforward except, again, for the minor nuisance caused by the engineering convention that is not supported by the VectorSpace C++ Library.
$$ \hat{b}^e_{n+1} = \hat{b}^{e\,TR}_{n+1} - \frac{2 \bar{\bar{\mu}} \gamma}{3\mu}\, \mathrm{tr} [ \hat{b}^{e\,TR}_{n+1} ]\, \hat{n}_{n+1} $$  Eq. 5•329
The finite deformation formulation of implementation 1, the cutting-plane method, is implemented in project “finite_elastoplasticity”, and implementation 2, the closest-point-projection method, is implemented in project “covariant_finite_elasticity”. Both projects are in project file “fe.dsw”. Figure 5•24a and b show the strip with a circular hole under 0.33 % (6 incremental steps) and 0.56 % (10 incremental steps) vertical stretching, respectively. At 0.33 % stretching the deformation mode is similar to the infinitesimal case up to 0.56 % stretching (Figure 5•24c). At 0.56 % stretching the high yield ratio area is confined to the bottom edge of the model. This is where the “necking” process develops. The result after 1/10 vertical stretching (180 incremental steps) is shown in Figure 5•25a, where necking has developed to a clear stage.
Line Search: The time-consuming nonlinear iterative algorithm clearly becomes a problem. 180 incremental steps, with an average of, say, five iterations per incremental step, mean that the computation time is about a thousand times that of a linear problem. The problem is even more serious if we stretch the strip with a circular hole up to 1/3, as shown in Figure 5•25b. That needs about four thousand times the computation time of a linear problem. One immediate remedy is to increase the incremental loading tenfold by using δu = 0.1 instead of δu = 0.01 for each incremental loading step. However, with this magnitude of incremental loading, the nonlinear iteration fails to converge even for the first time step. As we recall from Chapter 2, the classic Newton method is powerful in that its convergence rate is quadratic. However, the Newton method often leads to a wild search path and may easily fail to converge. Therefore, a line search method with golden section can be used to regulate the search path of the Newton method (see page 125 in Chapter 2). This is implemented as follows
1 Matrix_Representation::Assembly_Switch = Matrix_Representation::ALL;
2 mr.assembly();
3 new_time_flag = FALSE; // flag to turn off Step 3 of the finite deformation procedure, after first iteration
4 C0 p = ((C0)(mr.rhs())) / ((C0)(mr.lhs()));
5 energy = fabs( (double)(p * ((C0)(mr.rhs())) ) );
6 if(count == 1) { // first iteration only
7 for(int j = 86; j <= 92; j++) {
8 gh[j][1] += d_gh[j][1];
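For orientation, the golden section search itself can be sketched independently of the finite element code. The plain C++ function below is a generic illustration, not the workbook's implementation; `phi` is a hypothetical callback that would return an energy-like measure (such as the `energy` value above) for a trial step length s along the Newton direction, and it is assumed to be unimodal on [a, b].

```cpp
#include <cmath>
#include <functional>

// Golden-section search for the minimizer of phi on [a, b].
double golden_section(const std::function<double(double)>& phi,
                      double a, double b, double tol = 1e-8) {
    const double r = (std::sqrt(5.0) - 1.0) / 2.0;  // golden ratio conjugate, about 0.618
    double x1 = b - r * (b - a), x2 = a + r * (b - a);
    double f1 = phi(x1), f2 = phi(x2);
    while (b - a > tol) {
        if (f1 < f2) {   // minimum lies in [a, x2]; reuse x1 as the new x2
            b = x2; x2 = x1; f2 = f1;
            x1 = b - r * (b - a); f1 = phi(x1);
        } else {         // minimum lies in [x1, b]; reuse x2 as the new x1
            a = x1; x1 = x2; f1 = f2;
            x2 = a + r * (b - a); f2 = phi(x2);
        }
    }
    return 0.5 * (a + b);
}
```

Each pass reuses one of the two interior evaluations, so only one new function evaluation (here, one assembly of the residual) is needed per interval reduction.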
// b_hat^e_{n+1} = b_hat^{e TR}_{n+1} - (2 mu_2bar gamma / (3 mu)) tr[b_hat^{e TR}_{n+1}] n_{n+1}
b_e_hat_q[0][0] = TR_b_e_hat_q[0][0] - mu_2bar_gamma/(3.0*mu_)*tr(TR_b_e_hat_q)*n[0];
b_e_hat_q[1][1] = TR_b_e_hat_q[1][1] - mu_2bar_gamma/(3.0*mu_)*tr(TR_b_e_hat_q)*n[1];
b_e_hat_q[2][2] = TR_b_e_hat_q[2][2] - mu_2bar_gamma/(3.0*mu_)*tr(TR_b_e_hat_q)*n[2];
b_e_hat_q[0][1] = b_e_hat_q[1][0] =
   TR_b_e_hat_q[0][1] - mu_2bar_gamma/(3.0*mu_)*tr(TR_b_e_hat_q)*n[3];
Figure 5•25 The further development of the “necking” of the strip with (a) 1/10 vertical stretching, and (b) 1/3 vertical stretching.