Parallel-Vector Equation Solvers For Finite Element Engineering Applications

Duc Thai Nguyen

Old Dominion University
Norfolk, Virginia
Springer Science+Business Media, LLC

Library of Congress Cataloging-in-Publication Data
Nguyen, Duc T.
Parallel-vector equation solvers for finite element engineering applications / Duc Thai Nguyen.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-4613-5504-5 ISBN 978-1-4615-1337-7 (eBook)
DOI 10.1007/978-1-4615-1337-7
1. Finite element method. 2. Parallel processing (Electronic computers) 3. Differential
equations-Numerical solutions. I. Title.
TA347.F5 N48 2001

620'.001'51535-dc21
2001038333
ISBN 978-1-4613-5504-5
© 2002 Springer Science+Business Media New York
Originally published by Kluwer Academic /Plenum Publishers, New York in 2002
Softcover reprint of the hardcover 1st edition 2002
10 9 8 7 6 5 4 3 2
A C.I.P. record for this book is available from the Library of Congress.
All rights reserved
No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written
permission from the Publisher
To Dac K. Nguyen
Thinh T. Thai
Hang N. Nguyen
Eric N. D. Nguyen and Don N. Nguyen
Preface
Although parallel-vector computational (equation solution) algorithms have received
a great deal of attention for over 20 years, and a large number of research articles
have been published during this period, only a limited number of texts and research
books have been written on the subject. Most (if not all) existing texts on this subject
have been written by computer scientists and/or applied mathematicians. The majority
of them overemphasize the theoretical development of new and/or promising parallel
(equation solution) algorithms (for a variety of applications in engineering and
science disciplines). The material presented in most existing texts is either too
condensed (lacking important detailed explanations) or too advanced for the typical
senior undergraduate and/or graduate engineering student. It should be emphasized
here that while many important theoretical developments, which have had a significant
impact on highly efficient existing parallel-vector (equation solution) algorithms,
have been carefully discussed and well documented in current texts, the important
details of the computer implementations of the developed algorithms have usually been
omitted. Furthermore, it should be kept in mind that while the few existing texts on
this subject (direct equation solution algorithms for parallel and/or vector computers)
have been written by computer scientists and/or applied mathematicians, truly
large-scale models (which require the parallel and vector capabilities offered by
modern high-performance computers) are often generated, solved, and interpreted by
the engineering community.
This book is written to address the concerns mentioned above and is intended
to serve as a textbook for senior undergraduate and graduate "engineering" students. A
number of state-of-the-art FORTRAN codes have been developed and are discussed in
great detail in this textbook. Special efforts have been made by the author to present
the material in a way that minimizes the mathematical background required of typical
senior undergraduate and graduate engineering students. Thus, compromises between
rigorous mathematics and practical simplicity are sometimes necessary.
This book has several unique features that distinguish it from other books:
1. Simplicity: The book is written and explained in a simple
fashion, so that senior undergraduate and first-year graduate students (in Civil,
Mechanical, Aerospace, Electrical, Computer Science and Mathematics departments) can
understand the presented material with minimum background requirements. A working
(undergraduate) knowledge of FORTRAN coding is helpful for understanding the "detailed
codings" of the presented material. Some (undergraduate) linear algebra background
should be useful, although it is NOT a requirement for reading and understanding the
material in the book. An undergraduate background in Matrix Structural Analysis and/or
Finite Element Analysis should be useful, but only for the material presented in
Chapter 3. Graph theory has not traditionally been introduced in undergraduate/graduate
engineering curricula, and therefore graph theory is not required to understand the
material presented in this book.
2. Algorithms are discussed for different parallel and/or vector computer platforms:
Parallel and/or vectorized algorithms for various types of direct equation solvers are
presented and discussed for both "shared memory" (such as the Cray-2, Cray-YMP,
Cray-C90, Convex) and "distributed memory" (such as the Intel i860, Intel Paragon,
IBM-SP2, Meiko) computer platforms. The vectorized algorithms can also be
"efficiently" executed on IBM-R6000/590 workstations. The vectorized algorithms
and their associated FORTRAN codes can also be executed (with less efficiency)
on other workstations and/or personal computers (PCs) that lack vector
capabilities.
3. More emphasis on important detailed FORTRAN computer implementations:
Efforts have been made to explain to the reader the important details of the FORTRAN
computer implementations of the various algorithms presented in the book. Thus,
readers should be able to incorporate the presented computer codes and subroutines
into their own application codes.
4. Several state-of-the-art FORTRAN equation solvers are discussed: While a great
deal of effort has been spent explaining the detailed algorithms in a "simple
fashion," many state-of-the-art equation solvers have been developed and are presented
in the book. Many of the presented solvers have been used by universities, large
aerospace corporations and government research laboratories in the U.S., Europe
and Asia.
5. Large-scale practical engineering finite element models are used: For derivations
and explanations of various algorithms described in the book, small-scale examples
are used to simplify and to facilitate the discussions. However, several medium to
large-scale, practical engineering finite element models are used to demonstrate the
efficiency and accuracy of the presented algorithms.
6. Algorithms are available for different types of linear equations: Different types of
algorithms for the solution of various types of systems of simultaneous linear
equations are presented in the book. Symmetrical/unsymmetrical, positive
definite/negative definite/indefinite, incore/out-of-core, and skyline/variable
bandwidth/sparse/tridiagonal systems of equations have all been treated in great
detail by the author.
The book contains 11 chapters. Chapter 1 presents a brief review of shared and
distributed parallel-vector computers. Measurements of algorithm performance and
common "good practices" for achieving vector speed are also discussed in this chapter.
Different storage schemes for the coefficient (stiffness) matrix (of a system of linear
equations) are discussed in great detail in Chapter 2. Efficient parallel algorithms
for the generation and assembly of finite element coefficient (stiffness) matrices are
explained in Chapter 3. Different parallel-vector "skyline" algorithms for shared
memory computers (such as the Cray-YMP, Cray-C90, etc.) are developed and evaluated in
Chapter 4. These algorithms have been developed in conjunction with the skyline storage
scheme introduced in Chapter 2. Parallel-vector "variable bandwidth" equation solution
algorithms (for shared memory computers) are presented and explained in Chapter 5.
These algorithms are based upon the variable bandwidth storage scheme introduced in
Chapter 2. Out-of-core equation solution algorithms on shared memory computers are
considered in Chapter 6. These algorithms are useful when very large-scale models need
to be solved and there is not enough core memory to hold all arrays in core.
Parallel-vector equation solution strategies for "distributed-memory" computers are
discussed in Chapter 7. These equation solution strategies are based upon the parallel
generation and assembly of finite element (stiffness) matrices suggested in Chapter 3.
Unsymmetrical banded systems of equations are treated in Chapter 8, where both parallel
and vector strategies are described. Parallel algorithms for tridiagonal systems of
equations on distributed computers are explained in Chapter 9. Sparse equation solution
algorithms are presented in Chapter 10. Unrolling techniques to enhance the vector
performance of sparse algorithms are also explained in this chapter. Finally, sparse
systems of equations in which the coefficient (stiffness) matrix is
symmetrical/unsymmetrical and/or indefinite (so that special pivoting strategies are
required) are considered in great detail in Chapter 11.
The book also contains a limited number of exercises to further supplement and
reinforce the concepts and ideas presented. The references are listed at the end of each
chapter.
The author would like to invite the readers to point out any errors that come to their
attention. The author also welcomes any comments and suggestions from the readers.
Duc Thai Nguyen
Norfolk, Virginia
Acknowledgments
During preparation of this book, I have received (directly and indirectly) help from many
people. First, I would like to express my sincere gratitude to my colleagues at NASA
Langley Research Center: Dr. Olaf O. Storaasli, Dr. Jerrold M. Housner, Dr. James
Starnes, Dr. Jaroslaw S. Sobieski, Dr. Keith Belvin, Dr. Peigman M. Maghami, Dr. Tom
Zang, Dr. John Barthelemy, Dr. Carl Gray Jr., Dr. Steve Scotti, Dr. Kim Bey, Dr. Willie
R. Watson, and Dr. Andrea O. Salas for their encouragement and support on the subject
of this book during the past 13 years.
The close collaborative work with Dr. Olaf O. Storaasli and Dr. Jiangning Qin, in
particular, has had a direct impact on the writing of several chapters in this textbook.
I am very grateful to Dr. Lon Water (Maui, Hawaii), Professors Pu Chen (China),
S.D. Rajan (Arizona), B.D. Belegundu (Pennsylvania), J.S. Arora (Iowa), Dr. Brad
Maker (California), Dr. Gerald Goudreau (California), Dr. Esmond Ng (Lawrence
Berkeley Laboratory, California) and Mr. Maurice Sancer (California) for their
enthusiasm for and support of many topics discussed in this book.
My appreciation also goes to several of my former doctoral students, such as Dr.
T.K. Agarwal, Dr. John Zhang, Dr. Majdi Baddourah, Dr. Al-Nasra, and Dr. H.
Runesha who have worked with me for several years. Some of their research
contributions have been included in this book.
In addition, I would like to thank my colleagues at Old Dominion University (ODU)
and Hong Kong University of Science and Technology (HKUST) for their support,
collaborative work, and friendship. Among them are Professor A. Osman Akan, Professor
Isao Ishibashi, Professor Chuh Mei, and Professor Zia Razzaq at ODU, and Professor T. Y.
Paul Chang and Professor Pin Tong at HKUST. Substantial portions of this textbook
were completed during my sabbatical leave (January 1 - December 30, 1996) from
O.D.U., which included a period of work at HKUST (February 22 - August 22, 1996).
The successful publication and smooth production of this book are due to the skillful
ODU office support staff, Mrs. Sue Smith, Mrs. Mary Carmone, and Mrs. Deborah L.
Miller, and to the efficient management and careful supervision of Mr. Tom Cohn, Ms. Ana
Bozicevic, and Mr. Felix Portnoy, Editors of Kluwer/Plenum Publishing Corporation.
Special thanks go to Ms. Catherine John, from Academic Press (AP) Ltd., London,
UK for allowing us to reproduce some materials from the AP textbook "Sparse Matrix
Technology," (by Sergio Pissanetzky) for discussions in Chapter 10 (Tables 10.2 and
10.5) of our book.
Last but not least, I would like to thank my parents (Mr. Dac K. Nguyen and Mrs.
Thinh T. Thai) and my family (Hang N. Nguyen, Eric N. D. Nguyen and Don N. Nguyen),
whose encouragement has been ever present.
Duc T. Nguyen
Norfolk, Virginia
Disclaimer of Warranty
We make no warranties, express or implied, that the programs contained in this
distribution are free of error, or that they will meet your requirements for any particular
application. They should not be relied on for solving a problem whose incorrect solution
could result in injury to a person or loss of property. The author and publisher disclaim
all liability for direct, indirect, or consequential damages resulting from use of the
programs or other materials presented in this book.
Contents
1. Introduction ..... 1
1.1 Parallel Computers ..... 1
1.2 Measurements for Algorithms' Performance ..... 2
1.3 Vector Computers ..... 3
1.4 Summary ..... 10
1.5 Exercises ..... 11
1.6 References ..... 11
2. Storage Schemes for the Coefficient Stiffness Matrix ..... 13
2.1 Introduction ..... 13
2.2 Full Matrix ..... 14
2.3 Symmetrical Matrix ..... 14
2.4 Banded Matrix ..... 14
2.5 Variable Banded Matrix ..... 14
2.6 Skyline Matrix ..... 15
2.7 Sparse Matrix ..... 16
2.8 Detailed Procedures For Determining The Mapping Between 2-D Array and 1-D Array in Skyline Storage Scheme ..... 17
2.9 Determination of the Column Height (ICOLH) of a Finite Element Model ..... 19
2.10 Computer Implementation For Determining Column Heights ..... 23
2.11 Summary ..... 25
2.12 Exercises ..... 26
2.13 References ..... 26
3. Parallel Algorithms for Generation and Assembly of Finite Element Matrices ..... 27
3.1 Introduction ..... 27
3.2 Conventional Algorithm to Generate and Assemble Element Matrices ..... 27
3.3 Node-by-Node Parallel Generation and Assembly Algorithms ..... 29
3.4 Additional Comments on Baddourah-Nguyen's (Node-by-Node) Parallel Generation and Assembly (G&A) Algorithm ..... 37
3.5 Application of Baddourah-Nguyen's Parallel G&A Algorithm ..... 38
3.6 Qin-Nguyen's G&A Algorithm ..... 41
3.7 Applications of Qin-Nguyen's Parallel G&A Algorithm ..... 46
3.8 Summary ..... 48
3.9 Exercises ..... 49
3.10 References ..... 50
4. Parallel-Vector Skyline Equation Solver on Shared Memory Computers ..... 51
4.1 Introduction ..... 51
4.2 Choleski-based Solution Strategies ..... 51
4.3 Factorization ..... 52
4.3.1 Basic sequential skyline Choleski factorization: computer code (version 1) ..... 55
4.3.2 Improved basic sequential skyline Choleski factorization: computer code (version 2) ..... 59
4.3.3 Parallel-vector Choleski factorization (version 3) ..... 60
4.3.4 Parallel-vector (with "few" synchronization checks) Choleski factorization (version 4) ..... 64
4.3.5 Parallel-vector enhancement (vector unrolling) Choleski factorization (version 5) ..... 66
4.3.6 Parallel-vector (unrolling) skyline Choleski factorization (version 6) ..... 69
4.4 Solution of Triangular Systems ..... 72
4.4.1 Forward solution ..... 72
4.4.2 Backward solution ..... 78
4.5 Force: A Portable, Parallel FORTRAN Language ..... 81
4.6 Evaluation of Methods on Example Problems ..... 82
4.7 Skyline Equation Solver Computer Program ..... 86
4.8 Summary ..... 86
4.9 Exercises ..... 87
4.10 References ..... 88
5. Parallel-Vector Variable Bandwidth Equation Solver on Shared Memory Computers ..... 91
5.1 Introduction ..... 91
5.2 Data Storage Schemes ..... 91
5.3 Basic Sequential Variable Bandwidth Choleski Method ..... 96
5.4 Vectorized Choleski Code with Loop Unrolling ..... 101
5.5 More on Force: A Portable, Parallel FORTRAN Language ..... 103
5.6 Parallel-Vector Choleski Factorization ..... 103
5.7 Solution of Triangular Systems ..... 108
5.7.1 Forward solution ..... 109
5.7.2 Backward solution ..... 112
5.8 Relations Amongst the Choleski, Gauss and LDU Factorizations ..... 115
5.8.1 Choleski (U^T U) factorization ..... 115
5.8.2 Gauss (with diagonal terms Ljj = 1) LU factorization ..... 117
5.8.3 Gauss (LU) factorization with diagonal terms Ujj = 1 ..... 118
5.8.4 LDL^T factorization with diagonal term Ljj = 1 ..... 120
5.8.5 Similarities of Choleski and Gauss methods ..... 122
5.9 Factorization Based Upon "Look Backward" Versus "Look Forward" Strategies ..... 123
5.10 Evaluation of Methods For Structural Analyses ..... 129
5.10.1 High speed research aircraft ..... 130
5.10.2 Space shuttle solid rocket booster (SRB) ..... 131
5.11 Descriptions of Parallel-Vector Subroutine PVS ..... 134
5.12 Parallel-Vector Equation Solver Subroutine PVS ..... 136
5.13 Summary ..... 137
5.14 Exercises ..... 138
5.15 References ..... 139
6. Parallel-Vector Variable Bandwidth Out-of-Core Equation Solver ..... 141
6.1 Introduction ..... 141
6.2 Out-of-Core Parallel/Vector Equation Solver (version 1) ..... 141
6.2.1 Memory usage and record length ..... 142
6.2.2 Asynchronous input/output on Cray computers ..... 144
6.2.3 Brief summary for parallel-vector incore equation solver on the Cray Y-MP ..... 145
6.2.4 Parallel-vector out-of-core equation solver on the Cray Y-MP ..... 146
6.3 Out-of-Core Vector Equation Solver (version 2) ..... 149
6.3.1 Memory usage ..... 149
6.3.2 Vector out-of-core equation solver on the Cray Y-MP ..... 149
6.4 Out-of-Core Vector Equation Solver (version 3) ..... 155
6.5 Application ..... 157
6.5.1 Version 1 performance ..... 157
6.6 Summary ..... 162
6.7 Exercises ..... 163
6.8 References ..... 163
7. Parallel-Vector Skyline Equation Solver for Distributed Memory Computers ..... 165
7.1 Introduction ..... 165
7.2 Parallel-Vector Symmetrical Equation Solver ..... 165
7.2.1 Basic symmetrical equation solver ..... 165
7.2.2 Parallel-vector performance improvement in decomposition ..... 166
7.2.3 Communication performance improvement in factorization ..... 176
7.2.4 Forward/backward elimination ..... 177
7.3 Numerical Results and Discussions ..... 181
7.4 FORTRAN Call Statement to Subroutine Node ..... 185
7.5 Summary ..... 187
7.6 Exercises ..... 188
7.7 References ..... 188
8. Parallel-Vector Unsymmetrical Equation Solver ..... 191
8.1 Introduction ..... 191
8.2 Parallel-Vector Unsymmetrical Equation Solution Algorithms ..... 191
8.2.1 Basic unsymmetric equation solver ..... 191
8.2.2 Detailed derivation for the [L] and [U] matrices ..... 193
8.2.3 Basic algorithm for decomposition of "full" bandwidth/column heights unsymmetrical matrix ..... 194
8.2.4 Basic algorithm for decomposition of "variable" bandwidths/column heights unsymmetrical matrix ..... 198
8.2.5 Algorithms for decomposition of "variable" bandwidths/column heights unsymmetrical matrix with unrolling strategies ..... 199
8.2.6 Parallel-vector algorithm for factorization ..... 200
8.2.7 Forward solution phase [L]{y} = {b} ..... 202
8.2.8 Backward solution phase [U]{x} = {y} ..... 204
8.3 Numerical Evaluations ..... 206
8.4 A Few Remarks On Pivoting Strategies ..... 211
8.5 A FORTRAN Call Statement to Subroutine UNSOLVER ..... 212
8.6 Summary ..... 214
8.7 Exercises ..... 214
8.8 References ..... 216
9. A Tridiagonal Solver for Massively Parallel Computers ..... 217
9.1 Introduction ..... 217
9.2 Basic Sequential Solution Procedures for Tridiagonal Equations ..... 217
9.3 Cyclic Reduction Algorithm ..... 221
9.4 Parallel Tridiagonal Solver by Using Divided and Conquered Strategies ..... 226
9.5 Parallel Factorization Algorithm for Tridiagonal System of Equations Using Separators ..... 229
9.6 Forward and Backward Solution Phases ..... 236
9.6.1 Forward solution phase: [L]{z} = {y} ..... 236
9.6.2 Backward solution phase: [U]{x} = {z} ..... 238
9.7 Comparisons between Different Algorithms ..... 239
9.8 Numerical Results ..... 240
9.9 A FORTRAN Call Statement To Subroutine Tridiag ..... 241
9.10 Summary ..... 244
9.11 Exercises ..... 244
9.12 References ..... 245
10. Sparse Equation Solver with Unrolling Strategies ..... 247
10.1 Introduction ..... 247
10.2 Basic Equation Solution Algorithms ..... 248
10.2.1 Choleski algorithm ..... 248
10.2.2 LDU algorithm ..... 249
10.3 Storage Schemes for the Coefficient Stiffness Matrix ..... 252
10.4 Reordering Algorithms ..... 254
10.5 Sparse Symbolic Factorization ..... 255
10.6 Sparse Numerical Factorization ..... 271
10.7 Forward and Backward Solutions ..... 278
10.7.1 Forward substitution phase ..... 279
10.7.2 Backward substitution phase ..... 279
10.8 Sparse Solver with Improved Strategies ..... 280
10.8.1 Finding master (or super) degree-of-freedom (dof) ..... 280
10.8.2 Sparse matrix (with unrolling strategies) times vector ..... 281
10.8.3 Modifications for the chained list array ICHAINL(-) ..... 288
10.8.4 Sparse numerical factorization with unrolling strategies ..... 289
10.8.5 Out-of-core sparse equation solver with unrolling strategies ..... 299
10.9 Numerical Performance of the Developed Sparse Equation Solver ..... 301
10.10 FORTRAN Call Statement to SPARSE Equation Solver ..... 306
10.11 Summary ..... 308
10.12 Exercises ..... 308
10.13 References ..... 309
11. Algorithms for Sparse-Symmetrical-Indefinite and Sparse-Unsymmetrical System of Equations ..... 311
11.1 Introduction ..... 311
11.2 Basic Formulation for Indefinite System of Linear Equations ..... 311
11.3 Rotation Matrix [R] Strategies ..... 318
11.4 Natural 2 x 2 Pivoting ..... 323
11.5 Switching Row(s) and Column(s) During Factorization ..... 325
11.6 Simultaneously Performing Symbolic and Numerical Factorization ..... 329
11.7 Restart Memory Managements ..... 329
11.8 Major Step-by-Step Procedures for Mixed Look Forward/Backward, Sparse LDL^T Factorization, Forward and Backward Solution With 2 x 2 Pivoting Strategies ..... 331
11.9 Numerical Evaluations ..... 332
11.10 Some Remarks on Unsymmetrical-Sparse System of Linear Equations ..... 334
11.11 Summary ..... 338
11.12 Exercises ..... 338
11.13 References ..... 338
Index ..... 341

