
Computational Methods in Applied Sciences
Sergio R. Idelsohn, Editor
Numerical Simulations of Coupled Problems in Engineering
Computational Methods in Applied Sciences
Volume 33
Series editor
E. Oñate
International Center for Numerical Methods in Engineering (CIMNE)
Technical University of Catalonia (UPC)
Edificio C-1, Campus Norte UPC
Gran Capitán, s/n
08034 Barcelona, Spain
e-mail: [email protected]
url: http://www.cimne.com
For further volumes:

http://www.springer.com/series/6899
Editor
Sergio R. Idelsohn
International Center for Numerical
Methods in Engineering (CIMNE)
Catalan Institution for Research
and Advanced Studies (ICREA)
Barcelona
Spain
ISSN 1871-3033
ISBN 978-3-319-06135-1 ISBN 978-3-319-06136-8 (eBook)
DOI 10.1007/978-3-319-06136-8
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2014939045
© Springer International Publishing Switzerland 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed. Exempted from this legal reservation are brief
excerpts in connection with reviews or scholarly analysis or material supplied specifically for the
purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the
work. Duplication of this publication or parts thereof is permitted only under the provisions of
the Copyright Law of the Publisher's location, in its current version, and permission for use must
always be obtained from Springer. Permissions for use may be obtained through RightsLink at the
Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface
This book contains state-of-the-art contributions in the field of Coupled Problems
in Engineering. Each chapter was written by a selected specialist as an extended
version of the paper presented at the conference Fifth Computational Methods for
Coupled Problems in Science and Engineering, held in Ibiza in July 2013. The
Conference brought together more than 400 participants from 41 countries and was
dedicated to celebrating the 60th birthday of Prof. Eugenio Oñate.
The Conference was included as one of the Thematic Conferences of the
European Community on Computational Methods in Applied Sciences (ECCOMAS)
and a Special Interest Conference of the International Association for
Computational Mechanics (IACM). It was also supported by other scientific
organizations in Europe and worldwide.
This book contains 16 chapters written by distinguished authors, who present
and discuss mathematical models, numerical methods, and computational techniques
for solving Coupled Problems of multidisciplinary character. The goal of
this book is to take a step forward in the formulation and solution of real-life
problems with a multidisciplinary vision, accounting for all the complex couplings
involved in the physical description of the problem.
Topics treated in the various chapters include developments and applications of
Coupled Problems in a wide variety of situations such as Non-Linear Materials,
Cardiovascular Fluid Mechanics, Multi-Fluid Flows, or Fluid-Structure Interactions,
using different techniques like particle methods, reduced order models or
partitioned parallelization techniques.
This book includes contributions submitted directly by authors. The editor
cannot accept responsibility for any inaccuracies, comments, and opinions
contained in the text.
The editor would like to take this opportunity to thank all authors for submitting
their excellent contributions on time. Many thanks also to ECCOMAS and
Springer for accepting the publication of this book in the series Computational
Methods in Applied Sciences.
Sergio R. Idelsohn
Contents
Part I Non-Linear Materials in Coupled Problems
Generalized Viscoplasticity Based on Overstress (GVBO) for Large Strain Single-Scale and Multiscale Analyses
Vasilina Filonova, Yang Liu and Jacob Fish
Numerical Simulation of Double Cup Extrusion Test Using the Arbitrary Lagrangian Eulerian Formalism
Romain Boman, Roxane Koeune and Jean-Philippe Ponthot
Part II Cardiovascular Fluid Mechanics
Simplified Fluid-Structure Interactions for Hemodynamics
Olivier Pironneau
Patient-Specific Cardiovascular Fluid Mechanics Analysis with the ST and ALE-VMS Methods
Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Christopher C. Long, Alison L. Marsden and Kathleen Schjodt
Part III Particle Methods in Coupled Problems
Direct Numerical Simulation of Particulate Flows Using a Fictitious Domain Method
Bircan Avci and Peter Wriggers
A Particle Finite Element Method (PFEM) for Coupled Thermal Analysis of Quasi and Fully Incompressible Flows and Fluid-Structure Interaction Problems
Eugenio Oñate, Alessandro Franci and Josep M. Carbonell
Numerical Simulation and Visualization of Material Flow in Friction Stir Welding via Particle Tracing
N. Dialami, M. Chiumenti, M. Cervera, C. Agelet de Saracibar, J. P. Ponthot and P. Bussetta
Some Considerations on Surface Condition of Solid in Computational Fluid-Structure Interaction
Masao Yokoyama, Kohei Murotani, Genki Yagawa and Osamu Mochizuki
Part IV Reduced-Order Models
Reduced-Order Modelling Strategies for the Finite Element Approximation of the Incompressible Navier-Stokes Equations
Joan Baiges, Ramon Codina and Sergio R. Idelsohn
A Survey of Hierarchical Model (Hi-Mod) Reduction Methods for Elliptic Problems
Simona Perotto
Part V Multifluid Flows
On the Application of Two-Fluid Flows Solver to the Casting Problem
K. Kamran, R. Rossi, P. Dadvand and S. R. Idelsohn
Recent Advances in the Particle Finite Element Method Towards More Complex Fluid Flow Applications
Norberto M. Nigro, Juan M. Gimenez and Sergio R. Idelsohn
Part VI Fluid-Structure Interactions Problems
Computational Engineering Analysis and Design with ALE-VMS and ST Methods
Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Ming-Chen Hsu, Ole Øiseth, Kjell M. Mathisen, Nikolay Kostov and Spenser McIntyre
Computational Wind-Turbine Analysis with the ALE-VMS and ST-VMS Methods
Yuri Bazilevs, Kenji Takizawa, Tayfun E. Tezduyar, Ming-Chen Hsu, Nikolay Kostov and Spenser McIntyre
Part VII Partitioned Method and Parallelization Techniques
Scaling Up Multiphysics
Rainald Löhner and Joseph D. Baum
Partitioned Solution of Coupled Stochastic Problems
Mohammad Hadigol, Alireza Doostan, Hermann G. Matthies and Rainer Niekamp
Part I
Non-Linear Materials in Coupled Problems
Generalized Viscoplasticity Based on Overstress (GVBO) for Large Strain Single-Scale and Multiscale Analyses
Vasilina Filonova, Yang Liu and Jacob Fish
Abstract A generalized viscoplasticity based on overstress (GVBO) model that
successfully reproduces large strain and strain rate experimental data has been developed.
The GVBO model confirmed the increased shear resistance of polyurea at very high
strain rates ($10^5$–$10^6$ s$^{-1}$) observed in the experiments. With the proposed GVBO
model, we conducted numerical simulations of a fragment simulating projectile impacting
polyurea/high-hard-steel bi-layers at high impact velocities (>1000 m/s). For
model validation, two positions of the polymer coating with respect to the steel plate
have been considered: the front and back coating, with the front side being the target.
Numerical impact simulations utilizing a single-scale GVBO model predicted that a
polyurea bi-layer with a front coating increases penetration velocity by about 15.4 %
(against 23 % observed in the experiments), while the steel plate with a back coating
raises penetration velocity by 7.5 % (as opposed to 8.8 % in the experiment)
in comparison to the ballistic limit of the blank steel plate. This minor discrepancy
between experimental and simulation results is qualitatively attributed to
space-time multiscale phenomena. We show that a possible anisotropy introduced by
material heterogeneity increases the resistance of the polyurea layer by partially
transforming the pressure wave into a dissipated shear wave. We further demonstrate that
dispersion enhances the energy absorption of the polyurea coating.
Keywords Polyurea · Viscoplasticity model based on the overstress · Multiscale · Anisotropy · Dispersion
V. Filonova · Y. Liu · J. Fish (corresponding author)
Columbia University, New York, USA
S. R. Idelsohn (ed.), Numerical Simulations of Coupled Problems in Engineering,
Computational Methods in Applied Sciences 33, DOI: 10.1007/978-3-319-06136-8_1
1 Introduction
The present manuscript focusses on single-scale and multiscale modeling of copolymers.
A copolymer is a polymer derived from two (or more) monomeric species, as
opposed to a homopolymer where only one monomer is used. Polyurea, which is a
generic term for a block copolymer, consists of homopolymer subunits linked by
covalent bonds. The union of the homopolymer subunits may require an intermediate
non-repeating subunit, known as a junction block. Polyurea exhibits a wide range of
mechanical properties, from soft rubber to hard plastic, depending on the chemistry.
This range of properties, together with the rapid reaction of polyurea, has led to many
applications as coatings, for example on tunnels, bridges, roofs, parking decks, storage
tanks, freight ships, truck beds, etc. Polyurea coatings have been applied to military
armor to increase its resistance to ballistic penetration [1].
Many copolymer models [2–6] are derived from the standard linear solid (SLS) model,
which is capable of predicting both creep and relaxation, as shown in Fig. 1.
Arm 1 in Fig. 1 is a hyperelastic arm, whereas the bottom arm in Fig. 1 is the
Maxwell arm consisting of a dashpot and a hyperelastic spring connected in series.
For large strain problems [5, 6], the deformation gradient in the bottom arm,
$F_2 = F_1 \equiv F$, which is identical to the deformation gradient in the top arm, is
decomposed into elastic and inelastic deformation gradients
$$F_2 = F_2^{e} F_2^{in} \tag{1}$$
The elastic response in the hyperelastic arm is assumed to obey the Neo-Hookean law,
whereas the stress in the Maxwell arm 2 depends on the elastic left stretch tensor
$V_2^{e}$ and the pressure $p_2$ in the Maxwell arm, i.e. $\sigma_2 = f(V_2^{e}, p_2)$.
For more details we refer to [5, 6]. These models, however, have been considered only
for strain rates of up to $10^4$ s$^{-1}$.
The primary objectives of the present chapter are as follows:
• Develop a block copolymer constitutive model for large strains and strain rates
  in the range of $10^5$–$10^6$ s$^{-1}$. Rather than utilizing the framework of
  multiplicative decomposition (1), we will pursue an additive decomposition of the rate of
  deformation
  $$D = D^{e} + D^{in} \tag{2}$$
  The evolution of the inelastic response will follow the framework of the VBO
  model [7, 8], with the only exception that the elastic constants and viscosity will be
  assumed to be deformation-dependent. While there are a number of available
  models of polyurea in the literature [9], to our knowledge, there is no published
  work utilizing the framework of the viscoplasticity model based on overstress (VBO)
  to describe the exceedingly large strains and strain rates observed in the experiments.
  In the following we will refer to the generalization of the VBO model to large
  strains and strain rates as the GVBO.
• Validate the proposed GVBO constitutive model. We will consider recent experiments
  conducted at Brown University (impact of a steel flyer on a steel/polyurea/steel
  sandwich plate under high shear strain rates) [10–12].
• Validate the GVBO model on the structural impact problem. We will consider the
  impact of the projectile onto the polyurea/steel bi-layer where a polymer layer is
  placed on the front or back of the plate. We will verify the experimental observation
  suggesting a considerable advantage of placing the polyurea layer on the top of the
  target plate at very high impact velocities (>1000 m/s) [13–15]. It is noteworthy
  that for low to moderate impact velocities the back polyurea coating
  of the plate shows better resistance than the front coating [16].
• The last part of the manuscript will consider space-time multiscale analysis of the
  block copolymer to explain some of the discrepancies of the single-scale GVBO
  model. The polyurea microstructure will be modeled as a two-phase heterogeneous
  anisotropic material with the inclusion phase being elastic and the soft domains
  obeying the GVBO constitutive model. The multiple temporal scale effect, which
  gives rise to dispersion, will be taken into account using the recently developed
  micro-inertia approach [17].
Fig. 1 Rheological model of the polymer. Arm 1: spring model; arm 2: Maxwell arm consisting of spring and dashpot
2 Generalized Viscoplasticity Based on Overstress (GVBO) for Large Strains
In this section we present the generalized viscoplasticity model based on overstress
(GVBO) for large strain and strain rate problems. First, in Sect. 2.1, the model is
stated in the corotational (or rotation-free) frame. Subsequently, in Sects. 2.2 and
2.3, we develop a deformation-dependent elastic modulus and viscosity function for
the classical VBO model [7, 8].
2.1 VBO for Large Rotations
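We start from the constitutive equation based on the additive decomposition of the
rate of deformation
$$\overset{\nabla}{\boldsymbol{\sigma}} = \mathbf{L} : \left(\mathbf{D} - \mathbf{D}^{in}\right) \tag{3}$$
where $\overset{\nabla}{\boldsymbol{\sigma}}$ is an objective Cauchy stress rate; $\mathbf{D}$ is the rate of deformation tensor, and
$\mathbf{D}^{in}$ is the inelastic rate of deformation. The elastic constitutive tensor $\mathbf{L}$ for an isotropic
material is given by
$$\mathbf{L} = \lambda\,\mathbf{1}\otimes\mathbf{1} + 2G\,\mathbf{I} \tag{4}$$
where $\lambda$ is the first Lamé constant, and $G$ is the shear modulus.
To account for large rotations, we consider the Green-Naghdi rate of the Cauchy
stress, which describes the material response in the initial rotation-free frame. The
Green-Naghdi rate of the Cauchy stress is defined by
$$\overset{\nabla}{\boldsymbol{\sigma}} = \dot{\boldsymbol{\sigma}} + \boldsymbol{\sigma}\cdot\dot{\mathbf{R}}\cdot\mathbf{R}^{T} - \dot{\mathbf{R}}\cdot\mathbf{R}^{T}\cdot\boldsymbol{\sigma} \tag{5}$$
The flow rule for the case of finite rotations but small strains is given by
$$\mathbf{d} = \mathbf{d}^{e} + \mathbf{d}^{in} = \frac{1+\nu}{E}\,\overset{\nabla}{\mathbf{s}} + \frac{3}{2}\,\frac{\mathbf{s}-\mathbf{g}}{E k} \tag{6}$$
where $\mathbf{d}$ is the deviatoric part of the rate of deformation tensor $\mathbf{D}$, decomposed into
elastic and inelastic parts, $\mathbf{d}^{e}$ and $\mathbf{d}^{in}$, respectively; $\mathbf{s}$ is the deviatoric Cauchy stress
tensor; $\mathbf{g}$ is the deviatoric part of the equilibrium stress; $E$ is Young's modulus;
$\nu$ is Poisson's ratio, and $k$ is the viscosity function defined as
$$k = k_1\left(1 + \frac{\Gamma}{k_2}\right)^{-k_3} \tag{7}$$
where $k_1$, $k_2$, $k_3$ are material constants. The overstress invariant $\Gamma$ is defined by
$$\Gamma = \sqrt{\frac{2}{3}\,(\mathbf{s}-\mathbf{g}) : (\mathbf{s}-\mathbf{g})} \tag{8}$$
The evolution law for the equilibrium stress is given by
$$\overset{\nabla}{\mathbf{g}} = \frac{\psi}{E}\left[\overset{\nabla}{\mathbf{s}} + \frac{\mathbf{s}-\mathbf{g}}{k} - \frac{\Gamma}{k}\,\frac{(\mathbf{g}-\mathbf{f})}{A}\right] + \left(1 - \frac{\psi}{E}\right)\overset{\nabla}{\mathbf{f}} \tag{9}$$
where $\psi$ is the shape function defined as
$$\psi = a_1 + (a_2 - a_1)\exp(-a_3\Gamma) \tag{10}$$
with $a_1$, $a_2$, $a_3$ being material constants.
The evolution law of the kinematic stress function $\mathbf{f}$ is defined as
$$\overset{\nabla}{\mathbf{f}} = \bar{E}_t\,\frac{\mathbf{s}-\mathbf{g}}{E k}; \qquad \bar{E}_t \equiv \frac{2 E_t}{3\left(1 - E_t/E\right)} \tag{11}$$
and that of the isotropic stress function $A$ as
$$\dot{A} = A_c\left(A_f - A\right)\frac{\Gamma}{E k}; \qquad A[t=0] = A_0 \tag{12}$$
where $A_c$, $A_f$ are material constants.
For more details about the classical VBO model see [7, 8].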
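The scalar ingredients of the classical VBO model can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the overstress invariant is taken as an input rather than computed, and the negative exponents in the viscosity and shape functions are assumptions based on the usual form of the VBO model [7, 8].

```python
import math

def shape_function(gamma, a1, a2, a3):
    """Shape function psi, Eq. (10); the decaying exponential is an assumed sign."""
    return a1 + (a2 - a1) * math.exp(-a3 * gamma)

def viscosity_function(gamma, k1, k2, k3):
    """Viscosity function k, Eq. (7); the negative exponent is an assumed sign."""
    return k1 * (1.0 + gamma / k2) ** (-k3)

def effective_tangent_modulus(E_t, E):
    """Modified tangent modulus appearing in the kinematic stress law, Eq. (11)."""
    return 2.0 * E_t / (3.0 * (1.0 - E_t / E))

def inelastic_rate_component(s, g, E, k):
    """One component of the inelastic rate of deformation, second term of Eq. (6)."""
    return 1.5 * (s - g) / (E * k)

if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    gamma = 0.47
    k = viscosity_function(gamma, k1=17.0, k2=0.47, k3=4.0)
    psi = shape_function(gamma, a1=0.0, a2=10.0, a3=0.0)
    print("k =", k, "psi =", psi)
    print("d_in =", inelastic_rate_component(s=30.0, g=10.0, E=278.0, k=k))
```

With these assumed signs the viscosity drops as the overstress grows, so the inelastic rate increases faster than linearly with overstress.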
2.2 Deformation-Dependent Elastic Constitutive Tensor
We introduce the following deformation-dependent Lamé constants related to the small
strain elastic constitutive tensor (4)
$$\lambda[J] = J^{-p}\lambda_0, \qquad G[J] = J^{-p}\left(G_0 - \lambda_0 \ln J\right) \tag{13}$$
where $\lambda_0$, $G_0$ are the initial (small deformation) values and $p$ is a material parameter;
the Jacobian is defined as usual by $J = \det(\mathbf{F})$, where $\mathbf{F}$ is the deformation gradient
tensor. The elastic constitutive tensor (4) is assumed to be a function of the Jacobian
$$\mathbf{L}[J] = \lambda[J]\,\mathbf{1}\otimes\mathbf{1} + 2G[J]\,\mathbf{I} \tag{14}$$
The initial elastic properties $E_0$, $\nu_0$ are related to the initial Lamé constants by
$$\lambda_0 = \frac{E_0\nu_0}{(1+\nu_0)(1-2\nu_0)}; \qquad G_0 = \frac{E_0}{2(1+\nu_0)} \tag{15}$$
The deformation-dependent Young's modulus and Poisson's ratio are related to the
deformation-dependent Lamé parameters (13) by
$$E[J] = \frac{G[J]\left(3\lambda[J] + 2G[J]\right)}{\lambda[J] + G[J]}; \qquad \nu[J] = \frac{\lambda[J]}{2\left(\lambda[J] + G[J]\right)} \tag{16}$$
The elastic parameters are allowed to increase only from their initial values: $G \geq G_0$,
$\lambda \geq \lambda_0$, $E \geq E_0$; i.e. if the elastic parameters decrease based on Eq. (13), the initial
values are taken instead.
2.3 The Deformation-Dependent Viscosity Function
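We assume the viscosity $k$ to be a function of the overstress and the Jacobian as follows
$$k[\Gamma, J] = k_1\left(1 + \frac{\Gamma}{\hat{k}_2[J]}\right)^{-k_4}; \qquad \hat{k}_2[J] \equiv k_2 + k_3\,\frac{E[J] - E_0}{E_m - E_0} \tag{17}$$
where $k_1$, $k_2$, $k_3$, $k_4$, $E_m$ are material parameters. Note that $\hat{k}_2 = k_2$ is constant when
Young's modulus is constant, i.e. when the deformation is small. The parameter $E_m$ is a
maximum Young's modulus and $k_3$ defines the sensitivity to large deformation.
The generalized VBO model for the finite deformation theory is summarized in
Table 1.
Table 1 Generalized VBO model summary
GVBO parameters (15): $E_0$, $\nu_0$, $E_t$, $p$, $a_1$, $a_2$, $a_3$, $k_1$, $k_2$, $k_3$, $k_4$, $E_m$, $A_0$, $A_c$, $A_f$
Elastic moduli: $E[J] = \dfrac{G[J](3\lambda[J] + 2G[J])}{\lambda[J] + G[J]}$; $\lambda[J] = J^{-p}\lambda_0$; $G[J] = J^{-p}(G_0 - \lambda_0 \ln J)$; $\lambda_0 = \dfrac{E_0\nu_0}{(1+\nu_0)(1-2\nu_0)}$; $G_0 = \dfrac{E_0}{2(1+\nu_0)}$
Flow law: $\mathbf{d} = \dfrac{1+\nu}{E[J]}\,\overset{\nabla}{\mathbf{s}} + \dfrac{3}{2}\,\dfrac{\mathbf{s}-\mathbf{g}}{E[J]\,k[\Gamma,J]}$
Equilibrium stress evolution law: $\overset{\nabla}{\mathbf{g}} = \dfrac{\psi}{E[J]}\left[\overset{\nabla}{\mathbf{s}} + \dfrac{\mathbf{s}-\mathbf{g}}{k[\Gamma,J]} - \dfrac{\Gamma}{k[\Gamma,J]}\,\dfrac{(\mathbf{g}-\mathbf{f})}{A}\right] + \left(1 - \dfrac{\psi}{E[J]}\right)\overset{\nabla}{\mathbf{f}}$
Kinematic stress evolution law: $\overset{\nabla}{\mathbf{f}} = \bar{E}_t[J]\,\dfrac{\mathbf{s}-\mathbf{g}}{E[J]\,k[\Gamma,J]}$; $\bar{E}_t[J] \equiv \dfrac{2E_t}{3(1 - E_t/E[J])}$
Isotropic stress function evolution law: $\dot{A} = A_c(A_f - A)\dfrac{\Gamma}{E[J]\,k[\Gamma,J]}$; $A(t=0) = A_0$
Shape function: $\psi = a_1 + (a_2 - a_1)\,e^{-a_3\Gamma}$
Viscosity function: $k[\Gamma,J] = k_1\left(1 + \dfrac{\Gamma}{\hat{k}_2[J]}\right)^{-k_4}$; $\hat{k}_2[J] \equiv k_2 + k_3\,\dfrac{E[J] - E_0}{E_m - E_0}$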
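The deformation-dependent moduli of Sect. 2.2 and the viscosity function of Sect. 2.3 can be illustrated by the following minimal Python sketch. It is not the authors' user material routine: the function names are hypothetical, the negative exponents $J^{-p}$ and $-k_4$ are assumed signs (so that stiffness grows under compression and viscosity drops with overstress), and the numerical values are those listed for polyurea in Table 5.

```python
import math

# Polyurea parameters as listed in Table 5 (MPa where applicable).
E0, NU0, P = 278.0, 0.491, 6.0
K1, K2, K3, K4, EM = 17.0, 0.47, 1.47, 4.0, 13600.0

def initial_lame_constants(E0, nu0):
    """Initial Lame constants from E0 and nu0, Eq. (15)."""
    lam0 = E0 * nu0 / ((1.0 + nu0) * (1.0 - 2.0 * nu0))
    G0 = E0 / (2.0 * (1.0 + nu0))
    return lam0, G0

def gvbo_moduli(J, E0=E0, nu0=NU0, p=P):
    """Deformation-dependent moduli, Eqs. (13) and (16).

    The exponent -p is an assumed sign. Values are never allowed to fall
    below their initial values, as stated at the end of Sect. 2.2.
    """
    lam0, G0 = initial_lame_constants(E0, nu0)
    lam = max(J ** (-p) * lam0, lam0)
    G = max(J ** (-p) * (G0 - lam0 * math.log(J)), G0)
    E = max(G * (3.0 * lam + 2.0 * G) / (lam + G), E0)
    nu = lam / (2.0 * (lam + G))
    return lam, G, E, nu

def gvbo_viscosity(gamma, E, E0=E0, k1=K1, k2=K2, k3=K3, k4=K4, Em=EM):
    """Viscosity function, Eq. (17); the exponent -k4 is an assumed sign."""
    k2_hat = k2 + k3 * (E - E0) / (Em - E0)
    return k1 * (1.0 + gamma / k2_hat) ** (-k4)

if __name__ == "__main__":
    for J in (1.0, 0.95, 0.9, 0.8):
        lam, G, E, nu = gvbo_moduli(J)
        print(f"J={J:4.2f}  E={E:10.1f} MPa  nu={nu:.4f}  k(Gamma=1)={gvbo_viscosity(1.0, E):.4f}")
```

At $J = 1$ the sketch returns the initial values $E_0$ and $\nu_0$, and $\hat{k}_2$ reduces to $k_2$, which is the small-deformation limit noted after Eq. (17).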
3 Model Validation
3.1 Experimental Setup
The experiments investigating the resistance of the elastomer to shearing and failure at
extreme loading conditions were conducted by Clifton and coworkers [10–12]. The
polyurea is cast between two steel plates. The flyer impacts the sandwich plate at high
velocity. For pressure-shear tests, the flyer is slightly inclined at an angle $\theta = 18°$ to
produce a shear response of the polyurea, as shown in Fig. 2. We consider two
pressure-shear experiments: Shot#1201 and Shot#404 at shear-strain rates of $4.1 \times 10^5$ s$^{-1}$
and $2.4 \times 10^5$ s$^{-1}$, respectively. The main difference between the two shots is the impact
velocity (see Table 3), which is higher for Shot#1201. In addition, we consider a pure
compression experiment, Shot#1203, which is required to identify the pure compression
stress-strain relation. The geometry, material properties and impact velocities
are described in Tables 2 and 3 (for details see references [10, 11]).
Fig. 2 Experiment setting [10]
Table 2 Flyer and plate steel material parameters
Shot No.     Steel           Density (g/cm$^3$)   Young's modulus (MPa)   Poisson's ratio
1201, 1203   Pure WC         15.4                 609100                  0.2
404          Hampden steel   7.861                213700                  0.29
Table 3 Thickness of the flyer, sandwich plates, and impact velocity for the experiments
Shot No.   Sample polyurea (mm)   Front plate (mm)   Rear plate (mm)   Flyer (mm)   Impact velocity $V_0$ (m/s)   Angle $\theta$ (°)
1201       0.097                  3.582              5.578             7.411        183.45                        18
1203       0.089                  3.588              5.898             10.5         175                           0
404        0.11                   2.896              7.041             6.991        112.6                         18
3.2 Numerical Simulation
For numerical simulations we consider an idealized model: a narrow longitudinal
strip (in 3D) of a sandwich and a flyer, as shown in Fig. 3. Such an idealization was
originally proposed in reference [18]. The polyurea is modeled by one 8-node hexahedral
solid element. The flyer and the steel plates were meshed by 8-node hexahedral
solid elements and modeled as elastic material with the parameters described in Table 2.
The boundary and initial conditions are listed in Table 4. The calibrated material
parameters of the GVBO model of polyurea are listed in Table 5. The parameter $E_m$
is taken to be the maximum Young's modulus observed in Shot#404. The parameters
$k_1, \ldots, k_4$ and $p$ are calibrated against the experimental data from Shot#1201 and
Shot#404. Note that we use the same set of polyurea parameters for both experiments,
Shot#1201 and Shot#404.
We also consider cohesive elements at the interfaces between the polymer layer
and the steel. The cohesive element is described by an anisotropic traction-separation
law. The damage initiates at a maximum effective stress, and the total damage corresponds
to the maximum effective displacement, as shown in Table 6. We use different
parameters for the cohesive layer in the experiments Shot#1201 and Shot#404 since
the steel plates are different. The presence of the cohesive layer does not significantly
influence the numerical results.
Fig. 3 Simplified numerical simulation model [18]
Table 4 Boundary and initial conditions considered in numerical simulation
Flyer plate               Initial velocity: $V_1 = V_0 \cos\theta$, $V_2 = V_0 \sin\theta$
                          Constraints: $u_{2\_top} = u_{2\_bottom}$; $u_3 = 0$
Sandwich plate assembly   Constraints: $u_1$ = free; $u_{2\_top} = u_{2\_bottom}$; $u_3 = 0$
Table 5 GVBO material parameters for polyurea
Density (g/cm$^3$)   $E_0$ (MPa)   $\nu_0$   $E_t$ (MPa)   $p$   $a_1$ (MPa)   $a_2$ (MPa)   $a_3$ (1/MPa)
1.1                  278           0.491     23.3          6     0             10            0
$k_1$   $k_2$ (MPa)   $k_3$ (MPa)   $k_4$   $E_m$ (MPa)   $A_0$ (MPa)   $A_c$ (s$^{-1}$)   $A_f$ (MPa)
17      0.47          1.47          4       13600         1             0                  0
Table 6 Material parameters for cohesive interface modeled by traction-separation law
Shot No.     Density (g/cm$^3$)   Elastic properties (MPa)    Maximum nominal stress (MPa)   Maximum effective displacement (mm)
                                  $E$      $G_1$    $G_2$     $S_n$    $S_1$    $S_2$
1201, 1203   1.56                 12000    14000    14000     6000     7000     7000         0.8
404          1.56                 175      175      175       120      120      120          0.8
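As a small worked example of the initial conditions in Table 4, the normal and transverse components of the flyer velocity can be computed from the impact velocity and inclination angle of each shot in Table 3. The sketch below is illustrative only; the function and dictionary names are hypothetical.

```python
import math

# Impact velocity V0 (m/s) and inclination angle (degrees) from Table 3.
SHOTS = {
    "1201": (183.45, 18.0),
    "1203": (175.0, 0.0),
    "404": (112.6, 18.0),
}

def flyer_velocity_components(v0, angle_deg):
    """Return (V1, V2) = (V0 cos(theta), V0 sin(theta)) as in Table 4."""
    theta = math.radians(angle_deg)
    return v0 * math.cos(theta), v0 * math.sin(theta)

if __name__ == "__main__":
    for shot, (v0, angle) in SHOTS.items():
        v1, v2 = flyer_velocity_components(v0, angle)
        print(f"Shot#{shot}: V1 = {v1:.2f} m/s, V2 = {v2:.2f} m/s")
```

For the pure compression Shot#1203 the angle is zero, so the transverse component vanishes and only the normal component drives the loading.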
The comparison of numerical simulation against experimental data of Shot#1201
is depicted in Figs. 4 and 5. The simulation results can be seen to be in good agreement
with the experiments except for the normal velocity, which behaves differently after the
first peak. This is most likely related to the arrival of the boundary wave and the use
of the idealized geometry considered in the simulations.
Fig. 4 Shear and compressive stress versus shear strain and time, respectively, for Shot#1201
Fig. 5 Transverse and normal velocity versus time for Shot#1201
It can be seen that for Shot#404 the numerical results are in good agreement with the
experimental results (see Figs. 6, 7, 8 (right)).
The compressive stress-strain curve related to the pure compression loading-unloading
experiment (Shot#1203) is plotted in Fig. 8 (left). The loading compressive stress-strain
curve for Shot#404 is depicted in Fig. 8 (right) and compared with the experimental
observations taken from reference [11].
The pressure dependence of the shearing resistance is approximated by a linear function
in Fig. 9 (left) [10]. The numerical simulation results shown in Fig. 9 (right)
correspond to the problem geometry and steel material taken from Shot#1201 and
Shot#404 (Table 2). The results are obtained for impact velocities ranging between
$V_0 = 112.6$ m/s and $V_0 = 183.45$ m/s, and closely follow the linear function predicted by
the experiments.
3.3 Confined Monotonic Loading
There is a lack of experimental data for uniaxial monotonic compressive loading at
strain rates in the range of about $10^5$ s$^{-1}$ and in the presence of large distortions.
Instead, we consider experimental data from reference [18] related to a confined
compressive test of polyurea subjected to a strain rate of 2600 s$^{-1}$ at 273 K. We model
confined compression for a single element of polyurea. Figure 10 shows a reasonable
agreement with the experimental data. The result has been found to be almost insensitive
to the inelastic material parameters.
Fig. 6 Shear and compressive stress versus shear strain and time, respectively, for Shot#404
Fig. 7 Transverse and normal velocity versus time for Shot#404
Fig. 8 Compressive stress versus compressive strain for pure-compression Shot#1203 (left) and
for pressure-shear Shot#404 [11] (right)
Fig. 9 Dependence of shear resistance on pressure: experimental data [2] (left) and numerical
simulations (right)
Fig. 10 Confined compressive stress-strain curves for GVBO and experimental data [18]
3.4 Validation for Polyurea/Steel Bi-Layer
In this section we consider a composite polyurea/steel plate impacted by an FSP
(fragment simulating projectile) at very high velocities (>1000 m/s). Two positions of the
polymer layer with respect to the steel have been considered: the front and back
coating, with the front side being the target.
The polyurea is modeled by the generalized viscoplasticity model based on overstress
(GVBO) outlined in the previous section. As a failure criterion, we considered
the maximum principal strain. For high-hard steels we studied MIL-A46100 steel,
which is modeled by the Johnson-Cook constitutive model.
The ballistic limit (or $V_{50}$ velocity) is estimated for a blank steel plate, a polyurea-steel
plate and a steel-polyurea plate. It has been observed that the bi-layer with a front
polyurea coating increases penetration velocity by about 15.4 %, while the plate with
a back coating raises penetration velocity by only 7.5 % for MIL-A46100 steel with
respect to the ballistic limit of the blank steel plate.
The polyurea material is modeled by the generalized viscoplasticity model based
on overstress. The calibrated material parameters are shown in Table 5. The failure
criterion for the polyurea is a maximum principal strain, and the critical damage
value is $D^{pol} = 1.4$.
Table 7 Physical properties of the steel
Steel         Young's modulus (MPa)   Poisson's ratio   Density (g/cm$^3$)   Inelastic heat fraction   Specific heat (J/kgK)   Coefficient of linear thermal expansion (1/K)
MIL-A46100    215000                  0.29              7.85                 0.9                       4.8e+08                 11.5e-6
Table 8 The Johnson-Cook parameters for the steel [19, 21]
Steel         $A$ (MPa)   $B$ (MPa)   $n$     $\dot{\varepsilon}_0$ (s$^{-1}$)   $C$     $m$    $T_0$ (K)   $T_m$ (K)
MIL-A46100    1050        250         0.12    1                                  0.02    0.5    298         1720
Material properties for the MIL-A46100 steel [19] are listed in Table 7. The inelastic
heat fraction is the Taylor-Quinney empirical coefficient that represents the proportion of
plastic work converted into heat, and the value 0.9 is considered in this study following
the reference [20].
The steel is modeled by the Johnson-Cook model with material constants taken
from references [21, 22]. The von Mises tensile flow stress is defined by
$$\sigma = \left[A + B\varepsilon^{n}\right]\left[1 + C \ln \dot{\varepsilon}^{*}\right]\left[1 - T^{*m}\right] \tag{18}$$
where $\varepsilon$ is the equivalent (or effective) plastic strain, $\dot{\varepsilon}$ and $\dot{\varepsilon}_0$ are the current and
reference strain rates; $\dot{\varepsilon}^{*} = \dot{\varepsilon}/\dot{\varepsilon}_0$ is the dimensionless plastic strain rate for
$\dot{\varepsilon}_0 = 1.0$ s$^{-1}$. The homologous temperature $T^{*}$ is defined as
$$T^{*} = \frac{T - T_0}{T_m - T_0} \tag{19}$$
where $T_m$ and $T_0$ are the melting and room temperatures, respectively. The Johnson-Cook
parameters for the steel are summarized in Table 8.
The Johnson-Cook model defines the damage variable as
$$D = \sum \frac{\Delta\varepsilon}{\varepsilon_f} \tag{20}$$
where $\Delta\varepsilon$ is an increment of equivalent plastic strain, and $\varepsilon_f$ is the equivalent strain
to fracture determined as
$$\varepsilon_f = \left[D_1 + D_2 \exp\left(D_3 \sigma^{*}\right)\right]\left[1 + D_4 \ln \dot{\varepsilon}^{*}\right]\left[1 + D_5 T^{*}\right] \tag{21}$$
where $\sigma^{*}$ is the stress triaxiality (the ratio of the mean stress to the equivalent stress).
For the Johnson-Cook model, fracture is assumed to occur when $D = D^{steel} = 1$.
The Johnson-Cook failure parameters for the MIL-A46100 steel are not available in
the open literature and have been assumed to be the same as for the 4340 steel (see
Table 9 [22]).
Table 9 The Johnson-Cook failure parameters for the 4340 steel [22]
$D_1$   $D_2$   $D_3$    $D_4$    $D_5$
0.05    3.44    −2.12    0.002    0.61
Table 10 Ballistic limit of different plates
Steel                                   Ballistic limit                 Blank steel   Polyurea-steel   Steel-polyurea
High-hard steel, experiments [13, 14]   $V_{50}$ (m/s)                  1184 ± 5      1483 ± 7
                                        Difference w.r.t. blank steel                 23 %             8.8 %
MIL-A46100, simulation                  $V_{50}$ (m/s)                  1188          1371 ± 1         1277
                                        Difference w.r.t. blank steel                 15.4 %           7.5 %
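The Johnson-Cook relations (18) to (21) are simple enough to evaluate directly. The sketch below is a hedged illustration using the values of Tables 8 and 9, not the built-in Abaqus implementation: the function names are hypothetical, the stress triaxiality is passed in as an input, and the negative sign of $D_3$ is an assumption based on the commonly quoted 4340 steel value (the minus sign is easily lost in print).

```python
import math

# Johnson-Cook strength parameters for MIL-A46100 (Table 8).
JC = dict(A=1050.0, B=250.0, n=0.12, rate0=1.0, C=0.02, m=0.5, T0=298.0, Tm=1720.0)
# Failure parameters assumed equal to 4340 steel (Table 9); D3 sign is an assumption.
JC_FAIL = dict(D1=0.05, D2=3.44, D3=-2.12, D4=0.002, D5=0.61)

def homologous_temperature(T, T0, Tm):
    """Eq. (19); valid for T0 <= T <= Tm."""
    return (T - T0) / (Tm - T0)

def jc_flow_stress(eps_p, rate, T, p=JC):
    """Von Mises flow stress (MPa), Eq. (18); requires rate > 0 and T >= T0."""
    rate_star = rate / p["rate0"]
    t_star = homologous_temperature(T, p["T0"], p["Tm"])
    return ((p["A"] + p["B"] * eps_p ** p["n"])
            * (1.0 + p["C"] * math.log(rate_star))
            * (1.0 - t_star ** p["m"]))

def jc_fracture_strain(triaxiality, rate, T, p=JC, d=JC_FAIL):
    """Equivalent strain to fracture, Eq. (21)."""
    rate_star = rate / p["rate0"]
    t_star = homologous_temperature(T, p["T0"], p["Tm"])
    return ((d["D1"] + d["D2"] * math.exp(d["D3"] * triaxiality))
            * (1.0 + d["D4"] * math.log(rate_star))
            * (1.0 + d["D5"] * t_star))

def accumulated_damage(strain_increments, fracture_strains):
    """Damage variable D, Eq. (20); fracture is assumed when D reaches 1."""
    return sum(de / ef for de, ef in zip(strain_increments, fracture_strains))

if __name__ == "__main__":
    print("sigma =", jc_flow_stress(0.1, 1.0e4, 400.0), "MPa")
    print("eps_f =", jc_fracture_strain(1.0 / 3.0, 1.0e4, 400.0))
```

At zero plastic strain, the reference strain rate and room temperature, the flow stress reduces to the parameter $A$, which is a convenient sanity check of the three bracketed factors.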
The 3D plate is constructed from two layers: steel and polyurea of thickness 12.7
mm. The plate radius considered in the simulations is 125 mm. We consider the
impact by a fragment simulating projectile of 0.5 caliber, 12.7 mm diameter, and mass
of 12.4 g.
The impact problem is simulated using the Abaqus/Explicit solver with built-in
Johnson-Cook material and failure models and a user-defined material subroutine
(describing GVBO) for the polyurea. Only one quarter of the plate was simulated,
exploiting its symmetry. The bullet is modeled as a rigid body. The plate is assumed
to be free.
The model was meshed using 8-node hexahedral elements with reduced integration.
For stabilization in Abaqus/Explicit, linear and quadratic bulk viscosity parameters
were chosen as 0.54 and 2.4, respectively.
The ballistic limit (V50 ) is estimated in the simulation as an average velocity
between the minimum impact velocity for penetration and maximum velocity before
penetration. It is listed in Table 10 for the blank steel and coated steel plates. These
simulation results are comparable with the results obtained in the experiments [13,
14]. The difference in the ballistic limit between the polyurea-steel and the blank
steel is about 15.4 %.
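The ballistic limit estimate and the percentage differences in Table 10 follow from simple arithmetic, sketched below. The function names are hypothetical, and the bracketing velocities 1370 and 1372 m/s are illustrative assumptions chosen to be consistent with the reported value of 1371 ± 1 m/s; they are not taken from the simulations.

```python
def ballistic_limit(v_max_no_penetration, v_min_penetration):
    """V50 estimate: average of the highest non-penetrating and the lowest
    penetrating impact velocity (m/s)."""
    return 0.5 * (v_max_no_penetration + v_min_penetration)

def percent_increase(v50_coated, v50_blank):
    """Relative increase of the ballistic limit with respect to the blank plate (%)."""
    return 100.0 * (v50_coated - v50_blank) / v50_blank

if __name__ == "__main__":
    # Simulated V50 values from Table 10 (m/s).
    v50_blank, v50_front, v50_back = 1188.0, 1371.0, 1277.0
    # Hypothetical bracketing velocities for the front-coated plate.
    print("V50 (front coating):", ballistic_limit(1370.0, 1372.0))
    print("front coating: %.1f %%" % percent_increase(v50_front, v50_blank))
    print("back coating:  %.1f %%" % percent_increase(v50_back, v50_blank))
```

Evaluating the two ratios reproduces the 15.4 % and 7.5 % differences quoted for the simulations.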
Note that the enhancement obtained by the front polyurea layer is slightly smaller
than in the numerical simulation reported in [23] (where an impact problem of an FSP
penetrating polyurea-4340 steel was considered, and the polyurea was modeled using
a different viscoplasticity model [15]). In reference [23], the ballistic limits of the blank
steel plate and the polyurea-steel bi-layer were overestimated.
Figure 11 shows a snapshot of the projectile penetration into the bi-layer of
polyurea and MIL-A46100 steel at an impact velocity of $V = 1370$ m/s. The figure
compares the relative position of the projectile in the polyurea layer in the front and back
coating configurations. The results suggest a delay in the projectile penetration in
the case of the front polyurea coating in comparison to the back coating.
The evolution of the projectile velocity in time for both composite plate configurations
is shown in Fig. 12. It can be seen that the decrease in the projectile velocity
is more pronounced in the case of the front polyurea coating of the steel.
Fig. 11 The FSP impact on polyurea-MIL-A46100 steel plate (top) and MIL-A46100 steel-polyurea
plate (bottom) at impact velocity of 1370 m/s at time $1 \times 10^{-4}$ s
Fig. 12 FSP velocity for polyurea-steel and steel-polyurea plates at impact velocity of 1370 m/s
Fig. 13 Micromechanical model of polyurethane consisting of hard domains (HD) and soft domains (SD)
with a low HD content [25]
4 Multiscale Modeling of Polymer
4.1 Block Copolymers
The microstructure of polyurea is composed of hard domains (HD) and soft domains
(SD) forming a two-phase microstructure as shown in Fig. 13. Hard segments have
a high glass-transition temperature Tg and soft segments have a low Tg. The
soft segment has its glass transition below the normal operating temperature and is,
therefore, rubbery. The hard segment has its glass transition or its melting
temperature above the ordinary operating temperature and is, therefore, glassy and/or
crystalline. It is well known that the microphase separation of hard and soft domains
is responsible for the versatile properties of this broad class of polymers [24, 25].
Recent studies [26] have shown that Tg of the soft domain is on average 80 °C higher
at the free surface than in the interior and 60 °C higher than at the circumference.
For the low strain rate tensile specimens, Tg increases with strain and reaches a
maximum value at a strain of 3.6. These increases in the glass transition
temperature are believed to be due to mixing of the hard and soft segments, but the
precise mechanism is not well understood and cannot be investigated without performing
a micromechanical analysis of the polymer.
Numerous experimental studies (see for example [27]) have suggested a significant
shift in the glass transition temperature Tg with strain rate. The precise effect of
strain rate on the hard and soft domains is not known; only its effect on the overall
behavior of the polymer is. We will identify the rate-dependent properties of the SD
and HD using an inverse method.
4.2 Multiscale Model of Copolymers
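Consider the following governing equations of a block copolymer over the composite
domain $\Omega^\zeta$

$$
\begin{aligned}
\sigma^\zeta_{ij,j} + b^\zeta_i &= \rho^\zeta \ddot{u}^\zeta_i && \text{on } \Omega^\zeta \\
\Delta\sigma^\zeta_{ij} &= L^\zeta_{ijkl}\left(\Delta\varepsilon^\zeta_{kl} - \Delta\mu^\zeta_{kl}\right) && \text{on } \Omega^\zeta \\
\Delta\varepsilon^\zeta_{ij} &= \Delta u^\zeta_{(i,j)} \equiv \tfrac{1}{2}\left(\Delta u^\zeta_{i,j} + \Delta u^\zeta_{j,i}\right) && \text{on } \Omega^\zeta \\
u^\zeta_i &= \bar{u}^\zeta_i && \text{on } \partial\Omega^{u\zeta} \\
\sigma^\zeta_{ij}\, n^\zeta_j &= \bar{t}^\zeta_i && \text{on } \partial\Omega^{t\zeta}
\end{aligned}
\tag{22}
$$

where $u^\zeta$ is the displacement; $\rho^\zeta$ is the density; $\Delta u^\zeta_{i,j}$ denotes the derivative of
the displacement increment with respect to the midstep; $\Delta u^\zeta_{(i,j)}$ denotes the symmetric
gradient; $\Delta\varepsilon^\zeta_{kl} = \Delta u^\zeta_{(k,l)}$ is an integral of the rate of deformation over the time step;
$\Delta\mu^\zeta$ is an eigenstrain; and $\sigma^\zeta_{ij}$ is the Cauchy stress. The microscopic coordinate system
is $y = x/\zeta$, and it is considered to be independent of the macroscopic coordinate
system, $x$, when $\zeta \ll 1$. The spatial derivative rule is then
$(\cdot)_{,i} = (\cdot)_{,x_i} + (\cdot)_{,y_i}/\zeta$.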
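For a viscoplastic material considered in Sect. 2 undergoing large strains, it is
convenient to decompose the eigenstrain increment as follows

$$
\begin{aligned}
\Delta\mu_{kl} &= \Delta\mu^{in}_{kl} + \Delta\mu^{el}_{kl} \\
\Delta\mu^{el}_{kl} &= \left[ I_{klij} - L^{-1}_{klmn}(E_0, G_0)\, L_{mnij}(E, G) \right]\left(\Delta\varepsilon_{ij} - \Delta\mu^{in}_{ij}\right)
\end{aligned}
\tag{23}
$$

where $\Delta\mu^{el}_{kl}$ is the contribution to the eigenstrain resulting from deformation-dependent
elastic properties (16), and $\Delta\mu^{in}_{kl}$ is the usual inelastic strain increment.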
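As a consistency check on the decomposition (23), one can substitute it into the incremental constitutive relation of (22). The short derivation below is only a sketch: it assumes that the stiffness appearing in (22) is the initial stiffness $L_{ijkl}(E_0, G_0)$, which the text does not state explicitly, and it drops the superscript $\zeta$ for brevity.

```latex
\begin{aligned}
\Delta\sigma_{ij}
 &= L_{ijkl}(E_0,G_0)\left(\Delta\varepsilon_{kl}-\Delta\mu^{in}_{kl}-\Delta\mu^{el}_{kl}\right) \\
 &= L_{ijkl}(E_0,G_0)\left(\Delta\varepsilon_{kl}-\Delta\mu^{in}_{kl}\right)
   -\left[L_{ijkl}(E_0,G_0)-L_{ijkl}(E,G)\right]\left(\Delta\varepsilon_{kl}-\Delta\mu^{in}_{kl}\right) \\
 &= L_{ijkl}(E,G)\left(\Delta\varepsilon_{kl}-\Delta\mu^{in}_{kl}\right)
\end{aligned}
```

Under that assumption, the elastic part of the eigenstrain makes the stress update behave as if the current moduli $E$, $G$ were used, while the stiffness kept in the governing equations stays constant.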
The displacement field is expanded asymptotically as

$$ u^\zeta_i(x, y, t) = u^c_i(x, t) + \zeta\, u^{(1)}_i(x, y, t) + O(\zeta^2) \tag{24} $$

where $u^c$ and $u^{(1)}$ are coarse and fine-scale displacements, respectively.

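The first order strain field is given by

$$
\begin{aligned}
\varepsilon_{ij}(x, y, t) &= \varepsilon^c_{ij}(x, t) + u^{(1)}_{(i,y_j)}(x, y, t) + O(\zeta) \\
\varepsilon^c_{ij}(x, t) &= u^c_{(i,x_j)}(x, t) = \frac{1}{|\Theta|}\int_\Theta \varepsilon_{ij}(x, y, t)\, d\Theta
\end{aligned}
\tag{25}
$$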
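The step from the expansion (24) to the strain field (25) can be spelled out as follows. This is a sketch that uses the two-scale derivative rule stated after (22), writes $\zeta$ for the scale ratio and $\Theta$ for the unit cell domain, and assumes that $u^{(1)}$ is y-periodic so that its symmetric y-gradient has zero average over the unit cell.

```latex
\begin{aligned}
u^\zeta_{(i,j)}
 &= u^c_{(i,x_j)} + \zeta\left(u^{(1)}_{(i,x_j)} + \frac{1}{\zeta}\,u^{(1)}_{(i,y_j)}\right) + O(\zeta) \\
 &= u^c_{(i,x_j)} + u^{(1)}_{(i,y_j)} + O(\zeta), \\
\frac{1}{|\Theta|}\int_\Theta u^{(1)}_{(i,y_j)}\,d\Theta = 0
 \quad &\Longrightarrow \quad
 \frac{1}{|\Theta|}\int_\Theta \varepsilon_{ij}\,d\Theta = \varepsilon^c_{ij} + O(\zeta)
\end{aligned}
```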

The fine-scale displacement is assumed to be decomposed as [28–32]

$$ u^{(1)}_i(x, y, t) = H^{kl}_i(y, t)\,\varepsilon^c_{kl}(x, t) + \int_\Theta g^{kl}_i(y, \tilde{y}, t)\,\mu_{kl}(x, \tilde{y}, t)\, d\tilde{\Theta} \tag{26} $$

where $H^{kl}_i$, $g^{kl}_i$ are y-periodic instantaneous influence functions for elastic and
inelastic deformation, respectively. Note that for large strain problems, the unit cell
domain may evolve and therefore the influence functions may change from one increment
to another. In the present manuscript we approximate the influence functions by their
initial values

$$
\begin{aligned}
H^{kl}_i(y) &\equiv H^{kl}_i(y, t) = H^{kl}_i(y, 0) \\
g^{kl}_i(y, \tilde{y}) &\equiv g^{kl}_i(y, \tilde{y}, t) = g^{kl}_i(y, \tilde{y}, 0)
\end{aligned}
\tag{27}
$$
In the model reduction approach [28–33] the eigenstrain field is discretized over
volume partitions as follows

$$ \mu_{ij}(x, y, t) = \sum_{\alpha=1}^{n} N^{(\alpha)}(y)\,\mu^{c(\alpha)}_{ij}(x, t) \tag{28} $$
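where $n$ is the number of partitions or phases. The shape function $N^{(\alpha)}$ is either
continuous (for compatible eigenstrains, see [33]) or piecewise constant as follows

$$
N^{(\alpha)}(y) = \begin{cases} 1 & y \in \Theta^{(\alpha)} \\ 0 & y \notin \Theta^{(\alpha)} \end{cases};
\qquad \bigcup_{\alpha=1}^{n} \Theta^{(\alpha)} = \Theta;
\qquad \bigcap_{\alpha=1}^{n} \Theta^{(\alpha)} = \emptyset
\tag{29}
$$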
and

$$ \mu^{c(\alpha)}_{ij}(x, t) = \frac{1}{|\Theta^{(\alpha)}|}\int_{\Theta^{(\alpha)}} \mu_{ij}(x, y, t)\, d\Theta \tag{30} $$

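Inserting (27) and (28) into (26) and introducing
$S^{kl(\alpha)}_i(y) = \int_\Theta g^{kl}_i(y, \tilde{y})\, N^{(\alpha)}(\tilde{y})\, d\tilde{\Theta}$
yields

$$ u^{(1)}_i(x, y, t) = H^{kl}_i(y)\,\varepsilon^c_{kl}(x, t) + \sum_{\alpha=1}^{n} S^{kl(\alpha)}_i(y)\,\mu^{c(\alpha)}_{kl}(x, t) \tag{31} $$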
Inserting (31) into (25) gives

$$ \varepsilon_{kl}(x, y, t) = \left[ I_{klmn} + H^{mn}_{(k,y_l)}(y) \right]\varepsilon^c_{mn}(x, t) + \sum_{\alpha=1}^{n} S^{mn(\alpha)}_{(k,y_l)}(y)\,\mu^{c(\alpha)}_{mn}(x, t) \tag{32} $$
Given the rate of deformation increment and the previous converged value of the Cauchy
stress in each phase of the unit cell, the stress can be updated using the GVBO model
outlined in the previous section.
Finally, the coarse-scale Cauchy stress is obtained by averaging the fine-scale stresses

$$ \sigma^c_{ij} = \frac{1}{|\Theta|}\int_\Theta \sigma_{ij}\, d\Theta \tag{33} $$

for the coarse-scale equilibrium equation

$$ \sigma^c_{ij,j} + b^c_i = \rho^c \ddot{u}^c_i \tag{34} $$

where $b^c$ and $\rho^c$ are averages of the fine-scale body force $b$ and density $\rho$, respectively.
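If the stress is taken as constant within each partition of the unit cell (an assumption made here only for illustration), the average (33) reduces to a sum of phase stresses weighted by volume fractions. The Python sketch below shows that arithmetic; the function name and the stress values are hypothetical, and only the 19 % particle volume fraction is taken from Sect. 5.

```python
def coarse_scale_stress(phase_stresses, volume_fractions):
    """Volume average of phase stresses given as 6-component Voigt lists."""
    if abs(sum(volume_fractions) - 1.0) > 1e-12:
        raise ValueError("volume fractions must sum to one")
    n_comp = len(phase_stresses[0])
    return [
        sum(f * s[k] for f, s in zip(volume_fractions, phase_stresses))
        for k in range(n_comp)
    ]


# Hypothetical phase stresses in MPa (Voigt order: 11, 22, 33, 23, 13, 12).
soft = [-10.0, -10.0, -12.0, 0.0, 0.0, 1.0]
hard = [-40.0, -35.0, -60.0, 0.0, 0.0, 5.0]
sigma_c = coarse_scale_stress([soft, hard], [0.81, 0.19])
```

In the actual model the phase stresses would come from the GVBO update in each partition rather than from fixed numbers.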
4.3 Dispersion Contribution
In dynamic multiscale problems there is an effect of micro-inertia, which arises due
to material heterogeneity. We account for the micro-inertia effect in the coarse-scale
problem by modifying the coarse-scale stress (for more details see [17])

$$ \hat{\sigma}^c_{ij} = \sigma^c_{ij} + \zeta^2 D_{ijkl}\,\ddot{\varepsilon}^c_{kl} \tag{35} $$

The coarse-scale problem (34) is then redefined as

$$
\begin{aligned}
\hat{\sigma}^c_{ij,j} + b^c_i &= \rho^c \ddot{u}^c_i && \text{on } \Omega \\
\hat{\sigma}^c_{ij}\, n^c_j &= \bar{t}^c_i && \text{on } \partial\Omega^t
\end{aligned}
\tag{36}
$$

where $D_{ijkl}$ is a dispersion coefficient that depends on the material impedance, unit
cell size and overall density [17].
To compute the dispersion coefficient it is convenient to decompose it into linear
and nonlinear contributions

$$ D_{ijkl} = D^{lin}_{ijkl} + D^{nonlin}_{ijkl} \tag{37} $$
The linear term is defined as

1
Dilin
ij
jkl
c
h s h kl
s d (38)
||

where h ikl is y-periodic solution of the following problem

Fig. 14 Microstructure of the unit cell with particles preferentially oriented at 45°
with respect to the vertical axis
(y)
(i,y j ) (y) = H(i,y j ) (y) + c Ii jkl
h kl kl
ij (39)
h s d = 0; (y) = c (y)

The nonlinear term is defined as
n () () () ()
st () pq () pq
n n n
() pq
Dinonlin R + Q + Q
jkl icj st pq kl
c i j pq c klpq c
=1 =1 =1 kl =1 ij

() 1 st() pq() () 1 i j pq()
Rst pq = c Sr Sr d; Q i j pq = c h r Sr d (40)
|| ||

Note that the dispersion effect becomes significant when (i) the unit cell size is
large, (ii) the impedance mismatch between the phases is considerable (i.e. there is a
large difference in elastic moduli or densities between phases) and (iii) the spatial
gradient of the acceleration is high.
5 Verification of the Multiscale Model
For modeling the polyurea microstructure, we consider an anisotropic unit cell with
ellipsoidal particles (volume fraction 19 %) shown in Fig. 14. The particles are oriented
at a preferential angle of 45° with respect to the loading direction.
The soft domains are modeled by the GVBO model while the hard domains are kept
elastic, with a considerably higher elastic modulus than the initial modulus of the soft
domains. Material parameters of the phases (Table 11) are calibrated to the
experimental data of a normal dynamic impact test [10] and a confined test under monotonic
compressive loading [18]. The results are shown in Figs. 15 and 16. The two-scale
model of polyurea is implemented in Abaqus/Explicit with the MDS plugin
(http://multiscale.biz).
Table 11 GVBO parameters for polyurea in two-scale analysis
Soft domains
  E0 (MPa)   ν0      Et (MPa)   p    a1 (MPa)   a2 (MPa)   a3 (1/MPa)
  156        0.494   23.3       6    0          10         0
  k1   k2 (MPa)   k3 (MPa)   k4   Em (MPa)   A0 (MPa)   Ac (1/s)   Af (MPa)
  17   0.47       1.47       4    13600      1          0          0
Hard domains
  E0 (MPa)   ν0
  890        0.491
Fig. 15 Compressive stress and normal velocity versus time. Comparison of multiscale simulation
with calibrated parameters and experimental data, Shot#1203 [10]
Fig. 16 Confined compressive stress-strain curves. Comparison of multiscale simulation with
calibrated parameters and experimental data [18]
The multiscale model of polyurea is studied for impact problems on a polyurea/steel
bi-layer and on a single polyurea layer. In these studies we have not considered material
failure, and thus only relatively low impact velocities have been analyzed.
The evolution of the projectile velocity is shown in Figs. 17 and 19 for the bi-layer and
the single layer, respectively. The response of the heterogeneous anisotropic material
(Table 11) is compared with the homogeneous single-scale GVBO model (Table 5).
[Plot panels: Polyurea-Steel Plate, V0 = 115 m/s (left) and Steel-Polyurea Plate,
V0 = 115 m/s (right); projectile velocity (m/s) versus time (s); curves: heterogeneous
anisotropic and homogeneous isotropic]
Fig. 17 Evolution of the FSP velocity after impact on the bi-layers at an initial velocity of
115 m/s. Comparison of the multiscale simulation with anisotropic heterogeneous material and a
single-scale model with isotropic homogeneous material properties
Fig. 18 Shear stress in the polyurea-steel bi-layer at initial velocity of 115 m/s at time 5.e-05 s.
Comparison of multiscale simulation with anisotropic heterogeneous material (top) and a
single-scale model with isotropic homogeneous material properties (bottom)
It can be seen that a heterogeneous anisotropic polymer layer, alone or placed on
top of the bi-layer (Fig. 17, left), decelerates the projectile considerably more than a
homogeneous polymer. In other words, the energy is dissipated much faster
and much more effectively by the heterogeneous anisotropic material. This is due to
shear wave propagation, as opposed to pressure wave propagation in the case of an
isotropic polymer. When the polymer is placed on the bottom of the steel there is no
difference between the homogeneous and heterogeneous polymer (Fig. 17, right).
The snapshots showing the shear stress distribution in the polyurea layer
placed on top of the bi-layer and in the polyurea plate alone are depicted in
Figs. 18 and 20, respectively. It can be seen that, due to the preferential orientation of
the hard domains, more energy is dissipated than in a homogeneous polyurea.
Fig. 19 Evolution of the FSP velocity after impact on the single polyurea layer with initial
velocity of 167 m/s. Comparison of the multiscale simulation with anisotropic heterogeneous
material and a single-scale model with isotropic homogeneous material properties
Fig. 20 Shear stress in the polyurea layer model at impact velocity of 167 m/s at time 1.6e-04 s.
Comparison of multiscale simulation with anisotropic heterogeneous material (top) and a
single-scale model with isotropic homogeneous material properties (bottom)
To study the dispersion effect we consider an impact onto the polymer plate with an
initial velocity of 300 m/s. For demonstration purposes we consider a relatively large
unit cell (about 1 mm) and a high ratio of phase densities (the hard domain is
ten times denser than the soft domain). Failure of the polymer material is not included
here. The micro-inertia effect is implemented in the Abaqus/Explicit solver by adding
the dispersion term to the overall stress (35). The simulation results depicted in Figs. 21
and 22 demonstrate that dispersion indeed contributes to the decrease in the projectile
velocity.
Fig. 21 Evolution of the FSP velocity after impact on the single polyurea layer at initial velocity
300 m/s. Comparison of the multiscale simulation for anisotropic heterogeneous material with and
without dispersion
Fig. 22 Shear stress distribution in a polyurea layer impacted by a projectile at initial velocity of
300 m/s at time 1.47e-04 s. Comparison of the multiscale simulation with anisotropic heterogeneous
material with (top) and without (bottom) consideration of dispersion
6 Summary
In this study we presented a new constitutive model of polyurea, the generalized
viscoplasticity based on overstress (GVBO), for finite deformations at very high
strain rates ($10^5$ s$^{-1}$). The model parameters were identified against experimental
results, and good predictability of the polyurea shear resistance at various pressure
levels was observed.
The GVBO polyurea model was used to simulate the penetration of FSP into
polyurea/steel bi-layers for high hard steel at high impact velocities. The simulation
results qualitatively confirmed experimental results suggesting the front position of
the polyurea layer to be superior to the back polyurea layer. However, the projec-
tile velocities observed in the experiments were higher than those predicted in the
simulations.
Finally, we demonstrated that energy can be dissipated by means of polymer
anisotropy and polymer heterogeneity. We showed that it is possible to engineer
such a polymer microstructure that will dissipate energy by means of shear waves
as opposed to pressure waves that cannot dissipate energy. Furthermore, we devel-
oped a dispersive multiscale model that accounts for energy absorption by means of
dispersion within material microstructure.
Acknowledgments The support of ONR grant N00014-12-1-0558 is gratefully acknowledged.
References
1. Roland CM, Casalini R (2007) Effect of hydrostatic pressure on the viscoelastic response of
polyurea. Polymer 48:5747–5752
2. Green MS, Tobolsky AV (1946) A new approach to the theory of relaxing polymeric media. J
Chem Phys 14(2):80–92
3. Johnson AR, Quigley J, Freese CE (1995) A viscohyperelastic finite element model for rubber.
Comput Methods Appl Mech Eng 127:163–180
4. Roland CM (1989) Network recovery from uniaxial extension: I. Elastic equilibrium. Rubb
Chem Technol 62:863–879
5. Bergstrom JS, Boyce MC (1998) Constitutive modeling of the large strain time-dependent
behavior of elastomers. J Mech Phys Solids 46(5):931–954
6. Jiao T, Clifton RJ, Grunschel SE (2009) Pressure-sensitivity and constitutive modeling of an
elastomer at high strain rates. AIP Conf Proc 1195:1229–1232
7. Colak OU (2004) Modeling of large simple shear using a viscoplastic overstress model and
classical plasticity model with different objective stress rates. Acta Mechanica 167:171–187
8. Gomaa S, Sham T-L, Krempl E (2004) Finite element formulation for finite deformation,
isotropic viscoplasticity theory based on overstress (FVBO). Int J Solids Struct 41:3607–3624
9. Grujicic M, He T, Pandurangan B et al (2012) Experimental characterization and
material-model development for microphase-segregated polyurea: an overview. J Mater Eng Perform
21:2–16
10. Clifton RJ, Jiao T (2012) Resistance of elastomers to shearing and failure at extreme loading
conditions. ONR-Workshop
11. Jiao T, Clifton RJ, Grunschel SE (2006) High strain rate response of an elastomer. AIP Conf
Proc 845:809–812
12. Jiao T, Clifton RJ (2013) Measurement of the response of an elastomer at pressures up to 9 GPa
and strain rates of $10^5$–$10^6$ s$^{-1}$. In: 18th Biennial international conference of the APS topical
group on shock compression of condensed matter, Washington, July 2013
13. Roland CM, Fragiadakis D, Gamache RM et al (2012) Factors influencing the ballistic impact
resistance of elastomer-coated metal substrates. Philos Mag 93(5):468–477
14. Roland CM, Fragiadakis D, Gamache RM (2010) Elastomer-steel laminate armor. Compos
Struct 92:1059–1064
15. Roland CM, Twigg JN, Vu Y et al (2007) High strain rate mechanical behavior of polyurea.
Polymer 48:574–578
16. Amini MR, Isaacs J, Nemat-Nasser S (2010) Investigation of effect of polyurea on response of
steel plates to impulsive loads in direct pressure-pulse experiments. Mech Mater 42:628–639
17. Fish J, Filonova V, Kuznetsov S (2012) Micro inertia effects in nonlinear heterogeneous media.
Int J Numer Meth Eng 91(13):1406–1426
18. Amirkhizi AV, Isaacs J, McGee J et al (2006) An experimentally-based viscoelastic constitutive
model for polyurea, including pressure and temperature effects. Philos Mag 86(36):5847–5866
19. Grujicic M, Ramaswami S, Snipes JS et al (2013) Multiphysics modeling and simulations of
Mil A46100 armor-grade martensitic steel gas metal arc welding process. J Mater Eng Perform
22:2950–5969
20. Borvik T, Hopperstad OS, Dey S et al (2005) Strength and ductility of Weldox 460 E steel
at high strain rates, elevated temperatures and various stress triaxialities. Eng Fract Mech
72(7):1071–1087
21. Johnson GR, Cook WH (1983) A constitutive model and data for metals subjected to large
strains, high strain rates and high temperatures. In: Proceedings of 7th international symposium
on ballistics, The Hague, 1983
22. Johnson GR, Cook WH (1985) Fracture characteristics of 3 metals subjected to various strains,
strain rates, temperatures and pressures. Eng Fract Mech 21(1):31–48
23. Irshidat M, Al-Ostaz A, Cheng AH-D (2011) Predicting the response of polyurea coated high
hard steel plates to ballistic impact by fragment simulating projectiles. Ole Miss Project 90031.
http://www.serri.org/publications/Documents. Accessed 12 May 2011
24. Yi J, Boyce MC, Lee GF et al (2006) Large deformation rate-dependent stress-strain behavior
of polyurea and polyurethanes. Polymer 47:319–329
25. Qi HJ, Boyce MC (2005) Stress-strain behavior of thermoplastic polyurethanes. Mech Mater
37:817–839
26. Lee G, Mock W, Fedderly J et al (2007) The effect of mechanical deformation on the glass
transition temperature of polyurea. AIP Conf Proc 955:711–714
27. Sharma A, Shukla A, Prosser RA (2002) Mechanical characterization of soft materials using
high speed photography and split Hopkinson pressure bar technique. J Mater Sci
37(5):1005–1017
28. Oskay C, Fish J (2008) Fatigue life prediction using 2-scale temporal asymptotic
homogenization. Comput Mech 42(2):181–195
29. Oskay C, Fish J (2007) Eigendeformation-based reduced order homogenization. Comput Methods
Appl Mech Eng 196:1216–1243
30. Yuan Z, Fish J (2009) Multiple scale eigendeformation-based reduced order homogenization.
Comput Methods Appl Mech Eng 198(21–26):2016–2038
31. Yuan Z, Fish J (2009) Hierarchical model reduction at multiple scales. Int J Numer Meth Eng
79:314–339
32. Fish J, Yuan Z (2008) N-scale model reduction theory. In: Fish J (ed) Multiscale methods:
bridging the scales in science and engineering. Oxford University Press, New York
33. Fish J, Filonova V, Yuan Z (2013) Hybrid impotent-incompatible eigenstrain based
homogenization. Int J Numer Meth Eng 95(1):1–32
Numerical Simulation of Double Cup Extrusion
Test Using the Arbitrary Lagrangian Eulerian
Formalism
Romain Boman, Roxane Koeune and Jean-Philippe Ponthot
R. Boman · R. Koeune · J.-P. Ponthot
Department of Aerospace and Mechanical Engineering, University of Liège, 1 chemin des
Chevreuils, 4000 Liège, Belgium
Abstract In this chapter the Double Cup Extrusion Test (DCET) is modelled using the
finite element method with the help of the Arbitrary Lagrangian Eulerian (ALE)
formalism. DCET is a tribological test involving very large deformations, which
are traditionally handled with complicated and costly remeshing algorithms. Since the
topology of ALE meshes should remain constant throughout the simulation, two
very thin layers of auxiliary elements are added to the initial mesh of the billet where
the material is expected to flow. This numerical trick is combined with an original
and efficient node relocation procedure which allows the model to take into account
complex punch geometries. The presented model is first validated for limited
punch strokes against a purely Lagrangian simulation. It is then compared with
results from the literature. Finally, the general nature and the effectiveness of this
numerical strategy are demonstrated by a fully-coupled thermomechanical simulation
of thixoforming in which the final shape of the billet is compared to experimental
measurements.
1 Introduction
The Double Cup Extrusion Test (DCET) is a tribological test dedicated to forging
operations. Before the conception of DCET, one of the easiest ways to quantify
friction for this type of process was the ring compression test (see Male and
Cockcroft [17]), which consists in crushing a flat ring until a prescribed thickness
is obtained, as depicted in Fig. 1. If the contacts are well lubricated (Fig. 1a), the
material flows outwards and the inner radius of the ring increases. If the friction
becomes higher (Fig. 1b), this radial motion is slowed down. A smaller radius is then
obtained (r2 < r1). Consequently, the final inner radius can be used as an indirect
measure of friction. However, this simple tribological test reproduces rather poorly
the real contact conditions and the very high deformations that can be observed at
the interfaces between the material and the tools in real forging operations. Indeed,
according to Bay [2], it is common to reach pressures close to 2.5 GPa, surface
temperatures higher than 600 °C and local surface elongation up to 3000 %. DCET
was conceived by Geiger [12] in order to measure friction and to test lubricants in
tribological conditions that are closer to these values.
Fig. 1 Schematic description of the ring compression test which may be used to quantify friction
in forging operations [25]. a Low friction (good lubrication), inner radius r1. b High friction
(bad lubrication), inner radius r2
Although DCET is more elaborate than the ring compression test, the
measured friction can still be deduced from very simple geometrical quantities. The
experimental setup can be described as follows (see Fig. 2a): a cylindrical billet is
placed in a hollow container of the same diameter between two punches. During the test,
the lower punch does not move while the upper punch moves down and compresses the
specimen. The material is therefore forced to flow along both punches in such a way
that two cups are gradually formed. If the contacts were perfectly lubricated, i.e. in the
frictionless case, the material would flow symmetrically upwards and downwards.
The H-shaped section of the forged billet would have two branches with the same
height (h1 = h2).
In practice, friction is unavoidable and induces an asymmetry in the process.
The final section obtained looks like the one represented in Fig. 2b. The upper cup
(height h1) is always higher than the lower one (height h2). The friction can then
be quantified by the cup height ratio h1/h2. The higher this ratio, the higher the
friction during the test.
A first direct application of this tribological test is the classification of lubricants
according to their respective efficiency in forging conditions. For example, Gariety
et al. [11] have compared four lubricants using DCET. They have
also studied the possibility of jamming by visualising the grooves on the free surfaces
of the billet after the test.
Fig. 2 a Principle of the double cup extrusion test (from [23]); labels: moving upper punch,
cylindrical billet, fixed wall, fixed lower punch, cup heights h1 and h2. b Picture of a deformed
billet after DCET (from Gariety et al. [11])
A second interesting application is the numerical estimation of a friction
coefficient with the help of the finite element method. In the case of forging, Tresca's
law is usually chosen to model friction:

$$ \tau = m\,\tau_{\max} \tag{1} $$

where $\tau$ is the friction shear stress at the contact interface, $m$ is the friction coefficient
and $\tau_{\max}$ is the shear yield stress of the material. A series of numerical simulations
of DCET can be performed using a range of friction values $m$, and the corresponding
curves of cup height ratio h1/h2 versus upper punch displacement can be plotted.
This set of calibration curves and the experimental measurement are then used to
identify a mean coefficient $m$ for the process [7, 9, 26]. This friction value might be
used later, with great care, in more complex numerical simulations of forging which
would involve the same material and the same lubricant. The finite element models of
the previously cited authors all used the commercial code DEFORM-2D [24],
which conveniently provides an automatic remeshing procedure for quadrangular
meshes.
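The identification procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the calibration numbers are invented rather than taken from any simulation, linear interpolation is used throughout, and the cup height ratio is assumed to increase monotonically with m at a given punch stroke.

```python
def interpolate(x, points):
    """Piecewise-linear interpolation through (x, y) points sorted by x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("value outside the calibrated range")


def identify_friction_factor(calibration, stroke, measured_ratio):
    """Return the friction factor m whose curve passes through the measurement.

    calibration maps each simulated m to a list of
    (punch stroke, cup height ratio h1/h2) points sorted by stroke.
    """
    # Cup height ratio predicted by each simulation at the measured stroke.
    ratio_to_m = sorted(
        (interpolate(stroke, curve), m) for m, curve in calibration.items()
    )
    # Invert the relation ratio(m) at that stroke.
    return interpolate(measured_ratio, ratio_to_m)


# Invented calibration curves: stroke in mm, ratio dimensionless.
calibration = {
    0.0: [(0.0, 1.0), (10.0, 1.0)],
    0.1: [(0.0, 1.0), (10.0, 2.0)],
    0.2: [(0.0, 1.0), (10.0, 3.0)],
}
m_identified = identify_friction_factor(calibration, stroke=5.0, measured_ratio=1.75)
```

With these invented curves the measurement falls halfway between the curves for m = 0.1 and m = 0.2. A real identification would use one finite element simulation per value of m.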
It is important to note that the relevance of DCET for evaluating friction in forging
may be somewhat questionable. In fact, the material flow is mostly influenced by
the friction between the billet and the wall of the container. The friction between the
punch and the material, which is more representative of a forging operation, plays
a less significant role in the asymmetry of the final shape of the billet. Moreover,
some authors, such as Schrader et al. [23], think that the pressures exerted by the
billet on the container are not high enough to justify Tresca's law in the finite element
models; Coulomb's law would be more appropriate. Nevertheless, despite these
issues, modelling this tribological test is still very interesting from a numerical point
of view.
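The difference between the two friction laws can be illustrated numerically. The sketch below assumes the usual forms, τ = m τ_max for Tresca's law as in Eq. (1) and τ = μ p for Coulomb's law (the latter is not written out in this chapter), with a cap at τ_max that is an additional assumption. All numerical values are hypothetical.

```python
def tresca_shear(m, tau_max):
    """Friction shear stress of Tresca's law: independent of contact pressure."""
    return m * tau_max


def coulomb_shear(mu, pressure, tau_max):
    """Friction shear stress of Coulomb's law, capped at the shear yield stress."""
    return min(mu * pressure, tau_max)


tau_max = 150.0  # MPa, hypothetical shear yield stress
for pressure in (50.0, 500.0, 2500.0):  # MPa, hypothetical contact pressures
    print(pressure, tresca_shear(0.1, tau_max), coulomb_shear(0.1, pressure, tau_max))
```

At low contact pressure the Coulomb shear stress stays below the pressure-independent Tresca value, which is why the choice of law matters where the contact pressure is moderate.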
This chapter is organised in the following way: after a brief review of the ALE
formalism, a simplified extrusion model is presented in order to explain the numerical
trick that will be used to keep the topology of the ALE mesh constant. Then, this
technique is extended to the case of extrusion with curved punches. Next, the model
is validated for small punch strokes by comparison with a classical Lagrangian model
and a simplistic ALE model using mesh smoothing. For larger punch strokes, the
model is compared to results from the literature obtained with a complete remeshing
strategy. Finally, a fully-coupled thermomechanical problem of semi-solid forming
is described and the final predicted shape of the billet is compared to experimental
observations.
This work has been done with Metafor (Ponthot [22]), an in-house implicit finite
element code developed at the University of Liège in Belgium.
2 Overview of the ALE Formalism
In the ALE formalism, unlike in the Lagrangian case which is commonly used in Solid
Mechanics, the mesh no longer follows the material motion. Consequently, a new
grid coordinate system R is defined and the conservation laws and the constitutive
equations are rewritten in terms of the new coordinates [3, 4, 8, 22]:
Mass:

$$ \frac{\partial \rho}{\partial t} + c \cdot \nabla\rho + \rho\,\nabla\cdot v = 0 \tag{2} $$

Momentum:

$$ \rho\left(\frac{\partial v}{\partial t} + (c\cdot\nabla)\,v\right) = \nabla\cdot\sigma + \rho\,b \tag{3} $$

Energy:

$$ \rho\left(\frac{\partial u}{\partial t} + c\cdot\nabla u\right) = \sigma : D + \rho\,r + \nabla\cdot q \tag{4} $$

Material:

$$ \frac{\partial \sigma}{\partial t} + (c\cdot\nabla)\,\sigma = H : D + W\cdot\sigma - \sigma\cdot W \tag{5} $$

where $\rho$ is the mass density, $\sigma$ is the Cauchy stress tensor, $b$ and $r$ are the specific
body forces and heat sources, $u$ is the specific internal energy, $D$ and $W$ are the
symmetric and antisymmetric parts of the velocity gradient tensor, $q$ is the heat flux,
and $H$ is a material tensor depending on the constitutive parameters, the stresses,
and the loading history. The last two terms in Eq. (5) result from the particular choice
of Jaumann's objective time derivative.
The convective velocity $c = v - v^*$ is the difference between the material velocity
$v$ and the mesh velocity $v^*$. In the case of nonlinear problems, such as metal forming
simulations, $v^*$ should ideally depend on the solution. It is thus an additional unknown
of the latter system of equations.
In order to simplify the solution procedure and remain competitive against
Lagrangian models, the set of ALE equations is usually solved using an
operator-split procedure. Each time increment, from time $t$ to $t + \Delta t$, is divided into two
successive steps. The first one is performed exactly in the same way as in the
classical Lagrangian case. During this Lagrangian step the mesh follows the material
motion ($v^* = v$, $c = 0$) until an equilibrated configuration is obtained. The
second step, also called the Eulerian step, is divided into two substeps: the definition
of an appropriate mesh velocity $v^*$ by relocating each node of the mesh to a more
suitable position (Boman and Ponthot [5]), followed by the data transfer from the old
mesh configuration to the new one (Boman and Ponthot [6]). This transfer involves the
Gauss-point values (stress tensor components, history variables of the material such
as the equivalent plastic strain) as well as nodal values (velocities, accelerations and
temperature).
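The operator-split structure described above can be illustrated on a one-dimensional toy problem. The Python sketch below is a simplification under several assumptions that are not taken from the chapter: the Lagrangian step simply moves the nodes with a prescribed velocity instead of solving for equilibrium, the relocation rule respaces the nodes uniformly, and the data transfer is a first-order donor-cell (upwind) scheme for cell-wise constant values, which is not necessarily the scheme of [6]. All function names are hypothetical.

```python
import math


def lagrangian_step(nodes, velocity, dt):
    """Lagrangian step: every node follows the material (mesh velocity = v, c = 0)."""
    return [x + velocity(x) * dt for x in nodes]


def relocate_nodes(nodes):
    """Hypothetical relocation rule: respace the nodes uniformly between fixed ends."""
    n = len(nodes) - 1
    new = [nodes[0] + (nodes[-1] - nodes[0]) * i / n for i in range(n + 1)]
    new[-1] = nodes[-1]  # keep the end node exactly in place
    return new


def transfer_cell_values(old, new, q):
    """First-order donor-cell transfer of cell-wise constant values q."""
    swept = [x_new - x_old for x_old, x_new in zip(old, new)]
    flux = [0.0] * len(old)
    for j in range(1, len(old) - 1):  # end nodes are assumed not to move
        donor = q[j] if swept[j] > 0.0 else q[j - 1]  # upwind cell
        flux[j] = swept[j] * donor
    q_new = []
    for i in range(len(q)):
        vol_old = old[i + 1] - old[i]
        vol_new = new[i + 1] - new[i]
        q_new.append((vol_old * q[i] + flux[i + 1] - flux[i]) / vol_new)
    return q_new


def ale_increment(nodes, q, velocity, dt):
    """One operator-split increment: Lagrangian step, then Eulerian step."""
    lagrangian_nodes = lagrangian_step(nodes, velocity, dt)
    new_nodes = relocate_nodes(lagrangian_nodes)                  # substep 1
    q_new = transfer_cell_values(lagrangian_nodes, new_nodes, q)  # substep 2
    return new_nodes, q_new


def integral(nodes, q):
    """Integral of the cell-wise constant field q over the mesh."""
    return sum((nodes[i + 1] - nodes[i]) * q[i] for i in range(len(q)))


nodes = [i / 10 for i in range(11)]
q = [float(i) for i in range(10)]
velocity = lambda x: 0.05 * math.sin(math.pi * x)  # vanishes at both ends
new_nodes, q_new = ale_increment(nodes, q, velocity, dt=1.0)
```

The transfer conserves the integral of q as long as the end nodes stay fixed, and it remains bounded only while each node sweeps less than one neighbouring cell per increment. This is a one-dimensional analogue of the stability restriction on explicit transfer schemes.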
In the case of simulations of tribological tests such as DCET, the computation of
friction forces is obviously very important. Nevertheless this evaluation is not as easy
as in the Lagrangian case for which the position of each node of the mesh corresponds
to the same material particle during the whole simulation. The following strategy is
thus implemented: during the Lagrangian step, a classical penalty method is used to
compute the friction occurring at the nodes in contact with a tool. Then, after the
Eulerian step, the equilibrated internal forces are recomputed from the transferred
stress field on the new mesh. The friction forces are calculated by projection onto
the tools and the tangential gaps are recovered from these updated forces.
3 Basic ALE Model of Extrusion
The extrusion process, or more precisely wiredrawing, was first investigated using
the ALE formalism by Huétink et al. [15] in 1990. In that early work, the studied
problem was 2D axisymmetric, the mesh was purely Eulerian and the stationary
solution was sought. Later, Van Haaren et al. [27] and Geijselaers and Huétink [13]
built an extrusion model in order to analyse their respective novel ALE convection
schemes. Similarly, the mesh was fixed in space and a transient computation was
performed until the stationary state was reached. In those chapters, the analysis of
the results is not exhaustive: only the plastic strain is visualised, in order to compare
the numerical diffusion of the newly developed advection schemes. In particular, the
friction modelling is not discussed at all.
Transient models of extrusion have also been proposed by Atzema and Huétink [1]
and by Ponthot [21]. The ALE formalism can be very useful in this context. In this
kind of model, the mesh is not Eulerian anymore. Since the ALE formalism requires
a constant mesh topology and thus a constant number of finite elements throughout
the simulation, it is important to take advantage of the approximate knowledge of the
final shape of the extruded billet in order to build the initial mesh. As an illustration of
the particular mesh management procedure, Ponthot's model is presented in Fig. 3.
The extrusion problem is axisymmetric. A cylindrical billet is pushed by a punch into
a narrower channel so that a hollow cylinder is formed. The material is elastoplastic
($E = 200$ GPa, $\nu = 0.3$, $\sigma_Y = 210 + 10\,\bar{\varepsilon}^p$ MPa) and the friction on the boundaries
is modelled by Coulomb's law with a friction coefficient $\mu = 0.15$.
Fig. 3 Axisymmetric geometry of the extrusion model of [21] (dimension labels: 8 mm, 7 mm,
3 mm). A cylindrical specimen is constrained to flow into a narrow channel in order to produce a
hollow cylinder. The specimen shapes at the beginning and the end of the extrusion are
completely different
The transient solution is made up of two quadrangular regions (see Fig. 5): the
first region corresponds to the part of the crushed cylindrical billet that still remains
between the punches and the second one contains the material which has been already
extruded and which lies between the fixed punch and the container wall. At time
t = 0, the second region should be ideally empty. In order to keep a unique mesh
from the beginning to the end of the simulation, Ponthot initially assigns a very small
thickness (h = 0.01 mm) to this second region and creates a mesh on it. This artificial
region is called auxiliary region in the remainder of this work. The resulting finite
elements of the auxiliary region are thus very flat, but they can inflate as a result of the
material flux coming from the first region. The node relocation strategy is relatively
simple (see Fig. 4). Most of the vertices of the mesh are Lagrangian (i.e. they follow
the material motion). Only two vertices are Eulerian (i.e. fixed in space). The line
defining the nose of the fixed punch and its neighbour separating both regions of
the mesh are also Eulerian. The nodes of the other lines are relocated by defining a
cubic spline through them. These splines are then remeshed so that the initial node
distribution and their respective curvilinear abscissa are preserved. As far as the inner
nodes are concerned, they are continuously relocated thanks to the same transfinite
mesher that was used to generate the meshes. These node-relocation methods are
fully described in a previous work by Boman and Ponthot [5].
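The idea of remeshing a boundary line so that the initial node distribution along its curvilinear abscissa is preserved can be sketched as follows. This Python example is a simplification: it interpolates linearly along the current polyline instead of fitting the cubic spline mentioned above, and the function names are hypothetical.

```python
import math


def normalised_abscissas(points):
    """Cumulative arc length of each point divided by the total length."""
    lengths = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        lengths.append(lengths[-1] + math.hypot(x1 - x0, y1 - y0))
    total = lengths[-1]
    return [s / total for s in lengths]


def relocate_on_line(current_points, target_abscissas):
    """Place nodes on the current polyline at the prescribed normalised abscissas.

    target_abscissas must be sorted in increasing order, from 0 to 1.
    """
    s_cur = normalised_abscissas(current_points)
    relocated = []
    seg = 0
    for s in target_abscissas:
        # Advance to the segment of the current line that contains abscissa s.
        while seg < len(s_cur) - 2 and s > s_cur[seg + 1]:
            seg += 1
        span = s_cur[seg + 1] - s_cur[seg]
        w = 0.0 if span == 0.0 else (s - s_cur[seg]) / span
        (x0, y0), (x1, y1) = current_points[seg], current_points[seg + 1]
        relocated.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return relocated


# Initial, evenly spaced line and the same line after the nodes have drifted.
initial = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
current = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
relocated = relocate_on_line(current, normalised_abscissas(initial))
```

The relocated nodes recover the even spacing of the initial line while staying on the current geometry. A spline-based version would differ only in how positions between the current nodes are evaluated.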
Fig. 4 Node relocation procedure (labels: fixed punch, moving punch, fixed wall, Lagrangian
vertices, Eulerian vertices, Eulerian lines). Thanks to the simple geometry of the fixed punch, the
definition of the new mesh is made very easy. The nodes and the line in red are Eulerian. The
other lines which have at least one red vertex are remeshed using cubic splines
Figure 5 shows the progress of the simulation. Of course, the proposed mesh
management technique entails some issues. First of all, it is mandatory to roughly know
the direction of the material flow when setting up the model. Moreover, since
the number of finite elements is initially fixed in the auxiliary regions of the mesh,
these elements become larger and larger as the simulation progresses and,
consequently, the geometry of the extruded parts becomes crudely discretised. Finally, it
is not possible to extrude all of the material. The mesh of the billet must always be
made up of the same number of elements, but its thickness continuously decreases.
The crushed quadrangles, which lie either in the auxiliary region at the beginning of
the simulation or in the main region of the mesh at the end of the simulation, lead
to some convergence difficulties. On the one hand, these finite elements are poorly
conditioned for the Lagrangian steps of the ALE algorithm and, on the other hand,
the stability criterion of the explicit data-transfer scheme of the Eulerian step is very
restrictive concerning the maximum allowable punch displacement during a single
time step. As a result, a very small time step has to be used at the beginning and at
the end of the simulation (Fig. 6). Figure 7 shows the calculated force on the moving
punch during the extrusion operation. The curve obtained by the current
implementation is compared to the former results of Ponthot [21]. The trends of both curves are
very similar and the final values are identical. The discrepancy between the curves
may be explained by the differences in the ALE management of friction.
Despite these limitations, this particular mesh management technique is very
attractive for modelling extrusion or any other process for which the material flow is
predictable. For example, Gadala et al. [10] used the same ALE method to compute
the shape of a metallic chip of a cutting operation.
[Figure 5 labels: H = 8 mm; region 1; region 2, h = 0.01 mm (10 elements through the thickness); snapshots at t = 0 s, 1 s, 2 s and 3 s; Dtot = 0.95 H]
Fig. 5 Results of the extrusion test of Ponthot [21] for a punch stroke up to 95 % of the initial
thickness of the cylinder (H)
[Figure 6 axes: time step size $\Delta t$ [s] versus punch displacement [mm]; the level $\Delta t_{\max}$ is marked]
Fig. 6 Size of the time increment $\Delta t$ along the simulation
[Figure 7 axes: force [kN] (vertical); curves: this work, Ponthot (1995)]
Fig. 7 Extrusion force as a function of the punch displacement and comparison with the results of
Ponthot [21]
4 ALE Model of DCET
4.1 Geometry and Parameters
The previous mesh management technique is now applied to a Double Cup Extrusion
Test. The chosen geometry was developed at the Engineering Research Center (ERC)
of the Ohio State University in order to assess the properties of various lubricants [7].
The exact punch geometry is described in Fig. 8 and the corresponding numerical
values are listed in Table 1. They are related to the work of [23], which will be the
main reference in the remaining part of this chapter. The geometry of the punch is
far more complex than the one used by Ponthot. Even if Buschhausen et al. [7] claim
that the shape of the punch does not play a significant role in the results (its shape is
actually optimised to favour the radial flow of the lubricant, which is not modelled
here), this complex shape and all its geometrical details are retained in the model in
order to demonstrate the capabilities of our ALE node-relocation algorithm.
[Figure 8 dimension labels: Dp, Df, Ds, R, hp]
Fig. 8 Punch geometry used by Tan et al. [26] and Schrader [23]
Table 1 Geometry of the extrusion test studied by [23]. The parameters are related to Fig. 8
Part     Parameter   Value   Unit
Punch    Dp          15.88   mm
Punch    Df           9.53   mm
Punch    Ds          15.72   mm
Punch    R            1.17   mm
Punch    hp           1.57   mm
Punch    (angle)     10.0    degrees
Punch    (angle)      5.0    degrees
Billet   d0          31.75   mm
Billet   h0          31.75   mm
The initial diameter of the cylindrical billet $d_0$ is equal to its height $h_0$ and to the
internal diameter of the container. The extrusion ratio is defined as the ratio of the
surface of the punch nose and the upper surface of the billet ($r = D_p^2 / d_0^2$). This value,
deduced from Table 1, is equal to r = 0.25. According to Schrader et al. [23], this
particular value of r is ideal to observe large variations in the results due to friction
conditions.
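As a quick check of this value with the punch and billet diameters listed in Table 1 (15.88 mm and 31.75 mm respectively):

\[
r = \frac{D_p^2}{d_0^2} = \left(\frac{15.88}{31.75}\right)^2 \approx 0.250
\]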
The material is an AISI 1018 steel with classical elastic properties: Young's
modulus E = 200 GPa and Poisson's ratio $\nu$ = 0.3. The nonlinear hardening is modelled
by the following law:
$\sigma_Y = K \, (\bar{\varepsilon}^p)^n$    (6)
where K = 735 MPa and n = 0.17. This law has been identified from a standard
tensile test, the elastic part of which has been neglected. It is employed here in this
form, despite the fact that the initial yield stress is zero.
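The consequence of neglecting the elastic part can be seen in a short Python sketch. It reads Eq. (6) as a power law in the equivalent plastic strain; the function name `yield_stress` is hypothetical.

```python
K = 735.0  # hardening coefficient [MPa], value given in the text
N = 0.17   # hardening exponent, value given in the text


def yield_stress(eps_p, k=K, n=N):
    """Power-law hardening read from Eq. (6): K * eps_p**n, in MPa."""
    if eps_p < 0.0:
        raise ValueError("the equivalent plastic strain must be non-negative")
    return k * eps_p ** n
```

The call `yield_stress(0.0)` returns zero, which is the vanishing initial yield stress mentioned above, and `yield_stress(1.0)` returns K.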
Tresca's law models the frictional contact: Eq. (1) may be rewritten with
$m \, \sigma_Y / \sqrt{3}$ as the limiting friction shear stress. The extrusion test is supposed
to provide the value of this friction coefficient m by comparing an experimental curve with
a series of numerical curves obtained for a range of m values. In the following simulations
the default value is m = 0.05. In practice, the yield stress $\sigma_Y$ appearing in Tresca's
law is usually chosen as the initial yield stress (see for example the DCET simulations of Tan
et al. [26]). In the work of Schrader et al., this numerical value is zero. Consequently,
a first possibility is to use the updated local yield stress. However, this value is only
defined at the Gauss points of the neighbouring elements of the contact nodes, and
not directly at these nodes. The yield stress should thus be extrapolated from the
Gauss points to the contact nodes. As a consequence, the friction force evaluated at
a given node depends on the positions of all the nodes of the neighbouring elements
and a new and more complete stiffness matrix must be computed in order to keep a
quadratic convergence rate. The second possibility is to choose a mean value of the
yield stress of the material. The cold-drawn AISI 1018 steel is listed on matweb [18]
with an initial yield stress of 370 MPa. When this particular value is chosen,
numerical results close to the ones of Schrader et al. [23] are obtained. The first choice
leads to noticeably different results, which shows that some uncertainties remain in the
numerical parameters used in the reference work.
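To give an order of magnitude for the second possibility, the sketch below assumes that the friction shear limit takes the usual Tresca form, i.e. m times the yield stress divided by the square root of 3. The function name `tresca_shear_limit` is hypothetical.

```python
import math


def tresca_shear_limit(m, sigma_y):
    """Limiting friction shear stress [MPa] for a friction factor m and a
    yield stress sigma_y [MPa], assuming the form m * sigma_y / sqrt(3)."""
    return m * sigma_y / math.sqrt(3.0)


# Default friction value and constant yield stress quoted in the text
tau_max = tresca_shear_limit(0.05, 370.0)
```

With m = 0.05 and a constant yield stress of 370 MPa, the limit is about 10.7 MPa.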
The model is axisymmetric and integrated in time by an implicit quasi-static solver
(the speed of the punch is about 10 mm/s, which is far too low to produce
inertia effects). The mesh is made up of Selective Reduced Integration
(SRI) quadrangles. Both punches and the container are assumed rigid. The contact
is modelled by the penalty method. The normal and tangent penalty coefficients,
$p_N$ and $p_T$ respectively, are determined by trial and error: $p_N = 6 \times 10^4$ MPa/mm
and $p_T = 6 \times 10^3$ MPa/mm for the container wall, and $p_N = 2 \times 10^4$ MPa/mm and
$p_T = 2 \times 10^3$ MPa/mm for the punches. The billet is regularly meshed with elements
of 1 mm along the extrusion direction and 0.3 mm along the radial direction (for a total
of 31 × 52 elements). As in Ponthot's model presented in the previous section, the material
flows are anticipated by adding two very thin auxiliary meshes close to both punches
(15 elements through a thickness of 0.2 mm; see Fig. 10a).
The data-transfer step of the ALE algorithm consists in updating the stresses
(pressure $p$ and deviatoric stress components $s_{rr}$, $s_{rz}$, $s_{zz}$) and the equivalent
plastic strain $\bar{\varepsilon}^p$. These five fields are processed by a first-order Godunov scheme [6].
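The principle of such a first-order transfer can be sketched in one dimension. The Python example below is a donor-cell (first-order upwind) simplification written for illustration only; it is not the scheme of [6], and the function name `donor_cell_transfer` is hypothetical. Each cell carries a constant value, and the volume swept by a relocated node is filled with the value of the cell it is taken from.

```python
def donor_cell_transfer(x_old, x_new, q_old):
    """Transfer a piecewise-constant field from the old 1D mesh (nodes x_old)
    to the relocated mesh (nodes x_new) with a donor-cell scheme.
    Both end nodes are assumed to stay in place."""
    n = len(q_old)
    assert len(x_old) == len(x_new) == n + 1
    assert x_old[0] == x_new[0] and x_old[-1] == x_new[-1]

    # Flux of (field * volume) through each inner node, counted positive
    # when it feeds the cell on the left of the node.
    flux = [0.0] * (n + 1)
    for j in range(1, n):
        d = x_new[j] - x_old[j]                       # swept volume
        donor = q_old[j] if d > 0.0 else q_old[j - 1]
        flux[j] = d * donor

    q_new = []
    for i in range(n):
        v_old = x_old[i + 1] - x_old[i]
        v_new = x_new[i + 1] - x_new[i]
        q_new.append((q_old[i] * v_old + flux[i + 1] - flux[i]) / v_new)
    return q_new
```

This sketch preserves a uniform field and the integral of the field over the mesh. It is only meaningful when each swept volume stays smaller than the volume of its donor cell, which is the kind of restriction that makes very flat elements penalising for the time step.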
4.2 ALE Mesh Motion
The mesh motion definition is obviously more complex than in the case of the former
example. The main difficulty is to define the motion of the red line highlighted in Fig. 9
which represents the surface of the billet under the punch nose and its extension up
to the container wall. Unlike its counterpart in Fig. 4, this curve may not be Eulerian
because the punch is not stationary anymore. Furthermore, given its slightly convex
geometry, the punch is not entirely in contact with the surface of the billet at the
beginning of the computation. A small gap, which is initially empty, should be filled
during the first moments of the process.
[Figure 9 labels: moving punch, empty space, fixed wall, Lagrangian vertices, special vertices p1 and p2]
Fig. 9 Node relocation of the DCET model. Unlike the case of Fig. 4, the red line may not be
Eulerian anymore
[Figure 10 labels: punch, vertical line, deformed line (difficult to remesh), straight approximation used for remeshing, p1]
Fig. 10 a Zoom on the upper finely-meshed auxiliary region at the beginning of the simulation.
b Excessive distortion of the elements of the auxiliary domain if the nodes of the red line follow the
real motion of the material boundary
One could imagine simply preventing the radial motion of node p1 of Fig. 9.
The position of node p2 would be such that both nodes continuously have
the same Y-coordinate. The other vertices would be Lagrangian. Unfortunately, this
solution does not work because an unavoidable material flux is observed between
the two regions of the mesh and the thin auxiliary mesh becomes rapidly distorted
during the first steps of the simulation. Figures 10a and b explain this issue and the
proposed solution. Initially, the vertical line above p1 is very short and very finely
meshed. During the first steps, this line is deformed because the first element of the
auxiliary region receives some spurious fluxes related to the relocation of p1 on the
piecewise-linear boundary of the mesh. These fluxes are very small but, compared
to the very small area of the elements, they are large enough to severely degrade
the auxiliary mesh and to make the line impossible to remesh. Consequently, this
line is remeshed as if it were a straight line until the contact of p1 with the punch is
established.
The simulation is performed in two successive steps. The first one aims at filling
up the empty gap between the billet and the punch. At the end of this step, the
punch nose is entirely in contact with the billet and the situation becomes similar
to the simple extrusion problem of the previous section. During this first step, the
radial displacement of node p1 (Fig. 9) is set to zero and node p2 is Lagrangian.
The vertical line above p1 is remeshed as if it were straight in order to prevent any
distortion problem of the boundary. In doing so, a small inward spurious material flux
is tolerated through this boundary.
The second step begins when node p1 hits the punch surface. At this precise
moment, the vertical displacement of node p2, as well as those of all the nodes of
line (p1, p2), are set equal to that of p1. This line thus follows the vertical motion
of the punch. The formerly problematic line is remeshed at that stage
using a cubic spline in order to precisely follow the boundary of the extruded material.
This two-step strategy is symmetrically applied to the lower part of the billet.
If some friction is modelled between the billet and the tools, the process is not
symmetrical and the transition from the first step to the second does not occur at the
same time for the upper and the lower part of the model. This is not a problem in
practice.
Finally and more classically, all the remaining curves defining the boundaries of
the billet are continuously remeshed using cubic splines. The internal nodes of the
main meshed region are relocated thanks to Giuliani's smoothing method [14].
This method has been chosen among many others because it produces the
most regular mesh in this very case. This iterative smoothing requires five iterations
with an overrelaxation coefficient of 1.5. Lastly, both auxiliary meshes are
continuously remeshed by transfinite mapping.
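For a quadrangular region bounded by four curves, transfinite mapping can be sketched as follows in Python. The formula used here is the standard linear transfinite interpolation; it is assumed, not confirmed by the text, that the mesher of this chapter works in the same way. The function name `transfinite_mesh` and its arguments are hypothetical.

```python
def transfinite_mesh(bottom, top, left, right, nx, ny):
    """Nodes of a structured mesh of (nx x ny) quadrangles.
    bottom/top map xi in [0, 1] to (x, y); left/right map eta in [0, 1]
    to (x, y). The four curves must meet at the corners of the region."""
    c00, c10 = bottom(0.0), bottom(1.0)
    c01, c11 = top(0.0), top(1.0)
    nodes = []
    for j in range(ny + 1):
        eta = j / ny
        row = []
        for i in range(nx + 1):
            xi = i / nx
            b, t, l, r = bottom(xi), top(xi), left(eta), right(eta)
            row.append(tuple(
                # blend of the two pairs of opposite curves ...
                (1 - eta) * b[k] + eta * t[k] + (1 - xi) * l[k] + xi * r[k]
                # ... minus the bilinear interpolation of the four corners
                - ((1 - xi) * (1 - eta) * c00[k] + xi * (1 - eta) * c10[k]
                   + (1 - xi) * eta * c01[k] + xi * eta * c11[k])
                for k in range(2)))
        nodes.append(row)
    return nodes
```

Since the inner nodes only depend on the current boundary curves, the same call can be repeated at every step, whatever the current thickness of the region.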
4.3 Model Validation for Limited Punch Strokes
The ALE model of the previous section is compared with two other models: the
first one is a classical Lagrangian model and the second one is an alternative ALE
model which simply consists in smoothing the mesh of the billet without defining
any auxiliary region. This comparison enables us to validate the proposed ALE node
relocation technique.
The punch displacement s (also called punch stroke) is limited to 8 mm so that
the three models can converge and produce results. Figure 11a shows the Lagrangian
solution. The mesh is highly distorted close to the punch nose. Having a closer
look at this problematic spot (Fig. 11b), it can be noticed that the mesh boundary
highly penetrates the punch. The node-to-surface formulation of the contact with
the rigid tool yields erroneous results: the surface of the billet is subjected to very
large local elongations and the surface mesh stretches so widely that the curvature
of the punch radius is not well described anymore. Since the contact detection only
involves the nodes of the boundary, the edges are free to cross the punch analytical
surface producing a very large geometrical error.
[Figure 11 labels: s = 8 mm; nodes in contact; colour scale: equivalent plastic strain from 0.0 to 3.0]
Fig. 11 a Lagrangian solution for an 8-mm stroke. These results validate the ALE model for the
beginning of the process. b Zoom on the Lagrangian solution for which the contact is very badly
taken into account
[Figure 12 labels: (a) s = 8 mm; (b) s = 8 mm]
Fig. 12 a ALE solution for an 8-mm stroke (simple model). A single region is meshed and a tricky
smoothing operator is used. b ALE solution for an 8-mm stroke (two-region model)
Figure 12a presents the results obtained by the simple ALE model of DCET
without adding any auxiliary meshed regions. The initial mesh is identical to the
Lagrangian one. All the boundary nodes are relocated using cubic splines in order
to avoid the excessive stretching of the element edges on the boundary, which was
previously discussed. The inner nodes are relocated, after much trial and error,
by a very peculiar combination of two smoothing operators: 70 % of equipotential
smoothing and 30 % of weighted volume smoothing (Boman and Ponthot [5]). The
equipotential part helps to keep the mesh lines almost perpendicular to each other.
The weighted-volume part tries to equalise the volumes of the neighbouring
quadrangles. Used alone, neither of these methods allows the computation to converge
as far. Figure 13 shows that it is possible to simulate a stroke of smax = 5.5 mm with
an equipotential smoother and a stroke of smax = 5.7 mm with a volume-weighted
smoother. An appropriate combination of both methods enables the simulation to
converge up to 9.8 mm. Nevertheless, these values are much smaller than the
experimental stroke value of 27 mm. Even if this stroke could be reached, it is important
to notice that the combination factors of the smoothing methods are case-dependent
(which means that they are related to a particular value of the friction coefficient
m) and very tricky to guess. This simple ALE model is thus useless except for the
validation of the two-region ALE model.
[Figure 13 panels: equipotential (smax = 5.5 mm); weighted volumes (smax = 5.7 mm); 0.7 equipotential + 0.3 weighted volumes (smax = 9.8 mm); colour scale: equivalent plastic strain from 0.0 to 3.0]
Fig. 13 Comparison of the efficiency of the relocation methods. For each case, the red circle
indicates the most critical zone of the mesh where the elements are highly distorted
Figure 12b shows the results when using the more sophisticated ALE model
including the finely meshed auxiliary region for a punch displacement of 8 mm. This
time, the quality of the mesh is very good. The equivalent plastic strain distribution is
very similar to the one computed by the simple ALE model. Of course, the extruded
heights are slightly different because the two-region model starts at $t = t_0$ with
nonzero heights ($h_1(t_0) = h_2(t_0)$, equal to the 0.2 mm initial thickness of the
auxiliary meshes). In order to discard this error, the extruded
heights are measured without taking into account the initial small heights of the
auxiliary meshes (see Fig. 14).
These three models can be compared more precisely. Figure 15 shows the
evolution of $h_1$ and $h_2$ as a function of the punch stroke. Both ALE models give very
close numerical results that follow the trend of the Lagrangian model at the beginning
of the computation. Beyond s = 5 mm, the Lagrangian model departs from the
ALE models because of the penetration of the mesh inside the punch surface (see
Fig. 11b). Despite the correction on the computed cup heights, the sophisticated ALE
model generates slightly different results from the simple ALE model. This small
error (0.14 mm on $h_1$ and 0.08 mm on $h_2$ for s = 9.8 mm) certainly comes
from the contact length of the billet on the container wall, which is not the same in the
two cases. The 2-region model is thus subjected to slightly more friction than the
simple model. This fact directly affects the corresponding curve of Fig. 16. However,
the global trend is quite satisfactory.
[Figure 14 labels: y1, y2, s, h1, h2]
Fig. 14 Computation of the extruded cup heights $h_1$ and $h_2$ in the case of the ALE
model using the additional auxiliary meshes. The initial heights are subtracted
[Figure 15 axes: $h_1$ and $h_2$ [mm] versus punch stroke; curves: ALE, simple ALE, Lagrangian]
Fig. 15 Comparison of the cup heights $h_1$ and $h_2$ for the three models
In Fig. 17, the curves representing the vertical forces measured on the tools for
the three models are very similar at the beginning of the simulation. Starting from
s = 5 mm, the Lagrangian solution does not model the real process anymore because
of the excessive material penetration into the punch. Yet the simple ALE model
provides force levels that are very close to the more sophisticated ALE model until
it ceases to converge. Finally, as expected, the forces of the 3-region ALE
model are slightly higher than the ones obtained by the two other models (+1.3 %
for the force on the container wall). This difference, which is barely visible in the
figure, could be further reduced by decreasing the initial thickness of the auxiliary
meshes, at the cost of a slower convergence rate and thus an increase of the total
computational time.
[Figure 16 axes: $h_1/h_2$ (vertical); curves: ALE, simple ALE, Lagrangian]
Fig. 16 Comparison of the cup height ratios $h_1/h_2$ during the process
[Figure 17 axes: vertical forces [kN] (vertical); curves: ALE, simple ALE, Lagrangian]
Fig. 17 Vertical forces computed on the upper punch $F_y^{up}$, on the lower punch $F_y^{low}$ and on the
container wall $F_y^{wall}$ for the three models
This preliminary study proves that the implemented ALE treatment of friction is
correct because the same results are obtained independently of the chosen formalism.
[Figure 18 panels: m = 0.0, m = 0.05, m = 0.1; colour scale: equivalent plastic strain from 0.0 to 3.0]
Fig. 18 Deformed billets which have been obtained for several friction values m
4.4 Study of the Whole Process
The whole process is now simulated by using the ALE model until the vertical
displacement of the punch reaches s = 29 mm, which corresponds to 91 % of the
initial height of the billet. When using m = 0.05, the problem is solved in 346 time
increments corresponding to 422 Newton iterations. The total CPU time is 5'45''
(single-threaded run on an AMD Opteron 254, 2.8 GHz). This time does not depend
much on the value of the friction coefficient m. Approximately half of this time
(52 %) is spent in the Lagrangian step of the ALE algorithm. The remaining time
splits into node relocation routines (7 %) and the data-transfer scheme (41 %).
The deformed shapes of the billet are presented in Fig. 18 for three values of the
friction coefficient m (m = 0, m = 0.05 and m = 0.1). As expected, the upper cup
height $h_1$ becomes larger when the friction coefficient increases. Since the elastic
deformations are negligible and thanks to mass conservation, the opposite trend
is observed for the lower cup height $h_2$. One must keep in mind that this volume
conservation is not automatically verified in ALE formalism. A closer look at the
volume variation of the mesh reveals that the total volume slightly increases during
the ALE computations. Table 2 shows several interesting values: the added volume
corresponding to the auxiliary meshes represents about 1 % of the exact initial volume
of the experimental billet. At the end of the simulation, an increase of about 0.5 %
of the total volume of the mesh is noticed instead of a slight loss of volume that
could be intuitively expected from the elastic response of the crushed material. This
variation mainly results from the spurious material fluxes that are generated during the
remeshing of the boundaries of the mesh. A smaller fraction of this error might also
come from the limited accuracy of the data-transfer scheme. In any case, the observed
volume variation of the ALE mesh is always positive and it slightly increases as the
friction coefficient m increases.
Table 2 Variation of the volume of the mesh in the ALE simulations of DCET after a punch stroke
of 29 mm (m = 0.05)
                                              V [mm³]    V/V0 × 100 [%]
Initial volume of the experimental billet V0   25137        100.00
Added volume (auxiliary meshes)                  238          0.95
Initial volume of the model                    25375        100.95
Final volume of the model (m = 0.00)           25492        101.41
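The figures of Table 2 can be cross-checked with a few lines of Python. The billet dimensions come from Table 1 and the volumes from Table 2; the variable names are hypothetical.

```python
import math

d0 = h0 = 31.75                        # billet diameter and height [mm]
v0 = math.pi * d0 ** 2 / 4.0 * h0      # exact volume of the cylinder [mm^3]

v_added = 238.0                        # auxiliary meshes [mm^3]
v_model = 25375.0                      # initial volume of the model [mm^3]
v_final = 25492.0                      # final volume of the model [mm^3]

added_percent = 100.0 * v_added / v0
final_percent = 100.0 * v_final / v0
growth_percent = 100.0 * (v_final - v_model) / v_model
```

The cylinder volume rounds to 25137 mm³, and the growth of the meshed volume during the computation is about 0.46 %, in line with the increase of about 0.5 % quoted in the text.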
[Figure 19 axes: $h_1/h_2$ (vertical); curves for m = 0.0, 0.025, 0.05, 0.075, 0.1, 0.15 and 0.2; experimental points: exp (Schrader)]
Fig. 19 Set of curves of cup height ratios numerically obtained for a range of friction coefficients
m. A value for m may be deduced from the experimental measurements
According to its creator, Geiger [12], the main purpose of DCET is to numerically
identify a friction coefficient which is directly related to the chosen lubricant. To reach
this goal, a series of simulations is performed by considering a range of friction
values m. As an example, Fig. 19 shows the resulting curves in the case of the
studied model. The experimental values from Schrader et al. [23] (three point
measurements) have been superimposed on the numerical curves. A friction value
which is marginally greater than m = 0.05 can then be deduced visually from these
results.
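The identification is performed visually in this chapter. As an illustration only, the same selection could be automated by a least-squares comparison, as in the Python sketch below. The function `identify_friction` is hypothetical, and `toy_curve` is a made-up placeholder for a simulated curve of cup height ratio versus stroke; it does not reproduce any result of this chapter.

```python
def identify_friction(curves, measurements):
    """Return the friction value whose numerical curve is closest, in the
    least-squares sense, to the measured (stroke, ratio) points.
    curves maps each friction value m to a callable stroke -> ratio."""
    def misfit(m):
        return sum((curves[m](s) - ratio) ** 2 for s, ratio in measurements)
    return min(curves, key=misfit)


def toy_curve(m):
    """Placeholder curve (NOT a simulation result): ratio growing with m."""
    return lambda stroke: 1.0 + 10.0 * m * stroke / 29.0
```

A finer estimate would require either more simulated friction values or an interpolation between the two closest curves.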
[Figure 20 labels: s = 25 mm, $h_1^{\max}$, $h_1(r_{\max})$; colour scale: equivalent plastic strain from 0.0 to 3.0]
Fig. 20 Two different ways for measuring the height $h_1$ in the ALE model: either the largest height
$h_1^{\max}$ (the measurement position r may vary during the simulation), or the height $h_1$ measured on
the container wall (always at $r_{\max}$). These simulations correspond to m = 0.05 and n = 0
4.5 Comparison of Results Obtained by ALE and Remeshing
This section is devoted to a comparison between the ALE model and the numerical
and experimental work of Schrader et al. [23]. These authors use DEFORM-2D, a
FE code which is dedicated to the simulation of forging and extrusion processes.
DEFORM-2D features a sophisticated automatic remeshing algorithm which is very
useful to avoid critical distortions of the quadrangular finite elements during the
computation. The numerical techniques, which are compared in this section, are thus
completely different.
In their work, Schrader et al. study the influence of the hardening coefficient n of
the material (see Eq. 6) on the cup height ratio and on the contact pressure when the
friction value is m = 0.05. The cup height ratio is plotted in Fig. 21 for n = 0.17
(the reference value) and n = 0.0 (perfectly plastic material). As far as the ALE
results are concerned, not one but two curves have been plotted for each hardening
coefficient n. The first curve is obtained when the value of h is measured on the
container wall (at r = rmax = d0 /2). The second one is related to the largest value of
h which could be measured at a variable radial position r during the simulation. As
an example, Fig. 20 shows the final shape of the billet for a hardening coefficient of
n = 0. The position of the largest value of $h_1$ is not located at the container wall. The
cup height ratio may vary a lot according to the particular shape of the upper free
surface of the material and to the measurement position of $h_1$. This is particularly
the case when the punch stroke and h are small. For each value of n in Fig. 21, the
two ALE curves nicely surround the one obtained by Schrader using DEFORM-2D.
For a larger stroke, these three curves converge to an identical final value of the cup
height ratio.
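The two measurements can be written down explicitly. The Python sketch below assumes that the upper free surface is available as a list of (r, h) samples; the function name `cup_heights` and this data layout are hypothetical.

```python
def cup_heights(profile):
    """Return (h_wall, h_max) for a free-surface profile given as a list of
    (r, h) samples: h_wall is the height at the largest radius, i.e. on the
    container wall, and h_max is the largest height found at any radius."""
    _, h_wall = max(profile, key=lambda sample: sample[0])
    h_max = max(h for _, h in profile)
    return h_wall, h_max
```

Both values coincide when the free surface is highest at the wall; otherwise `h_max` is larger, which is why two curves are obtained for each value of n.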
[Figure 21 axes: $h_1/h_2$ (vertical); curves for n = 0.0 and n = 0.17: this work $h(r_{\max})$, this work $h_{\max}$, Schrader, exp]
Fig. 21 Influence of the hardening coefficient n of the material on the cup height ratio (m = 0.05)
[Figure 22 axes: pressure [MPa] versus Y [mm]; labels: lower punch, upper punch; curves for n = 0.0 and n = 0.17: this work, Schrader]
Fig. 22 Pressure field measured on the container wall for a 8-mm stroke and two different values
of the hardening coefficient (m = 0.05)
The contact pressures on the container wall are plotted in Fig. 22 for a stroke
limited to 8 mm and two different materials (n = 0 and n = 0.17). There is a very
good agreement between the results of Schrader et al. and the ALE model. The
curves are less close to each other at the level of both punches. Nevertheless, the
global shapes of the pressure fields are quite similar.
Schrader et al. also studied the influence of the initial height $h_0$ of the cylindrical
billet on the obtained results. The curves obtained for ratios $h_0/d_0$ of 0.75, 1.0
(reference value) and 1.25 are presented in Fig. 23. Once again, the ALE results are
very close to the published results of Schrader. The largest difference is observed
for the ratio $h_0/d_0$ = 0.75. In this case, the cup height ratio $h_1/h_2$ that is computed
by the ALE model is a 5 % underestimate of the value computed by DEFORM-2D.
Since some assumptions have been made about the supposed treatment of friction
and the initial yield stress of the material, this difference may still be considered as
rather small. Consequently, it can be concluded that the results of the ALE model are
consistent with the ones obtained by a remeshing procedure.
[Figure 23 axes: $h_1/h_2$ (vertical); curves for $h_0/d_0$ = 0.75, 1.0 and 1.25: this work $h(r_{\max})$, this work $h_{\max}$, Schrader, exp]
Fig. 23 Influence of the initial thickness $h_0$ of the cylindrical billet on the cup height ratio.
The friction is kept constant (m = 0.05)
5 Application to Thixoforming
In this section, the ALE model of the previous sections is used to simulate a
semi-solid forming operation, also known as thixoforming. This kind of process relies on a
specific behaviour, called thixotropy, of some alloys near their melting temperature.
They behave as solids at rest (a billet can sustain its own weight) but they react as
liquids during shearing (for example, they can be cut easily).
A thermomechanical constitutive law which models a smooth transition between
these two behaviours has been implemented in Metafor [16]. The numerical
validation of this law is performed using the ALE model of DCET and the results of a
campaign of experimental tests which was conducted at the University of Liège by
Pierret, Vaneetveld and Rassili [19, 20] in collaboration with the industrial
engineering and mechanical production laboratory of ENSAM (École Nationale Supérieure
d'Arts et Métiers, Metz, France).
The adapted numerical model exhibits several additional difficulties compared
to the one presented previously: all the material parameters are temperature
dependent and a coupled thermomechanical integration scheme is used. The heat transfer
between the material (a 100Cr6 steel alloy heated up to 1370 °C) and the rigid tools
(initially at 130 °C) is also taken into account. The upper punch velocity, which plays
a significant role in the process due to the variable viscosity of the thixotropic
material, is not constant. Finally, there is an initial gap between the billet and the container
wall at the beginning of the process (see Fig. 24). The filling of this gap requires the
definition of a supplementary stage in the time-integration sequence.
Fig. 24 Illustration of the mesh evolution during the extrusion test
[Figure 25 dimension labels: 27.3 mm, 26.1 mm, 14.2 mm, 14.7 mm]
Fig. 25 Comparison of the final shape obtained experimentally (left) and the final deformed section
resulting from the ALE simulation with a friction coefficient m = 0.35 (right)
Figure 25 presents the final shape of the billet obtained experimentally and
numerically. Although the friction coefficient has been chosen to get almost the same cup
heights in both cases, it is interesting to see that the simulated upper and lower
boundaries of the cups are very similar to the experimental ones. This simulation also
shows that the mesh management technique presented in this chapter is able to deal
with complex material flows.
6 Conclusions
An original 2D model of double cup extrusion test (DCET) has been presented in
this chapter. This model efficiently uses the ALE formalism in order to avoid a series
of complex and costly remeshing steps during the simulation. Since the DCET is a
tribological test, the model is also very interesting to validate the contact treatment
on an ALE mesh. An error in the ALE computation of the local friction force would
be immediately reflected on the global final shape of the deformed billet.
In order to keep a constant mesh topology, which is a prerequisite condition to use
the ALE formalism, it is necessary to add two very thin auxiliary material regions
to the initial mesh of the billet. These regions are made up of flat elements which
can inflate during the simulation when the billet is crushed between the punches
and the material flows from one mesh to the other. Although this particular mesh
management technique has been already used by [10, 21], it is the first time that this
kind of method is applied to a geometrically-complex process. Indeed, the noses of
the punches have not been simplified in the DCET model. They are not planar and
their curvature adds a real difficulty to the definition of the mesh motion and the
time-integration sequence.
The presented ALE model has been validated by two different means. It has been
compared first with an equivalent Lagrangian model during the beginning of the
simulations. Secondly, the ALE results have been compared to the ones computed
by DEFORM-2D which makes use of an automatic remeshing procedure. A very
good agreement has been observed between these two numerical techniques although
they are radically different.
Finally, the ALE model of DCET has been used in the frame of a fully-coupled
thermomechanical simulation of a semi-solid forming process. Once again, very
good results have been obtained without any remeshing operations.
References
1. Atzema EH, Huétink J (1995) Finite element analysis of forward/backward extrusion using ALE techniques. In: Shen, Dawson (eds) Simulation of materials processing: theory, methods and applications: proceedings of the 5th international conference NUMIFORM. New York
2. Bay N (1994) The state of the art in cold forging lubrication. J Mater Process Technol 46(1–2):19–40. doi:10.1016/0924-0136(94)90100-7
3. Benson DJ (1989) An efficient, accurate, simple ALE method for nonlinear finite element programs. Comput Methods Appl Mech Eng 72(3):305–350. doi:10.1016/0045-7825(89)90003-0
4. Benson DJ (1992) Computational methods in Lagrangian and Eulerian hydrocodes. Comput Methods Appl Mech Eng 99(2–3):235–394. doi:10.1016/0045-7825(92)90042-I
5. Boman R, Ponthot JP (2012) Efficient ALE mesh management for 3D quasi-Eulerian problems. Int J Numer Meth Eng 92(10):857–890. doi:10.1002/nme.4361
6. Boman R, Ponthot JP (2013) Enhanced ALE data transfer strategy for explicit and implicit thermomechanical simulations of high-speed processes. Int J Numer Meth Eng 53(0):62–73. doi:10.1016/j.ijimpeng.2012.08.007
7. Buschhausen A, Weinmann K, Lee JY, Altan T (1992) Evaluation of lubrication and friction in cold forging using a double backward-extrusion process. J Mater Process Technol 33(1–2):95–108. doi:10.1016/0924-0136(92)90313-H
8. Donea J, Huerta A, Ponthot JP, Rodriguez-Ferran A (2004) Encyclopedia of computational mechanics, chap 14: arbitrary Lagrangian-Eulerian methods, Vol 1. Wiley, pp 413–437. doi:10.1002/0470091355.ecm009
9. Forcellese A, Gabrielli F, Barcellona A, Micari F (1994) Evaluation of friction in cold metal forming. J Mater Process Technol 45(1–4):619–624. doi:10.1016/0924-0136(94)90408-1
10. Gadala MS, Movahhedy MR, Wang J (2002) On the mesh motion for ALE modeling of metal forming processes. Finite Elem Anal Des 38(5):435–459. doi:10.1016/S0168-874X(01)00080-4
11. Gariety M, Ngaile G, Altan T (2007) Evaluation of new cold forging lubricants without zinc phosphate precoat. Finite Elem Anal Des 47(3–4):673–681. doi:10.1016/j.ijmachtools.2006.04.016
12. Geiger R (1976) Der Stofffluss beim kombinierten Napffliesspressen - metal flow in combined can extrusion - (Berichte aus dem Institut für Umformtechnik, Universität Stuttgart). 36, Girardet, Essen, Germany
13. Geijselaers HJM, Huétink J (2000) Semi implicit second order discontinuous Galerkin convection for ALE calculations. In: Oñate E, Morgan K, Periaux J, Stein E (eds) (ECCOMAS) European congress on computational methods in applied sciences and engineering, Barcelona
14. Giuliani S (1982) An algorithm for continuous rezoning of the hydrodynamic grid in arbitrary Lagrangian-Eulerian computer codes. Nucl Eng Des 72(2):205–212. doi:10.1016/0029-5493(82)90216-3
15. Huétink J, Vreede PT, van der Lugt J (1990) Progress in mixed Eulerian-Lagrangian finite element simulation of forming processes. Int J Numer Meth Eng 30(8):1441–1457. doi:10.1002/nme.1620300808
16. Koeune R (2011) Semi-solid constitutive modeling for the numerical simulation of thixoforming processes. PhD thesis, University of Liège, Belgium
17. Male AT, Cockcroft MG (1965) A method for the determination of the coefficient of friction of metals under condition of bulk plastic deformation. J Inst Met 93:38–46
18. Matweb (2013) Online materials information resource. http://www.matweb.com/
19. Pierret J (2009) Quantification de la robustesse du procédé de thixoformage des aciers. PhD thesis, University of Liège, Belgium
20. Pierret J, Rassili A, Vaneetveld G, Bigot R, Lecomte-Beckers J (2010) Friction coefficients evaluation for steel thixoforging. Int J Mater Form 3:763–766. doi:10.1007/s12289-010-0882-1
21. Ponthot JP (1995) Advances in arbitrary Eulerian-Lagrangian finite element simulation of large deformation processes. In: Owen D, Oñate E (eds) Computational plasticity: fundamentals and applications - proceedings of the 4th international conference. Pineridge Press Ltd, Barcelona
22. Ponthot JP (1995) Traitement unifié de la mécanique des milieux continus solides en grandes transformations par la méthode des éléments finis. PhD thesis, Université de Liège, Liège, Belgium
23. Schrader T, Shirgaokar M, Altan T (2007) A critical evaluation of the double cup extrusion test for selection of cold forging lubricants. J Mater Process Technol 189(1–3):36–44. doi:10.1016/j.jmatprotec.2006.11.229
24. Scientific Forming Technologies Corporation (2013) DEFORM. http://www.deform.com/
25. Sofuoglu H, Rasty J (1999) On the measurement of friction coefficient utilizing the ring compression test. Tribol Int 32(6):327–335. doi:10.1016/S0301-679X(99)00055-9
26. Tan X, Bay N, Zhang W (1998) On parameters affecting metal flow and friction in the double cup extrusion test. Scand J Metall 27(6):246–252
27. Van Haaren MJ, Stoker HC, van den Boogaard AH, Huétink J (2000) The ALE-method with triangular elements: direct convection of integration point values. Int J Numer Meth Eng 49(5):697–720. doi:10.1002/1097-0207(20001020)49:5<697::AID-NME976>3.0.CO;2-U
Part II
Cardiovascular Fluid Mechanics
Simplified Fluid-Structure Interactions
for Hemodynamics
Olivier Pironneau
Abstract Computing blood flows in a closed vascular system by isolating one section for simulation creates instabilities due to the time-periodic structure of the flow and possible non-physical back flow in the simplified geometry. We propose some solutions in the context of a simplified fluid structure interaction on a fixed geometry but with pressure dependent normal velocities at the compliant walls. The present analysis is based on the Surface Pressure model for the fluid-structure interactions.
Keywords Blood flow · Hemodynamics · Finite element method · Pressure boundary conditions · Primary: 91B28, 65L60 · Secondary: 82B31
1 Introduction
Mastering the simulation of blood flow is the key to proper design of by-passes,
stents and heart valves (see Thiriet [17] for instance).
The problem was addressed by Charles Peskin in the nineties, and his team has produced impressive simulations since, using fictitious domains and immersed boundary techniques [1, 12, 13, 18].
Another approach, taken by Quarteroni et al. [5] and the REO project at INRIA [3, 4, 19], is to discretize the full fluid-structure coupled problem with solvers working in moving domains.
In a seminal paper [11], Nobile and Vergara showed that the problem is well posed and conserves energy. Nevertheless the numerical simulations are expensive [2] and there is room for simplifications.
O. Pironneau (corresponding author)
Laboratoire Jacques-Louis Lions, Sorbonne Universités, UPMC,
Boîte courrier 187, 75252 Paris Cedex 05, France

In the special case of aortic flow the geometry does not change much. Typically the aorta has a radius of 1 cm and a computational geometry deals with a section of length 5–10 cm; the thickness of the aortic wall is around 0.1 cm; the heart pulse is about 1 Hz and the pressure drop roughly 6 kPa.
In principle arteries are deformable solids subject to large displacements and nonlinear elasticity (e.g. [7, 8, 10]). But when only small displacements occur and linear elasticity applies, shell models like Koiter's can be used. It was shown in [11] that if lateral displacements are neglected, Koiter's model reduces to a scalar equation for the normal displacement $\eta$
$$\rho^s h\,\partial_{tt}\eta - \nabla\cdot(T\nabla\eta) - \nabla\cdot(C\nabla\partial_t\eta) + a\,\partial_t\eta + b\,\eta = f^s, \qquad \eta,\ \partial_t\eta \ \text{given at } t = 0 \tag{1}$$
on the mean position $\Sigma$ of the vessel's wall; here $h$ denotes the average thickness of the vessel and $\rho^s$ its volumic mass; $T$ is the pre-stress tensor (needed because at rest the vessel is blown up by the blood); $C$ is a damping term, $a$, $b$ are viscoelastic terms and $f^s$ the external normal force, i.e. $-\sigma^s_{nn}$, the normal component of the normal stress at the surface of the solid.
Notice however that the other components of the normal stress tensor cannot be matched with the fluid when the displacement is assumed normal.
Finally assume that $[h, T, C, a] \ll b$; then the Surface Pressure Model is obtained:
$$-\sigma^s_{nn} = b\,\eta, \quad \text{with } b = \frac{E h \pi}{A(1 - \xi^2)} \tag{2}$$
where $A$ is the vessel's cross section, $E$ the Young modulus, $\xi$ the Poisson coefficient.
Some typical values (MKSA):
$$E = 3\ \mathrm{MPa},\quad \xi = 0.3,\quad A = \pi R^2,\quad R = 0.01,\quad h = 0.001,\quad b = 3.3\times 10^{7}\ \mathrm{m\,s^{-2}} \tag{3}$$
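As a quick arithmetic check of the typical values in (3), the short Python sketch below evaluates the coefficient of the Surface Pressure Model. It assumes that the numerator of (2) carries a factor pi (i.e. b = pi E h / (A (1 - xi^2)) with A = pi R^2), which is the reading that reproduces the quoted order of magnitude; the function name and the symbol xi for the Poisson coefficient are illustrative choices, not taken from the text.

```python
import math

def surface_pressure_stiffness(E, h, R, xi):
    """Coefficient b of the Surface Pressure Model for a circular section.

    Assumption: b = pi*E*h / (A*(1 - xi**2)) with A = pi*R**2.
    """
    A = math.pi * R**2
    return math.pi * E * h / (A * (1.0 - xi**2))

# Typical values quoted in (3), in SI units
b = surface_pressure_stiffness(E=3.0e6, h=1.0e-3, R=1.0e-2, xi=0.3)
print(f"b = {b:.2e}")
```

With these inputs the value agrees with the 3.3 x 10^7 quoted in (3) to two significant digits.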
2 Boundary Conditions
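With simple toroidal coordinates $(r, \theta, \varphi)$ ($x = R\cos\varphi$, $y = R\sin\varphi$, $z = r\sin\theta$) where $R = R_0 + r\cos\theta$,
$$\nabla\cdot u = h_r h_\theta h_\varphi\left[\partial_r\!\left(\frac{u_r}{h_\theta h_\varphi}\right) + \partial_\theta\!\left(\frac{u_\theta}{h_\varphi h_r}\right) + \partial_\varphi\!\left(\frac{u_\varphi}{h_r h_\theta}\right)\right] \tag{4}$$
with $h_r = 1$, $h_\theta = \frac{1}{r}$, $h_\varphi = \frac{1}{R}$ because, by definition
$$\frac{1}{h_k^2} = (\partial_k x)^2 + (\partial_k y)^2 + (\partial_k z)^2, \quad k = r, \theta, \varphi \tag{5}$$
So $\nabla\cdot u = 0$ and $u\times n = 0$ imply
$$\nabla\cdot u = \partial_r u_r + \frac{R_0 + 2r\cos\theta}{r\,(R_0 + r\cos\theta)}\,u_r = 0 \;\Rightarrow\; \partial_r u_r|_\Sigma = -\frac{u_r}{r}\,\frac{R_0 + 2r\cos\theta}{R_0 + r\cos\theta} \tag{6}$$
Similarly
$$\nabla u = \sum_{i,k} e_i\, h_i\, \partial_i\,(e_k u_k), \quad i, k \in (r, \theta, \varphi) \tag{7}$$
with
$$e_r = (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ \sin\theta)^T,\quad e_\theta = (-\sin\theta\cos\varphi,\ -\sin\theta\sin\varphi,\ \cos\theta)^T,\quad e_\varphi = (-\sin\varphi,\ \cos\varphi,\ 0)^T \tag{8}$$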
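The scale factors used in (4)-(6) can be checked numerically. The sketch below is only a sanity check under the stated parametrization (x = R cos(phi), y = R sin(phi), z = r sin(theta), R = R0 + r cos(theta)); the value of R0, the sample point and all function names are illustrative assumptions. It differentiates the position vector by centred finite differences and compares with h_r = 1, h_theta = 1/r, h_phi = 1/R, and it also checks the coefficient of u_r in (6), which is (1/(rR)) d(rR)/dr.

```python
import math

R0 = 4.0  # illustrative big radius of the torus centre line

def position(r, theta, phi):
    """Cartesian point for the toroidal coordinates used in the text."""
    R = R0 + r * math.cos(theta)
    return (R * math.cos(phi), R * math.sin(phi), r * math.sin(theta))

def scale_factor(k, r, theta, phi, eps=1e-6):
    """h_k = 1 / |d(x, y, z)/dq_k|, by centred finite differences (definition (5))."""
    q_plus, q_minus = [r, theta, phi], [r, theta, phi]
    q_plus[k] += eps
    q_minus[k] -= eps
    xp, xm = position(*q_plus), position(*q_minus)
    norm = math.sqrt(sum(((a - b) / (2 * eps)) ** 2 for a, b in zip(xp, xm)))
    return 1.0 / norm

def divergence_coefficient(r, theta, eps=1e-6):
    """Coefficient of u_r in div(u) for a purely radial field: (1/(r R)) d(r R)/dr."""
    rR = lambda s: s * (R0 + s * math.cos(theta))
    return (rR(r + eps) - rR(r - eps)) / (2 * eps) / rR(r)

r, theta, phi = 1.0, 0.7, 1.9
R = R0 + r * math.cos(theta)
h_r, h_theta, h_phi = (scale_factor(k, r, theta, phi) for k in range(3))
coeff_fd = divergence_coefficient(r, theta)
coeff_formula = (R0 + 2 * r * math.cos(theta)) / (r * (R0 + r * math.cos(theta)))
```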
Thus
ur r
f
r
n T (u)n = r u r + 1 + cos2 nn = p + 2 1 + cos2 u n.
r R R r
(9)
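Hence the matching conditions at the fluid-structure interface on a torus of small radius $r$ and big radius $R$ are
$$\partial_t\eta = u\cdot n, \qquad p = 2\frac{\mu}{r}\left(1 + \frac{r}{R}\cos^2\theta\right)\partial_t\eta + b\,\eta \tag{10}$$
Notice that (10) implies
$$\partial_t p = 2\frac{\mu}{r}\left(1 + \frac{r}{R}\cos^2\theta\right)\partial_t (u\cdot n) + b\,u\cdot n \tag{11}$$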
3 Moving Fluid Domains Versus Fixed Domains
3.1 Energy Considerations
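Assuming the fluid Newtonian and incompressible, the pressure $p$ and the velocity $u$ are given by the Navier-Stokes equations
$$\rho^f\left(\frac{\partial u}{\partial t} + u\cdot\nabla u\right) - \nabla\cdot\sigma^f = 0, \quad \nabla\cdot u = 0, \tag{12}$$
where $\rho^f$ is the volumic mass of the fluid, $\mu$ the viscosity and $\sigma^f = -pI + \mu(\nabla u + \nabla u^T)$ is the stress tensor.
To check the energy budget one multiplies (12) by $u$ and integrates by parts:
$$\int_\Omega \frac{\rho^f}{2}\,\partial_t |u|^2 + \int_\Omega \frac{\mu}{2}(\nabla u + \nabla u^T) : (\nabla u + \nabla u^T) + \int_\Sigma \frac{\rho^f}{2}|u|^2\, u\cdot n = \int_\Sigma \sigma^s\, u\cdot n \tag{13}$$
The fluid velocity on $\Sigma$ is equal to the wall velocity, so (see [5])
$$\int_{\Omega(t)} \frac{1}{2}\,\partial_t |u|^2 + \int_\Sigma \frac{1}{2}|u|^2\, u\cdot n = \partial_t \int_{\Omega(t)} \frac{1}{2}|u|^2 \tag{14}$$
This leads to the following energy identity
$$\int_{\Omega(T)} \frac{\rho^f}{2}|u|^2(T) + \int_{\Omega\times(0,T)} \frac{\mu}{2}|\nabla u + \nabla u^T|^2 = \int_{\Omega(0)} \frac{\rho^f}{2}|u|^2(0) + \int_{\Sigma\times(0,T)} \sigma^s\, u\cdot n \tag{15}$$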
3.2 The Problem in Strong Form
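Now if we consider (12) on a fixed domain with zero tangential velocities but non-zero normal velocities on the walls then to conserve energy we need to change $u\cdot\nabla u$ into $u\cdot\nabla u - \nabla\frac{1}{2}|u|^2$, which happens to be $-u\times\nabla\times u$ due to the identity
$$u\times\nabla\times u = \nabla\frac{1}{2}|u|^2 - u\cdot\nabla u. \tag{16}$$
Let us recall another identity:
$$\Delta u = -\nabla\times\nabla\times u + \nabla(\nabla\cdot u) \tag{17}$$
Therefore the modified Navier-Stokes system suited to flows in fixed domains with zero tangential components on the walls ($u\times n = 0$) is
$$\rho^f\left(\frac{\partial u}{\partial t} - u\times\nabla\times u\right) + \mu\,\nabla\times\nabla\times u + \nabla p = 0, \quad \nabla\cdot u = 0, \tag{18}$$
in a domain $\Omega$ with $u\times n = 0$ and $p$ related by (11) on $\Sigma$, as shown below.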
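The reason the rotational form of the convective term conserves energy on a fixed domain is that u x curl(u) is orthogonal to u, while it differs from u.grad(u) only by the gradient of |u|^2/2 (identity (16)). The following Python sketch checks both facts at one point for an arbitrary smooth field; the test field, the sample point and the helper names are hypothetical and chosen only for illustration.

```python
def velocity(p):
    """An arbitrary smooth (polynomial) test field; not taken from the text."""
    x, y, z = p
    return (y * z, x * x - z, x * y + z * z)

def jacobian(f, p, eps=1e-5):
    """J[i][j] = d f_i / d x_j by centred finite differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        pp, pm = list(p), list(p)
        pp[j] += eps
        pm[j] -= eps
        fp, fm = f(pp), f(pm)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

point = (0.3, -1.2, 0.8)
u = velocity(point)
J = jacobian(velocity, point)

curl_u = (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])
u_cross_curl = cross(u, curl_u)

# grad(|u|^2 / 2)_j = sum_i u_i d_j u_i   and   (u . grad u)_i = sum_j u_j d_j u_i
grad_half_u2 = tuple(sum(u[i] * J[i][j] for i in range(3)) for j in range(3))
u_grad_u = tuple(sum(u[j] * J[i][j] for j in range(3)) for i in range(3))

identity_residual = max(abs(a - (g - c))
                        for a, g, c in zip(u_cross_curl, grad_half_u2, u_grad_u))
energy_contribution = sum(a * b for a, b in zip(u, u_cross_curl))  # u . (u x curl u)
```

Both `identity_residual` and `energy_contribution` vanish up to rounding, which is the pointwise statement behind the energy estimates that follow.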

3.3 The Problem in Variational Form
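Its variational formulation is: find $u$, $p$ such that $\forall \hat u, \hat p$ with $\hat u\times n|_\Sigma = 0$,
$$\int_\Omega \left[\rho^f\left(\frac{\partial u}{\partial t} - u\times\nabla\times u\right)\cdot\hat u + \mu\,\nabla\times u\cdot\nabla\times\hat u - p\,\nabla\cdot\hat u - \hat p\,\nabla\cdot u\right] + \int_\Sigma p\,\hat u\cdot n = 0, \tag{19}$$
with $p$ related to $u\cdot n$ by (11).
Problem 1 Find $u$, $p$, $\eta$ such that $\forall \hat u, \hat p, \hat\eta$ with $\hat u\times n|_\Sigma = 0$, $u$ and $\eta$ given at $t = 0$,
$$\int_\Omega \left[\rho^f\left(\frac{\partial u}{\partial t} - u\times\nabla\times u\right)\cdot\hat u + \mu\,\nabla\times u\cdot\nabla\times\hat u - p\,\nabla\cdot\hat u - \hat p\,\nabla\cdot u\right] + \int_\Sigma \left[(\alpha\,\partial_t\eta + b\,\eta)\,\hat u\cdot n + b\,\hat\eta\,(\partial_t\eta - u\cdot n)\right] = 0, \tag{20}$$
with $\alpha = 2\frac{\mu}{r}\left(1 + \frac{r}{R}\cos^2\theta\right)$. As (20) implies (10-a), energy estimates derive by choosing $\hat u = u$, $\hat p = p$, $\hat\eta = \eta$:
$$\int_\Omega \rho^f |u|^2(T) + \int_\Sigma b\,\eta^2(T) + \int_{\Omega\times(0,T)} 2\mu\,|\nabla\times u|^2 + \int_{\Sigma\times(0,T)} 2\alpha\,(\partial_t\eta)^2 = \int_\Omega \rho^f |u|^2(0) + \int_\Sigma b\,\eta^2(0) \tag{21}$$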
3.4 Approximation with the Nedelec Edge Element
Boundary conditions on $u\times n$ are hard to enforce. Furthermore boundary conditions involving the pressure have their own difficulties (see [15, 16]). In [6] it is argued that finite element approximations of (24) require edge elements. An error analysis is given with $P^k - P^{k-1}$ discontinuous elements with degrees of freedom being edge fluxes of degree $k$ plus face fluxes of degree $k-1$ and volume fluxes of degree $k-2$ for the velocities.
Although the proof of convergence is done for $k \ge 2$, we tested the same idea with $P^1$ Raviart-Thomas elements (called $RT^0$) for the velocity and $P^0$ discontinuous elements for the pressure. In theory $\eta$ should be $P^0$-discontinuous like the pressure; first we took it $P^1$-continuous to simplify the implementation because then we can add to the formulation a small regularization term in $\eta$ everywhere in $\Omega$ so as to avoid having degrees of freedom for $\eta$ only on the boundary.
Then we also tested $\eta$ approximated with the $P^1$ Raviart-Thomas element and formulated the Laplacian of $\eta$ in mixed form; this considerably increases the number of degrees of freedom: $3(n_v + n_e) + 2n_v$ for the $P^2 - P^1 - P^1$ element (tested in [14], see also below), $3n_e + n_t + 2n_v$ for the $RT^0 - P^1 - P^1$ element and $6n_e + 2n_t + 2n_v$ for the $RT^0 - P^0 - RT^0 + P^0$ element, where $n_v$ is the number of vertices, $n_e$ the number of edges, $n_t$ the number of elements. We tested these 3 sets of elements on a simple geometry: a quarter of a torus with a pressure drop imposed from the top horizontal cross section to the right vertical one. The cross section of the torus is a circle of radius 1 cm. This circle is extruded on a greater circle of radius 4 cm. The pressure drop is $6\cos(\pi t)$, $b = 200$ and $\mu = 0.001$.
The time step is 0.05. The mesh has $n_v = 1395$, $n_t = 6120$, $n_e = 1336$. The computation is stopped at $t = 0.75$.
The results are shown on Fig. 4. On a Core i7 processor it takes 17 s with the Nedelec-$P^1 - P^1$ element to compute 16 time steps with the characteristic-Galerkin method for the non-linear terms (see [14]) and 22 s with the Nedelec/Raviart-Thomas element (see Fig. 1).
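To make the comparison of problem sizes concrete, the following Python sketch simply evaluates the three degree-of-freedom counts quoted in this section for the mesh figures given in the text (1395 vertices, 1336 edges, 6120 elements). The formulas are transcribed as stated in the text; the function name and the dictionary labels are illustrative.

```python
def dof_counts(n_v, n_e, n_t):
    """Degree-of-freedom counts for the three element choices, as stated in the text."""
    return {
        "P2-P1-P1": 3 * (n_v + n_e) + 2 * n_v,
        "RT0-P1-P1": 3 * n_e + n_t + 2 * n_v,
        "RT0-P0-(RT0+P0)": 6 * n_e + 2 * n_t + 2 * n_v,
    }

counts = dof_counts(n_v=1395, n_e=1336, n_t=6120)
for name, n in counts.items():
    print(f"{name}: {n}")
```

For this mesh the mixed Raviart-Thomas treatment of the displacement roughly doubles the number of unknowns compared with the P2-P1-P1 choice, which is consistent with the longer run time reported above.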
4 A Formulation Where the Displacement is Eliminated
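Notice that $\eta$ can be eliminated from (10), giving a formulation which contains $u\times n = 0$ and
$$n\,\partial_t p = \alpha\,\partial_t u + b\,u \tag{22}$$
where $\alpha$ denotes the coefficient of $\partial_t\eta$ in (10).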
4.1 A Time Discretisation
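Consider now (19) discretized in time:
$$\int_\Omega \left[\rho^f\left(\frac{u^{m+1} - u^m}{\delta t} - u^{m+\frac12}\times\nabla\times u^m\right)\cdot\hat u + \mu\,\nabla\times u^{m+\frac12}\cdot\nabla\times\hat u - p^{m+1}\,\nabla\cdot\hat u - \hat p\,\nabla\cdot u^{m+\frac12}\right] + \int_\Sigma p^{m+1}\,\hat u\cdot n = 0. \tag{23}$$
We use (22) discretized in time to compute $p^{m+1}|_\Sigma$ and so we consider
Problem 2 Find $u$, $p$ such that $\forall \hat u, \hat p$ with $u$ and $\partial_t p$ given at $t = 0$,
$$\int_\Omega \left[\rho^f\left(\frac{u^{m+1} - u^m}{\delta t} - u^{m+\frac12}\times\nabla\times u^m\right)\cdot\hat u + \mu\,\nabla\times u^{m+\frac12}\cdot\nabla\times\hat u - p^{m+1}\,\nabla\cdot\hat u - \hat p\,\nabla\cdot u^{m+\frac12}\right] + \int_\Sigma \left[\delta t\,b\,u^{m+\frac12} + \alpha\,(u^{m+1} - u^m) + p^m n\right]\cdot\hat u = 0. \tag{24}$$
Fig. 1 Left Surfaces of constant pressure for a flow with $\mu = 10^{-3}$, $b = 200$ in a quarter of a torus with $R = 4$, $r = 2$ discretized on a fixed geometry with the Nedelec edge element for the velocity, piecewise constant pressures and linear continuous deformation. Right same as left but with a mixed Raviart-Thomas element for the displacement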
Formulation (19) is valid only if $u\times n = 0$. This condition has been removed from (24) to make it symmetric and easy to implement, but the consequence is that, by working the integrations by parts backward, it is found that this formulation implies (18) and on $\Sigma$:
$$\left[\delta t\,b\,u^{m+\frac12} + \alpha\,(u^{m+1} - u^m)\right]\cdot n - (p^{m+1} - p^m) = 0, \qquad \nabla\times u^{m+\frac12}\times n = 0 \tag{25}$$
The first condition no longer implies that $u\times n = 0$ and the second condition is like saying that the tangential stress is zero, which means that we match not only the normal components of the fluid and solid normal stress but all the components.
In summary Problem 2 is different from Problem 1; both of them have a physically sound background but we need to test them numerically to see how different they are.
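The time-discrete elimination of the displacement can be illustrated on the wall alone. The Python sketch below advances the normal component of relation (22), in the incremental form used in (24), for a prescribed normal velocity and compares the result with the exact time integral. It is a minimal sketch under explicit assumptions: the mid-step value is taken as the average of the two time levels, the wall velocity sin(t) is a hypothetical input, and the values of alpha, b and the time step are illustrative.

```python
import math

def advance_wall_pressure(p_old, un_old, un_new, alpha, b, dt):
    """One step of p^{m+1} = p^m + alpha*(u^{m+1} - u^m).n + dt*b*u^{m+1/2}.n.

    Assumption: u^{m+1/2} is the average of the old and new values.
    """
    un_mid = 0.5 * (un_old + un_new)
    return p_old + alpha * (un_new - un_old) + dt * b * un_mid

alpha, b, dt, steps = 0.002, 200.0, 0.05, 16   # illustrative values
normal_velocity = math.sin                      # hypothetical wall velocity u.n(t)

p = 0.0  # wall pressure at t = 0
for m in range(steps):
    t_old, t_new = m * dt, (m + 1) * dt
    p = advance_wall_pressure(p, normal_velocity(t_old), normal_velocity(t_new),
                              alpha, b, dt)

# Exact solution of dp/dt = alpha * d(u.n)/dt + b * u.n with p(0) = 0 and u.n = sin(t)
t_end = steps * dt
p_exact = alpha * math.sin(t_end) + b * (1.0 - math.cos(t_end))
error = abs(p - p_exact)
```

Because the update is a trapezoidal rule in time, the error decreases quadratically with the time step; no displacement variable needs to be stored, only the previous wall pressure.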
4.2 Discretization with a Finite Element Method
Let $T_h$ be a triangulation with $K$ tetrahedra $\{T_k\}_1^K$ with the usual conformity hypotheses; let $\Omega := \cup_k T_k \subset R^3$.
Consider the $P^2 - P^1$ element built from
$$V_h = \{v \in C^0(\Omega)^3 : v_i|_{T_k} \in P^2,\ i = 1, 2, 3\}, \qquad Q_h = \{q \in C^0(\Omega) : q|_{T_k} \in P^1\} \tag{26}$$
We assume that the boundary is made of two parts, $\Sigma$ which is the compliant wall and $\Gamma$ the input and output sections on which $p$ is given and $u\times n = 0$.
4.3 Discretization of Problem 1
For simplicity we assume that r << R, i.e. = 1. The momentum equation is also
divided by $\rho^f$, so that $\nu = \mu/\rho^f$, and $b$ is changed into $b/\rho^f$.
A feasible discretization of (24) is to find $[u^{m+1}, p^{m+1}, \eta^{m+1}] \in V_h \times Q_h \times Q_h$ with $u^{m+1}\times n = 0$ and $\eta^{m+1} = 0$ on the input and output sections $\Gamma$, and such that

m+1
u um m+ 12 m+ 21
u u u m
p m+1
u p u
t

1 1
+ u m+ 2 u + m+ 2 ]

1 m+ 1 1 1 1
+ b m+ 2 u n u n 2 (m+1 m ) + (u m+ 2 n) (u n)
t

= p u n , [u, p, ] Vh Q h Q h with u n| = 0, | = 0.

(27)
where is any small positive parameter.

When is kept fixed, an energy consevation identity is found by choosing u =
m+ 21 1
u , p = p m+1 , = m+ 2 :
2

u m+1 u m 2 m+ 21 2 m+ 21 2
+ | u | + | |
t
m+1 2
m 2 1 m+ 21 m+ 1
+ + |u n| =
2
p u n 2 (28)
t
As for the Navier-Stokes equations, when $\delta t$ is small enough the problem has a unique solution because of the energy estimate and because a general inf-sup condition is satisfied with $p$ replaced by $[p, \eta]$.
4.4 Discretization of Problem 2

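A feasible discretization of (24) is to find $u^{m+1} \in V_h$, $p^{m+1} \in Q_h$ such that
$$\int_\Omega \left[\left(\frac{u^{m+1} - u^m}{\delta t} - u^{m+\frac12}\times\nabla\times u^m\right)\cdot\hat u - p^{m+1}\,\nabla\cdot\hat u - \hat p\,\nabla\cdot u^{m+\frac12} + \nu\,\nabla\times u^{m+\frac12}\cdot\nabla\times\hat u\right] + \int_\Sigma \left(u^{m+\frac12}\, b\,\delta t + p^m n\right)\cdot\hat u = -\int_\Gamma p_\Gamma\,\hat u\cdot n$$
$$\forall \hat u \in V_h,\ \hat p \in Q_h \ \text{with } \hat u\times n|_\Gamma = 0 \tag{29}$$
where $\Gamma$ denotes the input and output sections and $p_\Gamma$ the pressure given there.
Notice that $u^{m+1}\times n|_\Gamma = 0$ is implied by the formulation. When $\Gamma$ is flat that condition amounts to some component of the velocity being zero, which is easy to implement.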
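Notice that the energy equality implies stability only so long as $p^m$ remains bounded on $\Sigma$, which could possibly be derived from (29), but not so obviously:
$$\int_\Omega \frac{|u^{m+1}|^2 - |u^m|^2}{2\,\delta t} + \nu\int_\Omega |\nabla\times u^{m+\frac12}|^2 + \int_\Sigma b\,|u^{m+\frac12}|^2\,\delta t = -\int_\Sigma p^m\, u^{m+\frac12}\cdot n - \int_\Gamma p_\Gamma\, u^{m+\frac12}\cdot n \tag{30}$$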
5 Numerical Tests
5.1 Moving the Geometry for Graphic Visualization
The full model requires that $\Sigma$ be moved at every time step along its normal by a quantity $\delta t\,u^m\cdot n$. To preserve the triangulation we follow the literature [2] and solve an additional problem
$$-\Delta d^{m+1} = 0 \ \text{in } \Omega, \qquad d^{m+1}|_\Sigma = d^m + \kappa\, n\,\delta t\,u^m\cdot n, \qquad d^{m+1}|_\Gamma = 0 \tag{31}$$
and then move every vertex $q^j$ of the triangulation: $q^j \to q^j + d$. In theory $\kappa = 1$ but for graphic enhancement it can be adjusted. Note however that (31) is expensive.
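The role of the additional Laplace problem is to spread the wall displacement smoothly into the interior so that mesh vertices move without tangling. The Python sketch below is a deliberately reduced one-dimensional analogue, not the three-dimensional solve of the text: a line of nodes runs from the compliant wall to a fixed boundary, the wall node receives the displacement increment, and Gauss-Seidel sweeps compute the discrete harmonic extension. The node count, the number of sweeps and the numerical values are illustrative assumptions.

```python
def harmonic_extension_1d(wall_displacement, n_nodes=11, sweeps=500):
    """Discrete solution of d'' = 0 on a uniform line of nodes.

    d[0] is the prescribed displacement on the compliant wall and
    d[-1] = 0 on the fixed boundary; interior values are relaxed by
    Gauss-Seidel sweeps.
    """
    d = [0.0] * n_nodes
    d[0] = wall_displacement
    for _ in range(sweeps):
        for i in range(1, n_nodes - 1):
            d[i] = 0.5 * (d[i - 1] + d[i + 1])
    return d

dt, normal_velocity = 0.05, 0.2            # illustrative values
increment = dt * normal_velocity           # displacement of the wall node in one step
d = harmonic_extension_1d(increment)

nodes = [i / 10 for i in range(11)]        # reference positions of the vertices
moved = [x + di for x, di in zip(nodes, d)]
```

In one dimension the harmonic extension is just a linear ramp from the wall value to zero, so the vertices keep their ordering; in three dimensions the same idea requires a full Laplace solve per time step, which is why the text notes that it is expensive.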
5.2 Comparison of the Two Methods
On the problem described earlier both methods give very similar results as shown on Fig. 2. The geometry is updated for visualisation purposes with a multiplicative factor 100.
The geometry is a section of the aorta obtained from an MRI scan. It has 4991 vertices, giving 19964 degrees of freedom for each linear system for $[u_1^{m+1}, u_2^{m+1}, u_3^{m+1}, p^{m+1}]$. The pressure drop from inflow section on the right to outflow section on the
left is p R = 6 cos2 ( t) and the results are shown at t = 0.8. On the smaller cross
sections a pressure drop equal to p R /2 is imposed. Problem 1 and Problem 2 are
solved for comparison with t = 0.05/, = 0.001, b = 200. Results are shown
on Fig. 3.
For Problem 1, the computation took 198 s on a MacBook Pro 15-inch, 2012, 2.3 GHz Core i7. For Problem 2 it took 180 s. The results are very similar with some difference on the pressure but very little on the velocities.
6 Inflow/Outflow Conditions by PML
We end this article with an idea to address the problem of loss of stability due to the creation of reverse flow in unwanted regions because of the boundary conditions on the artificial inflow and outflow sections.
We borrow the idea from the PML literature (see for example [9]) and add to the artery geometry a viscous buffer after $\Gamma_{out}$ where $\mu = \mu_1 \gg \mu_{blood}$ (and similarly before $\Gamma_{in}$ but we present the theory applied to the outflow section only).
Consider a geometry $\Omega$ where the exit section is $\Gamma_o = \{0\}\times[0, h]$ in 2D where pressure is set to $p_0$ while pressure is set to $p_1$ on entry. Assume that we impose a parabolic flow $u = K y(h - y)$ at the exit of a viscous buffer $\Omega_L = [-L, 0)\times[0, h]$, i.e. on $\{-L\}\times[0, h]$. Now we solve the Navier-Stokes equations on $\Omega\cup\Omega_L$. The problem is to choose $K$ so that the pressure on the initial outflow boundary $\Gamma_o$ is unchanged in the mean, namely $\bar p_0 := h^{-1}\int_{\Gamma_o} p_0\,dy$.
Because at every time step the system to solve is linear we shall adjust $K$ by superposition so that the mean pressure is $\bar p_0$ on $\Gamma_{out}$. Since $\bar p|_{\Gamma_{out}} \approx \bar p_1 + (\bar p_2 - \bar p_1)\frac{K - K_1}{K_2 - K_1}$, where $\bar p_1$ is the mean pressure computed with $K = K_1$ and $\bar p_2$ the mean pressure when $K = K_2$, then
$$K = K_1 + (K_2 - K_1)\,\frac{\bar p_0 - \bar p_1}{\bar p_2 - \bar p_1} \tag{32}$$
This requires solving the linear Stokes-like system 3 times at each time step. We can also add $K$ to the unknowns of the Stokes-like linear system and add $\int_{\Gamma_{out}} p = |\Gamma_{out}|\,\bar p_0$ to the equations; we used this second solution in the numerical tests because it is much less computer intensive.
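The superposition argument behind (32) can be sketched in a few lines of Python. Since the mean outflow pressure depends affinely on the amplitude K of the imposed parabolic profile, two trial solves determine the amplitude that reaches the target. In the sketch below `toy_mean_pressure` is a hypothetical stand-in for one linear Stokes-like solve followed by averaging the pressure on the outflow section; its coefficients, the function names and the default trial amplitudes are illustrative assumptions.

```python
def adjust_amplitude(mean_pressure, target, K1=0.0, K2=1.0):
    """Amplitude K from formula (32), using two trial solves at K1 and K2."""
    p1 = mean_pressure(K1)
    p2 = mean_pressure(K2)
    return K1 + (K2 - K1) * (target - p1) / (p2 - p1)

def toy_mean_pressure(K):
    """Hypothetical affine response of the mean outflow pressure to K."""
    return 2.5 - 0.4 * K

target = 1.3
K = adjust_amplitude(toy_mean_pressure, target)
achieved = toy_mean_pressure(K)   # third solve, with the adjusted amplitude
```

The two trial solves plus the final one account for the three solves per time step mentioned above; treating K as an extra unknown with the mean-pressure constraint, as the text prefers, reaches the same amplitude with a single augmented solve.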
Fig. 2 Left surface of equal pressure at $t = 0.75$ computed by solving Problem 1 with $P^2 - P^1 - P^1$ elements and a penalization of the condition $u\times n = 0$. Right same as left but with Problem 2 and a $P^2 - P^1$ element
The idea is tested numerically on a quarter of a 2D-torus with radii 0.6 and 1 with $\mu = 0.002$ and a pressure drop equal to $\cos(t) + \cos(3t)$, $t \in (0, 25)$. The PML viscosity is $\mu_1 = 0.2$. A PML region is added to both ends of the tube. Results are shown on Fig. 4.
Fig. 3 Computation of $[u, p]$ for Problems 1 & 2 for a portion of an aorta (shown upside down). Top with Problem 1: the pressure is shown at $t = 0.8$ on the left on the deformed geometry. On the right the third component of the velocity $w$ is shown on the fixed geometry. Bottom same for Problem 2
The results look very different and that is because both computations do not have
the same inflow and outflow conditions on the original inflow/outflow boundaries.
In one case the pressure is imposed pointwise with $u\times n = 0$, in the PML case
the mean pressure is imposed and no conditions are imposed on the velocity but
parabolic velocity is imposed on the inflow/outflow of the PML boundaries.
The method will be tested in 3D and reported in a future publication.
Fig. 4 Left Geometry for the flow with two PML regions added. Center the velocity vectors computed without the PML; notice the back flow in the yellow region. Right the same flow (velocity vectors) computed with the two PML regions. The pressure drop between the two inner boundaries (corresponding to the top and left boundaries of the geometry on the center figure) is the same as in the center figure
7 Conclusion
In this article we have presented problems and solutions encountered with fluid-structure interactions when a middle solution is sought: neither the full problem with moving geometries, because it is too expensive, nor rigid walls, because they are not precise enough and do not give the geometrical deformation.
The solution adopted here is to delay the geometrical deformations to the graphic display only. But in doing so we have to work with the Navier-Stokes equations with unusual boundary conditions which require unusual finite element discretizations.
For these intermediary problems we have shown that it is important to preserve energy. Furthermore we can choose either to match exactly the normal component of the solid and fluid normal stress tensor or to match approximately the 3 components of the normal stresses by relaxing slightly the no slip condition.
In all cases the problem of back flows in the pulsating cases remains. We have suggested a possible solution and made some preliminary tests.
References
1. Boffi D, Gastaldi L (2003) A FEM for the immersed boundary method. Comput Struct 81:491–501
2. Deparis S, Fernandez MA, Formaggia L (2003) Acceleration of a fixed point algorithm for fluid-structure interaction using transpiration conditions. ESAIM: M2AN 37(4):601–616
3. Fernandez M (2013) Incremental displacement-correction schemes for incompressible fluid-structure interaction. Numer Math 123:21–65
4. Formaggia L, Gerbeau JF, Nobile F, Quarteroni A (2001) On the coupling of 3D and 1D Navier-Stokes equations for flow problems in compliant vessels. Comput Methods Appl Mech Eng 191:561–582
5. Formaggia L, Quarteroni A, Veneziani A (eds) (2009) Cardiovascular mathematics, MS and A series. Springer, Milano
6. Girault V (1988) Incompressible finite element methods for Navier-Stokes equations with nonstandard boundary conditions in R^3. Math Comp 51(183):55–74
7. Gonzalez O (2000) Exact energy and momentum conserving algorithms for general models in nonlinear elasticity. Comput Methods Appl Mech Eng 190:1763–1783
8. Gonzalez O, Simo JC (1996) On the stability of symplectic and energy-momentum algorithms for nonlinear Hamiltonian systems with symmetry. Comput Methods Appl Mech Eng 134:197–222
9. Hu FQ, Li XD, Lin DK (2008) Absorbing boundary conditions for nonlinear Euler and Navier-Stokes equations based on the perfectly matched layer technique. J Comp Phys 227:4398–4424
10. Le Tallec P (2001) Fluid structure interaction with large structural displacements. Comput Methods Appl Mech Eng 190:3039–3067
11. Nobile F, Vergara C (2008) An effective fluid-structure interaction formulation for vascular dynamics by generalized Robin conditions. SIAM J Sci Comp 30(2):731–763
12. Peskin C, McQueen D (1989) A three dimensional computational method for blood flow in the heart. I. Immersed elastic fibers in a viscous incompressible fluid. J Comput Phys 81:372–405
13. Peskin C (2002) The immersed boundary method. Acta Numerica 11:479–517
14. Pichon KG, Pironneau O (2014) Pressure boundary conditions for blood flows. Applied Math Conf in honor of L. Tartar. Proc published in AIMS journal (to appear)
15. Pironneau O (1986) Conditions aux limites sur la pression pour les équations de Stokes et de Navier-Stokes. C R Acad Sci Paris Sér I Math 303(9):403–406
16. Pironneau O (1989) Finite element methods for fluids. Wiley, New York
17. Thiriet M (2011) Biomathematical and biomechanical modeling of the circulatory and ventilatory systems. Control of cell fate in the circulatory and ventilatory systems, vol 2. Springer, New York
18. Usabiaga F, Bell J, Buscalioni R, Donev A, Fai T, Griffith B, Peskin C (2012) Staggered schemes for fluctuating hydrodynamics. Multiscale Model Sim 10:1369–1408
19. Vignon-Clementel I, Figueroa A, Jansen K, Taylor CA (2006) Outflow boundary conditions for three-dimensional finite element modeling of blood flow and pressure in arteries. Comput Methods Appl Mech Eng 195:3776–3796
Patient-Specific Cardiovascular Fluid Mechanics
Analysis with the ST and ALE-VMS Methods
Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Christopher C. Long,

Alison L. Marsden and Kathleen Schjodt
Abstract This chapter provides an overview of how patient-specific cardiovascular fluid mechanics analysis, including fluid–structure interaction (FSI), can be carried out with the space–time (ST) and Arbitrary Lagrangian–Eulerian (ALE) techniques developed by the first three authors' research teams. The core methods are the ALE-based variational multiscale (ALE-VMS) method, the Deforming-Spatial-Domain/Stabilized ST formulation, and the stabilized ST FSI technique. A good number of special techniques targeting cardiovascular fluid mechanics have been developed to be used with the core methods. These include (i) arterial-surface extraction and boundary condition techniques, (ii) techniques for using variable arterial wall thickness, (iii) methods for calculating an estimated zero-pressure arterial geometry, (iv) techniques for prestressing of the blood vessel wall, (v) mesh generation techniques for building layers of refined fluid mechanics mesh near the arterial walls, (vi) a special mapping technique for specifying the velocity profile at an inflow boundary with non-circular shape, (vii) a scaling technique for specifying a more realistic volumetric flow rate, (viii) techniques for the projection of fluid–structure interface stresses, (ix) a recipe for pre-FSI computations that improve the convergence of the FSI computations, (x) the Sequentially-Coupled Arterial FSI technique and its multiscale versions, (xi) techniques for calculation of the wall shear stress (WSS) and oscillatory shear index (OSI), (xii) methods for stent modeling and mesh generation, (xiii) methods for calculation of the particle residence time, and (xiv) methods for an estimated element-based zero-stress state for the artery. Here we provide an overview of the special techniques for stent modeling and mesh generation and calculation of the residence time with application to pulsatile ventricular assist device (PVAD). We provide references for some of the other special techniques. With results from earlier computations, we show how the core and special techniques work.
K. Takizawa (corresponding author)
Department of Modern Mechanical Engineering and Waseda Institute for Advanced Study,
Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
Y. Bazilevs
Structural Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla,
CA 92093, USA
T. E. Tezduyar · K. Schjodt
Mechanical Engineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
C. C. Long
T-3 Fluid Dynamics and Structural Mechanics, Los Alamos National Laboratory,
Los Alamos, NM 87545, USA
A. L. Marsden
Mechanical and Aerospace Engineering, University of California, San Diego,
9500 Gilman Drive, La Jolla, CA 92093, USA
1 Introduction
Cardiovascular fluid mechanics modeling even without fluid–structure interaction (FSI) poses formidable computational challenges in some cases. Still, with everything else being equal, taking the FSI between the blood flow and arterial walls into
account makes the modeling far more challenging. The Reynolds number at the peak
systole, which characterizes the flow regime, is of the order several hundreds in cere-
bral arteries and a few thousands in the aortic arch. This range of Reynolds numbers
corresponds to complex 3D flows, which are, nevertheless, laminar. Mild turbulence
is seen only under rare circumstances. Consequently, in general, state-of-the-art com-
putational fluid dynamics (CFD) software can deliver stable and reasonably accurate
solutions to cardiovascular fluid mechanics problems that do not require special
methods. However, just CFD modeling of blood flow assumes that the blood vessel
walls are rigid, which is not the case. Vascular walls can undergo large deforma-
tions due to hemodynamic forces. Wall deformations alter the blood flow patterns,
which, in turn, alter the hemodynamics. Therefore, for the computational modeling
to be realistic, the blood flow and wall deformation need to be treated in a coupled
fashion, and that makes it an FSI problem. That is why, although we also present
special methods for and computations from some challenging cardiovascular fluid
mechanics modeling without FSI, we emphasize cardiovascular FSI modeling in this
book chapter.
The blood flow is governed by the Navier–Stokes equations of incompressible
flows. The vascular wall is typically assumed to behave as an elastic material that is
allowed to undergo large deformations [1]. For simplicity, a hyperelastic framework
is typically employed for this purpose (see [1, 2] and references therein). At the
luminal surface kinematic and traction compatibility conditions are assumed to hold
point-wise. That is, at every point on the luminal surface, the blood flow velocity
must equal that of the wall, and the tractions associated with the fluid and structure
domains must balance. These conditions are both physically meaningful and lead
to a mathematically well-posed problem. To complete the formulation of the FSI
problem, the motion of the blood vessel must be determined fully by point-wise
computation of the time-dependent wall displacement. This added complexity of
the FSI modeling relative to just CFD modeling creates more challenges for the
computation, which are discussed here.
Methods for solving the fully-discretized coupled equations are of two kinds:
loosely- and strongly-coupled. The strongly-coupled solution methods can further
be categorized as block-iterative, direct and quasi-direct coupling techniques
(see [3, 4] for the terminology). The strongly-coupled solution methods include the
monolithic methods, which typically imply matching discretizations at the fluid–structure interface. The quasi-direct and direct coupling methods are applicable to
cases with nonmatching fluid and structure meshes at the interface, become equivalent
to monolithic methods when the interface meshes are matching, and yield more robust
algorithms than block-iterative coupling does, especially when the structure is light
compared to the fluid masses involved in the FSI dynamics.
In loosely-coupled approaches, the equations for fluid, solid and mesh motion
are solved sequentially, in an uncoupled fashion. Typically, within each time step,
the increment in the fluid solution is computed on a fixed spatial domain, the fluid
forces on the structure are collected, and the structural-solution increment is com-
puted, which is followed by an update of the mesh position. This enables the use
of existing fluid and structural solvers, a significant motivation for adopting this
approach. Yet, difficulties in the form of lack of convergence have been noted in
a number of cases, including some cardiovascular FSI cases. In strongly-coupled
approaches, the equations for fluid, solid and mesh motion are solved in a fully cou-
pled fashion, and in a more direct way in the case of quasi-direct and direct coupling.
The main advantage is that monolithic and other quasi-direct and direct coupling
techniques are more robust in that many of the convergence problems encountered
with the loosely-coupled and block-iterative coupling approaches are completely
avoided.
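The sensitivity of partitioned coupling to a light structure can be illustrated with a toy scalar model, which is not the formulation used in this chapter. Assume the structure obeys m_s a = F + f, where the fluid reacts with an added-mass load f = -m_a a; the names m_s, m_a, F and the relaxation factor omega are hypothetical. A coupled solve gives a = F / (m_s + m_a) directly, whereas a block iteration alternates a "fluid solve" and a "structure solve" and converges without relaxation only when m_a < m_s.

```python
def block_iterative_acceleration(m_s, m_a, force, omega=1.0, tol=1e-12, max_iter=200):
    """Fixed-point (block-iterative) coupling of a toy added-mass problem.

    Each pass evaluates the fluid load from the current structural
    acceleration, solves the structure with that load, and applies a
    relaxation factor omega. Returns (acceleration, iterations, converged).
    """
    a = 0.0
    for k in range(1, max_iter + 1):
        fluid_load = -m_a * a                    # "fluid solve" with known wall motion
        a_structure = (force + fluid_load) / m_s # "structure solve" with known load
        a_next = (1.0 - omega) * a + omega * a_structure
        if abs(a_next - a) < tol:
            return a_next, k, True
        a = a_next
    return a, max_iter, False

def coupled_acceleration(m_s, m_a, force):
    """Solution of the coupled system in one step (the monolithic analogue)."""
    return force / (m_s + m_a)

# Heavy structure: plain block iteration converges
heavy = block_iterative_acceleration(m_s=10.0, m_a=1.0, force=1.0)
# Light structure: plain block iteration diverges, under-relaxation recovers convergence
light_plain = block_iterative_acceleration(m_s=1.0, m_a=3.0, force=1.0)
light_relaxed = block_iterative_acceleration(m_s=1.0, m_a=3.0, force=1.0, omega=0.4)
exact_light = coupled_acceleration(m_s=1.0, m_a=3.0, force=1.0)
```

In this toy setting the iteration error is multiplied by 1 - omega (1 + m_a/m_s) at each pass, which shows why a structure that is light relative to the fluid mass forces either strong under-relaxation or a more tightly coupled solve.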
Although introducing FSI to blood flow increases modeling and simulation
complexity, the computations produce physiologically more realistic results than
those generated by just CFD. Effects of including wall elasticity in vascular sim-
ulations have been examined, for example, for carotid artery [5, 6], for cerebral
aneurysms [7, 8] and for the total cavopulmonary connection [9], which was the
first use of FSI for patient-specific pediatric cardiology applications. The rigid-wall
assumption consistently shows an overestimation of the wall shear stress (WSS)
compared to the flexible wall, in some cases by as much as 50 %. Some qualitative
and quantitative differences between the rigid- and flexible-wall simulations were
also observed for the blood flow patterns. We note that the blood vessels of young children are significantly more flexible than those of adults, making FSI modeling especially important for pediatric cardiology [10].
Unlike the rigid-wall assumption, FSI enables simulation of the complete mechan-
ical environment of the vascular wall, both the loads acting on the wall due to blood
flow and the loads acting within the wall. The latter loads are particularly impor-
tant for they act on the cells that control wall structure and function, which in turn
may change the bulk elastic properties of the wall and hence the hemodynamics.
As a result, considering the associated wall mechanobiology and the importance of
mechanics in understanding possible cellular responses presents an important future
research direction (see [11] for preliminary results).
In recent years we have seen a rapid increase in the volume and level of research in
cardiovascular fluid mechanics modeling (see, for example, [5, 6, 12–22]). Much of
this has been in patient-specific modeling of arteries, especially those with aneurysm.
The preferred method of handling the moving interfaces involved in FSI modeling
has mostly been the Arbitrary Lagrangian–Eulerian (ALE) finite element formulation [23].
A residual-based variational multiscale (RBVMS) formulation in the context of ALE
methods (herein referred to as ALE-VMS) was introduced in [24] and
employed in cardiovascular FSI simulations of a number of patient-specific models
of major blood vessels.
One of the earliest space–time (ST) methods targeting FSI modeling is the
Deforming-Spatial-Domain/Stabilized ST (DSD/SST) formulation [25, 26]. It was
introduced by the Team for Advanced Flow Simulation and Modeling (TAFSM) in
1991 as a general-purpose interface-tracking (i.e. moving-mesh) technique for com-
putation of flow problems with moving boundaries and interfaces. It is based on the
Streamline-Upwind/Petrov–Galerkin (SUPG) [27] and Pressure-Stabilizing/Petrov–Galerkin
(PSPG) [25, 28] methods. An earlier version of the pressure stabilization,
for Stokes flows, was introduced in [29]. The stabilized ST formulations were intro-
duced and tested earlier by other researchers in the context of problems with fixed
spatial domains (see [30]).
Patient-specific arterial FSI modeling with the DSD/SST formulation was first
reported by Torii et al. in a 2004 journal article [5] published by the Japan Soci-
ety of Mechanical Engineers. Over the years following that, Torii et al. conducted
one of the most extensive series of patient-specific arterial FSI modeling of cerebral
aneurysms [6, 31]. The cases studied in these articles by Torii et al. were almost all for
middle cerebral arteries, and the geometries were constructed from computed tomog-
raphy (CT) images. In these arterial FSI computations the DSD/SST formulation was
used together with the mesh update methods [32] developed by the TAFSM and
was implemented with block-iterative coupling [33]. The inflow boundary condition
used in the computations is a pulsatile velocity profile, which closely represents the
measured flow rate during a cardiac cycle.
New generation DSD/SST formulations, with increased scope, robustness and
efficiency, were introduced by the TAFSM in [4]. The stabilized ST FSI (SSTFSI)
technique, which is based on the new-generation DSD/SST formulations, was also
introduced in [4]. The ST-VMS method [34] is the variational multiscale version
of the DSD/SST method, which was originally called DSD/SST-VMST (i.e. the
version with the VMS turbulence model) in [35]. The VMS components are from the
RBVMS method given in [36–39]. The original DSD/SST formulation was named
DSD/SST-SUPS in [35] (i.e. the version with the SUPG/PSPG stabilization), which
was also called ST-SUPS in [17].
The SSTFSI technique was extended by the TAFSM in [13] to arterial FSI
modeling, with emphasis on arteries with aneurysm. The arterial geometries were
approximations to patient-specific image-based geometries, mostly to those reported
by Torii et al. A number of special techniques for arterial FSI were developed by the
TAFSM in conjunction with the SSTFSI technique. These include techniques for
calculating an estimated zero-pressure (EZP) arterial geometry [40, 41], a special
mapping technique for specifying the velocity profile at an inflow boundary with
non-circular shape [14], techniques for using variable arterial wall thickness [14,
41], mesh generation techniques for building layers of refined fluid mechanics mesh
near the arterial walls [14, 41, 42], a recipe for pre-FSI computations that improve
the convergence of the FSI computations [13], the Sequentially-Coupled Arterial FSI
(SCAFSI) technique [43, 44] and its multiscale versions [44], and techniques [41]
for the projection of fluid–structure interface stresses, calculation of the WSS and
calculation of the oscillatory shear index (OSI). In FSI modeling of three cere-
bral artery segments with aneurysm reported by the TAFSM in [45], the arterial
geometries came from 3D rotational angiography (3DRA). In [45], the TAFSM
also addressed the computational challenges related to extraction of the arterial-
lumen geometry from 3DRA, generation of a mesh for that geometry, and building
a good starting point for the FSI computations. In addition to these computational
challenges common to all three cases, the computational challenges encountered in
some of these cases individually were addressed in [45]. In [46, 47], new techniques
were presented for determining the shrinking amount in the EZP process, the arterial
wall thickness, and the thickness of the layers of refined fluid mechanics volume
mesh near the arterial walls. These techniques were originally proposed in Remark 3
of [45], but the description was very brief. In [46, 47], a new scaling technique
was also introduced for specifying a more realistic volumetric flow rate. In [9], a
technique was proposed to use Laplace's equation for specifying a variable vessel
wall thickness, and in [8, 48] a prestressing technique was developed for blood
vessels. The former addressed the challenge of how to specify spatially varying vessel
wall thickness. It inspired the idea of using Laplace's equation over the surface
mesh covering the lumen to specify a variable vessel wall thickness [45–47]. The
latter addressed the challenge presented by the fact that patient-specific blood ves-
sel geometry data comes from a configuration that is not stress-free, the same fact
that motivated earlier the development of methods for calculating an EZP arterial
geometry [40]. Both techniques are quite general and, most importantly, are inde-
pendent of the details of the patient-specific blood vessel geometry. Related to that,
recently, methods for an estimated element-based zero-stress state for the artery were
introduced in [21].
While modeling the FSI between the blood flow and arterial walls is one of the most
challenging problems in cardiovascular fluid mechanics, there are other complex
problems that are comparably challenging. Patient-specific computation of unsteady
blood flow in an artery with aneurysm and stent is one of them. Special methods
targeting that class of cardiovascular fluid mechanics problems were introduced
in [49], and a large set of computations were reported in [18, 49].
Thrombus formation (i.e., blood clotting) is a major problem in ventricular assist
devices (VADs), especially in pulsatile VADs (PVADs). Long residence times and
areas of recirculation or stagnation may lead to increased risk of thrombosis in
PVADs [19, 50]. A method was presented in [51] for computing the particle resi-
dence time, which is known to correlate with an increased risk of thrombogenesis.
Fig. 1 Flat-stent geometry (left) and arterial lumen geometry (right)
Fig. 2 Deformed stent (left) and split lumen geometry with the stent (right)
The method was developed in an ALE [17, 23] framework (the Eulerian case was
investigated in [52]), and is suitable for flows with moving boundaries and inter-
faces, including FSI. In [53], the recently-developed residence time formulation was
employed in the definition of the objective function for the FSI-based shape opti-
mization study of a current PVAD design.
In this book chapter we provide an overview of how patient-specific cardiovascu-
lar fluid mechanics analysis, including FSI, can be very effectively carried out with
the core and special ST and ALE techniques. For the governing equations and the
finite element formulations, including the ALE-VMS, DSD/SST and SSTFSI tech-
niques, we refer the interested reader to [17, 46, 54]. Special techniques for stent
Fig. 3 Flat stent with the periphery of the interior-boundary geometry (top) and stent mesh (bottom)
Fig. 4 Aneurysm (left) and parent (right) artery segments, separated by the stent
modeling and mesh generation and particle residence time calculation are described
in Sects. 2 and 3. In Sect. 4, we give references for some of the other special tech-
niques. The fluid (blood) and structure (blood vessel wall) properties and boundary
conditions are given in Sect. 5. We present the computations in Sects. 6 and 7, and
our concluding remarks in Sect. 8.
Fig. 5 Aneurysm artery segment showing regions of different thickness for the layers of refined mesh
2 Stent Modeling and Mesh Generation
This section is from [49]. Mesh generation of the cerebral artery with aneurysm and
stent requires numerous steps that include taking the flat-stent design and lumen
geometry and generating a fluid volume mesh representative of a stented artery with
aneurysm. We begin by mapping the flat stent to the deformed stent, which fits
across the neck of the aneurysm. The artery is separated into two segments, parent
and aneurysm. Layers of refined mesh are generated at the stent and arterial walls in
both segments. After the remaining volume mesh in each segment is generated, the
two segments are merged on the interior-boundary mesh containing the stent.
1. Prepare the lumen geometry and flat-stent model as shown in Fig. 1. We extract
the arterial surface geometry from medical images and generate a lumen geom-
etry reflective of the inflated arterial-wall structure through the process reported
in [45]. The flat-stent model was generated using the geometry of a Cordis Pre-
cise Pro Rx nitinol self-expanding stent (PC0630RXC) with a wire diameter of
about 0.1 mm.
2. Generate a NURBS surface slightly larger than the artery such that the surface
intersects the lumen geometry as shown in Fig. 2. We swept a NURBS sur-
face following the curvature of the parent artery and extending slightly beyond
the aneurysm neck. To simplify the mesh generation process, we only model the
portion of the stent crossing the neck of the aneurysm. The intersection of the
NURBS surface and lumen geometry is the periphery of the interior boundary
containing the stent.
3. Map the periphery of the interior boundary, described above, to the flat stent
and mesh that as shown in Fig. 3. We generate a triangular surface mesh using
ANSYS ICEM CFD meshing software (ICEM) and the geometry defined
by the flat stent and interior-boundary periphery. The maximum element size
specified in ICEM mesh generation leads to the width of the stent wire being
meshed with 3–4 elements. This ensures sufficient refinement to resolve the flow
on the stent. The flat-stent mesh is then mapped from the flat NURBS surface
to the deformed NURBS surface to form the interior-boundary mesh positioned
across the neck of the aneurysm.
4. Use the periphery of the interior-boundary mesh as a predefined set of element
edges, splitting the lumen geometry into parent and aneurysm segments as shown
in Fig. 4. This reduces complexity in mesh generation. We use ICEM to generate
the triangular surface meshes on the parent and aneurysm segments.
5. Using the surface meshes for the parent and aneurysm segments, we generate
layers of refined mesh on either side of the stent and near the arterial walls. We
use the process reported in [45] to generate the layers in the parent segment. In
generating the layers in the aneurysm segment, we first start by separating the
surface mesh into different regions as shown in Fig. 5. Due to the sharp angle of
the geometry, no layers are explicitly generated in the red region. We specify a
uniform thickness for the layers of refined mesh in the blue regions. The thickness
of the first layer is approximately equal to the first layer of refined mesh in the
parent segment. There are a total of four layers, each increasing in thickness
using a progression ratio of 1.75 (the same number of layers and progression
ratio used in [45]). To prevent element tangling, Laplace's equation is solved
over the green region of the surface mesh to determine the thickness growth from
essentially zero at the boundary with the red region to the desired layer thickness
at the blue region boundary. We generate each of the four layers in the aneurysm
segment separately and merge the layers (see Remark 2).
6. The rest of the fluid volume mesh is generated using ICEM. The innermost
surface of the layers of refined mesh is extracted from the volume mesh and
used as the surface mesh for generating the volume mesh in both the parent and
aneurysm segments. The inner volume mesh is then merged to the refined layers.
7. The parent and aneurysm fluid volume mesh segments are merged on the interior-
boundary mesh containing the stent. For the no-stent cases, all nodes are merged
on that interior-boundary mesh. For the single- and double-stent cases, the nodes
on the stent portion of the interior-boundary mesh are not merged and instead
colocated.
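The layer construction in step 5 can be sketched in a few lines. The snippet below is a minimal illustration, assuming a hypothetical first-layer thickness; only the four layers and the progression ratio of 1.75 come from the text. The linear ramp stands in for the surface Laplace solve in the green region, since in one dimension the solution of Laplace's equation with fixed end values is a straight line; the actual computation is carried out over the surface mesh.

```python
def layer_thicknesses(first_thickness, ratio=1.75, layers=4):
    """Thickness of each refined layer, growing geometrically away from the wall."""
    return [first_thickness * ratio**k for k in range(layers)]


def blended_thickness(target, s):
    """Linear ramp from zero (s = 0, red-region edge) to target (s = 1, blue-region edge)."""
    return target * min(max(s, 0.0), 1.0)


first = 0.01                      # hypothetical first-layer thickness, in mm
layers = layer_thicknesses(first)
total = sum(layers)               # combined thickness of the refined layers
halfway = blended_thickness(first, 0.5)
```

The combined thickness grows quickly with the ratio, which is one reason sharp corners such as the red region are left without explicit layers.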
Remark 1 We generate the double stent by overlaying two single flat-stent geome-
tries and translating one of them in two directions. We map the intersection of the
deformed NURBS surface and lumen geometry, which is again the periphery of the
interior boundary, to the flat double-stent geometry and mesh the double stent as
one mesh. The double-stent mesh is treated the same as the single-stent mesh in the
remaining mesh generation steps. Figure 6 shows the full single and double stents.
Remark 2 The mesh generation process for the layers of refined mesh in the
aneurysm segment presents challenges regarding tangled elements. With the mesh
refinement required by the problem, building the layers into the artery has the poten-
tial to create elements with negative Jacobians. Each layer must be checked for the
Jacobian values before generating the next layer.
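The check described in Remark 2 can be sketched as follows for linear tetrahedra. This is an illustrative routine, not the one used by the authors: the Jacobian is taken as the determinant of the three edge vectors emanating from the first node, and an element is flagged when that determinant is not positive. The node coordinates and connectivity are hypothetical.

```python
def tet_jacobian(p0, p1, p2, p3):
    """Determinant of the edge matrix [p1-p0, p2-p0, p3-p0] of a linear tetrahedron."""
    a = [p1[i] - p0[i] for i in range(3)]
    b = [p2[i] - p0[i] for i in range(3)]
    c = [p3[i] - p0[i] for i in range(3)]
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def tangled_elements(nodes, tets, tol=0.0):
    """Indices of tetrahedra whose Jacobian is not strictly greater than tol."""
    return [e for e, (i, j, k, l) in enumerate(tets)
            if tet_jacobian(nodes[i], nodes[j], nodes[k], nodes[l]) <= tol]


nodes = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
tets = [(0, 1, 2, 3),   # positively oriented
        (0, 2, 1, 3)]   # two nodes swapped: inverted element
bad = tangled_elements(nodes, tets)
```

Running such a check after each layer, as the remark suggests, localizes a tangled element to the layer that produced it.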
Fig. 6 Surface for single (left) and double (right) stents
3 Particle Residence Time Formulation
Consider the spatial domain $\Omega$, and the subdomain $V \subset \Omega$. It may be of interest
to know how long the material particles moving inside and through the domain $\Omega$
are residing in the subdomain $V$. (Hence, the term, residence time.) We can
formulate this as an initial-value problem:

$$\left.\frac{\partial \tau}{\partial t}\right|_{X} = H(x), \tag{1}$$

where

$$H(x) = \begin{cases} 1 & \text{if } x \in V \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

is the Heaviside function. The solution $\tau$ of Eq. (1) has the interpretation of the total
time that a particle, occupying a spatial position $x$ at time $t$, spent in the subdomain $V$.
The source term given by the Heaviside function ensures that the time is accumulated
only when the particle is inside the subdomain. In this framework, both $\Omega$ and $V$
may be time-dependent, which is the case here.
For applications involving flows in stationary or moving domains it is convenient
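to re-write Eq. (1) with respect to the Eulerian (or spatial) frame as

$$\left.\frac{\partial \tau}{\partial t}\right|_{x} + u \cdot \nabla \tau = H(x), \tag{3}$$

or the ALE frame as

$$\left.\frac{\partial \tau}{\partial t}\right|_{\hat{x}} + (u - v) \cdot \nabla \tau = H(x). \tag{4}$$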
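The step from Eq. (1) to Eqs. (3) and (4) is the usual change of frame for a time derivative. A brief sketch, writing $\tau$ for the residence-time field and assuming that $v$ denotes the velocity of the ALE reference frame (the mesh velocity) and $\hat{x}$ the corresponding referential coordinate, neither of which is defined explicitly in the text:

```latex
\left.\frac{\partial \tau}{\partial t}\right|_{X}
  = \left.\frac{\partial \tau}{\partial t}\right|_{x} + u \cdot \nabla \tau
  = \left.\frac{\partial \tau}{\partial t}\right|_{\hat{x}} + (u - v) \cdot \nabla \tau .
```

Under this assumption, setting $v = 0$ recovers the Eulerian form, and setting $v = u$ recovers the material (Lagrangian) form of Eq. (1).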
Fig. 7 Solution of the 1D model problem illustrating the particle residence time method
The following simple 1D example illustrates how the technique works. Let
$\Omega = (0, L)$, $V = (h_1, h_2) \subset (0, L)$, and the flow velocity $u$ is a positive
constant (see Fig. 7). In this setting, Eq. (3) reduces to

$$\frac{\partial \tau}{\partial t} + u \frac{\partial \tau}{\partial x} = H(x). \tag{5}$$

Because the flow is from left to right and the region $V$ is located downstream of
the inlet, particles at the inlet could not have spent any time inside $V$. As a result,
we set $\tau = 0$ at $x = 0$. Since this is a pure advection problem, boundary condition
at the outlet is left unspecified. Equation (5) has a steady-state solution:

$$\tau(x) = \begin{cases} 0 & \text{if } x \in [0, h_1] \\ \dfrac{x - h_1}{u} & \text{if } x \in [h_1, h_2] \\ \dfrac{h_2 - h_1}{u} & \text{if } x \in [h_2, L]. \end{cases} \tag{6}$$
Equation (6) implies that, once the transient response settles, prior to the interval
of interest, the residence time is zero. As particles enter the interval of interest moving
at constant speed the residence time is proportional to the distance from the leftmost
edge of the interval, and inversely proportional to flow speed. When particles exit
the interval of interest, the residence time stays constant. The analytical solution of
the differential equation given by Eq. (6) is in complete agreement with the expected
particle residence time for this simple case.
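The agreement between Eq. (5) and Eq. (6) can also be checked numerically. The sketch below marches Eq. (5) to steady state with a first-order upwind finite-difference scheme, which is chosen here only for brevity and is not the finite element formulation used in the chapter. The grid size, CFL number, step count and the values of `L`, `h1`, `h2` and `u` are all hypothetical.

```python
def steady_residence_time(length, h1, h2, u, cells=100, cfl=0.5, steps=2000):
    """First-order upwind march of Eq. (5) from a zero initial state towards steady state."""
    dx = length / cells
    dt = cfl * dx / u
    x = [i * dx for i in range(cells + 1)]
    source = [1.0 if h1 < xi < h2 else 0.0 for xi in x]   # H(x) sampled at the nodes
    tau = [0.0] * (cells + 1)
    for _ in range(steps):
        new = tau[:]
        for i in range(1, cells + 1):
            new[i] = tau[i] - cfl * (tau[i] - tau[i - 1]) + dt * source[i]
        new[0] = 0.0                                      # inlet condition at x = 0
        tau = new
    return x, tau


def exact_residence_time(x, h1, h2, u):
    """Steady-state solution given by Eq. (6)."""
    return (min(max(x, h1), h2) - h1) / u


L, h1, h2, u = 1.0, 0.3, 0.7, 2.0
x, tau = steady_residence_time(L, h1, h2, u)
max_error = max(abs(t - exact_residence_time(xi, h1, h2, u)) for xi, t in zip(x, tau))
```

The discrete solution differs from Eq. (6) by roughly one grid spacing divided by the flow speed, which comes from how the nodes at the ends of the interval are counted.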
Remark 3 Besides illustrating how the method works, this example also shows that
the proposed technique for calculating particle residence time is meaningful when
the solution for $\tau$ reaches a steady state. The time-periodic solution for $\tau$, which is
often the case with cardiovascular applications, also presents a situation from which
meaningful conclusions may be drawn.
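Two scalar measures of residence time proposed in [51, 52] are:

$$\mathrm{RT}_1 = \frac{1}{T\,|V|} \int_T \int_\Omega H(x)\, \tau(x, t)\, d\Omega\, dt, \tag{7}$$

where

$$|V| = \frac{1}{T} \int_T \int_\Omega H(x)\, d\Omega\, dt \tag{8}$$

is the time-averaged volume of $V$, and

$$\mathrm{RT}_2 = \frac{1}{T\,|Q|} \int_T \int_{\partial V} \tau\, (u - v) \cdot n \; d\partial V\, dt, \tag{9}$$

where

$$|Q| = \frac{1}{T} \int_T \int_{\partial V^+} (u - v) \cdot n \; d\partial V\, dt \tag{10}$$

is the time-averaged flow rate through the outflow boundary of $V$, denoted by $\partial V^+$.
In Eq. (9), the expression for $\mathrm{RT}_2$ may be simplified further as (see [51])

$$\mathrm{RT}_2 = \frac{|V|}{|Q|}. \tag{11}$$

The most attractive feature of this measure of residence time is that it requires no
information about $\tau$.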
Remark 4 Applied to the 1D model problem with solution given by Eq. (6) the two
residence time measures produce $\mathrm{RT}_1 = \frac{h_2 - h_1}{2u}$ and $\mathrm{RT}_2 = \frac{h_2 - h_1}{u}$, which are the
average and maximum particle residence time in V , respectively. For such a simple
flow situation it is clear that the maximum residence time occurs at the outflow.
However, in multiple dimensions, in the presence of complex, recirculating flow this
may not be the case. In the case of blood pumps, one can in fact assess the device
efficiency (meaning throughput efficiency) by comparing the residence time at the
outlet with that in the interior of the domain. Residence time maxima occurring in
the interior suggest the presence of persistent flow recirculation or stagnation zones
that tend to trap the material and thus lower the pump efficiency.
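The two values quoted in Remark 4 can be reproduced directly from Eqs. (7) and (11). The sketch below assumes a unit cross-section, so that the volume of the subdomain is its length and the flow rate is the velocity, and it integrates the steady solution of Eq. (6) with the midpoint rule. The numerical inputs are hypothetical.

```python
def rt_measures_1d(h1, h2, u, samples=1000):
    """RT1 (Eq. 7) and RT2 (Eq. 11) for the steady 1D solution of Eq. (6)."""
    volume = h2 - h1                      # |V| for a unit cross-section
    flow_rate = u                         # |Q| through the outflow end x = h2
    dx = volume / samples
    # Midpoint rule for the integral of (x - h1) / u over the subdomain.
    integral = sum(((h1 + (k + 0.5) * dx) - h1) / u * dx for k in range(samples))
    rt1 = integral / volume
    rt2 = volume / flow_rate
    return rt1, rt2


rt1, rt2 = rt_measures_1d(h1=0.3, h2=0.7, u=2.0)
```

In this setting the second measure is exactly twice the first, matching the average and maximum interpretation given in the remark.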
An alternative approach to computing residence time is based on the dye injection
technique. We use an advection equation as before, set the source term to zero, and
apply a Dirichlet boundary condition on the scalar field $\tau$ at the inlet such that:

$$\tau = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases}, \tag{12}$$

where $n$ is the cycle number. (Note that $\tau$ is no longer particle residence time, but
rather dye concentration.) This has the effect of injecting a dye for one cycle, which
can be visualized as it moves through the domain. The volume of dye in the domain
at any given time can be written as $\int_V \tau \, dV$. The dye is then ejected over subsequent
cycles, and remaining dye volume is computed at the end of each cycle. The cycles
are repeated until at least 95 % of the dye is removed from the system.
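The dye-injection procedure can be sketched on a 1D duct with constant velocity. This is a toy model with a first-order upwind scheme, not the ALE computation used for the PVAD; the duct length, velocity, cycle period and grid are hypothetical, and the injected amount is taken as the dye volume at the end of the first cycle, which assumes that no dye has left the duct by then.

```python
def dye_washout_cycles(length=1.0, u=0.4, period=1.0, cells=100, cfl=0.5,
                       removed_target=0.95, max_cycles=50):
    """Count cycles until the target fraction of injected dye has left a 1D duct."""
    dx = length / cells
    dt = cfl * dx / u
    steps_per_cycle = round(period / dt)
    dye = [0.0] * (cells + 1)
    injected = None
    remaining = []                       # dye volume at the end of each cycle
    for cycle in range(1, max_cycles + 1):
        inlet = 1.0 if cycle == 1 else 0.0          # inlet condition of Eq. (12)
        for _ in range(steps_per_cycle):
            new = dye[:]
            for i in range(1, cells + 1):
                new[i] = dye[i] - cfl * (dye[i] - dye[i - 1])
            new[0] = inlet
            dye = new
        volume = sum(dye[1:]) * dx
        if injected is None:
            injected = volume            # assumes nothing has exited during cycle 1
        remaining.append(volume)
        if volume <= (1.0 - removed_target) * injected:
            return cycle, remaining
    return max_cycles, remaining


cycles, remaining = dye_washout_cycles()
```

In a plug flow the dye leaves as a block, so the cycle count is set mainly by the duct length; in a recirculating flow the remaining volume would decay much more slowly, which is what the measure is meant to expose.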
4 Other Special Techniques
For the interested reader, in this section we provide references to articles where special
techniques related to different components of arterial-geometry construction
and mesh generation can be found. These components include (a) arterial-surface
extraction from medical images [45]; (b) arterial-wall-thickness construction based
on patches [46] and based on solution of Laplace's equation over the arterial
volume [9] or lumen [46]; (c) fluid mechanics volume mesh generation, including
layers of refined mesh near the arterial walls [46]; (d) calculating the EZP arterial
geometry [46]; (e) calculating the blood vessel tissue prestress [8, 48]; and (f) cal-
culating an estimated element-based zero-stress state for the artery [21]. We also
provide references to articles where some additional special techniques can be found.
These special techniques include (g) a mapping technique for inflow boundaries [14];
(h) boundary condition techniques for inclined inflow and outflow planes [45]; (i) a
scaling technique for specifying realistic inflow rates [47]; (j) techniques [41] for
the projection of fluid–structure interface stresses; (k) a recipe for pre-FSI computations
that improve the convergence of the FSI computations [13]; (l) the SCAFSI
technique [55]; (m) methods for WSS and OSI calculations [41]; and (n) a precon-
ditioning technique [42].
5 Fluid and Structure Properties and Boundary Conditions
5.1 Fluid and Structure Properties
As it was done for the computations reported in [5], the blood is assumed to behave
like a Newtonian fluid. The density and kinematic viscosity are set to 1000 kg/m$^3$
and $4.0 \times 10^{-6}$ m$^2$/s. The material density of the arterial wall is known to be close
to that of the blood and therefore set to 1000 kg/m$^3$. The arterial wall is modeled
with the continuum element made of hyperelastic (Fung) material. The Fung material
constants $D_1$ and $D_2$ (from [56]) are $2.6447 \times 10^3$ N/m$^2$ and 8.365, and the penalty
Poisson's ratio is 0.45. Cerebral arteries are surrounded by cerebrospinal fluid, and
we expect that to have a damping effect on the arterial-wall dynamics. Therefore we
add a mass-proportional damping, which also helps in removing the high-frequency
modes of the structural deformation. The damping coefficient is chosen in such a
way that the structural mechanics computations remain stable at the time-step size
used. It is $1.5 \times 10^4$ s$^{-1}$.
5.2 Boundary Conditions
On the arterial walls, we specify no-slip boundary conditions for the flow. In the
structural mechanics part, as boundary condition at the ends of the arteries, we set
the normal component of the displacement to zero, and for one of those nodes we
also set to zero the tangential displacement component that needs to be specified to
preclude rigid-body motion. At the inflow boundary, we specify the velocity profile
as a function of time, using the technique introduced in [14]. We use two types of
conditions at the outflow boundaries. In the explicit version, at all outflow boundaries
of an artery segment we specify the same traction boundary condition. The traction
boundary condition is based on a pressure profile computed as described in [14]. In
the implicit version, we consider a class of outflow boundary conditions in which
the outlet traction is a function of the flow rate there (see [57] for details).
Remark 5 In the current TAFSM computations, the volumetric flow rate at the
inflow (calculated based on a velocity waveform representing the cross-sectional
maximum velocity) is scaled by a factor. The factor is determined in such a way that
the scaled flow rate, when averaged over the cardiac cycle, yields a target WSS for
Poiseuille flow over an equivalent cross-sectional area. The target WSS is 10 dyn/cm$^2$
in the current computations. This technique was introduced in [46, 47].
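The scaling in Remark 5 can be sketched with the Poiseuille relation between flow rate and wall shear stress in a circular tube, WSS = 4 mu Q / (pi R^3), where R is the radius of the circle with the same area as the inlet. The snippet is an illustration under stated assumptions, not the implementation from [46, 47]: the waveform samples are hypothetical, they are assumed to be equally spaced over the cardiac cycle, and the inlet is taken as a circle of diameter 3.13 mm.

```python
import math


def poiseuille_wss(flow_rate, area, viscosity):
    """Wall shear stress of Poiseuille flow in a circular tube of the given area."""
    radius = math.sqrt(area / math.pi)
    return 4.0 * viscosity * flow_rate / (math.pi * radius**3)


def flow_scale_factor(waveform, area, viscosity, target_wss):
    """Factor that makes the cycle-averaged flow rate yield the target WSS."""
    mean_flow = sum(waveform) / len(waveform)
    return target_wss / poiseuille_wss(mean_flow, area, viscosity)


mu = 1000.0 * 4.0e-6                       # density times kinematic viscosity, in Pa s
area = math.pi * (0.5 * 3.13e-3) ** 2      # assumed circular inlet, in m^2
raw = [0.4e-6, 0.9e-6, 1.5e-6, 0.8e-6, 0.5e-6, 0.4e-6]    # hypothetical samples, m^3/s
factor = flow_scale_factor(raw, area, mu, target_wss=1.0)  # 10 dyn/cm^2 equals 1 Pa
scaled = [factor * q for q in raw]
```

Because the Poiseuille wall shear stress is linear in the flow rate, a single multiplicative factor is enough to hit the target exactly for the cycle-averaged flow.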
6 Computations with the ST Methods
All computations were carried out in a parallel computing environment. In FSI mod-
eling, the fully-discretized, coupled fluid and structural mechanics and mesh-moving
equations are solved with the quasi-direct coupling technique (see Sect. 5.2 in [4]),
and the computations were completed without any remeshing. In solving the linear
equation systems involved at every nonlinear iteration, the GMRES search tech-
nique [58] is used with a diagonal preconditioner.
Fig. 8 Model-M6Acom. EZP shrinking amount over the surface (lumen) extracted from the medical
image (left), wall thickness over the shrunk lumen (middle), and structure mesh at zero pressure
(right). The color range represents a value range that increases from light to dark
6.1 FSI Modeling of a Cerebral Artery with Aneurysm
A sample was presented in [46] from a wide set of patient-specific cerebral-aneurysm
models computed recently [47], where the shrinking amount in the EZP process, the
arterial wall thickness, and the thickness of the layers of refined fluid mechanics mesh
are determined based on the solution of Laplace's equation over the surface mesh
covering the lumen (see Sect. 4). We present that sample also here. The length scales
used in conjunction with the trial ratios for the inflow and outflow boundaries are the
lumen diameters at those ends. The value specified for the thickness of the first layer
of elements at the inflow and outflow boundaries is 0.007 $\times$ (lumen diameter at those
ends). In these computations, the volumetric flow rate is specified by using the scaling
technique described in Remark 5. Figure 8 shows the EZP shrinking amount, wall
thickness, and structure mesh for the arterial model, which we call Model-M6Acom.
The diameter of the arterial lumen is 3.13 mm at the inflow end, and 2.12 mm at both
outflow ends. The hexahedral structure mesh has two layers of elements across the
arterial wall. For the layers of refined fluid mechanics mesh near the arterial wall, the
progression factor is 1.75. Figure 9 shows the tetrahedral fluid mesh at the lumen,
thickness of the first layer of elements near the arterial wall, and the mesh at the
inflow plane. The structure mesh has 17,574 nodes and 11,650 elements, with 5,858
nodes and 5,825 element faces at the interface. The fluid mesh has 33,040 nodes
and 192,112 elements, with 3,528 nodes and 6,996 element faces at the interface.
The Womersley parameter is 1.96 and the peak volumetric flow rate is 1.2 ml/s. This
is based on the duration of one cardiac cycle (1 s) and the representative diameter
is calculated from the inflow area corresponding to the shape when inflated to the
average pressure.
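The quoted Womersley parameter can be checked with the common definition, alpha = (D / 2) sqrt(omega / nu), with omega = 2 pi / T. The sketch below assumes that the representative diameter can be approximated by the inflow lumen diameter of 3.13 mm, whereas the chapter computes it from the inflow area at the average pressure, so the result is only indicative.

```python
import math


def womersley_number(diameter, period, kinematic_viscosity):
    """alpha = (D / 2) * sqrt(omega / nu), with omega = 2 * pi / period."""
    omega = 2.0 * math.pi / period
    return 0.5 * diameter * math.sqrt(omega / kinematic_viscosity)


alpha = womersley_number(diameter=3.13e-3, period=1.0, kinematic_viscosity=4.0e-6)
```

With these inputs the value comes out close to the 1.96 reported above.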
The computations are carried out with the SSTFSI-TIP1 technique (see [17, 46,
54] for details of the computational method). The (full) SSP option is used (see
Fig. 9 Model-M6Acom. Fluid mechanics mesh at the lumen and outflow planes (left), thickness
of the first layer of elements near the arterial wall (middle), and the mesh at the inflow plane (right).
All pictures are from the starting point of our computation cycle. The color range represents a value
range that increases from light to dark
Fig. 10 Model-M6Acom. WSS when the volumetric flow rate is maximum
Remarks 21 and 22 in [46]). The time-step size is $3.333 \times 10^{-3}$ s. The number of
nonlinear iterations per time step is 6. The number of GMRES iterations per nonlinear
iteration for the fluid + structure block was chosen such that mass balance is satisfied
to within at most 5 % for each case. The number of GMRES iterations is 300, and
this was sufficient for obtaining good mass balance. For all six nonlinear iterations
the fluid scale is 1.0 and the structure scale is 100. For the mesh moving block the
number of GMRES iterations is 30. Figure 10 shows the WSS when the volumetric
Fig. 11 Model-M6Acom. OSI
flow rate is maximum. Figure 11 shows the OSI, calculated with the technique that
excludes rigid-body rotations from the calculation (see [17, 41, 46, 54]).
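For reference, the widely used definition of the OSI at a wall point is 0.5 (1 - |integral of the WSS vector| / integral of the WSS magnitude), taken over one cardiac cycle. The sketch below implements that standard definition for equally spaced samples; it does not include the exclusion of rigid-body rotations used in the computations reported here, and the sample histories are hypothetical.

```python
import math


def oscillatory_shear_index(wss_history):
    """OSI = 0.5 * (1 - |sum of WSS vectors| / sum of WSS magnitudes)."""
    total = [sum(v[i] for v in wss_history) for i in range(3)]
    magnitude_of_sum = math.sqrt(sum(c * c for c in total))
    sum_of_magnitudes = sum(math.sqrt(sum(c * c for c in v)) for v in wss_history)
    if sum_of_magnitudes == 0.0:
        return 0.0                       # no shear at all: treat as non-oscillatory
    return 0.5 * (1.0 - magnitude_of_sum / sum_of_magnitudes)


steady = [(1.0, 0.0, 0.0)] * 4                        # direction never changes
reversing = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)] * 2   # direction flips every sample
osi_steady = oscillatory_shear_index(steady)
osi_reversing = oscillatory_shear_index(reversing)
```

The index ranges from 0 for a unidirectional WSS history to 0.5 for one that reverses completely.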
6.2 Fluid Mechanics Modeling of a Cerebral Artery with Aneurysm and Stent
This subsection is from [49]. Endovascular stent placement across the neck of an
intracranial aneurysm can lead to aneurysm occlusion and thrombosis. We compare
the flow field of arterial geometries before and after virtual stenting to assess the
changes. Select aneurysms require treatment using two or more stents to sufficiently
alter the flow field allowing for thrombosis. The test computations include a before-
stenting case and after-stenting cases for both single- and double-stent treatments
to compare the effectiveness of stenting with multiple stents. Section 6.2.1 details
the parameters for the arterial geometry used in the computations. In Sect. 6.2.2 we
compare hemodynamic values before and after stenting.
6.2.1 Computational Model
A patient-specific cerebral artery with aneurysm is studied at three states: before
stenting, after stenting with a single stent, and after stenting with two stents. The
inlet and outlet diameters, peak volumetric flow rate, and the Womersley number
are 3.7 mm, 2.9 mm, 2.05 ml/s, and 2.33, respectively. The lumen geometry and the
fluid mechanics mesh for the single-stent case are shown in Fig. 12. The cross-section
view shows the refined mesh at the aneurysm neck on either side of the boundary
separating the aneurysm from the parent artery. The number of nodes for the no-
stent, single-stent and double-stent meshes are 527,323, 566,049 and 662,431, and
the number of elements are 3,168,305, 3,300,182 and 3,736,603. All computations
presented in Sect. 6.2.2 are for zero-thickness representation of the stent.
Fig. 12 Arterial lumen geometry obtained from voxel data (left) and the fluid mechanics mesh for
the single-stent case, with cross-section and inflow plane views
The ST-VMS method is used, with the stabilization parameters as given by Eq. (7)
in [4] for $\tau_M (= \tau_{\mathrm{SUPS}}) = \tau_{\mathrm{SUPG}}$ and Eq. (37) in [59] for $\nu_C (= \nu_{\mathrm{LSIC}}) = \nu_{\mathrm{HRGN}}$. The
time step size is $3.333 \times 10^{-3}$ s. The number of nonlinear iterations per time step
is 4, and the number of GMRES iterations per nonlinear iteration for the no-stent,
single-stent and double-stent cases is 1,000, 1,500 and 1,500.
6.2.2 Comparative Study
Inducing thrombosis in an aneurysm requires altering the hemodynamics at the
aneurysm. Inserting a stent changes the pattern and amount of blood flow from
the parent artery to the aneurysm, influencing stasis within the aneurysm. The stent-free
area at the neck of the aneurysm is reduced to approximately 85 and 71 % in
the single- and double-stent cases. We compare the fluid mechanics before and after
stenting by analyzing the ratio of the aneurysm-inflow rate to the time-averaged
parent-artery-inflow rate, $Q_A / Q_P$, the spatially averaged kinetic energy and vorticity in
the aneurysm, and OSI. The aneurysm-inflow rate is calculated by integrating the
magnitude of the normal component of the velocity over the interior-boundary mesh
containing the stent and dividing that by 2. We divide by 2, because the integral of the
magnitude of the normal component of the velocity measures twice the inflow rate.
The effectiveness of stenting using either the single or double stent depends on the
degree to which the flow characteristics were altered and also the arterial geometry
and size of the aneurysm. The higher OSI observed in stent cases follows the belief
that regions with increased OSI prompt thrombus formation [60, 61].
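The aneurysm-inflow rate described above can be sketched as a sum over the faces of the interior-boundary mesh. The face data below are hypothetical; the function simply integrates the magnitude of the normal velocity and halves the result, which works because, for an incompressible fluid and a fixed aneurysm volume, the flow entering through the interface equals the flow leaving through it.

```python
def aneurysm_inflow_rate(normal_velocities, face_areas):
    """Half of the integral of |u . n| over the interior-boundary faces."""
    return 0.5 * sum(abs(un) * a for un, a in zip(normal_velocities, face_areas))


# Hypothetical face data: two faces carry flow in, two carry the same volume out.
un = [2.0, 1.0, -1.5, -1.5]      # normal velocity on each face, cm/s
areas = [0.1, 0.1, 0.1, 0.1]     # face areas, cm^2
q_in = aneurysm_inflow_rate(un, areas)
inflow_only = sum(v * a for v, a in zip(un, areas) if v > 0)
```

Using the magnitude avoids having to decide in advance which part of the interface acts as the inflow region, since that region moves during the cardiac cycle.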
The aneurysm has a volume of 0.10 cm$^3$ and approximate neck area of 0.47 cm$^2$.
The total area in the neck blocked by the stent in the single- and double-stent cases is
0.07 and 0.13 cm$^2$. Figure 13 shows the aneurysm velocity magnitude at peak flow
into the aneurysm. The parent artery has an average inflow rate of 0.62 ml/s. The
Fig. 13 Aneurysm velocity magnitude at peak flow into the aneurysm (panels: no stent, single stent, double stent; color scale: velocity in cm/s)
Fig. 14 Comparison of spatially averaged vorticity magnitude in the aneurysm (axes: time in s, vorticity in s$^{-1}$)
peak blood flow into and within the aneurysm occurs approximately 0.02 s before
peak inflow rate in the parent artery. The time-averaged $Q_A / Q_P$ decreases by 22 and 78 %
in the single- and double-stent cases. Similarly, the kinetic energy averaged in space
and time decreases by 72 % in the single-stent case and 92 % in the double-stent case.
The reduction in vorticity in the aneurysm caused by stenting is shown in Figs. 14
and 15. The vorticity, averaged in space and time, is reduced by 47 and 72 % in the
single- and double-stent cases.
7 Computations with the ALE-VMS FSI Method
In this section we present two examples of cardiovascular FSI computations, which
include patient-specific cerebral aneurysms and the PVAD. The cerebral aneurysm
results are taken from [48], while the PVAD computations are reported in [19, 51].
1 7 70 700
Vorticity (s 1 )
Fig. 15 Aneurysm vorticity magnitude at peak flow into the aneurysm
Computations were carried out in a parallel computing environment. The FSI equa-
tions are advanced in time using the generalized- time integrator proposed in [62]
for structural dynamics, and developed for fluid mechanics in [63] and FSI in [24].
A quasi-direct solution strategy is used, where the increments of the fluid and struc-
tural mechanics variables are obtained simultaneously [3, 4, 17, 64]. A Jacobian-
based mesh stiffening technique [17, 32, 65, 66] is used to move the fluid mechanics
mesh. The effect of the mesh motion on the fluid equations is omitted from the tan-
gent matrix for efficiency, as advocated in [57] for cardiovascular FSI applications. In
solving the linear equation systems involved at every nonlinear iteration, the GMRES
search technique is used with a block-diagonal preconditioner.
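To illustrate the time integrator named above, the following is a minimal sketch of the generalized-α method for a first-order scalar model problem y' = λy, with the parameters written in terms of the high-frequency spectral radius (here called `rho_inf`). The model problem, function name and parameter values are assumptions chosen for illustration. The FSI computations apply the method to the full coupled nonlinear system, which this sketch does not attempt.

```python
import math


def generalized_alpha_decay(lam, y0, dt, n_steps, rho_inf=0.5):
    """Integrate y' = lam * y with the first-order generalized-alpha method."""
    # Parameters expressed through the high-frequency spectral radius.
    alpha_m = 0.5 * (3.0 - rho_inf) / (1.0 + rho_inf)
    alpha_f = 1.0 / (1.0 + rho_inf)
    gamma = 0.5 + alpha_m - alpha_f

    y = y0
    ydot = lam * y0  # consistent initial rate
    for _ in range(n_steps):
        # Enforce ydot_{n+alpha_m} = lam * y_{n+alpha_f}, where
        #   y_{n+1} = y_n + dt*ydot_n + gamma*dt*(ydot_{n+1} - ydot_n),
        # and solve the resulting linear equation for ydot_{n+1}.
        rhs = lam * y + (lam * alpha_f * dt * (1.0 - gamma) - (1.0 - alpha_m)) * ydot
        lhs = alpha_m - lam * alpha_f * gamma * dt
        ydot_new = rhs / lhs
        y = y + dt * ydot + gamma * dt * (ydot_new - ydot)
        ydot = ydot_new
    return y


# Hypothetical test case: decay over one time unit with 1000 steps.
y_end = generalized_alpha_decay(lam=-1.0, y0=1.0, dt=0.001, n_steps=1000)
exact = math.exp(-1.0)
```

For a nonlinear problem the single linear solve per step would be replaced by Newton iterations on the same residual.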
7.1 Cerebral Aneurysms: Tissue Prestress
In this section we focus on the importance of prestressing of arterial tissue and
its effect on the key quantities of interest in FSI computations. For other results
obtained in ALE-VMS FSI simulations of cerebral aneurysms the reader is referred
to [8, 67–69]. A Neo-Hookean material with dilatational penalty is employed here
to model the response of the vascular wall. We apply the prestress procedure proposed
in [48] to two cerebral aneurysm models shown in Fig. 16. The inflow and
outflow branches are labeled as M1 and M2, respectively, in the same figure. Both
models come from patient-specific imaging data and exhibit significant geometrical
differences. Model 1 has a relatively small aneurysm dome and an inlet branch of
large radius. The situation is reversed for Model 2. Meshing techniques developed by
Zhang et al. [68] are used for generating linear tetrahedral elements for both models.
The meshes contain both the blood flow volume and solid vessel wall. Fine meshes
with boundary layer resolution are employed.
Figure 17 shows the final prestressed state for both models, which also demonstrates
the applicability of the method to different vascular geometries. The models
are colored by the isocontours of wall tension, which is defined as the absolute value
of the first principal in-plane stress.
Fig. 16 Tetrahedral finite element mesh of the middle cerebral artery (MCA) bifurcation with
aneurysm. Both patient-specific models are discretized using approximately 165K elements and
30K nodes. Inlet branches are labeled M1 and outlet branches are labeled M2 for both models. The
arrows point in the direction of inflow velocity. The inlet cross-sectional areas of Models 1 and 2
are $4.962 \times 10^{-2}$ and $2.102 \times 10^{-2}$ cm², respectively. a Model 1; b Model 2
Fig. 17 Final prestressed state for Models 1 and 2. The models are colored by the isocontours
of wall tension, which is defined as the absolute value of the first principal in-plane stress of $S_0$.
a Model 1; b Model 2
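The wall-tension measure used for coloring can be written out for a single point. The sketch below assumes the in-plane stress is given by its three components in a local orthonormal surface basis (hypothetical names `s11`, `s22`, `s12`) and takes "first principal" to mean the algebraically largest principal value. Both are assumptions made for illustration.

```python
import math


def wall_tension(s11, s22, s12):
    """Absolute value of the largest principal value of the symmetric
    in-plane stress tensor [[s11, s12], [s12, s22]]."""
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)  # Mohr-circle radius
    first_principal = mean + radius
    return abs(first_principal)


# Hypothetical stress states (arbitrary units).
uniaxial_like = wall_tension(3.0, 1.0, 0.0)
with_shear = wall_tension(2.0, 2.0, 1.0)
```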
To assess the influence of the prestress, we perform a coupled FSI simulation of
both models and compare the results with and without prestress. Figure 18 shows
the relative wall displacement between the deformed configuration and the reference
configuration coming from imaging data. The deformed configuration corresponds
to the time instant when the fluid traction vector is closest to the averaged traction
vector used for the prestress problem. Almost no difference between the reference
and deformed configurations is seen in the case of the prestressed-artery simulation,
as expected. However, in the case of the non-prestressed simulation, the differences
between the two configurations are significant. This indicates that the FSI problem
is not being solved on the correct geometry. Furthermore, the relative geometry
error is larger for Model 2, which has a larger aneurysm dome and a thinner wall.
Figure 19 shows the relative wall displacement between the deformed configuration
at peak systole and low diastole. In both the prestressed and non-prestressed cases the
relative displacement is fairly small, yet non-negligible. The non-prestressed case,
however, makes use of a geometry that is significantly more inflated compared
to the prestressed case and the imaging data.
Fig. 18 Relative wall displacement between the deformed configuration and reference configuration
coming from imaging data. Top Model 1; Bottom Model 2. The deformed configuration
corresponds to the time instant when the fluid traction vector is closest to the averaged traction
vector used for the prestress computation. a With prestress; b Without prestress
Fig. 19 Relative wall displacement between the deformed configuration at peak systole and low
diastole. Top Model 1; Bottom Model 2. a With prestress; b Without prestress
Figure 20 shows a comparison of the blood flow speed near peak systole for the
simulations with and without prestress. The results for the two cases are very
similar, although some differences in the flow structures are visible, especially for
Model 2. Figure 21 shows a comparison of the WSS near peak systole for both cases.
The WSS, unlike blood flow velocity, exhibits significant differences in magnitude
and spatial distribution. The simulations presented clearly show the importance of
tissue prestress in patient-specific vascular FSI modeling for accurate prediction of
hemodynamic phenomena and vessel wall mechanics.
Fig. 20 Volume-rendered blood flow velocity magnitude near peak systole. Top Model 1; Bottom
Model 2. a With prestress; b Without prestress
Fig. 21 Wall shear stress near peak systole. Top Model 1; Bottom Model 2. a With prestress;
b Without prestress
Fig. 22 PVAD geometry including blood and air chambers as well as inlet and outlet boundaries.
Darker and lighter shades are used to denote the blood and air chambers, respectively
7.2 PVAD: Residence Time Computations
The PVAD operates as follows. The pneumatically driven device consists of
the blood and air chambers and a thin membrane separating the two subdomains
(see Fig. 22). The blood chamber is connected to the circulatory system. During the
fill stage, the air is pumped out of the air chamber, which displaces the membrane
downward, and the blood chamber is filled with blood from one of the heart ventricles.
During the ejection stage, air is pumped into the air chamber, which displaces the
membrane upward, and blood is ejected into the aorta. We simulate the process,
exactly as described, using the FSI techniques summarized in what follows.
The computational FSI methodology for PVADs was presented in detail in [19].
The linear FEM-based ALE-VMS technique [12, 24, 54, 70] is used for the computation
of blood and air flow in the respective chambers of the device. The NURBS-based
isogeometric Kirchhoff–Love shell formulation [71–77] is employed to model the
structural mechanics behavior of the thin membrane separating the air and blood
chambers. The St. Venant–Kirchhoff constitutive law is used to model the material
response of the thin membrane. A nonmatching interface FSI formulation [4, 14, 17,
34, 35, 42, 44–46, 54, 78–80] is employed. In an effort to enhance computational
efficiency, a combination of a sparse-matrix-based and a matrix-free approach was
used in the implementation of the quasi-direct nonmatching-interface FSI coupling
technique (see [19] for details). Due to the large deformations of the membrane,
the blood and air domain meshes need to be periodically regenerated. Five-to-six
remeshing steps typically occur during one cycle.
During the ejection stage the inlet is closed and the outlet is open to a resistance
boundary condition [81]. During the fill stage, the outlet is closed and the inlet is
open to the same resistance BCs. The latter setup presents a flow regime that is
often numerically unstable due to the significant amount of reversed flow coming
into the computational domain through the open boundary. The outflow stabilization
technique developed in [24] and further studied in [82] is employed here, which
renders this setup stable and enables simulating PVAD operation as described in the
first paragraph of this section.
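For orientation, one commonly used form of such an outflow (backflow) stabilization is sketched below. It is written here from general knowledge of this class of methods rather than quoted from [24] or [82], so the symbols are assumptions: $\mathbf{w}$ is the momentum test function, $\mathbf{u}$ the fluid velocity, $\mathbf{n}$ the outward unit normal on the open boundary $\Gamma_{\mathrm{open}}$, $\rho$ the density, and $\beta$ a positive constant of order one or smaller. The term is added to the left-hand side of the weak form and is active only where the flow re-enters the domain.

```latex
-\,\beta \int_{\Gamma_{\mathrm{open}}} \rho \,\{\mathbf{u}\cdot\mathbf{n}\}_{-}\;
   \mathbf{w}\cdot\mathbf{u}\;\mathrm{d}\Gamma ,
\qquad
\{\mathbf{u}\cdot\mathbf{n}\}_{-} \;=\; \frac{\mathbf{u}\cdot\mathbf{n} - |\mathbf{u}\cdot\mathbf{n}|}{2} .
```

Because $\{\mathbf{u}\cdot\mathbf{n}\}_{-} \le 0$, setting $\mathbf{w} = \mathbf{u}$ makes the added term non-negative, so it acts as extra dissipation on the reversed-flow part of the boundary and vanishes for pure outflow. On a moving boundary the velocity relative to the mesh would replace $\mathbf{u}$ in the normal-flux factor.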
The discretization of the time-dependent advection equations describing particle
residence time and dye injection makes use of the ALE-based, SUPG-stabilized [27]
formulation augmented with a YZβ discontinuity-capturing operator [83, 84]. The
advection equations are only solved in the blood chamber using the same meshes of
linear tetrahedral elements as the fluid mechanics problem. Because the FSI equations
do not depend on the residence-time solution, we first compute the FSI problem for
three cycles to achieve a nearly-time-periodic solution. The fluid and mesh velocity,
and mesh position solutions in the last cycle are used to drive the residence time
computations.
For the FSI simulation, a stroke volume of 73 ml is chosen for this device, which
yields an ejection fraction of 68 %. A beat frequency of 80 bpm is used, for a pump
output of 5.8 l/min. Each pump cycle may be broken up into two components: the
fill stage and the ejection stage. We impose the fill period of 0.45 s, and the ejection
period of 0.3 s, and we also enforce that each stage must fill or eject the same volume,
73 ml. For simplicity, the flow is assumed to behave sinusoidally during each stage.
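The pump parameters quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that "sinusoidally" means a half-sine flow pulse over each stage and that the ejection fraction is the stroke volume divided by the filled chamber volume. Both readings, and all names in the code, are assumptions made for illustration.

```python
import math

STROKE_VOLUME_ML = 73.0
BEATS_PER_MIN = 80.0
EJECTION_FRACTION = 0.68
FILL_PERIOD_S = 0.45
EJECT_PERIOD_S = 0.30


def half_sine_flow(t, volume, period):
    """Flow rate (ml/s) of a half-sine pulse that moves `volume` in `period`."""
    return 0.5 * math.pi * volume / period * math.sin(math.pi * t / period)


def integrate(flow, volume, period, n=10000):
    """Midpoint-rule integral of the flow over one stage."""
    dt = period / n
    return sum(flow((i + 0.5) * dt, volume, period) for i in range(n)) * dt


pump_output_l_per_min = STROKE_VOLUME_ML * BEATS_PER_MIN / 1000.0   # 5.84
cycle_period_s = 60.0 / BEATS_PER_MIN                               # 0.75
filled_chamber_volume_ml = STROKE_VOLUME_ML / EJECTION_FRACTION

# Peak flow rates occur at the middle of each stage.
peak_fill_ml_s = half_sine_flow(0.5 * FILL_PERIOD_S, STROKE_VOLUME_ML, FILL_PERIOD_S)
peak_eject_ml_s = half_sine_flow(0.5 * EJECT_PERIOD_S, STROKE_VOLUME_ML, EJECT_PERIOD_S)

# Each stage should move the full stroke volume.
filled_ml = integrate(half_sine_flow, STROKE_VOLUME_ML, FILL_PERIOD_S)
ejected_ml = integrate(half_sine_flow, STROKE_VOLUME_ML, EJECT_PERIOD_S)
```

The fill and ejection periods add up to the 0.75 s cycle implied by 80 bpm, and the shorter ejection stage has the higher peak flow rate because it moves the same volume in less time.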
The number of elements in each domain fluctuates with remeshing. The device is
initialized with 557,416 elements in the blood chamber and 264,597 elements in the
air chamber. Each remeshing step respects the initial length scale of the elements.
The membrane is discretized with 1,024 $C^1$-continuous quadratic NURBS elements.
A time step size of 0.001 s is used. The FSI problem is computed for three time cycles
in order to reach a time-periodic solution. The FSI solution in the last cycle is used to
drive the residence time computations. See Fig. 23 for the snapshots of flow solution
and membrane deformation at various times during the cycle.
In the residence time simulations the residence-time scalar field is initialized to zero.
The subdomain V in Eq. (2) is assumed to be the whole blood chamber, including the
inlet and outlet arms. Thus, the field is a measure of particle residence time in the whole
blood chamber. At the inlet branch, the residence time is set to zero and at the outlet
branch its value is left unspecified. Five cycles were required to reach a time-periodic
residence-time solution. Figure 23 shows the snapshots of residence time distribution in
the blood chamber during the cycle. While a local maximum in the residence time occurs
near the apex of the blood chamber, which is due to a large vortical flow structure in the
device, the maximum residence time concentrates near the outlet branch for most of the cycle.
This suggests that the old material does not accumulate in the interior of the device,
and is ejected by the pump in a fairly efficient manner.
Fig. 23 PVAD flow solution (left), residence time (middle), and membrane deformation (right) at
several instants during the cycle. Top to bottom: t = 0.15 s and t = 0.525 s
Fig. 24 Plot of the spatially averaged residence time as a function of time (s). The fill stage is
given by t ∈ [0, 0.45], and the ejection stage by t ∈ (0.45, 0.75]
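The boundary treatment described above (zero residence time at the inlet, nothing imposed at the outlet) can be illustrated on a one-dimensional model. The sketch below assumes the residence time, written `tau` here, obeys an advection equation with a unit source, and it uses a first-order upwind finite-difference scheme on a fixed grid with constant velocity. This is a deliberately simpler stand-in for the ALE-based, SUPG-stabilized finite element formulation used in the computations, and all names and values are hypothetical.

```python
def residence_time_1d(length, speed, n_cells, t_end, cfl=0.5):
    """First-order upwind solve of d(tau)/dt + speed * d(tau)/dx = 1 on
    [0, length] for speed > 0, with tau = 0 at the inlet (x = 0) and no
    condition imposed at the outlet (x = length)."""
    dx = length / n_cells
    dt = cfl * dx / speed
    n_steps = int(round(t_end / dt))
    tau = [0.0] * (n_cells + 1)       # nodal values, initialized to zero
    for _ in range(n_steps):
        new = tau[:]
        for i in range(1, n_cells + 1):
            # Upwind difference: information travels from the inlet side.
            new[i] = tau[i] - cfl * (tau[i] - tau[i - 1]) + dt
        new[0] = 0.0                  # fresh fluid enters with zero residence time
        tau = new
    return tau


# Hypothetical channel: transit time length / speed = 0.5 time units.
tau = residence_time_1d(length=1.0, speed=2.0, n_cells=50, t_end=1.5)
```

Once the initial transient has been flushed out, the residence time in this model grows linearly from the inlet and equals the transit time at the outlet, which is the one-dimensional analogue of the time-periodic state reached after several cycles.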
The time history of the spatial average of the blood chamber residence time
is shown in Fig. 24. The average residence time rises uniformly during the ejection
stage. This is because no new material is entering the blood chamber and the average
is expected to rise with time. The fill stage shows a brief and rapid decrease in the
average, as new material with zero residence time enters the blood chamber. The trend
then reverses and the average begins to increase again in the later part of the fill stage
as the influx of new material slows and the blood chamber volume grows. Using Eqs. (7)
and (11), we find that RT1 = 0.893 s and RT2 = 1.031 s for this device, meaning the blood
particles remain in the chamber on average for about 1 s. Note that the difference between
RT1 and RT2 is very minor, suggesting that the bulk of the residence time comes
from the particles circulating in the chamber rather than directly traversing the length
of the device from the inlet to outlet.
Results from the dye injection analysis can be seen in Fig. 25. The figure shows
the percentage of dye remaining in the blood chamber after the chamber is filled for
one cycle. Note that in the first cycle over 50 % of the dye is removed. Furthermore,
it takes 2.47 s to remove 95 % of the dye subsequent to the initial fill cycle.
Fig. 25 Plot of the percentage of dye remaining in the blood chamber versus time (s) for the
73 mL device, with the 5 % cutoff marked
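As a rough point of comparison for the dye washout, one can ask how long an idealized, perfectly mixed chamber would need to clear 95 % of the dye. The sketch below assumes that each beat removes a fraction of the dye equal to the ejection fraction and that the cycle lasts 0.75 s. The perfect-mixing model is an assumption introduced here for illustration and is not part of the reported analysis.

```python
import math

EJECTION_FRACTION = 0.68   # from the text
CYCLE_PERIOD_S = 0.75      # 0.45 s fill + 0.30 s ejection
REPORTED_TIME_S = 2.47     # time to remove 95 % of the dye, from the text


def cycles_to_clear(target_remaining, ejection_fraction):
    """Cycles needed for the dye fraction to fall to `target_remaining`
    if each beat removes `ejection_fraction` of a perfectly mixed chamber."""
    remaining_per_cycle = 1.0 - ejection_fraction
    return math.log(target_remaining) / math.log(remaining_per_cycle)


n_cycles = cycles_to_clear(0.05, EJECTION_FRACTION)
well_mixed_time_s = n_cycles * CYCLE_PERIOD_S
```

Under these assumptions the idealized estimate is a little under 2 s, somewhat shorter than the reported 2.47 s. The two numbers are not strictly comparable, since the time origins and the mixing behavior differ, so the comparison indicates an order of magnitude only.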
8 Concluding Remarks
We presented a review of how cardiovascular fluid mechanics analysis, including FSI,
can be very effectively carried out with the core and special ST and ALE techniques.
We presented several challenging computations that were successfully carried out with
these techniques. The core techniques are the ALE-VMS, DSD/SST and SSTFSI
methods. The special techniques reviewed were those for stent modeling and mesh
generation and for calculation of the particle residence time. These were only a few
examples of the large number of special techniques developed for cardiovascular fluid
mechanics modeling in conjunction with the core techniques, ranging from arterial-
surface extraction techniques to methods for an estimated element-based zero-stress
state for the artery. We provided references for some of the other special techniques.
This article shows that the core and special ST and ALE techniques developed can
successfully address the computational challenges encountered in patient-specific
cardiovascular fluid mechanics modeling.
References
1. Humphrey JD (2002) Cardiovascular solid mechanics. Springer, New York
2. Holzapfel GA (2000) Nonlinear solid mechanics, a continuum approach for engineering. Wiley,
Chichester
3. Tezduyar TE, Sathe S, Keedy R, Stein K (2006) Space–time finite element techniques for
computation of fluid–structure interactions. Comput Meth Appl Mech Eng 195:2002–2027.
doi:10.1016/j.cma.2004.09.014
4. Tezduyar TE, Sathe S (2007) Modeling of fluid–structure interactions with the space–time
finite elements: solution techniques. Int J Numer Meth Fluids 54:855–900.
doi:10.1002/fld.1430
5. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2004) Influence of wall elasticity on
image-based blood flow simulation. Jpn Soc Mech Eng J Ser A 70:1224–1231 (in Japanese)
6. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2006) Computer modeling of
cardiovascular fluid–structure interactions with the deforming-spatial-domain/stabilized
space–time formulation. Comput Meth Appl Mech Eng 195:1885–1895. doi:10.1016/j.cma.2005.05.050
7. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2007) Influence of wall elasticity
in patient-specific hemodynamic simulations. Comput Fluids 36:160–168.
doi:10.1016/j.compfluid.2005.07.014
8. Bazilevs Y, Hsu M-C, Zhang Y, Wang W, Kvamsdal T, Hentschel S, Isaksen J (2010)
Computational fluid–structure interaction: methods and application to cerebral aneurysms. Biomech
Model Mechanobiol 9:481–498
9. Bazilevs Y, Hsu M-C, Benson D, Sankaran S, Marsden A (2009) Computational fluid–structure
interaction: methods and application to a total cavopulmonary connection. Comput Mech
45:77–89
10. Bazilevs Y, del Alamo JC, Humphrey JD (2010) From imaging to prediction: emerging
non-invasive methods in pediatric cardiology. Prog Pediatr Cardiol 30:81–89
11. Figueroa CA, Baek S, Taylor CA, Humphrey JD (2009) A computational framework for
fluid-solid-growth modeling in cardiovascular simulations. Comput Meth Appl Mech Eng
199:3583–3602
12. Bazilevs Y, Calo VM, Zhang Y, Hughes TJR (2006) Isogeometric fluid–structure interaction
analysis with applications to arterial blood flow. Comput Mech 38:310–322
13. Tezduyar TE, Sathe S, Cragin T, Nanna B, Conklin BS, Pausewang J, Schwaab M (2007)
Modeling of fluid–structure interactions with the space–time finite elements: arterial fluid
mechanics. Int J Numer Meth Fluids 54:901–922. doi:10.1002/fld.1443
14. Takizawa K, Christopher J, Tezduyar TE, Sathe S (2010) Space–time finite element computation
of arterial fluid–structure interactions with patient-specific data. Int J Numer Meth Biomed Eng
26:101–116. doi:10.1002/cnm.1241
15. Sugiyama K, Ii S, Takeuchi S, Takagi S, Matsumoto Y (2010) Full Eulerian simulations of
biconcave Neo-Hookean particles in a Poiseuille flow. Comput Mech 46:147–157
16. Manguoglu M, Takizawa K, Sameh AH, Tezduyar TE (2011) A parallel sparse algorithm
targeting arterial fluid mechanics computations. Comput Mech 48:377–384.
doi:10.1007/s00466-011-0619-0
17. Bazilevs Y, Takizawa K, Tezduyar TE (2013) Computational fluid–structure interaction:
methods and applications. Wiley, Hoboken
18. Takizawa K, Schjodt K, Puntel A, Kostov N, Tezduyar TE (2013) Patient-specific computational
analysis of the influence of a stent on the unsteady flow in cerebral aneurysms. Comput Mech
51:1061–1073. doi:10.1007/s00466-012-0790-y
19. Long CC, Marsden AL, Bazilevs Y (2013) Fluid–structure interaction simulation of pulsatile
ventricular assist devices. Comput Mech. doi:10.1007/s00466-013-0858-3 (published online)
20. Esmaily-Moghadam M, Bazilevs Y, Marsden AL (2013) A new preconditioning technique
for implicitly coupled multidomain simulations with applications to hemodynamics. Comput
Mech. doi:10.1007/s00466-013-0868-1 (published online)
21. Takizawa K, Takagi H, Tezduyar TE, Torii R (2013) Estimation of element-based zero-stress
state for arterial FSI computations. Comput Mech. doi:10.1007/s00466-013-0919-7 (published
online)
22. Takizawa K, Tezduyar TE, Buscher A, Asada S (2013) Space–time interface-tracking with
topology change (ST-TC). Comput Mech. doi:10.1007/s00466-013-0935-7 (published online)
23. Hughes TJR, Liu WK, Zimmermann TK (1981) Lagrangian–Eulerian finite element formulation
for incompressible viscous flows. Comput Meth Appl Mech Eng 29:329–349
24. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid–structure interaction:
theory, algorithms, and computations. Comput Mech 43:3–37
25. Tezduyar TE (1992) Stabilized finite element formulations for incompressible flow
computations. Adv Appl Mech 28:1–44. doi:10.1016/S0065-2156(08)70153-4
26. Tezduyar TE (2003) Computation of moving boundaries and interfaces and stabilization
parameters. Int J Numer Meth Fluids 43:555–575. doi:10.1002/fld.505
27. Brooks AN, Hughes TJR (1982) Streamline upwind/Petrov–Galerkin formulations for
convection dominated flows with particular emphasis on the incompressible Navier–Stokes equations.
Comput Meth Appl Mech Eng 32:199–259
28. Tezduyar TE, Mittal S, Ray SE, Shih R (1992) Incompressible flow computations with stabilized
bilinear and linear equal-order-interpolation velocity-pressure elements. Comput Meth Appl
Mech Eng 95:221–242. doi:10.1016/0045-7825(92)90141-6
29. Hughes TJR, Franca LP, Balestra M (1986) A new finite element formulation for computational
fluid dynamics: V. Circumventing the Babuška–Brezzi condition: a stable Petrov–Galerkin
formulation of the Stokes problem accommodating equal-order interpolations. Comput Meth Appl
Mech Eng 59:85–99
30. Hughes TJR, Hulbert GM (1988) Space–time finite element methods for elastodynamics:
formulations and error estimates. Comput Meth Appl Mech Eng 66:339–363
31. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2011) Influencing factors in
image-based fluid–structure interaction computation of cerebral aneurysms. Int J Numer Meth Fluids
65:324–340. doi:10.1002/fld.2448
32. Tezduyar T, Aliabadi S, Behr M, Johnson A, Mittal S (1993) Parallel finite-element computation
of 3D flows. Computer 26:27–36. doi:10.1109/2.237441
33. Tezduyar TE (2004) Finite element methods for fluid dynamics with moving boundaries and
interfaces. In: Stein E, Borst RD, Hughes TJR (eds) Encyclopedia of computational mechanics.
Fluids, vol 3. Wiley, Hoboken (Chapter 17)
34. Takizawa K, Tezduyar TE (2012) Space–time fluid–structure interaction methods. Math Models
Meth Appl Sci 22:1230001. doi:10.1142/S0218202512300013
35. Takizawa K, Tezduyar TE (2011) Multiscale space–time fluid–structure interaction techniques.
Comput Mech 48:247–267. doi:10.1007/s00466-011-0571-z
36. Hughes TJR (1995) Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann
formulation, subgrid scale models, bubbles, and the origins of stabilized methods. Comput Meth
37. Hughes TJR, Oberai AA, Mazzei L (2001) Large eddy simulation of turbulent channel flows
by the variational multiscale method. Phys Fluids 13:1784–1799
38. Bazilevs Y, Calo VM, Cottrell JA, Hughes TJR, Reali A, Scovazzi G (2007) Variational
multiscale residual-based turbulence modeling for large eddy simulation of incompressible flows.
39. Bazilevs Y, Akkerman I (2010) Large eddy simulation of turbulent Taylor–Couette flow using
isogeometric analysis and the residual-based variational multiscale method. J Comput Phys
229:3402–3414
40. Tezduyar TE, Cragin T, Sathe S, Nanna B (2007) FSI computations in arterial fluid mechanics
with estimated zero-pressure arterial geometry. In: Onate E, Garcia J, Bergan P, Kvamsdal T
(eds) Marine 2007. CIMNE, Barcelona, Spain
41. Takizawa K, Moorman C, Wright S, Christopher J, Tezduyar TE (2010) Wall shear stress
calculations in space–time finite element computation of arterial fluid–structure interactions.
Comput Mech 46:31–41. doi:10.1007/s00466-009-0425-0
42. Tezduyar TE, Schwaab M, Sathe S (2009) Sequentially-coupled arterial fluid–structure
interaction (SCAFSI) technique. Comput Meth Appl Mech Eng 198:3524–3533.
doi:10.1016/j.cma.2008.05.024
43. Tezduyar TE, Schwaab M, Sathe S (2007) Arterial fluid mechanics with the sequentially-coupled
arterial FSI technique. In: Onate E, Papadrakakis M, Schrefler B (eds) Coupled Problems
2007. CIMNE, Barcelona, Spain
44. Tezduyar TE, Takizawa K, Moorman C, Wright S, Christopher J (2010) Multiscale
sequentially-coupled arterial FSI technique. Comput Mech 46:17–29. doi:10.1007/s00466-009-0423-2
45. Takizawa K, Moorman C, Wright S, Purdue J, McPhail T, Chen PR, Warren J, Tezduyar TE
(2011) Patient-specific arterial fluid–structure interaction modeling of cerebral aneurysms. Int
J Numer Meth Fluids 65:308–323. doi:10.1002/fld.2360
46. Tezduyar TE, Takizawa K, Brummer T, Chen PR (2011) Space–time fluid–structure interaction
modeling of patient-specific cerebral aneurysms. Int J Numer Meth Biomed Eng 27:1665–1710.
doi:10.1002/cnm.1433
47. Takizawa K, Brummer T, Tezduyar TE, Chen PR (2012) A comparative study based on
patient-specific fluid–structure interaction modeling of cerebral aneurysms. J Appl Mech 79:010908.
doi:10.1115/1.4005071
48. Hsu M-C, Bazilevs Y (2011) Blood vessel tissue prestress modeling for vascular fluid–structure
interaction simulations. Finite Elem Anal Des 47:593–599
49. Takizawa K, Schjodt K, Puntel A, Kostov N, Tezduyar TE (2012) Patient-specific computer
modeling of blood flow in cerebral arteries with aneurysm and stent. Comput Mech 50:675–686.
doi:10.1007/s00466-012-0760-4
50. Bluestein D, Niu L, Schoephoerster R, Dewanjee M (1997) Fluid mechanics of arterial stenosis:
relationship to the development of mural thrombus. Ann Biomed Eng 25:344–356
51. Long CC, Esmaily-Moghadam M, Marsden AL, Bazilevs Y (2013) Computation of residence
time in the simulation of pulsatile ventricular assist devices. Comput Mech.
doi:10.1007/s00466-013-0931-y (published online)
52. Esmaily-Moghadam M, Hsia T-Y, Marsden A (2014) A non-discrete method for computation
of residence time in fluid mechanics simulations. Phys Fluids. doi:10.1063/1.4819142
53. Long CC, Marsden AL, Bazilevs Y (2014) Shape optimization of pulsatile ventricular assist
devices using FSI to minimize thrombotic risk. Comput Mech (accepted for publication)
54. Takizawa K, Bazilevs Y, Tezduyar TE (2012) Space–time and ALE-VMS techniques for
patient-specific cardiovascular fluid–structure interaction modeling. Arch Comput Meth Eng
19:171–225. doi:10.1007/s11831-012-9071-3
55. Tezduyar TE, Sathe S, Schwaab M, Conklin BS (2008) Arterial fluid mechanics modeling
with the stabilized space–time fluid–structure interaction technique. Int J Numer Meth Fluids
57:601–629. doi:10.1002/fld.1633
56. Huang H, Virmani R, Younis H, Burke AP, Kamm RD, Lee RT (2001) The impact of
calcification on the biomechanical stability of atherosclerotic plaques. Circulation 103:1051–1056
57. Bazilevs Y, Gohean JR, Hughes TJR, Moser RD, Zhang Y (2009) Patient-specific isogeometric
fluid–structure interaction analysis of thoracic aortic blood flow due to implantation of the Jarvik
2000 left ventricular assist device. Comput Meth Appl Mech Eng 198:3534–3550
58. Saad Y, Schultz M (1986) GMRES: a generalized minimal residual algorithm for solving
nonsymmetric linear systems. SIAM J Sci Stat Comput 7:856–869
59. Takizawa K, Henicke B, Puntel A, Spielman T, Tezduyar TE (2012) Space–time computational
techniques for the aerodynamics of flapping wings. J Appl Mech 79:010903.
doi:10.1115/1.4005073
60. Rhee K, Han MH, Cha SH, Khang G (2001) The changes of flow characteristics caused by a
stent in fusiform aneurysm models. In: Proceedings of the 23rd annual international conference
of the IEEE Engineering in Medicine and Biology Society, vol 1, pp 86–88.
doi:10.1109/IEMBS.2001.1018852
61. Jou L-D, Mawad ME (2011) Hemodynamic effect of neuroform stent on intimal hyperplasia
and thrombus formation in a carotid aneurysm. Med Eng Phys 33:573–580.
doi:10.1016/j.medengphy.2010.12.013
62. Chung J, Hulbert GM (1993) A time integration algorithm for structural dynamics with
improved numerical dissipation: the generalized-α method. J Appl Mech 60:371–375
63. Jansen KE, Whiting CH, Hulbert GM (2000) A generalized-α method for integrating the filtered
Navier–Stokes equations with a stabilized finite element method. Comput Meth Appl Mech
Eng 190:305–319
64. Tezduyar TE, Sathe S, Keedy R, Stein K (2004) Space–time techniques for finite element
computation of flows with moving boundaries and interfaces. In: Gallegos S, Herrera I, Botello
S, Zarate F, Ayala G (eds) Proceedings of the III international congress on numerical methods
in engineering and applied science. CD-ROM, Monterrey, Mexico
65. Tezduyar TE, Behr M, Mittal S, Johnson AA (1992) Computation of unsteady incompressible
flows with the finite element methods – space–time formulations, iterative strategies and
massively parallel implementations. In: New methods in transient analysis, PVP-Vol. 246/AMD,
vol 143. ASME, New York, pp 7–24
66. Johnson AA, Tezduyar TE (1994) Mesh update strategies in parallel finite element
computations of flow problems with moving boundaries and interfaces. Comput Meth Appl Mech Eng
119:73–94. doi:10.1016/0045-7825(94)00077-8
67. Isaksen JG, Bazilevs Y, Kvamsdal T, Zhang Y, Kaspersen JH, Waterloo K, Romner B,
Ingebrigtsen T (2008) Determination of wall tension in cerebral artery aneurysms by numerical
simulation. Stroke 39:3172–3178
68. Zhang Y, Wang W, Liang X, Bazilevs Y, Hsu M-C, Kvamsdal T, Brekken R, Isaksen J (2009)
High-fidelity tetrahedral mesh generation from medical imaging data for fluid–structure
interaction analysis of cerebral aneurysms. Comput Model Eng Sci 42:131–150
69. Bazilevs Y, Hsu M-C, Zhang Y, Wang W, Liang X, Kvamsdal T, Brekken R, Isaksen J (2010)
A fully-coupled fluid–structure interaction simulation of cerebral aneurysms. Comput Mech
46:3–16
70. Bazilevs Y, Hsu M-C, Takizawa K, Tezduyar TE (2012) ALE-VMS and ST-VMS methods for
computer modeling of wind-turbine rotor aerodynamics and fluid–structure interaction. Math
Models Meth Appl Sci 22:1230002. doi:10.1142/S0218202512300025
71. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: CAD, finite elements, NURBS,
exact geometry, and mesh refinement. Comput Meth Appl Mech Eng 194:4135–4195
72. Cottrell JA, Hughes TJR, Bazilevs Y (2009) Isogeometric analysis: toward integration
of CAD and FEA. Wiley
73. Kiendl J, Bletzinger K-U, Linhard J, Wüchner R (2009) Isogeometric shell analysis with
Kirchhoff–Love elements. Comput Meth Appl Mech Eng 198:3902–3914
74. Kiendl J, Bazilevs Y, Hsu M-C, Wüchner R, Bletzinger K-U (2010) The bending strip method
for isogeometric analysis of Kirchhoff–Love shell structures comprised of multiple patches.
75. Bazilevs Y, Hsu M-C, Kiendl J, Wüchner R, Bletzinger K-U (2011) 3D simulation of wind
turbine rotors at full scale. Part II: Fluid–structure interaction modeling with composite blades.
Int J Numer Meth Fluids 65:236–253
76. Benson DJ, Bazilevs Y, De Luycker E, Hsu M-C, Scott M, Hughes TJR, Belytschko T (2010)
A generalized finite element formulation for arbitrary basis functions: from isogeometric
analysis to XFEM. Int J Numer Meth Eng 83:765–785
77. Benson DJ, Bazilevs Y, Hsu M-C, Hughes TJR (2011) A large deformation, rotation-free,
isogeometric shell. Comput Meth Appl Mech Eng 200:1367–1378
78. Tezduyar TE, Sathe S, Pausewang J, Schwaab M, Christopher J, Crabtree J (2008) Interface
projection techniques for fluid–structure interaction modeling with moving-mesh methods.
Comput Mech 43:39–49. doi:10.1007/s00466-008-0261-7
79. Tezduyar TE, Takizawa K, Moorman C, Wright S, Christopher J (2010) Space–time finite
element computation of complex fluid–structure interactions. Int J Numer Meth Fluids
64:1201–1218. doi:10.1002/fld.2221
80. Bazilevs Y, Hsu M-C, Scott MA (2012) Isogeometric fluid–structure interaction analysis with
emphasis on non-matching discretizations, and with application to wind turbines. Comput Meth
81. Vignon-Clementel IE, Figueroa CA, Jansen KE, Taylor CA (2006) Outflow boundary
conditions for three-dimensional finite element modeling of blood flow and pressure in arteries.
82. Moghadam ME, Bazilevs Y, Hsia T-Y, Vignon-Clementel IE, Marsden AL (2011) The modeling
of congenital hearts alliance (MOCHA), a comparison of outlet boundary treatments for
prevention of backflow divergence with relevance to blood flow simulations. Comput Mech
48:277–291. doi:10.1007/s00466-011-0599-0
83. Tezduyar TE, Senga M (2006) Stabilization and shock-capturing parameters in SUPG
formulation of compressible flows. Comput Meth Appl Mech Eng 195:1621–1632.
doi:10.1016/j.cma.2005.05.032
84. Bazilevs Y, Calo VM, Tezduyar TE, Hughes TJR (2007) YZβ discontinuity-capturing for
advection-dominated processes with application to arterial drug delivery. Int J Numer Meth
Fluids 54:593–608. doi:10.1002/fld.1484
Part III
Particle Methods in Coupled Problems
Direct Numerical Simulation of Particulate
Flows Using a Fictitious Domain Method
Bircan Avci and Peter Wriggers
Abstract Multiphase flows consisting of a continuous fluid phase and a dispersed
phase of macroscopic particles are present in many engineering applications. In
general, a main task in the study of the particle-laden fluid flow of an application
is to make predictions about the system's nature for various boundary conditions,
since, depending on the volume fraction and mass concentration of the dispersed
phase, a fluid-particle system shows quite different flow properties. Unfortunately,
it is often impossible to investigate such a system experimentally in detail or even at
all. An option to capture and to predict its properties is performing a direct numerical
simulation of the particulate fluid. For this purpose, a model approach based on a
fictitious domain method is proposed in this contribution. Here, the fluid and the
particle phase are treated, respectively, within the framework of the finite element
method and the discrete element method. The coupling scheme, which accounts for
the phase interaction, is realized at the particle scale. For the computation of the forces
that the fluid exerts on a particle an approach is used in which they are determined
directly from the flow field in the vicinity of its surface.
B. Avci (✉) · P. Wriggers
Institute of Continuum Mechanics, Appelstrasse 11, 30167 Hannover, Germany
1 Introduction
Particulate flows are of great importance in very different industrial branches, e.g.,
in medical, process and chemical industries and also in geotechnical engineering
and bioengineering. Examples include fluidized beds, sedimentation, fluvial erosion,
sand production in oil wells, dust collection devices and aerosol transport in human
respiratory airways. The characteristics of particulate flows are up to now
neither well-investigated nor well-understood, particularly with regard to flows with
a dispersed phase composed of macroscopic particles. From the side of industry and
applied sciences there exists an intense interest in understanding the processes taking
place in these flows in order to predict their behavior, because a profound knowledge
of the fluid-particle interaction would make it easier to improve the performance of
an existing system or to design more efficient systems in the future; but conducting
experiments on particulate flow systems, if they can be carried out at all, is expensive
and time-consuming. Performing virtual experiments on models
via numerical simulations is of course an alternative way to gain some insight into the
flow properties of a multiphase system. However, due to existing crucial limitations
regarding the hardware technology and the scalability of algorithms, simulations often
cannot fully replace real laboratory experiments. Thus, laboratory experiments are still
required, but even so the knowledge extracted from representative small-scale virtual
experiments can contribute to minimizing the sequence of real experiments. Consequently,
the improvement of existing frameworks as well as the development of new frameworks for
the simulation of particulate flows is significant and demands more effort, as the
simulation of problems with a large number of particles is still a very challenging task.
In particle-laden multiphase flows the strength of the phase coupling is predom-
inantly determined both by the local volume fraction of the dispersed phase and by
its mass concentration (see, e.g., Crowe [11]). That means that in case of dense or
locally highly concentrated flows the presence of particles in the fluid field can be
a determining factor for the main characteristics inherent to an engineering system.
To capture the mutual interaction of the phases in such flows, it is crucial to analyze
the respective problem at the particle scale. This necessity implies a fully resolved
model where the particles are described as having a body volume, and not just as
point-particles, because in dense flows neighboring particles can interact not only via
close-range effects (like contact forces, adhesion or agglomeration) but also via
long-range effects due to particle-induced wakes, eddies and other local disturbances.
Those effects can only be captured in the framework of a full 3D direct numerical
simulation (DNS). However, a full 3D DNS approach requires very powerful tech-
niques and is nontrivial, even when considering systems with only one immersed
particle settling in a fluid, let alone systems with some thousands of dispersed parti-
cles in the fluid. This is due to the fact that besides the handling of the evolution of
the time varying fluid domain, the motion of the fluid-particle boundaries needs to be
continuously tracked during the flow process in order to account for the momentum
exchanges taking place at the phase interfaces.
In the last two decades, great progress has been made in the development and
improvement of DNS methods for particulate flow simulations. Basically, the pro-
posed approaches may be classified into two groups: (i) adaptive grid methods and
(ii) fixed grid methods. The first DNS approaches published in the literature are
assigned to the category (i). Here, the fluid field is described using a body-fitted
moving mesh whose elements follow locally the boundaries of the particles being
in motion. Of course, depending on the particles' motion, this can lead to large element
distortions in the mesh, so that the grid needs to be re-meshed from time to time.
Such a procedure is computationally very intensive, and one requires, in fact, very
efficient and sophisticated mesh motion and re-meshing algorithms. The first 3D
DNS computations of particulate flows belonging to this category were presented in
Johnson and Tezduyar [23], which can be considered as a pioneering work in this area
(see also Johnson and Tezduyar [24–26]). In these articles, the authors propose the
deformable-spatial-domain/stabilized space–time (DSD/SST) finite element method
(FEM) setting for the treatment of the fluid field. Further milestone contributions
related to group (i) were published by Hu et al. [19–21]. These authors employ in
their work the Arbitrary-Lagrangian-Eulerian method framework in order to describe
the particulate fluid field. In terms of the approaches assigned to the fixed-grid cate-
gory (ii), one can find in the literature a number of techniques suggested for the DNS
analysis of fluid-particle interactions (see the review paper of Haeri and Shrimpton
[17] and the references cited therein for an overview). Taken together the proposed
approaches are known under the generic term fictitious domain (FD) method. The
most widely-used ones are the immersed boundary method, the distributed Lagrange
multiplier/fictitious domain method and the fictitious boundary method. They all
have in common that the fluid flow is treated in the framework of an Eulerian setting,
where a fixed mesh is employed. Here, in contrast to the approaches in group (i), the
fluid mesh covers the whole computational domain, including the space occupied by
the particles; that is, the solid domain is filled with fluid as well. The main
idea on which the FD methods are based is to uncouple the particles from the mesh
and to consider them as fictitious objects having the property to traverse through the
grid without causing any element deformation. The crucial point is here to enforce
the fluid enclosed by an embedded particle to adopt its solid body motion. In general,
this is realized through imposing additional implicit constraints to the flow field.
A prominent method to simulate granular materials is the well-established discrete
element method (DEM) approach, which was proposed originally by Cundall and
Strack [12]. The application of a DEM solver to predict the behavior of the dispersed
phase in a particulate flow can be found, e.g., in Wachs [41] and Avci and Wriggers
[3]. In a DEM setting, a soft-sphere approach based on repulsive force models is
usually applied to describe the contact among particles, which are at the same
time assumed to be quasi-rigid.
In this work, a DNS approach is developed in the framework of a FD strategy for
the numerical simulation of 3D particle-laden flows. The fluid-particle interactions
are computed at the particle scale, with a fully resolved flow around the particles.
As numerical solvers regarding the simulation of the fluid part and particle part, the
FEM and DEM are used, respectively. Here, both methods are appropriately coupled
by a staggered solution procedure to handle particulate flows.
2 Governing Equations
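A multiphase domain $\Omega \subset \mathbb{R}^3$ is considered to describe a particulate
fluid that consists of a flow field $\Omega_f(t)$ and $N$ particles. Therein, each particle
$P_i$ occupies the domain $\Omega_p^i(t)$. Hence, for $\Omega$ it follows:
$\Omega = \Omega_f(t) \cup \{\Omega_p^i(t)\}_{i=1,N}$.
The flow of the fluid field is modeled by the non-stationary incompressible
Navier–Stokes equations:
$$\rho_f\left(\frac{\partial \mathbf{u}_f}{\partial t} + \mathbf{u}_f\cdot\nabla\mathbf{u}_f\right) - \nabla\cdot\boldsymbol{\sigma} = \mathbf{0}, \qquad \nabla\cdot\mathbf{u}_f = 0, \qquad \forall\,\mathbf{x}\in\Omega_f. \tag{1}$$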
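Herein, $\mathbf{u}_f$ is the fluid velocity, $\rho_f$ the fluid density and
$\boldsymbol{\sigma}$ describes the Cauchy stress tensor. In the numerical studies the
constitutive equation for a Newtonian flow is used:
$$\boldsymbol{\sigma} = -p\,\mathbf{I} + 2\mu\,\boldsymbol{\varepsilon} \quad\text{with}\quad \boldsymbol{\varepsilon} = \frac{1}{2}\left[\nabla\mathbf{u}_f + (\nabla\mathbf{u}_f)^T\right], \tag{2}$$
where $p$ is the pressure, $\mathbf{I}$ the identity tensor, $\mu$ the dynamic viscosity
and $\boldsymbol{\varepsilon}$ the strain rate tensor.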
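As a small illustration of the Newtonian constitutive law in (2), the following sketch evaluates the stress tensor from a given velocity gradient. The function name `newtonian_stress` and the simple-shear test values are assumptions made for this example only; they are not taken from the solver used in this contribution.

```python
import numpy as np

def newtonian_stress(grad_u, p, mu):
    """Cauchy stress of Eq. (2): sigma = -p I + 2 mu eps."""
    grad_u = np.asarray(grad_u, dtype=float)
    eps = 0.5 * (grad_u + grad_u.T)          # strain rate tensor
    return -p * np.eye(3) + 2.0 * mu * eps

# Simple shear u = (gamma * y, 0, 0): only du_x/dy is non-zero.
gamma, mu, p = 2.0, 0.5, 1.0
grad_u = np.zeros((3, 3))
grad_u[0, 1] = gamma
sigma = newtonian_stress(grad_u, p, mu)
```

In this simple-shear case the only deviatoric contribution is the shear stress `sigma[0, 1] = mu * gamma`, while the diagonal carries the pressure.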
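The motion of a quasi-rigid particle $P$ can be deduced from the Newton–Euler
equations. Consequently, its translational and angular velocities,
$\mathbf{U} = \dot{\mathbf{X}}$ and $\boldsymbol{\omega}$, have to satisfy:
$$M\,\frac{d^2\mathbf{X}}{dt^2} = (\rho - \rho_f)\,V\,\mathbf{b} + \mathbf{F} + \mathbf{F}^f \tag{3}$$
$$\boldsymbol{\Theta}\,\frac{d\boldsymbol{\omega}}{dt} + \boldsymbol{\omega}\times(\boldsymbol{\Theta}\,\boldsymbol{\omega}) = \mathbf{T} + \mathbf{T}^f. \tag{4}$$
Therein, $M$ is the mass, $\mathbf{X}$ the position vector to the center of mass
$\mathcal{M}$, $\rho$ the mass density, $\mathbf{b}$ the gravity and $V$ denotes the volume
of $P$. The tensor of inertia is represented by $\boldsymbol{\Theta}$. Furthermore, the
sum of the contact forces is stated as $\mathbf{F}$ and the fluid force that acts upon
the particle surface $\partial\Omega_p$ is considered by $\mathbf{F}^f$. The torques that
are caused by $\mathbf{F}$ and $\mathbf{F}^f$ with respect to $\mathcal{M}$ are associated
with the quantities $\mathbf{T}$ and $\mathbf{T}^f$, respectively. Hence, the fluid forces
can be obtained by:
$$\mathbf{F}^f = \int_{\partial\Omega_p}\mathbf{t}\,dA \quad\text{and}\quad \mathbf{T}^f = \int_{\partial\Omega_p}\mathbf{r}\times\mathbf{t}\,dA. \tag{5}$$
The traction vector $\mathbf{t}$ on $\partial\Omega_p$ is defined as
$\mathbf{t} = \boldsymbol{\sigma}\cdot\mathbf{n}_f$, where $\mathbf{n}_f$ is the unit outward
normal vector and $\mathbf{r}$ is the position vector of a point on $\partial\Omega_p$ with
respect to $\mathcal{M}$.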
3 Constitutive Modeling of the Particle Phase

The particles, which are immersed in the fluid, are modeled as quasi-rigid spheres.
To describe their collision behavior, a force-displacement based approach relying on
repulsive models is used, from which the inter-particle forces are determined. In the
sections below, the relevant concepts of the contact models are stated briefly.
3.1 Normal Contact Model
The normal contact forces acting between colliding particles and between particles
and system boundaries are described by a constitutive viscoelastic model. For adhesive
particles in contact, the JKR theory, introduced by Johnson et al. [27],
is used to determine the resultant attractive van der Waals force $F_a^n$ in the contact
area (see also Maugis [35] for a detailed description of this model). As shown by
Loskofsky et al. [31], the JKR theory yields satisfactory results even in the case of
underwater adhesion. For the elastic contact force $F_e^n$, the
Hertzian law [18] constitutes a well-established model. If the particles to be treated
also have viscous material properties, a consistent phenomenological model
was presented in Brilliantov et al. [5, 6], where the effect of viscosity is considered
via an added dissipative force $F_d^n$. Thus, one obtains for the forces acting on a
particle:
$$F^n = F_e^n - F_a^n + F_d^n. \tag{6}$$
The elastic repulsive force based on the Hertzian contact law is determined by:
$$F_e^n = \frac{4}{3}\,E\,\sqrt{R}\;\delta^{3/2}, \tag{7}$$
where $\delta = \delta_i + \delta_j$ is the total particle compression, and $R$ and $E$ are
the effective radius and the effective Young's modulus of the contact pair $P_i$ and
$P_j$, respectively (see Hertz [18]). As a result of the mutual compression of the
particles, a circular area is formed in the contact zone. In the Hertz model, the
radius $a$ of this area, which is often called the contact radius, is related to the
total deformation via $a^2 = R\,\delta$.
According to the JKR model, it is implied that the adhesive force acts only within
the contact area. Here, the work of adhesion to separate a unit contact area of $P_i$
and $P_j$ in a liquid medium ($l$) is defined as
$W = \gamma_{il} + \gamma_{jl} - \gamma_{ij}$, where $\gamma$ describes
the respective interfacial energy (see, e.g., Loskofsky et al. [31]). Since the adhesive
force $F_a^n$ is opposed to the elastic force $F_e^n$, it reduces the elastic deformation
$\delta_e$, and one obtains for the total deformation:
$$\delta := \delta_e - \delta_a = \frac{a^2}{R} - \sqrt{\frac{2\pi W a}{E}}, \tag{8}$$
where the second term $\delta_a$ is due to adhesion (see, e.g., Maugis [35] for details).
Based on this model the difference between the elastic and the adhesive force is given by:
$$F_{ea}^n := F_e^n - F_a^n = \frac{4Ea^3}{3R} - 2a^2\sqrt{\frac{2\pi W E}{a}}. \tag{9}$$
A special case here is the situation when external forces are absent. If so, an
equilibrium contact area with radius $a_0$ is formed in the contact zone due to the mutual
compression $\delta_0$ of the particles caused just by their adhesive attraction. Both
quantities are defined as:
$$a_0 = \left(\frac{9\pi W R^2}{2E}\right)^{1/3} \quad\text{and}\quad \delta_0 = \left[\frac{3R}{4}\left(\frac{\pi W}{E}\right)^{2}\right]^{1/3}. \tag{10}$$
To separate the particles, one has to apply a traction force under which they undergo
minute stretching deformations, forming a connecting neck around the contact zone.
Once the pulling force has reached a critical level, i.e., $F^n = -F_c^n$, the contact
breaks. Here, the critical force is obtained by $F_c^n = 3\pi W R/2$ and the corresponding
critical deformation of the particle pair is $\delta_c = a_0^2/(48^{1/3} R)$. That means
the pulling distance at which the particles detach is defined as $\delta = -\delta_c$. By
incorporating these critical quantities, one obtains for the displacement $\delta$ in (8)
and the force $F_{ea}^n$ in (9) the following dimensionless relationships (see Chokshi
et al. [9]):
$$\frac{\delta}{\delta_c} = 6^{1/3}\left[2\left(\frac{a}{a_0}\right)^{2} - \frac{4}{3}\left(\frac{a}{a_0}\right)^{1/2}\right] \tag{11}$$
$$\frac{F_{ea}^n}{F_c^n} = 4\left(\frac{a}{a_0}\right)^{3} - 4\left(\frac{a}{a_0}\right)^{3/2}. \tag{12}$$
To consider the properties of material viscosity, a dissipative force is adopted
according to Brilliantov et al. [5]. In that work, the definition of the force is given as
$F_d^n = A\,\dot{a}\,\partial F_{ea}^n/\partial a$. From this definition, the viscous force
can be written as follows:
$$F_d^n = A\,\dot{a}\left(\frac{4Ea^2}{R} - \frac{3}{2}\sqrt{8\pi W E a}\right), \tag{13}$$
where the dissipative factor $A$ is related to a constant function of material viscosity,
which can also be used as a fitting parameter.
In the present contribution, the force laws on which $F^n$ in (6) is based are
algorithmically treated as displacement driven (see, e.g., Wriggers [43]). Introducing the
penetration measure $g^n = (R_i + R_j) - \|\mathbf{X}_i - \mathbf{X}_j\| > 0$ and equating
$g^n$ with the mutual compression $\delta$ of $P_i$ and $P_j$, the individual parts of the
force $F^n$ can be computed in a straightforward manner once the contact radius $a$ has been
determined. To evaluate this quantity, one has to solve the nonlinear expression in (11)
for the currently calculated penetration of the examined pair of particles. The direction
of the contact force $\mathbf{F}^n = F^n\,\mathbf{n}$ of the respective particle is opposed
to the direction of its displacement, where
$\mathbf{n} = (\mathbf{X}_i - \mathbf{X}_j)/\|\mathbf{X}_i - \mathbf{X}_j\|$ is the unit
normal vector pointing from $\mathcal{M}_j$ towards $\mathcal{M}_i$.
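The displacement-driven evaluation described above can be sketched as follows: for a given penetration, the dimensionless relation (11) is solved for the contact radius, and the elastic-adhesive force then follows from (12). This is only a minimal sketch under stated assumptions. The bisection solver, the function names and the restriction to the stable branch $a/a_0 \ge 6^{-2/3}$ (where $\delta = -\delta_c$) are choices made for this example and are not taken from the authors' implementation.

```python
import math

def jkr_reference(W, R, E):
    """Equilibrium radius a0 (Eq. 10), critical deformation and pull-off force."""
    a0 = (9.0 * math.pi * W * R**2 / (2.0 * E)) ** (1.0 / 3.0)
    delta_c = a0**2 / (48.0 ** (1.0 / 3.0) * R)
    F_c = 1.5 * math.pi * W * R
    return a0, delta_c, F_c

def contact_radius(delta, W, R, E, tol=1e-12):
    """Solve Eq. (11) for the contact radius a by bisection on x = a / a0."""
    a0, delta_c, _ = jkr_reference(W, R, E)
    target = delta / delta_c
    f = lambda x: 6.0 ** (1.0 / 3.0) * (2.0 * x**2 - (4.0 / 3.0) * math.sqrt(x))
    lo = 6.0 ** (-2.0 / 3.0)            # minimum of f, reached at delta = -delta_c
    if target < f(lo):
        raise ValueError("contact is broken: delta < -delta_c")
    hi = 1.0
    while f(hi) < target:               # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * a0

def jkr_force(a, W, R, E):
    """Elastic minus adhesive force from Eq. (12)."""
    a0, _, F_c = jkr_reference(W, R, E)
    x = a / a0
    return 4.0 * F_c * (x**3 - x**1.5)
```

As a consistency check, a penetration equal to $\delta_0 = a_0^2/(3R)$ should return the equilibrium radius $a_0$ with a vanishing force, which follows directly from (10) to (12).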
3.2 Tangential Contact Model
The constitutive relation of Coulomb's law couples the tangential force $F^t$ via the
coefficient of friction to the normal force $F^n$ such that the relation
$F^t = \mu_d F^n$ holds for sliding and $F^t \le \mu_s F^n$ for sticking. Therein, the
dynamic and the static friction coefficients are denoted by $\mu_d$ and $\mu_s$,
respectively, where $\mu_d \le \mu_s$. For a constitutive treatment of $F^t$, a classical
tangential (linear) spring-dashpot element with an incorporated slider is used in this
work in order to model the tangential friction problem. For an overview and a discussion
of different tangential contact models proposed in the literature in the context of the
DEM see Kruggel-Emden et al. [28]. Here, a return mapping scheme is adopted for the
computation of the tangential force (see Luding [32], Wriggers [43]). This projection
method needs a tangential trial traction that takes the form:
$$\mathbf{F}_o^t = -\left(c^t\,\mathbf{g}^t + d^t\,\mathbf{v}^t\right). \tag{14}$$
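Therein, $\mathbf{g}^t$ is the elongation of the tangential spring, and $c^t$ and $d^t$ are
the tangential spring stiffness and the tangential dissipation parameter, respectively. The
tangential relative velocity at the contact point $\mathcal{C}$ is given by
$\mathbf{v}^t = \mathbf{v}^s - (\mathbf{v}^s\cdot\mathbf{n})\,\mathbf{n}$ with
$\mathbf{v}^s = \mathbf{v}_i^C - \mathbf{v}_j^C$ as the relative velocity at $\mathcal{C}$,
where the corresponding local velocities are defined by
$\mathbf{v}_i^C = \mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_i$ and
$\mathbf{v}_j^C = \mathbf{U}_j + \boldsymbol{\omega}_j\times\mathbf{r}_j$. The vectors
pointing from $\mathcal{M}_i$ and $\mathcal{M}_j$ to $\mathcal{C}$ are
$\mathbf{r}_i = R_i(-\mathbf{n})$ and $\mathbf{r}_j = R_j\,\mathbf{n}$, respectively. By
introducing a trial function $f^{tr}$, the following relation can be stated for the
tangential contact:
$$f^{tr} := \|\mathbf{F}_o^t\| - \mu_s\|\mathbf{F}^n\| \;\begin{cases} \le 0: & \text{stick} \\ > 0: & \text{slip.} \end{cases} \tag{15}$$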
If $f^{tr} \le 0$, the contact point $\mathcal{C}$ is in the stick region, and if
$f^{tr} > 0$, it is in the slip region. In the latter case, sliding occurs in the contact
area. Note that if $\|\mathbf{F}_o^t\| < \mu_d F^n$ becomes valid during a sliding process,
then the stick case comes into effect. If the contact point sticks within the current time
step, the actual tangential spring $\mathbf{g}^t$ is incremented for the succeeding time
step by the relation $\Delta\mathbf{g}^t = \mathbf{v}^t\,\Delta t_D$. Consequently, the new
spring length is defined by $\mathbf{g}^t = \mathbf{g}^t + \Delta\mathbf{g}^t$. Here,
$\Delta t_D$ denotes the time step size of the DEM. But if the contact point slides within
the time step, then the tangential spring is adjusted by means of:
$$\mathbf{g}^t = -\frac{1}{c^t}\left(\mu_d F^n\,\mathbf{t} + d^t\,\mathbf{v}^t\right) \tag{16}$$
in order to fulfill Coulomb's slip condition. Therein,
$\mathbf{t} = \mathbf{F}_o^t/\|\mathbf{F}_o^t\|$ is the direction of the trial traction. In
two subsequent time steps, the contact area might be slightly rotated. To take this into
account, one can, as proposed in Luding [32], project the tangential spring onto the
current rotated contact area at the beginning of each new time step via
$\mathbf{g}^t = \mathbf{g}^t - (\mathbf{g}^t\cdot\mathbf{n})\,\mathbf{n}$. Finally, in the
context of the return mapping scheme, the tangential contact force is
$F^t = \|\mathbf{F}_o^t\|$ if $f^{tr} \le 0$ holds and $F^t = \mu_d F^n$ if $f^{tr} > 0$.
By computing $F^t$, one obtains $\mathbf{F}^t = F^t\,\mathbf{t}$ and
$\mathbf{T}^t = \mathbf{r}\times\mathbf{F}^t$, which contribute to $\mathbf{F}$ and
$\mathbf{T}$ in (3) and (4), respectively.
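The return mapping scheme of (14) to (16) can be summarized in a few lines. The sketch below is a hedged illustration for a single contact; the function name `tangential_force`, its argument list and the use of `abs(F_n)` as the magnitude of the normal force are assumptions made for this example.

```python
import numpy as np

def tangential_force(g_t, v_t, n, F_n, c_t, d_t, mu_s, mu_d, dt_dem):
    """Return the tangential force vector and the spring for the next DEM step."""
    g_t = g_t - np.dot(g_t, n) * n                 # project spring onto contact plane
    F_trial = -(c_t * g_t + d_t * v_t)             # trial traction, Eq. (14)
    f_tr = np.linalg.norm(F_trial) - mu_s * abs(F_n)   # trial function, Eq. (15)
    if f_tr <= 0.0:
        # stick: keep the trial force and elongate the spring by v_t * dt
        return F_trial, g_t + v_t * dt_dem
    # slip: scale the force back to the Coulomb limit and reset the spring, Eq. (16)
    t = F_trial / np.linalg.norm(F_trial)
    g_slip = -(mu_d * abs(F_n) * t + d_t * v_t) / c_t
    return mu_d * abs(F_n) * t, g_slip
```

After a slip step the adjusted spring reproduces exactly the Coulomb force, i.e. evaluating (14) with the returned spring gives a force of magnitude $\mu_d F^n$.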
3.3 Rolling Resistance Model
During a rolling motion of two particles the leading part of the contact area is
continuously compressed and the trailing part is decompressed with respect to the rolling
direction. In case of an attractive van der Waals force in the contact area, the particles
experience an opposing torque that generates rolling resistance. Here, a model consisting
of a rolling spring-dashpot-slider element is adopted (see Iwashita and Oda [22]). The
opposing torque is given by:
$$\mathbf{M}_{ro} = -\left(c_\theta\,\boldsymbol{\theta} + d_\theta\,\dot{\boldsymbol{\theta}}\right). \tag{17}$$
Therein, $c_\theta$ is the rolling stiffness, $d_\theta$ the rolling viscosity coefficient,
$\boldsymbol{\theta}$ the relative particle rotation and $\dot{\boldsymbol{\theta}}$ denotes
the corresponding relative rotational velocity. By introducing a trial force
$\mathbf{F}_{ro}$ that induces an equivalent torque to $\mathbf{M}_{ro}$, the problem of
rolling resistance can be treated algorithmically like the model of tangential friction
(see Luding [33]). In this regard, the equivalent formulation can be stated as follows:
$$\mathbf{M}_{ro} = R\,\mathbf{n}\times\mathbf{F}_{ro}, \tag{18}$$
with $\mathbf{F}_{ro} = -(c^r\,\mathbf{g}^r + d^r\,\mathbf{v}^r)$, $c^r = c_\theta/R^2$ and
$d^r = d_\theta/R^2$. Here, $\mathbf{g}^r$ is the elongation of the spring and
$\mathbf{v}^r$ denotes the rolling velocity that can be computed according to
Kuhn and Bagi [29] using:
$$\mathbf{v}^r = R\left[(\boldsymbol{\omega}_i - \boldsymbol{\omega}_j)\times\mathbf{n} - \frac{1}{2}\left(\frac{1}{R_j} - \frac{1}{R_i}\right)\mathbf{v}^t\right]. \tag{19}$$
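Assuming that the slider can sustain a certain critical rolling resistance torque
$M_{cr}$, one can write, analogously to (15):
$$f_r^{tr} := \|\mathbf{F}_{ro}\| - \frac{M_{cr}}{R} \;\begin{cases} \le 0: & \text{stick} \;\rightarrow\; F^r = \|\mathbf{F}_{ro}\| \\ > 0: & \text{slip} \;\rightarrow\; F^r = M_{cr}/R. \end{cases} \tag{20}$$
With regard to the numerical handling of the spring in this context, the respective
relationships can be expressed in a summarized form as:
$$f_r^{tr} := \begin{cases} \le 0: & \mathbf{g}^r = \mathbf{g}^r + \Delta\mathbf{g}^r, \quad \Delta\mathbf{g}^r = \mathbf{v}^r\,\Delta t_D \\[4pt] > 0: & \mathbf{g}^r = -\dfrac{1}{c^r}\left(F_c^r\,\mathbf{t}^r + d^r\,\mathbf{v}^r\right), \quad \mathbf{t}^r = \dfrac{\mathbf{F}_{ro}}{\|\mathbf{F}_{ro}\|}, \quad F_c^r = \dfrac{M_{cr}}{R}. \end{cases} \tag{21}$$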
Fig. 1 a Distribution of the Lebedev quadrature points for the case of $N_L = 302$ points.
b Steplike discretization of a particle for the evaluation of fluid forces. c A detail of
the domain showing the classification of computational elements
In this model, the projection direction of the spring relies on the common rolling
direction of the pair of particles in contact. Thus, the projection condition is
defined as $\mathbf{g}^r = (\mathbf{g}^r\cdot\mathbf{t})\,\mathbf{t}$, where
$\mathbf{t} = \mathbf{v}^r/\|\mathbf{v}^r\|$. By computing $F^r$, one obtains the pseudo
force $\mathbf{F}^r = F^r\,\mathbf{t}^r$ and the rolling resistance torque
$\mathbf{M}^r = R\,\mathbf{n}\times\mathbf{F}^r$. Finally, the torques for the examined pair
of particles $\{P_i, P_j\}$ can be written as follows:
$\mathbf{M}_i^r = -\mathbf{M}_j^r =: \mathbf{M}^r$.
4 Phase Coupling
In the fixed grid approach, the mesh of the flow field does not coincide with the
boundaries of the particles. Hence, information between the Eulerian and Lagrangian
description has to be transferred. A further challenging issue concerning the coupling
of the phases is the computation of the fluid forces acting on the particle surfaces.
4.1 Evaluation of the Fluid Forces
A crucial point regarding the study of fluid-particle interactions in a fully resolved
3D DNS framework is the computation of the fluid forces to which the immersed
particles are subjected. To carry out this task, two different strategies, as shown in
Fig. 1a, b, have been considered in this work. In the following, the approach illustrated
in (a) is abbreviated as AP1 and the one in (b) as AP2.
Approach AP1 For the integration of the fluid forces acting on the surface of a
particle, a quadrature rule can be used that was developed in Lebedev and Laikov
[30]. In Fig. 1a, the distribution of the Lebedev integration points, which can also be
seen as Lagrangian force points, is displayed for the case of $N_L = 302$ points. This
numerical integration applied to particle $P_i$ yields:
$$\mathbf{F}_i^f = \int_{\partial\Omega_p^i}\mathbf{t}\,dA = \sum_{k=1}^{N_L}(d\mathbf{F})_k = J\sum_{k=1}^{N_L} w_k\,\mathbf{t}_k \tag{22}$$
$$\mathbf{T}_i^f = \int_{\partial\Omega_p^i}\mathbf{r}\times\mathbf{t}\,dA = \sum_{k=1}^{N_L}\mathbf{r}_k\times(d\mathbf{F})_k = J\sum_{k=1}^{N_L}\mathbf{r}_k\times w_k\,\mathbf{t}_k. \tag{23}$$
Therein, $\mathbf{t}_k = (\bar{\boldsymbol{\sigma}}\cdot\mathbf{n})_k$ is the traction vector
at the $k$th Lebedev point, $w_k$ the corresponding integration weight and
$\bar{\boldsymbol{\sigma}} = (1/V_e)\int_{\Omega_e}\boldsymbol{\sigma}\,dV$ specifies the
averaged fluid stress tensor of the finite element with volume $V_e$ in which the point $k$
is located. The mapping from a unit sphere to $P_i$ is performed via the relation
$J = 4\pi R_i^2$, where $R_i$ is the particle radius.
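The structure of (22) can be illustrated with the smallest Lebedev rule. The sketch below does not use the 302-point rule of this work; as an assumption for brevity it uses the 6-point rule (octahedron vertices with equal weights summing to one), which is exact for polynomials up to degree 3. The callback `stress_at` and the hydrostatic test field are hypothetical stand-ins for the element-averaged stress of the flow solver.

```python
import numpy as np

# 6-point Lebedev rule on the unit sphere: octahedron vertices, equal weights.
POINTS = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                   [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
WEIGHTS = np.full(6, 1.0 / 6.0)

def fluid_force(center, radius, stress_at):
    """Eq. (22): F = J * sum_k w_k t_k with t_k = sigma(x_k) . n_k."""
    J = 4.0 * np.pi * radius**2                 # maps the unit sphere to the particle
    force = np.zeros(3)
    for n, w in zip(POINTS, WEIGHTS):
        x = center + radius * n                 # Lagrangian force point on the surface
        force += w * (stress_at(x) @ n)         # weighted traction at this point
    return J * force

# Hydrostatic test field: sigma = -p I with p = p0 - rho_g * x3.
rho_g, p0, radius = 980.0, 5.0, 0.75
hydrostatic = lambda x: -(p0 - rho_g * x[2]) * np.eye(3)
F = fluid_force(np.array([5.0, 5.0, 12.75]), radius, hydrostatic)
```

For this linear pressure field the quadrature reproduces the buoyancy force, i.e. a vertical force equal to `rho_g` times the sphere volume, because the integrand is a quadratic polynomial on the sphere.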
Approach AP2 In this approach to the evaluation of the fluid forces exerted on
$P_i$, the shape of the particle surface is reproduced on the basis of the computational
grid, see Fig. 1b. Here, the calculation of $\mathbf{F}_i^f$ and $\mathbf{T}_i^f$ is carried
out using the steplike reconstructed surface contour of $P_i$. To determine these forces
from this surface, one has to compute the averaged tractions on the respective element
faces $\Gamma_j$ ($j = 1, \ldots, N_\Gamma$), and subsequently the corresponding forces can
simply be summed up, yielding:
$$\mathbf{F}_i^f = \int_{\partial\Omega_p^i}\mathbf{t}\,d\Gamma = \sum_{j=1}^{N_\Gamma}(d\mathbf{F})_j = \sum_{j=1}^{N_\Gamma}\bar{\mathbf{t}}_j\,\Gamma_j \tag{24}$$
$$\mathbf{T}_i^f = \int_{\partial\Omega_p^i}\mathbf{r}\times\mathbf{t}\,d\Gamma = \sum_{j=1}^{N_\Gamma}\mathbf{r}_j\times(d\mathbf{F})_j = \sum_{j=1}^{N_\Gamma}\mathbf{r}_j\times\bar{\mathbf{t}}_j\,\Gamma_j, \tag{25}$$
where $\mathbf{r}_j$ is the position vector of the center of the element surface
$\Gamma_j$ with respect to $\mathcal{M}_i$.
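The face summation in (24) can be sketched on a uniform hexahedral grid. In the example below, cells whose center lies inside the sphere are flagged, and the tractions of a hydrostatic pressure field are summed over the exposed faces normal to $e_3$. The function name, the choice of a hydrostatic test field and the restriction to the vertical force component are assumptions made to keep the sketch short.

```python
import numpy as np

def steplike_vertical_force(radius, h, rho_g, p0=1.0):
    """Sum t_j * Gamma_j over the faces normal to e3 of a voxelized sphere."""
    n = int(np.ceil(radius / h)) + 1
    c = (np.arange(-n, n) + 0.5) * h                 # cell-center coordinates
    X, Y, Z = np.meshgrid(c, c, c, indexing="ij")
    inside = X**2 + Y**2 + Z**2 <= radius**2         # cells flagged by their center
    pad = np.pad(inside, 1, constant_values=False)
    top = inside & ~pad[1:-1, 1:-1, 2:]              # faces with outward normal +e3
    bottom = inside & ~pad[1:-1, 1:-1, :-2]          # faces with outward normal -e3
    pressure = lambda z: p0 - rho_g * z              # hydrostatic field, t = -p n
    area = h * h
    F3 = area * (np.sum(-pressure(Z[top] + 0.5 * h))
                 + np.sum(pressure(Z[bottom] - 0.5 * h)))
    return F3, inside.sum() * h**3                   # force and steplike volume
```

For a linear pressure field the summed force equals `rho_g` times the volume of the steplike body, not of the exact sphere, which makes the geometric error of the steplike reconstruction directly visible.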
Remark To characterize the motion of the DEM particles sliding through the mesh,
the mesh elements are labeled as depicted in Fig. 1c. By using the position of the
element center point $\mathcal{E}$ as an assignment criterion, the elements that coincide
with a particle domain $\Omega_p^i$ are marked at each time step by the particle number of
$P_i$. Here, interior elements and boundary elements of $P_i$ carry separate labels with
index $i$, and pure fluid elements are labeled with index 0.
4.2 Coupling Constraints
For the coupling process, the rigid body motion of the particles is imposed on the
flow field. The rigidity constraints due to $P_i$ that are applied to the Navier–Stokes
equations can be regarded as additional Dirichlet no-slip boundary conditions (see,
e.g., Wan and Turek [42]). The constraint for an interior velocity node
$V \in \Omega_p^i$ of a boundary element is given by:
$$\mathbf{u}_f = \mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p, \tag{26}$$
where $\mathbf{r}_p$ is the vector pointing from $\mathcal{M}_i$ to the considered velocity
node. However, for a velocity node $V \in \Omega_p^i$ that adjoins the fluid phase, the
velocity constraint is defined as:
$$\mathbf{u}_f = (1 - \alpha_A)\,\mathbf{u}_f + \alpha_A\,(\mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p), \tag{27}$$
where $\alpha_A$ is the element face area fraction situated within $\Omega_p^i$. Hence,
$\alpha_A$ acts as a weighting factor for $V$.¹ But if $V \notin \Omega_p^i$, then a
nonlinear weighted strategy is applied according to Luo et al. [34], which reads:
$$\mathbf{u}_f = (1 - \alpha_R)\,\mathbf{u}_f + \alpha_R\,(\mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p), \tag{28}$$
where the interpolation factor $\alpha_R = e^{-Re_p\,|\xi|}$ is a nonlinear function of the
relative Reynolds number $Re_p = \rho_f\,\|\mathbf{U}_i - \mathbf{u}_f\|\,D_i/\mu_f$ and of
the relative distance $\xi = h/D_i$. Here, $D_i$ and $h$ denote the particle diameter and
the distance from $V$ to the surface of $P_i$, respectively.
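The three constraints (26) to (28) share one structure: the fluid velocity at a node is blended with the rigid body velocity of the particle by a weighting factor. A minimal sketch of this blending is given below; the function names are hypothetical, and the weighting factor is passed in as a plain number, since its evaluation (area fraction or distance-based factor) depends on the node type.

```python
import numpy as np

def rigid_velocity(U, omega, r_p):
    """Rigid body velocity U + omega x r_p at a node, as in Eq. (26)."""
    return U + np.cross(omega, r_p)

def constrained_velocity(u_f, U, omega, r_p, weight):
    """Blend of fluid and rigid body velocity; weight = 1 recovers Eq. (26)."""
    return (1.0 - weight) * u_f + weight * rigid_velocity(U, omega, r_p)
```

With a weight of one the node follows the particle exactly, with a weight of zero the fluid velocity is left unchanged, and intermediate values interpolate linearly between the two.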
5 Solution Algorithms
5.1 FEM Solver for the Fluid Problem
A spatial and temporal discretization of (1) yields a set of nonlinear algebraic
equations for the fluid velocity $\mathbf{u}_f$ and the pressure $p$. The resulting coupled
equation system, which has to be solved at each time step, can be written as follows:
$$\begin{bmatrix} \mathbf{M} + \theta_1\Delta t\,\mathbf{N}(\mathbf{u}_f^{n+1}) & \theta_2\Delta t\,\mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{u}_f^{n+1} \\ p^{n+1} \end{bmatrix} = \begin{bmatrix} \left(\mathbf{M} - \theta_3\Delta t\,\mathbf{N}(\mathbf{u}_f^{n})\right)\mathbf{u}_f^{n} \\ \mathbf{0} \end{bmatrix}. \tag{29}$$
Therein, $\mathbf{M}$ is the mass matrix, $\mathbf{N}$ the matrix including the diffusive,
convective and stabilization terms, $\mathbf{G}$ the gradient matrix, $\mathbf{G}^T$ the
divergence matrix, $\Delta t$ the current time-step size and $\theta_1$, $\theta_2$,
$\theta_3$ are parameters of the fractional step $\theta$-scheme, see Turek [38] for
details. To solve (29), the multigrid FEM solver FeatFlow [39] is applied in this
contribution.

¹ In the present work, the nonconforming rotated trilinear $Q_1/Q_0$ element pair is used,
where a nodal value at $V$ is the mean value of the velocity vector over the respective
element face area, see Turek [38] for details. The velocity nodes of this element are
located at the midpoints of the element faces.
5.2 DEM Solver for the Particle Problem
The movement of the particles is governed by the equations of motion given in
(3) and (4). With regard to the numerical integration of these equations in time, the
finite difference based Gear predictor-corrector scheme of third and fourth order is
applied. The Gear integration scheme is subdivided into three steps: (1) prediction
of all the kinematic variables, (2) evaluation of the forces according to Sect. 3 by
using the predicted variables and, subsequently, computation of the corresponding
accelerations and, finally, (3) correction of the predicted kinematic variables based
on the new accelerations. For the algorithmic details of the Gear scheme see Allen
and Tildesley [1].
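The three steps of the Gear scheme can be sketched for a single scalar degree of freedom. The example below uses a four-value variant for a second-order equation $\ddot{x} = a(x)$ with the scaled variables $[x,\ v\,\Delta t,\ a\,\Delta t^2/2,\ b\,\Delta t^3/6]$. The corrector coefficients $(1/6, 5/6, 1, 1/3)$ are quoted here as the commonly tabulated values for this variant and should be checked against Allen and Tildesley [1]; they are an assumption of this sketch, as is the harmonic oscillator used as a test problem.

```python
import math

def gear4_step(r, accel, dt):
    """One predictor-corrector step; r = [x, v*dt, a*dt**2/2, b*dt**3/6]."""
    c = (1.0 / 6.0, 5.0 / 6.0, 1.0, 1.0 / 3.0)   # corrector coefficients (assumed)
    # (1) prediction of all kinematic variables by Taylor expansion
    p0 = r[0] + r[1] + r[2] + r[3]
    p1 = r[1] + 2.0 * r[2] + 3.0 * r[3]
    p2 = r[2] + 3.0 * r[3]
    p3 = r[3]
    # (2) force evaluation with the predicted position
    delta = 0.5 * dt**2 * accel(p0) - p2
    # (3) correction of the predicted variables
    return [p0 + c[0] * delta, p1 + c[1] * delta,
            p2 + c[2] * delta, p3 + c[3] * delta]

# Harmonic oscillator x'' = -x with x(0) = 1, v(0) = 0, integrated to t = 1.
dt = 1.0e-3
r = [1.0, 0.0, -0.5 * dt**2, 0.0]
for _ in range(1000):
    r = gear4_step(r, lambda x: -x, dt)
x_end = r[0]
```

For this test problem the computed position can be compared with the exact solution $\cos(t)$.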
Remark An aspect that also has to be considered when developing a FEM-DEM coupling
scheme is the widely differing computational time scale of the two numerical methods.
Owing to the displacement-driven character of the force computation in the DEM, one
usually has $\Delta t_D \ll \Delta t$. To handle this problem of unequal time scales, a
subcycling strategy can be used as suggested by Feng et al. [13].
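A subcycling strategy of this kind can be sketched as follows: each fluid time step is divided into an integer number of equal DEM sub-steps that do not exceed the admissible DEM step size. The callbacks `fluid_step` and `dem_step`, the order in which they are called and the small tolerance in the rounding are assumptions of this sketch, not details taken from Feng et al. [13].

```python
import math

def subcycle_plan(dt_fluid, dt_dem_max):
    """Number and size of DEM sub-steps that exactly fill one fluid step."""
    # the small offset guards against round-off when the ratio is an integer
    n_sub = max(1, math.ceil(dt_fluid / dt_dem_max - 1e-9))
    return n_sub, dt_fluid / n_sub

def advance_coupled(dt_fluid, dt_dem_max, fluid_step, dem_step):
    """One staggered step: a single fluid solve followed by n_sub DEM steps."""
    fluid_step(dt_fluid)
    n_sub, dt_dem = subcycle_plan(dt_fluid, dt_dem_max)
    for _ in range(n_sub):
        dem_step(dt_dem)
    return n_sub
```

Choosing equal sub-steps keeps both solvers synchronized at the end of every fluid step, at the price of a DEM step that may be slightly smaller than the admissible one.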
5.3 Search Algorithms
The computation of the contact forces is the most CPU-time-consuming part of a DEM
simulation. Here, the contact detection has to be restricted to the
neighbors of a particle, since they are its only relevant potential contact partners. For
this purpose, the Verlet-List and the Linked-Cell methods are combined in order to yield a
fast contact search algorithm (see, e.g., Allen and Tildesley [1], Pöschel and Schwager
[36]).
In the Verlet-List procedure a list of neighboring particle indices is maintained for
each particle in the system. By defining a Verlet distance threshold value vd, a pair
of particles can be considered as neighbors if vd > $|g^n|$ holds. Once the Verlet-List
is built, the contact detection needs only to be evaluated for the neighboring pairs.
As a result, the computational effort of this task scales with $O(N)$. Certainly, the list
has to be updated at some intervals. In this regard, a possible rebuild criterion can be
defined by $s_{max} \ge 0.6\,$vd, where $s_{max}$ is the largest
displacement of a particle since the last list update (see Pöschel and Schwager [36]).
But the construction of the list in a straightforward manner scales with $O(N^2)$; thus,
one has to speed up this task in order to obtain a search algorithm that scales overall
with $O(N)$. For this purpose, the computational domain is divided into cubic cells
of uniform edge length, where the cells are slightly larger than the largest particle in
the system. After assigning all particles to the cells according to their center of mass,
the relevant particles for the construction of the neighbor list of $P_i$ are those that
are referenced to the group of 27 cells consisting of the owner cell of $P_i$ and its
26 directly surrounding cells.
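The combination of both search methods can be sketched as follows. Particles are binned into cubic cells, and the neighbor list is then built from the 27 cells around each particle. As assumptions of this sketch, the cell edge is set to the largest particle diameter plus the Verlet distance, so that no neighbor pair can be missed, and a pair is listed when the surface gap between the two spheres is smaller than the Verlet distance. The function name `build_verlet_list` is hypothetical.

```python
import itertools
from collections import defaultdict
import numpy as np

def build_verlet_list(X, radii, vd):
    """Return the set of neighbor pairs (i, j), i < j, found via linked cells."""
    cell = 2.0 * radii.max() + vd                    # cell edge length (assumed)
    keys = np.floor(X / cell).astype(int)            # owner cell of each particle
    cells = defaultdict(list)
    for i, k in enumerate(keys):
        cells[tuple(k)].append(i)
    offsets = list(itertools.product((-1, 0, 1), repeat=3))   # 27 cells
    pairs = set()
    for i, k in enumerate(keys):
        for off in offsets:
            for j in cells.get(tuple(k + off), ()):
                if j > i:
                    gap = np.linalg.norm(X[i] - X[j]) - (radii[i] + radii[j])
                    if gap < vd:                     # close enough to be listed
                        pairs.add((i, j))
    return pairs
```

The list only has to be rebuilt when the rebuild criterion is met; in between, contact detection loops over the stored pairs, which keeps the cost per time step proportional to the number of particles.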
For a fast assignment of the element flags and, furthermore, in order to localize
efficiently the elements containing the integration points for the computation of the
fluid forces on the particles, the approach of the Linked-Cell method is used analo-
gously. The Linked-Cell algorithm generates in this case an element list referring to
the same cell structure as for the particles. Here, an element is referenced to a cell
with respect to the position of the element's center point $\mathcal{E}$. Consequently, for
the application of the velocity constraints related to particle $P_i$ and for the
computation of its fluid forces, only those elements are significant that are binned into
the group of 27 cells relevant for $P_i$. In order to reduce the trial computations,
some elements can also be excluded in advance from detailed consideration if the
distance between $\mathcal{E}$ and the particle $P_i$ is larger than a threshold value that
can be chosen according to the largest element size in the computational domain.
6 Numerical Simulations
The numerical results of three computed test problems obtained by the presented
approach are discussed next. The first test problem is the sedimentation of one particle
in a box. In the second simulation example, the sedimentation of two particles in a
row is considered in order to mimic their drafting-kissing-tumbling effect, and the
last example deals with a particle-laden flow through a tube with changing cross
section.
6.1 Sedimentation of a Single Particle in a Box
In this example, the sedimentation of a single particle in a box filled with fluid is
examined. The considered system is shown in Fig. 2. This system corresponds to the
setup that was experimentally investigated in ten Cate et al. [37] where the authors
measured the settling velocity of the immersed particle under the action of gravity
in four test cases, each with a different fluid. In the following, the obtained simula-
tion results for the cases with minimum and maximum terminal particle Reynolds
numbers, Re p = 1.5 and Re p = 31.9, are presented. Previously, the sedimentation
problem of ten Cate et al. [37] was computed by, among others, Veeramani et al. [40]
and Feng and Michaelides [14].
To discretize the box in Fig. 2, a uniform mesh consisting of 819,200 $Q_1/Q_0$
elements (80 × 80 × 128 elements) is used. All the simulations were carried out
by imposing no-slip velocity conditions at the box boundaries. The diagrams in
Fig. 3 show the computed temporal evolutions of the settling velocity of the particle
Box dimension:     10 × 10 × 16 cm
Particle position: (5/5/12.75) cm
Particle radius:   R = 0.75 cm
Particle density:  ρ_P = 1.12 g/cm³
Gravity:           b = 980 cm/s²
Fluid:
Case   ρ_f [g/cm³]   ν_f [cm²/s]   Re_P = U_∞ D/ν_f
C1     0.970         3.8454        1.5
C2     0.960         0.6042        31.9
Fig. 2 Sedimentation of a single particle in a box. Geometry and material data
Fig. 3 Evolution of the settling velocity of the particle for a case C1 and b case C2
(axes: settling velocity $U_3$ [cm/s] versus time $t$ [s]; legend entries: AP1, AP2,
experiment [37], numerical results [40] and [14], and terminal velocities from the
correlation equations [8], [7] and [10])
center in the direction of gravity for the two cases considered. As can be seen,
each case was simulated by means of both AP1 and AP2. Every diagram
also includes the predicted terminal velocity of the particle based on the correlation
equations (see footnote 2) suggested in Clift et al. [10], Brown and Lawler [7] and Cheng [8], and
furthermore, the numerical results of Veeramani et al. [40] and those of Feng and
Michaelides [14]. At the beginning of the experiment, the particle is at rest, and it
accelerates due to the action of gravity. It is observed that when the gravitational
and the drag forces reach a state of equilibrium, the particle will sediment with a
uniform velocity, which is called the terminal velocity. The simulation results show that
the presented model, based on either AP1 or AP2, is capable of predicting the evolution
of the particle's settling velocity. The maximum discrepancy of the predictions with
² In general, correlations for drag and terminal settling velocity are valid for a particle
in an infinite domain, but they still provide reasonable results for a relatively large
distance between a particle and system boundaries.
Fig. 4 Contour plots of the normalized velocity magnitude in the symmetry plane at different
points in time. The plots of the upper row belong to case C1 (t = 0.4, 1.3, 1.9, 2.6 and
3.7 s) and those of the lower row to case C2 (t = 0.2, 0.5, 0.7, 0.9 and 1.2 s)
respect to the experimental data is less than 8 %. In addition, the obtained results are
in very good agreement with those of Veeramani et al. [40], but there is a small
mismatch compared with the results of Feng and Michaelides [14]. Figure 4 shows
the computed contour plots of the normalized velocity magnitude
$\|\mathbf{u}_f\|/U_\infty$ in the symmetry plane. Accordingly, the depicted contours range
between 0 and 1, with an equal spacing of 0.1. These plots agree well with those
of ten Cate et al. [37], which shows that the presented model is well suited to reproduce
their sedimentation experiments. There is also a good agreement with the computed plots
given in Apte et al. [2].
6.2 Sedimentation of Two Particles in a Box
This benchmark simulation is carried out to reproduce the drafting-kissing-tumbling
effect of two particles sedimenting one behind the other, as can be observed in laboratory
experiments (see, e.g., the experiment reported in Fortes et al. [15]). Figure 5 shows
the system with the particles studied in this benchmark test. The depicted setup
has already been investigated numerically by Glowinski et al. [16], Apte et al. [2]
and Breugem [4]. For the discretization of the computational domain, a uniform mesh
of 1,048,576 Q1/Q0 elements (64 × 64 × 256 elements) is used. It is assumed that
Box dimensions: 1 × 1 × 4 cm
Particle positions: P1: (0.5/0.5/3.5) cm, P2: (0.5/0.5/3.167) cm
Gravity: b = 980 cm/s^2
Fluid: viscosity ν_f = 0.01 cm^2/s, density ρ_f = 1.00 g/cm^3
Particles: radius R1 = R2 = 1/12 cm, density ρ_P1 = ρ_P2 = 1.14 g/cm^3
Young modulus: E = 10^6 N/cm^2
Poisson ratio: ν = 0.25
Friction coefficients: μ_s/μ_d = 0.35/0.32
Damping coefficient: A = 5 · 10^-5 s
Fig. 5 Sedimentation of two particles in a box. Geometry and material data
Fig. 6 Evolution of the settling velocities of the particles for the drafting-kissing-tumbling problem
(settling velocity versus time t [s], 0 to 0.8 s): (a) results of the present work and (b) comparison of
the results with the literature
the box is completely filled with fluid. Concerning the boundary conditions of the fluid
domain, no-slip velocity conditions are applied at the box walls. To make
the particles tumble when they kiss each other, a slight initial horizontal offset
in the position of P1 is introduced, so that P1 starts at (0.5075/0.5075/3.5) cm. This offset
is necessary here because previous test computations based
on different uniform symmetric meshes showed that the implemented algorithm
preserves the symmetry of the system almost completely. Thus, the lateral disturbances
in the flow field are too weak to trigger tumbling (as would happen
if, for instance, a free or merely an anisotropic mesh were used).
In order to verify that the numerical model is able to predict the terminal settling
velocity of an almost undisturbed sedimenting single particle within
this benchmark setup, the sedimentation of P2 was computed in advance without
the presence of P1. Figure 6a shows the time history of the falling velocity
obtained with AP2, and one can see that the predicted terminal velocity matches the values
obtained with the correlations. This diagram also contains the computed evolution of
the particle velocities obtained with the approaches AP1 and AP2
for the drafting-kissing-tumbling case. It can be observed that
the results of both simulations agree quite well. As a further
verification of the presented algorithm and of its implementation, the predictions of
AP2 are compared with other numerical predictions from the literature
for the same setup, see Fig. 6b. The agreement is generally good, and
the presented results are particularly consistent with the simulation results of Apte
et al. [2] and Breugem [4]. It should be underlined that all predicted
evolutions in Fig. 6b rely on different FD method concepts.
Fig. 7 Numerical results of the drafting-kissing-tumbling simulation at eight points in time
In Fig. 7, the process of drafting-kissing-tumbling is displayed at eight different
points in time. One observes that once the trailing particle is located within the gradually
growing wake region of the leading particle, it experiences a lower drag force.
Consequently, the trailing particle falls faster than
the leading particle (drafting). With increasing time, the gap between both particles
decreases due to their velocity difference, and thus the particles eventually come
into contact (kissing). Clearly, this configuration is unstable. The particles tumble
as a consequence and start to separate (tumbling). The flow phenomena observed
in the experiment of Fortes et al. [15] are thus reproduced by the presented
computational approach.
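The reference terminal velocity used in these sedimentation benchmarks follows from a balance between the submerged weight of the sphere and the drag force. The sketch below illustrates that balance for the data of Fig. 5 in CGS units. It is only an illustration under stated assumptions: the Schiller-Naumann drag law used here is a common textbook correlation swapped in for convenience, not one of the correlations of [7, 8, 10], and the function names are hypothetical.

```python
import math


def drag_coefficient_schiller_naumann(re):
    """Sphere drag coefficient (Schiller-Naumann), an assumed stand-in correlation.

    Commonly used for particle Reynolds numbers below roughly 1000.
    """
    return 24.0 / re * (1.0 + 0.15 * re ** 0.687)


def terminal_velocity(radius, rho_p, rho_f, nu_f, g, tol=1e-10):
    """Velocity at which drag balances the submerged weight of a sphere."""
    diameter = 2.0 * radius
    submerged_weight = (rho_p - rho_f) * g * (4.0 / 3.0) * math.pi * radius ** 3

    def residual(u):
        re = u * diameter / nu_f
        cd = drag_coefficient_schiller_naumann(re)
        drag = 0.5 * rho_f * cd * math.pi * radius ** 2 * u ** 2
        return drag - submerged_weight

    # Bracket the root: the drag grows monotonically with the velocity.
    lo, hi = 1e-8, 1.0
    while residual(hi) < 0.0:
        hi *= 2.0
    # Bisection on the force balance.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Data of Fig. 5 in CGS units (cm, g, s).
    u_t = terminal_velocity(radius=1.0 / 12.0, rho_p=1.14, rho_f=1.0,
                            nu_f=0.01, g=980.0)
    print(f"estimated terminal velocity: {u_t:.3f} cm/s")
```

Such an estimate assumes an unbounded fluid, so it can only serve as a rough check for the single-particle run in the box.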
6.3 Particle-Laden Flow Through a Tube
In this test case, a particle-laden flow through a tube with different cross sections is
considered. It is assumed that the particles are allowed to adhere to each other as
well as to the tube wall. Figure 8 shows the geometry of the tube including the model
data, where the parameters W_PP and W_PW are the work of adhesion among particles
and between particles and the tube wall, respectively. Gravitational effects are not
considered. The suspended particles are randomly inserted at the inflow boundary
with an initial velocity that matches the inflow velocity of the fluid. To compute
the flow in the tube, a mesh consisting of 2,304,000 Q1/Q0 elements is used. This
yields a discretized system with 2,304,000 pressure nodes and 6,953,280 velocity
nodes. The time step sizes for the FEM and DEM solvers are selected as Δt = 10^-4 s
and Δt_D = 10^-6 s, respectively.
Fig. 8 Particle-laden flow through a tube. Geometry and material data
Fig. 9 Numerical results of the flow through the tube showing the velocity field at t = 6.5 s
The presence of van der Waals forces leads, with the chosen parameters, to a
deposition of suspended particles onto the tube wall. The deposition grows with
increasing simulation time, in particular at the end of the smaller cross section
part. With more particles adhering to the wall, the velocity increases locally, and the
particles experience higher drag forces. Such a situation is depicted in Fig. 9.
There, one can see the influence of the developed agglomerates on the velocity
field when they stick to the tube wall or slide along it. Owing to the fully coupled
description of the particulate flow, strong mutual phase interactions, as in the
situation shown in Fig. 9, can be captured by the developed DNS fluid-particle
solver. The strong local impact of the dispersed phase on the fluid phase is
obvious here. In Fig. 10, one can observe the corresponding effects on the pressure
field (see the pressure difference between the windward (luv) and the lee side of the large
agglomerate). When the traction forces exceed the adhesive forces, single
particles and agglomerates break off and are subsequently transported away by the
flow field. The complexity of the evolution of the multiphase flow at the
outflow of the smaller cross section part is reflected in Fig. 11. There, image (a),
which shows the flow at time t = 1.0 s, illustrates a fully developed axisymmetric
fluid flow with an eddy zone that has evolved in this region. Images
(b) and (c), at t = 2.5 s and t = 10.0 s, show that this axial symmetry is completely lost
and that a large number of particles have deposited on the wall. Image (d) shows
the same flow situation as (b), but with particle velocity vectors
and without streamlines. Here, the influence of the eddy on the trajectories of the
passing particles can be clearly seen. In fact, the particles that are fully caught by
the eddy reverse their direction of motion, so that they are subsequently transported against
the main flow in the tube. The different phases of the flow process are shown for the
whole system in Fig. 12.
Fig. 10 Numerical results of the flow through the tube showing the pressure field at t = 6.5 s
Fig. 11 Numerical results of the flow through the tube showing: (a)-(c) the streamlines and the particles
at different points in time and (d) the particle velocity vectors at t = 2.5 s. The colors represent in
all cases the velocity magnitudes
Fig. 12 Numerical results of the flow through the tube showing the fluid and particle velocities at
six points in time
7 Conclusion
In this work, a computational approach for the full 3D DNS of particulate flows is
presented. The approach is based on the FD method. The developed solver treats the
coupled fluid-particle problem in a staggered way by solving the phases explicitly in
succession. The mutual phase interactions, on the other hand, are computed implicitly.
The FEM and the DEM are applied as numerical solvers to simulate the
fluid and the particle phase, respectively. In the framework of the DEM, the particle collision is described
using an adhesive viscoelastic model, and additionally, friction and rolling resistance
are considered. To verify the reliability of the algorithm and its implementation,
various test computations were performed. In this contribution, the simulation results
of two sedimentation problems were presented and discussed. Furthermore, the solver
was applied to simulate an aggregation dynamics problem of a particulate flow where
the formation of agglomerates is considered. The chosen system for this task was
a tube with different cross sections. The obtained numerical results show that the
developed solver is appropriate to deal with fluid-particle interaction problems.
References
1. Allen MP, Tildesley DJ (1987) Computer simulation of liquids. Oxford University Press, New
York
2. Apte SV, Martin M, Patankar NA (2009) A numerical method for fully resolved simulation
(FRS) of rigid particle-flow interactions in complex flows. J Comput Phys 228(8):2712–2738
3. Avci B, Wriggers P (2012) A DEM-FEM coupling approach for the direct numerical simulation
of 3D particulate flows. J Appl Mech 79:010901
4. Breugem WP (2012) A second-order accurate immersed boundary method for fully resolved
simulations of particle-laden flows. J Comput Phys 231(13):4469–4498
5. Brilliantov NV, Albers N, Spahn F, Pöschel T (2007) Collision dynamics of granular particles
with adhesion. Phys Rev E 76(5, Part 1):051302
6. Brilliantov NV, Spahn F, Hertzsch JM, Pöschel T (1996) Model for collisions in granular gases.
Phys Rev E 53(5, Part B):5382–5392
7. Brown PP, Lawler DF (2003) Sphere drag and settling velocity revisited. J Environ Eng
129(3):222–231
8. Cheng NS (2009) Comparison of formulas for drag coefficient and settling velocity of spherical
particles. Powder Technol 189(3):395–398
9. Chokshi A, Tielens AGGM, Hollenbach D (1993) Dust coagulation. Astrophys J 407(2, Part
1):806–819
10. Clift R, Grace JR, Weber ME (1978) Bubbles, drops and particles. Academic Press, New York
11. Crowe CT (2006) Multiphase flow handbook. CRC Press, Boca Raton
12. Cundall PA, Strack ODL (1979) Discrete numerical model for granular assemblies.
Géotechnique 29(1):47–65
13. Feng YT, Han K, Owen DRJ (2007) Coupled lattice Boltzmann method and discrete element
modelling of particle transport in turbulent fluid flows: computational issues. Int J Numer Meth
Eng 72(9):1111–1134
14. Feng ZG, Michaelides EE (2005) Proteus: a direct forcing method in the simulations of
particulate flows. J Comput Phys 202(1):20–51
15. Fortes AF, Joseph DD, Lundgren TS (1987) Nonlinear mechanics of fluidization of beds of
spherical-particles. J Fluid Mech 177:467–483
16. Glowinski R, Pan TW, Hesla TI, Joseph DD, Periaux J (2001) A fictitious domain approach
to the direct numerical simulation of incompressible viscous flow past moving rigid bodies:
application to particulate flow. J Comput Phys 169(2):363–426
17. Haeri S, Shrimpton JS (2012) On the application of immersed boundary, fictitious domain and
body-conformal mesh methods to many particle multiphase flows. Int J Multiph Flow 40:38–55
18. Hertz H (1882) Über die Berührung fester elastischer Körper. Journal für die reine und
angewandte Mathematik 92:156–171
19. Hu HH (1996) Direct simulation of flows of solid-liquid mixtures. Int J Multiph Flow
22(2):335–352
20. Hu HH, Joseph DD, Crochet MJ (1992) Direct simulation of fluid particle motions. Theor
Comput Fluid Dyn 3:285–306
21. Hu HH, Patankar NA, Zhu MY (2001) Direct numerical simulations of fluid-solid systems
using the Arbitrary Lagrangian-Eulerian technique. J Comput Phys 169(2):427–462
22. Iwashita K, Oda M (1998) Rolling resistance at contacts in simulation of shear band
development by DEM. J Eng Mech 124(3):285–292
23. Johnson AA, Tezduyar TE (1996) Simulation of multiple spheres falling in a liquid-filled tube.
24. Johnson AA, Tezduyar TE (1997) 3D simulation of fluid-particle interactions with the number
of particles reaching 100. Comput Methods Appl Mech Eng 145(3–4):301–321
25. Johnson AA, Tezduyar TE (1999) Advanced mesh generation and update methods for 3D flow
simulations. Comput Mech 23(2):130–143
26. Johnson AA, Tezduyar TE (2001) Methods for 3D computation of fluid-object interactions in
spatially periodic flows. Comput Methods Appl Mech Eng 190(24–25):3201–3221
27. Johnson KL, Kendall K, Roberts AD (1971) Surface energy and contact of elastic solids. Proc
R Soc Lond A 324(1558):301–313
28. Kruggel-Emden H, Wirtz S, Scherer V (2008) A study on tangential force laws applicable to
the discrete element method (DEM) for materials with viscoelastic or plastic behavior. Chem
Eng Sci 63(6):1523–1541
29. Kuhn MR, Bagi K (2004) Alternative definition of particle rolling in a granular assembly. J
Eng Mech 130(7):826–835
30. Lebedev VI, Laikov DN (1999) Quadrature formula for the sphere of 131-th algebraic order
of accuracy. Dokl Akad Nauk SSSR 366(6):741–745
31. Loskofsky C, Song F, Newby BZ (2006) Underwater adhesion measurements using the JKR
technique. J Adhes 82(7):713–730
32. Luding S (2004) Micro-macro transition for anisotropic, frictional granular packings. Int J
Solids Struct 41(21):5821–5836
33. Luding S (2008) Cohesive, frictional powders: contact models for tension. Granular Matter
10(4):235–246
34. Luo K, Wang Z, Fan J (2007) A modified immersed boundary method for simulations of
fluid-particle interactions. Comput Methods Appl Mech Eng 197(1–4):36–46
35. Maugis D (1992) Adhesion of spheres: the JKR-DMT transition using a Dugdale model. J
Colloid Interface Sci 150(1):243–269
36. Pöschel T, Schwager T (2005) Computational granular dynamics. Springer, New York
37. ten Cate A, Nieuwstad CH, Derksen JJ, van den Akker A (2002) Particle imaging velocimetry
experiments and lattice-Boltzmann simulations on a single sphere settling under gravity. Phys
Fluids 14(11):4012–4025
38. Turek S (1999) Efficient solvers for incompressible flow problems: an algorithmic and com-
putational approach. Springer, Berlin
39. Turek S, Becker C (1998) FEATFLOW. Finite element software for the incompressible Navier-
Stokes equations. Institute for Applied Mathematics, University of Heidelberg, Heidelberg
40. Veeramani C, Minev PD, Nandakumar K (2007) A fictitious domain formulation for flows with
rigid particles: a non-Lagrange multiplier version. J Comput Phys 224(2):867–879
41. Wachs A (2009) A DEM-DLM/FD method for direct numerical simulation of particulate flows:
sedimentation of polygonal isometric particles in a Newtonian fluid with collisions. Comput
Fluids 38(8):1608–1628
42. Wan D, Turek S (2006) Direct numerical simulation of particulate flow via multigrid FEM
techniques and the fictitious boundary method. Int J Numer Meth Fluids 51(5):531–566
43. Wriggers P (2006) Computational contact mechanics, 2nd edn. Springer, Berlin
A Particle Finite Element Method (PFEM)
for Coupled Thermal Analysis of Quasi
and Fully Incompressible Flows
and Fluid-Structure Interaction Problems
Eugenio Oñate, Alessandro Franci and Josep M. Carbonell
Abstract We present a Lagrangian formulation for coupled thermal analysis of
quasi and fully incompressible flows and fluid-structure interaction (FSI) problems
that has excellent mass preservation features. The success of the formulation relies on
a residual-based stabilized expression of the mass balance equation obtained using
the Finite Calculus (FIC) method. The governing equations are discretized with the
FEM using simplicial elements with equal linear interpolation for the velocities,
the pressure and the temperature. The merits of the formulation in terms of reduced
mass loss and overall accuracy are verified in the solution of 2D and 3D adiabatic and
thermally-coupled quasi-incompressible free-surface flows and FSI problems using
the Particle Finite Element Method (PFEM). Examples include the sloshing of water
in a tank and the falling of a water sphere and a cylinder into a tank containing water.
1 Introduction
The analysis of thermally coupled flows and their interaction with structures is relevant
in many fields of engineering. In this work we present a Lagrangian numerical
technique for solving this kind of problem for quasi and fully incompressible fluids
using the Particle Finite Element Method (PFEM, www.cimne.com/pfem).
E. Oñate (✉) · A. Franci
Centre Internacional de Mètodes Numèrics en Enginyeria (CIMNE), Campus Norte UPC,
http://www.cimne.com
A. Franci
J. M. Carbonell
Universitat Politècnica de Catalunya (UPC), 08034 Barcelona, Spain

The PFEM treats the mesh nodes in the analysis domain as particles which can
freely move and even separate from the domain representing, for instance, the effect
of water drops or cutting particles in drilling problems. A mesh connects the nodes
discretizing the domain where the governing equations are solved using a stabilized
FEM. Examples of application of the PFEM to problems in fluid and solid mechanics,
including fluid-structure interaction (FSI) situations, can be found in [4–6, 8–19, 28,
29, 33, 35, 36, 40–43]. Early applications of the PFEM to thermally coupled
flows were reported in [1, 2].
In Lagrangian analysis procedures (such as the PFEM) the motion of the fluid
particles is tracked during the transient solution. Hence, the convective terms van-
ish in the momentum and heat transfer equations and no numerical stabilization is
needed for treating those terms. Two other sources of mass loss, however, remain
in the numerical solution of Lagrangian flows, i.e. that due to the treatment of the
incompressibility constraint by a stabilized numerical method, and that induced by
the inaccuracies in tracking the flow particles and, in particular, the free surface.
In this work the PFEM equations for the analysis of thermally coupled flows and FSI
problems are derived using the stabilized formulation based on the Finite Calculus
(FIC) method proposed by Oñate et al. [20–27, 30–32, 37–39], which has excellent
mass preservation features.
The layout of the paper is as follows. In the next section we present the basic
equations for conservation of linear momentum, mass and heat transfer for a
quasi-incompressible fluid in a Lagrangian framework. A fully incompressible fluid can be
considered as a particular limit case of the former. Next we derive the stabilized
FIC form of the mass balance equation. Then the finite element discretization using
simplicial elements with equal order approximation for the velocity, the pressure and
the temperature is presented and the relevant matrices and vectors of the discretized
problem are given. Details of the implicit solution of the Lagrangian FEM equations
in time using a Newton iterative scheme are presented. The relevance of the bulk
stiffness terms in the tangent matrix for enhancing the convergence and accuracy of
the iterative solution scheme is discussed. The basic steps of the PFEM for solving
FSI problems are described.
The efficiency and accuracy of the PFEM technique are verified by solving a set of
adiabatic and thermally coupled quasi-incompressible free surface flow problems in
two (2D) and three (3D) dimensions involving FSI situations. The adiabatic problems
are the sloshing of water in a tank and the penetration of a water sphere into a
cylindrical tank containing water. The thermally coupled problems considered are
extended 2D versions of the adiabatic cases. The excellent performance of the
numerical method proposed in terms of mass conservation and general accuracy is
highlighted.
2 Governing Equations
We write the governing equations for a quasi-incompressible Newtonian flow problem
in the Lagrangian description as follows [3, 46].
2.1 Momentum Equations
$$\rho \frac{Dv_i}{Dt} - \frac{\partial \sigma_{ij}}{\partial x_j} - b_i = 0, \quad i, j = 1, \ldots, n_s \quad \text{in } \Omega. \tag{1}$$
In Eq. (1), $\Omega$ is the analysis domain with boundary $\Gamma$, $v_i$ and $b_i$ are the velocity
and body force components along the ith Cartesian axis, $\rho$ is the density, $n_s$ is the
number of space dimensions (i.e. $n_s = 3$ for 3D problems) and $\sigma_{ij}$ are the Cauchy
stresses, which are split into the deviatoric ($s_{ij}$) and pressure ($p$) components as
$$\sigma_{ij} = s_{ij} + p\,\delta_{ij} \tag{2}$$
where $\delta_{ij}$ is the Kronecker delta. Note that the pressure is assumed to be positive for
a tension state. Summation over repeated indices is assumed in Eq. (1) and
in the following, unless otherwise specified.
The relationship between the deviatoric stresses $s_{ij}$ and the strain rates $\varepsilon_{ij}$ has the
standard form for a Newtonian fluid,
$$s_{ij} = 2\mu\left(\varepsilon_{ij} - \frac{1}{3}\varepsilon_v\,\delta_{ij}\right) \quad \text{with} \quad \varepsilon_{ij} = \frac{1}{2}\left(\frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i}\right) \tag{3}$$
where $\mu$ is the viscosity and $\varepsilon_v$ is the volumetric strain rate defined as $\varepsilon_v = \varepsilon_{ii} = \partial v_i / \partial x_i$.
2.2 Mass Balance Equation

The standard mass balance equation for a quasi-incompressible fluid can be written
as [3, 7, 46]
$$r_v = 0 \tag{4a}$$
with
$$r_v := -\frac{1}{\rho c^2}\frac{Dp}{Dt} + \varepsilon_v. \tag{4b}$$
In Eq. (4b) $c$ is the speed of sound in the fluid. For a fully incompressible fluid
$c = \infty$ and Eq. (4a) simplifies to the standard form, $\varepsilon_v = 0$. In our work we will
retain the quasi-incompressible form of $r_v$ of Eq. (4b) for convenience.
2.3 Thermal Balance

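$$\rho c \frac{DT}{Dt} - \frac{\partial}{\partial x_i}\left(k \frac{\partial T}{\partial x_i}\right) - Q = 0, \quad i = 1, \ldots, n_s \quad \text{in } \Omega \tag{5}$$
where $T$ is the temperature, $c$ is the thermal capacity, $k$ is the heat conductivity and
$Q$ is the heat source.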
2.4 Boundary Conditions
2.4.1 Mechanical Problem
The boundary conditions at the Dirichlet ($\Gamma_v$) and Neumann ($\Gamma_t$) boundaries with
$\Gamma = \Gamma_v \cup \Gamma_t$ are
$$v_i - v_i^p = 0 \quad \text{on } \Gamma_v \tag{6}$$
$$\sigma_{ij} n_j - t_i^p = 0 \quad \text{on } \Gamma_t, \quad i, j = 1, \ldots, n_s \tag{7}$$
where $v_i^p$ and $t_i^p$ are the prescribed velocities and prescribed tractions on $\Gamma_v$ and $\Gamma_t$,
respectively, and $n_j$ are the components of the unit normal vector to the boundary
[3, 7, 46].
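2.4.2 Thermal Problem
$$T - T^p = 0 \quad \text{on } \Gamma_T \tag{8}$$
$$k \frac{\partial T}{\partial n} + q_n^p = 0 \quad \text{on } \Gamma_q \tag{9}$$
where $T^p$ and $q_n^p$ are the prescribed temperature and the prescribed normal heat
flux at the boundaries $\Gamma_T$ and $\Gamma_q$, respectively, and $n$ is the direction normal to the
boundary.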
Remark 1 The term $Dv_i/Dt$ in Eq. (1) is the material derivative of the ith velocity
component $v_i$. This term is typically computed in a Lagrangian framework as
$$\frac{Dv_i}{Dt} = \frac{{}^{n+1}v_i - {}^{n}v_i}{\Delta t} \tag{10}$$
with
$${}^{n+1}v_i := v_i({}^{n+1}\mathbf{x}, {}^{n+1}t), \qquad {}^{n}v_i := v_i({}^{n}\mathbf{x}, {}^{n}t) \tag{11}$$
where ${}^{n}v_i$ is the velocity of the material point that has the position ${}^{n}\mathbf{x}$ at time $t = {}^{n}t$,
and $\mathbf{x}$ is the coordinate vector in a fixed Cartesian system [3, 7, 46].
3 Stabilized Mass Balance Equation
In this work we will use the second order FIC form of the mass balance equation in
space for a quasi-incompressible fluid [37, 38], as well as the first order FIC form of
the mass balance equation in time. These forms have the following expressions:
3.1 Second Order FIC Mass Balance Equation in Space
$$r_v + \frac{h_i^2}{12}\frac{\partial^2 r_v}{\partial x_i^2} = 0 \quad \text{in } \Omega, \quad i = 1, \ldots, n_s. \tag{12a}$$
3.2 First Order FIC Mass Balance Equation in Time
$$r_v + \frac{\delta}{2}\frac{Dr_v}{Dt} = 0 \quad \text{in } \Omega. \tag{12b}$$
Equation (12a) is obtained by expressing the balance of mass in a rectangular
domain of finite size with dimensions $h_1 \times h_2$ (for 2D problems), where $h_i$ are arbitrary
distances, and retaining up to third order terms in the Taylor series expansions
used for expressing the change of mass within the balance domain.
Equation (12b), on the other hand, is obtained by expressing the balance of mass
in a space-time domain of infinitesimal length in space and finite dimension $\delta$ in
time [20]. The derivation of Eqs. (12a) and (12b) for a 1D problem is shown in [41].
The FIC terms in Eqs. (12a) and (12b) play the role of space and time stabilization terms,
respectively. In the discretized problem, the space dimensions $h_i$ and the time dimension
$\delta$ are related to characteristic element dimensions and the time step increment,
respectively, as will be explained later. Note that for $h_i = 0$ and $\delta = 0$ the standard
infinitesimal form of the mass balance equation, $r_v = 0$, is obtained.
After some transformations the stabilized mass balance equation (12a) is written
as [41]
$$-\frac{1}{\kappa}\frac{Dp}{Dt} + \varepsilon_v - \frac{\tau}{c^2}\frac{D^2 p}{Dt^2} + \tau\frac{\partial \hat{r}_{m_i}}{\partial x_i} = 0 \tag{13}$$
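where $\kappa = \rho c^2$ is the bulk modulus, $\tau$ is a stabilization parameter given by [41]
$$\tau = \left(\frac{8\mu}{h^2} + \frac{2\rho}{\delta}\right)^{-1} \tag{14}$$
and $\hat{r}_{m_i}$ is a static momentum term defined as
$$\hat{r}_{m_i} = \frac{\partial}{\partial x_j}(2\mu\varepsilon_{ij}) + \frac{\partial p}{\partial x_i} + b_i. \tag{15}$$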
Equation (13) is used as the starting point for deriving the stabilized FEM formu-
lation as explained in the following sections.
4 Variational Equations
4.1 Momentum Equations
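Multiplying Eq. (1) by arbitrary test functions $w_i$ with dimensions of velocity and
integrating over the analysis domain $\Omega$ gives the weighted residual form of the
momentum equations as [3, 7, 46]
$$\int_\Omega w_i \left(\rho\frac{Dv_i}{Dt} - \frac{\partial \sigma_{ij}}{\partial x_j} - b_i\right) d\Omega = 0. \tag{16}$$
Integrating by parts the term involving $\sigma_{ij}$ and using the Neumann boundary
conditions (7) yields the weak variational form of the momentum equations as
$$\int_\Omega w_i\,\rho\frac{Dv_i}{Dt}\, d\Omega + \int_\Omega \delta\varepsilon_{ij}\,\sigma_{ij}\, d\Omega - \int_\Omega w_i b_i\, d\Omega - \int_{\Gamma_t} w_i t_i^p\, d\Gamma = 0 \tag{17}$$
where $\delta\varepsilon_{ij} = \frac{1}{2}\left(\frac{\partial w_i}{\partial x_j} + \frac{\partial w_j}{\partial x_i}\right)$ is an arbitrary (virtual) strain rate field. Equation (17) is the
standard form of the principle of virtual power [3, 7, 46].
Substituting the expression of the stresses from Eq. (2) into Eq. (17) gives
$$\int_\Omega w_i\,\rho\frac{Dv_i}{Dt}\, d\Omega + \int_\Omega \left[\delta\varepsilon_{ij}\, 2\mu\left(\varepsilon_{ij} - \frac{1}{3}\varepsilon_v\,\delta_{ij}\right) + \delta\varepsilon_v\, p\right] d\Omega - \int_\Omega w_i b_i\, d\Omega - \int_{\Gamma_t} w_i t_i^p\, d\Gamma = 0 \tag{18}$$
Equation (18) can be written in matrix form as
$$\int_\Omega \mathbf{w}^T \rho\frac{D\mathbf{v}}{Dt}\, d\Omega + \int_\Omega \delta\boldsymbol{\varepsilon}^T \mathbf{D}\boldsymbol{\varepsilon}\, d\Omega + \int_\Omega \delta\boldsymbol{\varepsilon}^T \mathbf{m}\, p\, d\Omega - \int_\Omega \mathbf{w}^T \mathbf{b}\, d\Omega - \int_{\Gamma_t} \mathbf{w}^T \mathbf{t}^p\, d\Gamma = 0. \tag{19}$$
In Eq. (19) $\mathbf{w}$, $\mathbf{v}$, $\boldsymbol{\varepsilon}$ and $\delta\boldsymbol{\varepsilon}$ are vectors containing the test functions, the velocities,
the strain rates and the virtual strain rates, respectively; $\mathbf{b}$ and $\mathbf{t}^p$ are body force and
surface traction vectors, respectively; $\mathbf{D}$ is the viscous constitutive matrix and $\mathbf{m}$ is
an auxiliary vector. These vectors are defined as (for 3D problems)
$$\mathbf{w} = [w_1, w_2, w_3]^T, \quad \mathbf{v} = [v_1, v_2, v_3]^T, \quad \mathbf{b} = [b_1, b_2, b_3]^T, \quad \mathbf{t}^p = [t_1^p, t_2^p, t_3^p]^T$$
$$\boldsymbol{\varepsilon} = [\varepsilon_{11}, \varepsilon_{22}, \varepsilon_{33}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{23}]^T, \quad \delta\boldsymbol{\varepsilon} = [\delta\varepsilon_{11}, \delta\varepsilon_{22}, \delta\varepsilon_{33}, \delta\varepsilon_{12}, \delta\varepsilon_{13}, \delta\varepsilon_{23}]^T$$
$$\mathbf{D} = \mu \begin{bmatrix} 4/3 & -2/3 & -2/3 & 0 & 0 & 0 \\ & 4/3 & -2/3 & 0 & 0 & 0 \\ & & 4/3 & 0 & 0 & 0 \\ & & & 1 & 0 & 0 \\ & \text{Sym.} & & & 1 & 0 \\ & & & & & 1 \end{bmatrix}, \quad \mathbf{m} = [1, 1, 1, 0, 0, 0]^T. \tag{20}$$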
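As a quick consistency check between the index form of Eq. (3) and the matrix form of Eq. (20), the sketch below builds the matrix D and compares its action on the normal strain rates with the deviatoric stresses of Eq. (3). The function names are hypothetical. Only the normal block is compared, since the text does not spell out whether the shear entries of the strain rate vector are tensor or engineering components.

```python
def viscous_matrix(mu):
    """6x6 matrix D of Eq. (20) for the ordering [11, 22, 33, 12, 13, 23]."""
    d = [[0.0] * 6 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            d[i][j] = mu * (4.0 / 3.0 if i == j else -2.0 / 3.0)
    for i in range(3, 6):
        d[i][i] = mu
    return d


def matvec(a, x):
    """Plain matrix-vector product for nested lists."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def deviatoric_normal_stresses(mu, eps_normal):
    """Normal components s_11, s_22, s_33 from the index form of Eq. (3)."""
    eps_v = sum(eps_normal)
    return [2.0 * mu * (e - eps_v / 3.0) for e in eps_normal]


if __name__ == "__main__":
    mu = 2.0
    eps = [0.3, -0.1, 0.4, 0.0, 0.0, 0.0]
    print(matvec(viscous_matrix(mu), eps)[:3])
    print(deviatoric_normal_stresses(mu, eps[:3]))
```

Both printed lists should coincide up to round-off, and the three normal deviatoric stresses add up to zero, which is why the pressure enters Eq. (19) separately through the vector m.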
4.2 Mass Balance Equations
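We multiply Eq. (13) by arbitrary (continuous) test functions $q$ (with dimensions of
pressure) defined over the analysis domain $\Omega$. Integrating over $\Omega$ gives
$$-\int_\Omega \frac{q}{\kappa}\frac{Dp}{Dt}\, d\Omega - \int_\Omega q\,\frac{\tau}{c^2}\frac{D^2 p}{Dt^2}\, d\Omega + \int_\Omega q\,\varepsilon_v\, d\Omega + \int_\Omega q\,\tau\frac{\partial \hat{r}_{m_i}}{\partial x_i}\, d\Omega = 0. \tag{21}$$
Integrating by parts the last integral in Eq. (21) and using Eq. (15) gives after some
transformations [39, 40]
$$\begin{aligned} &\int_\Omega \frac{q}{\kappa}\frac{Dp}{Dt}\, d\Omega + \int_\Omega q\,\frac{\tau}{c^2}\frac{D^2 p}{Dt^2}\, d\Omega - \int_\Omega q\,\varepsilon_v\, d\Omega + \int_\Omega \tau\frac{\partial q}{\partial x_i}\left(\frac{\partial}{\partial x_j}(2\mu\varepsilon_{ij}) + \frac{\partial p}{\partial x_i} + b_i\right) d\Omega \\ &\quad - \int_{\Gamma_t} q\,\tau\left[\rho\frac{Dv_n}{Dt} - \frac{2}{h_n}\left(2\mu\frac{\partial v_n}{\partial n} + p - t_n\right)\right] d\Gamma = 0. \end{aligned} \tag{22}$$
Expression (22) holds for 2D and 3D problems. The terms involving the first and
second material time derivative of the pressure and the boundary term in Eq. (22)
are important for preserving the conservation of mass in free-surface flow problems
[10, 41].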
4.3 Thermal Balance Equation
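Application of the standard weighted residual method to the heat balance Eqs. (5)
and (9) leads, after standard operations, to [7, 44]
$$\int_\Omega \hat{w}\,\rho c\frac{\partial T}{\partial t}\, d\Omega + \int_\Omega \frac{\partial \hat{w}}{\partial x_i}\, k\frac{\partial T}{\partial x_i}\, d\Omega - \int_\Omega \hat{w}\, Q\, d\Omega + \int_{\Gamma_q} \hat{w}\, q_n^p\, d\Gamma = 0 \tag{23}$$
where $\hat{w}$ are the space weighting functions for the temperature.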
5 FEM Discretization
We discretize the analysis domain into finite elements with n nodes in the standard
manner, leading to a mesh with a total number of $N_e$ elements and N nodes. In our
work we choose simple 3-noded linear triangles (n = 3) for 2D problems and
4-noded tetrahedra (n = 4) for 3D problems with local linear shape functions $N_i^e$
defined for each node i (i = 1, ..., n) of element e [34, 44]. The velocity components, the
pressure and the temperature are interpolated over the mesh in terms of their nodal
values in the same manner using the global linear shape functions $N_j$ spanning
the elements sharing node j (j = 1, ..., N) [34, 44–46]. In matrix form
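$$\mathbf{v} = \mathbf{N}_v \bar{\mathbf{v}}, \quad p = \mathbf{N}_p \bar{\mathbf{p}}, \quad T = \mathbf{N}_T \bar{\mathbf{T}} \tag{24}$$
where
$$\bar{\mathbf{v}} = \begin{Bmatrix} \bar{\mathbf{v}}^1 \\ \bar{\mathbf{v}}^2 \\ \vdots \\ \bar{\mathbf{v}}^N \end{Bmatrix} \text{ with } \bar{\mathbf{v}}^i = \begin{Bmatrix} \bar{v}_1^i \\ \bar{v}_2^i \\ \bar{v}_3^i \end{Bmatrix}, \quad \bar{\mathbf{p}} = \begin{Bmatrix} \bar{p}^1 \\ \bar{p}^2 \\ \vdots \\ \bar{p}^N \end{Bmatrix}, \quad \bar{\mathbf{T}} = \begin{Bmatrix} \bar{T}^1 \\ \bar{T}^2 \\ \vdots \\ \bar{T}^N \end{Bmatrix}$$
$$\mathbf{N}_v = [\mathbf{N}_1, \mathbf{N}_2, \ldots, \mathbf{N}_N], \quad \mathbf{N}_p = \mathbf{N}_T = [N_1, N_2, \ldots, N_N] \tag{25}$$
with $\mathbf{N}_j = N_j \mathbf{I}_{n_s}$, where $\mathbf{I}_{n_s}$ is the $n_s \times n_s$ unit matrix.
In Eq. (25) vectors $\bar{\mathbf{v}}$, $\bar{\mathbf{p}}$ and $\bar{\mathbf{T}}$ contain the nodal velocities, the nodal pressures and
the nodal temperatures for the whole mesh, respectively, and the upper index denotes
the nodal value for each vector or scalar magnitude.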
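The equal-order linear interpolation described above can be illustrated on a single 3-noded triangle. The following sketch evaluates the three linear shape functions at a point and interpolates a nodal field with them. The function names are hypothetical and the code covers only one element, not the assembled global functions of Eq. (25).

```python
def triangle_area(xy):
    """Signed area of a triangle given as three (x, y) node coordinates."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    return 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))


def shape_functions(xy, point):
    """Values of the linear shape functions N_1, N_2, N_3 at a point."""
    area = triangle_area(xy)
    x, y = point
    values = []
    for i in range(3):
        xj, yj = xy[(i + 1) % 3]
        xk, yk = xy[(i + 2) % 3]
        a = xj * yk - xk * yj
        b = yj - yk
        c = xk - xj
        # N_i = (a_i + b_i x + c_i y) / (2 A)
        values.append((a + b * x + c * y) / (2.0 * area))
    return values


def interpolate(nodal_values, xy, point):
    """Interpolated value of a scalar nodal field (e.g. p or T) at a point."""
    return sum(n * v for n, v in zip(shape_functions(xy, point), nodal_values))


if __name__ == "__main__":
    nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(shape_functions(nodes, (0.25, 0.25)))
    print(interpolate([2.0, 5.0, 1.0], nodes, (0.25, 0.25)))
```

The same shape functions are applied to each velocity component, to the pressure and to the temperature, which is what "equal order" refers to.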
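Substituting Eq. (24) into Eqs. (17), (22) and (23) and choosing a Galerkin formulation
with $w_i = q = \hat{w}_i = N_i$ leads to the following system of algebraic equations
$$\mathbf{M}_0 \dot{\bar{\mathbf{v}}} + \mathbf{K}\bar{\mathbf{v}} + \mathbf{Q}\bar{\mathbf{p}} - \mathbf{f}_v = \mathbf{0} \tag{26a}$$
$$\mathbf{M}_1 \dot{\bar{\mathbf{p}}} + \mathbf{M}_2 \ddot{\bar{\mathbf{p}}} - \mathbf{Q}^T \bar{\mathbf{v}} + (\mathbf{L} + \mathbf{M}_b)\bar{\mathbf{p}} - \mathbf{f}_p = \mathbf{0} \tag{26b}$$
$$\mathbf{C}\dot{\bar{\mathbf{T}}} + \hat{\mathbf{L}}\bar{\mathbf{T}} - \mathbf{f}_T = \mathbf{0} \tag{26c}$$
where $\dot{\mathbf{a}}$ and $\ddot{\mathbf{a}}$ denote the first and second material time derivatives of the components
of a vector $\mathbf{a}$. The different matrices and vectors in Eqs. (26a)-(26c) are assembled from
the element contributions given in Box 1.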
Remark 2 The boundary terms of vector $\mathbf{f}_p$ can be incorporated in the matrices of
Eq. (26b). This leads to a non-symmetric set of equations. These boundary terms
are computed here iteratively within the incremental solution scheme.
Remark 3 The presence of matrix $\mathbf{M}_b$ in Eq. (26b) allows us to compute the pressure
without the need to prescribe its value at the free surface. This eliminates the error
introduced when the pressure is prescribed as zero at free boundaries, which leads
to considerable mass losses in viscous flows [15].
Remark 4 The stabilization parameter $\tau$ of Eq. (14) is computed for each element e
using $h = l^e$ and $\delta = \Delta t$ as
$$\tau = \left(\frac{8\mu}{(l^e)^2} + \frac{2\rho}{\Delta t}\right)^{-1} \tag{27}$$
where $\Delta t$ is the time step used for the transient solution and $l^e$ is a characteristic
element length computed as $l^e = 2(\Omega^e)^{1/n_s}$, where $\Omega^e$ is the element area (for
3-noded triangles) or volume (for 4-noded tetrahedra). For fluids with heterogeneous
material properties the values of $\mu$ and $\rho$ are computed at the element center.
The characteristic boundary length $h_n$ in the expression of $\mathbf{f}_p$ (Box 1) has been
taken equal to $l^e$ in our computations.
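The element-wise evaluation of Remark 4 amounts to two short formulas. The sketch below codes them directly. The function names and the sample values (a water-like fluid in SI units) are assumptions made for illustration only.

```python
def characteristic_length(measure, n_s):
    """l^e = 2 (Omega^e)^(1/n_s), with Omega^e the element area or volume."""
    return 2.0 * measure ** (1.0 / n_s)


def stabilization_parameter(mu, rho, l_e, dt):
    """tau = (8 mu / (l^e)^2 + 2 rho / dt)^(-1), as in Eq. (27)."""
    return 1.0 / (8.0 * mu / l_e ** 2 + 2.0 * rho / dt)


if __name__ == "__main__":
    # Hypothetical sample values: triangle of area 0.005 m^2, water-like fluid.
    l_e = characteristic_length(0.005, n_s=2)
    tau = stabilization_parameter(mu=1.0e-3, rho=1000.0, l_e=l_e, dt=1.0e-3)
    print(l_e, tau)
```

Because the two contributions are added before inversion, tau is always smaller than both dt/(2 rho) and (l^e)^2/(8 mu). For small time steps the transient term dominates, and for very viscous fluids or fine meshes the viscous term dominates.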
6 Transient Solution of the Discretized Equations
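Equations (26a)-(26c) are solved in time with an implicit Newton-Raphson
type iterative scheme [3, 7, 44, 46]. The basic steps within a time interval [n, n + 1]
are:
Initialize variables: $({}^{n+1}\mathbf{x}^1, {}^{n+1}\bar{\mathbf{v}}^1, {}^{n+1}\bar{\mathbf{p}}^1, {}^{n+1}\bar{\mathbf{T}}^1, {}^{n+1}\mathbf{r}_m^1) \equiv \{{}^{n}\mathbf{x}, {}^{n}\bar{\mathbf{v}}, {}^{n}\bar{\mathbf{p}}, {}^{n}\bar{\mathbf{T}}, {}^{n}\mathbf{r}_m\}$.
Iteration loop: i = 1, ..., NITER.
For each iteration:
Step 1. Compute the nodal velocity increments $\Delta\bar{\mathbf{v}}$
From Eq. (26a), we deduce
$${}^{n+1}\mathbf{H}_v^i\, \Delta\bar{\mathbf{v}} = -{}^{n+1}\mathbf{r}_m^i \;\rightarrow\; \Delta\bar{\mathbf{v}} \tag{28a}$$
with the momentum residual $\mathbf{r}_m$ and the iteration matrix $\mathbf{H}_v$ given by
$$\mathbf{r}_m = \mathbf{M}_0 \dot{\bar{\mathbf{v}}} + \mathbf{K}\bar{\mathbf{v}} + \mathbf{Q}\bar{\mathbf{p}} - \mathbf{f}_v, \quad \mathbf{H}_v = \frac{1}{\Delta t}\mathbf{M}_0 + \mathbf{K} + \mathbf{K}_v \tag{28b}$$
Step 2. Update the nodal velocities
$${}^{n+1}\bar{\mathbf{v}}^{i+1} = {}^{n+1}\bar{\mathbf{v}}^{i} + \Delta\bar{\mathbf{v}} \tag{29}$$
Step 3. Compute the nodal pressures ${}^{n+1}\bar{\mathbf{p}}^{i+1}$
From Eq. (26b) we obtain
$${}^{n+1}\mathbf{H}_p^i\, {}^{n+1}\bar{\mathbf{p}}^{i+1} = \frac{1}{\Delta t}\mathbf{M}_1\, {}^{n+1}\bar{\mathbf{p}}^{i} + \frac{1}{\Delta t^2}\mathbf{M}_2 (2\,{}^{n}\bar{\mathbf{p}} - {}^{n-1}\bar{\mathbf{p}}) + \mathbf{Q}^T\, {}^{n+1}\bar{\mathbf{v}}^{i+1} + {}^{n+1}\mathbf{f}_p^i \;\rightarrow\; {}^{n+1}\bar{\mathbf{p}}^{i+1} \tag{30a}$$
with
$$\mathbf{H}_p = \frac{1}{\Delta t}\mathbf{M}_1 + \frac{1}{\Delta t^2}\mathbf{M}_2 + \mathbf{L} + \mathbf{M}_b \tag{30b}$$
Step 4. Update the nodal coordinates
$${}^{n+1}\mathbf{x}^{i+1} = {}^{n+1}\mathbf{x}^{i} + \frac{1}{2}\left({}^{n+1}\bar{\mathbf{v}}^{i+1} + {}^{n}\bar{\mathbf{v}}\right)\Delta t \tag{31}$$
A more accurate expression for computing ${}^{n+1}\mathbf{x}^{i+1}$ can be used involving the
nodal accelerations [40].
Step 5. Compute the nodal temperatures
From Eq. (26c) we obtain
$$\left(\frac{1}{\Delta t}\mathbf{C} + \hat{\mathbf{L}}\right)\Delta\bar{\mathbf{T}} = -{}^{n+1}\mathbf{r}_T^i, \quad {}^{n+1}\bar{\mathbf{T}}^{i+1} = {}^{n}\bar{\mathbf{T}}^{i} + \Delta\bar{\mathbf{T}} \tag{32}$$
with
$$\mathbf{r}_T = \mathbf{C}\dot{\bar{\mathbf{T}}} + \hat{\mathbf{L}}\bar{\mathbf{T}} - \mathbf{f}_T \tag{33}$$
Step 6. Check convergence
Verify the following conditions:
$$\begin{aligned} \|{}^{n+1}\bar{\mathbf{v}}^{i+1} - {}^{n+1}\bar{\mathbf{v}}^{i}\| &\le e_v \|{}^{n}\bar{\mathbf{v}}\| \\ \|{}^{n+1}\bar{\mathbf{p}}^{i+1} - {}^{n+1}\bar{\mathbf{p}}^{i}\| &\le e_p \|{}^{n}\bar{\mathbf{p}}\| \\ \|{}^{n+1}\bar{\mathbf{T}}^{i+1} - {}^{n+1}\bar{\mathbf{T}}^{i}\| &\le e_T \|{}^{n}\bar{\mathbf{T}}\| \end{aligned} \tag{34}$$
where $e_v$, $e_p$ and $e_T$ are prescribed error norms. In our examples we have set $e_v = e_p = e_T = 10^{-3}$.
If conditions (34) are satisfied, then set n ← n + 1 and proceed to the next time
step. Otherwise, set the iteration counter i ← i + 1 and repeat Steps 1-5.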
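The convergence test of Step 6 can be sketched as follows. This is a minimal illustration that assumes the Euclidean norm for the norms in conditions (34); the function names and the dictionary layout are hypothetical.

```python
import math


def norm(vec):
    """Euclidean norm of a nodal vector stored as a flat list."""
    return math.sqrt(sum(x * x for x in vec))


def has_converged(new, old, reference, tol=1.0e-3):
    """Check ||new - old|| <= tol * ||reference|| for one field."""
    diff = [a - b for a, b in zip(new, old)]
    return norm(diff) <= tol * norm(reference)


def step_converged(fields, tol=1.0e-3):
    """fields maps a name to (new iterate, previous iterate, value at time n)."""
    return all(has_converged(new, old, ref, tol)
               for new, old, ref in fields.values())


if __name__ == "__main__":
    fields = {
        "velocity": ([1.0005, 2.0], [1.0, 2.0], [1.0, 2.0]),
        "pressure": ([10.0, 10.001], [10.0, 10.0], [10.0, 10.0]),
        "temperature": ([300.0, 300.0], [300.0, 300.0], [300.0, 300.0]),
    }
    print(step_converged(fields))
```

Note that a relative test of this kind cannot be satisfied when the reference field at time n is identically zero, unless the iterates coincide exactly, so a practical implementation would typically add an absolute tolerance for that case.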
Remark 5 In Eqs. (28a)-(34), ${}^{n+1}(\cdot)$ denotes the values of a matrix or a vector
computed using the nodal unknowns at time n + 1. In this work the derivatives and
integrals in all the matrices and the residual vectors $\mathbf{r}_m$ and $\mathbf{r}_T$ are computed on
the discretized geometry at time n, while the nodal force vectors $\mathbf{f}_v$, $\mathbf{f}_p$ and $\mathbf{f}_T$ are
computed on the current configuration at time n + 1. This is equivalent to using an
updated Lagrangian formulation [3, 45, 46].
Remark 6 Including the bulk stiffness matrix $\mathbf{K}_v$ in $\mathbf{H}_v$ has proven to be essential for
the fast convergence, mass preservation and overall accuracy of the iterative solution
[10, 41]. The element expression of $\mathbf{K}_v$ can be obtained as [41]
$$\mathbf{K}_v^e = \int_{\Omega^e} \mathbf{B}^T \mathbf{m}\,\theta\,\Delta t\,\kappa\,\mathbf{m}^T \mathbf{B}\, d\Omega \tag{35}$$
where $\theta$ is a positive number such that $0 < \theta \le 1$ that has the role of preventing
the ill-conditioning of the iteration matrix $\mathbf{H}_v$ for highly incompressible fluids. An
adequate selection of $\theta$ also improves the overall accuracy of the numerical solution
and the preservation of mass for large time steps [10]. For fully incompressible fluids
($c \to \infty$ and $\kappa = \infty$), a finite value of $\kappa$ is used in practice in $\mathbf{K}_v$, as this helps to obtain an
accurate solution for velocities and pressure with reduced mass loss in few iterations
per time step [10]. These considerations, however, do not affect the value of $\kappa$ within
matrix $\mathbf{M}_1$ in Eq. (26b), which vanishes for a fully incompressible fluid. Clearly, the
value of the terms of $\mathbf{K}_v^e$ can also be limited by reducing the time step size. This,
however, leads to an increase in the cost of the computations. A similar approach for
improving mass conservation in incompressible flows was proposed in [42].
Remark 7 The iteration matrix $\mathbf{H}_v$ in Eq. (28a) is an approximation of the exact
tangent matrix in the updated Lagrangian formulation for a quasi-incompressible
fluid [40]. The simplified form of $\mathbf{H}_v$ used in this work has yielded good results with
convergence achieved for the nodal velocities, the pressure and the temperature in
3–4 iterations in all the problems analyzed.
Remark 8 The time step within a time interval [n, n + 1] is chosen as
$$\Delta t = \min\left(\frac{{}^{n}l^{e}_{\min}}{|{}^{n}\mathbf{v}|_{\max}},\; \Delta t_b\right)$$
where ${}^{n}l^{e}_{\min}$ is the minimum characteristic distance of all elements in the mesh, with $l^e$
computed as explained in Remark 4, $|{}^{n}\mathbf{v}|_{\max}$ is the maximum value of the modulus of the
velocity of all nodes in the mesh and $\Delta t_b$ is the critical time step of all nodes approaching
a solid boundary, defined as
$$\Delta t_b = \min\left(\frac{{}^{n}l_b}{|{}^{n}\mathbf{v}_b|_{\max}}\right)$$
where ${}^{n}l_b$ is the distance from the node to the boundary and ${}^{n}\mathbf{v}_b$ is the velocity of the
node. This definition of $\Delta t$ is intended to ensure that no node crosses a solid boundary
during a time step.
A method that allows using large time steps in the integration of the PFEM equations
can be found in [16].
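The time step rule of Remark 8 can be illustrated with a short sketch. It assumes that the boundary time step is the minimum, over the nodes approaching a solid boundary, of the ratio between the distance to the boundary and the nodal speed; the function name `pfem_time_step` and its argument layout are hypothetical.

```python
import math


def pfem_time_step(l_min, v_max, approaching_nodes):
    """Return min(l_min / v_max, dt_b) following Remark 8.

    l_min             : minimum characteristic element length in the mesh
    v_max             : maximum nodal speed in the mesh
    approaching_nodes : iterable of (distance_to_boundary, speed) pairs for the
                        nodes moving towards a solid boundary (assumed input)
    """
    # Mesh-based limit: no node travels more than the smallest element size.
    dt_mesh = l_min / v_max if v_max > 0.0 else math.inf
    # Boundary-based limit: no approaching node reaches the boundary in one step.
    dt_b = min(
        (dist / speed for dist, speed in approaching_nodes if speed > 0.0),
        default=math.inf,
    )
    return min(dt_mesh, dt_b)
```

For instance, with `l_min = 0.01`, `v_max = 2.0` and one node at distance 0.001 moving at speed 4.0, the boundary limit governs and the function returns 0.00025.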
Fig. 1 Sequence of steps to update a cloud of nodes representing a domain containing a fluid
and a solid part from time n ($t = {}^{n}t$) to time n + 2 ($t = {}^{n}t + 2\Delta t$)
7 About the Particle Finite Element Method (PFEM)
7.1 The Basis of the PFEM
Let us consider a domain V containing fluid and solid subdomains. Each subdomain
is characterized by a set of points, hereafter termed particles. The particles contain all
the information for defining the geometry and the material and mechanical properties
of the underlying subdomain. In the PFEM both subdomains are modelled using an
updated Lagrangian formulation [3, 45].
The solution steps within a time step in the PFEM are as follows:
1. The starting point at each time step is the cloud of points C in the fluid and solid
domains. For instance, ${}^{n}C$ denotes the cloud at time $t = {}^{n}t$ (Fig. 1).
2. Identify the boundaries defining the analysis domain ${}^{n}V$, as well as the subdomains
in the fluid and the solid. This is an essential step as some boundaries
(such as the free surface in fluids) may be severely distorted during the solution,
including separation and re-entering of nodes. The Alpha Shape method [8] is
used for the boundary definition. Clearly, the accuracy in the reconstruction of
the boundaries depends on the number of points in the vicinity of each boundary
and on the Alpha Shape parameter. In the problems solved in this work the Alpha
Shape method has been implemented as described in [12, 28].
3. Discretize the analysis domain ${}^{n}V$ with a finite element mesh ${}^{n}M$. We use
an efficient mesh generation scheme based on an enhanced Delaunay tessellation
[11, 12].
4. Solve the Lagrangian equations of motion for the overall continuum using the
standard FEM. Compute the state variables at the next (updated) configuration
for ${}^{n}t + \Delta t$: velocities, pressure and viscous stresses in the fluid and displacements,
stresses and strains in the solid.
5. Move the mesh nodes to a new position ${}^{n+1}C$, where n + 1 denotes the time ${}^{n}t + \Delta t$,
in terms of the time increment size.
6. Go back to step 1 and repeat the solution for the next time step to obtain ${}^{n+2}C$.
Fig. 2 2D analysis of sloshing of water in rectangular tank. Initial geometry, analysis data and
mesh of 5064 3-noded triangles discretizing the water in the tank
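The boundary identification in step 2 can be illustrated with a minimal 2D sketch. It assumes the usual alpha shape criterion in which a Delaunay triangle is kept only if its circumradius does not exceed $\alpha h$, where $h$ is a characteristic nodal spacing; the triangulation is taken as given, and the names `circumradius` and `alpha_shape_filter` are hypothetical, not the implementation of [12, 28].

```python
import math


def circumradius(a, b, c):
    """Circumradius of the triangle with 2D vertices a, b, c."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0
    if area == 0.0:
        return math.inf  # degenerate (collinear) triangle
    return la * lb * lc / (4.0 * area)


def alpha_shape_filter(points, triangles, alpha, h):
    """Keep the triangles whose circumradius is at most alpha * h.

    points    : list of (x, y) node coordinates
    triangles : list of index triples from a Delaunay tessellation (assumed given)
    alpha, h  : Alpha Shape parameter and characteristic nodal spacing
    """
    return [
        tri
        for tri in triangles
        if circumradius(*(points[i] for i in tri)) <= alpha * h
    ]
```

The edges of the retained triangles that belong to a single element then define the boundary of the analysis domain; large, distorted triangles that bridge separated particles are discarded.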
Note that the key differences between the PFEM and the classical FEM are the
remeshing technique and the identification of the domain boundary at each time step.
The CPU time required for meshing grows linearly with the number of nodes. As a
general rule, for 3D problems meshing consumes around 15 % of the total CPU time
per time step, while the solution of the equations (with typically 3 iterations per time
step) and the system assembly consume approximately 70 % and 15 % of the CPU
time per time step, respectively. These figures refer to analyses in a single processor
Pentium IV PC [36]. Considerable speed can be gained using parallel computing
techniques.
In this work we will apply the PFEM to problems involving a rigid domain containing
fluid particles only. Application of the PFEM in fluid and solid mechanics
and in fluid-structure interaction problems can be found in [4–6, 8–19, 28, 29, 33,
35, 36, 40–43], as well as in www.cimne.com/pfem.
Fig. 3 2D sloshing of water in rectangular tank. Snapshots of water geometry at four different times
($\theta = 1$). Colours indicate pressure contours. a t = 5.7 s; b t = 7.4 s; c t = 13.3 s; d t = 18.6 s
8 Examples
8.1 Sloshing of Water in Prismatic Tank
The problem has been solved first in 2D. Figure 2 shows the analysis data. The fluid
oscillates due to the hydrostatic forces induced by its original position.
The problem has been run using different values of the parameter $\theta$ in the tangent
bulk stiffness matrix $\mathbf{K}_v^e$ (Eq. 35). The first set of results (Figs. 3 and 4) was obtained
with $\theta = 1$. The problem was then solved for $\theta = 0.08$, thereby reducing by one
order of magnitude the diagonal terms in $\mathbf{K}_v^e$.
Figure 3 shows snapshots of the water geometry at different times. Pressure contours
are superposed on the deformed geometry of the fluid in the figures.
Fig. 4 2D sloshing of water in rectangular tank. a Time evolution of the percentage of water volume
loss due to the numerical algorithm (accumulated volume loss over 20 seconds of analysis). b Volume
variation (in %) per time step over 20 seconds of analysis (current method with $\theta = 1$). Average
volume variation per time step. Current method: $1.09 \times 10^{-4}$ %. Fractional step: $2.07 \times 10^{-4}$ %
Figure 4 shows the evolution of the percentage of water volume (i.e. mass) loss
introduced by the numerical solution scheme. The accumulated volume loss (in
percentage versus the initial volume) for the method proposed with $\theta = 1$ is approximately
1.33 % over 20 s of simulation time (Fig. 4a). The average volume variation
in absolute value per time step is $1.09 \times 10^{-4}$ % (Fig. 4b). The total water volume loss
is the sum of the losses induced by the numerical scheme and the losses due to the
updating of the free surface using the PFEM. No correction of mass was introduced
at the end of each time step. Taking all this into account, the fluid volume loss over
the analysis period is remarkably low.
Fig. 5 2D sloshing of water in rectangular tank. Time evolution of percentage of water volume
loss obtained using the current method with $\theta = 0.08$ (curve A) and $\theta = 1$ (curve B), $\Delta t = 10^{-3}$ s
Fig. 6 2D sloshing of water in rectangular tank. Time evolution of percentage of water volume
loss obtained with the current method. Curve A: $\theta = 1$ and $\Delta t = 10^{-4}$ s. Curve B: $\theta = 1$ and
$\Delta t = 10^{-3}$ s
The volume losses induced by the free surface updating can be reduced using a
finer mesh in that region in conjunction with an enhanced alpha shape technique.
The total fluid volume loss can be reduced to almost zero by introducing a small
correction in the free surface at the end of each time step [41].
The fluid volume losses obtained using a standard first order fractional step method
[41] and the PFEM are shown in Fig. 4a for comparison. Clearly the method proposed
in this work leads to a reduction in the overall fluid volume loss, as well as in the
volume loss per time step.
Fig. 7 3D analysis of sloshing of water in prismatic tank ($\theta = 1$). Analysis data and snapshots of
water geometry at a t = 5.7 s (left) and b t = 7.4 s (right)
Figure 5 shows a comparison between the fluid volume loss for $\theta = 1$ and $\theta = 0.08$
using the same time step in both cases ($\Delta t = 10^{-3}$ s). Results show that the reduction
of the tangent bulk stiffness matrix terms leads to an improvement in the preservation
of the initial volume of the fluid. It is noted that the convergence of the iterative
solution for $\theta = 0.08$ was the same as for $\theta = 1$.
Figure 6 shows that a similar improvement in the volume preservation can be
obtained using $\theta = 1$ and reducing the time step to $\Delta t = 10^{-4}$ s. This, however,
increases the cost of the computations.
These results indicate that accurate numerical results with reduced volume losses
can be obtained by appropriately adjusting the parameter $\theta$ in the tangent bulk modulus
matrix while keeping the time step size to competitive values in terms of CPU
cost. A study of the influence of $\theta$ in the numerical solution for quasi-incompressible
free surface fluids in terms of volume preservation and overall accuracy using the
formulation here presented can be found in [10].
Fig. 8 3D analysis of sloshing of water in prismatic tank ($\theta = 1$). a Time evolution of accumulated
water volume loss (in %) due to the numerical algorithm. b Volume loss (in %) per time step over
2 s of analysis. Average volume loss per time step: $1.64 \times 10^{-4}$ %
More results for this example can be found in [41].
Figures 7 and 8 show a similar set of results for the 3D analysis of the same
sloshing problem using a relatively coarse initial mesh of 106771 4-noded tetrahedra
and $\theta = 1$. It is remarkable that the percentage of total fluid volume loss due to the
numerical scheme after 10 s of analysis is approximately 1 %.
Fig. 9 Water sphere falling in a tank filled with water. Analysis data and initial discretization of
the sphere and the water in the tank with 88892 4-noded tetrahedra
8.2 Falling of a Water Sphere in a Cylindrical Tank Containing Water
This example is the 3D analysis of the impact of a sphere made of water as it falls
in a cylindrical tank containing water. The water in the sphere and the water in the tank
mix into a single fluid after the impact. Figure 9 shows the material and analysis data
and the initial discretization of the sphere and the water in the tank in 88892 4-noded
tetrahedra. The problem was solved with the new stabilized method presented in the
paper with $\theta = 1$. Figure 10 shows snapshots of the mixing process at different times.
An average of four iterations was needed for convergence of the velocity and the pressure
during all the steps of the analysis. The total water mass lost in the sphere and
the tank due to the numerical algorithm was 2 % after 3 s of analysis (Fig. 11a).
8.3 Sloshing of a Fluid in a Heated Tank
A fluid at initial temperature T = 20 °C oscillates due to the hydrostatic forces
induced by its initial position in a rectangular tank heated to a uniform and constant
temperature of T = 75 °C. The geometry and the problem data of the 2D simulation
are shown in Fig. 12. The fluid, with a very high thermal conductivity, changes its
temperature only due to the contact with the hotter tank walls. The heat flux along the
free surface has been considered null. The fluid domain has been initially discretized
with 2828 3-noded triangles. The coupled thermal-fluid dynamics simulation has
been run for 100 s using a time step increment of $\Delta t = 0.005$ s.
Fig. 10 Water sphere falling in tank containing water. Evolution of the impact and mixing of the
two liquids at different times. Results for $\theta = 1$. a t = 0.175 s, b t = 0.275 s, c t = 0.5 s, d t = 0.9 s
Figure 13 shows some snapshots of the numerical simulation. The temperature
contours have been superposed on the fluid domain at the different time instants.
In Fig. 14 the evolution of temperature with time at the points A, B and C of
Fig. 12 is plotted. The coordinates of these sample points are (0.1 m, 0.1 m),
(0.5 m, 0.4 m) and (0.9 m, 0.1 m), respectively. Figures 13 and 14 show that the fluid
does not heat uniformly because of the convection effect automatically captured by
the Lagrangian technique here presented.
Fig. 11 Water sphere falling in a tank containing water. a Accumulated volume loss over three seconds
of analysis due to the numerical algorithm ($\theta = 1$). b Volume loss (in %) per time step. Average
volume variation per time step: $2.54 \times 10^{-4}$ %
8.4 Falling of a Cylindrical Object in a Heated Tank Filled with Fluid
An elastic object falls in a tank containing a fluid at rest. The tank walls are maintained
at temperature T = 75 °C during the whole analysis, while the fluid and the solid object
have an initial temperature of T = 20 °C. The geometry and the problem data of
the 2D simulation, as well as the thermal initial and boundary conditions, are shown in
Fig. 15. Both the fluid and the solid have a high thermal conductivity. The heat flux
along the fluid and solid surfaces in contact with the air has been considered null.
The fluid and the solid domains have been discretized with 1986 and 108 3-noded
triangular finite elements, respectively. The duration of the simulation is 10 s and the
time step increment chosen is $\Delta t = 0.0005$ s.
Fig. 12 2D sloshing of a fluid in a heated tank. Initial geometry, problem data, thermal boundary
and initial conditions
Fig. 13 2D sloshing of a fluid in a heated tank. Snapshots of fluid geometry at six different times.
Colours indicate temperature contours
Figure 16 collects some representative snapshots of the numerical simulation with
the temperature results plotted over the fluid and the solid domains.
Figure 17 shows the evolution of temperature at the central point of the
solid object. As expected, its temperature tends to T = 75 °C.
Fig. 14 2D sloshing of a fluid in a heated tank. Evolution of temperature with time at the points
A, B and C of Fig. 12
Fig. 15 Falling of a solid object in a heated tank filled with fluid. Initial geometry, problem data,
thermal boundary and initial conditions
Fig. 16 Falling of a solid object in a heated tank filled with fluid. Snapshots at six different times.
Colours indicate temperature contours
Fig. 17 Falling of a solid object in a heated tank filled with fluid. Time evolution of the temperature
at the center of the solid object
We have presented a FIC-based stabilized Lagrangian finite element method for
thermal-mechanical analysis of quasi and fully incompressible flows and FSI problems
that has excellent mass preservation properties. The method has been successfully
applied to the adiabatic and thermal-mechanical coupled analysis of free-surface
quasi-incompressible flows including FSI situations using the PFEM and an updated
Lagrangian formulation. These problems are more demanding in terms of the mass
preservation features of the numerical algorithm. The method proposed has yielded
excellent results for 2D and 3D adiabatic and thermally-coupled free surface flow
problems involving surface waves, water splashing, violent impact of flows with
containment walls and FSI situations.
Acknowledgments This research was partially supported by the Advanced Grant project SAFECON
of the European Research Council.
References
1. Aubry R, Idelsohn SR, Oñate E (2005) Particle finite element method in fluid-mechanics
including thermal convection-diffusion. Comput Struct 83(17–18):1459–1475
2. Aubry R, Idelsohn SR, Oñate E (2006) Fractional step like schemes for free surface problems
with thermal coupling using the Lagrangian PFEM. Comput Mech 38(4–5):294–309
3. Belytschko T, Liu WK, Moran B (2013) Non linear finite element for continua and structures,
2nd edn. Wiley, New York
4. Carbonell JM, Oñate E, Suárez B (2010) Modeling of ground excavation with the particle finite
element method. J Eng Mech (ASCE) 136(4):455–463
5. Carbonell JM, Oñate E, Suárez B (2013) Modelling of tunnelling processes and cutting
tool wear with the Particle Finite Element Method (PFEM). Comput Mech 52:607–629.
doi:10.1007/s00466-013-0835-x (Accepted)
6. Cremonesi M, Frangi A, Perego U (2011) A Lagrangian finite element approach for the simulation
of water-waves induced by landslides. Comput Struct 89:1086–1093
7. Donea J, Huerta A (2003) Finite element method for flow problems. Wiley, Chichester
8. Edelsbrunner H, Mücke EP (1999) Three dimensional alpha shapes. ACM Trans Graphics
13:43–72
9. Felippa F, Oñate E (2007) Nodally exact Ritz discretizations of 1D diffusion-absorption and
Helmholtz equations by variational FIC and modified equation methods. Comput Mech 39:91–111
10. Franci A, Oñate E, Carbonell JM (2013) On the effect of the tangent bulk stiffness matrix in
the analysis of free surface Lagrangian flows using PFEM. Research Report CIMNE PI402.
Int J Numer Meth Biomed Eng 38(2):125–138 (Submitted)
11. Idelsohn SR, Calvo N, Oñate E (2003c) Polyhedrization of an arbitrary point set. Comput Meth
Appl Mech Eng 192(22–24):2649–2668
12. Idelsohn SR, Oñate E, Del Pin F (2004) The particle finite element method: a powerful tool to
solve incompressible flows with free-surfaces and breaking waves. Int J Numer Meth Biomed
Eng 61(7):964–989
13. Idelsohn SR, Marti J, Limache A, Oñate E (2008) Unified Lagrangian formulation for elastic
solids and incompressible fluids: Application to fluid-structure interaction problems via the
PFEM. Comput Meth Appl Mech Eng 197:1762–1776
14. Idelsohn SR, Mier-Torrecilla M, Oñate E (2009) Multi-Fluid flows with the particle finite
element method. Comput Meth Appl Mech Eng 198:2750–2767
15. Idelsohn SR, Oñate E (2010) The challenge of mass conservation in the solution of free-surface
flows with the fractional-step method: problems and solutions. Int J Numer Meth Biomed Eng
26:1313–1330
16. Idelsohn SR, Nigro N, Limache A, Oñate E (2012) Large time-step explicit integration method
for solving problem with dominant convection. Comput Meth Appl Mech Eng 217–220:168–185
17. Larese A, Rossi R, Oñate E, Idelsohn SR (2008) Validation of the particle finite element method
(PFEM) for simulation of free surface flows. Eng Comput 25(4):385–425
18. Limache A, Idelsohn SR, Rossi R, Oñate E (2007) The violation of objectivity in Laplace
formulation of the Navier-Stokes equations. Int J Numer Meth Fluids 54:639–664
19. Oliver X, Cante JC, Weyler R, González C, Hernández J (2007) Particle finite element methods
in solid mechanics problems. In: Oñate E, Owen R (eds) Computational Plasticity. Springer,
Berlin, pp 87–103
20. Oñate E (1998) Derivation of stabilized equations for advective-diffusive transport and fluid
flow problems. Comput Meth Appl Mech Eng 151:233–267
21. Oñate E, Manzan M (1999) A general procedure for deriving stabilized space-time finite element
methods for advective-diffusive problems. Int J Numer Meth Fluids 31:203–221
22. Oñate E (2000) A stabilized finite element method for incompressible viscous flows using a
finite increment calculus formulation. Comput Meth Appl Mech Eng 182(1–2):355–370
23. Oñate E, García J (2001) A finite element method for fluid-structure interaction with surface
waves using a finite calculus formulation. Comput Meth Appl Mech Eng 191:635–660
24. Oñate E (2003) Multiscale computational analysis in mechanics using finite calculus: an introduction.
Comput Meth Appl Mech Eng 192(28–30):3043–3059
25. Oñate E, Taylor RL, Zienkiewicz OC, Rojek J (2003) A residual correction method based on
finite calculus. Eng Comput 20:629–658
26. Oñate E (2004) Possibilities of finite calculus in computational mechanics. Int J Numer Meth
Eng 60(1):255–281
27. Oñate E, Rojek J, Taylor R, Zienkiewicz O (2004a) Finite calculus formulation for incompressible
solids using linear triangles and tetrahedra. Int J Numer Meth Eng 59(11):1473–1500
28. Oñate E, Idelsohn SR, Del Pin F, Aubry R (2004b) The particle finite element method. An
overview. Int J Comput Meth 1(2):267–307
29. Oñate E, Celigueta MA (2006a) Modeling bed erosion in free surface flows by the particle
finite element method. Acta Geotech 1(4):237–252
30. Oñate E, Valls A, García J (2006b) FIC/FEM formulation with matrix stabilizing terms for
incompressible flows at low and high Reynolds numbers. Comput Mech 38(4–5):440–455
31. Oñate E, García J, Idelsohn SR, Del Pin F (2006c) FIC formulations for finite element analysis
of incompressible flows. Eulerian, ALE and Lagrangian approaches. Comput Meth Appl Mech
Eng 195(23–24):3001–3037
32. Oñate E, Valls A, García J (2007) Computation of turbulent flows using a finite calculus-finite
element formulation. Int J Numer Meth Eng 54:609–637
33. Oñate E, Idelsohn SR, Celigueta MA, Rossi R (2008) Advances in the particle finite element
method for the analysis of fluid-multibody interaction and bed erosion in free surface flows.
Comput Meth Appl Mech Eng 197(19–20):1777–1800
34. Oñate E (2009) Structural analysis with the finite element method. Linear statics, vol 1. Basis
and solids. Springer (CIMNE)
35. Oñate E, Rossi R, Idelsohn SR, Butler K (2010) Melting and spread of polymers in fire with
the particle finite element method. Int J Numer Meth Eng 81(8):1046–1072
36. Oñate E, Celigueta MA, Idelsohn SR, Salazar F, Suárez B (2011) Possibilities of the particle
finite element method for fluid-soil-structure interaction problems. Comput Mech 48(3):307–318
37. Oñate E, Nadukandi P, Idelsohn SR, García J, Felippa C (2011) A family of residual-based
stabilized finite element methods for Stokes flows. Int J Numer Meth Fluids 65(1–3):106–134
38. Oñate E, Idelsohn SR, Felippa C (2011) Consistent pressure Laplacian stabilization for incompressible
continua via higher-order finite calculus. Int J Numer Meth Eng 87(1–5):171–195
39. Oñate E, Nadukandi P, Idelsohn SR (2014) P1/P0+ elements for incompressible flows with
discontinuous material properties. Comput Meth Appl Mech Eng 271:185–209
40. Oñate E, Carbonell JM (2013) Updated Lagrangian finite element formulation for quasi-incompressible
fluids. Research Report PI393 (CIMNE). Submitted to Comput Mech
41. Oñate E, Franci A, Carbonell JM (2013) Lagrangian formulation for finite element analysis of
quasi-incompressible fluids with reduced mass losses. Int J Numer Meth Fluids
doi:10.1002/fld.3870
42. Ryzhakov P, Oñate E, Rossi R, Idelsohn SR (2012) Improving mass conservation in simulation
of incompressible flows. Int J Numer Meth Eng 90(12):1435–1451
43. Tang B, Li JF, Wang TS (2009) Some improvements on free surface simulation by the particle
finite element method. Int J Numer Meth Fluids 60(9):1032–1054
44. Zienkiewicz OC, Taylor RL, Zhu JZ (2005) The finite element method. The basis, 6th edn.
Elsevier, Oxford
45. Zienkiewicz OC, Taylor RL (2005) The finite element method for solid and structural mechanics,
6th edn. Elsevier, Oxford
46. Zienkiewicz OC, Taylor RL, Nithiarasu P (2005) The finite element method for fluid dynamics,
6th edn. Elsevier, Oxford
Numerical Simulation and Visualization of Material Flow in Friction Stir Welding
via Particle Tracing
N. Dialami, M. Chiumenti, M. Cervera, C. Agelet de Saracibar, J. P. Ponthot
and P. Bussetta
N. Dialami (corresponding author), M. Chiumenti, M. Cervera, C. Agelet de Saracibar:
International Center for Numerical Methods in Engineering (CIMNE), Technical University
of Catalonia, UPC BarcelonaTech, Building C1, North Campus, C/ Gran Capitán s/n
J. P. Ponthot, P. Bussetta:
LTAS-MN2L, Aerospace & Mechanical Engineering, University of Liege,
Building B52/3, Chemin des Chevreuils, 1, B4000 Liege, Belgium
Abstract This work deals with the numerical simulation and material flow
visualization of Friction Stir Welding (FSW) processes. The fourth order Runge-Kutta
(RK4) integration method is used for the computation of particle trajectories.
The particle tracing method is used to study the effect of input process parameters
and pin shapes on the weld quality. The results show that the proposed method is
suitable for the optimization of the FSW process.
1 Introduction
Friction Stir Welding (FSW) is a solid-state joining technique introduced by Thomas et al.
[1]. The basic concept of FSW is the following. A shouldered pin rotating at constant
rotational speed is inserted into the line between the two plates to be welded. Once the
insertion is completed, the pin is moved along the welding line at constant rotating
and advancing speeds to form the joint.
Ideally, the pin is designed to disrupt the contacting surfaces of the work-piece,
shear the material in front of the tool and move the material behind the tool. The
depth of deformation and the tool travel speed are mainly governed by the pin. This
serves two primary functions: heating of the work-piece, and moving the material to
produce the joint. In general, both the heat and the material transfer depend on the
work-piece material properties, tool geometry, and FSW process parameters.
One of the main issues in the study of FSW is heat generation. During the process,
the material undergoes intense plastic deformation at elevated temperatures. In the
FSW process, welding is achieved by the heat generated by friction and by the
material mixing/stirring process. The generated heat must be enough to allow
the material to flow and to obtain a deep heat affected zone. Insufficient heat leads to
the formation of voids, as the material is not softened enough to flow properly. It is of
practical importance to understand the material flow characteristics in order to optimize
the tool design and to obtain welds of high structural efficiency. The visualization of the
material flow is very useful to understand its behavior during the weld. This has led to
numerous investigations on material flow behavior during FSW. A method assessing the
quality of the created weld by visualization of the joint pattern is advantageous. It can be
used to gain prior knowledge of the appropriate process parameters. However, following
the position of the material during the welding process is not an easy task, either
experimentally or numerically.
The experimental material visualization is difficult and needs metallographic
tools. In an attempt to better understand FSW, many investigators have used experimental
techniques to visualize the material flow and to estimate characteristics
of FSW. Most of the studies done so far are based on the experimental study of material
flow tracing. Two different tracer techniques were used by different researchers
for visualization of the material flow. The first was a tracer technique by marker material,
where a dissimilar material is inserted into the weld line. The second technique
was to weld two dissimilar materials with the FSW process and then observe the material
mixing. The marker materials were different Al-composites [2–4], steel balls [5],
copper foil [6, 7], plasticine and brass rods [8]. The dissimilar base materials used
were different magnesium alloys [9] and aluminum to copper alloys [10, 11].
Alternatively, establishing a numerical method for the visualization of the material
trajectory in order to gain insight into the heat affected zone has been attempted.
Computational methods including the finite element method have been used to model
the material flow.
In the literature, several works address the computation of the material flow. On one
hand, within a Lagrangian framework, no special technique for the tracking of the
material is necessary as the mesh nodes are material points. In this format, re-meshing
is unavoidable [12, 13]. Meshless methods used within an updated Lagrangian formulation
[14] are an interesting alternative, even if their computational cost is usually
higher than that of the classical finite element method. In this case, re-meshing is avoided
but the material flow is known only at the nodal points. On the other hand, when using
an Eulerian/ALE approach, a specific technique to compute the material trajectories
must be implemented. However, the mesh density used for the FSW simulation is
not related to the definition of the set of particles used for the visualization of the
material flow. Hence, a large number of particles can be used without increasing the
computational effort devoted to the simulation of the process itself. Following this
approach, the ALE formulation together with a splitting method is proposed in [15]
to analyze different phases of the FSW process. In [16] an Eulerian formulation together
with a simple mesh moving technique is used to avoid mesh distortions and to reduce
the computing time due to the ALE technique.
In this work, a numerical particle tracing technology is proposed to study the
extent of material stirring during the FSW process and to study the weld quality.
The outline of the chapter is as follows. Firstly, the particle tracing technique is
described using a RK4 integration technique for the computation of particle trajectories.
Afterwards, the proposed method is applied to different examples in order to
study the quality of the final joint for different process parameters and pin shapes.
Finally, some conclusions are drawn.
2 Particle Tracing
In this work, the FSW process is simulated using an appropriate kinematic framework
based on the ALE formulation [17–20], and particle tracing is performed to
follow the material movement in the stirring zone by integrating the velocity field [21].
Due to the ALE character of the finite element analysis used, the motion of the
finite element mesh is not necessarily tied to the motion of the material. During
the analysis, a material particle moves through the mesh and at different times it is
located inside different elements. To observe material movement around the pin, it
is necessary to construct and analyze material particle trajectories. This is possible
with the use of a particle tracing method (particles are treated as material points, not
as mesh nodal points).
Particle tracing is a method used to simulate the motion of material points, following
their positions at each time-step of the analysis. This method can be naturally
applied to the study of the material flow in the welding process. In the Lagrangian
framework, as the mesh nodes represent the material points, the trajectories are
the solution of the governing system of equations. When using an Eulerian or ALE
framework the solution does not directly give information about the material points.
However, the obtained velocity field can be used to get an insight into the extent of
material mixing during the weld.
In this method, firstly, a set of points representing the material points (tracers)
is distributed in the domain and then a Lagrangian Ordinary Differential Equation
(ODE) for the computation of material displacement at a post-process level must
be solved. Each particle's path is followed in time by integrating the following ODE:
$$\frac{D\mathbf{X}(t)}{Dt} = \mathbf{V}(\mathbf{X}(t), t) \tag{1}$$
Integrating (1) yields:
$$\mathbf{X}(t) = \mathbf{X}_0 + \int_0^t \mathbf{V}(\mathbf{X}(t), t)\,dt \tag{2}$$
where $\mathbf{X}(t = 0) = \mathbf{X}_0$ is the initial position of the particle, $\mathbf{X}(t)$ is the position of
the material point at time t and $\mathbf{V}(\mathbf{X}(t), t)$ is the velocity of the tracer at the position
$\mathbf{X}(t)$ and time t.
An appropriate time integration method is needed to solve the ODE and track the
particles. A large number of integration techniques exist, ranging from the simple
first order Backward Euler (BE) scheme to higher order Runge-Kutta schemes. Among
them, three well-known methods are chosen and compared in [21]: Backward Euler with
Sub-stepping (BES), fourth order Runge-Kutta (RK4) and Back and Forth Error
Compensation and Correction (BFECC). These methodologies, widely used in fluid
dynamics, are suitable and robust tools to study the FSW problem, allowing for a clear
visualization of the material movement at the stir-zone and leading to a better
understanding of the welding process itself. Among them, RK4 is found to be the most
precise for the simulation of the FSW problem.
Moreover, a search algorithm must be executed to find the position of the material
points in the Eulerian or ALE meshes in order to identify the element containing the
tracer. The tracer velocity is obtained by interpolating the nodal velocities of the
background mesh with the corresponding interpolation (shape) functions, $N_j(X(t))$, as:
\[ V(X(t), t) = \sum_{j=1}^{n} v_j(t)\, N_j(X(t)) \tag{3} \]
where the velocity field, $v_j(t)$, is known at each node, $j$, of the finite element mesh
representing the domain at any time, $t$, of the analysis.
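The element search and the interpolation of Eq. (3) can be illustrated with a minimal sketch. It assumes linear triangular elements, for which the shape functions coincide with the barycentric coordinates, and a brute-force search over all elements; the function names and the two-triangle test mesh are hypothetical and are not taken from [21].

```python
import numpy as np

def barycentric(p, tri):
    """Linear shape functions N_j(p) of a triangle with vertices tri (3x2)."""
    a, b, c = tri
    T = np.column_stack((b - a, c - a))      # 2x2 matrix of edge vectors
    l1, l2 = np.linalg.solve(T, p - a)
    return np.array([1.0 - l1 - l2, l1, l2])

def find_element(p, nodes, elements, tol=1e-12):
    """Brute-force search: index of the first triangle containing p, or -1."""
    for e, conn in enumerate(elements):
        if np.all(barycentric(p, nodes[conn]) >= -tol):
            return e
    return -1

def tracer_velocity(p, nodes, elements, v_nodes):
    """Eq. (3): V(X) = sum_j v_j N_j(X) in the element containing X."""
    e = find_element(p, nodes, elements)
    if e < 0:
        raise ValueError("tracer is outside the mesh")
    conn = elements[e]
    return barycentric(p, nodes[conn]) @ v_nodes[conn]

# Hypothetical test mesh: unit square split into two triangles
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
elements = np.array([[0, 1, 2], [0, 2, 3]])
# Nodal velocities sampled from the linear field v = (-y, x)
v_nodes = np.column_stack((-nodes[:, 1], nodes[:, 0]))

print(tracer_velocity(np.array([0.75, 0.25]), nodes, elements, v_nodes))
```

Because the sampled field is linear, linear shape functions reproduce it without interpolation error; a production code would replace the brute-force loop with a neighbour-based or tree-based search.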
2.1 RK4 Method
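According to the fourth order accurate RK4 method, the particle position at time-step
$n+1$ is computed by advecting the position at time-step $n$ with four weighted
incremental displacements at intermediate time-steps:
\[ X_{n+1} = X_n + \frac{1}{6}\left( \Delta X^{(1)} + 2\Delta X^{(2)} + 2\Delta X^{(3)} + \Delta X^{(4)} \right) \tag{4} \]
where the incremental displacements are computed as
\[
\begin{aligned}
\Delta X^{(1)} &= V\left(X_n,\; t_n\right) \Delta t \\
\Delta X^{(2)} &= V\left(X_n + \tfrac{1}{2}\Delta X^{(1)},\; t_n + \tfrac{\Delta t}{2}\right) \Delta t \\
\Delta X^{(3)} &= V\left(X_n + \tfrac{1}{2}\Delta X^{(2)},\; t_n + \tfrac{\Delta t}{2}\right) \Delta t \\
\Delta X^{(4)} &= V\left(X_n + \Delta X^{(3)},\; t_n + \Delta t\right) \Delta t
\end{aligned}
\tag{5}
\]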
The RK4 method is a fourth order method, meaning that the error per step is
$O(\Delta t^5)$, while the total accumulated error is $O(\Delta t^4)$. The RK4 method is an
appropriate choice as it integrates a circular trajectory, a standard particle
path in FSW, with high accuracy.
The nodal velocity field, $v_{n+1/2}$, corresponding to the mid-time step $t_n + \Delta t/2$ is
obtained as:
\[ v_{n+\frac{1}{2}} = \frac{v_n + v_{n+1}}{2} \tag{6} \]
where $v_n$ and $v_{n+1}$ are the nodal velocity fields at times $t_n$ and $t_{n+1}$, respectively.
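The update of Eqs. (4) and (5) can be sketched for a single tracer as follows. The example assumes an analytic rigid-rotation velocity field at 500 rpm in place of the interpolated finite element field, so the mid-step average of Eq. (6) is not needed; the function names and the tracer starting point are hypothetical.

```python
import numpy as np

def rk4_step(x, t, dt, velocity):
    """One RK4 update of a tracer position, following Eqs. (4)-(5)."""
    dx1 = velocity(x, t) * dt
    dx2 = velocity(x + 0.5 * dx1, t + 0.5 * dt) * dt
    dx3 = velocity(x + 0.5 * dx2, t + 0.5 * dt) * dt
    dx4 = velocity(x + dx3, t + dt) * dt
    return x + (dx1 + 2.0 * dx2 + 2.0 * dx3 + dx4) / 6.0

def trace(x0, t_end, n_steps, velocity):
    """Advance a tracer from t = 0 to t_end with n_steps equal RK4 steps."""
    dt = t_end / n_steps
    x = np.asarray(x0, dtype=float)
    for n in range(n_steps):
        x = rk4_step(x, n * dt, dt, velocity)
    return x

OMEGA = 2.0 * np.pi * 500.0 / 60.0   # 500 rpm expressed in rad/s

def rigid_rotation(x, t):
    """Assumed test field: rigid rotation about the origin."""
    return OMEGA * np.array([-x[1], x[0]])

period = 2.0 * np.pi / OMEGA
x0 = np.array([5.0, 0.0])            # tracer starting 5 mm from the axis
err_coarse = np.linalg.norm(trace(x0, period, 50, rigid_rotation) - x0)
err_fine = np.linalg.norm(trace(x0, period, 100, rigid_rotation) - x0)
print(err_coarse, err_fine, err_coarse / err_fine)
```

After one full revolution the tracer should return to its starting point, so the printed distances measure the accumulated error. Halving the time step should reduce it by a factor close to 2^4 = 16, in line with the stated global order.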
3 Examples
The material flow during FSW is complex and the understanding of the deformation
process is limited. It is important to point out that there are many factors that can
influence the material flow during FSW. These factors include tool geometry, welding
parameters, material types, work-piece temperature, etc.
The proposed method is used to investigate the effect of these factors on the
quality of the final weld. The first example studies the effect of the input process
parameters on the joint creation and the second one considers the effect of the pin
shape on the weld quality.
3.1 Input Process Parameters Effect on the Weld
To study the effect of the input process parameters on the joint quality, a 2D example
is considered. The model is a transversal cut of the pin, 10 mm in diameter,
perpendicular to the rotation axis. The cut represents the mid-section of the real
threaded pin. The contact condition between the pin and the work-piece (AA2195-T8)
is assumed to be perfect sticking. Process parameters are the same as in the
experiment: welding speed Vs = 5.0833 mm/s and rotational speed Vr = 500 rpm.
A Sheppard-Wright constitutive model is used [21].
A set of 100 × 150 particles, initially arranged in a 20 × 30 mm² rectangle, is
located right in front of the pin. The whole model is discretized with a mesh
of 4986 triangular elements. The problem is solved using the v/p mixed formulation
stabilized by the OSS stabilization method [17–19]. The RK4 time integration method
is applied for the solution of the particle tracing problem. The model has already been
validated in the authors' work [21].
To study the effect of the input parameters and to gain insight into the influence
of the ratio between the advancing and rotational velocities, two limit cases and the
original problem are analyzed with the same boundary conditions and material
properties but different velocities: (a) Vs = 5.0833 mm/s, Vr = 0 rpm;
(b) Vs = 0.50833 mm/s, Vr = 500 rpm; and (c) Vs = 5.0833 mm/s, Vr = 500 rpm. Thus,
apart from the original problem (c), one case has zero rotational velocity (a) and
the other has a very small advancing velocity (b), since the advancing velocity
cannot be zero.
Studying these three cases reveals some characteristics of the material flow pattern.
Figure 1 shows the particle pattern after joint creation and the velocity streamlines
for these three sets of velocity. Case (c) is the original problem [21]. It shows an
initially straight transverse set of particles that has been welded through. Note that
the material moves backward in a curve, and a thin zone is swept forward on the
advancing side, as seen in the predicted particle tracks.
It can be observed that when the pin is not rotating (case (a)), the flow simply
passes around the pin as around an obstacle. In this case, the advancing velocity is
the same as in the original problem. The discontinuous line is located in the center
of the plate on the weld line and the joint is not created.
In case (b), the rotational velocity of the pin is the same as in the original problem
but the advancing velocity is 10 times smaller. Even though case (b) and case (c)
show the same velocity contour field, the streamlines in case (b) show more material
rotation around the pin. The reason is the much larger ratio between the rotational
and advancing velocities than in case (c). Moreover, the particle tracing technique
makes the effect of the input parameters on the weld quality visible. It is shown
that when the advancing velocity is much smaller than the rotational one, the joint
is defective and non-qualified, and the weld line is not located at the center line.
By comparing the three cases, it can be concluded that the ratio between the
rotational and advancing velocities is crucial to obtain a qualified joint. The two
velocities cannot be selected arbitrarily. In particular, an advancing velocity that
is very low compared with the rotational one does not lead to a qualified joint.
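To make the velocity ratio concrete, the three cases can be compared through the ratio of the pin's peripheral speed to the advancing speed. This is a sketch under an assumption: the text does not define the ratio explicitly, so the peripheral speed of the 10 mm pin is used here as the measure of the rotational velocity.

```python
import math

PIN_RADIUS_MM = 5.0   # pin diameter of 10 mm

def peripheral_speed(rpm, radius_mm=PIN_RADIUS_MM):
    """Tangential speed of the pin surface in mm/s."""
    return 2.0 * math.pi * rpm / 60.0 * radius_mm

# (advancing speed Vs in mm/s, rotational speed Vr in rpm) for each case
cases = {"a": (5.0833, 0.0), "b": (0.50833, 500.0), "c": (5.0833, 500.0)}

ratios = {}
for name, (v_s, v_r) in cases.items():
    ratios[name] = peripheral_speed(v_r) / v_s
    print(f"case ({name}): peripheral/advancing = {ratios[name]:.1f}")
```

With this definition the ratio is zero in case (a), about 51.5 in the original case (c) and ten times larger in case (b), which matches the ordering of the three flow patterns discussed above.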
3.2 Different Pin Shapes Effect on the Weld
The second example investigates the effect of different pin shapes on the joint
creation using the proposed particle tracing method. The pin shapes considered are
shown in Fig. 2: (a) triflute; (b) trivex; (c) circular; (d) triangular. The pins are
generated from an originally circular section of 10 mm diameter (Fig. 2). The trivex
pin design is approximately triangular; the three points of the pin form an
equilateral triangle and are connected by convex sides. The triflute pin shape is
obtained from an original circular section by removing three circular segments.
Fig. 1 Creation of the weld joint with different input parameters. a Vs = 5.0833 mm/s; Vr = 0 rpm.
b Vs = 0.50833 mm/s; Vr = 500 rpm. c Vs = 5.0833 mm/s; Vr = 500 rpm
Fig. 2 Different types of pin shape. a Triflute, b trivex, c circular, d triangular
Fig. 3 Temperature field and streamlines obtained from different pin shapes
A square domain of 80 × 80 mm² is considered with a circular heat affected zone
of 20 mm in diameter. An ALE framework is used in the heat affected zone
and the rest of the domain is defined in the Eulerian framework. The RK4
integration technique is used for the solution of the particle tracing problem.
In a first step, the problem is analyzed under the stick condition. Figure 3 illustrates
the temperature contour fields together with the streamlines. The streamlines
show that voids are created by pins with sharp corners, such as the trivex and
especially the triangular pin, while the circular and triflute pins present similar
streamlines, given that the stick condition is assumed. Note that the triflute shows
significantly more material being captured and taken around the tool more than once,
whereas the trivex struggles to fill the space behind the tool on the advancing side.
This is consistent with the generation of a void in the wake of a trivex tool.
The triflute pin has a high swept rate due to the segments, and a tool design with a
higher swept rate reduces the voids.
Fig. 4 Pressure contour field obtained from different pin shapes
It can be observed that the generated heat is greater for the triflute and the circular
pins than for the trivex and the triangular pins, since in the stick condition more
material moves together with these pins.
The pressure contour field is illustrated in Fig. 4. Pins with sharper corners
(triangular and trivex) reach a higher maximum pressure than those with convex
sides (circular and triflute). However, the maximum pressure is of the same order
in all cases.
In a next step, the effect of the slip condition on the same problem is studied. In
this case, the pin rotational velocity has less effect on the work-piece than in the
stick case. The triflute and the circular pin lead to considerably different streamlines
(Fig. 5). The streamlines for a circular pin show a flow passing around an obstacle,
while for a triflute pin they show the material trapped in the segments of the pin
moving with it. In the slip case, the joint is not qualified, even though in the triflute
case the joint is created due to the effect of the segments. In the stick case, the joint
is created following the ring patterns generally observed in the FSW process. The
effect of the segments can also be seen in Fig. 6 for the slip case, as the material
close to the pin is affected by the pin velocity.
Fig. 5 Streamlines and created joints in both the stick and slip cases for circular and triflute pins
Fig. 6 Velocity contour field in the slip case for circular and triflute pins
Figure 7 shows that for both the slip and stick cases, joints are created using
triangular and trivex pins. However, in the stick case, the joint is not qualified and
does not follow the usual pattern of FSW due to the void creation. When the slip
condition is assumed, the voids are not created. Material around the pin does not
share the same velocity as the pin, but it moves due to the non-circular shape (Fig. 8)
and a qualified joint is created.
Fig. 7 Streamlines and created joints in both the stick and slip cases for triangular and trivex pins
Fig. 8 Streamlines and velocity contour field in the slip case for triangular and trivex pins
It can be concluded that different types of pin shapes can be selected for different
conditions of the weld. In the stick case, pins without sharp corners create qualified
joints, while pins with sharp corners can be used in the slip case.
Material deformed by the friction stir tool must be capable of filling the void
produced by a traversing pin. If the tool design is incorrect, the deformed material
will cool before the material can fully fill the region directly behind the tool.
The presented results are preliminary, but the proposed method could clearly be
of great benefit in reducing experimental trials if near optimal welding conditions
could be predicted directly from knowledge of the material joint behavior.
4 Conclusion
The work deals with the simulation and visualization of material flow. The simulation
of the transient phase is important for understanding the material behavior. The model
can provide this insight by computing the particles' thermo-mechanical history. If
the process is defined in an ALE/Eulerian setting, an additional method must be
introduced in order to find the particles' history. The particle tracing method for the
material stirring during and after welding is applied to the material flow visualization
of the FSW process. The RK4 integration method is used for the computation of
particle trajectories.
From the ring-shaped flow pattern left after the welding, it is found that the ratio
between the rotational and the advancing speed is one of the key factors in creating
a qualified joint. The effect of pin shapes on the weld quality is studied. It is
found that in the stick case, pins with sharp corners (triangular and trivex) generate
voids, while this problem does not appear in the slip case. Moreover, the study of the
segments of a triflute pin shows that the material trapped in the segments moves with
the pin in both the stick and slip cases.
Acknowledgments This work was supported by the European Research Council under the
Advanced Grant: ERC-2009-AdG "Real Time Computational Mechanics Techniques for Multi-Fluid
Problems". The authors are also thankful for the financial support of the Spanish Ministerio de
Educación y Ciencia (PROFIT programme) within the project CIT-0204002007-82.
References
1. Thomas WM, Nicholas ED, Needham JC, Murch MG, Temple-Smith P, Dawes CJ (1991)
Friction-stir butt welding. GB Patent No. 9125978.8, International Patent No. PCT/GB92/02203
2. London B, Mahoney M, Bingel B, Calabrese R, Waldron D (2001) Experimental methods for
determining material flow in friction stir welds. The third international symposium on friction
stir welding, Kobe, Japan, 27–28 Sept 2001
3. Reynolds AP (2008) Flow visualization and simulation in FSW. Scripta Materialia 58:338–342
4. Seidel TU, Reynolds AP (2001) Visualization of the material flow in AA2195 friction stir welds
using a marker insert technique. Metall Mater Trans A 32:2879–2884
5. Colligan K (1999) Material flow behaviour during friction stir welding of aluminium. Weld J
78:229–237
6. Guerra M, Schmidt C, McClure JC, Murr LE, Nunes AC (2003) Flow patterns during friction
stir welding. Mater Charact 49:95–101
7. Dickerson T, Shercliff HR, Schmidt H (2003) A weld marker technique for flow visualization
in friction stir welding. 4th international symposium on friction stir welding, Park City, Utah,
USA, 14–16 May 2003
8. Kallgren T, Jin L-Z, Sandstrom R (2008) Material flow during friction stir welding of copper.
7th international friction stir welding symposium, Awaji Island, Japan, 20–22 May
9. Johnson R, Threadgill P (2003) Friction stir welding of magnesium alloys. In: Kaplan HI
(ed) Magnesium technology 2003 (TMS-The Minerals, Metals & Materials Society, 2003),
pp 147–152
10. Ouyang J, Yarrapareddy E, Kovacevic R (2006) Microstructural evolution in the friction
stir welded 6061 aluminum alloy (T6-temper condition) to copper. J Mater Proc Technol
172:110–122
11. Abdollah-Zadeh A, Saeid T, Sazgari B (2008) Microstructural and mechanical properties of
friction stir welded aluminum/copper lap joints. J Alloy Compd 460:535–538
12. Buffa G, Fratini L, Micari F, Shivpuri R (2008) Material flow in FSW of T-joints: experimental
and numerical analysis. Int J Mater Form 1(1):1283–1286
13. Buffa G, Ducato A, Fratini L (2011) Numerical procedure for residual stresses prediction in
friction stir welding. Finite Elem Anal Des 47(4):470–476
14. Alfaro I, Racineux G, Poitou A, Cueto E, Chinesta F (2009) Numerical simulation of friction
stir welding by natural elements method. Int J Mater Form 2(4):225–234
15. Guerdoux S, Fourment L (2009) A 3D numerical simulation of different phases of friction stir
welding. Model Simul Mater Sci Eng 17:075001
16. Feulvarch E, Roux J-C, Bergheau J-M (2013) A simple and robust moving mesh technique for
the finite element simulation of friction stir welding. J Comput Appl Math 246:269–277
17. Chiumenti M, Cervera M, Dialami N (2013) Numerical modeling of friction stir welding
processes. Comput Methods Appl Mech Eng 254:353–369
18. Dialami N, Chiumenti M, Cervera M (2013) An apropos kinematic framework for the numerical
modelling of friction stir welding. Comput Struct 117:48–57
19. Agelet de Saracibar C, Chiumenti M, Cervera M, Dialami N, Seret A (2014) Computational
modeling and sub-grid scale stabilization of incompressibility and convection in the numerical
simulation of friction stir welding processes. Archives of Computational Methods in Engineering
21(1):3–37. doi:10.1007/s11831-014-9094-z
20. Bussetta P, Dialami N, Boman R, Chiumenti M, Agelet de Saracibar C, Cervera M, Ponthot JP
(2013) Comparison of a fluid and a solid approach for the numerical simulation of friction stir
welding with a non-cylindrical pin. Steel Research International. doi:10.1002/srin.201300182
21. Dialami N, Chiumenti M, Cervera M, Agelet de Saracibar C, Ponthot JP (2013) Material flow
visualization in friction stir welding via particle tracing. Int J Mater Form.
doi:10.1007/s12289-013-1157-4
Some Considerations on Surface Condition
of Solid in Computational Fluid-Structure
Interaction
Masao Yokoyama, Kohei Murotani, Genki Yagawa and Osamu Mochizuki
Abstract The surface condition of a solid or structure is a serious issue in the
numerical simulation of Fluid-Structure Interaction (FSI). The present paper describes
an engineering model for calculating the FSI with the surface condition of a hydro-gel
solid, which is employed to modify the wall shear stress at the interface between
the fluid and the solid. Using the proposed model, we show some numerical results,
including the large scale parallel computing of the water splash generated by a hydro-gel
sphere diving into water, which compares well with the experimental observation.
Keywords Fluid structure interaction · Surface condition · Splash · Hydro-gel ·
Particle method · Numerical simulation
1 Introduction
Fluid-Structure Interaction (FSI) is one of the most popular topics in computational
mechanics. It covers a wide range of phenomena in social, scientific and
engineering fields such as vehicles, medicine, civil engineering and construction,
agriculture, forestry, disaster prevention, music, sports, etc. (see Fig. 1).
Related research topics in fluid dynamics and FSI include, among others, vortices,
vibration of structures, sloshing, droplets, splashes and bubbles. For example, the vibration
M. Yokoyama
School of Information Science, Meisei University, Hino, Japan
K. Murotani
School of Engineering, University of Tokyo, Bunkyo, Japan
G. Yagawa (B)
Center for Computational Mechanics Research, Toyo University, Bunkyo, Japan
O. Mochizuki
Faculty of Science and Engineering, Toyo University, Bunkyo, Japan

Fig. 1 Some keywords and applications of fluid structure interaction. Engineering: ship, tank,
automobile, pump, airplane, turbine, parachute, valve, dam, breakwater. Biomechanics: blood
vessel, RBC, microchannel, membrane, aneurysm, capsule, arteriosclerosis, bird, fish, butterfly.
Sports and entertainment: sportswear, racket, ball, music instrument, game CG. Keywords: drop,
splash, vortex, surface tension, bubble, turbulence, sloshing, drag, noise
of structure caused by the Kármán vortex street has been studied for many years,
which is a lock-in phenomenon caused by the vortex street behind a circular cylinder
[1]. Vortices and flow separation occur when a solid or a structure moves in a fluid,
which often results in destruction, noise, the stall of an airplane or the drag
of a vehicle, and various studies have been performed: the effect of surface
unevenness such as a turbulator, a vortex generator, a tripping wire, riblets on a wing
or dimples on a golf ball [2], the drag reduction of a ship by micro bubbles [3], the
effect of the deformation of an elastic body [4], the relation between vortices and
vibration [5], etc. FSI studies have also contributed to sports engineering, for
example to improving the movement form of swimmers [6, 7]. Regarding the sound of
musical instruments, the vibration of a musical instrument and the surrounding air and
the sound production mechanism of the air reed of a pipe organ or flute have been
studied [8–10].
When we discuss the interaction between solid and fluid, the condition of the
interface between them is a matter of debate. It is well known that the surface
condition, such as the roughness or the uneven shape of the solid surface, influences
the flow field and the movement of the solid, as seen in the case of the dimples of
a golf ball [11].
The free surface flow such as the splash and drops induced by the movement of a solid
object is also an interesting topic of the FSI. For example, even if the surfaces of
a hydro-gel ball and an acrylic resin ball both look smooth, the form of the splash
created by the hydro-gel ball differs from that of the acrylic resin ball, and the
velocity distribution of the water around each ball is different as well.
In the field of biomechanics, there are some interesting papers besides well-known
studies such as those on flapping wings or swimming fish. Yabe et al. [12] developed
an algorithm for calculating the surface tension and contact angle in the motion of
water striders, finding two types of movement by experiment and simulation. In
addition, a finite volume simulation of underwater propulsion by a flagellum acting as
a micro propulsion device was performed by Kobayashi et al. [13], who verified the
effect of a flagellum with projecting mastigonemes as the surface condition. These
studies examined the interfacial tension and shear friction arising from differences in
the physical shape of the surface in the interaction between the movement of living
things and the water flow.
Although several studies have been published on the treatment of the wall surface
in numerical computation, the majority of them introduce the contact angle to express
the wettability of the wall surface. Examples include the vibration and separation of
a droplet by the MPS method [14], the deformation of a gas bubble rising in a liquid
[15], and the contact angle of a droplet on a wall with a hybrid particle and mesh
technique [16]. However, these are not so accurate because they calculate the
curvature of the interface from the ratio of discrete particle counts using the
Continuum Surface Force method.
The splash of a ball plunging into a water surface was observed experimentally by
Worthington [17], who discussed the influence of the surface state of the ball on the
splash (Fig. 2). With a dry smooth ball, the splash was small. With a ball roughened
with sandpaper or a wet ball, a big crown-like splash formed. This suggests that, in
competitive diving, a swimmer who jumps into the pool without wiping the body well
or with a wet swimming suit produces a larger splash and is at a disadvantage in
scoring. Thus, it is important for a swimmer to wipe the body well before jumping
into the pool in order to get a high score.
The variation of the milk-crown splash was observed by Krechetnikov and Homsy [18]
(Fig. 3), who found that the crown type differs according to the Weber number
and also discovered frustration phenomena in the wave number selection of the
crown spike structure. Akers et al. [19] studied the influence of non-Newtonian fluid
on splash formation, focusing on the property of water. Duez et al. [20] reported
the relation between the splash formation and the sound generation, studying the
effect of a hydrophobic or hydrophilic body surface on the splash. Yoon
et al. [21] studied the fingers generated at the tip of the film flow that runs along
the surface of an object when it plunges into a water surface.
In numerical simulation, a ship hit by a huge wave and a solid cube falling
into a container of water were calculated by Idelsohn et al. [22] (Fig. 4). They
describe their work as follows: a method is presented for the solution of the
incompressible fluid flow equations using a Lagrangian formulation. The interpolation
functions are those used in the Meshless Finite Element Method (MFEM). Classical
stabilization terms used in the momentum equations are unnecessary due to the
lack of convective terms in the Lagrangian formulation. Furthermore, the Lagrangian
formulation simplifies the connections with fixed or moving solid structures, thus
providing a very easy way to solve fluid-structure interaction problems.
In most of the FSI studies mentioned above, however, the wall of the solid
or the structure is assumed to satisfy the non-slip condition in the numerical
simulation. Thus, there are few studies which take slip at the surface of the
solid object into consideration, although analyses of the flow considering
structural unevenness or elasticity of the wall surface have been conducted. The
non-slip condition is not a realistic assumption in some cases. For example, creatures
living in water such as fish and amphibians have a slimy mucus skin, whose principal
ingredient is a hydrogel known as mucin [23]. Furthermore, since the inner wall of
the digestive organs or the blood vessels has a slippery surface, it seems important
to take the characteristics of such slippery surfaces into consideration in numerical
simulation.
Fig. 2 Observation of the splash by a ball [17]
Fig. 3 Different crown types occur according to the Weber number ($We = \rho D V^2 / \sigma$; $\rho$: density,
$D$: diameter, $V$: velocity, $\sigma$: surface tension) [18]. a Regular axisymmetric crown. b Regular crown with
spikes. c Irregular crown
Fig. 4 Fluid-structure interaction by meshless FEM [22]
In this paper, we focus on the treatment of the surface condition of an object in the
FSI problem. The experimental observation of the splash is given as a suitable example
in Sect. 2. We outline the numerical solution of the Navier-Stokes
equation by the particle method in Sect. 3. A model introducing the slip of the
object's surface into the particle method is proposed, and the numerical simulation
of the splash under different surface conditions is carried out and compared with the
experimental results in Sect. 4. We show the results of the splash obtained by large
scale parallel calculation in Sect. 5. The concluding remarks and future works are
given in Sect. 6.
Fig. 5 Experimental setup: launcher system, sphere, high speed camera and water tank
2 Experimental Study of Splash

In this section, the experimental results of the splash generated by a hydro-gel sphere
and an acrylic resin sphere are shown to study the difference in splash patterns caused
by the surface condition of the solid object.
The dynamic views of the splash are recorded using a high speed CMOS camera
(Vision Research Inc., Phantom v7.1) set to 4,000 frames per second (Fig. 5).
The test conditions are as follows: the radius of the sphere R is 10 mm,
the initial height h is 50R, and the impact velocity of the sphere at the water surface
Vi is 2.4 m/s.
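For orientation, the Weber number defined in the caption of Fig. 3 can be evaluated for these test conditions. This is only a sketch: it assumes that the sphere diameter 2R is the relevant length scale and uses typical values for water near 20 °C (density 998 kg/m³, surface tension 0.0728 N/m), none of which are stated in the text.

```python
def weber_number(rho, diameter, velocity, sigma):
    """We = rho * D * V**2 / sigma, all quantities in SI units."""
    return rho * diameter * velocity**2 / sigma

RHO_WATER = 998.0      # kg/m^3, assumed value for water near 20 C
SIGMA_WATER = 0.0728   # N/m, assumed water-air surface tension near 20 C
R = 0.010              # m, sphere radius from the test conditions
V_I = 2.4              # m/s, impact velocity from the test conditions

we = weber_number(RHO_WATER, 2.0 * R, V_I, SIGMA_WATER)
print(f"We = {we:.0f}")
```

Under these assumptions the Weber number is of the order of 10³, so inertia dominates surface tension at impact.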
The experimental results of a splash formed by a sphere impinging on the water
surface are shown in Fig. 6, comparing the primary splash formed by a hydrogel
sphere (Fig. 6a) with that formed by an acrylic sphere (Fig. 6b). The primary splash
is the splash that rises first after an object plunges into the water. The primary
splash formed in the case of the hydrogel is of the crown type. On the other hand, the
acrylic sphere creates a column-type primary splash. The splashes are considered to be
formed by the dynamics of the film flow [24], which is a thin water flow around the
sphere surface generated immediately after the sphere impacts the water surface. The
difference between the formation processes of Fig. 6a and b is due to the difference
in the film separation from the sphere surface.
When the film separates from the sphere surface, a crown-type splash is formed.
This film separation is presumably caused by the increase in the film velocity
due to the hydrophilic property of the solid wall and by attractive or repulsive
forces, such as the electrostatic force, between the solid wall and the water. This
experimental observation suggests that the numerical simulation should take into
consideration the various surface conditions in the interaction between the object
and the water.
Fig. 6 Comparison of splash patterns between a hydrogel (agar) and b acrylic resin (radius of
sphere = 10 mm and impact velocity = 2.21 m/s)
3 Governing Equations and Particle Method
Although several studies have paid attention to the coarseness of the surface or the
liquid separation in the FSI simulation, the difference in the splash due to the
material cannot be simulated with the conventional method. In other words, the
simulated splash patterns for the hydrogel object and the acrylic resin object
turn out to be the same. The reason can be attributed to the fact that these
simulations have assumed the boundary between the fluid and the solid to be of the
non-slip type. In this paper, we propose a calculation method in which the difference
in the splash due to the difference in the solid material is reproduced.
The method employed in the present paper is the MPS method, a particle
method recognized as an effective technique for performing FSI simulations.
The method is semi-implicit: after the temporary positions of the particles are
calculated with the equation of motion in the explicit stage, the Poisson
equation of pressure is solved in order to satisfy mass conservation, expressed as
the condition that the number of particles per small volume is constant. We discuss
here how to introduce the effect of slip into the MPS method in order to solve the
flow field around the hydro-gel surface.
The governing equation of the present flow is the incompressible Navier-Stokes
equation as follows,
\[ \frac{D\mathbf{u}}{Dt} = \mathbf{F} - \frac{1}{\rho}\nabla P + \nu \nabla^2 \mathbf{u} \tag{1} \]
where $\mathbf{u}$ is the velocity vector of the fluid, $\rho$ is the density of the fluid, $P$ is the
pressure, $\nu$ is the kinematic viscosity of the fluid and $\mathbf{F}$ is the external force. Consider
two particles $i$ and $j$ with pressures $p_i$ and $p_j$, respectively. The gradient of pressure
at the point $i$ is written as [25]
\[ \langle \nabla P \rangle_i = \frac{d}{n^0} \sum_{j \neq i} \frac{p_j - p_i}{|\mathbf{r}_j - \mathbf{r}_i|^2}\, (\mathbf{r}_j - \mathbf{r}_i)\, \omega(|\mathbf{r}_j - \mathbf{r}_i|) \tag{2} \]
where $d$ is a constant equal to the number of space dimensions of the problem
and $n^0$ is called the particle number density.
The Laplacian of velocity at the point $i$ is written as
\[ \langle \nabla^2 \mathbf{u} \rangle_i = \frac{2d}{\lambda n^0} \sum_{j \neq i} (\mathbf{u}_j - \mathbf{u}_i)\, \omega(|\mathbf{r}_j - \mathbf{r}_i|) \tag{3} \]
where $\lambda$ is a parameter introduced in order to make the statistical distribution
coincide with the analytic solution, and $\omega$ is the weight function assumed as follows,
\[ \omega(r) = \begin{cases} \dfrac{r_e}{r} - 1 & (0 \le r < r_e) \\ 0 & (r_e \le r) \end{cases} \tag{4} \]
Here, $r$ is the distance between two particles and $r_e$ is the cut-off radius.
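The weight function and the discrete gradient and Laplacian can be checked numerically with a short sketch. It is written under several assumptions that the text does not fix: a regular 2D lattice of spacing `l0` as the particle arrangement, the reference particle number density taken as the sum of weights around an interior particle, and the Laplacian parameter chosen as the weighted mean of the squared neighbour distances (a common choice in the MPS literature). All function names are hypothetical.

```python
import numpy as np

def weight(r, r_e):
    """Weight function: r_e / r - 1 inside the cut-off radius, 0 outside."""
    r = np.asarray(r, dtype=float)
    w = np.zeros_like(r)
    inside = (r > 0.0) & (r < r_e)
    w[inside] = r_e / r[inside] - 1.0
    return w

def neighbour_data(pos, i, r_e):
    """Offsets r_j - r_i, distances and weights of all particles j != i."""
    rij = np.delete(pos, i, axis=0) - pos[i]
    dist = np.linalg.norm(rij, axis=1)
    return rij, dist, weight(dist, r_e)

def gradient(pos, p, i, r_e, n0):
    """Discrete pressure gradient at particle i."""
    d = pos.shape[1]
    rij, dist, w = neighbour_data(pos, i, r_e)
    dp = np.delete(p, i) - p[i]
    return d / n0 * np.sum((dp / dist**2 * w)[:, None] * rij, axis=0)

def laplacian(pos, u, i, r_e, n0, lam):
    """Discrete Laplacian of a scalar field u at particle i."""
    d = pos.shape[1]
    _, _, w = neighbour_data(pos, i, r_e)
    du = np.delete(u, i) - u[i]
    return 2.0 * d / (lam * n0) * np.sum(du * w)

# Assumed test configuration: 11 x 11 lattice with spacing l0
l0 = 0.1
xs = np.arange(-5, 6) * l0
pos = np.array([[x, y] for x in xs for y in xs])
i = len(pos) // 2                       # central particle at the origin
r_e = 2.1 * l0                          # cut-off radius

_, dist, w = neighbour_data(pos, i, r_e)
n0 = w.sum()                            # reference particle number density
lam = np.sum(w * dist**2) / n0          # assumed choice of the parameter

p = 3.0 * pos[:, 0] - 2.0 * pos[:, 1]   # linear field, gradient (3, -2)
u = pos[:, 0]**2 + pos[:, 1]**2         # quadratic field, Laplacian 4
print(gradient(pos, p, i, r_e, n0), laplacian(pos, u, i, r_e, n0, lam))
```

On this symmetric lattice the two operators recover the gradient of the linear field and the Laplacian of the quadratic field at the central particle, which illustrates the role of the two normalisation constants.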
The algorithm of the MPS method takes the following steps (i) to (iv): (i) a tentative
velocity $\mathbf{u}^*$ is calculated at the explicit stage using $\mathbf{F}$ and the viscous term of the
Navier-Stokes equation (1); (ii) the Poisson equation of the pressure is solved at the
implicit stage to obtain the pressure $p$, and a velocity correction $\mathbf{u}'$ is computed
from this $p$ so that the particle number density is conserved; (iii) $\mathbf{u}'$
is added to $\mathbf{u}^*$ to obtain the velocity $\mathbf{u}$ at $t + dt$; (iv) the time is advanced,
$t = t + dt$, where the time step $dt$ is 0.001 s in the present paper.
4 Flow Around Hydro-Gel Wall with Slip
The hydro-gel, which is a kind of polymer gel, is considered here, where the slip ratio
is determined by the moisture content of the hydro-gel. We discuss here how we
heuristically introduce the influence of the slip at the hydro-gel wall into the
calculation. The diving sphere is made of agar as the hydrophilic material, which is
a kind of hydrogel like gelatin and is known to be easy to shape arbitrarily and to
control in its water content. Agar consists of a crosslinked structure of the polymer
agarose with many water molecules held between the polymer structures, which is known
to create a slippery surface. For example, Eddington et al. [26] reported the use of
a hydrogel as a valve for flow control in a microchannel, and Beebe et al. [27]
discussed the effectiveness of hydrogel structures for flow control in microfluidic
channels.
Figure 7 shows the velocity distributions of water flow near the surface of the acrylic resin and near that of the agar-gel, plotted against the height of the water flow, where $u$ is the velocity of water. With $\tau_0$ the wall shear stress under the no-slip condition and $\tau_s$ that under the slip condition, the slip ratio $\beta$ is defined as follows:
$$\beta = \tau_s / \tau_0 \quad (0 < \beta < 1) \qquad (9)$$
Here, the shear stress is obtained experimentally from the flow velocity near the wall:
$$\tau = \mu \left. \frac{du}{dy} \right|_{y=0} \qquad (10)$$
Figure 8 shows the experimental relations between the swelling ratio $S$ and the slip ratio $\beta$ for the agar-gel and the carrageenan-gel, respectively, where the swelling ratio $S$ is defined as follows [28]:
$$S = (m_{water} + m_{gel}) / m_{gel} \qquad (11)$$
where $m_{water}$ is the mass of water and $m_{gel}$ that of the solid gel. $S$ increases with the amount of water contained in the solid gel. The agar employed in this study is a kind of hydrogel [29] whose shape and degree of swelling are easy to handle and control [30]. Figure 8 suggests that the slip ratio $\beta$ decreases as $S$ increases, which can be expressed as
$$\beta = 1 - \gamma S \qquad (12)$$
where $\gamma$ is estimated to be $1.2 \times 10^{-3}$ in the case of the agar. In summary, a larger $S$ gives more slip on the surface.
In this paper, the above experimental relation between the increase of the swelling ratio and the reduction of the wall friction is taken into account near the wall, in a heuristic manner, through the viscous term of the Navier-Stokes equation. Since the shear force acting between the wall and the fluid is directly related to the viscous term of Eq. (3), we scale the weight function $\omega$ of Eq. (4) by the slip ratio $\beta$ and modify the term as
$$\nabla^2 u_i = \frac{2d}{\lambda n^0} \sum_{j \neq i} (u_j - u_i)\, \omega_H(|r_j - r_i|) \qquad (13)$$
with
$$\omega_H(r) = \beta\, \omega(r) \qquad (14)$$
Fig. 7 Comparison of flow profiles near no-slip wall (acrylic resin) and slippery wall (hydrogel)
Fig. 8 Relationship of swelling degree S and slip ratio
where index $i$ denotes a water particle near the hydro-gel wall and $j$ a surface particle of the hydro-gel wall. Namely, the slip ratio is made effective only near the hydrogel wall, because the effect of slip acts only in this area. The effective length of the above reduced weight function near the wall is assumed to be $r_e$ in this study, and is set to $2.1 l_0$, where $l_0$ is the initial distance between the particles.
Summarizing the above procedure: (i) select $S$ according to the hydrophilicity of the solid object, (ii) estimate the slip ratio using Eq. (12), and (iii) apply it to the weight function of the viscous term of the Navier-Stokes equation, using Eqs. (13) and (14), for the calculation of the shear force near the hydrogel wall.
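The three steps above can be sketched in a few lines. This is a hedged illustration under stated assumptions: the names `swelling_ratio`, `slip_ratio`, `weight`, `weight_H` and the flag `pair_is_water_wall` are hypothetical, the coefficient 1.2e-3 is the agar value quoted with Eq. (12), and the slip ratio is assumed to multiply the standard weight only for water-wall particle pairs.

```python
def swelling_ratio(m_water, m_gel):
    """Eq. (11): S = (m_water + m_gel) / m_gel."""
    return (m_water + m_gel) / m_gel

def slip_ratio(S, gamma=1.2e-3):
    """Eq. (12): slip ratio as a linear function of S (gamma is the quoted agar value)."""
    beta = 1.0 - gamma * S
    if not 0.0 < beta < 1.0:
        raise ValueError("slip ratio outside (0, 1); the linear fit does not apply")
    return beta

def weight(r, r_e):
    """Eq. (4): standard MPS weight function."""
    return r_e / r - 1.0 if 0.0 < r < r_e else 0.0

def weight_H(r, r_e, beta, pair_is_water_wall):
    """Eq. (14): weight reduced by the slip ratio for water/hydrogel-wall pairs only."""
    w = weight(r, r_e)
    return beta * w if pair_is_water_wall else w

# Step (i): choose S; step (ii): estimate the slip ratio; step (iii): reduced weight
S = 100.0
beta = slip_ratio(S)
l0 = 1.0
r_e = 2.1 * l0
w_wall = weight_H(1.05 * l0, r_e, beta, pair_is_water_wall=True)
w_bulk = weight_H(1.05 * l0, r_e, beta, pair_is_water_wall=False)
```

Keeping the flag on the particle pair, rather than on the particle, leaves the water-water interactions of near-wall particles unchanged, which matches the statement that only the wall shear is reduced.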
Next, we show the 2D splash simulation employing the above method. The effect of the slip ratio on the flow around a hydrogel sphere can be taken into account with Eq. (11). The comparison of the simulation result with S = 100 and the experimental one is shown in Fig. 9, where the radius of the sphere R is 10 mm and the initial height h is 50R in both the simulation and the experiment. The water tank has a width of 20R and a depth of 20R, for which we confirmed that the effect of the tank wall was negligible. Assuming that the sphere touches the water surface at t = 0, the left figures in Fig. 9 are snapshots of the splashes at t = 0.02.
Fig. 9 Crown-type splash of hydrogel (S = 100) by experiment and simulation, where the primary splash (t = 0.02) and the air cavity (t = 0.03) are shown
The first splash, which is created just after the sphere touches the water surface, is the so-called primary splash. It is seen that the sphere also creates a cavity. The pattern of the crown-type splash and the air cavity obtained by the present simulation is similar to the experimental result. Neither the crown-type splash nor the air cavity occurs in the case of the acrylic resin sphere.
Fig. 10 Comparison between simulation results of representative path lines of water particles of primary splash for hydrogel spheres of different values of swelling parameter S (S = 50 and S = 350)
Fig. 11 Analysis domain for 3D splash simulation (left figure) and arrangement of particles of hydro-gel sphere and water viewed from the top (right figure)
Figure 10 shows the crown-type splashes and the representative path trajectories of particles for the different swelling ratios. The solid lines are the path trajectories of particles when S is 50 (slip ratio 0.94), whereas the dotted lines are those when S is 350 (slip ratio 0.7). It is seen from the figure that the splashes spread more widely with a larger value of S: the velocity of the water near the wall increases with the swelling ratio, which causes earlier separation of the flow from the sphere and creates a wider primary splash.
Fig. 12 3D splash (left figure) at t = 0.05 and domain decomposition (right figure)
Fig. 13 Comparison of splash patterns with the different initial distances of water particles l0
(S = 100, re = 4.1 and t = 0.03 s)
5 Extension to 3D Simulation with Large Scale Parallel Computing
A 3D simulation is expected to make it possible to observe more detailed behaviors of the splash. For example, a 3D simulation could allow us to calculate the crown-type splash with the finger [24] or the spike [18], which cannot be reproduced with the 2D simulation. It is also expected to clarify the generation mechanism of these features, including their number and size, which depend on the Weber number; the rotational movement of the splash around an acrylic resin sphere, which is generated in the column-type splash; and the texture at the wall of the air cavity created by the sinking sphere.
We perform here a large scale particle simulation using the distributed memory
parallel computers, employing the two-level domain decomposition [31]. The first
level domain decomposition is performed in order to keep the balance of the number
of particles among nodes and the second level domain decomposition is performed
in order to keep the balance of the number of particles among threads in each node.
The calculation flow is as follows:
1. A bounding box of a whole region is defined, and is filled with buckets. Since an
influence radius of a particle is defined in the MPS method, the size of a bucket
is set to be wider than the influence radius.
2. All the particles are embedded in the buckets.
3. The bucket-based domain decomposition is performed with an equal number of
particles at each subdomain by ParMETIS [32].
4. If an imbalance in the number of particles among regions appears, the domain
decomposition is performed again in order to recover the balance of the number
of particles.
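Steps 1 and 2 of the flow above (filling the bounding box with buckets and embedding the particles) can be sketched as below. This is an illustrative serial version only: the function names are hypothetical, and the ParMETIS partitioning of step 3 and the rebalancing of step 4 are not reproduced. The only requirement taken from the text is that the bucket size is at least the influence radius, so that all neighbors of a particle lie in its own or an adjacent bucket.

```python
from collections import defaultdict
from itertools import product

import numpy as np

def build_buckets(pos, bucket_size):
    """Embed particles in a uniform grid of buckets covering their bounding box."""
    origin = pos.min(axis=0)
    idx = np.floor((pos - origin) / bucket_size).astype(int)
    buckets = defaultdict(list)
    for p, key in enumerate(map(tuple, idx)):
        buckets[key].append(p)
    return buckets, idx

def neighbors(i, pos, buckets, idx, r_e):
    """Particles closer than r_e to particle i, searching only the adjacent buckets."""
    found = []
    for offset in product((-1, 0, 1), repeat=pos.shape[1]):
        key = tuple(idx[i] + np.array(offset))
        for j in buckets.get(key, ()):
            if j != i and np.linalg.norm(pos[j] - pos[i]) < r_e:
                found.append(j)
    return sorted(found)

rng = np.random.default_rng(0)
pos = rng.random((200, 3))            # 200 particles in the unit cube
r_e = 0.1                             # influence radius
buckets, idx = build_buckets(pos, bucket_size=1.2 * r_e)
near_0 = neighbors(0, pos, buckets, idx, r_e)
```

The particle count per bucket, `len(buckets[key])`, is the kind of quantity a partitioner would use as a load weight when distributing buckets among nodes.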
The parallel computer used here is the FX10 in the Information Technology Center
of the University of Tokyo. The processor of the above FX10 is the SPARC64 IXfx,
where a processor node has 16 cores of 1.848 GHz and 32 GB memory. In this
research, the OpenMP is used for parallelization in each node and the MPI is used
for parallelization among nodes.
Figure 11 shows the simulation setup of the crown-type splash in the case of the
hydro-gel sphere.
The initial locations of particles are arranged concentrically, and the water tank
is of a circular cylinder. Let the diameter of a particle and be 0.0005 m and 0.4,
respectively. Figure 12 shows the result of the splash analysis at t = 0.05 s using 53 million particles. This analysis took about 12 h using 240 nodes of the FX10. The velocity of each particle is shown with color gradation on the left-hand side of Fig. 12. The right-hand side of Fig. 12 shows the time sequence of the domain decomposition, in which the nodes are distinguished by different colors.
Figure 13 shows the simulation results of our 3D calculation, which are in good agreement with the experimental result shown in Fig. 9: the crown-type splash, the air cavity and the scattering of droplets are expressed well. It is observed that the particle diameter, described by the initial distance between particles $l_0$, influences the splash pattern; namely, as $l_0$ becomes smaller and the total number of particles becomes larger, the splash pattern is expressed more clearly. The effect of the slip ratio on the splash pattern in the 3D simulation is still under analysis, though we are able to see a difference in the width and height of the splash pattern. In order to simulate the finger or the spike in the crown-type splash and its relation with the Weber number, still larger scale computing is needed.
6 Conclusion
1. Focusing on the treatments of the interface between the solid and the fluid,
we propose a calculation method with the slip effect on the surface of a slimy
material.
2. Experimental results of water splashes taken by a high-speed camera show that
the splash pattern caused by an acrylic resin sphere is different from that caused
by a hydro-gel sphere.
3. An engineering model to express the slimy surface of creatures living in water, such as fish or frogs, is proposed, in which the slip ratio, the reduction ratio of the shear stress near a solid wall obtained through the experiment, is introduced in the shear term of the Navier-Stokes equation.
4. The splash pattern calculated by the proposed method is in good agreement with
the experimental result.
5. The above method for calculating the splash is applied to large scale parallel computing in 3D, which depicts more detailed splash patterns. As future work, larger scale computing and a model of the surface tension are needed to observe the finger or the spike seen in the case of the milk crown.
Acknowledgments This research was supported by the MEXT Program for the Strategic Research Foundation at Private Universities, 2012-2017, and the WCU (World Class University) Program through the Korea Science and Engineering Foundation funded by the Korean Ministry of Education, Science and Technology (R33-2008-000-10027-0).
References
1. Billah KY, Scanlan RH (1991) Resonance, Tacoma Narrows bridge failure, and undergraduate physics textbooks. Am J Phys 59(2):118-124
2. Davies JM (1949) The aerodynamics of golf balls. J Appl Phys 20(9):821-828
3. Kodama Y, Kakugawa A, Takahashi T, Kawashima H (2000) Experimental study on microbubbles and their applicability to ships for skin friction reduction. Int J Heat Fluid Flow 21(5):582-588
4. Étienne S, Pelletier D (2005) A general approach to sensitivity analysis of fluid-structure interactions. J Fluids Struct 21(2):169-186
5. He T, Zhou D, Bao Y (2012) Combined interface boundary condition method for fluid-rigid body interaction. Comput Methods Appl Mech Eng 223:81-102
6. Pendergast DR, Mollendorf JC, Cuviello R, Termin AC (2006) Application of theoretical principles to swimsuit drag reduction. Sports Eng 9(2):65-76
7. Moria H, Chowdhury H, Alam F, Subic A, Smits AJ, Jassim R, Bajaba NS (2010) Contribution of swimsuits to swimmers' performance. Procedia Eng 2(2):2505-2510
8. Fletcher NH (1976) Sound production by organ flue pipes. J Acoust Soc Am 60:926
9. Coltman JW (1968) Sounding mechanism of the flute and organ pipe. J Acoust Soc Am 44:983
10. Tsuchida J, Fujisawa T, Yagawa G (2006) Direct numerical simulation of aerodynamic sounds by a compressible CFD scheme with node-by-node finite elements. Comput Methods Appl Mech Eng 195(13):1896-1910
11. Maruyama T (1999) Surface and inlet boundary conditions for the simulation of turbulent boundary layer over complex rough surfaces. J Wind Eng Ind Aerodyn 81(1):311-322
12. Yabe T, Chinda K, Hiraishi T (2007) Computation of surface tension and contact angle and its application to water strider. Comput Fluids 36(1):184-190
13. Kobayashi S, Watanabe R, Oiwa T, Morikawa H (2009) Computational study of micropropulsion mechanism in water modeled on flagellum with projecting mastigonemes. J Biomech Sci Eng 4(1):11-22
14. Nomura K, Koshizuka S, Oka Y, Obata H (2001) Numerical analysis of droplet breakup behavior using particle method. J Nucl Sci Technol 38(12):1057-1064
15. Caboussat A (2006) A numerical method for the simulation of free surface flows with surface tension. Comput Fluids 35(10):1205-1216
16. Liu J, Koshizuka S, Oka Y (2005) A hybrid particle-mesh method for viscous, incompressible, multiphase flows. J Comput Phys 202(1):65-93
17. Worthington AM (1882) On impact with a liquid surface. Proc R Soc Lond 34(220-223):217-230
18. Krechetnikov R, Homsy GM (2009) Crown-forming instability phenomena in the drop splash problem. J Colloid Interface Sci 331(2):555-559
19. Akers B, Belmonte A (2006) Impact dynamics of a solid sphere falling into a viscoelastic micellar fluid. J Nonnewton Fluid Mech 135(2):97-108
20. Duez C, Ybert C, Clanet C, Bocquet L (2007) Making a splash with water repellency. Nat Phys 3(3):180-183
21. Yoon SS, Jepsen RA, Nissen MR, O'Hern TJ (2007) Experimental investigation on splashing and nonlinear fingerlike instability of large water drops. J Fluids Struct 23(1):101-115
22. Idelsohn SR, Oñate E, Del Pin F (2003) A Lagrangian meshless finite element method applied to fluid-structure interaction problems. Comput Struct 81(8):655-671
23. Ling SC, Ling TYJ (1974) Anomalous drag-reducing phenomenon at a water/fish-mucus or polymer interface. J Fluid Mech 65(03):499-512
24. Kubota Y, Mochizuki O (2009) Splash formation by a spherical object plunging into water. J Vis 12:339-345
25. Koshizuka S (1995) A particle method for incompressible viscous flow with fluid fragmentation. Comput Fluid Dynamics J 4:29-46
26. Eddington DT, Beebe DJ (2004) Flow control with hydrogels. Adv Drug Deliv Rev 56(2):199-210
27. Beebe DJ, Moore JS, Bauer JM, Yu Q, Liu RH, Devadoss C, Jo BH (2000) Functional hydrogel structures for autonomous flow control inside microfluidic channels. Nature 404(6778):588-590
28. Alupei IC, Popa M, Hamcerencu M, Abadie MJM (2002) Superabsorbant hydrogels based on xanthan and poly (vinyl alcohol): 1. The study of the swelling properties. Eur Polymer J 38(11):2313-2320
29. Narayanan J, Xiong JY, Liu XY (2006) Determination of agarose gel pore size: absorbance measurements vis a vis other techniques. J Phys: Conf Ser 28(1):83 (IOP Publishing)
30. Kikuchi K, Mochizuki O (2010) A flow on a hydrogel surface mimicked a living cell. In: Proceedings of the 21st international symposium on transport phenomena in Kaohsiung city, Taiwan
31. Yagawa G, Shioya R (1994) Parallel finite elements on a massively parallel computer with domain decomposition, 4. Comput Syst Eng 4:495-503
32. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359-392
Part IV
Reduced-Order Models
Reduced-Order Modelling Strategies
for the Finite Element Approximation
of the Incompressible Navier-Stokes Equations
Joan Baiges, Ramon Codina and Sergio R. Idelsohn
Abstract In this chapter we present some Reduced-Order Modelling methods we have developed for the stabilized incompressible Navier-Stokes equations. In the first part of the chapter, we start from the stabilized finite element approximation of the incompressible flow equations and we build an explicit proper-orthogonal decomposition based reduced-order model. To do this, we treat the pressure and all the
non-linear terms in an explicit way in the time integration scheme. This is possible
due to the fact that the reduced model snapshots and basis functions do already fulfill
an incompressibility constraint weakly. This allows a hyper-reduction approach in
which only the right-hand-side vector needs to be reconstructed. In the second part of
the chapter we present a domain decomposition approach for reduced-order models.
The method consists in restricting the reduced-order basis functions to the nodes
belonging to each of the subdomains. The method is extended to the particular case
in which one of the subdomains is solved by using the high-fidelity, full-order model,
while the other ones are solved by using the low-cost, reduced-order equations.
1 Introduction
Reduced-order models (ROM) are nowadays receiving a lot of interest from the
computational mechanics community. Their most attractive feature is the capability
of reproducing the response of complex physical phenomena through the solution of
systems of equations which involve only very few degrees of freedom.
J. Baiges (B) · R. Codina · S. R. Idelsohn
Centre Internacional de Mètodes Numèrics a l'Enginyeria (CIMNE), Edifici C1, Campus Nord UPC, C/ Gran Capità S/N, 08034 Barcelona, Spain
J. Baiges, R. Codina
Universitat Politècnica de Catalunya, Jordi Girona 1-3, Edifici C1, 08034 Barcelona, Spain
S. R. Idelsohn
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Amongst the various families of reduced-order models, Proper Orthogonal Decomposition (POD) [14, 23, 26] based ROMs train the model by taking snapshots from a high-fidelity simulation and using them to build an orthogonal basis which is capable of accurately representing the solution through the combination of a few of these basis functions. In particular, we are interested in the application
of POD models to the incompressible Navier-Stokes equations, which we originally
approximate by using finite elements and a stabilized formulation. The problem of
applying POD models to the incompressible flow equations has been approached by
several authors [12, 19, 20, 25, 29, 39, 40] in a range of applications like shape
optimization [1, 9, 27, 34] or flow control [3, 21, 32].
The major concerns when making use of a reduced-order model are, on the one
hand, computational cost, and on the other, accuracy. Obviously we are looking for
a cheap reduced-order model which is as accurate as possible. Unfortunately, this is
not always possible. In this chapter we present some approaches we have developed for POD models for the incompressible Navier-Stokes equations which help reduce the computational cost and improve the accuracy of the reduced-order models.
One of the a priori drawbacks of traditional POD is that a straightforward application of a POD strategy to a non-linear problem does not yield a drastic reduction of the required computational cost. This is so because, in order to solve the reduced-order equations, the full-order, non-linear system of equations needs to be built first and then projected onto the reduced-order space. Recently, the so-called hyper-reduction [4, 8, 15, 22, 31, 35-38] has appeared as a means to circumvent this problem. The main idea is to compute the non-linear system entries only at a few nodes of the computational mesh, and then approximate the whole system by extrapolating it from the values of the system at these entries.
In the first part of this chapter, we describe a reduced-order model which is
particularly suitable for hyper-reduction [7]. The basic idea is to build a reduced-order
model based on a proper orthogonal decomposition and a Galerkin projection and
treat all the terms in an explicit way in the time integration scheme. This results in a
reduced-order model where only the right-hand side of the system needs to be rebuilt
at each time step. This is possible because the reduced model snapshots do already
fulfill the stabilized continuity equation and the pressure field can be automatically
recovered at the end of each time step from the reduced order basis and solution
coefficients. We also present a method for choosing the sampling entries from which
the complete system is going to be extrapolated. The method consists of choosing
the sampling points such that the distance between the right-hand-side snapshots
and the recovered snapshots is minimized, with the restriction that the coordinates
of sampling points must coincide with the coordinates of the finite element mesh
nodes.
Another issue which needs to be dealt with when using reduced-order models is
the lack of robustness with respect to changes in the parameters which characterize
the numerical simulation. This lack of robustness causes the reduced model to be
valid only in a small parameter region close to the parameter values for which the
reduced model was built [3], requiring the snapshot collection and the reduced model
to be updated when an optimization process leads to a parameter configuration which lies too far from the starting parameter set.
In the second part of this chapter we present a domain decomposition method
for reduced-order models [6] which we apply to the finite element approximation
of the incompressible Navier-Stokes equations. Domain decomposition methods for
reduced-order models have been used for different simulation problems [2, 10, 28,
30, 33, 41]. In these partitioned approaches reduced-order models are formulated
independently and then glued together in either a monolithic or an iterative way.
Contrary to this approach, the domain decomposition method we propose is obtained
simply by restricting the reduced model basis functions to be non-null only in the
nodes of the computational mesh belonging to the considered subdomain. This def-
inition of the partitioned problem directly ensures the continuity of the recovered
reduced-order solution at the interfaces. Also, there is no need to use the classi-
cal domain-decomposition iteration by subdomain schemes, because the Domain
Decomposition Reduced-Order Model (DD-ROM) is written in terms of the parti-
tioned reduced bases in a monolithic way. One of the advantages of the proposed
method is the ease for generating a hybrid full-order/reduced-order model, as a par-
ticular case of the general DD-ROM method. The proposed hybrid DD-ROM model
can be easily used together with hyper-reduced models, as we demonstrate in the
numerical examples section.
The chapter is organized as follows. In Sect. 2 we present an explicit ROM for the
finite element approximation of the incompressible Navier-Stokes equations, and a
numerical example illustrates the behavior of the model for low Reynolds flow cases.
In Sect. 3 we describe the hyper-reduction strategy we apply to the explicit ROM,
and we also present the Discrete Best Points Interpolation Method for the selection
of sampling indices for the gappy-POD reconstruction process. Finally, the domain-
decomposition reduced-order model is presented in Sect. 4, where we also explain
the hybrid full-order/reduced-order domain decomposition approach and present a
numerical example. Some conclusions close the chapter in Sect. 5.
2 An Explicit Reduced-Order Model for the Incompressible Navier-Stokes Equations
When solving a non-linear problem by means of a POD based ROM, it is necessary to project the full-order system of equations onto the reduced-order space at each iteration of the non-linear problem. This is troublesome because the expected orders-of-magnitude reduction in the computational cost of solving the reduced-order system is not observed in practice: the computational time of the reduced-order model is governed by the need to rebuild the full-order system at each iteration and then project it onto the reduced-order subspace.
This issue has motivated a lot of research recently, leading to several strategies which reduce the cost of computing the projected reduced-order system of equations [4, 8, 15, 22, 31, 35-38]. These approaches are known as hyper-reduced models. In these methods, the non-linear and parameter-dependent terms are recovered by means of a least-squares procedure from a series of sampling points where the function to be approximated is computed. This effectively reduces the amount of computation required to build the reduced-order system, and results in a reduced-order model whose computational cost is directly proportional to its number of degrees of freedom.
We have been working on a hyper-reduced approach for the incompressible Navier-Stokes equations. The particularity of the strategy we propose is that the equations for the reduced-order model are treated in an explicit way. This allows us to send all the non-linear terms to the right-hand side of the reduced-order system, leaving on the left-hand side only the mass matrix due to the temporal derivatives. The main advantage is, of course, that the mass matrix is linear, so the hyper-reduced approaches need only be applied to the right-hand-side vector. This effectively reduces the overall cost of the reduced-order model.
Let us start by introducing some notation for the POD approximation of a general problem. Let $U \in \mathbb{R}^M$ be the global unknown vector associated with a non-linear variational problem. Suppose that, after linearizing and fully discretizing the given problem in time and space, the following matrix form is obtained, which yields the vector of nodal unknowns $U$ at a given iteration of the non-linear procedure, for a certain time step:
$$A U = F, \qquad (1)$$
where $A \in \mathbb{R}^{M \times M}$ is the matrix of the system whose solution is $U$, and $F \in \mathbb{R}^M$ the RHS vector. The POD approximation of the previous system is obtained by projecting it onto a low-dimensional subspace $\mathcal{U}$ of dimension $N$. Vectors $U$ are now approximated by:
$$U \approx \Phi \alpha, \qquad (2)$$
where $\Phi \in \mathbb{R}^{M \times N}$ is the basis for $\mathcal{U}$ and $N$ is the dimension of the reduced-order model, with $N < M$. $\alpha \in \mathbb{R}^N$ are the components in $\mathcal{U}$ expressed in the reference system defined by $\Phi$. The reduced-order basis is obtained by means of the POD method [14, 23, 26], that is, by computing the singular value decomposition of a set of solution snapshots, which in our case are taken from the results of a full-order simulation. After projecting the full-order system onto this reduced-order subspace and applying a least-squares approach, the final reduced-order system is:
$$\Phi^T A \Phi \alpha = \Phi^T F. \qquad (3)$$
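Equations (1) to (3) can be illustrated with a small NumPy sketch. It is a toy example under stated assumptions: the snapshot matrix, the system matrix and the function names `pod_basis` and `solve_reduced` are invented for illustration, and the projection is the plain Galerkin one written in Eq. (3).

```python
import numpy as np

def pod_basis(snapshots, N):
    """POD basis: first N left singular vectors of the snapshot matrix (one snapshot per column)."""
    Phi, singular_values, _ = np.linalg.svd(snapshots, full_matrices=False)
    return Phi[:, :N], singular_values

def solve_reduced(A, F, Phi):
    """Solve (Phi^T A Phi) alpha = Phi^T F and return alpha and the recovered vector Phi alpha."""
    alpha = np.linalg.solve(Phi.T @ A @ Phi, Phi.T @ F)
    return alpha, Phi @ alpha

rng = np.random.default_rng(0)
M, N = 50, 3
modes = rng.standard_normal((M, N))
snapshots = modes @ rng.standard_normal((N, 20))   # 20 snapshots lying in an N-dimensional subspace
Phi, singular_values = pod_basis(snapshots, N)

B = rng.standard_normal((M, M))
A = B @ B.T + M * np.eye(M)                        # symmetric positive definite toy matrix
U_true = modes @ np.array([1.0, -2.0, 0.5])        # a solution inside the snapshot subspace
F = A @ U_true
alpha, U_rom = solve_reduced(A, F, Phi)
```

Because `U_true` lies in the span of the snapshots, the N-by-N reduced system recovers it to round-off; for solutions outside that span the discarded singular values indicate how much is lost.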
2.1 Stabilized Finite Element Approximation of the Incompressible Navier-Stokes Equations
In this section we summarize the stabilized finite element formulation for the incompressible Navier-Stokes equations used in the rest of the chapter. Let us consider the transient incompressible Navier-Stokes equations, which consist of finding $u : \Omega \times (0, T) \to \mathbb{R}^d$ and $p : \Omega \times (0, T) \to \mathbb{R}$ such that:
$$\begin{aligned} \partial_t u - \nu \Delta u + u \cdot \nabla u + \nabla p &= f && \text{in } \Omega, \\ \nabla \cdot u &= 0 && \text{in } \Omega, \\ u &= \bar{u} && \text{on } \Gamma_D, \\ -p n + \nu n \cdot \nabla u &= 0 && \text{on } \Gamma_N, \end{aligned}$$
for $t > 0$, where $\partial_t u$ is the local time derivative of the velocity field. $\Omega \subset \mathbb{R}^d$ is a bounded domain, with $d = 2, 3$, $\nu$ is the viscosity, and $f$ the given source term. Appropriate initial conditions have to be appended to this problem.
Let now $V = H^1(\Omega)^d$, and $V_0 = \{v \in V \mid v = 0 \text{ on } \Gamma_D\}$. Let also $Q = L^2(\Omega)$ and $\mathcal{D}'(0, T; Q)$ be the distributions in time with values in $Q$. The variational problem consists of finding $[u, p] \in L^2(0, T; V) \times \mathcal{D}'(0, T; Q)$ such that:
$$(v, \partial_t u) + B([v, q], [u, p]) = \langle v, f \rangle \quad \forall [v, q] \in V_0 \times Q, \qquad (4)$$
with
$$u = \bar{u} \text{ on } \Gamma_D,$$
where
$$B([v, q], [u, p]) := \langle v, u \cdot \nabla u \rangle + \nu (\nabla v, \nabla u) - (p, \nabla \cdot v) + (q, \nabla \cdot u).$$
Here, $(\cdot, \cdot)$ stands for the $L^2(\Omega)$ inner product and $\langle \cdot, \cdot \rangle$ for the integral of the product of two functions, not necessarily in $L^2(\Omega)$. Let $\{K\}$ be a finite element partition of $\Omega$, from which we construct the finite element spaces $V_h \subset V$, $V_{h,0} \subset V_0$, $Q_h \subset Q$. The semilinear form $B$ suffers from the well-known stability issues due to the convective nature of the flow, and also requires compatibility between the velocity and pressure approximation spaces due to the classical LBB inf-sup condition. In order to deal with these stability issues, we use a stabilized finite element formulation [16], which is as follows: for each $t$, find $u_h(t) \in V_h$, $p_h(t) \in Q_h$ such that:
$$(v_h, \partial_t u_h) + B([v_h, q_h], [u_h, p_h]) + \sum_K \tau_K \left( u_h \cdot \nabla v_h + \nu \Delta v_h + \nabla q_h,\; r([u_h, p_h]) \right)_K = \langle v_h, f \rangle, \qquad (5)$$
for all $v_h \in V_{h,0}$, $q_h \in Q_h$. Initial conditions need to be appended to this problem.

In (5):
$$r([u_h, p_h]) = \partial_t u_h - \nu \Delta u_h + u_h \cdot \nabla u_h + \nabla p_h - f, \qquad (6)$$
is the residual of the momentum equation, $(\cdot, \cdot)_K$ denotes the $L^2$ product in element $K$ and $\tau_K$ is the stabilization parameter:
$$\tau_K = \left( c_1 \frac{\nu}{h^2} + c_2 \frac{|u_h|_K}{h} \right)^{-1},$$
where $|u_h|_K$ is the mean velocity modulus in element $K$, $h$ is the element size and $c_1$ and $c_2$ are stabilization constants.
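As a small worked example, the stabilization parameter can be evaluated per element as below. The function name `tau_K` and the default values $c_1 = 4$, $c_2 = 2$ are assumptions made for illustration (they are commonly quoted for linear elements); the text itself does not state the constants used.

```python
def tau_K(nu, u_norm, h, c1=4.0, c2=2.0):
    """Stabilization parameter: inverse of a viscous term c1*nu/h^2 plus a convective term c2*|u|/h.

    c1 and c2 are assumed illustrative defaults, not values taken from the chapter.
    """
    return 1.0 / (c1 * nu / h**2 + c2 * u_norm / h)

# Element of size h = 0.1 with viscosity 0.01 and mean velocity modulus 1
tau_convective = tau_K(nu=0.01, u_norm=1.0, h=0.1)   # 1 / (4 + 20)
tau_viscous = tau_K(nu=0.01, u_norm=0.0, h=0.1)      # 1 / 4
```

The convective term dominates in the first case, so the parameter scales roughly like $h / (c_2 |u_h|_K)$; with zero velocity it reduces to $h^2 / (c_1 \nu)$.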
Regarding the discretization in time, we consider implicit integration schemes. For the full-order system, only implicit time integration schemes can be used, because no time derivatives of the pressure appear in the equations. Taking this into account, we can proceed as follows: supposing that the velocity and pressure at time step $n$, $[u_h^n, p_h^n]$, are known, we may solve (5), for example, with $\partial_t u_h$ discretized using a backward difference scheme in time:
$$\partial_t u_h \approx \delta_t u_h^{n+1}, \qquad \delta_t u_h^{n+1} := \begin{cases} \dfrac{1}{\delta t} \left( u_h^{n+1} - u_h^n \right) & \text{1st order scheme} \\ \dfrac{1}{\delta t} \left( \tfrac{3}{2} u_h^{n+1} - 2 u_h^n + \tfrac{1}{2} u_h^{n-1} \right) & \text{2nd order scheme} \end{cases} \qquad (7)$$
where $\delta t$ is the time step size.
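The two backward difference operators of Eq. (7) can be checked on simple functions of time. The sketch below uses hypothetical function names and scalar values; the same formulas apply componentwise to nodal vectors.

```python
def bdf1(u_new, u_old, dt):
    """First order backward difference: (u^{n+1} - u^n) / dt."""
    return (u_new - u_old) / dt

def bdf2(u_new, u_old, u_older, dt):
    """Second order backward difference: (3/2 u^{n+1} - 2 u^n + 1/2 u^{n-1}) / dt."""
    return (1.5 * u_new - 2.0 * u_old + 0.5 * u_older) / dt

dt = 0.1
t_new, t_old, t_older = 1.0, 0.9, 0.8

def quadratic(t):
    return t**2          # exact time derivative at t = 1 is 2

d1 = bdf1(quadratic(t_new), quadratic(t_old), dt)
d2 = bdf2(quadratic(t_new), quadratic(t_old), quadratic(t_older), dt)
```

The second order formula differentiates quadratics exactly, while the first order one carries an error proportional to the time step (here 1.9 instead of 2).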
2.2 Explicit Reduced-Order Model
As explained in the previous sections, it is convenient to treat the reduced-order model with an explicit time integration scheme, because this leads to important computational gains when using hyper-reduced order reconstruction methods. However, we have also explained the need for an implicit time integration scheme for the full-order model of the incompressible Navier-Stokes equations, due to the presence of the pressure field. Here we summarize the strategy we use for building an explicit reduced-order model which is suitable for the incompressible Navier-Stokes equations [7].
Let us start by introducing the velocity and pressure reduced-order subspaces. $Q_\Phi \subset Q_h$ is the pressure subspace defined by the pressure part of the POD basis functions $\Phi$, and $p \in Q_\Phi$ is the reduced-order pressure field. $V_\Phi \subset V_h$ is the velocity subspace defined by the velocity part of the POD basis functions $\Phi$. For each time $t$, $u(t) \in V_\Phi$ is the reduced-order velocity. In order to develop the explicit reduced-order model where the pressure is treated in an explicit way, we take into account that:
• All reduced basis functions already fulfill the stabilized continuity equation. Since the reduced-order basis is built from weakly incompressible solution snapshots and the incompressibility constraint is linear, the reduced basis functions (and their linear combinations) also fulfill it.
• If the basis functions are taken to be joint velocity-pressure basis functions (that is, $\Phi$ contains the coefficients of functions in $V_\Phi \times Q_\Phi$), then the pressure at time step $n + 1$ is automatically recovered from the coefficients $\alpha^{n+1}$ and the reduced-order basis $\Phi$, even if all the terms involving the pressure are treated in an explicit way in the reduced-order formulation.
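The variational formulation for the first order in time reduced-order model that we propose is:
$$(v, \delta_t u^{n+1}) + (v, \hat{u}^n \cdot \nabla \hat{u}^n) + \nu (\nabla v, \nabla \hat{u}^n) - (\hat{p}^n, \nabla \cdot v) + \sum_K \tau_K \left( \hat{u}^n \cdot \nabla v + \nu \Delta v,\; \delta_t u^n - \nu \Delta \hat{u}^n + \hat{u}^n \cdot \nabla \hat{u}^n + \nabla \hat{p}^n - f^n \right)_K = \langle v, f^n \rangle, \qquad (8)$$
where the terms $\hat{u}^n$ and $\hat{p}^n$ are a second order approximation of the state at $n + 1$ (the velocity and the pressure at $n + 1$) given by:
$$\hat{u}^n = 2 u^n - u^{n-1}, \qquad \hat{p}^n = 2 p^n - p^{n-1}. \qquad (9)$$
In the case of the second order in time reduced-order model, we use the same variational formulation (8), but the terms $\hat{u}^n$ and $\hat{p}^n$ are now a third order approximation of the state at $n + 1$ given by:
$$\hat{u}^n = \tfrac{12}{5} u^n - \tfrac{9}{5} u^{n-1} + \tfrac{2}{5} u^{n-2}, \qquad \hat{p}^n = \tfrac{12}{5} p^n - \tfrac{9}{5} p^{n-1} + \tfrac{2}{5} p^{n-2}. \qquad (10)$$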
Note that for the first order explicit scheme we propose to use the second order extrapolation (9), and for the second order scheme the third order extrapolation (10). The key point of this formulation is that only the temporal derivative terms involve values of the reduced-order velocity or pressure at the new time step. This ensures that the resulting reduced-order matrix is linear. However, the reduced-order right-hand side still needs to be approximated. After solving the reduced-order system, the velocity and pressure fields at $n + 1$ can be recovered by multiplying the reduced-order basis by the obtained reduced-order components $\alpha^{n+1}$.
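The structure of one explicit step can be sketched on the reduced coefficients. This is a schematic illustration only: the function names, the callable `rhs` standing for the projected right-hand side, and the toy data are assumptions, and the coefficients of the second extrapolation are taken as printed in Eq. (10) with the signs assumed to alternate.

```python
import numpy as np

def extrapolate_2nd(a_n, a_nm1):
    """Eq. (9): 2 a^n - a^{n-1}."""
    return 2.0 * a_n - a_nm1

def extrapolate_3rd(a_n, a_nm1, a_nm2):
    """Eq. (10) with coefficients 12/5, -9/5, 2/5 (sign pattern assumed)."""
    return (12.0 * a_n - 9.0 * a_nm1 + 2.0 * a_nm2) / 5.0

def explicit_rom_step(mass_inv, rhs, a_n, a_nm1, dt):
    """First order step: M_r (a^{n+1} - a^n) / dt = rhs(a_hat), with a_hat from Eq. (9).

    mass_inv is the inverse of the reduced mass matrix, computed once and reused
    at every step; only rhs(a_hat) is rebuilt.
    """
    a_hat = extrapolate_2nd(a_n, a_nm1)
    return a_n + dt * (mass_inv @ rhs(a_hat))

# Toy reduced system with two coefficients and a linear decay as right-hand side
mass_inv = np.eye(2)
a_prev = np.array([1.0, 2.0])
a_curr = np.array([1.0, 2.0])
a_next = explicit_rom_step(mass_inv, lambda a: -a, a_curr, a_prev, dt=0.1)
```

Since all non-linear terms enter through `rhs`, the left-hand-side operator never changes, which is the property hyper-reduction exploits.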
Fig. 1 Comparison of the FOM (left) and ROM (right) velocities (top) and pressures (bottom) after
400 time steps of simulation
2.3 Numerical Example. Bidimensional Flow Past Two Cylinders
The first numerical example consists of the two-dimensional flow past two cylinders.
The computational domain is a 16 × 8 rectangle. The cylinders are centered at
coordinates (3, 3) and (6, 5), and both have diameter 1. The inflow velocity
is 1, which together with the density ρ = 1 and the viscosity ν = 0.01 results in a
Reynolds number Re = 100. The time step is set to δt = 0.1. The mesh is composed
of 7310 linear triangular elements. After running the full-order simulation and taking
the corresponding snapshots, the explicit reduced-order model is run. The number
of degrees of freedom for the ROM is only 10.
Figure 1 shows a comparison of the velocity and pressure fields for the full-order
and the explicit reduced-order model after 400 time steps of simulation. The
high-fidelity and the reduced-order fields are very similar. In Fig. 2 we compare the time
history and Fourier transform of the vertical velocity and the pressure at coordinates
(8.5, 4). The reduced-order model reproduces both the time history and the Fourier
transform of the vertical velocity and the pressure accurately. The CPU time for
running the full-order model is 53.24 s, while the explicit reduced-order model runs
in 19.78 s, a 63 % reduction in computational time.
3 Hyper-Reduction Approach
At this point, we already have an explicit reduced-order model in which all the
non-linear terms are in the right-hand-side vector and the reduced system matrix is linear
and does not change between time steps. However, computing the right-hand-side
vector at each time step is still expensive (O(M) operations), because
we need to recompute $F^{n+1}$ and then project it onto the reduced-order subspace by
calculating $\Phi^T F^{n+1}$. The approach we follow for reducing this computational cost
is to reconstruct the non-linear vector $F^{n+1}$ by sampling only some of the entries of
this vector and applying a least-squares minimization strategy. The method we follow
was first presented in [18], and a similar approach has been recently used in [13]
applied to an implicit reduced-order method for the incompressible Navier-Stokes
equations.
Fig. 2 Comparison of the FOM and ROM velocity and pressure time history at the point (8.5, 4)
(left) and their Fourier transform (right)
Let us consider a reduced order basis for the right-hand-side vectors F, $\Phi_F$,
obtained by means of a proper orthogonal decomposition of a set of snapshots for
F. $\Phi_F$ defines a low-dimensional subspace $\mathcal{F} \subset \mathbb{R}^M$, so that any right-hand-side
vector F can be approximated as:
\[
F \approx \Phi_F \alpha_F,
\]
where now $\alpha_F \in \mathbb{R}^N$ are the reduced-order coefficients for the reconstruction. Let us
also consider that we only know the nodal values for F at some sampling components
$F_{i(k)}$, $1 \le k \le n_s$, where $n_s$ is the number of sampling components of the vector and
i(k) denotes the kth sampling component. We now want to recover the coefficients
$\alpha_F$ of vector F in the reduced order basis $\Phi_F$.
In order to recover $\alpha_F$ we can solve the least-squares minimization problem:
\[
\alpha_F = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} \Big( \sum_{j=1}^{N} \Phi_{F,i(k)j}\, a_j - F_{i(k)} \Big)^2, \tag{11}
\]
where $\Phi_{F,i(k)j}$ denotes the basis vector j evaluated at the kth sampling component,
i(k).
The previous procedure provides the tools required to extrapolate the right-hand-
side vector arising from the finite element problem. The main advantage is that in
order to do so, only the nodal values at certain few sampling components are needed.
If $n_s$ is O(N), then the computational cost of rebuilding $F^{n+1}$ for solving each time
step is reduced to O(N), and the overall cost of the reduced-order model is O(N).
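The least-squares recovery in (11) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the basis is a synthetic random orthonormal matrix rather than a POD basis of actual right-hand-side snapshots, the sampling components are drawn at random, and the names `reconstruct_from_samples`, `Phi_F` and `sample_idx` are hypothetical.

```python
import numpy as np

def reconstruct_from_samples(Phi_F, sample_idx, F_samples):
    """Solve (11) using only the sampled rows; return coefficients and full vector."""
    coeffs, *_ = np.linalg.lstsq(Phi_F[sample_idx, :], F_samples, rcond=None)
    return coeffs, Phi_F @ coeffs

rng = np.random.default_rng(0)
M, N, n_s = 200, 5, 15                                   # full size, basis size, samples
Phi_F, _ = np.linalg.qr(rng.standard_normal((M, N)))    # synthetic orthonormal basis
alpha_true = rng.standard_normal(N)
F = Phi_F @ alpha_true                                   # a vector lying in the subspace

# Only n_s entries of F are evaluated; the rest is rebuilt from the basis.
sample_idx = rng.choice(M, size=n_s, replace=False)
alpha_rec, F_rec = reconstruct_from_samples(Phi_F, sample_idx, F[sample_idx])
print(np.linalg.norm(F_rec - F))
```

Because the test vector lies exactly in the span of the basis, the recovery is exact up to round-off; for a real right-hand side the quality depends on the sampling components, which is the subject of the next section.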
3.1 A Discrete Version of the Best Points Interpolation Method (DBPIM)
When using hyper-reduced order models the quality of the recovered right-hand-
side vector highly depends on the selected sampling components. Several strategies
have been developed for choosing these sampling components [5, 8, 17]. Amongst
the most extensively used are the Discrete Empirical Interpolation Method (DEIM)
[15], where the sampling components are selected iteratively by imposing that the
error growth at each iteration is limited and the Best Points Interpolation Method
(BPIM) approach presented in [31], where the sampling points are chosen so that the
distance between the projection of the right-hand-side snapshots onto the reduced
basis subspace and the recovered right-hand-side is minimized.
The strategy we use, presented in [7], is a hybrid between the BPIM and the
DEIM. We call it a Discrete version of the Best Points Interpolation Method (DBPIM).
Similarly to the BPIM, the method consists of minimizing the error between the
recovered right-hand-side vector snapshots and the actual snapshots. However, in
the strategy we use we force the sampling coordinates to coincide with nodal points
of the finite element mesh. In addition, once a component associated with a node of the
finite element mesh is selected, all the degrees of freedom associated with that node are
included in the sampling selection. Moreover, due to the lack of smoothness of the
vectors which are being approximated, we do not use a Marquardt-related strategy
to advance towards the optimal set of sampling nodes. Instead, we use an algorithm
which advances from one set of sampling nodes to the next by evaluating the
error of the recovered snapshots at the neighbouring points in the finite element mesh,
replacing a sampling node with its neighbour if the error diminishes. The DBPIM
algorithm is detailed in Algorithm 1 for a scalar unknown (where each sampling
node is associated with a single sampling component).
The first step of the DBPIM algorithm consists of finding the projection $\Pi_F$ of
the snapshots onto the reduced order subspace defined by the reduced order basis,
$\Phi_F$. For each snapshot $F^s$, this yields the coefficients $\alpha_F^s$. In the second step we choose
an initial set of sampling nodes, which can be done by using the DEIM method.
If the DEIM method is used, it will give us a set of sampling components. For a
scalar problem, each component corresponds to a node of the finite element mesh.
If the unknown is a vector field, then the nodes associated to the DEIM sampling
components are selected as initial sampling nodes, and the number of sampling nodes
is equal to the number of reduced basis functions. Otherwise, we always choose the
number of sampling nodes to be equal to the number of basis functions times a
(usually low) integer. After defining the initial set of sampling nodes, the degree(s)
of freedom associated to these sampling nodes become sampling components. For
this initial set of sampling nodes, we recover the approximated coefficients $\alpha_F^{s,aprox}$
by means of the previously described least-squares strategy. The error associated
to a set of sampling components $i \in \mathbb{N}^{n_s}$, whose k-th component is indicated as
$i(k) \in \{1, \ldots, M\}$, is obtained by computing the difference between the exact and the
approximated $\alpha_F$ coefficients:
\[
e(i) = \sum_{s=1}^{N_{snapshots}} \| \alpha_F^{s,aprox}(i) - \alpha_F^s \| \tag{12}
\]
Algorithm 1 Discrete best points interpolation method

Compute the optimal basis coefficients for the snapshot set:
  $\Phi_F \alpha_F^s = \Pi_F(F^s)$, $s = 1, \ldots, N_{snapshots}$
Choose an initial set of sampling components $i \in \mathbb{N}^{n_s} \mid 1 \le i(k) \le M$, $k = 1, \ldots, n_s$.
Solve: $\alpha_F^{s,aprox}(i) = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} ( \sum_{j=1}^{N} \Phi_{F,i(k)j} a_j - F^s_{i(k)} )^2$, $s = 1, \ldots, N_{snapshots}$
$e(i) = \sum_{s=1}^{N_{snapshots}} \| \alpha_F^{s,aprox}(i) - \alpha_F^s \|$
while set of sampling points has changed do
  for m = 1 : $n_s$ do
    for l = 1 : $N_{neigh}(i(m))$ do
      if the lth neighbour of i(m) has not been previously tested then
        Temporarily replace sampling node i(m) by its lth neighbour
        Solve: $\alpha_F^{s,aprox}(i) = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} ( \sum_{j=1}^{N} \Phi_{F,i(k)j} a_j - F^s_{i(k)} )^2$, $s = 1, \ldots, N_{snapshots}$
        $e_{temp}(i) = \sum_{s=1}^{N_{snapshots}} \| \alpha_F^{s,aprox}(i) - \alpha_F^s \|$
        if $e_{temp} < e$ then
          $e = e_{temp}$
          Permanently replace sampling node i(m) by its lth neighbour
          Restart l loop
        end if
      end if
    end for
  end for
end while
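where
\[
\alpha_F^{s,aprox}(i) = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} \Big( \sum_{j=1}^{N} \Phi_{F,i(k)j}\, a_j - F^s_{i(k)} \Big)^2, \quad s = 1, \ldots, N_{snapshots} \tag{13}
\]
For this definition of the error, we can define the optimal set of sampling
components as:
\[
b = \arg\min_{i \in \mathbb{N}^{n_s} \,\mid\, 1 \le i(k) \le M,\; k = 1, \ldots, n_s} e(i) \tag{14}
\]
where e(i) is given in (12).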

In order to obtain the set of sampling points to be used in the reduced order
simulation we proceed as follows: for each sampling node of the finite element
mesh, we loop over its neighbours in the computational mesh and we temporarily
replace the sampling node by each of them. If the error of the new set is lower than
the original error, the sampling node is permanently replaced by its neighbour. This
procedure is repeated while the set of sampling points changes due to the algorithm
(while loop in Algorithm 1).
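The neighbour search described above can be sketched as follows for a scalar unknown. This is a simplified illustration, not the authors' implementation: it assumes a one-dimensional mesh in which the neighbours of node i are i - 1 and i + 1, it starts from equally spaced sampling nodes instead of a DEIM selection, and it omits the bookkeeping of previously tested neighbours. All function and variable names are hypothetical.

```python
import numpy as np

def sampling_error(Phi_F, snapshots, alpha_exact, idx):
    """Error (12): gappy least-squares coefficients versus exact ones, summed over snapshots."""
    alpha_aprox, *_ = np.linalg.lstsq(Phi_F[idx, :], snapshots[idx, :], rcond=None)
    return float(np.linalg.norm(alpha_aprox - alpha_exact, axis=0).sum())

def dbpim(Phi_F, snapshots, initial_idx, neighbours):
    """Greedy neighbour search over sampling nodes (scalar unknown)."""
    alpha_exact = Phi_F.T @ snapshots      # projection; assumes orthonormal columns
    idx = list(initial_idx)
    err = sampling_error(Phi_F, snapshots, alpha_exact, idx)
    changed = True
    while changed:                         # repeat while the sampling set changes
        changed = False
        for m in range(len(idx)):
            improved = True
            while improved:                # restart the neighbour loop after each swap
                improved = False
                for cand in neighbours(idx[m]):
                    if cand in idx:        # keep the sampling nodes distinct
                        continue
                    trial = idx.copy()
                    trial[m] = cand
                    trial_err = sampling_error(Phi_F, snapshots, alpha_exact, trial)
                    if trial_err < err:    # accept the swap only if the error drops
                        idx, err = trial, trial_err
                        changed = improved = True
                        break
    return idx, err

# Synthetic data: Gaussian bumps at different positions on a line of M nodes.
M, N, n_s = 60, 4, 8
x = np.linspace(0.0, 1.0, M)
centres = np.linspace(0.2, 0.8, 20)
snapshots = np.exp(-((x[:, None] - centres[None, :]) / 0.1) ** 2)
Phi_F = np.linalg.svd(snapshots, full_matrices=False)[0][:, :N]

def line_neighbours(i):
    """Neighbours of node i on the 1D mesh (an assumption of this sketch)."""
    return [j for j in (i - 1, i + 1) if 0 <= j < M]

idx0 = [int(i) for i in np.linspace(0, M - 1, n_s)]
err0 = sampling_error(Phi_F, snapshots, Phi_F.T @ snapshots, idx0)
idx, err = dbpim(Phi_F, snapshots, idx0, line_neighbours)
print(err0, err, sorted(idx))
```

Since a swap is accepted only when the error strictly decreases, the search cannot revisit a sampling set and therefore terminates, although only at a local minimum of e(i).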
3.2 Numerical Example. Two-dimensional Low Reynolds Flow Past a NACA Airfoil
In this section we simulate the incompressible flow around a NACA 0012 airfoil
profile [24]. The computational domain is a 32 × 16 rectangle, with the trailing edge
of the 8 unit long airfoil placed at (16, 8). The horizontal inflow velocity is set to 1
at x = 0, and slip boundary conditions are applied at the upper and lower walls of
the computational domain. Velocity is prescribed to 0 at the airfoil surface.
The viscosity has been set to ν = 0.001, which yields a Reynolds number Re
= 1000 based on the height of the airfoil. The time step has been set to δt = 0.2.
In this numerical example, the CFL number associated to the finite element mesh
was CFL ≈ 62. A mesh of 29945 linear elements has been used. The mesh is refined
around the airfoil surface in order to better capture the solution in the region
surrounding the boundary layer. The angle of attack has been set to α = 0.2, and a
second order backward differences scheme has been used for the time integration.
100 velocity-pressure snapshots have been taken and the 10 first reduced basis
functions have been kept for the reduced-order model. For the hyper-reduced order
model, 100 additional snapshots for the right-hand-side have been taken and the cor-
responding 12 first reduced basis functions have been kept. The number of sampling
nodes is 36.
Fig. 3 Velocity (top) and pressure (bottom) contours at Re = 1000, α = 0.2 after 200 time steps.
Full-order (left) and Hyper-Reduced Order Model (right)
Fig. 4 Velocity (left) and pressure (right) time history at a control point at the wake of the airfoil,
Re = 1000, α = 0.2, second order time integration. Curves: FOM, ROM and HROM
Figure 3 compares the velocity and pressure fields after 200 time steps for the
full-order and the hyper-reduced model. The reduced-order model almost exactly
matches the results from the full order model.
Regarding the computational cost, the full order model takes 148.9 s to run and the
reduced-order model takes 49.6 s (33 %). Finally, the hyper-reduced order model, in which
the computational cost depends only on the size of the reduced-order model, takes
only 0.71 s (0.45 %) to run.
Figures 4 and 5 show the time history and spectra for the velocity and pressure at
(8, 0.5). Despite the complex flow and the high number of oscillation modes present
in the solution, the reduced-order models manage to correctly capture the amplitudes
and frequencies of the main modes.
Fig. 5 Velocity (left) and pressure (right) spectra at a control point at the wake of airfoil, Re = 1000,
α = 0.2, second order time integration. Curves: FOM, ROM and HROM
4 A Domain Decomposition Approach for POD Reduced-Order Models
Despite the important reduction in computational cost provided by reduced-order
models, one of their major drawbacks is the lack of robustness with respect to changes
in the parameters which characterize the numerical simulation. This lack of robust-
ness causes the reduced model to be valid only in a small parameter region close to the
parameter values for which the reduced model was built [3], requiring the snapshot
collection and the reduced model to be updated when, for instance, an optimization
process leads to a parameter configuration which becomes too separated from the
starting parameter set.
In this section we present a strategy which improves the behavior of
non-linear reduced models (where hyper-reduction is used for the reconstruction of
the reduced-order equations) in parameter configurations which are not present in
the snapshot set from which the reduced model is built [6]. It is based on introducing
a domain decomposition approach to the model reduction, partitioning the
computational domain into several regions, each of which is handled with its own
localized POD basis. This gives us the possibility of treating each of the subdomains
with a different degree of approximation, or even, as we will see, solving the
full-order equations in some of the subdomains.
4.1 Domain Decomposition POD Model
Let us consider the splitting of the computational domain Ω into two subdomains
$\Omega_k$, k = 1, 2, and the associated local unknowns $U^k \in \mathbb{R}^{M_k}$, $M = M_1 + M_2$. If
the domain decomposition is applied to the equations arising from a finite element
problem, the partition into subdomains is done by assigning each of the nodes (and
nodal unknowns) of the finite element mesh to a subdomain. This means that there
are no interface nodes; instead we define interface elements as those elements which
own nodes from different subdomains. Let us define a local reduced order basis $\hat{\Phi}^k$
consisting of the reduced basis functions $\hat{\phi}_i^k \in \mathbb{R}^{M_k}$, $i = 1, \ldots, N_k$, to approximate
$U^k$ in each subdomain. Note that the number of basis functions in each subdomain
is not necessarily the same, although we consider it to be equal from now
on for simplicity. The possible ways to construct this basis are discussed later. This
local basis can be extended to the global domain by defining $\phi_i^k \in \mathbb{R}^M$:
\[
\phi_i^1 := \begin{bmatrix} \hat{\phi}_i^1 \\ 0 \end{bmatrix}, \qquad
\phi_i^2 := \begin{bmatrix} 0 \\ \hat{\phi}_i^2 \end{bmatrix}, \tag{15}
\]
where the null terms correspond to components of the global system which lie outside
$\Omega_k$. Taking this into account, the unknown U is approximated as:
\[
U \approx \sum_{i=1}^{N} (\phi_i^1 \alpha_i^1 + \phi_i^2 \alpha_i^2) = (\Phi^1 \alpha^1 + \Phi^2 \alpha^2), \quad
\Phi^k \in \mathbb{R}^{M \times N}, \quad \alpha^k \in \mathbb{R}^N, \quad k = 1, 2, \tag{16}
\]
where $\alpha^k$ are the solution coefficients for subdomain $\Omega_k$.

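Let $A \in \mathbb{R}^{M \times M}$ be the matrix of the system whose solution is $U \in \mathbb{R}^M$, and
$F \in \mathbb{R}^M$ the RHS vector. They can be partitioned into the components associated to
each subdomain $\Omega_k$, k = 1, 2, so that
\[
A = \begin{bmatrix} A|_{11} & A|_{12} \\ A|_{21} & A|_{22} \end{bmatrix}, \quad A|_{kl} \in \mathbb{R}^{M_k \times M_l}, \qquad
F = \begin{bmatrix} F|_1 \\ F|_2 \end{bmatrix}, \quad F|_k \in \mathbb{R}^{M_k}.
\]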
The monolithic approach for the domain decomposition ROM is obtained by
introducing the union of the extensions of the local bases to the global domain as the
global reduced order basis:
\[
(\Phi^1)^T A (\Phi^1 \alpha^1 + \Phi^2 \alpha^2) = (\Phi^1)^T F, \qquad
(\Phi^2)^T A (\Phi^1 \alpha^1 + \Phi^2 \alpha^2) = (\Phi^2)^T F. \tag{17}
\]
Defining $a^{kl} = (\Phi^k)^T A \Phi^l \in \mathbb{R}^{N \times N}$ and $f^k = (\Phi^k)^T F \in \mathbb{R}^N$, we may write
this system as
\[
a^{11} \alpha^1 + a^{12} \alpha^2 = f^1, \tag{18}
\]
\[
a^{21} \alpha^1 + a^{22} \alpha^2 = f^2. \tag{19}
\]
If we also consider the decomposition of A and F, the final reduced order system
can be written in terms of the local bases $\hat{\Phi}^k$:
\[
(\hat{\Phi}^1)^T ( A|_{11} \hat{\Phi}^1 \alpha^1 + A|_{12} \hat{\Phi}^2 \alpha^2 ) = (\hat{\Phi}^1)^T F|_1, \qquad
(\hat{\Phi}^2)^T ( A|_{21} \hat{\Phi}^1 \alpha^1 + A|_{22} \hat{\Phi}^2 \alpha^2 ) = (\hat{\Phi}^2)^T F|_2.
\]
Fig. 6 Local basis functions for the domain decomposition approach. The green function is the
sum of the local basis function of the left subdomain (blue) and the local basis function of the
right subdomain (red)
The off-diagonal block matrices correspond to the coupling terms and are null
except for the contribution of the unknowns located at the domain interfaces. It
can be observed that the cost of computing the ROM system is not larger than in
the monolithic approach. However, the size of the reduced system is larger
(dimension 2N).
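The equivalence between the block form written with the local bases and the system (17) written with the extended bases can be illustrated numerically. The sketch below assumes a dense random matrix and random local bases purely for checking the algebra; in a finite element setting the off-diagonal blocks would be sparse. The function name `dd_reduced_system` and all variable names are hypothetical.

```python
import numpy as np

def dd_reduced_system(A, F, Phi1_loc, Phi2_loc):
    """Assemble the 2N x 2N reduced system block by block from the local bases."""
    M1 = Phi1_loc.shape[0]
    parts = [slice(0, M1), slice(M1, A.shape[0])]    # unknowns of subdomains 1 and 2
    bases = [Phi1_loc, Phi2_loc]
    a = np.block([[bases[k].T @ A[parts[k], parts[l]] @ bases[l] for l in range(2)]
                  for k in range(2)])
    f = np.concatenate([bases[k].T @ F[parts[k]] for k in range(2)])
    return a, f

rng = np.random.default_rng(0)
M1, M2, N = 6, 5, 2
A = rng.standard_normal((M1 + M2, M1 + M2))
F = rng.standard_normal(M1 + M2)
Phi1_loc = rng.standard_normal((M1, N))
Phi2_loc = rng.standard_normal((M2, N))

a, f = dd_reduced_system(A, F, Phi1_loc, Phi2_loc)

# Same system obtained through the extensions (15) of the local bases.
Phi = np.zeros((M1 + M2, 2 * N))
Phi[:M1, :N] = Phi1_loc
Phi[M1:, N:] = Phi2_loc
a_global, f_global = Phi.T @ A @ Phi, Phi.T @ F
print(np.allclose(a, a_global), np.allclose(f, f_global))
```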
An important point is that each algebraic local basis function $\phi_i^k$ arises from a
function defined in space. This spatial function is a linear combination of the finite
element shape functions of the nodes of subdomain $\Omega_k$. As a consequence, each
of the components in $\phi_i^k$ corresponds to a nodal value of the spatial field to be
represented on the finite element mesh. This is illustrated in Fig. 6, where examples
of local basis functions for a one-dimensional problem and linear finite elements
are shown. Let us also emphasize that, if the original finite element shape functions
are continuous, any local (and global) basis function will also be continuous, as a
consequence of the definition of the extension of the basis functions to the global
domain (15). This will also hold for the combination of local basis functions, even
if these belong to different subdomains. In Fig. 6 the blue basis function belongs to
the left subdomain, the red basis function belongs to the right subdomain. The green
line represents the addition of the blue and the red basis functions. Since both of the
original functions are continuous, the green function is also continuous. Note also
that there is an overlapping region where both the left and right basis functions are
non-zero.
4.2 Local POD (L-POD)
The strategy for building the local POD basis consists of performing a POD for the
part of the snapshots corresponding to each of the subdomains. The snapshots are
first partitioned according to the domain decomposition strategy and the local basis
$\Phi^k$ is obtained from these partitioned snapshots. The global basis is again defined
as $\Phi = [\Phi^1, \Phi^2]$. Note that the number of local basis functions in each subdomain
does not necessarily coincide, $N_1 \neq N_2$, $N = N_1 + N_2$.
The main features of these domain-decomposition local POD bases are the
following:
• Each local basis can be ensured to be orthonormal at the algebraic level. By
  construction, each of the basis functions which make up the local POD has unit
  norm and is orthogonal to all the other basis functions in its subdomain at the
  algebraic level. Moreover, due to the domain decomposition approach, the projection
  of a local basis of a given subdomain onto the space spanned by the basis functions
  of any other subdomain is also zero. This ensures that if we consider the POD
  decomposition globally, the union of the local bases is also an orthonormal basis.
• The computation of the singular value decomposition of the local snapshots for
  each subdomain requires less memory than the computation of the singular value
  decomposition of the global snapshots.
• If we are using hyper-reduced models which require additional POD bases
  for reconstructing the system matrix and right-hand side, we can proceed in the same
  way.
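The construction of the local bases and the orthonormality of their union can be sketched as follows. This is an illustration under stated assumptions: the snapshots are random, the partition is a simple split of the node numbering without overlap, and the function name `local_pod_bases` is hypothetical.

```python
import numpy as np

def local_pod_bases(snapshots, node_sets, n_modes):
    """POD of the snapshot rows of each subdomain, extended by zeros to the global domain."""
    M = snapshots.shape[0]
    extended = []
    for nodes, n in zip(node_sets, n_modes):
        # Left singular vectors of the partitioned snapshots are the local POD modes.
        U, _, _ = np.linalg.svd(snapshots[nodes, :], full_matrices=False)
        Phi_k = np.zeros((M, n))
        Phi_k[nodes, :] = U[:, :n]
        extended.append(Phi_k)
    return np.hstack(extended)           # union of the extended local bases

rng = np.random.default_rng(0)
snapshots = rng.standard_normal((40, 12))                # 40 unknowns, 12 snapshots
node_sets = [np.arange(0, 25), np.arange(25, 40)]        # non-overlapping partition
Phi = local_pod_bases(snapshots, node_sets, n_modes=[4, 3])

gram = Phi.T @ Phi                       # identity if the union is orthonormal
print(np.allclose(gram, np.eye(Phi.shape[1])))
```

Each singular value decomposition acts on a matrix with fewer rows than the global snapshot matrix, which is where the memory saving mentioned in the list comes from.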
Once the localized reduced order bases have been defined, the monolithic domain
decomposition reduced order model is obtained by using as reduced basis the union
of the local reduced bases. The fact that the basis functions are local makes the com-
putational cost diminish with respect to the global approach with the same number
of basis functions, because the operations can be done at the local level. However,
the number of functions is usually larger in the domain decomposition approach,
because a sufficient number of components needs to be assigned to the reduced basis
of each subdomain in order to properly represent the solution in that subdomain.
4.3 Stabilization Through Overlapping and Penalty Terms
The previous domain decomposition strategy for reduced-order models, despite its
simplicity, suffers from unstable behavior when it is used in a straightforward manner
in the explicit reduced-order model for the stabilized finite element approximation
of the incompressible Navier-Stokes equations described in the previous sections.
These instabilities can be easily explained taking into account that an explicit time
marching scheme is equivalent in this case to an explicit iteration-by-subdomain
strategy, which is known to have convergence and stability issues. This is the reason
why we propose a domain interface stabilization term, which is obtained by allowing
some overlapping between subdomains and enforcing the equality of the unknown
values at this overlapping region.
As in classical iteration-by-subdomain strategies, the overlapping region $\Omega_O$ is
the part of Ω which belongs to both $\Omega_1$ and $\Omega_2$. In our approach, in which the
partitioning is obtained by assigning the nodes of the finite element mesh to $\Omega_1$ and
$\Omega_2$, overlapping is achieved by allowing some nodes close to the interface to belong
to both $\Omega_1$ and $\Omega_2$. The local reduced bases are computed by performing the POD
of the restriction of the snapshots to $\Omega_k$, but the obtained basis functions need to be
corrected. Suppose that the original overlapping local POD bases $\Phi_0^1 \in \mathbb{R}^{M \times N_1}$ and
$\Phi_0^2 \in \mathbb{R}^{M \times N_2}$ are:
\[
\Phi_0^1 = \begin{bmatrix} \hat{\Phi}^1 \\ \Phi_O^1 \\ 0 \end{bmatrix}, \qquad
\Phi_0^2 = \begin{bmatrix} 0 \\ \Phi_O^2 \\ \hat{\Phi}^2 \end{bmatrix}, \tag{20}
\]
where now
\[
\hat{\Phi}^k \in \mathbb{R}^{M_k \times N_k}, \tag{21}
\]
is the restriction of the local basis functions in $\Omega_k$ to the part of the subdomain
without overlapping ($M_k$ components), and
\[
\Phi_O^k \in \mathbb{R}^{M_O \times N_k}, \tag{22}
\]
corresponds to the restriction of $\Phi_0^k$ to the overlapping domain ($M_O$ components).
Note that $M_O := M_O^1 = M_O^2$, and now $M = M_1 + M_2 + M_O$.
In this case the corrected bases are:
\[
\Phi^1 = \begin{bmatrix} \hat{\Phi}^1 \\ \beta \Phi_O^1 \\ 0 \end{bmatrix}, \qquad
\Phi^2 = \begin{bmatrix} 0 \\ (1 - \beta) \Phi_O^2 \\ \hat{\Phi}^2 \end{bmatrix}, \tag{23}
\]
where $\beta \in [0, 1]$ is a weighting parameter. Note that the limits β = 0 and β = 1
correspond to the non-overlapping case. If several subdomains overlap in a certain
region, then each subdomain is assigned a weighting parameter $\beta_k$ and we must
ensure that $\sum_k \beta_k = 1$. The motivation for this correction is the requirement that
the resulting global reduced basis (obtained as the union of the local bases for each
subdomain) is capable of representing the global snapshot set if N is equal to the
number of snapshots. This is shown in Fig. 7, where some illustrative basis functions
for a one-dimensional problem are depicted.
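Fig. 7 Overlapping local basis functions. β = 0.5. The overlapping nodes are depicted in gray.
In this particular case the value of the basis functions at the overlapping nodes coincides,
which is not necessarily the case for L-POD
Supposing that the problem defined in (1) allows us to do so, the stabilization
penalty term imposes that the solution at the overlapping region $\Omega_O$ recovered
from $\Phi^1$ (prior to the introduction of the weighting parameter β) is equal to
the solution recovered from $\Phi^2$:
\[
\Phi_O^1 \alpha^1 = U_O^1 = U_O^2 = \Phi_O^2 \alpha^2 \in \mathbb{R}^{M_O}, \tag{24}
\]
where now we take $\alpha^k \in \mathbb{R}^{N_k}$ as the ROM degrees of freedom for each subdomain.
This condition can be equivalently written as:
\[
(\bar{\Phi}^1)^T \bar{\Phi}^1 \alpha^1 - (\bar{\Phi}^1)^T \bar{\Phi}^2 \alpha^2 = 0, \qquad
(\bar{\Phi}^2)^T \bar{\Phi}^1 \alpha^1 - (\bar{\Phi}^2)^T \bar{\Phi}^2 \alpha^2 = 0, \tag{25}
\]
where
\[
\bar{\Phi}^k = \begin{bmatrix} 0 \\ \Phi_O^k \\ 0 \end{bmatrix}. \tag{26}
\]
Introducing (25) as a penalized constraint in the ROM system we get:
\[
a^{11} \alpha^1 + a^{12} \alpha^2 + \frac{1}{\varepsilon} (M^{11} \alpha^1 - M^{12} \alpha^2) = f^1, \tag{27}
\]
\[
a^{21} \alpha^1 + a^{22} \alpha^2 + \frac{1}{\varepsilon} (M^{21} \alpha^1 - M^{22} \alpha^2) = f^2, \tag{28}
\]
where
\[
M^{kl} = (\bar{\Phi}^k)^T \bar{\Phi}^l \in \mathbb{R}^{N_k \times N_l}, \tag{29}
\]
and the definition of $\Phi^k$ for building the $a^{kl}$ matrices is taken as in (23).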

An important property of the block diagonal penalty matrices M kk is that they

can only be guaranteed be full-rank matrices if = . However, this stabiliza-
tion strategy shows good results in the numerical examples even if = . The
introduction of the M matrices to the reduced order formulation allows one to obtain
a stable solution in the practical cases. The stabilization parameter ε is chosen so
that, on the one hand, the penalty terms are sufficiently large to provide the desired
stabilization effects, and on the other, the norm of $\frac{1}{\varepsilon} M$ is proportional to the norm of
A. In this way we ensure that the resulting system does not become ill-conditioned.
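The role of the weighting in the overlapping region, and the form of the penalty matrices, can be illustrated with a small numerical sketch. It assumes random snapshots on a one-dimensional numbering of nodes, two subdomains sharing six nodes, a weighting parameter named `beta` set to 0.5, and as many local modes as snapshots; every name below is hypothetical. Under these assumptions the weighted bases reproduce the global snapshots exactly, and the penalty term vanishes for coefficients that agree on the overlap.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n_snap, beta = 30, 4, 0.5
S = rng.standard_normal((M, n_snap))     # synthetic global snapshots
dom1 = np.arange(0, 18)                  # nodes of subdomain 1 (overlap included)
dom2 = np.arange(12, 30)                 # nodes of subdomain 2 (overlap included)
overlap = np.arange(12, 18)              # nodes shared by both subdomains

def extended_local_pod(S, nodes, n_modes):
    """Local POD of the snapshot restriction, extended by zeros to all M nodes."""
    U, _, _ = np.linalg.svd(S[nodes, :], full_matrices=False)
    Phi0 = np.zeros((S.shape[0], n_modes))
    Phi0[nodes, :] = U[:, :n_modes]
    return Phi0

Phi0_1 = extended_local_pod(S, dom1, n_snap)   # original overlapping bases
Phi0_2 = extended_local_pod(S, dom2, n_snap)
alpha1 = Phi0_1.T @ S                          # local coefficients of each snapshot
alpha2 = Phi0_2.T @ S

# Corrected bases: weight the overlap rows by beta and (1 - beta).
Phi1, Phi2 = Phi0_1.copy(), Phi0_2.copy()
Phi1[overlap, :] *= beta
Phi2[overlap, :] *= 1.0 - beta
S_rec = Phi1 @ alpha1 + Phi2 @ alpha2          # union of bases applied to coefficients

# Penalty matrices built from the overlap rows only (unweighted).
Phibar1 = np.zeros_like(Phi0_1)
Phibar2 = np.zeros_like(Phi0_2)
Phibar1[overlap, :] = Phi0_1[overlap, :]
Phibar2[overlap, :] = Phi0_2[overlap, :]
M11, M12 = Phibar1.T @ Phibar1, Phibar1.T @ Phibar2
penalty_residual = M11 @ alpha1 - M12 @ alpha2
print(np.linalg.norm(S_rec - S), np.linalg.norm(penalty_residual))
```

Without the weighting, the overlap rows would be counted twice when the two local reconstructions are added, which is why the uncorrected bases cannot represent the snapshot set.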
4.4 Full-Order / Reduced-Order Domain Decomposition (FOM-ROM)
Another possibility is the use of a hybrid Full-Order / Reduced-Order (FOM-ROM)
approach. This is convenient if a high fidelity model is required in a certain region
of the domain, or if the conditions in a certain region strongly depart from the
conditions at which the snapshots for building the POD bases were taken. In these
cases one can choose to solve the FOM problem in one of the subdomains, while
keeping the cheaper ROM approach in the less critical subdomains. Extending the
described partitioned ROM strategy to a hybrid FOM-ROM domain decomposition
method is straightforward: the FOM-ROM is obtained by taking as local basis for
the FOM subdomain $\Omega_F$ the nodal shape functions of the finite element space for
the unknown. In the ROM subdomain $\Omega_R$ a local reduced basis $\Phi^R$ needs to be built.
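The hybrid FOM-ROM system without overlapping is:
\[
\begin{bmatrix} A|_{FF} & A|_{FR} \Phi^R \\ (\Phi^R)^T A|_{RF} & (\Phi^R)^T A|_{RR} \Phi^R \end{bmatrix}
\begin{bmatrix} U_F \\ \alpha^R \end{bmatrix} =
\begin{bmatrix} F|_F \\ (\Phi^R)^T F|_R \end{bmatrix}. \tag{30}
\]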
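The assembly of the hybrid system (30) can be sketched as follows. The example assumes a symmetric positive definite test matrix, a random orthonormal reduced basis for the ROM subdomain, and an exact solution whose ROM part lies in the span of that basis, so that the hybrid solve should recover it. The function name `fom_rom_system` and the variable names are hypothetical.

```python
import numpy as np

def fom_rom_system(A, F, Phi_R, M_F):
    """Assemble (30): full-order rows for the first M_F unknowns, reduced rows for the rest."""
    A_FF, A_FR = A[:M_F, :M_F], A[:M_F, M_F:]
    A_RF, A_RR = A[M_F:, :M_F], A[M_F:, M_F:]
    K = np.block([[A_FF, A_FR @ Phi_R],
                  [Phi_R.T @ A_RF, Phi_R.T @ A_RR @ Phi_R]])
    rhs = np.concatenate([F[:M_F], Phi_R.T @ F[M_F:]])
    return K, rhs

rng = np.random.default_rng(0)
M_F, M_R, N_R = 8, 20, 3
B = rng.standard_normal((M_F + M_R, M_F + M_R))
A = B @ B.T + (M_F + M_R) * np.eye(M_F + M_R)            # SPD test matrix
Phi_R, _ = np.linalg.qr(rng.standard_normal((M_R, N_R)))  # synthetic ROM basis

# Manufactured solution: free in the FOM region, in span(Phi_R) in the ROM region.
U_F_true = rng.standard_normal(M_F)
alpha_true = rng.standard_normal(N_R)
F = A @ np.concatenate([U_F_true, Phi_R @ alpha_true])

K, rhs = fom_rom_system(A, F, Phi_R, M_F)
sol = np.linalg.solve(K, rhs)
U_F, alpha_R = sol[:M_F], sol[M_F:]
print(K.shape, np.allclose(U_F, U_F_true), np.allclose(alpha_R, alpha_true))
```

The hybrid system has M_F + N_R unknowns instead of M_F + M_R, so the saving grows with the size of the ROM subdomain.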
Let us remark that the time stepping strategies need not be the same for the
full order and the reduced order equations. For instance, if the explicit reduced order
model described in the previous sections is used for the incompressible Navier-Stokes
equations, the A matrix and the F RHS vector for the reduced order equations are
taken from the explicit model, while the equations arising from the implicit time
stepping are kept for the full order equations:
\[
\begin{bmatrix} A^{imp}|_{FF} & A^{imp}|_{FR} \Phi^R \\ (\Phi^R)^T A^{exp}|_{RF} & (\Phi^R)^T A^{exp}|_{RR} \Phi^R \end{bmatrix}
\begin{bmatrix} U_F \\ \alpha^R \end{bmatrix} =
\begin{bmatrix} F^{imp}|_F \\ (\Phi^R)^T F^{exp}|_R \end{bmatrix}. \tag{31}
\]
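If a Petrov-Galerkin projection is used, this can also be introduced in the ROM
equations. For instance, the FOM-ROM system for the Petrov-Galerkin projection
described in [11, 13] would result in the following system:
\[
\begin{bmatrix} A|_{FF} & A|_{FR} \Phi^R \\ A^{PG}_{RF} & A^{PG}_{RR} \Phi^R \end{bmatrix}
\begin{bmatrix} U_F \\ \alpha^R \end{bmatrix} =
\begin{bmatrix} F|_F \\ F^{PG}_R \end{bmatrix}, \tag{32}
\]
where
\[
A^{PG}_{RF} = (\Phi^R)^T \left( A|_{FR}^T A|_{FF} + A|_{RR}^T A|_{RF} \right),
\]
\[
A^{PG}_{RR} = (\Phi^R)^T \left( A|_{FR}^T A|_{FR} + A|_{RR}^T A|_{RR} \right),
\]
\[
F^{PG}_R = (\Phi^R)^T A|_{FR}^T F|_F + (\Phi^R)^T A|_{RR}^T F|_R.
\]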
Also, any hyper-reduction technique for efficiently reconstructing the ROM
equations can be used. The described overlapping strategies and the use of weighting
coefficients need to be introduced in the previous formulation. This can be done in a
straightforward manner, including the use of different weighting parameters β or ε
for the FOM and the ROM equations.
4.5 Particularities of the Application to the Incompressible Navier-Stokes Equations
The use of the domain decomposition ROM strategy to the particular problem of
the incompressible Navier-Stokes equations is straightforward if a ROM approach
is used in all the subdomains. On the other hand, some care needs to be taken when
a FOM approximation is used in one of the subdomains while a ROM approxima-
tion is used in its neighbour subdomains. As in the original domain decomposition
strategy, a penalization term through overlapping is convenient in this FOM-ROM
approach. However, it is necessary to distinguish between the velocity and the pres-
sure unknowns of the incompressible Navier-Stokes equations in this case: only the
equality between the FOM and the ROM velocities in the overlapping region is
imposed, and no condition is required on the FOM pressure field. This is so because
the pressure field can be understood as the Lagrange multiplier enforcing the incom-
pressibility constraint, and as such it is not possible to enforce the pressure value
over the overlapping domain.
4.6 Numerical Example. Flow Injection in a Rectangular Cylinder
In this numerical example we show the capability of the proposed FOM-ROM
strategy to adapt to flow configurations which were not present in the original snapshot
set. The initial problem set is the incompressible flow past a rectangular cylinder at
Re = 100. The computational domain consists of a 24 × 12 rectangle with a square
cylinder with a side of size 1. The square cylinder is centered at coordinates (8, 6).
The horizontal inflow velocity is set to 1. Slip boundary conditions which allow the
flow to move in the direction parallel to the walls are set at y = 0 and y = 12, and
velocity is set to 0 on the cylinder surface in the direction normal to the surface.
A tangential force (computed by using a wall-law approach) is used to model the
velocity in the tangential direction. The viscosity has been set to ν = 0.01, which
yields a Reynolds number Re = 100 based on the dimension of the cylinder and the
inflow velocity. A second order backward difference scheme has been used for the
time integration with time step δt = 0.1. In this example, a relatively fine mesh of
67224 linear elements has been used to solve the problem.
Fig. 8 Comparison of the FOM and ROM velocities at (5,4) for the initial configuration
An initial run of the full-order model is performed for the snapshot collection and
no domain decomposition strategy is applied in the initial run. The FOM model takes
849.36 s to run. After the snapshot collection procedure, the ROM is capable of
reproducing the FOM solution with good accuracy for the velocity field (2.1 % of relative
error in the $L^2$-norm for the last oscillation period), the pressure amplitude being
underpredicted (but only with 0.8 % of relative error in the last oscillation period),
and at a very low computational cost (3.07 s, 0.37 % of the original computational
cost), as illustrated in Fig. 8. For the ROM run, 10 basis functions are used, which
are obtained from the POD decomposition of the original 50 snapshot collection.
As illustrated in Fig. 8, the reduced-order model is capable of reproducing the
solution of the full-order model for the configuration in which the snapshots were
taken. However, let us now consider the flow injection in the downstream side of the
cylinder illustrated in Fig. 9, which is introduced in order to modify the flow. The
velocity in the injection region (whose length is 0.2) is 0.1 in the direction normal
to the cylinder surface. Figure 10 illustrates the behavior of the reduced order model
when the injection is considered. Despite its very low computational cost compared
to the FOM model, it is clear that the ROM is incapable of reproducing the new flow
configuration; the reason for this is that the snapshot set from which the ROM basis
was built does not contain the solution with the flow injection.
Let us now consider the FOM-ROM strategy described in the previous sections.
We decompose the physical domain into two subdomains, based on our a priori
knowledge of the boundary conditions of the problem: the first subdomain
corresponds to the region surrounding the square cylinder, the rectangle (7, 10) × (5, 7).
In this subdomain a FOM approach is taken, and the Navier-Stokes equations are
solved with full accuracy. The second subdomain covers the
rest of the computational domain. Since this region does not involve the critical area
where the vortices are formed, it is solved by means of the less accurate
ROM strategy. The ROM basis is obtained from a set of 100 snapshots, from which
an L-POD basis of 10 basis functions is built. As will be shown, the combination
of both strategies (FOM and ROM) allows us to recover a solution which is close to
the full FOM solution, but at a much lower computational cost.
Fig. 9 Flow injection configuration. The red dotted line denotes the FOM domain for the
FOM-ROM model
Fig. 10 Comparison of the vertical velocity (left) and pressure (right) at (5,4) for the FOM,
FOM-ROM and ROM models for the injection case
Figure 10 shows a comparison of the vertical velocity and pressure at a point
at the wake of the cylinder with coordinates (5, 4), for the FOM, the ROM and
the FOM-ROM models. It is interesting to note that the ROM model is not able to
capture the physics of the problem; this is natural since the ROM basis does not
contain the solution of the injection case. The FOM-ROM model, on the other hand,
is capable of a quite accurate solution of the system evolution in the short term
in the FOM domain (13.3 % relative error for the velocity time history in the last
oscillation period and 4.6 % error in the pressure). Figure 11 compares the velocity
and pressure fields of the FOM and the FOM-ROM models. We can observe that in
the region surrounding the cylinder (FOM region) the velocity and pressure fields
are very similar, whereas in the ROM region the velocity fields differ slightly, with more
intense vortices or bulbs in the FOM simulation. This is due to the difficulty the
ROM model has in representing the injected velocity and pressure fields (the
snapshots used are poorly suited to the injection case). Despite this evident lack of
optimality of the snapshot set, the FOM-ROM model is capable of properly representing
the solution in the FOM region. Figure 12 shows a comparison between the FOM simulation and
the FOM-ROM model for several injection velocities. The accuracy of the FOM-ROM
model decreases as the absolute value of the injection velocity increases. This is due
to the fact that the larger the injection velocity, the more the flow differs
from the original FOM simulation without injection. Regarding the computational
cost, the FOM-ROM approach takes 55.56 s to run, which is only 6.7 % of the original
FOM computational cost.
Fig. 11 Comparison of the velocity (top) and pressure (bottom) fields after 400 steps. Left FOM.
Right FOM-ROM
5 Conclusions
In this chapter we have discussed several strategies for dealing with the reduced-order
approximation of the incompressible Navier-Stokes equations. We have started
from a stabilized finite element full-order approximation and we have approached
the order reduction by using a Proper Orthogonal Decomposition (POD) method.
Fig. 12 Comparison of the vertical velocity (left) and pressure (right) at (5,4) for the FOM,
FOM-ROM and ROM models for the injection case. Injection velocities, from top to bottom:
0.2, 0.5, -0.2 (each panel plots y-velocity or pressure against time)
In the first part of the chapter, we have focused on the construction of an explicit
reduced-order model for the incompressible Navier-Stokes equations, and the
application of hyper-reduction techniques to it. The basic idea is to treat all the terms
except the mass matrix in the temporal derivative in an explicit way. This includes
the non-linear convective term, but also the stabilization terms, which can be highly
non-linear through the stabilization parameter τ. In order to do so, we take
advantage of the fact that the snapshots used for building the reduced-order basis through a
singular value decomposition in the POD procedure already fulfill the stabilized
continuity equation. Secondly, we also exploit the fact that, if the velocity and
pressure are treated jointly, then the pressure can be recovered from the reduced-order
basis and the solution coefficients at the end of each time step.
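The construction of a reduced-order basis from snapshots through a singular value decomposition can be illustrated with a minimal sketch. It is not the chapter's implementation: the snapshot matrix is synthetic, the column-wise storage of joint velocity-pressure snapshots is an assumption, and the names `pod_basis`, `S` and `Phi` are hypothetical.

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """Return the first n_modes left singular vectors of the snapshot matrix.

    snapshots: array of shape (n_dofs, n_snapshots), one joint
    velocity-pressure snapshot per column (assumed layout).
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes], s

# Synthetic snapshots lying (up to round-off) in a 3-dimensional subspace.
rng = np.random.default_rng(0)
modes_true = rng.standard_normal((200, 3))
coeffs = rng.standard_normal((3, 100))
S = modes_true @ coeffs                      # 200 dofs, 100 snapshots

Phi, sigma = pod_basis(S, 3)

# Reduced coefficients of one snapshot, then recovery of the full joint
# vector (velocity and pressure together) from basis and coefficients.
a = Phi.T @ S[:, 0]
recovered = Phi @ a
pod_error = np.linalg.norm(recovered - S[:, 0]) / np.linalg.norm(S[:, 0])
```

Because the basis columns are orthonormal, the reduced coefficients are obtained by a plain projection, and the full vector is recovered by one matrix-vector product.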
The proposed explicit reduced-order model performs well in practical cases, as
illustrated in the numerical examples section. Despite the time-stepping scheme
being explicit, the Courant-Friedrichs-Lewy condition can be violated, which can be
explained by the fact that the reduced basis functions extend over the whole computational
domain. On the other hand, the reduced model is sensitive to the inclusion of noisy
basis functions, which can cause unstable solutions to appear. The sensitivity of the
explicit reduced-order model to this issue can be reduced by decreasing the time step
and refining the finite element mesh.
A hyper-reduction strategy for the explicit reduced-order model has also been
presented, which is based on the reconstruction of the right-hand-side vector through
a gappy POD procedure. For the selection of the indices of the gappy reconstruction,
we use a discrete version of the Best Points Interpolation Method (DBPIM), which
uses only values at the nodes of the finite element mesh, with the advantage that the
selected points can be guaranteed to be at least locally optimal.
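As a rough illustration of the gappy reconstruction of a right-hand-side vector, the sketch below recovers a full vector from a few sampled entries by least squares. The sampling indices are drawn at random here, which is an assumption made only for the example: it does not reproduce the DBPIM selection. The names `gappy_reconstruct`, `basis` and `indices` are hypothetical.

```python
import numpy as np

def gappy_reconstruct(basis, indices, sampled_values):
    """Least-squares gappy POD reconstruction from a few sampled entries."""
    coeffs, *_ = np.linalg.lstsq(basis[indices, :], sampled_values, rcond=None)
    return basis @ coeffs

rng = np.random.default_rng(1)
n_dofs, n_modes = 300, 5

# Orthonormal basis for the right-hand side (synthetic stand-in for a POD basis).
basis, _ = np.linalg.qr(rng.standard_normal((n_dofs, n_modes)))
rhs = basis @ rng.standard_normal(n_modes)   # a vector in the span of the basis

# Assumed sampling nodes: random here, not selected by DBPIM.
indices = rng.choice(n_dofs, size=12, replace=False)

rhs_gappy = gappy_reconstruct(basis, indices, rhs[indices])
gappy_error = np.linalg.norm(rhs_gappy - rhs) / np.linalg.norm(rhs)
```

Only the 12 sampled entries of the right-hand side are evaluated, which is where the saving of a hyper-reduced model comes from. The quality of the reconstruction depends on how the sampling indices are chosen.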
In the second part of the chapter, we have presented a domain decomposition strat-
egy for non-linear hyper-reduced-order models. The method consists of restricting
the reduced-order basis functions to the nodes of each subdomain. This definition
of the partitioned problem directly ensures the continuity of the recovered solution.
The local POD bases are obtained by computing a local POD decomposition for
the partitioned snapshots. When applied to the explicit reduced-order model for the
incompressible Navier-Stokes equations, a stabilizing penalization term is required.
This penalty term is defined so that it weakly enforces the equality of the unknown
between subdomains in an overlapping region.
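The computation of local POD bases from partitioned snapshots can be sketched as follows. This is an illustration under simplifying assumptions: the partition is non-overlapping, the penalty term mentioned above is not included, and the data and the names `local_pod_bases` and `subdomain_nodes` are hypothetical.

```python
import numpy as np

def local_pod_bases(snapshots, subdomain_nodes, n_modes):
    """One POD basis per subdomain, computed from the restricted snapshots."""
    bases = []
    for nodes in subdomain_nodes:
        U, _, _ = np.linalg.svd(snapshots[nodes, :], full_matrices=False)
        bases.append(U[:, :n_modes])
    return bases

rng = np.random.default_rng(2)
S = rng.standard_normal((120, 4)) @ rng.standard_normal((4, 40))  # rank-4 snapshots

# Assumed partition of the 120 nodal values into two subdomains.
subdomain_nodes = [np.arange(0, 70), np.arange(70, 120)]
bases = local_pod_bases(S, subdomain_nodes, n_modes=4)

# Project one snapshot subdomain by subdomain and glue the pieces together.
u = S[:, 7]
u_rec = np.empty_like(u)
for nodes, Phi in zip(subdomain_nodes, bases):
    u_rec[nodes] = Phi @ (Phi.T @ u[nodes])
local_error = np.linalg.norm(u_rec - u) / np.linalg.norm(u)
```

Each subdomain carries its own small basis, so the number of modes could in principle differ from one subdomain to the next.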
The domain decomposition reduced-order model can be extended to a particular
case, in which one of the subdomains is solved by using the full-order finite ele-
ment equations while the other ones are solved using the reduced-order model. This
diminishes the computational cost in the low-resolution subdomains, while keeping
the high fidelity solution in the domain regions which are subject to more complex
physical phenomena.
Numerical examples illustrate the accuracy of the proposed methods for the
solution of incompressible flow problems at a low computational cost: the reduced-order
model allows us to save up to 65 % of the computational cost, while in the case of
the hyper-reduced-order models the computational saving is larger than 99 % of the
original computational cost.
References
1. Akhtar I, Borggaard J, Hay A (2010) Shape sensitivity analysis in flow models using a finite-
difference approach. Math Probl Eng 123:2010
2. Antil H, Heinkenschloss M, Hoppe RHW, Sorensen DC (2010) Domain decomposition and
model reduction for the numerical solution of pde constrained optimization problems with
localized optimization variables. Comput Vis Sci 13(6):249264

3. Arian E, Fahl M, Sachs EW (2000). Trust-Region proper orthogonal decomposition for flow
control. Institute for computers, pp 20002101
4. Astrid P (2004). Reduction of process simulation models: a proper orthogonal decomposi-
tion approach. PhD thesis, Department of Electrical Engineering, Eindhoven University of
Technology
5. Astrid P, Weiland S, Willcox K, Backx T (2008) Missing point estimation in models described
by proper orthogonal decomposition. IEEE Trans Autom Control 53:22372251
6. Baiges J, Codina R, Idelsohn S (2013) A domain decomposition strategy for reduced order
models. Application to the incompressible Navier-Stokes equations. Comput Meth Appl Mech
Eng 267:2342
7. Baiges J, Codina R, Idelsohn S (2013) Explicit Reduced Order Models for the stabilized finite
element approximation of the incompressible Navier-Stokes equations. Int J Numer Meth Fluids
72:12191243
8. Barrault M, Maday Y, Nguyen NC, Patera AT (2004) An empirical interpolation method:
application to efficient reduced-basis discretization of partial differential equations. Comptes
Rendus Mathematique 339(9):667672
9. Bergmann M, Cordier L, Brancher JP (2007) Drag minimization of the cylinder wake by
trust-region proper orthogonal decomposition. Notes on numerical fluid mechanics and multi-
disciplinary design 95:19
10. Buffoni M, Telib H, Iollo A (2009) Iterative methods for model reduction by domain decom-
position. Comput Fluids 38(6):11601167
11. Bui-Thanh T, Willcox K, Ghattas O (2008) Model reduction for large-scale systems with high-
dimensional parametric input space. SIAM J Sci Comput 30(6):3270
12. Burkardt J, Gunzburger M, Lee H (2006) POD and CVT-based reduced-order modeling of
Navier-Stokes flows. Comput Meth Appl Mech Eng 196(13):337355
13. Carlberg K, Bou-Mosleh C, Farhat C (2011) Efficient non-linear model reduction via a least-
squares Petrov-Galerkin projection and compressive tensor approximations. Int J Numer Meth
Eng 86(2):155181
14. Chatterjee A (2000) An introduction to the proper orthogonal decomposition. Curr Sci
78(7):808817
15. Chaturantabut S, Sorensen DC (2009). Discrete empirical interpolation for nonlinear model
reduction. Technical Report TR09-05, Rice University, Houston, Texas
16. Codina R (2001) A stabilized finite element method for generalized stationary incompressible
flows. Comput Meth Appl Mech Eng 190:26812706
17. Drohmann M, Haasdonk B, Ohlberger M (2012) Reduced basis approximation for nonlin-
ear parameterized evolution equations based on empirical operator interpolation. SIAM J Sci
Comput 34:937962
18. Everson R, Sirovich L (1995) Karhunen-Love procedure for gappy data. J Opt Soc Am A
12:16571664
19. Galletti B, Bruneau CH, Zannetti L, Iollo A (2004) Low-order modelling of laminar flow
regimes past a confined square cylinder. J Fluid Mech 503:161170
20. Glaz B, Liu L, Friedmann PP (2010) Reduced-Order nonlinear unsteady aerodynamic modeling
using a surrogate-based recurrence framework. AIAA J 48(10):24182429
21. Graham WR, Peraire J, Tang KY (1999) Optimal control of vortex shedding using low-order
models. Part i-open-loop model development. Int J Numer Meth Eng 44(7):945972
22. Grepl MA, Maday Y, Nguyen NC, Patera AT (2007) Efficient Reduced-Basis treatment of
nonaffine and nonlinear partial differential equations. ESAIM. Math Model Numer Anal
41(03):575605
23. Holmes P, Lumley JL, Berkooz G (1998) Turbulence, coherent structures. Dynamical systems
and symmetry. Cambridge University Press, New York
24. Jacobs EN, Ward KE, Pinkerton RM (1933). The characteristics of 78 related airfoil sections
from tests in the variable-density wind tunnel. NACA report, 460
25. Kalashnikova I, Barone MF (2011). Stable and efficient galerkin reduced order models for
non-linear fluid flow. In: AIAA-2011-3110, 6th AIAA theoretical fluid mechanics conference,
Honolulu
26. Kosambi DD (1943) Statistics in function space. J Indian Math Soc 7:7688
27. Lassila T, Rozza G (2010) Parametric free-form shape design with PDE models and reduced
basis method. Comput Meth Appl Mech Eng 199(2324):15831592
28. LeGresley PA (2005). Application of proper orthogonal decomposition to design decomposition
methods. PhD thesis, Department of Aeronautics and Astronautics, Stanford University
29. Lucia DJ, Beran PS (2003) Projection methods for reduced order models of compressible flows.
J Comput Phys 188(1):252280
30. Lucia DJ, King PI, Beran PS (2003) Reduced order modeling of a two-dimensional flow with
moving shocks. Comput Fluids 32(7):917938
31. Nguyen NC, Peraire J (2008) An efficient reduced-order modeling approach for non-linear
parametrized partial differential equations. Int J Numer Meth Eng 76(1):2755
32. Noack BR, Morzynski M, Tadmor G (2011) Reduced-Order modelling for flow control.
Springer, Berlin
33. Rabczuk T, Bordas SPA, Kerfriden P, Goury O (2012). A partitioned model order reduction
approach to rationalise computational expenses in multiscale fracture mechanics
34. Rozza G, Lassila T, Manzoni A (2011) Reduced basis approximation for shape optimization
in thermal flows with a parametrized polynomial geometric map. In: Hesthaven JS, Ranquist
EM (eds) Spectral and high order methods for partial differential equations., vol 76Springer,
Berlin, pp 307315
35. Ryckelynck D (2005) A priori hyperreduction method: an adaptive approach. J Comput Phys
202(1):346366
36. Ryckelynck D (2009) Hyper-reduction of mechanical models involving internal variables. Int
J Numer Meth Eng 77(1):7589
37. Verhoeven A, Voss T, Astrid P, ter Maten EJW, Bechtold T (2007) Model order reduction for
nonlinear problems in circuit simulation. PAMM 7(1):10216031021604
38. Verhoeven A, Maten J, Striebel M, Mattheij R (2009) Model order reduction for nonlinear ic
models. In: Korytowski A, Malanowski K, Mitkowski W, Szymkat M (eds) System model-
ing and optimization, vol 312, IFIP advances in information and communication technology,
Springer, Berlin, pp 476491
39. Veroy K, Patera AT (2005). Certified real-time solution of the parametrized steady incompress-
ible Navier-Stokes equations: rigorous reduced-basis a posteriori error bounds. Int J Numer
Meth Fluids, 47(89):773788
40. Wang Z, Akhtar I, Borggaard J, Iliescu T (2011). Proper orthogonal decomposition closure
models for turbulent flows: a numerical comparison. arXiv:1106.3585
41. Wicke M, Stanton M, Treuille A. Modular bases for fluid dynamics. ACM Trans Graph,
28(3):39:139:8
A Survey of Hierarchical Model (Hi-Mod)
Reduction Methods for Elliptic Problems
Simona Perotto
Abstract In this work, we review the basic aspects of the so-called Hierarchical
Model (Hi-Mod) reduction approach, recently advocated to reduce the complexity of
models for advection-diffusion-reaction phenomena in pipe-like domains featuring
a prevalent axial dynamics. The Hi-Mod approach aims at reducing the
computational costs while still preserving a reliable approximation of the transverse components
of the solution by properly combining finite elements and modal approximations. In
particular, we consider the convergence of this approximation to the solution of the
full problem and the different ways of selecting the number of transverse modes.
1 Why Hi-Mod Reduction?
Mathematical and numerical models are nowadays a fundamental tool for quantitative
analysis in many fields of science and engineering. On the one hand, sophisticated
models can be reliably used for complex dynamics (fluid-structure interaction,
biochemical reactions, etc.) not only for computing quantities of interest, but also for
solving optimization, identification or, more generally, inverse problems. On the
other hand, practical use of these tools demands a significant reduction of
computational costs. This may be extremely challenging, in particular for inverse problems.
For this reason, one important recent research line is devoted to the setup of
surrogate models and solutions for a particular problem, seeking the
best trade-off between reliability and computational efficiency [19]. This can be
achieved with a reduction of the size of the (finite dimensional) solution, based on the
on-line/off-line paradigm as in the Proper Orthogonal Decomposition approach or
in the Reduced Basis method. A different, somehow complementary, approach is
S. Perotto (B)
MOX, Dipartimento di Matematica F. Brioschi, Politecnico di Milano,
Piazza Leonardo da Vinci 32, I-20133 Milano, Italy

based on the simplification of the model to be solved, by taking advantage of specific
features of the problem. For instance, fluid dynamics in networks of pipes, as in the
circulatory system or in internal combustion engines, can be based on the well-known
Euler equations that describe the main axial dynamics, dropping the transverse
components. Although this is an excellent approach for predicting the dynamics over an
entire network, it may fail in describing important local details, where transverse
components are important. For this reason, the coupling of Euler equations (in 1D) and
incompressible Navier-Stokes equations (in 3D) has been addressed in [12], where the
so-called geometrical multiscale approach has been introduced. A different way of
including both axial and transverse components of the dynamics in a dimensionally
homogeneous framework has been introduced in [22] for advection-diffusion-reaction
problems. According to this approach, the main dynamics, numerically
solved with a finite element approximation, is enriched with transverse components
described by a modal or spectral representation. A small number of modes is required
when the transverse dynamics is not important, leading to a psychologically 1D
model. As a matter of fact, the discrete problem obtained in this way is a coupled
system of 1D block problems, with interesting algebraic properties. In addition, the
number $m$ of modes can be selected to improve locally the reliability of the model,
including, when needed, a more precise description of the transverse components in a
hierarchical fashion. In this work, we review the basic aspects of this model reduction
technique, called Hierarchical Model (Hi-Mod) reduction. In particular, we consider
the convergence of this approximation to the solution of the full problem and three
different ways of selecting the number of transverse modes. Specifically, $m$ may be
selected uniformly, piecewise constant by subdomains or at each node of the axial
finite element discretization. In addition, it can be tuned based on a priori as well as
a posteriori considerations. In the latter case, we naturally obtain a model-adaptive
tool selecting automatically the accuracy for the transverse dynamics.
2 The Computational Domain
Problems relevant to the Hi-Mod formulation feature a domain where one direction
is prevalent. Thus, we assume that $\Omega \subset \mathbb{R}^d$ coincides with a $d$-dimensional fiber
bundle, with $d = 2, 3$, so that $\Omega = \bigcup_{x \in \Omega_{1D}} \{x\} \times \gamma_x$, where $\Omega_{1D}$ is the
supporting 1D domain described by only one independent variable $x$, while $\gamma_x \subset \mathbb{R}^{d-1}$
denotes the transverse fiber which, in general, is a function of $x$. Thus, we align $\Omega_{1D}$
with the dominant dynamics exhibited by the problem at hand and the fibers $\gamma_x$ with
the secondary transverse dynamics. For the sake of simplicity, we choose $\Omega_{1D}$ as
the interval $]x_0, x_1[$. The more general case of a curved supporting fiber can be
considered as well (see Remark 2). Now, we partition the boundary $\partial\Omega$ of $\Omega$ into three
disjoint sets, $\Gamma_0 = \{x_0\} \times \gamma_{x_0}$, $\Gamma_1 = \{x_1\} \times \gamma_{x_1}$ and
$\Gamma_* = \bigcup_{x \in \Omega_{1D}} \partial\gamma_x$, such that
$\partial\Omega = \Gamma_0 \cup \Gamma_1 \cup \Gamma_*$. We assume that either homogeneous Dirichlet or
homogeneous Neumann boundary conditions can be enforced on $\Gamma_0$, $\Gamma_1$ and $\Gamma_*$, and that
non-homogeneous Dirichlet data can be assigned on $\Gamma_0$ and $\Gamma_1$.
Fig. 1 Example of the map $\Psi$ for a rectilinear three-dimensional domain
For any $x \in \Omega_{1D}$, we introduce the map $\psi_x : \gamma_x \to \hat\gamma_{d-1}$ between the generic
fiber $\gamma_x$ and a reference fiber $\hat\gamma_{d-1}$ of the same dimension. Maps $\psi_x$ induce in turn
the general map $\Psi : \Omega \to \hat\Omega$ between the physical domain $\Omega$ and the reference
domain $\hat\Omega = \bigcup_{x \in \Omega_{1D}} \{x\} \times \hat\gamma_{d-1}$, where the computations are carried out once and
for all. A generic point in $\Omega$ ($\hat\Omega$) is referred to as $z = (x, y)$ ($\hat z = \Psi(z) = (\hat x, \hat y)$,
with $\hat x = x$ and $\hat y = \psi_x(y)$). Without loss of generality, we assume $\Omega_{1D}$ to be the
subset of $\Omega$ with $y = 0$, i.e., $\Omega_{1D}$ coincides with the centerline of $\Omega$. We assume
that $\psi_x$ is a $C^1$ diffeomorphism, for all $x \in \Omega_{1D}$, and that $\Psi$ is differentiable with
respect to $z$. This last assumption essentially excludes the presence of kinks along
$\Gamma_*$. Figure 1 provides a sketch of the map $\Psi$ for $d = 3$.
In two dimensions, it is always possible to identify the map $\psi_x$ with the linear
transformation $\hat y = \psi_x(y) = y/L(x)$, with $L(x) = \mathrm{meas}(\gamma_x)$. In three dimensions
this choice is still possible for some configurations, for instance when $\Omega$ is a
cylindrical domain. In this case $L(x)$ is the diameter of the pipe along the centerline $x$ (as
it is in Fig. 1).
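Finally, we introduce the Jacobian associated with the map $\Psi$, given by
$$
\mathcal{J}(z) = \frac{\partial \Psi}{\partial z} = \begin{bmatrix} 1 & 0 \\ D_1(z) & D_2(z) \end{bmatrix} \in \mathbb{R}^{d \times d}, \tag{1}
$$
with $D_1(z) = \partial\psi_x/\partial x \in \mathbb{R}^{d-1}$, $D_2(z) = \nabla_y \psi_x \in \mathbb{R}^{(d-1)\times(d-1)}$, and where $\nabla_y$
stands for the gradient with respect to $y$. Notice that the first row in $\mathcal{J}(z)$ is the same
as in the identity matrix since the map $\Psi$ does not modify the supporting fiber $\Omega_{1D}$.
Moreover, in the 2D case, $D_1(z)$ and $D_2(z)$ simplify to $-L'(x)\,y/L^2(x)$ and $1/L(x)$,
respectively.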
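The 2D expressions of $D_1$ and $D_2$ can be checked numerically. The sketch below assumes a hypothetical fiber length $L(x) = 1 + 0.2x$ (not taken from the text) and compares the closed-form entries of the Jacobian (1) with central finite differences of the map $\psi_x(y) = y/L(x)$.

```python
import numpy as np

def L(x):
    """Assumed fiber length of a slowly widening 2D channel."""
    return 1.0 + 0.2 * x

def dL(x):
    """Derivative of the assumed fiber length."""
    return 0.2

def psi(x, y):
    """Linear map from the physical fiber to the reference fiber."""
    return y / L(x)

def jacobian(x, y):
    """Jacobian (1) in 2D: first row of the identity, then D1 and D2."""
    D1 = -dL(x) * y / L(x) ** 2
    D2 = 1.0 / L(x)
    return np.array([[1.0, 0.0], [D1, D2]])

x0, y0, eps = 0.7, 0.3, 1e-6
J = jacobian(x0, y0)

# Central finite differences of psi as an independent check of D1 and D2.
D1_fd = (psi(x0 + eps, y0) - psi(x0 - eps, y0)) / (2 * eps)
D2_fd = (psi(x0, y0 + eps) - psi(x0, y0 - eps)) / (2 * eps)
```

For a widening channel ($L' > 0$) and $y > 0$, the entry $D_1$ is negative: a fixed physical height $y$ corresponds to a smaller reference coordinate as the fiber grows.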
Remark 1 The geometric framework introduced above differs markedly from
the setting used in [2, 4, 28–30], where 1D models are associated with the
transverse directions while the supporting fiber has dimension $(d - 1)$.
Remark 2 (Non-rectilinear domains) Curvilinear domains are tackled in [21]. In
such a case the supporting fiber $\Omega_{1D}$ coincides with a one-dimensional curved domain
as shown in Fig. 2. Now, the map $\Psi$ becomes more complicated since the
deformation of the centerline also has to be taken into account. Thus, we have
$\hat z = \Psi(z) = (\Psi_1(z), \Psi_2(z))$, with $\hat x = \Psi_1(z)$ and $\hat y = \Psi_2(z)$. The definition of the
Jacobian (1) accordingly changes into
$$
\mathcal{J}(z) = \frac{\partial \Psi}{\partial z} = \begin{bmatrix} \partial\Psi_1/\partial x & \nabla_y \Psi_1 \\ \partial\Psi_2/\partial x & \nabla_y \Psi_2 \end{bmatrix}.
$$
Moreover, the inverse map between $\hat\Omega$ and $\Omega$ does not coincide now with $\Psi^{-1}$ and
is defined separately as $\Phi : \hat\Omega \to \Omega$, so that $z = \Phi(\hat z) = (\Phi_1(\hat z), \Phi_2(\hat z))$, with $x = \Phi_1(\hat z)$
and $y = \Phi_2(\hat z)$ (see Fig. 2). Finally, we assume that both $\Psi$ and $\Phi$ are differentiable
with respect to $z$ and $\hat z$, respectively.
Fig. 2 Maps involved in the Hi-Mod procedure applied to a curved three-dimensional domain
3 Three Different Hi-Mod Techniques
Three different approaches have been proposed so far to perform a Hi-Mod reduction.
The common idea is to exploit the fiber structure introduced on $\Omega$ by tackling in a
different way the dependence of the solution on the dominant and on the transverse
directions. In the sections below we present separately the three Hi-Mod techniques
by following the chronological order of their proposal in the literature.
For this purpose, we first introduce the full problem, i.e., the problem we aim at
reducing. We consider a generic second-order elliptic problem in the weak form,
given by
$$
\text{find } u \in V : \quad a(u, v) = F(v) \quad \forall v \in V, \tag{2}
$$
with $V \subseteq H^1(\Omega)$ a Hilbert space, $a(\cdot, \cdot) : V \times V \to \mathbb{R}$ a continuous and
coercive bilinear form and $F(\cdot) : V \to \mathbb{R}$ a continuous linear functional, so that the
Lax-Milgram lemma ensures the well-posedness of (2). The choice of the space $V$
takes into account the boundary conditions assigned on $\partial\Omega$. Standard notation for
the Sobolev spaces as well as for the spaces of functions bounded a.e. in $\Omega$ is adopted
(see, e.g., [18]).
3.1 Uniform Hi-Mod Reduction
This model reduction technique was first introduced in [11] and then more rigorously
investigated in [22]. The dominant and the transverse dynamics of the problem are
described via two different functional representations, namely
1. a space $V_{1D} \subseteq H^1(\Omega_{1D})$ spanned by functions defined on $\Omega_{1D}$ and compatible
with the boundary conditions enforced along $\Gamma_0$ and $\Gamma_1$;
2. a modal basis $\{\varphi_j\}_{j \in \mathbb{N}^+} \subset H^1(\hat\gamma_{d-1})$ of functions orthonormal with respect to
the $L^2$-scalar product on $\hat\gamma_{d-1}$, properly including the boundary data assigned on
$\Gamma_*$.
By properly combining the space $V_{1D}$ with the modal basis, we define the uniform
hierarchically reduced space
$$
V_m = \Big\{ v_m(x, y) = \sum_{j=1}^{m} \tilde v_j(x)\, \varphi_j(\psi_x(y)), \ \text{with } \tilde v_j \in V_{1D},\ x \in \Omega_{1D},\ y \in \gamma_x \Big\}, \tag{3}
$$
where $m \in \mathbb{N}$ is a given integer, fixed a priori. Space $V_m$ identifies a hierarchy of
models: the number $m$ of included modes determines the accuracy of the reduced
model that, in principle, can be tuned arbitrarily close to the full one. We call this
approach uniform since we use the same value for $m$ over the entire domain. Notice
that, due to the orthonormality of the modal functions, the frequency coefficients in
(3) are given by
$$
\tilde v_j(x) = \int_{\hat\gamma_{d-1}} v_m\big(x, \psi_x^{-1}(\hat y)\big)\, \varphi_j(\hat y)\, d\hat y, \quad \text{with } j = 1, \ldots, m.
$$
We can now state the uniform Hi-Mod formulation: given a modal index $m \in \mathbb{N}^+$,
$$
\text{find } u_m \in V_m : \quad a(u_m, v_m) = F(v_m) \quad \forall v_m \in V_m. \tag{4}
$$
To guarantee the well-posedness of formulation (4), we introduce a conformity
hypothesis on the space $V_m$ (i.e., $V_m \subset V$, for all $m \in \mathbb{N}^+$) and we exploit the
well-posedness assumed for problem (2). On the other hand, the convergence of
$u_m$ to $u$ is obtained by adding a spectral approximability assumption on $V_m$, i.e.,
we demand that, for any $v \in V$, $\lim_{m \to +\infty} \big( \inf_{v_m \in V_m} \|v - v_m\|_V \big) = 0$. Indeed,
let $e_m = u - u_m \in V$ be the uniform modeling error which, via the conformity
hypothesis, satisfies the modeling orthogonality property $a(e_m, v_m) = 0$, for any
$v_m \in V_m$. It can be easily proved that the spectral optimality property holds, i.e., that
$\|e_m\|_V \le C \inf_{v_m \in V_m} \|u - v_m\|_V$, with $C$ a constant depending on both the continuity
and the coercivity constants of $a(\cdot, \cdot)$. Then, thanks to the spectral approximability
hypothesis, the convergence immediately follows.
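The projection formula for the frequency coefficients and the effect of increasing $m$ can be illustrated with a minimal sketch. It assumes a sine modal basis on the reference fiber $(0, 1)$, the identity map ($L(x) = 1$) and a hypothetical test function; none of these choices is prescribed by the text, and the function names are illustrative.

```python
import numpy as np

def phi(j, y_hat):
    """Sine modal function on the reference fiber (0, 1), zero at both ends."""
    return np.sqrt(2.0) * np.sin(j * np.pi * y_hat)

# Gauss-Legendre nodes and weights mapped from (-1, 1) to (0, 1).
nodes, weights = np.polynomial.legendre.leggauss(64)
y_q = 0.5 * (nodes + 1.0)
w_q = 0.5 * weights

def modal_coefficients(v, x, m):
    """Coefficients v_j(x) = int_0^1 v(x, y) phi_j(y) dy for j = 1..m."""
    return np.array([np.sum(w_q * v(x, y_q) * phi(j, y_q)) for j in range(1, m + 1)])

def truncation_error(v, x, m):
    """L2 error on the fiber between v(x, .) and its m-mode expansion."""
    c = modal_coefficients(v, x, m)
    v_m = sum(c[j - 1] * phi(j, y_q) for j in range(1, m + 1))
    return np.sqrt(np.sum(w_q * (v(x, y_q) - v_m) ** 2))

# Assumed test function vanishing at y = 0 and y = 1.
v = lambda x, y: np.sin(x) * y * (1.0 - y)

# Gram matrix of the first four modes: should be the identity (orthonormality).
gram = np.array([[np.sum(w_q * phi(j, y_q) * phi(k, y_q)) for k in range(1, 5)]
                 for j in range(1, 5)])
errors = [truncation_error(v, 1.0, m) for m in (1, 3, 5, 9)]
```

The list `errors` decreases as modes are added, which is the hierarchy of models identified by $V_m$ seen on a single fiber.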
Remark 3 (Choice of the modal basis) Different choices are possible for the modal
basis $\{\varphi_j\}_{j \in \mathbb{N}^+}$. Of course, this choice strictly depends on the boundary conditions
assigned along $\Gamma_*$. So far we have essentially used trigonometric functions, i.e.,
we have essentially considered homogeneous Dirichlet boundary conditions on the
horizontal sides of $\Omega$. In [22] we have also checked the performance associated
with Legendre polynomials, with similar results. More recently, in order to deal with
more general boundary conditions, we have introduced a new type of modal basis
called educated basis (see [3]). The idea is to solve an auxiliary Sturm-Liouville
problem on the transverse reference fiber $\hat\gamma_{d-1}$ in order to build a modal basis which
automatically includes the boundary conditions assigned along $\Gamma_*$. This approach
has been successfully validated both in 2D and 3D. We finally remark that the choice
of the modal basis, together with the regularity of the full solution $u$, also influences
the rate of the modal convergence.
3.1.1 Discrete Uniform Hi-Mod Reduction
To make the uniform Hi-Mod approach useful in practice, we consider the discrete
counterpart of formulation (4). Following [11, 22], we discretize the main dynamics
via standard 1D finite elements while preserving the modal expansion to describe
the transverse features. For this purpose, we consider a subdivision $\mathcal{T}_h$ of $\Omega_{1D}$ into
subintervals $K_i = (x_{i-1}, x_i)$ of width $h_i = x_i - x_{i-1}$, with $h = \max_i h_i$. Then,
we consider a conforming finite element space $V_{1D}^h \subset V_{1D}$ associated with $\mathcal{T}_h$ such
that $\dim(V_{1D}^h) = N_h < +\infty$, and we introduce a standard density hypothesis on
the space $V_{1D}^h$. The discrete uniform Hi-Mod reduction can thus be stated as: given
a modal index $m \in \mathbb{N}$,
$$
\text{find } u_m^h \in V_m^h : \quad a(u_m^h, v_m^h) = F(v_m^h) \quad \forall v_m^h \in V_m^h, \tag{5}
$$
where the discrete uniform hierarchically reduced space is
$$
V_m^h = \Big\{ v_m^h(x, y) = \sum_{j=1}^{m} \tilde v_j^h(x)\, \varphi_j(\psi_x(y)), \ \text{with } \tilde v_j^h \in V_{1D}^h,\ x \in \Omega_{1D},\ y \in \gamma_x \Big\} \subset V_m, \tag{6}
$$
the last inclusion being guaranteed by the conformity assumption on $V_{1D}^h$.
In [22] it is proved that the uniform global error $e_m^h = u - u_m^h \in V$ (which includes
both the model $(u - u_m)$ and the discretization $(u_m - u_m^h)$ error contributions) vanishes
for $m \to \infty$ and $h \to 0$. Some results are available in the literature concerning the
rate of convergence of $e_m^h$ (we refer, e.g., to [8, 9, 15]). Moreover, a numerical
convergence study of the discrete Hi-Mod formulation (5) is available in Sect. 3.4.1
of [22]. For sufficiently smooth functions, we recover the convergence rate expected
from Theorems 2.1 and 3.2 in [8] for $e_m^h$, namely quadratic for the $L^2$ norm and
linear for the $H^1$ norm, with respect to both $m^{-1}$ and $h$.
From a computational viewpoint, formulation (5) leads to solving a system of
coupled 1D problems instead of the full $d$-dimensional problem. This is expected
to be computationally advantageous, in particular if the full problem is
three-dimensional and for a modal index $m$ small enough. To detail the effects of a
Hi-Mod reduction, let us exemplify the uniform Hi-Mod procedure on the standard
Poisson problem completed with full homogeneous Dirichlet boundary conditions.
For the sake of simplicity, we focus on the 2D case. The full space $V$ coincides with
$H_0^1(\Omega)$, while $V_{1D} = H_0^1(\Omega_{1D})$. Moreover, the modal functions $\varphi_j$ vanish on $\partial\hat\gamma_1$.
The bilinear and linear forms in (2) are given by $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx\,dy$ and
$F(v) = \int_\Omega f v \, dx\,dy$, respectively, with $f \in L^2(\Omega)$. Now, we first consider the
modal representation $u_m^h(x, y) = \sum_{j=1}^{m} \tilde u_j^h(x)\, \varphi_j(\psi_x(y))$ for the Hi-Mod
approximation $u_m^h$; then, we expand each modal coefficient $\tilde u_j^h$ in terms of the finite element
basis $\{\vartheta_i\}_{i=1}^{N_h}$ as $\tilde u_j^h(x) = \sum_{i=1}^{N_h} \tilde u_{j,i}\, \vartheta_i(x)$ to get the global expansion
$$
u_m^h(x, y) = \sum_{j=1}^{m} \sum_{i=1}^{N_h} \tilde u_{j,i}\, \vartheta_i(x)\, \varphi_j(\psi_x(y)). \tag{7}
$$
Thus, the actual unknowns are the coefficients $\tilde u_{j,i}$, for $j = 1, \ldots, m$ and
$i = 1, \ldots, N_h$. By plugging this representation into (5) after choosing as test function
$v_m^h(x, y) = \varphi_k(\psi_x(y))\, \vartheta_l(x)$, we obtain the following system of 1D coupled
problems: for $j = 1, \ldots, m$ and $i = 1, \ldots, N_h$, find the coefficients $\tilde u_{j,i}$ such that, for
any $k = 1, \ldots, m$ and $l = 1, \ldots, N_h$, it holds
$$
\sum_{j=1}^{m} \sum_{i=1}^{N_h} \bigg[ \int_{\Omega_{1D}} \Big( \hat r_{kj}^{1,1}(x) \frac{d\vartheta_i}{dx}(x) \frac{d\vartheta_l}{dx}(x)
+ \hat r_{kj}^{1,0}(x) \frac{d\vartheta_i}{dx}(x)\, \vartheta_l(x)
+ \hat r_{kj}^{0,1}(x)\, \vartheta_i(x) \frac{d\vartheta_l}{dx}(x)
+ \hat r_{kj}^{0,0}(x)\, \vartheta_i(x)\, \vartheta_l(x) \Big) dx \bigg] \tilde u_{j,i}
= \int_{\Omega_{1D}} \hat f_k(x)\, \vartheta_l(x)\, dx, \tag{8}
$$
where $\hat f_k(x) = \int_{\hat\gamma_1} f\big(x, \psi_x^{-1}(\hat y)\big)\, \varphi_k(\hat y)\, \big| D_2^{-1}\big(x, \psi_x^{-1}(\hat y)\big) \big| \, d\hat y$, while we have that
$\hat r_{kj}^{s,t}(x) = \int_{\hat\gamma_1} r_{kj}^{s,t}(x, \hat y)\, \big| D_2^{-1}\big(x, \psi_x^{-1}(\hat y)\big) \big| \, d\hat y$, for $s, t = 0, 1$, with
$$
\begin{aligned}
r_{kj}^{1,1}(x, \hat y) &= \varphi_j(\hat y)\, \varphi_k(\hat y), \\
r_{kj}^{1,0}(x, \hat y) &= \varphi_j(\hat y)\, \varphi_k'(\hat y)\, D_1\big(x, \psi_x^{-1}(\hat y)\big), \\
r_{kj}^{0,1}(x, \hat y) &= \varphi_j'(\hat y)\, \varphi_k(\hat y)\, D_1\big(x, \psi_x^{-1}(\hat y)\big), \\
r_{kj}^{0,0}(x, \hat y) &= \varphi_j'(\hat y)\, \varphi_k'(\hat y) \Big[ D_1^2\big(x, \psi_x^{-1}(\hat y)\big) + D_2^2\big(x, \psi_x^{-1}(\hat y)\big) \Big].
\end{aligned} \tag{9}
$$
The quantities $\hat r_{kj}^{s,t}$ collect the transverse contributions. Notice that, via the map $\psi_x$,
the reduced system is solved on the reference domain $\hat\Omega$. Coefficients in (9) simplify
when the map $\psi_x$ is linear, since $D_2(z)$ reduces to $L(x)^{-1}$.
Analogous computations can be performed, in a straightforward way, starting
from a full three-dimensional model and for a generic second-order elliptic problem.
We refer the reader, for instance, to [11] for the detailed computations associated
with the reduction of a full 2D advection-diffusion-reaction problem.
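The assembly of system (8) can be sketched for the simplest configuration. The example assumes a unit square with constant fiber length (so that $D_1 = 0$ and $D_2 = 1/L$), a sine modal basis, linear finite elements on a uniform 1D mesh, and a manufactured solution $u = \sin(\pi x)\sin(\pi y)$. All of these are illustrative choices, not the setting of the chapter's experiments.

```python
import numpy as np

Ly = 1.0                      # constant fiber length, so D1 = 0 and D2 = 1/Ly
m, n_el = 3, 50               # number of modes and of 1D finite elements
h = 1.0 / n_el
x = np.linspace(0.0, 1.0, n_el + 1)[1:-1]     # interior nodes of the 1D mesh
Nh = x.size

# Gauss-Legendre quadrature on the reference fiber (0, 1).
g, w = np.polynomial.legendre.leggauss(32)
yq, wq = 0.5 * (g + 1.0), 0.5 * w

j = np.arange(1, m + 1)[:, None]
phi = np.sqrt(2.0) * np.sin(j * np.pi * yq)               # phi_j(yq), shape (m, nq)
dphi = np.sqrt(2.0) * j * np.pi * np.cos(j * np.pi * yq)  # phi_j'(yq)

# Transverse coefficients of (9) with D1 = 0: only r^{1,1} and r^{0,0} survive.
R11 = Ly * (phi * wq) @ phi.T                  # |D2^{-1}| * int phi_j phi_k
R00 = (1.0 / Ly) * (dphi * wq) @ dphi.T        # |D2^{-1}| * D2^2 * int phi_j' phi_k'

# 1D linear finite element stiffness and mass matrices (homogeneous Dirichlet).
I, E = np.eye(Nh), np.eye(Nh, k=1) + np.eye(Nh, k=-1)
K = (2.0 * I - E) / h
M = h * (4.0 * I + E) / 6.0

# Block matrix: block (k, j) couples modes k and j, each block is Nh x Nh.
A = np.kron(R11, K) + np.kron(R00, M)

# Right-hand side for the manufactured solution; f_hat_k is interpolated
# at the nodes before integration against the finite element basis.
f = lambda xx, yy: 2.0 * np.pi**2 * np.sin(np.pi * xx) * np.sin(np.pi * yy)
f_hat = Ly * (f(x[None, :, None], Ly * yq[None, None, :]) * phi[:, None, :] * wq).sum(axis=2)
b = (f_hat @ M).ravel()       # M is symmetric: row k holds int f_hat_k * theta_l

U = np.linalg.solve(A, b).reshape(m, Nh)       # U[j-1, i-1] approximates u_{j,i}
exact_mode1 = np.sin(np.pi * x) / np.sqrt(2.0)
himod_error = np.max(np.abs(U[0] - exact_mode1))
```

With a constant fiber length the modes decouple and the block matrix is block diagonal. A varying $L(x)$ would activate the $\hat r_{kj}^{1,0}$ and $\hat r_{kj}^{0,1}$ terms and fill the off-diagonal blocks.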
Remark 4 System (8) shows that a full purely diffusive problem yields lower-order
contributions in the reduced framework. However, the first-order terms yielded by the
reduction procedure are always weighted by the diffusive coefficient. Consequently,
possible instabilities due to a dominant advection or reaction are in general absent
provided that the deformation indices $D_1(z)$ and $D_2(z)$ are small enough.
Fig. 3 Sketch of the linear system associated with a uniform Hi-Mod reduction, for $m = 4$ and
$N_h = 10$, with $[f_k]_l = F(\vartheta_l \varphi_k)$ and $\tilde u_k^h$ the modal coefficients of $u_m^h$, for $k = 1, \ldots, 4$ and
$l = 1, \ldots, 10$
From an algebraic viewpoint, the discrete uniform Hi-Mod formulation leads to
solving a linear system characterized by an $m N_h \times m N_h$ block matrix $A$. With reference
to system (8), we distinguish in $A$ a macrostructure associated with the modes and
identified by the indices $k$ and $j$, which run on the block rows and block columns,
respectively; on the other hand, the indices $l$ and $i$ are related to the finite element
basis and span the rows and columns of each block, respectively. Each $N_h \times N_h$ block
preserves the sparsity pattern typical of the adopted finite element discretization.
This can be advantageously exploited both in storing and solving the associated
system. In particular, the common sparsity pattern can be stored once and for all. In
Fig. 3 we show an example of Hi-Mod sparsity pattern for $m = 4$ and $N_h = 10$ for
linear finite elements. Blockwise, we recognize the standard tridiagonal pattern.
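The block layout just described can be made concrete with a small sketch for $m = 4$ and $N_h = 10$. The mode-major ordering of the unknowns and the helper `global_index` are assumptions made for illustration. The pattern assumes that every block is structurally tridiagonal, as for linear finite elements.

```python
import numpy as np

m, Nh = 4, 10

# Sparsity pattern of one block for linear finite elements: tridiagonal.
idx = np.arange(Nh)
block = (np.abs(np.subtract.outer(idx, idx)) <= 1).astype(int)

# All m x m blocks share the same pattern, so it can be stored once.
pattern = np.kron(np.ones((m, m), dtype=int), block)

def global_index(mode, node):
    """Row (or column) of the unknown for a 1-based mode and node index."""
    return (mode - 1) * Nh + (node - 1)

nnz = int(pattern.sum())      # m^2 * (3 * Nh - 2) structurally nonzero entries
```

Storing `block` once and the $m \times m$ macrostructure separately avoids keeping $m^2$ copies of the same index arrays.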
3.1.2 A Numerical Example
We provide a numerical example to assess the performance of the uniform Hi-Mod
reduction. We solve on the domain $\Omega = (0, 7) \times (0, 2)$ the standard advection-diffusion
problem $-\Delta u + \mathbf{b} \cdot \nabla u = f$, with $\mathbf{b} = (-20, 0)^T$ a backward field and
$f(x, y) = 50\, \chi_{D_1 \cup D_2}(x, y)$, where $D_1 = \{(x, y) : (x - 4.2)^2 + (y - 1.5)^2 \le 0.04\}$
and $D_2 = \{(x, y) : (x - 3.7)^2 + (y - 0.5)^2 < 0.03\}$ are two misaligned circular
regions with different radii. Homogeneous Dirichlet boundary conditions are
enforced on the whole boundary $\partial\Omega$. Figure 4, top displays the contour plots of the full
solution computed via a standard 2D linear finite element discretization on a uniform
unstructured grid with 31,264 elements. Due to the strong advective field and the assigned
boundary conditions, the solution is basically flat for $x > 4.5$, while it exhibits large variations
in the leftmost part of the domain with boundary layers in correspondence with
$\{(x, 0) \text{ for } 0 \le x \le 3.5\}$ and $\{(x, 2) \text{ for } 0 \le x \le 4\}$ and, more significantly, along
$\{(0, y) \text{ for } 0 \le y \le 2\}$. In Fig. 4, bottom we display the uniform Hi-Mod solution associated
with $m = 9$ modal functions and with a 1D mesh $\mathcal{T}_h$ of uniform size $h = 0.01$.
The full and the reduced solutions are in good agreement. Notice that
we do not resort to any stabilization scheme: the choice made for $h$ guarantees that
the local Péclet number corresponding to the advective field $\mathbf{b}$ is strictly less than
one. However, the actual advective term in the Hi-Mod reduced formulation also
depends on $\mathcal{D}_1(\mathbf{z})$ (see [11]). This last contribution could make the chosen $h$ locally
insufficient to ensure the stability of the discretization scheme. This could explain
the negative values of the reduced solution in the darker blue areas near the two
sources (see Fig. 4, bottom), in contrast to the minimum value zero assumed by the
full solution.

Fig. 4 Full solution (top); uniform Hi-Mod solution $u_9^h$ (bottom)
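The data of this test case and the Péclet check can be written down in a few lines. This is a minimal sketch under stated assumptions: unit diffusivity, the advective field taken as (-20, 0) (i.e., pointing in the negative x direction, consistently with the adjective "backward"), and the common definition Pe = |b| h / (2 mu) of the local Péclet number. The names `forcing` and `local_peclet` are hypothetical.

```python
import numpy as np

B = np.array([-20.0, 0.0])   # assumed backward advective field
MU = 1.0                     # assumed unit diffusivity

def forcing(x, y):
    """Source term: 50 on the two circular regions D1 and D2, 0 elsewhere."""
    in_d1 = (x - 4.2) ** 2 + (y - 1.5) ** 2 <= 0.04
    in_d2 = (x - 3.7) ** 2 + (y - 0.5) ** 2 < 0.03
    return 50.0 * (in_d1 | in_d2)

def local_peclet(h, b=B, mu=MU):
    """Local Peclet number |b| h / (2 mu) for a 1D mesh of size h."""
    return np.linalg.norm(b) * h / (2.0 * mu)

pe = local_peclet(0.01)      # well below one for h = 0.01
```

With h = 0.01 the value is far below one, which is consistent with the absence of stabilization; the check does not account for the extra map-dependent contribution to the reduced advective term mentioned above.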
3.2 The Piecewise Hi-Mod Reduction
Figure 4, bottom clearly shows the main limit of a uniform Hi-Mod reduction. To
accurately approximate a full solution with local strong transverse components, we
need to employ a large number of modes over the whole domain, i.e., also where the
transverse dynamics are not relevant. This implies a waste in terms of computational
cost. The piecewise Hi-Mod formulation aims at improving the computational efficiency
of a Hi-Mod reduction by employing a different number of modes in different
parts of $\Omega$. Large values are associated with the zones where the transverse dynamics
are important, while small values are selected where the 1D behavior is dominant.
As a consequence, the modal index $m$ becomes a vector, called modal multi-index,
which collects the number of modes used in the different portions of $\Omega$. This justifies
the name of the approach.

Fig. 5 Example of 2D partition $\mathcal{T}_\Omega$, for $s = 3$
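In order to formulate the piecewise counterpart of (4), we need to introduce a
number of definitions. To simplify the discussion, we focus on the 2D case and we
assume that, via some criterion, we have identified three areas $\Omega_1$, $\Omega_2$ and $\Omega_3$ in $\Omega$
where a different number $m_i$ of modal functions, for $i = 1, 2, 3$, is employed. In particular,
we denote by $\Sigma_1$ and $\Sigma_2$ the interfaces between $\Omega_1$ and $\Omega_2$ and between $\Omega_2$ and $\Omega_3$,
respectively, and by $\Omega_{1D,i} = \Omega_{1D} \cap \Omega_i = (\sigma_{i-1}, \sigma_i)$ the subinterval of $\Omega_{1D}$ associated
with the subdomain $\Omega_i$, so that
\[
\Omega_i = \bigcup_{x \in \Omega_{1D,i}} \{x\} \times \gamma_x, \qquad
\bigcup_{i=1}^{3} \overline{\Omega}_{1D,i} = \overline{\Omega}_{1D}, \qquad
\Omega_{1D,i} \cap \Omega_{1D,i'} = \emptyset
\]
for $i \ne i'$ and $i, i' = 1, 2, 3$, and where $\sigma_0 \equiv x_0$, $\sigma_3 \equiv x_1$ (see Fig. 5). Finally, we
introduce the two-dimensional broken Sobolev space $H^1(\Omega, \mathcal{T}_\Omega)$ associated with
the partition $\mathcal{T}_\Omega = \{\Omega_i\}_{i=1}^{3}$ of $\Omega$, properly modified according to the boundary
conditions assigned on $\partial\Omega$ [17]. The inclusion $V \subset H^1(\Omega, \mathcal{T}_\Omega)$ holds. We can
now define the piecewise hierarchically reduced space
\[
\begin{aligned}
V_{\mathbf{m}}(\mathcal{T}_\Omega) = \Big\{ & v_{\mathbf{m}} \in L^2(\Omega) :\ v_{\mathbf{m}}|_{\Omega_i}(x, y) = \sum_{j=1}^{m_i} \tilde{v}_j^{\,i}(x)\, \varphi_j(\psi_x(y)),\ \forall i = 1, 2, 3, \\
& \text{with } \tilde{v}_j^{\,i} \in H^1(\Omega_{1D,i});\quad \forall j = 1, \ldots, m^p \text{ with } p = 1, 2, \\
& \int_{\hat{\gamma}_1} \Big( v_{\mathbf{m}}|_{\Omega_{p+1}}\big(\sigma_p, \psi_{\sigma_p}^{-1}(\hat{y})\big) - v_{\mathbf{m}}|_{\Omega_p}\big(\sigma_p, \psi_{\sigma_p}^{-1}(\hat{y})\big) \Big)\, \varphi_j(\hat{y})\, d\hat{y} = 0 \Big\},
\end{aligned}
\tag{10}
\]
with $\mathbf{m} = \{m_i\}_{i=1}^{3} \in [\mathbb{N}^+]^3$ a given modal multi-index and $m^p = \min(m_p, m_{p+1})$.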
The reduced space $V_{\mathbf{m}}(\mathcal{T}_\Omega)$ is a subset of $H^1(\Omega, \mathcal{T}_\Omega)$. The ingredients used to
define the space $V_{\mathbf{m}}(\mathcal{T}_\Omega)$ are essentially the same as those characterizing the uniform space $V_m$,
i.e., a one-dimensional space to describe the main stream and a modal basis for
the transverse dynamics, even though now the 1D space has to be localized to the
different subdomains $\Omega_i$. The interface condition in (10) weakly enforces the continuity
of the minimum number $m_{\min}$ of transverse modes in the whole $\Omega$. Different
strategies can be pursued to enforce this condition. In [11] we resort to an iterative
substructuring Dirichlet/Neumann method (see, e.g., [26, 27]). Alternatively,
different domain decomposition algorithms as well as techniques based on an appropriate
definition of the reduced space can be pursued, as, for instance, in [2, 4, 30].
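The piecewise Hi-Mod formulation can thus be stated: given a modal multi-index
$\mathbf{m} \in [\mathbb{N}^+]^3$,
\[
\text{find } u_{\mathbf{m}} \in V_{\mathbf{m}}(\mathcal{T}_\Omega) :\quad a_{\mathcal{T}_\Omega}(u_{\mathbf{m}}, v_{\mathbf{m}}) = F_{\mathcal{T}_\Omega}(v_{\mathbf{m}}) \quad \forall v_{\mathbf{m}} \in V_{\mathbf{m}}(\mathcal{T}_\Omega),
\tag{11}
\]
with $a_{\mathcal{T}_\Omega}(u_{\mathbf{m}}, v_{\mathbf{m}}) = \sum_{i=1}^{3} a_i(u_{\mathbf{m}}|_{\Omega_i}, v_{\mathbf{m}}|_{\Omega_i})$ and $F_{\mathcal{T}_\Omega}(v_{\mathbf{m}}) = \sum_{i=1}^{3} F_i(v_{\mathbf{m}}|_{\Omega_i})$,
where $a_i(\cdot, \cdot)$ and $F_i(\cdot)$ are the restrictions to the subdomain $\Omega_i$ of the bilinear
and linear forms in (2), respectively, for $i = 1, 2, 3$. We remark that the piecewise
Hi-Mod formulation is well-posed in $V_{\mathbf{m}}(\mathcal{T}_\Omega)$ with respect to the broken energy
norm $\|v_{\mathbf{m}}\|_{\mathcal{T}_\Omega} = \big( \sum_{i=1}^{3} \|v_{\mathbf{m}}|_{\Omega_i}\|^2_{H^1(\Omega_i)} \big)^{1/2}$ [17]. As a trivial subcase, formulation
(11) admits the uniform Hi-Mod reduction when $\mathbf{m} = (\tilde{m}, \tilde{m}, \tilde{m})^T$ for a certain
$\tilde{m} \in \mathbb{N}^+$. The weak imposition of the continuity does not necessarily guarantee an
$H^1$-conforming approximation $u_{\mathbf{m}}$ to the full solution $u$ in (2): the continuity across
the interfaces of both the trace and the flux of $u_{\mathbf{m}}$ is ensured for the first $m_{\min}$ modal components
only. A globally conforming approximation is obtained only if $m_i \ge m_{i+1}$, for $i = 1, 2$
[22].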
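The conformity condition on the multi-index is easy to check programmatically. The following sketch (function names are hypothetical) computes the number of modes matched at each interface, min(m_p, m_{p+1}), and lists the interfaces where the number of modes increases from left to right, i.e., where a model discontinuity may appear.

```python
def interface_modes(m):
    """Number of modes matched at each interface: min(m_p, m_{p+1})."""
    return [min(a, b) for a, b in zip(m, m[1:])]

def nonconforming_interfaces(m):
    """1-based indices p of the interfaces where m_p < m_{p+1}."""
    return [p for p, (a, b) in enumerate(zip(m, m[1:]), start=1) if a < b]

matched = interface_modes((2, 7, 1))           # modes matched at the two interfaces
jumps = nonconforming_interfaces((2, 7, 1))    # only the first interface is flagged
```

A non-increasing multi-index such as (7, 4, 1) returns an empty list, which corresponds to the globally conforming case.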
The extension of the piecewise Hi-Mod formulation to a generic number $s$ of
subdomains is immediate.
3.2.1 Discrete Piecewise Hi-Mod Reduction
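We now consider the discrete counterpart of formulation (11) by referring to a generic
partition $\mathcal{T}_\Omega = \{\Omega_i\}_{i=1}^{s}$ of $\Omega \subset \mathbb{R}^d$, $d = 2, 3$. We introduce the subdivision
$\mathcal{T}_h^i$ of $\Omega_{1D,i}$ into the subintervals $K_l^i = (x_{l-1}^i, x_l^i)$ of width $h_l^i = x_l^i - x_{l-1}^i$ for
$l = 1, \ldots, n_i$, $i = 1, \ldots, s$ and $n_i \in \mathbb{N}^+$. The discrete piecewise hierarchically
reduced space is
\[
\begin{aligned}
V_{\mathbf{m}}^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}) = \Big\{ & v_{\mathbf{m}}^h \in V_{\mathbf{m}}(\mathcal{T}_\Omega) :\ v_{\mathbf{m}}^h|_{\Omega_i}(x, y) = \sum_{j=1}^{m_i} \tilde{v}_j^{\,i,h}(x)\, \varphi_j(\psi_x(y)), \\
& \forall i = 1, \ldots, s,\ \text{with } \tilde{v}_j^{\,i,h} \in V_{1D}^{i,h} \Big\} \subset V_{\mathbf{m}}(\mathcal{T}_\Omega),
\end{aligned}
\tag{12}
\]
where $V_{1D}^{i,h} \subset H^1(\Omega_{1D,i})$ is a finite element space associated with $\mathcal{T}_h^i$, such that
$\dim(V_{1D}^{i,h}) = N_h^i < +\infty$. A standard density assumption is postulated on the spaces
$V_{1D}^{i,h}$. The discrete piecewise Hi-Mod reduction reads: given a modal multi-index
$\mathbf{m} \in [\mathbb{N}^+]^s$,
\[
\text{find } u_{\mathbf{m}}^h \in V_{\mathbf{m}}^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}) :\quad a_{\mathcal{T}_\Omega}(u_{\mathbf{m}}^h, v_{\mathbf{m}}^h) = F_{\mathcal{T}_\Omega}(v_{\mathbf{m}}^h) \quad \forall v_{\mathbf{m}}^h \in V_{\mathbf{m}}^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}).
\tag{13}
\]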
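The actual unknowns are the modal coefficients $\tilde{u}_j^{\,i,h}$ of $u_{\mathbf{m}}^h$, for $j = 1, \ldots, m_i$ and
$i = 1, \ldots, s$.
Despite the possible non-conformity of the reduced solution, we can state also for
this formulation a Galerkin orthogonality property, simply by subtracting (13) from
(11) for $v_{\mathbf{m}} = v_{\mathbf{m}}^h$, to get
\[
a_{\mathcal{T}_\Omega}(\varepsilon_{\mathbf{m}}^h, v_{\mathbf{m}}^h) = 0 \quad \forall v_{\mathbf{m}}^h \in V_{\mathbf{m}}^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}),
\tag{14}
\]
with $\varepsilon_{\mathbf{m}}^h = u_{\mathbf{m}} - u_{\mathbf{m}}^h$ the piecewise discretization error associated with (13).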
From a computational viewpoint, the discrete piecewise formulation is associated
with the solution of $s$ linear systems of coupled 1D problems, one for each subdomain.
As in Fig. 3, each system is characterized by a sparse $m_i \times m_i$ block matrix
and each $N_h^i \times N_h^i$ block exhibits the sparsity pattern typical of the chosen 1D finite
element approximation. In particular, since we employ an iterative substructuring
scheme to impose the interface condition between the subdomains, we solve the $s$
linear systems at each iteration of the Dirichlet/Neumann algorithm. Of course, the
factorization of the corresponding matrices is computed and stored once and for all at the first
iteration. Particular attention has to be paid to the well-posedness of each subproblem,
which means properly assigning the boundary conditions on each $\partial\Omega_i$.
An a priori analysis for the piecewise global error $e_{\mathbf{m}}^h = u - u_{\mathbf{m}}^h = e_{\mathbf{m}} + \varepsilon_{\mathbf{m}}^h \in
H^1(\Omega, \mathcal{T}_\Omega)$, with $e_{\mathbf{m}} = u - u_{\mathbf{m}}$ the piecewise modeling error and $\varepsilon_{\mathbf{m}}^h$ defined as in
(14), can be found in [22], under the assumption that the Dirichlet/Neumann scheme
converges. In particular, as previously pointed out, if $\mathbf{m}$ is a strictly decreasing
multi-index, i.e., $m_i > m_{i'}$ for any $i, i' = 1, \ldots, s$ with $i' > i$, a conforming approximation
$u_{\mathbf{m}}^h$ is guaranteed and the error can be bounded via the standard Céa lemma. On the
other hand, for an increasing multi-index $\mathbf{m}$, we get an error bound consisting of the
usual best approximation term plus an additional correction due to the high-frequency
components near the interfaces, which are responsible for the nonconformity (see
[22, Sect. 4.2.2] for a detailed discussion).
We apply the piecewise Hi-Mod reduction to the same test case as in Sect. 3.1.2.
By exploiting the intrinsic heterogeneity of the full solution $u$, we divide $\Omega$ into
three subdomains $\Omega_1 = (0, 2.5) \times (0, 2)$, $\Omega_2 = (2.5, 4.5) \times (0, 2)$ and $\Omega_3 =
(4.5, 7) \times (0, 2)$ and we resort to $m_1 = 2$, $m_2 = 7$ and $m_3 = 1$ modal functions,
respectively. This choice is suggested by the behaviour of the full solution itself,
since the most complex dynamics take place in the central part of $\Omega$, while the
solution $u$ is essentially flat in the rightmost part of the domain. At both the interfaces
$\Sigma_1 = \{2.5\} \times (0, 2)$ and $\Sigma_2 = \{4.5\} \times (0, 2)$ we apply a Dirichlet/Neumann scheme
with relaxation (i.e., Dirichlet conditions are prescribed when solving the leftmost
subdomain, Neumann conditions for the rightmost one), by setting the relaxation
parameter to 0.5. This choice guarantees the well-posedness of each problem in $\Omega_i$
since the field $\mathbf{b}$ is backward. Notice that, in the presence of a forward advective
field, Neumann/Dirichlet conditions have to be imposed at both $\Sigma_1$ and $\Sigma_2$ to get
well-posed subproblems (see Sect. 4.2.2 for an example).

Fig. 6 Piecewise Hi-Mod solution $u^h_{2,7,1}$ at the second (top), fourth (middle) and seventh (bottom)
iteration of the domain decomposition approach for $h = 0.01$
Concerning the parameters characterizing the domain decomposition scheme,
we set both the initial guesses at the interfaces identically equal to zero and we
fix the convergence tolerance for the relative errors to 0.01. Finally, the same
discretization step $h = 0.01$ is employed along all the three subintervals $\Omega_{1D,1} =
(0, 2.5)$, $\Omega_{1D,2} = (2.5, 4.5)$, $\Omega_{1D,3} = (4.5, 7)$.
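To make the mechanics of a relaxed Dirichlet/Neumann iteration concrete, here is a deliberately simplified sketch. It is not the Hi-Mod solver: it treats the 1D pure-diffusion toy problem -u'' = 1 on (0, 1) with homogeneous Dirichlet data, split at an interface x = gamma, and solves the two subproblems in closed form. The left subproblem receives the Dirichlet value, the right one the Neumann flux; the zero initial guess, the relaxation parameter 0.5 and the relative tolerance 0.01 mimic the choices above. All names are hypothetical.

```python
def dirichlet_neumann_1d(gamma=0.4, theta=0.5, tol=1e-2, max_iter=50):
    """Relaxed Dirichlet/Neumann iteration for -u'' = 1, u(0) = u(1) = 0."""
    lam = 0.0                                   # initial interface value
    for k in range(1, max_iter + 1):
        # Left subdomain (Dirichlet at the interface): u1 = -x^2/2 + c*x, u1(gamma) = lam.
        c = (lam + gamma ** 2 / 2) / gamma
        flux = c - gamma                        # u1'(gamma), passed to the right
        # Right subdomain (Neumann at the interface): u2 = -x^2/2 + a*x + b, u2(1) = 0.
        a = flux + gamma
        b = 0.5 - a
        trace = -gamma ** 2 / 2 + a * gamma + b  # u2(gamma)
        # Relaxation step on the interface value.
        lam_new = theta * trace + (1 - theta) * lam
        if abs(lam_new - lam) <= tol * max(abs(lam_new), 1e-14):
            return lam_new, k
        lam = lam_new
    return lam, max_iter

lam, n_iter = dirichlet_neumann_1d()
exact_interface_value = 0.4 * (1 - 0.4) / 2     # exact solution u = x(1 - x)/2
```

In this toy setting the interface value converges to the exact one within a handful of iterations; with advection, the assignment of the Dirichlet and Neumann sides must follow the direction of the field, as discussed in the text.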
With the chosen data, the Dirichlet/Neumann scheme converges after seven
iterations. Figure 6 collects the contour plots of the piecewise Hi-Mod solution associated
with the second, the fourth and the last iteration. For the sake of comparison,
we employ the same colour map used for the uniform case in Fig. 4. A considerable
model discontinuity can be observed at $x = 2.5$. The model discontinuity is due to the
fact that $m_1 < m_2$. Moreover, the first interface $\Sigma_1$ is located in an area of $\Omega$ where
the transverse dynamics are important. This makes the change in
the number of modes even more relevant. On the contrary, no model discontinuity appears at $x = 4.5$,
since the solution is completely flat there and $m_2 > m_3$. As already remarked in
[22], it can be verified that a finer discretization step does not significantly reduce
the model jump.
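A rough count of the unknowns helps to quantify the saving of the piecewise approach in this test case. The sketch below is only a back-of-the-envelope estimate under stated assumptions: linear finite elements, interface nodes counted in both neighbouring subdomains, and boundary conditions ignored. The helper names are hypothetical.

```python
def n_nodes(a, b, h):
    """Number of nodes of a uniform 1D grid of size h on (a, b)."""
    return round((b - a) / h) + 1

def modal_dofs(breakpoints, m, h):
    """Total number of modal coefficients: sum over subdomains of m_i * (nodes in subdomain i)."""
    return sum(mi * n_nodes(a, b, h)
               for mi, a, b in zip(m, breakpoints, breakpoints[1:]))

piecewise = modal_dofs([0.0, 2.5, 4.5, 7.0], (2, 7, 1), h=0.01)
uniform = modal_dofs([0.0, 7.0], (9,), h=0.01)
```

The piecewise count is roughly one third of the uniform one with nine modes; the comparison of actual costs should also account for the seven Dirichlet/Neumann iterations, each requiring one solve per subdomain.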
3.3 The Pointwise Hi-Mod Reduction
As shown in the previous section, the piecewise Hi-Mod reduction may lead to a
computational improvement with respect to the uniform approach when dealing with
phenomena with significant transverse dynamics localized in a certain portion of the
domain. In this perspective, the best computational advantage is attained when one
can calibrate the subdomain with the largest modal index exactly to fit the region with
significant transverse dynamics. This is in some sense the spirit behind a pointwise
Hi-Mod reduction [24]. In this case, the modal functions are pointwise-tuned, which
justifies the name assigned to this method. Now, the modes are associated with the
nodes of the finite element partition, in contrast to the piecewise approach where each
subdomain $\Omega_i$ shares the same number of modal functions. Due to the mode-node
association, the pointwise Hi-Mod reduction makes sense only in a discrete context.
To set up the pointwise formulation, we start from the discrete modal expansion
(7), which we rewrite as
\[
u_m^h(x, y) = \sum_{i=1}^{N_h} \sum_{j=1}^{m} \tilde{u}_{j,i}\, \varphi_j(\psi_x(y))\, \vartheta_i(x).
\tag{15}
\]
We remark that, in this expansion, the leading role is taken by the sum over the finite
element nodes, while in (7) it is taken by the sum over the modes. Inspired by representation (15),
we introduce a new definition for the discrete Hi-Mod space, where we allow the
number $m$ of the modal basis functions to vary on each finite element node $x_i$. Thus,
the discrete pointwise hierarchically reduced space is
\[
V_{\mathbf{M}}^h = \Big\{ v_{\mathbf{M}}^h(x, y) = \sum_{i=1}^{N_h} \sum_{j=1}^{m_i^N} \tilde{v}_{j,i}\, \varphi_j(\psi_x(y))\, \vartheta_i(x), \ \text{with } x \in \Omega_{1D},\ y \in \gamma_x \Big\},
\tag{16}
\]
where $\mathbf{M} = \{m_i^N\}_{i=1}^{N_h} \in [\mathbb{N}^+]^{N_h}$ is the modal multi-index collecting the number of
modes for each finite element node.
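The nodewise expansion can be evaluated directly once a modal basis and a finite element basis are fixed. The sketch below makes several assumptions that are not stated in this section: a rectangular domain of given height (so that the map to the reference fiber is simply y / height), sinusoidal modes vanishing at both ends of the fiber, and linear hat functions on a 1D grid. All function names are hypothetical.

```python
import numpy as np

def hat(i, x, nodes):
    """Linear hat function attached to node i of the 1D grid `nodes`."""
    return np.interp(x, nodes, np.eye(len(nodes))[i])

def sine_mode(j, y_hat):
    """j-th mode (j >= 1) on the reference fiber (0, 1), zero at both ends (assumed basis)."""
    return np.sqrt(2.0) * np.sin(j * np.pi * y_hat)

def evaluate_pointwise(coeffs, nodes, x, y, height):
    """Evaluate the nodewise expansion; coeffs[i] lists the modal coefficients of node i."""
    y_hat = y / height                  # assumed map onto the reference fiber
    total = 0.0
    for i, c_i in enumerate(coeffs):    # outer sum over the finite element nodes
        modal = sum(c * sine_mode(j, y_hat) for j, c in enumerate(c_i, start=1))
        total += modal * hat(i, x, nodes)
    return total

nodes = np.linspace(0.0, 7.0, 8)
coeffs = [[0.0] for _ in nodes]
coeffs[3] = [1.0, 0.5]                  # two modes switched on at node x = 3 only
value = evaluate_pointwise(coeffs, nodes, x=3.0, y=1.0, height=2.0)
```

Storing the coefficients as a ragged list, one short list per node, mirrors the fact that the number of modes is a nodal attribute.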
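The pointwise Hi-Mod formulation is given by: for a certain modal multi-index
$\mathbf{M} \in [\mathbb{N}^+]^{N_h}$,
\[
\text{find } u_{\mathbf{M}}^h \in V_{\mathbf{M}}^h :\quad a(u_{\mathbf{M}}^h, v_{\mathbf{M}}^h) = F(v_{\mathbf{M}}^h) \quad \forall v_{\mathbf{M}}^h \in V_{\mathbf{M}}^h,
\tag{17}
\]
where $a(\cdot, \cdot)$ and $F(\cdot)$ coincide with the bilinear and linear forms in (2). From
definition (16), it follows immediately that the nodewise Hi-Mod solution $u_{\mathbf{M}}^h$ is
always $H^1$-conforming in $\Omega$, in contrast to the piecewise approach.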
From an algebraic viewpoint, we have to solve a linear system whose coefficient
matrix has a structure similar to that of the uniform case (with $m = \max_i m_i^N$), except
for the fact that the rows and the columns associated with the finite element nodes
$x_i$ such that $m_i^N < \max_i m_i^N$ are deleted. This leads to a system of reduced size with
respect to the uniform approach, with a consequent computational saving. A further
significant improvement in terms of computational costs is due to the fact that the
nodewise formulation completely relieves us from using a domain decomposition
scheme in the presence of a different number of modal functions in $\Omega$. No iterative
procedure is now required to get the reduced solution. Nevertheless, a domain
decomposition scheme could still be advantageously exploited to deal with complex
geometries, such as bifurcations, not taken into account by the standard setting in
Sect. 2.

Fig. 7 Pointwise Hi-Mod solution (left); corresponding modal distribution (right)
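The deletion of rows and columns can be sketched with NumPy index arrays. The example assumes that the unknowns of the uniform system are ordered mode by mode (all nodes of mode 1, then all nodes of mode 2, and so on), and it reads the statement above as: at node i, only the rows and columns of the modes beyond the nodal value are removed. Both the ordering and the function names are assumptions made for illustration.

```python
import numpy as np

def active_indices(modes_per_node):
    """Indices of the uniform system kept by the pointwise reduction."""
    M = np.asarray(modes_per_node)
    m_max = int(M.max())
    # keep[j-1, i] is True when mode j is switched on at node i.
    keep = np.arange(1, m_max + 1)[:, None] <= M[None, :]
    return np.flatnonzero(keep.ravel())     # mode-major ordering of the unknowns

def reduce_system(A_uniform, rhs_uniform, modes_per_node):
    """Extract the pointwise system from the uniform one with max(M) modes."""
    idx = active_indices(modes_per_node)
    return A_uniform[np.ix_(idx, idx)], rhs_uniform[idx]

M = [1, 3, 2]                                # three nodes, at most three modes
A = np.arange(81, dtype=float).reshape(9, 9)
b = np.arange(9, dtype=float)
A_red, b_red = reduce_system(A, b, M)
```

The size of the reduced system equals the sum of the nodal modal indices, here 6 instead of 9.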
We assess the pointwise Hi-Mod procedure on the same test case used to validate
both the uniform and the piecewise approaches. For this purpose, we introduce a
finite element partition $\mathcal{T}_h$ of the supporting fiber $\Omega_{1D} = (0, 7)$ consisting of 200
equispaced nodes. Then, starting from the results in Figs. 4 and 6, we adopt the
corresponding modal distribution shown in Fig. 7, right. In particular, we employ
seven modes in the area around the two sources, except for a single node, close to
the center of $D_1$, where nine modal functions are used, and for a few nodes in the zone
between the two sources, where only five modes are switched on.
Figure 7, left shows the contour plots of the pointwise Hi-Mod solution $u_{\mathbf{M}}^h$.
It is fully comparable with the full solution in Fig. 4, top, despite the presence of
some negative areas as already detected in the uniform and in the piecewise reduced
solutions. As expected, the solution $u_{\mathbf{M}}^h$ is an $H^1$-conforming approximation, i.e., no
model discontinuity is present.
4 How to Select the Model in the Hierarchy?
An appropriate choice of the modal index $m$ in (5) or of the modal multi-indices $\mathbf{m}$
and $\mathbf{M}$ in (13) and (17), respectively, is certainly the most critical issue of a Hi-Mod
reduction procedure. In this section, we provide two different approaches to select
the number of modal functions in $\Omega$. The first method coincides essentially with a
user-dependent criterion; the second technique is based on an ad hoc theoretical tool
and frees the user from any arbitrary choice.

Fig. 8 Uniform Hi-Mod solutions: $u_1^h$, $u_3^h$, $u_5^h$, $u_7^h$ (top to bottom)
4.1 An a Priori Approach
This approach is the most straightforward and the first strategy that we have pursued
to select the number of modes to be used in [11, 22].
We essentially distinguish the case when the user has no information about the
dynamics of the problem to be reduced from the case when some hints on this
problem are available. If no information is available, we can resort to a trial and
error approach: we start from the computationally cheapest choice for $m$, i.e.,
$m = 1$. Then, we gradually increase this value and we stop when the addition
of a further modal function does not significantly improve the accuracy of the
reduced solution. This is the procedure that we have followed to get the uniform
Hi-Mod solution $u_9^h$ in Fig. 4, bottom. Figure 8 shows the contour plots of some
of the reduced solutions yielded by this trial and error approach. It is evident that
the accuracy of $u_m^h$ increases as $m$ gets larger. The presence of the two localized
sources demands a rather large number of modes overall. While the behaviour of $u$ is
correctly described by a single mode on the rightmost part of $\Omega$, at least five modes
are necessary to recognize the sources at $D_1$ and $D_2$. Solution $u_7^h$ is very close to $u_9^h$
in Fig. 4, bottom. The only difference is a slight reduction of the negative areas for
the choice $m = 9$.

Fig. 9 Piecewise Hi-Mod solution $u^h_{4,7,1}$ at the last iteration of the domain decomposition approach
for $h = 0.01$
When we have some knowledge about the full solution or about some previous
run of the Hi-Mod procedure, we may fix directly the number of modes instead of
gradually varying it. This is the approach used in the piecewise Hi-Mod reduction of
Sect. 3.2.2. Indeed, Fig. 8 suggests that few modes are sufficient in $\Omega_1$ and $\Omega_3$, while
at least 7 modes are required in $\Omega_2$. Of course, other choices are allowed, driven,
e.g., by specific user demands. For instance, when we aim at reducing the model
discontinuity in Fig. 6, bottom, we simply increase the number of modes in $\Omega_1$. If
we employ, e.g., four instead of two modes in $\Omega_1$ while preserving the same values
for all the parameters involved in the domain decomposition scheme, we obtain, after
7 iterations, the Hi-Mod solution in Fig. 9. In this case, the model discontinuity
occurring along the interface $\Sigma_1 = \{2.5\} \times (0, 2)$ is hardly visible.
The pointwise Hi-Mod reduction is the setting where an a priori choice of the
modal multi-index is less immediate. This is essentially due to the large variability
that can be assigned to the modal indices $m_i^N$, which now may vary nodewise. Such
a variability offers great freedom in optimizing the reduced model selection,
provided that the user has a precise idea of the trend of the full solution $u$. Figure 7
provides an example in such a direction.
4.2 An a Posteriori Approach
The main purpose of an a posteriori technique is to devise an automatic procedure
to select the modal distribution in $\Omega$. This automatic choice is performed via a
specific theoretical tool, represented by an a posteriori modeling error estimator.
Following [23], we focus on the piecewise Hi-Mod approach. Moreover, we assume
that the finite element partition is sufficiently fine to neglect the discretization error.
According to a goal-oriented analysis (see, e.g., [5, 14, 16, 20]), we measure the
accuracy of the reduced model via a goal functional $J : H^1(\Omega, \mathcal{T}_\Omega) \to \mathbb{R}$ which
represents a physical quantity of interest to be measured (e.g., mean or local values,
the lift and drag coefficients around bodies in external flows, convective or diffusive
fluxes). In particular, we assume $J$ linear. We estimate the unknown value $J(u)$ via
the computable value $J(u_{\mathbf{m}})$, with $u$ and $u_{\mathbf{m}}$ solutions to (2) and (11), respectively,
with the purpose of keeping the goal error $J(u - u_{\mathbf{m}})$ below a desired tolerance.
The automatic procedure aims at identifying the subdomains $\Omega_i$ and the associated
modal multi-index $\mathbf{m}$ to guarantee such a goal.
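To estimate the goal error, we define the piecewise Hi-Mod dual problem: given
a modal multi-index $\mathbf{m} \in [\mathbb{N}^+]^s$,
\[
\text{find } z_{\mathbf{m}} \in V_{\mathbf{m}}(\mathcal{T}_\Omega) :\quad a_{\mathcal{T}_\Omega}(v_{\mathbf{m}}, z_{\mathbf{m}}) = J(v_{\mathbf{m}}) \quad \forall v_{\mathbf{m}} \in V_{\mathbf{m}}(\mathcal{T}_\Omega),
\tag{18}
\]
where $J(\cdot) = \sum_{i=1}^{s} J_i(\cdot)$ with $J_i = J|_{\Omega_i}$ for $i = 1, \ldots, s$. Moreover, we introduce
an auxiliary hierarchically reduced space $V_{\mathbf{m}^+}(\mathcal{T}_\Omega)$, defined exactly as $V_{\mathbf{m}}(\mathcal{T}_\Omega)$ with
$\mathbf{m}^+ \in [\mathbb{N}^+]^s$ such that $\mathbf{m}^+ > \mathbf{m}$ (i.e., $m_i^+ > m_i$, for any $i = 1, \ldots, s$). The inclusions
$V_{\mathbf{m}}(\mathcal{T}_\Omega) \subset V_{\mathbf{m}^+}(\mathcal{T}_\Omega) \subset H^1(\Omega, \mathcal{T}_\Omega)$ trivially hold, and we refer to the space
$V_{\mathbf{m}^+}(\mathcal{T}_\Omega)$ as the enriched piecewise hierarchically reduced space. Finally, we associate
with $V_{\mathbf{m}^+}(\mathcal{T}_\Omega)$ the enriched piecewise Hi-Mod primal and dual formulations:
given a modal multi-index $\mathbf{m}^+ \in [\mathbb{N}^+]^s$,
\[
\text{find } u_{\mathbf{m}^+} \in V_{\mathbf{m}^+}(\mathcal{T}_\Omega) :\quad a_{\mathcal{T}_\Omega}(u_{\mathbf{m}^+}, v_{\mathbf{m}^+}) = F_{\mathcal{T}_\Omega}(v_{\mathbf{m}^+}) \quad \forall v_{\mathbf{m}^+} \in V_{\mathbf{m}^+}(\mathcal{T}_\Omega),
\tag{19}
\]
and
\[
\text{find } z_{\mathbf{m}^+} \in V_{\mathbf{m}^+}(\mathcal{T}_\Omega) :\quad a_{\mathcal{T}_\Omega}(v_{\mathbf{m}^+}, z_{\mathbf{m}^+}) = J(v_{\mathbf{m}^+}) \quad \forall v_{\mathbf{m}^+} \in V_{\mathbf{m}^+}(\mathcal{T}_\Omega).
\tag{20}
\]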
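We now recall the main proposition in [23], which represents the reference result for
the a posteriori modeling error estimator:
Proposition 1 Let us assume that there exist a positive constant $\beta_{\mathbf{m}} < 1$ and a
modal multi-index $\mathbf{M}_0 \in [\mathbb{N}^+]^s$, such that, for $\mathbf{m}, \mathbf{m}^+ \in [\mathbb{N}^+]^s$ with $\mathbf{m}^+ > \mathbf{m} \ge \mathbf{M}_0$,
\[
|J(e_{\mathbf{m}^+})| \le \beta_{\mathbf{m}}\, |J(e_{\mathbf{m}})|,
\tag{21}
\]
with $e_{\mathbf{m}} = u - u_{\mathbf{m}}$, $e_{\mathbf{m}^+} = u - u_{\mathbf{m}^+} \in H^1(\Omega, \mathcal{T}_\Omega)$ the piecewise modeling errors
associated with formulations (11) and (19), respectively. Then,
\[
\frac{|J(u_{\mathbf{m}^+} - u_{\mathbf{m}})|}{1 + \beta_{\mathbf{m}}} \le |J(e_{\mathbf{m}})| \le \frac{|J(u_{\mathbf{m}^+} - u_{\mathbf{m}})|}{1 - \beta_{\mathbf{m}}}.
\tag{22}
\]
Moreover, the following identity holds:
\[
J(u_{\mathbf{m}^+} - u_{\mathbf{m}}) = a_{\mathcal{T}_\Omega}(u_{\mathbf{m}^+} - u_{\mathbf{m}}, z_{\mathbf{m}^+} - z_{\mathbf{m}}).
\tag{23}
\]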
Relation (23), combined with the two-sided bound (22), leads us to identify the a
posteriori modeling error estimator for the goal error $|J(e_{\mathbf{m}})|$ with the quantity
\[
\eta_{\mathbf{m}\mathbf{m}^+} = |a_{\mathcal{T}_\Omega}(u_{\mathbf{m}^+} - u_{\mathbf{m}}, z_{\mathbf{m}^+} - z_{\mathbf{m}})|.
\tag{24}
\]
The lower and the upper bound in (22) represent the corresponding efficiency and
reliability estimates, respectively. $\eta_{\mathbf{m}\mathbf{m}^+}$ is a goal-oriented hierarchical estimator which
combines the easy computability typical of a hierarchical estimator with the high
versatility proper of a goal functional analysis.
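The two-sided bound (22) follows from the saturation assumption (21) in a few lines. The derivation below is a sketch that uses only the linearity of $J$ and the triangle inequality; $\beta$ denotes the saturation constant in (21), a symbol adopted here for illustration.

```latex
\begin{align*}
J(e_{\mathbf{m}}) &= J(u - u_{\mathbf{m}^+}) + J(u_{\mathbf{m}^+} - u_{\mathbf{m}})
                   = J(e_{\mathbf{m}^+}) + J(u_{\mathbf{m}^+} - u_{\mathbf{m}}), \\
|J(e_{\mathbf{m}})| &\le |J(e_{\mathbf{m}^+})| + |J(u_{\mathbf{m}^+} - u_{\mathbf{m}})|
                     \le \beta\, |J(e_{\mathbf{m}})| + |J(u_{\mathbf{m}^+} - u_{\mathbf{m}})|
  \;\Longrightarrow\; (1 - \beta)\, |J(e_{\mathbf{m}})| \le |J(u_{\mathbf{m}^+} - u_{\mathbf{m}})|, \\
|J(u_{\mathbf{m}^+} - u_{\mathbf{m}})| &\le |J(e_{\mathbf{m}})| + |J(e_{\mathbf{m}^+})|
  \le (1 + \beta)\, |J(e_{\mathbf{m}})|.
\end{align*}
```

The second line gives the upper (reliability) bound and the third line the lower (efficiency) bound; the requirement $\beta < 1$ is what allows the division by $1 - \beta$.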
From a computational viewpoint, to evaluate $\eta_{\mathbf{m}\mathbf{m}^+}$ we replace the piecewise
hierarchically reduced primal and dual solutions with the corresponding discrete
approximations. Thus, via (24), we simply evaluate the quantity
$|(\mathbf{z}_{\mathbf{m}^+}^h - \mathbf{z}_{\mathbf{m}}^h)^T K\, (\mathbf{u}_{\mathbf{m}^+}^h - \mathbf{u}_{\mathbf{m}}^h)|$,
where $K$ is the stiffness matrix associated with the enriched formulation (19),
which is already available and does not need any additional assembly. Alternative
procedures to evaluate the estimator $\eta_{\mathbf{m}\mathbf{m}^+}$ are proposed in [23].
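In algebraic terms the estimator is a single vector-matrix-vector product. The following sketch assumes a single subdomain, a hierarchical modal basis and a mode-by-mode ordering of the unknowns, so that a coarse solution is embedded in the enriched space by zero-filling the coefficients of the extra modes. The helper names are hypothetical and the small matrix is made-up data used only to exercise the code.

```python
import numpy as np

def pad_to_enriched(coarse, m, m_plus, n_h):
    """Embed the coefficients of the m-mode solution into the m_plus-mode space."""
    fine = np.zeros(m_plus * n_h)
    fine[: m * n_h] = coarse            # extra modes get zero coefficients
    return fine

def goal_estimator(K, u_plus, u, z_plus, z):
    """|(z+ - z)^T K (u+ - u)| with K the stiffness matrix of the enriched problem."""
    return abs((z_plus - z) @ K @ (u_plus - u))

# Tiny illustrative data: one node, m = 1 and m_plus = 2.
K = np.array([[2.0, 1.0], [1.0, 3.0]])
u = pad_to_enriched(np.array([1.0]), m=1, m_plus=2, n_h=1)
z = pad_to_enriched(np.array([2.0]), m=1, m_plus=2, n_h=1)
u_plus = np.array([1.0, 0.5])
z_plus = np.array([2.0, 0.25])
eta = goal_estimator(K, u_plus, u, z_plus, z)
```

No extra assembly is involved: only the enriched stiffness matrix and the four coefficient vectors are needed.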
Remark 5 Hypothesis (21) represents a saturation assumption. It is rather usual in
the context of a hierarchical a posteriori analysis for the discretization error associated
with a finite element approximation [1, 6, 10]. Requirement (21) essentially
generalizes a standard saturation hypothesis to the context of a modeling error analysis,
by including the functional used to measure the error. A numerical check of such
a hypothesis is provided in [23, Sect. 3.3].
4.2.1 The Modeling Adaptive Procedure
The adaptive procedure proposed in [23] consists of two phases. The first phase
identifies the number $s$ and the location of the subdomains $\Omega_i$. In the second phase
we find the modal multi-index $\mathbf{m}$ that satisfies the requirement $\eta_{\mathbf{m}\mathbf{m}^+} \le$ TOL, with TOL
a user-defined tolerance.
Let us focus on the first phase. To select the subdomains $\Omega_i$, we employ a thresholding
technique (see Fig. 10, left). We compute the estimator $\eta_{\tilde{m}\tilde{m}^+}$ in a uniform
context, i.e., by employing $\tilde{m}$ and $\tilde{m}^+$ modes on the whole $\Omega$, with $\tilde{m}^+ > \tilde{m}$. We
usually choose small values for both $\tilde{m}$ and $\tilde{m}^+$ to contain the computational costs.
In particular, we compute the estimator normalized by its maximum value, $\bar{\eta}_{\tilde{m}\tilde{m}^+}$,
so that we may assume the estimate to be in the range $[\bar{\eta}^{\min}_{\tilde{m}\tilde{m}^+}, 1]$, with $\bar{\eta}^{\min}_{\tilde{m}\tilde{m}^+}$ the
minimum value assumed by the normalized estimator on $\Omega$. Now, we pick a threshold
$\delta \in (0, 1)$. Then, after introducing a uniform finite element partition $\{K_l\}$ on
$\Omega_{1D}$, we assign the value $\bar{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l}$ to the barycenter of $K_l$. We denote by $\tilde{\sigma}_j$, with
$j = 1, \ldots, \tilde{s}$, the intersections between $\bar{\eta}_{\tilde{m}\tilde{m}^+}$ and $\delta$, and by $\sigma_j$ the corresponding
closest finite element node. The set $\{\sigma_j\}$ identifies the partition $\mathcal{T}_\Omega = \{\Omega_i\}_{i=1}^{s}$, with
$s = \tilde{s} + 1$, $\sigma_0 = x_0$, $\sigma_s = x_1$, and with $\Sigma_j = \{\sigma_j\} \times \gamma_{\sigma_j}$ the interface between $\Omega_j$
and $\Omega_{j+1}$. We highlight that the subdomains $\Omega_i$ are fixed once and for all at the end of
this phase and do not change anymore during the adaptive procedure.
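As a byproduct of the first phase, we also get the initial guess $\mathbf{m}^{(0)} = \{m_i^{(0)}\}_{i=1}^{s} \in
[\mathbb{N}^+]^s$ for the modal multi-index to start the second phase of the modeling adaptive
procedure. We set
\[
m_i^{(0)} =
\begin{cases}
\tilde{m} & \text{if } \bar{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l} < \delta, \ \forall K_l \text{ s.t. } K_l \cap \Omega_{1D,i} \ne \emptyset, \\
\tilde{m} + \mu & \text{if } \bar{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l} \ge \delta, \ \forall K_l \text{ s.t. } K_l \cap \Omega_{1D,i} \ne \emptyset,
\end{cases}
\tag{25}
\]
where $\mu \in \mathbb{N}^+$ denotes the modal update. We define the initial guess $\mathbf{m}^{+(0)}$ for the
multi-index $\mathbf{m}^+$ in an analogous way.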
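The first phase can be sketched as follows. The code assumes a uniform 1D grid, a normalized estimator given as one value per element, and a threshold that is crossed cleanly (the critical situations of Remark 6 are not handled). On a uniform grid the node closest to a crossing located between two consecutive barycenters is the node shared by the two elements, which is what the sketch uses. The function name, the argument names and the sample estimator values are hypothetical.

```python
import numpy as np

def first_phase(eta_bar, nodes, delta, m_tilde, m_tilde_plus, mu):
    """Return the interface points and the initial guesses for m and m+."""
    nodes = np.asarray(nodes, dtype=float)
    above = np.asarray(eta_bar) >= delta         # one flag per element K_l
    # Elements l and l+1 lie on opposite sides of the threshold:
    # the shared node nodes[l+1] becomes an interface point.
    jumps = np.flatnonzero(above[1:] != above[:-1])
    sigma = np.concatenate(([nodes[0]], nodes[jumps + 1], [nodes[-1]]))
    # The flag is constant inside each subdomain: read it on its first element.
    first_elem = np.concatenate(([0], jumps + 1))
    rich = above[first_elem]
    m0 = np.where(rich, m_tilde + mu, m_tilde)
    m0_plus = np.where(rich, m_tilde_plus + mu, m_tilde_plus)
    return sigma.tolist(), m0.tolist(), m0_plus.tolist()

# Made-up normalized estimator on seven elements of (0, 7).
eta_bar = [0.02, 0.05, 0.4, 1.0, 0.3, 0.04, 0.01]
sigma, m0, m0_plus = first_phase(eta_bar, np.linspace(0.0, 7.0, 8),
                                 delta=0.1, m_tilde=3, m_tilde_plus=5, mu=1)
```

With these sample values the procedure detects three subdomains and assigns one extra mode to the central one, where the estimator exceeds the threshold.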
Remark 6 Some critical situations may occur when selecting the threshold $\delta$. If $\delta$ is
less than $\bar{\eta}^{\min}_{\tilde{m}\tilde{m}^+}$, it means that the initial guess for the modal truncation is too coarse.
We consequently increase both $\tilde{m}$ and $\tilde{m}^+$ and recompute the normalized estimator.
The algorithm above also fails when a root of the equation $\bar{\eta}_{\tilde{m}\tilde{m}^+} - \delta = 0$ has
multiplicity strictly greater than one. A possibility to avoid this is to check also
the first derivative of $\bar{\eta}_{\tilde{m}\tilde{m}^+}$ before selecting $\delta$. Finally, if $\bar{\eta}_{\tilde{m}\tilde{m}^+}$ exhibits several
oscillations around a range of values, these values are not eligible as threshold, since
this would lead to identifying too many subdomains, making the numerical procedure
completely ineffective. We refer the interested reader to Remark 4 in [23] for more
details and some examples of such critical situations.
We now move to the second phase of the modeling adaptive procedure. Starting
from the initial guesses $\mathbf{m}^{(0)}$ and $\mathbf{m}^{+(0)}$ defined according to (25), we apply a standard
equidistribution criterion to the subdomains $\Omega_i$, i.e., we demand that $\eta^i_{\mathbf{m}\mathbf{m}^+} = \text{TOL}/s$,
where $\eta^i_{\mathbf{m}\mathbf{m}^+} = \eta_{\mathbf{m}\mathbf{m}^+}|_{\Omega_i}$ denotes the modeling error estimator associated with the
subdomain $\Omega_i$ for $i = 1, \ldots, s$. In particular, to make the adaptive procedure more
efficient, we include both the model refinement and coarsening. The main steps of
the mode adaptivity are summarized in the following pseudocode:
1. set $\mathbf{m} = \mathbf{m}^{(0)}$, $\mathbf{m}^+ = \mathbf{m}^{+(0)}$ and k = 1;
2. compute $\eta_{\mathbf{m}\mathbf{m}^+}$;
3. while ($\eta_{\mathbf{m}\mathbf{m}^+}$ > TOL & k $\le$ Nmax) {
4.   for i=1:s
5.     compute $\eta^i_{\mathbf{m}\mathbf{m}^+}$;
6.     if $\eta^i_{\mathbf{m}\mathbf{m}^+}$ > delta1 TOL/s
7.       $m_i^{(k)} = m_i^{(k-1)} + \mu$; $m_i^{+,(k)} = m_i^{+,(k-1)} + \mu$;
8.     elseif $\eta^i_{\mathbf{m}\mathbf{m}^+} \le$ delta2 TOL/s
9.       $m_i^{(k)} = \max(1, m_i^{(k-1)} - \mu)$, $m_i^{+,(k)} = \max(1, m_i^{+,(k-1)} - \mu)$;
       end
     end
10.  compute $\eta_{\mathbf{m}\mathbf{m}^+}$;
11.  k = k + 1; }
Some remarks are in order. We introduce a maximum number Nmax of allowed
iterations to ensure that the procedure stops. If the algorithm terminates within this
number, the modal multi-index predicted by the procedure identifies a piecewise
hierarchically reduced solution $u_{\mathbf{m}}$ such that $J(u_{\mathbf{m}}) \approx J(u)$ within tolerance TOL.
The parameters delta1 and delta2 limit the occurrence of model refinement
and coarsening, respectively, thus improving the robustness of the algorithm as well as its
computational efficiency. In particular, during the model coarsening, we force the
number of admissible modes to be at least equal to one.
Finally, the fulfillment of the desired tolerance for the estimator
is checked twice: at step 3, when we check whether the initial guess $\mathbf{m}^{(0)}$ predicts a
sufficiently rich model (in such a case, no loop is needed); then, we perform the same
check after the new prediction for $\mathbf{m}$ at steps 7 and 9.
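The second phase can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the implementation of [23]: the local estimators are supplied by a user-provided callback, the global estimator is taken as the sum of the local ones, the coarsening test is read as "less than or equal to delta2 TOL/s", and an extra guard keeps the enriched index strictly larger than the basic one. The values of `delta1` and `delta2` below are toy values chosen so that a dead band exists between refinement and coarsening; they are not the values used in the text. The estimator `toy_estimator` is synthetic, with made-up constants.

```python
def adapt_modes(m0, m0_plus, local_estimator, tol, mu=1,
                delta1=1.0, delta2=0.1, n_max=10):
    """Refine or coarsen the modal multi-index until the estimator is below tol."""
    m, m_plus = list(m0), list(m0_plus)
    s = len(m)
    eta_loc = local_estimator(m, m_plus)              # one value per subdomain
    history = [(tuple(m), tuple(m_plus), sum(eta_loc))]
    k = 1
    while sum(eta_loc) > tol and k <= n_max:
        for i in range(s):
            if eta_loc[i] > delta1 * tol / s:         # model refinement
                m[i] += mu
                m_plus[i] += mu
            elif eta_loc[i] <= delta2 * tol / s:      # model coarsening
                m[i] = max(1, m[i] - mu)
                # Guard added here (an assumption): keep m_plus[i] > m[i].
                m_plus[i] = max(m[i] + 1, m_plus[i] - mu)
        eta_loc = local_estimator(m, m_plus)
        history.append((tuple(m), tuple(m_plus), sum(eta_loc)))
        k += 1
    return m, m_plus, history

def toy_estimator(m, m_plus, constants=(0.02, 8.0, 1e-4)):
    """Synthetic local estimators decaying geometrically with the number of modes."""
    return [c * 4.0 ** (-mi) for c, mi in zip(constants, m)]

m_final, m_plus_final, history = adapt_modes([3, 4, 3], [5, 6, 5],
                                             toy_estimator, tol=2e-3)
```

In this toy run the first subdomain falls in the dead band and is left untouched, the second is refined at every iteration and the third is coarsened down to a single mode; the returned `history` stores one record per iteration, in the spirit of Table 1.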
We assess the reliability of both the modeling error estimator $\eta_{\mathbf{m}\mathbf{m}^+}$ and the adaptive
procedure above on the same test case tackled in the previous numerical sections.
In particular, as goal quantity we choose the mean value of the solution on $\Omega$. This
leads us to identify the functional $J$ in (18) with $J(v) = [\text{meas}(\Omega)]^{-1} \int_\Omega v(x, y)\, d\mathbf{z}$.
Notice that the dual problem still coincides with an advection-diffusion problem, but
with a forward advective field. Homogeneous Dirichlet boundary conditions on the
whole boundary complete the dual formulation.
We make the following choices for the input parameters of the modeling adaptive
procedure: TOL $= 2 \cdot 10^{-3}$, $\tilde{m} = 3$, $\tilde{m}^+ = 5$, $\delta = 0.1$, $\mu = 1$, delta1 = 0.5,
delta2 = 1.5, Nmax = 10, while we introduce a uniform finite element partition
of size $h = 0.05$ to discretize $\Omega_{1D} = (0, 7)$. The adaptive procedure detects the
three subdomains $\Omega_1 = (0, 2.7) \times (0, 2)$, $\Omega_2 = (2.7, 4.4) \times (0, 2)$ and $\Omega_3 =
(4.4, 7) \times (0, 2)$, while the initial guesses predicted for the modal multi-indices are
$\mathbf{m}^{(0)} = \{3, 4, 3\}$, $\mathbf{m}^{+(0)} = \{5, 6, 5\}$. The domain $\Omega_2$ associated with the two sources
is immediately identified as the most troublesome.
Concerning the domain decomposition algorithm, due to the advective field $\mathbf{b}$, we
have to pay attention in selecting the Dirichlet and the Neumann interfaces for the
primal and dual problems to guarantee the well-posedness of each subproblem on
$\Omega_i$. In particular, a Dirichlet/Neumann condition is assigned at $\Sigma_1 = \{2.7\} \times \gamma_{2.7}$ and
$\Sigma_2 = \{4.4\} \times \gamma_{4.4}$ when solving the primal problem; conversely, a Neumann/Dirichlet
condition is enforced on $\Sigma_1$ and $\Sigma_2$ to solve the dual problem. We fix a tolerance equal
to $10^{-2}$ for the domain decomposition algorithm at both $\Sigma_1$ and $\Sigma_2$. The average
number of iterations required to ensure this accuracy is eight for the primal problem
and nine for the dual one.
Three model adaptive iterations are enough to reach the desired tolerance TOL, with a
final prediction for the modal multi-index $\mathbf{m} = \mathbf{m}^{(3)} = \{3, 7, 1\}$. While the initial
number $m_1^{(0)} = 3$ of modes is preserved on $\Omega_1$, we have a gradual increase of the
number of modal functions in $\Omega_2$ and a model coarsening occurs in $\Omega_3$. Table 1
provides more quantitative information about the adaptive procedure. The sequence
of columns gathers the iteration number $k$ ($k = 0$ refers to the initial configuration
predicted by the first phase), the modal multi-indices $\mathbf{m}$, $\mathbf{m}^+$ and the value of the
estimator $\eta_{\mathbf{m}\mathbf{m}^+}$. For the sake of simplicity, we set the saturation constant $\beta_{\mathbf{m}}$ in
(21) to zero. Figure 11 collects the piecewise Hi-Mod solutions for the starting
guess and for the odd iterations of the adaptive procedure, while Fig. 10 shows
the associated error estimators: in particular, the three lines correspond to the local
quantities $J_i(u_{\mathbf{m}^+} - u_{\mathbf{m}})$, for $i = 1, 2, 3$. The model discontinuity occurring at the
interface $\Sigma_1$ is almost imperceptible.

Table 1 Quantitative information on the second phase of the modeling adaptive procedure
k    $\mathbf{m}$       $\mathbf{m}^+$      $\eta_{\mathbf{m}\mathbf{m}^+}$
0    {3, 4, 3}    {5, 6, 5}    $1.81 \cdot 10^{-2}$
1    {3, 5, 2}    {5, 7, 4}    $4.72 \cdot 10^{-3}$
2    {3, 6, 1}    {5, 8, 3}    $4.56 \cdot 10^{-3}$
3    {3, 7, 1}    {5, 9, 3}    $1.31 \cdot 10^{-3}$

Fig. 10 Normalized estimator $\bar{\eta}_{\tilde{m}\tilde{m}^+}$; distribution of the local quantities $J_i(u_{\mathbf{m}^+} - u_{\mathbf{m}})$, with
$i = 1, 2, 3$, for the starting guess and for the odd iterations of the adaptive procedure (left to right)

Fig. 11 Piecewise Hi-Mod solutions: $u^h_{3,4,3}$, $u^h_{3,5,2}$, $u^h_{3,7,1}$ (top to bottom)

Fig. 12 Pointwise Hi-Mod solution (left); corresponding modal distribution (right)
Remark 7 The a posteriori modeling error analysis and the adaptive procedure presented
here can be generalized to both a uniform and a pointwise setting. Figure 12
shows the Hi-Mod solution predicted by the error estimator derived for a pointwise
reduction to control the energy norm of the modeling error [25]. Notice the high
number of modes used in the central part of the domain. Moreover, in [23] the model
adaptivity is successfully combined with an adaptive prediction of the finite element
partition along $\Omega_{1D}$.
5 Conclusions and Perspectives
We have presented three different approaches to model phenomena characterized by
strong horizontal dynamics, even in the presence of significant transverse
dynamics. The pointwise Hi-Mod procedure seems to be the most flexible approach
since it is suited to deal with both localized and widespread dynamics. In contrast,
the uniform approach deals better with diffused dynamics, while the piecewise
approach is ideal to manage confined transverse dynamics.
Concerning future developments, our ultimate goal is to employ Hi-Mod reduction
for the simulation of the blood flow in the whole arterial system. Mandatory next steps
will be the generalization of the Hi-Mod procedure to unsteady nonlinear problems,
such as the Navier-Stokes equations, as well as the combination of the Hi-Mod
procedure with approaches already validated in a haemodynamics setting, such as
the geometrical multiscale formulation [7, 12, 13].
Acknowledgments The author thanks Massimiliano Lupo Pasini for Figs. 7 and 12 and Alessandro
Veneziani for the suggestions during the preparation of the manuscript.
References
1. Achchab B, Achchab S, Agouzal A (2004) Some remarks about the hierarchical a posteriori
error estimate. Numer Meth Partial Differ Equ 20(6):919-932
2. Ainsworth M (1998) A posteriori error estimation for fully discrete hierarchic models of elliptic
boundary value problems on thin domains. Numer Math 80:325-362
3. Aletti M, Perotto S, Veneziani A (2014) Educated bases for hierarchical model reduction in
2D and 3D. (in preparation)
4. Babuška I, Schwab C (1996) A posteriori error estimation for hierarchic models of elliptic
boundary value problems on thin domains. SIAM J Numer Anal 33:221-246
5. Bangerth W, Rannacher R (2003) Adaptive finite element methods for differential equations.
Birkhäuser, Basel
6. Bank RE, Smith RK (1993) A posteriori error estimates based on hierarchical bases. SIAM J
Numer Anal 30:921-935
7. Blanco PJ, Leiva JS, Feijóo RA, Buscaglia GC (2011) Black-box decomposition approach for
computational hemodynamics: one-dimensional models. Comput Methods Appl Mech Eng
200(13-16):1389-1405
8. Canuto C, Maday Y, Quarteroni A (1982) Analysis of the combined finite element and Fourier
interpolation. Numer Math 39:205-220
9. Canuto C, Maday Y, Quarteroni A (1984) Combined finite element and spectral approximation
of the Navier-Stokes equations. Numer Math 44:201-217
10. Dörfler W, Nochetto RH (2002) Small data oscillation implies the saturation assumption.
Numer Math 91:1-12
11. Ern A, Perotto S, Veneziani A (2008) Hierarchical model reduction for advection-diffusion-reaction
problems. In: Kunisch K, Of G, Steinbach O (eds) Numerical mathematics and
advanced applications, pp 703-710. Springer, Heidelberg
12. Formaggia L, Nobile F, Quarteroni A, Veneziani A (1999) Multiscale modelling of the circulatory
system: a preliminary analysis. Comput Visual Sci 2:75-83
13. Formaggia L, Quarteroni A, Veneziani A (eds) (2009) Cardiovascular mathematics, modeling,
simulation and applications, vol 1. Springer, Berlin
14. Giles MB, Süli E (2002) Adjoint methods for PDEs: a posteriori error analysis and postprocessing
by duality. Acta Numer 11:145-236
15. Heinrich B (1996) The Fourier-finite-element method for Poisson's equation in axisymmetric
domains with edges. SIAM J Numer Anal 33:1885-1911
16. Johnson C (1993) A new paradigm for adaptive finite element methods. In: Whiteman J (ed)
Proceedings of MAFELAP, vol 93. Wiley, New York
17. Lasis A, Süli E (2003) Poincaré-type inequalities for broken Sobolev spaces. Technical Report
03/10. Oxford University Computing Laboratory
18. Lions J-L, Magenes E (1972) Non-homogeneous boundary value problems and applications.
Springer, Berlin
19. Lorenz B, Biros G, Ghattas O, Heinkenschloss M, Keyes D, Mallick B, Tenorio L, van Bloe-
men Waanders B, Willcox K, Marzouk Y (eds) (2011) Large-scale inverse problems and
quantification of uncertainty, vol 712. Wiley, Chichester
20. Oden JT, Prudhomme S (2001) Goal-oriented error estimation and adaptivity for the finite
element method. Comput Math Appl 41:735-756
21. Perotto S (2014) Hierarchical model (Hi-Mod) reduction in non-rectilinear domains. In: Erhel
J, Gander M, Halpern L, Pichot G, Sassi T, Widlund O (eds) Lecture Notes in Computer
Science Engineering. Springer, Berlin, pp 407-414
22. Perotto S, Ern A, Veneziani A (2010) Hierarchical local model reduction for elliptic problems:
a domain decomposition approach. Multiscale Model Simul 8(4):1102-1127
23. Perotto S, Veneziani A (2014) Coupled model and grid adaptivity in hierarchical reduction of
elliptic problems. J Sci Comput. doi:10.1007/s10915-013-9804-y
24. Perotto S, Zilio A (2013) Hierarchical model reduction: three different approaches. In: Cangiani
A, Davidchack R, Georgoulis E, Gorban A, Levesley J, Tretyakov M (eds) Numerical
mathematics and advanced applications, pp 851-859. Springer, Berlin
25. Perotto S, Zilio A (2014) Hierarchical model reduction for time dependent problems. (in
preparation)
26. Quarteroni A, Valli A (1999) Domain decomposition methods for partial differential equations.
In: Numerical mathematics and scientific computation. Oxford University Press, New York
27. Toselli A, Widlund O (2005) Domain decomposition methods-algorithms and theory. Springer,
Berlin
28. Vogelius M, Babuška I (1981) On a dimensional reduction method I. The optimal selection of
basis functions. Math Comput 37:31-46
29. Vogelius M, Babuška I (1981) On a dimensional reduction method II. Some approximation-theoretic
results. Math Comput 37:47-68
30. Vogelius M, Babuška I (1981) On a dimensional reduction method III. A posteriori error
estimation and an adaptive approach. Math Comput 37:361-384
Part V
Multi-fluid Flows
On the Application of Two-Fluid Flows Solver
to the Casting Problem
K. Kamran, R. Rossi, P. Dadvand and S. R. Idelsohn
Abstract Two-fluid modeling of the casting process is important because it
provides insight into the behavior of the air during the filling process. Large deformations
of the material-air front require an interface capturing technique to detect it on
fixed Eulerian meshes. On the other hand, if a sharp interface is considered, jumps
in the material properties within a single element cause severe instabilities in the
solution. We review various techniques developed both for conservative level set
methods that capture the interface and for enrichment techniques that cure the instabilities.
The coupling of the level set method with one of these enrichment techniques is
applied to the mold filling process.
1 Introduction
Multi-fluid flow simulation with large deformations at the interface has to deal with
two main challenges. The first is to accurately follow or capture the interface
between the phases, and the second is to treat the jumps in the material properties
K. Kamran (corresponding author), R. Rossi, P. Dadvand, S. R. Idelsohn
Centre Internacional de Mètodes Numèrics en Enginyeria (CIMNE),
Gran Capità s/n,
R. Rossi, P. Dadvand
Universitat Politècnica de Catalunya, Barcelona, Spain
P. Dadvand
S. R. Idelsohn
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

at the interface, which result in kinks or jumps in the unknown fields. The most popular
technique to capture the interface is the level set method, which can deal with large
deformations of the interface. Its main deficiency is the gain or loss of material at
sharp interfaces or during the reinitialization step. On the other hand, in applications
like mold filling, where one of the fluids has a large density (aluminum or steel) and
the other a small density (air), instabilities appear at the interface that can totally
destroy the solution accuracy. The main reason for this behavior is the inability
of linear elements to capture kinks or jumps in the pressure-velocity pair. Various
enrichment techniques have been proposed to improve the approximation properties in the
elements cut by the interface. In the following, we first review some of the main algorithms
developed for interface capturing by means of the level set method. Then,
various enrichment techniques to treat kinks and/or jumps in the unknown fields at
the interface are presented. Finally, we loosely couple the level set method and one
of the enrichment techniques to model the mold filling process.
2 Level Set Method
The underlying idea behind the level set method is to represent an interface as the
zero-level set of a higher dimensional function $\phi(\mathbf{x}, t)$. This function is scalar and
substantially reduces the complexity of describing the interface, especially when
undergoing topological changes such as pinching and merging. The level set function
$\phi(\mathbf{x}, t)$ is defined to be a smooth function that is positive in one region and
negative in the other.
The motion of the interface is determined by a velocity field, $\mathbf{u}$, which can depend
on a variety of things including position, time, geometry of the interface, or be given
externally, for instance as the solution of the Navier-Stokes equations in a fluid flow
simulation. The advection equation for interface evolution is
$$\phi_t + \mathbf{u} \cdot \nabla\phi = 0. \tag{1}$$
This level set equation is a first order hyperbolic PDE and only needs to be solved
near the interface. Geometrical quantities related to the interface ($\phi = 0$), i.e. the unit
normal $\mathbf{N}$ and the curvature $\kappa$, can be calculated from the level set function by
$$\mathbf{N} = \frac{\nabla\phi}{|\nabla\phi|}, \qquad \kappa = \nabla \cdot \mathbf{N}. \tag{2}$$
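As an illustration of Eq. (2), the following minimal sketch evaluates the unit normal and the curvature by central differences on a uniform 2D grid. The function name `normal_and_curvature`, the circular test interface and the grid parameters are assumptions made for this example only; they are not taken from the chapter.

```python
import numpy as np

def normal_and_curvature(phi, h):
    """Unit normal N = grad(phi)/|grad(phi)| and curvature kappa = div(N)
    on a uniform 2D grid with spacing h (central differences via np.gradient)."""
    phi_x, phi_y = np.gradient(phi, h, h)          # axis 0 is x, axis 1 is y
    norm = np.sqrt(phi_x**2 + phi_y**2) + 1e-12    # guard against division by zero
    nx, ny = phi_x / norm, phi_y / norm
    kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)
    return nx, ny, kappa

# Hypothetical test case: signed distance to a circle of radius R.
h, R = 0.02, 0.5
x = np.arange(-1.0, 1.0 + h / 2, h)
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sqrt(X**2 + Y**2) - R
nx, ny, kappa = normal_and_curvature(phi, h)
near = np.abs(phi) < h                             # narrow band around phi = 0
kappa_mean = kappa[near].mean()                    # should be close to 1/R
```

For a signed distance function the exact curvature of a circle is the inverse of the distance to its centre, so the band average gives a quick sanity check of the discretization.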

The most common choice for the level set function is the signed distance to
the interface, so that $|\nabla\phi| = 1$. This ensures that the level set is a smoothly varying
function well suited for high order accurate numerical methods. There are several
techniques to solve the level set Eq. (1) in space and time. One of the most popular on
structured meshes is the high order Hamilton-Jacobi ENO method (HJ-ENO) combined
with a Runge-Kutta method [1, 2]. Despite the high order temporal and spatial
approximations of the level set equation, instabilities may appear when the level set
ceases to be a signed distance function. This situation occurs in the presence of large
topological changes in the vicinity of the interface, which are quite common in practical
problems. One solution is to reshape the level set function into a distance function. This
method, called reinitialization, has been shown to suppress those numerical instabilities.
Reinitialization algorithms maintain the signed distance property by solving to
steady state (as the fictitious time $\tau \to \infty$) the equation
$$\phi_\tau + \mathrm{sgn}(\phi_0)\,(|\nabla\phi| - 1) = 0, \tag{3}$$
where $\mathrm{sgn}(\phi_0)$ is a one-dimensional smeared out signum function [3]. Equation (3)
only needs to be solved near the interface and not in the whole domain. Efficient
ways to solve this equation to steady state via fast marching methods are discussed
in [4]. This equation can also be written in a classical Hamilton-Jacobi form as
$$\phi_\tau + \mathbf{v}(\phi) \cdot \nabla\phi = \mathrm{sgn}(\phi_0), \qquad \mathbf{v}(\phi) = \mathrm{sgn}(\phi_0)\,\frac{\nabla\phi}{|\nabla\phi|}.$$
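To make the reinitialization equation concrete, here is a minimal 1D sketch that marches it to steady state with a first-order Godunov upwind scheme and explicit pseudo-time stepping. This is not the HJ-ENO or fast marching machinery cited in the text; the scheme, the smearing width (one grid spacing), the function name `reinitialize_1d` and the test profile are all assumptions chosen for brevity.

```python
import numpy as np

def reinitialize_1d(phi0, h, n_steps, cfl=0.5):
    """Explicit first-order Godunov iteration for
    phi_tau + sgn(phi0) (|phi_x| - 1) = 0 on a uniform 1D grid."""
    phi = phi0.copy()
    sgn = phi0 / np.sqrt(phi0**2 + h**2)   # smeared-out sign function (assumed width h)
    dtau = cfl * h
    for _ in range(n_steps):
        a = np.empty_like(phi)             # backward difference D-
        b = np.empty_like(phi)             # forward difference D+
        a[1:] = (phi[1:] - phi[:-1]) / h
        b[:-1] = a[1:]
        a[0], b[-1] = b[0], a[-1]          # one-sided closure at the two ends
        grad_pos = np.sqrt(np.maximum(np.maximum(a, 0.0)**2, np.minimum(b, 0.0)**2))
        grad_neg = np.sqrt(np.maximum(np.minimum(a, 0.0)**2, np.maximum(b, 0.0)**2))
        grad = np.where(sgn > 0, grad_pos, grad_neg)   # upwind away from the interface
        phi = phi - dtau * sgn * (grad - 1.0)
    return phi

# Hypothetical test: a level set with slope 4 and its zero at x = 0.5.
x = np.linspace(0.0, 1.0, 101)
h = x[1] - x[0]
phi0 = 4.0 * (x - 0.5)
phi = reinitialize_1d(phi0, h, n_steps=400)   # approaches the signed distance x - 0.5
```

In this simple case the zero of the level set sits on a grid node and does not move. In general the interface does shift slightly during reinitialization, which is the mass conservation issue discussed next.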

Unfortunately, one of the major drawbacks of the reinitialization process is the diffi-
culty in preserving the original location of the interface, often leading to breakdown
in the conservation of mass. To overcome this problem of mass loss with the level
set method, various solutions have been proposed:
Particle level set methods
Volume of fluid methods
Geometric mass-preserving redistance methods
Discontinuous Galerkin level set methods.
2.1 Particle Level Set
Particle Level Set (PLS) [5] uses Lagrangian marker particles to rebuild the level
set in regions which are under-resolved. This is often the case for flows undergoing
stretching and tearing. Two sets of massless marker particles are placed near the
interface, with one set, the positive particles, in the $\phi > 0$ region and the other set,
the negative particles, in the $\phi < 0$ region. It is unnecessary to place particles far
from the interface, and this greatly reduces the number of particles needed in a given
simulation. The region near the interface can be taken as the region covered
by all elements that have at least one corner at a distance of less than three element
sizes from the interface. The number of particles per cell is set to $4^d$, where $d$ is the spatial dimension.
Figure 1a shows the zero level set for the 2D Zalesak's disk after one revolution.
The placement of the massless particles around the zero level set for this test can be seen
in Fig. 1b. In Fig. 1c the PLS solution after one revolution is shown.
Fig. 1 Particle Level Set (PLS) method [5], a mass gain/loss of the standard level set solution
after one revolution, b placement of massless positive (blue) and negative (red) particles around the
interface for the Zalesak test, c PLS solution after one revolution
For the purpose of interface reconstruction a radius $r_p$ is assigned to each particle
as a function of its distance to the interface. This radius is bounded (on structured
grids) by minimum and maximum values based upon the grid spacing ($0.1\,\Delta d < r_p <
0.5\,\Delta d$, $\Delta d = \min\{\Delta x, \Delta y, \Delta z\}$). The particles are then advected with the evolution
equation
$$\frac{d\mathbf{x}_p}{dt} = \mathbf{u}(\mathbf{x}_p), \tag{4}$$
where $\mathbf{x}_p$ is the position of the particle and $\mathbf{u}(\mathbf{x}_p)$ is its velocity. The particle velocities
are interpolated from the velocities on the underlying grid. The marker particles (4)
and the level set Eq. (1) are separately integrated forward in time. After each complete
time cycle, the particles are used to locate possible errors in the level set function.
Particles that are on the wrong side of the interface by more than their radius, as
determined by the interpolated distance $\phi(\mathbf{x}_p)$, are considered to have escaped.
A local level set function is defined for each particle by means of the radius
associated to the particle,
$$\phi_p(\mathbf{x}) = s_p\,(r_p - |\mathbf{x} - \mathbf{x}_p|), \tag{5}$$
where $s_p$ is the sign of the particle, i.e. $\pm 1$. These level sets are only defined locally
on the corners of the cell containing the particle and can be seen as the particle
predictions of the values of the level set function on the corners of the cell. Any
variation of $\phi$ from $\phi_p$ indicates possible errors in the level set solution. The escaped
positive particles are used to rebuild the $\phi > 0$ region and the escaped negative
particles to rebuild the $\phi \le 0$ region. For example, take the $\phi > 0$ region and
an escaped positive particle. Using Eq. (5), the $\phi_p$ values of the grid points on the
boundary of the cell containing the particle are calculated. Each $\phi_p$ is compared to
the local value of $\phi$ and the maximum of these two values is taken as $\phi^+$. This is
done for all escaped positive particles, creating a reduced error representation of the
$\phi > 0$ region. That is, given a level set $\phi$ and a set of escaped positive particles $E^+$,
we initialize $\phi^+$ with $\phi$ and then calculate
$$\phi^+ = \max_{p \in E^+} (\phi_p, \phi^+).$$
Similarly, for the negative region $\phi \le 0$, we initialize $\phi^-$ with $\phi$ and then calculate
$$\phi^- = \min_{p \in E^-} (\phi_p, \phi^-).$$
We merge $\phi^+$ and $\phi^-$ back into a single level set value by setting $\phi$ to the value of
$\phi^+$ or $\phi^-$ which is least in magnitude at each grid point.
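The correction step can be sketched for a single cell corner as follows. The helper names (`local_particle_level_set`, `has_escaped`, `correct_corner`), the data layout (lists of `(x_p, r_p)` tuples) and the numerical values are hypothetical; a real implementation would loop over all corners of the cells containing escaped particles.

```python
import numpy as np

def local_particle_level_set(x, x_p, r_p, s_p):
    """phi_p(x) = s_p (r_p - |x - x_p|), as in Eq. (5)."""
    return s_p * (r_p - np.linalg.norm(np.asarray(x) - np.asarray(x_p)))

def has_escaped(phi_at_particle, r_p, s_p):
    """A particle has escaped if it is on the wrong side by more than its radius."""
    return s_p * phi_at_particle < -r_p

def correct_corner(phi, corner, escaped_pos, escaped_neg):
    """Correct phi at one cell corner from lists of escaped particles (x_p, r_p)."""
    phi_plus, phi_minus = phi, phi
    for x_p, r_p in escaped_pos:
        phi_plus = max(local_particle_level_set(corner, x_p, r_p, +1.0), phi_plus)
    for x_p, r_p in escaped_neg:
        phi_minus = min(local_particle_level_set(corner, x_p, r_p, -1.0), phi_minus)
    # Merge: keep the candidate of least magnitude.
    return phi_plus if abs(phi_plus) <= abs(phi_minus) else phi_minus

# Hypothetical data: the corner value is wrongly negative, and one positive
# particle of radius 0.1 sits 0.05 away from the corner.
corner = (0.0, 0.0)
phi_corner = -0.2
escaped_pos = [((0.05, 0.0), 0.1)]
phi_corrected = correct_corner(phi_corner, corner, escaped_pos, [])
```

Here the particle prediction (0.05) replaces the erroneous negative value, which restores the positive region around the escaped particle.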
The particle level set method relies on $\phi$ being a signed distance function. This
implies that after each time marching step, a reinitialization of the level set function
using Eq. (3) is necessary. Unfortunately, the reinitialization may cause the zero level
set to move, which is not desirable, so the particle level set method is employed to
correct these errors as well. During the reinitialization step the particles are not
moved, so that they keep track of the zero level set, and any error introduced by the
reinitialization scheme is then corrected by the particles. After the reinitialization and
correction steps, the radii of the particles are adjusted according to the current
$\phi(\mathbf{x}_p)$, i.e. their distance to the zero level set.
In summary, the order of operations in PLS is: evolve both particles and level set
function in time, correct errors in the level set function using the particles, reinitialize
the distance function, correct the errors again using the particles, and finally adjust the
particle radii. A final task to complete the PLS is particle reseeding, i.e. in flows
with interface stretching and tearing we need to periodically re-adapt the
particle distribution to the deformed interface [5].
2.2 Volume of Fluid
The coupled level set/volume-of-fluid (CLSVOF) method combines some of the advantages
of the volume-of-fluid method with the level set method to obtain a method
which is generally superior to either method alone. In the VOF [6, 7], the volume
fraction $F(\Omega, t)$ is used to represent the free surface. Typically, $\Omega$ represents a
computational cell: if $F(\Omega, t) = 1$, then the region is all liquid; if $F(\Omega, t) = 0$,
then the region is all gas; and for intermediate values, $\Omega$ contains both gas and
liquid.
One can define the volume fraction function $F(\Omega, t)$ in terms of the level set
function $\phi$. Since we have $\phi > 0$ in one domain (liquid) and $\phi < 0$ in the other
domain (air), one can define $F(\Omega, t)$ as
$$F(\Omega, t) = \frac{1}{|\Omega|} \int_{\Omega} H(\phi(\mathbf{x}, t))\, d\Omega, \tag{6}$$
where $H$ is the Heaviside function,
$$H(\phi) = \begin{cases} 1 & \text{if } \phi > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$
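Equations (6) and (7) can be illustrated by approximating the cell integral with midpoint sampling. This is only a sketch: the sampling quadrature, the function name `volume_fractions` and the circular liquid region are assumptions for the example, and production VOF codes use geometric reconstruction rather than point sampling.

```python
import numpy as np

def volume_fractions(phi_func, x_edges, y_edges, m=8):
    """Approximate F = (1/|cell|) * integral of H(phi) over each cell
    by midpoint sampling on an m-by-m sub-grid (H is the sharp Heaviside)."""
    nx, ny = len(x_edges) - 1, len(y_edges) - 1
    F = np.zeros((nx, ny))
    s = (np.arange(m) + 0.5) / m                   # sub-cell midpoints in [0, 1]
    for i in range(nx):
        xs = x_edges[i] + s * (x_edges[i + 1] - x_edges[i])
        for j in range(ny):
            ys = y_edges[j] + s * (y_edges[j + 1] - y_edges[j])
            X, Y = np.meshgrid(xs, ys, indexing="ij")
            F[i, j] = np.mean(phi_func(X, Y) > 0.0)
    return F

# Hypothetical liquid region: a disk of radius R, with phi > 0 inside.
R = 0.5
edges = np.linspace(-1.0, 1.0, 21)
F = volume_fractions(lambda X, Y: R - np.sqrt(X**2 + Y**2), edges, edges)
cell_area = (edges[1] - edges[0])**2
liquid_area = F.sum() * cell_area                  # approximates pi * R**2
```

Cells entirely inside the disk get F = 1, cells entirely outside get F = 0, and only the cut cells carry intermediate values.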
The main advantage of representing the free surface with the volume fractions is
that one can write accurate algorithms for advecting the volume fraction function so
that mass is conserved while still maintaining a sharp representation of the interface.
However, a disadvantage of the VOF method is the fact that it is difficult to compute
accurate local curvatures from volume fractions. This is mainly due to the sharp
transitions of the volume fractions at the interface. One remedy is to smooth the H
function at the interface. If one smooths too much then the numerical method will not
detect changes in curvature along the interface. In the CLSVOF method, the curvature
is not smoothed at all; instead the curvature is obtained via finite differences of the
level set function which in turn is derived from the level set function and volume-
of-fluid function at the previous time step.
The equation governing the evolution of the VOF is:
$$F_t + \mathbf{u} \cdot \nabla F = 0. \tag{8}$$
The coupling between the level set function $\phi$ and the volume-of-fluid function $F$ occurs when
the interface normals computed from the level set are used in the interface reconstruction
at each cell, which in turn is used by the VOF function. Much of the research
on VOF methods has focused on obtaining higher accuracy and better representations
of the interface geometry. The original piecewise linear interface reconstruction
technique (PLIC) [20] has been improved upon using parabolic (PROST) [21] and
least squares [22] techniques.
The CLSVOF can be summarized as follows: first update the level set function $\phi^{n+1}$, then
reconstruct the interface at each cell by one of the linear or higher order techniques,
and finally update the volume-of-fluid function $F^{n+1}$. Note that reinitializing
$\phi^{n+1}$ as the exact signed distance function to the reconstructed interface is another
point where coupling between the LS and VOF occurs. Recall that numerically the
smoothed Heaviside function $H_\varepsilon(\phi)$ is substituted for the sharp Heaviside function
$H(\phi)$. The smoothed Heaviside function is defined as
$$H_\varepsilon(\phi) = \begin{cases} 1 & \text{if } \phi > \varepsilon, \\ \frac{1}{2}\left[1 + \frac{\phi}{\varepsilon} + \frac{1}{\pi}\sin\left(\frac{\pi\phi}{\varepsilon}\right)\right] & \text{if } |\phi| \le \varepsilon, \\ 0 & \text{if } \phi < -\varepsilon. \end{cases} \tag{9}$$
2.3 Geometric Mass-Preserving Redistance Methods
The advection distorts the initial shape of the level set function, which needs to
be reinitialized to a smooth function preserving the position of the zero-level set.
Fig. 2 Contours of the distance function $d$ to a rectangle. Example showing that the distance to the interface from outside of the rectangle region (solid line contours) does not belong to the finite element space [9]
Efficient algorithms for level-set redistancing on Cartesian meshes have been developed,
but few methods are available for unstructured meshes. The geometric mass-preserving
redistance method [8, 9], developed for unstructured meshes, can be
localized on a narrow band close to the interface, saving computing effort. Almost
all redistancing algorithms involve some sort of mass-rebalancing step. Geometric
mass-preserving redistancing includes one such step that is local and involves no
adjustable parameter.
Consider an arbitrary triangulation of the domain $\Omega$, and the associated space $V_h$
of continuous functions that are linear inside each simplex. Let $\phi_h \in V_h$ be a function,
and let $S$ be its zero-level set. We look for a function $\tilde{\phi}_h \in V_h$ which approximates the
signed distance function $d$ to $S$. This function satisfies $|\nabla d| = 1$ almost everywhere
in $\Omega$ but does not, in general, belong to $V_h$ (see Fig. 2).
Let $P$ be the set of nodal points that are adjacent to the zero-level set of $\phi_h$, in the
sense that they are vertices of simplices inside which $\phi_h$ changes sign. If one makes
the simple assignment $\tilde{\phi}_h(X) = d(X)$ for all $X \in P$, there is a volume loss (or gain)
which could render the algorithm useless for practical simulations. The values of $\tilde{\phi}_h$ at nodes
adjacent to the zero-level set must thus be adjusted so as to preserve volume, and
the function $\tilde{\phi}_h$ must be calculated at the remaining nodes using the adjusted values
at $P$.
Let us define $\mathcal{K}(\phi_h)$ as the set of simplices in which $\phi_h$ changes sign. The objective
is thus to calculate $\tilde{\phi}_h$ such that it approximates the signed distance $d$ while at the
same time preserving the volume,
$$V(\phi_h) = \int_{\mathcal{K}(\phi_h)} H(\phi_h(\mathbf{x}))\, d\mathbf{x},$$
where $H$ is the Heaviside function defined in (7). The contribution to the volume of
each simplex $K \in \mathcal{K}(\phi_h)$ is
$$V_K(\phi_h) = \int_{K} H(\phi_h(\mathbf{x}))\, d\mathbf{x}.$$
The algorithm to compute the volume-conserving values of $\tilde{\phi}_h$ is as follows:
1. Initialization: The function $\tilde{\phi}_h$ is initialized over the nodes in $P$ (cut elements)
to a first estimate $\tilde{\phi}_h^0$, calculated as the true signed distance to the interface.
2. Simplex-wise correction: In general, the initialization ends up with a function $\tilde{\phi}_h^0$
for which $V_K(\tilde{\phi}_h^0) \neq V_K(\phi_h)$, though the difference is quite small. The idea here
is to determine a constant value, $\eta_K$, to add to $\tilde{\phi}_h^0$ over each cut element to achieve
local volume preservation. The nonlinear equation to be solved is
$$R_K(\eta_K) = V_K(\tilde{\phi}_h^0 + \eta_K) - V_K(\phi_h) = 0.$$
The values $\eta_K$ are computed using a simple secant algorithm,
$$\eta_K^{i+1} = \eta_K^{i} - R_K^{i}\,\frac{\eta_K^{i} - \eta_K^{i-1}}{R_K^{i} - R_K^{i-1}},$$
which converges in very few steps.
3. Node-wise correction: The previous step yields the simplex-wise corrections to $\tilde{\phi}_h^0$
that preserve volume on all cut elements. A node-wise correction $\eta_h$
is defined by averaging these corrections over the simplices that share a node. The value of
$\tilde{\phi}_h$ on $P$ is finally calculated as
$$\tilde{\phi}_h = \tilde{\phi}_h^0 + C\,\eta_h,$$
where $C$ is computed so as to conserve the volume over the band of elements
cut by the interface,
$$\int_{\mathcal{K}(\phi_h)} H\left(\tilde{\phi}_h^0 + C\,\eta_h\right) d\mathbf{x} = \int_{\mathcal{K}(\phi_h)} H\left(\phi_h(\mathbf{x})\right) d\mathbf{x}.$$
This nonlinear equation for $C$ is again solved by a simple secant method and
converges in very few iterations.
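The secant iteration used in steps 2 and 3 can be sketched on a one-dimensional analogue, where a "cut element" is the interval [0, 1], the level set is linear, and the "volume" is the length of the positive part. The functions `positive_length` and `secant_shift`, the starting guesses and the numbers below are illustrative assumptions, not the simplex-based implementation of [8, 9].

```python
def positive_length(a, b, eta):
    """Length of {x in [0, 1] : a + b x + eta > 0} for a linear phi with b > 0."""
    x_zero = -(a + eta) / b
    return 1.0 - min(max(x_zero, 0.0), 1.0)

def secant_shift(a, b, target, eta0=0.0, eta1=1e-3, tol=1e-12, max_iter=20):
    """Find the constant eta for which the positive length equals `target`."""
    r0 = positive_length(a, b, eta0) - target
    r1 = positive_length(a, b, eta1) - target
    for _ in range(max_iter):
        if abs(r1) < tol or r1 == r0:
            break
        # Secant update: eta_{i+1} = eta_i - R_i (eta_i - eta_{i-1}) / (R_i - R_{i-1})
        eta0, eta1 = eta1, eta1 - r1 * (eta1 - eta0) / (r1 - r0)
        r0, r1 = r1, positive_length(a, b, eta1) - target
    return eta1

# Hypothetical cut element: the first estimate is phi0(x) = x - 0.42, while the
# original level set had a positive length of 0.6 in this element.
eta = secant_shift(a=-0.42, b=1.0, target=0.6)
```

Because the residual is piecewise linear in the shift for linear elements, the secant iteration converges here in a single update, which is in line with the "very few steps" reported above.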
Note that considering a uniform mass-conserving correction by choosing $\eta_h = 1$
is not optimal since the volume loss/gain is not uniform over the interface. In fact the
loss/gain in volume tends to concentrate in regions of higher curvature. Figure 3
shows the application of the geometric mass-preserving redistance method to the Zalesak
test.
2.4 Discontinuous Galerkin Level Set Methods
A quadrature-free discontinuous Galerkin method (DGM) has been developed for
the conservative form of the level set equation in [10]. This method has two main
advantages that make it attractive compared with the other methods. First, it is not
necessary to compute gradients of the level set function and, second, the scheme
remains stable even if $\phi$ departs from a signed distance function. It follows that in
this method the reinitialization step is omitted, and therefore many of the mass
conservation problems disappear.
Fig. 3 Geometric mass-preserving redistance method [8] applied to the Zalesak test. Interface
position at different instants during one revolution and with redistancing after each step, a-e
mass-preserving redistance method and f-j level set method. a t = 0 s, b t = 157 s, c t = 314 s, d t
= 471 s, e t = 628 s, f t = 0 s, g t = 157 s, h t = 314 s, i t = 471 s, j t = 628 s
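The conservative form of the level set Eq. (1) is
$$\phi_t + \nabla \cdot (\mathbf{u}\phi) = \phi\, \nabla \cdot \mathbf{u}.$$
In the case of incompressible flows, $\nabla \cdot \mathbf{u} = 0$, this equation reduces to
$$\phi_t + \nabla \cdot (\mathbf{u}\phi) = 0. \tag{10}$$
Within this framework, only possible for incompressible flow, it is no longer necessary to
compute $\nabla\phi$. The variational form of Eq. (10) is written as
$$\int_{\Omega} \phi_t\, w\, d\Omega = \int_{\Omega} \phi\, \mathbf{u} \cdot \nabla w\, d\Omega - \int_{\partial\Omega} f\, w\, ds, \tag{11}$$
where $f(\phi) = \phi\, \mathbf{u} \cdot \mathbf{n}$ is the normal trace of the fluxes and $\mathbf{u}$ is the velocity field. Since
the DGM allows discontinuities at the interface, the flux is not uniquely determined
on it and a flux formula has to be supplied to complete the discretization process. In
the simple advection case the upwind flux is chosen to be the value of $\phi$ on $\partial\Omega$:
$$\phi^{UP} = \begin{cases} \phi^{+} & \text{if } \mathbf{u} \cdot \mathbf{n} \le 0, \text{ i.e. the flow goes inside the domain,} \\ \phi^{-} & \text{if } \mathbf{u} \cdot \mathbf{n} > 0, \text{ i.e. the flow goes outside the domain.} \end{cases}$$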
In each element $\phi$ is discretized using piecewise continuous approximations and an
important assumption is considered to evaluate the nonlinear term $F(\phi)$:
$$\phi = \sum_{k=1}^{d} N_k\, \phi_k^e,$$
$$F(\phi) = F\left(\sum_{k=1}^{d} N_k\, \phi_k^e\right) \approx \sum_{k=1}^{d} N_k\, F(\phi_k^e). \tag{12}$$
Table 1 Different level set techniques applied to the 2D Zalesak test
Method            h     Area     Area loss (%)   L1 error   CPU time (s)
Exact             -     582.2    -               -          -
WENO [5]          1.0   613.0    5.3             0.61       -
PLS [5]           1.0   580.4    0.31            0.07       134.043
PLS [5]           0.5   581.7    0.08            0.031      840.038
Geometric [8]     0.5   579.1    0.55            0.17       -
DG TRI(4) [10]    4.0   583.27   0.18            0.05       101.22
DG TRI(4) [10]    2.0   582.17   0.005           0.011      955.5
In [10] Lagrangian shape functions are used to approximate $\phi$, and in each element
$e$, $d$ Lagrangian points are considered. As the interpolations are disconnected,
the integral form (11) is written for each element $e$ of the mesh. Note that for elements
with straight edges the Jacobian matrix is constant, and the matrices related
to (11), once written in the reference coordinates, are independent of the element.
In this way the DGM is quadrature-free. In [10] a TVD Runge-Kutta scheme of order $p + 1$,
where $p$ is the polynomial interpolation order, is used for the time stepping. The RK scheme
of order $p + 1$ with DG of order $p$ can be proven to be stable under the CFL condition
$\Delta t < \frac{h}{c\,(2p+1)}$, where $h$ is the element size and $c$ is the norm of the maximum velocity
on element $e$. Table 1 summarizes the comparison between DG and the other methods
listed there (WENO, PLS and the geometric redistance method) for the 2D
Zalesak's disk.
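The role of the upwind flux and of the CFL bound can be seen in the lowest-order case. The sketch below uses piecewise-constant cells (p = 0) in 1D with periodic boundaries and forward Euler, which is the order p + 1 = 1 Runge-Kutta scheme. It is an assumed toy setting, not the quadrature-free scheme of [10], and the names `cfl_time_step`, `upwind_step` and the `safety` factor are hypothetical.

```python
import numpy as np

def cfl_time_step(h, c, p, safety=0.9):
    """Time step below the bound dt < h / (c (2p + 1)); `safety` is an assumed margin."""
    return safety * h / (c * (2 * p + 1))

def upwind_step(phi, u, h, dt):
    """One forward-Euler step of phi_t + d(u phi)/dx = 0 with piecewise-constant
    cells (p = 0), periodic boundaries and a constant velocity u."""
    if u >= 0.0:
        flux_right = u * phi                 # upwind value is the cell's own
    else:
        flux_right = u * np.roll(phi, -1)    # upwind value is the right neighbour
    flux_left = np.roll(flux_right, 1)       # right-face flux of the left neighbour
    return phi - dt / h * (flux_right - flux_left)

n, u = 100, 1.0
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
phi = np.where(np.abs(x - 0.5) < 0.2, 1.0, -1.0)   # discontinuous profile
mass0 = phi.sum() * h
dt = cfl_time_step(h, abs(u), p=0)
for _ in range(200):
    phi = upwind_step(phi, u, h, dt)
mass1 = phi.sum() * h                              # conserved up to round-off
```

Because each face flux is added to one cell and subtracted from its neighbour, the integral of the level set is conserved without any reinitialization, which is the property that motivates the conservative form (10).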
3 Two Phase Flows
Two types of methods are generally distinguished for resolving the interface between
the two phases: interface tracking, in which the mesh explicitly represents the interface
and follows its movements [11-13], and interface capturing, in which the interface is
described implicitly as the zero level set of an auxiliary function [6, 14-16] defined on
the fixed mesh. The interface-capturing method is more convenient when large
topological changes occur at the interface. The main issue that needs to be addressed
in two-phase flow problems is the possible jumps or kinks in the unknown fields,
i.e. velocity and pressure, due to the jumps in material properties, i.e. density and
viscosity, or to the surface tension. In this view, two classes of methods can be recognized
in the interface-capturing category. In the first one [17] a numerical thickness is given
to the interface by means of the smoothed Heaviside function (9). The material
properties, $\rho$ and $\mu$, are then defined for the two-fluid system at the discrete level as
$$\rho = \rho_w H_\varepsilon(\phi_h) + \rho_a\,[1 - H_\varepsilon(\phi_h)],$$
$$\mu = \mu_w H_\varepsilon(\phi_h) + \mu_a\,[1 - H_\varepsilon(\phi_h)].$$
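A minimal sketch of this smoothing, assuming the convention that the smoothed Heaviside function equals 1 in the liquid (positive level set) and 0 in the air, is given below. The function names and the property values are hypothetical and serve only to show the blending.

```python
import numpy as np

def smoothed_heaviside(phi, eps):
    """Smoothed Heaviside: 0 for phi < -eps, 1 for phi > eps, smooth in between."""
    phi = np.asarray(phi, dtype=float)
    mid = 0.5 * (1.0 + phi / eps + np.sin(np.pi * phi / eps) / np.pi)
    return np.where(phi > eps, 1.0, np.where(phi < -eps, 0.0, mid))

def blend(prop_w, prop_a, phi, eps):
    """Property of the two-fluid system: prop_w * H + prop_a * (1 - H)."""
    H = smoothed_heaviside(phi, eps)
    return prop_w * H + prop_a * (1.0 - H)

# Hypothetical densities for a heavy liquid and air, and half-thickness eps.
rho_w, rho_a, eps = 2700.0, 1.2, 0.1
phi_samples = np.linspace(-0.2, 0.2, 41)
rho_samples = blend(rho_w, rho_a, phi_samples, eps)
```

The density varies monotonically from the air value to the liquid value across a band of width 2 eps, which is why the level set must remain close to a distance function for the interface thickness to stay constant.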
A typical stabilized form of the incompressible Navier-Stokes equations is considered
to complete the simulation. Note that this family of methods requires a constant
interface thickness in time, and therefore it is crucial to frequently reinitialize
the distance function.
In the second class of methods, the interface is considered sharp and no smoothing
of the interface is applied. As a result, cut elements with sharp jumps of several
orders of magnitude in the material properties produce kinks and/or discontinuities
in the solution fields. Piecewise linear continuous approximations are too poor to
capture these phenomena. Different enrichment techniques have therefore been developed to
treat this problem: local enrichments, which are condensed at the element level, and global
enrichments.
3.1 Local Enrichments
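The incompressible Navier-Stokes equations governing the two-phase motion in
the domain $\Omega$ are written as
$$\begin{aligned} \rho\,(\partial_t \mathbf{u} + \mathbf{u} \cdot \nabla \mathbf{u}) - \nabla \cdot (2\mu \nabla^s \mathbf{u}) + \nabla p &= \rho\,\mathbf{b}, \\ \nabla \cdot \mathbf{u} &= 0. \end{aligned} \tag{13}$$
This set of equations is completed by the appropriate Dirichlet and Neumann
boundary conditions. The domain is divided into two parts by the zero level of the
level set function, denoted by $\Omega^+$ and $\Omega^-$, with the following material properties:
$$(\rho(\mathbf{x}), \mu(\mathbf{x})) = \begin{cases} (\rho^+, \mu^+) & \text{if } \mathbf{x} \in \Omega^+, \\ (\rho^-, \mu^-) & \text{if } \mathbf{x} \in \Omega^-. \end{cases}$$
Note that a jump in density at the interface produces a discontinuity in the pressure
gradient, while a jump in viscosity causes a discontinuity in the pressure. Surface tension
can also produce a jump in the pressure field at the interface.
The balance of the interface and internal forces at the interface implies that
$$(\boldsymbol{\sigma}^+ - \boldsymbol{\sigma}^-) \cdot \mathbf{n} = \mathbf{f}_\Gamma.$$
Here we only consider the case in which the interface force $\mathbf{f}_\Gamma$ is the surface tension and
is therefore normal to the interface. The internal stress in each domain has the form
$\boldsymbol{\sigma} = -p\mathbf{I} + 2\mu \nabla^s \mathbf{u}$.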
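The variational equivalent of (13) is to find $(\mathbf{u}, p) \in V \times Q$ such that
$$\int_\Omega \rho\,(\partial_t \mathbf{u} + \mathbf{u} \cdot \nabla \mathbf{u}) \cdot \mathbf{v}\, d\Omega + \int_\Omega 2\mu\, \nabla^s \mathbf{u} : \nabla^s \mathbf{v}\, d\Omega - \int_\Omega p\, \nabla \cdot \mathbf{v}\, d\Omega = \int_\Omega \rho\,\mathbf{b} \cdot \mathbf{v}\, d\Omega + \int_\Gamma \mathbf{v} \cdot \mathbf{f}_\Gamma\, d\Gamma,$$
$$\int_\Omega q\, \nabla \cdot \mathbf{u}\, d\Omega = 0,$$
$\forall (\mathbf{v}, q) \in V \times Q$. A stabilized variational form can be obtained based on the
Algebraic Subgrid Scale (ASGS) method. The trapezoidal rule is used for the temporal
discretization, and the nonlinearities are best handled in an implicit fashion using a
Newton-Raphson scheme. The discrete variational form of this problem then reads,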

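$$\begin{aligned}
R_u ={}& \int_\Omega \mathbf{G}_u \cdot \mathbf{v}_h\, d\Omega + \int_\Omega 2\mu\, \nabla^s \mathbf{u}_h^{n+1} : \nabla \mathbf{v}_h\, d\Omega - \int_\Omega p_h^{n+1}\, \nabla \cdot \mathbf{v}_h\, d\Omega - \int_{\Gamma_h} \mathbf{f}^{n+1} \cdot \mathbf{v}_h\, d\Gamma \\
& + \sum_e \int_{\Omega_e} \tau_1\, \rho\, \mathbf{u}_h^{n} \cdot \nabla \mathbf{v}_h \cdot (\mathbf{G}_u + \nabla p_h^{n+1})\, d\Omega_e + \sum_e \int_{\Omega_e} \tau_2\, \nabla \cdot \mathbf{u}_h^{n+1}\, \nabla \cdot \mathbf{v}_h\, d\Omega_e = 0, \\
R_p ={}& \int_\Omega q_h\, \nabla \cdot \mathbf{u}_h^{n+1}\, d\Omega + \sum_e \int_{\Omega_e} \tau_1\, (\mathbf{G}_u + \nabla p_h^{n+1}) \cdot \nabla q_h\, d\Omega_e = 0,
\end{aligned} \tag{14}$$
$\forall (\mathbf{v}_h, q_h) \in V_h \times Q_h$. In (14) the term $\mathbf{G}_u$ is given by
$$\mathbf{G}_u = \rho \left( \frac{\mathbf{u}_h^{n+1} - \mathbf{u}_h^{n}}{\Delta t} + \mathbf{u}_h^{n+1} \cdot \nabla \mathbf{u}_h^{n+1} - \mathbf{b}^{n+1} \right)$$
and the stabilization parameters are given by
$$\tau_1 = \frac{h_e^2}{4\mu + 2\rho\, h_e |\mathbf{u}_e|} \quad \text{and} \quad \tau_2 = \mu + 0.5\,\rho\, h_e |\mathbf{u}_e|,$$
where $h_e$ and $\mathbf{u}_e$ are the elemental length and velocity, respectively. Solving (14) accurately
requires the finite element space to be appropriately chosen to capture the
possible kinks or jumps that are expected in the solution.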
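For orientation, the two stabilization parameters can be evaluated per element as in the short sketch below. The function name and the sample values (a water-like fluid on a 1 cm element) are assumptions for illustration only.

```python
def stabilization_parameters(h_e, u_norm, rho, mu):
    """tau_1 = h_e^2 / (4 mu + 2 rho h_e |u_e|),  tau_2 = mu + 0.5 rho h_e |u_e|."""
    tau1 = h_e**2 / (4.0 * mu + 2.0 * rho * h_e * u_norm)
    tau2 = mu + 0.5 * rho * h_e * u_norm
    return tau1, tau2

# Hypothetical element data: h_e = 0.01, |u_e| = 1, rho = 1000, mu = 1e-3.
tau1, tau2 = stabilization_parameters(0.01, 1.0, 1000.0, 1e-3)
# Diffusion-dominated limit (zero velocity): tau_1 = h_e^2 / (4 mu), tau_2 = mu.
tau1_diff, tau2_diff = stabilization_parameters(0.01, 0.0, 1000.0, 1e-3)
```

Since both parameters depend on the density and viscosity, they change by orders of magnitude between the two fluids, so in cut elements they must be evaluated with the properties of the side being integrated.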
Fig. 4 Two-fluid hydrostatic flow with a jump in density. Linear elements cannot represent a kink in the pressure field inside an element. a Interface passing through the elements, b jump in density in the elements cut by the interface, c exact solution (dotted line) and FE solution (solid line)
When the interface cuts elements of the fixed mesh, the jump in density causes a
kink in the pressure field and therefore a jump in its gradient in the cut elements (see
Fig. 4). It is clear that for simple elements (triangles in 2D and tetrahedra in 3D)
a linear approximation of the pressure cannot represent the kink in the cut element
or, in the same way, the jump in the pressure gradient. This pressure field in the cut
elements does not belong to the standard pressure space made up of linear nodal
interpolations, and one way to capture it is to add an enrichment to the
pressure field of the cut elements. One such modification, proposed in [18], is to
interpolate the pressure as
$$p_h^e = \sum_{i}^{N_{node}} N_i^e\, p_i^e + N_{enr}^e\, p_{enr}^e,$$
where $N_{node}$ is the number of nodes per element and $N_{enr}$ is the new enrichment function
added only for the elements cut by the interface. This function has a constant gradient
on each side of the interface and is designed to be local to each element, and therefore
has zero value at the element nodes. Figure 5a shows a sketch of the enrichment
function for an element cut by the interface in the 2D case. Node $a$ belongs to
$\Omega^-$ and nodes $b$ and $c$ to $\Omega^+$. $N_{enr}$ is easily defined by the level set values at the
element nodes as
$$N_{enr} = 0.5 \left( -\left| \sum_{i}^{N_{node}} N_i^e\, \phi_i \right| + \sum_{i}^{N_{node}} N_i^e\, |\phi_i| \right).$$
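The properties claimed for this enrichment function (zero at the nodes, a kink on the interface, linear on each side) can be checked with a few lines. In the sketch below the P1 shape functions of a triangle are its barycentric coordinates; the function name `n_enr` and the nodal level set values are hypothetical.

```python
import numpy as np

def n_enr(bary, phi_nodes):
    """N_enr = 0.5 (sum_i N_i |phi_i| - |sum_i N_i phi_i|) for a linear simplex,
    where `bary` holds the barycentric coordinates (the P1 shape functions N_i)."""
    bary = np.asarray(bary, dtype=float)
    phi_nodes = np.asarray(phi_nodes, dtype=float)
    return 0.5 * (bary @ np.abs(phi_nodes) - abs(bary @ phi_nodes))

# Hypothetical nodal level set values: node a is negative, nodes b and c positive.
phi_nodes = [-0.3, 0.2, 0.5]
at_nodes = [n_enr(e, phi_nodes) for e in np.eye(3)]       # zero at a, b and c
on_interface = n_enr([0.4, 0.6, 0.0], phi_nodes)          # point where phi = 0 on edge a-b
inside_positive = n_enr([0.2, 0.4, 0.4], phi_nodes)       # equals N_a |phi_a| on the + side
```

On the positive side the function reduces to the shape function of node a times the magnitude of its level set value, hence the constant gradient. The maximum is attained on the interface, where the gradient changes.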
Fig. 5 Interface divides an element in 2D. a $N_{enr}$ proposed in [18] to capture the discontinuous pressure gradient in the cut elements. b Triangular sub-elements used for the numerical integration, with the integration points marked
Fig. 6 Enrichment shape functions proposed in [19] to capture a jump in the pressure field. Note that they are not continuous at the interface position $\Gamma$. a $N_{enr}^1$, b $N_{enr}^2$
In order to capture the discontinuities and take advantage of the enrichment functions
used, the integration rules need to be modified in the elements cut by the front. To this
end, each tetrahedral (triangular in 2D) element is divided into up to six tetrahedral
(three triangular in 2D) sub-elements. For each sub-element the same integration rule
as for the non-cut elements is used (see Fig. 5b). Further enrichments are needed
when surface tension is present or a viscosity jump arises at the interface. In this
case the pressure solution belongs to a space that is discontinuous at the interface.
Figure 6 shows one type of this enrichment for the 2D case, introduced in [19].
The pressure is enriched in the cut elements by one additional local pressure on each
side of the interface. This new space can capture a constant
solution on each side with a possible jump at the interface. Note that both enrichment
functions are zero at the nodes of the mesh. These enrichment functions have the
following form:
$$N_{enr}^1(\mathbf{x}) = (1 - S(\mathbf{x}))\, \chi^+(\mathbf{x}),$$
$$N_{enr}^2(\mathbf{x}) = S(\mathbf{x})\, \chi^-(\mathbf{x}),$$
where the function $S$ is given in terms of the standard P1 functions as
$$S = \sum_{i \in I^+} N_i^e,$$
with $I^+ = \{i \in I,\ \mathbf{x}_i \in \Omega_e^+\}$, and $\chi^+$ and $\chi^-$ the characteristic functions of the
positive and negative sides. Note that the additional shape functions are local to each
element crossed by the interface and can therefore be condensed prior to the assembly
to maintain the size of the system matrix and its graph.
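The statement that this space reproduces a constant on each side with a jump at the interface can be verified on a one-dimensional analogue. The sketch below assumes an element [0, 1] with node 0 on the negative side and node 1 on the positive side, so that S is simply the shape function of node 1. The function `enriched_pressure` and all numerical values are illustrative, not taken from [19].

```python
import numpy as np

def enriched_pressure(x, x_gamma, p0, p1, a, b):
    """1D element [0, 1]: node 0 on the negative side, node 1 on the positive side.
    p = p0 N_0 + p1 N_1 + a N1_enr + b N2_enr with
    N1_enr = (1 - S) chi^+,  N2_enr = S chi^-,  S = N_1 = x."""
    x = np.asarray(x, dtype=float)
    n0, n1 = 1.0 - x, x
    chi_plus = (x > x_gamma).astype(float)
    chi_minus = 1.0 - chi_plus
    S = n1
    return p0 * n0 + p1 * n1 + a * (1.0 - S) * chi_plus + b * S * chi_minus

# Target: pressure c_minus left of the interface and c_plus right of it.
c_minus, c_plus, x_gamma = 1.0, 3.0, 0.45
x = np.linspace(0.0, 1.0, 11)
p = enriched_pressure(x, x_gamma,
                      p0=c_minus, p1=c_plus,              # nodal values
                      a=c_plus - c_minus, b=c_minus - c_plus)  # enrichment values
```

With the nodal values equal to the two constants and the enrichment coefficients equal to plus and minus the jump, the interpolated pressure is piecewise constant, and both enrichment functions vanish at the two nodes.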
Fig. 7 Subdomains containing enriched, partially enriched and standard (non-enriched) elements
as well as the enriched nodes [23]
3.2 Global Enrichments
By global enrichment we mean those enrichment techniques that add new degrees
of freedom to the solution space, not condensed at the element level, and therefore
change the graph of the matrix. In this sense, the extended finite element method (XFEM)
is considered a global enrichment, although the enrichment (extension) is local to the
interface zone. One exception to our classification is the intrinsic XFEM method [20],
which does not introduce any additional unknowns. All other standard XFEM methods
developed for two-phase flows [21-24] add new DOFs to the system. As with
the local enrichment techniques, when XFEM is used for the simulation of two-phase
flows, several enrichment schemes can be employed: the velocity and/or pressure fields
may be enriched, and enrichments for kinks or jumps may be used. Numerical studies
in [24] reveal that it is not advisable to enrich the velocity approximation space, as
it does not improve the results significantly but may lead to severe convergence
problems. Furthermore, the required number of iterations for the solution of the
governing equations may increase considerably. On the other hand, the enrichment
of the pressure field is essential.
Figure 7 shows three types of subdomains that can be distinguished in an XFEM
enrichment scheme. The first one, $\Omega_{enr}$, is composed of the elements cut by the interface,
which have all their nodes enriched. The second subdomain, $\Omega_{penr}$, contains the
elements that have at least one, but not all, of their nodes enriched, and the last subdomain is the collection
of elements with standard degrees of freedom and without enrichment. All nodes
belonging to the elements cut by the interface are enriched. Let us define $I^{*}$ as the
collection of all enriched nodes $i$. The enrichment functions $M_i$ for these
nodes are written as [24]
$$M_i(\mathbf{x}, t) = N_i(\mathbf{x}, t)\,[\psi(\mathbf{x}, t) - \psi(\mathbf{x}_i, t)] \quad \forall i \in I^{*}$$
with $\psi(\mathbf{x}, t)$ the global enrichment function and $N_i(\mathbf{x}, t)$ the standard FE
shape function for node $i$. $M_i(\mathbf{x}, t)$ is the so-called shifted enrichment, which ensures
the Kronecker-$\delta$ property of the overall approximation.
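A global enrichment function that is typically chosen for strong discontinuities
(jump in pressure) is the sign enrichment:
$$\psi_{sign}(\mathbf{x}, t) = \mathrm{sign}(\phi(\mathbf{x}, t)) = \begin{cases} -1 & \text{if } \phi(\mathbf{x}, t) < 0 \\ 0 & \text{if } \phi(\mathbf{x}, t) = 0 \\ 1 & \text{if } \phi(\mathbf{x}, t) > 0 \end{cases}$$
where $\phi(\mathbf{x}, t)$ is the level set function. For weak discontinuities (kinks in the pressure
field), Moës et al. [25] proposed the abs-enrichment function, which has the important
property of being zero in the elements that are not cut by the interface (e.g. $\Omega_{penr}$ in Fig. 7). It is defined as
$$\psi(\mathbf{x}, t) = \psi_{abs}(\mathbf{x}, t) = \sum_{i}^{N_{node}} |\phi_i|\,N_i(\mathbf{x}, t) - \left|\sum_{i}^{N_{node}} \phi_i\,N_i(\mathbf{x}, t)\right|$$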
Note that this shape function is quite similar to the one proposed in [18] and shown
in Fig. 5. Figure 8 shows the enrichment shape functions $M_i$ for the weak and strong
discontinuities mentioned above in a quadrilateral element.
Note that, in general, enrichment shape functions with small support can occur
(when the interface passes very close to a node), which leads to an ill-conditioned system
matrix. Two approaches to treat this problem are found in [26, 27].
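The shifted enrichment and the two global enrichment functions of this section can be illustrated on a single two-node linear element. The following is a minimal sketch under stated assumptions, not the implementation used in [24] or in KRATOS: the 1D reference element, the nodal level set values and all function names are hypothetical and chosen only for illustration.

```python
def shape_functions(xi):
    """Linear (P1) shape functions on the reference element [0, 1]."""
    return [1.0 - xi, xi]


def level_set(xi, phi_nodes):
    """Interpolated level set value from the nodal values phi_nodes."""
    N = shape_functions(xi)
    return sum(p * n for p, n in zip(phi_nodes, N))


def psi_sign(xi, phi_nodes):
    """Sign enrichment: -1, 0 or +1 depending on the sign of the level set."""
    phi = level_set(xi, phi_nodes)
    return (phi > 0) - (phi < 0)


def psi_abs(xi, phi_nodes):
    """Abs-enrichment: sum(|phi_i| N_i) - |sum(phi_i N_i)|."""
    N = shape_functions(xi)
    return sum(abs(p) * n for p, n in zip(phi_nodes, N)) - abs(level_set(xi, phi_nodes))


def shifted_enrichment(i, xi, phi_nodes, psi):
    """Shifted enrichment M_i = N_i * (psi(x) - psi(x_i)) for node i (0 or 1)."""
    node_xi = [0.0, 1.0]  # reference coordinates of the two nodes
    N = shape_functions(xi)
    return N[i] * (psi(xi, phi_nodes) - psi(node_xi[i], phi_nodes))
```

With the illustrative nodal values `phi_nodes = [-0.3, 0.7]` the interface lies at `xi = 0.3`. Both shifted enrichments vanish at the two nodes, which is the Kronecker-δ property mentioned above, while `psi_abs` has its kink at the interface.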
4 Application to Mold Filling Process
Simulation of the mold filling process is very complex, not only because of the variety
of phenomena involved but also because of the strong interactions between
them. Fluid flow, heat transfer and phase change take place at the same time
and with a complex coupling pattern. Different defects may arise during the mold
filling stage, such as air entrapment, slag inclusion, formation of cold shuts and cold laps,
and solidification. Heat transfer with the mold and the air can dramatically change
the material properties and therefore the course of the flow. It may cause
solidification at some parts of the material front while in other zones the material is still
Fig. 8 Enrichment functions proposed by the XFEM method [23, 24] for a–d weak discontinuities ($\psi = \psi_{abs}$) and e–h strong discontinuities ($\psi = \psi_{sign}$). The interface is considered to pass through the middle of the element. a $M_1^{abs}$, b $M_2^{abs}$, c $M_3^{abs}$, d $M_4^{abs}$, e $M_1^{sign}$, f $M_2^{sign}$, g $M_3^{sign}$, h $M_4^{sign}$
expanding. Cold shuts arise when the metal flowing in a section solidifies and the
flow is blocked before the region is completely filled. Cold laps are formed when
the metal stream separates and subsequently joins in a region where the metal has
already solidified. Both cold shuts and cold laps are formed due to excessive heat
loss from the flowing metal to the mold.
The flow regime in the filling stage is turbulent in most cases. Standard turbulence
models require a very fine grid near the mold walls, and it is rather impractical to
achieve such a fine grid in all sections of a complex mold. Hence, only laminar
flow computations are usually done for mold filling predictions. Some authors have
used upwinding algorithms and/or turbulence models to obtain convergent results on
practically feasible grids. These models effectively increase the viscosity of the fluid
and make it possible to achieve stable solutions. The other important aspect of the filling
process is the boundary condition. If no-slip boundary conditions are enforced along
the wall, the predicted filling pattern is unrealistic, as the metal flowing adjacent to the
wall tends to stick to the wall itself. This can be understood by noting that the convective velocity
in the level set Eq. (1) becomes zero at these nodes. A slip boundary condition is
therefore prescribed for the fluid all along the mold wall. This may be seen as a simple
turbulence model and is feasible for complex mold geometries; the velocity profile
obtained in thin sections is almost uniform across the cross-section and thus mimics
the turbulent flow profile. The near-wall (viscous sublayer) region is not included in
the flow computation, as in any other high Reynolds number turbulent flow model.
Instead, tangential traction forces can be prescribed on the slip boundaries to model
the effect of wall friction. This stress can be tuned by empirical factors to mimic
the friction of the mold wall material and therefore obtain a more realistic flow pattern. For the high
pressure die casting process, in which the mold is usually made of steel and has less
friction, this scaling factor is chosen much smaller than for gravity die casting, in which
the mold is made of sand and is therefore much rougher.
Fig. 9 Filling of a turbine blade. Red color shows the aluminum-air interface at different instances.
The inlet velocity is 0.3 m/s and the blade is filled from the bottom. a t= 0.2 s, b t= 4 s, c t= 6 s, d
t= 9 s, e t= 12 s, f t= 15 s
Another important aspect in mold filling simulation is the air exits. It is known that
in sand molds the porous texture of the mold provides air exits. In a two-fluid
simulation in which the air inside the mold is modeled as incompressible, it is essential to
provide holes or exits in the mold wall to vent the air as the metal fills the cavity. This
is particularly important when coarse grids are used. Otherwise, air tends to
recirculate in thin sections, corners or closed zones and prevents the metal from reaching
those regions, so that unrealistically large pockets of air are entrapped. Once the
fluid has reached the vent zones, they are considered closed to avoid excessive
loss of material.
Figure 9 shows various instants during the filling of a turbine blade. The level set method is
used to detect the interface position and the two-fluid multi-scale stabilized incompressible
formulation (14) is used. Among the different enrichments presented in
Sect. 3, the discontinuous pressure gradient is applied. The filling material is aluminum
with a density of 2600 kg/m³ and the air has a density of 1 kg/m³. In order
to avoid unnecessary jumps in pressure, the same dynamic viscosity, $\mu$, equal to 1e−5 kg/(m s), is considered for both fluids. The boundaries are slip and a friction force of the following form is
applied in the direction opposite to the velocity at the boundary nodes,
$$\mathbf{f}_{wall} = -C\,\rho\,|\mathbf{u}|\,(\mathbf{I} - \mathbf{n} \otimes \mathbf{n})\,\mathbf{u}$$
Fig. 10 Post-filling solidification of a mechanical part. Red zones are the liquid zones and time is
measured from the moment that filling is completed. a t = 7 s. b t = 12 s. c t = 14 s. d t = 16 s.
e t = 19 s. f t = 25 s
Here $C$ is a constant chosen in the range 0.005 to 0.05, $\mathbf{u}$ is the velocity at the
Gauss point and $\mathbf{n}$ is the exterior normal. For the mixed elements, the density $\rho$ is taken as that of the
metal.
One of the major components of the post-filling simulation is the solidification
analysis. Solidification is accompanied by the release of latent heat at the solid-liquid
interface. Heat transfer between the filling material and the mold causes cooling from
the liquid state until the material begins to solidify at the liquidus temperature, $T_l$, and
solidifies completely at the solidus temperature, $T_s$. The solid and liquid phases
are separated by a transition mixture region called the mushy zone. One way to model
the solidification process is the effective specific heat method, in which the energy
equation is written in terms of an effective specific heat $\hat{c}$:
$$\rho\,\hat{c}\,\frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T)$$
where $T$ and $k$ are the temperature and the thermal conductivity, respectively. The parameter
$\hat{c}$ is the slope of the enthalpy-temperature curve and is equal to the specific heat, $c$,
in the solid and liquid regions. In the mushy zone its value is given by
$$\hat{c} = c - L\,\frac{d f_s}{dT}$$
where $L$ is the latent heat and $f_s$ the solid fraction. When the solid fraction
is assumed to vary linearly with temperature (i.e. $f_s = (T_l - T)/(T_l - T_s)$), the value
of $\hat{c}$ is constant in the mushy zone, as given below:
$$\hat{c} = c + \frac{L}{T_l - T_s}$$
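The effective specific heat with a linear solid fraction can be sketched in a few lines. This is only an illustration of the formulas above, not the KRATOS implementation; the function names and the numerical values in the usage note are assumptions.

```python
def solid_fraction(T, T_s, T_l):
    """Linear solid fraction: 1 below the solidus, 0 above the liquidus."""
    if T >= T_l:
        return 0.0
    if T <= T_s:
        return 1.0
    return (T_l - T) / (T_l - T_s)


def effective_specific_heat(T, c, L, T_s, T_l):
    """Effective specific heat c - L * d(f_s)/dT for the linear solid fraction.

    In the mushy zone d(f_s)/dT = -1 / (T_l - T_s), so the latent heat L is
    released uniformly over the interval; elsewhere the plain value c is used.
    """
    if T_s < T < T_l:
        return c + L / (T_l - T_s)
    return c
```

For example, with the purely illustrative values `c = 900.0`, `L = 88000.0`, `T_s = 542.0` and `T_l = 630.0`, the effective specific heat inside the mushy zone is `900.0 + 88000.0 / 88.0 = 1900.0`, and integrating the excess over the mushy interval recovers the latent heat `L`.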
Figure 10 shows the post-filling solidification process of a mechanical part.
The liquidus temperature, $T_l$, and the solidus temperature, $T_s$, are taken as 630 and 542 °C,
respectively. The latent heat varies linearly with temperature between
2000 and 3000 kJ/kg. The volume of the mechanical part is 215 cm³ and it solidifies
in 35 s.
Our formulation is implemented in KRATOS [28, 29], the free, open-source parallel multi-physics software
platform developed at CIMNE.
5 Conclusion
The main challenge in applying the level set method to capture the interface
is mass conservation during both the convection and the reinitialization steps. To
treat this problem, different level set techniques have been developed. The particle
level set method (PLS) and the coupled level set/volume-of-fluid method (CLSVOF)
have been mainly developed for structured quadrilateral meshes and work quite
efficiently on this kind of mesh. On the other hand, the geometric mass-preserving
redistancing method has been developed for both structured and unstructured meshes and
is particularly useful when the redistancing error is dominant, i.e. for fine meshes
and small $\Delta t$. The discontinuous Galerkin level set method (DGLSM) is quadrature-free,
works on structured and unstructured meshes and, as the polynomial order
increases ($p > 4$), its superiority over the other methods is evident even on very coarse
meshes (see Table 1). The convective velocity in the level set equation comes
from the solution of the Navier-Stokes equations. When linear elements are used,
enrichment is necessary at the interface to improve and stabilize the velocity-pressure
pairs there. Various enrichment techniques have been developed to treat
kinks or discontinuities due to the jump in material properties at the element level. We
presented them as local and non-local, though both add enrichments in
the interface zone. Local enrichments add DOFs that can be condensed at the element
level and therefore do not change the graph of the global matrices as the interface
moves, whereas non-local enrichments (XFEM) add global DOFs that change the graph
of the global matrices and can be quite costly for practical applications. In both
techniques, stability issues have to be taken into account when the
interface passes very close to the element nodes.
References
1. Osher S, Shu C-W (1991) High-order essentially nonoscillatory schemes for Hamilton-Jacobi equations. SIAM J Numer Anal 28(4):907–922
2. Osher S, Fedkiw R (2003) Level set methods and dynamic implicit surfaces, vol 153. Springer, New York
3. Sussman M, Fatemi E (1999) An efficient, interface-preserving level set redistancing algorithm and its application to interfacial incompressible fluid flow. SIAM J Sci Comput 20(4):1165–1191
4. Sethian JA (1999) Fast marching methods. SIAM Rev 41(2):199–235
5. Enright D, Fedkiw R, Ferziger J, Mitchell I (2002) A hybrid particle level set method for improved interface capturing. J Comput Phys 183(1):83–116
6. Sussman M, Puckett EG (2000) A coupled level set and volume-of-fluid method for computing 3D and axisymmetric incompressible two-phase flows. J Comput Phys 162(2):301–337
7. Sussman M (2003) A second order coupled level set and volume-of-fluid method for computing growth and collapse of vapor bubbles. J Comput Phys 187(1):110–136
8. Mut F, Buscaglia GC, Dari EA (2004) A new mass-conserving algorithm for level set redistancing on unstructured meshes. Mecanica Computacional 23:1659–1678
9. Ausas RF, Dari EA, Buscaglia GC (2011) A geometric mass-preserving redistancing scheme for the level set function. Int J Numer Meth Fluids 65(8):989–1010
10. Marchandise E, Remacle J-F, Chevaugeon N (2006) A quadrature-free discontinuous Galerkin method for the level set equation. J Comput Phys 212(1):338–357
11. Idelsohn S, Mier-Torrecilla M, Oñate E (2009) Multi-fluid flows with the particle finite element method. Comput Methods Appl Mech Eng 198(33):2750–2767
12. Kamran K, Rossi R, Oñate E, Idelsohn SR (2012) A compressible Lagrangian framework for the simulation of the underwater implosion of large air bubbles. Comput Methods Appl Mech Eng 225(1):210–225
13. Bonet J, Kulasegaram S (2000) Correction and stabilization of smooth particle hydrodynamics methods with applications in metal forming simulations. Int J Numer Meth Eng 47(6):1189–1214
14. Sunitha N, Jansen KE, Lahey RT Jr (2005) Computation of incompressible bubble dynamics with a stabilized finite element level set method. Comput Methods Appl Mech Eng 194(42):4565–4587
15. Kees CE, Akkerman I, Farthing MW, Bazilevs Y (2011) A conservative level set method suitable for variable-order approximations and unstructured meshes. J Comput Phys 230(12):4536–4558
16. Rossi R, Larese A, Dadvand P, Oñate E (2012) An efficient edge-based level set finite element method for free surface flow problems. Int J Numer Methods Fluids 33:737–766
17. Sussman M, Smereka P, Osher S (1994) A level set approach for computing solutions to incompressible two-phase flow. J Comput Phys 114(1):146–159
18. Coppola-Owen AH, Codina R (2005) Improving Eulerian two-phase flow finite element approximation with discontinuous gradient pressure shape functions. Int J Numer Methods Fluids 49(12):1287–1304
19. Ausas RF, Buscaglia GC, Idelsohn SR (2012) A new enrichment space for the treatment of discontinuous pressures in multi-fluid flows. Int J Numer Methods Fluids 70(7):829–850
20. Fries T-P, Belytschko T (2006) The intrinsic XFEM: a method for arbitrary discontinuities without additional unknowns. Int J Numer Meth Eng 68(13):1358–1385
21. Chessa J, Belytschko T (2003) An extended finite element method for two-phase fluids: flow simulation and modeling. J Appl Mech 70(1):10–17
22. Groß S, Reusken A (2007) An extended pressure finite element space for two-phase incompressible flows with surface tension. J Comput Phys 224(1):40–58
23. Rasthofer U, Henke F, Wall WA, Gravemeier V (2011) An extended residual-based variational multiscale method for two-phase flow including surface tension. Comput Methods Appl Mech Eng 200(21):1866–1876
24. Sauerland H, Fries T-P (2011) The extended finite element method for two-phase and free-surface flows: a systematic study. J Comput Phys 230(9):3369–3390
25. Moës N, Cloirec M, Cartraud P, Remacle J-F (2003) A computational approach to handle complex microstructure geometries. Comput Methods Appl Mech Eng 192(28):3163–3177
26. Reusken A (2008) Analysis of an extended pressure finite element space for two-phase incompressible flows. Comput Vis Sci 11(4–6):293–305
27. Béchet E, Minnebo H, Moës N, Burgardt B (2005) Improved implementation and robustness study of the X-FEM for stress analysis around cracks. Int J Numer Meth Eng 64(8):1033–1056
28. Dadvand P, Rossi R, Oñate E (2010) An object-oriented environment for developing finite element codes for multi-disciplinary applications. Arch Comput Methods Eng 17(3):253–297
29. Dadvand P, Rossi R, Gil M, Martorell X, Cotela J, Juanpere E, Idelsohn SR, Oñate E (2012) Migration of a generic multi-physics framework to HPC environments. Comput Fluids 80:301–309
Recent Advances in the Particle Finite
Element Method Towards More Complex Fluid
Flow Applications
Norberto M. Nigro, Juan M. Gimenez and Sergio R. Idelsohn
Abstract This paper presents the state of the art of the Particle Finite Element Method
(PFEM), with emphasis on the new ideas aimed not only at extending its application
to fluid-structure interaction and multifluid problems but also at
shortening the gap between engineering design times and computational
simulation times for general problems in which Eulerian formulations have typically been
chosen. To keep the long history of the method brief, the starting point here is
the reformulation of the method to solve academic and real problems in real
time, or at least in drastically reduced computational times. The main topics of
this paper are the stability and accuracy of Lagrangian formulations
compared with their Eulerian counterparts, shown through several academic benchmarks, and a
detailed analysis of the efficiency, which reveals that the original method needs some new
features. The former led to a new integration method called X-IVAS and the
latter produced a new version of the method called PFEM in Fixed Mesh. Once
the method had shown its good performance and the impact of the new features on
the final efficiency, the latest developments extended the application
of this new method to multifluids and other complex fluid mechanics problems such as
turbulence and reactive flows.
N. M. Nigro (B) J. M. Gimenez

Centro de Investigacion en Metodos Computacionales (CIMEC), Consejo Nacional de
Investigaciones Cientificas y Tecnicas (CONICET), Santa Fe, Argentina
N. M. Nigro
Centro de Investigacion en Metodos Computacionales (CIMEC), Universidad Nacional del Litoral
(UNL), Santa Fe, Argentina
S. R. Idelsohn
Centre Internacional de Mètodes Numèrics en Enginyeria (CIMNE), Institució Catalana de
Recerca i Estudis Avançats (ICREA), Barcelona, Spain
S. R. Idelsohn
Centro de Investigaciones en Mecánica Computacional (CIMEC), Universidad Nacional
del Litoral (UNL), Santa Fe, Argentina

1 Introduction
Standard formulations for the solution of the incompressible Navier-Stokes equations
may be split into two classes depending on the approach chosen for the description
of the inertial terms, namely Eulerian and Lagrangian approaches. In the first class,
the acceleration is described as the sum of the local (partial time) derivative of the velocity and
the convective term. In the second approach, the acceleration is simply described as
the total derivative of the velocity. Over the last 30 years, computer simulation of
incompressible flows has been mainly based on the Eulerian formulation (see [18]
for references on this topic). However, with this formulation it is still difficult to
analyze large 3D problems in which the shape of the free surfaces or internal interfaces
changes continuously, or fluid-structure interactions in which complex contact
problems are involved. In all these problems the computing time is sometimes so high
that it makes the method impractical. In the last few years, several solutions using the
Lagrangian formulation to solve the compressible and incompressible Navier-Stokes
equations have been developed [3, 9, 11]. The advantages of these solutions for
problems with free surfaces or multi-fluids with complicated internal interfaces have
been demonstrated [15]. In general, these formulations are more expensive than
standard Eulerian approaches when used for homogeneous flows, but they justify
their popularity in solving free-surface flows or complicated multi-fluid flows, for
which the standard Eulerian formulations are inaccurate or sometimes impossible
to use [15].
When attempting to compare the efficiency of Eulerian codes against Lagrangian
ones, the conclusions were not so definite. Even though Lagrangian solvers are simpler
than Eulerian ones, the very small time steps normally employed in the former have
restricted their application to specific examples.
Only a few attempts in the past considered using a Lagrangian formulation for homogeneous
fluid flow. Among these few contributions we can mention the work
of Jos Stam [23]. In this work the author solves the Navier-Stokes equations in
the context of video games, leaving the message that it is possible to design simpler
numerical methods that may be applied in this challenging context, where efficiency
is the key point.
One of the reasons why there are so few works using Lagrangian methods for homogeneous
fluid flow applications may be the significant computational cost involved
in remeshing. Lagrangian solvers are principally based on moving particles,
after which some sort of mesh has to be built, in a way that depends on the specific method.
PFEM is one of the most popular Lagrangian methods, with the particular
feature that the moving particles define a mesh on which the discrete equations are solved.
Its origins go back to the early 2000s. For brevity, its state of the art is omitted
here. Readers interested in the basis of the method may see http://www.cimne.com/pfem/,
where most of the publications on the method can be found. Among the most
cited publications we can mention [3, 12, 13, 19].
This feature obliged the method to deform the mesh up to a certain limit, beyond which, for
geometric reasons, some sort of remeshing had to be done. As the remeshing was
limited to some special time intervals, the deforming mesh added another ingredient
to the time step selection: avoiding mesh inversion. This severe limitation, together
with those imposed by the non-linearities and those inherent to explicit schemes,
made the efficiency of the original PFEM a serious problem. Lately the method has evolved
thanks to the progress made in parallel mesh generation and remeshing, which alleviates this
serious limitation to some extent.
Despite these limitations and the large community that normally employs
Eulerian codes, the Lagrangian formulation has some attractive features that need to
be reviewed here.
One of the most important is the absence of the convective term in the balance
equations, which converts the non-symmetric equations into symmetric and positive
definite ones. For the Navier-Stokes equations this has a by-product: the
originally non-linear momentum equation becomes linear. These two facts avoid the use of
stabilization terms, with the important consequence of not adding the typical numerical
diffusion needed to stabilize the convective term. Without convective terms, for constant coefficient
problems such as laminar and homogeneous fluid flow, and also for DNS (Direct
Numerical Simulation), the system matrices may be factorized at the beginning and
reused in all time steps, with an important saving in CPU time. For convection
dominated flows the time step in Eulerian formulations needs to be limited for reasons of
non-linearity and stability. In contrast, Lagrangian formulations
do not suffer from this inconvenience when the equations are integrated with good
accuracy. This is a key point that deserves much more attention.
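The remark about factorizing the system matrix once and reusing it in every time step can be sketched on a small model problem. The example below is an assumption-laden illustration, not PFEM code: it uses a 1D implicit diffusion step with homogeneous Dirichlet boundaries, a symmetric positive definite tridiagonal matrix, and hypothetical function names; `r` stands for the (assumed) ratio of diffusivity times time step to the squared mesh size.

```python
import math


def cholesky_tridiagonal(diag, off):
    """Cholesky factor A = L L^T of an SPD tridiagonal matrix.

    diag holds the main diagonal and off the sub/super diagonal of A;
    the diagonal and sub-diagonal of L are returned.
    """
    n = len(diag)
    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    l_diag[0] = math.sqrt(diag[0])
    for i in range(1, n):
        l_off[i - 1] = off[i - 1] / l_diag[i - 1]
        l_diag[i] = math.sqrt(diag[i] - l_off[i - 1] ** 2)
    return l_diag, l_off


def solve_with_factor(l_diag, l_off, rhs):
    """Solve L L^T x = rhs by forward and backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    y[0] = rhs[0] / l_diag[0]
    for i in range(1, n):
        y[i] = (rhs[i] - l_off[i - 1] * y[i - 1]) / l_diag[i]
    x = [0.0] * n
    x[-1] = y[-1] / l_diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - l_off[i] * x[i + 1]) / l_diag[i]
    return x


def run_implicit_diffusion(T0, r, n_steps):
    """Backward-Euler steps (I + r K) T^{n+1} = T^n with K = tridiag(-1, 2, -1)."""
    n = len(T0)
    diag = [1.0 + 2.0 * r] * n
    off = [-r] * (n - 1)
    l_diag, l_off = cholesky_tridiagonal(diag, off)  # factorized only once
    T = list(T0)
    for _ in range(n_steps):
        T = solve_with_factor(l_diag, l_off, T)  # factor reused in every step
    return T
```

Because the matrix has no convective contribution, it does not change from step to step, so each step costs only the two substitutions. In an Eulerian formulation with a velocity-dependent convective term the matrix would in general have to be rebuilt and refactorized.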
In particular, PFEM has evolved considerably over the past few years, incorporating
new ideas that seek to enlarge the time steps in a stable and accurate way.
In this sense PFEM has incorporated a novel time integration scheme called
X-IVS and its extension X-IVAS. This form of integration, based on following the
streamlines of the flow in the present time step, is to some extent a better way to handle
the non-linearities of the flow equations.
In this way it is possible to solve complex flow situations with
significantly larger time steps.
On the other hand, the fact that the information is carried by the particles, with the mesh used
only for computing secondary fields, gives the method high accuracy.
Therefore the goals of accuracy, robustness (stability) and efficiency are significantly
improved by these new ideas, which are included in the latest version of PFEM, called
Fixed Mesh PFEM.
One of the main targets of this work is to show that Lagrangian formulations are not
only valuable for solving heterogeneous fluid flows with free surfaces. We will show, on
the contrary, that even for homogeneous fluid flows without free surfaces or internal
interfaces they are able to yield accurate solutions while being competitively fast
compared with state-of-the-art Eulerian solvers. Another goal of this paper is
to update the state of the art of PFEM, combining some foundations published before [6, 7, 10,
11, 22] with new findings that reveal further attractive features of the method,
which may make it a competitive tool for high performance computations in the future.
The paper starts with a mathematical review of the problems to be treated, writing
them in both an Eulerian and a Lagrangian formalism.
Next, the time integration schemes are presented, which makes it possible to understand
the novelty introduced by X-IVS and X-IVAS.
This is followed by a section dedicated to two examples that served as inspiration
for the development of new ideas which were then applied to PFEM. These
examples illustrate the benefits of using Lagrangian solvers: while
solving them with Eulerian codes requires many numerical artifacts, they are trivially
solved by Lagrangian ones. The next section presents the two versions of PFEM.
The first, called the Mobile Mesh version, is an extension of the original PFEM with
permanent remeshing and the X-IVAS time integration scheme included. After showing its
pros and cons, the rest of the section is devoted to the novel idea of mixing two
viewpoints, one based on particles and the other based on a fixed background mesh.
This idea allowed the efficiency to be increased very significantly. Even though
some earlier attempts had been made to use the duality of particles and mesh, for
example [8], at the time the idea was designed the authors were not aware of this
work; moreover, the two ideas have only a few things in common.
The next section presents some details about the Fixed Mesh version of PFEM:
how to manage the particle inventory and how to share the information between particles
and mesh. It is followed by a section that focuses on the treatment of the
diffusive terms. Contrary to what may be assumed a priori, the Lagrangian approach
has proved superior to the Eulerian one in terms of precision, even though this part of the
calculation is of Eulerian nature. The last section shows some examples
solved numerically by PFEM, which demonstrate that in its present state
PFEM is able to solve turbulent flows, multifluid and multiphase flows, and general
multiphysics problems, among others. Finally, some conclusions are included.
2 X-IVAS: A New Integration Method to Enhance Accuracy and Stability
In this section the emphasis is placed on the main features that give PFEM its
advantages over other methods. In general, the problem of interest
is the general transport equation, which is widespread in engineering
applications. Both the passive scalar transport equation and the incompressible flow
represented by the Navier-Stokes equations will be considered in the rest of the paper.
In order to understand the evolution of PFEM, the Eulerian and Lagrangian formulations
are introduced first.
2.1 Scalar Transport Equation
In the Eulerian framework a fixed coordinate system is considered as the reference
for the physical quantities. The scalar transport equation is written as:
$$\frac{\partial T}{\partial t} + \nabla \cdot (\mathbf{v} T) = \nabla \cdot (\alpha \nabla T) + Q \qquad (1)$$
where $T(\mathbf{x}, t)$ is the dependent variable (passive scalar), for example the temperature,
$\mathbf{v}$ is the velocity vector, which is given data for this problem, and $\alpha$ is the diffusivity, with $\nabla \cdot$
the divergence operator, $\nabla$ the gradient operator and $\partial / \partial t$ the temporal derivative. In
this problem $\mathbf{x}$ represents a fixed coordinate.
Normally this equation is rewritten in the following form:
$$\frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T = \nabla \cdot (\alpha \nabla T) + Q - (\nabla \cdot \mathbf{v})\,T \qquad (2)$$
where the first order derivative is split into two terms, one for the convective transport
and the other for the source term generated by a velocity field that is not divergence-free.
Normally an incompressible flow is divergence-free, and in this case this
source term vanishes.
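As a brief check of how Eq. (2) follows from Eq. (1), the convective term can be expanded with the product rule for the divergence of a scalar times a vector:

```latex
\nabla \cdot (\mathbf{v}\,T) = \mathbf{v} \cdot \nabla T + (\nabla \cdot \mathbf{v})\,T
```

Substituting this into Eq. (1) and moving the term $(\nabla \cdot \mathbf{v})\,T$ to the right hand side gives Eq. (2). For a divergence-free velocity field, $\nabla \cdot \mathbf{v} = 0$, that term is identically zero and the conservative and non-conservative forms coincide.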
On the other hand, in the Lagrangian framework the same problem is written as:
$$\frac{DT}{Dt} = \nabla \cdot (\alpha \nabla T) + Q \qquad (3)$$
where $\frac{DT}{Dt} = \frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T$ is the material derivative. The convective term works like
a variable transformation between the quantity measured in a fixed coordinate system and
that measured in the moving coordinate system that travels with the fluid velocity
$\mathbf{v}$. In this transformation the velocity field is incorporated in the dependent variable
itself, the unknown being $T = T(\mathbf{x}_p, t)$ with $\mathbf{x}_p$ the location of a fluid
parcel included within the material volume. This location is itself another
unknown, so an additional equation must be solved:
$$\frac{D\mathbf{x}_p}{Dt} = \mathbf{v} \qquad (4)$$
Finally, the problem in the Lagrangian framework is:
$$\frac{DT}{Dt} = \nabla \cdot (\alpha \nabla T) + Q, \qquad \frac{D\mathbf{x}_p}{Dt} = \mathbf{v} \qquad (5)$$
2.2 Incompressible Viscous Fluid Flow: Navier-Stokes Equations
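The other problem that deserves special attention in this paper is the fluid dynamics
of an incompressible viscous flow. It is well known that this problem is
governed by the Navier-Stokes equations, which express the balance of linear
momentum together with the continuity equation (mass balance).
Both equations, written together in an Eulerian framework, read:
$$\begin{aligned} \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) &= 0 \\ \frac{\partial (\rho \mathbf{v})}{\partial t} + \nabla \cdot (\rho\, \mathbf{v} \otimes \mathbf{v}) &= \nabla \cdot (\boldsymbol{\sigma}) + \mathbf{F} \end{aligned} \qquad (6)$$
where $\boldsymbol{\sigma}$ is the stress tensor, whose definition may be split into two parts: the
spherical (isotropic) component, proportional to the fluid pressure, and
the deviatoric or viscous component, normally written as $\boldsymbol{\tau}$. The operator $\otimes$ denotes
the tensor or dyadic product between two vectors, $\mathbf{F}$ represents the external force,
for example gravity, and finally $\rho$ is the density. For incompressible flows the
density is constant; therefore, the continuity equation becomes a constraint on the
velocity field:
$$\nabla \cdot (\mathbf{v}) = 0 \qquad (7)$$
Applying the above restriction also in the linear momentum equation produces a
simplified, non-conservative version:
$$\begin{aligned} \nabla \cdot (\mathbf{v}) &= 0 \\ \rho \left( \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) &= \nabla \cdot (\boldsymbol{\sigma}) + \mathbf{F} \end{aligned} \qquad (8)$$
For the Lagrangian formulation the above system is written as:
$$\begin{aligned} \nabla \cdot (\mathbf{v}) &= 0 \\ \rho\,\frac{D\mathbf{v}}{Dt} &= \nabla \cdot (\boldsymbol{\sigma}) + \mathbf{F} \\ \frac{D\mathbf{x}_p}{Dt} &= \mathbf{v} \end{aligned} \qquad (9)$$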
2.3 Time Integration
In this section the time integration of both frameworks, the Eulerian and the
Lagrangian is presented.
For simplicity the scalar transport equation is chosen first leaving for some special
topics the extension to the vector equation system governing the fluid dynamics of
one phase incompressible viscous flow.
2.3.1 Scalar Transport Equation
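For the Eulerian framework represented by Eq. (2), the integration is normally done as
$$\begin{aligned}
\int_{t^n}^{t^{n+1}} \frac{\partial T}{\partial t}\,dt &= \int_{t^n}^{t^{n+1}} \left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right) dt \\
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) &= \int_{t^n}^{t^{n+1}} \left(-\mathbf{v} \cdot \nabla T(\mathbf{x}) + \nabla \cdot (\alpha \nabla T(\mathbf{x})) + Q(\mathbf{x}, t)\right) dt \\
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) &= \left(-\mathbf{v} \cdot \nabla T(\mathbf{x}) + \nabla \cdot (\alpha \nabla T(\mathbf{x})) + Q(\mathbf{x}, t)\right)^{n+\theta} \Delta t
\end{aligned} \qquad (10)$$
For some $\theta \in [0, 1]$ the last expression in (10) gives the exact solution. As this
parameter is unknown and problem dependent, some fixed values of $\theta$ are adopted:
$\theta = 0$ for explicit schemes, $\theta = 1$ for implicit schemes and $\theta = \frac{1}{2}$ for Crank-Nicolson,
among others.
$$\begin{aligned}
\left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right)^{n+\theta} ={}& \theta \left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right)^{n+1} \\
&+ (1 - \theta) \left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right)^{n}
\end{aligned} \qquad (11)$$
Replacing (11) in (10):
$$\begin{aligned}
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) ={}& \theta \left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right)^{n+1} \Delta t \\
&+ (1 - \theta) \left(-\mathbf{v} \cdot \nabla T + \nabla \cdot (\alpha \nabla T) + Q\right)^{n} \Delta t
\end{aligned} \qquad (12)$$
The right hand side in (12) is evaluated using only the information at the nodal
point $\mathbf{x}$ at the two extremes of the time interval, $t^n$ and $t^{n+1} = t^n + \Delta t$.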
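The effect of the parameter in the weighted scheme of Eq. (12) can be illustrated on a scalar model problem. The sketch below is an assumption made for illustration only: the spatial operator is replaced by the linear decay term of $dT/dt = -\lambda T$, for which the implicit part can be solved in closed form; the function names are hypothetical.

```python
import math


def theta_step(T_n, lam, dt, theta):
    """One weighted step for dT/dt = -lam * T.

    T^{n+1} - T^n = theta * (-lam * T^{n+1}) * dt + (1 - theta) * (-lam * T^n) * dt,
    solved for T^{n+1}.
    """
    return T_n * (1.0 - (1.0 - theta) * lam * dt) / (1.0 + theta * lam * dt)


def integrate(T0, lam, dt, n_steps, theta):
    """Apply n_steps weighted steps starting from T0."""
    T = T0
    for _ in range(n_steps):
        T = theta_step(T, lam, dt, theta)
    return T


def exact_solution(T0, lam, t):
    """Exact solution of the model problem, used as the reference."""
    return T0 * math.exp(-lam * t)
```

Running `integrate(1.0, 1.0, 0.1, 10, theta)` for `theta` equal to 0, 1 and 0.5 and comparing with `exact_solution(1.0, 1.0, 1.0)` shows the explicit and implicit choices bracketing the exact value, with the Crank-Nicolson choice closest to it for this problem.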
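For the Lagrangian framework a similar integration scheme is applied:
$$\begin{aligned}
\int_{t^n}^{t^{n+1}} \frac{DT}{Dt}\,dt &= \int_{t^n}^{t^{n+1}} \left(\nabla \cdot (\alpha \nabla T) + Q\right)^{t} dt \\
\int_{t^n}^{t^{n+1}} \frac{D\mathbf{x}_p}{Dt}\,dt &= \int_{t^n}^{t^{n+1}} \mathbf{v}^{t}\,dt \\
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) &= \int_{t^n}^{t^{n+1}} \left(\nabla \cdot (\alpha \nabla T) + Q\right)^{t} dt \\
\mathbf{x}_p^{n+1} - \mathbf{x}_p^{n} &= \int_{t^n}^{t^{n+1}} \mathbf{v}^{t}\,dt
\end{aligned} \qquad (13)$$
In a straightforward way it is possible to apply (11) in (13), producing the following
standard Lagrangian form:
$$\begin{aligned}
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) &= \int_{t^n}^{t^{n+1}} \left(\nabla \cdot (\alpha \nabla T) + Q\right)^{t} dt \\
&= \theta \left(\nabla \cdot (\alpha \nabla T) + Q\right)^{n+1} \Delta t + (1 - \theta) \left(\nabla \cdot (\alpha \nabla T) + Q\right)^{n} \Delta t \\
\mathbf{x}_p^{n+1} - \mathbf{x}_p^{n} &= \theta\,\mathbf{v}^{n+1} \Delta t + (1 - \theta)\,\mathbf{v}^{n} \Delta t
\end{aligned} \qquad (14)$$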
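The standard Lagrangian update of Eq. (14) can be sketched for a single particle. The example below rests on several assumptions made only for illustration: a prescribed rigid-rotation velocity field $\mathbf{v} = \omega(-y, x)$, no diffusion, a constant source $Q$, and hypothetical function names. Because the velocity is linear in the position, the implicit part of the position update reduces to a 2×2 linear system that is solved in closed form.

```python
import math


def advance_particle(x, y, T, omega, dt, theta, Q=0.0):
    """One step of the weighted Lagrangian update for v = omega * (-y, x).

    Position: x^{n+1} - x^n = theta * v(x^{n+1}) * dt + (1 - theta) * v(x^n) * dt.
    Scalar (no diffusion): T^{n+1} - T^n = theta * Q * dt + (1 - theta) * Q * dt.
    """
    a = theta * omega * dt          # weight of the implicit part
    b = (1.0 - theta) * omega * dt  # weight of the explicit part
    # Explicit contribution: r = (I + b J) x^n with J = [[0, -1], [1, 0]]
    rx = x - b * y
    ry = y + b * x
    # Implicit solve: (I - a J) x^{n+1} = r, where I - a J = [[1, a], [-a, 1]]
    det = 1.0 + a * a
    x_new = (rx - a * ry) / det
    y_new = (a * rx + ry) / det
    T_new = T + theta * Q * dt + (1.0 - theta) * Q * dt
    return x_new, y_new, T_new


def radius_after(n_steps, dt, theta, omega=1.0):
    """Distance to the origin after n_steps, starting from the point (1, 0)."""
    x, y, T = 1.0, 0.0, 0.0
    for _ in range(n_steps):
        x, y, T = advance_particle(x, y, T, omega, dt, theta)
    return math.hypot(x, y)
```

With `theta = 0.5` the particle stays on the unit circle even for a time step as large as `dt = 0.5`, whereas the explicit choice `theta = 0` spirals outwards. Since the convective transport is carried entirely by the particle motion, the scalar changes only through the source term.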
2.3.2 Navier-Stokes
The extension of the time discretization to the Navier-Stokes equations requires solving
the pressure-velocity coupling.
It is well known that the velocity vector unknown arises from solving the vector
momentum equation. The pressure is the scalar unknown for which the continuity
equation might be the natural choice, but the pressure does not appear in this equation.
Moreover, this equation is not a time evolution equation; it acts as a constraint
on the velocity field, selecting only those velocity fields that are divergence-free.
To find the equation associated with the pressure, several alternatives are
possible. Among them, segregated or projection methods such as the fractional step method appear
as good candidates. The idea behind the fractional step method is to write the momentum
equation discretized in time in such a way that a velocity is first predicted using the old
value of the pressure (the pressure at the previous time step) and is then corrected with the
updated pressure, which is obtained by applying the divergence operator to the correction
equation, yielding a Poisson-like equation for the pressure.
In summary, the fractional step method may be viewed as:
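$$
\begin{aligned}
\rho\mathbf{v}^{n+1} - \rho\mathbf{v}^{n} &= \left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n+\theta}\Delta t \\
\rho\mathbf{v}^{n+1} - \rho\hat{\mathbf{v}}^{n+1} + \rho\hat{\mathbf{v}}^{n+1} - \rho\mathbf{v}^{n}
&= \theta\left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n+1}\Delta t + (1-\theta)\left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n}\Delta t \\
&= \theta\left[-\nabla p + \mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n+1}\Delta t + (1-\theta)\left[-\nabla p + \mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n}\Delta t \\
&= \theta\left(-\nabla p^{n+1} + \nabla p^{n}\right)\Delta t - \theta\nabla p^{n}\Delta t + \theta\left[\mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n+1}\Delta t \\
&\quad + (1-\theta)\left[-\nabla p + \mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n}\Delta t \\
&= \theta\left(-\nabla p^{n+1} + \nabla p^{n}\right)\Delta t - \nabla p^{n}\Delta t + \theta\left[\mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n+1}\Delta t \\
&\quad + (1-\theta)\left[\mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n}\Delta t \\
\underbrace{\rho\mathbf{v}^{n+1} - \rho\hat{\mathbf{v}}^{n+1}}_{\text{CORRECTOR}} + \underbrace{\rho\hat{\mathbf{v}}^{n+1} - \rho\mathbf{v}^{n}}_{\text{PREDICTOR}}
&= \underbrace{\theta\left(-\nabla p^{n+1} + \nabla p^{n}\right)\Delta t}_{\text{CORRECTOR}} \; \underbrace{-\,\nabla p^{n}\Delta t}_{\text{PREDICTOR}} \\
&\quad + \underbrace{\theta\left[\mu\nabla^2\hat{\mathbf{v}}^{n+1} + \mathbf{f}\right]^{n+1}\Delta t + (1-\theta)\left[\mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n}\Delta t}_{\text{PREDICTOR}}
\end{aligned}
\tag{15}
$$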
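Splitting the predictor and corrector parts of the equation into two steps and applying
the divergence to the corrector step, using the constraint $\nabla\cdot\mathbf{v}^{n+1} = 0$:
PREDICTOR
$$
\rho\hat{\mathbf{v}}^{n+1} - \rho\mathbf{v}^{n} = -\nabla p^{n}\Delta t + \theta\left[\mu\nabla^2\hat{\mathbf{v}}^{n+1} + \mathbf{f}\right]^{n+1}\Delta t + (1-\theta)\left[\mu\nabla^2\mathbf{v} + \mathbf{f}\right]^{n}\Delta t
$$
PRESSURE EQUATION
$$
\begin{aligned}
\rho\mathbf{v}^{n+1} - \rho\hat{\mathbf{v}}^{n+1} &= -\theta\Delta t\,\nabla\left(p^{n+1} - p^{n}\right) \\
\rho\nabla\cdot\hat{\mathbf{v}}^{n+1} &= \theta\Delta t\,\nabla\cdot(\nabla\delta p) = \theta\Delta t\,\nabla^2\delta p
\end{aligned}
$$
CORRECTOR
$$
\rho\mathbf{v}^{n+1} - \rho\hat{\mathbf{v}}^{n+1} = -\theta\Delta t\,\nabla\delta p
\tag{16}
$$
with $\delta p = p^{n+1} - p^{n}$ and $\nabla^2$ the Laplacian operator. The equation for the pressure is
placed between the predictor and corrector equations for the momentum equation,
as it is normally found in the algorithm.
For the Eulerian formulation the above three steps may be applied directly,
only replacing $\mathbf{v}$ by $\mathbf{v}(\mathbf{x})$ and $p$ by $p(\mathbf{x})$.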
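The pressure equation and corrector steps can be sketched on a small periodic staggered grid. This is only an illustration under stated assumptions, not the finite element implementation used in PFEM: the grid layout, the Gauss-Seidel solver, the test field and the names `divergence` and `pressure_correction` are all hypothetical choices made to keep the example self-contained.

```python
import math

def divergence(u, v, h):
    """Cell-centered divergence on a periodic staggered grid.
    u[i][j] sits on the west face of cell (i, j), v[i][j] on its south face."""
    n = len(u)
    return [[(u[(i + 1) % n][j] - u[i][j] + v[i][(j + 1) % n] - v[i][j]) / h
             for j in range(n)] for i in range(n)]

def pressure_correction(u_hat, v_hat, h, dt, rho=1.0, theta=1.0, sweeps=300):
    n = len(u_hat)
    div = divergence(u_hat, v_hat, h)
    # Pressure equation: theta*dt*lap(dp) = rho*div(v_hat), solved by Gauss-Seidel.
    dp = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                nb = dp[(i + 1) % n][j] + dp[i - 1][j] + dp[i][(j + 1) % n] + dp[i][j - 1]
                dp[i][j] = (nb - h * h * rho * div[i][j] / (theta * dt)) / 4.0
    # Corrector: v = v_hat - (theta*dt/rho) * grad(dp), evaluated on the faces.
    c = theta * dt / rho
    u = [[u_hat[i][j] - c * (dp[i][j] - dp[i - 1][j]) / h for j in range(n)] for i in range(n)]
    v = [[v_hat[i][j] - c * (dp[i][j] - dp[i][j - 1]) / h for j in range(n)] for i in range(n)]
    return u, v

def max_div(a, b, h):
    return max(abs(d) for row in divergence(a, b, h) for d in row)

n, dt = 8, 0.1
h = 1.0 / n
# Predicted velocity with non-zero divergence (assumed test field).
u_hat = [[math.sin(2 * math.pi * i * h) for j in range(n)] for i in range(n)]
v_hat = [[math.cos(2 * math.pi * j * h) for j in range(n)] for i in range(n)]
u, v = pressure_correction(u_hat, v_hat, h, dt)
print(max_div(u_hat, v_hat, h), max_div(u, v, h))
```

On the staggered layout the discrete divergence of the discrete gradient is exactly the five-point Laplacian, so the corrected field is divergence free up to the solver tolerance.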
For the Lagrangian formulation the above algorithm may be summarized as:
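PREDICTOR
Explicit part
$$
\begin{aligned}
\mathbf{x}_p^{n+1} &= \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau \\
\rho\hat{\hat{\mathbf{v}}}^{n+1}(\mathbf{x}_p^{n+1}) - \rho\mathbf{v}^{n}(\mathbf{x}_p^{n}) &=
\int_{t^n}^{t^{n+1}} \left\{-\nabla p^{n}(\mathbf{x}_p^{\tau}) + (1-\theta)\left[\mu\nabla^2\mathbf{v}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{n}(\mathbf{x}_p^{\tau})\right]\right\}d\tau
\end{aligned}
$$
Implicit part
$$
\rho\hat{\mathbf{v}}^{n+1}(\mathbf{x}) - \rho\hat{\hat{\mathbf{v}}}^{n+1}(\mathbf{x}) = \theta\left[\mu\nabla^2\hat{\mathbf{v}}^{n+1}(\mathbf{x}) + \mathbf{f}^{n+1}\right]\Delta t
$$
PRESSURE EQUATION
$$
\rho\nabla\cdot\hat{\mathbf{v}}^{n+1}(\mathbf{x}) = \theta\Delta t\,\nabla^2\delta p(\mathbf{x})
$$
CORRECTOR
$$
\rho\mathbf{v}^{n+1}(\mathbf{x}_p) - \rho\hat{\mathbf{v}}^{n+1}(\mathbf{x}_p) = -\theta\Delta t\,\nabla\delta p(\mathbf{x}_p)
\tag{17}
$$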
Note that the whole Lagrangian procedure uses two sets of coordinates, one for the
particles ($\mathbf{x}_p$) and the other for the mesh ($\mathbf{x}$). The relation between
them is presented in a later section.
2.4 X-IVS [X-IVAS]: Explicit Integration Velocity [Acceleration] Scheme
In the scalar transport equation the velocity field is given data, known in both
space and time. It is therefore possible to include this information explicitly in
the Lagrangian formulation. With a highly accurate particle tracking integration
scheme it is possible to resolve both the simple and the complex pathlines
normally present in fluid flows (Figs. 1 and 2).
Fig. 1 Time integration
Fig. 2 Streamline integration updating the particle position and state

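$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{\tau}(\mathbf{x}_p^{\tau})\,d\tau
\tag{18}
$$
Real trajectory:
$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{n}^{n+1} \mathbf{v}^{\tau}(\mathbf{x}_p^{\tau})\,d\tau
\tag{19}
$$
Simple approximation:
$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \mathbf{v}^{n}(\mathbf{x}_p^{n})\,\Delta t
\tag{20}
$$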
Fig. 3 Streamline integration updating the particle position and state
Streamlines approximation:
$$
\mathbf{y}_p^{n+1} = \mathbf{x}_p^{n} + \sum_{i=0}^{N-1} \mathbf{v}^{n}\!\left(\mathbf{y}_p^{\,n+\frac{i}{N}}\right)\delta t
\tag{21}
$$
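The difference between the simple approximation (20) and the streamline approximation (21) can be seen on a frozen rigid-rotation field. The sketch below is illustrative only: the velocity field, the time step of a quarter turn and the function names are assumptions, and the substeps use a uniform size `dt / n_sub` rather than an adaptive one.

```python
import math

def velocity(x, y):
    """Frozen velocity field v^n: rigid rotation about the origin (assumed test field)."""
    return -y, x

def advance(x, y, dt, n_sub):
    """Move a particle over dt with n_sub explicit substeps of size dt/n_sub,
    always sampling the frozen field at the current particle position."""
    sub = dt / n_sub
    for _ in range(n_sub):
        vx, vy = velocity(x, y)
        x, y = x + vx * sub, y + vy * sub
    return x, y

dt = math.pi / 2  # quarter turn; the exact trajectory stays on the unit circle
radii = {n_sub: math.hypot(*advance(1.0, 0.0, dt, n_sub)) for n_sub in (1, 10, 100)}
for n_sub, r in radii.items():
    print(f"{n_sub:4d} substeps: final radius = {r:.4f}")
```

With a single step (`n_sub = 1`, the simple approximation) the particle leaves the circle by a large margin, while the substepped path stays close to it even though the overall time step is unchanged.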
2.4.1 Remarks
Adaptive substep: $\delta t = \Delta t / (N\,\mathrm{Co}_{ELE})$. These integration substeps may be replaced
by an almost exact integration [18]. In practical applications, however, substepping
shows very little loss of accuracy, while the almost exact integration carries an
extra cost, so substepping is preferred.
For vector systems, where the velocity field is normally also an unknown, the particle
velocity $\mathbf{v}_p$ may be updated following the same ideas, which changes the name of
the method to X-IVAS.
For fluxes depending on spatial derivatives (such as diffusive fluxes), the X-IVAS
method is also applied.
This novel integration in the Lagrangian context gives a significantly better particle
trajectory integration (X-IVS): following the streamlines resolves more difficult
details of the flow with high accuracy. Figure 3 shows some details that may be
resolved without the drastic reduction in time step normally required by
standard Lagrangian integration schemes.
In general this integration scheme does not suffer from strong time step restrictions
caused by the nonlinearities present in the flow field (Fig. 3).
The unknown field, the temperature, is known only at the old time step $t^n$, so its
value at the new time step must be approximated using the X-IVAS method.

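$$
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) = \int_{t^n}^{t^{n+1}} Q(\mathbf{x}_p^{\tau}, \tau)\,d\tau + \int_{t^n}^{t^{n+1}} \nabla\cdot\left(\alpha\nabla T(\mathbf{x}_p^{\tau}, \tau)\right)d\tau
\tag{22}
$$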
In PFEM the last term on the right-hand side is approximated in the
following form:
$$
\begin{aligned}
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) \approx{}& \int_{t^n}^{t^{n+1}} Q(\mathbf{x}_p^{\tau}, \tau)\,d\tau \\
&+ \underbrace{(1-\theta)\int_{t^n}^{t^{n+1}} \nabla\cdot\left(\alpha\nabla T(\mathbf{x}_p^{\tau}, t^{n})\right)d\tau}_{\text{explicit}} \\
&+ \underbrace{\theta\,\nabla\cdot\left(\alpha\nabla T(\mathbf{x}_p^{n+1}, t^{n+1})\right)\Delta t}_{\text{implicit}}
\end{aligned}
\tag{23}
$$
This integration is only one of several possible choices. The explicit part
is solved simultaneously with the particle pathline computation, while the implicit
part is solved using the final position of the particles. Other choices may
improve this computation; for brevity they are not included
in this paper.
Comparing (12) with (23), the main difference between them may be written as:
$$
\begin{aligned}
\varepsilon_{\mathrm{EUL-LAG}} ={}& T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) \\
&- \left[T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) + \theta\,\mathbf{v}(\mathbf{x}, t^{n+1})\cdot\nabla T(\mathbf{x}, t^{n+1})\Delta t
+ (1-\theta)\,\mathbf{v}(\mathbf{x}, t^{n})\cdot\nabla T(\mathbf{x}, t^{n})\Delta t\right]
\end{aligned}
\tag{24}
$$
This difference is the error produced by the transformation between the two
frames, the Eulerian or fixed one and the Lagrangian or moving one. It
should tend to zero as the time step goes to zero. However, for the large time steps
normally needed to speed up the computation, evaluating the velocity
field at a fixed position ($\mathbf{x}$) at two different time levels, as the Eulerian
formulation does, may introduce large errors. Moreover, the spatial stabilization needed for
advection-dominated problems introduces extra errors that tend to dissipate
the solution much more than the physics does, especially at large time steps.
2.5 A Simple Coupled System, Solving the Coupled Flow Field and a Passive Scalar
One of the main purposes of this development is its application to solve coupled
problems where the flow and several other fields are solved simultaneously with
some sort of interaction. For brevity a simple case is here presented. It deals with the
coupling of an incompressible viscous flow with a scalar transport like temperature
using the Boussinesq approach.
2.5.1 Physical Equations to Solve
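Navier-Stokes equation system for Newtonian and incompressible fluids:
$$
\begin{aligned}
\rho\frac{D\mathbf{v}}{Dt} &= -\nabla p + \nabla\cdot\left[\mu\left(\nabla\mathbf{v}^{T} + \nabla\mathbf{v}\right)\right] + \mathbf{f} \\
\nabla\cdot\mathbf{v} &= 0
\end{aligned}
\tag{25}
$$
Scalar transport equations:
$$
\frac{D\phi_j}{Dt} = \nabla\cdot\left(\alpha_j\nabla\phi_j\right) + Q_j, \qquad j \in (1 : N_{\mathrm{fields}})
\tag{26}
$$
Example of coupling. For $\phi = T$, Boussinesq approximation:
$$
\mathbf{f} = \rho\,\mathbf{g}\,\beta\,(\phi - \phi_c)
\tag{27}
$$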
2.5.2 Discretization
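The key of the PFEM algorithm is transporting the information with particles following
the streamlines. Although the field $\mathbf{v}_p$ is not stationary, the streamlines are taken
as stationary within each time step ($\mathbf{v}_p^{n}$). The particle position follows that field and the
particle state is updated by the rate of change determined by the balance equation.
$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau
\tag{28a}
$$
$$
\rho\mathbf{v}_p^{n+1} = \rho\mathbf{v}_p^{n} + \int_{t^n}^{t^{n+1}} \left(\mathbf{a}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{\tau}\right)d\tau
\tag{28b}
$$
$$
\phi_p^{n+1} = \phi_p^{n} + \int_{t^n}^{t^{n+1}} \left(g^{n}(\mathbf{x}_p^{\tau}) + Q^{\tau}\right)d\tau
\tag{28c}
$$
where $\mathbf{a}^{n} = -\nabla p^{n} + \nabla\cdot\left[\mu\left(\nabla^{T}\mathbf{v}^{n} + \nabla\mathbf{v}^{n}\right)\right]$ and $g^{n} = \nabla\cdot\left(\alpha\nabla\phi^{n}\right)$, which are nodal
variables.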
3 Two Examples to Show the Benefits of the New Ideas
The following two examples were the starting point for the new ideas behind highly
accurate and stable convective transport:
Pure advection of a passive scalar field.
Inviscid transport of a vortex.
The first example demonstrated the advantages of Lagrangian formulations for
purely advective problems. A Gaussian hill profile, imposed as the initial
condition, is advected by a pure rotation. For this problem the profile shape and
its amplitude should be conserved throughout.
The second example extends the first to a vector system such as the
Navier-Stokes equations. An initial vortex is transported in an
inviscid flow. For this problem the intensity of the vortex should be conserved.
3.1 Pure Advection of a Passive Scalar Field
This well-known problem normally serves as a benchmark for the spatial stability
of Eulerian numerical schemes. Its first aim is to show that no
spurious oscillations appear, and its second is to minimize the numerical
diffusion introduced by the stabilization schemes. The time integration
scheme is also responsible for extra numerical dissipation: the first-order explicit
($\theta = 0$) and implicit ($\theta = 1$) schemes are not recommended because of their high dissipation,
and Crank-Nicolson ($\theta = \frac{1}{2}$) is preferred in this sense. Even so, the solution
always shows a reduction of the original amplitude, which can be improved only by
reducing the mesh size and the time step.
Figs. 9.1-9.4 of [10] show how the
amplitude is drastically reduced when large Courant numbers are used with Eulerian
formulations. Even when the spatial stabilization is reduced to a minimum and the time
integration is second order, the numerical dissipation is highly noticeable.
It can be reduced only by lowering the Courant number with finer meshes, and it is
never eliminated.
Lagrangian formulations, on the other hand, are better suited to this kind
of problem if only the amplitude is observed, since it remains exactly constant
regardless of the Courant number. With standard first-order integration, however, a
problem arises in the definition of the pathlines, which are shifted inwards or outwards
depending on the explicit or implicit character of the time integration. Only
second-order time integration reduces this pathology, but some sort of
iteration is then needed. See Fig. 9.7 in [10].
With the X-IVAS integration it is possible to fix both errors simultaneously,
producing a highly accurate solution regardless of the Courant number. This is also shown
in Fig. 9.7 (left) in [10].
A final remark about this important result, achieved on such a basic example and
showing the capabilities of Lagrangian formulations over Eulerian ones for
convective transport, concerns some more accurate Eulerian schemes that are
currently being published for transporting signals without spurious diffusion.
Called High Resolution Schemes [14, 16], these numerical methods have a robust
control that suppresses the wiggles with a minimum amount of numerical dissipation.
According to the Godunov theorem [16] this is only possible in a nonlinear way. Even
though this approach circumvents the drawbacks, it is important to realize that Lagrangian
formulations achieve the same or better results without any special treatment, saving
the extra cost normally incurred by such schemes.
3.2 Inviscid Transport of a Vortex
Having found the benefits of Lagrangian formulations for transporting scalar
fields in a stable and accurate way, the next step was to extend them to
vector systems. Here the incompressible viscous fluid flow model was taken. The
equivalent example in this context is the transport of a vortex in an inviscid flow. It
is well known that, looking at the hyperbolic part of the whole system (neglecting
the diffusion and not considering the role of the pressure), the problem looks like
the convection of vorticity waves. If a vortex ring is imposed as the initial condition,
neglecting the viscosity and with slip boundary conditions on the walls, the vortex should
conserve its kinetic energy as much as possible.
Figure 4 shows how the Eulerian and Lagrangian formulations transport this
vorticity. For both formulations the mesh is kept fixed and the simulation was run with
the same time step, with a Courant number high enough to highlight the
differences in performance and precision. The Lagrangian
formulation conserves energy better than its Eulerian counterpart, as shown by the
fact that the vortex is transported much better. This example confirms the
advantages of Lagrangian formulations with respect to Eulerian ones not only for advective
transport of scalar fields but also for nonlinear vector fields.
Fig. 4 Inviscid vortex ring dynamics. Conservation of momentum
4 Moving and Fixed Mesh PFEM Versions
The original PFEM employed a single mesh built from the cloud
of points defined by the moving particle positions, with a one-to-one relation
between mesh nodes and particles. At each time step the original PFEM
moved the nodes to the updated particle positions, as long as the mesh did
not deform into an invalid grid. Remeshing was used only
when the deformation of the mesh was so large that the time step suffered a drastic
reduction, making the computation too expensive. At that time remeshing
was avoided as far as possible because of its CPU cost. In summary, the stability of the
original PFEM was mainly affected by:
critical time step for explicit advective terms (Co < O(1)).
critical time step for explicit diffusive terms (Fo < O(1)).
critical time step for the deforming mesh, limited by the inversion of some elements
in the mesh (invalid mesh).
nonlinearities.
The sequence of problems defined above may be summarized as:
To solve only the passive scalar transport, Eqs. (28a) and (28c) are used. To
transport N passive scalars, (28c) is solved for each of the N
variables.
To solve the Navier-Stokes equation system, (28a) and (28b) are used, and p must
be calculated. A typical fractional step method is used to solve the coupling
between the pressure and the velocity (see Eq. 16).
To solve the coupled thermal and fluid flow problem (natural convection), (28a), (28b)
and (28c) are used and a constitutive law for the buoyancy term must be added
(Boussinesq approach).
Fig. 5 PFEM moving mesh version
All these steps need a mesh update, which brings remeshing back into the
picture. In recent years much progress has been made in more efficient
mesh generation and regeneration exploiting parallelism, making remeshing
affordable. Permanent remeshing circumvents the severe time step restriction
produced by the invalid mesh condition. In this sense PFEM has made
significant progress, increasing the time steps while keeping the solutions stable.
The mesh states must be updated with the particle states. There are two
approaches, which have generated two versions of the method:
Remesh the geometry with the new particle positions (particles coincide with nodes): PFEM
Mobile Mesh.
Project states from particles to nodes, keeping the mesh fixed: PFEM
Fixed Mesh.
The Mobile Mesh version has the following features:
A one-to-one relation between particles and nodes.
Remeshing at each time step.
Permanent assembling, profiling and solving of the algebraic linear system.
Figure 5 shows how the particle motion changes the mesh definition at each time
step.
The first tests showed very good stability and accuracy, with
a drastic reduction in the CPU times needed to solve some benchmarks compared
with the original version of PFEM. This improved performance, together with the
possibility of using very large time steps, was the first evidence that permanent
remeshing and X-IVAS integration were two important numerical ingredients for
exploiting the good features of the Lagrangian formulation.
Fig. 6 PFEM moving mesh version: profiling. Time [s] spent in the Streamline, Update, Remeshing, Assembly and Solve stages and in TOTAL, for 1, 2 and 4 threads
Even though the moving mesh PFEM version has several advantages over its
Eulerian counterpart, it has some limitations in terms of efficiency. Mainly, the
permanent remeshing and the assembly and solution of implicit problems limit its scalability.
Figure 6 shows a profiling of the moving mesh version of PFEM.
As the figure shows, most of the time is spent in remeshing, assembling
and solving the implicit linear systems. The performance is similar to that of mesh-based
methods, because the particle update consumes only a small part of the whole cost.
To reduce the computational cost added by these two stages a novel idea
was presented: the Fixed Mesh Version of PFEM.
This new method combines particles with a background mesh. Particles carry the
information along the whole process, and the mesh is used only for secondary
computations, those needed to update the particle positions and their states. It is normally
understood as a hybrid or dual method where the particles act as the master
in the computation and the mesh is the slave. The idea not only avoids
permanent remeshing. With a fixed background mesh it is also possible to integrate all
the implicit parts of the computation with a favorable impact on the
computational efficiency, by reusing the matrix profile and, for linear
diffusion problems, also its factorization.
Only the Lagrangian formulation can exploit this fact, because in the Eulerian
counterpart the convective term, proportional to the changing unknown
velocity, is always inside the system matrix to be inverted. Therefore the Eulerian
formulation cannot take advantage of it.
The fixed mesh version has the following features:
A cloud of particles over a fixed background mesh.
No remeshing is needed.
It needs projections and interpolations between particles and mesh nodes.
It needs only one LU or Cholesky factorization for implicit calculations.
Fig. 7 PFEM fixed mesh version
For constant-coefficient problems an initial factorization may be built and reused
throughout the whole computation. In this way it is possible to reduce the number of
iterations of the preconditioned conjugate gradient used for solving the symmetric
and positive definite linear systems involved in the computation (the Poisson equation for the
pressure and the heat equation for the implicit part of the diffusive terms, $\theta > 0$).
However, the fixed mesh version has some drawbacks, especially the
projection error. The projection reconstructs the fields transported by
the particles on the mesh, where the spatial derivatives are computed, and it
should do so very accurately. The better the reconstruction, the smaller the error in the
fluxes that depend on the spatial derivatives (Fig. 7).
Several numerical experiments have shown that the advective part of the
computation is governed by the particles and the diffusive part by the mesh.
Increasing the number of particles therefore makes it possible to transport very complex initial
conditions, or to represent the complex fields produced by their time evolution under
complex flow fields. Refining the mesh, on the other hand, improves the computation of the diffusive
fluxes. This is not the only way to improve it: higher-order reconstruction
is also an alternative to explore in the future.
First-order reconstruction may become inconsistent, i.e. increasing the number of
particles may produce higher diffusive flux errors. One way to avoid this is
to split the elements into subdomains, one around each element node, and to use only
that division to project particle values onto the node associated with each subelement.
Instead of using all the particles belonging to the elements connected to a given
node, only the particles belonging to the subelements connected to the node are
chosen. This subdivision also serves to seed particles when a subelement is void
of particles. Some details are presented in a later section.
In terms of efficiency, Fig. 8 shows how the profiling changes for the fixed mesh
version: the most important component of the computational cost is now concentrated
in the particle computation, as is typical of a Lagrangian formulation.
Fig. 8 PFEM fixed mesh version: profiling. Case: flow around a cylinder, 2D. CPU: Intel
i7-2600k (4 cores)
Since particle computations are cheaper than mesh computations, this puts PFEM
in a very good position for high performance computing, especially
regarding scalability.
This section ends with a review of the two algorithms that were first
developed in the context of PFEM, one for scalar transport and the other for
incompressible viscous flow.
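Algorithm 1 - Time Step PFEM Scalar Transport

1. Calculate scalar change rate on the nodes like a FEM:
$$
\int_{\Omega} N g^{n}\,d\Omega = -\int_{\Omega} \nabla N\cdot\alpha\nabla\phi^{n}\,d\Omega + \int_{\Gamma} N\,\alpha\nabla\phi^{n}\cdot\mathbf{n}\,d\Gamma
$$
2. Evaluate new particles position and state following the streamlines:
$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau
$$
$$
\phi_p^{n+1} = \phi_p^{n} + \int_{t^n}^{t^{n+1}} \left[g^{n}(\mathbf{x}_p^{n+\tau}) + Q^{n+\tau}\right]d\tau
$$
3. Update particles inventory

4. Project state to the mesh:
$$
\phi_j^{n+1} = \pi\left(\phi_p^{n+1}\right)
$$
Algorithm 2 - Time Step PFEM Incompressible Flow

1. Calculate acceleration on the nodes like a FEM:
$$
\int_{\Omega} N\,\mu\nabla^2\mathbf{v}^{n}\,d\Omega = -\int_{\Omega} \nabla N\cdot\left(\mu\nabla\mathbf{v}^{n}\right)d\Omega + \int_{\Gamma} N\,\mu\nabla\mathbf{v}^{n}\cdot\mathbf{n}\,d\Gamma
$$
$$
\int_{\Omega} N\,\nabla p^{n}\,d\Omega = -\int_{\Omega} \nabla N\,p^{n}\,d\Omega + \int_{\Gamma} N\,p^{n}\,\mathbf{n}\,d\Gamma
$$
$$
\mathbf{a}^{n} = \nabla\cdot\boldsymbol{\sigma}^{n} = -\nabla p^{n} + \mu\nabla^2\mathbf{v}^{n}
$$
2. Evaluate new particles position and velocity following the streamlines:
$$
\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau
$$
$$
\rho\hat{\mathbf{v}}_p^{n+1} = \rho\mathbf{v}_p^{n} + \int_{t^n}^{t^{n+1}} \left[\mathbf{a}^{n}(\mathbf{x}_p^{n+\tau}) + \mathbf{f}^{n+\tau}\right]d\tau
$$
3. Update particles inventory

4. Project state to the mesh:
$$
\hat{\mathbf{v}}_j^{n+1} = \pi\left(\hat{\mathbf{v}}_p^{n+1}\right)
$$
5. Find the pressure value solving the Poisson equation system using FEM:
$$
\rho\nabla\cdot\hat{\mathbf{v}}_j^{n+1} = \Delta t\,\nabla\cdot\left[\nabla\delta p^{n+1}\right]
$$
6. Update the velocity value with the new pressure:
$$
\rho\mathbf{v}_j^{n+1} = \rho\hat{\mathbf{v}}_j^{n+1} - \Delta t\,\nabla\left(p^{n+1} - p^{n}\right)
$$
$$
\rho\mathbf{v}_p^{n+1} = \rho\hat{\mathbf{v}}_p^{n+1} - \Delta t\,\pi^{-1}\left(\nabla\left(p^{n+1} - p^{n}\right)\right)
$$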
In the fixed mesh version of PFEM there is an operation called projection that
deserves some attention. Some comments about it are presented in a later section.
5 Particle Inventory Management
As mentioned before, in the PFEM algorithm the state variables are stored on the
particles after the streamline integration. Both incompressibility
(pressure) and the treatment of diffusion (viscous stress tensor) require the
information to be located on the grid. In the Mobile Mesh version the
mesh is built from the particles themselves, whereas in the Fixed Mesh approach particles
and grid are decoupled and a projection from particle states to nodal states must
be performed.
5.1 Projection Algorithms
Different approaches are available to perform projections. For example, SPH or MLS
(Moving Least Squares) techniques could be used for the interpolation, as well as
weights based on the position within the underlying mesh. A brief review of
the current techniques for projection in PFEM follows (the equations presented
are for scalar projection and are also valid for each component of a vector state
variable):
Mean weighted by Shape Function (P-1):
$$
\phi_j = \frac{\sum_{p=1}^{P} N_j(\mathbf{x}_p)\,\phi_p}{\sum_{p=1}^{P} N_j(\mathbf{x}_p)}
\tag{29}
$$
where $P$ is the number of particles inside a certain region around the node $j$.
Mean weighted by Distance (P-2):
$$
\phi_j = \frac{\sum_{p=1}^{P} \lVert\mathbf{x}_p - \mathbf{x}_j\rVert^{-2}\,\phi_p}{\sum_{p=1}^{P} \lVert\mathbf{x}_p - \mathbf{x}_j\rVert^{-2}}
\tag{30}
$$
with the same definition of $P$ as above.
Weighted Polynomial Least Squares (P-3):
$$
\phi_j = \phi^{h}(\mathbf{x}_j) = \boldsymbol{\beta}^{T}\,\mathbf{P}(\mathbf{x}_j)
\tag{31}
$$
where
$\mathbf{P} = [1\; x\; x^2 \ldots]$ in 1D, $\mathbf{P} = [1\; x\; y\; xy\; x^2\; y^2 \ldots]$ in 2D (truncated at the polynomial
order required) and $\boldsymbol{\beta} = (\mathbf{X}^{T}\mathbf{W}\mathbf{X})^{-1}(\mathbf{X}^{T}\mathbf{W}\mathbf{y})$. Note that inverting
the matrix in the calculation of $\boldsymbol{\beta}$ requires $P \ge n$, where $n$ is the number
of terms used in $\mathbf{P}$.
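As an illustration of the shape-function-weighted mean (P-1), the following sketch projects particle values onto the nodes of a uniform 1D mesh with linear shape functions. The mesh, the particle positions and the function names are assumptions made for the example; the region around each node is taken here to be the support of its shape function.

```python
def hat(x, xj, h):
    """Linear shape function of node xj on a uniform 1D mesh of spacing h."""
    return max(0.0, 1.0 - abs(x - xj) / h)

def project_p1(nodes, h, xp, phip):
    """Shape-function-weighted mean of particle values at each node."""
    out = []
    for xj in nodes:
        w = [hat(x, xj, h) for x in xp]
        total = sum(w)
        # A node with no particle in its support gets None: it would need seeding.
        out.append(sum(wi * p for wi, p in zip(w, phip)) / total if total > 0 else None)
    return out

nodes = [0.0, 0.5, 1.0]
xp = [0.1, 0.3, 0.45, 0.6, 0.9]   # particle positions (assumed)
phip = [2.0] * len(xp)            # constant field carried by the particles
projected = project_p1(nodes, 0.5, xp, phip)
print(projected)
```

Because the weights are normalized by their sum, a constant field carried by the particles is reproduced at every node that has at least one particle in its region.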
5.2 Particle Seeding
For accuracy reasons each of the projection methods presented requires a certain
number of particles in a certain region near each node. Two considerations must
be taken into account: the region around the node must be defined precisely, and
there is no guarantee that each region contains particles (especially when high Co
numbers are used). New particles must then be created in these empty regions. This
section presents several algorithms that address this issue.
The first approach (S-1), used originally in PFEM, sets the state
of a new particle by interpolating from the nodal states at the previous time step n:
$$
\phi_p^{n+1}(\mathbf{x}_p^{n+1}) = \sum_{j} \lambda_j(\mathbf{x}_p^{n+1})\,\phi_j^{n}
\tag{32}
$$
where $\lambda_j$ are the area coordinates of the particle in the element. Another algorithm (S-2)
searches for the state by following the streamline in the backward direction. The idea is to
find the location of the particle that, had it existed at the beginning of the time step,
would have arrived at the seed position at the end of it:
$$
\phi_p^{n+1}(\mathbf{x}_p^{n+1}) = \phi_p^{n}(\mathbf{x}_p^{n}) + \int_{0}^{\Delta t} g^{n}(\mathbf{x}^{n+\tau})\,d\tau
\tag{33}
$$
Fig. 9 Pure-convection step problem
The usefulness of backward integration for finding the state of the new particle is shown
in the next example, the pure-convection step problem defined in Fig. 9.
A boundary condition with a sharp discontinuity enters the domain, transported
by a velocity vector field not aligned with the mesh. Figure 10 shows the results,
projected on the mesh, obtained when the particles are created using the criterion
defined as (S-1) (left) and when their states are found using the backward integration
criterion (S-2) (right).
The results show that the S-1 criterion fails for this example, which makes the S-2 criterion
a much better choice for seeding particles when seeding is necessary.
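The difference between the two seeding criteria can be sketched in one dimension for pure advection ($g = 0$) with a constant velocity. Everything in this example is an assumption chosen for illustration: the mesh, the Gaussian nodal field at $t^n$, the velocity `c` and the function names.

```python
import math

H = 0.1                                   # mesh spacing (assumed)
NODES = [i * H for i in range(11)]
PHI_N = [math.exp(-((x - 0.3) / 0.1) ** 2) for x in NODES]  # nodal field at t^n

def interpolate(x):
    """Linear interpolation of the nodal field at t^n."""
    e = min(int(x / H), len(NODES) - 2)
    s = (x - NODES[e]) / H
    return (1.0 - s) * PHI_N[e] + s * PHI_N[e + 1]

def seed_s1(x_seed):
    """S-1: old nodal state interpolated at the seed position."""
    return interpolate(x_seed)

def seed_s2(x_seed, c, dt):
    """S-2: trace the streamline backwards and take the state found there (g = 0)."""
    return interpolate(x_seed - c * dt)

c, dt, x_seed = 1.0, 0.2, 0.5
# Exact advected value at the seed position at t^{n+1}.
exact = math.exp(-((x_seed - c * dt - 0.3) / 0.1) ** 2)
print(seed_s1(x_seed), seed_s2(x_seed, c, dt), exact)
```

In this setting S-1 assigns the seed the old value found at its own position and so ignores the transport during the step, whereas S-2 recovers the advected value.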
5.2.1 Particle Removing
On the other hand, using many particles increases computing times. Seeding is
frequent during the simulation, and the number of particles inside
the domain must be controlled for computational cost reasons. The removal action must
therefore be defined following some criteria.
Particles that leave the geometry must clearly be deleted, but
inside the geometry it is not clear when particles should be removed or how.
Removing particles decreases the computing time of the algorithm but also decreases
the quality of the solution, because it indirectly introduces numerical
diffusion.
In the first approach (R-1), particles are not removed unless two or more
of them are in almost the same position. This approach obtained accurate results
for the pure-advective case of the rotating Gaussian signal [18].
A second approach (R-2) requires a minimum and a maximum
number of particles in each sub-element, which must be maintained at each time step.
Sub-element i is the third of the triangle (the fourth of the tetrahedron in
three dimensions) where the area coordinate corresponding to vertex i is larger
than the rest. The idea is that if each node has enough particles
around it, the state projected from the particles to the node will be accurate. However,
as the next example demonstrates, the continuous intrusion into the system
can decrease the quality of the solution, especially in scalar problems. This criterion
shows good results in solving incompressible flows.
Fig. 10 Pure convection transport. Results for different strategies to set new particle states. Left S-1.
Right S-2
The example of the rotating Gaussian signal without diffusion shows that the
numerical diffusion is important when particles are frequently created and removed
following this criterion. The polar mesh (4390 elements) is the same
in all cases, and the case consists of a Gaussian transported by a rotating flow without
a diffusion term. Figure 11 compares the value of the maximum over two laps.
The best option for this problem is R-1 (in Fig. 11, T max old idea), while R-2 requires
a large range ([min_subele; max_subele]) of particles to avoid spreading: comparing
[1; 20] with [1; 5], in the second option the intrusion in the system is greater and the
solution decreases in quality.
Fig. 11 Rotating Gaussian. Evolution of the maximum of $\phi$ (maxPhi) over time [s] for different particle update techniques (curves: T max - 1to5, T max - 1to20, T max old idea)
Regarding computing times R-1 is also the best, because it requires 40 s (mean
40,000 particles), whereas [1; 5] needs 50 s (mean 46,000 particles) and [1; 20] needs
100 s (mean 120,000 particles).
However, R-1 does not work well for the step problem (defined in the previous
subsection) unless it creates new particles using backward integration (criterion
S-2). Also, the projection algorithms developed search for
particles in the sub-elements to send data to the nodes, so creating new particles in empty
elements does not ensure that there will be particles in the region around the node
(its sub-elements). Another way of selecting the position of the new particles
must therefore be developed.
5.3 Converging into a New Algorithm to Update Particle Inventory

Taking the previous discussion into account, an algorithm to update the particle inventory
has been implemented. It includes the benefits of the methods discussed previously
and follows the principle of not modifying the system unless it is really necessary.
As mentioned, the empty-element condition does not work well except for some particular
problems. The main reason is the lack of particles close
enough to the nodes. The new algorithm therefore does not search for empty elements;
instead, it searches for empty regions.
The region of node j is defined as the geometric space from which node j takes particle
information to update its state using some projection operator. If this region
is empty, the node does not have enough information to be updated. Once an
empty region is detected, only one particle is created, exactly at the position of its node
and with a state found using backward integration.
Fig. 12 Graphical scheme to understand the new algorithm proposed. Panels (a), (b), (c); labels: Neighborhood, Region of the Node "j", Internal Elements, Inlet Elements
Although creating new particles on the nodes is the least intrusive way to maintain
a good solution on the nodes, if the flow is not confined the number of
particles will decrease as the simulation runs. Typically the boundary condition is
imposed on the inlet flow boundary. Creating particles in the Inlet Elements and
using backward integration to find their states therefore introduces no error, and more than
one particle can be created. The positions do not have to be on the nodes and the states
carry no error. This approach keeps approximately the same number
of particles during the simulation, which preserves the accuracy of the method.
Particle removal is carried out when two or more particles are in a circle (2D)
or sphere (3D) with a radius proportional to the element size h. The
proportionality factor may vary over the geometry, which provides a new tool to control
the number of particles. A graphical representation is given in Fig. 12c.
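The removal rule can be sketched as a greedy pass over the particle list. This is a minimal illustration under assumptions, not the implementation used in PFEM: the factor `eps`, the element size `h`, the brute-force neighbor search and the name `remove_close` are hypothetical, and a production code would use the mesh or a spatial grid to find neighbors.

```python
import math

def remove_close(particles, radius):
    """Keep a particle only if no already-kept particle lies within `radius`."""
    kept = []
    for p in particles:
        if all(math.dist(p, q) > radius for q in kept):
            kept.append(p)
    return kept

h, eps = 0.1, 0.2            # element size and proportionality factor (assumed values)
particles = [(0.0, 0.0), (0.01, 0.0), (0.5, 0.5), (0.5, 0.515), (0.9, 0.1)]
kept = remove_close(particles, eps * h)
print(kept)
```

Passing a different `radius` in different parts of the domain corresponds to varying the proportionality factor over the geometry.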
Figure 12a shows the neighbor elements and the region belonging to node j.
There must be at least one particle in the gray zone to have a good projection. Otherwise a
new particle must be created at the position of the node, with its state found
by backward integration. Figure 12b shows which elements are considered inlet
elements; if they are empty, internal particles must be created in them.
Finally, this algorithm solves all the tests presented in this paper, including those where other
approaches have been shown to fail: the Gaussian rigid rotation and the 2D step.
The last example tests this algorithm in the Navier-Stokes solver
(PFEM Fixed Mesh). The case chosen is the flow around a cylinder, because it
presents different zones of refinement and patches of inlet and outlet flow. The results
are presented in Fig. 13a and b. Similar accuracy in amplitude and frequency
can be observed, but R-2 obtains a better definition of the force signal, especially for
Cd. For more details see [10, 11].
6 Diffusive-Dominant Problems
When the problem is diffusion dominated, the advantages of PFEM are not
as clear as in the advection-dominated case. The explicit calculation of the diffusion
traditionally used by PFEM is limited by the dimensionless Fourier number and,
in some particular cases, the temporal change of the transported variables vanishes
due to the shape of the solution itself. To relax these restrictions, Sect. 6.1 presents a new
model to calculate the diffusion. Several tests are presented to confirm the
improvements in the solution.
Fig. 13 (a) Lift coefficient Cl and (b) drag coefficient Cd over time [s] for flow around a cylinder, solved using the new updating
algorithm and compared with R-2 (called old in the graphic); legend entries: Cl, Mittal, Cl - old
6.1 Diffusive Implicit Correction
Simulations that solve the diffusive term explicitly are restricted to Fo < 0.5.
This is a strong limitation on the time-step, especially on very refined meshes and in
diffusive-dominant problems (where Fo > Co). Because explicit PFEM suffers from this
stability constraint, the possibility of enlarging the time-step may be lost when the
flow locally becomes diffusive. As mentioned before, the mesh is normally refined in the
vicinity of bodies to capture boundary layers and flow separation, and the Fourier
number increases locally there. Also, in some particular cases, the temporal
change of the transported variable vanishes due to the shape of the solution itself:
when the time-step is such that the integral of the curvature of the function
$\phi_j^n$ vanishes, the method applies no diffusion to $\phi$, so the solution is
wrong. This case may arise for traveling waves with diffusion.
A new approach to solve the diffusive term is based on the theta method, which
consists in discretizing the non-stationary variable using a weighted mixture of
an explicit prediction and an implicit correction:
\[ \frac{\phi^{n+1} - \phi^{n}}{\Delta t} = \theta\, g^{n+1} + (1-\theta)\, g^{n} \tag{34} \]
Doing a first step in an explicit way
\[ \frac{\hat{\phi}^{n+1} - \phi^{n}}{\Delta t} = (1-\theta)\, g^{n} \tag{35} \]
and doing the correction in an implicit way, that is, subtracting (35) from (34), it
follows that
\[ \frac{\phi^{n+1} - \hat{\phi}^{n+1}}{\Delta t} = \theta\, g^{n+1} \tag{36} \]
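Algorithm 3 - Time Step PFEM Scalar Transport Explicit Diffusion - Implicit Correction
1. Calculate the scalar change rate on the nodes like a FEM:
   \[ \int_\Omega N g^{n}\, d\Omega = -\int_\Omega \nabla N \cdot \alpha \nabla\phi^{n}\, d\Omega + \int_\Gamma N\, \alpha \nabla\phi^{n} \cdot \mathbf{n}\, d\Gamma \]
2. Evaluate the new particle position following the velocity streamlines:
   \[ \mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\, d\tau \]
3. Evaluate the new particle state (explicit prediction):
   \[ \hat{\phi}_p^{n+1} = \phi_p^{n} + \int_{t^n}^{t^{n+1}} \left[ (1-\theta)\, g^{n}(\mathbf{x}_p^{\tau}) + Q^{n+\theta} \right] d\tau \]
4. Project the particle state onto the mesh nodes:
   \[ \hat{\phi}_j^{n+1} = \pi(\hat{\phi}_p^{n+1}) \]
5. Implicit correction:
   \[ \phi_j^{n+1} = \hat{\phi}_j^{n+1} + \theta\, \Delta t\, g_j^{n+1} \]
6. Interpolate state to particles:
   \[ \phi_p^{n+1} = \hat{\phi}_p^{n+1} + \pi^{-1}(\delta\phi_j^{n+1}), \qquad \delta\phi_j^{n+1} = \phi_j^{n+1} - \hat{\phi}_j^{n+1} \]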
A standard FEM formulation is used to compute the implicit correction. The
problem may be solved with either the absolute (37) or the incremental (38) approach:
\[ [M + \theta \Delta t\, K]\, \phi^{n+1} = M \hat{\phi}^{n+1} + \theta \Delta t\, F^{n+\frac{1}{2}} \tag{37} \]
\[ [M + \theta \Delta t\, K]\, \delta\phi^{n+1} = -\theta \Delta t\, K \hat{\phi}^{n+1} + \theta \Delta t\, F^{n+\frac{1}{2}} \tag{38} \]
where M is the mass matrix, K is the stiffness matrix and F is the load vector of a
standard FEM discretization. Note that for $K \neq K(t)$ and constant $\Delta t$ the
matrix $[M + \theta \Delta t\, K]$ does not depend on time; it can therefore be factorized
at the beginning of the computation and used as a preconditioner afterward, with a
significant reduction in CPU time.
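The link between the absolute form (37) and the incremental form (38) can be made explicit with a short derivation. The sketch below assumes the notation $\hat{\phi}^{n+1}$ for the explicit prediction and $\delta\phi^{n+1}$ for the implicit correction; these symbols are reconstructions and may differ from the authors' original typography.

```latex
% Assumed notation: \hat{\phi}^{n+1} = explicit prediction, \delta\phi^{n+1} = implicit correction.
\begin{align*}
\phi^{n+1} &= \hat{\phi}^{n+1} + \delta\phi^{n+1} \\
[M + \theta \Delta t\, K]\,(\hat{\phi}^{n+1} + \delta\phi^{n+1})
  &= M \hat{\phi}^{n+1} + \theta \Delta t\, F^{n+\frac{1}{2}} \\
[M + \theta \Delta t\, K]\, \delta\phi^{n+1}
  &= -\theta \Delta t\, K \hat{\phi}^{n+1} + \theta \Delta t\, F^{n+\frac{1}{2}}
\end{align*}
```

Both forms share the same system matrix, so a single factorization serves either approach; the incremental form only changes the right-hand side.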
6.2 Analytic Diffusion 1D: Sinusoidal Signal
A diffusive-dominant problem with an analytic solution is presented. It is solved using
explicit, implicit and semi-implicit schemes for the diffusion and different values
of Fo.
The problem is:
\[ \frac{\partial \phi}{\partial t} = \alpha \frac{\partial^2 \phi}{\partial x^2}, \qquad x \in (0, 1) \tag{39} \]
\[ \phi(x = 0, t) = \phi(x = L, t) = 0; \qquad t > 0 \tag{40} \]
\[ \phi(x, t = 0) = \sin(kx); \qquad t = 0 \tag{41} \]
with its analytic solution:
\[ \phi(x, t) = \sin(kx)\, e^{-\alpha k^2 t} \tag{42} \]
where $\alpha = 1$ is chosen, the wave number of the problem is $k = 4\pi$ and the
mesh size is $\Delta x = 0.02$.
Fig. 14 Comparison between Explicit ($\theta = 0$), Implicit ($\theta = 1$) and Semi-Implicit ($\theta = 0.5$)
schemes for diffusion in PFEM with the analytic solution at t = 0.002
Figure 14 shows that with Fo = 5 the most accurate results are obtained with
the semi-implicit scheme. While explicit simulations are not accurate with Fo = 5,
the explicit solution with Fo = 0.5 is useful. These results show that the temporal
integration error in PFEM behaves as usual, i.e. semi-implicit (Crank-Nicolson) schemes
are more accurate than first-order explicit or fully implicit schemes. However,
semi-implicit solvers require giving up the idea of a purely explicit solver, with severe
consequences for efficiency. Recall that one of the main goals of PFEM
development is a robust (stable) solver that allows trading accuracy against
efficiency with greater freedom. Because the matrix of the implicit part of the
diffusion computation can be factorized once at the beginning, it is possible to
run simulations with large time-steps without losing accuracy or efficiency. This
combination of explicit schemes with efficient implicit schemes is the main strength
of Fixed Mesh PFEM. Here, efficient implicit schemes means solving the linear systems
iteratively with good preconditioners.
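The behaviour described for this test can be reproduced with a minimal sketch. It is not the authors' PFEM code: it assumes a second-order finite-difference Laplacian on a fixed grid instead of the FEM matrices M and K, and the helper names (`factorize`, `solve`, `max_error`) are hypothetical. It applies the explicit prediction followed by the implicit correction of the theta method, factorizing the constant tridiagonal matrix once and reusing the factors at every step.

```python
import math

ALPHA = 1.0          # diffusivity used in the test
K = 4.0 * math.pi    # wave number of the initial sinusoid
DX = 0.02            # mesh size
T_END = 0.002        # time at which the solutions are compared


def factorize(n, off, diag):
    """Factorize once the constant n x n matrix tridiag(off, diag, off)."""
    denom = [0.0] * n
    cprime = [0.0] * n
    denom[0] = diag
    cprime[0] = off / diag
    for i in range(1, n):
        denom[i] = diag - off * cprime[i - 1]
        cprime[i] = off / denom[i]
    return denom, cprime


def solve(off, denom, cprime, rhs):
    """Reuse the stored factors: forward elimination, then back substitution."""
    n = len(rhs)
    y = [0.0] * n
    y[0] = rhs[0] / denom[0]
    for i in range(1, n):
        y[i] = (rhs[i] - off * y[i - 1]) / denom[i]
    x = [0.0] * n
    x[-1] = y[-1]
    for i in range(n - 2, -1, -1):
        x[i] = y[i] - cprime[i] * x[i + 1]
    return x


def max_error(theta, fo):
    """Maximum nodal error at T_END for a given theta and Fourier number."""
    dt = fo * DX * DX / ALPHA
    steps = round(T_END / dt)
    n_nodes = round(1.0 / DX) + 1
    xs = [i * DX for i in range(n_nodes)]
    phi = [math.sin(K * x) for x in xs]
    off = -theta * fo
    # The matrix depends only on theta and Fo, so it is factorized once.
    denom, cprime = factorize(n_nodes - 2, off, 1.0 + 2.0 * theta * fo)
    for _ in range(steps):
        # Explicit prediction with weight (1 - theta).
        hat = [
            phi[i] + (1.0 - theta) * fo * (phi[i - 1] - 2.0 * phi[i] + phi[i + 1])
            for i in range(1, n_nodes - 1)
        ]
        # Implicit correction with weight theta; homogeneous Dirichlet ends.
        phi = [0.0] + solve(off, denom, cprime, hat) + [0.0]
    decay = math.exp(-ALPHA * K * K * steps * dt)
    return max(abs(p - math.sin(K * x) * decay) for p, x in zip(phi, xs))


if __name__ == "__main__":
    for fo in (0.5, 5.0):
        for theta in (0.0, 0.5, 1.0):
            print(f"Fo={fo:4.1f} theta={theta:3.1f} error={max_error(theta, fo):.2e}")
```

In this simplified setting the semi-implicit choice gives the smallest error at Fo = 5, in line with the trend reported for Fig. 14; the absolute error values are specific to the finite-difference stand-in and should not be read as PFEM results.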
6.3 A Pathological Case: Sinusoidal Signal Travelling
In this section, a pathological case is presented. The explicit calculation of the
diffusion updates the state variable with the integral of the second derivative of the
variable itself, i.e. the integral of the curvature. Under certain conditions
that integral vanishes and the explicit diffusion is null, generating wrong new states.
An implicit calculation of the diffusion solves this problem.
The problem consists of a sinusoidal wave transported by a velocity field v with a
non-negligible diffusive term. The idea is to choose the numerical and physical
parameters such that the integral of the curvature of the function vanishes at each
time-step. If the length traveled by a particle is a multiple of the wavelength
$\lambda$ of the signal ($U \Delta t = m \lambda$), then $x_p^{n+1} = x_p^{n} + m \lambda$;
hence the accumulated rate of change of the variable (the integral of its curvature)
will be null because
\[ \frac{d^2 \phi}{dx^2} = \frac{d^2}{dx^2}\left[\sin\left(\frac{2\pi}{\lambda} x\right)\right] = C \sin\left(\frac{2\pi}{\lambda} x\right) = g \]
and $\int_{x^n}^{x^{n+1}} g \, dx = 0$.
This pathological situation has a very low probability and only arises in
Lagrangian formulations where advection and diffusion are weakly coupled.
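The cancellation can be checked numerically. The following sketch integrates the curvature of the sinusoid along the distance traveled by one particle in a single time-step, using the values U = 5,000, time-step 0.0001 and wavelength 0.25 listed for this test. The function name, the starting position and the midpoint quadrature are illustrative assumptions, not part of the original method.

```python
import math

U = 5000.0       # advection velocity of the test
DT = 1.0e-4      # time-step of the test
LAM = 0.25       # wavelength of the sinusoidal signal


def curvature_integral(x_start, travel, lam=LAM, n=1000):
    """Midpoint-rule integral of g = d2/dx2 sin(2*pi*x/lam) over [x_start, x_start + travel]."""
    k = 2.0 * math.pi / lam
    c = -k * k                      # constant C in g = C sin(2*pi*x/lam)
    h = travel / n
    total = 0.0
    for i in range(n):
        x_mid = x_start + (i + 0.5) * h
        total += c * math.sin(k * x_mid) * h
    return total


if __name__ == "__main__":
    m = U * DT / LAM                # number of wavelengths traveled per step
    print("wavelengths traveled per step:", m)
    print("integral over m wavelengths  :", curvature_integral(0.1, U * DT))
    print("integral over 1.5 wavelengths:", curvature_integral(0.1, 1.5 * LAM))
```

With these parameters the particle travels exactly two wavelengths per step, so the explicit diffusion increment cancels, whereas a travel distance that is not a multiple of the wavelength gives a non-zero increment.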
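The problem to solve is:
\[ \frac{\partial \phi}{\partial t} + U \frac{\partial \phi}{\partial x} = \alpha \frac{\partial^2 \phi}{\partial x^2}, \qquad x \in (0, \infty) \tag{43} \]
\[ \phi(x = 0, t) = \sin(\omega t), \qquad t > 0 \tag{44} \]
\[ \phi(x, t = 0) = \sin\left(\frac{2\pi}{\lambda} x\right), \qquad t = 0 \tag{45} \]
where the advection and the diffusion can be solved analytically in an uncoupled
way, which allows the decay of the signal to be determined:
\[ \phi(x, t) = \sin\left(\frac{2\pi}{\lambda}\left[x - (x_0 + U t)\right]\right) e^{-\alpha \left(\frac{2\pi}{\lambda}\right)^2 t} \tag{46} \]
Using the parameters:
$U = 5{,}000$
$\alpha = 1$
$L_x = 10$
$\lambda = 0.25$
$\Delta x = 0.025$
$\Delta t = 0.0001$ (Fo = 0.16).
Fig. 15 Temporal evolution of sinusoidal amplitude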
In Fig. 15 the results obtained with different values of $\theta$ are compared with the
analytic decay. The most accurate simulation is obtained with $\theta = 1$; with
$\theta = 0$ no decay is observed, and intermediate values of $\theta$ give intermediate
solutions. Thus, a corrective step applied to an erroneous explicit prediction does not
ensure accurate results, due to the bad performance of explicit schemes in this very
special case.
It must be emphasized that the presented case rarely appears in non-academic
problems, but it provides another reason to choose an implicit calculation of the
diffusion instead of an explicit one.
6.4 Implicit Calculation of the Viscous Diffusion
The theta method can also be adopted to calculate the viscous effects in
incompressible flow problems. Again, this strategy allows the maximum time-step to be
extended beyond the limit imposed by the Fourier number. The expressions are similar to
those of the scalar case presented in Eqs. (34)-(36), but replacing $\phi$ with
$\mathbf{v}$ and $g$ with $\nabla \cdot \tau_v$, where
$\tau_v = \mu (\nabla \mathbf{v} + \nabla \mathbf{v}^T)$.
The implicit correction of the viscous stress tensor is presented in the following
algorithm:
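Algorithm 4 - Time Step PFEM Incompressible Flow with Implicit Correction of
the viscous diffusion.
1. Calculate acceleration on the nodes like a FEM:
   \[ \int_\Omega N\, \nabla \cdot \tau_v^{n}\, d\Omega = -\int_\Omega \nabla N \cdot \tau(\mathbf{v}^{n})\, d\Omega + \int_\Gamma N\, \tau(\mathbf{v}^{n}) \cdot \mathbf{n}\, d\Gamma \]
   \[ \int_\Omega N\, \nabla p^{n}\, d\Omega = -\int_\Omega \nabla N\, p^{n}\, d\Omega + \int_\Gamma N\, p^{n}\, \mathbf{n}\, d\Gamma \]
   \[ \mathbf{a}^{n} = -\nabla p^{n} + (1-\theta)\, \nabla \cdot \tau_v^{n} \]
2. Evaluate the new particle position following the velocity streamlines:
   \[ \mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\, d\tau \]
3. Evaluate the new particle velocity (explicit prediction):
   \[ \hat{\mathbf{v}}_p^{n+1} = \mathbf{v}_p^{n} + \int_{t^n}^{t^{n+1}} \left[ \mathbf{a}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{n+\theta} \right] d\tau \]
4. Project the particle velocity onto the mesh nodes:
   \[ \hat{\mathbf{v}}_j^{n+1} = \pi(\hat{\mathbf{v}}_p^{n+1}) \]
5. Implicit correction:
   \[ \hat{\hat{\mathbf{v}}}_j^{n+1} = \hat{\mathbf{v}}_j^{n+1} + \theta\, \Delta t\, \nabla \cdot \tau_j^{n+1} \]
6. Find the pressure value solving the Poisson equation system using FEM:
   \[ \nabla \cdot \hat{\hat{\mathbf{v}}}_j^{n+1} = \Delta t\, \nabla \cdot \left[ \nabla\, \delta p^{n+1} \right], \qquad \delta p^{n+1} = p^{n+1} - p^{n} \]
7. Update the velocity value with the new pressure:
   \[ \mathbf{v}_j^{n+1} = \hat{\hat{\mathbf{v}}}_j^{n+1} - \Delta t\, \nabla (p^{n+1} - p^{n}) \]
   \[ \mathbf{v}_p^{n+1} = \hat{\hat{\mathbf{v}}}_p^{n+1} - \Delta t\, \pi^{-1}\left( \nabla (p^{n+1} - p^{n}) \right) \]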
The equation system for the implicit correction of the viscous diffusion can be
solved using the same strategy as presented in (37) or (38). Also, for $\mu \neq \mu(t)$ and
constant $\Delta t$ the matrix does not depend on time, so it can be factorized only once.
6.5 A Simple Test for a Diffusive Dominant Problem. The Mesh Size and the Time Step Dependency
6.5.1 Advective-Diffusive Transport of a Gaussian Hill
The transport of a Gaussian hill was used in [18] to demonstrate the ability of
PFEM to solve a scalar transport problem. This case also made evident
the pathology that explicit Eulerian approaches suffer when solving a pure advective
transport problem with CFL > 1. The problem consists of a Gaussian hill signal,
used as initial condition, transported with physical diffusion. The velocity field is a
flow rotating around the center of a square domain. The Gaussian signal is displaced
from the center of the domain by a certain radius, and its shape gives the transported
signal a non-zero value only in a limited region of the domain initially. The signal
should be transported following circular path lines. Figure 16 shows the problem
definition.
Fig. 16 Initial temperature distribution for the advective-diffusive transport problem
6.5.2 Problem Parameter Definition
This problem is taken from Donea and Huerta [4]. The initial condition is:
\[ \phi(\mathbf{x}, 0) = \begin{cases} \frac{1}{4} (1 + \cos(\pi X))(1 + \cos(\pi Y)) & \text{if } X^2 + Y^2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{47} \]
where $\mathbf{X} = (\mathbf{x} - \mathbf{x}_0)/\sigma$, with a boundary condition $\phi = 0$
for all the nodes lying on the boundary. The initial position of the center of the signal
and its radius are $\mathbf{x}_0$ and $\sigma$ respectively. In this example
$\mathbf{x}_0 = (\frac{1}{6}, \frac{1}{6})$ and $\sigma = 0.2$ are taken. The
velocity field corresponds to a rigid rotation with angular velocity $\omega = 2$, therefore
$\mathbf{v}(\mathbf{x}) = (-\omega y, \omega x)$. The diffusivity chosen is $\alpha = 0.0001$.
Three different meshes were employed, all defined over the unit square
$[-\frac{1}{2}, \frac{1}{2}] \times [-\frac{1}{2}, \frac{1}{2}]$. The coarse mesh, called M1,
has 30 x 30 quadrangular elements split into triangles. A finer one, called M2, has
100 x 100, and the finest mesh, called M3, has 500 x 500; the latter is used to define a
reference solution for comparison, playing the role of an almost exact solution.
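As an illustration of the problem setup, the initial condition (47) and the rigid-rotation velocity field can be evaluated as follows. This is a sketch only: the function names are hypothetical, and the symbols for the hill center, its radius (0.2) and the angular velocity (2) follow the values quoted in the text.

```python
import math

X0 = (1.0 / 6.0, 1.0 / 6.0)   # initial center of the signal
SIGMA = 0.2                   # radius of the signal
OMEGA = 2.0                   # angular velocity of the rigid rotation


def initial_condition(x, y):
    """Cosine hill of Eq. (47): non-zero only inside the unit disc in scaled coordinates."""
    xs = (x - X0[0]) / SIGMA
    ys = (y - X0[1]) / SIGMA
    if xs * xs + ys * ys <= 1.0:
        return 0.25 * (1.0 + math.cos(math.pi * xs)) * (1.0 + math.cos(math.pi * ys))
    return 0.0


def velocity(x, y):
    """Rigid rotation around the center of the domain."""
    return (-OMEGA * y, OMEGA * x)


if __name__ == "__main__":
    print(initial_condition(*X0))          # peak value at the hill center
    print(initial_condition(0.45, 0.45))   # outside the hill support
    print(velocity(0.25, 0.0))
```

The peak value at the hill center is 1, and the velocity is everywhere perpendicular to the position vector, so path lines are circles around the center of the domain.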
6.5.3 FEM and PFEM Simulations
Next the results are compared. They were obtained using:
- an Eulerian FEM+SUPG code using different time integration schemes, where
  $\theta = 1$ is the first-order implicit Backward-Euler and $\theta = 0.5$ is the
  second-order Crank-Nicolson. They are labeled FEM-1order and FEM-2order
  respectively.
- a Lagrangian code called PFEM using different numbers of particles per element
  seeded at the initial time step (3, 9, 15, 24), labeled PFEM-3p, PFEM-9p,
  PFEM-15p and PFEM-24p respectively. For the seeding and the removing, at
  least one particle for each subelement was fixed as the minimum limit and 20
  as the maximum. The strategy for the diffusion treatment was fully implicit, i.e.
  the X-IVS method was employed.
Fig. 17 Amplitude evolution on mesh M1 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are
shown projected on the mesh
Figure 17 presents the evolution of the amplitude of the signal for the different
simulations. It confirms again that Eulerian simulations introduce a lot of
numerical diffusion, mainly due to the temporal scheme and the spatial stabilization.
Note that PFEM simulations start with an amplitude $\phi_{max} \neq 1$; this is due to the
projection of the maximum value on the particles onto the grid nodes. It is not an error,
because the particle information does not suffer any type of numerical dissipation
when it is transported. The only error source is the projection of the information onto
the mesh for secondary computations. To confirm this, the signal amplitude over the
particles may be viewed in Fig. 18. Here, another important result arises: the maximum
value over the particles is independent of the number of particles initially seeded.
This fact depends on the particle removing limits chosen during the computation and
also on the design of the projection operator.
6.5.4 Simulation on a Finer Mesh M2
Figure 19 presents the evolution of the signal amplitude for different simulations. In
this case the maximum on the mesh is almost the same as the maximum over the
particles. This is due to the finer mesh involved, which makes the projection operation
less diffusive.
7 Some Applications for More Complex Problems
This section ends the paper with brief details about new, promising
and challenging applications of PFEM. In some sense all these applications
involve some sort of coupled problem. The first is a typical natural convection heat
transfer problem in both laminar and turbulent regimes. The following example is
the well-known turbulence modeling benchmark proposed by Rodi and Ferziger,
a cube mounted on a channel floor. Finally, one example of multifluid flow is shown, a
step towards multiphase flow problems, which are a very demanding need of the industry.
7.1 Thermal and Fluid Dynamics Coupled Problems
7.1.1 Natural Convection in a Square Cavity
The problem presented deals with the two-dimensional flow with a Prandtl number
Pr = 0.71 in a square cavity of side H = 1 [m]. The boundary conditions for the
momentum equation are no-slip at all boundaries. Horizontal walls are insulated,
and the vertical sides are at different temperatures Tc < T < Th ($\phi = T$ for natural
convection problems). Figure 20 exhibits the geometry of the cavity. Simulations
were carried out using a mesh of triangular elements with 100 x 100 nodes and
refinement towards the walls. The wide range of Ra numbers (48) was obtained with
a constant temperature difference of $\Delta T = 1$ K, adjusting the thermal expansion
coefficient $\beta$ to supply the desired Ra:
\[ Ra = \frac{g \beta H^3 (\phi_h - \phi_c)}{\nu \alpha} \tag{48} \]
where $\alpha$ is the thermal diffusivity corresponding to air with the above mentioned Pr
in standard temperature and pressure conditions.
Fig. 18 Amplitude evolution on mesh M1 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are
shown on the particles, not projected on the mesh
Fig. 19 Amplitude evolution on mesh M2 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are
shown on the mesh
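The way a target Rayleigh number is reached by tuning the thermal expansion coefficient in Eq. (48) can be sketched as below. The kinematic viscosity of air is an assumed illustrative value (it is not given in the text), the thermal diffusivity is derived from it through Pr = 0.71, and the function names are hypothetical.

```python
G = 9.81          # gravity [m/s^2]
H = 1.0           # cavity side [m]
DELTA_T = 1.0     # temperature difference between the vertical walls [K]
PR = 0.71         # Prandtl number quoted in the text
NU = 1.5e-5       # ASSUMED kinematic viscosity of air [m^2/s]
ALPHA = NU / PR   # thermal diffusivity implied by Pr [m^2/s]


def rayleigh(beta):
    """Rayleigh number of Eq. (48) for a given thermal expansion coefficient."""
    return G * beta * H ** 3 * DELTA_T / (NU * ALPHA)


def beta_for(ra):
    """Thermal expansion coefficient that supplies the desired Rayleigh number."""
    return ra * NU * ALPHA / (G * H ** 3 * DELTA_T)


if __name__ == "__main__":
    for ra in (1.0e3, 1.0e4, 1.0e6):
        print(f"Ra = {ra:.0e}  ->  beta = {beta_for(ra):.3e} 1/K")
```

Because Ra is linear in the expansion coefficient, each decade of Ra corresponds to a decade in the coefficient while the geometry, the temperature difference and the fluid properties stay fixed.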
7.1.2 Results and Discussion
This section provides a set of solutions at low Ra number. The quantities under study
are the following:
- $u_{max}(\frac{1}{2})$: the maximum horizontal velocity on the vertical mid-plane of the
  cavity (together with its location).
- $v_{max}(\frac{1}{2})$: the maximum vertical velocity on the horizontal mid-plane of the
  cavity (together with its location).
Table 1 shows PFEM results for Ra = 10^3, 10^4 and 10^6 compared with the solutions of
[1, 2]. The excellent agreement with the reference data for both the momentum
and energy equations confirms the accuracy of this approach in this low Ra number
range. The horizontal velocity component on the vertical mid-plane is shown in
Fig. 21. It is worth noting that as the Ra number increases the boundary layer
becomes thinner and the maximum values of the velocity get closer to the walls.
Finally, Fig. 22 presents the temperature profiles for the three cases.
Table 1 Numerical solution for thermal square cavity with PFEM comparing with reference data
Ra      Data                 PFEM2      Corzo [1]   Davis [2]
10^3    u_max (x = 0.5)      3.605      3.640       3.634
10^3    y_max (x = 0.5)      0.814      0.812       0.813
10^3    v_max (y = 0.5)      3.650      3.700       3.679
10^3    x_max (y = 0.5)      0.183      0.177       0.179
10^4    u_max (x = 0.5)      15.982     16.281      16.182
10^4    y_max (x = 0.5)      0.824      0.822       0.823
10^4    v_max (y = 0.5)      19.378     19.547      19.509
10^4    x_max (y = 0.5)      0.116      0.123       0.120
10^6    u_max (x = 0.5)      64.483     64.558      65.330
10^6    y_max (x = 0.5)      0.845      0.851       0.851
10^6    v_max (y = 0.5)      218.054    221.572     216.750
10^6    x_max (y = 0.5)      0.037      0.067       0.039
Fig. 20 Detail of the cavity simulated: left wall at Th, right wall at Tc, top and
bottom walls are insulated
Fig. 21 Horizontal velocity profiles (horizontal velocity versus y) at the x mid-plane for
(a) Ra = 10^3, (b) Ra = 10^4 and (c) Ra = 10^6. Curves: PFEM-2, OpenFoam and G.V. Davis
7.1.3 Natural Convection in a Cubic Cavity
The schematic model for the problem is shown in Fig. 23. The cubic cavity has a side
of one meter, an aspect ratio of unity, and is filled with air as working fluid.
The Prandtl number is fixed at Pr = 0.71. All surrounding walls are rigid and
impermeable. The vertical walls located at x = 0 and x = 1 are kept
isothermal but at different temperatures, Th and Tc respectively. The buoyancy
force due to gravity acts downwards (i.e., in the negative z-direction).
7.1.4 Results and Discussion
For the present range of Ra numbers, solutions were obtained on a mesh with 81,000
tetrahedral elements and around eighteen thousand nodes, with refinement
towards the walls. The following characteristic quantities are presented:
- $u_{max}(\frac{1}{2})$: the maximum horizontal velocity in the x-direction on the center
  line (x = 0.5, y = 0.5) of the cavity and its location.
- $w_{max}(\frac{1}{2})$: the maximum vertical velocity in the z-direction on the center
  line (y = 0.5, z = 0.5) of the cavity and its location.
Table 2 shows PFEM results for Ra = 10^4, 10^5 and 10^6 compared with the
solutions of [5, 24]. Finally, Fig. 24 presents a wireframe of the mesh used, with slices
at the mid-planes y = 0.5 and z = 0.5 respectively.
Table 2 Numerical solution for thermal cubic cavity with PFEM comparing with reference data
Ra      Data                     PFEM      Wakashima [24]   Fusegi [5]
10^4    u_max (x = y = 0.5)      0.1978    0.1989           0.2013
10^4    z_max (x = y = 0.5)      0.8460    0.8250           0.8167
10^4    w_max (y = z = 0.5)      0.2190    0.2211           0.2252
10^4    x_max (y = z = 0.5)      0.1260    0.1253           0.1167
10^5    u_max (x = y = 0.5)      0.1409    0.1423           0.1468
10^5    z_max (x = y = 0.5)      0.8460    0.8500           0.8547
10^5    w_max (y = z = 0.5)      0.2359    0.2407           0.2471
10^5    x_max (y = z = 0.5)      0.0680    0.0751           0.0647
10^6    u_max (x = y = 0.5)      0.0766    0.0813           0.0842
10^6    z_max (x = y = 0.5)      0.8570    0.8500           0.8557
10^6    w_max (y = z = 0.5)      0.2897    0.2382           0.2588
10^6    x_max (y = z = 0.5)      0.0280    0.0500           0.0331
Fig. 22 Temperature field for (a) Ra = 10^3, (b) Ra = 10^4 and (c) Ra = 10^6
Fig. 23 Schematic model for the natural convection in a cubical cavity
Fig. 24 Mesh with slices of section at mid-planes y = 0.5 and z = 0.5
7.2 Turbulent Flows
7.2.1 Wall Mounted Cube Simulation
Turbulent flows around three-dimensional obstacles are common in nature and occur
in many applications, including flow around tall buildings, vehicles and computer
chips. Understanding and predicting the properties of these flows is necessary for
safe, effective and economical engineering designs. Experimental techniques are
expensive and often provide data that are not sufficiently detailed. With the advent of
supercomputers it has become possible to investigate these flows using numerical
simulations.
In this paper the simulation of the turbulent flow around a cube obstacle is
presented. This test is known as flow over a wall-mounted cube, and it was analyzed
experimentally by Martinuzzi and Tropea [17] and numerically by Shah and Ferziger
[21], Lakehal and Rodi [15], and Rodi et al. [20], among others. The flow around a
cube exhibits three-dimensionality of the mean flow, separation
and large-scale unsteadiness. Quantitative results for this flow are scarce, so flow
patterns are exhaustively analyzed and compared.
The geometry of the problem is presented in Fig. 25.
Fig. 25 Geometry for the flow around a cube obstacle (channel height H = 2h)
Computational Modeling
In this work, the numerical method used is the Particle Finite Element Method
(PFEM) with Large Eddy Simulation (LES) for turbulence modeling. The Sub-Grid
Scale (SGS) model used is the static Smagorinsky model. Regarding the
computational domain, the problem was solved using two grids: the first is a relatively
coarse grid of one million tetrahedral elements (refined towards the cube and
behind it), with a mesh size of $\Delta h = h/25$ over the cube. The second grid has
the same kind of refinement but around four million tetrahedral
elements, with $\Delta h = h/40$.
The spanwise boundary condition is slip and the spanwise width is 7 h, ensuring that
blockage effects are small. In the streamwise direction, inflow-outflow boundary
conditions are used. A parabolic flow with some perturbations is used at the inlet and
a fixed pressure condition is applied at the exit. The streamwise length of the domain
is 10 h.
Fig. 26 The streamlines on the symmetry plane at Re = 40,000. a shows the experimental result
of Martinuzzi and Tropea [17], b the result of the LES simulation in [21], and c and d the
results of PFEM using a coarse and a finer mesh respectively
Summary of Results
Large eddy simulations were performed at Re = 40,000. Figure 26 shows a
comparison of time-averaged streamlines on the symmetry plane. The overall prediction
of the separation region on the roof and behind the obstacle is quite good even using
coarse grids. Shah and Ferziger commented that in their simulations the stagnation
point was located high on the front face, and the same conclusion is reached in this
work. Fluid striking the body above the stagnation point goes over the obstacle, and
with the finer mesh we find a solution where it reattaches on the roof, something that
Shah and Ferziger could not obtain. Using the finer mesh, the rear recirculation region
is not closed, but streamlines originating upstream of the obstacle do not enter this
region; fluid enters the rear recirculation region from the sides. Near the top of the
recirculation region we find the head of the arch vortex. Results using the coarse mesh
are not accurate, mainly behind the obstacle.
Figure 27 presents the time-averaged streamlines on the floor of the channel. The
streamline patterns are consistent with those observed by Martinuzzi and Tropea [17].
These streamlines, which may be viewed as skin friction lines, show the complexity
of this 3-D flow. In reference [21], the primary separation occurs at a saddle
point located about one obstacle height (1.05 h) ahead of the obstacle (experimental
value = 1.026), whereas the PFEM simulation gives approximately 0.89 h with the coarse
grid and 0.92 h with the finer grid. The separation region wraps around the obstacle
and forms a strong horseshoe vortex. The converging and diverging streamlines that
mark the extent of this vortex are regions of strong upwash and downwash. This
horseshoe is better represented by PFEM using the finer mesh, whereas with the
coarse mesh the streamlines close too much behind the obstacle. Instantaneous
pictures of the flow (not presented here) show that the horseshoe vortex is, in fact,
highly intermittent; an intact structure is almost never found in these snapshots. The
mean flow on the side faces is entirely reversed. In Shah and Ferziger, the primary
reattachment length of 1.65 h agrees well with the experimental value of 1.61 h;
PFEM, however, finds reattachment at 1.8 h.
In the work of Shah and Ferziger [21], it is said that both the primary separation
point ahead of the obstacle and the rear reattachment points are singular points
(zero skin friction) where the so-called separation lines begin and end. They also
comment that the owl-face shaped streamlines in the rear recirculation zone of the
obstacle correspond to the base of the arch vortex. The arch vortex is formed by
quasi-periodic vortex shedding from the upstream vertical corners that resembles a
von Karman street. This intact arch vortex exists only in the mean flow and is an
artifact of averaging. PFEM reproduces this behavior only approximately and,
strangely, the result is more accurate with the coarse mesh. It must be noted that
neither grid is fine enough near the floor of the channel, so further refinement is
required to reach the same quality of results as Shah and Ferziger.
Efficiency
In this section the scalability of the current implementation of PFEM is presented.
The above mentioned test, using the finer grid, was carried out on an Infiniband
interconnected cluster, which has dual-socket nodes with Intel Xeon E5-2600 CPUs
and 64 GB RAM. The interconnection is IB-QDR at 40 Gbps (Fig. 27).
Figure 28 presents the scalability of each PFEM stage and of the entire simulation
using an Eulerian weighting strategy, which gives approximately the same number
of degrees of freedom in each partition. It can be noted that the efficiency on the
Infiniband cluster is still good when running with 32 cores, reaching a global
speed-up of S32 of approximately 26x. Using more cores, the efficiency decays because
there is not enough work for each process to outweigh the communication time.
Table 3 CPU-times comparison in seconds between different PFEM2 algorithms and OpenFOAM
for one, two and four cores
Solver                        1 core (s)   2 cores (s)   4 cores (s)
OpenFOAM                      754          402           286
PFEM2 moving mesh (CIMNE)     484          371           326
PFEM2 fixed mesh (CIMNE)      284          179           138
PFEM2 fixed mesh (CIMEC)      330          176           99
7.3 Multifluids
7.3.1 Sloshing Test
In this section a comparison with the results of the sloshing test is presented. For the
experiment, the same mesh and configuration as presented in Idelsohn et al.
[9] have been used (Figs. 29 and 30).
Table 3 shows the computational time necessary to simulate 1 s on an Intel(R)
Core(TM) i7-3820 CPU 3.60 GHz with OpenFOAM and the PFEM2 versions of the
International Center for Numerical Methods in Engineering (CIMNE).
On the other hand, the test of the implementation presented in this paper was
executed on an Intel(R) Core(TM) i5-3230M CPU 2.60 GHz. To match the heterogeneous
platforms, a benchmarking factor of 4,007/9,010 (extracted from the web page
http://cpubenchmark.net/high_end_cpus.html) is used, and the final values are presented
in the table.
The reported values show that, for the settings described, PFEM with fixed
mesh is more than 2x faster than OpenFOAM.
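The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below takes the CPU times from Table 3, the benchmarking factor 4,007/9,010 and the reported 32-core speed-up of about 26, and derives speed-ups, parallel efficiencies and ratios to OpenFOAM; the dictionary and function names are hypothetical.

```python
CORES = (1, 2, 4)

# CPU times in seconds from Table 3 (one, two and four cores).
TABLE3 = {
    "OpenFOAM": (754.0, 402.0, 286.0),
    "PFEM2 moving mesh (CIMNE)": (484.0, 371.0, 326.0),
    "PFEM2 fixed mesh (CIMNE)": (284.0, 179.0, 138.0),
    "PFEM2 fixed mesh (CIMEC)": (330.0, 176.0, 99.0),
}

# Factor used in the text to match the two heterogeneous platforms.
BENCH_FACTOR = 4007.0 / 9010.0


def speedup(times):
    """Speed-up of each run relative to the single-core time."""
    return [times[0] / t for t in times]


def efficiency(times):
    """Parallel efficiency: speed-up divided by the number of cores."""
    return [s / c for s, c in zip(speedup(times), CORES)]


def ratio_to(reference, solver):
    """How many times faster `solver` is than `reference` for each core count."""
    return [r / s for r, s in zip(TABLE3[reference], TABLE3[solver])]


if __name__ == "__main__":
    for name, times in TABLE3.items():
        print(name, [round(s, 2) for s in speedup(times)])
    print("OpenFOAM / fixed mesh (CIMEC):",
          [round(r, 2) for r in ratio_to("OpenFOAM", "PFEM2 fixed mesh (CIMEC)")])
    print("benchmark factor:", round(BENCH_FACTOR, 4))
    print("cluster efficiency at 32 cores:", 26.0 / 32.0)
```

For both fixed-mesh rows the ratio to OpenFOAM stays above 2 for one, two and four cores, which is consistent with the statement in the text.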
7.3.2 Dam-Break Test
In this section a comparison with the results of a dam-break test is presented. For the
experiment, the same mesh and configuration as presented in Idelsohn et al.
[9] have been used.
Fig. 27 The streamlines in a plane near the floor at Re = 40,000. a shows the numerical result
of Shah and Ferziger [21], and b and c present the results of PFEM using a coarse and a finer
mesh respectively
Fig. 28 Speed-up Sn versus number of processors (1 to 32) over an Infiniband cluster. Case: flow
around a mounted cube in 3d. Curves: linear, acceleration, X-IVAS, projection, poisson,
correction and total
Fig. 29 Interface relative height [m] versus time [s] at the vertical walls (left side and right
side) for PFEM fixed mesh
Figure 31 presents snapshots of the simulation. Compared with the results
obtained in [9], good agreement with the experimental and OpenFOAM results can be
observed, both in the shape of the free surface and in its time evolution. The
current version uses Courant numbers larger than 10 (maximum 15) without any
difficulty arising from this large time-step.
The CPU time needed to simulate 1 s of real time is 274.75 s (running in
serial mode). Idelsohn et al. report 278 s for the CIMNE PFEM2 fixed mesh version
and 473 s with OpenFOAM.
Fig. 30 From left to right and top to bottom: sloshing of two immiscible fluids with a large jump
in the density: snapshots at different time steps (t = 0.55, 1.15, 1.7, 2.3, 2.75, 3.35 and 5 s.)
Fig. 31 From left to right and top to bottom: snapshots of the dam break without obstacle at
t = 0, 0.2, 0.4, 0.6, 0.8 and 1 s
8 Conclusions
In this paper a review of PFEM and its present developments has been presented. In recent
years much effort has been devoted to improving the performance of this method in
order to make it competitive with the solvers most commonly used in computational
mechanics. Moreover, the recent findings suggest that we are at the gates of a
paradigm shift in the way simulations are performed, especially
considering that the community is demanding methods that are commensurate
with the needs of engineering design.
While this paper does not delve into the numerical analysis, it establishes the basis
to do so in the next few years, with the aim of demonstrating mathematically the
advantages of Lagrangian methods of this type over the very commonly used
Eulerian methods.
Finally, the last goal has been to show that, in addition to the well-known virtues
of the method for solving problems with heterogeneous flows, it can also handle
complex homogeneous flows, as in the case of turbulence and cases
with thermal coupling.
Acknowledgments This work was partially supported by the European Research Council under the
Advanced Grant: ERC-2009-AdG Real Time Computational Mechanics Techniques for Multi-Fluid
Problems. Norberto Nigro and Juan Gimenez want to thank CONICET, Universidad Nacional
del Litoral and ANPCyT for their financial support (grants PICT 1645 BID (2008), CAI+D 65-
333 (2009)). Thanks also go to Santiago Marquez Damian for his invaluable assistance in showing
the merits of PFEM vis-à-vis other available solvers, of which Santiago is an expert user; to
Eugenio Oñate and CIMNE for their unconditional support and his teachings throughout his
scientific life; to Pedro Morin and Marta Bergallo for interesting mathematical discussions; and
to Nestor Calvo and Pablo Novara for sharing some discussions about mesh generation and
computational geometry.
References
1. Corzo S, Marquez Damian S, Nigro N (2011) Numerical simulation of natural convection
phenomena. Mecánica Computacional XXX:277-296
2. De Vahl Davis G (1983) Natural convection of air in a square cavity: a benchmark numerical
solution. Int J Num Meth Fluids 3:249-264
3. Del Pin F (2003) The meshless finite element method applied to a lagrangian particle
formulation of fluid flows. Ph.D. Thesis, Facultad de Ingeniería y Ciencias Hídricas (FICH),
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC), Universidad Nacional
del Litoral
4. Donea J, Huerta A (1983) Finite element method for flow problems. Wiley, Chichester
5. Fusegi T, Hyun J, Kuwahara K, Farouk B (1991) A numerical study of three-dimensional natural
convection in a differentially heated cubical enclosure. Int J Heat Mass Transfer 34:1543-1551
6. Gimenez JM, Nigro NM (2011) Parallel implementation of the particle finite element method.
Mecánica Computacional XXX:3021-3032
7. Gimenez JM, Nigro NM, Idelsohn SR (2012) Improvements to solve diffusion-dominant
problems with pfem-2. Mecánica Computacional XXXI:137-155
8. Hryb D, Cardozo M, Ferro S, Goldschmidt M (2009) Particle transport in turbulent flow using
both lagrangian and eulerian formulations. Int Commun Heat Mass Transfer 36:451-457
9. Idelsohn S, Marti JM, Becker P, Oñate E (2014) Analysis of multi-fluid flows with large
time-steps using the particle finite element method. Int J Num Meth in Fluids (in press)
10. Idelsohn S, Nigro NM, Limache A, Oñate E (2012) Large time-step explicit integration method
for solving problems with dominant convection. Comput Methods Appl Mech Eng
217-220:168-185
11. Idelsohn SR, Nigro NM, Gimenez JM, Rossi R, Marti J (2013) A fast and accurate method to
solve the incompressible navier-stokes equations. Eng Comput 30(2):197-222
12. Idelsohn SR, Oñate E, Calvo N, Del Pin F (2003) The meshless finite element method. Int J
Num Meth Eng 58(6):893-912
13. Idelsohn SR, Oñate E, Del Pin F (2004) The particle finite element method: a powerful tool to
solve incompressible flows with free-surfaces and breaking waves. Int J Numer Meth
61:964-989
14. Jasak H (1996) Error analysis and estimation for the finite volume method with applications
to fluid flows. Ph.D. Thesis, London
15. Lakehal D, Rodi W (1997) Calculation of the flow past a surface-mounted cube with two-layer
turbulence models. J Wind Eng Ind Aerodyn 67:65-78
16. Leveque R (2002) Finite volume methods for hyperbolic problems, 1st edn. Cambridge
University Press, Cambridge
17. Martinuzzi R, Tropea C (1993) The flow around surface-mounted, prismatic obstacles placed
in a fully developed channel flow. J Fluids Eng 115:85-92
18. Nigro N, Gimenez J, Limache A, Idelsohn S, Oñate E, Calvo N, Novara P, Morin P (2011) A new
approach to solve incompressible navier-stokes equation using a particle method. Mecánica
Computacional XXX
19. Oñate E, Idelsohn SR, Del Pin F, Aubry R (2004) The particle finite element method, an
overview. Int J Comput Meth 1:267-307
20. Rodi W, Ferziger J, Breuer M, Pourquie M (1997) Status of large eddy simulation: results of
a workshop. Trans ASME J Fluid Eng 119:248-262
21. Shah KB, Ferziger JH (1997) A fluid mechanician's view of wind engineering: large eddy
simulation of flow past a cubic obstacle. J Wind Eng Ind Aerodyn 67&68:211-224
22. Sklar DM, Gimenez JM, Nigro NM, Idelsohn SR (2012) Thermal coupling in particle finite
element method - second generation. Mecánica Computacional XXXI:4143-4152
23. Stam J (1999) Stable fluids. In: SIGGRAPH 99 Conference Proceedings, Annual Conference
Series, pp 121-128
24. Wakashima S, Saitoh T (2004) Benchmark solutions for natural convection in a cubic cavity
using the high-order time-space method. Int J Heat Mass Transfer 47:853-864
Part VI
Fluid-Structure Interactions Problems
Computational Engineering Analysis and Design
with ALE-VMS and ST Methods
Kenji Takizawa, Yuri Bazilevs, Tayfun E. Tezduyar, Ming-Chen Hsu,
Ole Øiseth, Kjell M. Mathisen, Nikolay Kostov and Spenser McIntyre
Abstract Flows with moving interfaces include fluid–structure interaction (FSI) and
quite a few other classes of problems, have an important place in engineering analysis
and design, and pose significant computational challenges. Bringing solution
and analysis to them motivated the Deforming-Spatial-Domain/Stabilized Space–Time
(DSD/SST) method and also the variational multiscale version of the Arbitrary
Lagrangian–Eulerian method (ALE-VMS). These two methods and their improved
versions have been applied to a diverse set of challenging problems with a common
core computational technology need. The classes of problems solved include
free-surface and two-fluid flows, fluid–object and fluid–particle interaction, FSI, and
flows with solid surfaces in fast, linear or rotational relative motion. Some of the most
challenging FSI problems, including parachute FSI, wind-turbine FSI and arterial
FSI, are being solved and analyzed with the DSD/SST and ALE-VMS methods as
core technologies. Better accuracy and improved turbulence modeling were brought
with the recently-introduced VMS version of the DSD/SST method, which is called
DSD/SST-VMST (also ST-VMS). In specific classes of problems, such as parachute
FSI, arterial FSI, ship hydrodynamics, fluid–object interaction, aerodynamics of
flapping wings, and wind-turbine aerodynamics and FSI, the scope and accuracy
of the modeling were increased with the special ALE-VMS and ST techniques targeting
each of those classes of problems. This article provides an overview of how
the core and special ALE-VMS and ST techniques are used in computational engineering
analysis and design. The article includes an overview of three of the special
ALE-VMS and ST techniques, which are just a few examples of the many special
techniques that complement the core methods. The impact of the ALE-VMS and ST
methods in engineering analysis and design is shown with examples of challenging
problems solved and analyzed in parachute FSI, arterial FSI, ship hydrodynamics,
aerodynamics of flapping wings, wind-turbine aerodynamics, and bridge-deck aerodynamics
and vortex-induced vibrations.
K. Takizawa (B)
Department of Modern Mechanical Engineering and Waseda Institute for Advanced Study,
Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
Y. Bazilevs
Structural Engineering, University of California, San Diego, 9500 Gilman Drive,
La Jolla, CA 92093, USA
T. E. Tezduyar · N. Kostov · S. McIntyre
M.-C. Hsu
Department of Mechanical Engineering, Iowa State University,
2025 Black Engineering, Ames, IA 50011, USA
O. Øiseth · K. M. Mathisen
Department of Structural Engineering, Norwegian University of Science and Technology,
7491 Trondheim, Norway
1 Introduction
Flows with moving interfaces include fluid–structure interaction (FSI), fluid–object
interaction (FOI), fluid–particle interaction (FPI), free-surface and multi-fluid flows,
and flows with solid surfaces in fast, linear or rotational relative motion. These
problems are frequently encountered in engineering analysis and design, pose
some of the most formidable computational challenges, and have a common core
computational technology need. That crucial need motivated the development of
the Deforming-Spatial-Domain/Stabilized Space–Time (DSD/SST) method [15],
which is a general-purpose interface-tracking (moving-mesh) technique, as a core
computational technology. The DSD/SST method is an alternative to the Arbitrary
Lagrangian–Eulerian (ALE) finite element formulation [6], which is the most widely
used moving-mesh technique, with increased emphasis on FSI in recent years (see,
for example, [7–26]). Though less widely used than the ALE formulation, over the
past 20 years the DSD/SST method has been applied to some of the most challenging
moving-interface problems. The classes of problems solved with the DSD/SST
method since its inception include the free-surface and multi-fluid flows [1, 27–30],
FOI [1, 28, 30], aerodynamics of flapping wings [31–34], flows with solid surfaces in
fast, linear or rotational relative motion [15, 28, 29, 35–37], compressible flows [28],
shallow-water flows [29, 38], FPI [28, 29], and FSI [3–5, 31, 39–57]. Very recently,
a new version of the DSD/SST method that can address the computational challenges
involved in topology changes, such as contact between solid surfaces, was introduced
in [58] with the name ST-TC.
In the DSD/SST formulation, the ST computations are carried out one ST slab
at a time, where the slab is the slice of the ST domain between the time levels n and
n+1. The basis functions are continuous within a ST slab, but discontinuous from one
ST slab to another. The original DSD/SST method [1] is based on the SUPG/PSPG
stabilization, where SUPG and PSPG stand for the Streamline-Upwind/Petrov-
Galerkin [59] and Pressure-Stabilizing/Petrov-Galerkin [1] methods. Starting in its
very early years, the DSD/SST method also included the LSIC (least-squares on
incompressibility constraint) stabilization. New versions of the DSD/SST method
have been introduced since its inception, including those in [3], which have been
serving as the core numerical technology in the majority of the ST FSI computations
carried out in recent years. The most recent DSD/SST method is the ST version [4,
5] of the residual-based variational multiscale (RBVMS) method [60–63]. It was
named DSD/SST-VMST (i.e. the version with the VMS turbulence model) in [4],
which was also called ST-VMS in [5]. The original DSD/SST method was named
DSD/SST-SUPS in [4] (i.e. the version with the SUPG/PSPG stabilization), which
was also called ST-SUPS in [22].
The ALE-VMS formulation [15, 54] is a moving-domain extension of the
RBVMS formulation, originally proposed in [62] and successfully applied to simulation
of turbulent flows and FSI in [9–11, 23, 63–68]. An important additional feature
of the ALE-VMS methodology is weak enforcement of essential boundary condi-
tions. Weakly enforced essential boundary conditions were introduced in [69]. Weak
boundary conditions produce significantly more accurate solutions than strongly
enforced boundary conditions on meshes with insufficient boundary-layer resolu-
tion [63, 70, 71], which is almost always the case in practice. The ALE-VMS method
with weakly enforced boundary conditions is the main computational technology
behind the ALE-VMS computations presented in this chapter.
The Mixed Interface-Tracking/Interface-Capturing Technique (MITICT) [29] was
introduced primarily for FOI with multiple fluids (see, for example, [72, 73]). The
MITICT was successfully tested in [30], where the interface-tracking technique was
an ST formulation, and the interface-capturing method was the Edge-Tracked Inter-
face Locator Technique (ETILT) [29]. It was also tested in [74] by using a moving
Lagrangian interface technique [75] for interface tracking and the ETILT.
In this article, the ALE-VMS formulation is used in the context of the MITICT
technique, in which the air-water interface is captured using the level set approach
[76]. The level set function, which is convected with the flow, is used for separating
the air and water subdomains. The Navier–Stokes equations of incompressible flows
are employed in both air and water subdomains. The Navier–Stokes and level set
equations are written in an ALE frame [6]. The rigid object is described using balance
equations of linear and angular momentum. The ALE technique is employed to
track the interface between the moving fluid domain (consisting of air and water
subdomains) and the rigid object. Application of the ALE-VMS formulation to free-
surface flow and FOI may be found in [12, 16, 77, 78].
Moving-mesh methods require mesh update methods. Mesh update consists of
moving the mesh for as long as possible and remeshing as needed. With the key
objectives being to maintain the element quality near solid surfaces and to minimize
frequency of remeshing, a number of advanced mesh update methods [3, 27, 29,
79, 80] were developed to be used with the DSD/SST method, including those that
minimize the deformation of the small elements placed near solid surfaces.
An ST method will naturally involve more computational cost per time step than
an ALE method, but it gives us the option of using higher-order basis functions in
time, including the NURBS basis functions, which have been used very effectively
as spatial basis functions (see [9, 64, 81, 82]). This of course increases the order of
accuracy in the computations [4, 5, 48], and the desired accuracy can be attained
with larger time steps, but there are positive consequences beyond that. The ST
context provides us better accuracy and efficiency in temporal representation of
the motion and deformation of the moving interfaces and volume meshes, and better
efficiency in remeshing. This has been demonstrated in a number of 3D computations,
specifically, flapping-wing aerodynamics [32–34, 83], separation aerodynamics of
spacecraft [56], and wind-turbine aerodynamics [37].
There are some advantages in using a discontinuous temporal representation in
ST computations. For a given order of temporal representation, we can reach a higher
order accuracy than one would reach with a continuous representation of the same
order. When we need to change the spatial discretization (i.e. remesh) between two
ST slabs, the temporal discontinuity between the slabs provides a natural framework
for that change. There are advantages also in continuous temporal representation. We
obtain a smooth solution, NURBS-based when needed. We also can deal with the
computed data in a more efficient way, because we can represent the data with fewer
temporal control points, and that reduces the computer storage cost. These advan-
tages motivated the development of the ST computation techniques with continuous
temporal representation (ST-C) [84].
The core and special ALE-VMS and ST FSI methods mentioned above were
motivated by the need for the solution and analysis of specific classes of challenging
problems, such as parachute FSI, arterial FSI, aerodynamics of flapping wings, ship
hydrodynamics and FOI, and wind-turbine aerodynamics and FSI. This can be seen
from the ALE-VMS and ST articles cited in the first paragraph, especially the articles
since 2008, and will also be seen from the examples we will present in this chapter.
In the case of the parachute FSI, the special methods were motivated also by the need
for supporting the design process for the NASA spacecraft parachutes.
For the governing equations and core methods, including the ALE-VMS and
DSD/SST methods, and for many of the special techniques, we refer the interested
reader to [15, 22, 33, 50, 54]. An overview of three of the special techniques is
provided in Sect. 2. Examples of the challenging problems solved are presented in
Sect. 3, and the concluding remarks are given in Sect. 4.
2 Special Methods
A certain class of FSI problems might involve some specific computational chal-
lenges beyond those encountered in a typical FSI problem. That requires develop-
ment of special FSI methods targeting those challenges. A good number of special
methods were developed in conjunction with the core ST FSI method to address the
specific computational challenges involved in parachute FSI [53], patient-specific
arterial FSI [50], aerodynamics of flapping wings [33, 34], and wind-turbine aero-
dynamics [37]. The details on these special methods can be found in the references
cited above. Here we give three examples.
Fig. 1 Parachute radial lines and gores
Fig. 2 Rings, sails, ring gaps, and sail slits
2.1 Homogenization Model for Ringsail Parachutes
Parachute FSI involves all the computational challenges of a typical FSI problem.
Spacecraft parachutes are often very large ringsail parachutes, made of a large number
of gores, where a gore is the slice of the canopy between two radial reinforcement
cables running from the parachute vent to the skirt (see Fig. 1). Ringsail parachute
gores are constructed from rings and sails, resulting in a parachute canopy with
hundreds of ring gaps and sail slits (see Fig. 2). The complexity created by this
geometric porosity makes FSI modeling inherently challenging.
Fig. 3 Areas used in HMGP-FG
Fig. 4 The two porosity coefficients for each patch are calculated in a one-time fluid mechanics
computation with an n-gore slice of the parachute canopy, where the flow through all the gaps and
slits is resolved. Except for the first and last patches, each patch contains a gap or a slit. See [3, 53]
for details
The Homogenized Modeling of Geometric Porosity (HMGP) [3] and its new
version, HMGP-FG [53], were introduced to help us bypass the intractable complexities
of the geometric porosity by approximating it with an equivalent, locally
varying homogenized porosity. In HMGP-FG, the normal velocity crossing the parachute
canopy under a pressure differential $\Delta p$ is modeled as
$$u_n = -(k_F)_J \frac{A_F}{A_1} \Delta p - (k_G)_J \frac{A_G}{A_1} \operatorname{sgn}(\Delta p) \sqrt{\frac{|\Delta p|}{\rho}}, \qquad (1)$$
where $A_1$, $A_F$ and $A_G$ are defined in Fig. 3, $\rho$ is the air density, and $(k_F)_J$ and $(k_G)_J$ are the homogenized
porosity coefficients for each patch J , calculated in a one-time fluid mechanics
computation with an n-gore slice of the parachute canopy (see Fig. 4). Even in a
fully open configuration, the parachute canopy goes through a periodic breathing
motion where the diameter varies between its minimum and maximum values. The
shapes and areas of the gaps and slits vary significantly during this breathing motion
(see Fig. 5). The porosity coefficients have very good invariance properties with
respect to these shape and area changes, and this can be seen in Fig. 6.
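The porosity model of Eq. (1) can be evaluated patch by patch with a very small function. The following is a minimal Python sketch, assuming the form u_n = -(k_F)_J (A_F/A_1) Δp - (k_G)_J (A_G/A_1) sgn(Δp) sqrt(|Δp|/ρ), with ρ taken as the air density. The function name, the argument names and the sample values are hypothetical and are not taken from [3, 53].

```python
import math


def hmgp_fg_normal_velocity(dp, k_f, k_g, a_f, a_g, a_1, rho):
    """Homogenized normal velocity through one canopy patch (illustrative sketch).

    dp       : pressure differential across the canopy
    k_f, k_g : porosity coefficients (k_F)_J and (k_G)_J of the patch
    a_f, a_g : areas A_F and A_G of the patch
    a_1      : reference area A_1 of the patch
    rho      : fluid density (assumed here to be the air density)
    """
    # Term linear in the pressure differential.
    linear_term = -k_f * (a_f / a_1) * dp
    # Term that scales with the square root of |dp| and keeps the sign of dp.
    sqrt_term = -k_g * (a_g / a_1) * math.copysign(1.0, dp) * math.sqrt(abs(dp) / rho)
    return linear_term + sqrt_term


# Illustrative values only (not from the cited computations).
u_n = hmgp_fg_normal_velocity(dp=50.0, k_f=1.0e-3, k_g=0.5,
                              a_f=0.9, a_g=0.1, a_1=1.0, rho=1.2)
```

Under this assumed form the velocity is an odd function of the pressure differential, so reversing the sign of Δp reverses the flow direction through the patch.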
2.2 Flapping-Wing Motion Representation with Higher-Order Temporal Functions
Computer modeling of the aerodynamics of flapping wings requires an accurate
temporal representation of the motion and deformation of the wings. It also requires
Fig. 5 The shapes and the areas of the slits vary significantly during the canopy breathing motion
Fig. 6 The porosity coefficients (kF ) J and (kG ) J for each patch J , at different canopy shapes
during the breathing motion. The plots show good invariance for these coefficients with respect to
the shape changes
robust and efficient ways of moving the mesh and remeshing as needed. Special
techniques to be used in conjunction with the DSD/SST method have been developed
(see [32–34]) based on using higher-order functions (specifically NURBS basis
functions) in time in representing the wing motion and deformation, mesh motion,
Fig. 7 Mesh motion is represented by using NURBS basis functions in time. The temporal-control
meshes are the coefficients of the NURBS basis functions. (Plot labels: temporal-control meshes
$M_c$ and $M_{c+1}$; basis-function values from 0.0 to 1.0; time axis from 0 to 1 in steps of 1/5)
Fig. 8 Remeshing is handled by multiple knot insertion where we want to remesh. That point in
time becomes a patch boundary. (Plot labels: basis functions marked "New"; basis-function values
from 0.0 to 1.0; time axis from 0 to 1 in steps of 1/5; a marked remeshing point)
and remeshing. Using cubic NURBS basis functions in temporal representation of
the wing position gives us a continuous representation of the acceleration, which in
turn gives us a continuous representation of the aerodynamic forces. Using NURBS
basis functions in temporal representation of the mesh motion (see Fig. 7) gives us
a very effective way of dealing with moving meshes. This allows us to do mesh
computations with longer time in between, but get the mesh-related information,
such as the coordinates and their time derivatives, from the temporal representation
whenever we need it. Figure 8 illustrates how remeshing is handled in this approach.
We perform multiple knot insertions where we want to remesh, and that point in time
becomes a patch boundary. More details on how temporal NURBS basis functions
are used in mesh motion and remeshing can be found in [32, 33].
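The idea of storing a few temporal-control meshes and evaluating the mesh at any time in between can be sketched in a few lines. The following Python sketch is an illustration only: it assumes unit NURBS weights (so the basis reduces to plain B-splines), a clamped cubic knot vector, and a toy "mesh" made of two nodes with one coordinate each. The names `bspline_basis`, `mesh_at_time`, `knots` and `control_meshes` are hypothetical and do not come from [32, 33]; the knot insertion used for remeshing is not shown.

```python
def bspline_basis(i, p, t, knots):
    """Cox-de Boor recursion: value of the i-th degree-p B-spline basis function at t."""
    if p == 0:
        # Half-open spans, so the final time knots[-1] itself is excluded in this sketch.
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    value = 0.0
    left_span = knots[i + p] - knots[i]
    if left_span > 0.0:
        value += (t - knots[i]) / left_span * bspline_basis(i, p - 1, t, knots)
    right_span = knots[i + p + 1] - knots[i + 1]
    if right_span > 0.0:
        value += (knots[i + p + 1] - t) / right_span * bspline_basis(i + 1, p - 1, t, knots)
    return value


def mesh_at_time(t, control_meshes, knots, degree=3):
    """Nodal coordinates at time t as a weighted sum of the temporal-control meshes."""
    weights = [bspline_basis(i, degree, t, knots) for i in range(len(control_meshes))]
    n_nodes = len(control_meshes[0])
    return [sum(w * mesh[a] for w, mesh in zip(weights, control_meshes))
            for a in range(n_nodes)]


# Cubic, clamped knot vector on [0, 1) with one interior knot: five basis functions.
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
# Five temporal-control meshes, each holding the 1D coordinates of two nodes.
control_meshes = [[0.0, 1.0], [0.1, 1.1], [0.3, 1.2], [0.4, 1.4], [0.5, 1.5]]
coords_mid = mesh_at_time(0.25, control_meshes, knots)
```

With such a representation the mesh computations produce only the control meshes, while coordinates at intermediate times are recovered by evaluating the basis functions, and time derivatives could be obtained by differentiating them.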
2.3 Redistancing and Mass Conservation for the Level-Set Formulation
When the ALE-VMS method is used in the context of the MITICT with the level-set
formulation, additional computational technology is employed to enhance the accu-
racy and robustness of the free-surface flow formulation. The use of a regularized
Heaviside function in the definition of the fluid density and viscosity necessitates the
level set to satisfy the signed-distance function property near the air-water interface.
To maintain the signed-distance property of the level set function, a redistancing
procedure based on the Eikonal partial differential equation is employed. The details
of the numerical formulation may be found in [12, 16, 77, 78].
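To make the role of the regularized Heaviside function concrete, the following Python sketch blends a fluid property across the interface. It is a sketch under stated assumptions: the sine-smoothed form of the Heaviside function, the convention that the level set is positive in water, the half-width `eps` of the transition band, and the density values are all illustrative choices, not details taken from [12, 16, 77, 78].

```python
import math


def regularized_heaviside(phi, eps):
    """Smoothed step: 0 for phi <= -eps, 1 for phi >= eps, smooth in between."""
    if phi <= -eps:
        return 0.0
    if phi >= eps:
        return 1.0
    return 0.5 * (1.0 + phi / eps + math.sin(math.pi * phi / eps) / math.pi)


def blended_property(phi, value_air, value_water, eps):
    """Density or viscosity at a point, blended between the air and water values."""
    return value_air + (value_water - value_air) * regularized_heaviside(phi, eps)


# Illustrative values: density in kg/m^3, phi and eps in m.
rho_interface = blended_property(0.0, value_air=1.0, value_water=1000.0, eps=0.02)
rho_deep_water = blended_property(0.5, value_air=1.0, value_water=1000.0, eps=0.02)
```

Because the blend depends on the ratio of the level set value to `eps`, the transition band has a uniform physical thickness only if the level set measures distance to the interface, which is why the signed-distance property has to be maintained there.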
Furthermore, both convection and redistancing of the level set do not inherently
conserve mass. Convergence to a mass-conserving solution occurs only with mesh
refinement. Coarse (and not-so-coarse) mesh simulations may suffer from significant
water mass loss. (This depends on the problem setup and boundary conditions. In
the case of liquids sloshing in closed containers, mass loss may be significant. In
problems with inflow and outflow boundaries the effect may not be as pronounced.)
This effect is amplified when the equations are integrated for a long time period,
when seemingly small mass errors for a given time step compound into a large mass
error toward the end of the computation. As a result, an explicit mass correction
procedure is necessary. To ensure mass balance at every time step, after redistancing
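of the level set, we modify the level set function by a global constant, such that the
following equation holds:
$$\int_{\Omega_{n+1}} \rho_{n+1}\, d\Omega - \int_{\Omega_n} \rho_n\, d\Omega + \Delta t_{n+1} \int_{\Gamma_{n+1/2}} \rho_{n+1/2} \left( \mathbf{u}^h_{n+1/2} - \mathbf{v}^h_{n+1/2} \right) \cdot \mathbf{n}_{n+1/2}\, d\Gamma = 0, \qquad (2)$$
where $\mathbf{v}^h$ is the mesh velocity. In Eq. (2), the quantities are subscripted with a temporal
index and $\Delta t_{n+1}$ is the time step size. This is the simplest technique that restores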
mass balance in the simulations. Other versions of mass correction are also possible:
in [75, 85, 86] the authors proposed a total-domain based mass conservation
technique, validated it experimentally in [87], and developed it in the context of
MITICT (with mass conservation for fluid–solid interfaces) in [74]. A chunk-based
(subdomain-based) version of mass conservation was developed in [29, 30].
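The global-constant correction can be illustrated with a one-dimensional toy problem. The sketch below is not the formulation of the cited references: it assumes constant water density (so mass balance becomes volume balance), a closed container (so the boundary-flux term vanishes and the target is simply the previous water volume), a piecewise-linear regularized step, and a bisection search for the constant. All function and variable names are hypothetical.

```python
def water_fraction(phi, eps):
    """Piecewise-linear regularized step (an assumed, simple choice)."""
    return min(1.0, max(0.0, 0.5 + 0.5 * phi / eps))


def water_volume(phi_cells, cell_volume, eps, shift=0.0):
    """Water volume implied by cell-centered level set values shifted by a constant."""
    return cell_volume * sum(water_fraction(p + shift, eps) for p in phi_cells)


def mass_correction_shift(phi_cells, cell_volume, eps, target_volume, iterations=100):
    """Bisection for the constant c such that water_volume(phi + c) matches the target."""
    bound = max(abs(p) for p in phi_cells) + eps
    lo, hi = -bound, bound  # no water at lo, a completely filled domain at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if water_volume(phi_cells, cell_volume, eps, mid) < target_volume:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# 1D closed container on [0, 1]: water occupies x < 0.5, phi is the signed distance.
n, dx, eps = 100, 0.01, 0.03
x = [(i + 0.5) * dx for i in range(n)]
phi_exact = [0.5 - xi for xi in x]
target = water_volume(phi_exact, dx, eps)      # volume that should be preserved
phi_drifted = [p - 0.02 for p in phi_exact]    # mimics accumulated numerical mass loss
c = mass_correction_shift(phi_drifted, dx, eps, target)
phi_corrected = [p + c for p in phi_drifted]
```

In this toy setting the recovered constant undoes the imposed drift. In a problem with inflow and outflow boundaries, the target would also have to account for the net flux through the boundary over the time step, as in Eq. (2).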
3 Examples
Examples in Sects. 3.1–3.4 were computed with the DSD/SST methods, and the
examples in Sects. 3.5–3.8 with the ALE-VMS methods.
Fig. 9 Parachute shape and flow field at an instant during the computation and comparison with
the test data. Here VD , VRH , TB , and TS are the descent speed, horizontal speed, breathing period,
and swinging period
3.1 FSI Analysis of Spacecraft Parachutes
The first example, a parachute computation, serves the purpose of comparing our
computed results to data from drop tests with a base parachute design and gaining
confidence in our parachute FSI model. Figure 9 shows the parachute shape and flow
field at an instant during the computation and the comparison with the test data. With
that confidence, we can do simulation-based design studies [53], such as evaluating
the aerodynamic performance of the parachute as a function of the suspension line
length (see Fig. 10).
Spacecraft parachutes are typically used in clusters of two or three parachutes. The
contact between the canopies of the parachute cluster is a computational challenge
that we have addressed recently (see [53]). Figure 11 shows a cluster of three para-
chutes at three different instants during the FSI computation, with contact between
two of the parachutes.
Spacecraft parachutes are also typically used in multiple stages, starting with a
reefed stage where a cable along the parachute skirt constrains the diameter to be
less than the diameter in the subsequent stage. After a certain period of time during
the descent, the cable is cut and the parachute disreefs (i.e. expands) to the next
stage. Computing the parachute shape at the reefed stage and FSI modeling during
the disreefing involve additional computational challenges created by the increased
geometric complexities and by the rapid changes in the parachute geometry. Figure 12
shows such a disreefing (see [55]).
Fig. 10 A simulation-based parachute design study, where the objective is to evaluate the aerody-
namic performance of the parachute as a function of the suspension line length. See [53] for details
of the study
Fig. 11 A cluster of three parachutes at three instants during the FSI computation, with contact
between two of the parachutes
As an additional computational challenge, the ringsail parachute canopy might,
by design, have some of its panels and sails removed. The purpose is to increase the
aerodynamic performance of the parachute. In FSI computation of parachutes with
such modified geometric porosity, the flow through the windows created by the
removal of the panels and the wider gaps created by the removal of the sails cannot be
accurately modeled with the HMGP and needs to be actually resolved. This challenge
was successfully addressed in the computations reported in [57]. Figure 13 shows a
cluster of three parachutes with modified geometric porosity, at an instant during the
FSI computation.
3.2 Aerodynamic Analysis of Wind Turbines
Computer modeling of wind-turbine aerodynamics is challenging because correct
aerodynamic torque calculation requires correct separation-point calculation, which
requires an accurate flow field, which in turn requires good mesh resolution and
Fig. 12 Parachute disreefing from [55]: side and bottom views
Fig. 13 A cluster of three parachutes with modified geometric porosity, at an instant during the FSI
computation reported in [57]
turbulence model. We describe from [35] computation of the aerodynamics of an
actual wind-turbine rotor with the DSD/SST-SUPS and DSD/SST-VMST methods.
Figure 14 shows time history of the aerodynamic torque generated by a single blade,
as computed with the DSD/SST-SUPS, DSD/SST-VMST, and ALE methods.
Fig. 14 Time history of the aerodynamic torque generated by a single blade. Computed with the
DSD/SST-SUPS (SUPS), DSD/SST-VMST (VMST), and ALE methods. (Plot axes: torque in kNm,
from 0 to 1,200, versus time in s, from 0.1 to 1.2)
Including the tower in the model increases the computational challenge because
of the fast, rotational relative motion between the rotor and tower. We address this
additional challenge in [37] by using NURBS basis functions for the temporal repre-
sentation of the rotor motion, mesh motion and also in remeshing. This is essentially
the same computational technology described in Sect. 2.2 for modeling the aero-
dynamics of flapping wings. We named this ST/NURBS Mesh Update Method
(STNMUM) in [37]. Figure 15 shows, from [37], the vorticity magnitude, computed
with the DSD/SST-VMST method and the STNMUM. In that figure, the color
range from blue to red corresponds to a vorticity range from low to high, and lighter
and darker shades of a color correspond to lower and higher values.
3.3 Patient-Specific FSI Analysis of Cerebral Arteries with Aneurysm
Patient-specific arterial FSI modeling has many challenges. They include calculat-
ing an estimated zero-pressure arterial geometry, specifying the velocity profile at
an inflow with non-circular shape, using variable wall thickness, building layers of
refined fluid mesh near the walls, proper calculation of the wall shear stress (WSS)
and oscillatory shear index (OSI), and properly scaling the flow rate at the inflow. Spe-
cial techniques developed to address these challenges can be found in [50]. Here we
present some computations from [50] for cerebral arteries with aneurysm. Figure 16
shows the lumen obtained from voxel data for three arterial models: Model 1,
Model 2, and Model 3. Figure 17 shows the fluid mechanics mesh for Model 3.
Figure 18 shows the streamlines at the maximum flow rate.
Fig. 15 Vorticity, computed with the DSD/SST-VMST method and the STNMUM (see [37])
Fig. 16 Arterial lumen geometry obtained from voxel data for Model 1, Model 2, and Model 3
Fig. 17 Fluid mechanics mesh for Model 3. Mesh at the fluid–structure interface and inflow plane
Fig. 18 Streamlines for the three models when the volumetric flow rate is maximum
3.4 Aerodynamic Analysis of Flapping Wings of an Actual Locust and an MAV
As a last set of examples from analyses with the ST methods, we present from
[33, 34] computational aerodynamics modeling of flapping wings of an actual locust
and an MAV. The motion and deformation data for the wings is extracted from the
high-speed, multi-camera video recordings of a locust in a wind tunnel at Baylor
College of Medicine (BCM), Houston. The video recording is accomplished by
using a set of tracking points marked on the forewings (FW) and hindwings (HW)
of the locust. The tracking points are seen in Fig. 19. How the wing motion and
deformation data is extracted from the video data and represented using NURBS
basis functions in space and time is described in detail in [33]. Figures 20 and 21
show the wind tunnel photographs and the computational model at eight points in
time. Figure 22 shows how the body and wings compare for the locust and MAV
models, and Fig. 23 shows the length scales involved in the computations with those
models. Figure 24 shows the streamlines for the locust. Figures 25 and 26 show for
the locust the vorticity magnitude during the second flapping cycle. Figures 27 and
28 show for the MAV the vorticity magnitude during the third flapping cycle. In
Figs. 25, 26, 27 and 28, the color range from blue to red corresponds to a vorticity
range from low to high, and lighter and darker shades of a color correspond to lower
and higher values. Figure 29 shows the lift and thrust for the locust and MAV.
3.5 The MARIN Dam Break Problem

The setup of the dam break problem, initially proposed by the Maritime Research
Institute Netherlands (MARIN) [88], is depicted in Fig. 30, and is taken from [12].
The problem consists of a column of water, initially at rest, that collapses under the
Fig. 19 Tracking points in the data set from the BCM wind tunnel
Fig. 20 Comparison of computational model and wind tunnel photographs at first four points in
time. Viewing angles are matched approximately. Wind tunnel photographs are from BCM
action of gravity and impacts a fixed rectangular container. We compute the problem
using two types of the spatial discretization: linear tetrahedral finite elements and
NURBS. The quadratic NURBS mesh is significantly more coarse than the linear
tetrahedral mesh. Free-slip and no-penetration boundary conditions are applied on all
surfaces, including the top of the tank. The problem is run until T = 6 s. Snapshots
comparing the solutions coming from tetrahedral FEM and NURBS computations
are given in Fig. 31. Large-scale features of the solution are very similar in the two
simulations; however, the details of the small-scale features are better represented on
Fig. 21 Comparison of computational model and wind tunnel photographs at last four points in
time. Viewing angles are matched approximately. Wind tunnel photographs are from BCM
Fig. 22 Locust body and wings (left) and MAV body and wings (right)
Fig. 23 Length scales in the computations with the locust (left) and MAV (right) models.
(Labeled lengths in both panels: 90 mm and 80 mm)
a much finer tetrahedral grid, as expected. Time series of the pressure at different
locations on the obstacle are shown in Fig. 32. The first wave hits the block at
approximately t = 0.5 s, and the second, much smaller wave arrives at the block
Fig. 24 Locust. Streamlines colored by velocity magnitude in m/s at approximately 25 % (left) and
50 % (right) of the second flapping cycle
Fig. 25 Locust. Vorticity for the first 4 of 8 equally-spaced points during the 2nd flapping cycle
Fig. 26 Locust. Vorticity for the last 4 of 8 equally-spaced points during the 2nd flapping cycle
at about t = 5 s. The wave impact times and pressure peaks are predicted very
well with both linear elements and quadratic NURBS. Given that the NURBS mesh
has about half of the degrees-of-freedom of the linear FEM mesh in each Cartesian
Fig. 27 MAV. Vorticity for the first 4 of 8 equally-spaced points during the 2nd flapping cycle
Fig. 28 MAV. Vorticity for the last 4 of 8 equally-spaced points during the 2nd flapping cycle
Fig. 29 Total lift (left) and thrust (right) generated over one cycle. (Plot axes: force in mN,
from -10 to 25, versus t/T, from 0 to 1; curves for the locust and the MAV)
direction, the accuracy of NURBS results is remarkable; linear FEM is not capable
of attaining such accuracy at this level of resolution (see [12]), and requires a finer
mesh for comparable accuracy.
Fig. 30 The MARIN dam break problem. Geometry definition. The computational domain is a
rectangular box with dimensions 3.22 m × 1 m × 1 m. The object has dimensions 0.2 m × 0.2 m ×
0.4 m and is placed at the back end of the tank. The water column, initially at rest, has dimensions
1 m × 1 m × 0.55 m. The locations where pressure and water height are sampled are also depicted
Fig. 31 The MARIN dam break problem. Snapshots of the free surface solution on the tetrahedral
(top) and NURBS (bottom) meshes at t = 1.0, 1.5, 2.0, 4.0, and 5.0 s
3.6 Fridsma Planing Hull
We present results for the Fridsma planing hull [89]. We give a detailed definition
of the hull geometry, present a mesh refinement study, and assess the effect of hull
speed on the drag force and trim angle. Only flat-water (i.e., no waves), constant
hull speed cases are considered. The computational results presented are from [16].
The Fridsma hull geometry definition is given in Fig. 33. The hull is comprised of
idealized shapes: a bow consisting of four ruled surfaces followed by a wedge-shaped
straight section with a constant deadrise angle of 20°. Analytical expressions for
the bounding curves for the ruled surfaces are provided in the figure. The relevant
global geometry parameters are, Length (L): 114.3 cm, Beam (b): 22.86 cm, Height:
14.2875 cm, and Deadrise: 20°. The hull mass, center of gravity, and moment of
inertia are, Mass (m): 7.257 kg, $x_{cg}$: 80.01 cm, $z_{cg}$: 6.721 cm, Gyradius (r): 25 % L,
Fig. 32 The MARIN dam break problem. Time history of the pressure at four locations on the
obstacle. Experimental data is from [88]. (Plot axes: pressure in Pa versus time in s, from 0 to 6;
legend entries across the panels: Experiment, Tet 518379, 64x32x32)
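Fig. 33 Fridsma hull. Geometry definition. (Figure labels: 114.3 cm, 22.86 cm, 10.3782 cm,
14.2875 cm, 20°, "x 22.86cm", and the bounding curves
$(x/22.86\,\mathrm{cm})^2 + (z/10.3782\,\mathrm{cm})^2 = 1$,
$(x/22.86\,\mathrm{cm})^2 + (z/14.2875\,\mathrm{cm})^2 = 1$,
$(x/22.86\,\mathrm{cm})^2 + (y/11.43\,\mathrm{cm})^2 = 1$)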
Fig. 34 Fridsma hull. Coarsest mesh with water and air domains shown
and $I_{yy} = m r^2$: 0.6165 kg m$^2$. This data pertains to the center of gravity located at
70 % of the hull length, measured from the aft of the hull [89].
We perform a mesh convergence study at Froude number Fr = 0.8950.$^1$ The
Froude number is defined as $\mathrm{Fr} = u/\sqrt{gL}$, where u is the hull speed, g is the magnitude
of gravitational acceleration, and L is the hull length. At this Froude number,
according to [89], the trim angle reaches its maximum. The convergence study was
performed on a sequence of four meshes. The coarsest mesh is shown in Fig. 34. The
figure also shows the water and air subdomains in the undisturbed configuration. The
mesh is dense near the hull surface and in the wake. The hull is fixed in the direction of
travel, and the corresponding velocity is set at the inflow of the computational domain
together with the level set function. The hull is allowed to pitch and to displace in the
vertical direction. At the outflow a hydrostatic pressure profile is imposed as a traction
boundary condition. On the side, bottom, and top boundaries of the computational
domain free-slip boundary conditions are imposed. Figure 35 shows the deformed
free surface colored by the flow speed relative to the hull speed. The hull rises up and
develops a trim angle such that the bow is higher than the aft. Note the presence of the
rooster tail feature, which is typical for planing hulls. Also note that the rooster tail
feature goes all the way to the outflow boundary, which suggests that a longer-domain
simulation may be needed in the future. Figure 36 shows convergence of the drag
force and trim angle. The drag force is non-dimensionalized by the gravitational force.
From the results we see that the drag force converges quickly to the experimental
value. On the other hand, the trim angle is underestimated by 12 % with respect to
the experimental data, and does not improve with mesh refinement. Possible causes
include the choice of downstream, lateral, and bottom boundary locations. Error in the
experimental data is also possible.
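The Froude-number definition and the non-dimensionalization of the drag force by the gravitational force can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names and the default g = 9.81 m/s^2 are assumptions, and the drag value in the usage part is hypothetical. The hull length and mass are the values quoted for the Fridsma hull.

```python
import math

G = 9.81  # assumed gravitational acceleration, m/s^2


def froude_number(u, length, g=G):
    """Fr = u / sqrt(g L) for speed u (m/s) and hull length L (m)."""
    return u / math.sqrt(g * length)


def hull_speed(fr, length, g=G):
    """Invert the definition: u = Fr * sqrt(g L)."""
    return fr * math.sqrt(g * length)


def froude_from_slr(slr, g=G):
    """Convert a speed-length ratio u / sqrt(L), given in SI units, to Fr."""
    return slr / math.sqrt(g)


def nondimensional_drag(drag, mass, g=G):
    """Drag force R divided by the gravitational force m g."""
    return drag / (mass * g)


if __name__ == "__main__":
    L_HULL = 1.143  # hull length from the text, m
    M_HULL = 7.257  # hull mass from the text, kg
    for fr in (0.0, 0.5925, 0.8950, 1.190):
        print(f"Fr = {fr:6.4f} -> u = {hull_speed(fr, L_HULL):.3f} m/s")
    # Hypothetical drag force of 10 N, used only to show the scaling.
    print(f"R/mg = {nondimensional_drag(10.0, M_HULL):.3f}")
```

Because Fr is the speed-length ratio divided by the square root of g, the two ways of reporting the hull speed differ only by a constant factor once consistent units are used.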
We also examine the effect of the hull speed on the drag force and trim angle.
In addition to the Fr = 0.8950 case, we consider Fr = 0, Fr = 0.5925,
$^1$ In Fridsma [89] the results are reported in terms of the Speed-Length Ratio (SLR), $u/\sqrt{L}$, which
is a dimensional quantity. Here we chose to report the results in terms of the Froude number.
Fig. 35 Fridsma hull. Free surface colored by the flow speed relative to the hull speed in m/s
Fig. 36 Fridsma hull. Convergence of the drag force (left) and trim angle (right) with mesh
refinement and comparison with experimental results. The panels plot R/mg and trim angle (deg)
against $N^{1/3}$; the computed curves are labeled ALEVMS
and Fr = 1.190 cases. The simulations are started impulsively in the configuration
depicted in Fig. 34. In the case of Fr = 0, although the hull speed is zero, a non-zero
trim angle develops such that the hull is in equilibrium with the hydrostatic forces. In
all other cases, there is a rapid transient followed by a largely steady-state response.
The steady-state drag force and trim angle are plotted as a function of Froude number,
and compared to the experimental results in Fig. 37. Accurate prediction of the drag
force is attained in all cases. The trim angle is predicted very well for the first two
Froude number cases, and a deviation from the experiment by 10–12 % is seen in the
remaining two cases.
3.7 DTMB 5415 Navy Combatant in Head Waves
Here we present the simulation of the DTMB 5415 Navy combatant at lab scale
from [78]. This ship has been investigated by other researchers, both experimentally
and computationally (see, e.g., [90–92]). The length of the ship hull is 5.72 m. The
ship mass, center of gravity and inertia tensor are computed by meshing the ship
Fig. 37 Fridsma hull. Steady-state drag force (left) and trim angle (right) as a function of Froude
number. Comparison with experimental results. The panels plot R/mg and trim angle (deg)
against Fr; the computed curves are labeled ALEVMS
interior and performing a direct computation. The total ship volume is 1.366 m$^3$.
The ship mass is equal to 532.3 kg. It is obtained by multiplying the volume of the
ship below the water line by the constant water density. The center of gravity and the
inertia tensor are computed assuming the ship's effective density (i.e., the ship mass
divided by its total volume), which results in $X_0 = (2.761, 0, 0.280)$ m and

$$
J_0 = \begin{pmatrix}
7.256 \times 10^{-2} & 2.69 \times 10^{-7} & 5.35 \times 10^{-2} \\
2.69 \times 10^{-7} & 2.89 & 2.44 \times 10^{-8} \\
5.35 \times 10^{-2} & 2.44 \times 10^{-8} & 2.91
\end{pmatrix} \ \mathrm{kg\,m^2}. \tag{3}
$$
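The direct computation of mass properties from a meshed interior can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure used for the ship: it assumes a tetrahedral mesh with uniform density, uses a standard 4-point quadrature rule that is exact for quadratic integrands, and checks itself on a unit cube. All function names are hypothetical, and the total volume is read here as 1.366 m^3, which is an assumption about the intended decimal.

```python
import itertools
import numpy as np

# Barycentric coordinates of a 4-point rule on a tetrahedron
# (equal weights, exact for polynomials up to degree 2).
_A, _B = 0.5854101966249685, 0.1381966011250105
BARY = np.full((4, 4), _B) + (_A - _B) * np.eye(4)


def effective_density(mass, total_volume):
    """Uniform density that reproduces the given mass over the full volume."""
    return mass / total_volume


def mass_properties(nodes, tets, density):
    """Return mass, center of gravity, and inertia tensor about the center of gravity."""
    nodes = np.asarray(nodes, dtype=float)
    mass = 0.0
    first = np.zeros(3)         # integral of rho * x
    second = np.zeros((3, 3))   # integral of rho * x x^T about the origin
    for tet in tets:
        v = nodes[list(tet)]                          # 4 vertices, shape (4, 3)
        vol = abs(np.linalg.det(v[1:] - v[0])) / 6.0  # tetrahedron volume
        pts = BARY @ v                                # quadrature points, shape (4, 3)
        w = density * vol / 4.0                       # equal quadrature weights
        mass += density * vol
        first += w * pts.sum(axis=0)
        second += w * (pts.T @ pts)
    cg = first / mass
    second_cg = second - mass * np.outer(cg, cg)      # shift to the center of gravity
    inertia = np.trace(second_cg) * np.eye(3) - second_cg
    return mass, cg, inertia


def unit_cube_mesh():
    """Unit cube split into six tetrahedra (one per ordering of the axes)."""
    nodes = np.array(list(itertools.product((0.0, 1.0), repeat=3)))
    tets = []
    for perm in itertools.permutations(range(3)):
        corner = [0, 0, 0]
        tet = [0]
        for axis in perm:
            corner[axis] = 1
            tet.append(4 * corner[0] + 2 * corner[1] + corner[2])
        tets.append(tet)
    return nodes, tets


if __name__ == "__main__":
    rho_eff = effective_density(532.3, 1.366)  # mass from the text, assumed volume
    nodes, tets = unit_cube_mesh()
    m, cg, J = mass_properties(nodes, tets, rho_eff)
    print("mass:", m, "cg:", cg)
    print("inertia about cg:\n", J)
```

For a real hull the node and element arrays would come from the interior mesh; the cube is used only because its mass properties are known in closed form.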
We compute the ship in head waves, that is, waves that travel in the direction
opposite to that of the ship. We assume that the ship speed is $U_{in}$ = 1.873 m/s, which
gives Fr = 0.25 based on the ship length. The ship was allowed to move vertically, to
pitch and to roll, while the rest of the rigid body degrees-of-freedom were constrained.
We make use of the linear Airy waves [93] to prescribe inlet boundary conditions.
The Airy waves may be derived using potential theory, and are specified as follows:
Given the wave amplitude, wavelength, and water depth, $A_w = 0.2$ m, $L_w = 5.72$ m
and $h = 3.49$ m, respectively, we compute $k = 2\pi/L_w$, the angular wavenumber,
$\omega = \sqrt{gk\tanh(kh)}$, the angular frequency, and $A_v = \omega A_w/\sinh(kh)$, the velocity amplitude.
With these definitions, the Airy waves are given by

$$
\begin{aligned}
u &= A_v \cosh(kz) \cos(kx - \omega t) + U_{in} && (4) \\
v &= 0 && (5) \\
w &= A_v \sinh(kz) \sin(kx - \omega t) && (6) \\
\phi &= A_w \cos(kx - \omega t) + h - z, && (7)
\end{aligned}
$$

where $(u, v, w)^T$ is the fluid velocity vector and the air-water interface in the hydrostatic
configuration is assumed to be located at z = 0.
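The inlet data of Eqs. (4)-(7) can be evaluated with a short Python sketch. It is an illustration only: the function names and g = 9.81 m/s^2 are assumptions, the symbol omega denotes the angular frequency obtained from the dispersion relation, and the formulas are taken literally, so z is treated as measured upward from the bottom of the domain (the vertical velocity then vanishes at z = 0).

```python
import math

G = 9.81  # assumed gravitational acceleration, m/s^2


def airy_parameters(a_w, l_w, h, g=G):
    """Return wavenumber k, angular frequency omega, and velocity amplitude A_v."""
    k = 2.0 * math.pi / l_w
    omega = math.sqrt(g * k * math.tanh(k * h))
    a_v = omega * a_w / math.sinh(k * h)
    return k, omega, a_v


def airy_inlet(x, z, t, a_w, l_w, h, u_in, g=G):
    """Evaluate velocity (u, v, w) and the interface function phi of Eqs. (4)-(7)."""
    k, omega, a_v = airy_parameters(a_w, l_w, h, g)
    phase = k * x - omega * t
    u = a_v * math.cosh(k * z) * math.cos(phase) + u_in
    v = 0.0
    w = a_v * math.sinh(k * z) * math.sin(phase)
    phi = a_w * math.cos(phase) + h - z
    return u, v, w, phi


if __name__ == "__main__":
    # Wave amplitude, wavelength, water depth, and ship speed quoted in the text.
    A_W, L_W, H, U_IN = 0.2, 5.72, 3.49, 1.873
    k, omega, a_v = airy_parameters(A_W, L_W, H)
    print(f"k = {k:.4f} 1/m, omega = {omega:.4f} rad/s, A_v = {a_v:.5f} m/s")
    print(airy_inlet(x=0.0, z=H, t=0.0, a_w=A_W, l_w=L_W, h=H, u_in=U_IN))
```

In a solver these values would be sampled at the inflow nodes at every time step; here they are simply printed for one point.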
Fig. 38 DTMB 5415 in head waves at t = 9.0 and 9.5 s. Water surface colored by the fluid speed
Fig. 39 Geometric model of the scaled Hardanger bridge deck section and zoom on the geometric
details of the bridge deck section model. The guide-vane-like vortex mitigation devices are located
on the underside of the deck and are shown in light red color
Figure 38 shows the ship negotiating high-amplitude waves. The right part of
Fig. 38 shows the ship partially submerged in water, which is a result of the oncoming
wave hitting the bow of the ship. In this case, near the bow, the free surface experiences
topological changes, which necessitates the use of an interface-capturing method to
handle the air-water interface for this class of problems.
3.8 Vortex-Induced Vibrations of a Bridge Deck
Here we present an example of a fluid–object interaction simulation using a scaled
model of the Hardanger bridge deck section [94]. The bridge deck geometric model
is shown in Fig. 39. This study was initiated to examine the effect of the guide-vane-
like vortex mitigation devices (VMDs) installed on the underside of the bridge deck
(see Fig. 39 for a zoom on the guide vanes) on the resulting wind aerodynamics and
structural response of the bridge. The height h, width b, and length l of the deck
scaled model are 0.0666, 0.366 and 1.7 m, respectively. The bridge deck section
computational domain is shown in Fig. 40. The locations of the top, bottom, and
lateral walls are coincident with those of the wind tunnel where the experiments took
Fig. 40 Bridge deck section computational domain and boundary conditions
Fig. 41 Aerodynamics mesh of the bridge deck section and zoom on the boundary layer mesh of
the top deck and rails
place. Note that there is only a 2.5 cm gap between the tunnel wall and the side of
the bridge deck section. All the geometric details of the model-scale bridge deck are
modeled in the computations, including the hand and bicycle rails on the top of the
deck, and the maintenance rails in the front and rear of the deck. Computations are
performed for 2.6 and 6.0 m/s wind speed, with and without the VMDs. Figure 41
shows the mesh resolution used in this study. Boundary-layer prismatic elements
are used near all solid surfaces, and tetrahedral elements are used elsewhere in the
computational domain. The mesh is refined near the deck and downstream of it to
better capture the wake turbulence. The uniform wind speed is prescribed at the
inflow boundary, the traction vector is set to zero at the outflow boundary, and the
slip condition is set on the top, bottom, and lateral boundaries of the computational
domain (see Fig. 40). The no-slip boundary condition on the bridge deck surface is
enforced weakly. The bridge deck is modeled as a rigid object. For the bridge deck
mass, moment of inertia tensor, and stiffness and damping matrices the readers are
Fig. 42 Time history of the drag and lift coefficients for cases with and without VMDs for 2.6 m/s
wind speed. Time-averaged experimental measurements from [94] are plotted for comparison.
The panels plot $C_D$ and $C_L$ against time (s), with computed and experimental curves for the
cases with and without VMD
Fig. 43 Time history of the pitching angle (in degrees) with and without VMDs for 6.0 m/s wind speed
referred to [94]. The deck is allowed to displace vertically, and undergo pitching and
rolling motions.
Figure 42 shows drag and lift coefficients for cases with and without the VMDs.
The drag and lift coefficients are defined as $C_D = \frac{F_D}{\frac{1}{2} \rho U^2 h l}$ and $C_L = \frac{F_L}{\frac{1}{2} \rho U^2 b l}$. Results
are compared with the experimental measurements from [94] and reasonable agree-
ment is achieved. Figure 43 shows the time history of angular displacement of the
bridge deck corresponding to the pitching motion. The figure clearly shows that with
the added VMDs the bridge deck experiences smaller rotational motions than without,
which was also observed in the wind tunnel tests. To better understand the underlying
mechanics, the differences in the air flow with and without VMDs are shown on a
planar cut of the bridge deck in Fig. 44. The guide vanes keep the flow attached to
the underside of the deck, which delays flow separation and precludes formation of
large-scale vortical structures that drive the bridge deck response. Figure 45 shows
the 3D view of the deck with guide vanes, where air speed contours at an instant are
Fig. 44 Instantaneous air speed contours on a planar cut near the bridge deck for 2.6 m/s wind
speed. Left Case without VMDs. Right Case with VMDs
Fig. 45 Instantaneous air speed contours on a set of cuts along the deck length for 2.6 m/s wind
speed. Top and bottom deck views are shown
plotted on a set of cuts along the deck length. Top and bottom views are shown. The
flow is turbulent and 3D, which underscores the importance of 3D aerodynamics
modeling for this class of problems.
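The drag and lift coefficient definitions used for Fig. 42, together with a time average for comparison against time-averaged measurements, can be sketched as follows. This is a minimal illustration and not the authors' post-processing: the function names, the air density of 1.225 kg/m^3, and the sample force values are assumptions. The deck dimensions are those quoted for the scaled model.

```python
def drag_coefficient(f_drag, rho, wind_speed, height, length):
    """C_D = F_D / (0.5 * rho * U^2 * h * l)."""
    return f_drag / (0.5 * rho * wind_speed ** 2 * height * length)


def lift_coefficient(f_lift, rho, wind_speed, width, length):
    """C_L = F_L / (0.5 * rho * U^2 * b * l)."""
    return f_lift / (0.5 * rho * wind_speed ** 2 * width * length)


def time_average(times, values, t_start=0.0):
    """Trapezoidal time average of a sampled signal over samples with t >= t_start."""
    pts = [(t, v) for t, v in zip(times, values) if t >= t_start]
    if len(pts) < 2:
        raise ValueError("need at least two samples at or after t_start")
    area = sum(0.5 * (v0 + v1) * (t1 - t0)
               for (t0, v0), (t1, v1) in zip(pts, pts[1:]))
    return area / (pts[-1][0] - pts[0][0])


if __name__ == "__main__":
    RHO_AIR = 1.225                        # assumed air density, kg/m^3
    H_DECK, B_DECK, L_DECK = 0.0666, 0.366, 1.7   # deck dimensions from the text, m
    U_WIND = 2.6                           # wind speed from the text, m/s
    # Hypothetical force samples (N) at a few time levels (s).
    t = [0.0, 5.0, 10.0, 15.0, 20.0]
    f_d = [0.40, 0.45, 0.43, 0.44, 0.42]
    cd = [drag_coefficient(f, RHO_AIR, U_WIND, H_DECK, L_DECK) for f in f_d]
    print("time-averaged C_D after t = 5 s:", time_average(t, cd, t_start=5.0))
```

Discarding the initial transient through `t_start` is a design choice of this sketch; the averaging window used in the reported results is not stated in the text.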
Bringing solution and analysis to specific classes of moving-interface problems with
a common computational technology need motivated the development of our core
ALE-VMS and ST methods, their recent versions, and the special ALE-VMS and
ST techniques targeting specific classes of problems, such as parachute FSI, aero-
dynamics of flapping wings, wind-turbine aerodynamics and FSI, and free-surface
flow and FOI for ship hydrodynamics. We presented an overview of how the core
and special ALE-VMS and ST techniques are used in computational engineering
analysis and design. We included an overview of three of the special ALE-VMS and
ST techniques, just as examples of the many special techniques that complement the
core methods. We presented examples of different classes of challenging problems
solved: spacecraft parachute FSI, ship hydrodynamics, wind-turbine aerodynamics,
patient-specific arterial FSI, aerodynamics of flapping wings of an actual locust and
an MAV, and vortex-induced vibrations of a bridge deck section. In some of the
problems, we included a comparison with the experimental data, and the compari-
son was always favorable. The examples show that in a diverse set of engineering
applications, with the scope and power afforded by the core and special ALE-VMS
and ST techniques, we can provide reliable analysis and support the design process.
Acknowledgments This work was supported in part by NASA JSC Grant NNX13AD87G. Method
development and evaluation components of the work on aerodynamics of flapping wings and
wind-turbine aerodynamics were supported in part by ARO Grant W911NF-12-1-0162 (TT) and
Rice–Waseda research agreement (KT). The development and application of FOI techniques for bridge
aerodynamics was supported by the program for preferred research areas at the Faculty of Engineer-
ing Science and Technology, the Norwegian University of Science and Technology. The research
work on free-surface FOI was supported by the ARO Grant W911NF-11-1-0083 (YB). We wish
to thank the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, the
San Diego Supercomputer Center (SDSC) at the University of California, San Diego, and the Nor-
wegian Metacenter for Computational Science (Notur) for providing some of the HPC resources
used. We thank Professor Fabrizio Gabbiani and Dr. Raymond Chan (Baylor College of Medicine)
for providing us the digital data extracted from the wind-tunnel videos of the locust.
References
meters. Int J Numer Methods Fluids 43:555–575. doi:10.1002/fld.505
3. Tezduyar TE, Sathe S (2007) Modeling of fluid–structure interactions with the space–time
finite elements: solution techniques. Int J Numer Methods Fluids 54:855–900. doi:10.1002/
fld.1430
4. Takizawa K, Tezduyar TE (2011) Multiscale space–time fluid–structure interaction techniques.
5. Takizawa K, Tezduyar TE (2012) Space–time fluid–structure interaction methods. Math Models
Methods Appl Sci 22:1230001. doi:10.1142/S0218202512300013
6. Hughes TJR, Liu WK, Zimmermann TK (1981) Lagrangian-Eulerian finite element formula-
tion for incompressible viscous flows. Comput Methods Appl Mech Eng 29:329–349
7. Ohayon R (2001) Reduced symmetric models for modal analysis of internal structural-acoustic
and hydroelastic-sloshing systems. Comput Methods Appl Mech Eng 190:3009–3019
8. van Brummelen EH, de Borst R (2005) On the nonnormality of subiteration for a fluid–structure
interaction problem. SIAM J Sci Comput 27:599–621
9. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid–structure interaction:
10. Bazilevs Y, Hsu M-C, Akkerman I, Wright S, Takizawa K, Henicke B, Spielman T, Tezduyar
TE (2011) 3D simulation of wind turbine rotors at full scale. Part I: geometry modeling and
aerodynamics. Int J Numer Methods Fluids 65:207–235. doi:10.1002/fld.2400
turbine rotors at full scale. Part II: fluid–structure interaction modeling with composite blades.
Int J Numer Methods Fluids 65:236–253
12. Akkerman I, Bazilevs Y, Kees CE, Farthing MW (2011) Isogeometric analysis of free-surface
flow. J Comput Phys 230:4137–4152
13. Hsu M-C, Bazilevs Y (2011) Blood vessel tissue prestress modeling for vascular fluid–structure
interaction simulations. Finite Elem Anal Des 47:593–599
14. Nagaoka S, Nakabayashi Y, Yagawa G, Kim YJ (2011) Accurate fluid–structure interaction
computations using elements without mid-side nodes. Comput Mech 48:269–276. doi:10.1007/
s00466-011-0620-7
computer modeling of wind-turbine rotor aerodynamics and fluid–structure interaction. Math
Models Methods Appl Sci 22:1230002. doi:10.1142/S0218202512300025
16. Akkerman I, Dunaway J, Kvandal J, Spinks J, Bazilevs Y (2012) Toward free-surface modeling
of planing vessels: simulation of the Fridsma hull using ALE-VMS. Comput Mech 50:719–727
17. Minami S, Kawai H, Yoshimura S (2012) Parallel BDD-based monolithic approach for acoustic
fluid–structure interaction. Comput Mech 50:707–718
18. Miras T, Schotte J-S, Ohayon R (2012) Energy approach for static and linearized dynamic
studies of elastic structures containing incompressible liquids with capillarity: a theoretical
formulation. Comput Mech 50:729–741
19. van Opstal TM, van Brummelen EH, de Borst R, Lewis MR (2012) A finite-element/boundary-
element method for large-displacement fluid–structure interaction. Comput Mech 50:779–788
20. Yao JY, Liu GR, Narmoneva DA, Hinton RB, Zhang Z-Q (2012) Immersed smoothed finite ele-
ment method for fluid–structure interaction simulation of aortic valves. Comput Mech
50:789–804
21. Larese A, Rossi R, Oñate E, Idelsohn SR (2012) A coupled PFEM–Eulerian approach for the
solution of porous FSI problems. Comput Mech 50:805–819
22. Bazilevs Y, Takizawa K, Tezduyar TE (2013) Computational fluid–structure interaction: meth-
ods and applications. Wiley, Chichester
23. Korobenko A, Hsu M-C, Akkerman I, Tippmann J, Bazilevs Y (2013) Structural mechanics
modeling and FSI simulation of wind turbines. Math Models Methods Appl Sci 23:249–272
24. Yao JY, Liu GR, Qian D, Chen CL, Xu GX (2013) A moving-mesh gradient smoothing method
for compressible CFD problems. Math Models Methods Appl Sci 23:273–305
25. Kamran K, Rossi R, Oñate E, Idelsohn SR (2013) A compressible Lagrangian framework for
modeling the fluid–structure interaction in the underwater implosion of an aluminum cylinder.
Math Models Methods Appl Sci 23:339–367
26. Hsu M-C, Akkerman I, Bazilevs Y (2013) Finite element simulation of wind turbine aerody-
namics: validation study using NREL phase VI experiment. Wind Energy. doi:10.1002/we.
1599
of 3D flows. Computer 26:27–36. doi:10.1109/2.237441
28. Tezduyar T, Aliabadi S, Behr M, Johnson A, Kalro V, Litke M (1996) Flow simulation and
high performance computing. Comput Mech 18:397–412. doi:10.1007/BF00350249
29. Tezduyar TE (2001) Finite element methods for flow problems with moving boundaries and
interfaces. Arch Comput Methods Eng 8:83–130. doi:10.1007/BF02897870
30. Akin JE, Tezduyar TE, Ungor M (2007) Computation of flow problems with the mixed
interface-tracking/interface-capturing technique (MITICT). Comput Fluids 36:2–11. doi:10.
1016/j.compfluid.2005.07.008
31. Mittal S, Tezduyar TE (1995) Parallel finite element simulation of 3D incompressible
flows–fluid–structure interactions. Int J Numer Methods Fluids 21:933–953. doi:10.1002/
fld.1650211011
32. Takizawa K, Henicke B, Puntel A, Spielman T, Tezduyar TE (2012) Space–time computational
4005073
33. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2012) Space–time techniques for
computational aerodynamics modeling of flapping wings of an actual locust. Comput Mech
50:743–760. doi:10.1007/s00466-012-0759-x
34. Takizawa K, Kostov N, Puntel A, Henicke B, Tezduyar TE (2012) Space–time computational
analysis of bio-inspired flapping-wing aerodynamics of a micro aerial vehicle. Comput Mech
50:761–778. doi:10.1007/s00466-012-0758-y
35. Takizawa K, Henicke B, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Stabilized space–time
computation of wind-turbine rotor aerodynamics. Comput Mech 48:333–344. doi:10.1007/
s00466-011-0589-2
36. Takizawa K, Henicke B, Montes D, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Numerical-
performance studies for the stabilized space–time computation of wind-turbine rotor aerody-
namics. Comput Mech 48:647–657. doi:10.1007/s00466-011-0614-5
37. Takizawa K, Tezduyar TE, McIntyre S, Kostov N, Kolesar R, Habluetzel C (2014) Space–time
VMS computation of wind-turbine rotor and tower aerodynamics. Comput Mech 53:1–15.
doi:10.1007/s00466-013-0888-x
38. Takase S, Kashiyama K, Tanaka S, Tezduyar TE (2011) Space–time SUPG finite element com-
putation of shallow-water flows with moving shorelines. Comput Mech 48:293–306. doi:10.
1007/s00466-011-0618-1
39. Kalro V, Tezduyar TE (2000) A parallel 3D computational method for fluid–structure inter-
actions in parachute systems. Comput Methods Appl Mech Eng 190:321–332. doi:10.1016/
S0045-7825(00)00204-8
40. Tezduyar TE, Sathe S, Keedy R, Stein K (2006) Space–time finite element techniques for
computation of fluid–structure interactions. Comput Methods Appl Mech Eng 195:2002–2027.
doi:10.1016/j.cma.2004.09.014
41. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2006) Computer modeling of car-
diovascular fluid–structure interactions with the Deforming-Spatial-Domain/Stabilized
Space–Time formulation. Comput Methods Appl Mech Eng 195:1885–1895. doi:10.1016/j.cma.2005.
05.050
42. Tezduyar TE, Sathe S, Cragin T, Nanna B, Conklin BS, Pausewang J, Schwaab M (2007)
Modeling of fluid–structure interactions with the space–time finite elements: arterial fluid
mechanics. Int J Numer Methods Fluids 54:901–922. doi:10.1002/fld.1443
43. Tezduyar TE, Sathe S, Pausewang J, Schwaab M, Christopher J, Crabtree J (2008) Interface
projection techniques for fluid–structure interaction modeling with moving-mesh methods.
Comput Mech 43:39–49. doi:10.1007/s00466-008-0261-7
44. Tezduyar TE, Sathe S, Schwaab M, Pausewang J, Christopher J, Crabtree J (2008)
Fluid–structure interaction modeling of ringsail parachutes. Comput Mech 43:133–142. doi:10.1007/
s00466-008-0260-8
45. Takizawa K, Christopher J, Tezduyar TE, Sathe S (2010) Space–time finite element computation
of arterial fluid–structure interactions with patient-specific data. Int J Numer Methods Biomed
Eng 26:101–116. doi:10.1002/cnm.1241
46. Takizawa K, Moorman C, Wright S, Christopher J, Tezduyar TE (2010) Wall shear stress
calculations in space–time finite element computation of arterial fluid–structure interactions.
Comput Mech 46:31–41. doi:10.1007/s00466-009-0425-0
47. Takizawa K, Moorman C, Wright S, Spielman T, Tezduyar TE (2011) Fluid–structure inter-
action modeling and performance analysis of the Orion spacecraft parachutes. Int J Numer
Methods Fluids 65:271–285. doi:10.1002/fld.2348
48. Takizawa K, Wright S, Moorman C, Tezduyar TE (2011) Fluid–structure interaction modeling
of parachute clusters. Int J Numer Methods Fluids 65:286–307. doi:10.1002/fld.2359
49. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2011) Influencing factors in image-
based fluid–structure interaction computation of cerebral aneurysms. Int J Numer Methods
Fluids 65:324–340. doi:10.1002/fld.2448
50. Tezduyar TE, Takizawa K, Brummer T, Chen PR (2011) Space–time fluid–structure interaction
modeling of patient-specific cerebral aneurysms. Int J Numer Methods Biomed Eng
27:1665–1710. doi:10.1002/cnm.1433
51. Takizawa K, Spielman T, Tezduyar TE (2011) Space–time FSI modeling and dynamical analy-
sis of spacecraft parachutes and parachute clusters. Comput Mech 48:345–364. doi:10.1007/
s00466-011-0590-9
011-0619-0
53. Takizawa K, Tezduyar TE (2012) Computational methods for parachute fluid–structure
interactions. Arch Comput Methods Eng 19:125–169. doi:10.1007/s11831-012-9070-4
54. Takizawa K, Bazilevs Y, Tezduyar TE (2012) Space–time and ALE-VMS techniques for
patient-specific cardiovascular fluid–structure interaction modeling. Arch Comput Methods
Eng 19:171–225. doi:10.1007/s11831-012-9071-3
55. Takizawa K, Fritze M, Montes D, Spielman T, Tezduyar TE (2012) Fluid–structure interaction
modeling of ringsail parachutes with disreefing and modified geometric porosity. Comput Mech
50:835–854. doi:10.1007/s00466-012-0761-3
56. Takizawa K, Montes D, Fritze M, McIntyre S, Boben J, Tezduyar TE (2013) Methods for FSI
modeling of spacecraft parachute dynamics and cover separation. Math Models Methods Appl
Sci 23:307–338. doi:10.1142/S0218202513400058
57. Takizawa K, Tezduyar TE, Boben J, Kostov N, Boswell C, Buscher A (2013) Fluid–structure
interaction modeling of clusters of spacecraft parachutes with modified geometric porosity.
Comput Mech 52:1351–1364. doi:10.1007/s00466-013-0880-5
58. Takizawa K, Tezduyar TE, Buscher A, Asada S (2013) Space–time interface-tracking with
topology change (ST-TC). Comput Mech. doi:10.1007/s00466-013-0935-7
59. Brooks AN, Hughes TJR (1982) Streamline upwind/Petrov-Galerkin formulations for convec-
tion dominated flows with particular emphasis on the incompressible Navier-Stokes equations.
60. Hughes TJR (1995) Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann formu-
lation, subgrid scale models, bubbles, and the origins of stabilized methods. Comput Methods
63. Bazilevs Y, Akkerman I (2010) Large eddy simulation of turbulent Taylor–Couette flow using
229:3402–3414
64. Bazilevs Y, Calo VM, Zhang Y, Hughes TJR (2006) Isogeometric fluid–structure interaction
65. Akkerman I, Bazilevs Y, Calo VM, Hughes TJR, Hulshoff S (2008) The role of continuity in
residual-based variational multiscale modeling of turbulence. Comput Mech 41:371–378
66. Bazilevs Y, Gohean JR, Hughes TJR, Moser RD, Zhang Y (2009) Patient-specific isogeometric
fluid-structure interaction analysis of thoracic aortic blood flow due to implantation of the Jarvik
2000 left ventricular assist device. Comput Methods Appl Mech Eng 198:3534–3550
67. Bazilevs Y, Hsu M-C, Kiendl J, Benson DJ (2012) A computational procedure for pre-bending
of wind turbine blades. Int J Numer Methods Eng 89:323–336
68. Korobenko A, Hsu M-C, Akkerman I, Bazilevs Y (2013) Aerodynamic simulation of vertical-
axis wind turbines. J Appl Mech. doi:10.1115/1.4024415
69. Bazilevs Y, Hughes TJR (2007) Weak imposition of Dirichlet boundary conditions in fluid
mechanics. Comput Fluids 36:12–26
70. Bazilevs Y, Michler C, Calo VM, Hughes TJR (2010) Isogeometric variational multiscale mod-
eling of wall-bounded turbulent flows with weakly enforced boundary conditions on unstretched
meshes. Comput Methods Appl Mech Eng 199:780–790
71. Hsu M-C, Akkerman I, Bazilevs Y (2012) Wind turbine aerodynamics using ALE-VMS: val-
idation and role of weakly enforced boundary conditions. Comput Mech 50:499–511
72. Takizawa K, Yabe T, Tsugawa Y, Tezduyar TE, Mizoe H (2007) Computation of free-surface
flows and fluid–object interactions with the CIP method based on adaptive meshless Soroban
grids. Comput Mech 40:167–183. doi:10.1007/s00466-006-0093-2
73. Takizawa K, Tanizawa K, Yabe T, Tezduyar TE (2007) Ship hydrodynamics computations with
the CIP method based on adaptive Soroban grids. Int J Numer Methods Fluids 54:1011–1019.
doi:10.1002/fld.1466
74. Cruchaga MA, Celentano DJ, Tezduyar TE (2007) A numerical model based on the Mixed
Interface-Tracking/Interface-Capturing Technique (MITICT) for flows with fluid–solid and
fluid–fluid interfaces. Int J Numer Methods Fluids 54:1021–1030. doi:10.1002/fld.1498
75. Cruchaga M, Celentano D, Tezduyar T (2001) A moving Lagrangian interface technique for
flow computations over fixed meshes. Comput Methods Appl Mech Eng 191:525–543. doi:10.
1016/S0045-7825(01)00300-0
76. Sethian J (1999) Level set methods and fast marching methods. Cambridge University Press,
Cambridge
77. Kees CE, Akkerman I, Farthing MW, Bazilevs Y (2011) A conservative level set method suitable
for variable-order approximations and unstructured meshes. J Comput Phys 230:4536–4558
78. Akkerman I, Bazilevs Y, Benson DJ, Farthing MW, Kees CE (2012) Free-surface flow and
fluid–object interaction modeling with emphasis on ship hydrodynamics. J Appl Mech 79:010905
flows with the finite element methods–space–time formulations, iterative strategies and mas-
sively parallel implementations. In: Smolinski P, Liu WK, Hulbert G, Tamma K (eds) New
methods in transient analysis, PVP-Vol. 246/AMD-Vol. 143. ASME, New York, pp 7–24
80. Johnson AA, Tezduyar TE (1994) Mesh update strategies in parallel finite element computations
of flow problems with moving boundaries and interfaces. Comput Methods Appl Mech Eng
119:73–94. doi:10.1016/0045-7825(94)00077-8
81. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: CAD, finite elements, NURBS,
exact geometry, and mesh refinement. Comput Methods Appl Mech Eng 194:4135–4195
82. Bazilevs Y, Hughes TJR (2008) NURBS-based isogeometric analysis for the computation of
flows about rotating components. Comput Mech 43:143–150
83. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2013) Computer modeling tech-
niques for flapping-wing aerodynamics of a locust. Comput Fluids 85:125–134. doi:10.1016/
j.compfluid.2012.11.008
84. Takizawa K, Tezduyar TE (2014) Space–time computation techniques with continuous repre-
sentation in time (ST-C). Comput Mech 53:91–99. doi:10.1007/s00466-013-0895-y
85. Cruchaga M, Celentano D, Tezduyar T (2002) Computation of mould filling processes with a
moving Lagrangian interface technique. Commun Numer Methods Eng 18:483–493. doi:10.
1002/cnm.506
86. Cruchaga MA, Celentano DJ, Tezduyar TE (2005) Moving-interface computations with the
edge-tracked interface locator technique (ETILT). Int J Numer Methods Fluids 47:451–469.
doi:10.1002/fld.825
87. Cruchaga MA, Celentano DJ, Tezduyar TE (2007) Collapse of a liquid column: numerical
simulation and experimental validation. Comput Mech 39:453–476. doi:10.1007/s00466-006-
0043-z
88. Kleefsman KMT, Fekken G, Veldman AEP, Iwanowski B, Buchner B (2005) A volume-of-fluid
based simulation method for wave impact problems. J Comput Phys 206:363–393
89. Fridsma G (1968) A systematic study of the rough-water performance of planing boats. David-
son Laboratory Report 1275
90. Longo J, Stern F (2005) Uncertainty assessment for towing tank tests with example for surface
combatant DTMB model 5415. J Ship Res 49:55–68
91. Garcia J, Oñate E (2003) An unstructured finite element solver for ship hydrodynamics prob-
lems. J Appl Mech 70:18–26
92. Longo J, Shao J, Irvine M, Stern F (2007) Phase-averaged PIV for the nominal wake of a
surface ship in regular head waves. J Fluids Eng 129:524–541
93. McCormick ME (2010) Ocean engineering mechanics with applications. Cambridge University
Press, Cambridge
94. Hansen SO et al (2006) The Hardanger bridge: static and dynamic wind tunnel tests with a
section model. Technical report, Prepared for Norwegian Public Roads Administration
Computational Wind-Turbine Analysis
with the ALE-VMS and ST-VMS Methods
Yuri Bazilevs, Kenji Takizawa, Tayfun E. Tezduyar, Ming-Chen Hsu,
Nikolay Kostov and Spenser McIntyre
Abstract We provide an overview of the aerodynamic and FSI analysis of wind
turbines the first three authors' teams carried out in recent years with the ALE-VMS
and ST-VMS methods. The ALE-VMS method is the variational multiscale version
of the Arbitrary Lagrangian–Eulerian (ALE) method. The VMS components are from
the residual-based VMS (RBVMS) method. The ST-VMS method is the VMS version
of the Deforming-Spatial-Domain/Stabilized Space–Time (DSD/SST) method. The
techniques complementing these core methods include weak enforcement of the
essential boundary conditions, NURBS-based isogeometric analysis, using NURBS
basis functions in temporal representation of the rotor motion, mesh motion and also
in remeshing, rotation representation with constant angular velocity, Kirchhoff–Love
shell modeling of the rotor-blade structure, and full FSI coupling. The analysis cases
include the aerodynamics of wind-turbine rotor and tower and the FSI that accounts
for the deformation of the rotor blades. The specific wind turbines considered are
NREL 5MW, NREL Phase VI and Micon 65/13M, all at full scale, and our analysis
for NREL Phase VI and Micon 65/13M includes comparison with the experimental
data.
Y. Bazilevs (B)
Structural Engineering, University of California, San Diego, 9500 Gilman Drive,
La Jolla, CA 92093, USA
K. Takizawa
Department of Modern Mechanical Engineering and Waseda Institute for Advanced
Study, Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
T. E. Tezduyar N. Kostov S. McIntyre
M.-C. Hsu
Department of Mechanical Engineering, Iowa State University, 2025 Black Engineering,
Ames, IA 50011, USA

1 Introduction
Countries around the world are putting substantial effort into the development
of wind energy technologies. The ambitious wind energy goals put pressure on the
wind energy industry research and development to significantly enhance the current
wind generation capabilities in a short period of time and decrease the associated
costs. This calls for transformative concepts and designs (e.g., floating offshore wind
turbines) that must be created and analyzed with high-precision methods and tools.
These include complex-geometry, 3D, time-dependent, multi-physics predictive sim-
ulation methods and software that will play an increasingly important role as the
demand for wind energy grows.
Currently most wind-turbine aerodynamics and aeroelasticity simulations are per-
formed using low-fidelity methods, such as the Blade Element Momentum (BEM)
theory for the rotor aerodynamics employed in conjunction with simplified structural
models of the wind-turbine blades and tower (see, e.g., [1, 2]). These methods are
very fast to implement and execute. However, the cases involving unsteady flow,
turbulence, 3D details of the wind-turbine blade and tower geometry, and other
similarly-important features, are beyond their range of applicability.
To obtain high-fidelity results for wind turbines, 3D modeling is essential. How-
ever, simulation of wind turbines at full scale engenders a number of challenges: the
flow is fully turbulent, requiring highly accurate methods and increased grid resolu-
tion. The presence of fluid boundary layers, where turbulence is created, complicates
the situation further. Wind-turbine blades are long and slender structures, with com-
plex distribution of material properties, for which the numerical approach must have
good approximation properties and avoid locking. Wind-turbine simulations involve
moving and stationary components, and the fluid–structure coupling must be accurate,
efficient and robust to preclude divergence of the computations. These explain
the current, modest nature of the state-of-the-art in wind-turbine simulations.
Fluid–structure interaction (FSI) simulations at full scale are essential for accurate
modeling of wind turbines. The motion and deformation of the wind-turbine blades
depend on the wind speed and air flow, and the air flow patterns depend on the
motion and deformation of the blades. In order to simulate the coupled problem, the
equations governing the air flow and the blade motions and deformations need to be
solved simultaneously, with proper kinematic and dynamic conditions coupling the
two physical systems. Without that, the modeling cannot be realistic: unsteady blade
deformation affects aerodynamic efficiency, noise generation, and the response to
wind gusts. Flutter analysis of large blades operating in offshore environments is of
great importance and cannot be accomplished without FSI.
In recent years, several attempts were made to address the above mentioned
challenges and to raise the fidelity and predictability levels of wind-turbine sim-
ulations. Standalone aerodynamics simulations of wind-turbine configurations in
3D were reported in [3–6], while standalone structural analyses of rotor blades of
complex geometry and material composition, but under assumed wind-load condi-
tions or wind-load conditions coming from separate aerodynamic computations were
reported in [7–11]. In a recent work [12] it was shown that coupled FSI modeling and
simulation of wind turbines is important for accurately predicting their mechanical
behavior at full scale.
To address the above mentioned challenges one should employ a combination
of numerical techniques, which are general, accurate, robust and efficient for the
targeted class of problems. Such techniques are summarized in what follows, with
some of them described in greater detail in the body of this book chapter.
Isogeometric Analysis (IGA), first introduced in [13] and further expanded on
in [14–20], is adopted as the geometry modeling and simulation framework for
wind turbines in some of the examples presented here. We use the IGA based on
NURBS (non-uniform rational B-splines), which are more efficient than standard
finite elements for representing complex, smooth geometries, such as wind-turbine
blades. The IGA was successfully employed for computation of turbulent flows
[21–26], nonlinear structures [10, 27–31], and FSI [32–35], and, in most cases, gave a
clear advantage over standard low-order finite elements in terms of solution accuracy
per-degree-of-freedom. This is in part attributable to the higher-order smoothness of
the basis functions employed. Flows about rotating components are naturally handled
in an isogeometric framework because all conic sections, and in particular, circular
and cylindrical shapes, are represented exactly [36].
The blade structure is governed by the isogeometric rotation-free shell formulation
with the aid of the bending-strip method [10]. The method is appropriate for thin-shell
structures comprised of multiple $C^1$- or higher-order continuous surface patches that
are joined or merged with continuity no greater than $C^0$. The Kirchhoff–Love shell
theory that relies on higher-order continuity of the basis functions is employed in
the patch interior as in [31]. Although NURBS-based IGA is employed in this work,
other discretizations, such as T-splines [19, 20] or subdivision surfaces [37–39], are
perfectly suited for the proposed structural modeling method.
In addition, an isogeometric representation of the analysis-suitable geometry can
be used in generating tetrahedral and hexahedral meshes for computations with the
finite element method (FEM). In this article, we use tetrahedral meshes generated
that way in wind-turbine computations with the ALE-VMS and ST-VMS methods.
The ALE-VMS method [5, 34] is the variational multiscale (VMS) version of the
Arbitrary Lagrangian–Eulerian (ALE) method [40]. The VMS components are from
the residual-based VMS (RBVMS) method given in [21, 26, 41, 42]. The ST-VMS
method [43, 44] is the VMS version of the Deforming-Spatial-Domain/Stabilized
Space–Time (DSD/SST) method [45–49]. Earlier it was called DSD/SST-VMST
(i.e. the version with the VMS turbulence model) in [43]. The original DSD/SST for-
mulation was named DSD/SST-SUPS in [43] (i.e. the version with the SUPG/PSPG
stabilization), which was also called ST-SUPS in [50].
The ALE-VMS method originated from the RBVMS formulation of incompress-
ible turbulent flows proposed in [21] for stationary meshes, and may be thought of as
an extension of the RBVMS method to moving meshes. As such, it was presented for
the first time in [34] in the context of FSI. Although ALE-VMS gave reasonably good
results for several important turbulent flows, it was evident in [21, 24] that to obtain
accurate results for wall-bounded turbulent flows the method required relatively fine
resolution of the boundary layers. This fact makes ALE-VMS a somewhat costly
technology for full-scale wall-bounded turbulent flows at high Reynolds numbers,
which are characteristic of the present application. For this reason, a weakly-enforced
essential boundary condition formulation was introduced in [51], which significantly
improved the performance of the ALE-VMS formulation in the presence of unresolved
boundary layers [22, 23, 26]. The weak boundary condition formulation may
be thought of as an extension of Nitsche's method [52] to the Navier–Stokes equations
of incompressible flows. Another interpretation of the weak boundary condition
formulation is that it is a discontinuous Galerkin method (see, e.g., [53]), where the
continuity of the basis functions is enforced everywhere in the domain interior, but
not at the domain boundary.
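To make the idea of weak enforcement concrete, the following is a minimal sketch of Nitsche's method on a much simpler model problem than the Navier–Stokes equations: the 1D Poisson problem -u'' = 1 on (0, 1) with u(0) = u(1) = 0, discretized with linear finite elements. The function name `solve_poisson_nitsche` and the penalty value are illustrative assumptions and are not taken from [51, 52].

```python
import numpy as np

def solve_poisson_nitsche(n_el, penalty=4.0):
    """Solve -u'' = 1 on (0, 1), u(0) = u(1) = 0, with the Dirichlet
    conditions imposed weakly (Nitsche) instead of by fixing nodal values."""
    h = 1.0 / n_el
    n = n_el + 1
    K = np.zeros((n, n))
    f = np.zeros(n)
    # Standard Galerkin stiffness and load for linear elements.
    for e in range(n_el):
        K[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        f[e:e + 2] += 0.5 * h
    # Boundary node b and its interior neighbour nb; with the outward normal,
    # the normal derivative at either end is du/dn = (u_b - u_nb) / h.
    for b, nb in ((0, 1), (n_el, n_el - 1)):
        # Consistency term: -(du/dn) * v_b
        K[b, b] -= 1.0 / h
        K[b, nb] += 1.0 / h
        # Adjoint-consistency term: -(dv/dn) * u_b
        K[b, b] -= 1.0 / h
        K[nb, b] += 1.0 / h
        # Penalty term: (penalty / h) * u_b * v_b
        K[b, b] += penalty / h
    x = np.linspace(0.0, 1.0, n)
    return x, np.linalg.solve(K, f)

x, u = solve_poisson_nitsche(16)
```

In this sketch the boundary values `u[0]` and `u[-1]` come out small but nonzero (of order h squared), which illustrates that the essential boundary condition is satisfied only approximately on a coarse mesh and is recovered as the mesh is refined.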
The DSD/SST formulation was introduced in [45–47] as a general-purpose
interface-tracking (moving-mesh) technique for flows with moving boundaries and
interfaces, including FSI and flows with moving objects. Its stabilization com-
ponents are the Streamline-Upwind/Petrov-Galerkin (SUPG) [54] and Pressure-
Stabilizing/Petrov-Galerkin (PSPG) [45, 55] stabilizations. It also includes the
LSIC (least-squares on incompressibility constraint) stabilization. Some of the
earliest FSI computations with the DSD/SST formulation were reported in [56] for
vortex-induced vibrations of a cylinder and in [57] for flow-induced vibrations of a
flexible, cantilevered pipe (1D structure with 3D flow). The DSD/SST formulation
has been used extensively in 3D computations of parachute FSI, starting with the
3D computations reported in [58] and evolving to computations with direct cou-
pling [59]. New versions of the DSD/SST formulation introduced in [49] are the
core technologies of the Stabilized ST FSI (SSTFSI) technique, which was also
introduced in [49]. The ST-VMS method and SSTFSI technique, combined with a
number of special techniques (see [60–63] and references therein), have been used
in some of the most challenging parachute FSI computations (see [60, 64–66] and
references therein), and also in a good number of patient-specific cardiovascular FSI
and fluid mechanics computations (see [61–63, 67] and references therein). Computations
with the SSTFSI technique also received substantial attention in research
related to iterative solution of large linear systems [68, 69].
In application of the DSD/SST formulation to flows with moving objects, the
Shear–Slip Mesh Update Method (SSMUM) [70–72] has been very instrumental.
The SSMUM was first introduced for computation of flow around two high-speed
trains passing each other in a tunnel (see [70]). The challenge was to accurately
and efficiently update the meshes used in computations based on the DSD/SST
formulation and involving two objects in fast, linear relative motion. The idea behind
the SSMUM was to restrict the mesh moving and remeshing to a thin layer of elements
between the objects in relative motion. The mesh update at each time step can be
accomplished by a shear deformation of the elements in this layer, followed by a
slip in node connectivities. The slip in the node connectivities, to an extent, undoes
the deformation of the elements and results in elements with better shapes than those
that were shear-deformed. Because the remeshing consists of simply re-defining
the node connectivities, both the projection errors and the mesh generation cost are
minimized. A few years after the high-speed train computations, the SSMUM was
implemented for objects in fast, rotational relative motion and applied to computation
of flow past a rotating propeller [71] and flow around a helicopter [72].
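The shear-then-slip idea can be illustrated with a toy 2D analogue, assuming a one-element-thick layer between a rotating inner ring and a stationary outer ring that carry the same number of equally spaced nodes. This is only a sketch of the bookkeeping; the function names are hypothetical and it is not the implementation of [70–72].

```python
import math

def shear_slip_state(rotation_angle, n_nodes):
    """Return (slip, shear) for a layer between a rotating inner ring and a
    stationary outer ring, each with n_nodes equally spaced nodes."""
    spacing = 2.0 * math.pi / n_nodes
    # Slip: shift the connectivity by the nearest whole number of node spacings.
    slip = round(rotation_angle / spacing)
    # Shear: the angular offset that the layer elements still have to absorb.
    shear = rotation_angle - slip * spacing
    return slip % n_nodes, shear

def layer_connectivity(slip, n_nodes):
    """Quadrilateral layer elements as (inner_i, inner_i+1, outer_j+1, outer_j)."""
    quads = []
    for i in range(n_nodes):
        j = (i + slip) % n_nodes
        quads.append((i, (i + 1) % n_nodes, (j + 1) % n_nodes, j))
    return quads

slip, shear = shear_slip_state(math.radians(47.0), 36)
quads = layer_connectivity(slip, 36)
```

Because only the connectivity list changes when the slip is applied, no new nodes are created and no solution projection is needed in this toy setting, and the residual shear never exceeds half a node spacing.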
The ST-VMS method was successfully tested on computation of wind-turbine
rotor aerodynamics in [6, 73, 74]. Those computations did not include a wind-turbine
tower, and therefore a mesh update method was not required. In [75], the ST-VMS
method was applied to computation of wind-turbine rotor and tower aerodynamics.
The presence of a tower requires a mesh update method that can handle the fast,
rotational relative motion between the rotor and tower. The SSMUM would have
been an option, but we decided to use a method that is more general. We use NURBS
basis functions for the temporal representation of the rotor motion, mesh motion and
also in remeshing. This is essentially the same computational technology used in
the ST-VMS computations of flapping-wing aerodynamics reported in [76–79]. We
named it ST/NURBS Mesh Update Method (STNMUM) in [75]. The motion of the
rotor surface mesh created from the NURBS geometry is represented by quadratic
temporal NURBS basis functions, with a sufficient number of temporal patches for
one rotation. This enables us to represent the circular paths associated with the rotor
motion exactly and, with a secondary mapping [43, 44, 50, 76], specify a con-
stant angular velocity corresponding to the invariant speeds along those paths. Given
the motion of the surface mesh, we compute meshes that serve as temporal-control
points. This is done by creating with an automatic mesh generator a new mesh at the
central control point of the temporal patch, and computing the meshes at the other two
control points by using the mesh moving technique [49, 80–83] developed earlier in
conjunction with the DSD/SST method. The STNMUM allows us to compute meshes
at longer time intervals, yet obtain the mesh-related information for each
ST slab, such as the coordinates and their time derivatives, from the temporal
representation whenever it is needed. This approach, where the mesh-related information is
computed directly, was called Direct Temporal Representation (DTR) in [75]. In
an alternative approach, we can obtain the mesh-related data after first computing the
finite element meshes associated with each ST slab by interpolation from the temporal
NURBS representation of the mesh. This approach was called Interpolated-Mesh
Temporal Representation (IMTR) in [75]. For better mesh resolution, we use layers
of thin elements near the blade surfaces. These layers of elements are created with a
special mesh generation process and are not part of what we create with the automatic
mesh generation process. They undergo rigid-body motion with the rotor. Despite
the fast, rotational relative motion between the rotor and tower, the computations
reported in [75] were carried out by using an automatic mesh generator only a total
of 6 times during an entire computation.
We refer the interested reader to [50, 74] for the following methods that are not
reviewed in this article: ALE-VMS and ST-VMS methods, formulation for weakly-
enforced essential boundary conditions, structural mechanics formulation, which is
based on the Kirchhoff–Love thin-shell theory and the bending-strip method (see
[10, 12, 31]), FSI coupling, mesh update, and a method for pre-bending of wind-
turbine blades, which was recently proposed in [11].
In Sect. 2, we present the sliding-interface formulation from [36, 84, 85], which
enables the simulation of rotor–tower interaction. The formulation was used in [85]
for ALE-VMS aerodynamic simulations of the National Renewable Energy Lab
(NREL) Phase VI wind turbine (see [86]) for comparison to the extensive set of
experimental data available for this test case. We also present those simulations in
Sect. 2. In Sect. 3, we describe, from [75], the ST-VMS computations of the wind-turbine
rotor and tower aerodynamics. NURBS basis functions are used in temporal
representation of the rotor and volume mesh motion and in remeshing. Simulations
of the Micon 65/13M wind turbine with FSI, reported earlier in [87], are described
in Sect. 4. We end with concluding remarks in Sect. 5.

Fig. 1 Setup for the simulation of a full machine. The interior moving subdomain, which encloses the wind turbine rotor, and the exterior stationary subdomain, which houses the nacelle and tower
2 Sliding-Interface Formulation and Rotor–Tower Interaction
2.1 Sliding-Interface Formulation
In order to simulate the full wind turbine configuration and investigate the rotor–tower
interaction, we consider an approach that makes use of a moving subdomain, which
encloses the entire wind turbine rotor, and a stationary subdomain that contains the
rest of the wind turbine (see Fig. 1). The two domains are in relative motion and
share a sliding cylindrical interface. The meshes on each side of the interface are
nonmatching because of the relative motion (see Fig. 2). As a result, a numerical
procedure is needed to impose the continuity of the kinematics and tractions at the
stationary and rotating subdomain interface despite the fact that the interface
discretizations are incompatible. Such a procedure was developed in [36] in the context
of IGA for computing flows about rotating components. The advantage of IGA for
rotating-component flows is that the cylindrical sliding interfaces are represented
exactly and no geometry errors are incurred. In the case of standard FEM employed
here, the geometric compatibility is only approximate. The sliding-interface coupling
was successfully tested on the NREL Phase VI wind turbine in [85] and is presented
in what follows.

Fig. 2 Nonmatching meshes at the sliding interface between the stationary and moving subdomains. Left Full domain. Right Zoom on the sliding interface

Let the subscripts S and M denote the quantities pertaining to the fluid mechanics
problem on the stationary and moving subdomains, respectively. The subdomain that
encloses the rotor rotates with it, and the interior of the rotating subdomain is allowed
to deflect to accommodate the motion of the blades. However, the motion of the outer
boundary of the rotor subdomain is restricted to a rigid rotation to maintain geometric
compatibility with the stationary subdomain. To enforce the compatibility of the flow
kinematics and tractions at the sliding interface, we add the following terms to the
ALE-VMS formulation, which is now assumed to hold in both the stationary and
moving subdomains:

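$$
\begin{aligned}
&-\sum_{b=1}^{n_{eb}} \int_{\Gamma_t^b \cap (\Gamma_t)_{\mathrm{SI}}}
\left(\mathbf{w}_S^h - \mathbf{w}_M^h\right) \cdot \frac{1}{2}
\left(\boldsymbol{\sigma}_S \mathbf{n}_S - \boldsymbol{\sigma}_M \mathbf{n}_M\right) d\Gamma \\
&-\sum_{b=1}^{n_{eb}} \int_{\Gamma_t^b \cap (\Gamma_t)_{\mathrm{SI}}}
\frac{1}{2}\left(\delta\boldsymbol{\sigma}_S \mathbf{n}_S - \delta\boldsymbol{\sigma}_M \mathbf{n}_M\right)
\cdot \left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma \\
&-\sum_{b=1}^{n_{eb}} \int_{\Gamma_t^b \cap (\Gamma_t)_{\mathrm{SI}}}
\mathbf{w}_S^h \cdot \rho \left\{\left(\mathbf{u}_S^h - \hat{\mathbf{u}}_S^h\right) \cdot \mathbf{n}_S\right\}_{-}
\left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma \\
&-\sum_{b=1}^{n_{eb}} \int_{\Gamma_t^b \cap (\Gamma_t)_{\mathrm{SI}}}
\mathbf{w}_M^h \cdot \rho \left\{\left(\mathbf{u}_M^h - \hat{\mathbf{u}}_M^h\right) \cdot \mathbf{n}_M\right\}_{-}
\left(\mathbf{u}_M^h - \mathbf{u}_S^h\right) d\Gamma \\
&+\sum_{b=1}^{n_{eb}} \int_{\Gamma_t^b \cap (\Gamma_t)_{\mathrm{SI}}}
\frac{C_I^B \mu}{h_n} \left(\mathbf{w}_S^h - \mathbf{w}_M^h\right) \cdot
\left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma = 0,
\end{aligned}
\tag{1}
$$

where $\delta\boldsymbol{\sigma}$ is given by $\delta\boldsymbol{\sigma}(\mathbf{w}, q)\,\mathbf{n} = 2\mu\,\boldsymbol{\varepsilon}(\mathbf{w})\,\mathbf{n} + q\,\mathbf{n}$, $(\Gamma_t)_{\mathrm{SI}}$ is the sliding interface,
and $\{A\}_{-}$ denotes the negative part of $A$, that is, $\{A\}_{-} = A$ if $A < 0$ and $\{A\}_{-} = 0$
if $A \geq 0$. The sliding-interface formulation may be seen as a discontinuous Galerkin (DG) method, where
the continuity of the basis functions is enforced everywhere in the interior of the
two subdomains, but not at the sliding interface between them. The structure of the
terms on the sliding interface is similar to that of the weak enforcement of essential
boundary conditions. The significance of each term is explained in detail in [36]. In
the current application, the mesh velocity $\hat{\mathbf{u}}_S^h = \mathbf{0}$, because the subdomain S is stationary. However, the
formulation is able to handle situations where both subdomains are in motion.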
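As a small illustration of the negative-part operator defined after Eq. (1), the sketch below evaluates a pointwise term of the form rho {(u - u_mesh) . n}_- (u_own - u_other), which is how the convective interface terms are read here. The function names and the sample numbers are assumptions made for illustration only; the test function and the surface integration are omitted.

```python
import numpy as np

def negative_part(a):
    """{A}_-: equals A where A < 0 and 0 otherwise."""
    return np.minimum(a, 0.0)

def inflow_convective_term(rho, u_own, u_mesh, u_other, normal):
    """rho * {(u_own - u_mesh) . n}_- * (u_own - u_other) at one interface point."""
    flux = np.dot(u_own - u_mesh, normal)
    return rho * negative_part(flux) * (u_own - u_other)

n_S = np.array([1.0, 0.0, 0.0])   # outward normal of the stationary side
no_mesh_motion = np.zeros(3)      # stationary subdomain: zero mesh velocity

# Flow entering the stationary side through the interface: term is active.
term_in = inflow_convective_term(1.23, np.array([-2.0, 0.0, 0.0]),
                                 no_mesh_motion, np.array([-1.5, 0.0, 0.0]), n_S)
# Flow leaving the stationary side: the negative part switches the term off.
term_out = inflow_convective_term(1.23, np.array([2.0, 0.0, 0.0]),
                                  no_mesh_motion, np.array([1.5, 0.0, 0.0]), n_S)
```

The term therefore acts only on the side where the flow enters through the interface, which is the usual upwinding behavior of such inflow stabilization terms.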
Remark 1 Nonmatching interface discretizations in the FSI and sliding-interface
problems necessitate the use of interpolation or projection of kinematic and traction
data between the nonmatching surface meshes (see, e.g., [43, 44, 88], where [44]
is more comprehensive than [43]). A computational procedure that can simultaneously
handle the data transfer for IGA and FEM discretizations was proposed
in [88]. The procedure also includes a robust approach for identifying closest points
on arbitrarily shaped surfaces. While such interface projections are rather straightforward
for weakly-coupled FSI algorithms, they require special techniques [44, 49,
60] for strongly-coupled, direct and quasi-direct methods [44, 49, 59, 60, 89], which
become monolithic for matching discretizations.
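A one-dimensional analogue of such a data transfer can be sketched as follows, assuming nodal values on two nonmatching sets of nodes placed on the same circle, with one set rotated relative to the other. It uses periodic piecewise-linear interpolation via `numpy.interp`; the helper name `transfer_on_ring` is hypothetical, and the procedure of [88] for general surfaces is considerably more involved.

```python
import numpy as np

def transfer_on_ring(theta_src, values_src, theta_dst):
    """Periodic piecewise-linear interpolation of nodal data between two
    nonmatching node sets on the same circle (angles in radians)."""
    return np.interp(theta_dst, theta_src, values_src, period=2.0 * np.pi)

rotation = 0.3  # current rotation of the moving side, in radians
theta_moving = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False) + rotation
theta_fixed = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)

# Sample a smooth field at the moving-side nodes and transfer it to the fixed side.
values_moving = np.cos(theta_moving)
values_fixed = transfer_on_ring(theta_moving, values_moving, theta_fixed)
```

The `period` argument takes care of the wrap-around at 2π, so the rotated node angles do not need to be sorted or reduced by hand.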
2.2 NREL Phase VI Wind Turbine
The computational results in this section make use of the ALE-VMS technique and
are taken from [85]. The sliding-interface formulation is applied to the simulation
of the full NREL Phase VI wind turbine configuration, including the rotor (blades
and hub), nacelle and tower. The tower is composed of two cylinders with diameters
of 0.6096 and 0.4064 m that are connected with a short conical section. The tower
height is 11.144 m above the wind tunnel floor. The detailed geometry of the tower
and nacelle can be found in Hand et al. [86]. For this study, wind speeds of 7 and
10 m/s were selected from the experimental sequence S. The experimental sequence
S setup consists of the wind turbine rotor in the upwind configuration, 0° yaw angle,
0° cone angle, rotational speed of 72 rpm, and blade tip pitch angle of 3°. The air
density and viscosity are 1.23 kg/m$^3$ and $1.78 \times 10^{-5}$ kg/(m·s), respectively.
Figure 3 shows the mesh resolution used in the computation. The mesh is highly
refined near the rotor, nacelle and tower, as well as downstream of the wind turbine
to better capture the wake turbulence. The mesh is comprised of 6,835,647 linear
elements and 1,603,377 nodes. The size of the first boundary-layer element in the
wall-normal direction is 0.002 m, and 15 layers of prismatic elements were generated
with a growth ratio of 1.2. The time step size is set to $1.0 \times 10^{-5}$ s.

Fig. 3 Meshes used in the full-wind-turbine simulation. Left 2D cut at x = 0 to show the flow domain mesh quality. Right Rotor, nacelle, and tower surface mesh
Fig. 4 Air speed planar distribution and isosurfaces at an instant for the 7 m/s case
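The boundary-layer and time-step figures quoted for this mesh can be put in perspective with a short calculation. The sketch below assumes that the 15 prismatic layers grow geometrically, each 1.2 times thicker than the previous one, and uses the 72 rpm rotor speed given earlier in this section; the function names are illustrative.

```python
def boundary_layer_thickness(first_height, growth_ratio, n_layers):
    """Total thickness of n_layers whose heights form a geometric progression."""
    return first_height * (growth_ratio ** n_layers - 1.0) / (growth_ratio - 1.0)

def degrees_per_time_step(rpm, dt):
    """Rotor rotation, in degrees, covered by one time step of size dt (seconds)."""
    return rpm / 60.0 * 360.0 * dt

total_thickness = boundary_layer_thickness(0.002, 1.2, 15)  # about 0.144 m
dtheta = degrees_per_time_step(72.0, 1.0e-5)                # about 0.00432 degrees
steps_per_revolution = 360.0 / dtheta                       # about 83,000 steps
```

Under these assumptions the prismatic layers span roughly 14 cm from the wall, and one rotor revolution takes on the order of 83,000 time steps.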
Figures 4 and 5 show the flow visualization of the full-wind-turbine simulations
of the 7 and 10 m/s cases, respectively. The flow structures are different between the
two cases. The tip vortex for the 7 m/s case decays very slowly as it is convected
downstream, while the tip vortex breaks down quickly for the 10 m/s case. No visible
discontinuities are present in the flow field at the sliding interface, which indicates
that the method correctly handles the kinematic compatibility conditions.
Fig. 5 Air speed planar distribution and isosurfaces at an instant for the 10 m/s case
Fig. 6 Single-blade aerodynamic torque over a full revolution for 7 m/s (left) and 10 m/s (right)
cases. The 180° azimuthal angle corresponds to the instant when the blade passes in front of the tower.
The tower effect is clearly pronounced in the 7 m/s case. It is also present in the 10 m/s case, but is
not as significant. The results in both cases are in very good agreement with the experimental data
To see the influence of the tower, the single-blade aerodynamic torque over a full
revolution is plotted in Fig. 6 for both 7 and 10 m/s cases. The results of the full-
wind-turbine computations are compared with the experimental data, as well as with
the results of the rotor-only computations. For the full-wind-turbine simulation of
the 7 m/s case, Fig. 6 clearly shows the drop in the aerodynamic torque at an instant
when the blade passes in front of the tower, which corresponds to the azimuthal angle
of 180°. The drop in the torque is about 8% relative to its value when the blade is
away from the tower. These results are in good agreement with the experimental data.
The rotor-only computation, which is also shown in the figure, is obviously unable
to predict this feature, which may be important for the transient structural response
of the blades. It should be noted, however, that the cycle-averaged aerodynamic
torque is nearly identical for the full-wind-turbine and the rotor-only simulations.
The picture is completely different for the 10 m/s case, where the influence of the
tower, although clearly present, is a lot less pronounced.
3 ST-VMS Computation of the Wind-Turbine Rotor and Tower Aerodynamics
This section is from [75].
3.1 Rotation Representation with Constant Angular Velocity
We use quadratic NURBS functions, as described in [43, 44, 50, 76], to represent
a circular arc. We discretize time and position as follows:

$$
t = \sum_{\alpha=1}^{n_{ent}} T^{\alpha}\left(\Theta_t(\theta)\right) t^{\alpha}, \qquad
\mathbf{x} = \sum_{\alpha=1}^{n_{ent}} T^{\alpha}\left(\Theta_x(\theta)\right) \mathbf{x}^{\alpha}. \tag{2}
$$

Here $n_{ent}$ is the number of temporal element nodes, $T^{\alpha}$ is the basis function, $\Theta_t(\theta)$
and $\Theta_x(\theta)$ are the secondary mappings for time and position, and $t^{\alpha}$ and $\mathbf{x}^{\alpha}$ are the
time and position values corresponding to the basis function $T^{\alpha}$. The basis functions
could be finite element or NURBS basis functions. For the circular arc, $n_{ent} = 3$ and
they are quadratic NURBS. The secondary mapping concept above was introduced
in [43], and the velocity can be expressed as follows:

$$
\frac{d\mathbf{x}}{dt} =
\left(\sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_x} \frac{d\Theta_x}{d\theta}\, \mathbf{x}^{\alpha}\right)
\left(\sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_t} \frac{d\Theta_t}{d\theta}\, t^{\alpha}\right)^{-1}, \tag{3}
$$

leading to

$$
\frac{d\mathbf{x}}{dt} =
\left(\sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_x}\, \mathbf{x}^{\alpha}\right)
\left(\sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_t}\, t^{\alpha}\right)^{-1}
\frac{d\Theta_x}{d\theta} \frac{d\theta}{d\Theta_t}. \tag{4}
$$

Thus, the speed along the path can be specified solely by modifying the secondary
mapping. For a circular arc, two methods were introduced in [44, 76]; one is modifying
the secondary mapping for position and the other one is modifying both such that
$d\Theta_t/d\theta$ is constant. We note that, in theory, the secondary mapping selections do not make
any difference as long as the relationship $d\Theta_x/d\Theta_t$ is the same. In our implementation,
to keep the process general, we search for the parametric coordinate $\theta$ by using
an iterative solution method [44, 50, 76]. We use the latter set of the secondary
mappings, having constant $d\Theta_t/d\theta$. For the IMTR, we find the parametric coordinate
corresponding to each time level and interpolate the position to obtain the corresponding
mesh. For the DTR, we first calculate time corresponding to each integration point,
including the time step size because of the jump term, and then calculate $\Theta_x$ and $\Theta_t$
to interpolate the position and velocity from Eqs. (2) and (4).
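The two ingredients of this section, exact representation of a circular arc by quadratic NURBS and a reparameterization that yields constant angular velocity, can be illustrated with a minimal sketch. It assumes a single rational quadratic segment with weights (1, cos(sweep/2), 1) and uses bisection as a stand-in for the iterative search; the function names are hypothetical and this is not the implementation of [44, 50, 76].

```python
import math

def arc_point(s, radius, sweep):
    """Point at parameter s in [0, 1] on a circular arc of angle `sweep` (< pi),
    represented exactly by one quadratic rational (NURBS) segment."""
    half = 0.5 * sweep
    ctrl = [(radius, 0.0),
            (radius, radius * math.tan(half)),   # middle point lies outside the circle
            (radius * math.cos(sweep), radius * math.sin(sweep))]
    weights = [1.0, math.cos(half), 1.0]
    basis = [(1.0 - s) ** 2, 2.0 * s * (1.0 - s), s ** 2]
    den = sum(w * b for w, b in zip(weights, basis))
    x = sum(w * b * p[0] for w, b, p in zip(weights, basis, ctrl)) / den
    y = sum(w * b * p[1] for w, b, p in zip(weights, basis, ctrl)) / den
    return x, y

def parameter_for_uniform_rotation(tau, sweep, iterations=60):
    """Find s such that the arc point sits at angle tau * sweep, where tau in
    [0, 1] is the normalized time (bisection on the monotone angle)."""
    target = tau * sweep
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        x, y = arc_point(mid, 1.0, sweep)
        if math.atan2(y, x) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sweep = math.radians(20.0)
s_quarter = parameter_for_uniform_rotation(0.25, sweep)
x_quarter, y_quarter = arc_point(s_quarter, 63.0, sweep)
```

Every point returned by `arc_point` lies on the circle, but equal increments of `s` do not give equal increments of angle; the remapped parameter from `parameter_for_uniform_rotation` plays the role of a secondary mapping that restores uniform rotation in time.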
3.2 Geometry Construction
The geometry construction for the wind-turbine rotor blade and hub was described
in [5, 6], and also partially in [73, 75]. For completeness we repeat some of that
information here. The geometry of the rotor blade is based on the NREL 5MW
offshore baseline wind turbine reported in [90]. A 61 m blade is attached to a hub
with radius of 2 m, making the total rotor radius, R, 63 m. The blade is composed
of several airfoil types. The first portion of the blade is a perfect cylinder. Farther
away from the root the cylinder is smoothly blended into a series of DU (Delft
University) airfoils. Starting at 44.55 m from the root and all the way to the tip,
the NACA64 profile is used. For each cross-section, we use quadratic NURBS to
represent the 2D airfoil shape. The weights of the NURBS functions are set to unity,
except near the root, where they are adjusted to represent the circular cross-sections exactly.
The cross-sections are lofted along the blade axis direction, also using quadratic
NURBS and unit weights. This geometry-construction process yields a smooth blade
surface with a relatively small number of input parameters, which is an advantage of
the isogeometric representation. Images of the airfoil types used in the wind-turbine
rotor blade and the final blade including the twisting cross-sections can be found in
[5, 6, 73]. Starting from this rotor surface geometry, we generate a quadratic NURBS
surface with $G^2$ and $G^1$ continuity between the patches around and along the blade,
respectively. The tower geometry was created based on the tower design specified
for the NREL 5MW wind turbine, which describes a circular tower with a height of
87.6 m, a base diameter of 6 m, and a top diameter of 3.87 m. This geometry was
generated by lofting between NURBS curves for the top and base of the tower. The
rotor axis is 90° to the tower, and there is no tilt or precone. The distance between
the tower axis and the point where the three blade axes intersect is 5 m. For most of
the blade, the clearance from the tower is in the range 2.3–2.8 m.
3.3 Problem Setup
We compute the aerodynamics of the rotor with and without its tower for a given
rotor shape and wind speed and a specified rotor speed. The wind speed is uniform
at 9 m/s and the rotor speed is 1.08 rad/s, giving a tip speed ratio of 7.55 (see [91] for
wind-turbine terminology). We use air properties at standard sea-level conditions.
The Reynolds number (based on the chord length at $\frac{3}{4}R$ and the relative velocity there)
is approximately 12 million. At the inflow boundary the velocity is set to the wind
velocity, at the outflow boundary the stress vector is set to zero, and at the top, side,
and bottom boundaries slip conditions are imposed.

Fig. 7 Path of a blade tip with temporal patches and control point numbering local to each patch. A control point at the start of a patch and colocated with a control point at the end of the previous patch is in parentheses. Patch colors: 1 blue, 2 orange, 3 purple, 4 green, 5 red, 6 teal
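The tip speed ratio quoted in the problem setup follows from the usual definition, rotor speed times rotor radius divided by wind speed. A brief check, using the rotor radius of 63 m from Sect. 3.2 and a hypothetical helper name, is sketched below.

```python
def tip_speed_ratio(rotor_speed, rotor_radius, wind_speed):
    """Blade tip speed (rad/s times m) divided by the free-stream wind speed (m/s)."""
    return rotor_speed * rotor_radius / wind_speed

tip_speed = 1.08 * 63.0                    # about 68 m/s
tsr = tip_speed_ratio(1.08, 63.0, 9.0)     # about 7.56
```

With the rounded inputs given in the text this evaluates to about 7.56, close to the quoted 7.55; the small difference presumably comes from rounding of the rotor speed.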
3.4 Rotor Motion
The circular turbine rotation is represented with temporal NURBS basis functions
and secondary mapping, described in Sect. 3.1. Because the three blades of the
turbine are 120° apart, rotational geometric periodicity is used such that a full 360°
rotation is defined by three identical 120° segments. Each 120° segment is divided
into six patches to keep the mesh distortion under control. Each patch has three
temporal-control points. The six temporal patches and their control points are shown
in Fig. 7.
3.5 Surface Mesh
The rotor surface mesh is generated by discretizing the NURBS surface geometry at
each knot intersection, subdividing the knot spans into quadrilateral finite elements
in a structured way, and subdividing the quadrilateral elements into two triangles.
Small adjustments are made to improve the mesh near the hub. The surface mesh
position is calculated at each temporal-control point shown in Fig. 7. Figure 8 shows
the rotor surface at the three temporal-control points of the first patch. We note that
control points 1 and 3 lie on the path traveled by the points on the blades and a
portion of the hub at the start and end of the 20° rotation, but control point 2 lies
outside the circular arc. This means that the temporal-control mesh 2 is deformed
compared to the temporal-control meshes 1 and 3. A temporal-control mesh 2 has
to be generated for the part of the surface between the hub cross-sections rotating
with the blades and fixed to the tower. The tower surface mesh is generated from
the NURBS representation of the surface by using an unstructured triangular mesh
generator and matched with the previously generated hub mesh at the intersection.
The rotor surface mesh has 34,087 nodes and 68,112 triangles. The tower surface
mesh has 6,952 nodes and 13,806 triangles.

Fig. 8 Rotor surface at the three temporal-control points of the first patch
3.6 Volume Mesh
3.6.1 Boundary-Layer Mesh
The layers of thin elements near the blades are generated by extruding the NURBS
surface geometry into NURBS volume representation, subdividing the knot spans
into hexahedral finite elements in a structured way, and subdividing the hexahedral
elements into six tetrahedral elements. The resulting boundary-layer mesh for each
blade consists of four layers with a first-layer thickness of about $2.85 \times 10^{-2}$ m and
a total thickness of about $2.85 \times 10^{-1}$ m, 52 nodes in the circumferential direction
around the blade, and approximately 145 nodes in the longitudinal direction. The
tower boundary-layer mesh is generated by extruding the tower surface mesh to layers
of prismatic elements, which are then subdivided into three tetrahedral elements
each. It consists of four layers, with a first-layer thickness of $2.85 \times 10^{-2}$ m and a
total thickness of $3.0 \times 10^{-1}$ m. The blade and tower boundary-layer meshes do not
undergo any mesh deformation. This maintains the mesh quality in the boundary-
layer regions. Figure 9 shows the tower and blade boundary-layer meshes.
Fig. 9 Left Boundary-layer mesh at $\frac{3}{4}R$. Right Tower boundary-layer mesh
Fig. 10 A cut plane of temporal-control Mesh 1 of patch 1 for Mesh 3
3.6.2 Overall Mesh
Three different meshes are used in the computations: Mesh 1, Mesh 2, and Mesh
3. Mesh 2 has both the rotor and the tower, with boundary-layer mesh only for the
blades. Mesh 1 has only the rotor, and is identical to Mesh 2 except the tower is filled
with volume elements. Mesh 3 has both the rotor and the tower, with boundary-layer
mesh for both the blades and the tower, and a mesh refinement region downstream of
the tower. All three meshes have an outer, coarser region, with an inner cylindrical
refinement region surrounding the rotor. This inner refinement region includes most
of the tower for Mesh 2 and Mesh 3, and the mesh refinement region downstream
of the tower for Mesh 3. Figure 10 illustrates, as an example, a cut plane of Mesh
3. The inflow and outflow boundaries are at 3.79R and 10.35R from the hub center,
respectively. The side, top, and bottom boundaries are at 2.29R, 3.17R, and 1.43R,
respectively (see Fig. 10). The volume mesh is generated once per patch using an
automatic mesh generator (a total of 6 times). The mesh is generated at control
point 2 of each patch to minimize mesh distortion between control points. We note
that only the mesh in the inner cylindrical refinement region surrounding the rotor
is generated for each patch. The outer, coarser mesh is generated only once, and
is kept the same when the inner meshes are generated for each patch. The mesh
moving technique [49, 8083] developed earlier in conjunction with the DSD/SST
method is used for computing the mesh position for control points 1 and 3. The outer
surfaces of the boundary-layer meshes serve as the boundaries where we specify
the inner boundary conditions for the mesh motion. The external boundaries of the
computational domain serve as the boundaries where we specify the outer boundary
conditions, with zero displacement. In the elasticity equations of the mesh moving
technique, a Young's modulus of 1.0, a Poisson's ratio of 0.20, and a stiffening
exponent of 1.5 are used. We use 1,500 GMRES [92] iterations for each step of the
mesh motion, with a diagonal preconditioner. Each 10° range of motion is computed
over 40 steps. The approximate numbers of nodes for Mesh 1, Mesh 2, and Mesh 3
are 465,000, 440,000, and 595,000, respectively.
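As a rough illustration of how a stiffening exponent can enter a mesh moving technique of this kind, the sketch below assumes a Jacobian-based stiffening in which each element's contribution to the elasticity equations is scaled by (J0/Je)^chi, so that smaller elements resist deformation more. That form, the function name, the reference Jacobian `j0`, and the sample element Jacobians are assumptions made for illustration; only the exponent 1.5 and the 10°-over-40-steps increment come from the text.

```python
def stiffening_factor(element_jacobian: float, j0: float = 1.0, chi: float = 1.5) -> float:
    """Relative stiffness multiplier for one element, assumed form (J0 / Je)**chi."""
    if element_jacobian <= 0.0:
        raise ValueError("element Jacobian must be positive")
    return (j0 / element_jacobian) ** chi


# Hypothetical element Jacobians: a coarse outer element and two finer ones.
sample_jacobians = [1.0, 0.1, 0.01]
factors = [stiffening_factor(j) for j in sample_jacobians]

# Rotation increment per mesh-motion step: each 10 degree range over 40 steps.
degrees_per_mesh_step = 10.0 / 40.0

for jac, factor in zip(sample_jacobians, factors):
    print(f"Je = {jac:g}: stiffness multiplier = {factor:.2f}")
print(f"rotation per mesh-motion step = {degrees_per_mesh_step} degrees")
```

With an exponent larger than 1, the multiplier grows faster than the inverse of the element size, which is consistent with the aim of protecting small elements near the moving surfaces.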
3.7 Computational Conditions
In the ST-VMS computations, the stabilization parameters are given by Eq. (7) in [49]
for $\tau_M\ (=\tau_{\mathrm{SUPS}}) = \tau_{\mathrm{SUPG}}$ and Eq. (19) in [75] for $\nu_C\ (=\nu_{\mathrm{LSIC}}) = \nu_{\mathrm{LSIC\text{-}HRGN}}$. They
are both used with $h_{\mathrm{RGN}} = h_{\mathrm{RGNT}}$, given by Eq. (15) in [75], which was originally
introduced in [76]. The DTR and IMTR approaches are used on all three meshes.
Least-squares projection is used to interpolate the velocity and pressure between
temporal patches. Because the boundary-layer meshes and the tower and rotor surface
meshes remain identical between temporal patches, the velocity values are transferred
exactly for those nodes. The time-step size is $2.23 \times 10^{-3}$ s (145 time steps per
patch), with four nonlinear iterations per time step. First, we develop the flow field
for 500 time steps while the rotor is static, ramping up the inflow velocity during
the first 300 steps from zero to the wind speed using a cosine ramp. During this
flow-development stage, we use 150, 150, 200, and 400 GMRES iterations for the
four nonlinear iterations. In computations with the rotor in motion, we use 150, 150,
200, and 400 GMRES iterations for Mesh 1, and 150, 250, 350, and 500 GMRES
iterations for Mesh 2 and Mesh 3. With the GMRES iterations in flow computations,
we use a nodal-block-diagonal preconditioner. The mesh is partitioned based on the
METIS algorithm [93] to improve parallel efficiency in the computations.
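The flow-development stage described above can be sketched as follows. The text states only that a cosine ramp is applied over the first 300 of 500 steps; the half-cosine form used here, the function name, and the normalized wind speed are assumptions made for illustration.

```python
import math


def inflow_speed(step: int, wind_speed: float, ramp_steps: int = 300) -> float:
    """Inflow speed at a given time step, using an assumed half-cosine ramp."""
    if step <= 0:
        return 0.0
    if step >= ramp_steps:
        return wind_speed
    return 0.5 * wind_speed * (1.0 - math.cos(math.pi * step / ramp_steps))


# Normalized wind speed (1.0) over the 500-step flow-development stage.
profile = [inflow_speed(n, wind_speed=1.0) for n in range(501)]

for n in (0, 75, 150, 225, 300, 500):
    print(f"step {n:3d}: inflow = {profile[n]:.4f} x wind speed")
```

A ramp of this shape starts and ends with zero slope, so the inflow is switched on without an abrupt change in acceleration.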
3.8 Results
Figure 11 shows the torque for Mesh 1 with the DTR approach, for the last 360°
rotation of a blade, with the rotation amount measured from the orientation seen
in Fig. 7. For reference purposes, Fig. 11 includes the NREL data. The torque is
within 8 % of the NREL data. Figure 12 shows the torque for the last 80° rotation
of a single blade of Mesh 1 with the DTR approach, compared with the torque from
an earlier, single-blade computation [73] using the TGI option of $\nu_C\ (=\nu_{\mathrm{LSIC}})$. The
single-blade computation has the same blade geometry, wind speed, and rotor speed,
but has a single-blade mesh in a rotationally-periodic domain. It has a more refined
boundary-layer mesh and a time-step size that is approximately five times smaller.
The higher torque seen for the single-blade computation may be due to the fact
that the computation was carried out for a much shorter duration, only 80° of rotation
versus 1,080° for the Mesh 1 computation. Therefore the current computation
likely represents a more settled torque value. The higher torque for the single-blade
computation may also be due to the fact that the computation was carried out using a
computational domain with significantly nearer lateral boundaries. Figures 13 and
14 show the torque for all three meshes with the DTR and IMTR approaches. As can
be seen from these figures, Mesh 1 (no tower) has a very stable torque, while Mesh 2
and Mesh 3 (with tower) exhibit a significant but expected drop in torque each time
a blade passes the tower. Figure 15 shows, for each of the three meshes, the torque
obtained with the DTR and IMTR approaches. The figure illustrates that the DTR and
IMTR approaches result in a nearly identical torque magnitude for all three meshes.
Figure 16 shows the torque for Mesh 1 with the DTR approach, using two different
time-step sizes: $2.23 \times 10^{-3}$ s (145 time steps per patch) and $4.49 \times 10^{-3}$ s (72 time
steps per patch). Doubling the time-step size still yields a comparable torque value,
within 10 % of the value for the smaller time-step size. We also carried out a computation
with the convective form of the ST-VMS formulation (see Eq. (8.17) in [44]),
but with a smaller time-step size: $4.46 \times 10^{-4}$ s (725 time steps per patch). Figure 17
shows the torque for Mesh 2 with the DTR approach and the conservative and convective
forms of the ST-VMS formulation. The conservative-form computation uses
the standard time-step size: $2.23 \times 10^{-3}$ s (145 time steps per patch). Figure 18
shows the torque for the individual blades of Mesh 2 with the DTR approach.
The figure clearly shows the expected torque drop for each blade as it passes the
tower, while the other two blades maintain relatively constant torque. Figure 19
shows the torque for 10 equal-length spanwise sections of a blade of Mesh 2 with the
DTR approach. The greatest amount of torque is generated in sections 6–9 of the blade,
while section 10 at the tip and the other lower sections generate less torque. Figure 20
shows a volume rendering of the vorticity for Mesh 2 with the DTR approach. The
flow patterns vary considerably along each blade length, illustrating the necessity to
carry out the computations in 3D.
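The three time-step sizes quoted in this section are each paired with a number of time steps per patch. As a quick consistency check, the sketch below multiplies each pair and confirms that they span essentially the same patch duration. The dictionary keys are labels chosen here for illustration.

```python
# (time-step size in seconds, time steps per patch), as quoted in the text
cases = {
    "standard": (2.23e-3, 145),
    "doubled": (4.49e-3, 72),
    "convective": (4.46e-4, 725),
}

# Physical time covered by one temporal patch in each case.
patch_durations = {name: dt * steps for name, (dt, steps) in cases.items()}
spread = max(patch_durations.values()) - min(patch_durations.values())

for name, duration in patch_durations.items():
    print(f"{name:>10s}: {duration:.5f} s per patch")
print(f"largest difference between cases: {spread:.1e} s")
```

The small remaining difference comes from the rounding of the quoted time-step sizes.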
Figure 21 shows the pressure coefficient at 0.90R for the last 0° orientation of a
blade of Mesh 2, with the DTR and IMTR approaches, with the last 0° orientation
being common between the two computations. There is very little difference in the
pressure coefficient around the blades between the DTR and IMTR approaches.
Figure 22 shows the pressure coefficient at 0.90R for the last 180° orientation of a
blade of Mesh 1, Mesh 2 and Mesh 3, with the DTR approach, with the last 180°
orientation being common between Mesh 2 and Mesh 3 computations. The averaged
torques (in MNm) for the last 360° rotation for Mesh 1, 2 and 3 are 2.31, 2.34 and
2.39 for the DTR approach, and 2.32, 2.34 and 2.35 for the IMTR approach. The
values show that the difference in torque between the DTR and IMTR approaches,
and between Mesh 2 and Mesh 3, is rather small. The difference in torque between
Mesh 1 and Mesh 2 and 3 illustrates the effect of the tower.
Fig. 11 Torque for Mesh 1 with the DTR approach, compared with the NREL data (plot: torque in MNm versus rotation in degrees, 0–360; curves: ST-VMS, NREL)
Fig. 12 Torque for a single blade of Mesh 1 with the DTR approach, compared with the torque
from an earlier single-blade computation [73] using the TGI option of $\nu_C\ (=\nu_{\mathrm{LSIC}})$ (plot: torque in MNm versus rotation in degrees, 0–80; curves: Mesh 1, Single-blade)
Fig. 13 Torque for Mesh 1, Mesh 2 and Mesh 3 with the DTR approach (plot: torque in MNm versus rotation in degrees, 0–360)
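To put a number on "rather small", the averaged torques reported above can be compared directly. The sketch below computes the relative difference between the DTR and IMTR values for each mesh. The variable names are illustrative, and the percentages are derived here from the quoted averages rather than taken from the source.

```python
meshes = ["Mesh 1", "Mesh 2", "Mesh 3"]
dtr_torque = [2.31, 2.34, 2.39]   # averaged torque in MNm, DTR approach
imtr_torque = [2.32, 2.34, 2.35]  # averaged torque in MNm, IMTR approach

# Relative DTR-versus-IMTR difference, in percent of the DTR value.
rel_diff_percent = {
    mesh: 100.0 * abs(d - i) / d
    for mesh, d, i in zip(meshes, dtr_torque, imtr_torque)
}

for mesh, diff in rel_diff_percent.items():
    print(f"{mesh}: DTR vs IMTR differ by {diff:.2f} %")
```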
Fig. 14 Torque for Mesh 1, Mesh 2 and Mesh 3 with the IMTR approach (plot: torque in MNm versus rotation in degrees, 0–360)
4 Micon 65/13M Wind Turbine with a CX-100 Blade
This section is adapted from [87]. We simulate the Micon 65/13M wind turbine
at field test conditions [94]. Micon 65/13M is a three-blade, horizontal-axis, fixed-pitch,
upwind turbine with a total rotor diameter of 19.3 m and a rated power of
100 kW. The hub is located at a height of 23 m. The wind turbine stands on a
tubular steel tower, with a base diameter of 1.9 m. The drive train generator operates
at 1,200 rpm, while the rotor spins at a nominal speed of 55 rpm. The Micon 65/13M
wind turbine was used for the Long-Term Inflow and Structural Testing (LIST)
program [95] initiated by Sandia National Laboratories in 2001 to explore the use
of carbon fiber in wind turbine blades. Three experimental blade prototypes, GX-
100, CX-100 and TX-100, were developed specifically for this project. We use the
CX-100 conventional carbon-spar blade design [94, 96]. The NREL S821, S819 and
S820 airfoils are used to define the blade geometry. The details of the blade geometry
definition are provided in Table 1.
4.1 Eigenfrequency Analysis of the CX-100 Blade
The blade structure is comprised of five primary sections: leading edge, trailing
edge, root, spar cap, and shear web. The sections are shown in Fig. 23. Each section is
further subdivided into zones, each consisting of a multilayer composite layup. There
is a total of 32 zones with constant total thickness and unique laminate stacking. The
effective material properties for each of the zones are computed using the procedures
described in [50, 74]. All 32 zones are identified on the blade surface and are shown
in Fig. 23. For more details of the material composition of the CX-100 blade see [87].
We perform eigenfrequency calculations of the CX-100 blade using three quadratic
NURBS meshes. The coarsest mesh has 1,846 elements, while the finest mesh has
18,611. The mesh statistics are summarized in Table 2. The eigenfrequency results
are compared with the experimental data from [97, 98]. We compute the case with
free boundary conditions and the case when the blade is clamped at the root. In both
cases, the computed natural frequencies are in good agreement with the experimental
data (see Tables 3 and 4). The medium mesh shows a good balance between the
computational cost and accuracy. For this reason, this mesh is chosen for the FSI
computations presented here. The mode shapes computed using the medium mesh
for the clamped case are shown in Fig. 24.
Fig. 15 Torque with the DTR and IMTR approaches for Mesh 1, Mesh 2, and Mesh 3 (three plots: torque in MNm versus rotation in degrees, 0–360; curves: DTR, IMTR)
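The agreement for the clamped case can be made concrete with the values listed in Table 4. The sketch below computes the relative deviation of the medium-mesh (Mesh 2) frequencies from the experimental ones. The percentages are derived here for illustration and are not quoted in the source.

```python
# Natural frequencies in Hz for the clamped case (Table 4).
computed_hz = [4.29, 11.61, 19.08]    # medium mesh (Mesh 2)
experiment_hz = [4.35, 11.51, 20.54]  # experimental data

# Signed relative deviation of the computed value from the experiment.
relative_error_percent = [
    100.0 * (c - e) / e for c, e in zip(computed_hz, experiment_hz)
]

for mode, err in enumerate(relative_error_percent, start=1):
    print(f"Mode {mode}: {err:+.2f} % relative to experiment")
```

On these numbers, the first two modes agree with the experiment to within about 1.5 %, and the third mode shows the largest deviation.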
Fig. 16 Torque for Mesh 1 with the DTR approach, using two different time-step sizes:
$2.23 \times 10^{-3}$ s (145 time steps per patch) and $4.49 \times 10^{-3}$ s (72 time steps per patch) (plot: torque in MNm versus rotation in degrees, 0–360)
Fig. 17 Torque for Mesh 2 with the DTR approach and the conservative and convective forms
of the ST-VMS formulation. The time-step sizes are $4.46 \times 10^{-4}$ s (725 time steps per patch) for the
convective form and $2.23 \times 10^{-3}$ s (145 time steps per patch) for the conservative form. The torques
are from the same period in a rotation cycle, but the conservative-form torque is from the last 360°
of the computation, and the convective-form torque is from a recently-started, ongoing computation
(plot: torque in MNm versus rotation in degrees, 70–120; curves: Conservative ST-VMS, Convective ST-VMS)
Fig. 18 Torque for the individual blades of Mesh 2 with the DTR approach (plot: torque in MNm versus rotation in degrees, 0–360; curves: Blade 1, Blade 2, Blade 3)
Fig. 19 Torque for 10 equal-length spanwise sections of a blade of Mesh 2 with the DTR approach (plot: torque in MNm versus rotation in degrees, 0–360; curves: sections 1–10)
Fig. 20 Volume rendering of the vorticity (in s$^{-1}$) from the last 360° of the computation for Mesh
2 with the DTR approach (color scale: 0 to 9)
Fig. 21 Pressure coefficient at 0.90R for the last 0° orientation of a blade of Mesh 2, with the DTR
(left) and IMTR (right) approaches
Fig. 22 Pressure coefficient at 0.90R for the last 180° orientation of a blade of Mesh 1, Mesh 2,
and Mesh 3, with the DTR approach
4.2 Aerodynamics and FSI Computations
In this section, we present aerodynamic and FSI simulations. For both cases,
a constant inflow wind speed of 10.5 m/s and fixed rotor speed of 55 rpm are pre-
scribed. These correspond to the operating conditions reported for the field tests
in [94]. The air density and viscosity are 1.23 kg/m$^3$ and $1.78 \times 10^{-5}$ kg/(m·s),
respectively. Zero traction boundary conditions are prescribed at the outflow and no-
penetration boundary conditions are prescribed at the top, bottom, and side surfaces
of the outer (stationary) computational domain. No-slip boundary conditions are
prescribed at the rotor, nacelle, and tower, and are imposed weakly.
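The prescribed operating conditions fix a few kinematic quantities that help in reading the results. The sketch below derives the tip speed, tip-speed ratio, rotation period, and blade-passing frequency from the rotor diameter, rotor speed, and wind speed given in the text. The derived values are computed here for illustration and are not quoted in the source.

```python
import math

rotor_diameter_m = 19.3  # total rotor diameter
rotor_speed_rpm = 55.0   # fixed rotor speed
wind_speed_m_s = 10.5    # constant inflow wind speed
num_blades = 3

omega_rad_s = rotor_speed_rpm * 2.0 * math.pi / 60.0
tip_speed_m_s = omega_rad_s * rotor_diameter_m / 2.0
tip_speed_ratio = tip_speed_m_s / wind_speed_m_s
rotation_period_s = 60.0 / rotor_speed_rpm
blade_passing_hz = num_blades * rotor_speed_rpm / 60.0

print(f"tip speed         = {tip_speed_m_s:.2f} m/s")
print(f"tip-speed ratio   = {tip_speed_ratio:.2f}")
print(f"rotation period   = {rotation_period_s:.3f} s")
print(f"blade-passing f   = {blade_passing_hz:.2f} Hz")
```

The blade-passing frequency is the rate at which tower-passage dips would be expected to recur in the aerodynamic torque.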
Table 1 CX-100 blade geometry: RNodes (m), Chord (m), AeroTwst (°), and Airfoil type
RNodes (m)   Chord (m)   AeroTwst (°)   Airfoil
0.200        0.356       29.6           Cylinder
0.600        0.338       24.8           Cylinder
1.000        0.569       20.8           Cylinder
1.400        0.860       17.5           NREL S821
1.800        1.033       14.7           NREL S821
2.200        0.969       12.4           NREL S821
3.200        0.833       8.3            NREL S821
4.200        0.705       5.8            NREL S819
5.200        0.582       4.0            NREL S819
6.200        0.463       2.7            NREL S819
7.200        0.346       1.4            NREL S819
8.200        0.232       0.4            NREL S819
9.000        0.120       0.0            NREL S820

Fig. 23 Left Five primary sections of the CX-100 blade; Right 32 distinct material zones of the
CX-100 blade

Table 2 NURBS blade meshes used in the eigenfrequency analysis
         Control points   Elements
Mesh 1   3,469            1,846
Mesh 2   7,411            4,647
Mesh 3   25,896           18,611

Table 3 Comparison of experimentally measured and computed natural frequencies (in Hz) for
the free case
             Mode 1    Mode 2      Mode 3
Mesh 1       8.28      15.92       19.26
Mesh 2       8.22      15.61       18.21
Mesh 3       8.22      15.6        18.01
Experiment   7.6–8.2   15.7–18.1   20.2–21.3
Mode 1 is the first flapwise mode, Mode 2 is the first edgewise mode, and Mode 3 is the second
flapwise mode

Table 4 Comparison of experimentally measured and computed natural frequencies (in Hz) for
the clamped case
             Mode 1   Mode 2   Mode 3
Mesh 1       4.33     11.82    19.69
Mesh 2       4.29     11.61    19.08
Mesh 3       4.27     11.54    18.98
Experiment   4.35     11.51    20.54
Modes 1–3 are the first three flapwise bending modes

Fig. 24 First (left) and second (right) flapwise bending mode for the clamped case

Figure 25 shows the computational domain and mesh used in this study. The
mesh consists of 5,134,916 linear elements, which are triangular prisms in the rotor
boundary layers and tetrahedra everywhere else in the domain. The mesh is refined in
the rotor and tower regions for better flow resolution near the wind turbine. The size of
the first element in the wall-normal direction is 0.002 m, and 15 layers of prismatic
elements were generated with a growth ratio of 1.2. Figure 25 also shows a 2D blade
cross-section at the 70 % spanwise station to illustrate the boundary-layer mesh used in
the computations. The time-step size is set to $3.0 \times 10^{-5}$ s. In Fig. 26 the time history
of the aerodynamic torque is plotted. As can be seen from the plot, using FSI, we
capture the high-frequency oscillations caused by the bending and torsional motions
of the blades. In the case of the rigid blade the only high-frequency oscillations in
the torque curve are due to the trailing-edge turbulence. For the rigid blade case the
effect of the tower on the aerodynamic torque is more pronounced, while in the case
of FSI it is not as visible due to the relatively high torque oscillations. The dips in
the aerodynamic torque can be seen at 60°, 180°, and 300° azimuthal angle, which
is precisely when one of the three blades is passing the tower. The computed values
of the aerodynamic torque are plotted together with field test results from [94]. The
upper and lower dashed lines indicate the aerodynamic torque bounds, while the
middle dashed line gives its average value. Both the aerodynamic and FSI results
compare very well with the field test data. Figure 27 shows the relative wind speed
at the 70 % spanwise station rotated to the reference configuration to illustrate the
blade deflection and complexity of boundary-layer turbulent flow. Figure 28 shows
the flow field as the blade passes the tower.
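The boundary-layer mesh parameters given for the Micon 65/13M computation (first-element height of 0.002 m, 15 prismatic layers, growth ratio of 1.2) imply a total boundary-layer mesh thickness. The sketch below works it out, assuming that the growth ratio applies to successive layer heights as a geometric progression. That interpretation and the variable names are assumptions, and the resulting thickness is not stated in the source.

```python
first_layer_height_m = 0.002
growth_ratio = 1.2
num_layers = 15

# Height of each prismatic layer, assuming geometric growth from the wall.
layer_heights_m = [first_layer_height_m * growth_ratio**k for k in range(num_layers)]
total_thickness_m = sum(layer_heights_m)

# Closed-form sum of the same geometric series, as a cross-check.
closed_form_m = (
    first_layer_height_m * (growth_ratio**num_layers - 1.0) / (growth_ratio - 1.0)
)

print(f"outermost layer height = {layer_heights_m[-1]:.4f} m")
print(f"total thickness        = {total_thickness_m:.4f} m")
```

Under this assumption, the prismatic layers extend roughly 0.14 m from the blade surface.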
Fig. 25 Left Computational domain and mesh with the refined inner region for better flow resolution
near the rotor; Right 2D blade cross-section at r/R = 70 % and the boundary-layer mesh
Fig. 26 Aerodynamic torque for the FSI and rigid-blade simulations. The experimental range for
the torque and its average are provided for comparison and are plotted using dashed lines
(plot: aerodynamic torque in N·m, 0–12,000, versus time in s, 0–2, with the azimuthal angle on the upper axis; curves: Rigid Blade Simulation, FSI Simulation, Sandia Experiment)
Fig. 27 Relative wind speed at the 70 % spanwise station for the FSI simulation at t = 0.86 s (left)
and t = 1.06 s (right). The blade deflection is clearly visible
Fig. 28 Wind speed contours at 80 % spanwise station as the blade passes the tower
We provided an overview of the aerodynamic and FSI analysis of wind turbines
carried out in recent years with the ALE-VMS and ST-VMS methods. The techniques
complementing these core methods include weak enforcement of the essential boundary
conditions, NURBS-based isogeometric analysis, using NURBS basis functions
in temporal representation of the rotor motion, mesh motion and also in remeshing,
rotation representation with constant angular velocity, Kirchhoff–Love shell modeling
of the rotor-blade structure, and full FSI coupling. Some of these techniques
were included in our overview. The wind-turbine analysis cases presented include the
aerodynamics of wind-turbine rotor and tower and the FSI that accounts for the deformation
of the rotor blades. The specific wind turbines considered were NREL 5MW,
NREL Phase VI and Micon 65/13M, all at full scale. In the case of NREL Phase VI
and Micon 65/13M we also presented a successful comparison with the experimental
data. Overall, this article demonstrates that the ALE-VMS and ST-VMS methods,
together with some new supporting techniques, have brought the aerodynamic and
FSI analysis of wind turbines to a new level, where such analyses can contribute
more to simulation-based design and testing.
Acknowledgments We wish to thank the Texas Advanced Computing Center (TACC) and the
San Diego Supercomputing Center (SDSC) for providing HPC resources that have contributed to
the research results reported in this article. The first author acknowledges the support of the NSF
CAREER Award, the NSF Award CBET-1306869, and the Air Force Office of Scientific Research
Award FA9550-12-1-0005. The ST-VMS part of the work was supported by ARO grants
W911NF-09-1-0346 and W911NF-12-1-0162 (third author) and Rice–Waseda Research Agreement (second
author).
References
1. Jonkman JM, Buhl ML (2005) FAST user's guide, Technical Report NREL/EL-500-38230.
National Renewable Energy Laboratory, Golden, CO
2. Jonkman J, Butterfield S, Musial W, Scott G (2009) Definition of a 5-MW reference wind
turbine for offshore system development, Technical Report NREL/TP-500-38060. National
Renewable Energy Laboratory, Golden, CO
3. Sørensen NN, Michelsen JA, Schreck S (2002) Navier-Stokes predictions of the NREL Phase
VI rotor in the NASA Ames 80 ft × 120 ft wind tunnel. Wind Energy 5:151–169
4. Pape AL, Lecanu J (2004) 3D Navier-Stokes computations of a stall-regulated wind turbine.
Wind Energy 7:309–324
5. Bazilevs Y, Hsu M-C, Akkerman I, Wright S, Takizawa K, Henicke B, Spielman T, Tezduyar
TE (2011) 3D simulation of wind turbine rotors at full scale. Part I: geometry modeling and
aerodynamics. Int J Numer Meth Fluids 65:207–235. doi:10.1002/fld.2400
6. Takizawa K, Henicke B, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Stabilized space-time
computation of wind-turbine rotor aerodynamics. Comput Mech 48:333–344.
doi:10.1007/s00466-011-0589-2
7. Kong C, Bang J, Sugiyama Y (2005) Structural investigation of composite wind turbine blade
considering various load cases and fatigue life. Energy 30:2101–2114
8. Hansen MOL, Sørensen JN, Voutsinas S, Sørensen N, Madsen HA (2006) State of the art in
wind turbine aerodynamics and aeroelasticity. Prog Aerosp Sci 42:285–330
9. Jensen FM, Falzon BG, Ankersen J, Stang H (2006) Structural testing and numerical simulation
of a 34 m composite wind turbine blade. Compos Struct 76:52–61
10. Kiendl J, Bazilevs Y, Hsu M-C, Wüchner R, Bletzinger K-U (2010) The bending strip method
for isogeometric analysis of Kirchhoff-Love shell structures comprised of multiple patches.
11. Bazilevs Y, Hsu M-C, Kiendl J, Benson DJ (2012) A computational procedure for pre-bending
of wind turbine blades. Int J Numer Meth Eng 89:323–336
turbine rotors at full scale. Part II: fluid-structure interaction modeling with composite blades.
Int J Numer Meth Fluids 65:236–253
13. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: CAD, finite elements,
NURBS, exact geometry, and mesh refinement. Comput Methods Appl Mech Eng 194:4135–4195
14. Cottrell JA, Reali A, Bazilevs Y, Hughes TJR (2006) Isogeometric analysis of structural
vibrations. Comput Methods Appl Mech Eng 195:5257–5297
15. Bazilevs Y, da Veiga LB, Cottrell JA, Hughes TJR, Sangalli G (2006) Isogeometric analysis:
approximation, stability and error estimates for h-refined meshes. Math Models Methods Appl
Sci 16:1031–1090
16. Cottrell JA, Hughes TJR, Reali A (2007) Studies of refinement and continuity in isogeometric
structural analysis. Comput Methods Appl Mech Eng 196:4160–4183
17. Cottrell JA, Hughes TJR, Bazilevs Y (2009) Isogeometric analysis: toward integration of CAD
and FEA. Wiley, Chichester
18. Evans JA, Bazilevs Y, Babuška I, Hughes TJR (2009) n-Widths, sup–infs, and optimality ratios
for the k-version of the isogeometric finite element method. Comput Methods Appl Mech Eng
198:1726–1741
19. Dörfel MR, Jüttler B, Simeon B (2010) Adaptive isogeometric analysis by local h-refinement
with T-splines. Comput Methods Appl Mech Eng 199:264–275
20. Bazilevs Y, Calo VM, Cottrell JA, Evans JA, Hughes TJR, Lipton S, Scott MA, Sederberg TW
(2010) Isogeometric analysis using T-splines. Comput Methods Appl Mech Eng 199:229–263
22. Bazilevs Y, Michler C, Calo VM, Hughes TJR (2007) Weak Dirichlet boundary conditions for
wall-bounded turbulent flows. Comput Methods Appl Mech Eng 196:4853–4862
23. Bazilevs Y, Michler C, Calo VM, Hughes TJR (2010) Isogeometric variational multiscale modeling
of wall-bounded turbulent flows with weakly enforced boundary conditions on unstretched
meshes. Comput Methods Appl Mech Eng 199:780–790
24. Akkerman I, Bazilevs Y, Calo VM, Hughes TJR, Hulshoff S (2008) The role of continuity in
residual-based variational multiscale modeling of turbulence. Comput Mech 41:371–378
25. Hsu M-C, Bazilevs Y, Calo VM, Tezduyar TE, Hughes TJR (2010) Improving stability of
stabilized and multiscale formulations in flow simulations at small time steps. Comput Methods
Appl Mech Eng 199:828–840. doi:10.1016/j.cma.2009.06.019
26. Bazilevs Y, Akkerman I (2010) Large eddy simulation of turbulent Taylor-Couette flow using
229:3402–3414
27. Elguedj T, Bazilevs Y, Calo VM, Hughes TJR (2008) B-bar and F-bar projection methods for
nearly incompressible linear and nonlinear elasticity and plasticity using higher-order NURBS
elements. Comput Methods Appl Mech Eng 197:2732–2762
28. Lipton S, Evans JA, Bazilevs Y, Elguedj T, Hughes TJR (2010) Robustness of isogeometric
structural discretizations under severe mesh distortion. Comput Methods Appl Mech Eng
199:357–373
29. Benson DJ, Bazilevs Y, De Luycker E, Hsu M-C, Scott M, Hughes TJR, Belytschko T (2010) A
generalized finite element formulation for arbitrary basis functions: from isogeometric analysis
to XFEM. Int J Numer Meth Eng 83:765–785
30. Benson DJ, Bazilevs Y, Hsu M-C, Hughes TJR (2010) Isogeometric shell analysis: the Reissner-
Mindlin shell. Comput Methods Appl Mech Eng 199:276–289
31. Kiendl J, Bletzinger K-U, Linhard J, Wüchner R (2009) Isogeometric shell analysis with
Kirchhoff-Love elements. Comput Methods Appl Mech Eng 198:3902–3914
32. Zhang Y, Bazilevs Y, Goswami S, Bajaj C, Hughes TJR (2007) Patient-specific vascular NURBS
modeling for isogeometric analysis of blood flow. Comput Methods Appl Mech Eng 196:2943–2959
33. Bazilevs Y, Calo VM, Zhang Y, Hughes TJR (2006) Isogeometric fluid-structure interaction
34. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid-structure interaction:
35. Isaksen JG, Bazilevs Y, Kvamsdal T, Zhang Y, Kaspersen JH, Waterloo K, Romner B,
Ingebrigtsen T (2008) Determination of wall tension in cerebral artery aneurysms by numerical
simulation. Stroke 39:3172–3178
36. Bazilevs Y, Hughes TJR (2008) NURBS-based isogeometric analysis for the computation of
flows about rotating components. Comput Mech 43:143–150
37. Cirak F, Ortiz M, Schröder P (2000) Subdivision surfaces: a new paradigm for thin shell
analysis. Int J Numer Meth Eng 47:2039–2072
38. Cirak F, Ortiz M (2001) Fully C1-conforming subdivision elements for finite deformation thin
shell analysis. Int J Numer Meth Eng 51:813–833
39. Cirak F, Scott MJ, Antonsson EK, Ortiz M, Schröder P (2002) Integrated modeling,
finite-element analysis, and engineering design for thin-shell structures using subdivision. Comput
Aided Des 34:137–148
40. Hughes TJR, Liu WK, Zimmermann TK (1981) Lagrangian-Eulerian finite element formulation
for incompressible viscous flows. Comput Methods Appl Mech Eng 29:329–349
41. Hughes TJR (1995) Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann
formulation, subgrid scale models, bubbles, and the origins of stabilized methods. Comput
Methods Appl Mech Eng 127:387–401
43. Takizawa K, Tezduyar TE (2011) Multiscale space-time fluid-structure interaction techniques.
44. Takizawa K, Tezduyar TE (2012) Space-time fluid-structure interaction methods. Math Models
Methods Appl Sci 22:1230001. doi:10.1142/S0218202512300013
46. Tezduyar TE, Behr M, Liou J (1992) A new strategy for finite element computations involving
moving boundaries and interfaces – the deforming-spatial-domain/space-time procedure: I.
The concept and the preliminary numerical tests. Comput Methods Appl Mech Eng 94:339–351.
doi:10.1016/0045-7825(92)90059-S
47. Tezduyar TE, Behr M, Mittal S, Liou J (1992) A new strategy for finite element computations
involving moving boundaries and interfaces – the deforming-spatial-domain/space-time
procedure: II. Computation of free-surface flows, two-liquid flows, and flows with drifting
cylinders. Comput Methods Appl Mech Eng 94:353–371. doi:10.1016/0045-7825(92)90060-W
meters. Int J Numer Meth Fluids 43:555–575. doi:10.1002/fld.505
49. Tezduyar TE, Sathe S (2007) Modeling of fluid-structure interactions with the space-time finite
elements: solution techniques. Int J Numer Meth Fluids 54:855–900. doi:10.1002/fld.1430
50. Bazilevs Y, Takizawa K, Tezduyar TE (2013) Computational fluid-structure interaction:
methods and applications. Wiley, Chichester, West Sussex, United Kingdom
51. Bazilevs Y, Hughes TJR (2007) Weak imposition of Dirichlet boundary conditions in fluid
mechanics. Comput Fluids 36:12–26
52. Nitsche J (1971) Über ein Variationsprinzip zur Lösung von Dirichlet-Problemen bei Verwendung
von Teilräumen, die keinen Randbedingungen unterworfen sind. Abh Math Univ Hamburg
36:9–15
53. Arnold DN, Brezzi F, Cockburn B, Marini LD (2002) Unified analysis of discontinuous
Galerkin methods for elliptic problems. SIAM J Numer Anal 39:1749–1779
54. Brooks AN, Hughes TJR (1982) Streamline upwind/Petrov-Galerkin formulations for convection
dominated flows with particular emphasis on the incompressible Navier-Stokes equations.
55. Tezduyar TE, Mittal S, Ray SE, Shih R (1992) Incompressible flow computations with stabilized
bilinear and linear equal-order-interpolation velocity-pressure elements. Comput Methods
Appl Mech Eng 95:221–242. doi:10.1016/0045-7825(92)90141-6
56. Mittal S, Tezduyar TE (1992) A finite element study of incompressible flows past oscillating
cylinders and aerofoils. Int J Numer Meth Fluids 15:1073–1118. doi:10.1002/fld.1650150911
57. Mittal S, Tezduyar TE (1995) Parallel finite element simulation of 3D incompressible flows:
fluid-structure interactions. Int J Numer Meth Fluids 21:933–953. doi:10.1002/fld.1650211011
58. Kalro V, Tezduyar TE (2000) A parallel 3D computational method for fluid-structure interactions
in parachute systems. Comput Methods Appl Mech Eng 190:321–332.
doi:10.1016/S0045-7825(00)00204-8
59. Tezduyar TE, Sathe S, Keedy R, Stein K (2006) Space-time finite element techniques for
computation of fluid-structure interactions. Comput Methods Appl Mech Eng 195:2002–2027.
doi:10.1016/j.cma.2004.09.014
60. Takizawa K, Tezduyar TE (2012) Computational methods for parachute fluid-structure interactions.
Arch Comput Meth Eng 19:125–169. doi:10.1007/s11831-012-9070-4
61. Tezduyar TE, Takizawa K, Brummer T, Chen PR (2011) Space-time fluid-structure interaction
modeling of patient-specific cerebral aneurysms. Int J Numer Meth Biomed Eng 27:1665–1710.
doi:10.1002/cnm.1433
62. Takizawa K, Bazilevs Y, Tezduyar TE (2012) Space-time and ALE-VMS techniques for
patient-specific cardiovascular fluid-structure interaction modeling. Arch Comput Meth Eng 19:171–225.
doi:10.1007/s11831-012-9071-3
63. Takizawa K, Schjodt K, Puntel A, Kostov N, Tezduyar TE (2012) Patient-specific computer
modeling of blood flow in cerebral arteries with aneurysm and stent. Comput Mech 50:675–686.
doi:10.1007/s00466-012-0760-4
64. Takizawa K, Fritze M, Montes D, Spielman T, Tezduyar TE (2012) Fluid-structure interaction
modeling of ringsail parachutes with disreefing and modified geometric porosity. Comput Mech
50:835–854. doi:10.1007/s00466-012-0761-3
65. Takizawa K, Montes D, Fritze M, McIntyre S, Boben J, Tezduyar TE (2013) Methods for FSI
modeling of spacecraft parachute dynamics and cover separation. Math Models Methods Appl
Sci 23:307–338. doi:10.1142/S0218202513400058
66. Takizawa K, Tezduyar TE, Boben J, Kostov N, Boswell C, Buscher A (2013) Fluid-structure
interaction modeling of clusters of spacecraft parachutes with modified geometric porosity.
Comput Mech 52:1351–1364. doi:10.1007/s00466-013-0880-5
67. Takizawa K, Schjodt K, Puntel A, Kostov N, Tezduyar TE (2013) Patient-specific computational
analysis of the influence of a stent on the unsteady flow in cerebral aneurysms. Comput Mech
51:1061–1073. doi:10.1007/s00466-012-0790-y
68. Manguoglu M, Takizawa K, Sameh AH, Tezduyar TE (2011) Nested and parallel sparse algorithms
for arterial fluid mechanics computations with boundary layer mesh refinement. Int J
Numer Meth Fluids 65:135–149. doi:10.1002/fld.2415
011-0619-0
70. Tezduyar T, Aliabadi S, Behr M, Johnson A, Kalro V, Litke M (1996) Flow simulation and
high performance computing. Comput Mech 18:397–412. doi:10.1007/BF00350249
71. Behr M, Tezduyar T (1999) The shear-slip mesh update method. Comput Methods Appl Mech
Eng 174:261–274. doi:10.1016/S0045-7825(98)00299-0
72. Behr M, Tezduyar T (2001) Shear-slip mesh update in 3D computation of complex flow problems
with rotating mechanical components. Comput Methods Appl Mech Eng 190:3189–3200.
doi:10.1016/S0045-7825(00)00388-1
73. Takizawa K, Henicke B, Montes D, Tezduyar TE, Hsu M-C, Bazilevs Y (2011)
Numerical-performance studies for the stabilized space-time computation of wind-turbine rotor
aerodynamics. Comput Mech 48:647–657. doi:10.1007/s00466-011-0614-5
computer modeling of wind-turbine rotor aerodynamics and fluid-structure interaction. Math
Models Methods Appl Sci 22:1230002. doi:10.1142/S0218202512300025
75. Takizawa K, Tezduyar TE, McIntyre S, Kostov N, Kolesar R, Habluetzel C (2014) Space-time
VMS computation of wind-turbine rotor and tower aerodynamics. Comput Mech 53:1–15.
doi:10.1007/s00466-013-0888-x
76. Takizawa K, Henicke B, Puntel A, Spielman T, Tezduyar TE (2012) Space-time computational
4005073
77. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2012) Space-time techniques for
computational aerodynamics modeling of flapping wings of an actual locust. Comput Mech
50:743–760. doi:10.1007/s00466-012-0759-x
78. Takizawa K, Kostov N, Puntel A, Henicke B, Tezduyar TE (2012) Space-time computational
analysis of bio-inspired flapping-wing aerodynamics of a micro aerial vehicle. Comput Mech
50:761–778. doi:10.1007/s00466-012-0758-y
79. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2013) Computer modeling techniques
for flapping-wing aerodynamics of a locust. Comput Fluids 85:125–134.
doi:10.1016/j.compfluid.2012.11.008
flows with the finite element methods – space-time formulations, iterative strategies and massively
parallel implementations. In: New methods in transient analysis, PVP-vol 246/AMD-vol
143, ASME, New York, pp 7–24
of 3D flows. Computer 26:27–36. doi:10.1109/2.237441
82. Johnson AA, Tezduyar TE (1994) Mesh update strategies in parallel finite element computa-
tions of flow problems with moving boundaries and interfaces. Comput Meth Appl Mech Eng
119:7394. doi:10.1016/0045-7825(94)00077-8
83. Tezduyar TE (2001) Finite element methods for flow problems with moving boundaries and
interfaces. Arch Comput Meth Eng 8:83130. doi:10.1007/BF02897870
84. Hsu M-C, Bazilevs Y (2012) Fluid-structure interaction modeling of wind turbines: simulating
the full machine. Comput Mech 50:821833
85. Hsu M-C, Akkerman I, Bazilevs Y (2013) Finite element simulation of wind turbine aerody-
namics: validation study using NREL Phase VI experiment. Wind Energy, published online.
doi:10.1002/we.1599
86. Hand MM, Simms DA, Fingersh LJ, Jager DW, Cotrell JR, Schreck S, Larwood SM (2001)
Unsteady aerodynamics experiment Phase VI: wind tunnel test configurations and available
data campaigns, Technical Report NREL/TP-500-29955. National Renewable Energy Labo-
ratory, Golden, CO
87. Korobenko A, Hsu M-C, Akkerman I, Tippmann J, Bazilevs Y (2013) Structural mechanics
modeling and FSI simulation of wind turbines. Math Models and Methods Appl Sci 23:249–272
88. Bazilevs Y, Hsu M-C, Scott MA (2012) Isogeometric fluid-structure interaction analysis with
emphasis on non-matching discretizations, and with application to wind turbines. Comput Meth
89. Tezduyar TE, Sathe S, Stein K (2006) Solution techniques for the fully-discretized equations
in computation of fluid-structure interactions with the space-time formulations. Comput Meth
Appl Mech Eng 195:5743–5753. doi:10.1016/j.cma.2005.08.023
90. Jonkman J, Butterfield S, Musial W, Scott G (2009) Definition of a 5-MW reference wind
turbine for offshore system development, Technical Report NREL/TP-500-38060, National
Renewable Energy Laboratory
91. Spera DA (1994) Introduction to modern wind turbines. In: Spera DA (ed) Wind turbine
technology: fundamental concepts of wind turbine engineering, pp 47–72 (ASME Press)
92. Saad Y, Schultz M (1986) GMRES: a generalized minimal residual algorithm for solving
nonsymmetric linear systems. SIAM J Sci Stat Comput 7:856–869
93. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular
graphs. SIAM J Sci Comput 20:359–392
94. Zayas JR, Johnson WD (2008) 3X-100 blade field test, Report of the Sandia National Labora-
tories. Wind Energy Technology Department, Sandia
95. Sutherland JH, Jones PL, Neal BA (2001) The long-term inflow and structural test program.
In: Proceedings of the 2001 ASME wind energy symposium, p 162
96. Berry D, Ashwill T (2007) Design of 9-meter carbon-fiberglass prototype blades: CX-100 and
TX-100, Report of the Sandia national laboratories, New Mexico, USA
97. White JR, Adams DE, Rumsey MA (2011) Modal analysis of CX-100 rotor blade and Micon
65/13 wind turbine. In: Structural dynamics and renewable energy, vol 1, Conference proceed-
ings of the society for experimental mechanics series 10
98. Marinone T, LeBlanc B, Harvie J, Niezrecki C, Avitabile P (2012) Modal testing of a 9 m
CX-100 turbine blade. In: Topics in experimental dynamics substructuring and wind turbine
dynamics, vol 2, Conference proceedings of the society for experimental mechanics series 27
Part VII
Partitioned Method and Parallelization
Techniques
Scaling Up Multiphysics
Rainald Löhner and Joseph D. Baum
Abstract The present paper summarizes trends in supercomputing and the
consequences they will have on coupled problems in computational mechanics. The
appearance of parallel machines with more than $10^6$ cores implies that the prevalent
"scalar-pre, simple parallel solve, scalar-post" environment will have to give way to
a completely scalable simulation pipeline. The grid size alone will force distributed
parallelization of meshing, domain splitting, load balancing and post-processing.
Possible ways of addressing parallel meshing and dynamic load balancing for mul-
tiphysics are treated, and examples shown.
1 Introduction 1: Computational Sciences
Computational sciences (i.e. computation-based sciences) have become the third
pillar of the empirical sciences (besides the traditional experiments and theory). It has
become inconceivable to carry out experiments or develop a new theory without the
heavy support of calculations during all stages of research. To name a few: Any large-
scale experiment in physics (e.g. particle physics), aeronautics (e.g. wind-tunnel),
automotive (e.g. wind tunnel), naval engineering (e.g. water tunnel), civil engineering
(e.g. ultimate loads), telecom engineering (e.g. efficiency of mobile communications)
is nowadays preceded by a lengthy series of pre-experiment calculations (for example
to make sure that measuring devices are operating in their expected ranges) and
followed by another lengthy series of post-experiment calculations (for example to
R. Löhner (B)
Center for Computational Fluid Dynamics, George Mason University,
M.S. 6A2, Fairfax, VA 22030-4444, USA
J. D. Baum
Advanced Technology Group, SAIC, McLean, VA 22020, USA

understand in depth the phenomena observed). The same trend can be observed in all
the manufacturing industries, and increasingly in the medical and life sciences. Any
car, truck, train, ship, airplane, large building, bridge, skyscraper, computer, medical
device or, for that matter, consumer product of value will see considerable
calculation-based design and optimization during its development. Computational mechanics is
also increasingly being used not only in the design, optimization and verification of
finished products, but also for the optimization of manufacturing processes.
The current trend in each of the fields that comprise the computational sciences
(structural/thermo/fluid dynamics, electromagnetics, material science, chemistry,
etc.) is to increase physical fidelity (either by considering more scales or linking/
coupling different disciplines), improve accuracy and robustness, and push the range
of feasible (credible) problem classes. Furthermore, high-end computing of this kind
is increasingly being used to provide databases for fast-running engineering models.
This new area, also known as real-time computing, either interpolates from these
detailed databases, or extracts fundamental modes of the system to obtain a reduced
order model (ROM).
Increased physical fidelity and accuracy in almost all fields in applied sciences
(i.e. physics, chemistry, biology, medicine, etc.) and engineering (civil, mechanical,
aerospace, naval, electrical, telecom, etc.) naturally imply large computing require-
ments.
2 Introduction 2: The Red Shift
Scientific computing in general and supercomputing in particular have taken advantage
of a remarkable combination of advances over the last three decades: the
number of transistors per area has doubled every 18 months (so-called
Moore's law), and clockrates have increased by two orders of magnitude. For programmers
and users, this perfect combination led to an almost utopian environment: one
could safely assume that without any further effort, the speed of codes would auto-
matically increase due to clockrates and larger register/ cache/ memory, and larger
problems could be run due to larger memory. However, due to physical limitations,
these trends could not continue unabated. Given that heat generation increases with
the third power of the clockrate, a clear limit is in sight here. In fact, anyone who has
inspected a recent motherboard can testify that most of the mechanical engineering
effort is devoted to chip cooling. Over the last five years, clockrates have stalled
at 2.0–3.0 GHz, and all indications are that, if anything, they will decrease in the
future. As far as packing more transistors per area is concerned, indications are that
Moore's law will continue for the foreseeable future (a decade). The only way to higher CPU
performance (loosely speaking: more floating point operations per second [FLOPS])
is then via massive parallelism. This can be achieved (and is pursued) at the level of
the chip (either via many cores or via specialized hardware, e.g. GPUs), via a net-
work of chips, or via a combination of both approaches. In fact, most of the Top 500
supercomputers at present use a combination of this kind to achieve outstanding
performance.
The biggest problem facing supercomputer designers (and scientific program-
mers) is that the speed increases of subcomponents continue to advance at very dif-
ferent rates: peak processor speed advances much faster than memory transfer rates,
which in turn advance much faster than DRAM latencies, which in turn advance
much faster than the interconnect switches between processors. This so-called red
shift has led to a crisis for computer architects: current designs are driven by complex
latency-hiding mechanisms.
Field solvers, which are commonly used for computational fluid and solid mechan-
ics as well as electromagnetics, only perform a limited number of operations for the
data that is brought in and out of memory. The gains obtainable via better compilers,
prefetching, mixing floating point operations and memory fetches, etc. have already
largely been exploited during the last decade. At present, even at the chip or GPU
level, field solvers are already limited by memory transfer rates, as evidenced by
recent studies [30]. This implies that for field solvers, this red shift is particularly
worrisome.
In order to see what impact this red shift has on current large-scale production
machines, the FEM-FCT algorithm [28, 31] as implemented in the FEFLO code was
tried on two representative systems: Jaguar and Diamond. FEM-FCT is used on a
daily basis for many blast-structure interaction simulations [25, 45, 46].
Diamond is an SGI ICE machine composed of 1,920 nodes (or blades). Each node
is composed of 2 Quad-core Xeon processors Nehalem E5440 series, giving a total
of 15,360 cores. The Xeon processors run at 2.8 GHz. This machine reaches a peak
performance of 172 Teraflops. The main memory has a capacity of 45 Terabytes. The
interconnection network is composed of two Infiniband interconnects (10/20 GB/s);
one interconnect is dedicated to I/O and the other to MPI traffic.
Jaguar, at the Oak Ridge Leadership Computing Facility at the Oak Ridge National
Laboratory, is a CRAY machine composed of 18,688 compute nodes in addition to
dedicated login/service nodes. Each compute node contains dual hex-core AMD
Opteron 2435 (Istanbul) processors running at 2.6 GHz, 16 GB of DDR2-800 mem-
ory, and a SeaStar 2+ router with a peak bandwidth of 57.6 GB/s. The resulting
partition contains 224,256 processing cores, 300 TB of memory, and a peak performance
of 2.3 petaflops. The operating system is the Cray Linux Environment.
The case, shown in Fig. 1, was a blast in a room. This testcase has been used
for many hardware benchmarks over the years. The particular mesh used had about
nelem=40 Mels.
Figures 2 and 3 show a compilation of timings obtained for different numbers
of domains and cores per domain. Note that once a core deals with more
than nelem=300,000, CPU performance asymptotes out to the values expected
for the chips used in these machines ($O(10^{-6})$ [sec/elem/step] or $O(0.55 \cdot 10^{-5})$
[sec/pt/step]). However, for a very small number of elements per domain, there
appears a constant delay per timestep of $T_{MPI} = O(0.1)$ [sec] (!). Given that we
have $O(10^2)$ MPI messages/exchanges per timestep, it appears that every time a
message is started via an MPI call, a latency of $T_l = O(10^{-3})$ [sec] is incurred.
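The constant-delay-plus-linear-cost behaviour described above can be made concrete with a small sketch. It assumes that the expressions listed with Fig. 2 are fitted lines of the form T = T_MPI + c * nelem (time per timestep in seconds, nelem being the number of elements per core); the function names are ours and purely illustrative.

```python
# Coefficients as listed with Fig. 2: (constant delay [sec], cost per element [sec/elem]).
# Reading them as fitted lines is an assumption of this sketch.
FITS = {
    "Jaguar": (0.25680, 2.833e-6),
    "Diamond_0": (0.09376, 1.163e-6),
    "Diamond_1": (0.07575, 0.957e-6),
}

def time_per_step(machine, nelem_per_core):
    """Linear model T = T_MPI + c * nelem for one timestep."""
    t_mpi, c = FITS[machine]
    return t_mpi + c * nelem_per_core

def compute_fraction(machine, nelem_per_core):
    """Fraction of the timestep spent on element work rather than the constant delay."""
    t_mpi, c = FITS[machine]
    work = c * nelem_per_core
    return work / (t_mpi + work)

def break_even_nelem(machine):
    """Elements per core at which element work equals the constant delay."""
    t_mpi, c = FITS[machine]
    return t_mpi / c

if __name__ == "__main__":
    for name in FITS:
        print(name, round(break_even_nelem(name)), round(compute_fraction(name, 300_000), 2))
```

Under this model, a core holding fewer elements than the break-even value spends more than half of each timestep in the constant delay, which is the effect visible at the left end of Figs. 2 and 3.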
Fig. 1 Blast in room: Pressures and velocities
Fig. 2 Blast in room: Timings (absolute). Time per timestep [sec] versus number of elements per core. Legend: FEM-FCT Jaguar 0.25680+2.833e-6*nelem; FEM-FCT Diamond_0 0.09376+1.163e-6*nelem; FEM-FCT Diamond_1 0.07575+0.957e-6*nelem
Fig. 3 Blast in room: Timings (relative). Time per element per timestep [msec] versus number of elements per core. Legend: FEM-FCT Jaguar 0.25680/nelem+2.833; FEM-FCT Diamond_0 0.09376/nelem+1.163; FEM-FCT Diamond_1 0.07575/nelem+0.9575
After an extensive analysis [29] it was found that the communication times between
processors were the reason for this poor performance. The sobering conclusion of
this analysis is that, at present, the limiting speed of CFD solvers is given by the MPI
communication network. And compared to raw processor speed and RAM trans-
fer rates, the speed of the MPI communication network is advancing at a slower
pace. If these trends continue, the useful mesh size per processor will increase,
not decrease (!). While these conclusions have been stated before, we predict that as
more users get access to hundreds of thousands of cores and the domain size per core
starts shrinking, they will also encounter the minimum timestep barrier reported
here.
Let us ponder the consequences of this barrier for the particular case of computational
fluid dynamics (similar consequences can be drawn for other fields). The current
trend here is to migrate from the quasi-steady Reynolds-Averaged Navier-Stokes
(RANS) description to the Large-Eddy Simulation (LES) of flows. This implies an
increase in mesh size of 3–6 orders of magnitude, and an increase in the number of
timesteps to achieve convergence/statistical data of 3–4 orders of magnitude. The
increase in mesh size may be easily absorbed by machines with millions of cores.
But the requirement of $O(10^7\text{–}10^8)$ timesteps, coupled with a minimum timestep
barrier of (optimistically) $T_{MPI} = O(0.01)$ [sec], means that no matter how large the
machine, an LES run will take $T_{LES} = O(10^5\text{–}10^6)$ [sec], i.e. up to roughly two weeks
under the best of circumstances. Such long running times will clearly hinder the useful range
of applicability.
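The run-time estimate above is a simple product of the number of timesteps and the minimum time per timestep. A minimal sketch of the arithmetic follows; the function name is ours and only illustrative.

```python
SECONDS_PER_DAY = 86_400.0

def min_wallclock_days(n_steps, t_barrier_sec):
    """Lower bound on run time when every timestep costs at least t_barrier_sec."""
    return n_steps * t_barrier_sec / SECONDS_PER_DAY

# Values quoted in the text: 1e7 to 1e8 timesteps and a barrier of 0.01 sec per step.
low_days = min_wallclock_days(1e7, 0.01)
high_days = min_wallclock_days(1e8, 0.01)

if __name__ == "__main__":
    print(f"{low_days:.1f} to {high_days:.1f} days")
```

Since the bound does not depend on the number of cores, adding hardware does not shorten it.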
3 The Simulation Pipeline
Any scientific calculation based on the solution of partial differential equations fol-
lows the so-called simulation pipeline: domain definition, imposition of boundary
conditions, mesh generation, solver, possible mesh adaptation, post-processing of
results. To date, only the solvers have been migrated to systems with thousands (or,
in some cases, millions) of cores. The current workflow considers a scalar, large
memory machine with powerful graphics for the domain definition (CAD) and the
imposition of boundary conditions. The input data is then used to generate a mesh
on a large, shared-memory multicore machine (ncore < 64). The splitting of the
domain into pieces is also accomplished on such machines. The domain files are
then transferred to the massively parallel machine (ncore>10,000) and the solver
is run. Finally, the results are assembled and post-processed on large-memory, scalar
machines that are of the same class as those used for pre-processing. This "scalar-pre,
simple parallel solve, scalar-post" environment will no longer be possible once
machines with more than $10^6$ cores become widespread. The grid size alone will
force parallelization of meshing. The same will happen to post-processing, as well
as domain splitting and load balancing.
In the sequel, we show ways of obtaining parallel grid generation and dynamic
load balancing for multiphysics.
4 Parallel Grid Generation
As stated before, many solvers have been ported to distributed parallel machines
while grid generators have, in general, lagged behind. One can cite several reasons
for this:
(a) For many applications the CPU requirements of grid generation are orders of
magnitude less than those of field solvers, i.e. it does not matter if the user has
to wait several hours for a grid;
(b) (Scalar) grid generators have achieved a high degree of maturity, generality and
widespread use, leading to the usual inertia of workflow (modus operandi) and
aversion to change;
(c) In recent years, low-cost machines with few cores but very large memories have
enabled the generation of large grids with existing (scalar) software; and
(d) In many cases it is possible to generate a mesh that is twice ($2^d$ times) as coarse
as the one desired for the simulation. This coarse mesh is then h-refined globally.
Parallel unstructured grid generation has been pursued since the early 1990s
[1, 3–8, 10, 13, 14, 16, 18, 24, 27, 36, 37, 42, 43, 47, 53].
The two most common ways of generating unstructured grids are the Advancing
Front Technique (AFT) [12, 17, 22–24, 32, 38–41] and the Generalized Delaunay
Triangulation (GDT) [2–4, 7, 15, 34, 47, 50, 51]. The AFT introduces one element
at a time, while the GDT introduces a new point at a time. Thus, both of these
techniques are, in principle, scalar by nature, with a large variation in the number
of operations required to introduce a new element or point. While coding and data
structures may influence the scalar speed of the core AFT or GDT, one often finds
that for large-scale applications, the evaluation of the desired element size and shape
in space, given by background grids, sources or other means [31] consumes the
largest fraction of the total grid generation time. Furthermore, the time required for
mesh improvements (and any unstructured grid generator needs them) is in many
cases higher than that of the core AFT or GDT modules. Typical speeds for the complete
generation of a mesh (surface, mesh, improvement) on current Intel Xeon chips with
3.2 GHz and sufficient memory are of the order of 0.5–2.0 Mels/min. Therefore,
at the lower end of this range it would take approximately 2,000 minutes (i.e. 1.5 days)
to generate a mesh of $10^9$ elements, a common size in computational fluid dynamics
and computational electromagnetics. Assuming perfect parallelization, this task could be
performed in the order of a minute on 2,000 processors, clearly showing the need for
parallel mesh generation.
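The estimate above follows directly from the quoted generation speeds. The sketch below reproduces it under the stated assumption of perfect parallel scaling; the function name and its parameters are illustrative only.

```python
def generation_minutes(n_elements, speed_mels_per_min, n_procs=1):
    """Wall-clock minutes to generate n_elements, assuming perfect parallel scaling."""
    return n_elements / (speed_mels_per_min * 1.0e6) / n_procs

# 1e9 elements at the lower quoted speed of 0.5 Mels/min.
serial_min = generation_minutes(1e9, 0.5)                   # scalar generator
parallel_min = generation_minutes(1e9, 0.5, n_procs=2000)   # idealized parallel run

if __name__ == "__main__":
    print(serial_min, serial_min / (60 * 24), parallel_min)
```

At the upper quoted speed of 2.0 Mels/min the scalar time drops to about 500 minutes, which is still far too long for routine use.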
The easiest form of achieving volume-based parallelism is by using a grid to
define the regions to be meshed by each processor. Optimally, this domain-defining
grid (DDG) should have the same surface triangulation as the desired, fine mesh, but
could be significantly coarser in the interior. In this way, the definition of the domain
to be gridded is unique, something that is notoriously difficult to achieve by other
means (such as background grids, bins or octrees). This domain-defining grid is then
partitioned according to the estimated number of elements to be generated, allowing
for a balanced distribution of work among the processors. The domain defining grid
is also used to redistribute the elements and points after grid generation, and during
the subsequent mesh improvement. A parallel grid generator based on these ideas
has been developed over the last 3 years [26]. Figure 4a considers a typical example,
taken from a blast simulation carried out for an office complex. Figure 4b–d show
the trace of the domain defining grid partition on the surface as well as the fronts
after the parallel grid generation passes using 64 domains (mpi processors) for a finer
mesh. Table 1 gives a compilation of timings for different mesh sizes, domains and
processors on different machines.
One may note that:
(a) Generating the 121 M mesh on one 8-core shared memory node (i.e. nproc = 1,
nprol = 8) is slower than the distributed memory equivalent (i.e. nproc = 8,
nprol = 1);
(b) The number of elements per core should exceed a minimum value (typically
of the order of 2–4 Mels) in order to reach a generation speed per core that is
acceptable;
(c) The local OMP scaling improves as the number of elements in each domain is
increased;
(d) It only takes on the order of five minutes to generate a mesh of 121 Mels on
256 cores (nproc = 32, nprol = 8), and on the order of 40 minutes to generate
a mesh of 1,010 Mels on 512 cores (nproc = 64, nprol = 8).
These timings, and those of other large cases that required parallel mesh generation,
show that the proposed approach is scalable and able to produce large grids of high
quality in a modest amount of clocktime. With parallel grid generation entering
production, a major impediment to a completely scalable simulation pipeline (grid
generation, solvers, post-processing) has been removed, opening the way for truly
large-scale computations using unstructured, body-fitted grids.
5 Load Balancing for Multiphysics
For field solvers, which are commonly used for computational fluid and solid mechan-
ics as well as electromagnetics, the classic way to distribute work among many
distributed memory processors is via domain decomposition. Given that the work
requirements are proportional to the number of elements/points in a domain, the
aim is to achieve subdomains of equal size while minimizing communication. As
the communication between processors is proportional to the surface points of each
subdomain, the aim is to minimize surface-to-volume ratios, which is achieved by
keeping the domains as contiguous (non-split) and spherical as possible. Tech-
niques commonly used for domain splitting include the advancing front methods,
coordinate- and moment- recursive bisection, and space-filling curve subdivisions
[11, 19, 20, 33, 35, 44, 48, 52].
Fig. 4 Garage: a outline of geometry, b–d internal surface of DDG partition and remaining front
after each pass
Table 1 Garage
Machine   nproc  nprol  ncore  nelem    CPU [sec]  AbsSpeed [els/sec]  RelSpeed [els/sec/core]
Xeon(1)       1      8      8   120 M       2,293              52,333                    6,542
SGI ITL       8      1      8   121 M       1,605              75,389                    9,423
SGI ITL       8      8     64   121 M         516             234,496                    3,664
Cry AMD       8      1      8   121 M       2,512              48,169                    6,021
Cry AMD      16      1     16   121 M       1,954              61,924                    3,870
Cry AMD      32      1     32   121 M       1,118             100,082                    3,128
SGI ITL      16      1     16   121 M       1,048             115,458                    7,216
SGI ITL      16      2     32   121 M         667             181,409                    5,669
SGI ITL      16      4     64   121 M         407             297,297                    4,645
SGI ITL      16      8    128   121 M         329             367,781                    2,873
SGI ITL      32      1     32   121 M         646             187,306                    5,853
SGI ITL      32      2     64   121 M         427             283,372                    4,427
SGI ITL      32      4    128   121 M         346             349,710                    2,732
SGI ITL      32      8    256   121 M         316             383,030                    1,496
Cry AMD      64      1     64   972 M       6,048             160,714                    2,511
SGI ITL      64      8    512  1010 M       2,504             403,354                      788
With the possibility of computing larger problems, the desire to compute physics
of ever increasing complexity also emerged. Some current flow applications include
a traditional (i.e. unreacting) flow solver, chemical reactions, moving embedded or
immersed bodies, particles, etc. A timestep for such an application may proceed as
follows:
- Identify the (new) position of embedded/immersed bodies, obtaining the new
  boundary conditions/geometric parameters required;
- Advance the chemical reactions one timestep, obtaining the source-terms for the
  flow solver;
- Update the particles one timestep, obtaining the source-terms for the flow solver;
- Advance the flowfield one timestep.
The order of these operations is not mandatory and will vary among field solvers.
What is important, though, is that at the end of each of these steps a synchronization
among processors is required: the calculation can not proceed until all processors have
completed each step in turn. Therefore, for optimal performance the load should be
balanced in each step.
This inherent requirement of all multiphysics solvers implies that, compared to
simple field solvers (e.g. just flow), the potential for load imbalances and suboptimal
performance increases substantially.
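The effect of the synchronization after each stage can be illustrated with a small sketch. It assumes that the cost of a timestep is the sum, over the stages, of the time taken by the slowest processor in that stage; the function names and the numbers are ours and purely illustrative.

```python
def step_time(work):
    """work[i][k] is the time of stage i on processor k.
    With a synchronization after every stage, each stage costs as much as
    its slowest processor."""
    return sum(max(stage) for stage in work)

def ideal_step_time(work):
    """Step time if every stage were perfectly balanced across the processors."""
    return sum(sum(stage) / len(stage) for stage in work)

# Two processors, two stages (e.g. flow and particles). Both processors carry the
# same total work (5 units), yet neither stage is balanced on its own.
work = [[5.0, 3.0],   # flow solver
        [0.0, 2.0]]   # particle update

if __name__ == "__main__":
    print(step_time(work), ideal_step_time(work))
```

In this example an equal total load per processor still costs 7 units per step instead of the ideal 5, which is why each stage has to be balanced individually.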
In the sequel, we describe one possible way to achieve near-optimal load balance
for multiphysics solvers. The key idea is to subdivide the global domain into more
subdomains than processors. These oversampled or oversplit subdomains are then
grouped together in an optimal way so as to achieve the best load balance possible.
5.1 Splitting Algorithm
Assume a multiphysics application with $n_s$ sequential (physics) stages $s_i$, $i = 1, n_s$,
whose work (CPU) requirements $w_i(e)$ per element can be quantified. The splitting
algorithm for $N_p$ processors (mpi domains) then proceeds as follows:
S1 Obtain the CPU requirement for each stage in each element: $w_i(e)$
S2 Obtain the sum (or max) of these CPU requirements for each element:
   $w_t(e) = \sum_{i=1}^{n_s} w_i(e)$; $w_t(e) = \max(w_i(e))$, $i = 1, n_s$
S3 Using $w_t(e)$ and any of the usual domain splitting techniques, subdivide the
   domain into $m N_p$ sub-subdomains $\Omega_j$, $j = 1, m N_p$;
S4 Sum up the work for each of the stages $s_i$ for each sub-subdomain $\Omega_j$:
   $W_i^j = \sum_{e \in \Omega_j} w_i(e)$
S5 Obtain the (desired) average work for each stage $s_i$ for a subdivision into the
   desired $N_p$ processors/subdomains:
   $A_i = (1/N_p) \sum_e w_i(e)$
S6 Agglomerate the $m N_p$ sub-subdomains into $N_p$ subdomains, attempting to
   exceed the desired averages $A_i$ as little as possible.
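The bookkeeping of steps S2, S4 and S5 can be sketched in a few lines. The data layout is an assumption of this sketch: `w[i][e]` holds the work of stage i in element e, and `piece_of[e]` is the index of the sub-subdomain that the splitter of step S3 assigned to element e. All names are illustrative.

```python
def total_work(w, use_max=False):
    """S2: combine the per-stage work of each element by sum (default) or max."""
    combine = max if use_max else sum
    n_elem = len(w[0])
    return [combine(w_i[e] for w_i in w) for e in range(n_elem)]

def stage_work_per_piece(w, piece_of, n_pieces):
    """S4: W[i][j] is the sum of w[i][e] over the elements e of sub-subdomain j."""
    W = [[0.0] * n_pieces for _ in w]
    for i, w_i in enumerate(w):
        for e, j in enumerate(piece_of):
            W[i][j] += w_i[e]
    return W

def stage_averages(w, n_procs):
    """S5: A[i] is the total work of stage i divided by the number of processors."""
    return [sum(w_i) / n_procs for w_i in w]

# Four elements, two stages; the second stage is active only in elements 2 and 3.
w = [[1.0, 1.0, 1.0, 1.0],
     [0.0, 0.0, 2.0, 2.0]]
piece_of = [0, 0, 1, 1]

if __name__ == "__main__":
    print(total_work(w), stage_work_per_piece(w, piece_of, 2), stage_averages(w, 2))
```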
While Steps S1–S5 are intuitive, Step S6 requires more careful consideration.
The following multipass strategy seems to give acceptable results:
H1 Initialize a "used domain" marker for the sub-subdomains: $U_j = 0$, $j = 1, m N_p$
H2 Initialize total work counts for each of the desired $N_p$ subdomains:
   $V_i^k = 0$, $i = 1, n_s$, $k = 1, N_p$
H3 Loop over the increases in allowed average work: $A_i$
H4 Put the unassigned sub-subdomains ($U_j = 0$) into a heap-list according to the
   work counts $W_i^j$, $i = 1, n_s$, $j = 1, m N_p$;
H5 Retrieve the next sub-subdomain $j$ from the top of the heap list;
H6 If: $U_j > 0$ (the sub-subdomain $j$ has been used): skip (Goto H5)
H7 Loop over the desired subdomains $k = 1, N_p$
   If: $V_i^k = 0$ (the subdomain has not been initialized):
   - Add: $V_i^k = V_i^k + W_i^j$
   - Mark: $U_j = k$
   - Exit (Goto H5)
   Else:
   - If: $A_i < V_i^k + W_i^j$, $i = 1, n_s$: skip (Goto H5)
   - Add: $V_i^k = V_i^k + W_i^j$
   - Mark: $U_j = k$
   - Exit (Goto H5)
   Endif
H8 If: items remain in the heap list: (Goto H5)
H9 If there are unassigned sub-subdomains ($U_j = 0$):
   - Increase the allowed average work: $A_i$
   - Allocate the remaining sub-subdomains (Goto H3)
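A minimal sketch of the multipass agglomeration is given below. It is one possible reading of steps H1–H9 and not the implementation used in the production code. The assumptions are ours: the heap is ordered by the total work of a sub-subdomain (heaviest first), the allowed averages are increased by a constant factor `growth` after each pass, a subdomain counts as uninitialized until it has received its first sub-subdomain, and a sub-subdomain that does not fit into subdomain k is offered to the next subdomain.

```python
import heapq

def agglomerate(W, n_procs, growth=1.1):
    """W[i][j]: work of stage i in sub-subdomain j (non-negative).
    Returns (owner, V): owner[j] is the subdomain of piece j, V[i][k] the
    accumulated work of stage i in subdomain k."""
    n_stages, n_pieces = len(W), len(W[0])
    A = [sum(W_i) / n_procs for W_i in W]            # desired averages (S5)
    owner = [-1] * n_pieces                          # H1: -1 means unassigned
    V = [[0.0] * n_procs for _ in range(n_stages)]   # H2
    started = [False] * n_procs
    while any(o < 0 for o in owner):                 # H3/H9: one pass per level of A
        # H4: heapq is a min-heap, so the total work is negated (heaviest on top).
        heap = [(-sum(W[i][j] for i in range(n_stages)), j)
                for j in range(n_pieces) if owner[j] < 0]
        heapq.heapify(heap)
        while heap:                                  # H5/H8
            _, j = heapq.heappop(heap)               # H6 is implicit: only unassigned pieces are in the heap
            for k in range(n_procs):                 # H7
                fits = all(V[i][k] + W[i][j] <= A[i] for i in range(n_stages))
                if fits or not started[k]:
                    for i in range(n_stages):
                        V[i][k] += W[i][j]
                    owner[j], started[k] = k, True
                    break
        A = [a * growth for a in A]                  # H9: relax the allowed work
    return owner, V

# Four sub-subdomains, two stages; the second stage is active only in pieces 2 and 3.
W = [[2.0, 2.0, 2.0, 2.0],
     [0.0, 0.0, 4.0, 4.0]]
owner, V = agglomerate(W, n_procs=2)

if __name__ == "__main__":
    print(owner, V)
```

In this small example each processor receives one of the heavy pieces and one of the light ones, so that both stages are balanced; a contiguous split by total work would instead place both heavy pieces on the same processor.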
While this basic technique will balance the work properly, it does not attempt to
minimize the resulting surface-to-volume ratios. This means that many unconnected
sub-subdomains may end up belonging to a subdomain/processor. One can alleviate
this shortcoming by improving the selection criterion for the allocation of
sub-subdomains to domains in Step H7 above. The key idea is to favour those subdomains
$k$ that satisfy the condition $A_i > V_i^k + W_i^j$, $i = 1, n_s$, and are as close as possible
to (preferably overlap) the sub-subdomain $j$ being retrieved from the heap-list. This
may be implemented as follows:
- H70: Initialize closest/best/optimal subdomain and overlap distance: $k_{opt} = 0$,
  $d_{opt} = 0$
- H71: Initialize uninitialized domain marker: $k_0 = N_p + 1$
- H72: Loop over the desired subdomains $k = 1, N_p$
  If: $V_i^k = 0$ (the subdomain has not been initialized):
  - Set: $k_0 = \min(k_0, k)$
  Else:
  - If: $A_i > V_i^k + W_i^j$, $i = 1, n_s$:
    - Compare overlap zone (update $k_{opt}$, $d_{opt}$)
  - Endif
  Endif
- H73: If: $k_{opt} > 0$ (i.e. a best subdomain has been found):
  - Mark: $U_j = k_{opt}$
  - Exit (Goto H5)
  Endif
- H74: If: $k_0 \le N_p$ (i.e. an uninitialized subdomain has been found):
  - Mark: $U_j = k_0$
  - Exit (Goto H5)
  Endif
Fig. 5 Blast in tube with particles
In order to estimate the closeness or overlap of the different sub-subdomains, a
simple bounding box approach has been used.
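A bounding box test of this kind can be sketched as follows. The text does not specify the distance measure, so the Euclidean gap between axis-aligned boxes used here is an assumption of ours (zero when the boxes touch or overlap); the function names are illustrative.

```python
def bbox(points):
    """Axis-aligned bounding box (lo, hi) of a list of coordinate tuples."""
    lo = tuple(min(c) for c in zip(*points))
    hi = tuple(max(c) for c in zip(*points))
    return lo, hi

def box_gap(a, b):
    """Euclidean gap between two boxes; 0.0 if they touch or overlap."""
    (alo, ahi), (blo, bhi) = a, b
    gaps = [max(bl - ah, al - bh, 0.0)
            for al, ah, bl, bh in zip(alo, ahi, blo, bhi)]
    return sum(g * g for g in gaps) ** 0.5

box_a = bbox([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
box_b = bbox([(0.5, 0.5, 0.5), (2.0, 2.0, 2.0)])   # overlaps box_a
box_c = bbox([(4.0, 5.0, 1.0), (5.0, 6.0, 1.0)])   # separated from box_a

if __name__ == "__main__":
    print(box_gap(box_a, box_b), box_gap(box_a, box_c))
```

Among the subdomains that can still accept the work of a sub-subdomain, one would then prefer the one whose box has the smallest gap to the box of that sub-subdomain.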
5.2 Example: Detonation in Tube with Particles
This example considers a relatively long tube where a detonation occurs. As the
blast wave reaches the walls of the tube, particles are introduced into the flowfield.
The number of elements is of $O(5 \cdot 10^7)$, while the number of particles eventually
reaches $O(2 \cdot 10^6)$. The problem was run with 16 distributed memory (mpi)
processes/domains, and 8 shared memory cores (OpenMP) per domain, i.e. a total of
256 cores. For the present purpose, the problem was run for 1,000 steps, with a
re-split using the method described above every 100 steps. The splitting obtained at the
end of the 1,000 steps may be discerned in Fig. 5a–c. Note that there are many more
domains than mpi-processes/domains, but that, as expected, the basic moment-based
recursive bisection has still produced slices along the tube. The breakdown of times
is approximately as follows: flow solver 60 %, particle update 30 %, repartitioning
and renumbering 10 %. This implies that the parallel repartitioning does not lead to
an excessive increase in CPU requirements while allowing for a much better load
balance.
6 Conclusions and Outlook
The present paper has summarized trends in supercomputing and the consequences
they will have on coupled problems in computational mechanics. In particular, the
red-shift observed in the performance increase of hardware subcomponents implies
that heavy emphasis should be placed on field solvers that minimize the access to
memory per timestep update.
The trend towards parallel machines with more than $10^6$ cores implies that the
prevalent "scalar-pre, simple parallel solve, scalar-post" environment will have to
give way to a completely scalable simulation pipeline. The grid size alone will force
distributed parallelization of meshing, domain splitting, load balancing and
post-processing.
Possible ways of addressing parallel meshing and dynamic load balancing for
multiphysics were treated. The examples shown indicate that these approaches have
the potential to be core building blocks of a completely scalable simulation pipeline.
Much remains to be done in this field. It involves a lot of coding and debugging. It
is perhaps not as refined as the development of high order methods or new turbulence
models. But without it, further advances in coupled, large-scale problems will not
occur.
Acknowledgments Over the course of the last 25 years CIMNE has become an internationally
recognized center where, at any given time, one can find students, post-docs, research scientists,
professors and visitors from all over the world working on many topics. For the last 20 years, my
family and I (RL) have had the immense fortune of being able to visit CIMNE and Barcelona
every summer. I have always reserved the scientific topics that time did not permit me to consider
during the academic year for those wonderful weeks at CIMNE, which I consider among the most
productive and happy of my life. Therefore, I thank all the many members of CIMNE whom I had
the privilege to work with, and in particular Prof. Oñate for his friendship, contagious enthusiasm
and vision.
References
1. Andrae H, Ivanov E, Gluchshenko O, Kudryavtsev A (2008) Automatic parallel generation
of tetrahedral grids by using a domain decomposition approach. J Comp Math Math Phys
48(8):1448–1457
2. Baker TJ (1989) Developments and trends in three-dimensional mesh generation. Appl Num
Math 5:275–304
3. Blelloch GE, Hardwick JC, Miller GL, Talmor D (1999) Design and implementation of a
practical parallel Delaunay algorithm. Algorithmica 24:243–269
4. Chew LP, Chrisochoides N, Sukup F (1997) Parallel constrained Delaunay meshing. In:
Proceedings of 1997 workshop on trends in unstructured mesh generation, Northwestern
University, Evanston, June 1997
5. Chrisochoides N (2005) Parallel mesh generation. In: Bruaset AM, Tveito A (eds) Numerical
solution of partial differential equations on parallel computers. Springer, Berlin, pp 237–259
6. Chrisochoides N, Nave D (1999) Simultaneous mesh generation and partitioning for Delaunay
meshes. In: Proceedings of 8th international meshing roundtable, South Lake Tahoe, pp 55–66
7. Chrisochoides N, Nave D (2003) Parallel Delaunay mesh generation kernel. Int J Num Meth
Eng 58:161–176
8. de Cougny HL, Shephard MS, Ozturan C (1994) Parallel three-dimensional mesh generation.
Comput Syst Eng 5:311–323
9. de Cougny H, Shephard M (1999) Parallel volume meshing using face removals and hierarchical
repartitioning. Comp Meth Appl Mech Eng 174(3–4):275–298
10. de Cougny HL, Shephard MS, Ozturan C (1995) Parallel three-dimensional mesh generation
on distributed memory MIMD computers. Tech Rep SCOREC Rep # 7, Rensselaer Polytechnic
Institute
11. Flower J, Otto S, Salama M (1990) Optimal mapping of irregular finite element domains to
parallel processors. 239–250
12. Frykestig J (1994) Advancing front mesh generation techniques with application to the finite
element method. Pub. 94:10, Chalmers University of Technology, Göteborg
13. Galtier J, George PL (1997) Prepartitioning as a way to mesh subdomains in parallel. In: Special
symposium on trends in unstructured mesh generation (ASME/ASCE/SES), pp 107–122
14. George PL (1999) Tet meshing: construction, optimization and adaptation. In: Proceedings of
8th international meshing roundtable, South Lake Tahoe, October 1999
15. George PL, Hecht F, Saltel E (1991) Automatic mesh generator with specified boundary. Comp
Meth Appl Mech Eng 92:269–288
16. Ivanov EG, Andrae H, Kudryavtsev AN (2006) Domain decomposition approach for automatic
parallel generation of tetrahedral grids. Int Math J Comp Meth App Math 6(2):178–193
17. Jin H, Tanner RI (1993) Generation of unstructured tetrahedral meshes by the advancing front
technique. Int J Num Meth Eng 36:1805–1823
18. Kadow C, Walkington N (2003) Design of a projection-based parallel Delaunay mesh genera-
tion and refinement algorithm. In: Proceedings of fourth symposium on trends in unstructured
mesh generation, Albuquerque, 2003
19. Karypis G, Kumar V (1999) Parallel multilevel k-way partitioning scheme for irregular graphs.
SIAM Rev 41(2):278300
20. Karypis G, Kumar V (1998) A parallel algorithm for multilevel graph partitioning and sparse
matrix ordering. J Parallel Distrib Comput 48:7185
21. Ko S-H, Kim N, Kim J, Thota A, Jha S (2010) Efficient runtime environment for coupled multi-
physics simulations: dynamic resource allocation and load-balancing. In: Procedings of 10th
IEEE/ACM international conference on cluster, cloud and grid computing (CCGrid), 1720
May (2010)
22. Lhner R (1988) Some useful data structures for the generation of unstructured grids. Comm
Appl Num Meth 4:123135
23. Lhner R (1996) Extensions and improvements of the advancing front grid generation tech-
nique. Comm Num Meth Eng 12:683702
24. Lhner R (2001) A parallel advancing front grid generation scheme. Int J Num Meth Eng
51:663678
25. Lhner R (2008) Applied CFD techniques, 2nd edn. Wiley, Chichester
26. Lhner R (2013) A 2nd generation parallel advancing front grid generator. AIAA-2013-0147
27. Lhner R, Camberos J, Merriam M (1992) Parallel unstructured grid generation. Comp Meth
28. Lhner R, Baum JD (2012) 40 years of FCT: status and directions. In: Kuzmin D, Lhner R,
Turek S (eds) Flux-corrected transport, 2nd edn. Springer, New York, pp 119143
29. Lhner R, Baum JD (2014) On maximum achievable speeds for field solvers. Int J Num Meth
Heat Fluid Flow (to appear)
30. Lhner R, Corrigan A, Wichmann K-R, Wall W (2013) On the achievable speeds of finite
difference solvers on CPUs and GPUs. AIAA-2013-2852
31. Lhner R, Luo H, Baum JD, Rice D (2008) Improvements in speed for explicit, transient
compressible flow solvers. Int J Num Meth Fluids 56(12):22292244
32. Lhner R, Parikh P (1988) Three-dimensional grid generation by the advancing front method.
Int J Num Meth Fluids 8:11351149
33. Lhner R, Ramamurti R, Martin D (1993) A parallelizable load balancing algorithm. AIAA-
93-0061
34. Marcum DL, Weatherill NP (1995) Unstructured grid generation using iterative point insertion
and local reconnection. AIAA J 33(9):16191625
35. Mehrota P, Saltz J, Voigt R (eds) (1992) Unstructured scientific computation on scalable mul-
tiprocessors. MIT Press, Cambridge
36. Okusanya T, Peraire J (1996) Parallel unstructured mesh generation. In: Proceedings of 5th
internatinal conference number grid generation in CFD and related fields, Mississippi, April
1996
37. Okusanya T, Peraire J (1997) 3-D parallel unstructured mesh generation. In: Proceedings of
joint ASME/ASCE/SES summer meeting 1997
38. Peraire J, Morgan K, Peiro J (1990) Unstructured finite element mesh generation and adaptive
procedures for CFD. AGARD-CP-464, p 18
39. Peraire J, Morgan K, Peiro J (1992) Adaptive remeshing in 3-D. J Comp Phys 103(2):269285
40. Peraire J, Peiro J, Formaggia L, Morgan K, Zienkiewicz OC (1988) Finite element Euler
calculations in three dimensions. Int J Num Meth Eng 26:21352159
41. Peraire J, Vahdati M, Morgan K, Zienkiewicz OC (1987) Adaptive remeshing for compressible
flow computations. J Comp Phys 72:449466
42. Said R, Weatherill NP, Morgan K, Verhoeven NA (1999) Distributed parallel delaunay mesh
generation. Comp Meth Appl Mech Eng 177:109125
43. Shostko A, Lhner R (1995) Three-dimensional parallel unstructured grid generation. Int J
Num Meth Eng 38:905925
44. Simon H (1991) Partitioning of unstructured problems for parallel processing. NASA Ames
Tech Rep RNR-91-008
45. Stck A, Camelli F, Lhner R (2010) Adjoint-based design of shock mitigation devices. Int J
Num Meth Fluids 64:443472
46. Togashi F, Baum JD, Mestreau E, Lhner R, Sunshine D (2010) Numerical simulation of
long-duration blast wave evolution in confined facilities. Shock Waves 20(5):409424
47. Tremel U, Sorensen KA, Hitzel S, Rieger H, Hassan O, Weatherill NP (2006) Parallel remeshing
of unstructured volume grids for CFD applications. Int J Num Meth Fluids 53(8):13611379
48. Vidwans A, Kallinderis Y, Venkatakrishnan V (1993) A parallel load balancing algorithm for
3-D adaptive unstructured grids. AIAA-93-3313-CP
49. Walshaw C, Cross M (2000) Parallel optimisation algorithms for multi-level mesh partitioning.
Parallel Comput 26:16351660
50. Weatherill NP, Hassan O (1994) Efficient three-dimensional Delaunay triangulation with auto-
matic point creation and imposed boundary constraints. Int J Num Meth Eng 37:20052039
51. Weatherill NP (1992) Delaunay triangulation in computational fluid dynamics. Comp Math
Appl 24(5/6):129150
52. Williams D (1990) Performance of dynamic load balancing algorithms for unstructured grid
calculations. CalTech Rep C3P913
53. Yoshimura S, Nitta H, Yagawa G, Akiba H (1998) Parallel automatic mesh generation method
of ten-million nodes problem using fuzzy knowledge processing and computational geometry.
In: Proceedings of 4th world cong comp mech Buenos Aires, Argentina, July 1998
Partitioned Solution of Coupled Stochastic
Problems
Mohammad Hadigol, Alireza Doostan, Hermann G. Matthies and Rainer Niekamp
Abstract This work is concerned with the propagation of uncertainty across coupled
problems with high-dimensional random inputs. A stochastic model reduction
approach based on low-rank separated representations is proposed for the partitioned
treatment of the uncertainty space. The construction of the coupled solution
is achieved through a sequence of approximations with respect to the dimensionality
of the random inputs associated with each individual subproblem and not the combined
dimensionality, hence drastically reducing the overall computational cost. The
coupling between the sub-domain solutions is done via the classical Finite Element
Tearing and Interconnecting (FETI) method, thus providing a well-suited framework
for parallel computing. A high-dimensional stochastic problem, a coupled 2D elliptic
PDE with random diffusion coefficient, has been considered in this paper to study
the performance and accuracy of the proposed stochastic coupling approach.
M. Hadigol, Alireza Doostan
University of Colorado, CO 80309, Boulder, USA
H. G. Matthies (B), R. Niekamp
TU Braunschweig, 38092 Braunschweig, Germany

1 Introduction
For coupled problems a partitioned solution procedure, which allows the re-use of
the subproblem solvers and the accompanying software, has a long history and well-developed
procedures, see e.g. [18] and the references therein. Here we combine this
idea with that of uncertainty propagation, which is steadily gaining in importance
in order to obtain realistic predictions of the effects of such uncertainties and quantify
their impact on Quantities of Interest (QoI). Uncertainty quantification (UQ),
an emerging field in computational engineering and science, is concerned with the
development of rigorous and efficient solutions to this exercise. It is quite common
to model such uncertainties probabilistically, where uncertain parameters are represented
by random variables (RVs), fields, or processes. Several techniques have been
developed to study the propagation of such uncertainties in engineering systems, e.g.
see [1, 2, 14, 16, 17, 20, 25, 26] and the references therein. Among these tech-
niques, stochastic spectral methods based on polynomial chaos expansions (PCE)
have received special attention due to their advantages over traditional UQ tech-
niques such as perturbation-based and Monte Carlo sampling (MCS) methods, as
these schemes may converge much faster than MCS methods [14]. These spectral or
functional approximation methods are based on expanding the solution of interest
as functions of known RVs. The coefficients of these expansions are then computed,
for instance, via Galerkin projection, referred to as stochastic Galerkin (SG), or
pseudo-spectral collocation or stochastic collocation (SC), see e.g. [21, 27].
Already in the treatment of a single system, the high-dimensional problems which
result from PC-based techniques pose computational challenges. The problem is
compounded further when dealing with coupled stochastic systems. This may be a
common situation when one is dealing with UQ of engineering problems involving,
for instance, coupled domains or scales with independent sources of uncertainty. In
the past few years, several alternative methods have been proposed to address
the cost associated with the high dimensionality of standard PC methods, e.g. [3, 5, 9,
15, 21, 27, 28]. Here we want to extend these methods to the analysis of coupled
systems, but through a partitioned approach which is intended to mitigate the cost
of coupling, following [12].
In the cases of coupled UQ problems, the solution depends on all random inputs;
hence has even more coupling, and a direct application of PC expansion techniques
looks at first glance daunting. While integration of PC expansions with standard
domain decomposition (DD) techniques may partially reduce the overall computa-
tional complexity by partitioning the physical space, e.g. see [10, 24], expansions (or
sampling) with respect to the combined set of random inputs are still required. Instead,
we propose an approach that additionally enables a partitioned treatment of the sto-
chastic space; that is, the solution is computed through a sequence of approximations
with respect to the random inputs associated with each individual sub-domain, and
not the combined set of random inputs. To this end, we propose a stochastic expan-
sion based on the so-called low-rank or separated representations and demonstrate
how it can be obtained.
Model reduction techniques based on separated or low-rank representations of
high-dimensional stochastic functions, one form of which is also known as canonical
decomposition, have been recently proposed, see e.g. [4, 6, 13, 22], where in some
instances the low-rank approach may generate additional coupling in an otherwise
uncoupled (in the stochastic variables) collocation approach. Here we adopt a special
form of separated representations for the stochastic computation of coupled domain problems.
The plan for this chapter is as follows: In Sect. 2 we recall basic partitioned
algorithms exemplified on an abstract coupled system, which may depend on
parameters. The case when these parameters are RVs is addressed in Sect. 3.
In the following Sect. 4 we bring these two strands of computational techniques
together to develop a partitioned solution algorithm to compute a separated low-rank
approximation. A computational example is shown in Sect. 5, and concluding
remarks are offered in Sect. 6.
2 Coupled Problems
A coupled stochastic system is a special case of coupled parametric systems, where
the parameters are random variables. For completeness, it is worthwhile to recall how
a partitioned solution strategy works for a coupled system when these parameters
have a fixed value, which we concentrate on first.
2.1 One Single System
To introduce the problem and notation, where we follow [11], first look at a single
system which will be denoted abstractly as

$$\frac{\partial}{\partial t} u(t) + A(q; u(t)) = f(q; t), \tag{1}$$

where $u(t) \in U$ describes the state of the system at time $t \in [0, T]$ lying in a Hilbert
space $U$ (for the sake of simplicity), $A$ is a (possibly non-linear) operator modelling
the physics of the system, and $f \in U^*$ is some external influence (action / excitation
/ loading). The model depends on some parameters $q \in Q$ which are uncertain. For
our purposes here it will be sufficient to look at the stationary case when
$\partial u / \partial t = 0$ and $\partial f / \partial t = 0$:

$$A(q; u) = f(q). \tag{2}$$

Often an equation such as Eq. (2) is the stationarity condition of some functional or
potential $\Phi$ on $U$, i.e. it is equivalent to

$$\delta_u \Phi(q; u) = A(q; u) - f(q) = 0. \tag{3}$$

Later we will assume such a situation for our numerical example.

To have a concrete example of Eq. (1), consider the diffusion equation

$$\frac{\partial}{\partial t} u(x, t) - \operatorname{div}\bigl(\kappa(x) \nabla u(x, t)\bigr) = f(x, t), \quad x \in G, \tag{4}$$

with appropriate boundary and initial conditions, where $G \subset \mathbb{R}^n$ is a suitable domain.
The diffusing quantity is $u(x, t)$ (heat, concentration) and the term $f(x, t)$ models
sinks and sources. The parameter $q$ in question could for example be the diffusion
tensor $\kappa$, or the initial conditions $u(x, 0)$. The stationary case of Eq. (4) is well known
to be the stationarity condition (vanishing gradient) of

$$\Phi(u) = \frac{1}{2} \int_G \nabla u(x) \cdot \kappa(x) \cdot \nabla u(x) \, dx - \int_G u(x) f(x) \, dx, \tag{5}$$

where we have assumed homogeneous Dirichlet boundary conditions in Eq. (5) for
simplicity. Later in the numerical example two such systems will be coupled at the
common part of the boundary.
Focusing on the stationary case Eq. (2), assume that we are also given an iterative
solver, convergent for all fixed values of $q$, which generates successive iterates for
$k = 0, 1, \ldots$ converging to the solution $u^*(q)$:

$$u^{(k+1)}(q) = S\bigl(q; u^{(k)}(q), R(q; u^{(k)}(q))\bigr), \quad \text{with } u^{(k)}(q) \to u^*(q), \tag{6}$$

where $S$ is one cycle of the solver which may also depend on the iteration counter $k$,
$u^{(0)}(q)$ is some starting vector, and $R(q; u^{(k)}(q)) := f(q) - A(q; u^{(k)}(q))$ is the residuum
of Eq. (2). Obviously, when the residuum vanishes, $R(q; u^*(q)) = 0$, the mapping
$S$ has a fixed point $u^*(q) = S(q; u^*(q), 0)$.
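As a minimal sketch of an iteration of the form Eq. (6), the following Python snippet uses a linear operator $A(q; u) = K(q) u$ and a Jacobi-preconditioned correction as the solver cycle $S$. The matrix `stiffness`, the function names and the choice of preconditioner are illustrative assumptions, not part of the original text.

```python
import numpy as np

def stiffness(q, n):
    """Hypothetical parametric operator K(q): a 1D diffusion stencil scaled by q > 0."""
    return q * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))

def solver_cycle(K, u, residual):
    """One cycle S(q; u, R): a Jacobi-preconditioned correction (an assumed choice)."""
    return u + residual / np.diag(K)

def solve_fixed_point(q, f, tol=1e-10, max_iter=10_000):
    """Iterate u^(k+1) = S(q; u^(k), R(q; u^(k))) until the residuum is small."""
    K = stiffness(q, len(f))
    u = np.zeros_like(f)                      # starting vector u^(0)
    for k in range(max_iter):
        residual = f - K @ u                  # R(q; u) = f(q) - A(q; u)
        if np.linalg.norm(residual) <= tol * np.linalg.norm(f):
            return u, k
        u = solver_cycle(K, u, residual)
    raise RuntimeError("fixed-point iteration did not converge")

f = np.ones(5)
u_star, iterations = solve_fixed_point(q=2.0, f=f)
```

At the returned `u_star` the residuum is (numerically) zero, so applying `solver_cycle` once more leaves it unchanged, which is the fixed-point property stated above.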
2.2 Partitioned Solution for Coupled Systems
We now turn to coupled systems, where we follow [18], and for the sake of simplicity
we look at only two systems, again in abstract form, which are coupled:

$$A_I(q_I; u_I, u_{II}, \lambda) = f_I(q_I), \qquad A_{II}(q_{II}; u_{II}, u_I, \lambda) = f_{II}(q_{II}), \tag{7}$$

where $u_I, u_{II}$ and $q_I, q_{II}$ are the states and parameters of system $I$ or $II$. Later again
we will assume that the two equations in Eq. (7) are the partial derivatives of some
functionals $\Phi_I(q_I; u_I, u_{II})$ and $\Phi_{II}(q_{II}; u_{II}, u_I)$. The Lagrange multiplier of the
coupling is $\lambda$, and the coupling condition is

$$C(q_I, q_{II}; u_I, u_{II}) = 0, \quad \text{or rather} \quad \forall \mu : \ \langle C(q_I, q_{II}; u_I, u_{II}), \mu \rangle = 0. \tag{8}$$

Often the system $I$ in Eq. (7) does not depend on $u_{II}$, and vice versa, which may
make the coupling a bit looser. It makes no difference for what is to follow.
A partitioned solution procedure assumes that we have solvers separately for
the two equations in Eq. (7); more precisely that in the first equation, for $u_{II}$
and $\lambda, q_I, q_{II}$ fixed, that equation can be solved by a given procedure (iterating
a map like in Eq. (6) to convergence), $u_I^{k+1} = S_I(q_I; f_I, u_I^k, u_{II}, \lambda)$, and vice
versa for the second equation. Additionally we assume that for fixed $q_I, q_{II}$
and $u_I, u_{II}$, the coupling Eq. (8) determines $\lambda$, again produced by an iterator
$\lambda^{k+1} = \Lambda(q_I, q_{II}; u_I, u_{II}, \lambda^k)$. Out of these building blocks partitioned solution
procedures can be constructed, see [18] and the references therein. The simplest of
these, and hence sufficient to explain the principal idea, is a block-Jacobi iteration,
shown in Algorithm 1.

Algorithm 1 Nonlinear block-Jacobi iteration
  for k = 0, 1, . . . to convergence do
    $u_I^{k+1} := S_I(q_I; f_I, u_I^k, u_{II}^k, \lambda^k)$;
    $u_{II}^{k+1} := S_{II}(q_{II}; f_{II}, u_I^k, u_{II}^k, \lambda^k)$;
    $\lambda^{k+1} := \Lambda(q_I, q_{II}; u_I^k, u_{II}^k, \lambda^k)$;
  end for

The three statements in the loop in Algorithm 1 may
be executed concurrently or in parallel. But often the algorithm is executed serially.
Then, because it involves no further cost per step and is frequently much faster,
one usually switches to the block-Gauss-Seidel variant where results are used as
soon as they become available, e.g. $u_I^{k+1}$ from the first statement will be used in the
second instead of $u_I^k$.
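The difference between the two variants can be sketched on a small linear model problem. The snippet below is a simplified illustration, not the setting of Eq. (7): it couples two blocks directly through a hypothetical off-diagonal matrix `B` (no Lagrange multiplier), and all matrices and names are assumptions chosen only so that both iterations converge.

```python
import numpy as np

def partitioned_solve(K1, K2, B, f1, f2, gauss_seidel, tol=1e-10, max_iter=500):
    """Block-Jacobi or block-Gauss-Seidel for K1 u1 + B u2 = f1, B^T u1 + K2 u2 = f2."""
    u1, u2 = np.zeros_like(f1), np.zeros_like(f2)
    for k in range(1, max_iter + 1):
        u1_new = np.linalg.solve(K1, f1 - B @ u2)          # subsolver for system I
        u1_used = u1_new if gauss_seidel else u1           # reuse the fresh result or not
        u2_new = np.linalg.solve(K2, f2 - B.T @ u1_used)   # subsolver for system II
        change = np.linalg.norm(u1_new - u1) + np.linalg.norm(u2_new - u2)
        u1, u2 = u1_new, u2_new
        if change < tol:
            return u1, u2, k
    raise RuntimeError("partitioned iteration did not converge")

K1 = np.array([[4.0, 1.0], [1.0, 3.0]])
K2 = np.array([[5.0, 2.0], [2.0, 4.0]])
B = np.array([[0.5, 0.0], [0.5, 0.5]])
f1 = np.array([1.0, 2.0])
f2 = np.array([3.0, 1.0])

u1_j, u2_j, it_jacobi = partitioned_solve(K1, K2, B, f1, f2, gauss_seidel=False)
u1_gs, u2_gs, it_gs = partitioned_solve(K1, K2, B, f1, f2, gauss_seidel=True)
```

Both variants call the same two subsolvers once per step; the Gauss-Seidel variant only changes which iterate is passed to the second one, which is why it involves no further cost per step.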
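We look at the simplest instance, namely two linear symmetric systems defined
by stiffness matrices on a finite dimensional space, i.e. a version of Eq. (7) after a
possible discretisation:

$$\begin{bmatrix} K_I(q_I) & 0 & C_I^T \\ 0 & K_{II}(q_{II}) & -C_{II}^T \\ C_I & -C_{II} & 0 \end{bmatrix} \begin{bmatrix} u_I \\ u_{II} \\ \lambda \end{bmatrix} = \begin{bmatrix} f_I(q_I) \\ f_{II}(q_{II}) \\ 0 \end{bmatrix}. \tag{9}$$

Due to the symmetry, the equations Eq. (9) are the stationarity conditions for the
deterministic Lagrangian $\mathcal{L}$ in Eq. (10), namely

$$\delta_{u_I} \mathcal{L} = 0, \quad \delta_{u_{II}} \mathcal{L} = 0, \quad \delta_\lambda \mathcal{L} = 0,$$

a saddle-point for fixed $(q_I, q_{II})$:

$$\begin{aligned} \mathcal{L}(q_I, q_{II}; u_I, u_{II}, \lambda) &= \Phi_I(q_I; u_I) + \Phi_{II}(q_{II}; u_{II}) + \lambda^T (C_I u_I - C_{II} u_{II}) \\ &:= \tfrac{1}{2} u_I^T K_I(q_I) u_I - u_I^T f_I(q_I) + \tfrac{1}{2} u_{II}^T K_{II}(q_{II}) u_{II} - u_{II}^T f_{II}(q_{II}) + \lambda^T (C_I u_I - C_{II} u_{II}). \end{aligned} \tag{10}$$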
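The link between the Lagrangian Eq. (10) and the block system Eq. (9) can be checked by taking the three variations explicitly. This short derivation assumes the coupling term $\lambda^T (C_I u_I - C_{II} u_{II})$, i.e. a minus sign between the two interface contributions:

$$\begin{aligned} \delta_{u_I} \mathcal{L} &= K_I(q_I) u_I - f_I(q_I) + C_I^T \lambda = 0, \\ \delta_{u_{II}} \mathcal{L} &= K_{II}(q_{II}) u_{II} - f_{II}(q_{II}) - C_{II}^T \lambda = 0, \\ \delta_{\lambda} \mathcal{L} &= C_I u_I - C_{II} u_{II} = 0. \end{aligned}$$

Collecting the unknowns $(u_I, u_{II}, \lambda)$ gives the three block rows of Eq. (9), with the coupling matrices appearing symmetrically in the last row and last column.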
A block-Gauss-Seidel type algorithm for solving Eq. (9) is the FETI method, which is
used here; detailed descriptions may be found in e.g. [7, 8, 23]. For nonlinear systems
it may be worthwhile to use more advanced methods than block-Gauss-Seidel, e.g.
quasi-Newton methods; see [18] for an analysis of such cases, in particular for
fluid-structure interaction. This finishes our brief review of the partitioned solution of
coupled systems.
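To illustrate how a saddle-point system with the structure of Eq. (9) can be solved using only subdomain solves plus a small interface problem, the sketch below eliminates $u_I$ and $u_{II}$ and solves for the multiplier. It is a simplified dual elimination in the spirit of FETI, not the FETI algorithm of [7, 8, 23]: it assumes both subdomain matrices are non-singular (no floating subdomains), uses dense direct solves instead of a preconditioned conjugate gradient iteration, and the two small bar-like matrices are made up for the example.

```python
import numpy as np

def dual_interface_solve(K1, K2, C1, C2, f1, f2):
    """Solve [K1 0 C1^T; 0 K2 -C2^T; C1 -C2 0] [u1; u2; lam] = [f1; f2; 0]."""
    K1inv_f1 = np.linalg.solve(K1, f1)
    K2inv_f2 = np.linalg.solve(K2, f2)
    K1inv_C1T = np.linalg.solve(K1, C1.T)
    K2inv_C2T = np.linalg.solve(K2, C2.T)
    # Interface (dual) problem: F lam = d, obtained by inserting u1, u2 into the last row.
    F = C1 @ K1inv_C1T + C2 @ K2inv_C2T
    d = C1 @ K1inv_f1 - C2 @ K2inv_f2
    lam = np.linalg.solve(F, d)
    # Back-substitution: independent subdomain updates.
    u1 = K1inv_f1 - K1inv_C1T @ lam
    u2 = K2inv_f2 + K2inv_C2T @ lam
    return u1, u2, lam

# Two hypothetical 1D bars, each clamped at its outer end, glued at one interface node.
K1 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
K2 = np.array([[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
C1 = np.array([[0.0, 0.0, 1.0]])   # picks the last dof of subdomain I
C2 = np.array([[1.0, 0.0, 0.0]])   # picks the first dof of subdomain II
f1 = np.ones(3)
f2 = np.ones(3)

u1, u2, lam = dual_interface_solve(K1, K2, C1, C2, f1, f2)
```

The two subdomain solves are independent of each other, which is the property that makes this kind of formulation attractive for parallel computing.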
3 Stochastic Problems
Consider the stationary version of Eq. (1) shown in Eq. (2), where one now is interested
in capturing the dependence on $q$. To make it clear that we regard both Eqs. (1)
and (2) as equations in $U^*$, we also denote the equivalent weak form in Eq. (11):

$$\forall v \in U : \quad \langle A(q; u), v \rangle = \langle f(q), v \rangle. \tag{11}$$

In the remainder of this section, we follow the presentation in [19].
3.1 Parametric Dependence
We assume that $q$ may be uncertain, and is modelled as a $Q$-valued random variable
(RV), hence the state $u(q, f)$, which is a function of $q$ and $f$, is a $U$-valued RV.
Uncertainty quantification (UQ) is the propagation of uncertainty in $q$ and $f$ to a
corresponding one in $u$. We therefore take $(\Omega, \mathfrak{A}, P)$, a probability space with
expectation operator $E(\cdot)$, such that $\Omega$ is the set of all possible realisations,
$\mathfrak{A}$ is a $\sigma$-algebra of measurable events (subsets of $\Omega$), and $P$ is a probability
measure defined on $\mathfrak{A}$.
Assume for simplicity that $q(\omega)$ has finite variance, i.e. $\|q(\omega)\|_Q \in S := L_2(\Omega)$,
with inner product of two random variables $r_1, r_2 \in S$ defined as
usual as $\langle r_1, r_2 \rangle_S := E(r_1 r_2) = \int_\Omega r_1(\omega) r_2(\omega) \, P(d\omega)$, so that

$$q \in L_2(\Omega, Q) \cong Q \otimes L_2(\Omega) = Q \otimes S =: \mathcal{Q},$$

where we regard the tensor product, here and later, as completion in the norm
induced by the inner product on $\mathcal{Q}$ given by
$\langle q_1 \otimes r_1, q_2 \otimes r_2 \rangle_{\mathcal{Q}} := \langle q_1, q_2 \rangle_Q \, \langle r_1, r_2 \rangle_S$
and extended by linearity. The system model is now

$$A(q(\omega); u(\omega)) = f(q(\omega)) \quad \text{a.s. in } \omega \in \Omega, \tag{12}$$

and the state $u = u(\omega)$ becomes a $U$-valued random variable, an element of
the tensor space $\mathcal{U} := U \otimes S$. We may write this also as a variational statement, cf.
Eq. (11), which facilitates both theory and numerical approximation, e.g. [1, 2, 16, 17]:

$$\forall v \in \mathcal{U} : \quad \langle\!\langle A(q; u), v \rangle\!\rangle = \langle\!\langle f, v \rangle\!\rangle \quad (= E(\langle f, v \rangle_U)), \tag{13}$$

so that under certain assumptions one obtains a theory for well-posed stochastic
partial differential equations (SPDEs). As the input data $q$, right hand side $f$, and
solution $u$ are elements of tensor product spaces, this points to the later use of
low-rank tensor approximations for efficient approximation algorithms, which will be
crucial for the separated representation. In case that Eq. (2) is a gradient as in Eq. (3),
this will carry over to Eq. (13), which is equivalent to

$$\forall v \in \mathcal{U} : \quad \langle\!\langle \delta_u \boldsymbol{\Phi}(q, u), v \rangle\!\rangle = \langle\!\langle A(q; u), v \rangle\!\rangle - \langle\!\langle f, v \rangle\!\rangle = 0, \tag{14}$$

where the functional / potential on $\mathcal{U} = U \otimes S$ is

$$\boldsymbol{\Phi}(q, u) := E(\Phi(q; u)). \tag{15}$$
3.2 Stochastic Approximations
Assume that the operator equation $A(q; u) = f(q)$ has already been discretised
by your favourite method, e.g. FEM or FVM or similar. This is essentially
the choice of a finite-dimensional subspace $U_N = \operatorname{span}\{v_n\}_{n=1}^N \subset U$, with
$u \approx \sum_n u^n(q) \, v_n \in U_N$.
More importantly, a discretisation of $q$ and $u(q)$ is needed. Stochastic processes
and random parameter fields usually need infinitely many RVs $\{\xi_1(\omega), \xi_2(\omega), \ldots\}$,
so that $q = q(\xi_1(\omega), \xi_2(\omega), \ldots) \in Q$, where the $\xi_m \in S$ are known RVs. A similar
representation may be obtained for $f(q)$ [19]. Hence the solution is also a function
of the $\xi_m$: $u(\xi_1, \xi_2, \ldots) = \sum_n u^n(\xi_1, \xi_2, \ldots) \, v_n$. We discretise further by truncation
to a finite number of RVs: $\xi(\omega) = [\xi_1(\omega), \ldots, \xi_M(\omega)] \in \mathbb{R}^M$, so that
$q = q(\xi) = q(\xi(\omega))$ and

$$u = u(\xi) = u(\xi(\omega)) \approx \sum_n u^n(\xi) \, v_n. \tag{16}$$
Based on this, for actual numerical computations, a representation of the stochastic
aspect has to be chosen. Some frequently explored possibilities are computing the
resulting distributions through Fokker-Planck type equations, establishing relations
for the moments of the results, representing the solution through samples (the well
known (quasi) Monte Carlo methods), and functional or spectral approximation
where the solution is represented as a function of known RVs. Here we want to
employ low-rank methods or separated representations, and these fall in the last
(two) group(s). We choose functions (i.e. known RVs) with
$\operatorname{span}\{X_\beta\}_{\beta=1}^B = S_B \subset S$, and make the ansatz in Eq. (16):

$$u^n(\xi) \approx \sum_\beta u^n_\beta \, X_\beta(\xi), \tag{17}$$

hence giving the state $u$ as a high-dimensional function

$$u(\xi) = \sum_{n,\beta} u^n_\beta \, X_\beta(\xi) \, v_n = \sum_{n,\beta} u^n_\beta \, v_n \otimes X_\beta \in U_N \otimes S_B \subset U \otimes S. \tag{18}$$

If we take this ansatz Eq. (18) and insert it into Eq. (13), the residuum $R(q(\xi); u(\xi))$
will usually not vanish for all $\xi$, as the finite set of functions $\{X_\beta\}$ can not match
all possible parametric variations of $u(\xi)$. To determine the coefficients $u^n_\beta$, one may
choose another set of functions (known RVs) $\psi_\alpha(\xi) \in S$ for projection, so that the
weighted residual vanishes:

$$\forall \alpha, k : \quad \langle\!\langle \psi_\alpha v_k, R(q(\xi); u(\xi)) \rangle\!\rangle = \Bigl\langle\!\Bigl\langle \psi_\alpha v_k, \ f(\xi) - A\Bigl(q(\xi); \sum_{n,\beta} u^n_\beta \, X_\beta(\xi) \, v_n\Bigr) \Bigr\rangle\!\Bigr\rangle = 0, \tag{19}$$

yielding a generally coupled system of equations of size $N \times B$ for the $\mathbf{u} := \{u^n_\beta\}$,
the tensor coefficients representing the solution $u(\xi)$.
This general Galerkin method (also called the method of weighted residuals)
usually comes in the flavours of a Bubnov-Galerkin method where $\psi_\alpha = X_\alpha$ and
the system of equations is coupled for all $u^n_\beta$, or as a Petrov-Galerkin method with
$\psi_\alpha \neq X_\alpha$. In the latter case, a frequent choice is $\psi_\alpha(\xi) = \delta(\xi - \xi_\alpha)$, i.e.
collocation / interpolation at the points $\xi_\alpha$, where $\delta(\xi - \xi_\alpha)$ is the Dirac-$\delta$ at $\xi_\alpha$.
By additionally ensuring the Kronecker-$\delta$ property $X_\beta(\xi_\alpha) = \delta_{\alpha,\beta}$, one obtains the
quasi-deterministic uncoupled collocation conditions

$$\forall \alpha, k : \quad \Bigl\langle v_k, \ f(\xi_\alpha) - A\Bigl(q(\xi_\alpha); \sum_n u^n_\alpha \, v_n\Bigr) \Bigr\rangle = 0, \tag{20}$$

which can be solved directly for the $u^n_\alpha$ for each $\alpha$ independently with the solver
Eq. (6), i.e. $B$ uncoupled systems of $N$ equations each, which are just samples at $\xi_\alpha$.
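The uncoupled structure of the collocation conditions Eq. (20) can be sketched as a plain loop over collocation points, each requiring one deterministic solve. The snippet below is an illustration under stated assumptions: a single uniform RV on $[-1, 1]$, Gauss-Legendre nodes as collocation points, a direct solve standing in for the solver Eq. (6), and a made-up operator $K(\xi) = (1 + 0.5\,\xi) K_0$.

```python
import numpy as np

K0 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
f = np.ones(3)

def solve_sample(xi):
    """Deterministic solve at one collocation point; K(xi) = (1 + 0.5 xi) K0 is assumed."""
    return np.linalg.solve((1.0 + 0.5 * xi) * K0, f)

B = 10
nodes, weights = np.polynomial.legendre.leggauss(B)   # Gauss-Legendre rule on [-1, 1]
weights = weights / 2.0                               # uniform density 1/2 on [-1, 1]

# B independent solves, one per collocation point; rows are the coefficient vectors.
samples = np.array([solve_sample(xi) for xi in nodes])

# With interpolatory basis functions the mean is a weighted sum of the samples.
mean_u = weights @ samples
```

For this particular operator the mean can be checked in closed form, since $E\bigl(1/(1 + 0.5\,\xi)\bigr) = \ln 3$ for $\xi$ uniform on $[-1, 1]$, so `mean_u` should agree with $\ln 3 \cdot K_0^{-1} f$ up to quadrature error.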
In the Bubnov-Galerkin case, if additionally the equation is a gradient as in
Eq. (14), which we will assume from now on, Eq. (19) is equivalent to minimising
the potential $\boldsymbol{\Phi}$ in Eq. (15) over the subspace $U_N \otimes S_B \subset U \otimes S$. In Sect. 3.3
we will look at low-rank tensor representations; for a rank-$R$ tensor these may be seen
as multi-linear maps $F_R \in L^R(U \times S, U \otimes S)$, i.e. $F_R : U^R \times S^R \to \mathcal{U} = U \otimes S$;
they give a formal way to describe such representations. Then we will replace the
functional $\boldsymbol{\Phi}$ by the composition $\boldsymbol{\Phi} \circ F_r$ or similar for some $r$, and thereby pose the
minimisation problem on $U^r \times S^r$.
3.3 Greedy Methods for Low-Rank Approximations
In the case of minimising the potential $\boldsymbol{\Phi}$ in Eq. (15) over $U_N \otimes S_B$, we want to
represent $u(\xi)$ from Eq. (18) with a small or low tensor rank $R$ as

$$u(\xi) \approx \sum_{r=1}^{R} w^r \otimes \phi^r = \sum_{r=1}^{R} \phi^r(\xi) \, w^r. \tag{21}$$

One may observe that the tensor $(u^n_\beta)$ has $N \times B$ terms, whereas a canonical
decomposition such as Eq. (21) has $R \times (N + B)$. If $R \ll \min(N, B)$, which we hope
for and what we mean by a low-rank separated representation, then
$R \times (N + B) \ll N \times B$. This not only saves memory, but most importantly also computation
when we can operate on the terms in Eq. (21) directly.
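As an illustrative count (the sizes are hypothetical and not taken from the example in this chapter), take $N = 10^5$ spatial unknowns, $B = 10^3$ stochastic basis functions and a rank of $R = 20$:

$$N \times B = 10^8, \qquad R \times (N + B) = 20 \times 101\,000 = 2.02 \times 10^6, \qquad \frac{N \times B}{R \times (N + B)} \approx 49.5 .$$

Under these assumed sizes the separated form stores roughly fifty times fewer coefficients than the full tensor, and the gap widens as $N$ and $B$ grow with $R$ fixed.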
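Assume that we have already found $R - 1$ terms $w^r \otimes \phi^r$ in the approximation Eq. (21),
then for the next step define $u^R(\xi) = \sum_{r=1}^{R-1} \phi^r(\xi) \, w^r$, and the incremental potential

$$\boldsymbol{\Phi}_R(w^R, \phi^R) := \boldsymbol{\Phi}(u^R + w^R \otimes \phi^R). \tag{22}$$

The new terms $w^R, \phi^R$ to be found are determined by minimising the incremental
potential $\boldsymbol{\Phi}_R$ w.r.t. $w^R$ and $\phi^R$; the stationarity condition is for all $\beta, k$:

$$\langle \delta_{w^R} \boldsymbol{\Phi}_R(w^R, \phi^R), v_k \rangle_U = \langle\!\langle \delta_u \boldsymbol{\Phi}(q(\xi); u^R(\xi) + \phi^R(\xi) w^R), \ v_k \, \phi^R(\xi) \rangle\!\rangle = 0, \tag{23}$$

$$\langle \delta_{\phi^R} \boldsymbol{\Phi}_R(w^R, \phi^R), X_\beta \rangle_S = \langle\!\langle \delta_u \boldsymbol{\Phi}(q(\xi); u^R(\xi) + \phi^R(\xi) w^R), \ w^R \, X_\beta(\xi) \rangle\!\rangle = 0. \tag{24}$$

In contrast to Eq. (19), the system in Eq. (23) is of size $N$ determining $w^R$ and
the system of Eq. (24) is of size $B$ determining $\phi^R$. In more detail, Eq. (23) to be
solved for the $N$ unknowns $(w^R_n)_{n=1}^N$ in $w^R = \sum_{n=1}^N w^R_n v_n$ reads:

$$\begin{aligned} \forall k : \quad & \langle\!\langle A(q(\xi); u^R(\xi) + \phi^R(\xi) w^R) - f(\xi), \ v_k \, \phi^R \rangle\!\rangle \\ &= \Bigl\langle E\Bigl( \Bigl[ A\Bigl(q(\xi); u^R(\xi) + \phi^R(\xi) \sum_n w^R_n v_n\Bigr) - f(\xi) \Bigr] \phi^R(\xi) \Bigr), \ v_k \Bigr\rangle_U = 0; \end{aligned} \tag{25}$$

one may observe that the operator is not a sample, but some average weighted with
$\phi^R$. Similarly, the Eq. (24) to determine the $B$ unknowns $(\phi^R_\beta)_{\beta=1}^B$ in
$\phi^R(\xi) = \sum_\beta \phi^R_\beta X_\beta(\xi)$ is in more detail

$$\begin{aligned} \forall \beta : \quad & \langle\!\langle A(q(\xi); u^R(\xi) + \phi^R(\xi) w^R) - f(\xi), \ w^R X_\beta \rangle\!\rangle \\ &= E\Bigl( \Bigl\langle A\Bigl(q(\xi); u^R(\xi) + \Bigl(\sum_\alpha \phi^R_\alpha X_\alpha(\xi)\Bigr) w^R\Bigr) - f(\xi), \ w^R \Bigr\rangle_U X_\beta(\xi) \Bigr) = 0. \end{aligned} \tag{26}$$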
The basic greedy rank-one updating algorithm is simply formulated in Algorithm 2. It
may also be called the basic ingredient in separated representation, proper generalised
decomposition (PGD), or successive rank-one updating (SR1U), a form of alternating
least squares (ALS).

Algorithm 2 Basic greedy rank-one updating
  $u^1 := 0$;
  for R := 1, . . . to sufficient accuracy do
    $\phi^R_0 := 1$;
    for k := 1, . . . to convergence do
      Solve Eq. (25) for $w^R_k$ resp. $(w^R_n)_k$; minimising $\boldsymbol{\Phi}_R$ w.r.t. $w^R_k$.
      Solve Eq. (26) for $\phi^R_k$ resp. $(\phi^R_\beta)_k$; minimising $\boldsymbol{\Phi}_R$ w.r.t. $\phi^R_k$.
    end for
    $w^R := w^R_k = \sum_n w^R_{n,k} v_n$;
    $\phi^R := \phi^R_k = \sum_\beta \phi^R_{\beta,k} X_\beta$;
    $u^{R+1} := u^R + w^R \otimes \phi^R$;
  end for

The innermost loop can again be seen as a block-Gauss-Seidel method to solve
the coupled system Eqs. (23) and (24), and may certainly be accelerated. Further
improvements are cumulatively possible. First one may note that the
$X_\beta(\xi) = X_\beta(\xi_1, \ldots, \xi_M)$ are multivariate functions $X_\beta : \mathbb{R}^M \to \mathbb{R}$, and may be further split
up into tensor products, e.g. we make $\beta = (\beta_1, \ldots, \beta_M)$ a multi-index and write
$\phi^R(\xi) = \phi^R(\xi_1, \ldots, \xi_M) = \sum_\beta \phi^R_\beta \bigl( \prod_{m=1}^M X_{\beta_m}(\xi_m) \bigr)$ with some univariate
functions $X_j$; this would allow for even finer computations in Algorithm 2, minimising
in the direction of each $X_j$ separately.
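The structure of the greedy rank-one updating can be sketched for a linear problem $K(\xi) u(\xi) = f$. The snippet below is a minimal illustration, not the authors' implementation: it assumes a single uniform RV, replaces the expectation by a Gauss-Legendre quadrature rule, and represents each stochastic factor by its values at the quadrature nodes (a nodal basis with the Kronecker-$\delta$ property), so that the stochastic step decouples node by node. The operator $K(\xi)$ and all function names are made up for the example.

```python
import numpy as np

def energy(K_samples, f, weights, U):
    """Discrete potential E(0.5 u^T K u - u^T f) over the quadrature nodes."""
    quad = 0.5 * np.einsum("ai,aij,aj->a", U, K_samples, U) - U @ f
    return weights @ quad

def greedy_rank_one(K_samples, f, weights, max_rank=10, inner_iters=25, tol=1e-10):
    B, N = K_samples.shape[0], f.size
    U = np.zeros((B, N))                  # current approximation u^R at each node
    energies = [energy(K_samples, f, weights, U)]
    terms = []
    for _ in range(max_rank):
        resid = f - np.einsum("aij,aj->ai", K_samples, U)   # f - K(xi_a) u^R(xi_a)
        if np.sqrt(weights @ np.sum(resid**2, axis=1)) <= tol * np.linalg.norm(f):
            break
        phi = np.ones(B)                  # starting value phi_0^R := 1
        for _ in range(inner_iters):
            # Spatial step: operator and load averaged with weights phi^2 and phi.
            K_bar = np.einsum("a,aij->ij", weights * phi**2, K_samples)
            f_bar = (weights * phi) @ resid
            w = np.linalg.solve(K_bar, f_bar)
            # Stochastic step: one scalar equation per node for the nodal basis.
            phi = (resid @ w) / np.einsum("i,aij,j->a", w, K_samples, w)
        terms.append((w, phi))
        U = U + np.outer(phi, w)          # u^(R+1) := u^R + w^R (x) phi^R
        energies.append(energy(K_samples, f, weights, U))
    return terms, U, energies

# Hypothetical test problem: K(xi) = K0 + (1 + xi) e1 e1^T, xi uniform on [-1, 1].
K0 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]])
K1 = np.diag([1.0, 0.0, 0.0])
nodes, weights = np.polynomial.legendre.leggauss(6)
weights = weights / 2.0
K_samples = np.array([K0 + (1.0 + xi) * K1 for xi in nodes])
f = np.ones(3)

terms, U, energies = greedy_rank_one(K_samples, f, weights)
```

Each inner sweep only solves one system of size $N$ and $B$ scalar equations, and the recorded `energies` should not increase from one rank to the next, since every step minimises the incremental potential in one of its two arguments.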
Using the notation at the end of Sect. 3.2 for tensor representations, the incremental
potential in Eq. (22) may also be written as

$$\boldsymbol{\Phi}_R(w^R, \phi^R) := \boldsymbol{\Phi}(u^R + F_1(w^R, \phi^R)), \tag{27}$$

where $F_1(w^R, \phi^R) = w^R \otimes \phi^R$; the Algorithm 2 in the innermost loop then minimises
$\boldsymbol{\Phi}_R$ in Eq. (27) over the space $U_N \times S_B$. An extension of the above algorithm,
although at some additional cost, is to use the maps $F_R : U_N^R \times S_B^R \to U_N \otimes S_B$
defined by the decomposition Eq. (21),

$$F_R : (w^1, \ldots, w^R, \phi^1, \ldots, \phi^R) \mapsto \sum_{r=1}^{R} w^r \otimes \phi^r,$$

and then, after each increase of $R$, to optimise in the innermost loop the functional

$$\hat{\boldsymbol{\Phi}}_R(w^1, \ldots, w^R, \phi^1, \ldots, \phi^R) := \boldsymbol{\Phi}(F_R(w^1, \ldots, w^R, \phi^1, \ldots, \phi^R)) = \boldsymbol{\Phi}\Bigl( \sum_{r=1}^{R} w^r \otimes \phi^r \Bigr)$$

over the vector space $U_N^R \times S_B^R$ instead of the functional $\boldsymbol{\Phi}_R$ from Eq. (27). The
example to be shown in Sect. 5 was computed in this way. For the sake of brevity we
will not spell out this algorithm; more details may be found in [12].
4 Stochastic Coupled Problems
After all this preparation, let us return to Eqs. (7) and (8), the type of system we
are interested in. We assume that analogous to Sect. 3.2 the uncertainties $q_I$ in
subsystem $I$ have been expressed or approximated by a set of independent RVs
$\xi_I = (\xi_{1,I}, \ldots, \xi_{M_I,I})$, i.e. $q_I = q_I(\xi_I)$, and similarly the uncertainties $q_{II}$ in
subsystem $II$ by $\xi_{II} = (\xi_{1,II}, \ldots, \xi_{M_{II},II})$, i.e. $q_{II} = q_{II}(\xi_{II})$.
The quantities describing the solution will then be functions of both $\xi_I$ and $\xi_{II}$,
defined on an $(M_I + M_{II})$-dimensional space, i.e. the stochastic modelling leads to
additional coupling. The difficulty is that with each additional coupled system the
dimension of the underlying variable space grows. This can of course not be avoided,
but it can be mitigated through a low-rank separated or tensor representation. Taking the
solution $u_I(\xi_I, \xi_{II})$ of system $I$ in Eq. (7) as an example, we know that, similarly to
Eq. (21), it is representable as

$$u_I(\xi_I, \xi_{II}) \approx \sum_{r=1}^{R} \phi_I^r(\xi_I) \, \phi_{II}^r(\xi_{II}) \, w_I^r = \sum_{r=1}^{R} w_I^r \otimes \phi_I^r \otimes \phi_{II}^r, \tag{28}$$

as it is an element of $U_I \otimes S_I \otimes S_{II}$. Considering all three components $(u_I, u_{II}, \lambda)$,
they are elements of the tensor product space $(U_I \times U_{II} \times \mathcal{M}) \otimes (S_{B_I} \otimes S_{B_{II}})$,
where $\mathcal{M}$ is the spatial space of Lagrange multipliers $\lambda$. Again we assume that finite
dimensional subspaces / bases

$$\bigl( \operatorname{span}\{v_{I,i}\}_{i=1}^{N_I} \times \operatorname{span}\{v_{II,j}\}_{j=1}^{N_{II}} \times \operatorname{span}\{\mu_\ell\}_{\ell=1}^{L} \bigr) = (U_{N_I} \times U_{N_{II}} \times \mathcal{M}_L) \subset (U_I \times U_{II} \times \mathcal{M})$$

and

$$\bigl( \operatorname{span}\{X_{I,\alpha}\}_{\alpha=1}^{B_I} \otimes \operatorname{span}\{X_{II,\beta}\}_{\beta=1}^{B_{II}} \bigr) = (S_{B_I} \otimes S_{B_{II}}) \subset (S_I \otimes S_{II})$$

have been chosen.
4.1 Stochastic Separated Representation
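We turn right away to the linear coupled problem Eq. (9), which are the conditions
for a saddle-point of the Lagrangian Eq. (10). Here we make a low-rank ansatz as
in Eq. (28) for all solution quantities, with $\eta^r$ denoting the spatial factor of the
multiplier:

$$\begin{bmatrix} u_I(\xi_I, \xi_{II}) \\ u_{II}(\xi_I, \xi_{II}) \\ \lambda(\xi_I, \xi_{II}) \end{bmatrix} \approx \sum_{r=1}^{R} \phi_I^r(\xi_I) \, \phi_{II}^r(\xi_{II}) \begin{bmatrix} w_I^r \\ w_{II}^r \\ \eta^r \end{bmatrix}. \tag{29}$$

The sum will again be computed term-by-term as in Algorithm 2. Following the
general case Eq. (15), first define the corresponding stochastic Lagrangian as expected
value of the deterministic Lagrangian in Eq. (10):

$$\begin{aligned} \boldsymbol{\mathcal{L}}(q_I, q_{II}; u_I, u_{II}, \lambda) :={}& \boldsymbol{\Phi}_I(q_I; u_I) + \boldsymbol{\Phi}_{II}(q_{II}; u_{II}) + E\bigl( \lambda^T (C_I u_I - C_{II} u_{II}) \bigr) \\ ={}& E\bigl( \Phi_I(q_I(\xi_I); u_I(\xi_I, \xi_{II})) \bigr) + E\bigl( \Phi_{II}(q_{II}(\xi_{II}); u_{II}(\xi_I, \xi_{II})) \bigr) \\ &+ E\bigl( \lambda(\xi_I, \xi_{II})^T (C_I u_I(\xi_I, \xi_{II}) - C_{II} u_{II}(\xi_I, \xi_{II})) \bigr). \end{aligned} \tag{30}$$

Following the general prescription in Sect. 3.3, as in Eq. (22), assume that a low-rank
representation like Eq. (29) up to terms of $R - 1$ has already been computed, define
the abbreviations

$$\begin{bmatrix} u_I^R(\xi_I, \xi_{II}) \\ u_{II}^R(\xi_I, \xi_{II}) \\ \lambda^R(\xi_I, \xi_{II}) \end{bmatrix} := \sum_{r=1}^{R-1} \begin{bmatrix} w_I^r \\ w_{II}^r \\ \eta^r \end{bmatrix} \phi_I^r(\xi_I) \, \phi_{II}^r(\xi_{II}),$$

and look at the incremental Lagrangian (cf. Eq. (22)) corresponding to Eq. (30):

$$\boldsymbol{\mathcal{L}}_R(w_I^R, w_{II}^R, \eta^R, \phi_I^R, \phi_{II}^R) = \boldsymbol{\mathcal{L}}\bigl( q_I, q_{II}; \ u_I^R + w_I^R \phi_I^R \phi_{II}^R, \ u_{II}^R + w_{II}^R \phi_I^R \phi_{II}^R, \ \lambda^R + \eta^R \phi_I^R \phi_{II}^R \bigr). \tag{31}$$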
4.2 Partitioned Greedy Algorithm
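The conditions for stationarity of the incremental Lagrangian $\boldsymbol{\mathcal{L}}_R$ from Eq. (31) are,
as before in Eqs. (23) and (24),

$$\forall v_{I,i}, v_{II,j}, \mu_\ell : \quad 0 = \begin{bmatrix} \delta_{w_I^R} \boldsymbol{\mathcal{L}}_R \\ \delta_{w_{II}^R} \boldsymbol{\mathcal{L}}_R \\ \delta_{\eta^R} \boldsymbol{\mathcal{L}}_R \end{bmatrix} = \begin{bmatrix} \langle\!\langle \delta_{u_I} \boldsymbol{\mathcal{L}}, \ v_{I,i} \, \phi_I^R \phi_{II}^R \rangle\!\rangle \\ \langle\!\langle \delta_{u_{II}} \boldsymbol{\mathcal{L}}, \ v_{II,j} \, \phi_I^R \phi_{II}^R \rangle\!\rangle \\ \langle\!\langle \delta_{\lambda} \boldsymbol{\mathcal{L}}, \ \mu_\ell \, \phi_I^R \phi_{II}^R \rangle\!\rangle \end{bmatrix}, \tag{32}$$

and on the stochastic variables

$$\forall X_{I,\alpha}, X_{II,\beta} : \quad 0 = \begin{bmatrix} \delta_{\phi_I^R} \boldsymbol{\mathcal{L}}_R \\ \delta_{\phi_{II}^R} \boldsymbol{\mathcal{L}}_R \end{bmatrix} = \begin{bmatrix} \langle\!\langle \delta_{u_I} \boldsymbol{\mathcal{L}}, w_I^R X_{I,\alpha} \phi_{II}^R \rangle\!\rangle + \langle\!\langle \delta_{u_{II}} \boldsymbol{\mathcal{L}}, w_{II}^R X_{I,\alpha} \phi_{II}^R \rangle\!\rangle + \langle\!\langle \delta_{\lambda} \boldsymbol{\mathcal{L}}, \eta^R X_{I,\alpha} \phi_{II}^R \rangle\!\rangle \\ \langle\!\langle \delta_{u_I} \boldsymbol{\mathcal{L}}, w_I^R \phi_I^R X_{II,\beta} \rangle\!\rangle + \langle\!\langle \delta_{u_{II}} \boldsymbol{\mathcal{L}}, w_{II}^R \phi_I^R X_{II,\beta} \rangle\!\rangle + \langle\!\langle \delta_{\lambda} \boldsymbol{\mathcal{L}}, \eta^R \phi_I^R X_{II,\beta} \rangle\!\rangle \end{bmatrix}. \tag{33}$$

From this and Eqs. (10) and (30) one obtains after a short calculation for Eq. (32)

$$\begin{bmatrix} \bar{K}_I & 0 & \bar{C}_I^T \\ 0 & \bar{K}_{II} & -\bar{C}_{II}^T \\ \bar{C}_I & -\bar{C}_{II} & 0 \end{bmatrix} \begin{bmatrix} w_I^R \\ w_{II}^R \\ \eta^R \end{bmatrix} = \begin{bmatrix} \bar{f}_I \\ \bar{f}_{II} \\ \bar{g} \end{bmatrix}, \tag{34}$$

which has exactly the same size and structure as Eq. (9). Hence it can be solved with
the deterministic FETI algorithm. The averaged deterministic-size terms in Eq. (34)
are given below, where the factorisation of the expectations in $\bar{K}_I$ and $\bar{K}_{II}$ relies on
the independence of $\xi_I$ and $\xi_{II}$:

$$\begin{aligned} \bar{K}_I &= E\bigl( \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \, K_I(\xi_I) \, \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr) = E\bigl( \phi_{II}^R(\xi_{II})^2 \bigr) \, E\bigl( \phi_I^R(\xi_I)^2 K_I(\xi_I) \bigr), \\ \bar{K}_{II} &= E\bigl( \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \, K_{II}(\xi_{II}) \, \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr) = E\bigl( \phi_I^R(\xi_I)^2 \bigr) \, E\bigl( \phi_{II}^R(\xi_{II})^2 K_{II}(\xi_{II}) \bigr), \\ \bar{C}_I &= C_I \, E\bigl( \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr), \qquad \bar{C}_{II} = C_{II} \, E\bigl( \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr), \\ \bar{f}_I &= E\bigl( (f_I(\xi_I) - K_I(\xi_I) u_I^R(\xi_I, \xi_{II})) \, \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr), \\ \bar{f}_{II} &= E\bigl( (f_{II}(\xi_{II}) - K_{II}(\xi_{II}) u_{II}^R(\xi_I, \xi_{II})) \, \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr), \\ \bar{g} &= E\bigl( (C_I^T u_I^R(\xi_I, \xi_{II}) - C_{II}^T u_{II}^R(\xi_I, \xi_{II})) \, \phi_I^R(\xi_I) \phi_{II}^R(\xi_{II}) \bigr). \end{aligned}$$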
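The practical benefit of the factorised expectations can be checked numerically. The following sketch assumes that $\xi_I$ and $\xi_{II}$ are independent and uniform on $[-1, 1]$, uses a tensor-product Gauss-Legendre rule for the joint expectation, and picks arbitrary illustrative functions for $K_I(\xi_I)$, $\phi_I^R$ and $\phi_{II}^R$; none of these choices come from the original text.

```python
import numpy as np

nodes, weights = np.polynomial.legendre.leggauss(8)
weights = weights / 2.0                      # uniform density on [-1, 1] (assumption)

K0 = np.array([[2.0, -1.0], [-1.0, 2.0]])

def K_I(xi_I):
    """Hypothetical subdomain operator depending only on xi_I."""
    return (2.0 + xi_I) * K0

def phi_I(xi_I):
    """Hypothetical stochastic factor in xi_I."""
    return 1.0 + 0.3 * xi_I

def phi_II(xi_II):
    """Hypothetical stochastic factor in xi_II."""
    return np.cos(xi_II)

# Joint expectation over (xi_I, xi_II): B_I * B_II operator evaluations.
K_bar_full = sum(
    wa * wb * (phi_I(a) * phi_II(b)) ** 2 * K_I(a)
    for a, wa in zip(nodes, weights)
    for b, wb in zip(nodes, weights)
)

# Factorised form: one scalar average over xi_II times one operator average over xi_I.
mean_phi_II_sq = weights @ phi_II(nodes) ** 2
K_bar_factored = mean_phi_II_sq * sum(
    wa * phi_I(a) ** 2 * K_I(a) for a, wa in zip(nodes, weights)
)
```

The two results agree up to rounding, but the factorised form only needs averages with respect to the random inputs of one subdomain at a time, which is the point of treating the stochastic space in a partitioned way.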

For Eq. (33) one obtains in the first relation for IR = I,
R X
I, after some
computation X I, :

E X I, ( I ) IRI ( I I )2 k( I , I I ) X I, ( I ) I,
R
= E IRI ( I I ) ( I , I I ) X I, ( I ) ,

(35)
with the abbreviations ( I , I I ) := u IR ( I , I I )T f I ( I ) +u IRI ( I , I I )T f I I ( I I )
and
k( I , I I ) := u IR ( I , I I )T K I ( I ) u IR ( I , I I )+u IRI ( I , I I )T K I I ( I I ) u IRI ( I , I I ).
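Eq. (35) is a linear symmetric positive definite system of size $B_I \times B_I$, as
it is equivalent to the minimisation of $\mathcal{L}^R$. Analogously, for
$\phi_{II}^R = \sum_\beta \varphi_{II,\beta}^R X_{II,\beta}$, the second relation in Eq. (33)
yields a similar linear system of size $B_{II} \times B_{II}$ such that $\forall X_{II,\alpha}$:

$$
\sum_\beta E\big(X_{II,\alpha}(\xi_{II})\,\phi_I^R(\xi_I)^2\,k(\xi_I,\xi_{II})\,X_{II,\beta}(\xi_{II})\big)\,\varphi_{II,\beta}^R
 = E\big(\phi_I^R(\xi_I)\,\ell(\xi_I,\xi_{II})\,X_{II,\alpha}(\xi_{II})\big). \tag{36}
$$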
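To illustrate how a system of the form of Eq. (35) can be assembled, the following Python sketch builds the matrix and right-hand side for a discrete random variable and a small polynomial basis. The function name `solve_phi_I`, the nodal storage of $k$ and $\ell$, and the discrete probability weights are all assumptions made for this illustration; they are not taken from the chapter or from [12].

```python
import numpy as np


def solve_phi_I(X, p_I, p_II, phi_II, k, ell):
    """Assemble and solve a system of the form of Eq. (35) on discrete nodes.

    X      : (B, m_I) values of the basis functions X_alpha at the xi_I nodes
    p_I    : (m_I,)  probabilities of the xi_I nodes
    p_II   : (m_II,) probabilities of the xi_II nodes
    phi_II : (m_II,) values of the fixed factor phi_II at the xi_II nodes
    k, ell : (m_I, m_II) nodal values of the abbreviations k and ell
    """
    k_avg = (k * phi_II ** 2) @ p_II       # E_II[phi_II^2 k] at each xi_I node
    ell_avg = (ell * phi_II) @ p_II        # E_II[phi_II ell] at each xi_I node
    M = (X * (p_I * k_avg)) @ X.T          # M[a, b] = E[X_a phi_II^2 k X_b]
    rhs = X @ (p_I * ell_avg)              # rhs[a] = E[phi_II ell X_a]
    coeffs = np.linalg.solve(M, rhs)       # coefficients of phi_I in the basis
    return coeffs, M


# Hypothetical usage: three nodes and the monomial basis {1, x, x^2}.
xi = np.array([-1.0, 0.0, 1.0])
p = np.full(3, 1.0 / 3.0)
X = np.vstack([np.ones(3), xi, xi ** 2])
phi_II = 1.0 + 0.1 * xi
k = 1.0 + 0.5 * np.add.outer(xi ** 2, xi ** 2)   # strictly positive
ell = 1.0 + np.add.outer(xi, 2.0 * xi)
coeffs, M = solve_phi_I(X, p, p, phi_II, k, ell)
```

As long as $k$ is positive, the matrix is a weighted Gram matrix of the basis and therefore symmetric positive definite. The system for the second factor has the same form with the roles of the two random variables exchanged.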
The procedure to compute the separated solution in a partitioned way is given
by Algorithm 3.
Algorithm 3 Partitioned greedy rank-one updating

$u_I^1 := 0$; $u_{II}^1 := 0$; $\lambda^1 := 0$;
for $R := 1, \ldots$ to sufficient accuracy do
    $\phi_{I,0}^R := 1$; $\phi_{II,0}^R := 1$;
    for $k := 1, \ldots$ to convergence do
        Solve linear saddle point system Eq. (34) for $(w_{I,k}^R, w_{II,k}^R, \mu_k^R)$ with FETI using
            $(\phi_{I,k-1}^R, \phi_{II,k-1}^R)$.
        Solve linear s.p.d. system Eq. (35) for $\phi_{I,k}^R$ using $(w_{I,k}^R, w_{II,k}^R, \mu_k^R, \phi_{II,k-1}^R)$,
            minimising $\mathcal{L}^R$ w.r.t. $\phi_{I,k}^R$.
        Solve linear s.p.d. system Eq. (36) for $\phi_{II,k}^R$ using $(w_{I,k}^R, w_{II,k}^R, \mu_k^R, \phi_{I,k}^R)$,
            minimising $\mathcal{L}^R$ w.r.t. $\phi_{II,k}^R$.
    end for
    $w_I^R := w_{I,k}^R$; $w_{II}^R := w_{II,k}^R$; $\mu^R := \mu_k^R$; $\phi_I^R := \phi_{I,k}^R$; $\phi_{II}^R := \phi_{II,k}^R$;
    $u_I^{R+1} := u_I^R + w_I^R \phi_I^R \phi_{II}^R$; $u_{II}^{R+1} := u_{II}^R + w_{II}^R \phi_I^R \phi_{II}^R$; $\lambda^{R+1} := \lambda^R + \mu^R \phi_I^R \phi_{II}^R$;
end for

Fig. 1 L-shaped domain: coupled diffusion problem

Of course it is possible to extend this algorithm as alluded to at the end of Sect. 3.3,
namely to solve saddle point problems not only on
$\mathcal{U}_{N_I} \times \mathcal{U}_{N_{II}} \times \mathcal{M}_L \times \mathcal{S}_{B_I} \times \mathcal{S}_{B_{II}}$,
as is done in Algorithm 3, but on
$\mathcal{U}_{N_I}^R \times \mathcal{U}_{N_{II}}^R \times \mathcal{M}_L^R \times \mathcal{S}_{B_I}^R \times \mathcal{S}_{B_{II}}^R$;
details may again be found in [12]. In that case each subproblem in the innermost
loop is $R$ times larger, but the algorithm can find a better approximation with smaller
$R$. This is how the example in the following Sect. 5 was computed.
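The alternating structure of the greedy rank-one update can be sketched in a few lines of Python. The sketch deliberately simplifies the setting and should not be read as the authors' implementation: it treats a single symmetric positive definite system $K(\xi_I,\xi_{II})\,u = f$ without interface constraint, so the FETI saddle point solve is replaced by a plain linear solve; the random variables are discrete with three equally likely nodes; and a nodal stochastic basis is assumed, which makes the two stochastic systems diagonal. All matrices and names are hypothetical.

```python
import numpy as np

# Assumed toy setting: xi_I, xi_II independent and uniform on three nodes.
xi = np.array([-1.0, 0.0, 1.0])
p = np.full(3, 1.0 / 3.0)
n = 4
A0 = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A1 = np.diag([1.0, 1.0, 0.0, 0.0])    # part of the operator driven by xi_I
A2 = np.diag([0.0, 0.0, 1.0, 1.0])    # part of the operator driven by xi_II
f = np.ones(n)
# K_all[a, b] is the SPD matrix at node pair (xi[a], xi[b]); shape (3, 3, n, n)
K_all = (A0 + 0.3 * xi[:, None, None, None] * A1
         + 0.3 * xi[None, :, None, None] * A2)
P = np.outer(p, p)                    # joint probabilities of the node pairs


def energy(u):
    """J(u) = E[1/2 u^T K u - u^T f] for nodal values u of shape (3, 3, n)."""
    quad = np.einsum('abi,abij,abj->ab', u, K_all, u)
    return float(np.sum(P * (0.5 * quad - u @ f)))


def greedy_rank_one(n_terms=4, n_inner=10):
    u = np.zeros((3, 3, n))
    energies = [energy(u)]
    for _ in range(n_terms):
        phi1, phi2 = np.ones(3), np.ones(3)            # initial factors
        r = f - np.einsum('abij,abj->abi', K_all, u)   # residual f - K u^R
        for _ in range(n_inner):
            phi = np.outer(phi1, phi2)
            # deterministic step: E[phi^2 K] w = E[phi r]
            K_bar = np.einsum('ab,abij->ij', P * phi ** 2, K_all)
            r_bar = np.einsum('ab,abi->i', P * phi, r)
            w = np.linalg.solve(K_bar, r_bar)
            k = np.einsum('i,abij,j->ab', w, K_all, w)  # w^T K w per node pair
            ell = r @ w                                 # w^T r per node pair
            # stochastic steps: one scalar equation per node (nodal basis)
            phi1 = ((ell * phi2) @ p) / ((k * phi2 ** 2) @ p)
            phi2 = (p @ (ell * phi1[:, None])) / (p @ (k * phi1[:, None] ** 2))
        u = u + np.outer(phi1, phi2)[:, :, None] * w    # rank-one update
        energies.append(energy(u))
    return u, energies


u_sep, energies = greedy_rank_one()
```

Because every sub-step minimises the same energy over one factor while the others are held fixed, the recorded energies cannot increase from one rank to the next in exact arithmetic.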
5 Computational Example
The example of a coupled problem, taken from [12] and shown here in Fig. 1, can
equally well be viewed as a problem which has been partitioned. This exemplifies
another way in which the methods presented here can be used, namely to partition
large problems into manageable pieces. The L-shaped domain
in Fig. 1 has been partitioned as indicated, and thus is a coupled problem. It is a
diffusion problem with a random diffusion coefficient, where we assume that the
uncertainty in this coefficient can be modelled by independent RVs belonging to the
respective subdomains. This means that the diffusion coefficients in the two subdomains
are not correlated.

Fig. 2 Separated approximation: (a) value of the Lagrangian (energy functional) versus separation rank r, (b) relative errors of the mean and standard deviation versus separation rank r

Fig. 3 Separated approximation, mean: (a) contours for R = 1, (b) contours for R = 20 (axes x1, x2)
The coupling conditions are enforced by Lagrange multipliers. The coupled
problem was solved by a FETI method, and the total problem by the extension of
Algorithm 3 alluded to at the end of Sect. 4.2. Fig. 2a shows the convergence of the
value of the Lagrangian with increasing rank; beyond rank R = 7 the value changes
only little. Fig. 2b shows the decrease of the error in the overall mean and standard
deviation with increasing rank.
The following two figures show the contour lines, for the mean in Fig. 3 and for
the standard deviation in Fig. 4, for rank R = 1 in part (a) and for R = 20 in part (b).
The converged contours are shown as dashed lines in all cases. It may be observed
that for the mean in Fig. 3a the contour lines are already quite accurate for R = 1.
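The mean and standard deviation fields discussed here can be evaluated directly from the separated factors, without forming the full solution over the joint stochastic space. The Python sketch below shows one way to do this under the assumption of discrete, independent random variables with given node probabilities; the array layout and the function name `separated_mean_std` are hypothetical.

```python
import numpy as np


def separated_mean_std(W, Phi1, Phi2, p1, p2):
    """Mean and standard deviation of u = sum_r W[r] * phi_I^r * phi_II^r.

    W    : (R, n)  deterministic vectors w^r
    Phi1 : (R, m1) values of phi_I^r at the xi_I nodes
    Phi2 : (R, m2) values of phi_II^r at the xi_II nodes
    p1   : (m1,)   probabilities of the xi_I nodes
    p2   : (m2,)   probabilities of the xi_II nodes
    """
    mean = ((Phi1 @ p1) * (Phi2 @ p2)) @ W        # sum_r E[phi_I^r] E[phi_II^r] w^r
    G1 = (Phi1 * p1) @ Phi1.T                     # E[phi_I^r phi_I^s]
    G2 = (Phi2 * p2) @ Phi2.T                     # E[phi_II^r phi_II^s]
    second = np.einsum('rs,ri,si->i', G1 * G2, W, W)   # E[u_i^2]
    var = np.maximum(second - mean ** 2, 0.0)     # guard against round-off
    return mean, np.sqrt(var)
```

Independence of the two random variables is what allows each expectation of a product to split into a product of one-dimensional expectations, so the cost grows with $R^2$ rather than with the number of joint stochastic nodes.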
Fig. 4 Separated approximation, standard deviation: (a) contours for R = 1, (b) contours for R = 20 (axes x1, x2)
We have outlined a fast, partitioned computational framework for the propagation
of uncertainty through coupled problems, which also saves storage. The proposed
approach constructs a solution-adaptive stochastic basis of separated form with
respect to the random inputs characterising the uncertainty in each sub-problem.
This leads to a partitioned treatment of the stochastic space and consequently to a
higher scalability of the method as compared with standard uncertainty propagation
approaches. For situations where the separation rank is small, the proposed approach
provides at the same time a reduced order representation of the coupled solution.
The deterministic coefficients associated with each separated stochastic basis capture
the spatial variability of the solution and are computed via the standard finite element
tearing and interconnecting (FETI) approach. Therefore, the method achieves a
high level of parallelism while requiring no intrusion in each domain solver. Although
our present formulation of domain coupling is based on the standard FETI approach,
we foresee no major technical difficulties in employing more advanced domain coupling
schemes.
The proposed framework was demonstrated through its application to a linear
elliptic PDE with high-dimensional random inputs. Despite the high dimensionality
of the random inputs, accurate estimates of the solution statistics were achieved with
relatively low separation ranks, thus demonstrating the effectiveness of the present
approach.
Acknowledgments The authors are grateful for the fruitful discussions they had with Prof. K.C.
Park from the University of Colorado, Boulder. AD gratefully acknowledges the financial support of
the Department of Energy under Advanced Scientific Computing Research Early Career Research
Award DE-SC0006402. MH's work was supported by the National Science Foundation grant
CMMI-1201207. The work of HGM and RN has been partly supported by the German Research
Foundation (Deutsche Forschungsgemeinschaft, DFG).
References
1. Babuška I, Chatzipantelidis P (2002) On solving elliptic stochastic partial differential equations.
2. Babuška I, Tempone R, Zouraris G (2004) Galerkin finite element approximations of stochastic
elliptic partial differential equations. SIAM J Numer Anal 42(2):800–825
3. Bieri M, Andreev R, Schwab C (2010) Sparse tensor discretization of elliptic sPDEs. SIAM J
Sci Comput 31:4281–4304
4. Doostan A, Iaccarino G (2009) A least-squares approximation of partial differential equations
with high-dimensional random inputs. J Comput Phys 228(12):4332–4345
5. Doostan A, Owhadi H (2011) A non-adapted sparse approximation of PDEs with stochastic
inputs. J Comput Phys 230:3015–3034
6. Doostan A, Ghanem R, Red-Horse J (2007) Stochastic model reduction for chaos representations.
Comput Meth Appl Mech Eng 196(37–40):3951–3966
7. Farhat C, Roux F (1991) A method of finite element tearing and interconnecting and its parallel
solution algorithm. Int J Numer Meth Eng 32:1205–1227
8. Farhat C, Lesoinne M, LeTallec P, Pierson K, Rixen D (2001) FETI-DP: a dual-primal unified
FETI method, part I: a faster alternative to the two-level FETI method. Int J Num Meth Eng
50:1523–1544
9. Foo J, Karniadakis G (2010) Multi-element probabilistic collocation method in high dimensions.
J Comput Phys 229(5):1536–1557
10. Ghosh D, Avery P, Farhat C (2009) A FETI-preconditioned conjugate gradient method for
large-scale stochastic finite element problems. Int J Numer Meth Eng 80:914–931
11. Giraldi L, Litvinenko A, Liu D, Matthies HG, Nouy A (2013) To be or not to be intrusive? The
solution of parametric and stochastic equations: the plain vanilla Galerkin case.
arXiv:1309.1617v1 [math.NA]. http://arxiv.org/abs/1309.1617v1
12. Hadigol M, Doostan A, Matthies HG, Niekamp R (2013) Partitioned treatment of uncertainty in
coupled domain problems: a separated representation approach. arXiv:1305.6818 [math.PR].
http://arxiv.org/abs/1305.6818
13. Khoromskij B, Schwab C (2011) Tensor-structured Galerkin approximation of parametric and
stochastic elliptic PDEs. SIAM J Sci Comput 33(1):364–385
14. Le Maître O, Knio O (2010) Spectral methods for uncertainty quantification with applications
to computational fluid dynamics. Springer, New York
15. Ma X, Zabaras N (2009) An adaptive hierarchical sparse grid collocation algorithm for the
solution of stochastic differential equations. J Comput Phys 228:3084–3113
16. Matthies HG (2008) Stochastic finite elements: computational approaches to stochastic partial
differential equations. Z Angew Math Mech 88:849–873
17. Matthies HG, Keese A (2005) Galerkin methods for linear and nonlinear elliptic stochastic
partial differential equations. Comput Meth Appl Mech Eng 194:1295–1331
18. Matthies HG, Niekamp R, Steindorf J (2006) Algorithms for strong coupling procedures.
Comput Meth Appl Mech Eng 195(17–18):2028–2049. doi:10.1016/j.cma.2004.11.032
19. Matthies HG, Litvinenko A, Pajonk O, Rosić BV, Zander E (2012) Parametric and uncertainty
computations with tensor product representations. In: Dienstfrey A, Boisvert R (eds) Uncertainty
quantification in scientific computing. IFIP Advances in Information and Communication
Technology, vol 377. Springer, Berlin, pp 139–150. doi:10.1007/978-3-642-32677-6
20. Najm H (2009) Uncertainty quantification and polynomial chaos techniques in computational
fluid dynamics. Ann Rev Fluid Mech 41(1):35–52
21. Nobile F, Tempone R, Webster C (2008) An anisotropic sparse grid stochastic collocation
method for partial differential equations with random input data. SIAM J Numer Anal
46(5):2411–2442
22. Nouy A (2010) Proper generalized decompositions and separated representations for the numerical
solution of high dimensional stochastic problems. Arch Comput Meth Eng 17:403–434
23. Park KC, Felippa CA (1998) A variational framework for solution method developments in
structural mechanics. J Appl Mech 56(1):242–249
24. Subber W, Sarkar A (2012) Domain decomposition method of stochastic PDEs: a two-level
scalable preconditioner. J Phys: Conf Ser 341(1):012033
25. Xiu D (2009) Fast numerical methods for stochastic computations: a review. Commun Comput
Phys 5(2–4):242–272
26. Xiu D (2010) Numerical methods for stochastic computations: a spectral method approach.
Princeton University Press, Princeton
27. Xiu D, Hesthaven J (2005) High-order collocation methods for differential equations with
random inputs. SIAM J Sci Comput 27(3):1118–1139
28. Zhang Z, Choi M, Karniadakis GE (2009) Anchor points matter in ANOVA decomposition.
In: Spectral and higher order methods for partial differential equations. Lecture Notes in
Computational Science and Engineering, Trondheim, pp 347–355
