Numerical Simulations of Coupled Problems in Engineering
Computational Methods in Applied Sciences
Volume 33
Series editor
E. Oñate
International Center for Numerical Methods in Engineering (CIMNE)
Technical University of Catalonia (UPC)
Edificio C-1, Campus Norte UPC
Gran Capitán, s/n
08034 Barcelona, Spain
e-mail: [email protected]
url: http://www.cimne.com
Editor
Sergio R. Idelsohn
International Center for Numerical
Methods in Engineering (CIMNE)
Catalan Institution for Research
and Advanced Studies (ICREA)
Barcelona
Spain
ISSN 1871-3033
ISBN 978-3-319-06135-1 ISBN 978-3-319-06136-8 (eBook)
DOI 10.1007/978-3-319-06136-8
Springer Cham Heidelberg New York Dordrecht London
1 Introduction
$$\mathbf{F}_2 = \mathbf{F}_2^{e}\,\mathbf{F}_2^{in} \tag{1}$$
The elastic response in the hyperelastic arm is assumed to obey the Neo-Hookean law,
whereas the stress in the Maxwell arm $\boldsymbol{\sigma}_2$ depends on the elastic
left stretch tensor $\mathbf{V}_2^{e}$ and the pressure $p_2$ in the Maxwell arm, i.e.
$\boldsymbol{\sigma}_2 = f(\mathbf{V}_2^{e}, p_2)$. For more details we refer
to [5, 6]. These models, however, have been considered only for strain rates of up to
$10^4$ s$^{-1}$.
The primary objectives of the present chapter are as follows:
Develop a block copolymer constitutive model for large strains and strain rates
in the range of $10^5$–$10^6$ s$^{-1}$. Rather than utilizing the framework of multiplicative
decomposition (1), we will pursue an additive decomposition of the rate of
deformation
$$\mathbf{D} = \mathbf{D}^{e} + \mathbf{D}^{in} \tag{2}$$
The evolution of the inelastic responses will follow the framework of the VBO
model [7, 8], with the only exception being that the elastic constants and viscosity will be
assumed to be deformation-dependent. While there are a number of available
models of polyurea in the literature [9], to our knowledge, there is no published
work utilizing the framework of viscoplasticity model based on overstress (VBO)
to describe exceedingly large strains and strain rates observed in the experiments.
In the following we will refer to the generalization of the VBO model to large
strains and strain rates as the GVBO.
Generalized Viscoplasticity Based on Overstress (GVBO) 5
Validate the proposed GVBO constitutive model. We will consider recent experi-
ments conducted at Brown University (impact of a steel flyer on a steel/polyurea/
steel sandwich plate under high shear strain rates) [10–12].
Validate the GVBO model on the structural impact problem. We will consider the
impact of the projectile onto the polyurea/steel bi-layer where a polymer layer is
placed on the front or back of the plate. We will verify experimental observation
suggesting considerable advantage of placing the polyurea layer on the top of the
target plate at very high impact velocities (>1000 m/s) [13–15]. It is noteworthy
to point out that for low to moderate impact velocities the back polyurea coating
of the plate shows better resistance than the front coating [16].
The last part of the manuscript will consider space-time multiscale analysis of the
block copolymer to explain some of the discrepancies of the single-scale GVBO
model. The polyurea microstructure will be modeled as a two-phase heterogeneous
anisotropic material with the inclusion phase being elastic and the soft domains
obeying GVBO constitutive model. The multiple temporal scale effect, which
gives rise to dispersion, will be taken into account using the recently developed
micro-inertia approach [17].
We start from the constitutive equation based on the additive decomposition of the
rate of deformation
$$\overset{\circ}{\boldsymbol{\sigma}} = \mathbf{L} : \left(\mathbf{D} - \mathbf{D}^{in}\right) \tag{3}$$
6 V. Filonova et al.
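where $\overset{\circ}{\boldsymbol{\sigma}}$ is an objective Cauchy stress rate; $\mathbf{D}$ is the rate of deformation tensor, and
$\mathbf{D}^{in}$ is the inelastic rate of deformation. The elastic constitutive tensor $\mathbf{L}$ for an isotropic
material is given by
$$\mathbf{L} = \lambda\,\mathbf{1}\otimes\mathbf{1} + 2G\,\mathbf{I} \tag{4}$$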
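$$\overset{\circ}{\boldsymbol{\sigma}} = \dot{\boldsymbol{\sigma}} + \boldsymbol{\sigma}\,\dot{\mathbf{R}}\,\mathbf{R}^{T} - \dot{\mathbf{R}}\,\mathbf{R}^{T}\boldsymbol{\sigma} \tag{5}$$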
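The flow rule for the case of finite rotations but small strains is given by
$$\mathbf{d} = \mathbf{d}^{e} + \mathbf{d}^{in} = \frac{1+\nu}{E}\,\overset{\circ}{\mathbf{s}} + \frac{3}{2}\,\frac{\mathbf{s}-\mathbf{g}}{E\,k} \tag{6}$$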
where $\mathbf{d}$ is the deviatoric part of the rate of deformation tensor $\mathbf{D}$, decomposed into
elastic and inelastic parts, $\mathbf{d}^{e}$ and $\mathbf{d}^{in}$, respectively; $\mathbf{s}$ is the deviatoric Cauchy stress
tensor; $\mathbf{g}$ is the deviatoric part of the equilibrium stress; $E$ is Young's modulus;
$\nu$ is Poisson's ratio, and $k$ is the viscosity function defined as
$$k = k_1\left(1 + \frac{\Gamma}{k_2}\right)^{-k_3} \tag{7}$$
$$\dot{A} = A_c\,(A_f - A)\,\frac{\Gamma}{E\,k}; \qquad A[t=0] = A_0 \tag{12}$$
where $A_c$, $A_f$ are material constants.
For more details about the classical VBO model see [7, 8].
$$\lambda[J] = J^{p}\,\lambda_0, \qquad G[J] = J^{p}\left(G_0 - \lambda_0 \ln J\right) \tag{13}$$
where $\lambda_0$, $G_0$ are the initial (small deformation) values and $p$ is a material parameter;
the Jacobian is defined as usual by $J = \det(\mathbf{F})$, where $\mathbf{F}$ is the deformation gradient
tensor. The elastic constitutive tensor (4) is assumed to be a function of the Jacobian
The initial elastic properties $E_0$, $\nu_0$ are related to the initial Lamé constants by
$$\lambda_0 = \frac{E_0\,\nu_0}{(1+\nu_0)(1-2\nu_0)}; \qquad G_0 = \frac{E_0}{2\,(1+\nu_0)} \tag{15}$$
The elastic parameters are allowed to increase only from their initial values: $G \ge G_0$,
$\lambda \ge \lambda_0$, $E \ge E_0$; i.e. if the elastic parameters decrease based on Eq. (13), the initial
values are taken instead.
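The deformation dependence of the elastic constants can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the minus sign in front of $\lambda_0 \ln J$ in Eq. (13) is assumed, the function names are hypothetical, and the numerical values used with it (including the exponent `p`) are placeholders rather than calibrated polyurea parameters.

```python
import math


def lame_from_young_poisson(E0, nu0):
    """Initial Lame constants from Young's modulus and Poisson's ratio, Eq. (15)."""
    lam0 = E0 * nu0 / ((1.0 + nu0) * (1.0 - 2.0 * nu0))
    G0 = E0 / (2.0 * (1.0 + nu0))
    return lam0, G0


def deformation_dependent_moduli(J, E0, nu0, p):
    """Jacobian-dependent Lame constants in the spirit of Eq. (13).

    Assumption: G[J] = J**p * (G0 - lam0 * ln J); the sign is inferred,
    not taken verbatim from the source.
    """
    lam0, G0 = lame_from_young_poisson(E0, nu0)
    lam = J ** p * lam0
    G = J ** p * (G0 - lam0 * math.log(J))
    # The moduli may only increase: fall back to the initial values otherwise.
    return max(lam, lam0), max(G, G0)
```

With a negative exponent `p`, compression (`J < 1`) stiffens the material, whereas in expansion (`J > 1`) the clamp returns the initial constants.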
$$k[\Gamma, J] = k_1\left(1 + \frac{\Gamma}{k_2[J]}\right)^{-k_4}; \qquad k_2[J] = k_2 + k_3\,\frac{E[J] - E_0}{E_m - E_0} \tag{17}$$
3 Model Validation
Table 3 Thickness of the flyer, sandwich plates, and impact velocity for the experiments
Shot No.   Sample polyurea (mm)   Front plate (mm)   Rear plate (mm)   Flyer (mm)   Impact velocity V0 (m/s)   Angle (°)
1201       0.097                  3.582              5.578             7.411        183.45                     18
1203       0.089                  3.588              5.898             10.5         175                        0
404        0.11                   2.896              7.041             6.991        112.6                      18
[Figure: schematic of the impact configuration; labels: flyer, front plate, polyurea, rear plate, impact velocity V0]
the steel plates are different. The presence of the cohesive layer does not influence
significantly the numerical results.
The comparison of numerical simulation against experimental data of Shot#1201
is depicted in Figs. 4 and 5. The simulation results can be seen to be in good agreement
with the experiments except for the normal velocity, which behaves differently after the
first peak. This is most likely related to the arrival of the boundary wave and the use
of idealized geometry considered in the simulations.
Fig. 4 Shear and compressive stress versus shear strain and time, respectively, for Shot#1201
It can be seen that for Shot#404 numerical results are in good agreement with the
experimental results (see Figs. 6, 7, 8 (right)).
The compressive stress-strain curve related to pure compression loading-unloading
experiments (Shot#1203) is plotted in Fig. 8 (left). The loading compressive stress-
strain curve for Shot#404 is depicted in Fig. 8 (right) and compared with the experi-
mental observations taken from reference [11].
The pressure dependence of shearing resistance is approximated by a linear func-
tion in Fig. 9 (left), [10]. The numerical simulation results shown in Fig. 9 (right)
correspond to the problem geometry and steel material taken from Shot#1201 and
Shot#404 (Table 2). The results are obtained for impact velocities ranging between
V0 = 112.6 m/s and V0 = 183.45 m/s, and fit nicely the linear function predicted by
the experiments.
Fig. 6 Shear and compressive stress versus shear strain and time, respectively, for Shot#404
Fig. 8 Compressive stress versus compressive strain for pure-compression Shot#1203 (left) and
for pressure-shear Shot#404 [11] (right)
Fig. 9 Dependence of shear resistance on pressure: experimental data [2] (left) and numerical
simulations (right)
Fig. 10 Confined compressive stress-strain curves for GVBO and experimental data [18]
criterion for the polyurea is a maximum principal strain, and the critical damage
value is $D_{pol} = 1.4$.
Material properties for the MIL-A46100 steel [19] are listed in Table 7. The
Taylor-Quinney empirical coefficient, which represents the proportion of
plastic work converted into heat, is taken as 0.9 in this study, following
reference [20].
The steel is modeled by the Johnson-Cook model with material constants taken
from references [21, 22]. The von Mises tensile flow stress is defined by
$$\sigma = \left(A + B\,\varepsilon^{n}\right)\left(1 + C \ln \dot{\varepsilon}^{*}\right)\left(1 - T^{*m}\right) \tag{18}$$
where $\varepsilon$ is the equivalent (or effective) plastic strain, $\dot{\varepsilon}$ and $\dot{\varepsilon}_0$ are the current and
reference strain rates; $\dot{\varepsilon}^{*} = \dot{\varepsilon}/\dot{\varepsilon}_0$ is the dimensionless plastic strain rate for $\dot{\varepsilon}_0 =
1.0$ s$^{-1}$. The homologous temperature $T^{*}$ is defined as
$$T^{*} = \frac{T - T_0}{T_m - T_0} \tag{19}$$
where $T_m$ and $T_0$ are the melting and room temperatures, respectively. The Johnson-
Cook parameters for the steel are summarized in Table 8.
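As an illustration of Eqs. (18) and (19), the following Python sketch evaluates the Johnson-Cook flow stress in its standard published form. The function names are hypothetical, the clamping of the homologous temperature to the interval [0, 1] is an added guard that the text does not state, and any parameter values passed in are placeholders, not the entries of Table 8.

```python
import math


def homologous_temperature(T, T0, Tm):
    """Homologous temperature T* of Eq. (19), clamped to [0, 1] (added guard)."""
    T_star = (T - T0) / (Tm - T0)
    return min(max(T_star, 0.0), 1.0)


def johnson_cook_flow_stress(eps_p, eps_rate, T, A, B, n, C, m, T0, Tm,
                             eps_rate0=1.0):
    """Johnson-Cook flow stress of Eq. (18).

    eps_p     equivalent plastic strain
    eps_rate  current plastic strain rate (1/s), must be positive
    eps_rate0 reference strain rate (1/s)
    """
    rate_star = eps_rate / eps_rate0          # dimensionless strain rate
    T_star = homologous_temperature(T, T0, Tm)
    hardening = A + B * eps_p ** n            # strain hardening term
    rate_term = 1.0 + C * math.log(rate_star) # strain-rate sensitivity
    thermal = 1.0 - T_star ** m               # thermal softening
    return hardening * rate_term * thermal
```

At zero plastic strain, reference strain rate and room temperature the expression reduces to the constant `A`, which is a convenient sanity check for any set of parameters.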
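The Johnson-Cook model defines the damage variable as
$$D = \sum \frac{\Delta\varepsilon}{\varepsilon_f} \tag{20}$$
$$\varepsilon_f = \left(D_1 + D_2 \exp\left(D_3\,\sigma^{*}\right)\right)\left(1 + D_4 \ln \dot{\varepsilon}^{*}\right)\left(1 + D_5\,T^{*}\right) \tag{21}$$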
Table 9 The Johnson-Cook failure parameters for the 4340 steel [22]
D1      D2      D3       D4       D5
0.05    3.44    −2.12    0.002    0.61
the open literature and have been assumed to be the same as for the 4340 steel, (see
Table 9 [22]).
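The damage accumulation of Eqs. (20) and (21) can be sketched in Python as follows. This is a sketch under assumptions: it uses the standard Johnson-Cook failure strain with the stress triaxiality as the argument of the exponential, the function names are hypothetical, and $D_3$ is taken as negative, as in the original Johnson-Cook fracture paper [22], although the sign is not visible in the extracted table.

```python
import math

# Failure parameters of Table 9 (4340 steel); the sign of D3 is an assumption.
D1, D2, D3, D4, D5 = 0.05, 3.44, -2.12, 0.002, 0.61


def failure_strain(triaxiality, rate_star, T_star):
    """Equivalent plastic strain at failure, Eq. (21).

    triaxiality  mean stress divided by von Mises equivalent stress
    rate_star    dimensionless plastic strain rate (must be positive)
    T_star       homologous temperature
    """
    return ((D1 + D2 * math.exp(D3 * triaxiality))
            * (1.0 + D4 * math.log(rate_star))
            * (1.0 + D5 * T_star))


def accumulate_damage(history):
    """Damage D of Eq. (20) for a list of
    (plastic strain increment, triaxiality, rate_star, T_star) tuples."""
    damage = 0.0
    for d_eps, triax, rate_star, T_star in history:
        damage += d_eps / failure_strain(triax, rate_star, T_star)
    return damage
```

Failure is conventionally assumed once the accumulated damage reaches one; for example, two equal increments, each amounting to half of the current failure strain, give a damage of one.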
The 3D plate is constructed from two layers: steel and polyurea of thickness 12.7
mm. The plate radius considered in the simulations is 125 mm. We consider the
impact by a fragment simulating projectile (FSP) of 0.5 caliber, 12.7 mm diameter, and mass
of 12.4 g.
The impact problem is simulated using Abaqus/Explicit solver with built-in
Johnson-Cook material and failure models and the user-defined material subroutine
(describing GVBO) for the polyurea. Only one quarter of the plate was simulated
assuming its symmetry. The bullet is modeled as a rigid body. The plate is assumed
to be free.
The model was meshed using 8-node hexahedral elements with reduced integration.
For stabilization in Abaqus/Explicit, linear and quadratic bulk viscosity parameters
were chosen as 0.54 and 2.4, respectively.
The ballistic limit (V50 ) is estimated in the simulation as an average velocity
between the minimum impact velocity for penetration and maximum velocity before
penetration. It is listed in Table 10 for the blank steel and coated steel plates. These
simulation results are comparable with the results obtained in the experiments [13,
14]. The difference in the ballistic limit between the polyurea-steel and the blank
steel is about 15.4 %.
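The ballistic-limit estimate described above amounts to a simple bracketing rule, sketched here in Python. The function names and the shot data in the usage note are hypothetical illustrations and are not the values of Table 10.

```python
def ballistic_limit(shots):
    """Estimate V50 from simulated shots.

    shots: iterable of (impact_velocity, penetrated) pairs, where
    penetrated is True if the projectile perforated the plate.
    Returns the average of the lowest penetrating velocity and the
    highest non-penetrating velocity.
    """
    penetrating = [v for v, hit in shots if hit]
    stopped = [v for v, hit in shots if not hit]
    if not penetrating or not stopped:
        raise ValueError("need at least one penetrating and one stopped shot")
    return 0.5 * (min(penetrating) + max(stopped))


def relative_gain_percent(v50_coated, v50_bare):
    """Relative increase of the ballistic limit due to the coating, in percent."""
    return 100.0 * (v50_coated - v50_bare) / v50_bare
```

For instance, with hypothetical shots stopped at 900 and 950 m/s and penetrating at 1000 and 1050 m/s, the estimate is 975 m/s.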
Note that the enhancement obtained by the front polyurea layer is slightly smaller
than in the numerical simulation reported in [23] (where an impact problem of FSP
penetrating polyurea-4340 steel was considered, and the polyurea was modeled using
a different viscoplasticity model [15]). In reference [23], the ballistic limits of the blank
steel plate and the polyurea-steel bi-layer were overestimated.
Figure 11 shows the snapshot of a projectile penetration into the bi-layer of
polyurea and MIL-A46100 steel at impact velocity of V = 1370 m/s. The figure
compares the relative position of the projectile in the polyurea layer in front and back
Fig. 11 The FSP impact on polyurea-MIL-A46100 steel plate (top) and MIL-A46100 steel-
polyurea plate (bottom) at impact velocity of 1370 m/s at time 1.e-04s
Fig. 12 FSP velocity for polyurea-steel and steel-polyurea plates at impact velocity of 1370 m/s
Fig. 13 Micromechanical model of polyurethane consisting of hard domains (HD) and soft domains (SD) with a low HD content [25]
The microstructure of the polyurea is composed of hard domains (HD) and soft domains
(SD) forming a two-phase microstructure as shown in Fig. 13. Hard segments have
a high glass-transition temperature $T_g$ and soft segments have a low $T_g$. The
soft segment has its glass transition below the normal operating temperature and is,
therefore, rubbery. The hard segment has its glass transition or its melting tempera-
ture above the ordinary operating temperature and is, therefore, either glassy and/or
crystalline. It is well known that the microphase separation of hard and soft domains
is responsible for the versatile properties of this broad class of polymers [24, 25].
Recent studies [26] have shown that $T_g$ of the soft domain is on average 80 °C higher
at the free surface than in the interior and 60 °C higher than at the circumference.
For the low strain rate tensile specimens, $T_g$ increases with strain and reaches a
maximum value at a strain of 3.6. These increases in the glass transition temperatures
are believed to be due to mixing of the hard and soft segments, but the precise
mechanism is not well understood and cannot be investigated without performing
micromechanical analysis of the polymer.
Numerous experimental studies (see for example [27]) have suggested a significant
shift in the glass transition temperature $T_g$ with strain rate. The precise effect of strain rate on the hard
and soft domains is not known; only its effect on the overall behavior of the polymer is. We will
identify the rate-dependent properties of the SD and HD using an inverse method.
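Consider the following governing equations of a block copolymer over the composite
domain $\Omega$
$$
\begin{aligned}
\sigma_{ij,j} + b_i &= \rho\,\ddot{u}_i && \text{on } \Omega\\
\Delta\sigma_{ij} &= L_{ijkl}\left(\Delta\varepsilon_{kl} - \Delta\mu_{kl}\right) && \text{on } \Omega\\
\Delta\varepsilon_{ij} &= \Delta u_{(i,j)} \equiv \tfrac{1}{2}\left(\Delta u_{i,j} + \Delta u_{j,i}\right) && \text{on } \Omega\\
u_i &= \bar{u}_i && \text{on } \Gamma_u\\
\sigma_{ij}\,n_j &= \bar{t}_i && \text{on } \Gamma_t
\end{aligned} \tag{22}
$$
where $u$ is displacement, $\rho$ is density; $\Delta u_{i,j}$ denotes the derivative of the displacement
increment with respect to the midstep; $\Delta u_{(i,j)}$ denotes the symmetric gradient;
$\Delta\varepsilon_{kl} = \Delta u_{(k,l)}$ is an integral of the rate of deformation over the time step; $\Delta\mu_{kl}$ is an
eigenstrain; and $\sigma_{ij}$ is the Cauchy stress. The microscopic coordinate system is $y = x/\zeta$,
and it is considered to be independent of the macroscopic system of coordinates, $x$,
when $\zeta \ll 1$. Then the spatial derivative rule is $(\cdot)_{,i} = (\cdot)_{,x_i} + (\cdot)_{,y_i}/\zeta$.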
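For a viscoplastic material considered in Sect. 2 undergoing large strains, it is
convenient to decompose the eigenstrain increment as follows
$$
\begin{aligned}
\Delta\mu_{kl} &= \Delta\mu_{kl}^{in} + \Delta\mu_{kl}^{el}\\
\Delta\mu_{kl}^{el} &= \left(I_{klij} - L^{-1}_{klmn}[E_0, G_0]\; L_{mnij}[E, G]\right)\left(\Delta\varepsilon_{ij} - \Delta\mu_{ij}^{in}\right)
\end{aligned} \tag{23}
$$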
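A one-dimensional Python sketch may help to see why the elastic eigenstrain in Eq. (23) is needed. It assumes the reading of Eq. (23) in which the elastic eigenstrain compensates for the difference between the initial and current moduli; the scalar setting, the function names and the numbers are illustrative assumptions only.

```python
def elastic_eigenstrain_increment(E0, E, d_eps, d_mu_in):
    """Scalar analogue of Eq. (23): d_mu_el = (1 - E/E0) * (d_eps - d_mu_in)."""
    return (1.0 - E / E0) * (d_eps - d_mu_in)


def stress_increment_initial_modulus(E0, E, d_eps, d_mu_in):
    """Stress increment computed with the initial modulus E0 and the total
    eigenstrain increment d_mu = d_mu_in + d_mu_el."""
    d_mu = d_mu_in + elastic_eigenstrain_increment(E0, E, d_eps, d_mu_in)
    return E0 * (d_eps - d_mu)


def stress_increment_current_modulus(E, d_eps, d_mu_in):
    """Reference stress increment computed directly with the current modulus E."""
    return E * (d_eps - d_mu_in)
```

Both stress increments coincide, which indicates that the deformation dependence of the moduli can be carried entirely by the eigenstrain while the influence functions are computed once with the initial constants.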
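$$\Delta\varepsilon_{ij}(x, y, t) = \Delta\varepsilon^{c}_{ij}(x, t) + \Delta u^{(1)}_{(i,y_j)}(x, y, t) + O(\zeta) \tag{24}$$
$$\Delta\varepsilon^{c}_{ij}(x, t) = \Delta u^{c}_{(i,x_j)}(x, t) = \frac{1}{|\Theta|}\int_{\Theta} \Delta\varepsilon_{ij}(x, y, t)\, d\Theta \tag{25}$$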
where $H_i^{kl}$, $g_i^{kl}$ are y-periodic instantaneous functions for elastic and inelastic
deformation, respectively. Note that for large strain problems, the unit cell domain may
evolve and therefore the influence functions may change from one increment to
another. In the present manuscript we approximate the influence functions by
their initial values.
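In the model reduction approach [28–33] the eigenstrain field is discretized over
volume partitions as follows
$$\Delta\mu_{ij}(x, y, t) = \sum_{\alpha=1}^{n} N^{(\alpha)}(y)\,\Delta\mu^{c(\alpha)}_{ij}(x, t) \tag{28}$$
and
$$\Delta\mu^{c(\alpha)}_{ij}(x, t) = \frac{1}{|\Theta^{(\alpha)}|}\int_{\Theta^{(\alpha)}} \Delta\mu_{ij}(x, y, t)\, d\Theta \tag{30}$$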
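Inserting (27) and (28) into (26) and introducing
$S_i^{kl(\alpha)}(y) = \int_{\Theta} g_i^{kl}(y, \hat{y})\, N^{(\alpha)}(\hat{y})\, d\hat{\Theta}$
yields
$$\Delta u^{(1)}_i(x, y, t) = H_i^{kl}(y)\,\Delta\varepsilon^{c}_{kl}(x, t) + \sum_{\alpha=1}^{n} S_i^{kl(\alpha)}(y)\,\Delta\mu^{c(\alpha)}_{kl}(x, t) \tag{31}$$
$$\Delta\varepsilon_{kl}(x, y, t) = \left(I_{klmn} + H^{mn}_{(k,y_l)}(y)\right)\Delta\varepsilon^{c}_{mn}(x, t) + \sum_{\alpha=1}^{n} S^{mn(\alpha)}_{(k,y_l)}(y)\,\Delta\mu^{c(\alpha)}_{mn}(x, t) \tag{32}$$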
Given the rate of deformation increment and the previous converged value of Cauchy
stress in each phase of the unit cell the stress can be updated using the GVBO model
outlined in the previous section.
Finally, the coarse-scale Cauchy stress is obtained by averaging fine-scale stresses
$$\sigma^{c}_{ij} = \frac{1}{|\Theta|}\int_{\Theta} \sigma_{ij}\, d\Theta \tag{33}$$
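When the fine-scale stress is taken as constant within each partition of the unit cell, the average in Eq. (33) reduces to a volume-weighted sum. The Python sketch below illustrates this under that piecewise-constant assumption; the function name and the stress values in the usage note are hypothetical.

```python
def coarse_scale_stress(partition_stresses, partition_volumes):
    """Volume-weighted average of partition stresses, a discrete form of Eq. (33).

    partition_stresses: list of stress vectors (e.g. six Voigt components),
                        one per partition, assumed constant in the partition
    partition_volumes:  list of partition volumes (or volume fractions)
    """
    if len(partition_stresses) != len(partition_volumes):
        raise ValueError("one volume is required per partition")
    total_volume = sum(partition_volumes)
    n_components = len(partition_stresses[0])
    return [
        sum(vol * stress[i]
            for stress, vol in zip(partition_stresses, partition_volumes))
        / total_volume
        for i in range(n_components)
    ]
```

For a two-partition unit cell with a 19 % inclusion volume fraction, a hypothetical axial stress of 100 in the inclusion and 10 in the matrix averages to 27.1.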
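where $b^{c}$ and $\rho^{c}$ are averages of the fine-scale body force $b$ and density $\rho$, respectively.
$$
\begin{aligned}
\sigma^{c}_{ij,j} + b^{c}_i &= \rho^{c}\,\ddot{u}^{c}_i && \text{on } \Omega\\
\sigma^{c}_{ij}\,n^{c}_j &= \bar{t}^{c}_i && \text{on } \Gamma_t
\end{aligned} \tag{36}
$$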
where $D_{ijkl}$ is a dispersion coefficient that depends on the material impedance, unit
cell size and overall density [17].
To compute the dispersion coefficient it is convenient to decompose it to linear
and nonlinear contributions
Fig. 14 Microstructure of the unit cell with particles preferentially oriented at 45° with respect to a vertical axis
(y)
(i,y j ) (y) = H(i,y j ) (y) + c Ii jkl
h kl kl
ij (39)
h s d = 0; (y) = c (y)
n () () () ()
st () pq () pq
n n n
() pq
Dinonlin R + Q + Q
jkl icj st pq kl
c i j pq c klpq c
=1 =1 =1 kl =1 ij
() 1 st() pq() () 1 i j pq()
Rst pq = c Sr Sr d; Q i j pq = c h r Sr d (40)
|| ||
Note that the dispersion effect becomes significant when (i) the unit cell size is
large, (ii) material impedance is considerable (i.e. large difference in elastic moduli
or densities between phases) and (iii) spatial gradient of acceleration is high.
To model the polyurea microstructure, we consider an anisotropic unit cell with ellipsoidal
particles (volume fraction 19 %), shown in Fig. 14. The particles are oriented
at a preferential angle of 45° with respect to the loading direction.
The soft domains are modeled by the GVBO model while hard domains are kept
elastic with considerably higher elastic modulus than the initial modulus in the soft
domain. Material parameters of the phases (Table 11) are calibrated to the experimental
data of the normal dynamic impact test [10] and the confined test under monotonic
compressive loading [18]. The results are shown in Figs. 15 and 16. The two-scale
model of polyurea is implemented in Abaqus/Explicit with the MDS plugin (http://multiscale.biz).
Fig. 15 Compressive stress and normal velocity versus time. Comparison of multiscale simulation
with calibrated parameters and experimental data, Shot#1203 [10]
Fig. 16 Confined compressive stress-strain curves. Comparison of multiscale simulation with
calibrated parameters and experimental data [18]
The multiscale model of polyurea is studied for impact problem on polyurea/steel
bi-layer and single polyurea layer. In these studies we have not considered material
failure, and thus only relatively low impact velocities have been analyzed.
The evolution of projectile velocity is shown in Figs. 17 and 19 for the bi-layer and
the single layer, respectively. The response of the heterogeneous anisotropic material
(Table 11) is compared with the homogeneous single-scale GVBO model (Table 5).
It can be seen that a heterogeneous anisotropic polymer layer alone or placed on
the top of the bi-layer (Fig. 17, left) delays the projectile considerably better than a
homogeneous polymer. In other words, the energy is dissipated much faster
and much more effectively by the heterogeneous anisotropic material. This is due to
shear wave propagation, as opposed to pressure wave propagation in the case of an
isotropic polymer. When the polymer is placed on the bottom of the steel there is no
difference between the homogeneous and heterogeneous polymer (Fig. 17, right).
Fig. 17 Evolution of the FSP velocity impacted to bi-layers at initial velocity of 115 m/s. Comparison
of the multiscale simulation with anisotropic heterogeneous material and a single-scale model
with isotropic homogeneous material properties
Fig. 18 Shear stress in the polyurea-steel bi-layer at initial velocity of 115 m/s at time 5.e-05 s.
Comparison of multiscale simulation with anisotropic heterogeneous material (top) and a single-scale
model with isotropic homogeneous material properties (bottom)
The snapshots showing the shear wave stress distribution in the polyurea layer
placed on the top of the bi-layer and in the polyurea plate alone are depicted in
Figs. 18 and 20, respectively. It can be seen that due to preferential orientation of the
hard domains more energy is dissipated than in a homogeneous polyurea.
Fig. 19 Evolution of the FSP velocity impacted on the single polyurea layer with initial velocity
of 167 m/s. Comparison of the multiscale simulation with anisotropic heterogeneous material and
a single-scale model with isotropic homogeneous material properties
Fig. 20 Shear stress in the polyurea layer model at impact velocity of 167 m/s at time 1.6e-04 s.
Comparison of multiscale simulation with anisotropic heterogeneous material (top) and a single-scale
model with isotropic homogeneous material properties (bottom)
To study the dispersion effect we consider an impact onto the polymer plate with an
initial velocity of 300 m/s. For demonstration purposes we consider a relatively large
unit cell (about 1 mm) and a high ratio of phase densities (the hard domain is
ten times denser than the soft domain). Failure of the polymer material is not included
here. The micro-inertia effect is implemented in the Abaqus/Explicit solver by adding
the dispersion term into the overall stress (35). The simulation results depicted in Figs. 21
and 22 demonstrate that dispersion indeed contributes to the decrease in the projectile
velocity.
Fig. 21 Evolution of the FSP velocity impacted on the single polyurea layer at initial velocity
300 m/s. Comparison of the multiscale simulation for anisotropic heterogeneous material with and
without dispersion
Fig. 22 Shear stress distribution in a polyurea layer impacted by a projectile at initial velocity of
300 m/s at time 1.47e-04 s. Comparison of the multiscale simulation with anisotropic heterogeneous
material and with (top) and without (bottom) consideration of dispersion
6 Summary
References
1. Roland CM, Casalini R (2007) Effect of hydrostatic pressure on the viscoelastic response of
polyurea. Polymer 48:5747–5752
2. Green MS, Tobolsky AV (1946) A new approach to the theory of relaxing polymeric media. J
Chem Phys 14(2):80–92
3. Johnson AR, Quigley J, Freese CE (1995) A viscohyperelastic finite element model for rubber.
Comput Methods Appl Mech Eng 127:163–180
4. Roland CM (1989) Network recovery from uniaxial extension: i. elastic equilibrium. Rubb
Chem Technol 62:863–879
5. Bergstrom JS, Boyce MC (1998) Constitutive modeling of the large strain time-dependent
behavior of elastomers. J Mech Phys Solids 46(5):931–954
6. Jiao T, Clifton RJ, Grunschel SE (2009) Pressure-sensitivity and constitutive modeling of an
elastomer at high strain rates. AIP Conf Proc 1195:1229–1232
7. Colak OU (2004) Modeling of large simple shear using a viscoplastic overstress model and
classical plasticity model with different objective stress rates. Acta Mechanica 167:171–187
8. Gomaa S, Sham T-L, Krempl E (2004) Finite element formulation for finite deformation,
isotropic viscoplasticity theory based on overstress (fvbo). Int J Solids Struct 41:3607–3624
9. Grujicic M, He T, Pandurangan B et al (2012) Experimental characterization and material-
model development for microphase-segregated polyurea: an overview. J Mater Eng Perform
21:2–16
10. Clifton RJ, Jiao T (2012) Resistance of elastomers to shearing and failure at extreme loading
conditions. ONR-Workshop
11. Jiao T, Clifton RJ, Grunschel SE (2006) High strain rate response of an elastomer. AIP Conf
Proc 845:809–812
12. Jiao T, Clifton RJ (2013) Measurement of the response of an elastomer at pressures up to 9 GPa
and strain rates of $10^5$–$10^6$ s$^{-1}$. In: 18th Biennial international conference of the APS topical
group on shock compression of condensed matter, Washington, July 2013
13. Roland CM, Fragiadakis D, Gamache RM et al (2012) Factors influencing the ballistic impact
resistance of elastomer-coated metal substrates. Philos Mag 93(5):468–477
14. Roland CM, Fragiadakis D, Gamache RM (2010) Elastomer-steel laminate armor. Compos
Struct 92:1059–1064
15. Roland CM, Twigg JN, Vu Y et al (2007) High strain rate mechanical behavior of polyurea.
Polymer 48:574–578
16. Amini MR, Isaacs J, Nemat-Nasser S (2010) Investigation of effect of polyurea on response of
steel plates to impulsive loads in direct pressure-pulse experiments. Mech Mater 42:628–639
17. Fish J, Filonova V, Kuznetsov S (2012) Micro inertia effects in nonlinear heterogeneous media.
Int J Numer Meth Eng 91(13):1406–1426
18. Amirkhizi AV, Isaacs J, McGee J et al (2006) An experimentally-based viscoelastic constitutive
model for polyurea, including pressure and temperature effects. Philos Mag 86(36):5847–5866
19. Grujicic M, Ramaswami S, Snipes JS et al (2013) Multiphysics modeling and simulations of
Mil A46100 armor-grade martensitic steel gas metal arc welding process. J Mater Eng Perform
22: 29505969
20. Borvik T, Hopperstad OS, Dey S et al (2005) Strength and ductility of weldox 460 e steel
at high strain rates, elevated temperatures and various stress triaxialities. Eng Fract Mech
72(7):1071–1087
21. Johnson GR, Cook WH (1983) A constitutive model and data for metals subjected to large
strains, high strain rates and high temperatures. In: Proceedings of 7th international symposium
on ballistics, The Hague, 1983
22. Johnson GR, Cook WH (1985) Fracture characteristics of 3 metals subjected to various strains,
strain rates, temperatures and pressures. Eng Fract Mech 21(1):31–48
23. Irshidat M, Al-Ostaz A, Cheng AH-D (2011) Predicting the response of polyurea coated high
hard steel plates to ballistic impact by fragment simulating projectiles. Ole Miss Project 90031.
http://www.serri.org/publications/Documents. Accessed 12 May 2011
24. Yi J, Boyce MC, Lee GF et al (2006) Large deformation rate-dependent stress-strain behavior
of polyurea and polyurethanes. Polymer 47:319–329
25. Qi HJ, Boyce MC (2005) Stress-strain behavior of thermoplastic polyurethanes. Mech Mater
37:817–839
26. Lee G, Mock W, Fedderly J et al (2007) The effect of mechanical deformation on the glass
transition temperature of polyurea. AIP Conf Proc 955:711–714
27. Sharma A, Shukla A, Prosser RA (2002) Mechanical characterization of soft materials using
high speed photography and split hopkinson pressure bar technique. J Mater Sci 37(5):1005–1017
28. Oskay C, Fish J (2008) Fatigue life prediction using 2-scale temporal asymptotic homogenization.
Comput Mech 42(2):181–195
29. Oskay C, Fish J (2007) Eigendeformation-based reduced order homogenization. Comp Meth
Appl Mech Eng 196:1216–1243
30. Yuan Z, Fish J (2009) Multiple scale eigendeformation-based reduced order homogenization.
Comput Methods Appl Mech Eng 198(21–26):2016–2038
31. Yuan Z, Fish J (2009) Hierarchical model reduction at multiple scales. Int J Numer Meth Eng
79:314–339
32. Fish J, Yuan Z (2008) N-scale model reduction theory. In Fish J (ed) Multiscale methods:
bridging the scales in science and engineering. Oxford University Press, New York
33. Fish J, Filonova V, Yuan Z (2013) Hybrid impotent-incompatible eigenstrain based homogenization.
Int J Numer Meth Eng 95(1):1–32
Numerical Simulation of Double Cup Extrusion
Test Using the Arbitrary Lagrangian Eulerian
Formalism
Abstract In this chapter Double Cup Extrusion Test (DCET) is modelled using the
finite element method with the help of the Arbitrary Lagrangian Eulerian (ALE)
formalism. DCET is a tribological test involving very large deformations, which
are traditionally handled with complicated and costly remeshing algorithms. Since the
topology of ALE meshes should remain constant throughout the simulation, two
very thin layers of auxiliary elements are added to the initial mesh of the billet where
the material is expected to flow. This numerical trick is combined with an original
and efficient node relocation procedure which allows the model to take into account
complex geometries of punches. The presented model is firstly validated for limited
punch strokes thanks to a purely Lagrangian simulation. It is then compared with
results from the literature. Finally, the general nature and the effectiveness of this
numerical strategy are demonstrated by a fully-coupled thermomechanical simulation
of thixoforming where the final shape of the billet is compared to experimental
measurements.
1 Introduction
The Double Cup Extrusion Test (DCET) is a tribological test dedicated to forging
operations. Before the conception of DCET, one of the easiest ways to quantify
friction for this type of process was the ring compression test (see Male and
Cockcroft [17]), which consists in crushing a flat ring until a prescribed thickness
is obtained, as depicted in Fig. 1. If the contacts are well lubricated (Fig. 1a), the
material flows outwards and the inner radius of the ring increases. If the friction
becomes higher (Fig. 1b), this radial motion is slowed down. A smaller radius is then
obtained ($r_2 < r_1$). Consequently, the final inner radius can be used as an indirect
measure of friction. However, this simple tribological test reproduces rather badly
the real contact conditions and the very high deformations that can be observed at
the interfaces between the material and the tools of real forging operations. Indeed,
according to Bay [2], it is common to reach pressures close to 2.5 GPa, surface
temperatures higher than 600 °C and local surface elongation up to 3000 %. DCET
was conceived by Geiger [12] in order to measure friction and to test lubricants in
tribological conditions that are closer to these values.
Fig. 1 Schematic description of the ring compression test which may be used to quantify friction
in forging operations [25]: (a) low friction (good lubrication), inner radius $r_1$; (b) high friction (bad lubrication), inner radius $r_2$
Although DCET is much more elaborate than the ring compression test, the
measured friction can still be deduced from very simple geometrical quantities. The
experimental setup can be described as follows (see Fig. 2a): a cylindrical billet is
placed in a hollow container of the same diameter between two punches. During the test,
the lower punch does not move while the upper punch goes down and crushes the
specimen. The material is therefore forced to flow along both punches in such a way
that two cups are gradually formed. If the contacts were perfectly lubricated, i.e. in the
frictionless case, the material would flow symmetrically upwards and downwards.
The H-shaped section of the forged billet would have two branches with the same
height ($h_1 = h_2$).
In practice, friction is unavoidable and induces a dissymmetry in the process.
The final section obtained looks like the one represented in Fig. 2b. The
upper cup ($h_1$) is always higher than the lower one ($h_2$). The friction can then
be quantified by the cup height ratio $h_1/h_2$. The higher this ratio, the higher the
friction during the test.
A first direct application of this tribological test is the classification of lubricants
according to their respective efficiency in forging conditions. For example, Gariety
et al. [11] have compared four lubricants using DCET. They have
also studied the possibility of jamming by visualising the grooves on the free surfaces
of the billet after the test.
Fig. 2 (a) Principle of the double cup extrusion test, from [23]; labels: cylindrical billet, cup heights $h_1$ and $h_2$. (b) Picture of a deformed billet after
DCET (from Gariety et al. [11])
A second interesting application is the numerical estimation of a friction coefficient
with the help of the finite element method. In the case of forging, Tresca's
law is usually chosen to model friction:
$$\tau = m\,\tau_{max} \tag{1}$$
where $\tau$ is the friction shear stress at the contact interface, $m$ is the friction coefficient
and $\tau_{max}$ is the shear yield stress of the material. A series of numerical simulations
of DCET can be performed using a range of friction values m and the corresponding
curves of cup height ratios h 1 / h 2 versus upper punch displacements can be plotted.
This set of calibration curves and the experimental measurement are then used to
identify a mean coefficient m for the process [7, 9, 26]. This friction value might be
used later, with much care, in more complex numerical simulations of forging which
would involve the same material and the same lubricant. The finite element models of
the previously cited authors were all using the commercial code DEFORM-2D [24]
which conveniently provides an automatic remeshing procedure for quadrangular
meshes.
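The identification step described above can be sketched as an inverse interpolation on the calibration curves. The Python function below is a minimal illustration, assuming that the simulated cup height ratio at the measured punch stroke increases monotonically with the friction value $m$; the function name and the numbers in the usage note are hypothetical and do not come from the cited studies.

```python
def identify_friction_factor(m_values, simulated_ratios, measured_ratio):
    """Linearly interpolate the friction value m that reproduces a measured
    cup height ratio h1/h2 at a given punch stroke.

    m_values:         friction values used in the simulations
    simulated_ratios: simulated h1/h2 for each m at the same stroke
    measured_ratio:   experimental h1/h2 at that stroke
    """
    pairs = sorted(zip(simulated_ratios, m_values))
    for (r0, m0), (r1, m1) in zip(pairs, pairs[1:]):
        if r0 <= measured_ratio <= r1:
            if r1 == r0:
                return m0
            weight = (measured_ratio - r0) / (r1 - r0)
            return m0 + weight * (m1 - m0)
    raise ValueError("measured ratio lies outside the calibration range")
```

For example, if simulations with m = 0.05, 0.1 and 0.2 gave ratios of 1.5, 2.0 and 3.0, a measured ratio of 2.5 would correspond to m of about 0.15. A measurement outside the simulated range raises an error rather than extrapolating.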
It is important to notice that the relevance of DCET to evaluating friction in forging
may be somewhat questionable. In fact, the material flow is mostly influenced by
the friction between the billet and the wall of the container. The friction between the
punch and the material, which is more representative of a forging operation, plays
a less significant role on the dissymmetry of the final shape of the billet. Moreover,
some authors, such as Schrader et al. [23], think that the pressures exerted by the
billet on the container are not high enough to use Tresca's law in the finite element
models; Coulomb's law would be more appropriate. Nevertheless, despite these
32 R. Boman et al.
issues, modelling this tribological test is still very interesting from a numerical point
of view.
This chapter is organised in the following way: after a brief review of the ALE
formalism, a simplified extrusion model is presented in order to explain the numerical
trick that will be used to keep the topology of the ALE mesh constant. Then, this
technique is extended to the case of extrusion with curved punches. Next, the model
is validated for small punch strokes by comparison with a classical Lagrangian model
and a simplistic ALE model using mesh smoothing. For larger punch strokes, the
model is compared to results from the literature obtained with a complete remeshing
strategy. Finally, a fully-coupled thermomechanical problem of semi-solid forming
is described and the final predicted shape of the billet is compared to experimental
observations.
This work has been done with Metafor (Ponthot [22]), an in-house implicit finite
element code developed at the University of Liège in Belgium.
In the ALE formalism, unlike in the Lagrangian case which is commonly used in Solid
Mechanics, the mesh no longer follows the material motion. Consequently, a new
grid coordinate system R is defined and the conservation laws and the constitutive
equations are rewritten in terms of the new coordinates [3, 4, 8, 22]:
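Mass:
$$\frac{\partial \rho}{\partial t} + c \cdot \nabla \rho + \rho\, \nabla \cdot v = 0 \tag{2}$$
Momentum:
$$\rho \left( \frac{\partial v}{\partial t} + (c \cdot \nabla)\, v \right) = \nabla \cdot \sigma + \rho\, b \tag{3}$$
Energy:
$$\rho \left( \frac{\partial u}{\partial t} + c \cdot \nabla u \right) = \sigma : D + \rho\, r + \nabla \cdot q \tag{4}$$
Material:
$$\frac{\partial \sigma}{\partial t} + (c \cdot \nabla)\, \sigma = H : D + W \cdot \sigma - \sigma \cdot W \tag{5}$$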
where $\rho$ is the mass density, $\sigma$ is the Cauchy stress tensor, $b$ and $r$ are the specific
body forces and heat sources, $u$ is the specific internal energy, $D$ and $W$ are the
symmetric and antisymmetric parts of the velocity gradient tensor, $q$ is the heat flux,
and $H$ is a material tensor depending on the constitutive parameters, the stresses,
and the loading history. The last two terms in Eq. (5) result from the particular choice
of the Jaumann objective time derivative.
The convective velocity $c = v - v^*$ is the difference between the material velocity
$v$ and the mesh velocity $v^*$. In the case of nonlinear problems, such as metal forming
simulations, $v^*$ should ideally depend on the solution. It is thus an additional unknown
of the above system of equations.
In order to simplify the solution procedure and remain competitive against
Lagrangian models, the set of ALE equations is usually solved using an operator-split
procedure. Each time increment, from time $t$ to $t + \Delta t$, is divided into two
successive steps. The first one is performed in exactly the same way as in the classical
Lagrangian case. During this Lagrangian step the mesh follows the material
motion ($v^* = v$, $c = 0$) until an equilibrated configuration is obtained. The second
step, also called the Eulerian step, is divided into two substeps: the definition
of an appropriate mesh velocity $v^*$ by relocating each node of the mesh to a more
suitable position (Boman and Ponthot [5]), followed by the data transfer from the old
mesh configuration to the new one (Boman and Ponthot [6]). This transfer involves the
Gauss-point values (stress tensor components and history variables of the material such
as the equivalent plastic strain) as well as nodal values (velocities, accelerations and
temperature).
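The structure of one operator-split increment can be summarised in a few lines. The sketch below is purely schematic and one-dimensional: the Lagrangian step is replaced by a prescribed material velocity instead of an equilibrium solve, and the names `ale_increment`, `relocate` and `transfer` are hypothetical and do not refer to Metafor routines. In the toy usage, the mesh is purely Eulerian and a linear interpolation stands in for the data-transfer scheme.

```python
import numpy as np

def ale_increment(x_mesh, field, velocity, dt, relocate, transfer):
    """One operator-split ALE increment on a 1D mesh (schematic only)."""
    # Step 1, Lagrangian: the nodes follow the material (v* = v, c = 0).
    # A real code solves for equilibrium here; a prescribed velocity stands in.
    x_lagrangian = x_mesh + velocity * dt
    # Step 2a, Eulerian: move each node to a more suitable position.
    x_new = relocate(x_lagrangian)
    # Step 2b, Eulerian: transfer the data to the relocated mesh.
    field_new = transfer(x_lagrangian, field, x_new)
    return x_new, field_new

# Toy usage: the nodes always return to a fixed grid (Eulerian mesh) and
# linear interpolation is used as a placeholder transfer operator.
grid = np.linspace(0.0, 1.0, 11)
nodal_field = grid.copy()
x_after, field_after = ale_increment(
    grid, nodal_field, velocity=0.1, dt=1.0,
    relocate=lambda x_lag: grid.copy(),
    transfer=lambda x_old, f_old, x_new: np.interp(x_new, x_old, f_old),
)
```

After the increment the mesh is back on the fixed grid while the field has been shifted by the material displacement, which is the effect of the convective terms in Eqs. (2)-(5).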
In the case of simulations of tribological tests such as DCET, the computation of
friction forces is obviously very important. Nevertheless this evaluation is not as easy
as in the Lagrangian case for which the position of each node of the mesh corresponds
to the same material particle during the whole simulation. The following strategy is
thus implemented: during the Lagrangian step, a classical penalty method is used to
compute the friction occurring at the nodes in contact with a tool. Then, after the
Eulerian step, the equilibrated internal forces are recomputed from the transferred
stress field on the new mesh. The friction forces are calculated by projection onto
the tools and the tangential gaps are recovered from these updated forces.
The extrusion process, or more precisely wiredrawing, was first investigated using
the ALE formalism by Huétink et al. [15] in 1990. In that early work, the studied
problem was 2D axisymmetric, the mesh was purely Eulerian and the stationary
solution was sought. Later, Van Haaren et al. [27] and Geijselaers and Huétink [13]
built an extrusion model in order to analyse their respective novel ALE convection
schemes. Similarly, the mesh was fixed in space and a transient computation was
performed until the stationary state was reached. In those works, the analysis of
the results is not exhaustive: only the plastic strain is visualised, in order to compare
the numerical diffusion of the newly developed advection schemes. In particular, the
friction modelling is not discussed at all.
Transient models of extrusion have also been proposed by Atzema and Huétink [1]
and by Ponthot [21]. The ALE formalism can be very useful in this context. In this
kind of model, the mesh is not Eulerian anymore. Since the ALE formalism requires
a constant mesh topology and thus a constant number of finite elements throughout
the simulation, it is important to take advantage of the approximate knowledge of the
final shape of the extruded billet in order to build the initial mesh. As an illustration of
the particular mesh management procedure, Ponthot's model is presented in Fig. 3.
The extrusion problem is axisymmetric. A cylindrical billet is pushed by a punch into
a narrower channel so that a hollow cylinder is formed. The material is elastoplastic
($E = 200$ GPa, $\nu = 0.3$, $\sigma_Y = 210 + 10\,\bar{\varepsilon}^p$ MPa) and the friction on the boundaries
is modelled by Coulomb's law with a friction coefficient $\mu = 0.15$.
Fig. 3 Axisymmetric geometry of the extrusion model of [21] (dimensions shown: 8 mm, 7 mm
and 3 mm). A cylindrical specimen is constrained to flow into a narrow channel in order to produce
a hollow cylinder. The specimen shapes at the beginning and the end of the extrusion are
completely different
The transient solution is made up of two quadrangular regions (see Fig. 5): the
first region corresponds to the part of the crushed cylindrical billet that still remains
between the punches and the second one contains the material which has been already
extruded and which lies between the fixed punch and the container wall. At time
t = 0, the second region should be ideally empty. In order to keep a unique mesh
from the beginning to the end of the simulation, Ponthot initially assigns a very small
thickness ($h = 0.01$ mm) to this second region and creates a mesh on it. This artificial
region is called the auxiliary region in the remainder of this work. The resulting finite
elements of the auxiliary region are thus very flat, but they can inflate as a result of the
material flux coming from the first region. The node relocation strategy is relatively
simple (see Fig. 4). Most of the vertices of the mesh are Lagrangian (i.e. they follow
the material motion). Only two vertices are Eulerian (i.e. fixed in space). The line
defining the nose of the fixed punch and its neighbour separating both regions of
the mesh are also Eulerian. The nodes of the other lines are relocated by defining a
cubic spline through them. These splines are then remeshed so that the initial node
distribution and the respective curvilinear abscissae are preserved. As far as the inner
nodes are concerned, they are continuously relocated using the same transfinite
mesher that was used to generate the meshes. These node-relocation methods are
fully described in a previous paper (Boman and Ponthot [5]).
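The relocation of the inner nodes by a transfinite mesher can be illustrated with the classical bilinear transfinite interpolation formula. The sketch below assumes uniformly spaced parameters on the four boundary lines, which may differ from the mesher described in [5]. The function name `transfinite_mesh` and the strip dimensions are hypothetical.

```python
import numpy as np

def transfinite_mesh(bottom, top, left, right):
    """Inner nodes of a quadrangular region from its four discretised edges.

    bottom, top -- arrays of shape (n+1, 2), ordered from left to right
    left, right -- arrays of shape (m+1, 2), ordered from bottom to top
    The corner nodes must coincide between adjacent edges.
    Returns an array of shape (n+1, m+1, 2) with the node coordinates.
    """
    bottom, top = np.asarray(bottom, float), np.asarray(top, float)
    left, right = np.asarray(left, float), np.asarray(right, float)
    n, m = len(bottom) - 1, len(left) - 1
    xi = np.linspace(0.0, 1.0, n + 1)[:, None, None]
    eta = np.linspace(0.0, 1.0, m + 1)[None, :, None]
    # Blend the opposite edges pairwise ...
    edges = ((1 - eta) * bottom[:, None, :] + eta * top[:, None, :]
             + (1 - xi) * left[None, :, :] + xi * right[None, :, :])
    # ... and subtract the bilinear corner contribution counted twice.
    corners = ((1 - xi) * (1 - eta) * bottom[0] + xi * (1 - eta) * bottom[-1]
               + (1 - xi) * eta * top[0] + xi * eta * top[-1])
    return edges - corners

# Hypothetical very flat strip (8 mm x 0.01 mm), similar in aspect to an
# auxiliary region at the beginning of a simulation.
xs = np.linspace(0.0, 8.0, 5)
ys = np.linspace(0.0, 0.01, 3)
bottom = np.column_stack([xs, np.zeros_like(xs)])
top = np.column_stack([xs, np.full_like(xs, 0.01)])
left = np.column_stack([np.zeros_like(ys), ys])
right = np.column_stack([np.full_like(ys, 8.0), ys])
nodes = transfinite_mesh(bottom, top, left, right)
```

Because the formula only needs the current boundary nodes, it can be reapplied at every Eulerian step once the boundary lines have been remeshed.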
Fig. 4 Node relocation procedure (labels: fixed punch, fixed wall, moving punch, Lagrangian
vertices, Eulerian vertices, Eulerian lines). Thanks to the simple geometry of the fixed punch, the
definition of the new mesh is made very easy. The nodes and the line in red are Eulerian. The other
lines which have at least one red vertex are remeshed using cubic splines
Figure 5 shows the progress of the simulation. Of course, the proposed mesh man-
agement technique entails some issues. First of all, it is mandatory to roughly know
the direction of the material flow when setting up the model. Moreover, seeing that
the number of finite elements is initially fixed in the auxiliary regions of the mesh,
these elements become larger and larger as the simulation progresses and, conse-
quently, the geometry of the extruded parts becomes crudely discretised. Finally, it
is not possible to extrude all of the material. The mesh of the billet must always be
made up of the same number of elements, but its thickness continuously decreases.
The crushed quadrangles, which lie either in the auxiliary region at the beginning of
the simulation or in the main region of the mesh at the end of the simulation, lead
to some convergence difficulties. On the one hand these finite elements are poorly
conditioned for the Lagrangian steps of the ALE algorithm and, on the other hand,
the stability criterion of the explicit data-transfer scheme of the Eulerian step is very
restrictive concerning the maximum allowable punch displacement during a single
time step. As a result, a very small time step has to be used at the beginning and at
the end of the simulation (Fig. 6). Figure 7 shows the calculated force on the moving
punch during the extrusion operation. The curve obtained by the current implemen-
tation is compared to the former results of Ponthot [21]. The trends of both curves are
very similar and the final values are identical. The discrepancy between the curves
may be explained by the differences in the ALE management of friction.
Despite these limitations, this particular mesh management technique is very
attractive for modelling extrusion or any other process for which the material flow is
predictable. For example, Gadala et al. [10] used the same ALE method to compute
the shape of a metallic chip of a cutting operation.
Fig. 5 Results of the extrusion test of Ponthot [21] for a punch stroke up to 95 % of the initial
thickness of the cylinder ($H = 8$ mm, $D_{tot} = 0.95\,H$). Region 2 initially has a thickness
$h = 0.01$ mm with 10 elements through the thickness. Snapshots are shown at $t = 0$ s, 1 s, 2 s and 3 s
[Fig. 6 Maximum time step (axis label: t max) versus punch displacement [mm]]
Fig. 7 Extrusion force [kN] as a function of the punch displacement [mm] and comparison with
the results of Ponthot [21] (curves: this work; Ponthot (1995))
The previous mesh management technique is now applied to a Double Cup Extrusion
Test. The chosen geometry was developed at the Engineering Research Center (ERC)
of the Ohio State University in order to assess the properties of various lubricants [7].
The exact punch geometry is described in Fig. 8 and the corresponding numerical
values are listed in Table 1. They are taken from the work of Schrader et al. [23], which will be the
main reference in the remaining part of this chapter. The geometry of the punch is
far more complex than the one used by Ponthot. Even if Buschhausen et al. [7] claim
that the shape of the punch does not play a significant role in the results (its shape is
actually optimised to favour the radial flow of the lubricant, which is not modelled
here), this complex shape and all its geometrical details are retained in the model in
order to demonstrate the capabilities of our ALE node-relocation algorithm.
[Fig. 8 Punch geometry (dimension labels: $h_p$, $R$, $D_f$, $D_p$)]
The initial diameter of the cylindrical billet $d_0$ is equal to its height $h_0$ and to the
internal diameter of the container. The extrusion ratio is defined as the ratio of the
area of the punch nose to the area of the upper surface of the billet ($r = D_p^2/d_0^2$). This value,
deduced from Table 1, is equal to $r = 0.25$. According to Schrader et al. [23], this
particular value of $r$ is ideal to observe large variations in the results due to friction
conditions.
The material is an AISI 1018 steel with classical elastic properties: Young's modulus
$E = 200$ GPa and Poisson's ratio $\nu = 0.3$. The nonlinear hardening is modelled
by the following law:
$$\sigma_Y = K\,(\bar{\varepsilon}^p)^n \tag{6}$$
where $K = 735$ MPa and $n = 0.17$. This law has been identified from a standard
tensile test, the elastic part of which has been neglected. It is employed here in this
form, despite the fact that the initial yield stress is zero.
Tresca's law models the frictional contact: Eq. (1) may be rewritten as
$\tau = m\,\sigma_Y/\sqrt{3}$. The extrusion test is supposed to provide the value of this friction
coefficient $m$ by comparing an experimental curve with a series of numerical curves
obtained with a range of $m$ values. In the following simulations the default value
is $m = 0.05$. In practice, the yield stress $\sigma_Y$ appearing in Tresca's law is usually
chosen as the initial yield stress (see for example the DCET simulations of Tan
et al. [26]). In the work of Schrader et al., this numerical value is zero. Consequently,
a first possibility is to use the updated local yield stress. However, this value is only
defined at the Gauss points of the neighbouring elements of the contact nodes, and
not directly at these nodes. The yield stress should thus be extrapolated from the
Gauss points to the contact nodes. As a consequence, the friction force evaluated at
a given node depends on the positions of all the nodes of the neighbouring elements
and a new and more complete stiffness matrix must be computed in order to keep a
quadratic convergence rate. The second possibility is to choose a mean value of the
yield stress of the material. The cold-drawn AISI 1018 steel is listed on MatWeb [18]
with an initial yield stress of 370 MPa. When this particular value is chosen, numerical
results close to the ones of Schrader et al. [23] are obtained. The first choice
leads to noticeably different results, which shows that some uncertainties remain in the
numerical parameters used in the reference work.
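The two options for the yield stress entering Tresca's law can be compared numerically. The short sketch below evaluates Eq. (6) and the rewritten form of Eq. (1) with the values quoted in the text. The function names and the plastic strain of 1 used for the local option are illustrative assumptions.

```python
import math

K_MPA = 735.0   # K of Eq. (6) [MPa]
N_EXP = 0.17    # n of Eq. (6)

def yield_stress(eps_p):
    """Hardening law of Eq. (6): sigma_Y = K * eps_p**n, in MPa."""
    return K_MPA * eps_p ** N_EXP

def tresca_shear(m, sigma_y):
    """Friction shear stress of Eq. (1) with tau_max = sigma_Y / sqrt(3)."""
    return m * sigma_y / math.sqrt(3.0)

m = 0.05
tau_initial = tresca_shear(m, yield_stress(0.0))  # local option, zero plastic strain
tau_local = tresca_shear(m, yield_stress(1.0))    # local option, hypothetical strain of 1
tau_mean = tresca_shear(m, 370.0)                 # fixed mean yield stress of 370 MPa
```

With the fixed value of 370 MPa the friction shear stress is about 10.7 MPa everywhere, whereas the local option starts from zero and roughly doubles this value once the plastic strain reaches 1, which is consistent with the noticeably different results mentioned above.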
The model is axisymmetric and integrated in time by an implicit quasi-static solver
(the speed of the punch is about 10 mm/s, which is far too low to produce
any significant inertia effects). The mesh is made up of Selective Reduced Integration
(SRI) quadrangles. Both punches and the container are assumed rigid. The contact
is modelled by the penalty method. The normal and tangential penalty coefficients,
$p_N$ and $p_T$ respectively, are determined by trial and error: $p_N = 6 \times 10^4$ MPa/mm
and $p_T = 6 \times 10^3$ MPa/mm for the container wall, and $p_N = 2 \times 10^4$ MPa/mm and
$p_T = 2 \times 10^3$ MPa/mm for the punches. The billet is regularly meshed with elements
of 1 mm along the extrusion direction and 0.3 mm along the radial direction, for a total
of 31 × 52 elements. As in Ponthot's model presented in Sect. 4.1, the material
flows are anticipated by adding two very thin auxiliary meshes close to both punches
(15 elements through the thickness of 0.2 mm; see Fig. 10a).
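The role of the tangential penalty coefficient can be illustrated by a regularised stick/slip update. The sketch below shows one common form of such an update and is only an assumption: the text does not detail the implementation used in Metafor. The function name `penalty_friction` and the two tangential gaps are hypothetical, while $p_T$ is the value quoted for the container wall and the shear limit corresponds to $m = 0.05$ and a yield stress of 370 MPa.

```python
import math

def penalty_friction(gap_t, p_t, tau_limit):
    """Regularised friction: returns (shear stress, elastic tangential gap).

    gap_t     -- trial tangential gap [mm]
    p_t       -- tangential penalty coefficient [MPa/mm]
    tau_limit -- Tresca shear limit m * sigma_Y / sqrt(3) [MPa]
    """
    tau_trial = p_t * gap_t
    if abs(tau_trial) <= tau_limit:
        return tau_trial, gap_t                  # sticking contact
    tau = math.copysign(tau_limit, tau_trial)    # sliding contact
    return tau, tau / p_t                        # gap consistent with tau

P_T_WALL = 6.0e3                                 # [MPa/mm], container wall
TAU_LIMIT = 0.05 * 370.0 / math.sqrt(3.0)        # about 10.68 MPa

tau_stick, gap_stick = penalty_friction(0.001, P_T_WALL, TAU_LIMIT)
tau_slip, gap_slip = penalty_friction(-0.01, P_T_WALL, TAU_LIMIT)
```

A larger penalty coefficient shortens the sticking range but makes the tangent stiffness harsher, which is one reason why such coefficients are tuned by trial and error.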
The data-transfer step of the ALE algorithm consists in updating the stresses
(pressure $p$, and deviatoric stress components $s_{rr}$, $s_{rz}$, $s_{zz}$) and the equivalent plastic
strain $\bar{\varepsilon}^p$. These five fields are processed by a first-order Godunov scheme [6].
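The principle of a first-order Godunov (donor-cell) transfer can be shown on a one-dimensional mesh with one constant value per cell. This is a simplified stand-in for the two-dimensional scheme of [6], written under the assumptions that both end nodes are fixed and that no node moves by more than the size of its neighbouring cells. The function name `godunov_transfer` and the numerical values are hypothetical.

```python
import numpy as np

def godunov_transfer(x_old, x_new, phi):
    """Conservative first-order transfer of cell values to a relocated mesh.

    x_old, x_new -- node positions before and after relocation (n+1 values)
    phi          -- cell-wise constant field on the old mesh (n values)
    """
    x_old, x_new = np.asarray(x_old, float), np.asarray(x_new, float)
    phi = np.asarray(phi, float)
    d = x_new - x_old                       # node displacements (swept volumes)
    if d[0] != 0.0 or d[-1] != 0.0:
        raise ValueError("this sketch keeps the two end nodes fixed")
    vol_old, vol_new = np.diff(x_old), np.diff(x_new)
    # Upwind (donor) value at each interior node: a node moving to the right
    # takes material from the cell on its right, and vice versa.
    d_int = d[1:-1]
    donor = np.where(d_int > 0.0, phi[1:], phi[:-1])
    flux = np.zeros_like(d)
    flux[1:-1] = d_int * donor
    # Balance of each cell: old content + inflow at right node - outflow at left.
    return (phi * vol_old + flux[1:] - flux[:-1]) / vol_new

x_old = [0.0, 1.0, 2.0, 3.0]
x_new = [0.0, 1.2, 1.9, 3.0]
phi_old = [1.0, 2.0, 4.0]
phi_new = godunov_transfer(x_old, x_new, phi_old)
```

The scheme conserves the integral of the field and preserves constant fields, but it is diffusive, and its explicit character explains the restriction on the node displacement per step mentioned earlier for very flat elements.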
The mesh motion definition is obviously more complex than in the former
example. The main difficulty is to define the motion of the red line highlighted in Fig. 9,
which represents the surface of the billet under the punch nose and its extension up
to the container wall. Unlike its counterpart in Fig. 4, this curve cannot be Eulerian
because the punch is not stationary anymore. Furthermore, given its slightly convex
geometry, the punch is not entirely in contact with the surface of the billet at the
beginning of the computation. A small gap, which is initially empty, should be filled
during the first moments of the process.
Fig. 9 Node relocation of the DCET model (labels: moving punch, empty space, fixed wall,
Lagrangian vertices, special vertices p1 and p2). Unlike the case of Fig. 4, the red line cannot be
Eulerian anymore
Fig. 10 (a) Zoom on the upper finely-meshed auxiliary region at the beginning of the simulation
(labels: punch, vertical line, p1). (b) Excessive distortion of the elements of the auxiliary domain
if the nodes of the red line follow the real motion of the material boundary (labels: punch,
deformed line that is difficult to remesh, straight approximation used for remeshing, p1)
One could imagine simply preventing the radial motion of node $p_1$ of Fig. 9.
The position of node $p_2$ would be such that both nodes always have
the same Y-coordinate. The other vertices would be Lagrangian. Unfortunately, this
solution does not work because an unavoidable material flux is observed between
the two regions of the mesh and the thin auxiliary mesh rapidly becomes distorted
during the first steps of the simulation. Figures 10a and 10b explain this issue and the
proposed solution. Initially, the vertical line above $p_1$ is very short and very finely
meshed. During the first steps, this line is deformed because the first element of the
auxiliary region receives some spurious fluxes related to the relocation of $p_1$ on the
piecewise-linear boundary of the mesh. These fluxes are very small but, compared
to the very small area of the elements, they are large enough to severely degrade
the auxiliary mesh and to make the line impossible to remesh. Consequently, this
line is remeshed as if it were a straight line until the contact of $p_1$ with the punch is
established.
The simulation is performed in two successive steps. The first one aims at filling
up the empty gap between the billet and the punch. At the end of this step, the
punch nose is entirely in contact with the billet and the situation becomes similar
to the simple extrusion problem of the previous section. During this first step, the
radial displacement of node p1 (Fig. 9) is set to zero and node p2 is Lagrangian.
The vertical line above p1 is remeshed as if it was straight in order to prevent any
distortion problem of the boundary. Doing so, a small inward spurious material flux
is tolerated through this boundary.
The second step begins when node $p_1$ hits the punch surface. From this
moment, the vertical displacement of node $p_2$, as well as those of all the nodes of
line ($p_1$, $p_2$), are set equal to that of $p_1$. This line thus follows the vertical motion
of the punch. The formerly problematic line is remeshed from that stage on
using a cubic spline in order to precisely follow the boundary of the extruded material.
This two-step strategy is symmetrically applied to the lower part of the billet.
If some friction is modelled between the billet and the tools, the process is not
symmetrical and the transition from the first step to the second does not occur at the
same time for the upper and the lower parts of the model. This is not a problem in
practice.
Finally and more classically, all the remaining curves defining the boundaries of
the billet are continuously remeshed using cubic splines. The internal nodes of the
main meshed region are relocated using Giuliani's smoothing method (Giuliani
[14]). This method has been chosen among many others because it produces the
most regular mesh in this particular case. This iterative smoothing requires five iterations
with an over-relaxation coefficient of 1.5. Lastly, both auxiliary meshes are
continuously remeshed by transfinite mapping.
The ALE model of the previous section is compared with two other models: the
first one is a classical Lagrangian model and the second one is an alternative ALE
model which simply consists in smoothing the mesh of the billet without defining
any auxiliary region. This comparison enables us to validate the proposed ALE node
relocation technique.
The punch displacement s (also called punch stroke) is limited to 8 mm so that
the three models can converge and produce results. Figure 11a shows the Lagrangian
solution. The mesh is highly distorted close to the punch nose. A closer look at this
problematic spot (Fig. 11b) shows that the mesh boundary penetrates deeply into
the punch. The node-to-surface formulation of the contact with
the rigid tool yields erroneous results: the surface of the billet is subjected to very
large local elongations and the surface mesh stretches so widely that the curvature
of the punch radius is not well described anymore. Since the contact detection only
involves the nodes of the boundary, the edges are free to cross the punch analytical
surface producing a very large geometrical error.
Fig. 11 (a) Lagrangian solution for an 8-mm stroke ($s = 8$ mm; colour scale: equivalent plastic
strain $\bar{\varepsilon}^p$ from 0.0 to 3.0). These results validate the ALE model for the beginning of the
process. (b) Zoom on the Lagrangian solution (nodes in contact), for which the contact is very badly
taken into account
Fig. 12 (a) ALE solution for an 8-mm stroke (simple model). A single region is meshed and a tricky
smoothing operator is used. (b) ALE solution for an 8-mm stroke (two-region model)
Figure 12a presents the results obtained by the simple ALE model of DCET
without any auxiliary meshed regions. The initial mesh is identical to the
Lagrangian one. All the boundary nodes are relocated using cubic splines in order
to avoid the excessive stretching of the element edges on the boundary, which was
previously discussed. The inner nodes are relocated, after much trial and error,
by a very peculiar combination of two smoothing operators: 70 % of equipotential
smoothing and 30 % of weighted-volume smoothing (Boman and Ponthot [5]). The
equipotential part helps to keep the mesh lines almost perpendicular to each other.
The weighted-volume part tries to equalise the volumes of the neighbouring quadrangles.
Used alone, neither of these methods allows the computation to converge
that far. Figure 13 shows that it is possible to simulate a stroke of $s_{\max} = 5.5$ mm with
an equipotential smoother and a stroke of $s_{\max} = 5.7$ mm with a weighted-volume
smoother. An appropriate combination of both methods enables the simulation to
converge up to 9.8 mm. Nevertheless, these values are much smaller than the experimental
stroke value of 27 mm. Even if this stroke could be reached, it is important
to notice that the combination factors of the smoothing methods are case-dependent
(which means that they are related to a particular value of the friction coefficient
$m$) and very tricky to guess. This simple ALE model is thus useless except for the
validation of the two-region ALE model.
Fig. 13 Comparison of the efficiency of the relocation methods (panels: equipotential; weighted
volumes; 0.7 equipotential + 0.3 weighted volumes). For each case, the red circle
indicates the most critical zone of the mesh where the elements are highly distorted
Figure 12b shows the results obtained with the more sophisticated ALE model,
which includes the finely meshed auxiliary regions, for a punch displacement of 8 mm. This
time, the quality of the mesh is very good. The equivalent plastic strain distribution is
very similar to the one computed by the simple ALE model. Of course, the extruded
heights are slightly different because the two-region model starts at $t = t_0$ with
nonzero heights ($h_1(t_0) = h_2(t_0) = 0.2$ mm, the initial thickness of the auxiliary
meshes). In order to discard this error, the extruded
heights are measured without taking into account the initial small heights of the
auxiliary meshes (see Fig. 14).
It is possible to compare these three models more precisely. Figure 15 shows the
evolution of $h_1$ and $h_2$ as a function of the punch stroke. Both ALE models give very
close numerical results that follow the trend of the Lagrangian model at the beginning
of the computation. Beyond $s = 5$ mm, the Lagrangian model deviates from the
ALE models because of the penetration of the mesh inside the punch surface (see
Fig. 11b). Despite the correction of the computed cup heights, the sophisticated ALE
model generates slightly different results from the simple ALE model. This small
error ($\Delta h_1 = 0.14$ mm and $\Delta h_2 = 0.08$ mm for $s = 9.8$ mm) most likely comes
from the contact length of the billet on the container wall, which is not the same in the
two cases. The 2-region model is thus subjected to slightly more friction than the
simple model. This fact directly affects the corresponding curve of Fig. 16. However,
the global trend is quite satisfactory.
[Fig. 14 Measurement of the extruded heights (labels: $h_2$, $y_2$)]
[Fig. 15 Cup heights $h_1$ and $h_2$ versus punch displacement [mm]; curves: ALE, simple ALE,
Lagrangian]
In Fig. 17, the curves representing the vertical forces measured on the tools for
the three models are very similar at the beginning of the simulation. Starting from
$s = 5$ mm, the Lagrangian solution does not model the real process anymore because
of the excessive material penetration into the punch. Yet the simple ALE model
provides force levels that are very close to those of the more sophisticated ALE model until
it ceases to converge. Finally, as expected, the forces of the 3-region ALE
model are slightly higher than the ones obtained by the two other models (+1.3 %
for the force on the container wall). This difference, which is barely visible in the
figure, could be further reduced by decreasing the initial thickness of the auxiliary
meshes, at the cost of a slower
convergence rate and thus an increase of the total computational time.
[Fig. 16 Cup height ratio $h_1/h_2$ versus punch displacement [mm]; curves: ALE, simple ALE,
Lagrangian]
Fig. 17 Vertical forces [kN] computed on the upper punch $F_y^{up}$, on the lower punch $F_y^{low}$ and on the
container wall $F_y^{wall}$ for the three models, as a function of the punch displacement [mm]
This preliminary study proves that the implemented ALE treatment of friction is
correct because the same results are obtained independently of the chosen formalism.
Fig. 18 Deformed billets which have been obtained for several friction values m
The whole process is now simulated by using the ALE model until the vertical
displacement of the punch reaches $s = 29$ mm, which corresponds to 91 % of the
initial height of the billet. When using $m = 0.05$, the problem is solved in 346 time
increments corresponding to 422 Newton iterations. The total CPU time is 5 min 45 s
(single-threaded run on an AMD Opteron 254, 2.8 GHz). This time does not depend
much on the value of the friction coefficient $m$. Approximately half of this time
(52 %) is spent in the Lagrangian step of the ALE algorithm. The remaining time
splits into node relocation routines (7 %) and the data-transfer scheme (41 %).
The deformed shapes of the billet are presented in Fig. 18 for three values of the
friction coefficient $m$ ($m = 0$, $m = 0.05$ and $m = 0.1$). As expected, the upper cup
height $h_1$ becomes larger when the friction coefficient increases. Since the elastic
deformations are negligible and thanks to mass conservation, the opposite trend
is observed for the lower cup height $h_2$. One must keep in mind that this volume
conservation is not automatically verified in ALE formalism. A closer look at the
volume variation of the mesh reveals that the total volume slightly increases during
the ALE computations. Table 2 shows several interesting values: the added volume
corresponding to the auxiliary meshes represents about 1 % of the exact initial volume
of the experimental billet. At the end of the simulation, an increase of about 0.5 %
of the total volume of the mesh is noticed instead of a slight loss of volume that
could be intuitively expected from the elastic response of the crushed material. This
variation mainly results from the spurious material fluxes that are generated during the
remeshing of the boundaries of the mesh. A smaller fraction of this error might also
come from the limited accuracy of the data-transfer scheme. In any case, the observed
volume variation of the ALE mesh is always positive and it increases slightly with the
friction coefficient $m$.
Table 2 Variation of the volume of the mesh in the ALE simulations of DCET after a punch stroke
of 29 mm (m = 0.05)
                                               V [mm^3]    V/V0 × 100 [%]
Initial volume of the experimental billet V0      25137        100.00
Added volume (auxiliary meshes)                     238          0.95
Initial volume of the model                       25375        100.95
Final volume of the model (m = 0.00)              25492        101.41
Final volume of the model (m = 0.05)              25500        101.44
Final volume of the model (m = 0.10)              25525        101.54
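The entries of Table 2 can be cross-checked with a few lines of arithmetic. The sketch below recomputes the percentages and, as an additional consistency check, estimates the added volume. It assumes that each auxiliary mesh is an annulus of thickness 0.2 mm lying between the punch nose (area ratio $r = 0.25$) and the container wall, which is a simplification of the real punch geometry. The billet diameter is not read from Table 1 but deduced from $V_0$ and $h_0 = d_0$.

```python
import math

V0 = 25137.0   # initial volume of the experimental billet [mm^3], Table 2
volumes = {
    "initial model": 25375.0,
    "final, m = 0.00": 25492.0,
    "final, m = 0.05": 25500.0,
    "final, m = 0.10": 25525.0,
}

# Billet with h0 = d0, so that V0 = pi * d0**3 / 4.
d0 = (4.0 * V0 / math.pi) ** (1.0 / 3.0)

# Assumed auxiliary meshes: two annuli of thickness 0.2 mm between the punch
# nose (r = 0.25) and the container wall.
annulus_area = math.pi / 4.0 * d0 ** 2 * (1.0 - 0.25)
added_volume = 2.0 * 0.2 * annulus_area

# Percentages of Table 2 and volume growth during the run with m = 0.05.
percent = {name: 100.0 * v / V0 for name, v in volumes.items()}
growth = 100.0 * (volumes["final, m = 0.05"] / volumes["initial model"] - 1.0)

# Fraction of the initial billet height reached by the 29-mm stroke.
stroke_fraction = 29.0 / d0
```

Under these assumptions the deduced diameter is close to 31.75 mm, the estimated added volume is close to the 238 mm^3 of Table 2, and the growth of the mesh volume during the simulation is close to the 0.5 % quoted in the text.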
Fig. 19 Set of curves of cup height ratios $h_1/h_2$ versus punch displacement [mm], numerically
obtained for a range of friction coefficients $m$ ($m$ = 0.0, 0.025, 0.05, 0.075, 0.1, 0.15 and 0.2),
with the experimental points of Schrader superimposed. A value for $m$ may be deduced from the
experimental measurements
According to its creator, Geiger [12], the main purpose of DCET is to numerically
identify a friction coefficient which is directly related to the chosen lubricant. To reach
this goal, a series of simulations is performed for a range of friction
values $m$. As an example, Fig. 19 shows the resulting curves in the case of the
studied model. The experimental values from Schrader et al. [23] (three discrete
measurements) have been superimposed on the numerical curves. A friction value
marginally greater than $m = 0.05$ can then be deduced visually from these
results.
Fig. 20 Two different ways of measuring the height $h_1$ in the ALE model ($s = 25$ mm; colour
scale: equivalent plastic strain $\bar{\varepsilon}^p$ from 0.0 to 3.0): either the largest height
$h_1^{\max}$ (the measurement position $r$ may vary during the simulation), or the height
$h_1(r_{\max})$ measured on the container wall (always at $r_{\max}$). These simulations correspond
to $m = 0.05$ and $n = 0$
This section is devoted to a comparison between the ALE model and the numerical
and experimental work of Schrader et al. [23]. These authors use DEFORM-2D, a
FE code which is dedicated to the simulation of forging and extrusion processes.
DEFORM-2D features a sophisticated automatic remeshing algorithm which is very
useful to avoid critical distortions of the quadrangular finite elements during the
computation. The numerical techniques, which are compared in this section, are thus
completely different.
In their work, Schrader et al. study the influence of the hardening coefficient $n$ of
the material (see Eq. (6)) on the cup height ratio and on the contact pressure when the
friction value is $m = 0.05$. The cup height ratio is plotted in Fig. 21 for $n = 0.17$
(the reference value) and $n = 0.0$ (perfectly plastic material). As far as the ALE
results are concerned, not one but two curves have been plotted for each hardening
coefficient $n$. The first curve is obtained when the value of $h$ is measured on the
container wall (at $r = r_{\max} = d_0/2$). The second one is related to the largest value of
$h$, which may be found at a variable radial position $r$ during the simulation. As
an example, Fig. 20 shows the final shape of the billet for a hardening coefficient of
$n = 0$. The largest value of $h_1$ is not located at the container wall. The
cup height ratio may vary a lot according to the particular shape of the upper free
surface of the material and to the measurement position of $h_1$. This is particularly
the case when the punch stroke and $h$ are small. For each value of $n$ in Fig. 21, the
two ALE curves bracket the one obtained by Schrader et al. using DEFORM-2D.
For larger strokes, these three curves converge to the same final value of the cup
height ratio.
Fig. 21 Influence of the hardening coefficient $n$ of the material on the cup height ratio $h_1/h_2$,
plotted versus punch displacement [mm] ($m = 0.05$; $n = 0.0$ and $n = 0.17$; curves: this work
$h(r_{\max})$, this work $h_{\max}$, Schrader, exp)
Fig. 22 Pressure field [MPa] measured on the container wall along Y [mm], from the lower punch
to the upper punch, for an 8-mm stroke and two different values of the hardening coefficient
($m = 0.05$; $n = 0.0$ and $n = 0.17$; curves: this work, Schrader)
The contact pressures on the container wall are plotted in Fig. 22 for a stroke
limited to 8 mm and two different materials (n = 0 and n = 0.17). There is a very
good agreement between the results of Schrader et al. and the ALE model. The
curves are less close to each other at the level of both punches. Nevertheless, the
global shapes of the pressure fields are quite similar.
Schrader et al. also studied the influence of the initial height h0 of the cylindrical
billet on the obtained results. The curves obtained for ratios h0/d0 of 0.75, 1.0
(reference value) and 1.25 are presented in Fig. 23. Once again, the ALE results are
very close to the published results of Schrader. The largest difference is observed
for the ratio h0/d0 = 0.75. In this case, the cup height ratio h1/h2 computed
by the ALE model is 5 % lower than the value computed by DEFORM-2D.
Since some assumptions had to be made about the treatment of friction
and the initial yield stress of the material, this difference may still be considered
rather small. Consequently, it can be concluded that the results of the ALE model are
consistent with the ones obtained by a remeshing procedure.
[Fig. 23: cup height ratio h1/h2 versus punch displacement [mm] for the tested ratios h0/d0 (curve label h0/d0 = 0.75); legend: this work h(rmax), this work hmax, Schrader, exp]
5 Application to Thixoforming
In this section, the ALE model of the previous sections is used to simulate a semi-
solid forming operation, also known as thixoforming. This kind of process relies on a
specific behaviour, called thixotropy, of some alloys near their melting temperature.
They behave as solids at rest (a billet can sustain its own weight) but they react as
liquids during shearing (for example, they can be cut easily).
A thermomechanical constitutive law which models a smooth transition between
these two behaviours has been implemented in Metafor [16]. The numerical validation
of this law is performed using the ALE model of DCET and the results of a
campaign of experimental tests which was conducted at the University of Liège by
Pierret, Vaneetveld and Rassili [19, 20] in collaboration with the industrial engineering
and mechanical production laboratory of ENSAM (Ecole Nationale Supérieure
d'Arts et Métiers, Metz, France).
The adapted numerical model exhibits several additional difficulties compared
to the one presented previously: all the material parameters are temperature dependent
and a coupled thermomechanical integration scheme is used. The heat transfer
between the material (a 100Cr6 steel alloy heated up to 1370 °C) and the rigid tools
(initially at 130 °C) is also taken into account. The upper punch velocity, which plays
a significant role in the process due to the variable viscosity of the thixotropic material,
is not constant. Finally, there is an initial gap between the billet and the container
wall at the beginning of the process (see Fig. 24). The filling of this gap requires the
definition of a supplementary stage in the time-integration sequence.
Fig. 25 Comparison of the final shape obtained experimentally (left) and the final deformed section
resulting from the ALE simulation with a friction coefficient m = 0.35 (right). Dimensions annotated in the figure: 27.3 mm, 26.1 mm, 14.2 mm and 14.7 mm
Figure 25 presents the final shape of the billet obtained experimentally and numerically.
Although the friction coefficient has been chosen to get almost the same cup
heights in both cases, it is interesting to see that the simulated upper and lower
boundaries of the cups are very similar to the experimental ones. This simulation also
proves that the mesh management technique presented in this chapter is able to deal
with complex material flows.
6 Conclusions
An original 2D model of double cup extrusion test (DCET) has been presented in
this chapter. This model efficiently uses the ALE formalism in order to avoid a series
of complex and costly remeshing steps during the simulation. Since the DCET is a
tribological test, the model is also very interesting to validate the contact treatment
on an ALE mesh. An error in the ALE computation of the local friction force would
be immediately reflected on the global final shape of the deformed billet.
In order to keep a constant mesh topology, which is a prerequisite condition to use
the ALE formalism, it is necessary to add two very thin auxiliary material regions
to the initial mesh of the billet. These regions are made up of flat elements which
can inflate during the simulation when the billet is crushed between the punches
and the material flows from one mesh to the other. Although this particular mesh
management technique has already been used in [10, 21], this is the first time that this
kind of method is applied to a geometrically complex process. Indeed, the noses of
the punches have not been simplified in the DCET model. They are not planar and
their curvature adds a real difficulty to the definition of the mesh motion and the
time-integration sequence.
The presented ALE model has been validated by two different means. It has been
compared first with an equivalent Lagrangian model during the beginning of the
simulations. Secondly, the ALE results have been compared to the ones computed
by DEFORM-2D which makes use of an automatic remeshing procedure. A very
good agreement has been observed between these two numerical techniques although
they are radically different.
Finally, the ALE model of DCET has been used in the frame of a fully-coupled
thermomechanical simulation of a semi-solid forming process. Once again, very
good results have been obtained without any remeshing operations.
References
1. Atzema EH, Huétink J (1995) Finite element analysis of forward/backward extrusion using ALE techniques. In: Shen, Dawson (eds) Simulation of materials processing: theory, methods and applications: proceedings of the 5th international conference NUMIFORM. New York
2. Bay N (1994) The state of the art in cold forging lubrication. J Mater Process Technol 46(1–2):19–40. doi:10.1016/0924-0136(94)90100-7
3. Benson DJ (1989) An efficient, accurate, simple ALE method for nonlinear finite element programs. Comput Methods Appl Mech Eng 72(3):305–350. doi:10.1016/0045-7825(89)90003-0
4. Benson DJ (1992) Computational methods in Lagrangian and Eulerian hydrocodes. Comput Methods Appl Mech Eng 99(2–3):235–394. doi:10.1016/0045-7825(92)90042-I
5. Boman R, Ponthot JP (2012) Efficient ALE mesh management for 3D quasi-Eulerian problems. Int J Numer Meth Eng 92(10):857–890. doi:10.1002/nme.4361
6. Boman R, Ponthot JP (2013) Enhanced ALE data transfer strategy for explicit and implicit thermomechanical simulations of high-speed processes. Int J Impact Eng 53:62–73. doi:10.1016/j.ijimpeng.2012.08.007
7. Buschhausen A, Weinmann K, Lee JY, Altan T (1992) Evaluation of lubrication and friction in cold forging using a double backward-extrusion process. J Mater Process Technol 33(1–2):95–108. doi:10.1016/0924-0136(92)90313-H
8. Donea J, Huerta A, Ponthot JP, Rodriguez-Ferran A (2004) Encyclopedia of computational mechanics, chap 14: arbitrary Lagrangian-Eulerian methods, vol 1. Wiley, pp 413–437. doi:10.1002/0470091355.ecm009
9. Forcellese A, Gabrielli F, Barcellona A, Micari F (1994) Evaluation of friction in cold metal forming. J Mater Process Technol 45(1–4):619–624. doi:10.1016/0924-0136(94)90408-1
10. Gadala MS, Movahhedy MR, Wang J (2002) On the mesh motion for ALE modeling of metal forming processes. Finite Elem Anal Des 38(5):435–459. doi:10.1016/S0168-874X(01)00080-4
11. Gariety M, Ngaile G, Altan T (2007) Evaluation of new cold forging lubricants without zinc phosphate precoat. Int J Mach Tools Manuf 47(3–4):673–681. doi:10.1016/j.ijmachtools.2006.04.016
12. Geiger R (1976) Der Stofffluss beim kombinierten Napffliesspressen - metal flow in combined can extrusion - (Berichte aus dem Institut für Umformtechnik, Universität Stuttgart). 36, Girardet, Essen, Germany
13. Geijselaers HJM, Huétink J (2000) Semi implicit second order discontinuous Galerkin convection for ALE calculations. In: Oñate E, Morgan K, Periaux J, Stein E (eds) (ECCOMAS) European congress on computational methods in applied sciences and engineering, Barcelona
14. Giuliani S (1982) An algorithm for continuous rezoning of the hydrodynamic grid in arbitrary Lagrangian-Eulerian computer codes. Nucl Eng Des 72(2):205–212. doi:10.1016/0029-5493(82)90216-3
15. Huétink J, Vreede PT, van der Lugt J (1990) Progress in mixed Eulerian-Lagrangian finite element simulation of forming processes. Int J Numer Meth Eng 30(8):1441–1457. doi:10.1002/nme.1620300808
16. Koeune R (2011) Semi-solid constitutive modeling for the numerical simulation of thixoforming processes. PhD thesis, University of Liège, Belgium
17. Male AT, Cockcroft MG (1965) A method for the determination of the coefficient of friction of metals under condition of bulk plastic deformation. J Inst Met 93:38–46
18. Matweb (2013) Online materials information resource. http://www.matweb.com/
19. Pierret J (2009) Quantification de la robustesse du procédé de thixoformage des aciers. PhD thesis, University of Liège, Belgium
20. Pierret J, Rassili A, Vaneetveld G, Bigot R, Lecomte-Beckers J (2010) Friction coefficients evaluation for steel thixoforging. Int J Mater Form 3:763–766. doi:10.1007/s12289-010-0882-1
21. Ponthot JP (1995) Advances in Arbitrary Eulerian-Lagrangian finite element simulation of large deformation processes. In: Owen D, Oñate E (eds) Computational plasticity: fundamentals and applications - proceedings of the 4th international conference. Pineridge Press Ltd, Barcelona
22. Ponthot JP (1995) Traitement unifié de la mécanique des milieux continus solides en grandes transformations par la méthode des éléments finis. PhD thesis, Université de Liège, Liège, Belgium
23. Schrader T, Shirgaokar M, Altan T (2007) A critical evaluation of the double cup extrusion test for selection of cold forging lubricants. J Mater Process Technol 189(1–3):36–44. doi:10.1016/j.jmatprotec.2006.11.229
24. Scientific Forming Technologies Corporation (2013) DEFORM. http://www.deform.com/
25. Sofuoglu H, Rasty J (1999) On the measurement of friction coefficient utilizing the ring compression test. Tribol Int 32(6):327–335. doi:10.1016/S0301-679X(99)00055-9
26. Tan X, Bay N, Zhang W (1998) On parameters affecting metal flow and friction in the double cup extrusion test. Scand J Metall 27(6):246–252
27. Van Haaren MJ, Stoker HC, van den Boogaard AH, Huétink J (2000) The ALE-method with triangular elements: direct convection of integration point values. Int J Numer Meth Eng 49(5):697–720. doi:10.1002/1097-0207(20001020)49:5<697::AID-NME976>3.0.CO;2-U
Part II
Cardiovascular Fluid Mechanics
Simplified Fluid-Structure Interactions
for Hemodynamics
Olivier Pironneau
1 Introduction
Mastering the simulation of blood flow is the key to proper design of by-passes,
stents and heart valves (see Thiriet [17] for instance).
The problem was addressed by Charles Peskin in the nineties and his team have
made impressive simulations since, using fictitious domains and immersed boundary
techniques [1, 12, 13, 18].
Another approach, taken by Quarteroni et al. [5] and the REO project at INRIA [3,
4, 19] is to discretize the full fluid-structure coupled problem with solvers working
in moving domains.
In a seminal paper [11], Nobile and Vergara showed that the problem is well
posed and conserves energy. Nevertheless the numerical simulations are expensive
[2] and there is room for simplifications.
O. Pironneau (B)
Laboratoire Jacques-Louis Lions, Sorbonne Universités, UPMC,
Boîte courrier 187, 75252 Paris Cedex 05, France
e-mail: [email protected]
In the special case of aortic flow the geometry does not change much. Typically
the aorta has a radius of 1 cm and a computational geometry deals with a section of
length 5–10 cm; the thickness of the aortic wall is around 0.1 cm; the heart pulse
is about 1 Hz and the pressure drop roughly 6 kPa.
In principle arteries are deformable solids subject to large displacements and
nonlinear elasticity (e.g. [7, 8, 10]). But when only small displacements occur and
linear elasticity applies, shell models like Koiter's can be used. It was shown in [11]
that if lateral displacements are neglected, Koiter's model reduces to a scalar equation
for the normal displacement $\eta$
on the mean position of the vessel's wall; here h denotes the average thickness of
the vessel and $\rho_s$ its volumic mass; T is the pre-stress tensor (needed because at rest
the vessel is blown up by the blood); C is a damping term; a and b are viscoelastic terms;
and $f^s$ is the external normal force, i.e. $\sigma^s_{nn}$, the normal component of the normal
stress at the surface of the solid.
Notice however that the other components of the normal stress tensor cannot be
matched with the fluid when the displacement is assumed normal.
Finally assume that [h, T, C, a] << b; then the Surface Pressure Model is
obtained:
$$\sigma^s_{nn} = b\,\eta, \quad\text{with}\quad b = \frac{E h \pi}{A(1-\xi^2)} \qquad (2)$$
where A is the vessel's cross section, E the Young modulus, $\xi$ the Poisson coefficient.
Some typical values (MKSA):
2 Boundary Conditions
with $h_r = 1$, $h_\theta = 1/r$, $h_\varphi = 1/R$ because, by definition
$$\frac{1}{h_k^2} = (\partial_k x)^2 + (\partial_k y)^2 + (\partial_k z)^2, \qquad k = r, \theta, \varphi \qquad (5)$$
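So $\nabla\cdot u = 0$ and $u\times n = 0$ imply
$$\nabla\cdot u = \partial_r u_r + \frac{R_0 + 2r\cos\theta}{r\,(R_0 + r\cos\theta)}\,u_r = 0 \;\Rightarrow\; \partial_r u_r|_\Sigma = -\frac{u_r}{r}\,\frac{R_0 + 2r\cos\theta}{R_0 + r\cos\theta} \qquad (6)$$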
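The coefficient multiplying the radial velocity in (6) can be checked by hand. The following is a short sketch, assuming toroidal coordinates $(r,\theta,\varphi)$ with $R = R_0 + r\cos\theta$ and the scale factors quoted before (5), and using the fact that the tangential components $u_\theta$, $u_\varphi$ vanish on the wall together with their derivatives along the wall; the symbol names are assumptions made for this check.

```latex
\nabla\cdot u
  = \frac{1}{rR}\Big[\partial_r\big(rR\,u_r\big)
      + \partial_\theta\big(R\,u_\theta\big)
      + \partial_\varphi\big(r\,u_\varphi\big)\Big],
\qquad R = R_0 + r\cos\theta ,
\\[4pt]
\text{so that on the wall}\quad
\nabla\cdot u
  = \partial_r u_r + \frac{\partial_r (rR)}{rR}\,u_r
  = \partial_r u_r + \frac{R_0 + 2r\cos\theta}{r\,(R_0 + r\cos\theta)}\,u_r ,
\quad\text{since}\quad \partial_r(rR) = R_0 + 2r\cos\theta .
```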
Similarly
u = e h i k
i
e u k , i, k (r, , )
k
(7)
i k
with
Thus
ur r
f
r
n T (u)n = r u r + 1 + cos2 nn = p + 2 1 + cos2 u n.
r R R r
(9)
Assuming the fluid Newtonian and incompressible, the pressure p and the velocity
u are given by the Navier-Stokes equations
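$$\rho_f\left(\frac{\partial u}{\partial t} + u\cdot\nabla u\right) - \nabla\cdot\sigma^f = 0, \qquad \nabla\cdot u = 0, \qquad (12)$$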
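$$\int_{\Omega(T)} \frac{\rho_f}{2}|u|^2 + \int_{\Omega\times(0,T)} \frac{\mu}{2}\,|\nabla u + \nabla u^T|^2 = \int_{\Omega(0)} \frac{\rho_f}{2}|u|^2 + \int_{\Sigma\times(0,T)} \sigma^s\, u\cdot n \qquad (15)$$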
Now if we consider (12) on a fixed domain with zero tangential velocities but non-zero
normal velocities on the walls, then to conserve energy we need to change $u\cdot\nabla u$
into $u\cdot\nabla u - \nabla\tfrac12|u|^2$, which happens to be $-u\times\nabla\times u$ due to the identity
$$u\cdot\nabla u = \nabla\tfrac12|u|^2 - u\times\nabla\times u. \qquad (16)$$
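Why this substitution restores the energy balance can be seen by testing both forms of the convective term against u itself. A minimal derivation, assuming a divergence-free velocity on a fixed domain $\Omega$ with outward normal n:

```latex
\int_\Omega (u\cdot\nabla u)\cdot u
  = \int_\Omega u\cdot\nabla\!\big(\tfrac12|u|^2\big)
  = \tfrac12\int_{\partial\Omega} |u|^2\, u\cdot n
  \quad (\text{using } \nabla\cdot u = 0),
\qquad
\int_\Omega \big(u\times(\nabla\times u)\big)\cdot u = 0 .
```

The first integral does not vanish when $u\cdot n \neq 0$ on the walls, whereas the rotational form contributes nothing to the kinetic energy because $u\times(\nabla\times u)$ is orthogonal to u pointwise.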
Let us recall another identity:
$$\nabla\times\nabla\times u = -\Delta u + \nabla(\nabla\cdot u) \qquad (17)$$
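Therefore the modified Navier-Stokes system suited to flows in fixed domains with
zero tangential components on the walls ($u\times n = 0$) is
$$\rho_f\left(\frac{\partial u}{\partial t} - u\times\nabla\times u\right) + \mu\,\nabla\times\nabla\times u + \nabla p = 0, \qquad \nabla\cdot u = 0, \qquad (18)$$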
u
f u u u + u u p u p u
t
+ p u n = 0. (19)
r
with = 2 1 + cos2 . As (20) implies (10-a),
r R
energy estimates derive by choosing u = u, p = p, =
|u| (T ) +
f 2
b (T ) +
2
2| u| + 2
2
(t )2
(0,T ) (0,T )
= |u| (0) +
f 2
b2 (0) (21)
Notice that can be eliminated from (10), giving a formulation which contains
u n = 0 and
nt p = t u + bu (22)
Fig. 1 Left: surfaces of constant pressure for a flow with viscosity $10^{-3}$ and b = 200 in a quarter of a
torus with R = 4, r = 2, discretized on a fixed geometry with the Nedelec edge element for the
velocity, piecewise constant pressures and linear continuous deformation. Right: same as left but
with a mixed Raviart-Thomas element for the displacement
m+1
u um 1 1
f u m+ 2 u m u + u m+ 2 u p m+1 u
t
1
1
p u m+ 2 + [tbu m+ 2 + (u m+1 u m ) + p m n] u = 0. (24)
Formulation (19) is valid only if $u\times n = 0$. This condition has been removed from
(24) to make it symmetric and easy to implement, but the consequence is that, by
working the integrations by parts backward, it is found that this formulation implies
(18) and, on the wall:
1 1
[tbu m+ 2 + (u m+1 u m )] n ( p m+1 p m ), u m+ 2 n = 0 (25)
The first condition no longer implies that $u\times n = 0$, and the second condition is
like saying that the tangential stress is zero, which means that we match not only the
normal components of the fluid and solid normal stress but all the components.
In summary, Problem 2 is different from Problem 1; both of them have a physically
sound background, but we need to test them numerically to see how different they
are.
Let $T_h$ be a triangulation with K tetrahedra $\{T_k\}_1^K$ with the usual conformity hypotheses;
let $\Omega := \cup_k T_k \subset \mathbb{R}^3$.
Consider the $P^2$-$P^1$ element built from
$$V_h = \{v \in C^0(\Omega)^3 : v_i|_{T_k} \in P^2,\ i = 1, 2, 3\}, \qquad Q_h = \{q \in C^0(\Omega) : q|_{T_k} \in P^1\} \qquad (26)$$
We assume that the boundary is made of two parts: the compliant wall, and
the input and output sections on which p is given and $u\times n = 0$.
For simplicity we assume that r << R, i.e. = 1. The momentum equation is also
divided by $\rho_f$, with $\nu = \mu/\rho_f$, and b is changed into $b/\rho_f$.
A feasible discretization of (24) is to find [u m+1 , p m+1 , m+1 ] Vh Q h Q h
with u m+1 n| = 0, m+1 | = 0 and such that
m+1
u um m+ 12 m+ 21
u u u m
p m+1
u p u
t
1 1
+ u m+ 2 u + m+ 2 ]
1 m+ 1 1 1 1
+ b m+ 2 u n u n 2 (m+1 m ) + (u m+ 2 n) (u n)
t
= p u n , [u, p, ] Vh Q h Q h with u n| = 0, | = 0.
(27)
As for the Navier-Stokes equations, when the time step is small enough the problem has
a unique solution because of the energy estimate and because a general inf-sup
condition is satisfied with p replaced by the pair formed by p and the displacement unknown.
m+1
u um m+ 21 m+ 21
u u u m
p m+1
u p u
t
1
+ u m+ 2 u
1
+ (u m+ 2 bt + p m n) u = p u n
u Vh , p Q h with u n| = 0 (29)
2
u m+1 u m 2 m+ 12 2 1
+ | u | + b|u m+ 2 |2 t
t
1
1
m+ m+
= pm u n 2 p u n 2 (30)
5 Numerical Tests
The full model requires that the compliant wall be moved at every time step along its normal by a
quantity $\delta t\, u^m\cdot n$. To preserve the triangulation we follow the literature [2] and solve
an additional problem
On the problem described earlier both methods give very similar results as shown
on Fig. 2. The geometry is updated for visualisation purpose with a multiplicative
factor 100.
The geometry is a section of the aorta obtained from an MRI scan. It has 4991 vertices,
giving 19964 degrees of freedom for each linear system for $[u_1^{m+1}, u_2^{m+1}, u_3^{m+1}, p^{m+1}]$.
The pressure drop from inflow section on the right to outflow section on the
left is p R = 6 cos2 ( t) and the results are shown at t = 0.8. On the smaller cross
sections a pressure drop equal to p R /2 is imposed. Problem 1 and Problem 2 are
solved for comparison with t = 0.05/, = 0.001, b = 200. Results are shown
on Fig. 3.
For Problem 1, the computation took 198 on a 15-inch MacBook Pro (2012, 2.3 GHz
Core i7). For Problem 2 it took 180. The results are very similar, with some difference
in the pressure but very little in the velocities.
We end this article with an idea to address the problem of loss of stability due to the
creation of reverse flow in unwanted regions because of the boundary conditions on
the artificial inflow and outflow sections.
We borrow the idea from the PML literature (see for example [9]) and add to the
artery geometry a viscous buffer after $\Gamma_{out}$ where $\nu = \nu_1 \gg \nu_{blood}$ (and similarly
before $\Gamma_{in}$, but we present the theory applied to the outflow section only).
Consider a geometry where the exit section is $\Gamma_o = \{0\}\times[0, h]$ in 2D, where
pressure is set to $p_0$, while pressure is set to $p_1$ on entry. Assume that we impose a
parabolic flow $u = K y(h - y)$ at the exit of a viscous buffer $\Omega_L = [L, 0)\times[0, h]$,
i.e. on $\{L\}\times[0, h]$. Now we solve the Navier-Stokes equations on $\Omega\cup\Omega_L$. The
problem is to choose K so that the pressure on the initial outflow boundary $\Gamma_o$ is
unchanged in the mean, namely $p_0 := h^{-1}\int p_0\,dy$.
Because at every time step the system to solve is linear, we shall adjust K by
superposition so that the mean pressure is $p_0$ on $\Gamma_{out}$. Since
$p|_{\Gamma_{out}} \approx p_1 + (p_2 - p_1)\dfrac{K - K_1}{K_2 - K_1}$, where $p_1$ is the mean pressure
computed with $K = K_1$ and $p_2$ the mean pressure when $K = K_2$, then
$$K = K_1 + (K_2 - K_1)\,\frac{p_0 - p_1}{p_2 - p_1} \qquad (32)$$
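The superposition step can be sketched in a few lines. This is only an illustration under stated assumptions: `mean_pressure` stands for one solve of the linear time-step system returning the mean outflow pressure for a given K, and the affine toy function below is a hypothetical stand-in for that solve, not the flow solver itself.

```python
def adjust_K(mean_pressure, p0, K1=0.0, K2=1.0):
    """Choose K so that the mean outflow pressure equals p0, as in (32).

    mean_pressure: callable K -> mean pressure on the original outflow section
                   (assumed affine in K because the time-step system is linear).
    """
    p1 = mean_pressure(K1)  # first solve, with K = K1
    p2 = mean_pressure(K2)  # second solve, with K = K2
    if p2 == p1:
        raise ValueError("mean pressure does not depend on K")
    return K1 + (K2 - K1) * (p0 - p1) / (p2 - p1)


def toy_mean_pressure(K):
    """Hypothetical affine response standing in for the linear solve."""
    return 2.0 - 0.5 * K


K = adjust_K(toy_mean_pressure, p0=1.25)
p_check = toy_mean_pressure(K)  # third solve, with the adjusted K
```

The two calls inside `adjust_K` plus the final solve with the adjusted K account for the three solves per time step mentioned in the text.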
This requires solving the linear Stokes-like system 3 times at each time step. We
can also add K to the unknowns of the Stokes-like linear system and add
$\int_{\Gamma_{out}} p = |\Gamma_{out}|\, p_0$ to the equations; we used this second solution in the numerical
tests because it is much less computer intensive.
Fig. 2 Left: surface of equal pressure at t = 0.75 computed by solving Problem 1 with $P^2$-$P^1$-$P^1$
elements and a penalization of the condition $u\times n = 0$. Right: same as left but with Problem 2 and
a $P^2$-$P^1$ element
The idea is tested numerically on a quarter of a 2D-torus with radii 0.6 and 1 with
$\nu = 0.002$ and a pressure drop equal to cos(t) + cos(3t), $t \in (0, 25)$. The PML
viscosity is $\nu_1 = 0.2$. A PML region is added to both ends of the tube. Results are
shown on Fig. 4.
Fig. 3 Computation of [u, p] for Problems 1 & 2 for a portion of an aorta (shown upside down).
Top: with Problem 1, the pressure is shown at t = 0.8 on the left on a geometry which has been moved
by the computed displacement. On the right the third component of the velocity w is shown on the fixed geometry. Bottom:
same for Problem 2
The results look very different, and that is because the two computations do not have
the same inflow and outflow conditions on the original inflow/outflow boundaries.
In one case the pressure is imposed pointwise with $u\times n = 0$; in the PML case
the mean pressure is imposed and no conditions are imposed on the velocity, but a
parabolic velocity is imposed on the inflow/outflow of the PML boundaries.
The method will be tested in 3D and reported in a future publication.
Fig. 4 Left Geometry for the flow with two PML regions added. Center the velocity vectors
computed without the PML; notice the back flow in the yellow region. Right the same flow (velocity
vectors) computed with the two PML regions. The pressure drop between the two inner boundaries
(corresponding to the top and left boundaries of the geometry in the center figure) is the same as
in the center figure
7 Conclusion
In this article we have presented problems and solutions encountered with fluid-structure
interactions when a middle solution is sought: neither the full problem
with moving geometries, because it is too expensive, nor rigid walls, because they are not
precise enough and do not give the geometrical deformation.
The solution adopted here is to delay the geometrical deformations to the graphic
display only. But in doing so we have to work with the Navier-Stokes equations with
unusual boundary conditions, which require unusual finite element discretizations.
For these intermediary problems we have shown that it is important to preserve
energy. Furthermore we can choose either to match exactly the normal component of
the solid and fluid normal stress tensor or to match approximately the 3 components
of the normal stresses by relaxing slightly the no-slip condition.
In all cases the problem of back flows in the pulsating cases remains. We have
suggested a possible solution and made some preliminary tests.
References
1. Boffi D, Gastaldi L (2003) A FEM for the immersed boundary method. Comput Struct 81:491–501
2. Deparis S, Fernandez MA, Formaggia L (2003) Acceleration of a fixed point algorithm for fluid-structure interaction using transpiration conditions. ESAIM:M2AN 37(4):601–616
3. Fernandez M (2013) Incremental displacement-correction schemes for incompressible fluid-structure interaction. Numer Math 123:21–65
4. Formaggia L, Gerbeau JF, Nobile F, Quarteroni A (2001) On the coupling of 3D and 1D Navier-Stokes equations for flow problems in compliant vessels. Comput Methods Appl Mech Eng 191:561–582
5. Formaggia L, Quarteroni A, Veneziani A (eds) (2009) Cardiovascular mathematics, MS and A series. Springer, Milano
6. Girault V (1988) Incompressible finite element methods for Navier-Stokes equations with nonstandard boundary conditions in R3. Math Comp 51(183):55–74
7. Gonzalez O (2000) Exact energy and momentum conserving algorithms for general models in nonlinear elasticity. Comput Methods Appl Mech Eng 190:1763–1783
8. Gonzalez O, Simo JC (1996) On the stability of symplectic and energy-momentum algorithms for nonlinear Hamiltonian systems with symmetry. Comput Methods Appl Mech Eng 134:197–222
9. Hu Fang Q, Li XD, Lin DK (2008) Absorbing boundary conditions for nonlinear Euler and Navier-Stokes equations based on the perfectly matched layer technique. J Comp Phys 227:4398–4424
10. Le Tallec P (2001) Fluid structure interaction with large structural displacements. Comput Methods Appl Mech Eng 190:3039–3067
11. Nobile F, Vergara C (2008) An effective fluid-structure interaction formulation for vascular dynamics by generalized Robin conditions. SIAM J Sci Comp 30(2):731–763
12. Peskin C, McQueen D (1989) A three dimensional computational method for blood flow in the heart. I. Immersed elastic fibers in a viscous incompressible fluid. J Comput Phys 81:372–405
13. Peskin C (2002) The immersed boundary method. Acta Numerica 11:479–517
14. Pichon KG, Pironneau O (2014) Pressure boundary conditions for blood flows. Applied Math Conf in honor of L. Tartar. Proc published in AIMS journal (to appear)
15. Pironneau O (1986) Conditions aux limites sur la pression pour les équations de Stokes et de Navier-Stokes. C R Acad Sci Paris Sér I Math 303(9):403–406
16. Pironneau O (1989) Finite element methods for fluids. Wiley, New York
17. Thiriet M (2011) Biomathematical and biomechanical modeling of the circulatory and ventilatory systems. Control of cell fate in the circulatory and ventilatory systems, vol 2. Springer, New York
18. Usabiaga F, Bell J, Buscalioni R, Donev A, Fai T, Griffith B, Peskin C (2012) Staggered schemes for fluctuating hydrodynamics. Multiscale Model Sim 10:1369–1408
19. Vignon-Clementel I, Figueroa A, Jansen K, Taylor CA (2006) Outflow boundary conditions for three-dimensional finite element modeling of blood flow and pressure in arteries. Comput Methods Appl Mech Eng 195:3776–3796
Patient-Specific Cardiovascular Fluid Mechanics
Analysis with the ST and ALE-VMS Methods
K. Takizawa (B)
Department of Modern Mechanical Engineering and Waseda Institute for Advanced Study,
Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
e-mail: [email protected]
Y. Bazilevs
Structural Engineering, University of California, San Diego, 9500 Gilman Drive, La Jolla,
CA 92093, USA
T. E. Tezduyar · K. Schjodt
Mechanical Engineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
C. C. Long
T-3 Fluid Dynamics and Structural Mechanics, Los Alamos National Laboratory,
Los Alamos, NM 87545, USA
A. L. Marsden
Mechanical and Aerospace Engineering, University of California, San Diego,
9500 Gilman Drive, La Jolla, CA 92093, USA
stresses, (ix) a recipe for pre-FSI computations that improve the convergence of the
FSI computations, (x) the Sequentially-Coupled Arterial FSI technique and its mul-
tiscale versions, (xi) techniques for calculation of the wall shear stress (WSS) and
oscillatory shear index (OSI), (xii) methods for stent modeling and mesh generation,
(xiii) methods for calculation of the particle residence time, and (xiv) methods for an
estimated element-based zero-stress state for the artery. Here we provide an overview
of the special techniques for stent modeling and mesh generation and calculation of
the residence time with application to pulsatile ventricular assist device (PVAD). We
provide references for some of the other special techniques. With results from earlier
computations, we show how the core and special techniques work.
1 Introduction
the FSI modeling relative to just CFD modeling creates more challenges for the
computation, which are discussed here.
Methods for solving the fully-discretized coupled equations are of two kinds:
loosely- and strongly-coupled. The strongly-coupled solution methods can further
be categorized as block-iterative, direct and quasi-direct coupling techniques
(see [3, 4] for the terminology). The strongly-coupled solution methods include the
monolithic methods, which typically imply matching discretizations at the fluid–structure
interface. The quasi-direct and direct coupling methods are applicable to
cases with nonmatching fluid and structure meshes at the interface, become equivalent
to monolithic methods when the interface meshes are matching, and yield more robust
algorithms than block-iterative coupling does, especially when the structure is light
compared to the fluid masses involved in the FSI dynamics.
In loosely-coupled approaches, the equations for fluid, solid and mesh motion
are solved sequentially, in an uncoupled fashion. Typically, within each time step,
the increment in the fluid solution is computed on a fixed spatial domain, the fluid
forces on the structure are collected, and the structural-solution increment is com-
puted, which is followed by an update of the mesh position. This enables the use
of existing fluid and structural solvers, a significant motivation for adopting this
approach. Yet, difficulties in the form of lack of convergence have been noted in
a number of cases, including some cardiovascular FSI cases. In strongly-coupled
approaches, the equations for fluid, solid and mesh motion are solved in a fully cou-
pled fashion, and in a more direct way in the case of quasi-direct and direct coupling.
The main advantage is that monolithic and other quasi-direct and direct coupling
techniques are more robust in that many of the convergence problems encountered
with the loosely-coupled and block-iterative coupling approaches are completely
avoided.
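The sequence of operations within one loosely-coupled time step can be sketched as follows. This is a toy illustration only: the "fluid" is a sealed gas column whose pressure depends on the current domain length, the "structure" is a mass-spring wall, and every function name and parameter value is a hypothetical stand-in rather than any of the solvers discussed in this chapter.

```python
def fluid_solve(length, p_ref=1.0, length_ref=1.0):
    """Toy fluid solve on a fixed domain: pressure of a sealed gas column."""
    return p_ref * length_ref / length


def structure_step(d, v, force, dt, mass=1.0, stiffness=4.0):
    """Toy structural increment: mass-spring wall, semi-implicit Euler."""
    a = (force - stiffness * d) / mass
    v_new = v + dt * a
    d_new = d + dt * v_new
    return d_new, v_new


def loosely_coupled(n_steps, dt, d0=0.1, length0=1.0, area=1.0, p_ext=1.0):
    """Staggered loop: fluid, force collection, structure, mesh update."""
    d, v = d0, 0.0
    length = length0 + d  # current mesh position
    history = []
    for _ in range(n_steps):
        p = fluid_solve(length)                 # 1. fluid increment, domain frozen
        force = area * (p - p_ext)              # 2. collect fluid forces on the structure
        d, v = structure_step(d, v, force, dt)  # 3. structural-solution increment
        length = length0 + d                    # 4. update the mesh position
        history.append((d, v, p))
    return history


history = loosely_coupled(n_steps=1000, dt=0.01)
displacements = [d for d, _, _ in history]
```

In this toy setting the structure is heavy and stiff relative to the fluid load, so the staggered loop stays well behaved; the convergence difficulties mentioned above arise in harder regimes, for example when the structure is light compared to the fluid masses involved.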
Although introducing FSI to blood flow increases modeling and simulation
complexity, the computations produce physiologically more realistic results than
those generated by just CFD. Effects of including wall elasticity in vascular sim-
ulations have been examined, for example, for carotid artery [5, 6], for cerebral
aneurysms [7, 8] and for the total cavopulmonary connection [9], which was the
first use of FSI for patient-specific pediatric cardiology applications. The rigid-wall
assumption consistently shows an overestimation of the wall shear stress (WSS)
compared to the flexible wall, in some cases by as much as 50 %. Some qualitative
and quantitative differences between the rigid- and flexible-wall simulations were
also observed for the blood flow patterns. We note that the blood vessels of young
children are significantly more flexible than those of adults, making FSI modeling
especially important for pediatric cardiology [10].
Unlike the rigid-wall assumption, FSI enables simulation of the complete mechan-
ical environment of the vascular wall, both the loads acting on the wall due to blood
flow and the loads acting within the wall. The latter loads are particularly impor-
tant for they act on the cells that control wall structure and function, which in turn
may change the bulk elastic properties of the wall and hence the hemodynamics.
As a result, considering the associated wall mechanobiology and the importance of
TAFSM in conjunction with the SSTFSI technique. These include techniques for
calculating an estimated zero-pressure (EZP) arterial geometry [40, 41], a special
mapping technique for specifying the velocity profile at an inflow boundary with
non-circular shape [14], techniques for using variable arterial wall thickness [14,
41], mesh generation techniques for building layers of refined fluid mechanics mesh
near the arterial walls [14, 41, 42], a recipe for pre-FSI computations that improve
the convergence of the FSI computations [13], the Sequentially-Coupled Arterial FSI
(SCAFSI) technique [43, 44] and its multiscale versions [44], and techniques [41]
for the projection of fluid–structure interface stresses, calculation of the WSS and
calculation of the oscillatory shear index (OSI). In FSI modeling of three cere-
bral artery segments with aneurysm reported by the TAFSM in [45], the arterial
geometries came from 3D rotational angiography (3DRA). In [45], the TAFSM
also addressed the computational challenges related to extraction of the arterial-
lumen geometry from 3DRA, generation of a mesh for that geometry, and building
a good starting point for the FSI computations. In addition to these computational
challenges common to all three cases, the computational challenges encountered in
some of these cases individually were addressed in [45]. In [46, 47], new techniques
were presented for determining the shrinking amount in the EZP process, the arterial
wall thickness, and the thickness of the layers of refined fluid mechanics volume
mesh near the arterial walls. These techniques were originally proposed in Remark 3
of [45], but the description was very brief. A new scaling technique for specifying a
more realistic volumetric flow rate was also introduced in [46, 47]. In [9], a technique
was proposed that uses Laplace's equation to specify a variable vessel wall thickness,
and in [8, 48] a prestressing technique was developed for blood vessels. The former
addressed the challenge of how to specify a spatially varying vessel wall thickness. It
inspired the idea of using Laplace's equation over the surface mesh covering the lumen
to specify a variable vessel wall thickness [45–47]. The
latter addressed the challenge presented by the fact that patient-specific blood ves-
sel geometry data comes from a configuration that is not stress-free, the same fact
that motivated earlier the development of methods for calculating an EZP arterial
geometry [40]. Both techniques are quite general and, most importantly, are inde-
pendent of the details of the patient-specific blood vessel geometry. Related to that,
recently, methods for an estimated element-based zero-stress state for the artery were
introduced in [21].
While modeling the FSI between the blood flow and arterial walls is one of the most
challenging problems in cardiovascular fluid mechanics, there are other complex
problems that are comparably challenging. Patient-specific computation of unsteady
blood flow in an artery with aneurysm and stent is one of them. Special methods
targeting that class of cardiovascular fluid mechanics problems were introduced
in [49], and a large set of computations were reported in [18, 49].
Thrombus formation (i.e., blood clotting) is a major problem in ventricular assist
devices (VADs), especially in pulsatile VADs (PVADs). Long residence times and
areas of recirculation or stagnation may lead to increased risk of thrombosis in
PVADs [19, 50]. A method was presented in [51] for computing the particle resi-
dence time, which is known to correlate with an increased risk of thrombogenesis.
Fig. 2 Deformed stent (left) and split lumen geometry with the stent (right)
The method was developed in an ALE [17, 23] framework (the Eulerian case was
investigated in [52]), and is suitable for flows with moving boundaries and inter-
faces, including FSI. In [53], the recently-developed residence time formulation was
employed in the definition of the objective function for the FSI-based shape opti-
mization study of a current PVAD design.
In this book chapter we provide an overview of how patient-specific cardiovascular
fluid mechanics analysis, including FSI, can be carried out very effectively with
the core and special ST and ALE techniques. For the governing equations and the
finite element formulations, including the ALE-VMS, DSD/SST and SSTFSI techniques,
we refer the interested reader to [17, 46, 54]. Special techniques for stent
modeling and mesh generation and particle residence time calculation are described
in Sects. 2 and 3. In Sect. 4, we give references for some of the other special
techniques. The fluid (blood) and structure (blood vessel wall) properties and boundary
conditions are given in Sect. 5. We present the computations in Sects. 6 and 7, and
our concluding remarks in Sect. 8.
Fig. 3 Flat stent with the periphery of the interior-boundary geometry (top) and stent mesh (bottom)
Fig. 4 Aneurysm (left) and parent (right) artery segments, separated by the stent
This section is from [49]. Mesh generation of the cerebral artery with aneurysm and
stent requires numerous steps that include taking the flat-stent design and lumen
geometry and generating a fluid volume mesh representative of a stented artery with
aneurysm. We begin by mapping the flat stent to the deformed stent, which fits
across the neck of the aneurysm. The artery is separated into two segments, parent
and aneurysm. Layers of refined mesh are generated at the stent and arterial walls in
both segments. After the remaining volume mesh in each segment is generated, the
two segments are merged on the interior-boundary mesh containing the stent.
1. Prepare the lumen geometry and flat-stent model as shown in Fig. 1. We extract
the arterial surface geometry from medical images and generate a lumen geom-
etry reflective of the inflated arterial-wall structure through the process reported
in [45]. The flat-stent model was generated using the geometry of a Cordis Pre-
cise Pro Rx nitinol self-expanding stent (PC0630RXC) with a wire diameter of
about 0.1 mm.
2. Generate a NURBS surface slightly larger than the artery such that the surface
intersects the lumen geometry as shown in Fig. 2. We swept a NURBS sur-
face following the curvature of the parent artery and extending slightly beyond
the aneurysm neck. To simplify the mesh generation process, we only model the
portion of the stent crossing the neck of the aneurysm. The intersection of the
NURBS surface and lumen geometry is the periphery of the interior boundary
containing the stent.
3. Map the periphery of the interior boundary, described above, to the flat stent
and mesh that as shown in Fig. 3. We generate a triangular surface mesh using
ANSYS ICEM CFD meshing software (ICEM) and the geometry defined
by the flat stent and interior-boundary periphery. The maximum element size
specified in ICEM mesh generation leads to the width of the stent wire being
meshed with 3–4 elements. This ensures sufficient refinement to resolve the flow
on the stent. The flat-stent mesh is then mapped from the flat NURBS surface
to the deformed NURBS surface to form the interior-boundary mesh positioned
across the neck of the aneurysm.
4. Use the periphery of the interior-boundary mesh as a predefined set of element
edges, splitting the lumen geometry into parent and aneurysm segments as shown
in Fig. 4. This reduces complexity in mesh generation. We use ICEM to generate
the triangular surface meshes on the parent and aneurysm segments.
5. Using the surface meshes for the parent and aneurysm segments, we generate
layers of refined mesh on either side of the stent and near the arterial walls. We
use the process reported in [45] to generate the layers in the parent segment. In
generating the layers in the aneurysm segment, we first start by separating the
surface mesh into different regions as shown in Fig. 5. Due to the sharp angle of
the geometry, no layers are explicitly generated in the red region. We specify a
uniform thickness for the layers of refined mesh in the blue regions. The thickness
of the first layer is approximately equal to the first layer of refined mesh in the
parent segment. There are a total of four layers, each increasing in thickness
using a progression ratio of 1.75 (the same number of layers and progression
ratio used in [45]). To prevent element tangling, Laplace's equation is solved
over the green region of the surface mesh to determine the thickness growth from
essentially zero at the boundary with the red region to the desired layer thickness
at the blue region boundary. We generate each of the four layers in the aneurysm
segment separately and merge the layers (see Remark 2).
6. The rest of the fluid volume mesh is generated using ICEM. The innermost
surface of the layers of refined mesh is extracted from the volume mesh and
used as the surface mesh for generating the volume mesh in both the parent and
aneurysm segments. The inner volume mesh is then merged to the refined layers.
7. The parent and aneurysm fluid volume mesh segments are merged on the interior-
boundary mesh containing the stent. For the no-stent cases, all nodes are merged
on that interior-boundary mesh. For the single- and double-stent cases, the nodes
on the stent portion of the interior-boundary mesh are not merged but are instead
colocated.
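The layer construction in step 5 can be illustrated with a small sketch. It is not the authors' implementation: the function names and the first-layer thickness are hypothetical, and the 1D Gauss-Seidel solve is only a stand-in for solving Laplace's equation over the green surface region. It shows the geometric growth of four layers with a progression ratio of 1.75, and a blending weight that rises from zero (red-region boundary) to one (blue-region boundary).

```python
def layer_thicknesses(first_thickness, ratio=1.75, n_layers=4):
    """Thickness of each refined layer, growing geometrically from the wall."""
    return [first_thickness * ratio**i for i in range(n_layers)]


def harmonic_ramp(n_nodes, sweeps=5000):
    """Gauss-Seidel solution of the 1D Laplace equation on a line of nodes.

    Boundary values are 0 at the first node and 1 at the last node; this is
    a 1D stand-in (an assumption) for the surface-mesh solve in the text.
    """
    w = [0.0] * n_nodes
    w[-1] = 1.0
    for _ in range(sweeps):
        for i in range(1, n_nodes - 1):
            w[i] = 0.5 * (w[i - 1] + w[i + 1])
    return w


# Hypothetical first-layer thickness (arbitrary length unit).
layers = layer_thicknesses(1.0)
weights = harmonic_ramp(5)
# Local thickness of the first layer across the transition region.
blended_first_layer = [wgt * layers[0] for wgt in weights]
```

In 1D the harmonic weight is simply a linear ramp; on a surface mesh the same idea gives a smooth transition in layer thickness, which is the property the text relies on to avoid tangled elements.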
Remark 1 We generate the double stent by overlaying two single flat-stent geome-
tries and translating one of them in two directions. We map the intersection of the
deformed NURBS surface and lumen geometry, which is again the periphery of the
interior boundary, to the flat double-stent geometry and mesh the double stent as
one mesh. The double-stent mesh is treated the same as the single-stent mesh in the
remaining mesh generation steps. Figure 6 shows the full single and double stents.
Remark 2 The mesh generation process for the layers of refined mesh in the
aneurysm segment presents challenges regarding tangled elements. With the mesh
refinement required by the problem, building the layers into the artery has the poten-
tial to create elements with negative Jacobians. Each layer must be checked for the
Jacobian values before generating the next layer.
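A minimal sketch of the per-layer check mentioned in Remark 2 is given below, assuming linear tetrahedral elements and a node ordering for which a valid element has a positive Jacobian determinant. The function names and the data layout (a list of node coordinates and a list of four-node connectivities) are hypothetical and are not taken from the chapter.

```python
def tet_jacobian(p0, p1, p2, p3):
    """Jacobian determinant of a linear tetrahedron: the scalar triple
    product of the three edge vectors emanating from node p0."""
    a = [p1[i] - p0[i] for i in range(3)]
    b = [p2[i] - p0[i] for i in range(3)]
    c = [p3[i] - p0[i] for i in range(3)]
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def tangled_elements(nodes, tets):
    """Indices of elements whose Jacobian determinant is not positive."""
    bad = []
    for e, (i, j, k, l) in enumerate(tets):
        if tet_jacobian(nodes[i], nodes[j], nodes[k], nodes[l]) <= 0.0:
            bad.append(e)
    return bad


# Usage: the second element repeats the first with two nodes swapped,
# which flips its orientation and makes it count as tangled.
nodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tets = [(0, 1, 2, 3), (0, 2, 1, 3)]
bad = tangled_elements(nodes, tets)
```

Running such a check after each layer is generated, and before the next one is built on top of it, matches the order of operations described in the remark.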
where
$$H(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{x} \in V, \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
is the Heaviside function. The solution of Eq. (1) has the interpretation of the total
time that a particle, occupying a spatial position $\mathbf{x}$ at time $t$, spent in the subdomain $V$.
The source term given by the Heaviside function ensures that the time is accumulated
only when the particle is inside the subdomain. In this framework, both $\Omega$ and $V$
may be time-dependent, which is the case here.
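For applications involving flows in stationary or moving domains it is convenient
to re-write Eq. (1) with respect to the Eulerian (or spatial) frame as
$$\frac{\partial \tau}{\partial t} + \mathbf{u} \cdot \frac{\partial \tau}{\partial \mathbf{x}} = H(\mathbf{x}), \tag{3}$$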
Fig. 7 Solution of the 1D model problem illustrating the particle residence time method (horizontal axis: $x$ from 0 to $L$, with $h_1$ and $h_2$ marked; plateau value $(h_2 - h_1)/u$)
The following simple 1D example illustrates how the technique works. Let
$\Omega = (0, L)$, $V = (h_1, h_2) \subset (0, L)$, and let the flow velocity $u$ be a positive
constant (see Fig. 7). In this setting, Eq. (3) reduces to
$$\frac{\partial \tau}{\partial t} + u \frac{\partial \tau}{\partial x} = H(x). \tag{5}$$
Because the flow is from left to right and the region $V$ is located downstream of
the inlet, particles at the inlet could not have spent any time inside $V$. As a result,
we set $\tau = 0$ at $x = 0$. Since this is a pure advection problem, the boundary condition
at the outlet is left unspecified. Equation (5) has a steady-state solution:
$$\tau(x) = \begin{cases} 0 & \text{if } x \in [0, h_1], \\ \dfrac{x - h_1}{u} & \text{if } x \in [h_1, h_2], \\ \dfrac{h_2 - h_1}{u} & \text{if } x \in [h_2, L]. \end{cases} \tag{6}$$
Equation (6) implies that, once the transient response settles, prior to the interval
of interest, the residence time is zero. As particles enter the interval of interest moving
at constant speed the residence time is proportional to the distance from the leftmost
edge of the interval, and inversely proportional to flow speed. When particles exit
the interval of interest, the residence time stays constant. The analytical solution of
the differential equation given by Eq. (6) is in complete agreement with the expected
particle residence time for this simple case.
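The 1D model problem can also be checked numerically. The sketch below is an illustration only, not the finite element method used in the chapter: it marches the advection equation of Eq. (5) to steady state with a first-order upwind scheme and compares the result with the piecewise-linear solution of Eq. (6). The symbol `tau` for the residence time, the function names, and all parameter values are assumptions chosen for the example.

```python
def exact_tau(x, h1, h2, u):
    """Steady-state residence time of the 1D model problem (Eq. (6))."""
    if x <= h1:
        return 0.0
    if x <= h2:
        return (x - h1) / u
    return (h2 - h1) / u


def residence_time_1d(L=1.0, h1=0.3, h2=0.7, u=2.0, n_cells=200, cfl=0.5):
    """March d(tau)/dt + u d(tau)/dx = H(x) to steady state with upwinding."""
    dx = L / n_cells
    dt = cfl * dx / u
    x = [i * dx for i in range(n_cells + 1)]
    # Source term: 1 inside the region of interest, 0 elsewhere.
    source = [1.0 if h1 < xi <= h2 else 0.0 for xi in x]
    tau = [0.0] * (n_cells + 1)
    # Two flow-through times are ample for the transient to leave the domain.
    n_steps = int(round(2.0 * L / u / dt))
    for _ in range(n_steps):
        new = tau[:]
        for i in range(1, n_cells + 1):
            new[i] = tau[i] - cfl * (tau[i] - tau[i - 1]) + dt * source[i]
        new[0] = 0.0  # inlet condition: tau = 0 at x = 0
        tau = new
    return x, tau


x, tau = residence_time_1d()
max_error = max(abs(t - exact_tau(xi, 0.3, 0.7, 2.0)) for xi, t in zip(x, tau))
```

No condition is imposed at the outlet node, in line with the observation that the outflow boundary condition is left unspecified for pure advection.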
Remark 3 Besides illustrating how the method works, this example also shows that
the proposed technique for calculating particle residence time is meaningful when
the solution for $\tau$ reaches a steady state. A time-periodic solution for $\tau$, which is
often the case in cardiovascular applications, also presents a situation from which
meaningful conclusions may be drawn.
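Two scalar measures of residence time proposed in [51, 52] are:
$$RT_1 = \frac{1}{T|V|} \int_T \int_\Omega H(\mathbf{x})\, \tau(\mathbf{x}, t)\, d\Omega\, dt, \tag{7}$$
where
$$|V| = \frac{1}{T} \int_T \int_\Omega H(\mathbf{x})\, d\Omega\, dt \tag{8}$$
where
$$|Q| = \frac{1}{T} \int_T \int_{\partial V^{+}} (\mathbf{u} - \mathbf{v}) \cdot \mathbf{n}\; d\partial V\, dt \tag{10}$$
$$RT_2 = \frac{|V|}{|Q|}. \tag{11}$$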
The most attractive feature of this measure of residence time is that it requires no
information about $\tau$.
Remark 4 Applied to the 1D model problem with solution given by Eq. (6), the two
residence time measures produce $RT_1 = \frac{h_2 - h_1}{2u}$ and $RT_2 = \frac{h_2 - h_1}{u}$, which are the
average and maximum particle residence time in $V$, respectively. For such a simple
flow situation it is clear that the maximum residence time occurs at the outflow.
However, in multiple dimensions, in the presence of complex, recirculating flow this
may not be the case. In the case of blood pumps, one can in fact assess the device
efficiency (meaning throughput efficiency) by comparing the residence time at the
outlet with that in the interior of the domain. Residence time maxima occurring in
the interior suggest the presence of persistent flow recirculation or stagnation zones
that tend to trap the material and thus lower the pump efficiency.
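The two values quoted in Remark 4 can be verified by direct substitution. The short derivation below is a worked check under the assumptions of the 1D model problem: the flow is steady, so the time averages are trivial; the subdomain is stationary, so the boundary velocity vanishes; and the residence time is written as $\tau$.

```latex
% 1D model problem: V = (h_1, h_2), stationary subdomain (v = 0), constant speed u > 0
\begin{align*}
|V| &= h_2 - h_1, \qquad
|Q| = u \quad \text{(outflow boundary of $V$ at $x = h_2$, with $n = +1$)},\\
RT_1 &= \frac{1}{|V|}\int_{h_1}^{h_2} \tau(x)\,dx
      = \frac{1}{h_2 - h_1}\int_{h_1}^{h_2} \frac{x - h_1}{u}\,dx
      = \frac{h_2 - h_1}{2u},\\
RT_2 &= \frac{|V|}{|Q|} = \frac{h_2 - h_1}{u}.
\end{align*}
```

The first measure is the mean of the linear ramp in Eq. (6) over the interval, and the second equals the plateau value reached at the outflow end, consistent with the "average" and "maximum" interpretation in the remark.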
An alternative approach to computing residence time is based on the dye injection
technique. We use an advection equation as before, set the source term to zero, and
apply a Dirichlet boundary condition on the scalar field at the inlet such that:
$$\tau = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise}, \end{cases} \tag{12}$$
where $n$ is the cycle number. (Note that $\tau$ is no longer particle residence time, but
rather dye concentration.) This has the effect of injecting a dye for one cycle, which
can be visualized as it moves through the domain. The volume of dye in the domain
at any given time can be written as $\int_V \tau \, dV$. The dye is then ejected over subsequent
cycles, and remaining dye volume is computed at the end of each cycle. The cycles
are repeated until at least 95 % of the dye is removed from the system.
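The stopping criterion can be sketched as a small post-processing step. The function below is a hypothetical helper, not part of the chapter's software: given sampled times and the corresponding dye volumes, it returns the first time, by linear interpolation, at which a chosen fraction of the peak dye volume has left the domain. The sample values in the usage lines are made up for illustration.

```python
def time_to_remove(times, dye_volume, fraction=0.95):
    """First time at which `fraction` of the peak dye volume has been removed.

    Returns None if the threshold is never reached in the given samples.
    """
    peak = max(dye_volume)
    target = (1.0 - fraction) * peak
    i_peak = dye_volume.index(peak)
    for k in range(i_peak + 1, len(times)):
        v0, v1 = dye_volume[k - 1], dye_volume[k]
        if v1 <= target < v0:
            # Linear interpolation between the two bracketing samples.
            w = (v0 - target) / (v0 - v1)
            return times[k - 1] + w * (times[k] - times[k - 1])
    return None


# Made-up samples: dye volume (arbitrary units) at the end of each cycle.
t_95 = time_to_remove([0.0, 1.0, 2.0, 3.0], [100.0, 40.0, 10.0, 0.0])
```

The returned value is an absolute time; subtracting the time of the peak gives the washout time measured from the end of the fill cycle.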
For the interested reader, in this section we provide references to articles where special
techniques related to different components of arterial-geometry construction
and mesh generation can be found. These components include (a) arterial-surface
extraction from medical images [45]; (b) arterial-wall-thickness construction based
on patches [46] and based on the solution of Laplace's equation over the arterial
volume [9] or lumen [46]; (c) fluid mechanics volume mesh generation, including
layers of refined mesh near the arterial walls [46]; (d) calculating the EZP arterial
geometry [46]; (e) calculating the blood vessel tissue prestress [8, 48]; and (f) cal-
culating an estimated element-based zero-stress state for the artery [21]. We also
provide references to articles where some additional special techniques can be found.
These special techniques include (g) a mapping technique for inflow boundaries [14];
(h) boundary condition techniques for inclined inflow and outflow planes [45]; (i) a
scaling technique for specifying realistic inflow rates [47]; (j) techniques [41] for
the projection of fluid–structure interface stresses; (k) a recipe for pre-FSI computations
that improve the convergence of the FSI computations [13]; (l) the SCAFSI
technique [55]; (m) methods for WSS and OSI calculations [41]; and (n) a precon-
ditioning technique [42].
As was done for the computations reported in [5], the blood is assumed to behave
like a Newtonian fluid. The density and kinematic viscosity are set to 1000 kg/m$^3$
and $4.0 \times 10^{-6}$ m$^2$/s. The material density of the arterial wall is known to be close
to that of the blood and is therefore set to 1000 kg/m$^3$. The arterial wall is modeled
with the continuum element made of hyperelastic (Fung) material. The Fung material
constants $D_1$ and $D_2$ (from [56]) are $2.6447 \times 10^{3}$ N/m$^2$ and 8.365, and the penalty
Poisson's ratio is 0.45. Cerebral arteries are surrounded by cerebrospinal fluid, and
we expect that to have a damping effect on the arterial-wall dynamics. Therefore we
add a mass-proportional damping, which also helps in removing the high-frequency
modes of the structural deformation. The damping coefficient is chosen in such a
way that the structural mechanics computations remain stable at the time-step size
used. It is $1.5 \times 10^{4}$ s$^{-1}$.
On the arterial walls, we specify no-slip boundary conditions for the flow. In the
structural mechanics part, as the boundary condition at the ends of the arteries, we set
the normal component of the displacement to zero, and for one of those nodes we
also set to zero the tangential displacement component that needs to be specified to
preclude rigid-body motion. At the inflow boundary, we specify the velocity profile
as a function of time, using the technique introduced in [14]. We use two types of
conditions at the outflow boundaries. In the explicit version, at all outflow boundaries
of an artery segment we specify the same traction boundary condition. The traction
boundary condition is based on a pressure profile computed as described in [14]. In
the implicit version, we consider a class of outflow boundary conditions in which
the outlet traction is a function of the flow rate there (see [57] for details).
Remark 5 In the current TAFSM computations, the volumetric flow rate at the
inflow (calculated based on a velocity waveform representing the cross-sectional
maximum velocity) is scaled by a factor. The factor is determined in such a way that
the scaled flow rate, when averaged over the cardiac cycle, yields a target WSS for
Poiseuille flow over an equivalent cross-sectional area. The target WSS is 10 dyn/cm2
in the current computations. This technique was introduced in [46, 47].
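The scaling in Remark 5 can be sketched with the standard Poiseuille relation between wall shear stress and flow rate in a circular tube, $\mathrm{WSS} = 4 \mu Q / (\pi R^3)$, with $R$ taken from the equivalent cross-sectional area. This is a minimal illustration under stated assumptions, not the procedure of [46, 47]: the function names are hypothetical, the flow-rate samples are assumed to be uniformly spaced over one cardiac cycle, and the inlet area and waveform in the usage lines are made up.

```python
import math


def poiseuille_flow_rate_for_wss(target_wss, area, mu):
    """Flow rate giving `target_wss` in Poiseuille flow through a circular
    tube whose cross-sectional area equals `area` (consistent units)."""
    radius = math.sqrt(area / math.pi)
    return target_wss * math.pi * radius**3 / (4.0 * mu)


def flow_rate_scale_factor(flow_samples, target_wss, area, mu):
    """Factor that scales the cycle-averaged flow rate to the target value."""
    mean_q = sum(flow_samples) / len(flow_samples)
    return poiseuille_flow_rate_for_wss(target_wss, area, mu) / mean_q


# SI units. Dynamic viscosity from the stated density and kinematic viscosity.
mu_blood = 1000.0 * 4.0e-6       # Pa s
target = 1.0                     # 10 dyn/cm^2 expressed in Pa
inlet_area = 5.0e-6              # m^2, hypothetical
waveform = [1.0e-6, 3.0e-6, 2.0e-6, 1.0e-6]   # m^3/s, hypothetical samples
factor = flow_rate_scale_factor(waveform, target, inlet_area, mu_blood)
```

Multiplying every sample of the inflow waveform by `factor` leaves its shape unchanged while matching the cycle-averaged target.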
All computations were carried out in a parallel computing environment. In FSI mod-
eling, the fully-discretized, coupled fluid and structural mechanics and mesh-moving
equations are solved with the quasi-direct coupling technique (see Sect. 5.2 in [4]),
and the computations were completed without any remeshing. In solving the linear
equation systems involved at every nonlinear iteration, the GMRES search tech-
nique [58] is used with a diagonal preconditioner.
Fig. 8 Model-M6Acom. EZP shrinking amount over the surface (lumen) extracted from the medical
image (left), wall thickness over the shrunk lumen (middle), and structure mesh at zero pressure
(right). The color range represents a value range that increases from light to dark
Fig. 9 Model-M6Acom. Fluid mechanics mesh at the lumen and outflow planes (left), thickness
of the first layer of elements near the arterial wall (middle), and the mesh at the inflow plane (right).
All pictures are from the starting point of our computation cycle. The color range represents a value
range that increases from light to dark
Remarks 21 and 22 in [46]). The time-step size is $3.333 \times 10^{-3}$ s. The number of
nonlinear iterations per time step is 6. The number of GMRES iterations per nonlinear
iteration for the fluid + structure block was chosen such that mass balance is satisfied
to within at most 5 % for each case. The number of GMRES iterations is 300, and
this was sufficient for obtaining good mass balance. For all six nonlinear iterations
the fluid scale is 1.0 and the structure scale is 100. For the mesh moving block the
number of GMRES iterations is 30. Figure 10 shows the WSS when the volumetric
flow rate is maximum. Figure 11 shows the OSI, calculated with the technique that
excludes rigid-body rotations from the calculation (see [17, 41, 46, 54]).
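For orientation, the commonly used definition of the OSI at a wall point is $\mathrm{OSI} = \frac{1}{2}\left(1 - \left\| \int_0^T \mathbf{w}\, dt \right\| / \int_0^T \| \mathbf{w} \|\, dt \right)$, where $\mathbf{w}$ denotes the WSS vector. The sketch below implements only this standard definition for uniformly spaced samples over one cycle; it does not include the technique for excluding rigid-body rotations referenced above, and the function name and sample values are hypothetical.

```python
import math


def oscillatory_shear_index(wss_vectors):
    """OSI from WSS vectors sampled uniformly in time over one cycle.

    Returns 0 for unidirectional shear and 0.5 for fully reversing shear.
    """
    n = len(wss_vectors)
    mean_vec = [sum(v[i] for v in wss_vectors) / n for i in range(3)]
    mean_mag = sum(math.sqrt(sum(c * c for c in v)) for v in wss_vectors) / n
    if mean_mag == 0.0:
        return 0.0  # no shear at all: report zero by convention here
    norm_of_mean = math.sqrt(sum(c * c for c in mean_vec))
    return 0.5 * (1.0 - norm_of_mean / mean_mag)


# Usage with made-up samples.
osi_steady = oscillatory_shear_index([(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
osi_reversing = oscillatory_shear_index([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
```

With uniform sampling the time-step size cancels between numerator and denominator, so plain averages are sufficient.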
This subsection is from [49]. Endovascular stent placement across the neck of an
intracranial aneurysm can lead to aneurysm occlusion and thrombosis. We compare
the flow field of arterial geometries before and after virtual stenting to assess the
changes. Select aneurysms require treatment using two or more stents to sufficiently
alter the flow field allowing for thrombosis. The test computations include a before-
stenting case and after-stenting cases for both single- and double-stent treatments
to compare the effectiveness of stenting with multiple stents. Section 6.2.1 details
the parameters for the arterial geometry used in the computations. In Sect. 6.2.2 we
compare hemodynamic values before and after stenting.
Fig. 12 Arterial lumen geometry obtained from voxel data (left) and the fluid mechanics mesh for
the single-stent case, with cross-section and inflow plane views
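The ST-VMS method is used, with the stabilization parameters as given by Eq. (7)
in [4] for $\tau_{\mathrm{M}}$ ($= \tau_{\mathrm{SUPS}}$) $= \tau_{\mathrm{SUPG}}$ and Eq. (37) in [59] for $\nu_{\mathrm{C}}$ ($= \nu_{\mathrm{LSIC}}$) $= \nu_{\mathrm{HRGN}}$. The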
time step size is $3.333 \times 10^{-3}$ s. The number of nonlinear iterations per time step
is 4, and the number of GMRES iterations per nonlinear iteration for the no-stent,
single-stent and double-stent cases is 1,000, 1,500 and 1,500, respectively.
[Figure: velocity color scale, 0 to 10 cm/s]
[Figure: vorticity (s$^{-1}$) versus time (s) for the no-stent, single-stent and double-stent cases]
peak blood flow into and within the aneurysm occurs approximately 0.02 s before
peak inflow rate in the parent artery. The time-averaged $Q_A/Q_P$ decreases by 22 and 78 %
in the single- and double-stent cases. Similarly, the kinetic energy averaged in space
and time decreases by 72 % in the single-stent case and 92 % in the double-stent case.
The reduction in vorticity in the aneurysm caused by stenting is shown in Figs. 14
and 15. The vorticity, averaged in space and time, is reduced by 47 and 72 % in the
single- and double-stent cases.
[Figure: vorticity color scale, 1 to 700 s$^{-1}$]
Computations were carried out in a parallel computing environment. The FSI equations
are advanced in time using the generalized-$\alpha$ time integrator proposed in [62]
for structural dynamics, and developed for fluid mechanics in [63] and FSI in [24].
A quasi-direct solution strategy is used, where the increments of the fluid and struc-
tural mechanics variables are obtained simultaneously [3, 4, 17, 64]. A Jacobian-
based mesh stiffening technique [17, 32, 65, 66] is used to move the fluid mechanics
mesh. The effect of the mesh motion on the fluid equations is omitted from the tan-
gent matrix for efficiency, as advocated in [57] for cardiovascular FSI applications. In
solving the linear equation systems involved at every nonlinear iteration, the GMRES
search technique is used with a block-diagonal preconditioner.
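To make the time integrator concrete, the sketch below applies the generalized-α method for first-order systems to the scalar test equation $\dot{y} = \lambda y$. It is an illustration only, far simpler than the coupled FSI system. The parameter choice $\alpha_m = \frac{1}{2}(3 - \rho_\infty)/(1 + \rho_\infty)$, $\alpha_f = 1/(1 + \rho_\infty)$, $\gamma = \frac{1}{2} + \alpha_m - \alpha_f$ is the one commonly associated with the first-order formulation; the chapter does not state the values it uses, so $\rho_\infty = 0.5$ and the function names here are assumptions.

```python
def generalized_alpha_params(rho_inf):
    """Parameters of the generalized-alpha method for first-order systems,
    expressed through the high-frequency spectral radius rho_inf."""
    alpha_m = 0.5 * (3.0 - rho_inf) / (1.0 + rho_inf)
    alpha_f = 1.0 / (1.0 + rho_inf)
    gamma = 0.5 + alpha_m - alpha_f
    return alpha_m, alpha_f, gamma


def integrate_linear_ode(lam, y0, dt, n_steps, rho_inf=0.5):
    """Advance dy/dt = lam * y with the generalized-alpha method."""
    alpha_m, alpha_f, gamma = generalized_alpha_params(rho_inf)
    y = y0
    ydot = lam * y0  # consistent initial rate
    for _ in range(n_steps):
        # Enforce ydot(n+alpha_m) = lam * y(n+alpha_f) and solve for the
        # new rate; the problem is linear, so no Newton iteration is needed.
        rhs = (lam * (y + alpha_f * dt * (1.0 - gamma) * ydot)
               - (1.0 - alpha_m) * ydot)
        ydot_new = rhs / (alpha_m - lam * alpha_f * gamma * dt)
        y = y + dt * ((1.0 - gamma) * ydot + gamma * ydot_new)
        ydot = ydot_new
    return y


# Usage: decay problem integrated to t = 1 with 100 steps.
y_end = integrate_linear_ode(-1.0, 1.0, 0.01, 100)
```

With $\rho_\infty = 1$ the update reduces to the trapezoidal rule, while smaller values add damping of high-frequency modes, which is the usual motivation for the method in structural dynamics and FSI.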
Fig. 16 Tetrahedral finite element mesh of the middle cerebral artery (MCA) bifurcation with
aneurysm. Both patient-specific models are discretized using approximately 165K elements and
30K nodes. Inlet branches are labeled M1 and outlet branches are labeled M2 for both models. The
arrows point in the direction of inflow velocity. The inlet cross-sectional areas of Models 1 and 2 are
$4.962 \times 10^{-2}$ and $2.102 \times 10^{-2}$ cm$^2$, respectively. a Model 1; b Model 2
Fig. 17 Final prestressed state for Models 1 and 2. The models are colored by the isocontours
of wall tension, which is defined as the absolute value of the first principal in-plane stress of $S_0$.
a Model 1; b Model 2
are colored by the isocontours of wall tension, which is defined as the absolute value
of the first principal in-plane stress.
To assess the influence of the prestress, we perform a coupled FSI simulation of
both models and compare the results with and without prestress. Figure 18 shows
the relative wall displacement between the deformed configuration and reference
configuration coming from imaging data. The deformed configuration corresponds
to the time instant when the fluid traction vector is closest to the averaged traction
vector used for the prestress problem. Almost no difference between the reference
and deformed configurations is seen in the case of the prestressed-artery simulation,
as expected. However, in the case of non-prestressed simulation, the differences
between the two configurations are significant. This indicates that the FSI problem
is not being solved on the correct geometry. Furthermore, the relative geometry
error is larger for Model 2, which has a larger aneurysm dome and a thinner wall.
Figure 19 shows the relative wall displacement between the deformed configurations
at peak systole and low diastole. In both the prestressed and non-prestressed cases the
relative displacement is fairly small, yet non-negligible. The non-prestressed case,
however, makes use of a geometry that is significantly more inflated compared
to the prestressed case and the imaging data.
Fig. 18 Relative wall displacement between the deformed configuration and reference configuration
coming from imaging data. Top Model 1; Bottom Model 2. The deformed configuration
corresponds to the time instant when the fluid traction vector is closest to the averaged traction
vector used for the prestress computation. a With prestress; b Without prestress
Fig. 19 Relative wall displacement between the deformed configuration at peak systole and low
diastole. Top Model 1; Bottom Model 2. a With prestress; b Without prestress
Fig. 20 Volume-rendered blood flow velocity magnitude near peak systole. Top Model 1; Bottom
Model 2. a With prestress; b Without prestress
Fig. 21 Wall shear stress near peak systole. Top Model 1; Bottom Model 2. a With prestress;
b Without prestress
Figure 20 shows a comparison of the blood flow speed near peak systole for the
simulations with and without prestress. The results between the two cases are very
similar, although some differences in the flow structures are visible, especially for
Model 2. Figure 21 shows a comparison of the WSS near peak systole for both cases.
The WSS, unlike blood flow velocity, exhibits significant differences in magnitude
and spatial distribution. The simulations presented clearly show the importance of
tissue prestress in patient-specific vascular FSI modeling for accurate prediction of
hemodynamic phenomena and vessel wall mechanics.
Fig. 22 PVAD geometry including blood and air chambers as well as inlet and outlet boundaries.
Darker and lighter shades are used to denote the blood and air chambers, respectively
Fig. 23 PVAD flow solution (left), residence time (middle), and membrane deformation (right) at
several instants during the cycle: top to bottom, t = 0.15 s and t = 0.525 s
Fig. 24 Plot of the spatially averaged residence time as a function of time (s). The fill stage is
given by $t \in [0, 0.45]$, and the ejection stage by $t \in (0.45, 0.75]$
This suggests that the old material does not accumulate in the interior of the device,
and is ejected by the pump in a fairly efficient manner.
The time history of the spatial average of the blood chamber residence time
is shown in Fig. 24. The average residence time rises uniformly during the ejection
stage. This is because no new material is entering the blood chamber and the average
is expected to rise with time. The fill stage shows a brief and rapid decrease in the average, as
new material with $\tau = 0$ enters the blood chamber. The trend then reverses, and the average
begins to increase again in the later part of the fill stage as the influx of new material
slows and the blood chamber volume grows. Using Eqs. (7) and (11), we find that
$RT_1 = 0.893$ s and $RT_2 = 1.031$ s for this device, meaning the blood particles
remain in the chamber on average for about 1 s. Note that the difference between
$RT_1$ and $RT_2$ is very minor, suggesting that the bulk of the residence time comes
from the particles circulating in the chamber rather than directly traversing the length
of the device from the inlet to outlet.
Fig. 25 Plot of the percentage of dye remaining in the blood chamber versus time (s) for the 73 mL device, with the 5 % cutoff marked
Results from the dye injection analysis can be seen in Fig. 25. The figure shows
the percentage of dye remaining in the blood chamber after the chamber is filled for
one cycle. Note that in the first cycle over 50 % of the dye is removed. Furthermore,
it takes 2.47 s to remove 95 % of the dye subsequent to the initial fill cycle.
8 Concluding Remarks
References
21. Takizawa K, Takagi H, Tezduyar TE, Torii R (2013) Estimation of element-based zero-stress state for arterial FSI computations. Comput Mech. doi:10.1007/s00466-013-0919-7 (published online)
22. Takizawa K, Tezduyar TE, Buscher A, Asada S (2013) Space–time interface-tracking with topology change (ST-TC). Comput Mech. doi:10.1007/s00466-013-0935-7 (published online)
23. Hughes TJR, Liu WK, Zimmermann TK (1981) Lagrangian–Eulerian finite element formulation for incompressible viscous flows. Comput Meth Appl Mech Eng 29:329–349
24. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid–structure interaction: theory, algorithms, and computations. Comput Mech 43:3–37
25. Tezduyar TE (1992) Stabilized finite element formulations for incompressible flow computations. Adv Appl Mech 28:1–44. doi:10.1016/S0065-2156(08)70153-4
26. Tezduyar TE (2003) Computation of moving boundaries and interfaces and stabilization parameters. Int J Numer Meth Fluids 43:555–575. doi:10.1002/fld.505
27. Brooks AN, Hughes TJR (1982) Streamline upwind/Petrov–Galerkin formulations for convection dominated flows with particular emphasis on the incompressible Navier–Stokes equations. Comput Meth Appl Mech Eng 32:199–259
28. Tezduyar TE, Mittal S, Ray SE, Shih R (1992) Incompressible flow computations with stabilized bilinear and linear equal-order-interpolation velocity-pressure elements. Comput Meth Appl Mech Eng 95:221–242. doi:10.1016/0045-7825(92)90141-6
29. Hughes TJR, Franca LP, Balestra M (1986) A new finite element formulation for computational fluid dynamics: V. Circumventing the Babuška–Brezzi condition: a stable Petrov–Galerkin formulation of the Stokes problem accommodating equal-order interpolations. Comput Meth Appl Mech Eng 59:85–99
30. Hughes TJR, Hulbert GM (1988) Space–time finite element methods for elastodynamics: formulations and error estimates. Comput Meth Appl Mech Eng 66:339–363
31. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2011) Influencing factors in image-based fluid–structure interaction computation of cerebral aneurysms. Int J Numer Meth Fluids 65:324–340. doi:10.1002/fld.2448
32. Tezduyar T, Aliabadi S, Behr M, Johnson A, Mittal S (1993) Parallel finite-element computation of 3D flows. Computer 26:27–36. doi:10.1109/2.237441
33. Tezduyar TE (2004) Finite element methods for fluid dynamics with moving boundaries and interfaces. In: Stein E, Borst RD, Hughes TJR (eds) Encyclopedia of computational mechanics. Fluids, vol 3. Wiley, Hoboken (Chapter 17)
34. Takizawa K, Tezduyar TE (2012) Space–time fluid–structure interaction methods. Math Models Meth Appl Sci 22:1230001. doi:10.1142/S0218202512300013
35. Takizawa K, Tezduyar TE (2011) Multiscale space–time fluid–structure interaction techniques. Comput Mech 48:247–267. doi:10.1007/s00466-011-0571-z
36. Hughes TJR (1995) Multiscale phenomena: Green's functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles, and the origins of stabilized methods. Comput Meth Appl Mech Eng 127:387–401
37. Hughes TJR, Oberai AA, Mazzei L (2001) Large eddy simulation of turbulent channel flows by the variational multiscale method. Phys Fluids 13:1784–1799
38. Bazilevs Y, Calo VM, Cottrell JA, Hughes TJR, Reali A, Scovazzi G (2007) Variational multiscale residual-based turbulence modeling for large eddy simulation of incompressible flows. Comput Meth Appl Mech Eng 197:173–201
39. Bazilevs Y, Akkerman I (2010) Large eddy simulation of turbulent Taylor–Couette flow using isogeometric analysis and the residual-based variational multiscale method. J Comput Phys 229:3402–3414
40. Tezduyar TE, Cragin T, Sathe S, Nanna B (2007) FSI computations in arterial fluid mechanics with estimated zero-pressure arterial geometry. In: Onate E, Garcia J, Bergan P, Kvamsdal T (eds) Marine 2007. CIMNE, Barcelona, Spain
41. Takizawa K, Moorman C, Wright S, Christopher J, Tezduyar TE (2010) Wall shear stress calculations in space–time finite element computation of arterial fluid–structure interactions. Comput Mech 46:31–41. doi:10.1007/s00466-009-0425-0
62. Chung J, Hulbert GM (1993) A time integration algorithm for structural dynamics with improved numerical dissipation: the generalized-α method. J Appl Mech 60:371–375
63. Jansen KE, Whiting CH, Hulbert GM (2000) A generalized-α method for integrating the filtered Navier–Stokes equations with a stabilized finite element method. Comput Meth Appl Mech Eng 190:305–319
64. Tezduyar TE, Sathe S, Keedy R, Stein K (2004) Space–time techniques for finite element computation of flows with moving boundaries and interfaces. In: Gallegos S, Herrera I, Botello S, Zarate F, Ayala G (eds) Proceedings of the III international congress on numerical methods in engineering and applied science. CD-ROM, Monterrey, Mexico
65. Tezduyar TE, Behr M, Mittal S, Johnson AA (1992) Computation of unsteady incompressible flows with the finite element methods – space–time formulations, iterative strategies and massively parallel implementations. In: New methods in transient analysis, PVP-Vol. 246/AMD, vol 143. ASME, New York, pp 7–24
66. Johnson AA, Tezduyar TE (1994) Mesh update strategies in parallel finite element computations of flow problems with moving boundaries and interfaces. Comput Meth Appl Mech Eng 119:73–94. doi:10.1016/0045-7825(94)00077-8
67. Isaksen JG, Bazilevs Y, Kvamsdal T, Zhang Y, Kaspersen JH, Waterloo K, Romner B, Ingebrigtsen T (2008) Determination of wall tension in cerebral artery aneurysms by numerical simulation. Stroke 39:3172–3178
68. Zhang Y, Wang W, Liang X, Bazilevs Y, Hsu M-C, Kvamsdal T, Brekken R, Isaksen J (2009) High-fidelity tetrahedral mesh generation from medical imaging data for fluid–structure interaction analysis of cerebral aneurysms. Comput Model Eng Sci 42:131–150
69. Bazilevs Y, Hsu M-C, Zhang Y, Wang W, Liang X, Kvamsdal T, Brekken R, Isaksen J (2010) A fully-coupled fluid–structure interaction simulation of cerebral aneurysms. Comput Mech 46:3–16
70. Bazilevs Y, Hsu M-C, Takizawa K, Tezduyar TE (2012) ALE-VMS and ST-VMS methods for computer modeling of wind-turbine rotor aerodynamics and fluid–structure interaction. Math Models Meth Appl Sci 22:1230002. doi:10.1142/S0218202512300025
71. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: CAD, finite elements, NURBS, exact geometry, and mesh refinement. Comput Meth Appl Mech Eng 194:4135–4195
72. Cottrell JA, Hughes TJR, Bazilevs Y (2009) Isogeometric analysis. Wiley, Toward Integration
of CAD and FEA
73. Kiendl J, Bletzinger K-U, Linhard J, Wchner R (2009) Isogeometric shell analysis with
KirchhoffLove elements. Comput Meth Appl Mech Eng 198:39023914
74. Kiendl J, Bazilevs Y, Hsu M-C, Wchner R, Bletzinger K-U (2010) The bending strip method
for isogeometric analysis of KirchhoffLove shell structures comprised of multiple patches.
Comput Meth Appl Mech Eng 199:24032416
75. Bazilevs Y, Hsu M-C, Kiendl J, Wchner R, Bletzinger K-U (2011) 3D simulation of wind
turbine rotors at full scale. Part II: Fluidstructure interaction modeling with composite blades.
Int J Numer Meth Fluids 65:236253
76. Benson DJ, Bazilevs Y, De Luycker E, Hsu M-C, Scott M, Hughes TJR, Belytschko T (2010)
A generalized finite element formulation for arbitrary basis functions: from isogeometric analy-
sis to xfem. Int J Numer Meth Eng 83:765785
77. Benson DJ, Bazilevs Y, Hsu M-C, Hughes TJR (2011) A large deformation, rotation-free,
isogeometric shell. Comput Meth Appl Mech Eng 200:13671378
78. Tezduyar TE, Sathe S, Pausewang J, Schwaab M, Christopher J, Crabtree J (2008) Interface
projection techniques for fluidstructure interaction modeling with moving-mesh methods.
Comput Mech 43:3949. doi:10.1007/s00466-008-0261-7
79. Tezduyar TE, Takizawa K, Moorman C, Wright S, Christopher J (2010) Spacetime finite ele-
ment computation of complex fluidstructure interactions. Int J Numer Meth Fluids 64:1201
1218. doi:10.1002/fld.2221
80. Bazilevs Y, Hsu M-C, Scott MA (2012) Isogeometric fluidstructure interaction analysis with
emphasis on non-matching discretizations, and with application to wind turbines. Comput Meth
Appl Mech Eng 249252:2841
102 K. Takizawa et al.
81. Vignon-Clementel IE, Figueroa CA, Jansen KE, Taylor CA (2006) Outflow boundary
conditions for three-dimensional finite element modeling of blood flow and pressure in arteries.
Comput Meth Appl Mech Eng 195:37763796
82. Moghadam ME, Bazilevs Y, Hsia T-Y, Vignon-Clementel IE, Marsden AL (2011) The modeling
of congenital hearts alliance (MOCHA), a comparison of outlet boundary treatments for
prevention of backflow divergence with relevance to blood flow simulations. Comput Mech
48:277291 doi:10.1007/s00466-011-0599-0
83. Tezduyar TE, Senga M (2006) Stabilization and shock-capturing parameters in SUPG formu-
lation of compressible flows. Comput Meth Appl Mech Eng 195:16211632. doi:10.1016/j.
cma.2005.05.032
84. Bazilevs Y, Calo VM, Tezduyar TE, Hughes TJR (2007) YZ discontinuity-capturing for
advection-dominated processes with application to arterial drug delivery. Int J Numer Meth
Fluids 54:593608. doi:10.1002/fld.1484
Part III
Particle Methods in Coupled Problems
Direct Numerical Simulation of Particulate Flows Using a Fictitious Domain Method
1 Introduction
Particulate flows are of great importance in many industrial sectors, e.g., in the
medical, process and chemical industries, as well as in geotechnical engineering
and bioengineering. Examples include fluidized beds, sedimentation, fluvial erosion,
sand production in oil wells, dust collection devices and aerosol transport in human
respiratory airways. The characteristics involving particulate flows are up to now
efficient and sophisticated mesh motion and re-meshing algorithms. The first 3D
DNS computations of particulate flows belonging to this category were presented in
Johnson and Tezduyar [23], which can be considered a pioneering work in this area
(see also Johnson and Tezduyar [24–26]). In these articles, the authors propose the
deformable-spatial-domain/stabilized space-time (DSD/SST) finite element method
(FEM) for the treatment of the fluid field. Further milestone contributions
related to group (i) were published by Hu et al. [19–21], who employ the
arbitrary Lagrangian-Eulerian (ALE) framework to describe the particulate fluid field.
For the fixed-grid category (ii), a number of techniques have been suggested in the
literature for the DNS analysis of fluid-particle interactions (see the review paper of
Haeri and Shrimpton [17] and the references cited therein for an overview). Taken
together, the proposed approaches are known under the generic term fictitious domain
(FD) method. The most widely used ones are the immersed boundary method, the
distributed Lagrange multiplier/fictitious domain method and the fictitious boundary
method. They all have in common that the fluid flow is treated in an Eulerian setting,
where a fixed mesh is employed. In contrast to the approaches in group (i), the fluid
mesh covers the whole computational domain, including the space occupied by the
particles; that is, the solid domain is filled with fluid as well. The main idea of the
FD methods is to uncouple the particles from the mesh and to consider them as
fictitious objects that traverse the grid without causing any element deformation.
The crucial point is to force the fluid enclosed by an embedded particle to adopt the
rigid body motion of that particle. In general, this is realized by imposing additional
implicit constraints on the flow field.
A prominent method to simulate granular materials is the well-established discrete
element method (DEM), which was originally proposed by Cundall and Strack [12].
Applications of a DEM solver to predict the behavior of the dispersed phase in a
particulate flow can be found, e.g., in Wachs [41] and Avci and Wriggers [3]. In a
DEM setting, a soft-sphere approach based on repulsive force models is usually
applied to describe the contact among particles, which are at the same time
assumed to be quasi-rigid.
In this work, a DNS approach is developed in the framework of an FD strategy for
the numerical simulation of 3D particle-laden flows. The fluid-particle interactions
are computed at the particle scale, with a fully resolved flow around the particles.
The fluid part is solved with the FEM and the particle part with the DEM. Both
methods are coupled by a staggered solution procedure to handle particulate flows.
2 Governing Equations
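The flow of the fluid field is modeled by the non-stationary incompressible
Navier–Stokes equations:

$$\rho_f\left(\frac{\partial \mathbf{u}_f}{\partial t} + \mathbf{u}_f\cdot\nabla\mathbf{u}_f\right) - \nabla\cdot\boldsymbol{\sigma} = \mathbf{0}, \qquad \nabla\cdot\mathbf{u}_f = 0, \qquad \forall\,\mathbf{x}\in\Omega_f. \tag{1}$$

Herein, $\mathbf{u}_f$ is the fluid velocity, $\rho_f$ the fluid density and $\boldsymbol{\sigma}$ the Cauchy
stress tensor. In the numerical studies, the constitutive equation for a Newtonian
fluid is used:

$$\boldsymbol{\sigma} = -p\,\mathbf{I} + 2\mu\,\boldsymbol{\varepsilon} \quad\text{with}\quad \boldsymbol{\varepsilon} = \tfrac{1}{2}\left[\nabla\mathbf{u}_f + (\nabla\mathbf{u}_f)^T\right], \tag{2}$$

where $p$ is the pressure, $\mathbf{I}$ the identity tensor, $\mu$ the dynamic viscosity and
$\boldsymbol{\varepsilon}$ the strain rate tensor.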
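As a minimal sketch of the Newtonian constitutive law (2), the following function evaluates the stress tensor from a given 3 x 3 velocity gradient. The function name and the simple-shear test values are illustrative assumptions and are not taken from the chapter.

```python
def newtonian_stress(grad_u, p, mu):
    """Cauchy stress sigma = -p*I + 2*mu*eps for a 3x3 velocity gradient grad_u."""
    sigma = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            # Strain-rate tensor: symmetric part of the velocity gradient.
            eps_ij = 0.5 * (grad_u[i][j] + grad_u[j][i])
            sigma[i][j] = 2.0 * mu * eps_ij
        sigma[i][i] -= p  # isotropic pressure contribution
    return sigma


# Simple shear u = (gamma * y, 0, 0): the only non-zero entry is du_x/dy.
gamma, pressure, viscosity = 2.0, 1.0, 0.5
grad_u = [[0.0, gamma, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
sigma = newtonian_stress(grad_u, pressure, viscosity)
```

For simple shear the off-diagonal stress reduces to the familiar `mu * gamma`, while the diagonal carries only the pressure.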
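The motion of a quasi-rigid particle $P$ follows from the Newton–Euler
equations. Consequently, its translational and angular velocities, $\mathbf{U} = \dot{\mathbf{X}}$ and $\boldsymbol{\omega}$,
have to satisfy:

$$M\,\frac{d^2\mathbf{X}}{dt^2} = (\rho - \rho_f)\,V\,\mathbf{b} + \mathbf{F} + \mathbf{F}_f \tag{3}$$

$$\boldsymbol{\Theta}\cdot\frac{d\boldsymbol{\omega}}{dt} + \boldsymbol{\omega}\times(\boldsymbol{\Theta}\cdot\boldsymbol{\omega}) = \mathbf{T} + \mathbf{T}_f. \tag{4}$$

Therein, $M$ is the mass, $\mathbf{X}$ the position vector of the center of mass $\mathcal{M}$, $\rho$ the mass
density, $\mathbf{b}$ the gravity and $V$ the volume of $P$. The tensor of inertia is
denoted by $\boldsymbol{\Theta}$. Furthermore, $\mathbf{F}$ is the sum of the contact forces and $\mathbf{F}_f$ the
fluid force that acts upon the particle surface $\Gamma_p$. The torques
caused by $\mathbf{F}$ and $\mathbf{F}_f$ with respect to $\mathcal{M}$ are denoted by $\mathbf{T}$ and
$\mathbf{T}_f$, respectively. The fluid force and torque are obtained by:

$$\mathbf{F}_f = \int_{\Gamma_p}\mathbf{t}\,dA \quad\text{and}\quad \mathbf{T}_f = \int_{\Gamma_p}\mathbf{r}\times\mathbf{t}\,dA. \tag{5}$$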
The normal contact forces acting between colliding particles and between particles
and system boundaries are described by a constitutive viscoelastic model. For
adhesive particles in contact, the JKR theory, introduced by Johnson et al. [27],
is used to determine the resultant attractive van der Waals force $F_a^n$ in the contact
area (see also Maugis [35] for a detailed description of this model). As shown by
Loskofsky et al. [31], the JKR theory yields satisfactory results even in the case of
underwater adhesion. For the elastic contact force $F_e^n$, the Hertzian law [18] is a
well-established model. For particles that also have viscous material properties, a
consistent phenomenological model was presented in Brilliantov et al. [5, 6], where
the effect of viscosity is considered via an added dissipative force $F_d^n$. Thus, one
obtains for the forces acting on a particle:
The elastic repulsive force based on the Hertzian contact law is determined by:

$$F_e^n = \frac{4}{3}\,E\,\sqrt{R}\,\delta^{3/2}, \tag{7}$$

where $\delta = \delta_i + \delta_j$ is the total particle compression, and $R$ and $E$ are the effective
radius and the effective Young's modulus of the contact pair $P_i$ and $P_j$, respectively
(see Hertz [18]). As a result of the mutual compression of the particles, a circular area
is formed in the contact zone. In the Hertz model, the radius $a$ of this area, which is
often called contact radius, is related to the total deformation via $a^2 = R\,\delta$.
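The Hertzian relations above can be sketched in a few lines. The definitions of the effective radius and effective modulus used here are the standard Hertz expressions; the chapter does not state them explicitly, so they are an assumption, as are all function names.

```python
import math


def effective_radius(r_i, r_j):
    """Standard Hertz effective radius of two spheres (assumed definition)."""
    return r_i * r_j / (r_i + r_j)


def effective_modulus(e_i, nu_i, e_j, nu_j):
    """Standard Hertz effective Young's modulus (assumed definition)."""
    return 1.0 / ((1.0 - nu_i**2) / e_i + (1.0 - nu_j**2) / e_j)


def hertz_force(delta, e_eff, r_eff):
    """Elastic repulsive force (7); zero when the particles do not overlap."""
    if delta <= 0.0:
        return 0.0
    return 4.0 / 3.0 * e_eff * math.sqrt(r_eff) * delta**1.5


def hertz_contact_radius(delta, r_eff):
    """Contact radius a from a^2 = R * delta."""
    return math.sqrt(r_eff * max(delta, 0.0))
```

Returning zero for non-positive overlap keeps the force purely repulsive, which matches the role of (7) as the elastic part of the contact model.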
According to the JKR model, the adhesive force acts only within the contact area.
Here, the work of adhesion to separate a unit contact area of $P_i$ and $P_j$ in a liquid
medium ($l$) is defined as $W = \gamma_{il} + \gamma_{jl} - \gamma_{ij}$, where $\gamma$ denotes the respective
interfacial energy (see, e.g., Loskofsky et al. [31]). Since the adhesive force $F_a^n$ is
opposed to the elastic force $F_e^n$, it reduces the elastic deformation $\delta_e$, and one obtains
for the total deformation:

$$\delta := \delta_e - \delta_a = \frac{a^2}{R} - \sqrt{\frac{2\pi W a}{E}}, \tag{8}$$

where the second term $\delta_a$ is due to adhesion (see, e.g., Maugis [35] for details). Based
on this model, the difference between the elastic and the adhesive force is given by:

$$F_{ea}^n := F_e^n - F_a^n = \frac{4Ea^3}{3R} - 2\pi a^2\sqrt{\frac{2WE}{\pi a}}. \tag{9}$$
A special case is the situation in which external forces are absent. Then an
equilibrium contact area with radius $a_0$ is formed in the contact zone due to the mutual
110 B. Avci and P. Wriggers
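compression $\delta_0$ of the particles caused solely by their adhesive attraction. Both
quantities are defined as:

$$a_0 = \left(\frac{9\pi W R^2}{2E}\right)^{1/3} \quad\text{and}\quad \delta_0 = \left[\frac{3R}{4}\left(\frac{\pi W}{E}\right)^2\right]^{1/3}. \tag{10}$$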
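The equilibrium quantities follow directly from the JKR force and deformation relations. As a short worked check, written with the same symbols ($W$, $R$, $E$, $a$) and assuming the JKR force in the form $4Ea^3/(3R) - \sqrt{8\pi W E a^3}$:

```latex
% Equilibrium without external force: elastic and adhesive parts balance.
\frac{4 E a_0^3}{3R} = \sqrt{8\pi W E a_0^3}
\;\Longrightarrow\;
\frac{16 E^2 a_0^6}{9R^2} = 8\pi W E a_0^3
\;\Longrightarrow\;
a_0^3 = \frac{9\pi W R^2}{2E}.

% Insert \pi W / E = 2 a_0^3 / (9R^2) into the adhesive part of the deformation:
\sqrt{\frac{2\pi W a_0}{E}} = \sqrt{\frac{4 a_0^4}{9R^2}} = \frac{2a_0^2}{3R}
\;\Longrightarrow\;
\delta_0 = \frac{a_0^2}{R} - \frac{2a_0^2}{3R} = \frac{a_0^2}{3R}
         = \left[\frac{3R}{4}\left(\frac{\pi W}{E}\right)^2\right]^{1/3}.
```

The intermediate result $\delta_0 = a_0^2/(3R)$ is often the more convenient form in an implementation, because $a_0$ is needed anyway.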
To separate the particles, one has to apply a traction force under which they undergo
minute stretching deformations, forming a connecting neck around the contact zone.
Once the pulling force has reached a critical level, i.e., $F^n = -F_c^n$, the contact
breaks. Here, the critical force is given by $F_c^n = 3\pi W R/2$ and the corresponding
critical deformation of the particle pair is $\delta_c = a_0^2/(48^{1/3} R)$. That means the pulling
distance at detachment is $\delta = -\delta_c$. By incorporating these critical quantities, one
obtains for the displacement $\delta$ in (8) and the force $F_{ea}^n$ in (9):

$$\frac{\delta}{\delta_c} = 6^{1/3}\left[2\left(\frac{a}{a_0}\right)^{2} - \frac{4}{3}\left(\frac{a}{a_0}\right)^{1/2}\right] \tag{11}$$

$$\frac{F_{ea}^n}{F_c^n} = 4\left(\frac{a}{a_0}\right)^{3} - 4\left(\frac{a}{a_0}\right)^{3/2}. \tag{12}$$

$$F_d^n = A\,\dot{a}\left(\frac{4Ea^2}{R} - \frac{3}{2}\sqrt{8\pi W E a}\right), \tag{13}$$
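A minimal sketch of the JKR relations can help when checking an implementation against the normalized force law. It assumes the JKR force in the equivalent form 4Ea^3/(3R) - sqrt(8 pi W E a^3); the function names and the parameter values in the usage lines are hypothetical.

```python
import math


def jkr_equilibrium_radius(w, r_eff, e_eff):
    """Equilibrium contact radius a0 for vanishing external force."""
    return (9.0 * math.pi * w * r_eff**2 / (2.0 * e_eff)) ** (1.0 / 3.0)


def jkr_pull_off_force(w, r_eff):
    """Critical (pull-off) force F_c = 3*pi*W*R/2."""
    return 1.5 * math.pi * w * r_eff


def jkr_force(a, w, r_eff, e_eff):
    """Elastic minus adhesive force as a function of the contact radius a."""
    elastic = 4.0 * e_eff * a**3 / (3.0 * r_eff)
    adhesive = math.sqrt(8.0 * math.pi * w * e_eff * a**3)
    return elastic - adhesive


def jkr_overlap(a, w, r_eff, e_eff):
    """Total deformation delta as a function of the contact radius a."""
    return a**2 / r_eff - math.sqrt(2.0 * math.pi * w * a / e_eff)


# Illustrative (made-up) parameters in consistent units.
W, R, E = 0.05, 0.1, 1.0e3
a0 = jkr_equilibrium_radius(W, R, E)
f_c = jkr_pull_off_force(W, R)
```

With these helpers, the force vanishes at `a0` and reaches its minimum `-f_c` at `a0 / 4**(1/3)`, which is the pull-off point under load control.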
Coulomb's law couples the tangential force $F^t$ to the normal force $F^n$ via the
coefficient of friction, such that the relation $F^t = \mu_d F^n$ holds for sliding and
$F^t \le \mu_s F^n$ for sticking. Therein, $\mu_d$ and $\mu_s$ denote the dynamic and the static
friction coefficients, respectively, where $\mu_d \le \mu_s$. In this work, a classical
tangential (linear) spring-dashpot element with an incorporated slider is used to
model the tangential friction problem. For an overview and a discussion of different
tangential contact models proposed in the literature in the context of the DEM, see
Kruggel-Emden et al. [28]. Here, a return mapping scheme is adopted for the
computation of the tangential force (see Luding [32], Wriggers [43]). This projection
method needs a tangential trial traction that takes the form:
Therein, $\mathbf{g}_t$ is the elongation of the tangential spring, and $c_t$ and $d_t$ are the tangential
spring stiffness and the tangential dissipation parameter, respectively. The tangential
relative velocity at the contact point $C$ is given by $\mathbf{v}_t = \mathbf{v}_s - (\mathbf{v}_s\cdot\mathbf{n})\,\mathbf{n}$ with
$\mathbf{v}_s = \mathbf{v}_i^C - \mathbf{v}_j^C$ as the relative velocity at $C$, where the corresponding local velocities
are defined by $\mathbf{v}_i^C = \mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_i$ and $\mathbf{v}_j^C = \mathbf{U}_j + \boldsymbol{\omega}_j\times\mathbf{r}_j$. The vectors pointing
from $\mathcal{M}_i$ and $\mathcal{M}_j$ to $C$ are $\mathbf{r}_i = R_i(-\mathbf{n})$ and $\mathbf{r}_j = R_j\,\mathbf{n}$, respectively. By introducing
a trial function $f^{tr}$, the following relation can be stated for the tangential contact:

$$f^{tr} := \|\mathbf{F}_o^t\| - \mu_s\|\mathbf{F}^n\| \;\begin{cases} \le 0: & \text{stick} \\ > 0: & \text{slip.} \end{cases} \tag{15}$$

In the slip case, the tangential spring is adjusted to

$$\mathbf{g}_t = -\frac{1}{c_t}\left(\mu_d F^n\,\mathbf{t} + d_t\,\mathbf{v}_t\right) \tag{16}$$

in order to fulfill Coulomb's slip condition. Therein, $\mathbf{t} = \mathbf{F}_o^t/\|\mathbf{F}_o^t\|$ is the direction
of the trial traction. Between two subsequent time steps, the contact area might be
slightly rotated. To take this into account, one can, as proposed in Luding [32], project
the tangential spring onto the current rotated contact area at the beginning of each
new time step via $\mathbf{g}_t = \mathbf{g}_t - (\mathbf{g}_t\cdot\mathbf{n})\,\mathbf{n}$. Finally, in the context of the return mapping
During a rolling motion of two particles, the leading part of the contact area is
continuously compressed and the trailing part is decompressed with respect to the
rolling direction. In the case of an attractive van der Waals force in the contact area,
the particles experience an opposing torque that generates rolling resistance. Here, a
model consisting of a rolling spring-dashpot-slider element is adopted (see Iwashita
and Oda [22]). The opposing torque is given by:

$$M_{ro} = -\left(c_\theta\,\theta + d_\theta\,\dot{\theta}\right). \tag{17}$$

Therein, $c_\theta$ is the rolling stiffness, $d_\theta$ the rolling viscosity coefficient, $\theta$ the relative
particle rotation and $\dot{\theta}$ the corresponding relative rotational velocity. By
introducing a trial force $\mathbf{F}_{ro}$ that induces a torque equivalent to $M_{ro}$, the problem of
rolling resistance can be treated algorithmically like the model of tangential friction
(see Luding [33]). In this regard, the equivalent formulation can be stated as follows:
Assuming that the slider can sustain a certain critical rolling resistance torque $M_{cr}$,
one can write, analogously to (15):

$$f_r^{tr} := \|\mathbf{F}_{ro}\| - \frac{M_{cr}}{R} \;\begin{cases} \le 0: & \text{stick} \;\Rightarrow\; F^r = \|\mathbf{F}_{ro}\| \\ > 0: & \text{slip} \;\Rightarrow\; F^r = M_{cr}/R. \end{cases} \tag{20}$$

With regard to the numerical handling of the spring in this context, the respective
relationships can be summarized as:

$$f_r^{tr} \;\begin{cases} \le 0 \;\Rightarrow\; \mathbf{g}_r = \mathbf{g}_r + \Delta\mathbf{g}_r, & \Delta\mathbf{g}_r = \mathbf{v}_r\,\Delta t \\ > 0 \;\Rightarrow\; \mathbf{g}_r = -\dfrac{1}{c_r}\left(F_c^r\,\mathbf{t}_r + d_r\,\mathbf{v}_r\right), & \mathbf{t}_r = \dfrac{\mathbf{F}_{ro}}{\|\mathbf{F}_{ro}\|}, \quad F_c^r = \dfrac{M_{cr}}{R}. \end{cases} \tag{21}$$
Fig. 1 a Distribution of the Lebedev quadrature points for the case of $N_L = 302$ points. b Step-like
discretization of a particle for the evaluation of fluid forces. c A detail of the domain showing
the classification of computational elements
In this model, the projection direction of the spring relies on the common rolling
direction of the pair of particles in contact. Thus, the projection condition is
defined as $\mathbf{g}_r = (\mathbf{g}_r\cdot\mathbf{t})\,\mathbf{t}$, where $\mathbf{t} = \mathbf{v}_r/\|\mathbf{v}_r\|$. With $F^r$ computed, one obtains the
pseudo force $\mathbf{F}_r = F^r\,\mathbf{t}_r$ and, in turn, the rolling resistance torque $\mathbf{M}_r = R\,\mathbf{n}\times\mathbf{F}_r$.
Finally, the torques for the examined pair of particles $\{P_i, P_j\}$ can be written as
follows: $\mathbf{M}_r^i = -\mathbf{M}_r^j =: \mathbf{M}_r$.
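The stick/slip logic of the tangential spring-dashpot-slider element can be sketched as a small return mapping routine. The trial traction of the chapter is not reproduced in the text above, so the form used here, `F_trial = -(c_t * g + d_t * v_t)` with the spring first advanced by `v_t * dt`, is an assumption chosen to be consistent with the slip update of the spring; all names are hypothetical. The rolling resistance model has the same structure, with the threshold `M_cr / R` in place of the static friction limit.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def tangential_return_mapping(g_t, v_t, f_n, c_t, d_t, mu_s, mu_d, dt):
    """One stick/slip update; returns (tangential force, updated spring)."""
    # Assumed predictor: advance the spring, then evaluate the trial traction.
    g_trial = [g + v * dt for g, v in zip(g_t, v_t)]
    f_trial = [-(c_t * g + d_t * v) for g, v in zip(g_trial, v_t)]
    f_trial_norm = norm(f_trial)

    # Trial function: a non-positive value means the contact sticks.
    if f_trial_norm - mu_s * abs(f_n) <= 0.0:
        return f_trial, g_trial

    # Slip: project the force onto the Coulomb limit and reset the spring
    # so that spring plus dashpot reproduce exactly the sliding force.
    t = [f / f_trial_norm for f in f_trial]
    f_slide = mu_d * abs(f_n)
    force = [f_slide * ti for ti in t]
    g_new = [-(f_slide * ti + d_t * vi) / c_t for ti, vi in zip(t, v_t)]
    return force, g_new


# Small relative motion sticks, large relative motion slips.
stick_force, stick_spring = tangential_return_mapping(
    (0.0, 0.0, 0.0), (1e-3, 0.0, 0.0), 10.0, 100.0, 0.0, 0.35, 0.32, 1.0)
slip_force, slip_spring = tangential_return_mapping(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 10.0, 100.0, 0.0, 0.35, 0.32, 1.0)
```

Resetting the spring in the slip branch keeps the stored elongation bounded, so a later reversal of the motion starts from the Coulomb limit rather than from an arbitrarily stretched spring.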
4 Phase Coupling
In the fixed-grid approach, the mesh of the flow field does not coincide with the
boundaries of the particles. Hence, information has to be transferred between the
Eulerian and the Lagrangian descriptions. A further challenging issue concerning the
coupling of the phases is the computation of the fluid forces acting on the particle surfaces.
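$$\mathbf{F}_f^i = \int_{\Gamma_p^i}\mathbf{t}\,dA = \sum_{k=1}^{N_L}(d\mathbf{F})_k = J\sum_{k=1}^{N_L} w_k\,\mathbf{t}_k \tag{22}$$

$$\mathbf{T}_f^i = \int_{\Gamma_p^i}\mathbf{r}\times\mathbf{t}\,dA = \sum_{k=1}^{N_L}\mathbf{r}_k\times(d\mathbf{F})_k = J\sum_{k=1}^{N_L}\mathbf{r}_k\times w_k\,\mathbf{t}_k. \tag{23}$$

$$\mathbf{F}_f^i = \int_{\Gamma_p^i}\mathbf{t}\,d\Gamma = \sum_{j=1}^{N}(d\mathbf{F})_j = \sum_{j=1}^{N}\mathbf{t}_j\,\Gamma_j \tag{24}$$

$$\mathbf{T}_f^i = \int_{\Gamma_p^i}\mathbf{r}\times\mathbf{t}\,d\Gamma = \sum_{j=1}^{N}\mathbf{r}_j\times(d\mathbf{F})_j = \sum_{j=1}^{N}\mathbf{r}_j\times\mathbf{t}_j\,\Gamma_j, \tag{25}$$

where $\mathbf{r}_j$ is the position vector of the center of the element surface $\Gamma_j$ with respect
to $\mathcal{M}_i$.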
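The quadrature sums for the fluid force and torque on a spherical particle can be sketched as follows. This is only an illustration under stated assumptions: it uses the 6-point Lebedev rule (weights summing to one) instead of the 302-point rule mentioned in Fig. 1, it takes the factor `J` to be the sphere surface area, and the traction is supplied as a hypothetical callable rather than interpolated from a flow solution.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# 6-point Lebedev rule on the unit sphere: (unit normal, weight).
LEBEDEV_6 = [((1.0, 0.0, 0.0), 1.0 / 6.0), ((-1.0, 0.0, 0.0), 1.0 / 6.0),
             ((0.0, 1.0, 0.0), 1.0 / 6.0), ((0.0, -1.0, 0.0), 1.0 / 6.0),
             ((0.0, 0.0, 1.0), 1.0 / 6.0), ((0.0, 0.0, -1.0), 1.0 / 6.0)]


def fluid_force_and_torque(radius, traction, rule=LEBEDEV_6):
    """Weighted sums of tractions and their moments about the particle center."""
    jac = 4.0 * math.pi * radius**2  # assumed meaning of J: sphere surface area
    force = [0.0, 0.0, 0.0]
    torque = [0.0, 0.0, 0.0]
    for normal, weight in rule:
        r_k = tuple(radius * n for n in normal)  # lever arm to the surface point
        t_k = traction(r_k, normal)              # traction vector at that point
        m_k = cross(r_k, t_k)
        for c in range(3):
            force[c] += jac * weight * t_k[c]
            torque[c] += jac * weight * m_k[c]
    return force, torque


# Uniform pressure produces no net force; a uniform traction integrates to area * t.
p_force, p_torque = fluid_force_and_torque(0.75, lambda r, n: tuple(-2.0 * c for c in n))
c_force, c_torque = fluid_force_and_torque(0.75, lambda r, n: (1.0, 2.0, 3.0))
```

In a real coupling step the `traction` callable would evaluate the fluid stress at the quadrature point, which is where locating the enclosing fluid element becomes the expensive part.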
Remark To characterize the motion of the DEM particles sliding through the mesh,
the mesh elements are labeled as depicted in Fig. 1c. Using the position of the
element center point $E$ as the assignment criterion, the elements that coincide with
a particle domain $\Omega_p^i$ are marked at each time step with the particle number of $P_i$.
Here, the interior and boundary elements are denoted by $\Omega_i$ and $\Gamma_i$, respectively.
Furthermore, fluid elements are denoted by $\Omega_0$.
For the coupling process, the rigid body motion of the particles is imposed on the
flow field. The rigidity constraints due to $P_i$ that are applied to the Navier–Stokes
equations read:

$$\mathbf{u}_f = \mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p, \tag{26}$$

where $\mathbf{r}_p$ is the vector pointing from $\mathcal{M}_i$ to the considered velocity node. However,
for a velocity node $V \in \Omega_p^i$ that adjoins the fluid phase, the velocity constraint is
defined as:

$$\mathbf{u}_f = (1 - \alpha_A)\,\mathbf{u}_f + \alpha_A\,(\mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p), \tag{27}$$

where $\alpha_A$ is the element face area fraction situated within $\Omega_p^i$. Hence, $\alpha_A$ acts as
a weighting factor for $V$.$^1$ If, however, $V \notin \Omega_p^i$, a nonlinear weighted strategy is
applied according to Luo et al. [34], which reads:

$$\mathbf{u}_f = (1 - \alpha_R)\,\mathbf{u}_f + \alpha_R\,(\mathbf{U}_i + \boldsymbol{\omega}_i\times\mathbf{r}_p), \tag{28}$$
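The three velocity constraints share one form: a blend between the current fluid velocity and the rigid body velocity of the particle at the node. A minimal sketch, with a generic weight `alpha` standing in for the respective weighting factor (how that weight is computed is not covered here) and with hypothetical function names:

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rigid_body_velocity(u_particle, omega, r_p):
    """Velocity U + omega x r_p of a point rigidly attached to the particle."""
    w_x_r = cross(omega, r_p)
    return tuple(u + w for u, w in zip(u_particle, w_x_r))


def constrained_velocity(u_fluid, u_particle, omega, r_p, alpha):
    """Blend fluid and rigid body velocity; alpha = 1 enforces full rigidity."""
    u_rigid = rigid_body_velocity(u_particle, omega, r_p)
    return tuple((1.0 - alpha) * uf + alpha * ur for uf, ur in zip(u_fluid, u_rigid))


# Node at r_p = (0, 1, 0) of a particle translating in x and spinning about z.
U, omega, r_p = (1.0, 0.0, 0.0), (0.0, 0.0, 2.0), (0.0, 1.0, 0.0)
u_inside = constrained_velocity((1.0, 0.0, 0.0), U, omega, r_p, 1.0)
u_half = constrained_velocity((1.0, 0.0, 0.0), U, omega, r_p, 0.5)
u_outside = constrained_velocity((1.0, 0.0, 0.0), U, omega, r_p, 0.0)
```

Setting `alpha = 1` recovers the pure rigidity constraint for interior nodes, and `alpha = 0` leaves the fluid velocity untouched.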
5 Solution Algorithms
A spatial and temporal discretization of (1) yields a set of nonlinear algebraic
equations for the fluid velocity $\mathbf{u}_f$ and the pressure $p$. The resulting coupled equation
system, which has to be solved at each time step, can be written as follows:

$$\begin{bmatrix} \mathbf{M} + \theta_1\,\Delta t\,\mathbf{N}(\mathbf{u}_f^{n+1}) & \theta_2\,\Delta t\,\mathbf{G} \\ \mathbf{G}^T & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{u}_f^{n+1} \\ p^{n+1} \end{bmatrix} = \begin{bmatrix} \left(\mathbf{M} - \theta_3\,\Delta t\,\mathbf{N}(\mathbf{u}_f^{n})\right)\mathbf{u}_f^{n} \\ \mathbf{0} \end{bmatrix}. \tag{29}$$

Therein, $\mathbf{M}$ is the mass matrix, $\mathbf{N}$ the matrix including the diffusive, convective and
stabilization terms, $\mathbf{G}$ the gradient matrix, $\mathbf{G}^T$ the divergence matrix, $\Delta t$ the current
time-step size and $\theta_1$ to $\theta_3$ are parameters of the fractional-step $\theta$-scheme; see Turek
[38] for details. To solve (29), the multigrid FEM solver FeatFlow [39] is applied
in this contribution.

1 In the present work, the nonconforming rotated trilinear $\tilde{Q}_1/Q_0$ element pair is used, where a
nodal value at $V$ is the mean value of the velocity vector over the respective element face area; see
Turek [38] for details. The velocity nodes of this element are located at the midpoints of the element
faces.
The computation of the contact forces is the most CPU-time-consuming part of a DEM
simulation. The contact detection therefore has to be restricted to the neighbors of a
particle, since they are its only relevant potential contact partners. For this purpose,
the Verlet-List and the Linked-Cell methods are combined to obtain a fast contact
search algorithm (see, e.g., Allen and Tildesley [1], Pöschel and Schwager [36]).
In the Verlet-List procedure, a list of neighboring particle indices is maintained for
each particle in the system. By defining a Verlet distance threshold value $\delta_{vd}$, a pair
of particles is considered as neighbors if $\delta_{vd} > |g^n|$ holds. Once the Verlet-List
is built, the contact detection only needs to be evaluated for the neighboring pairs.
As a result, the computational effort of this task scales with $O(N)$. The list has to
be updated at certain intervals. A possible rebuild criterion is $s_{\max} \ge 0.6\,\delta_{vd}$,
where $s_{\max}$ is the largest displacement of a particle since the last list update (see
Pöschel and Schwager [36]). However, a straightforward construction of the list scales
with $O(N^2)$; thus, this task has to be accelerated in order to obtain a search algorithm
that scales with $O(N)$ overall. For this purpose, the computational domain is divided
into cubic cells of uniform edge length, where the cells are slightly larger than the
largest particle in the system. After assigning all particles to the cells according to
their centers of mass, the relevant particles for the construction of the neighbor list
of $P_i$ are those that are referenced to the group of 27 cells consisting of the owner cell
of $P_i$ and its 26 directly surrounding cells.
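The combination of cell binning and neighbor lists described above can be sketched as follows. Several details are assumptions rather than the authors' implementation: the cell edge is chosen as the largest particle diameter plus the Verlet distance, a pair is listed when its surface gap is below the Verlet distance (which also covers overlapping pairs), and the rebuild test uses "greater than or equal". All names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def build_verlet_lists(centers, radii, verlet_distance):
    """Neighbor lists via linked cells: only the 27 surrounding cells are searched."""
    # Assumed cell size: largest diameter plus the Verlet distance, so that all
    # potential neighbors of a particle lie in its own or an adjacent cell.
    cell_size = 2.0 * max(radii) + verlet_distance
    cells = defaultdict(list)
    for idx, center in enumerate(centers):
        key = tuple(int(math.floor(x / cell_size)) for x in center)
        cells[key].append(idx)

    neighbors = {i: [] for i in range(len(centers))}
    for key, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in cells.get(other, ()):
                    if j <= i:
                        continue  # register each pair once, from the lower index
                    gap = math.dist(centers[i], centers[j]) - radii[i] - radii[j]
                    if gap < verlet_distance:
                        neighbors[i].append(j)
                        neighbors[j].append(i)
    return neighbors


def needs_rebuild(max_displacement, verlet_distance, factor=0.6):
    """Rebuild once the largest displacement since the last update is too large."""
    return max_displacement >= factor * verlet_distance
```

Because each particle is compared only against the occupants of 27 cells, the list construction cost grows linearly with the number of particles for a roughly uniform particle distribution.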
For a fast assignment of the element flags and, furthermore, in order to efficiently
locate the elements containing the integration points for the computation of the
fluid forces on the particles, the Linked-Cell method is used analogously. In this
case, the Linked-Cell algorithm generates an element list referring to the same cell
structure as for the particles. Here, an element is referenced to a cell according to
the position of the element's center point $E$. Consequently, for the application of the
velocity constraints related to particle $P_i$ and for the computation of its fluid forces,
only those elements are relevant that are binned into the group of 27 cells relevant
for $P_i$. In order to reduce the trial computations, some elements can also be excluded
in advance from detailed consideration if the distance between $E$ and $\Omega_p^i$ is larger
than a threshold value, which can be chosen according to the largest element size in
the computational domain.
6 Numerical Simulations
The numerical results of three computed test problems obtained by the presented
approach are discussed next. The first test problem is the sedimentation of one particle
in a box. In the second simulation example, the sedimentation of two particles in a
row is considered in order to mimic their drafting-kissing-tumbling effect, and the
last example deals with a particle-laden flow through a tube with changing cross
section.
In this example, the sedimentation of a single particle in a box filled with fluid is
examined. The considered system is shown in Fig. 2. It corresponds to the setup that
was experimentally investigated by ten Cate et al. [37], who measured the settling
velocity of the immersed particle under the action of gravity in four test cases, each
with a different fluid. In the following, the simulation results for the cases with the
minimum and maximum terminal particle Reynolds numbers, $Re_p = 1.5$ and
$Re_p = 31.9$, are presented. Previously, the sedimentation problem of ten Cate et al.
[37] was computed by, among others, Veeramani et al. [40] and Feng and Michaelides [14].

Model data (Fig. 2):
Box dimension:      10 × 10 × 16 cm
Particle position:  (5/5/12.75) cm
Particle radius:    $R = 0.75$ cm
Particle density:   $\rho_P = 1.12$ g/cm$^3$
Gravity:            $b = 980$ cm/s$^2$
Fluid:
Case   $\rho_f$ [g/cm$^3$]   $\nu_f$ [cm$^2$/s]   $Re_P = U_\infty D/\nu_f$
C1     0.970                 3.8454               1.5
C2     0.960                 0.6042               31.9

To discretize the box in Fig. 2, a uniform mesh consisting of 819,200 $\tilde{Q}_1/Q_0$
elements (80 × 80 × 128 elements) is used. All simulations were carried out with
no-slip velocity conditions at the box boundaries. The diagrams in Fig. 3 show the
computed temporal evolution of the settling velocity of the particle center in the
direction of gravity for the two cases considered. As can be seen, each case was
simulated by means of both approaches, AP1 and AP2. Every diagram also includes
the predicted terminal velocity of the particle based on the correlation equations$^2$
suggested in Clift et al. [10], Brown and Lawler [7] and Cheng [8], and, furthermore,
the numerical results of Veeramani et al. [40] and those of Feng and Michaelides [14].

Fig. 3 Evolution of the settling velocity $U_3$ [cm/s] of the particle over time $t$ [s] for a case C1 and b case C2

At the beginning of the experiment, the particle is at rest, and it accelerates due to
the action of gravity. When the gravitational and the drag forces reach a state of
equilibrium, the particle settles with a uniform velocity, which is called the terminal
velocity. The simulation results show that the presented model, based on either AP1
or AP2, is capable of predicting the evolution of the particle's settling velocity. The
maximum discrepancy of the predictions with respect to the experimental data is less
than 8 %. In addition, the obtained results are in very good agreement with those of
Veeramani et al. [40], but there is a small mismatch compared with the results of
Feng and Michaelides [14]. Figure 4 shows the computed contour plots of the
normalized velocity magnitude $\|\mathbf{u}_f\|/U_\infty$ in the symmetry plane. Accordingly, the
depicted contours range between 0 and 1, with an equal spacing of 0.1. These plots
agree well with those of ten Cate et al. [37], which indicates that the presented model
is well suited to mimic their sedimentation experiments. There is also good agreement
with the computed plots given in Apte et al. [2].

Fig. 4 Contour plots of the normalized velocity magnitude in the symmetry plane at different points
in time. The plots of the upper row belong to case C1 and those of the lower row to case C2

2 In general, correlations for drag and terminal settling velocity are valid for a particle in an infinite
domain, but they still provide reasonable results for a relatively large distance between a particle
and system boundaries.
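The role of a correlation-based terminal velocity can be illustrated with a short calculation. Note that the sketch below swaps in the Schiller-Naumann drag correlation, which is not one of the correlations used in the chapter (Clift et al. [10], Brown and Lawler [7], Cheng [8]), simply because it has a compact closed form. It assumes an unbounded fluid and moderate Reynolds numbers; the function name is hypothetical, and the input values are those of the two fluid cases of the single-particle example.

```python
def terminal_velocity(diameter, rho_p, rho_f, nu_f, gravity, iterations=200):
    """Terminal settling velocity of a sphere from the Schiller-Naumann drag law.

    Balances buoyant weight against drag, with the Stokes drag corrected by the
    factor (1 + 0.15 * Re**0.687). Units must be consistent (here: cm, g, s).
    """
    mu_f = rho_f * nu_f  # dynamic viscosity
    u_stokes = (rho_p - rho_f) * gravity * diameter**2 / (18.0 * mu_f)
    u = u_stokes
    for _ in range(iterations):
        reynolds = u * diameter / nu_f
        u = u_stokes / (1.0 + 0.15 * reynolds**0.687)  # fixed-point update
    return u


# Cases C1 and C2: sphere of radius 0.75 cm, density 1.12 g/cm^3, g = 980 cm/s^2.
u_c1 = terminal_velocity(1.5, 1.12, 0.970, 3.8454, 980.0)
u_c2 = terminal_velocity(1.5, 1.12, 0.960, 0.6042, 980.0)
re_c2 = u_c2 * 1.5 / 0.6042
```

Since the correlation ignores the box walls, its result should be read as a reference value for the unconfined case rather than as a prediction of the measured settling velocity.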
Model data:
Box dimension:       1 × 1 × 4 cm
Particle positions:  $P_1$: (0.5/0.5/3.5) cm, $P_2$: (0.5/0.5/3.167) cm
Gravity:             $b = 980$ cm/s$^2$
Fluid:     Viscosity:        $\nu_f = 0.01$ cm$^2$/s
           Density:          $\rho_f = 1.00$ g/cm$^3$
Particle:  Radius:           $R_1 = R_2 = 1/12$ cm
           Density:          $\rho_{P1} = \rho_{P2} = 1.14$ g/cm$^3$
           Young modulus:    $E = 10^6$ N/cm$^2$
           Poisson ratio:    $\nu = 0.25$
           Friction coeff.:  $\mu_s/\mu_d = 0.35/0.32$
           Damping coeff.:   $A = 5 \cdot 10^{-5}$ s

Fig. 6 Evolution of the settling velocities $U_3$ [cm/s] of the particles over time $t$ [s] for the drafting-kissing-tumbling problem:
a results of the present work and b comparison of the results with the literature
the box is fully filled with fluid. Concerning the boundary conditions of the fluid
domain, no-slip velocity conditions are applied at the box walls. In order to make
the particles tumble when they kiss each other, a slight initial horizontal offset
in the position of $P_1$ is introduced, so that it starts at (0.5075/0.5075/3.5) cm. This
offset is necessary because previous test computations on different uniform symmetric
meshes showed that the implemented algorithm preserves the symmetry of the system
almost completely. Thus, the lateral disturbances in the flow field are too weak to
trigger tumbling (in contrast to what would occur, for instance, with an unstructured
or an anisotropic mesh).
In order to verify that the numerical model is able to predict the terminal settling
velocity of an almost undisturbed sedimenting single particle within this benchmark
setup, the sedimentation of $P_2$ was computed in advance without the presence of
$P_1$. Figure 6a shows the time history of the falling velocity obtained with AP2, and
one can see that the predicted terminal velocity matches the values obtained with
correlations. This diagram also contains the computed evolutions of the particle
velocities for the drafting-kissing-tumbling case, obtained with AP1 and with AP2.
The results of both simulations agree quite well. As a further verification of the
presented algorithm and of its implementation, the predictions of AP2 are compared
with other numerical predictions from the literature for the same setup, see Fig. 6b.
The agreement is generally good; the presented results are particularly consistent
with the simulation results of Apte et al. [2] and Breugem [4]. It should be
emphasized that all predicted evolutions in Fig. 6b rely on different FD method
concepts.
In Fig. 7, the process of drafting-kissing-tumbling is displayed at eight different
points in time. One observes that once the trailing particle is located within the
gradually growing wake region of the leading particle, it experiences a lower drag
force. Consequently, the trailing particle falls faster than the leading particle
(drafting). With increasing time, the gap between the particles decreases due to their
velocity difference, and the particles eventually come into contact (kissing). This
configuration is unstable. As a consequence, the particles tumble and start to separate
(tumbling). The flow phenomena observed in the experiment of Fortes et al. [15] are
reproduced by the presented computational approach.
In this test case, a particle-laden flow through a tube with different cross sections is
considered. The particles are allowed to adhere to each other as well as to the tube
wall. Figure 8 shows the geometry of the tube including the model data, where the
parameters $W_{PP}$ and $W_{PW}$ are the work of adhesion among particles and between
particles and the tube wall, respectively. Gravitational effects are not considered. The
suspended particles are randomly inserted at the inflow boundary with an initial
velocity that conforms to the inflow velocity of the fluid. To compute the flow in the
tube, a mesh consisting of 2,304,000 $\tilde{Q}_1/Q_0$ elements is used. This yields a discretized
system with 2,304,000 pressure nodes and 6,953,280 velocity nodes. The time step
sizes for the FEM and DEM solvers are selected as $\Delta t = 10^{-4}$ s and $\Delta t_D = 10^{-6}$ s,
respectively.

Fig. 9 Numerical results of the flow through the tube showing the velocity field at t = 6.5 s
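With a fluid time step of 1e-4 s and a DEM time step of 1e-6 s, one fluid step spans 100 DEM steps. A staggered loop with this kind of sub-cycling might be organized as in the following sketch. The ordering of the two solvers within a step and the callables `fluid_step` and `dem_step` are assumptions made for illustration; the chapter does not spell out the loop in this form.

```python
def staggered_run(fluid_step, dem_step, t_end, dt_fluid=1e-4, dt_dem=1e-6):
    """Advance fluid and particles alternately; the DEM sub-cycles within a fluid step."""
    n_sub = round(dt_fluid / dt_dem)   # DEM steps per fluid step
    n_steps = round(t_end / dt_fluid)  # number of fluid steps
    for _ in range(n_steps):
        fluid_step(dt_fluid)           # e.g. solve the flow with particle constraints
        for _ in range(n_sub):
            dem_step(dt_dem)           # e.g. contact forces and particle motion
    return n_steps, n_sub


# Count solver calls for a short run of three fluid steps.
calls = {"fluid": 0, "dem": 0}


def count_fluid(dt):
    calls["fluid"] += 1


def count_dem(dt):
    calls["dem"] += 1


n_steps, n_sub = staggered_run(count_fluid, count_dem, t_end=3e-4)
```

Using `round` rather than integer division avoids losing a step to floating-point error when the two step sizes are exact multiples in decimal but not in binary.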
With the chosen parameters, the van der Waals forces lead to a deposition of
suspended particles onto the tube wall. The deposition grows with increasing
simulation time, in particular at the end of the smaller cross section part. With
more particles adhering to the wall, the velocity increases locally, and the
particles experience higher drag forces. A situation like this is depicted in Fig. 9.
Therein, one can see the influence of the developed agglomerates on the velocity
field when they stick to the tube wall or slide along the wall. Due to the fully coupled
description of the particulate flow, the strong mutual phase interactions, as in the
situation shown in Fig. 9, can be captured by the developed DNS fluid-particle
solver. The strong local impact of the dispersed phase on the fluid phase is
evident here. In Fig. 10, one can observe the corresponding effects on the pressure
field (see the pressure difference between the windward and the lee side of the large
agglomerate). When the traction forces exceed the adhesive forces, single
particles and agglomerates break off and are subsequently transported away by the
flow field. The complexity of the evolution of the multiphase flow at the
outflow of the smaller cross section part is reflected in Fig. 11. There, image (a),
which shows the flow at time t = 1.0 s, illustrates a fully developed axisymmetric
fluid flow with an eddy zone in this region. Images (b) and (c), at t = 2.5 s and
t = 10.0 s, show that this axial symmetry is completely lost
and that a large number of particles have deposited on the wall. Image (d) shows
the same flow situation as in (b), but with particle velocity vectors
and without streamlines. Here, the influence of the eddy on the trajectories of the
passing particles can be clearly seen. In fact, the particles that are fully caught by
the eddy reverse their direction, so that they are subsequently transported against
the main flow in the tube. The different phases of the flow process are shown for the
whole system in Fig. 12.
Fig. 10 Numerical results of the flow through the tube showing the pressure field at t = 6.5 [s]
Fig. 11 Numerical results of the flow through the tube showing: a–c the streamlines and the particles
at different points in time and d the particle velocity vectors at t = 2.5 s. The colors represent in
all cases the velocity magnitudes
Fig. 12 Numerical results of the flow through the tube showing the fluid and particle velocities at
six points in time
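The break-off condition mentioned above (traction forces exceeding adhesive forces) can be illustrated with a simple force comparison. The sketch below assumes the classical JKR pull-off force, $F_c = \frac{3}{2}\pi W R_{\mathrm{eff}}$, as the adhesive limit; the adhesive viscoelastic contact model actually used in this chapter may define the limit differently. All function names and numerical values are hypothetical.

```python
import math


def effective_radius(r1, r2=math.inf):
    """Effective radius of a sphere-sphere contact; a flat wall is r2 = inf."""
    if math.isinf(r2):
        return r1
    return r1 * r2 / (r1 + r2)


def jkr_pull_off_force(work_of_adhesion, r_eff):
    """Classical JKR pull-off force, used here as an assumed adhesive limit."""
    return 1.5 * math.pi * work_of_adhesion * r_eff


def detaches(traction_force, work_of_adhesion, r1, r2=math.inf):
    """True if the traction force exceeds the assumed adhesive limit."""
    limit = jkr_pull_off_force(work_of_adhesion, effective_radius(r1, r2))
    return traction_force > limit


# Hypothetical numbers: particle radius 1 mm on a wall, W_PW = 0.1 J/m^2
limit = jkr_pull_off_force(0.1, effective_radius(1e-3))
```

In this simplified picture a particle-wall contact with the work of adhesion $W_{PW}$ and a particle-particle contact with $W_{PP}$ differ only in the work of adhesion and the effective radius.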
7 Conclusion
In this work, a computational approach for the full 3D DNS of particulate flows is
presented. The approach is based on the FD method. The developed solver treats the
coupled fluid-particle problem in a staggered way by solving the phases explicitly in
succession. The mutual phase interactions, on the other hand, are computed implicitly.
The FEM and the DEM are applied as numerical solvers for the fluid and the
particle phase, respectively. In the framework of the DEM, the particle collision is described
using an adhesive viscoelastic model, and additionally, friction and rolling resistance
are considered. To verify the reliability of the algorithm and its implementation,
various test computations were performed. In this contribution, the simulation results
of two sedimentation problems were presented and discussed. Furthermore, the solver
was applied to simulate an aggregation dynamics problem of a particulate flow where
the formation of agglomerates is considered. The chosen system for this task was
a tube with different cross sections. The obtained numerical results show that the
developed solver is appropriate to deal with fluid-particle interaction problems.
References
1. Allen MP, Tildesley DJ (1987) Computer simulation of liquids. Oxford University Press, New York
2. Apte SV, Martin M, Patankar NA (2009) A numerical method for fully resolved simulation (FRS) of rigid particle-flow interactions in complex flows. J Comput Phys 228(8):2712–2738
3. Avci B, Wriggers P (2012) A DEM-FEM coupling approach for the direct numerical simulation of 3D particulate flows. J Appl Mech 79:010901
4. Breugem WP (2012) A second-order accurate immersed boundary method for fully resolved simulations of particle-laden flows. J Comput Phys 231(13):4469–4498
5. Brilliantov NV, Albers N, Spahn F, Pöschel T (2007) Collision dynamics of granular particles with adhesion. Phys Rev E 76(5, Part 1):051302
6. Brilliantov NV, Spahn F, Hertzsch JM, Pöschel T (1996) Model for collisions in granular gases. Phys Rev E 53(5, Part B):5382–5392
7. Brown PP, Lawler DF (2003) Sphere drag and settling velocity revisited. J Environ Eng 129(3):222–231
8. Cheng NS (2009) Comparison of formulas for drag coefficient and settling velocity of spherical particles. Powder Technol 189(3):395–398
9. Chokshi A, Tielens AGGM, Hollenbach D (1993) Dust coagulation. Astrophys J 407(2, Part 1):806–819
10. Clift R, Grace JR, Weber ME (1978) Bubbles, drops and particles. Academic Press, New York
11. Crowe CT (2006) Multiphase flow handbook. CRC Press, Boca Raton
12. Cundall PA, Strack ODL (1979) Discrete numerical model for granular assemblies. Géotechnique 29(1):47–65
13. Feng YT, Han K, Owen DRJ (2007) Coupled lattice Boltzmann method and discrete element modelling of particle transport in turbulent fluid flows: computational issues. Int J Numer Meth Eng 72(9):1111–1134
14. Feng ZG, Michaelides EE (2005) Proteus: a direct forcing method in the simulations of particulate flows. J Comput Phys 202(1):20–51
15. Fortes AF, Joseph DD, Lundgren TS (1987) Nonlinear mechanics of fluidization of beds of spherical-particles. J Fluid Mech 177:467–483
16. Glowinski R, Pan TW, Hesla TI, Joseph DD, Periaux J (2001) A fictitious domain approach to the direct numerical simulation of incompressible viscous flow past moving rigid bodies: application to particulate flow. J Comput Phys 169(2):363–426
17. Haeri S, Shrimpton JS (2012) On the application of immersed boundary, fictitious domain and body-conformal mesh methods to many particle multiphase flows. Int J Multiph Flow 40:38–55
18. Hertz H (1882) Über die Berührung fester elastischer Körper. Journal für die reine und angewandte Mathematik 92:156–171
19. Hu HH (1996) Direct simulation of flows of solid-liquid mixtures. Int J Multiph Flow 22(2):335–352
20. Hu HH, Joseph DD, Crochet MJ (1992) Direct simulation of fluid particle motions. Theor Comput Fluid Dyn 3:285–306
21. Hu HH, Patankar NA, Zhu MY (2001) Direct numerical simulations of fluid-solid systems using the Arbitrary Lagrangian-Eulerian technique. J Comput Phys 169(2):427–462
22. Iwashita K, Oda M (1998) Rolling resistance at contacts in simulation of shear band development by DEM. J Eng Mech 124(3):285–292
23. Johnson AA, Tezduyar TE (1996) Simulation of multiple spheres falling in a liquid-filled tube. Comput Methods Appl Mech Eng 134(3–4):351–373
24. Johnson AA, Tezduyar TE (1997) 3D simulation of fluid-particle interactions with the number of particles reaching 100. Comput Methods Appl Mech Eng 145(3–4):301–321
25. Johnson AA, Tezduyar TE (1999) Advanced mesh generation and update methods for 3D flow simulations. Comput Mech 23(2):130–143
26. Johnson AA, Tezduyar TE (2001) Methods for 3D computation of fluid-object interactions in spatially periodic flows. Comput Methods Appl Mech Eng 190(24–25):3201–3221
27. Johnson KL, Kendall K, Roberts AD (1971) Surface energy and contact of elastic solids. Proc R Soc Lond A 324(1558):301–313
28. Kruggel-Emden H, Wirtz S, Scherer V (2008) A study on tangential force laws applicable to the discrete element method (DEM) for materials with viscoelastic or plastic behavior. Chem Eng Sci 63(6):1523–1541
29. Kuhn MR, Bagi K (2004) Alternative definition of particle rolling in a granular assembly. J Eng Mech 130(7):826–835
30. Lebedev VI, Laikov DN (1999) Quadrature formula for the sphere of 131-th algebraic order of accuracy. Dokl Akad Nauk SSSR 366(6):741–745
31. Loskofsky C, Song F, Newby BZ (2006) Underwater adhesion measurements using the JKR technique. J Adhes 82(7):713–730
32. Luding S (2004) Micro-macro transition for anisotropic, frictional granular packings. Int J Solids Struct 41(21):5821–5836
33. Luding S (2008) Cohesive, frictional powders: contact models for tension. Granular Matter 10(4):235–246
34. Luo K, Wang Z, Fan J (2007) A modified immersed boundary method for simulations of fluid-particle interactions. Comput Methods Appl Mech Eng 197(1–4):36–46
35. Maugis D (1992) Adhesion of spheres: the JKR-DMT transition using a Dugdale model. J Colloid Interface Sci 150(1):243–269
36. Pöschel T, Schwager T (2005) Computational granular dynamics. Springer, New York
37. ten Cate A, Nieuwstad CH, Derksen JJ, van den Akker A (2002) Particle imaging velocimetry experiments and lattice-Boltzmann simulations on a single sphere settling under gravity. Phys Fluids 14(11):4012–4025
38. Turek S (1999) Efficient solvers for incompressible flow problems: an algorithmic and computational approach. Springer, Berlin
39. Turek S, Becker C (1998) FEATFLOW. Finite element software for the incompressible Navier-Stokes equations. Institute for Applied Mathematics, University of Heidelberg, Heidelberg
40. Veeramani C, Minev PD, Nandakumar K (2007) A fictitious domain formulation for flows with rigid particles: a non-Lagrange multiplier version. J Comput Phys 224(2):867–879
41. Wachs A (2009) A DEM-DLM/FD method for direct numerical simulation of particulate flows: sedimentation of polygonal isometric particles in a Newtonian fluid with collisions. Comput Fluids 38(8):1608–1628
42. Wan D, Turek S (2006) Direct numerical simulation of particulate flow via multigrid FEM techniques and the fictitious boundary method. Int J Numer Meth Fluids 51(5):531–566
43. Wriggers P (2006) Computational contact mechanics, 2nd edn. Springer, Berlin
A Particle Finite Element Method (PFEM)
for Coupled Thermal Analysis of Quasi
and Fully Incompressible Flows
and Fluid-Structure Interaction Problems
1 Introduction
The analysis of thermally coupled flows and their interaction with structures is rele-
vant in many fields of engineering. In this work we present a Lagrangian numerical
technique for solving this kind of problems for quasi and fully incompressible fluids
using the Particle Finite Element Method (PFEM, www.cimne.com/pfem).
The PFEM treats the mesh nodes in the analysis domain as particles which can
freely move and even separate from the domain representing, for instance, the effect
of water drops or cutting particles in drilling problems. A mesh connects the nodes
discretizing the domain where the governing equations are solved using a stabilized
FEM. Examples of application of PFEM to problems in fluid and solid mechanics
including fluid-structure interaction (FSI) situations can be found in [4–6, 8–19, 28,
29, 33, 35, 36, 40–43]. Early attempts of the PFEM for solving thermally coupled
flows were reported in [1, 2].
In Lagrangian analysis procedures (such as the PFEM) the motion of the fluid
particles is tracked during the transient solution. Hence, the convective terms van-
ish in the momentum and heat transfer equations and no numerical stabilization is
needed for treating those terms. Two other sources of mass loss, however, remain
in the numerical solution of Lagrangian flows, i.e. that due to the treatment of the
incompressibility constraint by a stabilized numerical method, and that induced by
the inaccuracies in tracking the flow particles and, in particular, the free surface.
In this work the PFEM equations for analysis of thermally coupled flows and FSI
problems are derived using the stabilized formulation based on the Finite Calculus
(FIC) method proposed by Oñate et al. [20–27, 30–32, 37–39] that has excellent
mass preservation features.
The paper is organized as follows. In the next section we present the basic
equations for conservation of linear momentum, mass and heat transfer for a quasi-
incompressible fluid in a Lagrangian framework. A fully incompressible fluid can be
considered as a particular limit case of the former. Next we derive the stabilized
FIC form of the mass balance equation. Then the finite element discretization using
simplicial elements with equal order approximation for the velocity, the pressure and
the temperature is presented and the relevant matrices and vectors of the discretized
problem are given. Details of the implicit solution of the Lagrangian FEM equations
in time using a Newton iterative scheme are presented. The relevance of the bulk
stiffness terms in the tangent matrix for enhancing the convergence and accuracy of
the iterative solution scheme is discussed. The basic steps of the PFEM for solving
FSI problems are described.
The efficiency and accuracy of the PFEM technique are verified by solving a set of
adiabatic and thermally coupled quasi-incompressible free surface flow problems in
two (2D) and three (3D) dimensions involving FSI situations. The adiabatic problems
are the sloshing of water in a tank and the penetration of a water sphere into a
cylindrical tank containing water. The thermally coupled problems considered are
extended 2D versions of the adiabatic cases. The excellent performance of the
numerical method proposed in terms of mass conservation and general accuracy is
highlighted.
2 Governing Equations
$$\rho \frac{Dv_i}{Dt} - \frac{\partial \sigma_{ij}}{\partial x_j} - b_i = 0, \quad i, j = 1, \ldots, n_s \quad \text{in } \Omega. \tag{1}$$
In Eq. (1), $\Omega$ is the analysis domain with boundary $\Gamma$, $v_i$ and $b_i$ are the velocity
and body force components along the ith Cartesian axis, $\rho$ is the density, $n_s$ is the
number of space dimensions (i.e. $n_s = 3$ for 3D problems) and $\sigma_{ij}$ are the Cauchy
stresses that are split in the deviatoric ($s_{ij}$) and pressure ($p$) components as
$$\sigma_{ij} = s_{ij} + p\,\delta_{ij} \tag{2}$$
where $\delta_{ij}$ is the Kronecker delta. Note that the pressure is assumed to be positive for
a tension state. Summation of terms with repeated indices is assumed in Eq. (1) and
in the following, unless otherwise specified.
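The relationship between the deviatoric stresses $s_{ij}$ and the strain rates $\varepsilon_{ij}$ has the
standard form for a Newtonian fluid,
$$s_{ij} = 2\mu \left( \varepsilon_{ij} - \frac{1}{3} \varepsilon_v \delta_{ij} \right) \quad \text{with} \quad \varepsilon_{ij} = \frac{1}{2} \left( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} \right) \tag{3}$$
where $\mu$ is the viscosity and $\varepsilon_v$ is the volumetric strain rate defined as $\varepsilon_v = \varepsilon_{ii} = \frac{\partial v_i}{\partial x_i}$.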
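As an illustration of the stress split and the Newtonian law above, the following minimal sketch evaluates the strain rate, the deviatoric stress and the Cauchy stress from a given 3D velocity gradient. The function names and the sample values are hypothetical; the sign convention (pressure positive in tension) follows the text.

```python
def strain_rate(grad_v):
    """Symmetric part of the velocity gradient, grad_v[i][j] = dv_i/dx_j."""
    n = len(grad_v)
    return [[0.5 * (grad_v[i][j] + grad_v[j][i]) for j in range(n)]
            for i in range(n)]


def deviatoric_stress(grad_v, mu):
    """s_ij = 2 mu (eps_ij - eps_v delta_ij / 3) for a 3D velocity gradient."""
    n = len(grad_v)
    eps = strain_rate(grad_v)
    eps_v = sum(eps[i][i] for i in range(n))  # volumetric strain rate
    return [[2.0 * mu * (eps[i][j] - (eps_v / 3.0 if i == j else 0.0))
             for j in range(n)] for i in range(n)]


def cauchy_stress(grad_v, mu, p):
    """sigma_ij = s_ij + p delta_ij, with p positive in tension."""
    s = deviatoric_stress(grad_v, mu)
    n = len(s)
    return [[s[i][j] + (p if i == j else 0.0) for j in range(n)]
            for i in range(n)]


# Simple shear with rate 2.0 1/s in a fluid with mu = 1e-3 Pa s
shear = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
s = deviatoric_stress(shear, 1e-3)
```

For a pure volumetric expansion the deviatoric stress vanishes, so in that case the whole stress is carried by the pressure term.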
$$r_v = 0 \tag{4a}$$
with
$$r_v := -\frac{1}{\rho c^2} \frac{Dp}{Dt} + \varepsilon_v. \tag{4b}$$
In Eq. (4b) $c$ is the speed of sound in the fluid. For a fully incompressible fluid
$c = \infty$ and Eq. (4a) simplifies to the standard form, $\varepsilon_v = 0$. In our work we will
retain the quasi-incompressible form of $r_v$ of Eq. (4b) for convenience.
132 E. Oñate et al.
where T is the temperature, c is the thermal capacity, k is the heat conductivity and
Q is the heat source.
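$$T - T^p = 0 \quad \text{on } \Gamma_T \tag{8}$$
$$k \frac{\partial T}{\partial n} + q_n^p = 0 \quad \text{on } \Gamma_q \tag{9}$$
where $T^p$ and $q_n^p$ are the prescribed temperature and the prescribed normal heat
flux at the boundaries $\Gamma_T$ and $\Gamma_q$, respectively, and $n$ is the direction normal to the
boundary.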
Remark 1 The term $\frac{Dv_i}{Dt}$ in Eq. (1) is the material derivative of the ith velocity
component $v_i$. This term is typically computed in a Lagrangian framework as
$$\frac{Dv_i}{Dt} = \frac{{}^{n+1}v_i - {}^{n}v_i}{\Delta t} \tag{10}$$
with
$${}^{n+1}v_i := v_i({}^{n+1}\mathbf{x}, {}^{n+1}t), \qquad {}^{n}v_i := v_i({}^{n}\mathbf{x}, {}^{n}t) \tag{11}$$
where ${}^{n}v_i$ is the velocity of the material point that has the position ${}^{n}\mathbf{x}$ at time $t = {}^{n}t$,
where $\mathbf{x}$ is the coordinates vector in a fixed Cartesian system [3, 7, 46].
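The difference quotient of Remark 1 follows a material point, so it already contains the convective contribution that an Eulerian description has to add explicitly. The sketch below checks this on a toy case, assuming a steady 1D velocity field $v(x) = a x$ whose particle paths are known in closed form. The field and all names are hypothetical.

```python
import math


def velocity(x, a):
    """Hypothetical steady 1D velocity field v(x) = a x."""
    return a * x


def lagrangian_acceleration(x0, a, dt):
    """Difference quotient of the velocity along the particle path.
    The path of dx/dt = a x is x(t) = x0 exp(a t)."""
    x1 = x0 * math.exp(a * dt)   # particle position at the end of the step
    return (velocity(x1, a) - velocity(x0, a)) / dt


def eulerian_acceleration(x, a):
    """Local term (zero for a steady field) plus convective term v dv/dx."""
    return 0.0 + velocity(x, a) * a


acc_lagrangian = lagrangian_acceleration(0.5, 2.0, 1e-4)
acc_eulerian = eulerian_acceleration(0.5, 2.0)
```

Both values agree up to the first-order error of the difference quotient, although no convective term appears in the Lagrangian evaluation.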
In this work we will use the second order FIC form of the mass balance equation in
space for a quasi-incompressible fluid [37, 38], as well as the first order FIC form of
the mass balance equation in time. These forms have the following expressions:
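$$r_v + \frac{h_i^2}{12} \frac{\partial^2 r_v}{\partial x_i^2} = 0 \quad \text{in } \Omega, \quad i = 1, \ldots, n_s. \tag{12a}$$
$$r_v + \frac{\delta}{2} \frac{Dr_v}{Dt} = 0 \quad \text{in } \Omega. \tag{12b}$$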
$$r_{m_i} = \frac{\partial}{\partial x_j}(2\mu \varepsilon_{ij}) + \frac{\partial p}{\partial x_i} + b_i. \tag{15}$$
Equation (13) is used as the starting point for deriving the stabilized FEM formu-
lation as explained in the following sections.
4 Variational Equations
Multiplying Eq. (1) by arbitrary test functions $w_i$ with dimensions of velocity and
integrating over the analysis domain $\Omega$ gives the weighted residual form of the
momentum equations as [3, 7, 46]
$$\int_\Omega w_i \left( \rho \frac{Dv_i}{Dt} - \frac{\partial \sigma_{ij}}{\partial x_j} - b_i \right) d\Omega = 0. \tag{16}$$
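Integrating by parts the term involving $\sigma_{ij}$ and using the Neumann boundary
conditions (7) yields the weak variational form of the momentum equations as
$$\int_\Omega w_i \rho \frac{Dv_i}{Dt} d\Omega + \int_\Omega \delta\varepsilon_{ij} \sigma_{ij} d\Omega - \int_\Omega w_i b_i d\Omega - \int_{\Gamma_t} w_i t_i^p d\Gamma = 0 \tag{17}$$
where $\delta\varepsilon_{ij} = \frac{1}{2} \left( \frac{\partial w_i}{\partial x_j} + \frac{\partial w_j}{\partial x_i} \right)$ is an arbitrary (virtual) strain rate field. Equation (17) is the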
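$$\begin{gathered}
\boldsymbol{\sigma} = [\sigma_{11}, \sigma_{22}, \sigma_{33}, \sigma_{12}, \sigma_{13}, \sigma_{23}]^T, \quad \boldsymbol{\varepsilon} = [\varepsilon_{11}, \varepsilon_{22}, \varepsilon_{33}, \varepsilon_{12}, \varepsilon_{13}, \varepsilon_{23}]^T \\
\mathbf{D} = \begin{bmatrix}
4/3 & -2/3 & -2/3 & 0 & 0 & 0 \\
 & 4/3 & -2/3 & 0 & 0 & 0 \\
 & & 4/3 & 0 & 0 & 0 \\
 & & & 1 & 0 & 0 \\
\text{Sym.} & & & & 1 & 0 \\
 & & & & & 1
\end{bmatrix}, \quad \mathbf{m} = [1, 1, 1, 0, 0, 0]^T.
\end{gathered} \tag{20}$$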
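Integrating by parts the last integral in Eq. (21) and using Eq. (15) gives after some
transformations [39, 40]
$$\begin{aligned}
&\int_\Omega \frac{q}{\kappa} \frac{Dp}{Dt} d\Omega + \int_\Omega q \frac{\tau}{c^2} \frac{D^2 p}{Dt^2} d\Omega - \int_\Omega q \varepsilon_v d\Omega + \int_\Omega \tau \frac{\partial q}{\partial x_i} \left( \frac{\partial}{\partial x_j}(2\mu \varepsilon_{ij}) + \frac{\partial p}{\partial x_i} + b_i \right) d\Omega \\
&\quad - \int_{\Gamma_t} q \tau \left[ \rho \frac{Dv_n}{Dt} - \frac{2}{h_n} \left( 2\mu \frac{\partial v_n}{\partial n} + p - t_n \right) \right] d\Gamma = 0.
\end{aligned} \tag{22}$$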
Expression (22) holds for 2D and 3D problems. The terms involving the first and
second material time derivative of the pressure and the boundary term in Eq. (22)
are important for preserving the conservation of mass in free-surface flow problems
[10, 41].
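Application of the standard weighted residual method to the heat balance Eqs. (5)
and (9) leads, after standard operations, to [7, 44]
$$\int_\Omega \hat{w} \rho c \frac{\partial T}{\partial t} d\Omega + \int_\Omega \frac{\partial \hat{w}}{\partial x_i} k \frac{\partial T}{\partial x_i} d\Omega - \int_\Omega \hat{w} Q d\Omega + \int_{\Gamma_q} \hat{w} q_n^p d\Gamma = 0 \tag{23}$$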
5 FEM Discretization
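We discretize the analysis domain into finite elements with $n$ nodes in the standard
manner leading to a mesh with a total number of $N_e$ elements and $N$ nodes. In our
work we will choose simple 3-noded linear triangles ($n = 3$) for 2D problems and
4-noded tetrahedra ($n = 4$) for 3D problems with local linear shape functions $N_i^e$
defined for each node $i$ ($i = 1, n$) of element $e$ [34, 44]. The velocity components, the
pressure and the temperature are interpolated over the mesh in terms of their nodal
values in the same manner using the global linear shape functions $N_j$ spanning over
the elements sharing node $j$ ($j = 1, N$) [34, 44–46]. In matrix form
$$\mathbf{v} = \mathbf{N}_v \bar{\mathbf{v}}, \quad p = \mathbf{N}_p \bar{\mathbf{p}}, \quad T = \mathbf{N}_T \bar{\mathbf{T}} \tag{24}$$
where
$$\begin{gathered}
\bar{\mathbf{v}} = \begin{Bmatrix} \bar{\mathbf{v}}^1 \\ \bar{\mathbf{v}}^2 \\ \vdots \\ \bar{\mathbf{v}}^N \end{Bmatrix} \text{ with } \bar{\mathbf{v}}^i = \begin{Bmatrix} \bar{v}_1^i \\ \bar{v}_2^i \\ \bar{v}_3^i \end{Bmatrix}, \quad \bar{\mathbf{p}} = \begin{Bmatrix} \bar{p}^1 \\ \bar{p}^2 \\ \vdots \\ \bar{p}^N \end{Bmatrix}, \quad \bar{\mathbf{T}} = \begin{Bmatrix} \bar{T}^1 \\ \bar{T}^2 \\ \vdots \\ \bar{T}^N \end{Bmatrix} \\
\mathbf{N}_v = [\mathbf{N}_1, \mathbf{N}_2, \ldots, \mathbf{N}_N], \quad \mathbf{N}_p = \mathbf{N}_T = [N_1, N_2, \ldots, N_N]
\end{gathered} \tag{25}$$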
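Substituting Eq. (24) into Eqs. (17), (22) and (23) and choosing a Galerkin formulation
with $w_i = q = \hat{w} = N_i$ leads to the following system of algebraic equations
$$\mathbf{M}_0 \dot{\bar{\mathbf{v}}} + \mathbf{K} \bar{\mathbf{v}} + \mathbf{Q} \bar{\mathbf{p}} - \mathbf{f}_v = \mathbf{0} \tag{26a}$$
$$\mathbf{M}_1 \dot{\bar{\mathbf{p}}} + \mathbf{M}_2 \ddot{\bar{\mathbf{p}}} - \mathbf{Q}^T \bar{\mathbf{v}} + (\mathbf{L} + \mathbf{M}_b) \bar{\mathbf{p}} - \mathbf{f}_p = \mathbf{0} \tag{26b}$$
$$\mathbf{C} \dot{\bar{\mathbf{T}}} + \hat{\mathbf{L}} \bar{\mathbf{T}} - \mathbf{f}_T = \mathbf{0} \tag{26c}$$
where $\dot{\mathbf{a}}$ and $\ddot{\mathbf{a}}$ denote the first and second material time derivatives of the components
of a vector $\mathbf{a}$. The different matrices and vectors in Eqs. (26a)–(26c) are assembled from
the element contributions given in Box 1.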
Remark 3 The presence of matrix Mb in Eq. (26b) allows us to compute the pressure
without the need of prescribing its value at the free surface. This eliminates the error
introduced when the pressure is prescribed to zero in free boundaries, which leads
to considerable mass losses in viscous flows [15].
Remark 4 The stabilization parameter $\tau$ of Eq. (14) is computed for each element $e$
using $h = l^e$ and $\delta = \Delta t$ as
$$\tau = \left( \frac{8\mu}{(l^e)^2} + \frac{2\rho}{\Delta t} \right)^{-1} \tag{27}$$
where $\Delta t$ is the time step used for the transient solution and $l^e$ is a characteristic
element length computed as $l^e = 2(\Omega^e)^{1/n_s}$ where $\Omega^e$ is the element area (for
3-noded triangles) or volume (for 4-noded tetrahedra). For fluids with heterogeneous
material the values of $\mu$ and $\rho$ are computed at the element center.
The characteristic boundary length $h_n$ in the expression of $\mathbf{f}_p$ (Box 1) has been
taken equal to $l^e$ in our computations.
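A minimal sketch of the element-wise evaluation described in Remark 4 is given below, assuming the reading $\tau = (8\mu/(l^e)^2 + 2\rho/\Delta t)^{-1}$ with $l^e = 2(\Omega^e)^{1/n_s}$. The function names and the water-like sample values are hypothetical.

```python
def characteristic_length(measure, n_s):
    """l_e = 2 * (element area or volume) ** (1 / n_s)."""
    return 2.0 * measure ** (1.0 / n_s)


def stabilization_parameter(mu, rho, l_e, dt):
    """tau = 1 / (8 mu / l_e**2 + 2 rho / dt), evaluated per element."""
    return 1.0 / (8.0 * mu / l_e ** 2 + 2.0 * rho / dt)


# Hypothetical water-like values: mu = 1e-3 Pa s, rho = 1e3 kg/m^3,
# a triangle of area 0.25 m^2 and a time step of 1e-3 s
l_e = characteristic_length(0.25, 2)
tau = stabilization_parameter(1e-3, 1e3, l_e, 1e-3)
```

For low viscosities and small time steps the second term dominates, so that $\tau$ is close to $\Delta t / (2\rho)$.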
Equations (26a), (26b), (26c) are solved in time with an implicit Newton-Raphson
type iterative scheme [3, 7, 44, 46]. The basic steps within a time interval [n, n + 1]
are:
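$$\mathbf{r}_m = \mathbf{M}_0 \dot{\bar{\mathbf{v}}} + \mathbf{K} \bar{\mathbf{v}} + \mathbf{Q} \bar{\mathbf{p}} - \mathbf{f}_v, \quad \mathbf{H}_v = \frac{1}{\Delta t} \mathbf{M}_0 + \mathbf{K} + \mathbf{K}_v \tag{28b}$$
A more accurate expression for computing ${}^{n+1}\mathbf{x}^{i+1}$ can be used involving the
nodal accelerations [40].
Step 5. Compute the nodal temperatures
with
$$\mathbf{r}_T = \mathbf{C} \dot{\bar{\mathbf{T}}} + \hat{\mathbf{L}} \bar{\mathbf{T}} - \mathbf{f}_T \tag{33}$$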
where $e_v$, $e_p$ and $e_T$ are prescribed error norms. In our examples we have set
$e_v = e_p = e_T = 10^{-3}$.
If conditions (34) are satisfied then make $n \leftarrow n + 1$ and proceed to the next time
step. Otherwise, make the iteration counter $i \leftarrow i + 1$ and repeat Steps 1–5.
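The structure of such an implicit iteration within one time step can be sketched on a scalar toy problem. The example below assumes an implicit Euler step for the hypothetical equation $dv/dt = -k v^2$ and a convergence check on the relative size of the increment. The exact form of conditions (34) is not reproduced here, so the check used is an assumption, and all names are hypothetical.

```python
def implicit_euler_newton(v_n, k, dt, tol=1e-3, max_iter=20):
    """One implicit Euler step for dv/dt = -k v**2 solved by Newton iteration.
    Returns the converged value and the number of iterations used."""
    v = v_n                                  # initial guess: previous value
    for iteration in range(1, max_iter + 1):
        residual = (v - v_n) / dt + k * v * v
        tangent = 1.0 / dt + 2.0 * k * v     # derivative of the residual
        dv = -residual / tangent
        v += dv
        if abs(dv) <= tol * abs(v):          # assumed relative increment check
            return v, iteration
    raise RuntimeError("Newton iteration did not converge")


v_next, n_iter = implicit_euler_newton(v_n=1.0, k=1.0, dt=0.1)
```

The tangent contains a term proportional to $1/\Delta t$ plus the derivative of the nonlinear term, which mirrors the structure of the iteration matrix $\mathbf{H}_v$ in Eq. (28b).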
Remark 5 In Eqs. (28a)–(34) ${}^{n+1}(\cdot)$ denotes the values of a matrix or a vector
computed using the nodal unknowns at time $n + 1$. In this work the derivatives and
integrals in all the matrices and the residual vectors $\mathbf{r}_m$ and $\mathbf{r}_T$ are computed on
the discretized geometry at time $n$ while the nodal force vectors $\mathbf{f}_v$, $\mathbf{f}_p$ and $\mathbf{f}_T$ are
computed on the current configuration at time $n + 1$. This is equivalent to using an
updated Lagrangian formulation [3, 45, 46].
Remark 6 Including the bulk stiffness matrix $\mathbf{K}_v$ in $\mathbf{H}_v$ has proven to be essential for
the fast convergence, mass preservation and overall accuracy of the iterative solution
[10, 41]. The element expression of $\mathbf{K}_v$ can be obtained as [41]
$$\mathbf{K}_v^e = \int_{\Omega^e} \mathbf{B}^T \mathbf{m}\, \theta \Delta t \kappa\, \mathbf{m}^T \mathbf{B}\, d\Omega \tag{35}$$
where $\theta$ is a positive number such that $0 < \theta \le 1$ that has the role of preventing
the ill-conditioning of the iteration matrix $\mathbf{H}_v$ for highly incompressible fluids. An
adequate selection of $\theta$ also improves the overall accuracy of the numerical solution
and the preservation of mass for large time steps [10]. For fully incompressible fluids
($c = \infty$ and $\kappa = \infty$), a finite value of $\kappa$ is used in practice in $\mathbf{K}_v$ as this helps to obtain an
accurate solution for velocities and pressure with reduced mass loss in few iterations
per time step [10]. These considerations, however, do not affect the value of $\kappa$ within
matrix $\mathbf{M}_1$ in Eq. (26b) that vanishes for a fully incompressible fluid. Clearly, the
value of the terms of $\mathbf{K}_v^e$ can also be limited by reducing the time step size. This,
however, leads to an increase in the cost of the computations. A similar approach for
improving mass conservation in incompressible flows was proposed in [42].
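To make the structure of the bulk stiffness matrix of Remark 6 concrete, the sketch below evaluates it for a single 3-noded linear triangle. It assumes a 2D strain rate vector ordered as normal, normal, shear components, so that $\mathbf{m} = [1, 1, 0]^T$ and $\mathbf{m}^T \mathbf{B}$ reduces to the row of shape function derivatives that produces the velocity divergence. The function name and the sample values are hypothetical.

```python
def bulk_stiffness_triangle(xy, theta, dt, kappa):
    """Element matrix area * theta * dt * kappa * d^T d for a linear triangle,
    where d = m^T B holds the shape function derivatives (divergence row).
    xy lists the three nodal coordinates; unknowns are (v_x, v_y) per node."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    area2 = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice signed area
    area = 0.5 * abs(area2)
    dn_dx = [(y2 - y3) / area2, (y3 - y1) / area2, (y1 - y2) / area2]
    dn_dy = [(x3 - x2) / area2, (x1 - x3) / area2, (x2 - x1) / area2]
    div_row = []
    for a in range(3):
        div_row += [dn_dx[a], dn_dy[a]]
    coef = theta * dt * kappa * area   # constant integrand times element area
    return [[coef * div_row[i] * div_row[j] for j in range(6)]
            for i in range(6)]


# Hypothetical values: unit right triangle, theta = 1, dt = 1e-3 s,
# kappa = 2.25e9 Pa (a water-like bulk modulus)
K = bulk_stiffness_triangle([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], 1.0, 1e-3, 2.25e9)
```

The matrix is symmetric, scales linearly with $\theta$, $\Delta t$ and $\kappa$, and gives zero forces for a rigid translation, since such a motion has no volumetric strain rate.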
the maximum value of the modulus of the velocity of all nodes in the mesh and
$\Delta t_b$ is the critical time step of all nodes approaching a solid boundary defined as
$$\Delta t_b = \min \left( \frac{{}^{n}l_b}{|{}^{n}\mathbf{v}_b|} \right)$$
where ${}^{n}l_b$ is the distance from the node to the boundary and
${}^{n}\mathbf{v}_b$ is the velocity of the node. This definition of $\Delta t_b$ intends that no node crosses a
solid boundary during a time step.
A method that allows using large time steps in the integration of the PFEM equa-
tions can be found in [16].
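The boundary-related time step limit can be sketched as follows, assuming that it is the minimum over all nodes of the distance to the boundary divided by the nodal speed. This simplified version uses the velocity modulus rather than the component towards the boundary, and the function name and sample values are hypothetical.

```python
import math


def boundary_time_step(distances, velocities, dt_max=math.inf):
    """Minimum over nodes of l_b / |v_b|; nodes at rest impose no limit."""
    dt = dt_max
    for l_b, v in zip(distances, velocities):
        speed = math.hypot(*v)     # modulus of the nodal velocity
        if speed > 0.0:
            dt = min(dt, l_b / speed)
    return dt


# Two nodes, 0.01 m and 0.02 m away from a wall, moving at 1 m/s and 4 m/s
dt_b = boundary_time_step([0.01, 0.02], [(1.0, 0.0), (0.0, 4.0)])
```

In this example the second node governs the limit, because it would reach the wall first despite being farther away.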
Fig. 1 Sequence of steps to update a cloud of nodes representing a domain containing a fluid
and a solid part from time n ($t = {}^{n}t$) to time n + 2 ($t = {}^{n}t + 2\Delta t$). [Panels: initial cloud of nodes; cloud, domain and mesh at times n and n + 1, with flying sub-domains and the fixed boundary. Legend: solid node, fluid node, fixed boundary node]
Let us consider a domain V containing fluid and solid subdomains. Each subdomain
is characterized by a set of points, hereafter termed particles. The particles contain all
the information for defining the geometry and the material and mechanical properties
of the underlying subdomain. In the PFEM both subdomains are modelled using an
updated Lagrangian formulation [3, 45].
The solution steps within a time step in the PFEM are as follows:
1. The starting point at each time step is the cloud of points C in the fluid and solid
domains. For instance ${}^{n}C$ denotes the cloud at time $t = {}^{n}t$ (Fig. 1).
2. Identify the boundaries defining the analysis domain ${}^{n}V$, as well as the subdomains
in the fluid and the solid. This is an essential step as some boundaries
(such as the free surface in fluids) may be severely distorted during the solution,
including separation and re-entering of nodes. The Alpha Shape method [8] is
used for the boundary definition. Clearly, the accuracy in the reconstruction of
the boundaries depends on the number of points in the vicinity of each boundary
and on the Alpha Shape parameter. In the problems solved in this work the Alpha
Shape method has been implemented as described in [12, 28].
3. Discretize the analysis domain ${}^{n}V$ with a finite element mesh ${}^{n}M$. We use
an efficient mesh generation scheme based on an enhanced Delaunay tessellation
[11, 12].
4. Solve the Lagrangian equations of motion for the overall continuum using the
standard FEM. Compute the state variables at the next (updated) configuration
for ${}^{n}t + \Delta t$: velocities, pressure and viscous stresses in the fluid and displacements,
stresses and strains in the solid.
5. Move the mesh nodes to a new position ${}^{n+1}C$ where $n + 1$ denotes the time ${}^{n}t + \Delta t$,
in terms of the time increment size.
6. Go back to step 1 and repeat the solution for the next time step to obtain ${}^{n+2}C$.
Fig. 2 2D analysis of sloshing of water in rectangular tank. Initial geometry, analysis data and
mesh of 5064 3-noded triangles discretizing the water in the tank
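Steps 2 and 5 of the list above can be illustrated with a small sketch. It assumes a common variant of the Alpha Shape criterion in which a triangle of the tessellation is kept only if its circumradius does not exceed $\alpha h$, with $h$ a reference nodal spacing. The implementation referred to in [12, 28] is not reproduced here, and all names and values are hypothetical.

```python
import math


def circumradius(a, b, c):
    """Circumradius of the triangle a, b, c (2D points)."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    if area == 0.0:
        return math.inf            # degenerate (collinear) triangle
    return la * lb * lc / (4.0 * area)


def alpha_shape_filter(points, triangles, alpha, h):
    """Keep the triangles whose circumradius is at most alpha * h."""
    return [t for t in triangles
            if circumradius(*(points[i] for i in t)) <= alpha * h]


def move_nodes(points, velocities, dt):
    """Step 5: update the nodal positions with the computed velocities."""
    return [(x + dt * vx, y + dt * vy)
            for (x, y), (vx, vy) in zip(points, velocities)]


# Three closely spaced nodes and one node that has separated from the domain
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0)]
triangles = [(0, 1, 2), (1, 3, 2)]
kept = alpha_shape_filter(points, triangles, alpha=1.2, h=1.0)
```

The stretched triangle that connects the distant node is discarded, so that node is treated as a separate particle rather than as part of the fluid domain.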
Note that the key differences between the PFEM and the classical FEM are the
remeshing technique and the identification of the domain boundary at each time step.
The CPU time required for meshing grows linearly with the number of nodes. As a
general rule, for 3D problems meshing consumes around 15 % of the total CPU time
per time step, while the solution of the equations (with typically 3 iterations per time
step) and the system assembly consume approximately 70 % and 15 % of the CPU
time per time step, respectively. These figures refer to analyses on a single-processor
Pentium IV PC [36]. Considerable speed-up can be gained using parallel computing
techniques.
In this work we will apply the PFEM to problems involving a rigid domain con-
taining fluid particles only. Application of the PFEM in fluid and solid mechanics
and in fluid-structure interaction problems can be found in [4–6, 8–19, 28, 29, 33,
35, 36, 40–43], as well as in www.cimne.com/pfem.
Fig. 3 2D sloshing of water in rectangular tank. Snapshots of water geometry at four different times
($\theta = 1$). Colours indicate pressure contours. a t = 5.7 s; b t = 7.4 s; c t = 13.3 s; d t = 18.6 s
8 Examples
The problem has been solved first in 2D. Figure 2 shows the analysis data. The fluid
oscillates due to the hydrostatic forces induced by its original position.
The problem has been run using different values of the parameter $\theta$ in the tangent
bulk stiffness matrix $\mathbf{K}_v^e$ (Eq. 35). The first set of results (Figs. 3 and 4) was obtained
with $\theta = 1$. The problem was then solved for $\theta = 0.08$, thereby reducing the
diagonal terms in $\mathbf{K}_v^e$ by one order of magnitude.
Figure 3 shows snapshots of the water geometry at different times. Pressure contours
are superimposed on the deformed geometry of the fluid in the figures.
Fig. 4 2D sloshing of water in rectangular tank. a Time evolution of the percentage of water volume
loss due to the numerical algorithm. b Volume variation (in %) per time step over 20 s of analysis
(current method with $\theta = 1$). Average volume variation per time step. Current method:
$1.09 \times 10^{-4}$ %. Fractional step: $2.07 \times 10^{-4}$ %
Figure 4 shows the evolution of the percentage of water volume (i.e. mass) loss
introduced by the numerical solution scheme. The accumulated volume loss (in
percentage versus the initial volume) for the method proposed with $\theta = 1$ is approximately
1.33 % over 20 s of simulation time (Fig. 4a). The average volume variation
in absolute value per time step is $1.09 \times 10^{-4}$ % (Fig. 4b). The total water volume loss
is the sum of the losses induced by the numerical scheme and the losses due to the
updating of the free surface using the PFEM. No correction of mass was introduced
at the end of each time step. Taking all this into account, the fluid volume loss over
the analysis period is remarkably low.
Fig. 5 2D sloshing of water in rectangular tank. Time evolution of percentage of water volume
loss obtained using the current method with $\theta = 0.08$ (curve A) and $\theta = 1$ (curve B). $\Delta t = 10^{-3}$ s
Fig. 6 2D sloshing of water in rectangular tank. Time evolution of percentage of water volume
loss obtained with the current method. Curve A: $\theta = 1$ and $\Delta t = 10^{-4}$ s. Curve B: $\theta = 1$ and
$\Delta t = 10^{-3}$ s
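The two figures quoted above can be related by a short calculation. The sketch below assumes that the run with $\theta = 1$ used $\Delta t = 10^{-3}$ s, which is the value given for that curve in the caption of Fig. 5; the function names are hypothetical.

```python
def number_of_steps(duration, dt):
    """Number of time steps needed to cover the given duration."""
    return round(duration / dt)


def net_loss_per_step(accumulated_percent, duration, dt):
    """Accumulated volume loss (in %) divided by the number of time steps."""
    return accumulated_percent / number_of_steps(duration, dt)


steps = number_of_steps(20.0, 1e-3)              # 20 s of analysis
net = net_loss_per_step(1.33, 20.0, 1e-3)        # net loss per step in %
ratio = net / 1.09e-4                            # relative to the average absolute variation
```

Under this assumption the net loss per step is about $6.65 \times 10^{-5}$ %, which is smaller than the reported average absolute variation of $1.09 \times 10^{-4}$ %. This would be consistent with per-step variations of both signs that partly cancel in the accumulated value.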
The volume losses induced by the free surface updating can be reduced using a
finer mesh in that region in conjunction with an enhanced alpha shape technique.
The total fluid volume loss can be reduced to almost zero by introducing a small
correction in the free surface at the end of each time step [41].
The fluid volume losses obtained using a standard first order fractional step method
[41] and the PFEM are shown in Fig. 4a for comparison. Clearly the method proposed
in this work leads to a reduction in the overall fluid volume loss, as well as in the
volume loss per time step.
Fig. 7 3D analysis of sloshing of water in prismatic tank ($\theta = 1$). Analysis data and snapshots of
water geometry at a t = 5.7 s (left) and b t = 7.4 s (right)
Figure 5 shows a comparison between the fluid volume loss for $\theta = 1$ and $\theta = 0.08$
using the same time step in both cases ($\Delta t = 10^{-3}$ s). Results show that the reduction
of the tangent bulk stiffness matrix terms leads to an improvement in the preservation
of the initial volume of the fluid. It is noted that the convergence of the iterative
solution for $\theta = 0.08$ was the same as for $\theta = 1$.
Figure 6 shows that a similar improvement in the volume preservation can be
obtained using $\theta = 1$ and reducing the time step to $\Delta t = 10^{-4}$ s. This, however,
increases the cost of the computations.
These results indicate that accurate numerical results with reduced volume losses
can be obtained by appropriately adjusting the parameter $\theta$ in the tangent bulk modulus
matrix while keeping the time step size to competitive values in terms of CPU
Fig. 8 3D analysis of sloshing of water in prismatic tank ($\theta = 1$). a Time evolution of accumulated
water volume loss (in %) due to the numerical algorithm. b Volume loss (in %) per time step over
2 s of analysis. Average volume loss per time step: $1.64 \times 10^{-4}$ %
Fig. 9 Water sphere falling in a tank filled with water. Analysis data and initial discretization of
the sphere and the water in the tank with 88892 4-noded tetrahedra
This example is the 3D analysis of the impact of a sphere made of water as it falls
into a cylindrical tank containing water. The water in the sphere and the water in the tank
mix into a single fluid after the impact. Figure 9 shows the material and analysis data
and the initial discretization of the sphere and the water in the tank into 88892 4-noded
tetrahedra. The problem was solved with the new stabilized method presented in the
paper with $\theta = 1$. Figure 10 shows snapshots of the mixing process at different times.
An average of four iterations per time step was needed for convergence of the velocity
and the pressure throughout the analysis. The total water mass lost in the sphere and
the tank due to the numerical algorithm was 2 % after 3 s of analysis (Fig. 11a).
Fig. 10 Water sphere falling in tank containing water. Evolution of the impact and mixing of the
two liquids at different times. Results for $\theta = 1$. a t = 0.175 s, b t = 0.275 s, c t = 0.5 s, d t = 0.9 s
Fig. 11 Water sphere falling in a tank containing water. a Accumulated volume loss over three seconds
of analysis due to the numerical algorithm ($\theta = 1$). b Volume loss (in %) per time step. Average
volume variation per time step: $2.54 \times 10^{-4}$ %
An elastic object falls into a tank containing a fluid at rest. The tank walls are maintained
at temperature T = 75 °C during the whole analysis, while the fluid and the solid object
have an initial temperature of T = 20 °C. The geometry and the problem data of
the 2D simulation, as well as the thermal initial and boundary conditions, are shown in
Fig. 15. Both the fluid and the solid have a high thermal conductivity. The heat flux
along the fluid and solid surfaces in contact with the air has been considered null.
The fluid and the solid domains have been discretized with 1986 and 108 3-noded
triangular finite elements, respectively. The duration of the simulation is 10 s and the
time step increment chosen is $\Delta t = 0.0005$ s.
Fig. 12 2D sloshing of a fluid in a heated tank. Initial geometry, problem data, thermal boundary
and initial conditions. [Data panel: material data (viscosity, bulk modulus, density, conductivity, thermal capacity); geometry data (H1, H2, D, average mesh size); analysis data (total duration, time step increment)]
Fig. 13 2D sloshing of a fluid in a heated tank. Snapshots of fluid geometry at six different times.
Colours indicate temperature contours
Figure 16 collects some representative snapshots of the numerical simulation with
the temperature results plotted over the fluid and the solid domains.
Figure 17 shows the evolution of the temperature at the central point of the
solid object. As expected, its temperature tends to T = 75 °C.
152 E. Oate et al.
Fig. 14 2D sloshing of a fluid in a heated tank. Evolution of temperature with time at the points
A, B and C of Fig. 12
Fig. 15 Falling of a solid object in a heated tank filled with fluid. Initial geometry, problem data,
thermal boundary and initial conditions
Fig. 16 Falling of a solid object in a heated tank filled with fluid. Snapshots at six different times.
Colours indicate temperature contours
Fig. 17 Falling of a solid object in a heated tank filled with fluid. Time evolution of the temperature
at the center of the solid object
9 Concluding Remarks
Acknowledgments This research was partially supported by the Advanced Grant project SAFECON
of the European Research Council.
References
1. Aubry R, Idelsohn SR, Oate E (2005) Particle finite element method in fluid-mechanics
including thermal convection-diffusion. Comput Sruct 83(1718):14591475
2. Aubry R, Idelsohn SR, Oate E (2006) Fractional step like schemes for free surface problems
with thermal coupling using the Lagrangian PFEM. Comput Mech 38(45):294309
3. Belytschko T, Liu WK, Moran B (2013) Non linear finite element for continua and structures,
2nd edn. Wiley, New York
4. Carbonell JM, Oate E, Surez B (2010) Modeling of ground excavation with the particle finite
element method. J Eng Mech (ASCE) 136(4):455463
5. Carbonell JM, Oate E (2013) Surez B (2013) Modelling of tunnelling processes and cutting
tool wear with the Particle Finite Element Method (PFEM). Comput Mech 52:607629. doi:10.
1007/s00466-013-0835-x (Accepted)
6. Cremonesi M, Frangi A, Perego U (2011) A Lagrangian finite element approach for the simu-
lation of water-waves induced by landslides. Comput Struct 89:10861093
7. Donea J, Huerta A (2003) Finite element method for flow problems. Wiley, Chichester
8. Edelsbrunner H, Mucke EP (1999) Three dimensional alpha shapes. ACM Trans Graphics
13:4372
9. Felippa F, Oate E (2007) Nodally exact Ritz discretizations of 1D diffusion-absorption and
Helmholtz equations by variational FIC and modified equation methods. Comput Mech 39:91
111
10. Franci A, Oate E, Carbonell JM (2013) On the effect of the tangent bulk stiffness matrix in
the analysis of free surface Lagrangian flows using PFEM. Research Report CIMNE PI402.
Int J Numer Meth Biomed Eng 38(2):125138 (Submitted)
11. Idelsohn SR, Calvo N, Oate E (2003c) Polyhedrization of an arbitrary point set. Comput Meth
Appl Mech Eng 192(2224):26492668
12. Idelsohn SR, Oate E, Del Pin F (2004) The particle finite element method: a powerful tool to
solve incompressible flows with free-surfaces and breaking waves. Int J Numer Meth Biomed
Eng 61(7):964989
13. Idelsohn SR, Marti J, Limache A, Oate E (2008) Unified Lagrangian formulation for elastic
solids and incompressible fluids: Application to fluid-structure interaction problems via the
PFEM. Comput Meth Appl Mech Eng 197:17621776
A Particle Finite Element Method (PFEM) 155
14. Idelsohn SR, Mier-Torrecilla M, Oate E (2009) Multi-Fluid flows with the particle finite
element method. Comput Meth Appl Mech Eng 198:27502767
15. Idelsohn SR, Oate E (2010) The challenge of mass conservation in the solution of free-surface
flows with the fractional-step method: problems and solutions. Int J Numer Meth Biomed Eng
26:13131330
16. Idelsohn SR, Nigro N, Limache A, Oate E (2012) Large time-step explicit integration method
for solving problem with dominant convection. Comput Meth Appl Mech Eng 217220:168
185
17. Larese A, Rossi R, Oate E, Idelsohn SR (2008) Validation of the particle finite element method
(PFEM) for simulation of free surface flows. Eng Comput 25(4):385425
18. Limache A, Idelsohn SR, Rossi R, Oate E (2007) The violation of objectivity in Laplace
formulation of the Navier-Stokes equations. Int J Numer Meth Fluids 54:639664
19. Oliver X, Cante JC, Weyler R, Gonzlez C, Hernndez J (2007) Particle finite element methods
in solid mechanics problems. In: Oate E, Owen R (eds) Computational Plasticity. Springer,
Berlin, pp 87103
20. Oate E (1998) Derivation of stabilized equations for advective-diffusive transport and fluid
flow problems. Comput Meth Appl Mech Eng 151:233267
21. Oate E, Manzan M (1999) A general procedure for deriving stabilized space-time finite ele-
ment methods for advective-diffusive problems. Int J Numer Meth Fluids 31:203221
22. Oate E (2000) A stabilized finite element method for incompressible viscous flows using a
finite increment calculus formulation. Comput Meth Appl Mech Eng 182(12):355370
23. Oate E, Garca J (2001) A finite element method for fluid-structure interaction with surface
waves using a finite calculus formulation. Comput Meth Appl Mech Eng 191:635660
24. Oate E (2003) Multiscale computational analysis in mechanics using finite calculus: an intro-
duction. Comput Meth Appl Mech Eng 192(2830):30433059
25. Oate E, Taylor RL, Zienkiewicz OC, Rojek J (2003) A residual correction method based on
finite calculus. Eng Comput 20:629658
26. Oate E (2004) Possibilities of finite calculus in computational mechanics. Int J Num Meth
Eng 60(1):255281
27. Oate E, Rojek J, Taylor R, Zienkiewicz O (2004a) Finite calculus formulation for incompress-
ible solids using linear triangles and tetrahedra. Int J Num Meth Eng 59(11):14731500
28. Oate E, Idelsohn SR, Del Pin F, Aubry R (2004b) The particle finite element method. An
overview. Int J Comput Meth 1(2):267307
29. Oate E, Celigueta MA (2006a) Modeling bed erosion in free surface flows by the particle
finite element method. Acta Geotech 1(4):237252
30. Oate E, Valls A, Garca J (2006b) FIC/FEM formulation with matrix stabilizing terms for
incompressible flows at low and high Reynolds numbers. Comput Mech 38(45):440455
31. Oate E, Garca J, Idelsohn SR, Del Pin F (2006c) FIC formulations for finite element analysis
of incompressible flows. Eulerian, ALE and Lagrangian approaches. Comput Meth Appl Mech
Eng 195(2324):30013037
32. Oate E, Valls A, Garca J (2007) Computation of turbulent flows using a finite calculus-finite
element formulation. Int J Numer Meth Eng 54:609637
33. Oate E, Idelsohn SR, Celigueta MA, Rossi R (2008) Advances in the particle finite element
method for the analysis of fluid-multibody interaction and bed erosion in free surface flows.
Comput Meth Appl Mech Eng 197(1920):17771800
34. Oate E (2009) Structural analysis with the finite element method. Linear statics, vol 1. Basis
and solids. Springer (CIMNE)
35. Oate E, Rossi R, Idelsohn SR, Butler K (2010) Melting and spread of polymers in fire with
the particle finite element method. Int J Numer Meth Eng 81(8):10461072
36. Oate E, Celigueta MA, Idelsohn SR, Salazar F, Surez B (2011) Possibilities of the particle
finite element method for fluid-soil-structure interaction problems. Comput Mech 48(3):307
318
37. Oate E, Nadukandi P, Idelsohn SR, Garca J, Felippa C (2011) A family of residual-based
stabilized finite element methods for Stokes flows. Int J Num Meth Fluids 65(13):106134
156 E. Oate et al.
38. Oate E, Idelsohn SR, Felippa C (2011) Consistent pressure Laplacian stabilization for incom-
pressible continua via higher-order finite calculus. Int J Numer Meth Eng 87(15):171195
39. Oate E, Nadukandi P, Idelsohn SR (2014) P1/P0+ elements for incompressible flows with
discontinuous material properties. Comput Meth Appl Mech Eng 271:185209
40. Oate E, Carbonell JM (2013) Updated Lagrangian finite element formulation for quasi-
incompressible fluids. Research Report PI393 (CIMNE). Submitted to Comput Mech
41. Oate E, Franci A, Carbonell JM (2013) Lagrangian formulation for finite element analysis of
quasi-incompressible fluids with reduced mass losses. Int J Numer Meth Fluids doi:10.1002/
fld.3870
42. Ryzhakov P, Oate E, Rossi R, Idelsohn SR (2012) Improving mass conservation in simulation
of incompressible flows. Int J Numer Meth Eng 90(12):14351451
43. Tang B, Li JF, Wang TS (2009) Some improvements on free surface simulation by the particle
finite element method. Int J Num Meth Fluids 60(9):10321054
44. Zienkiewicz OC, Taylor RL, Zhu JZ (2005) The finite element method. The basis, 6th edn.
Elsevier, Oxford
45. Zienkiewicz OC, Taylor RL (2005) The finite element method for solid and structural mechan-
ics, 6th edn. Elsevier, Oxford
46. Zienkiewicz OC, Taylor RL, Nithiarasu P (2005) The finite element method for fluid dynamics,
6th edn. Elsevier, Oxford
Numerical Simulation and Visualization
of Material Flow in Friction Stir Welding
via Particle Tracing
Abstract This work deals with the numerical simulation and material flow
visualization of Friction Stir Welding (FSW) processes. The fourth order Runge-
Kutta (RK4) integration method is used for the computation of particle trajectories.
The particle tracing method is used to study the effect of input process parameters
and pin shapes on the weld quality. The results show that the proposed method is
suitable for the optimization of the FSW process.
1 Introduction
Friction Stir Welding (FSW) is a solid-state joining technique introduced by Thomas et al.
[1]. The basic concept of FSW is the following. A shouldered pin rotating at constant
rotational speed is inserted into the joint line between the two plates to be welded. Once the
insertion is completed, the pin is moved along the welding line at constant rotating
and advancing speeds to form the joint.
Ideally, the pin is designed to disrupt the contacting surfaces of the work-piece,
shear the material in front of the tool and move the material behind the tool. The
depth of deformation and the tool travel speed are mainly governed by the pin. The
tool serves two primary functions: heating the work-piece and moving the material to
produce the joint. In general, both the heat and the material transfer depend on the
work-piece material properties, tool geometry, and FSW process parameters.
One of the main issues in the study of FSW is heat generation. During the process,
the material undergoes intense plastic deformation at elevated temperatures. In the
FSW process, welding is achieved by the heat generated by friction and by the
material mixing/stirring process. The generated heat must be enough to allow
the material to flow and to obtain a deep heat affected zone. Insufficient heat leads to
voids because the material is not softened enough to flow properly. It is of practical
importance to understand the material flow characteristics in order to optimize the tool
design and to obtain welds of high structural efficiency. The visualization of the material
flow is very useful to understand its behavior during the weld. This has led to numerous
investigations of material flow behavior during FSW. A method that assesses the quality
of the created weld by visualizing the joint pattern is advantageous, since it can be used
to estimate appropriate process parameters in advance. However, following
the position of the material during the welding process is not an easy task, either
experimentally or numerically.
The experimental visualization of the material flow is difficult and requires metallographic
tools. In an attempt to better understand FSW, many investigators have used experimental
techniques to visualize the material flow and to estimate characteristics
of FSW. Most of the studies done so far are based on experimental tracing of the material
flow. Two different tracer techniques have been used for the visualization of the material
flow. The first is a marker technique, in which a dissimilar marker material is inserted
into the weld line. The second is to weld two dissimilar materials with the FSW process
and then observe the material mixing. The marker materials used were different
Al-composites [2–4], steel balls [5], copper foil [6, 7], plasticine and brass rods [8].
The dissimilar base materials used were different magnesium alloys [9] and aluminum to
copper alloys [10, 11].
Alternatively, numerical methods for the visualization of the material trajectories
have been developed in order to gain insight into the heat affected zone.
Computational methods, including the finite element method, have been used to model
the material flow.
In the literature, there are several approaches to computing the material flow. On one
hand, within a Lagrangian framework, no special technique for the tracking of the
material is necessary as the mesh nodes are material points. In this format, re-meshing
is unavoidable [12, 13]. Meshless methods used within an updated Lagrangian formulation
[14] are an interesting alternative, even if their computational cost is usually
higher than that of the classical finite element method. In this case, re-meshing is avoided
but the material flow is known only at the nodal points. On the other hand, when using
an Eulerian/ALE approach, a specific technique to compute the material trajectories
must be implemented. However, the mesh density used for the FSW simulation is
not related to the definition of the set of particles used for the visualization of the
material flow. Hence, a large number of particles can be used without increasing the
computational effort devoted to the simulation of the process itself. Following this
approach, the ALE formulation together with a splitting method is proposed in [15]
to analyze different phases of the FSW process. In [16] an Eulerian formulation together
with a simple mesh moving technique is used to avoid mesh distortions and to reduce
the computing time due to the ALE technique.
In this work, a numerical particle tracing technique is proposed to study the
extent of material stirring during the FSW process and to study the weld quality.
The outline of the chapter is as follows. First, the particle tracing technique is
described, using an RK4 integration scheme for the computation of particle trajectories.
Afterwards, the proposed method is applied to different examples in order to
study the quality of the final joint for different process parameters and pin shapes.
Finally, some conclusions are drawn.
2 Particle Tracing
In this work, the FSW process is simulated using an apropos kinematic framework
based on the ALE formulation [17–20], and particle tracing is performed to
follow the material movement in the stirring zone by integrating the velocity field [21].
Due to the ALE character of the finite element analysis used, the motion of the
finite element mesh is not necessarily tied to the motion of the material. During
the analysis, a material particle moves through the mesh and at different times it is
located inside different elements. To observe material movement around the pin, it
is necessary to construct and analyze material particle trajectories. This is possible
with the use of a particle tracing method (particles are treated as material points, not
as mesh nodal points).
Particle tracing is a method used to simulate the motion of material points, following
their positions at each time-step of the analysis. This method can be naturally
applied to the study of the material flow in the welding process. In the Lagrangian
framework, as the mesh nodes represent the material points, the trajectories are
the solution of the governing system of equations. When using an Eulerian or ALE
framework, the solution does not directly give information about the material points.
However, the obtained velocity field can be used to gain insight into the extent of
material mixing during the weld.
In this method, a set of points representing the material points (tracers)
is first distributed in the domain and then a Lagrangian Ordinary Differential Equation
(ODE) for the computation of the material displacement must be solved at a post-process
level. Each particle's path is followed in time by integrating the following ODE:
$$\frac{D X(t)}{Dt} = V(X(t), t) \tag{1}$$
160 N. Dialami et al.
$$V(X(t), t) = \sum_{j=1}^{n} v_j(t)\, N_j(X(t)) \tag{3}$$
where the velocity field, $v_j(t)$, is known at each node, $j$, of the finite element mesh
representing the domain at any time, $t$, of the analysis, and $N_j$ is the shape function
associated with node $j$.
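As an illustration of the interpolation in Eq. (3), the following Python sketch evaluates the velocity at a tracer position inside a single 3-noded triangle. The linear shape functions and the names `shape_functions_tri3` and `interpolate_velocity` are assumptions made for this example only; the element type and data structures of the actual simulations are not implied.

```python
def shape_functions_tri3(x, y, nodes):
    """Linear shape functions N_1, N_2, N_3 of a 3-noded triangle at (x, y).

    `nodes` holds the three vertex coordinates (assumed non-degenerate).
    """
    (x1, y1), (x2, y2), (x3, y3) = nodes
    # Twice the signed area of the triangle.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    n2 = ((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)) / det
    n3 = ((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1)) / det
    return (1.0 - n2 - n3, n2, n3)


def interpolate_velocity(x, y, nodes, nodal_velocities):
    """Evaluate V = sum_j v_j N_j(X) from the nodal velocities v_j."""
    shape = shape_functions_tri3(x, y, nodes)
    vx = sum(n * v[0] for n, v in zip(shape, nodal_velocities))
    vy = sum(n * v[1] for n, v in zip(shape, nodal_velocities))
    return vx, vy


# Hypothetical element and nodal velocities, for illustration only.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
nodal_velocities = [(1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
vx, vy = interpolate_velocity(0.25, 0.5, nodes, nodal_velocities)
```

In a full tracer implementation, the element containing the particle would have to be located first, since a particle visits different elements as it moves through the mesh.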
According to the fourth-order accurate RK4 method, the particle position at time-step
n+1 is computed from the advection of the position at time-step n by four weighted
incremental displacements evaluated at intermediate time-steps:
$$X_{n+1} = X_n + \frac{1}{6}\left(\Delta X^{(1)} + 2\,\Delta X^{(2)} + 2\,\Delta X^{(3)} + \Delta X^{(4)}\right) \tag{4}$$
where the incremental displacements are computed as
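$$
\begin{aligned}
\Delta X^{(1)} &= V\left(X_n,\; t_n\right)\Delta t \\
\Delta X^{(2)} &= V\left(X_n + \tfrac{1}{2}\Delta X^{(1)},\; t_n + \tfrac{\Delta t}{2}\right)\Delta t \\
\Delta X^{(3)} &= V\left(X_n + \tfrac{1}{2}\Delta X^{(2)},\; t_n + \tfrac{\Delta t}{2}\right)\Delta t \\
\Delta X^{(4)} &= V\left(X_n + \Delta X^{(3)},\; t_n + \Delta t\right)\Delta t
\end{aligned}
\tag{5}
$$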
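The RK4 update can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the results in this chapter: the function names and the analytical rigid-rotation velocity field are assumptions, and in practice the `velocity` callback would evaluate the finite element interpolation of the computed velocity field.

```python
import math


def rk4_step(velocity, x, t, dt):
    """Advance a 2D tracer position x from time t to t + dt with classical RK4."""

    def increment(point, time):
        # Incremental displacement V(point, time) * dt.
        vx, vy = velocity(point, time)
        return (vx * dt, vy * dt)

    def shifted(point, delta, scale):
        return (point[0] + scale * delta[0], point[1] + scale * delta[1])

    d1 = increment(x, t)
    d2 = increment(shifted(x, d1, 0.5), t + 0.5 * dt)
    d3 = increment(shifted(x, d2, 0.5), t + 0.5 * dt)
    d4 = increment(shifted(x, d3, 1.0), t + dt)
    return (
        x[0] + (d1[0] + 2.0 * d2[0] + 2.0 * d3[0] + d4[0]) / 6.0,
        x[1] + (d1[1] + 2.0 * d2[1] + 2.0 * d3[1] + d4[1]) / 6.0,
    )


def trace_particle(velocity, x0, t0, dt, n_steps):
    """Return the list of tracer positions over n_steps time steps."""
    path = [x0]
    x, t = x0, t0
    for _ in range(n_steps):
        x = rk4_step(velocity, x, t, dt)
        t += dt
        path.append(x)
    return path


def rigid_rotation(point, time, omega=2.0 * math.pi):
    """Assumed test field: rigid rotation about the origin, one turn per unit time."""
    return (-omega * point[1], omega * point[0])


# After one full revolution the tracer should return close to its start point.
path = trace_particle(rigid_rotation, (1.0, 0.0), 0.0, 0.01, 100)
```

Because the tracers are integrated at a post-process level, the number of tracers and their time step can be chosen independently of the mesh used for the simulation.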
3 Examples
The material flow during FSW is complex and the understanding of the deformation
process is limited. It is important to point out that there are many factors that can
influence the material flow during FSW. These factors include tool geometry, welding
parameters, material types, work-piece temperature, etc.
The proposed method is used to investigate the effect of these factors on the
quality of the final weld. The first example studies the effect of the input process
parameters on the joint creation and the second one considers the effect of the pin
shapes on the weld quality.
To study the effect of the input process parameters on the joint quality, a 2D example
is considered. The model is a transversal cut of the pin, with 10 mm diameter,
perpendicular to the rotation axis. The cut represents the mid-section of the real
threaded pin. The contact condition between the pin and the work-piece (AA2195-
T8) is considered to be perfect sticking. Process parameters are the same as in the
experiment: welding speed Vs = 5.0833 mm/s and rotational speed Vr = 500 rpm.
A Sheppard-Wright constitutive model is used [21].
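To give a feeling for the magnitudes involved, the short Python sketch below derives two quantities from the stated process parameters: the tangential speed at the pin surface and the tool advance per revolution. The helper names are hypothetical, and the only inputs are the pin diameter, welding speed and rotational speed given above. Under the perfect sticking assumption, the material in contact with the pin moves at the pin surface speed.

```python
import math

PIN_DIAMETER_MM = 10.0
WELDING_SPEED_MM_S = 5.0833
ROTATIONAL_SPEED_RPM = 500.0


def angular_velocity_rad_s(rpm):
    """Convert revolutions per minute to radians per second."""
    return 2.0 * math.pi * rpm / 60.0


def pin_surface_speed_mm_s(rpm, diameter_mm):
    """Tangential speed of a point on the pin surface."""
    return angular_velocity_rad_s(rpm) * diameter_mm / 2.0


def advance_per_revolution_mm(welding_speed_mm_s, rpm):
    """Distance travelled by the tool along the weld line in one revolution."""
    return welding_speed_mm_s * 60.0 / rpm


surface_speed = pin_surface_speed_mm_s(ROTATIONAL_SPEED_RPM, PIN_DIAMETER_MM)
advance = advance_per_revolution_mm(WELDING_SPEED_MM_S, ROTATIONAL_SPEED_RPM)
speed_ratio = surface_speed / WELDING_SPEED_MM_S

print(f"pin surface speed: {surface_speed:.1f} mm/s")
print(f"advance per revolution: {advance:.2f} mm")
print(f"surface speed / welding speed: {speed_ratio:.1f}")
```

With these values the pin surface moves roughly fifty times faster than the tool advances, and the tool travels about 0.61 mm per revolution.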
The second example investigates the effect of different pin shapes on the joint creation
using the proposed particle tracing method. Four pin shapes are considered, as shown
in Fig. 2: (a) triflute; (b) trivex; (c) circular; (d) triangular.
The pins are generated from an originally circular section of 10 mm diameter
Fig. 1 Creation of the weld joint with different input parameters a Vs = 5.0833 mm/s; Vr = 0 rpm
b Vs = 0.50833 mm/s; Vr = 500 rpm c Vs = 5.0833 mm/s; Vr = 500 rpm
(Fig. 2). The trivex pin design is approximately triangular; the three points of the pin
form an equilateral triangle and are connected by convex sides. The triflute pin shape
is obtained from an original circular section by removing three circular segments.
especially triangular pins; the circular and triflute pins present similar streamlines
when the stick condition is assumed. Note that the triflute shows
significantly more material being captured and taken around the tool more than once,
whereas the trivex struggles to fill the space behind the tool on the advancing side.
This is consistent with the generation of a void in the wake of a trivex tool.
The triflute pin has a high swept rate due to the segments, and a tool design with a
higher swept rate reduces the voids.
It can be observed that the generated heat is greater for the triflute and the circular
pins than for the trivex and the triangular pins because, in the stick condition, more
material moves together with these pins.
The pressure contour field is illustrated in Fig. 4. Pins with sharper corners
(triangular and trivex) show a higher maximum pressure than those with
convex sides (circular and triflute). However, the maximum pressure is of
the same order in all cases.
In the next step, the effect of the slip condition on the same problem is studied. In
this case, the pin rotational velocity has less effect on the work-piece than in the
stick case. The triflute and the circular pins lead to considerably different streamlines
(Fig. 5). The streamlines for a circular pin show a flow passing around an obstacle,
while for a triflute pin they show the material trapped in the segments of the pin
moving with it. In the slip case, the joint is not qualified, even though with the triflute
pin a joint is created due to the effect of the segments. In the stick case, the joint
Fig. 5 Streamlines and created joints in both the stick and slip cases for circular and triflute pins
Fig. 6 Velocity contour field in the slip case for circular and triflute pins
is created following the ring patterns generally observed in the FSW process. The
effect of the segments can also be seen in Fig. 6 for the slip case, as the material
close to the pin is affected by the pin velocity.
Figure 7 shows that for both the slip and stick cases, joints are created using
triangular and trivex pins. However, in the stick case, the joint is not qualified and
Fig. 7 Streamlines and created joints in both the stick and slip cases for triangular and trivex pins
Fig. 8 Streamlines and velocity contour field in the slip case for triangular and trivex pins
does not follow the usual pattern of FSW due to the void creation. When the slip
condition is assumed, the voids are not created. Material around the pin does not
share the same velocity as the pin, but it moves due to the non-circular shape (Fig. 8)
and a qualified joint is created.
It can be concluded that different types of pin shapes can be selected for different
conditions of the weld. In the stick case, pins without sharp corners create qualified
joints, while pins with sharp corners can be used in slip cases.
Material deformed by the friction stir tool must be capable of filling the void
produced by a traversing pin. If the tool design is incorrect, the deformed material
will cool before the material can fully fill the region directly behind the tool.
The presented results are preliminary, but the proposed method could clearly be
of great benefit in reducing experimental trials if near optimal welding conditions
could be predicted directly from knowledge of the material joint behavior.
4 Conclusion
The work deals with the simulation and visualization of material flow. The simulation
of the transient phase is important for understanding the material behavior. The model
can provide this insight by computing the particles' thermo-mechanical history. If
the process is defined in an ALE/Eulerian setting, an additional method must be
introduced in order to find the particles' history. The particle tracing method for the
material stirring during and after welding is applied to the material flow visualization
of the FSW process. The RK4 integration method is used for the computation of
particle trajectories.
From the ring-shaped flow pattern left after the welding, it is found that the ratio
between the rotational and the advancing speed is one of the key factors for the
creation of a qualified joint. The effect of pin shapes on the weld quality is studied. It is
found that in the stick case, pins with sharp corners (triangular and trivex) generate
voids, while this problem does not appear in the slip case. Moreover, the effect of the
segments of a triflute pin on the weld quality is studied, showing that the material
trapped in the segments moves with the pin in both stick and slip cases.
Acknowledgments This work was supported by the European Research Council under the Advanced
Grant: ERC-2009-AdG Real Time Computational Mechanics Techniques for Multi-Fluid
Problems. The authors are also thankful for the financial support of the Spanish Ministerio de
Educación y Ciencia (PROFIT programme) within the project CIT-0204002007-82.
References
1. Thomas WM, Nicholas ED, Needham JC, Murch MG, Temple-Smith P, Dawes CJ (1991)
Friction-stir butt welding. GB Patent No. 9125978.8, International Patent No. PCT/GB92/02203
2. London B, Mahoney M, Bingel B, Calabrese R, Waldron D (2001) Experimental methods for
determining material flow in friction stir welds. The third international symposium on friction
stir welding, Kobe, Japan, 27–28 Sept 2001
3. Reynolds AP (2008) Flow visualization and simulation in FSW. Scripta Materialia 58:338–342
4. Seidel TU, Reynolds AP (2001) Visualization of the material flow in AA2195 friction stir welds
using a marker insert technique. Metall Mater Trans A 32:2879–2884
5. Colligan K (1999) Material flow behaviour during friction stir welding of aluminium. Weld J
78:229–237
6. Guerra M, Schmids C, McClure JC, Murr LE, Nunes AC (2003) Flow patterns during friction
stir welding. Mater Charact 49:95–101
7. Dickerson T, Shercliff HR, Schmidt H (2003) A weld marker technique for flow visualization
in friction stir welding. 4th international symposium on friction stir welding, Park City, Utah,
USA, 14–16 May 2003
8. Kallgren T, Jin L-Z, Sandstrom R (2008) Material flow during friction stir welding of copper.
7th international friction stir welding symposium, Awaji Island, Japan, 20–22 May
9. Johnson R, Threadgill P (2003) Friction stir welding of magnesium alloys. In: Kaplan HI
(ed) Magnesium technology 2003 (TMS-The Minerals, Metals & Materials Society, 2003),
pp 147–152
10. Ouyang J, Yarrapareddy E, Kovacevic R (2006) Microstructural evolution in the friction
stir welded 6061 aluminum alloy (T6-temper condition) to copper. J Mater Proc Technol
172:110–122
11. Abdollah-Zadeh A, Saeid T, Sazgari B (2008) Microstructural and mechanical properties of
friction stir welded aluminum/copper lap joints. J Alloy Compd 460:535–538
12. Buffa G, Fratini L, Micari F, Shivpuri R (2008) Material flow in FSW of T-joints: experimental
and numerical analysis. Int J Mater Form 1(1):1283–1286
13. Buffa G, Ducato A, Fratini L (2011) Numerical procedure for residual stresses prediction in
friction stir welding. Finite Elem Anal Des 47(4):470–476
14. Alfaro I, Racineux G, Poitou A, Cueto E, Chinesta F (2009) Numerical simulation of friction
stir welding by natural elements method. Int J Mater Form 2(4):225–234
15. Guerdoux S, Fourment L (2009) A 3D numerical simulation of different phases of friction stir
welding. Model Simul Mater Sci Eng 17:075001
16. Feulvarch E, Roux J-C, Bergheau J-M (2013) A simple and robust moving mesh technique for
the finite element simulation of friction stir welding. J Comput Appl Math 246:269–277
17. Chiumenti M, Cervera M, Dialami N (2013) Numerical modeling of friction stir welding
processes. Comput Methods Appl Mech Eng 254:353–369
18. Dialami N, Chiumenti M, Cervera M (2013) An apropos kinematic framework for the numerical
modelling of friction stir welding. Comput Struct 117:48–57
19. Agelet de Saracibar C, Chiumenti M, Cervera M, Dialami N, Seret A (2014) Computational
modeling and sub-grid scale stabilization of incompressibility and convection in the numerical
simulation of friction stir welding processes. Archives of Computational Methods in Engineering
21(1):3–37. doi:10.1007/s11831-014-9094-z
20. Bussetta P, Dialami N, Boman R, Chiumenti M, Agelet de Saracibar C, Cervera M, Ponthot JP
(2013) Comparison of a fluid and a solid approach for the numerical simulation of friction stir
welding with a non-cylindrical pin. Steel Research International. doi:10.1002/srin.201300182
21. Dialami N, Chiumenti M, Cervera M, Agelet de Saracibar C, Ponthot JP (2013) Material flow
visualization in friction stir welding via particle tracing. Int J Mater Form.
doi:10.1007/s12289-013-1157-4
Some Considerations on Surface Condition
of Solid in Computational Fluid-Structure
Interaction
1 Introduction
Fluid-Structure Interaction (FSI) is one of the most popular topics in computational
mechanics. It covers a wide range of phenomena in social, scientific and
engineering fields such as vehicles, medicine, civil engineering and construction,
agriculture, forestry, disaster prevention, music, sports, etc. (see Fig. 1).
Related research topics in fluid dynamics and FSI include, among others, vortices,
vibration of structures, sloshing, droplets, splashes and bubbles. For example, the vibration
M. Yokoyama
School of Information Science, Meisei University, Hino, Japan
K. Murotani
School of Engineering, University of Tokyo, Bunkyo, Japan
G. Yagawa (B)
Center for Computational Mechanics Research, Toyo University, Bunkyo, Japan
e-mail: [email protected]
O. Mochizuki
Faculty of Science and Engineering, Toyo University, Bunkyo, Japan
Keywords Bubble · Turbulence · Sloshing · Drag · Noise
of a structure caused by the Kármán vortex street has been studied for many years;
it is a lock-in phenomenon caused by the vortex street behind a circular cylinder
[1]. Vortices and flow separation occur when a solid or a structure moves in a fluid,
which often results in destruction, noise, the stall of an airplane or the drag
of a vehicle, and various studies have been performed: the effect of surface
unevenness such as a turbulator, a vortex generator, a tripping wire, the riblets of a wing
or the dimples of a golf ball [2], the drag reduction of ships by micro-bubbles [3], the
effect of the deformation of an elastic body [4], the relation between vortices and
vibration [5], etc. FSI studies have also contributed to sports engineering, for example
by improving the movement form of swimmers [6, 7]. Regarding the sound of musical
instruments, the vibration of a musical instrument and the surrounding air, and the
sound production mechanism of the air reed of a pipe organ or flute, have been studied [8–10].
When we discuss the interaction between a solid and a fluid, the condition of the
interface between them is controversial. It is well known that the surface condition,
i.e. the roughness or the uneven shape of the solid surface, influences
the flow field and the movement of the solid, as seen in the case of the dimples of
a golf ball [11].
Free surface flow, such as the splash and drops induced by the movement of a solid
object, is also an interesting topic in FSI. For example, even if the surfaces of a
hydro-gel ball and an acrylic resin ball both look smooth, the splash
form created by the hydro-gel ball differs from that of the acrylic resin ball, and the
velocity distribution of the water around each ball is different as well.
In the field of biomechanics, there are some interesting papers besides the well-known
studies of flapping wings or swimming fish. Yabe et al. [12] developed an algorithm
for calculating the surface tension and contact angle in the motion of water striders,
finding two types of movement through experiment and simulation. On the other
hand, a finite volume simulation of underwater propulsion by
a flagellum acting as a micro-propulsion device was performed, where the effect of the flagellum
Fig. 3 Different crown types occur according to the Weber number (We = ρDV²/σ; ρ: density,
D: diameter, V: velocity, σ: surface tension) [18]. a Regular axisymmetric crown. b Regular
crown with spikes. c Irregular crown
living in water such as fish and amphibians have a slimy mucus skin, whose principal
ingredient is a hydrogel known as mucin [23]. Furthermore, since the inner wall of
the digestive organs or the blood vessel has a slippery surface, it seems important
to take the characteristics of such slippery surface into consideration in numerical
simulation.
In this paper, we focus on the treatment of the surface condition of an object in the
FSI problem. The experimental observation of a splash is given as a suitable example
in Sect. 2. We outline the numerical solution of the Navier-Stokes
equations by the particle method in Sect. 3. In Sect. 4, a model that introduces
the slip of the object's surface into the particle method is proposed, and numerical
simulations of the splash under different surface conditions are carried out and compared
with the experimental results. We show the results of splash simulations obtained by
large-scale parallel calculation in Sect. 5. Concluding remarks and future work are given
in Sect. 6.
176 M. Yokoyama et al.
Fig. 6 Comparison of splash patterns between a hydrogel (agar) and b acrylic resin (radius of
sphere = 10 mm and impact velocity = 2.21 m/s)
experimental observation suggests that the numerical simulation should take into
consideration the various surface conditions as the interaction between the object
and the water.
Although several studies have paid attention to the roughness of the surface or the
liquid exfoliation in the FSI simulation, the difference in the splash due to the material
cannot be simulated with the conventional method. In other words, the simulated
splash pattern of the hydrogel object and that of the acrylic resin object turn out to
be the same. The reason can be attributed to the fact that the above
simulations have assumed the boundary between the fluid and the solid to be of the
no-slip type. In this paper, we propose a calculation method that reproduces the
difference in the splash due to the difference in the solid material.
The method employed in the present paper is the MPS method, a particle
method recognized as an effective technique for FSI simulation. The method is
semi-implicit: after the temporary positions of the particles are calculated from the
equation of motion in the explicit stage, the Poisson equation for the pressure is
solved so that the number of particles per small volume remains constant, which
enforces mass conservation. We discuss here how to introduce the effect of slip into
the MPS method in order to solve the flow field around the hydrogel surface.
where $\mathbf{u}$ is the velocity vector of the fluid, $\rho$ is the density of the fluid, $P$ is the pressure, $\nu$ is
the kinematic viscosity of the fluid and $\mathbf{F}$ is the external force. Consider two particles
$i$ and $j$ with pressures $p_i$ and $p_j$, respectively. The gradient of pressure
at the point $i$ is written as [25]
$$\langle \nabla P \rangle_i = \frac{d}{n^0} \sum_{j \neq i} \frac{p_j - p_i}{|\mathbf{r}_j - \mathbf{r}_i|^2} (\mathbf{r}_j - \mathbf{r}_i)\, w(|\mathbf{r}_j - \mathbf{r}_i|) \qquad (2)$$
where $d$ is a constant equal to the number of space dimensions of the analysis, $w$ is the
weight function and $n^0$ is called the particle number density.
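The Laplacian of velocity at the point $i$ is written as
$$\langle \nabla^2 \mathbf{u} \rangle_i = \frac{2d}{\lambda n^0} \sum_{j \neq i} (\mathbf{u}_j - \mathbf{u}_i)\, w(|\mathbf{r}_j - \mathbf{r}_i|) \qquad (3)$$
Here, $r$ is the distance between two particles and $r_e$ is the cut-off radius.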
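As an illustration only, the two particle operators (2) and (3) can be sketched with NumPy as follows. The kernel `weight` (the commonly used MPS form $w(r) = r_e/r - 1$ for $r < r_e$), the helper `reference_constants` and all function names are assumptions made for this sketch and are not taken from the chapter; the all-pairs evaluation is chosen for brevity, not efficiency.

```python
import numpy as np

def weight(r, re):
    """Assumed MPS kernel: w(r) = re/r - 1 for 0 < r < re, and 0 otherwise."""
    w = np.zeros_like(r)
    mask = (r > 0.0) & (r < re)
    w[mask] = re / r[mask] - 1.0
    return w

def reference_constants(pos, i, re):
    """Hypothetical helper: n0 and lambda evaluated at a fully surrounded particle i."""
    r = np.linalg.norm(pos - pos[i], axis=1)
    w = weight(r, re)
    n0 = w.sum()                        # particle number density at particle i
    lam = (w * r**2).sum() / n0         # weighted mean of squared distances
    return n0, lam

def mps_operators(pos, p, u, re, n0, lam):
    """Pressure gradient (Eq. 2) and velocity Laplacian (Eq. 3) at every particle."""
    d = pos.shape[1]
    rij = pos[None, :, :] - pos[:, None, :]       # rij[i, j] = r_j - r_i
    dist = np.linalg.norm(rij, axis=2)
    w = weight(dist, re)                          # vanishes for j = i
    dist2 = np.where(dist > 0.0, dist**2, 1.0)    # avoid 0/0 on the diagonal
    dp = p[None, :] - p[:, None]                  # p_j - p_i
    grad_p = (d / n0) * np.sum((dp / dist2 * w)[:, :, None] * rij, axis=1)
    du = u[None, :, :] - u[:, None, :]            # u_j - u_i
    lap_u = (2.0 * d / (lam * n0)) * np.sum(w[:, :, None] * du, axis=1)
    return grad_p, lap_u
```

On a regular lattice, an interior particle evaluated this way recovers the gradient of a linear pressure field and the Laplacian of a quadratic velocity field, which is a convenient sanity check before adding free-surface and wall treatment.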
The algorithm of the MPS method takes the following steps (i) to (iv): (i) a tentative
velocity $\mathbf{u}^*$ is calculated in the explicit stage using $\mathbf{F}$ and the viscous term of the
Navier-Stokes equation (1); (ii) the Poisson equation for the pressure is solved in the
implicit stage to obtain the pressure $p$, and the velocity correction $\mathbf{u}'$ is obtained
from this $p$ so that the particle number density is conserved; (iii) $\mathbf{u}'$
is added to $\mathbf{u}^*$ to obtain the target velocity $\mathbf{u}$ at $t + dt$; (iv) the time is advanced,
$t = t + dt$, where the time step $dt$ is 0.001 s in the present paper.
The hydrogel, a kind of polymer gel, is considered here, where the slip ratio
is determined by the moisture content of the hydrogel. We discuss here how we
heuristically introduce the influence of the slip at the hydrogel wall into the calculation.
For the diving sphere we use agar as the hydrophilic material, which is a kind of
hydrogel like gelatin and is known to be easy to control in water content and to form into an
arbitrary shape. Agar consists of a crosslinked structure of a polymer called agarose
with many water molecules between the polymer chains, which is known to create a
slippery surface. For example, Eddington et al. [26] reported the use of the hydrogel
as a valve for flow control in a microchannel, and Beebe et al. [27] discussed the
effectiveness of hydrogel structures for flow control in microfluidic channels.
Figure 7 shows the velocity distributions of the water flow near the surface of the
acrylic resin and that of the agar gel, where $y$ is the height in the water flow and $u$ is
the velocity of water. The slip ratio $\alpha$ is defined as the ratio of the wall shear stress
under the slip condition to that under the no-slip condition.
Here, the shear stress $\tau$ is obtained experimentally from the flow velocity near the wall,
with $\mu$ the viscosity of water:
$$\tau = \mu \left. \frac{du}{dy} \right|_{y=0} \qquad (10)$$
Figure 8 shows the experimental relations between the swelling ratio $S$ and the slip
ratio $\alpha$ for the agar gel and the carrageenan gel, respectively, where the swelling
ratio $S$ is defined as follows [28],
where $m_{\mathrm{water}}$ is the mass of water and $m_{\mathrm{gel}}$ that of the solid gel. $S$ increases with the
amount of water contained in the solid gel. The agar employed in this study is a kind
of hydrogel [29] whose shape and degree of swelling are easy to handle and
control [30]. Figure 8 suggests that $\alpha$ decreases with the increase of $S$, which can
be expressed as
$$\alpha = 1 - \beta S \qquad (12)$$
where $\beta$ is estimated to be $1.2 \times 10^{-3}$ in the case of the agar. In summary, a
larger $S$ gives more slip on the surface.
In this paper, the above relation between the increase of the swelling ratio and the
reduction of the wall friction, obtained from our experiment, is taken into consideration
near the wall in the viscous term of the Navier-Stokes equation in a heuristic manner.
Since the shear force acting between the wall and the fluid is directly related to
the viscous term of Eq. (3), we modify the term as
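$$\langle \nabla^2 \mathbf{u} \rangle_i = \frac{2d}{\lambda n^0} \sum_{j \neq i} (\mathbf{u}_j - \mathbf{u}_i)\, H(|\mathbf{r}_j - \mathbf{r}_i|) \qquad (13)$$
with
$$H(r) = \alpha\, w(r) \qquad (14)$$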
Fig. 7 Comparison of flow profiles $u(y)$ in water near the no-slip wall (acrylic resin) and the slippery wall with slip (hydrogel)
where index $i$ denotes a water particle near the hydrogel wall and $j$ a surface particle
of the hydrogel wall. Namely, $\alpha$ is applied only near the hydrogel wall, because
the effect of slip acts only in this region. The effective length of the above reduced
weight function near the wall is assumed to be $r_e$ in this study, and is set to $2.1 l_0$,
where $l_0$ is the initial distance between the particles.
Summarizing the above procedure: (i) select $S$ according to the hydrophilicity of
the solid object, (ii) estimate $\alpha$ using Eq. (12), and (iii) apply $\alpha$ to the weight function
of the viscous term of the Navier-Stokes equation for the calculation of the shear force
near the hydrogel wall, using Eqs. (13) and (14).
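The three steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the symbols $\alpha$ and $\beta$ for the slip ratio and the fitted constant, the kernel `weight` (the commonly used MPS form), the boolean mask `near_gel_wall` and all function names are introduced here for illustration only; the value 1.2e-3 is the one quoted above for agar.

```python
import numpy as np

BETA_AGAR = 1.2e-3  # fitted constant quoted in the text for agar

def slip_ratio(S, beta=BETA_AGAR):
    """Step (ii), Eq. (12): slip ratio from the swelling ratio S."""
    return 1.0 - beta * S

def weight(r, re):
    """Assumed MPS kernel: w(r) = re/r - 1 for 0 < r < re, and 0 otherwise."""
    r = np.asarray(r, dtype=float)
    w = np.zeros_like(r)
    mask = (r > 0.0) & (r < re)
    w[mask] = re / r[mask] - 1.0
    return w

def viscous_weight(r, re, near_gel_wall, alpha):
    """Step (iii), Eq. (14): H = alpha * w for water/gel-wall particle pairs.

    near_gel_wall is a hypothetical boolean mask marking pairs (i, j) in which
    i is a water particle and j a surface particle of the hydrogel wall;
    all other pairs keep the unmodified weight w.
    """
    w = weight(r, re)
    return np.where(near_gel_wall, alpha * w, w)

# Step (i): choose S for the material, then evaluate the slip ratio.
alpha_50 = slip_ratio(50.0)   # close to the value 0.94 quoted for S = 50
```

Replacing `w` by `viscous_weight(...)` in the Laplacian sum is then the only change needed in the viscous step; the pressure gradient is left untouched.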
Next, we show the 2D splash simulation employing the above method. The effect
of the slip ratio $\alpha$ on the flow around a hydrogel sphere can be accounted for with
Eq. (11). The comparison of the simulation result with $S = 100$ and the experimental
one is shown in Fig. 9, where the radius of the sphere $R$ is 10 mm and the initial height $h$
is $50R$ in both the simulation and the experiment. The water tank has a width of $20R$
and a depth of $20R$, for which we confirmed that the effect of the wall is negligible.
Assuming that the sphere touches the water surface at $t = 0$, the left figures in
Fig. 9 are snapshots of the splashes at $t = 0.02$.
The first splash, which is created just after the sphere touches the water surface,
is the so-called primary splash. It is seen that the sphere also creates a cavity. The
pattern of the above crown-type splash and the air cavity obtained by the present
simulation is similar to the experimental result. The crown-type splash and
the air cavity do not occur in the case of the acrylic resin sphere.
Fig. 10 Comparison between simulation results of representative path lines of water particles of
primary splash for hydrogel spheres of different values of swelling parameter S
Fig. 11 Analysis domain for 3D splash simulation (left figure) and arrangement of particles of
hydro-gel sphere and water viewed from the top (right figure)
Figure 10 shows the crown-type splashes and the representative path trajectories
of particles for the different swelling ratios. The solid lines are the path
trajectories of particles when $S$ is 50 or $\alpha = 0.94$, whereas the dotted lines are those
when $S$ is 350 or $\alpha = 0.7$. It is seen from the figure that the splashes spread more widely
with a larger value of $S$, that is, the velocity of the water near the wall increases with
the swelling ratio, which causes earlier exfoliation and creates a wider primary
splash.
Fig. 12 3D splash (left figure) at t = 0.05 and domain decomposition (right figure)
Fig. 13 Comparison of splash patterns with the different initial distances of water particles l0
(S = 100, re = 4.1 and t = 0.03 s)
1. A bounding box of a whole region is defined, and is filled with buckets. Since an
influence radius of a particle is defined in the MPS method, the size of a bucket
is set to be wider than the influence radius.
2. All the particles are embedded in the buckets.
3. The bucket-based domain decomposition is performed with an equal number of
particles at each subdomain by ParMETIS [32].
4. If an imbalance in the number of particles among regions appears, the domain
decomposition is performed again in order to recover the balance of the number
of particles.
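Steps 1 and 2 of the list above can be sketched as follows. This is only an illustrative, serial sketch: the function names, the dictionary-based bucket storage and the choice of a bucket edge equal to the influence radius are assumptions of the example, and the graph partitioning of step 3 (done with ParMETIS in the text) is not reproduced.

```python
import numpy as np
from collections import defaultdict

def build_buckets(pos, re):
    """Steps 1-2: cover the bounding box with buckets of edge re and bin the particles."""
    origin = pos.min(axis=0)                        # lower corner of the bounding box
    cells = np.floor((pos - origin) / re).astype(int)
    buckets = defaultdict(list)
    for idx, key in enumerate(map(tuple, cells)):
        buckets[key].append(idx)                    # particle idx lives in bucket key
    return origin, buckets

def neighbours(i, pos, re, origin, buckets):
    """Particles within the influence radius of particle i, found via adjacent buckets only."""
    ci = np.floor((pos[i] - origin) / re).astype(int)
    found = []
    for offset in np.ndindex(*([3] * pos.shape[1])):      # 3^d surrounding buckets
        key = tuple(ci + np.array(offset) - 1)
        for j in buckets.get(key, []):
            if j != i and np.linalg.norm(pos[j] - pos[i]) < re:
                found.append(j)
    return sorted(found)
```

Because every interaction partner of a particle lies in its own bucket or an adjacent one, the number of particles per bucket is a natural load measure for the partitioner, and rebalancing as in step 4 only requires re-binning the particles.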
The parallel computer used here is the FX10 in the Information Technology Center
of the University of Tokyo. The processor of the above FX10 is the SPARC64 IXfx,
where a processor node has 16 cores of 1.848 GHz and 32 GB memory. In this
research, the OpenMP is used for parallelization in each node and the MPI is used
for parallelization among nodes.
Figure 11 shows the simulation setup of the crown-type splash in the case of the
hydrogel sphere.
The initial locations of the particles are arranged concentrically, and the water tank
is a circular cylinder. Let the diameter of a particle and $\alpha$ be 0.0005 m and 0.4,
respectively. Figure 12 is the result of the splash analysis at $t = 0.05$ s using 53 million
particles. This analysis took about 12 h using 240 nodes of the FX10. The velocity
of each particle is shown with color gradation on the left-hand side of Fig. 12. The
right-hand side of Fig. 12 shows the time sequence of the domain decomposition, in which
the nodes are distinguished by different colors.
Figure 13 shows the simulation results of our 3D calculation, which are in good
agreement with the experimental result shown in Fig. 9: the crown-type
splash, the air cavity and the scattering of droplets are expressed well. It is observed
that the particle diameter, given by the initial particle distance $l_0$,
influences the splash pattern; namely, as $l_0$ becomes smaller and the total number of
particles becomes larger, the splash pattern is expressed more clearly. The effect of the
slip ratio on the splash pattern in the 3D simulation is still under analysis, though we
are able to see the difference in the width and height of the splash pattern. In order to
simulate the fingers or the spikes in the crown-type splash and their relation with the
Weber number, still larger scale computing is needed.
6 Conclusion
1. Focusing on the treatments of the interface between the solid and the fluid,
we propose a calculation method with the slip effect on the surface of a slimy
material.
2. Experimental results of water splashes taken by a high-speed camera show that
the splash pattern caused by an acrylic resin sphere is different from that caused
by a hydro-gel sphere.
3. An engineering model to express the slimy surface of creatures living
in water, such as fish or frogs, is proposed, in which the slip ratio $\alpha$,
the reduction ratio of the shear stress near a solid wall obtained through the
experiment, is introduced in the shear term of the Navier-Stokes equation.
4. The splash pattern calculated by the proposed method is in good agreement with
the experimental result.
5. The above method for calculating the splash is applied to large-scale parallel
computing in 3D, which depicts more detailed splash patterns. As future
work, larger scale computing and a model of the surface tension are needed
to observe the fingers or the spikes seen in the case of the milk crown.
Acknowledgments This research was supported by the MEXT Program for the Strategic Research
Foundation at Private Universities, 2012–2017 and the WCU (World Class University) Program
through the Korea Science and Engineering Foundation funded by the Korean Ministry of Education,
Science and Technology (R33-2008-000-10027-0).
References
1. Billah KY, Scanlan RH (1991) Resonance, Tacoma Narrows bridge failure, and undergraduate physics textbooks. Am J Phys 59(2):118–124
2. Davies JM (1949) The aerodynamics of golf balls. J Appl Phys 20(9):821–828
3. Kodama Y, Kakugawa A, Takahashi T, Kawashima H (2000) Experimental study on microbubbles and their applicability to ships for skin friction reduction. Int J Heat Fluid Flow 21(5):582–588
4. Étienne S, Pelletier D (2005) A general approach to sensitivity analysis of fluid-structure interactions. J Fluids Struct 21(2):169–186
5. He T, Zhou D, Bao Y (2012) Combined interface boundary condition method for fluid-rigid body interaction. Comput Methods Appl Mech Eng 223:81–102
6. Pendergast DR, Mollendorf JC, Cuviello R, Termin AC (2006) Application of theoretical principles to swimsuit drag reduction. Sports Eng 9(2):65–76
7. Moria H, Chowdhury H, Alam F, Subic A, Smits AJ, Jassim R, Bajaba NS (2010) Contribution of swimsuits to swimmers' performance. Procedia Eng 2(2):2505–2510
8. Fletcher NH (1976) Sound production by organ flue pipes. J Acoust Soc Am 60:926
9. Coltman JW (1968) Sounding mechanism of the flute and organ pipe. J Acoust Soc Am 44:983
10. Tsuchida J, Fujisawa T, Yagawa G (2006) Direct numerical simulation of aerodynamic sounds by a compressible CFD scheme with node-by-node finite elements. Comput Methods Appl Mech Eng 195(13):1896–1910
11. Maruyama T (1999) Surface and inlet boundary conditions for the simulation of turbulent boundary layer over complex rough surfaces. J Wind Eng Ind Aerodyn 81(1):311–322
12. Yabe T, Chinda K, Hiraishi T (2007) Computation of surface tension and contact angle and its application to water strider. Comput Fluids 36(1):184–190
13. Kobayashi S, Watanabe R, Oiwa T, Morikawa H (2009) Computational study of micropropulsion mechanism in water modeled on flagellum with projecting mastigonemes. J Biomech Sci Eng 4(1):11–22
14. Nomura K, Koshizuka S, Oka Y, Obata H (2001) Numerical analysis of droplet breakup behavior using particle method. J Nucl Sci Technol 38(12):1057–1064
15. Caboussat A (2006) A numerical method for the simulation of free surface flows with surface tension. Comput Fluids 35(10):1205–1216
16. Liu J, Koshizuka S, Oka Y (2005) A hybrid particle-mesh method for viscous, incompressible, multiphase flows. J Comput Phys 202(1):65–93
17. Worthington AM (1882) On impact with a liquid surface. Proc R Soc Lond 34(220–223):217–230
18. Krechetnikov R, Homsy GM (2009) Crown-forming instability phenomena in the drop splash problem. J Colloid Interface Sci 331(2):555–559
19. Akers B, Belmonte A (2006) Impact dynamics of a solid sphere falling into a viscoelastic micellar fluid. J Nonnewton Fluid Mech 135(2):97–108
20. Duez C, Ybert C, Clanet C, Bocquet L (2007) Making a splash with water repellency. Nat Phys 3(3):180–183
21. Yoon SS, Jepsen RA, Nissen MR, O'Hern TJ (2007) Experimental investigation on splashing and nonlinear fingerlike instability of large water drops. J Fluids Struct 23(1):101–115
22. Idelsohn SR, Oñate E, Del Pin F (2003) A Lagrangian meshless finite element method applied to fluid-structure interaction problems. Comput Struct 81(8):655–671
23. Ling SC, Ling TYJ (1974) Anomalous drag-reducing phenomenon at a water/fish-mucus or polymer interface. J Fluid Mech 65(03):499–512
24. Kubota Y, Mochizuki O (2009) Splash formation by a spherical object plunging into water. J Vis 12:339–345
25. Koshizuka S (1995) A particle method for incompressible viscous flow with fluid fragmentation. Comput Fluid Dynamics J 4:29–46
26. Eddington DT, Beebe DJ (2004) Flow control with hydrogels. Adv Drug Deliv Rev 56(2):199–210
27. Beebe DJ, Moore JS, Bauer JM, Yu Q, Liu RH, Devadoss C, Jo BH (2000) Functional hydrogel structures for autonomous flow control inside microfluidic channels. Nature 404(6778):588–590
28. Alupei IC, Popa M, Hamcerencu M, Abadie MJM (2002) Superabsorbant hydrogels based on xanthan and poly (vinyl alcohol): 1. the study of the swelling properties. Eur Polymer J 38(11):2313–2320
29. Narayanan J, Xiong JY, Liu XY (2006) Determination of agarose gel pore size: absorbance measurements vis a vis other techniques. J Phys: Conf Ser 28(1):83 (IOP Publishing)
30. Kikuchi K, Mochizuki O (2010) A flow on a hydrogel surface mimicked a living cell. In: Proceedings of the 21st international symposium on transport phenomena in Kaohsiung city, Taiwan
31. Yagawa G, Shioya R (1994) Parallel finite elements on a massively parallel computer with domain decomposition, 4. Comput Syst Eng 4:495–503
32. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392
Part IV
Reduced-Order Models
Reduced-Order Modelling Strategies
for the Finite Element Approximation
of the Incompressible Navier-Stokes Equations
1 Introduction
Reduced-order models (ROM) are nowadays receiving a lot of interest from the
computational mechanics community. Their most attractive feature is the capability
of reproducing the response of complex physical phenomena through the solution of
systems of equations which involve only very few degrees of freedom.
[4, 8, 15, 22, 31, 35–38]. These approaches are known as hyper-reduced models. In
these methods, the non-linear and parameter-dependent terms are recovered by means
of a least-squares procedure from a series of sampling points where the function
to be approximated is computed. This effectively reduces the amount of
computation required to build the reduced-order system, and results in a reduced-order
model whose computational cost is directly proportional to its number of
degrees of freedom.
We have been working on a hyper-reduced approach for the incompressible
Navier-Stokes equations. The particularity of the strategy we propose is that the equations
for the reduced-order model are treated in an explicit way. This allows all the
non-linear terms to be sent to the right-hand side of the reduced-order system, leaving on the
left-hand side only the mass matrix due to the temporal derivatives. The main advantage
is, of course, that the mass matrix is linear, and the hyper-reduced approaches need
only be applied to the right-hand-side vector. This effectively reduces the overall cost
of the reduced-order model.
Let us start by introducing some notation for the POD approximation of a general
problem. Let $U \in \mathbb{R}^M$ be the global unknown vector associated with a non-linear
variational problem. Suppose that, after linearizing and fully discretizing the given
problem in time and space, the following matrix form is obtained, which yields
the vector of nodal unknowns $U$ at a given iteration of the non-linear procedure, for
a certain time step:
$$A U = F, \qquad (1)$$
$$U \approx \Phi \alpha, \qquad (2)$$
where $\Phi \in \mathbb{R}^{M \times N}$ is the basis for $U$ and $N$ is the dimension of the reduced-order
model, with $N < M$. $\alpha \in \mathbb{R}^N$ are the components of $U$ expressed in the reference
system defined by $\Phi$. The reduced-order basis is obtained by means of the POD
method [14, 23, 26], that is, by computing the singular value decomposition of a set
of solution snapshots, which in our case are taken from the results of a full-order
simulation. After projecting the full-order system onto this reduced-order subspace and
applying a least-squares approach, the final reduced-order system is:
$$\Phi^T A \Phi \alpha = \Phi^T F. \qquad (3)$$
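To make the notation concrete, the construction of the basis and the projected system (3) might be sketched with NumPy as below. The function names and the dense-matrix setting are assumptions of this sketch; an actual finite element code would work with sparse operators and assemble the reduced system without forming dense products.

```python
import numpy as np

def pod_basis(snapshots, N):
    """POD basis: first N left singular vectors of the snapshot matrix (one snapshot per column)."""
    Phi, sigma, _ = np.linalg.svd(snapshots, full_matrices=False)
    return Phi[:, :N], sigma

def solve_reduced(A, F, Phi):
    """Solve Phi^T A Phi alpha = Phi^T F (Eq. 3) and lift back with U ~ Phi alpha (Eq. 2)."""
    alpha = np.linalg.solve(Phi.T @ A @ Phi, Phi.T @ F)
    return Phi @ alpha, alpha
```

The decay of the singular values `sigma` is the usual guide for choosing $N$: if the full-order solution lies in the span of the retained modes, the reduced solve reproduces it exactly, and otherwise the discarded singular values indicate the information left out.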
In this section we summarize the stabilized finite element formulation for the
incompressible Navier-Stokes equations used in the rest of the chapter. Let us
consider the transient incompressible Navier-Stokes equations, which consist of finding
$u : \Omega \times (0, T) \to \mathbb{R}^d$ and $p : \Omega \times (0, T) \to \mathbb{R}$ such that:
$$\partial_t u - \nu \Delta u + u \cdot \nabla u + \nabla p = f \quad \text{in } \Omega,$$
$$\nabla \cdot u = 0 \quad \text{in } \Omega,$$
$$u = \bar{u} \quad \text{on } \Gamma_D,$$
$$-p\, n + \nu\, n \cdot \nabla u = 0 \quad \text{on } \Gamma_N,$$
for $t > 0$, where $\partial_t u$ is the local time derivative of the velocity field. $\Omega \subset \mathbb{R}^d$ is
a bounded domain, with $d = 2, 3$, $\nu$ is the viscosity, and $f$ the given source term.
Appropriate initial conditions have to be appended to this problem.
Let now $V = H^1(\Omega)^d$, and $V_0 = \{v \in V \mid v = 0 \text{ on } \Gamma_D\}$. Let also $Q = L^2(\Omega)$ and
$\mathcal{D}'(0, T; Q)$ be the distributions in time with values in $Q$. The variational problem
consists of finding $[u, p] \in L^2(0, T; V) \times \mathcal{D}'(0, T; Q)$ such that:
with
$$u = \bar{u} \quad \text{on } \Gamma_D,$$
where
Here, $(\cdot, \cdot)$ stands for the $L^2(\Omega)$ inner product and $\langle \cdot, \cdot \rangle$ for the integral of the
product of two functions, not necessarily in $L^2(\Omega)$. Let $\{K\}$ be a finite element
partition of $\Omega$, from which we construct the finite element spaces $V_h \subset V$, $V_{h,0} \subset V_0$,
$Q_h \subset Q$. The semilinear form $B$ suffers from the well-known stability issues
due to the convective nature of the flow, and also requires compatibility between
the velocity and pressure approximation spaces due to the classical LBB inf-sup
condition. In order to deal with these stability issues, we use a stabilized finite element
formulation [16], which is as follows: for each $t$, find $u_h(t) \in V_h$, $p_h(t) \in Q_h$
such that:
$$(v_h, \partial_t u_h) + B([v_h, q_h], [u_h, p_h]) + \sum_K \tau_K \left( u_h \cdot \nabla v_h + \nu \Delta v_h + \nabla q_h,\; r([u_h, p_h]) \right)_K = \langle v_h, f \rangle, \qquad (5)$$
194 J. Baiges et al.
$$r([u_h, p_h]) = \partial_t u_h - \nu \Delta u_h + u_h \cdot \nabla u_h + \nabla p_h - f, \qquad (6)$$
where $|u_h|_K$ is the mean velocity modulus in element $K$, $h$ is the element size and
$c_1$ and $c_2$ are stabilization constants.
Regarding the discretization in time, we consider implicit integration schemes.
For the full-order system, only implicit time integration schemes can be used, because
no time derivatives of the pressure appear in the equations. Taking this into account,
we can proceed as follows: supposing that the velocity and pressure at time step $n$,
$[u_h^n, p_h^n]$, are known, we may solve (5), for example, with $\partial_t u_h$ discretized using
a backward difference scheme in time:
$$\partial_t u_h \approx \delta_t u_h^{n+1},$$
$$\delta_t u_h^{n+1} := \begin{cases} \dfrac{1}{\delta t}\left(u_h^{n+1} - u_h^n\right) & \text{1st order scheme} \\ \dfrac{1}{\delta t}\left(\dfrac{3}{2} u_h^{n+1} - 2 u_h^n + \dfrac{1}{2} u_h^{n-1}\right) & \text{2nd order scheme} \end{cases} \qquad (7)$$
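The two backward difference formulas in (7) translate directly into code. The snippet below is a small sketch with hypothetical function names; the arguments may be scalars or NumPy arrays of nodal values, and a constant time step is assumed.

```python
def bdf1(u_new, u_n, dt):
    """First order backward difference approximation of the time derivative at n+1."""
    return (u_new - u_n) / dt

def bdf2(u_new, u_n, u_nm1, dt):
    """Second order backward difference approximation of the time derivative at n+1."""
    return (1.5 * u_new - 2.0 * u_n + 0.5 * u_nm1) / dt

# Check on u(t) = t^2 at t = 1 with dt = 0.1 (exact derivative: 2).
dt = 0.1
d1 = bdf1(1.0**2, 0.9**2, dt)            # first order: error proportional to dt
d2 = bdf2(1.0**2, 0.9**2, 0.8**2, dt)    # second order: exact for quadratics
```

For the quadratic test function the second order formula returns the exact derivative up to round-off, while the first order formula is off by `dt`, which illustrates the difference in accuracy between the two schemes.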
All reduced basis functions already fulfill the stabilized continuity equation.
Since the reduced-order basis is built from weakly incompressible solution
snapshots and the incompressibility constraint is linear, the reduced basis functions
(and their linear combinations) also fulfill it.
If the basis functions are taken to be joint velocity-pressure basis functions (that is,
$\Phi$ contains the coefficients of functions in $V \times Q$), then the pressure at time step
$n + 1$ is automatically recovered from the coefficients $\alpha^{n+1}$ and the reduced-order
basis, even if all the terms involving the pressure are treated in an explicit way
in the reduced-order formulation.
The variational formulation for the first order in time reduced-order model that we
propose is:
where the terms $\hat{u}^n$ and $\hat{p}^n$ are a second order approximation of the state at $n + 1$
(the velocity and the pressure at $n + 1$) given by:
$$\hat{u}^n = 2 u^n - u^{n-1}, \qquad \hat{p}^n = 2 p^n - p^{n-1}. \qquad (9)$$
In the case of the second order in time reduced-order model, we use the same
variational formulation (8), but the terms $\hat{u}^n$ and $\hat{p}^n$ are now a third order approximation
of the state at $n + 1$ given by:
$$\hat{u}^n = \frac{12}{5} u^n - \frac{9}{5} u^{n-1} + \frac{2}{5} u^{n-2}, \qquad \hat{p}^n = \frac{12}{5} p^n - \frac{9}{5} p^{n-1} + \frac{2}{5} p^{n-2}. \qquad (10)$$
Note that for the first order explicit scheme we propose to use the second order
extrapolation (9), and for the second order scheme the third order extrapolation (10).
The key point of this formulation is that only the temporal derivative terms involve
values of the reduced-order velocity or pressure at the new time step. This ensures
that the resulting reduced-order matrix is linear. However, the reduced-order
right-hand side still needs to be approximated. After solving the reduced-order system, the
velocity and pressure fields at $n + 1$ can be recovered by multiplying the reduced-order
basis by the obtained reduced-order components $\alpha^{n+1}$.
Fig. 1 Comparison of the FOM (left) and ROM (right) velocities (top) and pressures (bottom) after
400 time steps of simulation
The first numerical example consists of the two-dimensional flow past two cylinders.
The computational domain is a $16 \times 8$ rectangle. The cylinders are centered at
coordinates (3, 3) and (6, 5), and both have diameter 1. The inflow velocity
is 1, which together with the density $\rho = 1$ and the viscosity $\nu = 0.01$ results in a
Reynolds number $Re = 100$. The time step is set to $\delta t = 0.1$. The mesh is composed
of 7310 linear triangular elements. After running the full-order simulation and taking
the corresponding snapshots, the explicit reduced-order model is run. The number
of degrees of freedom for the ROM is only 10.
Figure 1 shows a comparison of the velocity and pressure fields for the full-order
and the explicit reduced-order model after 400 time steps of simulation. The high-
fidelity and the reduced-order fields are very similar. In Fig. 2 we compare the time
history and Fourier transform of the vertical velocity and the pressure at coordinates
(8.5, 4). We observe that the time history and Fourier transform of both vertical
velocity and pressure are accurate for the reduced-order model. The CPU time for
running the full-order model is 53.24 s, while that for running the explicit
reduced-order model is 19.78 s, a 63 % reduction in computational time.
3 Hyper-Reduction Approach
At this point, we already have an explicit reduced-order model in which all the non-
linear terms are in the right-hand-side vector and the reduced system matrix is linear
and does not change between time steps. However, computing the right-hand-side
Fig. 2 Comparison of the FOM and ROM velocity and pressure time history at the point (8.5, 4)
(left) and their Fourier transform (right). Axes: y-velocity and pressure versus time (left), dB (y-velocity) and dB (pressure) versus frequency (right)
vector at each time step is still expensive (the number of operations is $O(M)$), because
we need to recompute $F^{n+1}$ and then project it onto the reduced-order subspace by
calculating $\Phi^T F^{n+1}$. The approach we follow for reducing this computational cost
is to reconstruct the non-linear vector $F^{n+1}$ by sampling only some of the entries of
this vector and applying a least-squares minimization strategy. The method we follow
was first presented in [18], and a similar approach has recently been applied in [13]
to an implicit reduced-order method for the incompressible Navier-Stokes
equations.
Let us consider a reduced-order basis for the right-hand-side vectors $F$, $\Phi_F$,
obtained by means of a proper orthogonal decomposition of a set of snapshots for
$F$. $\Phi_F$ defines a low-dimensional subspace $\mathcal{F} \subset \mathbb{R}^M$, so that any right-hand-side
vector $F$ can be approximated as:
$$F \approx \Phi_F \alpha_F,$$
where now $\alpha_F \in \mathbb{R}^N$ are the reduced-order coefficients for the reconstruction. Let us
also consider that we only know the nodal values of $F$ at some sampling components
$F_{i(k)}$, $1 \le k \le n_s$, where $n_s$ is the number of sampling components of the vector and
$i(k)$ denotes the $k$th sampling component. We now want to recover the coefficients
$\alpha_F$ of the reduced-order basis $\Phi_F$ for the vector $F$.
In order to recover $\alpha_F$ we can solve the least-squares minimization problem:
$$\alpha_F = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} \Big( \sum_{j=1}^{N} \Phi_{F,i(k)j}\, a_j - F_{i(k)} \Big)^2, \qquad (11)$$
where $\Phi_{F,i(k)j}$ denotes the basis vector $j$ evaluated at the $k$th sampling component,
$i(k)$.
The previous procedure provides the tools required to extrapolate the right-hand-
side vector arising from the finite element problem. The main advantage is that in
order to do so, only the nodal values at certain few sampling components are needed.
If n s is O(N ), then the computational cost of rebuilding F n+1 for solving each time
step is reduced to O(N ), and the overall cost of the reduced-order model is O(N ).
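The reconstruction described by the minimization problem (11) amounts to a small least-squares solve on the sampled rows of the basis. A minimal NumPy sketch follows; the function and argument names are assumptions of this example, and the basis is written as `Phi_F` with coefficients `alpha_F`.

```python
import numpy as np

def reconstruct_rhs(Phi_F, sample_idx, F_samples):
    """Fit the coefficients of Eq. (11) on the sampled rows, then expand F ~ Phi_F alpha_F.

    Phi_F      : (M, N) reduced basis for the right-hand side
    sample_idx : indices i(k) of the n_s sampled components
    F_samples  : values F_{i(k)} computed only at those components
    """
    alpha_F, *_ = np.linalg.lstsq(Phi_F[sample_idx, :], F_samples, rcond=None)
    return Phi_F @ alpha_F, alpha_F
```

The full expansion `Phi_F @ alpha_F` is returned here only for inspection. In a reduced-order solver one would more likely precompute the small matrix obtained by projecting `Phi_F` onto the solution basis once, so that each time step works with vectors of reduced size only.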
When using hyper-reduced order models the quality of the recovered right-hand-
side vector highly depends on the selected sampling components. Several strategies
have been developed for choosing these sampling components [5, 8, 17]. Amongst
the most extensively used are the Discrete Empirical Interpolation Method (DEIM)
[15], where the sampling components are selected iteratively by imposing that the
error growth at each iteration is limited and the Best Points Interpolation Method
(BPIM) approach presented in [31], where the sampling points are chosen so that the
distance between the projection of the right-hand-side snapshots onto the reduced
basis subspace and the recovered right-hand-side is minimized.
The strategy we use, presented in [7], is a hybrid between the BPIM and the
DEIM. We call it a Discrete version of the Best Point Interpolation Method (DBPIM).
Similarly to the BPIM, the method consists of minimizing the error between the
recovered right-hand-side vector snapshots and the actual snapshots. However, in
the strategy we use we force the sampling coordinates to coincide with nodal points
of the finite element mesh. In addition, once a component associated to a node of the finite
element mesh is selected, all the degrees of freedom associated to that node are
included in the sampling selection. Moreover, due to the lack of smoothness of the
vectors which are being approximated we do not use a Marquardt related strategy in
order to advance to the optimal set of sampling nodes. Instead, we use an algorithm
which advances from one set of sampling nodes to the next one by evaluating the
error of the recovered snapshots at the neighbour points in the finite element mesh
and replaces a sampling node with its neighbour if the error diminishes. The DBPIM
algorithm is detailed in Algorithm 1 for a scalar unknown (where each sampling
node is associated to a single sampling component).
The first step of the DBPIM algorithm consists of finding the projection $\alpha_F$ of
the snapshots onto the reduced-order subspace defined by the reduced-order basis
$\Phi_F$. For each snapshot $s$, this yields the coefficients $\alpha_F^{s}$. In the second step we choose
an initial set of sampling nodes, which can be done by using the DEIM method.
If the DEIM method is used, it will give us a set of sampling components. For a
scalar problem, each component corresponds to a node of the finite element mesh.
If the unknown is a vector field, then the nodes associated with the DEIM sampling
components are selected as initial sampling nodes, and the number of sampling nodes
is equal to the number of reduced basis functions. Otherwise, we always choose the
number of sampling nodes to be equal to the number of basis functions times a
(usually low) integer. After defining the initial set of sampling nodes, the degree(s)
of freedom associated with these sampling nodes become sampling components. For
this initial set of sampling nodes, we recover the approximated coefficients $\alpha_F^{s,\mathrm{aprox}}$
by means of the previously described least-squares strategy. The error associated
with a set of sampling components $i \in \mathbb{N}^{n_s}$, whose $k$-th component is indicated as
$i(k) \in \{1, \ldots, M\}$, is obtained by computing the difference between the exact and the
approximated $\alpha_F$ coefficients:
$$e(i) = \sum_{s=1}^{N_{\mathrm{snapshots}}} \left\| \alpha_F^{s,\mathrm{aprox}}(i) - \alpha_F^{s} \right\| \qquad (12)$$
where
$$\alpha_F^{s,\mathrm{aprox}}(i) = \arg\min_{a \in \mathbb{R}^N} \sum_{k=1}^{n_s} \Big( \sum_{j=1}^{N} \Phi_{F,i(k)j}\, a_j - F^{s}_{i(k)} \Big)^2, \quad s = 1, \ldots, N_{\mathrm{snapshots}} \qquad (13)$$
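The error measure (12)-(13) and the neighbour-replacement idea described above can be sketched as follows for a scalar unknown. This is a hedged illustration, not a reproduction of Algorithm 1: the function names, the `neighbours` dictionary (mesh connectivity given as node to adjacent nodes) and the first-improvement acceptance rule are assumptions of this sketch.

```python
import numpy as np

def sampling_error(Phi_F, snapshots, sample_idx):
    """Eq. (12): sum over snapshots of the norm of (approximated - exact) coefficients."""
    # Exact coefficients: least-squares projection of every snapshot onto the basis.
    alpha_exact, *_ = np.linalg.lstsq(Phi_F, snapshots, rcond=None)
    # Approximated coefficients: Eq. (13), using only the sampled components.
    alpha_aprox, *_ = np.linalg.lstsq(Phi_F[sample_idx, :],
                                      snapshots[sample_idx, :], rcond=None)
    return float(np.sum(np.linalg.norm(alpha_aprox - alpha_exact, axis=0)))

def improve_sampling(Phi_F, snapshots, sample_idx, neighbours):
    """Replace a sampling node by a mesh neighbour whenever that lowers the error."""
    idx = list(sample_idx)
    best = sampling_error(Phi_F, snapshots, idx)
    improved = True
    while improved:                      # stop when no single swap reduces the error
        improved = False
        for k in range(len(idx)):
            for cand in neighbours[idx[k]]:
                if cand in idx:          # keep the sampling nodes distinct
                    continue
                trial = idx.copy()
                trial[k] = cand
                err = sampling_error(Phi_F, snapshots, trial)
                if err < best:
                    idx, best, improved = trial, err, True
                    break
    return idx, best
```

Since a swap is accepted only when the error strictly decreases, the loop terminates at a set of nodes that no single neighbour swap can improve. For a vector unknown, each selected node would contribute all of its degrees of freedom to `sample_idx`, as described in the text.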
For this definition of the error, we can define the optimal set of sampling compo-
nents as:
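The neighbour-replacement search described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the chapter's Algorithm 1: the function names `recovery_error` and `neighbour_swap_search`, the one-dimensional mesh, the synthetic snapshots and the choice of the first eight nodes as starting set (instead of a DEIM selection) are all hypothetical.

```python
import numpy as np

def recovery_error(idx, Phi, snapshots, beta_exact):
    """Error of the coefficients recovered from the sampled rows only."""
    # Least-squares fit of the basis coefficients using the rows in idx.
    beta_aprox, *_ = np.linalg.lstsq(Phi[idx, :], snapshots[idx, :], rcond=None)
    # Sum over snapshots of the norm of the coefficient mismatch.
    return np.linalg.norm(beta_aprox - beta_exact, axis=0).sum()

def neighbour_swap_search(idx0, Phi, snapshots, neighbours, max_sweeps=50):
    """Replace a sampling node by a mesh neighbour whenever the error diminishes."""
    beta_exact = Phi.T @ snapshots  # projection; Phi has orthonormal columns
    idx = list(idx0)
    err = recovery_error(idx, Phi, snapshots, beta_exact)
    for _ in range(max_sweeps):
        improved = False
        for k in range(len(idx)):
            for cand in neighbours[idx[k]]:
                if cand in idx:
                    continue
                trial = idx.copy()
                trial[k] = cand
                trial_err = recovery_error(trial, Phi, snapshots, beta_exact)
                if trial_err < err:
                    idx, err, improved = trial, trial_err, True
        if not improved:
            break  # no neighbour swap reduces the error: local optimum
    return idx, err

# Hypothetical 1D "mesh" with 60 nodes; each node neighbours the adjacent ones.
M, S, N = 60, 20, 4
x = np.linspace(0.0, 1.0, M)
snapshots = np.array([np.sin((1 + 0.3 * s) * np.pi * x) for s in range(S)]).T
Phi = np.linalg.svd(snapshots, full_matrices=False)[0][:, :N]
neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j < M] for i in range(M)}

idx0 = list(range(8))  # assumed starting set (the text suggests DEIM instead)
err0 = recovery_error(idx0, Phi, snapshots, Phi.T @ snapshots)
idx_opt, err_opt = neighbour_swap_search(idx0, Phi, snapshots, neighbours)
```

Since a swap is accepted only when the error decreases, the search stops at a set of sampling nodes that no single neighbour replacement can improve.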
In this section we simulate the incompressible flow around a NACA 0012 airfoil
profile [24]. The computational domain is a 32 × 16 rectangle, with the trailing edge
of the 8 unit long airfoil placed at (16, 8). The horizontal inflow velocity is set to 1
at x = 0, and slip boundary conditions are applied at the upper and lower walls of
the computational domain. Velocity is prescribed to 0 at the airfoil surface.
The viscosity has been set to $\nu = 0.001$, which yields a Reynolds number Re
= 1000 based on the height of the airfoil. The time step has been set to $\delta t = 0.2$.
In this numerical example, the CFL number associated to the finite element mesh
was CFL ≈ 62. A mesh of 29945 linear elements has been used. The mesh is refined
around the airfoil surface in order to better capture the solution in the region
surrounding the boundary layer. The angle of attack has been set to $\alpha = 0.2$, and a
second order backward differences scheme has been used for the time integration.
100 velocity-pressure snapshots have been taken and the 10 first reduced basis
functions have been kept for the reduced-order model. For the hyper-reduced order
model, 100 additional snapshots for the right-hand-side have been taken and the cor-
responding 12 first reduced basis functions have been kept. The number of sampling
nodes is 36.
Fig. 3 Velocity (top) and pressure (bottom) contours at Re = 1000, $\alpha$ = 0.2 after 200 time steps.
Full-order (left) and Hyper-Reduced Order Model (right)
Fig. 4 Velocity (left) and pressure (right) time history at a control point at the wake of the airfoil,
Re = 1000, $\alpha$ = 0.2, second order time integration. Curves: FOM, ROM and HROM; axes: y-velocity
and pressure against time
Figure 3 compares the velocity and pressure fields after 200 time steps for the
full-order and the hyper-reduced model. The hyper-reduced order model almost exactly
matches the results from the full order model.
Regarding the computational cost, the full order model takes 148.9 s to run, while the
reduced-order model takes 49.6 s (33 %). Finally, reduced-order model 2, in which
the computational cost depends only on the size of the reduced-order model, takes
only 0.71 s (0.45 %) to run.
Figures 4 and 5 show the time history and spectra for the velocity and pressure at
(8, 0.5). Despite the complex flow and the high number of oscillation modes present
in the solution, the reduced-order models manage to correctly capture the amplitudes
and frequencies of the main modes.
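A spectrum of the kind compared here can be obtained from a sampled time history with a discrete Fourier transform. The sketch below is only illustrative: the function name `spectrum_db`, the amplitude normalisation, the dB reference and the synthetic signal are assumptions, since the chapter does not state how its spectra were computed. Only the time step 0.2 and the 200 samples are taken from the example.

```python
import numpy as np

def spectrum_db(signal, dt):
    """One-sided amplitude spectrum (in dB) of a uniformly sampled time history."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Remove the mean so that the zero-frequency bin does not dominate.
    amplitudes = np.abs(np.fft.rfft(signal - signal.mean())) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=dt)
    # Tiny offset avoids log10(0) for empty bins.
    return freqs, 20.0 * np.log10(amplitudes + 1e-300)

dt = 0.2                      # time step of the airfoil example
t = np.arange(200) * dt       # 200 time steps
history = 0.3 * np.sin(2.0 * np.pi * 0.5 * t)  # synthetic stand-in for a y-velocity record

freqs, db = spectrum_db(history, dt)
peak_frequency = freqs[np.argmax(db)]
```

With a time step of 0.2 the highest resolvable frequency is 1/(2 · 0.2) = 2.5, which matches the frequency range of the plotted spectra.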
202 J. Baiges et al.
Fig. 5 Velocity (left) and pressure (right) spectra at a control point at the wake of the airfoil, Re = 1000,
$\alpha$ = 0.2, second order time integration. Curves: FOM, ROM and HROM; axes: dB (y-velocity) and
dB (pressure) against frequency
Let us consider the splitting of the computational domain $\Omega$ into two subdomains
$\Omega^k$, $k = 1, 2$, and the associated local unknowns $U^k \in \mathbb{R}^{M_k}$, $M = M_1 + M_2$. If
the domain decomposition is applied to the equations arising from a finite element
problem, the partition into subdomains is done by assigning each of the nodes (and
nodal unknowns) of the finite element mesh to a subdomain. This means that there
are no interface nodes; instead, we define interface elements as those elements which
own nodes from different subdomains. Let us define a local reduced order basis $\Phi^k$
consisting of the reduced basis functions $\phi_i^k \in \mathbb{R}^{M_k}$, $i = 1, \dots, N_k$, to approximate
$U^k$ in each subdomain. Note that the number of basis functions in each subdomain
is not necessarily the same, although we consider it to be equal from now
on for simplicity. The possible ways to construct this basis are discussed later. This
local basis can be extended to the global domain by defining $\hat{\phi}_i^k \in \mathbb{R}^{M}$:
$$\hat{\phi}_i^1 := \begin{bmatrix} \phi_i^1 \\ 0 \end{bmatrix}, \qquad \hat{\phi}_i^2 := \begin{bmatrix} 0 \\ \phi_i^2 \end{bmatrix}, \tag{15}$$
where the null terms correspond to components of the global system which lie outside
$\Omega^k$. Taking this into account, the unknown $U$ is approximated as:
$$U \approx \sum_{i=1}^{N} \left( \hat{\phi}_i^1 \beta_i^1 + \hat{\phi}_i^2 \beta_i^2 \right) = \hat{\Phi}^1 \beta^1 + \hat{\Phi}^2 \beta^2, \qquad \hat{\Phi}^k \in \mathbb{R}^{M \times N}, \ \beta^k \in \mathbb{R}^{N}, \ k = 1, 2. \tag{16}$$
Projecting the full-order system $AU = F$ onto each of the extended local bases yields:
$$\begin{aligned} (\hat{\Phi}^1)^T A (\hat{\Phi}^1 \beta^1 + \hat{\Phi}^2 \beta^2) &= (\hat{\Phi}^1)^T F \\ (\hat{\Phi}^2)^T A (\hat{\Phi}^1 \beta^1 + \hat{\Phi}^2 \beta^2) &= (\hat{\Phi}^2)^T F. \end{aligned} \tag{17}$$
If we also consider the decomposition of $A$ and $F$ the final reduced order system
can be written in terms of the local bases $\Phi^k$:
The off-diagonal block matrices correspond to the coupling terms and are null
except for the contribution of the unknowns located at the domain interfaces. It
can be observed that the cost of computing the ROM system is not larger than in
the monolithic approach. However, the size of the reduced system is larger
(dimension $2N$).
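As an illustration of the block structure just described, the following sketch assembles and solves a two-subdomain reduced system with NumPy. It is a toy example under stated assumptions: the tridiagonal matrix `A`, the random right-hand side and the random orthonormal local bases are hypothetical stand-ins for a finite element system and for POD bases.

```python
import numpy as np

rng = np.random.default_rng(1)
M1, M2, N = 6, 5, 2          # local sizes and number of basis functions per subdomain
M = M1 + M2

# Hypothetical full-order system: tridiagonal, so only adjacent nodes are coupled.
A = 2.0 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)
F = rng.standard_normal(M)

# Hypothetical local bases with orthonormal columns (stand-ins for local POD bases).
Phi1 = np.linalg.qr(rng.standard_normal((M1, N)))[0]
Phi2 = np.linalg.qr(rng.standard_normal((M2, N)))[0]

# Extension by zero of each local basis to the global vector of unknowns.
Phi1_hat = np.vstack([Phi1, np.zeros((M2, N))])
Phi2_hat = np.vstack([np.zeros((M1, N)), Phi2])

# Reduced system of dimension 2N: one block per pair of subdomains.
bases = (Phi1_hat, Phi2_hat)
A_rom = np.block([[Pk.T @ A @ Pl for Pl in bases] for Pk in bases])
F_rom = np.concatenate([Pk.T @ F for Pk in bases])

beta = np.linalg.solve(A_rom, F_rom)
U_rom = Phi1_hat @ beta[:N] + Phi2_hat @ beta[N:]

# The coupling block only sees the entries of A linking the two subdomains.
coupling = Phi1.T @ A[:M1, M1:] @ Phi2
```

In this toy case only one entry of `A[:M1, M1:]` is non-zero, which is the algebraic counterpart of the statement that the off-diagonal blocks come from the interface contributions alone.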
An important point is that each algebraic local basis function $\phi_i^k$ arises from a
function defined in space. This spatial function is a linear combination of the finite
element shape functions of the nodes of subdomain $\Omega^k$. As a consequence, each
of the components in $\phi_i^k$ corresponds to a nodal value of the spatial field to be
represented on the finite element mesh. This is illustrated in Fig. 6, where examples
of local basis functions for a one-dimensional problem and linear finite elements
are shown. Let us also emphasize that, if the original finite element shape functions
are continuous, any local (and global) basis function will also be continuous, as a
consequence of the definition of the extension of the basis functions to the global
domain (15). This will also hold for the combination of local basis functions, even
if these belong to different subdomains. In Fig. 6 the blue basis function belongs to
the left subdomain, the red basis function belongs to the right subdomain. The green
line represents the addition of the blue and the red basis functions. Since both of the
original functions are continuous, the green function is also continuous. Note also
that there is an overlapping region where both the left and right basis functions are
non-zero.
The strategy for building the local POD basis consists in performing a POD for the
part of the snapshots corresponding to each of the subdomains. The snapshots are
first partitioned according to the domain decomposition strategy and the local basis
$\Phi^k$ is obtained from these partitioned snapshots. The global basis is again defined
as $\Phi = [\hat{\Phi}^1, \hat{\Phi}^2]$. Note that the number of local basis functions in each subdomain
does not necessarily coincide, $N_1 \neq N_2$, $N = N_1 + N_2$.
The main features of these domain-decomposition local POD bases are the following:
• Each local basis can be ensured to be orthonormal at the algebraic level. By
construction, each of the basis functions which form the local POD basis has unit
norm and is orthogonal to all the other basis functions in its subdomain at the
algebraic level. Moreover, due to the domain decomposition approach, the projection
of a local basis of a given subdomain onto the space spanned by the basis functions
of any other subdomain is also zero. This ensures that if we consider the POD
decomposition globally, the union of the local bases is also an orthonormal basis.
• The computation of the singular value decomposition of the local snapshots for
each subdomain requires less memory than the computation of the singular value
decomposition of the global snapshots.
In the case we are using hyper reduced models which require additional POD bases
for reconstructing the system matrix and right-hand side, we can proceed in the same
way.
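The construction of the local POD bases and the orthonormality of their union can be checked with a few lines of NumPy. This is a sketch under stated assumptions: `local_pod` is a hypothetical helper, the snapshots are random numbers rather than flow solutions, and the partition of the nodes is arbitrary.

```python
import numpy as np

def local_pod(snapshots, node_sets, n_modes):
    """POD basis per subdomain, each extended by zero to the global vector size."""
    M = snapshots.shape[0]
    extended = []
    for nodes in node_sets:
        # SVD of the rows of the snapshot matrix that belong to this subdomain.
        U, _, _ = np.linalg.svd(snapshots[nodes, :], full_matrices=False)
        Phi_hat = np.zeros((M, n_modes))
        Phi_hat[nodes, :] = U[:, :n_modes]
        extended.append(Phi_hat)
    return extended

rng = np.random.default_rng(2)
snapshots = rng.standard_normal((12, 8))            # 12 unknowns, 8 snapshots (synthetic)
node_sets = [np.arange(0, 7), np.arange(7, 12)]     # assumed partition of the unknowns

Phi1_hat, Phi2_hat = local_pod(snapshots, node_sets, n_modes=3)
Phi = np.hstack([Phi1_hat, Phi2_hat])
gram = Phi.T @ Phi   # identity if the union of the local bases is orthonormal
```

The cross terms of `gram` vanish because the extended bases of different subdomains have disjoint supports, and each SVD only handles the rows of its own subdomain.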
Once the localized reduced order bases have been defined, the monolithic domain
decomposition reduced order model is obtained by using as reduced basis the union
of the local reduced bases. The fact that the basis functions are local makes the com-
putational cost diminish with respect to the global approach with the same number
of basis functions, because the operations can be done at the local level. However,
the number of functions is usually larger in the domain decomposition approach,
because a sufficient number of components needs to be assigned to the reduced basis
of each subdomain in order to properly represent the solution in that subdomain.
The previous domain decomposition strategy for reduced-order models, despite its
simplicity, suffers from unstable behavior when it is used in a straightforward manner
in the explicit reduced-order model for the stabilized finite element approximation
of the incompressible Navier-Stokes equations described in the previous sections.
These instabilities can be easily explained taking into account that an explicit time
marching scheme is equivalent in this case to an explicit iteration-by-subdomain
strategy, which is known to have convergence and stability issues. This is the reason
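where now
$$\Phi^k_{I} \in \mathbb{R}^{M_k^{I} \times N_k}, \tag{21}$$
is the restriction of the local basis functions in $\Omega^k$ to the part of the subdomain
without overlapping ($M_k^{I}$ components), and
$$\Phi^k_{O} \in \mathbb{R}^{M_k^{O} \times N_k}, \tag{22}$$
$$\Phi^1_{O} \beta^1 = U^1 = U^2 = \Phi^2_{O} \beta^2 \in \mathbb{R}^{M^{O}}, \tag{24}$$
where now we take $\beta^k \in \mathbb{R}^{N_k}$ the ROM degrees of freedom for each subdomain.
This condition can be equivalently written as:
$$\begin{aligned} (\hat{\Phi}^1_{O})^T \hat{\Phi}^1_{O} \beta^1 - (\hat{\Phi}^1_{O})^T \hat{\Phi}^2_{O} \beta^2 &= 0, \\ (\hat{\Phi}^2_{O})^T \hat{\Phi}^1_{O} \beta^1 - (\hat{\Phi}^2_{O})^T \hat{\Phi}^2_{O} \beta^2 &= 0, \end{aligned} \tag{25}$$
where
$$\hat{\Phi}^k_{O} = \begin{bmatrix} 0 \\ \Phi^k_{O} \\ 0 \end{bmatrix}. \tag{26}$$
$$a^{11} \beta^1 + a^{12} \beta^2 + \frac{1}{\epsilon} \left( M^{11} \beta^1 - M^{12} \beta^2 \right) = f^1, \tag{27}$$
$$a^{21} \beta^1 + a^{22} \beta^2 + \frac{1}{\epsilon} \left( M^{21} \beta^1 - M^{22} \beta^2 \right) = f^2, \tag{28}$$
where
$$M^{kl} = (\hat{\Phi}^k_{O})^T \hat{\Phi}^l_{O} \in \mathbb{R}^{N_k \times N_l}. \tag{29}$$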
Let us remark that the time stepping strategies need not be the same for the
full order and the reduced order equations. For instance, if the explicit reduced order
model described in the previous sections is used for the incompressible Navier-Stokes
equations, the A matrix and the F RHS vector for the reduced order equations are
taken from the explicit model, while the equations arising from the implicit time
stepping are kept for the full order equations:
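$$\begin{bmatrix} A|_{FF} & A|_{FR} \Phi^R \\ (\Phi^R)^T A^{exp}|_{RF} & (\Phi^R)^T A^{exp}|_{RR} \Phi^R \end{bmatrix} \begin{bmatrix} U^F \\ \beta^R \end{bmatrix} = \begin{bmatrix} F|_F \\ (\Phi^R)^T F^{exp}|_R \end{bmatrix}. \tag{31}$$
where
$$A^{RF}_{PG} = (\Phi^R)^T \left( A|_{FR}^T A|_{FF} + A|_{RR}^T A|_{RF} \right),$$
$$A^{RR}_{PG} = (\Phi^R)^T \left( A|_{FR}^T A|_{FR} + A|_{RR}^T A|_{RR} \right),$$
$$F^{R}_{PG} = (\Phi^R)^T A|_{FR}^T F|_F + (\Phi^R)^T A|_{RR}^T F|_R.$$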
Also, any hyper-reduction technique for efficiently reconstructing the ROM equations
can be used. The described overlapping strategies and the use of weighting
coefficients need to be introduced in the previous formulation. This can be done in a
straightforward manner, including the use of different weighting parameters
for the FOM and the ROM equations.
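The structure of a hybrid FOM-ROM system, with full-order unknowns in one set of degrees of freedom and reduced coefficients in the other, can be sketched as follows. The example is a simplification under stated assumptions: `solve_fom_rom` is a hypothetical helper, the same matrix `A` is used in both block rows (no separate explicit operator), and neither overlapping nor penalty terms are included.

```python
import numpy as np

def solve_fom_rom(A, F, fom_dofs, rom_dofs, Phi_R):
    """Full-order unknowns on fom_dofs, reduced coefficients on rom_dofs."""
    A_FF = A[np.ix_(fom_dofs, fom_dofs)]
    A_FR = A[np.ix_(fom_dofs, rom_dofs)]
    A_RF = A[np.ix_(rom_dofs, fom_dofs)]
    A_RR = A[np.ix_(rom_dofs, rom_dofs)]
    # First block row: full-order equations; second block row: projected equations.
    K = np.block([[A_FF, A_FR @ Phi_R],
                  [Phi_R.T @ A_RF, Phi_R.T @ A_RR @ Phi_R]])
    rhs = np.concatenate([F[fom_dofs], Phi_R.T @ F[rom_dofs]])
    sol = np.linalg.solve(K, rhs)
    n_F = len(fom_dofs)
    U = np.empty(A.shape[0])
    U[fom_dofs] = sol[:n_F]
    U[rom_dofs] = Phi_R @ sol[n_F:]   # expand the reduced coefficients
    return U

rng = np.random.default_rng(3)
M = 10
A = 2.0 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)   # hypothetical SPD operator
fom_dofs = np.arange(0, 4)
rom_dofs = np.arange(4, M)
Phi_R = np.linalg.qr(rng.standard_normal((rom_dofs.size, 2)))[0]

# Manufactured solution whose ROM part lies in the span of the reduced basis.
U_exact = np.empty(M)
U_exact[fom_dofs] = rng.standard_normal(fom_dofs.size)
U_exact[rom_dofs] = Phi_R @ rng.standard_normal(2)
F = A @ U_exact

U_hybrid = solve_fom_rom(A, F, fom_dofs, rom_dofs, Phi_R)
```

When the solution in the ROM region is representable by the reduced basis, as in this manufactured case, the hybrid system reproduces it; otherwise the ROM region carries the projection error while the FOM region retains its full resolution.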
The application of the domain decomposition ROM strategy to the particular problem of
the incompressible Navier-Stokes equations is straightforward if a ROM approach
is used in all the subdomains. On the other hand, some care needs to be taken when
a FOM approximation is used in one of the subdomains while a ROM approxima-
tion is used in its neighbour subdomains. As in the original domain decomposition
strategy, a penalization term through overlapping is convenient in this FOM-ROM
approach. However, it is necessary to distinguish between the velocity and the pres-
sure unknowns of the incompressible Navier-Stokes equations in this case: only the
equality between the FOM and the ROM velocities in the overlapping region is
imposed, and no condition is required on the FOM pressure field. This is so because
the pressure field can be understood as the Lagrange multiplier enforcing the incom-
pressibility constraint, and as such it is not possible to enforce the pressure value
over the overlapping domain.
In this numerical example we show the capability of the proposed FOM-ROM strat-
egy to adapt to flow configurations which were not present in the original snapshot
set. The initial problem set is the incompressible flow past a rectangular cylinder at
Re = 100. The computational domain consists of a 24 × 12 rectangle with a square
cylinder with a side of size 1. The square cylinder is centered at coordinates (8, 6).
The horizontal inflow velocity is set to 1. Slip boundary conditions which allow the
Fig. 8 Comparison of the FOM and ROM velocities at (5,4) for the initial configuration. Panels:
y-velocity and pressure time histories
flow to move in the direction parallel to the walls are set at y = 0 and y = 12, and
velocity is set to 0 on the cylinder surface in the direction normal to the surface.
A tangential force (computed by using a wall-law approach) is used to model the
velocity in the tangential direction. The viscosity has been set to $\nu = 0.01$, which
yields a Reynolds number Re = 100 based on the dimension of the cylinder and the
inflow velocity. A second order backward difference scheme has been used for the
time integration with time step $\delta t = 0.1$. In this example, a relatively fine mesh of
67224 linear elements has been used to solve the problem.
An initial run of the full-order model is performed for the snapshot collection and
no domain decomposition strategy is applied in the initial run. The FOM model takes
849.36 s to run. After the snapshot collection procedure, the ROM is capable of
reproducing the FOM solution with a good accuracy for the velocity field (2.1 % of relative
error in the $L^2$-norm for the last oscillation period), the pressure amplitude being
underpredicted (but only with 0.8 % of relative error in the last oscillation period),
and a very low computational cost (3.07 s, 0.37 % of the original computational
cost), as illustrated in Fig. 8. For the ROM run, 10 basis functions are used, which
are obtained from the POD decomposition of the original 50 snapshot collection.
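A relative error of the kind quoted above can be evaluated as follows. This is only a sketch of one possible measure, a discrete $L^2$-norm ratio restricted to the last oscillation period. The function name, the synthetic signals and the oscillation period are assumptions, since the chapter does not detail how its error figures were obtained.

```python
import numpy as np

def relative_l2_error(reference, approximation):
    """Relative discrete L2 error of an approximation with respect to a reference."""
    reference = np.asarray(reference, dtype=float)
    approximation = np.asarray(approximation, dtype=float)
    return np.linalg.norm(approximation - reference) / np.linalg.norm(reference)

dt, period = 0.1, 5.0                   # time step of the example; assumed period
t = np.arange(400) * dt
fom = 0.15 * np.sin(2.0 * np.pi * t / period)   # synthetic stand-in for the FOM history
rom = 1.02 * fom                                 # synthetic ROM history, 2 % too large

last = t >= t[-1] - period              # samples in the last oscillation period
error = relative_l2_error(fom[last], rom[last])
```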
As illustrated in Fig. 8, the reduced-order model is capable of reproducing the
solution of the full-order model for the configuration in which the snapshots were
taken. However, let us now consider the flow injection in the downstream side of the
cylinder illustrated in Fig. 9, which is introduced in order to modify the flow. The
velocity in the injection region (whose length is 0.2) is 0.1 in the direction normal
to the cylinder surface. Figure 10 illustrates the behavior of the reduced order model
when the injection is considered. Despite its very low computational cost compared
to the FOM model, it is clear that the ROM is incapable of reproducing the new flow
configuration; the reason for this is that the snapshot set from which the ROM basis
was built does not contain the solution with the flow injection.
Let us now consider the FOM-ROM strategy described in the previous sections.
We will decompose the physical domain into two subdomains, based on our a priori
knowledge of the boundary conditions of the problem: the first subdomain
corresponds to the region surrounding the square cylinder, namely the rectangle
(7, 10) × (5, 7). In this subdomain a FOM approach is going to be taken, and the
Navier-Stokes equations are going to be solved with full accuracy. The second
subdomain covers the rest of the computational domain. Since this region does not
involve the critical area where the vortices are formed, it is going to be solved by
means of the less accurate ROM strategy. The ROM basis is obtained from a set of
100 snapshots, from which an L-POD basis of 10 basis functions is obtained. As will
be shown, the combination of both strategies (FOM and ROM) allows us to recover
a solution which is close to the full FOM solution, but at a much lower computational cost.
Fig. 9 Flow injection configuration. The red dotted line denotes the FOM domain for the
FOM-ROM model
Fig. 10 Comparison of the vertical velocity (left) and pressure (right) at (5,4) for the FOM,
FOM-ROM and ROM models for the injection case
Figure 10 shows a comparison of the vertical velocity and pressure at a point
at the wake of the cylinder with coordinates (5, 4), for the FOM, the ROM and
the FOM-ROM models. It is interesting to note that the ROM model is not able to
capture the physics of the problem; this is natural since the ROM basis does not
contain the solution of the injection case. The FOM-ROM model, on the other hand,
Fig. 11 Comparison of the velocity (top) and pressure (bottom) fields after 400 steps. Left FOM.
Right FOM-ROM
is capable of a quite accurate solution of the system evolution in the short term
in the FOM domain (13.3 % relative error for the velocity time history in the last
oscillation period and 4.6 % error in the pressure). Figure 11 compares the velocity
and pressure fields of the FOM and the FOM-ROM models. We can observe that in
the region surrounding the cylinder (FOM region) the velocity and pressure fields
are very similar, whereas in the ROM region the velocity fields slightly differ, with more
intense vortices or bulbs in the FOM simulation. This is due to the difficulties of
the ROM model in representing the injected velocity and pressure fields (the snapshots
used are poorly suited to the injection case). Despite this evident lack of optimality of the
snapshot set, the FOM-ROM model is capable of properly representing the solution
in the FOM region. Figure 12 shows a comparison between the FOM simulation and
FOM-ROM model for several injection velocities. The accuracy of the FOM-ROM
model decreases as the absolute value of the injection velocity increases. This is due
to the fact that the larger the injection velocity, the more different the flow becomes
from the original FOM simulation without injection. Regarding the computational
cost, the FOM-ROM approach takes 55.56 s to run, which is only 6.7 % of the original
FOM computational cost.
5 Conclusions
In this chapter we have discussed several strategies for dealing with the reduced-order
approximation of the incompressible Navier-Stokes equations. We have started
from a stabilized finite element full-order approximation and we have approached
the order reduction by using a Proper Orthogonal Decomposition (POD) method.
Fig. 12 Comparison of the vertical velocity (left) and pressure (right) at (5,4) for the FOM,
FOM-ROM and ROM models for the injection case. Injection velocities, from top to bottom: 0.2, 0.5, -0.2
In the first part of the chapter, we have focused on the construction of an explicit
reduced-order model for the incompressible Navier-Stokes equations, and the
application of hyper-reduction techniques to it. The basic idea is to treat all the terms
except the mass matrix in the temporal derivative in an explicit way. This includes
the non-linear convective term, but also the stabilization terms which can be highly
non-linear through the stabilization parameter $\tau$. In order to do so, we take advantage
of the fact that the snapshots used for building the reduced-order basis through a
singular value decomposition in the POD procedure do already fulfill the stabilized
continuity equation. Secondly, we also acknowledge the fact that, if the velocity and
pressure are treated jointly, then the pressure can be recovered from the reduced-order
basis and the solution coefficients at the end of each time step.
The proposed explicit reduced-order model performs well in practical cases, as
illustrated in the numerical examples section. Despite the time-stepping scheme
being explicit, the Courant-Friedrichs-Lewy condition can be violated, which can be
explained because the reduced basis functions extend over the whole computational
domain. On the other hand, the reduced model is sensitive to the inclusion of noisy
basis functions, which can cause unstable solutions to appear. The sensitivity of the
explicit reduced-order model to this issue can be improved by reducing the time step
and refining the finite element mesh.
A hyper-reduction strategy for the explicit reduced-order model has also been
presented, which is based on the reconstruction of the right-hand-side vector through
a gappy POD procedure. For the selection of the indices of the gappy reconstruction,
we use a discrete version of the Best Points Interpolation Method (DBPIM), which
uses only values at the nodes of the finite element mesh, with the advantage that the
selected points can be guaranteed to be at least locally optimal.
In the second part of the chapter, we have presented a domain decomposition strat-
egy for non-linear hyper-reduced-order models. The method consists of restricting
the reduced-order basis functions to the nodes of each subdomain. This definition
of the partitioned problem directly ensures the continuity of the recovered solution.
The local POD bases are obtained by computing a local POD decomposition for
the partitioned snapshots. When applied to the explicit reduced-order model for the
incompressible Navier-Stokes equations a stabilizing penalization term is required.
This penalty term is defined so that it weakly enforces the equality of the unknown
between subdomains in an overlapping region.
The domain decomposition reduced-order model can be extended to a particular
case, in which one of the subdomains is solved by using the full-order finite ele-
ment equations while the other ones are solved using the reduced-order model. This
diminishes the computational cost in the low-resolution subdomains, while keeping
the high fidelity solution in the domain regions which are subject to more complex
physical phenomena.
Numerical examples illustrate the accuracy of the proposed methods for the solution
of incompressible flow problems at a low computational cost: the reduced-order
model allows us to save up to 65 % of the computational cost, while in the case of
the hyper-reduced order models the computational saving is larger than 99 % of the
original computational cost.
References
1. Akhtar I, Borggaard J, Hay A (2010) Shape sensitivity analysis in flow models using a finite-
difference approach. Math Probl Eng 123:2010
2. Antil H, Heinkenschloss M, Hoppe RHW, Sorensen DC (2010) Domain decomposition and
model reduction for the numerical solution of pde constrained optimization problems with
25. Kalashnikova I, Barone MF (2011) Stable and efficient Galerkin reduced order models for
non-linear fluid flow. In: AIAA-2011-3110, 6th AIAA theoretical fluid mechanics conference,
Honolulu
26. Kosambi DD (1943) Statistics in function space. J Indian Math Soc 7:76–88
27. Lassila T, Rozza G (2010) Parametric free-form shape design with PDE models and reduced
basis method. Comput Meth Appl Mech Eng 199(23–24):1583–1592
28. LeGresley PA (2005) Application of proper orthogonal decomposition to design decomposition
methods. PhD thesis, Department of Aeronautics and Astronautics, Stanford University
29. Lucia DJ, Beran PS (2003) Projection methods for reduced order models of compressible flows.
J Comput Phys 188(1):252–280
30. Lucia DJ, King PI, Beran PS (2003) Reduced order modeling of a two-dimensional flow with
moving shocks. Comput Fluids 32(7):917–938
31. Nguyen NC, Peraire J (2008) An efficient reduced-order modeling approach for non-linear
parametrized partial differential equations. Int J Numer Meth Eng 76(1):27–55
32. Noack BR, Morzynski M, Tadmor G (2011) Reduced-order modelling for flow control.
Springer, Berlin
33. Rabczuk T, Bordas SPA, Kerfriden P, Goury O (2012) A partitioned model order reduction
approach to rationalise computational expenses in multiscale fracture mechanics
34. Rozza G, Lassila T, Manzoni A (2011) Reduced basis approximation for shape optimization
in thermal flows with a parametrized polynomial geometric map. In: Hesthaven JS, Rønquist
EM (eds) Spectral and high order methods for partial differential equations, vol 76. Springer,
Berlin, pp 307–315
35. Ryckelynck D (2005) A priori hyperreduction method: an adaptive approach. J Comput Phys
202(1):346–366
36. Ryckelynck D (2009) Hyper-reduction of mechanical models involving internal variables. Int
J Numer Meth Eng 77(1):75–89
37. Verhoeven A, Voss T, Astrid P, ter Maten EJW, Bechtold T (2007) Model order reduction for
nonlinear problems in circuit simulation. PAMM 7(1):1021603–1021604
38. Verhoeven A, Maten J, Striebel M, Mattheij R (2009) Model order reduction for nonlinear IC
models. In: Korytowski A, Malanowski K, Mitkowski W, Szymkat M (eds) System modeling
and optimization, vol 312, IFIP advances in information and communication technology,
Springer, Berlin, pp 476–491
39. Veroy K, Patera AT (2005) Certified real-time solution of the parametrized steady incompressible
Navier-Stokes equations: rigorous reduced-basis a posteriori error bounds. Int J Numer
Meth Fluids 47(8–9):773–788
40. Wang Z, Akhtar I, Borggaard J, Iliescu T (2011) Proper orthogonal decomposition closure
models for turbulent flows: a numerical comparison. arXiv:1106.3585
41. Wicke M, Stanton M, Treuille A. Modular bases for fluid dynamics. ACM Trans Graph
28(3):39:1–39:8
A Survey of Hierarchical Model (Hi-Mod)
Reduction Methods for Elliptic Problems
Simona Perotto
Abstract In this work, we review the basic aspects of the so called Hierarchical
Model (Hi-Mod) reduction approach, recently advocated to reduce the complexity of
models for advection-diffusion-reaction phenomena in pipe-like domains featuring
a prevalent axial dynamics. The Hi-Mod approach aims at reducing the computa-
tional costs still preserving a reliable approximation of the transverse components
of the solution by properly combining finite elements and modal approximations. In
particular, we consider the convergence of this approximation to the solution of the
full problem and the different ways for selecting the number of transverse modes.
Mathematical and numerical models are nowadays a fundamental tool for quantitative
analysis in many fields of science and engineering. On the one hand, sophisticated
models can be reliably used for complex dynamics (fluid-structure interaction, bio-
chemical reactions, etc.) not only for computing quantities of interest, but also for
solving optimization, identification or, more in general, inverse problems. On the
other hand, practical use of these tools demands a significant reduction of computa-
tional costs. This may be extremely challenging in particular for inverse problems.
For this reason, one important recent research line is devoted to the set up of sur-
rogate models and solutions for a particular problem, towards the construction of
the best trade-off between reliability and computational efficiency [19]. This can be
achieved with a reduction of the size of the (finite dimensional) solution, based on the
on-line/off-line paradigm like in the Proper Orthogonal Decomposition approach or
in the Reduced Basis method. A different, somehow complementary, approach is
S. Perotto (B)
MOX, Dipartimento di Matematica F. Brioschi, Politecnico di Milano,
Piazza Leonardo da Vinci 32, I-20133 Milano, Italy
e-mail: [email protected]
Problems relevant to the Hi-Mod formulation feature a domain where one direction
is prevalent. Thus, we assume that $\Omega \subset \mathbb{R}^d$ coincides with a d-dimensional fiber
bundle, with d = 2, 3, so that $\Omega = \bigcup_{x \in \Omega_{1D}} \{x\} \times \gamma_x$, where $\Omega_{1D}$ is the supporting
1D domain described by only one independent variable x, while $\gamma_x \subset \mathbb{R}^{d-1}$
denotes the transverse fiber which, in general, is a function of x. Thus, we align $\Omega_{1D}$
with the dominant dynamics exhibited by the problem at hand and the fibers $\gamma_x$ with
the secondary transverse dynamics. For the sake of simplicity, we choose $\Omega_{1D}$ as
the interval $]x_0, x_1[$. The more general case of a curved supporting fiber can be
considered as well (see Remark 2). Now, we partition the boundary $\partial\Omega$ of $\Omega$ into three
disjoint sets, $\Gamma_0 = \{x_0\} \times \gamma_{x_0}$, $\Gamma_1 = \{x_1\} \times \gamma_{x_1}$ and $\Gamma_* = \bigcup_{x \in \Omega_{1D}} \partial\gamma_x$, such that
$\partial\Omega = \Gamma_0 \cup \Gamma_1 \cup \Gamma_*$. We assume that either homogeneous Dirichlet or homogeneous
Neumann boundary conditions can be enforced on $\Gamma_0$, $\Gamma_1$ and $\Gamma_*$, and that
non-homogeneous Dirichlet data can be assigned on $\Gamma_0$ and $\Gamma_1$.
A Survey of Hierarchical Model (Hi-Mod) Reduction Methods 219
Fig. 2 Maps involved in the Hi-Mod procedure applied to a curved three-dimensional domain
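as shown in Fig. 2. Now, the map $\Psi$ becomes more complicated since also the
deformation of the centerline has to be taken into account. Thus, we have
$\hat{\mathbf{z}} = \Psi(\mathbf{z}) = (\Psi_1(\mathbf{z}), \Psi_2(\mathbf{z}))$, with $\hat{x} = \Psi_1(\mathbf{z})$ and $\hat{\mathbf{y}} = \Psi_2(\mathbf{z})$. The definition of the
Jacobian (1) accordingly changes in
$$\mathcal{J}(\mathbf{z}) = \frac{\partial \Psi}{\partial \mathbf{z}} = \begin{bmatrix} \dfrac{\partial \Psi_1}{\partial x} & \nabla_{\mathbf{y}} \Psi_1 \\ \dfrac{\partial \Psi_2}{\partial x} & \nabla_{\mathbf{y}} \Psi_2 \end{bmatrix}.$$
Moreover, the inverse map between $\hat{\Omega}$ and $\Omega$ does not coincide now with $\Psi^{-1}$ and
is defined apart as $\Phi : \hat{\Omega} \to \Omega$ so that $\mathbf{z} = \Phi(\hat{\mathbf{z}}) = (\Phi_1(\hat{\mathbf{z}}), \Phi_2(\hat{\mathbf{z}}))$, with $x = \Phi_1(\hat{\mathbf{z}})$
and $\mathbf{y} = \Phi_2(\hat{\mathbf{z}})$ (see Fig. 2). Finally, we assume that both $\Psi$ and $\Phi$ are differentiable
with respect to $\mathbf{z}$.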
Three different approaches have been proposed so far to perform a Hi-Mod reduction.
The common idea is to exploit the fiber structure introduced on $\Omega$ by tackling in a
different way the dependence of the solution on the dominant and on the transverse
directions. In the sections below we present separately the three Hi-Mod techniques
by following the chronological order of their proposal in the literature.
For this purpose, we first introduce the full problem, i.e., the problem we aim at
reducing. We consider a generic second-order elliptic problem in the weak form,
given by
$$\text{find } u \in V : \quad a(u, v) = \mathcal{F}(v) \quad \forall v \in V, \tag{2}$$
This model reduction technique is first introduced in [11] and then more rigorously
investigated in [22]. The dominant and the transverse dynamics of the problem are
described via two different functional representations, namely
1. a space $V_{1D} \subseteq H^1(\Omega_{1D})$ spanned by functions defined on $\Omega_{1D}$ and compatible
with the boundary conditions enforced along $\Gamma_0$ and $\Gamma_1$;
2. a modal basis $\{\varphi_j\}_{j \in \mathbb{N}^+} \subset H^1(\hat{\gamma}_{d-1})$ of functions orthonormal with respect to
the $L^2$-scalar product on $\hat{\gamma}_{d-1}$, properly including the boundary data assigned on
$\Gamma_*$.
By properly combining the space $V_{1D}$ with the modal basis, we define the uniform
hierarchically reduced space
$$V_m = \Big\{ v_m(x, \mathbf{y}) = \sum_{j=1}^{m} \tilde{v}_j(x)\, \varphi_j(\psi_x(\mathbf{y})), \ \text{with } \tilde{v}_j \in V_{1D},\ x \in \Omega_{1D},\ \mathbf{y} \in \gamma_x \Big\}, \tag{3}$$
where $m \in \mathbb{N}^+$ is a given integer, fixed a priori. Space $V_m$ identifies a hierarchy of
models: the number m of included modes determines the accuracy of the reduced
model that, in principle, can be tuned arbitrarily close to the full one. We call this
approach uniform since we use the same value for m over the entire domain. Notice
that, due to the orthonormality of the modal functions, the frequency coefficients in
(3) are given by
$$\tilde{v}_j(x) = \int_{\hat{\gamma}_{d-1}} v_m\big(x, \psi_x^{-1}(\hat{\mathbf{y}})\big)\, \varphi_j(\hat{\mathbf{y}})\, d\hat{\mathbf{y}}, \quad \text{with } j = 1, \dots, m.$$
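The role of the frequency coefficients can be illustrated on a single fiber. The sketch below assumes the reference fiber is the interval (0, 1), takes the map between the physical and the reference fiber as the identity, and uses the sine functions $\sqrt{2}\sin(j\pi y)$ as modal basis. The function names, the trapezoidal quadrature and the sample profile $y(1-y)$ are illustrative choices, not part of the original formulation.

```python
import numpy as np

def sine_mode(j, y):
    """L2(0,1)-orthonormal sine mode vanishing at y = 0 and y = 1."""
    return np.sqrt(2.0) * np.sin(j * np.pi * y)

def modal_coefficients(v_fiber, y, m):
    """First m coefficients of v_fiber in the sine basis (trapezoidal rule)."""
    h = y[1] - y[0]
    weights = np.full(y.size, h)
    weights[0] = weights[-1] = 0.5 * h
    return np.array([np.sum(weights * v_fiber * sine_mode(j, y))
                     for j in range(1, m + 1)])

def modal_reconstruction(coeffs, y):
    """Truncated modal expansion built from the given coefficients."""
    return sum(c * sine_mode(j, y) for j, c in enumerate(coeffs, start=1))

y = np.linspace(0.0, 1.0, 2001)
v = y * (1.0 - y)                       # sample transverse profile on the fiber
coeffs = modal_coefficients(v, y, m=9)

# Root-mean-square reconstruction error for an increasing number of modes.
errors = [np.sqrt(np.mean((v - modal_reconstruction(coeffs[:m], y)) ** 2))
          for m in (1, 3, 9)]
```

The error decreases as modes are added, which is the hierarchy of models identified by the number of modes m.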
We can now state the uniform Hi-Mod formulation: given a modal index $m \in \mathbb{N}^+$,
Remark 3 (Choice of the modal basis) Different choices are possible for the modal
basis $\{\varphi_j\}_{j \in \mathbb{N}^+}$. Of course, this choice strictly depends on the boundary conditions
assigned along $\Gamma_*$. So far we have essentially used trigonometric functions, i.e.,
we have essentially considered homogeneous Dirichlet boundary conditions on the
horizontal sides of $\Omega$. In [22] we have also checked the performances associated
222 S. Perotto
with Legendre polynomials with similar results. More recently, in order to deal with
more general boundary conditions, we have introduced a new type of modal basis
called educated basis (see [3]). The idea is to solve an auxiliary Sturm-Liouville
problem on the transverse reference fiber $\hat{\gamma}_{d-1}$ in order to build a modal basis which
automatically includes the boundary conditions assigned along $\Gamma_*$. This approach
has been successfully validated both in 2D and 3D. We finally remark that the choice
of the modal basis, together with the regularity of the full solution u, influences also
the rate of the modal convergence.
To make the uniform Hi-Mod approach useful in practice, we consider the discrete
counterpart of formulation (4). Following [11, 22], we discretize the main dynamics
via standard 1D finite elements while preserving the modal expansion to describe
the transverse features. For this purpose, we consider a subdivision $\mathcal{T}_h$ of $\Omega_{1D}$ into
subintervals $K_i = (x_{i-1}, x_i)$ of width $h_i = x_i - x_{i-1}$, with $h = \max_i h_i$. Then,
we consider a conforming finite element space $V_{1D}^h \subset V_{1D}$ associated with $\mathcal{T}_h$ such
that $\dim(V_{1D}^h) = N_h < +\infty$, and we introduce a standard density hypothesis on
the space $V_{1D}^h$. The discrete uniform Hi-Mod reduction can thus be stated as: given
a modal index $m \in \mathbb{N}^+$,
$$\text{find } u_m^h \in V_m^h : \quad a(u_m^h, v_m^h) = F(v_m^h) \quad \forall v_m^h \in V_m^h, \qquad (5)$$
where
$$V_m^h = \Big\{ v_m^h(x, y) = \sum_{j=1}^{m} \tilde{v}_j^h(x)\, \varphi_j(\psi_x(y)), \ \text{with } \tilde{v}_j^h \in V_{1D}^h,\ x \in \Omega_{1D},\ y \in \gamma_x \Big\} \subset V_m, \qquad (6)$$
the last inclusion being guaranteed by the conformity assumption on $V_{1D}^h$.
In [22] it is proved that the uniform global error $e_m^h = u - u_m^h \in V$ (which includes
both the model $(u - u_m)$ and the discretization $(u_m - u_m^h)$ error contributions) vanishes
for $m \to \infty$ and $h \to 0$. Some results are available in the literature concerning the
rate of convergence of $e_m^h$ (we refer, e.g., to [8, 9, 15]). Moreover, a numerical
convergence study of the discrete Hi-Mod formulation (5) is available in Sect. 3.4.1
of [22]. For sufficiently smooth functions, we recover the convergence rate expected
from Theorems 2.1 and 3.2 in [8] for $e_m^h$, namely quadratic for the $L^2$ norm and
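For the sake of simplicity, we focus on the 2D case. The full space $V$ coincides with
$H_0^1(\Omega)$, while $V_{1D} = H_0^1(\Omega_{1D})$. Moreover, the modal functions $\varphi_j$ vanish on $\partial\hat{\gamma}_1$.
The bilinear and linear forms in (2) are given by $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx\,dy$ and
$F(v) = \int_\Omega f v \, dx\,dy$, respectively, with $f \in L^2(\Omega)$. Now, we first consider the
modal representation $u_m^h(x, y) = \sum_{j=1}^{m} \tilde{u}_j^h(x)\, \varphi_j(\psi_x(y))$ for the Hi-Mod approximation
$u_m$; then, we expand each modal coefficient as $\tilde{u}_j^h(x) = \sum_{i=1}^{N_h} \tilde{u}_{j,i}\, \theta_i(x)$ with respect to
the finite element basis $\{\theta_i\}_{i=1}^{N_h}$ of $V_{1D}^h$, so that
$$u_m^h(x, y) = \sum_{j=1}^{m} \sum_{i=1}^{N_h} \tilde{u}_{j,i}\, \theta_i(x)\, \varphi_j(\psi_x(y)). \qquad (7)$$
Choosing $v_m^h = \theta_l\, \varphi_k$ in (5), with $k = 1, \ldots, m$ and $l = 1, \ldots, N_h$, we obtain the system
$$\sum_{j=1}^{m} \sum_{i=1}^{N_h} \tilde{u}_{j,i} \int_{\Omega_{1D}} \Big[ \hat{r}_{kj}^{1,1}(x) \frac{d\theta_i}{dx}(x) \frac{d\theta_l}{dx}(x) + \hat{r}_{kj}^{1,0}(x) \frac{d\theta_i}{dx}(x)\, \theta_l(x) + \hat{r}_{kj}^{0,1}(x)\, \theta_i(x) \frac{d\theta_l}{dx}(x) + \hat{r}_{kj}^{0,0}(x)\, \theta_i(x)\, \theta_l(x) \Big] dx = \int_{\Omega_{1D}} \hat{f}_k(x)\, \theta_l(x)\, dx, \qquad (8)$$
where $\hat{f}_k(x) = \int_{\hat{\gamma}_1} f\big(x, \psi_x^{-1}(\hat{y})\big)\, \varphi_k(\hat{y})\, D_2^{-1}\big(x, \psi_x^{-1}(\hat{y})\big)\, d\hat{y}$, while we have that
$\hat{r}_{kj}^{s,t}(x) = \int_{\hat{\gamma}_1} r_{kj}^{s,t}(x, \hat{y})\, D_2^{-1}\big(x, \psi_x^{-1}(\hat{y})\big)\, d\hat{y}$, for $s, t = 0, 1$, with
$$\begin{aligned}
r_{kj}^{1,1}(x, \hat{y}) &= \varphi_j(\hat{y})\, \varphi_k(\hat{y}), \qquad r_{kj}^{1,0}(x, \hat{y}) = \varphi_j(\hat{y})\, \varphi_k'(\hat{y})\, D_1\big(x, \psi_x^{-1}(\hat{y})\big), \\
r_{kj}^{0,1}(x, \hat{y}) &= \varphi_j'(\hat{y})\, \varphi_k(\hat{y})\, D_1\big(x, \psi_x^{-1}(\hat{y})\big), \\
r_{kj}^{0,0}(x, \hat{y}) &= \varphi_j'(\hat{y})\, \varphi_k'(\hat{y}) \Big[ D_1\big(x, \psi_x^{-1}(\hat{y})\big)^2 + D_2\big(x, \psi_x^{-1}(\hat{y})\big)^2 \Big].
\end{aligned} \qquad (9)$$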
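To make the structure of the reduced system concrete, the sketch below assembles and solves it in the simplest possible setting. It assumes the unit square with the identity map onto the reference fibre (so that the map derivative in x vanishes and the one in y equals one), the sine modal basis sqrt(2) sin(k pi y), and linear finite elements on a uniform 1D mesh. Under these assumptions the four coefficient families reduce to a Kronecker delta and to (k pi)^2 times a Kronecker delta, and the system decouples mode by mode. The function name `himod_poisson_rectangle` and the quadrature are hypothetical choices made only for this illustration.

```python
import numpy as np

def himod_poisson_rectangle(f, m, n_el, n_quad=200):
    """Uniform Hi-Mod sketch for -Laplace(u) = f on (0,1)^2 with u = 0 on the boundary.

    Assumptions: identity fibre map, sine modal basis, P1 elements along x.
    Returns the mesh nodes and an (m, n_el + 1) array of nodal modal coefficients.
    """
    h = 1.0 / n_el
    x = np.linspace(0.0, 1.0, n_el + 1)
    n_in = n_el - 1                                   # interior nodes only
    off = np.ones(n_in - 1)
    # 1D P1 stiffness and mass matrices restricted to the interior nodes
    K = (2.0 * np.eye(n_in) - np.diag(off, 1) - np.diag(off, -1)) / h
    M = (4.0 * np.eye(n_in) + np.diag(off, 1) + np.diag(off, -1)) * h / 6.0
    yq = (np.arange(n_quad) + 0.5) / n_quad           # midpoint rule on the fibre
    coeffs = np.zeros((m, n_el + 1))
    for k in range(1, m + 1):
        phi_k = np.sqrt(2.0) * np.sin(k * np.pi * yq)
        # modal load f_k(x_i) = int_0^1 f(x_i, y) phi_k(y) dy at the interior nodes
        f_k = np.array([np.mean(f(xi, yq) * phi_k) for xi in x[1:-1]])
        # mode k solves -u_k'' + (k pi)^2 u_k = f_k; the load is interpolated
        # nodally (boundary values of f_k are neglected, exact when they vanish)
        A = K + (k * np.pi) ** 2 * M
        coeffs[k - 1, 1:-1] = np.linalg.solve(A, M @ f_k)
    return x, coeffs

# Manufactured solution u = sin(pi x) sin(pi y): only the first mode is active.
f = lambda x, y: 2.0 * np.pi ** 2 * np.sin(np.pi * x) * np.sin(np.pi * y)
x, c = himod_poisson_rectangle(f, m=3, n_el=50)
```

On a curved domain the coefficient matrices are full in the modal indices and the block structure sketched in Fig. 3 no longer decouples; the diagonal case above is only the easiest instance.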
Remark 4 System (8) shows that a full purely diffusive problem yields low-order
contributions in the reduced framework. However, the first-order terms yielded by the
reduction procedure are always weighted by the diffusive coefficient. Consequently,
Fig. 3 Sketch of the linear system associated with a uniform Hi-Mod reduction, for m = 4 and
$N_h = 10$, with $[f_k]_l = F(\theta_l\, \varphi_k)$ and $\tilde{u}_k^h$ the modal coefficients of $u_m^h$, for k = 1, ..., 4 and
l = 1, ..., 10
with 31,264 elements. Due to the strong advective field and the assigned boundary
conditions, the solution is basically flat for x > 4.5, while it exhibits large variations
in the leftmost part of the domain, with boundary layers along $\{(x, 0) : 0 \le x \le 3.5\}$
and $\{(x, 2) : 0 \le x \le 4\}$ and, more significantly, along $\{(0, y) : 0 \le y \le 2\}$.
In Fig. 4, bottom we display the uniform Hi-Mod solution associated
with m = 9 modal functions and with a 1D mesh $\mathcal{T}_h$ of uniform size h = 0.01.
The agreement between the full and the reduced solution is good. Notice that
we do not resort to any stabilization scheme: the choice made for h guarantees that
the local Péclet number corresponding to the advective field b is strictly less than
one. However, the actual advective term in the Hi-Mod reduced formulation also
depends on $D_1(z)$ (see [11]). This last contribution could make the chosen h locally
insufficient to ensure the stability of the discretization scheme. This could explain
the negative values of the reduced solution in the darker blue areas near the two
sources (see Fig. 4, bottom), in contrast to the minimum value zero assumed by the
full solution.
Figure 4, bottom clearly shows the main limit of a uniform Hi-Mod reduction. To
accurately approximate a full solution with local strong transverse components, we
need to employ a large number of modes over the whole domain, i.e., also where the
transverse dynamics are not relevant. This implies a waste in terms of computational
cost. The piecewise Hi-Mod formulation aims at improving the computational efficiency
of a Hi-Mod reduction by employing a different number of modes in different
parts of $\Omega$. Large values are associated with the zones where the transverse dynamics
are important, while small values are selected where the 1D behavior is dominant.
As a consequence, the modal index m becomes a vector, called modal multi-index,
which collects the number of modes used in the different portions of $\Omega$. This justifies
the name of the approach.
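In order to formulate the piecewise counterpart of (4), we need to introduce a
number of definitions. To simplify the discussion, we focus on the 2D case and we
assume to identify, via some criterion, three areas $\Omega_1$, $\Omega_2$ and $\Omega_3$ in $\Omega$ where a
different number $m_i$ of modal functions, for i = 1, 2, 3, is employed. In particular,
we denote by $\Sigma_1$ and $\Sigma_2$ the interfaces between $\Omega_1$, $\Omega_2$ and $\Omega_2$, $\Omega_3$, respectively,
and by $\Omega_{1D,i} = \Omega_{1D} \cap \Omega_i = (\alpha_{i-1}, \alpha_i)$ the subinterval of $\Omega_{1D}$ associated with the
subdomain $\Omega_i$, so that $\Omega_i = \bigcup_{x \in \Omega_{1D,i}} \{x\} \times \gamma_x$, $\bigcup_{i=1}^{3} \overline{\Omega}_{1D,i} = \overline{\Omega}_{1D}$, $\Omega_{1D,i} \cap \Omega_{1D,i'} = \emptyset$
for $i \ne i'$ and $i, i' = 1, 2, 3$, and where $\alpha_0 \equiv x_0$, $\alpha_3 \equiv x_1$ (see Fig. 5). Finally, we
introduce the two-dimensional broken Sobolev space $H^1(\Omega, \mathcal{T}_\Omega)$ associated with
the partition $\mathcal{T}_\Omega = \{\Omega_i\}_{i=1}^{3}$ of $\Omega$, properly modified according to the boundary
conditions assigned on $\partial\Omega$ [17]. The inclusion $V \subset H^1(\Omega, \mathcal{T}_\Omega)$ holds. We can
define now the piecewise hierarchically reduced space
$$\begin{aligned}
V_m(\mathcal{T}_\Omega) = \Big\{ & v_m \in L^2(\Omega) : \ v_m|_{\Omega_i}(x, y) = \sum_{j=1}^{m_i} \tilde{v}_j^i(x)\, \varphi_j(\psi_x(y)), \ i = 1, 2, 3, \\
& \tilde{v}_j^i \in H^1(\Omega_{1D,i}) : \ \forall j = 1, \ldots, m_*^p \ \text{with } p = 1, 2, \\
& \int_{\hat{\gamma}_1} \Big[ v_m|_{\Omega_{p+1}}\big(\alpha_p, \psi_{\alpha_p}^{-1}(\hat{y})\big) - v_m|_{\Omega_p}\big(\alpha_p, \psi_{\alpha_p}^{-1}(\hat{y})\big) \Big] \varphi_j(\hat{y})\, d\hat{y} = 0 \Big\},
\end{aligned} \qquad (10)$$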
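$$V_m^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}) = \Big\{ v_m^h \in V_m(\mathcal{T}_\Omega) : \ v_m^h|_{\Omega_i}(x, y) = \sum_{j=1}^{m_i} \tilde{v}_j^{i,h}(x)\, \varphi_j(\psi_x(y)), \ i = 1, \ldots, s, \ \tilde{v}_j^{i,h} \in V_{1D}^{i,h} \Big\} \subset V_m(\mathcal{T}_\Omega), \qquad (12)$$
where $V_{1D}^{i,h} \subset H^1(\Omega_{1D,i})$ is a finite element space associated with $\mathcal{T}_h^i$, such that
$\dim(V_{1D}^{i,h}) = N_h^i < +\infty$. A standard density assumption is postulated on the spaces
$V_{1D}^{i,h}$. The discrete piecewise Hi-Mod reduction reads: given a modal multi-index
$m \in [\mathbb{N}^+]^s$,
$$\text{find } u_m^h \in V_m^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}) : \quad a_{\mathcal{T}_\Omega}(u_m^h, v_m^h) = F_{\mathcal{T}_\Omega}(v_m^h) \quad \forall v_m^h \in V_m^h(\mathcal{T}_\Omega, \{\mathcal{T}_h^i\}). \qquad (13)$$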
aT (m
h
, vm
h
) = 0 vm
h
Vmh (T , {Thi }), (14)
(14), can be found in [22], under the assumption that the Dirichlet-Neumann scheme
converges. In particular, as previously pointed out, if m is a strictly decreasing
multi-index, i.e., $m_i > m_{i'}$ for any $i, i' = 1, \ldots, s$ with $i' > i$, a conforming approximation
$u_m^h$ is guaranteed and the error can be bounded via the standard Céa lemma. On the
other hand, for an increasing multi-index m, we get an error bound consisting of the
usual best approximation term plus an additional correction due to the high-frequency
components near the interfaces, which are responsible for the nonconformity (see
[22, Sect. 4.2.2] for a detailed discussion).
We apply the piecewise Hi-Mod reduction to the same test case as in Sect. 3.1.2.
By exploiting the intrinsic heterogeneity of the full solution u, we divide $\Omega$ into
three subdomains $\Omega_1 = (0, 2.5) \times (0, 2)$, $\Omega_2 = (2.5, 4.5) \times (0, 2)$ and $\Omega_3 =
(4.5, 7) \times (0, 2)$ and we resort to $m_1 = 2$, $m_2 = 7$ and $m_3 = 1$ modal functions,
respectively. This choice is suggested by the behaviour of the full solution itself,
since the most complex dynamics take place in the central part of $\Omega$, while the
solution u is essentially flat in the rightmost part of the domain. At both the interfaces
$\Sigma_1 = \{2.5\} \times (0, 2)$ and $\Sigma_2 = \{4.5\} \times (0, 2)$ we apply a Dirichlet/Neumann scheme
with relaxation (i.e., Dirichlet conditions are prescribed when solving the leftmost
subdomain, Neumann conditions for the rightmost one), by setting the relaxation
parameter to 0.5. This choice guarantees the well-posedness of each problem in $\Omega_i$
since the field b is backward. Notice that, in the presence of a forward advective
Fig. 6 Piecewise Hi-Mod solution $u^h_{2,7,1}$ at the second (top), fourth (middle) and seventh (bottom)
iteration of the domain decomposition approach for h = 0.01
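The relaxed Dirichlet/Neumann iteration used at the interfaces can be illustrated on a deliberately simple toy problem. The sketch below is not the Hi-Mod solver: it assumes the 1D problem -u'' = 1 on (0, 1) with homogeneous Dirichlet data, split at a hypothetical interface `gamma`, and uses closed-form subdomain solves. A Dirichlet value is imposed when solving the left subdomain, the resulting flux is passed as Neumann data to the right subdomain, and the interface value is updated with relaxation parameter `theta`.

```python
def dirichlet_neumann_1d(gamma=0.4, theta=0.5, tol=1e-10, max_iter=100):
    """Relaxed Dirichlet/Neumann iteration for -u'' = 1 on (0,1), u(0) = u(1) = 0.

    Toy illustration with analytic subdomain solves (assumption of this sketch).
    Returns the converged interface value and the number of iterations.
    """
    lam = 0.0                                  # initial guess for u(gamma)
    for k in range(1, max_iter + 1):
        # Left (Dirichlet) solve: u_L = -x^2/2 + a x with u_L(gamma) = lam
        a = lam / gamma + gamma / 2.0
        flux = a - gamma                       # u_L'(gamma)
        # Right (Neumann) solve: u_R = -x^2/2 + b x + c, u_R'(gamma) = flux, u_R(1) = 0
        b = flux + gamma
        c = 0.5 - b
        trace = -gamma ** 2 / 2.0 + b * gamma + c
        # Relaxation step on the interface value
        lam_new = theta * trace + (1.0 - theta) * lam
        if abs(lam_new - lam) < tol:
            return lam_new, k
        lam = lam_new
    return lam, max_iter

lam, iters = dirichlet_neumann_1d()
```

For this toy problem the exact interface value is gamma (1 - gamma) / 2, and the error contracts by a factor |1 - theta / gamma| per iteration, which shows why the relaxation parameter has to be chosen with care.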
As shown in the previous section, the piecewise Hi-Mod reduction may lead to a
computational improvement with respect to the uniform approach when dealing with
phenomena with significant transverse dynamics localized in a certain portion of the
domain. In this perspective, the best computational advantage is attained when one
can calibrate the subdomain with the largest modal index exactly to fit the region with
significant transverse dynamics. This is in some sense the spirit behind a pointwise
Hi-Mod reduction [24]. In this case, the modal functions are pointwise-tuned, which
justifies the name assigned to this method. Now, the modes are associated with the
nodes of the finite element partition, in contrast to the piecewise approach where
each subdomain $\Omega_i$ shares the same number of modal functions. Due to the association
mode-node, the pointwise Hi-Mod reduction makes sense only in a discrete context.
To settle the pointwise formulation, we move from the discrete modal expansion
(7) that we properly rewrite as
$$u_m^h(x, y) = \sum_{i=1}^{N_h} \Big[ \sum_{j=1}^{m} \tilde{u}_{j,i}\, \varphi_j(\psi_x(y)) \Big] \theta_i(x). \qquad (15)$$
We remark that, in this expansion, the leading role is taken by the sum on the finite
element nodes, while in (7) it is taken by the one on the modes. Inspired by representation (15),
we introduce a new definition for the discrete Hi-Mod space, where we allow the
number m of the modal basis functions to vary on each finite element node $x_i$. Thus,
the discrete pointwise hierarchically reduced space is
$$V_M^h = \Big\{ v_M^h(x, y) = \sum_{i=1}^{N_h} \sum_{j=1}^{m_i^N} \tilde{v}_{j,i}^h\, \varphi_j(\psi_x(y))\, \theta_i(x), \ \text{with } x \in \Omega_{1D},\ y \in \gamma_x \Big\}, \qquad (16)$$
where $M = \{m_i^N\}_{i=1}^{N_h} \in [\mathbb{N}^+]^{N_h}$ is the modal multi-index collecting the number of
modes for each finite element node.
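A function of the pointwise reduced space can be evaluated directly from its nodewise coefficients, as in the sketch below. It assumes the reference fibre (0, 1) with identity map, the sine modal basis and linear hat functions; `hat` and `evaluate_pointwise` are hypothetical helper names, and the coefficient values are made up for the example.

```python
import numpy as np

def hat(i, nodes, x):
    """Linear hat function attached to node i of a 1D mesh, evaluated at x."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return 0.0

def evaluate_pointwise(coeffs, nodes, x, y):
    """Evaluate sum_i sum_{j <= m_i} v_{j,i} phi_j(y) theta_i(x).

    coeffs[i] holds the modal coefficients at node i; its length is the
    number of modes switched on at that node.
    """
    value = 0.0
    for i, c_i in enumerate(coeffs):
        theta = hat(i, nodes, x)
        if theta == 0.0:
            continue
        modes = np.arange(1, len(c_i) + 1)
        value += theta * np.sum(c_i * np.sqrt(2.0) * np.sin(modes * np.pi * y))
    return value

nodes = np.linspace(0.0, 1.0, 5)
# Modal multi-index M = {1, 2, 3, 2, 1}: more modes near the centre of the fibre.
coeffs = [np.array([1.0]), np.array([1.0, 0.5]), np.array([1.0, 0.5, 0.25]),
          np.array([1.0, 0.5]), np.array([1.0])]
at_node = evaluate_pointwise(coeffs, nodes, 0.25, 0.25)
between = evaluate_pointwise(coeffs, nodes, 0.125, 0.25)
```

Since the hat functions are continuous and each node carries its own modal set, the number of active modes changes gradually between neighbouring nodes rather than jumping across an interface.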
The pointwise Hi-Mod formulation is given by: for a certain modal multi-index
$M \in [\mathbb{N}^+]^{N_h}$,
$$\text{find } u_M^h \in V_M^h : \quad a(u_M^h, v_M^h) = F(v_M^h) \quad \forall v_M^h \in V_M^h, \qquad (17)$$
where $a(\cdot, \cdot)$ and $F(\cdot)$ coincide with the bilinear and linear forms in (2). From
definition (16), it follows immediately that the nodewise Hi-Mod solution $u_M^h$ is
We assess the pointwise Hi-Mod procedure on the same test case used to validate
both the uniform and the piecewise approaches. For this purpose, we introduce a
finite element partition $\mathcal{T}_h$ of the supporting fiber $\Omega_{1D} = (0, 7)$ consisting of 200
equispaced nodes. Then, starting from the results in Figs. 4 and 6, we adopt the
corresponding modal distribution shown in Fig. 7, right. In particular, we employ
seven modes in the area around the two sources, except for a single node, close to
the center of $D_1$, where nine modal functions are used, and for a few nodes in the zone
between the two sources, where only five modes are switched on.
Figure 7, left shows the contour plots of the pointwise Hi-Mod solution $u_M^h$.
It is fully comparable with the full solution in Fig. 4, top, despite the presence of
some negative areas, as already detected in the uniform and in the piecewise reduced
solutions. As expected, solution $u_M^h$ is an $H^1$-conforming approximation, i.e., no
This approach is the most straightforward and the first strategy that we have pursued
to select the number of modes to be used in [11, 22].
We essentially distinguish the case when the user has no information about the
dynamics of the problem to be reduced from the case when some hints on this
problem are available. If no information is available, we can resort to a trial and
error approach: we move from the computationally cheapest choice for m, i.e.,
m = 1. Then, we gradually increase such a value and we stop when the addition
of the successive modal function does not significantly improve the accuracy of the
reduced solution. This is the procedure that we have followed to get the uniform
Hi-Mod solution $u_9^h$ in Fig. 4, bottom. Figure 8 shows the contour plots of some
of the reduced solutions yielded by this trial and error approach. It is evident that
the accuracy of $u_m^h$ increases as m gets larger. The presence of the two localized
sources demands a rather large number of modes overall. While the behaviour of u is
correctly described by a single mode on the rightmost part of $\Omega$, at least five modes
are necessary to recognize the sources at $D_1$ and $D_2$. Solution $u_7^h$ is very close to $u_9^h$
in Fig. 4, bottom. The only difference is a slight reduction of the negative areas for
the choice m = 9.
Fig. 9 Piecewise Hi-Mod solution $u^h_{4,7,1}$ at the last iteration of the domain decomposition approach
for h = 0.01
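The trial and error selection of m described above amounts to a simple stopping rule, which can be sketched as follows. The routine `select_modes` and the relative-change criterion are assumptions made for illustration (the text only requires that an additional mode "does not significantly improve" the solution), and `toy_solve` is a stand-in for a Hi-Mod solve: it returns a truncated sine series with hypothetical coefficients decaying like j^(-3).

```python
import numpy as np

def select_modes(solve, tol=1e-2, m_max=50):
    """Increase m from 1 until adding one more mode changes the solution
    by less than tol in relative discrete L2 norm; return the selected m."""
    u_prev = solve(1)
    for m in range(1, m_max):
        u_next = solve(m + 1)
        change = np.linalg.norm(u_next - u_prev) / np.linalg.norm(u_next)
        if change < tol:
            return m
        u_prev = u_next
    return m_max

# Stand-in for the reduced solver, sampled on a transverse midpoint grid.
y = (np.arange(200) + 0.5) / 200

def toy_solve(m):
    return sum(j ** -3.0 * np.sqrt(2.0) * np.sin(j * np.pi * y)
               for j in range(1, m + 1))

m_sel = select_modes(toy_solve)
```

With these made-up coefficients the fifth mode contributes less than one percent, so the loop stops at m = 4; in an actual computation each call to the solver is a full reduced solve, which is why this strategy is only practical for moderate values of m.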
When we have some knowledge about the full solution or about some previous
run of the Hi-Mod procedure, we may fix directly the number of modes instead of
gradually varying it. This is the approach used in the piecewise Hi-Mod reduction of
Sect. 3.2.2. Indeed, Fig. 8 suggests that few modes are sufficient in $\Omega_1$ and $\Omega_3$, while
at least 7 modes are demanded in $\Omega_2$. Of course, other choices are allowed, driven,
e.g., by specific user demands. For instance, when we aim at reducing the model
discontinuity in Fig. 6, bottom, we simply increase the number of modes in $\Omega_1$. If
we employ, e.g., four instead of two modes in $\Omega_1$ while preserving the same values
for all the parameters involved in the domain decomposition scheme, we obtain, after
7 iterations, the Hi-Mod solution in Fig. 9. In this case the model discontinuity
occurring along the interface $\Sigma_1 = \{2.5\} \times (0, 2)$ is hardly appreciable.
The pointwise Hi-Mod reduction is the setting where an a priori choice of the
modal multi-index is less immediate. This is essentially due to the large variability
that can be assigned to the modal indices $m_i^N$, which now may vary nodewise. Such
variability offers great flexibility in optimizing the reduced model selection,
provided that the user has a precise idea of the trend of the full solution u. Figure 7
provides an example in such a direction.
According to a goal-oriented analysis (see, e.g., [5, 14, 16, 20]), we measure the
accuracy of the reduced model via a goal functional $J : H^1(\Omega, \mathcal{T}_\Omega) \to \mathbb{R}$ which
represents a physical quantity of interest to be measured (e.g., mean or local values,
the lift and drag coefficients around bodies in external flows, convective or diffusive
fluxes). In particular, we assume J linear. We estimate the unknown value $J(u)$ via
the computable value $J(u_m)$, with u and $u_m$ solution to (2) and (11), respectively,
with the purpose of keeping the goal error $J(u - u_m)$ below a desired tolerance.
The automatic procedure aims at identifying the subdomains $\Omega_i$ and the associated
modal multi-index m to guarantee such a goal.
To estimate the goal error, we define the piecewise Hi-Mod dual problem: given
a modal multi-index $m \in [\mathbb{N}^+]^s$,
and
We recall now the main proposition in [23] which represents the reference result in
the proposal of the a posteriori modeling error estimator:
Proposition 1 Let us assume that there exists a positive constant $\sigma_m < 1$ and a
modal multi-index $M_0 \in [\mathbb{N}^+]^s$, such that, for $m, m^+ \in [\mathbb{N}^+]^s$ with $m^+ > m \ge M_0$,
$$\frac{|J(u_{m^+} - u_m)|}{1 + \sigma_m} \le |J(e_m)| \le \frac{|J(u_{m^+} - u_m)|}{1 - \sigma_m}. \qquad (22)$$
$$J(u_{m^+} - u_m) = a_{\mathcal{T}_\Omega}(u_{m^+} - u_m, z_{m^+} - z_m). \qquad (23)$$
Relation (23), combined with the two-sided bound (22), leads to identify the a
posteriori modeling error estimator for the goal error $|J(e_m)|$ with the quantity
$$\eta_{mm^+} = |a_{\mathcal{T}_\Omega}(u_{m^+} - u_m, z_{m^+} - z_m)|. \qquad (24)$$
The lower and the upper bound in (22) represent the corresponding efficiency and
reliability estimate, respectively. $\eta_{mm^+}$ is a goal-oriented hierarchical estimator which
combines the easy computability typical of a hierarchical estimator with the high
versatility proper of a goal functional analysis.
From a computational viewpoint, to evaluate $\eta_{mm^+}$ we replace the piecewise
hierarchically reduced primal and dual solutions with the corresponding discrete
approximations. Thus, via (24), we simply evaluate the quantity
$(z_{m^+}^h - z_m^h)^T K (u_{m^+}^h - u_m^h)$, where K is the stiffness matrix associated with the enriched formulation (19),
which is already available and does not need any additional assembly. Alternative
procedures to evaluate estimator $\eta_{mm^+}$ are proposed in [23].
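In algebraic terms the estimator is a single matrix-vector product followed by a dot product. The sketch below assumes that all four coefficient vectors are expressed in the enriched basis (the coarser solutions being padded with zeros for the extra modes) and that the saturation constant is known; the function names and the small numerical example are hypothetical.

```python
import numpy as np

def goal_estimator(K, u_m, u_mp, z_m, z_mp):
    """Absolute value of (z_mp - z_m)^T K (u_mp - u_m).

    K: stiffness matrix of the enriched formulation; u_*, z_*: primal and dual
    coefficient vectors, all written in the enriched basis (assumption).
    """
    return abs((z_mp - z_m) @ (K @ (u_mp - u_m)))

def two_sided_bounds(eta, sigma):
    """Efficiency and reliability bounds eta/(1+sigma) and eta/(1-sigma)."""
    return eta / (1.0 + sigma), eta / (1.0 - sigma)

# Made-up 2x2 example, for illustration only.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
eta = goal_estimator(K,
                     u_m=np.array([1.0, 0.0]), u_mp=np.array([1.5, 0.5]),
                     z_m=np.array([0.3, 0.4]), z_mp=np.array([0.5, 0.3]))
lower, upper = two_sided_bounds(eta, sigma=0.25)
```

Setting the saturation constant to zero makes the two bounds coincide with the estimator itself.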
The adaptive procedure proposed in [23] consists of two phases. The first phase
identifies the number s and the location of the subdomains $\Omega_i$. In the second phase
we find the modal multi-index m to satisfy the requirement $\eta_{mm^+} \le \mathrm{TOL}$, with TOL
a user-defined tolerance.
Let us focus on the first phase. To select the subdomains $\Omega_i$, we employ a thresholding
technique (see Fig. 10, left). We compute the estimator $\eta_{\tilde{m}\tilde{m}^+}$ in a uniform
context, i.e., by employing $\tilde{m}$ and $\tilde{m}^+$ modes on the whole $\Omega$, with $\tilde{m}^+ > \tilde{m}$. We
usually choose small values for both $\tilde{m}$ and $\tilde{m}^+$ to contain the computational costs.
In particular, we compute the estimator normalized by its maximum value, $\tilde{\eta}_{\tilde{m}\tilde{m}^+}$,
so that we may assume the estimate to be in the range $[\tilde{\eta}^{\min}_{\tilde{m}\tilde{m}^+}, 1]$, with $\tilde{\eta}^{\min}_{\tilde{m}\tilde{m}^+}$ the
minimum value assumed by the normalized estimator on $\Omega$. Now, we pick a threshold
$\tau \in (0, 1)$. Then, after introducing a uniform finite element partition $\{K_l\}$ on
$\Omega_{1D}$, we assign the value $\tilde{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l}$ to the barycenter of $K_l$. We denote by $\alpha_j$, with
$j = 1, \ldots, s$, the intersections between $\tilde{\eta}_{\tilde{m}\tilde{m}^+}$ and $\tau$, and by $\hat{\alpha}_j$ the corresponding
closest finite element node. The set $\{\hat{\alpha}_j\}$ identifies the partition $\mathcal{T}_\Omega = \{\Omega_i\}_{i=1}^{s}$, with
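As a byproduct of the first phase, we also get the initial guess $m^{(0)} = \{m_i^{(0)}\}_{i=1}^{s} \in [\mathbb{N}^+]^s$
for the modal multi-index to start the second phase of the modeling adaptive
procedure. We set
$$m_i^{(0)} = \begin{cases} \tilde{m} & \text{if } \tilde{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l} < \tau, \ \forall K_l \ \text{s.t. } K_l \cap \Omega_{1D,i} \ne \emptyset, \\ \tilde{m} + \nu & \text{if } \tilde{\eta}_{\tilde{m}\tilde{m}^+}|_{K_l} \ge \tau, \ \forall K_l \ \text{s.t. } K_l \cap \Omega_{1D,i} \ne \emptyset, \end{cases} \qquad (25)$$
where $\nu \in \mathbb{N}^+$ denotes the modal update. We define the initial guess $m^{+(0)}$ for the
multi-index $m^+$ in an analogous way.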
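The first phase can be sketched in a few lines once the normalized estimator is available elementwise. The code below is an illustration under simplifying assumptions: the estimator is given as one value per element of the uniform 1D partition, subdomains are the maximal runs of consecutive elements lying on the same side of the threshold, and the snapping of the intersection points to the closest finite element node is therefore implicit. The names `threshold_partition` and `initial_guess`, as well as the sample estimator values, are hypothetical.

```python
import numpy as np

def threshold_partition(eta_norm, tau):
    """Split the 1D elements into maximal runs lying below / above the threshold.

    Returns a list of (first_element, last_element_exclusive, above_threshold).
    """
    flags = np.asarray(eta_norm) >= tau
    segments, start = [], 0
    for l in range(1, len(flags) + 1):
        if l == len(flags) or flags[l] != flags[start]:
            segments.append((start, l, bool(flags[start])))
            start = l
    return segments

def initial_guess(segments, m_tilde, nu):
    """Initial modal multi-index: m_tilde below the threshold, m_tilde + nu above."""
    return [m_tilde + nu if above else m_tilde for (_, _, above) in segments]

# Made-up normalized estimator on six elements, threshold 0.1, modal update 1.
eta_norm = [0.05, 0.05, 0.5, 1.0, 0.3, 0.02]
segments = threshold_partition(eta_norm, tau=0.1)
m0 = initial_guess(segments, m_tilde=3, nu=1)
m0_plus = initial_guess(segments, m_tilde=5, nu=1)
```

With these inputs the sketch produces three subdomains and the multi-indices {3, 4, 3} and {5, 6, 5}, i.e., the same pattern as the initial guesses reported in the numerical assessment for starting values 3 and 5.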
Remark 6 Some crucial situations may occur when selecting the threshold $\tau$. If $\tau$ is
less than $\tilde{\eta}^{\min}_{\tilde{m}\tilde{m}^+}$, it means that the initial guess for the modal truncation is too coarse.
We consequently refine both $\tilde{m}$ and $\tilde{m}^+$ and recompute the normalized estimator.
The algorithm above fails even when a root of the equation $\tilde{\eta}_{\tilde{m}\tilde{m}^+} - \tau = 0$ has
multiplicity strictly greater than one. A possibility to avoid this is to check also
the first derivative of $\tilde{\eta}_{\tilde{m}\tilde{m}^+}$ before selecting $\tau$. Finally, if $\tilde{\eta}_{\tilde{m}\tilde{m}^+}$ exhibits several
oscillations around a range of values, these values are not eligible as threshold, since
this would lead to identify too many subdomains, making the numerical procedure
completely ineffective. We refer the interested reader to Remark 4 in [23] for more
details and some examples of such critical situations.
We move now to the second phase of the modeling adaptive procedure. Starting
from the initial guesses $m^{(0)}$ and $m^{+(0)}$ defined according to (25), we apply a standard
equidistribution criterion to the subdomains $\Omega_i$, i.e., we demand that $\eta^i_{mm^+} = \mathrm{TOL}/s$,
where $\eta^i_{mm^+} = \eta_{mm^+}|_{\Omega_i}$ denotes the modeling error estimator associated with the
6. if $\eta^i_{mm^+} >$ delta1 $\mathrm{TOL}/s$
7. $m_i^{(k)} = m_i^{(k-1)} + \nu$; $m_i^{+,(k)} = m_i^{+,(k-1)} + \nu$;
8. elseif $\eta^i_{mm^+} \le$ delta2 $\mathrm{TOL}/s$
9. $m_i^{(k)} = \max(1, m_i^{(k-1)} - \nu)$, $m_i^{+,(k)} = \max(1, m_i^{+,(k-1)} - \nu)$;
end
end
10. compute $\eta_{mm^+}$;
11. k = k + 1; }
We assess the reliability of both the modeling error estimator $\eta_{mm^+}$ and of the adaptive
procedure above on the same test case tackled in the previous numerical sections.
In particular, as goal quantity we choose the mean value of the solution on $\Omega$. This
leads us to identify the functional J in (18) with $J(v) = [\mathrm{meas}(\Omega)]^{-1} \int_\Omega v(x, y)\, dz$.
Notice that the dual problem still coincides with an advection-diffusion problem, but
with a forward advective field. Full homogeneous Dirichlet boundary conditions
complete the dual formulation.
We make the following choices for the input parameters of the modeling adaptive
procedure: $\mathrm{TOL} = 2 \cdot 10^{-3}$, $\tilde{m} = 3$, $\tilde{m}^+ = 5$, $\tau = 0.1$, $\nu = 1$, delta1 = 0.5,
delta2 = 1.5, Nmax = 10, while we introduce a uniform finite element partition
of size h = 0.05 to discretize $\Omega_{1D} = (0, 7)$. The adaptive procedure detects the
three subdomains $\Omega_1 = (0, 2.7) \times (0, 2)$, $\Omega_2 = (2.7, 4.4) \times (0, 2)$ and $\Omega_3 =
(4.4, 7) \times (0, 2)$, while the initial guesses predicted for the modal multi-indices are
$m^{(0)} = \{3, 4, 3\}$, $m^{+(0)} = \{5, 6, 5\}$. The domain $\Omega_2$ associated with the two sources
is immediately identified as the most troublesome.
Concerning the domain decomposition algorithm, due to the advective field b, we
have to pay attention in selecting the Dirichlet and the Neumann interfaces for the
primal and dual problems to guarantee the well-posedness of each subproblem on
$\Omega_i$. In particular, a Dirichlet/Neumann condition is assigned at $\Sigma_1 = \{2.7\} \times \gamma_{2.7}$ and
$\Sigma_2 = \{4.4\} \times \gamma_{4.4}$ when solving the primal problem; conversely, a Neumann/Dirichlet
condition is enforced on $\Sigma_1$ and $\Sigma_2$ to solve the dual problem. We fix a tolerance equal
to $10^{-2}$ for the domain decomposition algorithm at both $\Sigma_1$ and $\Sigma_2$. The average
number of iterations demanded to ensure this accuracy is eight for the primal problem
and nine for the dual one.
Three model adaptive iterations allow us to reach the desired tolerance TOL, with a
final prediction for the modal multi-index $m = m^{(3)} = \{3, 7, 1\}$. While the initial
number $m_1^{(0)} = 3$ of modes is preserved on $\Omega_1$, we have a gradual increase of the
number of modal functions in $\Omega_2$ and a model coarsening occurs in $\Omega_3$. Table 1
provides more quantitative information about the adaptive procedure. The sequence
of columns gathers the iteration number k (k = 0 refers to the initial configuration
predicted by the first phase), the modal multi-indices $m$, $m^+$ and the value of the
estimator $\eta_{mm^+}$. For the sake of simplicity, we set the saturation constant $\sigma_m$ in
(21) to zero. Figure 11 collects the piecewise Hi-Mod solutions for the starting
guess and for the odd iterations of the adaptive procedure, while Fig. 10 shows
the associated error estimators: in particular, the three lines correspond to the local
quantities $J_i(u_{m^+} - u_m)$, for i = 1, 2, 3. The model discontinuity occurring at the
interface $\Sigma_1$ is almost imperceptible.

Table 1 Quantitative information on the second phase of the modeling adaptive procedure
k    m            m+           estimator
0    {3, 4, 3}    {5, 6, 5}    1.81 x 10^-2
1    {3, 5, 2}    {5, 7, 4}    4.72 x 10^-3
2    {3, 6, 1}    {5, 8, 3}    4.56 x 10^-3
3    {3, 7, 1}    {5, 9, 3}    1.31 x 10^-3

Fig. 11 Piecewise Hi-Mod solutions: $u^h_{3,4,3}$, $u^h_{3,5,2}$, $u^h_{3,7,1}$ (top-bottom)
Remark 7 The a posteriori modeling error analysis and the adaptive procedure here
presented are generalizable to both a uniform and a pointwise setting. Figure 12
shows the Hi-Mod solution predicted by the error estimator derived for a pointwise
reduction to control the energy norm of the modeling error [25]. Notice the high
number of modes used in the central part of the domain. Moreover, in [23] the model
adaptivity is successfully combined with an adaptive prediction of the finite element
partition along $\Omega_{1D}$.
Acknowledgments The author thanks Massimiliano Lupo Pasini for Figs. 7 and 12 and Alessandro
Veneziani for the suggestions during the preparation of the manuscript.
References
1. Achchab B, Achchab S, Agouzal A (2004) Some remarks about the hierarchical a posteriori
error estimate. Numer Meth Partial Differ Equ 20(6):919–932
2. Ainsworth M (1998) A posteriori error estimation for fully discrete hierarchic models of elliptic
boundary value problems on thin domains. Numer Math 80:325–362
3. Aletti M, Perotto S, Veneziani A (2014) Educated bases for hierarchical model reduction in
2D and 3D. (in preparation)
4. Babuška I, Schwab C (1996) A posteriori error estimation for hierarchic models of elliptic
boundary value problems on thin domains. SIAM J Numer Anal 33:221–246
5. Bangerth W, Rannacher R (2003) Adaptive finite element methods for differential equations.
Birkhäuser, Basel
6. Bank RE, Smith RK (1993) A posteriori error estimates based on hierarchical bases. SIAM J
Numer Anal 30:921–935
7. Blanco PJ, Leiva JS, Feijóo RA, Buscaglia GC (2011) Black-box decomposition approach for
computational hemodynamics: one-dimensional models. Comput Methods Appl Mech Eng
200(13–16):1389–1405
8. Canuto C, Maday Y, Quarteroni A (1982) Analysis of the combined finite element and Fourier
interpolation. Numer Math 39:205–220
9. Canuto C, Maday Y, Quarteroni A (1984) Combined finite element and spectral approximation
of the Navier-Stokes equations. Numer Math 44:201–217
10. Dörfler W, Nochetto RH (2002) Small data oscillation implies the saturation assumption.
Numer Math 91:1–12
11. Ern A, Perotto S, Veneziani A (2008) Hierarchical model reduction for advection-diffusion-
reaction problems. In: Kunisch K, Of G, Steinbach O (eds) Numerical mathematics and
advanced applications, pp 703–710. Springer, Heidelberg
12. Formaggia L, Nobile F, Quarteroni A, Veneziani A (1999) Multiscale modelling of the
circulatory system: a preliminary analysis. Comput Visual Sci 2:75–83
13. Formaggia L, Quarteroni A, Veneziani A (eds) (2009) Cardiovascular mathematics, modeling,
simulation and applications, vol 1. Springer, Berlin
14. Giles MB, Süli E (2002) Adjoint methods for PDEs: a posteriori error analysis and
postprocessing by duality. Acta Numer 11:145–236
15. Heinrich B (1996) The Fourier-finite-element method for Poisson's equation in axisymmetric
domains with edges. SIAM J Numer Anal 33:1885–1911
16. Johnson C (1993) A new paradigm for adaptive finite element methods. In: Whiteman J (eds)
Proceedings of MAFELAP, vol 93. Wiley, New York
17. Lasis A, Süli E (2003) Poincaré-type inequalities for broken Sobolev spaces. Technical Report
0310. Oxford University Computing Laboratory
18. Lions J-L, Magenes E (1972) Non-homogeneous boundary value problems and applications.
Springer, Berlin
19. Lorenz B, Biros G, Ghattas O, Heinkenschloss M, Keyes D, Mallick B, Tenorio L, van
Bloemen Waanders B, Willcox K, Marzouk Y (eds) (2011) Large-scale inverse problems and
quantification of uncertainty, vol 712. Wiley, Chichester
20. Oden JT, Prudhomme S (2001) Goal-oriented error estimation and adaptivity for the finite
element method. Comput Math Appl 41:735–756
21. Perotto S (2014) Hierarchical model (Hi-Mod) reduction in non-rectilinear domains. In: Erhel
J, Gander M, Halpern L, Pichot G, Sassi T, Widlund O (eds) Lecture Notes in Computer
Science Engineering. Springer, Berlin, pp 407–414
22. Perotto S, Ern A, Veneziani A (2010) Hierarchical local model reduction for elliptic problems:
a domain decomposition approach. Multiscale Model Simul 8(4):1102–1127
23. Perotto S, Veneziani A (2014) Coupled model and grid adaptivity in hierarchical reduction of
elliptic problems. J Sci Comput. doi:10.1007/s10915-013-9804-y
24. Perotto S, Zilio A (2013) Hierarchical model reduction: three different approaches. In:
Cangiani A, Davidchack R, Georgoulis E, Gorban A, Levesley J, Tretyakov M (eds) Numerical
mathematics and advanced applications, pp 851–859. Springer, Berlin
25. Perotto S, Zilio A (2014) Hierarchical model reduction for time dependent problems. (in
preparation)
26. Quarteroni A, Valli A (1999) Domain decomposition methods for partial differential equations.
In: Numerical mathematics and scientific computation. Oxford University Press, New York
27. Toselli A, Widlund O (2005) Domain decomposition methods - algorithms and theory. Springer,
Berlin
28. Vogelius M, Babuška I (1981) On a dimensional reduction method I. The optimal selection of
basis functions. Math Comput 37:31–46
29. Vogelius M, Babuška I (1981) On a dimensional reduction method II. Some approximation-
theoretic results. Math Comput 37:47–68
30. Vogelius M, Babuška I (1981) On a dimensional reduction method III. A posteriori error
estimation and an adaptive approach. Math Comput 37:361–384
Part V
Multi-fluid Flows
On the Application of Two-Fluid Flows Solver
to the Casting Problem
1 Introduction
Multi-fluid flow simulation with large deformations at the interface has to deal with
two main challenges. The first one is to accurately follow or capture the interface
between the phases, and the second one is to treat jumps in the material properties
at the interface, which result in kinks or jumps in the unknown fields. The most popular
technique to capture the interface is the level set method, which can deal with large
deformations of the interface. Its main deficiency is the gain/loss of material at
sharp interfaces or during the reinitialization step. On the other hand, in applications
like mold filling, where one of the fluids has a large density (aluminum or steel) and
the other one a small density (air), instabilities appear at the interface that can totally
destroy the solution accuracy. The main reason for such behavior is the poor ability
of linear elements to capture kinks or jumps in the pressure-velocity pair. Various
enrichment techniques have been proposed to improve the approximation properties in the
elements cut by the interface. In the following, we first review some of the main algorithms
developed for the interface capturing approach by means of the level set method. Then,
various enrichment techniques to treat kinks and/or jumps in the unknown fields at
the interface are presented. Finally, we loosely couple the level set method and one
of the enrichment techniques to model the mold filling process.
The underlying idea behind the level set method is to represent an interface as the
zero-level set of a higher dimensional function $\phi(x, t)$. This function is scalar and
substantially reduces the complexity of describing the interface, especially when
undergoing topological changes such as pinching and merging. The level set function
$\phi(x, t)$ is defined to be a smooth function that is positive in one region and
negative in the other.
The motion of the interface is determined by a velocity field, u, which can depend
on a variety of things including position, time and geometry of the interface, or be given
externally, for instance as the solution of the Navier-Stokes equations in a fluid flow
simulation. The advection equation for interface evolution is
$$\phi_t + u \cdot \nabla\phi = 0. \qquad (1)$$
This level set equation is a first order hyperbolic PDE and only needs to be solved
near the interface. Geometrical quantities related to the interface ($\phi = 0$), i.e. the unit
normal N and the curvature $\kappa$, can be calculated from the level set function by:
$$N = \frac{\nabla\phi}{|\nabla\phi|}, \qquad \kappa = \nabla \cdot N. \qquad (2)$$
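As a quick numerical check of these geometric relations, the sketch below evaluates the unit normal and the curvature of a circle from its signed distance function on a Cartesian grid. It uses plain second-order central differences through `numpy.gradient`, which is an assumption of this example and not the discretization used in the chapter; the grid size, the radius and the helper name `normal_and_curvature` are likewise illustrative.

```python
import numpy as np

def normal_and_curvature(phi, h):
    """Unit normal N = grad(phi)/|grad(phi)| and curvature kappa = div(N)
    on a uniform grid of spacing h, using central differences."""
    px, py = np.gradient(phi, h)
    norm = np.sqrt(px ** 2 + py ** 2) + 1e-12     # guard against division by zero
    nx, ny = px / norm, py / norm
    kappa = np.gradient(nx, h, axis=0) + np.gradient(ny, h, axis=1)
    return nx, ny, kappa

n = 101
x = np.linspace(-1.0, 1.0, n)
h = x[1] - x[0]
X, Y = np.meshgrid(x, x, indexing="ij")
rho = np.sqrt(X ** 2 + Y ** 2)
phi = rho - 0.5                                    # signed distance to a circle of radius 0.5
nx, ny, kappa = normal_and_curvature(phi, h)
band = np.abs(phi) < h                             # grid points next to the interface
```

For a circle the curvature at distance rho from the centre is 1/rho, so the values of `kappa` in the band around the interface should be close to 2.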
The most common choice for the level set function is the signed distance to
the interface so that || = 1. This ensures that the level set is a smoothly varying
function well suited for high order accurate numerical methods. There are several
techniques to solve the level set Eq. (1) in space and time. One of the most popular on
structured meshes is the high order Hamilton-Jacobi ENO method (HJ-ENO) com-
bined with a Runge-Kutta method [1, 2]. Despite the high order temporal and spatial
On the Application of Two-Fluid Flows Solver to the Casting Problem 247
approximations of the level set equation, instabilities may appear when the level set
ceases to be a signed distance function. This situation occurs in the presence of large
topological changes in the vicinity of the interface, which are quite common in practical
problems. One solution is to reshape the level set function into a distance function. This
method, called reinitialization, has been shown to stabilize those numerical instabilities.
Reinitialization algorithms maintain the signed distance property by solving to
steady state (as fictitious time $\tau \to \infty$) the equation
$$\phi_\tau + \mathrm{sgn}(\phi_0)\,(|\nabla\phi| - 1) = 0. \qquad (3)$$
where $\mathrm{sgn}(\phi_0)$ is a one-dimensional smeared out signum function [3]. Equation (3)
only needs to be solved near the interface and not in the whole domain. Efficient
ways to solve this equation to steady state via fast marching methods are discussed
in [4]. This equation can also be written in a classical Hamilton-Jacobi form as:
$$\phi_\tau + \mathbf{v}(\phi)\cdot\nabla\phi = \mathrm{sgn}(\phi_0) \quad\text{and}\quad \mathbf{v}(\phi) = \mathrm{sgn}(\phi_0)\,\frac{\nabla\phi}{|\nabla\phi|}$$
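A minimal one-dimensional sketch of the reinitialization Eq. (3) is given below. It assumes a first-order Godunov upwind discretization of the gradient norm, explicit pseudo-time stepping and a sign function smeared over one mesh size; these choices, the function name and the parameters are illustrative assumptions and not the scheme used in the chapter.

```python
import math

def reinitialize_1d(phi0, h, n_iter=200, cfl=0.5):
    """Iterate phi_tau + S(phi0) (|phi_x| - 1) = 0 towards steady state."""
    n = len(phi0)
    # Smeared-out sign function; smoothing over one mesh size is an assumption.
    sgn = [p / math.sqrt(p * p + h * h) for p in phi0]
    phi = list(phi0)
    dtau = cfl * h
    for _ in range(n_iter):
        new = phi[:]
        for i in range(n):
            # Forward (b) and backward (a) differences, copied at the ends.
            b = (phi[i + 1] - phi[i]) / h if i < n - 1 else None
            a = (phi[i] - phi[i - 1]) / h if i > 0 else b
            if b is None:
                b = a
            # Godunov upwinding: information travels away from the interface.
            if sgn[i] > 0.0:
                grad2 = max(max(a, 0.0) ** 2, min(b, 0.0) ** 2)
            else:
                grad2 = max(min(a, 0.0) ** 2, max(b, 0.0) ** 2)
            new[i] = phi[i] - dtau * sgn[i] * (math.sqrt(grad2) - 1.0)
        phi = new
    return phi

# A level set with slope 3 and its zero at node 20 is driven back to slope 1.
h = 0.025
phi0 = [3.0 * (i - 20) * h for i in range(41)]
phi = reinitialize_1d(phi0, h)
```

In this example the zero crossing sits exactly on a node, so the interface does not move; in general the discrete scheme can shift it slightly, which is the mass loss issue discussed in the text.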
Unfortunately, one of the major drawbacks of the reinitialization process is the diffi-
culty in preserving the original location of the interface, often leading to breakdown
in the conservation of mass. To overcome this problem of mass loss with the level
set method, various solutions have been proposed:
Particle level set methods
Volume of fluid methods
Geometric mass-preserving redistance methods
Discontinuous Galerkin level set methods.
Particle Level Set (PLS) [5] uses Lagrangian marker particles to rebuild the level
set in regions which are under-resolved. This is often the case for flows undergoing
stretching and tearing. Two sets of massless marker particles are placed near the
interface with one set, the positive particles, in the $\phi > 0$ region and the other set,
the negative particles, in the $\phi < 0$ region. It is unnecessary to place particles far
from the interface and this greatly reduces the number of particles needed in a given
simulation. The region near the interface can be taken as the region covered
by all elements that have at least one corner closer to the interface than three element
sizes. The number of particles per cell is set to $4^d$ where $d$ is the spatial dimension.
Figure 1a shows the zero level set for the 2D Zalesak's disk after one revolution.
Placement of the massless particles around the zero-level set for this test can be seen
in Fig. 1b. In Fig. 1c the PLS solution after one revolution is shown.
Fig. 1 Particle Level Set (PLS) method [5], a mass gain/loss of the standard level set solution
after one revolution, b placement of massless positive (blue) and negative (red) particles around the
interface for the Zalesak test, c PLS solution after one revolution
The marker particles are advected by the evolution equation
$$\frac{d\mathbf{x}_p}{dt} = \mathbf{u}(\mathbf{x}_p), \qquad (4)$$
where $\mathbf{x}_p$ is the position of the particle and $\mathbf{u}(\mathbf{x}_p)$ is its velocity. The particle velocities
are interpolated from the velocities on the underlying grid. The marker particles (4)
and the level set Eq. (1) are separately integrated forward in time. After each complete
time cycle, the particles are used to locate possible errors in the level set function.
Particles that are on the wrong side of the interface by more than their radius, as
determined by the interpolated distance $\phi(\mathbf{x}_p)$, are considered to have escaped.
A local level set function is defined for each particle by means of the radius
associated to the particle,
$$\phi_p(\mathbf{x}) = s_p\,(r_p - |\mathbf{x} - \mathbf{x}_p|) \qquad (5)$$
where $s_p$ is the sign of the particle, i.e. $\pm 1$. These level sets are only defined locally
on the corners of the cell containing the particle and can be seen as the particle
predictions of the values of the level set function on the corners of the cell. Any
variation of $\phi$ from $\phi_p$ indicates possible errors in the level set solution. The escaped
positive particles are used to rebuild the $\phi > 0$ region and the escaped negative
particles to rebuild the $\phi \le 0$ region. For example, take the $\phi > 0$ region and
an escaped positive particle. Using Eq. (5), the $\phi_p$ values of the grid points on the
boundary of the cell containing the particle are calculated. Each $\phi_p$ is compared to
the local values of $\phi$ and the maximum of these two values is taken as $\phi^+$. This is
done for all escaped positive particles creating a reduced error representation of the
$\phi > 0$ region. That is, given a level set $\phi$ and a set of escaped positive particles $E^+$,
$$\phi^+ = \max_{p \in E^+}(\phi_p, \phi^+)$$
Similarly, for the negative region $\phi \le 0$, we initialize $\phi^-$ with $\phi$ and then calculate
$$\phi^- = \min_{p \in E^-}(\phi_p, \phi^-).$$
We merge $\phi^+$ and $\phi^-$ back into a single level set value by setting $\phi$ to the value of
$\phi^+$ or $\phi^-$ which is least in magnitude at each grid point.
The particle level set method relies on $\phi$ being a signed distance function. This
implies that after each time marching step, a reinitialization of the level set function
using Eq. (3) is necessary. Unfortunately, the reinitialization may cause the zero level
set to move, which is not desirable, so the particle level set method is employed to
correct these errors as well. During the reinitialization step the particles are not
moved, so that they retain the position of the zero level set, and any error introduced
by the reinitialization scheme is then corrected by the particles. After the reinitialization
and correction steps the radii of the particles are adjusted according to the current
value of $\phi(\mathbf{x}_p)$, i.e. their distance to the zero level set.
In summary, the order of operations in PLS is: evolve both particles and level set
function in time, correct errors in the level set function using particles, reinitialize
the distance function, again correct the errors using particles and finally adjust the
particle radii. A final task to complete the PLS is the particle reseeding, i.e. in flows
with the interface stretching and tearing we need to periodically readapt the
particle distribution to the deformed interface [5].
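The error-correction step can be illustrated on a one-dimensional grid. The sketch below applies the escape test, the local particle level set of Eq. (5) and the max/min merge described above; the grid, the particle data and all function names are hypothetical and chosen only for illustration.

```python
def local_level_set(x, x_p, r_p, s_p):
    """Eq. (5): phi_p(x) = s_p (r_p - |x - x_p|)."""
    return s_p * (r_p - abs(x - x_p))

def has_escaped(phi_at_particle, r_p, s_p):
    """A particle has escaped if it is on the wrong side by more than its radius."""
    return s_p * phi_at_particle < -r_p

def correct_level_set(xs, phi, escaped_pos, escaped_neg):
    """Rebuild phi on a 1D grid from escaped particles given as (x_p, r_p)."""
    phi_plus, phi_minus = list(phi), list(phi)
    for particles, s_p, target, pick in (
        (escaped_pos, +1, phi_plus, max),
        (escaped_neg, -1, phi_minus, min),
    ):
        for x_p, r_p in particles:
            # Corners of the 1D cell [xs[i], xs[i+1]] containing the particle.
            i = max(k for k in range(len(xs) - 1) if xs[k] <= x_p)
            for k in (i, i + 1):
                target[k] = pick(local_level_set(xs[k], x_p, r_p, s_p), target[k])
    # Merge: keep the value of smallest magnitude at each grid point.
    return [p if abs(p) <= abs(m) else m for p, m in zip(phi_plus, phi_minus)]

xs = [0.0, 1.0, 2.0, 3.0]
phi = [-1.5, -0.5, 0.5, 1.5]      # zero level set at x = 1.5
# A positive particle at x = 1.3 with radius 0.1 sees phi = -0.2: it has escaped.
escaped = has_escaped(-0.2, 0.1, +1)
corrected = correct_level_set(xs, phi, [(1.3, 0.1)], [])
```

Here the correction raises the value at the node x = 1 from -0.5 to about -0.2, which pulls the zero level set back towards the escaped positive particle.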
The main advantage of representing the free surface with the volume fractions is
that one can write accurate algorithms for advecting the volume fraction function so
that mass is conserved while still maintaining a sharp representation of the interface.
However, a disadvantage of the VOF method is the fact that it is difficult to compute
accurate local curvatures from volume fractions. This is mainly due to the sharp
transitions of the volume fractions at the interface. One remedy is to smooth the H
function at the interface. If one smooths too much then the numerical method will not
detect changes in curvature along the interface. In the CLSVOF method, the curvature
is not smoothed at all; instead the curvature is obtained via finite differences of the
level set function which in turn is derived from the level set function and volume-
of-fluid function at the previous time step.
The equation governing the evolution of the VOF is:
$$F_t + \mathbf{u}\cdot\nabla F = 0. \qquad (8)$$
The coupling between the level set function $\phi$ and the volume-of-fluid function $F$ occurs when
the interface normals computed from the level set are used in the interface
reconstruction at each cell, which in turn is used by the VOF function. Much of the research
on VOF methods has focused on obtaining higher accuracy and better representations
of the interface geometry. The original piecewise linear interface reconstruction
technique (PLIC) [20] has been improved upon using parabolic (PROST) [21] and
least squares [22] techniques.
The CLSVOF can be summarized as follows: first update the level set function $\phi^{n+1}$, then
reconstruct the interface at each cell by one of the linear or higher order techniques
and finally update the volume-of-fluid function $F^{n+1}$. Note that reinitializing
$\phi^{n+1}$ as the exact signed distance function to the reconstructed interface is another
point where coupling between the LS and VOF occurs. Recall that numerically, the
smoothed Heaviside function $H_\varepsilon(\phi)$ is substituted for the sharp Heaviside function
$H(\phi)$. The smoothed Heaviside function is defined as,
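$$H_\varepsilon(\phi) = \begin{cases} 1 & \text{if } \varepsilon < \phi \\ \frac{1}{2}\left[1 + \frac{\phi}{\varepsilon} + \frac{1}{\pi}\sin(\pi\phi/\varepsilon)\right] & \text{if } |\phi| \le \varepsilon \\ 0 & \text{if } -\varepsilon > \phi. \end{cases} \qquad (9)$$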
The advection distorts the initial shape of the level set function, which needs to
be reinitialized to a smooth function preserving the position of the zero-level set.
Efficient algorithms for level-set redistancing on Cartesian meshes have been developed
but few methods are available for unstructured meshes. The geometric mass-preserving
redistance method [8, 9], developed for unstructured meshes, can be
localized on a narrow band close to the interface, saving computing effort. Almost
all redistancing algorithms involve some sort of mass-rebalancing step. Geometric
mass-preserving redistance includes one such step that is local and involves no
adjustable parameter.
Consider an arbitrary triangulation of the domain $\Omega$, and the associated space $V_h$
of continuous functions that are linear inside each simplex. Let $\phi_h \in V_h$ be a function,
and let $S$ be its zero-level set. We look for a function $\tilde{\phi}_h \in V_h$ which approximates the
signed distance function $d$ to $S$. This function satisfies $|\nabla d| = 1$ almost everywhere
in $\Omega$ but does not, in general, belong to $V_h$ (see Fig. 2).
Let $P$ be the set of nodal points that are adjacent to the zero-level set of $\phi_h$, in the
sense that they are vertices of simplices inside which $\phi_h$ changes sign. If one makes
the simple assignment $\tilde{\phi}_h(X) = d(X)$ for all $X \in P$, there is a volume loss (or gain)
which could render the algorithm useless for practical simulations. The values of $\tilde{\phi}_h$ at nodes
adjacent to the zero-level set must thus be adjusted so as to preserve volume, and
the function $\tilde{\phi}_h$ must be calculated at the remaining nodes using the adjusted values
at $P$.
Let us define $\mathcal{K}(\phi_h)$ as the set of simplices in which $\phi_h$ changes sign. The objective
is thus to calculate $\tilde{\phi}_h$ such that it approximates the signed distance $d$ while at the
same time preserving the volume,
$$V(\phi_h) = \int_{\mathcal{K}(\phi_h)} H(\phi_h(\mathbf{x}))\,d\mathbf{x},$$
where $H$ is the Heaviside function defined in (7). The contribution to the volume of
each simplex $K \in \mathcal{K}(\phi_h)$ is
$$V_K(\phi_h) = \int_K H(\phi_h(\mathbf{x}))\,d\mathbf{x}.$$
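$$R_K(\eta_K) = V_K(\tilde{\phi}^0_h + \eta_K) - V_K(\phi_h) = 0$$
where $C$ is computed such that the corrected function $\tilde{\phi}^0_h + C\psi_h$ conserves the volume
over the band of elements cut by the interface,
$$\int_{\mathcal{K}(\phi_h)} H(\tilde{\phi}^0_h + C\psi_h)\,d\mathbf{x} = \int_{\mathcal{K}(\phi_h)} H(\phi_h(\mathbf{x}))\,d\mathbf{x}.$$
This nonlinear equation for $C$ is again solved by a simple secant method and
converges in very few iterations.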
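The secant iteration for the volume-conserving constant can be sketched in one dimension, where the "volume" is the length of the region in which the piecewise-linear level set is positive. The mesh, the function names, the starting guess and the uniform correction field used in the example are assumptions made for illustration; the text notes that a uniform correction is not optimal in practice, and in this 1D toy problem the distinction does not matter.

```python
def positive_volume(phi, h):
    """Length of the region phi > 0 for a piecewise-linear phi on a uniform 1D mesh."""
    vol = 0.0
    for a, b in zip(phi[:-1], phi[1:]):
        if a > 0.0 and b > 0.0:
            vol += h
        elif a > 0.0 or b > 0.0:
            pos = max(a, b)
            vol += h * pos / (pos - min(a, b))   # cut cell: positive fraction
    return vol

def solve_correction(phi0, psi, target, h, tol=1e-12, max_iter=50):
    """Secant iteration for C such that volume(phi0 + C psi) = target."""
    def residual(c):
        return positive_volume([p + c * w for p, w in zip(phi0, psi)], h) - target

    c_prev, c = 0.0, 0.1 * h          # two starting guesses (assumed)
    r_prev = residual(c_prev)
    for _ in range(max_iter):
        r = residual(c)
        if abs(r) < tol or r == r_prev:
            break
        c_prev, c, r_prev = c, c - r * (c - c_prev) / (r - r_prev), r
    return c

h = 0.1
xs = [i * h for i in range(11)]
phi_h = [x - 0.43 for x in xs]        # original level set (volume to preserve)
phi_0 = [x - 0.45 for x in xs]        # redistanced function that lost volume
psi = [1.0] * len(xs)                 # uniform correction field (toy choice)
C = solve_correction(phi_0, psi, positive_volume(phi_h, h), h)
```

Because the volume is piecewise linear in the constant for this example, the secant method recovers the shift of about 0.02 after a single update.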
Note that considering a uniform mass-conserving correction by choosing $\psi_h = 1$
is not optimal since volume loss/gain is not uniform over the interface. In fact the
loss/gain in volume tends to concentrate in regions of higher curvature. Figure 3
shows the application of geometric mass-preserving redistance method to the Zalesak
test.
Fig. 3 Geometric mass-preserving redistance method [8] applied to the Zalesak test. Interface
position at different instants during one revolution, with redistancing after each step: a–e
mass-preserving redistance method and f–j level set method. a t = 0 s, b t = 157 s, c t = 314 s,
d t = 471 s, e t = 628 s, f t = 0 s, g t = 157 s, h t = 314 s, i t = 471 s, j t = 628 s
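$$\phi_t + \nabla\cdot(\mathbf{u}\phi) = \phi\,\nabla\cdot\mathbf{u},$$
which, for a divergence-free velocity field ($\nabla\cdot\mathbf{u} = 0$), reduces to
$$\phi_t + \nabla\cdot(\mathbf{u}\phi) = 0. \qquad (10)$$
Within this framework, only possible for incompressible flow, it is no longer necessary to
compute $\nabla\phi$. The variational form of Eq. (10) is written as:
$$\int_\Omega \phi_t\, w\, d\Omega = \int_\Omega \phi\,\mathbf{u}\cdot\nabla w\, d\Omega - \int_{\partial\Omega} f\, w\, ds \qquad (11)$$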
where $f(\phi) = \phi\,\mathbf{u}\cdot\mathbf{n}$ is the normal trace of the fluxes and $\mathbf{u}$ is the velocity field. Since
the DGM allows discontinuities at the interface, the flux is not uniquely determined
on it and a flux formula has to be supplied to complete the discretization process. In
the simple advection case the upwind flux is chosen to be the value of $\phi$ on $\partial\Omega$:
$$\phi^{UP} = \begin{cases} \phi^+ & \text{if } \mathbf{u}\cdot\mathbf{n} \le 0, \text{ i.e. the flow goes inside the domain} \\ \phi^- & \text{if } \mathbf{u}\cdot\mathbf{n} > 0, \text{ i.e. the flow goes outside the domain.} \end{cases}$$
$$\phi = \sum_{k=1}^{d} N_k\,\phi^e_k$$
$$F(\phi) = F\Big(\sum_{k=1}^{d} N_k\,\phi^e_k\Big) \simeq \sum_{k=1}^{d} N_k\,F(\phi^e_k) \qquad (12)$$
In [10] Lagrangian shape functions are used to approximate $\phi$ and in each element
$e$, $d$ Lagrangian points are considered. As the interpolations are disconnected,
the integral form (11) is written for each element $e$ of the mesh. Note that for elements
with straight edges the Jacobian matrix is constant and the matrices related
to (11), once written in the reference coordinates, are independent of the element.
In this way the DGM is quadrature-free. In [10] a TVD Runge-Kutta scheme of order $p + 1$,
where $p$ is the polynomial interpolation order, is used for the time stepping. The RK scheme
of order $p + 1$ with DG at order $p$ can be proven to be stable under the CFL condition
$\Delta t < \frac{h}{c\,(2p+1)}$, where $h$ is the element size and $c$ is the norm of the maximum velocity
on element $e$. Table 1 summarizes the comparison between DG for various orders of
interpolation and two other methods mentioned earlier, VOF and PLS, for the 2D
Zalesak's disk.
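The CFL bound quoted above translates directly into a time-step estimate. The sketch below is a minimal illustration under the assumption that the bound is read as h divided by c(2p + 1); the safety factor, the function names and the element data are hypothetical.

```python
def dg_stable_time_step(h, c, p, safety=0.9):
    """Time step below the quoted bound dt < h / (c (2p + 1)).

    h: element size, c: norm of the maximum velocity on the element,
    p: polynomial order. The safety factor is an assumed tuning knob.
    """
    if c <= 0.0:
        raise ValueError("velocity norm must be positive")
    return safety * h / (c * (2 * p + 1))

def mesh_time_step(elements, p, safety=0.9):
    """Smallest admissible step over a list of (h, c) pairs, one per element."""
    return min(dg_stable_time_step(h, c, p, safety) for h, c in elements)

dt_single = dg_stable_time_step(0.1, 2.0, 2, safety=1.0)
dt_mesh = mesh_time_step([(0.1, 2.0), (0.05, 4.0)], p=2, safety=1.0)
```

The estimate shrinks as the polynomial order grows, which is the price paid for the higher accuracy of the high-order DG discretizations.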
Two types of methods are generally distinguished for resolving the interface between
the two phases: interface tracking, in which the mesh explicitly represents the interface
and follows its movements [11–13], and interface capturing, in which the interface is
described implicitly as the zero level-set of an auxiliary function [6, 14–16] defined on
the fixed mesh. The interface-capturing method is more convenient when large
topological changes occur at the interface. The main issue that needs to be addressed
in two-phase flow problems is the possible jumps or kinks in the unknown fields,
i.e. velocity and pressure, due to the jumps in material properties, i.e. density and
viscosity, or to the surface tension. In this view, two classes of methods can be recognized
in the interface-capturing category. In the first one [17] a numerical thickness is given
to the interface by means of the smoothed Heaviside function (9). The material
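properties, $\rho$ and $\mu$, are then defined for the two-fluid system at the discrete level as,
$$\rho = \rho_w H_\varepsilon(\phi_h) + \rho_a\,[1 - H_\varepsilon(\phi_h)]$$
$$\mu = \mu_w H_\varepsilon(\phi_h) + \mu_a\,[1 - H_\varepsilon(\phi_h)]$$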
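$$\begin{aligned} \rho\,(\partial_t \mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u}) - \nabla\cdot(2\mu\,\nabla^s\mathbf{u}) + \nabla p &= \rho\,\mathbf{b} \\ \nabla\cdot\mathbf{u} &= 0 \end{aligned} \qquad (13)$$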
Note that a jump in density at the interface produces a discontinuity in the pressure
gradient, while a jump in viscosity causes a discontinuity in pressure. Surface tension
can also produce a jump in the pressure field at the interface.
The balance of the interface and internal forces at the interface implies that,
$$(\boldsymbol{\sigma}^+ - \boldsymbol{\sigma}^-)\cdot\mathbf{n} = \mathbf{f}_\Gamma.$$
Here we only consider the case in which the interface force $\mathbf{f}_\Gamma$ is the surface tension and
is therefore normal to the interface. The internal stress in each domain has the form
$\boldsymbol{\sigma} = -p\,\mathbf{I} + 2\mu\,\nabla^s\mathbf{u}$.
The variational equivalent of (13) is to find $(\mathbf{u}, p) \in V \times Q$ such that
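$$\int_\Omega \rho\,(\partial_t\mathbf{u} + \mathbf{u}\cdot\nabla\mathbf{u})\cdot\mathbf{v}\,d\Omega + \int_\Omega 2\mu\,\nabla^s\mathbf{u} : \nabla^s\mathbf{v}\,d\Omega - \int_\Omega p\,\nabla\cdot\mathbf{v}\,d\Omega = \int_\Omega \rho\,\mathbf{b}\cdot\mathbf{v}\,d\Omega + \int_\Gamma \mathbf{v}\cdot\mathbf{f}_\Gamma\,d\Gamma$$
$$\int_\Omega q\,\nabla\cdot\mathbf{u}\,d\Omega = 0$$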
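$$G_u = \rho\left(\frac{\mathbf{u}_h^{n+1} - \mathbf{u}_h^n}{\Delta t} + \mathbf{u}_h^{n+1}\cdot\nabla\mathbf{u}_h^{n+1} - \mathbf{b}^{n+1}\right)$$
and the stabilization parameters given by,
$$\tau_1 = \frac{h_e^2}{4\nu + 2h_e|\mathbf{u}_e|} \quad\text{and}\quad \tau_2 = \nu + 0.5\,h_e|\mathbf{u}_e|$$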
where $h_e$ and $\mathbf{u}_e$ are the elemental length and velocity, respectively. Solving (14)
accurately requires the finite element space to be appropriately chosen to capture the
possible kinks or jumps that are expected in the solution.
When the interface cuts elements on the fixed mesh, the jump in density causes a
kink in the pressure field and therefore a jump in its gradient in the cut elements (see
Fig. 4). It is clear that for simple elements (triangles in 2D and tetrahedra in 3D) a
linear approximation of the pressure cannot represent the kink in the cut element,
nor the jump in the pressure gradient. This pressure field in the cut
elements does not belong to the standard pressure space made up of linear nodal
interpolation and one way to capture this pressure field is to add an enrichment to the
pressure field of the cut elements. One such modification, proposed in [18], is to
interpolate the pressure as:
Fig. 4 Two-fluid hydrostatic flow with a jump in density. Linear elements cannot represent a kink
in the pressure field inside an element. a Interface passes through the elements, b jump in density in
elements cut by the interface, c exact solution (dotted line) and FE solution (solid line)
$$p_h^e = \sum_{i}^{N_{node}} N_i^e\,p_i^e + N_{enr}^e\,p_{enr}^e$$
$N_{node}$ is the number of nodes per element and $N_{enr}$ is the new enrichment function
added just for the elements cut by the interface. This function has a constant gradient
on each side of the interface and is designed to be local to each element, and therefore
has zero value at the element nodes. Figure 5a shows a sketch of the enrichment
function for an element cut by the interface in the 2D case. Node a belongs to
$\Omega^-$ and nodes b and c to $\Omega^+$. $N_{enr}$ is easily defined by the level set values at the
element nodes as:
$$N_{enr} = 0.5\left(-\Big|\sum_{i}^{N_{node}} N_i^e\,\phi_i\Big| + \sum_{i}^{N_{node}} N_i^e\,|\phi_i|\right)$$
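The behaviour of this enrichment function can be checked on a single linear triangle. The sketch below evaluates it from barycentric coordinates and nodal level set values, reading the formula as half the difference between the interpolated absolute values and the absolute value of the interpolant, which is the form that vanishes at the nodes as stated in the text. The function name and the nodal values are illustrative assumptions.

```python
def pressure_enrichment(bary, phi_nodes):
    """N_enr = 0.5 (sum_i N_i |phi_i| - |sum_i N_i phi_i|) on a linear element.

    bary: shape function values N_i at the evaluation point (barycentric
    coordinates for a triangle), phi_nodes: nodal level set values.
    """
    interp_of_abs = sum(n * abs(p) for n, p in zip(bary, phi_nodes))
    abs_of_interp = abs(sum(n * p for n, p in zip(bary, phi_nodes)))
    return 0.5 * (interp_of_abs - abs_of_interp)

# Node a on the negative side, nodes b and c on the positive side.
phi_nodes = (-1.0, 1.0, 1.0)
at_nodes = [pressure_enrichment(b, phi_nodes)
            for b in ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))]
on_interface = pressure_enrichment((0.5, 0.25, 0.25), phi_nodes)
on_positive_edge = pressure_enrichment((0.0, 0.5, 0.5), phi_nodes)
```

The function is zero at the three nodes, peaks on the interface and is linear on each side, so its gradient is constant on each side and jumps across the interface.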
In order to capture the discontinuities and take advantage of the enrichment functions
used, the integration rules need to be modified in elements cut by the front. To this
end, each tetrahedral (triangular in 2D) element is divided into up to six tetrahedral
(three triangular in 2D) sub elements. For each sub element the same integration rule
as for the non-cut elements is used (see Fig. 5b). Further enrichments are needed
when surface tension is present or a viscosity jump arises at the interface. In this
case the pressure solution belongs to a space that is discontinuous at the interface.
Figure 6 shows one type of this enrichment for the 2D case, introduced in [19].
The pressure is enriched in the cut elements by one additional local pressure on each
side of the interface. This new space can capture a constant
solution on each side with a possible jump at the interface. Note that both enrichment
functions are zero at the nodes of the mesh. These enrichment functions have the
following form:
Fig. 5 Interface divides an element in 2D. a $N_{enr}$ proposed in [18] to capture the discontinuous
pressure gradient in the cut elements. b Triangular sub elements used for the numerical
integration, with the integration points marked
Fig. 6 Enrichment shape functions proposed in [19] to capture a jump in the pressure field. Note
that they are not continuous at the interface position $\Gamma$. a $N^1_{enr}$, b $N^2_{enr}$
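$$N^1_{enr}(\mathbf{x}) = (1 - S(\mathbf{x}))\,\chi^+(\mathbf{x})$$
$$N^2_{enr}(\mathbf{x}) = S(\mathbf{x})\,\chi^-(\mathbf{x})$$
$$S(\mathbf{x}) = \sum_{i \in I^+} N_i(\mathbf{x})$$
with $I^+ = \{i \in I,\ \mathbf{x}_i \in \Omega^+_e\}$, and $\chi^+$ and $\chi^-$ the characteristic functions for the
positive and negative sides. Note that the additional shape functions are local to each
element crossed by the interface and therefore can be condensed prior to the assembly
to maintain the size of the system matrix and its graph.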
Fig. 7 Subdomains containing enriched, partially enriched and standard (non-enriched) elements
as well as the enriched nodes [23]
By global enrichment we mean those enrichment techniques that add new degrees
of freedom to the solution space, not condensed at the element level, and therefore
change the graph of the matrix. In this sense, the extended finite element method (XFEM)
is considered a global enrichment although the enrichment (extension) is local to the
interface zone. One exception to our classification is the intrinsic XFEM method [20]
that does not introduce any additional unknowns. All other standard XFEM methods
developed for two-phase flows [21–24] add new DOFs to the system. Similar to
the local enrichment techniques, when XFEM is used for the simulation of two-phase
flows, several enrichment schemes can be employed: velocity and/or pressure fields
may be enriched and enrichments for kinks or jumps may be used. Numerical studies
in [24] reveal that it is not advisable to enrich the velocity approximation space as
it does not improve the results significantly, but may lead to severe convergence
problems. Furthermore, the required number of iterations for the solution of the
governing equations may increase considerably. On the other hand, the enrichment
of the pressure field is essential.
Figure 7 shows three types of subdomains that can be distinguished in an XFEM
enrichment scheme. The first one, $\Omega_{enr}$, is composed of the elements cut by the interface,
which are those that have all nodes enriched. The second subdomain, $\Omega_{penr}$, has all
elements that have at least one node enriched and the last subdomain is the collection
of elements with standard degrees of freedom and without enrichment. All nodes
belonging to the elements cut by the interface are enriched. Let us define $I$ as the
collection of all nodes $i$ that are enriched. The enrichment functions $M_i$ for these
nodes are written as [24],
$$M_i(\mathbf{x}, t) = N_i(\mathbf{x}, t)\,[\psi(\mathbf{x}, t) - \psi(\mathbf{x}_i, t)]$$
with $\psi(\mathbf{x}, t)$ being the global enrichment function and $N_i(\mathbf{x}, t)$ the standard FE
shape function for node $i$. $M_i(\mathbf{x}, t)$ is the so-called shifted enrichment, which ensures
the Kronecker-$\delta$ property of the overall approximation.
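A global enrichment function that is typically chosen for strong discontinuities
(jump in pressure) is the sign enrichment:
$$\psi_{sign}(\mathbf{x}, t) = \mathrm{sign}(\phi(\mathbf{x}, t)) = \begin{cases} -1 & \text{if } \phi(\mathbf{x}, t) < 0 \\ 0 & \text{if } \phi(\mathbf{x}, t) = 0 \\ 1 & \text{if } \phi(\mathbf{x}, t) > 0. \end{cases}$$
where $\phi(\mathbf{x}, t)$ is the level set function. For weak discontinuities (kink in pressure
fields), Moës et al. [25] proposed the abs-enrichment shape function, which has the important
property of being zero on standard elements (i.e. $\Omega_{penr}$ in Fig. 7). It is defined as:
$$\psi(\mathbf{x}, t) = \psi_{abs}(\mathbf{x}, t) = \sum_{i}^{N_{node}} |\phi_i|\,N_i(\mathbf{x}, t) - \Big|\sum_{i}^{N_{node}} \phi_i\,N_i(\mathbf{x}, t)\Big|$$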
Note that this shape function is quite similar to the one proposed in [18] and shown
in Fig. 5. Figure 8 shows the enrichment shape functions, Mi , for the weak and strong
discontinuities, as mentioned above, in a quadrilateral element.
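The shifted construction can be tried out on a single two-node linear element. The sketch below assumes the usual shifted form, in which the global enrichment function minus its value at node i is multiplied by the shape function of node i, together with the sign and abs enrichments described above. The element, the nodal level set values and the function names are illustrative assumptions.

```python
def shape(x):
    """Linear shape functions of a 1D element with nodes at x = 0 and x = 1."""
    return (1.0 - x, x)

def level_set(x, phi_nodes):
    return sum(n * p for n, p in zip(shape(x), phi_nodes))

def psi_sign(x, phi_nodes):
    """Sign enrichment for strong discontinuities (jumps)."""
    v = level_set(x, phi_nodes)
    return (v > 0) - (v < 0)

def psi_abs(x, phi_nodes):
    """Abs enrichment for weak discontinuities (kinks)."""
    interp_of_abs = sum(abs(p) * n for p, n in zip(phi_nodes, shape(x)))
    return interp_of_abs - abs(level_set(x, phi_nodes))

def shifted_enrichment(i, x, phi_nodes, psi):
    """M_i(x) = N_i(x) [psi(x) - psi(x_i)], with node i located at x_i = i."""
    return shape(x)[i] * (psi(x, phi_nodes) - psi(float(i), phi_nodes))

phi_nodes = (-1.0, 1.0)           # interface at the element midpoint
m1_sign = shifted_enrichment(1, 0.25, phi_nodes, psi_sign)
m0_sign = shifted_enrichment(0, 0.25, phi_nodes, psi_sign)
m1_abs = shifted_enrichment(1, 0.5, phi_nodes, psi_abs)
nodal = [shifted_enrichment(i, float(j), phi_nodes, psi)
         for psi in (psi_sign, psi_abs) for i in (0, 1) for j in (0, 1)]
```

All enrichment functions vanish at both nodes, so the nodal unknowns keep their meaning as values of the approximated field.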
Note that in general, enrichment shape functions with small support can occur
(when the interface passes very close to a node), which leads to an ill-conditioned system
matrix. Two approaches to treat this problem are found in [26, 27].
Simulation of the mold filling process is very complex not only because of the variety
of phenomena that appear, but also because of the high level of interaction between
them. Fluid flow, heat transfer and phase change effects take place at the same time
and with a complex coupling pattern. Different defects may arise during the mold
filling stage, such as air entrapment, slag inclusion, formation of cold shuts and cold laps,
and solidification. Heat transfer with the mold and the air can dramatically change
the material properties and therefore change the course of the flow. It may cause
solidification at some material fronts while in other zones the material is still
Fig. 8 Enrichment functions proposed by the XFEM method [23, 24] for a–d weak discontinuities
($\psi = \psi_{abs}$) and e–h strong discontinuities ($\psi = \psi_{sign}$). The interface is considered to pass through
the middle of the element. a $M_1^{abs}$, b $M_2^{abs}$, c $M_3^{abs}$, d $M_4^{abs}$, e $M_1^{sign}$, f $M_2^{sign}$, g $M_3^{sign}$, h $M_4^{sign}$
expanding. Cold shuts arise when the metal flowing in a section solidifies and the
flow is blocked before the region is completely filled. Cold laps are formed when
the metal stream separates and subsequently joins in a region where the metal has
already solidified. Both cold shuts and cold laps are formed due to excessive heat
loss from the flowing metal to the mold.
The flow regime in the filling stage is turbulent in most cases. The standard turbulence
models require a very fine grid near the mold walls and it is rather impractical to
achieve such a fine grid in all the sections of a complex mold. Hence, only laminar
flow computations are usually done for mold filling predictions. Some authors have
used upwinding algorithms and/or turbulence models to obtain convergent results on
practically feasible grids. These models effectively increase the viscosity of the fluid
and make it possible to achieve stable solutions. The other important aspect of the filling
process is the boundary condition. If no-slip boundary conditions are enforced along
the wall, the predicted filling pattern is unrealistic as the metal flowing adjacent to the
wall tends to stick to the wall itself. This can be understood from the fact that the
convective velocity in the level set Eq. (1) becomes zero for these nodes. A slip boundary
condition is therefore prescribed for the fluid all along the mold wall. This may be seen as a simple
turbulence model and is feasible for complex mold geometries; the velocity profile
obtained in thin sections is almost uniform across the cross-section and thus mimics
the turbulent flow profile. The near-wall (viscous sublayer) region is not included in
the flow computation, as in any other high Reynolds number turbulent flow model.
Instead, tangential traction forces can be prescribed on the slip boundaries to model
the effect of wall friction. This stress can be tuned by empirical factors to mimic
the friction of the mold wall material and therefore obtain a more realistic flow pattern.
For the high pressure die casting process, in which the mold is usually made of steel and
has less friction, this scaling factor is chosen much smaller than for gravity die casting, in
which the mold is made of sand and is therefore much rougher.
Fig. 9 Filling of a turbine blade. Red color shows the aluminum-air interface at different instances.
The inlet velocity is 0.3 m/s and the blade is filled from the bottom. a t= 0.2 s, b t= 4 s, c t= 6 s, d
t= 9 s, e t= 12 s, f t= 15 s
Another important aspect in mold filling simulation is the air exits. It is known that
in sand molds the porous texture of the mold provides air exits. In a two-fluid
simulation in which the air inside the mold is modeled as incompressible, it is essential to
provide holes or exits in the mold wall to vent the air as the metal fills the cavity. This
is particularly important when coarse grids are used. Otherwise, air tends to
recirculate in thin sections, corners or closed zones and prevents the metal from reaching
those regions, so that unrealistically large pockets of air are entrapped. Once the
fluid has reached the vent zones they are considered closed to avoid excessive
loss of material.
Figure 9 shows various instants during the filling of a turbine blade. The level set is
used to detect the interface position and the two-fluid multi-scale stabilized
incompressible formulation (14) is used. Among the different enrichments presented in
Sect. 3, the discontinuous pressure gradient is applied. The filling material is aluminum
with a density of 2600 kg/m$^3$ and the air has a density of 1 kg/m$^3$. In order
to avoid unnecessary jumps in pressure the same dynamic viscosity, $\mu$, equal to
$10^{-5}$ kg/(m s) is considered for both fluids. Boundaries are slip and a friction force of
the following form is applied in the direction opposite to the velocity at the boundary nodes,
$$\mathbf{f}_{wall} = -C\,\rho\,|\mathbf{u}|\,(\mathbf{I} - \mathbf{n}\otimes\mathbf{n})\,\mathbf{u}$$
Fig. 10 Post-filling solidification of a mechanical part. Red zones are the liquid zones and time is
measured from the moment that filling is completed. a t = 7 s. b t = 12 s. c t = 14 s. d t = 16 s.
e t = 19 s. f t = 25 s
Here $C$ is a constant chosen in the range 0.005 to 0.05, $\mathbf{u}$ is the velocity at the
Gauss point and $\mathbf{n}$ is the exterior normal. The density $\rho$ for the mixed element is taken as
that of the metal.
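The wall friction term can be evaluated with a few lines of vector algebra. The sketch below assumes the force is the tangential projection of the velocity scaled by the constant, the density and the velocity magnitude, with a minus sign so that it opposes the motion; the function name and the sample velocity are illustrative assumptions, while the density and the constant lie in the ranges quoted in the text.

```python
import math

def wall_friction(u, n, rho, c_friction):
    """f_wall = -C rho |u| (I - n x n) u for a unit exterior normal n.

    u and n are 3-component sequences; the result is tangent to the wall
    and points against the tangential velocity.
    """
    u_dot_n = sum(ui * ni for ui, ni in zip(u, n))
    u_tangent = [ui - u_dot_n * ni for ui, ni in zip(u, n)]
    speed = math.sqrt(sum(ui * ui for ui in u))
    return [-c_friction * rho * speed * ut for ut in u_tangent]

u = (3.0, 4.0, 0.0)               # velocity at a boundary Gauss point (assumed)
n = (0.0, 0.0, 1.0)               # unit exterior normal of the wall
f = wall_friction(u, n, rho=2600.0, c_friction=0.01)
```

Since the slip condition already removes the normal velocity component, the projection mainly guards against a small residual normal component in the discrete velocity.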
One of the major components of the post-filling simulation is the solidification
analysis. Solidification is accompanied by the release of latent heat at the solid-liquid
interface. Heat transfer between the filling material and the mold causes cooling from
a liquid state until the material begins to solidify at the liquidus temperature, $T_l$, and
solidifies completely at the solidus temperature, $T_s$. The solid and liquid phases
are separated by a transition mixture region called the mushy zone. One way to model
the solidification process is the effective specific heat method, in which the energy
equation is written in terms of an effective specific heat $\bar{c}$:
$$\rho\,\bar{c}\,\frac{\partial T}{\partial t} = \nabla\cdot(k\,\nabla T)$$
where $T$ and $k$ are temperature and thermal conductivity, respectively. The parameter
$\bar{c}$ is the slope of the enthalpy-temperature curve and is equal to the specific heat, $c$,
in the solid and liquid regions. In the mushy zone its value is given by
$$\bar{c} = c - L\,\frac{d f_s}{dT}$$
Here $L$ is the latent heat and $f_s$ the solid fraction. When the solid fraction, $f_s$, is
assumed to vary linearly with temperature (i.e. $f_s = (T_l - T)/(T_l - T_s)$), the value
of $\bar{c}$ is constant in the mushy zone as given below:
$$\bar{c} = c + \frac{L}{T_l - T_s}$$
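The effective specific heat with a linear solid fraction reduces to a piecewise-constant function of temperature, as the following sketch shows. The material values are placeholders chosen only to make the arithmetic easy to follow, not data from the chapter, and the function name is an assumption.

```python
def effective_specific_heat(temperature, c, latent_heat, t_liquidus, t_solidus):
    """Effective specific heat for a solid fraction that is linear in T.

    Inside the mushy zone (t_solidus < T < t_liquidus) the latent heat is
    spread uniformly over the solidification interval; elsewhere the plain
    specific heat c is returned.
    """
    if t_solidus < temperature < t_liquidus:
        return c + latent_heat / (t_liquidus - t_solidus)
    return c

# Placeholder values: c in J/(kg K), L in J/kg, temperatures in K.
c_mushy = effective_specific_heat(870.0, 900.0, 390000.0, 900.0, 850.0)
c_liquid = effective_specific_heat(950.0, 900.0, 390000.0, 900.0, 850.0)
c_solid = effective_specific_heat(800.0, 900.0, 390000.0, 900.0, 850.0)
```

A narrow solidification interval makes the effective specific heat very large, so the time step has to be small enough that a node does not jump across the mushy zone in a single step.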
5 Conclusion
The main challenge in the application of the level set method to capture the interface
is mass conservation during both the convection and reinitialization steps. To
treat this problem different level set techniques have been developed. The particle
level set method, PLS, and the coupled level set/volume-of-fluid method, CLSVOF,
have mainly been developed for structured quadrilateral meshes and work quite
efficiently on this kind of mesh. On the other hand, the geometric mass-preserving
redistance method is developed for both structured and unstructured meshes and
is particularly useful when the redistancing error is dominant, i.e. for fine meshes
and small $\Delta t$. The discontinuous Galerkin level set method (DGLSM) is quadrature-free,
works on structured and unstructured meshes and, as the polynomial order
increases ($p > 4$), its superiority to the other methods is evident even on very coarse
meshes (see Table 1). The convective velocity in the level set equation comes
from the solution of the Navier-Stokes equations. When linear elements are used,
enrichment is necessary at the interface level to improve and stabilize the
velocity-pressure pairs at the interface. Various enrichment techniques have been developed to
treat kinks or discontinuities due to the jump in material properties at the element level. We
presented them as local and non-local, though both of them add enrichments in
the interface zone. Local enrichment adds DOFs that can be condensed at the element
level and therefore does not change the graph of the global matrices as the interface
moves, whereas the non-local enrichment, XFEM, adds global DOFs that change the graph
of the global matrices and can be quite costly for practical applications. In both
of these techniques stability issues have to be taken into account when the
interface passes very close to the element nodes.
References
1. Osher S, Shu C-W (1991) High-order essentially nonoscillatory schemes for Hamilton-Jacobi
equations. SIAM J Numer Anal 28(4):907–922
2. Osher S, Fedkiw R (2003) Level set methods and dynamic implicit surfaces, vol 153. Springer,
New York
3. Sussman M, Fatemi E (1999) An efficient, interface-preserving level set redistancing algorithm
and its application to interfacial incompressible fluid flow. SIAM J Sci Comput 20(4):1165–1191
4. Sethian JA (1999) Fast marching methods. SIAM Rev 41(2):199–235
5. Enright D, Fedkiw R, Ferziger J, Mitchell I (2002) A hybrid particle level set method for
improved interface capturing. J Comput Phys 183(1):83–116
6. Sussman M, Puckett EG (2000) A coupled level set and volume-of-fluid method for computing
3D and axisymmetric incompressible two-phase flows. J Comput Phys 162(2):301–337
7. Sussman M (2003) A second order coupled level set and volume-of-fluid method for computing
growth and collapse of vapor bubbles. J Comput Phys 187(1):110–136
8. Mut F, Buscaglia GC, Dari EA (2004) A new mass-conserving algorithm for level set
redistancing on unstructured meshes. Mecanica Computacional 23:1659–1678
9. Ausas RF, Dari EA, Buscaglia GC (2011) A geometric mass-preserving redistancing scheme
for the level set function. Int J Numer Meth Fluids 65(8):989–1010
10. Marchandise E, Remacle J-F, Chevaugeon N (2006) A quadrature-free discontinuous Galerkin
method for the level set equation. J Comput Phys 212(1):338–357
11. Idelsohn S, Mier-Torrecilla M, Oñate E (2009) Multi-fluid flows with the particle finite element
method. Comput Methods Appl Mech Eng 198(33):2750–2767
12. Kamran K, Rossi R, Oñate E, Idelsohn SR (2012) A compressible Lagrangian framework for
the simulation of the underwater implosion of large air bubbles. Comput Methods Appl Mech
Eng 225(1):210–225
13. Bonet J, Kulasegaram S (2000) Correction and stabilization of smooth particle hydrodynamics
methods with applications in metal forming simulations. Int J Numer Meth Eng 47(6):1189–1214
14. Sunitha N, Jansen KE, Lahey RT Jr (2005) Computation of incompressible bubble dynamics
with a stabilized finite element level set method. Comput Methods Appl Mech Eng
194(42):4565–4587
15. Kees CE, Akkerman I, Farthing MW, Bazilevs Y (2011) A conservative level set method suitable
for variable-order approximations and unstructured meshes. J Comput Phys 230(12):4536–4558
16. Rossi R, Larese A, Dadvand P, Oñate E (2012) An efficient edge-based level set finite element
method for free surface flow problems. Int J Numer Methods Fluids 33:737–766
17. Sussman M, Smereka P, Osher S (1994) A level set approach for computing solutions to
incompressible two-phase flow. J Comput Phys 114(1):146–159
18. Coppola-Owen AH, Codina R (2005) Improving Eulerian two-phase flow finite element
approximation with discontinuous gradient pressure shape functions. Int J Numer Methods Fluids
49(12):1287–1304
19. Ausas RF, Buscaglia GC, Idelsohn SR (2012) A new enrichment space for the treatment of
discontinuous pressures in multi-fluid flows. Int J Numer Methods Fluids 70(7):829–850
20. Fries T-P, Belytschko T (2006) The intrinsic XFEM: a method for arbitrary discontinuities without
additional unknowns. Int J Numer Meth Eng 68(13):1358–1385
21. Chessa J, Belytschko T (2003) An extended finite element method for two-phase fluids: flow
simulation and modeling. J Appl Mech 70(1):10–17
22. Groß S, Reusken A (2007) An extended pressure finite element space for two-phase
incompressible flows with surface tension. J Comput Phys 224(1):40–58
23. Rasthofer U, Henke F, Wall WA, Gravemeier V (2011) An extended residual-based variational
multiscale method for two-phase flow including surface tension. Comput Methods Appl Mech
Eng 200(21):1866–1876
24. Sauerland H, Fries T-P (2011) The extended finite element method for two-phase and
free-surface flows: a systematic study. J Comput Phys 230(9):3369–3390
25. Moës N, Cloirec M, Cartraud P, Remacle J-F (2003) A computational approach to handle
complex microstructure geometries. Comput Methods Appl Mech Eng 192(28):3163–3177
266 K. Kamran et al.
26. Reusken A (2008) Analysis of an extended pressure finite element space for two-phase
incompressible flows. Comput Vis Sci 11(4–6):293–305
27. Béchet E, Minnebo H, Moës N, Burgardt B (2005) Improved implementation and robustness
study of the X-FEM for stress analysis around cracks. Int J Numer Methods Eng 64(8):1033–1056
28. Dadvand P, Rossi R, Oñate E (2010) An object-oriented environment for developing finite
element codes for multi-disciplinary applications. Arch Comput Methods Eng 17(3):253–297
29. Dadvand P, Rossi R, Gil M, Martorell X, Cotela J, Juanpere E, Idelsohn SR, Oñate E
(2012) Migration of a generic multi-physics framework to HPC environments. Comput Fluids
80:301–309
Recent Advances in the Particle Finite
Element Method Towards More Complex Fluid
Flow Applications
Abstract This paper presents the state of the art of the Particle Finite Element Method
(PFEM), with emphasis on new ideas that extend its application beyond fluid-structure
interaction and multifluid problems and that help shorten the gap between engineering
design times and computational simulation times for general problems in which Eulerian
formulations have typically been chosen. To keep the long history of the method short,
the starting point here is the reformulation of the method to solve academic and real
problems in real time, or at least in drastically reduced computational times. The main
topics are the stability and accuracy of Lagrangian formulations compared with their
Eulerian counterparts, shown through several academic benchmarks, and a detailed
analysis of efficiency, which revealed that the original method needed some new
features. The former led to a new integration method called X-IVAS, and the latter
produced a new version of the method called PFEM in Fixed Mesh. Once the method had
shown good performance and the impact of the new features on the final efficiency was
clear, the latest developments extended the application of the new method to multifluids
and other complex fluid mechanics problems such as turbulence and reactive flows.
1 Introduction
limited to some special time intervals, the deforming mesh added another ingredient
to the time step selection: avoiding mesh inversion. This severe limitation, together
with those imposed by the non-linearities and those inherent to explicit schemes,
made the efficiency of the original PFEM a serious problem. More recently the method
has evolved thanks to progress in parallel mesh generation and remeshing, which
alleviates this limitation to some extent.
Despite these limitations, and despite the large community that normally employs
Eulerian codes, the Lagrangian formulation has some attractive features that are
worth reviewing here.
One of the most important is the absence of the convective term in the balance
equations, which turns the non-symmetric equations into symmetric, positive definite
ones. For the Navier-Stokes equations this has a by-product: the originally non-linear
momentum equation becomes linear. These two facts avoid the use of stabilization
terms, with the strong consequence that the typical numerical diffusion needed for
stabilization is not added. Without convective terms, for constant-coefficient
problems such as laminar, homogeneous fluid flow, and also for DNS (Direct
Numerical Simulation), the system matrices may be factorized once at the beginning and
reused in all time steps, with an important saving in CPU time. For convection-dominated
flows the time step in Eulerian formulations must be limited because of
non-linearities and stability. Lagrangian formulations, on the contrary, do not suffer
from this restriction provided the equations are integrated with good
accuracy. This is a key point that deserves much more attention.
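The saving obtained by factorizing once can be sketched as follows. This is only a minimal illustration under stated assumptions, not the PFEM implementation: it assumes a 1D backward-Euler diffusion step on a uniform grid with homogeneous Dirichlet ends, so that the system matrix is a constant tridiagonal matrix with entries `-r`, `1 + 2r`, `-r` (with `r` standing for the constant ratio diffusivity times time step over squared mesh size). The function names `factorize_tridiagonal`, `solve_factorized` and `diffuse` are hypothetical.

```python
def factorize_tridiagonal(lower, diag, upper):
    """LU-factorize a tridiagonal matrix once (Thomas algorithm, no pivoting).

    lower[i] multiplies x[i-1] in row i (lower[0] is unused);
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    """
    n = len(diag)
    pivots = [0.0] * n        # diagonal of U
    multipliers = [0.0] * n   # sub-diagonal of L
    pivots[0] = diag[0]
    for i in range(1, n):
        multipliers[i] = lower[i] / pivots[i - 1]
        pivots[i] = diag[i] - multipliers[i] * upper[i - 1]
    return multipliers, pivots, list(upper)


def solve_factorized(factors, rhs):
    """Solve A x = rhs reusing the stored factors (O(n) work per call)."""
    multipliers, pivots, upper = factors
    n = len(rhs)
    y = list(rhs)
    for i in range(1, n):                 # forward substitution, L y = rhs
        y[i] -= multipliers[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / pivots[-1]
    for i in range(n - 2, -1, -1):        # backward substitution, U x = y
        x[i] = (y[i] - upper[i] * x[i + 1]) / pivots[i]
    return x


def diffuse(T0, r, steps):
    """Backward-Euler diffusion of the interior values with ONE factorization."""
    n = len(T0)
    factors = factorize_tridiagonal([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n)
    T = list(T0)
    for _ in range(steps):
        T = solve_factorized(factors, T)  # the factors are reused every step
    return T
```

Because no convective term enters the matrix, nothing in `factors` depends on the evolving solution; in an Eulerian scheme with an implicit convective term the factorization would have to be redone whenever the velocity changes.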
In particular, PFEM has evolved considerably over the past few years, incorporating
new ideas that seek to enlarge the time step considerably in a stable and accurate way.
In this sense PFEM has incorporated a novel time integration scheme called
X-IVS and its extension X-IVAS. This way of integrating, based on following the
streamlines of the flow in the present time step, is to some extent a better way to
handle the non-linearities of the flow equations.
In this way it is possible to solve complex flow situations with significantly
larger time steps.
On the other hand, the fact that the information is carried by the particles, with the
mesh used only to compute secondary fields, gives the method high accuracy.
Therefore accuracy, robustness (stability) and efficiency are significantly
improved by the new ideas included in the latest version of PFEM, called
Fixed Mesh PFEM.
One of the main targets of this work is to show that Lagrangian formulations are
valuable not only for heterogeneous fluid flows with free surfaces. We will show that,
on the contrary, even for homogeneous fluid flows without free surfaces or internal
interfaces they yield accurate solutions while being competitively fast
compared with state-of-the-art Eulerian solvers. Another goal of this paper is
to update the state of the art of PFEM, combining the basis published before [6, 7, 10,
11, 22] with new findings that reveal further attractive features of the method
and make it a competitive tool for future high performance computations.
The paper starts with a mathematical review of the problems to be treated, writing
them in an Eulerian and a Lagrangian formalism.
270 N. M. Nigro et al.
Next, the time integration schemes are presented, which makes clear
the novelty introduced by X-IVS and X-IVAS.
This is followed by a section dedicated to two examples that inspired the
development of new ideas later applied to PFEM. These examples illustrate the
benefits of Lagrangian solvers: while Eulerian codes need many numerical artifacts
to solve them, Lagrangian ones solve them trivially. The next section presents the
two versions of PFEM. The first, called the Mobile Mesh version, is an extension of
the original PFEM with permanent remeshing and the X-IVAS time integration scheme.
After its pros and cons are shown, the rest of the section is devoted to the novel idea
of mixing two viewpoints, one based on particles and the other on a fixed background
mesh. This idea increased the efficiency very significantly. Although
some earlier attempts had been made to exploit the duality of particles and mesh, for
example [8], the authors were not aware of them when the idea was designed, and
the two ideas have only a few things in common.
The next section presents some details of the Fixed Mesh version of PFEM:
how to manage the particle inventory and how to share information between particles
and mesh. It is followed by a section focused on the treatment of the
diffusive terms. Contrary to what might be assumed a priori, the Lagrangian behavior
has been superior to the Eulerian one in precision, even though this part of the
calculation is of Eulerian nature. The last section shows some examples
solved numerically by PFEM, which demonstrate that in its present status
PFEM is able to solve turbulent flows, multifluid and multiphase flows, and general
multiphysics problems, among others. Finally, some conclusions are included.
In this section the emphasis is placed on the main features that give PFEM its
advantages over other methods. The problem of interest is the general transport
equation, which is widespread in engineering applications. Both the passive scalar
transport equation and the incompressible flow governed by the Navier-Stokes
equations are considered in the rest of the paper.
To understand the evolution of PFEM, the Eulerian and Lagrangian formulations
are introduced first.
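$$\frac{\partial T}{\partial t} + \nabla\cdot(\mathbf{v}\,T) = \nabla\cdot(\alpha\nabla T) + Q \tag{1}$$
where $T(\mathbf{x}, t)$ is the dependent variable (a passive scalar), for example the temperature,
$\mathbf{v}$ is the velocity vector, which is given data for this problem, and $\alpha$ is the diffusivity, with
$\nabla\cdot$ the divergence operator, $\nabla$ the gradient operator and $\partial/\partial t$ the temporal derivative. In
this problem $\mathbf{x}$ represents a fixed coordinate.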
Normally this equation is rewritten in the following form:
$$\frac{\partial T}{\partial t} + \mathbf{v}\cdot\nabla T = \nabla\cdot(\alpha\nabla T) + Q - (\nabla\cdot\mathbf{v})\,T \tag{2}$$
where the first-order term is split in two: one for the convective transport
and the other a source term generated by a velocity field that is not divergence-free.
An incompressible flow is divergence-free, and in this case this
source term vanishes.
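The split between the two forms of the transport equation is simply the product rule for the divergence. As a worked step, with $\alpha$ denoting the diffusivity (a symbol assumed here for illustration):

```latex
\nabla\cdot(\mathbf{v}\,T) = \mathbf{v}\cdot\nabla T + T\,\nabla\cdot\mathbf{v}
\quad\Longrightarrow\quad
\frac{\partial T}{\partial t} + \mathbf{v}\cdot\nabla T
  = \nabla\cdot(\alpha\nabla T) + Q - (\nabla\cdot\mathbf{v})\,T ,
\qquad\text{and for } \nabla\cdot\mathbf{v} = 0:\quad
\frac{\partial T}{\partial t} + \mathbf{v}\cdot\nabla T
  = \nabla\cdot(\alpha\nabla T) + Q .
```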
On the other hand, in the Lagrangian framework the same problem is written as:
$$\frac{DT}{Dt} = \nabla\cdot(\alpha\nabla T) + Q \tag{3}$$
where $\frac{DT}{Dt} = \frac{\partial T}{\partial t} + \mathbf{v}\cdot\nabla T$ is the material derivative. The convective term works like
a change of variables between a quantity measured in a fixed coordinate system and
the same quantity measured in a coordinate system that moves with the fluid velocity
$\mathbf{v}$. In this transformation the velocity field is absorbed into the dependent variable
itself, the unknown being $T = T(\mathbf{x}_p, t)$, with $\mathbf{x}_p$ the location of a fluid
parcel within the material volume. This location is itself another
unknown, so an additional equation must be solved:
$$\frac{D\mathbf{x}_p}{Dt} = \mathbf{v} \tag{4}$$
The complete Lagrangian system therefore reads:
$$
\begin{aligned}
\frac{DT}{Dt} &= \nabla\cdot(\alpha\nabla T) + Q \\
\frac{D\mathbf{x}_p}{Dt} &= \mathbf{v}
\end{aligned} \tag{5}
$$
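The other problem that deserves special attention in this paper is the fluid dynamics
of an incompressible viscous flow. It is well known that this problem is
governed by the Navier-Stokes equations, which express the balance of linear
momentum together with the continuity equation, or mass balance.
Written together in an Eulerian framework, both equations read:
$$
\begin{aligned}
\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}) &= 0 \\
\frac{\partial (\rho\,\mathbf{v})}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}\otimes\mathbf{v}) &= \nabla\cdot(\boldsymbol{\sigma}) + \mathbf{F}
\end{aligned} \tag{6}
$$
Here $\boldsymbol{\sigma}$ is the stress tensor, which may be split in two parts: the
spherical (isotropic) component, proportional to the fluid pressure, and
the deviatoric or viscous component, normally written as $\boldsymbol{\tau}$. The operator $\otimes$ denotes
the tensor or dyadic product of two vectors, $\mathbf{F}$ represents the external force,
for example gravity, and $\rho$ is the density. For incompressible flows the
density is constant; the continuity equation therefore becomes a constraint on the
velocity field:
$$\nabla\cdot(\mathbf{v}) = 0 \tag{7}$$
Applying this restriction in the linear momentum equation as well produces a
simplified, non-conservative version:
$$
\begin{aligned}
\nabla\cdot(\mathbf{v}) &= 0 \\
\rho\left(\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) &= \nabla\cdot(\boldsymbol{\sigma}) + \mathbf{F}
\end{aligned} \tag{8}
$$
whose Lagrangian counterpart is:
$$
\begin{aligned}
\nabla\cdot(\mathbf{v}) &= 0 \\
\rho\,\frac{D\mathbf{v}}{Dt} &= \nabla\cdot(\boldsymbol{\sigma}) + \mathbf{F} \\
\frac{D\mathbf{x}_p}{Dt} &= \mathbf{v}
\end{aligned} \tag{9}
$$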
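In this section the time integration of both frameworks, Eulerian and
Lagrangian, is presented.
For simplicity the scalar transport equation is treated first, leaving for some special
topics the extension to the vector system governing the dynamics of
single-phase incompressible viscous flow.
For the Eulerian framework represented by Eq. (2) the integration is normally done as
$$
\begin{aligned}
\int_{t^n}^{t^{n+1}} \frac{\partial T}{\partial t}\,dt &= \int_{t^n}^{t^{n+1}} \left(-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right) dt \\
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) &= \int_{t^n}^{t^{n+1}} \left(-\mathbf{v}\cdot\nabla T(\mathbf{x}) + \nabla\cdot(\alpha\nabla T(\mathbf{x})) + Q(\mathbf{x}, t)\right) dt \\
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) &= \left(-\mathbf{v}\cdot\nabla T(\mathbf{x}) + \nabla\cdot(\alpha\nabla T(\mathbf{x})) + Q(\mathbf{x}, t)\right)^{n+\theta} \Delta t
\end{aligned} \tag{10}
$$
For some $\theta \in (0, 1)$ the last expression in (10) gives the exact solution. As this
parameter is unknown and problem dependent, fixed values of $\theta$ are adopted:
$\theta = 0$ for explicit schemes, $\theta = 1$ for implicit schemes and $\theta = \frac{1}{2}$ for
Crank-Nicolson, among others.
$$
\begin{aligned}
\left(-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right)^{n+\theta} ={}& \theta\left[-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right]^{n+1} \\
&+ (1-\theta)\left[-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right]^{n}
\end{aligned} \tag{11}
$$
Replacing (11) in (10):
$$
\begin{aligned}
T^{n+1}(\mathbf{x}) - T^{n}(\mathbf{x}) ={}& \theta\left[-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right]^{n+1} \Delta t \\
&+ (1-\theta)\left[-\mathbf{v}\cdot\nabla T + \nabla\cdot(\alpha\nabla T) + Q\right]^{n} \Delta t
\end{aligned} \tag{12}
$$
The right-hand side of (12) is evaluated using only the information at the nodal
point $\mathbf{x}$ at the two ends of the time interval, $t^n$ and $t^{n+1} = t^n + \Delta t$.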
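The effect of the weighting parameter can be seen on a scalar model problem. The sketch below is an illustration only: it assumes the linear decay equation dT/dt = -k T as a stand-in for the full right-hand side, for which the weighted update can be solved in closed form. The names `theta_step`, `integrate` and `error_vs_exact` are hypothetical.

```python
import math


def theta_step(T, k, dt, theta):
    """One theta-scheme step for the model problem dT/dt = -k*T.

    T_new - T = dt * (theta * (-k*T_new) + (1 - theta) * (-k*T)),
    solved here for T_new in closed form.
    """
    return T * (1.0 - (1.0 - theta) * k * dt) / (1.0 + theta * k * dt)


def integrate(T0, k, dt, steps, theta):
    """Advance T0 over `steps` time steps of size dt."""
    T = T0
    for _ in range(steps):
        T = theta_step(T, k, dt, theta)
    return T


def error_vs_exact(k, dt, steps, theta):
    """Absolute error against the exact solution exp(-k*t) for T0 = 1."""
    exact = math.exp(-k * dt * steps)
    return abs(integrate(1.0, k, dt, steps, theta) - exact)
```

For example, `error_vs_exact(1.0, 0.1, 10, 0.5)` (Crank-Nicolson) is markedly smaller than the same call with `theta` equal to 0 (explicit) or 1 (implicit), which reflects the second-order accuracy of the centered choice.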
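For the Lagrangian framework a similar integration scheme is applied.
$$
\begin{aligned}
\int_{t^n}^{t^{n+1}} \frac{DT}{Dt}\,dt &= \int_{t^n}^{t^{n+1}} \left(\nabla\cdot(\alpha\nabla T) + Q\right)^{t} dt \\
\int_{t^n}^{t^{n+1}} \frac{D\mathbf{x}_p}{Dt}\,dt &= \int_{t^n}^{t^{n+1}} \mathbf{v}^{t}\,dt \\
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) &= \int_{t^n}^{t^{n+1}} \left[\nabla\cdot(\alpha\nabla T) + Q\right]^{t} dt \\
\mathbf{x}_p^{n+1} - \mathbf{x}_p^{n} &= \int_{t^n}^{t^{n+1}} \mathbf{v}^{t}\,dt
\end{aligned} \tag{13}
$$
With the same $\theta$ weighting as in (11):
$$
\begin{aligned}
T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) &= \int_{t^n}^{t^{n+1}} \left[\nabla\cdot(\alpha\nabla T) + Q\right]^{t} dt \\
&= \theta\left[\nabla\cdot(\alpha\nabla T) + Q\right]^{n+1} \Delta t + (1-\theta)\left[\nabla\cdot(\alpha\nabla T) + Q\right]^{n} \Delta t \\
\mathbf{x}_p^{n+1} - \mathbf{x}_p^{n} &= \theta\,\mathbf{v}^{n+1} \Delta t + (1-\theta)\,\mathbf{v}^{n} \Delta t
\end{aligned} \tag{14}
$$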
2.3.2 Navier-Stokes
The extension of the time discretization to the Navier-Stokes equations requires
resolving the pressure-velocity coupling.
The velocity unknown is naturally obtained from the vector momentum equation.
The pressure is the scalar unknown, and the continuity equation would be the natural
choice for it, but the pressure does not appear in that equation.
Moreover, the continuity equation is not a time evolution equation: it acts as a
constraint on the velocity field, selecting only those velocity fields that are
divergence-free. Several alternatives exist to derive an equation for the pressure.
Among them, segregated or projection methods such as the fractional step method are
good candidates. The idea behind the fractional step method is to write the
time-discretized momentum equation so as to first predict a velocity using the old
value of the pressure (the pressure at the previous time step) and then correct it with
the updated pressure, which is obtained by applying the divergence operator to the
correction equation, yielding a Poisson-like equation for the pressure.
In summary, the fractional step method may be written as:
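$$
\begin{aligned}
\mathbf{v}^{n+1} - \mathbf{v}^{n} &= \left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n+\theta} \Delta t \\
\mathbf{v}^{n+1} - \hat{\mathbf{v}}^{n+1} + \hat{\mathbf{v}}^{n+1} - \mathbf{v}^{n} &= \theta\left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n+1} \Delta t + (1-\theta)\left[\nabla\cdot\boldsymbol{\sigma} + \mathbf{f}\right]^{n} \Delta t \\
&= \theta\left[-\nabla p + \nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n+1} \Delta t + (1-\theta)\left[-\nabla p + \nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n} \Delta t \\
&= \theta\nabla(-p^{n+1} + p^{n})\Delta t - \theta\nabla p^{n}\Delta t + \theta\left[\nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n+1} \Delta t \\
&\quad + (1-\theta)\left[-\nabla p + \nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n} \Delta t \\
&= \theta\nabla(-p^{n+1} + p^{n})\Delta t - \nabla p^{n}\Delta t + \theta\left[\nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n+1} \Delta t \\
&\quad + (1-\theta)\left[\nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n} \Delta t \\
\underbrace{\mathbf{v}^{n+1} - \hat{\mathbf{v}}^{n+1}}_{\text{CORRECTOR}} + \underbrace{\hat{\mathbf{v}}^{n+1} - \mathbf{v}^{n}}_{\text{PREDICTOR}} &= \underbrace{\theta\nabla(-p^{n+1} + p^{n})\Delta t}_{\text{CORRECTOR}} \\
&\quad \underbrace{{} - \nabla p^{n}\Delta t + \theta\left[\nabla\cdot\boldsymbol{\tau}(\hat{\mathbf{v}}^{n+1}) + \mathbf{f}\right]^{n+1} \Delta t + (1-\theta)\left[\nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n} \Delta t}_{\text{PREDICTOR}}
\end{aligned} \tag{15}
$$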
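Splitting the predictor and corrector parts of the equation into two steps, and applying
the divergence to the corrector step with the constraint $\nabla\cdot\mathbf{v}^{n+1} = 0$:
PREDICTOR
$$\hat{\mathbf{v}}^{n+1} - \mathbf{v}^{n} = -\nabla p^{n}\Delta t + \theta\left[\nabla\cdot\boldsymbol{\tau}(\hat{\mathbf{v}}^{n+1}) + \mathbf{f}\right]^{n+1} \Delta t + (1-\theta)\left[\nabla\cdot\boldsymbol{\tau} + \mathbf{f}\right]^{n} \Delta t$$
PRESSURE EQUATION
$$
\begin{aligned}
\mathbf{v}^{n+1} - \hat{\mathbf{v}}^{n+1} &= -\theta\,\Delta t\,\nabla(p^{n+1} - p^{n}) \\
\nabla\cdot\hat{\mathbf{v}}^{n+1} &= \theta\,\Delta t\,\nabla\cdot(\nabla\delta p) = \theta\,\Delta t\,\nabla^2\delta p
\end{aligned}
$$
CORRECTOR
$$\mathbf{v}^{n+1} - \hat{\mathbf{v}}^{n+1} = -\theta\,\Delta t\,\nabla\delta p \tag{16}$$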
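with $\delta p = p^{n+1} - p^{n}$ and $\nabla^2$ the Laplacian operator. The pressure equation is
interposed between the predictor and corrector steps of the momentum equation,
as normally found in the algorithm.
For the Eulerian formulation the above three steps may be applied directly,
simply replacing $\mathbf{v}$ by $\mathbf{v}(\mathbf{x})$ and $p$ by $p(\mathbf{x})$.
For the Lagrangian formulation the algorithm may be summarized as:
PREDICTOR
Explicit part
$$
\begin{aligned}
\mathbf{x}_p^{n+1} &= \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau \\
\hat{\hat{\mathbf{v}}}^{n+1}(\mathbf{x}_p^{n+1}) - \mathbf{v}^{n}(\mathbf{x}_p^{n}) &= \int_{t^n}^{t^{n+1}} \left[-\nabla p^{n}(\mathbf{x}_p^{\tau}) + (1-\theta)\left(\nabla\cdot\boldsymbol{\tau}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{n}(\mathbf{x}_p^{\tau})\right)\right] d\tau
\end{aligned}
$$
Implicit part
$$\hat{\mathbf{v}}^{n+1}(\mathbf{x}) - \hat{\hat{\mathbf{v}}}^{n+1}(\mathbf{x}) = \theta\left[\nabla\cdot\boldsymbol{\tau}(\hat{\mathbf{v}}^{n+1}(\mathbf{x})) + \mathbf{f}^{n+1}\right] \Delta t$$
PRESSURE EQUATION (as in Eq. (16), on the mesh)
$$\nabla\cdot\hat{\mathbf{v}}^{n+1}(\mathbf{x}) = \theta\,\Delta t\,\nabla^2\delta p(\mathbf{x})$$
CORRECTOR
$$\mathbf{v}^{n+1}(\mathbf{x}_p) - \hat{\mathbf{v}}^{n+1}(\mathbf{x}_p) = -\theta\,\Delta t\,\nabla\delta p(\mathbf{x}_p) \tag{17}$$
It should be noted that throughout the Lagrangian procedure two sets of
coordinates are used, one for the particles ($\mathbf{x}_p$) and the other for the mesh ($\mathbf{x}$). The
relation between them is presented in a later section.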
In the scalar transport equation the velocity field is given data, known in both
its spatial and its time variation. It is therefore possible to include this
information explicitly in the Lagrangian formulation. With a highly accurate
particle-tracking integration scheme it is possible to resolve both the simple and the
complex pathlines normally present in fluid flows (Figs. 1 and 2).
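$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{\tau}(\mathbf{x}_p^{\tau})\,d\tau \tag{18}$$
Real trajectory:
$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{n}^{n+1} \mathbf{v}^{\tau}(\mathbf{x}_p^{\tau})\,d\tau . \tag{19}$$
Simple approximation:
$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \mathbf{v}^{n}(\mathbf{x}_p^{n})\,\Delta t \tag{20}$$
Streamlines approximation:
$$\mathbf{y}_p^{n+1} = \mathbf{x}_p^{n} + \sum_{i=0}^{N-1} \mathbf{v}^{n}\!\left(\mathbf{y}_p^{\,n+\frac{i}{N}}\right) \delta t \tag{21}$$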
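The difference between the simple approximation and the streamline approximation can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' code: it assumes a rigid rotation as the velocity field frozen at the start of the step, and it assumes that the sub-step equals the time step divided by the number of sub-steps N. The names `rotation_velocity`, `simple_step` and `streamline_step` are hypothetical.

```python
import math


def rotation_velocity(x, y, omega=1.0):
    """Rigid-body rotation about the origin, used as the frozen field v^n."""
    return -omega * y, omega * x


def simple_step(x, y, dt, velocity):
    """Simple approximation: one straight move with the velocity at the start."""
    vx, vy = velocity(x, y)
    return x + vx * dt, y + vy * dt


def streamline_step(x, y, dt, velocity, n_sub):
    """Streamline approximation: n_sub sub-steps through the frozen field.

    Assumption: the sub-step size is dt / n_sub.
    """
    sub_dt = dt / n_sub
    for _ in range(n_sub):
        vx, vy = velocity(x, y)          # velocity re-sampled along the path
        x, y = x + vx * sub_dt, y + vy * sub_dt
    return x, y
```

Starting from the point (1, 0) with a time step of 1, `simple_step` lands on (1, 1), well outside the unit circle, whereas `streamline_step` with 100 sub-steps stays within about half a percent of it. This is what allows large time steps without degrading the pathlines.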
2.4.1 Remarks
$$T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) = \int_{t^n}^{t^{n+1}} Q(\mathbf{x}_p^{\tau}, \tau)\,d\tau + \int_{t^n}^{t^{n+1}} \nabla\cdot\left(\alpha\nabla T(\mathbf{x}_p^{\tau}, \tau)\right) d\tau \tag{22}$$
In PFEM the last term on the right-hand side is approximated in the
following form:
$$T(\mathbf{x}_p^{n+1}, t^{n+1}) - T(\mathbf{x}_p^{n}, t^{n}) \approx \int_{t^n}^{t^{n+1}} Q(\mathbf{x}_p^{\tau}, \tau)\,d\tau + \dots \tag{23}$$
This last integration is only one possible choice among others: the explicit part
is solved simultaneously with the particle pathline computation, while the implicit
part is solved using the final position of the particles. Other choices may be
made to improve this computation, but for brevity they are not included
in this paper.
Comparing (12) with (23), the main difference between the two may be written as:
This difference is due to the error produced by the transformation between the two
frames, the Eulerian (fixed) one and the Lagrangian (moving) one. It
should tend to zero as the time step goes to zero. However, for the large time steps
normally needed to speed up the computation, evaluating the velocity
field at a fixed position ($\mathbf{x}$) at two different time levels, as the Eulerian formulation
does, may introduce large errors. Moreover, the spatial stabilization needed for
advection-dominated problems introduces additional errors that tend to dissipate
the solution much more than the physics does, especially at large time steps.
One of the main purposes of this development is its application to coupled
problems, where the flow and several other fields are solved simultaneously with
some sort of interaction. For brevity a simple case is presented here: the
coupling of an incompressible viscous flow with the transport of a scalar such as
temperature, using the Boussinesq approach.
$$
\begin{aligned}
\frac{D\mathbf{v}}{Dt} &= -\nabla p + \nabla\cdot\left(\mu\left(\nabla\mathbf{v}^{T} + \nabla\mathbf{v}\right)\right) + \mathbf{f} \\
\nabla\cdot\mathbf{v} &= 0
\end{aligned} \tag{25}
$$
Scalar-Transport Equations:
$$\frac{D\phi_j}{Dt} = \nabla\cdot(\alpha_j\nabla\phi_j) + Q_j \qquad \forall j \in (1 : N_{\text{fields}}) \tag{26}$$
f = g( c ) (27)
2.5.2 Discretization
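The key of the PFEM algorithm is to transport the information with particles that
follow the streamlines. Although the field $\mathbf{v}_p$ is not stationary, the streamlines are taken
as stationary within each time step ($\mathbf{v}_p^{n}$); the particle position follows that field and the
particle state is updated with the rate of change determined by the balance equation.
$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau \tag{28a}$$
$$\mathbf{v}_p^{n+1} = \mathbf{v}_p^{n} + \int_{t^n}^{t^{n+1}} \left(\mathbf{a}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{\tau}\right) d\tau \tag{28b}$$
$$\phi_p^{n+1} = \phi_p^{n} + \int_{t^n}^{t^{n+1}} \left(g^{n}(\mathbf{x}_p^{\tau}) + Q^{\tau}\right) d\tau \tag{28c}$$
where $\mathbf{a}^{n} = -\nabla p^{n} + \nabla\cdot\left(\mu\left(\nabla^{T}\mathbf{v}^{n} + \nabla\mathbf{v}^{n}\right)\right)$ and $g^{n} = \nabla\cdot(\alpha\nabla\phi^{n})$, which are nodal
variables.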
The following two examples were the starting point of the new ideas behind highly
accurate and stable convective transport:
• Pure advection of a passive scalar field.
• Inviscid transport of a vortex.
The first example demonstrated the advantages of Lagrangian formulations for a
purely advective problem. It is a Gaussian hill profile imposed as an initial
condition and advected by a pure rotation. For this problem the shape of the profile
and its amplitude should be conserved at all times.
The second is an extension of the first to a vector system
such as the Navier-Stokes equations. It consists of an initial vortex transported in an
inviscid flow. For this problem the intensity of the vortex should be conserved.
This well-known problem normally serves as a benchmark for the spatial stability
of Eulerian numerical schemes. Its first purpose is to show that no
spurious oscillations appear, and its second is to minimize the numerical
diffusion introduced by the stabilization schemes. The time integration
scheme is also responsible for extra numerical dissipation; the first-order explicit
($\theta = 0$) and implicit ($\theta = 1$) schemes are not recommended because of their high
dissipation, and Crank-Nicolson ($\theta = \frac{1}{2}$) is preferred in this sense. However, the
solution always shows a reduction of the original amplitude, which can be improved
only by reducing the mesh size and the time step.
Figs. 9.1–9.4 of [10] show how drastically the amplitude is reduced when large
Courant numbers are used with Eulerian formulations. Even when the spatial
stabilization is reduced to a minimum and a second-order time
integration is chosen, the numerical dissipation is clearly noticeable.
It can be reduced only by lowering the Courant number on finer meshes, but
never eliminated.
Lagrangian formulations, on the other hand, are better positioned for this kind
of problem if only the amplitude is observed: it remains exactly constant
throughout, regardless of the Courant number. With standard first-order integration,
however, a problem arises in the pathlines, which are shifted inwards or outwards
depending on whether the time integration is implicit or explicit. Only
second-order time integration reduces this pathology, but some sort of
iteration is needed. See Fig. 9.7 in [10].
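The inward or outward drift of first-order pathline integration can be reproduced with a short sketch. This is an illustration only, assuming a rigid rotation with angular velocity `w`; for such a linear field the implicit updates can be solved in closed form, whereas a general field would need the iteration mentioned above. The function names are hypothetical.

```python
import math


def explicit_euler(x, y, w, dt):
    """x_new = x + dt * v(x), with v = (-w*y, w*x): the radius grows."""
    return x - w * dt * y, y + w * dt * x


def implicit_euler(x, y, w, dt):
    """x_new = x + dt * v(x_new): a 2x2 linear system, solved in closed form.

    The radius shrinks by the factor 1 / sqrt(1 + (w*dt)**2) per step.
    """
    c = w * dt
    det = 1.0 + c * c
    return (x - c * y) / det, (y + c * x) / det


def crank_nicolson(x, y, w, dt):
    """x_new = x + dt/2 * (v(x) + v(x_new)): the radius is preserved."""
    c = 0.5 * w * dt
    det = 1.0 + c * c
    rx, ry = x - c * y, y + c * x        # explicit half of the update
    return (rx - c * ry) / det, (ry + c * rx) / det


def radius_after(step, n_steps=10, w=1.0, dt=0.5):
    """Radius reached from the point (1, 0) after n_steps of a given scheme."""
    x, y = 1.0, 0.0
    for _ in range(n_steps):
        x, y = step(x, y, w, dt)
    return math.hypot(x, y)
```

With the default arguments, `radius_after(explicit_euler)` is larger than 1 (the particle spirals outwards), `radius_after(implicit_euler)` is smaller than 1 (it spirals inwards), and `radius_after(crank_nicolson)` stays at 1 up to round-off.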
With the X-IVAS integration it is possible to fix both errors simultaneously,
producing a highly accurate solution regardless of the Courant number. This is also
shown in Fig. 9.7 (left) in [10].
A final remark on this result, obtained on such a basic example and
showing the capabilities of Lagrangian formulations over Eulerian ones for
convective transport, concerns some more accurate Eulerian schemes that are
currently being published for transporting signals without spurious diffusion.
Known as High Resolution Schemes [14, 16], these numerical methods have robust
control to suppress wiggles with a minimum amount of numerical dissipation.
According to the Godunov theorem [16] this is only possible with a nonlinear scheme.
Although this approach circumvents the drawbacks, it is important to realize that
Lagrangian formulations achieve the same or better results without any special
treatment, saving the extra cost normally incurred by such schemes.
Having found the benefits of Lagrangian formulations for transporting scalar
fields in a stable and accurate way, the next step was to extend them to
vector systems. Here the incompressible viscous fluid flow model was taken. The
equivalent example in this context is the transport of a vortex in an inviscid flow. It
is well known that, looking at the hyperbolic part of the whole system, neglecting
diffusion and not considering the role of the pressure, the problem looks like
the convection of vorticity waves. If a vortex ring is imposed as initial condition,
neglecting viscosity and with slip boundary conditions on the walls, the vortex should
conserve its kinetic energy as much as possible.
Figure 4 shows how the Eulerian and Lagrangian formulations transport this
vorticity. For both formulations the mesh is kept fixed and the simulation was run with
the same time step, at a Courant number high enough to highlight the
comparison of performance and precision. It can be concluded that the Lagrangian
formulation conserves energy better than its Eulerian counterpart, as shown by
the fact that the vortex is transported much better. This example confirms the
advantages of the Lagrangian formulation over the Eulerian one not only for advective
transport of scalar fields but also for non-linear vector fields (Fig. 4).
In its natural evolution PFEM employed only one mesh, built from a cloud
of points defined by the moving particle positions. There was a one-to-one relation
between mesh nodes and particles. At each time step the original PFEM
moved the nodes following the updated particle positions, as long as the mesh was
not deformed to the point of becoming invalid. Remeshing was used only
when the mesh deformation was so large that the time step suffered a drastic
reduction, making the computation too expensive. At that time remeshing
was avoided as far as possible because of its CPU cost. In summary, the stability of the
original PFEM was mainly affected by:
• the critical time step for explicit advective terms (Co < O(1));
• the critical time step for explicit diffusive terms (Fo < O(1));
• the critical time step for the deforming mesh, limited by the inversion of some elements
in the mesh (invalid mesh);
• non-linearities.
The sequence of the problems defined above may be summarized as:
• To solve only the passive scalar transport, Eqs. (28a) and (28c) are used. To
transport N passive scalars, only (28c) has to be solved for each of the N
variables.
• To solve the Navier-Stokes equation system, (28a) and (28b) are used, and p must
be calculated. A typical Fractional Step Method is used to solve the coupling
between the pressure and the velocity (see Eq. 16).
• To solve the coupled thermal and fluid flow problem (natural convection), (28a), (28b)
and (28c) are used, and a constitutive law for the buoyancy term must be added
(Boussinesq approach).
All these steps need a mesh update, and so remeshing returns to the scene.
In recent years much progress has been made in more efficient
mesh generation and regeneration exploiting parallelism, making remeshing
affordable. Permanent remeshing circumvents the severe time step restriction
produced by the invalid-mesh condition. In this sense PFEM has made
significant progress, increasing the time step while keeping the solution stable.
It is necessary to update the mesh states with the particle states. There are two
approaches, which have generated two versions of the method:
• Remesh the geometry with the new particle positions (particles ≡ nodes): PFEM
Mobile Mesh.
• Project the states from particles to nodes, keeping the mesh fixed: PFEM
Fixed Mesh.
The Mobile Mesh version has the following features:
• a one-to-one relation between particles and nodes;
• remeshing at each time step;
• permanent assembling, profiling and solving of the algebraic linear system.
Figure 5 shows how the particle motion changes the mesh definition at each time
step.
The first tests showed very good stability and accuracy, with
a drastic reduction in the CPU time needed to solve some benchmarks compared
with the original version of PFEM. This improved performance, added to the
possibility of using very large time steps, was the first evidence that permanent
remeshing and X-IVAS integration were two important numerical ingredients for
exploiting the good features of the Lagrangian formulation.
[Figure: profiling by stage (Streamline, Update, Remeshing, Assembly, Solve, TOTAL) for 1, 2 and 4 threads]
Even though the moving mesh version of PFEM has several advantages over its
Eulerian counterpart, it has some limitations in terms of efficiency. Mainly, the
permanent remeshing and the assembly and solution of implicit problems limit its
scalability. Figure 6 shows a profiling of the moving mesh version of PFEM.
As is evident from this figure, most of the time is spent in remeshing, assembling
and solving the implicit linear systems, giving a performance similar to that of
mesh-based methods because the particle update consumes only a small part of the total cost.
To reduce the computational cost added by these two stages a novel idea
was introduced: the Fixed Mesh version of PFEM.
This new method combines particles with a background mesh. The particles carry the
information throughout the process, and the mesh is used only for secondary
computations, those needed to update the particle positions and states. It is normally
understood as a hybrid or dual method in which the particles act as the master
of the computation and the mesh as the slave. The idea not only avoids
permanent remeshing: with a fixed background mesh all the implicit part of the
computation can be integrated with an important, favorable impact on
computational efficiency, since the matrix profile can be reused and, for linear
diffusion problems, so can its factorization.
This can be exploited only by the Lagrangian formulation, because in its Eulerian
counterpart the system matrix to be inverted always contains the convective term,
which is proportional to the changing unknown velocity. It is therefore not possible
to take advantage of it.
The Fixed Mesh version has the following features:
• a cloud of particles over a fixed background mesh;
• no remeshing is needed;
• projections and interpolations between particles and mesh nodes are needed;
• only one LU or Cholesky factorization is needed for implicit calculations.
Fig. 8 PFEM Fixed Mesh version: profiling. Case: flow around a cylinder, 2D. CPU: Intel
i7-2600k (4 cores)
Since particle computations are cheaper than mesh computations, PFEM is in a
very good position for high performance computing, especially regarding
scalability.
Finally, this section ends with a review of the two algorithms first
developed in the context of PFEM, one for scalar transport and the other for
incompressible viscous flow.
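$$\mathbf{a}^{n} = \nabla\cdot\boldsymbol{\sigma}^{n} = -\nabla p^{n} + \nabla\cdot\boldsymbol{\tau}^{n}$$
2. Evaluate the new particle positions and states following the streamlines:
$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^{n} + \int_{t^n}^{t^{n+1}} \mathbf{v}^{n}(\mathbf{x}_p^{\tau})\,d\tau$$
$$\mathbf{v}_p^{n+1} = \mathbf{v}_p^{n} + \int_{t^n}^{t^{n+1}} \left[\mathbf{a}^{n}(\mathbf{x}_p^{\tau}) + \mathbf{f}^{\tau}\right] d\tau$$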
As mentioned before, in the PFEM algorithm the state variables reside on the
particles after the streamline integration. Both the incompressibility constraint
(pressure) and the treatment of diffusion (viscous stress tensor) require the
information to be located on the grid. While in the Mobile Mesh version the
mesh is built from the particles themselves, in the Fixed Mesh approach particles
and grid are decoupled and a projection from particle states to nodal states must
be performed.
Different approaches are available to perform projections; for example, SPH or MLS
(Moving Least Squares) techniques could be used for the interpolation, as well as
weights based on the position over the underlying mesh. A brief review of
the techniques currently used for projection in PFEM is presented below (the equations
are written for the projection of a scalar; they are also valid for each component of a
vector state variable):
P
N j (x p ) p
p=1
j = (29)
P
N j (x p )
p=1
where P is the number of particles inside a certain region around the node j.
Mean weighted by Distance (P-2):
P
||x p x j ||2 p
p=1
j = (30)
P
||x p x j ||2
p=1
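$$\phi_j = \phi^h(\mathbf{x}_j) = \boldsymbol{\alpha}^T \mathbf{P}(\mathbf{x}_j) \tag{31}$$
where:
$\mathbf{P} = [1\ x\ x^2 \ldots]$ in 1D, $\mathbf{P} = [1\ x\ y\ xy\ x^2\ y^2 \ldots]$ in 2D (truncated at the required polynomial order) and $\boldsymbol{\alpha} = (\mathbf{X}^T \mathbf{W} \mathbf{X})^{-1} (\mathbf{X}^T \mathbf{W} \mathbf{y})$. Note that inverting the matrix in the calculation of $\boldsymbol{\alpha}$ requires $P \ge n$, where n is the number of terms used in $\mathbf{P}$.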
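The projections in Eqs. (29) and (31) can be sketched for a single node of a uniform 1D mesh. This is only an illustration under stated assumptions: the linear hat function plays the role of $N_j$, the same hat function is reused as the weight in the least-squares fit (the text does not specify the weights), and the function names are hypothetical.

```python
def hat(x, xj, h):
    """Linear FEM shape function of node xj on a uniform 1D mesh of size h."""
    return max(0.0, 1.0 - abs(x - xj) / h)

def project_shape_weighted(xj, h, xp, phip):
    """Eq. (29): mean of the particle values weighted by N_j(x_p)."""
    w = [hat(x, xj, h) for x in xp]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no particle in the support of node j")
    return sum(wi * fi for wi, fi in zip(w, phip)) / total

def project_least_squares(xj, h, xp, phip):
    """Eq. (31) with P = [1, x]: weighted linear fit evaluated at the node."""
    # Normal equations (X^T W X) alpha = X^T W y, written with x shifted to
    # the node so that the fitted value at the node is the first coefficient.
    s0 = s1 = s2 = b0 = b1 = 0.0
    for x, f in zip(xp, phip):
        w = hat(x, xj, h)
        d = x - xj
        s0 += w
        s1 += w * d
        s2 += w * d * d
        b0 += w * f
        b1 += w * d * f
    det = s0 * s2 - s1 * s1
    if abs(det) < 1e-14:
        raise ValueError("at least two distinct particles are needed (P >= n)")
    return (b0 * s2 - b1 * s1) / det

# Particles carrying a linear field phi = 2x + 1 around the node at x = 1.
xp = [0.8, 0.95, 1.1, 1.3]
phip = [2.0 * x + 1.0 for x in xp]
p1 = project_shape_weighted(1.0, 0.5, xp, phip)
p3 = project_least_squares(1.0, 0.5, xp, phip)
```

For this linear field the least-squares projection recovers the nodal value 3 exactly, while the weighted mean is slightly biased because the particles are not distributed symmetrically around the node.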
For accuracy reasons, each of the projection methods presented requires a certain number of particles in a certain region near each node. Some considerations must be taken into account: the region around the node must be defined precisely, and there is no guarantee that particles exist inside each region (especially when high Co numbers are used). New particles must then be created in these empty regions. In this section several algorithms addressing this issue are presented.
The first approach (S-1), used originally in PFEM, sets the state of a new particle by interpolating from the nodal states at the previous time step n:
$$\phi_p^{n+1}(\mathbf{x}_p^{n+1}) = \sum_j \lambda_j(\mathbf{x}_p^{n+1})\,\phi_j^n \tag{32}$$
where $\lambda_j$ are the area coordinates of the particle in the element. Another algorithm (S-2) searches for the state by following the streamline in the backward direction: it finds the location of the particle that, had it existed at the beginning of the time step, would have arrived at the seed position at the end of it:
$$\phi_p^{n+1}(\mathbf{x}_p^{n+1}) = \phi_p^n(\mathbf{x}_p^n) + \int_0^{\Delta t} g^n(\mathbf{x}^{n+\tau})\,d\tau \tag{33}$$
The usefulness of the backward integration to find the state of the new particle is shown in the next example: the pure-convection step problem, which is defined in Fig. 9. A boundary condition with a sharp discontinuity enters the domain, transported by a velocity field that is not aligned with the mesh. Figure 10 shows the results, projected on the mesh, obtained when the particles are created using criterion S-1 (left) and when their states are found using the backward integration criterion S-2 (right).
The results show that criterion S-1 fails for this example, which makes criterion S-2 a much better choice for seeding particles when necessary.
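The difference between the two seeding criteria can be sketched in one dimension for pure convection. This is a simplified illustration, not the implementation used for Fig. 10: the previous-step field is given as an exact function rather than through nodal interpolation, the velocity is uniform, and all names are hypothetical.

```python
def seed_backward(x_seed, dt, velocity, phi_prev, n_sub=10):
    """S-2 sketch for pure convection: trace the seed back over dt and
    sample the previous field there (no source term, so g = 0)."""
    h = dt / n_sub
    x = x_seed
    for _ in range(n_sub):
        x -= h * velocity(x)  # explicit Euler, backward in time
    return phi_prev(x)

def seed_interpolated(x_seed, phi_prev):
    """S-1 sketch: take the previous-step field at the seed position."""
    return phi_prev(x_seed)

U = 1.0
front = 0.30  # position of the discontinuity at time t^n

def phi_n(x):
    return 1.0 if x < front else 0.0

def velocity(x):
    return U

dt = 0.2       # the front moves to x = 0.50 during the step
x_seed = 0.40  # seed behind the new front: the transported value is 1
s1 = seed_interpolated(x_seed, phi_n)
s2 = seed_backward(x_seed, dt, velocity, phi_n)
```

In this setting S-1 returns the stale value 0 because it ignores the transport during the step, whereas the backward trace returns 1, which is consistent with the sharper front obtained with S-2.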
On the other hand, using a lot of particles increases computing times. Seeding is frequent during the simulation, and the number of particles inside the domain must be controlled for computational cost reasons. The removal action should therefore follow some criterion.
Although it is clear that a particle that leaves the geometry must be deleted, it is not clear when particles inside the geometry should be removed, or how. Removing particles decreases the computing time of the algorithm but also decreases the quality of the solution, because it indirectly introduces numerical diffusion.
In the first approach (R-1), particles are not removed unless two or more of them are in almost the same position. This approach gave accurate results for the pure-advective case of the rotating Gaussian signal [18].
A second approach (R-2) consists of requiring a minimum and a maximum number of particles in each sub-element, which must be enforced at each time step.
Fig. 10 Pure convection transport. Results for different strategies for setting the states of new particles. Left S-1. Right S-2
The sub-element i is the third of the triangle (the fourth of the tetrahedron in three dimensions) where the area coordinate corresponding to vertex i is larger than the others. The idea is that, if each node has enough particles around it, the state projected from the particles to the node will be accurate. However, as will be demonstrated in the next example, the continuous intrusion into the system can decrease the quality of the solution, especially in scalar problems. This criterion shows good results in solving incompressible flows.
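The sub-element test and the R-2 bookkeeping can be sketched for one triangle as follows. The helper names and the limits `n_min` and `n_max` are hypothetical; the sketch only classifies particles and reports which sub-elements would need seeding or thinning.

```python
def area_coordinates(p, a, b, c):
    """Area (barycentric) coordinates of point p in triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    l0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return l0, l1, 1.0 - l0 - l1

def sub_element(p, tri):
    """Index of the vertex whose area coordinate is the largest."""
    lam = area_coordinates(p, *tri)
    return max(range(3), key=lambda i: lam[i])

def count_per_sub_element(particles, tri):
    counts = [0, 0, 0]
    for p in particles:
        counts[sub_element(p, tri)] += 1
    return counts

def sub_elements_to_fix(counts, n_min, n_max):
    """Sub-elements below the minimum (to seed) and above the maximum (to thin)."""
    seed = [i for i, c in enumerate(counts) if c < n_min]
    thin = [i for i, c in enumerate(counts) if c > n_max]
    return seed, thin

tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
particles = [(0.1, 0.1), (0.05, 0.05), (0.2, 0.1), (0.7, 0.2)]
counts = count_per_sub_element(particles, tri)
seed, thin = sub_elements_to_fix(counts, n_min=1, n_max=2)
```

Here three particles fall in the sub-element of vertex 0, one in that of vertex 1 and none in that of vertex 2, so with limits [1; 2] the third sub-element would be seeded and the first one thinned.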
The example of the rotating Gaussian signal without diffusion shows that numerical diffusion becomes significant when particles are frequently created and removed following this criterion. The case consists of a Gaussian transported by a rotating flow without diffusion term, and the same polar mesh (4390 elements) is used in all runs. Figure 11 compares the maximum value over two laps. The best option for this problem is R-1 (Tmax old in Fig. 11), while R-2 requires a wide range ([min_subele; max_subele]) of particles to avoid spreading: comparing [1; 20] with [1; 5], the second option intrudes more into the system and the quality of the solution decreases.
Fig. 11 Rotating Gaussian. Evolution of the maximum for different particle update techniques (horizontal axis: Time [s])
R-1 is also the best regarding computing time: it requires 40 s (40,000 particles on average), whereas [1; 5] needs 50 s (46,000 particles on average) and [1; 20] needs 100 s (120,000 particles on average).
However, R-1 does not work well for the step problem (defined in the previous subsection) unless new particles are created using backward integration (criterion S-2). Also, because the projection algorithms developed search for particles in the sub-elements to send data to the nodes, creating new particles in empty elements does not ensure that there will be particles in the region around the node (its sub-elements), so another way of selecting the position of the new particles must be developed.
[Fig. 12, panel labels: Neighborhood; Region of the Node "j"; Internal Elements; Inlet Elements]
Although creating new particles on the nodes is the least intrusive way to maintain a good solution on the nodes, the number of particles will decrease as the simulation runs if the flow is not confined. Typically, the boundary condition is imposed on the inlet flow boundary. Therefore, when particles are created in the Inlet Elements and their states are found by backward integration, no error is committed, and more than one particle can be created. The position does not have to be on the nodes, and the state will not have error. This approach keeps approximately the same number of particles during the simulation, which preserves the accuracy of the method.
Particles are removed when two or more of them lie in a circle (2D) or sphere (3D) whose radius r is proportional to the element size h. The proportionality factor can take different values over the geometry, which provides a new tool to control the number of particles. A graphic representation is presented in Fig. 12c.
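The removal rule can be sketched with a simple greedy pass. Two points are assumptions of this sketch rather than statements of the text: the first particle encountered is the one that is kept, and the search is a plain quadratic loop instead of the element-based search a real implementation would use. The name `factor` stands for the proportionality factor between the radius and the element size.

```python
import math

def remove_close_particles(points, h, factor):
    """Keep a particle only if no previously kept one lies within r = factor * h."""
    r = factor * h
    kept = []
    for p in points:
        if all(math.dist(p, q) > r for q in kept):
            kept.append(p)
    return kept

points = [(0.0, 0.0), (0.01, 0.0), (0.5, 0.5), (0.5, 0.505), (1.0, 1.0)]
kept = remove_close_particles(points, h=0.1, factor=0.2)
```

Using a larger factor in regions where fewer particles are acceptable removes more of them there, which is the control mechanism described above.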
Figure 12a shows the neighbor elements and the region belonging to node j. There must be at least one particle in the gray zone to have a good projection; otherwise a new particle must be created at the position of the node, and its state found by backward integration. Figure 12b shows which elements are considered inlet elements; if they are empty, internal particles must be created in them.
Finally, this algorithm solves all the tests presented in this paper, including those where other approaches have been shown to fail: the Gaussian rigid rotation and the 2D step.
The last example tests this algorithm in the Navier-Stokes solver (PFEM Fixed Mesh). The case chosen is the flow around a cylinder, because it presents zones with different refinement and patches of inlet and outlet flow. The results are presented in Fig. 13a and b. Similar accuracy in amplitude and frequency can be observed, but R-2 gives a better definition of the force signals, especially for Cd. For more details see [10, 11].
6 Diffusive-Dominant Problems
When the problem is diffusive-dominant, the advantages of PFEM are not as clear as in the advective-dominant case. The explicit calculation of the diffusion traditionally used by PFEM is limited by the dimensionless Fourier number and, in some particular cases, the temporal change of the transported variables vanishes due to the shape of the solution itself. To relax these restrictions, a new model to calculate the diffusion is presented in Sect. 6.1. Several tests are presented to confirm the improvements in the solution.
Fig. 13 Lift coefficient and drag coefficient for flow around a cylinder solved using the new updating algorithm and compared with R-2 (called old in the graphic). Curves: Cl, Cl - old, Cd, Mittal; horizontal axis: Time [s]
Simulations that solve the diffusive term explicitly are restricted by Fo < 0.5. This is a strong limitation on the time step, especially on very refined meshes and in diffusive-dominant problems (where Fo > Co). Because explicit PFEM suffers from this stability constraint, the possibility of enlarging the time step may be lost when the flow locally becomes diffusive. As mentioned, some refinement is normally done in the vicinity of bodies to capture boundary layers and flow separation, so the Fourier number increases locally. Also, in some particular cases, the temporal change of the transported variables vanishes due to the shape of the solution itself: when the time step is such that the integral of the curvature of the function $\phi_j^n$ vanishes, the method will not apply diffusion to $\phi$, so the solution will be wrong. This case may occur for traveling waves with diffusion.
A new approach to solve the diffusive term is based on the theta method, which consists of discretizing the non-stationary variable using a weighted mixture of an explicit prediction and an implicit correction:
$$\frac{\phi^{n+1} - \phi^n}{\Delta t} = \theta\, g^{n+1} + (1-\theta)\, g^n \tag{34}$$
Doing a first step explicitly,
$$\frac{\hat{\phi}^{n+1} - \phi^n}{\Delta t} = (1-\theta)\, g^n \tag{35}$$
and doing the correction implicitly, that is, subtracting (35) from (34), it follows that
$$\frac{\phi^{n+1} - \hat{\phi}^{n+1}}{\Delta t} = \theta\, g^{n+1} \tag{36}$$
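The equivalence between the two-stage form (35)-(36) and the one-stage form (34) can be checked on the scalar model problem $g = -k\phi$. This model problem and the function names are assumptions made only for the illustration.

```python
def theta_two_stage(phi, k, dt, theta):
    """Explicit prediction, Eq. (35), followed by implicit correction, Eq. (36)."""
    phi_hat = phi + dt * (1.0 - theta) * (-k * phi)
    # Eq. (36): phi_new - phi_hat = dt * theta * (-k * phi_new), solved for phi_new.
    return phi_hat / (1.0 + theta * k * dt)

def theta_one_stage(phi, k, dt, theta):
    """Eq. (34) solved directly for the new value."""
    return phi * (1.0 - (1.0 - theta) * k * dt) / (1.0 + theta * k * dt)

two = theta_two_stage(1.0, k=3.0, dt=1.0, theta=0.5)
one = theta_one_stage(1.0, k=3.0, dt=1.0, theta=0.5)
explicit_gain = abs(theta_two_stage(1.0, k=3.0, dt=1.0, theta=0.0))
implicit_gain = abs(theta_two_stage(1.0, k=3.0, dt=1.0, theta=1.0))
```

With $k\,\Delta t = 3$ the fully explicit choice amplifies the solution (gain 2), while the fully implicit one damps it (gain 0.25), which is the motivation for the implicit correction.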
Algorithm 3 - Time Step PFEM Scalar Transport Explicit Diffusion - Implicit Correction
1. Calculate the scalar change rate on the nodes like a FEM:
$$\int_\Omega N\, g^n\, d\Omega = \int_\Omega N\, [\ldots]^n\, d\Omega + \int_\Omega N\, [\ldots]^n\, d\Omega$$
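$$[\mathbf{M} + \theta\,\Delta t\,\mathbf{K}]\,\delta\phi^{n+1} = -\theta\,\Delta t\,\mathbf{K}\,\hat{\phi}^{n+1} + \Delta t\,\mathbf{F}^{n+\frac{1}{2}} \tag{38}$$
where $\delta\phi^{n+1} = \phi^{n+1} - \hat{\phi}^{n+1}$, M is the mass matrix, K is the stiffness matrix and F is the load vector of a standard FEM discretization. Note that the matrix $[\mathbf{M} + \theta\,\Delta t\,\mathbf{K}]$ does not depend on time for $\mathbf{K} \ne \mathbf{K}(t)$ and constant $\Delta t$; it can then be factorized at the beginning of the computation and used as a preconditioner afterward, with a significant cpu-time reduction.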
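The benefit of a time-independent system matrix can be sketched on a 1D diffusion problem. Several simplifications are assumed here and are not taken from the text: a finite-difference Laplacian with lumped mass stands in for the FEM matrices, the matrix is $I + \theta\,Fo\,\mathrm{tridiag}(-1, 2, -1)$, and a small sawtooth ripple is added to the initial sine so that the unstable explicit behaviour appears deterministically. All names are hypothetical.

```python
import math

def factor_tridiagonal(lower, diag, upper):
    """LU factors of a tridiagonal matrix (Thomas algorithm, no pivoting)."""
    n = len(diag)
    mult = [0.0] * n
    pivot = [0.0] * n
    pivot[0] = diag[0]
    for i in range(1, n):
        mult[i] = lower[i] / pivot[i - 1]
        pivot[i] = diag[i] - mult[i] * upper[i - 1]
    return mult, pivot, list(upper)

def solve_factored(factors, rhs):
    """Forward and back substitution reusing the stored factors."""
    mult, pivot, upper = factors
    n = len(rhs)
    y = list(rhs)
    for i in range(1, n):
        y[i] -= mult[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / pivot[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - upper[i] * x[i + 1]) / pivot[i]
    return x

def second_difference(phi):
    """Second difference with homogeneous Dirichlet boundaries."""
    n = len(phi)
    out = []
    for i in range(n):
        left = phi[i - 1] if i > 0 else 0.0
        right = phi[i + 1] if i < n - 1 else 0.0
        out.append(left - 2.0 * phi[i] + right)
    return out

def run_theta_diffusion(theta, fo, n_nodes=49, n_steps=50, ripple=1e-6):
    dx = 1.0 / (n_nodes + 1)
    phi = [math.sin(math.pi * (i + 1) * dx) + ripple * (-1) ** i
           for i in range(n_nodes)]
    c = theta * fo
    # The matrix is constant in time, so it is factorized once, before the loop.
    factors = factor_tridiagonal([-c] * n_nodes,
                                 [1.0 + 2.0 * c] * n_nodes,
                                 [-c] * n_nodes)
    for _ in range(n_steps):
        lap = second_difference(phi)
        phi_hat = [p + (1.0 - theta) * fo * q for p, q in zip(phi, lap)]  # explicit part
        phi = solve_factored(factors, phi_hat)                            # implicit part
    return max(abs(p) for p in phi)

amp_implicit = run_theta_diffusion(theta=1.0, fo=2.0)
amp_explicit = run_theta_diffusion(theta=0.0, fo=2.0)
exact = math.exp(-math.pi ** 2 * 0.04)  # alpha * t = Fo * dx^2 * n_steps = 0.04
```

At Fo = 2, well above the explicit limit of 0.5, the implicit run follows the analytic decay closely while the explicit run diverges; each time step costs only two substitutions because the factorization is reused.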
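$$\frac{\partial\phi}{\partial t} = \alpha\,\frac{\partial^2\phi}{\partial x^2} \qquad x \in (0, 1) \tag{39}$$
$$\phi(x = 0, t) = \phi(x = L, t) = 0; \qquad t > 0 \tag{40}$$
$$\phi(x, t = 0) = \sin(kx); \qquad t = 0 \tag{41}$$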
$$\phi(x, t) = \sin(kx)\,e^{-\alpha k^2 t} \tag{42}$$
PFEM its stronghold. Efficient implicit schemes mean solving linear systems iteratively with good preconditioners.
In this section, a pathological case is presented. The explicit calculation of the diffusion updates the state variable with the integral of the second derivative of the variable itself, i.e. the integral of the curvature. When certain conditions are met, that integral vanishes and the explicit diffusion is null, generating wrong new states. An implicit calculation of the diffusion, however, solves that problem.
The problem consists of a sinusoidal wave transported by a velocity field v with a non-negligible diffusive term. The idea is to choose the numerical and physical parameters so that the integral of the curvature of the function vanishes at each time step.
If the length traveled by a particle is a multiple of the wavelength of the signal ($U\,\Delta t = m\lambda$), then $x_p^{n+1} = x_p^n + m\lambda$; hence, the rate of change of the variable (its curvature) will be null because
$$\frac{d^2\phi}{dx^2} = \frac{d^2}{dx^2}\left[\sin\left(\frac{2\pi}{\lambda}x\right)\right] = C\,\sin\left(\frac{2\pi}{\lambda}x\right) = g$$
and $\int_{x^n}^{x^{n+1}} g\,dx = 0$.
This pathological situation has a very low probability of occurring and is only present in Lagrangian formulations where advection and diffusion are weakly coupled.
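The problem to solve is:
$$\frac{\partial\phi}{\partial t} + U\,\frac{\partial\phi}{\partial x} = \alpha\,\frac{\partial^2\phi}{\partial x^2} \qquad x \in (0, \infty) \tag{43}$$
$$\phi(x = 0, t) = \sin(t) \qquad t > 0 \tag{44}$$
$$\phi(x, t = 0) = \sin\left(\frac{2\pi}{\lambda}x\right) \qquad t = 0 \tag{45}$$
where the advection and the diffusion can be solved analytically in an uncoupled way, which allows the decay of the signal to be determined:
$$\phi(x, t) = \sin\left(\frac{2\pi}{\lambda}\left[x - (x_0 + U t)\right]\right) e^{-\alpha\left(\frac{2\pi}{\lambda}\right)^2 t} \tag{46}$$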
Using the parameters:
U = 5,000
$\alpha$ = 1
$L_x$ = 10
$\lambda$ = 0.25
$\Delta x$ = 0.025
$\Delta t$ = 0.0001 (Fo = 0.16).
Figure 15 presents the results obtained with different values of $\theta$, compared with the analytic decay. The most accurate simulation is obtained with $\theta = 1$; with $\theta = 0$ no decay is observed, and with other values of $\theta$ intermediate solutions are obtained. Finally, a corrective step applied to an erroneous explicit prediction does not ensure accurate results, due to the bad performance of explicit schemes in this very special case.
It must be emphasized that the presented case rarely appears in non-academic problems, but it demonstrates another reason to choose an implicit calculation of the diffusion instead of an explicit one.
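The numbers of this pathological case can be checked with a short script. The variable names are hypothetical, and the identification of the listed values with the velocity U, the diffusivity, the wavelength, the mesh size and the time step is an assumption, although it is consistent with the quoted Fo = 0.16.

```python
import math

U, alpha, lam = 5000.0, 1.0, 0.25
dx, dt = 0.025, 0.0001

fourier = alpha * dt / dx ** 2        # Fourier number of the time step
wavelengths_per_step = U * dt / lam   # distance travelled per step, in wavelengths

def curvature(x):
    """Second derivative of sin(2*pi*x/lam)."""
    k = 2.0 * math.pi / lam
    return -k * k * math.sin(k * x)

def integral_along_path(x0, length, n=2000):
    """Midpoint-rule integral of the curvature from x0 to x0 + length."""
    h = length / n
    return h * sum(curvature(x0 + (i + 0.5) * h) for i in range(n))

# What an explicit streamline integration accumulates over one time step.
explicit_update = integral_along_path(0.07, U * dt)
# Decay factor of the analytic solution, Eq. (46), over the same time step.
decay_per_step = math.exp(-alpha * (2.0 * math.pi / lam) ** 2 * dt)
```

Each particle travels exactly two wavelengths per step, so the accumulated curvature is zero to rounding error, whereas the analytic amplitude should drop by roughly 6 % per step. This is the decay that the explicit scheme misses.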
The theta method can also be adopted to calculate the viscous effects in incompressible flow problems. Again, this strategy allows the maximum time step to be extended without the limitation of the Fourier number. The expressions are similar to those of the scalar case presented in Eqs. (34)–(36), but replacing $\phi$ with $\mathbf{v}$ and g with $\nabla\cdot\boldsymbol{\tau}_v$, where $\boldsymbol{\tau}_v = \mu\,(\nabla\mathbf{v} + \nabla\mathbf{v}^T)$.
Finally, the implicit correction of the viscous stress tensor is presented in the following algorithm:
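$$\mathbf{a}^n = -\nabla p^n + \nabla\cdot\boldsymbol{\tau}_v^n$$
2. Evaluate the new particle positions and states following the streamlines:
$$\mathbf{x}_p^{n+1} = \mathbf{x}_p^n + \int_{t^n}^{t^{n+1}} \mathbf{v}^n(\mathbf{x}_p^{\tau})\,d\tau$$
$$\hat{\mathbf{v}}_p^{n+1} = \mathbf{v}_p^n + \int_{t^n}^{t^{n+1}} \left[\mathbf{a}^n(\mathbf{x}_p^{\tau}) + \mathbf{f}^{n+\tau}\right] d\tau$$
3. Update the particle inventory
4. Project the state to the mesh:
$$\hat{\mathbf{v}}_j^{n+1} = \pi(\hat{\mathbf{v}}_p^{n+1})$$
5. Implicit correction:
$$\tilde{\mathbf{v}}_j^{n+1} = \hat{\mathbf{v}}_j^{n+1} + \theta\,\Delta t\,\nabla\cdot\boldsymbol{\tau}_j^{n+1}$$
6. Find the pressure by solving the Poisson equation system using FEM:
$$\nabla\cdot\tilde{\mathbf{v}}_j^{n+1} = \Delta t\,\nabla\cdot\left[\nabla(p^{n+1} - p^n)\right]$$
7. Update the velocity with the new pressure:
$$\mathbf{v}_j^{n+1} = \tilde{\mathbf{v}}_j^{n+1} - \Delta t\,\nabla(p^{n+1} - p^n)$$
$$\mathbf{v}_p^{n+1} = \tilde{\mathbf{v}}_p^{n+1} - \Delta t\,\pi^{-1}\left(\nabla(p^{n+1} - p^n)\right)$$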
The equation system for the implicit correction of the viscous diffusion can be solved using the same strategy as presented in (37) or (38). Also, for $\mu \ne \mu(t)$ and constant $\Delta t$ the matrix does not depend on time, so it can be factorized only once.
The transport of a Gaussian hill was used to demonstrate the ability of PFEM to solve a scalar transport problem [18]. This case also made evident the pathology that explicit Eulerian approaches suffer in solving a pure advective transport problem with CFL > 1. The problem consists of a Gaussian hill signal, used as initial condition, transported with physical diffusion. The velocity field is a flow rotating around the center of a square domain. The Gaussian signal is displaced from the center of the domain by a certain radius, and its shape means that the transported signal is initially non-zero only in a limited region of the domain. The signal should be transported following circular path lines. Figure 16 shows the problem definition.
This problem is taken from Donea and Huerta [4]. The initial condition is:
$$\phi(\mathbf{x}, 0) = \begin{cases} \frac{1}{4}\,(1 + \cos(\pi X))(1 + \cos(\pi Y)) & \text{if } X^2 + Y^2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{47}$$
triangles. Another finer mesh, called M2, has 100 × 100, and finally the finest mesh, called M3, has 500 × 500; the latter is used to define a reference solution for comparison, playing the role of an almost exact solution.
Fig. 17 Amplitude evolution on mesh M1 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are shown projected on the mesh
Figure 17 presents the evolution of the amplitude of the signal for the different simulations. It confirms again that Eulerian simulations introduce a lot of numerical diffusion, mainly due to the temporal scheme and the spatial stabilization. Note that PFEM simulations start with an amplitude $\phi_{max} \ne 1$; this is due to the projection of the maximum value on the particles onto the grid nodes. It is not an error, because the particle information does not suffer any type of numerical dissipation when it is transported. The only source of error is the projection of the information onto the mesh for secondary computations. To confirm this, the signal amplitude over the particles is shown in Fig. 18. Here, another important result arises: the maximum value over the particles is independent of the number of particles initially seeded. This fact depends on the particle removal limits chosen during the computation and also on the design of the projection operator.
Figure 19 presents the evolution of the signal amplitude for different simulations. In this case the maxima on the mesh are almost the same as the maxima over the particles. This is due to the finer mesh involved, which makes the projection operation less diffusive.
This section closes the paper with some brief details about new, promising and challenging applications of PFEM. In some sense all these applications involve some sort of coupled problem. The first is a typical natural convection heat transfer problem, in both laminar and turbulent regimes. The second is the well-known turbulence modeling benchmark proposed by Rodi and Ferziger, a cube mounted on a channel floor. The last is an example of multifluid flow, a step towards multiphase flow problems, which are in great demand by industry.
The problem presented deals with the two-dimensional flow with a Prandtl number Pr = 0.71 in a square cavity of side H = 1 m. The boundary conditions for the momentum equation are no-slip at all boundaries. The horizontal walls are insulated, and the vertical sides are at different temperatures $T_c < T < T_h$ ($\phi = T$ for natural convection problems). Figure 20 shows the geometry of the cavity. Simulations were carried out using a mesh of triangular elements with 100 × 100 nodes and refinement towards the walls. The wide range of Ra numbers (48) was obtained with a constant temperature difference of $\Delta T$ = 1 K, adjusting the thermal expansion coefficient to supply the desired Ra.
Fig. 18 Amplitude evolution on mesh M1 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are shown on the particles, not projected on the mesh
$$Ra = \frac{g\,\beta\,H^3\,(\phi_h - \phi_c)}{\nu\,\alpha} \tag{48}$$
where $\alpha$ is the thermal diffusivity corresponding to air with the above-mentioned Pr at standard temperature and pressure conditions.
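The adjustment of the thermal expansion coefficient for a target Ra can be sketched as follows. The kinematic viscosity of air used here is an assumed round value, not a figure from the text, and the function and constant names are hypothetical.

```python
def expansion_coefficient(ra, nu, alpha, g=9.81, height=1.0, delta_t=1.0):
    """Solve Ra = g * beta * H^3 * dT / (nu * alpha) for beta."""
    return ra * nu * alpha / (g * height ** 3 * delta_t)

PR = 0.71
NU_AIR = 1.5e-5           # assumed kinematic viscosity of air [m^2/s]
ALPHA_AIR = NU_AIR / PR   # thermal diffusivity obtained from Pr = nu / alpha

betas = {ra: expansion_coefficient(ra, NU_AIR, ALPHA_AIR)
         for ra in (1e3, 1e4, 1e6)}
```

Because H and the temperature difference are fixed, the coefficient scales linearly with the target Ra, so the three cases differ only in this one input.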
Fig. 19 Amplitude evolution on mesh M2 for Co = 0.5 (a) and for Co = 5 (b). PFEM results are shown on the mesh
This section provides a set of solutions at low Ra number. The quantities under study
are the following:
$u_{max}(\frac{1}{2})$: the maximum horizontal velocity on the vertical mid-plane of the cavity (together with its location).
$v_{max}(\frac{1}{2})$: the maximum vertical velocity on the horizontal mid-plane of the cavity (together with its location).
Table 1 Numerical solution for thermal square cavity with PFEM comparing with reference data
Ra      Data                PFEM2     Corzo [1]   Davis [2]
10^3    u_max (x = 0.5)     3.605     3.640       3.634
10^3    y_max (x = 0.5)     0.814     0.812       0.813
10^3    v_max (y = 0.5)     3.650     3.700       3.679
10^3    x_max (y = 0.5)     0.183     0.177       0.179
10^4    u_max (x = 0.5)     15.982    16.281      16.182
10^4    y_max (x = 0.5)     0.824     0.822       0.823
10^4    v_max (y = 0.5)     19.378    19.547      19.509
10^4    x_max (y = 0.5)     0.116     0.123       0.120
10^6    u_max (x = 0.5)     64.483    64.558      65.330
10^6    y_max (x = 0.5)     0.845     0.851       0.851
10^6    v_max (y = 0.5)     218.054   221.572     216.750
10^6    x_max (y = 0.5)     0.037     0.067       0.039
Table 1 shows PFEM results for Ra = $10^3$, $10^4$ and $10^6$ compared with the solutions of [1, 2]. The excellent agreement with the reference data for both the momentum and energy equations proves the accuracy of this approach in this low Ra number range. The horizontal velocity component in the vertical mid-plane is shown in Fig. 21. It is worth noting that, as the Ra number increases, the boundary layer becomes thinner and the maximum velocity values get closer to the walls. Finally, Fig. 22 presents the temperature profiles for the three cases.
[Fig. 21, panels (a), (b) and (c): Horizontal Velocity versus y; curves: PFEM-2, OpenFoam, G.V. Davis]
The schematic model for the problem is shown in Fig. 23. The cubic cavity has sides of one meter (aspect ratio of unity) and is filled with air as the working fluid. The Prandtl number is fixed at Pr = 0.71. All surrounding walls are rigid and impermeable. The vertical walls located at x = 0 and x = 1 are kept isothermal but at different temperatures, $T_h$ and $T_c$, respectively. The buoyancy force due to gravity acts downwards (i.e., in the negative z-direction).
For the present range of Ra numbers, solutions were obtained on a mesh with 81,000 tetrahedral elements and around eighteen thousand nodes, with refinement towards the walls. The following characteristic quantities are presented:
$u_{max}(\frac{1}{2})$: the maximum horizontal velocity in the x-direction on the center line (x = 0.5, y = 0.5) of the cavity and its location.
$w_{max}(\frac{1}{2})$: the maximum vertical velocity in the z-direction on the center line (y = 0.5, z = 0.5) of the cavity and its location.
Table 2 Numerical solution for thermal cubic cavity with PFEM comparing with reference data
Ra      Data                    PFEM      Wakashima [24]   Fusegi [5]
10^4    u_max (x = y = 0.5)     0.1978    0.1989           0.2013
10^4    z_max (x = y = 0.5)     0.8460    0.8250           0.8167
10^4    w_max (y = z = 0.5)     0.2190    0.2211           0.2252
10^4    x_max (y = z = 0.5)     0.1260    0.1253           0.1167
10^5    u_max (x = y = 0.5)     0.1409    0.1423           0.1468
10^5    z_max (x = y = 0.5)     0.8460    0.8500           0.8547
10^5    w_max (y = z = 0.5)     0.2359    0.2407           0.2471
10^5    x_max (y = z = 0.5)     0.0680    0.0751           0.0647
10^6    u_max (x = y = 0.5)     0.0766    0.0813           0.0842
10^6    z_max (x = y = 0.5)     0.8570    0.8500           0.8557
10^6    w_max (y = z = 0.5)     0.2897    0.2382           0.2588
10^6    x_max (y = z = 0.5)     0.0280    0.0500           0.0331
Table 2 shows PFEM results for Ra = $10^4$, $10^5$ and $10^6$ compared with the solutions of [5, 24]. Finally, Fig. 24 presents a wireframe of the mesh used, with slices at the mid-planes y = 0.5 and z = 0.5.
Turbulent flows around three-dimensional obstacles are common in nature and occur
in many applications including flow around tall buildings, vehicles and computer
chips. Understanding and predicting the properties of these flows are necessary for
[Figure: geometry of the cube mounted on the channel floor; labels: H = 2h, Flow, h, 3h, 6h]
Fig. 26 The streamlines on the symmetry plane at Re = 40,000. a shows the experimental result of Martinuzzi and Tropea [17], b the result of the LES simulation in [21], and c and d present the results of PFEM using a coarse and a finer mesh, respectively
Summary of Results
Large eddy simulations were performed at Re = 40,000. Figure 26 shows a comparison of time-averaged streamlines on the symmetry plane. The overall prediction of the separation region on the roof and behind the obstacle is quite good even using coarse grids. Shah and Ferziger commented that in their simulations the stagnation point was located high on the front face, and the same conclusion is reached in this work. Fluid striking the body above the stagnation point goes over the obstacle; using the finer mesh we find a solution where it reattaches on the roof, something that Shah could not obtain. Using the finer mesh, the rear recirculation region is not closed, but streamlines originating upstream of the obstacle do not enter this region; fluid enters the rear recirculation region from the sides. Near the top of the recirculation region we find the head of the arch vortex. Results using the coarse mesh are not accurate, mainly behind the obstacle.
Figure 27 presents the time-averaged streamlines on the floor of the channel. The streamline patterns are consistent with those observed by Martinuzzi and Tropea [17]. These streamlines, which may be viewed as skin friction lines, show the complexity of this 3-D flow. In reference [21], the primary separation occurs at a saddle point located about one obstacle height (1.05 h) ahead of the obstacle (experimental value = 1.026), whereas the PFEM simulation gives approximately 0.89 h with the coarse grid and 0.92 h with the finer grid. The separation region wraps around the obstacle and forms a strong horseshoe vortex. The converging and diverging streamlines that mark the extent of this vortex are regions of strong upwash and downwash. This horseshoe is better represented by PFEM using the finer mesh, whereas with the coarse mesh the streamlines close too much behind the obstacle. Instantaneous pictures of the flow (not presented here) show that the horseshoe vortex is, in fact, highly intermittent; an intact structure is almost never found in these snapshots. The mean flow on the side faces is entirely reversed. In Shah and Ferziger, the primary reattachment length of 1.65 h agrees well with the experimental value of 1.61 h, whereas the PFEM reattachment is found at 1.8 h.
In the work of Shah and Ferziger [21], it is said that both the primary separation point ahead of the obstacle and the rear reattachment points are singular points (zero skin friction) where the so-called separation lines begin and end. They also comment that the owl-face shaped streamlines in the rear recirculation zone of the obstacle correspond to the base of the arch vortex. The arch vortex is formed by quasi-periodic vortex shedding from the upstream vertical corners that resembles a von Karman street. This intact arch vortex exists only in the mean flow and is an artifact of averaging. PFEM reproduces this behavior only approximately and, strangely, the results are more accurate with the coarse mesh. It must be noted that neither grid is good enough near the floor of the channel, so a better refinement is required to reach the same quality of results as Shah and Ferziger.
Efficiency
In this section the scalability of the current implementation of PFEM is presented. The above-mentioned test, using the finer grid, was carried out on an Infiniband interconnected cluster, which has dual-socket nodes with Intel Xeon E5-2600 CPUs and 64 GB RAM. The interconnection is IB-QDR at 40 Gbps (Fig. 27).
Table 3 CPU-times comparison in seconds between different PFEM2 algorithms and OpenFOAM for one, two and four cores
Cores                           1x (s)   2x (s)   4x (s)
OpenFOAM                        754      402      286
PFEM2 moving mesh (CIMNE)       484      371      326
PFEM2 fixed mesh (CIMNE)        284      179      138
PFEM2 fixed mesh (CIMEC)        330      176      99
Figure 28 presents the scalability of each PFEM stage and of the entire simulation using an Eulerian weighting strategy, which gives approximately the same number of degrees of freedom in each partition. The efficiency on the Infiniband cluster is still good when running with 32 cores, reaching a global speed-up of S32 ≈ 26x. With more cores the efficiency decays because there is not enough work for each process to outweigh the communication time.
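The quoted speed-up can be turned into a parallel efficiency and an estimated serial fraction. The Karp-Flatt metric used below is a standard diagnostic introduced here for illustration; it is not part of the original analysis, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, n_cores):
    """Fraction of the ideal linear speed-up that is actually achieved."""
    return speedup / n_cores

def karp_flatt(speedup, n_cores):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / n_cores) / (1.0 - 1.0 / n_cores)

efficiency_32 = parallel_efficiency(26.0, 32)
serial_fraction_32 = karp_flatt(26.0, 32)
```

A speed-up of about 26 on 32 cores corresponds to an efficiency of about 81 % and an effective serial fraction below 1 %, which includes communication overhead.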
7.3 Multifluids
In this section a comparison with the results of the sloshing test is presented. For the experiment, the same mesh and configuration as those presented in Idelsohn et al. [9] have been used (Figs. 29 and 30).
Table 3 shows the computational time necessary to simulate 1 s on an Intel(R) Core(TM) i7-3820 CPU at 3.60 GHz with OpenFOAM and with the PFEM2 versions of the International Center for Numerical Methods in Engineering (CIMNE).
On the other hand, the test of the implementation presented in this paper was executed on an Intel(R) Core(TM) i5-3230M CPU at 2.60 GHz. To match the heterogeneous platforms, a benchmarking factor of 4,007/9,010 (extracted from the web page http://cpubenchmark.net/high_end_cpus.html) is used, and the final values are presented in the table.
The reported values show that, for the settings described, PFEM with fixed mesh is more than 2× faster than OpenFOAM.
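The comparison can be reproduced from the figures of Table 3. The direction in which the benchmarking factor is applied (measured time on the slower machine multiplied by 4,007/9,010) is an assumption of this sketch, and the function names are hypothetical.

```python
TIMES = {  # seconds per simulated second, from Table 3 (1, 2 and 4 cores)
    "OpenFOAM": (754, 402, 286),
    "PFEM2 moving mesh (CIMNE)": (484, 371, 326),
    "PFEM2 fixed mesh (CIMNE)": (284, 179, 138),
    "PFEM2 fixed mesh (CIMEC)": (330, 176, 99),
}
FACTOR = 4007 / 9010  # benchmark ratio quoted in the text

def normalise(measured_seconds, factor=FACTOR):
    """Scale a time measured on the slower machine to the reference machine."""
    return measured_seconds * factor

def ratio_to_openfoam(name, column):
    """How many times faster than OpenFOAM a solver is, for one column of Table 3."""
    return TIMES["OpenFOAM"][column] / TIMES[name][column]

fixed_mesh_ratios = [ratio_to_openfoam(name, col)
                     for name in ("PFEM2 fixed mesh (CIMNE)", "PFEM2 fixed mesh (CIMEC)")
                     for col in range(3)]
```

All six ratios for the fixed mesh versions are above 2, ranging from about 2.1 to about 2.9, which is consistent with the statement above.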
In this section a comparison with the results of a dam-break test is presented. For the experiment, the same mesh and configuration as those presented in Idelsohn et al. [9] have been used.
Fig. 27 The streamlines in a plane near the floor at Re = 40,000. a shows the numerical result of Shah and Ferziger [21], and b and c present the results of PFEM using a coarse and a finer mesh, respectively
Fig. 28 Speed-up over an Infiniband cluster. Case: flow around a mounted cube in 3D (curves: linear, acceleration, X-IVAS, projection, poisson, correction, Total; axes: Sn versus #processors)
Fig. 29 Interface relative height at the vertical walls (left side and right side) for PFEM fixed mesh (axes: wave height left [m] and wave height right [m] versus Time [s])
Fig. 30 From left to right and top to bottom: sloshing of two immiscible fluids with a large jump in the density: snapshots at different time steps (t = 0.55, 1.15, 1.7, 2.3, 2.75, 3.35 and 5 s)
Fig. 31 From left to right and top to bottom: snapshots of the dam break without obstacle at t = 0, 0.2, 0.4, 0.6, 0.8 and 1 s
8 Conclusions
In this paper, a review of PFEM and its present developments have been presented. In recent years much effort has been devoted to improving the performance of this method in order to make it competitive with the solvers most widely used in computational mechanics. Moreover, with the recent findings that have emerged, the method is thought to be at the gates of a paradigm shift in the way simulations are performed, especially considering that the community is demanding methods that are commensurate with the needs of engineering design.
While this paper does not delve into the numerical analysis, it establishes the basis for doing so in the next few years, with the aim of demonstrating mathematically the merits of Lagrangian methods of this type compared with the very commonly used Eulerian methods.
Finally, the last goal has been to show that, in addition to the well-known strengths of the method in solving problems with heterogeneous flows, it is also possible to solve complex homogeneous flows, as in the case of turbulence and cases with thermal coupling.
Acknowledgments This work was partially supported by the European Research Council under the Advanced Grant: ERC-2009-AdG Real Time Computational Mechanics Techniques for Multi-Fluid Problems. Norberto Nigro and Juan Gimenez want to thank CONICET, Universidad Nacional del Litoral and ANPCyT for their financial support (grants PICT 1645 BID (2008), CAI+D 65-333 (2009)). Thanks also to Santiago Marquez Damian for his invaluable assistance in showing the merits of PFEM vis-à-vis other available solvers, of which Santiago is an expert user; to Eugenio Oñate and CIMNE for their unconditional support and his teachings throughout his scientific life; to Pedro Morin and Marta Bergallo for interesting mathematical discussions; and to Nestor Calvo and Pablo Novara for sharing some discussions about mesh generation and computational geometry.
References
9. Idelsohn S, Marti JM, Becker P, Oñate E (2014) Analysis of multi-fluid flows with large time-steps using the particle finite element method. Int J Num Meth in Fluids (in press)
10. Idelsohn S, Nigro NM, Limache A, Oñate E (2012) Large time-step explicit integration method for solving problems with dominant convection. Comput Methods Appl Mech Eng 217–220:168–185
11. Idelsohn SR, Nigro NM, Gimenez JM, Rossi R, Marti J (2013) A fast and accurate method to solve the incompressible Navier-Stokes equations. Eng Comput 30(2):197–222
12. Idelsohn SR, Oñate E, Calvo N, Del Pin F (2003) The meshless finite element method. Int J Num Meth Eng 58(6):893–912
13. Idelsohn SR, Oñate E, Del Pin F (2004) The particle finite element method: a powerful tool to solve incompressible flows with free-surfaces and breaking waves. Int J Numer Meth 61:964–989
14. Jasak H (1996) Error analysis and estimation for the finite volume method with applications to fluid flows. Ph.D. Thesis, London
15. Lakehal D, Rodi W (1997) Calculation of the flow past a surface-mounted cube with two-layer turbulence models. J Wind Eng Ind Aerodyn 67:65–78
16. Leveque R (2002) Finite volume methods for hyperbolic problems, 1st edn. Cambridge University Press, Cambridge
17. Martinuzzi R, Tropea C (1993) The flow around surface-mounted, prismatic obstacles placed in a fully developed channel flow. J Fluids Eng 115:85–92
18. Nigro N, Gimenez J, Limache A, Idelsohn S, Oñate E, Calvo N, Novara P, Morin P (2011) A new approach to solve incompressible Navier-Stokes equation using a particle method. Mecánica Computacional XXX
19. Oñate E, Idelsohn SR, Del Pin F, Aubry R (2004) The particle finite element method, an overview. Int J Comput Meth 1:267–307
20. Rodi W, Ferziger J, Breuer M, Pourquie M (1997) Status of large eddy simulation: results of a workshop. Trans ASME J Fluid Eng 119:248–262
21. Shah KB, Ferziger JH (1997) A fluid mechanician's view of wind engineering: large eddy simulation of flow past a cubic obstacle. J Wind Eng Ind Aerodyn 67&68:211–224
22. Sklar DM, Gimenez JM, Nigro NM, Idelsohn SR (2012) Thermal coupling in particle finite element method - second generation. Mecánica Computacional XXXI:4143–4152
23. Stam J (1999) Stable fluids. In: SIGGRAPH 99 Conference Proceedings, Annual Conference Series, pp 121–128
24. Wakashima S, Saitoh T (2004) Benchmark solutions for natural convection in a cubic cavity using the high-order time-space method. Int J Heat Mass Transfer 47:853–864
Part VI
Fluid-Structure Interactions Problems
Computational Engineering Analysis and Design
with ALE-VMS and ST Methods
Abstract Flows with moving interfaces, which include fluid–structure interaction (FSI) and
quite a few other classes of problems, have an important place in engineering analysis
and design, and pose significant computational challenges. Bringing solution
and analysis to them motivated the Deforming-Spatial-Domain/Stabilized Space–Time
(DSD/SST) method and also the variational multiscale version of the Arbitrary
Lagrangian–Eulerian method (ALE-VMS). These two methods and their improved
versions have been applied to a diverse set of challenging problems with a common
core computational technology need. The classes of problems solved include
free-surface and two-fluid flows, fluid–object and fluid–particle interaction, FSI, and
flows with solid surfaces in fast, linear or rotational relative motion. Some of the most
challenging FSI problems, including parachute FSI, wind-turbine FSI and arterial
FSI, are being solved and analyzed with the DSD/SST and ALE-VMS methods as
core technologies. Better accuracy and improved turbulence modeling were brought
with the recently-introduced VMS version of the DSD/SST method, which is called
DSD/SST-VMST (also ST-VMS). In specific classes of problems, such as parachute
K. Takizawa (B)
Department of Modern Mechanical Engineering and Waseda Institute for Advanced Study,
Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
e-mail: [email protected]
Y. Bazilevs
Structural Engineering, University of California, San Diego, 9500 Gilman Drive,
La Jolla, CA 92093, USA
T. E. Tezduyar · N. Kostov · S. McIntyre
Mechanical Engineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
M.-C. Hsu
Department of Mechanical Engineering, Iowa State University,
2025 Black Engineering, Ames, IA 50011, USA
O. Øiseth · K. M. Mathisen
Department of Structural Engineering, Norwegian University of Science and Technology,
7491 Trondheim, Norway
1 Introduction
accuracy in the computations [4, 5, 48], and the desired accuracy can be attained
with larger time steps, but there are positive consequences beyond that. The ST
context provides us better accuracy and efficiency in temporal representation of
the motion and deformation of the moving interfaces and volume meshes, and better
efficiency in remeshing. This has been demonstrated in a number of 3D computations,
specifically, flapping-wing aerodynamics [32–34, 83], separation aerodynamics of
spacecraft [56], and wind-turbine aerodynamics [37].
There are some advantages in using a discontinuous temporal representation in
ST computations. For a given order of temporal representation, we can reach a higher
order accuracy than one would reach with a continuous representation of the same
order. When we need to change the spatial discretization (i.e. remesh) between two
ST slabs, the temporal discontinuity between the slabs provides a natural framework
for that change. There are advantages also in continuous temporal representation. We
obtain a smooth solution, NURBS-based when needed. We also can deal with the
computed data in a more efficient way, because we can represent the data with fewer
temporal control points, and that reduces the computer storage cost. These advan-
tages motivated the development of the ST computation techniques with continuous
temporal representation (ST-C) [84].
The core and special ALE-VMS and ST FSI methods mentioned above were
motivated by the need for the solution and analysis of specific classes of challenging
problems, such as parachute FSI, arterial FSI, aerodynamics of flapping wings, ship
hydrodynamics and FOI, and wind-turbine aerodynamics and FSI. This can be seen
from the ALE-VMS and ST articles cited in the first paragraph, especially the articles
since 2008, and will also be seen from the examples we will present in this chapter.
In the case of the parachute FSI, the special methods were motivated also by the need
for supporting the design process for the NASA spacecraft parachutes.
For the governing equations and core methods, including the ALE-VMS and
DSD/SST methods, and for much of the special techniques, we refer the interested
reader to [15, 22, 33, 50, 54]. An overview of three of the special techniques is
provided in Sect. 2. Examples of the challenging problems solved are presented in
Sect. 3, and the concluding remarks are given in Sect. 4.
2 Special Methods
A certain class of FSI problems might involve some specific computational chal-
lenges beyond those encountered in a typical FSI problem. That requires develop-
ment of special FSI methods targeting those challenges. A good number of special
methods were developed in conjunction with the core ST FSI method to address the
specific computational challenges involved in parachute FSI [53], patient-specific
arterial FSI [50], aerodynamics of flapping wings [33, 34], and wind-turbine aero-
dynamics [37]. The details on these special methods can be found in the references
cited above. Here we give three examples.
Parachute FSI involves all the computational challenges of a typical FSI problem.
Spacecraft parachutes are often very large ringsail parachutes, made of a large number
of gores, where a gore is the slice of the canopy between two radial reinforcement
cables running from the parachute vent to the skirt (see Fig. 1). Ringsail parachute
gores are constructed from rings and sails, resulting in a parachute canopy with
hundreds of ring gaps and sail slits (see Fig. 2). The complexity created by this
geometric porosity makes FSI modeling inherently challenging.
The Homogenized Modeling of Geometric Porosity (HMGP) [3] and its new
version, HMGP-FG [53], were introduced to help us bypass the intractable com-
plexities of the geometric porosity by approximating it with an equivalent, locally
varying homogenized porosity. In HMGP-FG, the normal velocity crossing the parachute
canopy under a pressure differential $\Delta p$ is modeled as
Fig. 4 The two porosity coefficients for each patch are calculated in a one-time fluid mechanics
computation with an n-gore slice of the parachute canopy, where the flow through all the gaps and
slits is resolved. Except for the first and last patches, each patch contains a gap or a slit. See [3, 53]
for details
$$u_n = -(k_F)_J \frac{A_F}{A_1} \Delta p - (k_G)_J \frac{A_G}{A_1} \,\mathrm{sgn}(\Delta p) \sqrt{\frac{|\Delta p|}{\rho}}, \qquad (1)$$
where $A_1$, $A_F$ and $A_G$ are defined in Fig. 3, and $(k_F)_J$ and $(k_G)_J$ are the homogenized
porosity coefficients for each patch $J$, calculated in a one-time fluid mechanics
computation with an n-gore slice of the parachute canopy (see Fig. 4). Even in a
fully open configuration, the parachute canopy goes through a periodic breathing
motion where the diameter varies between its minimum and maximum values. The
shapes and areas of the gaps and slits vary significantly during this breathing motion
(see Fig. 5). The porosity coefficients have very good invariance properties with
respect to these shape and area changes, and this can be seen in Fig. 6.
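The porosity relation above can be evaluated patch by patch. The following is a minimal Python sketch, assuming the relation has a linear fabric term plus a square-root gap term, $u_n = -(k_F)_J (A_F/A_1)\Delta p - (k_G)_J (A_G/A_1)\,\mathrm{sgn}(\Delta p)\sqrt{|\Delta p|/\rho}$, with $\rho$ taken as the air density. The function name, argument names, and all numerical values are hypothetical and chosen only for illustration; they are not taken from [3, 53].

```python
import math


def normal_velocity(dp, k_f, k_g, a_f, a_g, a_1, rho):
    """Homogenized normal velocity through one canopy patch.

    dp   : pressure differential across the canopy (assumed sign convention)
    k_f  : fabric porosity coefficient of the patch
    k_g  : gap/slit porosity coefficient of the patch
    a_f, a_g, a_1 : fabric, gap, and reference areas of the patch
    rho  : fluid density (assumed here to be the air density)
    """
    fabric_term = -k_f * (a_f / a_1) * dp
    gap_term = -k_g * (a_g / a_1) * math.copysign(1.0, dp) * math.sqrt(abs(dp) / rho)
    return fabric_term + gap_term


# Hypothetical patch data, for illustration only.
u_n = normal_velocity(dp=100.0, k_f=0.01, k_g=0.5, a_f=0.8, a_g=0.2, a_1=1.0, rho=1.2)
```

Because both terms are odd functions of the pressure differential, reversing the sign of `dp` reverses the sign of the returned velocity, and a zero differential gives zero flow.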
Fig. 5 The shapes and the areas of the slits vary significantly during the canopy breathing motion
Fig. 6 The porosity coefficients $(k_F)_J$ and $(k_G)_J$ for each patch $J$, at different canopy shapes
during the breathing motion. The plots show good invariance for these coefficients with respect to
the shape changes
robust and efficient ways of moving the mesh and remeshing as needed. Special
techniques to be used in conjunction with the DSD/SST method have been developed
(see [32–34]) based on using higher-order functions (specifically NURBS basis
functions) in time in representing the wing motion and deformation, mesh motion,
Fig. 7 Mesh motion is represented by using NURBS basis functions in time. The temporal-control
meshes are the coefficients of the NURBS basis functions
Fig. 8 Remeshing is handled by multiple knot insertion where we want to remesh. That point in
time becomes a patch boundary
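The temporal representation in Figs. 7 and 8 can be illustrated with a short Python sketch. It assumes quadratic B-spline basis functions in time (NURBS with unit weights), a normalized time interval [0, 1] with knots at multiples of 1/5, and a remeshing point at t = 0.6 whose knot is repeated until the basis interpolates there. These choices, and the names `bspline_basis` and `mesh_at_time`, are assumptions made for illustration and are not the implementation of [32–34].

```python
def bspline_basis(i, p, knots, t):
    """Cox-de Boor recursion: value at time t of basis function i of degree p."""
    if p == 0:
        nonzero_span = knots[i] < knots[i + 1]
        # The last nonzero span is closed on the right so that t = knots[-1] is covered.
        closes_domain = nonzero_span and knots[i + 1] == knots[-1] and t == knots[-1]
        return 1.0 if (knots[i] <= t < knots[i + 1]) or closes_domain else 0.0
    value = 0.0
    left = knots[i + p] - knots[i]
    if left > 0.0:
        value += (t - knots[i]) / left * bspline_basis(i, p - 1, knots, t)
    right = knots[i + p + 1] - knots[i + 1]
    if right > 0.0:
        value += (knots[i + p + 1] - t) / right * bspline_basis(i + 1, p - 1, knots, t)
    return value


def mesh_at_time(t, p, knots, control_meshes):
    """Blend temporal-control meshes (lists of nodal coordinates) at time t."""
    weights = [bspline_basis(i, p, knots, t) for i in range(len(control_meshes))]
    n_nodes = len(control_meshes[0])
    return [sum(w * mesh[a] for w, mesh in zip(weights, control_meshes))
            for a in range(n_nodes)]


P = 2
# Open knot vector with single interior knots: 7 basis functions, smooth motion.
KNOTS = [0.0, 0.0, 0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.0, 1.0]
# The knot at t = 0.6 is inserted once more (multiplicity P): 8 basis functions.
KNOTS_REMESH = [0.0, 0.0, 0.0, 0.2, 0.4, 0.6, 0.6, 0.8, 1.0, 1.0, 1.0]

# Hypothetical temporal-control meshes with two nodal coordinates each.
CONTROL_MESHES = [[float(c), 10.0 * c] for c in range(8)]
mesh_at_remesh_point = mesh_at_time(0.6, P, KNOTS_REMESH, CONTROL_MESHES)
```

With the repeated knot, one basis function equals one at t = 0.6, so the mesh at the remeshing point coincides with a single temporal-control mesh. That is what allows the point to act as a patch boundary in this sketch.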
When the ALE-VMS method is used in the context of the MITICT with the level-set
formulation, additional computational technology is employed to enhance the accu-
racy and robustness of the free-surface flow formulation. The use of a regularized
Heaviside function in the definition of the fluid density and viscosity necessitates the
level set to satisfy the signed-distance function property near the air-water interface.
To maintain the signed-distance property of the level set function, a redistancing
procedure based on the Eikonal partial differential equation is employed. The details
of the numerical formulation may be found in [12, 16, 77, 78].
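The role of the regularized Heaviside function can be sketched in a few lines of Python. The sine-smoothed form used below is a common choice and is an assumption here, as are the sign convention (positive level set in water), the half-width `eps`, and the function names; the formulation actually used is given in [12, 16, 77, 78]. The smoothing band has a physical thickness of `2 * eps` only if the level set is a signed distance near the interface, which is why redistancing is needed.

```python
import math


def regularized_heaviside(phi, eps):
    """Smoothed step: 0 for phi <= -eps, 1 for phi >= eps, smooth in between."""
    if phi <= -eps:
        return 0.0
    if phi >= eps:
        return 1.0
    return 0.5 * (1.0 + phi / eps + math.sin(math.pi * phi / eps) / math.pi)


def blended_property(phi, value_air, value_water, eps):
    """Density or viscosity blended across the interface (phi > 0 assumed to be water)."""
    h = regularized_heaviside(phi, eps)
    return value_air + (value_water - value_air) * h


# Hypothetical values: density exactly on the interface is the average of the two fluids.
rho_interface = blended_property(0.0, value_air=1.0, value_water=1000.0, eps=0.1)
```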
Furthermore, both convection and redistancing of the level set do not inherently
conserve mass. Convergence to a mass-conserving solution occurs only with mesh
refinement. Coarse (and not-so-coarse) mesh simulations may suffer from significant
water mass loss. (This depends on the problem setup and boundary conditions. In
the case of liquids sloshing in closed containers, mass loss may be significant. In
problems with inflow and outflow boundaries the effect may not be as pronounced.)
This effect is amplified when the equations are integrated for a long time period,
when seemingly small mass errors for a given time step compound into a large mass
error toward the end of the computation. As a result, an explicit mass correction
procedure is necessary. To ensure mass balance at every time step, after redistancing
of the level set, we modify the level set function by a global constant, such that the
following equation holds:
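$$\int_{\Omega_{n+1}} \rho_{n+1} \, d\Omega - \int_{\Omega_n} \rho_n \, d\Omega + \Delta t_{n+1} \int_{\Gamma_{n+1/2}} \rho_{n+1/2} \left( \mathbf{u}^h_{n+1/2} - \mathbf{v}^h_{n+1/2} \right) \cdot \mathbf{n}_{n+1/2} \, d\Gamma = 0, \qquad (2)$$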
where $\mathbf{v}^h$ is the mesh velocity. In Eq. (2), the quantities are subscripted with a temporal
index and $\Delta t_{n+1}$ is the time step size. This is the simplest technique that restores
mass balance in the simulations. Other versions of mass correction are also possible:
in [75, 85, 86] the authors proposed a total-domain based mass conservation
technique, validated it experimentally in [87], and developed it in the context of
MITICT (with mass conservation for fluid–solid interfaces) in [74]. A chunk-based
(subdomain-based) version of mass conservation was developed in [29, 30].
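The global-constant mass correction described above can be sketched in Python on a one-dimensional grid. This is an illustrative simplification: the closed-container setting (so that the boundary flux term vanishes and the target mass is the mass at the previous step), the sine-smoothed Heaviside function, the bisection search, and all names and values are assumptions, not the authors' implementation. With inflow or outflow boundaries, the target mass would instead be the previous mass minus the time step times the net boundary mass flux.

```python
import math


def heaviside(phi, eps):
    """Sine-smoothed Heaviside function (assumed form)."""
    if phi <= -eps:
        return 0.0
    if phi >= eps:
        return 1.0
    return 0.5 * (1.0 + phi / eps + math.sin(math.pi * phi / eps) / math.pi)


def total_mass(phi, dx, rho_air, rho_water, eps):
    """Mass per unit cross-section on a uniform 1D grid (phi > 0 assumed to be water)."""
    return dx * sum(rho_air + (rho_water - rho_air) * heaviside(p, eps) for p in phi)


def mass_correcting_shift(phi, target_mass, dx, rho_air, rho_water, eps,
                          tol=1e-12, max_iter=200):
    """Find the constant c such that the mass of (phi + c) matches target_mass."""
    lo = -eps - max(phi)   # shift that turns every cell into air
    hi = eps - min(phi)    # shift that turns every cell into water
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        shifted = [p + mid for p in phi]
        if total_mass(shifted, dx, rho_air, rho_water, eps) < target_mass:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical test case: water occupies x < 0.5 in a unit-length closed container.
N, DX, EPS = 100, 0.01, 0.05
phi_exact = [0.5 - (i + 0.5) * DX for i in range(N)]
target = total_mass(phi_exact, DX, 1.0, 1000.0, EPS)
phi_drifted = [p - 0.03 for p in phi_exact]   # level set after an artificial mass loss
shift = mass_correcting_shift(phi_drifted, target, DX, 1.0, 1000.0, EPS)
phi_corrected = [p + shift for p in phi_drifted]
```

The mass is a monotone function of the shift, so bisection is a safe (if not the fastest) way to find the constant; a Newton iteration would be a natural alternative.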
3 Examples
Examples in Sects. 3.1–3.4 were computed with the DSD/SST methods, and the
examples in Sects. 3.5–3.8 with the ALE-VMS methods.
Fig. 9 Parachute shape and flow field at an instant during the computation and comparison with
the test data. Here $V_D$, $V_{RH}$, $T_B$, and $T_S$ are the descent speed, horizontal speed, breathing period,
and swinging period
The first example, a parachute computation, serves the purpose of comparing our
computed results to data from drop tests with a base parachute design and gaining
confidence in our parachute FSI model. Figure 9 shows the parachute shape and flow
field at an instant during the computation and the comparison with the test data. With
that confidence, we can do simulation-based design studies [53], such as evaluating
the aerodynamic performance of the parachute as a function of the suspension line
length (see Fig. 10).
Spacecraft parachutes are typically used in clusters of two or three parachutes. The
contact between the canopies of the parachute cluster is a computational challenge
that we have addressed recently (see [53]). Figure 11 shows a cluster of three para-
chutes at three different instants during the FSI computation, with contact between
two of the parachutes.
Spacecraft parachutes are also typically used in multiple stages, starting with a
reefed stage where a cable along the parachute skirt constrains the diameter to be
less than the diameter in the subsequent stage. After a certain period of time during
the descent, the cable is cut and the parachute disreefs (i.e. expands) to the next
stage. Computing the parachute shape at the reefed stage and FSI modeling during
the disreefing involve additional computational challenges created by the increased
geometric complexities and by the rapid changes in the parachute geometry. Figure 12
shows such a disreefing (see [55]).
Fig. 10 A simulation-based parachute design study, where the objective is to evaluate the aerody-
namic performance of the parachute as a function of the suspension line length. See [53] for details
of the study
Fig. 11 A cluster of three parachutes at three instants during the FSI computation, with contact
between two of the parachutes
[Plot: torque (kN m) versus time (s); curves labeled ALE, VMST, and SUPS]
Fig. 14 Time history of the aerodynamic torque generated by a single blade. Computed with the
DSD/SST-SUPS (SUPS), DSD/SST-VMST (VMST), and ALE methods
Including the tower in the model increases the computational challenge because
of the fast, rotational relative motion between the rotor and tower. We address this
additional challenge in [37] by using NURBS basis functions for the temporal repre-
sentation of the rotor motion, mesh motion and also in remeshing. This is essentially
the same computational technology described in Sect. 2.2 for modeling the aero-
dynamics of flapping wings. We named this ST/NURBS Mesh Update Method
(STNMUM) in [37]. Figure 15 shows, from [37], the vorticity magnitude, computed
with the DSD/SST-VMST method and the STNMUM. In that figure, the color
range from blue to red corresponds to a vorticity range from low to high, and lighter
and darker shades of a color correspond to lower and higher values.
Patient-specific arterial FSI modeling has many challenges. They include calculat-
ing an estimated zero-pressure arterial geometry, specifying the velocity profile at
an inflow with non-circular shape, using variable wall thickness, building layers of
refined fluid mesh near the walls, proper calculation of the wall shear stress (WSS)
and oscillatory shear index (OSI), and properly scaling the flow rate at the inflow. Spe-
cial techniques developed to address these challenges can be found in [50]. Here we
present some computations from [50] for cerebral arteries with aneurysm. Figure 16
shows the lumen obtained from voxel data for three arterial models: Model 1,
Model 2, and Model 3. Figure 17 shows the fluid mechanics mesh for Model 3.
Figure 18 shows the streamlines at the maximum flow rate.
Fig. 15 Vorticity, computed with the DSD/SST-VMST method and the STNMUM (see [37])
Fig. 16 Arterial lumen geometry obtained from voxel data for Model 1, Model 2, and Model 3
Fig. 17 Fluid mechanics mesh for Model 3. Mesh at the fluid–structure interface and inflow plane
Fig. 18 Streamlines for the three models when the volumetric flow rate is maximum
As a last set of examples from analyses with the ST methods, we present from
[33, 34] computational aerodynamics modeling of flapping wings of an actual locust
and an MAV. The motion and deformation data for the wings is extracted from the
high-speed, multi-camera video recordings of a locust in a wind tunnel at Baylor
College of Medicine (BCM), Houston. The video recording is accomplished by
using a set of tracking points marked on the forewings (FW) and hindwings (HW)
of the locust. The tracking points are seen in Fig. 19. How the wing motion and
deformation data is extracted from the video data and represented using NURBS
basis functions in space and time is described in detail in [33]. Figures 20 and 21
show the wind tunnel photographs and the computational model at eight points in
time. Figure 22 shows how the body and wings compare for the locust and MAV
models, and Fig. 23 shows the length scales involved in the computations with those
models. Figure 24 shows the streamlines for the locust. Figures 25 and 26 show for
the locust the vorticity magnitude during the second flapping cycle. Figures 27 and
28 show for the MAV the vorticity magnitude during the third flapping cycle. In
Figs. 25, 26, 27 and 28, the color range from blue to red corresponds to a vorticity
range from low to high, and lighter and darker shades of a color correspond to lower
and higher values. Figure 29 shows the lift and thrust for the locust and MAV.
Fig. 19 Tracking points in the data set from the BCM wind tunnel
Fig. 20 Comparison of computational model and wind tunnel photographs at first four points in
time. Viewing angles are matched approximately. Wind tunnel photographs are from BCM
action of gravity and impacts a fixed rectangular container. We compute the problem
using two types of the spatial discretization: linear tetrahedral finite elements and
NURBS. The quadratic NURBS mesh is significantly more coarse than the linear
tetrahedral mesh. Free-slip and no-penetration boundary conditions are applied on all
surfaces, including the top of the tank. The problem is run until T = 6 s. Snapshots
comparing the solutions coming from tetrahedral FEM and NURBS computations
are given in Fig. 31. Large-scale features of the solution are very similar in the two
simulations; however, the details of the small-scale features are better represented on
Fig. 21 Comparison of computational model and wind tunnel photographs at last four points in
time. Viewing angles are matched approximately. Wind tunnel photographs are from BCM
Fig. 22 Locust body and wings (left) and MAV body and wings (right)
[Figure labels: 90 mm and 80 mm, shown for both models]
Fig. 23 Length scales in the computations with the locust (left) and MAV (right) models
a much finer tetrahedral grid, as expected. Time series of the pressure at different
locations on the obstacle are shown in Fig. 32. The first wave hits the block at
approximately t = 0.5 s, and the second, much smaller wave arrives at the block
Fig. 24 Locust. Streamlines colored by velocity magnitude in m/s at approximately 25 % (left) and
50 % (right) of the second flapping cycle
Fig. 25 Locust. Vorticity for the first 4 of 8 equally-spaced points during the 2nd flapping cycle
Fig. 26 Locust. Vorticity for the last 4 of 8 equally-spaced points during the 2nd flapping cycle
at about t = 5 s. The wave impact times and pressure peaks are predicted very
well with both linear elements and quadratic NURBS. Given that the NURBS mesh
has about half of the degrees-of-freedom of the linear FEM mesh in each Cartesian
Fig. 27 MAV. Vorticity for the first 4 of 8 equally-spaced points during the 2nd flapping cycle
Fig. 28 MAV. Vorticity for the last 4 of 8 equally-spaced points during the 2nd flapping cycle
[Plots: force (mN) versus t/T over one cycle; curves labeled Locust and MAV]
Fig. 29 Total lift (left) and thrust (right) generated over one cycle
direction, the accuracy of NURBS results is remarkable; linear FEM is not capable
of attaining such accuracy at this level of resolution (see [12]), and requires a finer
mesh for comparable accuracy.
Fig. 30 The MARIN dam break problem. Geometry definition. The computational domain is a
rectangular box with dimensions 3.22 m × 1 m × 1 m. The object has dimensions 0.2 m × 0.2 m ×
0.4 m and is placed at the back end of the tank. The water column, initially at rest, has dimensions
1 m × 1 m × 0.55 m. The locations where pressure and water height are sampled are also depicted
Fig. 31 The MARIN dam break problem. Snapshots of the free surface solution on the tetrahedral
(top) and NURBS (bottom) meshes at t = 1.0, 1.5, 2.0, 4.0, and 5.0 s
We present results for the Fridsma planing hull [89]. We give a detailed definition
of the hull geometry, present a mesh refinement study, and assess the effect of hull
speed on the drag force and trim angle. Only flat-water (i.e., no waves), constant
hull speed cases are considered. The computational results presented are from [16].
The Fridsma hull geometry definition is given in Fig. 33. The hull is comprised of
idealized shapes: a bow consisting of four ruled surfaces followed by a wedge-shaped
straight section with a constant deadrise angle of 20°. Analytical expressions for
the bounding curves for the ruled surfaces are provided in the figure. The relevant
global geometry parameters are Length (L): 114.3 cm, Beam (b): 22.86 cm, Height:
14.2875 cm, and Deadrise: 20°. The hull mass, center of gravity, and moment of
inertia are Mass (m): 7.257 kg, $x_{cg}$: 80.01 cm, $z_{cg}$: 6.721 cm, Gyradius (r): 25 % L,
[Plots: pressure (Pa) versus time (s), four panels; curves labeled Experiment, Tet 518379, and 64x32x32]
Fig. 32 The MARIN dam break problem. Time history of the pressure at four locations on the
obstacle. Experimental data is from [88]
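[Fig. 33 content: Fridsma hull geometry. Labeled dimensions: 114.3 cm, 22.86 cm, 10.3782 cm, 14.2875 cm, and a 20° angle. Bounding curves: $(x/22.86\,\mathrm{cm})^2 + (z/10.3782\,\mathrm{cm})^2 = 1$, $(x/22.86\,\mathrm{cm})^2 + (z/14.2875\,\mathrm{cm})^2 = 1$, and $(x/22.86\,\mathrm{cm})^2 + (y/11.43\,\mathrm{cm})^2 = 1$]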
Fig. 34 Fridsma hull. Coarsest mesh with water and air domains shown
1 In Fridsma [89] the results are reported in terms of the Speed-Length Ratio (SLR), $u/\sqrt{L}$, which
is a dimensional quantity. Here we chose to report the results in terms of the Froude number.
Fig. 35 Fridsma hull. Free surface colored by the flow speed relative to the hull speed in m/s
[Plots: R/mg (left) and trim angle in degrees (right) versus $N^{1/3}$; curves labeled ALE-VMS and Experiment]
Fig. 36 Fridsma hull. Convergence of the drag force (left) and trim angle (right) with mesh refine-
ment and comparison with experimental results
and Fr = 1.190 cases. The simulations are started impulsively in the configuration
depicted in Fig. 34. In the case of Fr = 0, although the hull speed is zero, a non-zero
trim angle develops such that the hull is in equilibrium with the hydrostatic forces. In
all other cases, there is a rapid transient followed by a largely steady-state response.
The steady-state drag force and trim angle are plotted as a function of Froude number,
and compared to the experimental results in Fig. 37. Accurate prediction of the drag
force is attained in all cases. The trim angle is predicted very well for the first two
Froude number cases, and a deviation from the experiment by 10–12 % is seen in the
remaining two cases.
Here we present the simulation of the DTMB 5415 Navy combatant at lab scale
from [78]. This ship has been investigated by other researchers, both experimentally
and computationally (see, e.g., [9092]). The length of the ship hull is 5.72 m. The
ship mass, center of gravity and inertia tensor are computed by meshing the ship
[Plots: steady-state drag force (left) and trim angle (right) versus Fr; curves labeled ALE-VMS and Experiment]
Fig. 37 Fridsma hull. Steady-state drag force (left) and trim angle (right) as a function of Froude
number. Comparison with experimental results
interior and performing a direct computation. The total ship volume is 1.366 m³.
The ship mass is equal to 532.3 kg. It is obtained by multiplying the volume of the
ship below the water line by the constant water density. The center of gravity and the
inertia tensor are computed assuming the ship's effective density (i.e., the ship mass
divided by its total volume), which results in $\mathbf{X}_0 = (2.761, 0, 0.280)$ m and
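$$\mathbf{J}_0 = \begin{bmatrix} 7.256\text{E-2} & 2.69\text{E-7} & 5.35\text{E-2} \\ 2.69\text{E-7} & 2.89 & 2.44\text{E-8} \\ 5.35\text{E-2} & 2.44\text{E-8} & 2.91 \end{bmatrix} \ \mathrm{kg\,m^2}. \qquad (3)$$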
We compute the ship in head waves, meaning the waves that travel in the direction
opposite to that of the ship. We assume that the ship speed is Uin = 1.873 m/s, which
gives Fr = 0.25 based on the ship length. The ship was allowed to move vertically, to
pitch and to roll, while the rest of the rigid body degrees-of-freedom were constrained.
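The quoted Froude number can be checked directly from the ship speed and hull length. The short Python sketch below assumes the usual length-based definition Fr = U / sqrt(g L) and a gravitational acceleration of 9.81 m/s², neither of which is stated explicitly in the text; the function name is hypothetical.

```python
import math


def froude_number(speed, length, g=9.81):
    """Length-based Froude number Fr = U / sqrt(g * L)."""
    return speed / math.sqrt(g * length)


# Ship speed and hull length quoted in the text.
fr = froude_number(speed=1.873, length=5.72)
```

Under these assumptions the value comes out very close to the quoted Fr = 0.25.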
We make use of the linear Airy waves [93] to prescribe inlet boundary conditions.
The Airy waves may be derived using potential theory, and are specified as follows:
Given the wave amplitude, wave length and water depth, $A_w = 0.2$ m, $L_w = 5.72$ m
and $h = 3.49$ m, respectively, we compute $k = 2\pi/L_w$, the angular wavenumber,
$\omega = \sqrt{gk \tanh(kh)}$, the wave angular frequency, and $A_v = \omega A_w / \sinh(kh)$, the velocity amplitude.
With these definitions, the Airy waves are given by
where $(u, v, w)^T$ is the fluid velocity vector and the air-water interface in the hydrostatic
configuration is assumed to be located at z = 0.
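The wave parameters above can be evaluated with a short Python sketch. It assumes the standard linear (Airy) theory relations $k = 2\pi/L_w$, $\omega = \sqrt{gk\tanh(kh)}$ and $A_v = \omega A_w/\sinh(kh)$, a wave travelling in the positive x-direction, no mean current or ship speed superposed, and g = 9.81 m/s². The velocity expressions are the textbook form and are not necessarily written the same way as the inlet conditions used in [78]; the function names are hypothetical.

```python
import math


def airy_parameters(amplitude, wavelength, depth, g=9.81):
    """Return (k, omega, a_v): wavenumber, angular frequency, velocity amplitude."""
    k = 2.0 * math.pi / wavelength
    omega = math.sqrt(g * k * math.tanh(k * depth))
    a_v = omega * amplitude / math.sinh(k * depth)
    return k, omega, a_v


def airy_velocity(x, z, t, amplitude, wavelength, depth, g=9.81):
    """Textbook linear-wave velocities (u, w) and surface elevation eta.

    z = 0 is the still-water level and z = -depth is the bottom.
    """
    k, omega, a_v = airy_parameters(amplitude, wavelength, depth, g)
    phase = k * x - omega * t
    u = a_v * math.cosh(k * (z + depth)) * math.cos(phase)
    w = a_v * math.sinh(k * (z + depth)) * math.sin(phase)
    eta = amplitude * math.cos(phase)
    return u, w, eta


# Values quoted in the text: A_w = 0.2 m, L_w = 5.72 m, h = 3.49 m.
k, omega, a_v = airy_parameters(0.2, 5.72, 3.49)
u_crest, w_crest, eta_crest = airy_velocity(0.0, 0.0, 0.0, 0.2, 5.72, 3.49)
```

For these values kh is close to 3.8, so the waves are effectively deep-water waves and the horizontal velocity at the crest is close to $\omega A_w$.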
Fig. 38 DTMB 5415 in head waves at t = 9.0 and 9.5 s. Water surface colored by the fluid speed
Fig. 39 Geometric model of the scaled Hardanger bridge deck section and zoom on the geometric
details of the bridge deck section model. The guide-vane-like vortex mitigation devices are located
on the underside of the deck and are shown in light red color
Figure 38 shows the ship negotiating high-amplitude waves. The right part of
Fig. 38 shows the ship partially submerged in water, which is a result of the oncoming
wave hitting the bow of the ship. In this case, near the bow, the free surface experiences
topological changes, which necessitates the use of an interface-capturing method to
handle the air-water interface for this class of problems.
Fig. 41 Aerodynamics mesh of the bridge deck section and zoom on the boundary layer mesh of
the top deck and rails
place. Note that there is only a 2.5 cm gap between the tunnel wall and the side of
the bridge deck section. All the geometric details of the model-scale bridge deck are
modeled in the computations, including the hand and bicycle rails on the top of the
deck, and the maintenance rails in the front and rear of the deck. Computations are
performed for 2.6 and 6.0 m/s wind speed, with and without the VMDs. Figure 41
shows the mesh resolution used in this study. Boundary-layer prismatic elements
are used near all solid surfaces, and tetrahedral elements are used elsewhere in the
computational domain. The mesh is refined near the deck and downstream of it to
better capture the wake turbulence. The uniform wind speed is prescribed at the
inflow boundary, the traction vector is set to zero at the outflow boundary, and the
slip condition is set on the top, bottom, and lateral boundaries of the computational
domain (see Fig. 40). The no-slip boundary condition on the bridge deck surface is
enforced weakly. The bridge deck is modeled as a rigid object. For the bridge deck
mass, moment of inertia tensor, and stiffness and damping matrices the readers are
[Plots: $C_D$ (left) and $C_L$ (right) versus time (s); curves for computation and experiment, with and without VMD]
Fig. 42 Time history of the drag and lift coefficients for cases with and without VMDs for 2.6 m/s
wind speed. Time-averaged experimental measurements from [94] are plotted for comparison
[Plot: pitching angle (degree) versus time (s); curves labeled Without VMD and With VMD]
Fig. 43 Time history of the pitching angle with and without VMDs for 6.0 m/s wind speed
referred to [94]. The deck is allowed to displace vertically, and undergo pitching and
rolling motions.
Figure 42 shows drag and lift coefficients for cases with and without the VMDs.
The drag and lift coefficients are defined as $C_D = \frac{F_D}{\frac{1}{2}\rho U^2 h l}$ and $C_L = \frac{F_L}{\frac{1}{2}\rho U^2 b l}$. Results
are compared with the experimental measurements from [94] and reasonable agree-
ment is achieved. Figure 43 shows the time history of angular displacement of the
bridge deck corresponding to the pitching motion. The figure clearly shows that with
the added VMDs the bridge deck experiences smaller rotational motions than without,
which was also observed in the wind tunnel tests. To better understand the underlying
mechanics, the differences in the air flow with and without VMDs are shown on a
planar cut of the bridge deck in Fig. 44. The guide vanes keep the flow attached to
the underside of the deck, which delays flow separation and precludes formation of
large-scale vortical structures that drive the bridge deck response. Figure 45 shows
the 3D view of the deck with guide vanes, where air speed contours at an instant are
Fig. 44 Instantaneous air speed contours on a planar cut near the bridge deck for 2.6 m/s wind
speed. Left Case without VMDs. Right Case with VMDs
Fig. 45 Instantaneous air speed contours on a set of cuts along the deck length for 2.6 m/s wind
speed. Top and bottom deck views are shown
plotted on a set of cuts along the deck length. Top and bottom views are shown. The
flow is turbulent and 3D, which underscores the importance of 3D aerodynamics
modeling for this class of problems.
4 Concluding Remarks
Acknowledgments This work was supported in part by NASA JSC Grant NNX13AD87G. Method
development and evaluation components of the work on aerodynamics of flapping wings and wind-turbine
aerodynamics were supported in part by ARO Grant W911NF-12-1-0162 (TT) and Rice–Waseda
research agreement (KT). The development and application of FOI techniques for bridge
aerodynamics was supported by the program for preferred research areas at the Faculty of Engineer-
ing Science and Technology, the Norwegian University of Science and Technology. The research
work on free-surface FOI was supported by the ARO Grant W911NF-11-1-0083 (YB). We wish
to thank the Texas Advanced Computing Center (TACC) at the University of Texas at Austin, the
San Diego Supercomputer Center (SDSC) at the University of California, San Diego, and the Nor-
wegian Metacenter for Computational Science (Notur) for providing some of the HPC resources
used. We thank Professor Fabrizio Gabbiani and Dr. Raymond Chan (Baylor College of Medicine)
for providing us the digital data extracted from the wind-tunnel videos of the locust.
References
1. Tezduyar TE (1992) Stabilized finite element formulations for incompressible flow computations.
Adv Appl Mech 28:1–44. doi:10.1016/S0065-2156(08)70153-4
2. Tezduyar TE (2003) Computation of moving boundaries and interfaces and stabilization parameters.
Int J Numer Methods Fluids 43:555–575. doi:10.1002/fld.505
3. Tezduyar TE, Sathe S (2007) Modeling of fluid–structure interactions with the space–time
finite elements: solution techniques. Int J Numer Methods Fluids 54:855–900. doi:10.1002/fld.1430
4. Takizawa K, Tezduyar TE (2011) Multiscale space–time fluid–structure interaction techniques.
Comput Mech 48:247–267. doi:10.1007/s00466-011-0571-z
5. Takizawa K, Tezduyar TE (2012) Space–time fluid–structure interaction methods. Math Models
Methods Appl Sci 22:1230001. doi:10.1142/S0218202512300013
6. Hughes TJR, Liu WK, Zimmermann TK (1981) Lagrangian-Eulerian finite element formulation
for incompressible viscous flows. Comput Methods Appl Mech Eng 29:329–349
7. Ohayon R (2001) Reduced symmetric models for modal analysis of internal structural-acoustic
and hydroelastic-sloshing systems. Comput Methods Appl Mech Eng 190:3009–3019
8. van Brummelen EH, de Borst R (2005) On the nonnormality of subiteration for a fluid–structure
interaction problem. SIAM J Sci Comput 27:599–621
9. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid–structure interaction:
theory, algorithms, and computations. Comput Mech 43:3–37
10. Bazilevs Y, Hsu M-C, Akkerman I, Wright S, Takizawa K, Henicke B, Spielman T, Tezduyar
TE (2011) 3D simulation of wind turbine rotors at full scale. Part I: geometry modeling and
aerodynamics. Int J Numer Methods Fluids 65:207–235. doi:10.1002/fld.2400
11. Bazilevs Y, Hsu M-C, Kiendl J, Wüchner R, Bletzinger K-U (2011) 3D simulation of wind
turbine rotors at full scale. Part II: fluid–structure interaction modeling with composite blades.
Int J Numer Methods Fluids 65:236–253
12. Akkerman I, Bazilevs Y, Kees CE, Farthing MW (2011) Isogeometric analysis of free-surface
flow. J Comput Phys 230:4137–4152
350 K. Takizawa et al.
13. Hsu M-C, Bazilevs Y (2011) Blood vessel tissue prestress modeling for vascular fluidstructure
interaction simulations. Finite Elem Anal Des 47:593599
14. Nagaoka S, Nakabayashi Y, Yagawa G, Kim YJ (2011) Accurate fluidstructure interaction
computations using elements without mid-side nodes. Comput Mech 48:269276. doi:10.1007/
s00466-011-0620-7
15. Bazilevs Y, Hsu M-C, Takizawa K, Tezduyar TE (2012) ALE-VMS and ST-VMS methods for
computer modeling of wind-turbine rotor aerodynamics and fluidstructure interaction. Math
Models Methods Appl Sci 22:1230002. doi:10.1142/S0218202512300025
16. Akkerman I, Dunaway J, Kvandal J, Spinks J, Bazilevs Y (2012) Toward free-surface modeling
of planing vessels: simulation of the fridsma hull using ALE-VMS. Comput Mech 50:719727
17. Minami S, Kawai H, Yoshimura S (2012) Parallel BDD-based monolithic approach for acoustic
fluidstructure interaction. Comput Mech 50:707718
18. Miras T, Schotte J-S, Ohayon R (2012) Energy approach for static and linearized dynamic
studies of elastic structures containing incompressible liquids with capillarity: a theoretical
formulation. Comput Mech 50:729741
19. van Opstal TM, van Brummelen EH, de Borst R, Lewis MR (2012) A finite-element/boundary-
element method for large-displacement fluidstructure interaction. Comput Mech 50:779788
20. Yao JY, Liu GR, Narmoneva DA, Hinton RB, Zhang Z-Q (2012) Immersed smoothed finite ele-
ment method for fluidstructure interaction simulation of aortic valves. Comput Mech 50:789
804
21. Larese A, Rossi R, Onate E, Idelsohn SR (2012) A coupled PFEMEulerian approach for the
solution of porous fsi problems. Comput Mech 50:805819
22. Bazilevs Y, Takizawa K, Tezduyar TE (2013) Computational fluidstructure interaction: meth-
ods and applications. Wiley, Chichester
23. Korobenko A, Hsu M-C, Akkerman I, Tippmann J, Bazilevs Y (2013) Structural mechanics
modeling and FSI simulation of wind turbines. Math Models Methods Appl Sci 23:249272
24. Yao JY, Liu GR, Qian D, Chen CL, Xu GX (2013) A moving-mesh gradient smoothing method
for compressible CFD problems. Math Models Methods Appl Sci 23:273305
25. Kamran K, Rossi R, Onate E, Idelsohn SR (2013) A compressible lagrangian framework for
modeling the fluidstructure interaction in the underwater implosion of an aluminum cylinder.
Math Models Methods Appl Sci 23:339367
26. Hsu M-C, Akkerman I, Bazilevs Y (2013) Finite element simulation of wind turbine aerody-
namics: validation study using NREL phase VI experiment. Wind Energy. doi:10.1002/we.
1599
27. Tezduyar T, Aliabadi S, Behr M, Johnson A, Mittal S (1993) Parallel finite-element computation
of 3d flows. Computer 26:2736. doi:10.1109/2.237441
28. Tezduyar T, Aliabadi S, Behr M, Johnson A, Kalro V, Litke M (1996) Flow simulation and
high performance computing. Comput Mech 18:397412. doi:10.1007/BF00350249
29. Tezduyar TE (2001) Finite element methods for flow problems with moving boundaries and
interfaces. Arch Comput Methods Eng 8:83130. doi:10.1007/BF02897870
30. Akin JE, Tezduyar TE, Ungor M (2007) Computation of flow problems with the mixed
interface-tracking/interface-capturing technique (MITICT). Comput Fluids 36:211. doi:10.
1016/j.compfluid.2005.07.008
31. Mittal S, Tezduyar TE (1995) Parallel finite element simulation of 3D incompressible
flowsfluidstructure interactions. Int J Numer Methods Fluids 21:933953. doi:10.1002/
fld.1650211011
32. Takizawa K, Henicke B, Puntel A, Spielman T, Tezduyar TE (2012) Spacetime computational
techniques for the aerodynamics of flapping wings. J Appl Mech 79:010903. doi:10.1115/1.
4005073
33. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2012) Spacetime techniques for
computational aerodynamics modeling of flapping wings of an actual locust. Comput Mech
50:743760. doi:10.1007/s00466-012-0759-x
34. Takizawa K, Kostov N, Puntel A, Henicke B, Tezduyar TE (2012) Spacetime computational
analysis of bio-inspired flapping-wing aerodynamics of a micro aerial vehicle. Comput Mech
50:761778. doi:10.1007/s00466-012-0758-y
Computational Engineering Analysis and Design 351
35. Takizawa K, Henicke B, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Stabilized spacetime
computation of wind-turbine rotor aerodynamics. Comput Mech 48:333344. doi:10.1007/
s00466-011-0589-2
36. Takizawa K, Henicke B, Montes D, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Numerical-
performance studies for the stabilized spacetime computation of wind-turbine rotor aerody-
namics. Comput Mech 48:647657. doi:10.1007/s00466-011-0614-5
37. Takizawa K, Tezduyar TE, McIntyre S, Kostov N, Kolesar R, Habluetzel C (2014) Spacetime
VMS computation of wind-turbine rotor and tower aerodynamics. Comput Mech 53:115.
doi:10.1007/s00466-013-0888-x
38. Takase S, Kashiyama K, Tanaka S, Tezduyar TE (2011) Spacetime supg finite element com-
putation of shallow-water flows with moving shorelines. Comput Mech 48:293306. doi:10.
1007/s00466-011-0618-1
39. Kalro V, Tezduyar TE (2000) A parallel 3d computational method for fluidstructure inter-
actions in parachute systems. Comput Methods Appl Mech Eng 190:321332. doi:10.1016/
S0045-7825(00)00204-8
40. Tezduyar TE, Sathe S, Keedy R, Stein K (2006) Spacetime finite element techniques for
computation of fluidstructure interactions. Comput Methods Appl Mech Eng 195:20022027.
doi:10.1016/j.cma.2004.09.014
41. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2006) Computer modeling of car-
diovascular fluidstructure interactions with the Deforming-Spatial-Domain/Stabilized Space
Time formulation. Comput Methods Appl Mech Eng 195:18851895. doi:10.1016/j.cma.2005.
05.050
42. Tezduyar TE, Sathe S, Cragin T, Nanna B, Conklin BS, Pausewang J, Schwaab M (2007)
Modeling of fluidstructure interactions with the spacetime finite elements: arterial fluid
mechanics. Int J Numer Methods Fluids 54:901922. doi:10.1002/fld.1443
43. Tezduyar TE, Sathe S, Pausewang J, Schwaab M, Christopher J, Crabtree J (2008) Interface
projection techniques for fluidstructure interaction modeling with moving-mesh methods.
Comput Mech 43:3949. doi:10.1007/s00466-008-0261-7
44. Tezduyar TE, Sathe S, Schwaab M, Pausewang J, Christopher J, Crabtree J (2008) Fluid
structure interaction modeling of ringsail parachutes. Comput Mech 43:133142. doi:10.1007/
s00466-008-0260-8
45. Takizawa K, Christopher J, Tezduyar TE, Sathe S (2010) Spacetime finite element computation
of arterial fluidstructure interactions with patient-specific data. Int J Numer Methods Biomed
Eng 26:101116. doi:10.1002/cnm.1241
46. Takizawa K, Moorman C, Wright S, Christopher J, Tezduyar TE (2010) Wall shear stress
calculations in spacetime finite element computation of arterial fluidstructure interactions.
Comput Mech 46:3141. doi:10.1007/s00466-009-0425-0
47. Takizawa K, Moorman C, Wright S, Spielman T, Tezduyar TE (2011) Fluidstructure inter-
action modeling and performance analysis of the orion spacecraft parachutes. Int J Numer
Methods Fluids 65:271285. doi:10.1002/fld.2348
48. Takizawa K, Wright S, Moorman C, Tezduyar TE (2011) Fluid-structure interaction modeling
of parachute clusters. Int J Numer Methods Fluids 65:286307. doi:10.1002/fld.2359
49. Torii R, Oshima M, Kobayashi T, Takagi K, Tezduyar TE (2011) Influencing factors in image-
based fluidstructure interaction computation of cerebral aneurysms. Int J Numer Methods
Fluids 65:324340. doi:10.1002/fld.2448
50. Tezduyar TE, Takizawa K, Brummer T, Chen PR (2011) Spacetime fluidstructure interaction
modeling of patient-specific cerebral aneurysms. Int J Numer Methods Biomed Eng 27:1665
1710. doi:10.1002/cnm.1433
51. Takizawa K, Spielman T, Tezduyar TE (2011) Spacetime fsi modeling and dynamical analy-
sis of spacecraft parachutes and parachute clusters. Comput Mech 48:345364. doi:10.1007/
s00466-011-0590-9
52. Manguoglu M, Takizawa K, Sameh AH, Tezduyar TE (2011) A parallel sparse algorithm tar-
geting arterial fluid mechanics computations. Comput Mech 48:377384. doi:10.1007/s00466-
011-0619-0
352 K. Takizawa et al.
74. Cruchaga MA, Celentano DJ, Tezduyar TE (2007) A numerical model based on the Mixed
Interface-Tracking/Interface-Capturing Technique (MITICT) for flows with fluidsolid and
fluidfluid interfaces. Int J Numer Methods Fluids 54:10211030. doi:10.1002/fld.1498
75. Cruchaga M, Celentano D, Tezduyar T (2001) A moving lagrangian interface technique for
flow computations over fixed meshes. Comput Methods Appl Mech Eng 191:525543. doi:10.
1016/S0045-7825(01)00300-0
76. Sethian J (1999) Level set methods and fast marching methods. Cambridge University Press,
Cambridge
77. Kees CE, Akkerman I, Farthing MW, Bazilevs Y (2011) A conservative level set method suitable
for variable-order approximations and unstructured meshes. J Comput Phys 230:45364558
78. Akkerman I, Bazilevs Y, Benson DJ, Farthing MW, Kees CE (2012) Free-surface flow and fluid
object interaction modeling with emphasis on ship hydrodynamics. J Appl Mech 79:010905
79. Tezduyar TE, Behr M, Mittal S, Johnson AA (1992) Computation of unsteady incompressible
flows with the finite element methodsspacetime formulations, iterative strategies and mas-
sively parallel implementations. In: Smolinski P, Liu WK, Hulbert G, Tamma K (eds) New
methods in transient analysis, PVP-Vol. 246/AMD-Vol. 143. ASME, New York, pp 724
80. Johnson AA, Tezduyar TE (1994) Mesh update strategies in parallel finite element computations
of flow problems with moving boundaries and interfaces. Comput Methods Appl Mech Eng
119:7394. doi:10.1016/0045-7825(94)00077-8
81. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: cad, finite elements, nurbs,
exact geometry, and mesh refinement. Comput Methods Appl Mech Eng 194:41354195
82. Bazilevs Y, Hughes TJR (2008) Nurbs-based isogeometric analysis for the computation of
flows about rotating components. Comput Mech 43:143150
83. Takizawa K, Henicke B, Puntel A, Kostov N, Tezduyar TE (2013) Computer modeling tech-
niques for flapping-wing aerodynamics of a locust. Comput Fluids 85:125134. doi:10.1016/
j.compfluid.2012.11.008
84. Takizawa K, Tezduyar TE (2014) Spacetime computation techniques with continuous repre-
sentation in time (ST-C). Comput Mech 53:9199. doi:10.1007/s00466-013-0895-y
85. Cruchaga M, Celentano D, Tezduyar T (2002) Computation of mould filling processes with a
moving Lagrangian interface technique. Commun Numer Methods Eng 18:483493. doi:10.
1002/cnm.506
86. Cruchaga MA, Celentano DJ, Tezduyar TE (2005) Moving-interface computations with the
edge-tracked interface locator technique (ETILT). Int J Numer Methods Fluids 47:451469.
doi:10.1002/fld.825
87. Cruchaga MA, Celentano DJ, Tezduyar TE (2007) Collapse of a liquid column: numerical
simulation and experimental validation. Comput Mech 39:453476. doi:10.1007/s00466-006-
0043-z
88. Kleefsman KMT, Fekken G, Veldman AEP, Iwanowski B, Buchner B (2005) A volume-of-fluid
based simulation method for wave impact problems. J Comput Phys 206:363393
89. Fridsma G (1968) A systematic study of the rough-water performance of planing boats. David-
son Laboratory Report 1275
90. Longo J, Stern F (2005) Uncertainty assessment for towing tank tests with example for surface
combatant dtmb model 5415. J Ship Res 49:5568
91. Garcia J, Oate E (2003) An unstructured finite element solver for ship hydrodynamics prob-
lems. J Appl Mech 70:1826
92. Longo J, Shao J, Irvine M, Stern F (2007) Phase-averaged PIV for the nominal wake of a
surface ship in regular head waves. J Fluids Eng 129:524541
93. McCormick ME (2010) Ocean engineering mechanics with applications. Cambridge University
Press, Cambridge
94. Hansen SO et al (2006) The Hardanger bridge: static and dynamic wind tunnel tests with a
section model. Technical report, Prepared for Norwegian Public Roads Administration
Computational Wind-Turbine Analysis
with the ALE-VMS and ST-VMS Methods
Y. Bazilevs (corresponding author)
Structural Engineering, University of California, San Diego, 9500 Gilman Drive,
La Jolla, CA 92093, USA
e-mail: [email protected]
K. Takizawa
Department of Modern Mechanical Engineering and Waseda Institute for Advanced
Study, Waseda University, 1-6-1 Nishi-Waseda, Shinjuku-ku, Tokyo 169-8050, Japan
T. E. Tezduyar · N. Kostov · S. McIntyre
Mechanical Engineering, Rice University, 6100 Main Street, Houston, TX 77005, USA
M.-C. Hsu
Department of Mechanical Engineering, Iowa State University, 2025 Black Engineering,
Ames, IA 50011, USA
1 Introduction
Countries around the world are putting substantial effort into the development
of wind energy technologies. The ambitious wind energy goals put pressure on the
wind energy industry research and development to significantly enhance the current
wind generation capabilities in a short period of time and decrease the associated
costs. This calls for transformative concepts and designs (e.g., floating offshore wind
turbines) that must be created and analyzed with high-precision methods and tools.
These include complex-geometry, 3D, time-dependent, multi-physics predictive sim-
ulation methods and software that will play an increasingly important role as the
demand for wind energy grows.
Currently most wind-turbine aerodynamics and aeroelasticity simulations are per-
formed using low-fidelity methods, such as the Blade Element Momentum (BEM)
theory for the rotor aerodynamics employed in conjunction with simplified structural
models of the wind-turbine blades and tower (see, e.g., [1, 2]). These methods are
very fast to implement and execute. However, cases involving unsteady flow,
turbulence, 3D details of the wind-turbine blade and tower geometry, and other
similarly important features are beyond their range of applicability.
To obtain high-fidelity results for wind turbines, 3D modeling is essential. How-
ever, simulation of wind turbines at full scale engenders a number of challenges: the
flow is fully turbulent, requiring highly accurate methods and increased grid resolu-
tion. The presence of fluid boundary layers, where turbulence is created, complicates
the situation further. Wind-turbine blades are long and slender structures, with com-
plex distribution of material properties, for which the numerical approach must have
good approximation properties and avoid locking. Wind-turbine simulations involve
moving and stationary components, and the fluid–structure coupling must be accurate,
efficient and robust to preclude divergence of the computations. These challenges
explain the currently modest state of the art in wind-turbine simulations.
Fluid–structure interaction (FSI) simulations at full scale are essential for accurate
modeling of wind turbines. The motion and deformation of the wind-turbine blades
depend on the wind speed and air flow, and the air flow patterns depend on the
motion and deformation of the blades. In order to simulate the coupled problem, the
equations governing the air flow and the blade motions and deformations need to be
solved simultaneously, with proper kinematic and dynamic conditions coupling the
two physical systems. Without that, the modeling cannot be realistic: unsteady blade
deformation affects aerodynamic efficiency, noise generation, and response to
wind gusts. Flutter analysis of large blades operating in offshore environments is of
great importance and cannot be accomplished without FSI.
In recent years, several attempts were made to address the above-mentioned
challenges and to raise the fidelity and predictability levels of wind-turbine simulations.
Standalone aerodynamics simulations of wind-turbine configurations in
3D were reported in [3–6], while standalone structural analyses of rotor blades of
complex geometry and material composition, but under assumed wind-load conditions
or wind-load conditions coming from separate aerodynamic computations, were
reported in [7–11]. In a recent work [12] it was shown that coupled FSI modeling and
simulation of wind turbines is important for accurately predicting their mechanical
behavior at full scale.
To address the above-mentioned challenges, one should employ a combination
of numerical techniques, which are general, accurate, robust and efficient for the
targeted class of problems. Such techniques are summarized in what follows, with
some of them described in greater detail in the body of this book chapter.
Isogeometric Analysis (IGA), first introduced in [13] and further expanded on
in [14–20], is adopted as the geometry modeling and simulation framework for
wind turbines in some of the examples presented here. We use the IGA based on
NURBS (non-uniform rational B-splines), which are more efficient than standard
finite elements for representing complex, smooth geometries, such as wind-turbine
blades. The IGA was successfully employed for computation of turbulent flows
[21–26], nonlinear structures [10, 27–31], and FSI [32–35], and, in most cases, gave a
clear advantage over standard low-order finite elements in terms of solution accuracy
per degree of freedom. This is in part attributable to the higher-order smoothness of
the basis functions employed. Flows about rotating components are naturally handled
in an isogeometric framework because all conic sections, and in particular, circular
and cylindrical shapes, are represented exactly [36].
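The exact representation of circular shapes can be illustrated with a minimal sketch. It assumes the standard rational quadratic representation of a quarter of the unit circle (three control points, middle weight 1/sqrt(2)); the function names and the comparison with an unweighted, purely polynomial basis are illustrative choices and are not taken from [36].

```python
import math

# Control points of a quarter of the unit circle (first quadrant).
CONTROL_POINTS = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Standard NURBS weights for a 90-degree arc (illustrative assumption).
NURBS_WEIGHTS = [1.0, 1.0 / math.sqrt(2.0), 1.0]
# Unit weights reduce the rational basis to a plain quadratic polynomial.
POLYNOMIAL_WEIGHTS = [1.0, 1.0, 1.0]


def arc_point(xi, weights):
    """Evaluate the weighted quadratic curve at parameter xi in [0, 1]."""
    basis = [(1.0 - xi) ** 2, 2.0 * xi * (1.0 - xi), xi ** 2]
    denom = sum(b * w for b, w in zip(basis, weights))
    x = sum(b * w * p[0] for b, w, p in zip(basis, weights, CONTROL_POINTS)) / denom
    y = sum(b * w * p[1] for b, w, p in zip(basis, weights, CONTROL_POINTS)) / denom
    return x, y


def max_radius_error(weights, n=101):
    """Largest deviation of the curve from the unit circle over n samples."""
    return max(
        abs(math.hypot(*arc_point(i / (n - 1), weights)) - 1.0) for i in range(n)
    )


if __name__ == "__main__":
    print("rational (NURBS) error:", max_radius_error(NURBS_WEIGHTS))
    print("polynomial error:      ", max_radius_error(POLYNOMIAL_WEIGHTS))
```

With the rational weights the sampled radius matches the unit circle to round-off, whereas the polynomial curve built on the same control points deviates visibly, which is the property exploited for rotating components.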
The blade structure is governed by the isogeometric rotation-free shell formulation
with the aid of the bending-strip method [10]. The method is appropriate for thin-shell
structures comprised of multiple $C^1$- or higher-order continuous surface patches that
are joined or merged with continuity no greater than $C^0$. The Kirchhoff–Love shell
theory that relies on higher-order continuity of the basis functions is employed in
the patch interior as in [31]. Although NURBS-based IGA is employed in this work,
other discretizations, such as T-splines [19, 20] or subdivision surfaces [37–39], are
perfectly suited for the proposed structural modeling method.
In addition, an isogeometric representation of the analysis-suitable geometry can
be used in generating tetrahedral and hexahedral meshes for computations with the
finite element method (FEM). In this article, we use tetrahedral meshes generated
that way in wind-turbine computations with the ALE-VMS and ST-VMS methods.
The ALE-VMS method [5, 34] is the variational multiscale (VMS) version of the
Arbitrary Lagrangian–Eulerian (ALE) method [40]. The VMS components are from
the residual-based VMS (RBVMS) method given in [21, 26, 41, 42]. The ST-VMS
method [43, 44] is the VMS version of the Deforming-Spatial-Domain/Stabilized
Space–Time (DSD/SST) method [45–49]. Earlier it was called DSD/SST-VMST
(i.e. the version with the VMS turbulence model) in [43]. The original DSD/SST formulation
was named DSD/SST-SUPS in [43] (i.e. the version with the SUPG/PSPG
stabilization), which was also called ST-SUPS in [50].
The ALE-VMS method originated from the RBVMS formulation of incompress-
ible turbulent flows proposed in [21] for stationary meshes, and may be thought of as
an extension of the RBVMS method to moving meshes. As such, it was presented for
the first time in [34] in the context of FSI. Although ALE-VMS gave reasonably good
results for several important turbulent flows, it was evident in [21, 24] that to obtain
accurate results for wall-bounded turbulent flows the method required relatively fine
resolution of the boundary layers. This fact makes ALE-VMS a somewhat costly
technology for full-scale wall-bounded turbulent flows at high Reynolds numbers,
which are characteristic of the present application. For this reason, a weakly enforced
essential boundary condition formulation was introduced in [51], which significantly
improved the performance of the ALE-VMS formulation in the presence of unresolved
boundary layers [22, 23, 26]. The weak boundary condition formulation may
be thought of as an extension of Nitsche's method [52] to the Navier–Stokes equations
of incompressible flows. Another interpretation of the weak boundary condition
formulation is that it is a discontinuous Galerkin method (see, e.g., [53]), where the
continuity of the basis functions is enforced everywhere in the domain interior, but
not at the domain boundary.
The DSD/SST formulation was introduced in [45–47] as a general-purpose
interface-tracking (moving-mesh) technique for flows with moving boundaries and
interfaces, including FSI and flows with moving objects. Its stabilization com-
ponents are the Streamline-Upwind/Petrov-Galerkin (SUPG) [54] and Pressure-
Stabilizing/Petrov-Galerkin (PSPG) [45, 55] stabilizations. It also includes the
LSIC (least-squares on incompressibility constraint) stabilization. Some of the
earliest FSI computations with the DSD/SST formulation were reported in [56] for
vortex-induced vibrations of a cylinder and in [57] for flow-induced vibrations of a
flexible, cantilevered pipe (1D structure with 3D flow). The DSD/SST formulation
has been used extensively in 3D computations of parachute FSI, starting with the
3D computations reported in [58] and evolving to computations with direct cou-
pling [59]. New versions of the DSD/SST formulation introduced in [49] are the
core technologies of the Stabilized ST FSI (SSTFSI) technique, which was also
introduced in [49]. The ST-VMS method and SSTFSI technique, combined with a
number of special techniques (see [60–63] and references therein), have been used
in some of the most challenging parachute FSI computations (see [60, 64–66] and
references therein), and also in a good number of patient-specific cardiovascular FSI
and fluid mechanics computations (see [61–63, 67] and references therein). Computations
with the SSTFSI technique also received substantial attention in research
related to iterative solution of large linear systems [68, 69].
In application of the DSD/SST formulation to flows with moving objects, the
Shear–Slip Mesh Update Method (SSMUM) [70–72] has been instrumental.
The SSMUM was first introduced for computation of flow around two high-speed
trains passing each other in a tunnel (see [70]). The challenge was to accurately
and efficiently update the meshes used in computations based on the DSD/SST
formulation and involving two objects in fast, linear relative motion. The idea behind
the SSMUM was to restrict the mesh moving and remeshing to a thin layer of elements
between the objects in relative motion. The mesh update at each time step can be
accomplished by a shear deformation of the elements in this layer, followed by a
slip in node connectivities. The slip in the node connectivities, to an extent, undoes
the deformation of the elements and results in elements with better shapes than those
that were shear-deformed. Because the remeshing consists of simply redefining
the node connectivities, both the projection errors and the mesh generation cost are
minimized. A few years after the high-speed train computations, the SSMUM was
implemented for objects in fast, rotational relative motion and applied to computation
of flow past a rotating propeller [71] and flow around a helicopter [72].
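The shear-then-slip idea can be sketched for a rotating configuration as follows. This is a hypothetical 2D toy model, not the implementation of [70–72]: it assumes a single layer of quadrilateral elements between a rotating inner ring and a stationary outer ring, each with the same number of uniformly spaced nodes, and all function names are invented for illustration.

```python
import math


def shear_slip_update(angle, shift, n_nodes):
    """Return (shift, shear) for the layer after the inner ring has rotated by `angle`.

    `shift` counts how many node spacings the connectivity has slipped so far;
    `shear` is the remaining angular offset absorbed by element deformation.
    """
    spacing = 2.0 * math.pi / n_nodes
    shear = angle - shift * spacing
    # Slip: once the elements are sheared by more than half a spacing,
    # reconnect each inner node to the next outer node.
    while shear > 0.5 * spacing:
        shift += 1
        shear -= spacing
    return shift, shear


def layer_connectivity(shift, n_nodes):
    """Quadrilaterals as (inner_a, inner_b, outer_b, outer_a) node indices."""
    return [
        (e, (e + 1) % n_nodes, (e + 1 + shift) % n_nodes, (e + shift) % n_nodes)
        for e in range(n_nodes)
    ]


def rotate(n_steps, dphi, n_nodes):
    """Advance the rotation and report the final shift and the largest shear seen."""
    shift, max_shear = 0, 0.0
    for step in range(1, n_steps + 1):
        shift, shear = shear_slip_update(step * dphi, shift, n_nodes)
        max_shear = max(max_shear, abs(shear))
    return shift, max_shear


if __name__ == "__main__":
    final_shift, worst = rotate(90, math.radians(4.0), 36)
    print("slips per revolution:", final_shift)
    print("largest shear angle (deg):", math.degrees(worst))
```

In this toy setting the element distortion never exceeds half a node spacing, and the only "remeshing" is the integer shift in the connectivity, which mirrors why the projection errors and mesh generation cost stay small.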
The ST-VMS method was successfully tested on computation of wind-turbine
rotor aerodynamics in [6, 73, 74]. Those computations did not include a wind-turbine
tower, and therefore a mesh update method was not required. In [75], the ST-VMS
method was applied to computation of wind-turbine rotor and tower aerodynamics.
The presence of a tower requires a mesh update method that can handle the fast,
rotational relative motion between the rotor and tower. The SSMUM would have
been an option, but we decided to use a method that is more general. We use NURBS
basis functions for the temporal representation of the rotor motion, mesh motion and
also in remeshing. This is essentially the same computational technology used in
the ST-VMS computations of flapping-wing aerodynamics reported in [76–79]. We
named it ST/NURBS Mesh Update Method (STNMUM) in [75]. The motion of the
rotor surface mesh created from the NURBS geometry is represented by quadratic
temporal NURBS basis functions, with a sufficient number of temporal patches for
one rotation. This enables us to represent the circular paths associated with the rotor
motion exactly and, with a secondary mapping [43, 44, 50, 76], specify a constant
angular velocity corresponding to the invariant speeds along those paths. Given
the motion of the surface mesh, we compute meshes that serve as temporal-control
points. This is done by creating, with an automatic mesh generator, a new mesh at the
central control point of the temporal patch, and computing the meshes at the other two
control points by using the mesh moving technique [49, 80–83] developed earlier in
conjunction with the DSD/SST method. The STNMUM allows us to do mesh computations
at longer time intervals, but still get the mesh-related information for each
ST slab, such as the coordinates and their time derivatives, from the temporal representation
whenever we need it. This approach, where the mesh-related information is
computed directly, was called Direct Temporal Representation (DTR) in [75]. In
an alternative approach, we can obtain the mesh-related data after first computing the
finite element meshes associated with each ST slab by interpolation from the tempo-
ral NURBS representation of the mesh. This approach was called Interpolated-Mesh
Temporal Representation (IMTR) in [75]. For better mesh resolution, we use layers
of thin elements near the blade surfaces. These layers of elements are created with a
special mesh generation process and are not part of what we create with the automatic
mesh generation process. They undergo rigid-body motion with the rotor. Despite
the fast, rotational relative motion between the rotor and tower, the computations
reported in [75] were carried out by using an automatic mesh generator only a total
of 6 times during an entire computation.
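The idea of extracting mesh coordinates and their time derivatives directly from a temporal representation can be sketched as below. This is a simplified illustration under stated assumptions, not the method of [75]: it uses a non-rational quadratic (Bernstein) basis over one temporal patch, a linear map between time and the patch parameter, and three hypothetical temporal-control meshes given as lists of node coordinates.

```python
def bernstein2(s):
    """Quadratic Bernstein basis on s in [0, 1]."""
    return ((1.0 - s) ** 2, 2.0 * s * (1.0 - s), s ** 2)


def bernstein2_ds(s):
    """Derivatives of the quadratic Bernstein basis with respect to s."""
    return (-2.0 * (1.0 - s), 2.0 - 4.0 * s, 2.0 * s)


def mesh_from_temporal_patch(control_meshes, t, t_start, t_end):
    """Node coordinates and mesh velocities at time t.

    control_meshes: three meshes (lists of coordinate tuples) acting as the
    temporal control points of one patch spanning [t_start, t_end].
    """
    s = (t - t_start) / (t_end - t_start)  # linear time map (assumption)
    basis = bernstein2(s)
    dbasis_dt = [d / (t_end - t_start) for d in bernstein2_ds(s)]
    coords, velocities = [], []
    for node_ctrl in zip(*control_meshes):  # the 3 control positions of one node
        dim = len(node_ctrl[0])
        coords.append(
            tuple(sum(b * p[k] for b, p in zip(basis, node_ctrl)) for k in range(dim))
        )
        velocities.append(
            tuple(sum(d * p[k] for d, p in zip(dbasis_dt, node_ctrl)) for k in range(dim))
        )
    return coords, velocities


if __name__ == "__main__":
    # Two nodes: one translating in x, one translating in y (made-up data).
    meshes = [
        [(0.0, 0.0), (1.0, 0.0)],
        [(1.0, 0.0), (1.0, 1.0)],
        [(2.0, 0.0), (1.0, 2.0)],
    ]
    print(mesh_from_temporal_patch(meshes, 0.5, 0.0, 2.0))
```

Only the three control meshes need to be stored for the patch; any ST slab inside the patch can query its coordinates and mesh velocities on demand, which is the practical appeal of computing the information directly.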
We refer the interested reader to [50, 74] for the following methods that are not
reviewed in this article: ALE-VMS and ST-VMS methods, formulation for weakly
enforced essential boundary conditions, structural mechanics formulation, which is
based on the Kirchhoff–Love thin-shell theory and the bending-strip method (see
[10, 12, 31]), FSI coupling, mesh update, and a method for pre-bending of wind-turbine
blades, which was recently proposed in [11].
In Sect. 2, we present the sliding-interface formulation from [36, 84, 85], which
enables the simulation of rotor–tower interaction. The formulation was used in [85]
In order to simulate the full wind turbine configuration and investigate the rotor–tower
interaction, we consider an approach that makes use of a moving subdomain, which
encloses the entire wind turbine rotor, and a stationary subdomain that contains the
rest of the wind turbine (see Fig. 1). The two domains are in relative motion and
share a sliding cylindrical interface. The meshes on each side of the interface are
nonmatching because of the relative motion (see Fig. 2). As a result, a numerical
procedure is needed to impose the continuity of the kinematics and tractions at the
stationary and rotating subdomain interface despite the fact that the interface dis-
cretizations are incompatible. Such a procedure was developed in [36] in the context
of IGA for computing flows about rotating components. The advantage of IGA for
Fig. 2 Nonmatching meshes at the sliding interface between the stationary and moving subdomains.
Left Full domain. Right Zoom on the sliding interface
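$$
\begin{aligned}
\cdots
&- \sum_{b=1}^{n_{eb}} \int_{\Gamma^b \cap (\Gamma_t)_{\mathrm{SI}}} \left(\mathbf{w}_S^h - \mathbf{w}_M^h\right) \cdot \frac{1}{2}\left(\boldsymbol{\sigma}_S \mathbf{n}_S - \boldsymbol{\sigma}_M \mathbf{n}_M\right) d\Gamma \\
&- \sum_{b=1}^{n_{eb}} \int_{\Gamma^b \cap (\Gamma_t)_{\mathrm{SI}}} \frac{1}{2}\left(\delta\boldsymbol{\sigma}_S \mathbf{n}_S - \delta\boldsymbol{\sigma}_M \mathbf{n}_M\right) \cdot \left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma \\
&- \sum_{b=1}^{n_{eb}} \int_{\Gamma^b \cap (\Gamma_t)_{\mathrm{SI}}} \mathbf{w}_S^h \cdot \rho \left\{\left(\mathbf{u}_S^h - \hat{\mathbf{u}}_S^h\right) \cdot \mathbf{n}_S\right\}_{-} \left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma \\
&- \sum_{b=1}^{n_{eb}} \int_{\Gamma^b \cap (\Gamma_t)_{\mathrm{SI}}} \mathbf{w}_M^h \cdot \rho \left\{\left(\mathbf{u}_M^h - \hat{\mathbf{u}}_M^h\right) \cdot \mathbf{n}_M\right\}_{-} \left(\mathbf{u}_M^h - \mathbf{u}_S^h\right) d\Gamma \\
&+ \sum_{b=1}^{n_{eb}} \int_{\Gamma^b \cap (\Gamma_t)_{\mathrm{SI}}} \frac{C_I^B \mu}{h_n} \left(\mathbf{w}_S^h - \mathbf{w}_M^h\right) \cdot \left(\mathbf{u}_S^h - \mathbf{u}_M^h\right) d\Gamma = 0,
\end{aligned}
\tag{1}
$$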
The computational results in this section make use of the ALE-VMS technique and
are taken from [85]. The sliding-interface formulation is applied to the simulation
of the full NREL Phase VI wind turbine configuration, including the rotor (blades
and hub), nacelle and tower. The tower is composed of two cylinders with diameters
of 0.6096 and 0.4064 m that are connected with a short conical section. The tower
height is 11.144 m above the wind tunnel floor. The detailed geometry of the tower
and nacelle can be found in Hand et al. [86]. For this study, wind speeds of 7 and
10 m/s were selected from the experimental sequence S. The experimental sequence
S setup consists of the wind turbine rotor in the upwind configuration, 0° yaw angle,
0° cone angle, rotational speed of 72 rpm, and blade tip pitch angle of 3°. The air
density and viscosity are 1.23 kg/m³ and 1.78 × 10⁻⁵ kg/(m·s), respectively.
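For orientation, a few derived quantities follow directly from the values quoted above. The short sketch below is only an arithmetic aid; the choice of the tower diameter as the reference length for a Reynolds number is an illustrative assumption and is not a quantity reported in [85].

```python
import math

RHO = 1.23          # air density, kg/m^3 (from the text)
MU = 1.78e-5        # dynamic viscosity, kg/(m s) (from the text)
ROTOR_RPM = 72.0    # rotational speed, rpm (from the text)
TOWER_DIAMETERS = (0.6096, 0.4064)  # lower and upper tower cylinders, m
WIND_SPEEDS = (7.0, 10.0)           # m/s


def angular_speed(rpm):
    """Rotor angular speed in rad/s."""
    return rpm * 2.0 * math.pi / 60.0


def revolution_period(rpm):
    """Time for one full rotor revolution in seconds."""
    return 60.0 / rpm


def reynolds_number(speed, length):
    """Reynolds number for a given speed and reference length (assumed choice)."""
    return RHO * speed * length / MU


if __name__ == "__main__":
    print("angular speed (rad/s):", angular_speed(ROTOR_RPM))
    print("revolution period (s):", revolution_period(ROTOR_RPM))
    for u in WIND_SPEEDS:
        for d in TOWER_DIAMETERS:
            print(f"Re(U={u} m/s, D={d} m) = {reynolds_number(u, d):.3e}")
```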
Figure 3 shows the mesh resolution used in the computation. The mesh is highly
refined near the rotor, nacelle and tower, as well as downstream of the wind turbine
to better capture the wake turbulence. The mesh is comprised of 6,835,647 linear
elements and 1,603,377 nodes. The size of the first boundary-layer element in the
Fig. 3 Meshes used in the full-wind-turbine simulation. Left 2D cut at x = 0 to show the flow
domain mesh quality. Right Rotor, nacelle, and tower surface mesh
Fig. 4 Air speed planar distribution and isosurfaces at an instant for the 7 m/s case
Fig. 5 Air speed planar distribution and isosurfaces at an instant for the 10 m/s case
Fig. 6 Single-blade aerodynamic torque over a full revolution for 7 m/s (left) and 10 m/s (right)
cases. The 180° azimuthal angle corresponds to the instant when the blade passes in front of the tower.
The tower effect is clearly pronounced in the 7 m/s case. It is also present in the 10 m/s case, but is
not as significant. The results in both cases are in very good agreement with the experimental data
To see the influence of the tower, the single-blade aerodynamic torque over a full
revolution is plotted in Fig. 6 for both 7 and 10 m/s cases. The results of the full-
wind-turbine computations are compared with the experimental data, as well as with
the results of the rotor-only computations. For the full-wind-turbine simulation of
the 7 m/s case, Fig. 6 clearly shows the drop in the aerodynamic torque at an instant
when the blade passes in front of the tower, which corresponds to the azimuthal angle
of 180°. The drop in the torque is about 8 % relative to its value when the blade is
away from the tower. These results are in good agreement with the experimental data.
The rotor-only computation, which is also shown in the figure, is obviously unable
to predict this feature, which may be important for the transient structural response
of the blades. It should be noted, however, that the cycle-averaged aerodynamic
torque is nearly identical for the full-wind-turbine and the rotor-only simulations.
The picture is completely different for the 10 m/s case, where the influence of the
tower, although clearly present, is a lot less pronounced.
We use quadratic NURBS functions, as described in [43, 44, 50, 76], to represent
a circular arc. We discretize time and position as follows:
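\[
t = \sum_{\alpha=1}^{n_{ent}} T^{\alpha}\left(\Theta_t(\theta)\right) t^{\alpha}, \qquad
x = \sum_{\alpha=1}^{n_{ent}} T^{\alpha}\left(\Theta_x(\theta)\right) x^{\alpha}. \tag{2}
\]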
Here $n_{ent}$ is the number of temporal element nodes, $T^{\alpha}$ is the basis function, $\Theta_t(\theta)$
and $\Theta_x(\theta)$ are the secondary mappings for time and position, and $t^{\alpha}$ and $x^{\alpha}$ are the
time and position values corresponding to the basis function $T^{\alpha}$. The basis functions
could be finite element or NURBS basis functions. For the circular arc, $n_{ent} = 3$ and
they are quadratic NURBS. The secondary mapping concept above was introduced
in [43], and the velocity can be expressed as follows:
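\[
\frac{dx}{dt} =
\left( \sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_x}\, x^{\alpha}\, \frac{d\Theta_x}{d\theta} \right)
\left( \sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_t}\, t^{\alpha}\, \frac{d\Theta_t}{d\theta} \right)^{-1}, \tag{3}
\]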
leading to
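\[
\frac{dx}{dt} =
\left( \sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_x}\, x^{\alpha} \right)
\left( \sum_{\alpha=1}^{n_{ent}} \frac{dT^{\alpha}}{d\Theta_t}\, t^{\alpha} \right)^{-1}
\frac{d\Theta_x}{d\theta}\, \frac{d\theta}{d\Theta_t}. \tag{4}
\]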
Thus, the speed along the path can be specified by modifying only the secondary
mapping. For a circular arc, two methods were introduced in [44, 76]; one is modifying
the secondary mapping for position and the other one is modifying both such that
$d\Theta_t/d\theta$ is constant. We note that, in theory, the secondary mapping selections do not make
any difference as long as the relationship $d\Theta_x/d\Theta_t$ is the same. In our implementation,
to keep the process general, we search for the parametric coordinate by using
an iterative solution method [44, 50, 76]. We use the latter set of the secondary
mappings, having constant $d\Theta_t/d\theta$. For the IMTR, we find the parametric coordinate
corresponding to each time level and interpolate the position to obtain the corresponding
mesh. For the DTR, we first calculate time corresponding to each integration point,
including the time step size because of the jump term, and then calculate $\Theta_x$ and $\Theta_t$
to interpolate the position and velocity from Eqs. (2) and (4).
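The idea of representing a circular path with a quadratic NURBS element and searching iteratively for the parametric coordinate can be illustrated with a minimal sketch. This is not the implementation of [44, 50, 76]; the function names, the use of bisection, and the choice of a parameter in [0, 1] are assumptions made for illustration only. The arc uses the standard three-control-point construction, with the middle weight equal to the cosine of half the arc angle.

```python
import math

def arc_point(u, radius, phi0, dphi):
    """Point at parameter u in [0, 1] on a circular arc built from one
    quadratic NURBS element (three control points, hypothetical helper)."""
    half = 0.5 * dphi
    weights = (1.0, math.cos(half), 1.0)
    # End control points lie on the circle; the middle one lies where the
    # end tangents intersect, at distance radius / cos(half) from the center.
    radii = (radius, radius / math.cos(half), radius)
    angles = (phi0, phi0 + half, phi0 + dphi)
    bernstein = ((1.0 - u) ** 2, 2.0 * u * (1.0 - u), u ** 2)
    denom = sum(w * b for w, b in zip(weights, bernstein))
    x = sum(w * b * r * math.cos(a)
            for w, b, r, a in zip(weights, bernstein, radii, angles)) / denom
    y = sum(w * b * r * math.sin(a)
            for w, b, r, a in zip(weights, bernstein, radii, angles)) / denom
    return x, y

def swept_angle(u, radius, phi0, dphi):
    """Angle swept from the start of the arc (valid for dphi < pi)."""
    x, y = arc_point(u, radius, phi0, dphi)
    return math.atan2(y * math.cos(phi0) - x * math.sin(phi0),
                      x * math.cos(phi0) + y * math.sin(phi0))

def find_parameter(target_angle, radius, phi0, dphi, iterations=60):
    """Bisection search for the parameter whose swept angle is target_angle."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if swept_angle(mid, radius, phi0, dphi) < target_angle:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def position_at_time(t, t_start, t_end, radius, phi0, dphi):
    """Position at time t, assuming constant angular speed on [t_start, t_end]."""
    target = dphi * (t - t_start) / (t_end - t_start)
    u = find_parameter(target, radius, phi0, dphi)
    return arc_point(u, radius, phi0, dphi)
```

For example, `position_at_time(0.25, 0.0, 1.0, 2.0, 0.3, math.radians(20.0))` returns a point that lies on the circle of radius 2.0, a quarter of the way along the arc in angle. The search is what enforces the constant angular speed, since the raw NURBS parameter is not proportional to the angle.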
The geometry construction for the wind-turbine rotor blade and hub was described
in [5, 6], and also partially in [73, 75]. For completeness we repeat some of that
information here. The geometry of the rotor blade is based on the NREL 5MW
offshore baseline wind turbine reported in [90]. A 61 m blade is attached to a hub
with radius of 2 m, making the total rotor radius, R, 63 m. The blade is composed
of several airfoil types. The first portion of the blade is a perfect cylinder. Farther
away from the root the cylinder is smoothly blended into a series of DU (Delft
University) airfoils. Starting at 44.55 m from the root and all the way to the tip,
the NACA64 profile is used. For each cross-section, we use quadratic NURBS to
represent the 2D airfoil shape. The weights of the NURBS functions are set to unity,
except near the root, where they are adjusted to represent the circular cross-sections exactly.
The cross-sections are lofted along the blade axis direction, also using quadratic
NURBS and unit weights. This geometry-construction process yields a smooth blade
surface with a relatively small number of input parameters, which is an advantage of
the isogeometric representation. Images of the airfoil types used in the wind-turbine
rotor blade and the final blade including the twisting cross-sections can be found in
[5, 6, 73]. Starting from this rotor surface geometry, we generate a quadratic NURBS
surface with G 2 and G 1 continuity between the patches around and along the blade,
respectively. The tower geometry was created based on the tower design specified
for the NREL 5MW wind turbine, which describes a circular tower with a height of
87.6 m, a base diameter of 6 m, and a top diameter of 3.87 m. This geometry was
generated by lofting between NURBS curves for the top and base of the tower. The
rotor axis is at 90° to the tower, and there is no tilt or precone. The distance between
the tower axis and the point where the three blade axes intersect is 5 m. For most of
the blade, the clearance from the tower is in the range 2.3–2.8 m.
We compute the aerodynamics of the rotor with and without its tower for a given
rotor shape and wind speed and a specified rotor speed. The wind speed is uniform
at 9 m/s and the rotor speed is 1.08 rad/s, giving a tip speed ratio of 7.55 (see [91] for
The circular turbine rotation is represented with temporal NURBS basis functions
and secondary mapping, described in Sect. 3.1. Because the three blades of the
turbine are 120° apart, rotational geometric periodicity is used such that a full 360°
rotation is defined by three identical 120° segments. Each 120° segment is divided
into six patches to keep the mesh distortion under control. Each patch has three
temporal-control points. The six temporal patches and their control points are shown
in Fig. 7.
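The temporal layout described above (three periodic 120° segments, six patches per segment, three temporal-control points per patch) amounts to simple bookkeeping. The following sketch shows one way to map a rotation angle to its segment and patch; the function names and the zero-based indexing are assumptions made for illustration and are not taken from the source.

```python
N_SEGMENTS = 3                       # one 120-degree segment per blade
PATCHES_PER_SEGMENT = 6
SEGMENT_ANGLE_DEG = 360.0 / N_SEGMENTS
PATCH_ANGLE_DEG = SEGMENT_ANGLE_DEG / PATCHES_PER_SEGMENT   # 20 degrees

def locate(angle_deg):
    """Return (segment, patch, local_angle_deg), all zero-based, for a
    rotor rotation angle given in degrees."""
    a = angle_deg % 360.0
    segment = int(a // SEGMENT_ANGLE_DEG)
    within_segment = a - SEGMENT_ANGLE_DEG * segment
    patch = int(within_segment // PATCH_ANGLE_DEG)
    local = within_segment - PATCH_ANGLE_DEG * patch
    return segment, patch, local

def control_point_angles(patch):
    """Rotation angles (degrees, within a segment) associated with the three
    temporal-control points of a patch: start, middle, and end."""
    start = patch * PATCH_ANGLE_DEG
    return (start, start + 0.5 * PATCH_ANGLE_DEG, start + PATCH_ANGLE_DEG)
```

Because of the rotational periodicity, only the six patches of one segment need to be stored; a rotation angle in another segment reuses the same patch with the blade indices permuted.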
The rotor surface mesh is generated by discretizing the NURBS surface geometry at
each knot intersection, subdividing the knot spans into quadrilateral finite elements
in a structured way, and subdividing the quadrilateral elements into two triangles.
Small adjustments are made to improve the mesh near the hub. The surface mesh
position is calculated at each temporal-control point shown in Fig. 7. Figure 8 shows
the rotor surface at the three temporal-control points of the first patch. We note that
control points 1 and 3 lie on the path traveled by the points on the blades and a
portion of the hub at the start and end of the 20° rotation, but control point 2 lies
outside the circular arc. This means that the temporal-control mesh 2 is deformed
The layers of thin elements near the blades are generated by extruding the NURBS
surface geometry into NURBS volume representation, subdividing the knot spans
into hexahedral finite elements in a structured way, and subdividing the hexahedral
elements into six tetrahedral elements. The resulting boundary-layer mesh for each
blade consists of four layers with a first-layer thickness of about 2.85 × 10⁻² m and
a total thickness of about 2.85 × 10⁻¹ m, 52 nodes in the circumferential direction
around the blade, and approximately 145 nodes in the longitudinal direction. The
tower boundary-layer mesh is generated by extruding the tower surface mesh to layers
of prismatic elements, which are then subdivided into three tetrahedral elements
each. It consists of four layers, with a first-layer thickness of 2.85 × 10⁻² m and a
total thickness of 3.0 × 10⁻¹ m. The blade and tower boundary-layer meshes do not
undergo any mesh deformation. This maintains the mesh quality in the boundary-
layer regions. Figure 9 shows the tower and blade boundary-layer meshes.
Three different meshes are used in the computations: Mesh 1, Mesh 2, and Mesh
3. Mesh 2 has both the rotor and the tower, with boundary-layer mesh only for the
blades. Mesh 1 has only the rotor, and is identical to Mesh 2 except the tower is filled
with volume elements. Mesh 3 has both the rotor and the tower, with boundary-layer
mesh for both the blades and the tower, and a mesh refinement region downstream of
the tower. All three meshes have an outer, coarser region, with an inner cylindrical
refinement region surrounding the rotor. This inner refinement region includes most
of the tower for Mesh 2 and Mesh 3, and the mesh refinement region downstream
of the tower for Mesh 3. Figure 10 illustrates, as an example, a cut plane of Mesh
3. The inflow and outflow boundaries are at 3.79R and 10.35R from the hub center,
respectively. The side, top, and bottom boundaries are at 2.29R, 3.17R, and 1.43R,
respectively (see Fig. 10). The volume mesh is generated once per patch using an
automatic mesh generator (a total of 6 times). The mesh is generated at control
point 2 of each patch to minimize mesh distortion between control points. We note
that only the mesh in the inner cylindrical refinement region surrounding the rotor
is generated for each patch. The outer, coarser mesh is generated only once, and
is kept the same when the inner meshes are generated for each patch. The mesh
moving technique [49, 80–83] developed earlier in conjunction with the DSD/SST
method is used for computing the mesh position for control points 1 and 3. The outer
surfaces of the boundary-layer meshes serve as the boundaries where we specify
the inner boundary conditions for the mesh motion. The external boundaries of the
computational domain serve as the boundaries where we specify the outer boundary
conditions, with zero displacement. In the elasticity equations of the mesh moving
technique, a Young's modulus of 1.0, a Poisson's ratio of 0.20, and a stiffening
exponent of 1.5 are used. We use 1,500 GMRES [92] iterations for each step of the
mesh motion, with a diagonal preconditioner. Each 10° range of motion is computed
over 40 steps. The approximate numbers of nodes for Mesh 1, Mesh 2, and Mesh 3
are 465,000, 440,000, and 595,000, respectively.
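The role of the stiffening exponent can be illustrated with a small sketch. It assumes the commonly described Jacobian-based stiffening, in which each element's contribution to the mesh-moving elasticity equations is scaled by $(J^0/J^e)^{\chi}$, with $J^e$ the element Jacobian, $J^0$ a free reference value, and $\chi$ the stiffening exponent. The function name and the sample Jacobian values are hypothetical, and the sketch is not taken from [49, 80–83].

```python
def stiffening_factor(jacobian, reference_jacobian=1.0, exponent=1.5):
    """Relative stiffness multiplier (J0 / J)**chi for one element,
    under the assumed Jacobian-based stiffening form."""
    if jacobian <= 0.0:
        raise ValueError("element Jacobian must be positive")
    return (reference_jacobian / jacobian) ** exponent

# Illustrative (made-up) element Jacobians spanning small to large elements.
factors = {j: stiffening_factor(j) for j in (0.01, 0.1, 1.0, 10.0)}
```

With a positive exponent, smaller elements receive a larger multiplier than larger ones, so most of the deformation is absorbed by the larger elements away from the moving surfaces.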
In the ST-VMS computations, the stabilization parameters are given by Eq. (7) in [49]
for $\tau_M$ (= $\tau_{SUPS}$) = $\tau_{SUPG}$ and Eq. (19) in [75] for $\nu_C$ (= $\nu_{LSIC}$) = $\nu_{LSIC\text{-}HRGN}$. They
are both used with $h_{RGN} = h_{RGNT}$, given by Eq. (15) in [75], which was originally
introduced in [76]. The DTR and IMTR approaches are used on all three meshes.
Least-squares projection is used to interpolate the velocity and pressure between
temporal patches. Because the boundary-layer meshes and the tower and rotor surface
meshes remain identical between temporal patches, the velocity values are transferred
exactly for those nodes. The time-step size is 2.23 × 10⁻³ s (145 time steps per
patch), with four nonlinear iterations per time-step. First we develop the flow field
for 500 time steps while the rotor is static, ramping up the inflow velocity during
the first 300 steps from zero to the wind speed using a cosine ramp. During this
flow-development stage, we use 150, 150, 200, and 400 GMRES iterations for the
four nonlinear iterations. In computations with the rotor in motion, we use 150, 150,
200, and 400 GMRES iterations for Mesh 1, and 150, 250, 350, and 500 GMRES
iterations for Mesh 2 and Mesh 3. With the GMRES iterations in flow computations,
we use a nodal-block-diagonal preconditioner. The mesh is partitioned based on the
METIS algorithm [93] to improve parallel efficiency in the computations.
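The time-step sizes quoted in this section are consistent with dividing the time needed to traverse one 20° temporal patch, at the rotor speed of 1.08 rad/s, by the number of time steps per patch. The sketch below checks that arithmetic and also shows one possible cosine ramp for the inflow speed. The source does not give the ramp formula, so the half-cosine form used here is an assumption, as are the function names.

```python
import math

ROTOR_SPEED = 1.08                  # rad/s, as stated in the text
PATCH_ANGLE = math.radians(20.0)    # 120-degree segment / 6 patches
WIND_SPEED = 9.0                    # m/s, as stated in the text

def time_step(steps_per_patch):
    """Time-step size (s) when one patch is covered in steps_per_patch steps."""
    return PATCH_ANGLE / ROTOR_SPEED / steps_per_patch

def inflow_speed(step, wind_speed=WIND_SPEED, ramp_steps=300):
    """Assumed half-cosine ramp from zero to wind_speed over ramp_steps steps."""
    if step >= ramp_steps:
        return wind_speed
    return 0.5 * wind_speed * (1.0 - math.cos(math.pi * step / ramp_steps))

dt_standard = time_step(145)   # about 2.23e-3 s
dt_coarse = time_step(72)      # about 4.49e-3 s
dt_fine = time_step(725)       # about 4.46e-4 s
```

The half-cosine form has zero slope at both ends of the ramp, which avoids an abrupt change in the inflow at the start and end of the flow-development stage.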
3.8 Results
Figure 11 shows the torque for Mesh 1 with the DTR approach, for the last 360°
rotation of a blade, with the rotation amount measured from the orientation seen
in Fig. 7. For reference purposes, Fig. 11 includes the NREL data. The torque is
within 8 % of the NREL data. Figure 12 shows the torque for the last 80° rotation
of a single blade of Mesh 1 with the DTR approach, compared with the torque from
an earlier, single-blade computation [73] using the TGI option of $\nu_C$ (= $\nu_{LSIC}$). The
single-blade computation has the same blade geometry, wind speed, and rotor speed,
but has a single-blade mesh in a rotationally-periodic domain. It has a more refined
boundary-layer mesh and a time-step size that is approximately five times smaller.
The higher torque seen for the single-blade computation may be due to the fact
that the computation was carried out for a much shorter duration, only 80° of rotation
versus 1,080° for the Mesh 1 computation. Therefore the current computation
likely represents a more settled torque value. The higher torque for the single-blade
computation may also be due to the fact that the computation was carried out using a
computational domain with significantly nearer lateral boundaries. Figures 13 and
14 show the torque for all three meshes with the DTR and IMTR approaches. As can
be seen from these figures, Mesh 1 (no tower) has a very stable torque, while Mesh 2
and Mesh 3 (with tower) exhibit a significant but expected drop in torque each time
a blade passes the tower. Figure 15 shows, for each of the three meshes, the torque
obtained with the DTR and IMTR approaches. The figure illustrates that the DTR and
IMTR approaches result in a nearly identical torque magnitude for all three meshes.
Figure 16 shows the torque for Mesh 1 with the DTR approach, using two different
time-step sizes: 2.23 × 10⁻³ s (145 time steps per patch) and 4.49 × 10⁻³ s (72 time
steps per patch). Doubling the time-step size still yields a comparable torque value,
within 10 % of the value for the smaller time-step size. We also carried out a computation
with the convective form of the ST-VMS formulation (see Eq. (8.17) in [44]),
but with a smaller time-step size: 4.46 × 10⁻⁴ s (725 time steps per patch). Figure 17
shows the torque for Mesh 2 with the DTR approach and the conservative and convective
forms of the ST-VMS formulation. The conservative-form computation is with
the standard time-step size: 2.23 × 10⁻³ s (145 time steps per patch). Figure 18
shows the torque for the individual blades of Mesh 2 with the DTR approach.
The figure clearly shows the expected torque drop for each blade as it passes the
tower, while the other two blades maintain relatively constant torque. Figure 19
shows the torque for 10 equal-length spanwise sections of a blade of Mesh 2 with the
DTR approach. The greatest amount of torque is generated in sections 6–9 of the blade,
while section 10 at the tip and the other lower sections generate less torque. Figure 20
shows a volume rendering of the vorticity for Mesh 2 with the DTR approach. The
flow patterns vary considerably along each blade length, illustrating the necessity to
carry out the computations in 3D.
Figure 21 shows the pressure coefficient at 0.90R for the last 0° orientation of a
blade of Mesh 2, with the DTR and IMTR approaches, with the last 0° orientation
being common between the two computations. There is very little difference in the
pressure coefficient around the blades between the DTR and IMTR approaches.
Figure 22 shows the pressure coefficient at 0.90R for the last 180° orientation of a
blade of Mesh 1, Mesh 2 and Mesh 3, with the DTR approach, with the last 180°
orientation being common between Mesh 2 and Mesh 3 computations. The averaged
torques (in MNm) for the last 360° rotation for Mesh 1, 2 and 3 are 2.31, 2.34 and
2.39 for the DTR approach, and 2.32, 2.34 and 2.35 for the IMTR approach. The
values show that the difference in torque between the DTR and IMTR approaches,
and between Mesh 2 and Mesh 3, is rather small. The difference in torque between
Mesh 1 and Meshes 2 and 3 illustrates the effect of the tower.
Fig. 11 Torque for Mesh 1 with the DTR approach, compared with the NREL data (axes: Degrees, 0–360; Torque (MNm); curves: ST-VMS, NREL)
Fig. 12 Torque for a single blade of Mesh 1 with the DTR approach, compared with the torque from an earlier single-blade computation [73] using the TGI option of $\nu_C$ (= $\nu_{LSIC}$) (axes: Degrees, 0–80; Torque (MNm); curves: Mesh 1, Single-blade)
Fig. 13 Torque for Mesh 1, Mesh 2 and Mesh 3 with the DTR approach (axes: Degrees, 0–360; Torque (MNm); curves: Mesh 1, Mesh 2, Mesh 3)
Fig. 14 Torque for Mesh 1, Mesh 2 and Mesh 3 with the IMTR approach (axes: Degrees, 0–360; Torque (MNm); curves: Mesh 1, Mesh 2, Mesh 3)
This section is adapted from [87]. We simulate the Micon 65/13M wind turbine
at field test conditions [94]. Micon 65/13M is a three-blade, horizontal-axis, fixed-pitch,
upwind turbine with a total rotor diameter of 19.3 m and a rated power of
100 kW. The hub is located at a height of 23 m. The wind turbine stands on a
tubular steel tower, with a base diameter of 1.9 m. The drive train generator operates
at 1,200 rpm, while the rotor spins at a nominal speed of 55 rpm. The Micon 65/13M
wind turbine was used for the Long-Term Inflow and Structural Testing (LIST)
program [95] initiated by Sandia National Laboratories in 2001 to explore the use
of carbon fiber in wind turbine blades. Three experimental blade prototypes, GX-
100, CX-100 and TX-100, were developed specifically for this project. We use the
CX-100 conventional carbon-spar blade design [94, 96]. The NREL S821, S819 and
S820 airfoils are used to define the blade geometry. The details of the blade geometry
definition are provided in Table 1.
The blade structure is comprised of five primary sections: leading edge, trailing
edge, root, spar cap, and shear web. The sections are shown in Fig. 23. Each section is
further subdivided into zones, each consisting of a multilayer composite layup. There
is a total of 32 zones with constant total thickness and unique laminate stacking. The
effective material properties for each of the zones are computed using the procedures
described in [50, 74]. All 32 zones are identified on the blade surface and are shown
in Fig. 23. For more details of the material composition of the CX-100 blade see [87].
We perform eigenfrequency calculations of the CX-100 blade using three quadratic
NURBS meshes. The coarsest mesh has 1,846 elements, while the finest mesh has
18,611. The mesh statistics are summarized in Table 2. The eigenfrequency results
are compared with the experimental data from [97, 98]. We compute the case with
free boundary conditions and the case when the blade is clamped at the root. In both
cases, the computed natural frequencies are in good agreement with the experimental
data (see Tables 3 and 4). The medium mesh shows a good balance between the
computational cost and accuracy. For this reason, this mesh is chosen for the FSI
computations presented here. The mode shapes computed using the medium mesh
for the clamped case are shown in Fig. 24.
Fig. 15 Torque with the DTR and IMTR approaches for Mesh 1, Mesh 2, and Mesh 3 (three panels; axes: Degrees, 0–360; Torque (MNm); curves: DTR, IMTR)
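The agreement for the clamped case can be quantified directly from the values listed in Table 4. The sketch below computes the relative difference between each computed frequency and the corresponding experimental value; the variable and function names are illustrative only, and the numbers are copied from Table 4.

```python
# Natural frequencies in Hz for the clamped case (values from Table 4).
EXPERIMENT_HZ = (4.35, 11.51, 20.54)
COMPUTED_HZ = {
    "Mesh 1": (4.33, 11.82, 19.69),
    "Mesh 2": (4.29, 11.61, 19.08),
    "Mesh 3": (4.27, 11.54, 18.98),
}

def relative_errors_percent(computed, reference):
    """Signed relative difference (%) of each computed value from its reference."""
    return tuple(100.0 * (c - r) / r for c, r in zip(computed, reference))

errors = {mesh: relative_errors_percent(values, EXPERIMENT_HZ)
          for mesh, values in COMPUTED_HZ.items()}

# Largest absolute difference over all meshes and modes.
worst = max(abs(e) for errs in errors.values() for e in errs)
```

For these values, the first two modes differ from the experiment by less than 3 % on every mesh, and the third mode shows the largest difference, staying below 8 %.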
Fig. 16 Torque for Mesh 1 with the DTR approach, using two different time-step sizes: 2.23 × 10⁻³ s (145 time steps per patch) and 4.49 × 10⁻³ s (72 time steps per patch) (axes: Degrees, 0–360; Torque (MNm); curves: 145 time steps per patch, 72 time steps per patch)
Fig. 17 Torque for Mesh 2 with the DTR approach and the conservative and convective forms of the ST-VMS formulation. The time-step sizes are 4.46 × 10⁻⁴ s (725 time steps per patch) for the convective form and 2.23 × 10⁻³ s (145 time steps per patch) for the conservative form. The torques are from the same period in a rotation cycle, but the conservative-form torque is from the last 360° of the computation, and the convective-form torque is from a recently started, ongoing computation (axes: Degrees, 70–120; Torque (MNm); curves: Conservative ST-VMS, Convective ST-VMS)
Fig. 18 Torque for the individual blades of Mesh 2 with the DTR approach (axes: Degrees, 0–360; Torque (MNm); curves: Blade 1, Blade 2, Blade 3)
Fig. 19 Torque for 10 equal-length spanwise sections of a blade of Mesh 2 with the DTR approach (horizontal axis: Degrees, 0–360; curves: sections 1–10)
Fig. 20 Volume rendering of the vorticity (in s⁻¹) from the last 360° of the computation for Mesh 2 with the DTR approach
Fig. 21 Pressure coefficient at 0.90R for the last 0° orientation of a blade of Mesh 2, with the DTR (left) and IMTR (right) approaches
Fig. 22 Pressure coefficient at 0.90R for the last 180° orientation of a blade of Mesh 1, Mesh 2, and Mesh 3, with the DTR approach
In this section, we present aerodynamic and FSI simulations. For both cases,
a constant inflow wind speed of 10.5 m/s and fixed rotor speed of 55 rpm are pre-
scribed. These correspond to the operating conditions reported for the field tests
in [94]. The air density and viscosity are 1.23 kg/m³ and 1.78 × 10⁻⁵ kg/(m·s),
respectively. Zero traction boundary conditions are prescribed at the outflow and no-
penetration boundary conditions are prescribed at the top, bottom, and side surfaces
of the outer (stationary) computational domain. No-slip boundary conditions are
prescribed at the rotor, nacelle, and tower, and are imposed weakly.
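The prescribed operating conditions determine a few derived quantities that help in reading the results, such as the blade tip speed and the interval between successive blade passages in front of the tower. The sketch below computes them from the rotor diameter, rotor speed, and wind speed stated in the text. The tip speed ratio is not reported in the source for this turbine and is computed here only as an illustration.

```python
import math

ROTOR_DIAMETER = 19.3     # m, from the turbine description
ROTOR_SPEED_RPM = 55.0    # fixed rotor speed
WIND_SPEED = 10.5         # m/s, constant inflow
N_BLADES = 3

# Angular speed in rad/s and the resulting blade tip speed in m/s.
omega = ROTOR_SPEED_RPM * 2.0 * math.pi / 60.0
tip_speed = omega * ROTOR_DIAMETER / 2.0

# Ratio of tip speed to inflow speed (derived, not reported in the source).
tip_speed_ratio = tip_speed / WIND_SPEED

# Time for one revolution and time between successive tower passages.
revolution_period = 60.0 / ROTOR_SPEED_RPM
tower_passage_interval = revolution_period / N_BLADES
```

With three blades, a blade passes the tower every third of a revolution, that is, every 120° of azimuthal angle.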
Table 1 CX-100 blade with RNodes (m), Chord (m), AeroTwst (°), and Airfoil type
RNodes (m)   Chord (m)   AeroTwst (°)   Airfoil
0.200        0.356       29.6           Cylinder
0.600        0.338       24.8           Cylinder
1.000        0.569       20.8           Cylinder
1.400        0.860       17.5           NREL S821
1.800        1.033       14.7           NREL S821
2.200        0.969       12.4           NREL S821
3.200        0.833       8.3            NREL S821
4.200        0.705       5.8            NREL S819
5.200        0.582       4.0            NREL S819
6.200        0.463       2.7            NREL S819
7.200        0.346       1.4            NREL S819
8.200        0.232       0.4            NREL S819
9.000        0.120       0.0            NREL S820
Fig. 23 Left Five primary sections of the CX-100 blade; Right 32 distinct material zones of the
CX-100 blade
Table 3 Comparison of experimentally measured and computed natural frequencies (in Hz) for
the free case
             Mode 1     Mode 2       Mode 3
Mesh 1       8.28       15.92        19.26
Mesh 2       8.22       15.61        18.21
Mesh 3       8.22       15.6         18.01
Experiment   7.6–8.2    15.7–18.1    20.2–21.3
Mode 1 is the first flapwise mode, Mode 2 is the first edgewise mode, and Mode 3 is the second
flapwise mode
Table 4 Comparison of experimentally measured and computed natural frequencies (in Hz) for
the clamped case
             Mode 1     Mode 2       Mode 3
Mesh 1       4.33       11.82        19.69
Mesh 2       4.29       11.61        19.08
Mesh 3       4.27       11.54        18.98
Experiment   4.35       11.51        20.54
Modes 1–3 are the first three flapwise bending modes
Fig. 24 First (left) and second (right) flapwise bending mode for the clamped case
Figure 25 shows the computational domain and mesh used in this study. The
mesh consists of 5,134,916 linear elements, which are triangular prisms in the rotor
boundary layers and tetrahedra everywhere else in the domain. The mesh is refined in
the rotor and tower regions for better flow resolution near the wind turbine. The size of
the first element in the wall-normal direction is 0.002 m, and 15 layers of prismatic
elements were generated with a growth ratio of 1.2. Figure 25 shows a 2D blade
cross-section at the 70 % spanwise station to illustrate the boundary-layer mesh used in
the computations. The time-step size is set to 3.0 × 10⁻⁵ s. In Fig. 26 the time history
of the aerodynamic torque is plotted. As can be seen from the plot, using FSI, we
capture the high-frequency oscillations caused by the bending and torsional motions
of the blades. In the case of the rigid blade the only high-frequency oscillations in
the torque curve are due to the trailing-edge turbulence. For the rigid blade case the
effect of the tower on the aerodynamic torque is more pronounced, while in the case
of FSI it is not as visible due to the relatively high torque oscillations. The dips in
the aerodynamic torque can be seen at 60°, 180°, and 300° azimuthal angle, which
is precisely when one of the three blades is passing the tower. The computed values
of the aerodynamic torque are plotted together with field test results from [94]. The
upper and lower dashed lines indicate the aerodynamic torque bounds, while the
middle dashed line gives its average value. Both the aerodynamic and FSI results
compare very well with the field test data. Figure 27 shows the relative wind speed
at the 70 % spanwise station rotated to the reference configuration to illustrate the
blade deflection and complexity of boundary-layer turbulent flow. Figure 28 shows
the flow field as the blade passes the tower.
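Returning to the boundary-layer mesh described in this section (first element size of 0.002 m in the wall-normal direction, 15 prismatic layers, growth ratio of 1.2), the layer thicknesses and the total thickness of the prismatic region follow from a geometric progression. The sketch below assumes that the growth ratio means each layer is 1.2 times thicker than the previous one; the function name is illustrative.

```python
def layer_thicknesses(first, ratio, n_layers):
    """Thickness of each layer, assuming geometric growth from the wall."""
    return [first * ratio ** i for i in range(n_layers)]

layers = layer_thicknesses(0.002, 1.2, 15)

# Total wall-normal extent of the prismatic boundary-layer region (m).
total_thickness = sum(layers)

# Closed form of the same geometric sum, as a cross-check.
closed_form = 0.002 * (1.2 ** 15 - 1.0) / (1.2 - 1.0)
```

Under this assumption the outermost layer is roughly 13 times thicker than the first one, and the prismatic region extends about 0.14 m from the blade surface.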
Fig. 25 Left Computational domain and mesh with the refined inner region for better flow resolution
near the rotor; Right 2D blade cross-section at r/R = 70 % and the boundary-layer mesh
Fig. 26 Aerodynamic torque for the FSI and rigid-blade simulations. The experimental range for
the torque and its average are provided for comparison and are plotted using dashed lines
(axes: Time, s, 0–2, with the azimuthal angle in degrees along the top; Aerodynamic Torque, N·m)
5 Concluding Remarks
Fig. 27 Relative wind speed at the 70 % spanwise station for the FSI simulation at t = 0.86 s (left)
and t = 1.06 s (right). The blade deflection is clearly visible
Fig. 28 Wind speed contours at 80 % spanwise station as the blade passes the tower
eling of the rotor-blade structure, and full FSI coupling. Some of these techniques
were included in our overview. The wind-turbine analysis cases presented include the
aerodynamics of wind-turbine rotor and tower and the FSI that accounts for the defor-
mation of the rotor blades. The specific wind turbines considered were NREL 5MW,
NREL Phase VI and Micon 65/13M, all at full scale. In the case of NREL Phase VI
and Micon 65/13M we also presented a successful comparison with the experimental
data. Overall, this article demonstrates that the ALE-VMS and ST-VMS methods,
together with some new supporting techniques, have brought the aerodynamic and
FSI analysis of wind turbines to a new level, where such analyses can contribute
more to simulation-based design and testing.
Acknowledgments We wish to thank the Texas Advanced Computing Center (TACC) and the
San Diego Supercomputing Center (SDSC) for providing HPC resources that have contributed to
the research results reported in this article. The first author acknowledges the support of the NSF
CAREER Award, the NSF Award CBET-1306869, and the Air Force Office of Scientific Research
Award FA9550-12-1-0005. The ST-VMS part of the work was supported by ARO grants W911NF-09-1-0346
and W911NF-12-1-0162 (third author) and the Rice–Waseda Research Agreement (second
author).
References
1. Jonkman JM, Buhl ML (2005) FAST user's guide, Technical Report NREL/EL-500-38230. National Renewable Energy Laboratory, Golden, CO
2. Jonkman J, Butterfield S, Musial W, Scott G (2009) Definition of a 5-MW reference wind turbine for offshore system development, Technical Report NREL/TP-500-38060. National Renewable Energy Laboratory, Golden, CO
3. Sørensen NN, Michelsen JA, Schreck S (2002) Navier-Stokes predictions of the NREL Phase VI rotor in the NASA Ames 80 ft × 120 ft wind tunnel. Wind Energy 5:151–169
4. Pape AL, Lecanu J (2004) 3D Navier-Stokes computations of a stall-regulated wind turbine. Wind Energy 7:309–324
5. Bazilevs Y, Hsu M-C, Akkerman I, Wright S, Takizawa K, Henicke B, Spielman T, Tezduyar TE (2011) 3D simulation of wind turbine rotors at full scale. Part I: geometry modeling and aerodynamics. Int J Numer Meth Fluids 65:207–235. doi:10.1002/fld.2400
6. Takizawa K, Henicke B, Tezduyar TE, Hsu M-C, Bazilevs Y (2011) Stabilized space-time computation of wind-turbine rotor aerodynamics. Comput Mech 48:333–344. doi:10.1007/s00466-011-0589-2
7. Kong C, Bang J, Sugiyama Y (2005) Structural investigation of composite wind turbine blade considering various load cases and fatigue life. Energy 30:2101–2114
8. Hansen MOL, Sørensen JN, Voutsinas S, Sørensen N, Madsen HA (2006) State of the art in wind turbine aerodynamics and aeroelasticity. Prog Aerosp Sci 42:285–330
9. Jensen FM, Falzon BG, Ankersen J, Stang H (2006) Structural testing and numerical simulation of a 34 m composite wind turbine blade. Compos Struct 76:52–61
10. Kiendl J, Bazilevs Y, Hsu M-C, Wüchner R, Bletzinger K-U (2010) The bending strip method for isogeometric analysis of Kirchhoff-Love shell structures comprised of multiple patches. Comput Methods Appl Mech Eng 199:2403–2416
11. Bazilevs Y, Hsu M-C, Kiendl J, Benson DJ (2012) A computational procedure for pre-bending of wind turbine blades. Int J Numer Meth Eng 89:323–336
12. Bazilevs Y, Hsu M-C, Kiendl J, Wüchner R, Bletzinger K-U (2011) 3D simulation of wind turbine rotors at full scale. Part II: fluid-structure interaction modeling with composite blades. Int J Numer Meth Fluids 65:236–253
13. Hughes TJR, Cottrell JA, Bazilevs Y (2005) Isogeometric analysis: CAD, finite elements, NURBS, exact geometry, and mesh refinement. Comput Methods Appl Mech Eng 194:4135–4195
14. Cottrell JA, Reali A, Bazilevs Y, Hughes TJR (2006) Isogeometric analysis of structural vibrations. Comput Methods Appl Mech Eng 195:5257–5297
15. Bazilevs Y, da Veiga LB, Cottrell JA, Hughes TJR, Sangalli G (2006) Isogeometric analysis: approximation, stability and error estimates for h-refined meshes. Math Models Methods Appl Sci 16:1031–1090
16. Cottrell JA, Hughes TJR, Reali A (2007) Studies of refinement and continuity in isogeometric structural analysis. Comput Meth Appl Mech Eng 196:4160–4183
17. Cottrell JA, Hughes TJR, Bazilevs Y (2009) Isogeometric analysis: toward integration of CAD and FEA. Wiley, Chichester
18. Evans JA, Bazilevs Y, Babuška I, Hughes TJR (2009) n-Widths, sup-infs, and optimality ratios for the k-version of the isogeometric finite element method. Comput Methods Appl Mech Eng 198:1726–1741
19. Dörfel MR, Jüttler B, Simeon B (2010) Adaptive isogeometric analysis by local h-refinement with T-splines. Comput Methods Appl Mech Eng 199:264–275
20. Bazilevs Y, Calo VM, Cottrell JA, Evans JA, Hughes TJR, Lipton S, Scott MA, Sederberg TW (2010) Isogeometric analysis using T-splines. Comput Methods Appl Mech Eng 199:229–263
21. Bazilevs Y, Calo VM, Cottrell JA, Hughes TJR, Reali A, Scovazzi G (2007) Variational multiscale residual-based turbulence modeling for large eddy simulation of incompressible flows. Comput Methods Appl Mech Eng 197:173–201
22. Bazilevs Y, Michler C, Calo VM, Hughes TJR (2007) Weak Dirichlet boundary conditions for wall-bounded turbulent flows. Comput Methods Appl Mech Eng 196:4853–4862
23. Bazilevs Y, Michler C, Calo VM, Hughes TJR (2010) Isogeometric variational multiscale modeling of wall-bounded turbulent flows with weakly enforced boundary conditions on unstretched meshes. Comput Methods Appl Mech Eng 199:780–790
24. Akkerman I, Bazilevs Y, Calo VM, Hughes TJR, Hulshoff S (2008) The role of continuity in residual-based variational multiscale modeling of turbulence. Comput Mech 41:371–378
25. Hsu M-C, Bazilevs Y, Calo VM, Tezduyar TE, Hughes TJR (2010) Improving stability of stabilized and multiscale formulations in flow simulations at small time steps. Comput Methods Appl Mech Eng 199:828–840. doi:10.1016/j.cma.2009.06.019
26. Bazilevs Y, Akkerman I (2010) Large eddy simulation of turbulent Taylor-Couette flow using isogeometric analysis and the residual-based variational multiscale method. J Comput Phys 229:3402–3414
27. Elguedj T, Bazilevs Y, Calo VM, Hughes TJR (2008) B-bar and F-bar projection methods for nearly incompressible linear and nonlinear elasticity and plasticity using higher-order NURBS elements. Comput Methods Appl Mech Eng 197:2732–2762
28. Lipton S, Evans JA, Bazilevs Y, Elguedj T, Hughes TJR (2010) Robustness of isogeometric structural discretizations under severe mesh distortion. Comput Methods Appl Mech Eng 199:357–373
29. Benson DJ, Bazilevs Y, De Luycker E, Hsu M-C, Scott M, Hughes TJR, Belytschko T (2010) A generalized finite element formulation for arbitrary basis functions: from isogeometric analysis to XFEM. Int J Numer Meth Eng 83:765–785
30. Benson DJ, Bazilevs Y, Hsu M-C, Hughes TJR (2010) Isogeometric shell analysis: the Reissner-Mindlin shell. Comput Methods Appl Mech Eng 199:276–289
31. Kiendl J, Bletzinger K-U, Linhard J, Wüchner R (2009) Isogeometric shell analysis with Kirchhoff-Love elements. Comput Methods Appl Mech Eng 198:3902–3914
32. Zhang Y, Bazilevs Y, Goswami S, Bajaj C, Hughes TJR (2007) Patient-specific vascular NURBS modeling for isogeometric analysis of blood flow. Comput Methods Appl Mech Eng 196:2943–2959
33. Bazilevs Y, Calo VM, Zhang Y, Hughes TJR (2006) Isogeometric fluid-structure interaction analysis with applications to arterial blood flow. Comput Mech 38:310–322
34. Bazilevs Y, Calo VM, Hughes TJR, Zhang Y (2008) Isogeometric fluid-structure interaction: theory, algorithms, and computations. Comput Mech 43:3–37
35. Isaksen JG, Bazilevs Y, Kvamsdal T, Zhang Y, Kaspersen JH, Waterloo K, Romner B, Ingebrigtsen T (2008) Determination of wall tension in cerebral artery aneurysms by numerical simulation. Stroke 39:3172–3178
36. Bazilevs Y, Hughes TJR (2008) NURBS-based isogeometric analysis for the computation of flows about rotating components. Comput Mech 43:143–150
37. Cirak F, Ortiz M, Schröder P (2000) Subdivision surfaces: a new paradigm for thin shell analysis. Int J Numer Meth Eng 47:2039–2072
38. Cirak F, Ortiz M (2001) Fully C¹-conforming subdivision elements for finite deformation thin shell analysis. Int J Numer Meth Eng 51:813–833
39. Cirak F, Scott MJ, Antonsson EK, Ortiz M, Schröder P (2002) Integrated modeling, finite-element analysis, and engineering design for thin-shell structures using subdivision. Comput Aided Des 34:137–148
41. Hughes TJR (1995) Multiscale phenomena: Greens functions, the Dirichlet-to-Neumann for-
mulation, subgrid scale models, bubbles, and the origins of stabilized methods. Comput Meth-
ods Appl Mech Eng 127:387401
42. Hughes TJR, Oberai AA, Mazzei L (2001) Large eddy simulation of turbulent channel flows
by the variational multiscale method. Phys Fluids 13:17841799
43. Takizawa K, Tezduyar TE (2011) Multiscale space-time fluid-structure interaction techniques.
Comput Mech 48:247267. doi:10.1007/s00466-011-0571-z
384 Y. Bazilevs et al.
44. Takizawa K, Tezduyar TE (2012) Space-time fluid-structure interaction methods. Math Models
Methods Appl Sci 22:1230001. doi:10.1142/S0218202512300013
45. Tezduyar TE (1992) Stabilized finite element formulations for incompressible flow computa-
tions. Adv Appl Mech 28:144. doi:10.1016/S0065-2156(08)70153-4
46. Tezduyar TE, Behr M, Liou J (1992) A new strategy for finite element computations involving
moving boundaries and interfacesthe deforming-spatial-domain/space-time procedure: I.
The concept and the preliminary numerical tests. Comput Methods Appl Mech Eng 94:339
351. doi:10.1016/0045-7825(92)90059-S
47. Tezduyar TE, Behr M, Mittal S, Liou J (1992) A new strategy for finite element computa-
tions involving moving boundaries and interfacesthe deforming-spatial-domain/space-time
procedure: II. Computation of free-surface flows, two-liquid flows, and flows with drifting
cylinders. Comput Methods Appl Mech Eng 94:353371. doi:10.1016/0045-7825(92)90060-
W
48. Tezduyar TE (2003) Computation of moving boundaries and interfaces and stabilization para-
meters. Int J Numer Meth Fluids 43:555575. doi:10.1002/fld.505
49. Tezduyar TE, Sathe S (2007) Modeling of fluid-structure interactions with the space-time finite
elements: solution techniques. Int J Numer Meth Fluids 54:855900. doi:10.1002/fld.1430
50. Bazilevs Y, Takizawa K, Tezduyar TE (2013) Computational fluid-structure interaction: meth-
ods and applications. Wiley, Chichester, West Sussex, United Kingdom
51. Bazilevs Y, Hughes TJR (2007) Weak imposition of Dirichlet boundary conditions in fluid
mechanics. Comput Fluids 36:1226
52. Nitsche J (1971) Uber ein variationsprinzip zur losung von Dirichlet-problemen bei verwen-
dung von teilraumen, die keinen randbedingungen unterworfen sind. Abh Math Univ Hamburg
36:915
53. Arnold DN, Brezzi F, Cockburn B, Marini LD (2002) Unified analysis of discontinuous
Galerkin methods for elliptic problems. SIAM J Numer Anal 39:17491779
54. Brooks AN, Hughes TJR (1982) Streamline upwind/Petrov-Galerkin formulations for convec-
tion dominated flows with particular emphasis on the incompressible Navier-Stokes equations.
Comput Methods Appl Mech Eng 32:199259
55. Tezduyar TE, Mittal S, Ray SE, Shih R (1992) Incompressible flow computations with stabi-
lized bilinear and linear equal-order-interpolation velocity-pressure elements. Comput Methods
Appl Mech Eng 95:221242. doi:10.1016/0045-7825(92)90141-6
56. Mittal S, Tezduyar TE (1992) A finite element study of incompressible flows past oscillating
cylinders and aerofoils. Int J Numer Meth Fluids 15:10731118. doi:10.1002/fld.1650150911
57. Mittal S, Tezduyar TE (1995) Parallel finite element simulation of 3d incompressible flows
fluid-structure interactions. Int J Numer Meth Fluids 21:933953. doi:10.1002/fld.1650211011
58. Kalro V, Tezduyar TE (2000) A parallel 3D computational method for fluid-structure inter-
actions in parachute systems. Comput Methods Appl Mech Eng 190:321332. doi:10.1016/
S0045-7825(00)00204-8
59. Tezduyar TE, Sathe S, Keedy R, Stein K (2006) Space-time finite element techniques for
computation of fluid-structure interactions. Comput Methods Appl Mech Eng 195:20022027.
doi:10.1016/j.cma.2004.09.014
60. Takizawa K, Tezduyar TE (2012) Computational methods for parachute fluid-structure inter-
actions. Arch Comput Meth Eng 19:125169. doi:10.1007/s11831-012-9070-4
61. Tezduyar TE, Takizawa K, Brummer T, Chen PR (2011) Space-time fluid-structure interaction
modeling of patient-specific cerebral aneurysms. Int J Numer Meth Biomed Eng 27:16651710.
doi:10.1002/cnm.1433
62. Takizawa K, Bazilevs Y, Tezduyar TE (2012) Space-time and ALE-VMS techniques for patient-
specific cardiovascular fluid-structure interaction modeling. Arch Comput Meth Eng 19:171
225. doi:10.1007/s11831-012-9071-3
63. Takizawa K, Schjodt K, Puntel A, Kostov N, Tezduyar TE (2012) Patient-specific computer
modeling of blood flow in cerebral arteries with aneurysm and stent. Comput Mech 50:675686.
doi:10.1007/s00466-012-0760-4
Computational Wind-Turbine Analysis 385
83. Tezduyar TE (2001) Finite element methods for flow problems with moving boundaries and
interfaces. Arch Comput Meth Eng 8:83130. doi:10.1007/BF02897870
84. Hsu M-C, Bazilevs Y (2012) Fluid-structure interaction modeling of wind turbines: simulating
the full machine. Comput Mech 50:821833
85. Hsu M-C, Akkerman I, Bazilevs Y (2013) Finite element simulation of wind turbine aerody-
namics: validation study using NREL Phase VI experiment. Wind Energy, published online.
doi:10.1002/we.1599
86. Hand MM, Simms DA, Fingersh LJ, Jager DW, Cotrell JR, Schreck S, Larwood SM (2001)
Unsteady aerodynamics experiment Phase VI: wind tunnel test configurations and available
data campaigns, Technical Report NREL/TP-500-29955. National Renewable Energy Labo-
ratory, Golden, CO
87. Korobenko A, Hsu M-C, Akkerman I, Tippmann J, Bazilevs Y (2013) Structural mechanics
modeling and fsi simulation of wind turbines. Math Models and Methods Appl Sci 23:249272
88. Bazilevs Y, Hsu M-C, Scott MA (2012) Isogeometric fluid-structure interaction analysis with
emphasis on non-matching discretizations, and with application to wind turbines. Comput Meth
Appl Mech Eng 249252:2841
89. Tezduyar TE, Sathe S, Stein K (2006) Solution techniques for the fully-discretized equations
in computation of fluid-structure interactions with the space-time formulations. Comput Meth
Appl Mech Eng 195:57435753. doi:10.1016/j.cma.2005.08.023
90. Jonkman J, Butterfield S, Musial W, Scott G (2009) Definition of a 5-MW reference wind
turbine for offshore system development, Technical Report NREL/TP-500-38060, National
Renewable Energy Laboratory
91. Spera DA (1994) Introduction to modern wind turbines. In: Spera DA (ed) Wind turbine
technology: fundamental concepts of wind turbine engineering, pp 4772 (ASME Press)
92. Saad Y, Schultz M (1986) GMRES: a generalized minimal residual algorithm for solving
nonsymmetric linear systems. SIAM J Sci Stat Comput 7:856869
93. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular
graphs. SIAM J Sci Comput 20:359392
94. Zayas JR, Johnson WD (2008) 3X-100 blade field test, Report of the Sandia National Labora-
tories. Wind Energy Technology Department, Sandia
95. Sutherland JH, Jones PL, Neal BA (2001) The long-term inflow and structural test program.
In: Proceedings of the 2001 ASME wind energy symposium, p 162
96. Berry D, Ashwill T (2007) Design of 9-meter carbon-fiberglass prototype blades: CX-100 and
TX-100, Report of the Sandia national laboratories, New Mexico, USA
97. White JR, Adams DE, Rumsey MA (2011) Modal analysis of CX-100 rotor blade and Micon
65/13 wind turbine. In: Structural dynamics and renewable energy, vol 1, Conference proceed-
ings of the society for experimental mechanics series 10
98. Marinone T, LeBlanc B, Harvie J, Niezrecki C, Avitabile P (2012) Modal testing of a 9 m
CX-100 turbine blade. In: Topics in experimental dynamics substructuring and wind turbine
dynamics, vol 2, Conference proceedings of the society for experimental mechanics series 27
Part VII
Partitioned Method and Parallelization
Techniques
Scaling Up Multiphysics
R. Löhner (✉)
Center for Computational Fluid Dynamics, George Mason University,
M.S. 6A2, Fairfax, VA 22030-4444, USA
e-mail: [email protected]
J. D. Baum
Advanced Technology Group, SAIC, McLean, VA 22020, USA
e-mail: [email protected]
understand in depth the phenomena observed). The same trend can be observed in all
the manufacturing industries, and increasingly in the medical and life sciences. Any
car, truck, train, ship, airplane, large building, bridge, skyscraper, computer, medical
device or, for that matter, consumer product of value will see considerable calculation
based design and optimization during its development. Computational mechanics is
also increasingly being used not only in the design, optimization and verification of
finished products, but also for the optimization of manufacturing processes.
The current trend in each of the fields that comprise the computational sciences
(structural, thermo- and fluid dynamics, electromagnetics, material science, chemistry,
etc.) is to increase physical fidelity (either by considering more scales or by linking or
coupling different disciplines), improve accuracy and robustness, and push the range
of feasible (credible) problem classes. Furthermore, high-end computing of this kind
is increasingly being used to provide databases for fast-running engineering models.
This new area, also known as real-time computing, either interpolates from these
detailed databases or extracts fundamental modes of the system to obtain a reduced
order model (ROM).
Increased physical fidelity and accuracy in almost all fields in applied sciences
(i.e. physics, chemistry, biology, medicine, etc.) and engineering (civil, mechanical,
aerospace, naval, electrical, telecom, etc.) naturally imply large computing require-
ments.
[Figure: FEM-FCT timings versus Number of Elements/Core (0 to 700,000). First panel, legend fits: FEM-FCT Diamond_0: 0.09376 + 1.163e-6*nelem; FEM-FCT Diamond_1: 0.07575 + 0.957e-6*nelem. Second panel (relative), legend fits: FEM-FCT Jaguar: 0.25680/nelem + 2.833; FEM-FCT Diamond_0: 0.09376/nelem + 1.163; FEM-FCT Diamond_1: 0.07575/nelem + 0.9575.]
After an extensive analysis [29] it was found that the communication times between
processors were the reason for this poor performance. The sobering conclusion of
this analysis is that, at present, the limiting speed of CFD solvers is set by the MPI
communication network. Moreover, compared to raw processor speed and RAM transfer
rates, the speed of the MPI communication network is advancing at a slower
pace. If these trends continue, the useful mesh size per processor will increase,
not decrease (!). While these conclusions have been stated before, we predict that as
more users get access to hundreds of thousands of cores and the domain size per core
starts shrinking, they will also encounter the "minimum timestep barrier" reported
here.
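The effect of a fixed cost per timestep can be illustrated with the linear fit quoted in the figure legend for FEM-FCT on Diamond_0. The sketch below is only an illustration: it assumes that the fit gives the time per step in seconds, and that the constant term can be read as a cost that does not shrink with the domain size per core. The function names are hypothetical.

```python
# Assumed model: time per step on one core, T(nelem) = a + b * nelem.
# The coefficients are the "FEM-FCT Diamond_0" fit from the figure legend;
# interpreting T in seconds is an assumption, not stated in the text.
A_FIXED = 0.09376      # part of the step time that does not depend on nelem
B_PER_ELEM = 1.163e-6  # incremental cost per element


def step_time(nelem: int) -> float:
    """Fitted time per step for nelem elements per core."""
    return A_FIXED + B_PER_ELEM * nelem


def fixed_cost_fraction(nelem: int) -> float:
    """Fraction of the step time taken by the nelem-independent part."""
    return A_FIXED / step_time(nelem)


def break_even_elements() -> float:
    """Elements per core at which fixed and per-element costs are equal."""
    return A_FIXED / B_PER_ELEM


if __name__ == "__main__":
    for nelem in (10_000, 100_000, 700_000):
        print(nelem, round(step_time(nelem), 4), round(fixed_cost_fraction(nelem), 3))
    print("break-even elements/core:", round(break_even_elements()))
```

Under these assumptions the fixed part dominates once the domain shrinks below roughly 80,000 elements per core, which is one way of reading the statement that the useful mesh size per processor cannot be reduced indefinitely.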
Let us ponder the consequences of this barrier for the particular case of computational
fluid dynamics (similar consequences can be drawn for other fields). The current
trend here is to migrate from the quasi-steady Reynolds-Averaged Navier-Stokes
(RANS) description to the Large-Eddy Simulation (LES) of flows. This implies an
increase in mesh size of 3–6 orders of magnitude, and an increase in the number of
timesteps to achieve convergence/statistical data of 3–4 orders of magnitude. The
increase in mesh size may be easily absorbed by machines with millions of cores.
But the requirement of $O(10^7$–$10^8)$ timesteps, coupled with a minimum timestep
barrier of (optimistically) $T_{MPI} = O(0.01)$ [sec] means that no matter how large the
machine, an LES run will take $T_{LES} = O(10^5$–$10^6)$ [sec], i.e. two weeks under the
best of circumstances. Such long running times will clearly hinder the useful range
of applicability.
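The estimate above is a simple product of the number of timesteps and the minimum time per step. A minimal sketch of the arithmetic follows; the function name is hypothetical and the default of 0.01 sec is the optimistic value quoted in the text.

```python
SECONDS_PER_DAY = 86_400.0


def les_wallclock_seconds(n_steps: float, t_min_step: float = 0.01) -> float:
    """Lower bound on the run time: every step costs at least t_min_step."""
    return n_steps * t_min_step


if __name__ == "__main__":
    for n_steps in (1e7, 1e8):
        t = les_wallclock_seconds(n_steps)
        print(f"{n_steps:.0e} steps: {t:.1e} sec = {t / SECONDS_PER_DAY:.1f} days")
```

Note that the bound does not depend on the number of cores, which is the point of the argument: adding processors does not shorten the run once the minimum timestep has been reached.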
Any scientific calculation based on the solution of partial differential equations fol-
lows the so-called simulation pipeline: domain definition, imposition of boundary
conditions, mesh generation, solver, possible mesh adaptation, post-processing of
results. To date, only the solvers have been migrated to systems with thousands (or,
in some cases, millions) of cores. The current workflow considers a scalar, large
memory machine with powerful graphics for the domain definition (CAD) and the
imposition of boundary conditions. The input data is then used to generate a mesh
on a large, shared-memory multicore machine (ncore < 64). The splitting of the
domain into pieces is also accomplished on such machines. The domain files are
then transferred to the massively parallel machine (ncore > 10,000) and the solver
is run. Finally, the results are assembled and post-processed on large-memory, scalar
machines that are of the same class as those used for pre-processing. This "scalar-pre,
simple parallel solve, scalar-post" environment will no longer be possible once
machines with more than $10^6$ cores become widespread. The grid size alone will
force parallelization of meshing. The same will happen to post-processing, as well
as domain splitting and load balancing.
In the sequel, we show ways of obtaining parallel grid generation and dynamic
load balancing for multiphysics.
As stated before, many solvers have been ported to distributed parallel machines
while grid generators have, in general, lagged behind. One can cite several reasons
for this:
(a) For many applications the CPU requirements of grid generation are orders of
magnitude less than those of field solvers, i.e. it does not matter if the user has
to wait several hours for a grid;
(b) (Scalar) grid generators have achieved a high degree of maturity, generality and
widespread use, leading to the usual inertia of workflow (modus operandi) and
aversion to change;
(c) In recent years, low-cost machines with few cores but very large memories have
enabled the generation of large grids with existing (scalar) software; and
(d) In many cases it is possible to generate a mesh that is twice as coarse (i.e. with
$2^d$ times fewer elements in $d$ dimensions) as the one desired for the simulation.
This coarse mesh is then h-refined globally.
Parallel unstructured grid generation has been pursued since the early 1990s
[1, 3–8, 10, 13, 14, 16, 18, 24, 27, 36, 37, 42, 43, 47, 53].
The two most common ways of generating unstructured grids are the Advancing
Front Technique (AFT) [12, 17, 22–24, 32, 38–41] and the Generalized Delaunay
Triangulation (GDT) [2–4, 7, 15, 34, 47, 50, 51]. The AFT introduces one element
at a time, while the GDT introduces a new point at a time. Thus, both of these
techniques are, in principle, scalar by nature, with a large variation in the number
of operations required to introduce a new element or point. While coding and data
structures may influence the scalar speed of the core AFT or GDT, one often finds
that for large-scale applications, the evaluation of the desired element size and shape
in space, given by background grids, sources or other means [31], consumes the
largest fraction of the total grid generation time. Furthermore, the time required for
mesh improvements (and any unstructured grid generator needs them) is in many
cases higher than that of the core AFT or GDT modules. Typical speeds for the complete
generation of a mesh (surface, mesh, improvement) on current Intel Xeon chips with
3.2 GHz and sufficient memory are of the order of 0.5–2.0 Mels/min. Therefore,
it would take approximately 2,000 minutes (i.e. 1.5 days) to generate a mesh of
$10^9$ elements, a common size in computational fluid dynamics and computational
electromagnetics. Assuming perfect parallelization, this task could be performed in
the order of a minute on 2,000 processors, clearly showing the need for parallel mesh
generation.
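The figures quoted above follow from dividing the mesh size by the generation speed. The short sketch below reproduces them; the function name is hypothetical, and perfect parallel scaling is assumed exactly as in the text.

```python
def meshing_minutes(n_elements: float, mels_per_min: float, n_procs: int = 1) -> float:
    """Wall-clock minutes to generate n_elements, assuming perfect scaling.

    mels_per_min is the scalar generation speed in millions of elements
    per minute (0.5 to 2.0 according to the text).
    """
    return n_elements / (mels_per_min * 1.0e6) / n_procs


if __name__ == "__main__":
    for speed in (0.5, 2.0):
        scalar = meshing_minutes(1.0e9, speed)
        parallel = meshing_minutes(1.0e9, speed, n_procs=2000)
        print(f"{speed} Mels/min: scalar {scalar:.0f} min, 2,000 procs {parallel:.2f} min")
```

The 2,000 minutes quoted correspond to the lower end of the speed range; at 2.0 Mels/min the scalar time would be a quarter of that.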
The easiest form of achieving volume-based parallelism is by using a grid to
define the regions to be meshed by each processor. Optimally, this domain-defining
grid (DDG) should have the same surface triangulation as the desired, fine mesh, but
could be significantly coarser in the interior. In this way, the definition of the domain
to be gridded is unique, something that is notoriously difficult to achieve by other
means (such as background grids, bins or octrees). This domain-defining grid is then
partitioned according to the estimated number of elements to be generated, allowing
for a balanced distribution of work among the processors. The domain defining grid
is also used to redistribute the elements and points after grid generation, and during
the subsequent mesh improvement. A parallel grid generator based on these ideas
has been developed over the last 3 years [26]. Figure 4a considers a typical example,
taken from a blast simulation carried out for an office complex. Figures 4b–d show
the trace of the domain-defining grid partition on the surface as well as the fronts
after the parallel grid generation passes using 64 domains (MPI processors) for a finer
mesh. Table 1 gives a compilation of timings for different mesh sizes, domains and
processors on different machines.
One may note that:
(a) Generating the 121 M mesh on one 8-core shared memory node (i.e. nproc = 1,
nprol = 8) is slower than the distributed memory equivalent (i.e. nproc = 8,
nprol = 1);
(b) The number of elements per core should exceed a minimum value (typically
of the order of 2–4 Mels) in order to reach a generation speed per core that is
acceptable;
(c) The local OMP scaling improves as the number of elements in each domain is
increased;
(d) It only takes on the order of five minutes to generate a mesh of 121 Mels on
256 cores (nproc = 32, nprol = 8), and on the order of 40 minutes to generate
a mesh of 1,010 Mels on 512 cores (nproc = 64, nprol = 8).
These timings, and those of other large cases that required parallel mesh generation,
show that the proposed approach is scalable and able to produce large grids of high
quality in a modest amount of clocktime. With parallel grid generation entering
production, a major impediment to a completely scalable simulation pipeline (grid
generation, solvers, post-processing) has been removed, opening the way for truly
large-scale computations using unstructured, body-fitted grids.
For field solvers, which are commonly used for computational fluid and solid mechan-
ics as well as electromagnetics, the classic way to distribute work among many
distributed memory processors is via domain decomposition. Given that the work
requirements are proportional to the number of elements/points in a domain, the
aim is to achieve subdomains of equal size while minimizing communication. As
the communication between processors is proportional to the surface points of each
subdomain, the aim is to minimize surface-to-volume ratios, which is achieved by
keeping the domains as contiguous (non-split) and spherical as possible. Tech-
niques commonly used for domain splitting include the advancing front methods,
coordinate- and moment- recursive bisection, and space-filling curve subdivisions
[11, 19, 20, 33, 35, 44, 48, 52].
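As an illustration of one of the techniques listed, the following is a minimal sketch of coordinate-based recursive bisection applied to a set of points (e.g. element centroids). It is a generic textbook variant and not the implementation used by the authors; the function name and the choice of cutting along the direction of largest extent are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def recursive_bisection(points: Sequence[Point], ids: List[int],
                        n_parts: int) -> List[List[int]]:
    """Split the point ids into n_parts groups of near-equal size."""
    if not ids:
        return [[] for _ in range(n_parts)]
    if n_parts == 1:
        return [ids]
    # Cut along the coordinate direction with the largest extent, which
    # tends to keep the pieces compact (low surface-to-volume ratio).
    extents = []
    for axis in range(3):
        coords = [points[i][axis] for i in ids]
        extents.append(max(coords) - min(coords))
    axis = extents.index(max(extents))
    ordered = sorted(ids, key=lambda i: points[i][axis])
    # Each side receives points in proportion to its share of the parts.
    n_left = n_parts // 2
    cut = len(ordered) * n_left // n_parts
    return (recursive_bisection(points, ordered[:cut], n_left)
            + recursive_bisection(points, ordered[cut:], n_parts - n_left))


if __name__ == "__main__":
    # Points along a long tube: the cuts come out as slices along x.
    tube = [(float(i), 0.1 * (i % 3), 0.0) for i in range(100)]
    for part in recursive_bisection(tube, list(range(100)), 4):
        print(len(part), min(part), max(part))
```

Balancing here is by point count only; weighting each point by its estimated work would be the natural extension when the work per element is not uniform.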
Fig. 4 Garage: a outline of geometry, b–d internal surface of DDG partition and remaining front
after each pass
Table 1 Garage
Machine   nproc  nprol  ncore  nelem    CPU [sec]  AbsSpeed [els/sec]  RelSpeed [els/sec/core]
Xeon(1)       1      8      8   120 M       2,293              52,333                    6,542
SGI ITL       8      1      8   121 M       1,605              75,389                    9,423
SGI ITL       8      8     64   121 M         516             234,496                    3,664
Cry AMD       8      1      8   121 M       2,512              48,169                    6,021
Cry AMD      16      1     16   121 M       1,954              61,924                    3,870
Cry AMD      32      1     32   121 M       1,118             100,082                    3,128
SGI ITL      16      1     16   121 M       1,048             115,458                    7,216
SGI ITL      16      2     32   121 M         667             181,409                    5,669
SGI ITL      16      4     64   121 M         407             297,297                    4,645
SGI ITL      16      8    128   121 M         329             367,781                    2,873
SGI ITL      32      1     32   121 M         646             187,306                    5,853
SGI ITL      32      2     64   121 M         427             283,372                    4,427
SGI ITL      32      4    128   121 M         346             349,710                    2,732
SGI ITL      32      8    256   121 M         316             383,030                    1,496
Cry AMD      64      1     64   972 M       6,048             160,714                    2,511
SGI ITL      64      8    512  1010 M       2,504             403,354                      788
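The two speed columns of Table 1 are related to the other columns in a simple way: the absolute speed is the number of elements divided by the CPU time, and the relative speed is the absolute speed divided by the number of cores. The sketch below checks this for two rows; the function name is hypothetical, and since nelem is rounded in the table the computed values agree with the tabulated ones only approximately.

```python
from typing import Tuple


def speeds(nelem: float, cpu_sec: float, ncore: int) -> Tuple[float, float]:
    """Return (AbsSpeed [els/sec], RelSpeed [els/sec/core])."""
    abs_speed = nelem / cpu_sec
    return abs_speed, abs_speed / ncore


if __name__ == "__main__":
    # (machine, ncore, nelem, CPU [sec]) taken from Table 1
    rows = [
        ("Xeon(1), nproc=1, nprol=8", 8, 120.0e6, 2293.0),
        ("SGI ITL, nproc=8, nprol=1", 8, 121.0e6, 1605.0),
    ]
    for name, ncore, nelem, cpu in rows:
        abs_speed, rel_speed = speeds(nelem, cpu, ncore)
        print(f"{name}: {abs_speed:,.0f} els/sec, {rel_speed:,.0f} els/sec/core")
```

The relative speed is the quantity to watch when judging scalability: in Table 1 it drops as the number of elements per core decreases.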
With the possibility of computing larger problems, the desire to compute physics
of ever increasing complexity also emerged. Some current flow applications include
a traditional (i.e. unreacting) flow solver, chemical reactions, moving embedded or
immersed bodies, particles, etc. A timestep for such an application may proceed as
follows:
- Identify the (new) position of embedded/immersed bodies, obtaining the new
  boundary conditions/geometric parameters required;
- Advance the chemical reactions one timestep, obtaining the source-terms for the
  flow solver;
- Update the particles one timestep, obtaining the source-terms for the flow solver;
- Advance the flowfield one timestep.
The order of these operations is not mandatory and will vary among field solvers.
What is important, though, is that at the end of each of these steps a synchronization
among processors is required: the calculation cannot proceed until all processors have
completed each step in turn. Therefore, for optimal performance the load should be
balanced in each step.
This inherent requirement of all multiphysics solvers implies that, compared to
simple field solvers (e.g. just flow), the potential for load imbalances and suboptimal
performance increases substantially.
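The consequence of synchronizing after every step can be made concrete with a small cost model: with a barrier after each phase, the time per timestep is the sum over the phases of the slowest processor in that phase. The sketch below is an illustration under that assumption; the function names and the numbers are hypothetical.

```python
from typing import Sequence


def step_time(work: Sequence[Sequence[float]]) -> float:
    """Time per timestep when every phase ends with a synchronization.

    work[i][k] is the time processor k spends in phase i.
    """
    return sum(max(phase) for phase in work)


def ideal_step_time(work: Sequence[Sequence[float]]) -> float:
    """Time per timestep if every phase were perfectly balanced."""
    n_proc = len(work[0])
    return sum(sum(phase) for phase in work) / n_proc


if __name__ == "__main__":
    # Two processors; phases: flow, chemistry, particles (made-up numbers).
    work = [[4.0, 4.0],   # flow solver: balanced
            [0.0, 2.0],   # chemistry: only on processor 1
            [2.0, 0.0]]   # particles: only on processor 0
    print("total work per processor:", [sum(p[k] for p in work) for k in range(2)])
    print("actual step time:", step_time(work))
    print("ideal step time:", ideal_step_time(work))
```

In this example both processors carry the same total work, yet the step takes a third longer than the ideal because the chemistry and particle phases are each unbalanced. This is why the load has to be balanced in each step and not only in total.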
In the sequel, we describe one possible way to achieve near-optimal load balance
for multiphysics solvers. The key idea is to subdivide the global domain into more
subdomains than processors. These oversampled or oversplit subdomains are then
grouped together in an optimal way so as to achieve the best load balance possible.
    - Mark: $U_j = k$
    - Exit (Goto H5)
  Endif
H8 If: items remain in the heap list: (Goto H5)
H9 If there are unassigned sub-subdomains ($U_j = 0$):
  - Increase the allowed average work: $A_i$
  - Allocate the remaining sub-subdomains (Goto H3)
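The overall idea of the steps above (retrieve sub-subdomains from a heap, assign each to a processor whose accumulated work stays below the allowed work in every phase, and relax the allowed work if pieces remain unassigned) can be sketched as follows. This is a simplified illustration and not the authors' implementation: the heap ordering (largest total work first), the relaxation factor `slack`, the use of `<=` in the acceptance test and all names are assumptions.

```python
import heapq
from typing import List, Sequence


def assign_sub_subdomains(work: Sequence[Sequence[float]], n_proc: int,
                          slack: float = 1.05) -> List[int]:
    """Group oversplit pieces onto processors, balancing every phase.

    work[j][i] is the work of sub-subdomain j in phase i.
    Returns owner[j], the processor (0..n_proc-1) assigned to piece j.
    """
    n_sub, n_phase = len(work), len(work[0])
    # Allowed work per processor and phase, starting at the average.
    allowed = [sum(w[i] for w in work) / n_proc for i in range(n_phase)]
    assigned = [[0.0] * n_phase for _ in range(n_proc)]
    owner = [-1] * n_sub  # -1 plays the role of "unassigned"
    while True:
        # Heap of unassigned pieces, largest total work first (assumed ordering).
        heap = [(-sum(work[j]), j) for j in range(n_sub) if owner[j] < 0]
        if not heap:
            return owner
        heapq.heapify(heap)
        while heap:
            _, j = heapq.heappop(heap)
            for k in range(n_proc):
                # Accept k only if no phase would exceed the allowed work.
                if all(assigned[k][i] + work[j][i] <= allowed[i]
                       for i in range(n_phase)):
                    for i in range(n_phase):
                        assigned[k][i] += work[j][i]
                    owner[j] = k
                    break
        # Pieces were left over: increase the allowed work and try again.
        allowed = [a * slack for a in allowed]


if __name__ == "__main__":
    # Six pieces, two phases (e.g. flow and particles), two processors.
    work = [[4, 0], [4, 0], [0, 2], [0, 2], [2, 1], [2, 1]]
    print(assign_sub_subdomains(work, n_proc=2))
```

The sketch takes the first processor that satisfies the work constraint, so it balances the load but ignores where the pieces lie in space.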
While this basic technique will balance the work properly, it does not attempt to
minimize the resulting surface-to-volume ratios. This means that many unconnected
sub-subdomains may end up belonging to a subdomain/processor. One can alleviate
this shortcoming by improving the selection criterion for the allocation of sub-subdomains
to domains in Step H7 above. The key idea is to favour those subdomains
$k$ that satisfy the condition $A_i > V_{ik} + W_i^j$, $i = 1, n_s$ and are as close as possible
to (preferably overlap) the sub-subdomain $j$ being retrieved from the heap-list. This
may be implemented as follows:
- H70: Initialize closest/best/optimal subdomain and overlap distance: $k_{opt} = 0$,
  $d_{opt} = 0$
- H71: Initialize uninitialized domain marker: $k_0 = N_p + 1$
- H72: Loop over the desired subdomains $k = 1, N_p$
  If: $V_{ik} = 0$ (the subdomain has not been initialized):
    - Set: $k_0 = \min(k_0, k)$
  Else:
    - If: $A_i > V_{ik} + W_i^j$, $i = 1, n_s$:
      - Compare overlap zone (update $k_{opt}$, $d_{opt}$)
    - Endif
  Endif
- H73: If: $k_{opt} > 0$ (i.e. a best subdomain has been found):
  - Add: $V_{ik} = V_{ik} + W_i^j$
  - Mark: $U_j = k$
This example considers a relatively long tube where a detonation occurs. As the
blast wave reaches the walls of the tube, particles are introduced into the flowfield.
The number of elements is of $O(5 \times 10^7)$, while the number of particles eventually
reaches $O(2 \times 10^6)$. The problem was run with 16 distributed memory (MPI)
processes/domains, and 8 shared memory cores (OpenMP) per domain, i.e. a total of
256 cores. For the present purpose, the problem was run for 1,000 steps, with a re-split
using the method described above every 100 steps. The splitting obtained at the
end of the 1,000 steps may be discerned in Fig. 5a–c. Note that there are many more
domains than MPI processes/domains, but that, as expected, the basic moment-based
recursive bisection has still produced slices along the tube. The breakdown of times
is approximately as follows: flow solver 60 %, particle update 30 %, repartitioning
and renumbering 10 %. This implies that the parallel repartitioning does not lead to
an excessive increase in CPU requirements while allowing for a much better load
balance.
The present paper has summarized trends in supercomputing and the consequences
they will have on coupled problems in computational mechanics. In particular, the
red-shift observed in the performance increase of hardware subcomponents implies
that heavy emphasis should be placed on field solvers that minimize the access to
memory per timestep update.
The trend towards parallel machines with more than $10^6$ cores implies that the
prevalent "scalar-pre, simple parallel solve, scalar-post" environment will have to
give way to a completely scalable simulation pipeline. The grid size alone will force
distributed parallelization of meshing, domain splitting, load balancing and
post-processing.
Possible ways of addressing parallel meshing and dynamic load balancing for
multiphysics were treated. The examples shown indicate that these approaches have
the potential to be core building blocks of a completely scalable simulation pipeline.
Much remains to be done in this field. It involves a lot of coding and debugging. It
is perhaps not as refined as the development of high order methods or new turbulence
models. But without it, further advances in coupled, large-scale problems will not
occur.
Acknowledgments Over the course of the last 25 years CIMNE has become an internationally
recognized center where, at any given time, one can find students, post-docs, research scientists,
professors and visitors from all over the world working on many topics. For the last 20 years, my
family and I (RL) have had the immense fortune of being able to visit CIMNE and Barcelona
every summer. I have always reserved the scientific topics that time did not permit me to consider
during the academic year for those wonderful weeks at CIMNE, which I consider among the most
productive and happy of my life. Therefore, I thank all the many members of CIMNE whom I had
the privilege to work with, and in particular Prof. Oñate for his friendship, contagious enthusiasm
and vision.
References
7. Chrisochoides N, Nave D (2003) Parallel Delaunay mesh generation kernel. Int J Num Meth
Eng 58:161–176
8. de Cougny HL, Shephard MS, Ozturan C (1994) Parallel three-dimensional mesh generation.
Comput Syst Eng 5:311–323
9. de Cougny H, Shephard M (1999) Parallel volume meshing using face removals and hierarchical
repartitioning. Comp Meth Appl Mech Eng 174(3–4):275–298
10. de Cougny HL, Shephard MS, Ozturan C (1995) Parallel three-dimensional mesh generation
on distributed memory MIMD computers. Tech Rep SCOREC Rep # 7, Rensselaer Polytechnic
Institute
11. Flower J, Otto S, Salama M (1990) Optimal mapping of irregular finite element domains to
parallel processors. 239–250
12. Frykestig J (1994) Advancing front mesh generation techniques with application to the finite
element method. Pub. 94:10, Chalmers University of Technology, Göteborg
13. Galtier J, George PL (1997) Prepartitioning as a way to mesh subdomains in parallel. In: Special
symposium on trends in unstructured mesh generation (ASME/ASCE/SES), pp 107–122
14. George PL (1999) Tet meshing: construction, optimization and adaptation. In: Proceedings of
8th international meshing roundtable, South Lake Tahoe, October 1999
15. George PL, Hecht F, Saltel E (1991) Automatic mesh generator with specified boundary. Comp
Meth Appl Mech Eng 92:269–288
16. Ivanov EG, Andrae H, Kudryavtsev AN (2006) Domain decomposition approach for automatic
parallel generation of tetrahedral grids. Int Math J Comp Meth App Math 6(2):178–193
17. Jin H, Tanner RI (1993) Generation of unstructured tetrahedral meshes by the advancing front
technique. Int J Num Meth Eng 36:1805–1823
18. Kadow C, Walkington N (2003) Design of a projection-based parallel Delaunay mesh generation
and refinement algorithm. In: Proceedings of fourth symposium on trends in unstructured
mesh generation, Albuquerque, 2003
19. Karypis G, Kumar V (1999) Parallel multilevel k-way partitioning scheme for irregular graphs.
SIAM Rev 41(2):278–300
20. Karypis G, Kumar V (1998) A parallel algorithm for multilevel graph partitioning and sparse
matrix ordering. J Parallel Distrib Comput 48:71–85
21. Ko S-H, Kim N, Kim J, Thota A, Jha S (2010) Efficient runtime environment for coupled
multi-physics simulations: dynamic resource allocation and load-balancing. In: Proceedings of 10th
IEEE/ACM international conference on cluster, cloud and grid computing (CCGrid), 17–20
May 2010
22. Löhner R (1988) Some useful data structures for the generation of unstructured grids. Comm
Appl Num Meth 4:123–135
23. Löhner R (1996) Extensions and improvements of the advancing front grid generation
technique. Comm Num Meth Eng 12:683–702
24. Löhner R (2001) A parallel advancing front grid generation scheme. Int J Num Meth Eng
51:663–678
25. Löhner R (2008) Applied CFD techniques, 2nd edn. Wiley, Chichester
26. Löhner R (2013) A 2nd generation parallel advancing front grid generator. AIAA-2013-0147
27. Löhner R, Camberos J, Merriam M (1992) Parallel unstructured grid generation. Comp Meth
Appl Mech Eng 95:343–357
28. Löhner R, Baum JD (2012) 40 years of FCT: status and directions. In: Kuzmin D, Löhner R,
Turek S (eds) Flux-corrected transport, 2nd edn. Springer, New York, pp 119–143
29. Löhner R, Baum JD (2014) On maximum achievable speeds for field solvers. Int J Num Meth
Heat Fluid Flow (to appear)
30. Löhner R, Corrigan A, Wichmann K-R, Wall W (2013) On the achievable speeds of finite
difference solvers on CPUs and GPUs. AIAA-2013-2852
31. Löhner R, Luo H, Baum JD, Rice D (2008) Improvements in speed for explicit, transient
compressible flow solvers. Int J Num Meth Fluids 56(12):2229–2244
32. Löhner R, Parikh P (1988) Three-dimensional grid generation by the advancing front method.
Int J Num Meth Fluids 8:1135–1149
33. Löhner R, Ramamurti R, Martin D (1993) A parallelizable load balancing algorithm.
AIAA-93-0061
34. Marcum DL, Weatherill NP (1995) Unstructured grid generation using iterative point insertion
and local reconnection. AIAA J 33(9):1619–1625
35. Mehrota P, Saltz J, Voigt R (eds) (1992) Unstructured scientific computation on scalable
multiprocessors. MIT Press, Cambridge
36. Okusanya T, Peraire J (1996) Parallel unstructured mesh generation. In: Proceedings of 5th
international conference number grid generation in CFD and related fields, Mississippi, April
1996
37. Okusanya T, Peraire J (1997) 3-D parallel unstructured mesh generation. In: Proceedings of
joint ASME/ASCE/SES summer meeting 1997
38. Peraire J, Morgan K, Peiro J (1990) Unstructured finite element mesh generation and adaptive
procedures for CFD. AGARD-CP-464, p 18
39. Peraire J, Morgan K, Peiro J (1992) Adaptive remeshing in 3-D. J Comp Phys 103(2):269–285
40. Peraire J, Peiro J, Formaggia L, Morgan K, Zienkiewicz OC (1988) Finite element Euler
calculations in three dimensions. Int J Num Meth Eng 26:2135–2159
41. Peraire J, Vahdati M, Morgan K, Zienkiewicz OC (1987) Adaptive remeshing for compressible
flow computations. J Comp Phys 72:449–466
42. Said R, Weatherill NP, Morgan K, Verhoeven NA (1999) Distributed parallel Delaunay mesh
generation. Comp Meth Appl Mech Eng 177:109–125
43. Shostko A, Löhner R (1995) Three-dimensional parallel unstructured grid generation. Int J
Num Meth Eng 38:905–925
44. Simon H (1991) Partitioning of unstructured problems for parallel processing. NASA Ames
Tech Rep RNR-91-008
45. Stück A, Camelli F, Löhner R (2010) Adjoint-based design of shock mitigation devices. Int J
Num Meth Fluids 64:443–472
46. Togashi F, Baum JD, Mestreau E, Löhner R, Sunshine D (2010) Numerical simulation of
long-duration blast wave evolution in confined facilities. Shock Waves 20(5):409–424
47. Tremel U, Sorensen KA, Hitzel S, Rieger H, Hassan O, Weatherill NP (2006) Parallel remeshing
of unstructured volume grids for CFD applications. Int J Num Meth Fluids 53(8):1361–1379
48. Vidwans A, Kallinderis Y, Venkatakrishnan V (1993) A parallel load balancing algorithm for
3-D adaptive unstructured grids. AIAA-93-3313-CP
49. Walshaw C, Cross M (2000) Parallel optimisation algorithms for multi-level mesh partitioning.
Parallel Comput 26:1635–1660
50. Weatherill NP, Hassan O (1994) Efficient three-dimensional Delaunay triangulation with
automatic point creation and imposed boundary constraints. Int J Num Meth Eng 37:2005–2039
51. Weatherill NP (1992) Delaunay triangulation in computational fluid dynamics. Comp Math
Appl 24(5/6):129–150
52. Williams D (1990) Performance of dynamic load balancing algorithms for unstructured grid
calculations. CalTech Rep C3P913
53. Yoshimura S, Nitta H, Yagawa G, Akiba H (1998) Parallel automatic mesh generation method
of ten-million nodes problem using fuzzy knowledge processing and computational geometry.
In: Proceedings of 4th world cong comp mech Buenos Aires, Argentina, July 1998
Partitioned Solution of Coupled Stochastic
Problems
Abstract This work is concerned with the propagation of uncertainty across coupled
problems with high-dimensional random inputs. A stochastic model reduction
approach based on low-rank separated representations is proposed for the parti-
tioned treatment of the uncertainty space. The construction of the coupled solution
is achieved through a sequence of approximations with respect to the dimensionality
of the random inputs associated with each individual subproblem and not the combined
dimensionality, hence drastically reducing the overall computational cost. The
coupling between the sub-domain solutions is done via the classical Finite Element
Tearing and Interconnecting (FETI) method, thus providing a well-suited framework
for parallel computing. A high-dimensional stochastic problem, a coupled 2D elliptic
PDE with random diffusion coefficient, has been considered in this paper to study
the performance and accuracy of the proposed stochastic coupling approach.
1 Introduction
For coupled problems, a partitioned solution procedure, which allows the re-use of
the subproblem solvers and the accompanying software, has a long history and
well-developed procedures; see e.g. [18] and the references therein. Here we combine this
idea with that of uncertainty propagation, which is steadily gaining in importance,
in order to obtain realistic predictions of the effects of such uncertainties and to quantify
their impact on Quantities of Interest (QoI). Uncertainty quantification (UQ),
an emerging field in computational engineering and science, is concerned with the
development of rigorous and efficient solutions to this exercise. It is quite common
2 Coupled Problems
To introduce the problem and notation, where we follow [11], first look at a single
system which will be denoted abstractly as
$$ \frac{\partial}{\partial t} u(t) + A(q; u(t)) = f(q; t), \tag{1} $$
where $u(t) \in \mathcal{U}$ describes the state of the system at time $t \in [0, T]$ lying in a Hilbert
space $\mathcal{U}$ (for the sake of simplicity), $A$ is a (possibly non-linear) operator modelling
the physics of the system, and $f \in \mathcal{U}^*$ is some external influence (action / excitation
/ loading). The model depends on some parameters $q \in \mathcal{Q}$ which are uncertain. For
our purposes here it will be sufficient to look at the stationary case when $\partial u/\partial t = 0$
and $\partial f/\partial t = 0$:
$$ A(q; u) = f(q). \tag{2} $$
Often an example such as Eq. (2) is the stationary condition of some functional or
potential on $\mathcal{U}$, i.e. it is equivalent to
$$ \frac{\partial}{\partial t} u(x, t) - \operatorname{div}\big(\kappa(x) \nabla u(x, t)\big) = f(x, t), \quad x \in G, \tag{4} $$
tensor $\kappa$, or the initial conditions $u(x, 0)$. The stationary case of Eq. (4) is well-known
to be the gradient of
$$ \Phi(u) = \frac{1}{2} \int_G \nabla u(x) \cdot \kappa(x) \cdot \nabla u(x) \, dx - \int_G u(x) f(x) \, dx, \tag{5} $$
where we have assumed homogeneous Dirichlet boundary conditions in Eq. (5) for
simplicity. Later in the numerical example two such systems will be coupled at the
common part of the boundary.
Focusing on the stationary case Eq. (2), assume that we are also given an iterative
solver, convergent for all fixed values of $q$, which generates successive iterates for
$k = 0, \ldots$ converging to the solution $u^*(q)$:
$$ u^{(k+1)}(q) = S\big(q; u^{(k)}(q), R(q; u^{(k)}(q))\big), \quad \text{with } u^{(k)}(q) \to u^*(q), \tag{6} $$
where $S$ is one cycle of the solver which may also depend on the iteration counter $k$,
$u^{(0)}(q)$ is some starting vector, and $R(q; u^{(k)}(q)) := f(q) - A(q; u^{(k)})$ is the residuum
of Eq. (2). Obviously, when the residuum vanishes, i.e. $R(q; u^*(q)) = 0$, the mapping
$S$ has a fixed point $u^*(q) = S(q; u^*(q), 0)$.
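The residual-driven iteration of Eq. (6) can be illustrated with a minimal sketch. The scalar operator `A(q; u) = q*u + u**3`, the right-hand side `f = 2` and the Newton-type cycle used for `S` are assumptions chosen purely for illustration; they are not taken from the chapter.

```python
def residual(q, u, f=2.0):
    """R(q; u) = f - A(q; u) for the toy operator A(q; u) = q*u + u**3 (assumed)."""
    return f - (q * u + u**3)


def solver_cycle(q, u, r):
    """One cycle S(q; u, R): here a Newton correction driven by the residual r."""
    return u + r / (q + 3.0 * u**2)


def iterate(q, u0=0.0, tol=1e-12, max_iter=50):
    """Repeat u <- S(q; u, R(q; u)) until the residual is (almost) zero."""
    u = u0
    for k in range(max_iter):
        r = residual(q, u)
        if abs(r) < tol:
            break
        u = solver_cycle(q, u, r)
    return u, k


u_star, n_iter = iterate(q=1.0)
print(u_star, n_iter)
```

For `q = 1` the exact solution of this toy problem is `u = 1`, and feeding a zero residual into `solver_cycle` returns its input unchanged, which is the fixed-point property stated above.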
We now turn to coupled systems, where we follow [18], and for the sake of simplicity,
we look at only two systems (again in abstract form) which are coupled:
$$ A_I(q_I; u_I, u_{II}, \lambda) = f_I(q_I), \quad A_{II}(q_{II}; u_{II}, u_I, \lambda) = f_{II}(q_{II}), \tag{7} $$
where $u_I, u_{II}$ and $q_I, q_{II}$ are the state and parameters of system $I$ or $II$. Later again
we will assume that the two equations in Eq. (7) are the partial derivatives of some
functionals $\Phi_I(q_I; u_I, u_{II})$ and $\Phi_{II}(q_{II}; u_{II}, u_I)$. The Lagrange multiplier of the
coupling is $\lambda$, and the coupling condition is
Often the system $I$ in Eq. (7) does not depend on $u_{II}$, and vice versa, which may
make the coupling a bit looser. It makes no difference for what is to follow.
A partitioned solution procedure assumes that we have solvers separately for
the two equations in Eq. (7); more precisely that in the first equation, for $u_{II}$
and $\lambda, q_I, q_{II}$ fixed, that equation can be solved by a given procedure (iterating
a map like in Eq. (6) to convergence), $u_I^{k+1} = S_I(q_I; f_I, u_I^k, u_{II}, \lambda)$, and vice
versa for the second equation. Additionally we assume that for fixed $q_I, q_{II}$
and $u_I, u_{II}$, the coupling Eq. (8) determines $\lambda$, again produced by an iterator
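Due to the symmetry, the equations Eq. (9) are the stationarity conditions for the
deterministic Lagrangian in Eq. (10), namely
$$ \delta_{u_I} \mathcal{L} = 0, \quad \delta_{u_{II}} \mathcal{L} = 0, \quad \delta_{\lambda} \mathcal{L} = 0, $$
$$ \begin{aligned} \mathcal{L}(q_I, q_{II}; u_I, u_{II}, \lambda) &= \Phi_I(q_I; u_I) + \Phi_{II}(q_{II}; u_{II}) + \lambda^T (C_I u_I - C_{II} u_{II}) \\ &:= \tfrac{1}{2} u_I^T K_I(q_I) u_I - u_I^T f_I(q_I) + \tfrac{1}{2} u_{II}^T K_{II}(q_{II}) u_{II} - u_{II}^T f_{II}(q_{II}) + \lambda^T (C_I u_I - C_{II} u_{II}). \end{aligned} \tag{10} $$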
A block-Gauss-Seidel type algorithm for solving Eq. (9) is the FETI method, which is
used here; detailed descriptions may be found in e.g. [7, 8, 23]. For nonlinear systems
it may be worthwhile to use more advanced methods than block-Gauss-Seidel, e.g.
quasi-Newton methods; see [18] for an analysis of such cases, in particular for
fluid-structure interaction. This finishes our brief review of the partitioned solution of
coupled systems.
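To make the role of the Lagrange multiplier concrete, the following is a minimal sketch of the dual (interface) elimination that FETI-type methods build on, for two small linear subsystems with stiffness matrices `K1`, `K2`, constraint matrices `C1`, `C2` and loads `f1`, `f2`, coupled by `C1 u1 - C2 u2 = 0`. The matrices are made up for illustration, both subdomain matrices are assumed invertible (no floating subdomains), and the dual problem is solved directly here, whereas an actual FETI implementation would typically iterate on it.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= m * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def dual_solve(K1, K2, C1, C2, f1, f2):
    """Eliminate u1, u2 and solve the interface problem for the multiplier."""
    a1, a2 = solve(K1, f1), solve(K2, f2)      # local solves K^{-1} f
    Z1 = [solve(K1, row) for row in C1]        # K1^{-1} C1^T, one column per multiplier
    Z2 = [solve(K2, row) for row in C2]        # K2^{-1} C2^T
    m = len(C1)
    # dual operator F = C1 K1^{-1} C1^T + C2 K2^{-1} C2^T and right-hand side d
    F = [[dot(C1[i], Z1[j]) + dot(C2[i], Z2[j]) for j in range(m)] for i in range(m)]
    d = [dot(C1[i], a1) - dot(C2[i], a2) for i in range(m)]
    lam = solve(F, d)
    # back-substitution: u1 = K1^{-1}(f1 - C1^T lam), u2 = K2^{-1}(f2 + C2^T lam)
    u1 = [a1[n] - sum(lam[j] * Z1[j][n] for j in range(m)) for n in range(len(f1))]
    u2 = [a2[n] + sum(lam[j] * Z2[j][n] for j in range(m)) for n in range(len(f2))]
    return u1, u2, lam


# Illustrative data: two 2-DOF subsystems sharing one interface value.
K1 = [[2.0, -1.0], [-1.0, 2.0]]
K2 = [[3.0, -1.0], [-1.0, 2.0]]
C1 = [[0.0, 1.0]]   # picks the last DOF of subsystem I
C2 = [[1.0, 0.0]]   # picks the first DOF of subsystem II
f1, f2 = [1.0, 0.0], [0.0, 1.0]

u1, u2, lam = dual_solve(K1, K2, C1, C2, f1, f2)
print(u1, u2, lam)
```

Only subdomain solves with `K1` and `K2` appear in the sketch, which is what allows existing subproblem solvers to be re-used.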
410 M. Hadigol et al.
3 Stochastic Problems
Consider the stationary version of Eq. (1) shown in Eq. (2), where one now is interested
in capturing the dependence on $q$. To make it clear that we regard both Eqs. (1)
and (2) as equations in $\mathcal{U}^*$, we also denote the equivalent weak form in Eq. (11):
$$ q \in L_2(\Omega, \mathcal{Q}) \cong \mathcal{Q} \otimes L_2(\Omega) = \mathcal{Q} \otimes \mathcal{S} =: \mathscr{Q}, $$
where we regard the tensor product (here and later) as completion in the norm
induced by the inner product on $\mathscr{Q}$ by $\langle q_1 \otimes r_1, q_2 \otimes r_2 \rangle_{\mathscr{Q}} := \langle q_1, q_2 \rangle_{\mathcal{Q}} \langle r_1, r_2 \rangle_{\mathcal{S}}$
and extended by linearity. The system model is now
and the state $u = u(\omega)$ becomes a $\mathcal{U}$-valued random variable (RV), an element of
the tensor space $\mathscr{U} := \mathcal{U} \otimes \mathcal{S}$. We may write this also as a variational statement, cf.
Eq. (11), which facilitates both theory and numerical approximation, e.g. [1, 2, 16,
17]
$$ \forall v \in \mathscr{U} : \quad \langle\!\langle A(q; u), v \rangle\!\rangle = \langle\!\langle f, v \rangle\!\rangle \ \big(= E(\langle f, v \rangle_{\mathcal{U}})\big), \tag{13} $$
so that under certain assumptions one obtains a theory for well-posed stochastic
partial differential equations (SPDEs). As the input data $q$, right hand side $f$, and
solution $u$ are elements of tensor product spaces, this points to the later use of
low-rank tensor approximations for efficient approximation algorithms, which will be
crucial for the separated representation. In case that Eq. (2) is a gradient as in Eq. (3),
this will carry over to Eq. (13), which is equivalent to
Assume that the operator equation $A(q; u) = f(q)$ has already been discretised
by your favourite method, e.g. FEM or FVM or similar. This is essentially
the choice of a finite-dimensional subspace $\mathcal{U}_N = \operatorname{span}\{v_n\}_{n=1}^N \subset \mathcal{U}$, with
$u \approx \sum_n u_n(q) v_n \in \mathcal{U}_N$.
More importantly, a discretisation of $q$ and $u(q)$ is needed. Stochastic processes
and random parameter fields usually need infinitely many RVs $\{\xi_1(\omega), \xi_2(\omega), \ldots\}$,
so that $q = q(\xi_1(\omega), \xi_2(\omega), \ldots) \in \mathscr{Q}$, where the $\xi_m \in \mathcal{S}$ are known RVs. A similar
representation may be obtained for $f(q)$ [19]. Hence the solution is also a function
of the $\xi_m$: $u(\xi_1, \xi_2, \ldots) = \sum_n u_n(\xi_1, \xi_2, \ldots) v_n$. We discretise further by truncation
to a finite number of RVs: $\xi(\omega) = [\xi_1(\omega), \ldots, \xi_M(\omega)] \in \mathbb{R}^M$, so that $q = q(\xi) = q(\xi(\omega))$ and
$$ u = u(\xi) = u(\xi(\omega)) \approx \sum_n u_n(\xi) v_n. \tag{16} $$
If we take this ansatz Eq. (18) and insert it into Eq. (13), the residuum $R(q(\xi); u(\xi))$
will usually not vanish for all $\xi$, as the finite set of functions $\{X_\beta\}$ can not match
all possible parametric variations of $u(\xi)$. To determine the coefficients $u_n^\beta$, one may
choose another set of functions (known RVs) $\psi_\alpha(\xi) \in \mathcal{S}$ for projection, so that the
weighted residual vanishes:
$$ \forall \alpha, k : \quad \langle\!\langle \psi_\alpha v_k, R(q(\xi); u(\xi)) \rangle\!\rangle = \Big\langle\!\Big\langle \psi_\alpha v_k, f(\xi) - A\Big(q(\xi); \sum_{n,\beta} u_n^\beta X_\beta(\xi) v_n\Big) \Big\rangle\!\Big\rangle = 0, \tag{19} $$
yielding a generally coupled system of equations of size $N \times B$ for the $\mathbf{u} := \{u_n^\beta\}$,
the tensor coefficients representing the solution $u(\xi)$.
This general Galerkin method (also called the method of weighted residuals)
usually comes in the flavours of a Bubnov-Galerkin method where $\psi_\alpha = X_\alpha$ and
the system of equations is coupled for all $u_n^\beta$, or as a Petrov-Galerkin method with
$\psi_\alpha \neq X_\alpha$. In the latter case, a frequent choice is $\psi_\alpha(\xi) = \delta(\xi - \xi_\alpha)$, i.e.
collocation / interpolation at the points $\xi_\alpha$, where $\delta(\xi - \xi_\alpha)$ is the Dirac-$\delta$ at $\xi_\alpha$.
By additionally ensuring the Kronecker-$\delta$ property $X_\beta(\xi_\alpha) = \delta_{\alpha,\beta}$, one obtains the
quasi-deterministic uncoupled collocation conditions
$$ \forall \alpha, k : \quad \Big\langle v_k, f(\xi_\alpha) - A\Big(q(\xi_\alpha); \sum_n u_n^\alpha v_n\Big) \Big\rangle = 0, \tag{20} $$
which can be solved directly for the $u_n^\alpha$ for each $\alpha$ independently with the solver
Eq. (6), i.e. $B$ systems of $N$ uncoupled equations, which are just samples at $\xi_\alpha$.
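The uncoupled collocation conditions can be sketched on a toy problem as follows. The 2x2 matrix `K(xi)`, the load `f`, the single uniform random variable on [-1, 1] and the choice of three Gauss-Legendre nodes as collocation points are all assumptions made for illustration only.

```python
import math


def stiffness(xi):
    """Toy parametric operator K(xi); positive definite for xi in [-1, 1]."""
    return [[2.0 + xi, -1.0], [-1.0, 2.0]]


def solve2(K, f):
    """Deterministic solver for one sample (Cramer's rule for a 2x2 system)."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(f[0] * K[1][1] - K[0][1] * f[1]) / det,
            (K[0][0] * f[1] - K[1][0] * f[0]) / det]


f = [1.0, 1.0]

# Three-point Gauss-Legendre rule on [-1, 1], rescaled to probability weights
# of a uniform random variable (quadrature weights divided by 2).
nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]
weights = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]

# B independent deterministic solves, one per collocation point.
samples = [solve2(stiffness(xi), f) for xi in nodes]

# Post-processing: mean and standard deviation by quadrature.
mean = [sum(w * u[i] for w, u in zip(weights, samples)) for i in range(2)]
var = [sum(w * (u[i] - mean[i]) ** 2 for w, u in zip(weights, samples)) for i in range(2)]
std = [math.sqrt(v) for v in var]
print(mean, std)
```

Each solve is an ordinary deterministic problem, so the loop over `nodes` could be distributed without any communication between samples.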
In the Bubnov-Galerkin case, if additionally the equation is a gradient as in
Eq. (14) (which we will assume from now on), Eq. (19) is equivalent to minimising
the potential in Eq. (15) over the subspace $\mathcal{U}_N \otimes \mathcal{S}_B \subset \mathcal{U} \otimes \mathcal{S}$. In Sect. 3.3
we will look at low-rank tensor representations; for a rank-$R$ tensor these may be seen
as multi-linear maps $F_R \in L^R(\mathcal{U} \times \mathcal{S}, \mathcal{U} \otimes \mathcal{S})$, i.e. $F_R : \mathcal{U}^R \times \mathcal{S}^R \to \mathscr{U} = \mathcal{U} \otimes \mathcal{S}$;
they give a formal way to describe such representations. Then we will replace the
functional $\boldsymbol{\Phi}$ by the composition $\boldsymbol{\Phi} \circ F_r$ or similar for some $r$, and thereby pose the
minimisation problem on $\mathcal{U}^r \times \mathcal{S}^r$.
$$ u(\xi) \approx \sum_{r=1}^{R} w^r \otimes \varphi^r = \sum_{r=1}^{R} \varphi^r(\xi) w^r. \tag{21} $$
One may observe that the tensor $(u_n^\beta)$ has $N \times B$ terms, whereas a canonical
decomposition such as Eq. (21) has $R \times (N + B)$. If $R \ll \min(N, B)$, which we hope
for and what we mean by a low-rank separated representation, then $R \times (N + B) \ll (N \times B)$.
This not only saves memory, but most importantly also computation
when we can operate on the terms in Eq. (21) directly.
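The storage comparison can be checked with a few lines of code. The sizes `N = 1000`, `B = 500`, `R = 10` and the small rank-2 example are arbitrary illustrative values, and the helper names are hypothetical.

```python
def storage_full(n_space, n_stoch):
    """Number of coefficients in the full tensor (u_n^beta): N x B."""
    return n_space * n_stoch


def storage_separated(n_space, n_stoch, rank):
    """Number of coefficients in a rank-R separated form: R x (N + B)."""
    return rank * (n_space + n_stoch)


def assemble(w_terms, phi_terms):
    """Expand sum_r w^r (x) phi^r into the full N x B coefficient table."""
    n_space, n_stoch = len(w_terms[0]), len(phi_terms[0])
    full = [[0.0] * n_stoch for _ in range(n_space)]
    for w, phi in zip(w_terms, phi_terms):
        for n in range(n_space):
            for b in range(n_stoch):
                full[n][b] += w[n] * phi[b]
    return full


N, B, R = 1000, 500, 10
full_count = storage_full(N, B)
sep_count = storage_separated(N, B, R)
print(full_count, sep_count, full_count / sep_count)

# A tiny rank-2 example: 2 * (3 + 2) stored numbers describe a 3 x 2 table.
w_terms = [[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]]
phi_terms = [[1.0, 0.5], [2.0, -1.0]]
table = assemble(w_terms, phi_terms)
print(table)
```

In practice one would avoid calling something like `assemble` at all and work with the factors directly, which is where the computational saving comes from.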
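Assume that we have already found $R - 1$ terms $w^r \otimes \varphi^r$ in the approximation Eq. (21),
then for the next step define $u^R(\xi) = \sum_{r=1}^{R-1} \varphi^r(\xi) w^r$, and the incremental potential
$$ \boldsymbol{\Phi}^R(w^R, \varphi^R) := \boldsymbol{\Phi}(u^R + w^R \otimes \varphi^R). \tag{22} $$
$$ \langle \delta_{w^R} \boldsymbol{\Phi}^R(w^R, \varphi^R), v_k \rangle_{\mathcal{U}} = \langle\!\langle \delta_u \Phi(q(\xi); u^R(\xi) + \varphi^R(\xi) w^R), v_k \varphi^R(\xi) \rangle\!\rangle = 0, \tag{23} $$
$$ \langle \delta_{\varphi^R} \boldsymbol{\Phi}^R(w^R, \varphi^R), X_\beta \rangle_{\mathcal{S}} = \langle\!\langle \delta_u \Phi(q(\xi); u^R(\xi) + \varphi^R(\xi) w^R), w^R X_\beta(\xi) \rangle\!\rangle = 0. \tag{24} $$
In contrast to Eq. (19), the system in Eq. (23) is of size $N$ determining $w^R$ and
the system of Eq. (24) is of size $B$ determining $\varphi^R$. In more detail, Eq. (23) to be
solved for the $N$ unknowns $(w_n^R)_{n=1}^N$ in $w^R = \sum_{n=1}^N w_n^R v_n$ reads:
$$ \forall k : \quad \langle\!\langle A(q(\xi); u^R(\xi) + \varphi^R(\xi) w^R) - f(\xi), v_k \varphi^R \rangle\!\rangle = \Big\langle E\Big( \Big[ A\Big(q(\xi); u^R(\xi) + \varphi^R(\xi) \sum_n w_n^R v_n\Big) - f(\xi) \Big] \varphi^R(\xi) \Big), v_k \Big\rangle_{\mathcal{U}} = 0; \tag{25} $$
one may observe that the operator is not a sample, but some average weighted with
$\varphi^R$. Similarly, the Eq. (24) to determine the $B$ unknowns $(\varphi_\beta^R)_{\beta=1}^B$ in
$\varphi^R(\xi) = \sum_\beta \varphi_\beta^R X_\beta(\xi)$ is in more detail
$$ \forall \beta : \quad \langle\!\langle A(q(\xi); u^R(\xi) + \varphi^R(\xi) w^R) - f(\xi), w^R X_\beta \rangle\!\rangle = E\Big( \Big\langle A\Big(q(\xi); u^R(\xi) + \Big(\sum_\alpha \varphi_\alpha^R X_\alpha(\xi)\Big) w^R\Big) - f(\xi), w^R \Big\rangle_{\mathcal{U}} X_\beta(\xi) \Big) = 0. \tag{26} $$
The basic greedy rank-one updating algorithm (it may also be called the basic
ingredient in separated representation, proper generalised decomposition (PGD), or
successive rank-one updating (SR1U), a form of alternating least squares (ALS)) is
simply formulated in Algorithm 2.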
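A minimal sketch of such a greedy rank-one update, under simplifying assumptions that are not taken from the chapter: the problem is linear, `K(xi) u = f`, with a symmetric positive definite `K(xi)`; the random variable is replaced by three weighted sample points; and the stochastic factor is represented by its values at these points, so that its update decouples into scalar equations. The function names and the test matrices are hypothetical.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= m * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def energy(K_samples, f, probs, U):
    """Expected potential E[1/2 u^T K u - u^T f] over the sample points."""
    return sum(p * (0.5 * dot(u, matvec(K, u)) - dot(u, f))
               for K, p, u in zip(K_samples, probs, U))


def greedy_rank_one(K_samples, f, probs, max_rank=3, als_iters=25):
    B, N = len(K_samples), len(f)
    U = [[0.0] * N for _ in range(B)]        # U[a] approximates u(xi_a)
    energies = []
    for _ in range(max_rank):
        # residual f - K(xi_a) u^R(xi_a) of the current approximation
        res = [[fi - ku for fi, ku in zip(f, matvec(K, u))]
               for K, u in zip(K_samples, U)]
        if max(abs(x) for row in res for x in row) < 1e-12:
            break
        phi = [1.0] * B                       # initial stochastic factor
        for _ in range(als_iters):
            # w-step: one N x N system with a phi-weighted average operator
            K_avg = [[sum(p * ph * ph * K[i][j]
                          for K, p, ph in zip(K_samples, probs, phi))
                      for j in range(N)] for i in range(N)]
            rhs = [sum(p * ph * r[i] for p, ph, r in zip(probs, phi, res))
                   for i in range(N)]
            w = solve(K_avg, rhs)
            # phi-step: with nodal values this is one scalar equation per point
            phi = [dot(w, r) / dot(w, matvec(K, w)) for K, r in zip(K_samples, res)]
        for u, ph in zip(U, phi):
            for i in range(N):
                u[i] += ph * w[i]
        energies.append(energy(K_samples, f, probs, U))
    return U, energies


# Illustrative data: K(xi) = K0 + 0.5 * xi * K1 at three Gauss points.
K0 = [[3.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 3.0]]
K1_diag = [1.0, 0.5, 0.25]
nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]
probs = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]
K_samples = [[[K0[i][j] + (0.5 * xi * K1_diag[i] if i == j else 0.0)
               for j in range(3)] for i in range(3)] for xi in nodes]
f = [1.0, 0.0, 1.0]

U, energies = greedy_rank_one(K_samples, f, probs)
U_exact = [solve(K, f) for K in K_samples]
e_exact = energy(K_samples, f, probs, U_exact)
print(energies, e_exact)
```

Each added term can only lower the expected potential, so `energies` is non-increasing and bounded below by the value at the exact sample solutions; the w-step uses an averaged operator rather than a single sample, as noted above for Eq. (25).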
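$$ \boldsymbol{\Phi}^R(w^R, \varphi^R) := \boldsymbol{\Phi}(u^R + F_1(w^R, \varphi^R)), \tag{27} $$
$$ F_R : (w^1, \ldots, w^R, \varphi^1, \ldots, \varphi^R) \mapsto \sum_{r=1}^{R} w^r \otimes \varphi^r, $$
and then, after each increase of $R$, to optimise in the innermost loop the functional
$$ \boldsymbol{\Phi}_R(w^1, \ldots, w^R, \varphi^1, \ldots, \varphi^R) := \boldsymbol{\Phi}(F_R(w^1, \ldots, w^R, \varphi^1, \ldots, \varphi^R)) = \boldsymbol{\Phi}\Big(\sum_{r=1}^{R} w^r \otimes \varphi^r\Big) $$
over the vector space $\mathcal{U}_N^R \times \mathcal{S}_B^R$ instead of the functional $\boldsymbol{\Phi}^R$ from Eq. (27). The
example to be shown in Sect. 5 was computed in this way. For the sake of brevity we
will not spell out this algorithm; more details may be found in [12].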
After all this preparation, let us return to Eqs. (7) and (8), the type of system we
are interested in. We assume that, analogous to Sect. 3.2, the uncertainties $q_I$ in
subsystem $I$ have been expressed or approximated by a set of independent RVs
$\xi_I = (\xi_{1,I}, \ldots, \xi_{M_I,I})$, i.e. $q_I = q_I(\xi_I)$, and similarly the uncertainties $q_{II}$ in
subsystem $II$ by $\xi_{II} = (\xi_{1,II}, \ldots, \xi_{M_{II},II})$, i.e. $q_{II} = q_{II}(\xi_{II})$.
The quantities describing the solution will then be functions of both $\xi_I$ and $\xi_{II}$,
defined on an $M_I + M_{II}$ dimensional space, i.e. the stochastic modelling leads to
additional coupling. The difficulty is that with each additional coupled system the
dimension of the underlying variable space grows. This can of course not be avoided,
but it can be mitigated through a low-rank separated or tensor representation. Taking the
solution $u_I(\xi_I, \xi_{II})$ of system $I$ in Eq. (7) as an example, we know that (similarly to
Eq. (21)) it is representable as
$$ u_I(\xi_I, \xi_{II}) \approx \sum_{r=1}^{R} \varphi_I^r(\xi_I) \varphi_{II}^r(\xi_{II}) w_I^r = \sum_{r=1}^{R} w_I^r \otimes \varphi_I^r \otimes \varphi_{II}^r, \tag{28} $$
and
$$ \big(\operatorname{span}\{X_{I,\beta}\}_{\beta=1}^{B_I} \otimes \operatorname{span}\{X_{II,\beta}\}_{\beta=1}^{B_{II}}\big) = (\mathcal{S}_{B_I} \otimes \mathcal{S}_{B_{II}}) \subset (\mathcal{S}_I \otimes \mathcal{S}_{II}) $$
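One practical consequence of such a separated form with independent random inputs is that solution statistics factorise: expectations of products of factors in the two sets of random variables split into products of expectations. The sketch below illustrates this for the mean and the second moment. It assumes, for illustration only, that each set of random variables is replaced by a few weighted sample points and that the stochastic factors are stored as values at these points; all numbers and function names are made up.

```python
def expect(values, probs):
    """Expectation of a discrete random quantity."""
    return sum(v * p for v, p in zip(values, probs))


def separated_mean(w_terms, phi1_terms, phi2_terms, p1, p2):
    """Mean of sum_r phi1^r(xi_I) phi2^r(xi_II) w^r using independence."""
    mean = [0.0] * len(w_terms[0])
    for w, phi1, phi2 in zip(w_terms, phi1_terms, phi2_terms):
        c = expect(phi1, p1) * expect(phi2, p2)
        for i in range(len(mean)):
            mean[i] += c * w[i]
    return mean


def separated_second_moment(w_terms, phi1_terms, phi2_terms, p1, p2):
    """E[u_i^2] from the R x R Gram matrices of the stochastic factors."""
    m2 = [0.0] * len(w_terms[0])
    R = len(w_terms)
    for r in range(R):
        for s in range(R):
            g1 = expect([a * b for a, b in zip(phi1_terms[r], phi1_terms[s])], p1)
            g2 = expect([a * b for a, b in zip(phi2_terms[r], phi2_terms[s])], p2)
            for i in range(len(m2)):
                m2[i] += g1 * g2 * w_terms[r][i] * w_terms[s][i]
    return m2


def brute_force_moments(w_terms, phi1_terms, phi2_terms, p1, p2):
    """Reference: loop over the full tensor grid of sample points."""
    n = len(w_terms[0])
    mean, m2 = [0.0] * n, [0.0] * n
    for a, pa in enumerate(p1):
        for b, pb in enumerate(p2):
            u = [sum(ph1[a] * ph2[b] * w[i]
                     for w, ph1, ph2 in zip(w_terms, phi1_terms, phi2_terms))
                 for i in range(n)]
            for i in range(n):
                mean[i] += pa * pb * u[i]
                m2[i] += pa * pb * u[i] ** 2
    return mean, m2


p1, p2 = [0.25, 0.5, 0.25], [0.5, 0.5]
phi1_terms = [[1.0, 0.0, -1.0], [1.0, 2.0, 1.0]]
phi2_terms = [[1.0, 3.0], [2.0, -1.0]]
w_terms = [[1.0, 2.0], [0.5, -1.0]]

mean = separated_mean(w_terms, phi1_terms, phi2_terms, p1, p2)
m2 = separated_second_moment(w_terms, phi1_terms, phi2_terms, p1, p2)
variance = [s - m * m for s, m in zip(m2, mean)]
ref_mean, ref_m2 = brute_force_moments(w_terms, phi1_terms, phi2_terms, p1, p2)
print(mean, variance)
```

The factorised routines never visit the product grid of sample points, whereas `brute_force_moments` does; the latter is included only as a check.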
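We turn right away to the linear coupled problem Eq. (9), which are the conditions
for a saddle-point of the Lagrangian Eq. (10). Here we make a low-rank ansatz as
in Eq. (28) for all solution quantities:
$$ \begin{bmatrix} u_I(\xi_I, \xi_{II}) \\ u_{II}(\xi_I, \xi_{II}) \\ \lambda(\xi_I, \xi_{II}) \end{bmatrix} \approx \sum_{r=1}^{R} \varphi_I^r(\xi_I) \varphi_{II}^r(\xi_{II}) \begin{bmatrix} w_I^r \\ w_{II}^r \\ \mu^r \end{bmatrix}. \tag{29} $$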
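$$ \begin{aligned} \boldsymbol{\mathcal{L}}(q_I, q_{II}; u_I, u_{II}, \lambda) &:= \boldsymbol{\Phi}_I(q_I; u_I) + \boldsymbol{\Phi}_{II}(q_{II}; u_{II}) + \lambda^T (C_I u_I - C_{II} u_{II}) \\ &= E\big(\Phi_I(q_I(\xi_I); u_I(\xi_I, \xi_{II}))\big) + E\big(\Phi_{II}(q_{II}(\xi_{II}); u_{II}(\xi_I, \xi_{II}))\big) \\ &\quad + E\big(\lambda(\xi_I, \xi_{II})^T (C_I u_I(\xi_I, \xi_{II}) - C_{II} u_{II}(\xi_I, \xi_{II}))\big) \end{aligned} \tag{30} $$
Following the general prescription in Sect. 3.3, as in Eq. (22), assume that a low-rank
representation like Eq. (29) up to terms of $R - 1$ has already been computed, define
the abbreviations
$$ \begin{bmatrix} u_I^R(\xi_I, \xi_{II}) \\ u_{II}^R(\xi_I, \xi_{II}) \\ \lambda^R(\xi_I, \xi_{II}) \end{bmatrix} := \sum_{r=1}^{R-1} \begin{bmatrix} w_I^r \\ w_{II}^r \\ \mu^r \end{bmatrix} \varphi_I^r(\xi_I) \varphi_{II}^r(\xi_{II}), $$
and look at the incremental Lagrangian (cf. Eq. (22)) corresponding to Eq. (30):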
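The conditions for stationarity of the incremental Lagrangian $\boldsymbol{\mathcal{L}}^R$ from Eq. (31), as
before in Eqs. (23) and (24), are
$$ \forall v_{I,i}, v_{II,j}, \eta : \quad 0 = \begin{bmatrix} \delta_{w_I^R} \boldsymbol{\mathcal{L}}^R \\ \delta_{w_{II}^R} \boldsymbol{\mathcal{L}}^R \\ \delta_{\mu^R} \boldsymbol{\mathcal{L}}^R \end{bmatrix} = \begin{bmatrix} \langle\!\langle \delta_{u_I} \mathcal{L}, v_{I,i} \varphi_I^R \varphi_{II}^R \rangle\!\rangle \\ \langle\!\langle \delta_{u_{II}} \mathcal{L}, v_{II,j} \varphi_I^R \varphi_{II}^R \rangle\!\rangle \\ \langle\!\langle \delta_{\lambda} \mathcal{L}, \eta \varphi_I^R \varphi_{II}^R \rangle\!\rangle \end{bmatrix}, \tag{32} $$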
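From this and Eqs. (10) and (30) one obtains after a short calculation for Eq. (32)
$$ \begin{bmatrix} \hat{K}_I & 0 & \hat{C}_I^T \\ 0 & \hat{K}_{II} & -\hat{C}_{II}^T \\ \hat{C}_I & -\hat{C}_{II} & 0 \end{bmatrix} \begin{bmatrix} w_I^R \\ w_{II}^R \\ \mu^R \end{bmatrix} = \begin{bmatrix} \hat{f}_I \\ \hat{f}_{II} \\ \hat{g} \end{bmatrix}, \tag{34} $$
which has exactly the same size and structure as Eq. (9). Hence it can be solved with
the deterministic FETI algorithm. The averaged deterministic-size terms in Eq. (34)
are
$$ \begin{aligned} \hat{K}_I &= E\big(\varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II}) K_I(\xi_I) \varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big) = E\big(\varphi_{II}^R(\xi_{II})^2\big) \, E\big(\varphi_I^R(\xi_I)^2 K_I(\xi_I)\big), \\ \hat{K}_{II} &= E\big(\varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II}) K_{II}(\xi_{II}) \varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big) = E\big(\varphi_I^R(\xi_I)^2\big) \, E\big(\varphi_{II}^R(\xi_{II})^2 K_{II}(\xi_{II})\big), \\ \hat{C}_I &= C_I \, E\big(\varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big), \quad \hat{C}_{II} = C_{II} \, E\big(\varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big), \\ \hat{f}_I &= E\big((f_I(\xi_I) - K_I(\xi_I) u_I^R(\xi_I, \xi_{II})) \, \varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big), \\ \hat{f}_{II} &= E\big((f_{II}(\xi_{II}) - K_{II}(\xi_{II}) u_{II}^R(\xi_I, \xi_{II})) \, \varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big), \\ \hat{g} &= E\big((C_I^T u_I^R(\xi_I, \xi_{II}) - C_{II}^T u_{II}^R(\xi_I, \xi_{II})) \, \varphi_I^R(\xi_I) \varphi_{II}^R(\xi_{II})\big). \end{aligned} $$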
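For Eq. (33) one obtains in the first relation for $\varphi_I^R = \sum_\beta \varphi_{I,\beta}^R X_{I,\beta}$, after some
computation, $\forall X_{I,\alpha}$:
$$ \sum_\beta E\big(X_{I,\alpha}(\xi_I) \, \varphi_{II}^R(\xi_{II})^2 \, k(\xi_I, \xi_{II}) \, X_{I,\beta}(\xi_I)\big) \, \varphi_{I,\beta}^R = E\big(\varphi_{II}^R(\xi_{II}) \, \rho(\xi_I, \xi_{II}) \, X_{I,\alpha}(\xi_I)\big), \tag{35} $$
with the abbreviations $\rho(\xi_I, \xi_{II}) := u_I^R(\xi_I, \xi_{II})^T f_I(\xi_I) + u_{II}^R(\xi_I, \xi_{II})^T f_{II}(\xi_{II})$
and
The Eq. (35) is a linear symmetric positive definite system of size $B_I \times B_I$, as
the second relation in Eq. (33) yields a similar linear system of size $B_{II} \times B_{II}$ such
that $\forall X_{II,\alpha}$:
$$ \sum_\beta E\big(X_{II,\alpha}(\xi_{II}) \, \varphi_I^R(\xi_I)^2 \, k(\xi_I, \xi_{II}) \, X_{II,\beta}(\xi_{II})\big) \, \varphi_{II,\beta}^R = E\big(\varphi_I^R(\xi_I) \, \rho(\xi_I, \xi_{II}) \, X_{II,\alpha}(\xi_{II})\big). \tag{36} $$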
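    Solve linear s.p.d. system Eq. (36) for $\varphi_{II,k}^R$ using $(w_{I,k}^R, w_{II,k}^R, \mu_k^R, \varphi_{I,k}^R)$, minimising $\boldsymbol{\mathcal{L}}^R$ w.r.t. $\varphi_{II,k}^R$.
  end for
  $w_I^R := w_{I,k}^R$; $w_{II}^R := w_{II,k}^R$; $\mu^R := \mu_k^R$; $\varphi_I^R := \varphi_{I,k}^R$; $\varphi_{II}^R := \varphi_{II,k}^R$;
  $u_I^{R+1} := u_I^R + w_I^R \varphi_I^R \varphi_{II}^R$; $u_{II}^{R+1} := u_{II}^R + w_{II}^R \varphi_I^R \varphi_{II}^R$; $\lambda^{R+1} := \lambda^R + \mu^R \varphi_I^R \varphi_{II}^R$;
end for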
may again be found in [12]. Of course, in that case each subproblem in the innermost
loop is R times larger, but the algorithm can find a better approximation with smaller
R. This is how the example in the following Sect. 5 was computed.
5 Computational Example
The example of a coupled problem (taken from [12] and shown here in Fig. 1) can
equally well be viewed as a problem which has been partitioned. This exemplifies
another way in which the methods presented here can be used, namely to partition
large problems to break them up into manageable pieces. The L-shaped domain
in Fig. 1 has been partitioned as indicated, and thus is a coupled problem. It is a
diffusion problem with a random diffusion coefficient, where we assume that the
uncertainty in this coefficient can be modelled by independent RVs belonging to the
respective subdomains. This means that the diffusion coefficients in the two subdomains
are not correlated.
Fig. 2 Separated Approximation: a Value of the Lagrangian, b Errors: mean and std. deviation (horizontal axes: separation rank r; vertical axis in b: relative error)
[Contour plot with panels (a) and (b); axes x1 and x2]
The coupling conditions are enforced by Lagrange multipliers, and the coupled
problem was solved by a FETI method, and the total problem by the extension of
Algorithm 3 as alluded to at the end of Sect. 4.2. In Fig. 2a the convergence of the
value of the Lagrangian with increasing rank can be observed, where it may be noted
that beyond rank R = 7 the value does not change much any more. In Fig. 2b we show
the decrease of error for the overall mean and standard deviation with increasing rank.
In the following two figures we show the contour lines (for the mean in Fig. 3 and for
the standard deviation in Fig. 4) for rank R = 1 in part a) and for R = 20 in part b).
The converged contours are shown as dashed lines in all cases. It may be observed
that for the mean in Fig. 3a the contour lines are already quite accurate for R = 1.
[Contour plot with panels (a) and (b); axes x1 and x2]
6 Concluding Remarks
We have indicated a fast but partitioned computational framework for the propagation
of uncertainty through coupled problems, which also saves storage. The proposed
approach constructs a solution-adaptive stochastic basis of separated form with
respect to the random inputs characterising the uncertainty in each sub-problem,
leading to a partitioned treatment of the stochastic space and consequently to a
higher scalability of the method as compared with standard uncertainty propagation
approaches. For situations where the separation rank is small, the proposed approach
provides at the same time a reduced order representation of the coupled solution.
The deterministic coefficients associated with each separated stochastic basis cap-
ture the spatial variability of the solution and are computed via the standard finite ele-
ment tearing and interconnecting (FETI) approach. Therefore, the method achieves a
high level of parallelism while requiring no intrusion in each domain solver. Although
our present formulation of domain coupling is based on the standard FETI approach,
we foresee no major technical difficulties in employing more advanced domain cou-
pling schemes.
The proposed framework was demonstrated through its application to a linear
elliptic PDE with high-dimensional random inputs. Despite the high-dimensionality
of the random inputs, accurate estimates of the solution statistics were achieved with
relatively low separation ranks, thus demonstrating the effectiveness of the present
approach.
Acknowledgments The authors are indebted for the fruitful discussions they had with Prof. K.C.
Park from University of Colorado, Boulder. AD gratefully acknowledges the financial support of
the Department of Energy under Advanced Scientific Computing Research Early Career Research
Award DE-SC0006402. MH's work was supported by the National Science Foundation grant
CMMI-1201207. The work of HGM and RN has been partly supported by the German Research
Foundation Deutsche Forschungsgemeinschaft (DFG).
References
24. Subber W, Sarkar A (2012) Domain decomposition method of stochastic PDEs: a two-level
scalable preconditioner. J Phys: Conf Ser 341(1):012033
25. Xiu D (2009) Fast numerical methods for stochastic computations: a review. Commun Comput
Phys 5(2–4):242–272
26. Xiu D (2010) Numerical methods for stochastic computations: a spectral method approach.
Princeton University Press, Princeton
27. Xiu D, Hesthaven J (2005) High-order collocation methods for differential equations with
random inputs. SIAM J Sci Comput 27(3):1118–1139
28. Zhang Z, Choi M, Karniadakis GE (2009) Anchor points matter in ANOVA decomposition.
In: Spectral and higher order methods for partial differential equations. Lecture Notes in
Computational Science and Engineering, Trondheim, pp 347–355