Stochastic Multiscale
A Dissertation
by
Baskar Ganapathysubramanian
August 2008
© 2008 Baskar Ganapathysubramanian
The author was born in Madurai, India, in October 1981. After completing high
school, the author was admitted into the Mechanical Engineering program
at IIT Madras in 1999, from where he received his Bachelor's degree in engineering
in June 2003. The author entered the doctoral program at the Sibley School
of Mechanical and Aerospace Engineering, Cornell University, in August 2003.
This thesis is dedicated to my parents and brother for their constant support
and encouragement towards academic pursuits during my school years.
"We have not succeeded in answering all our problems. The answers we have found
only serve to raise a whole set of new questions. In some ways we feel we are as
confused as ever, but we believe we are confused on a higher level and about more
important things." - Posted outside the mathematics reading room,
Tromsø University
ACKNOWLEDGEMENTS
I would like to thank my thesis advisor, Professor Nicholas Zabaras, for his con-
stant support and guidance over the last 3 years. I would also like to thank
Professors Tomas Arias and Michael Thompson for serving on my special com-
mittee and for their encouragement and suggestions at various times during the
course of this work.
The financial support for this project was provided by NASA (grant NAG8-
1671) and the National Science Foundation (grant DMI-0113295). I would like
to thank the Sibley School of Mechanical and Aerospace Engineering for having
supported me through a teaching assistantship for part of my study at Cornell.
The computing for this project was supported by the Cornell Theory Center dur-
ing 2003-2008. The parallel simulators were developed based on open source
and Dmitry. You have made life beyond the lab fun and fulfilling!
TABLE OF CONTENTS
1 Introduction
1.1 Introduction
1.2 Some definitions and mathematical preliminaries
3 Data-driven approaches to constructing stochastic input models
3.1 Representing input randomness
3.1.1 Input parameters as random variables
3.1.2 Random field as a collection of random variables
3.1.3 The Karhunen-Loève expansion
3.1.4 Model reduction as a tool for constructing input models
3.2 Problem definition
3.3 Nonlinear model reduction: Its necessity and some basic ideas
3.3.1 Linear model reduction: where does it fail?
3.3.2 Non-linear model reduction: preliminary ideas
3.4 Mathematical formulation
3.4.1 Some definitions and the compactness of the manifold M
3.4.2 Estimating the pair-wise geodesic distances: graph approximations
3.4.3 Estimating the optimal dimension, d, of the low-dimensional representation
3.4.4 Constructing the low-dimensional parametrization, A
3.4.5 Construction of a non-parametric mapping F^{-1} : A → MS2
3.4.6 The low-dimensional stochastic input model F^{-1} : A → MS2
3.5 Numerical implementation
3.5.1 Microstructure reconstruction: creating the samples xi
3.5.2 Constructing the low-dimensional region A
3.5.3 Utilizing this reduced-order representation: Stochastic collocation
3.6 Illustrative example
3.6.1 Two-phase microstructures
3.7 Conclusions
Bibliography
LIST OF TABLES
LIST OF FIGURES
2.17 Realizations of the problem: Temperature, u velocity and v velocity components.
2.18 Mean values of the variables: Top left: Temperature, Top right: u velocity component, Bottom left: v velocity component, Bottom right: pressure.
2.19 Mean velocity vectors and streamlines.
2.20 Standard deviation of the variables: Top left: Temperature, Top right: u velocity component, Bottom left: v velocity component, Bottom right: pressure.
2.21 Probability distribution functions for the dependent variables at point (0.25, 0). Top row: Temperature and pressure, Bottom row: u and v velocity components.
2.22 Mode shift: Left: Temperature, Right: v velocity component.
2.23 Mode shift: Left: Temperature, Right: v velocity component.
2.24 Extreme realizations of temperature.
2.25 Mean and standard deviations of temperature.
2.26 Mode shift: Left: Temperature, Right: v velocity component.
2.27 Left: ACF of the surface roughness, Right: Eigen-spectrum of the correlation.
2.28 Schematic of natural convection with wall surface roughness computed from experiments.
2.29 Realizations of temperature.
2.30 Mean values of temperature and v velocity component.
2.31 Mean velocity vectors and streamlines.
2.32 Standard deviations of temperature and v velocity component.
2.33 Schematic of convection in a medium with random porosity.
2.34 Fontainebleau sandstone: Correlation function and cross-sectional image.
2.35 Eigen-spectrum of the correlation kernel.
2.36 Mean values of the variables: Top left: Temperature, Top right: u velocity component, Bottom left: v velocity component, Bottom right: pressure.
2.37 Mean velocity vectors and streamlines.
2.38 Standard deviation of the variables: Top left: Temperature, Top right: u velocity component, Bottom left: v velocity component, Bottom right: pressure.
3.3 A set of points lying on a spiral in 3D space. The global coordinates of any point on the spiral are represented as a 3-tuple. The figure on the left (a) shows the reduced-order model resulting from a linear PCA-based reduction. The figure on the right (b) depicts a non-linear strategy (data taken from [78]) that results in an accurate representation of the 3D spiral by "unravelling and smoothing" it into a 2D sheet.
3.4 Figure (a) on the left shows images of the same object (from [78]) taken at various poses (left-right and up-down), while Figure (b) on the right shows various two-phase microstructures that satisfy a specific volume fraction and two-point correlation.
3.5 The figure above illustrates the simplest possible mapping between the low-dimensional region A and the high-dimensional microstructural space MS2. Given an arbitrary point ξ ∈ A, find the point yk closest to ξ from the sampled points {yi}, i = 1, . . . , N. Assign the image value of yk, i.e. xk, to the image of ξ.
3.6 The figure above illustrates a local linear (k-neighbor) interpolation mapping between the low-dimensional region A and the high-dimensional microstructural space MS2.
3.7 Simple estimate for the interpolation error in Method 2: Let r denote the local radius of curvature of the function near the point xexact (the filled square). We approximate the curve by a linear patch, resulting in some error. This error is the distance between the approximate linear image, x (the unfilled square), and the actual point, xexact (the filled square). This distance can be approximated from simple geometry as a function of r and the local geodesic distance between the points.
3.8 The local linear interpolation method can be made exact by simply projecting the image obtained after interpolation onto the manifold. This is illustrated in the figure above, where the dotted blue line represents the linear interpolation and the curved line represents the projection operator that constructs the image lying on the manifold.
3.9 Sample illustration of the projection step: The figure on the left is a microstructure after interpolation. Projecting it onto the manifold yields the microstructure on the right. There is negligible change in the three-point correlation between the microstructures.
3.10 Experimental image of a two-phase composite (from [66]).
3.11 The two-point correlation function.
3.12 Comparison of the two-point correlation function from experiments and from the GRF.
3.13 One instance (realization) of the two-phase microstructure.
3.14 Plot of the length functional of the MST of the graph G for various sample sizes.
3.15 (Left) The cumulative eigenspectrum of the data, ∑_{i=1}^{d} λ_i / ∑_{i=1}^{N} λ_i. (Right) The residual variance for different dimensionalities of the region A, computed from MDS.
3.16 Reduction in the interpolation error with increasing number of collocation points.
3.17 Steady-state mean temperature: (a) temperature contour, (b-d) temperature iso-surfaces, and (e-g) temperature slices.
3.18 Standard deviation of temperature: (a) standard deviation contours, (b) standard deviation iso-surfaces, (c) temperature PDF at a point, and (d-f) standard deviation slices.
4.16 Convergence of the stochastic solution (mean and standard deviation) with increasing number of collocation points. (left) Stochastic coarse-scale pressure, (right) Stochastic coarse-scale x-direction flux.
4.17 Mean contours of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.18 Streamlines drawn on the mean stochastic coarse-scale velocity.
4.19 Standard deviation of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.20 Probability distribution functions of the coarse-scale flux (x-direction) for two points in the domain.
4.21 The refinement of the adaptive sparse grid. Each row of points represents the refinement at a particular level of interpolation. The interpolation starts at the bottom.
4.22 Probability distribution functions of the coarse-scale flux (x-direction) for two points in the domain.
4.23 (left and center) Two realizations of the coarse-scale flux (x-direction) for a 1% perturbation in the x-location of the permeability barrier, (right) Difference between the two solutions.
4.24 Schematic of the domain. The permeability is uncertain in part of the domain.
4.25 The fine-scale log permeability distribution in the domain. The axes are scaled (1 unit = 50 ft).
4.26 The eigen-spectrum of the data set. The first 6 eigen-modes represent most of the variation that the input data set exhibits.
4.27 Convergence of the stochastic solution (mean and standard deviation) with increasing number of collocation points. (left) Stochastic coarse-scale pressure, (right) Stochastic coarse-scale x-direction flux.
4.28 Mean contours of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.29 Standard deviation of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.30 Probability distribution functions of the coarse-scale pressure (left) and flux (x-direction) (right) for a point upstream of the uncertainty.
4.31 Probability distribution functions of the coarse-scale pressure (left) and flux (x-direction) (right) for a point in the uncertain domain.
4.32 Plot of the length functional of the MST of the graph G for various sample sizes. The intrinsic dimension is related to the slope of the graph.
4.33 Mean contours of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.34 Standard deviation of the stochastic coarse-scale solution. (left) Coarse-scale pressure, (right) Coarse-scale x-direction flux.
4.35 Probability distribution functions of the coarse-scale pressure (left) and flux (x-direction) (right) for a point upstream of the uncertainty.
4.36 Probability distribution functions of the coarse-scale pressure (left) and flux (x-direction) (right) for a point in the uncertain domain.
CHAPTER 1
INTRODUCTION
1.1 Introduction
With the rapid advances in computational power and easier access to high-
are composed of) are affected to a significant extent by the inherent (usually un-
resolvable) uncertainties in material properties, fluctuations in operating condi-
tions, and system characteristics. The scale at which these phenomena occur can
range from the micro-scale (MEMS devices), macro-scale (devices/components
effects of these input uncertainties into the system and understand how they
propagate and alter the final solution.
Another class of problems that illustrates the necessity of incorporating the
effects of uncertainty (in the lower scales) arises in the design and analysis of
components made of polycrystalline, functionally graded and other heterogeneous
materials. Most engineering devices involve the use of such multi-component
neous media. Nevertheless, there has been no dedicated effort in the engineer-
ing community for incorporating the effect of such microstructural uncertainty
in process or device design, even though it is well known that most performance-related
failures of critical components are directly related to rare events (tails
zle flaps, gears, etc.) that are composed of polycrystals and/or functionally
graded materials, hydrodynamic transport through porous media and chemical
point operations (e.g. filtration, packed columns, fluidization, sedimentation
and distillation operations). Other sources of uncertainty include er-
tal studies implies that these input parameters have some inherent uncertain-
ties. The uncertainties in the physical model may also be due to inaccuracies
or indeterminacies of the initial conditions. This may be due to variation in
experimental data or due to variability in the operating environment. Uncer-
tainties may also creep in due to issues in describing the physical system. This
may be caused by errors in geometry, roughness, and variable boundary condi-
tions. Uncertainties could also occur due to the mathematical representation of
the physical system. Examples include errors due to approximate mathematical
Ingredients for stochastic analysis: There are two key ingredients for con-
structing a framework to analyze the effects of various sources of uncertainty in
a system.
ations and thermal perturbations, or more generally as noise. We denote
these as the input uncertainties.
ingful conclusions from the resulting solutions. For instance, trying to ac-
count for the variability in room temperature (in a simple heat transfer in
a room problem) using an input model that allows the room temperature
to vary from −100 °C to +100 °C is a valid albeit probably useless model.
works that have been developed in recent years for incorporating the effects of
uncertainties in various systems. The presence of uncertainties can be modelled
in the system through reformulation of the governing equations as stochastic
partial differential equations (SPDEs). Solution techniques for SPDEs can be
other major attraction of these methods is that their convergence rates do not
depend on the number of independent random variables. Further, they are very
easy to implement. The only requirement for computing uncertainties using the
statistical approach is the availability of a working deterministic code. However,
the statistical approach quickly becomes intractable for complex problems
in multiple random dimensions. This is because the number of realizations re-
quired to acquire good statistics is usually quite large. Furthermore, the number
of realizations changes with the variance of the input parameters and the trun-
cation errors in this inherently statistical method are hard to estimate. This has
in part been alleviated by improved sampling techniques like Latin hypercube
has been some recent work in reformulating the problem as a set of moment
differential equations (MDEs). The closure equations for these MDEs are then
derived from Taylor expansions of the input uncertainty as well as perturba-
tion and asymptotic expansions. The random domain decomposition method
[33, 34, 35, 36] coupled with the MDE method has been shown to be success-
ful in solving complex fluid flow problems in porous media having large input
variances.
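The Latin hypercube idea mentioned above can be sketched in a few lines (a generic illustration with an assumed toy response function, not the samplers used in this work): each dimension is partitioned into equal-probability bins and exactly one sample is drawn per bin.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng=None):
    """Stratified samples on [0, 1]^d: each dimension is split into
    n_samples equal bins and exactly one sample falls in each bin."""
    rng = np.random.default_rng(rng)
    # one uniform draw inside each of the n_samples bins, per dimension
    u = (np.arange(n_samples)[:, None]
         + rng.random((n_samples, n_dims))) / n_samples
    # decorrelate the dimensions by shuffling the bin order independently
    for j in range(n_dims):
        rng.shuffle(u[:, j])
    return u

# propagate the samples through a toy deterministic "solver"
samples = latin_hypercube(100, 2, rng=0)
response = np.sin(np.pi * samples[:, 0]) * samples[:, 1]
print(response.mean(), response.std())
```

Compared with plain Monte Carlo, the same number of deterministic solves covers each marginal distribution far more evenly.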
with space and time and a field variable is expanded along the uncertain di-
mension using suitable expansions. The basic idea is to project the dependent
variables of the model onto a stochastic space spanned by a set of complete
orthogonal polynomials. These polynomials are functions of a set of random
variables ξ(θ) where θ is a realization of the random event space. Ghanem and
Spanos first used SSFEM in linear elastic problems in [7]. Ghanem and co-
tion problems [13]. In the original work by Wiener, Gaussian random variables
were used with Hermite polynomials. The scheme has been extended to in-
clude other random distributions [37, 38, 39, 40, 41, 42]. Error bounds, and
convergence studies [43, 44, 45, 46, 47] have shown that these methods exhibit
Like the statistical methods, the SSFEM approach also reduces the problem
to a set of deterministic equations. But unlike the statistical methods, the re-
sulting deterministic equations are often coupled. This coupled nature of the
the number of expansion terms. This problem is especially acute when dealing
with realistic problems in fluid mechanics. The number of random dimensions
needed to represent surface fluctuations (roughness) or permeability variations
is relatively high (N ≥ 8). To solve these problems in high-dimensional spaces
and to allow non-smooth variations of the solution in the random space, there
have been recent efforts to couple the fast convergence of the Galerkin methods
with the decoupled nature of Monte-Carlo sampling [48]. Babuska et al. [15]
proposed a methodology wherein the Galerkin approximation is used to model
the physical space while a collocation scheme is used to sample the random
space. A tensor product rule was used to interpolate the variables in stochas-
tic space using products of one-dimensional interpolation functions (double or-
thogonal polynomials) based on Gauss points. Though this scheme effectively
decoupled the deterministic system of equations, the number of realizations re-
different fields [49, 50, 52, 53, 54]. Using this method, interpolation schemes can
be constructed with orders of magnitude reduction in the number of sampled
points to give the same level of approximation (up to a logarithmic factor) as
interpolation on a uniform grid.
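The savings can be made concrete by counting nodes. The sketch below (a generic illustration, not this thesis's implementation) counts the points of an isotropic Smolyak sparse grid built from nested one-dimensional rules with 1, 3, 5, 9, 17, ... points per level, and compares against the full tensor product of the same rule:

```python
from itertools import product

def new_points_1d(level):
    # nested 1D rule with 1, 3, 5, 9, 17, ... total points per level,
    # so the number of points *added* at each level is 1, 2, 2, 4, 8, ...
    if level == 0:
        return 1
    if level == 1:
        return 2
    return 2 ** (level - 1)

def sparse_grid_size(dim, level):
    """Point count of an isotropic Smolyak grid: sum, over all multi-indices
    i with |i| <= level, of the product of new 1D points per direction."""
    total = 0
    for idx in product(range(level + 1), repeat=dim):
        if sum(idx) <= level:
            n = 1
            for l in idx:
                n *= new_points_1d(l)
            total += n
    return total

def tensor_grid_size(dim, level):
    return (2 ** level + 1) ** dim  # full tensor product of the same 1D rule

for d in (2, 5, 10):
    print(d, sparse_grid_size(d, 3), tensor_grid_size(d, 3))
```

In ten dimensions at this level the tensor grid already needs 11^10 ≈ 2.6 × 10^10 points, while the sparse grid needs fewer than two thousand.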
In the standard sparse grid collocation approach, all the dimensions are
treated equally. In many problems usually encountered, not all the dimensions
are important. That is, the solution varies much more smoothly in some par-
ticular dimension than in others. This brings up the possibility of reduction in
the computational effort required to solve the stochastic problem by weighting
function evaluations.
In Chapter 2, an adaptive extension to the sparse grid collocation method
is developed that can automatically detect the important dimensions and bias
the sparse sampling towards those dimensions. This results in substantial sav-
ings in the computational effort required to solve large dimensional problems.
the problem assuming that the property and topological variations are random
fields satisfying the experimental correlations. To perform any such analysis,
one must first construct models of these variations to be used as inputs in the
subsequent uncertainty analysis. The analysis of the effect of such uncertainties
dimensional, continuous mapping) that encodes and quantifies the variation of
material topology and properties in a mathematically rigorous way.
There have been very few previous investigations into developing stochas-
tic input representations. The recent work in [19] looks at developing proba-
bilistic models of random coefficients in SPDEs using a maximum likelihood
all of these techniques for constructing input models are based on the concept
of transforming experimental data and statistics into probability distributions
of the property. These techniques are usually highly application specific, re-
quire expert knowledge in assigning probabilities and invariably require some
absence of the requirement of any expert knowledge. In addition, the general-
ity of the mathematical developments results in a framework that can construct
input models for any property variability, seamlessly meshing with any recon-
struction algorithm and software that produce plausible data sets.
bed data variations into a low-dimensional manifold that can serve as the in-
put model for subsequent analysis. The significant features of this framework
are as follows: A completely general methodology to generate viable, realistic,
reduced-order stochastic input models based on experimental data is proposed.
To the best of the author's knowledge, this is the first time that such a broad
framework for generating models for any property variation has been developed. We
provide a rigorous mathematical basis for the proposed framework. Further-
more, error estimates and strict convergence estimates of the reduced-order
model are provided. The basic model reduction ideas developed are not just
limited to generation of viable stochastic input models of property variations.
This framework has direct applicability to problems where working in high-
dimensional spaces is computationally intractable, for instance, in visualization
Complete response evaluation involving full (fine-scale) spatial and temporal
resolution simulations of multiscale systems is extremely expensive. Com-
putational techniques have been developed that solve for a coarse-scale so-
lution by defining an appropriate coarse-scale problem that captures the ef-
fect of the fine-scales [94]. The more popular techniques developed for such
upscaling fall under the category of multiscale methods viz. the variational
multiscale (VMS) method (operator upscaling) [95, 96, 97, 98, 99, 100], the
heterogeneous multiscale method [101, 102] and the multiscale finite element
method [103, 104, 105, 106, 107]. Further related techniques are presented
in [108, 109]. These computationally scalable techniques have resulted in the
development of black box simulators that have been used with significant suc-
cess in the solution of large scale transport problems in complex geometries.
The multiscale analysis of such systems inherently assumes that the com-
The basic idea for solving the stochastic multiscale set of equations is to extend
deterministic multiscale methods to their stochastic analogues. Spectral
strategies to pose and solve stochastic multiscale problems have been investi-
gated by Xu [112] and Asokan and Zabaras [42]. Collocation based strategies
have recently been developed in [20]. The key is to define appropriate ways to
link the fine-scale (subgrid) stochastic variation with the coarse-scale stochastic
In the final part of the thesis, we utilize the computational strategies
developed to analyze flow through random heterogeneous media given limited
statistical information about the multiscale permeability variation. We
link stochastic analysis and multiscale methods to investigate this problem.
For the sake of clarity, let us denote the variables of interest as θ, u, p. Denote
by D, a region in Rnsd , the domain of interest. nsd = 1, 2, 3 is the number of
space dimensions over which D is defined. Denote by T ≡ [0, tmax] the time
domain of interest. Denote the boundary of D as ∂D. Denote the set of governing
(deterministic) differential equations defining the evolution of θ as
constitutive relations that the physical system may require. For instance in the
context of thermal problems, {q, α} could represent the heat flux and the thermal
conductivity of the system while in the context of elastic deformation problems,
{q, α} could represent the displacement boundary conditions and the elastic ten-
Suppose, without loss of generality, that the input quantity α is in fact un-
certain. Assume also, for the sake of clarity, that α is a scalar. Denote by Ωα the
space of possible values that α can take. Now, Ωα can be a set of discrete points
{α1 , . . . , αN } or it can be a continuous range of values α ∈ [a, b]. To rigorously
define this notion of variability, the notion of a σ-algebra is introduced. A σ-
to restrict our attention to a smaller, and usually more useful, collection of sub-
sets of a given set. The main use of σ-algebras is in the definition of measures on
Ωα . This is one of the cornerstones of probability theory (the interested reader
is referred to [17] for a gentle introduction to σ-algebras in probability theory).
The measure that we are usually interested in is the probability measure, P, i.e.
the probability of a particular subset in the σ-algebra occurring. That is the
of α(ωα). Note that α(ωα) is still an abstract quantity that lies in an abstract
probability space (Ωα, F, P). In Chapter 3, we will look at different ways of
representing this random process for straightforward utility in a computational
framework. This probabilistic definition of α(ωα ) necessitates a redefinition of
the governing equation. This can be done naturally as follows: define the evo-
lution of (u, p, θ) as
Now, the dependent variable θ depends not only on (x, t) but also on the space
Ωα. The solution of the above set of equations results in functions u : D × T ×
Ωα → R^nsd, p : D × T × Ωα → R, and θ : D × T × Ωα → R.
CHAPTER 2
SOLVING STOCHASTIC PARTIAL DIFFERENTIAL EQUATIONS: A
SPARSE GRID COLLOCATION APPROACH
defined over an abstract probability space (Ωα, F, P). The solution methodology
for the above set of coupled differential equations is to first reduce
the complexity of the problem by reducing the probability space to a
finite dimensional space. In some cases the random field α(x, ω) can be repre-
sented/described by a finite length random vector [ξ 1 , . . . , ξ N ] : Ω → RN . In
other cases the random field can have a spatial correlation or variation. There is
rich literature on techniques to extract/fit correlations for these random fields
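As one concrete instance of such a construction, the sketch below builds a truncated Karhunen-Loève expansion of a one-dimensional field with an assumed exponential covariance kernel (an illustrative toy; the kernel, correlation length and truncation level are arbitrary choices, not those used later in this thesis):

```python
import numpy as np

# discretize an assumed covariance C(x, y) = exp(-|x - y| / b) on [0, 1]
n, b, N_terms = 200, 0.5, 10
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / b)

# eigen-decomposition of the (symmetric) covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]                 # decreasing eigenvalues
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# truncated KL expansion: alpha(x, w) ~ sum_i sqrt(lambda_i) xi_i(w) phi_i(x)
rng = np.random.default_rng(0)
xi = rng.standard_normal(N_terms)                 # independent N(0, 1) variables
field = eigvecs[:, :N_terms] @ (np.sqrt(eigvals[:N_terms]) * xi)

# fraction of total variance captured by the retained modes
captured = eigvals[:N_terms].sum() / eigvals.sum()
print(field.shape, round(captured, 3))
```

Each draw of ξ = (ξ1, . . . , ξN) yields one realization of the field, which is exactly the finite-length random vector representation referred to above.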
Thus we can consider the following:
It is usually assumed that ξ i (ω) are independent random variables with prob-
that the set of Eqs. (2.1) can now be written as a set of differential equations in
(N + nsd) dimensions, where N is the dimensionality of the truncated random
space Γ and nsd is the dimensionality of the physical space D:
B(u, p, θ : x, t, ξ) = 0. (2.3)
In this approach, the dependent variables (u, p, θ) are regarded as random pro-
cesses. These random fields are then expanded into a truncated series. One
could use the Karhunen-Loève expansion to represent these fields, but unfortu-
nately one does not a priori know the correlation functions for these variables.
This is written as

κ(x, t, ω) = ∑_{i=0}^{P} κ_i(x, t) Φ_i(ξ(ω)),    (2.4)
where κ is any of the dependent variables. Here κi are the deterministic coef-
the input random field. For example, if Gaussian random variables are cho-
sen then the Askey based orthogonal polynomials {Φi } are chosen to be Hermite
polynomials, if ξ i are chosen to be uniform random variables, then {Φi } must
be Legendre polynomials (for a complete description of the GPCE scheme, the
P + 1 = (N + p)! / (N! p!).    (2.5)
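A quick check of this count (a generic illustration, not code from this work): the expression (N + p)!/(N! p!) is the binomial coefficient C(N + p, p), and it shows how fast the basis grows with the number of random dimensions N at fixed order p.

```python
from math import comb

def num_pc_terms(N, p):
    """Number of terms P + 1 = (N + p)! / (N! p!) in a generalized
    polynomial chaos expansion with N random dimensions and order p."""
    return comb(N + p, p)

print([num_pc_terms(N, 3) for N in (2, 4, 8, 16)])  # [10, 35, 165, 969]
```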
The GPCE is used to expand all the dependent variables in terms of the orthog-
onal polynomials. Substituting these expansions into the governing equation,
where ⟨a, b⟩ is the inner product of the functions a and b over the ensemble
(⟨a, b⟩ = ∫_Ω a b dx). By using the orthogonality of the polynomial basis, we can
obtain a set of (P + 1) coupled equations for each random mode ui , pi , θi . By
utilizing the polynomial chaos expansion followed by the Galerkin projection,
the randomness has been transferred from the dependent variables to the basis
polynomials. The resulting governing equations for the expansion coefficients
are deterministic.
This methodology has been very successful in solving SPDEs [7, 11, 12, 13,
37, 39, 41, 42, 46]. We will restrict our discussion to the disadvantages of this
method so as to motivate the next few sections. The coupled nature of the
expansion coefficients makes the implementation of the GPCE method non-
data processing methodologies are necessary for the efficient solution of higher-
dimensional problems. The major constraint for the utility of this methodology
is the coupled nature of the resulting equations. This resulted in the search for a
method that combined the fast convergence and error estimates of the spectral
Galerkin method with the decoupled nature of the more statistical methods like
the Monte-Carlo methods.
In the spectral Galerkin method described in the previous section, the spatial
domain is approximated using a finite element discretization and the stochastic
representation of the stochastic space causes the coupled nature of the resulting
equations. Another approach is to have a finite element approximation for the
spatial domain and approximate the multidimensional stochastic space using
interpolating functions. The interpolating functions are mutually orthogonal
and the resulting equations are decoupled. This approach is called the colloca-
tion approach, where one computes the deterministic solution at various points
in the stochastic space and then builds an interpolated function that best ap-
proximates the required solution. We now detail the issues involved in con-
structing the stochastic solution using the collocation method. We follow the
spirit of the discussion in Xiu et al. [16] to give an intuitive development of the
method.
for the dependent variables using their values at particular points in the stochas-
tic space. The Galerkin projection is then applied to find that interpolation func-
tion which minimizes the projected error of the approximated system. Denote
by ξ any point in the random space Γ ⊂ R^N, and by Π_N the space of all N-variate polynomials. One can then construct the Lagrange interpolation polynomials as follows:
I f(ξ) = Σ_{k=1}^{M} f(ξ_k) L_k(ξ),  (2.8)
using the nodes {ΘN }, the value of the function at any point ξ ∈ Γ is approxi-
mately I f (ξ). The Lagrange interpolated values of (u, p, θ), denoted by (û, p̂, θ̂)
are as follows:
(û(ξ), p̂(ξ), θ̂(ξ)) = (Iu(ξ), Ip(ξ), Iθ(ξ)) = Σ_{k=1}^{M} (u(ξ_k), p(ξ_k), θ(ξ_k)) L_k(ξ).  (2.9)
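For concreteness, the construction in Eq. (2.8) can be sketched in a few lines; the `solver` below is a stand-in for the deterministic finite element code, an assumption made purely for illustration.

```python
import numpy as np

def lagrange_basis(nodes, k, x):
    """Evaluate the k-th Lagrange polynomial L_k at x."""
    L = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            L *= (x - xj) / (nodes[k] - xj)
    return L

def interpolate(nodes, values, x):
    """I f(x) = sum_k f(x_k) L_k(x), as in Eq. (2.8)."""
    return sum(v * lagrange_basis(nodes, k, x) for k, v in enumerate(values))

# stand-in for one deterministic solve per collocation point
solver = lambda x: np.exp(-x ** 2)

nodes = np.cos(np.pi * np.arange(5) / 4)   # Chebyshev extrema on [-1, 1]
values = [solver(x) for x in nodes]
print(interpolate(nodes, values, 0.3))     # close to exp(-0.09)
```

Note that at a collocation node the interpolant reproduces the solver output exactly, which is the defining property of the Lagrange basis.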
Remark 1. Relation to the spectral Galerkin method: Babuska and coworkers [44,
45] realized that by choosing appropriate basis functions, the so-called double
orthogonal functions, the spectral stochastic Galerkin formulation can also be
made decoupled. This is specific to the case when the random input is a linear function of the input random variables. They also showed that in case the input
uncertainties are multi-linear combinations of the random variables {ξ i (ω)} (as
would happen if one uses the K-L expansion to truncate the input random fields
to a finite set of random variables), the solution obtained from the collocation
procedure coincides with the stochastic Galerkin solution [15]. They argue that
the collocation method can be seen as a pseudo-spectral Galerkin method.
However, uniform convergence (‖f − I_n(f)‖_∞ → 0 as n → ∞) is not guaranteed
for any arbitrary distribution of the nodes. Utilizing a Taylor series expansion,
the interpolation error for a function f ∈ Cn+1 [a, b] at a point x is given by [54]
f(x) − I_n(f)(x) = (f^{(n+1)}(ξ) / (n+1)!) Π_{j=0}^{n} (x − x_j).  (2.15)
The magnitude of the (n + 1)-th derivative could outweigh the product term involving the nodes. However, there exist particular node distributions for which uniform convergence with the number of nodes can be proved whenever f ∈ C^1[a, b].
The interpolation error can be bounded in terms of the best approximating polynomial p*_n through the Lebesgue constant, ‖f − I_n(f)‖_∞ ≤ (1 + Λ_n(X)) ‖f − p*_n‖_∞, where Λ_n(X) = max_{x∈[a,b]} Σ_{i=0}^{n} |L_i(x)| and X is any distribution of the nodes.
It is apparent that Λn (X) does not depend on the function f but in fact only
depends on the distribution of the points X. Therefore it is possible to construct
a priori, sets of nodes X with a small Lebesgue constant. One such type of node
distribution is the interpolation based at the Chebyshev extrema [54].
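The Lebesgue constant is straightforward to estimate numerically; the sketch below (an illustration added here, not part of the original implementation) compares equidistant nodes with the Chebyshev extrema and shows how much smaller Λ_n(X) is for the latter.

```python
import numpy as np

def lebesgue_constant(nodes, n_eval=2001):
    """Estimate Lambda_n(X) = max_x sum_i |L_i(x)| on [-1, 1]."""
    xs = np.linspace(-1.0, 1.0, n_eval)
    total = np.zeros_like(xs)
    for k in range(len(nodes)):
        L = np.ones_like(xs)
        for j, xj in enumerate(nodes):
            if j != k:
                L *= (xs - xj) / (nodes[k] - xj)
        total += np.abs(L)
    return total.max()

n = 10
equi = np.linspace(-1.0, 1.0, n + 1)             # equidistant nodes
cheb = np.cos(np.pi * np.arange(n + 1) / n)      # Chebyshev extrema
print(lebesgue_constant(equi))   # grows exponentially with n
print(lebesgue_constant(cheb))   # grows only logarithmically with n
```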
Once the type of support nodes {X} has been chosen and the Lebesgue con-
stant for the node distribution computed, it is straightforward to arrive at error bounds for the polynomial interpolation function I_X. For instance, the er-
ror bound for the Chebyshev node based interpolation function of the function
f ∈ C^k behaves as O(n^{−k} log n) [54], [55].
In multiple dimensions, the simplest extension is the full-tensor product interpolation

I_N f(ξ) = (I_{i_1} ⊗ · · · ⊗ I_{i_N})(f)(ξ) = Σ_{j_1=1}^{m_1} · · · Σ_{j_N=1}^{m_N} f(ξ_{j_1}^{i_1}, . . . , ξ_{j_N}^{i_N}) (L_{j_1}^{i_1} ⊗ · · · ⊗ L_{j_N}^{i_N}),

where I_{i_k} are the interpolation functions in the i_k direction and ξ_{j_k}^{i_k} is the j_k-th point in the k-th coordinate. Clearly, the above formula needs m_1 × . . . × m_N function
evaluations, at points sampled on a regular grid. In the simplest case of using
only two points in each dimension, the total number of points required for a full-
tensor product interpolation is M = 2^N. This number grows very quickly as the
number of dimensions is increased. Thus one has to look at intelligent ways of
sampling points in the regular grid described by the full-tensor product formula.
2.4 Sparse grid collocation methods
For the univariate case, Gauss points and Chebyshev points have the least interpola-
tion error (for polynomial approximations). In the case of multivariate interpo-
lation, one feasible methodology that has been used is to construct interpolants based on the Smolyak algorithm.
Let us denote the set of points used to interpolate the one-dimensional function
by Θ(k) . For instance, Θ(3) could represent the Chebyshev points that interpolate
a third-order polynomial. The Smolyak algorithm constructs the sparse inter-
polant A_{q,N} (where N is the number of stochastic dimensions and q − N is the order of interpolation) from tensor products of one dimensional functions of order i_k, with the constraint that the sum total (|i| = i_1 + . . . + i_N) across all dimensions is between q − N + 1 and q. The structure of
the algorithm becomes clearer when one considers the incremental interpolant,
∆i given by [49, 52, 53, 54]
U^0 = 0,  Δ^i = U^i − U^{i−1}.  (2.22)

The Smolyak algorithm then builds the sparse interpolant as

A_{q,N}(f) = Σ_{q−N+1 ≤ |i| ≤ q} (Δ^{i_1} ⊗ · · · ⊗ Δ^{i_N})(f).  (2.23)

To compute the interpolant A_{q,N}(f) from scratch, one needs to compute the func-
tion at the nodes covered by the sparse grid H_{q,N}:

H_{q,N} = ∪_{q−N+1 ≤ |i| ≤ q} (Θ^{(i_1)} × . . . × Θ^{(i_N)}).  (2.24)
The construction of the algorithm allows one to utilize all the previous re-
sults generated to improve the interpolation (this is immediately obvious from
Eq. (2.23)). By choosing appropriate points (like the Chebyshev and Gauss-
Lobatto points) for interpolating the one-dimensional function, one can ensure
that the sets of points Θ(i) are nested (Θ(i) ⊂ Θ(i+1) ). To extend the interpolation
from level i to i + 1, one only has to evaluate the function at the grid points that
are unique to Θ^{(i+1)}, that is, at Θ_Δ^i = Θ^i \ Θ^{i−1}. Thus, to go from an order q − 1 interpolation to an order q interpolation in N dimensions, one only needs to evaluate
the function at the differential nodes ∆Hq,N given by
ΔH_{q,N} = ∪_{|i|=q} (Θ_Δ^{i_1} ⊗ . . . ⊗ Θ_Δ^{i_N}).  (2.25)
where q = N + k and k is called the level of the Smolyak construction. The interpolation error using the Smolyak algorithm to construct multidimensional functions (using the piecewise multilinear basis) is [53, 54]

‖f − A_{q,N}(f)‖_∞ = O(M^{−2} |log_2 M|^{3(N−1)}),

where M is the total number of support nodes.
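The nested point hierarchy and the differential node sets Θ^i_Δ can be generated directly; a short sketch assuming the usual Clenshaw-Curtis level-to-node rule (m_1 = 1, m_i = 2^{i−1} + 1):

```python
import numpy as np

def cc_nodes(level):
    """Nested Clenshaw-Curtis nodes: m_1 = 1, m_i = 2**(i-1) + 1 points."""
    if level == 1:
        return np.array([0.0])
    m = 2 ** (level - 1) + 1
    return np.cos(np.pi * np.arange(m) / (m - 1))

def differential_nodes(level):
    """The nodes unique to this level, i.e. Theta^i minus Theta^{i-1}."""
    current = set(np.round(cc_nodes(level), 12))
    if level == 1:
        return sorted(current)
    return sorted(current - set(np.round(cc_nodes(level - 1), 12)))

for i in range(1, 5):
    # levels 1..4 hold 1, 3, 5, 9 nodes; only 1, 2, 2, 4 of them are new
    print(i, len(cc_nodes(i)), len(differential_nodes(i)))
```

Because the sets are nested, extending the interpolation from one level to the next only requires solver runs at the differential nodes, exactly as in Eq. (2.25).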
2.4.3 Solution strategy
(u_k^h(x), p_k^h(x), θ_k^h(x)) = (u^h(x, ξ_k), p^h(x, ξ_k), θ^h(x, ξ_k)),  (2.28)
such that
B(u^h(ξ_i), p^h(ξ_i), θ^h(ξ_i); ξ_i, x, t) = 0,  i = 1, . . . , M.  (2.29)
The total error ε has contributions from the discretization error in physical space, ε_D, and the interpolation error in stochastic space, ε_I. This error can be written as ε ≤ (C_1 ε_D^2 + C_2 ε_I^2)^{1/2}, where C_1
and C2 are constants independent of the discretizations in physical space and
stochastic space.
Figure 2.1: Collocation nodes for a two-dimensional problem in a level
4 full (left) and sparse grid (right). Both the grids are con-
structed from the same level of one-dimensional interpolation
functions. The sparse grid offers marginally worse interpo-
lation properties with huge improvements in the number of
points sampled.
In this subsection, we illustrate the sparse grid interpolation method with a few
examples which will also serve as a motivation for the development of the adap-
tive sparse grid methodology.
Grid sizes
Fig. 2.1 shows the two-dimensional interpolation nodes for the sparse grid
along with the two-dimensional full tensor product grid based on the same one-dimensional interpolation functions.

Figure 2.2: Comparison of the exact (left) and interpolant (right) functions.

Building the interpolant requires function evaluations at a finite number of discrete points Θ. These function evaluations are usually finite element simulations of the deterministic problem at each stochastic point Θ_i. To illustrate the methodology, we will instead assume that certain analytical functions provide the values at the nodal points. Consider the function f = e^{−x² − ̺y²}. The maximum absolute error max |f − A_{q,2}(f)| is 3 × 10^{−5} when considering 3329 points. This corre-
sponds to building the sparse interpolant using one-dimensional interpolating
functions up to order 9. To achieve a similar interpolation error using the full-
tensor product grid would require 263169 points.
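These point counts can be reproduced by summing the sizes of the differential node sets over the Smolyak index simplex (a counting sketch that assumes the nested Clenshaw-Curtis rule m_1 = 1, m_i = 2^{i−1} + 1):

```python
from itertools import product

def delta_sizes(max_level):
    """|Theta^i_Delta| for nested Clenshaw-Curtis points: 1, 2, 2, 4, 8, ..."""
    m = [0, 1] + [2 ** (i - 1) + 1 for i in range(2, max_level + 1)]
    return [m[i] - m[i - 1] for i in range(1, max_level + 1)]

def sparse_grid_size(dim, level):
    """Number of nodes in the sparse grid H_{q,N} with q = N + level."""
    a = delta_sizes(level + 1)
    total = 0
    for idx in product(range(1, level + 2), repeat=dim):
        if sum(idx) <= dim + level:
            prod = 1
            for i in idx:
                prod *= a[i - 1]
            total += prod
    return total

sizes = [sparse_grid_size(2, k) for k in range(10)]
print(sizes)              # [1, 5, 13, 29, 65, 145, 321, 705, 1537, 3329]
print((2 ** 9 + 1) ** 2)  # the corresponding full tensor grid: 263169
```

The last entry recovers the 3329 sparse grid points quoted above, while the full tensor grid at the same one-dimensional resolution requires 513² = 263169 points.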
The interpolation error of the sparse grid is plotted in Fig. 2.3 for ̺ varying from 1 to 10000. Notice that as the
anisotropy increases, the variation of the function along one direction becomes
Figure 2.3: Interpolation error versus the number of support nodes for ̺ = 1, 10, 100, 1000 and 10000.
Figure 2.4: Slice of the function in the x plane. The arrow shows the direc-
tion of increasing ̺.
progressively steeper (see Fig. 2.4). Thus, more support nodes are required to
accurately interpolate the function. To reach an accuracy of 2 × 10^{−2}, the number of points required grows through 65, 145, 321, 705, 1537 and 3329 as ̺ increases from 1 to 10000.
The critical point to note here is that even though only one dimension of
the function is very steep (as ̺ increases), the sparse grid increases the node
allocation in all directions. This suggests that one can further reduce the number
of function evaluations if we can detect the directions in which the function
varies more smoothly than the others.
A second motivating case arises when analyzing mode jumps and equilibrium shifts that occur due to small variations
in the random variables. One extreme case is when there is a sharp (finite) dis-
continuity in the solution across a particular random variable. We are interested
in the performance of the sparse grid collocation method in capturing this dis-
continuity.
Figure 2.5: Interpolation error versus the number of support nodes for the discontinuous function.
Consider the function f = e^{−x² + 2 sign(y)} in the domain [−1, 1] × [−1, 1]. Here sign is the signum function: 1 when y is positive and −1 otherwise. The function f
has a sharp discontinuity along the y = 0 plane. On the other hand, the function
is very smooth in the x direction. That is, the function f (x, c), for a constant c
is an exponentially decaying function. The sparse grid method is used to build
the interpolation for this function. Fig. 2.5 plots the error as a function of the
number of support nodes used by the algorithm. Notice that the error reduction
is sub-linear during the first few interpolation levels. This is because of the error incurred near the discontinuity.
Figure 2.6: Evolution of the interpolated function (top row) and the corresponding support nodes (bottom row).
Fig. 2.6 plots the ‘evolution’ of the interpolation function as the number of
support nodes increases. The first row shows a plot of the interpolated function
while the second row shows the corresponding support nodes required to build it.
The two examples discussed above illustrate the possibility of further re-
ducing the number of needed function evaluations for cases in which (i) the function varies more smoothly along some dimensions than along others, and (ii) the function exhibits sharp local variations or discontinuities.
The choice of support nodes now depends implicitly on the function that one
is interpolating. Therefore, one must decide which nodes to sample on-the-fly
for minimizing the number of function evaluations. This naturally leads to the
idea of adaptive sparse grid collocation, which is discussed in the next section.
2.5 Adaptive sparse grid collocation methods

As stated earlier, in the standard sparse grid approach, all the stochastic dimen-
sions are treated equally. By treated equally, we mean that the number of grid
points in each direction is equal. In most physical problems that one deals with, however, the solution varies differently along the different dimensions. The limitation of the standard sparse grid method is apparent by revisiting Eq. (2.23). The multi-index
|i| = i1 + . . . + iN that determines the number of sampling points in each dimen-
sion ik is sampled from the simplex |i| ≤ q. This ensures that if (n1 , n2 , . . . , nN ) is a
valid index, then (n_2, n_1, . . . , n_N) is also a valid index, and so on. One simple approach to weighting the different dimensions is to consider a weighted simplex a · i ≤ q (this simple extension is in fact not feasible [51]). Another approach is to allow any general index set
satisfying certain conditions. The problem essentially boils down to choosing
index sets from the set of conventional sparse grid points. One can heuristically define the admissibility criterion as follows: an index set S is admissible if, for every index k ∈ S with k_j > 1, the index k − e_j also belongs to S for j = 1, . . . , N, where e_j is the j-th unit vector.
every index k all indices which have smaller entries than k. The generalized
sparse grid includes both the conventional sparse grids as well as the classical
product formulae within it. This criterion still ensures the telescopic property of the differential interpolant Δ^i = U^i − U^{i−1}. This means that one can construct
a better interpolant starting from a coarse interpolant by just sampling on the
unique nodes of the finer interpolant, see Eq. (2.23). Denote the space of all
admissible indices by F . The generalized sparse grid construction is then [51]

A_S(f) = Σ_{i∈S} (Δ^{i_1} ⊗ · · · ⊗ Δ^{i_N})(f),  (2.32)
for an admissible set S ∈ F . By carefully constructing the index sets, it is pos-
sible to construct polynomials at least as good as the case of the conventional
sparse grids.
New indices are added such that two criteria are satisfied: the resulting index sets all remain admissible, and a large reduction in the interpolation error is achieved.
estimator has to be defined that quantifies the error reduction. Once a suitable
error indicator is determined, the corresponding error indicators are computed.
One starts the interpolation construction with the smallest index 1 and adds
the indices (2, . . . , 1), . . . , (1, . . . , 2). One then obtains an initial set of error indica-
tors. The index set is partitioned into two sets, the active indices A and the old
indices O [51]. Newly added indices are denoted as active indices. The error in-
dicators of the forward neighbors of active indices have not yet been computed.
All other indices are denoted as old indices.
At each step, the forward neighborhood of the index set with the largest error
indicator is considered, and the corresponding error indicators computed. The
forward neighborhood of an index i are the N indices {i + e j | j = 1, . . . , N}. These
new sets are added to the active index sets. Then the index set with the largest
error indicator is considered, and so on. The error indicator is chosen to be the deviation of the computed value at a particular point from its expected value (from the interpolation polynomial built at the previous level). This is just the differential interpolation operator Δ^i(f) (see Eq. (2.23)), i.e. it is the difference between the computed value of the function at an active index and the interpolated value of the function using all the old indices.
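The active/old bookkeeping just described can be sketched on index sets alone; here the true error indicator (the norm of the differential interpolant at the new index) is replaced by a hypothetical indicator that decays much faster in the second dimension, mimicking a function that is smoother along that direction.

```python
def forward_neighbors(idx):
    """The N indices {i + e_j}."""
    for j in range(len(idx)):
        yield idx[:j] + (idx[j] + 1,) + idx[j + 1:]

def backward_neighbors(idx):
    """The indices {i - e_j} whose entries stay >= 1."""
    for j in range(len(idx)):
        if idx[j] > 1:
            yield idx[:j] + (idx[j] - 1,) + idx[j + 1:]

def adaptive_index_sets(indicator, n_steps, dim):
    """Dimension-adaptive selection with active (A) and old (O) index sets."""
    start = (1,) * dim
    active = {start: indicator(start)}
    old = set()
    for _ in range(n_steps):
        # move the active index with the largest error indicator to O
        idx = max(active, key=active.get)
        del active[idx]
        old.add(idx)
        # its admissible forward neighbors become new active indices
        for nb in forward_neighbors(idx):
            if all(b in old for b in backward_neighbors(nb)):
                active.setdefault(nb, indicator(nb))
    return old, active

# hypothetical indicator: errors decay much faster in dimension 2
decay = (0.5, 0.05)
indicator = lambda i: decay[0] ** (i[0] - 1) * decay[1] ** (i[1] - 1)

old, active = adaptive_index_sets(indicator, n_steps=8, dim=2)
print(sorted(old))   # refinement concentrates in the first dimension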
In this way, the algorithm adaptively samples the regions of the conven-
tional sparse grid support region where the error is maximized. The following
subsection illustrates the effectiveness of this methodology with some sample
examples.
Figure 2.7: The adaptive grid used for ρ = 1 (left) and ρ = 1000 (right). No-
tice the increased node points in the y direction for the second
case.
2.5.3 Numerical illustrations
We revisit the first problem in the sparse grid collocation section. Briefly, consider the function f = e^{−x² − ̺y²}, where ̺ is a positive number. This function is analytically smooth. For ̺ = 1, the function is isotropically smooth. The com-
putational domain is [−1, 1] × [−1, 1]. We look at the problem for increasing ̺.
As the anisotropy increases the adaptive procedure must sample more points
in that direction where the error is maximum (the y direction in this particular
case). Fig. 2.7 plots the adaptive grids used for the two particular cases of ̺.
Fig. 2.8 plots the error reduction with the number of support nodes. Compare
this with the case of using the conventional sparse grids (Fig. 2.3). The number of nodes required to achieve the same interpolation error of 5 × 10^{−2} for the case of ̺ = 1000 is 577 using the adaptive sparse grid method as compared to 1537
using the conventional sparse grid method. As the anisotropy of the problem
increases, the adaptive sparse grid method selectively improves the function in
the y direction, until the same level of accuracy is reached in both directions.
We revisit the second problem discussed in the previous section. Briefly, consider the function f = e^{−x² + 2 sign(y)} in the domain [−1, 1] × [−1, 1], where sign is the signum function. The function f has a sharp discontinuity along the y = 0
plane. Fig. 2.9 shows the adaptive sparse grid used to compute the interpola-
tion function for this discontinuous function. Notice the large bias towards the y
direction, which is to be expected from an adaptive algorithm. Fig. 2.9 also plots
Figure 2.8: Error versus the number of nodes using the adaptive sparse
grid method for interpolating anisotropic functions. Compare
with the error plot using the conventional sparse grid method
shown in Fig. 2.3.
Figure 2.9: Left: Adaptive mesh for the discontinuous problem, Right: In-
terpolated discontinuous functions.
the interpolated function. The error after using just 553 nodal points is 1 × 10−1 .
To achieve the same level of error using the conventional sparse grid method
(see Fig. 2.5) 3300 points have to be sampled.
The conventional sparse grid collocation method is well suited only to problems where every stochastic dimension is equally important. In most physical
problems encountered, the solutions exhibit some level of dissimilarity along
various directions or have some structure that can be easily exploited using the
adaptive sparse grid collocation method. Furthermore, there is very little over-
head in computing the error indicator function (since the error indicators are
just the incremental interpolation function at the new indices (Eq. (2.22)) [54]).

2.6 Implementation
The solution procedure can be broadly divided into three distinct operations: generating the collocation nodes, solving the deterministic problem at each node, and post-processing the results.
The concept of the collocation method lies in effectively decoupling the physical
domain computation from the computations in the stochastic dimensions. The
deterministic codes used in the present work took as input the sparse grid coordinates. For example, consider the case of two stochastic dimensions (ξ_1, ξ_2),
that determine the input random process ω. The range of (ξ1 , ξ2) is assumed to
be [0, 1] × [0, 1] without loss of generality. The deterministic executable must
be able to take in the different sampled two-tuples (ξ_1^i, ξ_2^i) and output logically named result files.
In the present work, we use a simple MATLAB wrapper program that first ini-
tializes the stochastic dimensions and constructs the sparse grid coordinates.
In the case of the conventional sparse grid collocation method, all the sampled
nodes are known a priori. Hence in this case, the deterministic code is run and
the appropriately named result files are available. The wrapper program then
reads the input data and constructs the interpolation function. For the rapid
construction of the interpolation functions, we utilize the sparse grid interpo-
lation toolbox developed by Klimke [57, 58]. In the case of the adaptive sparse
grid collocation method, the wrapper program builds the input files for the de-
terministic code, calls the executable of the deterministic program and reads in
the corresponding result at each step of the process.
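The wrapper logic can be sketched as follows (a Python analogue of the MATLAB wrapper; the `solver` function and the level-1 node list below are placeholders for the actual executable and sparse grid, not the codes used in this work):

```python
import numpy as np

def solver(xi):
    """Stand-in for the deterministic finite element code: in the actual
    workflow this would write an input file for the node xi, invoke the
    executable, and read back a logically named result file."""
    xi1, xi2 = xi
    return np.exp(-xi1 ** 2 - xi2 ** 2)

def run_collocation(nodes):
    """Each solve is independent of the others (embarrassingly parallel),
    so this loop can be distributed across processors."""
    return {tuple(xi): solver(xi) for xi in nodes}

# level-1 sparse grid nodes in 2D on [0, 1]^2 (hard-coded for illustration)
nodes = [(0.5, 0.5), (0.0, 0.5), (1.0, 0.5), (0.5, 0.0), (0.5, 1.0)]
results = run_collocation(nodes)
print(results[(0.5, 0.5)])   # exp(-0.5)
```

The wrapper then hands the collected nodal values to the interpolation toolbox to build the stochastic solution.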
2.6.3 Post-processing
In the case of the conventional sparse grid collocation, the integration is very straightforward because the weights corre-
sponding to the known nodal positions are computed a priori. A data set con-
taining the weights for each nodal position, for different dimensions and differ-
ent interpolation depths was first created. Then the cubature over the stochastic space reduces to a weighted summation over the nodal values. In the case of the adaptive sparse grid collocation method (once a particular error tolerance has been reached), the interpolation functions are
numerically integrated to get the corresponding weights associated with each
nodal point. This results in significantly larger post processing time for higher-
dimensional integrations.
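Precomputing a nodal weight amounts to integrating the corresponding Lagrange polynomial once; a one-dimensional sketch using numpy's polynomial utilities (the toolbox itself tabulates these weights):

```python
import numpy as np

def cubature_weights(nodes):
    """w_k = integral over [-1, 1] of L_k(x); the cubature then reduces
    to the weighted sum  integral of f ~ sum_k w_k f(xi_k)."""
    n = len(nodes)
    weights = []
    for k in range(n):
        # Lagrange polynomial L_k: 1 at node k, 0 at the other nodes
        coeffs = np.polyfit(nodes, np.eye(n)[k], n - 1)
        antider = np.polyint(coeffs)
        weights.append(np.polyval(antider, 1.0) - np.polyval(antider, -1.0))
    return np.array(weights)

nodes = np.cos(np.pi * np.arange(9) / 8)   # level-4 Clenshaw-Curtis nodes
w = cubature_weights(nodes)
print(w.sum())          # integrates constants exactly: 2
print(w @ nodes ** 2)   # integral of x^2 over [-1, 1]: 2/3
```

Moments then reduce to weighted sums of the stored nodal solutions, up to normalization by the measure of the stochastic domain.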
The Clenshaw-Curtis type grid has been shown to outperform other types of grids in numerical tests [53, 54]. In all the numerical examples, the global error computed is the maximum of the absolute deviation of the interpolated value from the actual value at the newly computed sparse grid points (sup_i |Δ^i|). That is,
at each level, the function is evaluated at some points. The error is defined as
the difference between the interpolated value (using the interpolation function
at the previous level) and the actual function value at these points. The global
error reported is the maximum of these errors.
The main focus of this work is to investigate the effects of input uncertain-
ties in natural convection problems. Consider a d-dimensional bounded domain
D ⊂ R^d with a boundary ∂D_d ∪ ∂D_n. Dirichlet boundary conditions are applied on ∂D_d, while Neumann boundary conditions are applied on ∂D_n. Natural convection sets in due to the thermal differential maintained across ∂D_d. Consider
also a complete probability space (Ω, F , P), where Ω is the event space, F ⊂ 2^Ω
the σ-algebra, and P : F → [0, 1] is the probability measure. In the problems
that we consider, the input uncertainties occur due to random boundary conditions and random boundary topology. Under these conditions, the problem is to find stochastic functions for the velocity, pressure and temperature that satisfy
∇ · u = 0,  (2.33)
∂u/∂t + u · ∇u = −∇p + Pr ∇²u + F(u, θ),  (2.34)
∂θ/∂t + u · ∇θ = ∇²θ,  (2.35)

where F(u, θ) is a forcing term, which would usually be the Boussinesq-approximated buoyancy force term −Ra Pr θ e_g, where Ra is the thermal Rayleigh number and e_g is the gravity vector.
2.7.1 Natural convection with random boundary temperature

In the first example, the temperature at different points on the right hand side boundary is correlated. This is physically
feasible through, say, a resistance heater, where the mean temperature remains
constant, but material variations cause local fluctuations from this mean temper-
ature. We assume this correlation to be a simple exponential correlation func-
tion, C(y1 , y2 ) = exp(−c · |y1 − y2 |). Here, c is the inverse of the correlation length
that is taken as 1. For ease of analysis, we consider a non-dimensionalized form
of the governing equations. The Prandtl number is set to 1.0 and the thermal
Rayleigh number is set to 5000. The upper and lower boundaries are thermally insulated. The random boundary temperature is represented through a truncated Karhunen-Loève expansion of the correlation function, where the ξ_i are normally distributed (N(0, 1)). The eigenvalues of the correlation
function decay very rapidly. The two largest eigenvalues contribute to 95% of
Figure 2.10: Schematic of the domain, with boundary temperatures T(y) = f(y, ω) and T(y) = 0.5 on the side walls.
the energy of the spectrum. To showcase the sparse grid collocation method, three cases are considered, in which the Karhunen-Loève expansion is truncated at 2, 4 and 8 terms, respectively.
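The eigen-decay of the exponential correlation kernel can be checked numerically by discretizing the covariance operator (a Nyström-type sketch on a unit interval with c = 1; the grid size and domain are illustrative assumptions):

```python
import numpy as np

def kl_eigenvalues(c=1.0, n=400):
    """Discretize C(y1, y2) = exp(-c |y1 - y2|) on [0, 1] and return the
    eigenvalues of the covariance operator, largest first."""
    y = (np.arange(n) + 0.5) / n          # midpoint grid
    C = np.exp(-c * np.abs(y[:, None] - y[None, :]))
    # scale by the cell width so eigenvalues approximate the integral operator
    lam = np.linalg.eigvalsh(C / n)
    return np.sort(lam)[::-1]

lam = kl_eigenvalues()
print(lam[:4])                      # rapidly decaying spectrum
print(lam[:2].sum() / lam.sum())    # fraction of energy in the first 2 modes
```

Since C(y, y) = 1, the eigenvalues sum to the domain length, so the printed fraction directly measures how much of the spectrum the truncated expansion retains.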
Fig. 2.11 plots the variation in the error with increasing number of support
points for both the conventional isotropic sparse grid and the adaptive sparse
grid methods. In the case of the isotropic sparse grid method, the level of in-
terpolation is increased linearly for the three problems of dimensions 2, 4 and
8, respectively. For the two-dimensional case, the adaptive sparse grid colloca-
tion (ASGC) is marginally better than the conventional sparse grid collocation
(CSGC) technique. The same level of accuracy is achieved in 137 function eval-
Figure 2.11: Error reduction with increased number of support nodes for
2, 4 (top figures) and 8 (bottom figure) dimensions.
uations as compared to the 321 required in the CSGC method. As the number of
stochastic dimensions is increased, the ASGC method provides orders of mag-
nitude reduction in the number of functional evaluations. For the same mag-
nitude of error (∼ 4 × 10−2 ), in 4 dimensions the ASGC required 751 function
evaluations compared to the 7537 that the CSGC required (see Fig. 2.11). This
is illustrated clearly in the case of 8 stochastic dimensions, where a 65-fold reduction in the number of function evaluations is seen going from the CSGC to the ASGC method (15713 to 241). What has to be noted is that this dramatic re-
duction in the number of functional evaluations is mainly due to the structure
of the solution. While the adaptive strategy neglects the last four dimensions
and quickly converges to an accurate solution, the isotropic sparse grid method
Figure 2.12: Mean Temperature and u velocity contours from different so-
lution strategies. Top row: A level 6 collocation method, Mid-
dle row: Monte-Carlo sampling over 65000 points, Bottom
row: Second-order GPCE expansion.
takes many more functional evaluations to do the same because it treats all
dimensions equally. As the number of dimensions increases (i.e., as more terms in the KL expansion are taken into account), the isotropic method treats all these dimensions equally, even though the higher dimensions have a negligible effect on the stochastic solution. On the other hand, the adaptive method takes this structure into account and preferentially samples the dimensions with the larger eigenvalues.
In the case of the conventional sparse grid collocation method for the prob-
lem truncated to two-dimensions, a level 6 interpolation scheme was used in
the sparse grid interpolation method. This resulted in solving the determin-
istic problem at 321 different realizations. In the case of the adaptive sparse
grid collocation method, 137 realizations were required. The sparse grid points
correspond to the Clenshaw-Curtis cubature points in two-dimensions. The
solution procedure is embarrassingly parallel; the deterministic solutions at all the sampled points in the stochastic space were generated concurrently. A second order polynomial chaos expansion of the above
problem was also performed. The mean and standard deviation values of the
three methodologies match very closely as seen in Fig. 2.12 and Fig. 2.13, respec-
tively. The total computational time for the GPCE problem was 235 minutes.
Table 1 shows representative times for the natural convection problem solved
using GPCE and collocation based methods.

Figure 2.13: Standard deviation of the temperature and u velocity from the different solution strategies.

Figure 2.14: Probability distribution functions of the temperature, pressure, u velocity and v velocity at a selected spatial point.

The dimension 2 and 4 problems
were solved using a second-order polynomial chaos representation, while the
problem in 8 dimensions was solved using a first-order polynomial chaos representation.
Higher-order moments and PDFs
The solution statistics at a chosen spatial point were computed by sampling the interpolant randomly 500000 times from the 2 dimensional stochastic space; the corresponding PDF was computed from its histogram distribution. The probability
distribution functions for the temperature, pressure and velocities at this spatial
point are plotted in Fig. 2.14. The applied random temperature boundary conditions result in some realizations having temperatures (along the right wall)
both higher and lower than the left wall. This leads to instances where there is
a complete reversal in the direction of circulation. Furthermore, as seen from
Fig. 2.12, most of the flow in the horizontal direction is concentrated in two regions.
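PDFs of this kind are obtained by sampling the inexpensive interpolant rather than the flow solver; a sketch with a stand-in interpolant (the tanh profile is a hypothetical surrogate chosen only to mimic the bimodal behavior seen here):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for the constructed interpolant of, say, temperature at a point
interpolant = lambda xi: np.tanh(3.0 * xi[:, 0]) + 0.1 * xi[:, 1]

# sample the 2D stochastic space uniformly and histogram the output
xi = rng.uniform(-1.0, 1.0, size=(500000, 2))
samples = interpolant(xi)
pdf, edges = np.histogram(samples, bins=100, density=True)

# with density=True the histogram integrates to one
print((pdf * np.diff(edges)).sum())
```

Each of the 500000 samples costs only an interpolant evaluation, which is what makes Monte-Carlo style post-processing of the collocation solution affordable.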
2.7.2 Natural convection with random boundary topology
Surface topology is usually characterized in terms of waviness and roughness. Waviness refers to the large scale fluctuations in the surface, while roughness refers to the small scale perturbations of the surface. Roughness is
described by two components: the probability of the point being z above/below
the datum (PDF), and the correlation between two points on the surface (ACF).
The type of surface treatment that produced the surface can be directly inferred from the auto-correlation function (ACF). For example, if the surface has been shot-peened, its ACF would have a larger correlation distance than if the surface
had been milled. Usually the PDF of the position of a particular point is well approximated by a Gaussian distribution. The work of Tartakovsky and Xiu [61] deals with this problem in a spectral stochastic framework.
In the present problem, the correlation length is set to 0.10. The mean roughness (absolute average deviation from a smooth surface) is taken to be 1/100th of the characteristic length (L = 1.0). The first
eight eigenvalues are considered to completely represent the surface randomness. The problem is solved in an eight dimensional stochastic space. The bottom
wall is maintained at a higher temperature of 0.5. The top wall is maintained at
a lower temperature of −0.5. The side walls are thermally insulated. The Prandtl
number is set to 6.4 (corresponding to water) and the thermal Rayleigh number
is set to 5000. No slip boundary conditions are enforced on all four walls.
The rough boundary is represented through the truncated Karhunen-Loève expansion of the ACF, where λ_i and f_i(x) are the eigenvalues and eigenfunctions of the ACF. A reference domain
([−1.0, 1.0] × [−0.5, 0.5]) and grid (100 × 50 uniform quadrilateral elements) are
considered and the nodes on the bottom boundary are transformed according
to the above formula to get each realization of the random boundary. The Jaco-
bian of the transformation from the reference grid to the new grid can be easily
computed (it is just the ratio of the elemental areas before and after the transformation).
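Generating one boundary realization and deforming the reference grid can be sketched as follows (the cosine modes, eigenvalue decay and linear blending are placeholders, not the actual ACF eigenpairs or the mesh motion used in this work):

```python
import numpy as np

def rough_boundary(x, xi, scale=0.01):
    """Bottom-wall perturbation via a truncated KL-type sum; the cosine
    modes and the 0.5**i decay are assumed for illustration only."""
    y = np.zeros_like(x)
    for i, z in enumerate(xi):
        y += scale * np.sqrt(0.5 ** i) * np.cos((i + 1) * np.pi * x) * z
    return y

def deform_mesh(xg, yg, xi):
    """Shift mesh nodes vertically, blending the boundary perturbation
    linearly to zero at the top wall (y in [-0.5, 0.5])."""
    dy = rough_boundary(xg, xi)
    return yg + dy * (0.5 - yg)

x = np.linspace(-1.0, 1.0, 101)
xi = np.array([1.2, -0.7, 0.3, 0.5, -0.1, 0.9, 0.0, -0.4])  # one realization
print(np.abs(rough_boundary(x, xi)).max())  # small perturbation, O(0.01)
```

Each collocation point ξ yields one such boundary realization and hence one deformed grid for the deterministic solve.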
Figure 2.16: Error reduction using the adaptive sparse grid methodology.
The adaptive sparse grid collocation (ASGC) method required 820 realizations
to construct the interpolation function. 8 nodes on the Velocity-3 cluster at the
Cornell Theory Center took 500 minutes for simulating the problem at all the collocation points. Fig. 2.16 plots the reduction in the error with the number of support nodes.
Fig. 2.17 shows some sample realizations of the temperature and velocity
contours. The first row of figures is the case when the roughness is minimal. The
temperature and velocity contours closely resemble those of Rayleigh-Bénard
Figure 2.17: Sample realizations of the temperature and velocity contours for different degrees of surface roughness.
convection with smooth boundaries. The next 4 rows of figures show extremal
values of the surface roughness. The thermal plume changes position according
to the local surface geometry. Also note that the flow field has been substantially altered.
Figure 2.18: Mean values of the variables: Top left: Temperature, Top
right: u velocity component, Bottom left: v velocity compo-
nent, Bottom right: pressure.
Fig. 2.18 plots the mean values of the temperature, velocity and pressure. The
rough lower surface causes the mean temperature to be much more homoge-
nized. The mean velocity is larger than the case with smooth lower surface.
In the case of smooth lower surfaces, the u velocity shows symmetry about the
horizontal centerline. Notice that this is lost when the lower surface is made
rough. The rougher surface enhances the circulation patterns in the lower half
of the system. Fig. 2.19 plots the mean velocity vectors and some streamline
contours. The top corners of the system consist of recirculating rolls which re-
sult in reduced heat transfer at the edges as can be seen in the mean temperature
contour plot.
Fig. 2.20 plots the second moment of the temperature, velocity and pressure
variables. The standard deviation of the velocity is of the order of the mean
velocity. This suggests that there are multiple modes at which the system exists
Figure 2.19: Mean velocity vectors and streamlines.
Figure 2.20: Standard deviation of the temperature, velocity components and pressure.
and the variation of the random variables across the stochastic space causes the
system to move from one stable state to another. If the state changes abruptly
as the random variable moves across a threshold value, a mode shift is said to
have occurred. The existence of multiple patterns in this region of parameter space is explored next.
Mode shifts and equilibrium jumps
One way to identify multiplicity of stable states is through visualizing the prob-
ability distribution functions of the dependent variables at a physically relevant
point. Small changes in the roughness cause the central thermal plume in the
case of smooth surface Rayleigh-Bénard instability to change positions. This can
cause reversal of the direction of the fluid flow in parts of the domain (compare
rows 2 and 3 in Fig. 2.17). We choose one such point, (0, 0.25) which has a large
deviation in the temperature to plot the PDFs. The PDFs were computed from
500000 Monte-Carlo realizations of the dependent variables. Fig. 2.21 plots the
corresponding distribution functions. Notice that there exist two peaks in the
distribution functions for the temperature, pressure and v velocity. This con-
firms the existence of two stable states. We assert that these shifts also occur
due to the non-linearities introduced by the surface roughness. To verify if this
is indeed the case, Fig. 2.22 shows the temperature and v velocity variations at a
specific point ([0.0, 0.25], chosen because of its large temperature and v velocity
second moment) plotted over the first two random variables. Notice the abrupt
jump in the temperature and v velocity as the random variables cross 0. Fig. 2.23
shows the projection of the variation onto a single dimension. The mode shift is
apparent in the abrupt change of the dependent variables as the random vari-
able crosses over zero.
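The construction of such PDFs from Monte-Carlo realizations can be sketched as follows. This is a minimal illustration, with a synthetic bimodal sample standing in for the actual realizations of the temperature at the chosen point; the sample values and peak-counting rule are illustrative, not taken from the simulations above.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical stand-in for Monte-Carlo realizations of the temperature at a
# point: a bimodal mixture mimicking two stable convection states.
samples = np.concatenate([rng.normal(-0.3, 0.05, 5000),
                          rng.normal(0.3, 0.05, 5000)])

# Estimate the PDF with a kernel density estimate.
kde = gaussian_kde(samples)
grid = np.linspace(-0.6, 0.6, 601)
pdf = kde(grid)

# Count local maxima of the estimated PDF: two peaks indicate two stable states.
interior = (pdf[1:-1] > pdf[:-2]) & (pdf[1:-1] > pdf[2:])
n_peaks = int(np.count_nonzero(interior))
print(n_peaks)  # 2 peaks -> a bimodal distribution
```

For the dissertation's 500000-realization data the same peak count would be applied to the KDE of the computed temperature, pressure and velocity samples.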
Figure 2.21: Probability distribution functions of the temperature, pressure, u velocity and v velocity at the point (0, 0.25).
Figure 2.23: Projection of the temperature and v velocity variations at the point (0, 0.25) onto a single stochastic dimension.
Rayleigh number of 1700 is chosen. This is just below the critical Rayleigh num-
ber of 1708. The other system and domain specifications are the same as the
previous problem. We study the problem in four stochastic dimensions.
The critical Rayleigh number of 1708 is computed for an infinitely long, flat layer of fluid. The variations in the lower surface, coupled with the fact that only a finite-sized domain is considered, cause the onset of convection in parts of the stochastic space. Fig. 2.24 shows two such extreme realizations of temperature at two different collocation points. Notice that the two patterns are almost mirror images of each other. The behavior of the mean temperature is, however, along expected lines.
along expected lines. Fig. 2.25 plots the mean and the standard deviation of the
temperature contours. The temperature evolution is predominantly conductive.
Figure 2.25: Mean (left) and standard deviation (right) of the temperature.
Figure 2.26: Projection of the temperature and v velocity variations at the point (0.0, 0.0) onto one stochastic dimension.
To further verify that no mode shift occurs as the stochastic phase space is
sampled, Fig. 2.26 plots the projection of the temperature and y velocity varia-
tions onto one dimension at a specific point ([0.0, 0.0]). The v velocity variation
is smooth with small gradients (compare with Fig. 2.23) while the temperature
variation is negligible.
Large dimensions
Figure 2.27: Experimental surface roughness correlation function (left) and the corresponding eigen-spectrum (right).
The previous examples considered a Rayleigh-Bénard type simulation in which, to represent the surface roughness, the ACF was taken to be an exponential function. In this problem, we consider a realistic ACF from experimental data. The surface roughness correlation function is given in Fig. 2.27 (taken from [72]), along with the corresponding eigen-spectrum. The first 20 eigenvalues account for more than 99% of the spectrum.
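Counting how many eigenvalues capture a given fraction of the spectrum can be sketched numerically. The snippet below uses an assumed exponential correlation function on [0, 1] as a stand-in for the experimental ACF of [72]; the grid size, correlation length `b` and the midpoint (Nyström) quadrature are illustrative choices.

```python
import numpy as np

# Assumed exponential correlation function R(x, y) = exp(-|x - y| / b) on [0, 1]
# (a stand-in for the experimental ACF; b is an illustrative correlation length).
n, b = 200, 0.2
x = (np.arange(n) + 0.5) / n          # midpoint quadrature nodes
w = 1.0 / n                           # uniform quadrature weight
R = np.exp(-np.abs(x[:, None] - x[None, :]) / b)

# Nystrom discretization of the covariance eigenvalue problem.
eigvals = np.linalg.eigvalsh(R * w)[::-1]       # eigenvalues, descending
frac = np.cumsum(eigvals) / eigvals.sum()       # captured energy fraction
k99 = int(np.searchsorted(frac, 0.99) + 1)
print(k99)  # number of eigenvalues capturing > 99% of the spectrum
```

The same count applied to the discretized experimental ACF gives the truncation level used in the simulations.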
The upper wall is smooth while the lower wall is rough. The mean roughness (the absolute average deviation from a smooth surface) is taken to be 1/200th of the characteristic length (L = 1.0). The first twenty eigenvalues are considered to represent the roughness, resulting in a 20-dimensional stochastic problem.
Figure 2.30: Mean temperature (left) and mean v velocity (right).
The stochastic simulation took about 1400 minutes on 48 nodes of the Velocity-3 cluster at the Cornell
Theory Center. Fig. 2.29 shows temperature contours at some of the collocation
points in 20 dimensional stochastic space. Fig. 2.30 shows the mean values of
the temperature and the v velocity. Fig. 2.31 plots the mean velocity vectors
and some streamlines. The standard deviations of the temperature and the v
velocity are shown in Fig. 2.32. Notice that the surface roughness causes thermal
plumes to form, which are seen in the variations in the temperature and the
v velocity. These plumes caused by the non-uniformities in the lower surface
result in improved heat transfer across the domain. This concept can be further
extended to the design of the ACF of the surface roughness to achieve enhanced
heat transfer characteristics. This will be showcased in one of our forthcoming
publications.
Figure 2.31: Mean velocity vectors and streamlines.
Figure 2.32: Standard deviation of the temperature (left) and v velocity (right).
Natural convection through a medium with deterministic variable porosity has been well studied [63]. But
in most cases of heterogeneous porous media, the porosity either varies across
multiple length scales or cannot be represented deterministically. In the follow-
ing example, we focus on the case when the porosity is described as a random
Figure 2.33: Schematic of the computational domain: no-slip (u = v = 0) walls, the left wall held at T = 1 and the right wall at T = 0, with a free fluid region in the interior surrounded by the porous medium.
field. The usual descriptor of the porosity is its correlation function and the
mean porosity.
A portion of the domain is taken to be free fluid. The rest of the domain is filled with a porous material. The porous
material is assumed to be Fontainebleau sandstone, whose experimental corre-
lation function is taken from [64]. The correlation function and a cross-sectional image of the sandstone are shown in Fig. 2.34.
The governing equation for the fluid flow in the variable porosity medium
is given by (see Nithiarasu et al. [63] and Samanta and Zabaras [65]):

∂v/∂t + v · ∇(v/ǫ) = −[Pr (1 − ǫ)²/(Da ǫ²)] v − [1.75 ‖v‖ (1 − ǫ)/((150 Da)^{1/2} ǫ²)] v + Pr ∇²v − ǫ∇p − ǫ Pr Ra T e_g,   (2.37)
where ǫ is the volume fraction of the fluid and Da is the Darcy number. Natural convection through this heterogeneous domain is then simulated.
Figure 2.34: Experimental correlation function of the Fontainebleau sandstone (left) and the corresponding eigen-spectrum (right).
The mean porosity of the sandstone is ǫ = 0.8 and the porosity of the free fluid in the interior of the domain
is 1.0.
Figure 2.36: Mean values of the variables. Top left: temperature; top right: u velocity component; bottom left: v velocity component; bottom right: pressure.
Figure 2.37: Mean velocity vectors and streamlines.
Figure 2.38: Standard deviations of the variables: temperature, u velocity, v velocity and pressure.
The mean porosity is 0.8. A level 4 interpolatory sparse grid with 3937 collocation points
was utilized for the stochastic simulation. A fractional time step method based
on the work of Nithiarasu et al. [63] is used in the simulation of the deterministic
problem. Fig. 2.36 plots the mean temperature, velocity and pressure contours.
Fig. 2.37 shows the mean velocity vectors and some streamlines. The standard
deviation of the dependent variables is shown in Fig. 2.38.
2.8 Conclusions
The generalized polynomial chaos approach and its derivatives have been ex-
tensively utilized to solve thermal and fluid flow problems involving random
inputs and boundary conditions. The GPCE is restrictive in the sense that it is intrusive, requiring substantial modifications to existing deterministic solvers; this is among the main drawbacks of the spectral stochastic methodology. We show that this leads naturally to the tensor product collocation method [15] and the sparse grid collocation method [16], along with an extension of the sparse grid collocation method to adaptive sampling. These methods make it straightforward to convert any deterministic code into a code that solves the correspond-
ing stochastic problem. We provide a detailed road-map, description of the nec-
essary changes in the deterministic code and the required sub-routines for pre-
and post-processing to easily accomplish this change.
We have attempted to showcase the utility of the sparse grid collocation
method to solve large stochastic dimensional heat transfer problems. We have
solved realistic natural convection problems in rough surfaces with stochastic
dimensions up to d = 20 as well as natural convection problems through hetero-
geneous porous media. We further looked at the shifts in equilibrium that occur
due to slight variations in the boundary conditions and successfully captured
these equilibrium jumps.
CHAPTER 3
DATA-DRIVEN APPROACHES TO CONSTRUCTING STOCHASTIC
INPUT MODELS
The first step towards solving Eq. (2.1) is some form of numerical representa-
tion of the input random processes. A finite dimensional representation of the
abstract probability space is a necessary prerequisite for developing any viable
framework to solve Eq. (2.1). In this chapter, we discuss various techniques that
can be utilized to represent the input uncertainties, with an emphasis on utiliz-
ing experimental data to construct these representations. Note that the input
parameters to Eq. (2.1) can either be scalars or functions.
Consider the case when the input parameter, α, is a scalar. The problem of inter-
est is now to represent the uncertainty of α in a numerically feasible way that
utilizes experimental data. Experimental data, in this case, refers to a set of realizations of the parameter. The parameter is then represented as a function of a random variable, α = α(ξα), where ξα is the random variable which represents the variability in the input parameter α. In case there are multiple (uncertain) scalar parameters {q, α}, each of these input parameters should be represented as an appropriate random variable {ξq , ξα}. With n = 2 scalar input parameters, the governing equation becomes a stochastic differential equation in two random variables.
In some cases, the input parameters are functions showing variability in space.
This could be, for instance, imposed boundary conditions or property variabil-
ity in heterogeneous media. Assume (for clarity), that q(x) is one such input pa-
rameter that is uncertain. One simple way of representing the uncertainty of this
spatially varying random process (also called a random field) is to consider q(x)
as a collection of scalar values at different spatial locations. That is, the function
q(x) is approximated as a collection of points q′ = {q(x1 ), q(x2 ), . . . , q(xk )}. Each
of these individual parameters q(xi ) shows some uncertainty and hence can be
represented as appropriate random variables. Note that the quality of this representation improves as the number of sampling locations k increases. For a finite value of k, the above procedure of approximating a random field as a set of independent random variables results in a nsd + 1 + k dimensional stochastic differential equation for the evolution of the dependent variable θ.
One problem with the above strategy is that the number of random variables (the stochastic dimensionality) quickly becomes very large and can pose significant computational difficulties. A more economical representation of a random field is provided by the Karhunen-Loève expansion.
Any second-order random process q(x, ω) ∈ L2 (Ω, F , P) can be expressed as a summation

q(x, ω) = Σ_{i=1}^∞ qi (x) ζi (ω),   (3.4)

where the functions qi (x) are obtained by projecting the random field onto the basis {ζi }. Eq. (3.4) involves an infinite summation and hence is
computationally intractable. The Karhunen-Loève expansion (KLE) considers
finite-dimensional subspaces of L2 (Ω, F , P) that best represent the uncertainty
in q(x, ω). The KLE for a stochastic process q(x, t, ω) is based on the spectral
decomposition of its covariance function Rhh (y1 , y2 ). Here, y1 and y2 denote the
spatio-temporal coordinates (x1 , t1) and (x2 , t2 ), respectively. By definition, the
covariance function is symmetric and positive definite and has real positive
eigenvalues. Further, all its eigenfunctions are orthogonal and span the space to which the process belongs. The KLE then takes the form

q(x, t, ω) = E(q(x, t, ω)) + Σ_{i=1}^∞ √λi fi (y) ξi (ω),   (3.5)

where E(q(x, t, ω)) denotes the mean of the process and {ξi (ω)}_{i=0}^∞ forms a set of uncorrelated random variables. The eigenvalues λi and eigenfunctions fi are the solution of the eigenvalue problem

∫_{D×T} Rhh (y1 , y2 ) fi (y2 ) dy2 = λi fi (y1 ).   (3.6)

The chief characteristic of the KLE is that the spatio-temporal randomness is decomposed into a set of deterministic functions multiplying uncorrelated random variables. If the eigenvalues decay rapidly, a truncated expansion represents most of the process with arbitrary accuracy. The expansion in Eq. (3.5) is typically truncated to a finite number of summation terms, k, yielding a finite-dimensional representation of the random field.
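A truncated KLE is straightforward to construct once the covariance is discretized on a grid. The sketch below assumes a squared-exponential covariance on a 1-D grid and Gaussian ξi; both are illustrative assumptions, since the development above holds for a general covariance Rhh.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed squared-exponential covariance on a 1-D grid (illustrative choice).
n, L = 128, 0.3
x = np.linspace(0.0, 1.0, n)
C = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / L**2)

lam, f = np.linalg.eigh(C)            # eigen-pairs, ascending eigenvalues
lam, f = lam[::-1], f[:, ::-1]        # reorder to descending

# Truncated KLE realization: q ~ mean + sum_{i<=k} sqrt(lam_i) f_i xi_i
# (zero mean here; xi_i assumed standard normal for illustration).
k = 10
xi = rng.standard_normal(k)
q = f[:, :k] @ (np.sqrt(np.clip(lam[:k], 0.0, None)) * xi)
print(q.shape)  # (128,)
```

Each draw of the k random variables ξi produces one realization of the random field, which is exactly how input samples are generated for the collocation solver.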
In many cases, one can only obtain limited experimental information about the
variation in a particular property or boundary condition that is essential to analyzing a particular system. This is particularly apparent when one is dealing with systems for which only limited statistical information is available (for instance, the structure at a few locations or some higher order correlation functions, Fig. 3.1).
leads to an analysis of the problem assuming that the properties are random
fields satisfying some experimentally determined statistics. In the context of a
computational framework to perform such analysis, one needs to construct a
Figure 3.1: Limited information available about polycrystals and multicomponent microstructures: the exact distribution of components is unknown, and the topology and properties vary from sample to sample.

By sampling over this approximate space, we are essentially sampling over all possible, plausible microstructures (or any other property variations).
A set of plausible samples (or microstructures) is first generated. These snapshots are either experimentally obtained or reconstructed using the experimental statistical information available. A model reduction scheme (based on Principal Component Analysis (PCA)) was formulated to represent this set of snapshots/property maps (which is a large-dimensional data set) in terms of a low-dimensional linear manifold, for instance of the grain boundary distribution [20]. Each point on the linear manifold shown corresponds to a microstructure with a unique grain distribution. By sampling over
this space, one essentially samples over the space of all allowable microstruc-
tures. The figure on the right shows a similar orientation distribution model.
Following this, in [21], a more flexible non-linear method to encode the mi-
crostructural variability was developed. This framework is based on concepts
borrowed from psychology [22, 23], cognitive sciences [24, 25], differential ge-
ometry [26, 27, 28] and graph theory [29, 30]. The key mathematical concept on which the framework is constructed is the so-called principle of 'manifold
learning’: Given a finite set of unordered data points in high dimensional space,
construct a low dimensional parametrization of this dataset. The finite set of unordered data points here is the set of plausible realizations of the property map. In [21], we show that this set of
unordered points (satisfying some experimental statistics) lies on a manifold
embedded in a high dimensional space. That is, all the possible variations of the quantity of interest (given a limited set of information about its variability) lie on a low-dimensional manifold embedded in a high-dimensional space.

We detail the non-linear model reduction strategy in the rest of the chapter using a model diffusion problem in a two-phase microstructure. We assume that each point in the domain is uniquely occupied by one of the two phases (0 or 1) and that each phase has a given diffusivity.
two phases in a domain, the thermal diffusivity distribution is given by simply
replacing the phase description (0 or 1) at each point on the domain by its cor-
responding thermal diffusivity (α0 or α1 ). From this simple scaling, for each
realization of the microstructure, we can compute the corresponding realization
of the diffusivity, α = k/(ρCp).
Let MS be the space of all microstructures that satisfy the statistical proper-
ties S . This is our event space. Every point in this space is equiprobable. Conse-
quently, we can define a σ-algebra G and a corresponding probability measure P on this space. The governing stochastic diffusion equation is

∂u(x, t, ω)/∂t = ∇ · [α(x, ω)∇u(x, t, ω)] + f (x, t),  x ∈ D, t ∈ [0, T f ], ω ∈ MS ,   (3.11)
The non-linear model reduction strategy developed below converts the random topology field into a finite set of uncorrelated random variables. Upon decomposition and characterization of the random inputs into d random variables, ξi (ω), i = 1, . . . , d, the solution to the SPDE Eq. (3.11) can be written as a function of these variables, and Eq. (3.11) is rewritten as follows:

∂u(x, t, ξ)/∂t = ∇ · [α(x, ξ)∇u(x, t, ξ)] + f (x, t),  x ∈ D, t ∈ [0, T f ], ξ ∈ Γ.   (3.13)
For the sake of brevity, we will denote the system of equations consisting of
Eq. (3.13) and appropriate initial and boundary conditions (here for simplicity
assumed to be deterministic) as B(u : x, t, ξ) = 0.
Statistics of the solution are then computed as expectations over Γ, weighted by ρ(ξ) : Γ → R, the joint probability distribution function for the set of random variables {ξ1 , . . . , ξd }. In the following sections, we describe the non-linear
model reduction framework for computing α(x, ξ). Details of the implementa-
tion of the Smolyak algorithm for Eq. (3.13) can be found in [20].
3.3 Non-linear model reduction: basic ideas
In our recent work [20], a linear model reduction strategy was developed to
convert experimental statistics into a plausible low-dimensional representation
of two-phase microstructures. The first step in that formulation was the con-
version of the statistical information into a set of plausible 3D realizations. This
feature of first converting the given experimental statistics into a data set of
plausible 3D microstructures is continued in the developments featured here.
This is motivated by the fact that there exists a variety of mature mathemati-
cal and numerical techniques that convert experimental data and statistics into plausible realizations. We refer the reader to the various techniques that have been developed [20, 74, 75, 76, 77] to convert experimental statistics into plausible 3D reconstructions of two-phase composite microstructures as well as polycrystalline materials.
3.3.1 Linear model reduction: where does it fail?
The PCA based linear model reduction technique that we had formulated [20]
has some drawbacks. The most critical of these is that any PCA based approach
can only identify the closest linear subspace to the actual space (which is possi-
bly non-linear) in which the data reside. This directly translates into the fact that the number of eigenvectors required to faithfully represent the data keeps growing as more samples are included (see Fig. 3.2).
Figure 3.2: Plot of the number of samples versus the number of eigenvec-
tors required to represent 80% of the ‘energy’ spectrum con-
tained in the samples.
The issue brought out in the discussion above can be understood in a more
intuitive way by looking at the simple surface shown in Fig. 3.3(a). The set of
points shown in Fig. 3.3(a) all lie on a spiral in 3D space. The global coordinates
of any point on the spiral are represented as a 3-tuple. Any PCA based model
tries to fit a linear surface such that the reconstruction error is minimized. This
is shown by the green plane which is a 2D representation of the data. This
clearly results in a bad representation of the original data. On the other hand,
knowledge of the geometry of the 3D curve results in representing the global position of a point by just two coordinates; a simple 2D representation naturally arises. Thus, while PCA based methods would require a 3D
representation to accurately describe the spiral shown in Fig. 3.3, a non-linear
model reduction method would identify the geometry of the curve, unravel it
and provide a simple 2D representation of the data.
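This failure of linear reduction can be reproduced numerically. The sketch below generates synthetic points on a 3-D spiral, whose intrinsic dimension is 2 (arc-length parameter and height), and measures the reconstruction error of the best 2-D linear (PCA) model; the data and the error threshold are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic points on a 3-D spiral: intrinsically 2-D (parameter t and height h).
t = rng.uniform(0, 4 * np.pi, 2000)
h = rng.uniform(0, 5, 2000)
X = np.column_stack([t * np.cos(t), t * np.sin(t), h])

# Best linear 2-D model via SVD of the centered data (i.e. PCA).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Reconstruct from the top-2 principal components and measure relative error.
X2 = (U[:, :2] * s[:2]) @ Vt[:2]
rel_err = np.linalg.norm(X2 - Xc) / np.linalg.norm(Xc)
print(rel_err > 0.05)  # True: a linear 2-D model loses a noticeable fraction
```

A non-linear method that recovers (t, h) would describe the same data exactly with two coordinates.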
The essential idea of nonlinear model reduction finds its roots in image com-
pression and related digital signal processing principles. Figure 3.4(a) shows
multiple images of the same object taken at different left-right and up-down
Figure 3.3: A set of points lying on a spiral in 3D space. The global coordi-
nates of any point on the spiral are represented as a 3-tuple. The
figure on the left (a) shows the reduced-order model resulting
from a linear PCA based reduction. The figure on the right (b)
depicts a non-linear strategy that results in an accurate repre-
sentation of the 3D spiral (data taken from [78]) that works by
“unravelling and smoothing” the 3D spiral into a 2D sheet.
The low-dimensional surface (which we refer to as the manifold) representing all possible pictures of this object is embedded in a high-dimensional pixel space.

Analogous to the problem defined above (using Fig. 3.4 (a)), we define a
problem based on the images in Fig. 3.4 (b). Fig. 3.4 (b) shows multiple mi-
crostructures that satisfy experimentally determined two-point correlation and
volume-fraction. Each microstructure is a 65 × 65 × 65 binary image. Each mi-
crostructure that satisfies the given experimental statistics is a point that lies
on a curve (manifold) embedded in 65 × 65 × 65 = 274625 dimensional space.
The problem of ‘manifold learning’ or parameter estimation as applied to this
Figure 3.4: Figure (a) on the left shows images of the same object
(from [78]) taken at various poses (left-right and up-down)
while Figure (b) on the right shows various two-phase mi-
crostructures that satisfy a specific volume-fraction and two-
point correlation.
situation is as follows: given the N sampled microstructures, construct a low-dimensional parametrization of the manifold on which they lie.

Classical methods in manifold learning include Principal Component Analysis (PCA), the Karhunen-Loève expansion (KLE) and multi-dimensional scaling (MDS) [79]; these, however, can only recover linear structure. A second class of methods instead preserves the local geometry of the data. They aim to map nearby points on the manifold
to nearby points in the low-dimensional representation. Such methods (Locally Linear Embedding (LLE) [22], Laplacian eigenmaps, Hessian eigenmaps) essentially construct a homeomorphic mapping between local sets in the manifold and an open ball in a low-dimensional space. The complete mapping is a union of these local maps. Global methods, in contrast, ensure that nearby points (in a
suitable metric) map to nearby points in the low-dimensional space and far-
away points map to faraway points in the low-dimensional space. Though both
approaches are viable, we focus our attention on global methods of non-linear dimension reduction. The global approach has been shown to provide a faithful representation of the overall geometry of the data.
The basic premise of the global methods (particularly the Isomap [23] algo-
rithm) is that ‘only geodesic distances reflect the true low-dimensional geome-
try of the manifold’. The geodesic distance (between two-points) on a manifold
can be intuitively understood to be the shortest distance between the two-points
along the manifold. Subsequent to the construction of the geodesic distance be-
tween the sample points (x) in the high-dimensional space, the global methods
construct the low-dimensional parametrization simply as a set of points (y) ly-
ing in a low-dimensional space that most accurately preserve the geodesic distances between the high-dimensional data points. The global methods thus transform high-dimensional data into low-dimensional points while preserving geodesic distances.
This section is divided into five parts. We first introduce some mathemati-
cal preliminaries and define an appropriate distance function D between two
points in MS . We then describe the construction of a transformation F from MS to a low-dimensional set: Section 3.4.2 develops the geodesic distance and its graph approximation, while Section 3.4.3 estimates the optimal dimensionality.
Section 3.4.4 details the application of the Isomap algorithm (along with the
estimate of the optimal dimensionality from Section 3.4.3) to construct the low-
dimensional parametrization of the input data. Finally Section 3.4.5 details a
non-parametric mapping that serves as the reduced-order, data-driven model of the microstructure space. Each microstructure is discretized into q × q × q pixels, so the dimensionality of the representation of the microstructure is n = q × q × q.

Definition 4.2: Let MS denote the set of microstructures {xi } satisfying a set of statistical correlations S . Higher-order correlations provide a more complete characterization of the material topology and property distribution than, say, the volume fraction or the two-point correlation function.
For clarity of presentation, we restrict our analysis to the set of two-phase microstructures satisfying a given volume fraction S 1 and two-point correlation S 2 ; this set is denoted MS2 .

Definition 4.4: The function D : MS2 × MS2 → [0, ∞) defines a distance between every pair x1 , x2 ∈ MS2 .
Remark 4.2: Notion of distance and equivalence between two microstructures: The
function D(x1 , x2 ) provides a notion of distance between two microstructures
x1 , x2 ∈ MS2 . Since x1 , x2 belong to MS2 , both have identical volume fraction, S 1
and two-point correlation S 2 . We naturally denote them as equivalent if they
have the same three-point correlation function, S 3 . Since we are dealing with statisti-
cally reconstructed microstructures, this definition of equivalence ensures that
two microstructures are identical if their higher-order topological characteriza-
tion is identical.
Remark 4.3: Any other mapping that satisfies the three properties of a met-
ric [82] can be used as a measure of equivalence and distance. Since we are
Lemma 4.2: The metric space (MS2 , D) is totally bounded.

Theorem 4.1: The metric space (MS2 , D) is compact.

Proof: Follows from Lemma 4.2 and Lemma 4.4 (see Theorem 45.1 in [26]).
3.4.2 Geodesic distances and graph approximations
Definition 4.5: Denote the intrinsic geodesic distance between points in MS2 by D M . D M is defined as

D M (x1 , x2 ) = inf_γ {length(γ)},   (3.18)

where γ varies over the set of smooth arcs connecting x1 and x2 . We wish to remind the reader that the lengths of the arcs in the equation above are defined using the distance metric D.
In practice, we only have access to a finite number of sample points lying in MS2 . Hence, the Definition 4.5 of the geodesic distance is not partic-
ularly useful in numerically computing the distance between two-points. An
approximation of the geodesic distance is required to proceed further. Such an
approximation is provided via the concept of graph distance. We subsequently
show that this approximation asymptotically matches the actual geodesic dis-
tance (Eq. (3.18)) as the number of samples, N, increases (see Theorem 4.2 be-
low).
The unknown geodesic distances in MS2 between the data points are ap-
proximated in terms of a graph distance with respect to a graph G constructed
on the data points. This neighborhood graph G is very simple to construct [23].
Two points share an edge on the graph if they are neighbors. The neighborhood of a point can be defined either as its k nearest neighbors or as all points within a distance ǫ of it [23].
For points close to each other, the geodesic distance is well approximated
by the distance metric D. This is because the curve can be locally approxi-
mated to be a linear patch, and the distance between two points on this patch
is the straight line distance between them. This straight line distance is given
by the distance metric D, which is just the edge length between the points on
the graph, G. On the other hand, for points positioned far away from each other, the graph distance (the length of the shortest path through the graph) approximates the intrinsic geodesic distance. The asymptotic convergence of the graph distance to
the geodesic distance is rigorously stated in Theorem 4.2. This theorem utilizes
some parameters for the quantification of the geometry of the manifold, partic-
ularly the minimal radius of curvature, ro and the minimal branch separation
so . For the sake of completeness, we state the theorem below. For the sake of
brevity, we leave out the definitions of these abstract parameters (the interested
reader is referred to [23, 26] for discussion of these terms).
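The graph approximation of the geodesic distance can be sketched in a few lines. The snippet below uses a k-nearest-neighbor rule and points on a synthetic helix; k and the data set are illustrative stand-ins for the microstructure samples and the metric D.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path

rng = np.random.default_rng(3)

# Synthetic samples on a 1-D curve (a helix) embedded in 3-D space.
t = np.sort(rng.uniform(0, 4 * np.pi, 400))
X = np.column_stack([np.cos(t), np.sin(t), 0.1 * t])

# Neighborhood graph: each point keeps edges to its k nearest neighbors,
# with edge weights equal to the pairwise distance D.
D = squareform(pdist(X))
k = 8
W = np.full_like(D, np.inf)                      # inf marks "no edge"
idx = np.argsort(D, axis=1)[:, 1:k + 1]          # k nearest neighbors (skip self)
rows = np.repeat(np.arange(len(X)), k)
W[rows, idx.ravel()] = D[rows, idx.ravel()]

# Graph distances = shortest paths through the neighborhood graph.
DG = shortest_path(W, directed=False)
print(np.isfinite(DG).all())  # True: the graph is connected
```

Short hops use the metric D directly, while long-range graph distances follow the curve, which is exactly the approximation property argued above.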
3.4.3 Estimating the dimensionality of the low-dimensional representation
We draw from the recent work in [27, 28], where the intrinsic dimension
of an embedded manifold is estimated using a novel geometrical probability
approach. This work is based on a powerful result in geometric probability, the Beardwood-Halton-Hammersley theorem [83], in which d is linked to the rate of growth of the length of a minimal spanning tree with the number of samples.

The sequel utilizes concepts from graph theory. We provide some of the
essential definitions below. For a detailed discussion of graphs, trees and related
constructs, the interested reader is referred to [29]. Consider a set of k points
(vertices) and a graph defined on this set. A graph consists of two types of
elements, namely vertices and edges. Every edge has two endpoints in the set of vertices.

• A tree is a graph in which any two vertices are connected by exactly one path.

• A spanning tree of a graph is a subgraph that contains all the vertices and whose edges form a tree.
Definition 4.7: The geodesic minimal spanning tree (GMST) is the minimal
spanning tree of the graph G. The length functional, L({x}) of the GMST is de-
fined as L({x}) = min_{T ∈ T N} Σ_{e∈T} |e|, where T N is the set of all spanning trees of the graph G and |e| are the edge weights. This is, in fact, simply the total weight of the minimal spanning tree.
The asymptotic behavior of the length functional is characterized by the following theorem:

Theorem 4.3 (Beardwood-Halton-Hammersley [83]): For samples drawn from a d-dimensional manifold, the length functional grows as L({x}) ∼ βm C N^{(d−1)/d} as N → ∞,   (3.19) where βm is a constant depending only on d and C depends on the sampling density.
Theorem 4.3, particularly the asymptotic limit given in Eq. (3.19) for the
length functional of the GMST provides a means of estimating the intrinsic di-
mension of the manifold. It is clearly seen that the rate of convergence of L({x})
is strongly dependent on d. Following [30], we use this strong rate dependence
to compute d. Defining lN = log(L({x})) gives the following approximation:

lN = a log(N) + b + ǫN ,   (3.20)

where

a = (d − 1)/d,   (3.21)
b = log(βm C),   (3.22)
and ǫN is an error residual that goes to zero as N → ∞ [27, 28, 30]. The intrinsic
dimensionality, d can be estimated by finding the length functional for different
number of samples N and subsequently finding the best fit for (a, b) in Eq. (3.20).
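The resulting estimation procedure can be sketched as follows. For illustration, Euclidean MST lengths are used on a flat 2-D sheet embedded in R^5, where geodesic and Euclidean distances coincide; the sample sizes and the helper name are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(4)

def mst_length(X):
    """Total edge weight of the Euclidean minimum spanning tree of X."""
    return minimum_spanning_tree(squareform(pdist(X))).sum()

# Fit l_N = a log N + b and invert a = (d - 1)/d to estimate d.
Ns = [100, 200, 400, 800, 1600]
lN = []
for N in Ns:
    U = rng.uniform(size=(N, 2))             # intrinsic 2-D coordinates
    X = np.hstack([U, np.zeros((N, 3))])     # embedded in 5-D space
    lN.append(np.log(mst_length(X)))

a, b = np.polyfit(np.log(Ns), lN, 1)
d_est = 1.0 / (1.0 - a)
print(round(d_est))  # 2: the estimated intrinsic dimension
```

For the microstructure data the same fit is applied with graph (geodesic) distances in place of Euclidean ones.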
Section 3.4.2 presented the construction of the geodesic distance between all
pairs of input points, while Section 3.4.3 provided an estimation of the dimen-
sionality, d, of the low-dimensional representation. Denote as M the pair-wise distance matrix (based on the geodesic distance), with elements di j = DG (xi , x j ). The remaining problem is the classical multi-dimensional scaling problem: find N points in d-dimensional space such that the coordinates of these N points yield a Euclidean distance matrix whose elements are identical to the elements of the given distance matrix M.
Define the matrix A with entries ai j = −d²i j /2 (this choice makes the double centering below recover scalar products from distances) and the N × N centering matrix [79, 84], H = I − (1/N)11′ , with elements hi j = δi j − 1/N. Define the N × d matrix of points Y = (y1 , . . . , yN )T . Assume without loss of generality that the centroid of the set of points is the origin (Σ_{i=1}^N yi = 0). Define the N × N matrix B whose components are the scalar products

bi j = Σ_{k=1}^d yik y jk = yTi · y j .   (3.24)

B is obtained from the distance matrix by double centering,

B = HAH.   (3.25)

Since B is symmetric and positive semi-definite, it admits the eigen-decomposition

B = ΓΛΓT ,   (3.26)

while, by construction,

B = YYT .   (3.27)
From Eq. (3.26) and Eq. (3.27), an estimate of Y in terms of the largest d eigenvectors of B follows: Y = Γd Λd^{1/2} , where Λd is the diagonal (d × d) matrix of the largest d eigenvalues and Γd the matrix of the corresponding eigenvectors. PCA based techniques truncate such expansions so as to represent some percentage of the eigen-spectrum, i.e. they choose the largest d eigenvalues that account for, say, 95% of the eigen-spectrum (Σ_{i=1}^d λi / Σ_{i=1}^N λi > 0.95).
i=1 i
The stochastic collocation procedure for the solution of SPDEs involves computing the dependent variables at a finite set of realizations of the input. The preceding steps take N sample microstructures in MS2 and (via the geodesic distances and the dimensionality estimate of Section 3.4.3) convert these points into a set of points {yi }, i = 1, . . . , N belonging to a convex set A. This convex region A ⊂ Rd defines the reduced representation of the space of microstructures, MS2 , satisfying the given statistical correlations S2 . A can be considered to be a surrogate space for MS2 . One can access the complete
variability of the microstructure space by sampling over A. In the collocation strategy for solving SPDEs, one constructs the statistics of the dependent variable by
sampling over a discrete set of microstructures. Since we propose to utilize A as
a reduced representation of MS2 , we sample over a discrete set of points ξ ∈ A
instead. But we have no knowledge of the image of a random point ξ ∈ A in
the microstructural space MS2 (we only know that the points {yi } for i = 1, . . . , N
map to the microstructures {xi } for i = 1, . . . , N). For a usable reduced-order model of
the microstructure space, an explicit mapping F −1 from A to MS2 has to be constructed. Non-parametric regression models can be trained using the sample points ({yi }, {xi }), i = 1, . . . , N to construct such a non-
parametric mapping F −1 . There have been recent reports of variants of the
Isomap algorithm that along with constructing the reduced-order representa-
tion of the samples also construct an explicit mapping between the two sets [85].
But there are two significant issues that have to be considered when one utilizes
such mapping strategies: 1) Most of these explicit mapping strategies are es-
sentially some form of interpolation rule that utilize the sample set of values
({y}, {x}). One has to make sure that the interpolated result, x (x = F −1 (ξ)) for
some arbitrary point ξ ∈ A actually belongs to MS2 . 2) Care must be taken to for-
mulate the explicit mapping in a way that results in a computationally simple
methodology of finding the images of points. This is very significant consid-
ering the fact that we will potentially deal with very large pixel-sized images of microstructures. In the following, we construct computationally simple mappings between the two spaces MS2 and A keeping in mind the issues raised above.
Method 1: Nearest-neighbor map

This is the simplest map (illustrated schematically in Fig. 3.5) that sets the image
of an arbitrary point in the region A to the image of the nearest sampled point.
The nearest point is the point that is the smallest Euclidean distance from the
given arbitrary point. This method is particularly useful when the sampling
density (α as defined in Theorem 4.5) is large, i.e. this method results in a reliable map for large N.
Figure 3.5: The figure above illustrates the simplest possible mapping be-
tween the low-dimensional region A and the high-dimensional
microstructural space MS2 . Given an arbitrary point ξ ∈ A,
find the point yk closest to ξ from the sampled points {yi },
i = 1, . . . , N. Assign the image value of yk i.e. xk to the image of
ξ.
An estimate of the error between the actual image and the mapped image is given by e = eA . This error can be made arbitrarily small by increasing N.
Method 2: Local linear interpolation
Utilizing the fact that the isometric embedding that generated the points yi from xi conserves distances, the image of ξ is given by the inverse-distance weighted average of the images x̂i of its k nearest neighbors ŷi (with li = ‖ξ − ŷi ‖):

x = ( Σ_{i=1}^k x̂i /li ) / ( Σ_{i=1}^k 1/li ).   (3.30)
That is, the image of ξ is the weighted sum of the images of the k-nearest neigh-
bors of ξ (where the nearest neighbors are taken from the N-sampled points).
Figure 3.6: The figure above illustrates a local linear (k-neighbor) interpo-
lation mapping between the low-dimensional region A and the
high-dimensional microstructural space MS2 .
The linear interpolation procedure is based on the principle that a small re-
gion in a highly-curved manifold can be well approximated as a linear patch.
This is in fact one of the central concepts that result in local strategies of non-
linear dimension reduction [22, 24] (see Section 3.3 for a discussion of global
versus local strategies of non-linear dimension reduction). This linear patch is
constructed using the k-nearest neighbors of a point ξ. As the sampling density
(the number of sample points, N) used to perform the nonlinear dimension re-
duction increases, the mean radius of the k-neighborhood of a point approaches
zero (lim_{N→∞} max_{i=1,...,k} ‖ξ − ŷi ‖₂ → 0), ensuring that the linear patch represents
the actual curved manifold arbitrarily well.
Figure 3.7: Simple estimate for the interpolation error in Method 2: Let r
denote the local radius of curvature of the function near the
point xexact (the filled square). We approximate the curve by a
linear patch, resulting in some error. This error is the distance
between the approximate linear image, x (the unfilled square)
and the actual point, xexact (the filled square). This distance can
be approximated from simple geometry as a function of r and
the local geodesic distance between the points.
This error can be approximated from the geodesic distances between the points. Let r denote the radius of curvature of the manifold at x. The error in the interpolation-based representation is caused by considering the space to be locally linear when it is curved (see Fig. 3.7). This error is approximated from simple geometry as

e = max_{i=1,...,k} ( r − √(r² − li²) ).
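For small neighborhoods the estimate above can be expanded; the following sketch (a Taylor expansion of the square root, valid for li ≪ r) shows that the error decays quadratically with the neighborhood radius:

```latex
e_i \;=\; r-\sqrt{r^{2}-l_i^{2}}
    \;=\; r\Bigl(1-\sqrt{1-l_i^{2}/r^{2}}\Bigr)
    \;\approx\; r\cdot\frac{l_i^{2}}{2r^{2}}
    \;=\; \frac{l_i^{2}}{2r},
\qquad l_i \ll r .
```

Hence, as the sampling density increases and the neighborhood radii li shrink, the interpolation error vanishes quadratically, consistent with the limit discussed above.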
Method 3: Local linear interpolation with projection
The local linear interpolation method (Method 2) can be made exact by simply projecting the image obtained after interpolation onto the manifold (shown schematically in Fig. 3.8). This ensures that the image lies on the manifold MS2 . The interpolated image x̂ provides a good approximation of the exact image, x̂ ≈ xexact , where xexact is that microstructure on MS2 whose geodesic distance from each of the k microstructures xi , i = 1, . . . , k is li . The errors in this approximation are due to the fact that the approximation x̂ does not usually lie on the manifold (as seen in Fig. 3.8). That is, x̂ does not satisfy the statistical correlations S = {S 1 , . . . , S p } that all points on the manifold satisfy.
The projection operation essentially modifies the point x̂ to satisfy these cor-
relations. This can be achieved computationally by performing a stochastic op-
timization problem starting from x̂ [76, 80, 86]. Since x̂ is very close to xexact ,
these algorithms are guaranteed to reach the local minima defined by xexact . In
the context of the numerical examples presented in this work, using two-phase
microstructures (see Section 3.6), the stochastic optimization is done as follows:
Starting from the approximate microstructure x̂, compute the volume fraction and two-point correlation of this image. Change the pixel values of t sites in this image by swapping pixels between the two phases, accepting changes that reduce the mismatch with the target correlations.
Figure 3.8: The local linear interpolation method can be made exact by
simply projecting the image obtained after interpolation onto
the manifold. This is illustrated in the figure above, where
the dotted blue line represents the linear interpolation and the
curved line represents the projection operator that constructs
the image lying on the manifold.
This procedure can be viewed as a version of simulated annealing, with the starting point, x̂, being close to the optimal point, xexact .
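The projection step can be sketched as a pixel-swap search. The toy version below matches only a one-dimensional two-point correlation and uses greedy acceptance (the zero-temperature limit of simulated annealing); the actual optimization in this work also matches the volume fraction and the full correlation functions. All names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def s2_x(img, rmax):
    """Two-point correlation S2(r) along x: P(both ends in phase 1)."""
    return np.array([(img * np.roll(img, r, axis=1)).mean()
                     for r in range(rmax)])

def project(img, target_s2, n_steps=500, rmax=8):
    """Greedy pixel-swap projection of img toward target_s2.

    Swapping one phase-1 pixel with one phase-0 pixel preserves the
    volume fraction; a swap is kept only if the S2 mismatch decreases
    (the zero-temperature limit of simulated annealing)."""
    img = img.copy()
    err = np.abs(s2_x(img, rmax) - target_s2).sum()
    for _ in range(n_steps):
        ones, zeros = np.argwhere(img == 1), np.argwhere(img == 0)
        a = tuple(ones[rng.integers(len(ones))])
        b = tuple(zeros[rng.integers(len(zeros))])
        img[a], img[b] = 0, 1                  # trial swap
        new_err = np.abs(s2_x(img, rmax) - target_s2).sum()
        if new_err <= err:
            err = new_err                      # keep the swap
        else:
            img[a], img[b] = 1, 0              # revert
    return img, err
```

Because every accepted move swaps exactly one pixel of each phase, the volume fraction is preserved throughout, mirroring the behavior described above.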
Figure 3.9: Sample illustration of the projection step: The figure on the
left is a microstructure after interpolation. Projecting it onto
the manifold yields the microstructure on the right. There is
negligible change in the three-point correlation between the two
microstructures.
One question that arises from this mapping strategy is the following: The
initial goal of the mapping was to construct a microstructure (lying on the man-
ifold) that was a distance (geodesically) li from xi . The interpolation step en-
sures this distance (but the microstructure does not lie on the manifold). How
much deviation from this distance does the projection operator cause? By swap-
ping the pixels in the microstructure during the stochastic optimization step, the
three-point correlation is being changed (which is a measure of distance). But in
all of our numerical experiments, this optimization converged within 1000 such
pixel flips, thus negligibly affecting the three-point correlation (since the three-
point correlation is computed by sampling over 65 × 65 × 65 ∼ 300, 000 points,
changing 0.3% of the pixels is negligibly small). This is illustrated pictorially using the sample example shown in Fig. 3.9.
In all three methods detailed above, we only utilize the given input data {xi } to construct the mapping for an arbitrary point ξ ∈ A. The mapped microstructure, F −1 (ξ), is constructed solely from the available input microstructures and not by direct reconstruction from moments. In the case of the projection operation used in Method 3, the mapped microstructure is moved onto the manifold MS 2 . This operation changes less than 0.3% of the microstructure, having a negligible effect on its thermal behavior.
3.4.6 The low-dimensional stochastic input model F −1 : A →
MS 2
Each of the microstructures in MS2 (by definition) satisfies all the required statistical properties; therefore they are equally probable to occur. That is, every point ξ ∈ A corresponds to an equally probable microstructure.
3.5 Numerical implementation
This section contains recipes for the numerical implementation of the theoretical developments detailed in the previous sections. We divide this section into various subsections that sequentially discuss the data-driven strategy, starting with the generation of the samples given some limited microstructural information (Section 3.5.1), the algorithm for constructing the low-dimensional region
A (Section 3.5.2) and the Smolyak algorithm for the solution of the SPDEs (Sec-
tion 3.5.3).
Step 1: Generate a set of N microstructure samples. The given statistical correlations are enforced during the reconstruction of the GRF using the given information [87, 69, 88]. With this method, a set of N 3D models of the property variations can be generated.
Step 2: Construct the neighborhood graph, G, of this sample set. That is,
determine which points are neighbors on the manifold based on the distance
P(i, j). Find the nearest k-neighbors of each point. This is performed using a
sorting algorithm (O(N log N) complexity). Connect these points on the graph G and set the edge lengths equal to P(i, j). The total complexity of this operation is O(N² log N).
for i = 1:N
    [~, I] = sort(P(:, i));   % sort all points by distance from point i
    G(1:k, i) = I(2:k+1);     % keep the k nearest neighbors (skip self)
end
Step 3: Estimate the geodesic distance M(i, j) between all pairs of points on
the manifold. This can be done by computing the shortest path distances in the
graph G. There are several algorithms to compute the shortest path on a graph.
In our implementation, we utilize Floyd’s algorithm to compute M(i, j). The
complexity of this step is O(N 3 ).
Initialize M(i, j) with the edge lengths of G for connected pairs and ∞ otherwise, then relax:
for k = 1:N
    M = min(M, M(:, k) + M(k, :));   % relax all pairs through vertex k
end
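Step 3 (Floyd's algorithm) can be sketched in a few lines; below, W is assumed to hold the neighborhood-graph edge lengths, with np.inf for non-neighbors and zeros on the diagonal (names are illustrative):

```python
import numpy as np

def geodesic_distances(W):
    """All-pairs shortest paths on the neighborhood graph (Floyd's
    algorithm, O(N^3)).

    W[i, j] is the edge length P(i, j) for connected neighbors and
    np.inf otherwise; W[i, i] = 0.
    """
    M = W.copy()
    N = len(M)
    for k in range(N):
        # relax every pair (i, j) through intermediate vertex k
        M = np.minimum(M, M[:, k:k+1] + M[k:k+1, :])
    return M
```

The inner double loop is expressed as one broadcast minimum per intermediate vertex, which keeps the O(N³) cost but vectorizes the work.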
Step 4: Estimate the intrinsic dimensionality, d, of the low-dimensional space A. Using the graph G [30], estimate the average geodesic MST length. This is done by fitting

log L(p) = a log(p) + ε_p ,

where L(p) is the length functional of the MST computed from p samples; the slope a = (d − 1)/d of this fit yields the dimensionality d.
We utilize the code in [89] to compute the length functional of the MST. The complexity of this step is O(N log N).
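Assuming the geodesic-MST slope relation a = (d − 1)/d (consistent with the value a ≈ 0.885 corresponding to d = 9 reported later in this chapter), the dimensionality estimate reduces to a least squares fit. A sketch with illustrative names:

```python
import numpy as np

def intrinsic_dimension(ns, lengths):
    """Estimate d from MST lengths L(n) at sample sizes n via a least
    squares fit of log L(n) = a log(n) + const; the slope a = (d - 1)/d
    gives d = 1/(1 - a), rounded to the nearest integer."""
    a, _ = np.polyfit(np.log(ns), np.log(lengths), 1)
    return int(round(1.0 / (1.0 - a)))
```

In practice L(n) is averaged over several random subsamples of size n before the fit.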
Step 5: Construct the d-dimensional embedding using MDS.
Compute A(i, j) = −(1/2) M²(i, j).
Compute B = HAH, where H = I − (1/N)11ᵀ is the centering matrix.
Compute the d largest eigenvalues λi and eigenvectors vi of B; the embedded points are yi = (√λ₁ v₁(i), . . . , √λ_d v_d(i)).
Finally, set A = convex hull(y1 , . . . , yN ).
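Step 5 is classical (metric) MDS. A compact sketch, assuming M is the symmetric geodesic distance matrix (the function name is illustrative):

```python
import numpy as np

def classical_mds(M, d):
    """Classical MDS: embed points so that pairwise Euclidean distances
    approximate the geodesic distance matrix M (N x N).

    Steps: A = -0.5 * M**2;  B = H A H with the centering matrix
    H = I - (1/N) 11^T;  the top-d eigenpairs of B give the embedding
    y_i = (sqrt(l_1) v_1[i], ..., sqrt(l_d) v_d[i])."""
    N = len(M)
    H = np.eye(N) - np.ones((N, N)) / N
    B = H @ (-0.5 * M**2) @ H
    vals, vecs = np.linalg.eigh(B)            # ascending eigenvalues
    vals, vecs = vals[::-1][:d], vecs[:, ::-1][:, :d]
    return vecs * np.sqrt(np.maximum(vals, 0.0))
```

For distances that are exactly Euclidean the embedding is exact; for geodesic distances small negative eigenvalues can appear and are clipped to zero.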
3.5.3 Utilizing this reduced-order representation: Stochastic
collocation
This subsection details the use of the reduced-order stochastic model for the solution of SPDEs. We utilize a sparse grid collocation strategy
for constructing the stochastic solution [14]. The method essentially solves the
problem at various points ξ on the stochastic space and constructs an interpola-
tion based approximation to the stochastic solution. The sparse grid collocation points are mapped to microstructures, and these images are utilized as inputs (property maps) in the solution of the SPDEs.
Remark 5.1: In the stochastic collocation approach, the collocation points are
usually given in the unit hypercube, i.e. p ∈ [0, 1]d . As a first step, this point p
must be mapped to a corresponding point ξ ∈ A.
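One simple (approximate) realization of the map in Remark 5.1, sketched below, scales p affinely into the bounding box of the embedded samples; points landing outside the convex hull A would still need a fallback (e.g. the nearest embedded sample). Names are illustrative:

```python
import numpy as np

def hypercube_to_A(p, Y):
    """Map a collocation point p in [0, 1]^d toward the region A.

    A simple approximate choice: scale p affinely into the bounding box
    of the embedded samples Y (N x d).  Points mapped near the corners
    of the box may fall slightly outside the convex hull A, in which
    case one can fall back to the nearest embedded sample."""
    lo, hi = Y.min(axis=0), Y.max(axis=0)
    return lo + p * (hi - lo)
```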
for i = 1:M
    compute xi = F −1 (ξ i )      % map each collocation point to a microstructure
    solve the deterministic PDE using xi as the property map
end
3.6 Illustrative example
We divide the discussion of the solution into multiple sections. 1) The first step is the extraction of topological
statistics from the experimental image provided. These statistics are then uti-
lized to reconstruct a large set of 3D microstructures {xi }, i = 1, . . . , N. 2) The
next step is to construct the low-dimensional representation of the class of mi-
Data extraction and sample set construction
The first step is to extract the statistical information from the experimental image. The image is cropped, deblurred and
discretized. The volume fraction of silver is p = 0.2. The experimental two-point
correlation is extracted from the image. The normalized two-point correlation, g(r) = (L₂(r) − p²)/(p − p²), is shown in Fig. 3.11. The data extraction was performed in Matlab.
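The normalized two-point correlation can be estimated directly from a binary image. The sketch below (illustrative names) measures L₂(r) along one axis with periodic wrapping:

```python
import numpy as np

def normalized_two_point(img, rmax):
    """Normalized two-point correlation g(r) = (L2(r) - p^2)/(p - p^2)
    of a binary image, estimated along the last axis with periodic wrap.

    p is the volume fraction; g(0) = 1 by construction, and g(r) -> 0
    as correlations decay (img must contain both phases)."""
    p = img.mean()
    L2 = np.array([(img * np.roll(img, r, axis=-1)).mean()
                   for r in range(rmax)])
    return (L2 - p**2) / (p - p**2)
```

In practice the estimate would be averaged over all axes (or directions) rather than a single one.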
[Figure 3.11: the normalized two-point correlation g(r) plotted versus r (µm).]
The next step is to utilize these extracted statistical relations (volume frac-
tion and two-point correlation) to reconstruct a class of 3D microstructures. We
utilize a statistics based reconstruction procedure based on Gaussian Random Fields (GRF). In this method, the 3D microstructure is obtained as the level cuts of a GRF characterized by a set of parameters (β, d, rc ), where rc is a cutoff scale. For a specific choice of (β, d, rc ), one can construct a microstructure from the resulting GRF. The (theoretical) two-point correlation corresponding to this reconstructed microstructure is computed. Optimal values of (β, d, rc ) are obtained by minimizing the error between the theoretical and experimental two-point correlations.
Using the optimal parameters of the GRF (to match with the experimental
data), realizations of 3D microstructure were computed. Each microstructure
consisted of 65 × 65 × 65 pixels. This corresponds to a size of 20µm × 20µm × 20µm.
[Figure: comparison of the reconstructed and experimental normalized two-point correlations, g(r) versus r (µm).]
The GRF based reconstruction detailed above was used to generate a set of
N = 1000 samples of two-phase microstructure. Each microstructure is repre-
sented as a 65 × 65 × 65 pixel image. The three-point correlations of all these microstructures were then computed: a triangle with side lengths (a, b, c) is randomly placed in the microstructure, and a placement is successful if all three vertices of the triangle lie on the same phase. S 3 (a, b, c) is the ratio of the number of such successful placements over the total number of tries. In our computations, we randomly place 500, 000 triangles to compute S 3 for each value of (a, b, c). The total computational time to reconstruct 1000 microstructures along with their S 3 was 30 minutes on 25 nodes of our in-house Linux cluster.
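The Monte Carlo estimate of S3 can be sketched as follows. For brevity, this simplified version translates a fixed-orientation triangle (vertex offsets from the law of cosines) to random positions with periodic wrapping, whereas the computation above would also randomize the orientation. Names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def s3(img, a, b, c, n_tries=50_000):
    """Monte Carlo estimate of the three-point correlation S3(a, b, c):
    the fraction of randomly placed triangles with side lengths
    (a, b, c) whose three vertices all land on phase 1.

    Simplification: fixed triangle orientation, periodic wrapping."""
    ny, nx = img.shape
    x2 = (a**2 + b**2 - c**2) / (2 * a)       # third vertex, x offset
    y2 = np.sqrt(max(b**2 - x2**2, 0.0))      # third vertex, y offset
    offs = np.array([[0, 0], [0, a], [round(y2), round(x2)]], dtype=int)
    pos = rng.integers(0, (ny, nx), size=(n_tries, 2))
    hits = np.ones(n_tries, dtype=bool)
    for dy, dx in offs:
        hits &= img[(pos[:, 0] + dy) % ny, (pos[:, 1] + dx) % nx] == 1
    return hits.mean()
```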
Figure 3.14: Plot of the length functional of the MST of the graph G for
various sample sizes.
The dimensionality is estimated from the slope of this line (see Eq. (3.20)). The slope of the curve is computed using a
simple least squares fit. The optimal dimensionality was estimated to be d = 9
(a ∼ 0.885). The total computational time to estimate the dimensionality was 8
minutes on a 3.8GHz PC.
Multi-dimensional scaling (MDS) is performed using the geodesic distance matrix
M. The 9 largest eigenvalues and their corresponding eigenvectors are used to
represent the input samples. The low-dimensional region A is constructed as
the convex hull of these N(= 1000) 9-dimensional points ξi . This region coupled
with the mappings developed in Section 3.4.5 define the reduced-order stochas-
tic input model F −1 : A → MS 2 .
Figure 3.15: (Left) The cumulative eigenspectrum of the data, (Σ_{i=1}^{d} λi ) / (Σ_{i=1}^{N} λi ). (Right) The residual variance for different dimensionalities of the region A, computed from MDS.
One can choose the dimensionality of the region A based on simple variance errors (i.e. choose the value of d that accounts for 90% of the variance in the data). Fig. 3.15(a) plots the cumulative eigenspectrum of the computed eigenvalues of B, (Σ_{i=1}^{d} λi ) / (Σ_{i=1}^{N} λi ). All dimensionalities beyond d = 4 account for over 90% of the variance in the data. Hence, there arises some ambiguity in simply choosing the dimensionality of the reduced model based on this
plot. Fig. 3.15(b) plots the residual variance of the low-dimensional representation for various dimensionalities. The residual variance measures the difference between the intrinsic manifold distance matrix, M, and the pairwise Euclidean distance matrix, DA , recovered from MDS (see Eq. 3.23) for various dimensionalities d. It is defined in terms of the element-wise correlation between the two matrices, e = 1 − r² (DA , M), where r is the standard linear correlation coefficient [91].
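The residual variance is a one-liner once the two distance matrices are in hand (a sketch with illustrative names):

```python
import numpy as np

def residual_variance(D_A, M):
    """Residual variance e = 1 - r^2(D_A, M), where r is the linear
    correlation coefficient between the entries of the recovered
    Euclidean distance matrix D_A and the geodesic matrix M."""
    r = np.corrcoef(D_A.ravel(), M.ravel())[0, 1]
    return 1.0 - r**2
```

Note that e is invariant to a uniform rescaling of either matrix, so it measures shape agreement rather than absolute distances.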
Notice that all the dimensions above d = 8 have fairly small residual variance; hence d = 9 is retained. The points ξ ∈ A serve as the stochastic input for the diffusion equation. A simple diffusion problem is considered (Eq. 3.13). A computational domain of 65 × 65 × 65 is considered (this corresponds to a physical domain of 20µm × 20µm × 20µm). The random heterogeneous microstructure is constructed as a 65 × 65 × 65 pixel image.
[Figure: interpolation error versus the number of collocation points (log-log scale).]
nodes (each with two 3.8 GHz CPUs) of our 64-node Linux cluster were utilized to solve these deterministic equations. These are dual-core processors with hyper-threading capabilities; thus, each node was used to perform the computation for 4 such problems. The total computational time was about 210 minutes. The reported error is max_{i=1,...,nno} (| f − I( f )|), where nno is the number of nodes. As the level of interpolation increases, the number of sampling points used to construct the stochastic
solution increases [14]. Notice that there is a slight jump in the error going
from an interpolation of depth 2 to depth 3. This is probably due to the pres-
ence of some highly localized fluctuations of the stochastic solution that is cap-
tured only when the depth of interpolation reaches 3. Nevertheless, the error
reduction shown above follows the theoretical convergence estimates for using
Smolyak based interpolation [14].
The mean temperature is shown in Fig. 3.17. The figure plots iso-surfaces of
temperatures −0.25 (Fig. 3.17(b)), 0.0 (Fig. 3.17(c)) and 0.25 (Fig. 3.17(d)). The
figure also shows temperature slices at three different locations of the xz plane:
y = 0 (Fig. 3.17(e)), y = 8µm (Fig. 3.17(f)) and y = 16µm (Fig. 3.17(g)).
The non-linear model reduction strategy converts the given finite input data set into a set of low-dimensional points. In case the input data {xi } all belong to a localized region of MS 2 , the model reduction framework will construct a low-dimensional parametrization of only this localized region. By increasing the amount of data utilized, one can make sure that the complete space, MS 2 , is sampled, ensuring that the data-driven model reflects the full variability of the microstructure class.
3.7 Conclusions
A framework for constructing realistic, low-dimensional stochastic input models of property variability has been developed in the present work. The major advantages of the proposed developments are: it seamlessly meshes with any reconstruction method, directly converts samples into an equiprobable low-order model of the property, and is applicable to any property variation. The framework was applied to analyze how uncertainty in the topology of the microstructure affects the evolution of a dependent variable (temperature).
The basic model reduction ideas envisioned in this work are not limited to
generation of viable stochastic input models of property variations. This frame-
work has direct applicability to problems where working in high-dimensional spaces becomes computationally intractable and can be alleviated by the model reduction strategy formulated here. This is an area that is unexplored and could
potentially result in very efficient, real time, data-driven, stochastic reduced-
order model generation techniques. In addition to the importance of such mod-
els in process modeling of heterogeneous materials (polycrystals, composites, and related multiphase media).
CHAPTER 4
ANALYSIS OF FLOW THROUGH HETEROGENEOUS RANDOM MEDIA:
A STOCHASTIC MULTISCALE SYSTEM
4.1 Introduction
Several computational techniques have been developed that solve for a coarse-scale solution by defining an appropriate coarse-scale problem that captures the effect of the fine-scales [94]. The more popular techniques developed for such
upscaling fall under the category of multiscale methods viz. the variational
multiscale (VMS) method (operator upscaling) [95, 96, 97, 98, 99, 100], the
heterogeneous multiscale method [101, 102] and the multiscale finite element
method [103, 104, 105, 106, 107]. Further related techniques are presented
in [108, 109]. These computationally scalable techniques have resulted in the
development of black box simulators that have been used with significant suc-
cess in the solution of large scale transport problems in complex geometries.
The multiscale analysis of such systems inherently assumes that the com-
plete, fine-scale variation of the permeability is known. This assumption lim-
its the applicability of these frameworks since it is usually not possible to ex-
perimentally determine the complete structure of the media at the finest scale.
In the past decade, there has been tremendous progress in posing and
solving SPDEs. Several techniques like Generalized Polynomial Chaos expan-
sion [7, 8, 11, 12, 37, 39, 41], wavelet expansion and collocation based strate-
gies [14, 16, 110, 111] have been developed to solve SPDEs. These techniques
have been applied with significant success to solve single scale stochastic prob-
lems. In recent years, there has been significant interest in coupling determin-
istic multiscale methods with techniques in stochastic analysis to investigate
There are two key questions that have to be sequentially answered to con-
struct a stochastic multiscale framework that models transport phenomena only
given limited information: (1) Techniques to convert/encode limited statistical
information and/or sampled property values into a viable (stochastic) input
model for the (multiscale) permeability variation and (2) Given this input mul-
tiscale stochastic model, techniques to solve the stochastic multiscale equations.
The basic idea to solve the stochastic multiscale set of equations is to ex-
tend deterministic multiscale methods to their stochastic analogues. Spectral
strategies to pose and solve stochastic multiscale problems have been investi-
gated by Xu [112] and Asokan and Zabaras [42]. Collocation based strategies
have recently been developed in [20]. The key is to define appropriate ways to
link the fine-scale (subgrid) stochastic variation with the coarse-scale stochastic
variation of the dependent variables.
Fig. 4.1 shows a schematic of the problem of interest. Denote the domain as D ⊂ R^{nsd} , where nsd is the number of spatial dimensions. The quantity of interest is the permeability variation, k(x), in the domain. However, the complete permeability distribution is unknown. Only some limited statistics and/or snapshots of the permeability are given. This limited information necessitates assuming that the permeability is a realization of a random field. This is
mathematically stated as follows: Let Ω be the space of all allowable permeabil-
ity variations. This is our event space. Every point k = {k(x, ω), ∀x ∈ D, ω ∈ Ω}
in this space is equiprobable. Consequently, we can define a σ-algebra F and
a corresponding probability measure P : F → [0, 1] to construct a complete
probability space (Ω, F , P) of allowable permeability. To make this abstract description amenable to computation, the permeability is represented in terms of a finite number of random variables, k(x, ω) ≈ k(x, Y1 , . . . , YN ), where Y1 , . . . , YN are uncorrelated random variables.
The pressure and velocity are characterized by the set of equations given below (Eqs. (4.5) and (4.6)). The source term f is also assumed not to have a multiscale character. These assumptions are simply made to keep the subsequent developments clear. It is fairly straightforward to
account for uncertainties in f as well as to analyze the effects of a multiscale
f [113].
The goal is to compute the coarse-scale solution while taking into account the fine-scale variation in the stochastic permeability. In the next section, we detail a stochastic extension to the variational multiscale method to solve this problem. The stochastic multiscale formulation is based on the multiscale formulation detailed in the papers by Juanes et al. [113, 114], with emphasis on the stochastic extensions. For brevity of presentation, we will concentrate on these aspects. The interested reader is referred to [113, 114] for other details.
4.3 Stochastic variational multiscale formulation
For the problem to be physically relevant, we assume that the stochastic perme-
ability satisfies some conditions, particularly:
Assumption 1: k is positive and uniformly coercive.
The governing equations for the velocity and pressure given in the mixed
form are as follows:
k−1 u + ∇p = 0, (4.5)
∇ · u = f. (4.6)
The above constitutive and conservation equations are supplemented with the
following boundary conditions
p = po on ∂D p , (4.7)
u · n = uo on ∂Du . (4.8)
Without loss of generality, we assume that the boundary conditions are deter-
ministic and that the Neumann condition is homogeneous [113], uo = 0 on ∂Du .
The next step is to introduce the appropriate function spaces in which the
velocity and pressure lie. In contrast to their deterministic counterparts, the velocity and pressure defined here are stochastic processes. Most dependent variables one encounters in stochastic analysis are usually random processes [45,
115, 116, 117] that are defined in product spaces. These variables (for instance
p(x, Y)) usually have one structure in the stochastic space p(., Y) and another
in the physical space p(x, .). The numerical analysis/approximation [45, 115]
of such functions can be performed by defining the appropriate tensor product
spaces. The interested reader is referred to [45, 115] for insightful discussions
on the definitions of these product spaces. Following [45, 115], we define ap-
propriate function spaces that encode variations of the function in the physical
domain D and in the stochastic space Γ.
Remark 3.1: We will consider stochastic functions that belong to S, the space
of all square integrable functions, with measure ρ(Y). Given a deterministic
function space A, its stochastic counterpart is the tensor product space A ≡ S⊗A.
The problem defined by Eqs. (4.5) and (4.6) along with the boundary conditions Eqs. (4.7) and (4.8) can be written in mixed variational form: Find (u, p) ∈ V × W such that the variational statements Eqs. (4.12) and (4.13) are satisfied, where ⟨ f, g⟩ is defined as ∫_Γ dρ(Y) ∫_{∂D_p} f g dx.
Variational multiscale approach: The exact solution u is assumed to be
made up of contributions [98] from two different (spatial) scales namely, the
coarse-scale solution uc (x, .) that can be resolved using a coarse (spatial) mesh
and a subgrid solution u f (x, .):
u = uc + u f , p = pc + p f . (4.14)
This additive sum decomposition induces a similar decomposition for the spatial part of the fine-scale tensor-product function spaces into a direct sum of coarse-scale and subgrid parts:

W = Wc ⊕ W f , H = Hc ⊕ H f , V = Vc ⊕ V f . (4.15)
The main idea is to develop models for characterizing the effect of the subgrid
solution u f (x, .) on the coarse-scale solution and to subsequently derive a mod-
ified coarse-scale formulation that only involves uc (x, .). The additive decom-
position provides a way of splitting the fine-scale problem given by Eqs. (4.12)-
(4.13) into a coarse-scale problem and a subgrid problem. Testing against the coarse-scale test functions results in the coarse-scale variational problem, Eqs. (4.16) and (4.17). Similarly, testing against the subgrid test functions results in the subgrid variational problem, whose conservation statement reads

(w f , ∇ · (uc + u f )) = (w f , f ), ∀w f ∈ W f . (4.19)
Note that both the coarse-scale variational equations Eqs. (4.16)-(4.17) as well
as the subgrid variational equations Eqs. (4.18)-(4.19) contain coarse and sub-
grid variations in the dependent variables (uc , pc ; u f , p f ). The key is to solve Eqs. (4.18)-(4.19) for u f and construct a functional representation of the subgrid variation, u f and p f , in terms of the coarse-scale variation, uc :

u f = Φ(uc ), p f = Ψ(uc ). (4.20)

To set up the discrete problem, approximation spaces for each scale are first defined. As stated before (Remark 3.1), the stochastic function
spaces are simply tensor product spaces of S with the corresponding determin-
istic spaces.
Fine-scale approximation spaces: Let Vh and Wh be finite-dimensional subspaces of the deterministic velocity and pressure spaces, and define the stochastic spaces

W h = S ⊗ Wh , V h = S ⊗ Vh . (4.21)
Note that Wh and Vh should satisfy the discrete inf-sup condition [118]. Consider a partition, Th , of the domain D into non-overlapping elements ei , Th = ∪_{i=1}^{Nh} ei , where Nh is the number of elements of the grid. Following [113], define also the skeleton of the partition, SPh = ∪_{a=1}^{Mh} γa , where Mh is the number of element faces, denoted by γa . The partition Th is denoted as the fine-grid, on which the fine-scale solution is resolved. The deterministic approximation space for the fine-scale velocity is taken to be the lowest-order Raviart-Thomas [113] space, RT 0 (Th ), and the deterministic finite element approximation space for the fine-scale pressure, Wh , is taken to be the space of piece-wise constants on the fine mesh, P0 (Th ).
In the deterministic developments, the two choices for such spaces were the lowest order
Raviart-Thomas space (used in [113]) and the Brezzi-Douglas-Marini space of
order 1 (used in [96]). We choose to utilize the stochastic analog to the low-
est order Raviart-Thomas space in the current work. We can now associate the coarse-scale velocity space with the coarse grid:

Vch = S ⊗ Vch , Vch = {uc : uc = Σ_{a=1}^{Mc} Nac uac , uac = 0 ∀Λa ∈ ∂Du }. (4.22)
Similarly, the coarse-scale pressure space is the stochastic analog of the space of piece-wise constants on the coarse mesh, P0 (Tc ):

Wch = S ⊗ Wch , Wch = {wc : wc = Σ_{i=1}^{Nc} φci wci }, (4.23)

where φci is the characteristic function of the coarse element E i , which is equal to one in element E i and zero in all other elements. wci is the corresponding pressure degree of freedom (the average pressure in coarse-element E i ).
The subgrid spaces are defined locally on each coarse-element [113]. Denote by E i,h = Th (E i ) the fine grid defined over the coarse element E i . The subgrid-scale velocity field defined on each coarse element satisfies the condition u f · n = 0 on ∂E i . Similarly, the subgrid pressure space is restricted to belong to the space of piece-wise constant functions on each coarse element,
where Whf,i is the finite-dimensional approximation to the subgrid-scale continuum space W f , constrained to a coarse element E i . The elements of these spaces are extended by zero to the entire spatial domain [113], D, and the subgrid pressure space is defined as

Whf = ⊕_{i=1}^{Nc} Whf,i . (4.29)
We also assume that this condition is satisfied not in a distributed sense, but for each stochastic realization (i.e. point-wise in stochastic space). This leads to the condition in Eq. (4.30),
where (., .)Ei is the corresponding inner product defined over the coarse-element
E i . This condition (in its strong, point-wise form) is equivalent to [113]
∇ · uc = Πc f, (4.31)
where Πc f is the projection of the source/sink term onto the space Wc of piece-
wise constants on the coarse grid. As stated in the beginning of this section, we
assume that the source/sink function does not display a multiscale nature, i.e.
it is equal to its projection on the space of coarse-scale pressures ( f = Πc f ). Substituting Eq. (4.30) into the coarse-scale conservation equation Eq. (4.17) results in Eq. (4.32). Since wc is constant in each coarse element, using the divergence theorem and the assumption that this condition is valid point-wise in stochastic space results in Eq. (4.33). Eq. (4.33) is the essential condition [113] that guarantees mass conservation at
both scales and allows for the localization of the subgrid problem. This local-
ization assumption converts the global subgrid problem into a set of local Neu-
mann problems with boundary conditions defined by Eq. (4.33). Furthermore,
the subgrid test function v f must satisfy v f · n = 0 on ∂E i which results in the
following constraint
(∇ · v f , pc )Ei = 0, ∀E i ∈ Tc . (4.34)
Finally, we define the space of subgrid pressure Whf as the orthogonal comple-
ment of Wch in Wh (i.e. (w f , wc ) = 0). Now, since Wch = div Vch , this results in an
additional constraint
(∇ · vc , p f )Ei = 0, ∀E i ∈ Tc . (4.35)
The subgrid problem: Using the localization assumption Eqs. (4.33), (4.34), (4.35)
the localized subgrid problem can be defined as follows: For each coarse ele-
ment, E i = 1, . . . , Nc , find (u f , p f ) ∈ Vhf,i × Whf,i , such that
(v f , k−1 u f )Ei − (∇ · v f , p f )Ei = −(v f , k−1 uc )Ei , ∀v f ∈ Vhf,i , (4.36)
together with the corresponding pressure equation, Eq. (4.37), posed on each coarse element.
The above equations transform the problem of finding the function Φ(uc ) and
Ψ(uc ) (defined in Eq. (4.20)) into finding the subgrid scale basis functions (N af , φaf )
associated with each coarse-scale degree of freedom. For computational efficiency, the multiscale basis function representing the multiscale velocity is defined as the sum of the coarse-scale and subgrid-scale components as

u = Σ_{a=1}^{Mc} (Nac + Naf ) uac = Σ_{a=1}^{Mc} Nams uac . (4.39)
The multiscale basis functions are associated with each coarse-scale inter-
face Λa . The multiscale basis functions are constructed based on defining the
Green's function of the subgrid Eqs. (4.36)-(4.37) [98, 99]. In the context of deterministic mixed multiscale methods, various authors have constructed these
multiscale basis functions [96, 106, 107]. We utilize the deterministic definition
of the multiscale basis functions used in [120, 113] to define the stochastic mul-
tiscale basis functions. The multiscale basis function is the solution to a flow
problem restricted to a pair of adjacent coarse elements that share a common
coarse interface. The source terms are specified in such a way that the flow
through the interface is identically one (based on the Green’s function idea). It
is straightforward to extend this definition of the multiscale basis function to
the case when the permeability is stochastic.
The multiscale basis functions (Nams , φams ) associated with the interface Λa (which is shared by the coarse elements E i and E j ) are the solution to the following problem:

k−1 Nams + ∇φams = 0, in E i ∪ E j , (4.40)
Nams · n = 0, on ∂(E i ∪ E j ), (4.41)
∇ · Nams = +k(x, .) / ∫_{E i} k(x, .)dx, if x ∈ E i ;
∇ · Nams = −k(x, .) / ∫_{E j} k(x, .)dx, if x ∈ E j . (4.42)
The coarse-scale problem then reads: Find (uc , pc ) ∈ Vch × Wch , such that Eqs. (4.47)-(4.48) hold, where φ̃ams = φams − (1/|E i |) ∫_{E i} φams dx.
Once the coarse-scale variables are solved, the fine-scale velocity and pres-
sure can be reconstructed (if necessary) by resorting to the additive decompo-
sition Eq. (4.14). The subgrid part is obtained by a linear combination of the
coarse-scale fluxes and the subgrid scale basis functions.
• Set the coarse and fine discretizations of the domain.
• Find the stochastic multiscale basis functions for each coarse element in-
terface by solving Eqs. (4.40)-(4.42).
• Use these multiscale basis functions to solve for the stochastic coarse-scale
velocities and pressure given by Eqs. (4.47)-(4.48). Adaptive techniques to
solve the above set of SPDEs are discussed in Section 3.5.3.
The complete framework is illustrated in Fig. 4.3. The data-driven strategies (discussed in Chapter 3) convert the limited
information available into a plausible, realistic stochastic representation of the
fine-scale permeability variation. The adaptive sparse grid collocation strategy is used to construct the stochastic multiscale basis functions over all the coarse
elements. Following this, the stochastic coarse-scale equations are solved for the
coarse-scale stochastic pressure and velocity.
4.5 Numerical examples
In this section, we apply the complete stochastic analysis framework: from data-driven model generation and the stochastic variational mixed multiscale formulation to the adaptive solution of the resulting SPDEs. In the first example, we investigate the effect of uncertainties at different scales in a heterogeneous domain.
4.5.1 Effect of uncertainties at different scales
The schematic of the domain of interest is shown in Fig. 4.4. The region is part
of an outcrop (where extensive measurements of permeability have been per-
formed [124]) in Lawyer Canyon in Texas. The domain of interest is a square
of length 200 ft. Flow is driven by an injection well at the left bottom corner
of the domain and a production well at the top right corner. There is a low-
permeability fault running across the domain. Limited information about the
spatial variation in the fine-scale permeability is available in the form of semi-
variograms and mean permeability (Fig. 4.5). Furthermore, the exact location
and characteristics of the fault are unknown. Hence both the fine-scale perme-
ability variation as well as the large-scale fault features have to be considered to
be stochastic.
Figure 4.4: Schematic of the domain. A fault runs across the middle of the
domain.
Effect of stochastic fine-scale permeability: We first investigate the effect
of the stochastic fine-scale permeability keeping the location of the fault fixed.
The fault is assumed to originate at (40, 100) and have a length of 100 feet and a
width of 20 feet. A reduced model for the fine-scale permeability is constructed
from the limited statistics available. Since in this case, the correlation structure
of the permeability variation is available, we utilize the KL expansion to con-
struct a viable stochastic model.
Fig. 4.6 plots the eigen-spectrum of the correlation kernel (note the logarithmic scale used). The first 25 eigenvalues represent about 95% of the total information. The KL expansion is truncated to 25 terms and this generates a 25-dimensional input stochastic model for the fine-scale permeability.
Three eigenvectors utilized in the KL expansion are shown in Fig. 4.7.
[Figures 4.5–4.7: semivariogram V(r) versus distance r; eigen-spectrum of the correlation kernel (eigenvalue versus number of eigenvectors, logarithmic scale); three eigenvectors of the KL expansion.]
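The KL construction described above can be sketched numerically: discretize a covariance kernel on a grid, solve the eigenvalue problem, truncate at 95% of the energy, and sample realizations. The exponential kernel, grid size and correlation length below are illustrative assumptions, not the kernel fitted to the outcrop data.

```python
import numpy as np

# Discretize a 1-D domain and an assumed exponential covariance kernel
# (a stand-in for the kernel inferred from the measured semivariograms).
n, L, corr_len, sigma2 = 200, 200.0, 30.0, 1.0
x = np.linspace(0.0, L, n)
C = sigma2 * np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)

# Eigen-decomposition of the covariance matrix (discrete KL expansion)
lam, phi = np.linalg.eigh(C)
lam, phi = lam[::-1], phi[:, ::-1]            # descending eigenvalue order

# Truncate at 95% of the total "energy" (trace of C)
k = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.95)) + 1

# One realization of the (log-)permeability: sum of scaled eigenmodes
rng = np.random.default_rng(1)
Y = rng.standard_normal(k)                    # i.i.d. coefficients
log_k = phi[:, :k] @ (np.sqrt(lam[:k]) * Y)
```

Sampling different coefficient vectors Y produces the kind of extreme realizations shown in Fig. 4.8.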
Fig. 4.8 shows two extreme realizations of the log-permeability field constructed utilizing the KL model. Note the close to 7 orders of magnitude variation in the fine-scale permeability.
Figure 4.8: Two extreme realizations of the log-permeability generated
from the KL expansion.
The multiscale framework is used to solve for the stochastic coarse variables. Fig. 4.9 (left) plots the convergence of the stochastic pressure with increasing number of collocation points, while Fig. 4.9 (right) plots the convergence of
the stochastic x-direction flux with increasing number of collocation points. For
the same depth of interpolation about 2.9 × 109 collocation points would have
been required using conventional sparse grid collocation.
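The adaptive collocation idea (place new points only where the hierarchical surplus is large) can be illustrated in one dimension with piecewise-linear hierarchical interpolation. This is a toy sketch, not the thesis implementation; the tolerance and the kinked test function are arbitrary.

```python
import numpy as np

def adaptive_interpolate(f, tol=1e-3, max_level=12):
    """1-D adaptive hierarchical (piecewise-linear) interpolation:
    a child node is added only where the hierarchical surplus
    |f(x) - current interpolant(x)| exceeds tol."""
    xs = [0.0, 1.0, 0.5]
    ys = [f(0.0), f(1.0), f(0.5)]
    active = [(0.5, 0.25)]                   # (node, half-width of support)
    for _ in range(max_level):
        nxt = []
        for x0, h in active:
            for xc in (x0 - h, x0 + h):      # dyadic candidate children
                order = np.argsort(xs)
                pred = np.interp(xc, np.array(xs)[order], np.array(ys)[order])
                surplus = f(xc) - pred
                if abs(surplus) > tol:       # refine only where needed
                    xs.append(xc); ys.append(f(xc))
                    nxt.append((xc, h / 2.0))
        if not nxt:
            break
        active = nxt
    return np.array(xs), np.array(ys)

# A function with a kink concentrates refinement near x = 0.3,
# while smooth regions receive almost no points.
f = lambda x: abs(x - 0.3)
xs, ys = adaptive_interpolate(f)
```

In higher dimensions the same surplus criterion drives anisotropic refinement, which is what keeps the point count far below the conventional (non-adaptive) sparse grid.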
Fig. 4.10 plots the mean values of the coarse-scale pressure and x-direction
flux. Note that the pressure is stratified around the region where the permeabil-
ity fault is located.
Fig. 4.11 plots the standard deviation of the stochastic coarse-scale pres-
sure and x-direction flux. The low-permeability fault results in preferential flow
around the fault. This leads to stratification of pressure around the fault as well
as negligible flow. This is clearly seen as the very small standard deviation in
Figure 4.9: Convergence of the stochastic solution (mean and standard de-
viation) with increasing number of collocation points. (left)
Stochastic coarse-scale pressure, (right) Stochastic coarse-scale
x-direction flux.
pressure in the middle of the domain where the fault is located. Furthermore,
this results in a large variation in flux around the permeability fault as can be
seen in Fig. 4.11 (right).
The stochastic results are also compared with Monte Carlo sampling strategies. 10^6 samples were utilized to compute the statistics using the Monte Carlo method. This took about 900 hours using 30 nodes of our in-house Linux cluster. Fig. 4.12 plots the difference between the mean values of the coarse-scale pressure and x-direction flux computed from the two methods.
Figure 4.12: Difference in the mean value of the stochastic solutions using
the adaptive sparse grid strategy and Monte Carlo sampling.
(left) Mean pressure, (right) Mean x-direction flux.
Notice that the difference in the mean pressure is very close to zero near the injection well as well as on the permeability fault. On the other hand, since
the flow will preferentially try to avoid the fault (resulting in the largest devi-
148
ation around this region), there will be high variability in the x-direction flux
immediately above and below the permeability fault. This leads to the largest
error occurring here as seen clearly in Fig. 4.12. This is also clearly illustrated
in Fig. 4.13 which plots the difference between the standard deviation of the
coarse-scale pressure and x-direction flux computed from the two methods.
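The Monte Carlo validation above follows the usual pattern: sample the inputs, run one deterministic solve per sample, and average. A minimal sketch with a hypothetical scalar `model` standing in for one multiscale solve; the closed-form moments quoted in the comments hold only for this toy model.

```python
import numpy as np

# Monte Carlo estimation of output statistics. 'model' stands in for a
# full deterministic multiscale solve at one sampled permeability.
rng = np.random.default_rng(2)

def model(Y):
    # hypothetical scalar output (e.g. flux at a point) of the random inputs
    return np.exp(0.3 * Y[0]) + 0.1 * Y[1] ** 2

samples = np.array([model(rng.standard_normal(2)) for _ in range(20000)])
mc_mean = samples.mean()
mc_std = samples.std(ddof=1)

# Exact mean for this toy model: E[exp(0.3 Z)] = exp(0.045), E[0.1 Z^2] = 0.1
exact_mean = np.exp(0.045) + 0.1
```

The statistical error of the mean decays like N^{-1/2}, which is why the brute-force comparison required roughly 10^6 samples and hundreds of node-hours.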
Fig. 4.14 plots streamlines based on the mean coarse-scale velocity distribu-
tion in the domain. The streamlines preferentially circumvent the permeability
barrier.
PDFs of the coarse-scale x-direction flux were computed at two spatial points, one of which is (180, 160). The PDFs are plotted in Fig. 4.15. PDFs of coarse-scale pressure (not shown) also exhibited similar trends. The fine-scale stochastic permeability essentially caused a ‘smearing’ of the coarse-scale pressure and velocity fluxes.
Figure 4.14: Streamlines drawn on the mean stochastic coarse-scale veloc-
ity.
Effect of stochastic fault location: We next investigate the effect of uncertainty in the fault characteristics on the coarse-scale stochastic pressure and velocity flux. We set the fine-scale permeability variation to its mean value and utilize the same 100-fold coarse-grained
discretization as in the previous problem. We assume that there is 10% uncer-
tainty in the location, length and width of the permeability barrier. This is quan-
tified by 4 independent uniform random variables, (Y1 , Y2 , Y3 , Y4 ), on the domain
[−1, 1] as

xl = 40(1 + 0.1Y1),  yl = 100(1 + 0.1Y2),  Lbarrier = 100(1 + 0.1Y3),  Wbarrier = 20(1 + 0.1Y4),

where (xl , yl ) is the start position of the barrier, and Lbarrier , Wbarrier are the
length and width of the barrier, respectively. The variational stochastic mixed
multiscale framework is utilized to solve this problem. 40 nodes of our local Linux cluster (corresponding to 160 processors) are utilized to solve the problem.
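One plausible way to encode the 10% geometric uncertainty through the four uniform variables is sketched below; the exact parametrization used in the thesis is not reproduced here, so the mapping is an assumption consistent with the nominal values given earlier.

```python
import numpy as np

def fault_geometry(Y):
    """Map Y = (Y1, Y2, Y3, Y4) in [-1, 1]^4 to fault parameters with
    10% uncertainty about the nominal values (one plausible mapping;
    the thesis parametrization may differ in detail)."""
    Y = np.asarray(Y, dtype=float)
    assert Y.shape == (4,) and np.all(np.abs(Y) <= 1.0)
    xl = 40.0 * (1.0 + 0.1 * Y[0])    # x-location of barrier start (ft)
    yl = 100.0 * (1.0 + 0.1 * Y[1])   # y-location of barrier start (ft)
    length = 100.0 * (1.0 + 0.1 * Y[2])
    width = 20.0 * (1.0 + 0.1 * Y[3])
    return xl, yl, length, width

nominal = fault_geometry([0.0, 0.0, 0.0, 0.0])   # recovers (40, 100, 100, 20)
extreme = fault_geometry([1.0, 1.0, 1.0, 1.0])   # each parameter 10% larger
```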
Fig. 4.16 (left) plots the convergence of the stochastic pressure with increas-
ing number of collocation points, while Fig. 4.16 (right) plots the convergence
of the stochastic x-direction flux with increasing number of collocation points.
For the same depth of interpolation about 8000 collocation points would have
been required using conventional sparse grid collocation. The adaptive sparse
grid strategy sampled the 4D stochastic space anisotropically, with the random dimension corresponding to the x-location of the permeability barrier being the most heavily refined.
Fig. 4.17 plots the mean contours of the stochastic coarse-scale pressure and
flux in the x-direction. The effects of the fine-scale variation in the structure of the permeability are visible in the uneven pressure contours near the injection and
middle of the domain where the permeability barrier lies. This is also seen in
Fig. 4.17 (right), where most of the flow circumvents the location of the fault.
This is even more clearly seen in the streamlines drawn on the mean coarse-
scale velocity field as shown in Fig. 4.18.
Fig. 4.19 plots the standard deviation contours of the coarse-scale stochastic
pressure and x-direction flux. As expected, most of the deviation in the pressure
occurs close to the region of the permeability fault. Interestingly, most of the
Figure 4.18: Streamlines drawn on the mean stochastic coarse-scale veloc-
ity.
deviation in the x-direction flux occurs immediately above and below the barrier, as seen in Fig. 4.19 (right).
The PDFs of the x-direction flux are computed at two spatial points; the second point (100, 60) has a smaller standard deviation. The PDFs are plotted in
Fig. 4.20. Notice that the PDFs at both points have several peaks (in contrast
with the previous case) pointing to the possibility of mode shifts existing. In
particular, Fig. 4.20 (right) is the PDF of a point that lies very close to the perme-
ability barrier and it clearly shows up to 10 distinct peaks. It appears that small variations in the location of the barrier result in significant changes in the flow patterns, leading to the multi-modal structure of the PDF.
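This multi-modal behavior can be understood with a toy model: when the output responds discontinuously to a smooth input (as the flow pattern does when the barrier crosses a grid line), a unimodal input density maps to a multi-peaked output density. A sketch with an artificial three-branch response (the response function is purely illustrative):

```python
import numpy as np

# A step-like response -- a stand-in for a flux that jumps as the fault
# location crosses a coarse-grid line -- turns a smooth uniform input
# density into a multi-modal output density.
rng = np.random.default_rng(3)
Y = rng.uniform(-1.0, 1.0, size=100000)

def response(y):
    # piecewise-constant branches plus a small smooth part: 3 modes
    return np.floor(1.5 * (y + 1.0)) + 0.05 * y

samples = response(Y)
hist, edges = np.histogram(samples, bins=60, density=True)

# Count well-separated peaks: occupied bins flanked by empty bins
occupied = hist > 0.5
n_clusters = int(np.sum(occupied[1:] & ~occupied[:-1]) + occupied[0])
```

Each branch of the response collects a fraction of the input probability into its own narrow peak, which mirrors the sharp-peaked PDFs observed near the barrier.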
The adaptive sparse grid strategy concentrates collocation points in regions of the stochastic space exhibiting steep variations and discontinuities. Most of the refinement proceeded along the dimension that characterized the uncertainty in the x-location of the permeability
Figure 4.21: The refinement of the adaptive sparse grid. Each row of points
represents the refinement at a particular level of interpolation.
The interpolation starts at the bottom.
barrier (i.e. Y1 ). We solved a problem with this as the only input stochastic vari-
able. Fig. 4.21 plots the evolution of the adaptive grid. Notice that after the 8th level of interpolation, most of the refinement is concentrated at 5 points. This
implies the definite existence of steep gradients or discontinuities around these
points in stochastic space. This is clearly seen in Fig. 4.22, where the PDF of the
coarse-scale flux in the x-direction is plotted at two spatial points. The PDF in both cases consists of 5 sharp peaks.
In Fig. 4.23, we illustrate this mode shift behavior by plotting two real-
izations of the coarse-scale stochastic flux (x-direction). These two realiza-
tions correspond to two points in stochastic space that are separated by 1%
(Y1 = 0.295, 0.305). This corresponds to a mode-shift located at Y1 = 0.30 as
seen in Fig. 4.21. The flow structure close to the permeability barrier is signifi-
cantly different for the two cases.
Figure 4.23: (left and center) Two realizations of the coarse-scale flux (x-
direction) for a 1% perturbation in the x-location of the perme-
ability barrier, (right) Difference between the two solutions.
The schematic of the domain is shown in Fig. 4.24. The region consists of a 250×150 ft2 domain where
the fine-scale permeability is given in 1 ft2 blocks. An injection well is located
at the bottom left corner of the domain while a production well is placed at the
top right corner of the domain. No flow boundary conditions are enforced on
all the four boundaries. The fine-scale discretization of the domain corresponds
to a 250 × 150 element triangulation.
Figure 4.24: Schematic of the 250 ft × 150 ft domain with a localized uncertain permeability block.
The fine-scale permeability in the domain is shown in Fig. 4.25. The deter-
ministic permeability is taken from the 10-th SPE comparative project [125]. The
permeability shows variation of 5 orders of magnitude. Furthermore, there is
a localized region of uncertain fine-scale permeability in the central portion of
the domain, corresponding to a 50 × 50 ft2 region. Aggressive coarse-scaling is performed.
Figure 4.25: The fine-scale log permeability distribution in the domain.
The axes are scaled (1 unit = 50 ft).
Two strategies (discussed in Chapter 3) are used to construct the stochastic input model: (a) if only a limited number of snapshots is available, a linear model reduction strategy is utilized to represent the fine-scale permeability in the domain, and (b) if a large number of snapshots are available, a non-linear model reduction strategy is utilized.
Limited number of snapshots
The covariance matrix is constructed from the available fine-scale permeability distributions and the eigenvalue problem solved. Fig. 4.26 plots the
eigen-spectrum of the covariance matrix. The first six eigenmodes accurately
represent the variability in the given data-set. These eigen-modes are utilized to
construct the corresponding low-dimensional representation of the stochastic-
fine-scale permeability distribution in the central block of the domain. The di-
mensionality of the representation is 6.
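This linear model-reduction step (eigen-decomposition of the snapshot covariance and truncation at the dominant modes) can be sketched on synthetic snapshots that truly span a 6-dimensional subspace; all sizes and the synthetic data are illustrative.

```python
import numpy as np

# Eigen-decomposition of a snapshot covariance matrix, keeping the few
# modes that capture (nearly all of) the variability in the data set.
rng = np.random.default_rng(4)
n_pix, n_snap, n_true = 400, 60, 6

# Synthetic snapshots lying exactly on a 6-dimensional linear subspace
modes_true = rng.standard_normal((n_pix, n_true))
coeffs = rng.standard_normal((n_true, n_snap))
snapshots = modes_true @ coeffs                 # columns = snapshots

X = snapshots - snapshots.mean(axis=1, keepdims=True)
cov = X @ X.T / (n_snap - 1)
lam = np.linalg.eigvalsh(cov)[::-1]             # descending eigenvalues

# Number of modes needed for 99.9% of the energy
k = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.999)) + 1
```

Because the synthetic data have rank 6, the eigen-spectrum drops to (numerical) zero after the sixth mode, just as the spectrum in Fig. 4.26 flattens after the dominant modes.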
Figure 4.26: The eigen-spectrum of the data set. The first 6 eigen-modes
represent most of the variation that the input data set exhibits.
Fig. 4.27 (left) plots the convergence of the stochastic pressure (mean and standard deviation) with increasing number of collocation points, while Fig. 4.27 (right) plots the convergence of the stochastic x-direction flux with increasing number of collocation points.
Fig. 4.28 plots the mean contours of the stochastic coarse-scale pressure and
flux in the x-direction. Note the highly dendrite-like structure of the x-direction
flux. Due to the large variation in the permeability, the flow tries to find the path
with the largest permeability resulting in this dendrite-like structure.
Fig. 4.29 plots the standard deviation contours of the coarse-scale stochastic
pressure and x-direction flux. Upstream of the uncertain permeability block, the
standard deviation of the coarse-scale pressure and velocity are both negligible.
Most of the deviation in the pressure occurs in the uncertain domain with its ef-
fects felt downstream of the uncertain domain. The velocity flux has maximum
deviation in the region of uncertainty and the deviation rapidly decays in the
rest of the domain.
The PDFs of the coarse-scale pressure and x-direction flux are computed at two spatial points. The first point (50, 50) is upstream of the localized uncertain region, while the second point (125, 75) is in the region of uncertain fine-scale permeability. The PDFs are
plotted in Figs. 4.30 and 4.31. For the point upstream, the probability distributions are peaked and have a very small range of variation. In comparison, the point in the region of uncertainty has a significantly larger range of variability.
Large number of snapshots
In this case, a data-set containing 1500 plausible fine-scale permeability distributions in the uncertain part of the domain is available. Based on the discussion in Chapter 3, the neighborhood graph G is constructed using these 1500 samples and its minimal spanning tree (MST) is used to estimate the intrinsic dimensionality of the dataset (see [21]). Fig. 4.32 plots the variation in the length functional of the
MST. The optimal dimensionality was estimated to be d = 5. The corresponding
low-dimensional representation of the stochastic-fine-scale permeability distri-
bution in the central block of the domain is constructed.
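The MST-based intrinsic-dimension estimate [21] exploits the Beardwood–Halton–Hammersley scaling L(n) ∼ n^{(d−1)/d}: the slope of log L versus log n determines d. A sketch on synthetic points lying on a 3-dimensional subspace of a 10-dimensional ambient space (the estimator details in the thesis may differ):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree."""
    D = squareform(pdist(points))
    return minimum_spanning_tree(D).sum()

# Samples on a d_true-dimensional linear manifold in a higher dimension
rng = np.random.default_rng(5)
d_true, d_ambient = 3, 10
basis = np.linalg.qr(rng.standard_normal((d_ambient, d_true)))[0]

sizes = [200, 400, 800, 1600]
lengths = [mst_length(rng.uniform(size=(n, d_true)) @ basis.T) for n in sizes]

# L(n) ~ n^((d-1)/d): the log-log slope gives the intrinsic dimension d.
slope = np.polyfit(np.log(sizes), np.log(lengths), 1)[0]
d_est = 1.0 / (1.0 - slope)
```

The ambient dimension plays no role in the estimate; only the intrinsic dimension of the manifold the samples live on sets the slope of the length functional.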
Figure 4.32: Plot of the length functional of the MST of the graph G for
various sample sizes. The intrinsic dimension is related to the
slope of the graph.
The adaptive sparse grid collocation strategy is then utilized to solve the resulting (stochastic + spatial domain) dimensional problem.
Fig. 4.33 plots the mean contours of the stochastic coarse-scale pressure and
flux in the x-direction. The mean results are very similar to the results obtained
by generating a data-driven model from a much smaller amount of initial information.
On the other hand, Fig. 4.34 plots the standard deviation contours of the
coarse-scale stochastic pressure and x-direction flux. The utilization of more information (a larger number of snapshots used to build the input model) results in an overall reduction in the standard deviation in the domain, though the general trends remain similar.
The PDFs of the coarse-scale pressure and x-direction flux are plotted at the same two spatial locations in Figs. 4.35 and 4.36. For the point upstream,
the probability distributions are peaked and have a very small range of varia-
tion. In comparison, the point in the region of uncertainty has a significantly
larger range of variability. Comparing Fig. 4.36 with Fig. 4.31 illustrates the
effect of the incorporation of more data into the construction of the stochastic
input model.
CHAPTER 5
CONCLUSIONS AND SUGGESTIONS FOR FUTURE RESEARCH
This thesis developed a stochastic variational multiscale framework coupled with data-driven input models of the permeability. The resulting multiscale SPDEs are solved utilizing an adaptive sparse grid collocation strategy. We showcase the complete formulation through realistic large-scale applications of flow through heterogeneous random media. This is (to the best knowledge of the author) the first complete stochastic multiscale analysis of flow through such media.
Although this thesis has addressed the fundamental issues of (a) solving stochastic partial differential equations in a computationally tractable form and (b) generating reliable stochastic input models from limited experimental data, several promising directions for future research remain; some of these are outlined below.
5.1 Stochastic inverse problems
5.2 Incorporating correlation statistics and investigating regu-
larization
In classical inverse problems [126, 127, 128, 129, 130, 131], the ill-posedness requires some form of regularization or smoothing assumptions for computing
the solution. But in the context of stochastic inverse problems, we are search-
ing for a solution in a distribution sense in a much larger space (tensor product
space of spatio-temporal and stochastic variations). This expanded search space
essentially guarantees a stochastic solution [137]. Nevertheless, regularization arguments play a very significant role in reducing the computational effort.
One possibility is to parametrize the stochastic design field as a Bernstein polynomial surface with $(m_{x_1} + 1) \times (m_{x_2} + 1)$ control points $P_{i,j}$ as

q(x_1, x_2, Y) = \sum_{i=0}^{m_{x_1}} \sum_{j=0}^{m_{x_2}} P_{i,j}(Y) B_i^{m_{x_1}}(x_1) B_j^{m_{x_2}}(x_2), where (5.1)

B_i^{m_x}(x) = \frac{m_x!}{i!(m_x - i)!} x^i (1 - x)^{m_x - i}. (5.2)
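Eqs. (5.1)–(5.2) can be sketched directly; for fixed control values the surface is a standard Bernstein (Bézier) tensor-product evaluation, and the partition-of-unity property of the basis means equal control values yield a constant surface.

```python
import numpy as np
from math import comb

def bernstein(i, m, x):
    """B_i^m(x) = m!/(i!(m-i)!) x^i (1-x)^(m-i), cf. Eq. (5.2)."""
    return comb(m, i) * x**i * (1.0 - x)**(m - i)

def bernstein_surface(P, x1, x2):
    """q(x1, x2) for one realization of the control values P
    ((m1+1) x (m2+1) array), following Eq. (5.1)."""
    m1, m2 = P.shape[0] - 1, P.shape[1] - 1
    B1 = np.array([bernstein(i, m1, x1) for i in range(m1 + 1)])
    B2 = np.array([bernstein(j, m2, x2) for j in range(m2 + 1)])
    return B1 @ P @ B2

# With all control values equal, the surface is constant everywhere
P = np.full((4, 5), 2.5)
val = bernstein_surface(P, 0.3, 0.7)
```

Coarsening or refining the control net (m_{x_1}, m_{x_2}) is exactly the spatial-smoothness knob discussed in the text.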
The design variables are now the stochastic control values $P_{i,j}(Y)$. These design variables can be subsequently represented in terms of $n_q$ collocated values as

P_{i,j}(Y) = \sum_{r=1}^{n_q} P_{i,j}(Y^r) L_r(Y). (5.3)
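The collocation representation in Eq. (5.3) is ordinary Lagrange interpolation through the collocated values: with n_q nodes it reproduces any polynomial of degree n_q − 1 exactly. A 1-D sketch (the nodes and the design variable below are illustrative):

```python
import numpy as np

def lagrange_basis(nodes, r, y):
    """L_r(y): equals 1 at nodes[r] and 0 at all other nodes."""
    L = 1.0
    for s, ys in enumerate(nodes):
        if s != r:
            L *= (y - ys) / (nodes[r] - ys)
    return L

nodes = np.array([-1.0, 0.0, 1.0])       # collocation points Y^r
P = lambda y: 2.0 + y + 0.5 * y**2       # a smooth design variable P(Y)
Pr = P(nodes)                             # collocated values P(Y^r)

# Eq. (5.3): P(Y) = sum_r P(Y^r) L_r(Y); exact here since P is quadratic
y = 0.37
interp = sum(Pr[r] * lagrange_basis(nodes, r, y) for r in range(len(nodes)))
```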
The sensitivity equations are now defined in terms of these nq deterministic pa-
rameters. Since the choice of the parametrization controls the spatial smooth-
ness of the solution, it is necessary to construct convergence estimates as the
spatial parametrization is varied; some preliminary work on such convergence estimates exists. An alternative approach notes that the correlation kernel C(x) determines the stochastic field and imposes some smoothness on its spatial variability. This will convert the problem of designing the stochastic field q to the problem of designing its correlation structure C(x). The
unknown correlation function is represented in terms of its spectral expansion
C(r) = \sum_{i=1}^{K} a_i \sin(i r / L), (5.4)
where L is the characteristic length of the system and {ai } is the set of K
Fourier modes of the correlation function (and we have assumed that the field
is isotropic, without loss of generality). By representing the correlation this way,
the stochastic optimization problem is converted to computing the modal coef-
ficients {ai }. The design parameters are now the set of K scalars {ai }. Given any
correlation kernel, the Karhunen-Loève expansion can be utilized to explicitly
represent the stochastic field as
q(x, \omega) = E(q(x, \omega)) + \sum_{i=0}^{N} \sqrt{\lambda_i} f_i(x) Y_i(\omega), (5.5)
where (λi , fi (x)) are the eigen-pairs of the correlation function. The stochastic
sensitivity equations are formulated in terms of these K deterministic scalars.
The choice of the number of modes K used in the representation of the unknown correlation provides a measure of the regularization: including higher frequencies reduces the smoothing. Investigating the effect of this regularization on the cost functional is important, along with developing a strategy for estimating the number of modes required based on the statistics of the measurements.
5.3 Stochastic low-dimensional modeling
A low-dimensional representation can either be created off-line or adaptively modified on-the-fly [133, 134, 135]. One approach seeks a decomposition of the dependent variable into stochastic spatial modes. Consider a random field u(x, t; Y) in a space-time domain and look for biorthogonal representations as
u(x, t; Y) = \sum_{i=1}^{K} b_i^{(h)}(t) a_i^{(h)}(x; Y). (5.6)
The modes are constructed from snapshots of the stochastic field (either from off-line calculations, available numerical data or experiments). It has been suggested that this stochastic counterpart of the POD framework will produce low-order ODEs (in terms of the temporal coefficients b_i^{(h)}(t)) that are independent of the stochastic dimensionality of the system. It is important that care
be taken in choosing the appropriate norm over which to construct the spatial
stochastic basis. As a preliminary step, one could utilize inner products based
on the mean, variance and standard deviation [139]:
\{b, c\}_0 := \int_D \langle b \rangle \cdot \langle c \rangle \, dx, \quad \{b, c\}_1 := \int_D \langle b \cdot c \rangle \, dx, \quad \{b, c\}_2 := \int_D \left( \langle b \cdot c \rangle - \langle b \rangle \cdot \langle c \rangle \right) dx. (5.7)
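The three candidate inner products in Eq. (5.7) can be approximated with sample means for ⟨·⟩ and a quadrature sum for the spatial integral; note that {b, c}_2 is, by construction, the difference of the other two. A sketch on synthetic correlated fields (grid and sample sizes are arbitrary):

```python
import numpy as np

# Discrete versions of the inner products in Eq. (5.7): <.> is the
# stochastic mean, the integral is over space. Fields are sampled on a
# grid: axis 0 = space, axis 1 = random samples.
rng = np.random.default_rng(6)
nx, ns = 50, 4000
dx = 1.0 / nx
b = rng.standard_normal((nx, ns)) + 1.0      # unit-mean random field
c = b + 0.5 * rng.standard_normal((nx, ns))  # correlated with b

mean_b, mean_c = b.mean(axis=1), c.mean(axis=1)
ip0 = np.sum(mean_b * mean_c) * dx                        # {b, c}_0
ip1 = np.sum((b * c).mean(axis=1)) * dx                   # {b, c}_1
ip2 = np.sum((b * c).mean(axis=1) - mean_b * mean_c) * dx # {b, c}_2
```

For these fields {b, c}_0 measures only the mean structure (≈ 1 here), {b, c}_1 also includes the fluctuation energy (≈ 2), and {b, c}_2 isolates the covariance, which is why the choice of norm changes which features the extracted modes emphasize.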
The unknowns are the deterministic coefficients {b_i}. Inserting this representation, Eq. (5.6), into Eq. (2.3) and using the orthogonality properties of the modes results in a set of ODEs for the modal coefficients {b_i}:
(B(\sum_{i=1}^{K} b_i^{(h)}(t) a_i^{(h)} : \{q, \alpha\}), b_j^{(h)}) = 0, \quad j = 1, \ldots, K, (5.10)

\frac{\partial b_j^{(h)}}{\partial t} = F_j(b_j^{(h)}), \quad j = 1, \ldots, K. (5.11)
The cost functional, its gradient and the stochastic optimization problem can
now be posed in terms of this set of ODEs instead of the set of SPDEs.
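The biorthogonal representation in Eq. (5.6) can be computed from snapshots by an SVD: flattening the (space × stochastic) degrees of freedom into columns splits each mode into a temporal coefficient b_i(t) and a stochastic spatial mode a_i. A sketch on synthetic data (the field construction and sizes are illustrative):

```python
import numpy as np

# Biorthogonal decomposition via SVD of the snapshot matrix:
# rows = time instants, columns = (space x stochastic) degrees of freedom.
rng = np.random.default_rng(7)
nt, nxy = 40, 300
t = np.linspace(0.0, 1.0, nt)

# Synthetic field with two dominant space-time structures plus noise
U = (np.outer(np.sin(2 * np.pi * t), rng.standard_normal(nxy))
     + 0.3 * np.outer(np.cos(2 * np.pi * t), rng.standard_normal(nxy))
     + 0.01 * rng.standard_normal((nt, nxy)))

W, S, Vt = np.linalg.svd(U, full_matrices=False)
K = 2
b = W[:, :K] * S[:K]        # temporal coefficients b_i(t)
a = Vt[:K]                  # orthonormal stochastic spatial modes a_i

U_K = b @ a                 # rank-K biorthogonal reconstruction
rel_err = np.linalg.norm(U - U_K) / np.linalg.norm(U)
```

Because only K temporal coefficients evolve, the reduced dynamics (Eqs. 5.10–5.11) are independent of the stochastic dimensionality carried inside the modes a_i.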
There has been recent progress on developing reduced order models to control
systems defined by PDEs [133, 134, 135]. In such approaches, an adaptive pro-
cedure is used that successively updates the ensemble of data snapshots and
the reduced order model during the optimization procedure. It would be help-
173
BIBLIOGRAPHY
[12] D. Xiu, G. E. Karniadakis, Modeling uncertainty in flow simulations via generalized polynomial chaos, J. Comput. Phys. 187 (2003) 137–167.
[16] D. Xiu, J. S. Hesthaven, High-order collocation methods for differential equations with random inputs, SIAM J. Sci. Comput. 27 (2005) 1118–1139.
[24] V. de Silva, J. B. Tenenbaum, Global versus local methods in nonlinear di-
mensionality reduction, Advances in Neural Information Processing Sys-
tems 15 (2003) 721–728.
[28] J. A. Costa, A. O. Hero, Geodesic Entropic Graphs for Dimension and En-
tropy Estimation in Manifold Learning, IEEE Trans. on Signal Processing
52 (2004) 2210–2221.
[36] C. L. Winter, D. M. Tartakovsky, A. Guadagnini, Moment equations for
flow in highly heterogeneous porous media. Surv. Geophys., 24 (2003) 81-
106.
[46] D. Xiu, D. Lucor, C.-H. Su, G. E. Karniadakis, Performance Evaluation
of Generalized Polynomial Chaos, International Conference on Computa-
tional Science, Lecture Notes in Computer Science, Vol. 2660, pp. 346-354,
Springer, 2003.
[49] E. Novak, K. Ritter, The curse of dimension and a universal method for
numerical integration, Multivariate approximation and splines, G. Nurn-
berger, J. W. Schmidt, G. Walz (eds.), 1997, 177–188.
[58] A. Klimke, Sparse Grid Interpolation Toolbox – User’s Guide, IANS re-
port 2006/001, University of Stuttgart, 2006.
[59] http://mpdc.mae.cornell.edu/Software.
[60] Y. Shen, P. Tong, K.-Q. Xia, Turbulent convection over rough surfaces, Phys. Rev. Lett. 76 (1996) 908–911.
[69] L. Arleth, S. Marčelja, T. Zemb, Gaussian random fields with two level-
cuts– Model for asymmetric microemulsions with nonzero spontaneous
curvature, The Journal of Chemical Physics 115 (8) (2001) 3923–3936.
[70] http://www.mathworks.com/products/matlab/.
[71] F. F. Abraham, J. Q. Broughton, N. Bernstein, E. Kaxiras, Spanning the
Length Scales in Dynamic Simulation, Computers In Physics 12 (1998)
538–546.
[82] C. Goffman, G. Pedrick, First course in functional analysis, Prentice-Hall,
2002.
[83] J. Beardwood, J.H. Halton, J.M. Hammersley, The shortest path through
many points, Proc. Cambridge Philos. Soc. 55 (1959) 299–327.
[85] C-G. Li, J. Guo, G. Chen, X-F. Nie, Z. Yang, A version of Isomap with explicit mapping, Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, Aug 2006.
[95] T. Arbogast, S. E. Minkoff, P. T. Keenan, An operator-based approach
to upscaling the pressure equation, Computational Methods in Water
Resources XII, Computational Methods in Contamination and Remedi-
ation of Water Resources Vol. 1, Computational Mechanics Publications,
Southampton, UK (1998).
[103] T. Y. Hou, X. H. Wu, A multiscale finite element method for elliptic prob-
lems in composite materials and porous media, J. Comput. Phys. 134
(1997) 169-189.
[106] Y. Efendiev, V. Ginting, T. Y. Hou, R. Ewing, Accurate multiscale finite
element methods for two-phase flow simulations, J. Comput. Phys., sub-
mitted.
[107] J.E. Aarnes, V. Kippe, K.A. Lie, Mixed multiscale finite elements and streamline methods for reservoir simulation of large geomodels, Adv. Water Resour. 28 (2005) 257–271.
[115] Ch. Schwab, R. A. Todor, Sparse finite elements for stochastic elliptic problems–higher moments, Computing 71 (1) (2003) 43–63.
[116] Ch. Schwab, R. A. Todor, Sparse finite elements for elliptic problems with stochastic loading, Numerische Mathematik 95 (2003) 707–734.
[117] T. Petersdorff, Ch. Schwab, Sparse finite element methods for operator
equations with stochastic data, Applications of Mathematics 51 (2006)
145–180.
[118] F. Brezzi, M. Fortin, Mixed and hybrid finite element methods, Springer-Verlag, 1991.
[119] P.A. Raviart, J.M. Thomas, A mixed finite element method for second order elliptic problems, in: Mathematical Aspects of the Finite Element Method, Lecture Notes in Mathematics 606, Springer-Verlag, Berlin, 1977.
[124] J.W. Jennings, S.L. Bryant, Outcrop permeability statistics and implications
for fluid flow and reactive transport modeling, Bureau of Economic Geol-
ogy, U. Texas Austin, 2000 report.
[125] M.A. Christie, M.J. Blunt, Tenth SPE comparative solution project: A com-
parison of upscaling techniques, SPE Reserv. Eval. Eng. 4 (2001) 308–317.
[131] V. A. Morozov, Methods for Solving Incorrectly Posed Problems.
Springer-Verlag, New York, 1984.
[135] K. Ito, S.S. Ravindran, Reduced Basis Method for Optimal Control of Un-
steady Viscous Flows, International Journal of Computational Fluid Dy-
namics 15 (2001) 97–113.