SparseDA ICNSC2013

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

Sample Size Reduction in Groundwater Surveys Via

Sparse Data Assimilation


Zaeem Hussain Abubakr Muhammad
Department of Computer Science Department of Electrical Engineering
LUMS School of Science and Engineering LUMS School of Science and Engineering
Lahore, Pakistan Lahore, Pakistan
Email: [email protected] Email: [email protected]

Abstract—In this paper, we focus on sparse signal recovery techniques either do not explicitly incorporate model dynamics
methods for data assimilation in groundwater models. The or make additional assumptions about the system. In one of
objective of this work is to exploit the commonly understood the earliest examples of such considerations, the work by
spatial sparsity in hydrodynamic models and thereby reduce
the number of measurements to image a dynamic groundwater Jafarpour et al [6] focuses specifically on hydrogeological
profile. To achieve this we employ a Bayesian compressive sensing reservoir systems. It uses a Discrete Cosine Transform (DCT)
framework that lets us adaptively select the next measurement parametrization for states as such systems naturally exhibit
to reduce the estimation error. An extension to the Bayesian a sparse representation in the spatial domain. They use the l1
compressive sensing framework is also proposed which incorpo- norm optimization technique to promote sparse signal recovery
rates the additional model information to estimate system states
from even lesser measurements. Instead of using cumulative and performs data assimilation by minimizing the l2 norm of
imaging-like measurements, such as those used in standard the observations with the estimated states. History matching
compressive sensing, we use sparse binary matrices. This choice is done by placing weights and regularization parameters
of measurements can be interpreted as randomly sampling only a on each of the terms and the collective cost function thus
small subset of dug wells at each time step, instead of sampling obtained is minimized to estimate the system state vector.
the entire grid. Therefore, this framework offers groundwater
surveyors a significant reduction in surveying effort without However, there is no procedure defined for setting or adjusting
compromising the quality of the survey. regularization coefficients and weights and the paper mentions
that their value can be set from prior belief. Another example
I. I NTRODUCTION of using compressive sensing like ideas is [13] which studies
The focus of this paper is to investigate techniques for the application of the compressive sensing procedure for soil
state estimation using a small number of measurements and moisture levels.
to propose a framework for data assimilation using those The framework introduced in [10] also exploits the sparsity
techniques, with a special emphasis groundwater monitoring. of a given dynamical system in some transform domain and
For many geophysical processes, groundwater being a case in merges it with compressed sensing framework. It works by
point, the number of available measurements is limited. Data defining a support set of the system state as the indices at
collection for groundwater models is especially cumbersome which the transform of the state vector is non zero. This
because the measurement process for groundwater level, flow framework then runs a reduced order Kalman filter for the non
or contamination involves digging of wells, collection of water zero elements of the state transform coefficients corresponding
samples and subsequent chemical analysis. Digging wells for to the support set. As the support set changes over time, this
the entire field may not be an economically justified option. change is detected by the error between the estimate and the
Even if the wells are already dug (as is the case for agricultural observations. The change is then estimated by performing the
lands in many provinces of Pakistan and India), collecting data l1 norm minimization of the vector that represents the supports
still is an expensive and time consuming exercise. Hence a data set, based on the assumption that the sparsity pattern of the
assimilation technique that makes the most of the limited data state vector changes slowly over time and thus the vector
available is most desirable. Another desirable feature of such representing change in the non zero indices will be sparse.
a technique would be the adaptive nature of its measurements Thus this framework makes the additional assumption about
so that we can theoretically predict which observations points the system that the sparsity pattern of the system states changes
are better than others to conduct the next round of survey such slowly with time. Moreover, it also requires knowledge of the
that the new observations would reduce the estimation error support set for the initial state.
the most. This work differs from all previous approaches in two
Recently, techniques have been introduced for signal re- respects. First, we have tackled the issue of point measure-
construction that exploit a sparse representation in some ments instead of imaging-like cumulative measurements used
transform domain. Most of such techniques have been inspired in conventional compressive sensing. This opens the avenue
by the Compressive Sensing (CS) paradigm. However, these for many other geophysical applications where sweep mea-
surements are not available but where sensors probe the field We now state the problem as follows. What is the minimum
directly at sample points. Secondly, we have given a Bayesian number of measurements M to estimate the joint pdf of
framework that tracks not just point estimates (as in the case of parameters and model state, given a dynamical model with
l1 -optimization in CS) but also their probability distribution. known uncertainties and that for a fixed time, the field ψ(x, .)
This enables one to study adaptation (between and during field is spatially sparse?
measurements) as well as a formal way to incorporate sparse
III. R ECONSTRUCTING S PARSE F IELDS
measurements into a Bayesian filtering framework. Although
the later has not been been fully developed in this paper, The compressive sensing paradigm works by exploiting the
glimpses of both adaptation and model incorporation have prior knowledge that the signal to be recovered is sparse or
been demonstrated on two examples via simulation results. compressible in some transform domain. Consider an N × 1
In both examples, one a toy 1D-heat equation and the other a column vector x in RN . The signal x can be represented in
2D contamination spread model in groundwater, we have ex- another basis of N × 1 vectors {Bi }N
i=1 as
ploited the diffusive nature of the dynamics and demonstrated N
X
adaptation to changing sparsity and model incorporation. x= si Bi , or x = Bs, (3)
i=1
II. P ROBLEM F ORMULATION
where s is the column vector of the weights si = BiT x, and
We now formulate the problem following the standard B is the N × N transformation matrix with Bi as its columns.
data assimilation terminology [5]. Given a domain D with Let y be the M × 1 column vector of measurements given by
boundary ∂D parameterized by coordinates x ∈ Rd , the
spatio-temporal dynamics of a groundwater-like geophysical y = Φx = ΦBs = θs, (4)
model are given by where Φ is the M ×N measurement matrix. In our setup, x can
∂ψ(x, t) be thought of as a grid representation of a static field ψ(x, .)
= G(ψ(x, t), α(x)) + u(x, t). (1) and Φ is a matrix model for the measurement functional M
∂t
introduced in Equation (2). A good choice of transform B
Here, the field to be estimated at each time step is ψ(x, t) ∈ to model spatial sparsity is the DCT. A vector representation
RN , where N is the number of model state variables de- of a signal s is called K-sparse if it has only K non-zero
scribing the field, each variable dependent on time and space. coefficients. This is measured by the l0 norm of the signal
G(ψ, α) is the nonlinear model operator, α(x) are spatially ksk0 .
varying parameters and u is a deterministic control input or
a noisy forcing function. Appropriate initial and boundary A. Standard Compressive Sensing Framework
conditions are also needed to solve the model. In most data The general problem of reconstructing x from y with the
assimilation problems, parameters α(x) are also unknown measurement matrix Φ is equivalent to the problem of recover-
or at best poorly known. In groundwater flow models these ing s from y with the transformed M ×N measurement matrix
parameters correspond to ground porosities or other geological θ. Thus to perform reconstruction using the compressive
structures. Here, we do not use this distinction between state sensing framework, we choose a transform domain in which s
and parameters and augment the state if such a situation arises. has a highly sparse representation. The problem of recovering
The observation process is defined generically as a measure- N × 1 column vector s from the M << N measurements
ment functional M[ψ]. In this paper, we focus on measurement in y is in general under-determined and a unique solution
functionals of the form to the problem does not exist. However, if the signal to be
Z Z recovered is sparse, and the indices of the non zero elements
Mi [ψ] = ψ T (x, t)δψi δ(t − ti )δ(x − xi )dtdx, (2) in the signal are known, then recovery is possible if the
number of measurements is greater than the number of non
which model point-sampling the field at location xi and time zero elements in the signal to be recovered. A necessary and
ti ). To simply notation, we will denote such measurements as sufficient condition for this to hold is that multiplication of θ
an M -dimensional vector y ∈ RM . Assumptions on sensor with any vector v that has the same non zero elements as s
noise can also be built into this simplified model. should not alter the length of v:
We make one additional assumption on this standard data
||θv||2
assimilation framework. In many geophysical problems and 1−≤ ≤ 1 + ,
in particular due to the diffusive properties of fluid flows, it ||v||2
is reasonable to assume that the field to be estimated will for a small  > 0. If the indices of the non zero elements
not have very sharp spatial variations. This can be technically in the vector to be recovered are not known, then a sufficient
captured by saying that for a fixed time, the field ψ(x, .) is condition for recovery is that (5) should hold for any arbitrary
spatially sparse in a known transform domain. Specifically, if vector with three times the number of non zero elements as in
we take a Fourier-like transform of ψ(x, .), then the field can the original signal. This is known as the Restricted Isometric
be reconstructed using the knowledge of only a fraction of the property [3], [4]. Another requirement for recovery of the
transform coefficients. signal is that the columns of B should not be combinations of
a small number of rows of Φ and also the other way round. In constructed in this way with a different transformation basis,
other words, the two matrices should be incoherent and this the reconstruction results that they have reported indicate that
condition is referred to as incoherence [3], [4]. such matrices should work well in practice with the right
To construct a measurement matrix which satisfies the amount of available data.
above conditions would mean checking (5) for all possible
arrangements of the non zero elements in the RN vector C. Bayesian Compressive Sensing
space. However, we can simply select Φ as a random matrix The sparse signal recovery problem discussed earlier can
whose entries are chosen from independent and identically also be viewed from a Bayesian viewpoint. The framework is
distributed Gaussian variables [3], [4]. demonstrate that such inspired by work on Relevance Vector machines (RVM) [9]
a matrix would satisfy both the RIP and the incoherence and provides the probability distributions for the unknowns
property with high probability. The recovery algorithm itself instead of the point estimates provided by l1 norm based
is usually via an optimization program that minimizes the optimization techniques. Thus, this framework provides addi-
ksk0 while obeying the basic measurement equation (4). The tional information about the unknown signal and the resultant
breakthrough provided by CS framework is that this difficult probability distributions can also be used for further analysis.
combinatorial problem can be solved by a relaxation using It starts from the same linear regression problem
the l1 norm instead [2]. The success of the CS framework
also lies in the fact that such a recovery is possible with M of y = θs + n, (5)
the order of K log(N/K), for a K-sparse signal which is an where n is additive sensor noise. The RVM works by assuming
astonishingly low number compared to measuring x directly a prior distribution on s for a sparse estimate and gives the
using N measurements. distributions for the non zero weights in s.
B. CS with Sparse Measurement Matrices Assuming the additive noise n to be zero mean Gaussian
with variance σ 2 , the distribution for y can be written as [7]
In the standard framework described above, typical CS
measurements created from Gaussian random matrices are 1
p(y|s, σ 2 ) = (2πσ 2 )−K/2 exp(− ||y − θs||2 ), (6)
themselves dense. This means that each measurement is a 2σ 2
cumulative or weighted average of all the states variables. This where K, as before, counts the number of non zero elements
goes against the requirement of point measurements that are in s or its sparsity. This is the Gaussian distribution for the
themselves often very sparse for groundwater monitoring, as training data y using the Bayesian approach.
explained above. Thus the measurement matrix itself needs Now we introduce prior knowledge that s is sparse by
to be sparse, in addition to satisfying RIP and incoherence placing a prior distribution on s that promotes sparsity. A
properties. hierarchical prior is imposed on s which has been observed
The construction of sparse matrices can be done in multiple to yield sparse representations for s [9]. First, a zero mean
ways. Although, we are forced to place all zeros in the columns Gaussian prior is defined on s
for which the data points are unavailable, we are free to place
N
non-zero values in the rest of the columns in any way we like. Y
p(s|α) = N (si |0, αi−1 ), (7)
For example, we can randomly place a certain number of ones
i=1
in each of these columns at a random row numbers. We can
also place a single one in each of these columns in a different with αi as the inverse of variance of each element. The αi are
row number for every column. This would mean there is only a further drawn from a Gamma distribution. Next we obtain an
single one in each row and the number of rows will be equal to expression for the posterior p(s, α, σ 2 |y) by decomposing it
the number of elements of x that are available for observation. as
Such a matrix is more sparse than the one which randomly p(s, α, σ 2 |y) = p(s|y, α, σ 2 )p(α, σ 2 |y). (8)
places ones in the free columns but is certainly more desirable
The first part of (8) can be computed as
for storage purposes. Our simulation results have shown that,
for the missing data case, it does not make a difference if the p(y|s, σ 2 )p(s|α)
p(s|y, α, σ 2 ) = , (9)
binary matrix contains multiple ones in the columns for which p(y|α, σ 2 )
the data is available or if it contains a single one in a different
1
row in each column. We borrow this construction from [1]. = Cexp(− (s − µ)T Σ−1 (s − µ)), (10)
Thus, in our simulation and experiments, we use sparse binary 2
matrices for measurement which have as many rows as the where the mean and covariance are given by
number of points whose observation is available and contain
a single one in each row for each of the available data points. µ = σ −2 Σθ T y, (11)
It is worth noting here that the Binary matrices constructed Σ = (σ −2 θ T θ + B)−1 , (12)
as defined above have also been used in [13] which studies
the application of the compressive sensing procedure for soil where B = diag(α1 , α2 , ..., αN ). The second part of (8)
moisture levels. Although, the paper uses the binary matrices p(α, σ 2 |y) is approximated by a delta distribution at its
1
most probable values [9]. This would mean searching for the = − log |B + σ −2 θ T θ| + const. (18)
maximum values of p(α, σ 2 |y) and since 2
This expression depends on the measurement matrix θ with M
p(α, σ 2 |y) ∝ p(y|α, σ 2 )p(α)p(σ 2 ), (13) rows, the estimates for α and σ −2 and some other constants.
Now suppose we wish to add another row rM +1 to θ.
we maximize p(y|α, σ 2 ) as the distributions p(α) and p(σ 2 )
The effect of adding this new row would be to change the
are considered uniform because of the choice of parameters
differential entropy of x to [7]
for their priors. Now p(y|α, σ 2 ) is computable and given by
1
hM+1 (x) = hM (x) − log(1 + σ −2 rMT
Z
+1 ΣrM +1 ). (19)
p(y|α, σ ) = p(y|s, σ 2 )p(s|α)ds,
2
(14) 2
To minimize the uncertainty in x, we maximizing the term
1
= C|σ −2 I+θB −1 θ T |−1/2 exp(−y T ( (σ −2 I+θB −1 θ T )−1 y) T
rM
2 +1 ΣrM +1 = Var(yM +1 ), (20)
(15)
Now we need to obtain values for α and σ 2 which maximize which shows that the next projection rM +1 should be con-
the function in (15). As described in [9], this can be done by structed in such a way that the next measurement yM +1 comes
an iterative algorithm. Finally, the posterior distribution for the from the most uncertain subspace. This can be done simply
T
original signal x is also a multivariate Gaussian given by by an eigen-decomposition of Σ and selecting for rM +1 the
eigenvector with the largest eigenvalue [7].
E[x] = Bµ, Cov[x] = BΣB T . (16) This procedure can be directly adopted in the standard
CS framework with dense measurement matrices. However,
Recent work on the theoretical analysis of Relevance Vector this creates a problem for constructing sparse binary matrices
machines has shown that RVM in fact provides a tighter adaptively because the eigen-analysis of Σ will almost always
approximation to the l0 norm based optimization than the give dense eigenvectors. To take care of this, we look for the
l1 norm optimization technique[11], [12]. The iterative RVM nearest sparse binary approximation to the suggested rM +1
algorithm has also been reported to perform better or at using standard dot products to the basis frame. We have
least as good as other l1 norm based optimization algorithms. not done a formal analysis on performance loss due to to
Further, the Bayesian framework gives us a handle on the this approximation but in practise this works well for most
entire distribution rather than a point estimate. This means problems we have done simulations on.
that the RVM based compressive sensing is more suitable
alternative to the more popular l1 norm minimization based B. Incorporating Model Dynamics
techniques. This also opens the door to investigating how the The Compressive sensing framework and its Bayesian coun-
future measurements can be designed to improve the estimates terpart are used to estimate an individual sparse signal from
of the more uncertain points. its limited measurements. In order to estimate a sequence of
IV. A DAPTATION AND DYNAMICS signals varying over time, when the dynamics of the changing
system acting on the signal are also known, the framework
The measurement matrix Φ (or θ in the transform domain) needs to extended to incorporate the model dynamics as well.
for the compressive sensing techniques described earlier is As seen in the section on Bayesian Compressive sensing,
describes a predetermined and fixed sensing strategy and the framework imposes the sparsity constraint on the signal
does not involve adaptation between individual measurements. by assuming a hierarchical prior. The known model dynamics
Since the measurement matrices that we use are not dense but can be viewed as additional information about the signal. The
binary (signifying a temporal sequence), we explore whether standard way to proceed from here would be to use the Bayes
such matrices can be adaptively designed, i.e. adaptively Filter algorithm [5] to incorporate this prior. The filter can be
adding rows to the matrix while ensuring that the estimate implemented using an appropriate estimation technique such
obtained keeps getting more accurate. The advantage of such as a variant of the Kalman filter or the non-parametric particle
a framework is twofold. First it lets us adaptively design filter. For details of this formal development see [8].
our survey points during a static reconstruction of the field. In this paper, we take a more practical view point and outline
Secondly, it gives us clue on how to incorporate previously a method that works for our groundwater-like problems. Sup-
learned state history in designing new measurements, after pose that the grid model of the field evolves from xt := ψ(x, t)
the field has changed due to its model dynamics. to xt+1 := ψ(x, t+∆t) following Equation 1. Further, assume
that this evolution is a linear transformation of the form
A. Adaptive Bayesian Compressive Sensing
The Bayesian CS framework produces a multivariate Gaus- xt+1 = Axt + qt , (21)
sian posterior on the signal x and as measure of the uncertainty where qt models the process noise in model dynamics. Let us
in the signal, the differential entropy of x is given by [7] also assume that initially xt has been reconstructed as mean x̂
and covariance Σt using the Bayesian compressive framework
Z
hM (x) = − p(x) log p(x)dx (17) described above. The standard Bayes’s filtering methodology
would be to first create a prediction x̂pt+1 = Ax̂t and then
correct using the measurement pdf evaluated for the next time
step. We instead use the following procedure which follows
this predict and correct methodology in spirit but is more
explicit in using the adaptation introduced in earlier sections.
We first create pseudo-measurements from the predicted
state by
p
yt+1 = Φt Ax̂t . (22)

Note that Φt is the complete binary measurement matrix used


for compressive sensing of initial field xt . Next, we append
the actual measurements received (so far) in the current step
a
yt+1,M = Φt+1,M xt+1 . (23)

We then append the two measurement classes together to get


 p 
yt+1 (a)
yt+1,M = a (24)
yt+1,M .
Following the procedure in the previous section, we design the
next measurement (or measurement set) by maximizing the
variance of yt+1,M . This method makes some approximations
to the optimal estimation method in that the correction step
is not followed to the letter. Practically, we find that this
approximation does not prevent us from obtaining reasonably
good results. A full investigation of this issue has been taken
up in [8].
It should be noted that we currently limit the application
of this framework to linear models only. This is because, as
described in the section on Bayesian CS, the distribution of (b)
the resulting estimate is multivariate Gaussian. To obtain the
pseudo measurements, this estimate is propagated to the next Fig. 1. (a) The temperature state of a one dimensional heat rod at each time
instant. (b) A 3-D plot showing the transition of the temperature state of the
time-step and in a linear model this propagated estimate will rod. t denotes the time axis and x denotes the spatial axis.
also be a multivariate Gaussian. However, for the more general
non-linear case, we are currently not be able to say much
about the distribution of the propagated estimate. However, in A. One-dimensional Diffusion
principle, the framework should be extendable either through The first problem we consider is that of a one dimensional
linear approximations or via non-parametric filtering methods. heat equation as a toy model for contaminant diffusion in
This framework is also expected to give better results fluids. Here, the system states are the temperatures of 200
with same or less measurements than performing CS with evenly spaced points on a one dimensional rod. Hence, x is a
l1 optimization or the Bayesian CS which are applied on 200 × 1 vector of temperature of the rod and s is the vector of
individual signals without assuming any additional knowledge DCT transform of x. Our measurement matrix, Φ, is binary
of the system’s model based dynamics. This extension of the because of the problem statement, i.e. the observation points
framework has mainly been done to bring about improvement are limited. The temperature at 10 time instants is measured
in results whenever we have this additional information about and the above framework is used to estimate s individually
the system dynamics available. for each time instant. Figure 1 shows the original temperature
of the rod. As we can see that with time the temperature state
V. R ESULTS AND S IMULATION
of the rod becomes smoother and thus its sparsity pattern in
We now apply the l1 norm optimization as well as the the DCT domain improves.
Bayesian compressive sensing techniques on two examples 1) Compressive Sensing using l1 minimization: We consti-
where system states are linearly evolving over time and only tute the Binary measurement matrix Φ by randomly selecting
a limited number of points are available for observation. We around 25 of the 200 points of the rod and consider them as the
first use these techniques to reconstruct individual signals and points that are available for observation. Thus Φ contains the
then use the model incorporating technique and see whether rows from the 200 × 200 identity matrix corresponding to the
it reduces the number of measurements than those required locations of the points considered available for measurement.
when prior model information for a signal is not available. The results by reconstruction using this measurement matrix
Fig. 2. (a) Estimates of temperature state of the rod using l1 norm Fig. 3. Estimates of temperature state of the rod using Bayesian compressive
minimization. sensing.

and the l1 norm optimization are shown in Figure 2. As can


be seen that the initial estimate is not very accurate. This is
because as we already mentioned the sparsity pattern of the
DCT transform of the temperature states is initially poor but
improves over time, and so do the estimates.
2) Bayesian Compressive Sensing: We use the same mea-
surement matrix as in the previous section and the number
of points is also kept same. The estimation algorithm used
is the faster RVM algorithm. and the MATLAB code for it
is available at http://www.ece.duke.edu/ shji/BCS.html. This
code defines a function which takes as input the measurement
matrix, measurement and the initial noise variance. The output
contains the expected values of the non zero elements of the
estimates as well as their variances. The initial noise variance
is taken to be variance of the measurement divided by 100.
The results of this estimation procedure are shown in Figure 3.
The results of estimation using Bayesian compressive sensing
are slightly better than the l1 norm based optimization as can Fig. 4. Estimates of temperature state of the rod using Bayesian compressive
be seen in the figure that the estimates of the state temperature sensing along with information about the model dynamics.
are smoother.
3) Using information of Model dynamics: This framework
assumes that the model dynamics are known and in the case
of the heat rod (21) is linear and is given by a matrix upon same and this shows that the framework does improve on the
discretization of the heat equation for the rod. Using this requirement for the number of measurements by using the
knowledge, the framework is applied to this example. Since prior information of a system’s state. The results are shown in
this framework uses the additional prior information about Figure 4.
the system’s previous state, it should, theoretically, require Finally we show the relative errors in estimation plotted
lesser measurements to give the same results as CS with l1 against time of each of the three techniques in Figure 5. It can
optimization and Bayesian CS. The results indicate that the be seen that the Bayesian compressive sensing technique and
framework indeed requires lesser measurements to give the its extension using the model information have smaller errors
same performance as the above two methods. The figures than compressive sensing based on l1 norm minimization. The
shown below were generated by estimates that used only 15 extension to the Bayesian Compressive sensing framework
measurements from the one dimensional rod after initially which uses the additional information about how the model
taking 25 measurements to obtain a good estimate of the initial is evolving achieves the same estimation accuracy despite
state. Although the measurements were much less than those measuring the states at lesser points than the Bayesian CS
taken in the previous trials, the final result and error are the technique (15 compared to 25 points used by BCS).
Fig. 5. Relative errors in the estimates of temperature states of the one
dimensional rod. Estimates were obtained using l1 optimization, Bayesian Fig. 6. Groundwater flow configuration in the two-dimensional 60 × 60
Compressive sensing without model information and Bayesian CS with model grid. Arrows indicate the direction of flow groundwater. The color indicates
information. the level of groundwater which is highest on the west side and lowest on the
eastern end of the region.

B. Groundwater Contaminant in a 2-Dimensional grid


The next problem we consider is that of a groundwater
model. We consider a 60 × 60 two-dimensional grid giving
a total of 3600 points or cells. The system state here is the
amount of a particular contaminant in the groundwater in
each smaller cell. Because such a state would be spatially
smooth, the Discrete Cosine Transform is once again a good
transform domain for this example as the system state would
exhibit a sparse representation in this domain. Assuming
system precipitation and permeability in the region, Darcy flow
equations can be used to construct a matrix that describes the
groundwater flow in the grid. The flow configuration in the
grid is shown in Fig 6. We initially assume there is a sharp
contaminant plume on the west side of the grid and this plume Fig. 7. Simulated 2D plot of the contaminant level over 50 years in the
spreads and moves further in the grid according to the flow groundwater. The color here indicates the amount of contaminant in a cell in
configuration over time. The simulation is run for a period of the groundwater.
fifty years and the final position of the plume is noted. The
initial position of the contaminant plume and snapshots of its
movement during this time are shown in 7.
We first perform the Bayesian CS framework on system
state without assuming any information about the model
dynamics. The system state is measured at about 400 points
chosen randomly from the 3600 in the grid. This is based
on the assumption that in many real life examples such as
this, it is not practically possible to have all the points in
a spatial grid available for measurement. This is especially
true about groundwater models where each measurement has
to be performed by digging up a well or by boring a deep
hole in the ground. It would cost too much to perform such
a procedure for each point on the grid in question. Thus,
about 400 measurements are performed from points chosen
randomly and the system state is estimated. The result is Fig. 8. 2D plot of the estimated contaminant level over 50 years in the
shown in Fig 8. Finally we test our proposed extension to groundwater. Estimates are obtained using Bayesian compressive sensing
without using knowledge about the aquifer flow configuration
the Bayesian CS framework which incorporates the model
are worth exploring further. One direction could be a detailed
mathematical analysis of the framework that utilizes the model
information along with sparse signal recovery. Theoretical re-
sults that demonstrate the usefulness of this technique in reduc-
ing the number of measurements required for a good enough
recovery may be derived. Another area worth exploring is
the adaptive techniques for both compressive sensing and
its Bayesian counterpart. Although some of the preliminary
results we have obtained in this regard indicate that adaptively
choosing the next points for measurements generally gives the
same estimation errors as randomly choosing the available
points beforehand, there is certainly need for this aspect to
be explored further. If some adaptive technique does improve
Fig. 9. 2D plot of the estimated contaminant level over 50 years in the the estimates more than random measurements in some cases,
groundwater. Estimates are obtained using Bayesian compressive sensing also it could be used with the extension we have developed to
making use of the additional knowledge about the aquifer flow configuration adaptively select the next points for measurements. This would
mean that in the practical problem of groundwater survey, for
example, we can use the adaptive scheme to direct future
dynamics information along with the sparsity constraints.It surveys towards those points whose measurements, if used,
is important to note the the matrix A2 in (24) is binary in would provide the best reduction in the estimate error.
keeping with the constraint that only a limited number of
points are available for measurement. This framework is able ACKNOWLEDGMENT
to achieve smoother estimates of the system state with around The authors would like to thank Dr. Ibrahim Hoteit and Mr
300 measurements. Thus using the prior information of the Mohamad El Gharamti at KAUST for providng us with the
system state, this framework has been able to achieve as good code to simulate 2D contaminant flow in the groundwater. We
an estimate of the contaminant plume after 50 years than the also thank Hasan Arshad Nasir who helped us greatly in the
estimate based simply on Bayesian CS. The estimate produced initial phase of the project.
by this framework using only about 300 points from the 3600
in the grid is shown in Fig 9. R EFERENCES
[1] R. Berinde and P. Indyk, “Sparse recovery using sparse random matri-
VI. C ONCLUSION AND F UTURE D IRECTIONS ces,” preprint, 2008.
[2] E. Candes and J. Romberg, “l1-magic: Recovery of sparse
In this paper we have discussed techniques for sparse signals via convex programming,” URL: www. acm. caltech.
edu/l1magic/downloads/l1magic. pdf, vol. 4, 2005.
signal recovery and investigated their application on recovery [3] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact
problems where most of the data points are missing from the signal reconstruction from highly incomplete frequency information,”
sample. The techniques we have explored are Compressive Information Theory, IEEE Transactions on, vol. 52, no. 2, pp. 489–509,
2006.
sensing based on l1 norm minimization and a Bayesian Com- [4] D. Donoho, “Compressed sensing,” Information Theory, IEEE Transac-
pressive sensing framework based on relevant vector machines. tions on, vol. 52, no. 4, pp. 1289–1306, 2006.
We have seen that the measurement matrices that can be used [5] G. Evensen, Data assimilation: the ensemble Kalman filter. Springer,
2009.
for the missing data problem are highly sparse binary matrices [6] B. Jafarpour, V. Goyal, D. McLaughlin, and W. Freeman, “Compressed
and their use with both the recovery schemes indicates that history matching: Exploiting transform-domain sparsity for regulariza-
the Bayesian framework performs slightly better than the l1 tion of nonlinear dynamic data integration problems,” Mathematical
Geosciences, vol. 42, no. 1, pp. 1–27, 2010.
norm minimization based compressive sensing with the same [7] S. Ji, Y. Xue, and L. Carin, “Bayesian compressive sensing,” Signal
number of measurements. Processing, IEEE Transactions on, vol. 56, no. 6, pp. 2346–2356, 2008.
We have also developed an extension to the Bayesian Com- [8] A. Muhammad and Z. Hussain, “Sparse data assimilation in geophysical
models,” In Preparation, 2012.
pressive sensing framework in the case of recovering a se- [9] M. Tipping, “Sparse bayesian learning and the relevance vector ma-
quence of system states when the dynamics of the underlying chine,” The Journal of Machine Learning Research, vol. 1, pp. 211–244,
model acting on those states are known to us. This extension 2001.
[10] N. Vaswani, “Kalman filtered compressed sensing,” in Image Processing,
utilizes this additional information to achieve the recovery 2008. ICIP 2008. 15th IEEE International Conference on. IEEE, 2008,
of system state using even lesser measurements than those pp. 893–896.
required by the Bayesian CS framework. The results we have [11] D. Wipf and B. Rao, “l 0-norm minimization for basis selection,”
Advances in Neural Information Processing Systems, vol. 17, 2005.
obtained on the two simulated examples where the models [12] ——, “Comparing the effects of different weight distributions on finding
are linear indicate that this extension does achieve the same sparse representations,” 2006.
estimation accuracy as the BCS framework but with lesser [13] X. Wu and M. Liu, “In-situ soil moisture sensing: measurement schedul-
ing and estimation using compressive sensing,” in Proceedings of the
measurements by utilizing the additional information about 11th international conference on Information Processing in Sensor
the model. Networks. ACM, 2012, pp. 1–12.
There are more than one avenues related to this work that

You might also like