Physico-Mathematical Systems and System Biology
Vikram Prakash, VIT University Vellore,
[email protected]
Abstract
Many systems in nature, including biological systems, have very complex dynamics that
generate random-looking time series. The dynamics of biological systems are largely understood
through physico-mathematical models, which describe the atomic and molecular interactions
underlying living matter. The movement of any object in the universe, whether with respect to
time or space, is likewise described in physico-mathematical terms. To better understand a
particular dynamical system, it is often of interest to determine whether the system is driven
by static subsystems, deterministic subsystems, stochastic subsystems, or a combination of all
three. Alternatively, one can use methods that measure the complexity of a system while making
few assumptions about it, such as assuming stationarity. Additionally, mathematical
and computational modelling techniques can be used to test different hypotheses about the
dynamics of biological systems.
Keywords: static model, deterministic model, stochastic model, systems biology.
Introduction
Systems biology is a new approach to the study of biological phenomena. The main idea is to lift
the reductionist paradigm that drives current research practice in biology to a global
view of biological systems that can be investigated at different levels of modeling. The new
paradigm is hypothesis driven, iterative, and global. The main focus is on the functions that
individual components have in the dynamic evolution of biological systems. More abstractly, we
can say that biologists are passing from the production of knowledge to the organization of
knowledge, so that structure (well reported in many public databases) must be related to
behavior (still largely not organized in suitable representations).
Static model
Much of how an object responds to any interaction with its surroundings depends on what has
happened to it up to now. A static model is best thought of as a map of an organism’s interactome:
static models tend to be broad and coarse in scope, often encompassing the entire
interactome. Static modeling is less demanding from an experimental perspective, as just
about any assay on a population of cells will prove informative. This modeling is best
conceptualized as the computerized reconstruction of molecular anatomy. Static modeling is
about determining which elements are present and how they are interconnected; it is a basic
prerequisite for any kind of systems biological analysis. As one example, determining
whether two bacteria can metabolize the same sets of compounds requires an enumeration of
their functional modules, roughly corresponding to the evolutionarily conserved subgraphs in
their respective static models. As another example, identifying proteins that are essential
for cellular function can be greatly aided by knowledge of which proteins are central in static
network models. In particular, static models are essential starting points for more complex
dynamic modeling strategies.
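As a rough illustration of how centrality in a static network can flag candidate essential proteins, the sketch below computes degree centrality for a small undirected interaction network. The protein names and edges are hypothetical placeholders, not data from any study, and degree centrality is only one of several possible centrality measures.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical toy interactome; the names P1..P5 are placeholders.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
centrality = degree_centrality(toy_edges)

# The most connected node is a first candidate for experimental follow-up.
hub = max(centrality, key=centrality.get)
```

In this toy network P1 touches three of the four other proteins, so it is ranked first. In practice such a ranking is a hypothesis to be tested, not evidence of essentiality.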
Tasks Associated with Static Modeling:
1. Determine desired network detail.
The first step in static modeling is to determine the scope and detail of the network
reconstruction.
2. Enumerate input data sources.
The next step is to work backward to determine which input variables could potentially predict
the desired network properties.
3. Network reconstruction.
4. Experimental confirmation.
In the ideal scenario, the properties of the predicted network are then experimentally tested.
5. Network applications.
Given an experimentally reliable static network, we can then proceed to further applications,
such as comparisons of networks across species and conditions (network alignment) and
network-guided experimental prioritization.
Limitations of Static Models:
Perhaps the most obvious limitation of a static model is that it is in fact static: it does not
incorporate temporal, spatial, or conditional information except indirectly. In particular, less
detailed static models may give little information about how different nodes talk to each
other. For example, low-resolution models that predict solely whether two proteins “interact”
with some probability are useful for generating hypotheses, but give little mechanistic insight
as to whether they are related by physical contact, presence in the same pathway, or
regulation of the same genes. These limitations can be partially overcome by including more
types of data, though there are fundamental limitations on the level of conditional detail
possible in a static network.
Dynamic Models (Deterministic and Stochastic)
Discrete models require the states of the system variables (genes, proteins, signaling
molecules) to take on integer values. Although at a molecular level this requirement is the
most realistic, it is often used at a higher level to simplify the resulting models. A Boolean
model provides one such simplification: it consists of binary-valued variables whose
interrelationships are captured by Boolean functions. In cases where a Boolean model is too
coarse-grained for a particular system, a more elaborate dynamic Bayesian network (DBN) can be
used. These models can be either discrete or continuous, and they allow dynamical systems to
be described probabilistically. An example of a recent discrete DBN applied to yeast cell
cycle time series data is found in Zou and Conzen (2005). Short of molecular dynamics
simulations that track the simultaneous position and velocity of every molecule in the system,
the most realistic (and mechanistic) signaling network models fall under the stochastic
chemical kinetics framework (Gillespie 2007). These models represent biological systems as
well-stirred collections of finite numbers of chemical species; reactions are simulated
probabilistically according to known reaction propensities.
Continuous models permit system variables to take on non-negative real-valued states. We
focus on so-called chemical kinetic
(mechanistic) models where states represent concentrations of molecules. These models,
though approximate, are sufficiently accurate when the molecular populations of all species
are orders of magnitude larger than one (Gillespie 2007). The oldest and most common
modeling formalism uses ordinary differential equations (ODEs) and known chemical
kinetic/physico-chemical principles (Cornish-Bowden 1979) to deterministically model
molecular concentrations as a function of time. Though these equations are not usually
analytically solvable, there exist a wide variety of numerical tools that can efficiently model
relatively complex systems (Rangamani and Iyengar 2007). Partial differential equation
(PDE) models of signaling networks describe the evolution of molecular concentrations as
functions of both space and time. These models are more physically realistic than ODEs, but
they are also significantly more difficult to solve and typically require custom-made
numerical solution methods (Eungdamrong and Iyengar 2004). The addition of a noise term
to a deterministic differential equation yields a stochastic differential equation (SDE), which
in chemical kinetic systems often takes the form of a chemical Langevin equation (CLE)
(Gillespie 2000). The CLE follows from approximations to discrete stochastic chemical
kinetics, and its solution can be computed much more efficiently than solutions for the
corresponding discrete model (Wilkinson 2009). Discrete dynamical models represent an
active area of research in systems biology, and they have recently been discussed elsewhere
(Uhrmacher et al. 2005).
The cellular environment is constantly changing as a result of
deterministic chemical reactions and stochastic fluctuations. Thus, dynamical models are
more realistic depictions of biology than static models. Through simulation, dynamical
models enable characterization of nonlinear, emergent behavior that evolves over time. Such
behavior is often only visible at a systems level and would be missed by reductionist methods
(Bhalla and Iyengar 1999). The outputs of differential equation models relate more closely to
experimentally observed phenotypes than coarser-grained alternatives (Sauer 2004). As a
result, though these models often require extensive parameterization, the parameter space can
be constrained such that the model reproduces experimental data. This significantly reduces
the complexity of model calibration and also enables easier model validation (Rangamani and
Iyengar 2007).
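The Boolean formalism mentioned earlier in this section can be sketched in a few lines. The example below is a minimal illustration, assuming a synchronous update scheme and a hypothetical three-gene ring in which C represses A, A activates B, and B activates C. The gene names and rules are invented for illustration only.

```python
def step(state, rules):
    """Synchronous update: every variable is recomputed from the old state."""
    return {name: bool(rule(state)) for name, rule in rules.items()}


def trajectory(state, rules, n_steps):
    """Return the list of states visited, starting with the initial state."""
    states = [dict(state)]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical rules: A is on unless C is on; B copies A; C copies B.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

start = {"A": True, "B": False, "C": False}
path = trajectory(start, rules, 6)
```

With these rules the network returns to its starting state after six updates, a simple oscillation arising from the single negative feedback loop. An asynchronous update scheme could give different dynamics, so the choice of scheme is itself a modeling assumption.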
Tasks Associated with Dynamical Modeling
1. Model construction and calibration.
The first step is to specify the structure and parameterization of a model from prior
knowledge and experimental data.
2. Model validation and testing.
After calibration, it is important to compare model output with existing experimental data
(Eungdamrong and Iyengar 2004; Ideker et al. 2001). This procedure is necessary (though not
sufficient) to determine whether a model is specified correctly.
3. Parameter sensitivity analysis.
Sensitivity analysis involves determining which molecular concentrations or kinetic
parameters have the greatest influence on model behavior. This is valuable when prioritizing
parameters for subsequent experimental measurement or perturbation (Rangamani and
Iyengar 2007).
4. Analysis of emergent behavior.
As mentioned above, emergent behavior arises from systems-level properties that are not apparent
from studying individual components. Many of these phenomena, which can include
robustness to noise, feedback, bistability, and oscillation, are best characterized through
simulation of the model (Gilbert et al. 2006; Angeli et al. 2004).
5. Predictive modeling and discovery.
One of the most exciting areas of systems biology is prospective modeling to test hypotheses
that are too difficult or expensive to query in vivo. Here, a prerequisite for making accurate
predictions is a sufficiently detailed and accurate model (You 2004).
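Of the tasks listed above, parameter sensitivity analysis is perhaps the easiest to illustrate compactly. The sketch below estimates normalized (logarithmic) sensitivities by central finite differences. The toy model, a species produced at rate k_syn and degraded at rate k_deg * y with steady state k_syn / k_deg, and all parameter names and values are assumptions chosen for illustration.

```python
def steady_state(params):
    """Steady state of dy/dt = k_syn - k_deg * y (hypothetical toy model)."""
    return params["k_syn"] / params["k_deg"]


def relative_sensitivity(model, params, name, rel_step=1e-6):
    """Central-difference estimate of d ln(output) / d ln(parameter)."""
    base = model(params)
    h = params[name] * rel_step
    up = dict(params, **{name: params[name] + h})
    down = dict(params, **{name: params[name] - h})
    derivative = (model(up) - model(down)) / (2.0 * h)
    return derivative * params[name] / base


# Illustrative parameter values, not measured constants.
params = {"k_syn": 2.0, "k_deg": 0.5}
sensitivities = {
    name: relative_sensitivity(steady_state, params, name) for name in params
}
```

For this model the steady state scales linearly with k_syn and inversely with k_deg, so the two normalized sensitivities are close to +1 and -1. For a realistic model, `model` would wrap a numerical simulation and the output would be whatever quantity is being compared with experiment.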
Ordinary Differential Equation Systems
Ordinary differential equation models are by far the most common dynamical model used in
biology (Andrews and Arkin 2006). They represent behavior at the level of chemical kinetics,
whereby the concentration of each system component yi(t) as a function of time is
represented in the following manner:
$$\frac{dy_i(t)}{dt} = f_i(\mathbf{y}(t)), \qquad i = 1, \ldots, n,$$
where y(t) = [y1(t), . . . , yn(t)] and fi is a function which describes the rate of change of
yi(t). This function can be constant (uninhibited synthesis), linear (a first-order reaction such as
degradation), or nonlinear (a second-order reaction or Michaelis–Menten kinetics), and its
precise form follows from qualitative prior experimental knowledge. These coupled
expressions are often collectively referred to as reaction rate equations (RREs). The RREs of
most biologically realistic systems cannot be solved analytically, but numerous
well-developed and efficient numerical methods for solving these systems are available.
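As a minimal sketch of numerical solution of reaction rate equations, the example below integrates a one-species system with constant synthesis and first-order degradation using the classical fourth-order Runge-Kutta method. The rate constants K_SYN and K_DEG and the function names are illustrative assumptions. This particular system was chosen because it has a closed-form solution against which the numerical result can be checked.

```python
import math


def rk4(f, y0, t_end, n_steps):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with classical RK4."""
    y = list(y0)
    h = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [
            yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
        ]
        t += h
    return y


# Illustrative rate constants (assumed values, arbitrary units).
K_SYN = 2.0  # constant synthesis rate
K_DEG = 0.5  # first-order degradation rate constant


def rate_equations(t, y):
    """RRE for one species: constant synthesis minus linear degradation."""
    return [K_SYN - K_DEG * y[0]]


y_end = rk4(rate_equations, [0.0], t_end=10.0, n_steps=1000)

# Closed-form solution for y(0) = 0, used only as a check.
y_exact = K_SYN / K_DEG * (1.0 - math.exp(-K_DEG * 10.0))
```

Real models are usually stiff and involve many species, in which case an adaptive or implicit solver from an established library is generally a better choice than a fixed-step method like this one.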
Assumptions of ODE Biological Models:
The relative ease with which ODE models of biological systems can be constructed and
solved is a consequence of the simplifying assumptions made about the system. These
assumptions are as follows:
• Reactions occur in a homogeneous, well-stirred volume (corollary: molecular
concentrations are functions of time and not space)
• Reactions occur in a deterministic manner
• Discrete effects on molecular concentrations can be ignored (corollary: molecular
populations of all species are orders of magnitude larger than one)
Partial Differential Equation Systems
Biological systems are known to exhibit spatial inhomogeneity, and some tasks require
explicit modeling of the spatial dimension. This is especially true when the biological system
in question extends across several cellular organelles, each potentially containing different
components, or when the diffusion of individual components across the modeled space
cannot be treated as an instantaneous process. Compartmental ODE models have been
successfully used to model the former case, where components are assumed to be well mixed
within compartments and transport between compartments occurs at a much slower,
measurable rate (Aldridge et al. 2006a). As these models are modified versions of the ODE
models described above, we will not discuss them further. In the latter case, i.e., when
explicitly modeling the diffusion of certain components, partial differential equation models
are necessary. Here, the spatial dimension is modeled as a continuous quantity, and the
concentration of each component becomes a function of both space and time. The PDEs most
commonly used to describe such systems are reaction–diffusion equations, where the
concentration of each component yi(x, t) of the system can be represented as follows (derived
using Fick’s second law of diffusion):
$$\frac{\partial y_i(\mathbf{x}, t)}{\partial t} = f_i(\mathbf{y}(\mathbf{x}, t)) + D_i \sum_{j=1}^{m} \frac{\partial^2 y_i(\mathbf{x}, t)}{\partial x_j^2},$$
where y is as above (now depending on position x as well as time t), Di is a diffusion
coefficient, xj represents a spatial dimension, and m
is the number of spatial dimensions modeled. The first term on the RHS, fi, describes the
contributions of chemical reactions to the time derivative, and the second term describes the
contributions of diffusion. Compared to ODE models, PDE systems are much more
challenging to solve, in part because they require many more parameters (Eungdamrong and
Iyengar 2004). Aside from the kinetic parameters needed to specify fi, the reaction–diffusion
system requires a diffusion coefficient for each species, which is difficult to measure
experimentally (Rangamani and Iyengar 2007), and fluxes and/or concentrations of each
component must be specified at the boundary of the physical space being modeled. This latter
constraint becomes even more prohibitive when considering complex physical geometries.
Solutions to nonlinear PDE systems are almost exclusively numerical, and the added realism
of the model comes at a computational cost due to the increased dimensionality of the system.
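To make the structure of a reaction–diffusion model concrete, the sketch below solves a one-species, one-dimensional case with an explicit finite-difference scheme. It is an illustration under stated assumptions, not a production solver: the reaction term is taken to be linear degradation (f = -k_deg * y), the boundaries are zero-flux, and the function name and all parameter values are hypothetical.

```python
def simulate_reaction_diffusion(y0, diffusion, k_deg, dx, dt, n_steps):
    """Explicit finite differences for dy/dt = -k_deg*y + D * d2y/dx2 in 1-D.

    y0 holds the initial concentration in each grid cell; boundaries are
    zero-flux (each edge cell sees a mirror image of itself).
    """
    if diffusion * dt / dx**2 > 0.5:
        raise ValueError("explicit scheme unstable: need D*dt/dx^2 <= 0.5")
    y = list(y0)
    n = len(y)
    for _ in range(n_steps):
        new = [0.0] * n
        for i in range(n):
            left = y[i - 1] if i > 0 else y[i]
            right = y[i + 1] if i < n - 1 else y[i]
            laplacian = (left - 2.0 * y[i] + right) / dx**2
            reaction = -k_deg * y[i]
            new[i] = y[i] + dt * (reaction + diffusion * laplacian)
        y = new
    return y


# A unit pulse in the middle of 21 cells spreads out by diffusion alone.
pulse = [0.0] * 10 + [1.0] + [0.0] * 10
spread = simulate_reaction_diffusion(
    pulse, diffusion=1.0, k_deg=0.0, dx=1.0, dt=0.2, n_steps=500
)
```

The stability check reflects the time-step restriction of the explicit scheme, which is one concrete source of the computational cost noted above: halving the grid spacing requires a time step four times smaller. With zero-flux boundaries and no reaction, the total amount of material is conserved, which is a useful sanity check on any implementation.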
PDE Models Describing Biological Systems
Models of biological systems governed by PDEs employ two of the three simplifying
assumptions of ODE models, with spatial homogeneity being the exception. Nonetheless,
when mathematically and computationally tractable, these models can accurately reproduce
spatially varying molecular behavior. One of the first examples of a PDE model describing a
biological system modeled the behavior of two (generic) morphogens reacting and diffusing
through simple geometries of cells (Turing 1952). Subsequent work elaborated upon this
simple model of morphogen-controlled patterning. One study simulated a four-morphogen
reaction–diffusion system to mimic pattern formation in Drosophila embryogenesis (Lacalli
1990). By adjusting model parameters, the authors could produce striped patterns of
morphogen concentration compatible with observed wild-type and mutant phenotypes.
Stochastic Differential Equation Systems
Both ODE and PDE models of biological systems assume that reactions occur in a
deterministic manner. This assumption seems to imply that biological reactions exhibit little
to no heterogeneity or stochasticity (“intrinsic noise”), which is known to be false (McAdams
and Arkin 1997). Rather, the main reason for the success of deterministic biological models is
that stochastic effects are often rendered negligible by averaging across large numbers of
molecules or cells. This phenomenon also underlies the success of continuous mechanistic
models, where discrete numbers of molecules can be approximated with continuous
concentrations. There are, however, a number of well-characterized biological systems where
the modeling assumption of deterministic reactions leads to qualitatively incorrect depictions
of behavior. A deterministic model of the circadian rhythm oscillator parameterized with
particular degradation rates fails to oscillate; in contrast, the noise present in the
corresponding stochastic model gives rise to more robust oscillatory behavior (Vilar et al.
2002). In a common class of biochemical reaction mechanisms, enzymatic futile cycles,
extrinsic noise (i.e., noise due to components or processes outside the system) in a stochastic
model was shown to induce bistable oscillatory behavior that was absent in a similar
deterministic model (Samoilov et al. 2005).
These (and other) important exceptions to
deterministic reaction mechanisms have led to the application of stochastic models to
biological systems. We begin by representing the state of the system as a function of time
with Z(t) = [Z1(t), . . . , Zn(t)], where Zi(t) represents the number of molecules of species i.
Capital letters are used to emphasize the stochastic nature of the model; the Zi’s are random
variables. A specific instantiation of the system is represented by lowercase letters, i.e., z =
[z1, . . . , zn]. The system state can be altered by the firing of any of p reactions; each reaction
changes the state by vk = [v1k, . . . , vnk], 1 ≤ k ≤ p, where vik represents the change in the
number of molecules of species i after the completion of reaction k. Each reaction can be
characterized by its propensity function ak(z), defined so that ak(z) dt is the
probability that reaction k will occur once in the system in the infinitesimal time interval [t, t
+ dt) given Z(t) = z. Given that any instantiation of the system is random, it would be useful
to have a probabilistic expression for the time evolution of the system P (z, t|z0, t0)
(probability that the system is in state z at time t, given that it is in state z0 at time t0). Using
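the above quantities and the laws of probability, this can be derived as follows:
$$\frac{\partial P(z, t \mid z_0, t_0)}{\partial t} = \sum_{k=1}^{p} \left[ a_k(z - v_k)\, P(z - v_k, t \mid z_0, t_0) - a_k(z)\, P(z, t \mid z_0, t_0) \right] \tag{1.1}$$
Equation (1.1) is called the chemical master equation (CME). Since the possible values of z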
are discretely varying, the CME is actually a set of coupled ODEs that is nearly as large as
the number of possible combinations of molecules in the system. Consequently, except for
very simple systems, these equations are not solvable analytically and numerical solutions are
usually intractable. Progress has been made in developing approximation schemes for
numerically solving the CME (Munsky and Khammash 2006; Deuflhard et al. 2007; Jahnke
and Huisinga 2008), but most applications turn to Monte Carlo methods to sample from the
distribution P (z, t|z0, t0). The stochastic simulation algorithm (SSA), also known as the
Gillespie algorithm, simulates reactions one at a time in the order in which they occur (Gillespie
1977). This approach has been widely used in stochastic modeling of biological networks, in
part because it produces a draw from the exact probability distribution that solves the CME.
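The stochastic simulation algorithm can be sketched as follows, using the notation above: a state z, state-change vectors vk, and propensity functions ak(z). This is a minimal version of the direct method, applied to an assumed birth-death process (synthesis at constant rate K_SYN, degradation with propensity K_DEG * z). The rate constants and function names are illustrative, not taken from the cited work.

```python
import random


def gillespie(z0, stoichiometry, propensities, t_end, rng):
    """Direct-method SSA; returns the state of the system at time t_end."""
    z = list(z0)
    t = 0.0
    while True:
        a = [prop(z) for prop in propensities]
        a_total = sum(a)
        if a_total <= 0.0:
            return z  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            return z
        # Choose reaction k with probability a_k / a_total.
        threshold = rng.random() * a_total
        cumulative = 0.0
        chosen = None
        for k, a_k in enumerate(a):
            if a_k <= 0.0:
                continue
            chosen = k
            cumulative += a_k
            if threshold < cumulative:
                break
        z = [zi + v for zi, v in zip(z, stoichiometry[chosen])]


# Illustrative birth-death model (assumed rate constants).
K_SYN = 10.0  # synthesis propensity, independent of the state
K_DEG = 0.5   # degradation rate constant per molecule

STOICHIOMETRY = [[+1], [-1]]  # state-change vectors v_1 and v_2
PROPENSITIES = [lambda z: K_SYN, lambda z: K_DEG * z[0]]

sample = gillespie([0], STOICHIOMETRY, PROPENSITIES, 20.0, random.Random(1))
```

Each call produces one realization, so estimating P(z, t|z0, t0) requires many independent runs. For this birth-death process the long-time mean copy number is K_SYN / K_DEG, which gives a simple check on the sampled states.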
Conclusion
With thousands of sequenced genomes (Wheeler et al. 2007) and hundreds of functional
genomic data sets (Barrett et al. 2005), the future of systems biology is bright.
In static modeling, the supervised learning approach, in which high-throughput data is
compared against a small training set of curated knowledge, has proven to be the most fruitful
data integration strategy to date. In particular, supervised predictions of function and
interaction from multiple data sets are more robust than those derived from individual data
sets and have provided a foundation for recent work on network alignment and systematic
validation. The primary challenges for static modeling are to (1) decide on a set of reference
networks and (2) tie every predicted node and edge in such networks to a gold-standard
experimental test such as co-immunoprecipitation for confirmation of physical protein
interactions. These steps will be crucial to bringing network predictions to the same level of
confidence and widespread utilization as gene predictions. For dynamic models, the core
problem is that the area will remain data starved (Albeck et al. 2006) until high-throughput
methods for the determination of rate constants (Famili et al. 2005) and spatial substructure
(Foster et al. 2006; Schubert et al. 2006) become commonplace. Recent efforts at compiling
and curating a number of biological constants (Milo et al. 2009) and developing a repository
of systems biology models (Hucka et al. 2003) are an important step in the right direction
toward establishing a repository of “consensus constants.”
References
1. Bernie J. Daigle, Jr., Balaji S. Srinivasan, Jason A. Flannick, Antal F. Novak, and Serafim
Batzoglou, Current Progress in Static and Dynamic Modeling of Biological Networks.
2. Flemming Nielson, Hanne Riis Nielson, Debora Schuch Da Rosa, and Corrado Priami, July
2003, Static Analysis for Systems Biology.
3. Models in Systems Biology: The Parameter Problem and the Meanings of Robustness.
4. Ferenc Czegledy and Jose Katz, Biological System: Stochastic, Deterministic or Both.
5. Static Models Object Attributes and Invariants.
6. Marian Gheorghe, Vincenzo Manca, Francisco J. Romero-Campero, Deterministic and
Stochastic P Systems for Modelling Cellular Processes.
7. Ashish Bhan and Eric Mjolsness, Static and Dynamic Models of
Biological Networks.