Advances in Mathematics 218 (2008) 2051–2088
www.elsevier.com/locate/aim
V -variable fractals:
Fractals with partial self similarity ✩
Michael F. Barnsley a , John E. Hutchinson a,∗ , Örjan Stenflo b
a Department of Mathematics, Mathematical Sciences Institute, Australian National University,
Canberra, ACT, 0200, Australia
b Department of Mathematics, Uppsala University, 751 05 Uppsala, Sweden
Received 18 March 2008; accepted 3 April 2008
Available online 2 June 2008
Communicated by Kenneth Falconer
Abstract
We establish properties of a new type of fractal which has partial self similarity at all scales. For any
collection of iterated function systems with an associated probability distribution and any positive integer V there is a corresponding class of V-variable fractal sets or measures. These V-variable fractals can
also be obtained from the points on the attractor of a single deterministic iterated function system. Existence, uniqueness and approximation results are established under average contractive assumptions. We
also obtain extensions of some basic results concerning iterated function systems.
© 2008 Elsevier Inc. All rights reserved.
Keywords: V -variable fractal; Superfractal; Fractal; Random fractal; Iterated function system; Chaos game; Markov
chain
1. Introduction
A V -variable fractal is loosely defined by the fact that it possesses at most V distinct local
patterns at each level of magnification, where the class of patterns depends on the level, see
Definition 18 and Remark 19. Such fractals are useful for modelling purposes and for geometric
applications which require random fractals with a controlled degree of strict self similarity at
each scale, see [2, Chapter 5].
✩ This work was partially supported by the Australian Research Council and carried out at the Australian National
University.
* Corresponding author.
E-mail address: [email protected] (J.E. Hutchinson).
doi:10.1016/j.aim.2008.04.011
Fig. 1. Sets of V -variable fractals for different V and M. See Remark 55.
Let F be a family of iterated function systems [IFSs] acting on a metric space X, together
with an associated probability distribution on F . A construction tree associated to F is a labelled
tree with a single root node; corresponding to each node is an IFS from F , and the edges upward
from each node are in one-to-one correspondence with the functions in the corresponding IFS.
Associated to almost every such construction tree is a fractal set or measure. The construction
tree and the corresponding fractals are defined to be V -variable if at each level of the construction
tree there are at most V distinct subtrees with base node at that level, see Definition 24. The case
V = 1 corresponds to homogeneous random fractals. The case V > 1 corresponds to a weaker
spatial homogeneity requirement that we call V -variability.
An unexpected fact is that the class of V -variable fractal sets or measures associated to F ,
and an associated natural probability distribution, can be generated by a single deterministic
IFS (or Markov chain or “chaos game,” depending on one’s perspective) operating not on the
state space X but on the state space $C(X)^V$ or $M(X)^V$ of V -tuples of compact subsets of X or
probability measures over X, respectively. The projection of the IFS attractor in any of the V
coordinate directions gives the class of V -variable fractal sets or measures corresponding to F
together with its natural probability distribution in each case. See Theorems 29, 43 and 45, and
see Section 8 for a simple example. The full attractor contains further information about the
correlation structure of subclasses of these V -variable fractals. The Markov chain converges
exponentially, and approximations to its steady state attractor can readily be obtained.
The limit V → ∞ gives standard random fractals, and for this reason the Markov chain for
large V provides a fast way of generating classes of standard random fractals together with their
probability distributions. Ordinary fractals generated by a single IFS can be seen as special cases
of the present construction and this provides new insight into the structure of such fractals, see
Remark 55. For the connection with other classes of fractals in the literature see Remarks 39
and 48.
1.1. We now summarise the main notation and results
Let (X, d) be a complete separable metric space. Typically this will be Euclidean space $\mathbb{R}^k$
with the standard metric. For each λ in some index set Λ let $F^\lambda$ be an IFS acting on (X, d), i.e.
$$F^\lambda = \bigl(f_1^\lambda, \dots, f_M^\lambda, w_1^\lambda, \dots, w_M^\lambda\bigr), \quad f_m^\lambda : X \to X, \quad 0 \le w_m^\lambda \le 1, \quad \sum_{m=1}^{M} w_m^\lambda = 1. \tag{1}$$
We will require both the cases where Λ is finite and where Λ is infinite. In order to simplify the
exposition we assume that there is only a finite number M of functions in each $F^\lambda$ and that M
does not depend on λ. Let P be a probability distribution on some σ-algebra of subsets of Λ.
The given data is then denoted by
$$F = \bigl\{(X, d),\ F^\lambda,\ \lambda \in \Lambda,\ P\bigr\}. \tag{2}$$
Let V be a fixed positive integer.
Suppose first that the family Λ is finite and the functions $f_m^\lambda$ are uniformly contractive. The
set KV of V -variable fractal subsets of X and the set MV of V -variable fractal measures on X
associated to F are then given by Definition 18. There are Markov chains acting on the set $C(X)^V$
of V -tuples of compact subsets of X and on the set $M_c(X)^V$ of V -tuples of compactly supported
unit mass measures on X, whose stationary distributions project in any of the V coordinate
directions to probability distributions KV on KV and MV on MV , respectively. Moreover, these
Markov chains are each given by a single deterministic IFS $F_V^{C}$ or $F_V^{M_c}$ constructed from F and
acting on $C(X)^V$ or $M_c(X)^V$ respectively. The IFS's $F_V^{C}$ and $F_V^{M_c}$ are called superIFS's. The
sets KV and MV , and the probability distributions KV and MV , are called superfractals. See
Theorem 43; some of these results were first obtained in [6]. The distributions KV and MV have
a complicated correlation structure and differ markedly from other notions of random fractal in
the literature. See Remarks 39 and 52.
In many situations one needs an infinite family Λ or needs average contractive conditions,
see Example 47. In this case one works with the set $M_1(X)^V$ of V -tuples of finite first moment
unit mass measures on X. The corresponding superIFS $F_V^{M_1}$ is pointwise average contractive
by Theorem 45 and one obtains the existence of a corresponding superfractal distribution MV .
There are technical difficulties in establishing these results, see Remarks 11, 12 and 46.
In Section 2 the properties of the Monge–Kantorovitch and the strong Prokhorov probability
metrics are summarised. The strong Prokhorov metric is not widely known in the fractal literature
although it is the natural metric to use with uniformly contractive conditions. We include the
mass transportation, or equivalently the probabilistic, versions of these metrics, as we need them
in Theorems 8, 43, 45 and 49. We work with probability metrics on spaces of measures and such
metrics are not always separable. So in Section 2 the extensions required to include non-separable
spaces are noted.
In Section 3, particularly Theorem 8 and the following remarks, we summarise and in some
cases extend basic results in the literature concerning IFS’s, link the measure theoretic and probabilistic approaches and sketch the proofs. We hope this brief survey will be of independent use.
In particular, IFS’s with a possibly infinite family of functions and pointwise average contractive conditions are considered. The law of large numbers for the corresponding Markov process
starting from an arbitrary point, also known as the convergence theorem for the “chaos game”
algorithm, is extended to the case when the IFS acts on a non locally compact state space. This
situation typically occurs when the state space is a function space or space of measures, and here
it is required for the superIFS $F_V^{M_1}$ in Theorem 45. The strong Prokhorov metric is used in the
case of uniform contractions. In Section 4 we summarise some of the basic properties of standard
random fractals generated by a family of IFS’s.
In Section 5 the representation of V -variable fractals in the space $\Omega_V$ of tree codes and in the
space $A_V^\infty$ of addresses is developed, and the connection between the two spaces is discussed.
The space $\Omega_V$ is of the type used for realisations of general random fractals and consists of
trees with each node labelled in effect by an IFS, see the comment following Definition 13 and
see Definition 18. The space $A_V^\infty$ is of the type used to address points on a single deterministic
fractal, and here consists of infinite sequences of V × (M + 1) matrices each of which defines
a map from the set of V -tuples of sets or measures to itself, see Definition 30. Also see Figs. 2
and 4.
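The precise maps are given in Definition 30, which lies outside this section. The following is therefore only an illustrative sketch, assuming one plausible reading of the description above: each row of a V × (M + 1) matrix holds an IFS label followed by M buffer indices, and the new set in slot v is the union of the images of the selected old sets. The function name `super_ifs_step`, the two-IFS family on the real line, the uniform choice of buffer indices and the equal weights are all assumptions made for illustration.

```python
import random

def super_ifs_step(state, family, probs, rng):
    """Advance a V-tuple of finite point sets by one step (assumed scheme).

    For each slot v we draw one row of a V x (M + 1) matrix: an IFS label
    lam followed by M buffer indices j_1..j_M. The new set in slot v is the
    union over m of f_m^lam applied to the old set in slot j_m.
    """
    V = len(state)
    new_state = []
    for _ in range(V):
        lam = rng.choices(range(len(family)), weights=probs)[0]
        maps = family[lam]
        buffers = [rng.randrange(V) for _ in maps]
        new_state.append({f(x) for f, j in zip(maps, buffers) for x in state[j]})
    return new_state

# Hypothetical family of two IFSs on the real line, each with M = 2 maps.
family = [
    (lambda x: x / 3, lambda x: x / 3 + 2 / 3),
    (lambda x: x / 2, lambda x: x / 2 + 1 / 2),
]
rng = random.Random(0)
state = [{0.0}, {1.0}]  # V = 2 starting sets
for _ in range(8):
    state = super_ifs_step(state, family, [0.5, 0.5], rng)
print([len(E) for E in state])
```

Because every map in this hypothetical family sends [0, 1] into itself, each component of the state stays inside [0, 1] whatever matrices are drawn.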
In Section 6 the existence, uniqueness and convergence results for V -variable fractals and
superfractals are proved, some examples are given, and the connection with graph directed IFSs
is discussed. In Section 7 we establish the rate at which the probability distributions KV and MV
converge to the corresponding distributions on standard random fractal sets and measures respectively as V → ∞. In Section 8 a simple example of a super IFS and the associated Markov chain
is given. In Section 9 we make some concluding remarks including the relationship with other
types of fractals, extensions of the results, and some motivation for the method of construction
of V -variable fractals.
The reader may find it easier to begin with Section 4 and refer back to Sections 2 and 3 as
needed, particularly in the proofs of Theorems 43 and 45.
2. Preliminaries
Throughout the paper (X, d) denotes a complete separable metric space, except where mentioned otherwise.
Definition 1. The collection of nonempty compact subsets of X with the Hausdorff metric is
denoted by (C(X), dH ). The collection of nonempty bounded closed subsets of X with the Hausdorff metric is denoted by (BC(X), dH ).
For A ⊂ X let $A^\epsilon = \{x : d(x, A) \le \epsilon\}$ be the closed ε-neighbourhood of A.
Both spaces (C(X), dH ) and (BC(X), dH ) are complete and separable if (X, d) is complete
and separable. Both spaces are complete if (X, d) is just assumed to be complete.
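For finite point sets the Hausdorff metric can be computed directly from the ε-neighbourhood description. The sketch below assumes finite subsets of Euclidean space stored as coordinate tuples; the helper name `hausdorff` is illustrative.

```python
from math import dist

def hausdorff(A, B):
    """Hausdorff distance between two nonempty finite point sets.

    d_H(A, B) is the least eps such that A lies in the closed
    eps-neighbourhood of B and B lies in the closed eps-neighbourhood of A.
    """
    def one_sided(P, Q):
        # Largest distance from a point of P to its nearest point of Q.
        return max(min(dist(p, q) for q in Q) for p in P)
    return max(one_sided(A, B), one_sided(B, A))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
print(hausdorff(A, B))  # the extra point (1, 2) is at distance 2 from A
```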
Definition 2 (Prokhorov metric). The collection of unit mass Borel (i.e. probability) measures
on the Borel subsets of X with the topology of weak convergence is denoted by M(X). Weak
convergence of $\nu_n \to \nu$ means $\int \phi \, d\nu_n \to \int \phi \, d\nu$ for all bounded continuous φ. The (standard)
Prokhorov metric ρ on M(X) is defined by
$$\rho(\mu, \nu) := \inf\bigl\{\epsilon > 0 : \mu(A) \le \nu(A^\epsilon) + \epsilon \text{ for all Borel sets } A \subset X\bigr\}.$$
The Prokhorov metric ρ is complete and separable and induces the topology of weak convergence. Moreover, ρ is complete if (X, d) is just assumed to be complete. See [7, pp. 72, 73].
We will not use the Prokhorov metric but mention it for comparison with the strong Prokhorov
metric in Definition 4.
The Dirac measure δa concentrated at a is defined by δa (E) = 1 if a ∈ E and otherwise
δa (E) = 0.
If f : X → X or f : X → R, the Lipschitz constant for f is denoted by Lip f and is defined
to be the least L such that $d(f(x), f(y)) \le L\, d(x, y)$ for all x, y ∈ X.
Definition 3 (Monge–Kantorovitch metric). The collection of those µ ∈ M(X) with finite first
moment, i.e. those µ such that
$$\int d(a, x) \, d\mu(x) < \infty \tag{3}$$
for some and hence any a ∈ X, is denoted by $M_1(X)$. The Monge–Kantorovitch metric $d_{MK}$ on
$M_1(X)$ is defined in any of the three following equivalent ways:
$$\begin{aligned}
d_{MK}(\mu, \mu') &:= \sup\Bigl\{\int f \, d\mu - \int f \, d\mu' : \operatorname{Lip} f \le 1\Bigr\} \\
&= \inf\Bigl\{\int d(x, y) \, d\gamma(x, y) : \gamma \text{ a Borel measure on } X \times X,\ \pi_1(\gamma) = \mu,\ \pi_2(\gamma) = \mu'\Bigr\} \\
&= \inf\bigl\{E\, d(W, W') : \operatorname{dist} W = \mu,\ \operatorname{dist} W' = \mu'\bigr\}.
\end{aligned} \tag{4}$$
The maps π1 , π2 : X × X → X are the projections onto the first and second coordinates, and
so µ and µ' are the marginals for γ . In the third version the infimum is taken over X-valued
random variables W and W ' with distribution µ and µ' respectively but otherwise unspecified
joint distribution.
Here and elsewhere, “dist” denotes the probability distribution on the associated random variable.
The metric space (M1 (X), dMK ) is complete and separable. The moment restriction (3) is
automatically satisfied if (X, d) is bounded. The second equality in (4) requires proof, see
[12, §11.8], but the third form of the definition is just a rewording of the second. The connection
between $d_{MK}$ convergence in $M_1(X)$ and weak convergence is given by
$$\nu_n \xrightarrow{\ d_{MK}\ } \nu \quad\text{iff}\quad \nu_n \to \nu \text{ weakly and } \int d(x, a) \, d\nu_n(x) \to \int d(x, a) \, d\nu(x)$$
for some and hence any a ∈ X. See [37, Section 7.2].
Suppose (X, d) is only assumed to be complete. If measures µ in M1 (X) are also required
to satisfy the condition µ(X \ spt µ) = 0, then (M1 (X), dMK ) is complete, see [26] and [16,
§2.1.16]. This condition is satisfied for all finite Borel measures µ if X has a dense subset whose
cardinality is an Ulam number, and in particular if (X, d) is separable. The requirement that the
cardinality of X be an Ulam number is not very restrictive, see [16, §2.2.16].
It is often more natural to use the following strong Prokhorov metric rather than the Monge–
Kantorovitch or standard Prokhorov metrics in the case of a uniformly contractive IFS.
Definition 4 (Strong Prokhorov metric). The set of compact support, or bounded support, unit
mass Borel measures on X is denoted by $M_c(X)$ or $M_b(X)$, respectively. The strong Prokhorov
metric $d_P$ is defined on $M_b(X)$ in any of the following equivalent ways:
$$\begin{aligned}
d_P(\mu, \mu') &:= \inf\bigl\{\epsilon > 0 : \mu(A) \le \mu'(A^\epsilon) \text{ for all Borel sets } A \subset X\bigr\} \\
&= \inf\bigl\{\operatorname{ess\,sup}_\gamma d(x, y) : \gamma \text{ a measure on } X \times X,\ \pi_1(\gamma) = \mu,\ \pi_2(\gamma) = \mu'\bigr\} \\
&= \inf\bigl\{\operatorname{ess\,sup} d(W, W') : \operatorname{dist} W = \mu,\ \operatorname{dist} W' = \mu'\bigr\},
\end{aligned} \tag{5}$$
where the notation is as in the paragraph following (4).
Note that
Mc (X) ⊂ Mb (X) ⊂ M1 (X) ⊂ M(X).
The first definition in (5) is symmetric in µ and µ' by a standard argument, see [12, proof of
Theorem 11.3.1]. For discussion and proof of the second equality see [33, Eq. (7.4.15), p. 160]
and the other references mentioned there. The third version is a probabilistic reformulation of the
second.
Proposition 5. $(M_c(X), d_P)$ and $(M_b(X), d_P)$ are complete. If $\nu, \nu' \in M_b(X)$ then
$$d_H(\operatorname{spt} \nu, \operatorname{spt} \nu') \le d_P(\nu, \nu'), \qquad d_{MK}(\nu, \nu') \le d_P(\nu, \nu').$$
In particular, νk → ν in the dP metric implies νk → ν in the dMK metric and spt νk → spt ν in
the dH metric.
Proof. The first inequality follows directly from the definition of the Hausdorff metric and the
second from the final characterisations in (4) and (5).
Completeness of Mb (X) can be shown as follows and this argument carries across to Mc (X).
Completeness of Mc (X) is also shown in [15, Theorem 9.1].
Suppose $(\nu_k)_{k \ge 1} \subseteq (M_b(X), d_P)$ is $d_P$-Cauchy. It follows that $(\nu_k)_{k \ge 1}$ is $d_{MK}$-Cauchy and
hence converges to some measure ν in the $d_{MK}$ sense and in particular weakly. Moreover,
$(\operatorname{spt} \nu_k)_{k \ge 1}$ converges to some bounded closed set K in the Hausdorff sense, hence spt ν ⊂ K
using weak convergence, and so $\nu \in M_b(X)$. Suppose ε > 0 and, using the fact that $(\nu_k)_{k \ge 1}$ is $d_P$-Cauchy, choose J so that k, j ≥ J implies $\nu_k(A) \le \nu_j(A^\epsilon)$ for all Borel A ⊂ X. By weak convergence
and because $A^\epsilon$ is closed, $\limsup_{j \to \infty} \nu_j(A^\epsilon) \le \nu(A^\epsilon)$ and so $\nu_k(A) \le \nu(A^\epsilon)$ if k ≥ J. Hence
$\nu_k \to \nu$ in the $d_P$ sense. □
If (X, d) is only assumed to be complete, but measures in Mc (X) and Mb (X) are also required to satisfy the condition µ(X \spt µ) = 0 as discussed following Definition 3, then the same
proof shows that Proposition 5 is still valid. The main point is that one still has completeness of
(M1 (X), dMK ).
Remark 6 (Strong and standard Prokhorov metrics). Convergence in the strong Prokhorov metric is a much stronger requirement than convergence in the standard Prokhorov metric or the
Monge–Kantorovitch metric. A simple example is given by X = [0, 1] and $\nu_n = (1 - \frac{1}{n})\delta_0 + \frac{1}{n}\delta_1$.
Then $\nu_n \to \delta_0$ weakly and in the $d_{MK}$ and ρ metrics, but $d_P(\nu_n, \delta_0) = 1$ for all n.
The strong Prokhorov metric is normally not separable. For example, if $\mu_x = x\delta_0 + (1 - x)\delta_1$
for 0 < x < 1 then $d_P(\mu_x, \mu_y) = 1$ for x ≠ y. So there is no countable dense subset.
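The two examples in Remark 6 can be checked numerically through the coupling forms of (4) and (5). The sketch below assumes finitely supported measures on the real line, stored as lists of (point, mass) pairs, and relies on the standard fact that on the line the monotone (quantile) coupling attains the infimum for both the mean and the essential supremum of |x − y|. All function names are illustrative.

```python
def monotone_coupling(mu, nu, tol=1e-12):
    """Return (x, y, mass) triples of the monotone coupling of mu and nu.

    mu and nu are lists of (point, mass) pairs whose masses sum to 1.
    """
    a, b = sorted(mu), sorted(nu)
    pairs = []
    i = j = 0
    ma, mb = a[0][1], b[0][1]
    while i < len(a) and j < len(b):
        m = min(ma, mb)              # mass moved from a[i] to b[j]
        if m > tol:
            pairs.append((a[i][0], b[j][0], m))
        ma -= m
        mb -= m
        if ma <= tol:                # atom a[i] is used up
            i += 1
            ma = a[i][1] if i < len(a) else 0.0
        if mb <= tol:                # atom b[j] is used up
            j += 1
            mb = b[j][1] if j < len(b) else 0.0
    return pairs

def d_mk(mu, nu):
    """Monge-Kantorovitch distance: expected |x - y| under the coupling."""
    return sum(m * abs(x - y) for x, y, m in monotone_coupling(mu, nu))

def d_p(mu, nu):
    """Strong Prokhorov distance: essential supremum of |x - y|."""
    return max(abs(x - y) for x, y, _ in monotone_coupling(mu, nu))

def nu_n(n):
    return [(0.0, 1 - 1 / n), (1.0, 1 / n)]

def mu_x(x):
    return [(0.0, x), (1.0, 1 - x)]

delta0 = [(0.0, 1.0)]
for n in (10, 100, 1000):
    print(n, d_mk(nu_n(n), delta0), d_p(nu_n(n), delta0))
print(d_p(mu_x(0.3), mu_x(0.6)))
```

In this sketch the Monge–Kantorovitch distance to $\delta_0$ shrinks like 1/n while the strong Prokhorov distance stays at 1, since any coupling must move the small mass at 1 all the way to 0.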
If f : X → X is Borel measurable then the pushforward measure f(ν) is defined by
$f(\nu)(A) = \nu(f^{-1}(A))$ for Borel sets A. The scaling property for Lipschitz functions f, namely
$$d_P\bigl(f(\mu), f(\nu)\bigr) \le \operatorname{Lip} f \; d_P(\mu, \nu), \tag{6}$$
follows from the definition of dP . Similar properties are well known and easily established for
the Hausdorff and Monge–Kantorovitch metrics.
3. Iterated function systems
Definition 7. An iterated function system [IFS] $F = (X, f_\theta, \theta \in \Theta, W)$ is a set of maps
$f_\theta : X \to X$ for θ ∈ Θ, where (X, d) is a complete separable metric space and W is a probability measure on some σ-algebra of subsets of Θ. The map $(x, \theta) \mapsto f_\theta(x) : X \times \Theta \to X$ is
measurable with respect to the product σ -algebra on X × Θ, using the Borel σ -algebra on X. If
Θ = {1, . . . , M} is finite and W (m) = wm then one writes F = (X, f1 , . . . , fM , w1 , . . . , wM ).
It follows f is measurable in θ for fixed x and in x for fixed θ . Notation such as Eθ is used to
denote taking the expectation, i.e. integrating, over the variable θ with respect to W .
Sometimes we will need to work with an IFS on a nonseparable metric space. The properties
which still hold in this case will be noted explicitly. See Remark 12 and Theorem 43.
The IFS F acts on subsets of X and Borel measures over X, for finite and infinite Θ respectively, by
$$\begin{aligned}
F(E) &= \bigcup_{m=1}^{M} f_m(E), & F(\nu) &= \sum_{m=1}^{M} w_m f_m(\nu), \\
F(E) &= \bigcup_{\theta} f_\theta(E), & F(\nu) &= \int dW(\theta)\, f_\theta(\nu).
\end{aligned} \tag{7}$$
We put aside measurability matters and interpret the integral formally as the measure which operates on any Borel set A ⊂ X to give $\int dW(\theta)\,(f_\theta(\nu))(A)$. The latter is thought of as a weighted
sum via W(θ) of the measures $f_\theta(\nu)$. The precise definition in the cases we need for infinite Θ
is given by (9).
If F (E) = E or F (ν) = ν then E or ν respectively is said to be invariant under the IFS F .
In the study of fractal geometry one is usually interested in the case of an IFS with a finite
family Θ of maps. If X = R2 then compact subsets of X are often identified with black and
white images, while measures are identified with greyscale images. Images are generated by
iterating the map F to approximate $\lim_{k \to \infty} F^k(E_0)$ or $\lim_{k \to \infty} F^k(\nu_0)$. As seen in the following
theorem, under natural conditions the limits exist and are independent of the starting set E0 or
measure ν0 .
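As a small illustration of the set iteration in (7), the sketch below applies a finite IFS repeatedly to two different one-point starting sets and tracks the Hausdorff distance between the iterates. The middle-thirds Cantor IFS on the real line, the starting sets and the helper names are assumptions chosen so that exact rational arithmetic can be used.

```python
from fractions import Fraction

def apply_ifs(maps, E):
    """One application of F to a finite set E: the union of the f_m(E)."""
    return {f(x) for f in maps for x in E}

def hausdorff_1d(A, B):
    """Hausdorff distance between nonempty finite subsets of the line."""
    def one_sided(P, Q):
        return max(min(abs(p - q) for q in Q) for p in P)
    return max(one_sided(A, B), one_sided(B, A))

# Hypothetical example: two maps, each with Lipschitz constant 1/3.
maps = [lambda x: x / 3, lambda x: x / 3 + Fraction(2, 3)]

E, E2 = {Fraction(0)}, {Fraction(5)}
gaps = []
for k in range(6):
    gaps.append(hausdorff_1d(E, E2))   # distance between F^k(E) and F^k(E2)
    E, E2 = apply_ifs(maps, E), apply_ifs(maps, E2)
print(gaps)
```

In this example the distance between the two iterates is divided by 3 at every step, so the limit does not depend on the starting set.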
In the study of Markov chains on an arbitrary state space X via iterations of random functions
on X, it is usually more natural to consider the case of an infinite family Θ. One is concerned
with a random process $Z_n^x$, in fact a Markov chain, with initial state x ∈ X, and
$$Z_0^x(i) = x, \qquad Z_n^x(i) := f_{i_n}\bigl(Z_{n-1}^x(i)\bigr) = f_{i_n} \circ \cdots \circ f_{i_1}(x) \quad \text{if } n \ge 1, \tag{8}$$
where the $i_n \in \Theta$ are independently and identically distributed [iid] with probability distribution W and $i = i_1 i_2 \dots$. The induced probability measure on the set of codes i is also denoted
by W .
Note that the probability P(x, B) of going from x ∈ X into B ⊂ X in one iteration is
$W\{\theta : f_\theta(x) \in B\}$, and $P(x, \cdot) = \operatorname{dist} Z_1^x$. More generally, if the starting state is given by a
random variable $X_0$ independent of i with $\operatorname{dist} X_0 = \nu$, then one defines the random variable
$Z_n^\nu(i) = Z_n^{X_0}(i)$. The sequence $(Z_n^\nu(i))_{n \ge 0}$ forms a Markov chain starting according to ν. We
define $F(\nu) = \operatorname{dist} Z_1^\nu$ and in summary we have
$$\nu := \operatorname{dist} Z_0^\nu, \qquad F(\nu) := \operatorname{dist} Z_1^\nu, \qquad F^n(\nu) = \operatorname{dist} Z_n^\nu. \tag{9}$$
The operator F can be applied to bounded continuous functions φ : X → R via any of the
following equivalent definitions:
$$\bigl(F(\phi)\bigr)(x) = \int \phi\bigl(f_\theta(x)\bigr) \, dW(\theta) \ \Bigl(\text{or } \sum_m w_m \phi\bigl(f_m(x)\bigr)\Bigr) = E_\theta\, \phi\bigl(f_\theta(x)\bigr) = E\, \phi\bigl(Z_1^x\bigr). \tag{10}$$
In the context of Markov chains, the operator F acting on functions is called the transfer operator. It follows from the definitions that $\int F(\phi) \, d\mu = \int \phi \, d(F\mu)$, which is the expected value of φ
after one time step starting with the initial distribution µ. If one assumes F (φ) is continuous (it
is automatically bounded) then F acting on measures is the adjoint of F acting on bounded continuous functions. Such F are said to satisfy the weak Feller property—this is the case if all fθ
are continuous by the dominated convergence theorem, or if the pointwise average contractive
condition is satisfied, see [35].
We will need to apply the maps in (8) in the reverse order. Define
$$\hat{Z}_0^x(i) = x, \qquad \hat{Z}_n^x(i) = f_{i_1} \circ \cdots \circ f_{i_n}(x) \quad \text{if } n \ge 1. \tag{11}$$
Then from the iid property of the $i_n$ it follows
$$F^n(\nu) = \operatorname{dist} Z_n^\nu = \operatorname{dist} \hat{Z}_n^\nu, \qquad F^n(\phi)(x) = E\, \phi\bigl(Z_n^x\bigr) = E\, \phi\bigl(\hat{Z}_n^x\bigr). \tag{12}$$
However, the pathwise behaviours of the processes $Z_n^x$ and $\hat{Z}_n^x$ are very different. Under suitable conditions the former is ergodic and the latter is a.s. convergent, see Theorem 8(c) and (a)
respectively, and the discussion in [9].
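The contrast between (8) and (11) can be seen on a single code sequence. The sketch below assumes the two maps x/2 and x/2 + 1/2 on the real line and, for reproducibility, uses the deterministic alternating code 0, 1, 0, 1, … rather than an iid one; the function name is illustrative.

```python
def forward_and_backward(maps, code, x):
    """Return the orbits Z_n^x of (8) and hat-Z_n^x of (11) for n = 1..len(code)."""
    fwd, bwd = [], []
    z = x
    for n in range(1, len(code) + 1):
        z = maps[code[n - 1]](z)        # Z_n = f_{i_n}(Z_{n-1})
        fwd.append(z)
        y = x
        for i in reversed(code[:n]):    # hat-Z_n = f_{i_1}(... f_{i_n}(x))
            y = maps[i](y)
        bwd.append(y)
    return fwd, bwd

maps = [lambda t: t / 2, lambda t: t / 2 + 0.5]
code = [n % 2 for n in range(40)]       # i = 0 1 0 1 ...
fwd, bwd = forward_and_backward(maps, code, x=0.3)

print(bwd[-3:])   # settles near 1/3, the point with binary address 0.0101...
print(fwd[-3:])   # keeps jumping between values near 1/3 and 2/3
```

The backward iterates change only in ever finer digits as n grows, whereas each forward iterate is dominated by the most recently applied map.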
The following Theorem 8 is known with perhaps two exceptions: the lack of a local compactness requirement in (c) and the use of the strong Prokhorov metric in (d).
The strong Prokhorov metric, first used in the setting of random fractals in [15], is a more natural metric than the Monge–Kantorovitch metric when dealing with uniformly contractive maps
and fractal measures in compact spaces. Either it or variants may be useful in image compression
matters. In Theorem 43 we use it to strengthen the convergence results in [6].
The pointwise ergodic theorems for Markov chains for any starting point as established in
[5,8,13,14,31] require compactness or local compactness, see Remark 11. We remove this restriction in Theorem 8. The result is needed in Theorem 45 where we consider an IFS operating
on the space $(M_1(X)^V, d_{MK})$ of V -tuples of probability measures. Like most spaces of functions or measures, this space is not locally compact even if $X = \mathbb{R}^k$. See Remark 11 and also
Remark 47.
We assume a pointwise average contractive condition, see Remark 9. We need this in Theorem 45, see Remark 46.
The parts of the theorem have a long history. In the Markov chain literature the contraction
conditions (13) and (19) were introduced in [24] and [11] respectively in order to establish ergodicity. In the fractal geometry literature, following [27,28], the existence and uniqueness of
attractors, their properties, and the Markov chain approach to generating fractals, were introduced in [3,4,10,13,14,20]. See [9,34,35] for further developments and more on the history.
Theorem 8. Let $F = (X, f_\theta, \theta \in \Theta, W)$ be an IFS on a complete separable metric space (X, d).
Suppose F satisfies the pointwise average contractive and average boundedness conditions
$$E_\theta\, d\bigl(f_\theta(x), f_\theta(y)\bigr) \le r\, d(x, y) \quad \text{and} \quad L := E_\theta\, d\bigl(f_\theta(a), a\bigr) < \infty \tag{13}$$
for some fixed 0 < r < 1, all x, y ∈ X, and some a ∈ X.
(a) For some function Π, all x, y ∈ X and all n, we have
$$E\, d\bigl(Z_n^x(i), Z_n^y(i)\bigr) = E\, d\bigl(\hat{Z}_n^x(i), \hat{Z}_n^y(i)\bigr) \le r^n d(x, y), \qquad E\, d\bigl(\hat{Z}_n^x(i), \Pi(i)\bigr) \le \gamma_x r^n, \tag{14}$$
where $\gamma_x = E_\theta\, d(x, f_\theta(x)) \le d(x, a) + L/(1 - r)$ and Π(i) is independent of x. The map
Π is called the address map from code space into X. Note that Π is defined only a.e.
If r < s < 1 then for all x, y ∈ X for a.e. $i = i_1 i_2 \dots i_n \dots$ there exists $n_0 = n_0(i, s)$ such that
$$d\bigl(Z_n^x(i), Z_n^y(i)\bigr) \le s^n d(x, y), \qquad d\bigl(\hat{Z}_n^x(i), \hat{Z}_n^y(i)\bigr) \le s^n d(x, y), \qquad d\bigl(\hat{Z}_n^x(i), \Pi(i)\bigr) \le \gamma_x s^n, \tag{15}$$
for $n \ge n_0$.
(b) If ν is a unit mass Borel measure then $F^n(\nu) \to \mu$ weakly where µ := Π(W) is the projection of the measure W on code space onto X via Π. Equivalently, µ is the distribution of Π
regarded as a random variable. In particular,
$$F(\mu) = \mu$$
and µ is the unique invariant unit mass measure.
The map F is a contraction on $(M_1(X), d_{MK})$ with Lipschitz constant r. Moreover,
$\mu \in M_1(X)$ and
$$d_{MK}\bigl(F^n(\nu), \mu\bigr) \le r^n d_{MK}(\nu, \mu) \tag{16}$$
for every $\nu \in M_1(X)$.
(c) For all x ∈ X and for a.e. i, the empirical measure (probability distribution)
$$\mu_n^x(i) := \frac{1}{n} \sum_{k=1}^{n} \delta_{Z_k^x(i)} \to \mu \tag{17}$$
weakly. Moreover, if A is the support of µ then there exists $n_0 = n_0(i, x, \epsilon)$ such that
$$Z_n^x(i) \in A^\epsilon \tag{18}$$
if $n \ge n_0$.
(d) Suppose F satisfies the uniform contractive and uniform boundedness conditions
$$\sup_\theta d\bigl(f_\theta(x), f_\theta(y)\bigr) \le r\, d(x, y) \quad \text{and} \quad L := \sup_\theta d\bigl(f_\theta(a), a\bigr) < \infty \tag{19}$$
for some r < 1, all x, y ∈ X, and some a ∈ X. Then
$$d\bigl(Z_n^x(i), Z_n^y(i)\bigr) \le r^n d(x, y), \qquad d\bigl(\hat{Z}_n^x(i), \hat{Z}_n^y(i)\bigr) \le r^n d(x, y), \qquad d\bigl(\hat{Z}_n^x(i), \Pi(i)\bigr) \le \gamma_x r^n, \tag{20}$$
for all x, y ∈ X and all i. The address map Π is everywhere defined and is continuous with
respect to the product topology defined on code space and induced from the discrete metric
on Θ. Moreover, $\mu = \Pi(W) \in M_c$ and for any $\nu \in M_b$,
$$d_P\bigl(F^n(\nu), \mu\bigr) \le r^n d_P(\nu, \mu). \tag{21}$$
Suppose in addition Θ is finite and W({θ}) > 0 for θ ∈ Θ. Then A is compact and for any
closed bounded E
$$d_H\bigl(F^n(E), A\bigr) \le r^n d_H(E, A). \tag{22}$$
Moreover, F(A) = A and A is the unique closed bounded invariant set.
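Part (c) is the convergence statement behind the chaos game, and it can be tried out on a simple case. The sketch below assumes the IFS with maps x/2 and x/2 + 1/2 and equal weights, whose invariant measure is Lebesgue measure on [0, 1], and compares empirical averages of two bounded continuous test functions with their integrals. The function names, the starting point 10 and the run length are illustrative choices.

```python
import math
import random

def chaos_game(maps, weights, x0, n, rng):
    """Return the orbit Z_1^x, ..., Z_n^x of the chaos game started at x0."""
    orbit = []
    x = x0
    for _ in range(n):
        f = rng.choices(maps, weights=weights)[0]   # draw i_k according to W
        x = f(x)
        orbit.append(x)
    return orbit

maps = [lambda t: t / 2, lambda t: t / 2 + 0.5]
rng = random.Random(1)
orbit = chaos_game(maps, [0.5, 0.5], x0=10.0, n=20000, rng=rng)

def clamp(t):
    """A bounded continuous test function; its integral over [0, 1] is 1/2."""
    return min(max(t, 0.0), 1.0)

mean_clamp = sum(clamp(z) for z in orbit) / len(orbit)
mean_cos = sum(math.cos(2 * math.pi * z) for z in orbit) / len(orbit)
print(mean_clamp, mean_cos)   # compare with the integrals 1/2 and 0
```

The starting point lies well outside the attractor [0, 1], which is consistent with the statement that the limit in (17) holds for every initial x.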
Proof. (a) The first inequality in (14) follows from (8), (11) and contractivity.
Next fix x. Since
$$\begin{aligned}
E\, d\bigl(\hat{Z}_{n+1}^x(i), \hat{Z}_n^x(i)\bigr) &\le r\, E\, d\bigl(\hat{Z}_n^x(i), \hat{Z}_{n-1}^x(i)\bigr), \\
E\, d\bigl(\hat{Z}_{n+1}^a(i), \hat{Z}_n^a(i)\bigr) &\le r\, E\, d\bigl(\hat{Z}_n^a(i), \hat{Z}_{n-1}^a(i)\bigr)
\end{aligned} \tag{23}$$
for all n, it follows that $\hat{Z}_n^x(i)$ and $\hat{Z}_n^a(i)$ a.s. converge exponentially fast to the same limit Π(i)
(say) by the first inequality in (14). It also follows that (23) is simultaneously true with i replaced
by $i_{k+1} i_{k+2} i_{k+3} \dots$ for every k. It then follows from (11) that
$$\Pi(i) = f_{i_1} \circ \cdots \circ f_{i_k}\bigl(\Pi(i_{k+1} i_{k+2} i_{k+3} \dots)\bigr).$$
Again using (11),
$$E\, d\bigl(\hat{Z}_n^x(i), \Pi(i)\bigr) \le r^n E\, d\bigl(x, \Pi(i_{n+1} i_{n+2} i_{n+3} \dots)\bigr) = r^n E\, d\bigl(x, \Pi(i)\bigr) \le r^n \bigl(d(x, a) + E\, d\bigl(a, \Pi(i)\bigr)\bigr).$$
But
$$E\, d\bigl(a, \Pi(i)\bigr) \le E\, d\bigl(a, f_{i_1}(a)\bigr) + \sum_{n \ge 1} E\, d\bigl(f_{i_1} \circ \cdots \circ f_{i_n}(a), f_{i_1} \circ \cdots \circ f_{i_{n+1}}(a)\bigr) \le \sum_{n \ge 0} r^n L = \frac{L}{1 - r}.$$
This gives the second inequality in (14). See [34] for details.
The estimates in (15) are the standard consequence that exponential convergence in mean
implies a.s. exponential convergence.
(b) Suppose φ ∈ BC(X, d), the set of bounded continuous functions on X. Let ν be any
unit mass measure. Since for a.e. i, $\hat{Z}_n^x(i) \to \Pi(i)$ for every x, using the continuity of φ and
dominated convergence,
$$\begin{aligned}
\int \phi \, d\bigl(F^n \nu\bigr) = \int \phi \, d\bigl(\operatorname{dist} Z_n^\nu\bigr) &= \iint \phi\bigl(\hat{Z}_n^x(i)\bigr) \, dW(i) \, d\nu(x) \quad \text{(by (12))} \\
&\to \iint \phi\bigl(\Pi(i)\bigr) \, dW(i) \, d\nu = \int \phi \, d\mu,
\end{aligned}$$
by the definition of µ for the last equality. Thus $F^n(\nu) \to \mu$ weakly. The invariance of µ and the
fact µ is the unique invariant unit measure follow from the weak Feller property.
One can verify that F : M1 (X) → M1 (X) and F is a contraction map with Lipschitz constant r in the dMK metric. It is easiest to use the second or third form of (4) for this. The rest
of (b) now follows.
(c) The main difficulty here is that (X, d) may not be locally compact and so the space
BC(X, d) need not be separable, see Remark 11. We adapt an idea of Varadhan, see [12, Theorem 11.4.1, p. 399].
There is a totally bounded and hence separable, but not usually complete, metric e on X such
that (X, e) and (X, d) have the same topology, see [12, Theorem 2.8.2, p. 72]. Moreover, as the
proof there shows, $e(x, y) \le d(x, y)$. Because the topology is preserved, weak convergence of
measures on (X, d) is the same as weak convergence on (X, e).
Let BL(X, e) denote the set of bounded Lipschitz functions over (X, e). Then BL(X, e)
is separable in the sup norm from the total boundedness of e. Suppose φ ∈ BL(X, e). By the
ergodic theorem, since µ is the unique invariant measure for F,
$$\int \phi \, d\mu_n^y(i) = \frac{1}{n} \sum_{k=1}^{n} \phi\bigl(Z_k^y(i)\bigr) \to \int \phi \, d\mu \tag{24}$$
for a.e. i and µ a.e. y.
Suppose x ∈ X and choose y ∈ X such that (24) is true. Using (a), for a.e. i
$$e\bigl(Z_n^x(i), Z_n^y(i)\bigr) \le d\bigl(Z_n^x(i), Z_n^y(i)\bigr) \to 0.$$
It follows from (24) and the uniform continuity of φ in the e metric that for a.e. i,
$$\int \phi \, d\mu_n^x(i) = \frac{1}{n} \sum_{k=1}^{n} \phi\bigl(Z_k^x(i)\bigr) \to \int \phi \, d\mu. \tag{25}$$
Let S be a countable dense subset of BL(X, e) in the sup norm. One can ensure that (25) is simultaneously true for all φ ∈ S. By an approximation argument it follows (25) is simultaneously
true for all φ ∈ BL(X, e).
Since (X, e) is separable, weak convergence of measures $\nu_n \to \nu$ is equivalent to $\int \phi \, d\nu_n \to \int \phi \, d\nu$ for all φ ∈ BL(X, e), see [12, Theorem 11.3.3, p. 395]. Completeness is not needed for
this. It follows that $\mu_n^x(i) \to \mu$ weakly as required.
The result (18) follows from the third inequality in (15).
(d) The three inequalities in (20) are straightforward as are the claims concerning Π .
It follows readily from the definitions that each of Mc (X), Mb (X), C(X) and BC(X) are
closed under F , and that F is a contraction map with respect to dP in the first two cases and dH
in the second two cases. The remaining results all follow easily. □
Remark 9 (Contractivity conditions). The global average contractive condition $E_\theta r_\theta := \int r_\theta \, dW(\theta) \le r$ where $r_\theta := \operatorname{Lip} f_\theta$, implies the pointwise average contractive condition. Although the global condition is frequently assumed, for our purposes the weaker pointwise
assumption is necessary, see Remark 46.
In some papers, for example [9,38], parts of Theorem 8 are established or used under the
global log average contractive and average boundedness conditions
$$E_\theta \log r_\theta < 0, \qquad E_\theta\, r_\theta^{\,q} < \infty, \qquad E_\theta\, d^{\,q}\bigl(a, f_\theta(a)\bigr) < \infty, \tag{26}$$
for some q > 0 and some a ∈ X. However, since $d^{\,q}$ is a metric for 0 < q < 1 and since
$(E_\theta\, g^q(\theta))^{1/q} \downarrow \exp(E_\theta \log g(\theta))$ as q ↓ 0 for g ≥ 0, such results follow from Theorem 8. In
the main Theorem 5.2 of [9] the last two conditions are replaced by the equivalent algebraic tail
condition. One can even obtain in this way similar consequences under the yet weaker pointwise
log average conditions. See also [13].
Pointwise average contractivity is a much weaker requirement than global average contractivity. A simple example in which fθ is discontinuous with positive probability is given by
0 < ε < 1, X = [0, 1] with the standard metric d, Θ = [0, 1], W{0} = W{1} = ε/2, and otherwise
W is uniformly distributed over (0, 1) according to W{(a, b)} = (1 − ε)(b − a) for 0 < a < b < 1.
Let $f_\theta = \chi_{[\theta,1]}$ be the characteristic function of [θ, 1] for 0 ≤ θ < 1 and let $f_1 \equiv 0$ be the zero
function. Then $E_\theta\, d(f_\theta(x), f_\theta(y)) \le (1 - \epsilon)\, d(x, y)$ for all x and y. The unique invariant measure
is of course $\frac{1}{2}\delta_0 + \frac{1}{2}\delta_1$. Uniqueness fails if ε = 0. A simple example where $f_\theta$ is discontinuous
with probability one is $f_\theta = \chi_{\{1\}}$ and θ is chosen uniformly on the unit interval. See [34,35] for
further examples and discussion.
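The first example in Remark 9 is easy to simulate. The sketch below assumes ε = 0.2 and a starting point of 0.3, draws θ from W as described, and records how often the chain sits at 1; the function names and run length are illustrative.

```python
import random

def sample_theta(eps, rng):
    """Draw theta from W: atoms of mass eps/2 at 0 and at 1, with the
    remaining mass 1 - eps spread uniformly over (0, 1)."""
    u = rng.random()
    if u < eps / 2:
        return 0.0
    if u < eps:
        return 1.0
    # Uniform part; the endpoint 0.0 is returned with negligible probability.
    return rng.random()

def f(theta, x):
    """f_theta is the characteristic function of [theta, 1] for theta < 1,
    and f_1 is identically zero."""
    if theta == 1.0:
        return 0.0
    return 1.0 if x >= theta else 0.0

eps = 0.2
rng = random.Random(2)
x = 0.3
orbit = []
for _ in range(100000):
    x = f(sample_theta(eps, rng), x)
    orbit.append(x)

frac_at_one = sum(orbit) / len(orbit)
print(frac_at_one)   # compare with the mass 1/2 that the invariant measure puts on 1
```

After the first step the chain lives on {0, 1} and switches state only when one of the two atoms of W is drawn, which is why the two states carry equal weight in the long run.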
Remark 10 (Alternative starting configurations). One can extend (14) by allowing the starting
point x to be distributed according to a distribution ν and considering the corresponding random
variables $\hat{Z}_n^\nu(i)$. Then (16) follows directly by using the random variables $\hat{Z}_n^\nu(i)$ and $\hat{Z}_n^\mu(i)$ in
the third form of (4). Analogous remarks apply in deducing the distributional convergence results
in Theorem 8(d) from the pointwise convergence results, via the third form of (5).
Remark 11 (Local compactness issues). Versions of Theorem 8(c) for locally compact (X, d)
were established in [5,8,13,14,31]. In that case one first proves vague convergence in (17).
By $\nu_n \to \nu$ vaguely one means $\int \phi \, d\nu_n \to \int \phi \, d\nu$ for all $\phi \in C_c(X)$ where $C_c(X)$ is the set
of compactly supported continuous functions φ : X → R. The proof of vague convergence is
straightforward from the ergodic theorem since Cc (X) is separable. Moreover, in a locally compact space, vague convergence of probability measures to a probability measure implies weak
convergence. That this is not true for more general spaces is a consequence of the following
discussion.
The extension of Theorem 8(c) to nonlocally compact spaces is needed in Section 6 and
Theorem 45. In order to study images in Rk we consider IFS’s whose component functions act
on the space (M1 (Rk )V , dMK ) of V -tuples of unit mass measures over Rk , where V is a natural
number. Difficulties already arise in proving that the chaos game converges a.s. from every initial
V -tuple of sets even for V = k = 1.
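To see this suppose $\nu_0 \in M_1(\mathbb{R})$ and $\epsilon > 0$. Then $B_\epsilon(\nu_0) := \{\nu : d_{MK}(\nu, \nu_0) \le \epsilon\}$ is not sequentially compact in the $d_{MK}$ metric and so $(M_1(\mathbb{R}), d_{MK})$ is not locally compact. To show
sequential compactness does not hold let $\nu_n = (1 - \frac{\epsilon}{n})\nu_0 + \frac{\epsilon}{n}\tau_n\nu_0$, where $\tau_n(x) = x + n$ is translation by n units in the x-direction. Then clearly $\nu_n \to \nu_0$ weakly. Setting $f(x) = x$ in (4),
\[
d_{MK}(\nu_n, \nu_0) \ge \int x\, d\nu_n - \int x\, d\nu_0
= \Big(1 - \frac{\epsilon}{n}\Big)\int x\, d\nu_0 + \frac{\epsilon}{n}\int (x + n)\, d\nu_0 - \int x\, d\nu_0 = \epsilon.
\]
On the other hand, let W be a random measure with dist W = $\nu_0$. Independently of the value of W
let $W' = W$ with probability $1 - \frac{\epsilon}{n}$ and $W' = \tau_n W$ with probability $\frac{\epsilon}{n}$. Then again from (4),
\[
d_{MK}(\nu_n, \nu_0) \le E\, d_{MK}(W, W') = \Big(1 - \frac{\epsilon}{n}\Big) \times 0 + \frac{\epsilon}{n}\, n = \epsilon.
\]
It follows that $\nu_n \in B_\epsilon(\nu_0)$, but $\nu_n$ does not converge to $\nu_0$ in the $d_{MK}$ metric, and nor does any subsequence. Since $d_{MK}$
convergence implies weak convergence and $\nu_n \to \nu_0$ weakly, the only possible $d_{MK}$ limit of a subsequence is $\nu_0$. It follows that $(\nu_n)_{n \ge 1}$ has no convergent subsequence in the $d_{MK}$
metric.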
It follows that Cc (M1 (R), dMK ) contains only the zero function. Hence vague convergence
in this setting is a vacuous notion and gives no information about weak convergence.
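The identity $d_{MK}(\nu_n, \nu_0) = \epsilon$ can be checked exactly in the simplest case $\nu_0 = \delta_0$. The sketch below is only an illustration under two assumptions: that $\nu_0 = \delta_0$, and that on the real line the Monge-Kantorovich distance between two finitely supported probability measures equals the integral of the absolute difference of their distribution functions. The helper names are hypothetical.

```python
from fractions import Fraction

def w1_discrete(mu, nu):
    """Monge-Kantorovich (Wasserstein-1) distance on R between two finitely
    supported probability measures given as {point: mass} dictionaries,
    computed as the integral of |F_mu - F_nu|."""
    points = sorted(set(mu) | set(nu))
    total = Fraction(0)
    cdf_gap = Fraction(0)
    for left, right in zip(points, points[1:]):
        cdf_gap += mu.get(left, 0) - nu.get(left, 0)
        total += abs(cdf_gap) * (right - left)
    return total

def shifted_mixture(eps, n):
    """nu_n = (1 - eps/n) delta_0 + (eps/n) delta_n, i.e. the case nu_0 = delta_0."""
    return {0: 1 - eps / n, n: eps / n}

nu_0 = {0: Fraction(1)}
distances = [w1_discrete(shifted_mixture(Fraction(1, 2), n), nu_0) for n in range(1, 6)]
```

Every entry of `distances` equals ε = 1/2, although the mass that is moved, ε/n, tends to zero: the sequence stays on the sphere of radius ε around $\nu_0$ while converging weakly to $\nu_0$.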
Finally, we note that although (c) is proved here assuming the pointwise average contractive
condition, it is clear that weaker hypotheses concerning the stability of trajectories will suffice to
extend known results from the locally compact setting.
Remark 12 (Separability and measurability issues). If (X, d) is separable then the class of Borel
sets for the product topology on X × X is the product σ -algebra of the class of Borel sets on X
with itself, see [7, p. 244]. It follows that $\theta \mapsto d(f_\theta(x), f_\theta(y))$ is measurable for each x, y ∈ X
and so the quantities in (13) are well defined.
Separability is not required for the uniform contractive and uniform boundedness conditions
in (19) and the conclusions in (d) are still valid with essentially the same proofs. The spaces
Mc (X), Mb (X) and M1 (X) need to be restricted to separable measures as discussed following
Definition (3) and Proposition 5.
Separability is also used in the proof of (c). If one drops this condition and assumes the
uniform contractive and boundedness conditions (19) then a weaker version of (c) holds. Namely,
for every x ∈ X and every bounded continuous φ ∈ BC(X), for a.e. i
\[
\int \phi\, d\mu^x_n(i) \to \int \phi\, d\mu. \tag{27}
\]
The point is that unlike the situation in (c) under the hypothesis of separability, the set of such i
might depend on the function φ.
In Theorem 43 we apply Theorem 8 to an IFS whose component functions operate on
(Mc (X)V , dP ) where V is a natural number. Even in the case V = 1 this space is not separable, see Example 6.
4. Tree codes and standard random fractals
Let $F = \{X, F^\lambda, \lambda \in \Lambda, P\}$ be a family of IFSs as in (1) and (2). We assume the IFSs $F^\lambda$ are
uniformly contractive and uniformly bounded, i.e. for some 0 < r < 1,
\[
\sup_\lambda \max_m d\big(f^\lambda_m(x), f^\lambda_m(y)\big) \le r\, d(x, y), \qquad
L := \sup_\lambda \max_m d\big(f^\lambda_m(a), a\big) < \infty \tag{28}
\]
for all x, y ∈ X and some a ∈ X. More general conditions are assumed in Section 6.
We often use ∗ to indicate concatenation of sequences, either finite or infinite.
Definition 13 (Tree codes). The tree T is the set of all finite sequences from {1, . . . , M}, including
the empty sequence ∅. If $\sigma = \sigma_1 \dots \sigma_k \in T$ then the length of σ is |σ| = k and |∅| = 0.
A tree code ω is a map ω : T → Λ. The metric space (Ω, d) of all tree codes is defined by
\[
\Omega = \{\omega \mid \omega : T \to \Lambda\}, \qquad d(\omega, \omega') = \frac{1}{M^k} \tag{29}
\]
if $\omega(\sigma) = \omega'(\sigma)$ for all σ with |σ| < k and $\omega(\sigma) \ne \omega'(\sigma)$ for some σ with |σ| = k. A finite tree
code of height k is a map $\omega : \{\sigma \in T : |\sigma| \le k\} \to \Lambda$.
If ω ∈ Ω and τ ∈ T then the tree code $\omega\rfloor\tau$ is defined by $(\omega\rfloor\tau)(\sigma) := \omega(\tau * \sigma)$. It is the tree
code obtained from ω by starting at the node τ. One similarly defines $\omega\rfloor\tau$ if ω is a finite tree
code of height k and |τ| ≤ k.
If ω ∈ Ω and k is a natural number then the finite tree code $\omega\lfloor k$ is defined by $(\omega\lfloor k)(\sigma) = \omega(\sigma)$
for |σ| ≤ k. It is obtained by truncating ω at the level k.
The space (Ω, d) is complete and bounded. If Λ is finite then (Ω, d) is compact.
The tree code ω associates to each node σ ∈ T the IFS $F^{\omega(\sigma)}$. It also associates to each σ ≠ ∅
the function $f_m^{\omega(\sigma')}$ where $\sigma = \sigma' * m$. The $M^k$ components of $K_k^\omega$ in (30) are then obtained by
beginning with the set $K_0$ and iteratively applying the functions associated to the k nodes along
each of the $M^k$ branches of depth k obtained from ω. A similar remark applies to $\mu_k^\omega$.
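The operations in Definition 13 are easy to experiment with for finite tree codes. The following sketch is illustrative only: it assumes a finite tree code is stored as a dictionary from nodes (tuples over 1..M) to labels, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def nodes(M, height):
    """All nodes sigma of the M-ary tree T with |sigma| <= height, as tuples."""
    return [s for k in range(height + 1) for s in product(range(1, M + 1), repeat=k)]

def subtree(omega, tau):
    """Tree code obtained from omega by starting at node tau: sigma -> omega(tau * sigma)."""
    n = len(tau)
    return {s[n:]: lab for s, lab in omega.items() if s[:n] == tau}

def truncate(omega, k):
    """Finite tree code obtained by truncating omega at level k."""
    return {s: lab for s, lab in omega.items() if len(s) <= k}

def distance(omega, other, M):
    """Metric (29) for two finite tree codes defined on the same set of nodes:
    1/M^k, where k is the first level at which the two codes differ."""
    differing = [len(s) for s in omega if omega[s] != other[s]]
    return Fraction(0) if not differing else Fraction(1, M ** min(differing))

# Two codes of height 2 with M = 2 that first differ at a level-2 node.
omega = {s: "F" for s in nodes(2, 2)}
omega[(1, 2)] = "G"
other = {s: "F" for s in nodes(2, 2)}
```

Here `distance(omega, other, 2)` is 1/4, since the codes agree on levels 0 and 1 and differ at level 2.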
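Definition 14 (Fractal sets and measures). If $K_0 \in C(X)$ and $\mu_0 \in M_c(X)$ then the prefractal
sets $K_k^\omega$, the prefractal measures $\mu_k^\omega$, the fractal set $K^\omega$ and the fractal measure $\mu^\omega$, are given
by
\[
\begin{aligned}
K_k^\omega &= \bigcup_{\sigma \in T,\ |\sigma| = k} f_{\sigma_1}^{\omega(\emptyset)} \circ f_{\sigma_2}^{\omega(\sigma_1)} \circ f_{\sigma_3}^{\omega(\sigma_1\sigma_2)} \circ \dots \circ f_{\sigma_k}^{\omega(\sigma_1 \dots \sigma_{k-1})}(K_0),\\
\mu_k^\omega &= \sum_{\sigma \in T,\ |\sigma| = k} w_{\sigma_1}^{\omega(\emptyset)}\, w_{\sigma_2}^{\omega(\sigma_1)} \cdots w_{\sigma_k}^{\omega(\sigma_1 \dots \sigma_{k-1})}\; f_{\sigma_1}^{\omega(\emptyset)} \circ f_{\sigma_2}^{\omega(\sigma_1)} \circ \dots \circ f_{\sigma_k}^{\omega(\sigma_1 \dots \sigma_{k-1})}(\mu_0),\\
K^\omega &= \lim_{k \to \infty} K_k^\omega, \qquad \mu^\omega = \lim_{k \to \infty} \mu_k^\omega.
\end{aligned}
\tag{30}
\]
It follows from uniform contractivity that for all ω one has convergence in the Hausdorff and
strong Prokhorov metrics respectively, and that $K^\omega$ and $\mu^\omega$ are independent of $K_0$ and $\mu_0$.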
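As an illustration of the first line of (30), the sketch below computes a prefractal set $K_k^\omega$ on the real line. The two IFSs labelled F and G, with M = 2 maps each, are hypothetical choices made only for this example; exact rational arithmetic is used so the output can be inspected directly.

```python
from fractions import Fraction

# Hypothetical family of two IFSs on the real line, each with M = 2 maps.
IFS = {
    "F": [lambda x: x / 2, lambda x: x / 2 + Fraction(1, 2)],
    "G": [lambda x: x / 3, lambda x: x / 3 + Fraction(2, 3)],
}

def prefractal(omega, k, K0, sigma=()):
    """Prefractal set K_k^omega from (30), for omega given as {node: label}.

    At node sigma the map f_m of the IFS omega(sigma) is applied to the set
    built from the subtree at sigma * m, so the map chosen at the root is
    applied last, as in the composition in (30)."""
    if len(sigma) == k:
        return set(K0)
    out = set()
    for m, f in enumerate(IFS[omega[sigma]], start=1):
        out |= {f(x) for x in prefractal(omega, k, K0, sigma + (m,))}
    return out

# Root and first branch use F, second branch uses G.
omega = {(): "F", (1,): "F", (2,): "G"}
K2 = prefractal(omega, 2, {Fraction(0)})
```

With this tree code, `K2` consists of the four points 0, 1/4, 1/2 and 5/6: the left half is a copy built with F and the right half a copy built with G.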
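The collections of all such fractal sets and measures for fixed $\{F^\lambda\}_{\lambda \in \Lambda}$ are denoted by
\[
\mathcal{K}_\infty = \{K^\omega : \omega \in \Omega\}, \qquad \mathcal{M}_\infty = \{\mu^\omega : \omega \in \Omega\}. \tag{31}
\]
For each k one has
\[
K^\omega = \bigcup_{|\sigma| = k} K_\sigma^\omega, \qquad
K_\sigma^\omega := f_{\sigma_1}^{\omega(\emptyset)} \circ f_{\sigma_2}^{\omega(\sigma_1)} \circ f_{\sigma_3}^{\omega(\sigma_1\sigma_2)} \circ \dots \circ f_{\sigma_k}^{\omega(\sigma_1 \dots \sigma_{k-1})}\big(K^{\omega\rfloor\sigma}\big). \tag{32}
\]
The $M^k$ sets $K_\sigma^\omega$ are called the subfractals of $K^\omega$ at level k.
The maps $\omega \mapsto K^\omega$ and $\omega \mapsto \mu^\omega$ are Hölder continuous. More precisely: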
Proposition 15. With L and r as in (28),
\[
d_H\big(K^\omega, K^{\omega'}\big) \le \frac{2L}{1-r}\, d^\alpha(\omega, \omega') \quad\text{and}\quad
d_P\big(\mu^\omega, \mu^{\omega'}\big) \le \frac{2L}{1-r}\, d^\alpha(\omega, \omega'), \tag{33}
\]
where $\alpha = \log(1/r)/\log M$.
Proof. Applying (30) with $K_0$ replaced by {a}, and using (28) and repeated applications of
the triangle inequality, it follows that $d_H(K^\omega, a) \le (1 + r + r^2 + \cdots)L = L/(1-r)$ and so
$d_H(K^\omega, K^{\omega'}) \le 2L/(1-r)$ for any ω and ω′. If $d(\omega, \omega') = M^{-k}$ then $\omega(\sigma) = \omega'(\sigma)$ for
|σ| < k, and since $d_H(K^{\omega\rfloor\sigma}, K^{\omega'\rfloor\sigma}) \le 2L/(1-r)$, it follows from (32) and contractivity that
$d_H(K^\omega, K^{\omega'}) \le \frac{2L r^k}{1-r}$. Since $r^k = M^{-k\alpha} = d^\alpha(\omega, \omega')$, the result for sets follows.
The proof for measures is essentially identical; one replaces $\mu_0$ by $\delta_a$ in (30). □
Example 16 (Random Sierpinski triangles and tree codes). The relation between a tree code ω
and the corresponding fractal set K ω can readily be seen in Fig. 2. The IFSs F = (f1 , f2 , f3 ) and
G = (g1 , g2 , g3 ) act on R2 , and the fm and gm are similitudes with contraction ratios 1/2 and 1/3
respectively and fixed point m. If a node is labelled F , then reading from left to right the three
main branches of the subtree associated with that node correspond to f1 , f2 , f3 respectively.
Similar remarks apply if the node is labelled G.
If the three functions in F and the three functions in G each are given weights equal to 1/3
then the measure µω is distributed over K ω in such a way that 1/3 of the mass is in each of the
top level triangles, 1/9 in each of the next level triangles, etc.
Fig. 2. Random Sierpinski triangles and tree codes.
The sets K ω and measures µω in Definition 14 are not normally self similar in any natural
sense. However, there is an associated notion of statistical self similarity. For this we need the
following definition. The reason for the notation ρ∞ in the definition can be seen from Theorem 49.
Definition 17 (Standard random fractals). The probability distribution ρ∞ on Ω is defined by
choosing ω(σ) ∈ Λ for each σ ∈ T in an iid manner according to P. The random set $K = \omega \mapsto K^\omega$
and the random measure $M = \omega \mapsto \mu^\omega$, each defined by choosing ω ∈ Ω according
to $\rho_\infty$, are called standard random fractals or random recursive fractals. The induced probability
distributions on $\mathcal{K}_\infty$ and $\mathcal{M}_\infty$ respectively are defined by $\mathfrak{K}_\infty$ = dist K and $\mathfrak{M}_\infty$ = dist M.
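Sampling from $\rho_\infty$ down to a finite height amounts to drawing one label per node independently. A minimal sketch, assuming a finite label set Λ with weights given by P; the function name and arguments are illustrative:

```python
import random
from itertools import product

def sample_tree_code(height, M, labels, weights, rng):
    """Sample a finite tree code of the given height by choosing omega(sigma)
    independently for every node sigma, with distribution `weights` on `labels`."""
    omega = {}
    for k in range(height + 1):
        for sigma in product(range(1, M + 1), repeat=k):
            omega[sigma] = rng.choices(labels, weights=weights)[0]
    return omega

rng = random.Random(0)
omega = sample_tree_code(3, 3, ["F", "G"], [0.5, 0.5], rng)
```

The resulting dictionary has 1 + 3 + 9 + 27 = 40 entries, one for each node of height at most 3.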
In the case of random fractal measures one can replace the uniform assumptions in (28) by
average or expected type assumptions and the map $\omega \mapsto \mu^\omega$ is then defined a.s., see [21,22]. In
Section 6 we do the analogue of this for the case of V -variable random fractal measures.
It follows from the definitions that K and M are statistically self similar in the sense that
\[
\operatorname{dist} K = \operatorname{dist} F(K_1, \dots, K_M), \qquad
\operatorname{dist} M = \operatorname{dist} F(M_1, \dots, M_M), \tag{34}
\]
where F is a random IFS chosen from (F λ )λ∈Λ according to P , K1 , . . . , KM are iid copies of K
which are independent of F , and M1 , . . . , MM are iid copies of M which are independent of F.
Here, and in the following sections, an IFS F acts on M-tuples of subsets K1 , . . . , KM of X
and measures µ1 , . . . , µM over X by
\[
F(K_1, \dots, K_M) = \bigcup_{m=1}^{M} f_m(K_m), \qquad
F(\mu_1, \dots, \mu_M) = \sum_{m=1}^{M} w_m f_m(\mu_m). \tag{35}
\]
This extends in a pointwise manner to random IFSs acting on random sets and random measures
as in (34).
We use the terminology “standard” to distinguish the class of random fractals given by Definition 17 and discussed in [15,17,21,22,30] from other classes of random fractals in the literature.
5. V -variable tree codes
5.1. Overview
We continue with the assumptions that F = {X, F λ , λ ∈ Λ, P } is a family of IFSs as in (1)
and (2), and that {F λ }λ∈Λ satisfies the uniform contractive and uniform bounded conditions (28).
In Theorem 45 and Example 47 the uniformity conditions are replaced by pointwise average
conditions.
In Section 5.2, Definition 18, we define the set ΩV ⊂ Ω of V -variable tree codes, where Ω
is the set of tree codes in Definition 13. Since the {F λ }λ∈Λ are uniformly contractive this leads
directly to the class KV of V -variable fractal sets and the class MV of V -variable fractal measures.
In Section 5.3, $\Omega_V$ is alternatively obtained from an IFS $\Phi^V = (\Omega^V, \Phi^a, a \in A_V)$ acting
on $\Omega^V$. More precisely, the attractor $\Omega_V^* \subset \Omega^V$ of $\Phi^V$ projects in any of the V coordinate
directions to $\Omega_V$. However, $\Omega_V^* \ne (\Omega_V)^V$ and in fact there is a high degree of dependence
between the coordinates of any $\omega = (\omega_1, \dots, \omega_V) \in \Omega_V^*$.
If
\[
\omega = \lim_{k \to \infty} \Phi^{a_0} \circ \Phi^{a_1} \circ \dots \circ \Phi^{a_{k-1}}\big(\omega_1^0, \dots, \omega_V^0\big)
\]
we say ω has address $a_0 a_1 \dots a_k \dots$. The limit is independent of $(\omega_1^0, \dots, \omega_V^0)$.
In Section 5.4 a formalism is developed for finding the V -tuple of tree codes (ω1 , . . . , ωV )
from the address a0 a1 . . . ak . . . , see Proposition 33 and Example 34. Conversely, given a tree
code ω one can find all possible addresses a0 a1 . . . ak . . . of V -tuples (ω1 , . . . , ωV ) ∈ ΩV∗ for
which ω1 = ω.
In Section 5.5 the probability distribution ρV on the set ΩV of V -variable tree codes is defined
and discussed. The probability P on Λ first leads to a natural probability PV on the index set AV
for the IFS Φ V , see Definition 35. This then turns Φ V into an IFS (Ω V , Φ a , a ∈ AV , PV ) with
weights whose measure attractor ρV∗ is a probability distribution on its set attractor ΩV∗ . The
projection of ρV∗ in any coordinate direction is the same, is supported on ΩV and is denoted
by ρV , see Theorem 38.
5.2. V -variability
Definition 18 (V-variable tree codes and fractals). A tree code ω ∈ Ω is V-variable if for each
positive integer k there are at most V distinct tree codes of the form $\omega\rfloor\tau$ with |τ| = k. The set of
V-variable tree codes is denoted by $\Omega_V$.
Similarly a finite tree code ω of height p is V-variable if for each k < p there are at most V
distinct finite subtree codes $\omega\rfloor\tau$ with |τ| = k.
For a uniformly contractive family $\{F^\lambda\}_{\lambda \in \Lambda}$ of IFSs, if ω is V-variable then the fractal set $K^\omega$
and fractal measure $\mu^\omega$ in (30) are said to be V-variable. The collections of all V-variable sets
and measures corresponding to $\{F^\lambda\}_{\lambda \in \Lambda}$ are denoted by
\[
\mathcal{K}_V = \{K^\omega : \omega \in \Omega_V\}, \qquad \mathcal{M}_V = \{\mu^\omega : \omega \in \Omega_V\} \tag{36}
\]
respectively, cf. (31).
If V = 1 then ω is V -variable if and only if |σ | = |σ ' | implies ω(σ ) = ω(σ ' ), i.e. if and only
if for each k all values of ω(σ ) at level k are equal. In the case V > 1, if ω is V -variable then
for each k there are at most V distinct values of ω(σ ) at level k = |σ |, but this is not sufficient to
imply V -variability.
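The last remark can be checked on a small finite tree code. The sketch below assumes the dictionary representation of finite tree codes (nodes as tuples over 1..M) and hypothetical helper names. It builds a code of height 3 with M = 2 that takes at most two distinct values at every level, yet has four distinct subtrees at level 2.

```python
from itertools import product

def subtree_key(omega, tau):
    """Hashable form of the subtree code of omega starting at node tau."""
    n = len(tau)
    return frozenset((s[n:], lab) for s, lab in omega.items() if s[:n] == tau)

def distinct_subtrees(omega, M, k):
    """Number of distinct subtree codes starting at the nodes of level k."""
    return len({subtree_key(omega, tau) for tau in product(range(1, M + 1), repeat=k)})

def is_v_variable(omega, M, height, V):
    """Finite version of Definition 18: at most V distinct subtrees at each level k < height."""
    return all(distinct_subtrees(omega, M, k) <= V for k in range(height))

# Height 3, M = 2: label F on levels 0, 1, 2 and four different leaf patterns on level 3.
omega = {s: "F" for k in range(3) for s in product((1, 2), repeat=k)}
for tau, pair in zip(product((1, 2), repeat=2), ["FF", "FG", "GF", "GG"]):
    omega[tau + (1,)], omega[tau + (2,)] = pair

values_per_level = [len({omega[s] for s in omega if len(s) == k}) for k in range(4)]
```

Here `values_per_level` never exceeds 2, but `is_v_variable(omega, 2, 3, 2)` is false because the four level-2 subtrees are pairwise different; the code is 4-variable and not 3-variable.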
Remark 19 (V -variable terminology). The motivation for the terminology “V -variable fractal”
is as follows. Suppose all functions fmλ belong to the same group G of transformations. For
example, if X = Rn then G might be the group of invertible similitudes, invertible affine transformations or invertible projective transformations. Two sets A and B are said to be equivalent
modulo G if A = g(B) for some g ∈ G. If $K^\omega$ is V-variable and k is a positive integer, then there
are at most V distinct trees of the form $\omega\rfloor\sigma$ such that |σ| = k. If |σ| = |σ′| = k and $\omega\rfloor\sigma = \omega\rfloor\sigma'$,
then from (32)
\[
K_\sigma^\omega = g\big(K_{\sigma'}^\omega\big), \qquad
g = f_{\sigma_1}^{\omega(\emptyset)} \circ \dots \circ f_{\sigma_k}^{\omega(\sigma_1 \dots \sigma_{k-1})} \circ \Big(f_{\sigma'_k}^{\omega(\sigma'_1 \dots \sigma'_{k-1})}\Big)^{-1} \circ \dots \circ \Big(f_{\sigma'_1}^{\omega(\emptyset)}\Big)^{-1}. \tag{37}
\]
In particular, $K_\sigma^\omega$ and $K_{\sigma'}^\omega$ are equivalent modulo G.
Thus the subfractals of $K^\omega$ at level k form at most V distinct equivalence classes modulo G.
However, the actual equivalence classes depend upon the level. Similar remarks apply to V-variable fractal measures.
Proposition 20. A tree code ω is V-variable iff for every positive integer k the finite tree codes
$\omega\lfloor k$ are V-variable.
Proof. If ω is V-variable the same is true for every finite tree code of the form $\omega\lfloor k$.
If ω is not V-variable then for some k there are at least V + 1 distinct subtree codes $\omega\rfloor\tau$ with
|τ| = k. But then for some p the V + 1 corresponding finite tree codes $(\omega\rfloor\tau)\lfloor p$ must also be
distinct. It follows $\omega\lfloor(k + p)$ is not V-variable. □
Example 21 (V -variable Sierpinski triangles). The first tree code in Fig. 2 is an initial segment
of a 3-variable tree code but not of a 2-variable tree code, while the second tree is an initial
segment of a 2-variable tree code but not of a 1-variable tree code. The corresponding Sierpinski
type triangles are, to the level of approximation shown, 3-variable and 2-variable respectively.
Theorem 22. The $\Omega_V$ are closed and nowhere dense in Ω, and
\[
\Omega_V \subset \Omega_{V+1}, \qquad d_H(\Omega_V, \Omega) < \frac{1}{V}, \qquad
\bigcup_{V \ge 1} \Omega_V \subsetneq \overline{\bigcup_{V \ge 1} \Omega_V} = \Omega,
\]
where the bar denotes closure in the metric d.
Proof. For the inequality suppose ω ∈ Ω and define k by $M^k \le V < M^{k+1}$. Then if ω′ is chosen
so $\omega'(\sigma) = \omega(\sigma)$ for |σ| ≤ k and $\omega'(\sigma)$ is constant for |σ| > k, it follows $\omega' \in \Omega_V$ and $d(\omega', \omega) \le M^{-(k+1)} < V^{-1}$,
hence $d_H(\Omega_V, \Omega) < V^{-1}$. The remaining assertions are clear. □
5.3. An IFS acting on V -tuples of tree codes
Definition 23. The metric space $(\Omega^V, d)$ is the set of V-tuples from Ω with the metric
\[
d\big((\omega_1, \dots, \omega_V), (\omega'_1, \dots, \omega'_V)\big) = \max_{1 \le v \le V} d(\omega_v, \omega'_v),
\]
where d on the right side is as in Definition 13.
This is a complete bounded metric and is compact if Λ is finite since the same is true for
V = 1. See Definition 13 and the comment which follows it. The induced Hausdorff metric on
BC(Ω V ) is complete and bounded, and is compact if Λ is finite. See the comments following
Definition 1.
The notion of V -variability extends to V -tuples of tree codes, V -tuples of sets and V -tuples
of measures.
Definition 24 (V-variable V-tuples). The V-tuple of tree codes of the form $\omega = (\omega_1, \dots, \omega_V) \in \Omega^V$
is V-variable if for each positive integer k there are at most V distinct subtrees of the form
$\omega_v\rfloor\sigma$ with v ∈ {1, . . . , V} and |σ| = k. The set of V-variable V-tuples of tree codes is denoted
by $\Omega_V^*$.
Let $\{F^\lambda\}_{\lambda \in \Lambda}$ be a uniformly contractive family of IFSs. The corresponding sets $\mathcal{K}_V^*$ of
V-variable V-tuples of fractal sets, and $\mathcal{M}_V^*$ of V-variable V-tuples of fractal measures, are
\[
\begin{aligned}
\mathcal{K}_V^* &= \big\{(K^{\omega_1}, \dots, K^{\omega_V}) : (\omega_1, \dots, \omega_V) \in \Omega_V^*\big\},\\
\mathcal{M}_V^* &= \big\{(\mu^{\omega_1}, \dots, \mu^{\omega_V}) : (\omega_1, \dots, \omega_V) \in \Omega_V^*\big\},
\end{aligned}
\]
where $K^{\omega_v}$ and $\mu^{\omega_v}$ are as in Definition 14.
Proposition 25. The projection of $\Omega_V^*$ in any coordinate direction equals $\Omega_V$, however
$\Omega_V^* \subsetneq (\Omega_V)^V$.
Proof. To see the projection map is onto consider (ω, . . . , ω) for $\omega \in \Omega_V$. To see $\Omega_V^* \subsetneq (\Omega_V)^V$
note that a V-tuple of V-variable tree codes need not itself be V-variable. □
Notation 26. Given λ ∈ Λ and ω1 , . . . , ωM ∈ Ω define ω = λ ∗ (ω1 , . . . , ωM ) ∈ Ω by ω(∅) = λ
and ω(mσ ) = ωm (σ ). Thus λ ∗ (ω1 , . . . , ωM ) is the tree code with λ at the base node ∅ and the
tree ωm attached to the node m for m = 1, . . . , M.
Similar notation applies if the ω1 , . . . , ωM are finite tree codes all of the same height.
We define maps on V -tuples of tree codes and a corresponding IFS on Ω V as follows.
Definition 27 (The IFS acting on the set of V -tuples of tree codes). Let V be a positive integer.
Let AV be the set of all pairs of maps a = (I, J ) = (I a , J a ), where
I : {1, . . . , V } → Λ,
J : {1, . . . , V } × {1, . . . , M} → {1, . . . , V }.
For $a \in A_V$ the map $\Phi^a : \Omega^V \to \Omega^V$ is defined for $\omega = (\omega_1, \dots, \omega_V)$ by
\[
\Phi^a(\omega) = \big(\Phi_1^a(\omega), \dots, \Phi_V^a(\omega)\big), \qquad
\Phi_v^a(\omega) = I^a(v) * \big(\omega_{J^a(v,1)}, \dots, \omega_{J^a(v,M)}\big). \tag{38}
\]
Thus $\Phi_v^a(\omega)$ is the tree code with base node $I^a(v)$, and at the end of each of its M base branches
are attached copies of $\omega_{J^a(v,1)}, \dots, \omega_{J^a(v,M)}$ respectively.
The IFS $\Phi^V$ acting on V-tuples of tree codes and without a probability distribution at this
stage is defined by
\[
\Phi^V := \big(\Omega^V, \Phi^a, a \in A_V\big). \tag{39}
\]
Note that $\Phi^a : \Omega_V^* \to \Omega_V^*$ for each $a \in A_V$.
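For finite tree codes the map $\Phi^a$ in (38) is a relabelling and grafting operation. A minimal sketch, assuming tree codes are dictionaries from nodes to labels and that a = (I, J) is passed as a list of labels together with a list of rows of indices in 1..V; the names `graft` and `apply_phi` are hypothetical.

```python
def graft(label, subtrees):
    """Notation 26: the tree code label * (omega_1, ..., omega_M)."""
    omega = {(): label}
    for m, sub in enumerate(subtrees, start=1):
        for sigma, lab in sub.items():
            omega[(m,) + sigma] = lab
    return omega

def apply_phi(a, omegas):
    """Phi^a from (38): component v gets base label I(v) and, on its m-th
    branch, a copy of the input component with index J(v, m)."""
    I, J = a
    return tuple(
        graft(I[v], [omegas[j - 1] for j in J[v]])
        for v in range(len(omegas))
    )

# V = 2, M = 2, starting from two tree codes of height 0.
start = ({(): "F"}, {(): "G"})
a = (["G", "F"], [[1, 2], [2, 2]])
out = apply_phi(a, start)
```

The first output component has base label G with the two inputs attached in order; the second has base label F with two copies of the second input. Both copies in the second component come from the same input, which is how dependence between subtrees arises.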
Notation 28. It is often convenient to write $a = (I^a, J^a) \in A_V$ in the form
\[
a = \begin{bmatrix}
I^a(1) & J^a(1,1) & \dots & J^a(1,M)\\
\vdots & \vdots & \ddots & \vdots\\
I^a(V) & J^a(V,1) & \dots & J^a(V,M)
\end{bmatrix}. \tag{40}
\]
Thus AV is then the set of all V × (1 + M) matrices with entries in the first column belonging
to Λ and all other entries belonging to {1, . . . , V }.
Theorem 29. Suppose Φ V = (Ω V , Φ a , a ∈ AV ) is an IFS as in Definition 27, with Λ possibly infinite. Then each Φ a is a contraction map with Lipschitz constant 1/M. Moreover,
with Φ V acting on subsets of Ω V as in (7) and using the notation of Definition 1, we have
Φ V : BC(Ω V ) → BC(Ω V ) and Φ V is a contractive map with Lipschitz constant 1/M. The
unique fixed point of Φ V is ΩV∗ and in particular its projection in any coordinate direction
equals ΩV .
Proof. It is readily checked that each $\Phi^a$ is a contraction map with Lipschitz constant 1/M.
We can establish directly that $\Phi^V(E) := \bigcup_{a \in A_V} \Phi^a(E)$ is closed if E is closed, since any
Cauchy sequence from $\Phi^V(E)$ eventually belongs to $\Phi^a(E)$ for some fixed a. It follows that $\Phi^V$
is a contraction map on the complete space $(BC(\Omega^V), d_H)$ with Lipschitz constant 1/M and so
has a unique bounded closed fixed point (i.e. attractor).
In order to show this attractor is the set $\Omega_V^*$ from Definition 24, note that $\Omega_V^*$ is bounded and
closed in $\Omega^V$. It is closed under $\Phi^a$ for any $a \in A_V$ as noted before. Moreover, each $\omega \in \Omega_V^*$ is
of the form $\Phi^a(\omega')$ for some $\omega' \in \Omega_V^*$ and some (in fact many) $\Phi^a$. To see this, consider the VM
tree codes of the form $\omega_v\rfloor m$ for 1 ≤ v ≤ V and 1 ≤ m ≤ M, where each m is the corresponding
node of T of height one. There are at most V distinct such tree codes, which we denote by
$\omega'_1, \dots, \omega'_V$, possibly with repetitions. Then from (38)
\[
(\omega_1, \dots, \omega_V) = \Phi^a\big(\omega'_1, \dots, \omega'_V\big),
\]
provided
\[
I^a(v) = \omega_v(\emptyset), \qquad \omega'_{J^a(v,m)} = \omega_v\rfloor m.
\]
So $\Omega_V^*$ is invariant under $\Phi^V$ and hence is the unique attractor of the IFS $\Phi^V$. □
In the previous theorem, although Φ V is an IFS, neither Theorem 8 nor the extensions in
Remark 12 apply directly. If Λ is not finite then Ω V is neither separable nor compact. Moreover,
the map Φ V acts on sets by taking infinite unions and so we cannot apply Theorem 8(d) to find
a set attractor for Φ V , since in general the union of an infinite number of closed sets need not be
closed.
As a consequence of the theorem, approximations to V -variable V -tuples of tree codes, and in
particular to individual V -variable tree codes, can be built up from a V -tuple τ of finite tree codes
of height 0 such as τ = (λ∗ , . . . , λ∗ ) for some λ∗ ∈ Λ, and a finite sequence a0 , a1 , . . . , ak ∈ AV ,
by computing the height k finite tree code Φ a0 ◦ · · · ◦ Φ ak (τ ). Here we use the natural analogue
of (38) for finite tree codes. See also the diagrams in [6, Figs. 19, 20].
5.4. Constructing tree codes from addresses and conversely
Definition 30 (Addresses for V-variable V-tuples of tree codes). For each sequence $a = a_0 a_1 \dots a_k \dots$
with $a_k \in A_V$ and $\Phi^{a_k}$ as in (38), define the corresponding V-tuple $\omega^a$ of tree
codes by
\[
\omega^a = \big(\omega_1^a, \dots, \omega_V^a\big) := \lim_{k \to \infty} \Phi^{a_0} \circ \Phi^{a_1} \circ \dots \circ \Phi^{a_k}\big(\omega_1^0, \dots, \omega_V^0\big), \tag{41}
\]
for any initial $(\omega_1^0, \dots, \omega_V^0) \in \Omega^V$. The sequence a is called an address for the V-variable V-tuple
of tree codes $\omega^a$. The set of all such addresses a is denoted by $A_V^\infty$.
The tree code $\Phi^{a_0} \circ \Phi^{a_1} \circ \dots \circ \Phi^{a_k}(\omega_1^0, \dots, \omega_V^0)$ is independent of $(\omega_1^0, \dots, \omega_V^0) \in \Omega^V$ up to
and including level k, and hence agrees with $\omega^a$ for these levels. The sequence in (41) converges
exponentially fast since Lip $\Phi^{a_k}$ is ≤ 1/M.
The map $a \mapsto \omega^a : A_V^\infty \to \Omega_V^*$ is many-to-one, since the composition of different $\Phi^a$s may
give the same map even in simple situations as the following example shows.
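Example 31 (Non-uniqueness of addresses). The map $a \mapsto \omega^a : A_V^\infty \to \Omega_V^*$ is many-to-one,
since the composition of different $\Phi^a$s may give the same map even in simple situations. For
example, suppose
\[
M = 1, \qquad V = 2, \qquad F \in \Lambda, \qquad
\Phi^a = \begin{bmatrix} F & 1\\ F & 1 \end{bmatrix}, \qquad
\Phi^b = \begin{bmatrix} F & 1\\ F & 2 \end{bmatrix}.
\]
Since M = 1 tree codes here are infinite sequences, i.e. 1-branching tree codes. One readily
checks from (38) that
\[
\Phi^a(\omega, \omega') = (F * \omega, F * \omega), \qquad \Phi^b(\omega, \omega') = (F * \omega, F * \omega'),
\]
and so
\[
\begin{aligned}
\Phi^a \circ \Phi^b(\omega, \omega') = \Phi^a \circ \Phi^a(\omega, \omega') = \Phi^b \circ \Phi^a(\omega, \omega') &= (F * F * \omega, F * F * \omega),\\
\Phi^b \circ \Phi^b(\omega, \omega') &= (F * F * \omega, F * F * \omega').
\end{aligned}
\]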
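The identities in Example 31 can be confirmed mechanically. In this sketch, which is only an illustration, a 1-branching tree code is represented by a finite tuple of labels, and the single-letter tails "x" and "y" are placeholders standing for two different sequences ω and ω′.

```python
def phi(J, pair):
    """M = 1, V = 2, both base labels equal to F: component v becomes F * pair[J(v, 1)]."""
    return tuple(("F",) + pair[j - 1] for j in J)

def phi_a(pair):
    """Second column of the matrix for a is (1, 1)."""
    return phi((1, 1), pair)

def phi_b(pair):
    """Second column of the matrix for b is (1, 2)."""
    return phi((1, 2), pair)

start = (("x",), ("y",))  # placeholders for omega and omega'
ab = phi_a(phi_b(start))
aa = phi_a(phi_a(start))
ba = phi_b(phi_a(start))
bb = phi_b(phi_b(start))
```

The three compositions `ab`, `aa` and `ba` coincide, while `bb` keeps the second input in its second component, so the addresses starting with ab, aa and ba cannot be told apart from the resulting pair of tree codes.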
The following definition is best understood from Example 34.
Definition 32 (Tree skeletons). Given an address $a = a_0 a_1 \dots a_k \dots \in A_V^\infty$ and v ∈ {1, . . . , V} the
corresponding tree skeleton $\hat{J}_v^a : T \to \{1, \dots, V\}$ is defined by
\[
\begin{aligned}
&\hat{J}_v^a(\emptyset) = v, \qquad \hat{J}_v^a(m_1) = J^{a_0}(v, m_1), \qquad \hat{J}_v^a(m_1 m_2) = J^{a_1}\big(\hat{J}_v^a(m_1), m_2\big), \quad \dots,\\
&\hat{J}_v^a(m_1 \dots m_k) = J^{a_{k-1}}\big(\hat{J}_v^a(m_1 \dots m_{k-1}), m_k\big), \quad \dots,
\end{aligned}
\tag{42}
\]
where the maps $J^{a_k}(v, m)$ are as in (40).
The tree skeleton depends on the maps $J^{a_k}(w, m)$, but not on the maps $I^{a_k}(w)$ and hence
not on the set $\{F^\lambda\}_{\lambda \in \Lambda}$ of IFSs and its indexing set Λ.
The V -tuple of tree codes (ω1a , . . . , ωVa ) can be recovered from the address a = a0 a1 . . . ak . . .
as follows.
Proposition 33 (Tree codes from addresses). If $a = a_0 a_1 \dots a_k \dots \in A_V^\infty$ is an address, $I^{a_k}$
and $J^{a_k}$ are as in (40), and $\hat{J}_v^a$ is the tree skeleton corresponding to the $J^{a_k}$, then for each
σ ∈ T and 1 ≤ v ≤ V,
\[
\omega_v^a(\sigma) = I^{a_k}\big(\hat{J}_v^a(\sigma)\big) \quad\text{where } k = |\sigma|. \tag{43}
\]
Proof. The proof is implicit in Example 34. A formal proof can be given by induction. □
Example 34 (The espalier technique).¹ We use this to find tree codes from addresses and addresses from tree codes.
We first show how to represent an address $a = a_0 a_1 \dots a_k \dots \in A_V^\infty$ by means of a diagram as
in Fig. 3. From this we construct the V-variable V-tuple of tree codes $(\omega_1, \dots, \omega_V) \in \Omega_V^*$ with
address a.
Conversely, given a V-variable V-tuple of tree codes $(\omega_1, \dots, \omega_V) \in \Omega_V^*$ we show how to
find the set of all its possible addresses. Moreover, given a single tree code $\omega \in \Omega_V$ we find all
possible $(\omega_1, \dots, \omega_V) \in \Omega_V^*$ with $\omega_1 = \omega$ and all possible addresses in this case.
For the example here let Λ = {F, G} where F and G are symbols. Suppose M = 3 and V = 5.
Suppose $a = a_0 a_1 \dots a_k \dots \in A_V^\infty$ is an address of $(\omega_1, \dots, \omega_V) \in \Omega_V^*$, where
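\[
a_0 = \begin{bmatrix} F & 3 & 1 & 2\\ F & 3 & 1 & 3\\ G & 4 & 3 & 1\\ F & 4 & 3 & 4\\ G & 3 & 4 & 4 \end{bmatrix}, \qquad
a_1 = \begin{bmatrix} F & 1 & 2 & 2\\ G & 2 & 2 & 5\\ G & 4 & 5 & 3\\ F & 5 & 1 & 5\\ F & 3 & 5 & 3 \end{bmatrix},
\]
\[
a_2 = \begin{bmatrix} F & 1 & 2 & 3\\ G & 2 & 5 & 2\\ F & 3 & 2 & 5\\ F & 1 & 4 & 4\\ G & 5 & 3 & 4 \end{bmatrix}, \qquad
a_3 = \begin{bmatrix} G & * & * & *\\ G & * & * & *\\ F & * & * & *\\ G & * & * & *\\ G & * & * & * \end{bmatrix}. \tag{44}
\]
¹ Espalier [verb]: to train a fruit tree or ornamental shrub to grow flat against a wall, supported on a lattice.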
Fig. 3. Constructing tree codes from an address.
We will see that up to level 2 the tree codes ω1 and ω2 are those shown in Fig. 2. Although ω1
and ω2 are 3-variable and 2-variable respectively up to level 2, it will follow from Fig. 3 that they
are 5-variable up to level 3 and are not 4-variable.
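The claims about $\omega_1$ and $\omega_2$ can be checked by applying (42) and (43) directly. The sketch below is an illustration under stated assumptions: the entries of $a_0, a_1, a_2$ and the first column of $a_3$ are transcribed from (44) as best they can be read from the printed layout, and the function names are hypothetical.

```python
from itertools import product

M, V = 3, 5
# First columns I^{a_k} of a_0, ..., a_3 (assumed transcription of (44)).
I = ["FFGFG", "FGGFF", "FGFFG", "GGFGG"]
# Remaining columns J^{a_k}(v, m) of a_0, a_1, a_2, one row per box v.
J = [
    [(3, 1, 2), (3, 1, 3), (4, 3, 1), (4, 3, 4), (3, 4, 4)],
    [(1, 2, 2), (2, 2, 5), (4, 5, 3), (5, 1, 5), (3, 5, 3)],
    [(1, 2, 3), (2, 5, 2), (3, 2, 5), (1, 4, 4), (5, 3, 4)],
]

def skeleton(v, sigma):
    """Box number of node sigma in the tree growing out of box v, as in (42)."""
    box = v
    for k, m in enumerate(sigma):
        box = J[k][box - 1][m - 1]
    return box

def tree_code(v, height):
    """Finite tree code omega_v up to the given height, using (43)."""
    return {
        sigma: I[len(sigma)][skeleton(v, sigma) - 1]
        for k in range(height + 1)
        for sigma in product(range(1, M + 1), repeat=k)
    }

def distinct_subtrees(omega, k):
    """Number of distinct subtree codes of omega starting at level k."""
    keys = set()
    for tau in product(range(1, M + 1), repeat=k):
        keys.add(frozenset((s[k:], lab) for s, lab in omega.items() if s[:k] == tau))
    return len(keys)
```

With these entries, `tree_code(1, 2)` has three distinct subtrees at level 1 and `tree_code(2, 2)` has two, while `tree_code(1, 3)` and `tree_code(2, 3)` each have five distinct subtrees at level 2, which agrees with the statements above about 3-, 2- and 5-variability.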
The diagram in Fig. 3 is obtained from a = a0 a1 a2 a3 . . . by espaliering V copies of the tree T
in Definition 13 up through an infinite lattice of V boxes at each level 0, 1, 2, 3, . . . . One tree
grows out of each box at level 0, and one element from Λ is assigned to each box at each level.
When two or more branches pass through the same box from below they inosculate, i.e. their sub
branches merge and are indistinguishable from that point upwards. More precisely, a determines
the diagram in the following manner. For each level k and starting from each box v at that level,
three branches leave the box, one for each m ∈ {1, 2, 3} (the three branch types are drawn differently in Fig. 3):
the branch for m terminates in box number $J^{a_k}(v, m)$ at level k + 1.
The element $I^{a_k}(v) \in \Lambda$ is assigned to box v at level k.
Conversely, any such diagram determines a unique address $a = a_0 a_1 a_2 a_3 \dots$. More precisely,
consider an infinite lattice of V boxes at each level 0, 1, 2, 3, . . . . Suppose at each level k there is
either F or G in each of the V boxes, and from each box there are 3 branches, one of each of the three types,
each branch terminating in a box at level k + 1. From this information one can read
off $I^{a_k}(v)$ and $J^{a_k}(v, m)$ for each k ≥ 0, 1 ≤ v ≤ V and 1 ≤ m ≤ M, and hence determine a.
The diagram, and hence the address a, determines the tree skeleton $\hat{J}_v^a$ by assigning to each
node of the copy of T growing out of box v at level 0, the number of the particular box in which
that node sits. If ωa = (ω1 , . . . , ωV ) is the V -tuple of tree codes with address a then the tree
code ωv is obtained by assigning to each node of the copy of T growing out of box v at level 0
the element (fruit?) from Λ in the particular box in which that node sits.
Conversely, suppose $\omega \in \Omega_V$ is a single V-variable tree code. Then the set of all possible
$\boldsymbol{\omega} \in \Omega_V^* \subset \Omega^V$ of the form $\boldsymbol{\omega} = (\omega_1, \omega_2, \dots, \omega_V)$ with $\omega_1 = \omega$, and the set of all diagrams and
corresponding addresses $a \in A_V^\infty$ for such $\boldsymbol{\omega}$, is found as follows. Espalier a copy of T up through
the infinite lattice with V boxes at each level in such a way that if $\sigma_0 \dots \sigma_k$ and $\sigma'_0 \dots \sigma'_k$ sit in
the same box at level k then the sub tree codes $\omega\rfloor\sigma_0 \dots \sigma_k$ and $\omega\rfloor\sigma'_0 \dots \sigma'_k$ are equal. Since ω is
V-variable, this is always possible. From level k onwards the two sub trees are fused together.
The possible diagrams corresponding to this espaliered T are constructed as follows. For
each σ ∈ T the element ω(σ ) ∈ Λ is assigned to the box containing σ . By construction, this is
the same element for any two σ ’s in the same box. The three branches of the diagram from this
box up to the next level are given by the three sub branches of the espaliered T growing out
of that box. If T does not pass through some box, then the F or G in that box, and the three
branches of the diagram from that box to the next level up, can be assigned arbitrarily.
In this manner one obtains all possible diagrams for which the tree growing out of box 1
at level 0 is ω. Each diagram gives an address $a \in A_V^\infty$ as before, and the corresponding
$\omega^a = (\omega_1, \dots, \omega_V) \in \Omega_V^*$ with address a satisfies $\omega_1 = \omega$. In a similar manner, the set of possible diagrams and corresponding addresses can be obtained for any $\boldsymbol{\omega} = (\omega_1, \omega_2, \dots, \omega_V) \in \Omega_V^* \subset \Omega^V$.
5.5. The probability distribution on V -variable tree codes
Corresponding to the probability distribution P on Λ in (2) there are natural probability distributions ρV on ΩV , KV on KV and MV on MV . See Definition 18 for notation and for the
following also note Definition 24.
Definition 35 (Probability distributions on addresses and tree codes). The probability distribution $P_V$ on $A_V$, with notation as in (40), is defined by selecting $a = (I, J) \in A_V$ so that
I(1), . . . , I(V) ∈ Λ are iid with distribution P, so that J(1, 1), . . . , J(V, M) ∈ {1, . . . , V} are
iid with the uniform distribution $\{V^{-1}, \dots, V^{-1}\}$, and so the I(v) and J(w, m) are independent
of one another.
The probability distribution $P_V^\infty$ on $A_V^\infty$, the set of addresses $a = a_0 a_1 \dots$, is defined by
choosing the $a_k$ to be iid with distribution $P_V$. The probability distribution $\rho_V^*$ on $\Omega_V^*$ is the
image of $P_V^\infty$ under the map $a \mapsto \omega^a$ in (41). The probability distribution $\rho_V$ on $\Omega_V$ is the
projection of $\rho_V^*$ in any of the V coordinate directions. (By symmetry of the construction this is
independent of choice of direction.)
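Definition 35 translates into a simple sampling procedure for finite approximations. The sketch below is illustrative: it assumes a finite label set with weights given by P, represents tree codes as dictionaries from nodes to labels, and starts the composition in (41) from empty codes so that no arbitrary initial labels enter the result. All function names are hypothetical.

```python
import random
from itertools import product

def sample_symbol(V, M, labels, weights, rng):
    """Draw a = (I, J) from P_V: I(v) iid with the given weights, J(v, m) iid uniform on 1..V."""
    I = [rng.choices(labels, weights=weights)[0] for _ in range(V)]
    J = [[rng.randint(1, V) for _ in range(M)] for _ in range(V)]
    return I, J

def apply_phi(a, omegas):
    """Phi^a from (38) acting on a V-tuple of finite tree codes."""
    I, J = a
    out = []
    for v in range(len(omegas)):
        omega = {(): I[v]}
        for m, j in enumerate(J[v], start=1):
            for sigma, lab in omegas[j - 1].items():
                omega[(m,) + sigma] = lab
        out.append(omega)
    return tuple(out)

def sample_v_tuple(V, M, labels, weights, levels, rng):
    """Apply Phi^{a_0} o ... o Phi^{a_{levels-1}} to empty codes, with iid a_k from P_V.
    The result is a V-tuple of tree codes of height levels - 1."""
    address = [sample_symbol(V, M, labels, weights, rng) for _ in range(levels)]
    omegas = tuple({} for _ in range(V))
    for a in reversed(address):  # the last symbol of the address is applied first
        omegas = apply_phi(a, omegas)
    return omegas

def distinct_subtrees_in_tuple(omegas, M, k):
    """Number of distinct level-k subtrees over all components of the tuple."""
    keys = set()
    for omega in omegas:
        for tau in product(range(1, M + 1), repeat=k):
            keys.add(frozenset((s[k:], lab) for s, lab in omega.items() if s[:k] == tau))
    return len(keys)

rng = random.Random(1)
omegas = sample_v_tuple(3, 2, ["F", "G"], [0.5, 0.5], 4, rng)
```

Whatever symbols are drawn, each level of the resulting tuple carries at most V distinct subtrees, because every level-k subtree is one of the V components produced by the symbols $a_k, a_{k+1}, \dots$ alone.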
One obtains natural probability distributions on fractal sets and measures, and on V -tuples
of fractal sets and measures as follows.
Definition 36 (Probability distributions on V-variable fractals). Let $(F^\lambda)_{\lambda \in \Lambda}$ be a uniformly
contractive family of IFSs. The probability distributions $\mathfrak{K}_V^*$ and $\mathfrak{K}_V$ on $\mathcal{K}_V^*$ and $\mathcal{K}_V$ respectively
are those induced from $\rho_V^*$ and $\rho_V$ by the maps $(\omega_1, \dots, \omega_V) \mapsto (K^{\omega_1}, \dots, K^{\omega_V})$ and $\omega \mapsto K^\omega$
in Definitions 24 and 18.
Similarly, the probability distributions $\mathfrak{M}_V^*$ and $\mathfrak{M}_V$ on $\mathcal{M}_V^*$ and $\mathcal{M}_V$ respectively are those
induced from $\rho_V^*$ and $\rho_V$ by the maps $(\omega_1, \dots, \omega_V) \mapsto (\mu^{\omega_1}, \dots, \mu^{\omega_V})$ and $\omega \mapsto \mu^\omega$.
That is, $\mathfrak{K}_V^*$, $\mathfrak{K}_V$, $\mathfrak{M}_V^*$ and $\mathfrak{M}_V$ are the probability distributions of the random objects
$(K^{\omega_1}, \dots, K^{\omega_V})$, $K^\omega$, $(\mu^{\omega_1}, \dots, \mu^{\omega_V})$ and $\mu^\omega$ respectively, under the probability distributions
$\rho_V^*$ and $\rho_V$ on $(\omega_1, \dots, \omega_V)$ and ω. Since the projection of $\rho_V^*$ in each coordinate direction is $\rho_V$
it follows that the projection of $\mathfrak{K}_V^*$ in each coordinate direction is $\mathfrak{K}_V$ and the projection of $\mathfrak{M}_V^*$
in each coordinate direction is $\mathfrak{M}_V$. However, there is a high degree of dependence between the
components and in general $\rho_V^* \ne \rho_V^V$, $\mathfrak{K}_V^* \ne \mathfrak{K}_V^V$ and $\mathfrak{M}_V^* \ne \mathfrak{M}_V^V$.
Definition 37 (The IFS acting on the set of V-tuples of tree codes). The IFS $\Phi^V$ in (39) is
extended to an IFS with probabilities by
\[
\Phi^V := \big(\Omega^V, \Phi^a, a \in A_V, P_V\big). \tag{45}
\]
Theorem 38. A unique measure attractor exists for $\Phi^V$ and equals $\rho_V^*$. In particular, the projection of $\rho_V^*$ in any coordinate direction is $\rho_V$.
Proof. For $a \in A_V$ let $R^a : A_V^\infty \to A_V^\infty$ denote the operator $\mathbf{a} \mapsto a * \mathbf{a}$, which prepends the symbol a to the address $\mathbf{a}$. Then $(A_V^\infty, R^a, a \in A_V, P_V)$
is an IFS and the $R^a$ are contractive with Lipschitz constant 1/2 under the metric $d(\mathbf{a}, \mathbf{b}) = 2^{-k}$, where k is the least integer such that $a_k \ne b_k$. Thus this IFS has a unique
attractor which from Definition 35 is $P_V^\infty$.
Since each $\Phi^a : \Omega^V \to \Omega^V$ has Lipschitz constant 1/M from Theorem 29, it follows that $\Phi^a$
has Lipschitz constant 1/M in the strong Prokhorov (and Monge–Kantorovitch) metric as a map
on measures. It also follows that $\Phi^V$ has a unique attractor from Theorem 29 and Remark 12.
Finally, if Π is the projection $\mathbf{a} \mapsto \omega^{\mathbf{a}} : A_V^\infty \to \Omega^V$ in (41), it is immediate that $\Pi \circ R^a = \Phi^a \circ \Pi$
and hence the attractor of $\Phi^V$ is $\Pi(P_V^\infty) = \rho_V^*$ by Definition 35.
The projection of $\rho_V^*$ in any coordinate direction is $\rho_V$ from Definition 35. □
Remark 39 (Connection with other types of random fractals). The probability distribution ρV
on ΩV is obtained by projection from the probability distribution $P_V^\infty$ on $A_V^\infty$, which is constructed in a simple iid manner. However, because of the combinatorial nature of the many-to-one
map
$$\boldsymbol{a} \mapsto \omega^{\boldsymbol{a}} \mapsto \omega_1^{\boldsymbol{a}} : A_V^\infty \to \Omega_V^* \to \Omega_V$$
in (41), inducing $P_V^\infty \to \rho_V^* \to \rho_V$,
the distribution ρV is very difficult to analyse in terms of tree codes. In particular, under the
distribution ρV on ΩV and hence on Ω, the set of random IFSs ω ↦ F ω(σ ) for σ ∈ T has a
complicated long range dependence structure. (See the comments following Definition 13 for
the notation F ω(σ ) .)
For each V -variable tree code there are at most V isomorphism classes of subtree codes at
each level, but the isomorphism classes are level dependent. Moreover, each set of realisations of
a V -variable random fractal, as well as its associated probability distribution, is the projection of
the fractal attractor of a single deterministic IFS operating on V -tuples of sets or measures. See
Theorems 43 and 45.
For these reasons ρV , KV and MV are very different from other notions of a random fractal
distribution in the literature. See also Remark 48.
6. Convergence and existence results for superIFSs
We continue with the assumption that F = {X, F λ , λ ∈ Λ, P } is a family of IFSs as in (1)
and (2).
The set KV of V -variable fractal sets and the set MV of V -variable fractal measures from
Definition 18, together with their natural probability distributions KV and MV from Definition 36,
are obtained as the attractors of the IFSs $F_V^{C}$, $F_V^{M_c}$ or $F_V^{M_1}$ under suitable conditions, see Theorems 43 and 45. The Markov chains corresponding to these IFSs provide MCMC algorithms,
such as the “chaos game,” for generating samples of V -variable fractal sets and V -variable fractal measures whose empirical distributions converge to the stationary distributions KV and MV
respectively.
Definition 40. The metrics dH , dP and dMK are defined on C(X)V , Mc (X)V and M1 (X)V by
$$
\begin{aligned}
d_H\bigl((K_1,\dots,K_V),(K_1',\dots,K_V')\bigr) &= \max_v d_H(K_v, K_v'),\\
d_P\bigl((\mu_1,\dots,\mu_V),(\mu_1',\dots,\mu_V')\bigr) &= \max_v d_P(\mu_v, \mu_v'),\\
d_{MK}\bigl((\mu_1,\dots,\mu_V),(\mu_1',\dots,\mu_V')\bigr) &= V^{-1}\sum_v d_{MK}(\mu_v, \mu_v'),
\end{aligned}
\tag{46}
$$
where the metrics on the right are as in Section 2.
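The product metrics in (46) can be illustrated on finite data. The following is a minimal sketch, assuming finite point sets in R^n for the Hausdorff case and uniform empirical measures on R with equally many atoms for the Monge–Kantorovitch case; the helper names are hypothetical and not taken from the paper.

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite, non-empty point sets in R^n."""
    def one_sided(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(one_sided(A, B), one_sided(B, A))

def mk_line(xs, ys):
    """Monge-Kantorovitch distance between two uniform empirical measures
    on R with the same number of atoms (sorted matching is optimal in 1D)."""
    assert len(xs) == len(ys)
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

def d_H_tuple(Ks, Ls):
    """d_H on V-tuples of sets: maximum over the components, as in (46)."""
    return max(hausdorff(K, L) for K, L in zip(Ks, Ls))

def d_MK_tuple(mus, nus):
    """d_MK on V-tuples of measures: average over the components, as in (46)."""
    return sum(mk_line(m, n) for m, n in zip(mus, nus)) / len(mus)

# Illustrative data with V = 2 (an assumption made only for this example).
Ks = ([(0, 0), (1, 0)], [(0, 0)])
Ls = ([(0, 0), (1, 1)], [(0, 2)])
print(d_H_tuple(Ks, Ls))
print(d_MK_tuple(([0, 1], [0, 0]), ([0, 3], [1, 1])))
```

Note the asymmetry in (46): the set and Prokhorov metrics take a maximum over components, while the Monge–Kantorovitch metric takes an average.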
The metrics dH and dMK are complete and separable, while the metric dP is complete but
usually not separable. See Definitions 1, 3 and 4, the comments which follow them, Proposition 5
and Remark 6. The metric dMK is usually not locally compact, see Remark 11.
The following IFSs (48) are analogues of the tree IFS Φ V := (Ω V , Φ a , a ∈ AV ) in (39).
Definition 41 (SuperIFS). For a ∈ AV as in Definition 27 let
F a : C(X)V → C(X)V ,
F a : Mc (X)V → Mc (X)V ,
F a : M1 (X)V → M1 (X)V
be given by
$$
\begin{aligned}
F^a(K_1,\dots,K_V) &= \bigl(F^{I^a(1)}(K_{J^a(1,1)},\dots,K_{J^a(1,M)}),\ \dots,\ F^{I^a(V)}(K_{J^a(V,1)},\dots,K_{J^a(V,M)})\bigr),\\
F^a(\mu_1,\dots,\mu_V) &= \bigl(F^{I^a(1)}(\mu_{J^a(1,1)},\dots,\mu_{J^a(1,M)}),\ \dots,\ F^{I^a(V)}(\mu_{J^a(V,1)},\dots,\mu_{J^a(V,M)})\bigr),
\end{aligned}
\tag{47}
$$
where the action of $F^{I^a(v)}$ is defined in (35).
Let
$$
\begin{aligned}
F_V^{C} &= \bigl(C(X)^V, F^a, a \in A_V, P_V\bigr),\\
F_V^{M_c} &= \bigl(M_c(X)^V, F^a, a \in A_V, P_V\bigr),\\
F_V^{M_1} &= \bigl(M_1(X)^V, F^a, a \in A_V, P_V\bigr)
\end{aligned}
\tag{48}
$$
be the corresponding IFSs, with PV from Definition 35. These IFSs are called superIFSs.
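The set-valued action in (47) can be sketched on finite point sets as follows. This is an illustration only: the two-element family on R, the 0-based buffer indices and all function names are assumptions, and the sampler takes the I(v) to be distributed according to P and the J(v, m) to be uniform and independent, as the proof of Theorem 45 describes.

```python
import random

def apply_ifs(maps, sets):
    """Set-valued IFS action F(K_1, ..., K_M) = union over m of f_m(K_m)."""
    return frozenset(f(x) for f, K in zip(maps, sets) for x in K)

def super_step(family, a, buffers):
    """Apply F^a to a V-tuple of finite sets.

    family  : dict mapping an index lambda to a list of M maps
    a       : list of V rows (I, J_1, ..., J_M); the J_m are 0-based buffer indices
    buffers : list of V finite sets
    """
    return [apply_ifs(family[I], [buffers[j] for j in Js]) for (I, *Js) in a]

def sample_a(probs, V, M, rng):
    """Draw a from P_V: I(v) has distribution P, each J(v, m) is uniform
    on the V buffers, and all entries are independent."""
    labels, weights = zip(*probs.items())
    return [(rng.choices(labels, weights)[0],
             *(rng.randrange(V) for _ in range(M)))
            for _ in range(V)]

# Hypothetical family of two IFSs on R with M = 2 maps each.
family = {
    "A": [lambda x: x / 2, lambda x: x / 2 + 0.5],
    "B": [lambda x: x / 4, lambda x: x / 4 + 0.75],
}
buffers = [frozenset({0.0}), frozenset({1.0})]
a = [("A", 1, 0), ("B", 0, 1)]     # V = 2 rows
print(super_step(family, a, buffers))
print(sample_a({"A": 0.5, "B": 0.5}, V=2, M=2, rng=random.Random(0)))
```

Each row of a names one IFS and the M input buffers it reads, so the v-th output set depends only on row v.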
Two types of conditions will be used on families of IFSs. The first was introduced in (28).
Definition 42 (Contractivity conditions for a family of IFSs). The family F = {X, F λ , λ ∈ Λ, P }
of IFSs is uniformly contractive and uniformly bounded if for some 0 ≤ r < 1,
$$\sup_\lambda \max_m d\bigl(f_m^\lambda(x), f_m^\lambda(y)\bigr) \le r\,d(x, y) \quad\text{and}\quad \sup_\lambda \max_m d\bigl(f_m^\lambda(a), a\bigr) < \infty \tag{49}$$
for all x, y ∈ X and some a ∈ X. (The probability distribution P is not used in (49). The second
condition is immediate if Λ is finite.)
The family F is pointwise average contractive and average bounded if for some 0 ≤ r < 1
$$E_\lambda E_m\, d\bigl(f_m^\lambda(x), f_m^\lambda(y)\bigr) \le r\,d(x, y) \quad\text{and}\quad E_\lambda E_m\, d\bigl(f_m^\lambda(a), a\bigr) < \infty \tag{50}$$
for all x, y ∈ X and some a ∈ X.
The following theorem includes and strengthens Theorems 15–24 from [6]. The space X may
be noncompact and the strong Prokhorov metric dP is used rather than the Monge–Kantorovitch
metric dMK .
Theorem 43 (Uniformly contractive conditions). Let F = {X, F λ , λ ∈ Λ, P } be a finite family
of IFSs on a complete separable metric space (X, d) satisfying (49).
Then the superIFSs $F_V^{C}$ and $F_V^{M_c}$ satisfy the uniform contractive condition $\operatorname{Lip} F^a \le r$. Since
$(C(X)^V, d_H)$ is complete and separable, and $(M_c(X)^V, d_P)$ is complete but not necessarily
separable, the corresponding conclusions of Theorem 8 and Remark 12 are valid.
In particular
(1) $F_V^{C}$ and $F_V^{M_c}$ each have unique compact set attractors and compactly supported separable
measure attractors. The attractors are $\mathcal{K}_V^*$ and $\mathfrak{K}_V^*$, and $\mathcal{M}_V^*$ and $\mathfrak{M}_V^*$, respectively. Their
projections in any coordinate direction are $\mathcal{K}_V$, $\mathfrak{K}_V$, $\mathcal{M}_V$ and $\mathfrak{M}_V$, respectively.
(2) The Markov chains generated by the superIFSs converge at an exponential rate.
(3) Suppose $(K_1^0, \dots, K_V^0) \in C(X)^V$. If $\boldsymbol{a} = a_0 a_1 \cdots \in A_V^\infty$ then for some $(K_1, \dots, K_V) \in \mathcal{K}_V^*$
which is independent of $(K_1^0, \dots, K_V^0)$,
$$F^{a_0} \circ \cdots \circ F^{a_k}\bigl(K_1^0, \dots, K_V^0\bigr) \to (K_1, \dots, K_V)$$
in $(C(X)^V, d_H)$ as $k \to \infty$. Moreover
$$\bigl\{F^{a_0} \circ \cdots \circ F^{a_k}\bigl(K_1^0, \dots, K_V^0\bigr) : a_0, \dots, a_k \in A_V\bigr\} \to \mathcal{K}_V^*$$
in $(C(C(X)^V, d_H), d_H)$. Convergence is exponential in both cases.
Analogous results apply for $\mathcal{M}_V^*$, $\mathfrak{K}_V^*$ and $\mathfrak{M}_V^*$, with $d_P$ or $d_H$ as appropriate.
(4) Suppose $\boldsymbol{B}^0 = (B_1^0, \dots, B_V^0) \in C(X)^V$ and $\boldsymbol{a} = a_0 a_1 \cdots \in A_V^\infty$ and let $\boldsymbol{B}^k(\boldsymbol{a}) = \boldsymbol{B}^k = F^{a_k}(\boldsymbol{B}^{k-1})$ if $k \ge 1$. Let $B^k(\boldsymbol{a})$ be the first component of $\boldsymbol{B}^k(\boldsymbol{a})$. Then for a.e. $\boldsymbol{a}$ and every $\boldsymbol{B}^0$,
$$\frac{1}{k}\sum_{n=0}^{k-1} \delta_{B^n(\boldsymbol{a})} \to \mathfrak{K}_V \tag{51}$$
weakly in the space of probability distributions on $(C(X), d_H)$.
For starting measures $(\mu_1^0, \dots, \mu_V^0) \in M_c(X)^V$, there are analogous results modified as in
Remark 12 to account for the fact that $(M_c(X), d_P)$ is not separable.
There are similar results for V -tuples of sets or measures.
Proof. The assertion $\operatorname{Lip} F^a \le r$ follows from (46) by a straightforward argument using Definitions 1 and 4, Eq. (6) and the comment following it, and Eqs. (34), (46) and (47). So the
analogue of the uniform contractive condition (19) in Theorem 8 is satisfied, while the uniform
boundedness condition is immediate since F is finite. From Theorem 8(d) and Remark 12, $F_V^{C}$
and $F_V^{M_c}$ each has a unique set attractor which is a subset of $C(X)^V$ and $M_c(X)^V$ respectively,
and a measure attractor which is a probability distribution on the respective set attractor.
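It remains to identify the attractors with the sets and distributions in Definitions 18, 24, 35
and 36. Let $\widehat{\Pi}$ denote one of the maps
$$\omega \mapsto K^\omega, \quad \omega \mapsto \mu^\omega, \quad (\omega_1, \dots, \omega_V) \mapsto \bigl(K^{\omega_1}, \dots, K^{\omega_V}\bigr), \quad (\omega_1, \dots, \omega_V) \mapsto \bigl(\mu^{\omega_1}, \dots, \mu^{\omega_V}\bigr), \tag{52}$$
depending on the context. In the last two cases it follows from (38) and (47) that $\widehat{\Pi} \circ \Phi^a = F^a \circ \widehat{\Pi}$. Also denote by $\widehat{\Pi}$ the extension of $\widehat{\Pi}$ to a map on sets, on V -tuples of sets, on measures
or on V -tuples of measures, respectively. It follows from Theorem 8(d) together with Definitions 18, 24 and 35 that
$$
\begin{aligned}
\mathcal{K}_V &= \widehat{\Pi}(\Omega_V), \quad & \mathcal{K}_V^* &= \widehat{\Pi}\bigl(\Omega_V^*\bigr), \quad & \mathfrak{K}_V &= \widehat{\Pi}(\rho_V), \quad & \mathfrak{K}_V^* &= \widehat{\Pi}\bigl(\rho_V^*\bigr),\\
\mathcal{M}_V &= \widehat{\Pi}(\Omega_V), \quad & \mathcal{M}_V^* &= \widehat{\Pi}\bigl(\Omega_V^*\bigr), \quad & \mathfrak{M}_V &= \widehat{\Pi}(\rho_V), \quad & \mathfrak{M}_V^* &= \widehat{\Pi}\bigl(\rho_V^*\bigr).
\end{aligned}
\tag{53}
$$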
The rest of (1) follows from Theorems 29 and 38.
The remaining parts of the theorem follow from Theorem 8 and Remark 12. □
Remark 44 (Why use the dP metric?). For computing approximations to the set of V -variable fractals and its associated probability distribution which correspond to F , the main
part of the theorem is (4) with either sets or measures. The advantage of (Mc (X), dP ) over
(Mc (X), dMK ) is that for use in the analogue of (27) the space BC(Mc (X), dP ) is much
larger than BC(Mc (X), dMK ). For example, if φ(µ) = ψ(dP (µ, µ1 )), where µ1 ∈ Mc (X)
and ψ is a continuous cut-off approximation to the characteristic function of [0, ε] ⊂ R, then
φ ∈ BC(Mc (X), dP ) but is not continuous or even Borel over (Mc (X), dMK ).
Theorem 45 (Average contractive conditions). Let F = {X, F λ , λ ∈ Λ, P } be a possibly infinite
family of IFSs on a complete separable metric space (X, d) satisfying (50).
Then the superIFS $F_V^{M_1}$ satisfies the pointwise average contractive and average boundedness
conditions
$$E_a\, d_{MK}\bigl(F^a(\mu), F^a(\mu')\bigr) \le r\, d_{MK}(\mu, \mu'), \qquad E_a\, d_{MK}\bigl(F^a(\mu^0), \mu^0\bigr) < \infty \tag{54}$$
for all $\mu, \mu' \in M_1(X)^V$ and some $\mu^0 \in M_1(X)^V$. Since $(M_1(X)^V, d_{MK})$ is complete and separable, the corresponding conclusions of Theorem 8 are valid.
In particular
(1) $F_V^{M_1}$ has a unique measure attractor and its projection in any coordinate direction is the
same. The attractor and the projection are denoted by $\mathfrak{M}_V^*$ and $\mathfrak{M}_V$, and extend the corresponding distributions in Theorem 43.
(2) For a.e. $\boldsymbol{a} = a_0 a_1 \cdots \in A_V^\infty$, if $(\mu_1^0, \dots, \mu_V^0) \in M_1(X)^V$ then $F^{a_0} \circ \cdots \circ F^{a_k}(\mu_1^0, \dots, \mu_V^0)$
converges at an exponential rate. The limit random V -tuple of measures has probability
distribution $\mathfrak{M}_V^*$.
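(3) If $\boldsymbol{a} = a_0 a_1 \cdots \in A_V^\infty$ and $\boldsymbol{\mu}^0 = (\mu_1^0, \dots, \mu_V^0) \in M_1(X)^V$ let $\boldsymbol{\mu}^k(\boldsymbol{a}) = F^{a_k}(\boldsymbol{\mu}^{k-1})$ for $k \ge 1$.
Let $\mu^k(\boldsymbol{a})$ be the first component of $\boldsymbol{\mu}^k(\boldsymbol{a})$. Then for a.e. $\boldsymbol{a}$ and every $\boldsymbol{\mu}^0$,
$$\frac{1}{k}\sum_{n=0}^{k-1} \delta_{\mu^n(\boldsymbol{a})} \to \mathfrak{M}_V \tag{55}$$
weakly in the space of probability distributions on $(M_1(X), d_{MK})$.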
Proof. To establish average boundedness in (54) let µ0 = (δb , . . . , δb ) ∈ M1 (X)V for some
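b ∈ X. Then
$$
\begin{aligned}
&E_a\, d_{MK}\bigl(F^a(\delta_b, \dots, \delta_b), (\delta_b, \dots, \delta_b)\bigr)\\
&\quad= E_a \frac{1}{V}\sum_v d_{MK}\Bigl(\sum_m w_m^{I^a(v)}\, \delta_{f_m^{I^a(v)}(b)},\ \delta_b\Bigr) && \text{from (46), (47), (1) and (35)}\\
&\quad\le E_a \frac{1}{V}\sum_v \sum_m w_m^{I^a(v)}\, d_{MK}\bigl(\delta_{f_m^{I^a(v)}(b)},\ \delta_b\bigr) && \text{by basic properties of } d_{MK}\\
&\quad= E_a \frac{1}{V}\sum_v \sum_m w_m^{I^a(v)}\, d\bigl(f_m^{I^a(v)}(b), b\bigr) && \text{by basic properties of } d_{MK}\\
&\quad= E_\lambda \frac{1}{V}\sum_v \sum_m w_m^{\lambda}\, d\bigl(f_m^{\lambda}(b), b\bigr) && \text{since } \operatorname{dist} I^a(v) = P = \operatorname{dist} \lambda \text{ by Definition 35}\\
&\quad= E_\lambda E_m\, d\bigl(f_m^\lambda(b), b\bigr) < \infty.
\end{aligned}
$$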
To establish average contractivity in (54) let (µ1 , . . . , µV ), (µ'1 , . . . , µ'V ) ∈ M1 (X)V . Then
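$$
\begin{aligned}
&E_a\, d_{MK}\bigl(F^a(\mu_1, \dots, \mu_V), F^a(\mu_1', \dots, \mu_V')\bigr)\\
&\quad\le E_a \frac{1}{V}\sum_v d_{MK}\Bigl(\sum_m w_m^{I^a(v)} f_m^{I^a(v)}\bigl(\mu_{J^a(v,m)}\bigr),\ \sum_m w_m^{I^a(v)} f_m^{I^a(v)}\bigl(\mu'_{J^a(v,m)}\bigr)\Bigr)\\
&\quad\le E_a \frac{1}{V}\sum_v \sum_m w_m^{I^a(v)}\, d_{MK}\Bigl(f_m^{I^a(v)}\bigl(\mu_{J^a(v,m)}\bigr),\ f_m^{I^a(v)}\bigl(\mu'_{J^a(v,m)}\bigr)\Bigr)\\
&\quad= E_\lambda \frac{1}{V}\sum_v \sum_m w_m^{\lambda}\, E_a\, d_{MK}\Bigl(f_m^{\lambda}\bigl(\mu_{J^a(v,m)}\bigr),\ f_m^{\lambda}\bigl(\mu'_{J^a(v,m)}\bigr)\Bigr)\\
&\quad= E_\lambda \frac{1}{V}\sum_v \sum_m w_m^{\lambda}\, E_t\, d_{MK}\bigl(f_m^{\lambda}(\mu_t),\ f_m^{\lambda}(\mu_t')\bigr)\\
&\quad= E_t E_\lambda E_m\, d_{MK}\bigl(f_m^{\lambda}(\mu_t),\ f_m^{\lambda}(\mu_t')\bigr).
\end{aligned}
$$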
The first inequality is from (47), (1) and (35), the second by properties of $d_{MK}$, the first equality
by the independence of $I^a(v)$ and $J^a(v, n)$ in Definition 35 and since $\operatorname{dist} I^a(v) = P = \operatorname{dist} \lambda$, and
the second equality by the uniform distribution of $J^a(v, m)$ for fixed $(v, m)$ where $t$ is distributed
uniformly over $\{1, \dots, V\}$.
Next let Wt , Wt' be random variables on X such that dist Wt = µt , dist Wt' = µ't and
E d(Wt , Wt' ) = dMK (µt , µ't ), where E without a subscript here and later refers to expectations
from the sample space over which the Wt and Wt' are jointly defined. This is possible by [12,
Theorem 11.8.2]. Then
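$$
\begin{aligned}
&E_t E_\lambda E_m\, d_{MK}\bigl(f_m^\lambda(\mu_t), f_m^\lambda(\mu_t')\bigr)\\
&\quad\le E_t E_\lambda E_m E\, d\bigl(f_m^\lambda(W_t), f_m^\lambda(W_t')\bigr) && \text{by the third version of (4)}\\
&\quad\le r\, E_t E\, d\bigl(W_t, W_t'\bigr) && \text{from (50)}\\
&\quad= r\, E_t\, d_{MK}\bigl(\mu_t, \mu_t'\bigr) && \text{by choice of } W_t \text{ and } W_t'\\
&\quad= r\, d_{MK}\bigl((\mu_1, \dots, \mu_V), (\mu_1', \dots, \mu_V')\bigr).
\end{aligned}
$$
This completes the proof of (54). The remaining conclusions now follow by Theorem 8. □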
Remark 46 (Global average contractivity). One might expect that the global average contractive condition $E_\lambda E_m \operatorname{Lip} f_m^\lambda < 1$ on the family F would imply the global average contractive
condition $E_a \operatorname{Lip} F^a < 1$, i.e. would imply that the superIFS $F_V^{M_1}$ is global average contractive.
However, this is not the case.
For example, let X = R, V = 2 and M = 2. Let F contain a single IFS F = (f1 , f2 ; 1/2, 1/2)
where
$$f_1(x) = -\frac{3}{2}x, \qquad f_2(x) = \frac{1+\epsilon}{2} + \frac{1-\epsilon}{2}x.$$
Then $\operatorname{Lip} f_1 = 3/2$, $\operatorname{Lip} f_2 = (1-\epsilon)/2$ and $E_m \operatorname{Lip} f_m = 1 - \epsilon/4$. So F is global average contractive.
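Note $f_1(0) = 0$ and $f_2(1) = 1$. Let $\mu = (\delta_0, \delta_0)$, $\mu' = (\delta_1, \delta_1)$ and note $d_{MK}(\mu, \mu') = 1$. Then
for any $a \in A_V$ as in (40),
$$
F^a(\mu) = \Bigl(\tfrac12\delta_0 + \tfrac12\delta_{\frac{1+\epsilon}{2}},\ \tfrac12\delta_0 + \tfrac12\delta_{\frac{1+\epsilon}{2}}\Bigr), \qquad
F^a(\mu') = \Bigl(\tfrac12\delta_{-\frac32} + \tfrac12\delta_1,\ \tfrac12\delta_{-\frac32} + \tfrac12\delta_1\Bigr).
$$
From the first form of (4) with $f(x) = |x|$ and from (46), $d_{MK}(F^a(\mu), F^a(\mu')) \ge 1 - \epsilon/4$ and
so $\operatorname{Lip} F^a \ge 1 - \epsilon/4$ for every $a$.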
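Next let $a^* = \begin{bmatrix} F & 1 & 2 \\ F & 1 & 2 \end{bmatrix}$ and choose $\mu = (\delta_0, \delta_0)$, $\mu' = (\delta_1, \delta_0)$, so $\mu$ and $\mu'$ differ in the box
on which $f_1$ always acts and agree in the box on which $f_2$ always acts. Note $d_{MK}(\mu, \mu') = 1/2$.
Then
$$
F^{a^*}(\mu) = \Bigl(\tfrac12\delta_0 + \tfrac12\delta_{\frac{1+\epsilon}{2}},\ \tfrac12\delta_0 + \tfrac12\delta_{\frac{1+\epsilon}{2}}\Bigr), \qquad
F^{a^*}(\mu') = \Bigl(\tfrac12\delta_{-\frac32} + \tfrac12\delta_{\frac{1+\epsilon}{2}},\ \tfrac12\delta_{-\frac32} + \tfrac12\delta_{\frac{1+\epsilon}{2}}\Bigr).
$$
Again using the first form of (4) with $f(x) = |x|$, $d_{MK}(F^{a^*}(\mu), F^{a^*}(\mu')) \ge 3/4$, so
$\operatorname{Lip} F^{a^*} \ge 3/2$.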
Since there are 16 possible maps $a \in A_V$, each selected with probability 1/16, it follows that
$$E_a \operatorname{Lip} F^a \ge \frac{15}{16}\Bigl(1 - \frac{\epsilon}{4}\Bigr) + \frac{1}{16}\cdot\frac{3}{2} > 1 \quad\text{if } \epsilon < \frac{2}{15}.$$
So for such $0 < \epsilon < 2/15$ the IFS $F_V^{M_1}$ is not global average contractive. But since
$E_m \operatorname{Lip} f_m = 1 - \epsilon/4$ it follows from Theorem 45 that $F_V^{M_1}$ is pointwise average contractive,
and so Theorem 8 can be applied.
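The arithmetic in Remark 46 can be checked with exact rational numbers. The sketch below assumes the illustrative value ε = 1/10, represents finitely supported measures on R as dictionaries, and computes the Monge–Kantorovitch distance on R as the integral of the absolute difference of distribution functions. The two test pairs are those of the remark, so the computed ratios are only lower bounds for the Lipschitz constants; all names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

EPS = Fr(1, 10)                      # illustrative value with 0 < EPS < 2/15

def f1(x):
    return -Fr(3, 2) * x

def f2(x):
    return (1 + EPS) / 2 + (1 - EPS) / 2 * x

def push_mix(mu1, mu2):
    """(1/2) f1(mu1) + (1/2) f2(mu2) for measures given as {point: mass}."""
    out = {}
    for f, mu in ((f1, mu1), (f2, mu2)):
        for x, p in mu.items():
            out[f(x)] = out.get(f(x), Fr(0)) + p / 2
    return out

def w1(mu, nu):
    """Monge-Kantorovitch distance on R: integral of |CDF_mu - CDF_nu|."""
    pts = sorted(set(mu) | set(nu))
    total, gap = Fr(0), Fr(0)
    for left, right in zip(pts, pts[1:]):
        gap += mu.get(left, 0) - nu.get(left, 0)
        total += abs(gap) * (right - left)
    return total

def d_mk(mus, nus):
    """Average of the component distances, as in (46)."""
    return sum(w1(m, n) for m, n in zip(mus, nus)) / len(mus)

def F(a, mus):
    """a is a pair of rows (j1, j2): 0-based buffers fed to f1 and f2."""
    return [push_mix(mus[j1], mus[j2]) for j1, j2 in a]

d0, d1 = {Fr(0): Fr(1)}, {Fr(1): Fr(1)}
pairs = [((d0, d0), (d1, d1)), ((d0, d0), (d1, d0))]
rows = list(product(range(2), repeat=2))
all_a = list(product(rows, repeat=2))            # the 16 maps for V = M = 2

def lip_lower_bound(a):
    return max(d_mk(F(a, mu), F(a, nu)) / d_mk(mu, nu) for mu, nu in pairs)

a_star = ((0, 1), (0, 1))
mean_bound = sum(lip_lower_bound(a) for a in all_a) / len(all_a)
print(lip_lower_bound(a_star), mean_bound, mean_bound > 1)
```

With these test pairs the average of the lower bounds already exceeds 1, in line with the estimate in the remark.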
Example 47 (Random curves in the plane). The following shows why it is natural to consider
families of IFSs which are both infinite and not uniformly contractive. Such examples can be
modified to model Brownian motion and other stochastic processes, see [18, Section 5.2] and
[23, pp. 120–122].
Let F = {R2 , F λ , λ ∈ R2 , N(0, σ 2 I )} where F λ = {f1λ , f2λ ; 1/2, 1/2} and N(0, σ 2 I ) is the
symmetric normal distribution in R2 with variance σ 2 . The functions f1λ and f2λ are uniquely
specified by the requirements that they be similitudes with positive determinant and
f1λ (−1, 0) = (−1, 0),
f1λ (1, 0) = λ,
f2λ (−1, 0) = λ,
f2λ (1, 0) = (1, 0).
A calculation shows σ = 1.42 implies $E_\lambda E_m \operatorname{Lip} f_m^\lambda \approx 0.9969$ and so average contractivity holds
if σ ≤ 1.42. If |λ| is sufficiently large then neither $f_1^\lambda$ nor $f_2^\lambda$ is contractive.
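The value 0.9969 can be checked by simulation. The sketch below uses the fact that a similitude sending a segment of length 2 to a segment of length ℓ has Lipschitz constant ℓ/2, so that Lip f1λ = |λ − (−1, 0)|/2 and Lip f2λ = |(1, 0) − λ|/2. The sample size, seed and function name are arbitrary assumptions, and the result is a Monte Carlo estimate rather than an exact value.

```python
import math
import random

def mean_lipschitz(sigma, n=200_000, seed=0):
    """Monte Carlo estimate of E_lambda E_m Lip f_m^lambda for lambda ~ N(0, sigma^2 I)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        lx, ly = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        lip1 = math.hypot(lx + 1.0, ly) / 2.0   # f1 maps (-1,0),(1,0) to (-1,0),lambda
        lip2 = math.hypot(lx - 1.0, ly) / 2.0   # f2 maps (-1,0),(1,0) to lambda,(1,0)
        total += 0.5 * (lip1 + lip2)            # the two maps have probability 1/2 each
    return total / n

print(mean_lipschitz(1.42))
```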
The IFS F λ can also be interpreted as a map from the space C([0, 1], R2 ), of continuous paths
from [0, 1] to R2 , into itself as follows:
$$
\bigl(F^\lambda(\phi)\bigr)(t) =
\begin{cases}
f_1^\lambda(\phi(2t)), & 0 \le t \le \tfrac12,\\
f_2^\lambda(\phi(2t - 1)), & \tfrac12 \le t \le 1.
\end{cases}
$$
Then one can define a superIFS acting on such functions in a manner analogous to that for the
superIFS acting on sets or measures. Under the average contractive condition one obtains L1
convergence to a class of V -variable fractal paths, and in particular V -variable fractal curves,
from (−1, 0) to (1, 0). We omit the details.
Remark 48 (Graph directed fractals). Fractals generated by a graph directed system [GDS] or
more generally by a graph directed Markov system [GDMS], or by random versions of these,
have been considered by many authors. See [29,32] for the definitions and references. We comment here on the connection between these fractals and V -variable fractals.
In particular, for each experimental run of its generation process, a random GDS or GDMS
generates a single realisation of the associated random fractal. On the other hand, for each run,
a superIFS generates a family of realisations whose empirical distributions converge a.s. to the
probability distribution given by the associated V -variable random fractal.
To make a more careful comparison, allow the number of functions M in each IFS F λ as
in (1) to depend on λ. A GDS together with its associated contraction maps can be interpreted
as a map from V -tuples of sets to V -tuples of sets. The map can be coded up by a matrix a as
in (40), where M there is now the number of edges in the GDS. If V is the number of vertices
in a GDS, the V -tuple (K 1 , . . . , K V ) of fractal sets generated by the GDS is a very particular
V -variable V -tuple. If the address a = a0 a1 . . . ak . . . for (K 1 , . . . , K V ) is as in (41), then ak = a
for all k. Unlike the situation for V -variable fractals as discussed in Remark 19, there are at most
V distinct subtrees which can be obtained from the tree codes ωv for K v regardless of the level
of the initial node of the subtree.
More generally, if (K 1 , . . . , K V ) is generated by a GDMS then for each k, ak+1 is determined just by ak and by the incidence matrix for the GDMS. Each subtree ωv⌋σ is completely
determined by the value ωv (σ ) ∈ Λ at its base node σ and by the “branch” σk in σ = σ1 . . . σk .
Realisations of random fractals generated by a random GDS are almost surely not V -variable,
and are more akin to standard random fractals as in Definition 17. One comes closer to V -variable
fractal sets by introducing a notion of a homogeneous random GDS fractal set analogous to that
of a homogeneous random fractal as in Remark 55. But then one does not obtain a class of V -variable fractals together with its associated probability distribution unless one makes the same
definitions as in Section 5. This would be quite unnatural in the setting of GDS fractals, for
example it would require one edge from any vertex to any vertex.
7. Approximation results as V → ∞
Theorems 49 and 51 enable one to obtain empirical samples of standard random fractals up to
any prescribed degree of approximation by using sufficiently large V in Theorem 43(4). This is
useful since even single realisations of random fractals are computationally expensive to generate
by standard methods. Note that although the matrices used to compute samples of V -variable
fractals are typically of order V × V , they are sparse with bandwidth M.
The next theorem improves the exponent in [6, Theorem 12] and removes the dependence
on M. The difference comes from using the third rather than the first version of (4) in the proof.
Theorem 49. If dMK is the Monge–Kantorovitch metric then $d_{MK}(\rho_V, \rho_\infty) \le 1.4\,V^{-1/3}$.
Proof. We construct random tree codes WV and W∞ with dist WV = ρV and dist W∞ = ρ∞ .
In order to apply the last equality in (4) we want the expected distance between WV and W∞ ,
determined by their joint distribution, to be as small as possible.
Suppose A = A0 A1 A2 . . . is a random address with dist A = PV∞ . Let WV = ω1A (σ ) be the
corresponding random tree code, using the notation of (43) and (42). It follows from Definition 35
that dist WV = ρV .
Let the random integer K = K(A) be the greatest integer such that, for 0 ≤ j ≤ K, if
|σ | = |σ ' | = j and σ ≠ σ ' then $\hat{J}_1^A(\sigma) \neq \hat{J}_1^A(\sigma')$ in (42). Thus with v = 1 as in Example 34
the nodes of T are placed in distinct buffers up to and including level K.
Let W∞ be any random tree code such that if |σ | ≤ K then W∞ (σ ) = WV (σ ), while if
|σ | > K then dist W∞ (σ ) = P and W∞ (σ ) is independent of W∞ (σ ' ) for all σ ' ≠ σ . It follows
from the definition of K that W∞ (σ ) are iid with distribution P for all σ and so dist W∞ = ρ∞ .
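For any $k$ and for $V \ge M^k$,
$$
\begin{aligned}
E\, d(W_V, W_\infty)
&= E\bigl(d(W_V, W_\infty) \mid K \ge k\bigr)\cdot \operatorname{Prob}(K \ge k) + E\bigl(d(W_V, W_\infty) \mid K < k\bigr)\cdot \operatorname{Prob}(K < k)\\
&\le \frac{1}{M^{k+1}} + \frac{1}{M}\operatorname{Prob}(K < k) \qquad \text{by (29) and since } W_V(\emptyset) = W_\infty(\emptyset)\\
&= \frac{1}{M^{k+1}} + \frac{1}{M}\Biggl(1 - \prod_{i=1}^{M-1}\Bigl(1 - \frac{i}{V}\Bigr)\cdot\prod_{i=1}^{M^2-1}\Bigl(1 - \frac{i}{V}\Bigr)\cdot\ \cdots\ \cdot\prod_{i=1}^{M^k-1}\Bigl(1 - \frac{i}{V}\Bigr)\Biggr)\\
&\le \frac{1}{M^{k+1}} + \frac{1}{MV}\Biggl(\sum_{i=1}^{M-1} i + \sum_{i=1}^{M^2-1} i + \cdots + \sum_{i=1}^{M^k-1} i\Biggr)\\
&\le \frac{1}{M^{k+1}} + \frac{1}{2MV}\bigl(M^2 + M^4 + \cdots + M^{2k}\bigr)\\
&\le \frac{1}{M^{k+1}} + \frac{M^{2(k+1)}}{2MV(M^2 - 1)}\\
&\le \frac{1}{M^{k+1}} + \frac{2M^{2k-1}}{3V},
\end{aligned}
$$
noting $\prod_{i=1}^n (1 - a_i) \ge 1 - \sum_{i=1}^n a_i$ for $0 \le a_i \le 1$ to obtain the second inequality and assuming
$M \ge 2$ for the last inequality. The estimate is trivially true if $M = 1$ or $V < M^k$.
Choose $x$ so $M^x = \bigl(\frac{3V}{4}\bigr)^{1/3}$, this being the value of $x$ which minimises $\frac{1}{M^{x+1}} + \frac{2M^{2x-1}}{3V}$. Choose
$k$ so $k \le x < k + 1$. Hence from (4)
$$
\begin{aligned}
d_{MK}(\rho_V, \rho_\infty) \le E\, d(W_V, W_\infty)
&\le \Bigl(\frac{3V}{4}\Bigr)^{-1/3} + \frac{2}{3MV}\Bigl(\frac{3V}{4}\Bigr)^{2/3}\\
&\le V^{-1/3}\Biggl(\Bigl(\frac{3}{4}\Bigr)^{-1/3} + \frac{1}{3}\Bigl(\frac{3}{4}\Bigr)^{2/3}\Biggr) \le 1.38\,V^{-1/3}.
\end{aligned}
$$
□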
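The estimate in the proof can be evaluated numerically. The sketch below computes the bound 1/M^(k+1) + 2M^(2k−1)/(3V) for each level k with M^k ≤ V, takes the smallest value, and compares it with 1.4 V^(−1/3). It also evaluates the constant (3/4)^(−1/3) + (1/3)(3/4)^(2/3) that appears in the final step. Treating k = 0 as an admissible level and the choice of test values are assumptions of this illustration.

```python
def bound_for_k(M, V, k):
    """Right-hand side of the estimate in the proof for a given level k."""
    return 1 / M ** (k + 1) + 2 * M ** (2 * k - 1) / (3 * V)

def best_bound(M, V):
    """Smallest value of the estimate over the levels k with M**k <= V."""
    k, best = 0, float("inf")
    while M ** k <= V:
        best = min(best, bound_for_k(M, V, k))
        k += 1
    return best

# Constant obtained after substituting M**x = (3V/4)**(1/3) and using M >= 2.
constant = (3 / 4) ** (-1 / 3) + (1 / 3) * (3 / 4) ** (2 / 3)
print(round(constant, 4))

for M in (2, 3):
    for V in (10, 100, 10_000, 1_000_000):
        print(M, V, best_bound(M, V), 1.4 * V ** (-1 / 3))
```

The constant evaluates to roughly 1.376, which is below the 1.4 stated in Theorem 49.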
Remark 50 (No similar estimate for dP is possible in Theorem 49). The support of ρV converges
to the support of ρ∞ in the Hausdorff metric by Theorem 22. However, ρV ↛ ρ∞ in the dP
metric as V → ∞.
To see this suppose M ≥ 2, fix j, k ∈ {1, . . . , M} with j ≠ k and let E = {ω ∈ Ω:
ω(j ) = ω(k)}, where j and k are interpreted as sequences of length one in T . According to the
probability distribution ρ∞ , ω(j ) and ω(k) are independent if j ≠ k. For the probability distribution ρV∗ there is a positive probability 1/V that J (1, j ) = J (1, k), in which case ω1 (j ) and ω1 (k)
must be equal from Proposition 33, while if J (1, j ) ≠ J (1, k) then ω1 (j ) and ω1 (k) are independent. Identifying the probability distribution ρV on Ω with the projection of ρV∗ on Ω V in
the first coordinate direction it follows ρ∞ (E) < ρV (E).
However, d(ω' , E) ≥ 1/M if ω' ∉ E since in this case for ω ∈ E either ω' (j ) ≠ ω(j ) or
ω' (k) ≠ ω(k). Hence for ε < 1/M, E ε = E and so ρ∞ (E ε ) = ρ∞ (E) < ρV (E). It follows that
dP (ρV , ρ∞ ) ≥ 1/M for all V if M ≥ 2.
Theorem 51. Under the assumptions of Theorem 43,
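$$
d_H(\mathcal{K}_V, \mathcal{K}_\infty),\ d_H(\mathcal{M}_V, \mathcal{M}_\infty) < \frac{2L}{1-r}\,V^{-\alpha}, \qquad
d_{MK}(\mathfrak{K}_V, \mathfrak{K}_\infty),\ d_{MK}(\mathfrak{M}_V, \mathfrak{M}_\infty) < \frac{2.8L}{1-r}\,V^{-\widehat{\alpha}},
\tag{56}
$$
where
$$
L = \sup_\lambda \max_m d\bigl(f_m^\lambda(a), a\bigr), \qquad
\alpha = \frac{\log(1/r)}{\log M}, \qquad
\widehat{\alpha} =
\begin{cases}
\alpha/3 & \text{if } \alpha \le 1,\\
1/3 & \text{if } \alpha \ge 1.
\end{cases}
$$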
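Proof. The first two estimates follow from Theorem 22 and Proposition 15. For the third, let WV
and W∞ be the random codes from the proof of Theorem 49. In particular,
$$\rho_V = \operatorname{dist} W_V, \qquad \rho_\infty = \operatorname{dist} W_\infty, \qquad E\, d(W_V, W_\infty) \le 1.4\,V^{-1/3}. \tag{57}$$
Let $\widehat{\Pi}$ be the projection map $\omega \mapsto K^\omega$ given by (30) and (53). Then $\mathfrak{K}_V = \operatorname{dist} \widehat{\Pi} \circ W_V$,
$\mathfrak{K}_\infty = \operatorname{dist} \widehat{\Pi} \circ W_\infty$, and from the last condition in (4)
$$d_{MK}(\mathfrak{K}_V, \mathfrak{K}_\infty) \le E\, d_H\bigl(\widehat{\Pi} \circ W_V, \widehat{\Pi} \circ W_\infty\bigr). \tag{58}$$
If $\alpha \le 1$ then from (33) on taking expectations of both sides, using Hölder’s inequality and
applying (57),
$$E\, d_H\bigl(\widehat{\Pi} \circ W_V, \widehat{\Pi} \circ W_\infty\bigr) \le \frac{2L}{1-r}\, E\, d^\alpha(W_V, W_\infty) \le \frac{2.8L}{1-r}\,V^{-\alpha/3}.$$
If $\alpha \ge 1$ then from the last two lines in the proof of Proposition 15,
$$d_H\bigl(K^\omega, K^{\omega'}\bigr) \le \frac{2L}{1-r}\, d^\alpha(\omega, \omega'),$$
and so, using this and arguing as before,
$$E\, d_H\bigl(\widehat{\Pi} \circ W_V, \widehat{\Pi} \circ W_\infty\bigr) \le \frac{2.8L}{1-r}\,V^{-1/3}.$$
This gives the third estimate. The fourth estimate is proved in an analogous manner. □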
Sharper estimates can be obtained arguing directly as in the proof of Theorem 49. In particular,
the exponent $\widehat{\alpha}$ can be replaced by $\frac{\log(1/r)}{\log(M^2/r)}$.
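For concrete parameters the right-hand sides of (56) are easy to tabulate. The helper below is a sketch only: its name is hypothetical, it assumes 0 < r < 1 and M ≥ 2, and the inputs in the example are arbitrary.

```python
import math

def theorem51_bounds(r, M, L, V):
    """Return (alpha, alpha_hat, Hausdorff bound, Monge-Kantorovitch bound) from (56)."""
    if not (0 < r < 1) or M < 2:
        raise ValueError("expected 0 < r < 1 and M >= 2")
    alpha = math.log(1 / r) / math.log(M)
    alpha_hat = alpha / 3 if alpha <= 1 else 1 / 3
    hausdorff_bound = 2 * L / (1 - r) * V ** (-alpha)
    mk_bound = 2.8 * L / (1 - r) * V ** (-alpha_hat)
    return alpha, alpha_hat, hausdorff_bound, mk_bound

# Example: contraction ratio 1/2, two maps per IFS, L = 1, V = 1000 buffers.
print(theorem51_bounds(r=0.5, M=2, L=1.0, V=1000))
```

In this example α = 1, so the Hausdorff bound decays like 1/V while the bound on the distributions decays only like V^(−1/3).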
8. Example of 2-variable fractals
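Consider the family F = {R2 , U, D, 1/2, 1/2} consisting of two IFSs U = (f1 , f2 ) (up with a
reflection) and D = (g1 , g2 ) (down) acting on R2 , where
$$
\begin{aligned}
f_1(x, y) &= \Bigl(\frac{x}{2} + \frac{3y}{8} - \frac{1}{16},\ \frac{x}{2} - \frac{3y}{8} + \frac{9}{16}\Bigr),\\
f_2(x, y) &= \Bigl(\frac{x}{2} - \frac{3y}{8} + \frac{9}{16},\ -\frac{x}{2} - \frac{3y}{8} + \frac{17}{16}\Bigr),\\
g_1(x, y) &= \Bigl(\frac{x}{2} + \frac{3y}{8} - \frac{1}{16},\ -\frac{x}{2} + \frac{3y}{8} + \frac{7}{16}\Bigr),\\
g_2(x, y) &= \Bigl(\frac{x}{2} - \frac{3y}{8} + \frac{9}{16},\ \frac{x}{2} + \frac{3y}{8} - \frac{1}{16}\Bigr).
\end{aligned}
$$
The corresponding fractal attractors of U and D are shown at the beginning of Fig. 4. The probability of choice of U and D is 1/2 in each case.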
Fig. 4. Sampling 2-variable fractals.
The 2-variable superIFS acting on pairs of compact sets is $F_2^{C} = (C(\mathbb{R}^2)^2, F^a, a \in A_2, P_2)$.
There are 64 maps $a \in A_2$, each a 2 × 3 matrix. The probability distribution $P_2$ assigns probability $\frac{1}{64}$ to each $a \in A_2$. For example, in iteration 4 the matrix is $a = \begin{bmatrix} U & 2 & 1 \\ U & 1 & 2 \end{bmatrix}$. Applying $F^a$ to
the pair of sets $(E_1, E_2)$ from iteration 3 gives
$$F^a(E_1, E_2) = \bigl(U(E_2, E_1),\ U(E_1, E_2)\bigr) = \bigl(f_1(E_2) \cup f_2(E_1),\ f_1(E_1) \cup f_2(E_2)\bigr).$$
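The generation process for this example can be sketched on finite point sets. The code below is an approximation under stated assumptions, not the procedure used to produce Fig. 4: sets are finite point clouds, the starting pair consists of two single points rather than line segments, buffer indices are 0-based, and each set is randomly thinned to at most `cap` points to keep the computation small. The maps are those displayed above.

```python
import random

def f1(p): x, y = p; return (x / 2 + 3 * y / 8 - 1 / 16,  x / 2 - 3 * y / 8 + 9 / 16)
def f2(p): x, y = p; return (x / 2 - 3 * y / 8 + 9 / 16, -x / 2 - 3 * y / 8 + 17 / 16)
def g1(p): x, y = p; return (x / 2 + 3 * y / 8 - 1 / 16, -x / 2 + 3 * y / 8 + 7 / 16)
def g2(p): x, y = p; return (x / 2 - 3 * y / 8 + 9 / 16,  x / 2 + 3 * y / 8 - 1 / 16)

U, D = (f1, f2), (g1, g2)

def apply_ifs(ifs, E_first, E_second):
    """IFS action on a pair of sets: h1(E_first) union h2(E_second)."""
    h1, h2 = ifs
    return {h1(p) for p in E_first} | {h2(p) for p in E_second}

def super_step(a, pair):
    """Apply F^a, where a is a list of two rows (IFS, j1, j2) with 0-based buffers."""
    return tuple(apply_ifs(ifs, pair[j1], pair[j2]) for ifs, j1, j2 in a)

def chaos_game(steps, seed=0, cap=2000):
    """Iterate randomly chosen maps F^a on a pair of finite point sets."""
    rng = random.Random(seed)
    pair = ({(0.0, 0.0)}, {(1.0, 1.0)})        # arbitrary starting pair (assumption)
    for _ in range(steps):
        # Each of the 64 matrices a is equally likely under P_2.
        a = [(rng.choice((U, D)), rng.randrange(2), rng.randrange(2)) for _ in range(2)]
        pair = super_step(a, pair)
        # Thin large sets; this is an approximation made for speed only.
        pair = tuple(set(rng.sample(sorted(E), cap)) if len(E) > cap else E for E in pair)
    return pair

E1, E2 = chaos_game(15)
print(len(E1), len(E2))
```

Plotting the two point clouds after each step gives pictures of the kind shown in Fig. 4, up to the thinning approximation.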
The process in Fig. 4 begins with a pair of line segments. The first 6 iterations and iterations 23–26 are shown. After about 12 iterations the sets are independent of the initial sets up to
screen resolution. After this the pairs of sets can be considered as examples of 2-variable 2-tuples
of fractal sets corresponding to F .
The generation process gives an MCMC algorithm or “chaos game” and acts on the infinite
state space (C(R2 )2 , dH ) of pairs of compact sets with the dH metric. The empirical distribution
along a.e. trajectory from any starting pair of sets converges weakly to the 2-variable superfractal
probability distribution on 2-variable 2-tuples of fractal sets corresponding to F . The empirical
distribution of first (and second) components converges weakly to the corresponding natural
probability distribution on 2-variable fractal sets corresponding to F .
9. Concluding comments
Remark 52 (Extensions). The restriction that each IFS F λ in (1) has the same number M of
functions is for notational simplicity only.
In Definition 35 the independence conditions on the I and J may be relaxed. In some modelling situations it would be natural to have a degree of local dependence between (I (v), J (v))
and (I (w), J (w)) for v “near” w.
The probability distribution ρV is in some sense the most natural probability distribution on
the set of V -variable code trees since it is inherited from the probability distribution PV∞ with
the simplest possible probabilistic structure. We may construct more general distributions on the
set of V -variable code trees by letting PV∞ be non-Bernoulli, e.g. stationary.
Instead of beginning with a family of IFSs in (2) one could begin with a family of graph
directed IFSs and obtain in this manner the corresponding class of V -variable graph directed
fractals.
Remark 53 (Dimensions). Suppose F = {Rn , F λ ; λ ∈ Λ, P } is a family of IFSs satisfying the
strong uniform open set condition and whose maps are similitudes. In a forthcoming paper we
compute the a.s. dimension of the associated family of V -variable random fractals. The idea is
to associate to each a ∈ AV a certain V × V matrix and then use the Furstenberg–Kesten theory
for products of random matrices to compute a type of pressure function.
Remark 54 (Motivation for the construction of V -variable fractals). The original motivation was
to find a chaos game type algorithm for generating collections of fractal sets whose empirical
distributions approximated the probability distribution of standard random fractals.
More precisely, suppose F = {(X, d), F λ , λ ∈ Λ, P } is a family of IFSs as in (2). Let V be
a large positive integer and S be a collection of V compact subsets of X, such that the empirical distribution of S approximates the distribution K∞ of the standard random fractal associated
to F by Definition 17. Suppose S ∗ is a second collection of V compact subsets of X obtained
from F and S as follows. For each v ∈ {1, . . . , V } and independently of other w ∈ {1, . . . , V },
select E1 , . . . , EM from S according to the uniform distribution independently with replacement, and independently
of this select F λ from F according to P . Let the vth set in S ∗ be
$F^\lambda(E_1, \dots, E_M) = \bigcup_{1 \le m \le M} f_m^\lambda(E_m)$. Then one expects the empirical distribution of S ∗ to also
approximate K∞ .
The random operator constructed in this manner for passing from S to S ∗ is essentially the
random operator F a in Definition 41 with a ∈ AV chosen according to PV .
Remark 55 (A hierarchy of fractals). See Fig. 1.
If M = 1 in (1) then each F λ is a trivial IFS (f λ ) containing just one map, and the family F
in (2) can be interpreted as a standard IFS. If moreover V = 1 then the corresponding superIFS
in Definition 41 can be interpreted as a standard IFS operating on (X, d) with set and measure
attractors K and µ, essentially by identifying singleton subsets of X with elements in X. For
M = 1 and V > 1 the superIFS can be identified with an IFS operating on X V with set and
measure attractors KV∗ and µ∗V . Conversely, any standard IFS can be extended to a superIFS
in this manner. The projection of KV∗ in any coordinate direction is K, but KV∗ ≠ K V . The
attractors KV∗ and µ∗V are called correlated fractals. The measure µ∗ provides information on a
certain “correlation” between subsets of K. This provides a new tool for studying the structure
of standard IFS fractals as we show in a forthcoming paper.
The case V = 1 corresponds to homogeneous random fractals and has been studied in
[19,25,36]. The case V → ∞ corresponds to standard random fractals as defined in Definition 17,
see also Section 7. See also [1] for some graphical examples.
For a given class F of IFSs and positive integer V > 1, one obtains a new class of fractals
each with the prescribed degree V of self similarity at every scale. The associated superIFS provides a rapid way of generating a sample from this class of V -variable fractals whose empirical
distribution approximates the natural probability distribution on the class.
Large V provides a method for generating a class of correctly distributed approximations to
standard random fractals. Small V provides a class of fractals with useful modelling properties.
References
[1] Takahiro Asai, Fractal image generation with iterated function set, Technical Report 24, Ricoh, November 1998.
[2] M.F. Barnsley, Superfractals, Cambridge University Press, Cambridge, 2006.
[3] M.F. Barnsley, S. Demko, Iterated function systems and the global construction of fractals, Proc. Roy. Soc. London
Ser. A 399 (1985) 243–275.
[4] Michael F. Barnsley, John H. Elton, A new class of Markov processes for image encoding, Adv. in Appl. Probab. 20
(1988) 14–32.
[5] Michael F. Barnsley, John H. Elton, Douglas P. Hardin, Recurrent iterated function systems, in: Fractal Approximation, Constr. Approx. 5 (1989) 3–31.
[6] Michael Barnsley, John Hutchinson, Örjan Stenflo, A fractal valued random iteration algorithm and fractal hierarchy, Fractals 13 (2005) 111–146.
[7] Patrick Billingsley, Convergence of Probability Measures, Wiley Ser. Probab. Stat., John Wiley & Sons Inc., New
York, 1999.
[8] Leo Breiman, The strong law of large numbers for a class of Markov chains, Ann. Math. Statist. 31 (1960) 801–803.
[9] Persi Diaconis, David Freedman, Iterated random functions, SIAM Rev. 41 (1999) 45–76.
[10] Persi Diaconis, Mehrdad Shahshahani, Products of random matrices and computer image generation, in: Random
Matrices and Their Applications, Brunswick, Maine, 1984, in: Contemp. Math., vol. 50, Amer. Math. Soc., Providence, RI, 1986, pp. 173–182.
[11] Wolfgang Doeblin, Robert Fortet, Sur des chaînes à liaisons complètes, Bull. Soc. Math. France 65 (1937) 132–148
(in French).
[12] R.M. Dudley, Real Analysis and Probability, Cambridge Stud. Adv. Math., vol. 74, Cambridge University Press,
Cambridge, 2002, revised reprint of the 1989 original.
[13] John H. Elton, An ergodic theorem for iterated maps, Ergodic Theory Dynam. Systems 7 (1987) 481–488.
[14] John H. Elton, A multiplicative ergodic theorem for Lipschitz maps, Stochastic Process. Appl. 34 (1990) 39–47.
[15] Kenneth Falconer, Random fractals, Math. Proc. Cambridge Philos. Soc. 100 (1986) 559–582.
[16] Herbert Federer, Geometric Measure Theory, Grundlehren Math. Wiss., Band 153, Springer-Verlag New York Inc., New York, 1969.
[17] Siegfried Graf, Statistically self-similar fractals, Probab. Theory Related Fields 74 (1987) 357–392.
[18] Siegfried Graf, Random fractals, in: School on Measure Theory and Real Analysis, Grado, 1991, Rend. Istit. Mat. Univ. Trieste 23 (1991) 81–144 (1993).
[19] B.M. Hambly, Brownian motion on a homogeneous random fractal, Probab. Theory Related Fields 94 (1992) 1–38.
[20] John E. Hutchinson, Fractals and self-similarity, Indiana Univ. Math. J. 30 (1981) 713–747.
[21] John E. Hutchinson, Ludger Rüschendorf, Random fractal measures via the contraction method, Indiana Univ. Math. J. 47 (1998) 471–487.
[22] John E. Hutchinson, Ludger Rüschendorf, Random fractals and probability metrics, Adv. in Appl. Probab. 32 (2000) 925–947.
[23] John E. Hutchinson, Ludger Rüschendorf, Selfsimilar fractals and selfsimilar random fractals, in: Fractal Geometry and Stochastics, II, Greifswald/Koserow, 1998, in: Progr. Probab., vol. 46, Birkhäuser, Basel, 2000, pp. 109–123.
[24] Richard Isaac, Markov processes and unique stationary probability measures, Pacific J. Math. 12 (1962) 273–286.
[25] Yuri Kifer, Fractals via random iterated function systems and random geometric constructions, in: Fractal Geometry and Stochastics, Finsterbergen, 1994, in: Progr. Probab., vol. 37, Birkhäuser, Basel, 1995, pp. 145–164.
[26] A.S. Kravchenko, Completeness of the space of separable measures in the Kantorovich–Rubinshteı̆n metric, Sibirsk. Mat. Zh. 47 (1) (2006) 85–96; English transl.: Siberian Math. J. 47 (1) (2006) 68–76.
[27] Benoit B. Mandelbrot, Fractals: Form, Chance, and Dimension, revised ed., W.H. Freeman and Co., San Francisco, CA, 1977, translated from the French.
[28] Benoit B. Mandelbrot, The Fractal Geometry of Nature, W.H. Freeman and Co., San Francisco, CA, 1982.
[29] R. Daniel Mauldin, Mariusz Urbański, Graph Directed Markov Systems, Cambridge Tracts in Math., vol. 148, Cambridge University Press, Cambridge, 2003.
[30] R. Daniel Mauldin, S.C. Williams, Random recursive constructions: Asymptotic geometric and topological properties, Trans. Amer. Math. Soc. 295 (1986) 325–346.
[31] S.P. Meyn, R.L. Tweedie, Markov Chains and Stochastic Stability, Springer-Verlag London Ltd., London, 1993.
[32] Lars Olsen, Random Geometrically Graph Directed Self-similar Multifractals, Pitman Res. Notes in Math. Ser., vol. 307, Longman Scientific & Technical, Harlow, 1994.
[33] Svetlozar T. Rachev, Probability Metrics and the Stability of Stochastic Models, Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics, John Wiley & Sons Ltd., Chichester, 1991.
Örjan Stenflo, Ergodic theorems for iterated function systems controlled by stochastic sequences, Umeå University,
Sweden, 1998, Doctoral thesis, no. 14.
Örjan Stenflo, Ergodic theorems for Markov chains represented by iterated function systems, Bull. Polish Acad.
Sci. Math. 49 (2001) 27–43.
Örjan Stenflo, Markov chains in random environments and random iterated function systems, Trans. Amer. Math.
Soc. 353 (2001) 3547–3562.
Cédric Villani, Topics in Optimal Transportation, Grad. Stud. in Math., vol. 58, American Mathematical Society,
Providence, RI, 2003.
Wei Biao Wu, Michael Woodroofe, A central limit theorem for iterated random functions, J. Appl. Probab. 37 (3)
(2000) 748–755.