Takacs-Fiksel method for stationary marked Gibbs point
processes
Jean-François Coeurjolly, David Dereudre, Rémy Drouilhet, Frédéric Lavancier
To cite this version:
Jean-François Coeurjolly, David Dereudre, Rémy Drouilhet, Frédéric Lavancier. Takacs-Fiksel method
for stationary marked Gibbs point processes. Scandinavian Journal of Statistics, 2012, 39 (3), pp.416443. 10.1111/j.1467-9469.2011.00738.x. hal-00502004v2
HAL Id: hal-00502004
https://hal.science/hal-00502004v2
Submitted on 6 Jan 2011
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Takacs-Fiksel method for stationary marked
Gibbs point processes
J.-F. Coeurjolly1,2 , D. Dereudre3 , R. Drouilhet2 and F. Lavancier4
1
2
3
4
GIPSA-lab, Grenoble University, France,
Laboratoire Jean Kuntzmann, Grenoble University, France
LAMAV UVHC FR 2956, Lille Nord de France University, France,
Laboratoire de Mathématiques Jean Leray, Nantes University, France.
January 6, 2011
Abstract
This paper studies a method to estimate the parameters governing the distribution of a stationary
marked Gibbs point process. This procedure, known as the Takacs-Fiksel method, is based on the
estimation of the left and right hand sides of the Georgii-Nguyen-Zessin formula and leads to a family
of estimators due to the possible choices of test functions. We propose several examples illustrating
the interest and flexibility of this procedure. We also provide sufficient conditions based on the model
and the test functions to derive asymptotic properties (consistency and asymptotic normality) of the
resulting estimator. The different assumptions are discussed for exponential family models and for
a large class of test functions. A short simulation study is proposed to assess the correctness of the
methodology and the asymptotic results.
Keywords: stationary marked Gibbs point processes, parametric estimation, Takacs-Fiksel method,
asymptotic properties, ergodic theorem, central limit theorem.
1
Introduction
Spatial point pattern data arise in a wide range of applications and since the seventies [10], [31] various
statistical methods have been developed to study these kinds of data (see [36], [37] or [27] for recent reviews). In particular, a spatial point process is often modelled as the realization of a Gibbs distribution,
defined through an interaction function, also called Hamiltonian. Gibbs models are extensions of the
well-known Poisson process since they constitute a way to introduce dependence between points. Inference for parametric models in this setting has known a large development during the last decade. The
most popular method to estimate the parameters is certainly the maximum likelihood estimator (MLE).
It involves an intractable normalizing constant, but recent developments in computational statistics, in
particular perfect simulations, have made inference feasible for many Gibbs models (see [4]). Although
the MLE suffers from a lack of theoretical justifications (only some results for sparse patterns are proposed in [33]), some comparison studies, as in [18], have shown that it outperforms the other estimation
methods. Nevertheless, the computation of the MLE remains very time-consuming and even extremely
difficult to perform for some models. It is thus necessary to have quick alternative estimators at one’s
disposal, at least to propose relevant initial values for the MLE computation. The maximum pseudolikelihood estimator (MPLE for short) constitutes one of them. Proposed by Besag in [7] and popularized
by J.L. Jensen and J. Møller in [29] and A. Baddeley and R. Turner in [1] for spatial point processes,
this method has the advantage of being theoretically well-understood (see [9], [16]) and of being much
1
faster to compute than the MLE. Another estimation procedure is the Takacs-Fiksel estimator, which
arose from [19], [45], [20]. It can be viewed, in some sense, as a generalization of the MPLE. As a matter
of fact, the Takacs-Fiksel method is not very popular, nor really used in practice. The main reason is
certainly its relatively poor performances, in terms of mean square error, observed for some particular
cases as in [18]. However, we think that this procedure deserves some consideration for several reasons
that we expose below.
The Takacs-Fiksel procedure is based on the Georgii-Nguyen-Zessin formula (GNZ formula for short).
Empirical counterparts of the left and the right hand sides of this equation are considered, and the induced
estimator is such that the difference of these two terms is close to zero. Since the GNZ formula is valid
for any test functions, the Takacs-Fiksel procedure does not only lead to one particular estimator but to a
family of estimators, depending on the choice of the test functions. This flexibility is the main advantage
of the procedure. We present several examples in the present paper. Let us summarize them in order to
underline the interest of this method (see Section 3.2 for more details). First, this procedure can allow us
to achieve estimations that likelihood-type methods cannot. As an example, we focus on the quermass
model, which gathers the area interaction point process as a particular case (see [30], [14]). This model
is sometimes used for geometric random objects. From a data set, one typically does not observe the
point pattern but only some geometric sets arising from these points. The non-observability of the points
makes the likelihood-type inference unfeasible. As we will show, this problem may be solved thanks to
the Takacs-Fiksel procedure, provided that the test functions are chosen properly.
Another motivation is the possibility to choose test functions depending on the Hamiltonian in order
to construct quicker estimators which do not require the computation of an integral for each value of
the parameter. This improvement appears crucial for rigid models, such as those involved in stochastic
geometry (see [17]), for which the MLE is prohibitively time-consuming and the MPLE still remains
difficult to implement. Moreover, for some models, it is even possible to obtain explicit estimators which
do not require any simulation nor optimization. This is illustrated for the Strauss model.
Therefore, it appears important to us to understand the theoretical properties of this procedure. This
problem is the main objective of the present paper. We prove the consistency and the asymptotic normality of the induced estimator in a very general setting. In particular, we obtain a central limit theorem
for the Takacs-Fiksel estimator with the classical rate of convergence, i.e. square root of the volume of
the observation domain. This asymptotic result leads to the following comment: as a quick consistent
estimator, the Takacs-Fiksel estimator appears to be a very good starting point for refined algorithms.
Among them, let us mention the first step Newton-Raphson algorithm (as used in [26]) which allows an
accurate approximation of the MLE, starting from a consistent estimator, in only one step. Although the
theoretical justifications are missing in the Gibbs framework, it is well-known that this procedure leads to
an efficient estimator in the classical iid case (see [32]). Another possibility could be to exploit the local
asymptotic normality of the model in order to construct an adaptive estimator from the Takacs-Fiksel
estimator. The Local Asymptotic Normality property (LAN) has only been proved for restrictive models
in [33], but one can hope that it remains true for most Gibbs models. This procedure also leads to an
efficient estimator. All these possibilities are interesting prospects for future investigations.
Some asymptotic properties of the Takacs-Fiksel procedure have already been investigated in two
previous studies: one by L. Heinrich in [25] and the one by J.-M. Billiot in [8]. These papers have
different frameworks and are based on different tools, but they both involve regularity and integrability
type assumptions on the Hamiltonian and a theoretical condition which ensures that the contrast function
(associated to the Takacs-Fiksel procedure) has a unique minimum. In [25], the consistency and the
asymptotic normality are obtained for a quite large class of test functions. These results are, however,
proved under the Dobrushin condition (see Theorems 2 and 3 in [25]) which implies the uniqueness of the
underlying Gibbs measure and some mixing properties. This condition imposes a dramatic reduction of
the space of possible values for the parameters of the model. In [8], the author focuses only on pairwise
interaction point processes (which excludes the quermass model for example). The author mainly obtained
the consistency for a specific class of test functions. In the case of a multi-Strauss pairwise interaction
point process, the author also proved that the identifiability condition holds for the class of test functions
he considered.
2
In contrast, our asymptotic results are proved in a very general setting, i.e. for a large class of
stationary marked Gibbs models and test functions. The method employed to prove asymptotic normality
is based on a conditional centering assumption, first appeared in [24] for the Ising model and generalized
to certain spatial point processes in [28]. The main restriction that this method induces is only the finite
range of the Hamiltonian. There are no limitations on the space of parameters and, in particular, the
possible presence of phase transition does not affect the asymptotic behavior of the estimator. Moreover,
the test functions may depend on the parameters. This extension seems important to us because, as
emphasized in Section 3.2.2, such test functions can lead to quick and/or explicit estimators. All the
general hypotheses assumed for the asymptotic results are discussed. For this, we focus on exponential
family models, that is, on models whose interaction function is linear in the parameters. We show that
our integrability and regularity assumptions are not restrictive since they are valid for a large class of
models such as the Multi-Strauss marked point process, the Strauss-disc type point process, the Geyer’s
triplet point process, the quermass model and for all test functions used as a motivation for this work. In
the setting of the exponential family models, we also discuss the classical identifiability condition which
is required for the Takacs-Fiksel procedure. To the best of our knowledge, this is the first attempt to
discuss it. We will specially dwell on questions like: what choices of test functions (and how many test
functions) lead to a unique minimum of the contrast function? We propose general criteria and provide
examples. It seems commonly admitted that to achieve the identification of the Takacs-Fiksel procedure,
one should at least choose as many test functions as the number of parameters. As a consequence of
our study, it appears that one should generally strictly choose more test functions than the number of
parameters to achieve identification.
The rest of the paper is organized as follows. Section 2 introduces notation and a short background
on marked Gibbs point processes. The Takacs-Fiksel method is presented in Section 3. It is based on
the GNZ formula which is recalled in Section 3 also. Several examples of test functions are given. They
aim at illustrating our interest in considering the Takacs-Fiksel procedure. The asymptotic results of the
induced estimator are proposed in Section 4. Our results are obtained from a single realization observed in
a domain whose volume is supposed to increase to infinity. Some integrability and regularity assumptions
made for the Hamiltonian and for the test functions are discussed in this section while in Section 5 the
identifiability condition is specifically dealt with. In Section 6, the very special situation where the energy
function is not hereditary is considered. The GNZ formula is no longer valid in this setting, but it has
been recently extended in [16] thanks to a slight modification. This leads to a natural generalization of
the Takacs-Fiksel procedure. In Section 7, we propose a brief simulation study in order to assess the
correctness of the asymptotic results we obtained in Section 4. Finally, Section 8 contains the proofs of
the asymptotic results.
2
2.1
Background and notation
General notation, configuration space
Subregions of Rd will typically be denoted by Λ or ∆ and will always be assumed to be Borel with
positive Lebesgue measure. We write Λ ⋐ Rd if Λ is bounded. Λc denotes the complementary set of Λ
inside Rd . The notation |.| will be used without ambiguity for different kind of objects. For a countable
set J , |J | represents the number of elements belonging to J ; For Λ ⋐ Rd , |Λ| is the volume of Λ; For
x ∈ Rd , |x| corresponds to its uniform norm while kxk is its Euclidean norm. For all x ∈ Rd , ρ > 0, let
B(x, ρ) := {y ∈ Rd , ky − xk < ρ}. For a matrix M, let kMk be the Frobenius norm of M defined by
kMk2 = T r(MT M), where T r is the trace operator.
The space Rd is endowed with the Borel σ-algebra B(Rd ) and the Lebesgue measure λ. Let
be a
measurable space, which aims at being the mark space, endowed with the σ-algebra M and the probability
measure λm. The state space of the point processes will be := Rd ×
measured by µ := λ ⊗ λm. We
m
shall denote for short x = (x, m) an element of .
A configuration is a subset ϕ of which is locally finite in that ϕΛ := ϕ∩(Λ× ) has finite cardinality
NΛ (ϕ) := |ϕΛ | for all Λ ⋐ Rd . The space Ω = Ω( ) of all configurations is equipped with the σ-algebra
F that is generated by the counting variables ϕ → |ϕΛ×A | for any Λ ⋐ Rd and any A ∈ M. Finally,
let T = (τx )x∈Rd be the shift group, where τx : Ω → Ω is the translation by the vector −x ∈ Rd (i.e.
S
S
S
S
3
M
M
M
the application ϕ 7→ ∪(y,m)∈ϕ {(y − x, m)}). For the sake of simplicity, we set ϕ ∪ xm := ϕ ∪ {xm } and
ϕ \ xm := ϕ \ {xm }.
2.2
Marked Gibbs point processes
Our results will be expressed for general stationary Gibbs point processes. Since we are interested in
asymptotic properties, we have to consider these point processes acting on the infinite volume Rd . Let
us briefly recall their definition.
A marked point process Φ is an Ω-valued random variable, with probability distribution P on (Ω, F ).
The most prominent marked point process is the marked Poisson process π z with intensity measure
zλ ⊗ λm on , with z > 0 (see for example [34] for definition and properties). For Λ ⋐ Rd , let us denote
z
by πΛ
the marginal probability measure in Λ of the Poisson process with intensity z. Without loss of
1
generality, the intensity z is fixed to 1 and we simply write π and πΛ in place of π 1 and πΛ
.
p
d
Let θ ∈ R (for some p ≥ 1). For any Λ ⋐ R , let us consider the parametric function VΛ (.; θ) from
Ω into R ∪ {+∞}. From a physical point of view, VΛ (ϕ; θ) is the energy of ϕΛ in Λ given the outside
configuration ϕΛc . In this paper, we focus on stationary point processes on Rd , i.e. with stationary (i.e.
T -invariant) probability measure. For any Λ ⋐ Rd , we therefore consider VΛ (.; θ) to be T -invariant, i.e.
VΛ (τx ϕ; θ) = VΛ (ϕ; θ) for any x ∈ Rd . Furthermore, (VΛ (.; θ))Λ⋐Rd is a compatible family of energies, i.e.
for every Λ ⊂ Λ′ ⋐ Rd , there exists a measurable function ψΛ,Λ′ from Ω into R ∪ {+∞} such that
S
∀ϕ ∈ Ω
VΛ′ (ϕ; θ) = VΛ (ϕ; θ) + ψΛ,Λ′ (ϕΛc ; θ).
(1)
A stationary marked Gibbs point process is characterized by a stationary marked Gibbs measure
usually defined as follows (see [41]).
Definition 1 A probability measure Pθ on Ω is a stationary marked Gibbs measure for the compatible
family of T -invariant energies (VΛ (.; θ))Λ⋐Rd if for every Λ ⋐ Rd , for Pθ -almost every outside configuration ϕΛc , the law of Pθ given ϕΛc admits the following conditional density with respect to πΛ :
fΛ (ϕΛ |ϕΛc ; θ) =
1
e−VΛ (ϕ;θ) ,
ZΛ (ϕΛc ; θ)
where ZΛ (ϕΛc ; θ) is a normalization called the partition function.
The existence of a Gibbs measure on Ω which satisfies these conditional specifications is a difficult
issue. We refer the interested reader to [42], [41], [5], [13], [15] for the technical and mathematical
development of the existence problem.
In a first step, we assume that the family of energies is hereditary (the non-hereditary case will be
considered in Section 6), which means that for any Λ ⋐ Rd , for any ϕ ∈ Ω, and for all xm ∈ Λ × ,
M
VΛ (ϕ; θ) = +∞ ⇒ VΛ (ϕ ∪ xm ; θ) = +∞,
(2)
or equivalently, for all xm ∈ ϕΛ , fΛ (ϕΛ |ϕΛc ; θ) > 0 ⇒ fΛ (ϕΛ \ xm |ϕΛc ; θ) > 0.
The minimal assumption of our paper is then:
[Mod]: For any θ ∈ Θ, where Θ is a compact subset of Rp , there exists a stationary marked Gibbs
measure Pθ for the compatible T -invariant hereditary family (VΛ (.; θ))Λ⋐Rd . Our data consist in the
realization of a marked point process Φ with stationary marked Gibbs measure Pθ⋆ , where θ⋆ ∈ Θ̊
is the unknown parameter vector to estimate.
Let us note that [Mod] ensures the existence of at least one stationary Gibbs measure. When this Gibbs
measure is not unique, we say that the phase transition occurs. In this situation the set of Gibbs measures
is a Choquet simplex and any Gibbs measure is a mixture of extremal ergodic Gibbs measures. If the
Gibbs measure is unique, it is necessary ergodic (see [22] for more details about these properties).
In the rest of this paper, the reader has mainly to keep in mind the concept of local energy defined as
the energy required to insert a point xm into the configuration ϕ and expressed for any Λ ∋ x by
V (xm |ϕ; θ) := VΛ (ϕ ∪ xm ; θ) − VΛ (ϕ; θ).
From the compatibility of the family of energies, i.e. (1), this definition does not depend on Λ.
4
(3)
3
The Takacs-Fiksel estimation procedure
3.1
Presentation
The basic ingredient for the definition of the Takacs-Fiksel method is the so-called GNZ formula. This
equation was proved by Georgii, Nguyen and Zessin in the seventies but other authors such as Papangelou
and Takahashi also contributed to its establishment. See [40] and [46] for historical comments and [21]
or [38] for a general presentation.
Lemma 1 (Georgii-Nguyen-Zessin Formula) Under [Mod], for any measurable function h(·, ·; θ) :
S × Ω → R such that the following quantities are defined and finite, then
!
Z
X
m
−V (xm |Φ;θ ⋆ )
m
m
m
E
h (x , Φ; θ) e
µ(dx ) = E
h (x , Φ \ x ; θ) ,
(4)
Rd ×
M
xm ∈Φ
where E denotes the expectation with respect to Pθ⋆ .
For stationary marked Gibbs point processes, (4) reduces to
M
⋆
E h 0M , Φ; θ e−V (0 |Φ;θ ) = E h 0M , Φ \ 0M ; θ ,
(5)
where M denotes a random variable with probability distribution λm.
A second tool used throughout the paper is the ergodic Theorem established in [39]. Let us give a
simpler form, here, which is sufficient in this paper.
Lemma 2 (Ergodic result) Under [Mod], we assume that Pθ⋆ is ergodic. Then for any family of
measurable functions FΛ , indexed by the bounded sets Λ, from Ω to R which are additive (i.e. FΛ∪Λ′ =
FΛ + FΛ′ − FΛ∩Λ′ ), shift invariant (i.e. FΛ (ϕ) = Fτ (Λ) (τ (ϕ)) for any translation τ ) and integrable (i.e.
E(|F[0,1]d |) < +∞), we have that for Pθ⋆ -almost every ϕ
lim |Λn |−1 FΛn (ϕ) = E(F[0,1]d ),
n→+∞
where Λn = [−n, n]d (other regular domains (Λn )n≥1 converging towards Rd could be also considered).
Let h(·, ·; θ) : S × Ω → R and let us define for any ϕ ∈ Ω, θ ∈ Θ and Λ ⋐ Rd
Z
X
m
h(xm , ϕ \ xm ; θ).
CΛ (ϕ; h, θ) :=
h(xm , ϕ; θ)e−V (x |ϕ;θ)µ(dxm ) −
Λ×
M
(6)
xm ∈ϕΛ
Assume that we observe the realization of a marked point process Φ satisfying [Mod] in a domain
Λn . For appropriate choices of the functional h and a sequence of domain Λn , then it is possible to apply
the ergodic result in Lemma 2 to prove that the first and second terms of |Λn |−1 CΛn (Φ; h, θ) respectively
converge Pθ⋆ -almost surely to the left and right terms of (5).
The latter observation is the basic argument to define the Takacs-Fiksel method. Let us give K
functions hk (·, ·; θ) : S × Ω → R (for k = 1, . . . , K), then the Takacs-Fiksel estimator is simply defined by
θbn (ϕ) := θbnT F (ϕ) = arg min
θ∈Θ
K
X
CΛn (ϕ; hk , θ)2 ,
(7)
k=1
where arg minθ∈Θ F (θ) means the parameter θ which minimizes the function F . Under identification
assumption (16) given later, this minimum is obtained for a unique θ provided that n is large enough.
5
3.2
Some examples
In this section, some examples of models and test functions h, involved in (6), are provided. The choices
made in previous studies are presented in 3.2.1. The two examples presented in 3.2.2 and 3.2.3 show
the relevance of the Takacs-Fiksel procedure to provide quick estimates. The last example, in 3.2.4, is
concerned with the possible identification of a special marked point process, the quermass model, and
shows that appropriate choices of test functions can solve identification problems when points are not
observed.
There is no asymptotic consideration in this section. Therefore, the different estimates are defined
b
over a window, say Λ, and are denoted by θ.
3.2.1
Classical examples
Let us first quote the particular case when the Takacs-Fiksel estimator reduces to the maximum pseudolikelihood estimator introduced by [29] for spatial point processes. The MPLE is obtained by maximizing
the log-pseudo-likelihood contrast function, given by
Z
X
m
V (xm |ϕ \ xm ; θ).
(8)
e−V (x |ϕ;θ) µ(dxm ) −
LP LΛ (ϕ; θ) = −
Λ×
M
xm ∈ϕΛ
b defined by (7),
(xm |ϕ; θ), k = 1, . . . , p, the estimator θ,
solves the system CΛ (ϕ; hk , θ) = 0, k = 1, . . . , p, which means that θb is the root of the gradient vector of
LP LΛ , i.e. θb is the MPLE.
The first empirical study of the Takacs-Fiksel estimator can be found in [18], where this estimate is
compared to other estimators for some unmarked pairwise interaction point processes with two parameters. In this context different test functions hr1 , · · · , hrK were used, where for r > 0
X
hr (x, ϕ; θ) = ϕB(x,r) =
1[0,r] (ky − xk).
Therefore, with the choice hk (xm , ϕ; θ) =
∂
∂θk V
y∈ϕ
The integral term involved in (6) is approximated by discretization and the induced estimation (7) is then
assessed. Note that with this choice of test functions, the sum term in (6), when normalized by |Λ|−1 , is
an estimation of ρ2 K(r), where K(·) is the reduced second order function and ρ denotes the intensity of
the stationary point process Φ, i.e. for all B ∈ B(Rd ), E(Φ(B)) = ρ|B|.
The latter choice requires the computation of the integral in (6) for all θ. A more convenient choice
could be the one first proposed by Fiksel:
X
hr (x, ϕ; θ) = ϕB(x,r) eV (x|ϕ;θ) = eV (x|ϕ;θ)
1[0,r] (ky − xk).
(9)
y∈ϕ
In the stationary case, this leads to the following approximation thanks to the ergodic theorem
Z
1
hr (x, ϕ; θ)e−V (x|ϕ;θ)dx ≈ E ΦB(0,r) = ρπr2 .
|Λ| Λ
(10)
The integral term in (6) is thus easily approximated by |ϕΛ |πr2 for all θ, while the sum can be explicitly
computed. These historical choices of test functions have a natural extension in the marked case.
3.2.2
Some choices leading to quick estimations
The main advantage of the Takacs-Fiksel procedure is to provide quick consistent estimators, that might
supply initial values for a more evolved procedure. A simple way to achieve this goal is to generalize (9)
and consider test functions of the form
h(xm , ϕ; θ) = h̃(xm , ϕ)eV (x
m
m
|ϕ;θ)
,
where h̃(x , ϕ) does not depend on θ. So, the integral term in (6) has to be computed only once and not
for all θ, while the sum term in (6) does not require any approximation. Hence, the optimisation problem
(7) may be resolved very quickly.
In some particular examples, explicit formulas may even be obtained for the integral term, as in (10).
In the same spirit, an explicit estimator for the Strauss interaction is provided below.
6
3.2.3
An example of explicit estimator for the Strauss process
The (non-marked) Strauss process with range of interaction R > 0 is given for any Λ ⋐ Rd by
X
VΛ (ϕ; θ) = θ1 |ϕΛ | + θ2
1[0,R] (kx − yk) ,
(11)
x,y∈ϕ
{x,y}∩Λ6=∅
where θ1 ∈ R and θ2 > 0 are the two parameters of the model. Alternatively,
X
V (x|ϕ; θ) = θ1 + θ2
1[0,R] (ky − xk).
y∈ϕ
Let us consider the following family of test functions, for k ∈ N \ {0},
(
e(k−1)θ2 if ϕB(x,R) = k − 1,
hk (x, ϕ; θ) =
0
otherwise.
(12)
This choice gives in (6)
CΛ (ϕ; hk , θ) = e−θ1 Vk−1,Λ (ϕ) − e(k−1)θ2 Nk−1,Λ (ϕ),
where Nk,Λ (ϕ) denotes the number of points x ∈ ϕΛ such that |B(x, R) ∩ (ϕ \ x)| = k and Vk,Λ (ϕ) denotes
the volume of the set {y ∈ Λ, |B(y, R) ∩ ϕ| = k}.
Several explicit estimators may be obtained following (7) from (at least) two test functions as above.
Let us quote the simplest one, corresponding to the choice h1 and h2 in (7). This leads to the contrast
function CΛ (ϕ; h1 , θ)2 + CΛ (ϕ; h2 , θ)2 which vanishes at the unique point (θb1 (ϕ), θb2 (ϕ)) with
V1,Λ (ϕ)
V0,Λ (ϕ)
V0,Λ (ϕ)
, θb2 (ϕ) = ln
− ln
.
(13)
θb1 (ϕ) = ln
N0,Λ (ϕ)
N1,Λ (ϕ)
N0,Λ (ϕ)
This estimator of (θ1 , θ2 ) is completely explicit, provided the quantities Nk,Λ (ϕ) and Vk,Λ (ϕ) are available.
They can be easily approximated by computational geometry tools.
3.2.4
A solution for unobservability issues
The quermass model introduced in [30] is a marked point process which aims at modelling random sets in
R2 . This is a generalization of the well-known Boolean model to interacting random balls. Let us denote
by xR a marked point where x and R > 0 (i.e. the mark) respectively represent the center and the radius
of the associated ball B(x, R). For a finite configuration ϕ, i.e. with a finite support instead of R2 , the
quermass energy is defined for (θ1 , θ2 , θ3 , θ4 ) ∈ R4 by:
[
B(x, R)
V (ϕ; θ) = θ1 |ϕ| + θ2 P(Γ) + θ3 A(Γ) + θ4 E(Γ) where Γ =
(x,R)∈ϕ
and P(Γ), A(Γ) and E(Γ) denote respectively the perimeter, the area and the Euler-Poincaré characteristic
(i.e. number of components minus number of holes) of the set Γ. To extend this definition to the infinite
support R2 , it is convenient to suppose that the radii of the balls are almost surely uniformly bounded
(i.e. λm([0, R0 ]) = 1 for some R0 > 0). In this case the family of energies (VΛ ) is defined by
VΛ (ϕ; θ) = V ϕΛ⊕B(0,2R0 ) ; θ − V ϕΛ⊕B(0,2R0 )\Λ ; θ .
This definition may be extended to unbounded radius, though a restriction to the so-called tempered
configurations is needed to ensure the existence of the associated Gibbs measure. We refer to [14] for
more details.
When θ2 = θ3 = θ4 = 0, this model reduces to the Boolean model (see [44] for a survey). The area
process (see [3]) is also a particular case, taking θ2 = θ4 = 0.
7
In practice, one only observes the random set Γ, so the marked points xR in ϕ are unknown. A
challenging task is then to estimate the parameters (θ1 , θ2 , θ3 , θ4 ) in the presence of this unobservability
issue. In particular, a direct application of the maximum likelihood or pseudo-likelihood method is
impossible to estimate all the parameters, and especially θ1 which requires the observation of the number
of points in ϕ. For the other parameters, which are related to the observable functionals P, A and E, the
MLE has been investigated in [35].
Let us show that the Takacs-Fiksel procedure may be used to estimate θ1 in spite of this unobservability
issue. Indeed, it is possible to choose some test function h such that both the integral and the sum in (6)
are computable. The unobservability issue occurs mainly for the sum term. Let us consider the following
example of test function:
hper (xR , ϕ; θ) = P (C(x, R) ∩ Γc ) ,
(14)
where C(x, R) is the sphere {y, ||y − x|| = R}. For any finite configuration ϕ, we then have
X
hper (xR , ϕ \ xR ; θ) = P(Γ),
xR ∈ϕ
so that this sum is computable evenPif each term hper (xR , ϕ \ xR ; θ) is not. If the configuration ϕ is
infinite then for any bounded set Λ, xR ∈ϕΛ hper (xR , ϕ \ xR ; θ) is equal to the perimeter of Γ restricted
to Λ plus a boundary term which is asymptotically negligible with respect to the volume of Λ.
Consequently, assuming (θ2 , θ3 , θ4 ) known, θ1 may be estimated thanks to (7) with the above choice
of test function.
The joint estimation of all four parameters might be achieved thanks to additional test functions
sharing the same property as above, i.e. such that the sum in (6) is observable.
In the particular case of the area process, θ2 = θ4 = 0 and R is constant (i.e. λm = δR ), it suffices
to find one more test function to ensure an identifiable estimation (see Example 2 in Section 5 for more
details). A possible additional test function is
(
1 if P (C(x, R)) = 2πR
R
hiso (x , ϕ; θ) =
(15)
0 otherwise.
In this case
4
P
xR ∈ϕ
hiso (xR , ϕ \ xR ; θ) corresponds to the number of isolated balls in Γ.
Asymptotic results for the Takacs-Fiksel estimator
We present in this section asymptotic results for the Takacs-Fiksel estimator for a point process satisfying
[Mod] and assumed to be observed in a domain Λn , where (Λn )n≥1 is a sequence of increasing cubes
whose size goes to +∞ as n goes to +∞.
First, for a function g depending on θ, we denote by g(1) (θ) (resp. g(2) (θ)) the gradient vector of
length p (resp. the Hessian matrix of size (p, p)) evaluated at θ. Let us rewrite the Takacs-Fiksel estimator
as
θbn (ϕ) = arg min UΛn (ϕ; h, θ),
θ∈Θ
with UΛn (ϕ; h, θ) = |Λn |−2
4.1
Consistency
PK
k=1
CΛn (ϕ; hk , θ)2 , where h = (h1 , . . . , hK ) and CΛn is given by (6).
The consistency is obtained under the following assumptions, denoted by [C]: for any Gibbs measure
Pθ⋆ , for all θ ∈ Θ, k = 1, . . . , K and ϕ ∈ Ω
[C1] For all x ∈ Rd , hk (xm , ϕ; θ) = hk (0m , τx ϕ; θ) and
M
′
E hk 0M , Φ; θ e−V (0 |Φ;θ ) < +∞, for θ′ = θ, θ⋆ .
[C2] UΛn (ϕ; h, ·) is a continuous function for Pθ⋆ −a.e. ϕ.
8
[C3]
K
X
k=1
2
M
M
⋆
=0
E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ )
=⇒ θ = θ⋆ .
(16)
m
[C4] hk and fk , defined by fk (xm , ϕ; θ) := hk (xm , ϕ; θ)e−V (x |ϕ;θ) , are continuously differentiable and
M
M
−V (0M |Φ;θ ⋆ )
< +∞,
E max |fk (0 , Φ; θ)| < +∞ and E max |hk (0 , Φ; θ)|e
θ∈Θ
θ∈Θ
(1) M
E max kfk (0 , Φ; θ)k < +∞
θ∈Θ
(1) M
−V (0M |Φ;θ ⋆ )
< +∞.
E max khk (0 , Φ; θ)ke
and
θ∈Θ
Theorem 3 Assuming [Mod] and [C] then, as n → +∞, the Takacs-Fiksel estimator θbn (ϕ) converges
towards θ⋆ for Pθ⋆ −a.e. ϕ.
Assumptions [C1], [C2] and [C4] are related to the regularity and the integrability of the different
test functions and the local energy function. Some general criteria may be proposed to verify these
assumptions, see Section 4.3 for a discussion. Assumption [C3] corresponds to an identifiability condition
and requires much more attention. It is well-known that such an assumption is fulfilled when h = V(1)
(leading to the MPLE) under mild assumptions (see Assumption [Ident] proposed by [9]). The question
to know if this remains true for more general test functions is difficult (actually it is untrue in several
cases). This will be discussed specifically in Section 5.
4.2
Asymptotic normality
We need the following assumptions denoted by [N]: For any Gibbs measure Pθ⋆ , k = 1, . . . , K, Λ ⋐ Rd ,
ϕ ∈ Ω and θ in a neighborhood V(θ⋆ ) of θ⋆ :
[N1] E |CΛ (Φ; hk , θ⋆ )|3 < +∞.
[N2] For any sequence of bounded domains Γn such that Γn → 0 as n → +∞, E CΓn (Φ; hk , θ⋆ )2 → 0.
[N3] CΛ (ϕ; hk , θ⋆ ) depends only on ϕΛ⊕B(0,D) for some D ≥ 0 (which is uniform in Λ, ϕ, θ⋆ ).
[N4] hk and fk (defined in [C4]) are twice continuously differentiable in θ and
M
⋆
E kh(2) (0M , Φ; θ)ke−V (0 |Φ;θ ) < +∞ and E kf (2) (0M , Φ; θ)k < +∞.
Let us remark that Assumption [N3] leads us to consider in general that V has a finite range, which
means that there exists D ≥ 0 such that for all (m, ϕ) ∈
× Ω and all θ ∈ Θ
V (0m |ϕ; θ) = V 0m |ϕB(0,D) ; θ .
M
The same kind of finite range property is also expected for (hk ).
Theorem 4 Under Assumptions [Mod], [C] and [N], for any ergodic Gibbs measure Pθ⋆ the following
convergence in distribution holds as n → +∞
d
(17)
|Λn |1/2 E(h, θ⋆ )E(h, θ⋆ )T θbn (Φ) − θ⋆ → N 0, E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ )T ,
where E(h, θ⋆ ) is the (p, K) matrix defined for i = 1, . . . , p and k = 1, . . . , K by
M
⋆
(E(h, θ⋆ ))ik = E hk (0M , Φ; θ⋆ ) V(1) (0M |Φ; θ⋆ ) e−V (0 |Φ;θ )
i
and where Σ(h, θ⋆ ) is the (K, K) matrix defined by
X
e ∆ (D) (Φ; h, θ⋆ )C
e ∆ (D) (Φ; h, θ⋆ )T ,
E C
Σ(h, θ⋆ ) = D−d
0
ℓ
(18)
|ℓ|≤1
where, for all k ∈ Zd , ∆k (D) is the cube centered at kD with side-length D and where, for any bounded
e Λ (Φ; h, θ⋆ ) := (CΛ (Φ; hk , θ⋆ ))
domain Λ, C
k=1,...,K .
9
(1)
Remark 1 While Assumptions [N1-3] will ensure that a central limit theorem holds for UΛn (Φ; h, θ⋆ ),
(2)
Assumption [N4] is required to prove that UΛn (Φ; h, θ) is uniformly bounded in a neighborhood of θ⋆ .
These two statements allow us to apply a general central limit theorem for minimum contrast estimators
(e.g. Theorem 3.4.5 of [23]).
Remark 2 In Theorem 4, if the Gibbs measure Pθ⋆ is stationary (and then not necessarily ergodic), it
is a mixture of ergodic measures and the left hand term in (17) converges in distribution to a mixture
of normal distributions. We may also propose an asymptotic result valid in the presence of a phase
T
transition. Indeed, if the matrix E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ ) is positive definite (for any extremal ergodic
measure of the Choquet simplex of stationary Gibbs measures), then we may define a consistent empirical
h
i−1/2
version S(Φ) of E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ )T
E(h, θ⋆ )E(h, θ⋆ )T (see Section 4 of [12] for more details)
to obtain:
d
|Λn |1/2 S(Φ) θbn (Φ) − θ⋆ → N (0, I) .
Remark 3 Following Section 3.2.1, let us underline that (17) is coherent with the asymptotic normality
of the MPLE established in [11], i.e. with the case K = p and h = V(1) . Indeed, with similar assumptions
to the ones presented in the present paper, Equation (4.4) in Theorem 2 [11] states that
d
(19)
|Λn |1/2 A(θ⋆ ) θbn (Φ) − θ⋆ → N 0, Σ(V(1) , θ⋆ ) ,
where A(θ⋆ ) is the symmetric (p, p) matrix given for i, k = 1, . . . , p by
M
⋆
V(1) (0M |Φ; θ⋆ ) e−V (0 |Φ;θ ) .
(A(θ⋆ ))ik = E V(1) (0M |Φ; θ⋆ )
k
i
Since h = V(1) , then E(V(1) , θ⋆ ) := A(θ⋆ ) and therefore (17) reduces to
d
|Λn |1/2 A(θ⋆ )2 θbn (Φ) − θ⋆ → N 0, A(θ⋆ )Σ(V(1) , θ⋆ )A(θ⋆ )
which is exactly (19) by assuming that A(θ⋆ ) is invertible.
Remark 4 The question how to choose the test functions in order to minimize the norm of the asymptotic
covariance matrix is difficult to answer, still open and is a perspective for future work.
4.3
Discussion
The present paragraph is devoted to the discussion of Assumptions [Mod], [C] (except [C3]) and [N].
In the previous sections, we have expressed the different assumptions in a very general way. Our aim,
here, is to make these assumptions concrete for a wide range of models and a wide range of test functions,
in order to illustrate that our setting is not restrictive. In particular, we will focus on exponential family
models having a local energy of the form:
V (xm |ϕ; θ) := θT V(xm |ϕ) = θ1 V1 (xm |ϕ) + . . . + θp Vp (xm |ϕ),
S
S
with V = (V1 , . . . , Vp ) a vector function from × Ω( ) to Rp .
Let us consider the following assumptions: for all (m, ϕ) ∈
(20)
M × Ω.
(inf)
[Exp] For i = 1, · · · , p, for all x ∈ Rd , Vi (xm |ϕ) = Vi (0m |τx ϕ) and there exist κi
N, D > 0 such that one of the two following assumptions is satisfied :
(inf)
θi ≥ 0 and − κi
or
(inf)
−κi
(sup)
≤ Vi (0m |ϕ) = Vi (0m |ϕB(0,D) ) ≤ κi
(sup)
≤ Vi (0m |ϕ) = Vi (0m |ϕB(0,D) ) ≤ κi
10
.
(sup)
, κi
|ϕB(0,D) |ki .
≥ 0, ki ∈
^ Assumption [Exp] with ki = 0 or 1 for all i (when θi ≥ 0).
[Exp]
[H] For all x ∈ Rd , h(xm , ϕ; θ) = h(0m , τx ϕ; θ) and there exist κ > 0, k ∈ N, D > 0 such that
h(0m , ϕ; θ) = h(0m , ϕB(0,D) ; θ), such that h(0m , ϕ; ·) is twice continuously differentiable in θ and
such that |Y (ϕ, m)| ≤ κ|ϕB(0,D) |k , where
(21)
Y (ϕ, m) := max |h(0m , ϕ; θ)|, kh(1) (0m , ϕ; θ)k, kh(2) (0m , ϕ; θ)k .
T
m
g h(0m , ϕ; θ) = e
[H]
h(0m , ϕ; θ)eθ V(0 |ϕ) with e
h satisfying [H].
Let us underline that the different constants involved in these assumptions are assumed to be independent
of m, ϕ, θ. Note also that if the test function h is independent of θ, Y (ϕ, m) obviously reduces to |h(0m , ϕ)|.
These assumptions are common and very simple to check. Assumption [Exp] has already been
investigated in [9]. It includes a wide variety of models such as the overlap area point process, the multiStrauss marked point process, the k−nearest-neighbor multi-Strauss marked point process, the Strauss
type disc process, the Geyer’s triplet point process, the area process, some special cases of quermass
process (for instance when λm has a compact support not containing 0 and θ4 = 0), etc. Among these
^ is the Geyer’s triplet point process (see [9], p.242).
models, the only one that does not satisfy [Exp]
(1)
On the other hand, the test functions h(xm , ϕ; θ) = 1, h(xm , ϕ; θ) = Vk (xm |ϕ; θ) = Vk (xm |ϕ),
h(xm , ϕ; θ) = |ϕB(x,r) | satisfy [H] (for the second one, it is implied by [Exp]). Note that the functional described by (12) for the Strauss model depending on θ, the functionals hper and hiso in (14),
T
m
T
m
T
m
(15) also satisfy [H]. In a similar way, test functions like eθ V(x |ϕ) , eθ V(x |ϕ)/2 , |ϕB(x,r) |eθ V(x |ϕ) ,
T
m
g
1[0,r] (d(xm , ϕ))eθ V(x |ϕ) satisfy [H].
We show that most of all the assumptions required in Theorems 3 and 4 are not too restrictive.
Proposition 5 (i) Under Assumption [Exp], Assumption [Mod] is fulfilled.
g and a model satisfying [Exp] (resp. [Exp]),
^ then
(ii) For a test function satisfying [H] (resp. [H])
Assumptions [C] (excepted [C3]) and [N] are fulfilled.
Proof. (i) From [Exp], it is easy to check that for any fixed θ, there exists a positive constant K(θ)
such that V (0m |ϕ; θ) ≥ −K(θ). Therefore the local energy is stable for any θ. On the other hand it
is finite range. These two properties imply the existence of ergodic measures (see e.g. [5], Proposition
1). Among them, there exists at least one stationary measure due to the invariance by translation of the
family of energy functions.
(ii) [C2] and [N3] are quite obvious to check. Now, from the stability of the local energy, we have
that for all (m, ϕ) ∈
× Ω, θT V(0m |ϕ) ≥ −ρ for ρ < +∞, independent of m, ϕ, θ (this is possible
because Θ is compact). Let us also underline that this property ensures that for every Λ ⋐ Rd , every
c ∈ R, E(ec|ΦΛ | ) < +∞ (see e.g. Proposition 11 of [6]), which obviously implies that E(|ΦΛ |α ) < +∞ and
g and [Exp]),
^ the expectations
E(|ΦΛ |α ec|ΦΛ | ) < +∞ for every α > 0. Now, under [H] and [Exp] (or [H]
in [C1], [C4], [N1] and [N4] are clearly finite. Let us focus on [C1] for example (the justification for
the other assumptions is similar). We have for any θ, θ′ ∈ Θ
M
′T
E |h(0M , Φ; θ)|e−θ V(0 |Φ)
M
≤
(
eρ E |h(0M , Φ; θ)| ≤ c × eρ E |ΦB(0,D) |α
under [H] and [Exp]
ρ
α c|ΦB(0,D) |
ρ
M
θT V(0M |Φ)
g and [Exp],
^
e
≤ c × e E |ΦB(0,D) | e
under [H]
e E |h(0 , Φ; θ)|e
g
^
for some constants α and c. Note that, if we had not assumed
[Exp] for test functions of the form [H],
k
then one would have had expectations of the form E e|ΦΛ | for some k > 1 which is not necessarily
finite under the local stability property. Assumption [N2] is proved similarly and by using the dominated
convergence theorem.
11
Remark 5 By following ideas in [11], it is possible to fulfill the integrability type assumptions for more
complicated models such as the Lennard-Jones model (which is not locally stable and nonlinear in terms
of the parameters). The using of Ruelle’s estimates [43] plays a crucial role in this case of superstable
interaction. For the sake of conciseness and simplicity, we do not investigate this in the present paper.
5
Identifiability : Assumption [C3]
The Assumption [C3] is related to the identifiability of the estimation procedure. It is more complicated
to verify than the other assumptions and an investigation to obtain a criterion or a characterization seems
necessary. We address this question in this section.
In the following, we consider that the interaction has an exponential form as in (20). Then [C3] is
equivalent to: θ = θ⋆ is the unique solution of the nonlinear system of equations in θ defined by
T
M
⋆T
M
= 0,
1 ≤ k ≤ K.
(22)
E hk (0M , Φ; θ) e−θ V(0 |Φ) − e−θ V(0 |Φ)
If hk and V are sufficiently regular, each equation in (22) gives a (p− 1)-dimensional manifold of solutions
in Θ containing θ⋆ . So it is clear that the choice K ≥ p is in general necessary to prove that the system
(22) admits the unique solution θ⋆ .
In Section 5.1, we investigate the delicate case K = p in detail. In opposition to the linear case where
p hyperplanes in Rp have in general a unique common point, the intersection of p (p − 1)-dimensional
manifolds does not generally reduce to a single point. So, when K = p, there is no guarantee that (22)
has a unique solution θ⋆ . This is illustrated by a simple example at the beginning of Section 5.1. In
Proposition 6, we provide a criterion to ensure that the system in (22) admits the only one solution θ⋆ .
Some examples, for which the criterion is available, are presented and the rigidity of the criterion, when
p ≥ 3, is also evoked. In the case where p = 2, we show that our criterion is not far from being necessary.
The case K > p is studied in Section 5.2. The identification problem should be simpler since, in
general, p + 1 (p − 1)-dimensional manifolds in Rp have no common point. We give a sufficient criterion
to prove the identification but we think that it is far from being necessary.
Before presenting these two sections, let us give further notation. We denote by PV the law of
V(0M , Φ) in Rp . We also define the function Ψθ , for each θ ∈ Θ, by
Ψ θ : Rp
v
−→
7−→
RK
M
E h1 (0 , Φ; θ) V(0M |Φ) = v
..
.
E hK (0M , Φ; θ) V(0M |Φ) = v
.
(23)
We will see that this function plays a crucial role in the identification problem.
5.1
The case K = p
First of all, let us give a simple example to show that the identification problem is delicate in the situation
where K = p. Let us consider that K = p = 2, V1 = 1, and let us choose the simple test functions h1 = 1
T
m
⋆
M
and h2 = eθ V(x |ϕ) . Then θ̃ = (θ̃1 , θ̃2 ) with θ̃1 = θ1⋆ −ln(E(e−θ2 V2 (0 |Φ) )) and θ̃2 = 0 is always a solution
⋆
of the system in (22). Therefore if θ2 6= 0 and if θ̃ defined before is in Θ, then the system in (22) admits
at least two solutions.
In the following, we first give a sufficient criterion to prove the identifiability and propose some
examples. Next, we show the rigidity of our criterion which seems constraining when p ≥ 3.
5.1.1
Criterion for identifiability
Assumption [Det] gathers the two following assumptions:
[Det(6=)] For every θ in Θ, det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) is not (PV )⊗p -a.s. identically null
12
[Det(≥)] For every θ in Θ, there exists ǫ = ±1 such that for (PV )⊗p -a.s. every (v1 , . . . , vp ) in (Rp )p
ǫ det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) ≥ 0.
When ǫ = 1 (respectively ǫ = −1), [Det(≥)] means that Ψθ preserves the sign (respectively the opposite
sign) of the determinant.
The criterion is the following.
Proposition 6 If K = p then Assumption [Det] ensures that Assumption [C3] holds.
x
Proof. Denoting by ζ the real function x 7→ ln e x−1 with the convention ζ(0) = 0, the equations (22)
become
T
(θ⋆ − θ) Xk (θ, θ⋆ ) = 0,
1 ≤ k ≤ p,
(24)
where the vector Xk (θ, θ⋆ ) is defined by
⋆T
M
⋆
Xk (θ, θ ) = E hk (0M , Φ; θ)e−θ V(0 |Φ) eζ
(θ ⋆ −θ)T V(0M |Φ)
V(0 |Φ) .
M
(25)
Therefore the system (22) admits the unique solution θ⋆ if the family of vectors (Xk (θ, θ⋆ ))1≤k≤p is
independent in Rp . Let us give a formula of the determinant of these vectors which shows that it is not
null.
Conditioning by the law of V(0M |Φ) and using the multi-linearity of the determinant, we obtain
=
=
=
det(X1 (θ, θ⋆ ), . . . , Xp (θ, θ⋆ ))
Z
Z
⋆
T
⋆T
· · · det E h1 (0M , Φ; θ)e−θ v1 eζ (θ −θ) v1 v1 V(0M |Φ) = v1 , . . . ,
⋆
T
⋆T
E hp (0M , Φ; θ)e−θ vp eζ (θ −θ) vp vp V(0M |Φ) = vp PV (dv1 ) · · · PV (dvp )
Z
Z P
p
⋆T
· · · e k=1 −θ vk +ζ
(θ ⋆ −θ)T vk
Z
Z P
p
⋆T
1 X
· · · e k=1 −θ vσ(k) +ζ
p!
det (v1 , . . . , vp )
⋆
(θ −θ) vσ(k)
σ∈Sp
p
Y
k=1
T
p
Y
k=1
E hk (0M , Φ; θ)|vk PV (dv1 ) · · · PV (dvp )
det vσ(1) , . . . , vσ(p)
E hk (0M , Φ; θ)|vσ(k) PV (dv1 ) · · · PV (dvp ),
where Sp is the set of all permutations in {1, . . . , p}. Denoting by ǫ(σ) the signature of σ, we obtain
det(X1 (θ, θ⋆ ), . . . , Xp (θ, θ⋆ ))
Z
Z P
p
X
Y
p
1
−θ ⋆T vk +ζ (θ ⋆ −θ)T vk
k=1
det (v1 , . . . , vp )
··· e
E hk (0M , Φ; θ)|vσ(k)
ǫ(σ)
=
p!
σ∈Sp
PV (dv1 ) · · · PV (dvp )
Z
Z P
p
⋆T
1
· · · e k=1 −θ vk +ζ
=
p!
(θ ⋆ −θ)T vk
(26)
k=1
det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) PV (dv1 ) · · · PV (dvp ).
From Assumption [Det] , this determinant is not null. The proposition is proved.
Now let us give some examples for which the criterion is valid.
Example 1 (linear case) If the function Ψθ is linear and invertible then Assumption [Det(≥)] is
clearly satisfied and [Det(6=)] holds as soon as the support of PV is not included in a hyperplane. In
particular, if hk = Vk for every 1 ≤ k ≤ p then Ψθ is equal to the identity function. This situation
corresponds to the pseudo-likelihood procedure for which we regain the identifiability via our criterion.
13
Example 2 (area process) For the area process defined in Section 3.2.4 with λm = δR (i.e. the radii
of balls are constant), it is easy to check that the functions hper and hiso respectively defined by (14)
and (15) give a function Ψ which satisfies Assumption [Det] . Indeed the support of PV is the segment
{1} × [0, πR2 ] in R2 and for a vector v = (1, v2 ) the image Ψ(v) is (ψ1 (v2 ), 0) if v2 6= πR2 and (2πR, 1)
if v2 = πR2 . Therefore, it follows that [Det(≥)] is satisfied and noting that 0 < PV ((1, πR2 )) < 1 we
deduce that [Det(6=)] holds too.
Example 3 (a general example with p = 22) Example 2 is included in a more general setting when
p = 2. Indeed let us suppose that the function Ψθ has the form Ψθ (v1 , v2 ) = gθ (v1 , v2 ), gθ (v1 , v2 )fθ (v2 /v1 )
where gθ is a nonnegative scalar function and fθ is a monotone scalar function. Then Ψθ satisfies
[Det(≥)] and [Det(6=)] holds if gθ (v1 , v2 )fθ (v2 /v1 ) is not PV⊗2 -a.s. constant when gθ (v1 , v2 ) is not
null.
T
Example 4 (functions of the type hk eθ V ) Let us suppose that the functions (hk ) ensure that Ψθ
satisfies [Det(≥)] then for any nonnegative function gθ from Rp to R the functions (h̃k ) = gθ (V)hk
also provide a function Ψ̃θ satisfying [Det(≥)] . This remark is related to Section 3.2.2 where it is
T
suggested to choose functions (h̃k ) of the form (eθ V hk ) to simplify the integral in (6).
T
As an immediate consequence, the test functions Vk eθ V , considered in [8] for the particular multiStrauss point process, satisfy [Det(≥)] .
5.1.2
Rigidity of the criterion
In this section, we give some comments about the rigidity of the criterion. In Proposition 7 below, we
show that a function Ψθ , satisfying [Det(≥)] , has a strong linear structure since, under very reasonable
assumptions, the image of any hyperplane is included in a hyperplane. For example, in the classical
setting where V1 = 1, the function Ψθ is defined from the affine space H = {1} × Rp−1 and if Ψθ is
assumed to be continuous then the image of any p − 2 dimensional affine space in H is included in a
hyperplane. This property clearly shows that Ψθ is very rigid when p ≥ 3.
However, when p = 2, we show in Proposition 8 that our criterion is not far from being necessary.
Indeed, we present a large class of examples which do not satisfy our criterion and for which the identifiability fails.
Proposition 7 Let Ψ be a continuous function from D to Rp satisfying [Det(≥)] , where the domain D
is a subset of Rp with the following property: for any (xi )1≤i≤p ∈ Dp such that det(x1 , . . . , xp ) = 0, then
−
+
p
+
for any neighborhood V of (xi ), there exist (x+
i ) and (xi ) in V ∩ D such that det(x1 , . . . , xp ) > 0 and
−
−
p
det(x1 , . . . , xp ) < 0. Then for any hyperplane H in R the image Ψ(H ∩ D) is included in a hyperplane
of Rp .
Proof.
Let H be a hyperplane in Rp . To prove that Ψ(H ∩ D) is included in a hyperplane, it is
sufficient to prove that the dimension of the vectorial space generated by the vectors in Ψ(H ∩ D) is
not equal to p. Let us suppose that it is equal to p, then there exists (xi )1≤i≤p in (H ∩ D)p such that
det(Ψ(x1 ), . . . , Ψ(xp )) 6= 0. Since dim(H) = p − 1, we have det(x1 , . . . , xp ) = 0. By continuity of Ψ
−
p
and by the local properties of D assumed in Proposition 7, we find (x+
i )1≤i≤p and (xi )1≤i≤p in D such
+
+
−
−
+
−
−
that det(x1 , . . . , x+
)det(Ψ(x
),
.
.
.
,
Ψ(x
))
>
0
and
det(x
,
.
.
.
,
x
)det(Ψ(x
),
.
.
.
,
Ψ(x
))
<
0,
which
p
p
p
p
1
1
1
contradicts Assumption [Det(≥)] .
In the case where D = Rp , if we assume that Ψ(Rp ) is not reduced to a hyperplane and that Ψ is
differentiable at the origin, then we can show that Ψ satisfies [Det(≥)] if and only if Ψ(x) = g(x)Ax,
where A is an invertible matrix and g a nonnegative scalar function. It means that Ψ is quasi linear and
so the rigidity of Ψ is very strong.
Now let us focus on the case where p = 2 and let us show that, while our criterion seems very
constraining, it is not far from being necessary in this case. We suppose that p = 2, V1 = 1 and that the
support of V2 is included in an interval [a, b]. Let us remark that this case occurs for the area process
with [a, b] = [0, πR2 ]. First of all, it is easy to check visually, depending on the geometry of γθ defined
by the curve Ψθ ({1} × [a, b]), whether Ψθ satisfies [Det] (see figure 1 for examples).
14
Moreover let us show that the criterion is not far from being necessary. Suppose that the functions
g decomposed into the
(hi ) do not depend on θ and that Ψ := Ψθ satisfies for ǫ = ±1 Assumption [Det]
three following assumptions:
g =)] det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) is not P ⊗2 -a.s. identically null
[Det(6
V
g
[Det(≥)]
there exists δ > 0 such that for PV⊗2 -a.s. every (v1 , v2 ) in ({1} × [a, a + δ])2
ǫ det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) ≥ 0
g
[Det(≤)]
there exists δ > 0 such that for PV⊗2 -a.s. every (v1 , v2 ) in ({1} × [b − δ, b])2
ǫ det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) ≤ 0.
See Figure 1 for an example of such Ψ. Obviously, this situation is not exactly the opposite of Assumption
[Det], but it is strongly related to it. Then, we have the following proposition which proves that the
identifiability fails for this large class of examples.
g and if det EP (Ψ(v)vT ) 6= 0
Proposition 8 If the functions (hi ) are nonnegative, if Ψ satisfies [Det]
V
then [C3] fails.
Let us note that even if the assumption det EPV (Ψ(v)vT ) 6= 0 seems unnatural, it is in general
satisfied.
Proof. Let us show that (22) admits another solution than θ⋆ . We only give here the main lines of the
proof.
We denote by O the set in R2 containing the vectors u which are orthogonal to at least one vector v
in {1} × [a, b], i.e. O = {u ∈ R2 , ∃v ∈ {1} × [a, b], uT v = 0}. In fact, O is the union of a cone O+ in the
upper half plane and a cone O− in the lower half plane. For any δ > 0, the expression of the determinant
in (26) can be split in two parts
=
det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ ))
(27)
Z Z
P2
⋆T
⋆
T
1
−θ
v
+ζ
(θ
−θ)
v
k
k
e k=1
det(v1 , v2 )det Ψ(v1 ), Ψ(v2 ) PV (dv1 )PV (dv2 )
2
[a,a+δ]2
Z Z
P2
1
−θ ⋆T vk +ζ (θ ⋆ −θ)T vk
k=1
det(v1 , v2 )det Ψ(v1 ), Ψ(v2 ) PV (dv1 )PV (dv2 ).
e
+
2
2
2
[a,b] \[a,a+δ]
g =)] and [Det(≥)]
g
Let u 6= 0 in O+ and θ = θ⋆ + αu with α > 0. From [Det(6
since ζ is increasing
and ζ(x) ∼ x as x → +∞, we deduce that the first integral in (27) dominates the second one when α
g ) when α is large
goes to infinity. Therefore det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )) has the sign ǫ (defined before [Det]
−
g =)] and [Det(≤)]
g
enough. Similarly, if u is in O then from [Det(6
det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )) has the
sign −ǫ when α is large enough and it implies that the sign of det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )), with θ = θ⋆ + u,
is different for u in O+ or in O− as soon as |u| is large enough. We deduce that there exists a continuous
⋆
⋆
⋆
curve t 7→ u(t) which crosses O suchthat det(X1 (θ(t),
θ ), X2 (θ(t), θ )) = 0 for every θ(t) = θ + u(t).
Let us note that the assumption det EPV (Ψ(v)vT ) 6= 0 ensures that det(X1 (θ⋆ , θ⋆ ), X2 (θ⋆ , θ⋆ )) 6= 0,
and so u(t) is never null. Let us show that there exists t0 such that θ(t0 ) is a solution of the system
in (22).
Since the functions (hi ) are nonnegative and from Definition (25), we obtain that for every t, X1 (θ(t), θ⋆ )
T
is collinear to a vector in {1} × [a, b]. By continuity of the function t 7→ X1 (θ(t), θ⋆ ) u(t) and by the
⋆
mean value theorem, there exists t0 such that u(t0 ) is orthogonal to X1 (θ(t0 ), θ ). Since the determinant det(X1 (θ(t0 ), θ⋆ ), X2 (θ(t0 ), θ⋆ )) = 0, u(t0 ) is also orthogonal to X2 (θ(t0 ), θ⋆ ) and it follows that the
system in (24) provides at least two solutions θ⋆ and θ(t0 ). Identification assumption (22) or [C3] fail.
15
b
b
γ
γ
Ψ
a
b
γ
Ψ
1
a
Ψ
1
a
1
Figure 1: on the left (respectively in the middle), an example of function Ψ satisfying (respectively not satisfying)
g .
[Det(≥)] . On the right, an example of Ψ satisfying [Det]
5.2
The case K>p
In the case where K > p, we noticed, in the introduction, that the identification problem should be
simpler. Nevertheless, we did not find a satisfactory criterion to prove it. The following Proposition 9
gives a sufficient criterion which is probably far from being necessary. It is based on a slight modification
of Assumption [Det] which does not seem to be the appropriate tool in this setting. However, in the
case where p = 2 and K = 3, this condition reduces to a nice geometrical property which can be checked
easily.
First, let us present the criterion. We denote by A the set of all subsets with p elements in {1, . . . , K},
n
o
A = I ⊂ {1, . . . , K}, such that #(I) = p .
We say that Assumption [Det’] is satisfied if, for every θ in Θ, there exists a family of real coefficients
(cI )I∈A such that the two following assumptions hold:
X
[Det’(6=)]
cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) is not (PV )⊗p -a.s. identically null
I∈A
[Det’(≥)] for (PV )⊗p -a.s. every (v1 , . . . , vp ) in (Rp )p
X
cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) ≥ 0,
I∈A
where ΨIθ (v) denotes the p-dimensional vector extracted from Ψθ (v) for the coordinates given by I. In
the particular case where K = p, Assumption [Det’] becomes Assumption [Det] exactly.
Our criterion is the following
Proposition 9 Assumption [Det’] ensures that Assumption [C3] holds.
⋆
Proof. As in the proof of Proposition 6, to show that (22) admits the unique
solution θ , it is sufficient
to prove that there exists I in A such that, for all θ 6= θ⋆ , det (Xi (θ, θ⋆ ))i∈I 6= 0. It is equivalent to: for
every θ 6= θ⋆ in Θ there exists a family of real coefficients (cI )I∈A such that
X
(28)
cI det (Xi (θ, θ⋆ ))i∈I > 0.
I∈A
With calculations as in (26) we obtain
X
p!
cI det (Xi (θ, θ⋆ ))i∈I
I∈A
=
Z
e
Pp
⋆T
vk +ζ (θ ⋆ −θ)T vk
k=1 −θ
X
!
cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) PV (dv1 ) · · · PV (dvp ).
I∈A
16
Thanks to [Det’] , this quantity is positive.
In the case where p = 2 and K = 3, [Det’(≥)] is satisfied if and only if there exist a, b, c in R3 such
that for every v1 and v2 with det(v1 , v2 ) > 0
{1,2}
{1,2}
{1,3}
{1,3}
{2,3}
{2,3}
a det Ψθ
(v1 ), Ψθ
(v2 ) +b det Ψθ
(v1 ), Ψθ
(v2 ) +c det Ψθ
(v1 ), Ψθ
(v2 ) ≥ 0. (29)
If we denote by ∧ the vectorial product in R3 , the inequality in (29) means that the following set
n
o
Ψθ (v1 ) ∧ Ψθ (v2 ), for all v1 , v2 such that det(v1 , v2 ) > 0
(30)
is included in the half space in R3 with equation cx − by + az ≥ 0. In the setting where V1 = 1 and V2 is
included in an interval [a, b], as for the area process, this condition becomes a geometrical characteristic
of the curve γθ = Ψθ ({1} × [a, b]) in R3 , which is easy to check visually.
6
Extension in the presence of non-hereditary interaction
In several recent papers, Gibbs processes with non hereditary interactions are considered, in particular in
the domain of stochastic geometry (see [13], [15]). The parametric estimation of such models has also been
investigated. The first results in this direction have been given in [16] via a pseudo-likelihood procedure
based on a generalization of the Georgii-Nguyen-Zessin formula (4). The same kind of generalization is
possible for the Takacs-Fiksel procedure. We address this improvement in this section.
In the following, we do not assume that the energy VΛ (ϕ, θ) satisfies the heredity assumption (2). The
first consequences are that the local energy V (xm |ϕ; θ) is not defined in general and that the GeorgiiNguyen-Zessin formula is not available. Let us begin by presenting the generalization of this formula, as
stated in [16], Proposition 2, which is valid in the hereditary and non-hereditary settings.
We first need to recall the concept of removable points which has been introduced in [16], Definition 3.
Definition 2 A point xm in a configuration ϕ is called removable if there exists a bounded set Λ containing x such that VΛ (ϕ\xm , θ) < +∞. We denote by Rθ (ϕ) the set of removable points in ϕ.
Let us remark that the removable set is only related to the support of the underlying Gibbs measure.
The local energy V (xm |ϕ\xm ; θ) of any removable point xm ∈ Rθ (ϕ) can then be defined by the classical
expression (3) where Λ comes from Definition 2. In the hereditary case, all the points of ϕ are removable
and we regain the classical definition of the local energy.
The generalization of the Georgii-Nguyen-Zessin formula is the following equation
Z
X
m
⋆
h (xm , Φ \ xm ; θ) .
h (xm , Φ; θ) e−V (x |Φ;θ ) µ(dxm ) = E
(31)
E
Rd ×
M
xm ∈Rθ⋆ (Φ)
Let us notice that the only difference with the classical formula is that the sum is restricted to the
removable points. Now, let us present the consequences of this formula on the Takacs-Fiksel procedure.
We have to consider the two following cases:
• When the support of the Gibbs measure does not depend on θ: the set of removable points Rθ (ϕ)
does not depend on θ either. In this case, the Takacs-Fiksel estimor is defined by (7), and CΛ is as
in (6) to the exception of the sum which is restricted to the removable points:
CΛ (ϕ; h, θ) :=
Z
Λ×
M
h(xm , ϕ; θ)e−V (x
m
|ϕ;θ)
µ(dxm ) −
X
h(xm , ϕ \ xm ; θ).
(32)
xm ∈Rθ⋆ (ϕ)∩Λ
The sum is computable because by assumption, the set Rθ⋆ (ϕ) does not depend on θ⋆ . In this
situation, with the same assumptions [C] and [N], the consistency and the asymptotic normality
of the estimator may be proved as in Section 7.
17
• When the support of the Gibbs measure depends on some parameters θhc = (θ1 , . . . , θq ), q ≤ p
(called the hardcore parameters): the remaining parameters θsm = (θq+1 , . . . , θp ) are supposed
to parameterize the classical (or smooth) interaction between points. The set of removable points
Rθ (ϕ) therefore depends on θhc only. The estimation issue is more complicated in this case. Indeed,
Assumption [C2] requires some regularities of the interaction with respect to the parameter θ,
such as continuity, which clearly fail to be true for the support parameter θhc . The Takacs-Fiksel
procedure is therefore unable to estimate θhc . Note that this problem is not specific to the presence of
non-hereditary interactions, but arises as soon as some hardcore parameters have to be estimated. In
[16], the authors solve this problem in both the hereditary and non-hereditary setting, by introducing
a two-step estimation procedure. We can follow the same strategy here. In a first step, the estimator
θ̂hc of the hardcore parameter is defined in a natural way according to the observed support of the
point process (see Section 4.2.1 in [16]). Then, in a second step, the Takacs-Fiksel estimator θ̂sm is
defined by (7) with
CΛ (ϕ; h, θsm ) :=
Z
m
h(xm , ϕ; θsm , θ̂hc )e−V (x |ϕ;θsm ,θ̂hc ) µ(dxm )
Λ×M
X
h(xm , ϕ \ xm ; θsm , θ̂hc ).
−
xm ∈Rθ̂
hc
(33)
(ϕ)∩Λ
Let us remark that the estimator θ̂hc is plugged in the computation of CΛ . In particular, the
removable points are determined with respect to θ̂hc . As in [16], the regularity and integrability
assumptions of type [C] for θ̂sm and conditions on the support of the Gibbs measure are required in
order to obtain the consistency of (θ̂hc , θ̂sm ). The asymptotic normality is more difficult to obtain
and no general results are available. In fact, it seems that there is no hope to expect asymptotic
normality without managing the rate of convergence of θ̂hc , which should strongly depend on the
model.
7
A short simulation study
The first aim of the present paper is to explore the asymptotic properties of the Takacs-Fiksel estimate.
Given a model, the important question of correctly choosing the test functions is an open question and
will not be treated here. Also, a complete comparison between all the existing parametric estimation
methods has not been attempted. In this section, we just present a brief simulation study to illustrate
the methodology and the asymptotic results.
The model considered in this section is the (non-marked) Strauss process described by (11) with
two parameters. Table 1 and Figure 2 give an empirical comparison of the approximated maximum
likelihood estimate (MLE) described in [26], the maximum pseudo-likelihood estimate (MPLE), which is
a particular case of the Takacs-Fiksel estimate, and the explicit Takacs-Fiksel estimate (TFE) described
in Section 3.2.3 and in particular in (13). To generate Strauss point processes and to compute the MLE,
the R package spatstat is used. The MPLE and the TFE are computed respectively from (8) and (6)
where the integrals were approximated by Monte-Carlo.1 Strauss point processes are generated on the
window [0, 3]2 ⊕ R where R is set to 0.05. Recall that this parameter corresponds to the finite range of
the Strauss process. Then the estimates are computed on the window [0, τ ]2 ⊕ R for τ = 1, 2, 3, using a
⋆
R-erosion. We use the parametrization of spatstat, i.e. the intensity parameter is β ⋆ := e−θ1 and the
⋆
interaction parameter is γ ⋆ := e−θ2 . We set β ⋆ = 100 and γ ⋆ = 0.5. The estimation of (β ⋆ , γ ⋆ ) by the
b
b
bb
TFE is thus (β,
γ ) = (e−θ1 , e−θ2 ) where (θb1 , θb2 ) are defined in (13).
As expected, the three estimates have decreasing variance when the domain of observation grows.
Since the explicit TFE is obtained very quickly, its behavior appears very satisfactory in comparison with
the MLE and MPLE.
1 The MPLE is also implemented in spatstat but we did not use it. In fact, it appears that the implementation therein
leads to biased estimations.
18
Wτ
[0, 1]2
[0, 2]2
[0, 3]2
Estimates of the parameter β ⋆ = 100
MLE
MPLE
TFE
103.50 (18.24) 101.19 (16.92) 101.44 (18.95)
101.03 (8.04)
100.33 (8.66)
99.99 (8.71)
100.38 (6.58)
99.55 (5.47)
99.96 (6.27)
Estimates of the
Wτ
MLE
[0, 1]2 0.50 (0.17)
[0, 2]2 0.50 (0.08)
[0, 3]2 0.51 (0.05)
parameter γ ⋆ = 0.5
MPLE
TFE
0.50 (0.16) 0.52 (0.22)
0.50 (0.08)
0.51(0.11)
0.49 (0.05) 0.51 (0.06)
Table 1: Empirical means and standard deviations between brackets for the MLE, MPLE and the (explicit) TFE based on m = 500 replications of a Strauss point process with parameters β ⋆ = 100, γ ⋆ = 0.5,
and (known) interaction range R = 0.05, generated in the window [0, 3]2 ⊕R and estimated on the window
Wτ = [0, τ ]2 ⊕ R with R-erosion for τ = 1, 2, 3.
Figure 2: Boxplots of the 3 estimates of β ⋆ (left) and γ ⋆ (right) when τ = 3, based on m = 500 replications.
19
τ =1
Empirical covariance of τ × (θb1 , θb2 )T
τ =2
0.037 −0.051
−0.051
0.185
0.031 −0.048
−0.048
0.185
MC approximation of
(E T )−1 ΣE −1
τ =3
0.035 −0.051
−0.051
0.179
0.034 −0.045
−0.045
0.154
Table 2: Comparison between the renormalized empirical covariance matrix of the estimates (θb1 , θb2 )
(based on m = 500 replications of the Strauss process on [0, τ ]2 ⊕ R as in Table 1) and the asymptotic
covariance matrix (approximated by Monte-Carlo simulations).
Now, we focus more particularly on the explicit TFE given by (13). We wish to assess whether
the empirical covariance matrix of (θb1 , θb2 )T agrees with the asymptotic covariance matrix deduced from
(17) in Theorem 4. For this reason, we come back to the parameterization of the Strauss model of
Section 3.2.3. From (17), we deduce that the asymptotic covariance matrix of τ × (θb1 , θb2 )T , as τ tends
to +∞, is E −1 (h, θ⋆ )T Σ(h, θ⋆ )E −1 (h, θ⋆ ) (a simplification occurs here since K = p = 2). Moreover, we
can prove that the matrix E(h, θ⋆ ) reduces to
⋆
E (N0,∆ ) eθ2 E (N1,∆ )
⋆
−1
⋆
E(h, θ ) = |∆|
0
eθ2 E (N1,∆ )
where ∆ is any bounded domain. Since the matrices E(h, θ⋆ ) and Σ(h, θ⋆ ) depend on expectations
with respect to Pθ⋆ , they are approximated by Monte-Carlo simulations. Table 2 contains the empirical
covariance of τ × (θb1 , θb2 )T , for τ = 1, 2, 3 as well as the asymptotic covariance matrix approximated
by Monte-Carlo simulations. Finally, in order to appreciate the Gaussian limit, Figure 3 compares the
empirical distribution of the two estimates (θb1 , θb2 ) with the expected limit when τ = 3. Note that the
bb
same kind of results as in Table 2 and Figure 3 could have been easily obtained for (β,
γ ) by use of the
delta method.
Figure 3: Histogram of τ (θbj − θj⋆ ) for j = 1 (intensity parameter, left) and j = 2 (interaction parameter,
right), for the large window, i.e. τ = 3. The curves correspond to the densities of the asymptotic Gaussian
laws, which are centered with variance .034 (left) and .154 (right) according to Table 2.
20
8
Proofs of asymptotic results
8.1
Proof of Theorem 3
If there is more than one stationary Gibbs measure, then some non ergodic Gibbs measures automatically
exist because, in the convex set of all Gibbs measures, only the extremal measures are ergodic. But any
stationary Gibbs measure can be represented as a mixture of ergodic measures ([22], Theorem 14.10). Due
to this decomposition, we can assume that Pθ⋆ is ergodic to prove the consistency of the Takacs-Fiksel
estimate. The proof is split into two steps.
Step 1. UΛn is a contrast function. Under [C1], the ergodic Lemma 2 and the GNZ formula given
in (5) may be applied to prove that Pθ⋆ -a.s.
M
|Λn |−1 CΛn (Φ, hk ; θ) → E hk (0M , Φ; θ)e−V (0 |Φ;θ) − E hk (0M , Φ \ 0M ; θ)
M
M
⋆
= E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ ) .
Therefore, as n → +∞, one obtains Pθ⋆ −a.s.
UΛn (Φ; h, θ) → U (θ) :=
K
X
k=1
2
M
M
⋆
.
E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ )
⋆
Note that U (θ ) = 0. In addition with Assumptions [C2] and [C3], this proves that UΛn is a continuous
contrast function vanishing only at θ⋆ .
Step 2. Modulus of continuity
The modulus of continuity of the contrast process is defined for all ϕ ∈ Ω and all η > 0 by
o
n
Wn (ϕ, η) = sup UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ ) : θ, θ′ ∈ Θ, ||θ − θ′ || ≤ η .
This step aims at proving that there exists a sequence (εℓ )ℓ≥1 , with εℓ → 0 as ℓ → +∞ such that for all
ℓ≥1
1
= 0.
(34)
≥ εℓ
P lim sup Wn Φ,
ℓ
n→+∞
Let θ, θ′ ∈ Θ, then
|UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤
K
X
|Λn |−1 |CΛn (ϕ; hk , θ) − CΛn (ϕ; hk , θ′ )| ×
k=1
|Λn |
−1
|CΛn (ϕ; hk , θ)| + |CΛn (ϕ; hk , θ )| .
′
(35)
Under Assumption [C4], there exists n0 ≥ 1 such that for all n ≥ n0 we have for Pθ⋆ −a.e. ϕ
Z
m
′
−1
−1
|Λn | (|CΛn (ϕ; hk , θ)| + |CΛn (ϕ; hk , θ )|) ≤ 2|Λn |
max |hk (xm , ϕ; θ)|e−V (x |ϕ;θ) µ(dxm )
Λn ×
+2|Λn |−1
M θ∈Θ
X
xm ∈ϕ
≤ γ1 ,
where
γ1 := 4 × max
k=1,...,K
max |hk (xm , ϕ \ xm ; θ)|
θ∈Θ
Λn
M
⋆
.
E max |fk (0M , Φ; θ)| + E max |hk (0M , Φ; θ)|e−V (0 |Φ;θ )
θ∈Θ
θ∈Θ
Therefore for all n ≥ n0
K
|UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤ γ1 ×
1 X
(AΛn (ϕ; hk , θ, θ′ ) + BΛn (ϕ; hk , θ, θ′ )) ,
|Λn |
k=1
21
where
′
AΛn (ϕ; hk , θ, θ )
:=
BΛn (ϕ; hk , θ, θ′ )
Z
=
M
X
|fk (xm , ϕ; θ) − fk (xm , ϕ; θ′ )|µ(dxm )
Λn ×
xm ∈ϕ
|hk (xm , ϕ \ xm ; θ) − hk (xm , ϕ \ xm ; θ′ )|.
Λn
Under Assumption [C4], one mayapply the mean value theorem
in Rp as follows: there exist ξ (1) , . . . , ξ (p) ∈
′
′
′
′
[min(θ1 , θ1 ), max(θ1 , θ1 )] × . . . × min(θp , θp ), max(θp , θp ) such that for all ϕ ∈ Ω
′
AΛn (ϕ; hk , θ, θ ) =
Z
Λn ×
≤
p
X
M j=1
(1)
(θj − θj′ )(fk (1) (xm , ϕ; ξj ))j µ(dxm )
′
p × kθ − θ k
Z
Λn ×
In a similar way, one may prove that for Pθ⋆ −a.e. ϕ
X
BΛn (ϕ; hk , θ, θ′ ) ≤ p × kθ − θ′ k
max kfk (1) (xm , ϕ; θ)k µ(dxm )
M θ∈Θ
xm ∈ϕΛn
max khk (1) (xm , ϕ \ xm ; θ)k.
θ∈Θ
Under Assumption [C4], there exists n1 (k) ≥ 1 such that for all n ≥ n1 (k), we have for Pθ⋆ −a.e. ϕ
1
AΛn (ϕ; hk , θ, θ′ ) + BΛn (ϕ; hk , θ, θ′ ) ≤ γ2 kθ − θ′ k
|Λn |
where
(1) M
(1) M
−V (0M |Φ;θ ⋆ )
.
E max kfk (0 , Φ; θ)k + E max khk (0 , Φ; θ)ke
γ2 := 2p × max
k=1,...,K
θ∈Θ
θ∈Θ
We finally obtain the following upper-bound for Pθ⋆ −a.e. ϕ, for all θ, θ′ such that kθ − θ′ k ≤ 1/ℓ and for
all n ≥ N = max(n0 , maxk n1 (k))
1
|UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤ γ × ,
ℓ
with γ = K × γ1 × γ2 and therefore Wn (ϕ, 1/ℓ) ≤ γ × 1ℓ . Finally, since
\ [
[
1
1
1
γ
γ
γ
≥
=
≥
⊂
≥
lim sup Wn ϕ,
Wn ϕ,
Wn ϕ,
ℓ
ℓ
ℓ
ℓ
ℓ
ℓ
n→+∞
m∈N n≥m
n≥N
for Pθ⋆ −a.e. ϕ, the expected result (34) is proved.
Conclusion step. Steps 1 and 2 ensure the fact that we can apply Theorem 3.4.3 of [23] which asserts the
almost sure convergence for minimum contrast estimators.
8.2
Proof of Theorem 4
The proof is based on a classical result concerning asymptotic normality for minimum contrast estimators
e.g. Theorem 3.4.5 of [23]. We split it into two different steps.
(1)
Step 1. Asymptotic normality of UΛn (Φ; h, θ⋆ ).
We start with a Lemma which states a central limit theorem. Its proof uses a conditional centering
assumption as in [28], which holds due to the particular form of the contrast function (6). This scheme
of proof is now well-known and we mainly refer to previous works for the technical details.
Lemma 10 Under the Assumptions [N1], [N2] and [N3], the following convergence holds in distribution, as n → +∞
d
e Λn (Φ; h, θ⋆ ) →
(36)
N (0, Σ(h, θ⋆ )),
|Λn |−1/2 C
e Λn (ϕ; h, θ⋆ ) are defined in Theorem 4.
where Σ(h, θ⋆ ) and C
22
e Λn (ϕ; h, θ⋆ ) (of length K) corresponds to the vector of the hk −residuals for
Proof.
The vector C
k = 1, . . . , K computed on the same domain Λn with θb = θ⋆ , see [2] for a definition and practical study of
this concept of residuals. The asymptotic behavior of the residuals process has been investigated in [12].
b
e Λ (Φ; h, θ)
In particular, with the notation of the present paper, the asymptotic normality of the vector C
n
⋆
for general θb is obtained (see Proposition 4 in [12]). When θb = θ , the assumptions and the asymptotic
covariance matrix of Proposition 4 in [12] respectively reduce to [N1-3] and (18).
Now, according to the definition of UΛn (ϕ; h, θ⋆ ), we have
(1)
UΛn (ϕ; h, θ⋆ ) = 2|Λn |−2
K
X
(1)
CΛn (ϕ; hk , θ⋆ )CΛn (ϕ; hk , θ⋆ ).
k=1
Under Assumption [C4], one may apply the ergodic Lemma 2, in order to derive Pθ⋆ −a.s., as n → +∞
M
⋆
(1)
(37)
|Λn |−1 CΛn (Φ; hk , θ⋆ ) → E fk (1) (0M , Φ; θ⋆ ) − hk (1) (0M , Φ; θ⋆ )e−V (0 |Φ;θ ) .
It is easily checked
−E(hk , θ⋆ ), this vector of length p being defined by
that this expectation reduces to
⋆
M
⋆
(1) M
⋆ −V (0M |Φ;θ ⋆ )
. Let us denote by E(h, θ⋆ ) the (p, K) matrix
E(hk , θ ) := E hk (0 , Φ; θ )V (0 |Φ; θ )e
(E(h1 , θ⋆ ), . . . , E(hK , θ⋆ )), then we get the following decomposition
(1)
2|Λn |1/2 × |Λn |−2
|Λn |1/2 UΛn (Φ; h, θ⋆ ) =
K
X
(1)
CΛn (Φ; hk , θ⋆ )CΛn (Φ; hk , θ⋆ )
k=1
e Λn (Φ; h, θ⋆ )
−2|Λn |−1/2 E(h, θ⋆ )C
K h
i
X
(1)
+2
|Λn |−1 CΛn (Φ; hk , θ⋆ ) − (−E(hk , θ⋆ )) |Λn |−1/2 CΛn (Φ; hk , θ⋆ ).
=
k=1
According to (37) and Lemma 10, Slutsky’s Theorem implies that for any k ∈ {1, . . . , K},
P
(1)
|Λn |−1 CΛn (Φ; hk , θ⋆ ) − E(hk , θ⋆ ) |Λn |−1/2 CΛn (Φ; hk , θ⋆ ) → 0,
as n → +∞, the zero here being a vector of length p. Using again Lemma 10, we finally reach the
following convergence in distribution as n → +∞
d
(1)
T
|Λn |1/2 UΛn (Φ; h, θ⋆ ) → N 0, 4E(h, θ⋆ ) Σ(h, θ⋆ ) E(h, θ⋆ ) ,
where Σ(h, θ⋆ ) is defined by (18).
(2)
Step 2. Convergence of UΛn (Φ; h, θ) for θ ∈ V(θ⋆ )
(2)
According to our definition and Assumption [N4], the (p, p) matrix UΛn (ϕ; h, θ) is defined for i, j =
1, . . . , p by
(2)
UΛn (ϕ; h, θ⋆ )
ij
= 2|Λn |
−2
K
X
(2)
(1)
(1)
.
CΛn (ϕ; hk , θ)
CΛn (ϕ; hk ; θ) CΛn (ϕ; hk , θ) + CΛn (ϕ; hk , θ)
ij
k=1
(1)
(2)
Note also that CΛn (ϕ; hk , θ) and CΛn (ϕ; hk , θ) are defined by
Z
(1)
(1)
CΛn (ϕ; hk , θ) =
fk (xm , ϕ; θ)µ(dxm ) −
Λn ×
(2)
CΛn (ϕ; hk , θ)
i
=
Z
Λn ×
M
M
(2)
f k (xm , ϕ; θ)µ(dxm ) −
X
xm ∈ϕΛn
X
xm ∈ϕΛn
23
(1)
hk (xm , ϕ \ xm ; θ)
(2)
hk (xm , ϕ \ xm ; θ).
j
Under Assumption [N4], then for all i, j = 1, . . . , p and for any k = 1, . . . , K, each normalized term
(1)
(2)
|Λn |−1 CΛn (Φ; hk , θ), |Λn |−1 CΛn (Φ; hk , θ) and |Λn |−1 CΛn (Φ; hk , θ) satisfies the assumptions of the ergodic
Lemma 2. Therefore, for any θ ∈ V(θ⋆ ), there exists a matrix U(2) (h, θ) such that Pθ⋆ −a.s.
(2)
UΛn (Φ; h, θ) → U(2) (h, θ).
(2)
This justifies that, for n large enough, in a neighborhood of θ⋆ , UΛn (ϕ; h, θ) ij is uniformly bounded by
2 × maxθ∈V(θ⋆) | U(2) (h, θ) ij | for Pθ⋆ −a.e. ϕ. When θ = θ⋆ , recall, from (5), that |Λn |−1 CΛn (Φ; hk , θ⋆ )
T
converges almost surely to zero and that (37) holds. Hence, U(2) (h, θ⋆ ) reduces to 2E(h, θ⋆ )E(h, θ⋆ ) .
Conclusion
Step. From
Theorem
1 and 2 ensure that the normalized difference
3.4.5 of [23], Steps
(1)
(2)
⋆
1/2
⋆
⋆
b
U (h, θ ) θn (Φ) − θ − UΛn (Φ; h, θ ) converges in probability to 0, which is the expected
|Λn |
result.
Acknowledgements
We are grateful to the associate editor, the anonymous referee and to Hans Zessin for their valuable
comments and advices.
References
[1] A. Baddeley and R. Turner. Practical maximum pseudolikelihood for spatial point patterns (with
discussion). Australian and New Zealand Journal of Statistics, 42:283–322, 2000.
[2] A. Baddeley, R. Turner, J. Møller, and M. Hazelton. Residual analysis for spatial point processes.
Journal of the Royal Statistical Society (series B), 67:1–35, 2005.
[3] A. J. Baddeley and M. N. M. Van Lieshout. Area-interaction point processes. Ann. Inst. Statist.
Math., 47(4):601–619, 1995.
[4] K.K. Berthelsen and J. Møller. Likelihood and non-parametric Bayesian MCMC inference for spatial
point processes based on perfect simulation and path sampling. Scandinavian Journal of Statistics,
30(3):549–564, 2003.
[5] E. Bertin, J.-M. Billiot, and R. Drouilhet. Existence of “nearest-neighbour” spatial Gibbs models.
Adv. Appli. Prob., 31:895–909, 1999.
[6] E. Bertin, J.-M. Billiot, and R. Drouilhet. R-local Delaunay inhibition model. Journal of Statistical
Physics, 132(4):649–667, 2008.
[7] J. Besag. Spatial interaction and the statistical analysis of lattice system. Journal of the Royal
Statistical Society (series B), 26:192–236, 1974.
[8] J.-M. Billiot. Asymptotic properties of Takacs-Fiksel estimation method for Gibbs point processes.
Statistics, 30(1):69–89, 1997.
[9] J.-M. Billiot, J.-F. Coeurjolly, and R. Drouilhet. Maximum pseudolikelihood estimator for exponential family models of marked Gibbs point processes. Electronic Journal of Statistics, 2:234–264,
2008.
[10] D. Brillinger. Statistical inference for stationary point processes. Stochastic processes and related
topics, vol 1. 55-99 Ed. Madan Lal Puri. Academic Press, New-York-London, 1975.
[11] J.-F. Coeurjolly and R. Drouilhet. Asymptotic properties of the maximum pseudo-likelihood estimator for stationary Gibbs point processes including the Lennard-Jones model. Electronic Journal
of Statistics, 4:677–706, 2010.
24
[12] J.-F. Coeurjolly and F. Lavancier. Residuals for stationary marked Gibbs point processes. submitted,
2010. http://arxiv.org/abs/1002.0857.
[13] D. Dereudre. Gibbs Delaunay tessellations with geometric hardcore condition. J. Stat. Phys., 121(34):511–515, 2005.
[14] D. Dereudre. The existence of quermass-interaction processes for nonlocally stable interaction and
nonbounded convex grains. Adv. Appli. Prob., 41(3):664–681, 2009.
[15] D. Dereudre, R. Drouilhet, and H.O. Georgii. Existence of Gibbsian point processes with geometrydependent interactions. to appear in Prob. Theory and Related Fields, 2010.
[16] D. Dereudre and F. Lavancier. Campbell equilibrium equation and pseudo-likelihood estimation for
non-hereditary Gibbs point processes. Bernoulli, 15(4):1368–1396, 2009.
[17] D. Dereudre and F. Lavancier. Practical simulation and estimation for Gibbs Delaunay-Voronoı̈
tessellations with geometric hardcore interaction. Computational Stat. and Data Analysis, 55:498–
519, 2011.
[18] P. J. Diggle, T. Fiksel, P. Grabarnik, Y. Ogata, D. Stoyan, and M. Tanemura. On parameter
estimation for pairwise interaction point processes. International Statistical Review, 62(1):99–117,
1994.
[19] T. Fiksel. Estimation of parameterized pair potentials of marked and non-marked Gibbsian point
processes. Elektron. Inform. Kybernet., 20:270–278, 1984.
[20] T. Fiksel. Estimation of interaction potentials of Gibbsian point processes. Math. Operationsf.
Statist. Ser. Statist., 19:77–86, 1988.
[21] H.O. Georgii. Canonical and grand canonical Gibbs states for continuum systems. Communications
in Mathematical Physics, 48(1):31–51, 1976.
[22] H.O. Georgii. Gibbs measures and Phase Transitions. de Gruyter, Berlin, 1988.
[23] X. Guyon. Random fields on a network. Modeling, Statistics and Applications. Springer Verlag, New
York, 1995.
[24] X. Guyon and H.R. Künsch. Asymptotic comparison of estimators in the Ising model. Lecture notes
in Statistics 74. Springer, Berlin, 1992.
[25] L. Heinrich. Mixing properties of Gibbsian point processes and asymptotic normality of Takacs-Fiksel
estimates. preprint, 1992.
[26] F. Huang and Y. Ogata. Improvements of the maximum pseudo-likelihood estimators in various
spatial statistical models. Journal of Computational and Graphical Statistics, 8(3):510–530, 1999.
[27] J. Illian, A. Penttinen H., and Stoyan. Statistical analysis and modelling of spatial point patterns.
Wiley-Interscience, 2008.
[28] J.L. Jensen and H.R. Künsch. On asymptotic normality of pseudolikelihood estimates of pairwise
interaction processes. Ann. Inst. Statist. Math., 46:475–486, 1994.
[29] J.L. Jensen and J. Møller. Pseudolikelihood for exponential family models of spatial point processes.
Ann. Appl. Probab., 1:445–461, 1991.
[30] W. S. Kendall, M. N. M. Van Lieshout, and A. J. Baddeley. Quermass-interaction processes conditions for stability. Adv. Appli. Prob., 31:315–342, 1999.
[31] K. Krickeberg. Statistical problems of point processes. Rosz. Pol. Tow. Mat., Ser. III, Mat. Stosow.
13, 29-57, 1978.
25
[32] E.L. Lehman. Elements of large-sample theorey. Springer, 1999.
[33] S. Mase. Uniform LAN condition of planar Gibbsian point processes and optimality of maximum
likelihood estimators of soft-core potential function. Probab. Theory and Related Fields, 92:51–67,
1992.
[34] K Matthes, J. Kerstan, and J. Mecke. Infinitely Divisible Point Processes. J. Wiley, 1978.
[35] J. Møller and K. Helisová. Power diagrams and interaction processes for union of discs. Adv. Appli.
Prob., 40(2):321–347, 2008.
[36] J. Møller and R. Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes.
Chapman and Hall/CRC, Boca Raton, 2003.
[37] J. Møller and R. Waagepetersen. Modern statistics for spatial point processes. Scan. J. Stat.,
34:643–684, 2007.
[38] X. Nguyen and H. Zessin. Integral and differential characterizations Gibbs processes. Mathematische
Nachrichten, 88(1):105–115, 1979.
[39] X.X. Nguyen and H. Zessin. Ergodic theorems for spatial processes. Z. Wahrscheinlichkeitstheorie
verw. Gebiete, 48:133–158, 1979.
[40] F. Papangelou. The Armenian Connection: Reminiscences from a time of interactions. Journal of
contemporary mathematical analysis, 44(1):14–19, 2009.
[41] C.J. Preston. Random fields. Springer Verlag, 1976.
[42] D. Ruelle. Statistical Mechanics. Benjamin, New York-Amsterdam, 1969.
[43] D. Ruelle. Superstable interactions in classical statistical mechanics. Comm. Math. Phys., 1970.
[44] D. Stoyan, W.S. Kendall, and J. Mecke. Stochastic geometry and its applications. John Wiley and
Sons, Chichester, 1995.
[45] R. Takacs. Estimator for the pair-potential of a Gibbsian point process. Math. Operationsf. Statist.
Ser. Statist., 17:429–433, 1986.
[46] H. Zessin. Der papangelou prozess. Journal of contemporary mathematical analysis, 44(1):36–44,
2009.
26