Academia.eduAcademia.edu

Takacs-Fiksel Method for Stationary Marked Gibbs Point Processes

2011, Scandinavian Journal of Statistics

This paper studies a method to estimate the parameters governing the distribution of a stationary marked Gibbs point process. This procedure, known as the Takacs-Fiksel method, is based on the estimation of the left and right hand sides of the Georgii-Nguyen-Zessin formula and leads to a family of estimators due to the possible choices of test functions. We propose several examples illustrating the interest and flexibility of this procedure. We also provide sufficient conditions based on the model and the test functions to derive asymptotic properties (consistency and asymptotic normality) of the resulting estimator. The different assumptions are discussed for exponential family models and for a large class of test functions. A short simulation study is proposed to assess the correctness of the methodology and the asymptotic results.

Takacs-Fiksel method for stationary marked Gibbs point processes Jean-François Coeurjolly, David Dereudre, Rémy Drouilhet, Frédéric Lavancier To cite this version: Jean-François Coeurjolly, David Dereudre, Rémy Drouilhet, Frédéric Lavancier. Takacs-Fiksel method for stationary marked Gibbs point processes. Scandinavian Journal of Statistics, 2012, 39 (3), pp.416443. ฀10.1111/j.1467-9469.2011.00738.x฀. ฀hal-00502004v2฀ HAL Id: hal-00502004 https://hal.science/hal-00502004v2 Submitted on 6 Jan 2011 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés. Takacs-Fiksel method for stationary marked Gibbs point processes J.-F. Coeurjolly1,2 , D. Dereudre3 , R. Drouilhet2 and F. Lavancier4 1 2 3 4 GIPSA-lab, Grenoble University, France, Laboratoire Jean Kuntzmann, Grenoble University, France LAMAV UVHC FR 2956, Lille Nord de France University, France, Laboratoire de Mathématiques Jean Leray, Nantes University, France. January 6, 2011 Abstract This paper studies a method to estimate the parameters governing the distribution of a stationary marked Gibbs point process. This procedure, known as the Takacs-Fiksel method, is based on the estimation of the left and right hand sides of the Georgii-Nguyen-Zessin formula and leads to a family of estimators due to the possible choices of test functions. We propose several examples illustrating the interest and flexibility of this procedure. We also provide sufficient conditions based on the model and the test functions to derive asymptotic properties (consistency and asymptotic normality) of the resulting estimator. The different assumptions are discussed for exponential family models and for a large class of test functions. A short simulation study is proposed to assess the correctness of the methodology and the asymptotic results. Keywords: stationary marked Gibbs point processes, parametric estimation, Takacs-Fiksel method, asymptotic properties, ergodic theorem, central limit theorem. 1 Introduction Spatial point pattern data arise in a wide range of applications and since the seventies [10], [31] various statistical methods have been developed to study these kinds of data (see [36], [37] or [27] for recent reviews). In particular, a spatial point process is often modelled as the realization of a Gibbs distribution, defined through an interaction function, also called Hamiltonian. Gibbs models are extensions of the well-known Poisson process since they constitute a way to introduce dependence between points. Inference for parametric models in this setting has known a large development during the last decade. The most popular method to estimate the parameters is certainly the maximum likelihood estimator (MLE). It involves an intractable normalizing constant, but recent developments in computational statistics, in particular perfect simulations, have made inference feasible for many Gibbs models (see [4]). Although the MLE suffers from a lack of theoretical justifications (only some results for sparse patterns are proposed in [33]), some comparison studies, as in [18], have shown that it outperforms the other estimation methods. Nevertheless, the computation of the MLE remains very time-consuming and even extremely difficult to perform for some models. It is thus necessary to have quick alternative estimators at one’s disposal, at least to propose relevant initial values for the MLE computation. The maximum pseudolikelihood estimator (MPLE for short) constitutes one of them. Proposed by Besag in [7] and popularized by J.L. Jensen and J. Møller in [29] and A. Baddeley and R. Turner in [1] for spatial point processes, this method has the advantage of being theoretically well-understood (see [9], [16]) and of being much 1 faster to compute than the MLE. Another estimation procedure is the Takacs-Fiksel estimator, which arose from [19], [45], [20]. It can be viewed, in some sense, as a generalization of the MPLE. As a matter of fact, the Takacs-Fiksel method is not very popular, nor really used in practice. The main reason is certainly its relatively poor performances, in terms of mean square error, observed for some particular cases as in [18]. However, we think that this procedure deserves some consideration for several reasons that we expose below. The Takacs-Fiksel procedure is based on the Georgii-Nguyen-Zessin formula (GNZ formula for short). Empirical counterparts of the left and the right hand sides of this equation are considered, and the induced estimator is such that the difference of these two terms is close to zero. Since the GNZ formula is valid for any test functions, the Takacs-Fiksel procedure does not only lead to one particular estimator but to a family of estimators, depending on the choice of the test functions. This flexibility is the main advantage of the procedure. We present several examples in the present paper. Let us summarize them in order to underline the interest of this method (see Section 3.2 for more details). First, this procedure can allow us to achieve estimations that likelihood-type methods cannot. As an example, we focus on the quermass model, which gathers the area interaction point process as a particular case (see [30], [14]). This model is sometimes used for geometric random objects. From a data set, one typically does not observe the point pattern but only some geometric sets arising from these points. The non-observability of the points makes the likelihood-type inference unfeasible. As we will show, this problem may be solved thanks to the Takacs-Fiksel procedure, provided that the test functions are chosen properly. Another motivation is the possibility to choose test functions depending on the Hamiltonian in order to construct quicker estimators which do not require the computation of an integral for each value of the parameter. This improvement appears crucial for rigid models, such as those involved in stochastic geometry (see [17]), for which the MLE is prohibitively time-consuming and the MPLE still remains difficult to implement. Moreover, for some models, it is even possible to obtain explicit estimators which do not require any simulation nor optimization. This is illustrated for the Strauss model. Therefore, it appears important to us to understand the theoretical properties of this procedure. This problem is the main objective of the present paper. We prove the consistency and the asymptotic normality of the induced estimator in a very general setting. In particular, we obtain a central limit theorem for the Takacs-Fiksel estimator with the classical rate of convergence, i.e. square root of the volume of the observation domain. This asymptotic result leads to the following comment: as a quick consistent estimator, the Takacs-Fiksel estimator appears to be a very good starting point for refined algorithms. Among them, let us mention the first step Newton-Raphson algorithm (as used in [26]) which allows an accurate approximation of the MLE, starting from a consistent estimator, in only one step. Although the theoretical justifications are missing in the Gibbs framework, it is well-known that this procedure leads to an efficient estimator in the classical iid case (see [32]). Another possibility could be to exploit the local asymptotic normality of the model in order to construct an adaptive estimator from the Takacs-Fiksel estimator. The Local Asymptotic Normality property (LAN) has only been proved for restrictive models in [33], but one can hope that it remains true for most Gibbs models. This procedure also leads to an efficient estimator. All these possibilities are interesting prospects for future investigations. Some asymptotic properties of the Takacs-Fiksel procedure have already been investigated in two previous studies: one by L. Heinrich in [25] and the one by J.-M. Billiot in [8]. These papers have different frameworks and are based on different tools, but they both involve regularity and integrability type assumptions on the Hamiltonian and a theoretical condition which ensures that the contrast function (associated to the Takacs-Fiksel procedure) has a unique minimum. In [25], the consistency and the asymptotic normality are obtained for a quite large class of test functions. These results are, however, proved under the Dobrushin condition (see Theorems 2 and 3 in [25]) which implies the uniqueness of the underlying Gibbs measure and some mixing properties. This condition imposes a dramatic reduction of the space of possible values for the parameters of the model. In [8], the author focuses only on pairwise interaction point processes (which excludes the quermass model for example). The author mainly obtained the consistency for a specific class of test functions. In the case of a multi-Strauss pairwise interaction point process, the author also proved that the identifiability condition holds for the class of test functions he considered. 2 In contrast, our asymptotic results are proved in a very general setting, i.e. for a large class of stationary marked Gibbs models and test functions. The method employed to prove asymptotic normality is based on a conditional centering assumption, first appeared in [24] for the Ising model and generalized to certain spatial point processes in [28]. The main restriction that this method induces is only the finite range of the Hamiltonian. There are no limitations on the space of parameters and, in particular, the possible presence of phase transition does not affect the asymptotic behavior of the estimator. Moreover, the test functions may depend on the parameters. This extension seems important to us because, as emphasized in Section 3.2.2, such test functions can lead to quick and/or explicit estimators. All the general hypotheses assumed for the asymptotic results are discussed. For this, we focus on exponential family models, that is, on models whose interaction function is linear in the parameters. We show that our integrability and regularity assumptions are not restrictive since they are valid for a large class of models such as the Multi-Strauss marked point process, the Strauss-disc type point process, the Geyer’s triplet point process, the quermass model and for all test functions used as a motivation for this work. In the setting of the exponential family models, we also discuss the classical identifiability condition which is required for the Takacs-Fiksel procedure. To the best of our knowledge, this is the first attempt to discuss it. We will specially dwell on questions like: what choices of test functions (and how many test functions) lead to a unique minimum of the contrast function? We propose general criteria and provide examples. It seems commonly admitted that to achieve the identification of the Takacs-Fiksel procedure, one should at least choose as many test functions as the number of parameters. As a consequence of our study, it appears that one should generally strictly choose more test functions than the number of parameters to achieve identification. The rest of the paper is organized as follows. Section 2 introduces notation and a short background on marked Gibbs point processes. The Takacs-Fiksel method is presented in Section 3. It is based on the GNZ formula which is recalled in Section 3 also. Several examples of test functions are given. They aim at illustrating our interest in considering the Takacs-Fiksel procedure. The asymptotic results of the induced estimator are proposed in Section 4. Our results are obtained from a single realization observed in a domain whose volume is supposed to increase to infinity. Some integrability and regularity assumptions made for the Hamiltonian and for the test functions are discussed in this section while in Section 5 the identifiability condition is specifically dealt with. In Section 6, the very special situation where the energy function is not hereditary is considered. The GNZ formula is no longer valid in this setting, but it has been recently extended in [16] thanks to a slight modification. This leads to a natural generalization of the Takacs-Fiksel procedure. In Section 7, we propose a brief simulation study in order to assess the correctness of the asymptotic results we obtained in Section 4. Finally, Section 8 contains the proofs of the asymptotic results. 2 2.1 Background and notation General notation, configuration space Subregions of Rd will typically be denoted by Λ or ∆ and will always be assumed to be Borel with positive Lebesgue measure. We write Λ ⋐ Rd if Λ is bounded. Λc denotes the complementary set of Λ inside Rd . The notation |.| will be used without ambiguity for different kind of objects. For a countable set J , |J | represents the number of elements belonging to J ; For Λ ⋐ Rd , |Λ| is the volume of Λ; For x ∈ Rd , |x| corresponds to its uniform norm while kxk is its Euclidean norm. For all x ∈ Rd , ρ > 0, let B(x, ρ) := {y ∈ Rd , ky − xk < ρ}. For a matrix M, let kMk be the Frobenius norm of M defined by kMk2 = T r(MT M), where T r is the trace operator. The space Rd is endowed with the Borel σ-algebra B(Rd ) and the Lebesgue measure λ. Let be a measurable space, which aims at being the mark space, endowed with the σ-algebra M and the probability measure λm. The state space of the point processes will be := Rd × measured by µ := λ ⊗ λm. We m shall denote for short x = (x, m) an element of . A configuration is a subset ϕ of which is locally finite in that ϕΛ := ϕ∩(Λ× ) has finite cardinality NΛ (ϕ) := |ϕΛ | for all Λ ⋐ Rd . The space Ω = Ω( ) of all configurations is equipped with the σ-algebra F that is generated by the counting variables ϕ → |ϕΛ×A | for any Λ ⋐ Rd and any A ∈ M. Finally, let T = (τx )x∈Rd be the shift group, where τx : Ω → Ω is the translation by the vector −x ∈ Rd (i.e. S S S S 3 M M M the application ϕ 7→ ∪(y,m)∈ϕ {(y − x, m)}). For the sake of simplicity, we set ϕ ∪ xm := ϕ ∪ {xm } and ϕ \ xm := ϕ \ {xm }. 2.2 Marked Gibbs point processes Our results will be expressed for general stationary Gibbs point processes. Since we are interested in asymptotic properties, we have to consider these point processes acting on the infinite volume Rd . Let us briefly recall their definition. A marked point process Φ is an Ω-valued random variable, with probability distribution P on (Ω, F ). The most prominent marked point process is the marked Poisson process π z with intensity measure zλ ⊗ λm on , with z > 0 (see for example [34] for definition and properties). For Λ ⋐ Rd , let us denote z by πΛ the marginal probability measure in Λ of the Poisson process with intensity z. Without loss of 1 generality, the intensity z is fixed to 1 and we simply write π and πΛ in place of π 1 and πΛ . p d Let θ ∈ R (for some p ≥ 1). For any Λ ⋐ R , let us consider the parametric function VΛ (.; θ) from Ω into R ∪ {+∞}. From a physical point of view, VΛ (ϕ; θ) is the energy of ϕΛ in Λ given the outside configuration ϕΛc . In this paper, we focus on stationary point processes on Rd , i.e. with stationary (i.e. T -invariant) probability measure. For any Λ ⋐ Rd , we therefore consider VΛ (.; θ) to be T -invariant, i.e. VΛ (τx ϕ; θ) = VΛ (ϕ; θ) for any x ∈ Rd . Furthermore, (VΛ (.; θ))Λ⋐Rd is a compatible family of energies, i.e. for every Λ ⊂ Λ′ ⋐ Rd , there exists a measurable function ψΛ,Λ′ from Ω into R ∪ {+∞} such that S ∀ϕ ∈ Ω VΛ′ (ϕ; θ) = VΛ (ϕ; θ) + ψΛ,Λ′ (ϕΛc ; θ). (1) A stationary marked Gibbs point process is characterized by a stationary marked Gibbs measure usually defined as follows (see [41]). Definition 1 A probability measure Pθ on Ω is a stationary marked Gibbs measure for the compatible family of T -invariant energies (VΛ (.; θ))Λ⋐Rd if for every Λ ⋐ Rd , for Pθ -almost every outside configuration ϕΛc , the law of Pθ given ϕΛc admits the following conditional density with respect to πΛ : fΛ (ϕΛ |ϕΛc ; θ) = 1 e−VΛ (ϕ;θ) , ZΛ (ϕΛc ; θ) where ZΛ (ϕΛc ; θ) is a normalization called the partition function. The existence of a Gibbs measure on Ω which satisfies these conditional specifications is a difficult issue. We refer the interested reader to [42], [41], [5], [13], [15] for the technical and mathematical development of the existence problem. In a first step, we assume that the family of energies is hereditary (the non-hereditary case will be considered in Section 6), which means that for any Λ ⋐ Rd , for any ϕ ∈ Ω, and for all xm ∈ Λ × , M VΛ (ϕ; θ) = +∞ ⇒ VΛ (ϕ ∪ xm ; θ) = +∞, (2) or equivalently, for all xm ∈ ϕΛ , fΛ (ϕΛ |ϕΛc ; θ) > 0 ⇒ fΛ (ϕΛ \ xm |ϕΛc ; θ) > 0. The minimal assumption of our paper is then: [Mod]: For any θ ∈ Θ, where Θ is a compact subset of Rp , there exists a stationary marked Gibbs measure Pθ for the compatible T -invariant hereditary family (VΛ (.; θ))Λ⋐Rd . Our data consist in the realization of a marked point process Φ with stationary marked Gibbs measure Pθ⋆ , where θ⋆ ∈ Θ̊ is the unknown parameter vector to estimate. Let us note that [Mod] ensures the existence of at least one stationary Gibbs measure. When this Gibbs measure is not unique, we say that the phase transition occurs. In this situation the set of Gibbs measures is a Choquet simplex and any Gibbs measure is a mixture of extremal ergodic Gibbs measures. If the Gibbs measure is unique, it is necessary ergodic (see [22] for more details about these properties). In the rest of this paper, the reader has mainly to keep in mind the concept of local energy defined as the energy required to insert a point xm into the configuration ϕ and expressed for any Λ ∋ x by V (xm |ϕ; θ) := VΛ (ϕ ∪ xm ; θ) − VΛ (ϕ; θ). From the compatibility of the family of energies, i.e. (1), this definition does not depend on Λ. 4 (3) 3 The Takacs-Fiksel estimation procedure 3.1 Presentation The basic ingredient for the definition of the Takacs-Fiksel method is the so-called GNZ formula. This equation was proved by Georgii, Nguyen and Zessin in the seventies but other authors such as Papangelou and Takahashi also contributed to its establishment. See [40] and [46] for historical comments and [21] or [38] for a general presentation. Lemma 1 (Georgii-Nguyen-Zessin Formula) Under [Mod], for any measurable function h(·, ·; θ) : S × Ω → R such that the following quantities are defined and finite, then ! Z  X m −V (xm |Φ;θ ⋆ ) m m m E h (x , Φ; θ) e µ(dx ) = E h (x , Φ \ x ; θ) , (4) Rd × M xm ∈Φ where E denotes the expectation with respect to Pθ⋆ . For stationary marked Gibbs point processes, (4) reduces to     M ⋆ E h 0M , Φ; θ e−V (0 |Φ;θ ) = E h 0M , Φ \ 0M ; θ , (5) where M denotes a random variable with probability distribution λm. A second tool used throughout the paper is the ergodic Theorem established in [39]. Let us give a simpler form, here, which is sufficient in this paper. Lemma 2 (Ergodic result) Under [Mod], we assume that Pθ⋆ is ergodic. Then for any family of measurable functions FΛ , indexed by the bounded sets Λ, from Ω to R which are additive (i.e. FΛ∪Λ′ = FΛ + FΛ′ − FΛ∩Λ′ ), shift invariant (i.e. FΛ (ϕ) = Fτ (Λ) (τ (ϕ)) for any translation τ ) and integrable (i.e. E(|F[0,1]d |) < +∞), we have that for Pθ⋆ -almost every ϕ lim |Λn |−1 FΛn (ϕ) = E(F[0,1]d ), n→+∞ where Λn = [−n, n]d (other regular domains (Λn )n≥1 converging towards Rd could be also considered). Let h(·, ·; θ) : S × Ω → R and let us define for any ϕ ∈ Ω, θ ∈ Θ and Λ ⋐ Rd Z X m h(xm , ϕ \ xm ; θ). CΛ (ϕ; h, θ) := h(xm , ϕ; θ)e−V (x |ϕ;θ)µ(dxm ) − Λ× M (6) xm ∈ϕΛ Assume that we observe the realization of a marked point process Φ satisfying [Mod] in a domain Λn . For appropriate choices of the functional h and a sequence of domain Λn , then it is possible to apply the ergodic result in Lemma 2 to prove that the first and second terms of |Λn |−1 CΛn (Φ; h, θ) respectively converge Pθ⋆ -almost surely to the left and right terms of (5). The latter observation is the basic argument to define the Takacs-Fiksel method. Let us give K functions hk (·, ·; θ) : S × Ω → R (for k = 1, . . . , K), then the Takacs-Fiksel estimator is simply defined by θbn (ϕ) := θbnT F (ϕ) = arg min θ∈Θ K X CΛn (ϕ; hk , θ)2 , (7) k=1 where arg minθ∈Θ F (θ) means the parameter θ which minimizes the function F . Under identification assumption (16) given later, this minimum is obtained for a unique θ provided that n is large enough. 5 3.2 Some examples In this section, some examples of models and test functions h, involved in (6), are provided. The choices made in previous studies are presented in 3.2.1. The two examples presented in 3.2.2 and 3.2.3 show the relevance of the Takacs-Fiksel procedure to provide quick estimates. The last example, in 3.2.4, is concerned with the possible identification of a special marked point process, the quermass model, and shows that appropriate choices of test functions can solve identification problems when points are not observed. There is no asymptotic consideration in this section. Therefore, the different estimates are defined b over a window, say Λ, and are denoted by θ. 3.2.1 Classical examples Let us first quote the particular case when the Takacs-Fiksel estimator reduces to the maximum pseudolikelihood estimator introduced by [29] for spatial point processes. The MPLE is obtained by maximizing the log-pseudo-likelihood contrast function, given by Z X m V (xm |ϕ \ xm ; θ). (8) e−V (x |ϕ;θ) µ(dxm ) − LP LΛ (ϕ; θ) = − Λ× M xm ∈ϕΛ b defined by (7), (xm |ϕ; θ), k = 1, . . . , p, the estimator θ, solves the system CΛ (ϕ; hk , θ) = 0, k = 1, . . . , p, which means that θb is the root of the gradient vector of LP LΛ , i.e. θb is the MPLE. The first empirical study of the Takacs-Fiksel estimator can be found in [18], where this estimate is compared to other estimators for some unmarked pairwise interaction point processes with two parameters. In this context different test functions hr1 , · · · , hrK were used, where for r > 0 X hr (x, ϕ; θ) = ϕB(x,r) = 1[0,r] (ky − xk). Therefore, with the choice hk (xm , ϕ; θ) = ∂ ∂θk V y∈ϕ The integral term involved in (6) is approximated by discretization and the induced estimation (7) is then assessed. Note that with this choice of test functions, the sum term in (6), when normalized by |Λ|−1 , is an estimation of ρ2 K(r), where K(·) is the reduced second order function and ρ denotes the intensity of the stationary point process Φ, i.e. for all B ∈ B(Rd ), E(Φ(B)) = ρ|B|. The latter choice requires the computation of the integral in (6) for all θ. A more convenient choice could be the one first proposed by Fiksel: X hr (x, ϕ; θ) = ϕB(x,r) eV (x|ϕ;θ) = eV (x|ϕ;θ) 1[0,r] (ky − xk). (9) y∈ϕ In the stationary case, this leads to the following approximation thanks to the ergodic theorem Z 1 hr (x, ϕ; θ)e−V (x|ϕ;θ)dx ≈ E ΦB(0,r) = ρπr2 . |Λ| Λ (10) The integral term in (6) is thus easily approximated by |ϕΛ |πr2 for all θ, while the sum can be explicitly computed. These historical choices of test functions have a natural extension in the marked case. 3.2.2 Some choices leading to quick estimations The main advantage of the Takacs-Fiksel procedure is to provide quick consistent estimators, that might supply initial values for a more evolved procedure. A simple way to achieve this goal is to generalize (9) and consider test functions of the form h(xm , ϕ; θ) = h̃(xm , ϕ)eV (x m m |ϕ;θ) , where h̃(x , ϕ) does not depend on θ. So, the integral term in (6) has to be computed only once and not for all θ, while the sum term in (6) does not require any approximation. Hence, the optimisation problem (7) may be resolved very quickly. In some particular examples, explicit formulas may even be obtained for the integral term, as in (10). In the same spirit, an explicit estimator for the Strauss interaction is provided below. 6 3.2.3 An example of explicit estimator for the Strauss process The (non-marked) Strauss process with range of interaction R > 0 is given for any Λ ⋐ Rd by X VΛ (ϕ; θ) = θ1 |ϕΛ | + θ2 1[0,R] (kx − yk) , (11) x,y∈ϕ {x,y}∩Λ6=∅ where θ1 ∈ R and θ2 > 0 are the two parameters of the model. Alternatively, X V (x|ϕ; θ) = θ1 + θ2 1[0,R] (ky − xk). y∈ϕ Let us consider the following family of test functions, for k ∈ N \ {0}, ( e(k−1)θ2 if ϕB(x,R) = k − 1, hk (x, ϕ; θ) = 0 otherwise. (12) This choice gives in (6) CΛ (ϕ; hk , θ) = e−θ1 Vk−1,Λ (ϕ) − e(k−1)θ2 Nk−1,Λ (ϕ), where Nk,Λ (ϕ) denotes the number of points x ∈ ϕΛ such that |B(x, R) ∩ (ϕ \ x)| = k and Vk,Λ (ϕ) denotes the volume of the set {y ∈ Λ, |B(y, R) ∩ ϕ| = k}. Several explicit estimators may be obtained following (7) from (at least) two test functions as above. Let us quote the simplest one, corresponding to the choice h1 and h2 in (7). This leads to the contrast function CΛ (ϕ; h1 , θ)2 + CΛ (ϕ; h2 , θ)2 which vanishes at the unique point (θb1 (ϕ), θb2 (ϕ)) with       V1,Λ (ϕ) V0,Λ (ϕ) V0,Λ (ϕ) , θb2 (ϕ) = ln − ln . (13) θb1 (ϕ) = ln N0,Λ (ϕ) N1,Λ (ϕ) N0,Λ (ϕ) This estimator of (θ1 , θ2 ) is completely explicit, provided the quantities Nk,Λ (ϕ) and Vk,Λ (ϕ) are available. They can be easily approximated by computational geometry tools. 3.2.4 A solution for unobservability issues The quermass model introduced in [30] is a marked point process which aims at modelling random sets in R2 . This is a generalization of the well-known Boolean model to interacting random balls. Let us denote by xR a marked point where x and R > 0 (i.e. the mark) respectively represent the center and the radius of the associated ball B(x, R). For a finite configuration ϕ, i.e. with a finite support instead of R2 , the quermass energy is defined for (θ1 , θ2 , θ3 , θ4 ) ∈ R4 by: [ B(x, R) V (ϕ; θ) = θ1 |ϕ| + θ2 P(Γ) + θ3 A(Γ) + θ4 E(Γ) where Γ = (x,R)∈ϕ and P(Γ), A(Γ) and E(Γ) denote respectively the perimeter, the area and the Euler-Poincaré characteristic (i.e. number of components minus number of holes) of the set Γ. To extend this definition to the infinite support R2 , it is convenient to suppose that the radii of the balls are almost surely uniformly bounded (i.e. λm([0, R0 ]) = 1 for some R0 > 0). In this case the family of energies (VΛ ) is defined by   VΛ (ϕ; θ) = V ϕΛ⊕B(0,2R0 ) ; θ − V ϕΛ⊕B(0,2R0 )\Λ ; θ . This definition may be extended to unbounded radius, though a restriction to the so-called tempered configurations is needed to ensure the existence of the associated Gibbs measure. We refer to [14] for more details. When θ2 = θ3 = θ4 = 0, this model reduces to the Boolean model (see [44] for a survey). The area process (see [3]) is also a particular case, taking θ2 = θ4 = 0. 7 In practice, one only observes the random set Γ, so the marked points xR in ϕ are unknown. A challenging task is then to estimate the parameters (θ1 , θ2 , θ3 , θ4 ) in the presence of this unobservability issue. In particular, a direct application of the maximum likelihood or pseudo-likelihood method is impossible to estimate all the parameters, and especially θ1 which requires the observation of the number of points in ϕ. For the other parameters, which are related to the observable functionals P, A and E, the MLE has been investigated in [35]. Let us show that the Takacs-Fiksel procedure may be used to estimate θ1 in spite of this unobservability issue. Indeed, it is possible to choose some test function h such that both the integral and the sum in (6) are computable. The unobservability issue occurs mainly for the sum term. Let us consider the following example of test function: hper (xR , ϕ; θ) = P (C(x, R) ∩ Γc ) , (14) where C(x, R) is the sphere {y, ||y − x|| = R}. For any finite configuration ϕ, we then have X hper (xR , ϕ \ xR ; θ) = P(Γ), xR ∈ϕ so that this sum is computable evenPif each term hper (xR , ϕ \ xR ; θ) is not. If the configuration ϕ is infinite then for any bounded set Λ, xR ∈ϕΛ hper (xR , ϕ \ xR ; θ) is equal to the perimeter of Γ restricted to Λ plus a boundary term which is asymptotically negligible with respect to the volume of Λ. Consequently, assuming (θ2 , θ3 , θ4 ) known, θ1 may be estimated thanks to (7) with the above choice of test function. The joint estimation of all four parameters might be achieved thanks to additional test functions sharing the same property as above, i.e. such that the sum in (6) is observable. In the particular case of the area process, θ2 = θ4 = 0 and R is constant (i.e. λm = δR ), it suffices to find one more test function to ensure an identifiable estimation (see Example 2 in Section 5 for more details). A possible additional test function is ( 1 if P (C(x, R)) = 2πR R hiso (x , ϕ; θ) = (15) 0 otherwise. In this case 4 P xR ∈ϕ hiso (xR , ϕ \ xR ; θ) corresponds to the number of isolated balls in Γ. Asymptotic results for the Takacs-Fiksel estimator We present in this section asymptotic results for the Takacs-Fiksel estimator for a point process satisfying [Mod] and assumed to be observed in a domain Λn , where (Λn )n≥1 is a sequence of increasing cubes whose size goes to +∞ as n goes to +∞. First, for a function g depending on θ, we denote by g(1) (θ) (resp. g(2) (θ)) the gradient vector of length p (resp. the Hessian matrix of size (p, p)) evaluated at θ. Let us rewrite the Takacs-Fiksel estimator as θbn (ϕ) = arg min UΛn (ϕ; h, θ), θ∈Θ with UΛn (ϕ; h, θ) = |Λn |−2 4.1 Consistency PK k=1 CΛn (ϕ; hk , θ)2 , where h = (h1 , . . . , hK ) and CΛn is given by (6). The consistency is obtained under the following assumptions, denoted by [C]: for any Gibbs measure Pθ⋆ , for all θ ∈ Θ, k = 1, . . . , K and ϕ ∈ Ω [C1] For all x ∈ Rd , hk (xm , ϕ; θ) = hk (0m , τx ϕ; θ) and    M ′ E hk 0M , Φ; θ e−V (0 |Φ;θ ) < +∞, for θ′ = θ, θ⋆ . [C2] UΛn (ϕ; h, ·) is a continuous function for Pθ⋆ −a.e. ϕ. 8 [C3] K X k=1 2   M M ⋆ =0 E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ ) =⇒ θ = θ⋆ . (16) m [C4] hk and fk , defined by fk (xm , ϕ; θ) := hk (xm , ϕ; θ)e−V (x |ϕ;θ) , are continuously differentiable and     M M −V (0M |Φ;θ ⋆ ) < +∞, E max |fk (0 , Φ; θ)| < +∞ and E max |hk (0 , Φ; θ)|e θ∈Θ θ∈Θ   (1) M E max kfk (0 , Φ; θ)k < +∞ θ∈Θ   (1) M −V (0M |Φ;θ ⋆ ) < +∞. E max khk (0 , Φ; θ)ke and θ∈Θ Theorem 3 Assuming [Mod] and [C] then, as n → +∞, the Takacs-Fiksel estimator θbn (ϕ) converges towards θ⋆ for Pθ⋆ −a.e. ϕ. Assumptions [C1], [C2] and [C4] are related to the regularity and the integrability of the different test functions and the local energy function. Some general criteria may be proposed to verify these assumptions, see Section 4.3 for a discussion. Assumption [C3] corresponds to an identifiability condition and requires much more attention. It is well-known that such an assumption is fulfilled when h = V(1) (leading to the MPLE) under mild assumptions (see Assumption [Ident] proposed by [9]). The question to know if this remains true for more general test functions is difficult (actually it is untrue in several cases). This will be discussed specifically in Section 5. 4.2 Asymptotic normality We need the following assumptions denoted by [N]: For any Gibbs measure Pθ⋆ , k = 1, . . . , K, Λ ⋐ Rd , ϕ ∈ Ω and θ in a neighborhood V(θ⋆ ) of θ⋆ :   [N1] E |CΛ (Φ; hk , θ⋆ )|3 < +∞.  [N2] For any sequence of bounded domains Γn such that Γn → 0 as n → +∞, E CΓn (Φ; hk , θ⋆ )2 → 0. [N3] CΛ (ϕ; hk , θ⋆ ) depends only on ϕΛ⊕B(0,D) for some D ≥ 0 (which is uniform in Λ, ϕ, θ⋆ ). [N4] hk and fk (defined in [C4]) are twice continuously differentiable in θ and     M ⋆ E kh(2) (0M , Φ; θ)ke−V (0 |Φ;θ ) < +∞ and E kf (2) (0M , Φ; θ)k < +∞. Let us remark that Assumption [N3] leads us to consider in general that V has a finite range, which means that there exists D ≥ 0 such that for all (m, ϕ) ∈ × Ω and all θ ∈ Θ  V (0m |ϕ; θ) = V 0m |ϕB(0,D) ; θ . M The same kind of finite range property is also expected for (hk ). Theorem 4 Under Assumptions [Mod], [C] and [N], for any ergodic Gibbs measure Pθ⋆ the following convergence in distribution holds as n → +∞     d (17) |Λn |1/2 E(h, θ⋆ )E(h, θ⋆ )T θbn (Φ) − θ⋆ → N 0, E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ )T , where E(h, θ⋆ ) is the (p, K) matrix defined for i = 1, . . . , p and k = 1, . . . , K by     M ⋆ (E(h, θ⋆ ))ik = E hk (0M , Φ; θ⋆ ) V(1) (0M |Φ; θ⋆ ) e−V (0 |Φ;θ ) i and where Σ(h, θ⋆ ) is the (K, K) matrix defined by  X  e ∆ (D) (Φ; h, θ⋆ )C e ∆ (D) (Φ; h, θ⋆ )T , E C Σ(h, θ⋆ ) = D−d 0 ℓ (18) |ℓ|≤1 where, for all k ∈ Zd , ∆k (D) is the cube centered at kD with side-length D and where, for any bounded e Λ (Φ; h, θ⋆ ) := (CΛ (Φ; hk , θ⋆ )) domain Λ, C k=1,...,K . 9 (1) Remark 1 While Assumptions [N1-3] will ensure that a central limit theorem holds for UΛn (Φ; h, θ⋆ ), (2) Assumption [N4] is required to prove that UΛn (Φ; h, θ) is uniformly bounded in a neighborhood of θ⋆ . These two statements allow us to apply a general central limit theorem for minimum contrast estimators (e.g. Theorem 3.4.5 of [23]). Remark 2 In Theorem 4, if the Gibbs measure Pθ⋆ is stationary (and then not necessarily ergodic), it is a mixture of ergodic measures and the left hand term in (17) converges in distribution to a mixture of normal distributions. We may also propose an asymptotic result valid in the presence of a phase T transition. Indeed, if the matrix E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ ) is positive definite (for any extremal ergodic measure of the Choquet simplex of stationary Gibbs measures), then we may define a consistent empirical h i−1/2 version S(Φ) of E(h, θ⋆ )Σ(h, θ⋆ )E(h, θ⋆ )T E(h, θ⋆ )E(h, θ⋆ )T (see Section 4 of [12] for more details) to obtain:   d |Λn |1/2 S(Φ) θbn (Φ) − θ⋆ → N (0, I) . Remark 3 Following Section 3.2.1, let us underline that (17) is coherent with the asymptotic normality of the MPLE established in [11], i.e. with the case K = p and h = V(1) . Indeed, with similar assumptions to the ones presented in the present paper, Equation (4.4) in Theorem 2 [11] states that     d (19) |Λn |1/2 A(θ⋆ ) θbn (Φ) − θ⋆ → N 0, Σ(V(1) , θ⋆ ) , where A(θ⋆ ) is the symmetric (p, p) matrix given for i, k = 1, . . . , p by      M ⋆ V(1) (0M |Φ; θ⋆ ) e−V (0 |Φ;θ ) . (A(θ⋆ ))ik = E V(1) (0M |Φ; θ⋆ ) k i Since h = V(1) , then E(V(1) , θ⋆ ) := A(θ⋆ ) and therefore (17) reduces to     d |Λn |1/2 A(θ⋆ )2 θbn (Φ) − θ⋆ → N 0, A(θ⋆ )Σ(V(1) , θ⋆ )A(θ⋆ ) which is exactly (19) by assuming that A(θ⋆ ) is invertible. Remark 4 The question how to choose the test functions in order to minimize the norm of the asymptotic covariance matrix is difficult to answer, still open and is a perspective for future work. 4.3 Discussion The present paragraph is devoted to the discussion of Assumptions [Mod], [C] (except [C3]) and [N]. In the previous sections, we have expressed the different assumptions in a very general way. Our aim, here, is to make these assumptions concrete for a wide range of models and a wide range of test functions, in order to illustrate that our setting is not restrictive. In particular, we will focus on exponential family models having a local energy of the form: V (xm |ϕ; θ) := θT V(xm |ϕ) = θ1 V1 (xm |ϕ) + . . . + θp Vp (xm |ϕ), S S with V = (V1 , . . . , Vp ) a vector function from × Ω( ) to Rp . Let us consider the following assumptions: for all (m, ϕ) ∈ (20) M × Ω. (inf) [Exp] For i = 1, · · · , p, for all x ∈ Rd , Vi (xm |ϕ) = Vi (0m |τx ϕ) and there exist κi N, D > 0 such that one of the two following assumptions is satisfied : (inf) θi ≥ 0 and − κi or (inf) −κi (sup) ≤ Vi (0m |ϕ) = Vi (0m |ϕB(0,D) ) ≤ κi (sup) ≤ Vi (0m |ϕ) = Vi (0m |ϕB(0,D) ) ≤ κi 10 . (sup) , κi |ϕB(0,D) |ki . ≥ 0, ki ∈ ^ Assumption [Exp] with ki = 0 or 1 for all i (when θi ≥ 0). [Exp] [H] For all x ∈ Rd , h(xm , ϕ; θ) = h(0m , τx ϕ; θ) and there exist κ > 0, k ∈ N, D > 0 such that h(0m , ϕ; θ) = h(0m , ϕB(0,D) ; θ), such that h(0m , ϕ; ·) is twice continuously differentiable in θ and such that |Y (ϕ, m)| ≤ κ|ϕB(0,D) |k , where   (21) Y (ϕ, m) := max |h(0m , ϕ; θ)|, kh(1) (0m , ϕ; θ)k, kh(2) (0m , ϕ; θ)k . T m g h(0m , ϕ; θ) = e [H] h(0m , ϕ; θ)eθ V(0 |ϕ) with e h satisfying [H]. Let us underline that the different constants involved in these assumptions are assumed to be independent of m, ϕ, θ. Note also that if the test function h is independent of θ, Y (ϕ, m) obviously reduces to |h(0m , ϕ)|. These assumptions are common and very simple to check. Assumption [Exp] has already been investigated in [9]. It includes a wide variety of models such as the overlap area point process, the multiStrauss marked point process, the k−nearest-neighbor multi-Strauss marked point process, the Strauss type disc process, the Geyer’s triplet point process, the area process, some special cases of quermass process (for instance when λm has a compact support not containing 0 and θ4 = 0), etc. Among these ^ is the Geyer’s triplet point process (see [9], p.242). models, the only one that does not satisfy [Exp] (1) On the other hand, the test functions h(xm , ϕ; θ) = 1, h(xm , ϕ; θ) = Vk (xm |ϕ; θ) = Vk (xm |ϕ), h(xm , ϕ; θ) = |ϕB(x,r) | satisfy [H] (for the second one, it is implied by [Exp]). Note that the functional described by (12) for the Strauss model depending on θ, the functionals hper and hiso in (14), T m T m T m (15) also satisfy [H]. In a similar way, test functions like eθ V(x |ϕ) , eθ V(x |ϕ)/2 , |ϕB(x,r) |eθ V(x |ϕ) , T m g 1[0,r] (d(xm , ϕ))eθ V(x |ϕ) satisfy [H]. We show that most of all the assumptions required in Theorems 3 and 4 are not too restrictive. Proposition 5 (i) Under Assumption [Exp], Assumption [Mod] is fulfilled. g and a model satisfying [Exp] (resp. [Exp]), ^ then (ii) For a test function satisfying [H] (resp. [H]) Assumptions [C] (excepted [C3]) and [N] are fulfilled. Proof. (i) From [Exp], it is easy to check that for any fixed θ, there exists a positive constant K(θ) such that V (0m |ϕ; θ) ≥ −K(θ). Therefore the local energy is stable for any θ. On the other hand it is finite range. These two properties imply the existence of ergodic measures (see e.g. [5], Proposition 1). Among them, there exists at least one stationary measure due to the invariance by translation of the family of energy functions. (ii) [C2] and [N3] are quite obvious to check. Now, from the stability of the local energy, we have that for all (m, ϕ) ∈ × Ω, θT V(0m |ϕ) ≥ −ρ for ρ < +∞, independent of m, ϕ, θ (this is possible because Θ is compact). Let us also underline that this property ensures that for every Λ ⋐ Rd , every c ∈ R, E(ec|ΦΛ | ) < +∞ (see e.g. Proposition 11 of [6]), which obviously implies that E(|ΦΛ |α ) < +∞ and g and [Exp]), ^ the expectations E(|ΦΛ |α ec|ΦΛ | ) < +∞ for every α > 0. Now, under [H] and [Exp] (or [H] in [C1], [C4], [N1] and [N4] are clearly finite. Let us focus on [C1] for example (the justification for the other assumptions is similar). We have for any θ, θ′ ∈ Θ   M ′T E |h(0M , Φ; θ)|e−θ V(0 |Φ) M ≤ (   eρ E |h(0M , Φ; θ)| ≤ c × eρ E |ΦB(0,D) |α under [H] and [Exp]  ρ α c|ΦB(0,D) | ρ M θT V(0M |Φ) g and [Exp], ^ e ≤ c × e E |ΦB(0,D) | e under [H] e E |h(0 , Φ; θ)|e g ^ for some constants α and c. Note that, if we had not assumed  [Exp] for test functions of the form [H],  k then one would have had expectations of the form E e|ΦΛ | for some k > 1 which is not necessarily finite under the local stability property. Assumption [N2] is proved similarly and by using the dominated convergence theorem. 11 Remark 5 By following ideas in [11], it is possible to fulfill the integrability type assumptions for more complicated models such as the Lennard-Jones model (which is not locally stable and nonlinear in terms of the parameters). The using of Ruelle’s estimates [43] plays a crucial role in this case of superstable interaction. For the sake of conciseness and simplicity, we do not investigate this in the present paper. 5 Identifiability : Assumption [C3] The Assumption [C3] is related to the identifiability of the estimation procedure. It is more complicated to verify than the other assumptions and an investigation to obtain a criterion or a characterization seems necessary. We address this question in this section. In the following, we consider that the interaction has an exponential form as in (20). Then [C3] is equivalent to: θ = θ⋆ is the unique solution of the nonlinear system of equations in θ defined by    T M ⋆T M = 0, 1 ≤ k ≤ K. (22) E hk (0M , Φ; θ) e−θ V(0 |Φ) − e−θ V(0 |Φ) If hk and V are sufficiently regular, each equation in (22) gives a (p− 1)-dimensional manifold of solutions in Θ containing θ⋆ . So it is clear that the choice K ≥ p is in general necessary to prove that the system (22) admits the unique solution θ⋆ . In Section 5.1, we investigate the delicate case K = p in detail. In opposition to the linear case where p hyperplanes in Rp have in general a unique common point, the intersection of p (p − 1)-dimensional manifolds does not generally reduce to a single point. So, when K = p, there is no guarantee that (22) has a unique solution θ⋆ . This is illustrated by a simple example at the beginning of Section 5.1. In Proposition 6, we provide a criterion to ensure that the system in (22) admits the only one solution θ⋆ . Some examples, for which the criterion is available, are presented and the rigidity of the criterion, when p ≥ 3, is also evoked. In the case where p = 2, we show that our criterion is not far from being necessary. The case K > p is studied in Section 5.2. The identification problem should be simpler since, in general, p + 1 (p − 1)-dimensional manifolds in Rp have no common point. We give a sufficient criterion to prove the identification but we think that it is far from being necessary. Before presenting these two sections, let us give further notation. We denote by PV the law of V(0M , Φ) in Rp . We also define the function Ψθ , for each θ ∈ Θ, by Ψ θ : Rp v −→    7−→   RK   M E h1 (0 , Φ; θ) V(0M |Φ) = v .. .   E hK (0M , Φ; θ) V(0M |Φ) = v    .   (23) We will see that this function plays a crucial role in the identification problem. 5.1 The case K = p First of all, let us give a simple example to show that the identification problem is delicate in the situation where K = p. Let us consider that K = p = 2, V1 = 1, and let us choose the simple test functions h1 = 1 T m ⋆ M and h2 = eθ V(x |ϕ) . Then θ̃ = (θ̃1 , θ̃2 ) with θ̃1 = θ1⋆ −ln(E(e−θ2 V2 (0 |Φ) )) and θ̃2 = 0 is always a solution ⋆ of the system in (22). Therefore if θ2 6= 0 and if θ̃ defined before is in Θ, then the system in (22) admits at least two solutions. In the following, we first give a sufficient criterion to prove the identifiability and propose some examples. Next, we show the rigidity of our criterion which seems constraining when p ≥ 3. 5.1.1 Criterion for identifiability Assumption [Det] gathers the two following assumptions:  [Det(6=)] For every θ in Θ, det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) is not (PV )⊗p -a.s. identically null 12 [Det(≥)] For every θ in Θ, there exists ǫ = ±1 such that for (PV )⊗p -a.s. every (v1 , . . . , vp ) in (Rp )p  ǫ det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) ≥ 0. When ǫ = 1 (respectively ǫ = −1), [Det(≥)] means that Ψθ preserves the sign (respectively the opposite sign) of the determinant. The criterion is the following. Proposition 6 If K = p then Assumption [Det] ensures that Assumption [C3] holds.  x Proof. Denoting by ζ the real function x 7→ ln e x−1 with the convention ζ(0) = 0, the equations (22) become T (θ⋆ − θ) Xk (θ, θ⋆ ) = 0, 1 ≤ k ≤ p, (24) where the vector Xk (θ, θ⋆ ) is defined by  ⋆T M ⋆ Xk (θ, θ ) = E hk (0M , Φ; θ)e−θ V(0 |Φ) eζ (θ ⋆ −θ)T V(0M |Φ)   V(0 |Φ) . M (25) Therefore the system (22) admits the unique solution θ⋆ if the family of vectors (Xk (θ, θ⋆ ))1≤k≤p is independent in Rp . Let us give a formula of the determinant of these vectors which shows that it is not null. Conditioning by the law of V(0M |Φ) and using the multi-linearity of the determinant, we obtain = = = det(X1 (θ, θ⋆ ), . . . , Xp (θ, θ⋆ ))    Z Z  ⋆ T ⋆T · · · det E h1 (0M , Φ; θ)e−θ v1 eζ (θ −θ) v1 v1 V(0M |Φ) = v1 , . . . ,    ⋆ T ⋆T E hp (0M , Φ; θ)e−θ vp eζ (θ −θ) vp vp V(0M |Φ) = vp PV (dv1 ) · · · PV (dvp ) Z Z P p ⋆T · · · e k=1 −θ vk +ζ (θ ⋆ −θ)T vk Z Z P p ⋆T 1 X · · · e k=1 −θ vσ(k) +ζ p!  det (v1 , . . . , vp ) ⋆ (θ −θ) vσ(k) σ∈Sp p Y k=1 T  p Y k=1   E hk (0M , Φ; θ)|vk PV (dv1 ) · · · PV (dvp ) det vσ(1) , . . . , vσ(p)   E hk (0M , Φ; θ)|vσ(k) PV (dv1 ) · · · PV (dvp ),  where Sp is the set of all permutations in {1, . . . , p}. Denoting by ǫ(σ) the signature of σ, we obtain det(X1 (θ, θ⋆ ), . . . , Xp (θ, θ⋆ )) Z Z P p  X Y   p 1 −θ ⋆T vk +ζ (θ ⋆ −θ)T vk k=1 det (v1 , . . . , vp ) ··· e E hk (0M , Φ; θ)|vσ(k) ǫ(σ) = p! σ∈Sp PV (dv1 ) · · · PV (dvp ) Z Z P p ⋆T 1 · · · e k=1 −θ vk +ζ = p! (θ ⋆ −θ)T vk  (26) k=1  det(v1 , . . . , vp )det Ψθ (v1 ), . . . , Ψθ (vp ) PV (dv1 ) · · · PV (dvp ). From Assumption [Det] , this determinant is not null. The proposition is proved. Now let us give some examples for which the criterion is valid. Example 1 (linear case) If the function Ψθ is linear and invertible then Assumption [Det(≥)] is clearly satisfied and [Det(6=)] holds as soon as the support of PV is not included in a hyperplane. In particular, if hk = Vk for every 1 ≤ k ≤ p then Ψθ is equal to the identity function. This situation corresponds to the pseudo-likelihood procedure for which we regain the identifiability via our criterion. 13 Example 2 (area process) For the area process defined in Section 3.2.4 with λm = δR (i.e. the radii of balls are constant), it is easy to check that the functions hper and hiso respectively defined by (14) and (15) give a function Ψ which satisfies Assumption [Det] . Indeed the support of PV is the segment {1} × [0, πR2 ] in R2 and for a vector v = (1, v2 ) the image Ψ(v) is (ψ1 (v2 ), 0) if v2 6= πR2 and (2πR, 1) if v2 = πR2 . Therefore, it follows that [Det(≥)] is satisfied and noting that 0 < PV ((1, πR2 )) < 1 we deduce that [Det(6=)] holds too. Example 3 (a general example with p = 22) Example 2 is included in a more general setting when p = 2. Indeed let us suppose that the function Ψθ has the form Ψθ (v1 , v2 ) = gθ (v1 , v2 ), gθ (v1 , v2 )fθ (v2 /v1 ) where gθ is a nonnegative scalar function and fθ is a monotone scalar function. Then Ψθ satisfies [Det(≥)] and [Det(6=)] holds if gθ (v1 , v2 )fθ (v2 /v1 ) is not PV⊗2 -a.s. constant when gθ (v1 , v2 ) is not null. T Example 4 (functions of the type hk eθ V ) Let us suppose that the functions (hk ) ensure that Ψθ satisfies [Det(≥)] then for any nonnegative function gθ from Rp to R the functions (h̃k ) = gθ (V)hk also provide a function Ψ̃θ satisfying [Det(≥)] . This remark is related to Section 3.2.2 where it is T suggested to choose functions (h̃k ) of the form (eθ V hk ) to simplify the integral in (6).  T As an immediate consequence, the test functions Vk eθ V , considered in [8] for the particular multiStrauss point process, satisfy [Det(≥)] . 5.1.2 Rigidity of the criterion In this section, we give some comments about the rigidity of the criterion. In Proposition 7 below, we show that a function Ψθ , satisfying [Det(≥)] , has a strong linear structure since, under very reasonable assumptions, the image of any hyperplane is included in a hyperplane. For example, in the classical setting where V1 = 1, the function Ψθ is defined from the affine space H = {1} × Rp−1 and if Ψθ is assumed to be continuous then the image of any p − 2 dimensional affine space in H is included in a hyperplane. This property clearly shows that Ψθ is very rigid when p ≥ 3. However, when p = 2, we show in Proposition 8 that our criterion is not far from being necessary. Indeed, we present a large class of examples which do not satisfy our criterion and for which the identifiability fails. Proposition 7 Let Ψ be a continuous function from D to Rp satisfying [Det(≥)] , where the domain D is a subset of Rp with the following property: for any (xi )1≤i≤p ∈ Dp such that det(x1 , . . . , xp ) = 0, then − + p + for any neighborhood V of (xi ), there exist (x+ i ) and (xi ) in V ∩ D such that det(x1 , . . . , xp ) > 0 and − − p det(x1 , . . . , xp ) < 0. Then for any hyperplane H in R the image Ψ(H ∩ D) is included in a hyperplane of Rp . Proof. Let H be a hyperplane in Rp . To prove that Ψ(H ∩ D) is included in a hyperplane, it is sufficient to prove that the dimension of the vectorial space generated by the vectors in Ψ(H ∩ D) is not equal to p. Let us suppose that it is equal to p, then there exists (xi )1≤i≤p in (H ∩ D)p such that det(Ψ(x1 ), . . . , Ψ(xp )) 6= 0. Since dim(H) = p − 1, we have det(x1 , . . . , xp ) = 0. By continuity of Ψ − p and by the local properties of D assumed in Proposition 7, we find (x+ i )1≤i≤p and (xi )1≤i≤p in D such + + − − + − − that det(x1 , . . . , x+ )det(Ψ(x ), . . . , Ψ(x )) > 0 and det(x , . . . , x )det(Ψ(x ), . . . , Ψ(x )) < 0, which p p p p 1 1 1 contradicts Assumption [Det(≥)] . In the case where D = Rp , if we assume that Ψ(Rp ) is not reduced to a hyperplane and that Ψ is differentiable at the origin, then we can show that Ψ satisfies [Det(≥)] if and only if Ψ(x) = g(x)Ax, where A is an invertible matrix and g a nonnegative scalar function. It means that Ψ is quasi linear and so the rigidity of Ψ is very strong. Now let us focus on the case where p = 2 and let us show that, while our criterion seems very constraining, it is not far from being necessary in this case. We suppose that p = 2, V1 = 1 and that the support of V2 is included in an interval [a, b]. Let us remark that this case occurs for the area process with [a, b] = [0, πR2 ]. First of all, it is easy to check visually, depending on the geometry of γθ defined by the curve Ψθ ({1} × [a, b]), whether Ψθ satisfies [Det] (see figure 1 for examples). 14 Moreover let us show that the criterion is not far from being necessary. Suppose that the functions g decomposed into the (hi ) do not depend on θ and that Ψ := Ψθ satisfies for ǫ = ±1 Assumption [Det] three following assumptions: g =)] det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) is not P ⊗2 -a.s. identically null [Det(6 V g [Det(≥)] there exists δ > 0 such that for PV⊗2 -a.s. every (v1 , v2 ) in ({1} × [a, a + δ])2 ǫ det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) ≥ 0 g [Det(≤)] there exists δ > 0 such that for PV⊗2 -a.s. every (v1 , v2 ) in ({1} × [b − δ, b])2 ǫ det (v1 , v2 ) det (Ψ(v1 ), Ψ(v2 )) ≤ 0. See Figure 1 for an example of such Ψ. Obviously, this situation is not exactly the opposite of Assumption [Det], but it is strongly related to it. Then, we have the following proposition which proves that the identifiability fails for this large class of examples.   g and if det EP (Ψ(v)vT ) 6= 0 Proposition 8 If the functions (hi ) are nonnegative, if Ψ satisfies [Det] V then [C3] fails.   Let us note that even if the assumption det EPV (Ψ(v)vT ) 6= 0 seems unnatural, it is in general satisfied. Proof. Let us show that (22) admits another solution than θ⋆ . We only give here the main lines of the proof. We denote by O the set in R2 containing the vectors u which are orthogonal to at least one vector v in {1} × [a, b], i.e. O = {u ∈ R2 , ∃v ∈ {1} × [a, b], uT v = 0}. In fact, O is the union of a cone O+ in the upper half plane and a cone O− in the lower half plane. For any δ > 0, the expression of the determinant in (26) can be split in two parts = det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )) (27) Z Z  P2  ⋆T ⋆ T 1 −θ v +ζ (θ −θ) v k k e k=1 det(v1 , v2 )det Ψ(v1 ), Ψ(v2 ) PV (dv1 )PV (dv2 ) 2 [a,a+δ]2 Z Z  P2  1 −θ ⋆T vk +ζ (θ ⋆ −θ)T vk k=1 det(v1 , v2 )det Ψ(v1 ), Ψ(v2 ) PV (dv1 )PV (dv2 ). e + 2 2 2 [a,b] \[a,a+δ] g =)] and [Det(≥)] g Let u 6= 0 in O+ and θ = θ⋆ + αu with α > 0. From [Det(6 since ζ is increasing and ζ(x) ∼ x as x → +∞, we deduce that the first integral in (27) dominates the second one when α g ) when α is large goes to infinity. Therefore det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )) has the sign ǫ (defined before [Det] − g =)] and [Det(≤)] g enough. Similarly, if u is in O then from [Det(6 det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )) has the sign −ǫ when α is large enough and it implies that the sign of det(X1 (θ, θ⋆ ), X2 (θ, θ⋆ )), with θ = θ⋆ + u, is different for u in O+ or in O− as soon as |u| is large enough. We deduce that there exists a continuous ⋆ ⋆ ⋆ curve t 7→ u(t) which crosses O suchthat det(X1 (θ(t),  θ ), X2 (θ(t), θ )) = 0 for every θ(t) = θ + u(t). Let us note that the assumption det EPV (Ψ(v)vT ) 6= 0 ensures that det(X1 (θ⋆ , θ⋆ ), X2 (θ⋆ , θ⋆ )) 6= 0, and so u(t) is never null. Let us show that there exists t0 such that θ(t0 ) is a solution of the system in (22). Since the functions (hi ) are nonnegative and from Definition (25), we obtain that for every t, X1 (θ(t), θ⋆ ) T is collinear to a vector in {1} × [a, b]. By continuity of the function t 7→ X1 (θ(t), θ⋆ ) u(t) and by the ⋆ mean value theorem, there exists t0 such that u(t0 ) is orthogonal to X1 (θ(t0 ), θ ). Since the determinant det(X1 (θ(t0 ), θ⋆ ), X2 (θ(t0 ), θ⋆ )) = 0, u(t0 ) is also orthogonal to X2 (θ(t0 ), θ⋆ ) and it follows that the system in (24) provides at least two solutions θ⋆ and θ(t0 ). Identification assumption (22) or [C3] fail. 15 b b γ γ Ψ a b γ Ψ 1 a Ψ 1 a 1 Figure 1: on the left (respectively in the middle), an example of function Ψ satisfying (respectively not satisfying) g . [Det(≥)] . On the right, an example of Ψ satisfying [Det] 5.2 The case K>p In the case where K > p, we noticed, in the introduction, that the identification problem should be simpler. Nevertheless, we did not find a satisfactory criterion to prove it. The following Proposition 9 gives a sufficient criterion which is probably far from being necessary. It is based on a slight modification of Assumption [Det] which does not seem to be the appropriate tool in this setting. However, in the case where p = 2 and K = 3, this condition reduces to a nice geometrical property which can be checked easily. First, let us present the criterion. We denote by A the set of all subsets with p elements in {1, . . . , K}, n o A = I ⊂ {1, . . . , K}, such that #(I) = p . We say that Assumption [Det’] is satisfied if, for every θ in Θ, there exists a family of real coefficients (cI )I∈A such that the two following assumptions hold: X [Det’(6=)] cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) is not (PV )⊗p -a.s. identically null I∈A [Det’(≥)] for (PV )⊗p -a.s. every (v1 , . . . , vp ) in (Rp )p X cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) ≥ 0, I∈A where ΨIθ (v) denotes the p-dimensional vector extracted from Ψθ (v) for the coordinates given by I. In the particular case where K = p, Assumption [Det’] becomes Assumption [Det] exactly. Our criterion is the following Proposition 9 Assumption [Det’] ensures that Assumption [C3] holds. ⋆ Proof. As in the proof of Proposition 6, to show that (22) admits the unique  solution θ , it is sufficient to prove that there exists I in A such that, for all θ 6= θ⋆ , det (Xi (θ, θ⋆ ))i∈I 6= 0. It is equivalent to: for every θ 6= θ⋆ in Θ there exists a family of real coefficients (cI )I∈A such that X  (28) cI det (Xi (θ, θ⋆ ))i∈I > 0. I∈A With calculations as in (26) we obtain X  p! cI det (Xi (θ, θ⋆ ))i∈I I∈A = Z e Pp ⋆T vk +ζ (θ ⋆ −θ)T vk k=1 −θ  X ! cI det(v1 , . . . , vp )det ΨIθ (v1 ), . . . , ΨIθ (vp )) PV (dv1 ) · · · PV (dvp ). I∈A 16 Thanks to [Det’] , this quantity is positive. In the case where p = 2 and K = 3, [Det’(≥)] is satisfied if and only if there exist a, b, c in R3 such that for every v1 and v2 with det(v1 , v2 ) > 0       {1,2} {1,2} {1,3} {1,3} {2,3} {2,3} a det Ψθ (v1 ), Ψθ (v2 ) +b det Ψθ (v1 ), Ψθ (v2 ) +c det Ψθ (v1 ), Ψθ (v2 ) ≥ 0. (29) If we denote by ∧ the vectorial product in R3 , the inequality in (29) means that the following set n o Ψθ (v1 ) ∧ Ψθ (v2 ), for all v1 , v2 such that det(v1 , v2 ) > 0 (30) is included in the half space in R3 with equation cx − by + az ≥ 0. In the setting where V1 = 1 and V2 is included in an interval [a, b], as for the area process, this condition becomes a geometrical characteristic of the curve γθ = Ψθ ({1} × [a, b]) in R3 , which is easy to check visually. 6 Extension in the presence of non-hereditary interaction In several recent papers, Gibbs processes with non hereditary interactions are considered, in particular in the domain of stochastic geometry (see [13], [15]). The parametric estimation of such models has also been investigated. The first results in this direction have been given in [16] via a pseudo-likelihood procedure based on a generalization of the Georgii-Nguyen-Zessin formula (4). The same kind of generalization is possible for the Takacs-Fiksel procedure. We address this improvement in this section. In the following, we do not assume that the energy VΛ (ϕ, θ) satisfies the heredity assumption (2). The first consequences are that the local energy V (xm |ϕ; θ) is not defined in general and that the GeorgiiNguyen-Zessin formula is not available. Let us begin by presenting the generalization of this formula, as stated in [16], Proposition 2, which is valid in the hereditary and non-hereditary settings. We first need to recall the concept of removable points which has been introduced in [16], Definition 3. Definition 2 A point xm in a configuration ϕ is called removable if there exists a bounded set Λ containing x such that VΛ (ϕ\xm , θ) < +∞. We denote by Rθ (ϕ) the set of removable points in ϕ. Let us remark that the removable set is only related to the support of the underlying Gibbs measure. The local energy V (xm |ϕ\xm ; θ) of any removable point xm ∈ Rθ (ϕ) can then be defined by the classical expression (3) where Λ comes from Definition 2. In the hereditary case, all the points of ϕ are removable and we regain the classical definition of the local energy. The generalization of the Georgii-Nguyen-Zessin formula is the following equation    Z X m ⋆ h (xm , Φ \ xm ; θ) . h (xm , Φ; θ) e−V (x |Φ;θ ) µ(dxm ) = E  (31) E Rd × M xm ∈Rθ⋆ (Φ) Let us notice that the only difference with the classical formula is that the sum is restricted to the removable points. Now, let us present the consequences of this formula on the Takacs-Fiksel procedure. We have to consider the two following cases: • When the support of the Gibbs measure does not depend on θ: the set of removable points Rθ (ϕ) does not depend on θ either. In this case, the Takacs-Fiksel estimor is defined by (7), and CΛ is as in (6) to the exception of the sum which is restricted to the removable points: CΛ (ϕ; h, θ) := Z Λ× M h(xm , ϕ; θ)e−V (x m |ϕ;θ) µ(dxm ) − X h(xm , ϕ \ xm ; θ). (32) xm ∈Rθ⋆ (ϕ)∩Λ The sum is computable because by assumption, the set Rθ⋆ (ϕ) does not depend on θ⋆ . In this situation, with the same assumptions [C] and [N], the consistency and the asymptotic normality of the estimator may be proved as in Section 7. 17 • When the support of the Gibbs measure depends on some parameters θhc = (θ1 , . . . , θq ), q ≤ p (called the hardcore parameters): the remaining parameters θsm = (θq+1 , . . . , θp ) are supposed to parameterize the classical (or smooth) interaction between points. The set of removable points Rθ (ϕ) therefore depends on θhc only. The estimation issue is more complicated in this case. Indeed, Assumption [C2] requires some regularities of the interaction with respect to the parameter θ, such as continuity, which clearly fail to be true for the support parameter θhc . The Takacs-Fiksel procedure is therefore unable to estimate θhc . Note that this problem is not specific to the presence of non-hereditary interactions, but arises as soon as some hardcore parameters have to be estimated. In [16], the authors solve this problem in both the hereditary and non-hereditary setting, by introducing a two-step estimation procedure. We can follow the same strategy here. In a first step, the estimator θ̂hc of the hardcore parameter is defined in a natural way according to the observed support of the point process (see Section 4.2.1 in [16]). Then, in a second step, the Takacs-Fiksel estimator θ̂sm is defined by (7) with CΛ (ϕ; h, θsm ) := Z m h(xm , ϕ; θsm , θ̂hc )e−V (x |ϕ;θsm ,θ̂hc ) µ(dxm ) Λ×M X h(xm , ϕ \ xm ; θsm , θ̂hc ). − xm ∈Rθ̂ hc (33) (ϕ)∩Λ Let us remark that the estimator θ̂hc is plugged in the computation of CΛ . In particular, the removable points are determined with respect to θ̂hc . As in [16], the regularity and integrability assumptions of type [C] for θ̂sm and conditions on the support of the Gibbs measure are required in order to obtain the consistency of (θ̂hc , θ̂sm ). The asymptotic normality is more difficult to obtain and no general results are available. In fact, it seems that there is no hope to expect asymptotic normality without managing the rate of convergence of θ̂hc , which should strongly depend on the model. 7 A short simulation study The first aim of the present paper is to explore the asymptotic properties of the Takacs-Fiksel estimate. Given a model, the important question of correctly choosing the test functions is an open question and will not be treated here. Also, a complete comparison between all the existing parametric estimation methods has not been attempted. In this section, we just present a brief simulation study to illustrate the methodology and the asymptotic results. The model considered in this section is the (non-marked) Strauss process described by (11) with two parameters. Table 1 and Figure 2 give an empirical comparison of the approximated maximum likelihood estimate (MLE) described in [26], the maximum pseudo-likelihood estimate (MPLE), which is a particular case of the Takacs-Fiksel estimate, and the explicit Takacs-Fiksel estimate (TFE) described in Section 3.2.3 and in particular in (13). To generate Strauss point processes and to compute the MLE, the R package spatstat is used. The MPLE and the TFE are computed respectively from (8) and (6) where the integrals were approximated by Monte-Carlo.1 Strauss point processes are generated on the window [0, 3]2 ⊕ R where R is set to 0.05. Recall that this parameter corresponds to the finite range of the Strauss process. Then the estimates are computed on the window [0, τ ]2 ⊕ R for τ = 1, 2, 3, using a ⋆ R-erosion. We use the parametrization of spatstat, i.e. the intensity parameter is β ⋆ := e−θ1 and the ⋆ interaction parameter is γ ⋆ := e−θ2 . We set β ⋆ = 100 and γ ⋆ = 0.5. The estimation of (β ⋆ , γ ⋆ ) by the b b bb TFE is thus (β, γ ) = (e−θ1 , e−θ2 ) where (θb1 , θb2 ) are defined in (13). As expected, the three estimates have decreasing variance when the domain of observation grows. Since the explicit TFE is obtained very quickly, its behavior appears very satisfactory in comparison with the MLE and MPLE. 1 The MPLE is also implemented in spatstat but we did not use it. In fact, it appears that the implementation therein leads to biased estimations. 18 Wτ [0, 1]2 [0, 2]2 [0, 3]2 Estimates of the parameter β ⋆ = 100 MLE MPLE TFE 103.50 (18.24) 101.19 (16.92) 101.44 (18.95) 101.03 (8.04) 100.33 (8.66) 99.99 (8.71) 100.38 (6.58) 99.55 (5.47) 99.96 (6.27) Estimates of the Wτ MLE [0, 1]2 0.50 (0.17) [0, 2]2 0.50 (0.08) [0, 3]2 0.51 (0.05) parameter γ ⋆ = 0.5 MPLE TFE 0.50 (0.16) 0.52 (0.22) 0.50 (0.08) 0.51(0.11) 0.49 (0.05) 0.51 (0.06) Table 1: Empirical means and standard deviations between brackets for the MLE, MPLE and the (explicit) TFE based on m = 500 replications of a Strauss point process with parameters β ⋆ = 100, γ ⋆ = 0.5, and (known) interaction range R = 0.05, generated in the window [0, 3]2 ⊕R and estimated on the window Wτ = [0, τ ]2 ⊕ R with R-erosion for τ = 1, 2, 3. Figure 2: Boxplots of the 3 estimates of β ⋆ (left) and γ ⋆ (right) when τ = 3, based on m = 500 replications. 19 τ =1  Empirical covariance of τ × (θb1 , θb2 )T τ =2 0.037 −0.051 −0.051 0.185   0.031 −0.048 −0.048 0.185   MC approximation of (E T )−1 ΣE −1 τ =3 0.035 −0.051 −0.051 0.179   0.034 −0.045 −0.045 0.154  Table 2: Comparison between the renormalized empirical covariance matrix of the estimates (θb1 , θb2 ) (based on m = 500 replications of the Strauss process on [0, τ ]2 ⊕ R as in Table 1) and the asymptotic covariance matrix (approximated by Monte-Carlo simulations). Now, we focus more particularly on the explicit TFE given by (13). We wish to assess whether the empirical covariance matrix of (θb1 , θb2 )T agrees with the asymptotic covariance matrix deduced from (17) in Theorem 4. For this reason, we come back to the parameterization of the Strauss model of Section 3.2.3. From (17), we deduce that the asymptotic covariance matrix of τ × (θb1 , θb2 )T , as τ tends to +∞, is E −1 (h, θ⋆ )T Σ(h, θ⋆ )E −1 (h, θ⋆ ) (a simplification occurs here since K = p = 2). Moreover, we can prove that the matrix E(h, θ⋆ ) reduces to   ⋆ E (N0,∆ ) eθ2 E (N1,∆ ) ⋆ −1 ⋆ E(h, θ ) = |∆| 0 eθ2 E (N1,∆ ) where ∆ is any bounded domain. Since the matrices E(h, θ⋆ ) and Σ(h, θ⋆ ) depend on expectations with respect to Pθ⋆ , they are approximated by Monte-Carlo simulations. Table 2 contains the empirical covariance of τ × (θb1 , θb2 )T , for τ = 1, 2, 3 as well as the asymptotic covariance matrix approximated by Monte-Carlo simulations. Finally, in order to appreciate the Gaussian limit, Figure 3 compares the empirical distribution of the two estimates (θb1 , θb2 ) with the expected limit when τ = 3. Note that the bb same kind of results as in Table 2 and Figure 3 could have been easily obtained for (β, γ ) by use of the delta method. Figure 3: Histogram of τ (θbj − θj⋆ ) for j = 1 (intensity parameter, left) and j = 2 (interaction parameter, right), for the large window, i.e. τ = 3. The curves correspond to the densities of the asymptotic Gaussian laws, which are centered with variance .034 (left) and .154 (right) according to Table 2. 20 8 Proofs of asymptotic results 8.1 Proof of Theorem 3 If there is more than one stationary Gibbs measure, then some non ergodic Gibbs measures automatically exist because, in the convex set of all Gibbs measures, only the extremal measures are ergodic. But any stationary Gibbs measure can be represented as a mixture of ergodic measures ([22], Theorem 14.10). Due to this decomposition, we can assume that Pθ⋆ is ergodic to prove the consistency of the Takacs-Fiksel estimate. The proof is split into two steps. Step 1. UΛn is a contrast function. Under [C1], the ergodic Lemma 2 and the GNZ formula given in (5) may be applied to prove that Pθ⋆ -a.s.    M |Λn |−1 CΛn (Φ, hk ; θ) → E hk (0M , Φ; θ)e−V (0 |Φ;θ) − E hk (0M , Φ \ 0M ; θ)    M M ⋆ = E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ ) . Therefore, as n → +∞, one obtains Pθ⋆ −a.s. UΛn (Φ; h, θ) → U (θ) := K X k=1 2   M M ⋆ . E hk (0M , Φ; θ) e−V (0 |Φ;θ) − e−V (0 |Φ;θ ) ⋆ Note that U (θ ) = 0. In addition with Assumptions [C2] and [C3], this proves that UΛn is a continuous contrast function vanishing only at θ⋆ . Step 2. Modulus of continuity The modulus of continuity of the contrast process is defined for all ϕ ∈ Ω and all η > 0 by o n Wn (ϕ, η) = sup UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ ) : θ, θ′ ∈ Θ, ||θ − θ′ || ≤ η . This step aims at proving that there exists a sequence (εℓ )ℓ≥1 , with εℓ → 0 as ℓ → +∞ such that for all ℓ≥1      1 = 0. (34) ≥ εℓ P lim sup Wn Φ, ℓ n→+∞ Let θ, θ′ ∈ Θ, then |UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤ K  X |Λn |−1 |CΛn (ϕ; hk , θ) − CΛn (ϕ; hk , θ′ )| × k=1 |Λn | −1  |CΛn (ϕ; hk , θ)| + |CΛn (ϕ; hk , θ )| . ′  (35) Under Assumption [C4], there exists n0 ≥ 1 such that for all n ≥ n0 we have for Pθ⋆ −a.e. ϕ Z m ′ −1 −1 |Λn | (|CΛn (ϕ; hk , θ)| + |CΛn (ϕ; hk , θ )|) ≤ 2|Λn | max |hk (xm , ϕ; θ)|e−V (x |ϕ;θ) µ(dxm ) Λn × +2|Λn |−1 M θ∈Θ X xm ∈ϕ ≤ γ1 , where γ1 := 4 × max k=1,...,K max |hk (xm , ϕ \ xm ; θ)| θ∈Θ Λn      M ⋆ . E max |fk (0M , Φ; θ)| + E max |hk (0M , Φ; θ)|e−V (0 |Φ;θ ) θ∈Θ θ∈Θ Therefore for all n ≥ n0 K |UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤ γ1 × 1 X (AΛn (ϕ; hk , θ, θ′ ) + BΛn (ϕ; hk , θ, θ′ )) , |Λn | k=1 21 where ′ AΛn (ϕ; hk , θ, θ ) := BΛn (ϕ; hk , θ, θ′ ) Z = M X |fk (xm , ϕ; θ) − fk (xm , ϕ; θ′ )|µ(dxm ) Λn × xm ∈ϕ |hk (xm , ϕ \ xm ; θ) − hk (xm , ϕ \ xm ; θ′ )|. Λn Under Assumption [C4], one mayapply the mean value theorem in Rp as follows: there exist ξ (1) , . . . , ξ (p) ∈  ′ ′ ′ ′ [min(θ1 , θ1 ), max(θ1 , θ1 )] × . . . × min(θp , θp ), max(θp , θp ) such that for all ϕ ∈ Ω ′ AΛn (ϕ; hk , θ, θ ) = Z Λn × ≤ p X M j=1 (1) (θj − θj′ )(fk (1) (xm , ϕ; ξj ))j µ(dxm ) ′ p × kθ − θ k Z Λn × In a similar way, one may prove that for Pθ⋆ −a.e. ϕ X BΛn (ϕ; hk , θ, θ′ ) ≤ p × kθ − θ′ k max kfk (1) (xm , ϕ; θ)k µ(dxm ) M θ∈Θ xm ∈ϕΛn max khk (1) (xm , ϕ \ xm ; θ)k. θ∈Θ Under Assumption [C4], there exists n1 (k) ≥ 1 such that for all n ≥ n1 (k), we have for Pθ⋆ −a.e. ϕ  1 AΛn (ϕ; hk , θ, θ′ ) + BΛn (ϕ; hk , θ, θ′ ) ≤ γ2 kθ − θ′ k |Λn | where      (1) M (1) M −V (0M |Φ;θ ⋆ ) . E max kfk (0 , Φ; θ)k + E max khk (0 , Φ; θ)ke γ2 := 2p × max k=1,...,K θ∈Θ θ∈Θ We finally obtain the following upper-bound for Pθ⋆ −a.e. ϕ, for all θ, θ′ such that kθ − θ′ k ≤ 1/ℓ and for all n ≥ N = max(n0 , maxk n1 (k)) 1 |UΛn (ϕ; h, θ) − UΛn (ϕ; h, θ′ )| ≤ γ × , ℓ with γ = K × γ1 × γ2 and therefore Wn (ϕ, 1/ℓ) ≤ γ × 1ℓ . Finally, since           \ [  [  1 1 1 γ γ γ ≥ = ≥ ⊂ ≥ lim sup Wn ϕ, Wn ϕ, Wn ϕ, ℓ ℓ ℓ ℓ ℓ ℓ n→+∞ m∈N n≥m n≥N for Pθ⋆ −a.e. ϕ, the expected result (34) is proved. Conclusion step. Steps 1 and 2 ensure the fact that we can apply Theorem 3.4.3 of [23] which asserts the almost sure convergence for minimum contrast estimators. 8.2 Proof of Theorem 4 The proof is based on a classical result concerning asymptotic normality for minimum contrast estimators e.g. Theorem 3.4.5 of [23]. We split it into two different steps. (1) Step 1. Asymptotic normality of UΛn (Φ; h, θ⋆ ). We start with a Lemma which states a central limit theorem. Its proof uses a conditional centering assumption as in [28], which holds due to the particular form of the contrast function (6). This scheme of proof is now well-known and we mainly refer to previous works for the technical details. Lemma 10 Under the Assumptions [N1], [N2] and [N3], the following convergence holds in distribution, as n → +∞ d e Λn (Φ; h, θ⋆ ) → (36) N (0, Σ(h, θ⋆ )), |Λn |−1/2 C e Λn (ϕ; h, θ⋆ ) are defined in Theorem 4. where Σ(h, θ⋆ ) and C 22 e Λn (ϕ; h, θ⋆ ) (of length K) corresponds to the vector of the hk −residuals for Proof. The vector C k = 1, . . . , K computed on the same domain Λn with θb = θ⋆ , see [2] for a definition and practical study of this concept of residuals. The asymptotic behavior of the residuals process has been investigated in [12]. b e Λ (Φ; h, θ) In particular, with the notation of the present paper, the asymptotic normality of the vector C n ⋆ for general θb is obtained (see Proposition 4 in [12]). When θb = θ , the assumptions and the asymptotic covariance matrix of Proposition 4 in [12] respectively reduce to [N1-3] and (18). Now, according to the definition of UΛn (ϕ; h, θ⋆ ), we have (1) UΛn (ϕ; h, θ⋆ ) = 2|Λn |−2 K X (1) CΛn (ϕ; hk , θ⋆ )CΛn (ϕ; hk , θ⋆ ). k=1 Under Assumption [C4], one may apply the ergodic Lemma 2, in order to derive Pθ⋆ −a.s., as n → +∞   M ⋆ (1) (37) |Λn |−1 CΛn (Φ; hk , θ⋆ ) → E fk (1) (0M , Φ; θ⋆ ) − hk (1) (0M , Φ; θ⋆ )e−V (0 |Φ;θ ) . It is easily checked −E(hk , θ⋆ ), this vector of length p being defined by  that this expectation reduces to ⋆ M ⋆ (1) M ⋆ −V (0M |Φ;θ ⋆ ) . Let us denote by E(h, θ⋆ ) the (p, K) matrix E(hk , θ ) := E hk (0 , Φ; θ )V (0 |Φ; θ )e (E(h1 , θ⋆ ), . . . , E(hK , θ⋆ )), then we get the following decomposition (1) 2|Λn |1/2 × |Λn |−2 |Λn |1/2 UΛn (Φ; h, θ⋆ ) = K X (1) CΛn (Φ; hk , θ⋆ )CΛn (Φ; hk , θ⋆ ) k=1 e Λn (Φ; h, θ⋆ ) −2|Λn |−1/2 E(h, θ⋆ )C K h i X (1) +2 |Λn |−1 CΛn (Φ; hk , θ⋆ ) − (−E(hk , θ⋆ )) |Λn |−1/2 CΛn (Φ; hk , θ⋆ ). = k=1 According to (37) and Lemma 10, Slutsky’s Theorem implies that for any k ∈ {1, . . . , K},   P (1) |Λn |−1 CΛn (Φ; hk , θ⋆ ) − E(hk , θ⋆ ) |Λn |−1/2 CΛn (Φ; hk , θ⋆ ) → 0, as n → +∞, the zero here being a vector of length p. Using again Lemma 10, we finally reach the following convergence in distribution as n → +∞   d (1) T |Λn |1/2 UΛn (Φ; h, θ⋆ ) → N 0, 4E(h, θ⋆ ) Σ(h, θ⋆ ) E(h, θ⋆ ) , where Σ(h, θ⋆ ) is defined by (18). (2) Step 2. Convergence of UΛn (Φ; h, θ) for θ ∈ V(θ⋆ ) (2) According to our definition and Assumption [N4], the (p, p) matrix UΛn (ϕ; h, θ) is defined for i, j = 1, . . . , p by   (2) UΛn (ϕ; h, θ⋆ ) ij = 2|Λn | −2 K        X (2) (1) (1) . CΛn (ϕ; hk , θ) CΛn (ϕ; hk ; θ) CΛn (ϕ; hk , θ) + CΛn (ϕ; hk , θ) ij k=1 (1) (2) Note also that CΛn (ϕ; hk , θ) and CΛn (ϕ; hk , θ) are defined by Z (1) (1) CΛn (ϕ; hk , θ) = fk (xm , ϕ; θ)µ(dxm ) − Λn × (2) CΛn (ϕ; hk , θ) i = Z Λn × M M (2) f k (xm , ϕ; θ)µ(dxm ) − X xm ∈ϕΛn X xm ∈ϕΛn 23 (1) hk (xm , ϕ \ xm ; θ) (2) hk (xm , ϕ \ xm ; θ). j Under Assumption [N4], then for all i, j = 1, . . . , p and for any k = 1, . . . , K, each normalized term (1) (2) |Λn |−1 CΛn (Φ; hk , θ), |Λn |−1 CΛn (Φ; hk , θ) and |Λn |−1 CΛn (Φ; hk , θ) satisfies the assumptions of the ergodic Lemma 2. Therefore, for any θ ∈ V(θ⋆ ), there exists a matrix U(2) (h, θ) such that Pθ⋆ −a.s. (2) UΛn (Φ; h, θ) → U(2) (h, θ).  (2) This justifies that, for n large enough, in a neighborhood of θ⋆ , UΛn (ϕ; h, θ) ij is uniformly bounded by  2 × maxθ∈V(θ⋆) | U(2) (h, θ) ij | for Pθ⋆ −a.e. ϕ. When θ = θ⋆ , recall, from (5), that |Λn |−1 CΛn (Φ; hk , θ⋆ ) T converges almost surely to zero and that (37) holds. Hence, U(2) (h, θ⋆ ) reduces to 2E(h, θ⋆ )E(h, θ⋆ ) . Conclusion Step. From Theorem 1 and 2 ensure that the normalized difference     3.4.5 of [23], Steps (1) (2) ⋆ 1/2 ⋆ ⋆ b U (h, θ ) θn (Φ) − θ − UΛn (Φ; h, θ ) converges in probability to 0, which is the expected |Λn | result. Acknowledgements We are grateful to the associate editor, the anonymous referee and to Hans Zessin for their valuable comments and advices. References [1] A. Baddeley and R. Turner. Practical maximum pseudolikelihood for spatial point patterns (with discussion). Australian and New Zealand Journal of Statistics, 42:283–322, 2000. [2] A. Baddeley, R. Turner, J. Møller, and M. Hazelton. Residual analysis for spatial point processes. Journal of the Royal Statistical Society (series B), 67:1–35, 2005. [3] A. J. Baddeley and M. N. M. Van Lieshout. Area-interaction point processes. Ann. Inst. Statist. Math., 47(4):601–619, 1995. [4] K.K. Berthelsen and J. Møller. Likelihood and non-parametric Bayesian MCMC inference for spatial point processes based on perfect simulation and path sampling. Scandinavian Journal of Statistics, 30(3):549–564, 2003. [5] E. Bertin, J.-M. Billiot, and R. Drouilhet. Existence of “nearest-neighbour” spatial Gibbs models. Adv. Appli. Prob., 31:895–909, 1999. [6] E. Bertin, J.-M. Billiot, and R. Drouilhet. R-local Delaunay inhibition model. Journal of Statistical Physics, 132(4):649–667, 2008. [7] J. Besag. Spatial interaction and the statistical analysis of lattice system. Journal of the Royal Statistical Society (series B), 26:192–236, 1974. [8] J.-M. Billiot. Asymptotic properties of Takacs-Fiksel estimation method for Gibbs point processes. Statistics, 30(1):69–89, 1997. [9] J.-M. Billiot, J.-F. Coeurjolly, and R. Drouilhet. Maximum pseudolikelihood estimator for exponential family models of marked Gibbs point processes. Electronic Journal of Statistics, 2:234–264, 2008. [10] D. Brillinger. Statistical inference for stationary point processes. Stochastic processes and related topics, vol 1. 55-99 Ed. Madan Lal Puri. Academic Press, New-York-London, 1975. [11] J.-F. Coeurjolly and R. Drouilhet. Asymptotic properties of the maximum pseudo-likelihood estimator for stationary Gibbs point processes including the Lennard-Jones model. Electronic Journal of Statistics, 4:677–706, 2010. 24 [12] J.-F. Coeurjolly and F. Lavancier. Residuals for stationary marked Gibbs point processes. submitted, 2010. http://arxiv.org/abs/1002.0857. [13] D. Dereudre. Gibbs Delaunay tessellations with geometric hardcore condition. J. Stat. Phys., 121(34):511–515, 2005. [14] D. Dereudre. The existence of quermass-interaction processes for nonlocally stable interaction and nonbounded convex grains. Adv. Appli. Prob., 41(3):664–681, 2009. [15] D. Dereudre, R. Drouilhet, and H.O. Georgii. Existence of Gibbsian point processes with geometrydependent interactions. to appear in Prob. Theory and Related Fields, 2010. [16] D. Dereudre and F. Lavancier. Campbell equilibrium equation and pseudo-likelihood estimation for non-hereditary Gibbs point processes. Bernoulli, 15(4):1368–1396, 2009. [17] D. Dereudre and F. Lavancier. Practical simulation and estimation for Gibbs Delaunay-Voronoı̈ tessellations with geometric hardcore interaction. Computational Stat. and Data Analysis, 55:498– 519, 2011. [18] P. J. Diggle, T. Fiksel, P. Grabarnik, Y. Ogata, D. Stoyan, and M. Tanemura. On parameter estimation for pairwise interaction point processes. International Statistical Review, 62(1):99–117, 1994. [19] T. Fiksel. Estimation of parameterized pair potentials of marked and non-marked Gibbsian point processes. Elektron. Inform. Kybernet., 20:270–278, 1984. [20] T. Fiksel. Estimation of interaction potentials of Gibbsian point processes. Math. Operationsf. Statist. Ser. Statist., 19:77–86, 1988. [21] H.O. Georgii. Canonical and grand canonical Gibbs states for continuum systems. Communications in Mathematical Physics, 48(1):31–51, 1976. [22] H.O. Georgii. Gibbs measures and Phase Transitions. de Gruyter, Berlin, 1988. [23] X. Guyon. Random fields on a network. Modeling, Statistics and Applications. Springer Verlag, New York, 1995. [24] X. Guyon and H.R. Künsch. Asymptotic comparison of estimators in the Ising model. Lecture notes in Statistics 74. Springer, Berlin, 1992. [25] L. Heinrich. Mixing properties of Gibbsian point processes and asymptotic normality of Takacs-Fiksel estimates. preprint, 1992. [26] F. Huang and Y. Ogata. Improvements of the maximum pseudo-likelihood estimators in various spatial statistical models. Journal of Computational and Graphical Statistics, 8(3):510–530, 1999. [27] J. Illian, A. Penttinen H., and Stoyan. Statistical analysis and modelling of spatial point patterns. Wiley-Interscience, 2008. [28] J.L. Jensen and H.R. Künsch. On asymptotic normality of pseudolikelihood estimates of pairwise interaction processes. Ann. Inst. Statist. Math., 46:475–486, 1994. [29] J.L. Jensen and J. Møller. Pseudolikelihood for exponential family models of spatial point processes. Ann. Appl. Probab., 1:445–461, 1991. [30] W. S. Kendall, M. N. M. Van Lieshout, and A. J. Baddeley. Quermass-interaction processes conditions for stability. Adv. Appli. Prob., 31:315–342, 1999. [31] K. Krickeberg. Statistical problems of point processes. Rosz. Pol. Tow. Mat., Ser. III, Mat. Stosow. 13, 29-57, 1978. 25 [32] E.L. Lehman. Elements of large-sample theorey. Springer, 1999. [33] S. Mase. Uniform LAN condition of planar Gibbsian point processes and optimality of maximum likelihood estimators of soft-core potential function. Probab. Theory and Related Fields, 92:51–67, 1992. [34] K Matthes, J. Kerstan, and J. Mecke. Infinitely Divisible Point Processes. J. Wiley, 1978. [35] J. Møller and K. Helisová. Power diagrams and interaction processes for union of discs. Adv. Appli. Prob., 40(2):321–347, 2008. [36] J. Møller and R. Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton, 2003. [37] J. Møller and R. Waagepetersen. Modern statistics for spatial point processes. Scan. J. Stat., 34:643–684, 2007. [38] X. Nguyen and H. Zessin. Integral and differential characterizations Gibbs processes. Mathematische Nachrichten, 88(1):105–115, 1979. [39] X.X. Nguyen and H. Zessin. Ergodic theorems for spatial processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 48:133–158, 1979. [40] F. Papangelou. The Armenian Connection: Reminiscences from a time of interactions. Journal of contemporary mathematical analysis, 44(1):14–19, 2009. [41] C.J. Preston. Random fields. Springer Verlag, 1976. [42] D. Ruelle. Statistical Mechanics. Benjamin, New York-Amsterdam, 1969. [43] D. Ruelle. Superstable interactions in classical statistical mechanics. Comm. Math. Phys., 1970. [44] D. Stoyan, W.S. Kendall, and J. Mecke. Stochastic geometry and its applications. John Wiley and Sons, Chichester, 1995. [45] R. Takacs. Estimator for the pair-potential of a Gibbsian point process. Math. Operationsf. Statist. Ser. Statist., 17:429–433, 1986. [46] H. Zessin. Der papangelou prozess. Journal of contemporary mathematical analysis, 44(1):36–44, 2009. 26