Introduction To Spatstat: Adrian Baddeley and Rolf Turner Spatstat Version 1.8-6
Introduction To Spatstat: Adrian Baddeley and Rolf Turner Spatstat Version 1.8-6
Introduction To Spatstat: Adrian Baddeley and Rolf Turner Spatstat Version 1.8-6
K(r)/. To estimate
and plot the L function, type
K <- Kest(X)
plot(K, sqrt(./pi) ~ r)
For a quick rst analysis of a point pattern it is often convenient to hit
plot(allstats(X))
which plots estimates of the F, G, J and K functions in a single display. See
section 4.10 for more information on allstats.
21
0.00 0.05 0.10 0.15 0.20 0.25
0
.
0
0
.
2
0
.
4
0
.
6
0
.
8
1
.
0
r
k
m
,
t
h
e
o
F function
0.00 0.05 0.10 0.15 0.20 0.25
0
.
0
0
.
2
0
.
4
0
.
6
0
.
8
1
.
0
r
k
m
,
t
h
e
o
G function
0.00 0.05 0.10 0.15 0.20 0.25
0
.
0
0
.
2
0
.
4
0
.
6
0
.
8
1
.
0
r
k
m
,
t
h
e
o
J function
0.00 0.05 0.10 0.15 0.20 0.25
0
.
0
0
0
.
0
5
0
.
1
0
0
.
1
5
0
.
2
0
r
t
r
a
n
s
,
t
h
e
o
K function
Four summary functions for redwood.
There are also several related alternative functions. For the second order
statistics, alternatives are:
Kinhom K function for inhomogeneous patterns
Kest.fft fast K-function using FFT for large datasets
Kmeasure reduced second moment measure
See the help les for these functions.
Distances between points are also computed (without edge correction)
by:
22
nndist nearest neighbour distances
pairdist distances between all pairs of points
crossdist distances between points in two patterns
exactdt distance from any location to nearest data point
distmap distance map
4.5 Function envelopes
A popular strategy for assessing whether a point pattern is random (Pois-
son) or has interpoint interactions, is to plot a summary function for the
data (e.g. the K function) together with the upper and lower envelopes of
the K functions for 99 simulated realisations of a completely random (uni-
form Poisson) point process. This can be done with the command envelope.
4.6 Comparing and combining summary functions
The command eval.fv will evaluate any expression involving summary func-
tions. For example, you can use it to subtract one K function from another:
KX <- Kest(X)
KY <- Kest(Y)
Kdiff <- eval.fv(KX - KY)
plot(Kdiff)
will evaluate and plot the dierence between the K functions of two point
patterns X and Y.
4.7 Pixel images
Functions which display or manipulate a pixel image include
plot.im plot a pixel image on screen as a digital image
contour.im draw contours of a pixel image
persp.im draw perspective plot of a pixel image
[.im extract subset of pixel image
print.im print basic information about pixel image
summary.im summary of pixel image
is.im test whether an object is a pixel image
compatible.im test whether two images have compatible dimensions
eval.im evaluate an expression involving pixel images
shift.im apply vector shift to pixel image
23
4.8 Line segment patterns
Functions which display or manipulate a line segment pattern include
plot.psp plot a line segment pattern
[.psp extract subset of line segment pattern
print.psp print basic information about line segment pattern
summary.psp print summary of line segment pattern
endpoints.psp extract endpoints of line segments
midpoints.psp compute midpoints of line segments
lengths.psp compute lengths of line segments
angles.psp compute orientation angles of line segments
distmap.psp compute distance map of line segments
density.psp kernel smoothing of line segments
affine.psp ane transformation of line segments
shift.psp vector shift of line segment pattern
rotate.psp rotation of line segment pattern
4.9 Summary statistics for a multitype point pattern:
Analogues of the G, J and K functions have been dened in the literature for
multitype point patterns, that is, patterns in which each point is classied
as belonging to one of a nite number of possible types (e.g. on/o, species,
colour). The best known of these is the cross K function K
ij
(r) derived by
counting, for each point of type i, the number of type j points lying closer
than r units away.
Gcross,Gdot,Gmulti multitype nearest neighbour distributions G
ij
, G
i
Kcross,Kdot, Kmulti multitype K-functions K
ij
, K
i
Jcross,Jdot,Jmulti multitype J-functions J
ij
, J
i
alltypes array of multitype functions
Iest multitype I-function
These functions (with the exception of alltypes) operate in a very similar
way to Gest, Jest, Kest with additional arguments specifying the type(s)
of points to be studied.
To compute and plot the cross K function K
ij
(r) for all possible pairs of
types i and j,
plot(alltypes(X,"K"))
24
See the next section for further information.
0.00 0.05 0.10 0.15
0
.
0
0
0
.
0
2
0
.
0
4
0
.
0
6
r
t
r
a
n
s
,
t
h
e
o
(off,off)
0.00 0.05 0.10 0.15
0
.
0
0
0
.
0
2
0
.
0
4
0
.
0
6
r
t
r
a
n
s
,
t
h
e
o
(off,on)
0.00 0.05 0.10 0.15
0
.
0
0
0
.
0
2
0
.
0
4
0
.
0
6
r
t
r
a
n
s
,
t
h
e
o
(on,off)
0.00 0.05 0.10 0.15
0
.
0
0
0
.
0
2
0
.
0
4
0
.
0
6
r
t
r
a
n
s
,
t
h
e
o
(on,on)
Array of K functions for amacrine.
4.10 Function arrays
A function array is a collection of functions f
i,j
(r) indexed by integers i and
j. An example is the set of cross K functions K
ij
(r) for all possible pairs of
types i and j in a multitype point pattern. It is best to think of this as a
genuine matrix or array.
25
A function array is represented in spatstat by an object of type "fasp".
It can be stored, plotted, indexed and subsetted in a natural way. If Z is a
function array, then
plot(Z)
plot(Z[,3:5])
will plot the entire array, and then plot the subarray consisting only of
columns 3 to 5. See help(fasp.object), help(plot.fasp) and help("[.fasp")
for details.
The value returned by alltypes is a function array. alltypes computes
a summary statistic for each possible type, or each possible pairs of types,
in a multitype point pattern. For example if X is a multitype point pattern
with 3 possible types,
Z <- alltypes(X, "K")
yields a 33 function array such that (say) Z[1,2] represents the cross-type
K function K
1,2
(r) between types 1 and 2. The command plot(Z) will plot
the entire set of cross K functions as a two-dimensional array of plot panels.
Arguments to plot.fasp can be used to change the plotting style, the range
of the axes, and to select which estimator of K
ij
is plotted.
The value returned by allstats is a 2 2 function array containing the
F, G, J and K functions of an (unmarked) point pattern.
4.11 Summary functions for a continuously marked
point pattern
Some point patterns are marked, but not multitype. That is, the points may
carry marks that do not belong to a nite list of possible types. The marks
might be continuous numerical values, complex numbers, etc.
An example in spatstat is the dataset longleaf where the marks rep-
resent tree diameters. You can easily recognise whether a point pattern is
multitype or not by the behaviour of the plot function: a multitype pattern is
plotted using dierent plotting symbols for each type, while a marked point
pattern with numerical marks is plotted using circles of radius proportional
to the marks.
There are a few ways to study such patterns in spatstat:
26
the function markcorr computes the mark correlation function of an
arbitrary marked point pattern. See the help le for markcorr.
you can convert a marked point pattern to a multitype point pattern
using the function cut.ppp, for example, classifying the marks into
High, Medium and Low, then apply the abovementioned functions for
multitype point patterns. This is usually a good exploratory step.
the functions Kmulti, Gmulti, Jmulti operate on arbitrary marked
point patterns. They require arguments I, J identifying two subsets
of the point pattern. These two subsets will be treated as two discrete
types.
ignore the marks (use the function unmark to remove them) and analyse
only the locations of the points.
4.12 Programming tools
spatstat also contains some programming tools to help you perform calcu-
lations with point patterns. One of these is the function applynbd which
can be used to visit each point of the point pattern, identify its neighbouring
points, and apply any desired operation to these neighbours.
For example the following code calculates the distance from each point
in the pattern redwood to its second nearest neighbour:
nnd2 <- applynbd(redwood,
N = 2,
function(Y, cur, d, r){max(d)},
exclude=TRUE)
See the help on applynbd for examples and details.
You can also use applynbd to perform animations in which each point of
the point pattern is visited and a graphical display is executed. There is an
example in demo(spatstat).
27
5 Model tting
spatstat enables parametric models of spatial point processes to be tted to
point pattern data. The scope of possible models is very wide. Models may
include spatial trend, dependence on covariates, and interpoint interactions
of any order (i.e. we are not restricted to pairwise interactions). Models
are specied by formulae in the S language and tted by a function ppm()
analogous to glm() and gam().
Models can be tted either by the method of maximum pseudolikelihood,
or by an approximate maximum likelihood method. Maximum pseudolike-
lihood is very fast (using a computational device developed by Berman &
Turner (1992) and Baddeley & Turner (2000)). Approximate maximum like-
lihood is slower, but has better statistical properties when the model has
strong interpoint interactions.
For example if X is a point pattern,
ppm(X, ~1, Strauss(r=0.1), ....)
ts the stationary Strauss process with interaction radius r = 0.1, and
ppm(X, ~x, Strauss(r=0.07), ....)
ts the non-stationary Strauss process with a loglinear spatial trend of the
form b(x, y) = exp(a + bx).
The value returned by ppm() is a tted point process model of class
"ppm". It can be plotted and predicted, in a manner analogous to the plotting
and prediction of tted generalised linear models.
Simulation of the tted model is also possible using rmh.
5.1 Models
Here is a very brief summary of parametric models for point processes. See
Baddeley & Turner (2000), Cox & Isham (1980), and the excellent surveys
by Ripley (1988, 1989).
The point pattern dataset x is assumed to be a realisation of a random
point process X in W. Typically the null model (or the null hypothesis) will
28
be the homogeneous Poisson point process. Other models will be specied
by their likelihood with respect to the Poisson process. Thus we assume
X has a probability density f(x; ) with respect to the distribution of the
Poisson process with intensity 1 on W. The distribution is governed by a
p-dimensional parameter .
We frequently use the Papangelou conditional intensity dened, for a
location u W and a point pattern x, as
(u, x) =
f(x {u}; )
f(x \ u; )
Eectively our technique ts a model to the conditional intensity.
Here are four important examples.
the homogeneous Poisson process with intensity > 0 has conditional
intensity
(u, x) =
the inhomogeneous Poisson process on W with rate or intensity func-
tion : W R, has conditional intensity
(u, x) = (u).
In statistical models, the intensity
: W R
+
and interaction function h
: W W R
+
has condi-
tional intensity
(u, x) = b
(u)
i
h
(u, x
i
)
29
The term b
(u, x
i
) introduce
dependence (interaction) between dierent points of the process X.
Our technique only estimates parameters for which the model is in
canonical exponential family form,
f(x; ) = () exp(
T
V (x))
(u, x) = exp(
T
S(u, x))
where V (x) and S(u, x) are statistics, and () is the normalising constant.
5.2 Implementation in spatstat
The model-tting function is called ppm() and is strongly analogous to glm()
or gam(). It is called in the form
ppm(X, formula, interaction, ...)
where X is the point pattern dataset, formula is an S language formula
describing the systematic part of the model, and interact is an object of
class "interact" describing the stochastic dependence between points in the
pattern.
What this means is that we write the conditional intensity
(u, x) as a
loglinear expression with two components:
(u, x) = exp(
1
B(u) +
2
C(u, x))
where = (
1
,
2
) are parameters to be estimated.
The term B(u) depends only on the spatial location u, so it represents
spatial trend or spatial covariate eects. It is treated as a systematic
component of the model, analogous to the systematic part of a generalised
linear model, and is described in spatstat by an S language formula.
The term C(u, x) represents stochastic interactions or dependence be-
tween the points of the random point process. It is regarded as a distri-
butional component of the model analogous to the distribution family in a
generalised linear model. It is described in spatstat by an object of class
"interact" that we create using specialised spatstat functions.
For example
30
ppm(X, ~1, Strauss(r=0.1), ....)
ts the stationary Strauss process with interaction radius r = 0.1. The
spatial trend formula ~1 is a constant, meaning the process is stationary.
The argument Strauss(r=0.1) is an object representing the interpoint in-
teraction structure of the Strauss process with interaction radius r = 0.1.
Similarly
ppm(X, ~x, Strauss(r=0.1), ....)
ts the non-stationary Strauss process with a loglinear spatial trend of the
form b(x, y) = exp(a + bx) where a and b are parameters to be tted, and
x, y are the cartesian coordinates.
Spatial trend
The formula argument of ppm() describes any spatial trend and covariate
eects. The default is ~1, which corresponds to a process without spatial
trend or covariate eects. The formula ~x corresponds to a spatial trend
of the form (x, y) = exp(a + bx), while ~x + y corresponds to (x, y) =
exp(a + bx + cy) where x, y are the Cartesian coordinates. These could be
replaced by any S language formula (with empty left hand side) in terms of
the reserved names x, y and marks, or in terms of some spatial covariates
which you must then supply.
You can easily construct spatial covariates from the Cartesian coordi-
nates. For example
ppm(X, ~ ifelse(x > 2, 0, 1), Poisson())
ts an inhomogeneous Poisson process with dierent, constant intensities on
each side of the line x = 2.
spatstat provides a function polynom which generates polynomials in 1
or 2 variables. For example
~ polynom(x, y, 2)
31
represents a polynomial of order 2 in the Cartesian coordinates x and y. This
would give a log-quadratic spatial trend. The distinction between polynom
and poly is explained below. Similarly
~ harmonic(x, y, 2)
represents the most general harmonic polynomial of order 2 in x and y.
It is slightly more tricky to include observed spatial covariates; see sec-
tion 5.7.
Interaction terms
The higher order (interaction) structure can be specied using one of the
following functions. They yield an object (of class "interact") describing
the interpoint interaction structure of the model.
Poisson() . . . . . . . . Poisson process
Strauss() . . . . . . . . Strauss process
StraussHard() . . . Strauss process with a hard core
Softcore(). . . . . . . Pairwise interaction, soft core potential
PairPiece() . . . . . Pairwise interaction, piecewise constant potential
DiggleGratton() Diggle-Gratton potential
LennardJones() . Lennard-Jones potential
Geyer() . . . . . . . . . . Geyers saturation process
OrdThresh() . . . . . Ord process with threshold potential
Note that ppm() estimates only the exponential family type parameters of
a point process model. These are parameters such that the loglikelihood
is linear in . Other so-called irregular parameters (such as the interac-
tion radius r of the Strauss process) cannot be estimated by this technique,
and their values must be specied a priori, as arguments to the interaction
function).
For more advanced use, the following functions will accept user-dened
potentials in the form of an arbitrary S language function. They eectively
allow arbitrary point process models of these three classes.
Pairwise(). . . . Pairwise interaction, user-supplied potential
Ord(). . . . . . . . . . Ord model, user-supplied potential
Saturated(). . . Saturated pairwise model, user-supplied potential
32
The brave user may also generate completely new point process models using
the foregoing as templates.
5.3 Fitted models
The value returned by ppm() is a tted point process model of class "ppm".
It can be stored, inspected, plotted and predicted.
fit <- ppm(X, ~1, Strauss(r=0.1), ...)
fit
plot(fit)
pf <- predict(fit)
coef(fit)
Printing the tted object fit will produce text output describing the tted
model. Plotting the object will display the spatial trend and the conditional
intensity, as perspective plots, contour plots and image plots. The predict
method computes either the spatial trend or the conditional intensity at new
locations.
Methods are provided for the following generic operations applied to
"ppm" objects:
predict() prediction (spatial trend, conditional intensity)
plot() plotting
coef() extraction of tted coecients
fitted() tted conditional intensity or trend at data points
update() update the t
summary() print extensive summary information
anova() analysis of deviance
A "ppm" object contains full information about the data to which the
model was tted. These data can be extracted using the following:
quad.ppm() extract quadrature scheme
data.ppm() extract data point pattern
dummy.ppm() extract dummy points of quadrature scheme
A trap for young players
Note that problems may arise if you use predict on a point process model
whose systematic component is expressed in terms of one of the functions
poly(), bs(), lo(), or ns(). For example
33
fit <- ppm(X, ~ poly(x,2), Poisson())
p <- predict(fit)
The same problem occurs with predict for generalised linear models and
generalised additive models. Each of the abovementioned functions returns
a data frame, containing variables that are transformations of the variables
given as arguments of the function. However the transformations themselves
depend on the values of the arguments. For example poly performs Gram-
Schmidt orthonormalisation. Hence the tted coecients contained in the
fit object are not appropriate when we predict at new locations not
even for the default call to predict(fit) above.
For this reason we have supplied the function polynom which does not
perform any data-dependent transformation, and yields a data frame whose
columns are just the powers of its arguments. Replacing poly by polynom
in the code above does work correctly.
This problem does not aect the function harmonic.
5.4 Simulation and goodness-of-t
The command rmh.ppm will generate simulated realisations of a tted point
process model.
The command envelope will compute the upper and lower envelopes of
a summary statistic (such as the K function) for simulated realisations from
a tted point process model. This can be used as a test of goodness-of-t for
the tted model.
5.5 Fitting models to multitype point patterns
The function ppm() will also t models to multitype point patterns. A mul-
titype point pattern is a point pattern in which the points are each classied
into one of a nite number of possible types (e.g. species, colours, on/o
states). In spatstat a multitype point pattern is represented by a "ppp"
object X containing a vector X$marks, which must be a factor.
Interaction component
Naturally an appropriate specication of the interaction for such a model
must be available. Apart from the Poisson process, so far interaction func-
tions have been written for the following:
34
MultiStrauss() multitype Strauss process
MultiStraussHard() multitype Strauss/hard core process
For the multitype Strauss process, a matrix of interaction radii must
be specied. If there are m distinct levels (possible values) of the marks, we
require a matrix r in which r[i,j] is the interaction radius r
ij
between types
i and j. For the multitype Strauss/hard core model, a matrix of hardcore
radii must be supplied as well. These matrices will be of dimension mm
and must be symmetric. See the help les for these functions.
Trend component
The rst-order component (trend) of a multitype point process model may
depend on the marks. For example, a stationary multitype Poisson point
process could have dierent (constant) intensities for each possible mark. A
general nonstationary process could have a dierent spatial trend surface for
each possible mark.
In order to represent the dependence of the trend on the marks, the trend
formula passed to ppm() may involve the reserved name marks.
The trend formula ~ 1 states that the trend is constant and does not
depend on the marks. The formula ~marks indicates that there is a sepa-
rate, constant intensity for each possible mark. The correct way to t the
multitype Poisson process is
ppm(X, ~ marks, Poisson())
Getting more elaborate, the trend formula might involve both the marks
and the spatial locations or spatial covariates. For example the trend for-
mula ~marks + polynom(x,y,2) signies that the rst order trend is a log-
quadratic function of the cartesian coordinates, multiplied by a constant
factor depending on the mark.
The formulae
~ marks * polynom(x,2)
~ marks + marks:polynom(x,2)
both specify that, for each mark, the rst order trend is a dierent log-
quadratic function of the cartesian coordinates. The second form looks
35
wrong since it includes a marks by polynom interaction without hav-
ing polynom in the model, but since polynom is a covariate rather than a
factor this is is allowed, and makes perfectly good sense. As a result the two
foregoing models are in fact equivalent. However, they will give output that
is slightly dierent in appearance. For instance, suppose that there are 3
distinct marks. The rst form of the model gives a baseline polynomial,
say P
0
, and two polynomials say P
1
and P
2
. Assume that either Helmert or
sum contrasts were used, so that the sum constraints apply. The trends
corresponding to each of the marks would be given by exp(C
1
+ P
0
+ P
1
),
exp(C
2
+ P
0
+ P
2
), and exp(C
3
+ P
0
P
1
P
2
) respectively, where C
1
, C
2
,
and C
3
are the appropriate constant terms corresponding to each of the three
marks.
The second model simply gives 3 polynomials, say p
1
, p
2
, and p
3
, cor-
responding to each of the 3 marks. The trends would then be given by
exp(c
1
+ p
1
), exp(c
2
+ p
2
), and exp(c
3
+ p
3
).
5.6 Quadrature schemes
The function ppm is an implementation of the technique of Baddeley & Turner
(2000) which is based on a quadrature device originated by Berman & Turner
(1992). Complete control over the quadrature technique is possible.
Indeed the function ppm() prefers to be provided with a quadrature
scheme as its rst argument, although it will make do with a point pattern
and calculate a default quadrature scheme.
A quadrature scheme is an object of class "quad" giving the locations of
quadrature points and the weights attached to them. See help(quad.object)
for more details. The usual way to create a quadrature scheme is to use
quadscheme(). For example:
Q <- quadscheme(simdat,gridcentres(simdat,50,50),
nx=40,ny=40)
fit <- ppm(Q, ~polynom(x,y,3),Softcore(0.5),
correction="periodic")
Following are the most useful functions for manipulating quadrature schemes.
36
quadscheme generate a Berman-Turner quadrature scheme
for use by ppm
default.dummy default pattern of dummy points
gridcentres dummy points in a rectangular grid
stratrand stratied random dummy pattern
spokes radial pattern of dummy points
corners dummy points at corners of the window
gridweights quadrature weights by the grid-counting rule
dirichlet.weights quadrature weights are Dirichlet tile areas
print.quad print basic information
summary.quad summary of a quadrature scheme
5.7 Observed spatial covariates
If you wish to model the dependence of a point pattern on a spatial covariate,
there are several requirements.
the covariate must be a quantity Z(u) observable (in principle) at each
location u in the window (e.g. altitude, soil pH, or distance to another
spatial pattern). There may be several such covariates, and they may
be continuous valued or factors.
the values Z(x
i
) of Z at each point of the data point pattern must be
available.
the values Z(u) at some other points u in the window must be available.
Thus, it is not enough simply to observe the covariate values at the points
of the data point pattern. In order to t a model involving covariates, ppm
must know the values of these covariates at every quadrature point.
The argument covariates to the function ppm() species the values of
the spatial covariates. It may be either a data frame or a list of images.
(a) If covariates is a data frame, then the ith row of the data frame is ex-
pected to contain the covariate values for the ith quadrature point. The
column names of the data frame should be the names of the covariates
used in the model formula when you call ppm.
(b) If covariates is a list of images, then each image is assumed to con-
tain the values of a spatial covariate at a ne grid of spatial locations.
37
The software will look up the values of these images at the quadra-
ture points. The names of the list entries should be the names of the
covariates used in the model formula when you call ppm.
Covariates in a data frame
Typically you would use the data frame format (a) if the values of the spatial
covariates can only be observed at a few locations. You need to force ppm()
to use these locations to t the model. To do this, you will need to construct
a quadrature scheme based on the spatial locations where the covariate Z
has been observed. Then the values of the covariate at these locations are
passed to ppm() through the argument data.
For example, suppose that X is the observed point pattern and we are
trying to model the eect of soil acidity (pH). Suppose we have measured
the values of soil pH at the points x
i
of the point pattern, and stored them
in a vector XpH. Suppose we have measured soil pH at some other locations
u in the window, and stored the results in a data frame U with columns x,
y, pH. Then do as follows:
Q <- quadscheme(data=X, dummy=list(x=U$x, y=U$y))
df <- data.frame(pH=c(XpH, U$pH))
Then the rows of the data frame df correspond to the quadrature points in
the quadrature scheme Q. To t just the eect of pH, type
ppm(Q, ~ pH, Poisson(), covariates=df)
where the term pH in the formula ~ pH agrees with the column label pH in the
argument covariates = df. This will t an inhomogeneous Poisson process
with intensity that is a loglinear function of soil pH. You can also try (say)
ppm(Q, ~ pH, Strauss(r=1), covariates=df)
ppm(Q, ~ factor(pH > 7), Poisson(), covariates=df)
ppm(Q, ~ polynom(x, 2) * factor(pH > 7), covariates=df)
Covariates in a list of images
The alternative format (b), in which covariates is a list of images, would
typically be used when the covariate values are computed from other data.
For example, suppose we have a spatial epidemiological dataset containing
a point pattern X of disease cases, and another point pattern Y of controls.
38
We want to model X as a point process with intensity proportional to the
local density of the susceptible population. We estimate by taking a
kernel-smoothed estimate of the intensity of Y. Thus
rho.hat <- ksmooth.ppp(Y, sigma=1.234)
ppm(X, ~offset(log(rho)), covariates=list(rho=rho.hat))
The rst line computes the values of the kernel-smoothed intensity estimate
at a ne grid of pixels, and stores them in the image object rho.hat. The
second line ts the Poisson process model with intensity
(u) = (u)
Note that covariates must be a list of images, even though there is only
one covariate. The variable name rho in the model formula must match the
name rho in the list.
6 Diagnostics for a tted model
Diagnostic plots are available for checking a tted point process model.
diagnose.ppm diagnostic plots for spatial trend
qqplot.ppm diagnostic plot for interpoint interaction
For example, suppose we t a Strauss process model to the Swedish Pines
data:
data(swedishpines)
fit <- ppm(swedishpines, ~1, Strauss(7), rbord=7)
Then the adequacy of the trend in the tted model can be assessed by calling
diagnose.ppm(fit)
qqplot.ppm(fit)
and inspecting these plots. See the help les for further information, or type
demo(diagnose) for a demonstration of the diagnostics features.
39
7 Worked example
Suppose we have a data le trees.tab containing a table of x,y coordinates
and species names for all trees in a paddock. The paddock has an irregular
polygonal boundary whose vertex coordinates are stored in the le paddock.
The following code will read in these data, plot the polygonal boundary,
create the point pattern object and plot the point pattern.
tab <- read.table("trees.tab", header=TRUE)
bdry <- scan("paddock", what=list(x=0,y=0))
plot(owin(poly=bdry))
trees <- ppp(tab$x, tab$y, poly=bdry, marks=factor(tab$species))
plot(trees)
owin(poly=bdry) trees
Next we inspect the pattern of sugar gums only, using the subset operation
"[" for point patterns:
sugargums <- trees[ trees$marks == "sugargum"]
plot(sugargums)
sugargums
40
Next we compute and plot the cross-type G function between sugar gums
and red box:
G <- Gcross(trees, "sugargum", "redbox")
plot(G)
r
k
m
,
t
h
e
o
r
0 50 100 150 200 250 300
0
.
0
0
.
2
0
.
4
0
.
6
0
.
8
Next we t a nonstationary Poisson process, with a separate log-cubic
spatial trend for each species of tree:
fitsep <- ppm(trees, ~ marks * polynom(x,y,3), Poisson())
We also t the sub-model in which the species trends are all proportional:
fitprop <- ppm(trees, ~ marks + polynom(x,y,3), Poisson())
and t the stationary model in which each species has constant intensity:
fit0 <- ppm(trees, ~ marks, Poisson())
We plot the tted trend surfaces for each tree:
plot(fitsep)
Finally we t a nonstationary multitype Strauss / hard core process with a
hard core operating between trees of the same species:
mymodel <- MultiStraussHard( levels(trees$marks),
iradii=matrix(c(150,100,100,200), 2,2),
hradii=matrix(c(50,NA,NA,50), 2,2))
fit <- ppm(trees, ~ marks * polynom(x,y,3), mymodel, rbord=20)
plot(fit, cif=FALSE)
41
mark = redbox
42
References
[1] A. Baddeley and R. Turner. Practical maximum pseudolikelihood for
spatial point patterns. Australian and New Zealand Journal of Statistics
42 (2000) 283322.
[2] A.J. Baddeley and R.D. Gill. Kaplan-Meier estimators for interpoint
distance distributions of spatial point processes. Annals of Statistics 25
(1997) 263292.
[3] M. Berman and T.R. Turner. Approximating point process likelihoods
with GLIM. Applied Statistics, 41:3138, 1992.
[4] N.A.C. Cressie. Statistics for spatial data. John Wiley and Sons, New
York, 1991.
[5] P.J. Diggle. Statistical analysis of spatial point patterns. Academic Press,
London, 1983.
[6] M.B. Hansen, R.D. Gill and A.J. Baddeley. Kaplan-Meier type estima-
tors for linear contact distributions. Scandinavian Journal of Statistics
23 (1996) 129155.
[7] M.B. Hansen, A.J. Baddeley and R.D. Gill. First contact distributions
for spatial patterns: regularity and estimation. Advances in Applied
Probability (SGSA) 31 (1999) 1533.
[8] M.N.M. van Lieshout and A.J. Baddeley. A non-parametric measure of
spatial interaction in point patterns. Statistica Neerlandica 50 (1996)
344361.
[9] M.N.M. van Lieshout and A.J. Baddeley. Indices of dependence between
types in multivariate point patterns. Scandinavian Journal of Statistics
26 (1999) 511532.
[10] B.D. Ripley. Spatial statistics. John Wiley and Sons, New York, 1981.
[11] D. Stoyan, W.S. Kendall, and J. Mecke. Stochastic Geometry and its
Applications. John Wiley and Sons, Chichester, second edition, 1995.
43