Applications of Random Sampling in Computational Geometry, II

Download as pdf or txt
Download as pdf or txt
You are on page 1of 11

Applications of Random Sampling

in Computational Geometry, II
extended abstract

Kenneth L. Clarkson
A T & T Bell Laboratories
Murray Hill, New Jersey 07974

Abstract random sampling will be used in a similar way, with


the additional observation that the total of the sizes
Random sampling is used for several new geometric of the subproblems is small on the average. This fact
algorithms. The algorithms are "Las Vegas," and will allow improvement in bounds for certain random-
their expected bounds are with respect to the random ized algorithms, and for a combinatorial quantity.
behavior of the algorithms. One algorithm reports The results in this paper have been used in al-
all the intersecting pairs of a set of line segments in gorithms for computing diametral pairs[CS88], tri-
the plane, and requires O(A + n log n) expected time, angulating simple polygons[CTVW88], and in vari-
where A is the size of the answer, the number of inter- ous combinatorial results on arrangements[CEG+88].
secting pairs reported. The algorithm requires O(n) These ideas are used in [CS881 to show that a sim-
space in the worst case. Another algorithm computes ple, general technique for computing geometric struc-
the convex hull of a point set in E 3 in O(n log A) ex- tures is asymptotically fast. This technique is applied
pected time, where n is the number of points and A is in this paper, replacing earlier and more complicated
the number of points on the surface of the hull. A sim- versions of the algorithms.
ple Las Vegas algorithm triangulates simple polygons The dimension d is considered to be a constant.
in O(n log log n) expected time. Algorithms for half- The expected resource bounds indicated are "Las Ve-
space range reporting are also given. In addition, this gas," and the expectations are with respect to the
paper gives asymptotically tight bounds for a combi- random behavior of the algorithms. The parameter
natorial quantity of interest in discrete and compu- A is generally used to denote the size of the Answer
tational geometry, related to halfspace partitions of to a computation. The inputs to the algorithms will
point sets. be assumed nondegenerate, so an input set of line seg-
ments has no three intersecting at the same point, an
input set of points in E d has no d + 1 coplanar, and
1 Introduction so on. This is no loss of generality, as small random
perturbations assure the condition with probability
In recent years, random sampling has seen increasing 1, and the answer sizes A as defined are unchanged.
use in discrete and computational geometry, with ap- (See also [EM8S, Yap88].)
plications in proximity problems, point location, and
range queries [Cla85, Cla87, HW871. These applica-
tions have largely used random sampling for divide-
1.1 T h e problems and results
and-conquer, to split up problems into subproblems A n a l g o r i t h m for line s e g m e n t i n t e r s e c t i o n s .
each guaranteed to be of small size. In this paper, Let S be a set of n line segments in the plane. The
problem of determining the pairs of segments in S
that intersect has received much attention, culminat-
ing in the recent algorithm of Chazelle and Edels-
Permission to copy without fee all or part of this material is granted brunner requiring O(A + n l o g n ) time in the worst
provided thatthe copiesare not made or distributed for direct com-
mercial advantage, the ACM copyright notice and the title of the case to report the A intersecting pairs. Their algo-
publication and its date appear, and notice is given that copyingis rithm requires (moderately) sophisticated data struc-
by permissionof the Associationfor ComputingMachinery.To copy tures and many sophisticated algorithmic techniques,
otherwise, or to republish, requiresa fee and/or specificpermission.
and f~(n + A) space. This paper gives a Las Vegas
© 1988 ACM 0-89791-270-5/88/0006/0001 $1.50 algorithm for this problem. The algorithm requires
O(A + nlogn) expected time and O(n) space in the that gk,3(n) ----O(nkh). In [Cla87], it was shown that
worst case. The only data structure used by the algo- g~,3(n) = O(nk 2 logs n/(log log n)6). The new bound
rithm is a subdivision of the plane into simple regions O(nk 2) is thus an improvement over all previous re-
induced by the line segments. sults for d -- 3; no bounds were known before for
Convex hulls. A L a s Vegas algorithm is given for higher dimensions. The proof of the bound is also
computing the convex hull of n points in E 3. The considerably simpler than those given for these ear-
algorithm requires O(n log A) expected time for any lier, weaker bounds.
set of points in E 3, where A is the number of points I m p r o v e d b o u n d s for halfspace r a n g e r e p o r t -
of S on the surface of the hull. Kirkpatrick and Seidel ing. The halfspace range reporting problem is to
obtained a deterministic algorithm f.)r planar convex build a data structure for a set S of points, so that
hulls with the same time bound [KS86] given a query halfspace the points of S in the half-
A n a l g o r i t h m for t r i a n g u l a t i o n of simple space can bereported quickly. The new bound for
polygons. Triangulation of a simple polygon is a k-sets is applied in this paper to sharpen the analysis
preliminary step for many algorithms on such poly- of the algorithm of [CP86] for halfspace range report-
gons. For many problems, triangulation is the only ing. It is also used to analyze two new algorithms
step not known to require O(n) time. Therefore, tri- for that problem. One algorithm is shown to require
angulation is a particularly intriguing problem. Tar- expected O(n [d/2]+~) preprocessing time, and in the
jan and Van Wyk [TVW88] have given an algorithm worst case O(n Ld/2j+e) storage. The resulting query
for this problem requiring O(nloglog n) time in the time is O(A+log n), where A is the size of the answer
worst case. Their algorithm is quite complex. This to the query. These resource bounds apply for any
paper gives a simple Las Vegas algorithm that re- fixed e > 0, and the constant factors in the bounds
quires O(n loglog n) expected time. This algorithm depend on d and e. Another algorithm requires O(n)
has recently been refined to expected O(nlog* n). storage, O(n log n) expected preprocessing time, and
[CTVW88] allows queries to be answered in O(A + n 1+~-'~) time,
' r i g h t b o u n d s for k-sets. Let S C E d contain n where ,'/= 1/(1 + (d - 1)Ld/2J). The algorithm is a
points. A set S ~ c S with IS'I = j is a j-set of S if variant of Haussler and Welzl's [HW87]. Their query
there is a hyperplane that separates S ~ from the rest time is O(nl+~-~'), where q' = 1/(1 + d i d - 1)). (This
of S. A j-set is a (_<k)-set i f j <_ k. Let g~(S) be the is independent of the answer size, however.)
number of (<k)-sets, and let gk,d(n) be the maximum
value of gk(S) over all n-point sets S c E d.
This paper shows that 1.2 Outline of the paper

gk,d(n) = O(nLdl2]k[d/2]), The remainder of this section gives an informal dis-


cussion of the ideas in this paper, followed by a de-
as n/k ~ or, for fixed d. The proof technique for the scription of the formal framework used in the the-
combinatorial bound can also be applied to give (_<k)- orems. The next section gives a proof of a general
set bounds for independently identically distributed theorem that implies the asymptotically tight bound
points. For example, if the convex hull of such a for k-sets. Some applicationS of the k-set bound to
set of points has O(f(n)) expected facets, then the halfspace range reporting are shown. The proof tech-
expected number of (_<k)-sets is O(kdf(n/k)). The nique for the k-set bound is closely related to ideas
proof technique employed for the improved bounds is used in §3 for convex hulls, line segment intersection,
an instance of a "probabilistic method" [ES74]. and triangulation of simple polygons.
The concept of a k-set is a generalization of the
concept of a convex hull facet, which can be viewed 1.3 The ideas
as a d-set. The new bound is a generalization of the
known upper bound O(n [d/2j) for the number of facets To give an impression of the ideas used in this work,
of a convex polytope with n vertices. Indeed, the new versions of them will be given for some simple cases.
bound follows from this polytope upper bound. Conside~" a set S of n real numbers, and random
Goodman and Pollack showed [GP84] that for k < R C. S .'if size r. The subset R gives a l o t of in-
n/2, the bound gk,2(n) < 2 n k - k 2 - k holds. This was formation about S. For example, let S> be the set of
improved to gk,2(n) _<nk in [AG86], and that bound numbers in S that are greater than all the numbers in
was proven tight in the same paper. (See also [Wel86] R. Then w i ~ high probability, the number of values
for a related result for the plane.) in S> is small, that is, no more than O(log r)n/r. Sim-
Cole and others [CSY84] showed that gk,3(n) = ilarly, with high probability, the number of values in
O(n2k), and Chazelle and Preparata [CP86] showed a co~l'esponding set S< is O(logr)n/r. Indeed, with
high probability, b o t h sets are small. These state- Similar observations are also useful in other inter-
ments are "tail estimates," bounds on the probabil- section problems, such as the problem of determining
ity t h a t a r a n d o m variable will exceed a certain value. the n u m b e r of intersecting pairs in a set of n line
Another kind of statement about a r a n d o m variable segments in the plane. Suppose S is a set of n line
is its expectation. For example, the sizes of these sets segments in the plane, and R C S is a r a n d o m subset
are expected O(n/r). of S of size r. T h e set R induces a subdivision of the
Both of these kinds of statements can be general- plane t h a t will be called the "vertical visibility m a p , "
ized to higher dimensions. The convex hull of a set in or ~ ( R ) . This subdivision is defined as follows: for
E 1 is just the interval between the largest and small- every point p t h a t is either an endpoint of a segment
est values in t h a t set, so the above statements are in R, or an intersection point of two segments in R,
claims a b o u t the convex hull of R. Now suppose S extend a vertical segment from p to the first segment
is a set of points in the plane. Let e be an edge of of R above p, and to the first segment of R below p. If
the convex hull of R, and let l be the straight line no such segment is "visible" to p above it, then extend
containing e. The points in the halfspace bounded a vertical ray above p, and similarly below. The re-
by l, and not containing R, will be said to be beyond sulting vertical segments, together with the segments
the edge e. An analog to the first claim above is this: in R, form a subdivision of the plane into simple re-
with high probability, for every edge e of the convex gions t h a t are generally trapezoids. (So "l)(R) is also
hull of R, the n u m b e r of points of S t h a t are beyond known as the "trapezoidal diagram" of R.) No seg-
e is O(log r)n/r. This kind of tail estimate has been ment of R intersects the interior of any such region Q.
the basis of several previous applications of r a n d o m This implies (intuitively) t h a t few segments of S in-
sampling to computational geometry. tersect the interior of Q. This gives a way to "divide-
It is also possible to generalize the second kind of and-conquer" the line segment intersection problem.
claim a b o u t R and S, as follows: if Se is the set of Note t h a t the n u m b e r of regions in the subdivision
points in S that are beyond e, then the expected value is proportional to r ÷ A t, where A t is the n u m b e r of
of the sum of ISel, over all edges e of the hull of R, intersecting pairs of segments in R. It is easy to show
is O(n). Thus a tighter bound is obtainable a b o u t t h a t the expected value of A ~ is AO(r2/n2).
the sizes ISel on average t h a n can be m a d e a b o u t any For the combinatorial problem of bounding the
particular one. This observation is a key new idea in n u m b e r of k-sets, it is helpful to use a kind of con-
this paper. verse to the above relation between the convex hulls
Why is this kind of bound useful in computing con- of point sets R C S C E 2. T h a t is, let two given
vex hulls? The idea is to take a large r a n d o m subset points in R define some line l, t h a t divides S into j
R C S, recursively compute the convex hull of R, and and n - 2 - j points. T h e n u m b e r of such pairs of
then determine the result of adding the remaining points of S is the same, up to a constant, as the num-
points S \ R. Roughly speaking, the changes owing ber of j-sets of S. If j < n/r, then the probability
to the remaining points can be expected to be "lo- is not very small t h a t none of the other points of R
cal," and require a small amount of work per point. will be chosen from the j points on one side. In such
Indeed, for a given point p E S \ R, the n u m b e r of a case, the two given points of R will define an edge
such changes is proportional to the n u m b e r of edges of the hull of R. Roughly speaking, if the n u m b e r
of the hull of R t h a t p is beyond. The total of these of (<n/r)-sets of S is too large, then the n u m b e r of
changes is therefore expected O(n), as in the above edges of the hull of R will be greater t h a n r, which
discussion. of course cannot happen. T h a t is, a b o u n d on the
The convex hull algorithm will be given as an al- complexity of the convex hull of R implies a bound
gorithm for determining the intersection of a set of on the n u m b e r of ( < n / r ) - s e t s of S. It is this relation-
halfspaces. This problem is linear-time equivalent to ship t h a t is exploited in this paper.
the convex hull problem via well-known duality m a p -
pings. The above statements translate as follows to 1.4 The formal framework
halfspace intersections: suppose S is a set of n half-
spaces, and R C S is a r a n d o m subset of S of size r. T h e ideas in this p a p e r can be applied to a variety of
Suppose v is a vertex of the convex polytope t h a t is geometric structures. To aid and to show this gener-
the intersection of the halfspaces in R. Then the ex- ality, a formal and abstract framework for geometric
pected n u m b e r of halfspaces of S t h a t do not contain computations is useful. This framework is much the
v is O(n/r). Intuitively, because all the halfspaces of same as t h a t in [Cla87].
R contain v, it is likely t h a t most of the halfspaces of Let S be a set of n subsets of E d, which we will
S do as well. t e r m objects. T h e set S corresponds to the input to
the computation, and could be a set of points (single- 2 I m p r o v e d b o u n d s for k-sets,
ton sets), line segments in the plane, halfspaces, or
balls. Let Y- be a set of subsets of E d, which we will
and applications
term regions. The regions will be defined by the ob-
2.1 The bound
jects in some way for an application. For convex hulls
in the plane, the objects are points, and the regions Rather than prove an upper bound for k-sets only, a
are halfplanes, so the regions defined by the objects much more general result will be proven, from which
are those open halfplanes that are bounded by lines the k-set bound will follow as a corollary.
through pairs of the points. The notion of "defined" Before the main theorem, a technical lelama involv-
is formalized as follows: for integer b, let 6 be a rela- ing the function pj (n, r, b) = (n-j-b
r-b ) / (r)"
n This func-
tion between Y- and Oi<bS i. A region F is "defined" tion is a lower bound on the probability that F E ~R,
by X E S i if F 6 X , that is, if the ~i relation holds given that F E Y-~, when R is a random subset of S:
between X and F. The set of regions y-s defined by F E FRO if and only if some b or fewer objects defining
the objects in S is formally F are in R, and the CF(S) = j objects meeting F are
not in R. There are at least (-~j;b) subsets of S with
Y-s = { F E Y- I F~hX, X E S i, i < b}. these properties, out of (rn) subsets.
In problems of construction, the desired computa- L e m m a 2.1 For nonnegative integers n, b, k > 1,
tion is this: determine all F E Y's such that F n s is r = [n/kJ, and j _< k, the bound
empty for all s E S. For convex hulls, this is the set
of all open halfplanes t h a t contain none of the input 1 1
points. For Voronoi diagrams in the plane, the re-
pj(n,r,b) >
gions are open disks or halfspaces, and the empty
holds as k / n ---~O.
members of Y-s are Delannay disks or empty half-
spaces as for the convex hull. The relation ~i has Proof. Omitted in this abstract. [3
a 3-tuple of points defining the open disk bounded
by the circle circumscribing the points, and a 2-tuple T h e o r e m 2.2 With the notation of §1.4, let R C S
of points defining the two open halfspaces bounded be a random subset of size r = [n/kJ, with k > 1.
by the line through the points. For trapezoidal dia- Then
grams of line segments in the plane, the objects are
line segments, the regions are open "trapezoids," and ly-,gl <_ kbe3/2E[17 l]( 1 + O(k/n)),
the parameter b is four. The relation 5 would define osj<k
the trapezoids using various configurations of four or
as k / n ~ O.
fewer segments, as such trapezoids would appear in a
trapezoidal diagram. Proof. For F E Y's, let IF be the indicator function
While the desired regions in construction problems for the event F E ~R, so IF = 1 when F E ~R and 0
have no intersections with S, it also will be useful otherwise. Then IPRI = ~ r e T s I f , SO
to consider the number of objects of S that have
nonempty intersection with a region F E Y's- Let
CF(S) denote this number, and similarly CF(R) de-
E[IS I] = E[~ IF] = ~ EilF]
FEZ"s F E Fs
notes the number of objects of R that have nonempty
intersection with F. For a given integer j , let Y-~ de- = Z P r o b { F e 7/~}
note the set o f f E y - w i t h CF(S) = j. Thus in j>0
construction problems, the set Y-~ is desired. Note Fsg
t h a t with S and Y- as for convex hulls, as mentioned >_ Z p j ( n , r , b ) l ~ l
above, the set Y-~ is closely related to the set of j-sets j_>o
of S, and ly-~'I is the number of such sets.
> ~17~llkbeai2Cl+O(kln)),
For R C S, the sets Y-R and Y-R / are defined analo- O<_j<_k
gously: Y-R is the collection of all regions F such that
F h X for some X E R b, and F E Y-~ has CF(R) = j. The first inequality is from the discussion above of
We may have some other requirements on the pj(n, r, b). The second inequality uses the technical
relation, but the essential condition on this relation !emma. D
is t h a t Y's° contains all the regions of interest in the The theorem gives a bound on the quantity 0k,d(n),
given problem. defil~cd as follows: suppose S C E d is a set of n

4
points, ~ is the set of halfspaces in E d, and 7s" is This algorithm is conveniently described by putting
the set of halfspaces bounded by hyperplanes that the range query problem into a dual form, using the
are afiine hulls of points in S. Then ~k,d(n) is the transform P described in [Ede87, §3.1], to which the
maximum, over all such S, of ~0<_i_<k J.~'~J" A member reader is referred for background. The transform P
maps non-vertical hyperplanes in E d to points in E d,
of t~ will be called a 3"-facet of S.
and vice versa. (Here "non-vertical" means that the
C o r o l l a r y 2.3 hyperplane does not contain a vertical line. A vertical
line is a translate of the xd-axis.) Under this duality,
gk,d(n) = O(ntd/2J k[d/2]), incidence and order are preserved, so that point p is
on plane h if and only if P(h) E P(p), and also p is in
a8 n --+ oo. the halfspace h + if and only if P(H) E P(p)+. In this
setting, a point set S gives rise to an arrangement
Proof. It is easy to show, using the results of
of hyperplanes 4 by the duality transform. Given a
[Ede87, §3.2], that .0k,d(n) = O(gk,d(n)) as n ---, oo.
query halfspace h-, the answer to the query is the set
To apply the theorem to .0k,d(n), S is a set of points
of all hyperplanes P(p) in 4 such that P(h) is below
in E d (or more precisely, a collection of singleton sets
them, that is, D(h) e P(p)-.
of points of Ed). The set of regions Y" is the set of
open halfspaces of E d, and with b -- d, the ~ rela- We will be interested in the set of all points that are
tion is defined as follows: for X E S b, let F~fX when below no more than k hyperplanes of 4, for some k.
These points correspond to the set of all query half-
F is bounded by the affine hull of the points in X.
spaces h- whose answer set has no more than k
The upper bound for ~k,d(n) follows, using the upper
points. The lower surface of this set of points is called
bound O(r Ldl2j) for JT~°J,here the number of facets of
a k-level. It is not too hard to see that the number of
a polytope with r vertices [Ede87, §6.2.4] D
vertices of cells above a k-level is bounded by ~k,d(S).
L e m m a 2.4 This value asymptotically bounds the total complex-
ity of the cells of .4 above the k-level.
gk,d(n) = ~ ( n Ld/2j k [d/21), The main idea for the algorithm range search is
the following generalization and restatement of Lem-
a8 n ----~ c(~.
mas 5.3 and 5.4 of [Cla87].
Proof. Omitted. Cyclic polytopes realize the L e m m a 2.5 Suppose S C E d in general position,
bound, as can be shown using the techniques of the with [S I -- n. Let R C S be a random subset of size r.
theorem, or constructively [Ede]. D Then there is an integer j, = O(logr/loglogr) and
a value oe, = O(logr/r), such that with probability at
2.2 Algorithms for Half-Space Range least 1/2, the following holds: the fi-level of R is be-
Queries low the n/(r - d2)-level of S, and strictly above the
oe,n-level of S.
In this section, the new k-set bound gives an improved
storage bound for a deterministic algorithm for half- Proof. Omitted. D
space range search in E 3. Two new randomized algo- Let us assume that r is constant (though "suffi-
rithms for range search in E d are given and analyzed. ciently large"). A given random subset can be tested
Given n points S C E d, the halfspace range search for satisfying these conditions in O(Oj,,d(r)n) time,
problem is to build a data structure for S so that which is O(n) for fixed r. Thus, by repeatedly sam-
given a query halfspace h*, the set of points h* n S pling, a suitable sample can be found in two trials,
can be reported quickly. In [CP86], Chazelle and on the average.
Preparata show that in the case d -- 3, a data Suppose that for a query halfspace h-, the point
structure requiring O(n(logn)S(loglogn) 4) storage P(h) is below the j.-level of R, where R is now a
can be constructed that allows queries to be an- suitable sample. (We will assume that the query half-
swered in O(A + logn) time, where A is the num- space contains the - o ¢ point of the xd-axis. Symmet-
ber of points in the answer to the query. Theo- ric processing must be done for positive halfspaces.)
rem 1 of that paper implies that if ~k,3(n) = O(nkf~), Then P(h) is below the n / ( r - d2)-level of 4, and the
then the storage required by their data structure is query has answer size A = f~(n). Here sophistication
O(n(log n) 2(B-1)(log log n)/~-l). The bound given here doesn't pay, and linear search through 5 determines
implies that O(n(log n) 2 log log n) storage is sufficient. the answer in O(n) = O(A) time. On the other hand,
The upper bound on ~k,d(n) gives a bound on a ran- suppose P (h) is above or on the ./.-level of R. Then we
domized algorithm for halfspace range search in E d. will recursively search a data structure for S, which is

5
built as follows: triangulate the cells of the arrange- The resulting query time is [HW87, AHW87]
ment of R t h a t are above the j.-level of R. By the O(n~), where fl = 1 - 1/(1 + B), and
results of [Cla87], this yields O(0j.,d(r)) simplices, as
r ~ oo. For any simplex T, let S(T) be the set of l /r[d/2JJ[*d/2]"
hyperplanes of A that are above any point of T. Re~ B= °gt d-1 )
- log(d + 1)a,'
cursively build a search structure for S(T), for all
such simplices T. To answer a query when D(h) is so /3 = 1 - "7 + E, where "7 = 1/(1 + ( d - 1)[d/2]),
above the j,-level of R, search the data structure for and ~ > 0 is independent of n and decreasing in r.
the simplex containing P(h). The preprocessing time is expected O(nlogn) and
The resulting d a t a structure has a query time of the storage is O(n), as is easy to verify.
O(A + log n), and a storage bound B(n) satisfying
B(n) ~ O(n) +O(Oj.,d(r))B(v~.n ) 3 L i n e S e g m e n t Intersections,
O(n) + O(rtd/2Jj.[d/2])B(O(log r)n/r), Convex Hulls, and Triangula-
as r ~ oe. The given b o u n d O(n [d/2j+~)follows, using tion
the k-set results above. The expected time required
by the algorithm to build the d a t a structure satisfies 3.1 The probabilistic theorem
the same bound. This section gives a theorem showing t h a t random
L e m m a 2.5 and the upper b o u n d on Ok,d(n) can sampling can be used effectively for divide and con-
be applied in another way to obtain an algorithm quer. For an introduction to closely related ideas, see
for halfspace range queries t h a t requires less storage [Cla87]. A corollary is also given t h a t combines the
and preprocessing, at the cost of a longer query time. results of [Cla87] with the theorem.
Consider the arrangement .~'. 3*
of hyperplanes defined A technical lemma will be needed regarding a func-
by the (<j.)-facets of R. These hyperplanes are the tion related to the hypergeometric probability distri-
duals of the vertices of the cells of ~q that are above or bution. This function is pj(n, r, b) = ( n-b-j )/(,),n as
on the j.-level of A. The cells of this arrangement in- in L e m m a 2.1.
duce a partition of S. For each cell C, we recursively
build a d a t a structure for the points C n S. Lelrnma 3.1 For integers i,j,r,n,b with r < n/2,
In answering a query, as above we can tell quickly i > O, and j >_ in/r, let ri = [ ( r - b)l(i + 1)] + b.
when the size of the answer is more than n/(r-d2), in Then
which ease a naive algorithm should be used. If the
answer is smaller, then the data structure must be pj(n, r, b)/pi(n , ri, b) <_ (i + 1)be-i/2e2/3(1 + O(i/r)).
used. The cells of the arrangement ~q'. ~, are examined.
Some do not intersect the query hyperplane, and so Proof. Omitted in this abstract. B
contribute either all or none of the points they contain To state the main theorem needed, some terminol-
to the answer. The remaining cells, that do meet the ogy and notation in addition to that in §1.4 is useful:
query hyperplane, must be examined recursively. Call the relation $ a function up to permutation if
From previous analysis [HW87, AHW87], there are for every F E 5rs, the set {X E Ui<bSi I F6X} is
two key properties of this algorithm that imply a either empty, or has elements that are i-tuples from
bound on the query time. The first is that the number the same /-element TF C S, for some i < b. With
of cells cut by a given query hyperplane is O((gJ*'d(r)] this condition on ~5, any F E 5rs is in 5rR if and only
~, d - 1 I]'
the complexity of the subdivision of the query hyper- if TF C R. This requirement on $ makes necessary
plane by the hyperplanes of .~j,. The second property the ~ondegeneracy assumption for the inputs in the
is t h a t the total number of points in the cells exam- ~lgorithmic applications.
ined for a query is no more than (d + 1)a.n. Observe A function w (j) on the integers will be called sub-
t h a t when the dual point of a query hyperplane is multiplicative when w(ij) < w(i)w(j) for sufficiently
above the j.-level of R, that point is a convex combi- large ij. Note that w(j) = j l o g j and w(j) = jm are
nation of d + 1 vertices of ~ above the j,-level. This submuh: plicative.
implies that the query halfspace h - is contained in Let F c S, a predicate on the regions. For R C S,
the union of d + 1 halfspaces bounded by (<j,)-facets let T~ (/? ~ dcnote
of R. This union of halfspaces is also a union of cells
of Aj.. Hence the total number of points in the cells
examined for the query is no more than (d + 1)a.n. FE~nP

6
T h a t is, Tw(R) is the total work done for Y'~, when The first inequality uses the technical lemma and the
work is done only for regions in P, and w ( C F ( S ) ) condition that w ( j ) is nondecreasing, and the second
work is done for the CF(S) objects of S meeting F E inequality uses Equation 1, the definition of rn(r),
jr~ M P. Note that if w ( j ) is the function l(j) = 1 for and the condition that m(r) is nondecreasing. The
all j, then T~(R) = ]3r° f3 P]. theorem follows from the observation that the sum
converges. D
Theorem 3.2 With the terminology off §1.4 and The following is a combination of the above theo-
above, suppose the relation 5 is a function up to per-
rem and Corollary 4.2 of [Cla87].
mutation. Let re(r) > E[lY'~ n PI] for random R with
IRI = r. Suppose re(r) and w ( j ) are nondecreasing, Corollary 3.3 Under the conditions off Theorem 3.2
w ( j ) is submultiplicative and w ( j ) = jo(1) as j ~ oo. and for any q, there exists Zr,q = O(q + blnr) as
Then r ~ cx) such that with probability 3/4 - 2 -q, both off
E[T~(R)] _< O ( w ( n / r ) m ( r ) ) these conditions hold:
as n ~ oo, and the constant factor does not depend
on r. c (s) < O(n/r)g(r)
Fe~
T h a t is, the average work done for a member of
tR° is O ( w ( n / r ) ) . In particular, when w ( j ) ----j, the and
theorem implies that the average number of objects m a x C F ( S ) < Zr,qn/r.
of S meeting f E Y'~ is O(n/r). FEaR
Proof. It will be convenient to put Y's and Y'R rela-
tive to P, that is, to change the definitions to require Proof. By Markov's inequality, the probability
inclusion in P. It will also be convenient to assume that Ti(R) exceeds four times its mean is no more
that 5 is defined only for S b and not for i < b. If the than 1/4. From Corollary 4.2 of [Cla87], the proba-
resulting restricted theorem is then applied b times, bility that the second condition fails is at most 2 -q.
the general theorem is proven, with a term of b in the D
constant factor.
As in the proof of Theorem 2.2, let I f = 1 when 3.2 Line s e g m e n t i n t e r s e c t i o n s
F E Y'~, and 0 otherwise. Then
Given a set S of n line segments in the plane, the
Tw(R)= wO)ZF, problem of determining all intersecting pairs of seg-
j_>O,Feh~ ments in S is reducible to the problem of constructing
3)(S), the vertical visibility map of S. (This map was
SO defined in §1.3.) In this section, two algorithms are
E[T~,(R)] = E E[w(j)IF] given for the problem of constructing ~ (S). The first
j_>O,FeT~ algorithm will be called the "basic" algorithm, and
requires O ( A + n logn) expected time. The second
= E w ( j ) P r o b { F e ~R}" algorithm is a refinement of the basic algorithm that
j_>0,ge~r~ has the same time bound and requires O(n) space in
the worst case.
Since 5 is a function up to permutation, the proba- The basic algorithm is an instance of the "random-
bility that F e Y'~ given that F E Y'~ is pj(n, r, b) = ized incremental construction" procedure of Clarkson
( n-b-j n Letting nt = (n - b)/(r - b), we have
r-b )/(r)" and Shor[CS88]. The line segments are added in ran-
dom order, one by one, to a set R. The visibility
E[Tw(R)] = E w ( j ) p j ( n , r, b)13r~l (1) graph ~)(R) is maintained as R grows. In addition,
j_>0 a conflict graph is maintained. The nodes of the con-
= ~w(j)pj(n,r,b)]~r~l flict graph correspond to the line segments in S \ R,
i_>O and the interiors of the visibility cells of ~ (R). There
int<_j<(i+l)n' is a graph edge between a segment and a cell interior
when they intersect. T h a t is, there are edges between
O(~d)(n/r)) E w ( i -~- l ) -(i- ~-4-- -1)
- pbj ( n , ri, bllTs31
D

each segment s and all the cells of ~ (R) that not in


i_>O (R O {s}). Equivalently, for each segment s there is
int<j<(i+l)n I
a list L8 of conflicting cells, and for each cell Q, there
< O(w(n/r)m(r)) Ew(i+ 1)(i + 1) b is a list LQ of conflicting segments. When adding a
el~2
segment s, the cells that must be deleted are found
- -

i>o
by examining LB. These cells are then split up by s, size of the input is less than r = n 1/2, then the ba-
each into at most four pieces. The edges in the con- sic algorithm is used to determine the intersections of
flict graph to the new cells that result from s can be that input.
found by examining the lists LQ, for the cells Q E LB. The following lemma describes the negations of the
From the Randomized Incremental Construction desired conditions on R and the sets Qs, and implies
theorem of [CS88], the time required by this proce- that these desired conditions hold with high proba-
dure is proportional to n + A + n ~l<_r<n/~ m(r)/r2' bility.
where m(r) is the expected number of cells in 3)(R),
for a random R C S of size r. The following lemma Corollary 3.6 Suppose S is a set of n line segments
implies the time bound. in the plane, with A pairs of these segments inter-
secting. Let R be a random subset of S of size r.
L e m m a 3.4 Suppose S is a set of n nondegenerate There exist constants Kmax and Ktot such that, with
line segments in the plane, with A intersections. Let probability at most 1/4,
R be a random subset of S of size r. The number
of regions in 3)(R) is no more than O(r + A'), where Ktot(n-4-Ar/n) < ~ IQsL,
A ~ is the number of intersecting pairs of segments of Qe'Y(R)
R. The expected value of A' is Ar(r - 1)/n(n - 1).
Therefore m(r) = O(r) + O(Ar2/n2). and with probability at most 1/n l°,

Proof. The first statement is an obvious conse- Kmax(n/r)logn < max IQsi.
QeT(R)
quence of the fact that 3)(R) is a planar map.
Let X s be the set of intersection points of S, and Proof. The proof follows that of Corollary 3.3. [3
XR the set of intersection points of R. For x E Xs, If the second inequality does not hold, the sample
let Ix = 1 when x E XR, and 0 otherwise. Then R will be said to be "good." The global algorithm
A' = ~zeXs Ix, so will repeatedly take random subsets R C S of size
r -- n 112 until a good subset is found. Since the subset
E[A'] = ~ E[Ix] = Z P r o b { x E X R } is good, all the sets Qs have at most O(n 1/~ log n)
• eXs ~eXs segments. At the next recursive level, all the input
sets will have O(log2 n) segments. At this level, the
=
basic algorithm is used. Thus, no intersections are
determined using the basic algorithm for sets larger
since x E XR if and only if the two line segments that than n 1/2, and a space bound of O(n) holds.
meet at x are both in R. D It remains to describe the means by which the
The time bound for the construction and the ex- global algorithm determines the sets Qs, having com-
pression for re(r) give the following theorem. puted 3)(R). Two methods are used to do this, and
are run concurrently until one is done. Both re-
T h e o r e m 3.5 For a set S of n line segments having quire O(n) space in the worst case. Method I re-
A intersecting pairs, the vertical visibility map 3)(S) quires O(A-4-nlogn) expected time when A < n 3/2.
can be computed in O(A + n log n) expected time. Method II requires O(A + n 3/2) expected time, which
is O(A+nlogn) for A > n 3/2. Thus this step requires
The space bound for this algorithm is certainly O(A-4-n log n) expected time, regardless of the value
O(nlogn + A) on the average. To do better in the of A.
space bound, we use the above "basic" algorithm as Method I requires that additional information be
a major step in a "global" algorithm. The global al- obtained while the basic algorithm is computing
gorithm is recursive, but has a recursion depth of 3. 3) (R). This information is the location of the end-
The idea is this: the algorithm takes a random subset points of the segments of S in 3)(R). Since such in-
R of S of size r = vrn, and determines 3)(R) using formation is maintained in computing 3)(S) using the
the basic algorithm. The algorithm next determines basic algorithm, the O(A-4-nlogn) time bound ap-
Qs for all Q E 3) (R), where Qs is the set of segments plies, and the space needed is O(n).
of S that intersect Q. The algorithm also checks that With this additional information, Method I walks
these sets satisfy certain cardinality conditions. (The along each 1;<nesegment of S, determining the regions
determination of Qs is discussed below.) The algo- meeting that segment. (It is assumed that each re-
rithm then recursively determines the intersections in gion has no more than 4 neighbors, which holds if no
each set Qs in turn. If on some recursive step, the three line segment endpoints of S are on the same
vertical line. If this does not hold, a random rota- 3.3 T r i a n g u l a t i o n of s i m p l e p o l y g o n s
tion of the coordinate system ensures it with proba-
It is enough to determine vertical vertex-edge visibil-
bility 1.) If at any time, the number of line segments
ity relations, that is, the map V(S) where S is the
of S meeting a region of ~)(R) exceeds Kmaxn/r log r,
set of edges of the simple polygon P. The algorithm
or the total over all regions of ~ (R) of the number
is: pick random R C S of size r -- n/log n, and com-
of line segments meeting that region exceeds Ktotn,
pute "Y(R) in O(rlogr) = O(n) time. Determine Qs
then the method restarts with another sample. When
for every Q E "1)(R) by walking along P, maintain-
A < n 3/2, so that Ar/n < n, the probability of a
ing the current cell Q along the walk. Since each cell
restart is less than 1/2 by the above lemma, so that
Q has O(1) edges, this can be done in time propor-
two attempts on the average give the desired subset
tional to ~QeV(R)]Qsl. (This assumes that no three
and information. Each a t t e m p t requires O(nlogn)
vertices of P fall on the same vertical line, which can
expected time, independent of whether the attempt
be assured with probability 1 by a random rotation
succeeds, so the desired time bound holds.
of the coordinate system.) On the average, each set
Method II simply checks every line segment in Qs contains O(logn) edges, and it is enough to de-
S against every region in ~ ( R ) , restarting if ~ ( R ) termine the visibility relations among the edges in
is shown to have a region meeting more than each Qs. This can be done in O(lognloglogn) time
Km~xn/r log r segments of S. (That is, if R is not on average for each Q, so the total time needed is
good.) If R is good, then the sets Qs and the inter- O(n/log n)O(log n log log n), or O(n log log n).
sections therein are determined for each Q E ~ ( R )
in turn. In at most two samples, on the average, this
strategy will succeed. Method II requires O(nr+nA ~) 4 A convex hull algorithm
time, where A ~ is the number of intersections of seg-
ments of R. The expected value of A ~ is O(A/n). The algorithm will be stated in terms of an equivalent
However, we are interested in the expected value of problem, that of determining the intersection P (S) of
A ~, under two conditions: that R is not good, and a set S of n halfspaces. It is assumed, without loss
that R is good. The probability that R is not good of generality, that a point p. in the intersection is
is so small that even if the expected value of A ~ is known.
n 2, we can ignore this possibility. Since for nonneg- The main idea of the algorithm is to quickly filter
ative random variable X and event B, we know that out those halfspaces in S that contain P(S) in their
E[XIB] < E[X]/Prob{B}, we have that the expected interiors. Such halfspaces are redundant, and remov-
value of X , given that R is good, is within a constant ing them gives a set of irredundant halfspaces S ~with
factor of A/n. Thus m e t h o d II requires O(nr + A) P(S) = P(S'). Furthermore, if the descriptive com-
expected time to obtain the sets Qs. plexity of P (S) is A, then there are certainly no more
than A halfspaces in S ~. (Dually, we are quickly re-
Similar reasoning shows that the expected value of moving points t h a t are inside the hull.)
~Qe~)(R)]Qs]log ]Qsl differs by only a constant factor The algorithm is based on an expected relation be-
from the expected value of that quantity, conditioned
tween the intersection P(R) of random R C S, and
on the "success" requirements for R of either Method
the intersection P(S). Suppose P ( R ) is split up into
I or Method II. Suppose we assume inductively that
simple pieces as follows: take some arbitrary (fixed)
the global algorithm requires O(IQs I log IQsl + AQ) plane H , and cut each face of P(R) into trapezoids
expected time for region Q E 1)(R), where AQ is the
using the translates of H that pass through the ver-
number of intersections in Q. This assumption, to-
tices of the face. Decompose P (R) into a set of sim-
gether with Theorem 3.2, readily yield the following
ple regions A(R), each region consisting of the con-
theorem. Note the "induction" has only 2 steps, so
vex closure of p. with a trapezoid from the cutting
multiplication by constant factors at each step gives
of the faces. For Q E A(R), let Qs denote the set
only a larger constant factor.)
of halfspaces of S that do not contain Q entirely in
their interiors. The following lemma is a corollary of
Theorem 3.2.

Theorem 3.7 Suppose S is a set of n nondegenerate Lemma 4.1 The expected value of ~Qea(R) IQsl is
line segments in the plane, with A pairs of these seg- O(n).
ments intersecting. The vertical visibility map "I)(S)
can be computed in O(n) space and O(A + n l o g n ) Proof. (Sketch) The objects are halfspaces. The
expected time. parameter b is five: one halfspace determines the face

9
containing a trapezoid, two more halfspaces deter- P(S) identified. If a region Q E A(R) contains no
mine the two edges on that face that determine the vertices of P (S), the polygons corresponding to faces
trapezoid, and two more determine the vertices of the of Q completely determine the structure of Q MP (S),
trapezoid. The regions are five-sided polyhedra with and this can be verified or disproven in O(IQsl) time.
p. as one vertex. D Thus the regions in A*(R) and their corresponding
Note that P(S) consists of the union of the regions halfspaces can be found in O(n)(logr + logA') time.
Q M P(S), over all Q E A(R). The halfspaces of S Now to consider the problem of estimating A, or
that contribute to a nontrivial region Q N P(S) are rather, of using only lower bounds for A. To do this,
contained in Qs. we determine A* (R) for a sequence of sample sizes,
Since every irredundant halfspace determines a ver- using at each step an estimate A* of A. At each
tex of P(S), we need not consider all the regions step, only some of the halfspaces may remain, since
Q E A(R), only those that contain at least one vertex some may be eliminated by not being in some Qs for
of P(S). Call this set of regions A*(R). Certainly Q E A* (R). Suppose that at a given step, the total
A*(R) contains at most A regions. The following number of halfspaces in the sets Qs, for Q E A*(R),
lemma is a corollary of Theorem 3.2. is greater than half the number of halfspaces from
the previous step. Then the estimate A* is taken as
L e m m a 4.2 The expected value of ~]Qe,~*(R) [Qsl is the square of the maximum of the A* estimate and
AO(n/r), for sample size r. the A t value (as above), of the previous step. (If this
Proof. As in the previous lemma. The subset P of new estimate of A* is greater than the number of
Theorem 3.2 is here the subset of regions that contain halfspaces remaining, compute the intersection of the
a vertex of P(S). halfspaces by the algorithm of [CS88].) On the other
Now suppose the sample size r is fl(A2). Then hand, if at least half the halfspaces are eliminated,
the total number of halfspaces in the sets Qs, for the value of A* will remain the same.
Q E A*(R), is O(n/A). This observation provides It is easy to verify that the total amount of work
a fast means of filtering out redundant halfspaces, performed up to and during the period that A*
making two assumptions: we have a fast means of is a given value is O(nlogA*). Since logA* =
determining the regions A* (R) and the corresponding O(logA), the total work performed by the algorithm
halfspaces Qs, and we can obtain an estimate of A. is O(n log A).
We next consider these two problems.
The regions Q E A(R) and the corresponding Qs 5 Concluding R e m a r k s
can be readily obtained in O(nlog r) expected time
using the randomized incremental construction tech- As one more application of random sampling, the
nique discussed in [CS88]. In this technique, we add ideas of this paper can readily be used to obtain an
halfspaces one by one, maintaining the intersection of algorithm for point location in planar subdivisions
the halfspaces that have been added, and also main- that requires O(n log n) expected preprocessing, O(n)
taining, for each edge of the intersection, the set of space, and O(log n) query time.
(not yet included) halfspaces that do not contain that
edge in their interior. For the present application, we A c k n o w l e d g e m e n t s . It is a pleasure to thank Pe-
apply this technique by including r halfspaces, and ter Shor for helpful discussions. This work and [CS88]
then stop. The sets Qs are readily obtained from the began separately and converged, and it wasn't easy
sets maintained for the intersection edges. The time to allot results to papers. Thanks also to John Hersh-
bound follows readily from the analysis in [CS88]. berger for pointing out that the hypergeometric is not
To determine A*(R) from A(R), we must have a binomial, and for the finding an error in an ancestor
fast means of determining the regions Q that contain of Theorem 3.2. Thanks to Micha Sharir for helpful
no vertices of P(S), or conversely, the regions that comments, and for observing the need for nondegen-
contain only parts of faces or edges. This is done as eracy in Theorem 3.2. Thanks to John Hobby and
follows: let t be a face of a region Q E A(R), with Ellen Feinberg (n~ Silverberg) for helpful comments.
p. incident to t. Then the polygon Pt = t M P(S)
can be determined using the algorithm of Kirkpatrick
and Seidel [KS86] in time on the order of IQsl log At,
where At is the number of sides of Pt. All but two of
the sides of Pt correspond to faces of P (S), so that the
total time to compute all such polygons is expected
O(nlogAt), where At is the total number of faces of

10
References [EM88] H. Edelsbrunner and E. P. Muecke. Sim-
ulation of simplicity: a technique to cope
[AG861 N. Alon and E. Gy6ri. The number of with degenerate cases in geometric algo-
small semispaces of a finite set of points rithms. This proceedings, 1988.
in the plane. J. Combin. Theory Ser. A,
41:154-157, 1986. [ES74] P. Erd6s and J. Spencer. Probabilistic
Methods in Combinatorics. Academic
[AHW87] N. Alon, Press, New York, 1974.
D. Haussler, and E. Welzl. Partitioning
[GP841 J. E. Goodman and R. E. Pollack. On the
and geometric embedding of range spaces
number of k-subsets of a set of n points
of finite Vapnik-Chervonenkis dimension.
in the plane. J. Combin. Theory Ser. A,
In Proc. Third Symp. on Comp. Geome-
36:101-104, 1984.
try, pages 331-340, 1987.
[HW87] D. Haussler and E. Welzl. Epsilon-nets
[CEG+88] K. L. Clarkson, and simplex range queries. Discrete
H. Edelsbrunner, L. Guibas, M. Sharir, Comp. Geom., 2:127-151, 1987.
and E. Welzl. The complexity of many
faces in an arrangement of lines in the [KS861 D. G. Kirkpatrick and R. Seideal. The
plane. To appear, 1988. ultimate planar convex hull algorithm?
SIAM Journal on Computing, 15:287-
[Cla85] K. L. Clarkson. A probabilistic algorithm 299, 1986.
for the post office problem. In Proc. 17th
Annual SIGACT Syrup., pages 75-184, [TVW88] R. E. Tarjan and C. J. Van Wyk. An
1985. O(n loglog n) algorithm for triangulat-
ing a simple polygon. SIAM Journal on
[Cla871 K. L. Clarkson. New applications of ran- Computing, 17:143-178, 1988.
dom sampling in computational geome-
try. Discrete ~ Comp. Geom., 2:195-222, [We186] E. Welzl. More on k-sets of finite sets
1987. in the plane. Discrete ~ Comp. Geom.,
1:95-100, 1986.
[CP86] B. Chazelle and F. P. Preparata. Half-
[Yap88] C. K. Yap. A geometric consistency theo-
space range search: an algorithmic ap-
rem for a symbolic perturbation scheme.
plication of k-sets. In Discrete 8~ Comp.
This proceedings, 1988.
Geom., pages 83-94, 1986.

[cs881 K. L. Clarkson and P. W. Shor. Al-


gorithms for diametral pairs and convex
hulls that are optimal, randomized, and
incremental. This proceedings, 1988.

[CSY84] R. Cole, M. Sharir, and C. Yap. On k-


hulls and related problems. In Proc. 16th
Annual SIGACT Symp., pages 154-166,
1984.

[CTVW881 K. L. Clarkson, R. E. Tarjan, and C. J.


Van Wyk. A fast Las Vegas algorithm
for triangulating a simple polygon. This
proceedings, 1988.

[Ede] H. Edelsbrunner. Personal communica-


tion.

[Ede87] H. Edelsbrunner. Algorithms in Combi-


natorial Geometry. Springer-Verlag, New
York, 1987.

11

You might also like