
Admission Control for Wireless Networks

Cynara C. Wu
Dimitri P. Bertsekas
[email protected] [email protected]
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, MA 02139, USA
Abstract
With the population of wireless subscribers increasing at a rapid rate, overloaded situations are likely to become an increasing problem. Admission control can be used to
balance the goals of maximizing bandwidth utilization and ensuring sufficient resources
for high priority events. In this paper, we formulate the admission control problem as
a Markov decision problem. While dynamic programming can be used to solve such
problems, the large size of the state space makes this impractical. We propose an approximate dynamic programming technique, which involves creating an approximation of
the original model with a state space sufficiently small so that dynamic programming can
be applied. Our results show that the method improves significantly on policies that are
generally in use, in particular the greedy policy and the reservation policy. Much of the
computation required for our method can be done off-line, and the real-time computation
required is easily distributed between the cells.

Keywords
Cellular networks, admission control, dynamic programming.

1 Introduction
Efficient resource utilization is a primary problem in cellular communications systems. Resource issues include determining with which users to establish connections, and assigning
transmit power levels to connected users subject to acceptable signal quality. For most users,
the inability to initiate a call is perceived as more tolerable than the unexpected termination
of a call. As a result, admission control, in which users requesting a connection are not
automatically admitted even if resources are available to handle the connection, may be
necessary to ensure sufficient resources are available for handoffs and other higher priority
events.
The most common method of dealing with handoffs is to reserve a number of "guard"
channels exclusively for handoffs. New calls are admitted only if the number of available
channels exceeds the number of guard channels. Such systems were introduced in the
mid-1980s [HR86, PG85]. A great deal of work has been done on developing more complex
admission control schemes. However, many studies make a number of unrealistic, simplifying
assumptions, such as modeling handoff arrivals as Poisson processes that are independent of
new call arrivals [AAN96, Gue88, TJ92]. Other studies use more realistic models that
require extremely complex calculations [AS96, LAN97]. Several authors have noted that
while the problem is best modeled as a continuous-time Markov chain, the size of the state
space makes the problem difficult to solve [BWE95, SS97]. Chang and Geraniotis actually
use dynamic programming to solve their Markov decision problem, but their method does
not generalize well to larger problems [CG98]. All the above works have focused on systems
in which channels are orthogonal and all users require the same resources. In certain systems,
such as CDMA systems, different users may require varying resources. In addition, there
may be different classes of users with different resource requirements.
In this paper, we consider the problem of optimal admission control: given a particular
configuration of users of various classes in various regions, determine whether or not to
accept a new call request. We assume we have available an algorithm that can determine
for any distribution of users of various classes in various regions whether there is a feasible
power assignment satisfying the signal-to-noise requirements for all users, and if so, provides
a unique power assignment for the distribution. Our goal is to formulate the problem as
a Markov decision process and to provide a solution method that is general enough to be
widely applicable and can be implemented in real-time.
In Sec. 2, we develop a model for a multiple access cellular communications system and
formulate the admission control problem as a Markov decision process. While such processes
can be solved by dynamic programming, the size of the problem makes this impractical.
In Sec. 3, we consider an approximate dynamic programming solution. We then present
computational results in Sec. 4.

2 The Admission Control Problem


In this section, we develop a model for the admission control problem for a two-dimensional
system of cells with multiple user classes. We first provide a general description of the system
which we consider and then formulate the problem as a Markov decision process.

2.1 General Description


We consider a two-dimensional system of M cells in which users of C classes can
establish connections. Each cell contains a single base station. The jth cell, for j = 1, ..., M,
contains R_j regions. We assume each region of a cell is small enough so that the propagation
effects on all signals transmitted from users in a particular region can be assumed to be
equivalent. The total number of cell regions is R = Σ_{j=1}^{M} R_j, and we denote the cell regions
as r = 1, ..., R. (See Fig. 1.)

Figure 1: Model of a cellular system. A cell can be of any shape and is divided into regions that are
represented by the dashed lines. The base station is represented by the small solid circle near the middle.

If a user in region r, for r = 1, ..., R, transmits a signal with a power level w, the signal
strength received at the base station in cell j, for j = 1, ..., M, is a_rj w. The value a_rj is the
amount that a signal transmitted from region r is attenuated by interference, multi-path,
etc., by the time it is received by the base station in cell j. We assume that these values
can be obtained either empirically or by analyzing appropriate propagation models and that
they are available.
We assume that we have an algorithm that can determine whether any particular configuration of users in cell regions is feasible and, if so, provides a unique assignment of power
levels to each user. The system at any particular time is then characterized by the number of
users of each class in each region. We represent this information by a matrix n of nonnegative
integers:
    n = [ n_11  ...  n_1R
           .          .
           .          .
          n_C1  ...  n_CR ]

where n_cr is the number of users of class c in cell region r.
The arrival of connection requests in region r = 1, ..., R by users of class c = 1, ..., C
is modeled as a Poisson process with rate λ_cr. When a request arrives, a decision has to
be made regarding whether to admit the user or not. Given that an existing user of class
c is currently in region r, the probability that he moves to region r' is independent of the
regions the user has already visited and is given to be q_crr'. The probability that this user
ends his connection in the current region r is q_crt = 1 - Σ_{r'} q_crr' > 0. If a user of class c does
not move to any more regions, the length of the remaining connection time is exponentially
distributed with rate μ_c. Otherwise, assuming the user moves from region r to region r', the
length of time until this occurs is exponentially distributed with rate ν_crr'.
While we assume we have some information regarding the cellular network, such as the
call request arrival rate to the various regions and the probability distribution of traffic
movement, we do not assume knowledge of where a particular user is heading or a history of
its locations; however, our methodology extends to cases where such information is available.
In addition, we have assumed that a user's movement and remaining connection time are
independent of his previous movement and the previous length of connection. As a result,
the matrix n providing the number of users of each class in each cell region describes the
"state" of the system. We assume that we can determine this information either through
measurements or other means.
As requests for connections are accepted, existing connections are completed, and users
move from one region to another, the state of the system changes. The changes depend on
random elements such as user requests to establish connections, existing connections being
completed, and movements of users from one region to another, as well as on decisions that
need to be made, such as admission decisions. Costs or rewards can be associated with
admission decisions, as well as with lengths of connection times. The admission control
problem can therefore be viewed as a continuous-time Markov decision process. We describe
how to do so in the next section.

2.2 Formulation of Problem as a Markov Decision Process

In an infinite horizon Markov decision process with a finite set of states, S, the state evolves
through time according to given transition probabilities P_ii' that depend on a decision or
control from a set U that depends on the current state. Suppose transitions occur at times
t_1, t_2, .... If the system is in state x(t_k) = i after the kth transition and decision u(t_k) ∈ U(i)
is selected, then during the (k + 1)st transition, the system moves to state x(t_{k+1}) = i' with
given probability P_ii'(u(t_k)). The interval between transitions is referred to as a "stage."
During the kth stage, we incur a cost g(x(t_k), u(t_k)) δ_k, where g is a given function and
δ_k = t_{k+1} - t_k is the length of the kth stage.
The goal is to minimize over all possible decisions at each stage the average cost per unit
time. This cost function is of the form

    lim_{N→∞} (1 / E{t_N}) E{ ∫_0^{t_N} g(x(t), u(t)) dt }.

We assume the process is stationary, i.e., the transition probabilities, the set U of available
controls, the cost function g, etc., do not depend on the particular transition.
To formulate the admission control problem as a discrete-state infinite horizon Markov
decision process, we first define the state as a combination of a "cell configuration" and the current
event. We then provide the control space, the probability distribution of the interval between
transitions, the transition probabilities, and the cost function.

2.2.1 The State Space


The state consists of the following two components:
1. A primary state component, which describes the number of users of each class in each
region and is denoted by n.
2. A random state component, which describes the current random event and is denoted
by ω.
The set N of all primary state components is the set of all n for which there is a feasible
power assignment. We refer to an element of N as a feasible cell configuration.
The random state component describes the current random event and is of one of the
following three types:
1. "Arrival": a connection request by a new user of any class in any region.
2. "Departure": a connection completion by an existing user of any class in any region.
3. "Movement": a movement by an existing user of any class from any region to another.
We denote a particular random state component by

    ω = [ ω_11  ...  ω_1R
           .          .
           .          .
          ω_C1  ...  ω_CR ]

where ω_cr is equal to 1 if a user of class c in region r requests to make a connection, ω_cr is
equal to -1 if an existing user of class c in region r ends his connection, ω_cr = -1 and ω_cr̂ = 1
if an existing user of class c in region r moves to region r̂, and ω_cr is equal to 0 otherwise.
The set of all possible events is then

    Ω = Ω_a ∪ Ω_d ∪ Ω_m,

where Ω_a is the set of all events consisting of an arrival to the cell, Ω_d is the set of all events
consisting of a departure from the cell, and Ω_m is the set of all events consisting of a single
user moving from one region to another.

The state space S is composed of all possible combinations of primary state components
and random state components and is given by

    S = { i = (n, ω) : n ∈ N, ω ∈ Ω, and ω_cr ≥ 0 if n_cr = 0 },

where the last condition holds since departures and movements from region r by a user of
class c can only occur if the number of users of class c in region r is positive.

2.2.2 The Control Space


For each state i = (n, ω) ∈ S, the set of available controls U(i) depends on the random
element as follows.

1. Arrival: ω ∈ Ω_a. If the random element corresponds to a connection request, the control
consists of determining whether or not to admit the user. We have

    U(i) = {u_a, u_b},

where u_a indicates the requesting connection is admitted and u_b indicates the requesting
connection is blocked.
2. Departure: ω ∈ Ω_d. If the random element corresponds to an existing user completing
his connection, there is no decision to be made.
3. Movement: ω ∈ Ω_m. If the random element corresponds to an existing user moving from
one region to another, the control consists of determining whether or not to maintain
the connection. We have

    U(i) = {u_h, u_d},

where u_h indicates the attempted handoff is accommodated and u_d indicates the attempted
handoff is dropped.
There are essentially two types of controls: one in which the random event is accommodated and one in which it is not. Note that the control determines the
primary state component n' for the next state i'. The available controls and the resulting
primary state components are summarized in Table 1.

2.2.3 The State Transition Rates


For any cell configuration n ∈ N, the time until the next random event occurs depends
on the number of possible random events. The arrival of each of these events is a Poisson
process with the rates provided in Table 2.

    Random State Component   Available Controls             Next Primary State Component
    Arrival (ω ∈ Ω_a)        Admit user (u = u_a)           n' = n + ω
                             Block user (u = u_b)           n' = n
    Departure (ω ∈ Ω_d)      None                           n' = n + ω
    Movement (ω ∈ Ω_m)       Maintain connection (u = u_h)  n' = n + ω
                             Drop connection (u = u_d)      n' = n + ω̄, where ω̄_cr = min{ω_cr, 0}

Table 1: List of the possible random state components, the available controls, and the resulting cell
configurations according to the selected control, given a state i = (n, ω).

    Random State Component                           Rate               Total Rate
    Call request by user of class c in region r      λ_cr               Σ_{c=1}^{C} Σ_{r=1}^{R} λ_cr
    Call completion by user of class c in region r   n_cr q_crt μ_c     Σ_{c=1}^{C} Σ_{r=1}^{R} n_cr q_crt μ_c
    Movement by user of class c from r to r'         n_cr q_crr' ν_crr' Σ_{c=1}^{C} Σ_{r=1}^{R} Σ_{r'=1}^{R} n_cr q_crr' ν_crr'

Table 2: Possible next events for a particular configuration n ∈ N and the associated arrival rate. The
second column indicates the rate for the particular random event described. The third column indicates the
overall rate for all events of the type described.

The overall rate λ_n at which events occur starting from a configuration n is the sum of
the rates of all possible events and is given by

    λ_n = Σ_{c=1}^{C} Σ_{r=1}^{R} ( λ_cr + n_cr [ Σ_{r'=1}^{R} q_crr' ν_crr' + q_crt μ_c ] ).

For any state i = (n, ω) ∈ S, the control u ∈ U(i) determines the cell configuration n' for
the next state i' = (n', ω'), as seen in Table 1. Assuming the control takes effect immediately,
the transition rate from state i under control u ∈ U(i) is then λ_n'. We
denote the expected value of the time from the transition to state i under control u to the
transition to the next state as τ_i(u):

    τ_i(u) = 1 / λ_n'.
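As a concrete sketch of the formula above (our illustration; the function name, argument layout, and example data are assumptions rather than anything from the paper), λ_n can be computed directly from the rate parameters:

```python
def total_event_rate(n, lam, q_move, q_term, nu, mu):
    """Compute lambda_n = sum_{c,r} ( lam[c][r]
       + n[c][r] * ( sum_{r'} q_move[c][r][r'] * nu[c][r][r'] + q_term[c][r] * mu[c] ) ).

    n[c][r]         -- users of class c in region r
    lam[c][r]       -- new-call arrival rate lambda_cr
    q_move[c][r][s] -- movement probability q_crr'
    q_term[c][r]    -- termination probability q_crt = 1 - sum_{r'} q_crr'
    nu[c][r][s]     -- movement-time rate nu_crr'
    mu[c]           -- connection-completion rate mu_c
    """
    C, R = len(n), len(n[0])
    rate = 0.0
    for c in range(C):
        for r in range(R):
            # expected movement rate out of region r for one class-c user
            move = sum(q_move[c][r][s] * nu[c][r][s] for s in range(R))
            rate += lam[c][r] + n[c][r] * (move + q_term[c][r] * mu[c])
    return rate
```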

2.2.4 The State Transition Probabilities


Given a state i = (n, ω) ∈ S, the transition to the next state i' = (n', ω') consists of the
following two factors.

1. The next cell configuration n' is determined by the control u ∈ U(i) exercised for state
i.

2. The probability distribution of the next random event ω' depends on the cell configuration resulting from the current state i and control u. Given the next cell
configuration n', the probability of a transition from i = (n, ω) to i' = (n', ω') is the rate
at which the event occurs divided by the total transition rate:

    P_ii'(u) = { λ_cr / λ_n'                     if ω' ∈ Ω_a, ω'_cr = 1, c = 1, ..., C, r = 1, ..., R,
                 n'_cr q_crt μ_c / λ_n'          if ω' ∈ Ω_d, ω'_cr = -1, c = 1, ..., C, r = 1, ..., R,
                 n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'   if ω' ∈ Ω_m, ω'_cr1 = -1, ω'_cr2 = 1, c = 1, ..., C,
                                                    r_1, r_2 = 1, ..., R.                              (1)

We describe the state transition probabilities from a state i = (n, ω) to a state i' = (n', ω')
in more detail below for the various possible values of ω.

1. If ω ∈ Ω_a, the possible controls are u = u_a, in which case n' = n + ω, and u = u_b, in
which case n' = n. The probability distribution for the next event ω' then depends on
n' according to Eq. 1. The possible transitions are illustrated in Fig. 2 and described in
the following table.

    Control               Next Configuration   Next Event   Transition Probability from (n, ω) to (n', ω')
    Admit user (u = u_a)  n' = n + ω           ω' ∈ Ω_a     λ_cr / λ_n'
                                               ω' ∈ Ω_d     n'_cr q_crt μ_c / λ_n'
                                               ω' ∈ Ω_m     n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'
    Block user (u = u_b)  n' = n               ω' ∈ Ω_a     λ_cr / λ_n'
                                               ω' ∈ Ω_d     n'_cr q_crt μ_c / λ_n'
                                               ω' ∈ Ω_m     n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'

2. If ω ∈ Ω_d, there is no decision to be made and n' = n + ω. The possible transitions are
described in the table below.

Figure 2: Illustration of the transition from state (n, ω) when there is a connection request.
    Control   Next Configuration   Next Event   Transition Probability from (n, ω) to (n', ω')
    None      n' = n + ω           ω' ∈ Ω_a     λ_cr / λ_n'
                                   ω' ∈ Ω_d     n'_cr q_crt μ_c / λ_n'
                                   ω' ∈ Ω_m     n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'

3. If ω ∈ Ω_m, the possible controls are u = u_h, in which case n' = n + ω, and u = u_d, in
which case n' = n + ω̄, where ω̄_cr = min{ω_cr, 0}. The possible transitions are described
in the table below.

    Control                        Next Configuration   Next Event   Transition Probability from (n, ω) to (n', ω')
    Maintain connection (u = u_h)  n' = n + ω           ω' ∈ Ω_a     λ_cr / λ_n'
                                                        ω' ∈ Ω_d     n'_cr q_crt μ_c / λ_n'
                                                        ω' ∈ Ω_m     n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'
    Drop connection (u = u_d)      n' = n + ω̄          ω' ∈ Ω_a     λ_cr / λ_n'
      (ω̄_cr = min{ω_cr, 0})                            ω' ∈ Ω_d     n'_cr q_crt μ_c / λ_n'
                                                        ω' ∈ Ω_m     n'_cr1 q_cr1r2 ν_cr1r2 / λ_n'

2.2.5 The Cost Function


The goal is to select controls at each possible state that minimize the expected average cost
per unit time,

    lim_{N→∞} (1 / E{t_N}) E{ ∫_0^{t_N} g(x(t), u(t)) dt },

where g is a given function. We can rewrite this objective cost function as a sum of stage
costs:

    lim_{N→∞} (1 / E{t_N}) Σ_{k=1}^{N} E{G_k},

where

    G_k = ∫_{t_k}^{t_{k+1}} g(x(t_k), u(t_k)) dt

is the cost of the kth stage. We assume the function g has a component ĝ that depends
linearly on the length of time spent in a particular state and a component ḡ that does not
depend on the length of time spent in a particular state. In this case, the cost of the kth
stage is then

    G_k = (t_{k+1} - t_k) ĝ(x(t_k), u(t_k)) + ḡ(x(t_k), u(t_k)).

The first component can accommodate assigning rewards (or negative costs) for the amount
of time users are connected, while the second component can accommodate assigning costs
for not admitting a user.

2.3 Solving the Markov Decision Problem


In Sec. 2.2, we formulated the admission control problem as an average cost Markov decision
problem. We provided the state space S, the set of available controls U(i) for each possible
state i ∈ S, and the state transition probabilities P_ii'(u). The objective is to select, for each
state x(t_k) ∈ S resulting from the kth transition, decisions u(x(t_k)) ∈ U(x(t_k)) that minimize
the average cost per stage. In this section, we discuss applying dynamic programming (DP)
to solve the Markov decision problem.
Let π be any given stationary policy such that for any state i ∈ S, π(i) ∈ U(i) indicates
the control provided by the policy at state i. We denote the average cost per unit time under
policy π by v_π and the average cost per unit time under an optimal policy by v*. It can
be shown that for the admission control problem, this value is independent of the initial
state. As a result, states are compared using a "differential" cost. We denote the differential
cost of starting in state i relative to some reference state under policy π as h_π(i) and the
differential cost under an optimal policy as h*(i).
The expected cost for a single stage corresponding to state i under an optimal policy is

    v* τ_i(u),

where τ_i(u) is the expected length of the transition time corresponding to state i when control
u = π(i) is exercised. Similarly, the expected cost for a single stage corresponding to state
i when control u is exercised, which we denote G(i, u), is

    G(i, u) = ĝ(i, u) τ_i(u) + ḡ(i, u).

It can be shown that for the admission control problem, the vector h* and scalar v* satisfy
the following system of equations, known as Bellman's equation:

    h*(i) = min_{u ∈ U(i)} [ G(i, u) - v* τ_i(u) + Σ_{i' ∈ S} P_ii'(u) h*(i') ]   for all i ∈ S.   (2)
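For intuition, here is a minimal sketch of relative value iteration, one standard way to solve equations of the form of Eq. 2, applied to a toy two-state average-cost MDP with invented data; for simplicity all stage lengths are taken to be identically one, so the τ_i(u) term reduces to a constant shift (this is our illustration, not the method as specialized in the paper):

```python
def relative_value_iteration(g, P, ref=0, iters=1000):
    """Solve h(i) = min_u [ g(i,u) - v + sum_j P(i,u,j) h(j) ] with h(ref) = 0.

    g[i][u]    -- stage cost of control u in state i
    P[i][u][j] -- transition probability from state i to j under u
    Returns (v, h, policy): average cost, differential costs, greedy policy.
    """
    S = len(g)
    h = [0.0] * S
    for _ in range(iters):
        # Q-factors under the current differential cost estimate
        q = [[g[i][u] + sum(P[i][u][j] * h[j] for j in range(S))
              for u in range(len(g[i]))] for i in range(S)]
        v = min(q[ref])                   # average-cost estimate (reference state)
        h = [min(qi) - v for qi in q]     # relative differential costs
    policy = [min(range(len(q[i])), key=lambda u: q[i][u]) for i in range(S)]
    return v, h, policy
```

On the toy data below (state 1 chooses between a cheap control with a 50/50 transition and a pricier control that always returns to state 0), the iteration converges to average cost 2 and selects the cheaper control.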


Given the optimal average cost per unit time v and the optimal di erential cost function
h , the optimal decision at state i, (i), is that which minimizes the immediate cost of the
current stage minus the expected average cost for the stage plus the remaining expected
di erential cost based on the possible resulting states i.e., we have
8
<
(i) = arg umini) :G(i
U(
2

u) ; v i(u) +

X
i

for all i 2 S :

Pii (u)h (i )
0

9
=

2S

For the admission control problem, the optimal policy at any state i = (n, ω) ∈ S, μ*(i), is
obtained by solving the following equation at each stage:

    μ*(i) = { arg min_{u ∈ {u_a, u_b}} [ G(i, u) - v* τ_i(u) + Σ_{i' ∈ S} P_ii'(u) h*(i') ]   if ω ∈ Ω_a,
              arg min_{u ∈ {u_h, u_d}} [ G(i, u) - v* τ_i(u) + Σ_{i' ∈ S} P_ii'(u) h*(i') ]   if ω ∈ Ω_m,   (3)

where the quantities G, τ_i, and P_ii' are those provided in Sec. 2.2. Note that if ω ∈ Ω_d,
there is no control to exercise.
There are a number of methods for solving Bellman's equation to obtain the values for v*
and h*. Most of these are forms of the value iteration and policy iteration algorithms. The
computation can be done off-line, i.e., before the real system starts operating. Once this
computation is completed, the optimal decision at each call request can be determined by
comparing the cost resulting from applying each of the available controls and selecting the
control with the minimum value. A comprehensive treatment of DP can be found in [Ber95].
One major advantage of DP over other methods is that the computation required to incorporate
any system details or an arbitrarily complex cost function is done off-line and therefore need
not be real-time. Once the cost function is computed, the optimal policy is often determined
quickly using the cost function with real-time information on the current state of the system.
Furthermore, the optimal cost function can vary from cell to cell to account for variations
such as arrival rates, and cell shapes and sizes.
Unfortunately, even off-line, the computation required to determine these values is overwhelming due to the large number of states. As a result, suboptimal methods for the solution
must be used. In the next section, we discuss a technique to obtain an approximation for
the optimal average cost and optimal differential cost function.

3 An Approximate Solution: Cell-by-Cell Decomposition with Feature Extraction


In Sec. 2.2, we formulated the admission control problem as a Markov decision process.
Unfortunately, the size of the state space is too large for exact dynamic programming to be
practical. In this section, we consider an approximate solution.
For convenience, we define

    H*(i, u) = G(i, u) - v* τ_i(u) + Σ_{i' ∈ S} P_ii'(u) h*(i'),   i ∈ S, u ∈ U(i).

If we were able to obtain the optimal average cost per stage v* and the optimal differential
cost vector h*, the optimal policy, given by Eq. 3, would be equivalent to

    μ*(i) = arg min_{u ∈ U(i)} H*(i, u) = { arg min_{u ∈ {u_a, u_b}} H*(i, u)   if ω ∈ Ω_a,
                                            arg min_{u ∈ {u_h, u_d}} H*(i, u)   if ω ∈ Ω_m,     i ∈ S.   (4)

We refer to H*(i, u) as the Q-factor of the state-control pair (i, u). It is the expected cost
corresponding to exercising control u at state i and then proceeding to follow an optimal
policy. Note that the optimal differential cost vector h* satisfies the equation

    h*(i) = min_{u ∈ U(i)} H*(i, u).

Instead of calculating the actual average cost per stage v* and differential cost vector h* and
using Eq. 4 to determine the optimal decision at every state, we construct approximations of
the Q-factors H*(i, u), to which we refer as H̃(i, u). After determining such approximations,
the control for every state is given by

    μ̃(i) = arg min_{u ∈ U(i)} H̃(i, u)   for all i ∈ S.

We consider cell-by-cell decomposition combined with feature extraction to formulate a
new Markov decision problem that is an approximation of the original problem but has a
significantly smaller state space so that dynamic programming can be applied. We apply
relative value iteration to the new formulation to determine the optimal average cost per stage
ṽ and differential cost function h̃ of the approximation, thereby obtaining an approximation
H̃(i, u) of the form

    H̃(i, u) = G(i, u) - ṽ τ_i(u) + Σ_{i' ∈ S} P_ii'(u) h̃(i'),   i ∈ S, u ∈ U(i).   (5)

In the approximate problem, we decompose the original model into individual cells and
consider each cell independent of the others. For each cell j, we obtain an estimate of the
average cost per stage ṽ_j and the differential cost vector h̃_j. Costs associated with a particular
user are attributed to the cell in which the user is located when the cost is incurred. For
instance, suppose there is a fixed cost associated with blocking a connection request by a
user of class c. Each cell in which a user of class c is blocked is then attributed that blocking
cost. The total average cost per stage and total differential cost is the sum of the costs
associated with each cell:

    ṽ = Σ_{j=1}^{M} ṽ_j   and   h̃(i) = Σ_{j=1}^{M} h̃_j(i),   i ∈ S.   (6)

To obtain the approximations ṽ_j and h̃_j, we formulate a Markov decision process for each
cell. The state of each process consists of "features" of the state of the original admission
control problem. With an appropriate selection of features, each process is simple enough
so that traditional dynamic programming techniques such as value iteration can be applied
to obtain ṽ_j and h̃_j. Specifically, for each state i ∈ S and cell j = 1, ..., M, f_j(i) =
(f_j1(i), ..., f_jK_j(i)) is a mapping of the state i to a vector of features, where K_j is the
number of features that are to be extracted for use in evaluating cell j. The differential
cost function is then approximated by a sum over all cells of functions that depend on these
features instead of the actual state:

    h̃(i) = Σ_{j=1}^{M} h̃_j(f_j(i)).

This approximation is illustrated in Fig. 3. Ideally, the features that are "extracted" summarize important characteristics of the state which affect the cost, and they should incorporate
the designer's prior knowledge or intuition about the problem and about the structure of the
optimal controller.
We summarize the cell-by-cell decomposition with feature extraction procedure as follows.
1. Select suitable features for each cell.
2. Formulate a Markov decision process for each cell where the state consists of the features
selected in Step 1. We refer to the formulated process as the "feature-based MDP" for
the cell.


Figure 3: Cell-by-cell decomposition with feature extraction: For any state i, a set of features is extracted
for each cell. The features are used as the state space for a Markov decision process associated with the cell.
The costs resulting from solving the Markov decision process are approximations of the costs incurred by
that cell.

3. Calculate the average costs per stage ṽ_j and the differential cost vectors h̃_j for the
Markov decision processes formulated for each cell in Step 2.
4. Approximate the overall average cost per stage and differential cost vector as a sum
of the costs associated with the individual cells as determined in Step 3 (see Eq. 6).
These values can be used in Eq. 5 to obtain H̃(i, u), which is then used to make decisions
at state i ∈ S according to

    μ̃(i) = arg min_{u ∈ U(i)} H̃(i, u)   for all i ∈ S.

Note that after using cell-by-cell decomposition and feature extraction to obtain an approximate problem, the features of a number of cells will not be affected by the current decision,
and therefore these cells need not be included in evaluating decisions. Specifically, given a
state i = (n, ω), let J(i) be the set of cells whose features are modified by any control in
U(i). The decision at state i ∈ S is then given by:

    μ̃(i) = arg min_{u ∈ U(i)} H̃(i, u)
          = arg min_{u ∈ U(i)} [ G(i, u) - τ_i(u) Σ_{j ∈ J(i)} ṽ_j + Σ_{i' ∈ S} P_ii'(u) Σ_{j ∈ J(i)} h̃_j(f_j(i')) ].
If we do not include costs associated with cells whose features are not affected by the
addition of a new user, the amount of calculation required to make each admission decision
becomes independent of the total number of cells in the system. The total amount of
calculation then grows linearly in the number of cells. Furthermore, since each admission
decision depends on local information, the method is easily distributed so that each cell
includes a controller responsible for making decisions for requests within its regions.
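To make the localized decision rule concrete, here is a sketch (our illustration; the per-cell values ṽ_j and tables h̃_j would come from solving the feature-based MDPs, and every name and number below is invented) that evaluates H̃(i, u) over only the affected cells J(i) and returns the minimizing control:

```python
def decide(controls, G, tau, v_cells, P_next, h_cells, affected):
    """Pick argmin_u of
       G(i,u) - tau_i(u) * sum_{j in J(i)} v_j
              + sum_{i'} P(i,u,i') * sum_{j in J(i)} h_j(f_j(i')).

    controls   -- control labels, e.g. ["admit", "block"]
    G[u]       -- immediate cost of control u
    tau[u]     -- expected stage length under control u
    v_cells[j] -- per-cell average cost per stage (tilde-v_j)
    P_next[u]  -- list of (prob, features) pairs for the possible next states,
                  where features[j] indexes into h_cells[j]
    h_cells[j] -- per-cell differential cost table (tilde-h_j), a dict
    affected   -- J(i): indices of cells whose features any control changes
    """
    v_local = sum(v_cells[j] for j in affected)

    def H(u):
        expect = sum(p * sum(h_cells[j][feat[j]] for j in affected)
                     for p, feat in P_next[u])
        return G[u] - tau[u] * v_local + expect

    return min(range(len(controls)), key=H)
```

In the test data below, a blocking cost of 2 competes against a large differential cost for the admitted configuration; shrinking that differential cost flips the decision from "block" to "admit".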
We describe the first three steps in detail in the following sections. In Sec. 3.1, we
describe the process of selecting suitable features and provide several examples. In Sec. 3.2,
we illustrate with an example the process of formulating the Markov decision process for a
particular cell given a set of features. In Sec. 3.3, we describe the process of determining the
values of ṽ_j and h̃_j given a Markov decision process formulation for cell j.
3.1 Selecting Suitable Features


In this section, we describe the first step involved in the cell-by-cell decomposition with
feature extraction procedure: the process of selecting features from the overall state for each
cell. The features for a particular cell j are basically the vector of numbers to which the
function f_j maps a given state. We describe issues to consider when selecting features and
provide several examples.
To determine what features to extract as a representation of the state for a particular
cell, we must consider what information is relevant in assessing the value of a state, and how
the value changes upon the addition of a new user or upon the movement of an existing user.
We must therefore consider the objective cost function. In selecting examples of features to
use, we have made certain assumptions about the cost function. We assume there are costs
associated with blocked calls and dropped calls, as well as possibly rewards for users that
are connected. These costs and rewards may depend on the user class.
Under this assumption, the value of the state depends on the likelihood of future blocking
and future dropping of calls of the various classes. The expected number of future blocks for
new call attempts in a particular cell depends for the most part on the interference within
the cell. The amount of interference depends on the number of calls of each class in the given
cell and the number of calls in other cells that create interference to this cell. Note that the
particular regions in which users in the given cell are located are not likely to be a significant
factor. This may not necessarily be the case for users in other cells that generate interference
within the given cell. Depending on the location of these other users and the base stations
to which they are connected, the interference generated to the given cell can vary from
negligible to significant. For example, in Fig. 4, depending on propagation effects, a call
in cell 2 near the border to cell 1 may generate a significant amount of interference at the
base station in cell 1. It may also generate some interference at the base station in cell 3,
although the amount is likely to be negligible.
The preceding considerations suggest two sets of features of state i relevant to determining
costs incurred from blocked calls occurring in each cell j:

Figure 4: The user in cell 2 is likely to generate a fair amount of interference to the base station in cell 1
and therefore affect the ability to accept future connection requests. Its influence on future arrivals in cell
3, however, is likely to be significantly smaller.

1. fj1c(i): the number of users of class c in cell j ,


2. fj2c(i): the number of users of class c in other cells whose signal levels at the base station
in cell j are above some threshold.
The second set of features can further be subdivided into groups by creating various signal
level threshold ranges to distinguish between users in other cells that generate a signi cant
amount of interference and those that generate a moderate amount. Tradeo s between
the additional information that is obtained at the expense of larger state spaces should be
considered in selecting the number of subgroups.
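These counting features are simple to compute from a full state description. The sketch below shows one way to extract the first two sets of features; the data structures (a dict mapping (region, class) pairs to user counts, and precomputed region lists per cell) are our own illustration, not the paper's.

```python
# Hypothetical layout: `state` maps (region, cls) -> number of connected
# users; `cell_regions[j]` lists the regions belonging to cell j; and
# `interfering_regions[j]` lists regions outside cell j whose users'
# signals at j's base station exceed the chosen threshold.

def extract_features(state, cell_regions, interfering_regions, j, cls):
    """Return (f1, f2) for cell j and user class cls."""
    f1 = sum(state.get((r, cls), 0) for r in cell_regions[j])
    f2 = sum(state.get((r, cls), 0) for r in interfering_regions[j])
    return f1, f2
```

The threshold-subdivision variant mentioned above would simply partition `interfering_regions[j]` into several lists, one per signal-level range.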
Another factor that may affect future blocks is the set of users within the cell that may move
to another cell. For instance, users in periphery regions are more likely to
move to another cell in the near future than users located in the center. In addition, traffic
patterns may provide additional information. If cells are built around a highway, there will
be cells where users can only move in a certain direction (see Fig. 5). Users that are likely to
leave the cell are likely to generate less interference in the future. This factor suggests
including the following set of features:

Figure 5: Assume traffic flows only left to right on the illustrated road. The user on the left is more likely
to be connected to the base station in the center cell for a longer period of time than the user on the right. In
addition, the user on the right is likely to require that its call be handed off in the near future.

3. f_j^{3c}(i): the number of users of class c in cell j that are positioned in regions in which the
probability of moving to a region of another cell is greater than 0.
The expected number of future drops for users attempting to move into a particular cell
depends on the number of existing users of each class positioned to move into this cell, in
addition to the interference currently present. For instance, the user on the right in Fig. 5 is
likely to move to the cell on the right and may require that its call be handed off. This suggests
the following set of features:

4. f_j^{4c}(i): the number of users of class c in other cells that are positioned in regions in which
the probability of moving to cell j is greater than 0.

The set of features can be tailored to each cell. Once the set of features is selected for a
particular cell, a Markov decision process is formulated whose state consists of the selected
features. This feature-based MDP is not automatically determined by the chosen features.
Instead, the transition probabilities and other characteristics are based on additional assumptions
and approximations that are to some extent heuristic and reflect our understanding of
the practical problem at hand. We illustrate this process in the next section.

3.2 Formulation Example

In this section, we illustrate with an example the second step involved in the cell-by-cell
decomposition with feature extraction procedure: the process of formulating the feature-based
MDP for a particular cell. This example is the basis of a portion of the computational
results presented in Sec. 4. In this example, we extract from any state i ∈ S the following
2C features for each cell j = 1, ..., N:

1. f_{jc}(i) = the number of users of class c ∈ {1, ..., C} in all of the regions in cell j,

2. f_{j(C+c)}(i) = the number of users of class c ∈ {1, ..., C} in all regions r not in cell j such that
w a_{rj} > T, where w is the power level assigned to any user in region r, a_{rj} is the path
gain from region r to the base station in cell j, and T is an arbitrary constant.

The purpose of the first set of features is to indicate how much interference is generated by
users within a particular cell. The purpose of the second set of features is to indicate how
much interference is generated by users in other cells. For convenience, let R_j be the set of
regions in cell j, and let R^o_{jc}(i) be the set of regions r not in cell j such that, given a state
i, the available power control algorithm assigns a power level w to users of class c for which
w a_{rj} > T. In this example, we assume that for any cell j, the set of regions not in j in which
users generate sufficient interference to the base station in j is independent of the user class
and of the actual power assignment (and consequently, of the state). We therefore simplify
notation by referring to this set as R^o_j.

We now formulate the Markov decision process based on the selection of these features.
Both the state space and the control space are analogous to those in the formulation of the
admission control problem. In general, the transition rates and probabilities, and the cost
function, for this feature-based MDP cannot be determined exactly. We provide examples
of suitable approximations in our formulation.

The State Space  The state space consists of the following two components:

1. A primary state component, which describes the feature values of the state for cell j and
is denoted by ñ = (ñ_1, ..., ñ_{2C}). In this example, the features are the number of users
of each class in cell j and the number of users of each class in regions in R^o_j. We also
refer to this component as the feature configuration.

2. A random state component, which describes the current random event and is denoted
by ω = (ω_1, ..., ω_{2C}).

Let Ñ_j be the set of primary state components for cell j. We have

$$\tilde N_j = \Big\{ \tilde n \;\Big|\; \tilde n_c = \sum_{r \in R_j} n_{cr}, \quad \tilde n_{(C+c)} = \sum_{r \in R_j^o} n_{cr}, \quad c = 1, \dots, C, \ \text{for some } n \in N \Big\},$$

where N is the set of primary state components for the admission control problem.
We are only concerned with events that can change a particular cell configuration. These
consist of arrivals to, departures from, and movements involving regions in R_j or R^o_j. Movements
of users from a region in R_j to another region in R_j, or from a region in R^o_j to another
region in R^o_j, need not be considered events. In addition, we can treat a movement by a
user from a region in R_j or R^o_j to a region not in either R_j or R^o_j as a departure, and a
movement by a user in a region not in either R_j or R^o_j to a region in R_j or R^o_j as an arrival.
The set of all random events for a particular cell j is given by

$$\Omega = \Omega^a \cup \Omega^{ao} \cup \Omega^d \cup \Omega^{do} \cup \Omega^{m1} \cup \Omega^{m2},$$

where Ω^a and Ω^{ao} are the sets of all events consisting of an arrival to cell j and to a region
in R^o_j, respectively, Ω^d and Ω^{do} are the sets of all events consisting of a departure from cell j
and from a region in R^o_j, respectively, and Ω^{m1} and Ω^{m2} are the sets of all events consisting
of a user moving from a region in cell j to a region in R^o_j and vice versa, respectively.

The state space associated with cell j, which we denote S_j, is composed of all possible
combinations of primary state components and random state components and is given by

$$S_j = \big\{ i = (\tilde n, \omega) \;\big|\; \tilde n \in \tilde N_j, \ \omega \in \Omega, \ \text{and } \omega_c \ge 0 \text{ if } \tilde n_c = 0 \big\},$$

where the last condition holds since departures and movements by a user of class c can only
occur if the number of users of class c is positive.

The Control Space  For any state i = (ñ, ω), the possible controls, as well as the resulting
feature configurations ñ′, are given in Table 3 according to the various possible values of ω.
If the random state component is a departure, there is no choice of control and the event is
accommodated. Otherwise, the control can either be to accommodate or not accommodate
the event.

Random State Component           Available Controls              Next Primary State Component
Arrival (ω ∈ Ω^a ∪ Ω^{ao})       Admit user (u = u^a)            ñ′ = ñ + ω
                                 Block user (u = u^b)            ñ′ = ñ
Departure (ω ∈ Ω^d ∪ Ω^{do})     None                            ñ′ = ñ + ω
Movement (ω ∈ Ω^{m1} ∪ Ω^{m2})   Maintain connection (u = u^h)   ñ′ = ñ + ω
                                 Drop connection (u = u^d)       ñ′ = ñ + ω′, where ω′_c = min{ω_c, 0}

Table 3: List of the possible random state components, the available controls, and the resulting configurations according to the selected control.
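The next-configuration rules of Table 3 are easy to state in code. In the sketch below, which uses our own encoding rather than the paper's, ω is a vector with a +1 entry for the component a user enters and a −1 entry for the component it leaves; dropping a moving user keeps only the non-positive part of ω, which removes the user without adding it to its destination.

```python
def next_config(n, omega, control):
    """Next feature configuration given event omega and a control label."""
    if control in ("admit", "maintain", "none"):   # "none" = departure
        return [a + b for a, b in zip(n, omega)]
    if control == "block":
        return list(n)
    if control == "drop":
        # Keep only the removal part of the movement event.
        return [a + min(b, 0) for a, b in zip(n, omega)]
    raise ValueError(control)
```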

Probabilities of Moving  In order to determine the transition rates and probabilities,
we need to make certain approximations, since information regarding the particular region
in which a user is located is no longer available. For example, the probability that a user moves
from a particular cell j to a region in R^o_j must be estimated, since we do not know whether the user
is in a region from which it is possible to move to another cell, and even if the user is in such
a region, the probability may vary depending on the particular region. The actual choice
of approximation should be determined by the actual system and should vary according to
cell. One possibility is to take a weighted average of the probabilities of moving for each of
the regions within a cell. If the weights depend on the arrival rates to the particular regions,
the approximate probability that a user of class c in cell j moves to a region in R^o_j, which
we denote q̃_{jcm}, is

$$\tilde q_{jcm} = \frac{\sum_{r \in R_j} \lambda_{cr} \sum_{r' \in R_j^o} q_{crr'}}{\sum_{r \in R_j} \lambda_{cr}},$$

where λ_{cr} is the arrival rate of class-c connection requests to region r and q_{crr'} is the
probability that a class-c user in region r moves to region r'. We can similarly approximate
the probability that a user of class c in a region in R^o_j moves to a region in cell j, which we
denote q̃^o_{jcm}, by

$$\tilde q^o_{jcm} = \frac{\sum_{r \in R_j^o} \lambda_{cr} \sum_{r' \in R_j} q_{crr'}}{\sum_{r \in R_j^o} \lambda_{cr}}.$$

Note that from the perspective of cell j, we can consider a movement by a user in a region
in R^o_j to a region that is neither in R^o_j nor in cell j as a departure or connection completion. We
therefore include the probability of such movements as part of the probability that a user of
class c in a region in R^o_j ends an existing connection, which we denote q̃^o_{jct}:

$$\tilde q^o_{jct} = \frac{\sum_{r \in R_j^o} \lambda_{cr} \Big( q_{crt} + \sum_{r' \notin R_j \cup R_j^o} q_{crr'} \Big)}{\sum_{r \in R_j^o} \lambda_{cr}},$$

where q_{crt} is the probability that a class-c user in region r completes its connection. We
also denote the total departure probability of a user of class c in cell j by q̃_{jct}:

$$\tilde q_{jct} = \frac{\sum_{r \in R_j} \lambda_{cr} \, q_{crt}}{\sum_{r \in R_j} \lambda_{cr}}.$$
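As a concrete illustration, the weighted average defining q̃_{jcm} can be computed directly from per-region arrival rates and movement probabilities. The data layout below (dicts keyed by region) is our own, not the paper's:

```python
def q_tilde_jcm(lam, q_move, R_j, R_j_o):
    """Arrival-rate-weighted probability that a user in cell j moves to R_j^o.

    lam[r]: class-c arrival rate to region r.
    q_move[r][rp]: probability a class-c user in region r moves to region rp.
    R_j, R_j_o: regions of cell j and its interfering set, respectively.
    """
    num = sum(lam[r] * sum(q_move[r].get(rp, 0.0) for rp in R_j_o) for r in R_j)
    den = sum(lam[r] for r in R_j)
    return num / den
```

The other three quantities (q̃^o_{jcm}, q̃^o_{jct}, q̃_{jct}) follow the same weighted-average pattern with the roles of the region sets exchanged.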

Transition Rates  Before presenting the transition rates and probabilities, we denote the
total rate at which users of class c request a connection in each cell j = 1, ..., N by λ̃_{jc}:

$$\tilde\lambda_{jc} = \sum_{r \in R_j} \lambda_{cr},$$

and the total rate at which users of class c request a connection in a region not in cell j but
which interferes with the base station in cell j by λ̃^o_{jc}:

$$\tilde\lambda^o_{jc} = \sum_{r \in R_j^o} \lambda_{cr}.$$

We can make approximations to specify the overall rates at which users move or complete
their connections in a manner similar to that used in approximating the probabilities of
movements or connection completions. For instance, to determine an approximation to the
rate at which a user of class c in cell j moves to a region in R^o_j, which we denote η̃_{jcm},
we can weight the various possible transition rates according to the arrival rates to individual
regions in cell j, and then consider the probability of moving from any of these regions to a
region in R^o_j. Such a weighting results in the transition rate

$$\tilde\eta_{jcm} = \frac{\sum_{r \in R_j} \lambda_{cr} \sum_{r' \in R_j^o} q_{crr'} \, \eta_{crr'}}{\sum_{r \in R_j} \lambda_{cr} \sum_{r' \in R_j^o} q_{crr'}},$$

where η_{crr'} is the rate at which a class-c user in region r moves to region r'. An approximation
to the rate at which a user of class c in R^o_j moves to a region in cell j can be derived in a
similar manner, resulting in the rate

$$\tilde\eta^o_{jcm} = \frac{\sum_{r \in R_j^o} \lambda_{cr} \sum_{r' \in R_j} q_{crr'} \, \eta_{crr'}}{\sum_{r \in R_j^o} \lambda_{cr} \sum_{r' \in R_j} q_{crr'}}.$$

The rate at which a user of class c in cell j ends an existing connection is independent
of the particular region and is therefore also independent of the cell:

$$\tilde\eta_{ct} = \mu_c.$$

Since the probability that a user of class c in R^o_j ends an existing connection includes the
probability that such a user moves to a region not in either R^o_j or cell j, we can similarly
weight such transitions to obtain an approximation to the rate at which a user of class c in
R^o_j ends an existing connection:

$$\tilde\eta^o_{jct} = \frac{\sum_{r \in R_j^o} \lambda_{cr} \Big( q_{crt} \, \mu_c + \sum_{r' \notin R_j \cup R_j^o} q_{crr'} \, \eta_{crr'} \Big)}{\sum_{r \in R_j^o} \lambda_{cr} \Big( q_{crt} + \sum_{r' \notin R_j \cup R_j^o} q_{crr'} \Big)}.$$

Note that in the above approximations, the probabilities and rates of transitions for users
in cell j and in R^o_j are weighted based on the arrival rates to the individual regions of cell
j and of R^o_j. Such a weighting assumes that the distribution of existing connections among the
various regions is the same as the distribution of connection requests. This assumption
would be appropriate if admission decisions were made independently of the particular region
in which connection requests occurred. However, it is likely that controls which give preference
to connection requests in certain regions result in better performance. As a result, a
weighting based on arrival rates may not be appropriate. In many systems, however,
the transition rates may not vary significantly from region to region, and such a
weighting can then still be very accurate.

Using the preceding approximations, for any feature configuration ñ ∈ Ñ_j of cell j, the
time until the next random event occurs for the Markov process associated with cell j is
exponentially distributed with approximate rate

$$\nu_{j\tilde n} = \sum_{c=1}^{C} \Big\{ \tilde\lambda_{jc} + \tilde\lambda^o_{jc} + \tilde n_c \big[ \tilde q_{jcm} \tilde\eta_{jcm} + \tilde q_{jct} \tilde\eta_{ct} \big] + \tilde n_{(C+c)} \big[ \tilde q^o_{jcm} \tilde\eta^o_{jcm} + \tilde q^o_{jct} \tilde\eta^o_{jct} \big] \Big\}.$$
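For a single user class (C = 1), the approximate event rate above reduces to a one-line computation over the aggregated quantities of this section. The function below is a sketch with our own variable names:

```python
def event_rate(lam_j, lam_j_o, n_in, n_out,
               q_m, eta_m, q_t, eta_t,
               q_m_o, eta_m_o, q_t_o, eta_t_o):
    """Approximate total event rate for one cell, single user class.

    n_in / n_out: users in the cell / in its interfering regions;
    q_*, eta_* : the movement/termination probabilities and rates
    defined in this section (the *_o variants refer to R_j^o).
    """
    return (lam_j + lam_j_o
            + n_in * (q_m * eta_m + q_t * eta_t)
            + n_out * (q_m_o * eta_m_o + q_t_o * eta_t_o))
```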

Transition Probabilities  As described in Sec. 2.2.4, the transition for any cell j from a
state i = (ñ, ω) ∈ S_j under control u ∈ U(i) to a state i′ = (ñ′, ω′) ∈ S_j depends on two
factors. The configuration for the next state is determined by the control exercised at state
i, as shown in Table 4, and the probability distribution of the random event for the next
state depends on the next configuration, as shown in Table 5.

Random State Component           Control                  Next Configuration
Arrival (ω ∈ Ω^a ∪ Ω^{ao})       Admit (u = u^a)          ñ′ = ñ + ω
                                 Block (u = u^b)          ñ′ = ñ
Departure (ω ∈ Ω^d ∪ Ω^{do})     None                     ñ′ = ñ + ω
Movement (ω ∈ Ω^{m1} ∪ Ω^{m2})   Maintain (u = u^h)       ñ′ = ñ + ω
                                 Drop (u = u^d)           ñ′ = ñ + ω′, where ω′_c = min{ω_c, 0}

Table 4: List of the possible random events and the resulting configurations according to the exercised
control.
Next Event        Transition Probability from (ñ, ω) to (ñ′, ω′)
ω′ ∈ Ω^a          λ̃_{jc} / ν_{jñ′}
ω′ ∈ Ω^{ao}       λ̃^o_{jc} / ν_{jñ′}
ω′ ∈ Ω^d          ñ′_c q̃_{jct} η̃_{ct} / ν_{jñ′}
ω′ ∈ Ω^{do}       ñ′_{(C+c)} q̃^o_{jct} η̃^o_{jct} / ν_{jñ′}
ω′ ∈ Ω^{m1}       ñ′_c q̃_{jcm} η̃_{jcm} / ν_{jñ′}
ω′ ∈ Ω^{m2}       ñ′_{(C+c)} q̃^o_{jcm} η̃^o_{jcm} / ν_{jñ′}

Table 5: List of the transition probabilities for the possible next random events.

The Cost Function  Instead of associating costs with the current state and control, we
need to reformulate costs so that they are associated with a particular cell and can be derived
from the selected features of the state. Essentially, we approximate the expected
cost for a single stage corresponding to state i under control u by a sum of costs associated
with each cell:

$$G(i, u) \approx \sum_{j=1}^{N} \tilde G_j\big(f_j(i), u\big),$$

where G̃_j(f_j(i), u) contains the expected single-stage costs corresponding to state i under
control u that are associated with cell j, and must be derived from the features of the state
used to represent information relevant to cell j.

Determining how to decompose the costs according to the individual cells depends on
the particular form of the cost function. Note that decomposing costs according to cells is
straightforward if costs are only associated with blocking or accepting connection requests,
and with dropping or maintaining existing connections. Then, in the case in which a connection
is requested, costs can be associated with the cell in which the connection request
is located, and in the case in which an existing user moves, costs can be associated with
the cell to which the user is moving. Note that the features extracted from a state for any
cell would need to contain the information necessary to collect the desired costs, as in the
current formulation example.

If costs depend on the entire state, e.g., if there are bonuses for maintaining a certain
number of connections in the entire system, decomposing the costs is much less
straightforward unless the selected features include such information. As a result, the selection
of features to represent the relevant information for a particular cell should take into
consideration the form of the cost function.

3.3 Calculating ṽ and h̃

In this section, we describe the third step involved in the cell-by-cell decomposition with
feature extraction procedure: the process of determining the values of ṽ_j and h̃_j given a
feature-based MDP for each cell j. After these values are determined, the decision at each
state is obtained by determining the cells affected by the decision, summing up the approximate
costs associated with each of these cells, and selecting the control resulting in the
smallest cost:

$$\tilde\mu(i) = \arg\min_{u \in U(i)} \tilde H(i, u) = \begin{cases} \arg\min_{u \in \{u^a, u^b\}} \tilde H(i, u) & \text{if } \omega \in \Omega^a, \\ \arg\min_{u \in \{u^h, u^d\}} \tilde H(i, u) & \text{if } \omega \in \Omega^m, \end{cases} \qquad i \in S, \tag{7}$$

where

$$\tilde H(i, u) = G(i, u) - \tau_i(u) \sum_{j \in J(i)} \tilde v_j + \sum_{i' \in S} P_{ii'}(u) \sum_{j \in J(i')} \tilde h_j\big(f_j(i')\big),$$

τ_i(u) is the expected time spent in state i under control u, and J(i) denotes the set of cells
affected by the decision at state i.
Our method of determining the values of ṽ_j and h̃_j is based on the discussion in [Ber95],
§5.3. Before we consider the problem of determining for each cell j the values of ṽ_j and h̃_j,
we consider the "auxiliary discrete-time average cost" version of the problem. Let S̃_j be the
feature space associated with cell j and let β be any scalar such that

$$0 < \beta < \frac{\tilde\tau_i(u)}{1 - \tilde P_{jii}(u)}$$

for all i ∈ S̃_j and u ∈ U(i) with P̃_{jii}(u) < 1. We define new transition probabilities for all
i, i′, and u ∈ U(i):

$$\hat P_{ii'}(u) = \begin{cases} \dfrac{\beta \, \tilde P_{jii'}(u)}{\tilde\tau_i(u)} & \text{if } i' \ne i, \\[2mm] 1 - \dfrac{\beta \big( 1 - \tilde P_{jii}(u) \big)}{\tilde\tau_i(u)} & \text{if } i' = i, \end{cases}$$

where the P̃_{jii′}(u) are the transition probabilities for the Markov process associated with cell j,
and define

$$\hat G_j(i, u) = \frac{\tilde G_j(i, u)}{\tilde\tau_i(u)}$$

as the expected cost per stage associated with cell j. If for any cell j the scalar v̂_j and
vector ĥ_j satisfy

$$\hat h_j(i) = \min_{u \in U(i)} \Big[ \hat G_j(i, u) - \hat v_j + \sum_{i' \in \tilde S_j} \hat P_{ii'}(u) \, \hat h_j(i') \Big], \qquad i \in \tilde S_j,$$

then ṽ_j and h̃_j, where

$$\tilde v_j = \hat v_j, \qquad \tilde h_j(i) = \beta \, \hat h_j(i), \quad i \in \tilde S_j, \tag{8}$$

satisfy

$$\tilde h_j(i) = \min_{u \in U(i)} \Big[ \tilde G_j(i, u) - \hat v_j \, \tilde\tau_i(u) + \sum_{i' \in \tilde S_j} \tilde P_{ii'}(u) \, \tilde h_j(i') \Big], \qquad i \in \tilde S_j.$$
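The transformation to the auxiliary discrete-time problem can be sketched as follows for a fixed control; the indexing by integer states and the function name are our own. By construction each row of the new matrix still sums to one, and the stated bound on β keeps the diagonal entries nonnegative.

```python
def discretize(P, tau, beta):
    """Build the auxiliary discrete-time transition matrix P_hat.

    P[i][ip]: continuous-problem transition probabilities (one control);
    tau[i]:   expected holding time in state i;
    beta:     scalar with 0 < beta < tau[i] / (1 - P[i][i]) for all i.
    """
    n = len(P)
    P_hat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for ip in range(n):
            if ip != i:
                P_hat[i][ip] = beta * P[i][ip] / tau[i]
        P_hat[i][i] = 1.0 - beta * (1.0 - P[i][i]) / tau[i]
    return P_hat
```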

We now consider the problem of determining the values of v̂_j and ĥ_j. One method for
calculating these values is a technique called relative value iteration. To simplify notation,
we refer to v̂_j as v_j and to ĥ_j as h_j.

Let t be any fixed state, and initialize h_j^0(i) to arbitrary scalars for all i ∈ S̃_j. We run
the following iterative algorithm:

$$h_j^{k+1}(i) = (T h_j^k)(i) - (T h_j^k)(t), \qquad i \in \tilde S_j,$$

where

$$(T h_j^k)(i) = \min_{u \in U(i)} \Big\{ \hat G_j(i, u) + \sum_{i' \in \tilde S_j} \hat P_{ii'}(u) \, h_j^k(i') \Big\}. \tag{9}$$

Suppose that at any particular stage there are a total of m connected users, regardless of class,
in the entire system. Due to capacity constraints, m must be bounded. There is then some
ε > 0 such that the probability that each of the next m stages involves some user completing
its connection is greater than ε. As a result, for our Markov problem, there exists a positive
integer m such that under any policy, the probability of reaching the "empty" state, the
state in which no users are connected, from any state within m stages is greater than some
ε > 0. It has been shown that for such a problem, the sequence {h_j^k} generated by Eq. 9
converges to a vector h_j such that (T h_j)(t) is equal to the optimal average cost per stage
and h_j is an associated differential cost vector ([Ber95]). It also follows that

$$v_j^k = (T h_j^k)(t)$$

converges to the optimal average cost per stage.
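Relative value iteration as given by Eq. 9 is straightforward to implement. The sketch below runs the recursion on a toy two-state average-cost MDP of our own construction (not the cellular model), simply to illustrate the mechanics:

```python
def relative_value_iteration(G, P, n_states, n_controls, iters=200, t=0):
    """Relative value iteration for a discrete-time average-cost MDP.

    G[i][u]:     stage cost in state i under control u;
    P[u][i][ip]: transition probability from i to ip under control u;
    t:           the fixed reference state.
    Returns (v, h): the average cost estimate and differential costs.
    """
    h = [0.0] * n_states
    v = 0.0
    for _ in range(iters):
        Th = [min(G[i][u] + sum(P[u][i][ip] * h[ip] for ip in range(n_states))
                  for u in range(n_controls))
              for i in range(n_states)]
        v = Th[t]
        h = [Th[i] - Th[t] for i in range(n_states)]
    return v, h
```

In the toy model tested below, control 1 in state 0 pays 1 and randomizes between the states, while control 0 pays 2 and stays put; the optimal average cost works out to 0.5.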
Once the values of ĥ_j and v̂_j are determined, we can obtain h̃_j using Eq. 8, obtain ṽ_j
using the relation ṽ_j = (T h̃_j)(t), and apply the decisions determined by Eq. 7. Equivalently,
we can simply use the values of ĥ_j and v̂_j directly and apply the decision determined
as follows for any state i ∈ S:

$$\mu(i) = \begin{cases} \displaystyle \arg\min_{u \in \{u^a, u^b\}} \Big\{ \hat G(i, u) - \sum_{j \in J(i)} \hat v_j + \sum_{i' \in S} P_{ii'}(u) \sum_{j \in J(i')} \hat h_j\big(f_j(i')\big) \Big\} & \text{if } \omega \in \Omega^a, \\[4mm] \displaystyle \arg\min_{u \in \{u^h, u^d\}} \Big\{ \hat G(i, u) - \sum_{j \in J(i)} \hat v_j + \sum_{i' \in S} P_{ii'}(u) \sum_{j \in J(i')} \hat h_j\big(f_j(i')\big) \Big\} & \text{if } \omega \in \Omega^m, \end{cases} \tag{10}$$

where

$$\hat G(i, u) = \frac{G(i, u)}{\tau_i(u)}.$$

Finally, note that since the term $\sum_{j \in J(i)} \hat v_j$ is independent of the control, it need not be
calculated in making decisions and can be omitted from Eq. 10.

4 Computational Results

In this section, we describe the results of applying the approximation technique described
in Sec. 3 to a simulated cellular system. We first describe the simulated system in Sec. 4.1.
We then describe the specific details involved in implementing the approximation technique
for this particular system, and present computational results in Sec. 4.2.

4.1 Simulation Model

In actual systems, cells can vary significantly in shape and size, as well as in other factors.
However, the approaches that we have considered do not require that cells be of any specific
or uniform geometry. In fact, we only need to be able to determine the arrival rate of call
requests to particular regions of a cell and the likelihood that users will move from one
region to another. These factors are influenced by cell shape and size, but our model includes
the ability to specify these details by allowing the number of regions per cell to be variable,
as well as by allowing arrival rates and movement probabilities to be specified for individual
regions. We therefore consider a system in which the geometry has been simplified in order
to keep the implementation and collection of results straightforward. Also for the sake of
simplicity, our system consists of a single user class.

The description of the simulated system is divided into the following four parts, with each
description providing the set of parameters that need to be specified for each simulation:
1. description of the geometry of the system,
2. description of the path gains from each cell region to each base station,
3. description of each user's behavior, and
4. description of the objective function.

Geometry  We consider a two-dimensional rectangular system in which cells are of hexagonal
shape. Each cell is divided into two concentric zones, and the outer zone is further
divided into six identical regions, as shown in Figure 6. The parameters that need to be
specified to determine the geometry for a particular simulation are
1. M_v, the number of cells in the vertical direction,
2. M_h, the number of cells in the horizontal direction, and
3. E_i, the fraction of the area of a cell that comprises the inner region. (Note that E_o, the
fraction of the area of a cell that comprises an outer region, is (1 - E_i)/6.)

Figure 6: Cell model: each cell consists of an inner region and six identical periphery regions.

Path Gains  We assume that the path attenuation gain from any part of an inner region
of any cell to that cell's base station is A_1; i.e., if a user in an inner region of a cell transmits
at a power level of w, the signal level received at the cell's base station is w A_1. We also
assume that the path gain from a periphery region of any cell to that cell's base station
is A_2 < A_1. We assume that the power assignment algorithm assigns power levels of W_1
and W_2 to users in an inner region of a cell and users in a periphery region, respectively,
and that W_1 A_1 = W_2 A_2; i.e., we assume that power levels are assigned according to the
inverted-channel power control scheme, so that the signal levels received by a base station
are the same for all users connected to that base station. We also assume that
mobiles within the inner region of a cell, where transmissions occur at relatively low powers,
cause negligible interference in neighboring cells, while mobiles within the periphery regions,
where transmissions occur at relatively high powers, may cause significant interference in the
neighboring cells that border the particular outer region. We therefore assume that a signal
from a user in another cell that is in a region adjacent to cell j is attenuated by a
factor of A_3 < A_2 by the time it is received at the base station of cell j. Fig. 7 illustrates
the various possibilities.

Figure 7: The path gain from the inner region of any cell to its own base station is A_1 and 0 elsewhere.
The path gain from region R_1 to cells 1 and 2 is A_2 and A_3, respectively, and 0 elsewhere. The path gain
from R_3 to cells 1 and 3 is A_2 and A_3, respectively, and 0 elsewhere. The path gain from R_2 to cells 1, 2,
and 3 is A_2, A_3, and A_3, respectively, and 0 elsewhere.

Given our assumptions, the signals received at any base station are of two levels: transmissions
from any user within the cell are received at a signal level of I_1 = W_1 A_1 = W_2 A_2,
and transmissions from any user in an adjacent region of a neighboring cell are received at
a signal level of I_2 = W_2 A_3. The two signal levels essentially specify the three different path
gains, and we use them as the parameters for the simulated system. In addition, we assume
that each user must satisfy the following signal-to-noise requirement:

$$\frac{I_1}{n_1 I_1 + n_2 I_2} > T,$$

where n_1 is the number of users within the cell, n_2 is the number of users in adjacent
regions of neighboring cells of j, and T is a given threshold. We assume that the receiver
noise at each base station is negligible. The parameters that need to be specified are then
1. I_1, the signal level received at a base station from any user within the cell,
2. I_2, the signal level received at a base station from any user in an adjacent region of a
neighboring cell, and
3. T, the signal-to-noise threshold ratio.
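The signal-to-noise requirement translates into a simple feasibility test. The function below is our illustration, with n1 and n2 as defined above; the example values in the test are the ones used later in Sec. 4.2 (I_1 = 1, I_2 = 0.3, T = 1/50.15):

```python
def snr_feasible(n1, n2, I1, I2, T):
    """True if a user's SNR requirement I1 / (n1*I1 + n2*I2) > T holds.

    n1: users within the cell; n2: users in adjacent regions of
    neighboring cells; I1, I2: received signal levels; T: threshold.
    """
    return I1 / (n1 * I1 + n2 * I2) > T
```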

User Behavior  The arrival rate of new connection requests to cell j is λ_j. The probability
that a connection request falls in any particular region of a cell is proportional to the size
of the region. The directions of travel of users can be unrestricted, in which case a user
can move to any region adjacent to its current region, or specified. One possible
direction specification is to allow all users to move only to regions that are "above" the
current region. The probability that a user moves from one cell region to another is q. If
the user moves to another region, it is equally likely to move to any region to which it
is allowed to move. We assume that the probability that a user moves more than once is
zero. The lengths of time that a moving user stays in its originating region and in its
destination region are both exponentially distributed with mean 1/μ. The length of a call of
a stationary user is also exponentially distributed with mean 1/μ.

The parameters that need to be specified to describe user arrivals, departures, and movements
are as follows:
1. λ_j, the arrival rate of call requests to each cell,
2. μ, the rate at which a user moves to another region or completes a connection,
3. q, the probability that a user moves to another region, and
4. whether there are any restrictions on movement directions.

Objective Function  We assign costs to blocking a connection request and to dropping
the connection of a user that moves from one region to another. For any particular state i
in which control u is exercised, the cost per stage G(i, u) is

$$G(i, u) = \begin{cases} B & \text{if } \omega \in \Omega^a, \ u = u^b, \\ D & \text{if } \omega \in \Omega^m, \ u = u^d, \\ 0 & \text{otherwise.} \end{cases}$$

The objective function which we wish to minimize is a linear combination of the expected
average numbers of blocks and drops per unit time:

$$\lim_{N \to \infty} \frac{1}{E(t_N)} \sum_{k=1}^{N} E\{G_k\},$$

where G_k is the cost of the kth stage and t_N is the time of the Nth stage.

The parameters that need to be specified to describe the cost function are as follows:
1. B, the cost of blocking a user, and
2. D, the cost of dropping a user.
We assume that the costs B and D are positive quantities and that the cost of dropping a
call is worse than that of blocking a call: D > B.
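A minimal sketch of the empirical version of this objective: over a simulated horizon, the average cost per unit time is the weighted count of blocks and drops divided by elapsed time. The function name is ours; the default weights are the values B = 0.1 and D = 1.0 used in Sec. 4.2.

```python
def average_cost(blocks, drops, elapsed_time, B=0.1, D=1.0):
    """Empirical average cost per unit time over a simulation run."""
    return (B * blocks + D * drops) / elapsed_time
```

With these weights, comparing policies by this quantity is equivalent to comparing them by blocks + 10 * drops, since D/B = 10.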

4.2 Computational Results Using Cell-by-Cell Decomposition

In this section, we describe the results of applying the approximate decomposition with
feature extraction procedure to the simulated system presented in Sec. 4.1. Using various
selections of features and decomposing the problem into individual cells, approximations
were created with state spaces sufficiently small for dynamic programming to be applied, as
described in Sec. 3.3, to obtain the average cost per stage v_j and differential cost vector h_j
for the feature-based MDP associated with each cell j.

In our simulations, we formulated feature-based MDPs for three sets of features. Table
6 indicates the features that were used in each set. Note that the formulation for the first
feature set was presented in Sec. 3.2.
                                                                    Feature Set
Feature Description                                                 1    2    3
# existing connections in cell                                      x    x    x
# existing connections in adjacent regions of neighboring cells     x    x    x
# existing connections in cell that are positioned so that
  they can move to another cell                                          x
# existing connections in adjacent regions of neighboring
  cells positioned so that they can move to the given cell                    x

Table 6: List of the features and those used in each feature set.
After the selection of a set of features, we formulated an appropriate feature-based MDP
for each cell as illustrated in Sec. 3.2. We then applied relative value iteration to the
resulting formulations to obtain the average costs per stage and differential cost vectors for
the approximate problems, as described in Sec. 3.3. We applied the policies resulting
from these cost functions to various random sequences of a large number of simulated events.
We refer to each of these sequences as a particular data set. Each data set consisted of
simulated events over a period of 3000 time units.

The problems we considered consisted of a system of four cells by four cells with a single
user class. The area of the inner region of a cell was set to E_i = 1/4 and the area of
each periphery region was set to E_o = 1/8. Calls generated a unit of interference (I_1 = 1)
at the base station of the cell in which the call was located. Calls in adjacent regions of
neighboring cells generated a fraction 0.3 of a unit of interference (I_2 = 0.3). The total
amount of interference that was acceptable was set to 50.15, resulting in a threshold of
T = 1/50.15. Arrival rates varied from λ = 15 to 30 call requests per unit time. The call
completion rate was set to μ = 1 per unit time. The probability that a particular call moves
varied from q = 0.2 to q = 0.4. In several of the runs, users that move were equally likely to
move to any neighboring region; in the others, users that move were equally likely to move to
any neighboring region that was above the region in which the call was initiated. The cost
associated with a blocked call was B = 0.1, and the cost associated with a dropped call was
D = 1.0.

For the first six data sets, the arrival rate of connection requests to each cell was the
same and varied from 20 to 25 requests per unit time. In the seventh data set, the arrival
rate of connection requests was 30 per unit time to the four inner cells and 15 per unit time
to the twelve outer cells.
The per-cell arrival rates for data set 7 form the following 4 × 4 grid:

λ = 15   λ = 15   λ = 15   λ = 15
λ = 15   λ = 30   λ = 30   λ = 15
λ = 15   λ = 30   λ = 30   λ = 15
λ = 15   λ = 15   λ = 15   λ = 15
The specific details for each data set are provided in Table 7.

The number of value iterations performed to obtain our DP policies varied from 40 to
500. We compared the results of applying these policies to the above data sets with the results
of applying the greedy algorithm and a reservation policy. Under a greedy algorithm,
a user is admitted if there are sufficient resources to handle the user; under a reservation
policy, a user is admitted if there are sufficient resources to handle a certain number of later
handoffs in addition to the user. In our computations, the optimal number of later handoffs
for which to reserve resources was always one. The greedy and reservation algorithms are

Data Set   Arrival    Travel      Prob of     Total Req'd    # Attempted
           Rate       Direction   Movement    Connections    Movements
1          25         Uniform     0.2         1197265        232824–234402
2          25         Upwards     0.2         1199966        224420–226087
3          20         Uniform     0.3         958454         285919–275728
4          20         Upwards     0.3         959784         275128–275728
5          25         Uniform     0.4         1198456        437022–444531
6          25         Upwards     0.4         1199540        413321–426210
7          15 or 30   Uniform     0.3         898742         254554–256559

Table 7: List of the simulation data sets and their parameters. Each simulation was run for 3000 time
units. The expected length of a connection was 1 unit time.

generally used in current systems.
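The two baseline rules can be sketched as follows, using the interference-budget view of resources from Sec. 4.1. The names, and the representation of "resources" as a scalar interference budget, are our own illustration; `reserve` is the number of later handoffs for which capacity is held back.

```python
def greedy_admit(load, demand, budget):
    """Admit whenever the new call fits within the interference budget."""
    return load + demand <= budget

def reservation_admit(load, demand, budget, reserve=1):
    """Admit only if `reserve` later handoffs of equal demand also fit."""
    return load + demand * (1 + reserve) <= budget
```

With `reserve=1` (the optimal value found in these computations), the reservation rule blocks new calls one step earlier than the greedy rule, trading blocks for fewer drops.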


The results of applying our DP policies to data sets 1 and 2 are provided in Tables 8 and 9;
the results for data sets 3 through 7 are similar. In each of the runs that were conducted,
applying dynamic programming to the formulations resulting from cell-by-cell decomposition
with feature extraction performed significantly better than the greedy policy and somewhat
better than the reservation policy. The improvement in cost varied from twenty to nearly
forty-five percent over the greedy policy. In a few cases when the number of value iterations
was small, the DP solution performed slightly worse than the reservation policy, but generally
the DP solution resulted in a cost that improved upon the reservation policy by anywhere from
a negligible amount to about thirteen percent. Increasing the number of iterations generally
improved performance, but not consistently or substantially. When the utilization of the
system was high, the DP solution outperformed the greedy policy to a greater extent, but
outperformed the reservation policy to a lesser extent. In general, performance did not seem
to depend on which of the three feature sets was used; in fact, the first feature set often
outperformed the others in spite of the fact that its state space is the coarsest.

                Cell-by-Cell Decomposition Applied to Data Set 1

Feature   #       # of     %         # of    B+10D    % Improvement     % Improvement
Set       iters   blocks   blocked   drops   (Cost)   over Greedy Pol   over Reserv Pol
   1      40      26386    2.20       428    30666    22.55              8.66
   1      100     26785    2.24       343    30215    23.69             10.00
   1      500     26991    2.25       311    30101    23.97             10.34
   2      40      26389    2.20       429    30679    22.51              8.62
   2      100     26788    2.24       342    30208    23.70             10.01
   2      500     26986    2.25       310    30086    24.01             10.39
   3      40      26380    2.20       437    30750    22.33              8.41
   3      100     26751    2.23       349    30241    23.62              9.92
   3      500     26848    2.24       340    30248    23.60              9.90
Greedy            25623    2.14      1397    39593    -                  -
Reserv            33443    2.79        13    33573    15.20              -

Table 8: Results of applying cell-by-cell decomposition with feature extraction to Data Set 1.

                Cell-by-Cell Decomposition Applied to Data Set 2

Feature   #       # of     %         # of    B+10D    % Improvement     % Improvement
Set       iters   blocks   blocked   drops   (Cost)   over Greedy Pol   over Reserv Pol
   1      40      28696    2.39       367    32366    26.36              8.29
   1      100     28995    2.42       297    31960    27.28              9.44
   1      500     29390    2.45       248    31870    27.49              9.69
   2      40      28644    2.39       367    32314    26.48              8.44
   2      100     28989    2.42       298    31969    27.26              9.41
   2      500     29384    2.45       251    31894    27.43              9.63
   3      40      28676    2.39       372    32396    26.29              8.20
   3      100     28814    2.40       347    32284    26.54              8.52
   3      500     28804    2.40       348    32284    26.54              8.52
Greedy            26670    2.22      1728    43950    -                  -
Reserv            35051    2.92        24    35291    19.70              -

Table 9: Results of applying cell-by-cell decomposition with feature extraction to Data Set 2.

Table 10 summarizes the best results of applying cell-by-cell decomposition with feature
extraction.
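The cost column in Tables 8 and 9 is the blocking-and-dropping cost scaled by ten: with B = 0.1 and D = 1.0, 10(B x blocks + D x drops) = blocks + 10 x drops. As a quick consistency check against the reported Table 8 entries:

```python
B, D = 0.1, 1.0  # per-block and per-drop costs from the simulation setup

def table_cost(blocks: int, drops: int) -> int:
    # Tables 8-10 report 10 * (B*blocks + D*drops) = blocks + 10*drops
    return round(10 * (B * blocks + D * drops))

def pct_improvement(baseline_cost: float, cost: float) -> float:
    # Cost reduction relative to a baseline policy, in percent
    return 100.0 * (baseline_cost - cost) / baseline_cost

# Table 8: feature set 1 with 500 iterations, vs. the greedy and reservation rows
dp_cost     = table_cost(26991, 311)   # 30101
greedy_cost = table_cost(25623, 1397)  # 39593
reserv_cost = table_cost(33443, 13)    # 33573
```

These reproduce the reported improvements of 23.97 percent over greedy and 10.34 percent over reservation for that row.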
            Feature Set 1       Feature Set 2       Feature Set 3
Data Set    % Imp/   % Imp/     % Imp/   % Imp/     % Imp/   % Imp/
Number      Greedy   Reserv     Greedy   Reserv     Greedy   Reserv
   1        23.97    10.34      24.01    10.39      23.62     9.92
   2        27.49     9.69      27.43     9.63      26.54     8.52
   3        25.02    12.60      23.91    11.31      25.05    12.64
   4        30.77    11.84      30.77    11.84      29.99    10.84
   5        40.40     3.08      40.39     3.07      39.33     1.33
   6        44.70     4.71      44.43     4.24      43.96     3.43
   7        33.39     7.23      33.14     6.89      32.21     5.59

Table 10: Summary of the best results of applying cell-by-cell decomposition with feature extraction
to the seven data sets. The cost improvements of the results for each feature set over the results of
applying the greedy and reservation policies are provided.

5 Summary
In this paper, we formulated the admission control problem as a Markov decision problem.
One of the main advantages of such a formulation is that it can incorporate an arbitrary
amount of detail necessary to describe real cellular systems. While dynamic programming
can be used to solve such problems, the large size of the state space makes this impractical.
We proposed an approximate dynamic programming technique, cell-by-cell decomposition
with feature extraction, which involved creating an approximation of the original model with
a state space sufficiently small so that dynamic programming could be applied. The real-time
computation required by our proposed technique is small and is easily distributed between
the cells. Our comparisons on a simulated system with commonly used heuristic policies
have shown significant improvements in performance for our methodology.
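The dynamic programming computation on the reduced model is standard value iteration followed by a greedy lookahead to extract a policy. A generic sketch on a toy discounted MDP (all states, transition probabilities, and costs below are invented purely for illustration, not taken from the cellular model):

```python
# Toy MDP: P[s][a] lists (next_state, probability); g[s][a] is the one-stage cost.
P = {0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
     1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]}}
g = {0: {0: 1.0, 1: 0.3}, 1: {0: 2.0, 1: 0.5}}
alpha = 0.9  # discount factor

def value_iteration(P, g, alpha, n_iters=500):
    """Repeatedly apply the Bellman operator to an initial cost-to-go of zero."""
    J = {s: 0.0 for s in P}
    for _ in range(n_iters):
        J = {s: min(g[s][a] + alpha * sum(p * J[t] for t, p in P[s][a])
                    for a in P[s])
             for s in J}
    return J

def greedy_policy(J, P, g, alpha):
    """Extract the policy that is greedy with respect to the cost-to-go J."""
    return {s: min(P[s], key=lambda a: g[s][a] + alpha * sum(p * J[t] for t, p in P[s][a]))
            for s in P}

J_opt = value_iteration(P, g, alpha)
policy = greedy_policy(J_opt, P, g, alpha)
```

In the experiments above, between 40 and 500 such iterations were run offline on the decomposed per-cell models; only the final greedy-lookahead step is needed in real time.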
Acknowledgments
This research was supported by NSF under grant NCR-9622636.