Agra Wal 2006
Agra Wal 2006
Agra Wal 2006
Abstract— We consider a scheduling and resource allocation what subchannelization is used, the current channel state
problem for the downlink of an OFDMA-based wireless network, information, and the resource allocation decisions. We also
where the channel estimation error is modeled by a self-noise allow constraints on the maximum SINR or rate per subcarrier,
term in the decoding process. During each time-slot this involves
selecting a subset of users for transmission, determining the which can model a limitation on the available modulation
assignment of available subcarriers to selected users, and for each order. When the rate per sub-carrier is given via the Shannon
subcarrier determining the transmission power and the coding capacity formula and users are allowed to time-share each
and modulation scheme used. We address this in the context of sub-carrier, as in [7] this becomes a tractable convex opti-
a utility-based scheduling scheme presented in earlier papers. mization problem.2 A special case of this problem (without
This results in an optimization problem, which is convex for a
reasonable model of the feasible rates. By exploiting the structure self-noise) for a fixed set of weights and no constraints on the
of this problem, we develop optimal and sub-optimal algorithms SINR per carrier was considered in [8]; there a suboptimal
for its solution. We provide simulation results comparing different algorithm with constant power per sub-carrier was given
algorithms and parameter settings. and shown in simulations to have little performance loss.
Here, we consider a dual formulation for this problem, which
I. I NTRODUCTION
enables us characterize some structural properties and leads to
Dynamic “channel-aware” scheduling and resource alloca- both optimal and reduced complexity sub-optimal algorithms.
tion is an essential component of most recent wireless data We also present simulation results in a system where the
systems. A number of gradient-based scheduling and resource scheduling weights are dynamically adjusted according to a
allocation algorithms have been considered, which attempt gradient-based scheduling rule. Some other related work on
to maximize the projection onto the gradient of a system OFDM optimization can be found in [9]–[11]; in these papers
utility, see e.g. [2]–[5]. The utility is used to quantify fairness self noise is not considered.
and other QoS considerations. This paper addresses gradient-
based scheduling and resource allocation for a system using II. P ROBLEM F ORMULATION
a combination of Time Division Multiplexing (TDM) and We consider the downlink of a single cell in an OFDM
Orthogonal Frequency Division Multiplexing (OFDM). An system with K users. Time is divided into TDM time-slots
example of such a system is IEEE 802.16 (WiMAX). In this that contain an integer number of OFDM symbols. In each
setting, the problem is to determine which users are scheduled, time-slot, the scheduling and resource allocation decision can
as well as the allocation of transmission power and OFDM be viewed as selecting a rate vector r t = (r1,t , . . . , rK,t )
subcarriers1 among the scheduled users. from the current feasible rate region R(et ) ⊂ RK + , where et
In prior work [6], we considered gradient-based scheduling indicates the time-varying channel state information available
and resource allocation when code division multiple access at the scheduler. This decision is made according to the
(CDMA) was used to multiplex users within a time-slot, as gradient-based scheduling framework in [2], [4]. Namely, an
in CDMA 1xEVDV. In [7], we extended the approach in [6] r t ∈ R(et ) is selected that has the maximum projection onto
to an OFDM downlink where the transmitters and receivers the gradient of a system utility function ∇U (W t ), where
K
have accurate channel information. In this paper, we consider U (W t ) = i=1 Ui (Wi,t ), and Ui (Wi,t ) is an increasing
the practical case where there are channel estimation errors, concave utility function of user i’s average throughput, Wi,t ,
which are modeled as a self-noise term when calculating the up to time t. In other words, the scheduling and resource
achievable rate [13]. This significantly changes the resource allocation decision is the solution to
allocation problem. However, we show that the general ap-
proach in [6], [7] can still be applied in this case. max ∇U (W t )T · r t = max U̇i (Wi,t )ri,t . (1)
r t ∈R(et ) r t ∈R(et )
i
As in [7], within each time-slot the gradient-based policy
requires maximizing the weighted sum throughput over the For example, one class of utility functions given in [4] is
set of feasible rates. The set of feasible rates depends on ci α
Ui (Wi,t ) = α (Wi,t ) , α ≤ 1, α = 0,
(2)
Part of this work was done while V. Subramanian and J. Huang were at Mo- ci log(Wi,t ), α = 0,
torola. R. Berry was supported in part by the Motorola-Northwestern Center
for Seamless Communications and NSF CAREER award CCR-0238382. 2 We focus on systems that do not use superposition coding and successive
1 In the following we use the terms subcarrier and tone synonymously. interference cancellation within a sub-carrier.
1424407850/06/$20.00 1347
where α ≤ 1 is a fairness parameter and ci is a QoS weight. only a fraction xij of the tone is allocated. Without loss of
In this case, (1) becomes generality we set B = 1.
It has recently been suggested that for realistic systems the
max ci (Wi,t )α−1 ri,t . (3) OFDM model must have a self-interference term. Following
r t ∈R(et )
i
a similar model as in [13], we assume the received signal on
With equal class weights, setting α = 1 results in a scheduling the j’s subchannel of user i is
rule that maximizes the total throughput during each slot. For
α = 0, this results in the proportional fair rule. yij = hij sij + nij .
In general, we consider the problem of Assume that the receiver estimates channel hij with hij +hij,δ ,
where hij,δ is the channel estimation error. As a result, the
max wi,t ri,t , (4)
r t ∈R(et ) processed received signal is
i
where wi,t ≥ 0 is a time-varying weight assigned to the ith zij = h∗ij hij sij + h∗ij nij + h∗ij,δ (hij sij + nij ) ,
user at time t. We note that (4) must be re-solved at each and the effective SNR is
scheduling instant because of changes in both the channel state 2 2
hij sij
and the weights (e.g., the gradient of the utility). Eff-SNR = ,
2 hij,δ 2 2 2
2σij 1 + h 2 + hij,δ sij
A. OFDM capacity regions ij
2
Solving (4) depends on the state dependent capacity region where the variance of noise nij is assumed to be 2σij . Defining
R(e).3 We focus on a model appropriate for downlink OFDM 2 h 2 h 2 2
gij = hij , aij = hij,δ2 , eij = 2σij2 and sij = pij ,
ij ij
systems; similar models have been considered in [8], [12]. we get the effective SNR to be
In this model, R(e) is parameterized by the allocation of
pij eij
subcarriers to users and the allocation of power across subcar- Eff-SNR = . (5)
1 + aij + aij eij pij
riers. In a traditional OFDM system, at most one user may be
assigned to any subcarrier. Here, as in [9], [11], we make the Defining ẽij := eij / (1 + aij ) , then (5) can be written as
simplifying assumption that multiple users can share one tone pij ẽij
using some orthogonalization technique (e.g. via TDM) but Eff-SNR = .
1 + aij pij ẽij
not super-position coding. In practice, if a scheduling interval
contained multiple OFDM symbols, we could implement Compared to the case without self-noise, the effective SNR
such sharing by giving a fraction of the symbols to each is still increasing in pij . However, it now has a maximum of
user; of course, each user would be constrained to use an 1/aij . For the sake of presentation, we assume that a = aij
integer number of symbols and the required signaling overhead for all i and j. The analysis is almost identical if users have
would increase. If the number of tones is large4 , then this different aij ’s.
approximation gets tighter. Given a solution to this problem, Taking time sharing into consideration, user i’s feasible rate
we can obtain a feasible solution allowing only one user per on subcarrier j is now given by
tone by applying an appropriate projection. For the simulations pij ẽij
in Section IV, we assign only one user per tone. rij = xij B log 1 + . (6)
xij + apij ẽij
Let N = {1, . . . , N } denote the set of tones. For each tone
j and user i, let eij denote the received signal to noise ratio The achievable rate region is then
(SNR) per unit power. We denote the power allocated to user pij ẽij
R(e) = r : ri = xij log 1 + ,
i on subchannel j by pij and the fraction of that subchannel xij + apij ẽij
j
allocated touser i by xij . These must satisfy a total power (7)
constraint, i,j pij ≤ P, and for all subcarriers j, i xij ≤ 1, pij ≤ P, xij ≤ 1 ∀ j, (x, p) ∈ X , ,
i.e., the total fraction of each sub-carrier allocated must be no ij i
greater than one. For a given allocation, with perfect channel 6
estimation, user i’s feasible rate on subcarrier j is given by where
p e xij sij
rij = xij B log(1 + ijxijij ). This corresponds to the Shannon X := (x, p) ≥ 0 : xij ≤ 1, pij ≤ ∀i, j . (8)
eij
capacity of a Gaussian noise channel with bandwidth xij B and
p e
received SNR ijxijij .5 This SNR arises from viewing pij as Here, sij is a maximum SINR constraint on tone j for user i,
the energy per time-slot user i is allowed to use on subcarrier e.g. to model a constraint on the maximum rate per tone due
p to a limitation on the available modulation order.
j; the corresponding transmission power becomes xij when
ij
We assume that the joint channel state e is known by the
3 To simplify notation we have suppressed the time-dependence. scheduler for all users and tones with an estimation error
4 Most proposals consider about a thousand subcarriers. quantified by a. With many tones and users, providing pilots
5 As in [7], to better model the achievable rates in a practical system we
can re-normalize eij by γeij , where γ ∈ [0, 1] represents the system’s “gap” 6 We use boldfaced symbols to denote the vector of all the corresponding
from capacity. values across users/tones, e.g. x is the vector of all xij ’s.
1348
and/or feedback per tone can require excessive overhead. One where
approach for reducing this overhead is by forming subchannels
2a + 1 4a(a + 1)
from disjoint sets of tones. Feedback and resource allocation q(a, z) = 1+ z−1 .
is then done at the granularity of these subchannels. The above 2a(a + 1) (2a + 1)2
model can be adapted to this setting by viewing N as the set Notice that the optimal power allocation is no longer increas-
of subchannels and eij as the effective SNR per unit power for ing in ẽij as in the case when a = 0. On the other hand,
user i within the jth subchannel. Specifically, assuming that the optimal value of p∗ij is still a linear function of xij , which
k tones are bundled into subchannel j, eij is chosen so that means the resulting Lagrangian is also a linear function of xij .
the total rate for user i in this subchannel is approximately Substituting (11) into L(x, p, λ, μ), we have
pij eij
kxij log(1 + xij +ap ij eij
). For our simulations, we set eij to
be the geometric average of the SNR per unit power of all L(x, p∗ , λ, μ) = xij (μij (λ) − μj ) + μj + λP,
the tones in a subchannel, which provides a (provable) lower ij j
bound of the achievable rate.
wi ẽij
Subchannels can be formed in various ways; in our sim- where μij (λ) := wi h a, λ , sij , and
ulations, the following three approaches are considered: (1) ⎛ ⎞
+
adjacent channelization, where adjacent tones are grouped into q a, (ω − 1) ∧ sij
sub-channels; (2) interleaved channelization, where tones are h (a, ω, sij ) := log ⎝1 + ⎠
+
1 + q a, (ω − 1) ∧ sij
interleaved to form subchannels; and (3) random channel-
ization, where tones are randomly assigned to subchannels. 1 +
− q a, (ω − 1) ∧ sij .
In IEEE 802.16d/e, interleaved channelization is primarily ω
∗
used; the optional “band AMC mode” allows for adjacent Optimizing L(x, p , λ, μ) over x yields the corresponding
channelization. Randomized channelization can model systems dual function
that employ frequency hopping as in the Flash OFDM system.
L(λ, μ) := L(x∗ , p∗ , λ, μ)
+
III. O PTIMAL AND S UBOPTIMAL A LGORITHMS = μij (λ) − μj + μj + λP.
ij j
From (4) and (7), the scheduling and resource allocation
problem can be stated as: Since there is no duality gap, it follows that minimizing
this over λ and μ yields an optimal solution to (9). First
pij ẽij
max V (x, p) := wi xij log 1 + xij +ap ẽ
ij ij considering the optimal μ, and we have as in [7]:
(x,p)∈X
i j Lemma 3.1: For all λ ≥ 0,
(9)
subject to: pij ≤ P, and xij ≤ 1, ∀j ∈ N .
L(λ) := min L(λ, μ) = λP + μ∗j (λ),
i,j i μ≥0
j
A. Optimal Dual Solution where for all j, the minimizing value of μj is
We first solve this problem using duality methods. It can be
shown that (9) is convex and Slater’s condition holds, so there μ∗j (λ) = max μij (λ). (12)
i
is no duality gap and the optimal solution is characterized by Note that (12) requires a sort of all the users according to the
the Karush-Khun-Tucker conditions [1]. metrics μij for each sub-channel j. It can be shown that L(λ)
Consider the Lagrangian, is a convex function of λ; hence it can be minimized using
an iterated one dimensional search, like the Golden Section
pij ẽij method. At the minimizing value λ∗ , L(λ∗ ) gives the optimal
L(x, p, λ, μ) := wi xij log 1 +
i j
xij + apij ẽij solution to (9).
B. Optimal primal variables with time-sharing
+λ P − pij + μj 1 − xij .
i,j j i Next we turn to finding optimal values of the primal
variables (x, p). For a given λ ≥ 0, let
First we optimize over p given x, μ and λ. If there is no self-
noise (a = 0), we obtain a “water-filling” type of solution,7 (x∗ , p∗ ) := arg max L (x, p, λ, μ∗ (λ)) , (13)
(x,p)∈X
+
∗ xij wi ẽij which can be solved using the same procedure as in deriving
pij = −1 ∧ sij . (10)
ẽij λ the dual function. Given that λ = λ∗ , it follows from duality
theory that if the corresponding (x∗ , p∗ ) are primal feasible
When a > 0, we obtain
+ and satisfy complimentary slackness, then they are optimal.
∗ xij wi ẽij However, in (12) there can be multiple users in a given
pij = q a, −1 ∧ sij , (11)
ẽij λ subchannel whose metrics μij are tied at the maximum value.
In this case, there will be multiple primal values that satisfy
7 The notation (x)+ = max(x, 0) and x ∧ y = min(x, y). (13), not all of which may be feasible. Thus, breaking these
1349
TABLE I
ties to settle on a specific primal solution is necessary to find
P ERFORMANCE FOR DIFFERENT CHOICES OF α ( ADJACENT
the optimal solution. A key point is that when ties occur at a
SUBCHANNELIZATION , NO - SELF - NOISE ).
given λ, L(λ) is not differentiable at that λ. However, since
L(λ) is a convex function, subgradients exists. Such ties can α Algorithm Utility Log U Rate(kbps) Num.
be broken using the subgradient information as in [7]. Details 0 OPTIMAL 10.74 10.74 60.8 7.73
0 HEURISTIC 10.66 10.66 54.6 7.29
are omitted due to space constraints.
0.5 OPTIMAL 545.2 10.83 105.9 7.32
C. Single user per tone 0.5 HEURISTIC 528.8 10.73 99.3 7.20
1 OPTIMAL 261677 6.79 261.7 2.58
Next we consider the case where the final allocation
1 HEURISTIC 261676 6.79 261.7 2.58
is restricted to at most one user per subchannel. We can
still find the optimal λ∗ . If there are no ties as discussed
above, the optimal solution will only allow one user/tone.
If there are ties, a reasonable heuristic is to simply choose block-fading model based upon the Doppler frequency (for
one extreme point allocation. In our simulations, we choose the block-length in time) and a standard reference mobile
the extreme point corresponding to the subgradient with the delay-spread model (for variation in frequency). For a user’s
smallest non-negative value; i.e. the extreme point f , for which fast-fading term, each multi-path component was held fixed
for 2msec and an independent value was generated for the
j∈N f (j)j is closest to P , without exceeding it. Other rules
p̃
for choosing an extreme point could also be used. next block, corresponding to a 250MHz Doppler frequency.
For a given extreme point, the total power constraint using The delay-spread was 1μsec. The user’s channel conditions
the powers p̃f (j)j will be over-shot or under-shot (unless this averaged over the applicable channelization scheme are fed
point is optimal). We then re-optimize the power allocation back to the scheduler for all the channels.
for the given fixed tone allocation x, i.e. we solve We considered a system bandwidth of 5MHz corresponding
to 512 OFDM tones. We group these into 64 subchannels
max V (n, p) s.t. pij ≤ P. (14) (8 tones per subchannel). The symbol duration was 100μsec
p:(p,x)∈X
ij with a cyclic prefix of 10μsec. This roughly corresponds to 20
Let Lx (λ) be the dual function for this problem. Given that OFDM symbols per fading block. The resource allocation is
λ̃ = arg minλ≥0 Lx (λ), the optimal power allocation to (14) done once per fading block. All the results are averaged over
is given by (10) with λ = λ̃ and the given tone allocation. A the last 2000 OFDM symbols out of 60000 OFDM symbols
bisection search can again be used to find the optimal λ. (i.e., 3000 fading blocks), at this time the system has reached
a stationary operation point. All users were infinitely back-
D. Single sort suboptimal algorithm
logged and assigned a throughput-based utility as in (2) with
The optimal sub-carrier allocation is determined by assign- parameter ci = 1 and the same fairness parameters (α) for
ing each tone j to the user with the largest metric μ∗j (λ∗ ) on each simulation. To account for realistic network conditions,
that tone (breaking any ties as discussed above). This requires we calculate the achievable rate of user i on subchannel j as
iterating to find the optimal Lagrange multiplier λ∗ . We give a
B 0.56pij ẽij
sub-optimal algorithm that is based instead on a single sort of rij = 0.28 xij log 1 + ,
the users on each tone according to a different metric. Here, S xij + apij ẽij
we consider using the metric wi R̄ij , where R̄ij is the rate that where B is the subchannel bandwidth and S is the symbol
user i could achieve on this channel under a constant power length. The scheduling is based on the geometric average of
ẽij P/N
allocation, i.e., R̄ij = log[1+(sij ∧( 1+aẽ ij P/N
))]. The tone is the subchannel gains; the decoded rate is based on per tone
allocated to the user with the largest metric, with ties broken channel conditions. The total power constraint is fixed at P =
arbitrarily. After the tone allocation is made, constant power is 6W. There are no per-user SINR constraints (i.e., sij = ∞).
allocated on all subchannels. This metric was motivated in part The first set of simulation results are for a system with
by prior work in [8], [10] where a uniform power allocation adjacent subchannelization and no self-noise (a = 0). Table I
was shown to be nearly optimal. shows results for both algorithms for different choices of the
utility parameter α. The column “Utility” gives the average
IV. S IMULATION S TUDY utility per user for each algorithm. The column “log U” shows
In this section we report simulation results for the algorithm the log utility per user; this gives some indication of the
(OPTIMAL) that finds the optimal λ∗ and then chooses a tone- “fairness” of the resulting allocation (for α = 0 this is the same
allocation with one user per tone as described in Section III- as the utility). The column “Rate” is the average throughput
C. We also consider the sub-optimal algorithm (HEURISTIC) per user, and the final column is the average number of users
described in Section III-D. We simulate a single cell with scheduled. For each choice of α, the two algorithms perform
M = 40 users. The channel gains eij are the product of a fixed close to each other for each of these metrics; when α = 1
location-based term for each user i and a frequency-selective (maximum throughput) they have identical performance.
fast fading term. The location-based components were picked Next we consider the effect of different channelization
using an empirically obtained distribution for many users in schemes. Table II shows the performance of the two algorithms
a large system. The fast-fading term was generated using a for the adjacent (Adj.), randomized (Ran.), and Interleaved
1350
TABLE II
User Throughput CDF (α=0.5, adjacent subchannelization)
P ERFORMANCE OF DIFFERENT CHANNELIZATION SCHEMES (α = 0.5, NO 1
SELF - NOISE ).
0.95
Chan. Algorithm Utility Log U Rate (kbps) Num. OPTIMAL (a=0)
Adj. OPTIMAL 545.15 10.83 105.9 7.32 0.9 HEURISTIC (a=0)
Adj. HEURISTIC 528.83 10.73 99.3 7.20
Int. OPTIMAL 494.61 10.53 92.4 1.79 0.85 OPTIMAL (a=0.01)
Int. HEURISTIC 486.40 10.47 88.4 1.14
Ran. OPTIMAL 487.53 10.53 89.2 4.89 0.8
HEURISTIC (a=0.01)
Ran. HEURISTIC 479.07 10.46 84.2 4.39
0.75
TABLE III
0.7
P ERFORMANCE OF DIFFERENT CHANNELIZATION SCHEMES (α = 0.5,
SELF - NOISE a = 0.01). 0.65
Chan. Algorithm Utility Log U Rate (kbps) Num.
0.6
Adj. OPTIMAL 512.17 10.82 82.5 7.51 0 200 400 600 800 1000
Adj. HEURISTIC 489.30 10.70 73.7 7.39 Throughput in Kbps
Int. OPTIMAL 466.38 10.53 73.4 1.86
Int. HEURISTIC 452.30 10.45 66.1 1.16
Ran. OPTIMAL 458.53 10.51 70.0 5.03 Fig. 1. Empirical CDF of users’ throughputs.
Ran. HEURISTIC 444.69 10.43 63.1 4.46
1351