Chenming-Hu ch7

Hu_ch07v3.
fm Page 259 Friday, February 13, 2009 4:55 PM
7
MOSFETs in ICs—Scaling, Leakage,
and Other Topics
CHAPTER OBJECTIVES
How the MOSFET gate length might continue to be reduced is the subject of this
chapter. One important topic is the off-state current or the leakage current of the
MOSFETs. This topic complements the discourse on the on-state current conducted in
the previous chapter. The major topics covered here are the subthreshold leakage and its impact
on device size reduction, the trade-off between Ion and Ioff and the effects on circuit
design. Special emphasis is placed on the understanding of the opportunities for future
MOSFET scaling including mobility enhancement, high-k dielectric and metal gate, SOI,
multigate MOSFET, metal source/drain, etc. Device simulation and MOSFET compact
model for circuit simulation are also introduced.
Metal–oxide–semiconductor (MOS) integrated circuits (ICs) have met the
world’s growing needs for electronic devices for computing,
communication, entertainment, automotive, and other applications with
continual improvements in cost, speed, and power consumption. These
improvements in turn stimulated and enabled new applications and greatly
improved the quality of life and productivity worldwide.
7.1 ● TECHNOLOGY SCALING—FOR COST, SPEED, AND POWER CONSUMPTION ●
In the forty-five years since 1965, the price of one bit of semiconductor memory has
dropped 100 million times. The cost of a logic gate has undergone a similarly
dramatic drop. This rapid price drop has stimulated new applications and
semiconductor technology has improved the ways people carry out just about all
human endeavors. The primary engine that powered the proliferation of electronics
is “miniaturization.” By making the transistors and the interconnects smaller, more
circuits can be fabricated on each silicon wafer and therefore each circuit becomes
cheaper. Miniaturization has also been instrumental to the improvements in speed
and power consumption of ICs.
Gordon Moore made an empirical observation in 1965 that the number of
devices on a chip doubles every 18 to 24 months or so. This Moore’s Law is a
succinct description of the rapid and persistent trend of miniaturization. Each time
the minimum line width is reduced, we say that a new technology generation or
technology node is introduced. Examples of technology generations are 0.18 µm,
0.13 µm, 90 nm, 65 nm, 45 nm … generations. The numbers refer to the minimum
metal line width. Poly-Si gate length may be even smaller. At each new node, all the
features in the circuit layout, such as the contact holes, are reduced in size to 70% of
the previous node. This practice of periodic size reduction is called scaling.
Historically, a new technology node is introduced every two to three years.
The main reward for introducing a new technology node is the reduction of
circuit size by half. (70% of previous line width means ~50% reduction in area, i.e.,
0.7 × 0.7 = 0.49.) Since nearly twice as many circuits can be fabricated on each wafer
with each new technology node, the cost per circuit is reduced significantly. That
drives down the cost of ICs.
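The 70% rule can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming an exact factor of 0.7 per generation starting from the 0.18 µm node; the function name `scaled_nodes` is illustrative and not from the text. The node names quoted above (0.13 µm, 90 nm, 65 nm, ...) are rounded labels, so the computed widths only approximate them.

```python
def scaled_nodes(start_nm: float, generations: int, factor: float = 0.7) -> list[float]:
    """Return the line widths of successive nodes, each `factor` times the previous one."""
    widths = [start_nm]
    for _ in range(generations):
        widths.append(widths[-1] * factor)
    return widths


# Assumption: start at 180 nm and apply the 0.7 factor six times.
widths = scaled_nodes(180.0, 6)

# A linear shrink applies in both x and y, so area scales as factor squared.
area_ratio = 0.7 ** 2
circuits_per_wafer_gain = 1.0 / area_ratio

print([round(w) for w in widths])
print(round(area_ratio, 2), round(circuits_per_wafer_gain, 2))
```

The computed widths track the named generations closely, and the area ratio of 0.49 is the reason roughly twice as many circuits fit on each wafer.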
● Initial Reactions to the Concept of the IC ●

Anecdote contributed by Dr. Jack Kilby, January 22, 1991
“Today the acceptance of the integrated circuit concept is universal. It was not always
so. When the integrated circuit was first announced in 1959, several objections were
raised. They were:
1) Performance of transistors might be degraded by the compromises
necessary to include other components such as resistors and capacitors.
2) Circuits of this type were not producible. The overall yield would be too low.
3) Designs would be expensive and difficult to change.
Debate of the issues provided the entertainment at technical meetings for the next
five or six years.”
In 1959, Jack Kilby of Texas Instruments and Robert Noyce of Fairchild Semiconductor
independently invented technologies of interconnecting multiple devices on a single
semiconductor chip to form an electronic circuit. Following a 10-year legal battle,
both companies’ patents were upheld and Noyce and Kilby were recognized as the
co-inventors of the IC. Dr. Kilby received a Nobel Prize in Physics in 2000 for inventing
the integrated circuit. Dr. Noyce, who is credited with the layer-by-layer planar
approach of fabricating ICs, had died in 1990.
Besides the line width, some other parameters are also reduced with scaling
such as the MOSFET gate oxide thickness and the power supply voltage. The
reductions are chosen such that the transistor current density (Ion /W) increases
with each new node. Also, the smaller transistors and shorter interconnects lead to
smaller capacitances. Together, these changes cause the circuit delays to drop
(Eq. 6.7.1). Historically, IC speed has increased roughly 30% at each new
technology node. Higher speed enables new applications such as wide-band data
transmission via RF mobile phones.
Scaling does another good thing. Eq. (6.7.6) shows that reducing
capacitance and especially the power supply voltage is effective in lowering the
power consumption. Thanks to the reduction in C and Vdd, power consumption
per chip has increased only modestly per node in spite of the rise in switching
frequency, f, and the doubling of transistor count per chip at each technology
node. If there had been no scaling, doing the job of a single PC microprocessor
chip (operating a billion transistors at 2 GHz) using 1970 technology would
require the power output of an electrical power generation plant.
In summary, scaling improves cost, speed, and power consumption per
function with every new technology generation. All of these attributes have been
improved by 10 to 100 million times in four decades—an engineering achievement
unmatched in human history! When it comes to ICs, small is beautiful.
7.1.1 Innovations Enable Scaling

Semiconductor researchers around the world have been meeting several times a
year for the purpose of generating consensus on the transistor and circuit
performance that will be required to fulfill the projected market needs in the future.
Their annually updated document, the International Technology Roadmap for
Semiconductors (ITRS), sets out the goals and points out the challenging
problems but does not provide the solutions [1]. It tells the vendors of
manufacturing tools and materials and the research community the expected
roadblocks. The list of show stoppers is always long and formidable but innovative
engineers working together and separately have always risen to the challenge and
done the seemingly impossible.
Table 7–1 is a compilation of some history and some ITRS technology
projections. High-performance (HP) stands for high-performance computer
processor technology. LSTP stands for the technology for low standby-power
products such as mobile phones. The physical gate length, Lg, is actually smaller
than the technology node. Take the 90 nm node, for example; although lithography
technology can only print 90 nm photoresist lines, engineers transfer the pattern
into oxide lines and then isotropically etch (see Section 3.4) the oxide in a dry
isotropic-etching tool to reduce the width (and the thickness) of the oxide lines.
Using the narrowed oxide lines as the new etch mask, they produce the gate
patterns by etching. Innumerable innovations by engineers at each node have
enabled the scaling of the IC technology.
7.1.2 Strained Silicon and Other Innovations

Ion in Table 7–1 rises rapidly. This is only possible because of the strained silicon
technology introduced around the 90 nm node [2]. The electron and hole mobility
can be raised (or lowered) by carefully engineered mechanical strains. The strain
changes the lattice constant of the silicon crystal and therefore the E–k relationship
through the Schrödinger wave equation. The E–k relationship, in turn, determines
the effective mass and the mobility.
For example, the hole surface mobility of a PFET can be raised when the
channel is compressively stressed. The compressive strain may be created in
several ways. We illustrate one way in Fig. 7–1. After the gate is defined, trenches
are etched into the silicon adjacent to the gate. The trenches are refilled by
epitaxial growth (see Section 3.7.3) of SiGe, typically a 20% Ge and 80% Si
mixture. Because Ge atoms are larger than Si atoms and in epitaxial growth the
number of atoms in the trench is equal to the original number of Si atoms, it is as
if a large hand is forced into a small glove. A force is created that pushes on the
channel region (as shown in Fig. 7–1) and raises the hole mobility. It is also
attractive to incorporate a thin film of Ge material in the channel itself because
Ge has higher carrier mobilities than Si [3].

TABLE 7–1 • Scaling from 90 nm to 22 nm and innovations that enable the scaling.

Year of Shipment            2003       2005       2007       2010       2013
Technology Node (nm)        90         65         45         32         22
Lg (nm) (HP/LSTP)           37/65      26/45      22/37      16/25      13/20
EOTe (nm) (HP/LSTP)         1.9/2.8    1.8/2.5    1.2/1.9    0.9/1.6    0.9/1.4
VDD (V) (HP/LSTP)           1.2/1.2    1.1/1.1    1.0/1.1    1.0/1.0    0.9/0.9
Ion, HP (µA/µm)             1100       1210       1500       1820       2200
Ioff, HP (µA/µm)            0.15       0.34       0.61       0.84       0.37
Ion, LSTP (µA/µm)           440        465        540        540        540
Ioff, LSTP (µA/µm)          1E-5       1E-5       3E-5       3E-5       2E-5

Innovations, by the node at which the text introduces them: strained silicon (90 nm),
high-k/metal-gate (45 nm), wet lithography (32 nm), new structure (22 nm).
HP: High-Performance technology. LSTP: Low Standby Power technology for portable applications.
EOTe: Equivalent electrical Oxide Thickness, i.e., equivalent Toxe. Ion: NFET Ion.
In Table 7–1, EOTe or the electrical equivalent oxide thickness is the total
thickness of the gate dielectric, poly-gate depletion (if any), and the inversion layer
expressed in equivalent SiO2 thickness. It is improved (reduced) at the 45 nm node
by a larger factor over the previous node. The enabling innovations are metal gate
and high-k dielectric, which will be presented in Section 7.4.
FIGURE 7–1 Example of strained-silicon MOSFET. Hole mobility can be raised with a
compressive mechanical strain illustrated with the arrows pushing on the channel region.
(Figure labels: gate; both trenches filled with epitaxial SiGe; N-type Si.)
At the 32 nm node, wet lithography (see Section 3.3.1) is used to print the fine
patterns. At the 22 nm node, new transistor structures may be used to reverse the
trend of increasing Ioff, which is the source of a serious power consumption issue.
Some new structures are presented in Section 7.8.
7.2 ● SUBTHRESHOLD CURRENT—“OFF” IS NOT TOTALLY “OFF” ●
Circuit speed improves with increasing Ion; therefore, it would be desirable to use a
small Vt. Can we set Vt at an arbitrarily small value, say 10 mV? The answer is no.
At Vgs < Vt, an N-channel MOSFET is in the off state. However, a leakage
current can still flow between the drain and the source. The MOSFET current
observed at Vgs < Vt is called the subthreshold current. This is the main contributor
to the MOSFET off-state current, Ioff. Ioff is the Id measured at Vgs = 0 and
Vds = Vdd. It is important to keep Ioff very small in order to minimize the static
power that a circuit consumes when it is in the standby mode. For example, if Ioff is
a modest 100 nA per transistor, a cell-phone chip containing one hundred million
transistors would consume 10 A even in standby. The battery would be drained in
minutes without receiving or transmitting any calls. A desktop PC processor would
dissipate more power because it contains more transistors and face expensive
problems of cooling the chip and the system.
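The standby-current estimate above is simple enough to reproduce numerically. The sketch below uses the per-transistor leakage and transistor count from the text; the battery capacity of 1 Ah is a hypothetical value chosen only to illustrate the "drained in minutes" remark, and the function names are illustrative.

```python
def standby_current_a(ioff_per_transistor_a: float, transistor_count: int) -> float:
    """Total off-state current if every transistor leaks the same Ioff."""
    return ioff_per_transistor_a * transistor_count


def hours_to_drain(battery_capacity_ah: float, current_a: float) -> float:
    """Idealized drain time: capacity divided by a constant current."""
    return battery_capacity_ah / current_a


# Values from the text: 100 nA per transistor, one hundred million transistors.
total_a = standby_current_a(100e-9, 100_000_000)

# Hypothetical battery capacity, not from the text.
BATTERY_AH = 1.0
minutes = hours_to_drain(BATTERY_AH, total_a) * 60.0

print(f"standby current = {total_a:.1f} A, battery lasts about {minutes:.0f} min")
```

Under these assumptions the chip draws 10 A in standby and a 1 Ah battery would last about six minutes, which is why Ioff targets for portable products are orders of magnitude below 100 nA per transistor.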
Figure 7–2a shows a subthreshold current plot. It is plotted in a semi-log Ids vs.
Vgs graph. When Vgs is below Vt, Ids is clearly a straight line, i.e., an exponential
function of Vgs.
Figure 7–2b–d explains the subthreshold current. At Vgs below Vt, the
inversion electron concentration (ns) is small but nonetheless can allow a small
leakage current to flow between the source and the drain. In Fig. 7–2b, a larger Vgs
would pull the Ec at the surface closer to EF, causing ns and Ids to rise. From the
equivalent circuit in Fig. 7–2c, one can observe that
$$\frac{d\varphi_s}{dV_{gs}} = \frac{C_{oxe}}{C_{oxe} + C_{dep}} \equiv \frac{1}{\eta} \tag{7.2.1}$$
$$\eta = 1 + \frac{C_{dep}}{C_{oxe}} \tag{7.2.2}$$
Integrating Eq. (7.2.1) yields
$$\varphi_s = \text{constant} + V_g/\eta \tag{7.2.3}$$
Ids is proportional to ns, therefore
$$I_{ds} \propto n_s \propto e^{q\varphi_s/kT} \propto e^{q(\text{constant} + V_g/\eta)/kT} \propto e^{qV_g/\eta kT} \tag{7.2.4}$$
A practical and common definition of Vt is the Vgs at which Ids = 100 nA × W/L
as shown in Fig. 6–12. (Some companies may use 200 nA instead of 100 nA.)
Equation (7.2.4) may be rewritten as
$$I_{ds}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot e^{q(V_{gs} - V_t)/\eta kT} \tag{7.2.5}$$
[Figure 7–2 panels: (a) semi-log plot of Id (µA/µm) versus Vgs (V) for PMOS and NMOS at
|Vds| = 0.05 V and 1.2 V; (b) energy band diagram showing Ec and EF under gate bias Vg;
(c) capacitor network of Coxe and Cdep; (d) log(Ids) versus Vgs at Vds = Vdd, marking Vt at
Ids = 100 nA × W/L, the slope 1/S, and Ioff.]
FIGURE 7–2 (a) The current that flows at Vgs < Vt is called the subthreshold current. Vt ~ 0.2 V.
The lower/upper curves are for Vds = 50 mV/1.2 V. After Ref. [2]. (b) When Vg is increased,
Ec at the surface is pulled closer to EF, causing ns and Ids to rise; (c) equivalent capacitance
network; (d) subthreshold I-V with Vt and Ioff. Swing, S, is the inverse of the slope in the
subthreshold region.
Clearly, Eq. (7.2.5) agrees with the definition of Vt and Eq. (7.2.4). The
simplicity of Eq. (7.2.5) is another reason for favoring the new Vt definition. At
room temperature, the function exp(qVgs /kT) changes by a factor of 10 for every 60 mV
change in Vgs; therefore, exp(qVgs /ηkT) changes by a factor of 10 for every η × 60 mV. For
example, if η = 1.5, Eq. (7.2.5) states that Ids drops by ten times for every 90 mV
of decrease in Vgs below Vt at room temperature. η × 60 mV is called the
subthreshold swing and represented by the symbol, S.
$$S\,(\mathrm{mV/decade}) = \eta \cdot 60\ \mathrm{mV} \cdot \frac{T}{300\ \mathrm{K}} \tag{7.2.6}$$
$$I_{ds}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot e^{q(V_{gs} - V_t)/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{(V_{gs} - V_t)/S} \tag{7.2.7}$$
$$I_{off}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot e^{-qV_t/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{-V_t/S} \tag{7.2.8}$$
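The 60 mV figure in Eq. (7.2.6) is a rounded value of (kT/q) ln 10 at 300 K. The sketch below, with illustrative function names that are not from the text, computes η from Eq. (7.2.2) and the swing from the unrounded expression so the rounding can be seen directly.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant divided by electron charge, in V/K


def eta(c_dep: float, c_oxe: float) -> float:
    """Eq. (7.2.2): eta = 1 + Cdep/Coxe (any consistent capacitance units)."""
    return 1.0 + c_dep / c_oxe


def swing_mv_per_decade(eta_value: float, temp_k: float = 300.0) -> float:
    """Subthreshold swing S = eta * (kT/q) * ln(10), returned in mV per decade."""
    return eta_value * K_B_OVER_Q * temp_k * math.log(10.0) * 1e3


# Ideal case (eta = 1) and the eta = 1.5 case used in the text.
print(f"{swing_mv_per_decade(1.0):.1f} mV/decade")
print(f"{swing_mv_per_decade(eta(0.5, 1.0)):.1f} mV/decade")
# Cooling lowers S in proportion to T, as Eq. (7.2.6) states.
print(f"{swing_mv_per_decade(1.5, temp_k=77.0):.1f} mV/decade")
```

The exact values come out slightly below 60 mV and 90 mV per decade; the text's rounded numbers are adequate for estimates.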
For given W and L, there are two ways to minimize Ioff illustrated in
Fig. 7–2 (d). The first is to choose a large Vt. This is not desirable because a large
Vt reduces Ion and therefore degrades the circuit speed (see Eq. (6.7.1)). The
preferable way is to reduce the subthreshold swing. S can be reduced by reducing
η. That can be done by increasing Coxe (see Eq. 7.2.2), i.e., using a thinner Tox,
and by decreasing Cdep, i.e., increasing Wdep (see footnote 1). An additional way to
reduce S, and therefore to reduce Ioff, is to operate the transistors at temperatures
significantly below room temperature. This last approach is valid in principle but
rarely used because cooling adds considerable cost.
Besides the subthreshold leakage, there is another leakage current
component that has become significant. That is the tunnel leakage through very
thin gate oxide that will be presented in Section 7.4. The drain-to-body junction
leakage is the third leakage component.
● The Effect of Interface States ●

The subthreshold swing is degraded when interface states are present (see
Section 5.7). Figure 7–3 shows that when ϕS changes, some of the interface traps
move from above the Fermi level to below it or vice versa. As a result, these interface
traps change from being empty to being occupied by electrons. This change of charge
in response to change of voltage (ϕS) has the effect of a capacitor. The effect of the
interface states is to add a parallel capacitor to Cdep in Fig. 7–2c. The subthreshold
swing is poor unless the semiconductor-dielectric interface has a low density of
interface states, as a carefully prepared Si-SiO2 interface does. The subthreshold swing
is often degraded after a MOSFET is electrically stressed (see sidebar in Section 5.7)
and new interface states are generated.
Footnote 1: According to Eq. 6.5.2 and Eq. 7.2.2, η should be equal to m. In reality, η is larger than m because
Coxe is smaller at low Vgs (subthreshold condition) than in inversion due to a larger Tinv as shown in
Fig. 5–25. Nonetheless, η and m are closely related.
FIGURE 7–3 (a) Most of the interface states are empty because they are above EF. (b) At
another Vg, most of the interface states are filled with electrons. As a result, the interface
charge density changes with Vg.
EXAMPLE 7–1 Subthreshold Leakage Current

An N-channel transistor has Vt = 0.34 V and S = 85 mV/decade, W = 10 µm and
L = 50 nm. (a) Estimate Ioff. (b) Estimate Ids at Vg = 0.17 V.
SOLUTION:
a. Use Eq. (7.2.8).
$$I_{off}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot 10^{-V_t/S} = 100 \cdot \frac{10}{0.05} \cdot 10^{-0.34/0.085} = 2\ \mathrm{nA}$$
b. Use Eq. (7.2.7).
$$I_{ds}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot 10^{(V_g - V_t)/S} = 100 \cdot \frac{10}{0.05} \cdot 10^{(0.17 - 0.34)/0.085} = 200\ \mathrm{nA}$$
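The two results of Example 7–1 can be reproduced with a direct evaluation of Eq. (7.2.7). This is a minimal sketch; the function name `ids_na` and its argument names are illustrative, and W and L only need to share the same length unit.

```python
def ids_na(vgs: float, vt: float, swing: float, w: float, l: float) -> float:
    """Eq. (7.2.7): subthreshold Ids in nA; vgs, vt, swing in volts; w and l in the same unit."""
    return 100.0 * (w / l) * 10.0 ** ((vgs - vt) / swing)


# Example 7-1 values: Vt = 0.34 V, S = 0.085 V/decade, W = 10 um, L = 0.05 um.
ioff = ids_na(0.0, 0.34, 0.085, 10.0, 0.05)   # Eq. (7.2.8) is the Vgs = 0 case
ids = ids_na(0.17, 0.34, 0.085, 10.0, 0.05)

print(f"Ioff = {ioff:.2f} nA, Ids at 0.17 V = {ids:.1f} nA")
```

Because Vt/S = 4 here, Ioff sits exactly four decades below the 100 nA × W/L threshold current, and halving the gate overdrive deficit recovers two of those decades.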
7.3 ● Vt ROLL-OFF—SHORT-CHANNEL MOSFETS LEAK MORE ●
The previous section pointed out that Vt must not be set too low; otherwise, Ioff
would be too large. The present section extends that analysis to show that the
channel length (L) must not be too short. The reason is this: Vt drops with
decreasing L as illustrated in Fig. 7–4. When Vt drops too much, Ioff becomes too
large and that channel length is not acceptable.
● Gate Length (Lg) vs. Electrical Channel Length (L) ●
Gate length is the physical length of the gate and can be accurately measured with a
scanning electron microscope (SEM). It is carefully controlled in the fabrication
plant. The channel length, in comparison, cannot be determined very accurately and
easily due to the lateral diffusion of the source and drain junctions. L tracks Lg but
the difference between the two just cannot be quantified precisely in spite of efforts
such as described in Section 6.11. As a result, Lg is widely used in lieu of L in data
presentations as is done in Fig. 7–4. L is still a useful concept and is used in
theoretical equations even though L cannot be measured precisely for small transistors.
[Figure 7–4 plot: Vt roll-off (V) versus Lg (µm, log scale from 0.01 to 1) for Vds = 50 mV
and Vds = 1.0 V.]
FIGURE 7–4 |Vt| decreases at very small Lg. This phenomenon is called Vt roll-off. It
determines the minimum acceptable Lg because Ioff is too large when Vt becomes too low or
too sensitive to Lg.
At a certain Lg, Vt becomes so low that Ioff becomes unacceptable [see
Eq. (7.2.8)]. Doping the bodies of the short-channel devices more heavily than the
long-channel devices would raise their Vt. Still, at a certain Lg, Vt is so sensitive to
the manufacturing-caused variation in L that the worst-case Ioff becomes
unacceptable. Device development engineers must design the device such that the
Vt roll-off does not prevent the use of the targeted minimum Lg, e.g., those listed in
the Lg row of Table 7–1.
Why does Vt decrease with decreasing L? Figure 7–5 illustrates a model for
understanding this effect. Figure 7–5a shows the energy band diagram along the
semiconductor–insulator interface of a long channel device at Vgs = 0. Figure 7–5b
shows the case at Vgs = Vt. In the case of (b), Ec in the channel is pulled lower than
in case (a) and therefore is closer to the Ec in the source. When the channel Ec is
only ~0.2 eV higher than the Ec in the source (which is also ~EFn), ns in the channel
reaches ~10¹⁷ cm⁻³ and the inversion threshold condition (Ids = 100 nA × W/L) is
reached. We may say that a 0.2 eV potential barrier is low enough to allow the
electrons in the N+ source to flow into the channel to form the inversion layer. The
following analogy may be helpful for understanding the concept of the energy
barrier height. The source is a reservoir of water; the potential barrier is a dam; and
Vgs controls the height of the dam. When Vgs is high enough, the dam is sufficiently
low for the water to flow into the channel and the drain. That defines Vt.

FIGURE 7–5 a–d: Energy band diagram from source to drain when Vgs = 0 V and Vgs = Vt.
a–b long channel; c–d short channel. (Panels: (a) long channel, Vgs = 0 V; (b) long channel,
Vgs = Vt-long, barrier ~0.2 V; (c) short channel, Vgs = 0 V; (d) short channel, Vgs = Vt-short.)
Figure 7–5c shows the case of a short-channel device at Vgs = 0. If the channel is
short enough, Ec will not be able to reach the same peak value as in Fig. 7–5a. As a
result, a smaller Vgs is needed in Fig. 7–5d than in Fig. 7–5b to pull the barrier down to
0.2 eV. In other words, Vt is lower in the short channel device than the long channel
device. This explains the Vt roll-off shown in Fig. 7–4.
We can understand Vt roll-off from another approach. Figure 7–6 shows a
capacitor between the gate and the channel. It also shows a second capacitor, Cd ,
between the drain and the channel terminating at around the middle of the channel,
where Ec peaks in Fig. 7–5d. As the channel length is reduced, the drain to source and
the drain to “channel” distances are reduced; therefore, Cd increases. Do not be
concerned with the exact definition or value of Cd. Instead, focus on the concept that
Cd represents the capacitive coupling between the drain and the channel barrier point.
From this two-capacitor equivalent circuit, it is evident that the drain voltage
has a similar effect on the channel potential as the gate voltage. Vgs and Vds,
together, determine the channel potential barrier height shown in Fig. 7–5. When
Vds is present, less Vgs is needed to pull the barrier down to 0.2 eV; therefore, Vt is
lower by definition. This understanding gives us a simple equation for Vt roll-off,
$$V_t = V_{t\text{-}long} - V_{ds} \cdot \frac{C_d}{C_{oxe}} \tag{7.3.1}$$
where Vt-long is the threshold voltage of a long-channel transistor, for which Cd = 0.
More accurately, Vds should be supplemented with a constant that represents the
combined effects of the 0.2 V built-in potentials between the N– inversion layer and
both the N+ drain and source at the threshold condition [4].
$$V_t = V_{t\text{-}long} - (V_{ds} + 0.4\ \mathrm{V}) \cdot \frac{C_d}{C_{oxe}} \tag{7.3.2}$$
Using Fig. 7–6, one can intuitively see that as L decreases, Cd increases. Recall
that the capacitance increases when the two electrodes are closer to each other.
That intuition is correct for the two-dimensional geometry of Fig. 7–6, too.
However, the solution of Poisson’s equation (Section 4.1.3) indicates that Cd is an
exponential function of L in this two-dimensional structure [5]. Therefore,
$$V_t = V_{t\text{-}long} - (V_{ds} + 0.4\ \mathrm{V}) \cdot e^{-L/l_d} \tag{7.3.3}$$
where
$$l_d \propto \sqrt[3]{T_{oxe} W_{dep} X_j} \tag{7.3.4}$$

Xj is the drain junction depth. Equation (7.3.3) provides a semi-quantitative model
of the roll-off of Vt as a function of L and Vds. It can serve as a guide for designing
small MOSFETs and understanding new transistor structures. At a very large L, Vt
is equal to Vt-long as expected. The roll-off is an exponential function of L. The
roll-off is also larger at larger Vds, which can be as large as Vdd. The acceptable Ioff
determines the acceptable Vt through Eq. (7.2.8). This in turn determines the
acceptable minimum L through Eq. (7.3.3). The acceptable minimum L is several
times ld. The concept that the drain can lower the source–channel barrier and
reduce Vt is called drain-induced barrier lowering or DIBL. ld may be called the
DIBL characteristic length. In order to support the reduction of L at each new
technology node, ld must be reduced in proportion to L. This means that we must
reduce Tox, Wdep, and/or Xj. In reality, all three are reduced at each node to achieve
the desired reduction in ld. Reducing Tox increases the gate control or Coxe.
Reducing Xj decreases Cd by reducing the size of the drain electrode. Reducing
Wdep also reduces Cd by introducing a ground plane (the neutral region of the
substrate or the bottom of the depletion region) that tends to electrostatically
shield the channel from the drain.

FIGURE 7–6 Schematic two-capacitor network in MOSFET. Cd models the electrostatic
coupling between the channel and the drain. As the channel length is reduced, drain to
“channel” distance is reduced; therefore, Cd increases. (Figure labels: Vgs, Tox, Coxe,
Wdep, Xj, Vds, Cd, P-Sub.)
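Equation (7.3.3) can be inverted to estimate the minimum channel length for a given roll-off budget. The sketch below treats ld as a known input, because Eq. (7.3.4) gives only a proportionality; the value ld = 10 nm, the 50 mV budget, and the function names are hypothetical choices for illustration.

```python
import math


def vt_rolloff(l_nm: float, vt_long: float, vds: float, ld_nm: float) -> float:
    """Eq. (7.3.3): Vt = Vt_long - (Vds + 0.4 V) * exp(-L / ld)."""
    return vt_long - (vds + 0.4) * math.exp(-l_nm / ld_nm)


def min_length_nm(max_drop_v: float, vds: float, ld_nm: float) -> float:
    """Smallest L for which the roll-off (Vt_long - Vt) stays within max_drop_v."""
    return ld_nm * math.log((vds + 0.4) / max_drop_v)


# Hypothetical inputs: ld = 10 nm, Vds = Vdd = 1.0 V, at most 50 mV of roll-off.
LD_NM = 10.0
l_min = min_length_nm(0.05, 1.0, LD_NM)

print(f"L_min = {l_min:.1f} nm, i.e. {l_min / LD_NM:.2f} x ld")
print(f"Vt at L_min = {vt_rolloff(l_min, 0.4, 1.0, LD_NM):.3f} V")
```

With these numbers the minimum L comes out at a little over three times ld, consistent with the statement that the acceptable minimum L is several times ld. Because the dependence is logarithmic in the budget, tightening the allowed roll-off lengthens L_min only slowly, whereas reducing ld shortens it in direct proportion.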
The basic message in Eq. (7.3.4) is that the vertical dimensions in a MOSFET
(Tox, Wdep, Xj) must be reduced in order to support the reduction of the gate length.
As an example, Fig. 7–7 shows that the oxide thickness has been scaled roughly in
proportion to the line width (gate length).
[Figure 7–7 plot: SiO2 thickness (Å, log scale from 10 to 100) versus technology node
(350 nm, 250 nm, 180 nm, 130 nm, 90 nm).]
FIGURE 7–7 In the past, the gate oxide thickness has been scaled roughly in proportion to
the line width.
7.4 ● REDUCING GATE-INSULATOR ELECTRICAL THICKNESS AND TUNNELING LEAKAGE ●
SiO2 has been the preferred gate insulator since silicon MOSFET’s beginning. The
oxide thickness has been reduced over the years from 300 nm for the 10 µm
technology to only 1.2 nm for the 65 nm technology. There are two reasons for the
relentless drive to reduce the oxide thickness. First, a thinner oxide, i.e., a larger Cox
raises Ion and a large Ion raises the circuit speed [see Eq. (6.7.1)]. The second reason
is to control Vt roll-off (and therefore the subthreshold leakage) in the presence of
a shrinking L according to Eqs. (7.3.3) and (7.3.4). One must not underestimate the
importance of the second reason. Figure 7–7 shows that the oxide thickness has
been scaled roughly in proportion to the line width.
Thinner oxide is desirable. What, then, prevents engineers from using
arbitrarily thin gate oxide films? Manufacturing thin oxide is not easy, but as
Fig. 6–5 illustrates, it is possible to grow very thin and uniform gate oxide films with
high yield. Oxide breakdown is another limiting factor. If the oxide is too thin, the
electric field in the oxide can be so high as to cause destructive breakdown. (See the
sidebar, “SiO2 Breakdown Electric Field.”) Yet another limiting factor is that long-
term operation at high field, especially at elevated chip operating temperatures,
breaks the weaker chemical bonds at the Si–SiO2 interface thus creating oxide
charge and Vt shift (see Section 5.7). Vt shifts cause circuit behaviors to change and
raise reliability concerns.
For SiO2 films thinner than 1.5 nm, tunneling leakage current becomes the
most serious limiting factor. Figure 7–8a illustrates the phenomenon of gate leakage
by tunneling (see Section 4.20). Electrons arrive at the gate oxide barrier at thermal
velocity and emerge on the side of the gate with a probability given by Eq. (4.20.1).
This is the cause of the gate leakage current. Figure 7–8b shows that the exponential
rise of the SiO2 leakage current with decreasing thickness agrees with the tunneling
model prediction [6]. At 1.2 nm, SiO2 leaks 10³ A/cm². If an IC chip contains
1 mm² total area of this thin dielectric, the chip oxide leakage current would be 10 A.
This large leakage would drain the battery of a cell phone in minutes. The leakage
current can be reduced by about 10× with the addition of nitrogen into SiO2.

[Figure 7–8 panel (b) plot: gate current density (A/cm², log scale) versus equivalent oxide
thickness (nm, 0.5 to 3.5) at inversion bias |VG| = 1.0 V, comparing the direct tunneling
model with experimental data for SiO2 and HfO2.]
FIGURE 7–8 (a) Energy band diagram in inversion showing electron tunneling path through
the gate oxide; (b) 1.2 nm SiO2 conducts 10³ A/cm² of leakage current. High-k dielectric such
as HfO2 allows several orders lower leakage current to pass. (After [6]. © 2003 IEEE.)
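The chip-level leakage figure is a current density multiplied by an area, and the only trap is the unit conversion from mm² to cm². A short sketch of that arithmetic follows; the function name is illustrative, and the tenfold reduction for nitrided oxide is applied as the round factor quoted in the text.

```python
def chip_gate_leakage_a(j_a_per_cm2: float, area_mm2: float) -> float:
    """Gate leakage current in A for a current density in A/cm^2 and an oxide area in mm^2."""
    area_cm2 = area_mm2 * 1e-2  # 1 mm^2 = 0.01 cm^2
    return j_a_per_cm2 * area_cm2


# Values from the text: 1.2 nm SiO2 leaks about 1e3 A/cm^2; 1 mm^2 of gate oxide.
sio2_leak = chip_gate_leakage_a(1e3, 1.0)
nitrided_leak = sio2_leak / 10.0  # "about 10x" reduction with nitrogen added

print(f"SiO2: {sio2_leak:.1f} A, nitrided: {nitrided_leak:.1f} A")
```

Even after the tenfold reduction the leakage is still on the order of 1 A, which motivates the move to high-k dielectrics described next.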
Engineers have developed high-k dielectric technology to replace SiO2. For
example, HfO2 has a relative dielectric constant (k) of ~24, six times larger than
that of SiO2. A 6 nm thick HfO2 film is equivalent to 1 nm thick SiO2 in the sense
that both films produce the same Cox. We say that this HfO2 film has an equivalent
oxide thickness or EOT of 1 nm. However, the HfO2 film presents a much thicker
(albeit lower) tunneling barrier to the electrons and holes. The consequence is that
the leakage current through HfO2 is several orders of magnitude smaller than that
through SiO2 as shown in Fig. 7–8b. Other attractive high-k dielectrics include ZrO2
and Al2O3. The difficulties of adopting high-k dielectrics in IC manufacturing are
chemical reactions between them and the silicon substrate, lower surface mobility
than the Si–SiO2 system, and more oxide charge. These problems are minimized by
inserting a thin SiO2 interfacial layer between the silicon substrate and the high-k
dielectric.
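The equivalent oxide thickness follows from requiring equal capacitance per unit area, so a film of thickness t and dielectric constant k has EOT = t × k_SiO2 / k. The sketch below assumes k_SiO2 = 3.9, a standard value that this section does not state explicitly, and a hypothetical 0.5 nm interfacial layer; the function names are illustrative.

```python
K_SIO2 = 3.9  # relative dielectric constant of SiO2 (assumed standard value)


def eot_nm(thickness_nm: float, k: float) -> float:
    """Equivalent oxide thickness of a single film with relative dielectric constant k."""
    return thickness_nm * K_SIO2 / k


def stack_eot_nm(interfacial_sio2_nm: float, highk_nm: float, k: float) -> float:
    """EOT of an SiO2 interfacial layer in series with a high-k film (series capacitors add as thicknesses)."""
    return interfacial_sio2_nm + eot_nm(highk_nm, k)


# Text example: 6 nm of HfO2 with k ~ 24.
print(f"EOT of 6 nm HfO2 = {eot_nm(6.0, 24.0):.3f} nm")
# Hypothetical stack: 0.5 nm interfacial SiO2 plus 3 nm HfO2.
print(f"EOT of stack = {stack_eot_nm(0.5, 3.0, 24.0):.4f} nm")
```

The 6 nm HfO2 film gives an EOT just under 1 nm, matching the text. The stack calculation shows the cost of the interfacial layer: its thickness counts in full toward the EOT, so it has to be kept very thin.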
Note that Eq. (7.3.4) contains the electrical oxide thickness, Toxe, defined in
Eq. (5.9.2). Besides Tox or EOT, the poly-Si gate depletion layer thickness also needs
to be minimized. Metal is a much better gate material in this respect. NFET and
PFET gates may require two different metals (with metal work functions close to
those of N+ and P+ poly-Si) in order to achieve the optimal Vts [7].
In addition, Tinv is also part of Toxe and needs to be minimized. The material
parameter that determines Tinv is the electron or hole effective mass. A larger
effective mass leads to a thinner Tinv. Unfortunately, a larger effective mass leads
to a lower mobility, too (see Eq. (2.2.4)). Fortunately, the effective mass is a
function of the spatial direction in a crystal. The effective mass in the direction
normal to the oxide interface determines Tinv, while the effective mass in the
direction of the current flow determines the surface mobility. It may be possible to
build a transistor with a wafer orientation (see Fig. 1–2) that offers larger mn and
mp normal to the oxide interface but smaller mn and mp in the direction of the
current flow.
● SiO2 Breakdown Electric Field ●

What is the breakdown field of SiO2? There is no one simple answer because the
breakdown field is a function of the test time. If a one-second (1 s) voltage pulse is
applied to a 10 nm SiO2 film, 15 V is needed to break down the film for a breakdown
field of 15 MV/cm. The breakdown field is significantly lower if the same oxide is
tested for one hour. The field is lower still if it is tested for a month. This
phenomenon is called time-dependent dielectric breakdown. Most IC applications
require a device lifetime of several years to over 10 years. Clearly, manufacturers
cannot afford the time to actually measure the 10-year breakdown voltage for new
oxide technologies. Instead, engineers predict the 10-year breakdown voltage based
on hours- to month-long tests in combination with theoretical models of the physics
of oxide breakdown. A wide range of breakdown fields was predicted for SiO2 by
different models. In retrospect, the most optimistic of the predictions, 7 MV/cm for a
10 year operation, was basically right.
This breakdown model considers a sequence of events [8]. Carrier tunneling
through the oxide at high field breaks up the weaker Si–O bonds in SiO2, thus
creating oxide defects. This process progresses more rapidly at those spots in the
oxide sample where the densities of the weaker bonds happen to be statistically high.
When the generated defects reach a critical density at any one spot, breakdown
occurs. In a longer-term stress test, the breakdown field is lower because a lower rate
of defect generation is sufficient to build up the critical defect density over the longer
test time. A fortuitous fact is that the breakdown field increases in very thin oxide.
The charge carriers gain less energy while traversing through a very thin oxide than a
thick oxide film at a given electric field and are less able to create oxide defects.
7.5 ● HOW TO REDUCE Wdep ●
Equation (7.3.4) suggests that a small Wdep helps to control Vt roll-off and enables
the use of a shorter L. Wdep can be reduced by increasing the substrate doping
concentration, Nsub, because Wdep is proportional to $1/\sqrt{N_{sub}}$. However, Eq. (5.4.3),
repeated here,
$$V_t = V_{fb} + \phi_{st} + \frac{\sqrt{q N_{sub}\, 2\varepsilon_s \phi_{st}}}{C_{ox}} \qquad (7.5.1)$$
dictates that, if Vt is not to increase, Nsub must not be increased unless Cox is
increased, i.e., Tox is reduced. Equation (7.5.1) can be rewritten as Eq. (7.5.2) by
eliminating Nsub with Eq. (5.5.1). Clearly, Wdep can only be reduced in proportion
to Tox.
$$V_t = V_{fb} + \phi_{st}\left(1 + \frac{2\varepsilon_s T_{ox}}{\varepsilon_{ox} W_{dep}}\right) \qquad (7.5.2)$$
This fact establishes Tox as the main enabler of L reduction according to Eq. (7.3.4).
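As a numerical check, the sketch below evaluates Vt from the doping form, Eq. (7.5.1), and from the Wdep form, Eq. (7.5.2), and confirms that they agree. It assumes the standard depletion-width expression Wdep = sqrt(2 εs φst / (q Nsub)) for the substitution. The values of Vfb, φst, Nsub, and Tox are illustrative assumptions, and φst is held fixed for simplicity even though it varies weakly with Nsub.

```python
import math

Q = 1.602e-19          # electron charge, C
EPS0 = 8.854e-14       # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS0   # silicon permittivity, F/cm
EPS_OX = 3.9 * EPS0    # SiO2 permittivity, F/cm


def w_dep(n_sub: float, phi_st: float) -> float:
    """Depletion width at threshold in cm (assumed standard expression)."""
    return math.sqrt(2 * EPS_SI * phi_st / (Q * n_sub))


def vt_from_doping(v_fb: float, phi_st: float, n_sub: float, t_ox: float) -> float:
    """Eq. (7.5.1): Vt written in terms of Nsub and Cox."""
    c_ox = EPS_OX / t_ox
    return v_fb + phi_st + math.sqrt(Q * n_sub * 2 * EPS_SI * phi_st) / c_ox


def vt_from_wdep(v_fb: float, phi_st: float, w: float, t_ox: float) -> float:
    """Eq. (7.5.2): Vt written in terms of Tox / Wdep."""
    return v_fb + phi_st * (1 + 2 * EPS_SI * t_ox / (EPS_OX * w))


# Illustrative (assumed) values: Vfb = -1.0 V, phi_st = 0.9 V,
# Nsub = 1e18 cm^-3, Tox = 2 nm = 2e-7 cm
v_fb, phi_st, n_sub, t_ox = -1.0, 0.9, 1e18, 2e-7
w = w_dep(n_sub, phi_st)
print("Wdep (nm):", w * 1e7)
print("Vt, Eq. (7.5.1):", vt_from_doping(v_fb, phi_st, n_sub, t_ox))
print("Vt, Eq. (7.5.2):", vt_from_wdep(v_fb, phi_st, w, t_ox))
# Quadrupling Nsub halves Wdep, as expected from the 1/sqrt(Nsub) dependence
print("Wdep ratio:", w_dep(4 * n_sub, phi_st) / w)
```

With Tox and Wdep scaled by the same factor, `vt_from_wdep` returns the same Vt, which is the proportional-scaling statement made in the text.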
There is another way of reducing Wdep—adopt the steep retrograde doping
profile illustrated in Fig. 6–12. In that case, Wdep is determined by the thickness of
the lightly doped surface layer. It can be shown (see sidebar) that Vt of a MOSFET
with ideal retrograde doping is
$$V_t = V_{fb} + \phi_{st}\left(1 + \frac{\varepsilon_s T_{ox}}{\varepsilon_{ox} T_{rg}}\right) \qquad (7.5.3)$$
where Trg is the thickness of the lightly doped thin layer. Again, Trg in Eq. (7.5.3)
can only be scaled in proportion to Tox if Vt is to be kept constant. However, Trg,
the Wdep of an ideal retrograde device, can be about half the Wdep of a uniformly
doped device [see Eq. (7.5.2)] and yield the same Vt. That is an advantage of the
retrograde doping. Another advantage of retrograde doping is that ionized
impurity scattering (see Section 2.2.2) in the inversion layer is reduced and the
surface mobility can be higher. To produce a sharp retrograde profile with a very
thin lightly doped layer, i.e., a very small Wdep, care must be taken to prevent
dopant diffusion.
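The factor-of-two advantage can be checked by comparing Eq. (7.5.2) with Eq. (7.5.3). The sketch below is a minimal illustration; the thicknesses, Vfb, and φst are assumed values, and `eps_ratio` stands for εs/εox (about 3 for Si and SiO2).

```python
def vt_uniform(v_fb, phi_st, t_ox, w_dep, eps_ratio=3.0):
    """Eq. (7.5.2): uniformly doped body with depletion width w_dep."""
    return v_fb + phi_st * (1 + 2 * eps_ratio * t_ox / w_dep)


def vt_retrograde(v_fb, phi_st, t_ox, t_rg, eps_ratio=3.0):
    """Eq. (7.5.3): ideal retrograde profile with lightly doped layer t_rg."""
    return v_fb + phi_st * (1 + eps_ratio * t_ox / t_rg)


# Assumed example: Tox = 2 nm, uniform Wdep = 30 nm, retrograde Trg = 15 nm
v_fb, phi_st, t_ox = -1.0, 0.9, 2.0   # lengths in nm; only ratios matter
print(vt_uniform(v_fb, phi_st, t_ox, w_dep=30.0))
print(vt_retrograde(v_fb, phi_st, t_ox, t_rg=15.0))
```

Both calls give the same Vt, so the retrograde device reaches that Vt with a depletion region half as thick.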
● Derivation of Eq. (7.5.3) ●

The energy diagram at the threshold condition is shown in Fig. 7–9.
FIGURE 7–9 Energy diagram of a steep-retrograde doped MOSFET at the threshold
condition. (Labels: Trg, Ec, φst, EF, Ev.)
The band bending, φst, is dropped uniformly over Trg, the thickness of the
lightly doped depletion layer, creating an electric field, $\mathcal{E}_s = \phi_{st}/T_{rg}$. Because of the
continuity of the electric flux, the oxide field is $\mathcal{E}_{ox} = \mathcal{E}_s \cdot \varepsilon_s/\varepsilon_{ox}$. Therefore,
$$V_{ox} = T_{ox}\,\mathcal{E}_{ox} = \phi_{st}\,\frac{\varepsilon_s T_{ox}}{\varepsilon_{ox} T_{rg}} \qquad (7.5.4)$$
From Eqs. (5.2.2) and (7.5.4),
$$V_t = V_{fb} + \phi_{st}\left(1 + \frac{\varepsilon_s T_{ox}}{\varepsilon_{ox} T_{rg}}\right) \qquad (7.5.5)$$
Here is an intriguing note about reducing Wdep further. A higher Nsub in
Eq. (7.5.1) (and therefore a smaller Wdep) or a smaller Trg in Eq. (7.5.3) can be used,
even though it produces a larger Vt than desired, if this larger Vt is brought back down
with a body (or well) to source bias voltage, Vbs (see Section 6.4). The required Vbs
is a forward bias across the body–source junction. A forward bias is acceptable, i.e.,
the forward bias current is small, if Vbs is kept below 0.6 V.
● Predicting the Ultimate Low Limit of Channel Length—A Retrospective ●

When the channel length is too small, a MOSFET would have too large an Ioff and it
ceases to be a usable transistor for practical purposes. Assuming that lithography and
etching technologies can produce as small features as one desires, what is the
ultimate low limit of MOSFET channel length?
In the 1970s, the consensus in the semiconductor industry was that the ultimate
lower limit of channel length is 500 nm. In the 1980s, the consensus was 250 nm. In
the 1990s, it was 100 nm. Now it is much smaller. What made the experts
underestimate the channel length scaling potential?
A review of the historical literature reveals that the researchers were
mistaken about how thin the engineers can make the gate oxide in mass
production. In the 1970s, it was thought that ~15 nm would be the limit. In the
1980s, it was 8 nm, and so on. Since the Tox estimate was off, the estimates of the
minimum acceptable Wdep and therefore the minimum L would be off according to
Eq. (7.3.4).
7.6 ● SHALLOW JUNCTION AND METAL SOURCE/DRAIN MOSFET ●
Figure 7–10, first introduced as Fig. 6–24b, shows the cross-sectional view of a typical
drain (and source) junction. Extra process steps are taken to produce the shallow
junction extension between the deep N+ junction and the channel. This shallow
junction is needed because the drain junction depth must be kept small according to
Eq. (7.3.4). In order to keep this junction shallow, only very short annealing at the
lowest necessary temperature is used to activate the dopants and anneal out the
implantation damage in the crystal, in 0.1 s (flash annealing) or 1 µs (laser annealing)
(see Section 3.6). To further reduce dopant diffusion, the doping concentration
in the shallow junction extension is kept much lower than the N+ doping density.
Shallow junction and light doping combine to produce an undesirable parasitic
resistance that reduces the precious Ion. That is a price to pay for suppressing Vt
roll-off and the subthreshold leakage current. Farther away from the channel, as
shown in Fig. 7–10, a deeper N+ junction is used to minimize total parasitic resistance.
The width of the dielectric spacer in Fig. 7–10 should be as small as possible
to minimize the resistance.
7.6.1 MOSFET with Metal Source/Drain

A metal source/drain MOSFET or Schottky source/drain MOSFET shown in
Fig. 7–11a can have very shallow junctions (good for the short-channel effect) and
low series resistance because the silicide is ten times more conductive than N+ or
P+ Si. The only problem is that the Schottky-S/D MOSFET would have a lower Id
than the regular MOSFET if φB is too large to allow easy flow of carriers (electrons
for NFET) from the source into the channel.

FIGURE 7–10 Cross-sectional view of a MOSFET drain junction. The shallow junction
extension next to the channel helps to suppress the Vt roll-off. (Labels: contact, dielectric
spacer, gate, oxide, channel, shallow junction extension, N+ drain, silicide, e.g., NiSi2, TiSi2.)

FIGURE 7–11 (a) Metal source/drain is the ultimate way to reduce the increasingly
important parasitic resistance; (b) energy band diagrams in the off state; (c) in the on state
there may be energy barriers impeding current flow. These barriers do not exist in the
conventional MOSFET (d) and must be minimized. (Panels: (a) metal source, gate, metal
drain, P-body; (b) channel between S and D at Vg = 0; (c) channel between S and D at
Vg > Vt; (d) conventional MOSFET with N+ source and drain at Vg > Vt.)
Figure 7–11b shows the energy band diagram drawn from the source along the
channel interface to the drain. Vds is set to zero for simplicity. The energy diagram is
similar to that of a conventional MOSFET at Vg = 0 in that a potential barrier stops
the electrons in the source from entering the channel and the transistor is off. In the
on state, Fig. 7–11c, channel Ec is pulled down by the gate voltage, but not at the
source/drain edge, where the barrier height is fixed at φB (see Section 4.16). These
barriers do not exist in a conventional MOSFET, as shown in Fig. 7–11d, and they
can degrade Id of the metal S/D MOSFET.
To unleash the full potentials of Schottky S/D MOSFET, a very low- φ B
Schottky junction technology should be used (for NFETs). A thin N+ region can be
added between the metal and the channel. This minimizes the effect of the barriers
on current flow as shown in Fig. 4–46. Attention must be paid to reduce the large
reverse leakage current of a low- φ Bn Schottky drain to body junction [9].
7.7 ● TRADE-OFF BETWEEN Ion AND Ioff AND DESIGN FOR MANUFACTURING ●
Subthreshold Ioff would not be a problem if Vt is set at a very high value. That is not
acceptable because a high Vt would reduce Ion and therefore reduce circuit speed.
Using a larger Vdd can raise Ion, but that is not acceptable either because it would
raise the power consumption, which is already too large for comfort. Decreasing L
can raise Ion but would also reduce Vt and raise Ioff.
QUESTION ● Which, if any, of the following changes lead to both subthreshold
leakage reduction and Ion enhancement? A larger Vt. A larger L. A smaller Vdd.
Figure 7–12 shows a plot of log Ioff vs. Ion of a large number of transistors [2].
The trade-off between the two is clear. Higher Ion goes hand-in-hand with larger
Ioff. The spread in Ion (and Ioff) is due to a combination of unintentional
manufacturing variances in Lg and Vt and intentional difference in the gate length.
Techniques have been developed to address the strong trade-off between Ion
and Ioff, i.e., between speed and standby power consumption.
One technique gives circuit designers two or three (or even more) Vts to
choose from. A large circuit may be designed with only the high-Vt devices first.
Circuit timing simulations are performed to identify those signal paths and circuits
where speed must be tuned up. Intermediate-Vt devices are substituted into them.
Finally, low-Vt devices are substituted into those few circuits that need even more
help with speed. A similar strategy provides multiple Vdd. A higher Vdd is provided
to a small number of circuits that need speed while a lower Vdd is used in the other
circuits. The larger Vdd provides higher speed and/or allows a larger Vt to be used
(to suppress leakage). Yet the dynamic power consumption (see Eq. (6.7.6)) can be
kept low because most of the circuits operate at the lower Vdd.
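The multiple-Vt flow described above can be sketched as a simple per-path selection: start from the high-Vt (lowest leakage) flavor and move to a lower Vt only where the timing target is missed. The path names, delays, and clock period below are made-up illustrative numbers, and a real design flow would run timing analysis on the whole netlist rather than on isolated paths.

```python
# Hypothetical delay (ps) of each signal path when built from high-,
# intermediate-, or low-Vt devices. All numbers are illustrative only.
PATH_DELAYS = {
    "alu_carry": {"high": 130.0, "mid": 105.0, "low": 90.0},
    "decoder":   {"high": 95.0,  "mid": 80.0,  "low": 70.0},
    "cache_tag": {"high": 112.0, "mid": 98.0,  "low": 85.0},
}

# Ordered from lowest leakage (tried first) to highest speed (tried last)
VT_ORDER = ["high", "mid", "low"]


def assign_vt(path_delays, clock_period_ps):
    """Pick, for each path, the highest-Vt flavor that meets the timing target."""
    assignment = {}
    for path, delays in path_delays.items():
        choice = VT_ORDER[-1]  # fall back to the fastest flavor
        for flavor in VT_ORDER:
            if delays[flavor] <= clock_period_ps:
                choice = flavor
                break
        assignment[path] = choice
    return assignment


print(assign_vt(PATH_DELAYS, clock_period_ps=100.0))
```

In this toy case only one path ends up with low-Vt devices, which mirrors the point that the leaky, fast devices are reserved for the few circuits that need them.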
FIGURE 7–12 Log Ioff vs. linear Ion. The spread in Ion (and Ioff) is due to the presence of
several slightly different drawn Lgs and unintentional manufacturing variations in Lg and Vt.
(After [2]. © 2003 IEEE.) (Axes: Ioff (nA/µm), log scale from 1 to 1000, vs. Ion (mA/µm) from 0.9 to 1.5.)
In a large circuit such as a microprocessor, only some circuit blocks need to
operate at high speed at a given time and other circuit blocks operate at lower speed
or are idle. Vt can be set relatively low to produce large Ion so that circuits that need
to operate at high speed can do so. A well bias voltage, Vsb in Eq. (6.4.6), is applied to
the other circuit blocks to raise the Vt and suppress the subthreshold leakage. This
technique requires intelligent control circuits to apply Vsb where and when needed.
This well bias technique also provides a way to compensate for the chip-to-chip
and block-to-block variations in Vt that result from nonuniformity among devices due
to inevitable variations in manufacturing equipment and process. Many techniques at
the border between manufacturing and circuit design can help to ease the problem of
manufacturing variations. These techniques are collectively known as design for
manufacturing or DFM. A major cause of the device variations is the imperfect control
of Lg in the lithography process. Part of the variation is more or less random
in nature. The other part is more or less predictable and is called systematic variation. One
example of the systematic variations is the distortion in photolithography due to the
interference of neighboring patterns of light and darkness. Elaborate mathematical
optical proximity correction or OPC (see Section 3.3) reshapes each pattern in the
photomask to compensate for the effect of the neighboring patterns. Another example
is that the carrier mobility and therefore the current of a MOSFET is changed by the
mechanical stress effect (see Section 7.1.1) created by nearby structures, e.g., shallow
trench isolation or other MOSFETs. Sophisticated simulation tools can analyze the
mechanical strain and predict the Ion based on the neighboring structures and feed the
Ion information to circuit simulators to obtain more accurate simulation results. An
example of random variation is the gate edge roughness or waviness caused by the
graininess of the photoresist and the poly-crystalline Si. Yet another example of random
variation is the random dopant fluctuation phenomenon. The statistical variation of the
number of dopant atoms and their location in small size MOSFET creates significant
variations in the threshold voltage. It requires complex design methodologies to
include the intra-chip and inter-chip random variations in circuit design.
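A rough feel for random dopant fluctuation comes from counting the dopant atoms in the depletion region under the gate. The sketch below assumes that the count follows Poisson statistics, so its relative spread is 1/sqrt(N); that statistical model and the device dimensions are assumptions made for illustration, not values from the text.

```python
import math


def mean_dopant_count(n_sub_cm3, w_nm, l_nm, w_dep_nm):
    """Average number of dopant atoms in the depletion volume W * L * Wdep."""
    volume_cm3 = (w_nm * 1e-7) * (l_nm * 1e-7) * (w_dep_nm * 1e-7)
    return n_sub_cm3 * volume_cm3


def relative_fluctuation(mean_count):
    """Relative standard deviation of a Poisson-distributed count (assumed model)."""
    return 1.0 / math.sqrt(mean_count)


# Assumed small MOSFET: Nsub = 1e18 cm^-3, W = 50 nm, L = 20 nm, Wdep = 20 nm
n = mean_dopant_count(1e18, 50.0, 20.0, 20.0)
print("average dopant atoms:", n)
print("relative fluctuation:", relative_fluctuation(n))
```

With only about twenty dopant atoms under the gate, a spread of roughly 20% in their number is expected, which is why the threshold voltage of small devices varies noticeably from transistor to transistor.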
7.8 ● ULTRA-THIN-BODY SOI AND MULTIGATE MOSFETS ●
There are alternative MOSFET structures that are less susceptible to Vt roll-off
and allow gate length scaling beyond the limit of conventional MOSFET.
Figure 7–6 gives a simple description of the competition between the gate and the
drain over the control of the channel barrier height shown in Fig. 7–5. We want to
maximize the gate-to-channel capacitance and minimize the drain-to-channel
capacitance. To do the former, we reduce Tox as much as possible. To accomplish
the latter, we reduce Wdep and Xj as much as possible. It is increasingly difficult to
make these dimensions smaller. The real situation is even worse. In the
subthreshold region, Tox may be a small part of Toxe in Eq. (7.3.4) because the
inversion-layer thickness, Tinv in Section 5.9, is large. Imagine that Tox could be made
infinitesimally small. This would give the gate a perfect control over the potential
barrier height—but only right at the Si surface. The drain could still have more
control than the gate along other leakage current paths that are some distance
below the Si surface as shown in Fig. 7–13. At this submerged location, the gate is
far away and the gate control is weak. The drain voltage can pull the potential
barrier down and allow leakage current to flow along this submerged path. There
are two transistor structures that can eliminate the leakage paths that are far away
from the gate [10]. One is called the ultra-thin-body MOSFET or UTB MOSFET.
The other is multigate MOSFET. They are presented next.

FIGURE 7–13 The drain could still have more control than the gate along another leakage
current path that is some distance below the Si surface. (Labels: S, D, Cg, Cd, leakage path.)
7.8.1 Ultra-Thin-Body MOSFET and SOI

There are two ways to eliminate these submerged leakage paths. One is to use an ultra-
thin-body structure as shown in Fig. 7–14 [11]. This MOSFET is built in a thin Si film on
an insulator (SiO2). Since the Si film is very thin, perhaps less than 10 nm, no leakage
path is very far from the gate. (The worst-case leakage path is along the bottom of the Si
film.) Therefore, the gate can effectively suppress the leakage. Figure 7–15 shows that
the subthreshold leakage is reduced as the Si film is made thinner. It can be shown that
the thin Si thickness should take the places of Wdep and Xj in Eq. (7.3.4) such that Lg
can be scaled roughly in proportion to TSi, the Si thickness. TSi should be thinner than
about one half of the gate length in order to reap the benefit of the UTB MOSFET
concept to sustain scaling. UTB MOSFETs, like the multigate MOSFETs of the next
section, offer additional device benefits. Because a small ld [Eq. (7.3.4)] can be obtained
without heavy channel doping, carrier mobility is improved. The body effect that is
detrimental to circuit speed (see Section 6.4) is eliminated because the body is fully
depleted and floating and has no fixed voltage. One challenge posed by UTB
MOSFETs is the large source/drain resistance due to their thinness. The solution is to
thicken the source and drain with epitaxial deposition. These raised source/drains are
visible in Figs. 7–14 and 7–15.
FIGURE 7–14 The SEM cross section of UTB device. (After [11]. © 2000 IEEE.)
(Labels: gate, source, drain, SiO2, TSi = 3 nm.)

FIGURE 7–15 The subthreshold leakage is reduced as the Si film (transistor body) is made
thinner. Lg = 15 nm. (After [11]. © 2000 IEEE.) (Axes: drain current Id (A/µm), log scale,
vs. gate voltage Vg (V) from 0 to 1.0; curves for TSi = 7, 5, and 3 nm.)
● SOI-Silicon on Insulator ●
Figure 7–16 shows the steps of making an SOI or silicon-on-insulator wafer [12].
(The conventional wafer is sometimes called bulk silicon wafer for clarity.) Step 1 is
to implant hydrogen into a silicon wafer that has a thin SiO2 film at the surface. The
hydrogen concentration peaks at a distance D below the surface. Step 2 is to place
the first wafer, upside down, over a second plain wafer. The two wafers adhere to
each other by the atomic bonding force. A low temperature annealing causes the
two wafers to fuse together. Step 3 is to apply another annealing step that causes
the implanted hydrogen to coalesce and form a large number of tiny hydrogen
bubbles at depth D. This creates sufficient mechanical stress to break the wafer at
that plane. The final step, Step 4, is to polish the surface. Now the SOI wafer is
ready for use.
The Si film is of high quality and suitable for IC manufacturing. Even without
using an ultra-thin body, SOI provides a speed advantage because the source/drain to
body junction capacitance is practically eliminated as the source and drain diffusion
regions extend vertically to the buried oxide. The cost of an SOI wafer is higher than
an ordinary Si wafer and increases the cost of IC chips. For this reason, only some
microprocessors, which command high prices and compete on speed, have employed
this technology so far. Figure 7–17 shows the cross-sectional SEMs of an SOI
product. SOI also finds other compelling applications because it offers extra
flexibility for making novel structures such as the ultra-thin-body MOSFET and
some multigate MOSFET structures that can be scaled to smaller gate length beyond
the capability of bulk MOSFETs.
FIGURE 7–16 Steps of making an SOI wafer. (After [12].) (Labels: wafer A, wafer B,
H ions, Steps 1 to 4, Si bulk (new A or new B), SOI wafer.)

FIGURE 7–17 The cross-sectional electron micrograph of an SOI integrated circuit. The lower
level structures are transistors and contacts. The upper two levels are the vias and the
interconnects, which employ multiple layers of materials to achieve better reliability and etch stops.
(Labels: Si, buried oxide, silicon substrate.)
7.8.2 FinFET - Multigate MOSFET

The second way of eliminating deep submerged leakage paths is to provide gate
control from more than one side of the channel as shown in Fig. 7–18. The Si film is
very thin so that no leakage path is far from one of the gates. (The worst-case path
is along the center of the Si film.) Therefore, the gate(s) can suppress leakage
current more effectively than the conventional MOSFET. Because there is more
than one gate, the structure may be called a multigate MOSFET. The structure shown
in Fig. 7–18 is a double-gate MOSFET. Shrinking TSi automatically reduces Wdep
and Xj in Eq. (7.3.4) and Vt roll-off can be suppressed to allow Lg to shrink to as
small as a few nm. Because the top and bottom gates are at the same voltage and
the Si film is fully depleted, the Si surface potential moves up and down with Vg mV
for mV in the subthreshold region. The voltage divider effect illustrated in Fig. 7–1c
does not exist and η in Eq. (7.2.4) is the desired unity and Ioff is very low. There is
no need for heavy doping in the channel to reduce Wdep. This leads to low vertical
field and less impurity scattering; as a result the mobility is higher (see Section 6.3).
Finally, there are two channels (top and bottom) to conduct the transistor current.
For these reasons, a multigate MOSFET can have shorter Lg, lower Ioff, and larger
Ion than a single-gate MOSFET. But there is one problem: how to fabricate the
multigate MOSFET structure.

FIGURE 7–18 A schematic sketch of a double-gate MOSFET with gates connected.
(Labels: gates, Vg, source, Si, drain, TSi, Tox.)
There is a multigate structure that is attractive for its simplicity of
fabrication and it is illustrated in Fig. 7–19. Consider the center structure in
Fig. 7–19. The process starts with an SOI wafer or a bulk Si wafer. A thin fin of Si
is created by lithography and etching. Gate oxide is grown over the exposed
surfaces of the fin. Poly-Si gate material is deposited over the fin and the gate is
patterned by lithography and etching. Finally, source/drain implantation is
performed. The final structure in Fig. 7–19 is basically the multigate structure in
Fig. 7–18 turned on its side. This structure is called the FinFET because its Si body
resembles the back fin of a fish [13]. The channel consists of the two vertical
surfaces and the top surface of the fin. The channel width, W, is the sum of twice
the fin height and the width of the fin.

FIGURE 7–19 Variations of FinFET. Tall FinFET has the advantage of providing a large W
and therefore large Ion while occupying a small footprint. Short FinFET has the advantage of
less challenging lithography and etching. Nanowire FET gives the gate even more control over
the transistor body by surrounding it. FinFETs can also be fabricated on bulk Si substrates.
(Panels: tall FinFET, short FinFET, nanowire FET; labels: Lg, G, S, D, oxide.)

FIGURE 7–20 Simulated I–V curves of a nanowire MOSFET. R is the nanowire radius. (After [16].)
(Left panel: drain current (A), log scale, vs. gate voltage (V) for R = 12.5 nm and R = 2.5 nm at
Vds = 1 V; right panel: drain current (A) vs. drain voltage (V) for R = 2.5 nm at Vgs = 1, 1.5,
and 2 V; Tox = 1.5 nm, L = 1 µm; 3-D simulation compared with model.)
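The channel-width rule for a FinFET (twice the fin height plus the fin width) is easy to express in code. The sketch below is illustrative; the optional `n_fins` argument for several identical fins in parallel is an assumption added for convenience, and the dimensions are made-up values.

```python
def finfet_width(fin_height_nm: float, fin_width_nm: float, n_fins: int = 1) -> float:
    """Channel width W = 2 * (fin height) + (fin width), per fin.

    n_fins is a hypothetical extension for identical fins wired in parallel.
    """
    return n_fins * (2 * fin_height_nm + fin_width_nm)


# Assumed tall fin: 40 nm high, 10 nm wide
print(finfet_width(40.0, 10.0))      # W of one fin, in nm
# Assumed short fin of the same width: 10 nm high
print(finfet_width(10.0, 10.0))
```

For the same 10 nm fin width, the tall fin provides three times the channel width of the short one, which illustrates why a tall FinFET gives a large Ion in a small footprint.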
Several variations of FinFET are shown in Fig. 7–19 [14,15]. A tall FinFET
has the advantage of providing a large W and therefore large Ion while occupying
a small footprint. A short FinFET has the advantage of less challenging etching.
In this case, the top surface of the fin contributes significantly to the suppression
of Vt roll-off and to leakage control. This structure is also known as a triple-gate
MOSFET. The third variation gives the gate even more control over the Si wire
by surrounding it. It may be called a nanowire FET and its behaviors shown in
Fig. 7–20 can be modeled with the same methods and concepts used to model the
basic MOSFETs. FinFETs with Lg as small as 3 nm have been experimentally
demonstrated. The FinFET structure will allow transistor scaling beyond the scaling
limit of the conventional planar transistor.
7.9 ● OUTPUT CONDUCTANCE ●
Output conductance limits the transistor voltage gain. It has been introduced in
Section 6.13. However, its cause and theory are intimately related to those of Vt
roll-off. Therefore, the present chapter is a fitting place to explain it.
What device design parameters determine the output conductance? Let us
start with Eq. (6.13.1),
$$g_{ds} \equiv \frac{dI_{dsat}}{dV_{ds}} = \frac{dI_{dsat}}{dV_t} \cdot \frac{dV_t}{dV_{ds}} \qquad (7.9.1)$$
Since Ids is a function of Vgs – Vt [see Eq. (6.9.11)], it is obvious that
$$\frac{dI_{dsat}}{dV_t} = -\frac{dI_{dsat}}{dV_{gs}} = -g_{msat} \qquad (7.9.2)$$
The last step is the definition of gmsat given in Eq. (6.6.8). Now, Eq. (7.9.1) can be
evaluated with the help of Eq. (7.3.3).
$$g_{ds} = g_{msat} \times e^{-L/l_d} \qquad (7.9.3)$$
$$\text{Intrinsic voltage gain} = \frac{g_{msat}}{g_{ds}} = e^{L/l_d} \qquad (7.9.4)$$
Intrinsic voltage gain was introduced in Eq. (6.13.5). Equation (7.3.3) states
that increasing Vds would reduce Vt. That is why Ids continues to increase without
saturation. The output conductance is caused by the drain/channel capacitive
coupling, the same mechanism that is responsible for Vt roll-off. This is why gds is
larger in a MOSFET with shorter L. To reduce gds or to increase the intrinsic
voltage gain, we can use a large L and/or reduce ld. Circuit designers routinely use
much larger L than the minimum value allowed for a given technology node when
the circuits require large voltage gains. Reducing ld is the job of device designers
and Eq. (7.3.4) is their guide. Every design change that improves the suppression of
Vt roll-off also suppresses gds and improves the voltage gain.
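To see how strongly L/ld controls the gain, the sketch below evaluates Eq. (7.9.3) and the ratio gmsat/gds, which follows from it as exp(L/ld). The gmsat value and the lengths are assumed for illustration only.

```python
import math


def g_ds(g_msat: float, length: float, l_d: float) -> float:
    """Eq. (7.9.3): output conductance caused by the Vt dependence on Vds."""
    return g_msat * math.exp(-length / l_d)


def intrinsic_gain(length: float, l_d: float) -> float:
    """gmsat / gds obtained by dividing gmsat by Eq. (7.9.3)."""
    return math.exp(length / l_d)


l_d = 10.0            # assumed characteristic length, nm
for length in (20.0, 50.0, 100.0):   # assumed channel lengths, nm
    print(length, g_ds(1e-3, length, l_d), intrinsic_gain(length, l_d))
```

The gain grows exponentially with L/ld, which is consistent with the practice of using a longer-than-minimum L in circuits that need large voltage gain.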
Vt dependence on Vds is the main cause of output conductance in very short
MOSFETs. For larger L and Vds close to Vdsat, another mechanism may be the
dominant contributor to gds—channel length modulation. A voltage, Vds–Vdsat, is
dissipated over a finite (non-zero) distance next to the drain. This distance increases
with increasing Vds. As a result, the effective channel length decreases with increasing
Vds. Ids, which is inversely proportional to L, thus increases without true saturation.
It can be shown that gds, due to the channel length modulation, is approximately
$$g_{ds} = \frac{l_d \cdot I_{dsat}}{L\,(V_{ds} - V_{dsat})} \qquad (7.9.5)$$
where ld is given in Eq. (7.3.4). This component of gds can also be suppressed with
larger L and smaller Tox, Xj, and Wdep .
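A small numerical sketch of Eq. (7.9.5) is given below. The bias point and device values are assumed for illustration, and the function rejects Vds at or below Vdsat because the expression applies only in saturation.

```python
def g_ds_clm(l_d: float, i_dsat: float, length: float, v_ds: float, v_dsat: float) -> float:
    """Eq. (7.9.5): output conductance due to channel length modulation.

    l_d and length must use the same length unit; the result is in siemens
    when i_dsat is in amperes and voltages are in volts.
    """
    if v_ds <= v_dsat:
        raise ValueError("Eq. (7.9.5) applies only for Vds > Vdsat")
    return l_d * i_dsat / (length * (v_ds - v_dsat))


# Assumed values: ld = 10 nm, Idsat = 1 mA, L = 100 nm, Vds - Vdsat = 0.5 V
print(g_ds_clm(10.0, 1e-3, 100.0, v_ds=1.0, v_dsat=0.5))
# Doubling L halves this component of gds
print(g_ds_clm(10.0, 1e-3, 200.0, v_ds=1.0, v_dsat=0.5))
```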
7.10 ● DEVICE AND PROCESS SIMULATION ●
There are commercially available computer simulation suites [17] that solve all the
equations presented in this book with few or no approximations (e.g., Fermi–Dirac
statistics is used rather than Boltzmann approximation). Most of these equations
are solved simultaneously, e.g., Fermi–Dirac probability, incomplete ionization of
dopants, drift and diffusion currents, current continuity equation, and Poisson
equation. Device simulation is an important tool that provides the engineers with
quick feedback about device behaviors. This narrows down the number of variables
that need to be checked with expensive and time-consuming experiments.
Examples of simulation results are shown in Figs. 7–15 and 7–20. Each of the figures
takes from minutes to several hours of simulation time to generate.
Related to device simulation is process simulation. The inputs that a user
provides to the process simulation program are the lithography mask pattern,
implantation dose and energy, temperatures and times for oxide growth and
annealing steps, etc. The process simulator then generates a two- or three-
dimensional structure with all the deposited or grown and etched thin films and
doped regions. This output may be fed into a device simulator together with the
applied voltages and the operating temperature as the input to the device
simulator.
7.11 ● MOSFET COMPACT MODEL FOR CIRCUIT SIMULATION ●
Circuit designers can simulate the operation of circuits containing up to hundreds
of thousands or even more MOSFETs accurately, efficiently, and robustly.
Accuracy must be delivered for DC as well as RF operations, analog as well as digital
circuits, memory as well as processor ICs. In circuit simulations, MOSFETs are
modeled with analytical equations much like the ones introduced in this and the
previous two chapters. More details are included in the model equations than this
textbook can introduce. These models are called compact models to highlight their
computational efficiency in contrast with the device simulators described in
Section 7.10.
It could be said that the compact model (and the layout design rules) is the
link between two halves of the semiconductor industry—technology/manufacturing
on the one side and design/product on the other. A compact model must capture all
the subtle behaviors of the MOSFET over wide ranges of voltage, L, W, and
temperature and present them to the circuit designers in the form of equations.
Some circuit-design methodologies, such as analog circuit design, use circuit
simulations directly. Other design methodologies use cell libraries. A cell library is a
collection of hundreds of small building blocks of circuits that have been carefully
designed and characterized beforehand using circuit simulations.
At one time, nearly every company developed its own compact models. In
1997, an industry standard setting group selected BSIM [18] as the first industry
standard model. If the Ids equation of BSIM is printed out on paper, it will fill several
pages.
Figure 7–21 shows selected comparisons of a compact model and measured
device data to illustrate the accuracy of the compact model [19]. It is also important
for the compact model to accurately model the transistor behaviors for any L and
W that a circuit designer may specify. Figure 7–22 illustrates this capability. Finally,
a good compact model should provide fast simulation times by using simple model
equations. In addition to the IV of N-channel and P-channel transistors, the model
also includes capacitance models, gate dielectric leakage current model, and source
and drain junction diode model. Noise and high-frequency models are usually
provided, too.

FIGURE 7–21 Selected comparisons of BSIM and measured device data to illustrate the accuracy of a
compact model. (After [18].) (Left panel: Id (mA) vs. Vd (V) for several Vgs values, W/L = 10.0/0.4,
T = 27 °C; right panel: log Id (A) vs. Vg (V) for several Vbs values, W/L = 20.0/0.4, T = 27 °C,
Vd = 0.05 V; lines: model, symbols: data.)

FIGURE 7–22 A compact model needs to accurately model the transistor behaviors for any L and W that
circuit designers may specify. (After [19]. © 1997 IEEE.) (Left panel: Vth (V) vs. L (µm) for several
Vbs values; right panel: Idsat (mA) vs. L (µm) for several Vgs values; W = 20 µm, Tox = 9 nm.)
7.12 ● CHAPTER SUMMARY ●
To reduce cost and improve speed in order to open up new applications, transistors
and interconnects are downsized periodically. Very small MOSFETs are prone to
have excessive leakage current called Ioff. The basic component of Ioff is the
subthreshold current
$$I_{off}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot e^{-qV_t/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{-V_t/S} \qquad (7.2.8)$$
S is the subthreshold swing. To keep Ioff below a given level, there is a minimum
acceptable Vt. Unfortunately, a larger Vt is deleterious to Ion and speed.
Therefore, it is important to reduce S by reducing the ratio Toxe/Wdep. Furthermore,
Vt decreases with L, a fact known as Vt roll-off, caused by DIBL.
$$V_t = V_{t\text{-}long} - (V_{ds} + 0.4\,\mathrm{V}) \cdot e^{-L/l_d} \qquad (7.3.3)$$
where
$$l_d \propto \sqrt[3]{T_{oxe} W_{dep} X_j} \qquad (7.3.4)$$

Since Vt is a sensitive function of L, even the small (a few nm) manufacturing variations
in L can cause problematic variations in Vt, Ioff, and Ion. To allow L reduction,
Eq. (7.3.3) states that ld must be reduced, i.e., Toxe, Wdep, and/or Xj must be reduced.
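The sensitivity described here can be illustrated by chaining Eq. (7.3.3) into Eq. (7.2.8): a shorter L lowers Vt, and the lower Vt raises Ioff exponentially. In the sketch below, ld is taken as a given number (Eq. (7.3.4) only fixes a proportionality), and all device values are assumed for illustration.

```python
import math


def vt_rolloff(vt_long: float, v_ds: float, length: float, l_d: float) -> float:
    """Eq. (7.3.3): Vt of a short device; length and l_d in the same unit."""
    return vt_long - (v_ds + 0.4) * math.exp(-length / l_d)


def i_off_na(w: float, length: float, vt: float, swing: float) -> float:
    """Eq. (7.2.8): Ioff in nA; swing S in V/decade, w and length in the same unit."""
    return 100.0 * (w / length) * 10 ** (-vt / swing)


# Assumed values: Vt-long = 0.4 V, Vds = 1.0 V, ld = 10 nm, S = 0.1 V/decade,
# W = 1000 nm, and two channel lengths that differ by 5 nm
for length in (50.0, 45.0):
    vt = vt_rolloff(0.4, 1.0, length, 10.0)
    print(length, round(vt, 4), i_off_na(1000.0, length, vt, 0.1))
```

In this example a 5 nm reduction in L lowers Vt by several millivolts and raises Ioff by more than the change in W/L alone would, which is the variation problem the summary points to.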
Tox reduction is limited mostly by gate tunneling leakage, which can be
suppressed by replacing SiO2 with a high-k dielectric such as HfO2. Metal gate can
reduce Toxe by eliminating the poly-Si gate depletion effect.
Wdep can be reduced with retrograde body doping. Xj can be reduced with ms
flash annealing or the metal source–drain MOSFET structure. Xj and Wdep can also
be reduced with the ultra-thin-body SOI device structure or the multigate
MOSFET structure. More importantly, these new structures eliminate the more
vulnerable leakage paths, which are the farthest from the gate.
Equation (7.3.3) also provides a theory for output conductance of the short
channel transistors.
$$g_{ds} = g_{msat} \times e^{-L/l_d} \qquad (7.9.3)$$
● PROBLEMS ●
● Subthreshold Leakage Current ●
7.1 Assume that the gate oxide between an n+ poly-Si gate and the p-substrate is 11 Å
thick and Na = 1E18 cm–3.
(a) What is the Vt of this device?
(b) What is the subthreshold swing, S?
(c) What is the maximum leakage current if W = 1 µm, L = 18 nm? (Assume Ids =
100 W/L (nA) at Vg = Vt.)
● Field Oxide Leakage ●

7.2 Assume the field oxide between an n+ poly-Si wire and the p-substrate is 0.3 µm thick
and that Na = 5E17 cm–3.
(a) What is the Vt of this field oxide device?
(b) What is the subthreshold swing, S?
(c) What is the maximum field leakage current if W = 10 µm, L = 0.3 µm, and
Vdd = 2.0 V?
● Vt Roll-off ●
7.3 Qualitatively sketch log(Ids) vs. Vg (assume Vds = Vdd) for the following:
(a) L = 0.2 µm, Na = 1E15 cm–3.
(b) L = 0.2 µm, Na = 1E17 cm–3.
(c) L = 1 µm, Na = 1E15 cm–3.

(d) L = 1 µm, Na = 1E17 cm–3.
Please pay attention to the positions of the curves relative to each other and label all
curves.
● Trade-off between Ioff and Ion ●
7.4 Does each of the following changes increase or decrease Ioff and Ion? A larger Vt. A
larger L. A shallower junction. A smaller Vdd. A smaller Tox. Which of these changes
contribute to leakage reduction without reducing the precious Ion?
7.5 There is a lot of concern that we will soon be unable to extend Moore’s Law. In your
own words, explain this concern and the difficulties of achieving high Ion and low Ioff.
(a) Answer this question in one paragraph of fewer than 50 words.
(b) Support your description in (a) with three hand-drawn sketches of your choice.
(c) Why is it not possible to maximize Ion and minimize Ioff by simply picking the right
values of Tox, Xj, and Wdep? Please explain in your own words.
(d) Provide three equations that help to quantify the issues discussed in (c).
7.6 (a) Rewrite Eq. (7.3.4) in a form that does not contain Wdep but contains Vt. Do so by
using Eqs. (5.5.1) and (5.4.3) assuming that Vt is given.
(b) Based on the answer to (a), state what actions can be taken to reduce the
minimum acceptable channel length.
7.7 (a) What is the advantage of having a small Wdep?
(b) For given L and Vt, what is the impact of reducing Wdep on Idsat and gate
delay? (Hint: consider the “m” in Chapter 6.)
Discussion: Overall, smaller Wdep is desirable because it is more important to be able
to suppress Vt roll-off so that L can be scaled.
● MOSFET with Ideal Retrograde Doping Profile ●
7.8 Assume an N-channel MOSFET with an N+ poly gate and a substrate with an idealized
retrograde substrate doping profile as shown in Fig. 7–23.
FIGURE 7–23 Idealized retrograde doping profile: Nsub versus depth x through the gate, the oxide (Tox), a very lightly doped P-type layer (Xrg), and the P+ substrate.
(a) Draw the energy band diagram of the MOSFET along the x direction from the
gate through the oxide and the substrate, when the gate is biased at threshold
voltage. (Hint: Since the P region is very lightly doped, you may assume that the
electric field in this region is constant, or $d\mathcal{E}/dx = 0$.) Assume that the Fermi level in the P+
region coincides with Ev and the Fermi level in the N+ gate coincides with Ec.
Remember to label Ec, Ev, and EF.
(b) Find an expression for Vt of this ideal retrograde device in terms of Vox. Assume
Vox is known. (Hint: Use the diagram from (a) and remember that Vt is the
difference between the Fermi levels in the gate and in the substrate. At threshold,
Ec of Si coincides with the Fermi level at the Si–SiO2 interface).
(c) Now write an expression for Vt in terms of Xrg, Tox, εox, εsi and any other common
parameters you see fit, but not in terms of Vox. Hint: Remember Nsub in the lightly
doped region is almost 0, so if your answer is in terms of Nsub, you might want to
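rethink your strategy. Maybe $\varepsilon_{ox}\mathcal{E}_{ox} = \varepsilon_{si}\mathcal{E}_{si}$ (permittivity times electric field on each
side of the Si–SiO2 interface) could be a starting point.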
(d) Show that the depletion layer width, Wdep in an ideal retrograde MOSFET can be
about half the Wdep of a uniformly doped device and still yield the same Vt.
(e) What is the advantage of having a small Wdep?
(f) For given L and Vt, what is the impact of reducing Wdep on Idsat and inverter
delay?
● REFERENCES ●
1. International Technology Roadmap for Semiconductors (http://public.itrs.net/)
2. Ghani, T., et al. “A 90 nm High Volume Manufacturing Logic Technology Featuring Novel
45 nm Gate Length Strained Silicon CMOS Transistors.” IEDM Technical Digest. 2003,
978–980.
3. Yeo, Y-C., et al. “Enhanced Performance in Sub-100nm CMOSFETs Using Strained
Epitaxial Si-Ge.” IEDM Technical Digest. 2000, 753–756.
4. Liu, Z. H., et al. “Threshold Voltage Model for Deep-Submicrometer MOSFETs.” IEEE
Trans. on Electron Devices. 40, 1 (January 1993), 86–95.
5. Wann, C. H., et al. “A Comparative Study of Advanced MOSFET Concepts.” IEEE
Transactions on Electron Devices. 43, 10 (October 1996), 1742–1753.
6. Yeo, Yee-Chia, et al. “MOSFET Gate Leakage Modeling and Selection Guide for
Alternative Gate Dielectrics Based on Leakage Considerations.” IEEE Transactions on
Electron Devices. 50, 4 (April 2003), 1027–1035.
7. Lu, Q., et al. “Dual-Metal Gate Technology for Deep-Submicron CMOS Transistor,” Symp.
on VLSI Technology Digest of Technical Papers, 2000, 72–73.
8. Chen, I. C., et al. “Electrical Breakdown in Thin Gate and Tunneling Oxides.” IEEE Trans.
on Electron Devices. ED-32 (February 1985), 413–422.
9. Kedzierski, J., et al. “Complementary Silicide Source/Drain Thin-Body MOSFETs for the
20 nm Gate Length Regime.” IEDM Technical Digest, 2000, 57–60.
10. Hu, C. “Scaling CMOS Devices Through Alternative Structures,” Science in China
(Series F). February 2001, 44 (1) 1–7.
11. Choi, Y-K., et al. “Ultrathin-body SOI MOSFET for Deep-sub-tenth Micron Era,” IEEE
Electron Device Letters. 21, 5 (May 2000), 254–255.
12. Celler, George, and Michael Wolf. “Smart Cut™ A Guide to the Technology, the Process,
the Products,” SOITEC. July 2003.
13. Huang, X., et al. “Sub 50-nm FinFET: PMOS.” IEDM Technical Digest, (1999), 67–70.
14. Yang, F-L, et al. “25 nm CMOS Omega FETs.” IEDM Technical Digest. (1999), 255–258.
15. Yang, F-L, et al. “5 nm-Gate Nanowire FinFET.” VLSI Technology, 2004. Digest of
Technical Papers, 196–197.
16. Lin, C-H., et al. “Corner Effect Model for Compact Modeling of Multi-Gate MOSFETs.”
2005 SRC TECHCON.
17. Taurus Process, Synopsys TCAD Manual, Synopsys Inc., Mountain View, CA.
18. http://www-device.eecs.berkeley.edu/~bsim3/bsim4.html
19. Cheng, Y., et al. “A Physical and Scalable I-V Model in BSIM3v3 for Analog/Digital Circuit
Simulation.” IEEE Trans. on Electron Devices. 44, 2, (February 1997), 277–287.
● GENERAL REFERENCES ●
1. Taur, Y., and T. H. Ning. Fundamentals of Modern VLSI Devices. Cambridge, UK:
Cambridge University Press, 1998.
2. Wolf, S. VLSI Devices. Sunset Beach, CA: Lattice Press, 1999.
