Chenming-Hu ch7
7
MOSFETs in ICs—Scaling, Leakage,
and Other Topics
CHAPTER OBJECTIVES
How the MOSFET gate length might continue to be reduced is the subject of this chapter.
One important topic is the off-state current or the leakage current of the MOSFETs.
This topic complements the discourse on the on-state current conducted in the previous
chapter. The major topics covered here are the subthreshold leakage and its impact
on device size reduction, the trade-off between Ion and Ioff and the effects on circuit
design. Special emphasis is placed on the understanding of the opportunities for future
MOSFET scaling including mobility enhancement, high-k dielectric and metal gate, SOI,
multigate MOSFET, metal source/drain, etc. Device simulation and MOSFET compact
model for circuit simulation are also introduced.
Metal–oxide–semiconductor (MOS) integrated circuits (ICs) have met the
world’s growing needs for electronic devices for computing,
communication, entertainment, automotive, and other applications with
continual improvements in cost, speed, and power consumption. These
improvements in turn stimulated and enabled new applications and greatly
improved the quality of life and productivity worldwide.
In the forty-five years since 1965, the price of one bit of semiconductor memory has
dropped 100 million times. The cost of a logic gate has undergone a similarly
dramatic drop. This rapid price drop has stimulated new applications and
semiconductor technology has improved the ways people carry out just about all
human endeavors. The primary engine that powered the proliferation of electronics
is “miniaturization.” By making the transistors and the interconnects smaller, more
circuits can be fabricated on each silicon wafer and therefore each circuit becomes
cheaper. Miniaturization has also been instrumental to the improvements in speed
and power consumption of ICs.
Hu_ch07v3.fm Page 260 Friday, February 13, 2009 4:55 PM
Besides the line width, some other parameters are also reduced with scaling
such as the MOSFET gate oxide thickness and the power supply voltage. The
reductions are chosen such that the transistor current density (Ion /W) increases
with each new node. Also, the smaller transistors and shorter interconnects lead to
smaller capacitances. Together, these changes cause the circuit delays to drop
(Eq. 6.7.1). Historically, IC speed has increased roughly 30% at each new
technology node. Higher speed enables new applications such as wide-band data
transmission via RF mobile phones.
Scaling does another good thing. Eq. (6.7.6) shows that reducing
capacitance and especially the power supply voltage is effective in lowering the
power consumption. Thanks to the reduction in C and Vdd, power consumption
per chip has increased only modestly per node in spite of the rise in switching
frequency, f and the doubling of transistor count per chip at each technology
node. If there had been no scaling, doing the job of a single PC microprocessor
chip (operating a billion transistors at 2 GHz) using 1970 technology would
require the power output of an electrical power generation plant.
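The power argument can be made concrete with a small sketch. It assumes that Eq. (6.7.6) has the usual dynamic-power form, with power proportional to C·Vdd²·f per switching node; that equation is not restated in this chapter. The 30% frequency gain and the doubling of transistor count per node come from the text, while the capacitance and Vdd ratios are hypothetical placeholders chosen only for illustration.

```python
def relative_chip_power(c_ratio, vdd_ratio, f_ratio, count_ratio):
    """Chip dynamic power after one node, relative to the previous node.

    Assumes P ~ (transistor count) * C * Vdd**2 * f, the assumed form of Eq. (6.7.6).
    """
    return count_ratio * c_ratio * vdd_ratio ** 2 * f_ratio


# f_ratio and count_ratio follow the text; c_ratio and vdd_ratio are hypothetical.
with_scaling = relative_chip_power(c_ratio=0.7, vdd_ratio=0.85, f_ratio=1.3, count_ratio=2.0)
# Same speed and transistor count, but with C and Vdd left unscaled.
without_scaling = relative_chip_power(c_ratio=1.0, vdd_ratio=1.0, f_ratio=1.3, count_ratio=2.0)

print(f"power ratio per node with C and Vdd scaling:    {with_scaling:.2f}")
print(f"power ratio per node without C and Vdd scaling: {without_scaling:.2f}")
```

Under these assumed ratios, the reductions in C and Vdd offset much of the growth that frequency and transistor count would otherwise cause.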
In summary, scaling improves cost, speed, and power consumption per
function with every new technology generation. All of these attributes have been
improved by 10 to 100 million times in four decades—an engineering achievement
unmatched in human history! When it comes to ICs, small is beautiful.
TABLE 7–1 • Scaling from 90 nm to 22 nm and innovations that enable the scaling.
HP: High-Performance technology. LSTP: Low Standby Power technology for portable applications.
EOTe: Equivalent electrical Oxide Thickness, i.e., equivalent Toxe. Ion: NFET Ion.
[Figure 7–1 drawing labels: gate; both trenches filled with epitaxial SiGe; N-type Si]
FIGURE 7–1 Example of strained-silicon MOSFET. Hole mobility can be raised with a
compressive mechanical strain illustrated with the arrows pushing on the channel region.
At the 32 nm node, wet lithography (see Section 3.3.1) is used to print the fine
patterns. At the 22 nm node, new transistor structures may be used to reverse the
trend of increasing Ioff, which is the source of a serious power consumption issue.
Some new structures are presented in Section 7.8.
Circuit speed improves with increasing Ion; therefore, it would be desirable to use a
small Vt. Can we set Vt at an arbitrarily small value, say 10 mV? The answer is no.
At Vgs < Vt, an N-channel MOSFET is in the off state. However, a leakage
current can still flow between the drain and the source. The MOSFET current
observed at Vgs < Vt is called the subthreshold current. This is the main contributor
to the MOSFET off-state current, Ioff. Ioff is the Id measured at Vgs = 0 and
Vds = Vdd. It is important to keep Ioff very small in order to minimize the static
power that a circuit consumes when it is in the standby mode. For example, if Ioff is
a modest 100 nA per transistor, a cell-phone chip containing one hundred million
transistors would consume 10 A even in standby. The battery would be drained in
minutes without receiving or transmitting any calls. A desktop PC processor would
dissipate more power because it contains more transistors and face expensive
problems of cooling the chip and the system.
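The standby-current arithmetic in this example can be checked with a few lines of Python. The per-transistor Ioff and the transistor count are the values quoted above; the 1 Ah battery capacity is a hypothetical figure used only to show what "drained in minutes" means.

```python
def standby_current_a(ioff_per_transistor_a, transistor_count):
    """Total off-state current of a chip, assuming every transistor leaks Ioff."""
    return ioff_per_transistor_a * transistor_count


def hours_to_drain(battery_capacity_ah, current_a):
    """Idealized time to empty a battery at a constant current."""
    return battery_capacity_ah / current_a


chip_current = standby_current_a(100e-9, 100_000_000)   # values from the text
minutes = 60.0 * hours_to_drain(1.0, chip_current)       # 1 Ah battery is hypothetical
print(f"standby current: {chip_current:.1f} A, battery life: {minutes:.0f} minutes")
```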
Figure 7–2a shows a subthreshold current plot. It is plotted in a semi-log Ids vs.
Vgs graph. When Vgs is below Vt, Ids is clearly a straight line, i.e., an exponential
function of Vgs.
Figure 7–2b–d explains the subthreshold current. At Vgs below Vt, the
inversion electron concentration (ns) is small but nonetheless can allow a small
leakage current to flow between the source and the drain. In Fig. 7–2b, a larger Vgs
would pull the Ec at the surface closer to EF, causing ns and Ids to rise. From the
equivalent circuit in Fig. 7–2c, one can observe that
$$\frac{d\varphi_s}{dV_{gs}} = \frac{C_{oxe}}{C_{oxe} + C_{dep}} \equiv \frac{1}{\eta} \tag{7.2.1}$$
$$\eta = 1 + \frac{C_{dep}}{C_{oxe}} \tag{7.2.2}$$
Integrating Eq. (7.2.1) yields
$$\varphi_s = \text{constant} + V_g/\eta \tag{7.2.3}$$
Ids is proportional to ns, therefore
$$I_{ds} \propto n_s \propto e^{q\varphi_s/kT} \propto e^{q(\text{constant} + V_g/\eta)/kT} \propto e^{qV_g/\eta kT} \tag{7.2.4}$$
A practical and common definition of Vt is the Vgs at which Ids = 100 nA × W/L
as shown in Fig. 6–12. (Some companies may use 200 nA instead of 100 nA.)
Equation (7.2.4) may be rewritten as
$$I_{ds}(\text{nA}) = 100 \cdot \frac{W}{L} \cdot e^{q(V_{gs} - V_t)/\eta kT} \tag{7.2.5}$$
[Figure 7–2 panels: (a) Id (µA/µm) vs. Vgs on a semi-log scale for PMOS and NMOS; (b) energy band diagram showing Ec, EF, φs, and Vg; (c) capacitor divider formed by Coxe and Cdep; (d) log(Ids) vs. Vgs at Vds = Vdd, marking Vt, Ioff, the 100 nA × W/L level, and the slope 1/S]
FIGURE 7–2 The current that flows at Vgs < Vt is called the subthreshold current. Vt ~ 0.2 V.
The lower/upper curves are for Vds = 50 mV/1.2 V. After Ref. [2]. (b) When Vg is increased,
Ec at the surface is pulled closer to EF, causing ns and Ids to rise; (c) equivalent capacitance
network; (d) subthreshold I-V with Vt and Ioff. Swing, S, is the inverse of the slope in the
subthreshold region.
Clearly, Eq. (7.2.5) agrees with the definition of Vt and Eq. (7.2.4). The
simplicity of Eq. (7.2.5) is another reason for favoring the new Vt definition. At
room temperature, the function exp(qVgs /kT) changes by a factor of 10 for every 60 mV
change in Vgs; therefore, exp(qVgs /ηkT) changes by a factor of 10 for every η × 60 mV.
For example, if η = 1.5, Eq. (7.2.5) states that Ids drops by ten times for every 90 mV
of decrease in Vgs below Vt at room temperature. η × 60 mV is called the
subthreshold swing and is represented by the symbol S.
$$S\,(\text{mV/decade}) = \eta \cdot 60\,\text{mV} \cdot \frac{T}{300\,\text{K}} \tag{7.2.6}$$
$$I_{ds}(\text{nA}) = 100 \cdot \frac{W}{L} \cdot e^{q(V_{gs} - V_t)/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{(V_{gs} - V_t)/S} \tag{7.2.7}$$
$$I_{off}(\text{nA}) = 100 \cdot \frac{W}{L} \cdot e^{-qV_t/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{-V_t/S} \tag{7.2.8}$$
For given W and L, there are two ways to minimize Ioff, as illustrated in
Fig. 7–2 (d). The first is to choose a large Vt. This is not desirable because a large
Vt reduces Ion and therefore degrades the circuit speed (see Eq. (6.7.1)). The
preferable way is to reduce the subthreshold swing. S can be reduced by reducing
η. That can be done by increasing Coxe (see Eq. 7.2.2), i.e., using a thinner Tox,
and by decreasing Cdep, i.e., increasing Wdep.¹ An additional way to reduce S, and
therefore to reduce Ioff, is to operate the transistors at temperatures significantly
below room temperature. This last approach is valid in principle but rarely used
because cooling adds considerable cost.
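The relations in Eqs. (7.2.6) to (7.2.8) are easy to explore numerically. The following is a minimal sketch; the function names and the three Vt values are illustrative choices, not part of the text, and the formulas apply only in the subthreshold region (Vgs below Vt).

```python
def subthreshold_swing_mv(eta, temperature_k=300.0):
    """Eq. (7.2.6): subthreshold swing in mV/decade."""
    return eta * 60.0 * temperature_k / 300.0


def ids_na(vgs_v, vt_v, w_over_l, swing_mv):
    """Eq. (7.2.7): subthreshold current in nA, valid for Vgs below Vt."""
    return 100.0 * w_over_l * 10.0 ** ((vgs_v - vt_v) * 1000.0 / swing_mv)


def ioff_na(vt_v, w_over_l, swing_mv):
    """Eq. (7.2.8): Ids evaluated at Vgs = 0."""
    return ids_na(0.0, vt_v, w_over_l, swing_mv)


swing = subthreshold_swing_mv(eta=1.5)            # 90 mV/decade, as in the text
for vt in (0.18, 0.27, 0.36):                     # hypothetical Vt choices
    print(f"Vt = {vt:.2f} V -> Ioff = {ioff_na(vt, 1.0, swing):.3g} nA per unit W/L")
```

Each 90 mV added to Vt lowers Ioff by one decade at this swing, which is the trade-off against Ion described above.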
Besides the subthreshold leakage, there is another leakage current
component that has become significant. That is the tunnel leakage through very
thin gate oxide that will be presented in Section 7.4. The drain to the body junction
leakage is the third leakage component.
1 According to Eq. 6.5.2 and Eq. 7.2.2, η should be equal to m. In reality, η is larger than m because
Coxe is smaller at low Vgs (subthreshold condition) than in inversion due to a larger Tinv as shown in
Fig. 5–25. Nonetheless, η and m are closely related.
[Figure 7–3 panels (a) and (b): energy band diagrams with EF marked relative to the interface states]
FIGURE 7–3 (a) Most of the interface states are empty because they are above EF. (b) At
another Vg, most of the interface states are filled with electrons. As a result, the interface
charge density changes with Vg.
The previous section pointed out that Vt must not be set too low; otherwise, Ioff
would be too large. The present section extends that analysis to show that the
channel length (L) must not be too short. The reason is this: Vt drops with
decreasing L as illustrated in Fig. 7–4. When Vt drops too much, Ioff becomes too
large and that channel length is not acceptable.
Gate length is the physical length of the gate and can be accurately measured with a
scanning electron microscope (SEM). It is carefully controlled in the fabrication
plant. The channel length, in comparison, cannot be determined very accurately and
easily due to the lateral diffusion of the source and drain junctions. L tracks Lg but
the difference between the two just cannot be quantified precisely in spite of efforts
such as described in Section 6.11. As a result, Lg is widely used in lieu of L in data
presentations as is done in Fig. 7–4. L is still a useful concept and is used in theoretical
equations even though L cannot be measured precisely for small transistors.
[Figure 7–4 plot: Vt roll-off (V) vs. Lg (µm, log scale from 0.01 to 1), with curves for Vds = 50 mV and Vds = 1.0 V]
FIGURE 7–4 |Vt| decreases at very small Lg. This phenomenon is called Vt roll-off. It
determines the minimum acceptable Lg because Ioff is too large when Vt becomes too low or
too sensitive to Lg.
[Figure 7–5 panels (a) to (d): energy band diagrams (Ec, EF) from the N+ source to the N+ drain, with Vds applied]
FIGURE 7–5 a–d: Energy band diagram from source to drain when Vgs = 0 V and Vgs = Vt.
a–b long channel; c–d short channel.
in case (a) and therefore is closer to the Ec in the source. When the channel Ec is
only ~0.2 eV higher than the Ec in the source (which is also ~EFn), ns in the channel
reaches ~10¹⁷ cm⁻³ and the inversion threshold condition (Ids = 100 nA × W/L) is
reached. We may say that a 0.2 eV potential barrier is low enough to allow the
electrons in the N+ source to flow into the channel to form the inversion layer. The
following analogy may be helpful for understanding the concept of the energy
barrier height. The source is a reservoir of water; the potential barrier is a dam; and
Vgs controls the height of the dam. When Vgs is high enough, the dam is sufficiently
low for the water to flow into the channel and the drain. That defines Vt.
Figure 7–5c shows the case of a short-channel device at Vgs = 0. If the channel is
short enough, Ec will not be able to reach the same peak value as in Fig. 7–5a. As a
result, a smaller Vgs is needed in Fig. 7–5d than in Fig. 7–5b to pull the barrier down to
0.2 eV. In other words, Vt is lower in the short channel device than the long channel
device. This explains the Vt roll-off shown in Fig. 7–4.
We can understand Vt roll-off from another approach. Figure 7–6 shows a
capacitor between the gate and the channel. It also shows a second capacitor, Cd ,
between the drain and the channel terminating at around the middle of the channel,
where Ec peaks in Fig. 7–5d. As the channel length is reduced, the drain to source and
the drain to “channel” distances are reduced; therefore, Cd increases. Do not be
concerned with the exact definition or value of Cd. Instead, focus on the concept that
Cd represents the capacitive coupling between the drain and the channel barrier point.
From this two-capacitor equivalent circuit, it is evident that the drain voltage
has a similar effect on the channel potential as the gate voltage. Vgs and Vds,
together, determine the channel potential barrier height shown in Fig. 7–5. When
Vds is present, less Vgs is needed to pull the barrier down to 0.2 eV; therefore, Vt is
lower by definition. This understanding gives us a simple equation for Vt roll-off,
$$V_t = V_{t\text{-}long} - V_{ds} \cdot \frac{C_d}{C_{oxe}} \tag{7.3.1}$$
where Vt-long is the threshold voltage of a long-channel transistor, for which Cd = 0.
More accurately, Vds should be supplemented with a constant that represents the
combined effects of the 0.2 V built-in potentials between the N– inversion layer and
both the N+ drain and source at the threshold condition [4].
$$V_t = V_{t\text{-}long} - (V_{ds} + 0.4\,\text{V}) \cdot \frac{C_d}{C_{oxe}} \tag{7.3.2}$$
Using Fig. 7–6, one can intuitively see that as L decreases, Cd increases. Recall
that the capacitance increases when the two electrodes are closer to each other.
That intuition is correct for the two-dimensional geometry of Fig. 7–6, too.
However, solution of the Poisson’s equation (Section 4.1.3) indicates that Cd is an
exponential function of L in this two-dimensional structure [5]. Therefore,
$$V_t = V_{t\text{-}long} - (V_{ds} + 0.4\,\text{V}) \cdot e^{-L/l_d} \tag{7.3.3}$$
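To see how sharply Eq. (7.3.3) turns on, the short sketch below evaluates it for a few channel lengths. The long-channel Vt and the characteristic length ld are hypothetical values; ld is treated as a free parameter here because its defining relation is not reproduced in this passage.

```python
import math


def vt_with_rolloff(vt_long_v, vds_v, length_nm, ld_nm):
    """Eq. (7.3.3): threshold voltage of a short-channel MOSFET."""
    return vt_long_v - (vds_v + 0.4) * math.exp(-length_nm / ld_nm)


VT_LONG = 0.30   # hypothetical long-channel Vt, in volts
LD_NM = 10.0     # hypothetical characteristic length ld, in nm

for length_nm in (200.0, 100.0, 60.0, 40.0):
    low = vt_with_rolloff(VT_LONG, 0.05, length_nm, LD_NM)
    high = vt_with_rolloff(VT_LONG, 1.0, length_nm, LD_NM)
    print(f"L = {length_nm:5.0f} nm: Vt = {low:.3f} V at Vds = 50 mV, {high:.3f} V at Vds = 1.0 V")
```

With these assumed numbers Vt is essentially flat until L approaches a few times ld, and the roll-off is larger at the higher Vds, which is the qualitative behavior shown in Fig. 7–4.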
[Figure 7–6 drawing labels: Vgs, Tox, Coxe, Wdep, Xj, Vds, Cd, P-substrate]
[Figure 7–7 plot: SiO2 thickness (Å, log scale) vs. technology node (350 nm, 250 nm, 180 nm, 130 nm, 90 nm)]
FIGURE 7–7 In the past, the gate oxide thickness has been scaled roughly in proportion to
the line width.
SiO2 has been the preferred gate insulator since silicon MOSFET’s beginning. The
oxide thickness has been reduced over the years from 300 nm for the 10 µm
technology to only 1.2 nm for the 65 nm technology. There are two reasons for the
relentless drive to reduce the oxide thickness. First, a thinner oxide, i.e., a larger Cox
raises Ion and a large Ion raises the circuit speed [see Eq. (6.7.1)]. The second reason
is to control Vt roll-off (and therefore the subthreshold leakage) in the presence of
a shrinking L according to Eqs. (7.3.3) and (7.3.4). One must not underestimate the
importance of the second reason. Figure 7–7 shows that the oxide thickness has
been scaled roughly in proportion to the line width.
Thinner oxide is desirable. What, then, prevents engineers from using
arbitrarily thin gate oxide films? Manufacturing thin oxide is not easy, but as
Fig. 6–5 illustrates, it is possible to grow very thin and uniform gate oxide films with
high yield. Oxide breakdown is another limiting factor. If the oxide is too thin, the
electric field in the oxide can be so high as to cause destructive breakdown. (See the
sidebar, “SiO2 Breakdown Electric Field.”) Yet another limiting factor is that long-
term operation at high field, especially at elevated chip operating temperatures,
breaks the weaker chemical bonds at the Si–SiO2 interface thus creating oxide
charge and Vt shift (see Section 5.7). Vt shifts cause circuit behaviors to change and
raise reliability concerns.
For SiO2 films thinner than 1.5 nm, tunneling leakage current becomes the
most serious limiting factor. Figure 7–8a illustrates the phenomenon of gate leakage
by tunneling (see Section 4.20). Electrons arrive at the gate oxide barrier at thermal
velocity and emerge on the side of the gate with a probability given by Eq. (4.20.1).
This is the cause of the gate leakage current. Figure 7–8b shows that the exponential
rise of the SiO2 leakage current with decreasing thickness agrees with the tunneling
model prediction [6]. At 1.2 nm, SiO2 leaks 10³ A/cm². If an IC chip contains
1 mm² total area of this thin dielectric, the chip oxide leakage current would be 10 A.
This large leakage would drain the battery of a cell phone in minutes. The leakage
current can be reduced by about 10× with the addition of nitrogen into SiO2.
[Figure 7–8 panels: (a) energy band diagram; (b) gate current density (A/cm², log scale) vs. equivalent oxide thickness (0.5 to 3.5 nm), with the direct tunneling model curve]
FIGURE 7–8 (a) Energy band diagram in inversion showing electron tunneling path through
the gate oxide; (b) 1.2 nm SiO2 conducts 10³ A/cm² of leakage current. High-k dielectric such
as HfO2 allows several orders of magnitude lower leakage current to pass. (After [6]. © 2003 IEEE.)
Engineers have developed high-k dielectric technology to replace SiO2. For
example, HfO2 has a relative dielectric constant (k) of ~24, six times larger than
that of SiO2. A 6 nm thick HfO2 film is equivalent to 1 nm thick SiO2 in the sense
that both films produce the same Cox. We say that this HfO2 film has an equivalent
oxide thickness or EOT of 1 nm. However, the HfO2 film presents a much thicker
(albeit lower) tunneling barrier to the electrons and holes. The consequence is that
the leakage current through HfO2 is several orders of magnitude smaller than that
through SiO2 as shown in Fig. 7–8b. Other attractive high-k dielectrics include ZrO2
and Al2O3. The difficulties of adopting high-k dielectrics in IC manufacturing are
chemical reactions between them and the silicon substrate, lower surface mobility
than the Si–SiO2 system, and more oxide charge. These problems are minimized by
inserting a thin SiO2 interfacial layer between the silicon substrate and the high-k
dielectric.
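The equivalent-oxide-thickness idea and the leakage estimate for thin SiO2 reduce to two one-line formulas, sketched below. The relative dielectric constant of 3.9 for SiO2 is the standard value and is an assumption here, since the text only gives the ratio to HfO2; the function names are illustrative.

```python
K_SIO2 = 3.9  # relative dielectric constant of SiO2 (standard value, not stated in the text)


def eot_nm(physical_thickness_nm, k):
    """Thickness of SiO2 that gives the same capacitance per unit area as the film."""
    return physical_thickness_nm * K_SIO2 / k


def chip_gate_leakage_a(current_density_a_per_cm2, area_mm2):
    """Total gate leakage for a given dielectric area (1 mm^2 = 0.01 cm^2)."""
    return current_density_a_per_cm2 * area_mm2 * 0.01


print(f"EOT of 6 nm HfO2 (k = 24): {eot_nm(6.0, 24.0):.3f} nm")
print(f"Leakage of 1 mm^2 of 1.2 nm SiO2 at 1e3 A/cm^2: {chip_gate_leakage_a(1e3, 1.0):.0f} A")
```

The first result comes out close to 1 nm, consistent with the statement that 6 nm of HfO2 is roughly equivalent to 1 nm of SiO2.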
Note that Eq. (7.3.4) contains the electrical oxide thickness, Toxe, defined in
Eq. (5.9.2). Besides Tox or EOT, the poly-Si gate depletion layer thickness also needs
to be minimized. Metal is a much better gate material in this respect. NFET and
PFET gates may require two different metals (with metal work functions close to
those of N+ and P+ poly-Si) in order to achieve the optimal Vts [7].
In addition, Tinv is also part of Toxe and needs to be minimized. The material
parameter that determines Tinv is the electron or hole effective mass. A larger
effective mass leads to a thinner Tinv. Unfortunately, a larger effective mass leads
to a lower mobility, too (see Eq. (2.2.4)). Fortunately, the effective mass is a
function of the spatial direction in a crystal. The effective mass in the direction
normal to the oxide interface determines Tinv, while the effective mass in the
direction of the current flow determines the surface mobility. It may be possible to
build a transistor with a wafer orientation (see Fig. 1–2) that offers larger mn and
mp normal to the oxide interface but smaller mn and mp in the direction of the
current flow.
Equation (7.3.4) suggests that a small Wdep helps to control Vt roll-off and enables
the use of a shorter L. Wdep can be reduced by increasing the substrate doping
concentration, Nsub, because Wdep is proportional to 1/√Nsub. However, Eq. (5.4.3),
repeated here,
$$V_t = V_{fb} + \phi_{st} + \frac{\sqrt{qN_{sub}\,2\varepsilon_s\phi_{st}}}{C_{ox}} \tag{7.5.1}$$
dictates that, if Vt is not to increase, Nsub must not be increased unless Cox is
increased, i.e., Tox is reduced. Equation (7.5.1) can be rewritten as Eq. (7.5.2) by
eliminating Nsub with Eq. (5.5.1). Clearly, Wdep can only be reduced in proportion
to Tox.
$$V_t = V_{fb} + \phi_{st}\left(1 + \frac{2\varepsilon_s T_{ox}}{\varepsilon_{ox} W_{dep}}\right) \tag{7.5.2}$$
This fact establishes Tox as the main enabler of L reduction according to Eq. (7.3.4).
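The step from Eq. (7.5.1) to Eq. (7.5.2) can be sketched as follows. It assumes that Eq. (5.5.1) is the depletion-width relation written on the first line and that Cox = εox/Tox; neither is restated in this section, so treat both as assumptions.

```latex
\begin{align*}
W_{dep} &= \sqrt{\frac{2\varepsilon_s\phi_{st}}{qN_{sub}}}
  \quad\Longrightarrow\quad qN_{sub} = \frac{2\varepsilon_s\phi_{st}}{W_{dep}^2} \\
\sqrt{qN_{sub}\,2\varepsilon_s\phi_{st}}
  &= \sqrt{\frac{(2\varepsilon_s\phi_{st})^2}{W_{dep}^2}}
   = \frac{2\varepsilon_s\phi_{st}}{W_{dep}} \\
V_t &= V_{fb} + \phi_{st} + \frac{2\varepsilon_s\phi_{st}}{W_{dep}}\cdot\frac{T_{ox}}{\varepsilon_{ox}}
   = V_{fb} + \phi_{st}\left(1 + \frac{2\varepsilon_s T_{ox}}{\varepsilon_{ox} W_{dep}}\right)
\end{align*}
```

Only the ratio Tox/Wdep appears in the result, which is why Wdep cannot shrink faster than Tox at a fixed Vt.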
There is another way of reducing Wdep—adopt the steep retrograde doping
profile illustrated in Fig. 6–12. In that case, Wdep is determined by the thickness of
the lightly doped surface layer. It can be shown (see sidebar) that Vt of a MOSFET
with ideal retrograde doping is
$$V_t = V_{fb} + \phi_{st}\left(1 + \frac{\varepsilon_s T_{ox}}{\varepsilon_{ox} T_{rg}}\right) \tag{7.5.3}$$
where Trg is the thickness of the lightly doped thin layer. Again, Trg in Eq. (7.5.3)
can only be scaled in proportion to Tox if Vt is to be kept constant. However, Trg,
the Wdep of an ideal retrograde device, can be about half the Wdep of a uniformly
doped device [see Eq. (7.5.2)] and yield the same Vt. That is an advantage of the
retrograde doping. Another advantage of retrograde doping is that ionized
impurity scattering (see Section 2.2.2) in the inversion layer is reduced and the
surface mobility can be higher. To produce a sharp retrograde profile with a very
thin lightly doped layer, i.e., a very small Wdep, care must be taken to prevent
dopant diffusion.
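The claim that an ideal retrograde device reaches the same Vt with about half the depletion width follows directly from comparing Eqs. (7.5.2) and (7.5.3). A minimal numerical check is shown below; the permittivity ratio uses the standard values for Si and SiO2, and Vfb, φst, and Tox are hypothetical numbers picked for illustration.

```python
EPS_RATIO = 11.7 / 3.9  # eps_s / eps_ox for Si and SiO2 (standard values, an assumption here)


def vt_uniform(vfb_v, phi_st_v, tox_nm, wdep_nm):
    """Eq. (7.5.2): uniformly doped body."""
    return vfb_v + phi_st_v * (1.0 + 2.0 * EPS_RATIO * tox_nm / wdep_nm)


def vt_retrograde(vfb_v, phi_st_v, tox_nm, trg_nm):
    """Eq. (7.5.3): ideal retrograde doping with lightly doped layer thickness Trg."""
    return vfb_v + phi_st_v * (1.0 + EPS_RATIO * tox_nm / trg_nm)


VFB, PHI_ST, TOX = -1.0, 0.9, 2.0   # hypothetical values (V, V, nm)
print(f"uniform,    Wdep = 30 nm: Vt = {vt_uniform(VFB, PHI_ST, TOX, 30.0):.3f} V")
print(f"retrograde, Trg  = 15 nm: Vt = {vt_retrograde(VFB, PHI_ST, TOX, 15.0):.3f} V")
```

The two cases give the same Vt because the factor of 2 in Eq. (7.5.2) is exactly compensated by halving the depletion width.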
[Figure: energy band diagram (Ec, EF, Ev) with band bending φst across the lightly doped layer of thickness Trg]
The band bending, φst, is dropped uniformly over Trg, the thickness of the
lightly doped depletion layer, creating an electric field, Ᏹs = φst/Trg. Because of the
continuity of the electric flux, the oxide field is Ᏹox = Ᏹs · εs/εox. Therefore,
$$V_{ox} = T_{ox}\mathcal{E}_{ox} = \phi_{st}\frac{\varepsilon_s T_{ox}}{\varepsilon_{ox} T_{rg}} \tag{7.5.4}$$
Figure 7–10, first introduced as Fig. 6–24b, shows the cross-sectional view of a typical
drain (and source) junction. Extra process steps are taken to produce the shallow
junction extension between the deep N+ junction and the channel. This shallow
junction is needed because the drain junction depth must be kept small according to
Eq. (7.3.4). In order to keep this junction shallow, only very short annealing at the
lowest necessary temperature is used to activate the dopants and anneal out the
implantation damage in the crystal, in 0.1 s (flash annealing) or 1 µs (laser annealing)
(see Section 3.6). To further reduce dopant diffusion, the doping concentration
in the shallow junction extension is kept much lower than the N+ doping density.
Shallow junction and light doping combine to produce an undesirable parasitic
resistance that reduces the precious Ion. That is a price to pay for suppressing Vt
roll-off and the subthreshold leakage current. Farther away from the channel, as
shown in Fig. 7–10, a deeper N+ junction is used to minimize total parasitic resistance.
The width of the dielectric spacer in Fig. 7–10 should be as small as possible
to minimize the resistance.
[Figure 7–10 drawing labels: gate, oxide, channel, shallow junction drain extension, silicide (e.g., NiSi2, TiSi2)]
FIGURE 7–10 Cross-sectional view of a MOSFET drain junction. The shallow junction
extension next to the channel helps to suppress the Vt roll-off.
[Figure 7–11 panels: (a) metal source/drain MOSFET on a P-body; (b) source-channel-drain band diagram with EF at Vg = 0; (c) the same at Vg = Vt; (d) conventional MOSFET with N+ source and drain at Vg = Vt]
FIGURE 7–11 (a) Metal source/drain is the ultimate way to reduce the increasingly
important parasitic resistance; (b) energy band diagrams in the off state; (c) in the on state
there may be energy barriers impeding current flow. These barriers do not exist in the
conventional MOSFET (d) and must be minimized.
P+ Si. The only problem is that the Schottky-S/D MOSFET would have a lower Id
than the regular MOSFET if φB is too large to allow easy flow of carriers (electrons
for NFET) from the source into the channel.
Figure 7–11b shows the energy band diagram drawn from the source along the
channel interface to the drain. Vds is set to zero for simplicity. The energy diagram is
similar to that of a conventional MOSFET at Vg = 0 in that a potential barrier stops
the electrons in the source from entering the channel and the transistor is off. In the
on state, Fig. 7–11c, channel Ec is pulled down by the gate voltage, but not at the
source/drain edge, where the barrier height is fixed at φB (see Section 4.16). These
barriers do not exist in a conventional MOSFET, as shown in Fig. 7–11d, and they
can degrade Id of the metal S/D MOSFET.
To unleash the full potential of the Schottky S/D MOSFET, a very low-φB
Schottky junction technology should be used (for NFETs). A thin N+ region can be
added between the metal and the channel. This minimizes the effect of the barriers
on current flow as shown in Fig. 4–46. Attention must be paid to reducing the large
reverse leakage current of a low-φBn Schottky drain-to-body junction [9].
Subthreshold Ioff would not be a problem if Vt is set at a very high value. That is not
acceptable because a high Vt would reduce Ion and therefore reduce circuit speed.
Using a larger Vdd can raise Ion, but that is not acceptable either because it would
raise the power consumption, which is already too large for comfort. Decreasing L
can raise Ion but would also reduce Vt and raise Ioff.
Figure 7–12 shows a plot of log Ioff vs. Ion of a large number of transistors [2].
The trade-off between the two is clear. Higher Ion goes hand-in-hand with larger
Ioff. The spread in Ion (and Ioff) is due to a combination of unintentional
manufacturing variances in Lg and Vt and intentional difference in the gate length.
Techniques have been developed to address the strong trade-off between Ion
and Ioff, i.e., between speed and standby power consumption.
One technique gives circuit designers two or three (or even more) Vts to
choose from. A large circuit may be designed with only the high-Vt devices first.
Circuit timing simulations are performed to identify those signal paths and circuits
where speed must be tuned up. Intermediate-Vt devices are substituted into them.
Finally, low-Vt devices are substituted into those few circuits that need even more
help with speed. A similar strategy provides multiple Vdd. A higher Vdd is provided
to a small number of circuits that need speed while a lower Vdd is used in the other
circuits. The larger Vdd provides higher speed and/or allows a larger Vt to be used
(to suppress leakage). Yet the dynamic power consumption (see Eq. (6.7.6)) can be
kept low because most of the circuits operate at the lower Vdd.
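The multiple-Vt design flow described above amounts to a simple selection rule: start every path at the highest Vt and move to a lower Vt only where timing requires it. The sketch below is a deliberately simplified illustration of that rule, working on whole paths rather than individual circuits. The delay scale factors, path names, and clock period are hypothetical; in practice the delays come from circuit timing simulation.

```python
def assign_vt_flavors(high_vt_delays_ps, clock_period_ps, flavors):
    """Pick, per path, the least leaky Vt flavor that still meets timing.

    high_vt_delays_ps: path name -> delay when built only from high-Vt devices.
    flavors: (name, delay_scale) pairs ordered from highest Vt to lowest Vt.
    Returns path name -> flavor name, or None if no flavor meets the clock period.
    """
    assignment = {}
    for path, delay_ps in high_vt_delays_ps.items():
        assignment[path] = None
        for name, delay_scale in flavors:
            if delay_ps * delay_scale <= clock_period_ps:
                assignment[path] = name
                break
    return assignment


# Hypothetical delay scale factors; real values come from timing simulation.
FLAVORS = [("high-Vt", 1.0), ("mid-Vt", 0.9), ("low-Vt", 0.8)]
paths = {"decode": 900.0, "alu": 1050.0, "bypass": 1200.0, "multiplier": 1400.0}
result = assign_vt_flavors(paths, clock_period_ps=1000.0, flavors=FLAVORS)
print(result)
```

A path that comes back as None cannot be fixed by Vt selection alone and would need another remedy, such as the higher Vdd mentioned in the text.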
[Figure 7–12 plot: Ioff (nA/µm, log scale from 1 to 1000) vs. Ion (mA/µm, linear scale from 0.9 to 1.5)]
FIGURE 7–12 Log Ioff vs. linear Ion. The spread in Ion (and Ioff) is due to the presence of
several slightly different drawn Lgs and unintentional manufacturing variations in Lg and Vt.
(After [2]. © 2003 IEEE.)
There are alternative MOSFET structures that are less susceptible to Vt roll-off
and allow gate length scaling beyond the limit of conventional MOSFET.
Figure 7–6 gives a simple description of the competition between the gate and the
drain over the control of the channel barrier height shown in Fig. 7–5. We want to
maximize the gate-to-channel capacitance and minimize the drain-to-channel
capacitance. To do the former, we reduce Tox as much as possible. To accomplish
the latter, we reduce Wdep and Xj as much as possible. It is increasingly difficult to
make these dimensions smaller. The real situation is even worse. In the
subthreshold region, Tox may be a small part of Toxe in Eq. (7.3.4) because the
inversion-layer thickness, Tinv in Sec. 5.9, is large. Imagine that Tox could be made
infinitesimally small. This would give the gate a perfect control over the potential
barrier height—but only right at the Si surface. The drain could still have more
control than the gate along other leakage current paths that are some distance
below the Si surface as shown in Fig. 7–13. At this submerged location, the gate is
far away and the gate control is weak. The drain voltage can pull the potential
barrier down and allow leakage current to flow along this submerged path. There
are two transistor structures that can eliminate the leakage paths that are far away
from the gate [10]. One is called the ultra-thin-body MOSFET or UTB MOSFET.
The other is the multigate MOSFET. They are presented next.
[Figure 7–13 drawing labels: source (S), drain (D), Cg, Cd, leakage path]
FIGURE 7–13 The drain could still have more control than the gate along another leakage
current path that is some distance below the Si surface.
[Figure 7–14 micrograph labels: gate, source, drain, SiO2, TSi = 3 nm]
FIGURE 7–14 The SEM cross section of UTB device. (After [11]. © 2000 IEEE.)
[Figure 7–15 plot: log-scale current vs. gate voltage, Vg (0 to 1.0 V), for TSi = 7 nm, 5 nm, and 3 nm; inset shows gate (G), source (S), and drain (D) on SiO2]
FIGURE 7–15 The subthreshold leakage is reduced as the Si film (transistor body) is made
thinner. Lg = 15 nm. (After [11]. © 2000 IEEE.)
● SOI-Silicon on Insulator ●
Figure 7–16 shows the steps of making an SOI or silicon-on-insulator wafer [12].
(The conventional wafer is sometimes called bulk silicon wafer for clarity.) Step 1 is
to implant hydrogen into a silicon wafer that has a thin SiO2 film at the surface. The
hydrogen concentration peaks at a distance D below the surface. Step 2 is to place
the first wafer, upside down, over a second plain wafer. The two wafers adhere to
each other by the atomic bonding force. A low temperature annealing causes the
two wafers to fuse together. Step 3 is to apply another annealing step that causes
the implanted hydrogen to coalesce and form a large number of tiny hydrogen
bubbles at depth D. This creates sufficient mechanical stress to break the wafer at
that plane. The final step, Step 4, is to polish the surface. Now the SOI wafer is
ready for use.
The Si film is of high quality and suitable for IC manufacturing. Even without
using an ultra-thin body, SOI provides a speed advantage because the source/drain to
body junction capacitance is practically eliminated as the source and drain diffusion
regions extend vertically to the buried oxide. The cost of an SOI wafer is higher than
an ordinary Si wafer and increases the cost of IC chips. For these reasons, only some
microprocessors, which command high prices and compete on speed, have employed
this technology so far. Figure 7–17 shows the cross-sectional SEMs of an SOI
product. SOI also finds other compelling applications because it offers extra
flexibility for making novel structures such as the ultra-thin-body MOSFET and
some multigate MOSFET structures that can be scaled to smaller gate length beyond
the capability of bulk MOSFETs.
[Figure 7–16 diagram: Steps 1 to 4 of making an SOI wafer from wafer A (implanted with H ions) and wafer B; the outputs are an SOI wafer and a bulk Si wafer that can serve as a new wafer A or B]
[Figure 7–17 micrograph labels: Si, buried oxide, silicon substrate]
FIGURE 7–17 The cross-sectional electron micrograph of an SOI integrated circuit. The lower
level structures are transistors and contacts. The upper two levels are the vias and the
interconnects, which employ multiple layers of materials to achieve better reliability and etch stops.
[Figure 7–18 drawing labels: Gate 1, Vg, Tox, second gate]
very thin so that no leakage path is far from one of the gates. (The worst-case path
is along the center of the Si film.) Therefore, the gate(s) can suppress leakage
current more effectively than the conventional MOSFET. Because there is more
than one gate, the structure may be called a multigate MOSFET. The structure shown
in Fig. 7–18 is a double-gate MOSFET. Shrinking TSi automatically reduces Wdep
and Xj in Eq. (7.3.4), and Vt roll-off can be suppressed to allow Lg to shrink to as
small as a few nm. Because the top and bottom gates are at the same voltage and
the Si film is fully depleted, the Si surface potential moves up and down with Vg, mV
for mV, in the subthreshold region. The voltage divider effect illustrated in Fig. 7–2c
does not exist, so η in Eq. (7.2.4) is the desired unity and Ioff is very low. There is
no need for heavy doping in the channel to reduce Wdep . This leads to low vertical
field and less impurity scattering; as a result the mobility is higher (see Section 6.3).
Finally, there are two channels (top and bottom) to conduct the transistor current.
For these reasons, a multigate MOSFET can have shorter Lg, lower Ioff, and larger
Ion than a single-gate MOSFET. But, there is one problem—how to fabricate the
multigate MOSFET structure.
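The benefit of η approaching unity can be put in numbers with a short sketch. It assumes the usual relation S = η(kT/q)ln 10 for the subthreshold swing and, for the bulk comparison, the estimate η ≈ 1 + 3Toxe/Wdep; both are taken here as assumptions, since the corresponding equations are not reproduced in this section. The function names and the example values of Toxe and Wdep are illustrative only.

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant divided by electron charge, V/K


def subthreshold_swing_mv(eta: float, temp_k: float = 300.0) -> float:
    """Subthreshold swing S = eta * (kT/q) * ln(10), in mV per decade."""
    return eta * K_B_OVER_Q * temp_k * math.log(10.0) * 1e3


def eta_bulk(toxe_nm: float, wdep_nm: float) -> float:
    """Assumed bulk-MOSFET estimate: eta = 1 + Cdep/Coxe ~ 1 + 3*Toxe/Wdep."""
    return 1.0 + 3.0 * toxe_nm / wdep_nm


if __name__ == "__main__":
    # Fully depleted multigate body: eta is taken as 1 (the ideal case).
    s_multigate = subthreshold_swing_mv(1.0)
    # Hypothetical bulk device: Toxe = 2 nm, Wdep = 20 nm.
    eta_b = eta_bulk(toxe_nm=2.0, wdep_nm=20.0)
    s_bulk = subthreshold_swing_mv(eta_b)
    print(f"multigate, eta = 1.00: S = {s_multigate:.1f} mV/decade")
    print(f"bulk,      eta = {eta_b:.2f}: S = {s_bulk:.1f} mV/decade")
```

With these assumed numbers the ideal multigate swing is close to 60 mV/decade at 300 K, while the bulk example is roughly 30% larger, so the same Vt buys less leakage reduction in the bulk device.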
There is a multigate structure that is attractive for its simplicity of
fabrication and it is illustrated in Fig. 7–19. Consider the center structure in
Fig. 7–19. The process starts with an SOI wafer or a bulk Si wafer. A thin fin of Si
is created by lithography and etching. Gate oxide is grown over the exposed
surfaces of the fin. Poly-Si gate material is deposited over the fin and the gate is
patterned by lithography and etching. Finally, source/drain implantation is
performed. The final structure in Fig. 7–19 is basically the multigate structure in
Fig. 7–18 turned on its side. This structure is called the FinFET because its Si body
resembles the back fin of a fish [13]. The channel consists of the two vertical
surfaces and the top surface of the fin. The channel width, W, is the sum of twice
the fin height and the width of the fin.
[Figure 7–19 panels, left to right: tall FinFET, short FinFET, nanowire FET. Labels: G (gate), S (source), D (drain), Lg, oxide.]
FIGURE 7–19 Variations of FinFET. Tall FinFET has the advantage of providing a large W
and therefore large Ion while occupying a small footprint. Short FinFET has the advantage of
less challenging lithography and etching. Nanowire FET gives the gate even more control over
the transistor body by surrounding it. FinFETs can also be fabricated on bulk Si substrates.
[Figure 7–20 panels: current (log scale) versus gate voltage (V), and current versus drain voltage (V). Annotations: R = 12.5 nm and 2.5 nm, Tox = 1.5 nm, L = 1 µm, Vds = 1 V, Vgs = 1.5 V and 2 V; 3-D simulation compared with model. Inset labels: Gate, Source, Drain.]
FIGURE 7–20 Simulated I–V curves of a nanowire MOSFET. R is the nanowire radius. (After [16].)
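The FinFET channel-width rule stated in the text (W equals twice the fin height plus the fin width) can be written as a small sketch that compares a tall and a short fin. The fin dimensions and the fin pitch below are hypothetical values chosen only for illustration, and the "width per footprint" figure of merit is an assumption of this sketch rather than a quantity defined in the text.

```python
def finfet_channel_width_nm(fin_height_nm: float, fin_width_nm: float) -> float:
    """Channel width of one fin: two vertical sidewalls plus the top surface."""
    return 2.0 * fin_height_nm + fin_width_nm


def width_per_footprint(fin_height_nm: float, fin_width_nm: float,
                        fin_pitch_nm: float) -> float:
    """Channel width delivered per unit of layout width (dimensionless).

    fin_pitch_nm is the layout width consumed by one fin (hypothetical).
    """
    return finfet_channel_width_nm(fin_height_nm, fin_width_nm) / fin_pitch_nm


if __name__ == "__main__":
    pitch = 40.0  # nm, assumed layout pitch per fin
    for name, height, width in (("tall fin", 50.0, 10.0), ("short fin", 10.0, 10.0)):
        w = finfet_channel_width_nm(height, width)
        ratio = width_per_footprint(height, width, pitch)
        print(f"{name}: W = {w:.0f} nm, W per footprint = {ratio:.2f}")
```

A ratio above 1 means the fin supplies more channel width than the layout width it occupies, which is the footprint advantage the text attributes to the tall FinFET.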
Several variations of FinFET are shown in Fig. 7–19 [14,15]. A tall FinFET
has the advantage of providing a large W and therefore large Ion while occupying
a small footprint. A short FinFET has the advantage of less challenging lithography
and etching. In the short FinFET, the top surface of the fin contributes significantly
to the suppression of Vt roll-off and to leakage control. This structure is also known
as a triple-gate MOSFET. The third variation gives the gate even more control over
the Si wire by surrounding it. It may be called a nanowire FET, and its behavior, shown
in Fig. 7–20, can be modeled with the same methods and concepts used to model the
basic MOSFETs. FinFETs with Lg as small as 3 nm have been experimentally
demonstrated. The FinFET will allow transistor scaling beyond the scaling limit of the
conventional planar transistor.
Output conductance limits the transistor voltage gain. It was introduced in
Section 6.13, but its cause and theory are intimately related to those of Vt
roll-off. Therefore, the present chapter is a fitting place to explain it.
where ld is given in Eq. (7.3.4). This component of gds can also be suppressed with
larger L and smaller Tox, Xj, and Wdep .
There are commercially available computer simulation suites [17] that solve all the
equations presented in this book with few or no approximations (e.g., Fermi–Dirac
statistics is used rather than Boltzmann approximation). Most of these equations
are solved simultaneously, e.g., Fermi–Dirac probability, incomplete ionization of
dopants, drift and diffusion currents, current continuity equation, and Poisson
equation. Device simulation is an important tool that provides the engineers with
quick feedback about device behaviors. This narrows down the number of variables
that need to be checked with expensive and time-consuming experiments.
Examples of simulation results are shown in Figs. 7–15 and 7–20. Each of the figures
takes from minutes to several hours of simulation time to generate.
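One of the approximations that device simulators avoid, the Boltzmann approximation to Fermi–Dirac statistics, is easy to examine numerically. The sketch below compares the two occupation probabilities as a function of x = (E − EF)/kT; it is only an illustration of the size of the approximation error, not a description of how any particular simulator is implemented.

```python
import math


def fermi_dirac(x: float) -> float:
    """Occupation probability 1 / (1 + exp(x)), with x = (E - EF) / kT."""
    return 1.0 / (1.0 + math.exp(x))


def boltzmann(x: float) -> float:
    """Boltzmann approximation exp(-x), intended for x well above zero."""
    return math.exp(-x)


def relative_error(x: float) -> float:
    """Fractional overestimate made by the Boltzmann approximation."""
    return (boltzmann(x) - fermi_dirac(x)) / fermi_dirac(x)


if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0, 5.0):
        print(f"(E - EF)/kT = {x:.0f}: "
              f"FD = {fermi_dirac(x):.4e}, "
              f"Boltzmann = {boltzmann(x):.4e}, "
              f"error = {100.0 * relative_error(x):.1f}%")
```

Algebraically the relative error equals exp(−x), so the approximation is off by about 5% when the energy is 3kT above the Fermi level and by 100% at the Fermi level itself, which is why heavily doped regions call for the full Fermi–Dirac expression.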
Related to device simulation is process simulation. The inputs that a user
provides to the process simulation program are the lithography mask pattern,
implantation dose and energy, temperatures and times for oxide growth and
annealing steps, etc. The process simulator then generates a two- or
three-dimensional structure with all the deposited or grown and etched thin films
and doped regions. This output may be fed into a device simulator, together with
the applied voltages and the operating temperature, as its input.
[Figure 7–21 panels: Id (mA) versus Vd (V), and log Id (A) versus Vg (V). Lines: model; symbols: data.]
FIGURE 7–21 Selected comparisons of BSIM and measured device data to illustrate the accuracy of a
compact model. (After [18].)
[Figure 7–22 panels: Vth (V) versus L (µm) at several Vbs values, and Idsat (mA) versus L (µm) at several Vgs values; W = 20 µm, Tox = 9 nm.]
FIGURE 7–22 A compact model needs to accurately model the transistor behaviors for any L and W that
circuit designers may specify. (After [19]. © 1997 IEEE.)
and drain junction diode model. Noise and high-frequency models are usually
provided, too.
To reduce cost and improve speed in order to open up new applications, transistors
and interconnects are downsized periodically. Very small MOSFETs are prone to
have excessive leakage current called Ioff. The basic component of Ioff is the
subthreshold current
$$I_{off}\,(\mathrm{nA}) = 100 \cdot \frac{W}{L} \cdot e^{-qV_t/\eta kT} = 100 \cdot \frac{W}{L} \cdot 10^{-V_t/S} \qquad (7.2.8)$$
S is the subthreshold swing. To keep Ioff below a given level, there is a minimum
acceptable Vt. Unfortunately, a larger Vt is deleterious to Ion and speed.
Therefore, it is important to reduce S by reducing the ratio Toxe/Wdep. Furthermore,
Vt decreases as L decreases, a fact known as Vt roll-off, which is caused by DIBL.
$$V_t = V_{t\text{-long}} - (V_{ds} + 0.4\,\mathrm{V}) \cdot e^{-L/l_d} \qquad (7.3.3)$$
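Equations (7.2.8) and (7.3.3) can be chained to see how Vt roll-off feeds into leakage. The sketch below is a minimal illustration under stated assumptions: ld is passed in as a plain number because Eq. (7.3.4) is not reproduced in this section, and the values of Vt-long, Vds, ld, S, and W are hypothetical.

```python
import math


def vt_with_rolloff(vt_long_v: float, vds_v: float,
                    length_nm: float, ld_nm: float) -> float:
    """Eq. (7.3.3): Vt = Vt-long - (Vds + 0.4 V) * exp(-L / ld)."""
    return vt_long_v - (vds_v + 0.4) * math.exp(-length_nm / ld_nm)


def ioff_na(w_over_l: float, vt_v: float, swing_v_per_decade: float) -> float:
    """Eq. (7.2.8): Ioff (nA) = 100 * (W/L) * 10^(-Vt / S)."""
    return 100.0 * w_over_l * 10.0 ** (-vt_v / swing_v_per_decade)


if __name__ == "__main__":
    vt_long = 0.40   # V, assumed long-channel threshold voltage
    vds = 1.0        # V, assumed drain bias
    ld = 10.0        # nm, assumed characteristic length (stands in for Eq. 7.3.4)
    swing = 0.085    # V/decade, assumed subthreshold swing
    width = 1000.0   # nm, assumed channel width
    for length in (100.0, 50.0, 30.0):
        vt = vt_with_rolloff(vt_long, vds, length, ld)
        ioff = ioff_na(width / length, vt, swing)
        print(f"L = {length:5.0f} nm: Vt = {vt:.3f} V, Ioff = {ioff:.3g} nA")
```

Shortening L raises Ioff twice over in this sketch: once through the W/L prefactor and again, much more strongly, through the exponential dependence on the reduced Vt.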
● PROBLEMS ●
7.1 Assume that the gate oxide between an n+ poly-Si gate and the p-substrate is 11 Å
thick and Na = 1E18 cm–3.
(a) What is the Vt of this device?
(b) What is the subthreshold swing, S?
(c) What is the maximum leakage current if W = 1 µm, L = 18 nm? (Assume Ids =
100 W/L (nA) at Vg = Vt.)
● Vt Roll-off ●
7.3 Qualitatively sketch log(Ids) vs. Vg (assume Vds = Vdd) for the following:
(a) L = 0.2 µm, Na = 1E15 cm–3.
(b) L = 0.2 µm, Na = 1E17 cm–3.
7.4 Does each of the following changes increase or decrease Ioff and Ion? A larger Vt. A
larger L. A shallower junction. A smaller Vdd. A smaller Tox. Which of these changes
contribute to leakage reduction without reducing the precious Ion?
7.5 There is a lot of concern that we will soon be unable to extend Moore’s Law. In your
own words, explain this concern and the difficulties of achieving high Ion and low Ioff.
(a) Answer this question in one paragraph of less than 50 words.
(b) Support your description in (a) with three hand-drawn sketches of your choice.
(c) Why is it not possible to maximize Ion and minimize Ioff by simply picking the right
values of Tox, Xj, and Wdep? Please explain in your own words.
(d) Provide three equations that help to quantify the issues discussed in (c).
7.6 (a) Rewrite Eq. (7.3.4) in a form that does not contain Wdep but contains Vt. Do so by
using Eqs. (5.5.1) and (5.4.3) assuming that Vt is given.
(b) Based on the answer to (a), state what actions can be taken to reduce the
minimum acceptable channel length.
7.7 (a) What is the advantage of having a small Wdep?
(b) For given L and Vt, what is the impact of reducing Wdep on Idsat and gate delay? (Hint:
consider the “m” in Chapter 6)
Discussion: Overall, smaller Wdep is desirable because it is more important to be able
to suppress Vt roll-off so that L can be scaled.
7.8 Assume an N-channel MOSFET with an N+ poly gate and a substrate with an idealized
retrograde substrate doping profile as shown in Fig. 7–23.
FIGURE 7–23 [Doping profile plot: Nsub versus depth x, with labels P, Tox, and Xrg.]
(a) Draw the energy band diagram of the MOSFET along the x direction from the
gate through the oxide and the substrate, when the gate is biased at threshold
voltage. (Hint: Since the P region is very lightly doped, you may assume that the
field in this region is constant, i.e., dℰ/dx = 0.) Assume that the Fermi level in the P+
region coincides with Ev and the Fermi level in the N+ gate coincides with Ec.
Remember to label Ec, Ev, and EF.
(b) Find an expression for Vt of this ideal retrograde device in terms of Vox. Assume
Vox is known. (Hint: Use the diagram from (a) and remember that Vt is the
difference between the Fermi levels in the gate and in the substrate. At threshold,
Ec of Si coincides with the Fermi level at the Si–SiO2 interface).
(c) Now write an expression for Vt in terms of Xrg, Tox, εox, εsi and any other common
parameters you see fit, but not in terms of Vox. Hint: Remember Nsub in the lightly
doped region is almost 0, so if your answer is in terms of Nsub, you might want to
rethink your strategy. Maybe εoxℰox = εsiℰsi, where ℰ is the electric field, could be a starting point.
(d) Show that the depletion layer width, Wdep, in an ideal retrograde MOSFET can be
about half the Wdep of a uniformly doped device and still yield the same Vt.
(e) What is the advantage of having a small Wdep?
(f) For given L and Vt, what is the impact of reducing Wdep on Idsat and inverter
delay?
● REFERENCES ●
1. International Technology Roadmap for Semiconductors (http://public.itrs.net/)
2. Ghani, T., et al. “A 90 nm High Volume Manufacturing Logic Technology Featuring Novel
45 nm Gate Length Strained Silicon CMOS Transistors,” IEDM Technical Digest. 2003,
978–980.
3. Yeo, Y-C., et al. “Enhanced Performance in Sub-100nm CMOSFETs Using Strained
Epitaxial Si-Ge.” IEDM Technical Digest. 2000, 753–756.
4. Liu, Z. H., et al. “Threshold Voltage Model for Deep-Submicrometer MOSFETs.” IEEE
Trans. on Electron Devices. 40, 1 (January 1993), 86–95.
5. Wann, C. H., et al. “A Comparative Study of Advanced MOSFET Concepts.” IEEE
Transactions on Electron Devices. 43, 10 (October 1996), 1742–1753.
6. Yeo, Yee-Chia, et al. “MOSFET Gate Leakage Modeling and Selection Guide for
Alternative Gate Dielectrics Based on Leakage Considerations.” IEEE Transactions on
Electron Devices. 50, 4 (April 2003), 1027–1035.
7. Lu, Q., et al. “Dual-Metal Gate Technology for Deep-Submicron CMOS Transistor,” Symp.
on VLSI Technology Digest of Technical Papers, 2000, 72–73.
8. Chen, I. C., et al. “Electrical Breakdown in Thin Gate and Tunneling Oxides.” IEEE Trans.
on Electron Devices. ED-32 (February 1985), 413–422.
9. Kedzierski, J., et al. “Complementary Silicide Source/Drain Thin-Body MOSFETs for the
20 nm Gate Length Regime.” IEDM Technical Digest, 2000, 57–60.
10. Hu, C. “Scaling CMOS Devices Through Alternative Structures,” Science in China
(Series F). February 2001, 44 (1) 1–7.
11. Choi, Y-K., et al. “Ultrathin-body SOI MOSFET for Deep-sub-tenth Micron Era,” IEEE
Electron Device Letters. 21, 5 (May 2000), 254–255.
12. Celler, George, and Michael Wolf. “Smart Cut™ A Guide to the Technology, the Process,
the Products,” SOITEC. July 2003.
13. Huang, X., et al. “Sub 50-nm FinFET: PMOS.” IEDM Technical Digest, (1999), 67–70.
14. Yang, F-L, et al. “25 nm CMOS Omega FETs.” IEDM Technical Digest. (1999), 255–258.
15. Yang, F-L, et al. “5 nm-Gate Nanowire FinFET.” VLSI Technology, 2004. Digest of
Technical Papers, 196–197.
16. Lin, C-H., et al. “Corner Effect Model for Compact Modeling of Multi-Gate MOSFETs.”
2005 SRC TECHCON.
17. Taurus Process, Synopsys TCAD Manual, Synopsys Inc., Mountain View, CA.
18. http://www-device.eecs.berkeley.edu/~bsim3/bsim4.html
19. Cheng, Y., et al. “A Physical and Scalable I-V Model in BSIM3v3 for Analog/Digital Circuit
Simulation.” IEEE Trans. on Electron Devices. 44, 2, (February 1997), 277–287.
● GENERAL REFERENCES ●
1. Taur, Y., and T. H. Ning. Fundamentals of Modern VLSI Devices. Cambridge, UK:
Cambridge University Press, 1998.
2. Wolf, S. VLSI Devices. Sunset Beach, CA: Lattice Press, 1999.