State-of-the-art TCAD: 25 years ago and today

M. Stettler, S. Cea, S. Hasan, L. Jiang, A. Kaushik, P. Keys, R. Kotlyar, C. Landon, D. Pantuso, A. Slepko, S. Smith, V. Tiwari, C. Weber, and J. R. Weber
Logic Technology Division, Intel Corporation, Hillsboro, OR, USA, email: [email protected]

Abstract—In the past 25 years, process and device TCAD used in direct support of industrial process development has undergone radical change. In the era of Dennard scaling [1], TCAD efforts in manufacturers such as Intel could arguably be described as advanced applications work. However, with the advent of nanometer device dimensions, the need to calculate fundamental material properties on the fly, resolve quantum effects, and understand the role of atomic-scale defects has shifted TCAD from engineering towards research. Rigorous solutions to Schrödinger's equation based on NEGF and DFT and semi-classical solutions of the BTE are now in routine use. Concurrently, aging continuum models such as drift-diffusion continue to be infused with more rigorous approaches to maintain accuracy while still affording the fast turn-around-time required by today's development, which now involves a significant number of novel options under simultaneous consideration. This talk will contrast Intel's TCAD environment of 25 years ago with today's and give examples of studies which illustrate the evolution.

I. INTRODUCTION

Although every era has its share of challenges, today is by far the most fascinating time to be a modeling engineer working in semiconductor process development. From its advent in the mid-1980s until early this decade, the primary role of industrial TCAD was finding ways to control the deleterious leakage effects arising from silicon transistor miniaturization, which was pursued each generation in a very formulaic way by scaling practically every geometric attribute. The role of the modeling engineer was thus to simulate, in the most realistic way possible, the electrical field lines arising from geometry and active doping in a planar silicon device and then recommend process conditions which reduced these lines in the device off-condition and accentuated them in the on-condition as measured by terminal current. It was complicated work to be sure, involving detailed continuum simulations of dopant implant, diffusion and activation phenomena in silicon as a function of actual process conditions and then capturing the resulting electronic behavior using basic silicon drift-diffusion (D-D) models, calibrated by reverse engineering experimental characteristics. Today, the fundamental objective of maintaining an ideal scaled switch remains; however, in the last 5 years nearly every other aspect of the TCAD engineer's job has exploded, increasing in both scope and rigor. As illustrated in Fig. 1, the number of design parameters TCAD must now evaluate has drastically increased, including use of novel and “designer” materials throughout the active device region and gate, the expansion of device architecture into the 3rd dimension, and the deliberate exploitation of electronic phenomena such as those arising from atomic strain, size, and other quantum effects. TCAD is also now tasked with simulating systems across the entire dimensional spectrum, from atomistic scale features and defects to device-related circuit and even die-scale behavior. As shown in Fig. 2, solving all of the problems of interest accurately and efficiently across the dimensional scale has greatly multiplied the number of simulation tools needed in the TCAD engineer's toolkit, which has increased 5X since the 1990s.

Fig. 1. Design parameters and the scale of systems expected to be addressed by TCAD simulation in 2019 compared to 1994 (in red).

The objective of this work is to describe two aspects of industrial TCAD support that once resided primarily in academic research but now are routine: 1) the use of atomistic and large scale simulations for device design and understanding and 2) the continuing necessity to close the gap between rigorous methods and more computationally efficient approaches to meet both accuracy and wall-clock time requirements of live process development. Each of these trends will be illustrated with applications work done at Intel.

II. ATOMISTIC TO DIE-LEVEL SIMULATION

Extreme scaling has made atomistic simulation, where the effect of each atom is explicitly accounted for, an essential part of the TCAD process and device toolbox for three reasons. First, much of device physics, such as phenomena responsible for transistor leakage, has pushed past Newtonian physics into the quantum regime. Second, material properties, in particular those that depend on bandstructure, are no longer described by their bulk values and must be computed on the fly with each change in device dimension. Finally, miniaturization has sparked a growing need to resolve atomic-scale defects and interfacial properties for device control and optimization.

978-1-7281-4032-2/19/$31.00 ©2019 IEEE 39.1.1 IEDM19-943


Authorized licensed use limited to: Auckland University of Technology. Downloaded on May 28,2020 at 05:09:34 UTC from IEEE Xplore. Restrictions apply.
Fig. 2. The number of simulation tools used by TCAD to support technology development has increased considerably since 1994.

Estimating the value-proposition of future device technologies is one of the major responsibilities of TCAD. In this pursuit, rigorous atomistic methods are often necessary to accurately compute performance in novel transistors where the dominant device effects can't always be predicted. Consider a high mobility Ge nanowire transistor with a 5nm x 5nm cross-section and 20 nm gate length, simulated by drift-diffusion with a coupled tunneling model and NEMO [2], an atomistic NEGF-based approach, in the ballistic approximation and with scattering. For this device, an unanticipated leakage mechanism is dominant which is not captured accurately by the first two approaches. The mechanism is band-to-band induced barrier lowering (BIBL), which can be seen in the low Vg regime of the i-v curves shown in Fig. 3 for all three methods. At high drain/low gate bias, the conduction band edge of the channel region overlaps with the valence band edge of the drain, causing electrons to tunnel from the drain to the channel region. These electrons become trapped in the large potential well of the channel conduction band, and the resulting floating-body effect lowers the barrier for holes from the source, greatly increasing the off-current. The effect is self-limiting because the floating body effect lowers the channel potential such that the overlap between the channel and drain band edges lessens, which prevents further tunneling. The three approaches featured in Fig. 3 show significant differences in their estimates of the off-current. The quantum-potential corrected drift-diffusion model with band-to-band tunneling overestimates the off-current because it assumes the states within the deep well of the conduction band are continuous, leading to an overestimation of electron build-up and resulting floating body effect. At the other extreme is the ballistic atomistic approach, which underestimates the off-current because the states computed in the conduction band well of the channel are too sharply defined in energy and thus hold little charge, artificially reducing the floating body effect. The most accurate is the atomistic approach with full scattering, which computes broadened states in the conduction band well, leading to increased charge transfer and a more accurate estimation.

Fig. 3. Band-to-Band Induced Barrier Lowering (BIBL) seen in a 5nm x 5nm PMOS Ge nanowire. The lower diagram provides a simple explanation.

Along with predicting the performance of idealized novel devices, industrial TCAD is also tasked with exploring issues related to manufacturability and the interpretation of experimental results. Key to both is predicting the likelihood of specific chemical reactions and non-idealities such as defect formation, which can be accomplished by employing atomistic methods such as density functional theory (DFT), sometimes used alone or in conjunction with device or continuum process simulation. Fig. 4 shows the results of applying DFT to calculate the effects on bandstructure of atomic disorder at the Si/InAs interface within the channel of a tunnel-FET (TFET). For this device, edge dislocations at the Si/InAs interface were calculated using VASP [3] and showed a significant density of trap states within the bandgap, increasing electronic tunneling across the interface and degrading both the subthreshold slope and off-state leakage. This type of DFT calculation usually involves simulating superlattice structures, so it is important to account for spurious quantum confinement effects in the vicinity of the interface which will artificially widen the bandgap and exaggerate the number of tunneling states. DFT analysis such as this is computationally very expensive; this specific study involved simulating 1740 atoms, taking 10 days to calculate the structural relaxation of the atomic system (using the GGA approximation) and 15 days to calculate bandstructure (using hybrid functionals) on 250 cores.

Fig. 4. DFT results for the interface states in a Si-InAs TFET.
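To put the cost of the DFT study in more familiar units, the quoted wall-clock times can be converted to core-hours. The sketch below assumes, which the text does not state, that all 250 cores were busy for the full duration of both steps, so the result is best read as an upper bound; the function and variable names are illustrative only.

```python
# Figures quoted in the text for the Si/InAs dislocation study.
N_ATOMS = 1740
CORES = 250
RELAX_DAYS = 10   # structural relaxation (GGA)
BANDS_DAYS = 15   # bandstructure (hybrid functionals)


def core_hours(days, cores):
    """Core-hours, ASSUMING every core is busy for the entire wall-clock time."""
    return days * 24 * cores


relax_ch = core_hours(RELAX_DAYS, CORES)
bands_ch = core_hours(BANDS_DAYS, CORES)
total_ch = relax_ch + bands_ch

print(f"relaxation:    {relax_ch} core-hours")
print(f"bandstructure: {bands_ch} core-hours")
print(f"total:         {total_ch} core-hours "
      f"(~{total_ch / N_ATOMS:.0f} core-hours per atom)")
```

Under that assumption the study costs on the order of 150,000 core-hours, which illustrates why such calculations are reserved for targeted questions rather than day-to-day screening.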

TCAD is also expected to give physical insight and predictions on reliability issues, especially those which arise from device behavior. One such issue is the temperature impact from device self-heating (SH), which has grown as a major concern with the thermal resistance arising from the FinFET architecture and steadily increasing power density. The scope of modeling this effect spans a huge dimensional range, from fins measured in nanometers to the cell block level and above measured in millimeters. Fig. 5 shows the approach taken by Intel, which starts with a fundamental understanding of the thermal conductivity of the exact FinFET structure using a rigorous solution to the BTE for the phonon system, calculated with MEMOSA [4]. The next step is finite element (FE) analysis of the thermal impact of each drain hotspot on representative circuit layouts, as a function of input power, to determine the lateral and vertical thermal spreading on nearby devices and interconnect layers. After a sufficient number of circuits are simulated with FE, an analytic compact model is calibrated to the results. Next, a 2D mesh is imposed on the functional unit block (FUB) or die to be analyzed. At each node, the results of the compact model for all interconnect layers are evaluated and superimposed, using as inputs details from the local layout and input power computed from Spice simulation. The final step is combining the resulting thermal map from device heating with the map from interconnect heating, computed with a separate, but similar approach. This combined device and interconnect SH modeling approach is many orders of magnitude faster than direct 3D FE simulation and thus suitable for use at the FUB or die level.

Fig. 5. Flowchart describing how device self-heating (SH) is computed on the functional unit block (FUB) level. Images are from finite element analysis of the thermal behavior arising from specific cell geometries.

Fig. 6 shows a comparison of the average temperature calculated with this approach and experimental measurements for a variety of cell blocks and small circuits. Also shown is the resulting thermal map of applying this approach to an entire die, which requires approximately half a day to simulate. This map shows precisely where in the product the thermal budget is at greatest risk and can be used for reliability verification.

Fig. 6. Device self-heating modeling: comparison versus data for various test circuits (left) and the resulting die level thermal map (right).

III. BRIDGING ACCURACY AND EFFICIENCY

Since the turn of the century, the computational demand required by TCAD to support process development at Intel has increased nearly 1000X. As shown in Fig. 7, the first major inflection point occurred when Intel began deliberately engineering device strain, requiring rigorous computation of bandstructure and transport effects resulting from distortion of the silicon lattice, and also the transition from 2D to full 3D process and device simulation in anticipation of a device architecture change from planar silicon to FinFETs. A second inflection occurred with the increased use of NEGF and DFT based atomistic simulations, needed to understand quantum and novel material effects in next generation nanoscale devices.

Fig. 7. Normalized computational demand used by TCAD.

Although computational resources have increased considerably, they still fall short of bestowing the turn-around-time (TAT) necessary for rigorous models to be used in day-to-day technology support. The need for faster TAT has motivated TCAD engineers to seek more efficient versions of rigorous techniques and also to modify less rigorous but vastly more efficient approaches such as D-D, which remains the mainstay of TCAD support despite 30+ years of being declared outdated.

One of the rigorous techniques that has shown promise for great computational simplification is NEGF device simulation using the low rank approximation (LRA) approach described in [5]. The secret of LRA is a much smaller basis set, comprising only the minimum set of wave functions needed to accurately reproduce the bandstructure in the E-k region of interest (i.e. near the band edge) at zero external bias. While this approach has limitations, such as not being applicable to disordered systems, it has been shown to work well for several idealized device architectures of interest. Fig. 8 demonstrates excellent agreement between LRA and full-rank NEMO simulation of both a 5nm x 5nm Si nanowire and 4.5nm Si UTB device, which is a useful representation of an aggressively scaled FinFET. While the results appear identical to NEMO, the computational requirements of the LRA are remarkably lower, as denoted by the wall-clock times specified in the figure. Fig. 8 also shows ongoing work at Intel extending the LRA to include scattering using a formalism described in [6] which so far exhibits robust convergence while preserving current continuity.
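The basis-reduction idea behind the LRA can be illustrated on a toy problem. The sketch below is not the algorithm of [5]; it is a minimal illustration under stated assumptions: a single-orbital, nearest-neighbor tight-binding wire with a hypothetical 5 x 4 site cross-section, a reduced basis built from the lowest Bloch states at a few sampled k points, and projection of the slab matrices onto that basis. All names (`slab_hamiltonian`, `reduced_basis`, and so on) are invented for the example. In this toy model the transverse modes do not depend on k, so the reduced bands match the full ones essentially exactly; for a realistic atomistic Hamiltonian the agreement is only approximate and the basis construction is considerably more involved.

```python
import numpy as np


def slab_hamiltonian(nx, ny, t=1.0):
    """Cross-section (slab) tight-binding matrix for an nx-by-ny square lattice."""
    n = nx * ny
    h0 = np.zeros((n, n))
    for ix in range(nx):
        for iy in range(ny):
            i = ix * ny + iy
            h0[i, i] = 6.0 * t  # on-site energy; the value is arbitrary in this toy
            if ix + 1 < nx:
                j = (ix + 1) * ny + iy
                h0[i, j] = h0[j, i] = -t
            if iy + 1 < ny:
                j = ix * ny + iy + 1
                h0[i, j] = h0[j, i] = -t
    return h0


def bloch(h0, h1, k):
    """Bloch Hamiltonian H(k) = H0 + H1 e^{ik} + H1^dagger e^{-ik}."""
    return h0 + h1 * np.exp(1j * k) + h1.conj().T * np.exp(-1j * k)


def bands(h0, h1, k):
    """Sorted band energies at wave vector k."""
    return np.linalg.eigvalsh(bloch(h0, h1, k))


def reduced_basis(h0, h1, k_samples, n_keep, tol=1e-8):
    """Stack the n_keep lowest Bloch states at each sampled k and orthonormalize."""
    vecs = [np.linalg.eigh(bloch(h0, h1, k))[1][:, :n_keep] for k in k_samples]
    u, s, _ = np.linalg.svd(np.hstack(vecs), full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))  # drop linearly dependent directions
    return u[:, :rank]


nx, ny, t, n_keep = 5, 4, 1.0, 3
h0 = slab_hamiltonian(nx, ny, t)
h1 = -t * np.eye(nx * ny)  # coupling between neighboring slabs

phi = reduced_basis(h0, h1, k_samples=[0.0, 0.2, 0.4], n_keep=n_keep)
h0_r = phi.conj().T @ h0 @ phi  # projected slab matrix
h1_r = phi.conj().T @ h1 @ phi  # projected coupling matrix

k_test = 0.3  # a k point that was not used to build the basis
full = bands(h0, h1, k_test)[:n_keep]
red = bands(h0_r, h1_r, k_test)[:n_keep]
max_err = float(np.max(np.abs(full - red)))
print(f"basis size {phi.shape[1]} of {phi.shape[0]}, max band error {max_err:.2e}")
```

The point of the exercise is the size of the matrices: any solver whose cost grows steeply with matrix dimension benefits directly from working with the small projected matrices instead of the full ones.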

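Returning briefly to the self-heating flow of Section II (Fig. 5), the mesh-and-superimpose step can be sketched in a few lines. Everything specific here is an assumption made for illustration: the exponential form of the compact model, its parameters, the source coordinates and powers, and the use of plain summation to combine contributions. In the flow described in the text, the compact model is calibrated to finite element results and the input powers come from Spice simulation.

```python
import numpy as np


def compact_delta_t(distance_um, power_mw, r_th=2.0, decay_um=1.5):
    """HYPOTHETICAL compact model: temperature rise (K) at a lateral distance
    from a single heat source. The exponential form and both parameters are
    placeholders for a model calibrated to finite element results."""
    return r_th * power_mw * np.exp(-distance_um / decay_um)


def thermal_map(sources, size_um, step_um):
    """Evaluate the compact model at every node of a regular 2D mesh and
    superimpose (sum) the contributions of all sources."""
    xs = np.arange(0.0, size_um[0] + step_um / 2, step_um)
    ys = np.arange(0.0, size_um[1] + step_um / 2, step_um)
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    rise = np.zeros_like(gx)
    for x, y, power in sources:
        rise += compact_delta_t(np.hypot(gx - x, gy - y), power)
    return xs, ys, rise


# Made-up sources: (x in um, y in um, power in mW).
device_sources = [(2.0, 2.0, 0.5), (6.0, 3.0, 1.0)]
interconnect_sources = [(4.0, 4.0, 0.2)]

xs, ys, device_map = thermal_map(device_sources, (8.0, 6.0), 0.5)
_, _, interconnect_map = thermal_map(interconnect_sources, (8.0, 6.0), 0.5)

# Final step: combine the device and interconnect heating maps.
total_map = device_map + interconnect_map
hot = np.unravel_index(np.argmax(total_map), total_map.shape)
print(f"hottest node at x={xs[hot[0]]} um, y={ys[hot[1]]} um, "
      f"rise={total_map[hot]:.2f} K")
```

Because each node only requires evaluating a closed-form expression, the cost scales with the number of nodes times the number of nearby sources, which is consistent with the large speed-up over direct 3D FE simulation claimed in the text.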
Fig. 8. The low rank approximation (LRA) approach to NEGF device simulation [5] for two NMOS device geometries in comparison with NEMO [2]; the wall-clock hours required by each (for ballistic simulation) are noted.

Despite algorithmic improvements, rigorous atomistic techniques such as NEGF and semi-classical methods like Monte Carlo remain too sluggish for day-to-day technology support and unwieldy to rectify when they don't match data. This leaves the venerable D-D method which, even for sub-10 nm devices, can be made sufficiently predictive within a given technology node to be incredibly useful for process option selection and experimental understanding. D-D is able to accomplish this for three reasons. First, for a given node, the basic architecture and the range of device dimensions which meet technology goals are fixed, so the design window is relatively narrow. Second, D-D is one of the only transport approaches that can handle fully realistic device geometries. Third is the simplicity of bolting to D-D both the calibration-friendly phenomenological models and the more fundamental techniques needed to capture experimental trends.

Fig. 9 shows how the entire device model hierarchy is harnessed to support D-D simulation at Intel. In the diagram, many of the simulation elements reside in Intel's Modular Device Simulator, which provides a framework where solution methods and output and input fields can easily be combined without requiring significant rework [7]. For a given technology node, many fundamental relationships can be established a priori, such as bandstructure, computed using pseudopotential, k.p or tight binding methods, and the impact of quantum confinement on charge density vs. applied bias, computed with full-band and multi-valley Schrödinger-Poisson solvers. The bandstructure is then read directly by a Monte Carlo (MC) device simulator, which is used to calibrate the mobility model as a function of device length, and also by a fast 1D NEGF-based algorithm for tunneling, while the charge profiles are used to calibrate the corrections needed to mimic quantum confinement effects in both MC and D-D. After this prework and starting with models generated from previous process revisions, an extensive calibration of Intel's elaborate mobility models is performed involving successive rounds of process simulation to generate the device structure and stress fields (using a proprietary version of FLOOPS [8]) followed by device simulation, whose results are then compared to experimental data comprising transistor i-v, c-v, and contact resistance measured from Kelvin structures. The calibration work can be time-consuming for the first process revision of a new node, but usually requires only small tweaks afterwards, and the result can be employed for nearly the life of a process node to explain experiments, vet proposed process changes, and construct improvement roadmaps. Fig. 10 shows some of the simulated transistor characteristics for Intel's 10nm process [9].

Fig. 9. Models involved in drift-diffusion calibration and application.

Fig. 10. Comparison of simulated vs. experimental results for NMOS devices from a revision of Intel's 10nm process [9]; matches with PMOS (not shown) show similar agreement.

IV. CONCLUSION

TCAD today remains a vital metrology and a singular, albeit imperfect, crystal ball for industrial semiconductor process design. While over the years the toolkit at Intel and elsewhere has profoundly expanded with atomistic and multi-scale approaches now in daily use, continuum models such as drift-diffusion still enjoy routine application, a phenomenon which modeling experts 40 years ago never would have predicted.

REFERENCES

[1] R. Dennard et al., “Design of ion implanted MOSFETs with very small physical dimensions,” IEEE J. of Solid State Circuits (1974).
[2] G. Klimeck, F. Oyafuso, T. Boykin, et al., “Development of a nanoelectronic 3-D (NEMO 3-D) simulator for multimillion atom simulations and application to alloyed quantum dots,” CMES (2002).
[3] G. Kresse and J. Hafner, “Ab initio molecular dynamics for liquid metals,” Phys. Rev. B (1993).
[4] J. Loy, S. Mathur, and J. Murthy, “A coupled ordinates method for the convergence acceleration of the phonon Boltzmann transport equation,” J. Heat Transfer (2015).
[5] G. Mil'nikov, N. Mori, Y. Kamakura, “Equivalent transport models in atomistic quantum wires,” Phys. Rev. B (2012).
[6] H. Mera et al., “Inelastic scattering in nanoscale devices: One-shot current-conserving lowest-order approximation,” Phys. Rev. B (2012).
[7] T. Linton, K. Foley, F. Heinz, R. Kotlyar, P. Matagne et al., “MDS—a new, highly extensible device simulator,” SISPAD (2007).
[8] M. Law and S. Cea, “Continuum based modeling of silicon integrated circuit processing: an object oriented approach,” Comp Mat Sci (1998).
[9] C. Auth, A. Aliyarukunjum, M. Asoro, D. Bergstrom, et al., “A 10nm high performance and low-power CMOS technology featuring 3rd generation FinFET transistors…,” IEDM Tech. Digest (2017).
