Microelectronics Journal 41 (2010) 789–800
Via wearout detection with on-chip monitors
Fahad Ahmed, Linda Milor
Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA 30332, USA
Article info
Article history:
Received 20 September 2009
Accepted 18 January 2010
Available online 11 February 2010
Keywords:
Electromigration
Built-in self-test
Reliability monitor
Abstract
This project aims to detect the onset of chip failure due to via voiding through monitoring the delays of
paths in a chip. The proposed method relates the probability of failure of individual vias to an increase
in delay for monitors of the system using data for 65 nm technology. The delay increase, as a function
of the failure distribution parameters, the path length, gate type, and process variation, has been
investigated. An on-chip, ring oscillator-based wearout monitoring circuit is presented. The proposed
scheme monitors the delay through a data path using a delay detection circuit (DDC).
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
Many applications, ranging from automatic flight control
systems, nuclear power control systems, on-line transaction
processors for financial institutions, to hospital patient monitors,
require systems to be extremely reliable. Fault-tolerant systems
require finding a way to prevent a physical defect or failure from
causing an error in system performance. Such systems incorporate fault detection, fault masking, fault isolation, and recovery
procedures.
Fault-tolerant design at the circuit level focuses primarily on
fault detection. The other aspects of fault tolerance (fault location,
system reconfiguration, and recovery) are generally performed
off-chip by the system architecture and software. The typical
approach to fault detection on-chip involves the introduction of
redundancy, in space, time, and information, at the cost of
extra computational time, extra hardware components, or both.
The cost of these approaches is a degradation in yield (and a
concomitant increase in manufacturing cost), together with
higher power dissipation, since all modules are operated in
parallel.
This paper aims to demonstrate an alternative approach,
involving detection of the onset of failure through detection of
component degradation over time. If component degradation can be
detected, the cost of building highly reliable systems would be
reduced. The additional area needed for fault detection is limited
to the design-for-test (DfT) circuitry, which is a small fraction of
the component being monitored. The cost of operating
such a system is similarly reduced, owing to the lower power
dissipation, since the DfT circuitry is only turned on intermittently during test.
Corresponding author: L. Milor. Tel.: +1 404 894 4793; fax: +1 404 898 0677.
doi:10.1016/j.mejo.2010.01.006
A wide variety of faults can cause system failures, ranging from
static to transient faults. This paper focuses on faults due to
component wearout. The major causes of product wearout are
electromigration, gate oxide breakdown, hot carrier injection,
negative bias temperature instability, backend dielectric breakdown, and stress migration [1].
This paper focuses on the detection of electromigration only.
The proposed method to detect electromigration is through
detection of increased delays in data paths. However, it should
be noted that many other failure mechanisms can impact the
delays of data paths. Hence, our results will only show that delay
increases provide a quantifiable indication of electromigration
in the absence of other failure mechanisms. Future work will look at
the use of additional monitors, in combination with the proposed
monitors, to attempt to determine the cause of failure.
This paper is organized as follows. In the next section, via
degradation is related to path delay. Section 3 considers the
sensitivity of this relationship to parameters describing the failure
rate distribution of vias, chain length, the composition of the data
path in terms of gate types, and process variation. Section 4
presents the proposed detection circuit. Section 5 considers the
problems caused by power supply noise, jitter, and temperature
variation and potential solutions to these problems. Section 6
compares this scheme with a conventional ring oscillator-based
monitoring scheme. Section 7 concludes the paper with a
summary.
2. Relating via degradation to path delay
2.1. Calculation of the via cumulative probability densities of failure
Stress-induced electromigration has been identified as a key
concern in interconnect reliability, due to continued reduction in
via dimensions and the resulting increase in current density with
each technology generation. This has adversely affected the
electromigration lifetime of interconnects, particularly vias and
contacts, due to the higher current densities through them [2,3].
Electromigration is caused by the transfer of momentum from
the moving ‘‘electron wind’’ to ions in the metallic lattice [4]. It
results in the transport of these metallic ions into the neighboring
material, causing an increase in resistance.
Two different voiding features have been identified at the via–
line interface [3]. Depending on the direction of flow of current,
void formation could occur on the via side of the line liner or on
the line side. Once big enough to be electrically detectable, void
formation translates into an increase in the delay through the data
path. Thus by constantly monitoring the delay of a path, it is
possible to detect the onset of failure of the data path.
Median time-to-failure (MTF) of a via is a strong function of
current density. Black’s equation [4] has been extensively used for
electromigration lifetime modeling. According to this model, the
MTF of a conducting path is inversely proportional to the nth power
of the current density, as follows:
$$\mathrm{MTF} = A J^{-n} \exp\left(\frac{\Phi}{K T}\right), \tag{1}$$
where J is the current density, Φ is the activation energy, T is
temperature, K is Boltzmann’s constant, and A is a constant. In this
study, the activation energy and the value of n were calculated
using experimental data from [5].
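As an illustration of how Eq. (1) scales with current density, the following minimal Python sketch evaluates Black's equation for two current densities. The function name `black_mtf` and all parameter values (A, n, the activation energy, and the temperature) are hypothetical placeholders chosen for illustration; they are not the values fitted from [5].

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant K in eV/K


def black_mtf(a, j, n, phi_ev, temp_k):
    """Median time-to-failure from Black's equation: A * J**(-n) * exp(Phi / (K*T))."""
    return a * j ** (-n) * math.exp(phi_ev / (BOLTZMANN_EV_PER_K * temp_k))


# Hypothetical values, for illustration only (not the fitted values of the study).
A_CONST = 1.0      # technology-dependent constant, arbitrary units
N_EXP = 2.0        # current-density exponent n (assumed)
PHI_EV = 0.9       # activation energy in eV (assumed)
TEMP_K = 378.15    # 105 degrees C expressed in kelvin

mtf_low_j = black_mtf(A_CONST, 1.0e6, N_EXP, PHI_EV, TEMP_K)
mtf_high_j = black_mtf(A_CONST, 2.0e6, N_EXP, PHI_EV, TEMP_K)

# Doubling J divides the MTF by 2**n, independent of A, Phi, and T.
ratio = mtf_low_j / mtf_high_j
print(f"MTF ratio when J doubles: {ratio:.2f}")
```

With n = 2 the lifetime drops by a factor of four when the current density doubles, which is why the vias carrying the largest currents dominate the failure statistics.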
As a generic example, the data path considered is made up of a
chain of inverters, and each node is assumed to contain a via, as
shown in Fig. 1.
Via dimensions are assumed to be typical via dimensions for
65 nm technology [6]. Therefore, the current density through each
via is only a function of the current through the node in which it is
located. The ring oscillator was simulated. Table 1 shows the
calculated MTFs for vias at different positions in the inverter.
It can be seen that the vias at the sources have the shortest expected
lifetime due to the larger current through them.
Table 1
MTF for vias of the inverter at the nodes indicated in Fig. 1.
Via       MTF (years)
PMOSS     5.2040
NMOSS     5.1721
PMOSG     18.8620
NMOSG     114.707
OUTVIA    11.6598
The Weibull cumulative distribution function (CDF) for
the time-to-failure is used to model the probability of failure,
P(tfail), of these vias as a function of time:
$$P(t_{fail}) = 1 - \exp\left[-\left(\frac{t_{fail}}{\eta}\right)^{\beta}\right]. \tag{2}$$
The Weibull distribution is described by two parameters: η
and β, which are the characteristic lifetime and shape parameter,
respectively. The median time-to-failure is
$$\mathrm{MTF} = \eta\,(\ln 2)^{1/\beta}. \tag{3}$$
Substituting (3) into (2) enables us to relate the MTF of a via to
its probability of failure. Fig. 2 shows the CDFs of the vias of an
inverter.
Fig. 2. CDF of vias at different nodes of an inverter for β = 1.2. The difference in the
failure rates is due to the difference in the current through each node.
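The substitution of the median time-to-failure into the Weibull CDF can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `weibull_cdf_from_mtf` is an assumption, while the MTF values are those of Table 1 and the shape parameter 1.2 is the one quoted for Fig. 2.

```python
import math


def weibull_cdf_from_mtf(t, mtf, beta):
    """Probability that a via with median lifetime `mtf` has failed by time `t`."""
    # Characteristic lifetime obtained by solving MTF = eta * (ln 2)**(1/beta) for eta.
    eta = mtf / math.log(2.0) ** (1.0 / beta)
    # Weibull CDF: 1 - exp(-(t/eta)**beta).
    return 1.0 - math.exp(-((t / eta) ** beta))


# MTF values in years from Table 1; beta as quoted for Fig. 2.
MTF_YEARS = {
    "PMOSS": 5.2040,
    "NMOSS": 5.1721,
    "PMOSG": 18.8620,
    "NMOSG": 114.707,
    "OUTVIA": 11.6598,
}
BETA = 1.2

for via, mtf in MTF_YEARS.items():
    p_five_years = weibull_cdf_from_mtf(5.0, mtf, BETA)
    print(f"{via:7s} P(fail by 5 years) = {p_five_years:.3f}")
```

By construction, the CDF evaluates to 0.5 at t = MTF for every via, which is a convenient sanity check on the substitution.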
2.2. Relationship between resistance and failure
The data path has been assumed to consist of an inverter chain
with 18 inverters. The initial via resistances, R0, for all vias in all of
the inverters, were assumed to be normally distributed.
Experimental results [3] show that under stressed conditions,
resistance increases very slowly, until a point where there is a
jump in the via resistance. After the jump, the resistance either
increases gradually or there is an immediate open circuit at the
via–line contact. This resistance jump was explained as the occurrence of a void at the contact. Once the void is formed, the
resistance increases faster than it did before the jump.
The resistance jump is critical, since it indicates the point
after which a via could break at any time. The resistance value
after the jump was taken as the point at which a via fails (RFail). Li
et al. [3] showed that the distribution of this value is a function
of the time-zero resistance and is approximately distributed
around a mean value of 1.2R0 for most via configurations, i.e.
$$R_{FailMean} = 1.2 R_0. \tag{4}$$
For each via, RFail is assumed to be normally distributed with
a mean of 1.2R0.
Fig. 1. The inverter chain consists of inverters, like the one shown, with a via at
every node.
Under normal operating conditions it is assumed that the via
resistance is modeled as a time-dependent exponential function:
$$R(t) = R_0\, e^{t/\Gamma}, \tag{5}$$
where Γ models the rate of increase of the effective resistance
as a function of time. The time to fail for each via (tfail) is
calculated from the Weibull curve. Values for RFail and tfail are
related by Eq. (5):
$$1.2 R_0 = R_0 \exp(t_{fail}/\Gamma). \tag{6}$$
Hence, we can calculate Γ for each via in the inverter chain:
$$\Gamma = t_{fail} / \ln(1.2) \tag{7}$$
and, with tfail obtained from Eqs. (2) and (3) for a failure probability P,
$$\Gamma = \frac{\mathrm{MTF}\,[-\ln(1-P)]^{1/\beta}}{\ln(1.2)\,(\ln 2)^{1/\beta}}. \tag{8}$$
Accounting for randomness involves assigning a probability,
P ∈ [0, 1], to each via based on a uniform distribution. This
probability, P, impacts Γ for each via, according to (8). Eq. (5) is
then used to model the time evolution of each via resistance in
the inverter chain. Fig. 3 shows the increase in resistances of the
90 vias in the inverter chain with respect to time for one inverter
chain instance. Because P is a random variable, each inverter chain
will have a different distribution of resistances vs. time.
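The sampling procedure described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: a uniform probability is drawn per via, the Weibull CDF is inverted to obtain the time to fail, and the growth constant follows from the 1.2R0 failure criterion. The function names, the random seed, and the clamping of P away from 0 and 1 are choices made for this sketch; resistances are normalized to R0, so the normally distributed initial resistance does not enter.

```python
import math
import random

BETA = 1.2
# Table 1 MTFs in years, one entry per via of a single inverter.
MTF_YEARS = [5.2040, 5.1721, 18.8620, 114.707, 11.6598]
N_INVERTERS = 18  # 18 inverters x 5 vias = 90 vias


def time_to_fail(p, mtf, beta):
    """Invert the Weibull CDF: time at which the failure probability equals p."""
    eta = mtf / math.log(2.0) ** (1.0 / beta)
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)


def growth_constant(p, mtf, beta):
    """Growth constant of the exponential model: t_fail / ln(1.2)."""
    return time_to_fail(p, mtf, beta) / math.log(1.2)


def normalized_resistance(t, growth):
    """R(t) / R0 for the exponential resistance model."""
    return math.exp(t / growth)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
gammas = []
for _ in range(N_INVERTERS):
    for mtf in MTF_YEARS:
        # Clamp to avoid log(0) and a zero growth constant at the interval ends.
        p = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
        gammas.append(growth_constant(p, mtf, BETA))


def count_failed(t):
    """Number of vias with R(t) >= 1.2*R0, i.e. with t >= growth * ln(1.2)."""
    return sum(1 for g in gammas if t >= g * math.log(1.2))


for year in range(6):
    print(f"t = {year} years: {count_failed(year)} of {len(gammas)} vias exceed 1.2*R0")
```

Changing the seed produces a different chain instance, which mirrors the statement that each inverter chain has its own distribution of resistances vs. time.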
Fig. 4. Number of vias exceeding 1.2R0 as a function of time. PFail is the fraction of
vias that have failed in the inverter chain, which varies from zero to one.
2.3. Relationship to inverter chain delay
The inverter chain has numerous vias exceeding RFail at
various times. Fig. 4 plots the number of vias exceeding 1.2R0 vs.
time.
An inverter chain with via resistances varying as shown in
Fig. 3 was simulated, to find the delay through the chain. The
result is shown in Fig. 5. Fig. 6 combines Figs. 4 and 5, by plotting
ΔDelay vs. the number of failing vias. We see that changes in delay
greater than 2 × 10^-14 s can clearly indicate the presence of an
electromigration problem that impacts the vias. This corresponds
to detecting faulty chains at the point when only one or two vias
have failed.
It can be noted from Fig. 5 that delay normally changes by 0.9%
over the course of 5 years. Some of the sources of variation in delay
not related to wearout include on-chip temperature variation, IR
drops, and crosstalk. Consider, for example, temperature, where
over an operating temperature range from 0 to 105 °C, delay of the
inverter chain varies by 150 ps. This swamps out degradation due
to wearout. Hence, it is important that the operating conditions
are reproduced exactly for each test of delay degradation. The
ability to reproduce operating conditions, and hence temperature
profiles, IR drops, and crosstalk, limits the resolution of the
method.
Fig. 5. Delay through the inverter chain. With increasing time, the via resistances
increase, causing an increase in delay.
Delay increases due to other mechanisms that impact
transistors, i.e. hot carrier injection and negative bias
temperature instability, are likely to be more significant than delay
increases due to electromigration in vias. Nevertheless, there
could be processes with electromigration weaknesses which need
to be monitored. Moreover, the delay increase trigger is an
indicator of a potential reliability failure of any kind. Hence, if we
have other screens to determine the likely cause of failure,¹ delay
increases can be linked to the number of failing vias in the chain
and the fraction of failing vias, PFail, in accordance with Fig. 6.
¹ Screens that can distinguish between electromigration and transistor
wearout mechanisms could be a set of heavily loaded overstressed ring oscillators,
some with large transistors driving single vias and some with the same transistors
driving multiple vias, so that electromigration is unlikely.
Fig. 3. Simulated values of resistance for all vias in an inverter chain plotted
against time.
Fig. 6. Relationship between the change in delay through the chain and the
probability of failure of the inverter chain.
2.4. Estimation of the number of failures for a full chip
Since a circuit can have billions of paths, it is not possible to
monitor all paths in a circuit. Let σ be the fraction of the total
number of vias in the circuit that are monitored. If a circuit has
100,000 vias and 100 are monitored, σ would be 0.001.
Let us suppose that a set of monitors provides us with data on
the delay increase for each monitor. The relationship between delay
increase and failing vias (Fig. 6) then determines
the expected number of failed vias in each chain, ni. These numbers
are summed together to provide a total number of failed vias in the
monitors, i.e. $f = \sum_{i=1}^{m} n_i$, where m is the number of monitors.
The total number of failed vias in the full chip is f/σ, provided that
the monitors are randomly placed throughout the chip. Hence,
if we detect only two fails based on the monitors, and σ = 0.001,
then there are likely to be 2000 via failures in the full chip.
However, it should be noted that the failure rate is an exponential function of temperature. Hence, one can get a stronger
signal relating to wearout if paths are selected in areas of the chip
with high activity and high temperature. In this case, monitors in
each sector, j, would be used to estimate the number of failures in
that sector, and the results would be summed to estimate the
number of failures for the full chip, i.e.
$$\sum_{j} f_j / \sigma_j. \tag{9}$$
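The scaling from monitored vias to the full chip can be illustrated with a short Python sketch. The function name `estimate_chip_failures` and the two-sector numbers are hypothetical; only the single-sector case (two fails at a monitored fraction of 0.001) comes from the text.

```python
def estimate_chip_failures(sectors):
    """Sum over sectors of (failed vias seen by the monitors) / (monitored fraction).

    `sectors` is a list of (per_monitor_failures, monitored_fraction) pairs,
    where per_monitor_failures lists the failed-via count n_i of each monitor.
    """
    total = 0.0
    for per_monitor_failures, monitored_fraction in sectors:
        failed_in_sector = sum(per_monitor_failures)  # f_j = sum of n_i
        total += failed_in_sector / monitored_fraction
    return total


# Example from the text: two fails detected with 0.1% of the vias monitored.
single_sector = estimate_chip_failures([([1, 1], 0.001)])
print(f"single sector estimate: {single_sector:.0f} failed vias")

# Hypothetical two-sector chip: a hot sector monitored more densely than a cool one.
two_sectors = estimate_chip_failures([
    ([2, 1, 0], 0.01),   # hot sector: 3 fails seen, 1% of its vias monitored
    ([0, 1], 0.001),     # cool sector: 1 fail seen, 0.1% of its vias monitored
])
print(f"two sector estimate: {two_sectors:.0f} failed vias")
```

Keeping a separate monitored fraction per sector lets monitors be concentrated in hot, high-activity regions without biasing the full-chip estimate.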
3. Sensitivity of delay changes
3.1. Sensitivity to the failure rate distribution function
Process variations can cause a shift in the parameters of the
Weibull distribution, which describes the failure rate distribution
for a specific chip. The Weibull distribution has two parameters,
η and β. In this section, we investigate the sensitivity of changes
in delay to changes in these parameters. We consider variations
due to β and MTF (which varies η).
To investigate the effect of variations in MTF on the number of
via failures and ΔDelay, the value of A in Eq. (1) is changed, which
in turn causes variation in the values of MTF. Table 2 shows the
MTFs for three different cases. Similarly, we have investigated the
effect of variations in β. Table 3 shows the MTFs for three different
cases, where variation in β is considered.
For all of the above cases, the corresponding Weibull curves
were computed using Eq. (2). Then the corresponding via
resistances as a function of time were computed. The number of
vias exceeding 1.2R0 for the three cases involving varying MTF
and β in Tables 2 and 3 was computed, together with the
corresponding increases in delay through the chain.
The change in delay as a function of the number of failing vias
is shown in Figs. 7 and 8, considering variation in MTF and β,
respectively. These figures indicate that delay increases are not
sensitive to the parameters describing the probability density
function of the via failure rate. Hence delay changes are a strong
indicator of the number of failed vias at any given time.
3.2. Sensitivity to chain length
The same procedure was repeated for the analysis of the effect
of inverter chain length (and hence data path length). Inverter
chains with 18, 28, and 38 inverters were considered. The result in
Fig. 9 indicates that a change in delay is an indicator of the
number of failing vias, independent of via chain length.
Table 2
MTFs for vias of an inverter for three different values of A (in years).
          PMOSS    NMOSS    PMOSG    NMOSG     OUT-VIA
Case 1    5.20     5.17     18.86    114.70    11.64
Case 2    3.08     3.06     67.92    101.80    11.16
Case 3    13.12    13.04    47.58    289.40    29.39
Table 3
MTFs for vias of an inverter for three different values of β (in years).
           PMOSS    NMOSS    PMOSG    NMOSG    OUTVIA
β = 1.2    5.20     5.17     18.86    114.7    11.64
β = 1.4    1.08     1.08     3.27     15.4     2.16
β = 1.6    0.48     0.48     1.35     5.72     0.91
Fig. 7. Change in delay vs. PFail for the cases with varying MTF.
Fig. 8. Change in delay vs. PFail for the cases with varying β.
Fig. 9. Change in delay plotted vs. the fraction of vias whose resistance has
exceeded 1.2R0 for chains of different lengths.
Fig. 10. The data path under test consisted of alternating NAND–NOR gates with
vias at the nodes shown.
Fig. 11. A comparison between the delay sensitivity function for the NAND/NOR
chain with 18 stages and inverter chains with 28 and 38 stages.
3.3. Sensitivity to gate type
Data paths consist of gates other than inverters. Hence, it is
important to study how the gate type composition affects the
results. A data path consisting of alternating NAND–NOR gates
was considered with the via configuration shown in Fig. 10.
For the NAND gates, the nodes with DPGA and DNGA are
connected to the supply rail, while the nodes DPGB and DNGB
switch. Similarly, for the NOR gates, the nodes with RPGA and
RNGA are the switching nodes, while the other inputs are
connected to ground. The current through all nodes was used to
find the failure rate distributions for each of the vias.
As with the inverter chain, the delay sensitivity function vs. the
number of vias exceeding 1.2R0 is insensitive to variations in
Weibull parameters.
The NAND/NOR chain has 1.6 times as many vias as an inverter
chain of the same length. Hence, its number of vias is slightly greater than that of an
inverter chain of length 28 and less than that of an inverter chain of
length 38. Fig. 11 compares the delay sensitivity curves for the
NAND/NOR chain with inverter chains of length 28 and 38. The
increased intrinsic capacitances at the output nodes of NAND/
NOR gates lead to greater delay increase values for the same
number of failing vias. Hence the relationship between ΔDelay and the fraction of failed
vias established so far for similar data paths does not hold
when we vary the gate type, unless we take into account gate
node capacitances.
To better understand the impact of gate type on the delay
sensitivity relationship, consider the Elmore delay model
$$\mathrm{Delay} = \tau \sum_{n=1}^{k} C_n R_n, \tag{10}$$
where Cn is the node capacitance at each node, Rn is a combination
of the driver’s resistance and node resistance, k is the data path
length, and τ is a constant. In the presence of variation in via
resistance, only Rn changes. Let δRn be the degradation in node
resistance due to via degradation. ΔDelay can then be approximated as follows:
$$\Delta\mathrm{Delay} = \tau \sum_{n=1}^{k} C_n\, \delta R_n. \tag{11}$$
The intrinsic node capacitance Cn is a function of both gate
type and transistor size. Cn mainly consists of the overlap and
junction capacitances, and both are assumed to increase linearly
with transistor size. Let k0 be the number of nodes in an inverter
chain, with capacitance Cl, l = 1,…,k0. Let k be the number of nodes
in a path containing arbitrary gates. At each of these nodes,
n = 1,…,k, jn is the number of nodes in the charge and/or discharge
path, which depends on the gate stack size in the pull-up and/or
pull-down network, and hence the gate type. Let Cn,m, n = 1,…,k
and m = 1,…,jn, be the node capacitances associated with the mth
node in the pull-up and/or pull-down network of the nth gate in
the path. The capacitance ratio to the inverter chain is the
following:
$$\gamma = \frac{\sum_{n=1}^{k} \left( \sum_{m=1}^{j_n} C_{n,m} \right)}{\sum_{l=1}^{k_0} C_l}. \tag{12}$$
Therefore, the relationship between the delay sensitivity
function for an arbitrary gate, ΔDelay, and the delay sensitivity
function for a reference inverter chain, ΔDelayinv, is
$$\Delta\mathrm{Delay} = \gamma\, \Delta\mathrm{Delay}_{inv}. \tag{13}$$
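To make the normalization concrete, the following Python sketch computes an Elmore-style delay increase, a capacitance ratio, and the normalized delay increase for a hypothetical path. All capacitance and resistance values, the constant of proportionality, and the function names are assumptions made for illustration; they are not extracted from the simulations in this paper.

```python
def delta_delay(tau, node_caps, node_delta_rs):
    """Elmore-style delay increase: tau * sum over nodes of C_n * delta R_n."""
    return tau * sum(c * dr for c, dr in zip(node_caps, node_delta_rs))


def capacitance_ratio(path_caps, inverter_caps):
    """Total node capacitance of the path divided by that of the reference inverter chain.

    `path_caps` holds one list per gate with the capacitances C_{n,m} of the
    nodes in its charge/discharge path; `inverter_caps` holds the C_l values.
    """
    return sum(sum(gate_nodes) for gate_nodes in path_caps) / sum(inverter_caps)


def normalized_delta_delay(path_delta_delay, ratio):
    """Divide a measured delay increase by the ratio to compare with the inverter curve."""
    return path_delta_delay / ratio


# Hypothetical 18-stage chains, capacitances in units of one inverter node.
inverter_caps = [1.0] * 18
path_caps = [[1.0, 2.5]] * 18   # assumed: two nodes per gate, 3.5 units in total

ratio = capacitance_ratio(path_caps, inverter_caps)

# Assumed resistance degradation: one via per chain has gained 0.2 resistance units.
delta_rs = [0.2] + [0.0] * 17
inv_increase = delta_delay(1.0, inverter_caps, delta_rs)
path_increase = delta_delay(1.0, [sum(g) for g in path_caps], delta_rs)

print(f"capacitance ratio: {ratio:.3f}")
print(f"normalized path increase: {normalized_delta_delay(path_increase, ratio):.3f}")
print(f"inverter chain increase:  {inv_increase:.3f}")
```

In this idealized case the normalized increase of the path coincides with the inverter chain increase, which is the property the normalization relies on.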
Table 4 contains a variety of capacitance ratios for the gates
considered. The differences in the capacitance ratios are partially
due to gate sizing.
In Fig. 12 we plot ΔDelay/γ for a variety of gates. From this
figure it can be concluded that for an arbitrary data path, given γ
and the delay sensitivity function for the inverter chain, ΔDelayinv,
it is possible to estimate the number of failing vias for arbitrary
paths, irrespective of the type of gates that make up the data path.
Table 4
Capacitance ratios for a variety of gates.
Gate/chain                                                 γ
NOT                                                        1
NAND                                                       3.5
NOR                                                        2.75
NAND–NOR (9 NAND and 9 NOR)                                3.125
NAND–NOR-NOT (28 NOT, 9 NAND, 9 NOR, randomly placed)      1.23
Fig. 12. Increase in the normalized delay of the data path with varying types of
gates and length plotted against PFail.
3.4. Sensitivity to process variations
Two major sources of process variations are the threshold
voltage [7] and channel length [8–10]. Variation in threshold
voltage is due to statistical fluctuation in the number of dopant
atoms per unit volume in the device channel. Channel length
variation is due to line edge roughness, which is induced by the
polymer characteristics of photoresist, and systematic variation in
lithography (proximity effect, lens aberrations, and flare) [11].
Our analysis considers random variation in threshold voltage
and channel length. Figs. 13 and 14 show that there is very little
impact of random process variation on the delay sensitivity
curves. The impact is more pronounced in the presence of channel
length variation. Hence, it may be desirable to include capacitance
compensation for channel length variations, as suggested in the
previous section. This is because channel length variations can be
large, and they impact both transistor drive strength and load
capacitances.
Fig. 13. Change in delay plotted vs. the number of failing vias and PFail for a
random sample of threshold voltages. For 65 nm technology, the standard
deviation was assumed to be 30 mV.
Fig. 14. Change in delay plotted vs. PFail for a random sample of channel lengths.
For 65 nm technology, the standard deviation was assumed to be 20 nm.
4. DPRO-based delay detection
In this section, we present the design of an on-chip delay
monitoring system to detect changes in the delay of a data path
being monitored.
4.1. System design and operation
The proposed scheme is illustrated in Fig. 15. Data paths in a
chip can be turned into oscillators, whose frequency provides
a measurement of delay. A path is either an inverting data path,
which starts oscillating if connected in feedback, or a non-inverting path, which requires an additional inverter to oscillate.
A set of data paths can be converted to ring oscillators by inserting
muxes in the paths, together with an additional feedback inverter
for the non-inverting paths. The additional inverter in non-inverting paths, together with the muxes, adds delay, but since we
are only interested in measuring delay degradation, determining
the exact delay of a path is not important.
During normal operation the path is connected to the regular
input and output data lines. In test mode (TM), the data path is
disconnected from its data in/out and is connected in a feedback
loop, which may or may not contain an inverter, to form a data
path ring oscillator (DPRO), shown in Fig. 15. The frequency of
oscillation is used to monitor path degradation.
This implementation poses a few problems when we take into
account the fact that the data paths are often a part of a pipelined
stage with very critical delay margins. A single path cannot go
into TM without disturbing the whole pipeline. Similarly, the
performance overhead introduced by the data selection blocks
can also be critical and may cause failures if present in critical
data paths.
To address these issues, consider the self-test scheme shown in
Fig. 16. The figure shows a normal pipeline stage with the path
under test (PUT) highlighted. During TM, clock gating halts the
pipeline, ensuring that no erroneous data is fed out during TM.
Two specially designed test registers (T-R) are inserted in place of
conventional registers (Reg) for the monitored data path. During
TM, the test registers disconnect the PUT from the preceding and
subsequent data paths and connect the feedback path.
To minimize physical overhead, another selection block is
introduced to choose among the different monitored paths. The
target PUT is connected by the selection block to the DDC for
delay monitoring.
The T-Rs were designed to minimize the performance overhead during the normal operation of the data path. Since the
register at the input of the data path has a slightly different
function from the register at the output of the data path during
TM, two separate registers were designed, as shown in Fig. 17.
During normal operation, the T-R selection blocks operate as
conventional inverters, so that each T-R behaves as a standard master–slave flip-flop.
The operation of the T-Rs during TM is shown in Fig. 18. As the
PUT enters TM, the pipeline clock signal (CL) is grounded, halting
all operations in the pipeline. The master latch is disconnected
from the slave latch, and the T-gates at the input of the T-Rs are
turned on. The slave latch of the input T-R and the master latch of
the output T-R have data selection blocks which disconnect the
latch configuration and connect the inputs to output lines A and B.
The data path is thus connected in the DPRO configuration and
starts oscillating.
In order to monitor numerous paths, another selection block is
included to connect the selected PUT to the DDC. Hence only
a single DDC block is sufficient for the entire system being
monitored, as illustrated in Fig. 16.
4.2. Initiating oscillations in complex data paths
In general, a data path may not have a simple inverting or
non-inverting relationship between one input and one output.
Consider a multiple input (X1, X2, X3,…,Xn) and multiple output
(f1, f2, f3,…,fn) data path. Let us assume that one of the output functions of this data path (fx) is to be monitored for wearout.
The output fx might be a function of multiple inputs, and therefore
could be ‘1’ or ‘0’ depending on the values at the inputs. Let
S1 be a set of input values that give a logic ‘1’ at output fx and let S0
be a set of input values that give a logic ‘0’, i.e., fx{S1} = 1 and fx{S0} = 0,
where S0 and S1 are assignments of values to the inputs {X1, X2,…,Xn}.
If we eliminate the inputs that have the same value in both the
sets S1 and S0, we are left with the inputs (XR) that toggle the
function fx and that can make the data path oscillate.
Fig. 15. Data path ring oscillator (DPRO) operating in Test Mode.
Fig. 16. The DDC-based self-test scheme.
Fig. 17. Test registers designed for minimum performance overhead.
Fig. 18. T-R and data path configuration of the DPRO during TM.
Consider the logic function shown in Fig. 19. For this function, S1 = {X1 = 1, X2 = 1, X3 = 0} and S0 can consist of all other
input combinations. Let us select S0 = {X1 = 0, X2 = 0, X3 = 0}. Then in
order to create oscillations, we need to toggle X1 and X2.
Moreover, since X1 = X2 = 1 results in fx = 1, and X1 = X2 = 0 results in
fx = 0, fx must be inverted before it is fed back to the inputs X1
and X2.
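The selection of toggling inputs and feedback polarity in this example can be sketched in Python. Since Fig. 19 is not reproduced here, the sketch assumes that the logic function is fx = X1 AND X2 AND (NOT X3), which is consistent with the stated S1 but is an assumption; the variable names are likewise illustrative.

```python
def fx(x1, x2, x3):
    # Assumed form of the logic function: 1 only for X1 = 1, X2 = 1, X3 = 0.
    return int(bool(x1) and bool(x2) and not bool(x3))


S1 = {"X1": 1, "X2": 1, "X3": 0}  # input set giving fx = 1
S0 = {"X1": 0, "X2": 0, "X3": 0}  # chosen input set giving fx = 0

# Inputs with the same value in S1 and S0 are held constant; the rest toggle.
toggling = [name for name in S1 if S1[name] != S0[name]]

# When fx = 1 the next input set must be S0. An input whose S0 value is the
# complement of that output needs inverted feedback.
needs_inversion = {name: S0[name] != 1 for name in toggling}

# Emulate the feedback loop for a few half-periods, starting from S1.
state = dict(S1)
outputs = []
for _ in range(6):
    out = fx(state["X1"], state["X2"], state["X3"])
    outputs.append(out)
    for name in toggling:
        state[name] = (1 - out) if needs_inversion[name] else out

print("toggling inputs:", toggling)
print("inverted feedback:", needs_inversion)
print("output sequence:", outputs)
```

The output alternates between 1 and 0 on every pass through the loop, which is the oscillation condition the function inverting block has to guarantee.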
Consider a complex data path in a pipelined stage, shown in
Fig. 20. Of all the outputs, suppose we want to monitor one as a
selected wearout monitor. In TM, the T-R register at the output of
fx feeds the selected data path into the function inverting block.
The function inverting block senses the value of the output fx
and generates a set of inputs to be inverted or non-inverted and
fed back into the register file at the input of the data path. For
these inputs, the regular registers are again replaced by the T-R
registers which feedback the output of the function inverting
block into the data path, causing the value at the output fx to
toggle.
Fig. 21 shows one possible implementation of the function
inverting block. Depending on the value of fx, the selection block
chooses the appropriate values independently of other selection
blocks for inverting fx.
It should be noted that this scheme of initiating oscillations in
complex data paths leads to insignificant performance overhead
due to the insertion of the specially designed T-R registers.
Fig. 21. The function inverting block.
4.3. DDC design
The oscillations of the DPRO output are fed into the DDC.
Fig. 22 shows the block level implementation of the DDC. In the
DDC, a counter counts the number of oscillations of the DPRO. The
other two blocks in the DDC generate the test start and stop
signal, PT. PT generation is divided into two parts. The VCO
generates pulses of a predetermined frequency. Since the output
frequency of the VCO and the VCO size are inversely proportional,
an optimal VCO frequency was chosen, while keeping in mind the
required sensitivity and device overhead. The VCO feeds into a
frequency divider, which generates a waveform with a period of
2PT, by dividing the frequency of the VCO output.
Fig. 19. Logic function.
Fig. 20. A multiple input, multiple output complex data path in a pipelined stage.
Fig. 22. The DDC block connected to the DPRO in TM.
As the PUT enters TM, the DPRO and the VCO are enabled
simultaneously. The counter is enabled on the rising edge of the
signal from the frequency divider and starts counting the number
of oscillations of the DPRO. The falling edge of PT disconnects the
counter from the DPRO, hence halting the counter.
To get an accurate measurement, the duration of oscillation
(PT) must be large, to avoid false count triggers due to variations
caused by power supply noise and jitter.
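The dependence of the count resolution on PT can be sketched in Python. The sketch assumes the standard ring oscillator relation in which the oscillation period is twice the loop delay, and models the count as the number of full DPRO periods within PT; the function names and the numerical values (a loop delay near the inverter chain delay of Fig. 5 and a delay increase of 2 × 10^-14 s) are used only for illustration.

```python
import math


def dpro_frequency(loop_delay_s):
    """Ring oscillator frequency, assuming a period of twice the loop delay."""
    return 1.0 / (2.0 * loop_delay_s)


def ddc_count(loop_delay_s, pt_s):
    """Number of full DPRO oscillations counted during a window of length PT."""
    return math.floor(pt_s * dpro_frequency(loop_delay_s))


def min_pt_for_one_count(loop_delay_s, delta_delay_s):
    """Window length at which a delay increase changes the expected count by one."""
    f_fresh = dpro_frequency(loop_delay_s)
    f_aged = dpro_frequency(loop_delay_s + delta_delay_s)
    return 1.0 / (f_fresh - f_aged)


LOOP_DELAY = 2.39e-10   # seconds, illustrative loop delay
DELTA_DELAY = 2.0e-14   # seconds, delay increase to be resolved

pt_min = min_pt_for_one_count(LOOP_DELAY, DELTA_DELAY)
pt = 2.0 * pt_min       # margin against noise and jitter (assumed factor)

fresh = ddc_count(LOOP_DELAY, pt)
aged = ddc_count(LOOP_DELAY + DELTA_DELAY, pt)
print(f"minimum PT: {pt_min:.3e} s, counts at 2x margin: {fresh} -> {aged}")
```

Because the required window scales inversely with the frequency shift, resolving smaller delay increases calls for a proportionally longer PT and therefore more counter bits.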
The VCO makes the DDC more robust towards device
degradation. As the DDC only becomes active intermittently, it
is far more likely that the data paths fail before any significant
degradation in the DDC itself. However, slight degradation in the
DDC over a long period of time would bring about a change in PT.
This would unavoidably lead to less accurate count reads.
Specifically, small increases in delay would not be detected
because of the increase in PT. However, the VCO can be re-tuned to
minimize any errors due to DDC degradation.
4.4. PT generation
The VCO is formed by a ring oscillator consisting of current
starved inverters, as shown in Fig. 23. VT controls the oscillation
frequency of the VCO.
VCOs based on current starved inverters can have the problem
of slow rise and fall times at low values of VT. They suffer from
glitches at the start of operation. For better PT generation, the VCO
was terminated using a Schmitt trigger.
A pseudo-static counter was also used as a frequency divider.
A K-bit counter can effectively act as a divide-by-$2^K$ frequency divider.
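The division ratio of a K-bit binary counter can be illustrated as follows. This is a minimal sketch assuming an ideal binary counter whose most significant bit is used as the divided output; the helper names and the 100 MHz VCO frequency in the example are hypothetical and are not values from the design.

```python
def divided_frequency_hz(f_in_hz: float, k_bits: int) -> float:
    """Output frequency at the MSB of an ideal K-bit binary counter."""
    if k_bits < 0:
        raise ValueError("k_bits must be non-negative")
    return f_in_hz / (2 ** k_bits)


def bits_for_window(f_vco_hz: float, pt_s: float) -> int:
    """Smallest K for which the divided waveform has a period of at least 2*PT.

    A waveform with period 2*PT is high for PT and low for PT, which is
    the shape of the counting window described in the text.
    """
    k = 0
    while (2 ** k) / f_vco_hz < 2.0 * pt_s:
        k += 1
    return k


# Hypothetical example: a 100 MHz VCO and a 50 us test window.
print(bits_for_window(100e6, 50e-6))
```

A slower VCO needs fewer divider bits for the same window, which is one way to view the trade-off between VCO frequency and device overhead mentioned above.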
Fig. 24. A single bit of the counter used both as a frequency divider and a
frequency monitor.
4.5. Counter design and operation
The counter was designed to be both a frequency divider and a
frequency monitor for the DPRO. These blocks of the DDC have
slightly different requirements.
The accuracy of this scheme is directly dependent on PT, which
has to be large to detect very small changes in delay. This requires
the counter bits to hold logical values for long periods of time. The
DPRO frequency monitor, on the other hand, has to be fast to
measure high data path frequencies for shorter data paths. Finally,
the outputs of the counter have to hold stable ‘0’s in idle mode to
avoid counter output variations due to incorrect counter initialization.
Fig. 24 shows the design of a single bit of the counter. As seen
from the figure, a pseudo-static configuration was chosen to meet
the requirements of both the frequency divider and the frequency
monitor.
When not in use, the input of the counter is grounded. The bit
enable signal ENc is also kept low, which forces the output to a ‘0’.
As the DDC enters TM, ENc is forced high and the input terminal
IN is made to oscillate by the VCO or the DPRO. During the high
time of the input pulse, the first pass transistor connects the two
dynamic inverters together enabling the initial state X1, fed back
from the output, to propagate to the input of the second pass
transistor. During the low time the output of the first dynamic
inverter goes to the high Z state, and the two dynamic inverters
are disconnected, hence negating the effect of any logic changes
due to leakage in the first dynamic inverter. The previous state of
the bit is toggled and is passed to the output terminal. Hence, the
output toggles for every high-to-low transition of the input,
enabling the block to act as a counter.
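The toggle-on-falling-edge behavior can be modeled at the logic level. The sketch below ignores the pass transistors and dynamic inverters and keeps only the observable behavior: the output is forced to 0 while the enable is low and toggles on every high-to-low transition of the input. Cascading the bits so that each output drives the next input (a ripple arrangement) is an assumption made here for illustration; the class and function names are hypothetical.

```python
class ToggleBit:
    """Logic-level model of one counter bit (behavior only, no transistors)."""

    def __init__(self) -> None:
        self.out = 0
        self._prev_in = 0

    def step(self, enable: int, inp: int) -> int:
        if not enable:
            self.out = 0                      # enable low forces the output to 0
        elif self._prev_in == 1 and inp == 0:
            self.out ^= 1                     # toggle on a high-to-low transition
        self._prev_in = inp
        return self.out


def ripple_count(n_bits: int, n_pulses: int) -> int:
    """Apply n_pulses input pulses to a chain of toggle bits and read the count."""
    bits = [ToggleBit() for _ in range(n_bits)]
    for _ in range(n_pulses):
        for level in (1, 0):                  # one pulse: high phase, then low phase
            signal = level
            for bit in bits:
                signal = bit.step(1, signal)  # each output drives the next bit
    return sum(bit.out << i for i, bit in enumerate(bits))


print(ripple_count(4, 5), ripple_count(4, 16))
```

With four bits, sixteen pulses bring the modeled count back to zero, which is the overflow behavior discussed in the text.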
Fig. 25. Counter output plotted against time.
Note that since we are only interested in the change in the
counter output, the size of the counter is not important, as long as
it is large enough to cover the expected variation in the count
from cycle-to-cycle. At every overflow, the counter resets, thus
having no effect on the final results.
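One way to read the statement about overflow is that only the difference between two counts matters, and that difference can be recovered modulo the counter size. The sketch below assumes a K-bit counter that wraps around and a true change smaller in magnitude than half the counter range; the function name `count_change` is hypothetical.

```python
def count_change(reference: int, current: int, k_bits: int) -> int:
    """Signed change (current - reference) for a K-bit counter that wraps.

    Valid only when the true change is smaller in magnitude than 2**(K-1),
    i.e. the counter is large enough to cover the expected variation.
    """
    modulus = 1 << k_bits
    diff = (current - reference) % modulus
    if diff >= modulus // 2:
        diff -= modulus                 # interpret large values as negative changes
    return diff


# Hypothetical example: an 8-bit counter that overflowed many times.
# The full counts 50000 and 49999 are never stored, only their low 8 bits.
print(count_change(50000 % 256, 49999 % 256, 8))
```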
4.6. Example
To test the effectiveness of this scheme, the impact of
electromigration on a data path consisting of 18 inverters was
simulated. In Fig. 25, the DPRO counter output is plotted for the
detection of via degradation.
5. Power supply noise, jitter, and temperature variation
tolerance
Fig. 23. A ring oscillator consisting of current starved inverters was used as a VCO.
Let us suppose that the number of counts for a good circuit is n.
Then, ignoring the effects of power supply noise and jitter, a faulty
circuit needs at least one more count. If we suppose that the
frequency of operation of the DPRO is $f_{\mathrm{DPRO}}$, then $P_T = n/f_{\mathrm{DPRO}}$. The
test time needed to detect a delay change of $\Delta\mathrm{Delay}$ is

$$P_T = \frac{1}{f_{\mathrm{DPRO}}^{2}\,\Delta\mathrm{Delay}}.$$

Consider, for example, a DPRO that oscillates at 1 GHz. Then,
to detect a change in delay of $\Delta\mathrm{Delay}$ = 0.02 ps, n = 50 000, and
PT = 50 µs.
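The numerical example can be checked with a few lines of code. The sketch below evaluates PT = 1/(fDPRO² · ΔDelay) and n = fDPRO · PT. The comment outlines one plausible derivation, which is an interpretation rather than a step stated in the text, and the function names are hypothetical.

```python
import math


def required_test_time_s(f_dpro_hz: float, delta_delay_s: float) -> float:
    """Test time PT = 1 / (f_DPRO**2 * delta_delay).

    One plausible derivation (an assumption): a period increase of
    delta_delay changes the count over PT by about PT * f**2 * delta_delay
    when f * delta_delay << 1; setting that change to one count gives PT.
    """
    if f_dpro_hz <= 0 or delta_delay_s <= 0:
        raise ValueError("frequency and delay change must be positive")
    return 1.0 / (f_dpro_hz ** 2 * delta_delay_s)


def nominal_count(f_dpro_hz: float, pt_s: float) -> float:
    """Number of counts n = f_DPRO * PT for a good circuit."""
    return f_dpro_hz * pt_s


pt = required_test_time_s(1e9, 0.02e-12)   # 1 GHz DPRO, 0.02 ps delay change
n = nominal_count(1e9, pt)
print(f"PT = {pt * 1e6:.1f} us, n = {n:.0f}")
assert math.isclose(n, 50_000)
```

For the 1 GHz, 0.02 ps case this gives a test time of about 50 µs and a nominal count of about 50 000. Because of the squared frequency term, slower data paths need quadratically longer windows for the same delay resolution.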
However, this scheme has to be tolerant to power supply noise
and jitter, since they can cause false triggers that indicate the end
of life of a chip still in working condition. Tolerating power supply
noise and jitter increases the required value of PT.
5.1. Power supply noise
To investigate the effect of power supply noise, a zero mean
Gaussian noise source was added to the power supply, with a
standard deviation of 10 mV. A varying power supply produces
a varying DPRO oscillation frequency. The final count output
depends on the mean oscillation frequency of the DPRO
($f_{\text{DPRO-Mean}}$), i.e. Count = $f_{\text{DPRO-Mean}} \cdot P_T$. A larger PT gives the
frequency more time to settle down to a steady mean value. Let
$f^{0}_{\text{DPRO}}$ and $f_{\text{DPRO-Mean}}$ be the oscillation frequencies in the absence
and presence of noise, respectively. Given a difference,
$\Delta f_{err} = f^{0}_{\text{DPRO}} - f_{\text{DPRO-Mean}}$, the degradation in frequency of
oscillation of the DPRO due to wearout ($\Delta f_W$) has to be greater
than $\Delta f_{err}$ to be detectable.
Simulations for the detection of via degradation were repeated
with Gaussian noise added to the power supply. The impact of the
noisy environment on the final count output is shown in Fig. 26.
Although the trend is maintained, it is clear that even a small
amount of noise in the power supply can have a large impact.
Fig. 26. Count output with Gaussian noise source added to the power supply
(counts vs. time in years; curves: noise free, 5 mV, 10 mV).
5.2. Jitter
Similarly, jitter in PT is another potential problem. Jitter may be
due to power supply noise and ground bounce, but may also be
due to coupling capacitance and local temperature variation. Jitter
results in random errors at the VCO and DPRO outputs, because of
randomness in the switching times of the waveforms. This results
in variation in the count. This can create an error if the count
increases by more than one. The case when the DPRO and VCO
oscillate at 1 GHz, with jitter on the signals of the DPRO and VCO,
is illustrated in Fig. 27.
A solution to this problem of potential false triggers due to
jitter is to average multiple delay measurements with the counter.
During PT, the counter runs and samples an output from the DPRO.
During the high-to-low transition, the counter output is shifted
into a register and the counter is reset. After N cycles we would
have N samples of the DPRO, which are then averaged. The
selection of N depends on the level of jitter on the signals.
Fig. 27. Test accuracy vs. jitter (probability of error vs. RMS jitter on the
DPRO output and RMS jitter on the VCO output, both in ps).
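The benefit of averaging N count samples can be illustrated with a simple statistical model. The sketch below is not a circuit simulation: it assumes each measured count equals a nominal value plus independent zero-mean Gaussian noise, rounded to an integer. The noise level of two counts, the nominal count, and all function names are hypothetical choices.

```python
import random
import statistics


def measure_count(rng: random.Random, nominal: float, sigma_counts: float) -> int:
    """One noisy count reading (assumed Gaussian count error, then rounding)."""
    return round(nominal + rng.gauss(0.0, sigma_counts))


def averaged_count(rng: random.Random, nominal: float,
                   sigma_counts: float, n_samples: int) -> float:
    """Average of n_samples successive count readings."""
    readings = [measure_count(rng, nominal, sigma_counts) for _ in range(n_samples)]
    return sum(readings) / n_samples


def spread_of_average(n_samples: int, trials: int = 500, seed: int = 0) -> float:
    """Standard deviation of the averaged count over many repeated tests."""
    rng = random.Random(seed)            # fixed seed keeps the sketch repeatable
    means = [averaged_count(rng, 50_000, 2.0, n_samples) for _ in range(trials)]
    return statistics.pstdev(means)


for n in (1, 4, 16, 64):
    print(n, round(spread_of_average(n), 3))
```

Under the independence assumption, the spread of the averaged count shrinks roughly as one over the square root of N, so a higher jitter level calls for a larger N to resolve a one-count change.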
5.3. Temperature
Finally, it should be noted that delays are very sensitive to
temperature. For example, over the operating temperature range
from 0 to 150 °C, the delay of the inverter chain varies by 150 ps. This
can overwhelm any delay change due to wearout. To overcome
this problem, the operating conditions must be reproduced for
each test of delay degradation.
6. Comparison with conventional ring oscillator-based
monitors
System aging is usually monitored by ring oscillators scattered
throughout the chip. These monitors oscillate throughout the
lifetime of the chip, and their frequency degradation indicates the
total device degradation. These monitors operate on the assumption that the amount of degradation in the ring oscillator is a good
measure of the degradation of the operational parts of the chip.
One of the major causes of inaccuracy in this approach is the
spatial and temporal variation in the operating environment of a
chip. Since most wearout mechanisms are exponentially dependent on temperature, it is very unlikely that isolated ring oscillators
can accurately predict the complex degradation profile of the chip.
As an example, Fig. 28 shows the variation in the MTF of a via in a
data path undergoing degradation due to electromigration.
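The strength of the temperature dependence can be illustrated with Black's equation, a commonly used empirical electromigration model in which the median time to failure scales as exp(Ea/kT) at fixed current density. The sketch below is only illustrative: it does not reproduce the model or parameters behind Fig. 28, the activation energy of 0.9 eV is a placeholder assumption, and the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mtf_ratio(temp_k: float, ref_temp_k: float,
              activation_energy_ev: float = 0.9) -> float:
    """MTF(temp) / MTF(ref_temp) from the Arrhenius term of Black's equation.

    The current density term cancels when both temperatures are evaluated
    at the same current density. The default activation energy is a
    placeholder, not a value taken from the paper.
    """
    if temp_k <= 0 or ref_temp_k <= 0:
        raise ValueError("temperatures must be positive (kelvin)")
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K
    return math.exp(exponent * (1.0 / temp_k - 1.0 / ref_temp_k))


for t in (280, 300, 320, 340, 360, 380):
    print(t, f"{mtf_ratio(t, 300.0):.3g}")
```

Even a modest temperature difference between a monitor and the data path it is meant to represent changes the predicted lifetime by a large factor in this model.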
Another issue is the assumption of a high correlation between
the switching activity of nodes of a ring oscillator and those in
complex data paths. Consider the sum generation part of a full
adder, shown in Fig. 29, as a simple example.
‘A’ and ‘B’ are the two inputs to be added and ‘S’ is the output
sum. Let us assume we want to study the wearout at node ‘S’ of
the full adder. The amount of wearout due to electromigration, for
example, is a function of current density through a via, which in
turn is a function of the switching activity. Fig. 30 plots the
switching activity of the node ‘S’ assuming no incoming carry and
uncorrelated inputs. Compared to this, a ring oscillator oscillates
independently of the clock frequency. Typically these oscillating
frequencies are very high and a node in a ring oscillator may go
through numerous charges and discharges in a single clock cycle.
This means that ring oscillators have switching activities that are
far greater than one, and monitors based on ring oscillators could
provide pessimistic estimates of chip aging.
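The dependence of the switching activity at node 'S' on the input signal probabilities can be sketched as follows. The code assumes no incoming carry, so S = A XOR B, statistically independent inputs, and independence between successive cycles, so that the activity is taken as P(S = 0) · P(S = 1). This definition is a standard textbook assumption and may differ in detail from the one used for Fig. 30; the function names are hypothetical.

```python
import math


def sum_probability(p_a: float, p_b: float) -> float:
    """P(S = 1) for S = A XOR B with independent inputs."""
    return p_a * (1.0 - p_b) + p_b * (1.0 - p_a)


def switching_activity(p_a: float, p_b: float) -> float:
    """Probability of a 0 -> 1 transition at S between two independent cycles."""
    p_s = sum_probability(p_a, p_b)
    return p_s * (1.0 - p_s)


# Sweep the input signal probabilities on a coarse grid.
for p_a in (0.0, 0.25, 0.5, 0.75, 1.0):
    row = [switching_activity(p_a, p_b) for p_b in (0.0, 0.25, 0.5, 0.75, 1.0)]
    print(p_a, [round(x, 3) for x in row])

assert math.isclose(switching_activity(0.5, 0.5), 0.25)
```

Under these assumptions the activity never exceeds 0.25 per clock cycle, far below that of a free-running ring oscillator node.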
As an example let us assume that a ring oscillator wearout
monitor is designed whose frequency of oscillation is almost
equal to the clock frequency so that the nodes of the monitor have
switching activities close to one. The median time to failure (MTF)
for a via due to electromigration at such a node is around
12 years. Compared to that, Fig. 31 plots the MTF of a via placed
at the output node of a full adder with varying input signal
probabilities. The variation in MTF of a via with varying
switching activities is a clear indicator of the inaccuracy of
isolated ring oscillator monitors.
Our approach, unlike ring oscillators, requires the halting of
the operation of the circuit during testing. However, all modules
are not active at all times in many applications, and wearout tests
can be scheduled during times of inactivity, at startup, or
shutdown.
Fig. 28. Variation in MTF of a via undergoing degradation due to electromigration
with varying temperature (MTF in years vs. temperature in K).
Fig. 29. The sum generation in a full adder.
Fig. 30. Switching activity of a node of a full adder with varying switching
probabilities of the inputs (switching activity vs. P(A) and P(B)).
Fig. 31. Variation in MTF of a via placed at the output node of a full adder with
varying input signal probabilities (MTF in years vs. P(A) and P(B)).
7. Summary
This paper has shown that wearout of vias can be detected by
measuring path delays. Analysis of via failure rates shows that the
change in delay as a function of time is independent of Weibull
distribution parameters and the number of stages in a path. This
shows that a trigger point based on an increase in delay directly
relates to the number of failing vias in an inverter chain. Hence,
a delay detection circuit provides quantifiable evidence of via
degradation in the absence of other sources of wearout for
inverter chains.
A self-test scheme to monitor the wearout of a chip by
measuring the delay through data paths has been presented. This
paper summarizes the circuit design, together with analysis of the
required test time and the impact of potential problems caused by
power supply noise, jitter, and temperature. Simulation results
indicate that very small changes in delay can be detected.
The same procedure is applicable to any degradation that
results in a delay shift, such as hot carrier injection and negative
bias temperature instability, which increase the device threshold
voltage with time.
Acknowledgment
The authors thank the Semiconductor Research Corporation for
their financial support, under Tasks 1645.001 and 1645.002.
References
[1] M.H. Woods, MOS VLSI reliability and yield trends, Proc. IEEE 74 (12) (1986)
1715–1729.
[2] K. Banerjee, et al., Characterization of contact and via failure under short
duration high pulsed current stress, in: Proceedings of the International
Reliability Physics Symposium, 1997, pp. 216–220.
[3] B. Li, et al., Impact of via–line contact on Cu interconnect electromigration
performance, in: Proceedings of the International Reliability Physics Symposium, 2005, pp. 24–30.
[4] J. Black, Electromigration—a brief survey and some recent results, IEEE Trans.
Electron Devices 16 (4) (1969) 338–347.
[5] G. Steinlesberger, et al., Copper damascene interconnects for the 65 nm
technology node: a first look at the reliability properties, in: International
Interconnect Technical Conference, 2002, pp. 265–267.
[6] M. Lamy, et al., How effective are failure analysis methods for the 65 nm
CMOS technology node? in: Proceedings of the International Symposium on
Physical and Failure Analysis, 2005, pp. 32–37.
[7] D. Markovic, et al., Methods for true energy-performance optimization, IEEE J.
Solid-State Circuits 39 (8) (2004) 1282–1293.
[8] M. Orshansky, et al., Impact of spatial intrachip gate length variability on the
performance of high-speed digital circuits, IEEE Trans. Computer-Aided Des.
21 (5) (2002) 544–553.
[9] B. Cline, et al., Analysis and modeling of CD variation for statistical static
timing, in: Proceedings of the International Conference on Computer-Aided
Design, 2006, pp. 60–66.
[10] K.A. Bowman, S.G. Duvall, J.D. Meindl, Impact of die-to-die and within-die
parameter fluctuations on the maximum clock frequency distribution for
gigascale integration, IEEE J. Solid-State Circuits 37 (2) (2002) 183–190.
[11] M. Orshansky, L. Milor, C. Hu, Characterization of spatial intra-field gate CD
variability, its impact on circuit performance, and spatial mask-level
correction, IEEE Trans. Semicond. Manuf. 17 (1) (2004) 2–11.