Introduction To IC Testing, Reliability and Failure Analysis

3.1 Understand IC testing
3.1.1 Explain why electrical evaluation of ICs in wafer
form requires two different types of testing: ‘test
pattern’ and probe testing

• Semiconductor test equipment (IC tester), or
automated test equipment (ATE), is a system that
applies electrical signals to a semiconductor device and
compares the output signals against expected values
to test whether the device works as specified
in its design specifications.
*Testers are roughly categorized into logic testers,
memory testers, and analog testers.

*Normally, IC testing is conducted at two levels: the
wafer test (also called die sort or probe test), which
tests wafers, and the package test (also called final
test), performed after packaging.

*Wafer testing uses a prober and a probe card,
while package testing uses a handler and a test
socket, together with a tester.
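The compare-against-expected step that a tester performs can be sketched in a few lines. This is only an illustration: the `Vector` type, the `run_go_no_go` helper, and the toy `device_under_test` (a 2-input AND and OR) are hypothetical names, not part of any real ATE software.

```python
from dataclasses import dataclass

@dataclass
class Vector:
    inputs: tuple    # logic levels driven onto the device inputs
    expected: tuple  # output levels the design specification calls for

def device_under_test(inputs):
    """Hypothetical stand-in for a real device: returns (a AND b, a OR b)."""
    a, b = inputs
    return (a & b, a | b)

def run_go_no_go(dut, vectors):
    """Apply every vector and collect the ones whose outputs mismatch.

    Returns (passed, failures); each failure is (inputs, expected, actual).
    """
    failures = []
    for v in vectors:
        actual = dut(v.inputs)
        if actual != v.expected:
            failures.append((v.inputs, v.expected, actual))
    return (not failures, failures)

vectors = [
    Vector((0, 0), (0, 0)),
    Vector((0, 1), (0, 1)),
    Vector((1, 0), (0, 1)),
    Vector((1, 1), (1, 1)),
]

passed, failures = run_go_no_go(device_under_test, vectors)
print("GO" if passed else "NO-GO", failures)
```

A real tester also checks timing and analog parameters; this sketch covers only the logic-level comparison.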
*What is a wafer prober?
*A wafer prober is a system used for electrical testing
of wafers in the semiconductor development and
manufacturing process.
*In an electrical test, test signals from a measuring
instrument or tester are transmitted to individual
devices on a wafer via probe needles or a probe card
and the signals are then returned from the device.
*A wafer prober is used for handling the wafer to
make contact in the designated position on the
*What is a probe card?
*A probe card is a jig used for electrical testing of an LSI
(large-scale integrated circuit) chip on a wafer during the
wafer test process in LSI manufacturing.

*A probe card is docked to a wafer prober to serve as a
connector between the LSI chip electrodes and the LSI tester
(the measuring instrument).

*The needles of the probe card contact the LSI chip
electrodes to conduct the electrical go/no-go test.

*The wafer test process is highly important and depends
heavily on the reliability of probe cards.
*A test pattern detects a fault from the fault model.
• Fault models
– Stuck-line (single and multiple)
– Bridging
– Stuck-open

Faults: - nodes shorted to power or ground (stuck-at)
- nodes shorted to each other (bridging)
- inputs floating, outputs disconnected (stuck-open)
Test Pattern Generation

• Manufacturing test ideally would check every node
in the circuit to prove it is not stuck.
• Apply the smallest sequence of test vectors necessary
to prove each node is not stuck.
• Good observability and controllability reduce the
number of test vectors required for manufacturing test
– Reduces the cost of testing
– Motivates design-for-test
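The idea of finding a small vector set that proves no node is stuck can be illustrated with a tiny fault simulation. The circuit y = (a AND b) OR c, the node names, and the greedy selection are assumptions made for this sketch; production test generators use dedicated algorithms and far larger netlists.

```python
from itertools import product

NODES = ["a", "b", "c", "n1", "y"]  # n1 is the internal AND output

def simulate(a, b, c, fault=None):
    """Evaluate y = (a AND b) OR c, optionally with one node stuck.

    fault is None or a (node_name, stuck_value) pair.
    """
    def force(name, value):
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a = force("a", a)
    b = force("b", b)
    c = force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)

# Single stuck-at fault list: every node stuck at 0 and stuck at 1.
faults = [(node, value) for node in NODES for value in (0, 1)]
vectors = list(product((0, 1), repeat=3))

# A vector detects a fault when the faulty output differs from the good one.
detects = {
    vec: {f for f in faults if simulate(*vec, fault=f) != simulate(*vec)}
    for vec in vectors
}

def greedy_cover(detects, faults):
    """Pick vectors one at a time, each covering the most remaining faults."""
    remaining = set(faults)
    chosen = []
    while remaining:
        best = max(detects, key=lambda v: len(detects[v] & remaining))
        gained = detects[best] & remaining
        if not gained:
            break  # whatever is left cannot be detected by any vector
        chosen.append(best)
        remaining -= gained
    return chosen, remaining

chosen, undetected = greedy_cover(detects, faults)
print("selected vectors:", chosen)
print("undetected faults:", undetected)
```

For this circuit, four of the eight possible input vectors are enough to detect all ten single stuck-at faults. A greedy pick is not guaranteed to be minimal in general, but it shows why a well-chosen subset can replace exhaustive testing.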
3.1.2 Explain final (product) testing

*There are two important tests in semiconductor manufacturing.

*One is the wafer test during the wafer process, in which
electrical characteristics of chips are tested before dicing a
wafer into many pieces of semiconductor (called dies or chips).

*The other is the final test during the assembly and testing
process, which is conducted after packaging the diced chips.

*An IC socket is used in the final test. It plays the crucial role
of connecting the device and the tester, just as a probe card
does in the wafer test (see the figure below).
*Depending on the purpose of the test, IC sockets are
categorized into two groups:
i) burn-in sockets for testing reliability, including durability;
ii) test sockets for measuring electrical characteristics.

*Although both types are generally referred to as IC
sockets, the required performance differs according to
their use.
3.1.3 Explain ‘burn-in’
*Burn-in is an accepted practice for detecting early failures
in a population of semiconductor devices. It usually requires
electrically testing the product under an expected
operating electrical cycle (at the extremes of its operating
conditions), typically over a period of 48 to 168 hours.
Alternatively, thermal stress (e.g. 125°C for 168 hours) or
environmental stress screening (e.g. 20 cycles from -10 to
70°C, ramped at a set rate in °C/min) is used.

*Burn-in is applied to products as they are made, to detect
early failures caused by faults in manufacturing practice.
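One common way to relate a thermal burn-in at 125°C to time in normal use is the Arrhenius acceleration factor. The model is not given in the text above, so treat it as an assumption; the 55°C use temperature and the 0.7 eV activation energy below are illustrative values, not data from any particular product.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration(t_use_c, t_stress_c, activation_energy_ev):
    """Acceleration factor of a stress temperature relative to a use temperature.

    AF = exp((Ea / k) * (1 / T_use - 1 / T_stress)), temperatures in kelvin.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)

# Assumed values for illustration only.
af = arrhenius_acceleration(t_use_c=55.0, t_stress_c=125.0,
                            activation_energy_ev=0.7)
equivalent_use_hours = 168 * af  # 168 h of burn-in expressed as use hours

print(f"acceleration factor: {af:.1f}")
print(f"168 h at 125 C is roughly {equivalent_use_hours:.0f} h at 55 C")
```

The result is very sensitive to the activation energy, which depends on the failure mechanism, so a single factor should not be applied blindly to every defect type.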

3.2 Understand reliability and degradation of IC

3.2.1 Explain the meaning of reliability

*Reliability is the probability that a component (such as an
integrated circuit), a piece of equipment, or a system will
satisfactorily perform its intended function under given
circumstances, such as environmental conditions, limits on
operating time, and the frequency and thoroughness of
maintenance, for a specified period of time.

*For example, if an integrated circuit designed to operate in
space for 15 years actually performs its function for that whole
period, it can be called reliable.
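To put a number on this definition, a minimal sketch can assume a constant failure rate, so that the probability of surviving to time t is R(t) = exp(-λt). The 100 FIT rate (failures per 10^9 device-hours) is an assumed figure for illustration, not a value from the text.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days, ignoring leap years

def reliability(failure_rate_per_hour, hours):
    """Probability of surviving `hours` under a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)

fit = 100                      # assumed failure rate in FIT
failure_rate = fit * 1e-9      # convert FIT to failures per hour
mission_hours = 15 * HOURS_PER_YEAR

r = reliability(failure_rate, mission_hours)
print(f"probability of surviving 15 years: {r:.4f}")
```

With these assumed numbers the survival probability is about 0.987, meaning roughly 13 devices in 1000 would be expected to fail within the 15-year mission.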
3.2.2 Explain reliability prediction techniques such as
Failure Mode and Effects Analysis (FMEA)
*Failure Mode and Effects Analysis (FMEA), often written with
"failure modes" in the plural, was one of the first highly
structured, systematic techniques for failure analysis. It was
developed by reliability engineers in the late 1950s to study
problems that might arise from malfunctions of military systems.

*An FMEA is often the first step of a system reliability study.
It involves reviewing as many components, assemblies, and
subsystems as possible to identify failure modes, and their
causes and effects.
*For each component, the failure modes and their resulting
effects on the rest of the system are recorded in a specific FMEA
worksheet.

*There are numerous variations of such worksheets. An FMEA can
be a qualitative analysis, but may be put on a quantitative basis
when mathematical failure rate models are combined with a
statistical failure mode ratio database.
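A worksheet of this kind can be sketched as a small table in code. The ranking by risk priority number (RPN = severity × occurrence × detection, each scored 1 to 10) is a widely used convention but is not described in the text above, and the rows and scores below are invented purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class FmeaRow:
    component: str
    failure_mode: str
    effect: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost certain to be caught) to 10 (undetectable)

    @property
    def rpn(self):
        """Risk priority number used to rank the rows."""
        return self.severity * self.occurrence * self.detection

# Hypothetical worksheet entries for an IC package.
worksheet = [
    FmeaRow("bond wire", "open circuit", "pin non-functional", 8, 3, 2),
    FmeaRow("gate oxide", "dielectric breakdown", "input leakage", 7, 2, 6),
    FmeaRow("package", "delamination", "moisture ingress", 6, 4, 5),
]

ranked = sorted(worksheet, key=lambda row: row.rpn, reverse=True)
for row in ranked:
    print(f"{row.rpn:4d}  {row.component:10s}  {row.failure_mode} -> {row.effect}")
```

Sorting by RPN puts the rows that most need corrective action at the top, which is how a qualitative worksheet starts to support prioritization.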

*Several different types of FMEA exist (system, design,
process, service, and software).

*FMEA is an inductive reasoning (forward logic) single
point of failure analysis and is a core task in reliability
engineering, safety engineering, and quality engineering.

*A successful FMEA activity helps identify potential failure
modes based on experience with similar products and
processes or based on common physics of failure logic. It is
widely used in development and manufacturing industries
in various phases of the product life cycle. Effects
analysis refers to studying the consequences of those
failures on different system levels.

*Customers are placing increased demands on companies for
high quality, reliable products. The increasing capabilities
and functionality of many products are making it more
difficult for manufacturers to maintain quality and
reliability.

*Traditionally, reliability has been achieved through
extensive testing and use of techniques such as probabilistic
reliability modeling. These techniques are applied in the late
stages of development. The challenge is to design in quality
and reliability early in the development cycle.

FMEA is a methodology for analyzing potential reliability problems
early in the development cycle, where it is easier to take actions to
overcome these issues, thereby enhancing reliability through design.

FMEA is used to identify potential failure modes, determine their
effect on the operation of the product, and identify actions to
mitigate the failures.

The early and consistent use of FMEAs in the design process
allows the engineer to design out failures and produce reliable,
safe, and customer-pleasing products. FMEAs also capture
historical information for use in future product improvement.
Benefits of FMEA

*Improves product/process reliability and quality
*Increases customer satisfaction
*Identifies and eliminates potential product/process
failure modes early
*Prioritizes product/process deficiencies
*Captures engineering/organization knowledge
*Emphasizes problem prevention
*Documents risk and actions taken to reduce risk
*Provides focus for improved testing and development
*Minimizes late changes and associated cost
*Acts as a catalyst for teamwork and idea exchange between functions
Types of FMEA

*System - focuses on global system functions
*Design - focuses on components and subsystems
*Process - focuses on manufacturing and assembly processes
*Service - focuses on service functions
*Software - focuses on software functions
3.2.3 Explain bath-tub curve prediction of reliability:

*Reliability specialists often describe the lifetime of a
population of products using a graphical representation
called the bathtub curve.

*The bathtub curve consists of three periods: an infant
mortality period with a decreasing failure rate, followed
by a normal life period (also known as "useful life")
with a low, relatively constant failure rate, and
concluding with a wear-out period that exhibits an
increasing failure rate.
Figure 1: The Bathtub Curve
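The three periods of the bathtub curve can be approximated, as a rough sketch, by adding three Weibull hazard rates: one with shape below 1 (decreasing), one with shape equal to 1 (constant), and one with shape above 1 (increasing). The Weibull model and every shape and scale value below are assumptions chosen only to produce the characteristic shape.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_hazard(t):
    """Sum of three hazards, one per period of the curve (t in hours, t > 0)."""
    infant = weibull_hazard(t, shape=0.5, scale=2.0e6)   # decreasing rate
    useful = weibull_hazard(t, shape=1.0, scale=1.0e6)   # constant rate
    wearout = weibull_hazard(t, shape=5.0, scale=1.0e5)  # increasing rate
    return infant + useful + wearout

for hours in (10, 100, 1_000, 10_000, 50_000, 100_000, 150_000):
    print(f"{hours:>8d} h   failure rate = {bathtub_hazard(hours):.2e} per hour")
```

Printing the rate at increasing ages shows it falling through the early hours, flattening out, and then rising again as the wear-out term takes over.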
a. Infant mortality

*Infant mortality failures are attributed to major weaknesses in
materials, production defects, faulty design, omitted inspections,
and mishandling. These are also regarded as extrinsic
failures, and it is suggested that all systems with gross defects
fail during early operation.

*This period usually lasts a few hundred hours, and a "burn-in" is
sometimes employed to keep these failures from occurring in the
field. Note that burn-in does not stop the failures from occurring;
it just ensures that they happen in-house and not on the
customer’s premises.

*This leads to the next period, the system's useful life.

b. Operating life

*Here, a low failure rate can be observed. This part is
often modeled using a constant failure rate, which means
that failures occur randomly in time rather than clustering
at a particular age. Even though this assumption is heavily
debated, many reliability calculations are based on such
a constant rate.
c. Wear out

*At the end of the lifetime, the wear-out period follows, due to
fatigue of materials. Components begin to fail through having
reached their end of life, rather than by random failures.

*An intelligent product design makes sure that wear-out occurs
after some time greater than the planned lifetime of the
product.

*Examples are electrolytic capacitors drying out, fan bearings
seizing up, and switch mechanisms wearing out. Well
implemented preventive maintenance can delay the onset of
this region.
3.3 Understand failure analysis in microelectronics
3.3.1 Explain general process flow in IC failure analysis

*Semiconductor failure analysis (FA) is the process of
determining how or why a semiconductor device has failed, often
performed as a series of steps known as FA techniques. Device
failure is defined as any non-conformance of the device to its
electrical and/or visual/mechanical specifications. Failure
analysis is necessary in order to understand what caused the
failure and how it can be prevented in the future.
*Failure analysis starts with failure verification.
*It is important to validate the failure of a sample prior
to failure analysis in order to conserve valuable FA
resources. Failure verification is also done to
characterize the failure mode. Good characterization
of the failure mode is necessary to make the FA
efficient and accurate.
*After failure verification, the analyst subjects the
sample to various FA techniques step by step, collecting
attributes and other observations along the way.

*Non-destructive FA techniques are done before
destructive ones. Also, the results of these various FA
techniques must be consistent or corroborative. Any
inconsistency in results must be resolved before
proceeding to the next step.

*For example, a pin cannot show a broken wire during
X-ray inspection and also an acceptable curve
trace during curve tracing, so this
inconsistency must be resolved by verifying which of
the two results is correct.
*In general, the results of the various FA techniques
collectively point to the real failure site. The
FA process is finished once there is enough
information to draw a conclusion about the location
of the failure site and the cause or mechanism of failure.
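The consistency requirement, illustrated by the X-ray versus curve-trace example, can be sketched as a simple check over recorded observations. The tuple layout and the "open"/"ok" labels are hypothetical conventions invented for this example.

```python
def find_conflicts(observations):
    """Return the pins on which techniques disagree.

    observations: list of (technique, pin, finding) tuples, where finding
    is a short label such as "open" or "ok".
    """
    by_pin = {}
    for technique, pin, finding in observations:
        by_pin.setdefault(pin, {})[technique] = finding
    return {
        pin: results
        for pin, results in by_pin.items()
        if len(set(results.values())) > 1
    }

# Hypothetical results collected during an analysis.
observations = [
    ("x-ray", 7, "open"),
    ("curve trace", 7, "ok"),
    ("x-ray", 12, "ok"),
    ("curve trace", 12, "ok"),
]

conflicts = find_conflicts(observations)
for pin, results in conflicts.items():
    print(f"pin {pin}: resolve before the next step -> {results}")
```

Any pin reported here marks a point where the analyst has to re-verify one of the results before moving on.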
3.3.2 Explain non-destructive test (NDT) and destructive
test in failure analysis.

*Failure analysis techniques, or simply FA techniques,
are the individual analytical steps performed to complete
the failure analysis process. Each FA technique in the FA
process is designed to provide its own specialized
information that will contribute to the determination of
the failure mechanism of the sample. Although FA
techniques are generally independent of each other, their
results must nonetheless be consistent and corroborative
in order to arrive at a strong conclusion for the FA cycle.
*During the FA process, all applicable non-destructive FA
techniques must be performed prior to the conduct of any
destructive FA technique.

*An FA technique that alters a sample permanently in
any way (whether visual, mechanical, chemical, or
electrical) is considered destructive.

*On the other hand, non-destructive techniques are those
which do not cause any permanent change in the sample.
*Non-destructive test (NDT)
- Radiographic X-ray or C-SAM/SAT (scanning acoustic microscopy)

*Destructive test
- cross-sectioning and decapping (decapsulation) of IC packages
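The rule that every applicable non-destructive technique comes before any destructive one can be expressed as an ordering step plus a check. The technique names come from the list above; the `(name, is_destructive)` layout and the helper names are assumptions for this sketch.

```python
def plan_sequence(techniques):
    """Order techniques so that all non-destructive ones run first.

    techniques: list of (name, is_destructive) pairs. The sort is stable,
    so the original relative order inside each group is preserved.
    """
    return sorted(techniques, key=lambda item: item[1])

def is_valid_sequence(techniques):
    """True if no non-destructive step follows a destructive one."""
    seen_destructive = False
    for _name, is_destructive in techniques:
        if is_destructive:
            seen_destructive = True
        elif seen_destructive:
            return False
    return True

requested = [
    ("decapping", True),
    ("X-ray radiography", False),
    ("cross-sectioning", True),
    ("scanning acoustic microscopy", False),
]

plan = plan_sequence(requested)
for name, is_destructive in plan:
    kind = "destructive" if is_destructive else "non-destructive"
    print(f"{name} ({kind})")
```

The check function is useful on its own for reviewing a plan written by hand, since a sample altered by a destructive step cannot be restored for a later non-destructive inspection.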
• Non-destructive test (NDT)
*X-ray radiography
• Destructive test
*Cross-sectioning, or microsectioning, is a failure analysis
technique for mechanically exposing a plane of interest in a die or
package for further analysis or inspection.

*It usually consists of sawing, grinding, polishing, and staining the
specimen until the plane of interest is ready for optical or electron
microscopy. The conventional method of microsectioning
requires the encapsulation of the specimen in plastic to give it
stability, support, and protection. A relatively newer technology
utilizes specific tools and procedures to allow non-encapsulated
microsectioning.
Table 1 shows FA
techniques commonly
used in the
