Timing Analysis of Computer Hardware: Robert B. Hitchcock, Sr. Gordon L. Smith David D. Cheng
Timing Analysis of Computer Hardware: Robert B. Hitchcock, Sr. Gordon L. Smith David D. Cheng
Timing Analysis of Computer Hardware: Robert B. Hitchcock, Sr. Gordon L. Smith David D. Cheng
Gordon L. Smith
David D. Cheng
Timing Analysis is a design automation program that assists computer design engineers in locating problem timing in a
clocked, sequential machine. The program is effectivefor large machines because, in part, the running time is proportional to
the number of circuits. This isin contrast to alternative techniques such as delay simulation, which requires large numbers of
test patterns, and path tracing, which requires tracing of all paths. The output of Timing Analysis includes ‘Slack” at each
block to provide a measure of the severity of any timing problem. The program also generates standard deviations for the
times so that a statistical timing design can be produced rather than a worst case approach. This system has successfully
detected all but a few timing problems for the IBM 3081 Processor Unit (consisting of almost 800 000 circuits) prior to the
hardware debugging of timing. The 3081 is characterized by a tight statistical timing design.
1. Introduction
As the usage of large scale integration (LSI) in computers tracedallpaths [4-61. Other approaches have taken the
continuestoincrease,debugging of timing problems on delays of the blocks to be well defined numbers and have
actualhardware becomes moreandmore difficult. LSI attempted to optimize these delays within a power constraint
creates a scarcity of probe points; and since timingproblems [7, 81. A simulator accepting the delays as stochastic vari-
tend to beidentified serially on hardware, theincreased time ables has been used [9] to allow the probabilistic compari-
required to redesign andrebuildcomponentscontaining sons of clock and data settingsignals.
many circuits canhave a serious impacton schedules.
Another approach [lo] used ablock-oriented algorithm
Timing Analysis (TA) is a design automation program on limited models. In this algorithm thereis a dependency on
which provides an alternative to the hardware debugging of the function (AND and OR) of each block. An interesting
timing problems. The program establishes whether all paths approach which does not ignore logic function yet is opera-
within the design meet stated timing criteria, that is, that tionally efficient is described in [ 1 I]. This approach does not
data signals arrive at storage elements early enoughvalidfor deal with statistics, however. Two papers [ 12, 131, which
gating butnot so early as to cause premature gating Fig. (see resulted from different statistical approaches to the block
I ) . Although Timing Analysis may beused as part of many model of the authors, deal with the problems of placing
different hardware design verification methodologies, it was bounds on the distributionsof the model.
developed as part of the IBM 3081 designverification
methodology [ 1-31. The TA program is based on a block-oriented algorithm
which leads to a running time essentially proportional to the
Because the usageof clocked designs has been widespread number of logic blocks. Unlike the approachesbased on path
since the earliest electronic computers, the verification of tracing or delaysimulation,theresult is a runningtime
path timings has alwaysbeen important. Previously reported acceptable for even very large machines, since the block-
approaches have sought hardware-independent alternatives orientedalgorithm of TimingAnalysis does notpaythe
other than hand timing of questioned paths. One approach excessive costs associated withtracing all pathsor associated
centered on the analysis of the detailed electrical properties with the modeling of a large number of patterns in delay
of the components along a path, using an approach which simulation.
Copyrigbt 1982 by International Business Machines Corporation. Copying is permitted without payment of royalty provided that ( 1 ) each
reproduction is done without alteration and ( 2 ) the Journal reference and IBM copyright notice are included on the first page. The title and
abstractmaybeusedwithoutfurtherpermissionincomputer-basedandotherinformation-service systems. Permission to republish other
100 excerpts shouldbe obtained from the Editor.
ROBERT B. HITCHCOCK. SR. ET AL. IBM J. RES. DEVELOP. VOL. 26 NO. 1 JANUARY 1982
The basic TA block-oriented algorithm for computation I I
2) for non-probabilis-
of arrival times and slacks (see Section
tic cases involving combinational logic is analogous to the
Program Evaluation and Review Techniques (PERT) [14]
algorithm when run non-probabilistically.
LIVCn i
I-
Timing Analysis is primarily a tool for aiding theengineer
who is charged with producing a design to meet a specified Figure 1 The timing verification problem. Makecertainthat
clock cycle. The output of the TA program notonly identi- every transition stored by one clock will arrive at the next storage
element in time to be gated by the otherclock.
fies any logicwith timing problems, but alsoprovidesa
measure of the severity of each problem. Thus, the engineer
is directed to theproblem areas andis given a measure of the
changes required in the path timing.
2- 7 8 - -1
provided good results on the 3081 in which the statistical --
2
- 4 I
-3 -
II
IO
terms are a significant but not overwhelming portion of the d I_
path delays.
AA
-
HA
.
1
CA
-
3
-
DA
AB
I
- 4
BB
O
- "
CB
8
IBM J. RES. DEVELOP. VOL. 26 NO. I JANUARY 1982 ROBERT B. HITCHCOCK, SR.ET AL.
RD = delay I f output DA has a delay of three and a required arrival time at its
is rising
( R D ,F D ) FD =delay if output output of ten; therefore, the required arrival time at its input
is falling
R A = rising arrival is seven (the output required arrival timeminus the internal
time delay). With respect to this block, the output of block C A is
F A =falling arrival
time one unit late,so the slackis minus one. Block DB hasa delay
of five, so the required arrival time at its input
is five, and the
signal feeding from block C A is three units late. Three units
late (slack = - 3) is worse than one unit late,so the program
stores the most negative value (-3) at the output of block
CA.Theslackvaluestored attheoutput of the block
correspondstotheslack value forthe worst path going
through that block.
.The order in which blocks are processed by Timing From Fig. 2, one canobserve a funneling of negative slack
Analysis ensures that every input to a block will have been values through block BA. IfBAcould be replaced by a
processed before the block itself. The arrival timeof a block circuit having a smaller delay, all the negative slacks could
is computed by be eliminated without changing the functionaldesign at all.
1. Establishing the arrival time at theblock output for each Clearly, there are also alternative techniques for resolving
inputtothe block by propagatingeachinput signal this timing problem.
through the block,
2. Selecting the maximum such arrival time. In addition, notice that there are many blocks with posi-
tive slacks. These blocks could use circuits with increased
Thus, blocks in the leftmost columnof Fig. 2 will have an delay and still be within the timing constraints[9, 101.
arrivaltimeequaltothedelay of the blockplus the
maximum arrival time of any primary input (which are all If we are looking for early signals causedby short paths,
shown as zero). Block BA, which has one input at timetwo the above procedures are modified slightly. The arrival time
and the other at time three,is typical of other blocks in the at a block is computed from the minimum arrival time of any
figure. Three is greater, the internal delayis four, therefore input to the block rather than the maximum). Slacks are
seven is the maximum output arrival time for this block. again defined so that a negative slack indicates a problem;
that is, the signalis too early.
These calculations yield the outputs of blocks DA, DB,
DC,andDD,attimes eleven, thirteen, nine, and nine, Figure 3 elaborates the Timing Analysis principles illus-
respectively. (Noticethat two signalsarelateand two trated by Fig. 2 by showing how the rising and falling delays
signals are early.) and the inverting properties of a block are included in the
computations. As the key at the top of Fig. 3 indicates, the
Theslack S is defined asthe differencebetween the first number inside the block is the delay if the output is
required arrival time and the actual arrival time. The slack rising; the second number is the delayif the outputis falling;
value is computed so that anegative numberindicates a and the arrival times are keptin a two-tuple with the rising
problem; that is, the signal is too late. The output of block arrival time in the first position and the falling arrival time
DA arrivesa t time eleven, one unit late (slack= - I ) . Block in the secondposition. Thefactthat block 2 inverts is
DB is three units late (slack = -3); block DC is one early indicated by the triangular wedge symbol a t its output. This
(slack = + I); and DDis three early (slack= +3). The slack means that the rising output will be the greater of the two
information is thenpropagatedbackthroughthe block falling inputs (at times three and four) plus the block delay
graph, with all blocks being reprocessed in order opposite to (three), so that the rising arrival time will be seven. The
that used for generation of the arrivaltimes. falling output will be the greaterof the two rising inputs (at
times two and three) plus the delay (one), so the falling
When eachblock is processed,the slacks are computed for arrival time will be four. The same calculation can be done
102 a t a time. For example,block
the sources for each input, one for each of these blocks so that we can compute the rising
ROBERT E. HITCHCOCK, SR.ET AL IBM J. RES. DEVELOP. VOL. 26 NO. 1 JIANUARY 1982
and falling primary output arrival timesof eight and twelve,
respectively. Note that if ten were the required arrival time
for both the rising and falling arrival times, one would be CI
.. Logic . .. ’ Logic J
early and the other late. For this reason the TA diagnostics Storage Storage
distinguish between block output rising and falling arrival element e2 element
The calculation of arrival times also includes statistics. c I propagate --I L...
Each block delay, which has been referred to as a single-
C1 compare
delaynumberuptothis point, actually consists of two
numbers-a nominal delay and a delay standard deviation
(sigma). The calculationof a rising or falling arrival time at
each block of the model involves the calculationof a nominal C2 propagate
arrivaltimeandanarrival-timesigma.Nominalarrival
C2 compare
times are computed by simply adding block nominal delays
to previous nominal arrival times. The arrival-time sigmas Figure 4 Clockdescriptions.
are computedby applying standardconvolutions.
IBM J. RES. DEVELOP. VOL. 26 NO. I JANUARY 1982 ROBERT E. HITCHCOCK, SR. ET AL.
the model based upon an understanding of the logic by the to use the confidence level to draw conclusions about the
engineer. They are processed within the context of the basic probability of a timing failure is weakened. However,the
block algorithm of TA. Central Limit Theorem [16] of the theory of statistics
indicates that this effect is usually minimal.
In certain cases, it may notbepossible to place delay
2. Timing Analysis statistical approximation
modifiers ina model without having an undesirable effect on
The algorithm of Timing Analysis is a statistical approxi-
other paths. For example, it may not be possible to locate
mation. In particular, the effect of parallel paths is not
any block pin in a two-cycle path where the placement of a
reflected within the algorithm. For example, the output
delay modifierwillnot adversely affect a non-two-cycle
of a block with two independent but identical Gaussian
path. It is then necessary to make one or more extra runs in
inputs differs slightly from that of a Gaussian distribu-
which delay modifiers inthe path are set to different values.
tion, and the output nominal delay and standard devia-
tion differ slightly from the values computed by TA. As a
As used in the present methodology, no automatic check is
second example, the probability that all paths of a
provided to ensure that delay modifiers do not erroneously
machine with many paths are free of timing problems is
conceal timing problems. Thus, the completeness of TA can
generally lower than the probability that any given path
be compromisedby the usage of delay modifiers.
is free of timing problems. Thus, the assurance given by a
successful TA run that each individual path in isolation
When a model of the total design to be analyzed is too
meets or exceeds the desired probability of freedom from
large for storage, it is still possible to analyze the entire
timing problems does not necessarily result in the same
model. The large model is broken up into more manageable
assurance for the entire design.
small models that will fit within storage. In the case of the
model of the IBM 3081, it has been found convenient to 3. Non-sensitized paths and special timing criteria
make the smaller models correspond to thermal conduction The number of delay modifiers and the number of multi-
modules (TCMs) [ 151, which can contain as many as 45 000 ple runs required because of non-sensitized paths or
circuits. The values computed at the primary outputs special timing criteria have an impact on the ease of
(inputs) of one model are saved and automatically fed to the model setup and results analysis. It is easier to set up TA
primary inputs (outputs) of the other models connected to it runs and interpret the output whenmostof the design
so thatthe total analysis effectivelyprocesses theentire contains (a) only storage elements where a single pair of
model. propagate and compare times can beassigned and (b)
paths where delay modifiers are not required.
4. Application of timing analysis
The timing analysis program is used by engineers as an aid The analysis of the 3081 system utilizing statistical timing
in the generation of a designwhich meets stated timing criteria has been successfully
accomplished using TA.
criteria. Slacks produced by the TA program are used for Almost all timing problems have beendetected via TA prior
pinpointing timing problems and for assessing their serious- to hardware debugging of timing [3]; the few exceptions
ness. The engineers make use of both slacks and arrival were traced to errors in generation ofblock delays or to
times as listed in the output for each block as an aid for misplacement of delay modifierswithin the model. The
designing changes to eliminate identified timing problems. resulting design is such that any further shortening of the
Information is also provided by TA for establishing whether clock cycle would impact a very large number of paths and
or not the timing performance goalis achievable for the therefore be extraordinarily difficult to accomplish.
machine. Additionally, TA provides information for assess-
ing various design alternatives.
The output provided (slacks and arrival times) has been
found to be very effective as a design aid to the engineer.
This timing analysis method is applicable to a broad class
Gaussian distributions have served as adequate block delay
of clocked sequential machines, including, in particular,
approximations, and it has been possible to generate block
machines of large size. The domain of applicability of this
delays with high precision. However, block delay definition
technique is primarily determined by the following three
represented a major part of the development effort because
characteristics of the timing analysis method:
of the need to achieve the shortest cycle possible with the
1. Block delay modeling technology. A significant part of this effort results from the
The accuracy of TA results is dependent upon the accu- inclusion of wiring effects in the delays. The slacks gener-
racy of the delay parameters calculated for each block on ated at each block as a result of the TA statistical approxi-
the basis of the block circuit type and the block circuit mations are almost always equal to, or very close to, the
environment. Also,ifblock delay distributions signifi- slacks for the worst path through that block-much closer
104 cantly differ from Gaussian distributions, then the ability than a worst case statistical error analysis indicates. Also,
ROBERT 8 . HITCHCOCK. SR. ET AL. 1BM I. RES.DEVELOP. VOL. 26 NO. 1 JANUARY 1982
reasonable engineering judgmentsadequatelyhandlethe These functions are located in Poughkeepsie, East Fishkill,
parallelstatistical effects when establishingthetiming and Endicott.
performance of the entire machine. Finally, the number of
delay modifiers and multiple runs required because of non- References
sensitized paths and special timing criteria has been suffi- 1. M. S. Pittler, D.M.Powers, and D.L. Schnabel,“System
Development and Technology Aspectsof the IBM 3081 Proces-
ciently small so as to presenta minimal burden. sorComplex,” IBM J. Res.Develop. 26, 2-11 (1982, this
issue).
It is believed that the TA approach has applicability to 2. R. N. Gustafson andF. J. Sparacio,“IBM 3081 Processor Unit:
many hardware systems; any system with the appropriate Design Considerationsand DesignProcess,” IBM J. Res.
Develop. 26, 12-21 (1982, this issue).
clocking is a candidate. 3. MichaelMonachino,“DesignVerificationSystemforLarge-
Scale LSI Designs,” IBM J . Res. Develop. 26, 89-99 (1982,
5. Summary and conclusions this issue).
Timing Analysis is a block-oriented algorithm for timing all 4. D. J. Pilling and H. B. Sun, “Computer-Aided Prediction of
Delays in LSI Logic Systems,” Proceedings ofthe 10th ACMI
paths within a clocked sequential machine; and since it is a ZEEE Design Automation Workshop, Portland, OR, 1973, pp.
block-oriented algorithm,it is practical for very large 182-186, and discussion heldafter presentation of paper.
machines. I t effectively checks all paths and output arrival 5. Ryotaro Damikawai, Minoru Yamada, Tsuneyo Chiba,
Kenichi Furumaya, andYoji Tsuchiya, “A Critical Path Delay
times and slacks for each block. Statistical approximation Check System,” Proceedings of the 18th ACMIIEEE Design
andmethodsfordealing with logic withspecial timing Automation Conference, Nashville, TN, June 1981, pp. 118-
characteristics areboth built into theT A program. 123.
6. M. A. Wold, “Design Verification and Performance Analysis,”
Proceedings of the 15thACM/IEEE Design Automation
The timing analysis approach has been successful on the Conference, Las Vegas, NV, June 1978, pp. 264-270.
308 1 because 7. A.E. Ruehli, P. K. Wolff, Sr., and G . Goertzel, “Analytical
PowerITimingOptimizationTechniqueforDigitalSystem,”
0 It has found all but a few timing problems prior to the Proceedings of the 14th ACMIZEEE Design Automation
hardware debuggingof timing. Conference, New Orleans, LA, 1977, pp. 142-146.
8. B. J. Agule, J. D. Lesser, A. E. Ruehli, and P. K. Wolff, Sr.,
0 It hassuccessfully allowed the generationof a tight timing “An Experimental System for PowerITiming Optimization of
design. LSI Chips,” Proceedings ofthe 14th ACMIIEEE Design Auto-
0 The basic algorithm is much more efficient than delay mation Conference, New Orleans, LA, 1977, pp. 147-152.
9. Brengt Magnhagen, “Practical Experience from Signal Proba-
simulation or path tracing for finding timing problems. bility Simulation of Digital Designs,” Proceedings of the 14th
The presentation in the output of slacks and arrival times ACM/IEEE Design Automation Conference, New Orleans,
for each block presented the engineers with meaningful LA, 1977, pp. 216-219.
10. T. I. Kirkpatrick and N. R. Clark, “PERT as an Aid to Logic
parameters for assessing and understanding each timing Design,” IBM J. Res. Develop. 10, 135-141 (1966).
problem. 11. T. M. McWilliams,“Verification of TimingConstraints on
Large Digital Systems,” Proceedings of the 17th ACMIIEEE
Design Automation Conference, 1980, pp. 139-147.
12. Arthur Nidas, “Probabilistic PERT,” IBM J . Res. Develop.
Withinthedomain of applicability (see the previous 23,339-347 (1979).
section), the TA approachdescribed in this paper is believed 13. A. Nidas, “Random Critical Paths,” IEEE International
Symposium on Circuits and Systems Proceedings, Houston,
to be effective for solving thetiming designproblem for TX, 1980, pp. 32-35.
many LSI machines. 14. D. G . Malcolm, J. H. Roseboom, C. E. Clark, and W. Fagar,
“Application of a TechniqueforResearchandDevelopment
6. Acknowledgments Program Evaluation,”Oper. Res.7,647-669 (1959).
15. A. J. Blodgettand D.R. Barbour, “Thermal Conduction
The authorsacknowledge the significant contribution of W. Module: A High-Performance Multilayer Ceramic Package,”
Donath indeveloping (alongwithone of theauthors- IBM J. Res. Develop.26,30-36 (1982, this issue).
Hitchcock) a predecessor experimental
block-oriented 16. W. Feller, Probability Theory and Its Application, Vol. 1, John
Wiley & Sons, Inc., New York, 1950.
system [ 171 atthe IBM Thomas J. WatsonResearch 17. W. E. Donath and R. B. Hitchcock, “Method for Determining
Center. This system, which included storage elements, was the Characteristicsof a Logic Block Graph Diagram to Provide
designed to analyze critical paths. an Indication of Path Delays Between the Blocks,’’ U. S. Patent
4,263,65 1, 198 1.
The authors also acknowledge the contributions of the Received September 5, 1980; revised August 18, 1981
following totheearlyformation of TA:W.Donath, R.
Bahnsen and H. Ofek. The management support of A. Fitch, Robert B. Hitchcock, Sr.. is located at the IBM General
M. Monachino, B. Golnek, and R. Russo, and the manage- Technology Division laboratory, Endicott. New York 13760.
ment of the program implementation by A. Brown were all Gordon L. Smithis located at the IBM Data Systems
key to the final success. It is not possible to name all the Division laboratory and David D. Cheng at the IBM General
many others on the 3081 project and in the design automa- Technology Division laboratory, both in Poughkeepsie, New
tion (EDS) and technology groups who contributed to TA. York 12602. 105