The Hierarchical Timing Pair Model: Conference Paper
The Hierarchical Timing Pair Model: Conference Paper
The Hierarchical Timing Pair Model: Conference Paper
net/publication/221377182
CITATIONS READS
2 22
3 authors:
K. J. Ray Liu
University of Maryland, College Park
796 PUBLICATIONS 24,721 CITATIONS
SEE PROFILE
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Shuvra S. Bhattacharyya on 22 May 2014.
ABSTRACT
We present a new model for representing timing information
for functions in High-Level Synthesis (HLS). We identify short-
comings of the conventional timing model, which is a very simple
model derived from the combinational logic model, and show that
our new model overcomes many of these defects. In particular, we
are able to provide a unified timing model that describes hierar-
chical combinational and iterative circuits and provides a compact
representation of the information, that can be used to streamline Figure 1: (a) Full adder circuit. (b)Hierarchical block view.
system performance analysis.
We present experimental results that demonstrate the effective-
ness of our new approach, and describe an efficient algorithm to
represent the same information as the expanded graph that has 5
easily compute the required timing parameters from a description
vertices and 14 edges. In large systems, the savings offered by us-
of the graph. ing hierarchical representations are essential to retaining tractabil-
ity. A hierarchical representation would also be very useful in
1. INTRODUCTION commonly used sequential circuits such as digital filter implemen-
tations.
High-Level Synthesis (HLS) refers to the task of constructing an A major disadvantage of the conventional approach is that
architecture, binding and schedule for an algorithm that has been it does not allow a hierarchical description of the system timing
described at a high level of abstraction. The algorithm is usually when the system.contains delay elements (iterative systems) such
represented as a dataflow graph whose vertices represent functions as the digital filter mentioned above. These delay elements roughly
and edges represent communication or dependencies. To map such correspond to registers in a hardware implementation, but are more
a dataflow graph onto an architecture (either hardware or software) flexible in that they do not impose the restriction that all the de-
efficiently, we need to annotate the application specification and lay elements are activated at the same instant of time [ I . 9 . 31.
architecture with infomiation about the execution times of vertices, This allowance for variable phase clocking is an important way in
and the area utilization and power consumption of processing re- which HLS differs from combinational logic implementation. The
sources. The timing information is used to generate a set of con- r-ephnsirzg optimization in [ I ] provides a good example of how this
straints related to the system that the actual implementation must can be used. Even in sequential logic synthesis. variable phase
satisfy. clocking has been considered in such forms as clock skew opti-
The conventional model for describing timing in this context mization [4] and shimming delays [SI.
is derived from the method used in combinational logic analysis. To the best of our knowledge. there does not appear to be any
Here each vertex is assigned a single value (called the “propaga- other timing model that addresses this issue. Using conventional
tion delay”) representing the maximum delay among all its input- models, a complicated subsystem containing sequential elements
output pairs. will need to be represented in full in the context of the overall
An important requirement of a timing description is the ability system design. rather than using a more convenient condensed de-
to represent systems hierarchically. For example, Fig. I shows the scription of the timing parameters alone.
circuit of a full adder. If we were to consider this as part of a larger In this paper, we propose a different timing model that over-
system (say a 4-bit adder made of 4 full adders), we would pre- comes these difficulties. By introducing a slightly more complex
fer to use the timing information for the hierarchical block view. data structure that allows for multiple input-output paths with dif-
rather than the expanded gate-level view. The reason for this is that fering numbers of delay elements, we are able to provide a single
algorithms for path length computations are typically O(11,71[ E l ) timing model that can describe both purely combinational and it-
where Il/7Jis the number of vertices and [Elis the number of edges erative systems. For purely combinational systems. the model re-
in the graph. The hierarchical view uses 1 vertex and S edges to duces with minimal overhead to the existing combinational logic
This research was suppoited in p a t by the US National Science Foun- timing model. Further details are also available in 161.
dation Grant #9734275 and NSF NYI Award MIP9457397 Our model provides compact representations of the timing data
tAlso with the University of Mruyland Institute for Advanced Com- for large systems. We have used the ISCAS 89/93 benchmark cir-
puter Studies. cuits to test our ideas and have obtained promising results.
V-367
0-7803-6685-910 1 /$10.000200 1 IEEE
R
In the next section, we discuss the requirements that a timing
model must meet. and examine some of the shortcomings of the
conventional model. Section 3 then presents a new model that
overcomes these defects, and explains how it can be efficiently
stored and manipulated. Section 4 presents the results obtained
by apqlying the new technique to benchmark circuits. Finally, we
present our conclusions and some interesting directions for future
research. Figure 2 : Timing of complex blocks
2. REQUIREMENTS OF A TIMING MODEL imply between its input and output. To clarify this idea, consider
the block in Fig. 2 . If we were to write the constraints in terms of
In order to understand the requirements of timing descriptions for
the internal blocks zi and zo, we would obtain
hierarchical systems, it is first useful to clarify certain assumptions
that are made in describing simple combinational systems. xi - 2 1 2 t 1 ; z o-xi 2 t i - 1 x T ; x- ~ > to
x0 -
First, the combinational delay of a system is the rizn.vin7urn de-
lay between any input/output pair in the system. So after the inputs Now we would like to compute certain information such that
are stable, we can wait for the amount of time specified by this de- if we were to combine the complex blocli‘ B under the single start
lay, and be sure that the output is stable. In some cases, especially time 26. we would still be able to write down equations that would
in HLS. the time is in terms of integer multiples of a system clock. provide the same constraints to the environment outside the block
For multiple-input-multiple-output (MIMO) systems, we as- L?. We see that this is achieved by the following constraints:
sume that the inputs are synchronized so that the overall system
can be treated as single-input-single-output (SISO). This is a com- 21,- . T i 2 tl I 22 - 26 2 t ; -k to - 1 X T
mon assumption in combinational timing models [5, 71. To see
why. consider for example the full adder circuit from fig. I. Since In other words, if we assume that the execution time of the block
the output depends on all the inputs, it is acceptable to assume that +
B is given by the expression ti to - 1 x T , we can put down
the computation start only after all inputs are available, thus syn- constraints that exactly simulate the effect of the complex block
chronizing the inputs. It is clear that this assumption breaks down B.
when different outputs do not depend on all inputs, but in most In general, consider a path from input vi = VI to output vo =
cases. this is considered an acceptable tradeoff as it reduces the vk through vertices {VI . . . v k } given by p : v1 +v2+ . . . +vk,
~ ~
complexity of the analysis. with edges ei : ?J~+v~+I.Let t i be the execution time (propaga-
For dataflow graphs used in HLS. we use essentially the same tion delay assuming it is a simple combinational block) of vi. and
combinational timing model that is described above. Delay el- let d, be the number of delays on edge eJ. Now we can define the
ements, however. are treated differently [ I . 2. 31. In sequential corisrrairit time of this path as
logic circuits. all delays are treated as pip-pops that are triggered
on a common clock edge. In HLS scheduling. we assume no such k k-1
restriction on the timing of delays. We assume that each functional tc(p)=):ti-Txxdd,
unit can be started at any time (possibly by providing a start sig- i=l j=1
nal).
Now we can see what exactly are the uses of a timing model. We use the term “constraint time” to refer to this quantity be-
The timing information associated with a block is used primarily cause it is in some sense very similar to the notion of the execution
for the purpose of establishing constraints on the earliest time that time of the entire path, but at the same time is relevant only within
the successor elements of the block can start operating (i.e. when the context of the constraint system it is used to build. Also, we
its outputs become stable once the inputs are applied). By using
these constraints. additional metrics can be obtained relating to the
use the term cIlto refer to the sum x!=l
t ; ,and inp to refer to the
sum dJ . The ordered pair (mplc p )is referred to as a rirning
throughput and latency of the system, such as the iterntior?period pflir.
Doiirid. which is the same as the riin.vimitni cvclc rnenr? [SI for single We therefore see that by obtaining the pair ( r n , , c p ) (in the
rate graphs. The constraints are used for determining the feasibility
of different schedules of the system, where a schedule consists
example of Fig. 2 . cp = t i +
t o and T T L ~= l), we can derive
the constraints for the system without needing to know the internal
of an ordering of the vertices on resources that can provide the construction of B.
required functionality. We can understand the constraint time as follows: if we have
a SISO system with an input data stream z ( n ) and an output data
3. THE HIERARCHICAL TIMING PAIR MODEL stream :r/( n ) = 0.5 x z( n - l),the constraint time through the sys-
tem is the time difference between the arrival of z(0) on the inpul.
Having identified the requirements of a timing model and the short- edge and the appearance of y(0) on the corresponding output edge.
comings of the existing model. we can now use Fig. 2 to illustrate This is very similar to the definition of pnirwise latencies in [I]. 11:
the ideas behind the new model for timing. In this figure, we use is obvious that y(0) can appear on its edge before x ( 0 ) ,since ?/(0)
t - to refer to the propagation delay of a block. and 2- to refer to depends only on z(-1) which (if we assume that the periodicity
the stcirt rime of the block. T is the iteration interval (clock period of the data extends backwards as well as forwards) would have
for the delay elements). appeared exactly T before z(0). So the constraint time through
To provide timing information for a complex block, we should this system is t , - T , where t , is the propagation delay of the!
be able to emulate the timing characteristics that this block would unit doing the multiplication by 0.5 and T is the iteration period of
V-368
Table 1 : Tests for dominance of a path.
V-369
#timingpairs I1 2 3 4 5 timing pairs than systems where the delay elements are restricted
# circuits 121 13 5 4 1 to a relatively small m o u n t of feedback.
V-370