On Timing Closure: Hold-Violation Removal Using Insertion of Buffers, Inverters and Delay Cells
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Similarly, the hold slack with respect to the hold-time constraints at pin p is given by

Hold_slack = hold_actual_arrival_time – hold_required_time

The timer engine provides the required time and the arrival time. Negative setup slack indicates a setup violation, and negative hold slack indicates a hold violation. For timing closure, the design must have no timing violations. Over all the pins in PO, let TNS be the total negative setup slack and THS be the total negative hold slack. Hold violations are fixed after the setup violations have been optimized, and the total negative setup slack must be maintained while the hold violations are being optimized. When buffers, inverters and delay cells are inserted as delay elements to fix hold violations, they can increase the power consumption and area of the design. The delay insertion problem for hold-violation removal is therefore stated as follows: given a circuit-level design and a standard cell library, find a delay solution that reduces both the total negative hold slack and the cost of the inserted delay elements (area and power consumption), while the total negative setup slack is maintained.

In this paper, the implemented approach uses buffers, inverters and delay cells as delay elements. Over-buffering increases power consumption and area, so it should be avoided. Other techniques for fixing hold violations include cell sizing and logical restructuring. The implemented approach gives better results when buffers, inverters and delay cells are used as delay elements together with the industrial hold optimization technique.

An industrial timing engine is required to provide timing information such as the cell delays, the required arrival times, and the actual arrival times of the pins with respect to the corresponding constraints (i.e., setup constraints for the setup slack and hold-time constraints for the hold slack). The cell delay model is a lookup table indexed by slew and load capacitance; the slew itself is also taken from a slew lookup table.

III. LINEAR PROGRAMMING OPTIMIZATION

Given a violated path, a suitable combination of delay elements must be selected to meet the hold requirement of the path. In this work, a standard circuit-level design is taken as the reference. For the violated paths, the hold time is computed by linear programming optimization, and timing closure is achieved with various combinations of buffers, inverters and delay cells. Discrete buffers, complex timing constraints and accurate timing models/analysis make the problem time-consuming and difficult to solve. A linear programming-based methodology is presented to model the setup and hold-time constraints. Based on the solution of the linear program, buffers, inverters and delay cells are then inserted as delay elements to fix the hold violations.

The input to the linear programming model is a combinational circuit C∗ such that every pin p of C∗ has hold slack less than zero and setup slack greater than zero. To fix hold violations, the pins with negative hold slack, which represent the hold-violated paths, must be considered, and the total negative setup slack has to be maintained. It is therefore natural to allow delay insertion only at pins whose setup slack is greater than zero. Hence, C∗ is obtained from a combinational circuit by extracting the pins with hold slack less than zero and setup slack greater than zero.

Setup time is the time for which the data must remain stable at the flop input before the active clock edge. When two flops communicate with each other, setup analysis is based on the clock period at which they operate, the combinational delay between them, the transition time of the launch flop, and the launch and capture times. This is expressed by the following relations:

$T_{cq} + T_{comb} + T_{launch} \le T_{clockperiod} + T_{capture} - T_{setup} - 2T_{\Delta j}$

$\text{Skew} = T_{capture} - T_{launch}$

where $2T_{\Delta j}$ is the jitter, $T_{comb}$ is the combinational delay, and $T_{cq}$ is the delay of the flop from clock pin to output pin.

The right-hand side of the relation is the required time, which is determined by the clock path, and the left-hand side is the arrival time, which is the time the data takes to travel through the data path.

The difference between the required arrival time and the actual arrival time of the data is called the setup slack. It should remain positive, which indicates that the data is captured without loss. Skew is the difference between the capture and launch paths.

Hold time is the time for which the input data must not change after the active clock edge. Hold analysis is made according to the timing path and data path between the flops. Except for the clock period, all the parameters above are considered for hold analysis.

$T_{cq} + T_{comb} + T_{launch} \ge T_{hold} + T_{capture} + 2T_{\Delta j}$

The difference between the arrival time and the required time is called the hold slack. The hold slack should remain positive.

Figure 1: Delay insertion: a) place delay cell at the cell X output, b) place delay cell at one of the cell Z inputs.

Let $d_{i,b}$ be the delay of delay element b (a buffer, inverter or delay cell) when it is inserted across pin i. For any source/sink cell C of pin i, let $d_b^C$ be the change in the delay of cell C before and after inserting delay element b at pin i. In Figure 1a), delay cell b is placed at the output pin i of cell X, so the delay introduced by this element is $d_{i,b} + d_b^X + d_b^Y + d_b^Z$. In Figure 1b), delay cell b is placed at input i of cell Z, so the overall delay introduced by this element is $d_{i,b} + d_b^X + d_b^Z$. The overall delay has to be computed separately for the hold-time constraints and the setup constraints: hd is used for the hold-time constraints and sd for the setup constraints in the following equation. $HID_{i,b}$ represents the overall hold delay of placing delay cell b across pin i, and it is given by:
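As a hedged illustration of the quantities used in this section (not the authors' formulation), the Python sketch below computes setup and hold slack from the two timing relations, selects the pins that qualify for delay insertion (hold slack below zero and setup slack above zero), and adds up the delay terms listed for Figure 1a) and 1b). It assumes the usual convention that setup time and jitter reduce the setup required time. The class, the function names and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PathTiming:
    """Timing terms of one launch/capture path, all in ns (hypothetical values)."""
    t_cq: float       # clock-pin-to-output delay of the launch flop
    t_comb: float     # combinational delay between the flops
    t_launch: float   # launch clock path delay
    t_capture: float  # capture clock path delay
    t_setup: float    # setup time of the capture flop
    t_hold: float     # hold time of the capture flop
    jitter: float     # the 2*T_dj term
    t_clk: float      # clock period


def arrival_time(p: PathTiming) -> float:
    return p.t_cq + p.t_comb + p.t_launch


def setup_slack(p: PathTiming) -> float:
    # Assumption: required time = period + capture - setup - jitter.
    required = p.t_clk + p.t_capture - p.t_setup - p.jitter
    return required - arrival_time(p)


def hold_slack(p: PathTiming) -> float:
    # Hold slack = arrival time - required time (no clock period term).
    required = p.t_hold + p.t_capture + p.jitter
    return arrival_time(p) - required


def insertion_candidates(pins: dict) -> list:
    """Pins with hold slack < 0 and setup slack > 0, i.e. the pins kept in C*."""
    return [name for name, p in pins.items()
            if hold_slack(p) < 0 and setup_slack(p) > 0]


def overall_delay(d_element: float, neighbour_changes: list) -> float:
    """Element delay plus the delay changes of the affected source/sink cells."""
    return d_element + sum(neighbour_changes)


pins = {
    "p1": PathTiming(0.10, 0.05, 0.20, 0.30, 0.08, 0.06, 0.02, 4.0),
    "p2": PathTiming(0.10, 0.50, 0.20, 0.30, 0.08, 0.06, 0.02, 4.0),
}
candidates = insertion_candidates(pins)

# Figure 1a): element at the output of X touches cells X, Y and Z.
delay_1a = overall_delay(0.04, [0.010, 0.005, 0.002])
# Figure 1b): element at one input of Z touches cells X and Z only.
delay_1b = overall_delay(0.04, [0.010, 0.002])
```

In this sketch only `p1` is a candidate, because its data path is short enough to violate hold while still leaving setup margin; placing the element at one cell Z input changes the delay of fewer cells than placing it at the cell X output.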
International Journal of Recent Technology and Engineering (IJRTE)
ISSN: 2277-3878, Volume-7 Issue-4, November 2018
sudden jump of standard cell utilization is bad (i.e., the increase in utilization from one stage to the next should be less than ~5%), and utilization should be below 60% after the route stage so that circuitry can be added in future. Slack: WHS should be minimal for the violating paths in order to meet timing closure at the ECO (Engineering Change Order) stage. Our main aim is to achieve minimum utilization with minimum hold slack value.

Table 1: Design specifications

Design type                      Block (tile), rectangular in shape
Tile dimensions                  Width = 297.92 µm, Height = 1948.52 µm
Standard cells count             14541
Total standard cell area         124789.52 µm²
Number of nets                   15879
Number of pins                   59043
Number of primary input ports    798
Number of primary output ports   682
Clock period                     4 ns
Initial utilization              21.496 %

The experiment is done on an industrial design with the specifications shown in Table 1. Optimization techniques are used in each step of the design flow to meet the desired timing constraints.

Table 2: Timing summary and initial statistics of the industrial design

Hold mode         All       Reg2reg   Reg2cgate   Default
WHS (ns)          -0.25     -0.14     -0.05       -0.25
THS (ns)          -374.33   -369.04   -1.69       -3.61
Violating paths   5241      5155      70          16
All paths         12763     7680      200         8143

The timing summary and initial timing statistics of the design are shown in Table 2. The initial standard cell utilization is 21.496%. Our main focus is on the reg2reg paths, which are the register-to-register paths. A reg2cgate path is a path from the clock pin to the register in which additional logic circuitry is added to reduce the power consumption of the design.

Table 3: Results of the industrial hold optimization using buffers
Utilization of the design is 70.057%

Hold mode         All       Reg2reg   Reg2cgate   Default
WHS (ns)          -0.41     -0.13     0.26        -0.41
THS (ns)          -6.24     -3.13     0           -3.11
Violating paths   45        31        0           14
All paths         12763     7680      200         8143

The results of the industrial hold optimization using buffers are shown in Table 3; the design has a utilization of 70.057%. Table 3 shows that the register-to-register paths have a worst hold slack of -130 ps, and that 31 reg2reg paths and 14 default paths are violated, which gives a total hold slack of -6.24 ns.

Table 4: Results of the implemented approach after CTS stage
Utilization of the design is 39.623%

Hold mode         All       Reg2reg   Reg2cgate   Default
WHS (ns)          -0.13     -0.13     0.25        0.03
THS (ns)          -11.84    -11.84    0           0
Violating paths   136       136       0           0
All paths         12763     7680      200         8143

The results of the implemented approach after the Clock Tree Synthesis (CTS) stage, using buffers, inverters and delay cells together with the industrial hold optimization flow, are shown in Table 4; the design has a utilization of 39.623%.

The results of the implemented approach after the routing stage, using buffers, inverters and delay cells together with the industrial hold optimization flow, are shown in Table 5; the design has a utilization of 39.685%.

Table 5: Results of the implemented approach after routing stage
Utilization of the design is 39.685%

Hold mode         All       Reg2reg   Reg2cgate   Default
WHS (ns)          -0.09     -0.09     0.18        0.29
THS (ns)          -0.15     -0.15     0           0
Violating paths   2         2         0           0
All paths         12763     7680      200         8143

Compared to Table 3, in which only buffers are used together with the industrial hold optimization, the worst reg2reg hold slack improves by about 31% in Table 5, where buffers, inverters and delay cells are used as delay elements (i.e., the violation drops from 130 ps to 90 ps). Table 3 has 45 violating paths, whereas the implemented approach leaves only 2 violating paths with a total hold slack of -150 ps. Furthermore, the utilization of the design in the implemented approach is 39.685%, which is better and leaves room for additional logic circuitry in future.

VI. CONCLUDING REMARKS

In this paper, a standard industrial circuit-level design is taken to compare two methodologies. One method uses only buffers for optimization, and the other uses a combination of delay elements such as buffers, inverters and delay cells. The implemented method uses insertion of delay elements for timing closure. At first, a linear programming optimization is used to compute the amount of hold delay that is required to overcome
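The comparison between Table 3 and Table 5 can be re-derived with a few lines of arithmetic. The Python sketch below is only an illustrative check: the numbers are copied from the two tables, while the dictionary layout and the helper names are assumptions made for this example.

```python
# Values copied from Table 3 (buffers only) and Table 5 (implemented approach,
# after routing). WHS is the reg2reg worst hold slack; THS and the path counts
# come from the "All" column.
TABLE3 = {"whs_ns": -0.13, "ths_ns": -6.24, "violating_paths": 45,
          "utilization_pct": 70.057}
TABLE5 = {"whs_ns": -0.09, "ths_ns": -0.15, "violating_paths": 2,
          "utilization_pct": 39.685}

# Guideline quoted in the text: utilization below 60% after the route stage.
ROUTE_UTILIZATION_LIMIT_PCT = 60.0


def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction in magnitude when going from `before` to `after`."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)


def meets_route_utilization(result: dict) -> bool:
    return result["utilization_pct"] < ROUTE_UTILIZATION_LIMIT_PCT


whs_gain = reduction_pct(TABLE3["whs_ns"], TABLE5["whs_ns"])
paths_fixed = TABLE3["violating_paths"] - TABLE5["violating_paths"]

print(f"WHS violation reduced by about {whs_gain:.0f}%")  # the text reports about 31%
print(f"Violating paths removed: {paths_fixed}")
print(f"Table 3 within utilization guideline: {meets_route_utilization(TABLE3)}")
print(f"Table 5 within utilization guideline: {meets_route_utilization(TABLE5)}")
```

The same helper can be applied to the THS values or to the CTS-stage numbers in Table 4 to compare other stages of the flow.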