
On The Complexity Of Power Estimation Problems

2000


Ana T. Freitas, IST-INESC, Lisbon, Portugal
Horácio C. Neto, IST-INESC, Lisbon, Portugal
Arlindo L. Oliveira, IST-INESC/CEL, Lisbon, Portugal

Abstract

Although many algorithms for power estimation have been proposed to date, no comprehensive results have been presented on the actual complexity of power estimation problems. In this paper we select a number of relevant problems and show that they belong to classes of high complexity. In particular we show that: a) for a given combinational circuit, the decision version of the peak power estimation problem is NP-complete; b) average power estimation for combinational circuits is #P-complete under the uniform input distribution; c) the decision version of the sequential power estimation problem is PSPACE-complete. We also address a number of related problems and conclude that even approximating some of these problems is NP-hard.

1 Introduction

Extensive research has been done on algorithms for power estimation of CMOS circuits. A number of problems like peak power estimation and average power dissipation in combinational and sequential circuits have been addressed using different methods. Although there exists considerable agreement that efficient (i.e., guaranteed polynomial time) algorithms do not exist for the majority of these problems, no specific analysis of their actual complexity has been published. This work addresses this lack of specific results, and shows, using a relatively straightforward analysis, that several power estimation problems belong to classes of very high complexity.

The first problem of interest we consider is the problem of peak power estimation. We show in section 3 that the decision version of this problem is NP-complete.

In sections 4 and 5, we consider the more general problem of power estimation of combinational and sequential circuits. Existing algorithms for power estimation can be divided into two categories: simulation based methods and probabilistic methods.

Simulation based methods work by simulating a given input trace. The complexity of simulation based power estimation methods is easily seen to be polynomial in the size of the trace and in the size of the circuit. However, an accurate characterization of the operating conditions of the circuit requires a trace that can be exponentially large in the size of the circuit, thereby making the complexity of these methods effectively exponential in the size of the circuit. To avoid this exponential blowup without incurring a significant loss of precision, approximate methods like sequence compaction [8] and sequence synthesis [7] have been proposed.

The alternative to this approach is the family of probabilistic power estimation methods. They do not explicitly require a trace and, in principle, can be used to compute the power dissipation of circuits without the need for an exponentially long description of the input stimuli. These methods usually assume simplified models of temporal and spatial correlations. One common simplification is the assumption that the primary inputs are uncorrelated in time and space. Both exact [5] and approximate [2] algorithms for this problem have been proposed. Methods that use a more accurate modeling of spatial and temporal correlations have also been presented. One such method [9] models the pairwise spatial correlations of the primary inputs and propagates them through the circuit. A different approach that is able to take into account the full set of temporal and spatial correlations at the inputs has also been proposed [3].

Experimentally, it has been observed that all statistical power estimation methods become computationally very expensive as the size of the circuit grows. In sections 4 and 5 we show that the problem of power estimation in combinational and sequential circuits is indeed inherently complex, and that this complexity cannot be avoided by the use of statistical power estimation methods. In fact, in section 6, we show, using a simple argument, that no efficient approximation algorithms are likely to exist.

2 Definitions

2.1 Combinational and sequential circuits

We will use the following abstract model for combinational and sequential circuits. A combinational circuit will be represented by a directed acyclic graph (DAG). Associated with each node of the graph is a variable, $y_i$, and a representation of the logic function, $f_i$, that defines its value. There is a directed edge $e_{ij}$ from $y_i$ to $y_j$ if $f_j$ depends explicitly on $y_i$ or on its complement $\overline{y_i}$. For the purposes of power estimation, we associate with each node a capacitance value, $c_i$, and we assume, in all cases, the zero delay model.

A (synchronous) sequential circuit is composed of a combinational circuit and a set of latches, $l_i$. A subset of the primary inputs of the combinational circuit is connected to the outputs of the latches, and a subset of the primary outputs is connected to the inputs of the latches.

2.2 Power dissipation in CMOS circuits

Most present day circuit implementations use CMOS logic. For CMOS circuits, the power dissipated by a circuit can be approximated, to a first order, by

$P = V_{DD}^2 f \sum_i C_i T_i$    (1)

where $V_{DD}$ is the voltage of the power supply of the circuit, $f$ is the clock frequency at which it operates, $C_i$ the capacitance of node $i$ and $T_i$ the number of transitions in node $i$.

For the purposes of simplifying the discussion, we will normalize expression 1 and assume that $V_{DD}^2 f$ equals 1. The computation of dissipated power therefore reduces to the summation in this expression, of which the significant effort is the computation of $T_i$, the actual number of transitions in node $i$ for the given operating conditions of the circuit.

2.3 Input distributions

In general, one is interested in the power dissipated by a circuit under specific input sequences, since the actual values present at the inputs of the circuit strongly affect the power dissipation. Since many interesting input sequences can only be specified by exponentially long traces, one alternative possibility is to specify the values at the inputs through a specification of their statistics, or through some other compact description. Two interesting examples are the uniform distribution and binary code. In the first case, we specify that the input bits are uncorrelated in time and space, and, therefore, that all transitions between the input words are equally likely. In the second case, we specify that the input words are presented in a sequence that follows a standard binary code. Note that each of these two conditions would require an exponentially long description if input traces were to be used.

Many other exponentially long input sequences can be specified using appropriate description languages. One such language was proposed in [3], but, in general, we will not be concerned with the particular description language used to specify the input sequence. All the results in the following sections are independent of the particular method used to describe the input sequences, although we believe the demonstrations could be easily adapted to handle most description languages that are powerful enough to describe exponentially long input sequences.

2.4 Complexity classes

In the discussion that follows, we assume the reader is familiar with elementary complexity theory and related basic notions such as polynomial reducibility and completeness. In particular, we assume the reader to be familiar with the classes of decision problems P and NP, which stand for the classes of decision problems that can be solved in polynomial time with deterministic and nondeterministic Turing machines, respectively.

Figure 1 illustrates the containment relationships between the complexity classes that are referred to in this paper. The figure indicates the relationships within the polynomial hierarchy, PH, and the containment relationships between this hierarchy and higher complexity classes.

Figure 1: Complexity classes. (Containment diagram showing P, NP, co-NP, the levels of the polynomial hierarchy PH, $P^{PP}$, PSPACE = NPSPACE and EXP.)

We will use the commonly accepted convention of using #P (number-P) to represent an important class of enumeration problems. An enumeration problem $\Pi$ belongs to #P if there is a nondeterministic algorithm such that, for every instance $I \in D_\Pi$ of the problem, the number of distinct guesses that lead to the acceptance of $I$ is exactly the number of solutions of $I$, and each of the solutions can be checked in polynomial time in the length of the instance description [4].

Since it is not possible to directly compare #P, a class of functions, to PH, a class of languages, we take the class PP of problems asking whether more than half of the computations of a nondeterministic machine are accepting. This class is closely related to #P. Toda's theorem [12] states that

$PH \subseteq P^{PP}$    (2)

Note that $P^{PP}$ is no simpler than #P, since knowing the full number cannot be simpler than knowing the most significant bit.

Finally, we will show that some power estimation problems belong to PSPACE, the class of all decision problems that can be solved using polynomial space.

3 Peak power estimation

In this section, we prove that the decision problem associated with the problem of peak power estimation of combinational circuits is NP-complete. Given the normalizing assumption from the previous section ($V_{DD}^2 f = 1$), the peak power $P_{peak}$ is the maximum amount of capacitance that can switch in any given input transition. In practical terms, this corresponds to the maximum energy dissipated in any transition, divided by the clock period.

Consider the following power estimation problem, PEAK-POWER. This problem, stated as a decision problem, is:

INSTANCE: A combinational circuit $N$, a set of node capacitances $S = \{c_1, c_2, \ldots, c_m\}$ and a power value $P$.
QUESTION: Is there any transition at the inputs such that the dissipated power of circuit $N$ is equal to or greater than $P$?

Theorem 1 PEAK-POWER is NP-complete.

Proof: It is easy to verify that PEAK-POWER is in NP. A nondeterministic algorithm for it need only guess a particular input transition and show, by simulating the circuit in polynomial time, that this particular input transition leads to power dissipation equal to or greater than $P$. Thus the first of the two requirements for NP-completeness is met. To prove that it is NP-hard, we will transform a known NP-complete problem [1], SAT, to PEAK-POWER.

Consider an instance of SAT, and build the equivalent circuit, where an OR gate corresponds to each clause in the SAT instance, and an AND gate computes the conjunction of all clauses. Let the circuit node at the output of the AND gate be node $O$, as shown in figure 2. The capacitance $C$ at the output node $O$ has the same value as the power $P$, and the capacitance of all OR gates is set to 0.

Figure 2: Transformation from SAT to PEAK-POWER. (Circuit with inputs $i_1, i_2, \ldots, i_n$ and output node $O$ with capacitance $C$.)

Now, by running the algorithm for PEAK-POWER in this instance of the problem, it is possible to answer the original SAT problem in the following way:

• If the answer is YES, then there exists at least one transition in node $O$. This shows that there exists one input combination that sets $O$ to 1, thereby answering positively the original SAT question.
• If the answer is NO, either node $O$ is always at 1 or always at 0. The answer to the SAT problem can be obtained by simulating an arbitrary input combination. If it drives $O$ to 1, the answer to SAT is YES. If it drives $O$ to 0, the answer to SAT is NO.

Therefore, SAT ∝ PEAK-POWER, and the theorem is proved. □

4 Average power estimation

We consider now the general problem of average power estimation of combinational circuits under specific input conditions. Clearly, when a polynomial size input trace that matches the desired conditions is available, any simulation based method represents a polynomial time algorithm for the problem. The difficulty lies in the fact that, for many operating conditions of interest, the trace description is, in itself, exponential in the size of the circuit description.

In this section, we will analyze two power estimation problems: average power dissipated when binary code is presented at the inputs of the circuit and average power dissipated under the uniform distribution of inputs.

4.1 Average Power For Binary Input Code

In this section we will consider a specific input sequence, although the result can be easily generalized for many other conditions. In particular, we will consider the input sequence that is defined by a binary code at the inputs. This problem, BIN-POWER, can be formalized in the following way:

INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = \{c_1, c_2, \ldots, c_m\}$.
QUESTION: What is the average power dissipated in the circuit when binary code is presented at the inputs?

Given the normalizing assumption from section 2, the average power can be computed by counting the number of transitions in the nodes in the circuit. We will then consider the following counting problem, #BIN-TRANSITIONS:

INSTANCE: A combinational circuit $N$ and a node $O$.
QUESTION: What is the number of transitions in node $O$ when binary code is presented at the inputs?

Theorem 2 #BIN-TRANSITIONS is #P-complete.

Proof: Clearly, #BIN-TRANSITIONS is in #P, since the claim that a particular input transition causes a transition in node $O$ can be verified in polynomial time. To prove that #BIN-TRANSITIONS is #P-hard, we will transform #SAT, a known #P-hard problem [10], to #BIN-TRANSITIONS. For that, we build the circuit of figure 3. In this circuit, we added an extra input $z$ that forces the output of the AND gate to become 0. This extra input helps in relating the number of transitions to the number of satisfying assignments.

Figure 3: Transformation from #SAT to #BIN-TRANSITIONS. (Circuit of figure 2 with inputs $i_1, i_2, \ldots, i_n$, the extra input $z$ and output node $O$.)

From an analysis of this circuit, it is easy to see that the number of transitions in node $O$ when binary code is presented at the inputs (with variable $z$ seen as the least significant bit) is exactly twice the number of satisfying assignments to the original SAT instance: since $z$ toggles between consecutive code words, $O$ rises and then falls once for each satisfying assignment. This proves that #SAT can be transformed to #BIN-TRANSITIONS and therefore that #BIN-TRANSITIONS is #P-complete. □

It should be clear that #BIN-TRANSITIONS is just a restricted version of BIN-POWER, where all capacitances of the nodes are zero except for node $O$. Therefore, the following result follows immediately:

Corollary 1 BIN-POWER is #P-complete.

4.2 Average Power For Uniform Input Distribution

An assumption commonly used by statistical power estimation methods is that the input bits are uncorrelated in time and space. This is effectively equivalent to the assumption that all transitions between input words are equiprobable, and, from a practical point of view, is supposed to simplify the power estimation process.

In this section, we show that this problem has the same complexity as the problem addressed in the previous section. Consider, then, the UNIF-POWER problem:

INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = \{c_1, c_2, \ldots, c_m\}$.
QUESTION: What is the average power dissipated in the circuit for a sequence of inputs where all possible transitions appear an equal number of times?

Before we address the complexity of this problem, we prove that a restricted version of it, formulated as a decision problem, is NP-complete. This problem, UNIF-POWER-GT0, stated as a decision problem, is:

INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = \{c_1, c_2, \ldots, c_m\}$.
QUESTION: Does circuit $N$ dissipate power for a sequence of inputs where all possible transitions appear an equal number of times?

Theorem 3 UNIF-POWER-GT0 is NP-complete.

Proof: It is easy to see that UNIF-POWER-GT0 is in NP. A nondeterministic algorithm needs only guess a particular input transition and show, by simulating the circuit in polynomial time, that this particular input transition leads to power dissipation in the circuit. To prove that it is NP-hard, we again transform SAT to UNIF-POWER-GT0. Given an instance of SAT, we build a circuit as in section 3. Now, by running the algorithm for UNIF-POWER-GT0 in this instance, we answer the SAT question following a procedure similar to the one used for PEAK-POWER:

• If the circuit dissipates power, the answer to the original SAT question is YES, since node $O$ went to 0 and to 1 at least once.
• If the circuit does not dissipate power, either node $O$ is always at 1 or always at 0. Again, the answer to the SAT problem can be obtained by simulating an arbitrary input combination. If it drives $O$ to 1, the answer to SAT is YES. If it drives $O$ to 0, the answer to SAT is NO.

Therefore, SAT ∝ UNIF-POWER-GT0, and the theorem is proved. □

We will now prove that UNIF-POWER is also #P-complete. To achieve this, we first need to define a code that has a distribution of input transitions as desired. We will call this code min-unif-code. It is easy to show that, for a circuit with $n$ inputs, there exists a code with $2^n (2^n - 1) + 1$ words that has one and exactly one transition between each possible pair of input words (footnote 1). After defining this code, the proof proceeds in a similar way, although the result is slightly more involved.

Footnote 1: For space reasons, we omit the description of how such a code can be obtained, but, as an example, for $n = 2$, such a code could be 00,01,10,11,10,01,11,01,00,10,00,11,00.

Again, UNIF-POWER can be solved by counting the transitions in the nodes. This leads us to define the following counting problem, #UNIF-TRANSITIONS:

INSTANCE: A combinational circuit $N$ and a node $O$.
QUESTION: What is the number of transitions in node $O$ when min-unif-code is presented at the inputs?

Theorem 4 #UNIF-TRANSITIONS is #P-complete.

Proof: Again, it is clear that #UNIF-TRANSITIONS is in #P. To prove that #UNIF-TRANSITIONS is #P-hard, we will again transform #SAT using the circuit of figure 3. A simple combinatorial analysis shows that, if min-unif-code is presented at the inputs, the solution to #SAT, $m$, is related to the solution to #UNIF-TRANSITIONS, $u$, by:

$m = 2^n - \sqrt{2^{2n} - 2u}$    (3)

Therefore, a solution to #SAT can be obtained in polynomial time from a solution to #UNIF-TRANSITIONS, showing that #UNIF-TRANSITIONS is #P-complete. □

Corollary 2 UNIF-POWER is #P-complete.

It is interesting to note that theorem 2 could also be easily proved by finding a parsimonious transformation (footnote 2) from the search version of SAT to the search version of PEAK-POWER. However, such a parsimonious transformation is not so easily obtainable for the problem of UNIF-TRANSITIONS, so a direct reduction from #SAT to #UNIF-TRANSITIONS represents a simpler demonstration of the main result.

Footnote 2: A transformation from a search problem $\Pi$ to a search problem $\Pi'$ is parsimonious if it preserves the number of solutions.

5 Power estimation for sequential circuits

In this section, we will study the problem of power estimation in sequential circuits. More specifically, we will study the complexity of a decision version of the problem, SEQ-POWER-GT0:

INSTANCE: A sequential circuit $N$ and a set of node capacitances $S = \{c_1, c_2, \ldots, c_m\}$.
QUESTION: Is there a sequence of inputs that makes circuit $N$ dissipate power?

From a practical point of view, the added difficulty of this problem lies in the sequential elements present, which create temporal correlations that are expensive to compute. A number of algorithms for computing the exact correlations have been proposed, but all the exact algorithms have been observed to rapidly become very inefficient when large circuits are under consideration.

We will show that, indeed, the problem is intrinsically difficult. At first analysis, it is not even obvious that SEQ-POWER-GT0 is in PSPACE, since a simple-minded traversal of the state transition graph (STG) requires exponential space, whether it is performed depth-first or breadth-first. Nonetheless, we have the following result:

Theorem 5 SEQ-POWER-GT0 is PSPACE-complete.

Proof: To prove that SEQ-POWER-GT0 is in PSPACE, we notice that a non-deterministic algorithm can exercise all transitions using $O(R)$ space, where $R$ is the number of registers in the circuit. Given the well known result of Savitch [11], a non-deterministic algorithm can be converted to a deterministic one that uses $O(R^2)$ space (footnote 3) and, therefore, SEQ-POWER-GT0 is in PSPACE. This conclusion also follows directly from Savitch's result that PSPACE = NPSPACE.

Footnote 3: The algorithm that traverses the STG using only $O(R^2)$ space is based on Savitch's construction.

To prove that it is PSPACE-hard, we will transform the output of finite memory programs (OUT-FIN-PROG) to SEQ-POWER-GT0. OUT-FIN-PROG (footnote 4), a known PSPACE-complete problem [6] [4], is defined as follows:

INSTANCE: Finite set $X$ of variables, finite alphabet $\Sigma$, a program $P$, defined by a sequence $I_1, I_2, \ldots, I_m$ of instructions of the form read $x_i$, write $v_j$, if $v_j = v_k$ goto $I_l$, accept or halt, where each $x_i \in X$, each $v_j \in X \cup \Sigma \cup \{\$\}$ and $I_m$ is either halt or accept.
QUESTION: Is there a string $w \in \Sigma^*$ such that program $P$ generates any output?

Footnote 4: This problem is a special case of the INEQUIVALENCE OF FINITE MEMORY PROGRAMS problem, where program 2 is a fixed program with no write instructions and hence no output.

We will consider only finite programs where the input alphabet $\Sigma$ is $\{0, 1\}$, since OUT-FIN-PROG remains PSPACE-complete even in that case [6]. To transform OUT-FIN-PROG into SEQ-POWER-GT0, we observe that the behavior of a finite program $P$ can be simulated by a sequential circuit with $\log_2(|X|(|\Sigma| + 1)) + \log_2 m$ latches. The combinational logic can also be easily constructed in polynomial time from the structure of the program.

Now, to generate the output, we will create a node $O$ in the sequential circuit that will have value 1 whenever a write instruction is executed (and, therefore, output is being generated), and that will have value 0 when no output is being generated.

Given this sequential circuit, we now define $C_i$ to be equal to 0 for all nodes except node $O$. Now, by running the algorithm for SEQ-POWER-GT0 in this sequential circuit, we can answer the question for OUT-FIN-PROG in the following way:

• If the answer is YES, then the answer to the original OUT-FIN-PROG is also YES.
• If the answer is NO, simulate the circuit with an arbitrary input sequence. If node $O$ is always at 1, then the answer is YES. Otherwise, the answer is NO.

Therefore, OUT-FIN-PROG ∝ SEQ-POWER-GT0, and the theorem is proved. □

6 Approximation algorithms and related problems

From the transformations presented in sections 3 to 5, it should be clear that approximation algorithms for some of the problems addressed are unlikely to exist. For space reasons, we will not address this question extensively. However, we note that the reductions presented for PEAK-POWER and UNIF-POWER-GT0 immediately prove that efficient approximation algorithms for the related estimation problems (footnote 5) do not exist unless P = NP. In fact, for PEAK-POWER and UNIF-POWER, any approximation algorithm that generates a value $P'$ in the interval $[P(1 - \epsilon), P(1 + \epsilon)]$ (where $P$ is the exact value of dissipated power and $0 < \epsilon < 1$) would enable us to solve SAT. It should now be clear that similar results could also be proved for other versions of power estimation problems.

Footnote 5: The related estimation problems ask for the actual value of the peak or average power.

We should also point out that the hardness of the power estimation problems we addressed is not caused by the artificial concentration of node capacitances in a single node, used in the proofs. It is easy to show that even if all the node capacitances have equal values, the complexity of the problems addressed remains the same.

7 Conclusions

We addressed the complexity of a number of power estimation problems and proved that they belong to classes of high complexity.

In particular, we proved that the decision version of the peak power estimation problem for combinational circuits is NP-complete.

For power estimation problems, we have shown that exact computation of the average power dissipated by combinational circuits under the uniform input distribution is #P-complete. The same is true for power estimation of combinational circuits when the input sequence is binary code. Furthermore, we believe that this complexity analysis is easily extensible to a large number of operating conditions that involve exponentially large input sequences. This result is remarkably strong, since #P-complete problems are known to be no simpler than any problem in the polynomial hierarchy.

Finally, we proved that the decision problem associated with power estimation in sequential circuits is PSPACE-complete, a strong but not surprising result given known related results in sequential circuit analysis and synthesis.

Based on these results, we also argued briefly that approximation algorithms for the majority of interesting problems in power estimation do not exist unless P = NP.

To the best knowledge of the authors, these results represent the first comprehensive overview of the complexity of the analysis of dissipated power in combinational and sequential circuits.

References

[1] S. A. Cook. The complexity of theorem-proving procedures. In Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, pages 151–158, 1971.
[2] J. Costa, J. Monteiro, and S. Devadas. Switching Activity Estimation using Limited Depth Reconvergent Path Analysis. In Proceedings of the International Symposium on Low Power Electronics and Design, pages 184–189, August 1997.
[3] A. T. Freitas, A. L. Oliveira, J. C. Monteiro, and H. C. Neto. Exact power estimation using word level transition probabilities. In Proceedings of the Ninth International Workshop on Power and Timing Modelling, Optimization and Simulation, pages 355–364, Kos Island, Greece, October 1999.
[4] Michael R. Garey and David S. Johnson. Computers and Intractability. Freeman, New York, 1979.
[5] A. Ghosh, S. Devadas, K. Keutzer, and J. White. Estimation of Average Switching Activity in Combinational and Sequential Circuits. In Proceedings of the 29th Design Automation Conference, pages 253–259, June 1992.
[6] Neil D. Jones and Steven S. Muchnick. Even simple programs are hard to analyze. Journal of the ACM, 24(2):338–350, April 1977.
[7] D. Marculescu, R. Marculescu, and M. Pedram. Stochastic sequential machine synthesis targeting constrained sequence generation. In 33rd Design Automation Conference (DAC'96), pages 696–701, New York, June 1996. Association for Computing Machinery.
[8] D. Marculescu, R. Marculescu, and M. Pedram. Hierarchical sequence compaction for power estimation. In 34th Design Automation Conference (DAC'97), pages 570–575, New York, June 1997. Association for Computing Machinery.
[9] R. Marculescu, D. Marculescu, and M. Pedram. Efficient Power Estimation for Highly Correlated Input Streams. In Proceedings of the 32nd Design Automation Conference, pages 628–634, June 1995.
[10] C. Papadimitriou. Computational Complexity. Addison Wesley, 1994.
[11] W. J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[12] Seinosuke Toda. On the computational power of PP and ⊕P. In 30th Annual Symposium on Foundations of Computer Science, pages 514–519, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.
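
As an illustration of the reductions used in the proofs of Theorems 1 and 2, the following Python sketch builds the OR/AND circuit from a CNF formula, answers SAT through a PEAK-POWER oracle, and checks that the number of transitions in node O under binary code is twice the number of satisfying assignments. It is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, clauses are encoded as lists of signed integers, the PEAK-POWER oracle is a brute-force stand-in (exponential time), the extra input z is assumed to gate O as f(x) AND z, and the binary code is assumed to be cyclic so that the last word is followed by the first.

```python
from itertools import product

# Assumed encoding: a CNF formula is a list of clauses, each clause a list of
# non-zero ints; k stands for variable k, -k for its negation (1-based).

def eval_o(cnf, bits):
    """Value of node O: AND over all clauses of the OR of their literals."""
    return int(all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
                   for clause in cnf))

def count_sat(cnf, n):
    """Number of satisfying assignments (the #SAT answer), by enumeration."""
    return sum(eval_o(cnf, bits) for bits in product((0, 1), repeat=n))

def peak_power(cnf, n, cap_o):
    """Brute-force stand-in for a PEAK-POWER solver on the Section 3 circuit.

    Only node O has non-zero capacitance, so the peak power is cap_o when
    O can switch (it takes both values over the inputs) and 0 otherwise.
    """
    values = {eval_o(cnf, bits) for bits in product((0, 1), repeat=n)}
    return cap_o if len(values) == 2 else 0

def sat_via_peak_power(cnf, n, p=1):
    """Answer SAT using the PEAK-POWER decision oracle, as in Theorem 1."""
    if peak_power(cnf, n, cap_o=p) >= p:   # YES: O switches, so O reaches 1
        return True
    # NO: O is constant, so one arbitrary simulation settles the question.
    return eval_o(cnf, (0,) * n) == 1

def bin_transitions(cnf, n):
    """Transitions of O = f(x) AND z under cyclic binary code, z as LSB."""
    words = 2 ** (n + 1)

    def out(w):
        z, x = w & 1, w >> 1
        bits = [(x >> (n - 1 - i)) & 1 for i in range(n)]
        return eval_o(cnf, bits) & z

    return sum(out(w) != out((w + 1) % words) for w in range(words))

if __name__ == "__main__":
    cnf = [[1, 2], [-1, 3]]          # (x1 OR x2) AND (NOT x1 OR x3)
    print(sat_via_peak_power(cnf, 3))
    print(count_sat(cnf, 3), bin_transitions(cnf, 3))
```

The sketch enumerates all input words, so it only demonstrates the correspondence on small instances; the point of the reductions is that any efficient solver for PEAK-POWER or #BIN-TRANSITIONS could replace the brute-force functions.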