On The Complexity Of Power Estimation Problems
Ana T. Freitas
[email protected]
IST-INESC
Lisbon, Portugal
Horácio C. Neto
[email protected]
IST-INESC
Lisbon, Portugal
Abstract
Although many algorithms for power estimation have
been proposed to date, no comprehensive results have
been presented on the actual complexity of power estimation problems.
In this paper we select a number of relevant problems
and show that they belong to classes of high complexity.
In particular we show that: a) for a given combinational
circuit, the decision version of the peak power estimation
problem is NP-complete; b) average power estimation for
combinational circuits is #P-complete under the uniform
input distribution; c) the decision version of the sequential
power estimation problem is PSPACE-complete.
We also address a number of related problems and conclude that even approximating the solution of some of
these problems is NP-hard.
Arlindo L. Oliveira
[email protected]
IST-INESC/CEL
Lisbon, Portugal
1 Introduction
Extensive research has been done on algorithms for power estimation of CMOS circuits. A number of problems, such as peak power estimation and average power dissipation in combinational and sequential circuits, have been addressed using different methods. Although there is considerable agreement that efficient (i.e., guaranteed polynomial time) algorithms do not exist for the majority of these problems, no specific analysis of their actual complexity has been published.
This work addresses this lack of specific results and shows, using a relatively straightforward analysis, that several power estimation problems belong to classes of very high complexity.
The first problem of interest we consider is the problem of peak power estimation. We show in section 3 that the decision version of this problem is NP-complete.
In sections 4 and 5, we consider the more general problem of power estimation of combinational and sequential circuits.
Existing algorithms for power estimation can be divided into two categories: simulation based methods and probabilistic methods.
Simulation based methods work by simulating a given input trace. The complexity of simulation based power estimation methods is easily seen to be polynomial in the size of the trace and in the size of the circuit. However, an accurate characterization of the operating conditions of the circuit requires a trace that can be exponentially large in the dimension of the circuit, thereby making the complexity of these methods effectively exponential in the size of the circuit. To avoid this exponential blowup without incurring a significant loss of precision, approximate methods like sequence compaction [8] and sequence synthesis [7] have been proposed.
The alternative to this approach is the use of probabilistic power estimation methods. They do not explicitly require the existence of a trace and, in principle, can be used to compute the power dissipation of circuits without the need for an exponentially long description of the input stimuli. These methods usually assume simplified models of temporal and spatial correlations. One common simplification is the assumption that the primary inputs are uncorrelated in time and space. Both exact [5] and approximate [2] algorithms for this problem have been proposed. Methods that use a more accurate modeling of spatial and temporal correlations have also been presented. One such method [9] models the pairwise spatial correlations of the primary inputs and propagates them through the circuit. A different approach that is able to take into account the full set of temporal and spatial correlations at the inputs has also been proposed [3].
Experimentally, it has been observed that all statistical power estimation methods become computationally very expensive as the size of the circuit grows. In sections 4 and 5 we show that the problem of power estimation in combinational and sequential circuits is indeed inherently complex, and that this complexity cannot be avoided by the use of statistical power estimation methods. In fact, in section 6, we show, using a simple argument, that no efficient approximation algorithms are likely to exist.
2 Definitions
2.1 Combinational and sequential circuits
We will use the following abstract model for combinational and sequential circuits. A combinational circuit will be represented by a directed acyclic graph (DAG). Associated with each node of the graph is a variable, $y_i$, and a representation of the logic function, $f_i$, that defines its value. There is a directed edge $e_{ij}$ from $y_i$ to $y_j$ if $f_j$ depends explicitly on $y_i$ or on its complement $\bar{y}_i$. For the purposes of power estimation, we associate with each node a capacitance value, $c_i$, and we assume, in all cases, the zero delay model.
A (synchronous) sequential circuit is composed of a combinational circuit and a set of latches, $l_i$. A subset of the primary inputs of the combinational circuit is connected to the outputs of the latches, and a subset of the primary outputs is connected to the inputs of the latches.
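As an illustration of the abstract model above, the following minimal Python sketch represents a combinational circuit as a DAG of named nodes, evaluates it under the zero delay model, and sums the capacitance $c_i$ of the nodes that toggle in a single input transition. The names `Node`, `evaluate` and `switched_capacitance`, the dictionary layout and the example gates are assumptions made for illustration only; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    fanin: List[str]           # names of the nodes or inputs that f_i reads
    func: Callable[..., int]   # logic function f_i, returns 0 or 1
    cap: float = 0.0           # capacitance c_i (illustrative units)


def evaluate(circuit: Dict[str, Node], inputs: Dict[str, int]) -> Dict[str, int]:
    """Zero-delay evaluation; `circuit` must be listed in topological order."""
    values = dict(inputs)
    for name, node in circuit.items():
        values[name] = node.func(*(values[src] for src in node.fanin))
    return values


def switched_capacitance(circuit, before, after) -> float:
    """Sum of c_i over the nodes whose value differs between two input words."""
    v0, v1 = evaluate(circuit, before), evaluate(circuit, after)
    return sum(n.cap for name, n in circuit.items() if v0[name] != v1[name])


# Hypothetical example circuit: g = a OR b, o = g AND c
circuit = {
    "g": Node(["a", "b"], lambda a, b: a | b, cap=1.0),
    "o": Node(["g", "c"], lambda g, c: g & c, cap=2.0),
}
word0 = {"a": 0, "b": 0, "c": 1}
word1 = {"a": 1, "b": 0, "c": 1}
print(switched_capacitance(circuit, word0, word1))  # both g and o toggle: 3.0
```

A single evaluation visits each node once, so simulating one input transition takes time polynomial in the size of the circuit, which is the property the membership arguments in later sections rely on.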
2.2 Power dissipation in CMOS circuits
Most present day circuit implementations use CMOS logic. For CMOS circuits, the power dissipated by a circuit can be approximated, in first analysis, by
$$P = V_{DD}^2 \, f \sum_i C_i T_i \qquad (1)$$
where $V_{DD}$ is the voltage of the power supply of the circuit, $f$ is the clock frequency at which it operates, $C_i$ the capacitance of node $i$ and $T_i$ the number of transitions in node $i$.
For the purposes of simplifying the discussion, we will normalize expression 1 and assume that $V_{DD}^2 f$ equals 1. The computation of dissipated power therefore reduces to the summation in this expression, of which the significant effort is the computation of $T_i$, the actual number of transitions in node $i$ for the given operating conditions of the circuit.
2.3 Input distributions
In general, one is interested in the power dissipated by a circuit under specific input sequences, since the actual values present at the inputs of the circuit strongly affect the power dissipation.
Since many interesting input sequences can only be specified by exponentially long traces, one alternative possibility is to specify the values at the inputs through a specification of their statistics, or through some other compact description.
Two interesting examples are the uniform distribution and binary code. In the first case, we specify that the input bits are uncorrelated in time and space, and, therefore, that all transitions between the input words are equally likely. In the second case, we specify that the input words are presented in a sequence that follows a standard binary code. Note that each of these two conditions would require an exponentially long description if input traces were to be used.
Many other exponentially long input sequences can be specified using appropriate description languages. One such language was proposed in [3], but, in general, we will not be concerned with the particular description language used to specify the input sequence. All the results in the following sections are independent of the particular method used to describe the input sequences, although we believe the demonstrations could be easily adapted to handle most description languages that are powerful enough to describe exponentially long input sequences.
2.4 Complexity classes
In the discussion that follows, we assume the reader is familiar with elementary complexity theory and related basic notions such as polynomial reducibility and completeness. In particular, we assume the reader to be familiar with the classes of decision problems P and NP, which stand for the classes of decision problems that can be solved in polynomial time with deterministic and nondeterministic Turing machines, respectively.
Figure 1 illustrates the containment relationships between the complexity classes that are referred to in this paper. The figure indicates the relationships within the polynomial hierarchy, PH, and the containment relationships between this hierarchy and higher complexity classes.
[Figure 1 diagram: containment of P, NP and Co-NP, the higher levels of the polynomial hierarchy PH, P^PP, PSPACE = NPSPACE and EXP.]
Figure 1: Complexity classes.
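Membership in NP, as used in the proofs of this paper, amounts to checking a guessed certificate in polynomial time. The following minimal Python sketch shows this for SAT, assuming a CNF formula encoded as lists of signed integers (a DIMACS-like convention chosen here for illustration). The names `check_assignment` and `brute_force_sat` are hypothetical, and the exhaustive search stands in for the nondeterministic guess.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]  # literal k means variable k, -k means its negation


def check_assignment(cnf: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Polynomial-time certificate check: every clause has a true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def brute_force_sat(cnf: List[Clause], n_vars: int) -> Optional[Dict[int, bool]]:
    """Deterministic search over all 2^n guesses (exponential time)."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if check_assignment(cnf, assignment):
            return assignment
    return None


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
cnf = [[1, 2], [-1, 3], [-2, -3]]
print(brute_force_sat(cnf, 3))
```

The check runs in time linear in the size of the formula, whereas the search loop may visit all $2^n$ assignments; this gap between checking and finding is what separates the two machine models in the definition above.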
We will use the commonly accepted convention of using #P (number-P) to represent an important class of enumeration problems. An enumeration problem belongs to #P if there is a nondeterministic algorithm such that, for every instance $I \in D$ of the problem, the number of distinct guesses that lead to the acceptance of $I$ is exactly the number of solutions of $I$, and each of the solutions can be checked in time polynomial in the length of the instance description [4]. Since it is not possible to directly compare #P, a class of functions, to PH, a class of languages, we take the class PP of problems asking whether more than half of the computations of a nondeterministic machine are accepting. This class is closely related to #P. Toda's theorem [12] states that
$$\mathrm{PH} \subseteq \mathrm{P}^{\mathrm{PP}} \qquad (2)$$
Note that $\mathrm{P}^{\mathrm{PP}}$ is no simpler than #P, since knowing the full number cannot be simpler than knowing the most significant bit.
Finally, we will show that some power estimation problems belong to PSPACE, the class of all decision problems that can be solved using polynomial space.
3 Peak power estimation
In this section, we prove that the decision problem associated with the problem of peak power estimation of combinational circuits is NP-complete. Given the normalizing assumption from the previous section ($V_{DD}^2 f = 1$), the peak power $P_{peak}$ is the maximum amount of capacitance that can switch in any given input transition. In practical terms, this corresponds to the maximum energy dissipated in any transition, divided by the clock period.
Consider the following power estimation problem, PEAK-POWER. This problem, stated as a decision problem, is:
INSTANCE: A combinational circuit $N$, a set of node capacitances $S = c_1, c_2, \ldots, c_m$ and a power value $P$.
QUESTION: Is there any transition at the inputs such that the dissipated power of circuit $N$ is equal to or greater than $P$?
Theorem 1 PEAK-POWER is NP-complete.
Proof: It is easy to verify that PEAK-POWER is in NP. A nondeterministic algorithm for it need only guess a particular input transition and show, by simulating the circuit in polynomial time, that this particular input transition leads to power dissipation equal to or greater than $P$. Thus the first of the two requirements for NP-completeness is met. To prove that it is NP-hard, we will transform a known NP-complete problem [1], SAT, to PEAK-POWER.
Consider an instance of SAT, and build the equivalent circuit, where an OR gate corresponds to each clause in the SAT instance, and an AND gate computes the conjunction of all clauses. Let the circuit node at the output of the AND gate be node $O$, as shown in figure 2. The capacitance $C$ at the output node $O$ has the same value as the power $P$, and the capacitance of all OR gates is set to 0.
[Figure 2 diagram: inputs $i_1, i_2, \ldots, i_{n-1}, i_n$ feed the circuit; output node $O$ with capacitance $C$.]
Figure 2: Transformation from SAT to PEAK-POWER.
Now, by running the algorithm for PEAK-POWER on this instance of the problem, it is possible to answer the original SAT problem in the following way:
• If the answer is YES, then there exists at least one transition in node $O$. This shows that there exists one input combination that sets $O$ to 1, thereby answering positively the original SAT question.
• If the answer is NO, either node $O$ is always at 1 or always at 0. The answer to the SAT problem can be obtained by simulating an arbitrary input combination. If it drives $O$ to 1, the answer to SAT is YES. If it drives $O$ to 0, the answer to SAT is NO.
Therefore, SAT ∝ PEAK-POWER, and the theorem is proved. □
4 Average power estimation
We consider now the general problem of average power estimation of combinational circuits under specific input conditions. Clearly, when a polynomial size input trace that matches the desired conditions is available, any simulation based method represents a polynomial time algorithm for the problem. The difficulty lies in the fact that, for many operating conditions of interest, the trace description is, in itself, exponential in the size of the circuit description.
In this section, we will analyze two power estimation problems: average power dissipated when binary code is presented at the inputs of the circuit and average power dissipated under the uniform distribution of inputs.
4.1 Average Power For Binary Input Code
In this section we will consider a specific input sequence, although the result can be easily generalized to many other conditions. In particular, we will consider the input sequence that is defined by a binary code at the inputs. This problem, BIN-POWER, can be formalized in the following way:
INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = c_1, c_2, \ldots, c_m$.
QUESTION: What is the average power dissipated in the circuit when binary code is presented at the inputs?
Given the normalizing assumption from section 2, the average power can be computed by counting the number of transitions in the nodes in the circuit. We will then consider the following counting problem, #BIN-TRANSITIONS:
INSTANCE: A combinational circuit $N$ and a node $O$.
QUESTION: What is the number of transitions in node $O$ when binary code is presented at the inputs?
Theorem 2 #BIN-TRANSITIONS is #P-complete.
Proof: Clearly, #BIN-TRANSITIONS is in #P, since the claim that a particular input transition causes a transition in node $O$ can be verified in polynomial time. To prove that #BIN-TRANSITIONS is #P-hard, we will transform #SAT, a known #P-hard [10] problem, to #BIN-TRANSITIONS. For that, we build the circuit of figure 3. In this circuit, we added an extra input, $z$, that forces the output of the AND gate to become 0. This extra input helps in relating the number of transitions to the number of satisfying assignments.
[Figure 3 diagram: inputs $i_1, i_2, \ldots, i_{n-1}, i_n$ and the extra input $z$; output node $O$.]
Figure 3: Transformation from #SAT to #BIN-TRANSITIONS.
From an analysis of this circuit, it is easy to see that the number of transitions in node $O$ when binary code is presented at the inputs (with variable $z$ seen as the least significant bit) is exactly twice the number of satisfying assignments to the original SAT instance. This proves that #SAT can be transformed to #BIN-TRANSITIONS and therefore that #BIN-TRANSITIONS is #P-complete. □
It should be clear that #BIN-TRANSITIONS is just a restricted version of BIN-POWER, where all capacitances of the nodes are zero except for node $O$. Therefore, the following result follows immediately:
Corollary 1 BIN-POWER is #P-complete.
4.2 Average Power For Uniform Input Distribution
An assumption commonly used by statistical power estimation methods is that the input bits are uncorrelated in time and space. This is effectively equivalent to the assumption that all transitions between input words are equiprobable, and, from a practical point of view, is supposed to simplify the power estimation process.
In this section, we show that this problem has the same complexity as the problem addressed in the previous section. Consider, then, the UNIF-POWER problem:
INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = c_1, c_2, \ldots, c_m$.
QUESTION: What is the average power dissipated in the circuit for a sequence of inputs where all possible transitions appear an equal number of times?
Before we address the complexity of this problem, we prove that a restricted version of it, formulated as a decision problem, is NP-complete. This problem, UNIF-POWER-GT0, stated as a decision problem, is:
INSTANCE: A combinational circuit $N$ and a set of node capacitances $S = c_1, c_2, \ldots, c_m$.
QUESTION: Does circuit $N$ dissipate power for a sequence of inputs where all possible transitions appear an equal number of times?
Theorem 3 UNIF-POWER-GT0 is NP-complete.
Proof: It is easy to see that UNIF-POWER-GT0 is in NP. A nondeterministic algorithm needs only guess a particular input transition and show, by simulating the circuit in polynomial time, that this particular input transition leads to power dissipation in the circuit. To prove that it is NP-hard, we again transform SAT to UNIF-POWER-GT0. Given an instance of SAT, we build a circuit as in section 3. Now, by running the algorithm for UNIF-POWER-GT0 on this instance, we answer the SAT question following a procedure similar to the one used for PEAK-POWER:
• If the circuit dissipates power, the answer to the original SAT question is YES, since node $O$ went to 0 and to 1 at least once.
• If the circuit does not dissipate power, either node $O$ is always at 1 or always at 0. Again, the answer to the SAT problem can be obtained by simulating an arbitrary input combination. If it drives $O$ to 1, the answer to SAT is YES. If it drives $O$ to 0, the answer to SAT is NO.
Therefore, SAT ∝ UNIF-POWER-GT0, and the theorem is proved. □
We will now prove that UNIF-POWER is also #P-complete. To achieve this, we first need to define a code that has a distribution of input transitions as desired. We will call this code min-unif-code. It is easy to show that, for a circuit with $n$ inputs, there exists a code with $2^n(2^n - 1) + 1$ words that has exactly one transition between each ordered pair of distinct input words¹. After defining this code, the proof proceeds in a similar way, although the result is slightly more involved.
¹ For space reasons, we omit the description of how such a code can be obtained, but, as an example, for $n = 2$, such a code could be 00,01,10,11,10,01,11,01,00,10,00,11,00.
Again, UNIF-POWER can be solved by counting the transitions in the nodes. This leads us to define the following counting problem, #UNIF-TRANSITIONS:
INSTANCE: A combinational circuit $N$ and a node $O$.
QUESTION: What is the number of transitions in node $O$ when min-unif-code is presented at the inputs?
Theorem 4 #UNIF-TRANSITIONS is #P-complete.
Proof: Again, it is clear that #UNIF-TRANSITIONS is in #P. To prove that #UNIF-TRANSITIONS is #P-hard, we will again transform #SAT using the circuit of figure 3. A simple combinatorial analysis shows that, if min-unif-code is presented at the inputs, the solution to #SAT, $m$, is related to the solution to #UNIF-TRANSITIONS, $u$, by:
$$m = \frac{2^n - \sqrt{2^{2n} - 2u}}{2} \qquad (3)$$
Therefore, a solution to #SAT can be obtained in polynomial time from a solution to #UNIF-TRANSITIONS, showing that #UNIF-TRANSITIONS is #P-complete. □
Corollary 2 UNIF-POWER is #P-complete.
It is interesting to note that theorem 2 could also be easily proved by finding a parsimonious transformation² from the search version of SAT to the search version of PEAK-POWER. However, such a parsimonious transformation is not so easily obtainable for the problem of UNIF-TRANSITIONS, so a direct reduction from #SAT to #UNIF-TRANSITIONS represents a simpler demonstration of the main result.
² A transformation from a search problem $\Pi$ to a search problem $\Pi'$ is parsimonious if it preserves the number of solutions.
5 Power estimation for sequential circuits
In this section, we will study the problem of power estimation in sequential circuits. More specifically, we will study the complexity of a decision version of the problem, SEQ-POWER-GT0:
INSTANCE: A sequential circuit $N$ and a set of node capacitances $S = c_1, c_2, \ldots, c_m$.
QUESTION: Is there a sequence of inputs that makes circuit $N$ dissipate power?
From a practical point of view, the added difficulty of this problem lies in the sequential elements present, which create temporal correlations that are expensive to compute. A number of algorithms for computing the exact correlations have been proposed, but all the exact algorithms have been observed to rapidly become very inefficient when large circuits are under consideration.
We will show that, indeed, the problem is intrinsically difficult. At first analysis, it is not even obvious that SEQ-POWER-GT0 is in PSPACE, since a simple-minded traversal of the state transition graph (STG) requires exponential space, whether it is performed depth-first or breadth-first. Nonetheless, we have the following result:
Theorem 5 SEQ-POWER-GT0 is PSPACE-complete.
Proof: To prove that SEQ-POWER-GT0 is in PSPACE, we notice that a non-deterministic algorithm can exercise all transitions using $O(R)$ space, where $R$ is the number of registers in the circuit. Given the well known result of Savitch [11], a non-deterministic algorithm can be converted to a deterministic one that uses $O(R^2)$ space³ and, therefore, SEQ-POWER-GT0 is in PSPACE. This conclusion also follows directly from Savitch's result that PSPACE = NPSPACE.
³ The algorithm that traverses the STG using only $O(R^2)$ space is based on Savitch's construction.
To prove that it is PSPACE-hard, we will transform the output of finite memory programs (OUT-FIN-PROG) to SEQ-POWER-GT0. OUT-FIN-PROG⁴, a known PSPACE-complete problem [6] [4], is defined as follows:
INSTANCE: Finite set $X$ of variables, finite alphabet $\Sigma$, a program $P$, defined by a sequence $I_1, I_2, \ldots, I_m$ of instructions of the form read $x_i$, write $v_j$, if $v_j = v_k$ goto $I_l$, accept or halt, where each $x_i \in X$, each $v_j \in X \cup \Sigma \cup \{\$\}$ and $I_m$ is either halt or accept.
QUESTION: Is there a string $w \in \Sigma^*$ such that program $P$ generates any output?
⁴ This problem is a special case of the INEQUIVALENCE OF FINITE MEMORY PROGRAMS problem, where program 2 is a fixed program with no write instructions and hence no output.
We will consider only finite programs where the input alphabet is $\{0, 1\}$, since OUT-FIN-PROG remains PSPACE-complete even in that case [6]. To transform OUT-FIN-PROG into SEQ-POWER-GT0, we observe that the behavior of a finite program $P$ can be simulated by a sequential circuit with $\log_2(|X|(|\Sigma| + 1)) + \log_2 m$ latches. The combinational logic can also be easily constructed in polynomial time from the structure of the program.
Now, to generate the output, we will create a node $O$ in the sequential circuit that will have value 1 whenever a write instruction is executed (and, therefore, output is being generated), and that will have value 0 when no output is being generated.
Given this sequential circuit, we now define $C_i$ to be equal to 0 for all nodes except node $O$. Now, by running the algorithm for SEQ-POWER-GT0 on this sequential circuit, we can answer the question for OUT-FIN-PROG in the following way:
• If the answer is YES, then the answer to the original OUT-FIN-PROG is also YES.
• If the answer is NO, simulate the circuit with an arbitrary input sequence. If node $O$ is always at 1, then the answer is YES. Otherwise, the answer is NO.
Therefore, OUT-FIN-PROG ∝ SEQ-POWER-GT0, and the theorem is proved. □
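The simple-minded traversal of the state transition graph discussed in section 5 can be sketched as an explicit breadth-first search. The Python sketch below assumes a sequential circuit described by two hypothetical callables, `next_state(state, inp)` and `node_o(state, inp)`, a known initial latch state, and nonzero capacitance only at node O; none of these names come from the paper. It reports whether node O can ever change value between two consecutive clock cycles.

```python
from collections import deque
from itertools import product


def can_node_toggle(next_state, node_o, initial_state, n_inputs):
    """Explicit BFS over reachable latch states; True if node O can ever switch."""
    words = list(product((0, 1), repeat=n_inputs))
    seen, queue = {initial_state}, deque([initial_state])
    while queue:
        state = queue.popleft()
        for inp in words:
            succ = next_state(state, inp)
            # O switches if two consecutive clock cycles give different values
            if any(node_o(succ, nxt) != node_o(state, inp) for nxt in words):
                return True
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return False


# Hypothetical example: a 2-bit counter with an enable input; O = 1 in state 3
def next_state(state, inp):
    return (state + inp[0]) % 4


def node_o(state, inp):
    return int(state == 3)


print(can_node_toggle(next_state, node_o, 0, 1))  # True: O rises when the count reaches 3
```

The `seen` set may hold up to $2^R$ states for a circuit with $R$ registers, which is the exponential space requirement that the Savitch-based argument avoids; the sketch illustrates the naive approach only and is not a polynomial-space procedure.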
6 Approximation algorithms and related problems
From the transformations presented in sections 3 to 5, it should be clear that approximation algorithms for some of the problems addressed are unlikely to exist.
For space reasons, we will not address this question extensively. However, we note that the reductions presented for PEAK-POWER and UNIF-POWER-GT0 immediately prove that efficient approximation algorithms for the related estimation problems⁵ do not exist unless P = NP.
⁵ The related estimation problems ask for the actual value of the peak or average power.
In fact, for PEAK-POWER and UNIF-POWER, any approximation algorithm that generates a value $P'$ in the interval $[P(1 - \epsilon), P(1 + \epsilon)]$ (where $P$ is the exact value of dissipated power and $0 < \epsilon < 1$) would enable us to solve SAT, since such a value is nonzero exactly when $P$ is nonzero.
It should now be clear that similar results could also be proved for other versions of power estimation problems.
We should also point out that the hardness of the power estimation problems we addressed is not caused by the artificial concentration of node capacitances in a single node, used in the proofs. It is easy to show that even if all the node capacitances have equal values, the complexity of the problems addressed remains the same.
7 Conclusions
We addressed the complexity of a number of power estimation problems and proved that they belong to classes of high complexity.
In particular, we proved that the decision version of the peak power estimation problem for combinational circuits is NP-complete.
For power estimation problems, we have shown that exact computation of the average power dissipated by combinational circuits under the uniform input distribution is #P-complete. The same is true for power estimation of combinational circuits when the input sequence is binary code. Furthermore, we believe that this complexity analysis is easily extensible to a large number of operating conditions that involve exponentially large input sequences. This result is remarkably strong, since #P-complete problems are known to be no simpler than any problem in the polynomial hierarchy.
Finally, we proved that the decision problem associated with power estimation in sequential circuits is PSPACE-complete, a strong but not surprising result given known related results in sequential circuit analysis and synthesis.
Based on these results, we also argued briefly that approximation algorithms for the majority of interesting problems in power estimation do not exist unless P = NP.
To the best knowledge of the authors, these results represent the first comprehensive overview of the complexity of the analysis of dissipated power in combinational and sequential circuits.
References
[1] S. A. Cook. The complexity of theorem-proving procedures. In Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, pages 151–158, 1971.
[2] J. Costa, J. Monteiro, and S. Devadas. Switching Activity Estimation using Limited Depth Reconvergent Path Analysis. In Proceedings of the International Symposium on Low Power Electronics and Design, pages 184–189, August 1997.
[3] A. T. Freitas, A. L. Oliveira, J. C. Monteiro, and H. C. Neto. Exact power estimation using word level transition probabilities. In Proceedings of the Ninth International Workshop on Power and Timing Modelling, Optimization and Simulation, pages 355–364, Kos Island, Greece, October 1999.
[4] Michael R. Garey and David S. Johnson. Computers and Intractability. Freeman, New York, 1979.
[5] A. Ghosh, S. Devadas, K. Keutzer, and J. White. Estimation of Average Switching Activity in Combinational and Sequential Circuits. In Proceedings of the 29th Design Automation Conference, pages 253–259, June 1992.
[6] Neil D. Jones and Steven S. Muchnick. Even simple programs are hard to analyze. Journal of the ACM, 24(2):338–350, April 1977.
[7] D. Marculescu, R. Marculescu, and M. Pedram. Stochastic sequential machine synthesis targeting constrained sequence generation. In 33rd Design Automation Conference (DAC'96), pages 696–701, New York, June 1996. Association for Computing Machinery.
[8] D. Marculescu, R. Marculescu, and M. Pedram. Hierarchical sequence compaction for power estimation. In 34th Design Automation Conference (DAC'97), pages 570–575, New York, June 1997. Association for Computing Machinery.
[9] R. Marculescu, D. Marculescu, and M. Pedram. Efficient Power Estimation for Highly Correlated Input Streams. In Proceedings of the 32nd Design Automation Conference, pages 628–634, June 1995.
[10] C. Papadimitriou. Computational Complexity. Addison Wesley, 1994.
[11] W. J. Savitch. Relationships between nondeterministic and deterministic tape complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[12] Seinosuke Toda. On the computational power of PP and ⊕P. In 30th Annual Symposium on Foundations of Computer Science, pages 514–519, Research Triangle Park, North Carolina, 30 October–1 November 1989. IEEE.