Ultra Low Power Vedic Multiplier
Novel Transistor Level Realization of Ultra Low Power High-speed Adiabatic Vedic Multiplier
M. Chanda1, S. Banerjee2, D. Saha3, S. Jain4
ECE Dept.1&4, VDTT Dept.2 and ETCE Dept.3
Meghnad Saha Institute of Technology1&4, IIT Delhi2, Jadavpur University3
[email protected], [email protected], [email protected], [email protected]
I. INTRODUCTION
A plethora of multiplication algorithms has been proposed recently in the literature [1]-[8]. In this paper we present a systematic design methodology for a fast and area-efficient digital multiplier based on Vedic mathematics [3],[4],[9]-[12]. The multiplier architecture is based on the Urdhva-Tiryakbhyam sutra [9], the Vertical and Crosswise algorithm of ancient Indian Vedic mathematics. Conventional as well as adiabatic 8×8 Vedic multiplier structures have been implemented in TSMC 0.18 µm CMOS technology using the CADENCE design suite.
Adiabatic switching [13]-[15] has recently attracted interest and is being implemented in many systems. The approach is based on slow charging of the capacitive nodes by a time-varying clocked AC power supply and partial recovery of the energy by slowly decreasing the supply, without sacrificing noise immunity or driving ability. In this paper, energy efficient adiabatic logic (EEAL), based on DCVS logic [16], is introduced as an adiabatic logic style. EEAL requires only one sinusoidal power clock supply, has a simple implementation, and is geared toward high-speed and low-energy VLSI design. In EEAL, high-speed operation as well as low energy consumption is ensured by using a parallel resistive path between the output nodes and the clock supply. EEAL features simplicity and characteristics resembling static logic, which substantially decrease transistor overhead and circuit complexity.
EEAL is a dual-rail adiabatic logic which consists of two DCVS networks and a pair of cross-coupled PMOS devices in each stage, as illustrated in Figure 1(a).
Figure 1. EEAL logic (a) Block diagram (b) Inverter/Buffer circuit (c) Power supply (d) Cascading of Inverter/Buffer circuits
EEAL requires only one sinusoidal power clock supply, has a simple implementation, and performs better than the previously proposed adiabatic logic families [13]-[15] in terms of energy consumption. As a single-clock circuit requires only a simple clock scheme [16], this logic style enjoys minimal control overhead. Figures 1(b) and 1(c) show the EEAL buffer/inverter circuit and the supply clock, respectively.
The operation of the EEAL inverter/buffer can be summarized using Figure 1(b). Assume the complementary output nodes (out and outb) are initially low and the supply clock ramps up from logic 0 (0) to logic 1 (VDD). If in = 0 and inb = 1, then N1 and M1 are turned off and M2, N2 and P1 are turned on. The out node is then charged, closely following the supply clock, through the parallel combination of the PMOS (P1) and the NMOS (M2), whereas the outb node is kept at ground potential because N2 is on. When the supply clock swings from VDD back to ground, the out node is discharged through the same charging path and the undriven outb node stays at ground potential. As a result, full swing is obtained at the out node and ground potential at the outb node. The output voltage swing of an adiabatic inverter at 100 MHz with a 20 fF capacitive load is shown in Figure 2.
$\ldots = \{\mu_n C_{ox} (W/L)_n (V_{DD} - 2V_T)\}^{-1}$    (5)
The energy advantage of the EEAL circuit can be readily understood by assuming a ramp-type voltage source which ramps between 0 and VDD and delivers the charge CL·VDD over a time period T. The dissipation through the channel resistance R is

$E_{diss} = \{(C_L V_{DD})/T\}^2 R T = \{(R C_L)/T\} C_L V_{DD}^2$    (1)
Compared to conventional CMOS logic, which consumes CL·VDD² energy in a full cycle (CL is the load capacitance), the adiabatic gain (G) of EEAL becomes

$G\,(\%) = \dfrac{\text{Energy consumption by EEAL per cycle}}{\text{Energy consumption by conventional CMOS per cycle}} \times 100$
$\qquad = [\{2 R_P C_L/T\} + \{V/V_{DD}\}^2] \times 100$
$\qquad \approx 2\{R_P C_L/T\} \times 100$  (as $V \ll V_{DD}$, $\{V/V_{DD}\}^2 \ll 1$)    (4)
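As a rough numerical illustration of equations (1) and (4), the sketch below evaluates the ramp-charging loss and the resulting gain. The resistance value, the ramp time and the function names are illustrative assumptions, not figures from the paper. The 20 fF load and 1.8 V supply are the values quoted elsewhere in the text, and the (V/VDD)² term is neglected as in (4).

```python
def ramp_dissipation(r_ohm, c_load, v_dd, t_ramp):
    """Energy lost in the channel resistance while a linear ramp of
    duration t_ramp charges c_load to v_dd (equation (1))."""
    current = c_load * v_dd / t_ramp        # constant charging current
    return current ** 2 * r_ohm * t_ramp    # I^2 * R * T


def adiabatic_gain_percent(r_p, c_load, t_ramp):
    """Equation (4) with the (V/VDD)^2 term neglected: EEAL energy per
    cycle as a percentage of the conventional CMOS energy C_L * VDD^2."""
    return 2.0 * r_p * c_load / t_ramp * 100.0


# Hypothetical operating point: R and T are assumed, not taken from the paper.
R = 5e3        # ohm, assumed resistance of the charging path
C_L = 20e-15   # F, load capacitance
V_DD = 1.8     # V, supply voltage
T = 5e-9       # s, assumed ramp time

e_ramp = ramp_dissipation(R, C_L, V_DD, T)
e_cmos = C_L * V_DD ** 2
gain = adiabatic_gain_percent(R, C_L, T)
print(f"ramp loss = {e_ramp:.3e} J, CMOS = {e_cmos:.3e} J, G = {gain:.1f} %")
```

The ratio of the ramp loss to CL·VDD² reduces to R·CL/T, so lengthening the ramp (lowering the clock frequency) lowers the dissipation in direct proportion.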
III. IMPLEMENTATION OF N×N MULTIPLIER STRUCTURE
3.1 Urdhva Tiryakbhyam sutra
The proposed multiplier is based on the Urdhva Tiryakbhyam (Vertical & Crosswise) algorithm [9], a general multiplication formula of ancient Vedic mathematics. The parallelism in the generation of partial products and their summation is obtained using Urdhva Tiryakbhyam, explained later. Since the partial products and their sums are calculated in parallel, the multiplier is independent of the clock frequency of the processor. It is demonstrated that this architecture is quite efficient in terms of silicon area and speed.
The 2×2 or 4×4 multiplication using conventional mathematical methods (successive additions when used on computers) needs no explanation. The Vedic method for 4×4 multiplication is illustrated in the example shown in Figure 3. The digits of the multiplier and the multiplicand are placed along two adjacent sides (row and column) of a square. For N×N multiplication (here N = 4), the whole square is divided into N² (= 16) smaller squares, each of which is partitioned again by a crosswise line, as shown in Figure 3. Each digit of the multiplier is then independently multiplied with every digit of the multiplicand and the two-digit product is written in the small square box. All the digits lying on a crosswise dotted line are added to the previous carry. The least significant digit of the obtained number acts as the result digit and the rest as the carry for the next step. In this example the initial carry is taken as 0.
Figure 3. Multiplication using Urdhva Tiryakbhyam Sutra
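The crosswise procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and instead of writing each two-digit product into a box and adding individual digits along the dotted lines, it adds the whole digit products that lie on one crosswise line, which gives the same result digit and carry.

```python
def urdhva_multiply(x_digits, y_digits, base=10):
    """Multiply two equal-length digit lists (most significant digit first)
    one crosswise line at a time."""
    n = len(x_digits)
    assert len(y_digits) == n, "operands must have the same number of digits"
    x = x_digits[::-1]   # least significant digit first
    y = y_digits[::-1]
    result = []
    carry = 0            # initial carry is 0, as in the text
    for k in range(2 * n - 1):
        # Sum of all digit products lying on the k-th crosswise line.
        column = sum(x[i] * y[k - i]
                     for i in range(max(0, k - n + 1), min(k, n - 1) + 1))
        total = column + carry
        result.append(total % base)   # least significant digit: result digit
        carry = total // base         # the rest: carry for the next step
    while carry:                      # flush whatever carry remains
        result.append(carry % base)
        carry //= base
    return result[::-1]


def digits_to_int(digits, base=10):
    """Helper (hypothetical) that turns a digit list back into an integer."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


product = urdhva_multiply([1, 2, 3, 4], [5, 6, 7, 8])
print(digits_to_int(product))
```

Every crosswise sum depends only on the input digits, so in hardware all of them can be formed at once; only the carry chain is sequential.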
3.2 Implementation of general Vedic multiplier structure
In this section we first discuss the organization of the 2×2 multiplier block, which is then used to configure the 4×4 and 8×8 multiplier structures. In 2×2 multiplication, considering two inputs (X and Y) of two digits each (X1X0 and Y1Y0), we get four outputs (S4S3S2S1) as the result by vertical and crosswise multiplication and addition. The steps are:
i) S1 is the result of the vertical multiplication of X0 and Y0.
ii) S2 is the sum of the crosswise bit products (X1 and Y0) and (X0 and Y1).
iii) S3 is the vertical product of X1 and Y1 if no carry is generated in the previous steps; otherwise the carry bit is added to the vertical product to generate S3 as a sum.
iv) S4 is the carry generated during the addition for S3.
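Steps i) to iv) map onto four AND gates and two half adders. The sketch below is a behavioural model under that assumption; the function names are hypothetical and it says nothing about the transistor-level EEAL or CMOS realization.

```python
def half_adder(a, b):
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def vedic_2x2(x1, x0, y1, y0):
    """2x2 multiplier following steps i) to iv); returns (S4, S3, S2, S1)."""
    s1 = x0 & y0                               # i)   vertical product
    s2, carry = half_adder(x1 & y0, x0 & y1)   # ii)  sum of crosswise products
    s3, s4 = half_adder(x1 & y1, carry)        # iii) vertical product plus carry
    return s4, s3, s2, s1                      # iv)  S4 is the carry out of S3


# Example: X = 11 (3) and Y = 10 (2)
print(vedic_2x2(1, 1, 1, 0))
```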
By using this 2×2 multiplier block, 4×4, 8×8, 16×16, etc. multiplier blocks can be implemented. For N×N multiplication, shown in Figure 4, the N-bit multiplier and multiplicand are first divided into two equal halves of N/2 bits each. Assuming N-bit multiplication between X and Y, we get XL = {X1 X2 X3 … X(N/2)} and XH = {X(N/2+1) X(N/2+2) X(N/2+3) … XN} as the two halves of X. For Y, the two halves are YL = {Y1 Y2 Y3 … Y(N/2)} and YH = {Y(N/2+1) Y(N/2+2) Y(N/2+3) … YN}.
So an N×N multiplication needs four N/2×N/2 multipliers, two N-bit adders, a half adder and an N/2-bit adder.
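The halving scheme can be sketched recursively. This is a behavioural illustration only: the function name is hypothetical, Python integer addition stands in for the N-bit adders, half adder and N/2-bit adder, and the exact adder arrangement of Figure 4 is not reproduced. XL and YL are taken as the low-order halves.

```python
def vedic_multiply(x, y, n):
    """Multiply two n-bit unsigned integers (n a power of two, n >= 2)
    using four (n/2 x n/2) sub-multiplications."""
    if n == 2:
        # Base case: the 2x2 block. The plain product stands in for the
        # gate-level block described in steps i) to iv).
        return x * y
    half = n // 2
    mask = (1 << half) - 1
    xl, xh = x & mask, x >> half      # low and high halves of X
    yl, yh = y & mask, y >> half      # low and high halves of Y
    ll = vedic_multiply(xl, yl, half)
    lh = vedic_multiply(xl, yh, half)
    hl = vedic_multiply(xh, yl, half)
    hh = vedic_multiply(xh, yh, half)
    cross = lh + hl                   # crosswise terms share the same weight
    return ll + (cross << half) + (hh << n)


print(vedic_multiply(0b10110101, 0b01101110, 8))
```

The four sub-products are independent of each other, which is what allows them to be generated in parallel before the final additions.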
Static conventional CMOS logic is used to implement the conventional Vedic multiplier. For the adiabatic implementation, we first describe the EEAL gates and then present the design of the adiabatic 8×8 Vedic multiplier using EEAL logic. Complex gates can be easily implemented using a simple NMOS-based DCVS network. In Figure 1, by replacing the DCVS network we can implement the AND-NAND gate with the EEAL circuit topology. The DCVS networks for the sum and carry blocks of the full adder circuit, along with the AND block, are shown in Figure 5. We have designed an adiabatic standard-cell library consisting of common digital gates such as buffer/inverter, two-input and three-input functions, and adder and multiplier blocks of varying bit length, using the Cadence Spectre circuit simulator in 0.18 µm technology. The W/L ratios of the PMOS and NMOS transistors are taken as W/L = 12λ/2λ and 6λ/2λ, where λ = 0.9 µm.
3.3 Results and Simulations
All simulations have been done under a 1.8 V supply voltage. During simulation we apply {A} = {A7, A6, A5, A4, A3, A2, A1 and A0} and {B} = {B7, B6, B5, B4, B3, B2, B1 and B0} as inputs. Random patterns consisting of four bits are assigned to each input bit (Ai or Bi, where i = 0 to 7). The assigned bits are {A} = {01010101, 00001111, 00110011, 00010101, 00010101, 00001111, 00110011 and 00010101} and {B} = {01110011, 00010101, 00001111, 00110011, 01110011, 00010101, 00001111 and 00110011}. The simulated waveforms of the 8×8 CMOS and Vedic multipliers are shown in Figures 6 and 7, respectively. Performance measurement of the 8×8 CMOS Vedic multiplier circuit along with carry-save, array and Wallace tree multipliers of varying bit sizes (2-bit, 4-bit and 8-bit) has
Bit Length        Vedic    Conventional
of Multiplier              Carry-save   Bit-array   Wallace Tree
2×2                 27         81           92          185
4×4                104        427          449          389
8×8                892       2060         2090         2172
Figure 7. Output waveform of 8×8 Adiabatic Vedic Multiplier
Table 1 shows that the Vedic multiplier has the least power consumption compared to the other optimized multiplier circuits. Since a greater number of adder cells is used for larger multipliers, the power savings for smaller operand sizes can be directly extrapolated to higher-operand multiplier modules. At 10 MHz, the 2×2, 4×4 and 8×8 Vedic multipliers consume only 33%, 30% and 41% of the total power consumed by the carry-save and Wallace tree multipliers, respectively. Table 1 also shows that
Bit Length        Vedic    Conventional Multiplier
of Multiplier              Carry-save   Bit-array   Wallace Tree
2×2               0.23       0.46         0.45         0.42
4×4               0.58       0.66         0.84         0.73
8×8               1.38       1.68         2.69         1.53
Energy-Delay Product (10^-25 Js) comparison of different types of Multiplier units

Bit Length        Vedic     Conventional Multiplier
of Multiplier               Carry-save   Bit-array   Wallace Tree
2×2                0.14        0.17         0.17         3.20
4×4                3.50       18.80        30.50        20.70
8×8              169.80      589.60      1505.10       508.40

Bit Length        Conventional Vedic   Adiabatic Vedic   Savings (%)
of Multiplier
2×2                    27                  9.64             64.3
4×4                   104                 45.76             56.0
8×8                   892                526.28             41

Bit Length        Conventional Vedic   Adiabatic Vedic   Savings (%)
of Multiplier
2×2                   0.23                 0.25             -8.0
4×4                   0.58                 0.67            -15.5
8×8                   1.38                 1.87            -35.5

Bit Length        Conventional Vedic   Adiabatic Vedic   Savings (%)
of Multiplier
2×2                   0.14                 0.06             57.1
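The "Savings (%)" columns of the conventional versus adiabatic Vedic comparisons can be cross-checked with a short calculation. The sketch assumes that savings are defined as the relative reduction with respect to the conventional figure; that definition, the function name and the row labels are assumptions. The rows labelled "power" are taken to be power because their conventional values match the Vedic column of Table 1, and the row labelled "EDP" matches the 2×2 Vedic entry of the energy-delay product table.

```python
def savings_percent(conventional, adiabatic):
    """Relative reduction of the adiabatic figure with respect to the
    conventional one, in percent (assumed definition of 'Savings')."""
    return (conventional - adiabatic) / conventional * 100.0


# (label, conventional Vedic, adiabatic Vedic, reported savings in %)
rows = [
    ("2x2 power", 27, 9.64, 64.3),
    ("4x4 power", 104, 45.76, 56.0),
    ("8x8 power", 892, 526.28, 41.0),
    ("2x2 EDP", 0.14, 0.06, 57.1),
]

for label, conv, adia, reported in rows:
    computed = savings_percent(conv, adia)
    print(f"{label}: computed {computed:.1f} %, reported {reported} %")
```

A negative value under this definition means the adiabatic figure is larger than the conventional one, which is how the negative entries in the remaining comparison read.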
IV. CONCLUSION
A new energy-efficient adiabatic multiplier structure based on the Urdhva Tiryakbhyam sutra of Vedic mathematics has been proposed using the EEAL style. On the basis of Cadence Spectre simulations, it can be concluded that this Vedic multiplier is more efficient than the array multiplier, Booth multiplier and Wallace tree multiplier in terms of timing efficiency and speed. The speed improvements are gained by parallelizing the generation of partial products with their concurrent summation. It is also shown that energy efficiency can be enhanced significantly in the low-frequency domain using the newly proposed adiabatic approach.